xenia-rs

Author	SHA1	Message	Date
MechaCat02	acb29db444	[iterate-3AL] Superblock dispatch: chain basic blocks per slot-visit (~1.6x boot-to-splash) Replace the one-basic-block-per-slot-per-round lockstep dispatch with a SUPERBLOCK runner: each slot-visit chains straight-line blocks through their terminating branches up to a deterministic instruction budget, amortizing the per-round (timebase/coord/round_schedule) and per-slot (worker_prologue) dispatch tax over ~128 instructions instead of ~6. Yield-points (end the chain, return to the round) are pure functions of guest state, preserving the lockstep cross-thread interleaving correctness: - non-Continue step result (Yield/SystemCall/Trap/Unimpl/Halted); db16cyc Yield is the spin-wait producer hand-off. - sync-sensitive block: lwarx/ldarx/stwcx./stdcx. or sync/eieio/isync (new PpcOpcode::is_sync_sensitive, flagged on DecodedBlock at build). - MMIO touch: new GuestMemory::mmio_access_count() watermark, sampled per block, keeps GPU/register ordering at one-block granularity. - next PC leaves ordinary guest code (import thunk / halt sentinel / unmapped) -> hand to the full worker_prologue next round. - instruction budget reached. Instruction-count/clock accounting stays exact: per-block cycle_count deltas are summed and handed to worker_epilogue once (instruction_count + decrement_quantum advance by the precise retired count). XENIA_SUPERBLOCK_BUDGET=1 reproduces the old one-block schedule byte-for-byte. Budget tuned to 128 (env-overridable): boot progression stays healthy up to 256, sharp cliff at ~384 (a boot producer/consumer handoff starves); 128 is 3x below the cliff. Also scale the inline-GPU per-round fairness cap with the budget (flat 64 throttled GPU command processing 17x under superblocks and collapsed the present loop). PERF (check -n 100M --gpu-inline): 25.3 -> 42.7 MIPS (1.69x); 1B: 26.0 -> 41.4 MIPS (1.59x). 
Callgrind n=5M: host instructions 2.178B -> 1.507B (-31%); worker_prologue -90%, coord_pre_round -91%, begin_slot_visit / round_schedule_into / coord_post_round / update_timestamp_bundle each ~-90%; interpreter execute byte-identical (real work unchanged). GATES: C1 boot progression 150M draws 7391/swaps 2164 (baseline 7415/2172), 1B draws 88547/swaps 29228 linear no stall, K8888 decode + RTs=2 intact. C2 determinism: n50m stable digest byte-identical across fresh runs; golden re-baselined intentionally (pacing-only deltas: imports 333453->243387, draws 1274->1279). C3 milestone-1 render: texture_decodes/draws/swaps/present cadence track baseline (3AJ fade-in pacing preserved). C4: 690 tests green (+2 sync_sensitive). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-19 22:31:54 +02:00
MechaCat02	19659d7f76	feat(kernel): KRNBUG-XAM-001 — XGetAVPack returns 8 (HDMI), not 0x16 Mirrors canary's cvars::avpack default (xam_info.cc:35) and Sylpheed's accepted set {3,4,6,8} (xam_info.cc:250-251). With KRNBUG-XEX-001 having flipped the priv-10 gate, XGetAVPack now reaches its caller in sub_824AB578; returning 0x16 caused Sylpheed to abort the AV/crypto block before XeCryptSha. Cascade walks one step (canary-only export list 11 → 10); sub_824ABA98 is the next candidate. Tests: 589 → 590. Goldens re-baselined (n50m: 50000005→50000004, imports 407417→407416). Lockstep deterministic across 3 reruns at -n 100M (instructions=100000010, import_calls=987686 +2.4×, swaps=2). 9-PC producer probe still 0×; parked handles 0x1004/0x100c/0x15e0 still signal_attempts=0. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 18:54:24 +02:00
MechaCat02	1a892d4641	feat(kernel): KRNBUG-XEX-001 — real XexCheckExecutablePrivilege from XEX header bitmap Replace stub_return_zero with a canary-faithful implementation that returns bit `priv` of the loaded XEX's XEX_HEADER_SYSTEM_FLAGS (key 0x00030000) bitmap. Mirrors xenia-canary xboxkrnl_modules.cc:22-39: `(flags >> priv) & 1` for priv < 32, else 0. Plumbing: - xenia-xex: header_keys::SYSTEM_FLAGS const + get_system_flags() accessor. - xenia-kernel/state.rs: pub xex_system_flags: u32 + xex_priv_logged HashSet for one-shot per-priv tracing. - xenia-app: kernel.xex_system_flags wired in cmd_exec_inner. - xenia-kernel/exports.rs: real export body + unit test covering bits 10/11/0/64 + zero-flags case. Sylpheed's bitmap is 0x00000400 (only XEX_SYSTEM_PAL50_INCOMPATIBLE, bit 10). At -n 500M with the fix: - XGetAVPack: 0 -> 1 (priv-10 gate at lr=0x824ab598 flipped). - 10 other canary-only exports + 9 producer PCs + 3 parked handles unchanged. Priv-11 site at sub_824A9710 is downstream and still not reached — AV/crypto block aborts after XGetAVPack returns our placeholder 0x16 (canary returns 8/HDMI; Sylpheed accepts only 3/4/6/8 per xenia-canary xam_info.cc:250-251). Tests 588 -> 589. Lockstep deterministic (3 reruns identical): n50m goes 50000008 -> 50000005 instr / 407415 -> 407417 imp / swaps=2 / draws=0. Goldens re-baselined (sylpheed_n50m, sylpheed_n2m); oracle test green. Full chain-of-effects + next-frontier hand-off in audit-findings.md under KRNBUG-XEX-001. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 18:32:51 +02:00
MechaCat02	bae9305982	xenia-app: observability subsystem, --parallel runtime, stress harness observability.rs installs the tracing subscriber stack (env-filter + JSON file appender + chrome trace + error layer) and the metrics recorder shared by the workspace. main.rs grows the new CLI surface: --parallel, --reservations-table, --trace-handles, --analyze={rust,sql,both}, xenia dis --json, --ui, plus the wiring that runs the CPU through the new scheduler, drives the GPU's threaded backend, and surfaces the framebuffer + HUD via xenia-ui. Add tests/parallel_stress.rs (#[ignore]-gated long form, short form runs 20×@5M) and tests/golden/sylpheed_n2m.json — the digest the lockstep/parallel combos compare against. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-01 16:30:26 +02:00

4 Commits
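
The privilege check described in commit 1a892d4641 (`(flags >> priv) & 1` for priv < 32, else 0) can be sketched as below. This is a minimal illustration only, not the code in xenia-kernel/exports.rs: the function name `check_executable_privilege` and its signature are hypothetical, and only the bit formula, the Sylpheed bitmap 0x00000400, and the tested bit positions (10/11/0/64 plus the zero-flags case) come from the commit message.

```rust
/// Hypothetical stand-in for the export body: returns bit `privilege` of the
/// XEX system-flags bitmap, and 0 for any privilege index of 32 or above.
fn check_executable_privilege(system_flags: u32, privilege: u32) -> u32 {
    if privilege < 32 {
        // In-range index: shift the requested bit down and mask it.
        (system_flags >> privilege) & 1
    } else {
        // Out-of-range index: the guard avoids an overflowing shift.
        0
    }
}

fn main() {
    // Bitmap the commit message reports for Sylpheed: only bit 10 is set.
    let sylpheed_flags: u32 = 0x0000_0400;

    assert_eq!(check_executable_privilege(sylpheed_flags, 10), 1);
    assert_eq!(check_executable_privilege(sylpheed_flags, 11), 0);
    assert_eq!(check_executable_privilege(sylpheed_flags, 0), 0);
    assert_eq!(check_executable_privilege(sylpheed_flags, 64), 0);

    // With an all-zero bitmap every privilege reads as 0.
    assert_eq!(check_executable_privilege(0, 10), 0);

    println!("privilege checks passed");
}
```

The explicit `privilege < 32` guard matters in Rust because shifting a `u32` by 32 or more panics in debug builds, so the out-of-range case has to be handled before the shift.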