Commit Graph

50 Commits

Author SHA1 Message Date
MechaCat02
034ec8b47f [iterate-2O] GPU: drain indirect buffers correctly — Sylpheed renders splash (draws 0→78)
Ours' GPU never drained the D3D driver's system command buffer past the first
11-dword indirect buffer, so DRAW_INDX / reg-0x57C-arm packets never executed
and draws stayed 0 (the long-hunted render gate; see UPDATE-18). Runtime tracing
(temporary, removed) showed the guest submits 6 INDIRECT_BUFFER packets at boot
(CP_RB_WPTR 22→37) but ours executed exactly ONE IB and then spun 15.7M packets
inside it. Three coupled command-processor bugs, all corrected to match canary:

1. `sync_with_mmio` applied the primary CP_RB_WPTR to whichever ring was active,
   including an executing indirect buffer — `37 % 11 = 3` clobbered the IB's
   write pointer so its read pointer looped 0→2→5→0 forever and never popped
   back to the primary ring. CP_RB_WPTR governs ONLY the primary ring; while an
   IB executes, the primary is the bottom of the IB stack. Canary executes each
   IB through a separate `RingBuffer reader_` (command_processor.cc), so the
   primary write pointer is structurally inapplicable to an IB.

2. Indirect buffers were treated as circular rings: read wrapped at `size_dwords`
   (`11 % 11 = 0`) and never reached the fixed write pointer, so even without the
   clobber the IB could not terminate. An IB is a fixed *linear* sub-stream; add
   `RingBufferView.indirect` and drain `[0, ib_size)` monotonically, then pop.

3. `is_ready` only checked the active ring, so an IB that now correctly exhausts
   would never get `execute_one` called again to pop back to the primary ring
   (whose WPTR may have advanced). Check the whole IB stack.

Also: the ring was sized `1 << size_log2` bytes (1024 dwords) vs canary's
`1 << (size_log2 + 3)` (8192 dwords) — an 8× undersize that desynced WPTR-wrap
math from the guest. Fixed in `GpuSystem::initialize_ring_buffer` (and the
dead bookkeeping copy in `vd_initialize_ring_buffer`).

Cascade (deterministic; threaded-default backend, byte-identical across runs):
reg 0x57C now written, IB jumps 1→12, packets 15.7M→9,825, and the splash
renders — draws 0→78, shaders 0→3, render_targets 0→2, swaps 2→3 — stable at
50M / 200M / 1B. Boot then reaches a new downstream gate (draws plateau at 78,
interrupts keep climbing → engine alive, not deadlocked).

golden `sylpheed_n50m.json` re-baselined (draws 78). `cargo test --workspace`
green (674; +2 ring_view regression tests). vd_swap's synthetic-swap
short-circuit is now redundant but left untouched (cascade works without
changing it); cleaning it up is a separate follow-up.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-13 22:06:16 +02:00
MechaCat02
93f60a3ba0 [iterate-2M] PCR+0x10C (PRCB.current_cpu): init per-HW-thread to unwedge spin-barrier
Ours never initialized the PRCB `current_cpu` byte at PCR+0x10C
(prcb_data@0x100 + current_cpu@0xC). Canary sets it from
`GetFakeCpuNumber(affinity)` (xthread.cc:847 `pcr->prcb_data.current_cpu =
cpu_index`), which equals the HW thread id ours already writes at PCR+0x2C.
Left unwritten it read 0 for every thread.

Guest spin-barrier `sub_824D1328` (used by the audio/update pump threads at
entries 0x824D2878 / 0x824D2940, ours tid 9 / tid 10) indexes a per-HW-thread
occupancy byte array via `lbz r11, 268(r13)` then `stbx ..., [array+index]`.
With index 0 for all threads, every thread marked slot 0; the multi-byte
rendezvous signature it then spins on (`ld [obj+0x164]` compared against the
packed per-slot expectation) could never assemble. Both pump threads busied at
pc 0x824d140c/0x824d1410 forever (Ready, 5M+ barrier iterations) and never ran
their `KeSetEvent` loops — so the events they signal (the 21k-per-thread
heartbeat in canary) never fired, starving the downstream worker handshake.

Fix: write `hw_id` to PCR+0x10C alongside PCR+0x2C in both the static thread
image init (thread.rs) and the dynamic PcrWriter (state.rs, used by scheduler
spawn + affinity migration) so the two stay in sync.

Runtime-verified BOTH engines. Post-fix the pump threads escape the barrier
(barrier iterations 5M+ -> 3) and advance into their loop bodies, now correctly
Blocked(WaitAny) at pc 0x824d28d0 / 0x824d29c0 (was spinning at 0x824d140c).
imports at n50M 339,766 -> 451,508; deterministic (two cold runs byte-identical).
draws still 0 (a later, separate render gate). golden re-baselined.
cargo test --workspace: 672 passed, 0 failed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-13 18:08:46 +02:00
MechaCat02
ed2e0e72fd [iterate-2J] KeTimeStampBundle deterministic tick: fix frozen+mislaid guest clock
The xboxkrnl data export KeTimeStampBundle (ordinal 0x00AD, import slot
0x820007d0 — confirmed via sylpheed.db imports table) was set up with TWO
defects in the import-patch pass:

  1. FROZEN: the block was written once at boot and never updated, so every
     field stayed a constant for the whole run (observed: the guest's clock
     reader sub_824AA830 = [[0x820007d0]+0x10] returned a constant
     0x01d6bc0c from 5M..150M instructions).
  2. WRONG LAYOUT: it stuffed the FILETIME high-dword at +0x10. The canonical
     X_TIME_STAMP_BUNDLE (xenia-canary kernel_state.h) is:
       +0x00 interrupt_time u64 (100ns since boot)
       +0x08 system_time    u64 (FILETIME 100ns since 1601)
       +0x10 tick_count      u32 (milliseconds since boot)
       +0x14 padding
     so [block+0x10] is tick_count in ms, not a FILETIME dword.

Fix (deterministic, no wall-clock):
  * Initialize the block with the correct field layout (tick_count = 0 at
    boot, system_time = FILETIME base, interrupt_time = 0).
  * Store the block VA on KernelState::timestamp_bundle_addr during the
    import patch.
  * Add KernelState::update_timestamp_bundle(mem, clock) and call it every
    round in BOTH the lockstep (run_execution) and parallel
    (run_execution_parallel) outer loops, right where the deterministic
    Scheduler::global_clock is advanced. The clock is the retired-instruction
    monotonic global_clock, so every guest-visible time value stays a pure
    function of guest progress (lockstep byte-reproducible).
  * Cadence: 1 global_clock unit = 100ns (coherent with parse_timeout, which
    divides 100ns timeouts by 100 onto the same basis), so
    INSTRUCTIONS_PER_MS = 10_000. tick_count now advances 0 -> ~4999ms over
    a 50M-instruction window. Also make KeQuerySystemTime read the same
    100ns clock instead of a frozen FILETIME constant.

Verification: tick_count at 0x40002010 now advances (deadline arm at
0x82450d0c stores clock+66 = 0x260,0x269,...,0x51d,... advancing, vs the
frozen 0x01d6bc4e before the fix). Determinism: two cold --stable-digest
runs are byte-identical; the n50m golden is UNCHANGED (the clock-affected
counter is not in the stable digest). 672/672 tests pass.

HONEST CAVEAT — the predicted render cascade did NOT materialize on this
branch. The diagnosed consuming gate at 0x82450b10 (the clock-vs-deadline
compare in the worker-hub channel loop sub_82450A68) is unreachable here:
the loop always branches away at 0x82450b0c ([this+220] >= channel-index),
so the hub already dispatches sub_82450B68 342x in BOTH the frozen and
fixed builds. Guest trajectory (imports 339766@50M / 1738001@200M /
9212446@1B), draws (0), swaps (2) and thread topology (tid14 Ready, not
blocked on 0x109c) are identical frozen-vs-fixed. This commit is therefore
a correct latent-clock-bug fix and determinism-safe prerequisite, NOT the
render unblock. The 0x109c/tid14 starvation premise was not reproduced at
f75bc96; the next gate must be re-localized.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-13 11:54:44 +02:00
MechaCat02
de21c7a544 [iterate-2G] db16cyc spin-hint cooperative yield: unblock title-screen 0x10a0 gate
The silph title state machine (tid13) blocked on event 0x10a0, never signaled.
Root: the event's producer chain runs on the silph worker (entry 0x821C4AD0,
our tid14), which was starved. tid14 shares a HW slot with a guest spinlock/
barrier participant (sub_824D1328, entry 0x824D2940) that busy-spins on the
db16cyc hint `or r31,r31,r31` (encoding 0x7FFFFB78) at 0x824D140C. Under our
round-robin lockstep the spinner consumed its whole block every round and
starved the co-located tid14 (only 9 progress hits over 200M instr) — so the
producer never reached the event-create/duplicate/signal dance the canary
oracle performs (handle F80000E8 set by the submitter F8000044 via a duplicated
handle).

Fix (canary-faithful): recognize the db16cyc spin hint exactly as canary's
InstrEmit_orx does (code 0x7FFFFB78 -> DelayExecution) and surface it as a new
StepResult::Yield. The scheduler's yield_current() promotes every Ready peer on
the slot past STARVE_LIMIT so begin_slot_visit picks one next round, then they
reset and the spinner reclaims the slot — fair alternation, no priority
inversion, pure function of slot state (deterministic).

Result (lockstep, cache-persist, -n 200M): tid14 progresses past its old stall
into a real wait; tid13 advances off 0x10a0 to a new event; hub/submitter
re-enter their wait loops. imports 280k->592k, packets 124M->164M, swaps 1->2.
draws still 0 (the splash's first draw is a further-upstream gate).

Determinism preserved (two cold n50m runs byte-identical). n50m golden
re-baselined (imports 90296->339766, swaps 1->2; draws unchanged 0). n2m
golden unchanged (db16cyc not reached in first 2M). Tests 670/670.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-13 10:38:17 +02:00
MechaCat02
f3b7e8b760 [iterate-2F] Scheduler anti-starvation floor: fix job-4 handoff render gate
The lockstep scheduler's pick_runnable is strict priority
(max_by_key (priority, -idx)). On a cooperative single-host HW slot,
a CPU-bound spinner that never blocks (the silph poll loop pinned by
affinity to hw=5) wins pick_runnable every round forever, permanently
starving a co-located peer (the submitter, tid6) that the spinner is
actually waiting on. On real hardware those threads run on separate SMT
contexts concurrently, so the spinner never starves the submitter; ours
collapses them onto one slot with no anti-starvation, turning priority
(or equal-priority index order) into permanent starvation.

The starved submitter never dequeued job-4 -> the worker-hub (tid5)
blocked INFINITE on completion event 0x1080 -> silph (tid13) wedged on
0x1078 -> no vsync -> draws_seen=0, the publisher splash never renders.
(decrement_quantum's within-slot rotation is dead: begin_slot_visit
unconditionally re-pick_runnable()s each round, discarding the rotated
running_idx. The fix is therefore evaluated at pick time, not via that
discarded rotation.)

Fix (Option A, bounded anti-starvation, deterministic):
- Add per-thread steps_starved counter to GuestThread.
- begin_slot_visit increments it for every Ready peer passed over this
  visit, resets it to 0 for the picked thread.
- pick_runnable selects by effective_priority: once steps_starved
  reaches STARVE_LIMIT (4096) the thread is lifted to i32::MAX and wins
  exactly one pick, then resets. The genuinely higher-priority thread
  still wins ~4095/4096 visits -- the boost grants periodic forward
  progress only, it does NOT invert priority. Pure function of
  counter/priority/index -> deterministic (no wall-clock, no RNG).

Cascade (lockstep exec, XENIA_CACHE_PERSIST=1, -n 200M):
- submitter dequeue sub_82458508 now fires 4x (was 3x); the 4th job
  (buf 0x40baa2c0) is dequeued at cycle 6.15M.
- hub tid5 leaves Blocked(0x1080) -> now Ready (no more INFINITE wait).
- GPU packets 0 -> 116,101,363 (command stream now flowing).
- tid13 (silph::UImpl) advances past the old 0x1078 wedge to a NEW
  downstream wait (handle 0x10a0); 3 new threads spawn (tid14/15/16).
- draws_seen still 0 -> the splash's first draw is a NEW downstream gate,
  not this starvation.

Determinism: two cold lockstep `check -n 5M` runs byte-identical (full
and stable digests). New n50m stable digest deterministic across two
cold runs. Golden re-baselined: instructions 50000007->50000003,
imports 92317->90296 (trajectory shift from the changed pick order).

Tests: 666/666 (+1 test_anti_starvation_bounded_progress).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-13 10:02:02 +02:00
MechaCat02
7e2603a9e5 [iterate-2E] Extend coherent monotonic clock to lockstep (timebase-desync livelock fix)
Lockstep livelocked the scheduler the same way --parallel did before
0332d19: the kernel deadline-arithmetic (`now_basis_at`) read per-thread
`ctx(hw_id).timebase`, but a parked/poll thread has `running_idx == None`
so `Scheduler::ctx()` returns `idle_ctx` (timebase 0). A poll thread (tid=7,
a `KeWaitForSingleObject` loop with a 30ms relative timeout) computing its
deadline via `parse_timeout` therefore read `now = 0` and registered
`deadline = 0 + 3000 = 3000` — a constant ~7.78M units in the past.
`coord_idle_advance` then re-armed that same constant 3000 deadline forever,
pinning virtual time and starving every other thread's real future deadline.

Render-gate impact: the submitter (tid=6) re-enters a 16ms-timeout
WaitForMultiple after its first jobs; that timeout never fired because vtime
was pinned at 3000, so virtual time never reached real future deadlines.

Fix (Option A — mirror the parallel fix): drive the existing deterministic
`Scheduler::global_clock` in lockstep too (floored up once per outer round
to `stats.instruction_count`, a pure function of retired guest instructions —
no wall-clock), and route `KernelState::now_basis_at` through `global_clock()`
in BOTH modes. New `Scheduler::advance_global_clock_to(now)` floor-up keeps it
monotone alongside `advance_all_timebases_to`. Parallel behavior unchanged
(it already read `global_clock()`).

Verified (lockstep, 50M):
- DETERMINISM: two cold `check -n 5M` and two cold `-n 50M` runs byte-identical.
- LIVELOCK GONE: "advanced to deadline" went from 592,679 fires / 2 unique
  values / 562,084 pinned at 3000  ->  18,586 fires / 18,567 unique /
  0 pinned, strictly increasing 5.4M -> 50M. Poll thread tid=7 now ends
  Blocked with a real future deadline Some(60002824) instead of spin-Ready
  on the past 3000.
- imports 1,790,936 -> 92,317 at 50M (the spin no longer burns import calls).

Cascade (lockstep, XENIA_CACHE_PERSIST=1, -n 200M): engine now runs to budget
instead of hard-deadlocking. Hub enqueue (sub_82458068) 4x; submitter dequeue
(sub_82458508) still 3x — the lost 4th-job HANDOFF (count/notify between
sub_82458068's tail and the submitter queue) is a SEPARATE downstream gate,
not the timebase. New gate: tid=5 (hub) Blocked INFINITE on event 0x1080
(job-4 completion); tid=6 (submitter) Ready, parked in WaitForMultiple
(sub_824AB214), loop-top stops at cycle 6.23M. draws still 0, VdSwap 1.

Golden re-baseline (same commit): sylpheed_n50m
  instructions 50000004 -> 50000007, imports 1790936 -> 92317
  (swaps/draws/RTs/shaders/textures unchanged). sylpheed_n2m unchanged
  (livelock onsets after 2M). Suite 665/665 + oracle green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-12 21:42:28 +02:00
MechaCat02
5aaadfec36 [iterate-2E] Add XENIA_AUDIT_DEREF pointer-chase probe
On each AUDIT-PC-PROBE fire, treat gpr[reg] as a base object, dump its
first 64 bytes, follow [base+off] to a sub-object, dump that, then follow
[[base+off]+0] to its vtable and dump 48 slots. Env-gated
(XENIA_AUDIT_DEREF=<reg>:<off>), read-only, lockstep digest unaffected.

Captures the live work-item + stream object + vtable at sub_824510E0
before the pool recycles the slot — which overturned the prior session's
"infinite spin" diagnosis: the streaming read PROGRESSES 68/68 128KB
chunks of a 9MB file, then the hub (tid=5) blocks INFINITE on a
self-created Event/Manual (0x1060) that is never signaled.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-06-12 20:29:01 +02:00
MechaCat02
0332d1990d [Track 2] Parallel-scoped global clock fixes timebase-desync livelock
In --parallel mode a long run livelocked: the scheduler spun
"advanced to deadline 3000 waking hw=2 idx=0" ~14k times in
microseconds. Root cause: each guest thread owns ctx.timebase
(+1/instr in step_block), and all kernel deadline arithmetic read
Scheduler::ctx(hw_id).timebase as "now". But the parallel worker
extracts its PpcContext via mem::replace(ctx_mut_ref, PpcContext::new())
— leaving a ZEROED timebase in the slot while it steps unlocked — and
advance_all_timebases_to only walks runqueue (never idle_ctx). So the
coordinator's coord_pre_round drain and a woken thread's parse_timeout
could read a zeroed/stale basis decoupled from the deadline the
scheduler just advanced to. The thread re-armed the same constant
deadline forever; the global clock never moved.

Fix: add a single monotonic Scheduler::global_clock, advanced by the
per-block retired-instruction count on each parallel writeback and
floored up by advance_all_timebases_to. Kernel deadline reads route
through KernelState::now_basis_at(hw_id), which returns global_clock
ONLY when parallel_active; lockstep keeps reading the exact pre-existing
ctx(hw_id).timebase expression, so the deterministic lockstep trace is
byte-identical (sylpheed_n50m golden unchanged, zero re-baseline).

Verified:
- 50M --parallel run completes (was: hung). Deadlines now strictly
  increasing 5.4M -> 49.1M (18097 unique of 18116; max repeat 2) vs
  pre-fix constant 3000 x ~14000.
- sylpheed_n50m golden byte-identical via plain `check` (no persist).
- Full suite 665/665 green.

Note: an intermittent parallel hang/crash (~1-2/20 at -n 5M) is
pre-existing (master 1/20, this build 2/20 — within noise) and distinct
from the timebase livelock: it is a parallel-race class (e.g. the
unsafe block_ptr deref in run_execution_parallel). Tracked separately;
lockstep remains the recommendation for long runs.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-12 19:32:14 +02:00
MechaCat02
341196a111 [Issue-1 PPCBUG-020] Word-form ALU ops produce full 64-bit results
Xenon is a 64-bit PPC core (32-bit *pointer* ABI, but 64-bit registers and
integer arithmetic). The interpreter was truncating every word-form integer
ALU writeback to 32 bits and zero-extending, on a false "MSR.SF=0 / 32-bit
ABI" premise. This silently corrupted any genuine 64-bit value flowing through
word-form arithmetic.

Confirmed load-bearing via runtime ours-vs-canary capture: Sylpheed's
millisecond->LARGE_INTEGER timeout converter sub_824ACA88 does
`clrldi; mulli r11,r11,-10000; std`. For a 16 ms wait the correct result is
-160000 = 0xFFFFFFFF_FFFD8F00 (relative). canary stores exactly that; ours'
truncating `mulli` stored 0x00000000_FFFD8F00 (positive) -> the i64 timeout
read as a huge *absolute* deadline -> a ~26000x over-wait that froze the main
frame loop. After the fix the timeout matches canary and the previously-frozen
frame/worker loops run (parallel boot NtWaitForMultipleObjectsEx 94 -> 30428;
KeWaitForSingleObject/critical-section loops resume).

Fix mirrors canary's INT64 emitters (ppc_emit_alu.cc) op-by-op for the 17
data-losing word-form ops: addis, addic(.), subfic(.), mulli, add(c/e/ze/me)x,
subf(c/e/ze/me)x, negx, mullwx. Only the result *writeback* widens to full
64 bit; the 32-bit carry (XER[CA]) and overflow (XER[OV]) computations and the
CR0 i32 view are preserved byte-identical (the low 32 bits of the new result
equal the old truncated result), so this is a strict no-op for clean 32-bit
values and only restores the previously-zeroed upper bits for genuine 64-bit
values. Genuinely-32-bit ops (rlwinm/slw/srw/cmpw, mulhw/divw whose upper bits
are ISA-undefined) are left untouched.

Updated 7 unit tests that asserted the truncation (they encoded the bug) to the
canary-correct full-64-bit values. Re-baselined the sylpheed_n50m golden
(imports 40454 -> 1790936: the unwedged frame/worker loops now cycle under the
instruction-count timebase); sylpheed_n2m unchanged (pre-frame-loop). Lockstep
determinism preserved (two 50M runs identical). Full suite 660/660.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-12 16:21:11 +02:00
MechaCat02
b20c99f141 [Subsystem-fixes] 6 verified ours-vs-canary divergence fixes
From the 2026-06-12 5-subsystem differential audit. All verified against
canary as oracle; 660/660 workspace tests green (655 + 5 new).

1. nt_create_event polarity (exports.rs) — `manual_reset = gpr[5] != 0`
   was INVERTED. Canary xboxkrnl_threading.cc:668 `Initialize(!event_type,..)`
   + xevent.cc:41 (type 0 = NotificationEvent = manual, type 1 = Sync = auto).
   Now `== 0`. Was the dormant 2.AI fix on chore/portable-snapshot, never
   merged. The Ke-path was already correct; only the Nt-path was wrong.

2. 2.AF deadline drain (main.rs coord_pre_round) — expired KeWait/KeDelay
   deadlines never fired under load because advance_to_next_wake_if_due was
   only called in coord_idle_advance (no-Ready-threads path). Added a
   per-round drain loop; covers BOTH lockstep and parallel outer loops since
   both call coord_pre_round. Was the dormant 2.AF fix, never merged.

3. handle slab-recycle ABA guard (state.rs + scheduler.rs) — release_handle_slot
   (my round-34 regression) recycled a closed slot even with a thread still
   parked on it, risking a stale-waiter wake when the slot is re-minted. Added
   Scheduler::any_thread_waiting_on; decline to recycle a still-waited slot.

4. vpkpx pixel-pack (vmx.rs) — wrong field mapping (~100% mismatch). Now
   exact canary ppc_emit_altivec.cc:1795 shift/mask (red 6b out[15:10] from
   w[24:19], green out[9:5] from w[14:10], blue out[4:0] from w[7:3]; no
   fabricated alpha bit). +unit test.

5. VFS GDFX attribute plumbing (vfs/*, exports.rs query fns) — VfsEntry now
   carries the real on-disc attribute byte (GDFX dirent +12, canary
   disc_image_device.cc:136/154) instead of inferring directory-ness from
   path shape. Query exports report the real FILE_ATTRIBUTE_* bits. Candidate
   driver of the XamShowDirtyDiscErrorUI gate. +tests.

6. MmGetPhysicalAddress region-aware mirror (exports.rs) — flat 0x1FFFFFFF
   mask missed canary's +0x1000 host_address_offset for 0xE0000000+ mirror
   (memory.cc:2317). Read-only query; proven byte-identical 50M digest. +test.

Investigated and intentionally NOT changed:
- zero-on-recommit: no-op; ours has no region-reuse path (bump allocators,
  free is a stub).
- 32-bit ALU writeback truncation (PPCBUG-020): documented-deliberate; premise
  (MSR.SF=0) is questionable but flipping it is out of scope here.
- KeSetEvent/NtSetEvent return value: ours returns true previous state
  (hardware-faithful); canary returns constant 1 — NOT an ours bug.

sylpheed_n50m golden will need re-baselining (legit behavior change).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-06-12 14:57:38 +02:00
MechaCat02
db90ad0f7d [AUDIT-059 R-D2] Phase D auto-signal POC confirms audit-049 wedge diagnosis
Hook NtCreateEvent for the silph::UImpl tid=13 chain (entry=0x821748F0,
start_context=0x4024a840, frame-1 LR=0x821CB15C inside sub_821CB030+0x128)
and auto-signal the resulting handle after XENIA_SILPH_UI_AUTOSIGNAL_DELAY
instructions. Env-gated; default off.

SR4 verdict B (partial unwedge):
- handle 0x1078 signal_attempts 0->1
- tid=13 Blocked(WaitAny[0x1078]) -> Ready pc=0x824a9108
- ExCreateThread 10 -> 12 (new silph::UImpl tid=14, worker tid=15)
- New downstream wedges 0x1084 + 0x1088
- cxx_throw runtime_error on tid=5 inside R26 dispatcher
  (BST not-registered instance lhs=0x715a7af0)
- VdSwap stays 1; no draws (POC is diagnostic, not final fix)

Confirms Phase C diagnosis end-to-end. The real signaler must (a) drive
NtSetEvent on the silph KEVENT AND (b) register the dispatcher's BST
instance upstream; this POC only does (a).

Reading-error class #20: ctx.lr at kernel export entry is the thunk
wrapper's return slot, NOT the guest caller's post-bl PC. Walk back-chain
1 step to get frames[1].lr.

Reading-error class #21: --parallel and lockstep have SEPARATE outer
loops in main.rs (run_execution_parallel line 2928 vs run_execution
line 2706). Per-round hooks must be wired in BOTH paths.

Files:
- crates/xenia-cpu/src/scheduler.rs: GuestThread.start_entry/start_context
  fields + spawn() population + current_thread_entry_and_ctx() helper
- crates/xenia-kernel/src/state.rs: AutoSignalPending struct, env-parsed
  silph_autosignal_delay, pending Vec, last_cycle_hint, set_now_cycle_hint,
  maybe_register_silph_autosignal (walks back-chain), fire_due_silph_autosignals
- crates/xenia-kernel/src/exports.rs: hook in nt_create_event
- crates/xenia-app/src/main.rs: fire-site + cycle hint in both outer loops
- audit-runs/audit-059-handle-disambiguation/round-D2-autosignal-poc/FINDINGS.md

Tests 655/655 green. Default behavior byte-identical when env unset.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-06-11 18:38:38 +02:00
MechaCat02
9340ff4592 [Audit] --audit-r3-dump-bytes: dump N bytes at r3 when probe fires
AUDIT-059 round 15 — diagnostic. When `--audit-r3-dump-bytes=N` is set,
every `--audit-pc-probe-hex` fire emits a paired `AUDIT-R3-DUMP` line
with N bytes of guest memory from r3 as u32 lanes (4-byte aligned, cap
256B). Sized for the 80-byte stack-local struct at sub_82452DC0's
`r31+96` (probe sub_8245B000 entry where r3 IS the struct ptr).

Settable via `XENIA_AUDIT_R3_DUMP_BYTES` env. Read-only; lockstep digest
unaffected (empty-set fast path in fire_audit_pc_probe_if_match).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-06-07 19:39:22 +02:00
MechaCat02
bcd018659b [Audit] --audit-mem-dump-chain: deref a guest address N levels for diagnosis
Round-14 of AUDIT-2BF (singleton-dump). The bctrl at sub_822F1AA8+0x90
(PC 0x822F1B4C) loads [0x828E1F08] (a global singleton), dereferences
its vtable, and indirect-calls vtable[0]. Canary returns; ours hangs.
To name the resolved target we need to dump the (singleton, vtable,
vtable[0]) chain on probe firing.

Adds `--audit-mem-read-hex` / `XENIA_AUDIT_MEM_READ` taking a single
guest VA. When set and any `--audit-pc-probe-hex` PC fires, the kernel
emits a paired `AUDIT-MEM-READ` line with three guest reads:

  AUDIT-MEM-READ addr=0x828E1F08 val=<*addr> vtable=<**addr> \
                 vtable[0]=<***addr+0> vtable[24]=<***addr+24> ...

`vtable[24]` is included as the slot-6 method (audit-059 round 9
documented the canary silph chain dispatching slot 6 of a vtable here).

Read-only; lockstep digest unaffected. ~30 LOC across state.rs and
main.rs. `cmd_check` opts out of the flag (same policy as the existing
audit_pc_probe_hex).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-06-07 12:13:42 +02:00
MechaCat02
09e59e09b7 Audit-2BF.delta: add --audit-pc-probe-hex for silph-init bctrl probe
Adds a per-PC probe analogous to --lr-trace / --branch-probe but tuned
for the silph init chain's virtual-dispatch site at sub_82172BA0+0x1E8
(PC 0x82172D88, the bctrl after a 3-deep `lwz` chain that loads vtable
slot 6). Each fire emits one AUDIT-PC-PROBE line with (pc, tid, hw,
cycle, lr, r3, r11) plus four guest-memory dereferences off r3 — the
vtable, slot-6 method pointer, auxiliary handle field, and embedded
sub-object vtable — so the line can be compared head-to-head with
canary's round-9 capture (r3=0xBCCC52C0, [r3+0]=0x820A3644,
slot6=sub_821B55D8, [r3+0xC]=0xF80000D8, [r3+0x30]=0x820A1870) to
identify whether ours dispatches to the wrong vtable on a correct
object (case A) or to a wrong object entirely (case B).

Why this addition rather than reuse of an existing probe: --lr-trace
emits JSONL designed for canary-side diffing and only captures
r3/r4/r5/r6/lr (no memory dereferences); --branch-probe captures CR
flags and lr but again no memory; --ctor-probe is single-shot per PC
and walks the stack back-chain. None of them load the four indirect
fields needed to identify a vtable-shape divergence.

Implementation:
  - state.rs: new HashSet<u32> field `audit_pc_probe_pcs` and helper
    `fire_audit_pc_probe_if_match(hw_id, mem)`. Empty-set fast-path
    keeps the cost to one is_empty() check per worker_prologue call
    when the flag is unused. Read-only — no guest state mutation,
    lockstep digest unchanged.
  - main.rs: new CLI flag --audit-pc-probe-hex with bare-hex comma
    parsing (tolerates `0x` prefix), settable also via
    XENIA_AUDIT_PC_PROBE env var. Threaded through cmd_exec_inner;
    cmd_check passes None so check digests are unaffected.

Probe wired into worker_prologue alongside fire_ctor_probe / fire_-
branch_probe / fire_lr_trace. Like its siblings, it fires once per
basic-block entry — known limitation (audit-045 reading-error class
13); use a block-entry PC if probing a mid-block instruction.

Verification: kernel 127/127, app 5/5 non-ignored, no behaviour
change with empty flag.

Cross-references audit-059 round 9's canary capture and lays the
groundwork for the round-10 ours-side comparison.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-06-07 10:59:03 +02:00
MechaCat02
5a8fe21ad5 Iterate-2.BF.γ: refine is_in_callback gate to per-thread exclusion
Lockstep vsync delivery was capped at 54/run despite the ticker firing
333 periods and dispatcher being called 1.2M times. Root cause: the
blanket `is_in_callback()` gate skipped dispatch entirely whenever the
async audio path held `interrupts.saved`, which is essentially the
entire boot (audio worker rarely hits its LR_HALT_SENTINEL between
back-to-back callbacks). 5.85M dispatch_skip_in_callback events drowned
out the 55 with-pending windows.

Graphics dispatch (iterate-2.BE) runs the ISR synchronously and
restores the borrowed context before returning — it doesn't touch
`interrupts.saved`. The only real conflict is if graphics picks the
*same* thread audio borrowed (which would stomp audio's
SavedCallbackCtx). Replace the blanket gate with per-thread exclusion:
when audio is mid-flight, exclude only its `injected_ref` from
victim selection. Falls through to the existing no-victim drop if
that's the only candidate.

Lockstep (50M instr): gpu.interrupt.delivered{source=0} 54 → 295
(5.5×), all 333 ticker periods either delivered or unarmed (no more
queue_full_drops). Wallclock unchanged ~3 s.

Parallel (30M instr): 1193 → 3458 baseline lift (2.9×), no regression.

Tests: xenia-kernel 127/127, xenia-app 5/5 non-ignored. Lockstep
goldens will drift (interrupts.delivered is in the digest); deferred
to next iterate.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-06-06 19:52:16 +02:00
MechaCat02
51489e34db Iterate-2.BE Path β: tick vsync from coord_idle_advance
The iterate-2.BE host-driven synchronous ISR dispatcher relies on
something queueing v-syncs. In lockstep that's `tick_vsync_instr`,
called from `coord_pre_round` per round. If the scheduler stalls into
`coord_idle_advance` (no Ready threads), the instruction counter
freezes — the accumulator stops incrementing, the ticker stops
queueing, and the dispatcher is left starved for the duration of the
idle wait.

Tick `tick_vsync_wallclock` at the top of `coord_idle_advance` so
v-syncs keep firing on host time even when the guest scheduler is
parked. The dispatcher in the outer loop drains whatever we queue on
the next iteration. Same MMIO `D1MODE_VBLANK_VLINE_STATUS` bit-set as
the production path.

Note: empirically in Sylpheed at 50M/500M instruction horizons,
`coord_idle_advance` is never reached (tids 9/10/12 stay Ready through
the early-boot deadlock), so this commit doesn't move
`gpu.interrupt.delivered{source=0}` off 54 for this title at these
horizons. It is the correct fix for the documented starvation pattern
and will activate as soon as the kernel reaches a state where Ready
threads drop to zero with timers/waits pending.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-06-06 19:22:03 +02:00
MechaCat02
9a93152981 Iterate-2.BE: host-driven synchronous graphics ISR delivery
Replaces the victim-thread-mutate-then-wait scheme for vsync / CP
interrupts with synchronous in-line dispatch on the coordinator host
thread. Mirrors canary's EmulateCPInterruptDPC -> Processor::Execute
path (kernel_state.cc:1370, processor.cc:413): pick a guest thread,
borrow its PpcContext, jam ISR PC + args in, run the interpreter
inline until LR_HALT_SENTINEL, restore the borrowed context.

Why: audit-059 measured gpu.interrupt.delivered{source=0} = 54 over
3.9 s vs canary's 4712 over 30 s. Per-second shortfall ~11×. Old
asynchronous LR-sentinel injection (try_inject_graphics_interrupt)
needed a Ready or Blocked guest thread to land on; once the Sylpheed
main thread and worker threads all idled post-boot, no victim was
available and every queued vsync got dropped. Host-driven dispatch
decouples delivery from guest-thread readiness.

Smoke test (lockstep): unchanged 54 — under current Sylpheed boot
trajectory the ticker is gated by guest-instruction progress, not
victim availability; lockstep stalls into idle-advance after ~5M
instructions of real work and the synthetic tick_vsync_instr stops
firing. Under --parallel (wallclock ticker) gpu.interrupt.delivered
climbs to ~1131 over a 128 s run, confirming the synchronous
dispatcher itself works as intended. Architectural piece is now in
place; raising the lockstep delivery rate requires ticking the
synthetic vsync inside coord_idle_advance, which is a separate
change.

Changes:
- crates/xenia-kernel/src/interrupts.rs: doc-comment update only.
  SavedCallbackCtx + CALLBACK_STACK_PAD retained — the audio
  callback path (audit-048) still uses the asynchronous LR-sentinel
  inject on a dedicated per-client worker.
- crates/xenia-app/src/main.rs:
  * dispatch_graphics_interrupts(kernel, mem, &mut stats,
    &mut decode_cache, thunk_map): new fn. Drains the full FIFO per
    call. Victim selection same shape (Ready preferred, else
    Blocked, skip Idle/Exited/ServicingIrq), but the call is
    synchronous - we run step_cached + import-thunk dispatch inline
    on the borrowed ctx until pc == LR_HALT_SENTINEL.
    MAX_INSTRS_PER_ISR = 1M safety budget.
  * coord_pre_round: graphics-IRQ injection call removed. Audio
    path unchanged (still calls try_inject_audio_callback).
  * run_execution + run_execution_parallel: each now owns a
    persistent isr_decode_cache and calls
    dispatch_graphics_interrupts after coord_pre_round.
  * try_inject_graphics_interrupt: deleted (118 LOC).

No new public APIs, no new dependencies, no changes to xenia-cpu.

Tests: workspace 765 passed / 0 failed / 4 ignored (parallel_stress
+ sylpheed_n50m, all gated). Kernel 127/127, app 5/5, cpu 288/288.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-06-06 18:58:40 +02:00
MechaCat02
ac2f89a7bb Re-baseline sylpheed_n50m golden post-AUDIT-054
instructions: 50000002 → 50000001 (1-instr shift from FILE_DIRECTORY_FILE
plumbing on NtCreateFile path; all other digest fields unchanged —
imports/swaps/draws/render-targets/shaders/textures all match
prior golden).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 18:11:11 +02:00
MechaCat02
25704c5811 Re-baseline sylpheed_n50m golden post-AUDIT-032
Companion to 49f3eaf (AUDIT-032 dedicated audio worker). With the
audio callback ticker now on by default, the boot trajectory at
50M instr changes:

  instructions  50000009 -> 50000002  (interpreter stop boundary shift)
  imports         407215 -> 40454     (-90% — left audio-wait busy loop)
  swaps                2 -> 1         (degenerate splash repeat lost;
                                       main thread advances past splash)
  draws                0 -> 0         (audio gate != renderer gate per
                                       AUDIT-032 methodology correction)

The 10x imports drop reflects exiting the NtWaitForSingleObjectEx
busy-wait pattern (1.49M -> 30 calls per audit-runs/audit-048-*).
Boot now reaches Stfs/Xam content/crypto init phase. The single
remaining swap is the first splash; main thread is then blocked on
a different handle (0x1280) for follow-up.

sylpheed_n2m unchanged — at 2M instr the audio worker hasn't fired
yet, so the digest is byte-identical pre/post AUDIT-032.

Verified deterministic via two consecutive --expect runs at the new
digest (cargo test -p xenia-app --test sylpheed_oracles -- --ignored
passes in 2.82s).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 15:07:40 +02:00
MechaCat02
49f3eafa15 AUDIT-032: dedicated audio worker thread per client (Plan B)
Replaces APUBUG-PRODUCER-001's random-victim-hijack audio injection
with a dedicated per-client guest worker thread, mirroring xenia-canary's
apu/audio_system.cc:84-159 WorkerThreadMain pattern in xenia-rs's
threading model. Audio callback ticker is now safe to enable by default.

## What changed

- xenia-kernel/src/xaudio.rs: new XAudioState fields worker_handles +
  worker_refs (one slot per of XAUDIO_MAX_CLIENTS=8). Synthetic
  park-handle helper (0xF000_0000 | client_idx) — outside the normal
  alloc range so wake_eligible_waiters never finds it; the only
  legitimate state-flip is via try_inject_audio_callback.
- xenia-kernel/src/exports.rs: xaudio_register_render_driver spawns a
  64KB-stack guest thread (create_suspended=true) via
  state.scheduler.spawn after registration succeeds. Immediately flips
  the spawned thread's state from Blocked(Suspended) to
  Blocked(WaitAny[synthetic]) so it's parked but not woken. Stores the
  kernel handle so find_by_handle resolves a fresh ThreadRef after slot
  compaction. Failure paths log + leave xaudio.worker_refs[i] = None,
  in which case the ticker drops fires (no random-victim fallback).
- xenia-app/src/main.rs: try_inject_audio_callback resolves the worker
  via worker_handles[index] instead of scanning runqueues for a Ready
  or Blocked victim. The PC+r3 injection and SavedCallbackCtx capture
  are unchanged; the existing LR_HALT restore path re-blocks the
  worker on its synthetic handle for the next tick. Flag handling
  reworked: --xaudio-tick / XENIA_XAUDIO_TICK now act as explicit
  override (truthy = force on, falsey = force off, absent = use the
  KernelState default).
- xenia-kernel/src/state.rs: xaudio_tick_enabled default flipped from
  false to true. Pre-fix it was off because the random-victim hijack
  regressed swaps=2->1; with the dedicated worker that whole class of
  regression is gone.

## Cascade verification at -n 500M (audit-runs/audit-048-audio-host-pump/)

Pre-fix baseline: audit-runs/audit-047-gamma-wedges/ours-end-state.log.

| Dim | Predicted (AUDIT-032)               | Observed                        |
|-----|-------------------------------------|---------------------------------|
| A   | tid=9 leaves Blocked[0x828A3254]    | Ready @ pc=0x824d1404           |
| B   | tid=10 leaves Blocked[0x828A3230]   | Ready @ same pc/lr              |
| C   | XAudioSubmitRenderDriverFrame > 0   | Mixer setup path executed       |
| D   | KeReleaseSemaphore 0 -> non-zero    | 0 -> 1; xaudio.callback.delivered=1 |

Bonus: audit-042's tid=6 worker pair on 0x10A0+0x10A4 also went
Blocked->Ready as a downstream effect.

Boot trajectory shifted significantly: NtWaitForSingleObjectEx
1,489,791 -> 30; NtSetEvent 3,334 -> 68; new exports firing
(StfsCreateDevice, ObCreateSymbolicLink, XamContentCreateEnumerator,
XamEnumerate, XamTaskSchedule, ExCreateThread x10, KeSetAffinityThread x7,
NtCreateSemaphore x4, NtWaitForMultipleObjectsEx x94, NtDuplicateObject x14,
XeCryptSha, XeKeysConsolePrivateKeySign). The system left the
audio-wait busy loop and entered the savegame/content/crypto init phase.

swaps regressed 2 -> 1 (degenerate splash repeat lost; main thread now
advances past splash entirely, blocked on a different handle). draws
unchanged at 0 — expected per AUDIT-032 (audio gate != renderer gate).

## Tests + scope

- cargo build --release succeeds, no new warnings.
- cargo test -p xenia-kernel --lib: 127/127 pass (incl. xaudio).
- cargo test -p xenia-app --lib: 5/5 non-ignored pass.
- Lockstep goldens (sylpheed_n2m / sylpheed_n50m) WILL drift on this
  fix and need re-baselining as a follow-up commit.

75 net non-comment LOC across 4 files, well under AUDIT-032's
60-120 LOC budget.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 15:06:25 +02:00
MechaCat02
e428ce33aa M9.5 + M11.5 + VMX + SJIS/UTF-8: close the post-M5.5 deferred set
Closes the four remaining deferred follow-up items in one bundle.
All four are smaller-scope and additive; lockstep determinism
unaffected (analyzer-only changes).

## M9.5 — __CxxFrameHandler scope-table parsing

- New `xenia_analysis::eh_scope` module. Magic-scans .rdata for the
  three documented MSVC FuncInfo signatures (0x19930520/21/22) on
  4-byte alignment. Each match is parsed as the documented struct
  (BE u32 fields), with sanity caps on max_state / n_try_blocks /
  pointer validity.
- Walks pUnwindMap (UnwindMapEntry, 8 bytes) and pTryBlockMap
  (TryBlockMapEntry, 20 bytes) into one row each.
- New tables eh_funcinfo, eh_unwind_map, eh_try_blocks.
- Sylpheed yield: 2,588 FuncInfo (all version 0x19930522) /
  10,019 unwind entries / 315 try-blocks.

## M11.5 — Static-init driver chain detection

- New `xenia_analysis::static_init` module. Walks every function
  looking for the canonical _initterm loop: lwz cursor; mtctr;
  bcctrl; addi cursor, cursor, 4 bounded by a compare against another
  constant register. Extracts (array_start, array_end) and reads
  the array.
- Reuses `function_pointer_arrays` table — drivers' arrays land with
  kind='static_init' (replacing M11's prologue-heuristic output where
  the structurally-grounded pattern fires).
- Sylpheed yield: 0 drivers detected — the binary's static-init
  structure does not match the canonical CRT loop. Infrastructure
  ready; future M11.6 can relax.

## VMX vector-store xrefs (M6 follow-up)

- Adds AltiVec/VMX X-form load/store XOs to the M6 opcode-31
  dispatch: lvx/lvxl/lvebx/lvehx/lvewx (reads) and
  stvx/stvxl/stvebx/stvehx/stvewx (writes), all addr_mode=
  'x_form_indexed'. Static resolution still requires both rA and rB
  constant.
- Sylpheed yield: 110 newly-detected stvx writes.

## Shift_JIS + UTF-8 localised-string detection (M7 follow-up)

- Extends `xenia_analysis::strings::analyze` with scan_shift_jis (JIS
  X 0208 lead/trail byte ranges + half-width katakana pass-through)
  and scan_utf8 (2- and 3-byte sequences). At least one multi-byte
  unit required so pure-ASCII strings aren't double-counted.
- SJIS bytes rendered as \xHH escapes for diagnostic readability;
  full SJIS→UTF-8 decoding deferred.
- Sylpheed yield: 790 Shift_JIS strings (Japanese debug + UI text)
  + 39 UTF-8.

## Tests

- +2 EH (parses_minimal_funcinfo_v0, rejects_bogus_max_state)
- +2 static_init (detects_canonical_initterm_loop, rejects_function_without_pattern)
- +2 strings (detects_shift_jis_string, detects_utf8_multibyte_string)

Tests 649→655 (+6 unit tests). DB schema golden + write_analysis_results
signature updated for new EH parameter.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 00:36:53 +02:00
MechaCat02
56ffa40a6a M5.5: this-flow indirect-dispatch resolution via vptr-write inference
Closes the dominant case M5 could not resolve — `lwz vt, off(this);
lwz fn, slot(vt); mtctr; bcctrl` (real C++ dispatch). Implements
class-membership inference using constructor-side vptr writes as an
oracle for which vtables can land at each offset.

## Algorithm

Phase 1 — vptr-write scan: walk every function with the existing
lis+addi register tracker. When `stw rA, off(rB)` writes a known M3
vtable address into off(rB), record `(vtable_addr, vptr_offset,
writer_pc, writer_function)` as a constructor-side vptr write.

Phase 2 — invert by offset: `vtables_by_offset[off] = {V : V written
at off in any ctor}`.

Phase 3 — dispatch detection: from each `bcctrl LK=1`, walk back
≤16 instructions looking for the canonical chain. Bail on register
clobber, branch, or label (basic-block) boundary.

Phase 4 — edge emission: for `(dispatch_pc, vptr_off, slot)`, emit one
`xrefs.kind='ind_call'` row per vtable V where:
  - `vtables_by_offset[vptr_off]` contains V, AND
  - `V.length > slot` (V actually has a method at that slot)

Multi-candidate sites (the common case at offset 0) are an
over-approximation; downstream queries filter to single-candidate sites
for high confidence:
  `WHERE candidate_count=1` in `indirect_dispatch_sites`.

## Schema

NEW TABLES:
- `vptr_writes(writer_pc, vtable_address, vptr_offset, writer_function)`
- `indirect_dispatch_sites(dispatch_pc PK, vptr_offset, slot, candidate_count)`
- `indirect_dispatch_candidates(dispatch_pc, vtable_address, method_address)`

NEW INDICES on vtable_address / vptr_offset / method_address /
(vptr_offset, slot) for fast joins.

## Sylpheed yield

- 567 vptr writes / 214 vtables / 29 offsets (offset 0 = 88%).
- 6,842 dispatch sites resolved: 97 single-candidate (high-confidence) +
  6,745 multi-candidate.
- 687,963 ind_call xref rows.
- 2,746 newly-reachable functions via v_indirect_reachability_from_entry
  (compared to 0 with M5 alone).
- Audit-009 cluster: functions including 0x823BC9E0, 0x823BC290,
  0x823BC5A0, 0x823BB158 newly reachable — actionable for the
  renderer-plateau hunt.

Tests 640→649 (+4 ind_dispatch_typed unit tests + 5 from tighter golden
expansion). Schema golden + write_analysis_results signature updated.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 23:35:05 +02:00
MechaCat02
77034b6cbf audit-038: persistent cache:/* VFS via host-FS backing
Replaces the "Synthesized empty file" stub for cache:/* paths with a
real host-FS HostPathDevice-style mount. Each KernelState gets a fresh
per-process tmpdir under /tmp/xenia-rs-cache-<pid>-<id>/ which is
cleared on init for lockstep determinism (mirrors canary's
xenia_main.cc:649 RegisterSymbolicLink("cache:", "\\CACHE") +
HostPathDevice in xenia-canary/src/xenia/vfs/devices/host_path_device.cc).

NtCreateFile now honours create_disposition for cache: paths:
  FILE_OPEN          -> NOT_FOUND if missing
  FILE_CREATE        -> NAME_COLLISION if present
  FILE_OPEN_IF       -> open or create
  FILE_OVERWRITE_IF  -> create or truncate
  FILE_OVERWRITE     -> NOT_FOUND if missing, else truncate
  FILE_SUPERSEDE     -> create or truncate

NtReadFile / NtWriteFile / NtSetInformationFile (XFileEndOfFileInformation)
/ NtQueryInformationFile / NtQueryFullAttributesFile route through
std::fs against the per-handle host_path; non-cache paths keep their
legacy semantics (read-only disc image, synth-empty stubs).

Verified by audit-037 cascade:
- sub_82459D18 (cache-miss restore): 0 fires (was firing constantly)
- sub_8245D230 (resize/zero-fill):  0 fires (was firing constantly)
- 105+ real cache-file writes per 500M run; 4+ MB of game data persisting
  to disk per boot; cache:/recent, cache:/access, cache:/d4ea*.tmp, etc.
- Lockstep deterministic at instructions=100000004 / imports=987485
  across 3+ reruns (digest shifted as expected; goldens re-baselined).
- swaps=2 plateau still in place; cluster L1 unactivated. Cascade
  dimension D (cluster activation) — UNKNOWN, no L1 fires.

Tests 640 -> 645 (+5 cache-specific unit tests; full workspace green).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-09 14:34:27 +02:00
MechaCat02
5af792c9fc M8+M9+M10+M11+M12: LOW-tier milestones — funcptr-arrays, EH flag, TLS, lr-trace
Five LOW-priority milestones bundled. Total ~700 LOC across 11 files.

## M9 — has_eh derived from pdata.flags exception bit
- New `functions.has_eh BOOLEAN NOT NULL` column. Derived from M1's
  already-parsed `pdata.flags` (bit 31 of the packed word — the
  exception-handler-present flag, distinct from bit 30 which is the
  always-1 32-bit-code flag). Index idx_functions_has_eh.
- Sylpheed: 2,975 of 23,073 pdata-validated functions have EH (12.9%).

## M10 — .tls section / IMAGE_TLS_DIRECTORY32 parser
- New `xenia_xex::tls::parse_tls` parses the directory + zero-terminated
  callback array. Returns None when the binary has no .tls section.
- New `tls_info` (singleton row) + `tls_callbacks(slot, address)` tables.
- New `DbWriter::write_tls()` no-ops on None.
- Sylpheed has no .tls section → 0 rows; infra ready for binaries with
  __declspec(thread).

## M8 + M11 — function_pointer_arrays (dispatch tables + static initialisers)
- New `xenia_analysis::funcptr_arrays::analyze` widens M3's vtable scan:
  detects runs of ≥2 function pointers in .rdata and classifies each as
  `vtable` (M3 re-emit), `dispatch_table` (M8), or `static_init` (M11)
  via a constructor-prologue heuristic (mfspr + small stwu).
- New tables `function_pointer_arrays(address PK, length, kind)` and
  `function_pointer_array_entries(array_address, slot, function_address)`.
- Sylpheed: 722 vtables + 388 dispatch_tables = 1,110 arrays / 6,347 slots.
  0 static_init detected (Sylpheed's ctors don't all match the
  conservative heuristic; M11.5 future work can chain via the entry-
  point's static-init driver).

## M12 — --lr-trace runtime canary-diff harness
- New CLI `exec --lr-trace=PC[,PC,...]` and `--lr-trace-out=PATH` flags.
  Symbolic resolution (Class::method, Class::*) via M4 lookup. Env vars
  XENIA_LR_TRACE / XENIA_LR_TRACE_OUT also work.
- New `KernelState::lr_trace_pcs` + `lr_trace_writer` + helper
  `fire_lr_trace_if_match(hw_id)` invoked from the per-instr probe slot.
- JSONL output: pc/tid/hw/cycle/r3/r4/r5/r6/lr — superset of what
  xenia-canary's --log_lr_on_pc patch emits, with a cycle counter for
  cross-run reproducibility. Diff-friendly via `jq`.
- Lockstep digest unaffected: smoke test on entry-point PC fires once
  with cycle=0/lr=BCBCBCBC/all-GPR-zero (correct initial state).

Tests 636→640 (+2 TLS tests, +2 funcptr_arrays tests). Schema golden
updated for new tables + has_eh column. Lockstep determinism preserved
(instructions=2000005 ×2 reruns identical).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 22:29:35 +02:00
MechaCat02
38d8871e8d M6: addr_mode column on xrefs + extended store/load classes
Adds finer-grained addressing-mode classification to every data xref row
plus new dispatch for instruction families not previously emitted:
- New `xrefs.addr_mode VARCHAR NULL` column. NULL for control-flow edges
  (call / ind_call / j / br); one of d_form / lis_addi / lis_ori /
  multiword / x_form_indexed / x_form_byterev / atomic / dcbz for data
  edges. Index idx_xrefs_addr_mode.
- New `xenia_analysis::xref::AddrMode` enum + Xref::addr_mode field.
- Opcode 46/47 (lmw/stmw) expand to one xref per slot — D-form multi-word
  load/store now resolves all (32-rS) consecutive addresses.
- Opcode 31 X-form dispatch — stwx/stbx/sthx/stwux/stbux/sthux/stdx/stdux,
  lwzx/lbzx/lhzx/lhax/lwzux/lbzux/lhzux/lhaux/ldx/ldux,
  stwcx./stdcx. (atomic),
  stwbrx/sthbrx/lwbrx/lhbrx (byte-reverse),
  dcbz (cache-line clear).
- X-form rows are emitted ONLY when both rA and rB resolve to known
  constants (rare but present); the dominant runtime-indexed pattern
  remains correctly skipped.

Sylpheed yield (regen on master + merge):
- 442 newly-detected x_form_indexed reads (lwzx/lhzx into static tables).
- 40 newly-detected atomic writes (stwcx./stdcx. with resolvable address).
- 28,834 lis_addi refs, 18,485 d_form reads, 3,288 d_form writes — every
  pre-existing data row now tagged.
- 0 multiword / dcbz / byterev (these instructions exist but aren't on
  lis+addi-tracked code paths).

Tests 633→636 (+3 xref unit tests covering AddrMode tag uniqueness,
data-edge addr_mode round-trip, control-edge None invariant). Schema
golden updated (xrefs gains addr_mode column).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 21:38:47 +02:00
MechaCat02
ab4fe211e5 M5+M7: indirect-dispatch reachability + .rdata string detection
Two MEDIUM milestones bundled (both opportunistic per plan; both small).

## M5 — indirect-dispatch reachability

- `xenia_analysis::indirect`: per-basic-block register tracker over each
  detected function. Recognises the canonical static-vtable pattern
  `lis+addi → lwz off(rA) → mtctr → bcctrl` where rA holds a known M3
  vtable address. Emits one `Xref { kind: IndirectCall }` per resolvable
  bcctrl site.
- PowerPC ABI awareness: `bl`-style calls clobber volatile r0..r12 + ctr
  but preserve non-volatile r13..r31, so a vtable pointer parked in r30/r31
  before a call survives.
- Label-based basic-block boundaries kill register state — bounds
  false-positive risk for jump-IN paths.
- New `XrefKind::IndirectCall` variant (DB tag `'ind_call'`).
- New SQL view `v_indirect_reachability_from_entry` — strict superset of
  `v_reachability_from_entry`, taking `ind_call` edges in the BFS.

Sylpheed yield: 0 edges detected. The binary's 1,001 static lis+addi
references into vtables are nearly all constructor-side vptr writes, not
dispatches; real method dispatch goes through `this->vptr` which requires
alias analysis we explicitly don't do. Documented in SCHEMA.md as the
expected limitation. Three unit tests cover the synthetic-correctness path.

## M7 — string / constant-pool detection

- `xenia_analysis::strings`: scans `.rdata` for runs of ≥ 6 printable
  ASCII bytes (NUL-terminated) and ≥ 6 UTF-16LE code units (basic-plane
  printable ASCII, NUL u16 terminator).
- New `strings(address PK, encoding, length, content)` table + encoding index.
- Implicit cross-ref via existing `xrefs.kind='ref'` rows whose target
  matches a strings.address.

Sylpheed yield: 6,311 ASCII strings (including embedded HLSL shader source
and AS_CB_SURFACE_SWIZZLE_* assertion strings). 9,132 lis+addi sites
cross-reference detected strings — names source PCs near each string in
one query. Four unit tests cover encoding detection, NUL termination, and
short-run rejection.

Tests 626→633 (+3 indirect, +4 strings).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 21:22:50 +02:00
MechaCat02
4ff08f6116 M4: class-aware probe tokens via M3 vtable+method tables
CLI extension only — no schema change. Adds symbolic resolution for
--pc-probe / --branch-probe / --ctor-probe tokens:
- `0xADDR` / `2186674160` — numeric (current behavior, no DB load).
- `Class::method` — joins classes × methods × demangled_names.
- `Class::*` — joins classes × methods (all slots).
- `function_name` — falls back to functions.name for free functions /
  saverestore stubs / labels.

New `xenia_analysis::lookup::resolve_probe_token(db_path, token)` opens the
DB read-only ONLY when a token is non-numeric, so legacy numeric flows pay
no IO. New `--probe-db PATH` flag (or `XENIA_PROBE_DB` env / default
`sylpheed.db` next to the .iso) selects the DB.

Symbolic resolution happens BEFORE any guest exec, so it cannot affect the
lockstep digest. Verified deterministic across two reruns at -n 2M
(instructions=2000005 identical).

End-to-end smoke test on Sylpheed: `--pc-probe='ANON_Class_6B674251::*'`
resolves to all 45 method PCs of that anonymous class (matching the
methods-table row count for that vtable).

Tests 621→626 (+5 lookup unit tests covering numeric passthrough,
symbolic-without-DB error, Class::method resolution, Class::* expansion,
and functions.name fallback).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 20:22:21 +02:00
MechaCat02
1d6c51fbf8 M3: vtable scan + MSVC RTTI walk + 3 new tables
Adds detection of statically-allocated MSVC vtables in .rdata/.data:
- New `xenia_analysis::vtables` walks read-only sections looking for runs of
  ≥3 contiguous big-endian u32 values where each value lands on a known
  function start (from M1's corrected functions table). 2-slot runs are
  rejected to keep false-positive rate down.
- For each candidate the MSVC RTTI walk vtable[-1] → CompleteObjectLocator
  → TypeDescriptor → mangled name is attempted; on success the demangled
  class name is recorded along with a best-effort RTTIClassHierarchyDescriptor
  walk to fill base_classes_json. On failure (RTTI stripped — common for
  shipped game binaries) the class is named ANON_Class_<fnv1a-hash> keyed
  by sorted method-PC list, so identical vtables collapse to one entry.
- DB: new tables `vtables`, `methods`, `classes` with indices on
  function_address and rtti_present. `write_analysis_results` takes a
  `&[Vtable]` slice; `write_disasm` (back-compat) passes empty.
- cmd_dis wires the scan after xref analysis using
  `func_analysis.functions.keys()` as the function-start oracle.

Validation on Sylpheed (RTTI stripped, as expected): 722 vtables / 499
unique classes / 5571 methods. Sanity invariant: every methods.function_address
joins to functions.address (0 broken refs). Largest vtable: 131 slots.

Tests 617→621 (+4 vtable unit tests covering 3-slot detect, 2-slot reject,
synth name stability, and synth name divergence).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 20:17:45 +02:00
MechaCat02
70120465a3 M1: parse .pdata RUNTIME_FUNCTION; cross-validate function boundaries
Adds an authoritative function-boundary source from the linker:
- New `xenia_xex::pdata` parses .pdata 8-byte entries (BeginAddress + packed
  prolog/length/flags). Bit layout per Microsoft PE32 PowerPC spec: prolog in
  bits 0..7, function_length in bits 8..29, flags in 30..31.
- `func::analyze_with_pdata` unions pdata BeginAddresses into the candidate
  set, attaches `pdata_validated`/`pdata_length` to each `FuncInfo`, and trims
  any function whose `end` overlaps the next start (catches mis-merge where
  one row spanned two prologues — the audit-031 sub_824D23B0/sub_824D29F0
  case).
- DB: extends `functions` with `pdata_validated BOOLEAN`, `pdata_length BIGINT`;
  new table `pdata_entries`; index on pdata_validated.
- New `crates/xenia-analysis/SCHEMA.md` documents M1 layer + forward work.

Validation on Sylpheed: 25481 functions (was 12156) / 23073 pdata_validated /
0 orphans / 0 mis-merges. Audit-031 mis-merge resolved: sub_824D29F0 now has
its own row with `pdata_length=280` (70 dwords); sub_824D23B0 now correctly
ends at 0x824D2878 (`pdata_length=1224` matches prologue walk).

Tests 605→610. New 5-test pdata unit suite covers bit layout + sentinel +
out-of-range filtering + real-world layout round-trip.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 19:44:02 +02:00
MechaCat02
690943ceef gate dump-section reads on is_mapped; trim doc comments
Without the page-state guard, read_bulk faulted on PROT_NONE pages of
the 4 GiB host reservation. Per-page is_mapped check skips uncommitted
pages, leaving the buffer's leading zero bytes in place. Total LOC
budget after trim: 70.
2026-05-07 21:45:54 +02:00
MechaCat02
412ba858b4 move dump-section flush above quiet gate so it fires under --quiet runs
The headless cmd_exec path passes quiet=false in normal use but the
diagnostic --dump-section is independent of the chatty thread/dump
prints, so it should not be gated by --quiet. Lockstep digest preserved.
2026-05-07 21:42:33 +02:00
MechaCat02
08d41cf2fc add --dump-section=BASE:LEN:PATH for end-of-run guest memory snapshot
Drives byte-level memory diffs against canary's Memory::Save dump.
Hot-path zero-cost when absent; lockstep digest unaffected
(instructions=100000003 deterministic across reruns).
2026-05-07 21:40:45 +02:00
MechaCat02
978a6950d1 feat(memory): --mem-watch=ADDR per-store writer trace
Adds an opt-in diagnostic that emits one tracing line per guest store
overlapping any armed byte address, naming the writer (tid, pc, lr)
plus old/new u32 lanes. Mirrors the --pc-probe / --branch-probe shape;
pc/lr are stamped from worker_prologue via a thread-local Cell, so
default runs (empty watch set) take a single is_empty() check on each
write. Lockstep digest preserved (instructions=100000003 across reruns,
sylpheed_n50m.json golden byte-identical).

Diagnostic infra only; no functional change. Used to identify producers
of dispatch-state writes for the audit-017 / audit-019 hunt.
2026-05-06 21:00:20 +02:00
MechaCat02
76dfe7fd7a fix(kernel): KRNBUG-KE-001 — real KeResumeThread per canary mirror
Replace the no-op cookie-returner with a real impl per canary
xenia-canary/src/xenia/kernel/xboxkrnl/xboxkrnl_threading.cc:216-227
(XObject::GetNativeObject<XThread>()->Resume()). Mirrors
nt_resume_thread plumbing two functions below:
resolve_pseudo_handle -> scheduler.find_by_handle -> resume_ref.

Returns STATUS_SUCCESS if the KTHREAD-pointer-as-handle resolves,
STATUS_INVALID_HANDLE otherwise — matches canary's Resume()/!thread
return semantics.

Cascade-prediction scorecard (audit-018 -> post-fix):
- A PASS: tids 9 (entry=0x824D2878) and 10 (entry=0x824D2940)
  leave Suspended -> run prologue -> park on audio buffer-completion
  semaphores 0x828A3254 / 0x828A3230.
- B PARTIAL FAIL: NtSetEvent 667->3334; KeReleaseSemaphore=0;
  XAudioSubmitRenderDriverFrame=0.
- C FAIL (predicted 2->1, actual 2->2): both ExTerminateThread +
  KeReleaseSemaphore still canary-only.
- D FAIL: gamma-cluster blocker unchanged — pc-probe at
  0x82184318/0x82184374 no fires; dump-addr 0x828F4070 no DUMP;
  signal_attempts on 0x1004/0x100c/0x1020/0x15e4 still 0.

Necessary-but-not-sufficient: workers unsuspend but park on a
downstream gate that's part of the audit-009/-016/-017 gamma cluster.

Tests 600 -> 601 (+ke_resume_thread_unblocks_suspended_worker).
Lockstep instructions=100000003 imports=987516 deterministic x2.
Goldens re-baselined: sylpheed_n50m.json instructions
50000003->50000011, imports 407255->407247.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-06 20:46:46 +02:00
MechaCat02
b78e6fd205 fix(kernel): KRNBUG-IO-004 — real XamNotifyCreateListener + XNotifyGetNext per canary
Canary's RegisterNotifyListener (kernel_state.cc:1013-1033) auto-enqueues four
startup notifications on the first listener whose mask covers kXNotifySystem
(SystemUI=0x09 + SystemSignInChanged=0x0A) and kXNotifyLive
(LiveConnectionChanged=0x02000001 + LiveLinkStateChanged=0x02000003). XNotifyGetNext
(xam_notify.cc:22-96) pops the queue with mask + version filtering on enqueue per
xnotifylistener.cc:38-51. Our prior stubs returned 0 forever; the dispatch loop
at 0x822f1be8 in sub_822F1AA8 was thus bypassed indefinitely.

Implementation:
- KernelObject::NotifyListener { mask, max_version, queue, waiters } variant.
- KernelState::has_notified_startup + has_notified_live_startup gates.
- xam_notify_create_listener: mask=r3 (qword), max_version=r4 (clamped <=10),
  alloc handle, conditional 4-tuple startup enqueue.
- xnotify_get_next: handle/match_id/id_ptr/param_ptr in r3..r6; pop_front
  (or scan-by-id), with mask + version filter applied at enqueue time.
- 5 unit tests covering: full-mask 4 startup notifications, second-listener
  no re-fire, system-only mask filtering, max_version=0 too-new drop,
  unknown handle returning 0.

Tests: 594 -> 599. Lockstep `-n 100M` instructions=100000012 deterministic
across 2 reruns; bit-identical run-to-run diff.

Cascade (verified at -n 500M):
- dispatch arm 0x822f1be8 fires; sub_82173DC8 entered.
- 3/21 renderer-cluster L1 PCs newly reached: 0x822c6870 (2 workers),
  0x824563e0, 0x823ddb50.
- canary-only export delta 7 -> 3 (reclassified to fired:
  KeResetEvent, ObCreateSymbolicLink, XamTaskCloseHandle, XamTaskSchedule).
- worker thread count 18 -> 20.
- signal_attempts on handle 0x15e0 = 1 (primary=1), was 0.
- draws=0 still expected at this step.

LOC: 119 (97 impl + 22 scaffolding pattern matches across main.rs / objects.rs
/ state.rs) <= 120.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-06 16:55:51 +02:00
MechaCat02
a1a7265f29 fix(kernel): KRNBUG-IO-003 — NtDeviceIoControlFile real impl mirroring NullDevice::IoControl
Replace the stub_success registration of NtDeviceIoControlFile at
exports.rs:90 with a real handler for FsCtlCodes 0x70000 (drive
geometry) and 0x74004 (partition info), mirroring xenia-canary
xboxkrnl_io.cc:645-678 + null_device.{h,cc}. The 16-byte 0x74004
response with cache_size=0xFF000 at OUT+8 is the gate that lets
sub_824ABD88 return SUCCESS and sub_824A9710 reach the priv-11
XexCheckExecutablePrivilege site identified by KRNBUG-AUDIT-007.

Stack args 9-10 (OutputBuffer, OutputBufferLength) read from the
caller's parameter save area at [sp+0x54] / [sp+0x5C] per the Xbox
360 PowerPC EABI (linkage area sp+0..sp+8, 8-quadword spill area
sp+0x14..sp+0x54, then stack args every 8 bytes). First HLE export
in the codebase to need 9+ args.

Cascade vs. KRNBUG-AUDIT-007 prediction (5/8 held):
- XexCheckExecutablePrivilege count 1 → 2 (priv=0xA + priv=0xB) ✓
- XamTaskSchedule count 0 → 1 ✓
- canary-only exports 7 → 3 (audit predicted ≤3) ✓
- 0x15e0 semaphore signal_attempts 0 → 1 (bonus)
- 0x100c worker spawn DID NOT fire (still UNCREATED) ✗
- 0x1004 signal_attempts unchanged ✗
- Worker spawn count unchanged at 19 ✗

Tests: 592 → 594. Lockstep deterministic at -n 100M (run1 ≡ run2 ≡
run3, byte-identical). instructions=100000010 → 100000019, imports
407417 → 987524 (+2.4×). swaps=2 draws=0 plateau persists.

sylpheed_n50m golden re-baselined instructions=50000004→50000003,
imports=407362→407255. sylpheed_n2m unchanged.

Still canary-only after this fix: ExTerminateThread,
KeReleaseSemaphore, XamUserReadProfileSettings. The next downstream
gate is somewhere past XamTaskSchedule's completion path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 22:00:12 +02:00
MechaCat02
c51f51f9cb feat(kernel): KRNBUG-AUDIT-007 — --branch-probe instrumentation; sub_824A9710 exit gate identified
Sister to --pc-probe / --ctor-probe but emits a single compact one-line
BRANCH-PROBE record per fire (pc, tid, hw, cycle, r3, lr, cr0/cr6 flags)
with no back-chain. Designed for tracing every conditional-branch fire
inside a candidate-gate function so the last PC reached before the
function epilogue identifies the exit branch.

Runtime trace at audit-runs/audit-007/sub_824A9710-trace.log decisively
identifies the priv-11 gate:

- Exit branch: 0x824a9944 (post bl sub_824ABD88 first call)
- Responsible kernel call: NtDeviceIoControlFile, FsCtlCode=0x74004
  (registered as stub_success at exports.rs:90)
- Mechanical chain: stub returns 0/SUCCESS without writing OUT, game
  reads [out_buf+8], finds zero, assigns hardcoded 0xC0000034
  (STATUS_OBJECT_NAME_NOT_FOUND) at sub_824ABD88:0x824abea8-ac, exits
  via 0x824a9944's lt branch before priv-11 site at 0x824a99a0.

592→592 tests; lockstep instructions=100000010, swaps=2, draws=0
deterministic across reruns. Read-only diagnostic — no fix this session.
Next session: KRNBUG-IO-003 (real NtDeviceIoControlFile per canary
NullDevice::IoControl for FsCtlCodes 0x70000 + 0x74004).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 21:35:10 +02:00
MechaCat02
bef9793aec feat(kernel): KRNBUG-IO-001 — NtReadFile on synth-empty file returns SUCCESS+0, not EOF
AUDIT-005's static attribution to sub_824ABA98 was wrong. The 0xC0000011
(STATUS_END_OF_FILE) at lr=0x824a97e4 traces to the NtReadFile call at
0x824a9810 inside sub_824A9710 — the cache-loader reads 1024 B from
offset 2048 of `\Device\Harddisk0\partition0`. Our synth-empty fallback
returned EOF (start_pos 2048 > size 0), so the function bailed via
RtlNtStatusToDosError before sub_824ABA98 was ever called.

Canary mounts partition0 to a NullDevice; `NullFile::ReadSync`
([null_file.cc:24-31](xenia-canary/src/xenia/vfs/devices/null_file.cc))
returns X_STATUS_SUCCESS with bytes_read=0 and never touches the
buffer. Sylpheed's caller pre-zeroes the 1024-byte stack buffer
(`memset(sp+208, 0, 1024)` at sub_824A9710 prologue), validates a
"Josh" magic on the first read, and falls back to the cache-recreate
path when the magic doesn't match.

The fix mirrors NullFile semantics: when the open synthesized a
zero-length file (`data.is_empty() && size == 0`), NtReadFile returns
SUCCESS with information=0 and the buffer untouched.

Effects (chain-of-effects verification at -n 500M):
  - tests: 590 → 591 (added regression covering NullDevice semantics)
  - lockstep: deterministic across 3 reruns (same instructions=100000010,
    swaps=2)
  - sylpheed_n50m golden re-baselined: instructions 50000004→50000000,
    imports 407416→407362
  - canary kernel-call diff: 10 → 7 missing exports
    (XeCryptSha + XeKeysConsolePrivateKeySign + NtDeviceIoControlFile
    now run; the cache-recreate path executes through to NtWriteFile)
  - boot reaches silph::Silph::Impl::OnInit: 19 worker threads spawn
    (was 6 before the fix)
  - parked-handle 0x1004 still signal_attempts=0; the original 0x100c
    and 0x15e0 are now <UNCREATED> because cascade walked past them and
    the handle assignments shifted; new parked sites: 0x12fc/0x1600/
    0x1040/0x10b8/0x15e8/0x1014/0x101c/0x10bc/0x1044
  - draws=0 plateau persists; renderer is multi-causal blocked

Next blocker: per the canary-only diff, XamTaskSchedule + the cluster
of XAM exports (XamTaskCloseHandle, XamUserReadProfileSettings,
ObCreateSymbolicLink) and the post-thread-exit chain (ExTerminateThread,
KeReleaseSemaphore, KeResetEvent) are the next-up frontier.
2026-05-04 20:20:10 +02:00
MechaCat02
19659d7f76 feat(kernel): KRNBUG-XAM-001 — XGetAVPack returns 8 (HDMI), not 0x16
Mirrors canary's cvars::avpack default (xam_info.cc:35) and Sylpheed's
accepted set {3,4,6,8} (xam_info.cc:250-251). With KRNBUG-XEX-001 having
flipped the priv-10 gate, XGetAVPack now reaches its caller in
sub_824AB578; returning 0x16 caused Sylpheed to abort the AV/crypto
block before XeCryptSha. Cascade walks one step (canary-only export
list 11 → 10); sub_824ABA98 is the next candidate.

Tests: 589 → 590. Goldens re-baselined (n50m: 50000005→50000004,
imports 407417→407416). Lockstep deterministic across 3 reruns at
-n 100M (instructions=100000010, import_calls=987686 +2.4×, swaps=2).
9-PC producer probe still 0×; parked handles 0x1004/0x100c/0x15e0
still signal_attempts=0.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 18:54:24 +02:00
MechaCat02
1a892d4641 feat(kernel): KRNBUG-XEX-001 — real XexCheckExecutablePrivilege from XEX header bitmap
Replace stub_return_zero with a canary-faithful implementation that
returns bit `priv` of the loaded XEX's XEX_HEADER_SYSTEM_FLAGS
(key 0x00030000) bitmap. Mirrors xenia-canary
xboxkrnl_modules.cc:22-39: `(flags >> priv) & 1` for priv < 32, else 0.

Plumbing:
- xenia-xex: header_keys::SYSTEM_FLAGS const + get_system_flags() accessor.
- xenia-kernel/state.rs: pub xex_system_flags: u32 + xex_priv_logged
  HashSet for one-shot per-priv tracing.
- xenia-app: kernel.xex_system_flags wired in cmd_exec_inner.
- xenia-kernel/exports.rs: real export body + unit test covering
  bits 10/11/0/64 + zero-flags case.

Sylpheed's bitmap is 0x00000400 (only XEX_SYSTEM_PAL50_INCOMPATIBLE,
bit 10). At -n 500M with the fix:
- XGetAVPack: 0 -> 1 (priv-10 gate at lr=0x824ab598 flipped).
- 10 other canary-only exports + 9 producer PCs + 3 parked handles
  unchanged. Priv-11 site at sub_824A9710 is downstream and still
  not reached — AV/crypto block aborts after XGetAVPack returns
  our placeholder 0x16 (canary returns 8/HDMI; Sylpheed accepts
  only 3/4/6/8 per xenia-canary xam_info.cc:250-251).

Tests 588 -> 589. Lockstep deterministic (3 reruns identical):
n50m goes 50000008 -> 50000005 instr / 407415 -> 407417 imp / swaps=2 /
draws=0. Goldens re-baselined (sylpheed_n50m, sylpheed_n2m); oracle
test green.

Full chain-of-effects + next-frontier hand-off in audit-findings.md
under KRNBUG-XEX-001.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 18:32:51 +02:00
MechaCat02
3e2fc1ec88 feat(kernel): KRNBUG-AUDIT-005 — --pc-probe extension + canary diff identifies XexCheckExecutablePrivilege stub cascade
Extends `--ctor-probe` machinery into `--pc-probe` (clap alias) with
the optional `PC@DISPATCHER:OFFSET` token form: on a hit, the helper
additionally logs `[disp+off]` — what the producer's
`lwz r3, OFFSET(r3)` is about to read. Reuses `parse_hex_u32`; both
flags share parser + storage.

Read-only diagnostic. Lockstep digest preserved (`run digest matches
golden` at -n 50M `--stable-digest`). 588 tests green.

Decisive findings (full deliverable in `audit-findings.md` /
`audit-runs/audit-005/`):

- Failure mode α confirmed for KRNBUG-AUDIT-004: all 9 producer call
  sites for handles 0x100c (5 sites) and 0x15e0 (4 sites) fire 0x at
  -n 500M. The producer code path is not reached.

- Set-diff of kernel-call sequences (canary.log oracle vs ours.log
  at -n 500M) identifies 11 exports canary calls and we don't:
  XGetAVPack, XeCryptSha, XeKeysConsolePrivateKeySign,
  ObCreateSymbolicLink, NtDeviceIoControlFile (×2),
  XamUserReadProfileSettings (×2), XamTaskSchedule, XamTaskCloseHandle,
  KeReleaseSemaphore (×268), KeResetEvent, ExTerminateThread (×2).

- XGetAVPack has exactly one caller (sub_824AB578 at 0x824AB5A0).
  The 4 instructions immediately preceding it are:
      addi r3, r0, 10            ; privilege bit 10
      bl   XexCheckExecutablePrivilege
      cmpli 0, r3, 0
      bc 12, eq, 0x824AB724      ; if r3==0, skip whole block

- exports.rs:193 registers XexCheckExecutablePrivilege as
  stub_return_zero. Always returning 0 -> guest takes the branch
  and skips the entire AV/crypto/save-data init block.

- The other call site (sub_824A9710 at 0x824A99A0) queries privilege
  11 with opposite polarity (bne) -> gates XamTaskSchedule on the
  privilege-NOT-set arm. With both stubs returning 0, the guest
  walks the wrong arm of every privilege-gated branch.

- This explains why the dispatcher fields read zero
  ([0x828F3D08+0x50]=0, [0x828F4070+0x24]=0 from AUDIT-004 dumps):
  the ctors run, but the producers that would populate those fields
  with a non-zero handle never execute.

Next session: replace XexCheckExecutablePrivilege stub with real
priv-bit lookup from XEX header. See audit-findings.md
KRNBUG-AUDIT-005 for the validation matrix.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 18:06:22 +02:00
MechaCat02
7108d6d131 feat(kernel): KRNBUG-AUDIT-004 — --ctor-probe PC hook + --dump-addr struct dump
Diagnostic-only, read-only. Lockstep `instructions=100000002`
preserved bit-exact at -n 100M --stable-digest. 586 → 588 tests.

Adds two read-only diagnostics for the parked-waiter producer hunt:

  * `--ctor-probe=0x8217C850,0x...` — at every interpreter step,
    if `ctx.pc` is in the configured set, print one `CTOR-PROBE`
    line capturing live r3 (= `this` in MSVC PPC ctors), lr
    (= return site), sp, plus an 8-frame back-chain with
    saved-r31/r30 per frame. Fires once per hit, exactly what the
    8-instance-pool probe needed.

  * `--dump-addr=0x828F3D08,0x828F4070,0x828F3EC0,...` — at end of
    run (after the FOCUS report in `dump_thread_diagnostic`), each
    address gets a 128-byte hex + be32 + ASCII dump. Used to
    inspect the static dispatcher / job-queue struct layouts
    AUDIT-003 identified.

Both gated default-off; empty set is a single `is_empty()` test on
the hot path. No guest state is mutated, so the
`sylpheed_n*m.json` lockstep digest is preserved.

KRNBUG-AUDIT-004 findings (corrects KRNBUG-AUDIT-002/003):

1. **The "8-instance pool" hypothesis for handle 0x1004 is FALSE.**
   Probing the inner per-instance ctors `[0x821783D8, 0x82181750,
   0x821701C8]` at -n 50M shows each fires EXACTLY ONCE with
   r3 = `[0x828F3EC0, 0x828F3D08, 0x828F4070]` respectively. All
   three handles are Meyers-style singletons with one dispatcher
   each. The "called 8 times" claim came from miscounting raw
   entries to the OUTER getter sub_8217C850 — but that getter is
   itself a Meyers-singleton-getter; only the FIRST entry cascades
   through to bl 0x821783D8 (gated on `[0x828F48D8] bit 0`).

2. **The producer indirection layer is the singleton-getter
   itself.** Static byte-scan of .rdata / .data shows 0 hits for
   the dispatcher addresses — no static registry table holds them.
   But the xrefs table for the OUTER getters reveals 5–6 callers
   each, MOSTLY non-create-chain, sharing the canonical producer
   pattern: `bl outer_singleton_getter; lwz r3, OFFSET(r3); bl
   0x824AA1D8` (with OFFSET=80 for 0x100c, =36 for 0x15e0). So the
   AUDIT-003 xref audit was necessary but not sufficient — it
   correctly saw "no direct producer references" but missed the
   singleton-getter indirection layer.

3. **Dispatcher struct layouts** (128-byte dumps captured at -n
   50M --halt-on-deadlock):
     - 0x828F3D08 (handle 0x100c): event_handle at +0x4C (0x100c),
       thread_handle at +0x48 (0x1010), self-pointer at +0x74,
       capacity 7 at +0x28, queue empty (+0/+3C = -1).
     - 0x828F4070 (handle 0x15e0): event_handle at +0x20 (0x15e0),
       sibling-handle 0x15E4 at +0x1C, queue empty (+0x10 = -1).
     - 0x828F3EC0 (handle 0x1004): event_handle at +0x78 (0x1004),
       4 guest-heap sub-buffers at +0x20/+0x3C/+0x44/+0x50 in
       0x4xxxxxxx range — noticeably different layout from the
       other two pure POD job queues.

Files:
  crates/xenia-kernel/src/state.rs   ctor_probe_pcs / dump_addrs +
                                     fire_ctor_probe_if_match + 2 tests
  crates/xenia-app/src/main.rs       Exec --ctor-probe / --dump-addr
                                     CLI parsing, prologue hook,
                                     end-of-run struct dumper
  audit-findings.md                  KRNBUG-AUDIT-004 entry
  audit-runs/audit-004/              50M probe runs (v1 outer-getter
                                     hits, v2 inner-ctor hits proving
                                     the singleton hypothesis)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 17:09:47 +02:00
MechaCat02
f84e947547 feat(kernel): KRNBUG-AUDIT-003 — vtable/RTTI class probe at handle creation + wait
Adds a read-only MSVC RTTI traversal helper (`read_class_at_this`)
and a `probe_create_stack_classes` integration that walks each
captured back-chain frame for handle creates in `--trace-handles-focus`
and probes each frame's most-likely `this` candidate (live r31/r30/r3
for frame 0; saved-r31/r30 from the prologue spill area at [fp-12]/
[fp-16] for deeper frames). False-positive guard rejects the CRT
static-init iterator pattern (vtable's first two slots must be image-
range function pointers — PPC instruction words like `mflr r12` are
not in 0x82xxxxxx).

`dump_thread_diagnostic` now takes `&GuestMemory` so the FOCUS report
prints, for each parked waiter, a WAIT-THREAD block with full back-
chain frames and per-slot saved-register dump for offline lookup.

End-to-end finding (-n 500M producer-trace):
  * Handle 0x100c dispatcher = 0x828F3D08 (image rdata; verified by
    sub_82181750 disasm + xref table). [this+0] = -1 sentinel — POD
    job queue, NOT a C++ polymorphic class.
  * Handle 0x15e0 dispatcher = 0x828F4070 (same shape).
  * Handle 0x1004's 8-instance pool members still TBD (MSVC ctors
    didn't preserve `this` in r31).
  * 0x42450b5c is a separate audit class (heap-allocated, parks via
    non-`do_wait_single` path).

Decisive xref audit: every reference to 0x828F3D08 / 0x828F4070 in
the static analysis is in a ctor or the CRT init driver. NO producer
code references either dispatcher base. Confirms `signal_attempts=0`
is unreachable-producer, not broken-producer.

Tests: 581 → 586 green (+5: RTTI-intact / RTTI-stripped / non-object
/ cstring / probe_create_stack integration). `--stable-digest -n
100M` instructions=100000002 unchanged. Master HEAD prior: 6440261.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 21:14:56 +02:00
MechaCat02
2a9fd1fc86 feat(kernel): KRNBUG-AUDIT-002 — multi-frame guest stack capture at handle creation
Adds `walk_guest_back_chain` (PPC EABI back-chain walker) and a
`record_create_with_stack` audit hook gated on `--trace-handles-focus`.
NtCreateEvent / NtCreateSemaphore / NtCreateTimer / XamTaskSchedule now
route through the new helper so focused handles capture up to 6 stack
frames at allocation time. Diagnostic-only, read-only memory access:
unfocused handles pay one HashSet lookup, focused ones pay six
back-chain dereferences. Lockstep determinism preserved.

End-to-end finding: handles 0x1004 (8-instance pool via static ctor at
0x8280F810), 0x100c (singleton built inside main()), 0x15e0 (singleton
in distinct cluster) are silph-framework dispatcher objects whose
producer code is unreached at -n 500M. The producer hunt now has class
ownership; vtable/RTTI readout is the next step.

Tests: 576 → 581 green. `--stable-digest -n 100M` instructions=100000002
unchanged. Master HEAD prior: 9d45efe.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 20:41:06 +02:00
MechaCat02
07068e7616 feat(audio): APUBUG-PRODUCER-001 — XAudio register driver client + opt-in callback ticker
Replace the three XAudio kernel-export stubs (Register/Unregister/SubmitFrame)
with canary-faithful implementations and add a periodic buffer-complete
callback ticker reusing the existing SavedCallbackCtx injection machinery.

Canary parity:
- xboxkrnl_audio.cc:56-93 — read callback_ptr[0..1], wrap callback_arg in a
  4-byte big-endian guest heap buffer (`wrapped_callback_arg`), write
  `0x4155_xxxx` to *driver_ptr.
- audio_system.cc:139-141 — guest callback receives r3 = wrapped pointer,
  not raw callback_arg.
- audio_driver.h:21-24 — frame rate 256 samples / 48 kHz ≈ 5.33 ms.

Implementation:
- New `crates/xenia-kernel/src/xaudio.rs` — `XAudioClient`, `XAudioState`
  (8-slot table, pending FIFO, dual-mode ticker), `XAUDIO_INSTR_PERIOD =
  48_000` (lockstep) and `XAUDIO_PERIOD = 5.333 ms` (--parallel), same
  pattern as KRNBUG-D08 v-sync.
- `try_inject_audio_callback` in xenia-app mirrors `try_inject_graphics_interrupt`,
  shares `interrupts.saved` slot for mutex with graphics callbacks.

Gating: ticker + injector run only when `--xaudio-tick` /
`XENIA_XAUDIO_TICK=1`. Default off because Sylpheed's audio callback
enters an infinite `KeWaitForSingleObject` loop on first invocation
(canary's host worker thread provides the buffer-completion fence we
don't model), which hijacks a guest HW thread and regresses
`swaps=2 → 1`. Default-off preserves the lockstep `sylpheed_n*m.json`
goldens exactly.

Producer hunt outcome (FALSIFIED for parked handles 0x1004/0x100c/0x15e4):
at `-n 500M --xaudio-tick` all 3 handles still show
`signal_attempts=0 (primary=0, ghost=0)`. Audio callback is not the
missing producer. Next candidate per audit-findings.md is Timer DPC
delivery (KeSetTimer / KeInsertQueueDpc).

Tests: 562 → 576 green (10 in `xaudio.rs`, 4 in `exports.rs`).
Lockstep `--stable-digest -n 100M` default-off: instructions=100000002,
swaps=2 (matches pre-change baseline byte-for-byte).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 19:50:22 +02:00
MechaCat02
27d3608174 fix(kernel): KRNBUG-D08 — wall-clock v-sync under --parallel
The synthetic v-sync ticker used a per-instruction proxy
(VSYNC_INSTR_PERIOD = 150 k) tuned for ~10 MIPS lockstep
throughput → 60 Hz. Audit M11 observed this drifts under
`--parallel`: with 6 worker threads sharing the kernel mutex,
the dispatcher executes more PPC instructions per tick
callback, so the accumulator never crosses 150 k. Result:
~629 v-syncs/100M lockstep → ~2 v-syncs/100M --parallel.

Hybrid solution preserves lockstep determinism (which the
goldens depend on) while fixing --parallel:

* `tick_vsync_instr(instr_count)` — legacy instruction-count
  ticker, used by lockstep. Bit-stable across runs.

* `tick_vsync_wallclock()` — new Instant-based ticker. Fires
  `floor(elapsed / VSYNC_PERIOD)` v-syncs since the anchor
  and advances the anchor by that many full periods (no
  lazy backlog). Capped at INTERRUPT_QUEUE_CAP per call so a
  forward-jumping clock can't overflow the FIFO.

* `KernelState.parallel_active` flag set at startup from
  `--parallel` / `XENIA_PARALLEL=1`. Read by `coord_pre_round`
  in main.rs to choose between the two tickers.

Verification:

* cargo test --workspace --release: 561 passing (+3 new
  wall-clock tests vs prior 558 baseline).
* lockstep -n 100M --stable-digest: BIT-IDENTICAL to
  pre-Phase-3 baseline. interrupts_delivered preserved at
  ~630 (was ~629 pre-fix).
* --parallel --reservations-table -n 30M: interrupts_delivered
  rose from ~2 to 17. (FIFO INTERRUPT_QUEUE_CAP=4 still caps
  burst delivery; that's a separate bottleneck — addressed
  by raising cap when --parallel queue depth becomes the
  next blocker.)

Trade-off: --parallel runs are non-deterministic at the
v-sync rate by design (per audit M05 PPCBUG-703 already).
Lockstep stays bit-identical, so the `sylpheed_n*m.json`
goldens are untouched.

Audit IDs: KRNBUG-D08 (closed).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 17:34:30 +02:00
MechaCat02
d1105aafae diag(audit): KRNBUG-AUDIT-001 — focused parked-waiter ghost-trail diagnostic
Adds a one-run diagnostic that distinguishes "guest never called
Nt/KeSetEvent on this handle" from "signal landed but waiter wasn't
woken", for any handle named via `--trace-handles-focus`.

Parked-waiter context (project_xenia_rs_sylpheed_stage3_2026_04_29):
four worker threads block Sylpheed past `draws=0` on handles
0x1004 / 0x100c / 0x15e4 / 0x42450b5c (mr=true, sig=false). The
pre-existing audit dropped signal-attempts that targeted handles
without a primary trail, so we couldn't tell whether the producer
was unreachable in the guest or whether the signal landed but missed
its waiter.

Three changes:

* audit.rs: `HandleAudit` gains `focus: HashSet<u32>` and
  `ghost_trails: HashMap<u32, GhostTrail>`. `record_signal`
  auto-falls-through to a new `record_signal_attempt_ghost` when no
  primary trail exists AND the handle is in `focus`. Bounded by
  AUDIT_RING_CAPACITY per handle. Two new tests cover the focus
  ghost-trail and no-double-record invariants.

* main.rs: new `--trace-handles-focus=<LIST>` flag (hex 0x or decimal,
  comma-separated) populates `kernel.audit.focus`. Implies
  `--trace-handles`. New "=== Handle audit (focus) ===" section in
  `dump_thread_diagnostic` emits per-handle:
    - signal_attempts (primary + ghost), waits, wakes
    - merged cycle-sorted timeline (last 16)
    - GuestExport / KernelInternal classification
    - <AUDIT_BLIND> marker when waiter_count > 0 but the audit
      saw no waits (i.e. waiter parked via a non-audit path —
      CS / spinlock / DPC).
    - DIAGNOSIS conclusion that selects between five branches.

* `cmd_check` passes None for focus → goldens unaffected.

Empirical run output at -n 500M lockstep with
`--trace-handles-focus=0x1004,0x100c,0x15e4,0x42450b5c`:

  handle=0x00001004 kind=Event/Manual waiters=1 signaled=false
                    signal_attempts=0 (primary=0, ghost=0)
                    waits=1 wakes=0
     created cycle=0 tid=1 lr=0x824a9f6c src=NtCreateEvent
     => producer is a missing kernel signal source
        (or BST-paradox upstream)
  ... (same shape for 0x100c, 0x15e4)
  handle=0x42450b5c kind=<UNCREATED> waiters=1 signal_attempts=0
                    waits=0 wakes=0 <AUDIT_BLIND>
     => waiter parked via non-audited path

Conclusion: hypothesis (A) confirmed for all 4 handles. Producer is
NOT a wake/eligibility bug — it is a genuinely missing kernel signal
source. The 3 Event/Manual handles share a creator
(lr=0x824a9f6c, tid=1) and the same wait-call wrapper at
lr=0x824ac578 — these are 3 worker threads all parked on
"work-available" notifications that never come.

Verification:
* cargo test --workspace --release: 558 passing (+2 new ghost-trail
  tests vs prior 556 baseline)
* lockstep -n 100M --stable-digest: bit-identical to master HEAD

Audit IDs: KRNBUG-AUDIT-001 (closed — diagnostic instrumentation).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 17:22:14 +02:00
MechaCat02
1f416aaa2e test(check): ORACBUG-004 — sylpheed_n50m stable-digest oracle
Adds a regression-catcher golden for Sylpheed boot at -n 50M lockstep,
covering the first VdSwap pair (the n2m oracle is swap-blind because
the first VdSwap fires at ~18M instructions). The new --stable-digest
flag emits/compares only fields that are deterministic in lockstep:

  instructions, imports, unimpl, draws, swaps,
  unique_render_targets, shader_blobs_live, texture_cache_entries

Excluded:

  packets — empirically ±2-8% lockstep variance (GPU thread race per
    audit M11)
  resolves, interrupts_delivered, interrupts_dropped, texture_decodes —
    scheduling-sensitive under --parallel
  path — cwd-dependent

Empirical determinism: 3 consecutive lockstep -n 50M runs produce
byte-identical stable-digest output.

The n4b canonical-invocation golden the audit's recommended next sprint
also called for is deferred. Per audit memory `--parallel
--reservations-table` is pathologically slow (>32 min for -n 100M), so
-n 4B in that mode would be many hours per run, not the 5-15 min the
plan estimated. n4b will be captured one-shot post-renderer-unblock as
a manual artifact under audit-runs/post-fix/, not as a test golden. See
crates/xenia-app/tests/golden/README.md.

Test infrastructure:
- crates/xenia-app/tests/sylpheed_oracles.rs — invokes
  CARGO_BIN_EXE_xenia-rs against the ISO. Path resolved via SYLPHEED_ISO
  env var (skips gracefully if missing).
- #[ignore]-gated; run via:
    cargo test --release -p xenia-app --test sylpheed_oracles \\
      -- --ignored --nocapture

Closes ORACBUG-004 (P0). Partial: ORACBUG-006 (P1 deferred).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 13:46:02 +02:00
MechaCat02
bae9305982 xenia-app: observability subsystem, --parallel runtime, stress harness
observability.rs installs the tracing subscriber stack (env-filter +
JSON file appender + chrome trace + error layer) and the metrics
recorder shared by the workspace. main.rs grows the new CLI surface:
--parallel, --reservations-table, --trace-handles, --analyze=
{rust,sql,both}, xenia dis --json, --ui, plus the wiring that runs
the CPU through the new scheduler, drives the GPU's threaded backend,
and surfaces the framebuffer + HUD via xenia-ui.

Add tests/parallel_stress.rs (#[ignore]-gated long form, short form
runs 20×@5M) and tests/golden/sylpheed_n2m.json — the digest the
lockstep/parallel combos compare against.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 16:30:26 +02:00
MechaCat02
c694bb3f43 Initial commit: xenia-rs workspace for Xbox 360 RE
Rust reimplementation of the xenia Xbox 360 emulator targeting reverse-
engineering and preservation, initially scoped to Project Sylpheed.

Includes:
- XEX2 loader (LZX decompression, AES decryption, PE parsing)
- XISO / XGD2 disc image VFS
- PPC interpreter with 200+ opcodes and VMX128 decoding
- Static analyzer: functions, cross-references, labels, asm + SQLite output
- HLE kernel covering the xboxkrnl/xam subset used by Sylpheed init
- Debugger with in-memory and SQLite-backed execution tracing
- `xenia-rs` CLI with extract/dis/exec commands that produce cumulative,
  superset SQLite databases and opt-in instruction/import/branch traces

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-16 23:14:56 +02:00