# Phase D — Audit-049 Auto-Signal POC — FINDINGS **Branch**: `iterate-2C/silph-ui-spawn-trace` (extends Phase C `481591f`) **Date**: 2026-06-11 **Sub-rounds**: D2.SR1 → D2.SR4 (4/4 used) **Verdict**: **B — partial unwedge** ## Mission Phase C diagnosed the audit-049 wedge as tid=13 (silph::UImpl@GamePart_Title) waiting INFINITE on a KEVENT created at `sub_821CB030+0x128` (`lr=0x821cb15c`, post-bl PC). The Phase D POC tests this diagnosis by hooking `NtCreateEvent` from that exact call site and auto-signaling the resulting handle after a configurable delay (`XENIA_SILPH_UI_AUTOSIGNAL_DELAY` instructions). If tid=13 unblocks, the diagnosis is confirmed. If new wedges or new threads appear downstream, even better — that's actual game progression past the wedge. ## Result summary | Symptom | SR2/SR3 baseline | SR4 (POC firing) | |---|---|---| | `silph autosignal: scheduled handle=0x1078 caller_lr=0x821cb15c` | yes (SR2/SR3) | yes | | `silph autosignal: firing handle=0x1078` | NO | **yes (cycle 16326209)** | | handle 0x1078 final | `signaled=false waiters=1 ` | `signal_attempts=1 waiters=0` | | tid=13 final state | `Blocked(WaitAny[0x1078])` | **`Ready` pc=0x824a9108** | | tid=1 final state | `Blocked(WaitAny[0x1070])` thread-join | `Blocked(WaitAny[0x1070])` (tid=13 not yet exited) | | ExCreateThread total | 10 | **12 (+tid=14, +tid=15)** | | New downstream wedges | none past 0x1078 | **0x1084 (Event/Auto), 0x1088 (Event/Manual)** | | `cxx_throw` runtime_error decoded | none | **yes, stack depth 6, top L0=0x82612b50 → L4=sub_82450B60+0x1A8 → L6=sub_82450a50** | | VdSwap | 1 | 1 | | gpu.interrupt.delivered{source=0} | 6393 | 4539 (different trajectory, no draws) | **Conclusion**: tid=13 unwedged cleanly from the audit-049 wait, spawned two follow-on threads (tid=14 entry=`silph` ctx=`0x40929c00`, tid=15 a worker), and progressed deep enough into the silph::UImpl state machine to throw a `runtime_error` from sub_82450a50 → sub_82450B60+0x1A8 (the dispatcher cluster from round 26). The auto-signal **is not** the proper signaler — it lets tid=13 proceed but downstream state-machine invariants the missing real signaler would have established are not in place, so the dispatcher trips on a "not-registered instance" lookup. This is a **clean confirmation** of the Phase C diagnosis: the wedge handle, the wait site, and the LR filter are all correct. The fix shape is: - Either: synthesize the missing signaler properly (cluster-B silph_ui_synth.rs analogue from R33's deferred plan) - Or: track what the auto-signal needed to write into the work-item state (`[+8]` field per R26) BEFORE signaling, so the dispatcher's BST lookup succeeds ## Sub-round detail ### D2.SR1 — initial run, hook never fires (wrong LR filter) Filter checked `creator_lr ∈ [0x821CB15C, 0x821CB160]` against `ctx.lr` at `nt_create_event` entry. But `ctx.lr` is the **thunk wrapper return slot** (`0x824a9f6c`), not the guest caller's post-bl PC. Confirmed via handle-audit `created stack` dump: frame 0 lr=`0x824a9f6c`, frame 1 lr=`0x821cb15c`. The guest caller's LR lives one frame up the PPC EABI back-chain. Diagnosis classification: **D (filter mismatch)**. Reading-error class #20 (new). ### D2.SR2 — frame-1-LR fix; hook schedules, never fires Refactored `maybe_register_silph_autosignal` to take `(ctx, mem)`, walk back-chain via existing `walk_guest_back_chain` (1 step), match the saved LR. Hook now fires: ``` silph autosignal: scheduled handle=0x1078 caller_lr=0x821cb15c for cycle 10000 (now=0, delay=10000) ``` But no "firing" log appears, and tid=13 stays Blocked. Classification: **D (drain site never reached)**. ### D2.SR3 — diagnostic added; confirms drain site never visited Added a one-shot info-level "tick (first visit, none due)" log inside `fire_due_silph_autosignals` when pending is non-empty but nothing due. Re-ran. **The tick-diagnostic never fired either** — proving the function isn't being called at all in `--parallel` mode. Root cause: `--parallel` dispatches to `run_execution_parallel` (line 2928 of main.rs), which has its own outer loop at line 3186. My Phase D wiring only touched the lockstep path at line 2763. Classification: **D (wrong code path wired)**. ### D2.SR4 — parallel-path wiring added; hook fires; tid=13 unblocks Added the same `set_now_cycle_hint` + `fire_due_silph_autosignals` calls inside the parallel outer loop, right after `coord_pre_round` (and under the same `kernel_arc` guard, so no extra locking). Re-built, re-ran. Now all three log lines appear: ``` silph autosignal: scheduled handle=0x1078 caller_lr=0x821cb15c for cycle 16326202 (now=16316202, delay=10000) silph autosignal: tick (first visit, none due) now=16316213 pending=1 first_deadline=16326202 silph autosignal: firing handle=0x1078 prev_signaled=Some(false) at cycle 16326209 ``` `now=16316202` at schedule time confirms `set_now_cycle_hint` is wired through correctly (the parallel path was simply never visited in SR2/SR3). Fire at cycle 16326209 = deadline 16326202 + 7-cycle scheduler granularity. Diagnostic classification: **B (partial unwedge — new waits and cxx_throw downstream)**. ## Code shape POC is ~70 LOC across four files, all env-gated. Default off. | File | Change | Lines | |---|---|---| | `crates/xenia-cpu/src/scheduler.rs` | `GuestThread.start_entry/start_context` fields; `spawn()` populates; `current_thread_entry_and_ctx()` helper | +18 | | `crates/xenia-kernel/src/state.rs` | `AutoSignalPending` struct; `silph_autosignal_*` fields; `set_now_cycle_hint`, `maybe_register_silph_autosignal`, `fire_due_silph_autosignals` methods | +95 | | `crates/xenia-kernel/src/exports.rs` | Hook in `nt_create_event` | +3 | | `crates/xenia-app/src/main.rs` | Fire-site wiring in lockstep loop (line 2788) **and** parallel loop (line 3215) | +12 | Tests stay green at **655/655**. ## Reading-error class #20 (new) **`ctx.lr` at kernel export entry ≠ guest caller's post-bl PC.** When a guest `bl` calls an export thunk, the thunk-wrapper has its own frame between the guest caller and the export body. At export-body entry, `ctx.lr` holds the *wrapper's* return slot, not the guest caller's post-bl PC. To match a specific guest call site by LR, the export must walk one step up the back-chain (`walk_guest_back_chain(ctx.gpr[1], ctx.lr, mem, 2)`) and use `frames[1].lr`. SR1 burned one full sub-round on this. Detect early in future POCs by comparing `ctx.lr` against the handle-audit's `created stack` frame dump for a known-good event (e.g. one created from a labelled site). ## Reading-error class #21 (new) **`--parallel` and lockstep have separate outer loops in main.rs.** They share `coord_pre_round` (carved out exactly for this reason), but anything wired adjacent to that call site only takes effect on the path it's wired on. Lockstep is `run_execution` (line 2706, outer loop at 2763). Parallel is `run_execution_parallel` (line 2928, outer loop at 3186). Per-round hooks added for a specific build mode must be wired in **both** paths. SR2/SR3 burned two sub-rounds on this. ## Files modified + LR mapping (for follow-up sessions) **Wedge handle creation** (confirmed by handle-audit dump): ``` created cycle=0 tid=13 lr=0x824a9f6c [src=NtCreateEvent thunk return] created stack (6 frames): [ 0] fp=0x715a7a10 lr=0x824a9f6c ← ctx.lr at nt_create_event [ 1] fp=0x715a7aa0 lr=0x821cb15c ← guest caller's post-bl PC (filter on this) [ 2] fp=0x715a7bd0 lr=0x821cbae0 ← sub_821CBA08 frame [ 3] fp=0x715a7cd0 lr=0x821cc454 ← sub_821CC3F8 frame [ 4] fp=0x715a7d60 lr=0x821c4f18 ← sub_821C4EB0 frame (silph::UImpl@GamePart_Title) [ 5] fp=0x715a7e00 lr=0x82174a80 ← sub_821748F0 trampoline frame ``` **Downstream cxx_throw stack** (after auto-signal fires, tid=5 throws runtime_error): ``` L0 lr=0x82612b50 std::exception throw path L1 lr=0x825f2444 L2 lr=0x824547e8 L3 lr=0x82451418 L4 lr=0x82450d08 ← sub_82450B60+0x1A8 (dispatcher, audit-059 R26) L5 lr=0x82450b34 L6 lr=0x82450a50 ← sub_82450a50 (worker dispatch) cxx_throw runtime_error decoded magic=0x19930520 cxx_throw BST ceil search candidate_key=0x828e2b2c match_found=false cxx_throw lhs (not-registered instance) lhs=0x715a7af0 ``` This confirms the dispatcher reached audit-049 territory (R26's `sub_82450B60+0x1A8` PC `0x82450D08`), looked up a runtime instance in its BST keyed by VA, and the instance was never registered. **The auto-signal bypassed an upstream registration step** the real signaler would have driven. ## Recommendation Ship the POC env-gated (default off; no behavior change unless opted in). The verdict-B success makes it a useful diagnostic flag for future audit-049 work: future investigations can set `XENIA_SILPH_UI_AUTOSIGNAL_DELAY=10000` to skip the wedge and probe downstream behavior without first writing the proper signaler. Long-term fix path remains the R33 silph_ui_synth.rs analogue: synthesize the missing signaler + its precondition state (BST instance registration at `0x715a7af0`-equivalent, work-item state `[+8]` per R26). The auto-signal POC is **not** the final fix — it confirms diagnosis but doesn't honor the dispatcher's BST registry invariant. ## Artifacts - `poc-sr1.log`, `poc-sr1.stderr` — initial run, filter mismatch (D) - `poc-sr2.log`, `poc-sr2.stderr` — frame-1-LR fix, no fire (D) - `poc-sr3.log`, `poc-sr3.stderr` — diagnostic added, no fire (D, parallel path unwired) - `poc-sr4.log`, `poc-sr4.stderr` — parallel-path wired, **fires + partial unwedge (B)** All `.log`/`.stderr` files are `.gitignore`d; this `FINDINGS.md` is the only artifact-side commit.