diff --git a/audit-runs/audit-059-handle-disambiguation/round-D2-autosignal-poc/FINDINGS.md b/audit-runs/audit-059-handle-disambiguation/round-D2-autosignal-poc/FINDINGS.md new file mode 100644 index 0000000..13e3dce --- /dev/null +++ b/audit-runs/audit-059-handle-disambiguation/round-D2-autosignal-poc/FINDINGS.md @@ -0,0 +1,144 @@ +# Phase D — Audit-049 Auto-Signal POC — FINDINGS + +**Branch**: `iterate-2C/silph-ui-spawn-trace` (extends Phase C `481591f`) +**Date**: 2026-06-11 +**Sub-rounds**: D2.SR1 → D2.SR4 (4/4 used) +**Verdict**: **B — partial unwedge** + +## Mission + +Phase C diagnosed the audit-049 wedge as tid=13 (silph::UImpl@GamePart_Title) waiting INFINITE on a KEVENT created at `sub_821CB030+0x128` (`lr=0x821cb15c`, post-bl PC). The Phase D POC tests this diagnosis by hooking `NtCreateEvent` from that exact call site and auto-signaling the resulting handle after a configurable delay (`XENIA_SILPH_UI_AUTOSIGNAL_DELAY` instructions). + +If tid=13 unblocks, the diagnosis is confirmed. If new wedges or new threads appear downstream, even better — that's actual game progression past the wedge. + +## Result summary + +| Symptom | SR2/SR3 baseline | SR4 (POC firing) | +|---|---|---| +| `silph autosignal: scheduled handle=0x1078 caller_lr=0x821cb15c` | yes (SR2/SR3) | yes | +| `silph autosignal: firing handle=0x1078` | NO | **yes (cycle 16326209)** | +| handle 0x1078 final | `signaled=false waiters=1 ` | `signal_attempts=1 waiters=0` | +| tid=13 final state | `Blocked(WaitAny[0x1078])` | **`Ready` pc=0x824a9108** | +| tid=1 final state | `Blocked(WaitAny[0x1070])` thread-join | `Blocked(WaitAny[0x1070])` (tid=13 not yet exited) | +| ExCreateThread total | 10 | **12 (+tid=14, +tid=15)** | +| New downstream wedges | none past 0x1078 | **0x1084 (Event/Auto), 0x1088 (Event/Manual)** | +| `cxx_throw` runtime_error decoded | none | **yes, stack depth 6, top L0=0x82612b50 → L4=sub_82450B60+0x1A8 → L6=sub_82450a50** | +| VdSwap | 1 | 1 | +| gpu.interrupt.delivered{source=0} | 6393 | 4539 (different trajectory, no draws) | + +**Conclusion**: tid=13 unwedged cleanly from the audit-049 wait, spawned two follow-on threads (tid=14 entry=`silph` ctx=`0x40929c00`, tid=15 a worker), and progressed deep enough into the silph::UImpl state machine to throw a `runtime_error` from sub_82450a50 → sub_82450B60+0x1A8 (the dispatcher cluster from round 26). The auto-signal **is not** the proper signaler — it lets tid=13 proceed but downstream state-machine invariants the missing real signaler would have established are not in place, so the dispatcher trips on a "not-registered instance" lookup. + +This is a **clean confirmation** of the Phase C diagnosis: the wedge handle, the wait site, and the LR filter are all correct. The fix shape is: +- Either: synthesize the missing signaler properly (cluster-B silph_ui_synth.rs analogue from R33's deferred plan) +- Or: track what the auto-signal needed to write into the work-item state (`[+8]` field per R26) BEFORE signaling, so the dispatcher's BST lookup succeeds + +## Sub-round detail + +### D2.SR1 — initial run, hook never fires (wrong LR filter) + +Filter checked `creator_lr ∈ [0x821CB15C, 0x821CB160]` against `ctx.lr` at `nt_create_event` entry. But `ctx.lr` is the **thunk wrapper return slot** (`0x824a9f6c`), not the guest caller's post-bl PC. Confirmed via handle-audit `created stack` dump: frame 0 lr=`0x824a9f6c`, frame 1 lr=`0x821cb15c`. The guest caller's LR lives one frame up the PPC EABI back-chain. + +Diagnosis classification: **D (filter mismatch)**. Reading-error class #20 (new). + +### D2.SR2 — frame-1-LR fix; hook schedules, never fires + +Refactored `maybe_register_silph_autosignal` to take `(ctx, mem)`, walk back-chain via existing `walk_guest_back_chain` (1 step), match the saved LR. Hook now fires: + +``` +silph autosignal: scheduled handle=0x1078 caller_lr=0x821cb15c for cycle 10000 (now=0, delay=10000) +``` + +But no "firing" log appears, and tid=13 stays Blocked. Classification: **D (drain site never reached)**. + +### D2.SR3 — diagnostic added; confirms drain site never visited + +Added a one-shot info-level "tick (first visit, none due)" log inside `fire_due_silph_autosignals` when pending is non-empty but nothing due. Re-ran. **The tick-diagnostic never fired either** — proving the function isn't being called at all in `--parallel` mode. + +Root cause: `--parallel` dispatches to `run_execution_parallel` (line 2928 of main.rs), which has its own outer loop at line 3186. My Phase D wiring only touched the lockstep path at line 2763. Classification: **D (wrong code path wired)**. + +### D2.SR4 — parallel-path wiring added; hook fires; tid=13 unblocks + +Added the same `set_now_cycle_hint` + `fire_due_silph_autosignals` calls inside the parallel outer loop, right after `coord_pre_round` (and under the same `kernel_arc` guard, so no extra locking). Re-built, re-ran. + +Now all three log lines appear: + +``` +silph autosignal: scheduled handle=0x1078 caller_lr=0x821cb15c for cycle 16326202 (now=16316202, delay=10000) +silph autosignal: tick (first visit, none due) now=16316213 pending=1 first_deadline=16326202 +silph autosignal: firing handle=0x1078 prev_signaled=Some(false) at cycle 16326209 +``` + +`now=16316202` at schedule time confirms `set_now_cycle_hint` is wired through correctly (the parallel path was simply never visited in SR2/SR3). Fire at cycle 16326209 = deadline 16326202 + 7-cycle scheduler granularity. Diagnostic classification: **B (partial unwedge — new waits and cxx_throw downstream)**. + +## Code shape + +POC is ~70 LOC across four files, all env-gated. Default off. + +| File | Change | Lines | +|---|---|---| +| `crates/xenia-cpu/src/scheduler.rs` | `GuestThread.start_entry/start_context` fields; `spawn()` populates; `current_thread_entry_and_ctx()` helper | +18 | +| `crates/xenia-kernel/src/state.rs` | `AutoSignalPending` struct; `silph_autosignal_*` fields; `set_now_cycle_hint`, `maybe_register_silph_autosignal`, `fire_due_silph_autosignals` methods | +95 | +| `crates/xenia-kernel/src/exports.rs` | Hook in `nt_create_event` | +3 | +| `crates/xenia-app/src/main.rs` | Fire-site wiring in lockstep loop (line 2788) **and** parallel loop (line 3215) | +12 | + +Tests stay green at **655/655**. + +## Reading-error class #20 (new) + +**`ctx.lr` at kernel export entry ≠ guest caller's post-bl PC.** When a guest `bl` calls an export thunk, the thunk-wrapper has its own frame between the guest caller and the export body. At export-body entry, `ctx.lr` holds the *wrapper's* return slot, not the guest caller's post-bl PC. + +To match a specific guest call site by LR, the export must walk one step up the back-chain (`walk_guest_back_chain(ctx.gpr[1], ctx.lr, mem, 2)`) and use `frames[1].lr`. + +SR1 burned one full sub-round on this. Detect early in future POCs by comparing `ctx.lr` against the handle-audit's `created stack` frame dump for a known-good event (e.g. one created from a labelled site). + +## Reading-error class #21 (new) + +**`--parallel` and lockstep have separate outer loops in main.rs.** They share `coord_pre_round` (carved out exactly for this reason), but anything wired adjacent to that call site only takes effect on the path it's wired on. Lockstep is `run_execution` (line 2706, outer loop at 2763). Parallel is `run_execution_parallel` (line 2928, outer loop at 3186). + +Per-round hooks added for a specific build mode must be wired in **both** paths. SR2/SR3 burned two sub-rounds on this. + +## Files modified + LR mapping (for follow-up sessions) + +**Wedge handle creation** (confirmed by handle-audit dump): +``` +created cycle=0 tid=13 lr=0x824a9f6c [src=NtCreateEvent thunk return] +created stack (6 frames): + [ 0] fp=0x715a7a10 lr=0x824a9f6c ← ctx.lr at nt_create_event + [ 1] fp=0x715a7aa0 lr=0x821cb15c ← guest caller's post-bl PC (filter on this) + [ 2] fp=0x715a7bd0 lr=0x821cbae0 ← sub_821CBA08 frame + [ 3] fp=0x715a7cd0 lr=0x821cc454 ← sub_821CC3F8 frame + [ 4] fp=0x715a7d60 lr=0x821c4f18 ← sub_821C4EB0 frame (silph::UImpl@GamePart_Title) + [ 5] fp=0x715a7e00 lr=0x82174a80 ← sub_821748F0 trampoline frame +``` + +**Downstream cxx_throw stack** (after auto-signal fires, tid=5 throws runtime_error): +``` +L0 lr=0x82612b50 std::exception throw path +L1 lr=0x825f2444 +L2 lr=0x824547e8 +L3 lr=0x82451418 +L4 lr=0x82450d08 ← sub_82450B60+0x1A8 (dispatcher, audit-059 R26) +L5 lr=0x82450b34 +L6 lr=0x82450a50 ← sub_82450a50 (worker dispatch) + +cxx_throw runtime_error decoded magic=0x19930520 +cxx_throw BST ceil search candidate_key=0x828e2b2c match_found=false +cxx_throw lhs (not-registered instance) lhs=0x715a7af0 +``` + +This confirms the dispatcher reached audit-049 territory (R26's `sub_82450B60+0x1A8` PC `0x82450D08`), looked up a runtime instance in its BST keyed by VA, and the instance was never registered. **The auto-signal bypassed an upstream registration step** the real signaler would have driven. + +## Recommendation + +Ship the POC env-gated (default off; no behavior change unless opted in). The verdict-B success makes it a useful diagnostic flag for future audit-049 work: future investigations can set `XENIA_SILPH_UI_AUTOSIGNAL_DELAY=10000` to skip the wedge and probe downstream behavior without first writing the proper signaler. + +Long-term fix path remains the R33 silph_ui_synth.rs analogue: synthesize the missing signaler + its precondition state (BST instance registration at `0x715a7af0`-equivalent, work-item state `[+8]` per R26). The auto-signal POC is **not** the final fix — it confirms diagnosis but doesn't honor the dispatcher's BST registry invariant. + +## Artifacts + +- `poc-sr1.log`, `poc-sr1.stderr` — initial run, filter mismatch (D) +- `poc-sr2.log`, `poc-sr2.stderr` — frame-1-LR fix, no fire (D) +- `poc-sr3.log`, `poc-sr3.stderr` — diagnostic added, no fire (D, parallel path unwired) +- `poc-sr4.log`, `poc-sr4.stderr` — parallel-path wired, **fires + partial unwedge (B)** + +All `.log`/`.stderr` files are `.gitignore`d; this `FINDINGS.md` is the only artifact-side commit. diff --git a/crates/xenia-app/src/main.rs b/crates/xenia-app/src/main.rs index 409d870..6e3abfe 100644 --- a/crates/xenia-app/src/main.rs +++ b/crates/xenia-app/src/main.rs @@ -2780,6 +2780,13 @@ fn run_execution( RoundCtl::BreakOuter => break, RoundCtl::Continue => {} } + // ITERATE-2C Phase D — deposit the current instruction count so + // `nt_create_event` can compute absolute auto-signal deadlines, + // then drain any pending auto-signals whose deadline has passed. + // Both calls are no-ops when `XENIA_SILPH_UI_AUTOSIGNAL_DELAY` + // is unset (the pending queue stays empty). + kernel.set_now_cycle_hint(stats.instruction_count); + kernel.fire_due_silph_autosignals(stats.instruction_count); dispatch_graphics_interrupts( kernel, mem, @@ -3204,6 +3211,15 @@ fn run_execution_parallel( } let mut guard = pre_outcome.1; + // ITERATE-2C Phase D — same auto-signal hook as the lockstep + // path. Held under the same `kernel_arc` guard the rest of + // this prologue runs under, so no extra locking. + { + let s = stats_mtx.lock().expect("stats mutex poisoned"); + guard.set_now_cycle_hint(s.instruction_count); + guard.fire_due_silph_autosignals(s.instruction_count); + } + // Iterate-2.BE — host-driven synchronous ISR dispatch. // Runs under the kernel lock while workers are still parked // at the phaser B2 barrier (the coordinator hasn't published diff --git a/crates/xenia-cpu/src/scheduler.rs b/crates/xenia-cpu/src/scheduler.rs index 1f1d68a..17a5456 100644 --- a/crates/xenia-cpu/src/scheduler.rs +++ b/crates/xenia-cpu/src/scheduler.rs @@ -117,6 +117,14 @@ pub struct GuestThread { /// Axis 3 instruction budget. Decremented per retired step on this /// thread; on zero, slot rotates within same-priority tier. pub quantum_remaining: u32, + /// SpawnParams.entry — the BL target the trampoline jumped to. + /// Persisted so kernel exports can filter syscalls by spawning + /// chain (e.g. the silph UI auto-signal POC). 0 for the initial + /// thread (uses `install_initial_thread`, not `spawn`). + pub start_entry: u32, + /// SpawnParams.start_context — initial r3 at spawn. Persisted for + /// the same filtering reason as `start_entry`. + pub start_context: u32, } impl GuestThread { @@ -136,6 +144,8 @@ impl GuestThread { affinity_mask: 0xFF, ideal_processor: None, quantum_remaining: QUANTUM_DEFAULT, + start_entry: 0, + start_context: 0, } } } @@ -500,6 +510,17 @@ impl Scheduler { self.current.expect("no current thread") } + /// `(start_entry, start_context)` of the currently-running thread. + /// Returns None if there is no current thread or its ref is stale. + /// Used by `KernelState::maybe_register_silph_autosignal` to filter + /// `NtCreateEvent` calls by spawning chain. + pub fn current_thread_entry_and_ctx(&self) -> Option<(u32, u32)> { + let r = self.current?; + let slot = self.slots.get(r.hw_id as usize)?; + let t = slot.runqueue.get(r.idx as usize)?; + Some((t.start_entry, t.start_context)) + } + // ----- Guest-thread lookup ----- /// Find the `ThreadRef` of the (non-Exited) thread with `tid`. @@ -614,6 +635,8 @@ impl Scheduler { t.priority = params.priority; t.affinity_mask = mask; t.ideal_processor = params.ideal_processor; + t.start_entry = params.entry; + t.start_context = params.start_context; // M3.7 — populate the inter-thread reservation handle + slot id // so the interpreter can route lwarx/stwcx through the table. t.ctx.hw_id = slot_id; diff --git a/crates/xenia-kernel/src/exports.rs b/crates/xenia-kernel/src/exports.rs index fab1ebe..84fb0de 100644 --- a/crates/xenia-kernel/src/exports.rs +++ b/crates/xenia-kernel/src/exports.rs @@ -2001,6 +2001,9 @@ fn nt_create_event(ctx: &mut PpcContext, mem: &GuestMemory, state: &mut KernelSt mem, "NtCreateEvent", ); + // ITERATE-2C Phase D — audit-049 auto-signal POC. Env-gated; no-op + // when `XENIA_SILPH_UI_AUTOSIGNAL_DELAY` is unset. + state.maybe_register_silph_autosignal(handle, ctx, mem); if handle_ptr != 0 { mem.write_u32(handle_ptr, handle); } diff --git a/crates/xenia-kernel/src/state.rs b/crates/xenia-kernel/src/state.rs index ac373cd..f28739c 100644 --- a/crates/xenia-kernel/src/state.rs +++ b/crates/xenia-kernel/src/state.rs @@ -325,6 +325,32 @@ pub struct KernelState { pub silph_synth_handles: [Option; 4], /// AUDIT-2.BF — `ThreadRef` cache for the 4 synthetic workers. pub silph_synth_refs: [Option; 4], + /// ITERATE-2C Phase D — auto-signal delay for silph::UImpl + /// `NtCreateEvent` calls (see [`Self::maybe_register_silph_autosignal`]). + /// `None` = feature disabled; populated once from + /// `XENIA_SILPH_UI_AUTOSIGNAL_DELAY=` at construction. + pub silph_autosignal_delay: Option, + /// ITERATE-2C Phase D — pending auto-signal queue. Drained each + /// outer round by [`Self::fire_due_silph_autosignals`]. + pub silph_autosignal_pending: Vec, + /// ITERATE-2C Phase D — most recent `stats.instruction_count` + /// deposited by the scheduler loop (see + /// [`Self::set_now_cycle_hint`]). Used by + /// [`Self::maybe_register_silph_autosignal`] to compute absolute + /// deadlines, since `nt_create_event` doesn't see `ExecStats`. + pub last_cycle_hint: u64, + /// ITERATE-2C Phase D — one-shot diagnostic latch. Flipped by + /// [`Self::fire_due_silph_autosignals`] on the first visit where + /// the pending queue is non-empty but no entry is due yet. + pub silph_autosignal_diag_logged: bool, +} + +/// ITERATE-2C Phase D — one queued auto-signal. `deadline_cycle` is +/// absolute (cycle hint at register time + configured delay). +#[derive(Debug, Clone, Copy)] +pub struct AutoSignalPending { + pub handle: u32, + pub deadline_cycle: u64, } impl KernelState { @@ -400,6 +426,12 @@ impl KernelState { silph_synth_ctx: 0, silph_synth_handles: [None; 4], silph_synth_refs: [None; 4], + silph_autosignal_delay: std::env::var("XENIA_SILPH_UI_AUTOSIGNAL_DELAY") + .ok() + .and_then(|v| v.parse::().ok()), + silph_autosignal_pending: Vec::new(), + last_cycle_hint: 0, + silph_autosignal_diag_logged: false, }; crate::exports::register_exports(&mut state); crate::xam::register_exports(&mut state); @@ -800,6 +832,122 @@ impl KernelState { self.audit.record_wake(handle, entry); } + /// ITERATE-2C Phase D — deposit the latest scheduler instruction + /// count so `nt_create_event` can compute absolute auto-signal + /// deadlines. Called once per outer round from the app's + /// `coord_pre_round` site. No-op when the feature env is unset. + pub fn set_now_cycle_hint(&mut self, now_cycle: u64) { + self.last_cycle_hint = now_cycle; + } + + /// ITERATE-2C Phase D — register a freshly-allocated event for + /// auto-signal after the configured delay, **iff** the creating + /// thread matches the silph::UImpl tid=13 chain that wedges in + /// audit-049. Filter: + /// + /// * Env `XENIA_SILPH_UI_AUTOSIGNAL_DELAY` set (= delay non-None) + /// * Frame-1 LR (the guest caller's post-bl PC, walked one step up + /// from the live thunk-wrapper frame) is in + /// `[0x821CB15C, 0x821CB160]` — this is the `NtCreateEvent` call + /// site inside `sub_821CB030+0x128`. The live `ctx.lr` is the + /// thunk wrapper's return slot (e.g. `0x824a9f6c`), so we walk + /// one back-chain step to reach the actual guest caller. + /// * Creating thread's `start_entry == 0x821748F0` (silph trampoline) + /// * Creating thread's `start_context == 0x4024a840` + /// + /// On match, the handle is queued with `deadline = last_cycle_hint + + /// delay`. Drained by [`Self::fire_due_silph_autosignals`] from the + /// outer scheduler loop. + pub fn maybe_register_silph_autosignal( + &mut self, + handle: u32, + ctx: &PpcContext, + mem: &GuestMemory, + ) { + let Some(delay) = self.silph_autosignal_delay else { + return; + }; + let Some((entry, start_ctx)) = self.scheduler.current_thread_entry_and_ctx() else { + return; + }; + if entry != 0x821748F0 || start_ctx != 0x4024_a840 { + return; + } + let frames = walk_guest_back_chain(ctx.gpr[1] as u32, ctx.lr as u32, mem, 2); + let caller_lr = match frames.get(1) { + Some((_, lr)) => *lr, + None => return, + }; + if !(0x821CB15C..=0x821CB160).contains(&caller_lr) { + return; + } + let deadline = self.last_cycle_hint.saturating_add(delay); + self.silph_autosignal_pending + .push(AutoSignalPending { handle, deadline_cycle: deadline }); + tracing::info!( + "silph autosignal: scheduled handle={:#x} caller_lr={:#x} for cycle {} (now={}, delay={})", + handle, + caller_lr, + deadline, + self.last_cycle_hint, + delay, + ); + } + + /// ITERATE-2C Phase D — drain pending entries whose deadline has + /// passed. Each fires by setting `Event { signaled = true }` and + /// invoking the existing `wake_eligible_waiters` to release blocked + /// waiters. No-op when the queue is empty (the common case). + pub fn fire_due_silph_autosignals(&mut self, now_cycle: u64) { + if self.silph_autosignal_pending.is_empty() { + return; + } + let any_due = self + .silph_autosignal_pending + .iter() + .any(|p| p.deadline_cycle <= now_cycle); + if !any_due { + // Diagnostic for the Phase D POC: log first time we visit + // with a non-empty queue but nothing due yet. + if !self.silph_autosignal_diag_logged { + self.silph_autosignal_diag_logged = true; + if let Some(first) = self.silph_autosignal_pending.first() { + tracing::info!( + "silph autosignal: tick (first visit, none due) now={} pending={} first_deadline={}", + now_cycle, + self.silph_autosignal_pending.len(), + first.deadline_cycle, + ); + } + } + } + let mut i = 0; + while i < self.silph_autosignal_pending.len() { + if self.silph_autosignal_pending[i].deadline_cycle <= now_cycle { + let p = self.silph_autosignal_pending.swap_remove(i); + let prev = match self.objects.get_mut(&p.handle) { + Some(KernelObject::Event { signaled, .. }) => { + let was = *signaled; + *signaled = true; + Some(was) + } + _ => None, + }; + tracing::info!( + "silph autosignal: firing handle={:#x} prev_signaled={:?} at cycle {}", + p.handle, + prev, + now_cycle, + ); + self.audit_signal(p.handle, 0, "silph_autosignal", prev.unwrap_or(false) as u64); + crate::exports::wake_eligible_waiters(self, p.handle); + // do not advance i — swap_remove pulled a new entry into i + } else { + i += 1; + } + } + } + /// Diagnostic. If the live PC for HW slot `hw_id` is in /// `self.ctor_probe_pcs`, emit a single `CTOR-PROBE` line with /// the current cycle, tid, hw_id, sp, r3, lr, plus an 8-frame