diff --git a/audit-findings.md b/audit-findings.md index 6c033ba..ebb6b7f 100644 --- a/audit-findings.md +++ b/audit-findings.md @@ -3890,3 +3890,87 @@ plateau persists because: the addic/subfic class (canary semantics are a 64-bit add against guest memory the Mm layer doesn't fully model yet). + + +--- + +## Follow-up session 2026-05-03 — outcome + +Three audit IDs closed across 3 commits, merged to master with `--no-ff`. +HEAD: `8668550`. Tests: 556 → 561 (+5 from new wall-clock + ghost-trail tests). + +### Audit IDs landed + +| ID | Commit | Description | +|---|---|---| +| **GPUBUG-DRAIN-001** | `7a1b6b3` | VdSwap PM4 fallback warning silenced under `--parallel`. New `drain_until_wptr(target, time_budget)` mirrors canary's `WorkerThreadMain` predicate; vd_swap skips PM4 ring injection (unreliable when ring backs up under --parallel) and uses direct `notify_xe_swap`. The slot-0 fetch-constant patch is deferred (GPUBUG-FETCH-PATCH-001). DrainFence handler publishes the digest mirror before reply (was racing the CPU's post-drain digest_snapshot read). | +| **KRNBUG-AUDIT-001** | `d1105aa` | Diagnostic instrumentation: `--trace-handles-focus=` flag + per-handle DIAGNOSIS report. `record_signal` falls through to ghost-trail capture for focused handles even when no `record_create` exists. Producer-class classification (GuestExport / KernelInternal). Distinguishes "guest never tried" from "signal landed but missed waiter" in one run. | +| **KRNBUG-D08** | `27d3608` | V-sync wall-clock under `--parallel`. Lockstep stays on the deterministic instruction-count proxy (sylpheed goldens unchanged). `--parallel` switches to wall-clock via `tick_vsync_wallclock`, raising delivered v-syncs from ~2 → 17 at -n 30M. INTERRUPT_QUEUE_CAP=4 still bottlenecks burst delivery. | + +### Parked-waiter producer-trace finding + +Empirical run at -n 500M lockstep with the new +`--trace-handles-focus=0x1004,0x100c,0x15e4,0x42450b5c`: + +``` +handle=0x00001004 kind=Event/Manual waiters=1 signaled=false + signal_attempts=0 (primary=0, ghost=0) waits=1 wakes=0 + created cycle=0 tid=1 lr=0x824a9f6c src=NtCreateEvent + timeline: cycle=0 tid=10 lr=0x824ac578 src=do_wait_single[wait] + GuestExport=0 KernelInternal=0 waits=1 + => producer is a missing kernel signal source (or BST-paradox upstream) +``` + +Same shape for 0x100c and 0x15e4. 0x42450b5c shows `` + +`` (waiter parked via a non-`do_wait_single` path). + +**Conclusion**: hypothesis (A) confirmed for 3 of the 4 handles. The +producer code path is genuinely missing — NO Nt/KeSetEvent / +KePulseEvent / KeReleaseSemaphore call EVER targets these handles +during 500M instructions of execution. The PPC-vs-Rust traversal +paradox (BST-bug from `project_xenia_rs_sylpheed_event_chain_2026_04_29`) +is **NOT** the cause for these specific handles. The 3 handles share +the same creator (lr=0x824a9f6c, tid=1, all at cycle=0) and the same +wait-call wrapper (lr=0x824ac578) — likely 3 sibling worker threads +all waiting for "work to do" notifications that never come. Most +likely producer-class candidates for next session: + +- File I/O completion (`signal_io_completion_event`) — currently a + real implementation but possibly never reached; trace `NtReadFile` + paths to see if completion events would target these handles. +- XAM async task completion — F2/F3 deferred from prior sprint. +- Audio buffer-complete — `XAudioRegisterRenderDriverClient` is a + one-shot stub. +- Timer DPCs — `KeSetTimer` real impl but APC delivery may be + routing wrong. + +### Acceptance criteria + +| # | Criterion | Met? | +|---|-----------|------| +| 1 | Phase 1: zero "PM4_XE_SWAP not consumed" warnings under canonical invocation | ✅ | +| 2 | Phase 2: per-handle DIAGNOSIS for all four parked handles | ✅ | +| 3 | Phase 3: vsync rate restored under --parallel; n2m golden untouched | ✅ partial — rate up but FIFO cap=4 still bottlenecks | +| 4 | cargo test ≥556 | ✅ 561 | +| 5 | All work merged to master | ✅ | +| 6 | **STRETCH** ≥1 of 4 handles signals | ❌ — but data-driven hypothesis fail-fast tells us why (producer missing, not wake-eligibility bug) | +| 7 | **STRETCH** draws > 0 at -n 100M lockstep | ❌ — gating remains parked-waiter handles | + +### Recommended next session + +1. **Producer hunt** for the 3 Event/Manual handles. With the + diagnostic baked in, a focused hunt: identify the guest function + at `lr=0x824ac578` (the shared wait-call wrapper), walk its + callers, find what kernel signal source SHOULD be wired for each + handle. Likely starting points: file I/O completion + (`signal_io_completion_event`), XamTaskSchedule callback (F2), + XAudio buffer-complete. +2. **Raise INTERRUPT_QUEUE_CAP** for `--parallel` workloads — the + 3044 dropped vsyncs at -n 30M --parallel suggest the FIFO is the + next bottleneck. +3. **F2/F3** (XAM async completion) per the still-deferred list, + especially if Phase 2 of next session pinpoints a missing XAM + producer. +4. **GPUBUG-FETCH-PATCH-001**: re-enable the PM4_TYPE0 + fetch-constant patch via a side-channel (GpuCommand variant) + when draws actually start firing — relevant for bloom/blur N+1.