docs(audit): close out follow-up session 2026-05-03

3 IDs landed: GPUBUG-DRAIN-001, KRNBUG-AUDIT-001, KRNBUG-D08. Tests 556 → 561. Lockstep digest BIT-IDENTICAL on stable fields.

draws=0 persists; parked-waiter producer-trace confirms hypothesis (A) for 3 of 4 handles (guest code never calls Nt/KeSetEvent on 0x1004 / 0x100c / 0x15e4), so the renderer plateau is a missing kernel signal source, NOT a wake-eligibility bug or BST-paradox.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@@ -3890,3 +3890,87 @@ plateau persists because:
the addic/subfic class (canary semantics are a 64-bit add against
guest memory the Mm layer doesn't fully model yet).

---

## Follow-up session 2026-05-03 — outcome

Three audit IDs closed across 3 commits, merged to master with `--no-ff`.
HEAD: `8668550`. Tests: 556 → 561 (+5 from new wall-clock + ghost-trail tests).

### Audit IDs landed

| ID | Commit | Description |
|---|---|---|
| **GPUBUG-DRAIN-001** | `7a1b6b3` | VdSwap PM4 fallback warning silenced under `--parallel`. New `drain_until_wptr(target, time_budget)` mirrors canary's `WorkerThreadMain` predicate; vd_swap skips PM4 ring injection (unreliable when ring backs up under --parallel) and uses direct `notify_xe_swap`. The slot-0 fetch-constant patch is deferred (GPUBUG-FETCH-PATCH-001). DrainFence handler publishes the digest mirror before reply (was racing the CPU's post-drain digest_snapshot read). |
| **KRNBUG-AUDIT-001** | `d1105aa` | Diagnostic instrumentation: `--trace-handles-focus=<LIST>` flag + per-handle DIAGNOSIS report. `record_signal` falls through to ghost-trail capture for focused handles even when no `record_create` exists. Producer-class classification (GuestExport / KernelInternal). Distinguishes "guest never tried" from "signal landed but missed waiter" in one run. |
| **KRNBUG-D08** | `27d3608` | V-sync wall-clock under `--parallel`. Lockstep stays on the deterministic instruction-count proxy (sylpheed goldens unchanged). `--parallel` switches to wall-clock via `tick_vsync_wallclock`, raising delivered v-syncs from ~2 → 17 at -n 30M. INTERRUPT_QUEUE_CAP=4 still bottlenecks burst delivery. |

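The drain predicate described for GPUBUG-DRAIN-001 can be illustrated with a small sketch. This is not the project's implementation: `Ring`, `rptr`, `step`, and `DrainOutcome` are hypothetical names, the ring is modeled as a plain queue of packets, and pointer wraparound is ignored. It only shows the shape of "consume until the read pointer reaches `target` or the time budget runs out".

```rust
use std::collections::VecDeque;
use std::time::{Duration, Instant};

/// Hypothetical stand-in for a command ring; not the project's real type.
pub struct Ring {
    packets: VecDeque<u32>,
    /// Read pointer: number of packets consumed so far (no wraparound here).
    pub rptr: u32,
}

/// Why a drain call returned (names are assumptions for this sketch).
#[derive(Debug, PartialEq, Eq)]
pub enum DrainOutcome {
    ReachedTarget,
    RingEmpty,
    BudgetExhausted,
}

impl Ring {
    pub fn new(packets: &[u32]) -> Self {
        Ring { packets: packets.iter().copied().collect(), rptr: 0 }
    }

    /// Consume one packet; returns false when nothing is left to consume.
    fn step(&mut self) -> bool {
        match self.packets.pop_front() {
            Some(_) => {
                self.rptr += 1;
                true
            }
            None => false,
        }
    }

    /// Drain until `rptr` reaches `target`, the ring runs dry,
    /// or `time_budget` has elapsed, whichever comes first.
    pub fn drain_until_wptr(&mut self, target: u32, time_budget: Duration) -> DrainOutcome {
        let start = Instant::now();
        while self.rptr < target {
            if start.elapsed() >= time_budget {
                return DrainOutcome::BudgetExhausted;
            }
            if !self.step() {
                return DrainOutcome::RingEmpty;
            }
        }
        DrainOutcome::ReachedTarget
    }
}
```

Returning a distinct outcome for each exit path lets the caller tell a genuine drain apart from a timeout, which matters if a fallback such as the direct swap notification is only wanted in one of those cases.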
### Parked-waiter producer-trace finding

Empirical run at -n 500M lockstep with the new
`--trace-handles-focus=0x1004,0x100c,0x15e4,0x42450b5c`:

```
handle=0x00001004 kind=Event/Manual waiters=1 signaled=false
signal_attempts=0 (primary=0, ghost=0) waits=1 wakes=0
created cycle=0 tid=1 lr=0x824a9f6c src=NtCreateEvent
timeline: cycle=0 tid=10 lr=0x824ac578 src=do_wait_single[wait]
GuestExport=0 KernelInternal=0 waits=1
=> producer is a missing kernel signal source (or BST-paradox upstream)
```

The reports for 0x100c and 0x15e4 have the same shape. 0x42450b5c instead
shows `<UNCREATED>` + `<AUDIT_BLIND>` (its waiter parked via a
non-`do_wait_single` path).

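For readers unfamiliar with the flag's argument format, the comma-separated hex list passed to `--trace-handles-focus` could be parsed roughly as follows. The function name `parse_focus_list` and its error strings are assumptions made for this sketch; the actual parser in the tree may accept a different syntax.

```rust
/// Parse a comma-separated list of `0x`-prefixed hex handles,
/// e.g. "0x1004,0x100c". Hypothetical helper, not the project's parser.
pub fn parse_focus_list(arg: &str) -> Result<Vec<u32>, String> {
    arg.split(',')
        .map(str::trim)
        // Tolerate empty items such as a trailing comma.
        .filter(|s| !s.is_empty())
        .map(|s| {
            let digits = s
                .strip_prefix("0x")
                .or_else(|| s.strip_prefix("0X"))
                .ok_or_else(|| format!("handle `{s}` is missing the 0x prefix"))?;
            u32::from_str_radix(digits, 16)
                .map_err(|e| format!("handle `{s}` is not valid hex: {e}"))
        })
        // Collecting into Result stops at the first malformed handle.
        .collect()
}
```

Called with the list from the run above, this yields the four handles in order, and a malformed entry produces an error naming the offending item.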
**Conclusion**: hypothesis (A) is confirmed for 3 of the 4 handles. The
producer code path is genuinely missing: no Nt/KeSetEvent,
KePulseEvent, or KeReleaseSemaphore call ever targets these handles
during 500M instructions of execution. The PPC-vs-Rust traversal
paradox (BST-bug from `project_xenia_rs_sylpheed_event_chain_2026_04_29`)
is **NOT** the cause for these specific handles. The 3 handles share
the same creator (lr=0x824a9f6c, tid=1, all at cycle=0) and the same
wait-call wrapper (lr=0x824ac578), so they are likely 3 sibling worker
threads all waiting for "work to do" notifications that never come. The
most likely producer-class candidates for next session:

- File I/O completion (`signal_io_completion_event`): currently a
  real implementation but possibly never reached; trace `NtReadFile`
  paths to see if completion events would target these handles.
- XAM async task completion: F2/F3, deferred from prior sprint.
- Audio buffer-complete: `XAudioRegisterRenderDriverClient` is a
  one-shot stub.
- Timer DPCs: `KeSetTimer` has a real implementation, but APC delivery
  may be routed incorrectly.

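The reasoning behind the per-handle DIAGNOSIS can be sketched as a small classifier over the counters printed in the report. Everything below is illustrative: `HandleStats`, `Diagnosis`, and `diagnose` are hypothetical names, and the mapping of "no create and no wait recorded" to the tool's `<AUDIT_BLIND>` tag is an assumption rather than the documented rule.

```rust
/// Counters resembling those in the DIAGNOSIS report (hypothetical struct).
#[derive(Debug, Default, Clone, Copy)]
pub struct HandleStats {
    pub created: bool,
    pub guest_export_signals: u32,
    pub kernel_internal_signals: u32,
    pub waits: u32,
    pub wakes: u32,
}

#[derive(Debug, PartialEq, Eq)]
pub enum Diagnosis {
    /// Neither a create nor a wait was recorded: the audit cannot see this handle.
    AuditBlind,
    /// A waiter parked, but no signal attempt of any class was recorded.
    ProducerMissing,
    /// Signals were recorded, yet no waiter was ever woken.
    SignalMissedWaiter,
    /// Nothing in the counters indicates a stall.
    NoStall,
}

pub fn diagnose(s: &HandleStats) -> Diagnosis {
    let signals = s.guest_export_signals + s.kernel_internal_signals;
    if !s.created && s.waits == 0 {
        Diagnosis::AuditBlind
    } else if s.waits > 0 && signals == 0 {
        // "Guest never tried": the case reported for 0x1004, 0x100c, 0x15e4.
        Diagnosis::ProducerMissing
    } else if s.waits > 0 && s.wakes == 0 {
        // "Signal landed but missed waiter".
        Diagnosis::SignalMissedWaiter
    } else {
        Diagnosis::NoStall
    }
}
```

Fed the counters from the 0x1004 report (created, zero signals of either class, one wait, zero wakes), this classifier returns `ProducerMissing`, matching the conclusion above.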
### Acceptance criteria

| # | Criterion | Met? |
|---|-----------|------|
| 1 | Phase 1: zero "PM4_XE_SWAP not consumed" warnings under canonical invocation | ✅ |
| 2 | Phase 2: per-handle DIAGNOSIS for all four parked handles | ✅ |
| 3 | Phase 3: vsync rate restored under --parallel; n2m golden untouched | ✅ partial: rate is up, but the FIFO cap (INTERRUPT_QUEUE_CAP=4) still bottlenecks |
| 4 | cargo test ≥556 | ✅ 561 |
| 5 | All work merged to master | ✅ |
| 6 | **STRETCH** ≥1 of 4 handles signals | ❌ but the producer trace explains why: the producer is missing; this is not a wake-eligibility bug |
| 7 | **STRETCH** draws > 0 at -n 100M lockstep | ❌ still gated on the parked-waiter handles |

### Recommended next session

1. **Producer hunt** for the 3 Event/Manual handles. With the
   diagnostic in place, the hunt can be focused: identify the guest
   function at `lr=0x824ac578` (the shared wait-call wrapper), walk its
   callers, and find which kernel signal source should be wired for each
   handle. Likely starting points: file I/O completion
   (`signal_io_completion_event`), XamTaskSchedule callback (F2),
   XAudio buffer-complete.
2. **Raise INTERRUPT_QUEUE_CAP** for `--parallel` workloads: the
   3044 dropped vsyncs at -n 30M --parallel suggest the FIFO is the
   next bottleneck.
3. **F2/F3** (XAM async completion) per the still-deferred list,
   especially if Phase 2 of next session pinpoints a missing XAM
   producer.
4. **GPUBUG-FETCH-PATCH-001**: re-enable the PM4_TYPE0
   fetch-constant patch via a side-channel (GpuCommand variant)
   when draws actually start firing (relevant for bloom/blur N+1).
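
To make item 2 concrete, the interaction between wall-clock v-sync generation and a small interrupt FIFO can be modeled as below. This is a sketch under stated assumptions: `VsyncQueue`, `tick_wallclock`, and the drop-when-full policy are hypothetical, time is passed in as an elapsed `Duration` so the model stays deterministic, and the real `tick_vsync_wallclock` and `INTERRUPT_QUEUE_CAP` handling may behave differently.

```rust
use std::collections::VecDeque;
use std::time::Duration;

/// Hypothetical model of a bounded v-sync interrupt FIFO.
pub struct VsyncQueue {
    period: Duration,
    cap: usize,
    /// Elapsed time at which the next v-sync becomes due.
    next_due: Duration,
    next_seq: u64,
    pending: VecDeque<u64>,
    /// V-syncs discarded because the FIFO was already full.
    pub dropped: u64,
}

impl VsyncQueue {
    pub fn new(period: Duration, cap: usize) -> Self {
        assert!(!period.is_zero(), "period must be non-zero");
        VsyncQueue {
            period,
            cap,
            next_due: period,
            next_seq: 0,
            pending: VecDeque::new(),
            dropped: 0,
        }
    }

    /// Generate every v-sync that became due by `now` (time since start).
    /// A long gap between calls produces a burst, which is where the cap bites.
    pub fn tick_wallclock(&mut self, now: Duration) {
        while self.next_due <= now {
            if self.pending.len() < self.cap {
                self.pending.push_back(self.next_seq);
            } else {
                self.dropped += 1;
            }
            self.next_seq += 1;
            self.next_due += self.period;
        }
    }

    /// Deliver the oldest pending v-sync, if any.
    pub fn pop(&mut self) -> Option<u64> {
        self.pending.pop_front()
    }

    pub fn pending_len(&self) -> usize {
        self.pending.len()
    }
}
```

In this model, a single late tick that covers ten periods queues only the first four v-syncs with a cap of 4 and drops the remaining six, whereas a cap of 16 keeps all ten. That is the burst behavior a larger cap would be meant to absorb.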