Source changes (dormant parity infra, retained from iterate 2.AI/2.AO):
- xenia-kernel/exports.rs: nt_create_event manual_reset polarity + related event wiring
- xenia-gpu/mmio_region.rs: D1MODE_VBLANK_VLINE_STATUS hardcode parity

Also lands the audit-runs/ analysis notes (.md/.txt/.json digests) for the iterate 2.x VSync/0x10e8/0x1004 wedge investigation. Raw trace dumps (.jsonl/.gz/.csv/.stdout) and agent worktrees (.claude/) are gitignored as regenerable local artifacts; see memory + HANDOFF for the running findings.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Phase XAudio-Resume — ESCALATION (case IV)
Date: 2026-05-19

Outcome: Resume mechanism is correctly implemented. The 60% missing event volume is gated on a DOWNSTREAM application-level spin-poll, not on the resume itself. No engine change landed.
Canary's resume mechanism (Step 1+2)
For each suspended XAudio worker (entry_pc=0x824D2878 aff=16 → tid=14;
entry_pc=0x824D2940 aff=32 → tid=15), canary tid=6 (main) emits an identical
five-call sequence starting with ExCreateThread:
canary tid=6 idx=106750..106766 (host_ns 1726.0..1726.2 ms)
106750 import.call ExCreateThread
106751 kernel.call ExCreateThread
106752 handle.create (raw_handle 0x???????? — tid=14 handle)
106753 thread.create (entry_pc=0x824d2878, suspended=true)
106754 kernel.return ExCreateThread
106755 import.call ObReferenceObjectByHandle
106756 kernel.call ObReferenceObjectByHandle
106757 kernel.return ObReferenceObjectByHandle
106758 import.call KeSetBasePriorityThread
106759 kernel.call KeSetBasePriorityThread
106760 kernel.return KeSetBasePriorityThread
106761 import.call KeResumeThread ← RESUME (xboxkrnl ord 146)
106762 kernel.call KeResumeThread
106763 kernel.return KeResumeThread
106764 import.call ObDereferenceObject
106765 kernel.call ObDereferenceObject
106766 kernel.return ObDereferenceObject
Block repeats verbatim at idx 106767-106783 for entry_pc=0x824D2940. Containing
function is XAudioRegisterRenderDriverClient (visible at idx 106817).
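The spawn-then-resume pattern above can be located mechanically in a trace. The following is a minimal sketch, not the tooling used for this audit: it assumes each event is a dict with `tid`, `idx`, `kind` and `name` keys (hypothetical field names; the real JSONL schema may differ), and it matches only the order of `import.call` events listed above.

```python
# Assumed event shape: {"tid": int, "idx": int, "kind": str, "name": str}.
# The key names are illustrative assumptions, not the confirmed trace schema.
EXPECTED = [
    "ExCreateThread",
    "ObReferenceObjectByHandle",
    "KeSetBasePriorityThread",
    "KeResumeThread",
    "ObDereferenceObject",
]


def find_spawn_resume_blocks(events, tid):
    """Return the idx of each ExCreateThread that starts an EXPECTED run on `tid`."""
    calls = [
        (e["idx"], e["name"])
        for e in events
        if e["tid"] == tid and e["kind"] == "import.call"
    ]
    names = [name for _, name in calls]
    hits = []
    for i in range(len(names) - len(EXPECTED) + 1):
        if names[i:i + len(EXPECTED)] == EXPECTED:
            hits.append(calls[i][0])
    return hits
```

Run against the canary trace with `tid=6`, a matcher of this shape would be expected to report two hits, one per XAudio worker, if the schema assumptions hold.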
Ours's behavior at the matched site (Step 3)
Cold ours (-n 500M, --halt-on-deadlock, fresh cache wipe), checked against
/tmp/ours-xaudio.jsonl (121,569 events captured before halt):
ours tid=1 idx=106756..106786 (host_ns 1626 ms — boot is ~100 ms ahead of canary)
106756 import.call ExCreateThread
106757 kernel.call ExCreateThread
106758 handle.create
106759 thread.create (entry_pc=0x824d2878, suspended=true) ← matches canary
106760 kernel.return ExCreateThread
...
106767 import.call KeResumeThread ← RESUME fires
106768 kernel.call KeResumeThread
106769 kernel.return KeResumeThread
...
106776 thread.create (entry_pc=0x824d2940, suspended=true) ← matches canary
...
106784 import.call KeResumeThread ← second RESUME fires
106785 kernel.call KeResumeThread
106786 kernel.return KeResumeThread
ours's per-tid first-events (cold) for the spawned children:
tid=9 (=canary tid=14, entry 0x824d2878): 77 events, idx 0..76 identical to canary tid=14
tid=10 (=canary tid=15, entry 0x824d2940): 17 events, idx 0..16 identical to canary tid=15
Ours's tid=9 / tid=10 EXECUTE the canary-matching XAudio init sequence:
KeWaitForSingleObject (with immediate signal) → spinlock/IRQL cycle →
XAudioGetVoiceCategoryVolumeChangeMask → KeReleaseSemaphore →
more IRQL cycles. Then halt.
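The "identical to canary" claim for the child threads is a matched-prefix comparison. A minimal sketch of that comparison follows; it assumes both per-thread streams are already filtered and ordered, and that events compare equal when their `kind` and `name` fields match (both field names are assumptions, as is the choice to ignore handles and timestamps).

```python
def matched_prefix(canary_events, ours_events):
    """Count leading events that agree on (kind, name) in both streams."""
    count = 0
    for canary, ours in zip(canary_events, ours_events):
        same_kind = canary.get("kind") == ours.get("kind")
        same_name = canary.get("name") == ours.get("name")
        if not (same_kind and same_name):
            break
        count += 1
    return count
```

A result equal to the length of the shorter stream means ours never diverged within the events it emitted, which is the situation described for tid=9 and tid=10.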
Halt-on-deadlock diagnostic shows tids 9 and 10 in state Ready at
pc=0x824d1404 lr=0x824d22b4 — they are NOT blocked on a missing kernel
API, they are inside a guest-side spin-poll loop:
0x824d1400: beqlr cr6 ; return if poll succeeded
0x824d1404: cmpd cr6, r10, r11 ; r10 vs r11
0x824d1408: beq cr6, 0x824D1420 ; ok-branch
0x824d140c: mr r31, r31 ; nop (yield hint)
0x824d1410: ld r11, 0(r4) ; reload [r4]
0x824d1414: cmpdi cr6, r11, 0
0x824d1418: bne cr6, 0x824D1404 ; if nonzero, loop
0x824d141c: blr
r4 = r31+356 (the caller sets it up with addi r4, r31, 356 at 0x824d22a8). The
threads are spin-polling guest memory at [r31+356], waiting for it to reach 0.
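To make the exit conditions of the loop explicit, here is a small Python model of the instructions from 0x824d1404 to 0x824d141c. It is a reading of the disassembly above, not emulator code: `read_flag` stands in for the load from [r4], the initial register values are caller-supplied assumptions, and `max_iters` exists only so the model terminates.

```python
def spin_poll(read_flag, r10, r11, max_iters):
    """Model the guest loop; returns how it left, or 'spinning' if it did not."""
    for _ in range(max_iters):
        # 0x824d1404: cmpd cr6, r10, r11 / 0x824d1408: beq cr6, 0x824D1420
        if r10 == r11:
            return "ok-branch"
        # 0x824d1410: ld r11, 0(r4)
        r11 = read_flag()
        # 0x824d1414: cmpdi cr6, r11, 0 / 0x824d1418: bne loops / 0x824d141c: blr
        if r11 == 0:
            return "returned"
    return "spinning"
```

With a `read_flag` that always returns the same non-zero value different from `r10`, the model never leaves the loop, which matches the Ready-but-not-progressing state in the halt diagnostic.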
Classification: case IV (not I / II / III)
The plan's original classification anticipated:
- (I) ours doesn't reach the spawn LR ← refuted: spawn fires at idx 106756/106773
- (II) ours reaches spawn but no resume ← refuted: KeResumeThread fires at idx 106768/106785
- (III) ours's NtResumeThread is misimplemented ← refuted: resume_ref() correctly clears Blocked(BlockReason::Suspended) → Ready; the halt diagnostic confirms the post-resume Ready state and first-77/17 events identical to canary
Actual classification (IV): Resume succeeds; XAudio threads start running and
execute their init sequence verbatim against canary; then enter a guest-side
application spin-poll on [r31+356] that never resolves in ours. The producer
of the 0-write to that location is part of canary's audio/GPU host bridge chain
that AUDIT-048 only partially restored (cascades A/B/D landed; cascade C —
XAudioSubmitRenderDriverFrame — remained 0 per that audit's own assessment).
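The case split can be written down as a decision function. This is only a restatement of the reasoning above; the four boolean inputs are hypothetical names for the evidence gathered in Steps 1 to 3, not fields produced by any existing tool.

```python
def classify(spawn_seen, resume_seen, ready_after_resume, child_prefix_matches):
    """Map trace evidence to the escalation case label (I, II, III or IV)."""
    if not spawn_seen:
        return "I"    # ours never reaches the spawn
    if not resume_seen:
        return "II"   # spawn happens but KeResumeThread never fires
    if not (ready_after_resume and child_prefix_matches):
        return "III"  # resume fires but does not behave like canary's
    return "IV"       # resume works; the stall is downstream of it
```

For this session all four inputs are true, which yields case IV.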
Why the 60% volume claim doesn't follow from a resume-only fix
Phase NonMatch's "60% missing event volume" attribution to XAudio assumed
the threads simply weren't running. They ARE running — they emit identical
first events, get scheduled, and reach the spin loop. The volume bottleneck
is the post-init steady-state pump: canary's 6.15 M tid=14 events come from
26,126 repeated iterations of the XAudioGetVoiceCategoryVolumeChangeMask /
KeReleaseSemaphore / IRQL-cycle loop, each iteration gated on the host
bridge clearing the [r31+356] flag. With the flag stuck non-zero in ours,
the loop never re-enters; only the single first iteration (idx 0-76) ever
executes. No amount of resume-side change will unstick this.
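A toy model shows why the event volume collapses to one iteration. It assumes, as a simplification not confirmed by the traces, that each pump iteration leaves the flag non-zero and that only the host bridge can clear it; `bridge_clears_flag` is a hypothetical switch standing in for whether that producer exists.

```python
def run_pump(iterations_wanted, bridge_clears_flag):
    """Count pump iterations completed when re-entry is gated on a flag."""
    flag = 0
    completed = 0
    for _ in range(iterations_wanted):
        if flag != 0:
            if not bridge_clears_flag:
                break  # guest spins forever; no further iterations
            flag = 0   # host bridge clears the flag (assumed behavior)
        completed += 1
        flag = 1       # assumption: each iteration leaves the flag set
    return completed
```

Under these assumptions `run_pump(26126, True)` completes every iteration while `run_pump(26126, False)` stops after the first, which is the shape of the canary versus ours difference described above.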
Out-of-scope for this session
Per session authorization, fixing the host-bridge memory-write that clears
[r31+356] requires touching xenia-apu/xenia-gpu host code, which is
explicitly forbidden ("the host bridge is separate"). Therefore no engine
change lands in this session.
Progression metric (re-validation gate, baseline-only)
Not re-measured for a change — there was no change. Pre-existing baseline
remains the C+23+absorber state (23cf4c4cbf61a577caa4118ab2308ba6 /
ba5b5e07… depending on Phase D stage). swaps and draws unchanged. Per-chain
matched-prefixes from MEMORY.md remain:
- main tid=6→1: 105,046 (with Phase D D-extension absorber)
- sister chains 11/32/4/41/16: preserved
Recommended next attack target
The remaining XAudio gate is AUDIT-048 cascade C: producer of the
[r31+356]=0 write. This is the part of the audio host-bridge chain that did
NOT land in AUDIT-048. It likely involves:
- XAudioSubmitRenderDriverFrame host-side callback firing the buffer-complete event with a side effect that decrements/clears a counter at offset 356 of the XAudio client struct.
- KeReleaseSemaphore on a paired semaphore that produces the host-side buffer-complete notification.
A targeted re-attack would:
- Read xenia-canary's apu/audio_system.cc + apu/xma_decoder.cc to find the host-side write that clears r31+356 (likely an XAUDIO_CLIENT_STATE struct field).
- Mirror it in xenia-rs's xaudio.rs / audio worker context.
- Re-validate the cold cycle. swaps may move 1→2 if the audio pump reaches the renderer fence; draws likely remain 0 (audio ≠ renderer per AUDIT-048).
That work is the AUDIT-048-cascade-C completion task, NOT the resume gate. It's the natural sister of the deferred sub_825070F0 main-gate Path P.
Per-chain delta (no change this session)
| chain | pre | post | delta |
|---|---|---|---|
| tid=6→1 main | 105,046 | 105,046 | 0 |
| tid=11→11 | preserved | preserved | 0 |
| tid=14→9 XAudio | 41 | 41 | 0 |
| tid=15→10 XAudio | 16 | 16 | 0 |
| tid=4→4 | preserved | preserved | 0 |
| tid=16→16 | preserved | preserved | 0 |
Artifacts
- tid6_window.json: canary tid=6 events idx 106700..108200 around the XAudio spawn burst
- tid14_first.json / tid15_first.json: canary tid=14/15 first 120 events
- extract_window.py: extraction script
- escalation.md: this file
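For readers without the artifacts, a window extraction of the kind listed above might look like the sketch below. It is not the contents of extract_window.py; it assumes one JSON object per line with `tid` and `idx` keys, which is an assumption about the trace schema.

```python
import json


def extract_window(lines, tid, first_idx, last_idx):
    """Return events for `tid` whose idx lies in [first_idx, last_idx]."""
    window = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the dump
        event = json.loads(line)
        if event.get("tid") != tid:
            continue
        if first_idx <= event.get("idx", -1) <= last_idx:
            window.append(event)
    return window
```

Passing an open file handle for the canary trace with `tid=6`, `first_idx=106700` and `last_idx=108200` would select the same range that tid6_window.json is described as covering.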