Source changes (dormant parity infra, retained from iterate 2.AI/2.AO): - xenia-kernel/exports.rs: nt_create_event manual_reset polarity + related event wiring - xenia-gpu/mmio_region.rs: D1MODE_VBLANK_VLINE_STATUS hardcode parity Also lands the audit-runs/ analysis notes (.md/.txt/.json digests) for the iterate 2.x VSync/0x10e8/0x1004 wedge investigation. Raw trace dumps (.jsonl/.gz/.csv/.stdout) and agent worktrees (.claude/) are gitignored as regenerable local artifacts — see memory + HANDOFF for the running findings. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Canary boot-to-first-draw trajectory
Source data: xenia-canary/build-cross/bin/Windows/Debug/canary-jitter-1.jsonl
(4.4 GB, 18.7M events, 90s wallclock, cold run). Profile builder at
xenia-rs/audit-runs/phase-nonmatch-investigation/build_profiles.py.
TL;DR
- First boot-time VdSwap fires on canary's tid=6 (guest main) at ~9.5 s wallclock, immediately after the rendering subsystem is initialized. This is the empty / system-command-buffer swap that ours also reaches (the swaps=1 metric in ours is this swap).
- First gameplay VdSwap (intro-movie frame) fires on canary's tid=13 (renderer) starting at ~10.7 s wallclock, after the sub_825070F0 worker fan-out at host_ns ≈ 10.382-10.384 s. Canary tid=13 emits 12,092 VdSwap + VdGetSystemCommandBuffer calls in the 90-s window, i.e. ~150 fps sustained.
- The gating event between "boot swap" and "first gameplay swap" is the 4-worker fan-out spawned by sub_825070F0 at PCs 0x82506528 / 0x82506558 / 0x82506588 / 0x825065B8 with ctx 0xBCE251C0. Three of the four workers begin emitting events at host_ns ≈ 10.705 s (tids 27/28/29; see canary-tid-profiles.md row 33-35).
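The ~150 fps figure only works out if the swap count is divided by the active rendering window rather than by the whole 90-s capture. A minimal sketch of that arithmetic, using only the numbers quoted above (the variable names are illustrative, not taken from build_profiles.py):

```python
# Figures quoted in the TL;DR; variable names are illustrative only.
swaps = 12_092                # VdSwap calls emitted by canary tid=13
trace_end_s = 90.0            # length of the capture window
first_gameplay_swap_s = 10.7  # approximate wallclock of the first gameplay VdSwap

# Time during which the renderer is actually producing frames.
active_window_s = trace_end_s - first_gameplay_swap_s

fps_active = swaps / active_window_s    # rate while the renderer is running
fps_whole_trace = swaps / trace_end_s   # diluted by the ~10.7 s of boot

print(f"active window: {active_window_s:.1f} s")
print(f"rate over active window: {fps_active:.0f} fps")
print(f"rate over whole trace: {fps_whole_trace:.0f} fps")
```

The active-window rate is the one that matches the "~150 fps sustained" claim; the whole-trace rate is noticeably lower because it includes the pre-fan-out boot time.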
Phase-by-phase trajectory
| t (host_ns) | Phase | What | Citation |
|---|---|---|---|
| 0–660 ms | XEX load / startup | XexLoadImage, ELF→guest init, kernel-state ctor. Spawn tid=6 ("guest main") at host_ns=660 ms. | phase-nonmatch-investigation/canary-tid-profiles.md:14 |
| 660 ms–1.42 s | Pre-spawn init | tid=6 sets up TLS, runs CRT init. Establishes vtables / globals. Sylpheed-specific: writes 0x8200A1E8 (vtable for ANON_Class_713383D7) at the install-epoch host_ns ≈ 9.4–9.6 s via a 12-byte POD struct copy {vptr, self, self} (see project_audit_068_session3). Critical: this is the vtable whose slot 1 = sub_825070F0. | project_audit_068_session3_2026_05_20.md |
| 1.42–1.94 s | Main init burst | 10 thread spawns (tids 8–17) by tid=6. Ours matches this 1:1. Entries include 0x82181830, 0x8245A5D0, 0x82450A28, 0x82457EF0, 0x824CD458, 0x822F1EE0 (renderer, susp=T), 0x824D2878/0x824D2940 (XAudio, susp=T), 0x82178950 (XMA), 0x821748F0 (file IO spawner, susp=T). | canary-tid-profiles.md:42-55 |
| 1.671 s | Renderer spawn | tid=6 calls ExCreateThread with entry 0x822F1EE0, ctx 0xBCE24A40, suspended=True. Becomes canary tid=13. | canary-tid-profiles.md:21,49 |
| 1.726–1.728 s | XAudio spawn | tids 14/15 (XAudio voice-mask poll + sister) spawned suspended. Will dominate event volume (~11M events combined). | canary-tid-profiles.md:50-51 |
| 1.94–2.15 s | Secondary init burst | 8 more spawns (tids 18–25), file-IO + XAM helpers. Ours emits 0 here (already wedged). | result.md:48 |
| 9.4–9.6 s | vtable install epoch | Host-side POD struct copy installs 0x8200A1E8 at run-specific arena address (0xBCE25340 or 0xBCE251C0 per arena drift). This is the ANON_Class_713383D7 instance whose slot 1 = sub_825070F0. | project_audit_068_session3_2026_05_20.md |
| ~9.5 s | Boot-init VdSwap (on tid=6) | After VdInitializeEngines + VdShutdownEngines + VdInitializeEngines + VdSetGraphicsInterruptCallback + VdSetSystemCommandBufferGpuIdentifierAddress + VdInitializeRingBuffer + VdEnableRingBufferRPtrWriteBack + VdGetSystemCommandBuffer, tid=6 emits one VdSwap to publish the boot framebuffer. draws=0 still (no PM4 draw packets). | Mirror of ours-postfix.jsonl idx 105044-105285; canary same shape. |
| 10.080 s | tid=26 second-call helper | 0x821748F0 second invocation. | canary-tid-profiles.md:32 |
| 10.383 s | sub_825070F0 worker fan-out | Four ExCreateThread calls in 1 ms spawn entries 0x82506528 / 0x82506558 / 0x82506588 / 0x825065B8 all sharing ctx 0xBCE251C0 (the ANON_Class instance). These are the workers that consume cache-file IO and signal the wedge event(s) that AUDIT-049 found dangling in ours. | canary-tid-profiles.md:63-66, sub_825070F0.md |
| 10.7 s | Worker resume / first events | tids 27, 28, 29 emit their first events. tid=28 dominates (3.26M events) doing file IO (530× NtReadFile of cache:\…), heavy CS contention (1.07M RtlEnterCS), and signaling the wedge events. | canary-tid-profiles.md:33-35, sub_82452DC0.md |
| ~10.7+ s | Renderer wakes | Once sub_825070F0 workers begin, the events that canary's tid=13 was waiting on get signaled. tid=13 transitions Blocked→Running, starts producing VdGetSystemCommandBuffer/VdSwap pairs at ~150 fps. | canary-tid-profiles.md:21, result.md:30-39 |
| ~10.7–90 s | Sustained rendering | tid=13 emits 12,092 VdSwap calls. Intro movie ⇒ title screen ⇒ gameplay (depends on user input). In an unattended cold run, canary likely plateaus on the title screen but is genuinely rendering. | canary-tid-profiles.md:21 |
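When reading individual trace events it helps to map a timestamp back to one of the phases above. A minimal sketch, assuming timestamps arrive as host_ns integers; the boundary list is transcribed from the table, point-in-time rows are omitted, and the two "(untabulated)" labels are placeholders for intervals the table does not describe:

```python
from bisect import bisect_right

# Phase start times in seconds, transcribed from the table. Point-in-time rows
# (renderer spawn, XAudio spawn, boot-init VdSwap, tid=26 helper) are omitted
# so the boundaries form a simple ascending partition.
PHASE_STARTS = [0.0, 0.660, 1.42, 1.94, 2.15, 9.4, 9.6, 10.383, 10.7]
PHASE_NAMES = [
    "XEX load / startup",
    "Pre-spawn init",
    "Main init burst",
    "Secondary init burst",
    "(untabulated)",              # placeholder: 2.15 s to 9.4 s is not in the table
    "vtable install epoch",
    "(untabulated)",              # placeholder: 9.6 s to 10.383 s
    "sub_825070F0 worker fan-out",
    "Sustained rendering",
]


def phase_at(t_s: float) -> str:
    """Return the phase whose start time is the latest one not after t_s."""
    index = bisect_right(PHASE_STARTS, t_s) - 1
    return PHASE_NAMES[max(index, 0)]


def phase_at_ns(host_ns: int) -> str:
    """Same lookup for a host_ns timestamp in nanoseconds."""
    return phase_at(host_ns / 1e9)


print(phase_at(1.671))
print(phase_at_ns(9_500_000_000))
print(phase_at(50.0))
```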
Canary call-chain from entry_point to first gameplay draw
canary tid=6 (guest main)
  entry_point
    → sub_8216EA68 (post-init dispatcher)
      → sub_822F1AA8 (game-loop dispatcher) (sub_822F1AA8.md)
        → bctrl vtable[0]({sub_82175330 → tail → sub_82173990})
          → sub_82173990 (sync task-spawn-and-join) (sub_82173990.md)
            → bl sub_821746B0 (alloc task + spawn worker thid=17, F8000094)
              [worker thid=17 runs body sub_821748F0
                → sub_821C4EB0 → sub_821CC3F8 → sub_821CBA08
                → sub_821CB030 (creates Event, submits work via sub_82452DC0)
                → … cache file loads (cache:\aab216c3\..., cache:\87719002\..., etc.)
                → spawns child workers via ExCreateThread(...,821C4AD0,...)
                → eventually ExTerminateThread(0)]
            → KeWaitForSingleObject(thid=17.handle) INFINITE
              [blocks ~445 log lines wallclock; completes when thid=17 terminates]
            ← returns
          ← returns to sub_822F1AA8 outer loop
        → iterates sub_821741C8 → sub_82172BA0 → bctrl vtable[6]
          → sub_821B55D8 → sub_824F8398 → sub_824F7CD0 → sub_824F7800
            → bctrl vtable[1] = sub_825070F0 (sub_825070F0.md)
              → 4× ExCreateThread(...,0x82506528/58/88/B8, ctx=0xBCE25xxx, susp=T)
              → 4× NtResumeThread / scheduler enables the workers
              [workers tids 27/28/29/+1 begin executing]
        → outer loop continues
          → KeWaitForSingleObject (4040×/60 s = ~67 fps frame-pacing wait)
          → bctrl vtable[2] → various per-frame work
          → tid=6's main loop produces no VdSwap directly past the init swap

canary tid=13 (renderer; spawned by tid=6 at 0x822F1EE0)
  [stays suspended OR Blocked-on-event until worker fan-out at 10.38 s]
  → after wake, enters render loop:
    while (running) {
      VdGetSystemCommandBuffer(...)        ; 12,092× / 90 s
      … build per-frame command buffer …
      VdSwap(buffer_ptr, fetch_ptr, …)     ; 12,092× / 90 s
    }
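The dependency between the worker fan-out and the renderer can be illustrated with a toy model. This is only a sketch under stated assumptions: a single `threading.Event` stands in for whatever object guards tid=13's body (the trace does not name it here), three loop iterations stand in for VdGetSystemCommandBuffer/VdSwap pairs, and a short timeout stands in for "wedged forever".

```python
import threading


def run(fan_out: bool, timeout_s: float = 0.2) -> int:
    """Return how many swaps the toy renderer emits."""
    wake = threading.Event()   # hypothetical stand-in for the renderer's guard
    swaps = []

    def renderer():
        # Blocked-on-event until a worker signals; gives up after timeout_s.
        if wake.wait(timeout_s):
            for frame in range(3):
                swaps.append(frame)   # stands in for one command-buffer + swap pair

    def worker():
        wake.set()             # stands in for a fan-out worker signaling

    render_thread = threading.Thread(target=renderer)
    render_thread.start()

    # Four workers, mirroring the 4x ExCreateThread fan-out; none if fan_out is False.
    workers = [threading.Thread(target=worker) for _ in range(4)] if fan_out else []
    for w in workers:
        w.start()
    for w in workers:
        w.join()

    render_thread.join()
    return len(swaps)


print(run(fan_out=True))    # renderer wakes and emits swaps
print(run(fan_out=False))   # renderer times out without emitting any
```

With `fan_out=False` the model reproduces the shape seen in ours: the renderer thread exists but never gets past its wait.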
Pre-conditions canary establishes before first gameplay draw
In time order, all must hold:
- GPU subsystem initialized: VdInitializeEngines → VdInitializeRingBuffer → VdEnableRingBufferRPtrWriteBack → VdSetGraphicsInterruptCallback. Ours: ✓ (idx 105044-105117).
- Renderer thread alive: tid=13 created suspended via ExCreateThread(entry=0x822F1EE0, susp=T). Ours: ✓ (idx 105348).
- Worker-cluster activation: 4 workers spawned by sub_825070F0 consuming sub_82452DC0 work. Ours: ✗ 0 fires.
- sub_821CB030's Event signaled: the per-load completion event created at sub_821CB030+0x128 and waited at +0x1AC must be signaled by a sub_825070F0 worker. Ours: ✗ NO_SIGNALS_DESPITE_WAITS on handle 0x12d0.
- sub_82173990's join-wait completes: tid=6's wait at sub_82173990+0x2D0 on the thid=17 thread handle. Ours: ✗ tid=1 stuck on handle 0x12c8 (= tid=13's thread handle).
- Renderer wakes: per AUDIT-049, the worker-cluster must signal whatever guards tid=13's body. Canary: ✓. Ours: ✗ tid=13 itself wedges in sub_821CB030.
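A condition like NO_SIGNALS_DESPITE_WAITS can be checked mechanically: collect every handle that is waited on and report those that are never signaled. The sketch below assumes each trace event has been reduced to a dict with `kind` and `handle` keys; those field names and the sample events are hypothetical, not the actual JSONL schema.

```python
from collections import Counter


def dangling_waits(events):
    """Return handles that appear in wait events but never in a signal event."""
    waits, signals = Counter(), Counter()
    for ev in events:
        if ev["kind"] == "wait":
            waits[ev["handle"]] += 1
        elif ev["kind"] == "signal":
            signals[ev["handle"]] += 1
    return sorted(h for h in waits if signals[h] == 0)


# Hypothetical sample: 0x12D0 is waited on but never signaled;
# 0x1300 is an invented handle that is waited on and then signaled.
sample = [
    {"kind": "wait", "handle": 0x12D0},
    {"kind": "wait", "handle": 0x1300},
    {"kind": "signal", "handle": 0x1300},
    {"kind": "wait", "handle": 0x12D0},
]

print([hex(h) for h in dangling_waits(sample)])
```

Waits on thread handles (such as the join-wait on thid=17) complete on thread termination rather than on an explicit signal, so a real check would need to treat those separately.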
Numerical signature of canary at ~50 s wallclock (for reference)
- 18.7 M events / 28 tids.
- Renderer tid=13: 594 k events, including 12,092 VdSwap.
- Worker tid=28 (sub_825070F0 worker 0): 3.26 M events.
- XAudio tid=14/15: 6.15 M / 4.78 M events.
- ours at 50 M-instr / ~3 s wallclock: 121 k events / 13 tids. Renderer tid=13 in ours: ~80 events (wedged).
- Event volume differs by ~150× (18.7 M vs 121 k) because ours wedges ~7 s before canary's sub_825070F0 fan-out fires.
Uncertainty / open questions
- What is the precise host-side install of the ANON_Class_713383D7 vtable 0x8200A1E8? AUDIT-068 sessions 1–4 localized this to a POD struct copy in the install epoch [9.4 s, 9.6 s], with the writer identified at GUEST PPC sub_824FD240+0x24 (NOT a host-side kernel import as initially feared). But in ours, sub_824FD240 and its callers sub_824F7800/CD0/8398 fire 0× because that chain is downstream of the tid=13 wedge. See project_audit_068_session4.
- First "gameplay draw" precisely: the first VdSwap that emits PM4 draw packets (e.g. PM4_TYPE3 DRAW_INDX) into the ringbuffer. Need to inspect canary's PM4 ring at host_ns ≈ 10.7 s to confirm. AUDIT history hasn't disambiguated boot/empty-swap from gameplay-swap at the PM4-packet level. This is a methodology gap.
- What unwedges canary's worker-cluster activation chain? AUDIT-068 pinned the install epoch but not the trigger: what guest call causes sub_824FD240+0x24's POD-copy to fire? Identifying the trigger and replaying it in ours is the unanswered Path β attack.
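One way to close the boot-swap versus gameplay-swap gap would be to walk the command-buffer dwords and look for a type-3 draw packet. The sketch below rests on assumptions that should be verified against the emulator's own headers before use: packet type in bits 31:30, a 14-bit count field in bits 29:16 storing count minus one, a type-3 opcode in bits 14:8, type-1 packets carrying two payload dwords, type-2 packets carrying none, and `0x22` as the DRAW_INDX opcode. The helper names are hypothetical.

```python
PM4_DRAW_INDX = 0x22  # ASSUMED opcode value; confirm against the emulator's headers


def type3_header(opcode: int, count: int) -> int:
    """Build a type-3 header for `count` payload dwords (assumed layout)."""
    return (3 << 30) | ((count - 1) << 16) | (opcode << 8)


def has_draw(dwords) -> bool:
    """Walk packets header by header; True if a DRAW_INDX packet is found."""
    i = 0
    while i < len(dwords):
        header = dwords[i]
        ptype = header >> 30
        count = ((header >> 16) & 0x3FFF) + 1
        if ptype == 0:
            i += 1 + count            # register writes: header + count dwords
        elif ptype == 1:
            i += 3                    # header + two register values
        elif ptype == 2:
            i += 1                    # NOP: header only
        else:
            if (header >> 8) & 0x7F == PM4_DRAW_INDX:
                return True
            i += 1 + count            # skip this packet's payload
    return False


# Illustrative buffers, not real trace data.
empty_swap = [0x80000000, 0x00002000, 0x12345678]           # NOP + one register write
draw_swap = [0x80000000, type3_header(PM4_DRAW_INDX, 3), 0, 0, 0]

print(has_draw(empty_swap), has_draw(draw_swap))
```

Walking packet by packet, rather than scanning for the opcode bit pattern, avoids false positives from payload dwords that happen to look like a draw header.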