xenia-rs/audit-runs/review-a-boot-state/canary-boot-trajectory.md
MechaCat02 ef93a4fa14 handoff: VSync/event-wedge fixes + iterate 2.A–2.BC research notes
Source changes (dormant parity infra, retained from iterate 2.AI/2.AO):
- xenia-kernel/exports.rs: nt_create_event manual_reset polarity +
  related event wiring
- xenia-gpu/mmio_region.rs: D1MODE_VBLANK_VLINE_STATUS hardcode parity

Also lands the audit-runs/ analysis notes (.md/.txt/.json digests) for the
iterate 2.x VSync/0x10e8/0x1004 wedge investigation. Raw trace dumps
(.jsonl/.gz/.csv/.stdout) and agent worktrees (.claude/) are gitignored as
regenerable local artifacts — see memory + HANDOFF for the running findings.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-05 07:19:08 +02:00

# Canary boot-to-first-draw trajectory
**Source data:** `xenia-canary/build-cross/bin/Windows/Debug/canary-jitter-1.jsonl`
(4.4 GB, 18.7M events, 90s wallclock, cold run). Profile builder at
`xenia-rs/audit-runs/phase-nonmatch-investigation/build_profiles.py`.
## TL;DR
- **First boot-time `VdSwap` fires on canary's tid=6 (guest main) at
~9.5 s wallclock**, immediately after the rendering subsystem is
initialized. This is the *empty / system-command-buffer* swap that
ours also reaches (the `swaps=1` metric in ours counts this swap).
- **First gameplay `VdSwap` (intro-movie frame) fires on canary's
tid=13 (renderer) starting at ~10.7 s wallclock**, after the
`sub_825070F0` worker fan-out at host_ns ≈ 10.382-10.384 s. Canary
tid=13 emits **12,092** `VdSwap`/`VdGetSystemCommandBuffer` call pairs
in the 90 s window, i.e. ~150 swaps/s sustained over the ~79 s that
follow the first gameplay swap.
- The gating event between "boot swap" and "first gameplay swap" is
the 4-worker fan-out spawned by `sub_825070F0` at PCs `0x82506528 /
0x82506558 / 0x82506588 / 0x825065B8` with ctx `0xBCE251C0`. Three
of the four workers begin emitting events at host_ns ≈ 10.705 s
(tids 27/28/29; see `canary-tid-profiles.md` rows 33-35).
## Phase-by-phase trajectory
| t (host_ns) | Phase | What | Citation |
|------:|-------|------|----------|
| 0–660 ms | XEX load / startup | `XexLoadImage`, ELF→guest init, kernel-state ctor. Spawn tid=6 ("guest main") at host_ns=660 ms. | `phase-nonmatch-investigation/canary-tid-profiles.md:14` |
| 660 ms–1.42 s | **Pre-spawn init** | tid=6 sets up TLS, runs CRT init. Establishes vtables / globals. *Sylpheed-specific*: writes `0x8200A1E8` (vtable for `ANON_Class_713383D7`) at the install-epoch host_ns ≈ 9.4–9.6 s via a 12-byte POD struct copy `{vptr, self, self}` (see `project_audit_068_session3`). **Critical**: this is the vtable whose slot 1 = `sub_825070F0`. | `project_audit_068_session3_2026_05_20.md` |
| 1.42–1.94 s | **Main init burst** | 10 thread spawns (tids 8–17) by tid=6. Ours matches this 1:1. Entries include `0x82181830`, `0x8245A5D0`, `0x82450A28`, `0x82457EF0`, `0x824CD458`, **`0x822F1EE0` (renderer, susp=T)**, `0x824D2878/0x824D2940` (XAudio, susp=T), `0x82178950` (XMA), `0x821748F0` (file IO spawner, susp=T). | `canary-tid-profiles.md:42-55` |
| 1.671 s | **Renderer spawn** | tid=6 calls `ExCreateThread` with entry `0x822F1EE0`, ctx `0xBCE24A40`, suspended=True. Becomes canary tid=13. | `canary-tid-profiles.md:21,49` |
| 1.726–1.728 s | **XAudio spawn** | tids 14/15 (XAudio voice-mask poll + sister) spawned suspended. Will dominate event volume (~11M events combined). | `canary-tid-profiles.md:50-51` |
| 1.94–2.15 s | **Secondary init burst** | 8 more spawns (tids 18–25), file-IO + XAM helpers. **Ours emits 0** here, already wedged. | `result.md:48` |
| 9.4–9.6 s | **vtable install epoch** | Host-side POD struct copy installs `0x8200A1E8` at run-specific arena address (`0xBCE25340` or `0xBCE251C0` per arena drift). This is the ANON_Class_713383D7 instance whose slot 1 = `sub_825070F0`. | `project_audit_068_session3_2026_05_20.md` |
| ~9.5 s | **Boot-init `VdSwap` (on tid=6)** | After `VdInitializeEngines + VdShutdownEngines + VdInitializeEngines + VdSetGraphicsInterruptCallback + VdSetSystemCommandBufferGpuIdentifierAddress + VdInitializeRingBuffer + VdEnableRingBufferRPtrWriteBack + VdGetSystemCommandBuffer`, tid=6 emits **one** `VdSwap` to publish the boot framebuffer. draws=0 still (no PM4 draw packets). | Mirror of `ours-postfix.jsonl` idx 105044-105285; canary same shape. |
| 10.080 s | tid=26 second-call helper | `0x821748F0` second invocation. | `canary-tid-profiles.md:32` |
| **10.383 s** | **sub_825070F0 worker fan-out** | **Four `ExCreateThread` calls in 1 ms** spawn entries `0x82506528 / 0x82506558 / 0x82506588 / 0x825065B8` all sharing ctx `0xBCE251C0` (the ANON_Class instance). These are the workers that consume cache-file IO and signal the wedge event(s) that AUDIT-049 found dangling in ours. | `canary-tid-profiles.md:63-66`, `sub_825070F0.md` |
| 10.7 s | **Worker resume / first events** | tids 27, 28, 29 emit their first events. tid=28 dominates (3.26M events) doing file IO (`530× NtReadFile` of `cache:\…`), heavy CS contention (1.07M RtlEnterCS), and signaling the wedge events. | `canary-tid-profiles.md:33-35`, `sub_82452DC0.md` |
| ~10.7+ s | **Renderer wakes** | Once `sub_825070F0` workers begin, the events that canary's tid=13 was waiting on get signaled. tid=13 transitions Blocked→Running, starts producing `VdGetSystemCommandBuffer`/`VdSwap` pairs at ~150 fps. | `canary-tid-profiles.md:21`, `result.md:30-39` |
| ~10.7–90 s | **Sustained rendering** | tid=13 emits 12,092 `VdSwap` calls. Intro movie ⇒ title screen ⇒ gameplay (depends on user input). In an unattended cold run, canary likely plateaus on the title screen but is genuinely rendering. | `canary-tid-profiles.md:21` |
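
The phase boundaries in the table can be cross-checked by scanning the trace for the first occurrence of a call on each thread. The following is a minimal sketch, not the logic of `build_profiles.py`. It assumes each JSONL line is an object with `tid`, `host_ns`, and `name` fields; those field names are hypothetical and the real schema of `canary-jitter-1.jsonl` may differ. The sample lines are synthetic and only mimic the timestamps quoted above.

```python
import json
from typing import Dict, Iterable


def first_call_per_tid(lines: Iterable[str], call_name: str = "VdSwap") -> Dict[int, int]:
    """Return {tid: host_ns of the earliest event whose name is call_name}."""
    first: Dict[int, int] = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        # ASSUMPTION: "name", "tid", "host_ns" are hypothetical field names.
        if event.get("name") != call_name:
            continue
        tid, ts = event["tid"], event["host_ns"]
        if tid not in first or ts < first[tid]:
            first[tid] = ts
    return first


# Synthetic sample shaped like the trajectory described in the table.
sample = [
    '{"tid": 6, "host_ns": 9500000000, "name": "VdSwap"}',
    '{"tid": 13, "host_ns": 10700000000, "name": "VdGetSystemCommandBuffer"}',
    '{"tid": 13, "host_ns": 10700500000, "name": "VdSwap"}',
    '{"tid": 13, "host_ns": 10707000000, "name": "VdSwap"}',
]
result = first_call_per_tid(sample)
for tid, ts in sorted(result.items(), key=lambda kv: kv[1]):
    print(f"tid={tid} first VdSwap at {ts / 1e9:.3f} s")
```

For the real 4.4 GB file, passing an open file handle as `lines` streams the trace line by line instead of loading it into memory.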
## Canary call-chain from entry_point to first gameplay draw
```
canary tid=6 (guest main)
  entry_point
    → sub_8216EA68 (post-init dispatcher)
      → sub_822F1AA8 (game-loop dispatcher) (sub_822F1AA8.md)
        → bctrl vtable[0]({sub_82175330 → tail → sub_82173990})
          → sub_82173990 (sync task-spawn-and-join) (sub_82173990.md)
            → bl sub_821746B0 (alloc task + spawn worker thid=17, F8000094)
                [worker thid=17 runs body sub_821748F0
                  → sub_821C4EB0 → sub_821CC3F8 → sub_821CBA08
                  → sub_821CB030 (creates Event, submits work via sub_82452DC0)
                  → … cache file loads (cache:\aab216c3\..., cache:\87719002\..., etc.)
                  → spawns child workers via ExCreateThread(...,821C4AD0,...)
                  → eventually ExTerminateThread(0)]
            → KeWaitForSingleObject(thid=17.handle) INFINITE
                [blocks ~445 log lines wallclock; completes when thid=17 terminates]
            ← returns
          ← returns to sub_822F1AA8 outer loop
        → iterates sub_821741C8 → sub_82172BA0 → bctrl vtable[6]
          → sub_821B55D8 → sub_824F8398 → sub_824F7CD0 → sub_824F7800
            → bctrl vtable[1] = sub_825070F0 (sub_825070F0.md)
              → 4× ExCreateThread(...,0x82506528/58/88/B8, ctx=0xBCE25xxx, susp=T)
              → 4× NtResumeThread / scheduler enables the workers
                  [workers tids 27/28/29/+1 begin executing]
        → outer loop continues
          → KeWaitForSingleObject (4040×/60 s = ~67 fps frame-pacing wait)
          → bctrl vtable[2] → various per-frame work
          → tid=6's main loop produces no VdSwap directly past the init swap

canary tid=13 (renderer; spawned by tid=6 at 0x822F1EE0)
  [stays suspended OR Blocked-on-event until worker fan-out at 10.38 s]
  → after wake, enters render loop:
      while (running) {
        VdGetSystemCommandBuffer(...)    ; 12,092× / 90 s
        … build per-frame command buffer …
        VdSwap(buffer_ptr, fetch_ptr, …) ; 12,092× / 90 s
      }
```
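
The gating relationship in the call chain (renderer blocked until the worker fan-out signals) can be illustrated with a toy model. This is only a sketch using host-side Python threads, not guest code or emulator code; `simulate`, `gate`, and the timeout are hypothetical names and values chosen for illustration. One `threading.Event` stands in for whatever guest event(s) tid=13 waits on.

```python
import threading
from typing import List


def simulate(workers_spawned: bool, frames: int = 3, wait_timeout_s: float = 0.05) -> int:
    """Return how many 'swaps' the toy renderer produces."""
    gate = threading.Event()   # stands in for the event guarding the renderer body
    swaps: List[int] = []

    def renderer() -> None:
        # Models tid=13: blocked until a worker signals the gate.
        if not gate.wait(timeout=wait_timeout_s):
            return             # never signaled: the wedge seen in ours
        for frame in range(frames):
            swaps.append(frame)  # stands in for a VdGetSystemCommandBuffer/VdSwap pair

    def worker() -> None:
        # Models a fan-out worker finishing its load and signaling.
        gate.set()

    render_thread = threading.Thread(target=renderer)
    render_thread.start()
    if workers_spawned:
        workers = [threading.Thread(target=worker) for _ in range(4)]
        for w in workers:
            w.start()
        for w in workers:
            w.join()
    render_thread.join()
    return len(swaps)


print("canary-like run:", simulate(workers_spawned=True))
print("ours-like run:  ", simulate(workers_spawned=False))
```

The model collapses the real wait graph to a single gate, so it shows only why zero fan-out implies zero gameplay swaps, not which handle is involved.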
## Pre-conditions canary establishes before first gameplay draw
In time order, all must hold:
1. **GPU subsystem initialized**: `VdInitializeEngines → VdInitializeRingBuffer → VdEnableRingBufferRPtrWriteBack → VdSetGraphicsInterruptCallback`. Ours: ✓ (idx 105044-105117).
2. **Renderer thread alive**: tid=13 created suspended via `ExCreateThread(entry=0x822F1EE0, susp=T)`. Ours: ✓ (idx 105348).
3. **Worker-cluster activation**: 4 workers spawned by `sub_825070F0` consuming `sub_82452DC0` work. Ours: **✗ 0 fires**.
4. **`sub_821CB030`'s Event signaled**: the per-load completion event created at `sub_821CB030+0x128` and waited at `+0x1AC` must be signaled by a `sub_825070F0` worker. Ours: **`NO_SIGNALS_DESPITE_WAITS` on handle 0x12d0**.
5. **`sub_82173990`'s join-wait completes**: tid=6's wait at `sub_82173990+0x2D0` on the thid=17 thread handle. Ours: **✗ tid=1 stuck on handle 0x12c8 (= tid=13's thread handle)**.
6. **Renderer wakes**: per AUDIT-049, the worker-cluster must signal whatever guards tid=13's body. Canary: ✓. Ours: **✗ tid=13 itself wedges in sub_821CB030**.
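
Because the pre-conditions are ordered, the earliest unmet one is the point of divergence to attack first. A small sketch of that reading follows. The labels and pass/fail values are transcribed from the list above, and the `first_unmet` helper is a hypothetical name introduced only for illustration.

```python
from typing import List, Optional, Tuple

# (label, holds in ours) in the time order given above.
PRECONDITIONS: List[Tuple[str, bool]] = [
    ("GPU subsystem initialized", True),
    ("Renderer thread alive", True),
    ("Worker-cluster activation", False),
    ("sub_821CB030's Event signaled", False),
    ("sub_82173990's join-wait completes", False),
    ("Renderer wakes", False),
]


def first_unmet(preconditions: List[Tuple[str, bool]]) -> Optional[Tuple[int, str]]:
    """Return (1-based index, label) of the earliest failing pre-condition, or None."""
    for index, (label, holds) in enumerate(preconditions, start=1):
        if not holds:
            return index, label
    return None


divergence = first_unmet(PRECONDITIONS)
print("earliest divergence:", divergence)
```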
## Numerical signature of canary over the 90 s capture (for reference)
- 18.7 M events / 28 tids.
- Renderer tid=13: 594 k events, including 12,092 VdSwap.
- Worker tid=28 (sub_825070F0 worker 0): 3.26 M events.
- XAudio tid=14/15: 6.15 M / 4.78 M events.
- ours at 50 M-instr / ~3 s wallclock: 121 k events / 13 tids. Renderer
tid=13 in ours: ~80 events (wedged).
- Event volume differs by ~150× (18.7 M vs 121 k) because ours wedges
~7 s before canary's `sub_825070F0` fan-out fires.
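
The derived figures quoted in this note (~150 swaps/s, ~150×, ~7 s) follow from the raw counts. The arithmetic is sketched below using only numbers stated in this document; the variable names are illustrative.

```python
capture_s = 90.0               # wallclock length of the canary capture
first_gameplay_swap_s = 10.7   # approximate first gameplay VdSwap on tid=13
fanout_s = 10.383              # sub_825070F0 worker fan-out
ours_wallclock_s = 3.0         # approximate wallclock reached by ours
vdswap_count = 12_092          # VdSwap calls on canary tid=13
canary_events = 18.7e6
ours_events = 121e3

swap_rate = vdswap_count / (capture_s - first_gameplay_swap_s)
event_ratio = canary_events / ours_events
wedge_lead_s = fanout_s - ours_wallclock_s

print(f"swap rate   ≈ {swap_rate:.0f} per second")
print(f"event ratio ≈ {event_ratio:.0f}x")
print(f"wedge lead  ≈ {wedge_lead_s:.1f} s before fan-out")
```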
## Uncertainty / open questions
- **What is the precise host-side install of the `ANON_Class_713383D7`
vtable `0x8200A1E8`?** AUDIT-068 sessions 1–4 localized this to a
POD struct copy in the install epoch [9.4 s, 9.6 s], with the writer
identified at GUEST PPC `sub_824FD240+0x24` (NOT a host-side kernel
import as initially feared). But in ours, `sub_824FD240` and its
callers `sub_824F7800/CD0/8398` fire 0× because that chain is
downstream of the tid=13 wedge. See `project_audit_068_session4`.
- **First "gameplay draw" precisely**: the first VdSwap that emits PM4
draw packets (e.g. `PM4_TYPE3 DRAW_INDX`) into the ringbuffer. Need
to inspect canary's PM4 ring at host_ns ≈ 10.7 s to confirm. AUDIT
history hasn't disambiguated boot/empty-swap from gameplay-swap at
the PM4-packet level. This is a methodology gap.
- **What unwedges canary's worker-cluster activation chain?** AUDIT-068
pinned the install epoch but not the **trigger** — what guest call
causes `sub_824FD240+0x24`'s POD-copy to fire? Identifying the
trigger and replaying it in ours is the unanswered Path β attack.
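
For the second open question (telling an empty swap from a gameplay swap at the PM4-packet level), one option is to walk the ring-buffer words and count type-3 draw packets. The sketch below rests on several assumptions that should be verified against the emulator source before use: the header layout (type in bits 31:30, count-minus-one in bits 29:16, type-3 opcode in bits 14:8), the opcode values `0x22` (`DRAW_INDX`) and `0x36` (`DRAW_INDX_2`), and input words that are already converted to host-order unsigned 32-bit integers. `count_draw_packets` and `type3` are hypothetical helper names.

```python
from typing import List

# ASSUMPTION: opcode values as commonly listed for Xenos PM4; verify before relying on them.
PM4_DRAW_INDX = 0x22
PM4_DRAW_INDX_2 = 0x36
DRAW_OPCODES = {PM4_DRAW_INDX, PM4_DRAW_INDX_2}


def type3(opcode: int, payload_words: int) -> int:
    """Build a type-3 header for tests (payload_words must be >= 1)."""
    return (3 << 30) | ((payload_words - 1) << 16) | (opcode << 8)


def count_draw_packets(words: List[int]) -> int:
    """Count type-3 draw packets in a flat list of 32-bit PM4 words."""
    draws = 0
    i = 0
    while i < len(words):
        header = words[i]
        packet_type = (header >> 30) & 0x3
        if packet_type == 0:
            # Register-write burst: header plus (count field + 1) data words.
            i += 1 + ((header >> 16) & 0x3FFF) + 1
        elif packet_type == 1:
            # Two-register write: header plus 2 data words.
            i += 3
        elif packet_type == 2:
            # NOP filler: header only.
            i += 1
        else:
            opcode = (header >> 8) & 0x7F
            payload = ((header >> 16) & 0x3FFF) + 1
            if opcode in DRAW_OPCODES:
                draws += 1
            i += 1 + payload
    return draws


empty_swap = [0x80000000, 0x80000000]  # two type-2 NOPs
gameplay_swap = [0x00002000, 0x0, type3(PM4_DRAW_INDX, 3), 0x0, 0x0, 0x0]
print(count_draw_packets(empty_swap), count_draw_packets(gameplay_swap))
```

A swap whose command buffer yields a count of zero would correspond to the boot/empty swap; a nonzero count would mark a candidate "first gameplay draw".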