# Phase Non-match Investigation: Results

**Date**: 2026-05-19

**Source**: `xenia-canary/build-cross/bin/Windows/Debug/canary-jitter-1.jsonl` (4.4 GB, 18.7M events, 28 tids)

**Companion ours data**: `audit-runs/phase-w-wedge-reattack/ours-postfix.jsonl` (121,569 events, 13 tids)

**Outcome**: **(A): AUDIT-058/063/067 framing CONFIRMED** end-to-end using new Phase A thread.create events.

## TL;DR

Per Phase A `thread.create` events (wired in C+15-α), canary spawns **23 threads**; the final 4 fire at `host_ns ≈ 10.38 s` and have entry PCs `0x82506528 / 0x82506558 / 0x82506588 / 0x825065B8` with shared context `0xBCE251C0` and stack size 65,536. These are **exactly** the 4 worker entries documented in the `sub_825070F0` dossier. The historical AUDIT-058/063 framing is correct: `sub_825070F0` is the one-shot 4-worker fan-out that ours never reaches.

Three of those four canary workers go on to dominate the trace: **tid=28 (3.26M events, sub_82506528), tid=27 (36k events, sub_82506558), tid=29 (91k events, sub_82506588)**; the fourth (`0x825065B8`) was never resumed in this 90 s window.

Ours emits **10 thread.create** events vs canary's 23, stops after spawn #10 (`0x821748F0` at 1.727 s), and **never produces another thread.create** for the rest of the run. The 13 subsequent canary spawns, including the critical sub_825070F0 batch, are entirely missing.
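The spawn comparison summarized above can be sketched as a set difference over `thread.create` fingerprints. This is a minimal illustration, not the code in `build_profiles.py`: the top-level `kind` key and the payload keys `entry_pc`, `ctx_ptr`, and `stack_size` are assumed names for the jsonl schema, and the helper names are hypothetical.

```python
import json

# Assumed payload keys; the real Phase A schema may name these differently.
DEFAULT_FIELDS = ("entry_pc", "ctx_ptr", "stack_size")


def spawn_fingerprints(lines, fields=DEFAULT_FIELDS):
    """Collect one fingerprint tuple per thread.create event."""
    fingerprints = set()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        # "kind" as the event-type key is an assumption.
        if event.get("kind") != "thread.create":
            continue
        payload = event.get("payload", {})
        fingerprints.add(tuple(payload.get(f) for f in fields))
    return fingerprints


def missing_spawns(canary_lines, ours_lines, fields=DEFAULT_FIELDS):
    """Return fingerprints that canary spawned and ours never did."""
    canary = spawn_fingerprints(canary_lines, fields)
    ours = spawn_fingerprints(ours_lines, fields)
    # key=repr keeps sorting safe when a field is missing (None).
    return sorted(canary - ours, key=repr)
```

Both arguments can be open file objects, so the 4.4 GB canary log is streamed line by line rather than loaded whole. If context pointers turn out not to be comparable across engines, passing `fields=("entry_pc", "stack_size")` narrows the fingerprint.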
## What canary's heavy workers DO

| tid | events | role | entry_pc |
|----:|-------:|------|----------|
| 14 | **6.15 M** | **XAudio voice-mask poll** (26,126× XAudioGetVoiceCategoryVolumeChangeMask) | `0x824D2878` (aff=16) |
| 15 | **4.78 M** | XAudio sister (KeWaitForSingleObject + heavy IRQL spinlock cycle) | `0x824D2940` (aff=32) |
| 28 | **3.26 M** | **sub_825070F0 worker 0** (1.07 M × RtlEnterCS, 530× NtReadFile) | `0x82506528` (ctx `0xBCE251C0`) |
| 16 | 1.80 M | XMA decoder (`XMACreateContext`, RtlEnterCS heavy) | `0x82178950` |
| 21 | 1.00 M | NtWaitForMultipleObjectsEx worker | `0x824563E0` |
| 13 | 594 k | **Renderer** (12,092× VdSwap, VdGetSystemCommandBuffer; 1,805× Ke/NtSetEvent; 475× wait.begin) | `0x822F1EE0` |

The **biggest workers (tid=14, tid=15)** are NOT sub_825070F0 workers: they are spawned much earlier (1.726/1.727 s) via `sub_824D2878 / sub_824D2940` and run forever as XAudio render/voice threads. **Ours spawns these two suspended (1.626 s) but they never receive the resume call that would activate them**. Ours produces 0 XAudio* events on these tids (verifiable from ours's tid event counts: ours has only 13 tids total, none with the 6M-event signature).

## Spawn-chain summary (full table in `canary-tid-profiles.md`)

Three distinct fan-out clusters in canary, all from tid=6 (guest main):

1. **1.42–1.94 s, main init burst**: 10 spawns (tids 8–17). Ours matches this 1:1 in spawn count and entries.
2. **1.94–2.15 s, secondary burst** (XAM/XCONFIG helpers, tids 18–25): 8 additional spawns. **Ours emits 0**.
3. **10.08–10.38 s, XAudio worker fan-out**: 5 spawns (tids 26, 27, 28, 29, +1 unresumed). The last 4 are the `sub_825070F0` workers. **Ours emits 0**.

## sub_825070F0 spawn-chain confirmation (static + runtime)

- `sylpheed.db` confirms `sub_825070F0` lives in `vtable 0x8200A208 slot 1` and `0x8200A928 slot 1` (anonymous class `ANON_Class_713383D7`, 7 slots each).
- **Zero `vptr_writes` / zero `xrefs` / zero `indirect_dispatch_candidates`** reach either vtable. AUDIT-067's host-side install hypothesis is confirmed by static-analysis exhaustion.
- Function body contains the 4 sequential `addi rN, r0, 0x825065XX` + `bl sub_824AA388` (= ExCreateThread wrapper) blocks at PCs `0x825071F8 / 0x82507244 / 0x82507290 / 0x825072DC`.
- The 4 worker entry thunks (`0x82506528 / 0x82506558 / 0x82506588 / 0x825065B8`) are uniform vtable-slot callers: each loads `r3->vtable->[140|144|148|152]` (byte offsets, i.e. slots 35/36/37/38) and dispatches via CTR.
- Runtime ctx `0xBCE251C0` is referenced **4× in canary jsonl** (the 4 spawn events) and **0× in ours-postfix.jsonl**. Ours never allocates the dispatcher object that holds the `0x8200A208` vtable.

## Wake/signal chain to wedge (partial)

- Phase W: ours's wedge handle is `0x12d0` (`Event/Auto`, waited on at `sub_821CB030+0x1B0` by tid=13, the renderer); main tid=1 join-waits on `Thread(id=13)` at `sub_82173990+0x2D4`.
- Canary tid=13 (renderer) creates **10 handles**, calls Ke/NtSetEvent **1,805×**, and calls wait.begin **475×**, so it is alive and signaling. The earliest tid=13 handle.create is at 2.396 s; activity explodes at 10.7 s **once the sub_825070F0 workers come online**.
- Canary tid=13's signals correlate with the sub_825070F0 worker batch coming up at 10.7 s (tid=27/28/29 first-events are all at 10.705 s). Without those workers, ours's renderer has no producer to wake the event it waits on, and main joins on the renderer, giving a full deadlock.
- Full SID-level mapping of "which canary worker fires the NtSetEvent that wakes the renderer's wait" was not attempted (handle IDs and SIDs don't cross-correlate run-to-run; it would require a source-level read of `sub_821CB030`). The class of producer (`sub_825070F0` workers) is identified.

## Reading-error / methodology notes

- **#16 EH-handler caution**: the `sub_824AA388` spawn helper is reached via `bl` (a direct call, not via EH unwind), so there is no risk of misanchoring on a catch handler.
- **#28 framing**: Phase A `thread.create.payload.parent_tid` redundantly equals the event's `tid` field (per `event_log.cc:312-326`: emitted ON the parent thread's stream, child tid is NOT in payload). Child-tid is recovered by FIFO matching to `first_event[tid]` chronologically.
- **#30 cross-engine SIDs**: ours's wedge handle SID `d5e23609d3948568` does not appear in canary because these are worker-local Event handles, not process-global dispatchers; only the shared-global recipe is scheduling-invariant.
- **Cold-run jitter** was not a factor here: only one canary jsonl was processed. The spawn-chain identification is robust because the SID-independent entry_pc + ctx_ptr + stack_size triplet is effectively a content-addressed fingerprint that survives reruns.

## Outcome: (A), historical framing confirmed

The Phase A `thread.create` data directly corroborates AUDIT-058/063/067:

1. `sub_825070F0` IS the function that spawns the 4 sub_82506528-family workers (confirmed in canary trace, never fires in ours).
2. The dispatcher class `ANON_Class_713383D7` whose vtable `0x8200A208` slot 1 points at `sub_825070F0` has its vtable installed via a path invisible to static guest analysis (AUDIT-067 unresolved).
3. The HEAVY workers (tid=14/15 → XAudio; tid=16 → XMA; tid=21 → NtWait worker) are spawned **earlier** via different entries (`sub_824D2878`, `sub_824D2940`, `sub_82178950`, `sub_824563E0`) but are all suspended; their resume gate is also missing in ours (those threads exist in ours-postfix but emit < 100 events each, all from the spawn-time bookkeeping).

## Recommended next attack target

**Re-attempt the deferred AUDIT-067 / AUDIT-068 host-side vptr install probe** with current tooling. Specific subtasks:

1. **Identify the allocator that produces the `ANON_Class_713383D7` instance** with vtable `0x8200A208`.
   - Static search: which fn loads `0x8200A208` as a constant? (The database says nothing; confirm with a fresh Ghidra script that includes split-pair detection.)
   - Runtime probe: instrument both engines to log every `stw vptr, 0(obj)` where `vptr ∈ {0x8200A208, 0x8200A928}`. In canary, this MUST fire ≥ 1× before the 10.38 s spawn burst; in ours, it presumably never fires. Identify the PC.
2. **If host-side**: trace through the kernel exports table. The most likely path is one of `XAudio2*Create`, `XMACreateContext`, `XMPCreate*`, or an undocumented `XAudio` API. Per the tid=14 call profile, `XAudioGetVoiceCategoryVolumeChangeMask` is the only XAudio API actively touched; look at its dossier (or canary's `xboxkrnl_audio.cc` / `xam_audio.cc`) for object-construction side-effects.
3. **Alternative**: identify which Sylpheed API call is the **trigger** for the 10.38 s `sub_825070F0` firing. Canary main (tid=6) at host_ns ≈ 10.30–10.38 s does the work that leads up to this; ~300 ms before, tid=6 has activity that ours doesn't reach. Diff canary's tid=6 event stream against ours's tid=1 over the window [10 s, 10.4 s] in canary and the wallclock-equivalent window in ours. Note that ours doesn't reach 10 s wallclock at all, so the divergence is upstream of this window.
4. **Secondary attack**: the XAudio tid=14/15 resume gate. Those threads are spawned suspended in BOTH engines (canary at 1.726/1.727 s, ours at 1.626 s); canary resumes them within ~1 ms and they emit 11 M events combined. **What guest call resumes them in canary?** Cross-thread NtResumeThread on the tid=14 handle. Sylpheed presumably resumes them via an XAudio2 API. If we can identify the resume call site in canary and figure out why ours doesn't reach it, we unblock 60% of the missing event volume (XAudio) independent of `sub_825070F0`.
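The stream diff proposed in subtask 3 can be sketched as a first-divergence search over per-thread `kernel.call` names. This is only an illustration under assumed field names (`kind`, `tid`, and `payload.name` are guesses at the jsonl schema) and it compares exact sequences, which is a simplification: scheduling jitter between runs may produce benign reorderings that a real diff tool would need to tolerate.

```python
import json
from itertools import zip_longest


def call_names(lines, tid):
    """Return the ordered kernel.call names emitted on one thread."""
    names = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        # "kind", "tid" and payload["name"] are assumed schema keys.
        if event.get("tid") == tid and event.get("kind") == "kernel.call":
            names.append(event.get("payload", {}).get("name"))
    return names


def first_divergence(canary_calls, ours_calls):
    """Return (index, canary_name, ours_name) at the first mismatch.

    A None name means that stream ended first. Returns None when the
    two sequences are identical.
    """
    pairs = zip_longest(canary_calls, ours_calls)
    for index, (canary_name, ours_name) in enumerate(pairs):
        if canary_name != ours_name:
            return index, canary_name, ours_name
    return None
```

Called with canary's tid=6 stream and ours's tid=1 stream, a result whose third element is `None` would indicate that ours simply stops where canary keeps going, which is the pattern expected if the divergence is upstream.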
## Artifacts

All artifacts in `xenia-rs/audit-runs/phase-nonmatch-investigation/`:

- `build_profiles.py`: streaming jsonl profile builder (~200 LOC)
- `tid-event-counts.csv`: per-tid totals (28 rows)
- `tid-top-calls.txt`: per-tid top-20 kernel.call names
- `tid-ntset-handles.txt`: per-tid Ke/NtSetEvent handle distribution **(EMPTY: canary's kernel.call payloads have `args:{}` for NtSetEvent; the handle is in resolved-arg JSON not exposed in the current `args_resolved`. Not needed for the Outcome (A) determination. Future Phase: extend Phase A `kernel.call` to also surface ALL register args in `args` for diff-tool consumption.)**
- `tid-wait-handles.txt`: per-tid wait.begin handle distribution **(EMPTY for the same reason: the `wait.begin` events I sampled have `raw_handle_id=None` because the payload uses a `handle_semantic_ids` array, not a single `raw_handle_id`. The handle.create map is populated correctly; see `handle-create.json`.)**
- `thread-creates.json`: canary thread.create payloads keyed by child_tid (note: child_tid is FIFO-inferred, see profiles doc)
- `thread-exits.json`: canary thread.exit events (3 in this trace: tid=17/18/26)
- `excreate-events.json`: all ExCreateThread import.call events with idx/host_ns
- `create-thread-events.json`: full thread.create event payloads
- `handle-create.json`: all handle.create with raw_handle, sid, object_type
- `spawn-chain.json`: auto-correlated spawn → ExCreateThread linkage
- `canary-tid-profiles.md`: human-readable per-tid catalogue + spawn-chain tables
- `result.md`: this file
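The FIFO inference behind the `child_tid` keys in `thread-creates.json` can be sketched as follows. This is a hedged reconstruction of the idea, not the logic in `build_profiles.py`: it assumes threads first emit events in the same order they were created, and the function name and input shapes are hypothetical.

```python
def infer_child_tids(creates, first_event_ns):
    """Pair each thread.create with the next tid that first appears after it.

    creates: iterable of (host_ns, parent_tid) for thread.create events.
    first_event_ns: dict mapping tid -> host_ns of that tid's first event.
    Returns (host_ns, parent_tid, child_tid) tuples; child_tid is None
    when no unclaimed tid appears after the create (e.g. a thread that
    was spawned suspended and never resumed).
    """
    # Candidate children, ordered by when they first show up in the log.
    pending = sorted(first_event_ns.items(), key=lambda item: item[1])
    matches = []
    cursor = 0
    for create_ns, parent_tid in sorted(creates):
        # A tid already active before this create cannot be its child.
        while cursor < len(pending) and pending[cursor][1] < create_ns:
            cursor += 1
        if cursor < len(pending):
            matches.append((create_ns, parent_tid, pending[cursor][0]))
            cursor += 1
        else:
            matches.append((create_ns, parent_tid, None))
    return matches
```

The ordering assumption can fail when a later-created thread runs before an earlier one that was spawned suspended, so the inferred tids are best cross-checked against the entry_pc of each thread's first events where that is available.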