# Iterate 2.D: Peer-producer LR trace (investigation log)

**Date:** 2026-05-21
**Mode:** WRITE (investigation only; no engine source changes).
**Binaries:** `xenia-canary/.../xenia_canary_i2d.exe`, `xenia-rs/target/release/xrs-i2d`.
**Reuses:** canary `--audit_69_log_all_sets=true --audit_70_log_all_releases=true` (existing), ours `--lr-trace` (existing); no new cvars added.
**LOC delta:** engine 0, canary 0.

## Step 0: Existing artifact triage

`audit-runs/audit-069-wait-signal-producer/s5/canary-release-trace.log` (S5) holds only NtReleaseSemaphore fires (414 events, restricted to a single handle by the configured handle list); **the NtSetEvent dimension was never captured**. The Audit-069 S6 bridge wrote a 414-vs-99 first-N release diff but did not extend coverage to NtSetEvent or to canary's wider tid family. Therefore a fresh capture covering BOTH operations was required.

## Step 1: Capture both engines

### Canary cold run

```
wine xenia_canary_i2d.exe "" --mute=true \
  --audit_69_log_all_sets=true --audit_70_log_all_releases=true \
  --log_file=canary-i2d.log
```

Wallclock 90 s (timeout). Result: **79,014 peer-producer events** (61,659 NtSetEvent/XEvent::Set + 17,355 NtReleaseSemaphore/xeKeReleaseSemaphore). 21 distinct guest tids.

### Ours cold run

```
xrs-i2d exec "" --quiet \
  --lr-trace=0x8284DDDC,0x8284E49C,0x8284DF5C,0x8284E07C \
  --lr-trace-out=ours-i2d-iat-trace.jsonl
```

The probe PCs are the IAT thunks for KeSetEvent / KeReleaseSemaphore / NtSetEvent / NtReleaseSemaphore respectively, covering BOTH wrapper paths and direct callers (this matches canary's `audit_69`/`audit_70` C++ hook coverage).

Wallclock 60 s (timeout). Result: **153 peer-producer events** across 8 distinct guest tids (1, 2, 5, 6, 8, 9, 11, 13).

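
The event totals and distinct-tid counts quoted for each run can be recomputed directly from a trace file. The following is a minimal sketch, not the tooling used in this investigation; it assumes each JSONL record carries a `tid` field, which is a hypothetical name because the trace schema is not documented in this log.

```python
import json
from collections import Counter


def summarize_trace(lines):
    """Return (total_events, per_tid_counts) for an iterable of JSONL lines.

    Assumption: every record is a JSON object with a "tid" key.
    Blank lines are skipped.
    """
    per_tid = Counter()
    total = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        per_tid[record["tid"]] += 1
        total += 1
    return total, per_tid


if __name__ == "__main__":
    # Tiny synthetic sample, not real trace data.
    sample = ['{"tid": 1}', '{"tid": 9}', '{"tid": 1}']
    total, per_tid = summarize_trace(sample)
    print(total, "events across", len(per_tid), "distinct tids:", sorted(per_tid))
```

To run it on a real capture, pass an open file handle (for example the handle of `ours-i2d-iat-trace.jsonl`) instead of the sample list; comparing `len(per_tid)` against the listed tids is a quick consistency check on the reported figures.
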
A first pass with `--lr-trace=0x824AA2F0,0x824AB158` (game's NtSetEvent/NtReleaseSemaphore wrappers, sub_824AA2F0/sub_824AB158) yielded 150 events but **missed all KeReleaseSemaphore direct calls** (which bypass the game wrappers via the audio subsystem's `sub_824D21F0` and others). That trace is kept as `ours-i2d-lr-trace.jsonl` for cross-check; the IAT-thunk trace `ours-i2d-iat-trace.jsonl` is the authoritative one used for alignment.

## Step 2: Cross-engine alignment

### Handle namespaces

Canary handles are `0xF8000XXX` (8-bit XAM-style); ours handles are `0x10XX` (slot ids). They are not directly comparable by raw value, so alignment must use the (op, lr-containing-fn) tuple.

### Documented tid map (per AUDIT-068/S6 bridge)

- canary tid=6 ↔ ours tid=1 (main)
- canary tid=10 ↔ ours tid=5 (worker)
- canary tid=17 ↔ ours tid=13 (cache thread)

Other tid pairs (audio threads tid=14 / tid=2 / tid=4 etc.) require entry-PC matching since their IDs are not stably mapped.

### LR alignment (operation + lr-containing-fn)

Cross-engine matching key: `(op_category, lr_value)`. Since BOTH engines route through identical game-side dispatch code, the same call site (lr) means the same source-level producer.

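
The join on `(op_category, lr_value)` can be sketched as follows. This is an illustrative sketch rather than the script used for this audit: events are modelled as plain `(op, lr)` pairs, and the classification rule (MISSING when ours never fires, UNDER when ours fires less often than canary) is a simplifying assumption that does not model the MIXED or match-class cases.

```python
from collections import Counter


def align_by_call_site(canary_events, ours_events):
    """Join two traces on (op, lr) and classify each call site.

    Each event is an (op, lr) pair, e.g. ("release", 0x824D229C).
    Rows are ordered by descending canary fire count.
    """
    canary = Counter(canary_events)
    ours = Counter(ours_events)
    keys = sorted(canary.keys() | ours.keys(), key=lambda k: (-canary[k], k))
    rows = []
    for key in keys:
        c, o = canary[key], ours[key]  # Counter returns 0 for absent keys
        if c == 0:
            status = "OURS-ONLY"
        elif o == 0:
            status = "MISSING"
        elif o < c:
            status = f"UNDER (x{round(c / o)})"
        else:
            status = "MATCH"
        rows.append((key[0], hex(key[1]), c, o, status))
    return rows


if __name__ == "__main__":
    # Synthetic counts for illustration only.
    canary = [("release", 0x824D229C)] * 4 + [("set", 0x824D2A44)] * 4
    ours = [("release", 0x824D229C)]
    for row in align_by_call_site(canary, ours):
        print(row)
```

Keying on the call site rather than on handle value or tid sidesteps both the incompatible handle namespaces and the unstable tid mapping noted above.
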
| LR | Op | Function | Canary fires | Ours fires | Status |
|----|----|----------|---:|---:|--------|
| `0x824D229C` | release | sub_824D21F0 (audio dispatch) | 16,452 | **1** | UNDER (×16,452) |
| `0x824D2A44` | set | sub_824D29F0 (audio worker entry) | 16,452 | **0** | MISSING |
| `0x824D292C` | set | sub_824D2878 (audio worker entry-2) | 16,452 | **0** | MISSING |
| `0x824AA304` | set | sub_824AA2F0 (NtSetEvent wrapper, generic) | 15,765 | (multi) | MIXED |
| `0x82506C90` | set | sub_82506B08 (worker dispatch +0x188) | 2,378 | **0** | MISSING |
| `0x82508510` | set | sub_82508400 (+0x110) | 2,373 | **0** | MISSING |
| `0x82508524` | set | sub_82508400 (+0x124) | 2,373 | **0** | MISSING |
| `0x82506F9C` | set | sub_82506DE8 (worker dispatch) | 2,355 | **0** | MISSING |
| `0x82508358` | set | sub_825078D8 (worker dispatch +0xa80) | 2,350 | **0** | MISSING |
| `0x824AAFC8` | set | sub_824AAF50 (KeSetEvent wrapper) | 1,113 | **0** | MISSING |
| `0x824AB168` | release | sub_824AB158 (NtReleaseSemaphore wrapper internal) | 903 | 90 | UNDER (×10) |
| `0x82450D2C` | release | sub_82450B68 (worker self-release-2) | (multi via 0x824AB168) | 75 | match-class |
| `0x82450CE0` | release | sub_82450B68 (worker self-release-1) | (multi via 0x824AB168) | 7 | match-class |
| `0x82450314` | release | sub_82450218 | (multi via 0x824AB168) | 8 | match-class |

Total LRs missing in ours: **31 distinct call sites**, **61,659 canary fires** with **0 analogs in ours**. In addition, the audio release is matched by class but under-fires (1 vs 16,452, ratio 1/16,452).

## Step 3: First divergent producer (chronological)

Time-ordered scan of canary's events (host_ns) against ours's events (per-tid cycle):

- **Canary events 0–14** (host_ns 4.8 µs → 162.6 ms): all from tids 6/10/0; these have exact ours analogs (tids 1/5/bootstrap), and counts match approximately.
- **Canary event 15** at **host_ns=277,967,100** (278 ms): tid=14 fires `xeKeReleaseSemaphore` on handle `0xF800006C` from lr=`0x824D229C` (sub_824D21F0+0xAC, audio dispatch). This is the **FIRST canary peer-producer that ours doesn't match in volume.**

### Attribution

`sub_824D21F0` is the audio subsystem's "post-process-then-release-semaphore" leaf, called from 3 sites:

- `sub_824D2878+0xA0` (audio worker entry A)
- `sub_824D2940+0x64` (audio worker entry B)
- `sub_824D29F0+0xC4` (audio worker dispatch loop body)

`sub_824D29F0` is the main audio worker fn: it enters the CS, fires `KeSetEvent`, calls `KeWaitForMultipleObjects`, then calls either `sub_824D2108` or `sub_824D21F0` (the semaphore release).

The fn pointers for both audio worker entries (`sub_824D2878`/`sub_824D2940`) are loaded by `sub_824D2C08` (audio subsystem init), which then calls `ExCreateThread` ×2 to spawn them, followed by `KeSetBasePriorityThread` and `KeResumeThread`.

### Ours's actual audio thread behavior

Phase-A event log (`/tmp/ours-i2d-events.jsonl`, 118,149 events) shows:

- **host_ns=1,586,993,047** (1.587 s): `thread.create` entry_pc=`0x824d2878` (audio thread A), suspended
- **host_ns=1,587,001,117**: `ObReferenceObjectByHandle` (audio init handshake)
- **host_ns=1,587,011,797**: `KeSetBasePriorityThread`
- **host_ns=1,587,018,827**: `KeResumeThread` (audio thread A resumed)
- **host_ns=1,587,049,878**: `ExCreateThread` (audio thread B, entry_pc=`0x824d2940`)
- **host_ns=1,587,088,839**: `KeResumeThread` (audio thread B resumed)
- **host_ns=1,587,097,519**: ours **tid=10** (audio thread A) starts: `KeWaitForSingleObject`, `KeRaiseIrqlToDpcLevel`, `KeAcquireSpinLockAtRaisedIrql`, `KeReleaseSpinLockFromRaisedIrql`, `KfLowerIrql` (17 events total). Then **silent.**
- **host_ns=1,659,028,012**: ours **tid=11** (audio thread B) starts: `RtlEnterCriticalSection`, `KeSetEvent`, `KeWaitForMultipleObjects` (11 events total).
  Then **silent.**
- **ours tid=9** (also audio family, ctx_ptr=0x828a3230 = audio static dispatcher) fires `KeReleaseSemaphore` ONCE at cycle=631 from lr=`0x824d229c`, then **silent.**

**Conclusion**: ours audio threads are spawned, resumed, execute *one* iteration of their work loop, then wedge in `KeWaitForMultipleObjects` (sub_824D29F0+0xA0), the same wedge shape as tid=13 (cache thread). Canary's audio thread iterates ~16,452× over the same window; ours iterates ~1×. **The wedge is not localized to one thread: it is the same wedge pattern recurring in every peer-producer thread family.**

### Timing skew

Canary boot timeline (audio subsystem):

- host_ns=4.8 µs: first NtReleaseSemaphore (tid=6 main bootstrap)
- host_ns=277.97 ms: audio worker (tid=14) first fires

Ours boot timeline:

- host_ns=5.4 ms: first NtReleaseSemaphore (tid=1 main, matched)
- host_ns=**1,586.99 ms** = 1.587 s: audio thread spawned (5.7× later than canary's first audio fire)

This 1.3-second delay (1.587 s minus 0.278 s) implies **upstream init phase divergence**: ours's main thread takes significantly longer to reach the `sub_824D2C08` init call than canary does. The cause of this upstream delay is likely the cumulative effect of every prior subsystem wedge: each producer-consumer pair that wedges in ours costs time before the next subsystem can init.

## Step 4: Outcome class

The plan defined three outcomes:

- **(A) Single missing producer, thread-spawn-dependent**: NO. Ours DOES spawn the audio threads (tid=9, tid=10, tid=11) successfully. The threads execute briefly then wedge.
- **(B) Single missing producer, state-divergence in an existing-but-divergent thread**: NO. The divergence is not in *one* thread but in *all* peer-producer threads.
- **(C) Many distributed producers missing**: **YES.** 31 distinct LRs missing across at least 4 thread families:
  - Audio workers (sub_824D2878, sub_824D2940, sub_824D29F0, sub_824D21F0 chain): ~50k fires canary, 1 fire ours.
  - Worker dispatch (sub_82506B08, sub_82508400, sub_82506DE8, sub_825078D8 family, i.e. the AUDIT-049 "sub_825070F0" caller universe): ~10k fires canary, 0 ours.
  - NtSetEvent wrapper-internal (lr=0x824AA304): 15,765 canary, 0 directly observed in ours via this LR (ours uses the IAT path differently).
  - Misc (`sub_827E*`, `sub_82178*`, `sub_824D0*`): smaller counts, mostly init paths.

Outcome class = **(C) Many distributed producers missing.**

### Structural interpretation

Every peer-producer thread family in ours executes its FIRST iteration normally (the bootstrap), then stalls in its wait primitive. The wait primitives are different across families (different events, different semaphores), but the **pattern is identical**:

```
ThreadFamily X enters loop
  → does bootstrap-once setup (1 release/set fires)
  → enters KeWaitForMultipleObjects or KeWaitForSingleObject
  → blocks forever because the producer for ITS wait event is itself another wedged thread
```

This is a **multi-producer ladder collapse**: each thread depends on a peer (in the same OR different family) for its wake-up signal; that peer is also wedged on a dependency; etc. The graph is not strictly circular (each thread's specific wait may differ) but the topology means **no thread can advance because every thread's wake-source is also blocked.**

This subsumes the AUDIT-049 / AUDIT-069 framings into a unified picture: the wedge family includes at minimum {tid=10 audio-A, tid=11 audio-B, tid=9 audio-aux, tid=13 cache, all four sub_825070F0 worker spawnees}, all wedged on different wait sites, all unable to wake each other.

## Step 5: Recommended next iterate

Given outcome class (C), single-keystone iterates (2.B branch-probe, 2.C arg/return capture, single-thread wedge investigation) will not unlock the whole ladder: each one would unblock ONE thread, only to find it blocks again on the next un-signaled event. The plan's recommended pivot from outcome (C) is: **"may need a fundamentally different methodology"**.

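
The ladder-collapse topology can be made concrete with a toy wait-for model. The sketch below is an illustration under stated assumptions, not a reconstruction of the measured graph: the thread and object names in the example are hypothetical, each thread waits on at most one object, and a thread counts as wakeable as soon as any producer of its wait object is itself live.

```python
def wedged_threads(waits_on, producers):
    """Return the set of threads that can never be woken.

    waits_on:  thread -> object it is blocked on (None = currently running)
    producers: object -> set of threads able to signal that object
    """
    threads = set(waits_on) | {t for ps in producers.values() for t in ps}
    # Threads that are not blocked are live from the start.
    live = {t for t in threads if waits_on.get(t) is None}
    changed = True
    while changed:  # fixed point: propagate wake-ups until nothing new becomes live
        changed = False
        for t in threads - live:
            if producers.get(waits_on[t], set()) & live:
                live.add(t)
                changed = True
    return threads - live


if __name__ == "__main__":
    # Hypothetical topology: no cycle, but one root signal (ev0) has no producer.
    waits = {"main": None, "worker": "ev0", "cache": "ev1", "audio": "sem1"}
    prods = {"ev1": {"worker"}, "sem1": {"cache"}}
    print(sorted(wedged_threads(waits, prods)))
    prods["ev0"] = {"main"}  # restoring the single upstream signal frees the ladder
    print(sorted(wedged_threads(waits, prods)))
```

In this toy graph no cycle exists, yet every waiter is wedged until the one upstream signal is restored, which is the situation the options below try to distinguish from a genuinely distributed set of independent faults.
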
### Option (1): Critical-path sweep (~400-600 LOC over multiple sessions)

Identify which thread families' first iteration produces signals consumed by another thread family, build a dependency DAG, then trace each edge's first divergence in ours. Many of these edges may converge on a small number of "root cause" missed signals further upstream (e.g., a single missing signal in init code that cascades).

### Option (2): Boot-time delta replay (~100 LOC investigation)

Compare canary's 0-278 ms event sequence (1,221 events before first audio fire) against ours's 0-278 ms event sequence. There is an upstream timing skew (canary boots audio at 278 ms, ours at 1.587 s, which is 5.7× slower). The CAUSE of the slow init is upstream-of-audio and may be a single fixable wedge in the init path.

**Recommended: Option (2)**, which is cheaper and more focused. Diff the first ~1200 phase-a events of each engine to find the FIRST kernel-import-call divergence in early bootstrap. This may identify a single missing wake-signal in the early init flow that cascades to delay every subsequent subsystem.

### Option (3): Audio-specific micro-investigation (~50 LOC, narrower)

The single audio release that ours records (lr=`0x824d229c`, count=1) shows ours's audio thread DOES reach the release site once. Find what event/semaphore canary signals between iterations 1 and 2 that ours doesn't. This is a narrower (B-shaped) sub-investigation that doesn't unblock the full ladder but adds disambiguation between "thread didn't spawn", "first iter wedged", and "subsequent iter wedged".

## Tripstones honored

- **#28 (per-engine tid stability)**: explicitly used the (op, lr-containing-fn) tuple, not raw tid. Confirmed via `entry_pc` matching in phase-a thread.create events.
- **#32 (canary jitter)**: no relevant variance. The first 15 canary events (events 0–14) all have ours analogs with approximately matching counts, as recorded in Step 3. Divergence starts at canary event 15 (audio fire).
- **#37 (vtable base vs slot-N)**: not encountered.
- **#39 (composite progression metric)**: not moved this iterate; investigation-only.
- **#40 (single-keystone framing)**: explicitly broken by outcome (C). The "find THE missing producer" framing of S5/S6/2.A is **falsified** at the structural level: there are 31, not 1.

## Cascade

- A (acquire both engines' producer traces): **PASS HIGH** (canary 79,014 events, ours 153).
- B (align sequences): **PASS HIGH** (LR-keyed alignment; clear correspondence on the first 15 canary events).
- C (identify first divergent producer): **PASS HIGH** (canary event 15 at host_ns=278ms, tid=14, lr=0x824D229C, sub_824D21F0, audio dispatch).
- D (attribute cause): **PASS MEDIUM** (distributed wedge ladder, not single-thread blockage).
- E (outcome class named): **PASS HIGH** (Class (C), 31 missing LRs, 4+ thread families).

5 PASS / 0 FAIL.

## Artifacts

All under `xenia-rs/audit-runs/iterate-2D-peer-producer-trace/`:

- `canary-i2d.log` (10.7 MB, 79,014 peer-producer events from canary's audit_69+audit_70).
- `canary-i2d.stdout` / `.stderr`: canary run logs.
- `canary-peer-producers.jsonl`: parsed structured form of canary events (79,014 records).
- `ours-i2d-lr-trace.jsonl`: ours first pass at game wrapper PCs (150 events; missing the KeReleaseSemaphore direct path; kept for cross-check only).
- `ours-i2d-iat-trace.jsonl`: ours authoritative trace at IAT thunks (153 events, comprehensive Ke+Nt coverage).
- `ours-i2d.stdout` / `.stderr`: ours run logs.
- `aligned-sequence.csv`: chronological per-engine sequence (first 200 canary + all ours).
- `investigation.md`: this document.
- `report.md`: short outcome summary.

## Discipline

- xenia-rs HEAD UNCHANGED. canary HEAD UNCHANGED.
- Binaries `xenia_canary_i2d.exe` and `xrs-i2d` are renamed copies; originals untouched.
- Canary cache backed up to `/tmp/canary-cache-bak-iter2D` at session start; verified unchanged at session end (`diff -rq` returns empty).
- `--mute=true` honored on canary run.
- Investigation-only; no engine source changes.
LOC delta: 0 engine, 0 canary.