Iterate 2.D — Peer-producer LR trace (investigation log)
Date: 2026-05-21
Mode: WRITE (investigation only; no engine source changes).
Binaries: xenia-canary/.../xenia_canary_i2d.exe, xenia-rs/target/release/xrs-i2d.
Reuses: canary --audit_69_log_all_sets=true --audit_70_log_all_releases=true (existing),
ours --lr-trace (existing); no new cvars added.
LOC delta: engine 0, canary 0.
Step 0 — Existing artifact triage
audit-runs/audit-069-wait-signal-producer/s5/canary-release-trace.log (S5) holds only
NtReleaseSemaphore fires (414 events, single-handle restricted by configured handle list);
NtSetEvent dimension was never captured. Audit-069 S6 bridge wrote a 414-vs-99 first-N
release diff but did not extend coverage to NtSetEvent or to canary's wider tid family.
Therefore a fresh capture covering BOTH operations was required.
Step 1 — Capture both engines
Canary cold run
wine xenia_canary_i2d.exe "<iso>" --mute=true \
--audit_69_log_all_sets=true --audit_70_log_all_releases=true \
--log_file=canary-i2d.log
Wallclock 90 s (timeout). Result: 79,014 peer-producer events
(61,659 NtSetEvent/XEvent::Set + 17,355 NtReleaseSemaphore/xeKeReleaseSemaphore). 21 distinct guest tids.
Ours cold run
xrs-i2d exec "<iso>" --quiet \
--lr-trace=0x8284DDDC,0x8284E49C,0x8284DF5C,0x8284E07C \
--lr-trace-out=ours-i2d-iat-trace.jsonl
The probe PCs are the IAT thunks for KeSetEvent / KeReleaseSemaphore / NtSetEvent /
NtReleaseSemaphore respectively — covering BOTH wrapper paths and direct callers (matches
canary's audit_69/audit_70 C++ hook coverage). Wallclock 60 s (timeout). Result:
153 peer-producer events across 8 distinct guest tids (1, 2, 5, 6, 8, 9, 11, 13).
A first pass with --lr-trace=0x824AA2F0,0x824AB158 (game's NtSetEvent/NtReleaseSemaphore
wrappers, sub_824AA2F0/sub_824AB158) yielded 150 events but missed all KeReleaseSemaphore
direct calls (which bypass the game wrappers via the audio subsystem's sub_824D21F0 and
others). That trace is kept as ours-i2d-lr-trace.jsonl for cross-check; the IAT-thunk trace
ours-i2d-iat-trace.jsonl is the authoritative one used for alignment.
Step 2 — Cross-engine alignment
Handle namespaces
Canary handles 0xF8000XXX (8-bit XAM-style); ours handles 0x10XX (slot ids). Not directly
comparable by raw value — must use (op, lr-containing-fn) tuple.
Documented tid map (per AUDIT-068/S6 bridge)
- canary tid=6 ↔ ours tid=1 (main)
- canary tid=10 ↔ ours tid=5 (worker)
- canary tid=17 ↔ ours tid=13 (cache thread)
Other tid pairs (audio threads tid=14 / tid=2 / tid=4 etc.) require entry-PC matching since their IDs are not stably mapped.
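A minimal sketch of the entry-PC matching mentioned above, assuming each engine's thread.create events have already been reduced to (tid, entry_pc) pairs. The function name and the input shape are hypothetical; the two entry PCs are the audio entries named later in this log, and the canary tids in the sample are placeholders, not measured values.

```python
from collections import defaultdict


def map_tids_by_entry_pc(canary_threads, ours_threads):
    """Pair guest tids across engines by thread entry PC.

    Each argument is an iterable of (tid, entry_pc) pairs. Entry PCs used by
    more than one thread in either engine are ambiguous and left unmapped.
    """
    def index(threads):
        by_pc = defaultdict(list)
        for tid, entry_pc in threads:
            by_pc[entry_pc].append(tid)
        return by_pc

    canary_by_pc = index(canary_threads)
    ours_by_pc = index(ours_threads)
    mapping = {}
    for entry_pc, canary_tids in canary_by_pc.items():
        ours_tids = ours_by_pc.get(entry_pc, [])
        if len(canary_tids) == 1 and len(ours_tids) == 1:
            mapping[canary_tids[0]] = ours_tids[0]
    return mapping


# Illustrative input only: canary tids 101/102 are placeholders.
canary_threads = [(101, 0x824D2878), (102, 0x824D2940)]
ours_threads = [(10, 0x824D2878), (11, 0x824D2940)]
tid_map = map_tids_by_entry_pc(canary_threads, ours_threads)
```

Skipping ambiguous entry PCs is a deliberate choice here: worker pools that share one entry function cannot be paired by entry PC alone and would need a second key such as spawn order.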
LR alignment (operation + lr-containing-fn)
Cross-engine matching key: (op_category, lr_value). Since BOTH engines route through
identical game-side dispatch code, the same call site (lr) means the same source-level
producer.
| LR | Op | Function | Canary fires | Ours fires | Status |
|---|---|---|---|---|---|
| 0x824D229C | release | sub_824D21F0 (audio dispatch) | 16,452 | 1 | UNDER (×16,452) |
| 0x824D2A44 | set | sub_824D29F0 (audio worker entry) | 16,452 | 0 | MISSING |
| 0x824D292C | set | sub_824D2878 (audio worker entry-2) | 16,452 | 0 | MISSING |
| 0x824AA304 | set | sub_824AA2F0 (NtSetEvent wrapper, generic) | 15,765 | (multi) | MIXED |
| 0x82506C90 | set | sub_82506B08 (worker dispatch +0x188) | 2,378 | 0 | MISSING |
| 0x82508510 | set | sub_82508400 (+0x110) | 2,373 | 0 | MISSING |
| 0x82508524 | set | sub_82508400 (+0x124) | 2,373 | 0 | MISSING |
| 0x82506F9C | set | sub_82506DE8 (worker dispatch) | 2,355 | 0 | MISSING |
| 0x82508358 | set | sub_825078D8 (worker dispatch +0xa80) | 2,350 | 0 | MISSING |
| 0x824AAFC8 | set | sub_824AAF50 (KeSetEvent wrapper) | 1,113 | 0 | MISSING |
| 0x824AB168 | release | sub_824AB158 (NtReleaseSemaphore wrapper internal) | 903 | 90 | UNDER (×10) |
| 0x82450D2C | release | sub_82450B68 (worker self-release-2) | (multi via 0x824AB168) | 75 | match-class |
| 0x82450CE0 | release | sub_82450B68 (worker self-release-1) | (multi via 0x824AB168) | 7 | match-class |
| 0x82450314 | release | sub_82450218 | (multi via 0x824AB168) | 8 | match-class |
Total LRs missing in ours: 31 distinct call sites, 61,659 canary fires with 0 analogs in ours. In addition, the audio release at lr=0x824D229C is matched in class but under-firing (1 fire in ours vs 16,452 in canary).
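The per-LR counts and the MISSING / UNDER statuses can be derived mechanically. The sketch below assumes each trace record has been reduced to a dict with "op" and "lr" fields (hypothetical field names; the real JSONL schema is not shown in this log). The "OK" label is a placeholder; the match-class and MIXED rows need manual judgement and are not modelled.

```python
from collections import Counter


def fires_by_key(events):
    """Count fires per (op, lr) call-site key."""
    return Counter((e["op"], e["lr"]) for e in events)


def classify(canary_fires, ours_fires):
    """Label one row using only the two fire counts."""
    if canary_fires > 0 and ours_fires == 0:
        return "MISSING"
    if ours_fires < canary_fires:
        return f"UNDER (x{round(canary_fires / ours_fires):,})"
    return "OK"  # placeholder label, not used in the table above


def alignment_rows(canary_events, ours_events):
    """Rows of (op, lr, canary fires, ours fires, status), busiest canary site first."""
    canary = fires_by_key(canary_events)
    ours = fires_by_key(ours_events)
    keys = sorted(canary.keys() | ours.keys(), key=lambda k: (-canary[k], k))
    return [
        (op, hex(lr), canary[(op, lr)], ours[(op, lr)],
         classify(canary[(op, lr)], ours[(op, lr)]))
        for op, lr in keys
    ]


# Tiny synthetic traces; the counts are illustrative, not the measured ones.
canary_events = [{"op": "release", "lr": 0x824D229C}] * 3 + [{"op": "set", "lr": 0x824D2A44}] * 2
ours_events = [{"op": "release", "lr": 0x824D229C}]
rows = alignment_rows(canary_events, ours_events)
```

Fed the measured counts, `classify(16452, 1)` and `classify(903, 90)` reproduce the two UNDER ratios in the table.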
Step 3 — First divergent producer (chronological)
Time-ordered scan of canary's events (host_ns) against ours's events (per-tid cycle):
- Canary events 0–14 (host_ns 4.8 µs → 162.6 ms): all from tids 6/10/0 — these have exact ours analogs (tids 1/5/bootstrap), counts match approximately.
- Canary event 15 at host_ns=277,967,100 (278 ms): tid=14 fires xeKeReleaseSemaphore on handle 0xF800006C from lr=0x824D229C (sub_824D21F0+0xAC, audio dispatch). This is the FIRST canary peer-producer ours doesn't match in volume.
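One way to make "doesn't match in volume" concrete is a ratio threshold on per-call-site totals. The sketch below is an assumption about how such a scan could be written, not the procedure actually used for this log: the field names ("host_ns", "op", "lr") and the `max_ratio` default are hypothetical.

```python
from collections import Counter


def first_volume_divergence(canary_events, ours_events, max_ratio=2.0):
    """Return (index, event) for the first canary event, in host_ns order,
    whose (op, lr) call site fires more than max_ratio times as often in
    canary as in ours. Returns None when every site is within the ratio."""
    canary = Counter((e["op"], e["lr"]) for e in canary_events)
    ours = Counter((e["op"], e["lr"]) for e in ours_events)
    ordered = sorted(canary_events, key=lambda e: e["host_ns"])
    for index, event in enumerate(ordered):
        key = (event["op"], event["lr"])
        # ours[key] is 0 for a site ours never hit, so any canary fire diverges.
        if canary[key] > max_ratio * ours[key]:
            return index, event
    return None


# Synthetic traces with made-up timestamps.
canary_events = [
    {"host_ns": 10, "op": "release", "lr": 0x824AB168},
    {"host_ns": 20, "op": "set", "lr": 0x824AA304},
    {"host_ns": 30, "op": "release", "lr": 0x824D229C},
    {"host_ns": 40, "op": "release", "lr": 0x824D229C},
    {"host_ns": 50, "op": "release", "lr": 0x824D229C},
]
ours_events = [
    {"op": "release", "lr": 0x824AB168},
    {"op": "set", "lr": 0x824AA304},
    {"op": "release", "lr": 0x824D229C},
]
hit = first_volume_divergence(canary_events, ours_events)
```

A pure one-for-one pairing would not flag lr=0x824D229C at its first fire, because ours does reach that site once; comparing totals is what surfaces the 1 vs 16,452 gap at the first occurrence.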
Attribution
sub_824D21F0 is the audio subsystem's "post-process-then-release-semaphore" leaf, called from
3 sites:
- sub_824D2878+0xA0 (audio worker entry A)
- sub_824D2940+0x64 (audio worker entry B)
- sub_824D29F0+0xC4 (audio worker dispatch loop body)
sub_824D29F0 is the main audio worker fn: enters CS, fires KeSetEvent, calls
KeWaitForMultipleObjects, then either sub_824D2108 or sub_824D21F0 (the semaphore
release). The fn pointers for both audio worker entries (sub_824D2878/sub_824D2940) are
loaded by sub_824D2C08 (audio subsystem init), which then calls ExCreateThread ×2 to
spawn them, followed by KeSetBasePriorityThread and KeResumeThread.
Ours's actual audio thread behavior
Phase-A event log (/tmp/ours-i2d-events.jsonl, 118,149 events) shows:
- host_ns=1,586,993,047 (1.587 s): thread.create entry_pc=0x824d2878 (audio thread A), suspended
- host_ns=1,587,001,117: ObReferenceObjectByHandle (audio init handshake)
- host_ns=1,587,011,797: KeSetBasePriorityThread
- host_ns=1,587,018,827: KeResumeThread (audio thread A resumed)
- host_ns=1,587,049,878: ExCreateThread (audio thread B, entry_pc=0x824d2940)
- host_ns=1,587,088,839: KeResumeThread (audio thread B resumed)
- host_ns=1,587,097,519: ours tid=10 (audio thread A) starts: KeWaitForSingleObject, KeRaiseIrqlToDpcLevel, KeAcquireSpinLockAtRaisedIrql, KeReleaseSpinLockFromRaisedIrql, KfLowerIrql (17 events total). Then silent.
- host_ns=1,659,028,012: ours tid=11 (audio thread B) starts: RtlEnterCriticalSection, KeSetEvent, KeWaitForMultipleObjects (11 events total). Then silent.
- ours tid=9 (also audio family, ctx_ptr=0x828a3230 = audio static dispatcher) fires KeReleaseSemaphore ONCE at cycle=631 from lr=0x824d229c, then silent.
Conclusion: ours audio threads are spawned, resumed, execute one iteration of their
work loop, then wedge in KeWaitForMultipleObjects (sub_824D29F0+0xA0) — the same wedge
shape as tid=13 (cache thread). Canary's audio thread iterates ~16,452× over the same
window; ours iterates ~1×. The wedge is not localized to one thread — it is the same wedge
pattern recurring in every peer-producer thread family.
Timing skew
Canary boot timeline (audio subsystem):
- host_ns=4.8 µs: first NtReleaseSemaphore (tid=6 main bootstrap)
- host_ns=277.97 ms: audio worker (tid=14) first fires
Ours boot timeline:
- host_ns=5.4 ms: first NtReleaseSemaphore (tid=1 main, matched)
- host_ns=1,586.99 ms = 1.587 s: audio thread spawned (5.7× later than canary's first audio fire)
This 1.3-second delay implies upstream init phase divergence — ours's main thread is taking
significantly longer to reach the sub_824D2C08 init call than canary does. The cause of this
upstream delay is likely the cumulative effect of every prior subsystem wedge: each
producer-consumer pair that wedges in ours costs time before the next subsystem can init.
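The 5.7× and 1.3-second figures follow directly from the two timestamps quoted in this log (canary's first audio fire and ours's audio thread creation). A quick check, with the constant names being illustrative only:

```python
# host_ns values copied from this log.
CANARY_FIRST_AUDIO_FIRE_NS = 277_967_100   # canary event 15 (tid=14)
OURS_AUDIO_SPAWN_NS = 1_586_993_047        # ours thread.create, entry_pc=0x824d2878

delay_s = (OURS_AUDIO_SPAWN_NS - CANARY_FIRST_AUDIO_FIRE_NS) / 1e9
slowdown = OURS_AUDIO_SPAWN_NS / CANARY_FIRST_AUDIO_FIRE_NS

print(f"delay = {delay_s:.3f} s, slowdown = {slowdown:.1f}x")
# prints "delay = 1.309 s, slowdown = 5.7x"
```

Note that this compares canary's first audio fire against ours's audio thread spawn, as the timeline above does, so the true gap to ours's first audio fire is slightly larger.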
Step 4 — Outcome class
The plan defined three outcomes:
- (A) Single missing producer, thread-spawn-dependent: NO — ours DOES spawn the audio threads (tid=9, tid=10, tid=11) successfully. The threads execute briefly then wedge.
- (B) Single missing producer, state-divergence in an existing-but-divergent thread: NO — the divergence is not in one thread but in all peer-producer threads.
- (C) Many distributed producers missing: YES. 31 distinct LRs missing across at least 4 thread families:
  - Audio workers (sub_824D2878, sub_824D2940, sub_824D29F0, sub_824D21F0 chain): ~50k fires canary, 1 fire ours.
  - Worker dispatch (sub_82506B08, sub_82508400, sub_82506DE8, sub_825078D8 family; the AUDIT-049 "sub_825070F0" caller universe): ~10k fires canary, 0 ours.
  - NtSetEvent wrapper-internal (lr=0x824AA304): 15,765 canary, 0 directly observed in ours via this LR (ours uses the IAT path differently).
  - Misc (sub_827E*, sub_82178*, sub_824D0*): smaller counts, mostly init paths.
Outcome class = (C) Many distributed producers missing.
Structural interpretation
Every peer-producer thread family in ours executes its FIRST iteration normally (the bootstrap), then stalls in its wait primitive. The wait primitives are different across families (different events, different semaphores), but the pattern is identical:
ThreadFamily X enters loop
→ does bootstrap-once setup (1 release/set fires)
→ enters KeWaitForMultipleObjects or KeWaitForSingleObject
→ blocks forever because the producer for ITS wait event is itself another wedged thread
This is a multi-producer ladder collapse: each thread depends on a peer (in the same OR different family) for its wake-up signal; that peer is also wedged on a dependency; etc. The graph is not strictly circular (each thread's specific wait may differ) but the topology means no thread can advance because every thread's wake-source is also blocked.
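The ladder-collapse argument can be modelled as a small fixed-point computation over a wait-for graph. The sketch below is a simplified model under stated assumptions: each waiting thread is mapped to the set of threads that could signal its current wait, an empty set means nobody currently produces that signal, and the specific edges in the sample are hypothetical (this log does not establish which thread wakes which).

```python
def wedged_threads(waits_on, runnable):
    """Return the tids that can never wake.

    waits_on: dict tid -> set of tids that could signal tid's current wait.
              An empty set means no thread produces that signal.
    runnable: iterable of tids that are not blocked.
    A waiting thread can wake once at least one of its producers can run.
    """
    can_run = set(runnable)
    changed = True
    while changed:
        changed = False
        for tid, producers in waits_on.items():
            if tid not in can_run and producers & can_run:
                can_run.add(tid)
                changed = True
    return set(waits_on) - can_run


# Hypothetical, acyclic ladder: 10 waits on 11, 11 on 9, 9 on 13, and nothing
# ever signals 13. tid=1 (main) is runnable but feeds none of them.
ladder = {10: {11}, 11: {9}, 9: {13}, 13: set()}
stuck = wedged_threads(ladder, runnable={1})

# Restoring the single missing upstream signal frees the whole ladder.
repaired = dict(ladder)
repaired[13] = {1}
stuck_after_fix = wedged_threads(repaired, runnable={1})
```

In this toy graph no cycle is needed for a total stall: one unsatisfied wait at the bottom is enough to hold every thread above it.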
This subsumes the AUDIT-049 / AUDIT-069 framings into a unified picture: the wedge family includes at minimum {tid=10 audio-A, tid=11 audio-B, tid=9 audio-aux, tid=13 cache, all four sub_825070F0 worker spawnees}, all wedged on different wait sites, all unable to wake each other.
Step 5 — Recommended next iterate
Given outcome class (C), single-keystone iterates (2.B branch-probe, 2.C arg/return capture, single-thread wedge investigation) will not unlock the whole ladder — each one would unblock ONE thread, only to find it blocks again on the next un-signaled event.
The plan's recommended pivot from outcome (C) is: "may need a fundamentally different methodology".
Option (1): Critical-path sweep (~400-600 LOC over multiple sessions)
Identify which thread families' first-iteration produces signals consumed by another thread family, build a dependency DAG, then trace each edge's first-divergence in ours. Many of these edges may converge on a small number of "root cause" missed signals further upstream (e.g., a single missing signal in init code that cascades).
Option (2): Boot-time delta replay (~100 LOC investigation)
Compare canary's 0-278 ms event sequence (1,221 events before first audio fire) against ours's 0-278 ms event sequence. There's an upstream timing skew (canary boots audio at 278 ms, ours at 1.587 s — 5.7× slower). The CAUSE of the slow init is upstream-of-audio and may be a single fixable wedge in the init path.
Recommended: Option (2) — cheaper, more focused. Diff the first ~1200 phase-a events of each engine to find the FIRST kernel-import-call divergence in early bootstrap. This may identify a single missing wake-signal in the early init flow that cascades to delay every subsequent subsystem.
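A minimal sketch of the Option (2) diff, assuming each engine's phase-a log has already been reduced to an ordered list of kernel-import names for one mapped thread pair (for example canary tid=6 and ours tid=1). The function name and the sample sequences are illustrative, not captured data.

```python
from itertools import zip_longest


def first_call_divergence(canary_calls, ours_calls):
    """Return (index, canary_call, ours_call) at the first mismatch, or None.

    The shorter sequence is padded with None, so one engine stopping early
    is reported as a divergence at the point where its calls run out.
    """
    pairs = zip_longest(canary_calls, ours_calls)
    for index, (canary_call, ours_call) in enumerate(pairs):
        if canary_call != ours_call:
            return index, canary_call, ours_call
    return None


# Illustrative sequences only.
canary_calls = ["RtlEnterCriticalSection", "KeSetEvent",
                "KeWaitForMultipleObjects", "KeReleaseSemaphore"]
ours_calls = ["RtlEnterCriticalSection", "KeSetEvent",
              "KeWaitForMultipleObjects"]
divergence = first_call_divergence(canary_calls, ours_calls)
```

Comparing per mapped thread rather than over the global event stream avoids false divergences from scheduler interleaving, which differs between engines even when each thread behaves identically.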
Option (3): Audio-specific micro-investigation (~50 LOC, narrower)
The single audio release in ours (lr=0x824d229c, count=1) shows that ours's audio thread DOES
reach the release site once. Find what event/semaphore canary signals between iterations 1
and 2 that ours doesn't. This is a narrower (B-shaped) sub-investigation that doesn't unblock
the full ladder but adds disambiguation between "thread didn't spawn", "first iter wedged",
and "subsequent iter wedged".
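Option (3) amounts to a window query on the canary trace: take everything canary fires between the first and second hit at the audio release site and keep the call sites ours never reaches. A hedged sketch, with hypothetical field names ("host_ns", "op", "lr") and synthetic sample data:

```python
def signals_between_iterations(canary_events, ours_keys, marker_lr):
    """Canary events between the 1st and 2nd fire at marker_lr whose
    (op, lr) key does not appear in ours_keys.

    Returns an empty list if canary hits marker_lr fewer than twice.
    """
    ordered = sorted(canary_events, key=lambda e: e["host_ns"])
    marks = [i for i, e in enumerate(ordered) if e["lr"] == marker_lr]
    if len(marks) < 2:
        return []
    window = ordered[marks[0] + 1:marks[1]]
    return [e for e in window if (e["op"], e["lr"]) not in ours_keys]


# Synthetic trace with made-up timestamps.
canary_events = [
    {"host_ns": 100, "op": "release", "lr": 0x824D229C},
    {"host_ns": 110, "op": "set", "lr": 0x824D2A44},
    {"host_ns": 120, "op": "set", "lr": 0x824AA304},
    {"host_ns": 130, "op": "release", "lr": 0x824D229C},
]
ours_keys = {("release", 0x824D229C), ("set", 0x824AA304)}
candidates = signals_between_iterations(canary_events, ours_keys, 0x824D229C)
```

The result is a candidate list, not a verdict: a signal in the window may be unrelated to the audio loop, so each candidate still needs to be checked against the handle the wedged wait is actually blocked on.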
Tripstones honored
- #28 (per-engine tid stability): explicitly used (op, lr-containing-fn) tuple, not raw tid. Confirmed via entry_pc matching in phase-a thread.create events.
- #32 (canary jitter): no relevant variance; both engines bit-equivalent on first 15 events (host_ns and counts match). Divergence starts at canary event 15 (audio fire).
- #37 (vtable base vs slot-N): not encountered.
- #39 (composite progression metric): not moved this iterate; investigation-only.
- #40 (single-keystone framing): explicitly broken by outcome (C). The "find THE missing producer" framing of S5/S6/2.A is falsified at the structural level — there are 31, not 1.
Cascade
- A (acquire both engines' producer traces): PASS HIGH (canary 79,014 events, ours 153)
- B (align sequences): PASS HIGH (LR-keyed alignment; clear bit-equivalence on first 15 canary events).
- C (identify first divergent producer): PASS HIGH — canary event 15 at host_ns=278ms, tid=14, lr=0x824D229C, sub_824D21F0 (audio dispatch).
- D (attribute cause): PASS MEDIUM — distributed wedge ladder, not single-thread blockage.
- E (outcome class named): PASS HIGH — Class (C), 31 missing LRs, 4+ thread families.
5 PASS / 0 FAIL.
Artifacts
All under xenia-rs/audit-runs/iterate-2D-peer-producer-trace/:
- canary-i2d.log (10.7 MB, 79,014 peer-producer events from canary's audit_69+audit_70).
- canary-i2d.stdout/.stderr: canary run logs.
- canary-peer-producers.jsonl: parsed structured form of canary events (79,014 records).
- ours-i2d-lr-trace.jsonl: ours first pass at game wrapper PCs (150 events; missing KeReleaseSemaphore direct path; kept for cross-check only).
- ours-i2d-iat-trace.jsonl: ours authoritative trace at IAT thunks (153 events, comprehensive Ke+Nt coverage).
- ours-i2d.stdout/.stderr: ours run logs.
- aligned-sequence.csv: chronological per-engine sequence (first 200 canary + all ours).
- investigation.md: this document.
- report.md: short outcome summary.
Discipline
- xenia-rs HEAD UNCHANGED. canary HEAD UNCHANGED.
- Binaries xenia_canary_i2d.exe and xrs-i2d are renamed copies; originals untouched.
- Canary cache backed up to /tmp/canary-cache-bak-iter2D at session start; verified unchanged at session end (diff -rq returns empty).
- --mute=true honored on canary run.
- Investigation-only; no engine source changes.
LOC delta: 0 engine, 0 canary.