xenia-rs/audit-runs/phase-absorber-review/cross-reference.md
MechaCat02 ef93a4fa14 handoff: VSync/event-wedge fixes + iterate 2.A–2.BC research notes
Source changes (dormant parity infra, retained from iterate 2.AI/2.AO):
- xenia-kernel/exports.rs: nt_create_event manual_reset polarity +
  related event wiring
- xenia-gpu/mmio_region.rs: D1MODE_VBLANK_VLINE_STATUS hardcode parity

Also lands the audit-runs/ analysis notes (.md/.txt/.json digests) for the
iterate 2.x VSync/0x10e8/0x1004 wedge investigation. Raw trace dumps
(.jsonl/.gz/.csv/.stdout) and agent worktrees (.claude/) are gitignored as
regenerable local artifacts — see memory + HANDOFF for the running findings.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-05 07:19:08 +02:00


Cross-reference — absorbed events vs Phase W wedge ground truth

Baseline diff: Phase W cold canary canary-wedge.head250k.jsonl (56 MB, head-truncated to ~250k canary tid=6 events) vs ours ours-postfix.jsonl (28 MB). Default tid-map 6=1,4=11,7=2,12=7,14=9,15=10.
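The tid-map string above can be read as a list of `canary=ours` pairs. A minimal sketch of parsing it, assuming that direction (inferred from the "tid=6→1" notation used later in this note); `parse_tid_map` is a hypothetical helper, not a function from the diff tool:

```python
def parse_tid_map(spec: str) -> dict[int, int]:
    """Parse 'canary=ours' pairs such as '6=1,4=11' into {canary_tid: ours_tid}.

    Assumption: the left-hand side is the canary tid and the right-hand
    side is the ours tid, matching the 'tid=6→1' notation in this note.
    """
    mapping: dict[int, int] = {}
    for pair in spec.split(","):
        canary_tid, ours_tid = pair.strip().split("=")
        mapping[int(canary_tid)] = int(ours_tid)
    return mapping


DEFAULT_TID_MAP = parse_tid_map("6=1,4=11,7=2,12=7,14=9,15=10")
print(DEFAULT_TID_MAP[6])   # ours tid paired with canary main tid 6
print(DEFAULT_TID_MAP[15])  # ours tid paired with canary sister tid 15
```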

Total absorbed events: 8

| absorber | side | events |
|---|---|---|
| shared-global | ours | 1 |
| wait-begin | canary | 1 |
| nested-cs | canary | 6 (1 [E,L] pair = 6 events) |

Per-absorbed-event analysis

Nested-cs (6 events, canary tid=6 idx 104607–104612)

Pure RtlEnterCriticalSection + RtlLeaveCriticalSection import/kernel.call/kernel.return triples on canary's side. Return values are all 0x00000000 (uncontended fast path). No handles are referenced. Not wedge-relevant by construction: these are CS API calls, not signal-flow events.
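The "1 [E,L] pair = 6 events" shape can be checked mechanically. The sketch below assumes each trace event is a dict with hypothetical fields `kind`, `api`, and `ret`; the real JSONL schema is not shown in this note, so the field names are illustrative only:

```python
# Event kinds that make up one uncontended API call (fast path).
TRIPLE = ("import.call", "kernel.call", "kernel.return")


def is_uncontended_cs_pair(events: list[dict]) -> bool:
    """True if `events` is exactly one Enter triple followed by one Leave triple,
    with both kernel.return values equal to 0 (no wait.begin in between)."""
    if len(events) != 6:
        return False
    enter, leave = events[:3], events[3:]
    for triple, api in ((enter, "RtlEnterCriticalSection"),
                        (leave, "RtlLeaveCriticalSection")):
        if tuple(e["kind"] for e in triple) != TRIPLE:
            return False
        if any(e["api"] != api for e in triple):
            return False
        if triple[2].get("ret") != 0:
            return False
    return True


def make_call(api: str, ret: int = 0) -> list[dict]:
    """Build a synthetic fast-path triple for one API call (test data only)."""
    return [
        {"kind": "import.call", "api": api},
        {"kind": "kernel.call", "api": api},
        {"kind": "kernel.return", "api": api, "ret": ret},
    ]


window = make_call("RtlEnterCriticalSection") + make_call("RtlLeaveCriticalSection")
print(is_uncontended_cs_pair(window))
```

A window that contains a `wait.begin` between `kernel.call` and `kernel.return` would have a different length and kind sequence, so it fails this check and falls to the wait-begin case instead.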

Wait-begin (1 event, canary tid=6 idx 104622)

SID: a25a16a4f6f547aa (object_type=1 EVENT)
raw_handle_id: 0xf8000044 (canary kernel-table slot)
created at: canary tid=10 idx=843 (worker thread)
used by: 108 wait.begin events across canary tids 6, 9, 10, 17, 18

Embedded inside an RtlEnterCriticalSection block (idx 104620 import.call → 104621 kernel.call → 104622 wait.begin → 104623 kernel.return). This is canary's CS slow path: the CS was contended, so wait.begin fired on the CS dispatcher Event. Here object_type=1 (EVENT) is the Xbox kernel's representation of the CS's owned-by-other-thread dispatcher, NOT a user-mode Event created via NtCreateEvent.

The Event is created on worker tid=10 because in canary the worker did run and contend on this CS. In ours the workers don't run so the CS is never contended; ours fast-paths through (uncontended kernel.return at idx 104616 with status 0).

Wedge handles in ours (per halt-on-deadlock-dump.txt) are: 0x12d0, 0x1020, 0x1040, 0x10b0, 0x10ec, 0x12e4 — all object_type=1 EVENT but all created via NtCreateEvent at LR=0x824a9f6c from worker tid=13. They're worker-LOCAL Events (SIDs d5e23609d3948568 etc., computed from ours's per-tid recipe), NOT the shared CS dispatcher Event a25a16a4f6f547aa.

Verdict: absorber is correctly suppressing CS-contention scheduling jitter, not wedge signal flow. The Event canary waits on is the CS dispatcher proxy, never the user-mode worker-private Events.

Shared-global (1 event, ours tid=10 idx 2)

SID: ac8315b371bcf7cb (object_type=3 SEMAPHORE)
raw_handle_id: 0x828a3230 (guest VA — well-known XAudio voice-volume
                            semaphore, documented in C+18)

ours emits handle.create for this Semaphore at idx 2 because ensure_dispatcher_object synthesizes the shadow KernelObject at first touch (Phase C+17). Canary doesn't emit a corresponding handle.create on the same tid pair because canary's first toucher was a different host thread: the classic process-global first-toucher race that C+18 was specifically designed for.
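A toy model of why the first-toucher race moves the handle.create between threads. This is a simplified Python stand-in, not the actual `ensure_dispatcher_object` from the kernel crate; the class name, method name, and the tids used in the touch orders are illustrative assumptions:

```python
class ShadowTable:
    """Toy shadow-object table: the first thread to touch a guest VA
    synthesizes the object and is the one that emits handle.create."""

    def __init__(self) -> None:
        self.objects: dict[int, int] = {}
        self.trace: list[tuple[str, int, str]] = []

    def ensure(self, guest_va: int, object_type: int, tid: int) -> int:
        if guest_va not in self.objects:
            self.objects[guest_va] = object_type
            self.trace.append(("handle.create", tid, hex(guest_va)))
        return self.objects[guest_va]


def run(touch_order: list[int]) -> list[tuple[str, int, str]]:
    """Replay touches of the shared semaphore in the given thread order."""
    table = ShadowTable()
    for tid in touch_order:
        table.ensure(0x828A3230, 3, tid)  # object_type=3 SEMAPHORE
    return table.trace


print(run([10, 1]))  # handle.create attributed to tid 10
print(run([1, 10]))  # same object, handle.create attributed to tid 1
```

Both runs create the same object exactly once; only the thread that the event is attributed to changes, which is the kind of divergence the shared-global absorber suppresses.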

Wedge handles are all EVENT (object_type=1). This is a SEMAPHORE (type=3). Different object class, different code path (XAudio voice volume). Not wedge-relevant.

Verdict: absorber is correctly suppressing first-toucher race for a shared XAudio dispatcher, not wedge signal flow.

Selective-disable matched-prefix deltas

Baseline (all absorbers ON): main tid=6→1 matched=105,128.

| disabled absorber | main matched | delta | sister 15→10 matched |
|---|---|---|---|
| (none, baseline) | 105,128 | 0 | 16 |
| shared-global | 105,128 | 0 | 2 (delta 14) |
| wait-begin | 104,616 | 512 | 16 |
| nested-cs | 104,607 | 521 | 16 |
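The delta column is the baseline matched count minus the matched count with one absorber disabled. A short check of that arithmetic using the figures from the table (variable names are illustrative):

```python
BASELINE_MAIN = 105_128    # main tid=6→1 matched, all absorbers ON
BASELINE_SISTER = 16       # sister tid=15→10 matched, all absorbers ON

# Matched counts with exactly one absorber disabled: (main, sister).
matched_when_disabled = {
    "shared-global": (105_128, 2),
    "wait-begin": (104_616, 16),
    "nested-cs": (104_607, 16),
}

for name, (main, sister) in matched_when_disabled.items():
    main_delta = BASELINE_MAIN - main
    sister_delta = BASELINE_SISTER - sister
    print(f"{name}: main delta={main_delta}, sister delta={sister_delta}")
```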

The delta pattern matches the absorbed events exactly:

  • nested-cs's 6 absorbed events at idx 104,607–104,612 enabled the 104,607 → 105,128 advance (combined with subsequent wait-begin).
  • wait-begin's single absorb at idx 104,622 enabled the 104,616 → 105,128 advance (without it, the absorber-chain stops there).
  • shared-global's single absorb on tid=15→10 enabled that sister chain's 2 → 16 advance.
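The pattern in the bullets above (disabling an absorber stops the matched prefix at the first event it would have absorbed) can be reproduced with a toy lockstep matcher. This is a sketch under simplifying assumptions, not the diff tool's algorithm: events are plain strings, and `absorbable` is a hypothetical predicate standing in for the enabled absorbers.

```python
from typing import Callable, Sequence


def matched_prefix(canary: Sequence[str], ours: Sequence[str],
                   absorbable: Callable[[str], bool] = lambda ev: False) -> int:
    """Count events matched in lockstep before the first unabsorbed divergence."""
    i = j = matched = 0
    while i < len(canary) and j < len(ours):
        if canary[i] == ours[j]:
            matched += 1
            i += 1
            j += 1
        elif absorbable(canary[i]):
            i += 1  # skip a canary-only event the absorber covers
        elif absorbable(ours[j]):
            j += 1  # skip an ours-only event the absorber covers
        else:
            break   # real divergence: the matched prefix ends here
    return matched


canary = ["a", "cs.enter", "cs.leave", "b", "c"]
ours = ["a", "b", "c"]

print(matched_prefix(canary, ours))                                   # absorber off
print(matched_prefix(canary, ours, lambda ev: ev.startswith("cs.")))  # absorber on
```

With the absorber off, the count stops at the index of the first canary-only event, which mirrors how the nested-cs row stops at 104,607 where its absorbed events begin.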

Cross-reference verdict

None of the absorbed events reference a wedge-relevant handle.

Specifically:

  1. Nested-cs absorbs RtlEnter/Leave API events — no handles involved.
  2. Wait-begin absorbs a CS-dispatcher Event used in CS contention. The wedge Events are user-mode NtCreateEvent outputs from worker tid=13: both report object_type=1 (EVENT), but they come from a different creation path and carry different SIDs than the CS dispatcher Event.
  3. Shared-global absorbs an XAudio SEMAPHORE — wedge handles are all EVENT type.
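The three checks above amount to a set-membership test. A minimal sketch, using only identifiers quoted in this note; the record layout and the `is_wedge_relevant` helper are illustrative assumptions, and `KNOWN_WEDGE_SIDS` is deliberately partial because the note spells out only one wedge SID:

```python
# Wedge handles in ours, per halt-on-deadlock-dump.txt (all object_type=1 EVENT).
WEDGE_HANDLES = {0x12D0, 0x1020, 0x1040, 0x10B0, 0x10EC, 0x12E4}
# Partial: the note lists a single wedge SID followed by "etc.".
KNOWN_WEDGE_SIDS = {"d5e23609d3948568"}

ABSORBED = [
    {"absorber": "nested-cs", "side": "canary",
     "sid": None, "object_type": None, "raw_handle_id": None},
    {"absorber": "wait-begin", "side": "canary",
     "sid": "a25a16a4f6f547aa", "object_type": 1, "raw_handle_id": 0xF8000044},
    {"absorber": "shared-global", "side": "ours",
     "sid": "ac8315b371bcf7cb", "object_type": 3, "raw_handle_id": 0x828A3230},
]


def is_wedge_relevant(entry: dict) -> bool:
    """SIDs are compared across sides; raw handles only within ours,
    since canary handle values live in a different table."""
    if entry["sid"] in KNOWN_WEDGE_SIDS:
        return True
    return entry["side"] == "ours" and entry["raw_handle_id"] in WEDGE_HANDLES


relevant = [e["absorber"] for e in ABSORBED if is_wedge_relevant(e)]
print(relevant)  # empty list: no absorbed event touches a listed wedge handle or SID
```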

What the absorbers DO reveal indirectly

The wait-begin and nested-cs absorbers fire because canary's main thread (tid=6) waits on a CS that ours never contends on. Ours never contends on it because the worker cluster (canary tid=9/10/14/15/17/18) never runs: per Phase W ground truth, they emit 17 and 77 events in ours (vs 995k and 1.9M in canary).

The absorbers are therefore CORRECTLY treating the contention pattern as scheduling jitter at the diff layer. The underlying root cause — workers don't bootstrap — is what Phase W identified and is unchanged by absorber behavior.

Even if we disabled all three absorbers, the surfaced divergences would be:

  • canary's main waiting on a CS dispatcher that ours doesn't create (because the contending worker is absent), AND
  • canary's main entering CS nested-cleanup branches because the CS-protected registry has more entries (because workers inserted them).

Both are downstream effects of the same upstream "workers don't run" root cause that Phase D's contention-replay (Stage 3/4) and quantum (Stage 0) experiments already failed to unblock. No new signal-flow gap is exposed.