xenia-rs/audit-runs/audit-059-gamma-wedge/canary-summary.md
MechaCat02 ef93a4fa14 handoff: VSync/event-wedge fixes + iterate 2.A–2.BC research notes
Source changes (dormant parity infra, retained from iterate 2.AI/2.AO):
- xenia-kernel/exports.rs: nt_create_event manual_reset polarity +
  related event wiring
- xenia-gpu/mmio_region.rs: D1MODE_VBLANK_VLINE_STATUS hardcode parity

Also lands the audit-runs/ analysis notes (.md/.txt/.json digests) for the
iterate 2.x VSync/0x10e8/0x1004 wedge investigation. Raw trace dumps
(.jsonl/.gz/.csv/.stdout) and agent worktrees (.claude/) are gitignored as
regenerable local artifacts — see memory + HANDOFF for the running findings.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-05 07:19:08 +02:00


AUDIT-059 PROBE C — canary γ-wedge signaler triangulation

- Date: 2026-05-11
- Mode: READ-ONLY canary instrumentation (patch reverted clean).
- Canary HEAD before/after: 6de80dffe (clean tree confirmed).
- Patch: audit-030 --log_lr_on_pc (30 LOC across 4 files; saved to canary-patches-applied.diff).
- Build: cd build && ninja -f build-Debug.ninja xenia_canary → copied to xenia-canary-probe.

Phase 1 — handle creation at sub_821CB030+0x128 (PC 0x821CB15C)

Probe target: PC 0x821CB15C (post-bl after bl 0x824A9F18 NtCreateEvent wrapper). At this PC, r3 = freshly-created event handle.

2 fires captured in 130 seconds (canary-ntcreate.log):

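| # | Wallclock pos | tid (canary) | r3 (handle) | r31 (stack) |
|---|---------------|--------------|-------------|-------------|
| 1 | line 2058     | F8000090     | 0xF8000098  | 0x7064FA70  |
| 2 | line 10567    | F80000CC     | 0xF8000108  | 0x708FF990  |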

Both fires precede a synchronous file-IO sequence (RtlInitAnsiString → NtQueryFullAttributesFile → NtCreateFile for cache:\aab216c3\5\... paths).

Both events are then duplicated via NtDuplicateObject (the duplicate is the real wait target):

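| Original handle    | Dup target        | Wait-site                                    |
|--------------------|-------------------|----------------------------------------------|
| F8000098 (XObject) | F80000A0 (XEvent) | tid F8000090, NtClose@line 2081 (fast)       |
| F8000108 (XObject) | F8000110 (XEvent) | tid F80000CC, NtClose@line 10605             |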

Phase 1b — wait-site at sub_821CB030+0x1AC (PC 0x821CB1DC)

Verifies the wait fires in canary too. 2 fires, both with lr=0x821CB1D0:

i> F8000090 TRACE-PC-LR pc=821CB1DC lr=821CB1D0 r3=F8000098 r4=FFFFFFFF r5=BC65CDC0
i> F80000C8 TRACE-PC-LR pc=821CB1DC lr=821CB1D0 r3=F8000108 r4=FFFFFFFF r5=BC667CC0

r4=FFFFFFFF means an INFINITE wait timeout. The wait DOES execute in canary, but it completes (each fire is matched by a subsequent NtClose). This is the AUDIT-041 wait-site, bl 0x824AA330.
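
The register check above can be automated over a whole probe log. The following is a minimal sketch, assuming every probe fire has the `i> <tid> TRACE-PC-LR key=HEX ...` shape seen in the excerpts. The function names `parse_trace_line` and `is_infinite_wait` are hypothetical helpers, not part of the audit-030 patch.

```python
import re

# Assumed line shape, inferred from the excerpts in this note:
#   i> <tid> TRACE-PC-LR key=HEX key=HEX ...
LINE_RE = re.compile(r"^i> (?P<tid>[0-9A-Fa-f]{8}) TRACE-PC-LR (?P<fields>.*)$")
INFINITE_TIMEOUT = 0xFFFFFFFF  # r4 value this note reads as "wait forever"


def parse_trace_line(line):
    """Return a dict of integer fields, or None if the line is not a probe fire."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    fire = {"tid": int(m.group("tid"), 16)}
    for token in m.group("fields").split():
        key, sep, value = token.partition("=")
        if not sep or not value:
            continue  # skip anything that is not key=HEX
        try:
            fire[key] = int(value, 16)
        except ValueError:
            continue  # skip non-hex values instead of failing the whole line
    return fire


def is_infinite_wait(fire):
    """True when the probe captured r4 == 0xFFFFFFFF at the wait-site."""
    return fire.get("r4") == INFINITE_TIMEOUT


sample = "i> F8000090 TRACE-PC-LR pc=821CB1DC lr=821CB1D0 r3=F8000098 r4=FFFFFFFF r5=BC65CDC0"
fire = parse_trace_line(sample)
print(f"pc={fire['pc']:08X} handle={fire['r3']:08X} infinite={is_infinite_wait(fire)}")
```

Lines that do not match the assumed shape are returned as `None`, so the helper can be run over a full log with unrelated output interleaved.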

Phase 2 — NtSetEvent triangulation

Probe target: NtSetEvent thunk PC 0x8284DF5C (53,701 fires in 130s). Cross-checked against the sub_824AA2F0 (NtSetEvent wrapper) entry probe (20,919 fires).

Identification of wedge-equivalent handle by NtSetEvent fire pattern

Hypothesis: the dup-XEvent (target of NtDuplicateObject) is what gets signaled.

In canary-ntsetevent.log, dup handle F8000110 appears in NtSetEvent exactly 2×:

i> F8000054 TRACE-PC-LR pc=8284DF5C lr=824AA304 r3=F8000110 r5=BC32CC60 r31=7036FDC0
i> F8000084 TRACE-PC-LR pc=8284DF5C lr=824AA304 r3=F8000110 r5=00000002 r31=705AF860
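
Finding a handle that is signaled exactly twice among tens of thousands of fires is a counting problem. Here is a rough sketch of that filter, assuming the same log line shape as the excerpt. The third sample line (handle F8000200) is a made-up filler to show a non-matching handle, and the helper names are hypothetical.

```python
import re
from collections import Counter

# Matches the r3 field of a probe fire; "r31=" cannot match because "=" must follow "r3".
R3_RE = re.compile(r"TRACE-PC-LR .*\br3=([0-9A-Fa-f]{8})\b")


def fires_per_handle(lines):
    """Count probe fires per r3 value (the event handle at the NtSetEvent thunk)."""
    counts = Counter()
    for line in lines:
        m = R3_RE.search(line)
        if m:
            counts[m.group(1).upper()] += 1
    return counts


def handles_with_fire_count(lines, n):
    """Return the handles that were signaled exactly n times, sorted."""
    return sorted(h for h, c in fires_per_handle(lines).items() if c == n)


sample_log = [
    "i> F8000054 TRACE-PC-LR pc=8284DF5C lr=824AA304 r3=F8000110 r5=BC32CC60 r31=7036FDC0",
    "i> F8000084 TRACE-PC-LR pc=8284DF5C lr=824AA304 r3=F8000110 r5=00000002 r31=705AF860",
    # Hypothetical filler line, not taken from canary-ntsetevent.log:
    "i> F8000054 TRACE-PC-LR pc=8284DF5C lr=824AA304 r3=F8000200 r5=00000000 r31=7036FDC0",
]
print(handles_with_fire_count(sample_log, 2))
```

On a real log the two-fire filter will likely return several handles. The structural hint from Phase 1 (the handle is a known dup target) is what narrows the list to one.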

lr=824AA304 is the post-bl PC inside sub_824AA2F0 (the NtSetEvent wrapper), so it identifies only the wrapper, not the wrapper's caller. To get the caller LR (i.e. who called the wrapper), probe the wrapper entry 0x824AA2F0.

Wrapper-entry probe — cross-run structural correlation

In the wrapper-entry run, the handle namespace shifted slightly (per-run slab-allocator nondeterminism), but the r31 stack invariant matches across runs.

The two-fire handle in the wrapper-entry run whose r31 stack frames match 7036FDC0 and 705AF860 exactly:

i> F8000054 TRACE-PC-LR pc=824AA2F0 lr=82458D14 r3=F8000118 r4=BC369420 r5=BC32CC60 r31=7036FDC0
i> F8000084 TRACE-PC-LR pc=824AA2F0 lr=8245ED80 r3=F8000118 r4=705AF8B0 r5=00000002 r31=705AF860

Cross-run match by (tid, r31): F8000054@7036FDC0 and F8000084@705AF860 are the same two threads/stack-frames signaling the cache-IO completion event in both runs.
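
The (tid, r31) correlation can be expressed as a small join that deliberately ignores the handle value, since handles shift between runs. This sketch assumes the fires have already been parsed into dicts and pre-filtered to the candidate handle in each run. The function names are hypothetical, and the data is the four fires quoted above.

```python
from collections import defaultdict


def index_by_thread_frame(fires):
    """Map (tid, r31) -> fires. The handle (r3) is intentionally not part of the key."""
    index = defaultdict(list)
    for fire in fires:
        index[(fire["tid"], fire["r31"])].append(fire)
    return index


def correlate_runs(thunk_fires, wrapper_fires):
    """Pair fires from two runs that share the same thread id and stack frame."""
    wrapper_index = index_by_thread_frame(wrapper_fires)
    matches = []
    for fire in thunk_fires:
        for other in wrapper_index.get((fire["tid"], fire["r31"]), []):
            matches.append({
                "tid": fire["tid"],
                "r31": fire["r31"],
                "thunk_handle": fire["r3"],
                "wrapper_handle": other["r3"],
                "caller_lr": other["lr"],
            })
    return matches


# Thunk-probe run (pc=8284DF5C): fires on dup handle F8000110.
thunk_run = [
    {"tid": 0xF8000054, "lr": 0x824AA304, "r3": 0xF8000110, "r31": 0x7036FDC0},
    {"tid": 0xF8000084, "lr": 0x824AA304, "r3": 0xF8000110, "r31": 0x705AF860},
]
# Wrapper-entry run (pc=824AA2F0): same frames, shifted handle F8000118.
wrapper_run = [
    {"tid": 0xF8000054, "lr": 0x82458D14, "r3": 0xF8000118, "r31": 0x7036FDC0},
    {"tid": 0xF8000084, "lr": 0x8245ED80, "r3": 0xF8000118, "r31": 0x705AF860},
]

matches = correlate_runs(thunk_run, wrapper_run)
for m in matches:
    print(f"{m['tid']:08X}@{m['r31']:08X}: "
          f"{m['thunk_handle']:08X} -> {m['wrapper_handle']:08X}, "
          f"caller LR {m['caller_lr']:08X}")
```

Keying on (tid, r31) only works as long as thread creation order and stack layout are stable across runs, which is what this note observed. It is not guaranteed in general.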

Resolved canary signalers

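| LR         | Caller function | Pre-bl insn                | Demangled               |
|------------|-----------------|----------------------------|-------------------------|
| 0x82458D14 | sub_82458B90    | bl 0x824AA2F0 @ 0x82458D10 | NtSetEvent wrapper call |
| 0x8245ED80 | sub_8245EC10    | bl 0x824AA2F0 @ 0x8245ED7C | NtSetEvent wrapper call |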

Both LRs are NtSetEvent-wrapper call sites. Each fires once per wedge instance.
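
The mapping from a captured LR back to its call site is plain address arithmetic: PowerPC instructions are a fixed 4 bytes, and `bl` stores the address of the following instruction in LR. A short sketch of that step follows, assuming that each `sub_XXXXXXXX` name encodes the function's entry address. The helper names are hypothetical.

```python
PPC_INSN_BYTES = 4  # PowerPC instructions are fixed-width 32-bit words


def bl_site_from_lr(lr):
    """Address of the bl whose return address is lr."""
    return lr - PPC_INSN_BYTES


def entry_from_name(name):
    """Assumption: 'sub_82458B90' means the function starts at 0x82458B90."""
    return int(name.removeprefix("sub_"), 16)


def describe_call_site(lr, caller_name):
    """Render an LR as '<caller>+0x<offset>' for the bl that produced it."""
    site = bl_site_from_lr(lr)
    offset = site - entry_from_name(caller_name)
    return f"{caller_name}+0x{offset:X} (bl @ 0x{site:08X})"


print(describe_call_site(0x82458D14, "sub_82458B90"))
print(describe_call_site(0x8245ED80, "sub_8245EC10"))
```

The same arithmetic reproduces the Phase 1 probe target: the bl at sub_821CB030+0x128 sits at 0x821CB158, so the post-bl PC is 0x821CB15C.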

Cross-reference with ours-side (sibling PROBE O findings)

From ours-summary.md (Phase 3 candidate-signaler table):

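| Producer     | Fires in ours | Distinct LRs                   | Notes                                                          |
|--------------|---------------|--------------------------------|----------------------------------------------------------------|
| sub_82458B90 | 1             | 0x82457F18 (sub_82457EF0+0x24) | direct NtSetEvent caller; fires once but NOT on wedge handle   |
| sub_8245EC10 | 0             |                                | 0 static callers; indirect-dispatch-only (audit-050 dead)      |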

Static caller chains in the ours-side database

sub_82458B90 callers:
  └─ sub_82457EF0+0x24 (only caller; sub_82457EF0 itself has 0 static callers — fnptr-array only)

sub_8245EC10 callers:
  └─ NONE STATICALLY
     Located in dispatch_table @ 0x820B5830 [slot 1]
       slot 0: sub_8245F1D0
       slot 1: sub_8245EC10
     Table referenced from:
       - sub_8245F1D0+0x1C  (self-ref recursive)
       - sub_8245FEB8+0x100 (stw r11, 0(r31) at 0x8245FFC0 — class vptr install)
     sub_8245FEB8 callers: sub_8245FB68 (2 sites), sub_824601A0 (1 site)
     sub_8245FB68 callers: sub_8245F880, sub_8245FAB0
     sub_824601A0 callers: sub_82460118

Both signaler functions live in the worker cluster 0x82458xxx-0x8245Exxx. sub_8245EC10 is the slot-1 entry of a 2-slot dispatch_table at 0x820B5830, which the constructor sub_8245FEB8 installs at struct offset 0 (the vptr). sub_82458B90's only static caller chain goes up through sub_82457EF0, which itself has 0 static callers.
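
The caller chains above amount to an upward walk over a reverse call graph. This is a minimal sketch of that walk, using only the edges listed in this note. The function `walk_callers` is a hypothetical helper, and it separates functions recorded with 0 static callers from functions whose callers this note simply does not list.

```python
# Reverse call graph transcribed from the chains above.
# An empty list means "0 static callers"; a missing key means "not listed here".
STATIC_CALLERS = {
    "sub_82458B90": ["sub_82457EF0"],
    "sub_82457EF0": [],            # fnptr-array only
    "sub_8245EC10": [],            # reached via dispatch_table slot 1 only
    "sub_8245FEB8": ["sub_8245FB68", "sub_824601A0"],
    "sub_8245FB68": ["sub_8245F880", "sub_8245FAB0"],
    "sub_824601A0": ["sub_82460118"],
}


def walk_callers(fn, callers):
    """Walk upward from fn; return (dead_ends, unexplored) as two sets."""
    dead_ends, unexplored, seen = set(), set(), set()
    stack = [fn]
    while stack:
        cur = stack.pop()
        if cur in seen:
            continue  # guards against recursive or shared callers
        seen.add(cur)
        if cur not in callers:
            unexplored.add(cur)      # chain continues outside the recorded data
        elif not callers[cur]:
            dead_ends.add(cur)       # statically unreachable from here
        else:
            stack.extend(callers[cur])
    return dead_ends, unexplored


for root in ("sub_82458B90", "sub_8245EC10", "sub_8245FEB8"):
    dead, unknown = walk_callers(root, STATIC_CALLERS)
    print(root, "dead ends:", sorted(dead), "unexplored:", sorted(unknown))
```

Keeping the two result sets apart avoids overstating the finding: a function that is merely absent from the table is reported as unexplored, not as dead.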

Findings

  1. Wedge structural identification: sub_821CB030+0x128 creates a per-call file-IO completion XEvent that is immediately duplicated and submitted to a worker (sub_82452DC0 @ +0x19C) for asynchronous file load. The wait at +0x1AC blocks until the worker signals the duplicate XEvent.

  2. Canary signalers (the missing piece): Two distinct call-sites signal the wedge in canary:

    • sub_82458B90 (= LR 0x82458D14)
    • sub_8245EC10 (= LR 0x8245ED80)

    Both contain a bl 0x824AA2F0 (the NtSetEvent wrapper). Each fires once per file-IO completion.

  3. Static-graph triangulation for ours:

    • sub_82458B90 has 1 static caller (sub_82457EF0+0x24); chain dies because sub_82457EF0 has 0 static callers (fnptr-array activation).
    • sub_8245EC10 has 0 static callers — vtable slot 1 in dispatch_table 0x820B5830, installed by sub_8245FEB8 ctor; ctor's reachability chain also dies in the 0x82458xxx-0x8245Fxxx cluster.
  4. The wedge is downstream of AUDIT-050's unreachability island. Both canary signalers live in the half-bootstrapped worker cluster. The work-submitter (sub_82452DC0) DOES fire in ours (8× per PROBE O) on tid=13 — but the queued work never reaches a worker that calls sub_82458B90 or sub_8245EC10 because the worker-side dispatch infrastructure (vtable install via sub_8245FEB8 ctor; fnptr-array activation of sub_82457EF0) never runs in ours.

  5. AUDIT-058's sub_825070F0 activation hypothesis is corroborated: sub_825070F0 (AUDIT-057's top missing-thread spawner, 4 workers @ ctx 0xBCE25340) is the plausible bootstrap for the workers that would receive the queued work and run the dispatch_table @ 0x820B5830 callbacks. Until that spawn happens in ours, the worker side stays dead → signal never lands.
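
The wedge described in these findings has a simple shape: work is queued with a completion event, but no worker exists to signal it. The toy model below illustrates only that shape, using Python's `threading.Event` as a stand-in for the duplicated XEvent. It is an analogy, not a reconstruction of the guest code, and `run_load` is a hypothetical name. A finite timeout replaces the guest's INFINITE wait so the example terminates.

```python
import threading


def run_load(spawn_worker, timeout_s=1.0):
    """Queue one job with a completion event; return True if the wait was signaled."""
    done = threading.Event()   # stands in for the duplicated XEvent
    queue = [done]             # work submission: the job carries its completion event

    worker = None
    if spawn_worker:
        # Worker side: dequeue the job and signal completion (the NtSetEvent analogue).
        worker = threading.Thread(target=lambda: queue.pop().set())
        worker.start()

    # Waiter side: the guest waits with an INFINITE timeout; a finite one is used
    # here so the no-worker case returns False instead of hanging.
    signaled = done.wait(timeout_s)
    if worker is not None:
        worker.join()
    return signaled


print("workers bootstrapped:", run_load(spawn_worker=True))
print("workers never spawned:", run_load(spawn_worker=False, timeout_s=0.05))
```

In the second call the job stays in the queue and the wait times out, which corresponds to the submitter firing in ours while the signal never lands.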

Recommended next probes

  1. Direct path: probe sub_82452DC0+0x19C bl site in canary (with our existing --log_lr_on_pc=0x82452E5C or post-bl PC) to trace what happens after work submission. Find which worker thread (one of the 4 spawned by sub_825070F0) dequeues the job and ultimately calls sub_82458B90 or sub_8245EC10.

  2. Indirect path: probe sub_8245FEB8 (vptr installer for dispatch_table 0x820B5830) in canary AND ours. If it fires in canary but not ours, that confirms the worker-class constructor is in the unreachability island.

  3. Bootstrap path: trace what activates sub_825070F0 in canary (per AUDIT-058 it fires 1× post-\\dat\\movie ResolvePath). Capture LR at sub_825070F0 entry in canary, then check that LR's caller-fn for fire count in ours.

Artifacts

xenia-rs/audit-runs/audit-059-gamma-wedge/
  canary-patches-applied.diff   (audit-030 patch record before revert)
  canary-ntcreate.log/.err      (Phase 1: PC 0x821CB15C, 2 fires)
  canary-waitsite.log/.err      (Phase 1b: PC 0x821CB1DC, 2 fires)
  canary-ntsetevent.log/.err    (NtSetEvent thunk PC 0x8284DF5C; 53,701 fires; r3=F8000110 ×2)
  canary-setwrapper.log/.err    (NtSetEvent wrapper PC 0x824AA2F0; 20,919 fires; r3=F8000118 ×2)
  canary-summary.md             (this file)
  ours-summary.md               (sibling PROBE O ours-side findings)

Canary HEAD verified 6de80dffe, working tree clean. xenia-rs untouched.