xenia-rs/audit-runs/iterate-2BA-canary-tid2/FINDINGS.md
MechaCat02 ef93a4fa14 handoff: VSync/event-wedge fixes + iterate 2.A–2.BC research notes
Source changes (dormant parity infra, retained from iterate 2.AI/2.AO):
- xenia-kernel/exports.rs: nt_create_event manual_reset polarity +
  related event wiring
- xenia-gpu/mmio_region.rs: D1MODE_VBLANK_VLINE_STATUS hardcode parity

Also lands the audit-runs/ analysis notes (.md/.txt/.json digests) for the
iterate 2.x VSync/0x10e8/0x1004 wedge investigation. Raw trace dumps
(.jsonl/.gz/.csv/.stdout) and agent worktrees (.claude/) are gitignored as
regenerable local artifacts — see memory + HANDOFF for the running findings.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-05 07:19:08 +02:00


2.BA — canary tid=2 frame-sync producer, RESOLVED

tid=2 identity

  • canary tid=2 = "GPU Frame limiter" XHostThread (graphics_system.cc:146-238).
  • NO thread.create record in either canary trace (phase-c22 88s, phase-d-stage1 115s): it is a HOST thread (new kernel::XHostThread(... GetIdleProcess())), not spawned via guest NtCreateThread, so it has no entry_pc/ctx in the guest thread.create stream.
  • priority = kLowest. Active from 1.667s (BEFORE game main loop tid=1's first import at 2.13s).
  • Whole-run behavior: 4660 (c22) / 6667 (d-stage1) events, ALL NtSetEvent, nothing else. No wait.begin, no other import. 4596/4660 inter-call gaps == 16.667ms == steady 60Hz.
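
The cadence claim above can be re-checked from any list of event timestamps. A minimal sketch, assuming timestamps in milliseconds; the function name, the tolerance, and the sample values are hypothetical and not taken from the traces:

```rust
/// Count the gaps between consecutive timestamps that fall within `tol_ms`
/// of the expected period. Returns (steady_gaps, total_gaps).
fn count_steady_gaps(timestamps_ms: &[f64], period_ms: f64, tol_ms: f64) -> (usize, usize) {
    let mut steady = 0;
    let mut total = 0;
    for pair in timestamps_ms.windows(2) {
        let gap = pair[1] - pair[0];
        total += 1;
        if (gap - period_ms).abs() <= tol_ms {
            steady += 1;
        }
    }
    (steady, total)
}

fn main() {
    // Hypothetical sample: three on-cadence gaps, then one late call.
    let ts = [1667.0, 1683.667, 1700.334, 1717.001, 1760.0];
    let (steady, total) = count_steady_gaps(&ts, 16.667, 0.05);
    println!("{steady}/{total} gaps at ~60Hz");
}
```

A tolerance is needed because recorded timestamps are rarely exact multiples of the period; the right value depends on the trace's timer resolution.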

producer mechanism

  • Loop body (graphics_system.cc:177-230): every ~16.6ms -> MarkVblank() -> NanoSleep.
  • MarkVblank() (line 364) -> DispatchInterruptCallback(source=0, cpu=2) (line 373) -> KernelState::EmulateCPInterruptDPC (kernel_state.cc:1365).
  • EmulateCPInterruptDPC line 1400: processor_->Execute(thread_state, interrupt_callback, {source,user_data}) runs the GUEST VSync ISR (sub_824BE9A0, registered via VdSetGraphicsInterruptCallback by guest tid=6 @1.577s) SYNCHRONOUSLY ON tid=2.
  • The ISR's downstream issues NtSetEvent on the frame-sync event => the 4660x 60Hz signal.
  • The wait BETWEEN signals is a host NanoSleep inside the frame-limiter loop, NOT a guest wait — that is why tid=2 shows ZERO wait.begin / zero other imports in the trace.
  • args_resolved is EMPTY for NtSetEvent in every canary trace, so the exact target handle/SID cannot be read from the trace; per 2.AR it is canary's frame-sync Event a45a5f48bc88eccc (raw 0xf8000114), the analog of our Event 0x10e8 (SID 9ad1bebb6cae28c4).
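
The producer loop above can be sketched in Rust. This is a simplified model, not canary's code: run_frame_limiter and the dispatch closure are hypothetical stand-ins for the frame-limiter loop and DispatchInterruptCallback, and the frame count is bounded so the example terminates.

```rust
use std::thread;
use std::time::{Duration, Instant};

/// Run `frames` iterations of a frame-limiter loop on the calling host thread.
/// `dispatch_interrupt(source, cpu)` stands in for running the guest ISR.
fn run_frame_limiter<F: FnMut(u32, u32)>(frames: u32, period: Duration, mut dispatch_interrupt: F) {
    let mut next = Instant::now();
    for _ in 0..frames {
        // "MarkVblank": the ISR runs synchronously on this thread.
        dispatch_interrupt(0, 2);
        next += period;
        // Host-side sleep until the next deadline; no guest wait is involved.
        let now = Instant::now();
        if next > now {
            thread::sleep(next - now);
        }
    }
}

fn main() {
    let mut signals = 0u32;
    run_frame_limiter(3, Duration::from_micros(16_667), |source, _cpu| {
        if source == 0 {
            signals += 1; // stands in for the ISR's downstream NtSetEvent
        }
    });
    println!("signals = {signals}");
}
```

Because the sleep happens on the host side, a guest-level trace of this thread would show only the signal calls, which matches the zero wait.begin observation.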

+44 contradiction — VERDICT

  • The 60Hz signal is driven by EmulateCPInterruptDPC running the guest VSync ISR on the host frame-limiter thread. It IS the VSync-ISR / opt_callback path, executed at 60Hz — NOT a separate guest "signaller thread", and NOT a path that bypasses opt_callback.
  • ISR (sub_824BE9A0) branches on source: source==0 (vsync, both engines) takes the 0x824BEA30 block: MMIO gate [reg 0x1951 bit0] @0x824BEA38-44 (canary hardcodes return 1), then opt_callback at [user_data+15144]==0x822F2248 @0x824BEA80-AA8.
  • opt_callback 0x822F2248 dispatches vtable[+28] then atomic-enqueues into +84/+88 and branches on a state field — it is a work-enqueue/dispatch, it does NOT dead-end on +44.
  • The +44 NULL (2.AT/2.AV) is on 0x821753C8, a DIFFERENT method reached via a different vtable[+0x1C]; 2.AT conflated it with the real opt_callback 0x822F2248. The +44 chase is DEAD: canary signals 0x10e8-analog 4660x THROUGH 0x822F2248, never through the +44 slot.
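
The ISR control flow described above, as a small Rust model. This is a sketch under stated assumptions: GuestModel and IsrOutcome are hypothetical names, and the behavior when the gate bit is 0 or the callback slot is empty is assumed for illustration, not read from the disassembly.

```rust
#[derive(Debug, PartialEq)]
enum IsrOutcome {
    NotVsync,   // source != 0: takes some other branch, not modelled here
    GateClosed, // ASSUMPTION: bit0 of reg 0x1951 clear skips the callback
    NoCallback, // ASSUMPTION: empty slot at user_data+15144 skips the callback
    Enqueued,   // opt_callback ran and enqueued work
}

struct GuestModel {
    reg_1951: u32,             // MMIO register read by the gate
    opt_callback: Option<u32>, // guest address stored at user_data+15144
    enqueued: u32,             // stands in for the +84/+88 work queue
}

impl GuestModel {
    fn vsync_isr(&mut self, source: u32) -> IsrOutcome {
        if source != 0 {
            return IsrOutcome::NotVsync;
        }
        if (self.reg_1951 & 1) == 0 {
            return IsrOutcome::GateClosed;
        }
        match self.opt_callback {
            Some(_addr) => {
                self.enqueued += 1;
                IsrOutcome::Enqueued
            }
            None => IsrOutcome::NoCallback,
        }
    }
}

fn main() {
    let mut g = GuestModel { reg_1951: 1, opt_callback: Some(0x822F_2248), enqueued: 0 };
    for _ in 0..3 {
        g.vsync_isr(0);
    }
    println!("enqueued = {}", g.enqueued);
}
```

Note that the +44 slot does not appear anywhere in this model, which is the point of the verdict.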

ours-side analog status

  • Ours has NO dedicated host frame-limiter thread (no XHostThread analog of tid=2).
  • Ours fires vsync from coord_pre_round (main.rs:2455) via tick_vsync_instr (interrupts.rs:237, 150k-instr-gated in lockstep) and injects the ISR onto a victim GUEST thread via try_inject_graphics_interrupt (main.rs:3729).
  • The signal PATH in ours is present and wired: reg 0x1951 already hardcoded to 1 (2.AO), opt_callback +15144 IS installed once at boot (2.AP: setter 0x824C1920 fires 1x on tid=8, installs r4=0x822F2248, r3=0xBE8C8F00).
  • GAP: ISR DELIVERY CADENCE, not the signal path. Ours fires the ISR ~67-77x in early boot then STOPS (2.AV/2.AQ), because (a) the instruction-count accumulator stops crossing 150k once guest threads block/exit and the instruction stream slows, and (b) try_inject_graphics_interrupt needs a Ready/Blocked victim; once all threads are Idle/Exited it drops the vsync. Canary's host thread fires unconditionally at wall-clock 60Hz regardless of guest scheduler state.
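
The two stall conditions in the GAP bullet can be reproduced with a toy model. A minimal sketch: InstrGate, has_victim, and delivered are hypothetical names, only the 150k threshold comes from the notes, and the real tick_vsync_instr may handle the accumulator differently.

```rust
#[derive(Clone, Copy, PartialEq)]
enum ThreadState { Ready, Blocked, Idle, Exited }

/// Instruction-count gate: fires when the accumulator reaches the threshold.
struct InstrGate { acc: u64, threshold: u64 }

impl InstrGate {
    fn tick(&mut self, retired: u64) -> bool {
        self.acc += retired;
        if self.acc >= self.threshold {
            self.acc -= self.threshold;
            true
        } else {
            false
        }
    }
}

/// Injection needs at least one Ready or Blocked guest thread.
fn has_victim(threads: &[ThreadState]) -> bool {
    threads.iter().any(|s| matches!(s, ThreadState::Ready | ThreadState::Blocked))
}

/// Count vsyncs that are both generated by the gate and deliverable.
fn delivered(gate: &mut InstrGate, rounds: &[(u64, &[ThreadState])]) -> u32 {
    let mut n = 0;
    for (retired, threads) in rounds {
        if gate.tick(*retired) && has_victim(threads) {
            n += 1;
        }
    }
    n
}

fn main() {
    let busy = [ThreadState::Ready, ThreadState::Blocked];
    let idle = [ThreadState::Idle, ThreadState::Exited];
    // Two busy boot rounds, then the guest stalls and retires almost nothing.
    let rounds: [(u64, &[ThreadState]); 4] =
        [(150_000, &busy[..]), (150_000, &busy[..]), (10, &idle[..]), (10, &idle[..])];
    let mut gate = InstrGate { acc: 0, threshold: 150_000 };
    println!("delivered {} of {}", delivered(&mut gate, &rounds), rounds.len());
}
```

Once the guest stalls, both conditions fail together: the gate stops crossing the threshold and no victim is available, so the count stays flat no matter how much wall-clock time passes.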

R1 fix surface (named)

Make ours deliver the VSync ISR at a steady ~60Hz INDEPENDENT of guest-thread scheduler state — mirror canary's dedicated frame-limiter cadence — so the ISR -> opt_callback -> NtSetEvent(0x10e8) chain keeps firing after boot. Surface:

  • crates/xenia-kernel/src/interrupts.rs (tick_vsync_instr / tick_vsync_wallclock) + crates/xenia-app/src/main.rs coord_pre_round + try_inject_graphics_interrupt: guarantee a vsync is queued AND delivered every ~16.6ms-equivalent even when the guest is wedged/idle (e.g. a guaranteed cadence tied to the coordinator loop, plus a victim fallback that can run the ISR even with no Ready/Blocked guest thread, or a Halted-main re-entry). ~20-60 LOC MEDIUM. NOT a +44 crowbar, NOT a force-install.
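
One possible shape for the fix surface, as a hedged sketch rather than the actual patch. WallclockVsync, Victim, and pick_victim are hypothetical names; time is passed in as plain nanoseconds so the logic stays deterministic, and the skip-missed-periods policy is an assumption.

```rust
const VSYNC_PERIOD_NS: u64 = 16_666_667; // ~60Hz

#[derive(Clone, Copy, PartialEq)]
enum ThreadState { Ready, Blocked, Idle, Exited }

#[derive(Debug, PartialEq)]
enum Victim {
    Guest(usize), // index of a Ready/Blocked guest thread
    HostFallback, // run the ISR even with no eligible guest thread
}

/// Wall-clock vsync source, independent of guest instruction counts.
struct WallclockVsync { next_due_ns: u64 }

impl WallclockVsync {
    fn new(start_ns: u64) -> Self {
        Self { next_due_ns: start_ns + VSYNC_PERIOD_NS }
    }

    /// True if a vsync is due at `now_ns`. Missed periods are skipped
    /// rather than replayed as a burst (an assumed policy).
    fn poll(&mut self, now_ns: u64) -> bool {
        if now_ns < self.next_due_ns {
            return false;
        }
        while self.next_due_ns <= now_ns {
            self.next_due_ns += VSYNC_PERIOD_NS;
        }
        true
    }
}

/// Prefer a Ready/Blocked guest thread; otherwise fall back instead of dropping.
fn pick_victim(threads: &[ThreadState]) -> Victim {
    threads
        .iter()
        .position(|s| matches!(s, ThreadState::Ready | ThreadState::Blocked))
        .map(Victim::Guest)
        .unwrap_or(Victim::HostFallback)
}

fn main() {
    let mut vsync = WallclockVsync::new(0);
    let wedged = [ThreadState::Idle, ThreadState::Exited];
    // Coordinator loop polled every 5ms for 50ms while the guest is wedged.
    let mut fired = 0;
    for step in 0..=10u64 {
        if vsync.poll(step * 5_000_000) {
            fired += 1;
            let _victim = pick_victim(&wedged); // HostFallback in this scenario
        }
    }
    println!("fired = {fired}");
}
```

Polling from the coordinator loop keeps the cadence tied to wall-clock time, and the fallback arm is what stops the vsync from being dropped when every guest thread is Idle or Exited.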