xenia-rs/audit-runs/iterate-2BA-canary-tid2/FINDINGS.md
MechaCat02 ef93a4fa14 handoff: VSync/event-wedge fixes + iterate 2.A–2.BC research notes
Source changes (dormant parity infra, retained from iterate 2.AI/2.AO):
- xenia-kernel/exports.rs: nt_create_event manual_reset polarity +
  related event wiring
- xenia-gpu/mmio_region.rs: D1MODE_VBLANK_VLINE_STATUS hardcode parity

Also lands the audit-runs/ analysis notes (.md/.txt/.json digests) for the
iterate 2.x VSync/0x10e8/0x1004 wedge investigation. Raw trace dumps
(.jsonl/.gz/.csv/.stdout) and agent worktrees (.claude/) are gitignored as
regenerable local artifacts — see memory + HANDOFF for the running findings.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-05 07:19:08 +02:00


# 2.BA — canary tid=2 frame-sync producer, RESOLVED
## tid=2 identity
- canary tid=2 = "GPU Frame limiter" XHostThread (graphics_system.cc:146-238).
- NO thread.create record in either canary trace (phase-c22 88s, phase-d-stage1 115s):
it is a HOST thread (`new kernel::XHostThread(... GetIdleProcess())`), not spawned
via guest NtCreateThread, so it has no entry_pc/ctx in the guest thread.create stream.
- priority = kLowest. Active from 1.667s (BEFORE game main loop tid=1's first import at 2.13s).
- Whole-run behavior: 4660 (c22) / 6667 (d-stage1) events, ALL NtSetEvent, nothing else.
No wait.begin, no other import. In c22, 4596/4660 inter-call gaps == 16.667ms, i.e. steady 60Hz.
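
A minimal sketch of how the cadence figure above can be re-derived from a list of event timestamps. The function name, the nanosecond unit, and the tolerance parameter are assumptions for illustration; the actual trace schema is not shown in these notes.

```rust
/// Nominal 60 Hz frame period in nanoseconds (1e9 / 60, rounded).
const FRAME_PERIOD_NS: u64 = 16_666_667;

/// Returns (gaps_on_cadence, total_gaps) for event timestamps in ascending order.
/// A gap is "on cadence" when it lies within `tolerance_ns` of one frame period.
/// ASSUMPTION: timestamps are nanoseconds; adapt to the real trace units.
fn cadence_stats(timestamps_ns: &[u64], tolerance_ns: u64) -> (usize, usize) {
    let mut on_cadence = 0;
    let mut total = 0;
    // windows(2) yields every adjacent pair, i.e. one entry per inter-call gap.
    for pair in timestamps_ns.windows(2) {
        let gap = pair[1].saturating_sub(pair[0]);
        total += 1;
        if gap.abs_diff(FRAME_PERIOD_NS) <= tolerance_ns {
            on_cadence += 1;
        }
    }
    (on_cadence, total)
}
```

Feeding it the NtSetEvent timestamps of tid=2 with a small tolerance (for example 50_000 ns) gives the on-cadence count next to the total gap count.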
## producer mechanism
- Loop body (graphics_system.cc:177-230): every ~16.6ms -> MarkVblank() -> NanoSleep.
- MarkVblank() (line 364) -> DispatchInterruptCallback(source=0, cpu=2) (line 373)
-> KernelState::EmulateCPInterruptDPC (kernel_state.cc:1365).
- EmulateCPInterruptDPC line 1400: processor_->Execute(thread_state, interrupt_callback, {source,user_data})
runs the GUEST VSync ISR (sub_824BE9A0, registered via VdSetGraphicsInterruptCallback
by guest tid=6 @1.577s) SYNCHRONOUSLY ON tid=2.
- The ISR's downstream issues NtSetEvent on the frame-sync event => the 4660x 60Hz signal.
- The wait BETWEEN signals is a host NanoSleep inside the frame-limiter loop, NOT a guest
wait — that is why tid=2 shows ZERO wait.begin / zero other imports in the trace.
- args_resolved is EMPTY for NtSetEvent in every canary trace, so the exact target handle/SID
cannot be read from the trace. Per 2.AR the target is canary's frame-sync Event a45a5f48bc88eccc
(raw 0xf8000114), the analog of Event 0x10e8 (SID 9ad1bebb6cae28c4) in ours.
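
The mechanism above can be modelled as a toy, to show why the trace holds only NtSetEvent records for tid=2. This is a sketch, not canary's code: `FrameLimiter`, `TraceEvent`, `host_sleeps`, and `guest_vsync_isr` are hypothetical names, and the ISR body is reduced to a single signal.

```rust
/// Guest-visible trace records. Host sleeps never appear here.
#[derive(Debug, PartialEq)]
enum TraceEvent {
    NtSetEvent,
}

/// Signature of the registered guest interrupt callback in this model.
type IsrFn = fn(u32, u32, &mut Vec<TraceEvent>);

/// Toy stand-in for the host frame-limiter thread.
struct FrameLimiter {
    callback: Option<IsrFn>,
    user_data: u32,
    trace: Vec<TraceEvent>, // what a guest-import trace would record
    host_sleeps: usize,     // host-side pacing, invisible to the guest trace
}

impl FrameLimiter {
    fn new() -> Self {
        Self { callback: None, user_data: 0, trace: Vec::new(), host_sleeps: 0 }
    }

    /// Models VdSetGraphicsInterruptCallback: remember the ISR and its user data.
    fn set_callback(&mut self, cb: IsrFn, user_data: u32) {
        self.callback = Some(cb);
        self.user_data = user_data;
    }

    /// Models MarkVblank: run the ISR synchronously on this (host) thread, source=0.
    fn mark_vblank(&mut self) {
        if let Some(cb) = self.callback {
            cb(0, self.user_data, &mut self.trace);
        }
    }

    /// Models the loop body: vblank, then a host sleep of about one frame.
    fn run(&mut self, frames: usize) {
        for _ in 0..frames {
            self.mark_vblank();
            self.host_sleeps += 1; // a real loop would sleep ~16.6 ms here
        }
    }
}

/// Stand-in for the guest VSync ISR: on source 0 it ends in one NtSetEvent.
fn guest_vsync_isr(source: u32, _user_data: u32, trace: &mut Vec<TraceEvent>) {
    if source == 0 {
        trace.push(TraceEvent::NtSetEvent);
    }
}
```

After `set_callback(guest_vsync_isr, 0)` and `run(n)`, the model's trace holds n NtSetEvent entries and nothing else, while the n sleeps are counted only on the host side.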
## +44 contradiction — VERDICT
- The 60Hz signal is driven by EmulateCPInterruptDPC running the guest VSync ISR on the
host frame-limiter thread. It IS the VSync-ISR / opt_callback path, executed at 60Hz —
NOT a separate guest "signaller thread", and NOT a path that bypasses opt_callback.
- ISR (sub_824BE9A0) branches on source: source==0 (vsync, both engines) takes the
0x824BEA30 block: MMIO gate [reg 0x1951 bit0] @0x824BEA38-44 (canary hardcodes return 1),
then opt_callback at [user_data+15144]==0x822F2248 @0x824BEA80-AA8.
- opt_callback 0x822F2248 dispatches vtable[+28] then atomic-enqueues into +84/+88 and
branches on a state field — it is a work-enqueue/dispatch, it does NOT dead-end on +44.
- The +44 NULL (2.AT/2.AV) is on 0x821753C8, a DIFFERENT method reached via a different
vtable[+0x1C]; 2.AT conflated it with the real opt_callback 0x822F2248. The +44 chase is
DEAD: canary signals 0x10e8-analog 4660x THROUGH 0x822F2248, never through the +44 slot.
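
The control flow of the source==0 branch, reduced to a sketch. The register index and callback address come from the notes above; the function shape, the `read_reg` closure, and the behaviour when the gate bit is clear are assumptions, since the disassembly itself is not reproduced here.

```rust
/// MMIO register index polled on the vsync branch (from the notes).
const VBLANK_STATUS_REG: u32 = 0x1951;

/// Simplified model of the ISR's vsync branch.
/// Returns the opt_callback address that would be invoked, if any.
fn vsync_isr_target(
    source: u32,
    read_reg: impl Fn(u32) -> u32,
    opt_callback: Option<u32>,
) -> Option<u32> {
    if source != 0 {
        // Other interrupt sources take a different block (not modelled).
        return None;
    }
    if (read_reg(VBLANK_STATUS_REG) & 1) == 0 {
        // ASSUMPTION: a clear gate bit skips the callback.
        return None;
    }
    // Slot at [user_data+15144]; None models "not installed".
    opt_callback
}
```

With the gate hardcoded to 1 and the slot holding 0x822F2248, every source==0 interrupt reaches that callback; the +44 slot plays no part in this path.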
## ours-side analog status
- Ours has NO dedicated host frame-limiter thread (no XHostThread analog of tid=2).
- Ours fires vsync from coord_pre_round (main.rs:2455) via tick_vsync_instr
(interrupts.rs:237, 150k-instr-gated in lockstep) and injects the ISR onto a victim
GUEST thread via try_inject_graphics_interrupt (main.rs:3729).
- The signal PATH in ours is present and wired: reg 0x1951 already hardcoded to 1 (2.AO),
opt_callback +15144 IS installed once at boot (2.AP: setter 0x824C1920 fires 1x on tid=8,
installs r4=0x822F2248, r3=0xBE8C8F00).
- GAP: ISR DELIVERY CADENCE, not the signal path. Ours fires the ISR ~67-77x in early boot
then STOPS (2.AV/2.AQ), for two reasons: (a) the instruction-count accumulator stops crossing
150k once guest threads block/exit and the instruction stream slows, and (b)
try_inject_graphics_interrupt needs a Ready/Blocked victim, so once threads go Idle/Exit it
drops the vsync. Canary's host thread fires unconditionally at wall-clock 60Hz regardless of
guest scheduler state.
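
A toy model of the two failure modes (a) and (b), assuming the gate is a plain accumulator with a 150k threshold and that victim selection takes the first Ready/Blocked thread. All names here are hypothetical; the real tick_vsync_instr and try_inject_graphics_interrupt may differ in detail.

```rust
/// Instruction-count threshold for queuing a vsync (150k, from the notes).
const VSYNC_INSTR_THRESHOLD: u64 = 150_000;

#[derive(Clone, Copy, Debug)]
enum ThreadState {
    Ready,
    Blocked,
    Idle,
    Exited,
}

/// ASSUMPTION: the gate accumulates retired instructions and fires on crossing.
struct VsyncGate {
    accumulated: u64,
}

impl VsyncGate {
    /// Adds retired instructions; returns true when a vsync should be queued.
    fn tick(&mut self, retired: u64) -> bool {
        self.accumulated += retired;
        if self.accumulated >= VSYNC_INSTR_THRESHOLD {
            self.accumulated -= VSYNC_INSTR_THRESHOLD;
            true
        } else {
            false
        }
    }
}

/// Picks the first thread that can host the ISR, if any.
fn pick_victim(threads: &[ThreadState]) -> Option<usize> {
    threads
        .iter()
        .position(|s| matches!(s, ThreadState::Ready | ThreadState::Blocked))
}

/// Counts ISRs actually delivered over a series of coordinator rounds.
/// Each round is (instructions retired, guest thread states).
fn count_delivered(rounds: &[(u64, Vec<ThreadState>)]) -> usize {
    let mut gate = VsyncGate { accumulated: 0 };
    let mut delivered = 0;
    for (retired, threads) in rounds {
        // A vsync that crosses the threshold but finds no victim is dropped.
        if gate.tick(*retired) && pick_victim(threads).is_some() {
            delivered += 1;
        }
    }
    delivered
}
```

In this model delivery stops as soon as either the per-round instruction count stays low or every thread is Idle/Exited, which matches the "fires early, then stops" pattern described above.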
## R1 fix surface (named)
Make ours deliver the VSync ISR at a steady ~60Hz INDEPENDENT of guest-thread scheduler
state — mirror canary's dedicated frame-limiter cadence — so the ISR -> opt_callback ->
NtSetEvent(0x10e8) chain keeps firing after boot. Surface:
- crates/xenia-kernel/src/interrupts.rs (tick_vsync_instr / tick_vsync_wallclock) +
crates/xenia-app/src/main.rs coord_pre_round + try_inject_graphics_interrupt:
guarantee a vsync is queued AND delivered every ~16.6ms-equivalent even when the guest
is wedged/idle (e.g. a guaranteed cadence tied to the coordinator loop, plus a victim
fallback that can run the ISR even with no Ready/Blocked guest thread, or a Halted-main
re-entry). ~20-60 LOC MEDIUM. NOT a +44 crowbar, NOT a force-install.
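
One possible shape for the cadence half of this fix, as a sketch only. `VsyncPacer`, `Delivery`, `choose_delivery`, and the catch-up cap are hypothetical names and choices, not existing xenia-rs code. The clock is passed in by the caller, so it can be wall-clock time or a virtual "16.6ms-equivalent" clock in lockstep runs.

```rust
/// Nominal 60 Hz frame period in nanoseconds.
const FRAME_PERIOD_NS: u64 = 16_666_667;

/// Hypothetical pacer polled once per coordinator round.
struct VsyncPacer {
    next_deadline_ns: u64,
}

impl VsyncPacer {
    fn new(start_ns: u64) -> Self {
        Self { next_deadline_ns: start_ns + FRAME_PERIOD_NS }
    }

    /// Returns how many vsyncs are due at `now_ns`, independent of guest state.
    /// At most `max_catch_up` are reported so a long stall does not cause a burst.
    fn poll(&mut self, now_ns: u64, max_catch_up: u32) -> u32 {
        let mut due = 0;
        while now_ns >= self.next_deadline_ns && due < max_catch_up {
            self.next_deadline_ns += FRAME_PERIOD_NS;
            due += 1;
        }
        // Discard any remaining backlog instead of replaying it later.
        if now_ns >= self.next_deadline_ns {
            let missed = (now_ns - self.next_deadline_ns) / FRAME_PERIOD_NS + 1;
            self.next_deadline_ns += missed * FRAME_PERIOD_NS;
        }
        due
    }
}

/// Where the ISR runs for one vsync.
#[derive(Debug, PartialEq)]
enum Delivery {
    OnGuestThread(usize),
    /// ASSUMPTION: some context exists that can run the ISR with no guest victim.
    OnFallbackContext,
}

/// Never drops a vsync: falls back when no Ready/Blocked victim is available.
fn choose_delivery(victim: Option<usize>) -> Delivery {
    match victim {
        Some(index) => Delivery::OnGuestThread(index),
        None => Delivery::OnFallbackContext,
    }
}
```

The intent is that coord_pre_round would call `poll` with the current clock value and run `choose_delivery` once per due vsync, so the ISR keeps firing even when every guest thread is idle or exited.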