Source changes (dormant parity infra, retained from iterate 2.AI/2.AO):
- xenia-kernel/exports.rs: nt_create_event manual_reset polarity + related event wiring
- xenia-gpu/mmio_region.rs: D1MODE_VBLANK_VLINE_STATUS hardcode parity

Also lands the audit-runs/ analysis notes (.md/.txt/.json digests) for the iterate 2.x VSync/0x10e8/0x1004 wedge investigation. Raw trace dumps (.jsonl/.gz/.csv/.stdout) and agent worktrees (.claude/) are gitignored as regenerable local artifacts; see memory + HANDOFF for the running findings.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
3.1 KiB
2.AX — Why ours's VSync ISR stops after cycle 7.46M
Mechanism: HOST-TICKER-STALL (lockstep ticker keyed to guest instruction progress)
- Audit run = LOCKSTEP -> coord_pre_round uses tick_vsync_instr(stats.instruction_count) (main.rs:2457), fires 1 VSYNC per 150K (VSYNC_INSTR_PERIOD) guest instructions. stats.instruction_count is bumped ONLY by real guest execution (main.rs:2868/2945/3056).
- When round_schedule() returns empty (ALL threads Blocked/Exited) the round skips execution and calls coord_idle_advance (main.rs:3193), which advances guest timebase (scheduler.rs:1189-1196) for timer deadlines but NEVER bumps instruction_count.
- => once tid=1 wedges on Event 0x10e8 and every other thread is Blocked/Exited, the guest executes 0 instructions/round, instruction_count FREEZES, tick_vsync_instr delta=0 -> no VSYNC queued -> try_inject_graphics_interrupt has nothing to inject -> ISR stops.
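The stall described above can be reproduced in a toy model. The sketch below is not the code in main.rs: InstrTicker, simulate_lockstep_rounds, and the 1,000-tick idle advance are hypothetical stand-ins, and only the 150K period and the "idle rounds move timebase but not instruction_count" rule come from the notes.

```rust
/// Guest instructions per VSYNC, as stated in the note (VSYNC_INSTR_PERIOD).
const VSYNC_INSTR_PERIOD: u64 = 150_000;

/// Hypothetical stand-in for the lockstep ticker; not the real main.rs type.
struct InstrTicker {
    /// Instruction count at which the last VSYNC was accounted for.
    last_fire_count: u64,
}

impl InstrTicker {
    /// Returns how many VSYNCs are due at the given guest instruction count.
    fn tick_vsync_instr(&mut self, instruction_count: u64) -> u64 {
        let due = (instruction_count - self.last_fire_count) / VSYNC_INSTR_PERIOD;
        self.last_fire_count += due * VSYNC_INSTR_PERIOD;
        due
    }
}

/// Simulates scheduler rounds. `executed[i]` is the number of guest
/// instructions run in round i; 0 models an idle round where every thread is
/// Blocked/Exited. Returns (vsyncs_queued, final_timebase).
fn simulate_lockstep_rounds(executed: &[u64]) -> (u64, u64) {
    let mut ticker = InstrTicker { last_fire_count: 0 };
    let mut instruction_count = 0u64;
    let mut timebase = 0u64;
    let mut vsyncs = 0u64;

    for &n in executed {
        // Pre-round step: the ticker only ever sees instruction_count.
        vsyncs += ticker.tick_vsync_instr(instruction_count);

        if n == 0 {
            // Idle path: timebase advances (assumed 1,000 ticks per idle
            // round here), instruction_count does not.
            timebase += 1_000;
        } else {
            instruction_count += n;
            timebase += n;
        }
    }
    (vsyncs, timebase)
}
```

Feeding it two busy rounds followed by idle rounds shows the VSYNC count plateau while the timebase keeps climbing, which is the HOST-TICKER-STALL signature.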
Trace evidence (AQ lr-trace on 0x824be9a0)
- 77 ISR fires total: 76 r3==0 (INTERRUPT_SOURCE_VSYNC=0), 1 r3==1 (INTERRUPT_SOURCE_CP=1).
- First fire cyc 283,678; LAST fire cyc 7,461,492; then 0 fires for the rest of a 66M-event run.
- Early fires tid=7; from cyc 5.58M on tid=1; stops exactly when all threads block.
- The injector (main.rs:3729) HAS a Blocked-thread fallback (Pass 2), so it is NOT the blocker — it simply never receives a queued VSYNC after the ticker stalls.
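To illustrate why the injector is not the blocker, here is a minimal two-pass victim selection sketch. The notes only confirm that a Blocked-thread fallback (Pass 2) exists; the Ready-first Pass 1, the ThreadState enum, and the function names pick_victim and try_inject are assumptions for illustration, not the actual try_inject_graphics_interrupt.

```rust
use std::collections::VecDeque;

/// Hypothetical thread states; the real scheduler likely has more.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum ThreadState {
    Ready,
    Blocked,
    Exited,
}

/// Picks the index of the thread that should take the interrupt.
fn pick_victim(threads: &[ThreadState]) -> Option<usize> {
    threads
        .iter()
        // Pass 1 (assumed): prefer a runnable thread.
        .position(|s| *s == ThreadState::Ready)
        // Pass 2 (per the note): fall back to a Blocked thread.
        .or_else(|| threads.iter().position(|s| *s == ThreadState::Blocked))
}

/// Delivers at most one queued interrupt source to a victim thread.
/// Returns (victim_index, interrupt_source) when something was injected.
fn try_inject(
    pending: &mut VecDeque<u32>,
    threads: &[ThreadState],
) -> Option<(usize, u32)> {
    // Pick the victim first so nothing is dequeued when no thread can take it.
    let victim = pick_victim(threads)?;
    // With an empty queue there is nothing to inject, even if a Blocked
    // victim is available. This is the state after the ticker stalls.
    let source = pending.pop_front()?;
    Some((victim, source))
}
```

With every thread Blocked, the fallback still finds a victim, so delivery depends entirely on whether the queue is non-empty.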
r3==1 (CP) path
- Fires exactly ONCE (cyc 5,577,159), the only CP interrupt ever queued (gpu.has_pending_interrupts, main.rs:2622). Does NOT reach 0x824bea80 (the r3==0 opt_callback branch). Takes the [user_data+10772]->[+16]/[+20] gfx-int sub-callback path. Even if it KeSetEvent'd 0x10e8 it would do so once, not 60Hz. NOT a viable sustained producer in ours.
Cross-engine symmetry
- Canary delivers VSync 60Hz continuously (tid=2 NtSetEvent 4660x @16.667ms) because canary's vsync is host-wall-clock / GPU-thread driven, independent of guest CPU progress. ours's lockstep ticker is guest-instruction driven -> self-stalls. The stop IS a bug (canary analog is sustained).
Fix surface (NAME ONLY, no patch)
- File crates/xenia-app/src/main.rs, coord_pre_round ~2454-2465 (and/or coord_idle_advance ~2528).
- Condition to change: in LOCKSTEP, the VSync ticker must advance off a clock that keeps moving when the guest is wedged (the guest TIMEBASE that advance_all_timebases_to already advances during idle), NOT off stats.instruction_count. Options: (a) drive tick on timebase delta; (b) also call the ticker + injector from the idle path (coord_idle_advance) so a wedged-but-time-advancing guest still gets VSync injected on a Blocked thread (injector Pass-2 already supports Blocked victims).
- LOC: ~10-30 (MEDIUM). Determinism: must derive cadence from the deterministic guest timebase, not host wall-clock, to keep golden oracles bit-stable.
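Option (a) could look roughly like the sketch below. This is one possible shape under stated assumptions, not a patch: TimebaseTicker and its fields are hypothetical names, and the period in timebase ticks is a caller-supplied parameter because the notes do not state the timebase frequency. The cadence depends only on the guest timebase value passed in, so it stays deterministic.

```rust
/// Hypothetical VSync ticker keyed to the deterministic guest timebase
/// instead of stats.instruction_count.
struct TimebaseTicker {
    /// Timebase ticks between VSYNCs (assumed; supplied by the caller).
    period: u64,
    /// Timebase value at which the next VSYNC becomes due.
    next_deadline: u64,
}

impl TimebaseTicker {
    fn new(period: u64) -> Self {
        assert!(period > 0, "period must be non-zero");
        Self { period, next_deadline: period }
    }

    /// Returns how many VSYNCs are due at `timebase`. Safe to call from both
    /// the pre-round path and the idle path, since it reads no host clock.
    fn tick(&mut self, timebase: u64) -> u64 {
        if timebase < self.next_deadline {
            return 0;
        }
        // Count every deadline at or before `timebase`, including catch-up
        // after a long idle advance.
        let due = (timebase - self.next_deadline) / self.period + 1;
        self.next_deadline += due * self.period;
        due
    }
}
```

Whether catch-up should queue every missed VSYNC or coalesce them into one is a design choice the notes leave open; the sketch returns the full count so the caller can decide.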
Caveat
- This unsticks ISR delivery cadence. Whether the delivered r3==0 ISR then actually signals 0x10e8 is the SEPARATE 2.AV question (opt_callback +44 is a dead-end; real 0x10e8 producer still unconfirmed). Fixing cadence is necessary but may not be sufficient.