Iterate-2.BE: host-driven synchronous graphics ISR delivery

Replaces the victim-thread-mutate-then-wait scheme for vsync / CP
interrupts with synchronous in-line dispatch on the coordinator host
thread. Mirrors canary's EmulateCPInterruptDPC -> Processor::Execute
path (kernel_state.cc:1370, processor.cc:413): pick a guest thread,
borrow its PpcContext, jam ISR PC + args in, run the interpreter
inline until LR_HALT_SENTINEL, restore the borrowed context.

Why: audit-059 measured gpu.interrupt.delivered{source=0} = 54 over
3.9 s vs canary's 4712 over 30 s. Per-second shortfall ~11×. Old
asynchronous LR-sentinel injection (try_inject_graphics_interrupt)
needed a Ready or Blocked guest thread to land on; once the Sylpheed
main thread and worker threads all idled post-boot, no victim was
available and every queued vsync got dropped. Host-driven dispatch
decouples delivery from guest-thread readiness.

Smoke test (lockstep): unchanged 54 — under current Sylpheed boot
trajectory the ticker is gated by guest-instruction progress, not
victim availability; lockstep stalls into idle-advance after ~5M
instructions of real work and the synthetic tick_vsync_instr stops
firing. Under --parallel (wallclock ticker) gpu.interrupt.delivered
climbs to ~1131 over a 128 s run, confirming the synchronous
dispatcher itself works as intended. Architectural piece is now in
place; raising the lockstep delivery rate requires ticking the
synthetic vsync inside coord_idle_advance, which is a separate
change.

Changes:
- crates/xenia-kernel/src/interrupts.rs: doc-comment update only.
  SavedCallbackCtx + CALLBACK_STACK_PAD retained — the audio
  callback path (audit-048) still uses the asynchronous LR-sentinel
  inject on a dedicated per-client worker.
- crates/xenia-app/src/main.rs:
  * dispatch_graphics_interrupts(kernel, mem, &mut stats,
    &mut decode_cache, thunk_map): new fn. Drains the full FIFO per
    call. Victim selection same shape (Ready preferred, else
    Blocked, skip Idle/Exited/ServicingIrq), but the call is
    synchronous - we run step_cached + import-thunk dispatch inline
    on the borrowed ctx until pc == LR_HALT_SENTINEL.
    MAX_INSTRS_PER_ISR = 1M safety budget.
  * coord_pre_round: graphics-IRQ injection call removed. Audio
    path unchanged (still calls try_inject_audio_callback).
  * run_execution + run_execution_parallel: each now owns a
    persistent isr_decode_cache and calls
    dispatch_graphics_interrupts after coord_pre_round.
  * try_inject_graphics_interrupt: deleted (118 LOC).

No new public APIs, no new dependencies, no changes to xenia-cpu.

Tests: workspace 765 passed / 0 failed / 4 ignored (parallel_stress
+ sylpheed_n50m, all gated). Kernel 127/127, app 5/5, cpu 288/288.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
MechaCat02
2026-06-06 18:58:40 +02:00
parent ac2f89a7bb
commit 9a93152981
2 changed files with 279 additions and 123 deletions

View File

@@ -8,13 +8,18 @@
//! guest-issued command stream; source code 1 (`INTERRUPT_SOURCE_CP`).
//!
//! Canary's [xboxkrnl_video.cc:303-310](xenia-canary/src/xenia/kernel/xboxkrnl/xboxkrnl_video.cc#L303-L310)
//! dispatches the callback on HW thread 0. We follow the same convention.
//! dispatches the callback on HW thread 0. We follow the same convention
//! for picking a *context donor*, but as of iterate-2.BE the dispatch
//! itself is **synchronous and host-driven**: the main loop runs the ISR
//! inline on the borrowed guest context, mirroring canary's
//! `EmulateCPInterruptDPC → Processor::Execute` path
//! ([kernel_state.cc:1370](../../../../xenia-canary/src/xenia/kernel/kernel_state.cc#L1370),
//! [processor.cc:413](../../../../xenia-canary/src/xenia/cpu/processor.cc#L413)).
//! Independent of whether the donor guest thread was Ready or Blocked.
//!
//! The delivery model is cooperative: we inject the callback entry into HW
//! thread 0 at the top of a scheduler round when it's safe (not mid-export,
//! not already inside another interrupt). When the callback returns to
//! [`LR_HALT_SENTINEL`] the main loop restores the saved [`PpcContext`]
//! fields and the HW thread picks up where it left off.
//! The audio callback path (audit-048) still uses asynchronous LR-sentinel
//! injection on a dedicated per-client worker thread; the
//! [`SavedCallbackCtx`] machinery below remains in use there.
use std::collections::VecDeque;
use std::time::{Duration, Instant};