[iterate-2W] Sustain the title present loop: viewport-size register + ISR CPU impersonation

The title's per-frame loop (sub_822F1AA8) is clock-B-paced and only re-fires
when the swap count [controller+88] changes, which advances only on source=1
CP swap-complete interrupts. Each present batch the guest submits (via the
sub_824CE348 -> sub_824BF4D0 builder) ends with a WAIT_REG_MEM on a per-CPU
swap-acknowledge fence [GCTX+0] (GCTX = [device+10772]); the GPU parks there
until the graphics ISR (sub_824BE9A0) clears that CPU's bit. Two coupled gaps
made our implementation emit only ONE source=1 interrupt and then deadlock
(draws plateaued at 28, run halted at ~19.27M):

1. GPU MMIO register 0x1961 (AVIVO_D1MODE_VIEWPORT_SIZE) read as 0. The swap
   callback sub_824CE2B8 divides by its low 12 bits (display height) as a
   refresh-pacing term, so a 0 read tripped its `twi` divide-by-zero guard and
   aborted the ISR before it reached the fence-clear. Fix: mirror canary's
   GraphicsSystem::ReadRegister (graphics_system.cc:311) and return
   0x050002D0 (1280x720).

2. The ISR ran on an arbitrary borrowed thread, so [r13+268] (the PCR
   processor number) did not match the interrupt's target CPU. The ISR clears
   `1 << current_cpu` from the fence; running on the wrong CPU cleared the
   wrong bit and the fence (bit 2, from cpu_mask 0x4) never reached 0. Fix:
   carry the target CPU through the interrupt queue (the bit index of the
   PM4_INTERRUPT cpu_mask for CP; 2 for vsync, per canary
   DispatchInterruptCallback(0, 2)) and impersonate it on the borrowed
   thread's PCR around the ISR, mirroring canary EmulateCPInterruptDPC ->
   XThread::SetActiveCpu.
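
The arithmetic behind both gaps can be sketched in a few lines. This is an
illustrative model only, not the code in this commit: the function names are
hypothetical, the width field position is an assumption (the message above
only confirms that the low 12 bits hold the height), and picking the lowest
set bit of a multi-bit cpu_mask is likewise an assumption.

```rust
/// Value the commit message says is returned for register 0x1961.
const VIEWPORT_SIZE_1280X720: u32 = 0x0500_02D0;

/// Split a packed viewport-size value into (width, height).
/// Assumption: width is in the upper half-word; the low 12 bits are the
/// height the swap callback divides by.
fn decode_viewport_size(reg: u32) -> (u32, u32) {
    let width = (reg >> 16) & 0xFFF;
    let height = reg & 0xFFF;
    (width, height)
}

/// Bit index of an interrupt cpu_mask, e.g. 0x4 -> 2.
/// Returns None for an empty mask. With several bits set this returns the
/// lowest one (an assumption; the commit only shows single-bit masks).
fn target_cpu_from_mask(cpu_mask: u32) -> Option<u8> {
    if cpu_mask == 0 {
        None
    } else {
        Some(cpu_mask.trailing_zeros() as u8)
    }
}

/// Model of the ISR's acknowledge step: clear `1 << current_cpu` from the
/// fence. An out-of-range CPU number leaves the fence untouched.
fn acknowledge_fence(fence: u32, current_cpu: u8) -> u32 {
    let bit = 1u32.checked_shl(u32::from(current_cpu)).unwrap_or(0);
    fence & !bit
}
```

With a fence of 0x4, acknowledging as CPU 2 yields 0 and releases the
WAIT_REG_MEM, while acknowledging as any other CPU leaves 0x4 in place, which
is the deadlock described in item 2.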

With both fixes the fence clears, the GPU drains each present batch, a source=1
interrupt fires per present, clock B advances, and the loop runs continuously.
Draws climb linearly with the budget (no re-stall): 28 -> 718 at 50M, 3411 at
200M, 18734 at 1B; swaps go 2 -> 147 / 950 / 6060 at the same budgets. No
"Unanticipated CPU_INTERRUPT" trap. Inline-deterministic (--stable-digest
byte-identical x2); n50m golden re-baselined. 675 tests green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
MechaCat02
2026-06-14 20:49:32 +02:00
parent 66bd805726
commit a91f4c550b
5 changed files with 97 additions and 24 deletions


@@ -30,6 +30,12 @@ use xenia_cpu::ThreadRef;
pub const INTERRUPT_SOURCE_VSYNC: u32 = 0;
pub const INTERRUPT_SOURCE_CP: u32 = 1;
/// The processor the graphics ISR impersonates for a v-sync interrupt.
/// Canary hard-codes this: `MarkVblank` → `DispatchInterruptCallback(0, 2)`
/// (graphics_system.cc:478). CP interrupts instead use the bit index of the
/// `PM4_INTERRUPT` `cpu_mask`.
pub const VSYNC_TARGET_CPU: u8 = 2;
/// Guest-registered V-sync / graphics-interrupt callback (from
/// `VdSetGraphicsInterruptCallback`).
#[derive(Debug, Clone, Copy)]
@@ -145,9 +151,16 @@ pub type PendingLocalIrq = [std::sync::atomic::AtomicU8;
pub struct InterruptState {
/// Registered callback (set by `VdSetGraphicsInterruptCallback`).
pub callback: Option<GraphicsInterruptCallback>,
/// Bounded FIFO of pending interrupt sources awaiting injection.
/// Push-back on queue, pop-front on inject. Over-cap pushes drop.
pub pending: VecDeque<u32>,
/// Bounded FIFO of pending interrupts awaiting injection, as
/// `(source, target_cpu)`. Push-back on queue, pop-front on inject.
/// Over-cap pushes drop. `target_cpu` is the processor the graphics
/// ISR must impersonate (canary `XThread::SetActiveCpu` / the
/// `DispatchInterruptCallback(source, cpu)` argument): the bit index
/// of the CP `PM4_INTERRUPT` `cpu_mask` for source=1, and a fixed `2`
/// for vsync (canary `DispatchInterruptCallback(0, 2)`). The ISR reads
/// it from the PCR (`[r13+268]`) to clear the matching per-CPU bit of
/// the swap-acknowledge fence.
pub pending: VecDeque<(u32, u8)>,
/// When `Some`, some HW thread is currently running a callback; on
/// return-to-sentinel we restore this and clear the flag.
pub saved: Option<SavedCallbackCtx>,
@@ -211,8 +224,9 @@ impl InterruptState {
});
}
/// Queue an interrupt for the next safe injection point.
pub fn queue_interrupt(&mut self, source: u32) {
/// Queue an interrupt for the next safe injection point. `cpu` is the
/// processor the ISR must impersonate (see `pending`).
pub fn queue_interrupt(&mut self, source: u32, cpu: u8) {
if self.callback.is_none() {
self.dropped += 1;
return;
@@ -221,18 +235,23 @@ impl InterruptState {
self.dropped += 1;
return;
}
self.pending.push_back(source);
self.pending.push_back((source, cpu));
}
/// Peek at the next pending source without removing it.
pub fn peek_next(&self) -> Option<u32> {
self.pending.front().copied()
self.pending.front().map(|&(source, _)| source)
}
/// Peek at the target CPU of the next pending interrupt.
pub fn peek_next_cpu(&self) -> Option<u8> {
self.pending.front().map(|&(_, cpu)| cpu)
}
/// Pop the next pending source (called by the injector after it has
/// committed to dispatching it).
pub fn take_next(&mut self) -> Option<u32> {
self.pending.pop_front()
self.pending.pop_front().map(|(source, _)| source)
}
/// **Legacy** — instruction-count v-sync ticker. Kept for unit tests
@@ -249,7 +268,7 @@ impl InterruptState {
let periods = self.vsync_accumulator / VSYNC_INSTR_PERIOD;
self.vsync_accumulator %= VSYNC_INSTR_PERIOD;
for _ in 0..periods {
self.queue_interrupt(INTERRUPT_SOURCE_VSYNC);
self.queue_interrupt(INTERRUPT_SOURCE_VSYNC, VSYNC_TARGET_CPU);
}
true
}
@@ -288,7 +307,7 @@ impl InterruptState {
self.last_vsync_instant = Some(anchor + advance);
let to_queue = (periods as usize).min(INTERRUPT_QUEUE_CAP);
for _ in 0..to_queue {
self.queue_interrupt(INTERRUPT_SOURCE_VSYNC);
self.queue_interrupt(INTERRUPT_SOURCE_VSYNC, VSYNC_TARGET_CPU);
}
true
}
@@ -306,7 +325,7 @@ mod tests {
#[test]
fn queue_interrupt_drops_without_callback() {
let mut s = InterruptState::default();
s.queue_interrupt(INTERRUPT_SOURCE_VSYNC);
s.queue_interrupt(INTERRUPT_SOURCE_VSYNC, VSYNC_TARGET_CPU);
assert_eq!(s.dropped, 1);
assert!(s.pending.is_empty());
}
@@ -315,9 +334,9 @@ mod tests {
fn queue_interrupt_fifo_preserves_order() {
let mut s = InterruptState::default();
s.set_callback(0x1000, 0xAB);
s.queue_interrupt(INTERRUPT_SOURCE_VSYNC);
s.queue_interrupt(INTERRUPT_SOURCE_CP);
s.queue_interrupt(INTERRUPT_SOURCE_VSYNC);
s.queue_interrupt(INTERRUPT_SOURCE_VSYNC, VSYNC_TARGET_CPU);
s.queue_interrupt(INTERRUPT_SOURCE_CP, 2);
s.queue_interrupt(INTERRUPT_SOURCE_VSYNC, VSYNC_TARGET_CPU);
assert_eq!(s.dropped, 0);
// FIFO: take_next hands them out in push order.
assert_eq!(s.take_next(), Some(INTERRUPT_SOURCE_VSYNC));
@@ -331,11 +350,11 @@ mod tests {
let mut s = InterruptState::default();
s.set_callback(0x1000, 0xAB);
for _ in 0..INTERRUPT_QUEUE_CAP {
s.queue_interrupt(INTERRUPT_SOURCE_VSYNC);
s.queue_interrupt(INTERRUPT_SOURCE_VSYNC, VSYNC_TARGET_CPU);
}
// Over-cap: drops rather than evicting the oldest.
s.queue_interrupt(INTERRUPT_SOURCE_VSYNC);
s.queue_interrupt(INTERRUPT_SOURCE_VSYNC);
s.queue_interrupt(INTERRUPT_SOURCE_VSYNC, VSYNC_TARGET_CPU);
s.queue_interrupt(INTERRUPT_SOURCE_VSYNC, VSYNC_TARGET_CPU);
assert_eq!(s.dropped, 2);
assert_eq!(s.pending.len(), INTERRUPT_QUEUE_CAP);
}
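
For reference, the `(source, target_cpu)` queue shape can be modelled on its own. The sketch below is a simplified stand-in, not the `InterruptState` from this diff: `PendingQueue`, `QUEUE_CAP`, `queue`, and `take` are hypothetical names, and the cap value is made up because the real `INTERRUPT_QUEUE_CAP` is not shown here. It pops the pair in one step; the diff instead keeps `take_next` returning only the source, so a caller presumably reads `peek_next_cpu` before calling `take_next`.

```rust
use std::collections::VecDeque;

/// Hypothetical cap for illustration; the real INTERRUPT_QUEUE_CAP value
/// does not appear in this diff.
const QUEUE_CAP: usize = 4;

/// Simplified model of a bounded FIFO of `(source, target_cpu)` entries.
#[derive(Default)]
struct PendingQueue {
    pending: VecDeque<(u32, u8)>,
    dropped: u64,
}

impl PendingQueue {
    /// Push to the back; an over-cap push is dropped and counted rather
    /// than evicting the oldest entry.
    fn queue(&mut self, source: u32, cpu: u8) {
        if self.pending.len() >= QUEUE_CAP {
            self.dropped += 1;
            return;
        }
        self.pending.push_back((source, cpu));
    }

    /// Pop the oldest entry, returning source and target CPU together so
    /// the two cannot drift apart.
    fn take(&mut self) -> Option<(u32, u8)> {
        self.pending.pop_front()
    }
}
```

Returning the tuple from a single pop is one possible design; keeping a source-only `take_next`, as the diff does, avoids touching existing callers and tests.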