[iterate-2G] db16cyc spin-hint cooperative yield: unblock title-screen 0x10a0 gate

The silph title state machine (tid13) blocked on event 0x10a0, which was never
signaled. Root cause: the event's producer chain runs on the silph worker
(entry 0x821C4AD0, our tid14), which was starved. tid14 shares a HW slot with a
guest spinlock/barrier participant (sub_824D1328, entry 0x824D2940) that
busy-spins on the db16cyc hint `or r31,r31,r31` (encoding 0x7FFFFB78) at
0x824D140C. Under our round-robin lockstep the spinner consumed its whole block
every round and starved the co-located tid14 (only 9 progress hits over 200M
instructions), so the producer never reached the event create/duplicate/signal
sequence that the canary oracle performs (handle F80000E8 set by the submitter
F8000044 via a duplicated handle).
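
As a cross-check on the encoding quoted above, the hint word can be rebuilt from the X-form fields of `or RA,RS,RB` (primary opcode 31, extended opcode 444). This is a minimal sketch: `encode_or`, `DB16CYC` and `is_db16cyc` are hypothetical names used for illustration only and are not part of this change.

```rust
/// Encode an X-form `or RA,RS,RB` (primary opcode 31, extended opcode 444).
/// Hypothetical helper for illustration; register numbers must be 0..=31.
pub fn encode_or(ra: u32, rs: u32, rb: u32, rc: bool) -> u32 {
    assert!(ra < 32 && rs < 32 && rb < 32, "register out of range");
    (31 << 26) | (rs << 21) | (ra << 16) | (rb << 11) | (444 << 1) | (rc as u32)
}

/// The db16cyc spin-hint word quoted in the commit message.
pub const DB16CYC: u32 = 0x7FFF_FB78;

/// True only for the exact hint word: `or r31,r31,r31` with Rc = 0.
/// Other self-moves such as `or r3,r3,r3` are ordinary no-ops.
pub fn is_db16cyc(raw: u32) -> bool {
    raw == DB16CYC
}
```

Comparing the whole 32-bit word, rather than decoding fields, keeps the match as narrow as the single code the message says canary special-cases.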

Fix (canary-faithful): recognize the db16cyc spin hint exactly as canary's
InstrEmit_orx does (code 0x7FFFFB78 -> DelayExecution) and surface it as a new
StepResult::Yield. The scheduler's yield_current() raises every Ready (or
ServicingIrq) peer on the slot to STARVE_LIMIT so begin_slot_visit picks one
next round; each picked peer resets to 0, after which the spinner reclaims the
slot. The result is fair alternation with no priority inversion, and the
promotion is a pure function of slot state (deterministic).
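
The alternation described above can be illustrated with a toy single-slot model. This is only a sketch under stated assumptions: `ToySlot`, `ToyThread`, `begin_visit`, the `STARVE_LIMIT` value of 8, and the rule that a starved thread gets `i32::MAX` effective priority with ties going to the lowest index are all stand-ins, not the real `Scheduler` implementation.

```rust
// Assumed value for illustration; the real constant is defined elsewhere.
const STARVE_LIMIT: u32 = 8;

#[derive(Debug)]
pub struct ToyThread {
    pub tid: u32,
    pub priority: i32,
    pub steps_starved: u32,
    pub ready: bool,
}

#[derive(Debug, Default)]
pub struct ToySlot {
    pub runqueue: Vec<ToyThread>,
    pub current: Option<usize>,
}

impl ToySlot {
    pub fn spawn(&mut self, tid: u32, priority: i32) {
        self.runqueue.push(ToyThread { tid, priority, steps_starved: 0, ready: true });
    }

    // Assumption: a thread starved for STARVE_LIMIT rounds outranks everyone.
    fn effective_priority(t: &ToyThread) -> i32 {
        if t.steps_starved >= STARVE_LIMIT { i32::MAX } else { t.priority }
    }

    /// Pick the ready thread with the highest effective priority (ties go to
    /// the lowest index), reset its counter, and age the other ready threads.
    pub fn begin_visit(&mut self) -> Option<u32> {
        let mut best: Option<usize> = None;
        for (i, t) in self.runqueue.iter().enumerate() {
            if !t.ready {
                continue;
            }
            let better = match best {
                None => true,
                Some(b) => {
                    Self::effective_priority(t) > Self::effective_priority(&self.runqueue[b])
                }
            };
            if better {
                best = Some(i);
            }
        }
        let picked = best?;
        for (i, t) in self.runqueue.iter_mut().enumerate() {
            if i == picked {
                t.steps_starved = 0;
            } else if t.ready {
                t.steps_starved = t.steps_starved.saturating_add(1);
            }
        }
        self.current = Some(picked);
        Some(self.runqueue[picked].tid)
    }

    /// Cooperative yield: boost every other ready thread to STARVE_LIMIT and
    /// reset the yielder. Returns true if at least one peer was boosted.
    pub fn yield_current(&mut self) -> bool {
        let Some(me) = self.current else { return false };
        let mut promoted = false;
        for (i, t) in self.runqueue.iter_mut().enumerate() {
            if i == me {
                t.steps_starved = 0;
            } else if t.ready {
                t.steps_starved = STARVE_LIMIT;
                promoted = true;
            }
        }
        promoted
    }
}
```

With two equal-priority threads, the first visit picks tid 1; after it yields, the next visit picks tid 2, and the visit after that returns to tid 1. No randomness or clock is involved, so the same sequence of calls always produces the same picks.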

Result (lockstep, cache-persist, -n 200M): tid14 progresses past its old stall
into a real wait; tid13 advances off 0x10a0 to a new event; hub/submitter
re-enter their wait loops. imports 280k->592k, packets 124M->164M, swaps 1->2.
draws still 0 (the splash's first draw is a further-upstream gate).

Determinism preserved (two cold n50m runs byte-identical). n50m golden
re-baselined (imports 90296->339766, swaps 1->2; draws unchanged 0). n2m
golden unchanged (db16cyc not reached in first 2M). Tests 670/670.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
MechaCat02
2026-06-13 10:38:17 +02:00
parent f3b7e8b760
commit de21c7a544
31 changed files with 433587 additions and 3 deletions

@@ -2619,6 +2619,10 @@ fn worker_prologue(
match result {
StepResult::Continue => {}
StepResult::Yield => {
// db16cyc spin-wait hint (per-instruction path): yield the slot.
kernel.scheduler.yield_current();
}
StepResult::SystemCall => {
tracing::warn!("SYSCALL at {:#010x} (hw={})", pc, hw_id);
}
@@ -2698,6 +2702,11 @@ fn worker_epilogue(
match result {
StepResult::Continue => {}
StepResult::Yield => {
// db16cyc spin-wait hint: hand the slot to a Ready peer so the
// spinner doesn't starve the co-located thread it is waiting on.
kernel.scheduler.yield_current();
}
StepResult::SystemCall => {
let last_pc = block.instrs.last().map(|i| i.addr).unwrap_or(pc_before);
tracing::warn!("SYSCALL at {:#010x} (hw={})", last_pc, hw_id);
@@ -3638,6 +3647,9 @@ fn dispatch_graphics_interrupts(
isr_instrs += 1;
match r {
StepResult::Continue => {}
// db16cyc inside the synchronous ISR has no slot to yield —
// the ISR runs to completion on the borrowed context.
StepResult::Yield => {}
StepResult::SystemCall => {
tracing::warn!("graphics ISR hit `sc` instruction; aborting");
break;

@@ -1,9 +1,9 @@
{
-"instructions": 50000003,
-"imports": 90296,
+"instructions": 50000000,
+"imports": 339766,
"unimpl": 0,
"draws": 0,
-"swaps": 1,
+"swaps": 2,
"unique_render_targets": 0,
"shader_blobs_live": 0,
"texture_cache_entries": 0

@@ -28,6 +28,15 @@ pub enum StepResult {
Trap,
/// Execution halted (by debugger or error).
Halted,
/// Executed the `db16cyc` spin-wait hint (`or r31,r31,r31`, encoding
/// `0x7FFFFB78`). The PC has already advanced past the hint; this is a
/// cooperative-yield signal so the scheduler hands the slot to a Ready
/// peer. On real hardware all six HW threads run concurrently and the
/// spin resolves naturally; under our round-robin lockstep a spinning
/// barrier/spinlock participant would otherwise monopolize its slot and
/// starve the co-located thread it is waiting on. Matches canary's
/// `InstrEmit_orx` db16cyc → `DelayExecution()` handling.
Yield,
}
/// Execute a single PPC instruction.
@@ -95,6 +104,9 @@ pub fn step_block(
ctx.cycle_count += 1;
ctx.timebase += 1;
if !matches!(result, StepResult::Continue) {
// `Yield` (db16cyc spin hint) terminates the block here so the
// scheduler regains control and can rotate the slot; the PC has
// already advanced past the hint inside `execute`.
return result;
}
// PC discontinuity within a block. By construction only the
@@ -548,6 +560,18 @@ fn execute(ctx: &mut PpcContext, mem: &dyn MemoryAccess, instr: &DecodedInstr) -
ctx.gpr[instr.ra()] = ctx.gpr[instr.rs()] | ctx.gpr[instr.rb()];
if instr.rc_bit() { ctx.update_cr_signed(0, ctx.gpr[instr.ra()] as u32 as i32 as i64); }
ctx.pc += 4;
// `or r31,r31,r31` with encoding 0x7FFFFB78 is the Xenon `db16cyc`
// spin-wait hint (a no-op write of r31 onto itself). Canary's
// `InstrEmit_orx` special-cases exactly this code → `DelayExecution()`.
// Under our round-robin lockstep, a guest spinlock/barrier loop that
// executes db16cyc would otherwise consume its whole block every round
// and starve the co-located thread it is waiting on (the lock holder /
// barrier peer). Surface it as a cooperative yield so the scheduler can
// hand the slot to a Ready peer. The semantic result of the op is
// already applied (r31 |= r31 is a no-op), so yielding is value-neutral.
if instr.raw == 0x7FFF_FB78 {
return StepResult::Yield;
}
}
PpcOpcode::orcx => {
// PPCBUG-028: same shape as andcx — operate in u32.
@@ -5042,6 +5066,40 @@ mod tests {
assert_eq!(ctx.pc, 4);
}
#[test]
fn test_db16cyc_yields() {
// `or r31,r31,r31` encoding 0x7FFFFB78 is the Xenon db16cyc spin hint.
// It must (a) be value-neutral (r31 unchanged), (b) advance PC, and
// (c) report StepResult::Yield so the scheduler can hand off the slot.
let mut ctx = PpcContext::new();
let mut mem = TestMem::new();
write_instr(&mut mem, 0, 0x7FFF_FB78);
ctx.pc = 0;
ctx.gpr[31] = 0x1234_5678_9ABC_DEF0;
let r = step(&mut ctx, &mut mem);
assert_eq!(ctx.gpr[31], 0x1234_5678_9ABC_DEF0, "db16cyc is value-neutral");
assert_eq!(ctx.pc, 4, "PC advances past the hint");
assert_eq!(r, StepResult::Yield, "db16cyc surfaces as a cooperative yield");
}
#[test]
fn test_plain_or_self_is_not_yield() {
// A regular `or rN,rN,rN` that is NOT the db16cyc encoding (e.g. r3)
// is an ordinary no-op move and must keep executing (Continue), so we
// only yield on the exact spin-hint code canary special-cases.
let mut ctx = PpcContext::new();
let mut mem = TestMem::new();
// or r3, r3, r3 (RS=RA=RB=3, Rc=0): 31<<26 | 3<<21 | 3<<16 | 3<<11 | 444<<1
let raw = (31u32 << 26) | (3 << 21) | (3 << 16) | (3 << 11) | (444 << 1);
write_instr(&mut mem, 0, raw);
ctx.pc = 0;
ctx.gpr[3] = 0xCAFE;
let r = step(&mut ctx, &mut mem);
assert_eq!(ctx.gpr[3], 0xCAFE);
assert_eq!(ctx.pc, 4);
assert_eq!(r, StepResult::Continue, "non-db16cyc or-self stays Continue");
}
#[test]
fn test_fadd() {
let mut ctx = PpcContext::new();

@@ -902,6 +902,41 @@ impl Scheduler {
false
}
/// Cooperative yield: the currently-running thread executed a `db16cyc`
/// spin-wait hint (see `StepResult::Yield`). It is busy-spinning on a
/// guest spinlock/barrier whose release depends on a *co-located* peer
/// that cannot make progress while this thread keeps winning the slot.
///
/// Promote every Ready (or IRQ-servicing) peer on this slot to
/// `STARVE_LIMIT` so the next `begin_slot_visit` picks one of them (their
/// `effective_priority` → `i32::MAX`), and reset the yielder's own counter.
/// Each promoted peer runs once and resets to 0 in `begin_slot_visit`; once
/// all peers have had their turn the spinner is picked again, spins, and
/// re-yields, producing a fair round-robin between the spinner and the
/// threads it is waiting on. This approximates real hardware, where all six
/// HW threads run concurrently and the spin resolves as soon as the peer
/// releases.
///
/// Pure function of the slot's current state (no RNG, no wall-clock), so
/// it preserves lockstep determinism. No-op if there is no Ready peer
/// (the spinner is alone on its slot — nothing to hand off to).
///
/// Returns `true` if at least one peer was promoted.
pub fn yield_current(&mut self) -> bool {
let Some(r) = self.current else { return false; };
let slot = &mut self.slots[r.hw_id as usize];
let me = r.idx as usize;
let mut promoted = false;
for (i, t) in slot.runqueue.iter_mut().enumerate() {
if i == me {
t.steps_starved = 0;
} else if matches!(t.state, HwState::Ready | HwState::ServicingIrq(_)) {
t.steps_starved = STARVE_LIMIT;
promoted = true;
}
}
promoted
}
// ----- Park / wake / exit -----
pub fn park_current(&mut self, reason: BlockReason) {
@@ -2062,6 +2097,71 @@ mod tests {
);
}
#[test]
fn test_db16cyc_yield_hands_slot_to_peer() {
// Reproduces the Sylpheed title-screen gate: a guest spinlock/barrier
// participant (tid=1) executes the `db16cyc` spin hint each round and
// would otherwise win `pick_runnable` forever (equal priority, lower
// index), starving the co-located peer (tid=2) it is waiting on.
// `yield_current` must promote the Ready peer so the very next
// `begin_slot_visit` picks it — without waiting STARVE_LIMIT rounds.
let mut s = mk_empty_scheduler();
for tid in [1u32, 2] {
let mut p = SpawnParams::default();
p.guest_tid = tid;
p.thread_handle = 0x1000 + tid * 4;
p.affinity_mask = 0b0001;
p.pcr_base = 0x4000_0000 + tid * 0x1000;
p.priority = 0; // equal priority — index would otherwise decide
s.spawn(p, &mut NullPcr).unwrap();
}
// Round 1: the spinner (lower index) wins.
s.begin_slot_visit(0);
let spinner = s.thread(s.current.unwrap()).tid;
assert_eq!(spinner, 1, "lower-index equal-priority thread wins first pick");
// It spins (db16cyc) → cooperative yield.
assert!(s.yield_current(), "yield promotes the Ready peer");
s.end_slot_visit();
// Round 2: the promoted peer must now be picked, not the spinner.
s.begin_slot_visit(0);
let after_yield = s.thread(s.current.unwrap()).tid;
assert_eq!(
after_yield, 2,
"after db16cyc yield the co-located peer runs (no STARVE_LIMIT wait)"
);
s.end_slot_visit();
// Round 3: peer's boost was consumed (reset to 0 when picked), so the
// spinner reclaims the slot — fair alternation, no priority inversion.
s.begin_slot_visit(0);
assert_eq!(
s.thread(s.current.unwrap()).tid,
1,
"spinner reclaims the slot after the peer has had its turn"
);
}
#[test]
fn test_yield_current_noop_when_alone() {
// A spinner with no Ready peer on its slot has nothing to hand off to;
// yield_current must be a no-op (returns false) and not panic.
let mut s = mk_empty_scheduler();
let mut p = SpawnParams::default();
p.guest_tid = 1;
p.thread_handle = 0x1004;
p.affinity_mask = 0b0001;
p.pcr_base = 0x4000_0000;
s.spawn(p, &mut NullPcr).unwrap();
s.begin_slot_visit(0);
assert!(!s.yield_current(), "no peer to promote → no-op");
// Still the same thread next round.
s.end_slot_visit();
s.begin_slot_visit(0);
assert_eq!(s.thread(s.current.unwrap()).tid, 1);
}
#[test]
fn test_cooperative_yield_does_not_need_quantum() {
let mut s = mk_empty_scheduler();