xenia-rs/crates/xenia-cpu
MechaCat02 f3b7e8b760 [iterate-2F] Scheduler anti-starvation floor: fix job-4 handoff render gate
The lockstep scheduler's pick_runnable is strict priority
(max_by_key (priority, -idx)). On a cooperative single-host HW slot,
a CPU-bound spinner that never blocks (the silph poll loop pinned by
affinity to hw=5) wins pick_runnable every round forever, permanently
starving a co-located peer (the submitter, tid6) that the spinner is
actually waiting on. On real hardware those threads run concurrently on
separate SMT contexts, so the spinner never starves the submitter; our
scheduler collapses them onto one slot with no anti-starvation
mechanism, turning priority (or equal-priority index order) into
permanent starvation.
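
A minimal sketch of the strict-priority pick described above, to make
the starvation concrete. Only the names pick_runnable and the
(priority, -idx) key come from this message; the Thread and State types
and the strict_picks helper are hypothetical stand-ins, not the crate's
actual definitions.

```rust
use std::cmp::Reverse;

// Hypothetical, simplified thread model (not the real GuestThread).
#[derive(Clone, Copy, PartialEq, Eq)]
enum State {
    Ready,
    Blocked,
}

struct Thread {
    priority: i32,
    state: State,
}

/// Strict-priority pick: highest priority wins, lowest index breaks ties.
/// `Reverse(idx)` plays the role of the `-idx` component of the key.
fn pick_runnable(threads: &[Thread]) -> Option<usize> {
    threads
        .iter()
        .enumerate()
        .filter(|(_, t)| t.state == State::Ready)
        .max_by_key(|(idx, t)| (t.priority, Reverse(*idx)))
        .map(|(idx, _)| idx)
}

/// Hypothetical helper: repeat the pick for `rounds` visits on one slot
/// where no Ready thread ever blocks, and record the winner each time.
fn strict_picks(rounds: usize) -> Vec<usize> {
    let threads = [
        Thread { priority: 0, state: State::Ready },   // idx 0: spinner
        Thread { priority: 0, state: State::Ready },   // idx 1: submitter
        Thread { priority: 9, state: State::Blocked }, // idx 2: blocked, never eligible
    ];
    (0..rounds).filter_map(|_| pick_runnable(&threads)).collect()
}
```

With equal priorities the lower index wins every round, so in this
sketch strict_picks(n) contains only index 0 and the submitter at
index 1 is never selected.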

The starved submitter never dequeued job-4 -> the worker-hub (tid5)
blocked in an INFINITE wait on completion event 0x1080 -> silph (tid13)
wedged on 0x1078 -> no vsync -> draws_seen=0, so the publisher splash
never renders.
(decrement_quantum's within-slot rotation is dead: begin_slot_visit
unconditionally calls pick_runnable() again each round, discarding the
rotated running_idx. The fix is therefore applied at pick time, not via
that discarded rotation.)
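
Why the rotation has no effect can be sketched as follows. The function
names mirror the ones mentioned above, but the Slot type, its fields,
and every function body are assumptions made for illustration, not the
scheduler's real code.

```rust
use std::cmp::Reverse;

// Hypothetical slot model: one priority per thread, all threads Ready.
struct Slot {
    priorities: Vec<i32>,
    running_idx: usize,
}

impl Slot {
    /// Strict-priority pick, lowest index wins ties.
    fn pick_runnable(&self) -> usize {
        self.priorities
            .iter()
            .enumerate()
            .max_by_key(|(idx, p)| (**p, Reverse(*idx)))
            .map(|(idx, _)| idx)
            .unwrap_or(0)
    }

    /// Assumed shape of the quantum expiry: rotate to the next thread.
    fn decrement_quantum(&mut self) {
        self.running_idx = (self.running_idx + 1) % self.priorities.len();
    }

    /// Re-picks unconditionally, overwriting whatever rotation stored.
    fn begin_slot_visit(&mut self) -> usize {
        self.running_idx = self.pick_runnable();
        self.running_idx
    }
}

/// Hypothetical helper: alternate visit and rotation, recording who runs.
fn run_rounds(rounds: usize) -> Vec<usize> {
    let mut slot = Slot { priorities: vec![5, 5], running_idx: 0 };
    let mut ran = Vec::with_capacity(rounds);
    for _ in 0..rounds {
        ran.push(slot.begin_slot_visit());
        slot.decrement_quantum(); // moves running_idx to 1, then discarded
    }
    ran
}
```

In this sketch every visit runs index 0 even though decrement_quantum
rotates to index 1 after each one, which is the sense in which the
rotation is dead.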

Fix (Option A, bounded anti-starvation, deterministic):
- Add per-thread steps_starved counter to GuestThread.
- begin_slot_visit increments it for every Ready peer passed over this
  visit, resets it to 0 for the picked thread.
- pick_runnable selects by effective_priority: once steps_starved
  reaches STARVE_LIMIT (4096) the thread is lifted to i32::MAX and wins
  exactly one pick, then resets. The genuinely higher-priority thread
  still wins ~4095/4096 visits -- the boost grants periodic forward
  progress only, it does NOT invert priority. Pure function of
  counter/priority/index -> deterministic (no wall-clock, no RNG).
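
The fix can be sketched as below. STARVE_LIMIT, steps_starved,
effective_priority, pick_runnable, begin_slot_visit and GuestThread are
names from this message; the struct layout, the State enum, the
count_picks helper, and the exact order of increment and reset are
assumptions, so the real code may differ by one visit at the boundary.

```rust
use std::cmp::Reverse;

const STARVE_LIMIT: u32 = 4096; // value stated in this message

#[derive(Clone, Copy, PartialEq, Eq)]
enum State {
    Ready,
    Blocked,
}

// Hypothetical, reduced GuestThread: only the fields the pick needs.
struct GuestThread {
    priority: i32,
    state: State,
    steps_starved: u32,
}

/// A thread that has been passed over STARVE_LIMIT times is lifted to
/// i32::MAX; otherwise its own priority is used.
fn effective_priority(t: &GuestThread) -> i32 {
    if t.steps_starved >= STARVE_LIMIT { i32::MAX } else { t.priority }
}

fn pick_runnable(threads: &[GuestThread]) -> Option<usize> {
    threads
        .iter()
        .enumerate()
        .filter(|(_, t)| t.state == State::Ready)
        .max_by_key(|(idx, t)| (effective_priority(t), Reverse(*idx)))
        .map(|(idx, _)| idx)
}

/// Pick, then reset the winner's counter and bump every Ready peer
/// that was passed over. Blocked threads are left untouched.
fn begin_slot_visit(threads: &mut [GuestThread]) -> Option<usize> {
    let picked = pick_runnable(threads)?;
    for (idx, t) in threads.iter_mut().enumerate() {
        if idx == picked {
            t.steps_starved = 0;
        } else if t.state == State::Ready {
            t.steps_starved = t.steps_starved.saturating_add(1);
        }
    }
    Some(picked)
}

/// Hypothetical helper: count wins per thread over `visits` visits.
fn count_picks(visits: usize) -> [usize; 3] {
    let mut threads = [
        GuestThread { priority: 1, state: State::Ready, steps_starved: 0 },   // spinner
        GuestThread { priority: 0, state: State::Ready, steps_starved: 0 },   // submitter
        GuestThread { priority: 2, state: State::Blocked, steps_starved: 0 }, // blocked hub
    ];
    let mut wins = [0usize; 3];
    for _ in 0..visits {
        if let Some(i) = begin_slot_visit(&mut threads) {
            wins[i] += 1;
        }
    }
    wins
}
```

Under these assumptions the lower-priority submitter wins one visit
after being passed over 4096 times, then falls back behind the spinner.
The result depends only on counters, priorities and indices, so
repeated runs give identical pick sequences.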

Cascade (lockstep exec, XENIA_CACHE_PERSIST=1, -n 200M):
- submitter dequeue sub_82458508 now fires 4x (was 3x); the 4th job
  (buf 0x40baa2c0) is dequeued at cycle 6.15M.
- hub tid5 leaves Blocked(0x1080) -> now Ready (no more INFINITE wait).
- GPU packets 0 -> 116,101,363 (command stream now flowing).
- tid13 (silph::UImpl) advances past the old 0x1078 wedge to a NEW
  downstream wait (handle 0x10a0); 3 new threads spawn (tid14/15/16).
- draws_seen still 0 -> the splash's first draw is a NEW downstream gate,
  not this starvation.

Determinism: two cold lockstep `check -n 5M` runs are byte-identical
(full and stable digests). The new n50m stable digest is deterministic
across two cold runs. Golden re-baselined: instructions
50000007->50000003, imports 92317->90296 (trajectory shift from the
changed pick order).

Tests: 666/666 (+1 test_anti_starvation_bounded_progress).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-13 10:02:02 +02:00