---
name: xenia-rs scheduler architecture (post-Axis-1-to-5 refactor, 2026-04-23)
description: Canonical scheduler model - 6 HW slots × per-slot priority runqueues, single host thread, GuestThread as first-class, ThreadRef identity, bind-and-migrate affinity. Supersedes the old HwThread[32] one-thread-per-slot model.
type: project
originSessionId: a178fdd6-2965-4652-903a-f684cf80835d
---

## Model in one paragraph

A single host thread runs the interpreter (`GuestMemory` pinned). The scheduler has **6 `HwSlot`s**, matching Xenon hardware. Each slot holds `runqueue: Vec<GuestThread>` plus `running_idx: Option` (an optional index into that runqueue). A `GuestThread` owns its `PpcContext` inline: the live register file is always the one on whichever thread the slot has marked as running, so a context switch is just a `running_idx` flip (no memcpy). Unlimited guest threads per slot.

## Identity

`ThreadRef { hw_id: u8, idx: u16 }`: a 4-byte positional identity (including alignment padding) used across the boundary. The waiter lists in `KernelObject::{Event,Semaphore,Mutex,Thread}`, along with `state.cs_waiters`, `interrupts.injected_ref`, and `scheduler.timed_waits`, all store `ThreadRef` (not a raw `hw_id`). After a `swap_remove` (Axis 4 migration), refs are fixed up via `MigrationFixup::apply`.

## Compat accessors (how ~30 call-sites survived the data-model refactor)

`scheduler.ctx(hw_id) / ctx_mut(hw_id) / ctx_mut_ref(r) / state(hw_id) / tid(hw_id) / thread_handle(hw_id) / suspend_count_mut(hw_id) / current_hw_id()`: each resolves through `slots[hw_id].running_idx`. A safe sentinel (`idle_ctx`) is returned when `running_idx` is `None`. This let the refactor avoid rewriting every `hw_threads[i].ctx` site in [main.rs](xenia-rs/crates/xenia-app/src/main.rs) and [exports.rs](xenia-rs/crates/xenia-kernel/src/exports.rs).

## Scheduling

- **`HwSlot::pick_runnable`**: picks the highest-priority Ready/ServicingIrq thread; ties go to the lowest idx.
- **`Scheduler::round_schedule`**: emits slot ids in rotating order starting from `rotation_cursor`, filtered by the `non_empty_runnable: u8` bitset.
  Empty-slot fast path. `OrderMode::Seeded` layers Fisher-Yates on top of the filtered list.
- **`Scheduler::begin_slot_visit(hw_id)`**: called by main.rs at the top of each slot iteration; picks a runnable thread, sets `running_idx`, and writes `self.current: Option<ThreadRef>`.
- **`Scheduler::decrement_quantum()`**: Axis 3 per-instruction tick; when the quantum hits zero, reloads it to `QUANTUM_DEFAULT = 50_000` and rotates within the same-priority tier (observed next round, not mid-instruction).

## Affinity + priority (Axis 4/5 wire-up)

- **`KeSetAffinityThread(handle, mask) -> old_mask`** does real migration: `set_affinity_ref` finds the thread and updates its mask; if the current slot is no longer allowed, it `swap_remove`s the thread from the source slot, pushes it onto the least-depth allowed slot, rewrites `PCR+0x2C`, and returns a `MigrationFixup`. `KernelState::set_affinity` then walks every waiter list and applies the fixup.
- **Self-migration handling**: if the migrating thread is `scheduler.current`, the ref is updated in place. `call_export`'s post-call ctx restore re-reads `current` (not the stashed entry ref), so ctx lands on the new slot. `main.rs`'s post-export `pc = lr` advance uses `post_ref = scheduler.current` for the same reason.
- **`KeSetBasePriorityThread` / `KeQueryBasePriorityThread`** store/read `GuestThread.priority: i32`. NT-style range [-15..+15], default 0. Drives `pick_runnable`.
- **`KeSetIdealProcessor` / `KeQueryIdealProcessor` / `NtSetInformationThread`** (classes 2/3/13) are wired; the ideal processor is a spawn-placement hint (not migrate-on-change).

## Lifecycle details

- `exit_current` flips the state to `Exited(code)` but does NOT call `Vec::remove` (that would invalidate peer `ThreadRef`s). Pruning happens at `spawn` time via `prune_exited_if_needed` when a slot reaches `PRUNE_DEPTH_THRESHOLD = 4`.
- `install_initial_thread` on `Scheduler` lives next to `spawn`; both write `PCR+0x2C = hw_id` via the `PcrWriter` trait (impl `GuestMemoryPcr` in [state.rs](xenia-rs/crates/xenia-kernel/src/state.rs)).
- `KernelObject::Thread.waiters: Vec<ThreadRef>` (not a `Vec` of raw `hw_id`s): necessary for correctness under per-slot runqueues.

## Known caveat (2026-04-23)

Axis 4's real migration distributes Sylpheed's workers across slots differently than the old 32-slot one-per-slot model. The resulting wait/signal chain trips a single `scheduler.deadlock_recoveries` event during boot; the default force-wake recovery resolves it and the game progresses to **VdSwap=2** (up from pre-Axis-4's 1). Under `--halt-on-deadlock` this trips `scheduler.deadlock_halts = 1` at ~7.5M cycles.

The issue is a latent HLE sync-primitive gap exposed by correct migration, not an Axis 4 defect. Root cause: one of the blocking events for tid=1/3/4/7 isn't being signaled by its expected source after the thread layout changes. Track it down by instrumenting the specific handle values (0x10FC, 0x1014, 0x1104, 0x10DC/0x10F0) in a future session.

## Files

- [xenia-cpu/src/scheduler.rs](xenia-rs/crates/xenia-cpu/src/scheduler.rs): workhorse (~35 tests covering all 5 axes)
- [xenia-kernel/src/state.rs](xenia-rs/crates/xenia-kernel/src/state.rs): `KernelState::set_affinity` orchestrator, `call_export` ctx swap via `ThreadRef`
- [xenia-kernel/src/exports.rs](xenia-rs/crates/xenia-kernel/src/exports.rs): `ke_set_affinity_thread` (0x97), `ke_set_base_priority_thread` (0x99), `ke_query_base_priority_thread` (0x81), `ke_set_ideal_processor` (0x98), `ke_query_ideal_processor` (0x82), `nt_set_information_thread` (0xFB)
- [xenia-kernel/src/objects.rs](xenia-rs/crates/xenia-kernel/src/objects.rs): waiter lists as `Vec<ThreadRef>`
- [xenia-kernel/src/interrupts.rs](xenia-rs/crates/xenia-kernel/src/interrupts.rs): `injected_ref: Option<ThreadRef>` (not `injected_hw: u8`)

## Metrics added

- `scheduler.spawn.ok`: successful spawns
- `scheduler.spawn.rejected`: spawn failures (should stay 0)
- `scheduler.deadlock_recoveries`: force-wake events (non-zero post-Axis-4; see caveat)
- `scheduler.deadlock_halts`: halts under `--halt-on-deadlock`
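
## Sketch: `pick_runnable` and the `swap_remove` fixup

The selection rule from the Scheduling section (highest priority among Ready/ServicingIrq, ties to the lowest idx) and the ref rewrite needed after a `swap_remove` migration can be illustrated with a minimal, self-contained sketch. This is not the code in scheduler.rs: the field sets, the `ThreadState` variants other than those named above, the `Fixup` struct (a stand-in for `MigrationFixup`), and the free function `migrate` are all assumptions made for illustration. It omits `running_idx`, `PpcContext`, affinity masks, and the `PCR+0x2C` write, and it assumes the destination slot differs from the source slot.

```rust
// Illustrative sketch only. Names other than ThreadRef, GuestThread, HwSlot
// and pick_runnable are hypothetical, and all field layouts are assumed.

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct ThreadRef {
    pub hw_id: u8,
    pub idx: u16,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ThreadState {
    Ready,
    ServicingIrq,
    Blocked, // assumed variant standing in for any non-runnable state
    Exited(u32),
}

#[derive(Debug)]
pub struct GuestThread {
    pub tid: u32,
    pub priority: i32, // NT-style, -15..=15, default 0
    pub state: ThreadState,
}

#[derive(Debug, Default)]
pub struct HwSlot {
    pub runqueue: Vec<GuestThread>,
}

impl HwSlot {
    /// Highest-priority Ready/ServicingIrq thread; ties go to the lowest index.
    pub fn pick_runnable(&self) -> Option<usize> {
        let mut best: Option<(usize, i32)> = None;
        for (idx, t) in self.runqueue.iter().enumerate() {
            let runnable = matches!(t.state, ThreadState::Ready | ThreadState::ServicingIrq);
            // Strict `>` keeps the earliest index among equal priorities.
            if runnable && best.map_or(true, |(_, p)| t.priority > p) {
                best = Some((idx, t.priority));
            }
        }
        best.map(|(idx, _)| idx)
    }
}

/// Hypothetical stand-in for `MigrationFixup`: which refs changed, and to what.
#[derive(Debug, PartialEq, Eq)]
pub struct Fixup {
    /// The migrated thread: (old ref, new ref).
    pub moved: (ThreadRef, ThreadRef),
    /// The thread `swap_remove` relocated into the vacated index, if any.
    pub backfilled: Option<(ThreadRef, ThreadRef)>,
}

impl Fixup {
    /// Rewrite one stored ref. The two cases are exclusive, so a ref is
    /// never rewritten twice.
    pub fn apply(&self, r: &mut ThreadRef) {
        if *r == self.moved.0 {
            *r = self.moved.1;
        } else if let Some((from, to)) = self.backfilled {
            if *r == from {
                *r = to;
            }
        }
    }
}

/// Move the thread at `from` to the end of slot `to_hw` (assumed != from.hw_id).
pub fn migrate(slots: &mut [HwSlot], from: ThreadRef, to_hw: u8) -> Fixup {
    let src = &mut slots[from.hw_id as usize];
    let last = src.runqueue.len() - 1;
    // swap_remove fills the hole with the last element instead of shifting.
    let thread = src.runqueue.swap_remove(from.idx as usize);
    let backfilled = if from.idx as usize != last {
        Some((ThreadRef { hw_id: from.hw_id, idx: last as u16 }, from))
    } else {
        None
    };
    let dst = &mut slots[to_hw as usize];
    dst.runqueue.push(thread);
    let new_ref = ThreadRef { hw_id: to_hw, idx: (dst.runqueue.len() - 1) as u16 };
    Fixup { moved: (from, new_ref), backfilled }
}
```

In this sketch every stored `ThreadRef` (waiter lists, timed waits, and so on) would be passed through `Fixup::apply` once after a migration. Only two refs can change: the migrated thread's, and that of the thread which previously sat at the end of the source runqueue.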