# Ours threading model: Phase C+23 characterization

Re-verifies xenia-rs's threading model in the current tree (HEAD per session start). Source-of-truth files re-read this session:

- `xenia-rs/crates/xenia-cpu/src/scheduler.rs` (2094 lines)
- `xenia-rs/crates/xenia-kernel/src/state.rs` (2383 lines)
- `xenia-rs/crates/xenia-kernel/src/exports.rs` (9370 lines)
- `xenia-rs/crates/xenia-kernel/src/contention_manifest.rs` (342 lines)

## 1. Threading abstraction: single host thread, 6 cooperative HW slots

`scheduler.rs` defines `HW_THREAD_COUNT` and `Scheduler::round_schedule` (line 730). The Scheduler holds 6 `HwSlot` runqueues; each runqueue holds N guest XThreads. There is **no host `std::thread` per guest thread**. The single host thread that owns the CPU walks the slots in `rotation_cursor` order, picks the highest-priority Ready thread per slot, executes a quantum's worth of guest instructions, and moves on.

Compared to canary's 1-host-per-1-guest model, this is *cooperative* in two senses: only one guest thread runs at a time (no true SMP), and context switches happen only at well-defined emulator boundaries (quantum exhaustion, explicit park, end-of-step).

## 2. OrderMode enum (scheduler.rs:232)

```rust
pub enum OrderMode {
    Fixed,                      // default; ours digest e1dfcb15…
    Seeded { seed: u64 },       // pseudo-random shuffle of the round
    ScanQuantum { ticks: u32 }, // Stage 0 spike, landed but null-result
}
```

The mode is selected via the `XENIA_SCHED_ORDER` env var (`from_env` at line 244) and defaults to `Fixed`. The `XENIA_SCHED_QUANTUM` env var additionally supplies the `ScanQuantum` reload value.

There is no `ContentionReplay` variant in the current source. The Phase D Stage 3 work instead landed a manifest consultation *inside* `rtl_enter_critical_section` (exports.rs), not a new `OrderMode`. In the planner's hindsight, putting it in `OrderMode` would be cleaner; this is documented as a deviation from the original plan.

## 3. Per-slot quantum + decrement_quantum (scheduler.rs:800)

`decrement_quantum` decrements the running thread's `quantum_remaining`. When it reaches zero, it reloads (per `quantum_for(order)` at line 793) and scans the slot's runqueue for a *same-priority* Ready peer to rotate to. If no peer exists, no rotation happens and the quantum reload is benign.

Stage 0 (2026-05-18) sweep validated:

- Fixed → ours digest `ba5b5e07…` (since Stage 0 baseline; prior baseline was `e1dfcb15…` before Stage 0 changed default-mode emission).
- ScanQuantum × [10, 50, 200, 1000, 5000, 10000] → all byte-identical to Fixed default.

**Why**: tid=1 is alone on slot 0 during boot, so there is no peer to rotate to regardless of quantum. Option B (forced-yield across slots) would face the same constraint (and was skipped).

The lesson: rotating *within* a slot doesn't help; tid=1's monolithic boot region has no other thread on its slot to rotate to.

## 4. park_current / wake_ref (scheduler.rs:840)

`park_current(BlockReason)` is the canonical primitive for parking the currently-running thread. Used by:

- `RtlEnterCriticalSection` parking on `BlockReason::CriticalSection(cs_ptr)` (exports.rs ~2927).
- `KeWaitForSingleObject` parking on `BlockReason::WaitSingle(handle)`.
- Other primitives.

The wake side calls `Scheduler::wake_ref(ref)`, which transitions `HwState::Blocked` → `HwState::Ready` and re-marks the slot's `non_empty_runnable` mask. FIFO queues for each blocking object (`cs_waiters[cs_ptr]` etc.) live in `kernel-state.rs`-style data.

Key property: parking + waking is deterministic per (host run, input), because every cross-thread interaction goes through the Scheduler, which has no host-OS dependency.

## 5. rtl_enter_critical_section (exports.rs:2886-2946)

Re-read for Phase C+23 verification. Branches:

1. `owner == 0 || !owner_is_live` → claim uncontended.
2. `owner == current_tid` → recursive bump.
3. otherwise → push self onto `cs_waiters[cs_ptr]`, `park_current(BlockReason::CriticalSection(cs_ptr))`.
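
The three branches can be sketched as a small standalone model. This is an illustrative toy, not the code in `exports.rs`: `ToyKernel`, `ToyCs`, `EnterOutcome`, `live_tids`, and the `recursion` field are hypothetical names introduced here, and the `Parked` outcome merely stands in for the real `park_current(BlockReason::CriticalSection(cs_ptr))` call.

```rust
use std::collections::{HashMap, VecDeque};

/// Which of the three branches an enter attempt took (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub enum EnterOutcome {
    ClaimedUncontended,
    RecursiveBump,
    Parked,
}

/// Toy critical-section record; `owner == 0` means unowned (assumption).
#[derive(Default)]
pub struct ToyCs {
    pub owner: u32,
    pub recursion: u32,
}

/// Toy stand-in for the kernel state that holds the FIFO waiter queues.
#[derive(Default)]
pub struct ToyKernel {
    pub sections: HashMap<u32, ToyCs>,
    pub cs_waiters: HashMap<u32, VecDeque<u32>>,
    pub live_tids: Vec<u32>,
}

impl ToyKernel {
    pub fn enter(&mut self, cs_ptr: u32, current_tid: u32) -> EnterOutcome {
        let owner = self.sections.entry(cs_ptr).or_default().owner;
        let owner_is_live = self.live_tids.contains(&owner);
        let cs = self.sections.get_mut(&cs_ptr).expect("inserted above");

        if owner == 0 || !owner_is_live {
            // Branch 1: unowned, or the owner thread is gone: claim it.
            cs.owner = current_tid;
            cs.recursion = 1;
            EnterOutcome::ClaimedUncontended
        } else if owner == current_tid {
            // Branch 2: re-entry by the current owner.
            cs.recursion += 1;
            EnterOutcome::RecursiveBump
        } else {
            // Branch 3: no spin; queue in FIFO order and park.
            self.cs_waiters
                .entry(cs_ptr)
                .or_default()
                .push_back(current_tid);
            EnterOutcome::Parked
        }
    }
}
```

In this model, a second `enter` by the owning tid takes the recursive branch, while an `enter` by a different live tid appends that tid to the back of `cs_waiters[cs_ptr]`, which is what makes the wake order FIFO.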

**No spin loop.** Goes straight to park. This is the deliberate asymmetry vs canary's `cs->header.absolute*256` spin. Documented and intentional: adding spin to ours would not help; the only way ours "contends" is if a peer thread has the lock at the exact moment ours' tid=1 reaches the call. In the boot region around event 104,604, ours tid=1 is the only runnable thread on slot 0, so no peer is even Ready to take the CS first. So ours invariably fast-paths.

## 6. Contention manifest loader (contention_manifest.rs)

Phase D Stage 3 landed `crates/xenia-kernel/src/contention_manifest.rs` (342 LOC) with `consume_at_peek(tid, peek_idx)`, which translates ours' per-tid idx back to canary's idx space (subtracts prior `contention.observed` emits). The `XENIA_CONTENTION_MANIFEST_PATH` env var opts in.

Per the Stage 3+4 result: replay-mode digest `1d7c6b45…` stable × 3 cold runs, but main matched-prefix **still 104,607**. The manifest's forced-contention entries fire at wrong logical positions because the divergence is upstream of any contention event.

This is a critical input to C+23's recommendation: the Phase D replay infrastructure is built and stable, but it does NOT unblock the 104,607 cap. The actual cap-unblock came from the D-extension diff-tool absorber (band-aid, Phase D 2026-05-18). The structural fix never landed and has no clear next step.

## 7. Existing determinism guarantees

- Default-mode ours cold digest **`ba5b5e07…`** × 3 reproducible (Stage 0 / Phase D baseline). Prior `e1dfcb15…` baseline is the C+19 era constant; the Stage 0 emission tweak shifted it without changing logic.
- Phase B `image_loaded_sha256 ea8d160e…` unchanged across all 23+ phases.
- All emitted Phase A events are stable on (input, cvars).

## 8. Mismatch surfaces with canary

| dimension | canary | ours |
|---|---|---|
| host threads | 1 per XThread | 1 total |
| inter-thread arbiter | host OS | Scheduler |
| RtlEnterCS spin | spin then wait | park immediately |
| Clock | wallclock (rdtsc) | fixed FILETIME `132_500_000_000_000_000` |
| Wait wakeup ordering | pthread_cond_broadcast race | FIFO `cs_waiters` |
| Yield primitive | host yield | `decrement_quantum` rotation |

Of these, the **clock** and the **wait wakeup ordering** are the two surfaces beyond CS-contention where canary→ours divergence has potential to surface. So far Sylpheed exercises them lightly: 2 KeQuerySystemTime calls, 34 wait.begin events total.

## 9. Existing scheduler cvars / lockstep modes

There is no `lockstep` cvar in ours. The closest mode is `OrderMode::Fixed` (default), which produces a deterministic schedule keyed entirely on the spawn/wake sequence. Replay via manifest is opt-in via `XENIA_CONTENTION_MANIFEST_PATH`.

## 10. Implication: ours is the strict side

In any cross-engine deterministic-replay scheme, **ours has to bend toward canary**, not the other way. Canary's host-OS scheduling cannot be tamed without rewriting it (out of scope; would also invalidate it as the oracle, since the "real" Xbox 360 wasn't deterministic in this sense either). The Phase D plan's H'/H broad landed Stages 1-4 of this bend: the engine infrastructure is built, just not load-bearing for the 104,607 cap.