# Investigation Notes: Scheduler-Determinism Plan (2026-05-18)

Source citations and probe results from the Phase-1 investigation. All claims here are verified against source or runtime data; speculation is flagged.

## 1. Canary threading & scheduling model

**Verdict**: 1-host-thread-per-XThread; scheduling delegated to host OS (Wine on Linux). No internal scheduler.

- Each guest `XThread` owns a host `xe::threading::Thread` (`xenia-canary/src/xenia/kernel/xthread.h:476`).
- POSIX backend: pthread per XThread (`xenia-canary/src/xenia/base/threading_posix.cc`).
- TLS bridge: `thread_local XThread* current_xthread_tls_` (`xthread.cc:105`). `XThread::TryGetCurrentThread()` returns null when called outside a guest thread (C+15-α robustness fix for the boot-time emitter).
- Tid assignment: `thread_id_(++next_xthread_id_)` in ctor (`xthread.cc:62`).
- KPCR per XThread, allocated at `pcr_address_` (`xthread.h:506`); contains scheduler-like state mirroring real Xenon KPRCB.
- `CheckQuantumAndDecay()` (`xthread.h:437`) fires every ~20ms via `KernelState`'s timer. It simulates Xenon priority decay but does NOT preempt; it runs on whichever host thread the host OS schedules.

**No internal scheduler.** No `lockstep`, `deterministic`, `replay` cvar (grep confirmed across `xenia-canary/src/xenia/`).

## 2. Canary clock infrastructure

**Verdict**: wallclock-driven (rdtsc or platform API). Optional scaling, no full deterministic mode.

- Canonical class `xe::Clock` (`base/clock.h:30`).
- `Clock::QueryHostTickCount()` (`base/clock.cc:128`): rdtsc on x64 if `clock_source_raw=true`, else platform API.
- `Clock::QueryGuestSystemTime()` (`clock.h:82`): host time adjusted by `guest_time_scalar_`.
- `KeQuerySystemTime_entry` (`xboxkrnl_threading.cc:459`): declared `void`, writes via OUT pointer; reads `Clock::QueryGuestSystemTime()`. (C+1 verified parity with the void-export framing in ours.)
- `KeWaitForSingleObject_entry` (`xboxkrnl_threading.cc:1003`): reads `*timeout_ptr` as i64×100 → ns (C+23 verified ours computes the same value).
- Cvars: `clock_no_scaling` (`base/clock.cc:24`), `clock_source_raw` (`base/clock.cc:28`). Neither makes the clock deterministic across Wine runs; wallclock drift is irreducible.

## 3. Canary wait primitives

**Verdict**: `xe::threading::Wait` → `pthread_cond_timedwait` (POSIX) / `WaitForMultipleObjects` (Win32).

- `xeKeWaitForSingleObject()` (`xboxkrnl_threading.cc:969`) → `XObject::Wait()` → `xe::threading::Wait()` → host primitive.
- Whether contention happens is purely host-OS-scheduler-driven. Reading-error #32 from C+20 documents this: 3 fresh canary cold runs at tid=6 idx 104,606 showed different patterns (no wait.begin / wait.begin contended / offset-shifted).

## 4. Canary RtlEnterCriticalSection: spin-then-wait (DISCOVERED)

[xenia-canary/src/xenia/kernel/xboxkrnl/xboxkrnl_rtl.cc:596-633](../../../xenia-canary/src/xenia/kernel/xboxkrnl/xboxkrnl_rtl.cc), `RtlEnterCriticalSection_entry`:

```c
uint32_t spin_count = cs->header.absolute * 256;  // game-supplied spin count
if (cs->owning_thread == cur_thread) { recursion++; return; }
while (spin_count--) {
  if (xe::atomic_cas(-1, 0, &cs->lock_count)) { /* acquired via spin */ break; }
}
if (xe::atomic_inc(&cs->lock_count) != 0) {
  xeKeWaitForSingleObject(...);  // slow path
}
cs->owning_thread = cur_thread;
cs->recursion_count = 1;
```

**Implication**: under low contention, spin succeeds and no `wait.begin` is emitted. Under high contention, spin fails and `wait.begin` fires. Whether spin succeeds depends on host-OS timing, which is non-deterministic across Wine runs.

## 5. Ours threading & scheduling

**Verdict**: single host thread; 6 cooperative HW slots; deterministic by construction.

- `xenia-rs/crates/xenia-cpu/src/scheduler.rs`:
  - `OrderMode { Fixed, Seeded { seed } }` (lines 230-258).
  - `round_schedule()` (lines 710-740): returns slot-id vector; advances `rotation_cursor` by 1.
  - `park_current(BlockReason)` (line 808).
  - `wake_ref(ThreadRef)` (line 831).
- M3 optional `--parallel` mode (6 workers + coordinator, 7-party phaser) exists but is not default.

**Determinism foundation**: 23 phases of stabilization invested in `e1dfcb15…` cold digest × 3 reproducible.

## 6. Ours RtlEnterCriticalSection: NO spin

[xenia-rs/crates/xenia-kernel/src/exports.rs:2886-2946](../../../xenia-rs/crates/xenia-kernel/src/exports.rs), `rtl_enter_critical_section`:

```rust
let owner = mem.read_u32(cs_ptr + CS_OFFS_OWNING_THREAD);
let owner_is_live = owner != 0 && state.scheduler.find_by_tid(owner).is_some();
if owner == 0 || !owner_is_live {
    /* claim immediately: write owning_thread, lock_count=0, recursion=1 */
    return;
}
if owner == current_tid {
    /* recursive lock: increment counts */
    return;
}
// Truly contended against a live peer: park IMMEDIATELY (no spin).
state.cs_waiters.entry(cs_ptr).or_default().push(current_ref);
state.scheduler.park_current(BlockReason::CriticalSection(cs_ptr));
```

**Asymmetry summary**: canary spins ~256×N times before parking; ours parks immediately. Under the cooperative scheduler, tid=1 in ours runs monolithically until it parks, so no other thread has a chance to acquire the CS first. Hence at 104,607, the CS is free when tid=1 tries, while in canary it was held by another thread that got scheduled in between.

## 7. Ours clock infrastructure

**Verdict**: fixed FILETIME constant. No wallclock dependency in the hot path.

- `KeQuerySystemTime` returns `132_500_000_000_000_000` (~2021) via OUT-ptr (`exports.rs:628`).
- `KeQueryInterruptTime` returns `0x0000_0001_0000_0000` (`exports.rs:504`).
- `event_log.rs` uses `Instant::now()` for the observability `host_ns` field; this is non-deterministic but not consumed by the matched-prefix metric.

## 8. Sylpheed workload profile (probe)

Ran on `xenia-rs/audit-runs/phase-c22-rtl-enter-leave-control-flow/ours-cold.jsonl` (121,569 events):

| event | count | notes |
|---|---|---|
| RtlEnterCriticalSection (kernel.call) | 19,494 | ≈80% of all kernel.calls |
| RtlLeaveCriticalSection (kernel.call) | 19,492 | matches Enter (off-by-2 from boot edge) |
| NtClose | 160 | |
| NtCreateEvent | 103 | |
| NtReleaseSemaphore | 99 | |
| NtQueryInformationFile | 93 | |
| NtWaitForMultipleObjectsEx | 92 | |
| KeWaitForSingleObject | 5 | |
| KeWaitForMultipleObjects | 1 | |
| **KeQuerySystemTime** | **2** | clock-light workload |
| KeQueryPerformanceFrequency | 6 | |
| KeQueryPerformanceCounter | 0 | |
| KeQueryInterruptTime | 0 | |
| KeDelayExecutionThread | 0 | |
| NtYieldExecution | 0 | |
| wait.begin events (all kinds) | 34 | most with `timeout_ns=-1` (indefinite) |

**Implications**:

- Sylpheed is CS-dominated. Stage-1 emitter on RtlEnterCS captures the dominant signal.
- Sylpheed barely touches the clock. Approach A (cycle clock in canary) addresses ≈2 events out of 121,569. Wrong target.
- Wait surface is small (34 events). Wait-side replay is low-value; scope to CS only.

## 9. The 104,607 divergence (re-verified)

From C+22 memory + jitter jsonl re-analysis:

| sample | tid=6 events 104,604..104,615 (import.call only) |
|---|---|
| c21 archived | E E L L |
| canary jitter-1 | E (wait.begin slow path) E L L |
| canary jitter-2 | E E L L |
| canary jitter-3 | (shifted) E E L L |
| fresh c22 | E (wait.begin slow path) E L L |

All canary samples have the EXTRA nested RtlEnterCriticalSection (second `E` before the final `L L`). Ours never does; it goes `E L NtClose`. Structural divergence post-absorber-engagement.

Shared dispatcher: canary's wait.begin `handles_semantic_ids=['75ae880ec432eb36']`. This is the CS embedded Event dispatcher, lazy-wrapped by `XObject::GetNativeObject`. Same SID computed via C+18 shared-global recipe in both engines.

## 10. Cvar inventory (canary side)

Grep across `xenia-canary/src/xenia/` for `DEFINE_bool|DEFINE_int|DEFINE_uint|DEFINE_string`:

- `clock_no_scaling` (`base/clock.cc:24`)
- `clock_source_raw` (`base/clock.cc:28`)
- `ignore_thread_priorities` (`kernel/xthread.cc:30`)
- `ignore_thread_affinities` (`kernel/xthread.cc:33`)
- `stack_size_multiplier_hack` (`kernel/xthread.cc:37`)
- `main_xthread_stack_size_multiplier_hack` (`kernel/xthread.cc:39`)
- `phase_a_event_log_path` (`cpu/cpu_flags.cc:84`): Phase A trace gate
- `phase_a_event_log_mem_writes` (`cpu/cpu_flags.cc:88`): reserved, not wired
- `phase_b_snapshot_dir` (`cpu/cpu_flags.cc:94`): Phase B image snapshot
- `phase_b_snapshot_and_exit` (`cpu/cpu_flags.cc:100`)

No `lockstep`, `deterministic`, `replay`, `single_thread`, `cooperative` cvars exist. **No built-in deterministic mode.**

## 11. Diff-tool absorber state (post-C+21)

`xenia-rs/tools/diff-events/diff_events.py` (767 LOC):

- `collect_shared_global_sids()`: pre-pass union of (a) recipe-matching SIDs (C+18) and (b) a cross-tid usage heuristic: any SID used by handle.create OR wait.begin on ≥2 distinct tids.
- `is_shared_global_wait_begin()`: classifies a wait.begin as floating if any handle_sid is in the shared-global set.
- `diff_one_tid()`: floating-absorbs `handle.create` (C+18) and `wait.begin` (C+21) on kind mismatches.
- `SKIP_PAYLOAD_FIELDS_BY_KIND`: skips engine-local fields per kind.

**Reading-error #23 boundary**: absorbing the post-wait Enter/Leave block (canary's extra `E` then `L` at 104,610-104,615) would be folding real guest behavior, not transient observation. The plan's Stage 3 instead makes ours produce the same observation by forcing ours into the same contended state.

## 12. Tid-chain mapping (stable per memory baseline)

| canary | ours |
|---|---|
| 6 | 1 |
| 4 | 11 |
| 7 | 2 |
| 12 | 7 |
| 14 | 9 |
| 15 | 10 |

This is a *display* convention for cross-engine alignment in diff reports. In the wire format, each engine emits its native tid.
The manifest in Stage 2-3 keys on the source-side native tid; no translation is needed since each side consumes the events it produced.

## 13. Methodology rules in force

- **Reading-error #28** (verify source first): applied; read both engines' RtlEnterCS implementations before designing.
- **Reading-error #32** (canary non-deterministic in contention regions): characterized; 3 jitter samples documented.
- **Reading-error #33** (canary cache lives in binary-dir under wine): not relevant here.
- **Reading-error #34** (use `.iso` not loose `.xex`): apply in all validation runs.
- **Cold-vs-cold protocol**: canary `--mute=true`, ours `XENIA_CACHE_WIPE=1`.
- **Stop hook rename**: rename background binaries before any backgrounded run (e.g. `xrs-verify-stage0`, `xrs-replay`).

## 14. Confidence calibration

| claim | source-verified | probe-verified | confidence |
|---|---|---|---|
| Canary spins, ours doesn't | yes (xboxkrnl_rtl.cc:613 + exports.rs:2927) | n/a (static) | high |
| Sylpheed clock-light | n/a | yes (kernel.call counts) | high |
| 104,607 divergence is structural | yes (C+22 mech) | yes (5 canary samples consistent) | high |
| C+18 shared-global SID is cross-engine identical | yes (event_log.rs + event_log.cc) | implicit (matched in diff reports) | high |
| Canary has no deterministic mode cvar | yes (grep) | n/a | high |
| Stage-0 quantum spike may unblock | no (untested) | no | medium |
| Stage-3 manifest replay unblocks | no | no | medium-high (mechanism sound, integration risk) |
| Sister chain regression ≤5 acceptable | n/a | n/a | open question for user |

## Open unknowns (deferred to implementation)

1. The exact `cs_ptr` of the contended CS at canary tid=6 idx 104,608 is not directly emitted by the current schema (the `wait.begin` payload carries SID but not the raw pointer). Stage 1's `cs_ptr` field plugs this gap.
2. Does Sylpheed initialize the contended CS with `RtlInitializeCriticalSectionAndSpinCount(spin_count > 0)` or just `RtlInitializeCriticalSection` (zero spin)? This affects whether canary's spin path can succeed at this site. Probe by reading the CS's `header.absolute` field during a canary run.
3. The dispatcher Event's first-toucher tid differs across cold runs (canary tid=9 in one, others in others). Is this stable enough across cold runs of the SAME canary binary to be a reliable replay anchor? Stage 1 round-trip validation will reveal.
4. Does the M3 `--parallel` mode in ours reproduce the same divergence pattern? Untested. Out of scope for this plan but worth a future probe.
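
To make open unknown 2 concrete, the following is a minimal Rust sketch of how the spin budget from the section 4 excerpt (`header.absolute * 256`) would decide between the spin path and the `wait.begin` slow path for a contended acquisition. It is a simplified model, not code from either engine: the names `Acquire`, `spin_budget`, `model_contended_acquire` and the parameter `failed_attempts_before_release` are hypothetical, and host timing is reduced to a single number.

```rust
/// Outcome of one *contended* acquisition attempt in this simplified model.
/// (Hypothetical type; neither engine defines it.)
#[derive(Debug, PartialEq, Eq)]
enum Acquire {
    /// A CAS inside the spin loop succeeded; no wait.begin would be emitted.
    Spin,
    /// The spin budget ran out; the slow path (wait.begin) would be taken.
    SlowPath,
}

/// Spin budget as computed in the section 4 excerpt: header.absolute * 256.
/// wrapping_mul mirrors the uint32_t arithmetic of the excerpt.
fn spin_budget(header_absolute: u32) -> u32 {
    header_absolute.wrapping_mul(256)
}

/// Models `while (spin_count--) { if (cas succeeds) break; }`.
///
/// `failed_attempts_before_release` is an assumption standing in for host-OS
/// timing: the number of CAS attempts that fail before the current holder
/// releases the CS. `None` means the holder does not release within any
/// spin window.
fn model_contended_acquire(
    header_absolute: u32,
    failed_attempts_before_release: Option<u32>,
) -> Acquire {
    let budget = spin_budget(header_absolute);
    match failed_attempts_before_release {
        // Attempt number n (0-based) is the first that can succeed, and it
        // only happens if the loop runs more than n iterations.
        Some(n) if n < budget => Acquire::Spin,
        _ => Acquire::SlowPath,
    }
}
```

In this model, `model_contended_acquire(0, Some(0))` yields `Acquire::SlowPath`: with `header.absolute == 0` the loop body never runs, so a contended attempt always reaches the slow path regardless of timing. With a non-zero budget the result depends on the timing parameter, which corresponds to the host-OS-dependent behavior described in section 4.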