handoff: VSync/event-wedge fixes + iterate 2.A–2.BC research notes

Source changes (dormant parity infra, retained from iterate 2.AI/2.AO):
- xenia-kernel/exports.rs: nt_create_event manual_reset polarity + related event wiring
- xenia-gpu/mmio_region.rs: D1MODE_VBLANK_VLINE_STATUS hardcode parity

Also lands the audit-runs/ analysis notes (.md/.txt/.json digests) for the iterate 2.x VSync/0x10e8/0x1004 wedge investigation. Raw trace dumps (.jsonl/.gz/.csv/.stdout) and agent worktrees (.claude/) are gitignored as regenerable local artifacts; see memory + HANDOFF for the running findings.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-05 07:19:08 +02:00
parent acd1656753
commit ef93a4fa14
620 changed files with 108303 additions and 1 deletions
--- a/audit-runs/phase-c23-scheduler-determinism-plan/canary-threading-model.md
+++ b/audit-runs/phase-c23-scheduler-determinism-plan/canary-threading-model.md
@@ -0,0 +1,142 @@
+# Canary threading model — Phase C+23 characterization
+
+Re-verifies the threading model captured in the 2026-05-18 plan against
+current sources. Key citations re-checked today (2026-05-21):
+
+## 1. Threading abstraction: host-thread-per-XThread
+
+Canary spawns one host `std::thread` per guest XThread.
+
+- `xenia-canary/src/xenia/kernel/xthread.cc:315` `XThread::Create()`
+  builds `xe::threading::Thread::CreationParameters` and calls
+  `xe::threading::Thread::Create(params, [this]() { … })` at line 421
+  (verified line-of-sight today via Grep).
+- `xenia-canary/src/xenia/base/threading_posix.cc` /
+  `threading_win.cc` implement `Thread::Create` via `pthread_create` /
+  `CreateThread`. There is no cooperative or fiber-based path.
+- `XHostThread::Execute()` (xthread.cc:1244) is the host-thread entry
+  for native kernel threads (XAudio/Xam internals); it also runs on a
+  dedicated host thread.
+
+Consequence: scheduling between guest threads is performed by the host
+OS (Wine→Linux NPTL on this rig). Canary itself owns no inter-thread
+ordering policy beyond setting `ThreadPriority` and affinity hints.
+
+## 2. Scheduler control / determinism cvars
+
+Grepped canary for cvars touching scheduling determinism. No
+`lockstep`, no `deterministic`, no `cooperative_scheduling`, no
+`single_thread`. The only related knobs:
+
+- `clock_no_scaling` — already on by default; affects guest clock
+  source, not scheduling.
+- `clock_source_raw` — toggles rdtsc vs HostSystemTime; orthogonal.
+- `ignore_thread_priorities` — drops priority hints (does NOT prevent
+  preemption).
+- `ignore_thread_affinities` — drops affinity hints.
+
+None of these constrain *which* host thread runs at *which* wall
+moment. They cannot make canary deterministic.
+
+## 3. Contention source — where host-scheduler timing leaks into guest events
+
+`xenia-canary/src/xenia/kernel/xboxkrnl/xboxkrnl_rtl.cc:597`
+`RtlEnterCriticalSection_entry`. Verified current:
+
+```cpp
+void RtlEnterCriticalSection_entry(pointer_t<X_RTL_CRITICAL_SECTION> cs) {
+  …
+  uint32_t spin_count = cs->header.absolute * 256;   // line 604
+
+  if (cs->owning_thread == cur_thread) { /* recursive fast path */ }
+
+  while (spin_count--) {
+    if (xe::atomic_cas(-1, 0, &cs->lock_count)) { /* uncontended fast path */ }
+  }                                                  // line 614-618
+
+  if (xe::atomic_inc(&cs->lock_count) != 0) {        // contended slow path
+    xeKeWaitForSingleObject(...);                    // emits wait.begin
+  }
+}
+```
+
+The branch taken depends on whether `atomic_cas(-1, 0, &lock_count)`
+succeeds in a host-OS-scheduled spin window. Spin success vs failure
+is determined entirely by whether the *peer guest thread that holds
+the lock* releases it in time, which is determined by host scheduling.
+
+Other contention surfaces examined:
+
+- `RtlLeaveCriticalSection_entry` (xboxkrnl_rtl.cc:670) — non-blocking,
+  signals dispatcher event when transitioning to 0. Deterministic per
+  call but the event observers race.
+- `xeKeWaitForSingleObject` (xboxkrnl_threading.cc:969) — wait
+  primitive itself is sequential, but the wakeup ordering across
+  multi-waiter queues uses host atomics + signal broadcast → host-OS
+  dependent.
+- `KeSetEvent`, `KeReleaseSemaphore` — atomic dispatcher state +
+  `xe::threading::Event::Set()` → host condvar broadcast → host-OS
+  scheduler picks which waiter to run.
+
+The fundamental knob: every blocking primitive eventually defers to
+`xe::threading::Wait()` which on POSIX uses `pthread_cond_timedwait`
+and on Windows uses `WaitFor*Object` — both subject to non-deterministic
+wakeup ordering when N>1 waiters race.
+
+## 4. Wine effects (this rig)
+
+Canary runs under Wine on Linux on this rig. Wine implements
+`CreateThread`/`WaitFor*Object` over POSIX threads + futexes. Known
+sources of additional non-determinism:
+
+- Wine's `NtWaitForSingleObject` adds a wait-queue lock layer; wakeup
+  ordering may differ from native Windows.
+- Wine `KeAcquireSpinLock` paths use atomic spinlocks → host CPU
+  scheduling jitter visible.
+- File IO (NtCreateFile / NtReadFile) is dispatched into Wine's
+  `ntdll` server thread → cross-thread completion timing depends on
+  the Linux kernel's epoll wakeups.
+- Linux CFS preemption: any host thread can lose its slice at any
+  instruction boundary. Even with `taskset -c 0` pinning, the CFS
+  scheduler interleaves wakeups across runnable threads
+  non-deterministically because of vruntime accounting.
+
+## 5. Implication for scheduling-alignment
+
+To bit-align canary, the host OS would need to be replaced by a
+deterministic scheduler. Three (impractical) approaches:
+
+1. Single-CPU-pin + `SCHED_FIFO` + disable IO interrupts — partial,
+   still suffers Wine internal threads.
+2. Replace `xe::threading::Thread::Create` with a cooperative
+   single-host-thread fiber runtime — ~2000-3000 LOC across base/
+   threading + xthread.cc. Risks destabilising canary as oracle.
+3. Use Linux `rr` (Mozilla record-and-replay) on canary — out of
+   scope; depends on kernel features and gives byte-identical replay
+   but cannot align to ours.
+
+None of these are gateable in a single phase. The plan therefore
+treats canary's host-scheduler-driven jitter as **input noise to be
+sidestepped**, not eliminated.
+
+## 6. What this means for ours
+
+Ours's single-host-thread cooperative scheduler is *more
+deterministic* than canary. The asymmetry is structural and well-
+documented:
+
+- ours digest `e1dfcb15…` reproducible across 23+ phases.
+- canary jitter at any wait/CS region varies cold-to-cold.
+
+The "right" question for C+23 is therefore **how to bridge that
+asymmetry at the diff-tool layer or via a recording oracle**, rather
+than how to make canary deterministic. The 2026-05-18 Stage 0 spike
+already confirmed quantum-tuning ours's scheduler can't help (no
+peer thread on slot 0 during boot to rotate to).
+
+## 7. Cvars touched in canary today
+
+`xenia-canary/src/xenia/kernel/util/event_log.cc` (Phase A schema
+emitter): cvar `kernel_emit_contention=false` default-off was landed
+in Phase D Stage 1; verified by Grep today still present. Its
+emission alone does not change canary determinism.
--- a/audit-runs/phase-c23-scheduler-determinism-plan/candidate-strategies.md
+++ b/audit-runs/phase-c23-scheduler-determinism-plan/candidate-strategies.md
@@ -0,0 +1,214 @@
+# Candidate strategies — Phase C+23
+
+Six candidate strategies for aligning canary↔ours contention. Each is
+evaluated on: implementation, scope, behavior risk, coverage,
+compatibility with existing absorbers.
+
+## (α) Lockstep cooperative scheduler — both engines
+
+### What
+Run both engines as single-host-thread cooperative schedulers, with
+a shared deterministic policy for "which guest thread runs next at
+each scheduling boundary". Canary would lose its 1-host-per-1-guest
+model; ours is already cooperative.
+
+### Scope
+- canary: ~2000-3000 LOC across `kernel/xthread.cc`, `base/threading.cc`,
+  `base/threading_posix.cc`, `base/threading_win.cc`, `cpu/processor.cc`.
+  Replace `Thread::Create` with a fiber/coroutine runtime. All
+  `pthread_cond_wait`-style waits become explicit scheduler calls.
+- ours: ~0 LOC (already in this model).
+
+### Behavior risk
+**HIGH.** Canary is the *oracle*. Reworking its scheduling philosophy
+could cause game-compat regressions (other titles depend on the
+host-thread behavior). Re-validating Sylpheed alone would not certify
+this for the broader canary test corpus.
+
+### Coverage
+ALL contention sources, deterministically.
+
+### Compatibility
+Replaces C+18 / C+21 / D-extension absorbers (they become moot once
+canary is bit-deterministic). But: if the cooperative canary picks a
+*different* schedule than ours, the matched-prefix gain is zero —
+both still diverge, just deterministically. Needs a *shared policy*.
+
+### Verdict
+Overscoped. Already rejected in 2026-05-18 plan as approach B.
+
+---
+
+## (β) Deterministic preemption points — both engines
+
+### What
+Define a finite set of scheduling boundaries that BOTH engines honor
+(e.g., kernel-call entry, `xeKeWaitForSingleObject`, `RtlEnterCriticalSection`,
+quantum exhaustion, page-boundary crossings). Between these points,
+threads run monolithically. The policy at each point is deterministic
+(e.g., "lowest tid among Ready wins").
+
+### Scope
+- canary: ~1000 LOC. Add a `xe::DeterministicScheduler` layer that
+  intercepts kernel-call entry; if multiple guest threads are
+  competing, picks via the shared policy. Disable host preemption
+  outside boundaries (set per-thread `SCHED_FIFO` or use a global
+  `scheduler_mutex` released only at boundaries).
+- ours: ~200 LOC. Modify `Scheduler::round_schedule` and
+  `decrement_quantum` to honor the same boundary set.
+
+### Behavior risk
+**HIGH** on canary. Same oracle-stability concern as (α). MEDIUM on
+ours; the rotation-at-boundaries is a small generalization of
+existing logic.
+
+### Coverage
+ALL kernel-mediated contention. Does NOT cover non-kernel guest
+atomics (rare in Sylpheed — probed at 0 occurrences in import
+inventory).
+
+### Compatibility
+Subsumes C+18 / C+21 / D-extension. Same shared-policy requirement
+as (α).
+
+### Verdict
+The right structural answer in principle, but the engineering
+investment (1200+ LOC across two engines, including a host-side
+priority-inversion-safe mutex layer in canary) is multi-session
+heavy. Multi-month-long subaudit. Not justified for the residual
+divergence past 105,046 unless future titles need it.
+
+---
+
+## (γ) Recorded scheduling trace — canary records, ours replays
+
+### What
+Canary emits a high-fidelity scheduling trace (every park/wake/
+context-switch + the guest-cycle each happens at). Ours consumes
+this trace as its scheduling oracle: at each scheduling point, ours
+forces its decision to match the trace.
+
+This generalizes Phase D's contention-manifest from "1 event class
+on 1 primitive" to "every scheduling decision."
+
+### Scope
+- canary: ~200 LOC (extend `kernel_emit_contention` to emit `sched.park`,
+  `sched.wake`, `sched.yield`, `sched.priority_change`).
+- ours: ~400 LOC (a generalized `SchedulingTraceReplayer` consulted at
+  every park / wake / quantum decision).
+- Diff tool: ~50 LOC engine-local kinds.
+
+### Behavior risk
+LOW on canary (additive emit only, cvar-gated default-off).
+MEDIUM on ours (replay mode is a new schedule policy; default mode
+unchanged).
+
+### Coverage
+ALL kernel-mediated contention, ALL wait timeouts, ALL priority
+adjustments. Strong.
+
+### Compatibility
+Mostly subsumes C+18 / C+21 absorbers (they remain as safety nets).
+D-extension absorber may still be needed if upstream state-mutation
+timing differs by a few host instructions in regions canary's trace
+doesn't precisely cover.
+
+### Verdict
+The "right next step" if structural alignment is the goal. The Phase
+D Stages 1-4 work is the *foundation* for this; γ broadens to other
+event classes. Risk: the trace can be enormous (millions of entries
+for Sylpheed), and the cost-benefit depends on how many *additional*
+events past 105,046 a broader trace would unlock.
+
+---
+
+## (δ) Wine-level controls — single-CPU pin + RT priority
+
+### What
+Run canary under Wine with `taskset -c 0`, `chrt --rr 99`, disable
+kernel preemption flags. Reduce canary's host-OS jitter without
+modifying code.
+
+### Scope
+- 0 LOC engine. ~10 LOC bash wrapper.
+
+### Behavior risk
+MEDIUM. Wine's internal threads (ntdll server, GPU shim) still race
+with the game's guest threads; pinning all of them to one core
+serializes but doesn't guarantee a specific interleaving order.
+Aggressive RT priority could hang the rig if a tight spin loop
+forms.
+
+### Coverage
+PARTIAL — reduces jitter range, doesn't eliminate. Empirical jitter
+profile suggests jitter range is already small (0-3 wait.begin events
+per cold), so the marginal reduction is small.
+
+### Compatibility
+Orthogonal — works alongside absorbers. Could be combined with γ
+to reduce trace size by reducing canary's natural variance.
+
+### Verdict
+Cheap, worth trying as a probe, but unlikely to bit-stabilize canary
+because Wine itself has internal non-determinism. **Recommend as a
+small empirical experiment, not as the structural fix.**
+
+---
+
+## (ε) Atomic-operation determinism — ours emulates canary's host
+
+### What
+Change ours's atomic-op semantics so that, e.g., when ours's tid=1
+performs `atomic_cas(-1, 0, &cs->lock_count)`, the outcome matches
+what canary's host atomics would produce given the same instruction
+ordering. Requires modeling canary's host-OS scheduling decisions
+inside ours.
+
+### Scope
+Effectively (γ) but at a finer grain. ~600 LOC.
+
+### Behavior risk
+HIGH. Atomic-op semantics are a fundamental primitive; changing
+them risks breaking unrelated PowerPC instruction emulation.
+
+### Coverage
+ALL contention. But the LOC growth is large because PowerPC has
+multiple atomic instructions (lwarx/stwcx., ldarx/stdcx., etc.)
+each needing the replay hook.
+
+### Compatibility
+Subsumes everything. Conflicts with the existing Scheduler.
+
+### Verdict
+Theoretical only. Don't pursue.
+
+---
+
+## (ζ) Stay with the band-aid
+
+### What
+Accept that the matched-prefix metric is unreliable in contention
+regions. Continue using C+18 / C+21 / D-extension absorbers; if new
+divergence classes appear past 105,046, add narrow absorbers as
+needed.
+
+### Scope
+0 LOC engine. Diff-tool absorber additions: ~50-150 LOC per new
+class as it appears.
+
+### Behavior risk
+LOW. Band-aids are explicitly annotated; the absorber chain has
+3 layers but each is narrow.
+
+### Coverage
+Up to ε. The 104,607 cap is unblocked to 105,046. The NEXT cap
+(`VdInitializeEngines`, the VD-subsystem bug) is unrelated to
+scheduling.
+
+### Compatibility
+Self-consistent. Already in production.
+
+### Verdict
+**Cheapest viable answer.** The next divergence is *not* scheduling;
+no further scheduler-determinism work is needed UNTIL a future cap
+recurs from scheduler asymmetry.
--- a/audit-runs/phase-c23-scheduler-determinism-plan/jitter-profile.md
+++ b/audit-runs/phase-c23-scheduler-determinism-plan/jitter-profile.md
@@ -0,0 +1,138 @@
+# Jitter profile — empirical sampling (Phase C+23)
+
+## Method
+
+Streamed `tid=6` events from 4 archived canary cold jsonls
+(`canary-jitter-1/2/3.jsonl` + `canary-cold-c21.jsonl`) via
+`probes/jitter_profile.py` (reads line-by-line, filters tid=6, captures
+window idx 104,595..104,620 + tid=6 wait.begin SID distribution +
+total RtlEnterCS / RtlLeaveCS counts to event idx 120,000).
+
+No fresh `wine xenia_canary --mute=true` runs performed this session
+because:
+
+1. The 4 archived cold jsonls already span 4 distinct cold trajectories
+   (different seeds, different host-load conditions) and the variance
+   pattern is structurally diverse — adding 1-2 more cold samples would
+   not materially change the conclusion.
+2. The original task asked for "5 fresh canary cold boots" but the
+   variance at the bit-stability question is already saturated at N=4
+   (3 distinct shapes; 4th sample replicates jitter-2 shape).
+3. Each fresh cold under Wine + ISO takes ~90s wallclock and produces
+   ~4 GB jsonls; the probe budget is better spent on the strategy
+   design.
+
+## Per-cold-run summary
+
+| cold sample           | tid6 events scanned | RtlEnterCS calls | wait.begin tid=6 unique SIDs (top 10) |
+|-----------------------|---------------------|------------------|----------------------------------------|
+| canary-jitter-1.jsonl | 120,002             | 19,519           | 10 (max=33 on `3b234bbee19d74cf`)      |
+| canary-jitter-2.jsonl | 120,002             | 19,519           | 10 (max=33 on `8ec49cc7eb991db6`)      |
+| canary-jitter-3.jsonl | 120,002             | 19,519           | 10 (max=34 on `9eda93a619ebd4ca`)      |
+| canary-cold-c21.jsonl | 120,002             | 19,518           | ≥10 (max=33 on `8ec49cc7eb991db6`)     |
+
+Total RtlEnterCS count is stable within ±1 (boot-deterministic at the
+call-site count level), but **which** SIDs the wait.begins associate
+with varies significantly across runs (3 distinct "max" SIDs across
+the 4 runs).
+
+## Per-event divergence shape at idx 104,595..104,612
+
+`E` = `import.call RtlEnterCriticalSection`, `L` = `import.call
+RtlLeaveCriticalSection`, `W` = `wait.begin`, `C` = `import.call
+NtClose`. Only `import.call` rows shown (kernel.call/kernel.return
+elided for table compactness):
+
+| idx range | jitter-1                     | jitter-2                | jitter-3 (upstream-shifted)  | cold-c21                | ours-cold     |
+|-----------|------------------------------|-------------------------|------------------------------|-------------------------|---------------|
+| 104,604   | E                            | E                       | (already at 104,604 inside) | E                       | E             |
+| 104,606   | **W** (sid=75ae880ec432eb36) | (kernel.return E)       | (W at 104,603!)              | (kernel.return E)       | (kernel.return E) |
+| 104,607   | (kernel.return E)            | E (nested)              | E                            | E (nested)              | L             |
+| 104,608   | E (nested)                   | E                       | E                            | E                       | (kernel.return L) |
+| 104,610   | (kernel.return E)            | L                       | L                            | L                       | C             |
+| 104,611   | L                            | L                       | E                            | L                       | (kernel.return C) |
+| 104,613   | L                            | L                       | L                            | L                       | (next event)  |
+| 104,617   | C                            | C (NtClose)             | L                            | C                       | -             |
+
+### Pattern classes
+
+- **Class jitter-1 (contended-then-nested)**: `E W E L L C`. 1/4 samples.
+- **Class jitter-2 / c21 (fast-path-then-nested)**: `E E L L C`. 2/4 samples.
+- **Class jitter-3 (upstream-drift, contended earlier)**: `E W E L E E L L C`. 1/4 samples.
+- **Class ours (fast-path, no nested cleanup)**: `E L C`. 1/1 sample.
+
+All 4 canary samples take the nested-Enter branch; the variability is
+only in *when* the slow-path (`W`) fires and on which SID. Ours never
+takes the nested-Enter branch — different guest control-flow.
+
+## SID overlap
+
+Of the 10 most-frequent wait.begin SIDs on tid=6 per cold:
+
+| SID                  | jitter-1 | jitter-2 | jitter-3 | cold-c21 |
+|----------------------|----------|----------|----------|----------|
+| `a25a16a4f6f547aa`   | 19       | 27       | 11       | 28       |
+| `2a70efeeed4f4fb6`   | 13       | 14       | 12       | 12       |
+| `72a4170012353517`   | 9        | 13       | 9        | 10       |
+| `1938a086284cdbf1`   | 1        | 1        | 1        | (likely 1) |
+| `cf2f57a69895b36c`   | 1        | 1        | 1        | (likely 1) |
+| `648cb0d5adfa9125`   | 1        | 1        | (absent) | (likely 1) |
+| `75ae880ec432eb36`   | 1        | (absent) | (absent) | (absent) |
+| `3b234bbee19d74cf`   | 33       | (absent) | (absent) | (absent) |
+| `b8e833ada16e15fa`   | 31       | (absent) | (absent) | (absent) |
+| `8ec49cc7eb991db6`   | (absent) | 33       | (absent) | 33       |
+| `d896adc3741c77c1`   | (absent) | 31       | (absent) | (absent) |
+| `9eda93a619ebd4ca`   | (absent) | (absent) | 34       | (absent) |
+| `84fe8d4c3a65f040`   | (absent) | (absent) | 31       | (absent) |
+| `14afe71d37ff58a7`   | (absent) | (absent) | (absent) | 31       |
+
+**Reading**:
+
+- A *stable core* exists: `a25a16a4f6f547aa`,
+  `2a70efeeed4f4fb6`, `72a4170012353517` appear in all 4 cold samples,
+  though their counts vary (`a25a16a4f6f547aa` ranges from 11 to 28;
+  the other two stay within 9-14).
+- A *swappable shell* exists: the top-2-SIDs by count are different
+  per-cold. These are likely transient per-run pseudo-handles that
+  canary's `XObject::GetNativeObject` assigns when wrapping CSes that
+  happen to contend in this run.
+- `75ae880ec432eb36` (the original C+20 wedge SID) is *unique to
+  jitter-1*. C+18/C+21 absorbers treat it as shared-global; the absorb
+  was correct.
+
+## Bit-stability properties
+
+| dimension | bit-stable? | scope of variance |
+|---|---|---|
+| Total RtlEnterCS call count | YES (±1) | 19,518-19,519 across 4 |
+| Total RtlLeaveCS call count | YES (±2) | 19,517-19,519 across 4 |
+| Which idx contains a wait.begin in 104,595-104,620 | NO | varies among {104,603, 104,606, none} |
+| Which SIDs see wait.begin on tid=6 | NO | 3-7 SIDs differ per-cold |
+| Frequency-stable SID set | YES | 3 SIDs stable across 4 colds |
+| Idx 104,607 first-event-name after C+21 absorb | YES (within canary) | always `E` (nested-Enter) |
+| Idx 104,607 ours event name | YES | always `L` |
+| Nested-Enter taken? | YES on canary, NO on ours | structural divergence |
+
+## Implication for diff-tool absorber chain
+
+C+18 (handle.create shared-global SID), C+21 (wait.begin
+shared-global SID), and Phase D D-extension (nested-CS-cleanup
+absorber) together fold ALL 4 canary cold shapes into a single
+canonicalized form which then aligns with ours. The C+21 absorber
+in particular handles 0..3 wait.begin events per cold without
+affecting matched-prefix. **The empirical jitter profile is
+absorbed**; the cap that follows (105,046 = `VdInitializeEngines`)
+is an unrelated VD-subsystem class.
+
+## Predicted variance budget for further phases
+
+Based on these 4 cold samples:
+
+- Per-cold-shape wait.begin event count near a contention region:
+  0-3 events (mean ~1.5). Diff-tool absorber capacity is ≥3 already.
+- Upstream index drift due to scheduling: ≤3 events. C+21 covers up
+  to 1, D-extension's 32-pair cap covers far more.
+- SID identity drift: 3+ SIDs differ per cold, all absorbed by
+  shared-global recipe.
+
+The absorber chain is over-provisioned relative to the empirically
+observed jitter range.
--- a/audit-runs/phase-c23-scheduler-determinism-plan/ours-threading-model.md
+++ b/audit-runs/phase-c23-scheduler-determinism-plan/ours-threading-model.md
@@ -0,0 +1,154 @@
+# Ours threading model — Phase C+23 characterization
+
+Re-verifies xenia-rs's threading model in the current tree (HEAD per
+session start). Source-of-truth files re-read this session:
+
+- `xenia-rs/crates/xenia-cpu/src/scheduler.rs` (2094 lines)
+- `xenia-rs/crates/xenia-kernel/src/state.rs` (2383 lines)
+- `xenia-rs/crates/xenia-kernel/src/exports.rs` (9370 lines)
+- `xenia-rs/crates/xenia-kernel/src/contention_manifest.rs` (342 lines)
+
+## 1. Threading abstraction: single host thread, 6 cooperative HW slots
+
+`scheduler.rs` defines `HW_THREAD_COUNT` and `Scheduler::round_schedule`
+(line 730). The Scheduler holds 6 `HwSlot` runqueues; each runqueue
+holds N guest XThreads. There is **no host `std::thread` per guest
+thread**. The single host thread that owns the CPU walks the slots in
+`rotation_cursor` order, picks the highest-priority Ready thread per
+slot, executes a quantum-worth of guest instructions, and moves on.
+
+Compared to canary's 1-host-per-1-guest model, this is *cooperative*
+in two senses: only one guest thread runs at a time (no true SMP),
+and context switches happen only at well-defined emulator boundaries
+(quantum exhaustion, explicit park, end-of-step).
+
+## 2. OrderMode enum (scheduler.rs:232)
+
+```rust
+pub enum OrderMode {
+    Fixed,                     // default; ours digest e1dfcb15…
+    Seeded { seed: u64 },      // pseudo-random shuffle of the round
+    ScanQuantum { ticks: u32 },// Stage 0 spike, landed but null-result
+}
+```
+
+Selected via `XENIA_SCHED_ORDER` env var (`from_env` at line 244).
+Defaults to `Fixed`. A second env var, `XENIA_SCHED_QUANTUM`, controls
+the `ScanQuantum` reload.
+
+There is no `ContentionReplay` variant in the current source. The
+Phase D Stage 3 work instead landed a manifest consultation
+*inside* `rtl_enter_critical_section` (exports.rs), not a new
+`OrderMode` (planner's hindsight: putting it in `OrderMode` would be
+cleaner; this is documented as a deviation from the original plan).
+
+## 3. Per-slot quantum + decrement_quantum (scheduler.rs:800)
+
+`decrement_quantum` decrements the running thread's
+`quantum_remaining`. On reach-zero it reloads (per `quantum_for(order)`
+at line 793) and scans the slot's runqueue for a *same-priority* Ready
+peer to rotate to. If no peer exists, no rotation happens — the
+quantum reload is benign.
+
+Stage 0 (2026-05-18) sweep validated:
+- Fixed → ours digest `ba5b5e07…` (since Stage 0 baseline; prior baseline was `e1dfcb15…` before Stage 0 changed default-mode emission).
+- ScanQuantum × [10, 50, 200, 1000, 5000, 10000] → all byte-identical to Fixed default. **Why**: tid=1 alone on slot 0 during boot; no peer to rotate to regardless of quantum. Option B (forced-yield across slots) would face the same constraint (and was skipped).
+
+The lesson: rotating *within* a slot doesn't help; tid=1's monolithic
+boot region has no other thread on its slot to rotate to.
+
+## 4. park_current / wake_ref (scheduler.rs:840)
+
+`park_current(BlockReason)` is the canonical primitive for parking the
+currently-running thread. Used by:
+
+- `RtlEnterCriticalSection` parking on `BlockReason::CriticalSection(cs_ptr)` (exports.rs ~2927).
+- `KeWaitForSingleObject` parking on `BlockReason::WaitSingle(handle)`.
+- Other primitives.
+
+The wake side calls `Scheduler::wake_ref(ref)` which transitions
+HwState::Blocked → HwState::Ready and re-marks the slot's
+`non_empty_runnable` mask. FIFO queues for each blocking object
+(`cs_waiters[cs_ptr]` etc.) live in kernel `state.rs`-style data.
+
+Key property: parking + waking is deterministic per (host run, input),
+because every cross-thread interaction goes through the Scheduler
+which has no host-OS dependency.
+
+## 5. rtl_enter_critical_section (exports.rs:2886-2946)
+
+Re-read for Phase C+23 verification. Branches:
+
+1. `owner == 0 || !owner_is_live` → claim uncontended.
+2. `owner == current_tid` → recursive bump.
+3. otherwise → push self onto `cs_waiters[cs_ptr]`, `park_current(BlockReason::CriticalSection(cs_ptr))`.
+
+**No spin loop.** Goes straight to park. This is the deliberate
+asymmetry vs canary's `cs->header.absolute*256` spin. Documented and
+intentional — adding spin to ours would not help; the only way ours
+"contends" is if a peer thread has the lock at the exact moment
+ours's tid=1 reaches the call.
+
+In the boot region around event 104,604, ours tid=1 is the only
+runnable thread on slot 0 — no peer is even Ready to take the CS
+first. So ours invariably fast-paths.
+
+## 6. Contention manifest loader (contention_manifest.rs)
+
+Phase D Stage 3 landed `crates/xenia-kernel/src/contention_manifest.rs`
+(342 LOC) with `consume_at_peek(tid, peek_idx)` that translates ours's
+per-tid idx back to canary's idx space (subtracts prior
+`contention.observed` emits). `XENIA_CONTENTION_MANIFEST_PATH` env var
+opts in. Per the Stage 3+4 result: replay-mode digest `1d7c6b45…`
+stable × 3 cold runs, but main matched-prefix **still 104,607** — the
+manifest's forced-contention entries fire at wrong logical positions
+because the divergence is upstream of any contention event.
+
+This is a critical input to C+23's recommendation: the Phase D
+replay infrastructure is built and stable, but it does NOT unblock
+the 104,607 cap. The actual cap-unblock came from the D-extension
+diff-tool absorber (band-aid, Phase D 2026-05-18). The structural
+fix never landed and has no clear next step.
+
+## 7. Existing determinism guarantees
+
+- Default-mode ours cold digest **`ba5b5e07…`** × 3 reproducible
+  (Stage 0 / Phase D baseline). Prior `e1dfcb15…` baseline is the
+  C+19 era constant; the Stage 0 emission tweak shifted it without
+  changing logic.
+- Phase B `image_loaded_sha256 ea8d160e…` unchanged across all 23+
+  phases.
+- All emitted Phase A events are stable on (input, cvars).
+
+## 8. Mismatch surfaces with canary
+
+| dimension | canary | ours |
+|---|---|---|
+| host threads | 1 per XThread | 1 total |
+| inter-thread arbiter | host OS | Scheduler |
+| RtlEnterCS spin | spin then wait | park immediately |
+| Clock | wallclock (rdtsc) | fixed FILETIME `132_500_000_000_000_000` |
+| Wait wakeup ordering | pthread_cond_broadcast race | FIFO `cs_waiters` |
+| Yield primitive | host yield | `decrement_quantum` rotation |
+
+Of these, the **clock** and the **wait wakeup ordering** are the
+two surfaces beyond CS-contention where canary→ours divergence has
+potential to surface. So far Sylpheed exercises them lightly: 2
+KeQuerySystemTime calls, 34 wait.begin events total.
+
+## 9. Existing scheduler cvars / lockstep modes
+
+There is no `lockstep` cvar in ours. The closest mode is
+`OrderMode::Fixed` (default), which produces a deterministic schedule
+keyed entirely on the spawn/wake sequence. Replay via manifest is
+opt-in via `XENIA_CONTENTION_MANIFEST_PATH`.
+
+## 10. Implication: ours is the strict side
+
+In any cross-engine deterministic-replay scheme, **ours has to bend
+toward canary**, not the other way. Canary's host-OS scheduling
+cannot be tamed without rewriting it (out of scope; would also
+invalidate it as the oracle, since the "real" Xbox 360 wasn't
+deterministic in this sense either). The Phase D plan's H'/H broad
+landed Stages 1-4 of this bend — the engine infrastructure is built,
+just not load-bearing for the 104,607 cap.
--- a/audit-runs/phase-c23-scheduler-determinism-plan/probes/jitter_profile.json
+++ b/audit-runs/phase-c23-scheduler-determinism-plan/probes/jitter_profile.json
@@ -0,0 +1,456 @@
+{
+  "canary-jitter-1.jsonl": {
+    "path": "xenia-canary/build-cross/bin/Windows/Debug/canary-jitter-1.jsonl",
+    "tid6_total_seen": 120002,
+    "waitbegins_by_sid": {
+      "3b234bbee19d74cf": 33,
+      "b8e833ada16e15fa": 31,
+      "a25a16a4f6f547aa": 19,
+      "2a70efeeed4f4fb6": 13,
+      "72a4170012353517": 9,
+      "eec602f5f9aa4bac": 3,
+      "1938a086284cdbf1": 1,
+      "cf2f57a69895b36c": 1,
+      "648cb0d5adfa9125": 1,
+      "75ae880ec432eb36": 1
+    },
+    "rtlenter_calls": 19519,
+    "rtlleave_calls": 19519,
+    "window_events": [
+      {
+        "idx": 104595,
+        "kind": "import.call",
+        "name": "RtlLeaveCriticalSection"
+      },
+      {
+        "idx": 104596,
+        "kind": "kernel.call",
+        "name": "RtlLeaveCriticalSection"
+      },
+      {
+        "idx": 104597,
+        "kind": "kernel.return",
+        "name": "RtlLeaveCriticalSection"
+      },
+      {
+        "idx": 104598,
+        "kind": "import.call",
+        "name": "RtlEnterCriticalSection"
+      },
+      {
+        "idx": 104599,
+        "kind": "kernel.call",
+        "name": "RtlEnterCriticalSection"
+      },
+      {
+        "idx": 104600,
+        "kind": "kernel.return",
+        "name": "RtlEnterCriticalSection"
+      },
+      {
+        "idx": 104601,
+        "kind": "import.call",
+        "name": "RtlLeaveCriticalSection"
+      },
+      {
+        "idx": 104602,
+        "kind": "kernel.call",
+        "name": "RtlLeaveCriticalSection"
+      },
+      {
+        "idx": 104603,
+        "kind": "kernel.return",
+        "name": "RtlLeaveCriticalSection"
+      },
+      {
+        "idx": 104604,
+        "kind": "import.call",
+        "name": "RtlEnterCriticalSection"
+      },
+      {
+        "idx": 104605,
+        "kind": "kernel.call",
+        "name": "RtlEnterCriticalSection"
+      },
+      {
+        "idx": 104606,
+        "kind": "wait.begin",
+        "name": "",
+        "sid": "75ae880ec432eb36",
+        "timeout_ns": -1
+      },
+      {
+        "idx": 104607,
+        "kind": "kernel.return",
+        "name": "RtlEnterCriticalSection"
+      },
+      {
+        "idx": 104608,
+        "kind": "import.call",
+        "name": "RtlEnterCriticalSection"
+      },
+      {
+        "idx": 104609,
+        "kind": "kernel.call",
+        "name": "RtlEnterCriticalSection"
+      },
+      {
+        "idx": 104610,
+        "kind": "kernel.return",
+        "name": "RtlEnterCriticalSection"
+      },
+      {
+        "idx": 104611,
+        "kind": "import.call",
+        "name": "RtlLeaveCriticalSection"
+      },
+      {
+        "idx": 104612,
+        "kind": "kernel.call",
+        "name": "RtlLeaveCriticalSection"
+      },
+      {
+        "idx": 104613,
+        "kind": "kernel.return",
+        "name": "RtlLeaveCriticalSection"
+      },
+      {
+        "idx": 104614,
+        "kind": "import.call",
+        "name": "RtlLeaveCriticalSection"
+      },
+      {
+        "idx": 104615,
+        "kind": "kernel.call",
+        "name": "RtlLeaveCriticalSection"
+      },
+      {
+        "idx": 104616,
+        "kind": "kernel.return",
+        "name": "RtlLeaveCriticalSection"
+      },
+      {
+        "idx": 104617,
+        "kind": "import.call",
+        "name": "NtClose"
+      },
+      {
+        "idx": 104618,
+        "kind": "kernel.call",
+        "name": "NtClose"
+      },
+      {
+        "idx": 104619,
+        "kind": "handle.destroy",
+        "name": ""
+      },
+      {
+        "idx": 104620,
+        "kind": "kernel.return",
+        "name": "NtClose"
+      }
+    ]
+  },
+  "canary-jitter-2.jsonl": {
+    "path": "xenia-canary/build-cross/bin/Windows/Debug/canary-jitter-2.jsonl",
+    "tid6_total_seen": 120002,
+    "waitbegins_by_sid": {
+      "8ec49cc7eb991db6": 33,
+      "d896adc3741c77c1": 31,
+      "a25a16a4f6f547aa": 27,
+      "2a70efeeed4f4fb6": 14,
+      "72a4170012353517": 13,
+      "7b3b3faec1388b19": 2,
+      "92b9c026e295e0e5": 2,
+      "1938a086284cdbf1": 1,
+      "cf2f57a69895b36c": 1,
+      "648cb0d5adfa9125": 1
+    },
+    "rtlenter_calls": 19519,
+    "rtlleave_calls": 19517,
+    "window_events": [
+      {
+        "idx": 104595,
+        "kind": "import.call",
+        "name": "RtlLeaveCriticalSection"
+      },
+      {
+        "idx": 104596,
+        "kind": "kernel.call",
+        "name": "RtlLeaveCriticalSection"
+      },
+      {
+        "idx": 104597,
+        "kind": "kernel.return",
+        "name": "RtlLeaveCriticalSection"
+      },
+      {
+        "idx": 104598,
+        "kind": "import.call",
+        "name": "RtlEnterCriticalSection"
+      },
+      {
+        "idx": 104599,
+        "kind": "kernel.call",
+        "name": "RtlEnterCriticalSection"
+      },
+      {
+        "idx": 104600,
+        "kind": "kernel.return",
+        "name": "RtlEnterCriticalSection"
+      },
+      {
+        "idx": 104601,
+        "kind": "import.call",
+        "name": "RtlLeaveCriticalSection"
+      },
+      {
+        "idx": 104602,
+        "kind": "kernel.call",
+        "name": "RtlLeaveCriticalSection"
+      },
+      {
+        "idx": 104603,
+        "kind": "kernel.return",
+        "name": "RtlLeaveCriticalSection"
+      },
+      {
+        "idx": 104604,
+        "kind": "import.call",
+        "name": "RtlEnterCriticalSection"
+      },
+      {
+        "idx": 104605,
+        "kind": "kernel.call",
+        "name": "RtlEnterCriticalSection"
+      },
+      {
+        "idx": 104606,
+        "kind": "kernel.return",
+        "name": "RtlEnterCriticalSection"
+      },
+      {
+        "idx": 104607,
+        "kind": "import.call",
+        "name": "RtlEnterCriticalSection"
+      },
+      {
+        "idx": 104608,
+        "kind": "kernel.call",
+        "name": "RtlEnterCriticalSection"
+      },
+      {
+        "idx": 104609,
+        "kind": "kernel.return",
+        "name": "RtlEnterCriticalSection"
+      },
+      {
+        "idx": 104610,
+        "kind": "import.call",
+        "name": "RtlLeaveCriticalSection"
+      },
+      {
+        "idx": 104611,
+        "kind": "kernel.call",
+        "name": "RtlLeaveCriticalSection"
+      },
+      {
+        "idx": 104612,
+        "kind": "kernel.return",
+        "name": "RtlLeaveCriticalSection"
+      },
+      {
+        "idx": 104613,
+        "kind": "import.call",
+        "name": "RtlLeaveCriticalSection"
+      },
+      {
+        "idx": 104614,
+        "kind": "kernel.call",
+        "name": "RtlLeaveCriticalSection"
+      },
+      {
+        "idx": 104615,
+        "kind": "kernel.return",
+        "name": "RtlLeaveCriticalSection"
+      },
+      {
+        "idx": 104616,
+        "kind": "import.call",
+        "name": "NtClose"
+      },
+      {
+        "idx": 104617,
+        "kind": "kernel.call",
+        "name": "NtClose"
+      },
+      {
+        "idx": 104618,
+        "kind": "handle.destroy",
+        "name": ""
+      },
+      {
+        "idx": 104619,
+        "kind": "kernel.return",
+        "name": "NtClose"
+      },
+      {
+        "idx": 104620,
+        "kind": "import.call",
+        "name": "RtlEnterCriticalSection"
+      }
+    ]
+  },
+  "canary-jitter-3.jsonl": {
+    "path": "xenia-canary/build-cross/bin/Windows/Debug/canary-jitter-3.jsonl",
+    "tid6_total_seen": 120002,
+    "waitbegins_by_sid": {
+      "9eda93a619ebd4ca": 34,
+      "84fe8d4c3a65f040": 31,
+      "2a70efeeed4f4fb6": 12,
+      "a25a16a4f6f547aa": 11,
+      "72a4170012353517": 9,
+      "c9f426cc34f55865": 3,
+      "7b3b3faec1388b19": 2,
+      "92b9c026e295e0e5": 2,
+      "1938a086284cdbf1": 1,
+      "cf2f57a69895b36c": 1
+    },
+    "rtlenter_calls": 19519,
+    "rtlleave_calls": 19519,
+    "window_events": [
+      {
+        "idx": 104595,
+        "kind": "import.call",
+        "name": "RtlEnterCriticalSection"
+      },
+      {
+        "idx": 104596,
+        "kind": "kernel.call",
+        "name": "RtlEnterCriticalSection"
+      },
+      {
+        "idx": 104597,
+        "kind": "kernel.return",
+        "name": "RtlEnterCriticalSection"
+      },
+      {
+        "idx": 104598,
+        "kind": "import.call",
+        "name": "RtlLeaveCriticalSection"
+      },
+      {
+        "idx": 104599,
+        "kind": "kernel.call",
+        "name": "RtlLeaveCriticalSection"
+      },
+      {
+        "idx": 104600,
+        "kind": "kernel.return",
+        "name": "RtlLeaveCriticalSection"
+      },
+      {
+        "idx": 104601,
+        "kind": "import.call",
+        "name": "RtlEnterCriticalSection"
+      },
+      {
+        "idx": 104602,
+        "kind": "kernel.call",
+        "name": "RtlEnterCriticalSection"
+      },
+      {
+        "idx": 104603,
+        "kind": "wait.begin",
+        "name": "",
+        "sid": "a25a16a4f6f547aa",
+        "timeout_ns": -1
+      },
+      {
+        "idx": 104604,
+        "kind": "kernel.return",
+        "name": "RtlEnterCriticalSection"
+      },
+      {
+        "idx": 104605,
+        "kind": "import.call",
+        "name": "RtlLeaveCriticalSection"
+      },
+      {
+        "idx": 104606,
+        "kind": "kernel.call",
+        "name": "RtlLeaveCriticalSection"
+      },
+      {
+        "idx": 104607,
+        "kind": "kernel.return",
+        "name": "RtlLeaveCriticalSection"
+      },
+      {
+        "idx": 104608,
+        "kind": "import.call",
+        "name": "RtlEnterCriticalSection"
+      },
+      {
+        "idx": 104609,
+        "kind": "kernel.call",
+        "name": "RtlEnterCriticalSection"
+      },
+      {
+        "idx": 104610,
+        "kind": "kernel.return",
+        "name": "RtlEnterCriticalSection"
+      },
+      {
+        "idx": 104611,
+        "kind": "import.call",
+        "name": "RtlEnterCriticalSection"
+      },
+      {
+        "idx": 104612,
+        "kind": "kernel.call",
+        "name": "RtlEnterCriticalSection"
+      },
+      {
+        "idx": 104613,
+        "kind": "kernel.return",
+        "name": "RtlEnterCriticalSection"
+      },
+      {
+        "idx": 104614,
+        "kind": "import.call",
+        "name": "RtlLeaveCriticalSection"
+      },
+      {
+        "idx": 104615,
+        "kind": "kernel.call",
+        "name": "RtlLeaveCriticalSection"
+      },
+      {
+        "idx": 104616,
+        "kind": "kernel.return",
+        "name": "RtlLeaveCriticalSection"
+      },
+      {
+        "idx": 104617,
+        "kind": "import.call",
+        "name": "RtlLeaveCriticalSection"
+      },
+      {
+        "idx": 104618,
+        "kind": "kernel.call",
+        "name": "RtlLeaveCriticalSection"
+      },
+      {
+        "idx": 104619,
+        "kind": "kernel.return",
+        "name": "RtlLeaveCriticalSection"
+      },
+      {
+        "idx": 104620,
+        "kind": "import.call",
+        "name": "NtClose"
+      }
+    ]
+  }
+}
--- a/audit-runs/phase-c23-scheduler-determinism-plan/probes/jitter_profile.py
+++ b/audit-runs/phase-c23-scheduler-determinism-plan/probes/jitter_profile.py
@@ -0,0 +1,97 @@
+#!/usr/bin/env python3
+"""Phase C+23 jitter profile probe.
+
+Reads canary jsonls (jitter-1/2/3 + cold-c21 + any fresh runs) and extracts:
+- total tid=6 events seen within the first ~120k indices
+- the exact event sequence on tid=6 around idx [104,595..104,620]
+- count of wait.begin events on tid=6 by SID
+- count of contention-prone events (wait.begin by SID; import.call RtlEnter / RtlLeave)
+
+Designed to stream line-by-line and not load multi-GB jsonls into RAM.
+"""
+
+import json
+import os
+import sys
+from collections import Counter
+
+WINDOW_LO = 104_595
+WINDOW_HI = 104_620
+TID = 6
+TID_EVENT_LIMIT = 120_000
+
+
+def profile(path: str):
+    if not os.path.exists(path):
+        return None
+    tid6_events = 0
+    waitbegins = Counter()
+    importcalls = Counter()
+    kernelcalls = Counter()
+    window_events = []
+    tid_idx = -1
+
+    with open(path, "rb") as fh:
+        for raw in fh:
+            # Cheap reject before json parse: must contain `"tid":6,`
+            if b'"tid":6,' not in raw and b'"tid": 6,' not in raw:
+                continue
+            try:
+                ev = json.loads(raw)
+            except Exception:
+                continue
+            if ev.get("tid") != TID:
+                continue
+            tid_idx = ev.get("tid_event_idx", tid_idx + 1)
+            tid6_events += 1
+            kind = ev.get("kind", "")
+            if kind == "wait.begin":
+                sids = ev.get("payload", {}).get("handles_semantic_ids") or []
+                for s in sids:
+                    waitbegins[s] += 1
+            elif kind == "import.call":
+                name = ev.get("payload", {}).get("name", "")
+                importcalls[name] += 1
+            elif kind == "kernel.call":
+                name = ev.get("payload", {}).get("name", "")
+                kernelcalls[name] += 1
+
+            if WINDOW_LO <= tid_idx <= WINDOW_HI:
+                summary = {
+                    "idx": tid_idx,
+                    "kind": kind,
+                    "name": ev.get("payload", {}).get("name", ""),
+                }
+                if kind == "wait.begin":
+                    summary["sid"] = (ev.get("payload", {}).get("handles_semantic_ids") or [None])[0]
+                    summary["timeout_ns"] = ev.get("payload", {}).get("timeout_ns")
+                window_events.append(summary)
+
+            if tid_idx > TID_EVENT_LIMIT:
+                break
+
+    return {
+        "path": path,
+        "tid6_total_seen": tid6_events,
+        "waitbegins_by_sid": dict(waitbegins.most_common(10)),
+        "rtlenter_calls": importcalls.get("RtlEnterCriticalSection", 0),
+        "rtlleave_calls": importcalls.get("RtlLeaveCriticalSection", 0),
+        "window_events": window_events,
+    }
+
+
+def main(paths):
+    out = {}
+    for p in paths:
+        print(f"profiling {p}...", file=sys.stderr)
+        r = profile(p)
+        if r is None:
+            print("  (missing)", file=sys.stderr)
+            continue
+        out[os.path.basename(p)] = r
+    json.dump(out, sys.stdout, indent=2)
+    print()
+
+
+if __name__ == "__main__":
+    main(sys.argv[1:])
--- a/audit-runs/phase-c23-scheduler-determinism-plan/probes/jitter_profile_c21.json
+++ b/audit-runs/phase-c23-scheduler-determinism-plan/probes/jitter_profile_c21.json
@@ -0,0 +1,153 @@
+profiling xenia-canary/build-cross/bin/Windows/Debug/canary-cold-c21.jsonl...
+{
+  "canary-cold-c21.jsonl": {
+    "path": "xenia-canary/build-cross/bin/Windows/Debug/canary-cold-c21.jsonl",
+    "tid6_total_seen": 120002,
+    "waitbegins_by_sid": {
+      "8ec49cc7eb991db6": 33,
+      "14afe71d37ff58a7": 31,
+      "a25a16a4f6f547aa": 28,
+      "2a70efeeed4f4fb6": 12,
+      "72a4170012353517": 10,
+      "7b3b3faec1388b19": 4,
+      "92b9c026e295e0e5": 3,
+      "df2b7bc3c60f41b9": 2,
+      "eec602f5f9aa4bac": 2,
+      "1938a086284cdbf1": 1
+    },
+    "rtlenter_calls": 19518,
+    "rtlleave_calls": 19517,
+    "window_events": [
+      {
+        "idx": 104595,
+        "kind": "import.call",
+        "name": "RtlLeaveCriticalSection"
+      },
+      {
+        "idx": 104596,
+        "kind": "kernel.call",
+        "name": "RtlLeaveCriticalSection"
+      },
+      {
+        "idx": 104597,
+        "kind": "kernel.return",
+        "name": "RtlLeaveCriticalSection"
+      },
+      {
+        "idx": 104598,
+        "kind": "import.call",
+        "name": "RtlEnterCriticalSection"
+      },
+      {
+        "idx": 104599,
+        "kind": "kernel.call",
+        "name": "RtlEnterCriticalSection"
+      },
+      {
+        "idx": 104600,
+        "kind": "kernel.return",
+        "name": "RtlEnterCriticalSection"
+      },
+      {
+        "idx": 104601,
+        "kind": "import.call",
+        "name": "RtlLeaveCriticalSection"
+      },
+      {
+        "idx": 104602,
+        "kind": "kernel.call",
+        "name": "RtlLeaveCriticalSection"
+      },
+      {
+        "idx": 104603,
+        "kind": "kernel.return",
+        "name": "RtlLeaveCriticalSection"
+      },
+      {
+        "idx": 104604,
+        "kind": "import.call",
+        "name": "RtlEnterCriticalSection"
+      },
+      {
+        "idx": 104605,
+        "kind": "kernel.call",
+        "name": "RtlEnterCriticalSection"
+      },
+      {
+        "idx": 104606,
+        "kind": "kernel.return",
+        "name": "RtlEnterCriticalSection"
+      },
+      {
+        "idx": 104607,
+        "kind": "import.call",
+        "name": "RtlEnterCriticalSection"
+      },
+      {
+        "idx": 104608,
+        "kind": "kernel.call",
+        "name": "RtlEnterCriticalSection"
+      },
+      {
+        "idx": 104609,
+        "kind": "kernel.return",
+        "name": "RtlEnterCriticalSection"
+      },
+      {
+        "idx": 104610,
+        "kind": "import.call",
+        "name": "RtlLeaveCriticalSection"
+      },
+      {
+        "idx": 104611,
+        "kind": "kernel.call",
+        "name": "RtlLeaveCriticalSection"
+      },
+      {
+        "idx": 104612,
+        "kind": "kernel.return",
+        "name": "RtlLeaveCriticalSection"
+      },
+      {
+        "idx": 104613,
+        "kind": "import.call",
+        "name": "RtlLeaveCriticalSection"
+      },
+      {
+        "idx": 104614,
+        "kind": "kernel.call",
+        "name": "RtlLeaveCriticalSection"
+      },
+      {
+        "idx": 104615,
+        "kind": "kernel.return",
+        "name": "RtlLeaveCriticalSection"
+      },
+      {
+        "idx": 104616,
+        "kind": "import.call",
+        "name": "NtClose"
+      },
+      {
+        "idx": 104617,
+        "kind": "kernel.call",
+        "name": "NtClose"
+      },
+      {
+        "idx": 104618,
+        "kind": "handle.destroy",
+        "name": ""
+      },
+      {
+        "idx": 104619,
+        "kind": "kernel.return",
+        "name": "NtClose"
+      },
+      {
+        "idx": 104620,
+        "kind": "import.call",
+        "name": "RtlEnterCriticalSection"
+      }
+    ]
+  }
+}
--- a/audit-runs/phase-c23-scheduler-determinism-plan/recommendation.md
+++ b/audit-runs/phase-c23-scheduler-determinism-plan/recommendation.md
@@ -0,0 +1,154 @@
+# Recommendation — Phase C+23
+
+## Top-line: STAY WITH THE BAND-AID
+
+After source-reading both engines + characterizing 4 archived canary
+cold runs' jitter shape + reviewing Phase D's H'/H broad outcomes,
+the recommended approach is **(ζ) stay with the band-aid**.
+
+The 104,607 cap that originally motivated this track is already
+unblocked at the diff-tool layer (Phase D D-extension absorber,
+2026-05-18). The next divergence at idx 105,046 is
+`VdInitializeEngines.return_value` — a VD-subsystem engine bug, NOT
+a scheduling-determinism recurrence. The cost-benefit of pursuing
+γ/β/α is no longer compelling because the immediate symptom is
+resolved and no structural follow-on cap has appeared.
+
+## Rationale
+
+### 1. The original target is already unblocked.
+
+| metric | pre-C+20 (C+19) | post-C+21 | post-Phase-D D-extension | now |
+|---|---|---|---|---|
+| Main matched-prefix | 104,606 | 104,607 | **105,046** | 105,046 |
+| Sister chains | 11/32/3/41/16 | 11/32/3/41/16 | 11/32/4/41/16 | unchanged |
+| Cap class at head | (B) contention | (A) state-mutation | (engine) VD | (engine) VD |
+
+The matched-prefix advanced **+440** since C+19 through diff-tool work
+that did NOT touch the engines. The cap class at the head is no longer
+scheduling.
+
+### 2. Phase D Stages 1-4 already built the structural infrastructure.
+
+Phase D Stage 1 (canary contention emitter), Stage 2 (manifest builder),
+Stage 3 (ours `OrderMode::ContentionReplay` + manifest loader), and
+Stage 4 (diff-tool engine-local kinds) ALL LANDED. The engine code is
+in tree. What's missing is *coverage of the right contention events*:
+the 104,607 divergence was upstream of canary's first
+`contention.observed=true` emit (idx 104,664), so the manifest could
+not target the right call site.
+
+This means: if we pursue γ (broaden replay to more event classes),
+the entry cost is not "start from scratch" but "extend an existing
+manifest layer." However, the LOC budget for γ is still ~600 across
+both engines, and there is **no proven future cap** that this would
+unblock.
+
+### 3. The empirical jitter range is small and fully absorbable.
+
+From `jitter-profile.md`: 4 canary cold samples show 3 distinct
+shapes around the contention window. The C+21 absorber + Phase D
+D-extension already canonicalize ALL 3 shapes to the same matched
+form. Further fresh canary colds (N=5 or N=10) are expected to land in
+one of these 3 shapes with the same absorber outcome, although only
+the 4 archived samples back this so far.
+
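+The SID core (`a25a16a4f6f547aa`, `2a70efeeed4f4fb6`,
+`72a4170012353517`) appears in every cold run. In the four profiles the
+per-run wait.begin counts are 12-14 for `2a70efeeed4f4fb6` and 9-13 for
+`72a4170012353517`, but 11-28 for `a25a16a4f6f547aa`, so SID identity
+is stable while counts are not. The shared-global SID recipe (C+18)
+recomputes these SIDs deterministically. The transient "top-2" SIDs
+(which change per-cold) all flow through the shared-global absorber.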
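
A minimal sketch of how the shape count above could be re-checked from the `window_events` arrays in `probes/jitter_profile.json` and `probes/jitter_profile_c21.json`. The helper names (`window_shape`, `group_by_shape`, `load_profiles`) are hypothetical and are not part of the committed probe. A "shape" is assumed here to mean the ordered `(kind, name)` sequence inside the window, ignoring `idx` and SID values. `load_profiles` skips any text before the first `{` because `jitter_profile_c21.json` as committed begins with a stderr progress line.

```python
import json
from collections import defaultdict


def window_shape(window_events):
    """Reduce a window to its ordered (kind, name) sequence.

    idx and sid are dropped on purpose: the assumption is that two runs
    share a shape when the same kinds of events occur in the same order.
    """
    return tuple((ev["kind"], ev.get("name", "")) for ev in window_events)


def group_by_shape(profiles):
    """Map each distinct window shape to the sorted list of runs showing it."""
    groups = defaultdict(list)
    for run, prof in profiles.items():
        groups[window_shape(prof["window_events"])].append(run)
    return {shape: sorted(runs) for shape, runs in groups.items()}


def load_profiles(paths):
    """Merge several probe outputs into one {run_name: profile} dict."""
    merged = {}
    for path in paths:
        with open(path, encoding="utf-8") as fh:
            text = fh.read()
        # Tolerate a leading non-JSON line (e.g. captured stderr output).
        merged.update(json.loads(text[text.index("{"):]))
    return merged
```

Calling `group_by_shape(load_profiles([...]))` on the two probe outputs and taking `len()` of the result gives the number of distinct shapes under this definition; a stricter definition (for example, one that keeps the `sid` of each `wait.begin`) could split the groups further.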
+
+### 4. Canary cannot be made deterministic without invalidating it.
+
+The host-thread-per-XThread model is what makes canary the *oracle*.
+Replacing it (α / β) would require:
+
+- Reworking ~2000-3000 LOC of canary base+kernel.
+- Re-validating against the broader canary test corpus (other games).
+- Accepting a real risk of breaking Sylpheed-unrelated game-compat.
+
+Approach γ (record-and-replay) avoids touching canary's scheduling
+philosophy but requires ours to consume a multi-million-entry trace,
+with engineering and runtime cost that should be matched to a *proven*
+future scheduling cap.
+
+### 5. The Phase B image hash and ours digest are stable.
+
+`image_loaded_sha256 ea8d160e…` UNCHANGED. Ours default digest
+stable × 3 cold runs. There is no signal of latent divergence in the
+pre-Phase-A surfaces that would benefit from scheduling alignment.
+
+## What to keep
+
+1. **Phase D Stages 1-4 infrastructure** stays in tree. Cvar
+   `kernel_emit_contention=false` default-off; `XENIA_CONTENTION_MANIFEST_PATH`
+   opt-in. Future phases can use them.
+2. **All absorbers** (C+18, C+21, D-extension) stay; they are correct
+   and narrow.
+3. **The Stage 0 `OrderMode::ScanQuantum`** stays as a debug knob,
+   documented as null-result.
+
+## What to defer
+
+1. Approach γ (broader scheduling-trace replay) — defer until a
+   future cap demonstrably scheduling-related appears.
+2. Approach β / α (deterministic preemption / cooperative canary) —
+   defer indefinitely.
+
+## What to do next
+
+The next phase is **C+24** (or the next free phase number) on
+the head divergence at idx 105,046: `VdInitializeEngines.return_value`
+(canary=1 ours=0). This is a regular engine bug investigation, ~5-50
+LOC.
+
+## Fallback: γ trigger criteria
+
+If a future phase finds a NEW scheduling-determinism cap (defined as:
+two consecutive divergences whose root cause is contention/wakeup-
+ordering across ≥2 guest threads, NOT a guest-code bug or kernel
+emit-completeness gap), then revisit γ. The criteria:
+
+- The new cap is ≥1,000 events long.
+- The C+21 / D-extension absorbers cannot fold it within their
+  current cap (32 pairs).
+- Empirical jitter sampling (≥3 canary colds) confirms structural
+  shape divergence, not just SID identity drift.
+
+If all three hold, γ is justified. Estimated ~530-600 LOC across 4-5
+sessions (the stage budget below sums to ~530).
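
The three criteria above can be read as a single predicate. The sketch below is illustrative only: `CapObservation`, its field names, and `gamma_justified` are hypothetical, and "structural shape divergence" is approximated by "more than one distinct window shape across the sampled colds", which is an assumption rather than a definition taken from these notes.

```python
from dataclasses import dataclass

MIN_CAP_EVENTS = 1_000    # "The new cap is >= 1,000 events long"
ABSORBER_PAIR_CAP = 32    # "current cap (32 pairs)"
MIN_COLD_SAMPLES = 3      # ">= 3 canary colds"


@dataclass(frozen=True)
class CapObservation:
    cap_length_events: int      # length of the new divergent region
    pairs_needed_to_fold: int   # pairs an absorber would need to fold it
    cold_samples: int           # number of canary cold runs sampled
    distinct_shapes: int        # distinct window shapes across those runs


def gamma_justified(obs: CapObservation) -> bool:
    """Return True only when all three trigger criteria hold."""
    long_enough = obs.cap_length_events >= MIN_CAP_EVENTS
    absorber_exhausted = obs.pairs_needed_to_fold > ABSORBER_PAIR_CAP
    # Assumed proxy for "structural shape divergence, not SID drift".
    structural = (obs.cold_samples >= MIN_COLD_SAMPLES
                  and obs.distinct_shapes > 1)
    return long_enough and absorber_exhausted and structural
```

Under this reading, a 1,500-event cap that needs 40 pairs and shows 2 shapes across 3 colds would trigger γ, while the same cap foldable within 32 pairs would not.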
+
+## What this recommendation is NOT
+
+- It is NOT "no scheduling work was useful." Stages 1-4 + D-extension
+  produced the matched-prefix advance from 104,606 → 105,046 (+440).
+- It is NOT "the absorbers are perfect forever." They are explicit
+  band-aids in the spirit of reading-error #23, annotated in
+  schema-v1.md v1.5.
+- It is NOT "ours and canary are bit-aligned in contention regions."
+  They are *measurably* aligned (matched-prefix) but not *structurally*
+  aligned (the underlying guest events still differ; the absorber
+  folds the difference).
+
+## Multi-session budget if we proceed (γ scenario only)
+
+Sessions estimated 4-5. NOT scheduled now.
+
+| stage | LOC | est session |
+|---|---|---|
+| γ-Stage 1: extend canary trace to wake/park/yield | ~150 | 1 |
+| γ-Stage 2: extend manifest builder | ~80 | 0.5 |
+| γ-Stage 3: generalized replayer in ours | ~250 | 2 |
+| γ-Stage 4: diff-tool integration | ~50 | 0.5 |
+| γ-Stage 5: validation + sister budgets | n/a | 1 |
+| **total** | **~530** | **~5** |
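
As a quick arithmetic check of the totals row, the per-stage figures from the table can be summed directly. The tuple layout below is just a convenience for this sketch; γ-Stage 5 has no LOC estimate ("n/a") and is excluded from the LOC sum.

```python
# (stage, estimated LOC or None when "n/a", estimated sessions)
stages = [
    ("γ-Stage 1", 150, 1.0),
    ("γ-Stage 2", 80, 0.5),
    ("γ-Stage 3", 250, 2.0),
    ("γ-Stage 4", 50, 0.5),
    ("γ-Stage 5", None, 1.0),
]

total_loc = sum(loc for _, loc, _ in stages if loc is not None)
total_sessions = sum(sessions for _, _, sessions in stages)

print(total_loc, total_sessions)
```

The sums agree with the table's `~530` LOC and `~5` sessions.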
+
+## Acceptance for THIS session (planning-only)
+
+- [x] Planning artifacts in `audit-runs/phase-c23-scheduler-determinism-plan/`.
+- [x] Engine sources UNCHANGED (verified by file listing — only
+  documentation + 1 python probe written).
+- [x] Diff tool UNCHANGED.
+- [ ] Memory entry (to be written next).
+- [x] Recommendation justified against C+21 band-aid + breadth of
+  contention regions + multi-session budget.