xenia-rs/audit-runs/scheduler-determinism-plan/investigation.md
MechaCat02 ef93a4fa14 handoff: VSync/event-wedge fixes + iterate 2.A–2.BC research notes
Source changes (dormant parity infra, retained from iterate 2.AI/2.AO):
- xenia-kernel/exports.rs: nt_create_event manual_reset polarity +
  related event wiring
- xenia-gpu/mmio_region.rs: D1MODE_VBLANK_VLINE_STATUS hardcode parity

Also lands the audit-runs/ analysis notes (.md/.txt/.json digests) for the
iterate 2.x VSync/0x10e8/0x1004 wedge investigation. Raw trace dumps
(.jsonl/.gz/.csv/.stdout) and agent worktrees (.claude/) are gitignored as
regenerable local artifacts — see memory + HANDOFF for the running findings.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-05 07:19:08 +02:00

# Investigation Notes — Scheduler-Determinism Plan (2026-05-18)
Source citations and probe results from the Phase-1 investigation. All claims here are verified against source or runtime data; speculation is flagged.
## 1. Canary threading & scheduling model
**Verdict**: 1-host-thread-per-XThread; scheduling delegated to host OS (Wine on Linux). No internal scheduler.
- Each guest `XThread` owns a host `xe::threading::Thread` (`xenia-canary/src/xenia/kernel/xthread.h:476`).
- POSIX backend: pthread per XThread (`xenia-canary/src/xenia/base/threading_posix.cc`).
- TLS bridge: `thread_local XThread* current_xthread_tls_` (`xthread.cc:105`). `XThread::TryGetCurrentThread()` returns null when called outside a guest thread (C+15-α robustness fix for the boot-time emitter).
- Tid assignment: `thread_id_(++next_xthread_id_)` in ctor (`xthread.cc:62`).
- KPCR per XThread, allocated at `pcr_address_` (`xthread.h:506`); contains scheduler-like state mirroring real Xenon KPRCB.
- `CheckQuantumAndDecay()` (`xthread.h:437`) fires ~20ms via `KernelState`'s timer — simulates Xenon priority decay but does NOT preempt; runs on whichever host thread the host OS schedules.
**No internal scheduler.** No `lockstep`, `deterministic`, `replay` cvar (grep confirmed across `xenia-canary/src/xenia/`).
## 2. Canary clock infrastructure
**Verdict**: wallclock-driven (rdtsc or platform API). Optional scaling, no full deterministic mode.
- Canonical class `xe::Clock` (`base/clock.h:30`).
- `Clock::QueryHostTickCount()` (`base/clock.cc:128`): rdtsc on x64 if `clock_source_raw=true`, else platform API.
- `Clock::QueryGuestSystemTime()` (`clock.h:82`): host time adjusted by `guest_time_scalar_`.
- `KeQuerySystemTime_entry` (`xboxkrnl_threading.cc:459`): declared `void`, writes via OUT pointer; reads `Clock::QueryGuestSystemTime()`. (C+1 verified parity with the void-export framing in ours.)
- `KeWaitForSingleObject_entry` (`xboxkrnl_threading.cc:1003`): reads `*timeout_ptr` as i64×100 → ns (C+23 verified ours computes the same value).
- Cvars: `clock_no_scaling` (`base/clock.cc:24`), `clock_source_raw` (`base/clock.cc:28`). Neither makes the clock deterministic across Wine runs — wallclock drift is irreducible.
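The timeout conversion noted for `KeWaitForSingleObject_entry` can be sketched as below. This is a minimal illustration, not code from either engine: the function name, the use of `Option` to stand in for a null `timeout_ptr`, and the mapping of null to `-1` are assumptions (the `-1` sentinel is borrowed from the `timeout_ns=-1` value seen in the traces).

```rust
/// Sentinel for an indefinite wait, matching `timeout_ns=-1` in the traces.
const INDEFINITE_NS: i64 = -1;

/// Convert a guest timeout given in 100 ns ticks into nanoseconds.
/// `None` stands in for a null `timeout_ptr` (assumption: null means
/// "wait forever"). Negative tick counts are relative timeouts in the
/// NT convention and stay negative after scaling.
fn timeout_ticks_to_ns(timeout_ticks: Option<i64>) -> i64 {
    match timeout_ticks {
        None => INDEFINITE_NS,
        // saturating_mul avoids overflow on extreme tick values.
        Some(ticks) => ticks.saturating_mul(100),
    }
}
```

For example, a relative 1 ms timeout of `-10_000` ticks becomes `-1_000_000` ns. A scaled value is always a multiple of 100, so it cannot collide with the `-1` sentinel.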
## 3. Canary wait primitives
**Verdict**: `xe::threading::Wait` → `pthread_cond_timedwait` (POSIX) / `WaitForMultipleObjects` (Win32).
- `xeKeWaitForSingleObject()` (`xboxkrnl_threading.cc:969`) → `XObject::Wait()` → `xe::threading::Wait()` → host primitive.
- Whether contention happens is purely host-OS-scheduler-driven. Reading-error #32 from C+20 documents this: 3 fresh canary cold runs at tid=6 idx 104,606 showed different patterns (no wait.begin / wait.begin contended / offset-shifted).
## 4. Canary RtlEnterCriticalSection — spin-then-wait (DISCOVERED)
[xenia-canary/src/xenia/kernel/xboxkrnl/xboxkrnl_rtl.cc:596-633](../../../xenia-canary/src/xenia/kernel/xboxkrnl/xboxkrnl_rtl.cc) — `RtlEnterCriticalSection_entry`:
```c
uint32_t spin_count = cs->header.absolute * 256;  // game-supplied spin count
if (cs->owning_thread == cur_thread) { recursion++; return; }
while (spin_count--) {
  if (xe::atomic_cas(-1, 0, &cs->lock_count)) {
    // acquired via spin: claim ownership and skip the slow path
    cs->owning_thread = cur_thread; cs->recursion_count = 1;
    return;
  }
}
if (xe::atomic_inc(&cs->lock_count) != 0) {
  xeKeWaitForSingleObject(...);  // slow path
}
cs->owning_thread = cur_thread; cs->recursion_count = 1;
```
**Implication**: under low contention, spin succeeds and no `wait.begin` is emitted. Under high contention, spin fails and `wait.begin` fires. Whether spin succeeds depends on host-OS timing — non-deterministic across Wine runs.
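The three possible outcomes of the spin-then-wait path can be modelled in a few lines of Rust. This is a sketch under stated assumptions, not a port of canary's code: `EnterOutcome` and `enter_outcome` are hypothetical names, `lock_count == -1` means "free" as in the excerpt, and `xe::atomic_inc` is assumed to return the incremented value (as `InterlockedIncrement` does).

```rust
use std::sync::atomic::{AtomicI32, Ordering};

/// How a non-recursive enter attempt resolved (hypothetical type).
#[derive(Debug, PartialEq)]
enum EnterOutcome {
    /// The CAS in the spin loop saw the lock free; no wait is emitted.
    AcquiredBySpin,
    /// No spin budget, but the increment found the lock free.
    AcquiredByIncrement,
    /// The lock was held throughout; the caller must take the slow path.
    MustWait,
}

/// Model of the spin-then-wait decision. `lock_count == -1` means free.
fn enter_outcome(lock_count: &AtomicI32, spin_count: u32) -> EnterOutcome {
    for _ in 0..spin_count {
        if lock_count
            .compare_exchange(-1, 0, Ordering::Acquire, Ordering::Relaxed)
            .is_ok()
        {
            return EnterOutcome::AcquiredBySpin;
        }
    }
    // fetch_add returns the previous value, so the new value is previous + 1.
    if lock_count.fetch_add(1, Ordering::Acquire) + 1 != 0 {
        EnterOutcome::MustWait
    } else {
        EnterOutcome::AcquiredByIncrement
    }
}
```

In this single-threaded model the result is fully determined by the initial `lock_count`. In canary the value can change between loop iterations because another host thread may release the lock mid-spin, which is where the run-to-run variation comes from.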
## 5. Ours threading & scheduling
**Verdict**: single host thread; 6 cooperative HW slots; deterministic by construction.
- `xenia-rs/crates/xenia-cpu/src/scheduler.rs`:
  - `OrderMode { Fixed, Seeded { seed } }` (lines 230-258).
  - `round_schedule()` (lines 710-740): returns slot-id vector; advances `rotation_cursor` by 1.
  - `park_current(BlockReason)` (line 808).
  - `wake_ref(ThreadRef)` (line 831).
- M3 optional `--parallel` mode (6 workers + coordinator, 7-party phaser) exists but is not default.
**Determinism foundation**: 23 phases of stabilization went into making the `e1dfcb15…` cold digest reproducible across 3 cold runs.
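To illustrate why a rotation cursor gives a reproducible order, here is a minimal sketch of a `Fixed`-mode round. Only "returns a slot-id vector and advances `rotation_cursor` by 1" comes from the notes above; the `RoundRobin` type, the rotate-from-cursor ordering, and the wrap-around are assumptions and may not match `scheduler.rs`.

```rust
/// Hypothetical stand-in for the scheduler's rotation state.
struct RoundRobin {
    slot_count: usize,
    rotation_cursor: usize,
}

impl RoundRobin {
    fn new(slot_count: usize) -> Self {
        Self { slot_count, rotation_cursor: 0 }
    }

    /// Return this round's slot order, then advance the cursor by 1.
    /// The order depends only on the state, never on host timing.
    fn round_schedule(&mut self) -> Vec<usize> {
        if self.slot_count == 0 {
            return Vec::new();
        }
        let order: Vec<usize> = (0..self.slot_count)
            .map(|i| (self.rotation_cursor + i) % self.slot_count)
            .collect();
        self.rotation_cursor = (self.rotation_cursor + 1) % self.slot_count;
        order
    }
}
```

With 6 slots, the first round visits `0..=5` in order and the second starts at slot 1. Two fresh instances always produce identical sequences, which is the property the cold-digest reproducibility relies on.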
## 6. Ours RtlEnterCriticalSection — NO spin
[xenia-rs/crates/xenia-kernel/src/exports.rs:2886-2946](../../../xenia-rs/crates/xenia-kernel/src/exports.rs) — `rtl_enter_critical_section`:
```rust
let owner = mem.read_u32(cs_ptr + CS_OFFS_OWNING_THREAD);
let owner_is_live = owner != 0 && state.scheduler.find_by_tid(owner).is_some();
if owner == 0 || !owner_is_live {
    /* claim immediately — write owning_thread, lock_count=0, recursion=1 */
    return;
}
if owner == current_tid { /* recursive lock — increment counts */ return; }
// Truly contended against a live peer — park IMMEDIATELY (no spin).
state.cs_waiters.entry(cs_ptr).or_default().push(current_ref);
state.scheduler.park_current(BlockReason::CriticalSection(cs_ptr));
```
**Asymmetry summary**: canary spins up to `header.absolute * 256` times before parking; ours parks immediately. Under the cooperative scheduler, tid=1 in ours runs monolithically until it parks, so no other thread has a chance to acquire the CS first. Hence at 104,607 the CS is free when tid=1 tries, whereas in canary it was held by another thread that had been scheduled in between.
## 7. Ours clock infrastructure
**Verdict**: fixed FILETIME constant. No wallclock dependency in the hot path.
- `KeQuerySystemTime` returns `132_500_000_000_000_000` (mid-November 2020) via OUT-ptr (`exports.rs:628`).
- `KeQueryInterruptTime` returns `0x0000_0001_0000_0000` (`exports.rs:504`).
- `event_log.rs` uses `Instant::now()` for the observability `host_ns` field — non-deterministic but not consumed by the matched-prefix metric.
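The fixed FILETIME value can be sanity-checked with a short conversion. FILETIME counts 100 ns ticks since 1601-01-01 UTC, and the offset to the Unix epoch is 11,644,473,600 seconds. The helper name below is illustrative and does not come from `exports.rs`.

```rust
/// Seconds between 1601-01-01 (FILETIME epoch) and 1970-01-01 (Unix epoch).
const FILETIME_TO_UNIX_EPOCH_SECS: i64 = 11_644_473_600;
/// FILETIME ticks are 100 ns each, so there are 10 million per second.
const TICKS_PER_SEC: i64 = 10_000_000;

/// Convert a FILETIME tick count to whole Unix seconds.
fn filetime_to_unix_secs(filetime: i64) -> i64 {
    filetime / TICKS_PER_SEC - FILETIME_TO_UNIX_EPOCH_SECS
}
```

For the constant above, `filetime_to_unix_secs(132_500_000_000_000_000)` gives `1_605_526_400`, which is 2020-11-16 11:33:20 UTC.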
## 8. Sylpheed workload profile (probe)
Ran on `xenia-rs/audit-runs/phase-c22-rtl-enter-leave-control-flow/ours-cold.jsonl` (121,569 events):
| event | count | notes |
|---|---|---|
| RtlEnterCriticalSection (kernel.call) | 19,494 | ≈80% of all kernel.calls |
| RtlLeaveCriticalSection (kernel.call) | 19,492 | matches Enter (off-by-2 from boot edge) |
| NtClose | 160 | |
| NtCreateEvent | 103 | |
| NtReleaseSemaphore | 99 | |
| NtQueryInformationFile | 93 | |
| NtWaitForMultipleObjectsEx | 92 | |
| KeWaitForSingleObject | 5 | |
| KeWaitForMultipleObjects | 1 | |
| **KeQuerySystemTime** | **2** | clock-light workload |
| KeQueryPerformanceFrequency | 6 | |
| KeQueryPerformanceCounter | 0 | |
| KeQueryInterruptTime | 0 | |
| KeDelayExecutionThread | 0 | |
| NtYieldExecution | 0 | |
| wait.begin events (all kinds) | 34 | most with `timeout_ns=-1` (indefinite) |
**Implications**:
- Sylpheed is CS-dominated. Stage-1 emitter on RtlEnterCS captures the dominant signal.
- Sylpheed barely touches the clock. Approach A (cycle clock in canary) addresses ≈2 events out of 121,569. Wrong target.
- Wait surface is small (34 events). Wait-side replay is low-value; scope to CS only.
## 9. The 104,607 divergence (re-verified)
From C+22 memory + jitter jsonl re-analysis:
| sample | tid=6 events 104,604..104,615 (import.call only) |
|---|---|
| c21 archived | E E L L |
| canary jitter-1 | E (wait.begin slow path) E L L |
| canary jitter-2 | E E L L |
| canary jitter-3 | (shifted) E E L L |
| fresh c22 | E (wait.begin slow path) E L L |
All canary samples have the EXTRA nested RtlEnterCriticalSection (second `E` before the final `L L`). Ours never does — it goes `E L NtClose`. Structural divergence post-absorber-engagement.
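The divergence point between two per-tid event sequences can be located with a matched-prefix scan. This is a simplified sketch, not the logic in `diff_events.py`: it compares bare event labels only, with no absorber rules, and the function name is an assumption.

```rust
/// Number of leading events on which the two sequences agree.
fn matched_prefix_len(a: &[&str], b: &[&str]) -> usize {
    a.iter().zip(b.iter()).take_while(|(x, y)| x == y).count()
}
```

Applied to canary's `E E L L` and ours' `E L NtClose`, the result is 1: the sequences agree on the first Enter and diverge at the second event, where canary has the nested Enter and ours has the Leave.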
Shared dispatcher: canary's wait.begin `handles_semantic_ids=['75ae880ec432eb36']` — this is the CS embedded Event dispatcher, lazy-wrapped by `XObject::GetNativeObject`. Same SID computed via C+18 shared-global recipe in both engines.
## 10. Cvar inventory (canary side)
Grep across `xenia-canary/src/xenia/` for `DEFINE_bool|DEFINE_int|DEFINE_uint|DEFINE_string`:
- `clock_no_scaling` (`base/clock.cc:24`)
- `clock_source_raw` (`base/clock.cc:28`)
- `ignore_thread_priorities` (`kernel/xthread.cc:30`)
- `ignore_thread_affinities` (`kernel/xthread.cc:33`)
- `stack_size_multiplier_hack` (`kernel/xthread.cc:37`)
- `main_xthread_stack_size_multiplier_hack` (`kernel/xthread.cc:39`)
- `phase_a_event_log_path` (`cpu/cpu_flags.cc:84`) — Phase A trace gate
- `phase_a_event_log_mem_writes` (`cpu/cpu_flags.cc:88`) — reserved, not wired
- `phase_b_snapshot_dir` (`cpu/cpu_flags.cc:94`) — Phase B image snapshot
- `phase_b_snapshot_and_exit` (`cpu/cpu_flags.cc:100`)
No `lockstep`, `deterministic`, `replay`, `single_thread`, `cooperative` cvars exist. **No built-in deterministic mode.**
## 11. Diff-tool absorber state (post-C+21)
`xenia-rs/tools/diff-events/diff_events.py` (767 LOC):
- `collect_shared_global_sids()`: pre-pass union of (a) recipe-matching SIDs (C+18) and (b) cross-tid usage heuristic — any SID used by handle.create OR wait.begin on ≥2 distinct tids.
- `is_shared_global_wait_begin()`: classifies a wait.begin as floating if any handle_sid is in the shared-global set.
- `diff_one_tid()`: floating-absorbs `handle.create` (C+18) and `wait.begin` (C+21) on kind mismatches.
- `SKIP_PAYLOAD_FIELDS_BY_KIND`: skips engine-local fields per kind.
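The cross-tid usage heuristic (b) and the floating classification can be sketched as follows. The real tool is Python; this Rust version is an illustration only. The `Event` struct and its field names are assumptions about the event shape, and the recipe-matching half (a) of the pre-pass is omitted.

```rust
use std::collections::{BTreeSet, HashMap, HashSet};

/// Minimal event shape for this sketch (hypothetical field names).
struct Event<'a> {
    tid: u32,
    kind: &'a str,
    sids: Vec<&'a str>,
}

/// SIDs used by handle.create or wait.begin on at least 2 distinct tids.
fn cross_tid_shared_sids<'a>(events: &[Event<'a>]) -> BTreeSet<&'a str> {
    let mut tids_by_sid: HashMap<&'a str, HashSet<u32>> = HashMap::new();
    for ev in events {
        if ev.kind == "handle.create" || ev.kind == "wait.begin" {
            for sid in &ev.sids {
                tids_by_sid.entry(*sid).or_default().insert(ev.tid);
            }
        }
    }
    tids_by_sid
        .into_iter()
        .filter(|(_, tids)| tids.len() >= 2)
        .map(|(sid, _)| sid)
        .collect()
}

/// A wait.begin is floating if any of its SIDs is in the shared-global set.
fn is_shared_global_wait_begin(ev: &Event<'_>, shared: &BTreeSet<&str>) -> bool {
    ev.kind == "wait.begin" && ev.sids.iter().any(|sid| shared.contains(*sid))
}
```

Events of other kinds do not contribute tids to the count, so a SID seen once in `handle.create` and once in an unrelated event kind on another tid stays out of the set.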
**Reading-error #23 boundary**: absorbing the post-wait Enter/Leave block (canary's extra `E` then `L` at 104,610-104,615) would be folding real guest behavior, not transient observation. The plan's Stage 3 instead makes ours produce the same observation by forcing ours into the same contended state.
## 12. Tid-chain mapping (stable per memory baseline)
| canary | ours |
|---|---|
| 6 | 1 |
| 4 | 11 |
| 7 | 2 |
| 12 | 7 |
| 14 | 9 |
| 15 | 10 |
This is a *display* convention for cross-engine alignment in diff reports. In the wire format, each engine emits its native tid. The manifest in Stage 2-3 keys on the source-side native tid — no translation needed since each side consumes events it produced.
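For diff reports, the table can be expressed as a plain lookup. The function name is hypothetical, and the mapping is display-only: it is not applied to the wire format.

```rust
/// Display-only mapping from a canary tid to its counterpart in ours.
/// Returns None for tids outside the documented chain.
fn ours_tid_for_canary(canary_tid: u32) -> Option<u32> {
    match canary_tid {
        6 => Some(1),
        4 => Some(11),
        7 => Some(2),
        12 => Some(7),
        14 => Some(9),
        15 => Some(10),
        _ => None,
    }
}
```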
## 13. Methodology rules in force
- **Reading-error #28** (verify source first): applied — read both engines' RtlEnterCS implementations before designing.
- **Reading-error #32** (canary non-deterministic in contention regions): characterized — 3 jitter samples documented.
- **Reading-error #33** (canary cache lives in binary-dir under wine): not relevant here.
- **Reading-error #34** (use `.iso` not loose `.xex`): apply in all validation runs.
- **Cold-vs-cold protocol**: canary `--mute=true`, ours `XENIA_CACHE_WIPE=1`.
- **Stop hook rename**: rename background binaries before any backgrounded run (e.g. `xrs-verify-stage0`, `xrs-replay`).
## 14. Confidence calibration
| claim | source-verified | probe-verified | confidence |
|---|---|---|---|
| Canary spins, ours doesn't | yes (xboxkrnl_rtl.cc:613 + exports.rs:2927) | n/a (static) | high |
| Sylpheed clock-light | n/a | yes (kernel.call counts) | high |
| 104,607 divergence is structural | yes (C+22 mech) | yes (5 canary samples consistent) | high |
| C+18 shared-global SID is cross-engine identical | yes (event_log.rs + event_log.cc) | implicit (matched in diff reports) | high |
| Canary has no deterministic mode cvar | yes (grep) | n/a | high |
| Stage-0 quantum spike may unblock | no (untested) | no | medium |
| Stage-3 manifest replay unblocks | no | no | medium-high (mechanism sound, integration risk) |
| Sister chain regression ≤5 acceptable | n/a | n/a | open question for user |
## Open unknowns (deferred to implementation)
1. The exact `cs_ptr` of the contended CS at canary tid=6 idx 104,608 is not directly emitted by the current schema (the `wait.begin` payload carries SID but not the raw pointer). Stage 1's `cs_ptr` field plugs this gap.
2. Does Sylpheed initialize the contended CS with `RtlInitializeCriticalSectionAndSpinCount(spin_count > 0)` or just `RtlInitializeCriticalSection` (zero spin)? Affects whether canary's spin path can succeed at this site. Probe by reading the cs's `header.absolute` field during a canary run.
3. The dispatcher Event's first-toucher tid differs across cold runs (canary tid=9 in one, other tids in others). Is it stable enough across cold runs of the SAME canary binary to be a reliable replay anchor? Stage 1 round-trip validation will reveal this.
4. Does the M3 `--parallel` mode in ours reproduce the same divergence pattern? Untested. Out of scope for this plan but worth a future probe.