xenia-rs/audit-runs/iterate-2V-scheduler-fairness-fix/writer-report.md
MechaCat02 ef93a4fa14 handoff: VSync/event-wedge fixes + iterate 2.A–2.BC research notes
Source changes (dormant parity infra, retained from iterate 2.AI/2.AO):
- xenia-kernel/exports.rs: nt_create_event manual_reset polarity +
  related event wiring
- xenia-gpu/mmio_region.rs: D1MODE_VBLANK_VLINE_STATUS hardcode parity

Also lands the audit-runs/ analysis notes (.md/.txt/.json digests) for the
iterate 2.x VSync/0x10e8/0x1004 wedge investigation. Raw trace dumps
(.jsonl/.gz/.csv/.stdout) and agent worktrees (.claude/) are gitignored as
regenerable local artifacts — see memory + HANDOFF for the running findings.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-05 07:19:08 +02:00

# Iterate 2.V — Scheduler fairness fix (age-priority anti-starvation)
**Date:** 2026-05-28. **LOC delta:** engine **~30 substantive added lines**
(scheduler.rs only; ~75 LOC including new doc comments). All retained.
**Option:** A (priority aging). **Tests:** xenia-cpu 300 / xenia-kernel 227
/ xenia-app 5 / xenia-path 19 + ~25 smaller suites; full workspace PASS,
0 regressions.
## Headline
**WEDGE-DISSOLVED-NEW-BLOCKER (PROGRESSION OBSERVED).**
The 18-day strict-priority starvation on CPU5 is broken. With `pick_runnable`
now ranking by *effective* priority `= base + age_bonus(rounds since last
pick)`, tid=6 (pri=0) finally runs after tid=10 (pri=15) ages out, and the
cascade that follows produces:
- **tid=6 signals handle 0x000012e4 exactly as predicted** — the primary
keystone gate. 1 `signal.match` event by `NtSetEvent` on
`target_handle:0x000012e4`, `waiter_tids:[5]`. **Was 0 at 2.T baseline.**
- **tid=6 event count 17 → 386** (~23×). Now Blocked on the wedge
handles 0x000010b0/0x000010b4 (deadline-bounded), not Ready-stuck.
- **tid=13 EXITED** with code 0 (was the original AUDIT-049 wedge from
10 May 2026 — stuck for 18 days).
- **Total events 121,641 → 13,003,881** (107× more events; first time
the boot has crossed multi-second wallclock progression in this trace).
- **Alive threads 13 → 21** (8 new threads spawned: 14, 15, 16, 17,
18, 19, 20, 21; 13 and 14 ran to completion and exited).
- **Wallclock last-event 766.86 ms → 51,011 ms** (66× longer trace).
Hard new wedges still exist (15 wedge_map entries vs 10 at baseline), but
they are *downstream* of the original wedge — the boot has structurally
advanced. The fix is **mechanism-correct and non-regressive**; the next
wedges are new territory.
## Option chosen: A (priority aging)
Justification: Option B (quantum-based round-robin to lower priority on
N-cycle timeout) requires either (a) violating priority ordering on every
expiry, which destabilizes existing tests like
`test_two_threads_same_slot_higher_priority_runs_first`, or (b) a
separate "starvation counter" that essentially reinvents aging. Option A
folds cleanly into the existing `max_by_key` shape, is fully
deterministic (counts on `Scheduler::round_count`), and degenerates to
the strict-priority rule on round 0 — so every existing test continues
to pass without modification.
## Patch summary
File: `crates/xenia-cpu/src/scheduler.rs`. ~30 substantive added LOC
(plus ~45 LOC of doc comments). Within scope (30-80 target, 150 hard
cap).
| change | purpose | LOC |
|---|---|---:|
| `const AGING_ROUNDS_PER_BONUS: u64 = 1;` | one round of starvation = +1 effective priority | 1 |
| `const MAX_AGE_BONUS: i32 = 31;` | cap (≥ any realistic NT priority diff; ≤ i32 safety margin) | 1 |
| `GuestThread::last_run_round: u64` field + init in `default_fields` | per-thread baseline for age math | 2 |
| `fn effective_priority(t, now_round) -> i32` | helper, saturating_sub + min + saturating_add | 6 |
| `HwSlot::pick_runnable(&self, now_round: u64)` | accepts round_count, ranks by `effective_priority` | 4 |
| `Scheduler::begin_slot_visit`: pass round_count, stamp winner's `last_run_round` | activates the fix per-pick | 4 |
| `Scheduler::spawn`: initialize `last_run_round = self.round_count` | prevent fresh threads inheriting giant ages | 1 |
| `Scheduler::install_initial_thread`: same | same | 1 |
| `Scheduler::decrement_quantum`: stamp `last_run_round` on rotation hand-off | keep age math consistent with the in-tier rotation path | 1 |
Doc comments on the new const, field, helper, and `pick_runnable` total
~45 LOC explaining the determinism, scope, and link back to this iterate.
The fix is purely additive — no existing field or method is removed.
`HwSlot::pick_runnable`'s signature changed from `(&self)` to
`(&self, now_round: u64)`; the only external caller
(`Scheduler::begin_slot_visit`) was updated in lockstep.
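
The aging rule summarized in the table can be sketched as follows. This is a minimal model, not the code in `scheduler.rs`: the `GuestThread` and `HwSlot` layouts, the `runnable` flag, the `simulate` driver, and the tie-breaking behaviour (whatever `max_by_key` does on equal keys) are all assumptions made for illustration. Only the two constants, the `last_run_round` field, and the `saturating_sub` / `min` / `saturating_add` shape come from the patch summary.

```rust
// Sketch only: type layouts are assumed, not copied from scheduler.rs.
const AGING_ROUNDS_PER_BONUS: u64 = 1; // one starved round = +1 effective priority
const MAX_AGE_BONUS: i32 = 31; // cap on the aging bonus

#[derive(Debug, Clone)]
pub struct GuestThread {
    pub tid: u32,
    pub priority: i32,       // base priority
    pub last_run_round: u64, // round in which this thread was last picked
    pub runnable: bool,      // hypothetical stand-in for the real thread state
}

/// Base priority plus a capped bonus for rounds spent waiting.
pub fn effective_priority(t: &GuestThread, now_round: u64) -> i32 {
    let age = now_round.saturating_sub(t.last_run_round);
    let bonus = (age / AGING_ROUNDS_PER_BONUS).min(MAX_AGE_BONUS as u64) as i32;
    t.priority.saturating_add(bonus)
}

pub struct HwSlot {
    pub threads: Vec<GuestThread>,
}

impl HwSlot {
    /// Index of the runnable thread with the highest effective priority.
    pub fn pick_runnable(&self, now_round: u64) -> Option<usize> {
        self.threads
            .iter()
            .enumerate()
            .filter(|(_, t)| t.runnable)
            .max_by_key(|(_, t)| effective_priority(t, now_round))
            .map(|(i, _)| i)
    }
}

/// Hypothetical driver: one pick per round, stamping the winner each time.
/// Returns the tids in the order they were picked.
pub fn simulate(slot: &mut HwSlot, rounds: u64) -> Vec<u32> {
    let mut picked = Vec::new();
    for round in 0..rounds {
        if let Some(i) = slot.pick_runnable(round) {
            slot.threads[i].last_run_round = round;
            picked.push(slot.threads[i].tid);
        }
    }
    picked
}
```

In this model, a slot holding a pri=15 thread and a pri=0 thread picks the pri=15 thread first, and the pri=0 thread's bonus closes the 15-point gap after roughly 16 rounds, which mirrors the tid=10 / tid=6 shape described in the headline. At round 0 every age is zero, so the pick reduces to strict priority.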
## Test results
```
cargo build --release -> OK (1 pre-existing dead_code warning unrelated)
cargo test --release --workspace:
xenia-cpu 300 passed, 0 failed
xenia-kernel 227 passed, 0 failed
xenia-app 5 passed, 0 failed (+ 3 ignored long-runners)
xenia-path 19 passed, 0 failed
+ ~25 smaller suites, 0 failures total
```
The test that exercises strict priority
(`test_two_threads_same_slot_higher_priority_runs_first`) still passes
because at `round_count = 0` every thread has `last_run_round = 0`, so
age = 0 ⇒ age_bonus = 0 ⇒ effective_priority == base_priority. The age
math only kicks in once `round_count` advances beyond a thread's last
pick, i.e. after actual starvation begins.
The quantum-rotation test
(`test_quantum_does_not_rotate_without_same_priority_peer`) still passes
because it never advances `round_count` (it only calls `decrement_quantum`
within one slot visit).
## Determinism check
Two cold runs (XENIA_CACHE_WIPE=1, -n 500000000) produced **bit-identical
event counts: 13,003,881 events each** (`ours-cold.jsonl` /
`ours-cold-run2.jsonl`).
Diff of the two JSONL files (after stripping the `host_ns` wallclock
noise that's not deterministic in any of our runs): **6 events differ
out of 13,003,881, only in the `guest_cycle` field** (5,577,193 vs
5,577,214 on a single `KeAcquireSpinLockAtRaisedIrql` /
`KeReleaseSpinLockFromRaisedIrql` pair at idx 105,282-105,287). Kinds, names, ords,
tids, and event-idx sequence are identical. This pre-existing tiny
spinlock-cycle drift was visible in 2.T as well; it is not introduced by
this iterate and does not affect the event-stream shape.
Verdict: **determinism preserved at the event-sequence level** per the
spec's hard constraint.
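
The comparison above (strip `host_ns`, then diff line by line) could be reproduced with a sketch like the one below. It assumes each trace line is a compact, flat JSON object with no whitespace after the colon and that the stripped field holds a bare number. The trace schema is not shown in this report, so both assumptions, and the helper names `strip_scalar_field` and `count_differing`, are hypothetical.

```rust
/// Remove one `"key":<scalar>` pair from a flat, single-line JSON object.
/// Assumes compact JSON and a bare scalar value (no commas or braces inside).
pub fn strip_scalar_field(line: &str, key: &str) -> String {
    let needle = format!("\"{}\":", key);
    let start = match line.find(&needle) {
        Some(i) => i,
        None => return line.to_string(), // field absent: nothing to strip
    };
    let value_start = start + needle.len();
    let value_len = line[value_start..]
        .find(|c: char| c == ',' || c == '}')
        .unwrap_or(line.len() - value_start);
    let mut begin = start;
    let mut end = value_start + value_len;
    if line[end..].starts_with(',') {
        end += 1; // field was not last: drop its trailing comma
    } else if line[..begin].ends_with(',') {
        begin -= 1; // field was last: drop the comma before it
    }
    format!("{}{}", &line[..begin], &line[end..])
}

/// Count line positions where two traces differ once `ignored_key` is removed.
/// Lines present in only one trace are counted as differences.
pub fn count_differing(a: &str, b: &str, ignored_key: &str) -> usize {
    let la: Vec<&str> = a.lines().collect();
    let lb: Vec<&str> = b.lines().collect();
    let common = la.len().min(lb.len());
    let changed = (0..common)
        .filter(|&i| {
            strip_scalar_field(la[i], ignored_key) != strip_scalar_field(lb[i], ignored_key)
        })
        .count();
    changed + (la.len().max(lb.len()) - common)
}
```

For multi-gigabyte traces the same comparison would need to stream both files rather than hold them in memory; the in-memory form here is only meant to make the predicate explicit.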
## Primary gate results
| gate | predicate | result |
|---|---|---|
| **tid=6 signals handle 0x000012e4** | `signal.match` for `target_handle:0x000012e4` ≥ 1 | **PASS**: 1 event by tid=6 `NtSetEvent`, `waiter_tids:[5]`, at guest_cycle=0/host_ns=844.35ms |
| **tid=6 event count > 105** | tid=6 emits >105 Phase-A events | **PASS**: 386 events (was 17) |
| **tid=6 NOT Ready-stuck on exit** | exit-thread-state shows tid=6 in Blocked/Exited, not Ready | **PASS**: `state:"Blocked"`, WaitAny on handles 0x000010b0 (Event) + 0x000010b4 (Semaphore), `deadline_ns_or_inf:42948072` |
All 3 primary gates pass. The mechanism is confirmed end-to-end:
tid=10 ages out → tid=6 picked → tid=6 progresses through prior wait
→ tid=6 advances past `NtSetEvent` (the missing signal in 2.T) → wakes
tid=5 → cascade unfolds.
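
The three gate predicates in the table are simple enough to state as code. The sketch below assumes a hypothetical in-memory `Event` record with `kind`, `tid`, and `target_handle` fields. The real trace schema and the `exit-thread-state.json` layout are not reproduced here, so treat the types as placeholders.

```rust
// Hypothetical event record; field names are assumptions for illustration.
#[derive(Debug, Clone)]
pub struct Event {
    pub kind: String,               // e.g. "signal.match"
    pub tid: u32,                   // emitting guest thread
    pub target_handle: Option<u32>, // set only on handle-directed events
}

/// Number of `signal.match` events aimed at `handle`.
pub fn signal_match_count(events: &[Event], handle: u32) -> usize {
    events
        .iter()
        .filter(|e| e.kind == "signal.match" && e.target_handle == Some(handle))
        .count()
}

/// Number of events emitted by `tid`.
pub fn events_for_tid(events: &[Event], tid: u32) -> usize {
    events.iter().filter(|e| e.tid == tid).count()
}

/// Gate 3: the exit state must be Blocked or Exited, never Ready.
pub fn not_ready_stuck(exit_state: &str) -> bool {
    matches!(exit_state, "Blocked" | "Exited")
}

pub struct GateReport {
    pub signal_gate: bool, // gate 1: >= 1 signal.match on 0x000012e4
    pub volume_gate: bool, // gate 2: tid=6 emits more than 105 events
}

pub fn primary_gates(events: &[Event]) -> GateReport {
    GateReport {
        signal_gate: signal_match_count(events, 0x0000_12e4) >= 1,
        volume_gate: events_for_tid(events, 6) > 105,
    }
}
```

Keeping each gate as its own function makes it straightforward to report them separately, as the table does, rather than folding them into a single pass/fail bit.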
## Secondary gates (cascade)
| gate | 2.T baseline | 2.V | direction |
|---|---:|---:|---|
| Total events | 121,641 | **13,003,881** | **107×** |
| Last event host_ns | 767 ms | **51,011 ms** | **66×** |
| Alive threads | 13 | **21** | **+8 spawned** |
| Exited threads (clean exit_code=0) | 0 | **2** (tid=13, tid=14) | new |
| Blocked @ PC=0x824ac578 (the AUDIT-049 set) | {1,3,4,5,13} | **{3,4,12,16,18}** | tid=1/5/13 unblocked; new tids appear |
| `signal.match` events | 36 | **75** | **+108%** |
| `wake.requested` events | 36 | **79** | **+119%** |
| Unique signal.match handles | small | **20+** | broader signaling surface |
| VdSwap calls (`import.call` count) | 1 | **2** | **+1** |
| Audio tid=10 events | 1 | **17** | **+16** (modest; aging works but tid=10 stays mostly CPU-bound between yields) |
| tid=6 events | 17 | **386** | **~23×** |
| tid=17 events (new worker) | n/a | **5,471,318** | massive new producer |
The originally-blocked set {1, 3, 4, 5, 13} at PC=0x824ac578 has
changed substantially. tid=1 is now Ready, tid=5 has advanced to
PC=0x824ab214 (a different wait wrapper), and tid=13 has exited cleanly.
Three of the original five threads are no longer parked on that PC;
tid=3 and tid=4 remain, joined by tid=12, 16 and 18.
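
The blocked-set comparison at PC=0x824ac578 is a plain set difference, sketched below with the two tid sets from the secondary-gates table. The `SetDelta` type and `blocked_set_delta` name are illustrative only.

```rust
use std::collections::BTreeSet;

/// How a blocked-thread set changed between two runs (tids sorted ascending).
#[derive(Debug, PartialEq, Eq)]
pub struct SetDelta {
    pub left: Vec<u32>,   // blocked at baseline, no longer blocked here
    pub stayed: Vec<u32>, // blocked in both runs
    pub joined: Vec<u32>, // newly blocked in the current run
}

pub fn blocked_set_delta(baseline: &[u32], current: &[u32]) -> SetDelta {
    let b: BTreeSet<u32> = baseline.iter().copied().collect();
    let c: BTreeSet<u32> = current.iter().copied().collect();
    SetDelta {
        left: b.difference(&c).copied().collect(),
        stayed: b.intersection(&c).copied().collect(),
        joined: c.difference(&b).copied().collect(),
    }
}
```

Applied to {1, 3, 4, 5, 13} and {3, 4, 12, 16, 18}, this yields tids 1, 5 and 13 as departed, 3 and 4 as still parked, and 12, 16 and 18 as new arrivals.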
VdSwap reached 2 (vs 1 baseline) — small absolute, but a definite gameplay
progression marker per tripstone #39. The second swap fires on tid=8 at
~1.22 s wallclock, vs the first on tid=1 at ~494 ms.
## Third-order observations (no claims, just data)
- **New wedge surface (15 entries vs 10)**. The new wedges include
several handles (0x14dc, 0x151c, 0x1510, 0x1514, 0x1020, 0x1004, 0x1308)
that didn't exist in the baseline trace — they correspond to handles
created by the new worker threads (15-21) that only exist post-cascade.
Not regressions; they are the next *natural* blocking point now that
the original blocker is dissolved.
- **One semaphore wedge with multiple waiters** (handle 0x00001308,
`count=0/max=2^31-1`, `waiters_tid:[15, 16]`) — classic
producer-underrun shape (AUDIT-069 family). Likely the next iterate's
target.
- **tid=10 / tid=9 still Ready at exit on CPU5/CPU4 at priority=15**
(the audio mixer pair). Both at PC=0x824d140c (vs 0x824d1404 at
baseline: moved by 8 bytes, i.e. two 4-byte PowerPC instructions past).
The aging bonus lets them yield occasionally; they're no longer pinning
their CPUs hard.
- **Run termination**: budget cap (500M instructions, `-n 500000000`); no crash, no
deadlock, no `unblock_on_deadlock` fire.
## Tripstone audit
- **#28 (cross-engine tid stability)**: All tid claims are ours-side
within this trajectory. The new tids 14-21 are first observed in this
iterate; no cross-engine tid mapping claimed.
- **#39 (composite progression IS progression)**: Honored. VdSwap=2,
swap count UP, but draws/render_targets not measured here. Headline
uses WEDGE-DISSOLVED-NEW-BLOCKER framing — does *not* claim
"boot complete" or "gameplay reached". The mechanism gate
(signal.match on 0x12e4) is direct and not a progression-laundering
proxy.
- **#40 (single-keystone framing)**: Care taken. The headline names
*both* "wedge dissolved" *and* "new blocker", per the spec's matrix.
Cascade gates are reported separately from the primary gate. Open
follow-ups (the new producer-underrun wedge on handle 0x1308) are not
collapsed into the win.
- **#41 (categorized diff tags)**: N/A this iterate (no diff harness run).
- **#42 (Phase-A blind to blocked-forever)**: Used `exit-thread-state.json`
to characterize the new wedge set (Phase-A alone would show only the
signal-match cascade up to the new block point).
- **#43 (no budget-cap framing)**: Budget cap (-n 500000000) reached
but the trace had structural progression throughout, not a wedge.
Cascade observation is robust at this budget.
## Confidence
- **HIGH** that the patch is correct and minimal: 30 substantive LOC,
0 test regressions, determinism preserved bit-for-bit on event count.
- **HIGH** that the primary keystone gate passes: `signal.match
target_handle:0x000012e4 waiter_tids:[5]` is exactly the predicted
unblock — observed unambiguously in the trace.
- **HIGH** that the cascade is genuine (not just emit-volume noise):
tid=13 EXITED cleanly is a structural event the baseline never
achieved in 18 days; 8 new threads spawned that the baseline never
reached; new handles in the wedge set that didn't exist at baseline.
- **MEDIUM-HIGH** that the new wedge set (handle 0x1308 semaphore
producer-underrun, several events without signalers) represents the
next genuine investigation surface — these are downstream of the
original wedge and likely have their own causal chain.
- **MEDIUM** that gameplay is imminent. VdSwap went from 1 to 2 and
the wallclock reached 51 s, but draws_count was not measured and the
game is clearly still inside boot phase B. Several more cascade
iterations likely needed.
- **LOW** that any of the existing 25+ iterates' specific wedge
diagnoses (AUDIT-049, 062, 067, 068, 069) directly apply post-fix
— the geometry has changed enough that prior root-cause analyses
need re-validation.
## Next-iterate recommendation
**2.W — investigate the new producer-underrun on handle 0x00001308**
(semaphore count=0/max=2^31-1, waiters tid=[15, 16] both on CPU3 at
PC=0x824ac578). Use the existing `signal.match` / `wake.requested`
event surface (already active) to identify which tids if any are
releasing this semaphore — if zero, the next root cause is a missing
producer (AUDIT-069 family); if non-zero but rate is low, it's a
consume-rate divergence (AUDIT-068 family). ~0-50 LOC.
Alternative: **2.X — measure draws/render_targets** to quantify how
close we are to first gameplay frame. ~30-50 LOC instrumentation in
xenia-gpu's `D3D_DrawIndexedPrimitive` path.
**Strong recommend 2.W first** — the wedge is concrete and the tooling
already exists.
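
The 2.W triage rule (zero releasers means a missing producer; some releasers at too low a rate means consume-rate divergence) might be expressed as the sketch below. Everything here is hypothetical: the `SemEvent` record, the `SemOp` split, and in particular the "releases fewer than waits" threshold used to stand in for "rate is low", which the report does not define.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum WedgeClass {
    MissingProducer,       // nobody ever releases the semaphore (AUDIT-069 family)
    ConsumeRateDivergence, // releases exist but lag behind waits (AUDIT-068 family)
    NotStarved,            // releases keep up with waits
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SemOp {
    Release,
    Wait,
}

// Hypothetical per-event record; not the real trace schema.
#[derive(Debug, Clone, Copy)]
pub struct SemEvent {
    pub handle: u32,
    pub tid: u32,
    pub op: SemOp,
}

/// Classify one semaphore handle and list the distinct tids that released it.
pub fn classify_semaphore(events: &[SemEvent], handle: u32) -> (WedgeClass, Vec<u32>) {
    let mut releases = 0usize;
    let mut waits = 0usize;
    let mut releasers: Vec<u32> = Vec::new();
    for e in events.iter().filter(|e| e.handle == handle) {
        match e.op {
            SemOp::Release => {
                releases += 1;
                if !releasers.contains(&e.tid) {
                    releasers.push(e.tid);
                }
            }
            SemOp::Wait => waits += 1,
        }
    }
    // Assumed threshold: "low rate" means fewer releases than waits.
    let class = if releases == 0 {
        WedgeClass::MissingProducer
    } else if releases < waits {
        WedgeClass::ConsumeRateDivergence
    } else {
        WedgeClass::NotStarved
    };
    (class, releasers)
}
```

Returning the releaser tids alongside the class answers the "which tids, if any" question directly, so a non-empty list with a starved classification points at the producer thread to inspect next.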
## Artifacts
Under `xenia-rs/audit-runs/iterate-2V-scheduler-fairness-fix/`:
- `ours-cold.jsonl` (3.13 GB, 13,003,881 events)
- `ours-cold.stdout.log` (empty — quiet mode)
- `ours-cold.stderr.log` (single emission-notice line)
- `exit-thread-state.json` (15.6 KB; 21 alive + 15 wedge_map entries)
- `ours-cold-run2.{jsonl,stdout.log,stderr.log}` (determinism check —
bit-identical event count, only 6 events with tiny `guest_cycle`
drift in a pre-existing spinlock pair)
- `writer-report.md` (this file)
Engine HEAD `e6d43a23ac393004d2e5adf2f0395fd0b5e6448b` + uncommitted
2.Q signal.match + 2.T wake.requested + this iterate's 2.V scheduler
fairness patch. xenia-canary UNCHANGED.