Approach Tradeoff Matrix
Each approach is evaluated against the same criteria. The recommended approach is H' (manifest replay, scoped to RtlEnterCS), gated on the Stage 0 spike.
Criteria
- Eng LOC: estimated engine-source modification (ours + canary).
- Tool LOC: estimated diff-tool / python tooling.
- Test LOC: estimated tests.
- Unblocks 104,607?: probability of advancing the main matched-prefix past the current cap.
- Preserves ours digest: whether e1dfcb1559f987b35012a7f2dc6d93f5 (Phase A) and ea8d160e… (Phase B) remain unchanged in the default mode.
- Preserves canary default: whether canary's default-mode (no new cvar) cold-run behavior is byte-identical.
- Wine-constraint: whether the approach requires changing Wine itself (always: NO — out of scope).
- Reading-error risk: which class of reading error this approach risks crossing.
Matrix
| approach | Eng LOC | Tool LOC | Test LOC | Unblocks 104,607? | Preserves ours digest | Preserves canary default | Reading-error risk | Verdict |
|---|---|---|---|---|---|---|---|---|
| A — cycle-counted clock in canary | ~200 (base/clock.cc) | 0 | ~50 | NO (Sylpheed: 2 KeQuerySystemTime calls) | yes | yes (cvar-gated) | #19 (wrong-target) | WRONG TARGET |
| B — single-thread cooperative canary | ~2000-3000 (xthread.cc, threading*.cc, processor.cc) | ~50 | ~300 | YES | yes | NO (fundamentally changes scheduling) | #28 (rewrite-without-verify) | OVERSCOPED |
| C/H — manifest replay, broad (CS + wait) | ~600-700 | ~200 | ~200 | YES (with risk in wait-side semantics) | yes (default-off) | yes (cvar-off) | #23 (synthetic events) | 2nd choice |
| H' — manifest replay, scoped to RtlEnterCS | ~450-500 | ~180 | ~150 | YES | yes (default-off) | yes (cvar-off) | #23 (bounded) | RECOMMENDED |
| D — diff-harness absorption extension | 0 | ~150 (diff_events.py) | ~50 | PARTIAL (10-100 idx) | yes | yes | #23 (FOLDS REAL GUEST CODE) | fallback only |
| E — A+D hybrid | ~200 | ~150 | ~100 | LOW (clock isn't the lever; D hits #23 wall) | yes | yes | #19 + #23 | band-aid |
| F — make ours preemptive | ~500 (scheduler.rs) | 0 | ~100 | UNKNOWN (no replay anchor) | NO (destabilizes cold digest) | n/a | #28 (loses 23 phases of stabilization) | WRONG DIRECTION |
| Stage 0 spike — cycle-quantum preemption | ~80 (scheduler.rs) | 0 | ~40 | TBD by spike | TBD (default Fixed unchanged) | n/a | #19 (premature optimization if not validated) | GATE |
| spin-then-wait fix in ours | ~50 (exports.rs:2886) | 0 | ~30 | NO (wrong direction: adding spin makes contention less likely on ours's side) | yes | n/a | #28 (verified: would not help 104,607) | document, defer |
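
The selection logic implied by the matrix can be sketched as a filter followed by a tie-break. This is an illustrative model, not part of the tooling: the `Approach` record, the use of the upper LOC bound, and the choice of Eng LOC as the tie-breaker are assumptions made for the sketch (the verdict column also weighs reading-error risk, which is not modeled here). The Stage 0 row is omitted because its cells are still TBD, and "n/a" in the canary column is treated as "preserved" because those rows do not touch canary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Approach:
    name: str
    eng_loc: int                # upper bound of the Eng LOC estimate
    unblocks: str               # cell text, reduced to its leading keyword
    keeps_ours_digest: bool
    keeps_canary_default: bool  # "n/a" rows are recorded as True (assumption)


# Transcribed from the matrix above; nothing here is measured by this sketch.
APPROACHES = [
    Approach("A", 200, "NO", True, True),
    Approach("B", 3000, "YES", True, False),
    Approach("C/H", 700, "YES", True, True),
    Approach("H'", 500, "YES", True, True),
    Approach("D", 0, "PARTIAL", True, True),
    Approach("E", 200, "LOW", True, True),
    Approach("F", 500, "UNKNOWN", False, True),
    Approach("spin-then-wait", 50, "NO", True, True),
]


def viable(a: Approach) -> bool:
    """Hard constraints: must unblock and must preserve both defaults."""
    return a.unblocks == "YES" and a.keeps_ours_digest and a.keeps_canary_default


def recommend(approaches):
    """Among viable approaches, prefer the smallest engine change."""
    return min((a for a in approaches if viable(a)), key=lambda a: a.eng_loc)


if __name__ == "__main__":
    print([a.name for a in APPROACHES if viable(a)])
    print(recommend(APPROACHES).name)
```

Under these assumptions only C/H and H' survive the hard constraints, and the LOC tie-break picks H', which matches the verdict column.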
Detailed reasoning
Why H' over C/H (broad)
The broad variant (C/H) covers both RtlEnterCriticalSection and KeWaitForSingleObject. Phase 1 evidence shows:
- 19,494 RtlEnter calls in Sylpheed's boot
- 34 wait.begin events total
The CS surface is ~570× larger than the wait surface. Adding wait-side replay buys little. More importantly, wait-side replay has tougher semantics: when canary's KeWaitForSingleObject fires on a TIMER (with a host-wallclock deadline), ours can't replay because ours doesn't have a wallclock to match.
H' defers wait-side replay until evidence shows it's needed (backstop in plan.md §Backstop).
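
The surface-size comparison above is simple arithmetic on the two Phase 1 counts; a minimal check, with variable names chosen only for this sketch:

```python
# Phase 1 counts quoted in the text above.
rtl_enter_calls = 19_494
wait_begin_events = 34

# How many times larger the critical-section surface is than the wait surface.
ratio = rtl_enter_calls / wait_begin_events

# Share of the combined (CS + wait) surface that a CS-only scope already covers.
cs_share = rtl_enter_calls / (rtl_enter_calls + wait_begin_events)

if __name__ == "__main__":
    print(f"ratio: {ratio:.0f}x, CS share of combined surface: {cs_share:.2%}")
```

The ratio rounds to 573, consistent with the "~570×" figure, and a CS-only scope covers well over 99% of the combined call count.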
Why H' over B (single-thread canary)
B fundamentally changes the oracle. The oracle's stability across phases is a foundational invariant; modifying its scheduling layer introduces game-compatibility risk that we cannot fully test (only Sylpheed is in scope, but canary supports many titles). LOC is also 4-6× larger.
H' leaves the oracle's behavior unchanged in the default case. The contention emitter (Stage 1) is a passive observer; the manifest captures one canary cold run as canonical and ours replays it. Canary is not asked to be deterministic — it's asked to report its non-determinism.
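
A minimal sketch of the capture-then-replay idea, assuming a manifest that records which critical-section entries were contended in the canonical canary run. The event fields (`tid`, `cs`, `contended`), the `nth` per-(thread, CS) counter used as the replay key, and the JSON shape are all hypothetical; the actual manifest format is whatever Stage 1 defines.

```python
import json
from collections import defaultdict


def capture(events):
    """Build a manifest from one canary cold run.

    `events` is an iterable of dicts such as
    {"tid": 6, "cs": 0x1000, "contended": True}, in emission order.
    Only contended entries are stored; each is keyed by how many times
    that thread had already entered that critical section.
    """
    seen = defaultdict(int)
    manifest = []
    for ev in events:
        key = (ev["tid"], ev["cs"])
        nth = seen[key]
        seen[key] += 1
        if ev["contended"]:
            manifest.append({"tid": ev["tid"], "cs": ev["cs"], "nth": nth})
    return manifest


class Replayer:
    """Answers 'should this RtlEnterCS call be treated as contended?'"""

    def __init__(self, manifest):
        self._contended = {(m["tid"], m["cs"], m["nth"]) for m in manifest}
        self._seen = defaultdict(int)

    def should_contend(self, tid, cs):
        nth = self._seen[(tid, cs)]
        self._seen[(tid, cs)] += 1
        return (tid, cs, nth) in self._contended


if __name__ == "__main__":
    run = [
        {"tid": 6, "cs": 0x1000, "contended": False},
        {"tid": 6, "cs": 0x1000, "contended": True},
    ]
    # Round-trip through JSON to mimic writing and re-reading a manifest file.
    replayer = Replayer(json.loads(json.dumps(capture(run))))
    print([replayer.should_contend(6, 0x1000) for _ in run])
```

Keying on a per-(thread, CS) counter rather than a global event index is a design choice of this sketch: it keeps the replay decision independent of how the two engines interleave unrelated threads.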
Why H' over D (diff absorber extension)
The current C+21 absorber is already at the safe limit of reading-error #23. Extending the absorber to fold "post-wait nested Enter/Leave blocks" would hide REAL guest-code execution differences. The canary side's nested-Enter reads mutated memory and modifies state (lock_count, recursion_count) that affects subsequent events. Folding it at the diff layer means downstream divergences are misattributed.
D remains as a backstop (plan.md §Backstop item 2) for residual gaps post-Stage-3, with explicit reading-error annotation.
Why H' over F (make ours preemptive)
23 phases (C+1 through C+23) have stabilized ours's cold digest. Changing the default scheduler to preempt at fixed intervals would invalidate every prior baseline. Even if the new digest is stable, it severs continuity with the existing test infrastructure and audit-run archives.
H' preserves ours's default OrderMode::Fixed. The replay mode is opt-in via --scheduler-replay-manifest PATH. Default-mode digest is provably unchanged (Stage 3 validation #2).
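
The opt-in behavior can be modeled as follows. Only the flag name `--scheduler-replay-manifest` and the `Fixed` default come from the text above; the `REPLAY` variant, the `parse_scheduler_args` helper, and the use of Python's `argparse` (ours is Rust) are assumptions made to keep the sketch short and runnable.

```python
import argparse
from enum import Enum


class OrderMode(Enum):
    FIXED = "fixed"    # default mode named in the text
    REPLAY = "replay"  # hypothetical name for the manifest-driven mode


def parse_scheduler_args(argv):
    """Return (mode, manifest_path); the mode stays FIXED unless the flag is given."""
    parser = argparse.ArgumentParser(prog="ours")
    parser.add_argument("--scheduler-replay-manifest", metavar="PATH", default=None)
    args = parser.parse_args(argv)
    path = args.scheduler_replay_manifest
    mode = OrderMode.REPLAY if path is not None else OrderMode.FIXED
    return mode, path


if __name__ == "__main__":
    print(parse_scheduler_args([]))
    print(parse_scheduler_args(["--scheduler-replay-manifest", "manifest.json"]))
```

Because the replay path is only reachable when the flag is present, a run without it takes the same code path as before, which is the property the default-mode digest check relies on.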
Why Stage 0 first
Cost is 1 day, 80 LOC. If a tuned quantum advances the prefix past 104,607 with a stable digest, the manifest work (Stages 1-4, ~450-500 LOC, 3-5 sessions) is unnecessary. Even if Stage 0 doesn't fully unblock, the data informs the manifest design (e.g., "quantum=200 advances prefix by 800 events but stalls at 105,407" tells us the next divergence is a different class).
Any plan that skips Stage 0 is dominated by the same plan with Stage 0 included: skipping risks doing ~500 LOC of unnecessary work.
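
One way to summarize a Stage 0 quantum sweep is sketched below. The `best_quantum` helper and the sweep shape are hypothetical, and the sample values are illustrative only: the `quantum=200` entry mirrors the example above, the others are placeholders. The actual decision tree lives in plan.md and is not reproduced here.

```python
CAP = 104_607  # current matched-prefix cap


def best_quantum(sweep):
    """Pick the stable run with the longest matched prefix beyond the cap.

    `sweep` maps quantum -> (matched_prefix, digest_stable).
    Returns (quantum, events_advanced), or None if no stable run passes the cap.
    """
    stable = {q: prefix for q, (prefix, ok) in sweep.items() if ok and prefix > CAP}
    if not stable:
        return None
    q = max(stable, key=stable.get)
    return q, stable[q] - CAP


if __name__ == "__main__":
    # Illustrative inputs, not measurements.
    sweep = {
        100: (104_607, True),   # no advance
        200: (105_407, True),   # the example from the text: +800 events
        400: (106_000, False),  # longer prefix but unstable digest, so rejected
    }
    print(best_quantum(sweep))
```

Runs with an unstable digest are discarded before comparing prefixes, since a longer prefix that cannot be reproduced does not replace the manifest work.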
Why NOT spin-then-wait fix in ours
The 104,607 divergence is canary contending, ours NOT contending. Adding spin to ours would make ours's RtlEnterCS try harder to acquire without parking — which makes contention less likely on the ours side, the OPPOSITE of what we need. Documenting the spin asymmetry is valuable for future divergences in the opposite direction (where ours spuriously contends and canary doesn't), but it's not the lever for 104,607.
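
The direction of the effect can be seen in a toy model, assuming an acquirer parks (and so registers as contended) only when the owner holds the lock longer than the spin budget. Time units, the `enter_outcome` and `parked_count` helpers, and the sample hold times are all hypothetical and are not taken from exports.rs.

```python
def enter_outcome(hold_remaining, spin_budget):
    """Classify one lock entry in the toy model.

    hold_remaining: how long the current owner will still hold the lock
                    (0 means the lock is free).
    spin_budget:    how long the acquirer spins before parking.
    """
    if hold_remaining == 0:
        return "uncontended"
    if hold_remaining <= spin_budget:
        return "acquired-after-spin"
    return "parked"


def parked_count(hold_times, spin_budget):
    """Number of entries that end up parking, i.e. visible contention."""
    return sum(enter_outcome(h, spin_budget) == "parked" for h in hold_times)


if __name__ == "__main__":
    holds = [0, 3, 10, 50]  # illustrative values
    for budget in (0, 5, 20):
        print(budget, parked_count(holds, budget))
```

In this model a larger spin budget can only lower the parked count, never raise it, so adding spin moves ours further from canary's contended behavior at 104,607.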
Open tradeoffs (decisions deferred to user / Stage 0 outcome)
- Stage 0 alone might suffice: if quantum=N produces a stable digest matching canary's behavior at 104,607, the plan collapses to a single 80-LOC change. Stage 0 decision tree is in plan.md.
- Sister chain regression budget: -5 per sister. If exceeded post-Stage-3, scope manifest to tid=6 only initially, then iterate.
- Wait-side replay (broad H): deferred unless sister chains (especially the tid=12→7 timeout class) need it. Backstop only.
- Approach D extension as final band-aid: documented in backstop with explicit #23 annotation. Only land if Stages 0-4 leave residual divergence with no other path forward.