# Iterate 2.AO: VBLANK MMIO Hardcode (C-1 candidate from 2.AN)

**Headline: FIX-INERT-C2-CONFIRMED.** The 2.AN Angle-A fix (hardcode `D1MODE_VBLANK_VLINE_STATUS` / reg `0x1951` to return `1` on read, matching xenia-canary `graphics_system.cc:309-310`) is **applied, builds, passes all tests, preserves determinism, and is fully inert**. VdSwap stays at 6, the total event count is identical to the 2.AI/2.AJ baseline (65,691,821 events), and the exit-thread-state / wedge map are byte-for-byte identical to 2.AJ. C-1 (the VBLANK read asymmetry) was **not** the active blocker. The deeper bottleneck C-2 (`opt_callback` at `user_data+15144` never installed) is confirmed as the prime suspect.

---

## Patch summary

| File | Change | LOC | Notes |
|------|--------|-----|-------|
| `crates/xenia-gpu/src/mmio_region.rs` | read arm `reg::D1MODE_VBLANK_VLINE_STATUS` now returns `1` unconditionally instead of `read_vblank_status.load(Relaxed)` | **+9 / -1** (1 substantive + 8 doc/`let _` keep-alive) | single match arm |
| `crates/xenia-kernel/src/exports.rs` | **untouched** (2.AJ reciprocal-shadow patch) | +45 (pre-existing) | left in place as instructed |

- Diff (`git diff --numstat`): `9 1 crates/xenia-gpu/src/mmio_region.rs`, under the 10-LOC hard cap.
- The captured `read_vblank_status` clone is held with `let _ = &read_vblank_status;` so the closure still moves it and compiles clean.
- The write closure's W1TC path and `tick_vsync_instr` are untouched (`write_vblank_status` still used there). No refactor.
- Branch `chore/portable-snapshot`, HEAD `acd1656`. Patch UNCOMMITTED in working tree (as required). 2.AJ `exports.rs` patch verified intact (+45).

### Source confirmation

- `reg::D1MODE_VBLANK_VLINE_STATUS == 0x1951` at `gpu_system.rs:1430`.
- Canary `case 0x1951: return 1; // vblank` at `graphics_system.cc:309-310`: exact match.
- The comment in our own source at `gpu_system.rs:224-232` independently documents 2.AN's premise: the Sylpheed vsync callback "gates *all* its work on reading bit 0 as set: `lwz; rlwinm. r,r,0,31,31; bc 12,2,skip`".

---

## Verification gates

### Build / Test (PRIMARY)

- `cargo build --release`: **SUCCEEDS** (incremental, 0.88s). Only a pre-existing, unrelated `dead_code` warning in `phase_b_snapshot.rs:245` (`walk_committed_regions`), not from this patch.
- `cargo test -p xenia-gpu -p xenia-kernel -p xenia-app -p xenia-cpu`: **687 pass, 0 fail, 0 regressions** (xenia-app 300; xenia-kernel 227 + 149; xenia-cpu 6; xenia-gpu 5; plus ignored doctests). Matches historical baseline exactly.

### Determinism (PRIMARY): **PASS**

- run1 `ours-cold.jsonl`: **65,691,821** events.
- run2 `ours-cold-run2.jsonl`: **65,691,821** events.
- Line counts are exactly equal across two cold runs (`XENIA_CACHE_WIPE=1`, `-n 500000000`). The ~763 KB byte-size delta between the two files is trailing-buffer noise, not an event-count divergence.

### VdSwap (PRIMARY): **NO CHANGE → C-1 not the gate**

- run1 VdSwap: **6**. run2 VdSwap: **6**.
- 2.AI/2.AJ baseline: 6. **No progression.** Per the gate definition, an unchanged VdSwap means C-1 was not the active blocker.

### Total event count vs baseline: **IDENTICAL**

- 2.AO = 65,691,821. 2.AJ baseline = 65,691,821. **Exactly equal.** The hardcode produced zero observable divergence in the execution trace.

### Exit-state (tid=1 / tid=12): **byte-identical to 2.AJ**

- `diff exit-thread-state.json` (2AJ vs 2AO): **BYTE-IDENTICAL**. Same 21 alive threads, same 18 wedge entries.
- **tid=1**: `Blocked` @ PC `0x824ac578`, waiting on **Event `0x000010e8`** (sig=false, no signaler). Unchanged: this is the 2.AI/2.AJ wedge.
- **tid=12**: `Blocked` @ PC `0x824ac578`, waiting on **Event `0x00001004`** (sig=false, no signaler). Unchanged: this is the DPC-dispatcher wedge (2.AC/2.AM).
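The determinism, VdSwap, and event-count gates above boil down to comparing per-run totals. The following is a minimal sketch of that comparison, not the tooling actually used for this iterate. The JSON field names (`kind`, `name`), the substring matching, and the names `TraceSummary`, `Gate`, `summarize`, and `evaluate` are all assumptions made for illustration; the real trace schema may differ.

```rust
use std::io::{BufRead, Cursor};

/// Per-run totals pulled from one JSONL trace (hypothetical helper type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct TraceSummary {
    events: u64, // non-empty lines, one event per line
    vdswap: u64, // lines that look like a VdSwap import call
}

/// Outcome of comparing two cold runs against the baseline VdSwap count.
#[derive(Debug, PartialEq, Eq)]
enum Gate {
    NonDeterministic, // the two runs disagree with each other
    Inert,            // deterministic, VdSwap did not rise
    Progressed,       // deterministic, VdSwap rose above baseline
}

/// Count events and VdSwap calls in a JSONL stream.
fn summarize<R: BufRead>(reader: R) -> std::io::Result<TraceSummary> {
    let mut summary = TraceSummary { events: 0, vdswap: 0 };
    for line in reader.lines() {
        let line = line?;
        if line.trim().is_empty() {
            continue;
        }
        summary.events += 1;
        // ASSUMPTION: substring matching stands in for real JSON parsing,
        // and the field layout is invented for this sketch.
        if line.contains("\"import.call\"") && line.contains("\"VdSwap\"") {
            summary.vdswap += 1;
        }
    }
    Ok(summary)
}

/// Apply the gates: runs must agree, then VdSwap must exceed the baseline.
fn evaluate(run1: TraceSummary, run2: TraceSummary, baseline_vdswap: u64) -> Gate {
    if run1 != run2 {
        Gate::NonDeterministic
    } else if run1.vdswap > baseline_vdswap {
        Gate::Progressed
    } else {
        Gate::Inert
    }
}

fn main() -> std::io::Result<()> {
    // In-memory stand-in for a trace file; a real run would wrap
    // std::fs::File in std::io::BufReader instead.
    let sample = [
        r#"{"kind":"import.call","name":"VdSwap"}"#,
        r#"{"kind":"wait.begin"}"#,
    ]
    .join("\n");
    let run1 = summarize(Cursor::new(sample.as_str()))?;
    let run2 = summarize(Cursor::new(sample.as_str()))?;
    println!("{:?} -> {:?}", run1, evaluate(run1, run2, 1));
    Ok(())
}
```

Comparing totals is deliberately coarse: it matches the gate definitions used in this report, but equal counts do not by themselves prove that two traces have identical contents.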
### tid=1 wait gap on Event 0x10e8 (SECONDARY): **no improvement**

- Event `0x000010e8` ↔ semantic SID `9ad1bebb6cae28c4` (handle.create at host_ns 819,544,956).
- tid=1 issues exactly **2** `wait.begin` on this SID, at host_ns ~6.660s, **128.595 µs** apart, then **blocks permanently** (no 3rd wait, never woken). This is the same two-wait-then-permanent-block pattern 2.AJ reported (~126.8 µs). The expected secondary effect ("wait gap may rise as more callbacks succeed") **did not occur**: the gate is downstream of C-2, so nothing changed.

### gpu.interrupt.delivered rate (SECONDARY): **N/A**

- The engine emits no `gpu.interrupt.delivered` event kind (the 11 kinds in the trace are: import.call, kernel.call/return, wait.begin, handle.create/destroy, wake.requested, signal.match, thread.create/exit, schema_version). `VdSetGraphicsInterruptCallback` is called 3× (the callback IS registered), consistent with 2.AJ's 76 ISR firings/100M. Not measurable from this trace; no regression.

---

## Why the fix is inert (C-2 mechanism)

The hardcode correctly removes the read asymmetry 2.AN identified: the guest VSync callback `sub_824BE9A0` @ PC `0x824BEA38-0x824BEA44` now always reads bit 0 = 1 and would take its frame-counter branch instead of the `beq loc_824BEAAC` skip. But the event count and exit state are **identical** to the bit-clear baseline, meaning the frame-counter branch produces no downstream observable signal either way.

Per 2.AN's C-2: the real signaller is the dynamically-installed `opt_callback` stored at `user_data+15144` (tail-called by `sub_824BE9A0` → `sub_824BEA80`). In the 65.7M-event run that `opt_callback` is **never installed** (its setter `sub_824C1920`, reached only via `sub_822F1F20 ← sub_822F1EE0 ← dispatch-table slot 0x822F1AFC`, requires a deeper game-state event that does not fire). So even with the VBLANK gate forced open, there is no installed callback to write `SignalState=1` on Event 0x10e8, and tid=1 stays wedged.
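The mechanism described above can be illustrated with a toy model. This is a sketch under stated assumptions, not the engine or guest code: `GuestState`, `vsync_callback`, `signal_event`, `make_read_handler`, and `run` are hypothetical names. The ordering (bit-0 gate first, then the frame counter, then the optional callback) is a simplification of the control flow reported in 2.AN. Only the register number `0x1951` and the "return 1 on read" behavior come from this report.

```rust
use std::sync::atomic::{AtomicU32, Ordering};
use std::sync::Arc;

const D1MODE_VBLANK_VLINE_STATUS: u32 = 0x1951;

/// Toy MMIO read handler. With `hardcode` set it mirrors the patched arm
/// (always 1); otherwise it loads the shared status word.
fn make_read_handler(status: Arc<AtomicU32>, hardcode: bool) -> impl Fn(u32) -> u32 {
    move |reg| match reg {
        D1MODE_VBLANK_VLINE_STATUS if hardcode => 1,
        D1MODE_VBLANK_VLINE_STATUS => status.load(Ordering::Relaxed),
        _ => 0,
    }
}

/// Hypothetical stand-in for the guest-side state touched by the callback.
struct GuestState {
    frame_counter: u32,
    opt_callback: Option<fn(&mut bool)>, // models the slot at user_data+15144
    event_signaled: bool,                // models SignalState on Event 0x10e8
}

/// Models the installed callback writing SignalState=1.
fn signal_event(signaled: &mut bool) {
    *signaled = true;
}

/// ASSUMED control flow: skip everything unless bit 0 reads as set, then
/// bump the frame counter and call the optional callback if one is installed.
fn vsync_callback(read: &impl Fn(u32) -> u32, guest: &mut GuestState) {
    if (read(D1MODE_VBLANK_VLINE_STATUS) & 1) == 0 {
        return; // the `beq` skip path
    }
    guest.frame_counter += 1;
    if let Some(callback) = guest.opt_callback {
        callback(&mut guest.event_signaled);
    }
}

/// Run `ticks` vsync interrupts with the status word held at 0 (bit clear).
fn run(hardcode: bool, install_callback: bool, ticks: u32) -> GuestState {
    let status = Arc::new(AtomicU32::new(0));
    let read = make_read_handler(status, hardcode);
    let opt_callback = if install_callback {
        Some(signal_event as fn(&mut bool))
    } else {
        None
    };
    let mut guest = GuestState { frame_counter: 0, opt_callback, event_signaled: false };
    for _ in 0..ticks {
        vsync_callback(&read, &mut guest);
    }
    guest
}

fn main() {
    for (hardcode, installed) in [(false, false), (true, false), (true, true)] {
        let guest = run(hardcode, installed, 3);
        println!(
            "hardcode={hardcode} installed={installed} frames={} signaled={}",
            guest.frame_counter, guest.event_signaled
        );
    }
}
```

In this model the hardcode changes only guest-internal state (the frame counter) while the callback slot is empty, and the event is signaled only once the callback is installed. That is the shape of the C-2 argument: opening the gate is necessary for the signal but not sufficient.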
C-1 was a real divergence-vs-canary but **not on the critical path**; C-2 gates it. This is consistent with the 5-iterate methodology lesson logged in 2.AN (variant #44): the "missing signal" is three layers below "what does the wait depend on", and C-1, one layer up, was correctly fixed but is inert because layer-3 (opt_callback install) never happens.

---

## Confidence + next-iterate recommendation

**Confidence: HIGH** that C-1 is inert and C-2 is the prime suspect. Evidence is decisive (bit-identical event count + byte-identical exit state + unchanged VdSwap across two deterministic cold runs). The fix is a correct canary-parity hardening (keep it; it eliminates a latent race) but not a cascade win.

**Disposition of this patch:** KEEP uncommitted as dormant correctness/parity infra (like the 2.AJ reciprocal-shadow patch). It costs nothing, matches canary exactly, and closes a real (if currently unreachable) race window.

**Next iterate: make C-2 the explicit target.** Recommended (in priority order, mirroring 2.AN's Angle B/C):

1. **2.AP: opt_callback install/clear probe (~5-15 LOC tooling, 0 engine).** `--lr-trace 0x824C1920` (setter `sub_824C1920`) over a 500M run to confirm install count == 0 and identify the nearest reached frame on the `0x822F1AFC` dispatch chain. This is the single highest-value next step: it pins down *which* upstream game-state event must fire.
2. **2.AQ: dispatch-chain reachability walk (~10-30 LOC tooling).** `--lr-trace 0x822F1EE0` / `0x822F1F20` to find where the `0x822F1AFC` dispatch slot stalls, i.e. the deeper game-state predicate that never evaluates true. Three layers up from the wait, this is the actual wedge root.
3. (Deprioritized) The bilateral tid=12 DPC wedge (Event 0x1004, 2.AM) and tid=11 XAudio wedge (2.AL) remain independent and should follow C-2 resolution, not precede it.
Do **not** chase any further "force the signal" / "force the install" crowbars before 2.AP/2.AQ identify the gating game-state event; that has been the #44 reading-error trap five iterates running.