Files
xenia-rs/crates/xenia-app/tests/golden
MechaCat02 66bd805726 [iterate-2V] VdSwap: stop bumping primary CP_RB_WPTR out-of-band (canary-faithful)
Ours' `vd_swap` wrote its 64-dword XE_SWAP block at the guest's reserved
`buffer_ptr` slot AND then bumped the primary ring `CP_RB_WPTR` out-of-band
via `state.gpu.extend_write_ptr_by(64)`. That bump was a bug: `buffer_ptr`
(~0x4add6efc) is NOT inside the primary ring (base ~0x4adcd000, 8192 dwords)
— it lives ~10k dwords past it, in the renderer indirect-buffer region. The
bogus WPTR bump pushed the GPU read-pointer PAST the guest's real
write-pointer; the drain treated the overshoot as a circular wrap and
re-executed the splash's draw indirect-buffers ~2×, inflating draws to 78
(the real splash geometry is ~28 draws; 12 INDIRECT_BUFFERs vs the real 6).

Canary's `VdSwap_entry` (xenia-canary xboxkrnl_video.cc:518-548) writes the
fetch-constant patch + PM4_XE_SWAP + NOP pad into the reserved slot and
returns — it NEVER touches CP_RB_WPTR. The guest advances the primary ring
write-pointer itself via its own doorbell once it has populated the slot;
swap-complete CP interrupts come only from the game's in-stream PM4_INTERRUPT
packets, never from VdSwap.

This fix removes only the out-of-band `extend_write_ptr_by(64)` call, keeping
the buffer_ptr block write intact and byte-faithful to canary. Effect at
`--gpu-inline -n 50M`: draws 78→28, INDIRECT_BUFFER 12→6 (re-execution
artifact gone), swaps 4→2. The run now halts at ~19.27M instructions (worker
threads exit) instead of spinning to 50M, because removing the corruption
unmasks the real per-present-interrupt deadlock — the title loop needs a
per-present PM4_INTERRUPT that the stalled game never submits. That deadlock
is a SEPARATE, known gate tracked/addressed elsewhere; it is intentionally
NOT papered over here.

Re-baselined golden crates/xenia-app/tests/golden/sylpheed_n50m.json to the
new honest values (regenerated twice, byte-identical). sylpheed_n2m.json is
unaffected (draws=0 at 2M). cargo test --workspace: 675 passed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-14 19:58:05 +02:00
..

Sylpheed regression goldens

These JSON files anchor xenia-rs check digest output for Project Sylpheed.

Files

File -n Mode Captures
sylpheed_n2m.json 2_000_000 full digest early boot (swaps=0, no rendering)
sylpheed_n50m.json 50_000_000 stable-digest first VdSwap pair (swaps=2 post-Phase-A)

Stable-digest mode

sylpheed_n50m.json is captured with --stable-digest, which omits timing-sensitive counters: packets (±28% lockstep noise from a GPU thread race), resolves, interrupts_delivered, interrupts_dropped, texture_decodes. The remaining fields are byte-identical across repeated lockstep runs at a fixed -n.

sylpheed_n2m.json predates the stable-digest flag and uses full-digest compare. It still works because at -n 2M the GPU pipeline has not produced any packets yet — packets=0 is trivially deterministic.

Circularity hazard

Per ORACBUG-001/002/003, these goldens were captured by running the same code they validate. They detect regression from a known-good snapshot, not correctness. When a planned fix intentionally moves the digest (e.g. a shader fix landing draws > 0 for the first time), re-baseline the golden as a separate commit and reference the audit ID in the message.

Re-baselining

cargo build --release -p xenia-app
target/release/xenia-rs check \
    "$SYLPHEED_ISO" \
    -n 50000000 \
    --stable-digest \
    --out crates/xenia-app/tests/golden/sylpheed_n50m.json

Running the goldens

cargo test --release -p xenia-app --test sylpheed_oracles -- --ignored --nocapture

The tests are #[ignore]-gated because each run takes a few seconds, which is unacceptable in the default cargo test cycle. The ISO path defaults to the contributor's local ~/RE Project Sylpheed/Project Sylpheed*.iso and can be overridden via SYLPHEED_ISO=/path/to/sylpheed.iso.

n4b canonical-invocation regression anchor (deferred)

The audit's recommended next sprint also called for a sylpheed_n4b.json golden capturing the canonical reference invocation xenia-rs check sylpheed.iso -n 4_000_000_000 --parallel --reservations-table. This is deferred because:

  1. The --parallel --reservations-table combination is empirically pathologically slow at -n 100M (>32 min per run per the audit memory). At -n 4B the run cost is many hours, not the single-session-friendly 515 min the original plan estimated.
  2. Each phase that intentionally moves rendering counters (C, D, E, F) would need a re-baseline of n4b — a significant time cost compounding over the sprint.

Once the renderer-unblock phases (C+D+E) land and draws > 0 is confirmed at -n 100M lockstep, an n4b artifact may be captured one-shot and stored under audit-runs/post-fix/ (not as a test golden) as a manual regression anchor for the canonical invocation.