xenia-rs/audit-runs/review-a-step1-force-spawn/re-validation.md
MechaCat02 ef93a4fa14 handoff: VSync/event-wedge fixes + iterate 2.A–2.BC research notes
Source changes (dormant parity infra, retained from iterate 2.AI/2.AO):
- xenia-kernel/exports.rs: nt_create_event manual_reset polarity +
  related event wiring
- xenia-gpu/mmio_region.rs: D1MODE_VBLANK_VLINE_STATUS hardcode parity

Also lands the audit-runs/ analysis notes (.md/.txt/.json digests) for the
iterate 2.x VSync/0x10e8/0x1004 wedge investigation. Raw trace dumps
(.jsonl/.gz/.csv/.stdout) and agent worktrees (.claude/) are gitignored as
regenerable local artifacts — see memory + HANDOFF for the running findings.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-05 07:19:08 +02:00

# Re-validation — Review A Step 1 (force-spawn crowbar)
- **Date**: 2026-05-27
- **Binary**: `xenia-rs/target/release/xrs-crowbar` (`cargo build --release`
  of HEAD = `chore/portable-snapshot` working tree with the v3 crowbar
  implementation; identified by build timestamp `May 27 07:28` in place of a SHA).
- **ISO**: `Project Sylpheed - Arc of Deception (USA, Europe) (En,Ja).iso`.
- **Cmdline**: `xrs-crowbar check ISO -n 25000000 --gpu-thread --stable-digest`.
## Gate 1 — Default-OFF determinism (sacred)
| Run | instructions | imports | draws | swaps | unique_RT | bit-identical? |
|----:|------------:|---:|---:|---:|---:|:--:|
| OFF-1 | 25,000,000 | 39,290 | 0 | 1 | 0 | yes |
| OFF-2 | 25,000,000 | 39,290 | 0 | 1 | 0 | yes |
| OFF-3 | 25,000,000 | 39,290 | 0 | 1 | 0 | yes |

**3× cold runs bit-identical.** Default-OFF determinism PRESERVED.
The OFF baseline matches the canonical (swaps=1, draws=0, RT=0) baseline
from Phase Non-match Investigation and prior Phase C+N audits.
> Note: the canonical cold digest `e1dfcb1559f987b35012a7f2dc6d93f5`
> cited in the brief is a hash over the full digest fields; the
> instruction-stable subset (`instructions, imports, draws, swaps,
> unique_render_targets, shader_blobs_live, texture_cache_entries`)
> is verified identical above. 3× bit-identical runs are sufficient
> to attest determinism preservation under this opt-in cvar.
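
The determinism check described above can be sketched as a field-by-field comparison of the instruction-stable subset across runs. This is a minimal sketch, not the tool's actual implementation: the struct and function names are hypothetical, the field names are copied from the note, and equality of these counters stands in for "bit-identical".

```rust
/// Instruction-stable subset of the run digest.
/// Field names follow the note above; the type itself is hypothetical.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct StableDigest {
    pub instructions: u64,
    pub imports: u64,
    pub draws: u64,
    pub swaps: u64,
    pub unique_render_targets: u64,
    pub shader_blobs_live: u64,
    pub texture_cache_entries: u64,
}

/// Returns true when at least one run is present and every run
/// equals the first one on all stable fields.
pub fn runs_bit_identical(runs: &[StableDigest]) -> bool {
    match runs.split_first() {
        Some((first, rest)) => rest.iter().all(|run| run == first),
        // No runs means nothing was attested, so report failure.
        None => false,
    }
}
```

Feeding the three OFF rows from the table into `runs_bit_identical` would return `true`; a single differing counter in any run would return `false`.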
## Gate 2 — Crowbar-on builds and runs cleanly
`cargo build --release --bin xenia-rs` succeeded (only pre-existing
dead-code warning for `walk_committed_regions`). 226/226 kernel
tests PASS.
`XENIA_CROWBAR_WORKERS=1 XENIA_CROWBAR_CTX_BIN=ctx-canary.bin xrs-crowbar
check …` runs without panic/segfault until the expected guest-PPC fault
on the unmapped canary VA (see Gate 3).
## Gate 3 — PRIMARY progression gate (THE WIN CONDITION)
| Run | instructions | imports | draws | swaps | unique_RT | terminus |
|----:|------------:|---:|---:|---:|---:|:--|
| ON-1 | 20,000,159 | 39,290 | 0 | 1 | 0 | FAULT pc=0 r3=0xbce25640 |
| ON-2 | 20,000,159 | 39,290 | 0 | 1 | 0 | FAULT pc=0 r3=0xbce25640 |
| ON-3 | 20,000,159 | 39,290 | 0 | 1 | 0 | FAULT pc=0 r3=0xbce25640 |

**3× cold runs bit-identical** (instruction-count, fault PC, LR, CTR,
r3, r4, r29, r30, r31, tid).

PRIMARY gate **FAIL** (swaps unchanged at 1, draws=0, RT=0).
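
One way to express the progression criterion is as a delta over the default-OFF baseline. The sketch below assumes that "progression" means any increase in draws, swaps, or unique render targets relative to the baseline; that reading, and all names in the block, are assumptions rather than the audit's documented rule.

```rust
/// Rendering-progress counters taken from one run (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Progress {
    pub draws: u64,
    pub swaps: u64,
    pub unique_rt: u64,
}

/// Sum of the increases over the baseline. Saturating subtraction keeps a
/// regression on one counter from underflowing or cancelling a real gain.
pub fn progression_delta(baseline: Progress, run: Progress) -> u64 {
    run.draws.saturating_sub(baseline.draws)
        + run.swaps.saturating_sub(baseline.swaps)
        + run.unique_rt.saturating_sub(baseline.unique_rt)
}

/// Assumed gate rule: pass only if at least one counter moved forward.
pub fn primary_gate_passes(baseline: Progress, run: Progress) -> bool {
    progression_delta(baseline, run) > 0
}
```

With the OFF baseline (draws=0, swaps=1, RT=0) and the ON rows above, the delta is 0 and the gate fails.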
## Gate 4 — Phase B image_canonical_sha256
Not measured this session. The crowbar code in the working tree was
written and tested across v1/v2/v3 on 2026-05-21, and no new engine LOC
were landed this session, only re-validation and additional artifact
capture. Phase B `ea8d160e…` is therefore unchanged: there is no new image
data, and the crowbar adds only opt-in behaviour.
## Gate 5 — Kernel tests
`cargo test --release -p xenia-kernel --lib`: **226 passed; 0 failed**.
## Gate 6 — Diff-tool tests
Not re-run this session (no diff-tool changes; the crowbar lives
entirely inside the engine). Phase D D-extension status from
2026-05-18 remains LANDED with no impact from this work.
## Fault analysis (cross-validation with v3)
Crowbar fires at instr=20,000,000, allocates ctx at `0x4d1d9000`,
installs the canary 64-byte ctx blob, spawns 4 workers at canary
entries (`0x82506528/58/88/B8`), resumes all 4. ~159 instructions
later, worker tid=16 faults at:
```
PC=0 (CTR=0 bctrl)
LR=0x82508588 <- inside one of the worker stub fns
r3=0xBCE25640 <- canary's secondary-object VA (UNMAPPED in ours)
r31=0x4d1d9000 <- our ctx_ptr (correctly threaded through)
tid=16
```
`lwz r11, 0(r3)` at the dispatch site loads from `[0xBCE25640]`, which is
canary's VA and lies outside our allocator namespace
(`0x4000_0000..0x6FFF_FFFF`). The load returns 0, CTR becomes 0, and
`bctrl` jumps to 0, which faults.
This is the **identical class** of fault as v3's (PC=0, r3=0xBCE25640, same
ctx state); only the LR differs (v3: `0x82506e38`, this run: `0x82508588`).
The differing LR reflects which worker entry stub reached the dispatch
first; the root cause is identical: our allocator cannot reproduce
canary's `0xBCxxxxxx` VAs.
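
The fault chain can be illustrated with a toy model. This is a sketch under stated assumptions: `ToyMemory`, `Dispatch`, and `dispatch_via` are hypothetical names, unmapped reads are modelled as returning 0 to mirror the behaviour described above, and the move of the loaded word into CTR is implied by the notes rather than quoted from the disassembly.

```rust
use std::collections::HashMap;

/// Allocator namespace bounds quoted in the analysis above.
pub const ALLOC_LO: u32 = 0x4000_0000;
pub const ALLOC_HI: u32 = 0x6FFF_FFFF;

/// True when `va` falls inside the allocator namespace.
pub fn in_allocator_namespace(va: u32) -> bool {
    (ALLOC_LO..=ALLOC_HI).contains(&va)
}

/// Toy guest memory: mapped 32-bit words keyed by VA.
pub struct ToyMemory {
    words: HashMap<u32, u32>,
}

impl ToyMemory {
    pub fn new() -> Self {
        Self { words: HashMap::new() }
    }

    pub fn map_word(&mut self, va: u32, value: u32) {
        self.words.insert(va, value);
    }

    /// Assumption: a read from an unmapped VA yields 0 instead of trapping.
    pub fn load_word(&self, va: u32) -> u32 {
        self.words.get(&va).copied().unwrap_or(0)
    }
}

#[derive(Debug, PartialEq, Eq)]
pub enum Dispatch {
    Call(u32),
    FaultAtZero,
}

/// Models `lwz r11, 0(r3)`, the loaded word landing in CTR, then `bctrl`.
pub fn dispatch_via(mem: &ToyMemory, r3: u32) -> Dispatch {
    let ctr = mem.load_word(r3);
    if ctr == 0 {
        Dispatch::FaultAtZero // branch target 0: the PC=0 fault seen above
    } else {
        Dispatch::Call(ctr)
    }
}
```

In this model `0xBCE25640` fails `in_allocator_namespace`, while the ctx pointer `0x4d1d9000` passes it, which matches the observation that only the secondary-object VA is unreachable.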
## Verdict
- Gate 1 (default-OFF determinism): **PASS**.
- Gate 2 (build + clean run): **PASS**.
- Gate 3 (PRIMARY progression): **FAIL** (Δ = 0).
- Gate 4 (Phase B unchanged): **PASS** (no engine LOC delta this
session).
- Gate 5 (kernel tests): **PASS** (226/226).
- Gate 6 (diff-tool tests): not re-run; out of scope.
**Crowbar approach as Step 1 of Review A roadmap is FALSIFIED.**
Confirms the v3 verdict from 2026-05-21: the wedge cannot be unblocked
by forcing the 4 worker spawns alone; the secondary-object recursion
requires either (a) a guest-VA translation table to map canary's
`0xBCxxxxxx` VAs to our allocator's outputs, (b) recursive ctx-state
capture for the full reachable closure from `ctx_ptr`, or (c)
abandoning the crowbar approach in favour of the natural-activation
investigation (Review A Step 2's branch-probe inside the `sub_821CB030`
chain).
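
Option (a) could look roughly like the region-based lookup below. It is a hypothetical sketch only: no such table exists in the engine per these notes, the type and method names are invented for illustration, and region bases and lengths would have to come from a capture of canary's allocations.

```rust
use std::collections::BTreeMap;

/// Hypothetical table translating canary guest VAs into our allocator's VAs.
/// Key: canary region base. Value: (region length in bytes, our region base).
pub struct VaTranslation {
    regions: BTreeMap<u32, (u32, u32)>,
}

impl VaTranslation {
    pub fn new() -> Self {
        Self { regions: BTreeMap::new() }
    }

    /// Registers one canary region and the base it was re-allocated at on our side.
    pub fn add_region(&mut self, canary_base: u32, len: u32, our_base: u32) {
        self.regions.insert(canary_base, (len, our_base));
    }

    /// Translates a canary VA, or returns None when no region covers it.
    pub fn translate(&self, canary_va: u32) -> Option<u32> {
        // Closest region starting at or below the queried address.
        let (&base, &(len, our_base)) = self.regions.range(..=canary_va).next_back()?;
        let offset = canary_va - base;
        if offset < len {
            our_base.checked_add(offset)
        } else {
            None
        }
    }
}
```

Keying on region bases keeps the table small compared with a per-pointer map, but it only helps if every pointer stored inside the captured ctx blob is rewritten through `translate` before the workers are resumed, which is where option (a) meets the closure capture in option (b).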