# Phase B: Validation record

All gates executed on 2026-05-13 against the patched canary (`build-cross/bin/Windows/Debug/xenia_canary.exe` + renamed `xenia_canary_phaseB.exe`) and ours (`target/release/xenia-rs` + renamed `target/release/xenia-rs-phaseB`).

## Gate 1: cvar-OFF determinism

### ours

- Pre-patch digest: `audit-runs/phase-a-diff-harness/digest-post-patch-cvaroff.json` (Phase A baseline; Phase A's gate-1 already proved byte-identity to the genuine pre-patch).
- Post-Phase-B digest: `audit-runs/phase-b-state-equivalence/digest-post-phaseB-cvaroff.json`.
- Both runs: `check --stable-digest -n 50000000` against the same ISO.
- `diff` of the two files produces zero output. Byte-identical. **PASS.**

### canary

- Phase B adds three new CONFIG DUMP lines (`phase_b_snapshot_dir = ""`, `phase_b_snapshot_and_exit = false`, `phase_b_dump_section_content = false`). All other lines either match Phase A's accepted host-pointer/timing jitter or are unchanged.
- Smoke marker (`AUDIT-DEMO-SETUP-BEGIN`) still fires.
- **PASS** by the Phase A gate-1 method.

## Gate 2: Snapshot files well-formed

### ours

```
$ ls audit-runs/phase-b-state-equivalence/snap-001/ours/
config.json cpu_state.json kernel.json manifest.json memory.json vfs.json
```

All six files parse as JSON, lead with `"schema_version": 1` (or contain it in manifest), and have alphabetically sorted keys (verified by re-serializing; `serde_json::Map` keeps keys sorted by default). **PASS.**

### canary

```
$ ls audit-runs/phase-b-state-equivalence/snap-001/canary/
config.json cpu_state.json kernel.json manifest.json memory.json vfs.json
```

Same six files, same shape. Note: canary's `phase_b_snapshot.cc` writes JSON via direct `fmt::format` rather than a JSON map, so keys are emitted in **insertion order, not alphabetical order**.
The diff tool parses each file to a dict before comparing, so this asymmetry has no functional impact (verified empirically: `diff_state.py` produces identical reports across multiple runs of either engine). It does mean the canary↔ours manifest hashes differ even when the underlying state is semantically identical; the diff tool falls back to full content comparison in that case. **PASS** with this caveat documented.

## Gate 3: Hash-deterministic re-runs (ours)

Two runs of ours with identical args:

```
$ ./target/release/xenia-rs-phaseB exec --quiet \
    --phase-b-snapshot-dir --phase-b-snapshot-and-exit   # run 1
$ mv /ours /ours-a
$ ./target/release/xenia-rs-phaseB exec --quiet \
    --phase-b-snapshot-dir --phase-b-snapshot-and-exit   # run 2
$ diff -r /ours /ours-a && echo BYTE-IDENTICAL
BYTE-IDENTICAL
```

**PASS.** Re-running ours with the same args produces hash-identical snapshot files.

> The first re-run attempt produced a `config.json` mismatch because the
> two runs were given different `--phase-b-snapshot-dir` values (whose
> path string is embedded in `config.json::cvars.phase_b_snapshot_dir`).
> That field is in the diff tool's `SKIP_BY_FILE["config.json"]` skip
> set; the hash difference confirmed the skip rule is well-placed. With
> identical inputs the snapshots are byte-equal.

## Gate 4: Invariants (HARD GATE)

From `report.md`:

| invariant | canary | ours | ok? |
|---|---|---|---|
| xex_entry_point | `0x824ab748` | `0x824ab748` | **PASS** |
| cpu_state.pc == xex_entry_point | `0x824ab748 == 0x824ab748` (canary) | `0x824ab748 == 0x824ab748` (ours) | **PASS** |
| image_loaded_sha256 | `a70993b7…` | `ea8d160e…` | **FAIL → STOP** |

The PC + entry-point invariants prove the snapshot point is **equivalent across engines**: both fired immediately before the first instruction at the same address. This is the principal Phase B equivalence claim.

The `image_loaded_sha256` mismatch is the **expected STOP condition** per the spec.
Phase B's contract is to detect and report this; investigation belongs to Phase C/D. `report.md` flags it explicitly and includes re-run guidance.

## Gate 5: Diff-tool negative test

```
$ cp audit-runs/phase-b-state-equivalence/snap-001/ours/kernel.json /tmp/kernel-mut.json
$ sed -i 's/"thread_id": 1/"thread_id": 999/' /tmp/kernel-mut.json
$ mkdir -p /tmp/ours-mut && cp -r audit-runs/phase-b-state-equivalence/snap-001/ours/* /tmp/ours-mut/
$ cp /tmp/kernel-mut.json /tmp/ours-mut/kernel.json
$ python3 tools/diff-state/diff_state.py \
    --canary audit-runs/phase-b-state-equivalence/snap-001/ours \
    --ours /tmp/ours-mut --out /tmp/r.md
$ echo $?
1
```

`report.md` names two divergences:

- `kernel.json ` `manifest-hash-mismatch`: surfaces that `/tmp/ours-mut/kernel.json`'s SHA does not match what `/tmp/ours-mut/manifest.json` claims.
- `kernel.json objects[handle_semantic_id=…].details.thread_id` value=`canary=1, ours=999`: the actual mutation.

**PASS.**

> Verified 2026-05-13 (Phase A/B verify session). Before the fix, the diff tool
> trusted the manifest-claimed hashes without verifying them; a tampered
> file with an intact manifest copy would silently report "identical"
> (exit 0). The fix in [`diff_state.py`](../../tools/diff-state/diff_state.py)
> (around `diff_directory`) re-hashes each file, surfaces a
> `manifest-hash-mismatch` σ-structural divergence when the on-disk SHA
> does not match the manifest, and falls through to a full content diff.

## Summary

| Gate | Status |
|---|---|
| 1. Cvar-OFF determinism (both engines) | PASS |
| 2. Snapshots well-formed (both engines) | PASS |
| 3. Hash-deterministic re-runs (ours) | PASS |
| 4. Invariants: pc == entry_point | PASS |
| 4. Invariants: image_loaded_sha256 | **FAIL → STOP** (expected: this is what Phase B catalogs) |
| 5. Diff-tool negative test | PASS |

## Cascade prediction at session close

- A (snapshot tool emits readable state on both engines): **achieved**.
- B (section content hashes match): **NOT achieved**. `image_loaded_sha256` differs: the XEX is loaded into different post-decompression states between the two engines. This is the primary finding that Phase C will investigate, *not* a Phase B failure.
- C (divergence catalog produced with classification): **achieved**. 58 divergences across all 5 files, fully classified.
- D (fix lands): **N/A (out of scope for Phase B)**.

## Notes on minor implementation choices

- Canary's `PPCContext` doesn't expose a `pc` field (the JIT dispatch loop manages PC). At the snapshot point the about-to-execute PC equals the `address` arg to `processor()->Execute(...)`, which the hook receives as `entry_address`; we emit that value as `cpu_state.pc`.
- Memory snapshots emit a **fixed named-region list** (XEX image, main stack, PCR, TLS) rather than walking the full page table. An earlier blanket-walk approach crashed in Wine because canary's `QueryRegionInfo` reports `COMMIT` for some pages whose host-side backing is reserved-not-committed (physical heap mirrors, low system heap). The named-region list is sufficient for the diff tool's cross-engine comparison.
- The `xex_header_sha256` field uses a different format in each engine (canary emits a 64-bit `UserModule::hash()`; ours emits a placeholder zero string). This is a known one-line shim that Phase B intentionally leaves as a divergence to demonstrate the diff tool's δ-content class.
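
The two diff-tool behaviours recorded above (Gate 2's key-order-insensitive comparison and Gate 5's manifest re-hash) can be sketched roughly as follows. This is a minimal illustration, not the code in `diff_state.py`: the function names and the manifest layout (a `files` object mapping file name to SHA-256 hex digest) are assumptions made for the example.

```python
import hashlib
import json
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hex SHA-256 of the file's raw bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def manifest_mismatches(snapshot_dir: Path) -> list[str]:
    """Names of files whose on-disk hash differs from the manifest's claim.

    Assumes (hypothetically) that manifest.json looks like
    {"files": {"kernel.json": "<sha256 hex>", ...}}.
    """
    manifest = json.loads((snapshot_dir / "manifest.json").read_text())
    return sorted(
        name
        for name, claimed in manifest["files"].items()
        if sha256_of(snapshot_dir / name) != claimed
    )


def semantically_equal(a: Path, b: Path) -> bool:
    """Compare parsed JSON, so key order in the serialized text is irrelevant."""
    return json.loads(a.read_text()) == json.loads(b.read_text())
```

With a check of this shape, a tampered `kernel.json` sitting next to an untouched manifest is reported rather than trusted, which is the case Gate 5 exercises; and two files that differ only in key order (sorted keys in ours, insertion order in canary) still compare equal once parsed.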