# Phase B — Validation record
All gates executed on 2026-05-13 against the patched canary
(`build-cross/bin/Windows/Debug/xenia_canary.exe` + renamed
`xenia_canary_phaseB.exe`) and ours (`target/release/xenia-rs` + renamed
`target/release/xenia-rs-phaseB`).
## Gate 1: cvar-OFF determinism
### ours
- Pre-patch digest: `audit-runs/phase-a-diff-harness/digest-post-patch-cvaroff.json` (Phase A baseline; Phase A's gate-1 already proved byte-identity to the genuine pre-patch).
- Post-Phase-B digest: `audit-runs/phase-b-state-equivalence/digest-post-phaseB-cvaroff.json`.
- Both runs: `check --stable-digest -n 50000000` against the same ISO.
- `diff` of the two files produces zero output. Byte-identical. **PASS.**
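The byte-identity check above was done with `diff`. As a rough equivalent, the sketch below compares two digest files by size and SHA-256. The helper names `sha256_of` and `byte_identical` are illustrative and are not part of the audit tooling.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in 1 MiB chunks so large digests need not fit in memory."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def byte_identical(a: Path, b: Path) -> bool:
    """True when both files have the same size and the same SHA-256."""
    return a.stat().st_size == b.stat().st_size and sha256_of(a) == sha256_of(b)
```

Calling `byte_identical` on the Phase A and Phase B digest paths listed above should return `True` whenever `diff` prints nothing.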
### canary
- Phase B adds three new CONFIG DUMP lines (`phase_b_snapshot_dir = ""`, `phase_b_snapshot_and_exit = false`, `phase_b_dump_section_content = false`). All other lines either match Phase A's accepted host-pointer/timing jitter or are unchanged.
- Smoke marker (`AUDIT-DEMO-SETUP-BEGIN`) still fires.
- **PASS** by the Phase A gate-1 method.
## Gate 2: Snapshot files well-formed
### ours
```
$ ls audit-runs/phase-b-state-equivalence/snap-001/ours/
config.json cpu_state.json kernel.json manifest.json memory.json vfs.json
```
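All six files parse as JSON, lead with `"schema_version": 1` (or contain it in the manifest), and have their keys in sorted order. The ordering was verified by re-serializing; `serde_json::Map` is backed by a `BTreeMap` unless the `preserve_order` feature is enabled, so keys serialize sorted by default. **PASS.**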
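The record checked key ordering by re-serializing. A different, formatting-independent way to check the same property is sketched below: it inspects every JSON object as it is parsed and reports whether its keys appear in sorted order. The function name `keys_sorted` is an assumption for illustration only.

```python
import json


def keys_sorted(text: str) -> bool:
    """Return True if every JSON object in `text` lists its keys in sorted order."""
    ok = True

    def check(pairs):
        # `pairs` holds one object's (key, value) items in document order.
        nonlocal ok
        keys = [k for k, _ in pairs]
        if keys != sorted(keys):
            ok = False
        return dict(pairs)

    # object_pairs_hook is invoked for every object, including nested ones.
    json.loads(text, object_pairs_hook=check)
    return ok
```

Applied to the text of each of the six files in `snap-001/ours/`, this should return `True` if the sort-order claim holds.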
### canary
```
$ ls audit-runs/phase-b-state-equivalence/snap-001/canary/
config.json cpu_state.json kernel.json manifest.json memory.json vfs.json
```
Same six files, same shape. Note: canary's `phase_b_snapshot.cc` writes JSON via direct `fmt::format` rather than a JSON map, so keys are emitted in **insertion order, not alphabetical order**. The diff tool parses to dict before comparing, so this asymmetry has no functional impact (verified empirically — `diff_state.py` produces identical reports across multiple runs of either engine). It does mean the canary↔ours manifest hashes differ even when the underlying state is semantically identical; the diff tool falls back to full content comparison in that case. **PASS** with this caveat documented.
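The asymmetry described above can be reproduced in a few lines. In this sketch the two JSON strings are made-up stand-ins for a canary file (insertion order) and an ours file (sorted keys); the field names are illustrative, not the real snapshot schema.

```python
import hashlib
import json


def compare(a_text: str, b_text: str):
    """Return (hashes_match, content_matches) for two JSON documents."""
    a_hash = hashlib.sha256(a_text.encode("utf-8")).hexdigest()
    b_hash = hashlib.sha256(b_text.encode("utf-8")).hexdigest()
    # Parsing to dicts discards key order, so only the content is compared.
    return a_hash == b_hash, json.loads(a_text) == json.loads(b_text)


canary_text = '{"schema_version": 1, "pc": "0x824ab748"}'  # insertion order
ours_text = '{"pc": "0x824ab748", "schema_version": 1}'    # sorted keys

hashes_match, content_matches = compare(canary_text, ours_text)
print(hashes_match, content_matches)  # prints: False True
```

The hashes differ because the bytes differ, while the parsed dictionaries compare equal. This is why a manifest hash mismatch between engines cannot be treated as a divergence on its own.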
## Gate 3: Hash-deterministic re-runs (ours)
Two runs of ours with identical args:
```
$ ./target/release/xenia-rs-phaseB exec --quiet \
--phase-b-snapshot-dir --phase-b-snapshot-and-exit # run 1
$ mv /ours /ours-a
$ ./target/release/xenia-rs-phaseB exec --quiet \
--phase-b-snapshot-dir --phase-b-snapshot-and-exit # run 2
$ diff -r /ours /ours-a && echo BYTE-IDENTICAL
BYTE-IDENTICAL
```
**PASS.** Re-running ours with the same args produces hash-identical snapshot files.
> The first re-run attempt produced a `config.json` mismatch because the
> two runs were given different `--phase-b-snapshot-dir` values (whose
> path string is embedded in `config.json::cvars.phase_b_snapshot_dir`).
> That field is in the diff tool's `SKIP_BY_FILE["config.json"]` skip
> set; the hash difference confirmed the skip rule is well-placed. With
> identical inputs the snapshots are byte-equal.
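
To illustrate how a per-file skip set can absorb the `phase_b_snapshot_dir` difference noted above, here is a minimal sketch. The name `SKIP_BY_FILE` comes from the diff tool, but its shape here (file name mapped to a set of dotted key paths) and the helpers `strip_skipped` and `equal_after_skip` are assumptions, not the tool's actual implementation.

```python
import json

# Assumed shape: file name -> set of dotted key paths to ignore.
SKIP_BY_FILE = {"config.json": {"cvars.phase_b_snapshot_dir"}}


def strip_skipped(obj, skip, prefix=""):
    """Return a copy of `obj` without any key whose dotted path is in `skip`."""
    if isinstance(obj, dict):
        out = {}
        for key, value in obj.items():
            path = f"{prefix}.{key}" if prefix else key
            if path not in skip:
                out[key] = strip_skipped(value, skip, path)
        return out
    if isinstance(obj, list):
        # List elements share their parent's path in this simplified scheme.
        return [strip_skipped(item, skip, prefix) for item in obj]
    return obj


def equal_after_skip(name, a_text, b_text):
    """Compare two JSON documents for file `name`, ignoring skipped paths."""
    skip = SKIP_BY_FILE.get(name, set())
    return strip_skipped(json.loads(a_text), skip) == strip_skipped(json.loads(b_text), skip)
```

With a rule like this, two `config.json` files that differ only in the snapshot directory path compare equal, while the raw file hashes still differ.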
## Gate 4: Invariants (HARD GATE)
From `report.md`:
| invariant | canary | ours | ok? |
|---|---|---|---|
| xex_entry_point | `0x824ab748` | `0x824ab748` | **PASS** |
| cpu_state.pc == xex_entry_point | `0x824ab748 == 0x824ab748` (canary) | `0x824ab748 == 0x824ab748` (ours) | **PASS** |
| image_loaded_sha256 | `a70993b7…` | `ea8d160e…` | **FAIL → STOP** |
The PC and entry-point invariants prove the snapshot point is **equivalent across engines**: both snapshots fired immediately before the first instruction, at the same address. This is the principal Phase B equivalence claim.
The `image_loaded_sha256` mismatch is the **expected STOP condition** per the spec. Phase B's contract is to detect and report this; investigation belongs to Phase C/D. The report.md flags it explicitly with re-run guidance.
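The hard-gate logic in the table can be sketched as follows. The flat dictionaries and the function names `check_invariants` and `hard_gate` are assumptions for illustration; the real snapshot files may store these fields elsewhere, and the truncated hashes are copied as they appear in the table.

```python
def check_invariants(canary, ours):
    """Return (name, passed) rows in the order the report table lists them."""
    return [
        ("xex_entry_point",
         canary["xex_entry_point"] == ours["xex_entry_point"]),
        ("cpu_state.pc == xex_entry_point",
         canary["pc"] == canary["xex_entry_point"]
         and ours["pc"] == ours["xex_entry_point"]),
        ("image_loaded_sha256",
         canary["image_loaded_sha256"] == ours["image_loaded_sha256"]),
    ]


def hard_gate(rows):
    """False means STOP: at least one invariant failed."""
    return all(passed for _, passed in rows)


canary = {"xex_entry_point": "0x824ab748", "pc": "0x824ab748",
          "image_loaded_sha256": "a70993b7…"}
ours = {"xex_entry_point": "0x824ab748", "pc": "0x824ab748",
        "image_loaded_sha256": "ea8d160e…"}

rows = check_invariants(canary, ours)
print(rows, hard_gate(rows))
```

With the values from the table, the first two rows pass, the third fails, and `hard_gate` returns `False`, matching the STOP outcome recorded above.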
## Gate 5: Diff-tool negative test
```
$ cp audit-runs/phase-b-state-equivalence/snap-001/ours/kernel.json /tmp/kernel-mut.json
$ sed -i 's/"thread_id": 1/"thread_id": 999/' /tmp/kernel-mut.json
$ mkdir -p /tmp/ours-mut && cp -r audit-runs/phase-b-state-equivalence/snap-001/ours/* /tmp/ours-mut/
$ cp /tmp/kernel-mut.json /tmp/ours-mut/kernel.json
$ python3 tools/diff-state/diff_state.py \
--canary audit-runs/phase-b-state-equivalence/snap-001/ours \
--ours /tmp/ours-mut --out /tmp/r.md
$ echo $?
1
```
The report (`/tmp/r.md`) names two divergences:
- `kernel.json`, `manifest-hash-mismatch`: the SHA of `/tmp/ours-mut/kernel.json` does not match what `/tmp/ours-mut/manifest.json` claims.
- `kernel.json`, `objects[handle_semantic_id=…].details.thread_id`, value `canary=1, ours=999`: the actual mutation.
**PASS.**
> Verified 2026-05-13 (Phase A/B verify session). Pre-fix the diff tool
> trusted the manifest-claimed hashes without verifying them; a tampered
> file with an intact manifest copy would silently report "identical"
> (exit 0). The fix in [`diff_state.py`](../../tools/diff-state/diff_state.py)
> (around `diff_directory`) re-hashes each file, surfaces a
> `manifest-hash-mismatch` σ-structural divergence when the on-disk SHA
> does not match the manifest, and falls through to a full content diff.
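
The re-hashing step described in the note can be sketched as below. This is not the code in `diff_state.py`: the manifest layout (a top-level `"files"` object mapping file name to SHA-256 hex) and the function name `manifest_mismatches` are assumptions made for the example.

```python
import hashlib
import json
from pathlib import Path


def manifest_mismatches(snapshot_dir):
    """List files whose on-disk SHA-256 differs from the manifest's claim."""
    root = Path(snapshot_dir)
    manifest = json.loads((root / "manifest.json").read_text())
    bad = []
    # Assumed layout: {"files": {"kernel.json": "<sha256 hex>", ...}}
    for name, claimed in sorted(manifest["files"].items()):
        actual = hashlib.sha256((root / name).read_bytes()).hexdigest()
        if actual != claimed:
            bad.append(name)
    return bad
```

A tampered `kernel.json` copied next to an untouched manifest would show up in the returned list, which is the condition the negative test above exercises.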
## Summary
| Gate | Status |
|---|---|
| 1. Cvar-OFF determinism (both engines) | PASS |
| 2. Snapshots well-formed (both engines) | PASS |
| 3. Hash-deterministic re-runs (ours) | PASS |
| 4. Invariants — pc == entry_point | PASS |
| 4. Invariants — image_loaded_sha256 | **FAIL → STOP** (expected: this is what Phase B catalogs) |
| 5. Diff-tool negative test | PASS |
## Cascade prediction at session close
- A (snapshot tool emits readable state both engines): **achieved**.
- B (section content hashes match): **NOT achieved** — `image_loaded_sha256` differs. The XEX is loaded into different post-decompression states between the two engines. This is the primary finding that Phase C will investigate, *not* a Phase B failure.
- C (divergence catalog produced with classification): **achieved** — 58 divergences across all 5 files, fully classified.
- D (fix lands): **N/A — out of scope for Phase B**.
## Notes on minor implementation choices
- Canary's PPCContext doesn't expose a `pc` field (the JIT dispatch loop manages PC). At the snapshot point the about-to-execute PC equals the `address` arg to `processor()->Execute(...)`, which the hook receives as `entry_address`; we emit that value as `cpu_state.pc`.
- Memory snapshots emit a **fixed named-region list** (XEX image, main stack, PCR, TLS) rather than walking the full page table. An earlier blanket-walk approach crashed in Wine because canary's `QueryRegionInfo` reports `COMMIT` for some pages whose host-side backing is reserved-not-committed (physical heap mirrors, low system heap). The named-region list is sufficient for the diff tool's cross-engine comparison.
- The `xex_header_sha256` field uses different formats in each engine (canary emits a 64-bit `UserModule::hash()`; ours emits a placeholder zero string). Aligning the two is a known one-line shim, which Phase B intentionally leaves undone so that the divergence demonstrates the diff tool's δ-content class.
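
The fixed named-region approach from the memory-snapshot note can be sketched as follows. Everything here is hypothetical: the `read_guest(base, size)` callback, the record fields, and the addresses in the usage note are placeholders, not values taken from either engine.

```python
import hashlib


def snapshot_regions(read_guest, regions):
    """Hash each named region instead of walking the whole page table.

    `read_guest(base, size)` must return the region's bytes;
    `regions` is a list of (name, base, size) tuples.
    """
    records = []
    for name, base, size in regions:
        data = read_guest(base, size)
        records.append({
            "name": name,
            "base": f"0x{base:08x}",
            "size": size,
            "sha256": hashlib.sha256(data).hexdigest(),
        })
    return records
```

Because only listed regions are read, pages that are reported as committed but lack host-side backing are never touched. A caller would pass something like `[("main_stack", 0x100, 0x40), ("tls", 0x200, 0x20)]`, with placeholder addresses, along with a reader for guest memory.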