Phase B — Validation record
All gates executed on 2026-05-13 against the patched canary
(build-cross/bin/Windows/Debug/xenia_canary.exe + renamed
xenia_canary_phaseB.exe) and ours (target/release/xenia-rs + renamed
target/release/xenia-rs-phaseB).
Gate 1: cvar-OFF determinism
ours
- Pre-patch digest: `audit-runs/phase-a-diff-harness/digest-post-patch-cvaroff.json` (Phase A baseline; Phase A's gate-1 already proved byte-identity to the genuine pre-patch).
- Post-Phase-B digest: `audit-runs/phase-b-state-equivalence/digest-post-phaseB-cvaroff.json`.
- Both runs: `check --stable-digest -n 50000000` against the same ISO. `diff` of the two files produces zero output. Byte-identical. PASS.
canary
- Phase B adds three new CONFIG DUMP lines (`phase_b_snapshot_dir = ""`, `phase_b_snapshot_and_exit = false`, `phase_b_dump_section_content = false`). All other lines either match Phase A's accepted host-pointer/timing jitter or are unchanged.
- Smoke marker (`AUDIT-DEMO-SETUP-BEGIN`) still fires.
- PASS by the Phase A gate-1 method.
Gate 2: Snapshot files well-formed
ours
$ ls audit-runs/phase-b-state-equivalence/snap-001/ours/
config.json cpu_state.json kernel.json manifest.json memory.json vfs.json
All six files parse as JSON, lead with "schema_version": 1 (or contain it in manifest), and have alphabetically sorted keys (verified by re-serializing; serde_json::Map keeps keys sorted by default). PASS.
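The well-formedness check described above can be sketched in a few lines of Python. This is a minimal illustration, not the check actually used for this gate: the helper names `keys_sorted` and `check_snapshot_text` are hypothetical, and it assumes `schema_version` sits at the top level of each file.

```python
import json

def keys_sorted(node):
    """Return True if every object in the parsed JSON tree lists its keys in sorted order."""
    if isinstance(node, dict):
        keys = list(node)  # json.loads keeps the on-disk key order
        return keys == sorted(keys) and all(keys_sorted(v) for v in node.values())
    if isinstance(node, list):
        return all(keys_sorted(v) for v in node)
    return True

def check_snapshot_text(text):
    """Parse one snapshot file's text and report the two Gate 2 properties."""
    doc = json.loads(text)  # raises ValueError if the file is not valid JSON
    has_version = isinstance(doc, dict) and doc.get("schema_version") == 1
    return {"has_schema_version": has_version, "keys_sorted": keys_sorted(doc)}
```

Calling `check_snapshot_text(path.read_text())` on each of the six files would flag any file whose keys are emitted in insertion order rather than sorted order.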
canary
$ ls audit-runs/phase-b-state-equivalence/snap-001/canary/
config.json cpu_state.json kernel.json manifest.json memory.json vfs.json
Same six files, same shape. Note: canary's phase_b_snapshot.cc writes JSON via direct fmt::format rather than a JSON map, so keys are emitted in insertion order, not alphabetical order. The diff tool parses to dict before comparing, so this asymmetry has no functional impact (verified empirically — diff_state.py produces identical reports across multiple runs of either engine). It does mean the canary↔ours manifest hashes differ even when the underlying state is semantically identical; the diff tool falls back to full content comparison in that case. PASS with this caveat documented.
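The key-order asymmetry is easy to reproduce in isolation. The sketch below uses two made-up JSON strings (the `gpr` key and its values are illustrative, not taken from a real snapshot) to show why the byte-level hashes differ while the parsed content compares equal.

```python
import hashlib
import json

# Same content, different key order: insertion order vs. sorted order.
insertion_order = '{"pc": "0x824ab748", "gpr": [0, 1]}'
sorted_order = '{"gpr": [0, 1], "pc": "0x824ab748"}'

def sha256_text(text):
    """Hash the serialized bytes, as a manifest hash would."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def semantically_equal(a, b):
    """Compare parsed values; dict equality in Python ignores key order."""
    return json.loads(a) == json.loads(b)

hashes_match = sha256_text(insertion_order) == sha256_text(sorted_order)
content_matches = semantically_equal(insertion_order, sorted_order)
print(hashes_match, content_matches)  # False True
```

This is why a manifest-hash difference between engines is only a hint to look further, and the content comparison is what decides whether the state differs.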
Gate 3: Hash-deterministic re-runs (ours)
Two runs of ours with identical args:
$ ./target/release/xenia-rs-phaseB exec --quiet \
--phase-b-snapshot-dir <dir> --phase-b-snapshot-and-exit <iso> # run 1
$ mv <dir>/ours <dir>/ours-a
$ ./target/release/xenia-rs-phaseB exec --quiet \
--phase-b-snapshot-dir <dir> --phase-b-snapshot-and-exit <iso> # run 2
$ diff -r <dir>/ours <dir>/ours-a && echo BYTE-IDENTICAL
BYTE-IDENTICAL
PASS. Re-running ours with the same args produces hash-identical snapshot files.
The first re-run attempt produced a `config.json` mismatch because the two runs were given different `--phase-b-snapshot-dir` values (whose path string is embedded in `config.json::cvars.phase_b_snapshot_dir`). That field is in the diff tool's `SKIP_BY_FILE["config.json"]` skip set; the hash difference confirmed the skip rule is well-placed. With identical inputs the snapshots are byte-equal.
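A skip rule of this kind might look like the following sketch. Only the name `SKIP_BY_FILE` and the skipped field come from this record; the dotted-path representation and the `strip_skipped` helper are assumptions made for illustration, and the real structure in `diff_state.py` may differ.

```python
import copy

# Hypothetical shape: file name -> set of dotted key paths to ignore when comparing.
SKIP_BY_FILE = {"config.json": {"cvars.phase_b_snapshot_dir"}}

def strip_skipped(file_name, doc):
    """Return a deep copy of doc with the skipped fields for file_name removed."""
    out = copy.deepcopy(doc)
    for dotted in SKIP_BY_FILE.get(file_name, ()):
        parts = dotted.split(".")
        node = out
        # Walk down to the parent of the field; give up if the path is absent.
        for key in parts[:-1]:
            node = node.get(key) if isinstance(node, dict) else None
            if node is None:
                break
        if isinstance(node, dict):
            node.pop(parts[-1], None)
    return out
```

With this shape, two `config.json` documents that differ only in the snapshot directory path compare equal after `strip_skipped("config.json", ...)` is applied to both, while their raw file hashes still differ.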
Gate 4: Invariants (HARD GATE)
From report.md:
| invariant | canary | ours | ok? |
|---|---|---|---|
| xex_entry_point | 0x824ab748 | 0x824ab748 | PASS |
| cpu_state.pc == xex_entry_point | 0x824ab748 == 0x824ab748 (canary) | 0x824ab748 == 0x824ab748 (ours) | PASS |
| image_loaded_sha256 | a70993b7… | ea8d160e… | FAIL → STOP |
The PC + entry-point invariants prove the snapshot point is equivalent across engines — both fired immediately before the first instruction at the same address. This is the principal Phase B equivalence claim.
The image_loaded_sha256 mismatch is the expected STOP condition per the spec. Phase B's contract is to detect and report this; investigation belongs to Phase C/D. The report.md flags it explicitly with re-run guidance.
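The hard-gate logic amounts to evaluating each invariant and stopping on the first class of failure. The sketch below is an assumption about how such a gate could be expressed, not the diff tool's code; the flat dictionary layout and `check_invariants` name are hypothetical, and the hash strings are the truncated values from the table, kept as-is.

```python
def check_invariants(canary, ours):
    """Return (rows, stop); stop is True if any invariant fails."""
    rows = [
        ("xex_entry_point",
         canary["xex_entry_point"] == ours["xex_entry_point"]),
        ("cpu_state.pc == xex_entry_point",
         canary["pc"] == canary["xex_entry_point"]
         and ours["pc"] == ours["xex_entry_point"]),
        ("image_loaded_sha256",
         canary["image_loaded_sha256"] == ours["image_loaded_sha256"]),
    ]
    return rows, any(not ok for _, ok in rows)

# Values copied from the table above (hashes are truncated in the report).
canary = {"xex_entry_point": "0x824ab748", "pc": "0x824ab748",
          "image_loaded_sha256": "a70993b7…"}
ours = {"xex_entry_point": "0x824ab748", "pc": "0x824ab748",
        "image_loaded_sha256": "ea8d160e…"}

rows, stop = check_invariants(canary, ours)
for name, ok in rows:
    print(name, "PASS" if ok else "FAIL -> STOP")
```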
Gate 5: Diff-tool negative test
$ cp audit-runs/phase-b-state-equivalence/snap-001/ours/kernel.json /tmp/kernel-mut.json
$ sed -i 's/"thread_id": 1/"thread_id": 999/' /tmp/kernel-mut.json
$ mkdir -p /tmp/ours-mut && cp -r audit-runs/phase-b-state-equivalence/snap-001/ours/* /tmp/ours-mut/
$ cp /tmp/kernel-mut.json /tmp/ours-mut/kernel.json
$ python3 tools/diff-state/diff_state.py \
--canary audit-runs/phase-b-state-equivalence/snap-001/ours \
--ours /tmp/ours-mut --out /tmp/r.md
$ echo $?
1
`report.md` names two divergences:
- `kernel.json <manifest>` `manifest-hash-mismatch`: surfaces that `/tmp/ours-mut/kernel.json`'s SHA does not match what `/tmp/ours-mut/manifest.json` claims.
- `kernel.json objects[handle_semantic_id=…].details.thread_id` `value=canary=1, ours=999`: the actual mutation.
PASS.
Verified 2026-05-13 (Phase A/B verify session). Pre-fix, the diff tool trusted the manifest-claimed hashes without verifying them; a tampered file with an intact manifest copy would silently report "identical" (exit 0). The fix in `diff_state.py` (around `diff_directory`) re-hashes each file, surfaces a `manifest-hash-mismatch` σ-structural divergence when the on-disk SHA does not match the manifest, and falls through to a full content diff.
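The re-hash step can be sketched as follows. This is an illustration under stated assumptions rather than the code in `diff_state.py`: the `verify_manifest` name is hypothetical, and the manifest layout (a top-level `files` object mapping file name to hex SHA-256) is assumed, since the real manifest schema is not shown in this record.

```python
import hashlib
import json
from pathlib import Path

def verify_manifest(snapshot_dir):
    """Re-hash each file named in manifest.json and list those whose hash disagrees."""
    root = Path(snapshot_dir)
    manifest = json.loads((root / "manifest.json").read_text())
    divergences = []
    # Assumed layout: {"files": {"kernel.json": "<hex sha256>", ...}}
    for name, claimed in manifest["files"].items():
        actual = hashlib.sha256((root / name).read_bytes()).hexdigest()
        if actual != claimed:
            divergences.append((name, "manifest-hash-mismatch"))
    return divergences
```

A caller would treat a non-empty result as a signal to run the full content diff on the listed files instead of trusting the manifest's claim of equality.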
Summary
| Gate | Status |
|---|---|
| 1. Cvar-OFF determinism (both engines) | PASS |
| 2. Snapshots well-formed (both engines) | PASS |
| 3. Hash-deterministic re-runs (ours) | PASS |
| 4. Invariants — pc == entry_point | PASS |
| 4. Invariants — image_loaded_sha256 | FAIL → STOP (expected: this is what Phase B catalogs) |
| 5. Diff-tool negative test | PASS |
Cascade prediction at session close
- A (snapshot tool emits readable state both engines): achieved.
- B (section content hashes match): NOT achieved. `image_loaded_sha256` differs. The XEX is loaded into different post-decompression states between the two engines. This is the primary finding that Phase C will investigate, not a Phase B failure.
- C (divergence catalog produced with classification): achieved, with 58 divergences across all 5 files, fully classified.
- D (fix lands): N/A — out of scope for Phase B.
Notes on minor implementation choices
- Canary's PPCContext doesn't expose a `pc` field (the JIT dispatch loop manages PC). At the snapshot point the about-to-execute PC equals the `address` arg to `processor()->Execute(...)`, which the hook receives as `entry_address`; we emit that value as `cpu_state.pc`.
- Memory snapshots emit a fixed named-region list (XEX image, main stack, PCR, TLS) rather than walking the full page table. An earlier blanket-walk approach crashed in Wine because canary's `QueryRegionInfo` reports `COMMIT` for some pages whose host-side backing is reserved-not-committed (physical heap mirrors, low system heap). The named-region list is sufficient for the diff tool's cross-engine comparison.
- The `xex_header_sha256` field uses different formats in each engine (canary emits a 64-bit `UserModule::hash()`; ours emits a placeholder zero string). This is a known one-line shim that Phase B intentionally leaves as a divergence to demonstrate the diff tool's δ-content class.