xenia-rs/audit-runs/phase-ab-verify/re-validation.md
MechaCat02 ef93a4fa14 handoff: VSync/event-wedge fixes + iterate 2.A–2.BC research notes
Source changes (dormant parity infra, retained from iterate 2.AI/2.AO):
- xenia-kernel/exports.rs: nt_create_event manual_reset polarity +
  related event wiring
- xenia-gpu/mmio_region.rs: D1MODE_VBLANK_VLINE_STATUS hardcode parity

Also lands the audit-runs/ analysis notes (.md/.txt/.json digests) for the
iterate 2.x VSync/0x10e8/0x1004 wedge investigation. Raw trace dumps
(.jsonl/.gz/.csv/.stdout) and agent worktrees (.claude/) are gitignored as
regenerable local artifacts — see memory + HANDOFF for the running findings.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-05 07:19:08 +02:00


Phase A + Phase B re-validation evidence (post-fix)

Compact transcript of the post-fix re-runs that prove all 4 Phase A gates and all 5 Phase B gates pass. For a full discussion of the issues fixed and the per-step methodology, see verification-report.md.

Conducted 2026-05-13. Build under test: target/release/xenia-rs (combined Phase A + Phase B, byte-identical to xenia-rs-phaseB). Diff tool under test: tools/diff-state/diff_state.py post-fix.

Combined Phase A + Phase B cvar-OFF determinism

$ ./target/release/xenia-rs check --stable-digest -n 50000000 \
    --out audit-runs/phase-ab-verify/digest-current-cvaroff.json \
    "<ISO>"
$ diff audit-runs/phase-a-diff-harness/digest-pre-patch.json \
       audit-runs/phase-ab-verify/digest-current-cvaroff.json
# (no output → byte-identical)

PASS. The current binary, with no Phase A or Phase B cvars, produces the same instructions=50000001 imports=40454 unimpl=0 draws=0 swaps=1 … digest as the pre-Phase-A baseline.

Phase A gates

Gate 1 — cvar-OFF determinism

  • ours: see "Combined cvar-OFF" above. PASS.
  • canary: 18-s Wine smoke run with --mute=true, no Phase A cvars. xenia.log shows AUDIT-DEMO-SETUP-BEGIN and AUDIT-DEMO-SETUP-GRAPHICS-OK. CONFIG DUMP [Audit] section contains phase_a_event_log_path = "" and phase_a_event_log_mem_writes = false. PASS.

Gate 2 — cvar-ON valid JSONL with schema_version first

$ python3 -c "import json; [json.loads(l) for l in open('audit-runs/phase-a-diff-harness/ours-sanity.jsonl')]"
# (no error — 121 363 lines all parse)
$ head -1 audit-runs/phase-a-diff-harness/ours-sanity.jsonl
{"schema_version":1,"engine":"ours","kind":"schema_version",…}

Same for canary-sanity.jsonl (1 635 789 lines, all parse, header is schema_version). Kind histograms:

  • ours: 1 schema_version + 40 454 each of import.call/kernel.call/kernel.return
  • canary: 1 schema_version + 545 271 import.call + 545 270 kernel.call + 545 247 kernel.return (24 in-flight at SIGKILL).

PASS.
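
The parse-and-count check above can be sketched in a few lines. This is a minimal illustration, not the procedure actually used: `kind_histogram` is a hypothetical helper, and the only fields it relies on are `kind` and the `schema_version` header shown in the transcript.

```python
import json
from collections import Counter

def kind_histogram(lines):
    """Count events per 'kind' and confirm the header record comes first."""
    counts = Counter()
    for i, line in enumerate(lines):
        event = json.loads(line)  # raises ValueError on a malformed line
        if i == 0 and event.get("kind") != "schema_version":
            raise ValueError("first record is not the schema_version header")
        counts[event["kind"]] += 1
    return counts

# Tiny synthetic log standing in for ours-sanity.jsonl.
sample = [
    '{"schema_version":1,"engine":"ours","kind":"schema_version"}',
    '{"kind":"import.call"}',
    '{"kind":"kernel.call"}',
    '{"kind":"kernel.return"}',
]
histogram = kind_histogram(sample)
for kind, count in sorted(histogram.items()):
    print(kind, count)
```

On a real log, pass the open file object instead of `sample`; unequal `import.call`/`kernel.call`/`kernel.return` counts would then point at in-flight calls, as in the canary numbers above.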

Gate 3 — ≥100-event matching prefix on tid=6→tid=1

$ python3 tools/diff-events/diff_events.py \
    --canary audit-runs/phase-a-diff-harness/canary-sanity.jsonl \
    --ours   audit-runs/phase-a-diff-harness/ours-sanity.jsonl \
    --out    /tmp/post-fix-phase-a.md
$ diff -q audit-runs/phase-a-diff-harness/diff-report.md /tmp/post-fix-phase-a.md
# (no output — byte-identical)

113 matched events on canary tid=6 → ours tid=1 before first divergence at idx 113. PASS.

Gate 4 — negative test detects corruption at exact index

$ python3 -c "
import json
with open('audit-runs/phase-a-diff-harness/ours-sanity.jsonl') as f:
    lines=[next(f) for _ in range(100)]
open('/tmp/ours-short.jsonl','w').writelines(lines)
ev=json.loads(lines[49]); ev['kind']='kernel.CORRUPT'
lines[49]=json.dumps(ev)+'\n'
open('/tmp/ours-corrupt.jsonl','w').writelines(lines)
"
$ python3 tools/diff-events/diff_events.py --canary /tmp/ours-short.jsonl --ours /tmp/ours-short.jsonl --validate-identical
# exit 0 → self-diff PASS
$ python3 tools/diff-events/diff_events.py --canary /tmp/ours-short.jsonl --ours /tmp/ours-corrupt.jsonl --validate-identical
validate-identical: divergence in canary_tid=1 at tid_event_idx=48 (kind: canary='import.call' ours='kernel.CORRUPT')
# exit 1

PASS.
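
The reported index is consistent with the mutation: `lines[49]` is the 50th line of the file, and if the `schema_version` header on line 0 is not counted as a thread event and the first 99 events all belong to one thread, it is that thread's event 48 (0-based). The sketch below reproduces that arithmetic on synthetic data. It is not the logic of diff_events.py: the `tid` field name is an assumption, and unlike the real tool it compares identical thread ids rather than mapping canary threads to ours.

```python
import json
from collections import defaultdict

def events_by_tid(lines):
    """Group event kinds by thread id, skipping the schema_version header."""
    grouped = defaultdict(list)
    for line in lines:
        event = json.loads(line)
        if event.get("kind") == "schema_version":
            continue
        grouped[event["tid"]].append(event["kind"])  # "tid" is an assumed field name
    return grouped

def first_divergence(canary_lines, ours_lines):
    """Return (tid, idx, canary_kind, ours_kind), or None if the logs agree."""
    canary, ours = events_by_tid(canary_lines), events_by_tid(ours_lines)
    for tid in sorted(set(canary) | set(ours)):
        a, b = canary.get(tid, []), ours.get(tid, [])
        for idx in range(max(len(a), len(b))):
            kind_a = a[idx] if idx < len(a) else None
            kind_b = b[idx] if idx < len(b) else None
            if kind_a != kind_b:
                return tid, idx, kind_a, kind_b
    return None

# Synthetic 100-line log: one header plus 99 events on tid 1.
kinds = ["import.call", "kernel.call", "kernel.return"]
good = [json.dumps({"schema_version": 1, "kind": "schema_version"})]
good += [json.dumps({"tid": 1, "kind": kinds[i % 3]}) for i in range(99)]

corrupt = list(good)
event = json.loads(corrupt[49])
event["kind"] = "kernel.CORRUPT"
corrupt[49] = json.dumps(event)

print(first_divergence(good, good))
print(first_divergence(good, corrupt))
```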

Phase B gates

Gate 1 — cvar-OFF determinism (combined Phase A + Phase B)

  • ours: see "Combined cvar-OFF". PASS.
  • canary: same Wine smoke run shows the 5 expected new [Audit] cvar lines (2 Phase A + 3 Phase B). Smoke marker fires. PASS.

Gate 2 — well-formed snapshots both engines

$ ls audit-runs/phase-b-state-equivalence/snap-001/{canary,ours}/
canary/  config.json  cpu_state.json  kernel.json  manifest.json  memory.json  vfs.json
ours/    config.json  cpu_state.json  kernel.json  manifest.json  memory.json  vfs.json
$ for f in config cpu_state kernel manifest memory vfs; do
    python3 -c "import json; json.load(open('audit-runs/phase-b-state-equivalence/snap-001/canary/$f.json'))"
    python3 -c "import json; json.load(open('audit-runs/phase-b-state-equivalence/snap-001/ours/$f.json'))"
done
# (no error — 12 files all parse)

Manifest SHA-256 claims match recomputed file hashes (verified per file). Note: ours emits keys alphabetically (serde_json default); canary emits in insertion order (fmt::format). Diff tool parses to dict before comparing — no functional impact. PASS, with documentation update in validation.md.
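
A small illustration of why key order affects file hashes but not the comparison, assuming only that the diff tool compares parsed values as stated above. The two byte strings are made-up stand-ins for the engines' output.

```python
import hashlib
import json

def sha256_hex(data: bytes) -> str:
    """Hex SHA-256 of raw bytes, as a manifest would record it."""
    return hashlib.sha256(data).hexdigest()

ours_bytes = b'{"a": 1, "b": {"x": 2, "y": 3}}'    # keys in alphabetical order
canary_bytes = b'{"b": {"y": 3, "x": 2}, "a": 1}'  # keys in insertion order

# The raw bytes differ, so each engine's manifest records a different hash.
bytes_equal = sha256_hex(ours_bytes) == sha256_hex(canary_bytes)

# Parsed into dicts, the two documents compare equal regardless of key order.
parsed_equal = json.loads(ours_bytes) == json.loads(canary_bytes)

print(bytes_equal, parsed_equal)
```

This is why a manifest hash is only meaningful against the bytes its own engine wrote, while content comparison has to go through the parsed form.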

Gate 3 — hash-deterministic re-runs

ours. Two independent runs to different --phase-b-snapshot-dirs:

$ ./target/release/xenia-rs exec --quiet \
    --phase-b-snapshot-dir audit-runs/phase-ab-verify/snap-002a \
    --phase-b-snapshot-and-exit "<ISO>"
$ ./target/release/xenia-rs exec --quiet \
    --phase-b-snapshot-dir audit-runs/phase-ab-verify/snap-002b \
    --phase-b-snapshot-and-exit "<ISO>"
$ python3 tools/diff-state/diff_state.py \
    --canary audit-runs/phase-ab-verify/snap-002a/ours \
    --ours   audit-runs/phase-ab-verify/snap-002b/ours \
    --validate-identical
validate-identical: OK
# exit 0

Same-dir byte-equality:

$ # snap-002c run 1 → ours/, then mv to ours-1, then run 2 → ours/
$ diff -r audit-runs/phase-ab-verify/snap-002c/ours \
          audit-runs/phase-ab-verify/snap-002c/ours-1
# (no output — BYTE-IDENTICAL)

PASS.
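
Where `diff -r` is not available, the same byte-equality check can be approximated by hashing every file under each snapshot directory. A minimal sketch; `tree_digest` and `trees_identical` are hypothetical helpers, not part of the repository's tooling.

```python
import hashlib
from pathlib import Path

def tree_digest(root):
    """Map each file's path (relative to root) to the SHA-256 of its bytes."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }

def trees_identical(a, b):
    """True when both trees contain the same files with the same contents."""
    return tree_digest(a) == tree_digest(b)
```

Usage would be `trees_identical("snap-002c/ours", "snap-002c/ours-1")`. Comparing the two dictionaries also catches files present on only one side.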

canary. New snapshot run via Wine, compared to stored snap-001:

$ wine xenia_canary_phaseB.exe --mute=true \
    --phase_b_snapshot_dir="$WP" --phase_b_snapshot_and_exit=true "<ISO>"
$ python3 tools/diff-state/diff_state.py \
    --canary audit-runs/phase-b-state-equivalence/snap-001/canary \
    --ours   audit-runs/phase-ab-verify/snap-canary-002/canary \
    --validate-identical
validate-identical: OK
# exit 0

PASS.

Gate 4 — invariants

| invariant | canary | ours | ok |
|---|---|---|---|
| xex_entry_point | 0x824ab748 | 0x824ab748 | PASS |
| cpu_state.pc == xex_entry_point | yes | yes | PASS |
| image_loaded_sha256 match | a70993b7… | ea8d160e… | FAIL → STOP (expected catalog finding) |

Mismatch reproducible across two independent canary runs (both a70993b7…) and two independent ours runs (both ea8d160e…). The mismatch is the documented Phase C handoff, not a Phase B failure.

Gate 5 — diff-tool negative test

Verbatim reproduction of the validation.md procedure (after the diff_state.py fix):

$ cp audit-runs/phase-b-state-equivalence/snap-001/ours/kernel.json /tmp/kernel-mut.json
$ sed -i 's/"thread_id": 1/"thread_id": 999/' /tmp/kernel-mut.json
$ # build /tmp/verify-gate5/ from snap-001/ours + the mutated kernel.json
$ python3 tools/diff-state/diff_state.py \
    --canary audit-runs/phase-b-state-equivalence/snap-001/ours \
    --ours   /tmp/verify-gate5 --out /tmp/r3.md
wrote /tmp/r3.md (2 divergences)
# exit 1

The report (/tmp/r3.md) names two divergences:

  • kernel.json <manifest> manifest-hash-mismatch σ-structural (file SHA on disk does not match manifest's claim)
  • kernel.json objects[handle_semantic_id=9879c5053fedb1d0].details.thread_id γ-kernel-content canary=1 ours=999

PASS.
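
The first divergence, a manifest hash mismatch, is the kind of check sketched below. The manifest layout used here, `{"files": {"<name>": "<sha256 hex>"}}`, is a hypothetical one chosen for illustration; the real manifest.json schema is not shown in this document, and `manifest_mismatches` is not a function of diff_state.py.

```python
import hashlib
import json
from pathlib import Path

def manifest_mismatches(snapshot_dir):
    """Return names of files whose on-disk SHA-256 differs from the manifest claim.

    Assumes manifest.json has the hypothetical shape
    {"files": {"kernel.json": "<sha256 hex>", ...}}.
    """
    root = Path(snapshot_dir)
    claims = json.loads((root / "manifest.json").read_text())["files"]
    return sorted(
        name
        for name, claimed in claims.items()
        if hashlib.sha256((root / name).read_bytes()).hexdigest() != claimed
    )
```

Editing kernel.json after the manifest was written, as the `sed` step does, leaves the recorded hash stale, so the file is reported even before its contents are compared field by field.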

Regression: stored Phase B catalog unchanged after fix

$ python3 tools/diff-state/diff_state.py \
    --canary audit-runs/phase-b-state-equivalence/snap-001/canary \
    --ours   audit-runs/phase-b-state-equivalence/snap-001/ours \
    --out    /tmp/post-fix-phase-b.md
wrote /tmp/post-fix-phase-b.md (58 divergences)
# exit 2 (STOP)
$ diff -q audit-runs/phase-b-state-equivalence/report.md /tmp/post-fix-phase-b.md
# (no output → byte-identical)

The 58-divergence catalog is unchanged. The diff_state.py fix changes behavior only when the on-disk SHA disagrees with the manifest's claim, which occurs only under tampering or in cross-engine testing where each engine emits its own bytes.

Unit tests

$ cargo test -p xenia-kernel event_log
test event_log::tests::fnv1a_known_vector ... ok
test event_log::tests::semantic_id_stable ... ok
test result: ok. 2 passed; 0 failed

PASS.
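
For reference, a known-vector check for FNV-1a looks roughly like the sketch below. It assumes the 64-bit variant, a guess based on the 16-hex-digit `handle_semantic_id` values above; the width used by `event_log` is not stated here. The constants and test vectors are the standard published ones for 64-bit FNV-1a.

```python
FNV64_OFFSET_BASIS = 0xCBF29CE484222325
FNV64_PRIME = 0x100000001B3
MASK64 = 0xFFFFFFFFFFFFFFFF

def fnv1a_64(data: bytes) -> int:
    """64-bit FNV-1a: XOR each byte into the hash, then multiply by the prime."""
    h = FNV64_OFFSET_BASIS
    for byte in data:
        h ^= byte
        h = (h * FNV64_PRIME) & MASK64  # keep the result within 64 bits
    return h

# Standard vectors: the empty input yields the offset basis.
assert fnv1a_64(b"") == 0xCBF29CE484222325
assert fnv1a_64(b"a") == 0xAF63DC4C8601EC8C
```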

Summary

| Gate | Status |
|---|---|
| Phase A 1 cvar-OFF (ours) | PASS |
| Phase A 1 cvar-OFF (canary) | PASS |
| Phase A 2 cvar-ON well-formed JSONL | PASS |
| Phase A 3 ≥100-event matching prefix | PASS |
| Phase A 4 negative test | PASS |
| Phase B 1 cvar-OFF (ours) | PASS |
| Phase B 1 cvar-OFF (canary) | PASS |
| Phase B 2 well-formed snapshots | PASS |
| Phase B 3 hash-deterministic re-runs (ours) | PASS |
| Phase B 3 hash-deterministic re-runs (canary) | PASS |
| Phase B 4 invariants pc == entry_point | PASS |
| Phase B 4 invariant image_loaded_sha256 | FAIL → STOP (documented finding for Phase C) |
| Phase B 5 negative test | PASS (post-fix) |
| Combined cvar-OFF byte-identical to baseline | PASS |
| Diff-tool synthetic edges (each tool, 5 cases) | PASS |
| Hook-point semantic equivalence | PASS |

All gates that should PASS, do. The single FAIL is the documented image_loaded_sha256 STOP condition that defines Phase B's success boundary.