xenia-rs/audit-runs/phase-ab-verify/re-validation.md
MechaCat02 ef93a4fa14 handoff: VSync/event-wedge fixes + iterate 2.A–2.BC research notes
Source changes (dormant parity infra, retained from iterate 2.AI/2.AO):
- xenia-kernel/exports.rs: nt_create_event manual_reset polarity +
  related event wiring
- xenia-gpu/mmio_region.rs: D1MODE_VBLANK_VLINE_STATUS hardcode parity

Also lands the audit-runs/ analysis notes (.md/.txt/.json digests) for the
iterate 2.x VSync/0x10e8/0x1004 wedge investigation. Raw trace dumps
(.jsonl/.gz/.csv/.stdout) and agent worktrees (.claude/) are gitignored as
regenerable local artifacts — see memory + HANDOFF for the running findings.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-05 07:19:08 +02:00

# Phase A + Phase B re-validation evidence (post-fix)
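Compact transcript of the post-fix re-runs showing that all 4 Phase A
gates and all 5 Phase B gates behave as expected: every gate passes
except the `image_loaded_sha256` invariant in Phase B Gate 4, which is
the documented STOP finding. For full discussion of the issues
fixed and per-step methodology, see `verification-report.md`.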
Conducted 2026-05-13. Build under test: `target/release/xenia-rs`
(combined Phase A + Phase B, byte-identical to `xenia-rs-phaseB`).
Diff tool under test: `tools/diff-state/diff_state.py` post-fix.
## Combined Phase A + Phase B cvar-OFF determinism
```
$ ./target/release/xenia-rs check --stable-digest -n 50000000 \
--out audit-runs/phase-ab-verify/digest-current-cvaroff.json \
"<ISO>"
$ diff audit-runs/phase-a-diff-harness/digest-pre-patch.json \
audit-runs/phase-ab-verify/digest-current-cvaroff.json
# (no output → byte-identical)
```
PASS. The current binary, with no Phase A or Phase B cvars, produces
the same `instructions=50000001 imports=40454 unimpl=0 draws=0
swaps=1 …` digest as the pre-Phase-A baseline.
## Phase A gates
### Gate 1 — cvar-OFF determinism
- **ours**: see "Combined cvar-OFF" above. PASS.
- **canary**: 18-s Wine smoke run with `--mute=true`, no Phase A cvars.
`xenia.log` shows `AUDIT-DEMO-SETUP-BEGIN` and
`AUDIT-DEMO-SETUP-GRAPHICS-OK`. CONFIG DUMP `[Audit]` section
contains `phase_a_event_log_path = ""` and
`phase_a_event_log_mem_writes = false`. PASS.
### Gate 2 — cvar-ON valid JSONL with `schema_version` first
```
$ python3 -c "import json; [json.loads(l) for l in open('audit-runs/phase-a-diff-harness/ours-sanity.jsonl')]"
# (no error — 121 363 lines all parse)
$ head -1 audit-runs/phase-a-diff-harness/ours-sanity.jsonl
{"schema_version":1,"engine":"ours","kind":"schema_version",…}
```
Same for `canary-sanity.jsonl` (1 635 789 lines, all parse, header is
`schema_version`). Kind histograms:
- ours: 1 schema_version + 40 454 each of import.call/kernel.call/kernel.return
- canary: 1 schema_version + 545 271 import.call + 545 270 kernel.call + 545 247 kernel.return (24 in-flight at SIGKILL).
PASS.
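The parse check and the kind histogram can be reproduced in a single pass. The following is a minimal sketch, assuming only what the transcript shows: one JSON object per line, each carrying a `kind` field. The function name `kind_histogram` and the sample lines are hypothetical and not part of the harness.

```python
import json
from collections import Counter

def kind_histogram(lines):
    """Parse every JSONL line and count events per 'kind'.

    Raises json.JSONDecodeError on the first malformed line, the same
    failure mode as the one-liner in the transcript.
    """
    counts = Counter()
    for line in lines:
        event = json.loads(line)
        counts[event["kind"]] += 1
    return counts

# Tiny illustrative input, not real harness output.
sample = [
    '{"schema_version":1,"engine":"ours","kind":"schema_version"}',
    '{"kind":"import.call"}',
    '{"kind":"kernel.call"}',
    '{"kind":"kernel.return"}',
]
hist = kind_histogram(sample)
print(dict(hist))
```

Passing an open file object instead of `sample` works the same way. Subtracting the `kernel.return` count from the `import.call` count gives the in-flight figure (545 271 − 545 247 = 24 for canary).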
### Gate 3 — ≥100-event matching prefix on tid=6→tid=1
```
$ python3 tools/diff-events/diff_events.py \
--canary audit-runs/phase-a-diff-harness/canary-sanity.jsonl \
--ours audit-runs/phase-a-diff-harness/ours-sanity.jsonl \
--out /tmp/post-fix-phase-a.md
$ diff -q audit-runs/phase-a-diff-harness/diff-report.md /tmp/post-fix-phase-a.md
# (no output — byte-identical)
```
113 matched events on canary tid=6 → ours tid=1 before first
divergence at idx 113. PASS.
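The count of matched events and the index of the first divergence are the same number because indices are 0-based. A minimal sketch of a matching-prefix computation follows; it assumes events are compared on `kind` only, which is an illustration and not necessarily the comparison key `diff_events.py` uses. `matching_prefix_len` is a hypothetical name.

```python
def matching_prefix_len(canary, ours, key=lambda ev: ev["kind"]):
    """Return the number of leading events that agree under `key`.

    The result is also the 0-based index of the first divergence
    (or the shorter length if one stream is a prefix of the other).
    """
    n = 0
    for a, b in zip(canary, ours):
        if key(a) != key(b):
            break
        n += 1
    return n

# Illustrative per-thread streams, not real harness data.
canary = [{"kind": "import.call"}, {"kind": "kernel.call"}, {"kind": "kernel.return"}]
ours = [{"kind": "import.call"}, {"kind": "kernel.call"}, {"kind": "kernel.CORRUPT"}]
prefix = matching_prefix_len(canary, ours)
print(prefix)  # 2 matched events, first divergence at idx 2
```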
### Gate 4 — negative test detects corruption at exact index
```
$ python3 -c "
import json
# Take the schema_version header line plus the first 99 events.
with open('audit-runs/phase-a-diff-harness/ours-sanity.jsonl') as f:
    lines = [next(f) for _ in range(100)]
open('/tmp/ours-short.jsonl', 'w').writelines(lines)
# Corrupt the kind field of line 49 (0-based, header included).
ev = json.loads(lines[49]); ev['kind'] = 'kernel.CORRUPT'
lines[49] = json.dumps(ev) + '\n'
open('/tmp/ours-corrupt.jsonl', 'w').writelines(lines)
"
$ python3 tools/diff-events/diff_events.py --canary /tmp/ours-short.jsonl --ours /tmp/ours-short.jsonl --validate-identical
# exit 0 → self-diff PASS
$ python3 tools/diff-events/diff_events.py --canary /tmp/ours-short.jsonl --ours /tmp/ours-corrupt.jsonl --validate-identical
validate-identical: divergence in canary_tid=1 at tid_event_idx=48 (kind: canary='import.call' ours='kernel.CORRUPT')
# exit 1
```
PASS.
## Phase B gates
### Gate 1 — cvar-OFF determinism (combined Phase A + Phase B)
- **ours**: see "Combined cvar-OFF". PASS.
- **canary**: same Wine smoke run shows the 5 expected new
`[Audit]` cvar lines (2 Phase A + 3 Phase B). Smoke marker fires.
PASS.
### Gate 2 — well-formed snapshots both engines
```
$ ls audit-runs/phase-b-state-equivalence/snap-001/{canary,ours}/
canary/ config.json cpu_state.json kernel.json manifest.json memory.json vfs.json
ours/ config.json cpu_state.json kernel.json manifest.json memory.json vfs.json
$ for f in config cpu_state kernel manifest memory vfs; do
    python3 -c "import json; json.load(open('audit-runs/phase-b-state-equivalence/snap-001/canary/$f.json'))"
    python3 -c "import json; json.load(open('audit-runs/phase-b-state-equivalence/snap-001/ours/$f.json'))"
  done
# (no error — 12 files all parse)
```
Manifest SHA-256 claims match recomputed file hashes (verified per
file). Note: ours emits keys alphabetically (`serde_json` default);
canary emits in insertion order (`fmt::format`). Diff tool parses
to dict before comparing — no functional impact. PASS, with
documentation update in `validation.md`.
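Both checks in this gate (recomputing file hashes against the manifest, and comparing JSON without regard to key order) can be sketched briefly. The layout of `manifest.json` is not shown in this document, so the sketch takes a hypothetical `{file_name: hex_digest}` mapping as input; all function names here are illustrative assumptions.

```python
import hashlib
import json
import tempfile
from pathlib import Path

def sha256_file(path):
    """Hex SHA-256 of a file's raw bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()

def manifest_mismatches(snap_dir, claims):
    """Return names whose on-disk SHA-256 differs from the claimed one.

    `claims` is a hypothetical {file_name: hex_digest} mapping; the real
    manifest.json layout may differ.
    """
    snap_dir = Path(snap_dir)
    return sorted(name for name, digest in claims.items()
                  if sha256_file(snap_dir / name) != digest)

def same_json(text_a, text_b):
    """Compare two JSON documents as parsed values, ignoring key order."""
    return json.loads(text_a) == json.loads(text_b)

with tempfile.TemporaryDirectory() as d:
    target = Path(d) / "kernel.json"
    target.write_text('{"b": 2, "a": 1}')
    claims = {"kernel.json": sha256_file(target)}
    clean = manifest_mismatches(d, claims)      # hashes agree
    target.write_text('{"b": 2, "a": 999}')     # simulate tampering
    tampered = manifest_mismatches(d, claims)   # hash no longer agrees

order_insensitive = same_json('{"a": 1, "b": 2}', '{"b": 2, "a": 1}')
print(clean, tampered, order_insensitive)
```

Comparing parsed values rather than raw bytes is what makes the alphabetical versus insertion-order difference between the two engines irrelevant to the diff.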
### Gate 3 — hash-deterministic re-runs
**ours.** Two independent runs to different
`--phase-b-snapshot-dir`s:
```
$ ./target/release/xenia-rs exec --quiet \
--phase-b-snapshot-dir audit-runs/phase-ab-verify/snap-002a \
--phase-b-snapshot-and-exit "<ISO>"
$ ./target/release/xenia-rs exec --quiet \
--phase-b-snapshot-dir audit-runs/phase-ab-verify/snap-002b \
--phase-b-snapshot-and-exit "<ISO>"
$ python3 tools/diff-state/diff_state.py \
--canary audit-runs/phase-ab-verify/snap-002a/ours \
--ours audit-runs/phase-ab-verify/snap-002b/ours \
--validate-identical
validate-identical: OK
# exit 0
```
Same-dir byte-equality:
```
$ # snap-002c run 1 → ours/, then mv to ours-1, then run 2 → ours/
$ diff -r audit-runs/phase-ab-verify/snap-002c/ours \
audit-runs/phase-ab-verify/snap-002c/ours-1
# (no output — BYTE-IDENTICAL)
```
PASS.
**canary.** New snapshot run via Wine, compared to stored snap-001:
```
$ wine xenia_canary_phaseB.exe --mute=true \
--phase_b_snapshot_dir="$WP" --phase_b_snapshot_and_exit=true "<ISO>"
$ python3 tools/diff-state/diff_state.py \
--canary audit-runs/phase-b-state-equivalence/snap-001/canary \
--ours audit-runs/phase-ab-verify/snap-canary-002/canary \
--validate-identical
validate-identical: OK
# exit 0
```
PASS.
### Gate 4 — invariants
| invariant | canary | ours | ok |
|---|---|---|---|
| `xex_entry_point` | `0x824ab748` | `0x824ab748` | PASS |
| `cpu_state.pc == xex_entry_point` | yes | yes | PASS |
| `image_loaded_sha256` match | `a70993b7…` | `ea8d160e…` | **FAIL → STOP (expected catalog finding)** |
Mismatch reproducible across two independent canary runs (both
`a70993b7…`) and two independent ours runs (both `ea8d160e…`). The
mismatch is the documented Phase C handoff, not a Phase B failure.
### Gate 5 — diff-tool negative test
Reproduction of the verbatim `validation.md` procedure (after
diff_state.py fix):
```
$ cp audit-runs/phase-b-state-equivalence/snap-001/ours/kernel.json /tmp/kernel-mut.json
$ sed -i 's/"thread_id": 1/"thread_id": 999/' /tmp/kernel-mut.json
$ # build /tmp/verify-gate5/ from snap-001/ours + the mutated kernel.json
$ python3 tools/diff-state/diff_state.py \
--canary audit-runs/phase-b-state-equivalence/snap-001/ours \
--ours /tmp/verify-gate5 --out /tmp/r3.md
wrote /tmp/r3.md (2 divergences)
# exit 1
```
The report (`/tmp/r3.md`) names two divergences:
- `kernel.json <manifest>` `manifest-hash-mismatch` σ-structural
(file SHA on disk does not match manifest's claim)
- `kernel.json objects[handle_semantic_id=9879c5053fedb1d0].details.thread_id`
γ-kernel-content `canary=1 ours=999`
PASS.
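A content divergence such as the `thread_id` one is located by walking both parsed documents in parallel and recording the path to each differing leaf. The sketch below is a generic recursive diff under stated assumptions: it indexes list elements by position, whereas the report above keys objects by `handle_semantic_id`, and it does not assign divergence classes. `diff_paths` is a hypothetical name, not the `diff_state.py` implementation.

```python
def diff_paths(a, b, path=""):
    """Yield (path, a_value, b_value) for every leaf where a and b differ."""
    if isinstance(a, dict) and isinstance(b, dict):
        for k in sorted(set(a) | set(b)):
            sub = f"{path}.{k}" if path else k
            if k not in a or k not in b:
                # Key present on one side only.
                yield (sub, a.get(k), b.get(k))
            else:
                yield from diff_paths(a[k], b[k], sub)
    elif isinstance(a, list) and isinstance(b, list) and len(a) == len(b):
        for i, (x, y) in enumerate(zip(a, b)):
            yield from diff_paths(x, y, f"{path}[{i}]")
    elif a != b:
        # Differing scalars, mismatched types, or lists of unequal length.
        yield (path, a, b)

# Illustrative snapshots, not real kernel.json content.
canary = {"objects": [{"details": {"thread_id": 1}}]}
ours = {"objects": [{"details": {"thread_id": 999}}]}
found = list(diff_paths(canary, ours))
print(found)  # [('objects[0].details.thread_id', 1, 999)]
```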
## Regression: stored Phase B catalog unchanged after fix
```
$ python3 tools/diff-state/diff_state.py \
--canary audit-runs/phase-b-state-equivalence/snap-001/canary \
--ours audit-runs/phase-b-state-equivalence/snap-001/ours \
--out /tmp/post-fix-phase-b.md
wrote /tmp/post-fix-phase-b.md (58 divergences)
# exit 2 (STOP)
$ diff -q audit-runs/phase-b-state-equivalence/report.md /tmp/post-fix-phase-b.md
# (no output → byte-identical)
```
The 58-divergence catalog is unchanged. The `diff_state.py` fix changes
behavior only when a file's on-disk SHA disagrees with the manifest's
claim, which occurs only under tampering or in cross-engine
testing where each engine emits its own bytes.
## Unit tests
```
$ cargo test -p xenia-kernel event_log
test event_log::tests::fnv1a_known_vector ... ok
test event_log::tests::semantic_id_stable ... ok
test result: ok. 2 passed; 0 failed
```
PASS.
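For reference, a known-vector check for FNV-1a looks like the sketch below. It uses the standard 64-bit parameters; whether `event_log` uses the 32-bit or 64-bit variant is not stated here (the 16-hex-digit `handle_semantic_id` values suggest 64-bit, but that is an inference). The name `fnv1a_64` is hypothetical.

```python
FNV64_OFFSET = 0xCBF29CE484222325  # standard 64-bit offset basis
FNV64_PRIME = 0x100000001B3        # standard 64-bit FNV prime

def fnv1a_64(data: bytes) -> int:
    """FNV-1a: XOR each byte into the hash, then multiply by the prime."""
    h = FNV64_OFFSET
    for byte in data:
        h ^= byte
        h = (h * FNV64_PRIME) & 0xFFFFFFFFFFFFFFFF  # wrap to 64 bits
    return h

empty_hash = fnv1a_64(b"")
a_hash = fnv1a_64(b"a")
print(f"{empty_hash:016x} {a_hash:016x}")
```

The empty input hashes to the offset basis itself, and `b"a"` hashes to `af63dc4c8601ec8c`, the commonly published test vector for this variant.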
## Summary
| Gate | Status |
|---|---|
| Phase A 1 cvar-OFF (ours) | PASS |
| Phase A 1 cvar-OFF (canary) | PASS |
| Phase A 2 cvar-ON well-formed JSONL | PASS |
| Phase A 3 ≥100-event matching prefix | PASS |
| Phase A 4 negative test | PASS |
| Phase B 1 cvar-OFF (ours) | PASS |
| Phase B 1 cvar-OFF (canary) | PASS |
| Phase B 2 well-formed snapshots | PASS |
| Phase B 3 hash-deterministic re-runs (ours) | PASS |
| Phase B 3 hash-deterministic re-runs (canary) | PASS |
| Phase B 4 invariants `pc == entry_point` | PASS |
| Phase B 4 invariant `image_loaded_sha256` | FAIL → STOP (documented finding for Phase C) |
| Phase B 5 negative test | PASS (post-fix) |
| Combined cvar-OFF byte-identical to baseline | PASS |
| Diff-tool synthetic edges (each tool, 5 cases) | PASS |
| Hook-point semantic equivalence | PASS |
Every gate that should pass does. The single FAIL is the documented
`image_loaded_sha256` STOP condition that defines Phase B's success
boundary.