handoff: VSync/event-wedge fixes + iterate 2.A–2.BC research notes
Source changes (dormant parity infra, retained from iterate 2.AI/2.AO):
- xenia-kernel/exports.rs: nt_create_event manual_reset polarity + related event wiring
- xenia-gpu/mmio_region.rs: D1MODE_VBLANK_VLINE_STATUS hardcode parity

Also lands the audit-runs/ analysis notes (.md/.txt/.json digests) for the iterate 2.x VSync/0x10e8/0x1004 wedge investigation. Raw trace dumps (.jsonl/.gz/.csv/.stdout) and agent worktrees (.claude/) are gitignored as regenerable local artifacts; see memory + HANDOFF for the running findings.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
audit-runs/phase-b-state-equivalence/validation.md
@@ -0,0 +1,130 @@
# Phase B: Validation record

All gates executed on 2026-05-13 against the patched canary
(`build-cross/bin/Windows/Debug/xenia_canary.exe` + renamed
`xenia_canary_phaseB.exe`) and ours (`target/release/xenia-rs` + renamed
`target/release/xenia-rs-phaseB`).

## Gate 1: cvar-OFF determinism

### ours

- Pre-patch digest: `audit-runs/phase-a-diff-harness/digest-post-patch-cvaroff.json` (Phase A baseline; Phase A's gate-1 already proved byte-identity to the genuine pre-patch).
- Post-Phase-B digest: `audit-runs/phase-b-state-equivalence/digest-post-phaseB-cvaroff.json`.
- Both runs: `check --stable-digest -n 50000000` against the same ISO.
- `diff` of the two files produces zero output. Byte-identical. **PASS.**

### canary

- Phase B adds three new CONFIG DUMP lines (`phase_b_snapshot_dir = ""`, `phase_b_snapshot_and_exit = false`, `phase_b_dump_section_content = false`). All other lines either match Phase A's accepted host-pointer/timing jitter or are unchanged.
- Smoke marker (`AUDIT-DEMO-SETUP-BEGIN`) still fires.
- **PASS** by the Phase A gate-1 method.

## Gate 2: Snapshot files well-formed

### ours

```
$ ls audit-runs/phase-b-state-equivalence/snap-001/ours/
config.json  cpu_state.json  kernel.json  manifest.json  memory.json  vfs.json
```

All six files parse as JSON, lead with `"schema_version": 1` (or contain it in the manifest), and have alphabetically sorted keys (verified by re-serializing; `serde_json::Map` is backed by a `BTreeMap` by default, so keys serialize in sorted order unless the `preserve_order` feature is enabled). **PASS.**

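The sorted-key property can also be checked independently of the Rust toolchain. The following is a minimal Python sketch, not the check that was actually run for this gate; the function name `keys_are_sorted` and the sample documents in the usage note are assumptions made for illustration.

```python
import json


def keys_are_sorted(text: str) -> bool:
    """Return True if every JSON object in `text` lists its keys in sorted order."""
    unsorted_objects = []

    def check(pairs):
        # `pairs` arrives in file order, so it reflects what was written to disk.
        keys = [key for key, _ in pairs]
        if keys != sorted(keys):
            unsorted_objects.append(keys)
        return dict(pairs)

    # object_pairs_hook is called once per JSON object, including nested ones.
    json.loads(text, object_pairs_hook=check)
    return not unsorted_objects
```

Applied to each snapshot file's text, this would return `True` for a key-sorted document such as `{"a": 1, "b": {"c": 2}}` and `False` for one written in insertion order such as `{"b": 1, "a": 2}`.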
### canary

```
$ ls audit-runs/phase-b-state-equivalence/snap-001/canary/
config.json  cpu_state.json  kernel.json  manifest.json  memory.json  vfs.json
```

Same six files, same shape. Note: canary's `phase_b_snapshot.cc` writes JSON via direct `fmt::format` rather than a JSON map, so keys are emitted in **insertion order, not alphabetical order**. The diff tool parses each file into a dict before comparing, so this asymmetry has no functional impact (verified empirically: `diff_state.py` produces identical reports across multiple runs of either engine). It does mean the canary↔ours manifest hashes differ even when the underlying state is semantically identical; the diff tool falls back to full content comparison in that case. **PASS** with this caveat documented.

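To illustrate why key order changes the file hash but not the parsed comparison, here is a small Python sketch. It is not taken from `diff_state.py`; the helper names and the two sample documents are hypothetical, and only the `schema_version` key and the PC value come from this record.

```python
import hashlib
import json


def raw_sha256(text: str) -> str:
    """Hash the serialized bytes, which is sensitive to key order."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def semantically_equal(a: str, b: str) -> bool:
    """Compare parsed documents; dict equality ignores key order."""
    return json.loads(a) == json.loads(b)


# Same content, different key order (insertion order vs. sorted order).
canary_text = '{"schema_version": 1, "pc": "0x824ab748"}'
ours_text = '{"pc": "0x824ab748", "schema_version": 1}'

hashes_match = raw_sha256(canary_text) == raw_sha256(ours_text)
content_matches = semantically_equal(canary_text, ours_text)
```

Here `hashes_match` is `False` while `content_matches` is `True`, which is the situation where a hash shortcut cannot be used and a full content comparison is needed.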
## Gate 3: Hash-deterministic re-runs (ours)

Two runs of ours with identical args:

```
$ ./target/release/xenia-rs-phaseB exec --quiet \
    --phase-b-snapshot-dir <dir> --phase-b-snapshot-and-exit <iso>   # run 1
$ mv <dir>/ours <dir>/ours-a
$ ./target/release/xenia-rs-phaseB exec --quiet \
    --phase-b-snapshot-dir <dir> --phase-b-snapshot-and-exit <iso>   # run 2
$ diff -r <dir>/ours <dir>/ours-a && echo BYTE-IDENTICAL
BYTE-IDENTICAL
```

**PASS.** Re-running ours with the same args produces hash-identical snapshot files.

> The first re-run attempt produced a `config.json` mismatch because the
> two runs were given different `--phase-b-snapshot-dir` values (whose
> path string is embedded in `config.json::cvars.phase_b_snapshot_dir`).
> That field is in the diff tool's `SKIP_BY_FILE["config.json"]` skip
> set; the hash difference confirmed the skip rule is well-placed. With
> identical inputs the snapshots are byte-equal.

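A skip rule of this kind can be sketched as follows. This is an assumption about how such a rule might work, not the actual contents of `diff_state.py`: the tuple-of-keys path representation and the `strip_skipped` helper are hypothetical, while the `SKIP_BY_FILE` name and the `cvars.phase_b_snapshot_dir` path come from the note above.

```python
import copy

# Hypothetical representation: file name -> set of key paths to ignore.
SKIP_BY_FILE = {
    "config.json": {("cvars", "phase_b_snapshot_dir")},
}


def strip_skipped(filename: str, doc: dict) -> dict:
    """Return a deep copy of `doc` with the skipped key paths removed."""
    cleaned = copy.deepcopy(doc)
    for path in SKIP_BY_FILE.get(filename, ()):
        parent = cleaned
        # Walk down to the dict that holds the final key, if it exists.
        for key in path[:-1]:
            parent = parent.get(key) if isinstance(parent, dict) else None
            if parent is None:
                break
        if isinstance(parent, dict):
            parent.pop(path[-1], None)
    return cleaned
```

With two `config.json` documents that differ only in `cvars.phase_b_snapshot_dir`, the raw documents compare unequal but their stripped copies compare equal, so the run-specific path no longer registers as a divergence.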
## Gate 4: Invariants (HARD GATE)

From `report.md`:

| invariant | canary | ours | ok? |
|---|---|---|---|
| xex_entry_point | `0x824ab748` | `0x824ab748` | **PASS** |
| cpu_state.pc == xex_entry_point | `0x824ab748 == 0x824ab748` (canary) | `0x824ab748 == 0x824ab748` (ours) | **PASS** |
| image_loaded_sha256 | `a70993b7…` | `ea8d160e…` | **FAIL → STOP** |

The PC + entry-point invariants prove the snapshot point is **equivalent across engines**: both fired immediately before the first instruction at the same address. This is the principal Phase B equivalence claim.

The `image_loaded_sha256` mismatch is the **expected STOP condition** per the spec. Phase B's contract is to detect and report this; investigation belongs to Phase C/D. `report.md` flags it explicitly with re-run guidance.

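The three invariants in the table can be expressed as a short check. The sketch below is illustrative only: the flat dict layout and the `check_invariants` name are assumptions, and the hash strings are kept truncated exactly as they appear in `report.md`.

```python
def check_invariants(canary: dict, ours: dict) -> list:
    """Return (invariant name, passed) pairs for the hard-gate invariants."""
    return [
        ("xex_entry_point",
         canary["xex_entry_point"] == ours["xex_entry_point"]),
        ("cpu_state.pc == xex_entry_point",
         canary["pc"] == canary["xex_entry_point"]
         and ours["pc"] == ours["xex_entry_point"]),
        ("image_loaded_sha256",
         canary["image_loaded_sha256"] == ours["image_loaded_sha256"]),
    ]


# Values as reported above; the hashes are truncated in the source report.
canary = {"xex_entry_point": "0x824ab748", "pc": "0x824ab748",
          "image_loaded_sha256": "a70993b7…"}
ours = {"xex_entry_point": "0x824ab748", "pc": "0x824ab748",
        "image_loaded_sha256": "ea8d160e…"}

failed = [name for name, ok in check_invariants(canary, ours) if not ok]
```

With these inputs `failed` contains only `image_loaded_sha256`, matching the STOP condition in the table; a hard gate would treat any non-empty `failed` list as a reason to halt.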
## Gate 5: Diff-tool negative test

```
$ cp audit-runs/phase-b-state-equivalence/snap-001/ours/kernel.json /tmp/kernel-mut.json
$ sed -i 's/"thread_id": 1/"thread_id": 999/' /tmp/kernel-mut.json
$ mkdir -p /tmp/ours-mut && cp -r audit-runs/phase-b-state-equivalence/snap-001/ours/* /tmp/ours-mut/
$ cp /tmp/kernel-mut.json /tmp/ours-mut/kernel.json
$ python3 tools/diff-state/diff_state.py \
    --canary audit-runs/phase-b-state-equivalence/snap-001/ours \
    --ours /tmp/ours-mut --out /tmp/r.md
$ echo $?
1
```

`report.md` names two divergences:
- `kernel.json <manifest>` `manifest-hash-mismatch`: surfaces that `/tmp/ours-mut/kernel.json`'s SHA does not match what `/tmp/ours-mut/manifest.json` claims.
- `kernel.json objects[handle_semantic_id=…].details.thread_id` value=`canary=1, ours=999`: the actual mutation.

**PASS.**

> Verified 2026-05-13 (Phase A/B verify session). Pre-fix the diff tool
> trusted the manifest-claimed hashes without verifying them; a tampered
> file with an intact manifest copy would silently report "identical"
> (exit 0). The fix in [`diff_state.py`](../../tools/diff-state/diff_state.py)
> (around `diff_directory`) re-hashes each file, surfaces a
> `manifest-hash-mismatch` σ-structural divergence when the on-disk SHA
> does not match the manifest, and falls through to a full content diff.

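The re-hash step described in the note can be sketched as below. This is not the code in `diff_state.py`: the manifest layout (a top-level `"files"` object mapping file name to SHA-256 hex digest) and the function name `find_manifest_mismatches` are assumptions made for illustration.

```python
import hashlib
import json
from pathlib import Path


def find_manifest_mismatches(snapshot_dir) -> list:
    """Return names of files whose on-disk SHA-256 differs from the manifest's claim."""
    root = Path(snapshot_dir)
    manifest = json.loads((root / "manifest.json").read_text(encoding="utf-8"))
    # Assumed layout: {"files": {"kernel.json": "<sha256 hex>", ...}}
    claimed = manifest["files"]
    mismatches = []
    for name, claimed_sha in sorted(claimed.items()):
        actual_sha = hashlib.sha256((root / name).read_bytes()).hexdigest()
        if actual_sha != claimed_sha:
            mismatches.append(name)
    return mismatches
```

Under these assumptions, a snapshot directory whose `kernel.json` was edited after the manifest was written would yield `["kernel.json"]`, which a diff tool could report as a `manifest-hash-mismatch` before falling through to a full content diff.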
## Summary

| Gate | Status |
|---|---|
| 1. Cvar-OFF determinism (both engines) | PASS |
| 2. Snapshots well-formed (both engines) | PASS |
| 3. Hash-deterministic re-runs (ours) | PASS |
| 4. Invariants: pc == entry_point | PASS |
| 4. Invariants: image_loaded_sha256 | **FAIL → STOP** (expected: this is what Phase B catalogs) |
| 5. Diff-tool negative test | PASS |

## Cascade prediction at session close

- A (snapshot tool emits readable state for both engines): **achieved**.
- B (section content hashes match): **NOT achieved**: `image_loaded_sha256` differs. The two engines load the XEX into different post-decompression states. This is the primary finding that Phase C will investigate, *not* a Phase B failure.
- C (divergence catalog produced with classification): **achieved**: 58 divergences across all 5 files, fully classified.
- D (fix lands): **N/A, out of scope for Phase B**.

## Notes on minor implementation choices

- Canary's PPCContext doesn't expose a `pc` field (the JIT dispatch loop manages PC). At the snapshot point the about-to-execute PC equals the `address` arg to `processor()->Execute(...)`, which the hook receives as `entry_address`; we emit that value as `cpu_state.pc`.
- Memory snapshots emit a **fixed named-region list** (XEX image, main stack, PCR, TLS) rather than walking the full page table. An earlier blanket-walk approach crashed in Wine because canary's `QueryRegionInfo` reports `COMMIT` for some pages whose host-side backing is reserved-not-committed (physical heap mirrors, low system heap). The named-region list is sufficient for the diff tool's cross-engine comparison.
- The `xex_header_sha256` field uses a different format in each engine (canary emits a 64-bit `UserModule::hash()`; ours emits a placeholder zero string). This is a known one-line shim that Phase B intentionally leaves as a divergence to demonstrate the diff tool's δ-content class.