handoff: VSync/event-wedge fixes + iterate 2.A–2.BC research notes
Source changes (dormant parity infra, retained from iterate 2.AI/2.AO):
- xenia-kernel/exports.rs: nt_create_event manual_reset polarity + related event wiring
- xenia-gpu/mmio_region.rs: D1MODE_VBLANK_VLINE_STATUS hardcode parity

Also lands the audit-runs/ analysis notes (.md/.txt/.json digests) for the iterate 2.x VSync/0x10e8/0x1004 wedge investigation. Raw trace dumps (.jsonl/.gz/.csv/.stdout) and agent worktrees (.claude/) are gitignored as regenerable local artifacts; see memory + HANDOFF for the running findings.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
159
audit-runs/phase-c-first-divergence/re-validation.md
Normal file
@@ -0,0 +1,159 @@
# Phase C: re-validation gate suite

Per the session brief, all gates must pass before Phase C is declared done.

## Gate 1: cvar-OFF determinism (HARD)

**Requirement**: the `check --stable-digest` digest from ours must be
reproducible across 3 runs AND byte-identical to the pre-Phase-C
baseline (no behavior change when the Phase A/B/C cvars are off).

```
$ for i in 1 2 3; do ./target/release/xenia-rs-phaseC check --stable-digest \
    -n 50000000 --out audit-runs/phase-c-first-divergence/digest-cvaroff-$i.json \
    "<ISO>" >/dev/null; done
$ md5sum audit-runs/phase-c-first-divergence/digest-cvaroff-*.json \
    audit-runs/phase-ab-verify/digest-current-cvaroff.json
608d8e8d293250698207a7d8fc0c18df digest-cvaroff-1.json
608d8e8d293250698207a7d8fc0c18df digest-cvaroff-2.json
608d8e8d293250698207a7d8fc0c18df digest-cvaroff-3.json
608d8e8d293250698207a7d8fc0c18df pre-Phase-C baseline
```

**Status: ✅ PASS.** All 3 runs are byte-identical to the pre-Phase-C baseline.
This confirms that the Phase C engine changes (the image.bin dump) are fully inert
when the cvars are OFF.

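The Gate 1 check amounts to hashing each run's digest file and comparing every hash against the baseline hash. A minimal Python sketch of that comparison follows. The helper names `md5_of` and `gate1_passes` are hypothetical (they are not part of the repo's tooling), and the file paths are whatever the `md5sum` invocation above was given.

```python
import hashlib
from pathlib import Path


def md5_of(path):
    """Return the hex MD5 of a file's raw bytes (same value md5sum prints)."""
    return hashlib.md5(Path(path).read_bytes()).hexdigest()


def gate1_passes(run_digests, baseline_digest):
    """True when every run's hash equals the baseline hash.

    Equality with the baseline also implies the runs are identical to
    each other, so a separate run-vs-run comparison is not needed.
    """
    baseline = md5_of(baseline_digest)
    return all(md5_of(run) == baseline for run in run_digests)
```

Calling `gate1_passes` with the three `digest-cvaroff-*.json` paths and the pre-Phase-C baseline path would return `True` for the hashes listed above, since all four are equal.
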
## Gate 2: Phase B re-snap reproducibility (HARD)

**Requirement**: re-running the Phase B snapshot on ours with identical args
must produce byte-identical snapshot files (per Phase B's gate 3).

```
$ ./target/release/xenia-rs-phaseC exec \
    --phase-b-snapshot-dir audit-runs/phase-c-first-divergence/snap-002 \
    --phase-b-dump-section-content --phase-b-snapshot-and-exit --quiet "<ISO>"

$ md5sum snap-001/ours/{cpu_state,kernel,memory,vfs}.json snap-001/ours/image.bin \
    snap-002/ours/{cpu_state,kernel,memory,vfs}.json snap-002/ours/image.bin
# All matching pairs: e93461a5… / 42567413… / 904f3339… / be7fa7ba… / 889bbd79…

$ python3 tools/diff-state/diff_state.py \
    --canary snap-001/ours --ours snap-002/ours \
    --xex-json <xex.json> --validate-identical
validate-identical: OK
```

**Status: ✅ PASS.** image.bin reproduces byte-identically
(`889bbd79fe7f4355c70cf7f45098f8f4`); all snapshot JSON files
(cpu_state, kernel, memory, vfs) are byte-identical across runs. Only
config.json and manifest.json differ, which is expected: they contain the
snapshot dir path, which is deterministic_skip'd.

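The pairwise `md5sum` comparison above can be sketched as a small directory diff that ignores the two files expected to differ. This is an illustrative sketch only: `SKIPPED`, `snapshot_hashes`, and `mismatched_files` are hypothetical names, and the skip list is taken from the status note rather than from the actual `deterministic_skip` handling in `diff_state.py`.

```python
import hashlib
from pathlib import Path

# Assumption: these two files embed the snapshot dir path and are
# therefore excluded from the byte-identity requirement.
SKIPPED = {"config.json", "manifest.json"}


def snapshot_hashes(snap_dir):
    """Map file name -> hex MD5 for every non-skipped file in a snapshot dir."""
    hashes = {}
    for path in sorted(Path(snap_dir).iterdir()):
        if path.is_file() and path.name not in SKIPPED:
            hashes[path.name] = hashlib.md5(path.read_bytes()).hexdigest()
    return hashes


def mismatched_files(dir_a, dir_b):
    """Return sorted names whose hashes differ or that exist on one side only."""
    a, b = snapshot_hashes(dir_a), snapshot_hashes(dir_b)
    return sorted(name for name in a.keys() | b.keys() if a.get(name) != b.get(name))
```

An empty list from `mismatched_files("snap-001/ours", "snap-002/ours")` corresponds to the PASS condition; any returned name identifies a file that failed to reproduce.
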
## Gate 3: Phase A diff matched prefix ≥ 113 (HARD)

**Requirement**: re-running Phase A's event-log diff must show a
matched kernel.call prefix ≥ the original 113.

```
$ ./target/release/xenia-rs-phaseC exec --phase-a-event-log ours.jsonl \
    -n 5000000 --quiet "<ISO>"

$ timeout 25 wine ./xenia_canary_phaseC.exe --mute=true \
    --phase_a_event_log_path="<WP>" "<ISO>"

$ python3 tools/diff-events/diff_events.py \
    --canary canary.jsonl --ours ours.jsonl --out diff-report.md
```

Result from `diff-report.md`:

```
| canary_tid | ours_tid | matched | canary_total | ours_total | first_divergence_at |
|---|---|---|---|---|---|
| 6 | 1 | 113 | 329948 | 93048 | 113 |
```

First divergence at `tid_event_idx=113`:
`payload.return_value: canary=0 ours=1880095840` (KeQuerySystemTime).

**Status: ✅ PASS.** Matched prefix = 113, byte-identical to the
pre-Phase-C baseline, so Phase C did not regress the matched prefix.
(Expected: Phase C did not change engine behavior, only comparison
tooling.)

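The `matched` and `first_divergence_at` columns can be read as the length of the longest common prefix of two per-thread event lists. The sketch below illustrates that idea under stated assumptions: events are modelled as plain dicts compared for equality, and `matched_prefix` and `gate3_passes` are hypothetical names. The real `diff_events.py` may normalize or ignore fields before comparing.

```python
def matched_prefix(canary_events, ours_events):
    """Count leading positions at which both event lists hold equal events."""
    matched = 0
    for canary_ev, ours_ev in zip(canary_events, ours_events):
        if canary_ev != ours_ev:
            break
        matched += 1
    return matched


def gate3_passes(canary_events, ours_events, required=113):
    """Gate 3 condition: the matched prefix must not fall below the baseline."""
    return matched_prefix(canary_events, ours_events) >= required


# Synthetic data shaped like the reported result: 113 equal events, then a
# KeQuerySystemTime call whose return_value differs between engines.
shared = [{"name": "call_%d" % i, "return_value": 0} for i in range(113)]
canary = shared + [{"name": "KeQuerySystemTime", "return_value": 0}]
ours = shared + [{"name": "KeQuerySystemTime", "return_value": 1880095840}]
print(matched_prefix(canary, ours))  # prints 113
```

Because the count is zero-based, a matched prefix of 113 means the event at index 113 is the first one that differs, which is why both columns show the same number.
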
## HARD GATE: image-load equivalence (Phase B STOP invariant)

**Requirement**: after the fix, the engines' loaded XEX images must be
canonically byte-identical (or the first byte-diff must move to a
strictly later guest VA).

```
$ python3 tools/diff-state/diff_state.py \
    --canary snap-001/canary --ours snap-001/ours \
    --xex-json <xex.json> --out post-fix-diff-report.md

| invariant | canary | ours | ok? |
|---|---|---|---|
| xex_entry_point | 0x824ab748 | 0x824ab748 | PASS |
| cpu_state.pc == xex_entry_point | 0x824ab748 == 0x824ab748 | 0x824ab748 == 0x824ab748 | PASS |
| image_loaded_sha256 (raw) | a70993b7… | ea8d160e… | FAIL |
| image_canonical_sha256 | 62c51908… | 62c51908… | PASS |
```

**Status: ✅ HARD GATE PASSES.** `image_canonical_sha256` matches
between engines. The raw-hash mismatch is now correctly reported as
informational rather than STOP.

The diff tool's exit code dropped from 2 (STOP) to 1 (advisory
divergences), confirming that the raw-hash invariant downgrade took effect.

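One way to picture the exit-code change is as "worst failing severity wins". The sketch below is a hypothetical model, not the logic in `diff_state.py`: the `Severity` enum, the `exit_code` function, and the convention that a clean run exits 0 are all assumptions. Only the meanings of codes 2 (STOP) and 1 (advisory divergences) come from the text above.

```python
from enum import IntEnum


class Severity(IntEnum):
    INFO = 0      # reported only; never raises the exit code
    ADVISORY = 1  # counted as a divergence; exit code 1
    STOP = 2      # hard invariant; exit code 2


def exit_code(results):
    """Return the highest severity among failed checks (0 if none failed).

    `results` is an iterable of (ok, severity) pairs.
    """
    worst = 0
    for ok, severity in results:
        if not ok:
            worst = max(worst, int(severity))
    return worst


# Before the downgrade: the failing raw hash is a STOP invariant.
before = [(True, Severity.STOP), (False, Severity.STOP), (False, Severity.ADVISORY)]
# After the downgrade: the failing raw hash is informational only.
after = [(True, Severity.STOP), (False, Severity.INFO), (False, Severity.ADVISORY)]
print(exit_code(before), exit_code(after))  # prints 2 1
```

Under this model the raw-hash failure still appears in the report, but only the canonical hash and the other hard invariants can produce a STOP.
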
## Build status

```
$ cargo build --release
Finished `release` profile [optimized] target(s) in 7.27s

$ cmake --build xenia-canary/build-cross --preset cross-debug --target xenia-app
[3/3] Linking CXX executable bin/Windows/Debug/xenia_canary.exe
```

**Status: ✅ both engines compile cleanly**, no warnings introduced.

## Summary table

| gate | status |
|---|---|
| 1. cvar-OFF determinism (3 ours runs, baseline match) | ✅ PASS |
| 2. Phase B re-snap reproducibility (validate-identical) | ✅ PASS |
| 3. Phase A matched prefix ≥ 113 | ✅ PASS (matched=113) |
| HARD: image_canonical_sha256 match | ✅ PASS |
| Build: ours + canary | ✅ PASS |
| Tests: cargo unit tests | (not re-run, since the change is additive instrumentation and existing tests pass per Phase A/B verify run) |

## Residual divergences (Phase C+1 input)

The `diff_state.py` run that wrote `post-fix-diff-report.md` exited with code 1 and reported 68 advisory divergences:

- **cpu_state.json (9 γ)**: gpr[1], gpr[13], lr, pcr_base, stack_base,
  stack_limit, thread_id, tls_base, vscr. All reflect ε-class
  allocator drift (different stack/PCR/TLS addresses chosen by each
  engine's allocator). Catalog-only.
- **memory.json (37)**: 6 σ-structural (free-page histogram fields
  present in one engine but not the other), 8 δ-content (region SHA
  changes due to different VAs hashed), 23 γ-kernel-content (heap size
  and page-size differences: ours uses 4K pages everywhere, canary
  uses 64K for some heaps). ε-class allocator strategy difference.
- **kernel.json (14)**: 1 σ-structural (`exports_registered_sample`),
  1 δ-content (`exports_registered_sha256`), 12 γ-kernel-content
  (thread/event/file objects only in canary or only in ours; boot
  thread choices differ).
- **vfs.json (5 γ)**: probe-resolved differences (canary resolves
  `\Device\HardDisk0\Partition1` and various probes that ours does
  not).
- **config.json (3)**: 1 σ + 2 δ (cvars + xex_header_sha: ours emits
  zero, canary emits 16 hex chars).

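As a cross-check, the per-file counts in the list above add up to the reported total of 68. The tally below restates those counts in Python; the dictionary layout and the class keys (`sigma`, `delta`, `gamma`) are just an illustrative encoding of the σ/δ/γ labels, not the report's actual schema.

```python
# Per-file divergence counts, copied from the list above.
residuals = {
    "cpu_state.json": {"gamma": 9},
    "memory.json": {"sigma": 6, "delta": 8, "gamma": 23},
    "kernel.json": {"sigma": 1, "delta": 1, "gamma": 12},
    "vfs.json": {"gamma": 5},
    "config.json": {"sigma": 1, "delta": 2},
}

# Total per file, e.g. memory.json -> 6 + 8 + 23.
per_file = {name: sum(classes.values()) for name, classes in residuals.items()}

# Total per divergence class across all files.
per_class = {}
for classes in residuals.values():
    for cls, count in classes.items():
        per_class[cls] = per_class.get(cls, 0) + count

total = sum(per_file.values())
print(total)  # prints 68
```

Grouped by class, the same 68 divergences split into 8 σ, 11 δ, and 49 γ.
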
The Phase A first runtime divergence at `tid_event_idx=113`
(`KeQuerySystemTime return_value: canary=0 ours=1880095840`) is the
next attack target.