handoff: VSync/event-wedge fixes + iterate 2.A–2.BC research notes
Source changes (dormant parity infra, retained from iterate 2.AI/2.AO):
- xenia-kernel/exports.rs: nt_create_event manual_reset polarity + related event wiring
- xenia-gpu/mmio_region.rs: D1MODE_VBLANK_VLINE_STATUS hardcode parity

Also lands the audit-runs/ analysis notes (.md/.txt/.json digests) for the iterate 2.x VSync/0x10e8/0x1004 wedge investigation. Raw trace dumps (.jsonl/.gz/.csv/.stdout) and agent worktrees (.claude/) are gitignored as regenerable local artifacts; see memory + HANDOFF for the running findings.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
audit-runs/phase-c3-RtlImageXexHeaderField/re-validation.md
@@ -0,0 +1,117 @@
# Phase C+3 — re-validation
## Gate 1 — Determinism (cvar-OFF, ours)

3 fresh runs of `check -n 50000000 --stable-digest`:

| run | digest md5 |
|-----|------------|
| 1 | `f7b035298e7e2d09d413c1457c6c6fa1` |
| 2 | `f7b035298e7e2d09d413c1457c6c6fa1` |
| 3 | `f7b035298e7e2d09d413c1457c6c6fa1` |
| Phase C/C+1/C+2 baseline | `608d8e8d293250698207a7d8fc0c18df` |

**Result**: ✅ byte-identical across 3 runs. The new baseline `f7b03529…`
diverges from the C+2 baseline `608d8e8d…`, as expected per Tripstone #4
("a real return-value fix in ours likely shifts the boot trajectory; the
baseline digest WILL change"). The fix is deterministic: it only adds a
one-shot `alloc_zero` + `mem.write_bulk` at startup using bytes from
the on-disk XEX header, so no entropy source is introduced.

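
The determinism gate above amounts to "produce the digest N times and require a single distinct md5". A minimal Python sketch of that check follows. The helper names (`md5_hex`, `check_determinism`) and the `produce_digest` callable are hypothetical and not part of the repository; in a real run the callable would invoke the engine's `check -n 50000000 --stable-digest` command and return its digest bytes.

```python
import hashlib
from typing import Callable, List


def md5_hex(data: bytes) -> str:
    """Return the hex md5 of a byte string."""
    return hashlib.md5(data).hexdigest()


def check_determinism(produce_digest: Callable[[], bytes], runs: int = 3) -> List[str]:
    """Call produce_digest `runs` times and require identical md5 sums.

    produce_digest is a stand-in (assumption) for whatever captures the
    stable digest of one fresh engine run.
    """
    digests = [md5_hex(produce_digest()) for _ in range(runs)]
    if len(set(digests)) != 1:
        raise AssertionError(f"non-deterministic digests: {digests}")
    return digests


if __name__ == "__main__":
    # Toy producer: always returns the same bytes, so the gate passes.
    for i, d in enumerate(check_determinism(lambda: b"abc"), start=1):
        print(i, d)
```

Comparing the set of digests rather than adjacent pairs keeps the check independent of run order and reports every digest when a mismatch occurs.
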
## Gate 2 — Phase B `image_canonical_sha256`

Not re-snapshotted. Inferred OK by code review: the fix touches only

* `KernelState::xex_header_guest_ptr` (new field, no interaction with image),
* `xenia-app::cmd_exec` (post-image-load `alloc_zero` into a fresh
  region in `0x4xxxxxxx`; doesn't touch `mem.write_bulk(base,
  &image_data)` at line 888),
* the `rtl_image_xex_header_field` handler (read-only),
* `diff_events.py` (Python tool; no engine effect).

The PE image region `[base..base+image_size]` is byte-identical pre-
and post-fix.

## Gate 3 — Phase A matched-prefix extension (THE KEY METRIC)

Diffed `audit-runs/phase-c3-RtlImageXexHeaderField/ours.jsonl` against
the existing `phase-c-first-divergence/phase-a/canary.jsonl`.

With allocator canonicalization (default):

| chain | C+2 (pre-C3) | C+3 (post) | Δ |
|---|---|---|---|
| canary tid=6 → ours tid=1 (main) | 102014 | **102032** | **+18** |
| canary tid=4 → ours tid=11 | 5 | 5 | 0 |
| canary tid=7 → ours tid=2 | 2 | 2 | 0 |
| canary tid=12 → ours tid=7 | 2 | 2 | 0 |
| canary tid=14 → ours tid=9 | 11 | 11 | 0 |
| canary tid=15 → ours tid=10 | (no div) | (no div) | 0 |

**Main thread matched prefix grew from 102014 to 102032. Gate 3 ✅.**

The new first divergence at idx=102032 is `XeKeysConsolePrivateKeySign`
(canary returns 1, ours returns 0). That is the next Phase C+N target
and is out of scope here.

With `--no-canonicalize-allocators` (backward-compat check):
matched=161, same as Phase C+1, because the `MmAllocatePhysicalMemoryEx`
divergence at idx=161 dominates without canonicalization. With BOTH
allocator + xex-header canonicalization, prefix reaches 102032.

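
To illustrate why the matched prefix depends on allocator canonicalization, here is a small Python sketch of a matched-prefix computation. It is not the code in `diff_events.py`: the event shape (`name`, `ret`), the `alloc#N` ordinal scheme, and the function names are assumptions made for illustration. Only `MmAllocatePhysicalMemoryEx` is taken from the notes.

```python
from typing import Dict, Iterable, List

# Allocator exports whose raw return addresses are expected to differ between
# engines. Only this one name comes from the notes; the real set may be larger.
ALLOCATORS = {"MmAllocatePhysicalMemoryEx"}


def canonicalize(events: Iterable[dict]) -> List[dict]:
    """Replace raw allocator return addresses with first-seen ordinals."""
    ids: Dict[int, str] = {}
    out = []
    for ev in events:
        ev = dict(ev)  # copy so the caller's events are not mutated
        if ev["name"] in ALLOCATORS:
            ev["ret"] = ids.setdefault(ev["ret"], f"alloc#{len(ids)}")
        out.append(ev)
    return out


def matched_prefix(a: List[dict], b: List[dict], canonical: bool = True) -> int:
    """Return the number of leading events that are equal in both streams."""
    if canonical:
        a, b = canonicalize(a), canonicalize(b)
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


if __name__ == "__main__":
    canary = [
        {"name": "A", "ret": 0},
        {"name": "MmAllocatePhysicalMemoryEx", "ret": 0xA0000000},
        {"name": "B", "ret": 1},
    ]
    ours = [
        {"name": "A", "ret": 0},
        {"name": "MmAllocatePhysicalMemoryEx", "ret": 0xB0000000},
        {"name": "B", "ret": 0},
    ]
    print(matched_prefix(canary, ours))                   # canonicalized
    print(matched_prefix(canary, ours, canonical=False))  # raw addresses
```

In this toy input the raw streams diverge at the allocator call, while the canonicalized streams match through it and diverge only at the real return-value difference, which mirrors the 161 versus 102032 behaviour described above.
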
## Gate 4 — Build

```
$ cargo build --release -p xenia-app
   Compiling xenia-kernel v0.1.0
   Compiling xenia-app v0.1.0
    Finished `release` profile [optimized] target(s) in 6.17s
```

One pre-existing dead-code warning (`walk_committed_regions`); not
introduced by this fix. Canary untouched.

## Gate 5 — Phase A determinism (emitter)

Two cvar-ON captures of the same engine binary on the same ISO,
md5-summing only deterministic fields (excluding `host_ns`):

```
ours.jsonl            (run 1, deterministic-fields-only)  714f06373f2f8f0e2f2bb5f1082da862
/tmp/c3_pa_run2.jsonl (run 2, det-fields-only)            714f06373f2f8f0e2f2bb5f1082da862
```

Byte-identical. ✅

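
One way to md5-sum only the deterministic fields of a JSONL capture is sketched below in Python. This is an illustration under assumptions, not the command actually used for the figures above: the function name, the key-sorted re-serialization, and every field name other than `host_ns` are hypothetical.

```python
import hashlib
import json

# Fields excluded from the digest. `host_ns` is named in the notes; any
# further entries would be assumptions.
NONDETERMINISTIC_FIELDS = {"host_ns"}


def deterministic_md5(jsonl_text: str) -> str:
    """md5 over a JSONL capture with non-deterministic fields removed.

    Each non-blank line is assumed to hold one JSON object. Objects are
    re-serialized with sorted keys so key order cannot affect the digest.
    """
    h = hashlib.md5()
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue
        event = json.loads(line)
        for field in NONDETERMINISTIC_FIELDS:
            event.pop(field, None)
        h.update(json.dumps(event, sort_keys=True, separators=(",", ":")).encode("utf-8"))
        h.update(b"\n")
    return h.hexdigest()


if __name__ == "__main__":
    run1 = '{"idx": 0, "name": "A", "ret": 0, "host_ns": 111}\n'
    run2 = '{"idx": 0, "name": "A", "ret": 0, "host_ns": 999}\n'
    print(deterministic_md5(run1) == deterministic_md5(run2))
```

Dropping the field before hashing, rather than comparing full lines, is what lets two captures with different host timestamps produce the same digest.
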
## Gate 6 — `--no-canonicalize-allocators` backward-compat

Diff with the flag set reproduces the Phase C+1 baseline result of
**matched=161** (MmAllocatePhysicalMemoryEx divergence at idx=161).
This confirms the canonicalization is purely additive at the diff-tool
level and the engine fix doesn't disturb the raw-VA stream upstream.

## Gate 7 — Kernel unit tests

```
$ cargo test --release -p xenia-kernel
test result: ok. 129 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out
```

✅. Two new tests would be a logical addition (validating that
`rtl_image_xex_header_field` returns the right value for each
key-class), but they were kept out of this session's scope per
"minimal fix".

## Summary

All 7 gates pass (Gate 2 by code review rather than a fresh snapshot).
Phase A main matched prefix grew from 102014 to 102032 (+18 events).
The fix is symmetric: canary calls `UserModule::GetOptHeader` on its
in-guest header copy via the
`XexExecutableModuleHandle → hmodule_ptr → +0x58 → xex_header_base`
chain; ours now performs the same lookup against its own in-guest
header copy, with a `KernelState::xex_header_guest_ptr` fallback when
the chain yields NULL. In ours the chain does yield NULL, because the
LDR walk goes through `*XexExecutableModuleHandle = image_base` (see
the investigation notes for why fixing the LDR is Phase-A-regressing).

Next divergence: **XeKeysConsolePrivateKeySign @ tid_event_idx=102032**
(canary returns 1, ours returns 0). By analogy with this session, the
likely class is (A) a missing handler or (B) a stub returning 0. This
is the Phase C+4 target.

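
The chain-with-fallback lookup described in the summary can be modelled in a few lines of Python. This is a toy model only: the real handler lives in the Rust kernel crate, and the `GuestMemory` class, the function name, and the choice to treat unmapped reads as 0 are all assumptions. Only the `+0x58` offset and the order of the chain come from the notes.

```python
from typing import Dict, Optional

HEADER_PTR_OFFSET = 0x58  # offset named in the notes


class GuestMemory:
    """Toy guest memory: a dict of address -> u32 (hypothetical model).

    Unmapped addresses read as 0 here, purely to keep the sketch small.
    """

    def __init__(self) -> None:
        self.words: Dict[int, int] = {}

    def write_u32(self, addr: int, value: int) -> None:
        self.words[addr] = value & 0xFFFFFFFF

    def read_u32(self, addr: int) -> int:
        return self.words.get(addr, 0)


def resolve_xex_header(mem: GuestMemory, module_handle_addr: int,
                       fallback_ptr: int) -> Optional[int]:
    """Follow handle -> hmodule_ptr -> +0x58 -> header base, else fall back."""
    hmodule_ptr = mem.read_u32(module_handle_addr)
    header = mem.read_u32(hmodule_ptr + HEADER_PTR_OFFSET) if hmodule_ptr else 0
    if header:
        return header
    # Chain yielded NULL: use the pointer cached at startup, if any.
    return fallback_ptr or None


if __name__ == "__main__":
    mem = GuestMemory()
    mem.write_u32(0x1000, 0x2000)                      # handle -> hmodule_ptr
    mem.write_u32(0x2000 + HEADER_PTR_OFFSET, 0x3000)  # hmodule_ptr+0x58 -> header
    print(hex(resolve_xex_header(mem, 0x1000, 0x4000)))
```

Trying the pointer chain first and consulting the cached pointer only on NULL keeps the lookup order the same as the one the notes attribute to canary, while still returning a usable header when the chain is not populated.
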