handoff: VSync/event-wedge fixes + iterate 2.A–2.BC research notes
Source changes (dormant parity infra, retained from iterate 2.AI/2.AO):
- xenia-kernel/exports.rs: nt_create_event manual_reset polarity + related event wiring
- xenia-gpu/mmio_region.rs: D1MODE_VBLANK_VLINE_STATUS hardcode parity

Also lands the audit-runs/ analysis notes (.md/.txt/.json digests) for the iterate 2.x VSync/0x10e8/0x1004 wedge investigation. Raw trace dumps (.jsonl/.gz/.csv/.stdout) and agent worktrees (.claude/) are gitignored as regenerable local artifacts; see memory + HANDOFF for the running findings.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
142
audit-runs/phase-c6half-sister-sweep/re-validation.md
Normal file
@@ -0,0 +1,142 @@
# Phase C+6½ — re-validation (HARD GATES)
## Gate 1 — Determinism (cvar-OFF, ours)

3 fresh runs of `check -n 50000000 --stable-digest` post-fix:

| run | digest md5 |
|-----|------------|
| 1 | `c6d895829b4611964978990ae1cb8a6a` |
| 2 | `c6d895829b4611964978990ae1cb8a6a` |
| 3 | `c6d895829b4611964978990ae1cb8a6a` |
| C+6 baseline (cvar-OFF) | `c6d895829b4611964978990ae1cb8a6a` |

**Result**: ✅ byte-identical across 3 runs AND **UNCHANGED from C+6**.

Why the digest is unchanged despite the Phase 2 behavior fixes:
* The Phase 1 sister sweep is purely emitter-suppression (inert when cvar-OFF).
* Phase 2 fixes ord 0x82 (`ke_query_interrupt_time`) and 0x98
  (`stub_success`), but neither ord is called in the 50M-instr window
  (verified: 0 hits in the event log for both names, before AND after).
  The new bodies are LATENT for now.

Cvar-OFF baseline preserved. No regression to determinism.
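
The pass condition for this gate (every run digest equal to the C+6 baseline) can be sketched as a small check. This is a minimal illustration only: `first_mismatching_run` and `determinism_gate_passes` are hypothetical helper names, not part of the repo's tooling.

```rust
/// Hypothetical gate check: every run digest must equal the baseline digest.
/// Returns the 1-based number of the first run that differs, if any.
pub fn first_mismatching_run(run_digests: &[&str], baseline: &str) -> Option<usize> {
    run_digests
        .iter()
        .position(|digest| !digest.eq_ignore_ascii_case(baseline))
        .map(|zero_based| zero_based + 1)
}

/// The gate passes only when at least one run was recorded and none differ.
pub fn determinism_gate_passes(run_digests: &[&str], baseline: &str) -> bool {
    !run_digests.is_empty() && first_mismatching_run(run_digests, baseline).is_none()
}
```

Fed the three digests from the table plus the baseline, `determinism_gate_passes` would return `true`; a single differing run is reported by its run number.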
## Gate 2 — Phase B `image_canonical_sha256`

Phase B snapshot captured to `snap/ours/`:

```
$ grep image_loaded_sha256 snap/ours/config.json
"image_loaded_sha256": "ea8d160e9369328a5b922258a92113efb8d7ce3e1a5c12cc521e375985c91c18",
```

Matches the Phase-A/B verify baseline ✅. The fix touches only the
kernel-export shim/emitter; no PE image bytes are modified.
## Gate 3 — Phase A matched-prefix extension (KEY METRIC)

Diffed `/tmp/c6half_phasea.jsonl` (50M-instr Phase A capture post-fix)
against `audit-runs/phase-c-first-divergence/phase-a/canary.jsonl`.
Values are matched-prefix lengths, i.e. the event index of the first divergence.

| chain | C+6 matched prefix | C+6½ matched prefix | Δ |
|---|---|---|---|
| canary tid=6 → ours tid=1 (main) | 102158 | **102158** | **0** (preserved) |
| canary tid=4 → ours tid=11 | 5 | 5 | 0 |
| **canary tid=7 → ours tid=2** | **15** | **26** | **+11** ✅ |
| canary tid=12 → ours tid=7 | 2 | 2 | 0 |
| canary tid=14 → ours tid=9 | 39 | 39 | 0 |
| canary tid=15 → ours tid=10 | (no div) | (no div) | 0 |

**Gate 3 ✅**:
* tid=7→tid=2 chain advanced from 15 to 26 (+11), a direct payoff of
  the `StfsCreateDevice` class-E fix (C+6 predicted this).
* Main chain unchanged at 102158 (no regression).
* All other chains stable.
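
For readers unfamiliar with the metric, the matched-prefix length of a chain can be sketched as the longest common prefix of the two per-thread event streams. This is a minimal sketch, not the actual diff-tool; the function name and the assumption that events compare with `PartialEq` are illustrative.

```rust
/// Length of the longest common prefix of two event streams.
/// The returned value is also the index of the first divergence,
/// or the shorter stream's length when one stream is a prefix of the other.
pub fn matched_prefix_len<T: PartialEq>(canary: &[T], ours: &[T]) -> usize {
    canary
        .iter()
        .zip(ours.iter())
        .take_while(|(c, o)| c == o)
        .count()
}
```

Under this definition a chain reported as 26 agrees on events 0 through 25 and first differs at idx=26.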
### New tid=7→tid=2 first divergence

idx=26: `kernel.return KeSetEvent` `return_value=1 (canary) vs 0 (ours)`.

Same divergence class as tid=4→tid=11, which diverges at idx=5, also on
`KeSetEvent`. `KeSetEvent` semantically returns "the previous signaled
state of the event": canary returns 1 (event was already signaled), ours
returns 0 (a valid STATUS_SUCCESS value, but the wrong semantic). This is
the `KeSetEvent` return-value bug class, deferred. It would need
investigation of the return semantics of our `nt_set_event`/`ke_set_event`
bodies (likely a sister to the `KeSetEvent_entry` body returning
`prev_signaled` instead of `STATUS_SUCCESS`).
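
The previous-state return semantics described above can be illustrated with a minimal sketch. `GuestEvent` is a hypothetical stand-in, not the project's real event type, and the sketch ignores waiters, auto-reset behavior, and priority increments.

```rust
/// Hypothetical stand-in for a guest event object; not the project's real type.
pub struct GuestEvent {
    signaled: bool,
}

impl GuestEvent {
    pub fn new(initially_signaled: bool) -> Self {
        Self { signaled: initially_signaled }
    }

    /// Signals the event and returns the state it had *before* this call:
    /// 1 if it was already signaled, 0 otherwise.
    pub fn set(&mut self) -> u32 {
        let prev_signaled = self.signaled;
        self.signaled = true;
        u32::from(prev_signaled)
    }
}
```

With this shape, a second `set` on the same event returns 1, which is the value canary reports at idx=26; a body that unconditionally returns 0 cannot reproduce it.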
## Gate 4 — Build

```
$ cargo build --release
Compiling xenia-kernel v0.1.0
Compiling xenia-app v0.1.0
Finished `release` profile [optimized] target(s) in 6.73s
```

One pre-existing dead-code warning (`walk_committed_regions`); not
introduced by this fix. Canary untouched.
## Gate 5 — Phase A determinism (emitter)

Two cvar-ON captures of the same engine binary, md5-summing only
deterministic fields (excluding `host_ns` and `guest_cycle`):

```
run1 (det-fields-only md5): 11a07772f600abab9dcfe4af2300b554
run2 (det-fields-only md5): 11a07772f600abab9dcfe4af2300b554
```

Byte-identical ✅.

Note: the digest changes from C+6's `7312446e…` because the suppression
flips events on/off in the deterministic stream for 9 more ords AND
the renamed ords 0x82/0x98 produce different name strings. Expected.
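
The "deterministic fields only" digest can be sketched as follows. Two loud assumptions: events are modeled as ordered `(field, value)` string pairs rather than parsed JSONL, and a 64-bit FNV-1a hash stands in for md5 so the example needs only the standard library. `det_fields_digest` is a hypothetical name.

```rust
/// Fields treated as non-deterministic, per the exclusion list above.
const VOLATILE_FIELDS: [&str; 2] = ["host_ns", "guest_cycle"];

/// 64-bit FNV-1a step over a byte slice (stand-in for md5 to stay std-only).
fn fnv1a(mut hash: u64, bytes: &[u8]) -> u64 {
    for &b in bytes {
        hash ^= u64::from(b);
        hash = hash.wrapping_mul(0x0000_0100_0000_01b3);
    }
    hash
}

/// Digest of an event stream that ignores the volatile fields.
/// Each event is a list of (field name, field value) pairs in emission order.
pub fn det_fields_digest(events: &[Vec<(&str, &str)>]) -> u64 {
    let mut hash: u64 = 0xcbf2_9ce4_8422_2325; // FNV-1a 64-bit offset basis
    for event in events {
        for (name, value) in event {
            if VOLATILE_FIELDS.contains(name) {
                continue; // skip host timing and guest cycle counters
            }
            hash = fnv1a(hash, name.as_bytes());
            hash = fnv1a(hash, b"=");
            hash = fnv1a(hash, value.as_bytes());
            hash = fnv1a(hash, b";");
        }
        hash = fnv1a(hash, b"\n");
    }
    hash
}
```

Two captures that differ only in `host_ns` or `guest_cycle` hash identically under this scheme, while a changed event name (such as the renamed ords) changes the digest.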
## Gate 6 — Kernel unit tests

```
$ cargo test --release -p xenia-kernel --lib
test result: ok. 146 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out
```

Test count: 145 (C+6) → 146 (C+6½) = +1 net.
* Renamed: `ke_set_ideal_processor_round_trips` →
  `scheduler_ideal_processor_round_trips` (still exercises the round-trip,
  but via the scheduler API directly, since the hallucinated kernel-export
  funcs were deleted).
* Added: `ke_query_interrupt_time_returns_synthetic_u64`, which asserts
  that the new ord 0x82 body returns a non-zero u64 (> u32::MAX) in gpr[3],
  guarding against regression to a byte-sized return.

No regressions in any other test.
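
The shape of the added test can be sketched as below. Everything except the test name and the `gpr[3]` convention is an assumption: `CpuState`, the function signature, and the synthetic base value are hypothetical and are not taken from xenia-kernel.

```rust
/// Hypothetical minimal CPU state; the real context type is not shown in these notes.
pub struct CpuState {
    pub gpr: [u64; 32],
}

/// Assumed synthetic base, chosen above u32::MAX so that a truncated
/// (byte- or word-sized) return value is detectable.
const SYNTHETIC_INTERRUPT_TIME_BASE: u64 = 0x1_0000_0000;

/// Sketch of an ord 0x82 body: writes a 64-bit synthetic interrupt time to gpr[3].
pub fn ke_query_interrupt_time(cpu: &mut CpuState, elapsed_ticks: u64) {
    cpu.gpr[3] = SYNTHETIC_INTERRUPT_TIME_BASE + elapsed_ticks;
}

#[test]
fn ke_query_interrupt_time_returns_synthetic_u64() {
    let mut cpu = CpuState { gpr: [0; 32] };
    ke_query_interrupt_time(&mut cpu, 0);
    // A byte-sized or 32-bit return could never exceed u32::MAX.
    assert!(cpu.gpr[3] > u64::from(u32::MAX));
}
```

Asserting `> u32::MAX` rather than only non-zero is the design point: it fails if the value is ever narrowed on its way into the register.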
## Summary

All 6 gates pass.

* **Phase 1**: 9 class-E sister bugs fixed (`DbgPrint`, `RtlCaptureContext`,
  `sprintf`, `RtlUnwind`, `_vsnprintf`, `__C_specific_handler`,
  `XeKeysConsoleSignatureVerification`, `StfsCreateDevice`, `StfsControlDevice`).
  All registered via `register_unimplemented_export` to mirror canary's
  syscall-thunk silence. Bodies retained (harmless side effects).
* **Phase 2**: 2 hallucinated imports fixed (ord 0x82
  `KeQueryInterruptTime`, ord 0x98 `KeSetBackgroundProcessors`).
  Both received a rename plus a body fix; the old hallucinated bodies were deleted.
* **Phase 3**: no additional findings beyond the explicit list.

**Headline metric**: tid=7→tid=2 chain matched-prefix grew +11 (15→26),
exactly as the C+6 sister-bugs note predicted for `StfsCreateDevice`.
Main chain preserved at 102158. No regressions.

Diff-tool unchanged. Canary unchanged. Heap memory model (C+2) and
clock (Stage 2) deferred items untouched.

Next divergence (per metric):
* Main chain still blocked at 102158 (`XamTaskCloseHandle return=1 vs 0`),
  the Phase C+7 target as noted in C+6 memory.
* tid=7→tid=2 now blocked at 26 (`KeSetEvent return=1 vs 0`), the same
  bug class as tid=4→tid=11 at idx=5 (`KeSetEvent` prev-signaled
  return-value semantics).