xenia-rs/audit-runs/phase-c6half-sister-sweep/re-validation.md
MechaCat02 ef93a4fa14 handoff: VSync/event-wedge fixes + iterate 2.A–2.BC research notes
Source changes (dormant parity infra, retained from iterate 2.AI/2.AO):
- xenia-kernel/exports.rs: nt_create_event manual_reset polarity +
  related event wiring
- xenia-gpu/mmio_region.rs: D1MODE_VBLANK_VLINE_STATUS hardcode parity

Also lands the audit-runs/ analysis notes (.md/.txt/.json digests) for the
iterate 2.x VSync/0x10e8/0x1004 wedge investigation. Raw trace dumps
(.jsonl/.gz/.csv/.stdout) and agent worktrees (.claude/) are gitignored as
regenerable local artifacts — see memory + HANDOFF for the running findings.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-05 07:19:08 +02:00


Phase C+6½ — re-validation (HARD GATES)

Gate 1 — Determinism (cvar-OFF, ours)

3 fresh runs of check -n 50000000 --stable-digest post-fix:

| run | digest md5 |
|---|---|
| 1 | c6d895829b4611964978990ae1cb8a6a |
| 2 | c6d895829b4611964978990ae1cb8a6a |
| 3 | c6d895829b4611964978990ae1cb8a6a |
| C+6 baseline (cvar-OFF) | c6d895829b4611964978990ae1cb8a6a |

Result: byte-identical across 3 runs AND UNCHANGED from C+6.

Why the digest is unchanged despite the Phase 2 behavior fixes:

  • Phase 1 sister sweep is purely emitter-suppression (cvar-OFF inert).
  • Phase 2 fixes ord 0x82 (ke_query_interrupt_time) and 0x98 (stub_success) — but neither ord is called in the 50M-instr window (verified: 0 hits in event log for both names, before AND after). The new bodies are LATENT for now.

Cvar-OFF baseline preserved. No regression to determinism.
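The Gate 1 pass condition can be written down as a small check. This is a minimal sketch, not the project's tooling: `digests_stable` is a hypothetical helper name, and it assumes the digests are compared as hex strings.

```rust
/// Returns true when every run digest equals the baseline digest.
/// An empty run list counts as a failure: no runs means no evidence.
/// (Hypothetical helper; hex digests are compared case-insensitively.)
fn digests_stable(baseline: &str, runs: &[&str]) -> bool {
    !runs.is_empty() && runs.iter().all(|d| d.eq_ignore_ascii_case(baseline))
}
```

Called with the C+6 baseline and the three run digests from the table above, this returns `true`. A single differing digest, or an empty run list, makes it return `false`.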

Gate 2 — Phase B image_canonical_sha256

Phase B snapshot captured to snap/ours/:

$ grep image_loaded_sha256 snap/ours/config.json
"image_loaded_sha256": "ea8d160e9369328a5b922258a92113efb8d7ce3e1a5c12cc521e375985c91c18",

Matches the Phase-A/B verify baseline. The fix touches only the kernel-export shim/emitter; no PE image bytes are modified.

Gate 3 — Phase A matched-prefix extension (KEY METRIC)

Diffed /tmp/c6half_phasea.jsonl (50M-instr Phase A capture post-fix) against audit-runs/phase-c-first-divergence/phase-a/canary.jsonl.

| chain | C+6 | C+6½ | Δ |
|---|---|---|---|
| canary tid=6 → ours tid=1 (main) | 102158 | 102158 | 0 (preserved) |
| canary tid=4 → ours tid=11 | 5 | 5 | 0 |
| canary tid=7 → ours tid=2 | 15 | 26 | +11 |
| canary tid=12 → ours tid=7 | 2 | 2 | 0 |
| canary tid=14 → ours tid=9 | 39 | 39 | 0 |
| canary tid=15 → ours tid=10 | (no div) | (no div) | 0 |

Gate 3 passes:

  • tid=7→tid=2 chain advanced from 15 to 26 (+11) — direct payoff of the StfsCreateDevice class-E fix (C+6 predicted this).
  • Main chain unchanged at 102158 (no regression).
  • All other chains stable.
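For readers unfamiliar with the metric, the matched-prefix length is the number of leading events on which two chains agree, which is also the index of the first divergence. The sketch below illustrates that definition only. `TraceEvent` and `matched_prefix` are hypothetical names, and the real diff tool may compare more or different fields.

```rust
/// Hypothetical, simplified event record; the real capture has more fields.
#[derive(Debug, Clone, PartialEq, Eq)]
struct TraceEvent {
    kind: String, // e.g. "kernel.return"
    name: String, // e.g. "KeSetEvent"
    value: u64,   // e.g. return_value
}

/// Length of the longest common prefix of two chains.
/// When the chains diverge, the result is the index of the first divergence.
fn matched_prefix<T: PartialEq>(a: &[T], b: &[T]) -> usize {
    a.iter().zip(b.iter()).take_while(|(x, y)| x == y).count()
}
```

Under this definition, a chain whose matched prefix is 26 agrees on events 0 through 25 and first differs at idx=26.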

New tid=7→tid=2 first divergence

idx=26: kernel.return KeSetEvent return_value=1 (canary) vs 0 (ours).

Same divergence class as tid=4→tid=11, which diverges at idx=5, also on KeSetEvent. KeSetEvent returns the previous signaled state of the event: canary returns 1 (the event was already signaled), while ours returns 0, which is a valid STATUS_SUCCESS value but not the previous-state value the call is expected to return. This is the KeSetEvent return-value bug class and is deferred. It needs an investigation of the return semantics in our nt_set_event/ke_set_event body (likely a sister to the KeSetEvent_entry body returning prev_signaled instead of STATUS_SUCCESS).
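The previous-state semantics can be illustrated with a toy event object. This is a sketch under assumptions: `SketchEvent` is hypothetical and is not the body of nt_set_event, ke_set_event, or canary's KeSetEvent_entry. It only shows why a second set on the same event reports 1 where a status-style return would report 0.

```rust
/// Hypothetical minimal event object; not the project's real type.
struct SketchEvent {
    signaled: bool,
}

impl SketchEvent {
    fn new(signaled: bool) -> Self {
        SketchEvent { signaled }
    }

    /// Signals the event and returns the state it had before the call:
    /// 1 if it was already signaled, 0 if it was not.
    fn set(&mut self) -> u32 {
        let prev = self.signaled;
        self.signaled = true;
        prev as u32
    }
}
```

With this model, the first `set()` on a fresh event returns 0 and the second returns 1. A body that always returns 0 would only match canary for events that were not yet signaled.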

Gate 4 — Build

$ cargo build --release
   Compiling xenia-kernel v0.1.0
   Compiling xenia-app v0.1.0
    Finished `release` profile [optimized] target(s) in 6.73s

One pre-existing dead-code warning (walk_committed_regions); not introduced by this fix. Canary untouched.

Gate 5 — Phase A determinism (emitter)

Two cvar-ON captures of the same engine binary, md5-summing only deterministic fields (excluding host_ns and guest_cycle):

run1 (det-fields-only md5): 11a07772f600abab9dcfe4af2300b554
run2 (det-fields-only md5): 11a07772f600abab9dcfe4af2300b554

Byte-identical.

Note: the digest changes from C+6's 7312446e… because the suppression flips events on/off in the deterministic stream for 9 more ords, and the renamed ords 0x82 and 0x98 produce different name strings. Expected.
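The "deterministic fields only" comparison can be sketched as follows. Assumptions are significant here: events are modeled as plain (key, value) string pairs rather than parsed JSONL, `det_fields_digest` and `is_volatile` are hypothetical names, and the standard library's `DefaultHasher` stands in for md5. Its algorithm is not guaranteed stable across Rust releases, so digests are only comparable within one build.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Fields that vary between runs and are excluded from the digest.
fn is_volatile(key: &str) -> bool {
    matches!(key, "host_ns" | "guest_cycle")
}

/// Hashes only the deterministic (key, value) pairs of each event.
/// Stand-in for the md5 step: DefaultHasher is not md5.
fn det_fields_digest(events: &[Vec<(&str, &str)>]) -> u64 {
    let mut h = DefaultHasher::new();
    for ev in events {
        for (k, v) in ev.iter().filter(|(k, _)| !is_volatile(k)) {
            k.hash(&mut h);
            v.hash(&mut h);
        }
        // Separator so that event boundaries affect the digest.
        0xffu8.hash(&mut h);
    }
    h.finish()
}
```

Two captures that differ only in `host_ns` or `guest_cycle` produce the same digest under this scheme, while a change to any other field, such as a return value or an export name string, changes it.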

Gate 6 — Kernel unit tests

$ cargo test --release -p xenia-kernel --lib
test result: ok. 146 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out

Test count: 145 (C+6) → 146 (C+6½) = +1 net.

  • Renamed: ke_set_ideal_processor_round_trips → scheduler_ideal_processor_round_trips (still exercises the round trip, but via the scheduler API directly, since the hallucinated kernel-export funcs were deleted).
  • Added: ke_query_interrupt_time_returns_synthetic_u64 — asserts the new ord 0x82 body returns a non-zero u64 (> u32::MAX) in gpr[3], guarding against regression to byte-sized return.

No regressions in any other test.
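The shape of the added assertion can be sketched like this. Everything in the block is hypothetical: `GuestContext`, `ke_query_interrupt_time_sketch`, and the synthetic value are illustrations, not the real xenia-kernel types or the actual ord 0x82 body. The only point shown is that a value above `u32::MAX` in `gpr[3]` exposes any truncation of the return.

```rust
/// Hypothetical stand-in for a guest CPU context; only gpr is modeled.
struct GuestContext {
    gpr: [u64; 32],
}

/// Hypothetical export body: writes a synthetic 64-bit time value to r3.
fn ke_query_interrupt_time_sketch(ctx: &mut GuestContext) {
    // Chosen to exceed u32::MAX so a truncated return is detectable.
    ctx.gpr[3] = 0x0000_0001_0000_0000;
}

/// True when r3 holds a value that cannot fit in 32 bits.
fn returns_wide_value(ctx: &GuestContext) -> bool {
    ctx.gpr[3] > u32::MAX as u64
}
```

A test in this style calls the body on a zeroed context and asserts `returns_wide_value`. If the body were ever narrowed to a 32-bit or smaller return, the assertion would fail.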

Summary

All 6 gates pass.

  • Phase 1: 9 class-E sister bugs fixed (DbgPrint, RtlCaptureContext, sprintf, RtlUnwind, _vsnprintf, __C_specific_handler, XeKeysConsoleSignatureVerification, StfsCreateDevice, StfsControlDevice). All registered via register_unimplemented_export to mirror canary's syscall-thunk silence. Bodies retained (harmless side effects).
  • Phase 2: 2 hallucinated imports fixed (ord 0x82 KeQueryInterruptTime, ord 0x98 KeSetBackgroundProcessors). Both rename + body fix; old hallucinated bodies deleted.
  • Phase 3: no additional findings beyond the explicit list.

Headline metric: tid=7→tid=2 chain matched-prefix grew +11 (15→26), exactly as the C+6 sister-bugs note predicted for StfsCreateDevice. Main chain preserved at 102158. No regressions.

Diff-tool unchanged. Canary unchanged. Heap memory model (C+2) and clock (Stage 2) deferred items untouched.

Next divergence (per metric):

  • Main chain still blocked at 102158 (XamTaskCloseHandle return=1 vs 0) — Phase C+7 target as noted in C+6 memory.
  • tid=7→tid=2 now blocked at 26 (KeSetEvent return=1 vs 0) — same bug class as tid=4→tid=11 at idx=5 (KeSetEvent prev-signaled return-value semantics).