handoff: VSync/event-wedge fixes + iterate 2.A–2.BC research notes

Source changes (dormant parity infra, retained from iterate 2.AI/2.AO):
- xenia-kernel/exports.rs: nt_create_event manual_reset polarity +
  related event wiring
- xenia-gpu/mmio_region.rs: D1MODE_VBLANK_VLINE_STATUS hardcode parity

Also lands the audit-runs/ analysis notes (.md/.txt/.json digests) for the
iterate 2.x VSync/0x10e8/0x1004 wedge investigation. Raw trace dumps
(.jsonl/.gz/.csv/.stdout) and agent worktrees (.claude/) are gitignored as
regenerable local artifacts — see memory + HANDOFF for the running findings.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
MechaCat02
2026-06-05 07:19:08 +02:00
parent acd1656753
commit ef93a4fa14
620 changed files with 108303 additions and 1 deletions

@@ -0,0 +1,58 @@
# Phase C+24 — escalation summary
## Headline
The first divergence at idx 105,286 is **NOT** a guest control-flow branch. It
is a **scheduler-cadence divergence**: ours fires the first VSYNC
graphics-interrupt callback (`sub_824be9a0`, armed via
`VdSetGraphicsInterruptCallback`) immediately after `VdSwap.return` at
`cycle=5,584,980`, inserting 6 events (KeAcquire / KeRelease pair).
Canary fires the SAME interrupt body with the SAME `r3=0` (VSYNC)
argument, but ~80ms wall-clock later, at idx 106,805. Both engines
execute the SAME guest code path; only the timing of the first VSYNC
interrupt delivery differs.
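
A minimal sketch of how a first-divergence index such as the one above can arise from inserted events alone, with no change in guest control flow. The `Event` shape and the `first_divergence` helper are hypothetical illustrations; they are not the diff harness's actual types or API.

```rust
/// One entry in a lockstep trace. Hypothetical shape: the real digest
/// format is not shown in these notes.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Event {
    pub name: &'static str,
    pub arg: u32,
}

/// Index of the first position where the two traces disagree, or `None`
/// if they are identical over their whole length.
pub fn first_divergence(ours: &[Event], canary: &[Event]) -> Option<usize> {
    let common = ours.len().min(canary.len());
    for i in 0..common {
        if ours[i] != canary[i] {
            return Some(i);
        }
    }
    // One trace is a strict prefix of the other: they diverge where the
    // shorter one ends.
    if ours.len() != canary.len() {
        Some(common)
    } else {
        None
    }
}
```

If one engine delivers an interrupt earlier, its callback events are spliced into an otherwise identical stream, and the index reported is the splice point even though every guest instruction executed is the same.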
## Why escalation (per tripstone #5)
- ours uses `tick_vsync_instr` (guest-instruction-count threshold,
150k) to pace VSYNC; canary uses a dedicated host frame-limiter
thread on wall-clock (`Clock::QueryGuestTickCount`).
- Aligning the two would require either adopting wall-clock pacing in
the lockstep diff harness (invalidates 23 phases of digest stability)
or pinning first-VSYNC to a guest-instruction landmark (requires
engine + canary changes).
- A naive 6-event diff-tool absorber realigns for 24 events then
re-diverges (canary: `MmFreePhysicalMemory` vs ours:
`KeEnterCriticalRegion`); each downstream timing-induced divergence
in that chain would need separate analysis. This risks reading-errors
#23 and #32.
- MEMORY.md (review_a_boot_state_2026_05_21) explicitly defers
"scheduler-determinism" alongside audio/HID/XAM.
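
The two pacing strategies contrasted in the first bullet can be sketched as follows. This is an illustration under stated assumptions, not either engine's implementation: `InstrPacer` and `WallClockPacer` are hypothetical names, the 150k threshold is taken from the note above, and the wall-clock period is left as a caller-supplied parameter because the notes do not state it.

```rust
use std::time::Duration;

/// Instruction-count pacing: a VSYNC becomes due every `threshold`
/// retired guest instructions. Deterministic for a given guest trace.
pub struct InstrPacer {
    threshold: u64,
    since_last: u64,
}

impl InstrPacer {
    pub fn new(threshold: u64) -> Self {
        assert!(threshold > 0, "threshold must be non-zero");
        Self { threshold, since_last: 0 }
    }

    /// Account for `retired` guest instructions; returns how many VSYNCs
    /// became due in this step.
    pub fn advance(&mut self, retired: u64) -> u64 {
        self.since_last += retired;
        let due = self.since_last / self.threshold;
        self.since_last %= self.threshold;
        due
    }
}

/// Wall-clock pacing: a VSYNC becomes due every `period` of host time,
/// regardless of how far the guest has progressed.
pub struct WallClockPacer {
    period: Duration,
    next_due: Duration,
}

impl WallClockPacer {
    pub fn new(period: Duration) -> Self {
        assert!(period > Duration::ZERO, "period must be non-zero");
        Self { period, next_due: period }
    }

    /// `now` is host time elapsed since start. It is passed in rather than
    /// read from a clock so the sketch stays deterministic under test.
    pub fn poll(&mut self, now: Duration) -> u64 {
        let mut due = 0;
        while now >= self.next_due {
            self.next_due += self.period;
            due += 1;
        }
        due
    }
}
```

With instruction-count pacing the first VSYNC lands at a fixed point in the guest trace on every run. With wall-clock pacing it lands wherever the guest happens to be when the host timer expires, so the event index of the first callback depends on host speed. That is why the two traces cannot be expected to agree on it.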
## Per-chain delta
| chain | C+23 baseline | C+24 outcome | delta |
|---|---|---|---|
| main (canary tid=6 → ours tid=1) | 105,286 | 105,286 | 0 |
| sister 11/32/4/41/16 | (unchanged) | (unchanged) | 0 |
## New cold digest
NONE captured — no engine change, no re-run.
## Next target (post-C+24)
Methodology pivot to scheduler-determinism (separate phase, NOT C+25).
Three options written up in `investigation.md` §"Recommended next
action". Until then, the matched-prefix is effectively capped at
105,286 by VSYNC-cadence offset; further C+nn progression on guest
logic alone will not advance the main chain past this point without
first resolving the cadence issue.
## Files
- `investigation.md` — full analysis (this dir)
- NO source files touched.
- NO test edits.
- NO diff-tool edits.
- Phase B image hash `ea8d160e…` UNCHANGED.
- xenia-kernel tests 226 PASS (unchanged).