handoff: VSync/event-wedge fixes + iterate 2.A–2.BC research notes

Source changes (dormant parity infra, retained from iterate 2.AI/2.AO):
- xenia-kernel/exports.rs: nt_create_event manual_reset polarity +
  related event wiring
- xenia-gpu/mmio_region.rs: D1MODE_VBLANK_VLINE_STATUS hardcode parity

Also lands the audit-runs/ analysis notes (.md/.txt/.json digests) for the
iterate 2.x VSync/0x10e8/0x1004 wedge investigation. Raw trace dumps
(.jsonl/.gz/.csv/.stdout) and agent worktrees (.claude/) are gitignored as
regenerable local artifacts — see memory + HANDOFF for the running findings.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
MechaCat02
2026-06-05 07:19:08 +02:00
parent acd1656753
commit ef93a4fa14
620 changed files with 108303 additions and 1 deletions


@@ -0,0 +1,115 @@
# Re-validation — Review A Step 1 (force-spawn crowbar)
**Date**: 2026-05-27
**Binary**: `xenia-rs/target/release/xrs-crowbar` (cargo build --release
of HEAD = chore/portable-snapshot working tree with the v3 crowbar
implementation; SHA = build timestamp `May 27 07:28`).
**ISO**: `Project Sylpheed - Arc of Deception (USA, Europe) (En,Ja).iso`.
**Cmdline**: `xrs-crowbar check ISO -n 25000000 --gpu-thread --stable-digest`.
## Gate 1 — Default-OFF determinism (sacred)
| Run | instructions | imports | draws | swaps | unique_RT | bit-identical? |
|----:|------------:|---:|---:|---:|---:|:--:|
| OFF-1 | 25,000,000 | 39,290 | 0 | 1 | 0 | yes |
| OFF-2 | 25,000,000 | 39,290 | 0 | 1 | 0 | yes |
| OFF-3 | 25,000,000 | 39,290 | 0 | 1 | 0 | yes |
**3× cold runs bit-identical.** Default-OFF determinism PRESERVED.
The OFF baseline matches the canonical (swaps=1, draws=0, RT=0) baseline
from Phase Non-match Investigation and prior Phase C+N audits.
> Note: the canonical cold digest `e1dfcb1559f987b35012a7f2dc6d93f5`
> cited in the brief is a hash over the full digest fields; the
> instruction-stable subset (`instructions, imports, draws, swaps,
> unique_render_targets, shader_blobs_live, texture_cache_entries`)
> is verified identical above. 3× bit-identical runs are sufficient
> to attest determinism preservation under this opt-in cvar.
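The comparison described in the note can be sketched as follows. This is a minimal illustration, not the engine's actual digest code: the field names come from the note above, while the `StableDigest` type and the `runs_bit_identical` helper are hypothetical names introduced here.

```rust
/// Instruction-stable subset of the run digest.
/// Field names follow the note above; the struct itself is hypothetical.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct StableDigest {
    pub instructions: u64,
    pub imports: u64,
    pub draws: u64,
    pub swaps: u64,
    pub unique_render_targets: u64,
    pub shader_blobs_live: u64,
    pub texture_cache_entries: u64,
}

/// Returns true when every run matches the first run field for field.
/// An empty slice counts as "not attested" because nothing was measured.
pub fn runs_bit_identical(runs: &[StableDigest]) -> bool {
    match runs.split_first() {
        Some((first, rest)) => rest.iter().all(|run| run == first),
        None => false,
    }
}
```

Comparing only this subset keeps the check independent of any digest field that is not instruction-stable, which is the reason the note gives for not re-deriving the full-field hash.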
## Gate 2 — Crowbar-on builds and runs cleanly
`cargo build --release --bin xenia-rs` succeeded (only pre-existing
dead-code warning for `walk_committed_regions`). 226/226 kernel
tests PASS.
`XENIA_CROWBAR_WORKERS=1 XENIA_CROWBAR_CTX_BIN=ctx-canary.bin xrs-crowbar
check …` runs without panic/segfault until the expected guest-PPC fault
on the unmapped canary VA (see Gate 3).
## Gate 3 — PRIMARY progression gate (THE WIN CONDITION)
| Run | instructions | imports | draws | swaps | unique_RT | terminus |
|----:|------------:|---:|---:|---:|---:|:--|
| ON-1 | 20,000,159 | 39,290 | 0 | 1 | 0 | FAULT pc=0 r3=0xbce25640 |
| ON-2 | 20,000,159 | 39,290 | 0 | 1 | 0 | FAULT pc=0 r3=0xbce25640 |
| ON-3 | 20,000,159 | 39,290 | 0 | 1 | 0 | FAULT pc=0 r3=0xbce25640 |
**3× cold runs bit-identical** (instruction-count, fault PC, LR, CTR,
r3, r4, r29, r30, r31, tid).
PRIMARY gate **FAIL** (swaps unchanged at 1, draws=0, RT=0).
## Gate 4 — Phase B image_canonical_sha256
Not measured this session; the crowbar code in the working tree was
written and tested across v1/v2/v3 on 2026-05-21. No new engine LOC
landed this session, only re-validation and additional artifact
capture. Phase B `ea8d160e…` is therefore unchanged (no new image
data; the crowbar adds only opt-in behaviour).
## Gate 5 — Kernel tests
`cargo test --release -p xenia-kernel --lib`: **226 passed; 0 failed**.
## Gate 6 — Diff-tool tests
Not re-run this session (no diff-tool changes; the crowbar lives
entirely inside the engine). Phase D D-extension status from
2026-05-18 remains LANDED with no impact from this work.
## Fault analysis (cross-validation with v3)
The crowbar fires at instr=20,000,000, allocates ctx at `0x4d1d9000`,
installs the canary 64-byte ctx blob, spawns 4 workers at canary
entries (`0x82506528/58/88/B8`), and resumes all 4. ~159 instructions
later, worker tid=16 faults at:
```
PC=0 (CTR=0 bctrl)
LR=0x82508588 <- inside one of the worker stub fns
r3=0xBCE25640 <- canary's secondary-object VA (UNMAPPED in ours)
r31=0x4d1d9000 <- our ctx_ptr (correctly threaded through)
tid=16
```
`lwz r11, 0(r3)` at the dispatch site loads from `[0xBCE25640]`
(canary's VA, outside our allocator namespace `0x4000_0000..0x6FFF_FFFF`).
The load returns 0, CTR becomes 0, `bctrl` jumps to 0, and the worker faults.
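The failure chain above can be modelled in a few lines. This is a simplified sketch under stated assumptions: only the allocator namespace is treated as readable, guest memory is a plain `HashMap`, and `lwz`, `dispatch`, and `Outcome` are hypothetical names, not the engine's API.

```rust
use std::collections::HashMap;

/// Allocator namespace quoted in the analysis above.
const ALLOC_LO: u32 = 0x4000_0000;
const ALLOC_HI: u32 = 0x6FFF_FFFF;

#[derive(Debug, PartialEq, Eq)]
pub enum Outcome {
    /// Indirect call through CTR to a non-zero target.
    Call(u32),
    /// CTR was 0, so the indirect call lands on PC=0 and faults.
    FaultPcZero,
}

/// Assumption: a load from outside the allocator namespace reads as 0.
pub fn lwz(mem: &HashMap<u32, u32>, va: u32) -> u32 {
    if (ALLOC_LO..=ALLOC_HI).contains(&va) {
        mem.get(&va).copied().unwrap_or(0)
    } else {
        0
    }
}

/// Models the dispatch site: load the target from [r3], move it to CTR, bctrl.
pub fn dispatch(mem: &HashMap<u32, u32>, r3: u32) -> Outcome {
    let ctr = lwz(mem, r3);
    if ctr == 0 {
        Outcome::FaultPcZero
    } else {
        Outcome::Call(ctr)
    }
}
```

With `r3 = 0xBCE25640` the model returns `Outcome::FaultPcZero` no matter what the map holds, which mirrors the observed `PC=0` terminus.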
This is the **same fault class** as v3's (PC=0, r3=0xBCE25640, same
ctx state); only the LR differs (v3: `0x82506e38`, this run: `0x82508588`).
The differing LR reflects which worker entry stub reached the dispatch
first; the root cause is identical: our allocator cannot reproduce
canary's `0xBCxxxxxx` VAs.
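One way to make this fault-class comparison mechanical is to compare only the registers that identify the root cause and ignore the LR. The sketch below is illustrative: `Fault` and `same_fault_class` are hypothetical names, and the choice of PC and r3 as the class key is an assumption drawn from the comparison above.

```rust
/// Minimal fault record; only the registers discussed above are kept.
#[derive(Debug, Clone, Copy)]
pub struct Fault {
    pub pc: u32,
    pub lr: u32,
    pub r3: u32,
}

/// Two faults share a class when PC and r3 agree.
/// LR is deliberately ignored: it only records which stub got there first.
pub fn same_fault_class(a: &Fault, b: &Fault) -> bool {
    a.pc == b.pc && a.r3 == b.r3
}
```

Keying on PC and r3 keeps the v3 run and this run in one bucket, while a fault on a different object pointer would still be reported as new.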
## Verdict
- Gate 1 (default-OFF determinism): **PASS**.
- Gate 2 (build + clean run): **PASS**.
- Gate 3 (PRIMARY progression): **FAIL** (Δ = 0).
- Gate 4 (Phase B unchanged): **PASS** by inference (not re-measured;
no engine LOC delta this session).
- Gate 5 (kernel tests): **PASS** (226/226).
- Gate 6 (diff-tool tests): not re-run; out of scope.
**The crowbar approach as Step 1 of the Review A roadmap is FALSIFIED.**
This confirms the v3 verdict from 2026-05-21: the wedge cannot be unblocked
by forcing the 4 worker spawns alone. The secondary-object recursion
requires one of: (a) a guest-VA translation table to map canary's
`0xBCxxxxxx` VAs to our allocator's outputs, (b) recursive ctx-state
capture for the full reachable closure from `ctx_ptr`, or (c)
abandoning the crowbar approach in favour of the natural-activation
investigation (Review A Step 2's branch-probe inside the `sub_821CB030`
chain).
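Option (a) could look roughly like the sketch below: a region table keyed by canary base address that rewrites a canary VA into the matching offset of a locally allocated region. All names (`VaTranslation`, `map_region`, `translate`) and the example target address are hypothetical; nothing like this exists in the working tree as far as this note records.

```rust
use std::collections::BTreeMap;

/// Hypothetical canary-VA to local-VA translation table.
/// Key: canary region base. Value: (region length, local region base).
#[derive(Debug, Default)]
pub struct VaTranslation {
    regions: BTreeMap<u32, (u32, u32)>,
}

impl VaTranslation {
    pub fn new() -> Self {
        Self::default()
    }

    /// Records that `len` bytes at `canary_base` live at `our_base` locally.
    pub fn map_region(&mut self, canary_base: u32, len: u32, our_base: u32) {
        self.regions.insert(canary_base, (len, our_base));
    }

    /// Translates a canary VA, or returns None when no region covers it.
    pub fn translate(&self, canary_va: u32) -> Option<u32> {
        // Find the region with the greatest base that is <= canary_va.
        let (&base, &(len, ours)) = self.regions.range(..=canary_va).next_back()?;
        let offset = canary_va - base;
        if offset < len {
            ours.checked_add(offset)
        } else {
            None
        }
    }
}
```

A table like this only helps if every pointer stored inside the captured ctx blob is rewritten before the workers resume, so in practice it would likely need to be paired with the closure capture from option (b).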