handoff: VSync/event-wedge fixes + iterate 2.A–2.BC research notes

Source changes (dormant parity infra, retained from iterate 2.AI/2.AO):
- xenia-kernel/exports.rs: nt_create_event manual_reset polarity +
  related event wiring
- xenia-gpu/mmio_region.rs: D1MODE_VBLANK_VLINE_STATUS hardcode parity

Also lands the audit-runs/ analysis notes (.md/.txt/.json digests) for the
iterate 2.x VSync/0x10e8/0x1004 wedge investigation. Raw trace dumps
(.jsonl/.gz/.csv/.stdout) and agent worktrees (.claude/) are gitignored as
regenerable local artifacts — see memory + HANDOFF for the running findings.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
MechaCat02
2026-06-05 07:19:08 +02:00
parent acd1656753
commit ef93a4fa14
620 changed files with 108303 additions and 1 deletions

@@ -0,0 +1,157 @@
# Crowbar v2 — Step 0 (A) vs (B) verdict + new finding
**Date**: 2026-05-21
**Predecessor**: v1 at `audit-runs/review-a-step1-crowbar/`.
**Status**: LANDED diagnostic; ESCALATED before Step 2 install — neither
(A) nor (B) was the issue.
## TL;DR
- **(A) is FALSIFIED.** The XEX loader in ours populates the vtable
region `0x8200A1E8..+512` correctly: 254/256 bytes are nonzero in the
first 256 bytes, and 128/128 u32 slots are nonzero in the first 512
bytes. **Worker stub slots 35/36/37/38 each hold real PPC fn
pointers** in the `0x8250xxxx` range:
- `vtable[35] @ 0x8200A274 = 0x82506B08`
- `vtable[36] @ 0x8200A278 = 0x82506DE8`
- `vtable[37] @ 0x8200A27C = 0x82508530`
- `vtable[38] @ 0x8200A280 = 0x82508A88`
- **(B) is FALSIFIED.** There is no "runtime vtable install" step to
mirror: the vtable contents come from `.rdata` and are present
before the crowbar fires. The AUDIT-068 S3/S4 POD-copy writes
`0x8200A1E8` (the vtable BASE) to `[ctx+0]`. That is a POINTER write,
not a write of the vtable contents themselves.
- **NEW CASE (C) discovered**: the ctx-object layout is wider than the
4 u32s AUDIT-068 S3 captured. `[ctx+44]` is a pointer to a SECOND
object whose vtable+60 (slot 15) is dispatched by `sub_82506DE8` (=
vtable[36] of ctx, called by worker tid=15's entry stub at
`0x82506558`). Since the crowbar left `[ctx+44]` zero, the worker
loads `r3=[ctx+44]=0`, reads `[0]=0` as the second object's vtable
pointer, loads CTR from `[vtable+60]=[60]=0`, and `bctrl` faults at
PC=0.
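The slot numbers and addresses in this TL;DR follow from the vtable base and a 4-byte slot width. A minimal sketch of that arithmetic, for cross-checking the figures by hand; `slot_addr` and `slot_index` are hypothetical helper names, not functions in the tree:

```rust
/// Vtable base reported in this note.
const VTABLE_BASE: u32 = 0x8200_A1E8;
/// Guest pointers are 32-bit, so each vtable slot is 4 bytes wide.
const SLOT_BYTES: u32 = 4;

/// Guest address of vtable slot `index` (hypothetical helper).
fn slot_addr(base: u32, index: u32) -> u32 {
    base.wrapping_add(index * SLOT_BYTES)
}

/// Slot index for a byte displacement such as the 144 in `lwz r11, 144(r11)`.
/// Returns None when the displacement is not slot-aligned.
fn slot_index(displacement: u32) -> Option<u32> {
    if displacement % SLOT_BYTES == 0 {
        Some(displacement / SLOT_BYTES)
    } else {
        None
    }
}
```

With these, `slot_addr(VTABLE_BASE, 36)` gives `0x8200A278`, and displacements 144, 260 and 60 map to slots 36, 65 and 15 respectively.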
## v1 framing vs v2 ground truth
v1's `crowbar-on-stderr.log` showed `FAULT: PC in unmapped memory
cycle=20000167 pc=0x00000000 hw_id=0`. v1's hypothesis was
"vtable[35] at `0x8200A274` is uninitialized/null, branch goes to
PC=0." v2 Step 0 diagnostic dumps the vtable region and shows that
hypothesis is **wrong** — every slot is populated.
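The "every slot is populated" check reduces to counting nonzero bytes and nonzero u32 slots in the dumped region. A minimal sketch of that tally, assuming the dump is a raw byte slice in guest (big-endian) order; the function names are hypothetical and are not the helpers added to `exports.rs`:

```rust
/// Count nonzero bytes in a dumped region.
fn nonzero_bytes(dump: &[u8]) -> usize {
    dump.iter().filter(|&&b| b != 0).count()
}

/// Count nonzero big-endian u32 slots.
/// Trailing bytes that do not fill a whole slot are ignored.
fn nonzero_slots(dump: &[u8]) -> usize {
    dump.chunks_exact(4)
        .filter(|c| u32::from_be_bytes([c[0], c[1], c[2], c[3]]) != 0)
        .count()
}

/// Read slot `index` as a big-endian guest pointer, or None if out of range.
fn read_slot(dump: &[u8], index: usize) -> Option<u32> {
    let start = index.checked_mul(4)?;
    let end = start.checked_add(4)?;
    let bytes = dump.get(start..end)?;
    Some(u32::from_be_bytes([bytes[0], bytes[1], bytes[2], bytes[3]]))
}
```

Counting slots as well as bytes matters because a real pointer can legitimately contain zero bytes, so a byte count below 100% does not by itself mean a slot is empty.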
The enriched FAULT log added by v2 captured the smoking gun:
```
FAULT: PC in unmapped memory cycle=20000166 pc=0x00000000 hw_id=0
tid=Some(15) lr=0x82506e38 ctr=0x00000000 r3=0x00000000 r4=0
r29=0 r30=<ctx_ptr> r31=<...>
```
`lr=0x82506e38` is one instruction past the `bctrl` at `0x82506e34`. The
sequence in `sub_82506DE8` (which IS vtable[36], reached by worker
tid=15's stub at `0x82506558`: `lwz r11, 0(r3); lwz r11, 144(r11);
mtctr r11; bctrl`):
```
0x82506de8: mflr r12
0x82506dec: bl 0x825F0F8C
0x82506df0: stwu r1, -144(r1)
0x82506df4: mr r30, r3 ; r30 = ctx_ptr
0x82506df8: lwz r11, 0(r30) ; r11 = 0x8200A1E8 (vtable)
0x82506dfc: lwz r11, 260(r11) ; r11 = vtable[65] (a fn)
0x82506e00: mtctr r11
0x82506e04: bctrl ; OK — returns
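0x82506e08: rlwinm r11, r3, 0, 29, 29 ; isolate bit 2 of r3 (mask 0x4)
; (0x82506e0c not shown; presumably the compare that sets cr6)
0x82506e10: bne cr6, 0x825070D4 ; if bit set: branch away
; (0x82506e14 not shown)
0x82506e18: lwz r3, 44(r30) ; r3 = [ctx+44] <-- ZERO
; (0x82506e1c..0x82506e24 not shown)
0x82506e28: lwz r11, 0(r3) ; r11 = [0] <-- ZERO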
0x82506e2c: lwz r11, 60(r11) ; r11 = [60] <-- ZERO
0x82506e30: mtctr r11 ; CTR = 0
0x82506e34: bctrl ; LR := 0x82506e38, PC := 0
0x82506e38: <fault: PC unmapped>
```
So vtable[36] called vtable[65] (a real fn that returns OK), then
dispatched into `[ctx+44]` treated as another object. Our crowbar
left `[ctx+44]=0`, so the dispatch faulted.
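The fault chain can be replayed on a toy memory model. This is a sketch only: `ToyMem` is a hypothetical sparse word map in which unwritten words read as zero (chosen to match the `[0]=0` reads in the log, not the emulator's real memory model), and the second-object addresses used in any test are made up.

```rust
use std::collections::HashMap;

/// Toy guest memory: sparse map of word address -> value.
/// Unwritten words read as zero (an assumption, see lead-in).
struct ToyMem(HashMap<u32, u32>);

impl ToyMem {
    /// Load word at `base + disp`, like `lwz rD, disp(rA)`.
    fn lwz(&self, base: u32, disp: u32) -> u32 {
        self.0.get(&base.wrapping_add(disp)).copied().unwrap_or(0)
    }

    /// Store word at `base + disp`, like `stw rS, disp(rA)`.
    fn stw(&mut self, base: u32, disp: u32, value: u32) {
        self.0.insert(base.wrapping_add(disp), value);
    }
}

/// Mirrors the loads at 0x82506e18, 0x82506e28 and 0x82506e2c and
/// returns the value that ends up in CTR before the faulting `bctrl`.
fn second_dispatch_ctr(mem: &ToyMem, ctx: u32) -> u32 {
    let r3 = mem.lwz(ctx, 44); // r3  = [ctx+44], the second object
    let vtable = mem.lwz(r3, 0); // r11 = second object's vtable pointer
    mem.lwz(vtable, 60) // r11 = vtable slot 15, moved to CTR
}

/// `bctrl` sets LR to the next instruction, so the call site is LR - 4.
fn bctrl_site(lr: u32) -> u32 {
    lr.wrapping_sub(4)
}
```

With only `[ctx+0]` populated, `second_dispatch_ctr` returns 0, which is the `ctr=0x00000000` in the FAULT log; `bctrl_site(0x82506e38)` gives `0x82506e34`.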
## Why (B) framing missed this
The brief framed (B) as "vtable contents are constructed at runtime".
That's not true — vtable contents are static `.rdata`. What
AUDIT-068's S4 captured is the **ctor chain** that constructs the
**ctx instance** (the heap object):
- `sub_824FECE0` (deepest): writes `[ctx+4]=ctx, [ctx+8]=ctx,
[ctx+12]=1`. Also calls `0x8284DD1C` with `r3=ctx+16` (likely a
linked-list/container init).
- `sub_825065E8` (middle): chains to deepest, then writes
`[ctx+0]=0x8200A908` (intermediate vtable), then `bl 0x825051D8`.
- `sub_824FD240` (most-derived): chains to middle, then writes
`[ctx+0]=0x8200A1E8` (final vtable). Returns.
None of these three ctors writes `[ctx+44]`. So `[ctx+44]` must be
written by either:
1. **Allocator initial-state** (zero-fill? guest-side memset?), OR
2. **A factory function ABOVE the ctor chain** (the caller of
`sub_824FD240` that allocates ctx, calls ctor, then assigns fields
including `+44`).
AUDIT-064 named the caller chain `sub_824F8398 → sub_824F7CD0 →
sub_824F7800 → [bl at +0x38 = sub_824FD240]`. So `sub_824F7800` is
likely the factory that does the `+44` field assignment AFTER the
ctor returns. Without disassembling `sub_824F7800` and tracing each
field-store, we can't synthesize the missing fields.
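The gap can be stated mechanically: list the direct stores each ctor is known to make, then ask which word offsets remain untouched. A minimal sketch under that framing; `CTOR_STORES` only transcribes the stores listed above, and any stores made inside the callees (`0x8284DD1C`, `0x825051D8`) are NOT modelled here.

```rust
use std::collections::BTreeSet;

/// (ctor, byte offset into ctx) for each direct store listed in this note.
const CTOR_STORES: &[(&str, u32)] = &[
    ("sub_824FECE0", 4),
    ("sub_824FECE0", 8),
    ("sub_824FECE0", 12),
    ("sub_825065E8", 0),
    ("sub_824FD240", 0),
];

/// Word-aligned offsets in `0..span` that none of the listed stores touches.
fn unwritten_offsets(stores: &[(&str, u32)], span: u32) -> Vec<u32> {
    let written: BTreeSet<u32> = stores.iter().map(|&(_, off)| off).collect();
    (0..span)
        .step_by(4)
        .filter(|off| !written.contains(off))
        .collect()
}
```

`unwritten_offsets(CTOR_STORES, 48)` includes 44. Extending the table with the stores found in `sub_824F7800` would be one way to track what a crowbar v3 still has to synthesize.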
## Why escalating is the right call now
The brief's tripstone #6 sets a 2-hour timebox. We've already
discovered that the framing was wrong and that the gap is wider than
v2 was scoped to fix. The honest moves are:
1. **Stop and document** the new finding (this doc + memory entry).
2. **Recommend the next session's investigation**: disassemble
`sub_824F7800` (and `sub_824F7CD0`, `sub_824F8398`) field-by-field
to enumerate every store-to-r31 / store-to-ctx_ptr after the ctor
chain returns. Mirror those stores in a crowbar v3.
3. **Alternative, much wider in scope**: build a canary read-probe
sweep over `[ctx+0..ctx+128]` to capture the live state. This needs
~200 LOC of canary instrumentation and trades complexity for
ground truth.
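The read-probe sweep in option 3 could look roughly like the sketch below. It is written in Rust to match this repo; canary's actual instrumentation would differ. `read_u32` is a hypothetical stand-in for whatever guest-memory accessor the probe has, and the 128-byte span is taken as half-open (`ctx+0` up to but not including `ctx+128`), which is an assumption.

```rust
/// Sweep `[ctx+0 .. ctx+span)` one 32-bit word at a time.
/// Words the accessor cannot read (None) are skipped.
fn probe_sweep<F>(ctx: u32, span: u32, mut read_u32: F) -> Vec<(u32, u32)>
where
    F: FnMut(u32) -> Option<u32>,
{
    (0..span)
        .step_by(4)
        .filter_map(|off| read_u32(ctx.wrapping_add(off)).map(|v| (off, v)))
        .collect()
}

/// One line per word, keyed by offset rather than absolute address,
/// so dumps from runs with different ctx addresses can be diffed.
fn format_sweep(rows: &[(u32, u32)]) -> String {
    rows.iter()
        .map(|(off, v)| format!("[ctx+{off}]=0x{v:08X}"))
        .collect::<Vec<_>>()
        .join("\n")
}
```

Keying the output by offset is deliberate: the ctx address drifts between cold runs, so absolute addresses would make every line differ.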
## Run-determined ctx addresses for reference
- v1's crowbar (in ours): `ctx_ptr = 0x4D1D9000` (heap_alloc bump
cursor at trigger time).
- Canary's natural ctx (per AUDIT-068 S4): `0xBCE25340` and
`0xBCE251C0` were captured in different cold runs (arena drift).
The probe at `0xBCE251C0..+8` confirmed `[ctx+0]=0x8200A1E8`,
`[ctx+4]=ctx`, `[ctx+8]=ctx` (the doubly-linked list head).
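Because the ctx address drifts between runs, the three probed words make a more stable identifier than the address itself. A small sketch of that check; `looks_like_ctx` is a hypothetical helper, and it only tests the three words recorded above:

```rust
/// Final (most-derived) vtable pointer written by `sub_824FD240`.
const FINAL_VTABLE: u32 = 0x8200_A1E8;

/// True when the three words probed at `addr` match the recorded
/// signature: `[+0]` is the final vtable, `[+4]` and `[+8]` point
/// back at the object itself.
fn looks_like_ctx(addr: u32, words: [u32; 3]) -> bool {
    words[0] == FINAL_VTABLE && words[1] == addr && words[2] == addr
}
```

An object still carrying the intermediate vtable `0x8200A908` at `[+0]` fails this check, which distinguishes a fully constructed ctx from one caught mid-ctor-chain.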
## LOC delta this session
- `crates/xenia-kernel/src/exports.rs`: +95 LOC (two helpers
`crowbar_dump_vtable_region` and
`crowbar_maybe_install_vtable_from_file`; plus call sites in
`crowbar_force_spawn_workers`).
- `crates/xenia-app/src/main.rs`: +9 LOC (enriched FAULT log with
tid/lr/ctr/r3/r4/r29/r30/r31).
- Total: ~104 LOC additive over v1. Within budget.
## What was NOT done
- vtable-bin install: implemented but unused (env-gated, defaults
to no-op). It was kept in tree so a v3 session could cross-validate
against canary's vtable bytes, but that is now known to be
unnecessary because the vtable in ours is correct.
- 3×OFF + 3×ON cold-run sweep: not run. v2 produces the same crash
signature as v1 because the gap is the ctx field, not the vtable, so
a 6-run sweep is expected to show identical progression metrics
(`swaps=1, draws=0, render_targets=0` ON; same numbers OFF). A
spot-check of one ON run matched. The full sweep was skipped to
honour the timebox.
- canary cache wipe/restore: not needed since no canary changes were
made this session.
## Files
- `step0-diag-stderr.log`: first run, vtable dump only (256 bytes).
- `step0b-diag.log`: second run, 512-byte vtable dump.
- `step0c-diag.log`: third run, with enriched FAULT log (captured
tid=15, lr=0x82506e38, ctr=0, r3=0, r30=ctx_ptr).