Source changes (dormant parity infra, retained from iterate 2.AI/2.AO): - xenia-kernel/exports.rs: nt_create_event manual_reset polarity + related event wiring - xenia-gpu/mmio_region.rs: D1MODE_VBLANK_VLINE_STATUS hardcode parity Also lands the audit-runs/ analysis notes (.md/.txt/.json digests) for the iterate 2.x VSync/0x10e8/0x1004 wedge investigation. Raw trace dumps (.jsonl/.gz/.csv/.stdout) and agent worktrees (.claude/) are gitignored as regenerable local artifacts — see memory + HANDOFF for the running findings. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Crowbar v2 — Step 0 (A) vs (B) verdict + new finding
Date: 2026-05-21
Predecessor: v1 at audit-runs/review-a-step1-crowbar/.
Status: LANDED diagnostic; ESCALATED before Step 2 install — neither
(A) nor (B) was the issue.
TL;DR
- (A) is FALSIFIED. Ours's XEX loader populates the vtable region
  0x8200A1E8..+512 correctly: 254/256 nonzero bytes in the first 256, and
  128/128 nonzero u32 slots in the first 512 bytes. Worker stub slots
  35/36/37/38 each hold real PPC fn pointers in the 0x8250xxxx range:
    vtable[35] @ 0x8200A274 = 0x82506B08
    vtable[36] @ 0x8200A278 = 0x82506DE8
    vtable[37] @ 0x8200A27C = 0x82508530
    vtable[38] @ 0x8200A280 = 0x82508A88
- (B) is FALSIFIED. There is no "runtime vtable install" step to
  mirror: the vtable contents come from .rdata and are present before
  the crowbar fires. The AUDIT-068 S3/S4 POD-copy writes 0x8200A1E8
  (vtable BASE) at [ctx+0]. That is a POINTER write, not a write of the
  vtable contents themselves.
- NEW CASE (C) discovered: the ctx-object layout is wider than the
  4 u32s AUDIT-068 S3 captured. [ctx+44] is a pointer to a SECOND object
  whose vtable+60 (slot 15) is dispatched by sub_82506DE8 (= vtable[36]
  of ctx, called by worker tid=15's entry stub at 0x82506558). Since we
  left [ctx+44] zero, the worker reads [0]=0, dereferences it as a
  vtable, computes CTR=[vtable+60]=0, and bctrl faults at PC=0.
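The slot numbers and addresses quoted above follow from 4-byte function pointers laid out from the vtable base. A minimal sketch of that arithmetic, assuming nothing beyond the base address and displacements reported in this note (the helper names slot_addr and slot_index are illustrative, not functions in the tree):

```rust
/// Base of the ctx vtable in .rdata, as reported in this note.
const VTABLE_BASE: u32 = 0x8200_A1E8;

/// Guest address of vtable slot `index`, assuming 4-byte fn pointers.
fn slot_addr(base: u32, index: u32) -> u32 {
    base.wrapping_add(index * 4)
}

/// Slot index selected by the displacement of `lwz rX, disp(rVtable)`.
fn slot_index(disp: u32) -> u32 {
    disp / 4
}
```

With these, slot_addr(VTABLE_BASE, 35) gives 0x8200A274, and the displacements 144, 260 and 60 seen in the disassembly map to slots 36, 65 and 15 respectively.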
v1 framing vs v2 ground truth
v1's crowbar-on-stderr.log showed FAULT: PC in unmapped memory cycle=20000167 pc=0x00000000 hw_id=0. v1's hypothesis was
"vtable[35] at 0x8200A274 is uninitialized/null, branch goes to
PC=0." v2 Step 0 diagnostic dumps the vtable region and shows that
hypothesis is wrong — every slot is populated.
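For reference, the populated-byte and populated-slot counts can be derived from a raw dump along these lines. This is a hedged sketch, not the code in crowbar_dump_vtable_region; DumpSummary and summarize_dump are hypothetical names, and it assumes the dump holds big-endian u32 slots as on the PPC guest:

```rust
/// Counts of populated data in a raw vtable dump.
#[derive(Debug, PartialEq)]
struct DumpSummary {
    nonzero_bytes: usize,
    nonzero_slots: usize,
    total_slots: usize,
}

/// Summarizes a dump as bytes and as big-endian u32 slots.
/// Trailing bytes that do not fill a whole slot are ignored for slot counts.
fn summarize_dump(bytes: &[u8]) -> DumpSummary {
    let nonzero_bytes = bytes.iter().filter(|&&b| b != 0).count();
    let slots: Vec<u32> = bytes
        .chunks_exact(4)
        .map(|c| u32::from_be_bytes([c[0], c[1], c[2], c[3]]))
        .collect();
    DumpSummary {
        nonzero_bytes,
        nonzero_slots: slots.iter().filter(|&&s| s != 0).count(),
        total_slots: slots.len(),
    }
}
```

A slot can be nonzero while still containing a 0x00 byte, which is one way a dump can report every slot populated yet fewer than all bytes nonzero.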
The enriched FAULT log added by v2 captured the smoking gun:
FAULT: PC in unmapped memory cycle=20000166 pc=0x00000000 hw_id=0
tid=Some(15) lr=0x82506e38 ctr=0x00000000 r3=0x00000000 r4=0
r29=0 r30=<ctx_ptr> r31=<...>
lr=0x82506e38 is one instruction past bctrl at 0x82506e34. The
sequence in sub_82506DE8 (which IS vtable[36], reached by worker
tid=15's stub at 0x82506558 → lwz r11, 0(r3); lwz r11, 144(r11); mtctr r11; bctrl):
0x82506de8: mflr r12
0x82506dec: bl 0x825F0F8C
0x82506df0: stwu r1, -144(r1)
0x82506df4: mr r30, r3 ; r30 = ctx_ptr
0x82506df8: lwz r11, 0(r30) ; r11 = 0x8200A1E8 (vtable)
0x82506dfc: lwz r11, 260(r11) ; r11 = vtable[65] (a fn)
0x82506e00: mtctr r11
0x82506e04: bctrl ; OK — returns
0x82506e08: rlwinm r11, r3, 0, 29, 29 ; bit 2 of r3
0x82506e10: bne cr6, 0x825070D4 ; if bit set: branch away
0x82506e18: lwz r3, 44(r30) ; r3 = [ctx+44] <-- ZERO
0x82506e28: lwz r11, 0(r3) ; r11 = [0] <-- ZERO
0x82506e2c: lwz r11, 60(r11) ; r11 = [60] <-- ZERO
0x82506e30: mtctr r11 ; CTR = 0
0x82506e34: bctrl ; LR := 0x82506e38, PC := 0
0x82506e38: <fault: PC unmapped>
So vtable[36] called vtable[65] (a real fn that returns OK), then
dispatched into [ctx+44] treated as another object. Our crowbar
left [ctx+44]=0, so the dispatch faulted.
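The faulting dispatch can be reproduced on paper with a toy model. The sketch below assumes a memory in which unwritten words read as zero, which matches the observed [0]=0 read but is not the real guest memory model; GuestMem and second_dispatch_target are hypothetical names used only for illustration:

```rust
use std::collections::HashMap;

/// Toy guest memory: words that were never written read as zero.
/// (Assumption for illustration; not the emulator's memory model.)
#[derive(Default)]
struct GuestMem {
    words: HashMap<u32, u32>,
}

impl GuestMem {
    fn write_u32(&mut self, addr: u32, value: u32) {
        self.words.insert(addr, value);
    }

    fn read_u32(&self, addr: u32) -> u32 {
        *self.words.get(&addr).unwrap_or(&0)
    }
}

/// Mirrors the second dispatch in sub_82506DE8:
///   r3 = [ctx+44]; r11 = [r3+0]; ctr = [r11+60]; bctrl
/// Returns the branch target, or None when CTR would be zero.
fn second_dispatch_target(mem: &GuestMem, ctx: u32) -> Option<u32> {
    let inner = mem.read_u32(ctx.wrapping_add(44)); // second object pointer
    let vtable = mem.read_u32(inner); // its vtable pointer
    let ctr = mem.read_u32(vtable.wrapping_add(60)); // slot 15
    if ctr == 0 {
        None
    } else {
        Some(ctr)
    }
}
```

With only [ctx+0] populated, as the crowbar leaves it, the function returns None, which corresponds to the PC=0 fault. It returns a target only once [ctx+44], the second object's vtable pointer, and that vtable's slot 15 are all populated.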
Why (B) framing missed this
The brief framed (B) as "vtable contents are constructed at runtime".
That's not true — vtable contents are static .rdata. What
AUDIT-068's S4 captured is the ctor chain that constructs the
ctx instance (the heap object):
- sub_824FECE0 (deepest): writes [ctx+4]=ctx, [ctx+8]=ctx, [ctx+12]=1.
  Also calls 0x8284DD1C with r3=ctx+16 (likely a linked-list/container
  init).
- sub_825065E8 (middle): chains to deepest, then writes
  [ctx+0]=0x8200A908 (intermediate vtable), then bl 0x825051D8.
- sub_824FD240 (most-derived): chains to middle, then writes
  [ctx+0]=0x8200A1E8 (final vtable). Returns.
None of these three ctors writes [ctx+44]. So [ctx+44] must be
written by either:
- Allocator initial-state (zero-fill? guest-side memset?), OR
- A factory function ABOVE the ctor chain (the caller of
  sub_824FD240 that allocates ctx, calls the ctor, then assigns fields
  including +44).
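The direct stores made by the ctor chain can be replayed against a zeroed object to see which offsets they touch. This is a sketch under stated assumptions: it models only the stores listed above, it omits the calls to 0x8284DD1C and 0x825051D8 because their effects are not known here, and the Ctx type and function names are hypothetical:

```rust
/// Word-indexed view of the first 128 bytes of the ctx object.
type Ctx = [u32; 32];

/// Stores `value` at byte `offset` (must be 4-byte aligned and < 128).
fn store(ctx: &mut Ctx, offset: usize, value: u32) {
    ctx[offset / 4] = value;
}

/// sub_824FECE0 (deepest): direct stores only; the call to 0x8284DD1C
/// with r3=ctx+16 is not modeled.
fn ctor_deepest(ctx: &mut Ctx, ctx_addr: u32) {
    store(ctx, 4, ctx_addr);
    store(ctx, 8, ctx_addr);
    store(ctx, 12, 1);
}

/// sub_825065E8 (middle): chains down, then installs the intermediate
/// vtable; the bl to 0x825051D8 is not modeled.
fn ctor_middle(ctx: &mut Ctx, ctx_addr: u32) {
    ctor_deepest(ctx, ctx_addr);
    store(ctx, 0, 0x8200_A908);
}

/// sub_824FD240 (most-derived): chains down, then installs the final vtable.
fn ctor_most_derived(ctx: &mut Ctx, ctx_addr: u32) {
    ctor_middle(ctx, ctx_addr);
    store(ctx, 0, 0x8200_A1E8);
}
```

After ctor_most_derived runs on a zeroed object, words 0 through 3 are populated and word 11 (offset 44) is still zero, consistent with the field being assigned somewhere outside these three direct-store sequences.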
AUDIT-064 named the caller chain sub_824F8398 → sub_824F7CD0 → sub_824F7800 → [bl at +0x38 = sub_824FD240]. So sub_824F7800 is
likely the factory that does the +44 field assignment AFTER the
ctor returns. Without disassembling sub_824F7800 and tracing each
field-store, we can't synthesize the missing fields.
Why escalating is the right call now
Per the brief's tripstone #6 — 2-hour timebox. We've already discovered the framing was wrong and the gap is wider than v2 was scoped to fix. The honest moves are:
- Stop and document the new finding (this doc + memory entry).
- Recommend the next session's investigation: disassemble
  sub_824F7800 (and sub_824F7CD0, sub_824F8398) field-by-field to
  enumerate every store-to-r31 / store-to-ctx_ptr after the ctor chain
  returns. Mirror those stores in a crowbar v3.
- Alternative, much wider: build a canary read-probe sweep over
  [ctx+0..ctx+128] to capture the live state. ~200 LOC of canary
  instrumentation; trades complexity for ground-truth.
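The read-probe sweep in the last option could take roughly the following shape. This is only a sketch of the idea, not canary instrumentation: sweep_ctx and diff_sweeps are hypothetical names, and the memory read is passed in as a closure because the real probe hook is not specified here:

```rust
/// Reads one u32 per 4-byte offset in [ctx+0, ctx+len) through `read_u32`
/// and returns (offset, value) pairs.
fn sweep_ctx<F>(ctx: u32, len: u32, read_u32: F) -> Vec<(u32, u32)>
where
    F: Fn(u32) -> u32,
{
    (0..len)
        .step_by(4)
        .map(|off| (off, read_u32(ctx.wrapping_add(off))))
        .collect()
}

/// Offsets whose values differ between two sweeps of equal length.
fn diff_sweeps(a: &[(u32, u32)], b: &[(u32, u32)]) -> Vec<u32> {
    a.iter()
        .zip(b)
        .filter(|(x, y)| x.1 != y.1)
        .map(|(x, _)| x.0)
        .collect()
}
```

Diffing a canary sweep against an ours sweep would list the offsets a crowbar v3 needs to fill. Because the two ctx objects live at different addresses, self-referential fields such as [ctx+4]=ctx would show up as differences unless they are normalized first.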
Run-determined ctx addresses for reference
- v1's crowbar (in ours): ctx_ptr = 0x4D1D9000 (heap_alloc bump cursor
  at trigger time).
- Canary's natural ctx (per AUDIT-068 S4): 0xBCE25340 and 0xBCE251C0
  were captured in different cold runs (arena drift). The probe at
  0xBCE251C0..+8 confirmed [ctx+0]=0x8200A1E8, [ctx+4]=ctx, [ctx+8]=ctx
  (the doubly-linked list head).
LOC delta this session
- crates/xenia-kernel/src/exports.rs: +95 LOC (two helpers,
  crowbar_dump_vtable_region and crowbar_maybe_install_vtable_from_file,
  plus call sites in crowbar_force_spawn_workers).
- crates/xenia-app/src/main.rs: +9 LOC (enriched FAULT log with
  tid/lr/ctr/r3/r4/r29/r30/r31).
- Total: ~104 LOC additive over v1. Within budget.
What was NOT done
- vtable-bin install: implemented but unused (env-gated, defaults to
  no-op). Kept in tree for v3 in case a future session captures canary's
  vtable bytes for cross-validation, although we now know that is
  unnecessary because ours's vtable is correct.
- 3×OFF + 3×ON cold-run sweep: v2 produces the same crash signature
  as v1 because the gap is the ctx-field, not the vtable. A 6-run
  sweep would show identical progression metrics (swaps=1, draws=0,
  render_targets=0 ON; same numbers OFF), as confirmed by a spot-check of
  one ON run. Skipping the full sweep to honour the timebox.
- canary cache wipe/restore: not needed since no canary changes were
  made this session.
Files
- step0-diag-stderr.log: first run, vtable dump only (256 bytes).
- step0b-diag.log: second run, 512-byte vtable dump.
- step0c-diag.log: third run, with enriched FAULT log (captured
  tid=15, lr=0x82506e38, ctr=0, r3=0, r30=ctx_ptr).