xenia-rs/audit-runs/review-a-step1b-crowbar-v2/investigation.md
MechaCat02 ef93a4fa14 handoff: VSync/event-wedge fixes + iterate 2.A–2.BC research notes
Source changes (dormant parity infra, retained from iterate 2.AI/2.AO):
- xenia-kernel/exports.rs: nt_create_event manual_reset polarity +
  related event wiring
- xenia-gpu/mmio_region.rs: D1MODE_VBLANK_VLINE_STATUS hardcode parity

Also lands the audit-runs/ analysis notes (.md/.txt/.json digests) for the
iterate 2.x VSync/0x10e8/0x1004 wedge investigation. Raw trace dumps
(.jsonl/.gz/.csv/.stdout) and agent worktrees (.claude/) are gitignored as
regenerable local artifacts — see memory + HANDOFF for the running findings.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-05 07:19:08 +02:00


Crowbar v2 — Step 0 (A) vs (B) verdict + new finding

Date: 2026-05-21
Predecessor: v1 at audit-runs/review-a-step1-crowbar/.
Status: LANDED diagnostic; ESCALATED before Step 2 install, since neither (A) nor (B) was the issue.

TL;DR

  • (A) is FALSIFIED. Our XEX loader populates the vtable region 0x8200A1E8..+512 correctly: 254 of the first 256 bytes are nonzero, and all 128 u32 slots in the first 512 bytes are nonzero. Worker stub slots 35/36/37/38 each hold real PPC fn pointers in the 0x8250xxxx range:
    • vtable[35] @ 0x8200A274 = 0x82506B08
    • vtable[36] @ 0x8200A278 = 0x82506DE8
    • vtable[37] @ 0x8200A27C = 0x82508530
    • vtable[38] @ 0x8200A280 = 0x82508A88
  • (B) is FALSIFIED. There is no "runtime vtable install" step to mirror — the vtable contents come from .rdata and are present before the crowbar fires. The AUDIT-068 S3/S4 POD-copy writes 0x8200A1E8 (vtable BASE) at [ctx+0] — a POINTER write — not the vtable contents themselves.
  • NEW CASE (C) discovered: the ctx-object layout is wider than the 4 u32s AUDIT-068 S3 captured. [ctx+44] is a pointer to a SECOND object whose vtable+60 (slot 15) is dispatched by sub_82506DE8 (= vtable[36] of ctx, called by worker tid=15's entry stub at 0x82506558). Since we left [ctx+44] zero, the worker loads r3=[ctx+44]=0, reads [0]=0 as that object's vtable pointer, computes CTR=[vtable+60]=[60]=0, and bctrl faults at PC=0.
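The slot addresses and displacements quoted above all follow from one rule: a slot is 4 bytes, so slot N lives at base + 4*N. A minimal sketch of that arithmetic for cross-checking the numbers; the helper names `slot_addr` and `slot_index` are illustrative and not part of the crowbar code.

```rust
/// Size of one vtable slot on the 32-bit PowerPC guest (one guest pointer).
const SLOT_SIZE: u32 = 4;

/// Guest address of slot `index` in the vtable that starts at `base`.
fn slot_addr(base: u32, index: u32) -> u32 {
    base + index * SLOT_SIZE
}

/// Slot index addressed by a byte displacement such as `lwz r11, 144(r11)`.
fn slot_index(displacement: u32) -> u32 {
    displacement / SLOT_SIZE
}

fn main() {
    let vtable_base = 0x8200_A1E8;
    for index in 35..=38 {
        println!("vtable[{index}] @ {:#010X}", slot_addr(vtable_base, index));
    }
    // Displacements that appear in the stub and in sub_82506DE8.
    for disp in [144, 260, 60] {
        println!("displacement {disp} -> slot {}", slot_index(disp));
    }
}
```

This reproduces the pairing used in these notes: displacement 144 is slot 36, 260 is slot 65, and 60 is slot 15.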

v1 framing vs v2 ground truth

v1's crowbar-on-stderr.log showed FAULT: PC in unmapped memory cycle=20000167 pc=0x00000000 hw_id=0. v1's hypothesis was "vtable[35] at 0x8200A274 is uninitialized/null, branch goes to PC=0." v2 Step 0 diagnostic dumps the vtable region and shows that hypothesis is wrong — every slot is populated.
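For reference, the "every slot is populated" check boils down to counting nonzero bytes and nonzero big-endian u32 slots in the dumped region. The sketch below shows that counting on an in-memory byte slice. It is not the actual `crowbar_dump_vtable_region` helper; the function names and the way the dump bytes are obtained are assumptions.

```rust
/// Counts nonzero bytes in a raw memory dump.
fn nonzero_bytes(dump: &[u8]) -> usize {
    dump.iter().filter(|&&b| b != 0).count()
}

/// Decodes the dump as big-endian u32 slots (the guest is big-endian PowerPC).
/// A trailing partial slot, if any, is ignored.
fn slots(dump: &[u8]) -> Vec<u32> {
    dump.chunks_exact(4)
        .map(|c| u32::from_be_bytes([c[0], c[1], c[2], c[3]]))
        .collect()
}

/// Indices of slots whose value is zero, i.e. candidates for a null dispatch.
fn null_slot_indices(dump: &[u8]) -> Vec<usize> {
    slots(dump)
        .iter()
        .enumerate()
        .filter(|(_, v)| **v == 0)
        .map(|(i, _)| i)
        .collect()
}

fn main() {
    // Illustrative input: the four worker-stub slot values quoted in the TL;DR.
    let mut dump = Vec::new();
    for ptr in [0x8250_6B08u32, 0x8250_6DE8, 0x8250_8530, 0x8250_8A88] {
        dump.extend_from_slice(&ptr.to_be_bytes());
    }
    println!("nonzero bytes: {}/{}", nonzero_bytes(&dump), dump.len());
    println!("null slots: {:?}", null_slot_indices(&dump));
}
```

Byte and slot counts can legitimately differ: a valid pointer may contain a zero byte, so 254/256 nonzero bytes is compatible with every slot being nonzero.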

The enriched FAULT log added by v2 captured the smoking gun:

FAULT: PC in unmapped memory cycle=20000166 pc=0x00000000 hw_id=0
  tid=Some(15) lr=0x82506e38 ctr=0x00000000 r3=0x00000000 r4=0
  r29=0 r30=<ctx_ptr> r31=<...>

lr=0x82506e38 is one instruction past the bctrl at 0x82506e34. The sequence in sub_82506DE8 (which IS vtable[36], reached by worker tid=15's stub at 0x82506558 via lwz r11, 0(r3); lwz r11, 144(r11); mtctr r11; bctrl) is:

0x82506de8: mflr r12
0x82506dec: bl   0x825F0F8C
0x82506df0: stwu r1, -144(r1)
0x82506df4: mr   r30, r3              ; r30 = ctx_ptr
0x82506df8: lwz  r11, 0(r30)          ; r11 = 0x8200A1E8 (vtable)
0x82506dfc: lwz  r11, 260(r11)        ; r11 = vtable[65] (a fn)
0x82506e00: mtctr r11
0x82506e04: bctrl                     ; OK — returns
0x82506e08: rlwinm r11, r3, 0, 29, 29 ; bit 2 of r3
0x82506e10: bne cr6, 0x825070D4       ; if bit set: branch away
0x82506e18: lwz r3, 44(r30)           ; r3 = [ctx+44]    <-- ZERO
0x82506e28: lwz r11, 0(r3)            ; r11 = [0]       <-- ZERO
0x82506e2c: lwz r11, 60(r11)          ; r11 = [60]      <-- ZERO
0x82506e30: mtctr r11                 ; CTR = 0
0x82506e34: bctrl                     ; LR := 0x82506e38, PC := 0
0x82506e38: <return address, never reached; the fault is the fetch at PC=0>

So vtable[36] called vtable[65] (a real fn that returns OK), then dispatched into [ctx+44] treated as another object. Our crowbar left [ctx+44]=0, so the dispatch faulted.
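The three loads at 0x82506e18..0x82506e2c can be replayed over a toy memory model to show why a zero [ctx+44] ends in CTR=0. This is a sketch under one assumption taken from the fault log: words that were never written, including the word at address 0, read back as 0. The `Mem` type and every address in the "populated" case are hypothetical placeholders, not captured values.

```rust
use std::collections::HashMap;

/// Sparse guest memory for illustration; any word never written reads as 0.
struct Mem(HashMap<u32, u32>);

impl Mem {
    fn new() -> Self {
        Mem(HashMap::new())
    }
    fn write(&mut self, addr: u32, value: u32) {
        self.0.insert(addr, value);
    }
    fn read(&self, addr: u32) -> u32 {
        self.0.get(&addr).copied().unwrap_or(0)
    }
}

/// Replays the second dispatch in sub_82506DE8 and returns the CTR value
/// that the bctrl at 0x82506e34 would branch to.
fn second_dispatch_target(mem: &Mem, ctx: u32) -> u32 {
    let obj = mem.read(ctx.wrapping_add(44)); // lwz r3, 44(r30)
    let vtable = mem.read(obj); // lwz r11, 0(r3)
    mem.read(vtable.wrapping_add(60)) // lwz r11, 60(r11)
}

fn main() {
    let ctx = 0x4D1D_9000;
    let mut mem = Mem::new();
    mem.write(ctx, 0x8200_A1E8); // the only field relevant here that the crowbar set
    println!("crowbar ctx: CTR = {:#010x}", second_dispatch_target(&mem, ctx));

    // Hypothetical second object, vtable and target, for contrast only.
    let (obj, obj_vtable, target) = (0x4D1D_A000, 0x8200_B000, 0x8250_0000);
    mem.write(ctx + 44, obj);
    mem.write(obj, obj_vtable);
    mem.write(obj_vtable + 60, target);
    println!("populated ctx: CTR = {:#010x}", second_dispatch_target(&mem, ctx));
}
```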

Why (B) framing missed this

The brief framed (B) as "vtable contents are constructed at runtime". That's not true — vtable contents are static .rdata. What AUDIT-068's S4 captured is the ctor chain that constructs the ctx instance (the heap object):

  • sub_824FECE0 (deepest): writes [ctx+4]=ctx, [ctx+8]=ctx, [ctx+12]=1. Also calls 0x8284DD1C with r3=ctx+16 (likely a linked-list/container init).
  • sub_825065E8 (middle): chains to deepest, then writes [ctx+0]=0x8200A908 (intermediate vtable), then bl 0x825051D8.
  • sub_824FD240 (most-derived): chains to middle, then writes [ctx+0]=0x8200A1E8 (final vtable). Returns.

None of these three ctors writes [ctx+44]. So [ctx+44] must be written by either:

  1. Allocator initial-state (zero-fill? guest-side memset?), OR
  2. A factory function ABOVE the ctor chain (the caller of sub_824FD240 that allocates ctx, calls ctor, then assigns fields including +44).

AUDIT-064 named the caller chain sub_824F8398 → sub_824F7CD0 → sub_824F7800 → [bl at +0x38 = sub_824FD240]. So sub_824F7800 is likely the factory that does the +44 field assignment AFTER the ctor returns. Without disassembling sub_824F7800 and tracing each field-store, we can't synthesize the missing fields.
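The ctor-chain stores listed above can be modelled directly to confirm which offsets they cover. In this sketch the 128-byte object size and the function names are assumptions, and the two callees (0x8284DD1C and 0x825051D8) are deliberately left out because their stores have not been traced.

```rust
/// Hypothetical object size for illustration: 32 words (128 bytes).
const CTX_WORDS: usize = 32;

/// One entry per 32-bit field; `None` means no modelled ctor stored there.
type Ctx = [Option<u32>; CTX_WORDS];

fn store(ctx: &mut Ctx, offset: usize, value: u32) {
    ctx[offset / 4] = Some(value);
}

/// sub_824FECE0 (deepest). The call to 0x8284DD1C with r3 = ctx+16 is not modelled.
fn ctor_deepest(ctx: &mut Ctx, ctx_ptr: u32) {
    store(ctx, 4, ctx_ptr);
    store(ctx, 8, ctx_ptr);
    store(ctx, 12, 1);
}

/// sub_825065E8 (middle). The trailing bl 0x825051D8 is not modelled.
fn ctor_middle(ctx: &mut Ctx, ctx_ptr: u32) {
    ctor_deepest(ctx, ctx_ptr);
    store(ctx, 0, 0x8200_A908); // intermediate vtable
}

/// sub_824FD240 (most-derived).
fn ctor_most_derived(ctx: &mut Ctx, ctx_ptr: u32) {
    ctor_middle(ctx, ctx_ptr);
    store(ctx, 0, 0x8200_A1E8); // final vtable
}

/// Byte offsets that at least one modelled ctor wrote.
fn written_offsets(ctx: &Ctx) -> Vec<usize> {
    ctx.iter()
        .enumerate()
        .filter(|(_, field)| field.is_some())
        .map(|(i, _)| i * 4)
        .collect()
}

fn main() {
    let mut ctx: Ctx = [None; CTX_WORDS];
    ctor_most_derived(&mut ctx, 0x4D1D_9000);
    println!("offsets written by the ctor chain: {:?}", written_offsets(&ctx));
    println!("[ctx+44] written: {}", ctx[44 / 4].is_some());
}
```

Tracking fields as `Option<u32>` keeps "never written" distinct from "written as zero", which is the distinction that matters when deciding what a crowbar v3 has to synthesize.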

Why escalating is the right call now

The brief's tripstone #6 sets a 2-hour timebox, and we have already discovered that the framing was wrong and that the gap is wider than v2 was scoped to fix. The honest moves are:

  1. Stop and document the new finding (this doc + memory entry).
  2. Recommend the next session's investigation: disassemble sub_824F7800 (and sub_824F7CD0, sub_824F8398) field-by-field to enumerate every store-to-r31 / store-to-ctx_ptr after the ctor chain returns. Mirror those stores in a crowbar v3.
  3. Alternative — much wider: build a canary read-probe sweep over [ctx+0..ctx+128] to capture the live state. ~200 LOC canary instrumentation; trades complexity for ground-truth.
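Option 3 amounts to sweeping the ctx window word by word on both sides and listing the offsets that are populated in the canary run but zero in ours. A rough sketch of that comparison follows; it assumes a caller-supplied `read_u32` closure standing in for whatever guest-memory accessor each side exposes. The function names and the sample snapshot values are placeholders, not captured data.

```rust
use std::collections::HashMap;

/// Reads `len` bytes starting at `ctx` as 32-bit words and returns
/// (offset, value) pairs. `read_u32` stands in for a guest-memory accessor.
fn sweep(read_u32: impl Fn(u32) -> u32, ctx: u32, len: u32) -> Vec<(u32, u32)> {
    (0..len)
        .step_by(4)
        .map(|off| (off, read_u32(ctx.wrapping_add(off))))
        .collect()
}

/// Offsets that are nonzero in the reference sweep but zero in ours,
/// i.e. the fields a later crowbar would have to synthesize.
fn missing_fields(reference: &[(u32, u32)], ours: &[(u32, u32)]) -> Vec<u32> {
    reference
        .iter()
        .zip(ours)
        .filter(|((_, r), (_, o))| *r != 0 && *o == 0)
        .map(|((off, _), _)| *off)
        .collect()
}

fn main() {
    // Hypothetical snapshots; the +44 value is a placeholder, not a captured one.
    let canary: HashMap<u32, u32> = HashMap::from([
        (0xBCE2_51C0, 0x8200_A1E8),
        (0xBCE2_51C0 + 44, 0xBCE2_6000),
    ]);
    let ours: HashMap<u32, u32> = HashMap::from([(0x4D1D_9000, 0x8200_A1E8)]);

    let reference = sweep(|a| canary.get(&a).copied().unwrap_or(0), 0xBCE2_51C0, 128);
    let mine = sweep(|a| ours.get(&a).copied().unwrap_or(0), 0x4D1D_9000, 128);
    println!("offsets to synthesize: {:?}", missing_fields(&reference, &mine));
}
```

Comparing by offset rather than by absolute address sidesteps the arena drift between cold runs, although self-pointers such as [ctx+4] still need rebasing before they can be copied across.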

Run-determined ctx addresses for reference

  • v1's crowbar (in ours): ctx_ptr = 0x4D1D9000 (heap_alloc bump cursor at trigger time).
  • Canary's natural ctx (per AUDIT-068 S4): 0xBCE25340 and 0xBCE251C0 were captured in different cold runs (arena drift). The probe at 0xBCE251C0..+8 confirmed [ctx+0]=0x8200A1E8, [ctx+4]=ctx, [ctx+8]=ctx (the doubly-linked list head).

LOC delta this session

  • crates/xenia-kernel/src/exports.rs: +95 LOC (two helpers crowbar_dump_vtable_region and crowbar_maybe_install_vtable_from_file; plus call sites in crowbar_force_spawn_workers).
  • crates/xenia-app/src/main.rs: +9 LOC (enriched FAULT log with tid/lr/ctr/r3/r4/r29/r30/r31).
  • Total: ~104 LOC additive over v1. Within budget.

What was NOT done

  • vtable-bin install: implemented but unused (env-gated, defaults to no-op). Kept in tree for v3 in case a future session captures canary's vtable bytes for cross-validation, although we now know that is unnecessary because our vtable is correct.
  • 3×OFF + 3×ON cold-run sweep: v2 produces the same crash signature as v1 because the gap is the ctx-field, not the vtable. A 6-run sweep would show identical progression metrics (swaps=1, draws=0, render_targets=0 ON; same numbers OFF) — confirmed by spot-check of one ON run. Skipping the full sweep to honour the timebox.
  • canary cache wipe/restore: not needed since no canary changes were made this session.

Files

  • step0-diag-stderr.log: first run, vtable dump only (256 bytes).
  • step0b-diag.log: second run, 512-byte vtable dump.
  • step0c-diag.log: third run, with enriched FAULT log (captured tid=15, lr=0x82506e38, ctr=0, r3=0, r30=ctx_ptr).