xenia-rs/audit-runs/audit-068-host-mem-watch/writer-report-v4.md
MechaCat02 ef93a4fa14 handoff: VSync/event-wedge fixes + iterate 2.A–2.BC research notes
Source changes (dormant parity infra, retained from iterate 2.AI/2.AO):
- xenia-kernel/exports.rs: nt_create_event manual_reset polarity +
  related event wiring
- xenia-gpu/mmio_region.rs: D1MODE_VBLANK_VLINE_STATUS hardcode parity

Also lands the audit-runs/ analysis notes (.md/.txt/.json digests) for the
iterate 2.x VSync/0x10e8/0x1004 wedge investigation. Raw trace dumps
(.jsonl/.gz/.csv/.stdout) and agent worktrees (.claude/) are gitignored as
regenerable local artifacts — see memory + HANDOFF for the running findings.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-05 07:19:08 +02:00

# AUDIT-068 Session 4 — writer identified (guest PPC code)
Date: 2026-05-20
## Headline
**Writer found.** The write of `0x8200A1E8` to `[0xBCE251C0]` is performed
by **JIT-emitted guest PPC code**, NOT host C++ code. Reading-error #36 (POD
struct-copy bypass) — registered in Sessions 2 and 3 as the explanation for the
host-side surface gap — is **partially superseded**: the gap is real for host
C++ writes, but the actual writer of THIS particular vptr install is on the
guest side. AUDIT-067 (which hooks all 16 PPC store opcodes at JIT-emit time)
caught it on the first try, once the correct target value `0x8200A1E8` was
configured (per the Session 3 correction; AUDIT-067's prior runs watched the
wrong value `0x8200A208`).
**No new instrumentation was needed.** Session 4 used the existing AUDIT-067
machinery + Session 3's AUDIT-068 read-probe to cross-validate.
## Writer PC and ctor chain
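The ANON_Class_713383D7 instance is constructed via a **three-level
inheritance ctor chain**, fully on the guest PPC side. The deepest base ctor
initializes the list fields; the intermediate and most-derived ctors then each
overwrite the vptr at offset 0 with their own vtable: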
```
sub_824FECE0 (deepest base ctor)
├─ stw r31, 4(r31) ; *(this+4) = this ← self-pointer
├─ stw r31, 8(r31) ; *(this+8) = this ← self-pointer
├─ stw r11=1, 12(r31) ; *(this+12) = 1 (refcount?)
└─ bl 0x8284DD1C ; sub-helper on &this[+16]
↑ called from
sub_825065E8 (intermediate base ctor — writes vtable 0x8200A908)
├─ bl sub_824FECE0 ; chain to deepest base
├─ lis r11, 0x8201; subi r11, 22264 → r11 = 0x8200A908
├─ stw r11, 0(r31) ; *(this) = 0x8200A908
└─ bl 0x825051D8 ; sub-helper init of fields
↑ called from
sub_824FD240 (most-derived ctor — writes vtable 0x8200A1E8)
├─ bl sub_825065E8 ; chain to intermediate base
├─ lis r11, 0x8201; subi r11, 24088 → r11 = 0x8200A1E8
└─ stw r11, 0(r31) ; *(this) = 0x8200A1E8 ← THE INSTALL
```
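The `lis`/`subi` pairs in the listing build each vtable address from a high halfword and a 16-bit displacement. The arithmetic can be checked with a minimal sketch; the helper name `lis_subi` is hypothetical and only models the two instructions on a 32-bit register:

```python
def lis_subi(high16: int, sub: int) -> int:
    """Model `lis rX, high16` followed by `subi rX, rX, sub` (32-bit wraparound)."""
    return ((high16 << 16) - sub) & 0xFFFFFFFF

# Immediates taken from the ctor listing above.
intermediate_vtable = lis_subi(0x8201, 22264)
derived_vtable = lis_subi(0x8201, 24088)

print(hex(intermediate_vtable), hex(derived_vtable))
```

Both results match the values quoted in the listing (`0x8200A908` and `0x8200A1E8`).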
The doubly-linked list head sentinel pattern observed by Session 3's read
probe (`{vptr, self, self}` at offsets {0, +4, +8}) is now fully explained:
- Offsets +4 and +8 are written by the **deepest base ctor** (`sub_824FECE0`
at PCs `0x824FECFC` and `0x824FED04`) as `*(this+4) = this; *(this+8) =
this`. This is the LIST_ENTRY head sentinel.
- Offset 0 is overwritten twice in rapid succession by the inheritance
chain (first `0x8200A908`, then `0x8200A1E8`), landing on `0x8200A1E8` after
`sub_824FD240` completes. The read probe (1ms poll period) can miss the
intermediate value, although Run 11 below did catch it.

From the read probe's perspective all of these writes can land within a single
1ms poll tick, which is why the install LOOKS like a 12-byte POD struct copy. It
is actually 3 separate ctors issuing 5 PPC `stw` instructions: two vptr writes
at offset 0, two list-init self-pointer writes at +4 and +8, and a word write
of 1 at +12 (a refcount?) that a read-probe neighbor at `0xBCE251CC` would have
detected. The neighbor at `0xBCE251BC` (`-4`) DOES NOT change because the ctors
only write at offsets >= 0.
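
To see why a coarse poll makes separate stores look like one struct copy, the stores from the ctor listing can be replayed in order against an empty map. This is a minimal sketch, not the probe itself; `CTOR_STORES` and `replay` are hypothetical names, and the store order assumes each base ctor finishes before the derived ctor writes its vtable, as the listing shows:

```python
THIS = 0xBCE251C0  # instance address from Run 11

# (offset, value) word stores in execution order, per the ctor listing.
CTOR_STORES = [
    (4, THIS),        # sub_824FECE0: *(this+4) = this
    (8, THIS),        # sub_824FECE0: *(this+8) = this
    (12, 1),          # sub_824FECE0: *(this+12) = 1
    (0, 0x8200A908),  # sub_825065E8: intermediate vtable
    (0, 0x8200A1E8),  # sub_824FD240: most-derived vtable (the install)
]

def replay(stores, base):
    """Apply word stores in order; return the final {address: value} map."""
    mem = {}
    for offset, value in stores:
        mem[base + offset] = value
    return mem

final = replay(CTOR_STORES, THIS)
# What a poll taken after the whole chain would see at -4, 0, +4, +8, +12.
snapshot = [final.get(THIS + off) for off in (-4, 0, 4, 8, 12)]
print([None if v is None else hex(v) for v in snapshot])
```

A poll taken after the chain completes sees `{vptr, self, self}` already in place and nothing at `-4`, which is indistinguishable from a block copy at that sampling rate.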
## Capture evidence
### Run 11 — AUDIT-067 with corrected value `0x8200A1E8`
Cmdline: `--audit_67_value_watch=0x8200A1E8 --audit_68_host_mem_read_probe=0xBCE251C0:4:1000000 --audit_61_branch_probe_pcs=0x825070F0 --mute=true`.
The very first PPC store of `0x8200A1E8` hit:
```
host_ns=10019392400 CHANGE 0xBCE251C0: 0xBCE25640 → 0x8200A908 (read probe — intermediate base ctor)
host_ns=10021528400 CHANGE 0xBCE251C0: 0x8200A908 → 0x8200A1E8 (read probe — most-derived ctor)
AUDIT-067-VAL pc=824FD264 lr=824FD258 val=8200A1E8 dst=BCE251C0
r3=BCE251C0 r4=00000002 r5=00000020 r6=03A72280
r31=BCE251C0 tid=6
AUDIT-061-BR pc=825070F0 lr=824F7B24 r3=BCE251C0 tid=6 (slot-1 dispatch fires
immediately after)
```
The intermediate-base vtable write at PC `0x8250660C` (value `0x8200A908`)
was NOT in this run's AUDIT-067 watch list (only `0x8200A1E8` was), so only
the most-derived hit is logged. Run 12 confirms.
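
The two read-probe transitions in the Run 11 excerpt can be parsed to measure the gap between them. The sketch below assumes the line layout is exactly as quoted (annotations dropped); `CHANGE_RE` and `parse_change` are hypothetical helpers, not part of the canary tooling. The timestamps are presumably when the probe noticed the change rather than when the store executed:

```python
import re

# Read-probe lines copied from the Run 11 excerpt (annotations dropped).
PROBE_LINES = [
    "host_ns=10019392400 CHANGE 0xBCE251C0: 0xBCE25640 → 0x8200A908",
    "host_ns=10021528400 CHANGE 0xBCE251C0: 0x8200A908 → 0x8200A1E8",
]

CHANGE_RE = re.compile(
    r"host_ns=(\d+) CHANGE (0x[0-9A-F]+): (0x[0-9A-F]+) → (0x[0-9A-F]+)"
)

def parse_change(line):
    """Return (host_ns, address, old_value, new_value) as integers."""
    m = CHANGE_RE.match(line)
    if m is None:
        raise ValueError(f"not a CHANGE line: {line!r}")
    host_ns, addr, old, new = m.groups()
    return int(host_ns), int(addr, 16), int(old, 16), int(new, 16)

changes = [parse_change(line) for line in PROBE_LINES]
delta_ns = changes[1][0] - changes[0][0]
print(f"{delta_ns} ns between observed transitions")
```

The second transition starts from the value the first one ended on, so the two lines describe one continuous overwrite sequence at `0xBCE251C0`.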
### Run 12 — AUDIT-067 with both vtable values + 3-slot read probe
Cmdline: `--audit_67_value_watch=0x8200A1E8,0x8200A908,0xBCE251C0 --audit_68_host_mem_read_probe=0xBCE251C0:4:1000000,0xBCE251C4:4:1000000,0xBCE251C8:4:1000000 --audit_61_branch_probe_pcs=0x825070F0 --mute=true`.
Captures the full ctor chain on a different cold-trajectory (instance at
`0xBCE25340` this time — arena-drift sister of Run 11's `0xBCE251C0`):
```
AUDIT-067-VAL pc=8250660C lr=82506600 val=8200A908 dst=BCE25340 (intermediate base ctor write)
AUDIT-067-VAL pc=824FD264 lr=824FD258 val=8200A1E8 dst=BCE25340 (most-derived ctor write)
AUDIT-061-BR pc=825070F0 lr=824F7B24 r3=BCE25340 (slot-1 dispatch)
```
Both runs reproduce the most-derived writer PC `0x824FD264`; Run 12 adds the
intermediate writer PC `0x8250660C`, whose value `0x8200A908` matches the
read-probe transition seen in Run 11. The instance address VARIES (arena
drift), but the writer PCs do not.
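
The cross-run comparison can be done mechanically on the `AUDIT-067-VAL` lines. This is a sketch under the assumption that the fields appear exactly as in the excerpts above; `VAL_RE` and `hits` are hypothetical names:

```python
import re

VAL_RE = re.compile(
    r"AUDIT-067-VAL pc=([0-9A-F]{8}) lr=([0-9A-F]{8}) "
    r"val=([0-9A-F]{8}) dst=([0-9A-F]{8})"
)

# Lines copied from the Run 11 and Run 12 excerpts (annotations dropped).
RUN11 = ["AUDIT-067-VAL pc=824FD264 lr=824FD258 val=8200A1E8 dst=BCE251C0"]
RUN12 = [
    "AUDIT-067-VAL pc=8250660C lr=82506600 val=8200A908 dst=BCE25340",
    "AUDIT-067-VAL pc=824FD264 lr=824FD258 val=8200A1E8 dst=BCE25340",
]

def hits(lines):
    """Map watched value -> (writer pc, destination address)."""
    out = {}
    for line in lines:
        m = VAL_RE.search(line)
        if m:
            pc, _lr, val, dst = (int(g, 16) for g in m.groups())
            out[val] = (pc, dst)
    return out

run11, run12 = hits(RUN11), hits(RUN12)
install = 0x8200A1E8
same_pc = run11[install][0] == run12[install][0]
drift = run12[install][1] - run11[install][1]
print(same_pc, hex(drift))
```

The writer PC for the install value is identical in both runs while the destination moves by `0x180` bytes, which is the arena drift described above.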
## Why earlier sessions missed this
### Sessions 1+2
Hooked `xe::store_and_swap<T>`, `xe::store<T>`, `Memory::Zero/Fill/Copy`,
`xe::endian_store::set()`, `Memory::Copy` byte-scan, 4 XEX-loader memcpy
sites. These are HOST C++ write paths to guest memory. The JIT does NOT use
them — JIT-emitted PPC stores compile down to direct x64 `mov` instructions
operating on `virtual_membase_ + va`, with inline byte-swap intrinsics
(`bswap` / `pshufb`). They bypass every `xe::store*` template.
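
The bypass can be illustrated with a toy model: one store path goes through a hooked helper, the other swaps bytes and writes straight into the mapping. Everything here (`guest_mem`, `hook_log`, `hooked_store_u32`, `jit_store_u32`) is hypothetical and only models the idea; it is not Xenia's API:

```python
import struct

guest_mem = bytearray(0x20)  # stand-in for the host mapping of guest memory
hook_log = []                # what a host-side write hook would record

def hooked_store_u32(va, value):
    """Helper-style path: the hook fires, then the big-endian store happens."""
    hook_log.append((va, value))
    struct.pack_into(">I", guest_mem, va, value)

def jit_store_u32(va, value):
    """JIT-style path: byte-swap in a register, then a plain little-endian write."""
    swapped = int.from_bytes(value.to_bytes(4, "little"), "big")  # models bswap
    guest_mem[va:va + 4] = swapped.to_bytes(4, "little")          # models x64 mov

jit_store_u32(0x10, 0x8200A1E8)
stored = struct.unpack_from(">I", guest_mem, 0x10)[0]
print(hex(stored), hook_log)
```

The guest-visible big-endian word is in memory, yet `hook_log` stays empty: the value arrived without ever passing through the hooked helper.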
Reading-error #35 (Session 1: "hook-surface incompleteness") was right to
the extent that the surfaces don't cover all host-side write paths — but
this writer was never on the host side at all.
### Session 3
Read probe captured the install epoch (~9.4-9.6s host_ns) and the neighbor
pattern (3 simultaneous writes within 1ms). The "POD struct copy bypass"
hypothesis (reading-error #36) was a reasonable explanation under the
constraint "host-write surfaces miss the install", but the actual cause is
that the writes come from the JIT not from host code at all.
### AUDIT-067 (prior to Session 4)
Watched value `0x8200A208`. The CORRECT vptr value is `0x8200A1E8` (per
Session 3's correction). AUDIT-067 was hooked into every PPC store opcode and
would have caught the install on the first run if it had watched the right
value. Session 4 re-ran it with the corrected value and caught the writer.
## ours-side cross-reference
`sub_824FD240` is GUEST PPC code present in the Sylpheed XEX. Both engines'
JITs compile and run the same machine code given the same inputs. There is
no host-side analog in `xenia-rs/crates/xenia-kernel/` — and there shouldn't
be: this isn't a kernel handler, it's a game's own class constructor.
Per `xenia-rs/docs/functions/sub_824F7800.md`:
> AUDIT-064 ours `--ctor-probe=0x824F7800` -n 500M: **0 fires**.
>
> The chain runs downstream of `sub_822F1AA8`'s vtable[0] dispatch through
> `sub_82173990` — which waits on tid=13 — so ours never reaches it because
> tid=13 is blocked on the AUDIT-049 wedge.
`sub_824FD240` is reached via:
```
sub_824F8398
→ sub_824F7CD0
→ sub_824F7800
→ sub_824FD240 (call at PC 0x824F7838) ← THE WRITER
→ ... bctrl at PC 0x824F7B20 dispatches sub_825070F0
```
In ours, the entire call chain above `sub_824F7800` fires **0 times** because
the AUDIT-049 wedge blocks tid=13 upstream. Therefore the ANON_Class_713383D7
instance is never constructed, the vtable `0x8200A1E8` is never installed,
and the bctrl at `0x824F7B20` never dispatches `sub_825070F0`.
**This is consistent with all prior phase audits**. Session 4 confirms the
existing diagnosis: the divergence root is upstream at tid=13, not at the
ANON_Class ctor or the worker dispatch.
## Static-DB cross-check
| PC | Function | Notes |
|---|---|---|
| `0x824FECE0` | `sub_824FECE0` (deepest base ctor) | Writes self-pointers at +4/+8 and the value 1 at +12; calls helper at `0x8284DD1C` |
| `0x824FECFC` | inside `sub_824FECE0` | `stw r31, 4(r31)` — flink_ptr write |
| `0x824FED04` | inside `sub_824FECE0` | `stw r31, 8(r31)` — blink_ptr write |
| `0x825065E8` | `sub_825065E8` (intermediate base ctor) | Calls deepest; writes vtable `0x8200A908` |
| `0x8250660C` | inside `sub_825065E8` | `stw r11, 0(r31)` — vtable `0x8200A908` write |
| `0x825051D8` | called by intermediate base | Sub-helper initializing many `+0xXX` member fields |
| `0x824FD240` | `sub_824FD240` (most-derived ctor) | Calls intermediate base; writes vtable `0x8200A1E8` |
| `0x824FD264` | inside `sub_824FD240` | `stw r11, 0(r31)` — vtable `0x8200A1E8` write — THE INSTALL |
| `0x824F7800` | `sub_824F7800` | Allocates instance at `+0x38` via `sub_824FD230`/`sub_824FD240` |
| `0x824F7838` | inside `sub_824F7800` | `bl sub_824FD240` — invokes most-derived ctor |
| `0x824F7B20` | inside `sub_824F7800` | `bctrl` — dispatches `sub_825070F0` via vtable slot 1 |
| `0x825070F0` | `sub_825070F0` (slot-1 method) | Worker fan-out target — AUDIT-067/061's original lookup |
Static caller chain into `sub_824FD240`:
- Single static caller: `0x824F7838` inside `sub_824F7800`.
- The 4-fn dispatch ladder above (`824F8398 → 824F7CD0 → 824F7800 → bctrl →
825070F0`) was already classified by AUDIT-064.
## ε-constraint validation
Session 3's install epoch bound was `host_ns ∈ [9.4, 9.6] s`. Run 11
observed the install at `host_ns ≈ 10.02 s` (read-probe transitions at
10.019 s for the intermediate vtable and 10.022 s for the most-derived one);
Run 12 captured the intermediate base ctor write at `host_ns ≈ 10s` (read
probe transition timestamps not explicitly logged in Run 12 grep output, but
within boot's normal jitter window). The earlier session's ±200ms estimate was
off: actual jitter is closer to ±500ms cold-to-cold. Update: epoch is
**`host_ns ∈ [9.4, 10.1] s`**.
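
The window update can be checked against the Run 11 read-probe timestamps. A minimal sketch, with hypothetical names and the bounds expressed in integer nanoseconds to avoid float rounding:

```python
# Epoch windows in nanoseconds: Session 3's bound and the updated bound.
OLD_WINDOW_NS = (9_400_000_000, 9_600_000_000)
NEW_WINDOW_NS = (9_400_000_000, 10_100_000_000)

# Read-probe transition timestamps from Run 11.
RUN11_NS = (10_019_392_400, 10_021_528_400)

def in_window(host_ns, window):
    """True if host_ns lies inside the inclusive [lo, hi] window."""
    lo, hi = window
    return lo <= host_ns <= hi

old_ok = all(in_window(t, OLD_WINDOW_NS) for t in RUN11_NS)
new_ok = all(in_window(t, NEW_WINDOW_NS) for t in RUN11_NS)
print(old_ok, new_ok)
```

Both Run 11 transitions fall outside the Session 3 window and inside the updated one.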
## LOC added (Session 4)
**Zero canary LOC added**. All Session 4 work used existing AUDIT-067
(JIT-emit value watch in `ppc_hir_builder.cc` + `ppc_emit_memory.cc`) and
Session 3's `audit_68_host_mem_read_probe` cvar machinery (read-mode probe
thread in `audit_68_host_mem_watch_base.cc`).
Cumulative across Sessions 1+2+3 (canary): ~520 LOC additive, all cvar-gated
default-off, retained in tree. **Session 4 adds no LOC** — the writer was
identifiable by re-running existing instrumentation with the corrected
target value.
## Cascade outcome (Session 4)
- **A**: identify writer PC — **PASS** (`sub_824FD240+0x24` at PC `0x824FD264`;
most-derived ctor of the inheritance chain).
- **B**: identify caller chain — **PASS** (`sub_824F7800+0x38` → `sub_824FD240`;
matches AUDIT-064's previously-known 4-fn dispatch ladder).
- **C**: identify ours-side analog presence — **PASS** (no host analog needed;
guest PPC code; ours's JIT would execute the same code if reached, but the
call chain is unreachable due to tid=13 AUDIT-049 wedge).
- **D**: reading-error class registration — **PASS** (see below; #36
re-scoped).
Net 4/4 wins (no in-progress items). Session 4 closes AUDIT-068.
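
The function-plus-offset names used in this report can be cross-checked against the raw PCs. The sketch assumes the `sub_XXXXXXXX` naming encodes the function's start address in hex, as it does throughout this document; `resolve` is a hypothetical helper:

```python
# Sites as named in this report, with the PCs quoted alongside them.
SITES = {
    "sub_824FD240+0x24": 0x824FD264,  # most-derived vtable install
    "sub_825065E8+0x24": 0x8250660C,  # intermediate vtable write
    "sub_824F7800+0x38": 0x824F7838,  # bl sub_824FD240
}

def resolve(site):
    """Turn 'sub_<hex start>+0x<offset>' into an absolute PC."""
    name, offset = site.split("+")
    if not name.startswith("sub_"):
        raise ValueError(f"unexpected symbol form: {name!r}")
    return int(name[len("sub_"):], 16) + int(offset, 16)

mismatches = {s: hex(resolve(s)) for s, pc in SITES.items() if resolve(s) != pc}
print(mismatches or "all sites consistent")
```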
## Reading-error class re-scoping
**#36 (POD-struct copy-assignment bypass)** — registered Sessions 2+3 as the
explanation for the host-side surface gap. Session 4 finding: this writer is
NOT host C++; it is JIT-emitted PPC code. The class #36 framing remains valid
in principle (host C++ POD copy IS a real bypass class, demonstrated by
Session 2's reading-error #35 sanity), but it does NOT apply to THIS
investigation. Updated rule:
> Before adding new host-side write hooks, always check whether the writer
> could be GUEST CODE running under the JIT. AUDIT-067 (JIT-store value
> watch) is the cheaper first check. If AUDIT-067 with the *correct* target
> value still produces 0 hits, only THEN escalate to host-side surface
> hooks. The reverse order (Session 1+2's host-first approach) wastes
> instrumentation budget when the writer turns out to be guest-side.
Secondary rule: **always cross-check the configured target value against the
read probe's observed values**. Session 1+2+3+AUDIT-067 all watched the wrong
value (`0x8200A208`) because that was AUDIT-058's quoted value, which was
actually the address of slot-1-WITHIN-the-vtable, not the vtable base. The
read probe directly observed the correct value `0x8200A1E8` in Session 3 —
Session 4 simply propagated that correction to AUDIT-067.
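
Both rules can be written down as a small triage helper. This is only a sketch of the ordering described above; the function names and return strings are hypothetical, and the observed values are the ones the read probe reported in Run 11:

```python
# Values the read probe observed at the watched address in Run 11.
OBSERVED = {0xBCE25640, 0x8200A908, 0x8200A1E8}

def target_is_observed(configured, observed):
    """Secondary rule: the watch target must be a value the probe actually saw."""
    return configured in observed

def next_step(configured, observed, jit_hits):
    """Primary rule: guest-side (JIT) check first, host-side hooks last."""
    if not target_is_observed(configured, observed):
        return "fix the target value before drawing conclusions"
    if jit_hits > 0:
        return "writer is guest code; no host hooks needed"
    return "escalate to host-side surface hooks"

print(next_step(0x8200A208, OBSERVED, jit_hits=0))
print(next_step(0x8200A1E8, OBSERVED, jit_hits=4))
```

With the stale target `0x8200A208`, a zero hit count says nothing about where the writer lives, which is the trap the earlier sessions fell into.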
## Artifacts (this dir)
- `run11-audit067-corrected-value.log` — AUDIT-067 with value
`0x8200A1E8`; 4 hits (1 install + 3 sibling instances in worker threads).
- `run12-full-ctor-chain.log` — AUDIT-067 with full ctor chain
(vtable values `0x8200A1E8` and `0x8200A908` + self-pointer
`0xBCE251C0`) and 3-slot read probe; captures all 5 writer-related events
on tid=6.
- `writer-report-v4.md` — this file.
## Discipline observed
- xenia-rs HEAD `e6d43a23ac393004d2e5adf2f0395fd0b5e6448b` UNCHANGED ✓
(verified: sha256 of `git diff HEAD` at session start =
`ed30fd526643918f67311caff0a10d1346d73fd0c0323e02477883cf5ff20357`; same
at session end).
- `--mute=true` on every run ✓
- Cold-protocol: cache wipe + restore from `/tmp/canary-cache-bak-audit-068`
at session end ✓ (backup re-created at session start from current cache;
prior session's backup was missing).
- Canary tree: no new instrumentation added (zero LOC delta). All Sessions
1-3 instrumentation retained as-is (cvar-gated default-off).
- No destructive shortcuts ✓.
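
The unchanged-tree check in the first bullet can be scripted. The sketch below assumes the digest was taken over the raw output of `git diff HEAD`; the exact command used in the session is not recorded here, and `digest` / `tree_digest` are hypothetical helper names:

```python
import hashlib
import subprocess

def digest(data: bytes) -> str:
    """SHA-256 hex digest of a byte string."""
    return hashlib.sha256(data).hexdigest()

def tree_digest(repo: str) -> str:
    """Digest of `git diff HEAD` for the working tree at `repo`."""
    diff = subprocess.run(
        ["git", "-C", repo, "diff", "HEAD"],
        check=True,
        capture_output=True,
    ).stdout
    return digest(diff)

# Usage (not run here): record tree_digest(path) at session start and
# compare it with the value at session end; equal digests mean no drift.
```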
## AUDIT-068 closure
AUDIT-068 is **CLOSED**. The writer of `0x8200A1E8` at
`[0xBCE251C0]` is conclusively identified as **guest PPC code at
`sub_824FD240+0x24` (PC `0x824FD264`)**, the most-derived constructor of the
ANON_Class_713383D7 inheritance chain. The intermediate base ctor at
`sub_825065E8+0x24` (PC `0x8250660C`) writes the intermediate vtable
`0x8200A908`. The deepest base ctor at `sub_824FECE0` writes the
doubly-linked-list head sentinel (self-pointer writes at offsets +4 and +8).
The Phase-NonMatch divergence root remains the **upstream tid=13 AUDIT-049
wedge**, not the ctor or vtable. ours never reaches the calling code, so
the instance is never constructed and `sub_825070F0` never dispatches. No
host-side analog is needed because the writer is part of the game's own
code.
## Recommended next steps (NOT Session 5 of AUDIT-068)
Move investigation upstream to the **AUDIT-049 / Phase-W wedge** at tid=13.
That is where ours and canary actually diverge; the ANON_Class ctor and
sub_825070F0 are downstream symptoms.
- Re-open the tid=13 wedge analysis under a new audit number.
- Cross-reference `xenia-rs/audit-runs/phase-w-wedge-reattack/current-state.md`
for the most recent state.