xenia-rs/audit-runs/audit-068-host-mem-watch/session-2-plan.md
MechaCat02 ef93a4fa14 handoff: VSync/event-wedge fixes + iterate 2.A–2.BC research notes
Source changes (dormant parity infra, retained from iterate 2.AI/2.AO):
- xenia-kernel/exports.rs: nt_create_event manual_reset polarity +
  related event wiring
- xenia-gpu/mmio_region.rs: D1MODE_VBLANK_VLINE_STATUS hardcode parity

Also lands the audit-runs/ analysis notes (.md/.txt/.json digests) for the
iterate 2.x VSync/0x10e8/0x1004 wedge investigation. Raw trace dumps
(.jsonl/.gz/.csv/.stdout) and agent worktrees (.claude/) are gitignored as
regenerable local artifacts — see memory + HANDOFF for the running findings.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-05 07:19:08 +02:00


AUDIT-068 Session 2 plan

Date authored: 2026-05-19 (end of Session 1).

Session 1 outcome recap

The Session 1 instrumentation is in place and proven to work (1,639 sanity hits for value=0). The two target writers, the vtable install [0xBCE25340] = 0x8200A208 and the voice-struct clear [VOICE+0x164] = 0, produced 0 hits each.

The negative result narrows the search space: neither writer goes through xe::store_and_swap<T>, xe::store<T>, or Memory::Zero/Fill/Copy. The remaining un-hooked host-side write surfaces are:

  1. Memory::TranslateVirtual<T*>(va) followed by raw pointer assignment or memcpy (the XEX loader pattern; appears throughout xenia/cpu/xex_module.cc and many kernel-import handlers).
  2. xe::be<T>* p = …; *p = value; — typed big-endian wrappers; assignment goes through byte_swap but does NOT invoke store_and_swap.
  3. xe::TranslateVirtualBE<T>(va) returning a be<T>* followed by assignment.

Session 2 — extension of canary instrumentation

Step 1: Hook the xe::be<T>::operator= family

In xenia/base/byte_order.h, find the be<T> template's operator= and add a check_host_write(this, value, sizeof(T), "be<T>::op=") call before the store. Cost when the watch is off: one relaxed atomic load.

This catches the most common kernel-handler pattern:

auto* p = memory()->TranslateVirtual<X_THING*>(addr);
p->field = some_value;          // invokes be<uint32_t>::operator=(some_value)
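A minimal sketch of what such a hook could look like, assuming a little-endian host and a simplified stand-in type (be_u32) in place of the real be<T> template. g_active and check_host_write are named after the Session 1 instrumentation, but their bodies here, along with g_watch_value, g_hits and byte_swap32, are hypothetical placeholders rather than the canary code.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-ins for the Session 1 watch state.
std::atomic<uint32_t> g_active{0};  // bit 0 set => value watch enabled
uint32_t g_watch_value = 0;         // single watched value (real list holds up to 8)
uint64_t g_hits = 0;                // hit counter, for illustration only

// Slow path: only reached when the watch is enabled.
void check_host_write(const void* host_ptr, uint64_t value, size_t size,
                      const char* site) {
  assert(host_ptr != nullptr && site != nullptr);
  if (size == sizeof(uint32_t) &&
      static_cast<uint32_t>(value) == g_watch_value) {
    ++g_hits;  // a real hook would log the site and a backtrace here
  }
}

uint32_t byte_swap32(uint32_t v) {
  return (v >> 24) | ((v >> 8) & 0x0000FF00u) | ((v << 8) & 0x00FF0000u) |
         (v << 24);
}

// Simplified big-endian wrapper, fixed to uint32_t to keep the sketch short.
struct be_u32 {
  uint32_t raw;  // stored in guest (big-endian) byte order

  be_u32& operator=(uint32_t v) {
    // Fast path when the watch is off: one relaxed atomic load and a branch.
    if (g_active.load(std::memory_order_relaxed) & 1u) {
      check_host_write(this, v, sizeof(v), "be<T>::op=");
    }
    raw = byte_swap32(v);  // assumes a little-endian host
    return *this;
  }

  operator uint32_t() const { return byte_swap32(raw); }
};
```

The hook sees the logical (pre-swap) value, so the watch list can hold guest values such as 0x8200A208 directly.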

Step 2: Optionally have Memory::Copy scan the whole source for value matches

Current behavior: Memory::Copy only checks the first u32 of the source. Replace this with a scan of every 4-byte-aligned position in the source, comparing each word against the configured value list (the cap of N=8 keeps this cheap). This catches XEX loader memcpys that write a vptr embedded in a section.

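Tradeoff: when watching value=0x00000000, a large copy triggers many spurious hits, since zero words are common in copied data. Mitigation: run the scan ONLY when the value list is non-empty (already gated on g_active & 1). This bounds the cost when the watch is off; it does not filter zero-word hits while 0 is in the list.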
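The scan described in this step can be sketched as follows. This is an illustration under stated assumptions, not the Session 1 code: WatchList, load_be32 and scan_copy_source are hypothetical names, alignment is taken relative to the start of the copy, and source words are decoded as big-endian (guest order).

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr size_t kMaxWatchValues = 8;  // the N=8 cap from the plan

// Hypothetical container for the configured value list.
struct WatchList {
  std::array<uint32_t, kMaxWatchValues> values{};
  size_t count = 0;
};

// Decodes 4 bytes as a big-endian u32 without relying on host alignment.
uint32_t load_be32(const uint8_t* p) {
  return (uint32_t(p[0]) << 24) | (uint32_t(p[1]) << 16) |
         (uint32_t(p[2]) << 8) | uint32_t(p[3]);
}

// Returns how many 4-byte words in [src, src+len) match a watched value.
// Offsets are multiples of 4 from the start of the copy (an assumption).
size_t scan_copy_source(const uint8_t* src, size_t len,
                        const WatchList& watch) {
  assert(src != nullptr || len == 0);
  if (watch.count == 0) return 0;  // empty list: skip the scan entirely
  size_t hits = 0;
  for (size_t off = 0; off + 4 <= len; off += 4) {
    const uint32_t word = load_be32(src + off);
    for (size_t i = 0; i < watch.count; ++i) {
      if (word == watch.values[i]) {
        ++hits;  // a real hook would report dst + off and the call site
        break;
      }
    }
  }
  return hits;
}
```

The work is at most len/4 words times 8 comparisons. A trailing partial word (len not a multiple of 4) is ignored in this sketch.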

Step 3: Add a Memory::WriteWord32(addr, value) shim and route XEX loader's memcpys through it

Two options:

  • (A) Wrap every xex_module.cc memcpy with a pre-scan that calls check_guest_va(addr+i, *(uint32_t*)(src+i), 4, "xex_memcpy") for each aligned 4-byte position. Localized change in xex_module.cc, ~10 LOC.
  • (B) Add a generic Memory::CopyWithWatch wrapper. Less invasive at the call sites but requires a parallel API.

Recommend (A) for Session 2 — surgical, scoped to the one source file.
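Option (A) could take roughly the following shape. check_guest_va is named in the plan, but its body here is a hypothetical stand-in, as are watched_xex_memcpy, WatchHit and g_hits_log. Unlike the *(uint32_t*)(src+i) expression above, the sketch decodes each word byte-wise as big-endian to avoid unaligned host loads. If the real check_guest_va expects raw host-order words, the load would need to change accordingly.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical record of one watch hit.
struct WatchHit {
  uint32_t guest_va;
  uint32_t value;
  const char* site;
};

std::vector<WatchHit> g_hits_log;
uint32_t g_watch_value = 0x8200A208u;  // example value from Session 1

// Stand-in for the Session 1 checker: records writes of the watched value.
void check_guest_va(uint32_t guest_va, uint32_t value, size_t size,
                    const char* site) {
  if (size == 4 && value == g_watch_value) {
    g_hits_log.push_back(WatchHit{guest_va, value, site});
  }
}

// Pre-scans the source, then performs the original memcpy unchanged.
void watched_xex_memcpy(void* host_dst, uint32_t guest_dst, const void* src,
                        size_t len) {
  assert(host_dst != nullptr && src != nullptr);
  const uint8_t* bytes = static_cast<const uint8_t*>(src);
  for (size_t i = 0; i + 4 <= len; i += 4) {
    const uint32_t word = (uint32_t(bytes[i]) << 24) |
                          (uint32_t(bytes[i + 1]) << 16) |
                          (uint32_t(bytes[i + 2]) << 8) |
                          uint32_t(bytes[i + 3]);
    check_guest_va(guest_dst + static_cast<uint32_t>(i), word, 4,
                   "xex_memcpy");
  }
  std::memcpy(host_dst, src, len);
}
```

Because the copy itself is untouched, the wrapper stays purely additive: removing it restores the original memcpy call.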

Step 4: Re-run the two captures from Session 1

Use the same command lines as Session 1; non-zero hits are expected this time. Specifically:

  • Run 1 (vtable install): at least one hit on a xex_memcpy write of 0x8200A208 into the heap region. If the count is still 0, the install is probably a value synthesized at runtime by some kernel handler; at that point, add a process-wide allocator probe (log every MmAllocatePhysicalMemoryEx return and tag it; cross-reference with subsequent writes).
  • Run 2 (voice-struct clear): depends on where the voice struct actually lives. Likely needs a guest-side memory probe FIRST (read voice struct base via xeAudioGetVoice… reflection) to find the exact heap region, THEN addr-watch over that.
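The allocator probe mentioned for Run 1 amounts to an address-range registry: record each allocation's base and size with a tag, then map a later write address back to the allocation that contains it. A minimal sketch, assuming non-overlapping allocations and a 32-bit guest address space; AllocProbe and its methods are hypothetical names, and wiring it to MmAllocatePhysicalMemoryEx is left out.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Hypothetical per-allocation record.
struct AllocTag {
  uint32_t size;
  uint32_t seq;  // 1-based allocation sequence number, used as the tag
};

class AllocProbe {
 public:
  // Call with the base address and size of each successful allocation.
  // Returns the tag assigned to it.
  uint32_t on_alloc(uint32_t base, uint32_t size) {
    assert(size > 0);
    const uint32_t seq = next_seq_++;
    regions_[base] = AllocTag{size, seq};
    return seq;
  }

  // Returns the tag of the allocation containing addr, or 0 if none does.
  uint32_t find(uint32_t addr) const {
    auto it = regions_.upper_bound(addr);  // first region starting after addr
    if (it == regions_.begin()) return 0;
    --it;                                  // last region starting at or before addr
    const uint64_t end = uint64_t(it->first) + it->second.size;
    return addr < end ? it->second.seq : 0;
  }

 private:
  std::map<uint32_t, AllocTag> regions_;  // keyed by base address
  uint32_t next_seq_ = 1;
};
```

Each watch hit can then be annotated with find(write_addr), which ties a write back to the allocation call that produced its target region.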

Step 5: Cross-reference each hit with the ours-side exports.rs

For every captured writer fn name, locate the matching handler in xenia-rs/crates/xenia-kernel/src/exports.rs:

  • If the handler exists but doesn't emit the write: Session 2's fix is to add the write in ours.
  • If the handler is missing entirely: Session 2's fix is to implement the handler.

For the XEX-loader memcpy case (the most likely catch for the vtable install): the ours-side analog is xenia-rs/crates/xenia-kernel/src/loader.rs (or xenia-cpu/xenia-binary's XEX module loader). Verify the section-loading code paths there.

Step 6: Predicted progression-metric impact

  • If vtable 0x8200A208 install is identified and mirrored in ours: enables sub_825070F0 to fire (per AUDIT-058/063/067), which spawns 4 worker threads (tid=27/28/29 + one unresumed in canary). This is THE keystone gap per Phase NonMatch.
  • If voice-struct clear is identified: removes the XAudio callback's blocking-wait path (per Phase HostAudio-Eager) so tid=14/15 sister chains catch up.
  • Combined: closes ~60% of the missing event volume (XAudio) + the sub_825070F0 worker fan-out.

Risks / unknowns

  1. be<T>::operator= is everywhere. The hot-path overhead matters less for capture runs (cvar-on), but the hook adds an atomic load to EVERY typed guest-memory assignment in canary. If that measurably slows the binary even with the cvar off, gate the hook behind a build-time #ifdef XENIA_AUDIT_68. The define should still default to ON for the canary debug binary used as oracle.
  2. The vptr install may be conditional / data-driven. If the install runs only after some guest call sequence that ours doesn't reach (because its earlier state has already diverged), then capturing the install in canary tells us WHAT writes it, but Session 2 still needs to figure out WHY the ours-side path diverges before the install. This is the Phase NonMatch-style upstream-divergence problem.
  3. Cold-boot determinism: cache wipe + restore protocol (per memory #31/#32/#33/#34) must be honored across runs. Session 1 used backup /tmp/canary-cache-bak-audit-068.
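The build-time gate from risk 1 could look like the sketch below. XENIA_AUDIT_68, g_active and check_host_write come from the plan; the macro name AUDIT68_HOST_WRITE, the kAuditBuild flag, the g_checked counter and store_u32 are hypothetical and exist only to make the example self-contained.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

std::atomic<uint32_t> g_active{0};  // bit 0 set => watch enabled (runtime cvar)
uint64_t g_checked = 0;             // counts slow-path calls, for illustration

// Stand-in for the Session 1 checker.
void check_host_write(const void*, uint64_t, size_t, const char*) {
  ++g_checked;
}

#ifdef XENIA_AUDIT_68
constexpr bool kAuditBuild = true;
// Audit build: one relaxed load, then the slow path only if the cvar is on.
#define AUDIT68_HOST_WRITE(ptr, value, size, site)          \
  do {                                                      \
    if (g_active.load(std::memory_order_relaxed) & 1u) {    \
      check_host_write((ptr), (value), (size), (site));     \
    }                                                       \
  } while (0)
#else
constexpr bool kAuditBuild = false;
// Non-audit build: the hook expands to nothing, not even the atomic load.
#define AUDIT68_HOST_WRITE(ptr, value, size, site) \
  do {                                             \
  } while (0)
#endif

// Example call site: any typed store can carry the hook.
void store_u32(uint32_t* dst, uint32_t v) {
  assert(dst != nullptr);
  AUDIT68_HOST_WRITE(dst, v, sizeof(v), "store_u32");
  *dst = v;
}
```

Compiling with -DXENIA_AUDIT_68 enables the hook; without it the call sites keep their original cost regardless of the runtime cvar.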

LOC budget

Steps 1-3 combined: estimated 60-90 LOC additive on canary, plus testing. Step 5 cross-referencing is purely investigative (no LOC).

Cascade prediction (Session 2)

  • A=catch the vtable installer: ~75% (up from Session 1's 0%, by widening coverage to be<T>::op= + xex_memcpy).
  • B=catch the voice-struct clearer: ~50% (depends on knowing the right addr range).
  • C=identify the ours-side gap for the vtable install: ~70% if A succeeds.
  • D=Session 3 lands the ours-side fix and progression metric moves: ~40-50%.
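Since C is stated as conditional on A, the chance of both catching the installer and identifying the ours-side gap follows from the two estimates, treating them as point values:

```latex
P(A \wedge C) = P(A)\,P(C \mid A) \approx 0.75 \times 0.70 \approx 0.53
```

The plan does not say whether D's ~40-50% is conditional on C, so it is not folded into this product.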

Session 2 deliverable

  • audit-runs/audit-068-host-mem-watch/run3-vtable-extended.log — vtable run with new hooks.
  • audit-runs/audit-068-host-mem-watch/run4-voice-struct-extended.log — voice struct run.
  • audit-runs/audit-068-host-mem-watch/writer-report-v2.md — annotated writer set + per-writer ours-side analog.
  • audit-runs/audit-068-host-mem-watch/fix-canary-v2.diff — extended canary instrumentation.
  • memory/project_audit_068_session2_2026_05_XX.md — memory entry.
  • MEMORY.md index update.

Discipline

  • Pass --mute=true on every canary run.
  • Wipe both cache locations before each cold run (xenia-canary/build-cross/bin/Windows/Debug/cache + ~/.local/share/Xenia/cache if present).
  • Restore canary cache from /tmp/canary-cache-bak-audit-068 at session end.
  • No modifications to ours source.
  • Keep canary instrumentation purely additive + cvar-gated default-off (parser-lazy via UINT32_MAX sentinel pattern landed in Session 1).
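One plausible reading of the "parser-lazy via UINT32_MAX sentinel" pattern is sketched below: the activity word starts at UINT32_MAX, meaning "cvar not parsed yet", and the first check parses the cvar string and replaces the sentinel with the real flag bits. This is an assumption about the Session 1 code, not a copy of it; cvar_watch_value, parse_watch_config and watch_enabled are hypothetical names.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <string>

std::string cvar_watch_value;                // hypothetical cvar; empty = off
std::atomic<uint32_t> g_active{UINT32_MAX};  // UINT32_MAX = not parsed yet
uint32_t g_watch_value = 0;

// Parses the cvar once. Returns the flag bits (bit 0 = value watch on).
uint32_t parse_watch_config() {
  if (cvar_watch_value.empty()) return 0;  // default-off
  // Base 0 accepts both decimal and 0x-prefixed hex input.
  g_watch_value = static_cast<uint32_t>(
      std::strtoul(cvar_watch_value.c_str(), nullptr, 0));
  return 1;
}

// Hot-path check: one relaxed load once the sentinel has been replaced.
bool watch_enabled() {
  uint32_t state = g_active.load(std::memory_order_relaxed);
  if (state == UINT32_MAX) {
    // First use. The sketch assumes this happens on a single thread; a
    // multi-threaded version would need to publish g_watch_value safely.
    state = parse_watch_config();
    g_active.store(state, std::memory_order_relaxed);
  }
  return (state & 1u) != 0;
}
```

With the cvar left empty, the first call stores 0 and every later call is a load plus a failed bit test, which keeps the default-off path cheap.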