xenia-rs/audit-runs/audit-068-host-mem-watch/writer-report-v3.md
MechaCat02 ef93a4fa14 handoff: VSync/event-wedge fixes + iterate 2.A–2.BC research notes
Source changes (dormant parity infra, retained from iterate 2.AI/2.AO):
- xenia-kernel/exports.rs: nt_create_event manual_reset polarity +
  related event wiring
- xenia-gpu/mmio_region.rs: D1MODE_VBLANK_VLINE_STATUS hardcode parity

Also lands the audit-runs/ analysis notes (.md/.txt/.json digests) for the
iterate 2.x VSync/0x10e8/0x1004 wedge investigation. Raw trace dumps
(.jsonl/.gz/.csv/.stdout) and agent worktrees (.claude/) are gitignored as
regenerable local artifacts — see memory + HANDOFF for the running findings.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-05 07:19:08 +02:00


AUDIT-068 Session 3 — read-mode probe writer report

Date: 2026-05-20

Summary

Session 3 adds a read-mode probe to the AUDIT-068 instrumentation. Instead of hooking host-side write surfaces (Session 1+2's approach, which produced 0 hits across ~9 surfaces despite the install being real), the probe spawns a dedicated low-priority polling thread that samples configured guest VAs every PERIOD_NS and emits AUDIT-068-READ-CHANGE events on transition.

The probe bounded the install epoch for the ANON_Class_713383D7 vptr to host_ns ≈ 9.412-9.612 s (varies ±200 ms between cold runs) and provided the first direct evidence that the install is a bulk POD struct copy of a 12-byte {vptr, self_ptr, self_ptr} record into the instance's first three u32 slots, all written within the same 1 ms poll interval. Reading-error class #36 (POD-struct copy-assignment bypass) is now confirmed in the strongest possible terms: Run 10 enabled BOTH the read probe AND the full ~9-surface host-write watch simultaneously with the CORRECT target value 0x8200A1E8, and observed the read probe catch the install while host-write surfaces produced 0 hits.

A secondary finding overturns part of the AUDIT-067 framing: the actual vptr value installed is 0x8200A1E8, not 0x8200A208. The number 0x8200A208 is the address of the slot-1 fn pointer WITHIN the vtable (32 bytes into the vtable). The value stored at [ctx_ptr] is the vtable BASE = 0x8200A1E8. AUDIT-067 hooked all 16 PPC store opcodes for 0x8200A208 — it should have also (or instead) watched 0x8200A1E8. This may explain part of why AUDIT-067 also produced 0 hits.
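The relationship between the two candidate values is plain address arithmetic. A minimal check, using only the two constants quoted in this report (the variable names are illustrative):

```python
# Value the read probe observed at [ctx_ptr] (the vtable base).
VTABLE_BASE = 0x8200A1E8
# Value AUDIT-067 watched for (an address inside the vtable).
AUDIT_067_TARGET = 0x8200A208

offset = AUDIT_067_TARGET - VTABLE_BASE
print(hex(offset), offset)
```

The difference is 0x20, which is the "32 bytes into the vtable" figure quoted above.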

LOC added (Session 3 delta, canary only)

| File | LOC delta | Purpose |
| --- | --- | --- |
| src/xenia/cpu/cpu_flags.h | +7 | New cvar audit_68_host_mem_read_probe declaration. |
| src/xenia/cpu/cpu_flags.cc | +6 | Cvar definition. |
| src/xenia/memory.cc | +18 | Register g_guest_to_host_thunk (wraps Memory::TranslateVirtual) and g_query_protect_thunk (wraps LookupHeap+QueryProtect) inside Memory::Memory(); reset to nullptr in ~Memory(). |
| src/xenia/base/audit_68_host_mem_watch_fwd.h | +17 | GuestToHostThunk + QueryProtectThunk extern decls. |
| src/xenia/base/audit_68_host_mem_watch_base.cc | +~170 | ReadProbe struct + parser (VA:SIZE:PERIOD_NS CSV form) + sample_at() w/ page-protect guard + read_probe_thread_main() polling loop + start_read_probe_thread_if_configured() lazy-start (called from check_host_write_slowpath). |
| Total | ~218 LOC additive | All cvar-gated default-off (empty CSV = thread never spawned). |

Cumulative across Sessions 1+2+3: ~520 LOC.

xenia-rs HEAD e6d43a23ac393004d2e5adf2f0395fd0b5e6448b UNCHANGED.

Cvar format

--audit_68_host_mem_read_probe=VA1:SIZE1:PERIOD1,VA2:SIZE2:PERIOD2,...

Each tuple is VA:SIZE:PERIOD_NS. SIZE ∈ {1, 2, 4, 8}. PERIOD_NS floored at 1 us (1000). Max 8 tuples. Default empty (off).
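The tuple rules above can be sketched as a small offline parser, useful for validating a command line before a 90 s run. This is a Python illustration, not the canary parser: the names ProbeSpec and parse_read_probe are hypothetical, and raising on an invalid SIZE or on more than 8 tuples is an assumption (the report does not say how canary reacts to those cases).

```python
from typing import List, NamedTuple

MAX_TUPLES = 8            # "Max 8 tuples"
MIN_PERIOD_NS = 1000      # "PERIOD_NS floored at 1 us (1000)"
VALID_SIZES = {1, 2, 4, 8}


class ProbeSpec(NamedTuple):
    va: int
    size: int
    period_ns: int


def parse_read_probe(csv: str) -> List[ProbeSpec]:
    """Parse 'VA:SIZE:PERIOD_NS,...'; an empty string means the probe is off."""
    specs = []
    for item in csv.split(","):
        item = item.strip()
        if not item:
            continue
        va_s, size_s, period_s = item.split(":")
        size = int(size_s, 0)
        if size not in VALID_SIZES:
            # Assumption: reject rather than silently skip.
            raise ValueError(f"unsupported SIZE {size} in {item!r}")
        # Clamp the period up to the documented 1 us floor.
        period_ns = max(int(period_s, 0), MIN_PERIOD_NS)
        specs.append(ProbeSpec(int(va_s, 0), size, period_ns))
    if len(specs) > MAX_TUPLES:
        # Assumption: reject rather than truncate.
        raise ValueError(f"at most {MAX_TUPLES} tuples are supported")
    return specs


print(parse_read_probe("0xBCE25340:4:1000000"))
```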

Lazy-start: the poll thread spawns only on the first call to check_host_write_slowpath() after Memory::Memory() has registered the thunks. This reuses the Session 2 static-init gate. The thread is detached (daemon-style) and polls until process exit.
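The event stream produced by such a polling loop can be modelled as a pure function over timestamped samples. The sketch below is a Python model of the INITIAL/CHANGE logic as described in this report, not the C++ thread itself; the function name and the tuple layout are illustrative.

```python
def transitions(samples):
    """Turn (host_ns, value) samples into INITIAL/CHANGE events.

    Returns a list of (host_ns, kind, old_value, new_value) tuples.
    """
    events = []
    have_prev = False
    prev = 0
    for host_ns, value in samples:
        if not have_prev:
            # First sample only establishes the baseline.
            events.append((host_ns, "INITIAL", None, value))
            have_prev = True
        elif value != prev:
            events.append((host_ns, "CHANGE", prev, value))
        prev = value
    return events


# Hypothetical samples taken once per 1 ms period.
samples = [(0, 0), (1_000_000, 0), (2_000_000, 0xBCE254C0), (3_000_000, 0xBCE254C0)]
for event in transitions(samples):
    print(event)
```

One consequence of sampling is worth keeping in mind when reading the captures: a value that is written and overwritten between two polls never appears, and every write that lands inside one period is reported with the same timestamp.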

Captures

All runs cold-boot (cache wipe before each), --mute=true, against the Sylpheed ISO. 90 s wallclock each.

Run 6 — primary read-probe on 0xBCE25340

Cmdline: --audit_68_host_mem_read_probe=0xBCE25340:4:1000000 --mute=true.

Observations:

host_ns=729615200    INITIAL  0x00000000
host_ns=738072700    CHANGE   0x00000000 → 0xBCE254C0   (arena-local pointer)
host_ns=1537758000   CHANGE   0xBCE254C0 → 0xBCE25640
host_ns=1591760600   CHANGE   0xBCE25640 → 0xBCE25350
host_ns=1592827100   CHANGE   0xBCE25350 → 0xBCE257C0
host_ns=1601443500   CHANGE   0xBCE257C0 → 0x82061050   (looks like XEX vtable)
host_ns=1602506700   CHANGE   0x82061050 → 0x820610E0   (final, stable through 90 s)

Boot reached worker spawn (thid=27/28/29 visible in log tail) — so the probe was alive for the whole 90 s wallclock; only ~7 changes occurred at 0xBCE25340 in this run, and the value never became 0x8200A208.

This indicated the address 0xBCE25340 cited in AUDIT-058/067 is NOT deterministic across runs — there's "arena drift" in the 0xBCE25xxx region. The Phase-NonMatch investigation memo (2026-05-19) already documented this: canary cold sample saw ctx_ptr=0xBCE251C0 while AUDIT-058 saw 0xBCE25340.

Run 7 — neighbor bisect on 0xBCE25340 ± 4/8

Cmdline: --audit_68_host_mem_read_probe=0xBCE2533C:4:1000000,0xBCE25340:4:1000000,0xBCE25344:4:1000000,0xBCE25348:4:1000000.

host_ns=655976500   INITIAL  all four = 0
host_ns=664462100   CHANGE   0xBCE25340: 0 → 0xBCE254C0
host_ns=1374604200  CHANGE   0xBCE25340: 0xBCE254C0 → 0x07C65ADA   (3 SIMULTANEOUS)
host_ns=1374604200  CHANGE   0xBCE25344:          0 → 0x001EE000
host_ns=1374604200  CHANGE   0xBCE25348:          0 → 0x0003A313

Key signal: at host_ns=1.374 s, three adjacent u32 slots changed within the same 1 ms poll interval but the neighbor at 0xBCE2533C did NOT. This is a clear bulk struct-copy / memcpy footprint: the writer wrote a 12-byte record starting at 0xBCE25340. The three values {0x07C65ADA, 0x001EE000, 0x0003A313} are NOT the vtable (they match neither 0x8200A208 nor 0x8200A1E8); they look like unrelated data (possibly an FNV-style hash, an allocation size, and a refcount). This particular write therefore appears to target a DIFFERENT object instance reusing the 0xBCE25340 slot, not the ANON_Class instance.
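The neighbor-pattern reasoning can be made mechanical by grouping CHANGE events that share a timestamp and merging adjacent slots into byte ranges. This is a sketch under the assumption that events are already available as (host_ns, va) pairs; the function name is hypothetical and the input below is transcribed from the Run 7 listing.

```python
from collections import defaultdict


def write_footprints(changes, slot_size=4):
    """Group (host_ns, va) changes by timestamp into contiguous (start, length) runs."""
    by_time = defaultdict(list)
    for host_ns, va in changes:
        by_time[host_ns].append(va)

    footprints = {}
    for host_ns, vas in by_time.items():
        vas = sorted(vas)
        runs = []
        start = prev = vas[0]
        for va in vas[1:]:
            if va != prev + slot_size:
                # Gap between slots: close the current run and start a new one.
                runs.append((start, prev + slot_size - start))
                start = va
            prev = va
        runs.append((start, prev + slot_size - start))
        footprints[host_ns] = runs
    return footprints


run7 = [
    (664462100, 0xBCE25340),
    (1374604200, 0xBCE25340),
    (1374604200, 0xBCE25344),
    (1374604200, 0xBCE25348),
]
for host_ns, runs in write_footprints(run7).items():
    print(host_ns, [(hex(start), length) for start, length in runs])
```

A footprint computed this way is a lower bound: a slot that is rewritten with the value it already held produces no CHANGE event, so it does not extend the run.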

Run 8 — locate the actual ctx_ptr via AUDIT-061 fire

Cmdline: --audit_61_branch_probe_pcs=0x825070F0 --audit_68_host_mem_read_probe=0xBCE25340:4:1000000.

AUDIT-061-BR pc=825070F0 ... r3=BCE251C0 ... fired late in the run. So in THIS cold trajectory the ANON_Class instance is at 0xBCE251C0, not 0xBCE25340. The probe at 0xBCE25340 was watching the wrong address.

Run 9 — neighbor bisect on the correct ctx_ptr 0xBCE251C0

Cmdline: --audit_61_branch_probe_pcs=0x825070F0 --audit_68_host_mem_read_probe=0xBCE251BC:4:1000000,0xBCE251C0:4:1000000,0xBCE251C4:4:1000000,0xBCE251C8:4:1000000.

host_ns=633560300   INITIAL  all four = 0
host_ns=642041900   CHANGE   0xBCE251C0: 0 → 0xBCE25340           (arena ptr)
host_ns=1387443500  CHANGE   0xBCE251C0: 0xBCE25340 → 0xBCE254C0  (2 SIMULTANEOUS)
host_ns=1387443500  CHANGE   0xBCE251C8:          0 → 0x00000148
host_ns=1412116800  CHANGE   0xBCE251C0: 0xBCE254C0 → 0           (2 SIMULTANEOUS clear)
host_ns=1412116800  CHANGE   0xBCE251C8: 0x148 → 0
host_ns=1457544600  CHANGE   0xBCE251C0:        0 → 0xBF80199A    (2 SIMULTANEOUS — floats)
host_ns=1457544600  CHANGE   0xBCE251C4:        0 → 0x3F802D83    (= -1.0008, 1.0014)
host_ns=5710239000  CHANGE   0xBCE251C0: 0xBF80199A → 0xBCE25640  (arena ptr)
host_ns=9416025400  CHANGE   0xBCE251C0: 0xBCE25640 → 0x8200A1E8  (3 SIMULTANEOUS — THE INSTALL)
host_ns=9416025400  CHANGE   0xBCE251C4: 0xBCE251C0 → 0xBCE251C0  (self-ptr)
host_ns=9416025400  CHANGE   0xBCE251C8:          0 → 0xBCE251C0  (self-ptr)
AUDIT-061-BR pc=825070F0 r3=BCE251C0 (fire ~25 s wallclock)
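The two words annotated as floats in the listing above can be checked by reinterpreting each u32 as an IEEE-754 single. A minimal sketch; the helper name is illustrative:

```python
import struct


def as_f32(word: int) -> float:
    """Reinterpret a 32-bit word as an IEEE-754 single-precision float."""
    return struct.unpack(">f", word.to_bytes(4, "big"))[0]


for word in (0xBF80199A, 0x3F802D83):
    print(hex(word), round(as_f32(word), 4))
```

Both decode to values within about 0.0015 of ±1.0, consistent with the "(= -1.0008, 1.0014)" annotation.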

The install epoch is host_ns = 9.416025400 s. Three slots written simultaneously to {vptr=0x8200A1E8, self=0xBCE251C0, self=0xBCE251C0} — classic struct construction or *ptr = X_FOO{...} POD copy pattern. The slot at 0xBCE251BC (4 bytes before ctx_ptr) did NOT change, bounding the write to exactly 12 bytes starting at 0xBCE251C0.

The install is ~966 ms BEFORE the sub_825070F0 fire (~10.4 s host_ns, matches Phase-NonMatch documented thread.create burst at 10.382 s) and well within the 60-90 s capture window.
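The ~966 ms figure follows directly from the two timestamps quoted in this report: the Run 9 install and the Phase-NonMatch thread.create burst at 10.382 s. A short check, with illustrative variable names:

```python
INSTALL_NS = 9_416_025_400        # Run 9 install epoch (host_ns)
THREAD_BURST_NS = 10_382_000_000  # Phase-NonMatch thread.create burst (10.382 s)

delta_ns = THREAD_BURST_NS - INSTALL_NS
print(delta_ns, "ns =", round(delta_ns / 1e6), "ms")
```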

Run 10 — cross-validation: read-probe + host-write watch with correct value

Cmdline: --audit_68_host_mem_watch_values=0x8200A1E8,0x8200A208,0xE8A10082,0x82A10082 --audit_68_host_mem_watch_addrs=0xBCE251C0 --audit_68_host_mem_read_probe=0xBCE251C0:4:1000000 --audit_61_branch_probe_pcs=0x825070F0.

host_ns=9612147300  CHANGE   0xBCE251C0: 0xBCE25640 → 0x8200A1E8   (read probe catches)
AUDIT-061-BR pc=825070F0 r3=BCE251C0                                (sub_825070F0 fires)
AUDIT-068-HOST-WRITE: 0 hits                                        (write surfaces miss)

This is the definitive proof:

  1. The install IS captured by the read probe at host_ns ≈ 9.6 s.
  2. The corrected value 0x8200A1E8 (not 0x8200A208) is the actual vptr.
  3. None of the ~9 host-write surfaces hooked in Session 1+2 catches it.

Reading-error class #36 confirmed: the writer uses a path that bypasses all of xe::store_and_swap<T>, xe::store<T>, Memory::Zero/Fill/Copy, xe::endian_store::set(), and Memory::Copy byte-scan — most likely a *reinterpret_cast<X_FOO*>(host_ptr) = X_FOO{...} raw POD struct copy-assignment OR a direct memcpy(host_ptr_from_TranslateVirtual, &local_struct, sizeof(X_FOO)).

Headline finding

Install epoch: host_ns ≈ 9.4-9.6 s (varies ±200 ms across cold runs). This is ~966 ms before sub_825070F0 fires (~10.4 s host_ns).

Neighbor pattern: 3 simultaneous writes at 0xBCE251C0, +4, +8 within the same 1 ms poll interval — {vptr=0x8200A1E8, self=0xBCE251C0, self=0xBCE251C0}. 0xBCE251BC (-4) does NOT change. This is a 12-byte POD struct copy.

Implications:

  • The write is invisible to all currently-hooked host-write surfaces.
  • The value bytes {0x82, 0x00, 0xA1, 0xE8, 0xBC, 0xE2, 0x51, 0xC0, 0xBC, 0xE2, 0x51, 0xC0} (big-endian guest order) must appear together in some source, either as a constant pre-baked vtable instance pattern that's memcpy'd, or as fields computed by host code and bulk-written.
  • The fact that the second and third slots are self-pointers (= ctx_ptr) suggests a doubly-linked-list head node initialization: head.vptr = vtbl; head.next = &head; head.prev = &head;. This is a textbook intrusive list / queue head pattern.
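The 12-byte image of the installed record can be derived from the three u32 values observed in Run 9. The sketch below packs them in the guest's big-endian order and also shows what a little-endian host sees if it reads the first word without swapping, which corresponds to the 0xE8A10082 entry in the Run 10 watch list. Variable names are illustrative.

```python
import struct

VPTR = 0x8200A1E8
CTX_PTR = 0xBCE251C0

# Guest memory is big-endian: {vptr, self_ptr, self_ptr}.
record = struct.pack(">III", VPTR, CTX_PTR, CTX_PTR)
print(record.hex(" "))

# The same first four bytes read as a little-endian u32 (no byte swap).
unswapped = int.from_bytes(record[:4], "little")
print(hex(unswapped))
```

Searching host-side data for the packed pattern is only meaningful if the record is copied from a pre-built image; if the fields are computed individually, the bytes exist together only after the write.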

Wallclock relation to AUDIT-067's sub_825070F0 fire

| Event | Host_ns | Wallclock (≈) |
| --- | --- | --- |
| Probe init (first slowpath call) | ~640 ms | ~1.6 s |
| Various pre-install arena reuse of slot | 0.6-5.7 s | 1.6-6.5 s |
| Vptr install at 0xBCE251C0 | 9.412-9.612 s | ~10.4-10.6 s |
| Phase-NonMatch documented thread.create burst | 10.382-10.384 s | ~11.3 s |
| sub_825070F0 fire (AUDIT-061-BR captured) | ~10.5 s | ~25 s wallclock (AUDIT-067 quoted) |

The "host_ns ~10.5 s when sub_825070F0 fires" vs "~25 s wallclock" gap is because host_ns starts when the first AUDIT-068 slowpath call lands (i.e. when canary's static-init plus Wine startup are done) — Wine's JIT-warmup/early-boot takes ~15 s before guest PPC code starts. The ANON_Class install happens ~960 ms before sub_825070F0 dispatch, within the same "post-DiscImageDevice resolve" boot phase that AUDIT-058 framed.

Session 4 recommendation

Three paths to identifying the writer, ranked by feasibility:

Path 1 (targeted POD-copy instrumentation). The install epoch (host_ns ≈ 9.4-9.6 s) and the 12-byte simultaneous-write signature (3 u32 slots) narrow the candidate hooks dramatically. Two surgical instrumentation strategies:

(a) Pre-instrument all *reinterpret_cast<X*>(host_ptr) = X{...} sites in canary. Ripgrep finds them with the pattern \*reinterpret_cast<[A-Z]\w*\*>\([^)]*\)\s*= in src/xenia/kernel/**.cc. A quick scan of the Session 1 inventory listed ~30 such sites, but most are in kernel-import handlers that fire repeatedly; the ε-constraint of "fires exactly once at host_ns 9.4-9.6 s on tid=6" lets us bisect.

(b) Wrap xe::SetField() / pointer-typed assignment helpers if any exist. Otherwise instrument memcpy(host_ptr_from_TranslateVirtual, ...) patterns directly — there are ~40 such sites across kernel/util/cpu code per Session 1+2 surveys. The ones NOT already wrapped by Session 2 (xex_module.cc got 4 sites) are candidates.

LOC budget: ~50-100 additive in canary; default-off cvar audit_68_pod_copy_watch_addrs (CSV of VA ranges; emits on every memcpy/raw assign within range).
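The search pattern from strategy (a) can be exercised offline before running it across the tree. The sketch below applies the same regular expression with Python's re module to a few hand-written C++ lines; the sample lines and the function name are hypothetical, not taken from canary.

```python
import re

# Same pattern as quoted for ripgrep in strategy (a).
POD_ASSIGN = re.compile(r"\*reinterpret_cast<[A-Z]\w*\*>\([^)]*\)\s*=")


def find_sites(text):
    """Return (line_number, stripped_line) for every line matching the pattern."""
    return [
        (number, line.strip())
        for number, line in enumerate(text.splitlines(), 1)
        if POD_ASSIGN.search(line)
    ]


sample = """\
*reinterpret_cast<X_FOO*>(host_ptr) = X_FOO{};
auto* p = reinterpret_cast<X_FOO*>(host_ptr);
*reinterpret_cast<X_FOO*>(memory->TranslateVirtual(va)) = local;
"""

for number, line in find_sites(sample):
    print(number, line)
```

Only the first sample line matches. The third is missed because `[^)]*` stops at the first closing parenthesis, so sites whose pointer expression contains a nested call would need a looser pattern or a manual pass. The trailing `=` also matches the start of `==`, so comparison sites may show up as false positives.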

Path 2 — Guard-page SIGSEGV trap

Use the existing canary ExceptionHandler infrastructure (src/xenia/base/exception_handler*.cc; already cross-platform, with Win SEH and POSIX SIGSEGV handlers wired). Mark the 4K page containing 0xBCE251C0 as read-only at host_ns = 9.4 s (just before the install epoch). When the writer's host instruction faults on that page, log RIP and the host stack, then unprotect and resume.

Pros: catches the writer with host-instruction-level precision regardless of how it writes (memcpy, raw assign, vector store, etc.).

Cons: ~150-200 LOC platform-gated; needs accurate epoch timing (can't trap the whole boot or it crashes). Use host_ns ≥ 9.0 s as the gate.

Path 3 — Kernel-handler grep with new ε-constraint

Now that the install epoch is known (9.4-9.6 s host_ns; just AFTER DiscImageDevice::ResolvePath(\\dat\\movie) per AUDIT-058 narrative), grep all kernel handlers for ones that fire in that window AND write to the heap. The probe log already shows this is right around the time HostPathDevice::ResolvePath(\\dat\\movie) runs and various worker file IO starts. Cross-reference with canary's existing kernel-call trace (--log_level=4) to enumerate handlers called in the 9.0-9.7 s window.

LOC: 0 (purely investigative).

Recommended Session 4 priority: Path 1 first (concrete instrumentation extends what we have, leverages the epoch constraint). Path 2 as backstop. Path 3 alongside as a cheap parallel investigation.

Cascade outcome (Session 3)

  • A (identify install epoch): PASS (9.4-9.6 s host_ns; ~966 ms before sub_825070F0).
  • B (identify neighbor pattern): PASS (3-slot simultaneous write, POD struct signature confirmed).
  • C (confirm reading-error #36): PASS (Run 10 demonstrates host-write surfaces miss the install even with the CORRECT target value 0x8200A1E8).
  • D (identify the host-side writer): N/A (Session 4 work, with epoch and signature constraints to narrow the search).
  • E (secondary discovery: actual vptr is 0x8200A1E8, not 0x8200A208): PASS (AUDIT-067's target value was off by 32 bytes; may have contributed to that audit's 0-hit JIT store result).

Net 4/5 wins. Session 4 has concrete constraints (epoch, signature, value correction) to land the writer identification.

Reading-error class #36 reinforcement

Session 3 directly demonstrates reading-error #36 (POD-struct copy-assignment bypass for typed BE/LE field watch). The corrective rule is now formalized as:

When hooking host-side writes to guest memory, member-level set() hooks (e.g. xe::endian_store::set()) catch ONLY explicit assignments like *be<T>* = value. They DO NOT catch:

  1. POD struct copy-assignment (*reinterpret_cast<X*>(host_ptr) = X{...}).
  2. memcpy into the host pointer (memcpy(host_ptr_from_TranslateVirtual, &local_struct, sizeof(X))).
  3. Vector-typed bulk store intrinsics that target guest memory.

Mitigation: pair host-write hooks with read-mode probes at the target VA — the read probe captures the install regardless of the writer's mechanism, and provides epoch + neighbor-pattern constraints for the follow-up targeted instrumentation.
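The rule can be illustrated with a toy model: a memory object whose typed store is hooked, plus a raw bulk copy that is not. This is a Python analogy for the C++ situation, not canary code; the class, its method names, and the small offsets are all hypothetical.

```python
import struct


class GuestMemory:
    """Toy guest memory with one hooked write path and one unhooked path."""

    def __init__(self, size: int):
        self.buf = bytearray(size)
        self.hook_hits = []

    def store_u32(self, offset: int, value: int) -> None:
        # Hooked typed store: record the write, then perform it (big-endian).
        self.hook_hits.append((offset, value))
        struct.pack_into(">I", self.buf, offset, value)

    def raw_copy(self, offset: int, data: bytes) -> None:
        # Unhooked bulk copy, standing in for memcpy or a POD struct assignment.
        self.buf[offset:offset + len(data)] = data

    def load_u32(self, offset: int) -> int:
        return struct.unpack_from(">I", self.buf, offset)[0]


mem = GuestMemory(64)
CTX = 0x10  # hypothetical offset standing in for ctx_ptr

before = mem.load_u32(CTX)                      # read probe, first sample
mem.raw_copy(CTX, struct.pack(">III", 0x8200A1E8, CTX, CTX))
after = mem.load_u32(CTX)                       # read probe, next sample

print("hook hits:", len(mem.hook_hits))
print("probe saw:", hex(before), "->", hex(after))
```

The hook list stays empty while the before/after read shows the new value, which is the same shape as the Run 10 outcome: the observation point is the memory itself, so it does not depend on which write path was taken.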

This rule is now reflected in the AUDIT-068 Session 3 read-probe machinery — preserved in canary tree for all future audits.

Discipline observed

  • --mute=true on every run ✓
  • Cold-protocol: cache wipe before each cold run; cache restored from /tmp/canary-cache-bak-audit-068 at session end ✓ (current cache was backed up at session start since prior backup was missing).
  • xenia-rs HEAD e6d43a23… UNCHANGED ✓ (verified by sha256 of git diff HEAD at session start vs end; uncommitted modifications from prior sessions are unchanged from session start, no new modifications made by this session).
  • Canary instrumentation purely additive + cvar-gated default-off ✓
  • No destructive shortcuts ✓
  • Static-init gate pattern preserved + extended (Session 3's read probe thread is also gated on g_guest_to_host_thunk + g_query_protect_thunk being non-null — same discipline as Session 2's thunk gate).

Artifacts (this dir)

  • fix-canary-v3.diff — cumulative Session 3 instrumentation (this run).
  • run6-read-probe-bisect.log — primary probe on 0xBCE25340 (90 s; 7 changes, ended at 0x820610E0, never 0x8200A208).
  • run7-read-probe-neighbors.log — bisect probe on 0xBCE25340 ± 4/8; 3 simultaneous writes at +0/+4/+8 confirming POD signature.
  • run9-read-probe-251C0-neighbors.log — neighbor probe on the actual ctx_ptr 0xBCE251C0; captures the install at host_ns=9.416 s.
  • run10-cross-validation.log — read probe + host-write watch with CORRECT value 0x8200A1E8; demonstrates 0 HOST-WRITE hits while read probe sees the install at host_ns=9.612 s.
  • writer-report-v3.md — this file.

(Run 8 was an intermediate diagnostic; data is included in Run 9/10 logs.)

Phase B / progression

  • image_loaded_sha256 ea8d160e… UNCHANGED (instrumentation does not touch XEX image processing).
  • xenia-rs HEAD UNCHANGED.
  • No progression-metric movement (Session 3 is instrumentation-only). Session 4 has concrete leads.