Files
xenia-rs/audit-runs/audit-068-host-mem-watch/writer-report-v3.md
MechaCat02 ef93a4fa14 handoff: VSync/event-wedge fixes + iterate 2.A–2.BC research notes
Source changes (dormant parity infra, retained from iterate 2.AI/2.AO):
- xenia-kernel/exports.rs: nt_create_event manual_reset polarity +
  related event wiring
- xenia-gpu/mmio_region.rs: D1MODE_VBLANK_VLINE_STATUS hardcode parity

Also lands the audit-runs/ analysis notes (.md/.txt/.json digests) for the
iterate 2.x VSync/0x10e8/0x1004 wedge investigation. Raw trace dumps
(.jsonl/.gz/.csv/.stdout) and agent worktrees (.claude/) are gitignored as
regenerable local artifacts — see memory + HANDOFF for the running findings.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-05 07:19:08 +02:00

345 lines
17 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# AUDIT-068 Session 3 — read-mode probe writer report
Date: 2026-05-20
## Summary
Session 3 adds a **read-mode probe** to the AUDIT-068 instrumentation. Instead
of hooking host-side write surfaces (Session 1+2's approach, which produced 0
hits across ~9 surfaces despite the install being real), the probe spawns a
dedicated low-priority polling thread that samples configured guest VAs every
`PERIOD_NS` and emits `AUDIT-068-READ-CHANGE` events on transition.
The probe bounded the install epoch for the `ANON_Class_713383D7` vptr to
**host_ns ≈ 9.4129.612 s** (varies ±200 ms between cold runs) and provided
the first direct evidence that the install is a **bulk POD struct copy** of a
12-byte `{vptr, self_ptr, self_ptr}` record into the instance's first three
u32 slots — written simultaneously within the same 1 ms poll interval.
**Reading-error class #36 (POD-struct copy-assignment bypass) is now
confirmed in the strongest possible terms**: Run 10 enabled BOTH the read
probe AND the full ~9-surface host-write watch simultaneously with the
CORRECT target value `0x8200A1E8`, and observed the read probe catch the
install while host-write surfaces produced **0 hits**.
A secondary finding overturns part of the AUDIT-067 framing: the actual vptr
value installed is **`0x8200A1E8`**, not `0x8200A208`. The number `0x8200A208`
is the address of the slot-1 fn pointer WITHIN the vtable (32 bytes into the
vtable). The value stored at `[ctx_ptr]` is the vtable BASE = `0x8200A1E8`.
AUDIT-067 hooked all 16 PPC store opcodes for `0x8200A208` — it should have
also (or instead) watched `0x8200A1E8`. This may explain part of why AUDIT-067
also produced 0 hits.
## LOC added (Session 3 delta, canary only)
| File | LOC delta | Purpose |
|---|---:|---|
| `src/xenia/cpu/cpu_flags.h` | +7 | New cvar `audit_68_host_mem_read_probe` declaration. |
| `src/xenia/cpu/cpu_flags.cc` | +6 | Cvar definition. |
| `src/xenia/memory.cc` | +18 | Register `g_guest_to_host_thunk` (wraps `Memory::TranslateVirtual`) and `g_query_protect_thunk` (wraps `LookupHeap`+`QueryProtect`) inside `Memory::Memory()`; reset to nullptr in `~Memory()`. |
| `src/xenia/base/audit_68_host_mem_watch_fwd.h` | +17 | `GuestToHostThunk` + `QueryProtectThunk` extern decls. |
| `src/xenia/base/audit_68_host_mem_watch_base.cc` | +~170 | `ReadProbe` struct + parser (`VA:SIZE:PERIOD_NS` CSV form) + `sample_at()` w/ page-protect guard + `read_probe_thread_main()` polling loop + `start_read_probe_thread_if_configured()` lazy-start (called from `check_host_write_slowpath`). |
| **Total** | **~218 LOC additive** | All cvar-gated default-off (empty CSV = thread never spawned). |
Cumulative across Sessions 1+2+3: ~520 LOC.
xenia-rs HEAD `e6d43a23ac393004d2e5adf2f0395fd0b5e6448b` **UNCHANGED**.
## Cvar format
```
--audit_68_host_mem_read_probe=VA1:SIZE1:PERIOD1,VA2:SIZE2:PERIOD2,...
```
Each tuple is `VA:SIZE:PERIOD_NS`. SIZE ∈ {1, 2, 4, 8}. PERIOD_NS floored at
1 us (1000). Max 8 tuples. Default empty (off).
Lazy-start: the poll thread spawns only on the first call to
`check_host_write_slowpath()` after `Memory::Memory()` has registered the
thunks. This reuses the Session 2 static-init gate. The thread is detached
(daemon-style) and polls until process exit.
## Captures
All runs cold-boot (cache wipe before each), `--mute=true`, against the
Sylpheed ISO. 90 s wallclock each.
### Run 6 — primary read-probe on `0xBCE25340`
Cmdline: `--audit_68_host_mem_read_probe=0xBCE25340:4:1000000 --mute=true`.
Observations:
```
host_ns=729615200 INITIAL 0x00000000
host_ns=738072700 CHANGE 0x00000000 → 0xBCE254C0 (arena-local pointer)
host_ns=1537758000 CHANGE 0xBCE254C0 → 0xBCE25640
host_ns=1591760600 CHANGE 0xBCE25640 → 0xBCE25350
host_ns=1592827100 CHANGE 0xBCE25350 → 0xBCE257C0
host_ns=1601443500 CHANGE 0xBCE257C0 → 0x82061050 (looks like XEX vtable)
host_ns=1602506700 CHANGE 0x82061050 → 0x820610E0 (final, stable through 90 s)
```
**Boot reached worker spawn (thid=27/28/29 visible in log tail)** — so the
probe was alive for the whole 90 s wallclock; only ~7 changes occurred at
`0xBCE25340` in this run, and the value never became `0x8200A208`.
This indicated the address `0xBCE25340` cited in AUDIT-058/067 is NOT
deterministic across runs — there's "arena drift" in the `0xBCE25xxx` region.
The Phase-NonMatch investigation memo (2026-05-19) already documented this:
canary cold sample saw `ctx_ptr=0xBCE251C0` while AUDIT-058 saw `0xBCE25340`.
### Run 7 — neighbor bisect on `0xBCE25340 ± 4/8`
Cmdline: `--audit_68_host_mem_read_probe=0xBCE2533C:4:1000000,0xBCE25340:4:1000000,0xBCE25344:4:1000000,0xBCE25348:4:1000000`.
```
host_ns=655976500 INITIAL all four = 0
host_ns=664462100 CHANGE 0xBCE25340: 0 → 0xBCE254C0
host_ns=1374604200 CHANGE 0xBCE25340: 0xBCE254C0 → 0x07C65ADA (3 SIMULTANEOUS)
host_ns=1374604200 CHANGE 0xBCE25344: 0 → 0x001EE000
host_ns=1374604200 CHANGE 0xBCE25348: 0 → 0x0003A313
```
**Key signal**: at host_ns=1.374 s, three adjacent u32 slots changed within
the same 1 ms poll interval but the neighbor at `0xBCE2533C` did NOT. This is
a clear bulk struct-copy / memcpy footprint — the writer wrote a 12-byte
record starting at `0xBCE25340`. The three values `{0x07C65ADA, 0x001EE000,
0x0003A313}` are NOT the vtable (don't match `0x8200A208`/`0x8200A1E8`); they
look like random-looking data (FNV-style hash, allocation size, refcount?).
This particular write happens to a DIFFERENT object instance reusing the
`0xBCE25340` slot, not the ANON_Class instance.
### Run 8 — locate the actual ctx_ptr via AUDIT-061 fire
Cmdline: `--audit_61_branch_probe_pcs=0x825070F0 --audit_68_host_mem_read_probe=0xBCE25340:4:1000000`.
`AUDIT-061-BR pc=825070F0 ... r3=BCE251C0 ...` fired late in the run. So in
THIS cold trajectory the ANON_Class instance is at `0xBCE251C0`, not
`0xBCE25340`. The probe at `0xBCE25340` was watching the wrong address.
### Run 9 — neighbor bisect on the correct ctx_ptr `0xBCE251C0`
Cmdline: `--audit_61_branch_probe_pcs=0x825070F0 --audit_68_host_mem_read_probe=0xBCE251BC:4:1000000,0xBCE251C0:4:1000000,0xBCE251C4:4:1000000,0xBCE251C8:4:1000000`.
```
host_ns=633560300 INITIAL all four = 0
host_ns=642041900 CHANGE 0xBCE251C0: 0 → 0xBCE25340 (arena ptr)
host_ns=1387443500 CHANGE 0xBCE251C0: 0xBCE25340 → 0xBCE254C0 (2 SIMULTANEOUS)
host_ns=1387443500 CHANGE 0xBCE251C8: 0 → 0x00000148
host_ns=1412116800 CHANGE 0xBCE251C0: 0xBCE254C0 → 0 (2 SIMULTANEOUS clear)
host_ns=1412116800 CHANGE 0xBCE251C8: 0x148 → 0
host_ns=1457544600 CHANGE 0xBCE251C0: 0 → 0xBF80199A (2 SIMULTANEOUS — floats)
host_ns=1457544600 CHANGE 0xBCE251C4: 0 → 0x3F802D83 (= -1.0008, 1.0014)
host_ns=5710239000 CHANGE 0xBCE251C0: 0xBF80199A → 0xBCE25640 (arena ptr)
host_ns=9416025400 CHANGE 0xBCE251C0: 0xBCE25640 → 0x8200A1E8 (3 SIMULTANEOUS — THE INSTALL)
host_ns=9416025400 CHANGE 0xBCE251C4: 0xBCE251C0 → 0xBCE251C0 (self-ptr)
host_ns=9416025400 CHANGE 0xBCE251C8: 0 → 0xBCE251C0 (self-ptr)
AUDIT-061-BR pc=825070F0 r3=BCE251C0 (fire ~25 s wallclock)
```
**The install epoch is host_ns = 9.416025400 s.** Three slots written
simultaneously to `{vptr=0x8200A1E8, self=0xBCE251C0, self=0xBCE251C0}`
classic struct construction or `*ptr = X_FOO{...}` POD copy pattern. The
slot at `0xBCE251BC` (4 bytes before `ctx_ptr`) did NOT change, bounding the
write to exactly 12 bytes starting at `0xBCE251C0`.
The install is ~966 ms BEFORE the `sub_825070F0` fire (~10.4 s host_ns,
matches Phase-NonMatch documented thread.create burst at 10.382 s) and well
within the 60-90 s capture window.
### Run 10 — cross-validation: read-probe + host-write watch with correct value
Cmdline: `--audit_68_host_mem_watch_values=0x8200A1E8,0x8200A208,0xE8A10082,0x82A10082 --audit_68_host_mem_watch_addrs=0xBCE251C0 --audit_68_host_mem_read_probe=0xBCE251C0:4:1000000 --audit_61_branch_probe_pcs=0x825070F0`.
```
host_ns=9612147300 CHANGE 0xBCE251C0: 0xBCE25640 → 0x8200A1E8 (read probe catches)
AUDIT-061-BR pc=825070F0 r3=BCE251C0 (sub_825070F0 fires)
AUDIT-068-HOST-WRITE: 0 hits (write surfaces miss)
```
This is the definitive proof:
1. The install IS captured by the read probe at host_ns ≈ 9.6 s.
2. The corrected value `0x8200A1E8` (not `0x8200A208`) is the actual vptr.
3. None of the ~9 host-write surfaces hooked in Session 1+2 catches it.
**Reading-error class #36 confirmed**: the writer uses a path that bypasses
all of `xe::store_and_swap<T>`, `xe::store<T>`, `Memory::Zero/Fill/Copy`,
`xe::endian_store::set()`, and `Memory::Copy` byte-scan — most likely a
`*reinterpret_cast<X_FOO*>(host_ptr) = X_FOO{...}` raw POD struct
copy-assignment OR a direct `memcpy(host_ptr_from_TranslateVirtual,
&local_struct, sizeof(X_FOO))`.
## Headline finding
**Install epoch**: host_ns ≈ 9.49.6 s (varies ±200 ms across cold runs).
This is ~966 ms before sub_825070F0 fires (~10.4 s host_ns).
**Neighbor pattern**: **3 simultaneous writes** at `0xBCE251C0`, `+4`, `+8`
within the same 1 ms poll interval — `{vptr=0x8200A1E8, self=0xBCE251C0,
self=0xBCE251C0}`. `0xBCE251BC` (`-4`) does NOT change. This is a 12-byte
POD struct copy.
**Implications**:
- The write is invisible to all currently-hooked host-write surfaces.
- The value bytes `{0xE8, 0xA1, 0x00, 0x82, 0xC0, 0x51, 0xE2, 0xBC, 0xC0,
0x51, 0xE2, 0xBC}` (big-endian guest order) must appear together in some
source — either as a constant pre-baked vtable instance pattern that's
memcpy'd, or as fields computed by host code and bulk-written.
- The fact that the second and third slots are self-pointers (`= ctx_ptr`)
suggests a doubly-linked-list head node initialization: `head.vptr = vtbl;
head.next = &head; head.prev = &head;`. This is a textbook intrusive list
/ queue head pattern.
## Wallclock relation to AUDIT-067's sub_825070F0 fire
| Event | Host_ns | Wallclock (≈) |
|---|---:|---:|
| Probe init (first slowpath call) | ~640 ms | ~1.6 s |
| Various pre-install arena reuse of slot | 0.65.7 s | 1.66.5 s |
| **Vptr install at `0xBCE251C0`** | **9.4129.612 s** | **~10.410.6 s** |
| Phase-NonMatch documented thread.create burst | 10.38210.384 s | ~11.3 s |
| sub_825070F0 fire (AUDIT-061-BR captured) | ~10.5 s | **~25 s wallclock** (AUDIT-067 quoted) |
The "host_ns ~10.5 s when sub_825070F0 fires" vs "~25 s wallclock" gap is
because `host_ns` starts when the first AUDIT-068 slowpath call lands (i.e.
when canary's static-init plus Wine startup are done) — Wine's
JIT-warmup/early-boot takes ~15 s before guest PPC code starts. The
ANON_Class install happens ~960 ms before sub_825070F0 dispatch, within the
same "post-DiscImageDevice resolve" boot phase that AUDIT-058 framed.
## Session 4 recommendation
Three paths to identifying the writer, ranked by feasibility:
### Path 1 (RECOMMENDED) — POD struct-copy hook with NEW ε-constraint
The install epoch (host_ns ≈ 9.49.6 s) and the 12-byte simultaneous-write
signature (3 u32 slots) narrows the candidate hooks dramatically. Two
surgical instrumentation strategies:
(a) **Pre-instrument all `*reinterpret_cast<X*>(host_ptr) = X{...}` sites in
canary**. Ripgrep finds them: pattern
`\*reinterpret_cast<[A-Z]\w*\*>\([^)]*\)\s*=` in `src/xenia/kernel/**.cc`. A
quick scan of Session 1 inventory listed ~30 such sites, but most are in
kernel-import handlers that fire repeatedly — the ε-constraint of "fires
exactly once at host_ns 9.49.6 s on tid=6" lets us bisect.
(b) **Wrap `xe::SetField()` / pointer-typed assignment helpers** if any
exist. Otherwise instrument `memcpy(host_ptr_from_TranslateVirtual, ...)`
patterns directly — there are ~40 such sites across kernel/util/cpu code per
Session 1+2 surveys. The ones NOT already wrapped by Session 2 (xex_module.cc
got 4 sites) are candidates.
LOC budget: ~50-100 additive in canary; default-off cvar
`audit_68_pod_copy_watch_addrs` (CSV of VA ranges; emits on every memcpy/raw
assign within range).
### Path 2 — Guard-page SIGSEGV trap
Use the existing canary `ExceptionHandler` infrastructure
(src/xenia/base/exception_handler*.cc — already cross-platform, has Win SEH
and POSIX SIGSEGV handlers wired). Mark the 4K page containing `0xBCE251C0`
as read-only at host_ns = 9.4 s (just before the install epoch); the page
fault triggers the writer's host instruction, log RIP/host stack, then
unprotect+resume.
Pros: catches the writer with bytecode-level precision regardless of how it
writes (memcpy, raw assign, vector store, etc.).
Cons: ~150200 LOC platform-gated; needs accurate epoch timing (can't trap
the whole boot or it crashes). Use host_ns ≥ 9.0 s as the gate.
### Path 3 — Kernel-handler grep with new ε-constraint
Now that the install epoch is known (9.49.6 s host_ns; just AFTER
`DiscImageDevice::ResolvePath(\\dat\\movie)` per AUDIT-058 narrative), grep
all kernel handlers for ones that fire in that window AND write to the
heap. The probe log already shows this is right around the time
`HostPathDevice::ResolvePath(\\dat\\movie)` runs and various worker file IO
starts. Cross-reference with canary's existing kernel-call trace
(`--log_level=4`) to enumerate handlers called in the 9.09.7 s window.
LOC: 0 (purely investigative).
**Recommended Session 4 priority: Path 1 first** (concrete instrumentation
extends what we have, leverages the epoch constraint). Path 2 as backstop.
Path 3 alongside as a cheap parallel investigation.
## Cascade outcome (Session 3)
- **A**: identify install epoch — **PASS** (9.49.6 s host_ns; ~966 ms before
sub_825070F0).
- **B**: identify neighbor pattern — **PASS** (3-slot simultaneous write,
POD struct signature confirmed).
- **C**: confirm reading-error #36 — **PASS** (Run 10 demonstrates host-write
surfaces miss the install even with the CORRECT target value
`0x8200A1E8`).
- **D**: identify the host-side writer — **N/A** (Session 4 work, with epoch
and signature constraints to narrow the search).
- **E**: secondary discovery: actual vptr is `0x8200A1E8` not `0x8200A208`
— **PASS** (AUDIT-067's target value was off by 32 bytes; may have
contributed to that audit's 0-hit JIT store result).
Net 4/5 wins. Session 4 has concrete constraints (epoch, signature, value
correction) to land the writer identification.
## Reading-error class #36 reinforcement
Session 3 directly demonstrates reading-error #36 (POD-struct
copy-assignment bypass for typed BE/LE field watch). The corrective rule is
now formalized as:
> When hooking host-side writes to guest memory, member-level set() hooks
> (e.g. `xe::endian_store::set()`) catch ONLY explicit assignments like
> `*be<T>* = value`. They DO NOT catch:
> 1. POD struct copy-assignment (`*reinterpret_cast<X*>(host_ptr) = X{...}`).
> 2. memcpy into the host pointer (`memcpy(host_ptr_from_TranslateVirtual,
> &local_struct, sizeof(X))`).
> 3. Vector-typed bulk store intrinsics that target guest memory.
>
> Mitigation: pair host-write hooks with **read-mode probes** at the
> target VA — the read probe captures the install regardless of the writer's
> mechanism, and provides epoch + neighbor-pattern constraints for the
> follow-up targeted instrumentation.
This rule is now reflected in the AUDIT-068 Session 3 read-probe machinery —
preserved in canary tree for all future audits.
## Discipline observed
- `--mute=true` on every run ✓
- Cold-protocol: cache wipe before each cold run; cache restored from
`/tmp/canary-cache-bak-audit-068` at session end ✓ (current cache was
backed up at session start since prior backup was missing).
- xenia-rs HEAD `e6d43a23…` UNCHANGED ✓ (verified by sha256 of `git diff
HEAD` at session start vs end; uncommitted modifications from prior
sessions are unchanged from session start, no new modifications made by
this session).
- Canary instrumentation purely additive + cvar-gated default-off ✓
- No destructive shortcuts ✓
- Static-init gate pattern preserved + extended (Session 3's read probe
thread is also gated on `g_guest_to_host_thunk + g_query_protect_thunk`
being non-null — same discipline as Session 2's thunk gate).
## Artifacts (this dir)
- `fix-canary-v3.diff` — cumulative Session 3 instrumentation (this run).
- `run6-read-probe-bisect.log` — primary probe on `0xBCE25340` (90 s; 7
changes, ended at `0x820610E0`, never `0x8200A208`).
- `run7-read-probe-neighbors.log` — bisect probe on `0xBCE25340 ± 4/8`; 3
simultaneous writes at `+0/+4/+8` confirming POD signature.
- `run9-read-probe-251C0-neighbors.log` — neighbor probe on the actual
ctx_ptr `0xBCE251C0`; **captures the install** at host_ns=9.416 s.
- `run10-cross-validation.log` — read probe + host-write watch with CORRECT
value `0x8200A1E8`; demonstrates 0 HOST-WRITE hits while read probe sees
the install at host_ns=9.612 s.
- `writer-report-v3.md` — this file.
(Run 8 was an intermediate diagnostic; data is included in Run 9/10 logs.)
## Phase B / progression
- `image_loaded_sha256 ea8d160e…` UNCHANGED (instrumentation does not touch
XEX image processing).
- xenia-rs HEAD UNCHANGED.
- No progression-metric movement (Session 3 is instrumentation-only). Session
4 has concrete leads.