xenia-rs/audit-runs/audit-068-host-mem-watch/writer-report-v2.md
# AUDIT-068 Session 2 — writer report (extended coverage)
Date: 2026-05-19
## Summary
Session 2 extends Session 1's host-side write watch from `xe::store_and_swap<T>` + `xe::store<T>` + `Memory::Zero/Fill/Copy` to ALSO cover:
1. **`xe::endian_store<T,E>::set()`** (the underlying impl of `xe::be<T>`/`xe::le<T>`), gated on `Memory::Memory()` having registered the host→guest thunk so static-init order doesn't race the cvar.
2. **`Memory::Copy` full byte-scan** over every 4-byte-aligned source offset (gated on `g_active & 0x1`).
3. **XEX loader memcpy/lzx_decompress pre-scan** at 4 sites in `xenia/cpu/xex_module.cc` (patch-memcpy, uncompressed-image memcpy, basic-block memcpy, LZX-decompress output).
The static-init gate proved load-bearing. My initial Run 5 (XEX section sanity) produced 0 hits because `endian_store::set()` fired during static initialization, before the `cvars::audit_68_host_mem_watch_*` objects were constructed; `parse_locked()` then ran with empty strings and permanently latched `g_active=0`. Fix: defer the parse until `g_host_to_guest_thunk` is non-null (it is set inside `Memory::Memory()`).
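
The gate described above can be sketched roughly as follows. This is a minimal stand-alone illustration, not the canary code: the variable and function names echo the report, but `g_addrs_cvar`, `fake_thunk`, and `on_memory_constructed` are hypothetical stand-ins, and the real parser is assumed to hold a lock and to handle values as well as address ranges.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical stand-ins; names echo the report but this is not canary code.
using ThunkFn = void (*)();
std::atomic<ThunkFn> g_host_to_guest_thunk{nullptr};  // published by Memory::Memory()
std::string g_addrs_cvar;  // stand-in for the audit_68 addrs cvar string
bool g_parsed = false;     // true once the cvars have been parsed
uint32_t g_active = 0;     // bit 0x2: at least one address range configured

static void fake_thunk() {}

// What Memory::Memory() is assumed to do for the purposes of the gate.
void on_memory_constructed() { g_host_to_guest_thunk.store(&fake_thunk); }

// Parses the cvar and latches g_active exactly once.
// (The real code is assumed to hold a lock here; omitted in this sketch.)
void parse_locked() {
  g_active = g_addrs_cvar.empty() ? 0u : 0x2u;
  g_parsed = true;
}

// Returns true when the slow path should inspect the write.
bool check_host_write_slowpath() {
  // Gate: until Memory::Memory() has run, the cvar objects may not exist yet,
  // so parsing now would latch g_active = 0 for the rest of the process.
  if (g_host_to_guest_thunk.load() == nullptr) return false;
  if (!g_parsed) parse_locked();
  return g_active != 0;
}
```

A call made before `on_memory_constructed()` returns `false` without parsing, so an early `endian_store::set()` during static initialization can no longer latch an empty configuration.
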
## LOC added (canary only)
| File | LOC delta | Purpose |
|---|---:|---|
| `src/xenia/base/byte_order.h` | +27 | `endian_store::set()` hook (gated on `g_host_to_guest_thunk != nullptr`) + `#include <type_traits>` + `#include "audit_68_host_mem_watch_fwd.h"` |
| `src/xenia/memory.cc` | +35 / -17 | `Memory::Copy` byte-scan over 4-byte-aligned source positions; preserves addr-only coarse event |
| `src/xenia/cpu/xex_module.cc` | +35 | Inline helper `audit68_prescan_memcpy()` + wraps at sites 427 (patch image), 592 (uncompressed exe load), 668 (basic-block memcpy), 840 (post-`lzx_decompress` scan of guest-image bytes) |
| `src/xenia/base/audit_68_host_mem_watch_base.cc` | +12 | Static-init gate in `check_host_write_slowpath` and `check_guest_va_slowpath` |
| **Total** | **~110 LOC additive** (cvar-gated; zero cost when off, modest cost when on) | |
xenia-rs HEAD `e6d43a23ac393004d2e5adf2f0395fd0b5e6448b` UNCHANGED.
## Captures
All runs cold-boot (cache wipe before each), `--mute=true`, against the Sylpheed ISO.
### Run 5 — XEX .text region sanity (validates Step 3)
Cmdline: `--audit_68_host_mem_watch_addrs=0x82000000-0x82010000 --mute=true`. 70 s wallclock.
**Result: 1 HOST-WRITE hit** (plus the INIT lines). This is the Step 3 validation: writes to the XEX `.text` region, whose absence was Session 1's smoking gun, are now caught.
```
i> 00000114 AUDIT-068-INIT values_csv="" addrs_csv="0x82000000-0x82010000" values_parsed=0 addr_ranges_parsed=1 active=0x2
i> 00000114 AUDIT-068-INIT addr_range[0] = 0x82000000-0x82010000
i> 00000114 AUDIT-068-HOST-WRITE guest_va=0x82000000 host_ptr=0x0000000000000000 val=0x000000004D5A9000 sz=8 fn=xex_lzx_decompress_output host_ns=300 tid=276
```
The value `0x4D5A9000` is the BE-encoded first 4 bytes of the XEX image: `"MZ\x90\x00"` = PE/EXE magic. Exactly as expected — `lzx_decompress` writes the decoded image starting at `base_address_=0x82000000`. **Session 1's reading-error class #35 is now mitigated**.
Note: only ONE hit appears (the coarse addr-only event for the start of the LZX output region) because the addr-range `0x82000000-0x82010000` intersects only the head of the ~2 MB decompress span. The per-4-byte value loop is skipped because no values are configured, i.e. `(active & 0x1) == 0`.
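
The two-stage behaviour seen in Run 5 (one coarse address event, no per-word value scan) can be illustrated with a small pre-scan sketch. It assumes, based on the INIT lines, that bit `0x1` of `active` means "values configured" and bit `0x2` means "address ranges configured"; the inclusive range bounds, the `Hit` record, and the function names are assumptions made for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct AddrRange { uint32_t lo, hi; };  // inclusive guest-VA bounds (assumed)

struct WatchConfig {
  std::vector<uint32_t> values;   // contributes bit 0x1 when non-empty
  std::vector<AddrRange> ranges;  // contributes bit 0x2 when non-empty
  uint32_t active() const {
    return (values.empty() ? 0u : 0x1u) | (ranges.empty() ? 0u : 0x2u);
  }
};

struct Hit { uint32_t guest_va; uint32_t value; bool coarse; };

// Reads a big-endian u32 from host bytes, independent of host endianness.
static uint32_t load_be32(const uint8_t* p) {
  return (uint32_t(p[0]) << 24) | (uint32_t(p[1]) << 16) |
         (uint32_t(p[2]) << 8) | uint32_t(p[3]);
}

// Scans a pending copy of `len` bytes from `src` to guest VA `dst_va`.
std::vector<Hit> prescan(const WatchConfig& cfg, uint32_t dst_va,
                         const uint8_t* src, size_t len) {
  std::vector<Hit> hits;
  if (len == 0) return hits;
  const uint32_t active = cfg.active();
  if (active & 0x2u) {
    // Coarse event: at most one hit per copy, reported at the start of the span.
    const uint64_t last = uint64_t(dst_va) + len - 1;
    for (const AddrRange& r : cfg.ranges) {
      if (dst_va <= r.hi && last >= r.lo) {
        hits.push_back({dst_va, len >= 4 ? load_be32(src) : 0u, true});
        break;
      }
    }
  }
  if (active & 0x1u) {
    // Fine scan: compare every 4-byte-aligned source word to the watch values.
    for (size_t off = 0; off + 4 <= len; off += 4) {
      const uint32_t v = load_be32(src + off);
      for (uint32_t w : cfg.values) {
        if (v == w) {
          hits.push_back({uint32_t(dst_va + off), v, false});
          break;
        }
      }
    }
  }
  return hits;
}
```

With only an address range configured, a copy that starts at `0x82000000` yields exactly one coarse hit carrying the leading `0x4D5A9000` word, which matches the shape of the Run 5 log.
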
### Run 3 — vtable `0x8200A208 / 0x8200A928` writers (extended)
Cmdline: `--audit_68_host_mem_watch_values=0x8200A208,0x8200A928,0x080082A2,0x2829820 --audit_68_host_mem_watch_addrs=0xBCE25340 --mute=true`. 90 s wallclock.
**Result: 0 HOST-WRITE hits** (INIT lines present; `active=0x3`). Boot reaches tid=29 spawn (post-Phase-NonMatch trigger window).
```
i> 00000114 AUDIT-068-INIT values_csv="0x8200A208,0x8200A928,0x080082A2,0x2829820" addrs_csv="0xBCE25340" values_parsed=4 addr_ranges_parsed=1 active=0x3
i> 00000114 AUDIT-068-INIT value[0] = 0x8200A208
i> 00000114 AUDIT-068-INIT value[1] = 0x8200A928
i> 00000114 AUDIT-068-INIT value[2] = 0x080082A2
i> 00000114 AUDIT-068-INIT value[3] = 0x02829820
i> 00000114 AUDIT-068-INIT addr_range[0] = 0xBCE25340-0xBCE25347
```
**Critical implication**: with Session 2's extended coverage, NONE of the following surfaces ever wrote the target value or to the target VA in canary's full boot:
- `xe::store_and_swap<T>` (T = u8/u16/u32/u64/i8/i16/i32/i64)
- `xe::store<T>` (host-endian sibling)
- `Memory::Zero/Fill/Copy` (incl. full byte-scan in `Memory::Copy`)
- `xe::endian_store<T,E>::set()` (the underlying `be<T>`/`le<T>` write path)
- XEX loader pre-scan at 4 sites (3 memcpy sites + `lzx_decompress` output)
AUDIT-067 already ruled out all 16 PPC JIT'd store opcodes (stw/stwu/stwx/stwux/stwbrx/stwcx./stmw/std/stdu/stdux/stdx/stdbrx/stdcx./stvx/stvxl/stvewx). Combined verdict: **`0xBCE25340` is never explicitly written via any known canonical write surface**. Yet `sub_825070F0` reads `[0xBCE25340]=0x8200A208` per AUDIT-058/063/067 trigger fire. New search candidates listed below.
### Run 4 — voice-struct field clear extended
Cmdline: `--audit_68_host_mem_watch_addrs=0x42500000-0x42600000 --mute=true`. 60 s wallclock.
**Result: 0 HOST-WRITE hits** (INIT lines present; `active=0x2`).
Per Session 1 plan, the addr range `0x42500000-0x42600000` was a guess. With Session 2's extended coverage it remains a guess — voice struct base is unknown. Next step (Session 3+): instrument canary's `XAudio2AudioDriver::CreateVoice` (or equivalent) to log the heap region holding the voice array, then re-run with that range.
### Sanity (value=0) — confirms full-surface coverage
Cmdline: `--audit_68_host_mem_watch_values=0x00000000 --mute=true`. 20 s wallclock.
**Result: 78,738 hits** across all hooked surfaces:
| Surface | Hits | Notes |
|---|---:|---|
| `xex_lzx_decompress_output` | 78,655 | Every 4-byte-zero u32 in the LZX-decompressed Sylpheed image (.bss/.padding) |
| `Memory::Zero` | 39 | Heap-page zero on Memory::Initialize + stack zeros |
| `be<T>::set` | 35 | **NEW hook — proves Step 1 works.** Header writes from `kernel_state.cc` / `xboxkrnl_threading.cc` etc. |
| `store_and_swap<u32>` | 5 | TIB/kernel-pointer init (same as Session 1) |
| `Memory::Fill` | 4 | RtlFillMemory equivalents |
Session 1 sanity was 1,639 hits; Session 2 records 78,738, about 48× as many. The jump comes from the added hooks (mostly the `xex_lzx_decompress_output` scan) and validates that they fire correctly during boot.
## Headline finding
Session 2 expanded the host-write watch from **~5 surfaces** (`store_and_swap`, `store`, `Memory::Zero/Fill/Copy`) to **~9 surfaces** (+ `be<T>::set`, + the XEX loader pre-scan at 4 sites in `xex_module.cc`, including the `lzx_decompress` output). Sanity went from 1,639 → 78,738 hits, validating the new hooks.
**Despite this expansion**, the vtable install at `[0xBCE25340] = 0x8200A208` STILL produces 0 hits across canary's full boot. Combined with AUDIT-067's 16 PPC JIT store hooks producing 0 hits, the install path is officially OUTSIDE the known canonical write surfaces. Possible remaining paths (Session 3+ search space):
1. **Direct `*reinterpret_cast<T*>(host_ptr) = value`** in kernel-import handlers (raw pointer assignment, bypassing `xe::be<T>::set()`, `xe::store_and_swap`, and `Memory::*`). Auditing this needs a ripgrep pass over `kernel/xboxkrnl/*.cc` for that pattern.
2. **Allocator-side initial-state writes**: `MmAllocatePhysicalMemoryEx` returning a block that already contains the value from a prior committed-but-deallocated page (cross-page artifact). Memory protection routines (`MmSetAllocationProtect` etc.) may also mutate memory.
3. **GPU/HostMemory MMIO mappings**: D3D12 backbuffer / texture upload may write to guest VA ranges directly via mapped allocations.
4. **VFS file readback into guest VA**: `NtReadFile` writes the file contents into guest memory via `Memory::Copy` (now scanned) OR via a direct `memcpy(host_ptr, src, n)` in `xfile.cc`/`host_path_file.cc`. Those call sites need auditing.
5. **Kernel-import handler using a typed POD struct copy**: e.g. `*reinterpret_cast<X_FOO*>(host_ptr) = X_FOO{...}`, where the struct-level copy runs through neither `be<T>::set()` (POD struct copy uses memcpy semantics) nor `store_and_swap`.
Path 5 is the most likely candidate. The implicit copy-assignment of a struct containing `be<T>` members would NOT route through `set()` — only through bytewise memcpy. This is a hook-surface gap that Session 3 should target.
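
The path-5 gap can be reproduced in isolation. The sketch below uses `be_u32` as a simplified stand-in for `xe::be<uint32_t>` (it is not the real class) and a hypothetical `X_FOO` layout; the counters play the role of the audit hook, and `g_guest_lo`/`g_guest_hi` mark an assumed watched window.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Watched "guest" window [lo, hi) as host addresses (set by the caller).
uintptr_t g_guest_lo = 0, g_guest_hi = 0;
int g_set_calls = 0;   // every set() call, wherever the object lives
int g_guest_hits = 0;  // set() calls whose target lies in the watched window

// Simplified stand-in for xe::be<uint32_t>; stores bytes big-endian.
struct be_u32 {
  uint8_t bytes[4];
  void set(uint32_t v) {
    ++g_set_calls;  // this is where the audit hook would sit
    const uintptr_t self = reinterpret_cast<uintptr_t>(this);
    if (self >= g_guest_lo && self < g_guest_hi) ++g_guest_hits;
    bytes[0] = static_cast<uint8_t>(v >> 24);
    bytes[1] = static_cast<uint8_t>(v >> 16);
    bytes[2] = static_cast<uint8_t>(v >> 8);
    bytes[3] = static_cast<uint8_t>(v);
  }
  uint32_t get() const {
    return (uint32_t(bytes[0]) << 24) | (uint32_t(bytes[1]) << 16) |
           (uint32_t(bytes[2]) << 8) | uint32_t(bytes[3]);
  }
};

// Hypothetical guest-visible struct.
struct X_FOO { be_u32 vtable; be_u32 flags; };
static_assert(std::is_trivially_copyable<X_FOO>::value,
              "implicit copy-assignment is a plain byte copy");

// Path 5: build on the host stack, then struct-copy into guest memory.
void install_by_struct_copy(void* host_ptr, uint32_t vtable) {
  X_FOO tmp;
  tmp.vtable.set(vtable);  // hook fires, but for a host-stack address
  tmp.flags.set(0);
  *static_cast<X_FOO*>(host_ptr) = tmp;  // trivial copy: set() is never called
}

// Contrast: a per-field write through the guest pointer does reach set().
void install_by_field(void* host_ptr, uint32_t vtable) {
  static_cast<X_FOO*>(host_ptr)->vtable.set(vtable);
}
```

In this sketch the struct copy changes the guest bytes while `g_guest_hits` stays at zero, which is the behaviour the hypothesis predicts; whether canary's installer actually takes this route is still unverified.
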
## Cross-reference each captured writer in ours
### `xex_lzx_decompress_output` (Run 5 — 1 hit)
Captures the LZX decompress of the XEX image into guest VA `base_address_=0x82000000`. In canary: `xenia/cpu/xex_module.cc:840` calls `lzx_decompress(compress_buffer, ..., buffer, uncompressed_size, ...)` where `buffer = memory()->TranslateVirtual(base_address_)`.
**Ours-side analog**: `xenia-rs/crates/xenia-xex/src/lzx.rs` + `xenia-rs/crates/xenia-xex/src/loader.rs`. Per Phase B `image_loaded_sha256 ea8d160e…` matching across cold runs, ours's LZX decoder produces byte-identical output to canary's. No fix needed. **GAP CLASS: NONE.**
### `be<T>::set` (sanity-v2 — 35 hits in 20 s)
Per sanity capture, these are likely kernel-state header writes (`kernel_state.cc:create_dispatch_table` etc.). Ours's analog: `xenia-rs/crates/xenia-kernel/src/state.rs` + `exports.rs` (each kernel handler that writes a `be<T>` field). Without enabling per-event tagging in the canary log we can't enumerate which handler produced which hit; full cross-reference deferred to Session 3.
**GAP CLASS: UNKNOWN — needs per-tid stack-trace enrichment in canary instrumentation.**
### `Memory::Zero`, `Memory::Fill`, `store_and_swap<u32>` (sanity-v2 — 48 hits combined)
Already covered by Session 1 cross-reference. No new gaps surfaced.
## Predicted vs actual outcomes
| Cascade rung | Prediction | Actual |
|---|---|---|
| A=catch vtable installer | ~75% | **FAIL** — 0 hits despite ~9-surface coverage. Hook-surface still incomplete OR install is via path-5-style POD struct copy. |
| B=catch voice-struct clearer | ~50% | **FAIL** — 0 hits. Addr range was a guess; needs guest-side voice-base probe first. |
| C=identify ours's gap if A succeeds | ~70% (cond. on A) | **N/A** (A failed). |
| D=Session 3 progression-metric move | ~40-50% (cond. on A+C) | **N/A** (A failed). |
Validated rungs:
| Rung | Actual |
|---|---|
| **E=Step 3 validation (XEX section caught)** | **PASS**: Run 5 caught `xex_lzx_decompress_output` at `0x82000000` with `MZ\x90\x00` magic. Session 1 reading-error #35 resolved at the hook level. |
| **F=`be<T>::set()` hook fires correctly** | **PASS**: sanity-v2 saw 35 `be<T>::set` hits in 20 s without crashing static init. |
## Session 3 recommendation
Three concrete next steps in priority order:
**Step 1: Hook raw pointer assignments inside `kernel/util/shim_utils.h`.** Per shim_utils.h, kernel-import handlers receive typed pointers (`X_HANDLE*`, etc.) and write through them with a raw `*ptr = value` assignment. Assigning a whole POD struct that contains `be<T>` fields does NOT go through `set()`, because the struct-level copy has memcpy semantics and never calls the member's `set()`. Add an `XAUDIT_68_WRITE_FIELD(host_ptr, value)` macro to be invoked at known write sites OR (more invasive) instrument each `*ptr = ...` pattern. ~50-100 LOC additive.
**Step 2: Add a memory-protection trap on guest VA `0xBCE25340` (4 bytes).** Use a guard page (`Memory::Protect` to read-only) and log the writer's RIP / x86 instruction from the host fault handler. Protection is page-granular, so the handler must filter faults by address. This is the nuclear option: it bypasses ALL emulation-layer hooks and catches the actual host store instruction. Requires platform-specific SIGSEGV/SEH handler integration. ~150-200 LOC platform-gated.
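
A minimal POSIX-only sketch of the guard-page idea follows, assuming Linux, `mmap`/`mprotect`, and a `SIGSEGV` handler installed with `SA_SIGINFO`. It works on a private anonymous page rather than canary's guest heap, records only the faulting address (capturing RIP from the `ucontext` argument is platform-specific and left out), and is one-shot: after the first fault the page stays writable. All names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <signal.h>
#include <sys/mman.h>
#include <unistd.h>

static volatile uintptr_t g_page_lo = 0;         // start of the watched page
static volatile size_t g_page_len = 0;           // length of the watched page
static volatile uintptr_t g_last_fault_addr = 0; // address of the trapped store
static volatile sig_atomic_t g_fault_count = 0;

// SIGSEGV handler: record the faulting address, then unprotect the page so
// the interrupted store instruction is retried and succeeds.
static void on_segv(int, siginfo_t* info, void*) {
  const uintptr_t addr = reinterpret_cast<uintptr_t>(info->si_addr);
  if (addr < g_page_lo || addr >= g_page_lo + g_page_len) {
    // Not our page: restore the default action and let the fault recur.
    signal(SIGSEGV, SIG_DFL);
    return;
  }
  g_last_fault_addr = addr;
  g_fault_count = g_fault_count + 1;
  mprotect(reinterpret_cast<void*>(g_page_lo), g_page_len,
           PROT_READ | PROT_WRITE);
}

// Maps one read-only page and installs the handler. Returns the page base,
// or nullptr on failure.
void* arm_write_trap() {
  const long page_size = sysconf(_SC_PAGESIZE);
  if (page_size <= 0) return nullptr;
  void* page = mmap(nullptr, static_cast<size_t>(page_size), PROT_READ,
                    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (page == MAP_FAILED) return nullptr;
  g_page_lo = reinterpret_cast<uintptr_t>(page);
  g_page_len = static_cast<size_t>(page_size);

  struct sigaction sa = {};
  sa.sa_sigaction = on_segv;
  sa.sa_flags = SA_SIGINFO;  // needed to receive siginfo_t with si_addr
  sigemptyset(&sa.sa_mask);
  if (sigaction(SIGSEGV, &sa, nullptr) != 0) return nullptr;
  return page;
}
```

Because the whole page is protected, unrelated writes to neighbouring fields on the same page would also fault; a fuller version would compare `si_addr` against the 4-byte target and re-arm the protection after each store.
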
**Step 3: Read-mode probe instead of write-mode.** Place a `RtlReadGuestU32(0xBCE25340)` probe at the FIRST iteration of canary's main loop AFTER memory init and log the VALUE at that address. If the value is `0` early and `0x8200A208` later, the write happened between those two samples. Combine this with `--audit_61_branch_probe_pcs=0x825070F0` (which AUDIT-067 confirmed fires) and a binary bisect over the boot trajectory.
Step 3 is cheapest (~20 LOC) and may pinpoint the install epoch without finding the writer; pair with bisection across the audit-068 event log.
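
The read-mode probe amounts to sampling the slot periodically and logging value transitions. A small sketch, assuming the probe is given the host pointer for the guest address and is called with a monotonically increasing tick (for example a main-loop iteration count); `ReadProbe` and `Transition` are hypothetical names, not existing canary types.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// One observed change of the watched guest u32.
struct Transition {
  uint64_t tick;       // sample index at which the new value was first seen
  uint32_t old_value;  // value at the previous sample
  uint32_t new_value;  // value at this sample
};

class ReadProbe {
 public:
  // host_ptr: host address of the 4 guest bytes to watch (big-endian u32).
  explicit ReadProbe(const volatile uint8_t* host_ptr) : ptr_(host_ptr) {}

  // Call once per main-loop iteration (or at any other bisection point).
  void sample(uint64_t tick) {
    const uint32_t v = (uint32_t(ptr_[0]) << 24) | (uint32_t(ptr_[1]) << 16) |
                       (uint32_t(ptr_[2]) << 8) | uint32_t(ptr_[3]);
    if (have_last_ && v != last_) {
      transitions_.push_back({tick, last_, v});
    }
    last_ = v;
    have_last_ = true;
  }

  const std::vector<Transition>& transitions() const { return transitions_; }

 private:
  const volatile uint8_t* ptr_;
  bool have_last_ = false;
  uint32_t last_ = 0;
  std::vector<Transition> transitions_;
};
```

A transition recorded at tick `n` bounds the install to the interval between samples `n-1` and `n`; moving the sample points inward narrows that interval without needing to hook the writer itself.
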
## Cascade outcome
- A (vtable installer caught): **FAIL** — surfaces still incomplete, but space narrowed.
- B (voice-struct clearer caught): **FAIL** — addr range remains a guess.
- C (ours gap identified): **N/A** (A failed).
- D (Session 3 progression move): **N/A**.
- **E (Step 3 XEX-section validation)**: **PASS** — proves Session 1's #35 surface gap is at least partially closed.
- **F (`be<T>::set` hook works)**: **PASS**.
Net: 2 cascade wins (E, F) for "instrumentation is sound and now covers ~9 surfaces"; 2 cascade losses (A, B) for "the actual writer is in a path that's STILL un-hooked or doesn't exist as a canonical write at all".
## Artifacts (this dir)
- `instrumentation-design.md` (Session 1)
- `fix-canary.diff` (Session 1 — 5-file diff)
- `fix-canary-v2.diff` (Session 2 — extends with 4 more sites)
- `run1-vtable-writers.log` (Session 1 — 0 hits)
- `run2-voice-struct-writers.log` (Session 1 — 0 hits)
- `run3-vtable-extended.log` (Session 2 — 0 HOST-WRITE hits, INIT confirmed)
- `run4-voice-struct-extended.log` (Session 2 — 0 hits)
- `run5-xex-section-sanity.log` (Session 2 — **1 hit** validating Step 3)
- `sanity-value0.log` (Session 1 — 1,639 hits)
- `sanity-v2-value0.log` (Session 2, 78,738 hits incl. 35 from `be<T>::set`)
- `writer-report.md` (Session 1)
- `writer-report-v2.md` (this file)
- `session-2-plan.md`