Replaces APUBUG-PRODUCER-001's random-victim-hijack audio injection
with a dedicated per-client guest worker thread, mirroring xenia-canary's
apu/audio_system.cc:84-159 WorkerThreadMain pattern in xenia-rs's
threading model. Audio callback ticker is now safe to enable by default.
## What changed
- xenia-kernel/src/xaudio.rs: new XAudioState fields worker_handles +
worker_refs (one slot per of XAUDIO_MAX_CLIENTS=8). Synthetic
park-handle helper (0xF000_0000 | client_idx) — outside the normal
alloc range so wake_eligible_waiters never finds it; the only
legitimate state-flip is via try_inject_audio_callback.
- xenia-kernel/src/exports.rs: xaudio_register_render_driver spawns a
64KB-stack guest thread (create_suspended=true) via
state.scheduler.spawn after registration succeeds. Immediately flips
the spawned thread's state from Blocked(Suspended) to
Blocked(WaitAny[synthetic]) so it's parked but not woken. Stores the
kernel handle so find_by_handle resolves a fresh ThreadRef after slot
compaction. Failure paths log + leave xaudio.worker_refs[i] = None,
in which case the ticker drops fires (no random-victim fallback).
- xenia-app/src/main.rs: try_inject_audio_callback resolves the worker
via worker_handles[index] instead of scanning runqueues for a Ready
or Blocked victim. The PC+r3 injection and SavedCallbackCtx capture
are unchanged; the existing LR_HALT restore path re-blocks the
worker on its synthetic handle for the next tick. Flag handling
reworked: --xaudio-tick / XENIA_XAUDIO_TICK now act as explicit
override (truthy = force on, falsey = force off, absent = use the
KernelState default).
- xenia-kernel/src/state.rs: xaudio_tick_enabled default flipped from
false to true. Pre-fix it was off because the random-victim hijack
regressed swaps=2->1; with the dedicated worker that whole class of
regression is gone.
## Cascade verification at -n 500M (audit-runs/audit-048-audio-host-pump/)
Pre-fix baseline: audit-runs/audit-047-gamma-wedges/ours-end-state.log.
| Dim | Predicted (AUDIT-032) | Observed |
|-----|-------------------------------------|---------------------------------|
| A | tid=9 leaves Blocked[0x828A3254] | Ready @ pc=0x824d1404 |
| B | tid=10 leaves Blocked[0x828A3230] | Ready @ same pc/lr |
| C | XAudioSubmitRenderDriverFrame > 0 | Mixer setup path executed |
| D | KeReleaseSemaphore 0 -> non-zero | 0 -> 1; xaudio.callback.delivered=1 |
Bonus: audit-042's tid=6 worker pair on 0x10A0+0x10A4 also went
Blocked->Ready as a downstream effect.
Boot trajectory shifted significantly: NtWaitForSingleObjectEx
1,489,791 -> 30; NtSetEvent 3,334 -> 68; new exports firing
(StfsCreateDevice, ObCreateSymbolicLink, XamContentCreateEnumerator,
XamEnumerate, XamTaskSchedule, ExCreateThread x10, KeSetAffinityThread x7,
NtCreateSemaphore x4, NtWaitForMultipleObjectsEx x94, NtDuplicateObject x14,
XeCryptSha, XeKeysConsolePrivateKeySign). The system left the
audio-wait busy loop and entered the savegame/content/crypto init phase.
swaps regressed 2 -> 1 (degenerate splash repeat lost; main thread now
advances past splash entirely, blocked on a different handle). draws
unchanged at 0 — expected per AUDIT-032 (audio gate != renderer gate).
## Tests + scope
- cargo build --release succeeds, no new warnings.
- cargo test -p xenia-kernel --lib: 127/127 pass (incl. xaudio).
- cargo test -p xenia-app --lib: 5/5 non-ignored pass.
- Lockstep goldens (sylpheed_n2m / sylpheed_n50m) WILL drift on this
fix and need re-baselining as a follow-up commit.
75 net non-comment LOC across 4 files, well under AUDIT-032's
60-120 LOC budget.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes the four remaining deferred follow-up items in one bundle.
All four are smaller-scope and additive; lockstep determinism
unaffected (analyzer-only changes).
## M9.5 — __CxxFrameHandler scope-table parsing
- New `xenia_analysis::eh_scope` module. Magic-scans .rdata for the
three documented MSVC FuncInfo signatures (0x19930520/21/22) on
4-byte alignment. Each match is parsed as the documented struct
(BE u32 fields), with sanity caps on max_state / n_try_blocks /
pointer validity.
- Walks pUnwindMap (UnwindMapEntry, 8 bytes) and pTryBlockMap
(TryBlockMapEntry, 20 bytes) into one row each.
- New tables eh_funcinfo, eh_unwind_map, eh_try_blocks.
- Sylpheed yield: 2,588 FuncInfo (all version 0x19930522) /
10,019 unwind entries / 315 try-blocks.
## M11.5 — Static-init driver chain detection
- New `xenia_analysis::static_init` module. Walks every function
looking for the canonical _initterm loop: lwz cursor; mtctr;
bcctrl; addi cursor, cursor, 4 bounded by a compare against another
constant register. Extracts (array_start, array_end) and reads
the array.
- Reuses `function_pointer_arrays` table — drivers' arrays land with
kind='static_init' (replacing M11's prologue-heuristic output where
the structurally-grounded pattern fires).
- Sylpheed yield: 0 drivers detected — the binary's static-init
structure does not match the canonical CRT loop. Infrastructure
ready; future M11.6 can relax.
## VMX vector-store xrefs (M6 follow-up)
- Adds AltiVec/VMX X-form load/store XOs to the M6 opcode-31
dispatch: lvx/lvxl/lvebx/lvehx/lvewx (reads) and
stvx/stvxl/stvebx/stvehx/stvewx (writes), all addr_mode=
'x_form_indexed'. Static resolution still requires both rA and rB
constant.
- Sylpheed yield: 110 newly-detected stvx writes.
## Shift_JIS + UTF-8 localised-string detection (M7 follow-up)
- Extends `xenia_analysis::strings::analyze` with scan_shift_jis (JIS
X 0208 lead/trail byte ranges + half-width katakana pass-through)
and scan_utf8 (2- and 3-byte sequences). At least one multi-byte
unit required so pure-ASCII strings aren't double-counted.
- SJIS bytes rendered as \xHH escapes for diagnostic readability;
full SJIS→UTF-8 decoding deferred.
- Sylpheed yield: 790 Shift_JIS strings (Japanese debug + UI text)
+ 39 UTF-8.
## Tests
- +2 EH (parses_minimal_funcinfo_v0, rejects_bogus_max_state)
- +2 static_init (detects_canonical_initterm_loop, rejects_function_without_pattern)
- +2 strings (detects_shift_jis_string, detects_utf8_multibyte_string)
Tests 649→655 (+6 unit tests). DB schema golden + write_analysis_results
signature updated for new EH parameter.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes the dominant case M5 could not resolve — `lwz vt, off(this);
lwz fn, slot(vt); mtctr; bcctrl` (real C++ dispatch). Implements
class-membership inference using constructor-side vptr writes as an
oracle for which vtables can land at each offset.
## Algorithm
Phase 1 — vptr-write scan: walk every function with the existing
lis+addi register tracker. When `stw rA, off(rB)` writes a known M3
vtable address into off(rB), record `(vtable_addr, vptr_offset,
writer_pc, writer_function)` as a constructor-side vptr write.
Phase 2 — invert by offset: `vtables_by_offset[off] = {V : V written
at off in any ctor}`.
Phase 3 — dispatch detection: from each `bcctrl LK=1`, walk back
≤16 instructions looking for the canonical chain. Bail on register
clobber, branch, or label (basic-block) boundary.
Phase 4 — edge emission: for `(dispatch_pc, vptr_off, slot)`, emit one
`xrefs.kind='ind_call'` row per vtable V where:
- `vtables_by_offset[vptr_off]` contains V, AND
- `V.length > slot` (V actually has a method at that slot)
Multi-candidate sites (the common case at offset 0) are an
over-approximation; downstream queries filter to single-candidate sites
for high confidence:
`WHERE candidate_count=1` in `indirect_dispatch_sites`.
## Schema
NEW TABLES:
- `vptr_writes(writer_pc, vtable_address, vptr_offset, writer_function)`
- `indirect_dispatch_sites(dispatch_pc PK, vptr_offset, slot, candidate_count)`
- `indirect_dispatch_candidates(dispatch_pc, vtable_address, method_address)`
NEW INDICES on vtable_address / vptr_offset / method_address /
(vptr_offset, slot) for fast joins.
## Sylpheed yield
- 567 vptr writes / 214 vtables / 29 offsets (offset 0 = 88%).
- 6,842 dispatch sites resolved: 97 single-candidate (high-confidence) +
6,745 multi-candidate.
- 687,963 ind_call xref rows.
- 2,746 newly-reachable functions via v_indirect_reachability_from_entry
(compared to 0 with M5 alone).
- Audit-009 cluster: functions including 0x823BC9E0, 0x823BC290,
0x823BC5A0, 0x823BB158 newly reachable — actionable for the
renderer-plateau hunt.
Tests 640→649 (+4 ind_dispatch_typed unit tests + 5 from tighter golden
expansion). Schema golden + write_analysis_results signature updated.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Five LOW-priority milestones bundled. Total ~700 LOC across 11 files.
## M9 — has_eh derived from pdata.flags exception bit
- New `functions.has_eh BOOLEAN NOT NULL` column. Derived from M1's
already-parsed `pdata.flags` (bit 31 of the packed word — the
exception-handler-present flag, distinct from bit 30 which is the
always-1 32-bit-code flag). Index idx_functions_has_eh.
- Sylpheed: 2,975 of 23,073 pdata-validated functions have EH (12.9%).
## M10 — .tls section / IMAGE_TLS_DIRECTORY32 parser
- New `xenia_xex::tls::parse_tls` parses the directory + zero-terminated
callback array. Returns None when the binary has no .tls section.
- New `tls_info` (singleton row) + `tls_callbacks(slot, address)` tables.
- New `DbWriter::write_tls()` no-ops on None.
- Sylpheed has no .tls section → 0 rows; infra ready for binaries with
__declspec(thread).
## M8 + M11 — function_pointer_arrays (dispatch tables + static initialisers)
- New `xenia_analysis::funcptr_arrays::analyze` widens M3's vtable scan:
detects runs of ≥2 function pointers in .rdata and classifies each as
`vtable` (M3 re-emit), `dispatch_table` (M8), or `static_init` (M11)
via a constructor-prologue heuristic (mfspr + small stwu).
- New tables `function_pointer_arrays(address PK, length, kind)` and
`function_pointer_array_entries(array_address, slot, function_address)`.
- Sylpheed: 722 vtables + 388 dispatch_tables = 1,110 arrays / 6,347 slots.
0 static_init detected (Sylpheed's ctors don't all match the
conservative heuristic; M11.5 future work can chain via the entry-
point's static-init driver).
## M12 — --lr-trace runtime canary-diff harness
- New CLI `exec --lr-trace=PC[,PC,...]` and `--lr-trace-out=PATH` flags.
Symbolic resolution (Class::method, Class::*) via M4 lookup. Env vars
XENIA_LR_TRACE / XENIA_LR_TRACE_OUT also work.
- New `KernelState::lr_trace_pcs` + `lr_trace_writer` + helper
`fire_lr_trace_if_match(hw_id)` invoked from the per-instr probe slot.
- JSONL output: pc/tid/hw/cycle/r3/r4/r5/r6/lr — superset of what
xenia-canary's --log_lr_on_pc patch emits, with a cycle counter for
cross-run reproducibility. Diff-friendly via `jq`.
- Lockstep digest unaffected: smoke test on entry-point PC fires once
with cycle=0/lr=BCBCBCBC/all-GPR-zero (correct initial state).
Tests 636→640 (+2 TLS tests, +2 funcptr_arrays tests). Schema golden
updated for new tables + has_eh column. Lockstep determinism preserved
(instructions=2000005 ×2 reruns identical).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds finer-grained addressing-mode classification to every data xref row
plus new dispatch for instruction families not previously emitted:
- New `xrefs.addr_mode VARCHAR NULL` column. NULL for control-flow edges
(call / ind_call / j / br); one of d_form / lis_addi / lis_ori /
multiword / x_form_indexed / x_form_byterev / atomic / dcbz for data
edges. Index idx_xrefs_addr_mode.
- New `xenia_analysis::xref::AddrMode` enum + Xref::addr_mode field.
- Opcode 46/47 (lmw/stmw) expand to one xref per slot — D-form multi-word
load/store now resolves all (32-rS) consecutive addresses.
- Opcode 31 X-form dispatch — stwx/stbx/sthx/stwux/stbux/sthux/stdx/stdux,
lwzx/lbzx/lhzx/lhax/lwzux/lbzux/lhzux/lhaux/ldx/ldux,
stwcx./stdcx. (atomic),
stwbrx/sthbrx/lwbrx/lhbrx (byte-reverse),
dcbz (cache-line clear).
- X-form rows are emitted ONLY when both rA and rB resolve to known
constants (rare but present); the dominant runtime-indexed pattern
remains correctly skipped.
Sylpheed yield (regen on master + merge):
- 442 newly-detected x_form_indexed reads (lwzx/lhzx into static tables).
- 40 newly-detected atomic writes (stwcx./stdcx. with resolvable address).
- 28,834 lis_addi refs, 18,485 d_form reads, 3,288 d_form writes — every
pre-existing data row now tagged.
- 0 multiword / dcbz / byterev (these instructions exist but aren't on
lis+addi-tracked code paths).
Tests 633→636 (+3 xref unit tests covering AddrMode tag uniqueness,
data-edge addr_mode round-trip, control-edge None invariant). Schema
golden updated (xrefs gains addr_mode column).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two MEDIUM milestones bundled (both opportunistic per plan; both small).
## M5 — indirect-dispatch reachability
- `xenia_analysis::indirect`: per-basic-block register tracker over each
detected function. Recognises the canonical static-vtable pattern
`lis+addi → lwz off(rA) → mtctr → bcctrl` where rA holds a known M3
vtable address. Emits one `Xref { kind: IndirectCall }` per resolvable
bcctrl site.
- PowerPC ABI awareness: `bl`-style calls clobber volatile r0..r12 + ctr
but preserve non-volatile r13..r31, so a vtable pointer parked in r30/r31
before a call survives.
- Label-based basic-block boundaries kill register state — bounds
false-positive risk for jump-IN paths.
- New `XrefKind::IndirectCall` variant (DB tag `'ind_call'`).
- New SQL view `v_indirect_reachability_from_entry` — strict superset of
`v_reachability_from_entry`, taking `ind_call` edges in the BFS.
Sylpheed yield: 0 edges detected. The binary's 1,001 static lis+addi
references into vtables are nearly all constructor-side vptr writes, not
dispatches; real method dispatch goes through `this->vptr` which requires
alias analysis we explicitly don't do. Documented in SCHEMA.md as the
expected limitation. Three unit tests cover the synthetic-correctness path.
## M7 — string / constant-pool detection
- `xenia_analysis::strings`: scans `.rdata` for runs of ≥ 6 printable
ASCII bytes (NUL-terminated) and ≥ 6 UTF-16LE code units (basic-plane
printable ASCII, NUL u16 terminator).
- New `strings(address PK, encoding, length, content)` table + encoding index.
- Implicit cross-ref via existing `xrefs.kind='ref'` rows whose target
matches a strings.address.
Sylpheed yield: 6,311 ASCII strings (including embedded HLSL shader source
and AS_CB_SURFACE_SWIZZLE_* assertion strings). 9,132 lis+addi sites
cross-reference detected strings — names source PCs near each string in
one query. Four unit tests cover encoding detection, NUL termination, and
short-run rejection.
Tests 626→633 (+3 indirect, +4 strings).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
CLI extension only — no schema change. Adds symbolic resolution for
--pc-probe / --branch-probe / --ctor-probe tokens:
- `0xADDR` / `2186674160` — numeric (current behavior, no DB load).
- `Class::method` — joins classes × methods × demangled_names.
- `Class::*` — joins classes × methods (all slots).
- `function_name` — falls back to functions.name for free functions /
saverestore stubs / labels.
New `xenia_analysis::lookup::resolve_probe_token(db_path, token)` opens the
DB read-only ONLY when a token is non-numeric, so legacy numeric flows pay
no IO. New `--probe-db PATH` flag (or `XENIA_PROBE_DB` env / default
`sylpheed.db` next to the .iso) selects the DB.
Symbolic resolution happens BEFORE any guest exec, so it cannot affect the
lockstep digest. Verified deterministic across two reruns at -n 2M
(instructions=2000005 identical).
End-to-end smoke test on Sylpheed: `--pc-probe='ANON_Class_6B674251::*'`
resolves to all 45 method PCs of that anonymous class (matching the
methods-table row count for that vtable).
Tests 621→626 (+5 lookup unit tests covering numeric passthrough,
symbolic-without-DB error, Class::method resolution, Class::* expansion,
and functions.name fallback).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds detection of statically-allocated MSVC vtables in .rdata/.data:
- New `xenia_analysis::vtables` walks read-only sections looking for runs of
≥3 contiguous big-endian u32 values where each value lands on a known
function start (from M1's corrected functions table). 2-slot runs are
rejected to keep false-positive rate down.
- For each candidate the MSVC RTTI walk vtable[-1] → CompleteObjectLocator
→ TypeDescriptor → mangled name is attempted; on success the demangled
class name is recorded along with a best-effort RTTIClassHierarchyDescriptor
walk to fill base_classes_json. On failure (RTTI stripped — common for
shipped game binaries) the class is named ANON_Class_<fnv1a-hash> keyed
by sorted method-PC list, so identical vtables collapse to one entry.
- DB: new tables `vtables`, `methods`, `classes` with indices on
function_address and rtti_present. `write_analysis_results` takes a
`&[Vtable]` slice; `write_disasm` (back-compat) passes empty.
- cmd_dis wires the scan after xref analysis using
`func_analysis.functions.keys()` as the function-start oracle.
Validation on Sylpheed (RTTI stripped, as expected): 722 vtables / 499
unique classes / 5571 methods. Sanity invariant: every methods.function_address
joins to functions.address (0 broken refs). Largest vtable: 131 slots.
Tests 617→621 (+4 vtable unit tests covering 3-slot detect, 2-slot reject,
synth name stability, and synth name divergence).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds an authoritative function-boundary source from the linker:
- New `xenia_xex::pdata` parses .pdata 8-byte entries (BeginAddress + packed
prolog/length/flags). Bit layout per Microsoft PE32 PowerPC spec: prolog in
bits 0..7, function_length in bits 8..29, flags in 30..31.
- `func::analyze_with_pdata` unions pdata BeginAddresses into the candidate
set, attaches `pdata_validated`/`pdata_length` to each `FuncInfo`, and trims
any function whose `end` overlaps the next start (catches mis-merge where
one row spanned two prologues — the audit-031 sub_824D23B0/sub_824D29F0
case).
- DB: extends `functions` with `pdata_validated BOOLEAN`, `pdata_length BIGINT`;
new table `pdata_entries`; index on pdata_validated.
- New `crates/xenia-analysis/SCHEMA.md` documents M1 layer + forward work.
Validation on Sylpheed: 25481 functions (was 12156) / 23073 pdata_validated /
0 orphans / 0 mis-merges. Audit-031 mis-merge resolved: sub_824D29F0 now has
its own row with `pdata_length=280` (70 dwords); sub_824D23B0 now correctly
ends at 0x824D2878 (`pdata_length=1224` matches prologue walk).
Tests 605→610. New 5-test pdata unit suite covers bit layout + sentinel +
out-of-range filtering + real-world layout round-trip.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Without the page-state guard, read_bulk faulted on PROT_NONE pages of
the 4 GiB host reservation. Per-page is_mapped check skips uncommitted
pages, leaving the buffer's leading zero bytes in place. Total LOC
budget after trim: 70.
The headless cmd_exec path passes quiet=false in normal use but the
diagnostic --dump-section is independent of the chatty thread/dump
prints, so it should not be gated by --quiet. Lockstep digest preserved.
Adds an opt-in diagnostic that emits one tracing line per guest store
overlapping any armed byte address, naming the writer (tid, pc, lr)
plus old/new u32 lanes. Mirrors the --pc-probe / --branch-probe shape;
pc/lr are stamped from worker_prologue via a thread-local Cell, so
default runs (empty watch set) take a single is_empty() check on each
write. Lockstep digest preserved (instructions=100000003 across reruns,
sylpheed_n50m.json golden byte-identical).
Diagnostic infra only; no functional change. Used to identify producers
of dispatch-state writes for the audit-017 / audit-019 hunt.
Sister to --pc-probe / --ctor-probe but emits a single compact one-line
BRANCH-PROBE record per fire (pc, tid, hw, cycle, r3, lr, cr0/cr6 flags)
with no back-chain. Designed for tracing every conditional-branch fire
inside a candidate-gate function so the last PC reached before the
function epilogue identifies the exit branch.
Runtime trace at audit-runs/audit-007/sub_824A9710-trace.log decisively
identifies the priv-11 gate:
- Exit branch: 0x824a9944 (post bl sub_824ABD88 first call)
- Responsible kernel call: NtDeviceIoControlFile, FsCtlCode=0x74004
(registered as stub_success at exports.rs:90)
- Mechanical chain: stub returns 0/SUCCESS without writing OUT, game
reads [out_buf+8], finds zero, assigns hardcoded 0xC0000034
(STATUS_OBJECT_NAME_NOT_FOUND) at sub_824ABD88:0x824abea8-ac, exits
via 0x824a9944's lt branch before priv-11 site at 0x824a99a0.
592→592 tests; lockstep instructions=100000010, swaps=2, draws=0
deterministic across reruns. Read-only diagnostic — no fix this session.
Next session: KRNBUG-IO-003 (real NtDeviceIoControlFile per canary
NullDevice::IoControl for FsCtlCodes 0x70000 + 0x74004).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Extends `--ctor-probe` machinery into `--pc-probe` (clap alias) with
the optional `PC@DISPATCHER:OFFSET` token form: on a hit, the helper
additionally logs `[disp+off]` — what the producer's
`lwz r3, OFFSET(r3)` is about to read. Reuses `parse_hex_u32`; both
flags share parser + storage.
Read-only diagnostic. Lockstep digest preserved (`run digest matches
golden` at -n 50M `--stable-digest`). 588 tests green.
Decisive findings (full deliverable in `audit-findings.md` /
`audit-runs/audit-005/`):
- Failure mode α confirmed for KRNBUG-AUDIT-004: all 9 producer call
sites for handles 0x100c (5 sites) and 0x15e0 (4 sites) fire 0x at
-n 500M. The producer code path is not reached.
- Set-diff of kernel-call sequences (canary.log oracle vs ours.log
at -n 500M) identifies 11 exports canary calls and we don't:
XGetAVPack, XeCryptSha, XeKeysConsolePrivateKeySign,
ObCreateSymbolicLink, NtDeviceIoControlFile (×2),
XamUserReadProfileSettings (×2), XamTaskSchedule, XamTaskCloseHandle,
KeReleaseSemaphore (×268), KeResetEvent, ExTerminateThread (×2).
- XGetAVPack has exactly one caller (sub_824AB578 at 0x824AB5A0).
The 4 instructions immediately preceding it are:
addi r3, r0, 10 ; privilege bit 10
bl XexCheckExecutablePrivilege
cmpli 0, r3, 0
bc 12, eq, 0x824AB724 ; if r3==0, skip whole block
- exports.rs:193 registers XexCheckExecutablePrivilege as
stub_return_zero. Always returning 0 -> guest takes the branch
and skips the entire AV/crypto/save-data init block.
- The other call site (sub_824A9710 at 0x824A99A0) queries privilege
11 with opposite polarity (bne) -> gates XamTaskSchedule on the
privilege-NOT-set arm. With both stubs returning 0, the guest
walks the wrong arm of every privilege-gated branch.
- This explains why the dispatcher fields read zero
([0x828F3D08+0x50]=0, [0x828F4070+0x24]=0 from AUDIT-004 dumps):
the ctors run, but the producers that would populate those fields
with a non-zero handle never execute.
Next session: replace XexCheckExecutablePrivilege stub with real
priv-bit lookup from XEX header. See audit-findings.md
KRNBUG-AUDIT-005 for the validation matrix.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Diagnostic-only, read-only. Lockstep `instructions=100000002`
preserved bit-exact at -n 100M --stable-digest. 586 → 588 tests.
Adds two read-only diagnostics for the parked-waiter producer hunt:
* `--ctor-probe=0x8217C850,0x...` — at every interpreter step,
if `ctx.pc` is in the configured set, print one `CTOR-PROBE`
line capturing live r3 (= `this` in MSVC PPC ctors), lr
(= return site), sp, plus an 8-frame back-chain with
saved-r31/r30 per frame. Fires once per hit, exactly what the
8-instance-pool probe needed.
* `--dump-addr=0x828F3D08,0x828F4070,0x828F3EC0,...` — at end of
run (after the FOCUS report in `dump_thread_diagnostic`), each
address gets a 128-byte hex + be32 + ASCII dump. Used to
inspect the static dispatcher / job-queue struct layouts
AUDIT-003 identified.
Both gated default-off; empty set is a single `is_empty()` test on
the hot path. No guest state is mutated, so the
`sylpheed_n*m.json` lockstep digest is preserved.
KRNBUG-AUDIT-004 findings (corrects KRNBUG-AUDIT-002/003):
1. **The "8-instance pool" hypothesis for handle 0x1004 is FALSE.**
Probing the inner per-instance ctors `[0x821783D8, 0x82181750,
0x821701C8]` at -n 50M shows each fires EXACTLY ONCE with
r3 = `[0x828F3EC0, 0x828F3D08, 0x828F4070]` respectively. All
three handles are Meyers-style singletons with one dispatcher
each. The "called 8 times" claim came from miscounting raw
entries to the OUTER getter sub_8217C850 — but that getter is
itself a Meyers-singleton-getter; only the FIRST entry cascades
through to bl 0x821783D8 (gated on `[0x828F48D8] bit 0`).
2. **The producer indirection layer is the singleton-getter
itself.** Static byte-scan of .rdata / .data shows 0 hits for
the dispatcher addresses — no static registry table holds them.
But the xrefs table for the OUTER getters reveals 5–6 callers
each, MOSTLY non-create-chain, sharing the canonical producer
pattern: `bl outer_singleton_getter; lwz r3, OFFSET(r3); bl
0x824AA1D8` (with OFFSET=80 for 0x100c, =36 for 0x15e0). So the
AUDIT-003 xref audit was necessary but not sufficient — it
correctly saw "no direct producer references" but missed the
singleton-getter indirection layer.
3. **Dispatcher struct layouts** (128-byte dumps captured at -n
50M --halt-on-deadlock):
- 0x828F3D08 (handle 0x100c): event_handle at +0x4C (0x100c),
thread_handle at +0x48 (0x1010), self-pointer at +0x74,
capacity 7 at +0x28, queue empty (+0/+3C = -1).
- 0x828F4070 (handle 0x15e0): event_handle at +0x20 (0x15e0),
sibling-handle 0x15E4 at +0x1C, queue empty (+0x10 = -1).
- 0x828F3EC0 (handle 0x1004): event_handle at +0x78 (0x1004),
4 guest-heap sub-buffers at +0x20/+0x3C/+0x44/+0x50 in
0x4xxxxxxx range — noticeably different layout from the
other two pure POD job queues.
Files:
crates/xenia-kernel/src/state.rs ctor_probe_pcs / dump_addrs +
fire_ctor_probe_if_match + 2 tests
crates/xenia-app/src/main.rs Exec --ctor-probe / --dump-addr
CLI parsing, prologue hook,
end-of-run struct dumper
audit-findings.md KRNBUG-AUDIT-004 entry
audit-runs/audit-004/ 50M probe runs (v1 outer-getter
hits, v2 inner-ctor hits proving
the singleton hypothesis)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds a read-only MSVC RTTI traversal helper (`read_class_at_this`)
and a `probe_create_stack_classes` integration that walks each
captured back-chain frame for handle creates in `--trace-handles-focus`
and probes each frame's most-likely `this` candidate (live r31/r30/r3
for frame 0; saved-r31/r30 from the prologue spill area at [fp-12]/
[fp-16] for deeper frames). False-positive guard rejects the CRT
static-init iterator pattern (vtable's first two slots must be image-
range function pointers — PPC instruction words like `mflr r12` are
not in 0x82xxxxxx).
`dump_thread_diagnostic` now takes `&GuestMemory` so the FOCUS report
prints, for each parked waiter, a WAIT-THREAD block with full back-
chain frames and per-slot saved-register dump for offline lookup.
End-to-end finding (-n 500M producer-trace):
* Handle 0x100c dispatcher = 0x828F3D08 (image rdata; verified by
sub_82181750 disasm + xref table). [this+0] = -1 sentinel — POD
job queue, NOT a C++ polymorphic class.
* Handle 0x15e0 dispatcher = 0x828F4070 (same shape).
* Handle 0x1004's 8-instance pool members still TBD (MSVC ctors
didn't preserve `this` in r31).
* 0x42450b5c is a separate audit class (heap-allocated, parks via
non-`do_wait_single` path).
Decisive xref audit: every reference to 0x828F3D08 / 0x828F4070 in
the static analysis is in a ctor or the CRT init driver. NO producer
code references either dispatcher base. Confirms `signal_attempts=0`
is unreachable-producer, not broken-producer.
Tests: 581 → 586 green (+5: RTTI-intact / RTTI-stripped / non-object
/ cstring / probe_create_stack integration). `--stable-digest -n
100M` instructions=100000002 unchanged. Master HEAD prior: 6440261.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds `walk_guest_back_chain` (PPC EABI back-chain walker) and a
`record_create_with_stack` audit hook gated on `--trace-handles-focus`.
NtCreateEvent / NtCreateSemaphore / NtCreateTimer / XamTaskSchedule now
route through the new helper so focused handles capture up to 6 stack
frames at allocation time. Diagnostic-only, read-only memory access:
unfocused handles pay one HashSet lookup, focused ones pay six
back-chain dereferences. Lockstep determinism preserved.
End-to-end finding: handles 0x1004 (8-instance pool via static ctor at
0x8280F810), 0x100c (singleton built inside main()), 0x15e0 (singleton
in distinct cluster) are silph-framework dispatcher objects whose
producer code is unreached at -n 500M. The producer hunt now has class
ownership; vtable/RTTI readout is the next step.
Tests: 576 → 581 green. `--stable-digest -n 100M` instructions=100000002
unchanged. Master HEAD prior: 9d45efe.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replace the three XAudio kernel-export stubs (Register/Unregister/SubmitFrame)
with canary-faithful implementations and add a periodic buffer-complete
callback ticker reusing the existing SavedCallbackCtx injection machinery.
Canary parity:
- xboxkrnl_audio.cc:56-93 — read callback_ptr[0..1], wrap callback_arg in a
4-byte big-endian guest heap buffer (`wrapped_callback_arg`), write
`0x4155_xxxx` to *driver_ptr.
- audio_system.cc:139-141 — guest callback receives r3 = wrapped pointer,
not raw callback_arg.
- audio_driver.h:21-24 — frame rate 256 samples / 48 kHz ≈ 5.33 ms.
Implementation:
- New `crates/xenia-kernel/src/xaudio.rs` — `XAudioClient`, `XAudioState`
(8-slot table, pending FIFO, dual-mode ticker), `XAUDIO_INSTR_PERIOD =
48_000` (lockstep) and `XAUDIO_PERIOD = 5.333 ms` (--parallel), same
pattern as KRNBUG-D08 v-sync.
- `try_inject_audio_callback` in xenia-app mirrors `try_inject_graphics_interrupt`,
shares `interrupts.saved` slot for mutex with graphics callbacks.
Gating: ticker + injector run only when `--xaudio-tick` /
`XENIA_XAUDIO_TICK=1`. Default off because Sylpheed's audio callback
enters an infinite `KeWaitForSingleObject` loop on first invocation
(canary's host worker thread provides the buffer-completion fence we
don't model), which hijacks a guest HW thread and regresses
`swaps=2 → 1`. Default-off preserves the lockstep `sylpheed_n*m.json`
goldens exactly.
Producer hunt outcome (FALSIFIED for parked handles 0x1004/0x100c/0x15e4):
at `-n 500M --xaudio-tick` all 3 handles still show
`signal_attempts=0 (primary=0, ghost=0)`. Audio callback is not the
missing producer. Next candidate per audit-findings.md is Timer DPC
delivery (KeSetTimer / KeInsertQueueDpc).
Tests: 562 → 576 green (10 in `xaudio.rs`, 4 in `exports.rs`).
Lockstep `--stable-digest -n 100M` default-off: instructions=100000002,
swaps=2 (matches pre-change baseline byte-for-byte).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The synthetic v-sync ticker used a per-instruction proxy
(VSYNC_INSTR_PERIOD = 150 k) tuned for ~10 MIPS lockstep
throughput → 60 Hz. Audit M11 observed this drifts under
`--parallel`: with 6 worker threads sharing the kernel mutex,
the dispatcher executes more PPC instructions per tick
callback, so the accumulator never crosses 150 k. Result:
~629 v-syncs/100M lockstep → ~2 v-syncs/100M --parallel.
Hybrid solution preserves lockstep determinism (which the
goldens depend on) while fixing --parallel:
* `tick_vsync_instr(instr_count)` — legacy instruction-count
ticker, used by lockstep. Bit-stable across runs.
* `tick_vsync_wallclock()` — new Instant-based ticker. Fires
`floor(elapsed / VSYNC_PERIOD)` v-syncs since the anchor
and advances the anchor by that many full periods (no
lazy backlog). Capped at INTERRUPT_QUEUE_CAP per call so a
forward-jumping clock can't overflow the FIFO.
* `KernelState.parallel_active` flag set at startup from
`--parallel` / `XENIA_PARALLEL=1`. Read by `coord_pre_round`
in main.rs to choose between the two tickers.
Verification:
* cargo test --workspace --release: 561 passing (+3 new
wall-clock tests vs prior 558 baseline).
* lockstep -n 100M --stable-digest: BIT-IDENTICAL to
pre-Phase-3 baseline. interrupts_delivered preserved at
~630 (was ~629 pre-fix).
* --parallel --reservations-table -n 30M: interrupts_delivered
rose from ~2 to 17. (FIFO INTERRUPT_QUEUE_CAP=4 still caps
burst delivery; that's a separate bottleneck — addressed
by raising cap when --parallel queue depth becomes the
next blocker.)
Trade-off: --parallel runs are non-deterministic at the
v-sync rate by design (per audit M05 PPCBUG-703 already).
Lockstep stays bit-identical, so the `sylpheed_n*m.json`
goldens are untouched.
Audit IDs: KRNBUG-D08 (closed).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds a one-run diagnostic that distinguishes "guest never called
Nt/KeSetEvent on this handle" from "signal landed but waiter wasn't
woken", for any handle named via `--trace-handles-focus`.
Parked-waiter context (project_xenia_rs_sylpheed_stage3_2026_04_29):
four worker threads block Sylpheed past `draws=0` on handles
0x1004 / 0x100c / 0x15e4 / 0x42450b5c (mr=true, sig=false). The
pre-existing audit dropped signal-attempts that targeted handles
without a primary trail, so we couldn't tell whether the producer
was unreachable in the guest or whether the signal landed but missed
its waiter.
Three changes:
* audit.rs: `HandleAudit` gains `focus: HashSet<u32>` and
`ghost_trails: HashMap<u32, GhostTrail>`. `record_signal`
auto-falls-through to a new `record_signal_attempt_ghost` when no
primary trail exists AND the handle is in `focus`. Bounded by
AUDIT_RING_CAPACITY per handle. Two new tests cover the focus
ghost-trail and no-double-record invariants.
* main.rs: new `--trace-handles-focus=<LIST>` flag (hex 0x or decimal,
comma-separated) populates `kernel.audit.focus`. Implies
`--trace-handles`. New "=== Handle audit (focus) ===" section in
`dump_thread_diagnostic` emits per-handle:
- signal_attempts (primary + ghost), waits, wakes
- merged cycle-sorted timeline (last 16)
- GuestExport / KernelInternal classification
- <AUDIT_BLIND> marker when waiter_count > 0 but the audit
saw no waits (i.e. waiter parked via a non-audit path —
CS / spinlock / DPC).
- DIAGNOSIS conclusion that selects between five branches.
* `cmd_check` passes None for focus → goldens unaffected.
Empirical run output at -n 500M lockstep with
`--trace-handles-focus=0x1004,0x100c,0x15e4,0x42450b5c`:
handle=0x00001004 kind=Event/Manual waiters=1 signaled=false
signal_attempts=0 (primary=0, ghost=0)
waits=1 wakes=0
created cycle=0 tid=1 lr=0x824a9f6c src=NtCreateEvent
=> producer is a missing kernel signal source
(or BST-paradox upstream)
... (same shape for 0x100c, 0x15e4)
handle=0x42450b5c kind=<UNCREATED> waiters=1 signal_attempts=0
waits=0 wakes=0 <AUDIT_BLIND>
=> waiter parked via non-audited path
Conclusion: hypothesis (A) confirmed for all 4 handles. Producer is
NOT a wake/eligibility bug — it is a genuinely missing kernel signal
source. The 3 Event/Manual handles share a creator
(lr=0x824a9f6c, tid=1) and the same wait-call wrapper at
lr=0x824ac578 — these are 3 worker threads all parked on
"work-available" notifications that never come.
Verification:
* cargo test --workspace --release: 558 passing (+2 new ghost-trail
tests vs prior 556 baseline)
* lockstep -n 100M --stable-digest: bit-identical to master HEAD
Audit IDs: KRNBUG-AUDIT-001 (closed — diagnostic instrumentation).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds a regression-catcher golden for Sylpheed boot at -n 50M lockstep,
covering the first VdSwap pair (the n2m oracle is swap-blind because
the first VdSwap fires at ~18M instructions). The new --stable-digest
flag emits/compares only fields that are deterministic in lockstep:
instructions, imports, unimpl, draws, swaps,
unique_render_targets, shader_blobs_live, texture_cache_entries
Excluded:
packets — empirically ±2-8% lockstep variance (GPU thread race per
audit M11)
resolves, interrupts_delivered, interrupts_dropped, texture_decodes —
scheduling-sensitive under --parallel
path — cwd-dependent
Empirical determinism: 3 consecutive lockstep -n 50M runs produce
byte-identical stable-digest output.
The n4b canonical-invocation golden the audit's recommended next sprint
also called for is deferred. Per audit memory `--parallel
--reservations-table` is pathologically slow (>32 min for -n 100M), so
-n 4B in that mode would be many hours per run, not the 5-15 min the
plan estimated. n4b will be captured one-shot post-renderer-unblock as
a manual artifact under audit-runs/post-fix/, not as a test golden. See
crates/xenia-app/tests/golden/README.md.
Test infrastructure:
- crates/xenia-app/tests/sylpheed_oracles.rs — invokes
CARGO_BIN_EXE_xenia-rs against the ISO. Path resolved via SYLPHEED_ISO
env var (skips gracefully if missing).
- #[ignore]-gated; run via:
cargo test --release -p xenia-app --test sylpheed_oracles \\
-- --ignored --nocapture
Closes ORACBUG-004 (P0). Partial: ORACBUG-006 (P1 deferred).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
observability.rs installs the tracing subscriber stack (env-filter +
JSON file appender + chrome trace + error layer) and the metrics
recorder shared by the workspace. main.rs grows the new CLI surface:
--parallel, --reservations-table, --trace-handles, --analyze=
{rust,sql,both}, xenia dis --json, --ui, plus the wiring that runs
the CPU through the new scheduler, drives the GPU's threaded backend,
and surfaces the framebuffer + HUD via xenia-ui.
Add tests/parallel_stress.rs (#[ignore]-gated long form, short form
runs 20×@5M) and tests/golden/sylpheed_n2m.json — the digest the
lockstep/parallel combos compare against.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Rust reimplementation of the xenia Xbox 360 emulator targeting reverse-
engineering and preservation, initially scoped to Project Sylpheed.
Includes:
- XEX2 loader (LZX decompression, AES decryption, PE parsing)
- XISO / XGD2 disc image VFS
- PPC interpreter with 200+ opcodes and VMX128 decoding
- Static analyzer: functions, cross-references, labels, asm + SQLite output
- HLE kernel covering the xboxkrnl/xam subset used by Sylpheed init
- Debugger with in-memory and SQLite-backed execution tracing
- `xenia-rs` CLI with extract/dis/exec commands that produce cumulative,
superset SQLite databases and opt-in instruction/import/branch traces
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>