---
name: xenia-rs handle-audit harness + post-2026-04-25 sync state
description: Per-handle signal/wait/wake audit (--trace-handles), and the diagnostic finding that the previously-reported HLE sync gap no longer reproduces at -n 500M
type: project
originSessionId: f83e67b7-97f4-4222-a37f-e1720ab3ace6
---

## Audit harness landed

[`xenia-kernel/src/audit.rs`](../../../RE%20Project%20Sylpheed/xenia-rs/crates/xenia-kernel/src/audit.rs): `HandleAudit` + `HandleAuditTrail` capture create/signal/wait/wake events per kernel handle, bounded ring of 32 entries each. `KernelState::audit` is `enabled=false` by default; flip via `--trace-handles` flag or `XENIA_TRACE_HANDLES=1`. Disabled is a single inline early-return in each record method (zero hot-path cost).

Hook sites (in [`xenia-kernel/src/exports.rs`](../../../RE%20Project%20Sylpheed/xenia-rs/crates/xenia-kernel/src/exports.rs)):
- **Create**: `nt_create_event`, `nt_create_semaphore`, `nt_create_timer`
- **Signal**: `KeSetEvent`, `NtSetEvent`, `KePulseEvent`, `NtPulseEvent`, `KeReleaseSemaphore`, `NtReleaseSemaphore`, `NtSignalAndWaitForSingleObjectEx` (signal half), `signal_io_completion_event`
- **Wait**: `do_wait_single`, `do_wait_multiple` (one record per handle in the wait set)
- **Wake**: inside `wake_eligible_waiters` (separate records for manual-reset fan-out vs auto-reset/semaphore single-wake)

Diagnostic dump in [`xenia-app/src/main.rs::dump_thread_diagnostic`](../../../RE%20Project%20Sylpheed/xenia-rs/crates/xenia-app/src/main.rs) prints the audit trail at end-of-run when audit is enabled. Highlights `` (smoking gun for missing signal source) and `` (handles called out in the original deadlock report).

## Diagnostic finding (2026-04-25)

The HLE sync gap previously reported at ~7.5M cycles on Sylpheed boot is **no longer reproducing**.
Verified:
- `xenia-rs exec sylpheed.iso --halt-on-deadlock -n 500_000_000`: `EXIT=0`, no halt fired, no `scheduler.deadlock_halts` or `scheduler.deadlock_recoveries` counters appear.
- VdSwap=1 fires at ~22M instructions, VdSwap=2 at ~30M instructions; matches the post-Tier-4 baseline.
- Audit data confirms the originally-suspect handles (0x10FC, 0x1014, 0x1104, 0x10DC, 0x10F0) all *do* receive signals: e.g., 0x10FC = Event/Auto with 1 signal (NtSetEvent from tid=4) + 1 wake; 0x1014 = Semaphore with 15 signals / 15 wakes / 16 waits.
- Threads still parked at end-of-run (tids 2/3/4/5/6/10/13/14/16/18) are in normal worker-idle states (event+semaphore producer/consumer with timeouts, or "service exits on stop-event" with no shutdown signal; both expected).

**Why:** most likely the combination of the IRQ-injection stack-pad fix (2026-04-24) and the Tier-4 perf work (2026-04-25) shifted scheduler timing past the previous deadlock window.

**How to apply:** APC + Mutant infrastructure (`KeInitializeApc=0x6D`, `KeInsertQueueApc=0x7A`, `NtQueueApcThread=0xE3`, `KeAlertThread=0x4F`, `KeInitializeMutant=0x72`, `KeReleaseMutant=0x87`, `NtCreateMutant=0xD4`, `NtReleaseMutant=0xF2`, all in canary [`xboxkrnl_threading.cc`](../../../RE%20Project%20Sylpheed/xenia-canary/src/xenia/kernel/xboxkrnl/xboxkrnl_threading.cc)) was planned but DEFERRED: the audit data did not point to a missing kernel API. Implement only when a future regression actually requires it.

## Resolve fill-ins landed (2026-04-25)

[`xenia-gpu/src/edram.rs`](../../../RE%20Project%20Sylpheed/xenia-rs/crates/xenia-gpu/src/edram.rs) gained: `write_sample_32bpp`, `write_rect_32bpp`, `read_sample_64bpp`, `write_sample_64bpp`, `write_rect_64bpp`, `fill_rect_64bpp`. 64bpp helpers use Canary's doubled-pitch convention (`pitch_tiles_32bpp << 1`).

[`xenia-gpu/src/resolve.rs`](../../../RE%20Project%20Sylpheed/xenia-rs/crates/xenia-gpu/src/resolve.rs) `copy_to_memory` now handles:
1. **64bpp sources** via new `is_64bpp_bitwise_equivalent` (k_16_16_16_16, k_16_16_16_16_FLOAT, k_32_32_FLOAT). Two `write_u32` per pixel; `bpp_log2 = 3` for tiled offset.
2. **MSAA averaging** (k01/k23/k0123) via per-format decode/average/encode helpers:
   - `k_8_8_8_8`/`k_8_8_8_8_GAMMA`: per-byte rounded unsigned mean
   - `k_2_10_10_10`: per-field rounded mean (widths 2/10/10/10)
   - `k_16_16_FLOAT`, `k_16_16_16_16_FLOAT`: half-float decode → fp32 sum → encode
   - `k_32_FLOAT`, `k_32_32_FLOAT`: bitcast → fp32 sum → bitcast
   - `k_16_16_16_16`: per-16-bit-field rounded mean

[`xenia-gpu/src/gpu_system.rs`](../../../RE%20Project%20Sylpheed/xenia-rs/crates/xenia-gpu/src/gpu_system.rs) clear-paint dispatches to `fill_rect_64bpp` for 64bpp sources, using `RB_COLOR_CLEAR_LO` (lo) + `RB_COLOR_CLEAR` (hi) per Canary `draw_util.cc:1302-1303`.

Endian k8in64/k8in128 and `copy_dest_exp_bias != 0` remain on backlog (rare on first-pixels path); current code preserves the pre-existing warn+skip behavior for both.

## wgpu→ShadowEdram readback: deferred, foundation in place

`ShadowEdram` write APIs (`write_rect_32bpp`, `write_rect_64bpp`) are the foundational data-structure work the future readback retile path will use. The cross-thread plumbing (UiBridge `request_rt_readback` / `poll_rt_readback`, per-RT offscreen wgpu textures in `xenos_pipeline.rs`, `copy_texture_to_buffer` + `map_async` callback) is **deferred**: Sylpheed's current boot path fires no Xenos draws, so wiring the cross-thread readback today would land speculative code that can't be exercised against a real game flow. The plan file at [`/home/fabi/.claude/plans/please-address-the-hle-eager-pixel.md`](../../../home/fabi/.claude/plans/please-address-the-hle-eager-pixel.md) Section 2 has the full design (`ReadbackState`, `OffscreenRt`, `RtCache`).

## Verification (2026-04-25 session)

- `cargo test --workspace --release`: **386 tests pass** (was 369 baseline; +17 new for audit, edram, resolve).
- `xenia-rs check sylpheed.iso -n 2_000_000 --expect crates/xenia-app/tests/golden/sylpheed_n2m.json`: clean in both default block-cache mode and `XENIA_FORCE_PER_INSTR=1` per-instruction mode.
- `xenia-rs exec sylpheed.iso --halt-on-deadlock -n 500_000_000`: `EXIT=0`, no deadlock counters tripped.
- `cargo bench -p xenia-cpu`: `tight_alu_loop=119.86 MIPS`, `loadstore_loop=95.67 MIPS`, `mmio_storm=70.08 MIPS` (all at or above the prior post-Tier-4 baseline of 114.8 / 91.8 / 67.8).

## Files touched this session

- New: `xenia-rs/crates/xenia-kernel/src/audit.rs`.
- Modified:
  - `xenia-kernel/src/lib.rs` (mod), `state.rs` (KernelState::audit + helpers), `exports.rs` (hook calls at create/signal/wait/wake sites).
  - `xenia-app/src/main.rs` (--trace-handles flag, audit dump in dump_thread_diagnostic, env var `XENIA_TRACE_HANDLES`).
  - `xenia-gpu/src/edram.rs` (new 32bpp+64bpp write APIs + fill_rect_64bpp + tests).
  - `xenia-gpu/src/resolve.rs` (is_64bpp_bitwise_equivalent + 64bpp source path + MSAA averaging + half-float helpers + tests).
  - `xenia-gpu/src/gpu_system.rs` (64bpp clear-paint dispatch).
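
## Illustrative sketch: bounded per-handle audit ring

For orientation, the audit harness described under "Audit harness landed" (a per-handle ring of 32 entries, off by default, with an early return when disabled) could look roughly like the sketch below. This is not the code in `audit.rs`: the names `Audit`, `Trail`, `AuditEvent`, `record`, and `never_signaled` are hypothetical stand-ins, and the counter fields and the "waited on but never signaled" query are assumptions made for illustration only.

```rust
use std::collections::{HashMap, VecDeque};

/// Hypothetical event kinds; the real enum in audit.rs may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum AuditEvent {
    Create,
    Signal,
    Wait,
    Wake,
}

/// Ring capacity taken from the notes (32 entries per handle).
const RING_CAPACITY: usize = 32;

/// Illustrative per-handle trail: a bounded ring plus running totals.
#[derive(Default)]
struct Trail {
    ring: VecDeque<(AuditEvent, u32)>, // (event, thread id)
    signals: u64,
    waits: u64,
    wakes: u64,
}

/// Illustrative audit container, disabled by default.
#[derive(Default)]
struct Audit {
    enabled: bool,
    trails: HashMap<u32, Trail>,
}

impl Audit {
    /// Record one event for `handle`. When auditing is off this returns
    /// immediately, so callers only pay for the flag check.
    #[inline]
    fn record(&mut self, handle: u32, event: AuditEvent, tid: u32) {
        if !self.enabled {
            return;
        }
        let trail = self.trails.entry(handle).or_default();
        match event {
            AuditEvent::Signal => trail.signals += 1,
            AuditEvent::Wait => trail.waits += 1,
            AuditEvent::Wake => trail.wakes += 1,
            AuditEvent::Create => {}
        }
        // Drop the oldest entry once the ring is full.
        if trail.ring.len() == RING_CAPACITY {
            trail.ring.pop_front();
        }
        trail.ring.push_back((event, tid));
    }

    /// Handles that were waited on but never signaled (assumed query,
    /// sorted so the output is deterministic).
    fn never_signaled(&self) -> Vec<u32> {
        let mut handles: Vec<u32> = self
            .trails
            .iter()
            .filter(|(_, t)| t.waits > 0 && t.signals == 0)
            .map(|(h, _)| *h)
            .collect();
        handles.sort_unstable();
        handles
    }
}
```

In this sketch the totals keep counting after the ring wraps, so a summary such as "15 signals / 15 wakes / 16 waits" would stay accurate even when only the last 32 events per handle are retained. Whether the real harness tracks totals separately from the ring is not stated in these notes.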