xenia-rs/crates/xenia-kernel/src/xaudio.rs
MechaCat02 49f3eafa15 AUDIT-032: dedicated audio worker thread per client (Plan B)
Replaces APUBUG-PRODUCER-001's random-victim-hijack audio injection
with a dedicated per-client guest worker thread, mirroring xenia-canary's
apu/audio_system.cc:84-159 WorkerThreadMain pattern in xenia-rs's
threading model. Audio callback ticker is now safe to enable by default.
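
The park / inject / restore cycle described above can be modeled roughly as follows. This is a simplified sketch, not the xenia-rs scheduler: `WorkerState`, `Worker`, `spawn_parked`, `try_inject` and `on_callback_return` are hypothetical names introduced for illustration, and only the `0xF000_0000 | client_idx` handle layout is taken from this commit.

```rust
/// Hypothetical stand-in for the scheduler's thread state; the real
/// xenia-rs state enum has more variants and different payloads.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum WorkerState {
    /// As spawned with create_suspended = true.
    Suspended,
    /// Parked on a synthetic handle that no kernel object owns.
    Parked(u32),
    /// Running the injected guest audio callback.
    ServicingCallback,
}

/// Synthetic handle base, as described in this commit.
pub const SYNTHETIC_BASE: u32 = 0xF000_0000;

#[derive(Debug)]
pub struct Worker {
    pub slot: usize,
    pub state: WorkerState,
}

impl Worker {
    /// Spawn suspended, then immediately park on the slot's synthetic handle.
    pub fn spawn_parked(slot: usize) -> Self {
        let mut w = Worker { slot, state: WorkerState::Suspended };
        w.state = WorkerState::Parked(SYNTHETIC_BASE | slot as u32);
        w
    }

    /// Injection only succeeds from the parked state; otherwise the caller
    /// is expected to retry or drop the fire (no fallback to other threads).
    pub fn try_inject(&mut self) -> bool {
        match self.state {
            WorkerState::Parked(_) => {
                self.state = WorkerState::ServicingCallback;
                true
            }
            _ => false,
        }
    }

    /// Models the restore step: once the callback returns, re-park the
    /// worker on the same synthetic handle for the next tick.
    pub fn on_callback_return(&mut self) {
        if self.state == WorkerState::ServicingCallback {
            self.state = WorkerState::Parked(SYNTHETIC_BASE | self.slot as u32);
        }
    }
}
```

In this model a second `try_inject` while the callback is still running returns `false`, which corresponds to the "only one callback at a time" gating; the real gate lives elsewhere in xenia-rs.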

## What changed

- xenia-kernel/src/xaudio.rs: new XAudioState fields worker_handles +
  worker_refs (one slot per client, XAUDIO_MAX_CLIENTS=8). Synthetic
  park-handle helper (0xF000_0000 | client_idx) — outside the normal
  alloc range so wake_eligible_waiters never finds it; the only
  legitimate state-flip is via try_inject_audio_callback.
- xenia-kernel/src/exports.rs: xaudio_register_render_driver spawns a
  64KB-stack guest thread (create_suspended=true) via
  state.scheduler.spawn after registration succeeds. Immediately flips
  the spawned thread's state from Blocked(Suspended) to
  Blocked(WaitAny[synthetic]) so it's parked but not woken. Stores the
  kernel handle so find_by_handle resolves a fresh ThreadRef after slot
  compaction. Failure paths log + leave xaudio.worker_refs[i] = None,
  in which case the ticker drops fires (no random-victim fallback).
- xenia-app/src/main.rs: try_inject_audio_callback resolves the worker
  via worker_handles[index] instead of scanning runqueues for a Ready
  or Blocked victim. The PC+r3 injection and SavedCallbackCtx capture
  are unchanged; the existing LR_HALT restore path re-blocks the
  worker on its synthetic handle for the next tick. Flag handling
  reworked: --xaudio-tick / XENIA_XAUDIO_TICK now act as explicit
  override (truthy = force on, falsey = force off, absent = use the
  KernelState default).
- xenia-kernel/src/state.rs: xaudio_tick_enabled default flipped from
  false to true. Pre-fix it was off because the random-victim hijack
  regressed swaps=2->1; with the dedicated worker that whole class of
  regression is gone.
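
The tri-state flag handling from the xenia-app bullet (truthy = force on, falsey = force off, absent = KernelState default) could look roughly like this. It is a sketch under assumptions: the function names are hypothetical, the accepted spellings ("1"/"true"/"on"/"yes" and their opposites) are not specified by this commit, and treating an unrecognized value as absent is a choice made here for illustration.

```rust
/// Parse an optional override string into an explicit on/off decision.
/// Returns `None` when the flag is absent or (assumption) unrecognized.
pub fn parse_override(raw: Option<&str>) -> Option<bool> {
    let v = raw?.trim().to_ascii_lowercase();
    match v.as_str() {
        // Assumed truthy spellings; the real parser may accept others.
        "1" | "true" | "on" | "yes" => Some(true),
        // Assumed falsey spellings.
        "0" | "false" | "off" | "no" => Some(false),
        _ => None,
    }
}

/// Combine the override with the kernel default: an explicit override
/// wins, otherwise the default applies.
pub fn resolve_tick_enabled(raw: Option<&str>, kernel_default: bool) -> bool {
    parse_override(raw).unwrap_or(kernel_default)
}
```

With the default now `true`, `resolve_tick_enabled(None, true)` keeps the ticker on, while an explicit falsey value still forces it off for bisecting.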

## Cascade verification at -n 500M (audit-runs/audit-048-audio-host-pump/)

Pre-fix baseline: audit-runs/audit-047-gamma-wedges/ours-end-state.log.

| Dim | Predicted (AUDIT-032)               | Observed                        |
|-----|-------------------------------------|---------------------------------|
| A   | tid=9 leaves Blocked[0x828A3254]    | Ready @ pc=0x824d1404           |
| B   | tid=10 leaves Blocked[0x828A3230]   | Ready @ same pc/lr              |
| C   | XAudioSubmitRenderDriverFrame > 0   | Mixer setup path executed       |
| D   | KeReleaseSemaphore 0 -> non-zero    | 0 -> 1; xaudio.callback.delivered=1 |

Bonus: audit-042's tid=6 worker pair on 0x10A0+0x10A4 also went
Blocked->Ready as a downstream effect.

Boot trajectory shifted significantly: NtWaitForSingleObjectEx
1,489,791 -> 30; NtSetEvent 3,334 -> 68; new exports firing
(StfsCreateDevice, ObCreateSymbolicLink, XamContentCreateEnumerator,
XamEnumerate, XamTaskSchedule, ExCreateThread x10, KeSetAffinityThread x7,
NtCreateSemaphore x4, NtWaitForMultipleObjectsEx x94, NtDuplicateObject x14,
XeCryptSha, XeKeysConsolePrivateKeySign). The system left the
audio-wait busy loop and entered the savegame/content/crypto init phase.

swaps regressed 2 -> 1 (degenerate splash repeat lost; main thread now
advances past splash entirely, blocked on a different handle). draws
unchanged at 0 — expected per AUDIT-032 (audio gate != renderer gate).

## Tests + scope

- cargo build --release succeeds, no new warnings.
- cargo test -p xenia-kernel --lib: 127/127 pass (incl. xaudio).
- cargo test -p xenia-app --lib: 5/5 non-ignored pass.
- Lockstep goldens (sylpheed_n2m / sylpheed_n50m) WILL drift on this
  fix and need re-baselining as a follow-up commit.

75 net non-comment LOC across 4 files, within AUDIT-032's
60-120 LOC budget.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 15:06:25 +02:00


//! XAudio render-driver-client registration + buffer-complete callback loop
//! (canary parity: `xenia/apu/audio_system.cc`).
//!
//! Replaces the host-thread + per-client-semaphore + XAudio2 driver layer with
//! a periodic ticker that enqueues a "buffer complete" fire for each
//! registered client at the audio frame rate (256 samples / 48 kHz ≈ 5.33 ms).
//! The injection path in `xenia-app` reuses the same [`crate::SavedCallbackCtx`]
//! plumbing the graphics-interrupt path uses — only one callback runs at a
//! time across either subsystem, gated by `interrupts.is_in_callback()`.
//!
//! Lockstep mode uses an instruction-count proxy
//! ([`XAUDIO_INSTR_PERIOD`]) so `--stable-digest` stays bit-exact;
//! `--parallel` uses wall-clock ([`XAUDIO_PERIOD`]) — same dual-mode pattern
//! as KRNBUG-D08 v-sync.
use std::collections::VecDeque;
use std::time::{Duration, Instant};

use xenia_cpu::ThreadRef;

/// Mirrors [audio_system.h:30](../../../../xenia-canary/src/xenia/apu/audio_system.h#L30)
/// `kMaximumClientCount = 8`.
pub const XAUDIO_MAX_CLIENTS: usize = 8;

/// AUDIT-032 Plan B: synthetic kernel-handle base for the dedicated audio
/// worker threads' parking `WaitAny`. These handles are deliberately OUTSIDE
/// the normal allocator range (which starts at `0x1000` and grows by 4 in
/// [`crate::state::KernelState::alloc_handle`]) so a `state.objects` lookup
/// always misses, meaning [`crate::exports::wake_eligible_waiters`] will
/// never spuriously wake a worker. The only legitimate transitions are the
/// audio-callback injection in `try_inject_audio_callback`
/// (state→`ServicingIrq`) and the `LR_HALT` saved-context restore
/// (state→`Blocked` again). One handle per client slot keeps wait lists
/// per-worker (defensive: `wake_eligible` is a no-op anyway).
pub const XAUDIO_SYNTHETIC_HANDLE_BASE: u32 = 0xF000_0000;

/// Compute the synthetic park-handle for client slot `i`.
pub const fn synthetic_park_handle(i: usize) -> u32 {
    XAUDIO_SYNTHETIC_HANDLE_BASE | (i as u32)
}

/// Source identifier stamped into [`crate::SavedCallbackCtx::source`] when
/// an audio callback is injected. Distinct from graphics-interrupt sources
/// (`INTERRUPT_SOURCE_VSYNC = 0`, `INTERRUPT_SOURCE_CP = 1`) so logs and
/// the audit trail can disambiguate.
pub const INTERRUPT_SOURCE_AUDIO: u32 = 0x100;

/// Lockstep instruction-count period. Picked so the ratio against
/// [`crate::interrupts::VSYNC_INSTR_PERIOD`] (`150_000`) ≈ 16.67 ms / 5.33 ms,
/// matching canary's 256 samples / 48 kHz audio cadence.
pub const XAUDIO_INSTR_PERIOD: u64 = 48_000;

/// Wall-clock period under `--parallel`. 256 / 48000 s = 5.333… ms.
pub const XAUDIO_PERIOD: Duration = Duration::from_nanos(5_333_333);

/// Bound on the pending-fires FIFO. Stops a long-running export from
/// queueing unbounded callbacks while injection is starved.
pub const XAUDIO_QUEUE_CAP: usize = 16;
#[derive(Debug, Clone, Copy)]
pub struct XAudioClient {
    pub callback_pc: u32,
    pub callback_arg: u32,
    /// Guest pointer to the heap-allocated 4-byte buffer holding
    /// `callback_arg` big-endian, passed as r3 to the guest callback,
    /// matching canary's
    /// [audio_system.cc:225-228](../../../../xenia-canary/src/xenia/apu/audio_system.cc#L225-L228)
    /// + [audio_system.cc:139-141](../../../../xenia-canary/src/xenia/apu/audio_system.cc#L139-L141).
    pub wrapped_callback_arg: u32,
}

#[derive(Debug)]
pub struct XAudioState {
    pub clients: [Option<XAudioClient>; XAUDIO_MAX_CLIENTS],
    pub pending: VecDeque<usize>,
    pub delivered: u64,
    pub dropped: u64,
    pub accumulator: u64,
    pub last_instr_count: u64,
    pub last_instant: Option<Instant>,
    /// AUDIT-032 Plan B: dedicated audio-worker thread per client slot.
    /// Mirrors xenia-canary's `apu/audio_system.cc:84-159` host worker but
    /// uses a guest-side parked thread instead. The worker is registered at
    /// `XAudioRegisterRenderDriverClient` time and lazily looked up by
    /// `try_inject_audio_callback` via `scheduler.find_by_handle`. It is
    /// parked in `Blocked(WaitAny[SYNTHETIC_HANDLE])`; injection flips it
    /// to `ServicingIrq` and the `LR_HALT` restore path puts it back to
    /// `Blocked`. Each slot remembers the kernel handle so
    /// `find_by_handle` can resolve a fresh `ThreadRef` after slot
    /// pruning/reordering.
    pub worker_handles: [Option<u32>; XAUDIO_MAX_CLIENTS],
    /// `ThreadRef` of the worker for each client slot. Stays `None` when
    /// spawning failed, in which case the ticker drops fires for that slot.
    pub worker_refs: [Option<ThreadRef>; XAUDIO_MAX_CLIENTS],
}
impl Default for XAudioState {
    fn default() -> Self {
        Self {
            clients: [None; XAUDIO_MAX_CLIENTS],
            pending: VecDeque::new(),
            delivered: 0,
            dropped: 0,
            accumulator: 0,
            last_instr_count: 0,
            last_instant: None,
            worker_handles: [None; XAUDIO_MAX_CLIENTS],
            worker_refs: [None; XAUDIO_MAX_CLIENTS],
        }
    }
}

impl XAudioState {
    /// Store `client` in the first free slot and return its index, or
    /// `None` when all [`XAUDIO_MAX_CLIENTS`] slots are taken.
    pub fn register(&mut self, client: XAudioClient) -> Option<usize> {
        for (i, slot) in self.clients.iter_mut().enumerate() {
            if slot.is_none() {
                *slot = Some(client);
                return Some(i);
            }
        }
        None
    }

    pub fn unregister(&mut self, index: usize) {
        if index < XAUDIO_MAX_CLIENTS {
            self.clients[index] = None;
            self.pending.retain(|&i| i != index);
            // Worker thread (if any) stays parked on its synthetic handle.
            // Sylpheed never re-registers, so leaving it Blocked is
            // simpler than wiring a clean teardown. Clear our refs so a
            // future `register` rebuilds them.
            self.worker_handles[index] = None;
            self.worker_refs[index] = None;
        }
    }

    pub fn get(&self, index: usize) -> Option<XAudioClient> {
        self.clients.get(index).copied().flatten()
    }

    pub fn any_registered(&self) -> bool {
        self.clients.iter().any(|c| c.is_some())
    }

    /// Queue one fire per registered client. Stops at the first client
    /// that does not fit under [`XAUDIO_QUEUE_CAP`] and counts one drop.
    fn enqueue_all_active(&mut self) {
        for i in 0..XAUDIO_MAX_CLIENTS {
            if self.clients[i].is_none() {
                continue;
            }
            if self.pending.len() >= XAUDIO_QUEUE_CAP {
                self.dropped += 1;
                return;
            }
            self.pending.push_back(i);
        }
    }

    pub fn peek_next(&self) -> Option<usize> {
        self.pending.front().copied()
    }

    pub fn take_next(&mut self) -> Option<usize> {
        self.pending.pop_front()
    }

    /// Lockstep instruction-count ticker. Idempotently advances the
    /// accumulator from `last_instr_count` to `current_instr_count` and
    /// enqueues one fire-set per full [`XAUDIO_INSTR_PERIOD`] crossed
    /// (at most [`XAUDIO_QUEUE_CAP`] fire-sets per call).
    /// Returns `true` once at least one full period has elapsed; individual
    /// fires may still be dropped if the queue is already full.
    pub fn tick_instr(&mut self, current_instr_count: u64) -> bool {
        if !self.any_registered() {
            self.last_instr_count = current_instr_count;
            self.accumulator = 0;
            return false;
        }
        let delta = current_instr_count.saturating_sub(self.last_instr_count);
        self.last_instr_count = current_instr_count;
        self.accumulator = self.accumulator.saturating_add(delta);
        if self.accumulator < XAUDIO_INSTR_PERIOD {
            return false;
        }
        let periods = self.accumulator / XAUDIO_INSTR_PERIOD;
        self.accumulator %= XAUDIO_INSTR_PERIOD;
        let to_fire = (periods as usize).min(XAUDIO_QUEUE_CAP);
        for _ in 0..to_fire {
            self.enqueue_all_active();
        }
        true
    }

    /// Wall-clock ticker for `--parallel`. First call seeds the anchor
    /// (no fire). Subsequent calls fire `floor(elapsed / XAUDIO_PERIOD)`
    /// fire-sets and advance the anchor by that many full periods.
    pub fn tick_wallclock(&mut self) -> bool {
        if !self.any_registered() {
            self.last_instant = None;
            return false;
        }
        let now = Instant::now();
        let anchor = match self.last_instant {
            Some(t) => t,
            None => {
                self.last_instant = Some(now);
                return false;
            }
        };
        let elapsed = now.saturating_duration_since(anchor);
        let period_ns = XAUDIO_PERIOD.as_nanos() as u64;
        let elapsed_ns = elapsed.as_nanos() as u64;
        let periods = elapsed_ns / period_ns;
        if periods == 0 {
            return false;
        }
        let advance = Duration::from_nanos(periods * period_ns);
        self.last_instant = Some(anchor + advance);
        let to_fire = (periods as usize).min(XAUDIO_QUEUE_CAP);
        for _ in 0..to_fire {
            self.enqueue_all_active();
        }
        true
    }
}
#[cfg(test)]
mod tests {
    use super::*;

    fn dummy_client(arg: u32) -> XAudioClient {
        XAudioClient {
            callback_pc: 0x8200_0000 + arg,
            callback_arg: arg,
            wrapped_callback_arg: 0x4000_0000 + arg,
        }
    }

    #[test]
    fn register_assigns_first_free_slot() {
        let mut s = XAudioState::default();
        let i0 = s.register(dummy_client(1)).unwrap();
        let i1 = s.register(dummy_client(2)).unwrap();
        assert_eq!(i0, 0);
        assert_eq!(i1, 1);
        assert_eq!(s.get(0).unwrap().callback_arg, 1);
        assert_eq!(s.get(1).unwrap().callback_arg, 2);
    }

    #[test]
    fn unregister_clears_slot_and_pending() {
        let mut s = XAudioState::default();
        let i = s.register(dummy_client(1)).unwrap();
        s.pending.push_back(i);
        s.unregister(i);
        assert!(s.get(i).is_none());
        assert!(s.pending.is_empty());
    }

    #[test]
    fn register_returns_none_when_full() {
        let mut s = XAudioState::default();
        for k in 0..XAUDIO_MAX_CLIENTS {
            assert!(s.register(dummy_client(k as u32)).is_some());
        }
        assert!(s.register(dummy_client(99)).is_none());
    }

    #[test]
    fn tick_instr_no_clients_does_not_fire() {
        let mut s = XAudioState::default();
        assert!(!s.tick_instr(XAUDIO_INSTR_PERIOD * 10));
        assert!(s.pending.is_empty());
    }

    #[test]
    fn tick_instr_fires_at_period() {
        let mut s = XAudioState::default();
        let i = s.register(dummy_client(7)).unwrap();
        assert!(!s.tick_instr(XAUDIO_INSTR_PERIOD - 1));
        assert!(s.pending.is_empty());
        assert!(s.tick_instr(XAUDIO_INSTR_PERIOD));
        assert_eq!(s.peek_next(), Some(i));
    }

    #[test]
    fn tick_instr_drains_multiple_periods_in_one_call() {
        let mut s = XAudioState::default();
        let i = s.register(dummy_client(7)).unwrap();
        assert!(s.tick_instr(XAUDIO_INSTR_PERIOD * 4));
        assert_eq!(s.pending.len(), 4);
        for _ in 0..4 {
            assert_eq!(s.take_next(), Some(i));
        }
        assert!(s.pending.is_empty());
    }

    #[test]
    fn tick_instr_fires_for_each_registered_client() {
        let mut s = XAudioState::default();
        let a = s.register(dummy_client(1)).unwrap();
        let b = s.register(dummy_client(2)).unwrap();
        assert!(s.tick_instr(XAUDIO_INSTR_PERIOD));
        assert_eq!(s.pending.len(), 2);
        assert_eq!(s.take_next(), Some(a));
        assert_eq!(s.take_next(), Some(b));
    }

    #[test]
    fn tick_instr_caps_queue_growth() {
        let mut s = XAudioState::default();
        s.register(dummy_client(1)).unwrap();
        s.tick_instr(XAUDIO_INSTR_PERIOD * (XAUDIO_QUEUE_CAP as u64 + 50));
        assert!(s.pending.len() <= XAUDIO_QUEUE_CAP);
    }

    #[test]
    fn tick_wallclock_first_call_seeds_anchor() {
        let mut s = XAudioState::default();
        s.register(dummy_client(1)).unwrap();
        assert!(!s.tick_wallclock());
        assert!(s.pending.is_empty());
        assert!(s.last_instant.is_some());
    }

    #[test]
    fn tick_wallclock_fires_after_period() {
        let mut s = XAudioState::default();
        let i = s.register(dummy_client(1)).unwrap();
        s.tick_wallclock();
        std::thread::sleep(XAUDIO_PERIOD + Duration::from_millis(2));
        assert!(s.tick_wallclock());
        assert!(!s.pending.is_empty());
        assert_eq!(s.peek_next(), Some(i));
    }
}