AUDIT-032: dedicated audio worker thread per client (Plan B)

Replaces APUBUG-PRODUCER-001's random-victim-hijack audio injection
with a dedicated per-client guest worker thread, mirroring xenia-canary's
apu/audio_system.cc:84-159 WorkerThreadMain pattern in xenia-rs's
threading model. Audio callback ticker is now safe to enable by default.

## What changed

- xenia-kernel/src/xaudio.rs: new XAudioState fields worker_handles +
  worker_refs (one slot per of XAUDIO_MAX_CLIENTS=8). Synthetic
  park-handle helper (0xF000_0000 | client_idx) — outside the normal
  alloc range so wake_eligible_waiters never finds it; the only
  legitimate state-flip is via try_inject_audio_callback.
- xenia-kernel/src/exports.rs: xaudio_register_render_driver spawns a
  64KB-stack guest thread (create_suspended=true) via
  state.scheduler.spawn after registration succeeds. Immediately flips
  the spawned thread's state from Blocked(Suspended) to
  Blocked(WaitAny[synthetic]) so it's parked but not woken. Stores the
  kernel handle so find_by_handle resolves a fresh ThreadRef after slot
  compaction. Failure paths log + leave xaudio.worker_refs[i] = None,
  in which case the ticker drops fires (no random-victim fallback).
- xenia-app/src/main.rs: try_inject_audio_callback resolves the worker
  via worker_handles[index] instead of scanning runqueues for a Ready
  or Blocked victim. The PC+r3 injection and SavedCallbackCtx capture
  are unchanged; the existing LR_HALT restore path re-blocks the
  worker on its synthetic handle for the next tick. Flag handling
  reworked: --xaudio-tick / XENIA_XAUDIO_TICK now act as explicit
  override (truthy = force on, falsey = force off, absent = use the
  KernelState default).
- xenia-kernel/src/state.rs: xaudio_tick_enabled default flipped from
  false to true. Pre-fix it was off because the random-victim hijack
  regressed swaps=2->1; with the dedicated worker that whole class of
  regression is gone.

## Cascade verification at -n 500M (audit-runs/audit-048-audio-host-pump/)

Pre-fix baseline: audit-runs/audit-047-gamma-wedges/ours-end-state.log.

| Dim | Predicted (AUDIT-032)               | Observed                        |
|-----|-------------------------------------|---------------------------------|
| A   | tid=9 leaves Blocked[0x828A3254]    | Ready @ pc=0x824d1404           |
| B   | tid=10 leaves Blocked[0x828A3230]   | Ready @ same pc/lr              |
| C   | XAudioSubmitRenderDriverFrame > 0   | Mixer setup path executed       |
| D   | KeReleaseSemaphore 0 -> non-zero    | 0 -> 1; xaudio.callback.delivered=1 |

Bonus: audit-042's tid=6 worker pair on 0x10A0+0x10A4 also went
Blocked->Ready as a downstream effect.

Boot trajectory shifted significantly: NtWaitForSingleObjectEx
1,489,791 -> 30; NtSetEvent 3,334 -> 68; new exports firing
(StfsCreateDevice, ObCreateSymbolicLink, XamContentCreateEnumerator,
XamEnumerate, XamTaskSchedule, ExCreateThread x10, KeSetAffinityThread x7,
NtCreateSemaphore x4, NtWaitForMultipleObjectsEx x94, NtDuplicateObject x14,
XeCryptSha, XeKeysConsolePrivateKeySign). The system left the
audio-wait busy loop and entered the savegame/content/crypto init phase.

swaps regressed 2 -> 1 (degenerate splash repeat lost; main thread now
advances past splash entirely, blocked on a different handle). draws
unchanged at 0 — expected per AUDIT-032 (audio gate != renderer gate).

## Tests + scope

- cargo build --release succeeds, no new warnings.
- cargo test -p xenia-kernel --lib: 127/127 pass (incl. xaudio).
- cargo test -p xenia-app --lib: 5/5 non-ignored pass.
- Lockstep goldens (sylpheed_n2m / sylpheed_n50m) WILL drift on this
  fix and need re-baselining as a follow-up commit.

75 net non-comment LOC across 4 files, well under AUDIT-032's
60-120 LOC budget.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
MechaCat02
2026-05-10 15:06:25 +02:00
parent 7bc9e3acac
commit 49f3eafa15
4 changed files with 196 additions and 45 deletions

View File

@@ -948,16 +948,35 @@ fn cmd_exec_inner(
}); });
let parallel_active = parallel || parallel_via_env; let parallel_active = parallel || parallel_via_env;
kernel.parallel_active = parallel_active; kernel.parallel_active = parallel_active;
let xaudio_tick_via_env = std::env::var("XENIA_XAUDIO_TICK") // AUDIT-032: default is `KernelState::xaudio_tick_enabled = true` now
.ok() // that the dedicated worker eliminates HW-thread hijack regressions.
.is_some_and(|v| { // Treat `--xaudio-tick` / `XENIA_XAUDIO_TICK=...` as an explicit
// override only when set: `0`/`false`/`no` forces it off; truthy
// values keep it on. CLI flag is bool so `--xaudio-tick` (present)
// is treated as the truthy override; absence is "use default".
let xaudio_tick_env_raw = std::env::var("XENIA_XAUDIO_TICK").ok();
let xaudio_tick_env_falsey = xaudio_tick_env_raw
.as_deref()
.map(|v| {
let v = v.trim().to_ascii_lowercase();
v == "0" || v == "false" || v == "no"
})
.unwrap_or(false);
let xaudio_tick_env_truthy = xaudio_tick_env_raw
.as_deref()
.map(|v| {
let v = v.trim().to_ascii_lowercase(); let v = v.trim().to_ascii_lowercase();
v == "1" || v == "true" || v == "yes" v == "1" || v == "true" || v == "yes"
}); })
kernel.xaudio_tick_enabled = xaudio_tick || xaudio_tick_via_env; .unwrap_or(false);
if xaudio_tick || xaudio_tick_env_truthy {
kernel.xaudio_tick_enabled = true;
} else if xaudio_tick_env_falsey {
kernel.xaudio_tick_enabled = false;
}
if kernel.xaudio_tick_enabled && !quiet { if kernel.xaudio_tick_enabled && !quiet {
tracing::info!( tracing::info!(
"XAudio callback ticker enabled (--xaudio-tick / XENIA_XAUDIO_TICK=1)" "XAudio callback ticker enabled (AUDIT-032 default; toggle via --xaudio-tick / XENIA_XAUDIO_TICK)"
); );
} }
if reservations_table || reservations_via_env || parallel_active { if reservations_table || reservations_via_env || parallel_active {
@@ -3247,12 +3266,15 @@ fn try_inject_graphics_interrupt(kernel: &mut xenia_kernel::KernelState) {
); );
} }
/// APUBUG-PRODUCER-001 — inject a pending XAudio buffer-complete callback /// AUDIT-032 Plan B — inject a pending XAudio buffer-complete callback
/// into the next available HW thread. Mirrors /// into the **dedicated audio worker** registered for the head-of-queue
/// [`try_inject_graphics_interrupt`] (same victim-selection policy, same /// client. Mirrors
/// SP-pad, same saved-context restore-on-sentinel) but pulls the callback /// [`try_inject_graphics_interrupt`] (same SP-pad, same saved-context
/// PC + r3 argument from a registered [`xenia_kernel::xaudio::XAudioClient`] /// restore-on-sentinel) but the target thread is fixed at registration
/// instead of the graphics-callback registration. /// time instead of selected via the random-victim policy. The pre-fix
/// random-victim path corrupted unrelated thread state
/// (APUBUG-PRODUCER-001 "HW-thread hijack"); per-client workers eliminate
/// that whole class of regression.
/// ///
/// Mutual exclusion with the graphics path is via the shared /// Mutual exclusion with the graphics path is via the shared
/// `interrupts.saved` slot — if a graphics callback is already in flight, /// `interrupts.saved` slot — if a graphics callback is already in flight,
@@ -3273,39 +3295,39 @@ fn try_inject_audio_callback(kernel: &mut xenia_kernel::KernelState) {
return; return;
}; };
let mut victim: Option<xenia_cpu::ThreadRef> = None; // Resolve the dedicated worker thread. Cached `worker_refs[index]` is
'outer_ready: for (hw_id, slot) in kernel.scheduler.slots.iter().enumerate() { // stale-tolerant via `find_by_handle` (handle survives slot
for (idx, t) in slot.runqueue.iter().enumerate() { // compaction). If no worker is registered (spawn failed at register
if matches!(t.state, HwState::Ready) { // time, or fix is disabled in a fork), drop the fire — no fallback to
victim = Some(xenia_cpu::ThreadRef::new(hw_id as u8, idx as u16)); // random-victim, that regressed boot trajectory.
break 'outer_ready; let worker_handle = kernel.xaudio.worker_handles[index];
} let target_ref = match worker_handle.and_then(|h| kernel.scheduler.find_by_handle(h)) {
Some(r) => {
kernel.xaudio.worker_refs[index] = Some(r);
r
} }
} None => {
if victim.is_none() { let _ = kernel.xaudio.take_next();
'outer_blocked: for (hw_id, slot) in kernel.scheduler.slots.iter().enumerate() { kernel.xaudio.dropped += 1;
for (idx, t) in slot.runqueue.iter().enumerate() { return;
if matches!(t.state, HwState::Blocked(_)) {
victim = Some(xenia_cpu::ThreadRef::new(hw_id as u8, idx as u16));
break 'outer_blocked;
}
}
} }
}
let Some(target_ref) = victim else {
let _ = kernel.xaudio.take_next();
kernel.xaudio.dropped += 1;
return;
}; };
let t = kernel.scheduler.thread_mut(target_ref); let t = kernel.scheduler.thread_mut(target_ref);
let prev_state = t.state.clone(); let prev_state = t.state.clone();
match prev_state { match prev_state {
HwState::Ready => {}
HwState::Blocked(reason) => { HwState::Blocked(reason) => {
t.state = HwState::ServicingIrq(reason); t.state = HwState::ServicingIrq(reason);
} }
_ => unreachable!("victim selection above filtered out other variants"), // Worker normally re-blocks on synthetic via the LR_HALT restore
// path. If we ever observe it elsewhere (e.g. deadlock force-wake
// moved it to Ready), drop this fire rather than corrupt state —
// the next tick will retry once the restore path runs.
_ => {
let _ = kernel.xaudio.take_next();
kernel.xaudio.dropped += 1;
return;
}
} }
let _ = kernel.xaudio.take_next(); let _ = kernel.xaudio.take_next();
@@ -3331,7 +3353,7 @@ fn try_inject_audio_callback(kernel: &mut xenia_kernel::KernelState) {
idx = target_ref.idx, idx = target_ref.idx,
callback = format_args!("{:#010x}", client.callback_pc), callback = format_args!("{:#010x}", client.callback_pc),
wrapped = format_args!("{:#010x}", client.wrapped_callback_arg), wrapped = format_args!("{:#010x}", client.wrapped_callback_arg),
"xaudio callback: injecting" "xaudio callback: injecting (dedicated worker)"
); );
} }

View File

@@ -3106,9 +3106,89 @@ fn xaudio_register_render_driver(ctx: &mut PpcContext, mem: &GuestMemory, state:
if driver_ptr != 0 { if driver_ptr != 0 {
mem.write_u32(driver_ptr, driver_id); mem.write_u32(driver_ptr, driver_id);
} }
// AUDIT-032 Plan B: spawn a dedicated audio-worker guest thread for
// this client and park it on a synthetic `WaitAny` handle so
// `try_inject_audio_callback` can flip it to `ServicingIrq` when
// a buffer-complete fire is queued. Mirrors xenia-canary's
// `apu/audio_system.cc:84-159` host worker without spawning a host OS
// thread. Failure here is non-fatal (the client is still registered;
// the periodic ticker will queue fires that the round prologue
// simply drops with `dropped += 1` because there's no worker to pump).
let worker_stack = 0x10_000u32; // 64 KiB — half of canary's 128 KiB.
let worker_ref_handle = if let Some(image) =
crate::thread::allocate_thread_image(state, mem, worker_stack, 0)
{
use std::sync::atomic::Ordering;
let tid = state.next_thread_id.fetch_add(1, Ordering::Relaxed);
let handle = state.alloc_handle_for(KernelObject::Thread {
id: tid,
hw_id: None,
exit_code: None,
waiters: Vec::new(),
});
let tls_slot_count = state.next_tls_index.load(Ordering::Relaxed);
let params = SpawnParams {
entry: callback_pc,
start_context: wrapped,
stack_base: image.stack_base,
stack_size: image.stack_size,
pcr_base: image.pcr_base,
tls_base: image.tls_base,
thread_handle: handle,
guest_tid: tid,
create_suspended: true,
is_initial: false,
tls_slot_count,
affinity_mask: 0,
priority: 0,
ideal_processor: None,
};
match state.scheduler.spawn(params, &mut GuestMemoryPcr(mem)) {
Ok(hw_id) => {
if let Some(KernelObject::Thread { hw_id: slot, .. }) = state.objects.get_mut(&handle) {
*slot = Some(hw_id);
}
// Flip from `Blocked(Suspended)` (set by spawn for
// create_suspended=true) to a `Blocked(WaitAny)` on a
// synthetic handle never owned by any kernel object.
// `wake_eligible_waiters` looks the handle up in
// `state.objects` and returns early on miss, so this
// park-state is only released by audio-callback injection.
let park_handle = crate::xaudio::synthetic_park_handle(index);
let target = ThreadRef::new(
hw_id,
(state.scheduler.slots[hw_id as usize].runqueue.len() - 1) as u16,
);
// Both Blocked(Suspended) (set by spawn) and
// Blocked(WaitAny) are non-runnable, so the
// `non_empty_runnable` bitmask is unchanged — no need to
// call the private `recompute_slot_runnable` helper.
state.scheduler.thread_mut(target).state =
xenia_cpu::scheduler::HwState::Blocked(BlockReason::WaitAny {
handles: vec![park_handle],
deadline: None,
});
Some((handle, target))
}
Err(_) => {
tracing::warn!("XAudioRegisterRenderDriverClient: spawn failed for worker idx={}", index);
None
}
}
} else {
tracing::warn!("XAudioRegisterRenderDriverClient: allocate_thread_image failed for worker idx={}", index);
None
};
if let Some((h, r)) = worker_ref_handle {
state.xaudio.worker_handles[index] = Some(h);
state.xaudio.worker_refs[index] = Some(r);
}
tracing::info!( tracing::info!(
"XAudioRegisterRenderDriverClient: index={} callback={:#010x} arg={:#010x} wrapped={:#010x} driver={:#010x}", "XAudioRegisterRenderDriverClient: index={} callback={:#010x} arg={:#010x} wrapped={:#010x} driver={:#010x} worker_handle={:?}",
index, callback_pc, callback_arg, wrapped, driver_id, index, callback_pc, callback_arg, wrapped, driver_id,
state.xaudio.worker_handles[index],
); );
ctx.gpr[3] = 0; ctx.gpr[3] = 0;
} }

View File

@@ -139,13 +139,17 @@ pub struct KernelState {
/// graphics interrupts is enforced by the injector's /// graphics interrupts is enforced by the injector's
/// `is_in_callback()` guard. /// `is_in_callback()` guard.
pub xaudio: crate::xaudio::XAudioState, pub xaudio: crate::xaudio::XAudioState,
/// Default false. When true, the round prologue runs the XAudio /// AUDIT-032 Plan B (default true). When true, the round prologue
/// ticker + `try_inject_audio_callback`. Off by default because the /// runs the XAudio ticker + `try_inject_audio_callback`. Pre-fix this
/// callback firing shifts the boot trajectory under Sylpheed /// was off by default because injection used random-victim selection
/// (regresses `swaps=2`→`1` and 12×s `imports`), which would break /// (APUBUG-PRODUCER-001 HW-thread hijack) which corrupted unrelated
/// the `sylpheed_n*m.json` lockstep goldens. Flipped on by /// state. With dedicated per-client audio workers spawned at
/// `--xaudio-tick` / `XENIA_XAUDIO_TICK=1` for the diagnostic /// `XAudioRegisterRenderDriverClient`, injection only ever runs on
/// producer-hunt path. /// the registered worker so it is safe to leave on. Lockstep goldens
/// `sylpheed_n*m.json` will drift on this fix and need re-baselining
/// (handled out-of-band). The `--xaudio-tick` flag / `XENIA_XAUDIO_TICK=1`
/// env var now act as explicit-override; flipping it off restores the
/// pre-fix path (no audio callbacks fire at all).
pub xaudio_tick_enabled: bool, pub xaudio_tick_enabled: bool,
/// Per-handle refcount. Since `NtDuplicateObject` aliases (returns the /// Per-handle refcount. Since `NtDuplicateObject` aliases (returns the
/// source handle value as the "new" handle rather than minting a fresh /// source handle value as the "new" handle rather than minting a fresh
@@ -306,7 +310,11 @@ impl KernelState {
ui: None, ui: None,
interrupts: crate::interrupts::InterruptState::default(), interrupts: crate::interrupts::InterruptState::default(),
xaudio: crate::xaudio::XAudioState::default(), xaudio: crate::xaudio::XAudioState::default(),
xaudio_tick_enabled: false, // AUDIT-032: dedicated audio worker per client (Plan B in
// `xaudio_register_render_driver`) — not victim hijack, so safe
// to enable by default. Previously gated off because the
// random-victim selection corrupted unrelated thread state.
xaudio_tick_enabled: true,
handle_refcount: HashMap::new(), handle_refcount: HashMap::new(),
pending_timer_fires: Vec::new(), pending_timer_fires: Vec::new(),
audit: HandleAudit::default(), audit: HandleAudit::default(),

View File

@@ -16,10 +16,30 @@
use std::collections::VecDeque; use std::collections::VecDeque;
use std::time::{Duration, Instant}; use std::time::{Duration, Instant};
use xenia_cpu::ThreadRef;
/// Mirrors [audio_system.h:30](../../../../xenia-canary/src/xenia/apu/audio_system.h#L30) /// Mirrors [audio_system.h:30](../../../../xenia-canary/src/xenia/apu/audio_system.h#L30)
/// `kMaximumClientCount = 8`. /// `kMaximumClientCount = 8`.
pub const XAUDIO_MAX_CLIENTS: usize = 8; pub const XAUDIO_MAX_CLIENTS: usize = 8;
/// AUDIT-032 Plan B: synthetic kernel-handle base for the dedicated audio
/// worker threads' parking `WaitAny`. These handles are deliberately OUTSIDE
/// the normal allocator range (which starts at `0x1000` and grows by 4 in
/// [`crate::state::KernelState::alloc_handle`]) so a `state.objects` lookup
/// always misses — meaning [`crate::exports::wake_eligible_waiters`] will
/// never spuriously wake a worker. The only legitimate path that flips a
/// worker out of `Blocked(WaitAny[SYNTHETIC])` is the audio-callback
/// injection in `try_inject_audio_callback` (state→`ServicingIrq`) and the
/// `LR_HALT` saved-context restore (state→`Blocked` again). One handle per
/// client slot keeps wait lists per-worker (defensive — `wake_eligible` is a
/// no-op anyway).
pub const XAUDIO_SYNTHETIC_HANDLE_BASE: u32 = 0xF000_0000;
/// Compute the synthetic park-handle for client slot `i`.
pub const fn synthetic_park_handle(i: usize) -> u32 {
XAUDIO_SYNTHETIC_HANDLE_BASE | (i as u32)
}
/// Source code stamped into [`crate::SavedCallbackCtx::source`] when an /// Source code stamped into [`crate::SavedCallbackCtx::source`] when an
/// audio callback is injected. Distinct from graphics-interrupt sources /// audio callback is injected. Distinct from graphics-interrupt sources
/// (`INTERRUPT_SOURCE_VSYNC = 0`, `INTERRUPT_SOURCE_CP = 1`) so logs and /// (`INTERRUPT_SOURCE_VSYNC = 0`, `INTERRUPT_SOURCE_CP = 1`) so logs and
@@ -59,6 +79,19 @@ pub struct XAudioState {
pub accumulator: u64, pub accumulator: u64,
pub last_instr_count: u64, pub last_instr_count: u64,
pub last_instant: Option<Instant>, pub last_instant: Option<Instant>,
/// AUDIT-032 Plan B: dedicated audio-worker thread per client slot.
/// Mirrors xenia-canary's `apu/audio_system.cc:84-159` host worker but
/// using a guest-side parked thread instead — registered at
/// `XAudioRegisterRenderDriverClient` time and lazily looked up by
/// `try_inject_audio_callback` via `scheduler.find_by_handle`. The
/// worker is parked in `Blocked(WaitAny[SYNTHETIC_HANDLE])`; injection
/// flips it to `ServicingIrq` and the `LR_HALT` restore path puts it
/// back to `Blocked`. Each slot also remembers the kernel handle so
/// `find_by_handle` can resolve a fresh `ThreadRef` after slot
/// pruning/reordering. Phantom-typed for callers that don't link
/// `xenia_cpu` (none currently) to keep this self-contained.
pub worker_handles: [Option<u32>; XAUDIO_MAX_CLIENTS],
pub worker_refs: [Option<ThreadRef>; XAUDIO_MAX_CLIENTS],
} }
impl Default for XAudioState { impl Default for XAudioState {
@@ -71,6 +104,8 @@ impl Default for XAudioState {
accumulator: 0, accumulator: 0,
last_instr_count: 0, last_instr_count: 0,
last_instant: None, last_instant: None,
worker_handles: [None; XAUDIO_MAX_CLIENTS],
worker_refs: [None; XAUDIO_MAX_CLIENTS],
} }
} }
} }
@@ -90,6 +125,12 @@ impl XAudioState {
if index < XAUDIO_MAX_CLIENTS { if index < XAUDIO_MAX_CLIENTS {
self.clients[index] = None; self.clients[index] = None;
self.pending.retain(|&i| i != index); self.pending.retain(|&i| i != index);
// Worker thread (if any) stays parked on its synthetic handle
// — Sylpheed never re-registers, so leaving it Blocked is
// simpler than wiring a clean teardown. Clear our refs so a
// future `register` rebuilds them.
self.worker_handles[index] = None;
self.worker_refs[index] = None;
} }
} }