Milestone-2 (intro video dat/movie/ADV.wmv) audio path + major RE tooling.

XMA AUDIO (built, working, deterministic, tested):
- APU MMIO 0x7FEA0000 + 320x64B register-mapped context array; real
  XMACreateContext/Release (xma.rs); real FFmpeg xma2 decoder
  XMA_CONTEXT_DATA->S16BE PCM (xma_decode.rs, xma2_codec.rs, ffmpeg-sys-next).
  Decode runs synchronously on the CPU thread (deterministic, no host thread).
- Audio-worker scheduler fix (main.rs LR_HALT restore + scheduler.rs): the
  XAudio render-callback worker was wrongly exited after ~2 deliveries; now
  survives -> guest drives XMA decode (70 kicks).
- XAudioSubmitRenderDriverFrame made faithful. Golden sylpheed_n50m
  re-baselined; tests pass.

RE TOOLING:
- Runtime indirect-dispatch recorder (dispatch_rec.rs): records
  (call-site->target, r3, lr); env-gated XENIA_DISPATCH_REC, filters
  XENIA_DISPATCH_REC_TARGETS/_SITES; deterministic, observe-only.
- Repaired static analyzer (vtables.rs): vtable extraction silently fragmented
  vtables with non-function head slots (missed the XMV engine vtable). Fixed
  via vptr-write-anchoring -> engine fully typed (vtables 722->1150 on rebuild).
- Fixed probe HEISENBUG (main.rs run_superblock): --audit-pc-probe-hex/--mem-watch
  no longer disable superblock chaining; probes fire inside the chain loop ->
  scheduling identical armed-vs-unarmed, movie subsystem now observable. Fixed
  a --quiet bug swallowing armed trace reports.

VIDEO still doesn't play (B, guest-side): the XMV engine never issues
begin-playback (sub_825076F0, vtable 0x8200a1e8 slot21) -> never primes ->
2000ms timeout. Narrowed to the ARM2 engine-setup wrappers; no honest our-side
gate-fix (masking forbidden).

See HANDOFF-iterate-4A-milestone2.md for new-machine setup (incl. the FFmpeg
apt deps + sylpheed.db regeneration) and continuation pointers.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
382 lines
14 KiB
Rust
//! XAudio render-driver-client registration + buffer-complete callback loop
//! (canary parity: `xenia/apu/audio_system.cc`).
//!
//! Replaces the host-thread + per-client-semaphore + XAudio2 driver layer with
//! a periodic ticker that enqueues a "buffer complete" fire for each
//! registered client at the audio frame rate (256 samples / 48 kHz ≈ 5.33 ms).
//! The injection path in `xenia-app` reuses the same [`crate::SavedCallbackCtx`]
//! plumbing the graphics-interrupt path uses — only one callback runs at a
//! time across either subsystem, gated by `interrupts.is_in_callback()`.
//!
//! Lockstep mode uses an instruction-count proxy
//! ([`XAUDIO_INSTR_PERIOD`]) so `--stable-digest` stays bit-exact;
//! `--parallel` uses wall-clock ([`XAUDIO_PERIOD`]) — same dual-mode pattern
//! as KRNBUG-D08 v-sync.

use std::collections::VecDeque;
use std::time::{Duration, Instant};

use xenia_cpu::ThreadRef;

/// Mirrors [audio_system.h:30](../../../../xenia-canary/src/xenia/apu/audio_system.h#L30)
/// `kMaximumClientCount = 8`.
pub const XAUDIO_MAX_CLIENTS: usize = 8;

/// AUDIT-032 Plan B: synthetic kernel-handle base for the dedicated audio
/// worker threads' parking `WaitAny`. These handles are deliberately OUTSIDE
/// the normal allocator range (which starts at `0x1000` and grows by 4 in
/// [`crate::state::KernelState::alloc_handle`]) so a `state.objects` lookup
/// always misses — meaning [`crate::exports::wake_eligible_waiters`] will
/// never spuriously wake a worker. The only legitimate path that flips a
/// worker out of `Blocked(WaitAny[SYNTHETIC])` is the audio-callback
/// injection in `try_inject_audio_callback` (state→`ServicingIrq`) and the
/// `LR_HALT` saved-context restore (state→`Blocked` again). One handle per
/// client slot keeps wait lists per-worker (defensive — `wake_eligible` is a
/// no-op anyway).
pub const XAUDIO_SYNTHETIC_HANDLE_BASE: u32 = 0xF000_0000;

/// The scheduler's deadlock force-wake skips waiters parked solely on
/// handles at/above [`xenia_cpu::scheduler::SYNTHETIC_PARK_HANDLE_FLOOR`]
/// so it never destroys a parked audio worker. Keep these in lockstep:
/// every `synthetic_park_handle` must fall inside that protected range.
const _: () = assert!(
    XAUDIO_SYNTHETIC_HANDLE_BASE >= xenia_cpu::scheduler::SYNTHETIC_PARK_HANDLE_FLOOR
);

/// Compute the synthetic park-handle for client slot `i`.
pub const fn synthetic_park_handle(i: usize) -> u32 {
    XAUDIO_SYNTHETIC_HANDLE_BASE | (i as u32)
}

/// Source code stamped into [`crate::SavedCallbackCtx::source`] when an
/// audio callback is injected. Distinct from graphics-interrupt sources
/// (`INTERRUPT_SOURCE_VSYNC = 0`, `INTERRUPT_SOURCE_CP = 1`) so logs and
/// the audit trail can disambiguate.
pub const INTERRUPT_SOURCE_AUDIO: u32 = 0x100;

/// Lockstep instruction-count period. Picked so the ratio against
/// [`crate::interrupts::VSYNC_INSTR_PERIOD`] (`150_000`) ≈ 16.67 ms / 5.33 ms,
/// matching canary's 256 samples / 48 kHz audio cadence.
pub const XAUDIO_INSTR_PERIOD: u64 = 48_000;

/// Wall-clock period under `--parallel`. 256 / 48000 s = 5.333… ms.
pub const XAUDIO_PERIOD: Duration = Duration::from_nanos(5_333_333);

/// Bound on the pending-fires FIFO. Stops a long-running export from
/// queueing unbounded callbacks while injection is starved.
pub const XAUDIO_QUEUE_CAP: usize = 16;

#[derive(Debug, Clone, Copy)]
pub struct XAudioClient {
    pub callback_pc: u32,
    pub callback_arg: u32,
    /// Guest pointer to the heap-allocated 4-byte buffer holding
    /// `callback_arg` big-endian — passed as r3 to the guest callback,
    /// matching canary's
    /// [audio_system.cc:225-228](../../../../xenia-canary/src/xenia/apu/audio_system.cc#L225-L228)
    /// + [audio_system.cc:139-141](../../../../xenia-canary/src/xenia/apu/audio_system.cc#L139-L141).
    pub wrapped_callback_arg: u32,
    /// Count of frames the guest has handed us via
    /// `XAudioSubmitRenderDriverFrame` for this client. Canary's
    /// `AudioSystem::SubmitFrame` forwards the sample buffer to the client's
    /// driver, whose playback completion later releases the client semaphore
    /// — the pacing our callback ticker emulates. The guest mixer
    /// (`sub_824DC350`) discards SubmitFrame's return and reads no field it
    /// writes, so this counter is purely observational (logging / liveness),
    /// never read back by the guest. Deterministic: incremented only inside
    /// the guest-driven export call.
    pub submitted_frames: u64,
}

#[derive(Debug)]
pub struct XAudioState {
    pub clients: [Option<XAudioClient>; XAUDIO_MAX_CLIENTS],
    pub pending: VecDeque<usize>,
    pub delivered: u64,
    pub dropped: u64,
    pub accumulator: u64,
    pub last_instr_count: u64,
    pub last_instant: Option<Instant>,
    /// AUDIT-032 Plan B: dedicated audio-worker thread per client slot.
    /// Mirrors xenia-canary's `apu/audio_system.cc:84-159` host worker but
    /// using a guest-side parked thread instead — registered at
    /// `XAudioRegisterRenderDriverClient` time and lazily looked up by
    /// `try_inject_audio_callback` via `scheduler.find_by_handle`. The
    /// worker is parked in `Blocked(WaitAny[SYNTHETIC_HANDLE])`; injection
    /// flips it to `ServicingIrq` and the `LR_HALT` restore path puts it
    /// back to `Blocked`. Each slot also remembers the kernel handle so
    /// `find_by_handle` can resolve a fresh `ThreadRef` after slot
    /// pruning/reordering. Phantom-typed for callers that don't link
    /// `xenia_cpu` (none currently) to keep this self-contained.
    pub worker_handles: [Option<u32>; XAUDIO_MAX_CLIENTS],
    pub worker_refs: [Option<ThreadRef>; XAUDIO_MAX_CLIENTS],
}

impl Default for XAudioState {
    fn default() -> Self {
        Self {
            clients: [None; XAUDIO_MAX_CLIENTS],
            pending: VecDeque::new(),
            delivered: 0,
            dropped: 0,
            accumulator: 0,
            last_instr_count: 0,
            last_instant: None,
            worker_handles: [None; XAUDIO_MAX_CLIENTS],
            worker_refs: [None; XAUDIO_MAX_CLIENTS],
        }
    }
}

impl XAudioState {
    pub fn register(&mut self, client: XAudioClient) -> Option<usize> {
        for (i, slot) in self.clients.iter_mut().enumerate() {
            if slot.is_none() {
                *slot = Some(client);
                return Some(i);
            }
        }
        None
    }

    pub fn unregister(&mut self, index: usize) {
        if index < XAUDIO_MAX_CLIENTS {
            self.clients[index] = None;
            self.pending.retain(|&i| i != index);
            // Worker thread (if any) stays parked on its synthetic handle
            // — Sylpheed never re-registers, so leaving it Blocked is
            // simpler than wiring a clean teardown. Clear our refs so a
            // future `register` rebuilds them.
            self.worker_handles[index] = None;
            self.worker_refs[index] = None;
        }
    }

    pub fn get(&self, index: usize) -> Option<XAudioClient> {
        self.clients.get(index).copied().flatten()
    }

    /// Faithful counterpart to canary `AudioSystem::SubmitFrame`: the guest
    /// driver client `index` handed us one frame of samples. Canary forwards
    /// `samples` to the client's `AudioDriver`, whose playback-completion
    /// callback later releases the client semaphore — the buffer-consumed
    /// pacing our [`tick_instr`]/[`try_inject_audio_callback`] path already
    /// emulates. SubmitFrame itself returns void and the guest mixer
    /// (`sub_824DC350`) reads no field from it, so all we faithfully need to
    /// do is validate the client and account the frame. Returns `true` iff
    /// `index` is a registered client (canary submits silence / warns
    /// otherwise). Deterministic — only the guest-driven export mutates this.
    pub fn record_submit(&mut self, index: usize) -> bool {
        match self.clients.get_mut(index) {
            Some(Some(c)) => {
                c.submitted_frames = c.submitted_frames.saturating_add(1);
                true
            }
            _ => false,
        }
    }

    pub fn submitted_frames(&self, index: usize) -> u64 {
        self.clients
            .get(index)
            .copied()
            .flatten()
            .map(|c| c.submitted_frames)
            .unwrap_or(0)
    }

    pub fn any_registered(&self) -> bool {
        self.clients.iter().any(|c| c.is_some())
    }

    fn enqueue_all_active(&mut self) {
        for i in 0..XAUDIO_MAX_CLIENTS {
            if self.clients[i].is_none() {
                continue;
            }
            if self.pending.len() >= XAUDIO_QUEUE_CAP {
                self.dropped += 1;
                return;
            }
            self.pending.push_back(i);
        }
    }

    pub fn peek_next(&self) -> Option<usize> {
        self.pending.front().copied()
    }

    pub fn take_next(&mut self) -> Option<usize> {
        self.pending.pop_front()
    }

    /// Lockstep instruction-count ticker. Idempotently advances the
    /// accumulator from `last_instr_count` to `current_instr_count` and
    /// enqueues one fire-set per full [`XAUDIO_INSTR_PERIOD`] crossed,
    /// capped at [`XAUDIO_QUEUE_CAP`] fire-sets per call.
    /// Returns `true` iff at least one full period was crossed. Fires can
    /// still be dropped (and counted in `dropped`) when the queue is full.
    pub fn tick_instr(&mut self, current_instr_count: u64) -> bool {
        if !self.any_registered() {
            self.last_instr_count = current_instr_count;
            self.accumulator = 0;
            return false;
        }
        let delta = current_instr_count.saturating_sub(self.last_instr_count);
        self.last_instr_count = current_instr_count;
        self.accumulator = self.accumulator.saturating_add(delta);
        if self.accumulator < XAUDIO_INSTR_PERIOD {
            return false;
        }
        let periods = self.accumulator / XAUDIO_INSTR_PERIOD;
        self.accumulator %= XAUDIO_INSTR_PERIOD;
        let to_fire = (periods as usize).min(XAUDIO_QUEUE_CAP);
        for _ in 0..to_fire {
            self.enqueue_all_active();
        }
        true
    }

    /// Wall-clock ticker for `--parallel`. First call seeds the anchor
    /// (no fire). Subsequent calls fire `floor(elapsed / XAUDIO_PERIOD)`
    /// fire-sets and advance the anchor by that many full periods.
    pub fn tick_wallclock(&mut self) -> bool {
        if !self.any_registered() {
            self.last_instant = None;
            return false;
        }
        let now = Instant::now();
        let anchor = match self.last_instant {
            Some(t) => t,
            None => {
                self.last_instant = Some(now);
                return false;
            }
        };
        let elapsed = now.saturating_duration_since(anchor);
        let period_ns = XAUDIO_PERIOD.as_nanos() as u64;
        let elapsed_ns = elapsed.as_nanos() as u64;
        let periods = elapsed_ns / period_ns;
        if periods == 0 {
            return false;
        }
        let advance = Duration::from_nanos(periods * period_ns);
        self.last_instant = Some(anchor + advance);
        let to_fire = (periods as usize).min(XAUDIO_QUEUE_CAP);
        for _ in 0..to_fire {
            self.enqueue_all_active();
        }
        true
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    fn dummy_client(arg: u32) -> XAudioClient {
        XAudioClient {
            callback_pc: 0x8200_0000 + arg,
            callback_arg: arg,
            wrapped_callback_arg: 0x4000_0000 + arg,
            submitted_frames: 0,
        }
    }

    #[test]
    fn register_assigns_first_free_slot() {
        let mut s = XAudioState::default();
        let i0 = s.register(dummy_client(1)).unwrap();
        let i1 = s.register(dummy_client(2)).unwrap();
        assert_eq!(i0, 0);
        assert_eq!(i1, 1);
        assert_eq!(s.get(0).unwrap().callback_arg, 1);
        assert_eq!(s.get(1).unwrap().callback_arg, 2);
    }

    #[test]
    fn unregister_clears_slot_and_pending() {
        let mut s = XAudioState::default();
        let i = s.register(dummy_client(1)).unwrap();
        s.pending.push_back(i);
        s.unregister(i);
        assert!(s.get(i).is_none());
        assert!(s.pending.is_empty());
    }

    #[test]
    fn register_returns_none_when_full() {
        let mut s = XAudioState::default();
        for k in 0..XAUDIO_MAX_CLIENTS {
            assert!(s.register(dummy_client(k as u32)).is_some());
        }
        assert!(s.register(dummy_client(99)).is_none());
    }

    #[test]
    fn tick_instr_no_clients_does_not_fire() {
        let mut s = XAudioState::default();
        assert!(!s.tick_instr(XAUDIO_INSTR_PERIOD * 10));
        assert!(s.pending.is_empty());
    }

    #[test]
    fn tick_instr_fires_at_period() {
        let mut s = XAudioState::default();
        let i = s.register(dummy_client(7)).unwrap();
        assert!(!s.tick_instr(XAUDIO_INSTR_PERIOD - 1));
        assert!(s.pending.is_empty());
        assert!(s.tick_instr(XAUDIO_INSTR_PERIOD));
        assert_eq!(s.peek_next(), Some(i));
    }

    #[test]
    fn tick_instr_drains_multiple_periods_in_one_call() {
        let mut s = XAudioState::default();
        let i = s.register(dummy_client(7)).unwrap();
        assert!(s.tick_instr(XAUDIO_INSTR_PERIOD * 4));
        assert_eq!(s.pending.len(), 4);
        for _ in 0..4 {
            assert_eq!(s.take_next(), Some(i));
        }
        assert!(s.pending.is_empty());
    }

    #[test]
    fn tick_instr_fires_for_each_registered_client() {
        let mut s = XAudioState::default();
        let a = s.register(dummy_client(1)).unwrap();
        let b = s.register(dummy_client(2)).unwrap();
        assert!(s.tick_instr(XAUDIO_INSTR_PERIOD));
        assert_eq!(s.pending.len(), 2);
        assert_eq!(s.take_next(), Some(a));
        assert_eq!(s.take_next(), Some(b));
    }

    #[test]
    fn tick_instr_caps_queue_growth() {
        let mut s = XAudioState::default();
        s.register(dummy_client(1)).unwrap();
        s.tick_instr(XAUDIO_INSTR_PERIOD * (XAUDIO_QUEUE_CAP as u64 + 50));
        assert!(s.pending.len() <= XAUDIO_QUEUE_CAP);
    }

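    // Illustrative sketch, not part of the original suite: the cap test
    // above only checks the FIFO bound. This one pins down what appears to
    // be the intended drop accounting once the FIFO is already full. It
    // assumes a single client and that nothing drains `pending` in between.
    #[test]
    fn tick_instr_counts_drops_when_queue_is_full() {
        let mut s = XAudioState::default();
        s.register(dummy_client(1)).unwrap();
        // Fill the FIFO exactly: XAUDIO_QUEUE_CAP periods, one fire each.
        assert!(s.tick_instr(XAUDIO_INSTR_PERIOD * XAUDIO_QUEUE_CAP as u64));
        assert_eq!(s.pending.len(), XAUDIO_QUEUE_CAP);
        assert_eq!(s.dropped, 0);
        // One more period elapses with the FIFO still full: the fire-set is
        // dropped and counted, and the ticker still reports the period.
        assert!(s.tick_instr(XAUDIO_INSTR_PERIOD * (XAUDIO_QUEUE_CAP as u64 + 1)));
        assert_eq!(s.pending.len(), XAUDIO_QUEUE_CAP);
        assert_eq!(s.dropped, 1);
    }
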
    #[test]
    fn tick_wallclock_first_call_seeds_anchor() {
        let mut s = XAudioState::default();
        s.register(dummy_client(1)).unwrap();
        assert!(!s.tick_wallclock());
        assert!(s.pending.is_empty());
        assert!(s.last_instant.is_some());
    }

    #[test]
    fn tick_wallclock_fires_after_period() {
        let mut s = XAudioState::default();
        let i = s.register(dummy_client(1)).unwrap();
        s.tick_wallclock();
        std::thread::sleep(XAUDIO_PERIOD + Duration::from_millis(2));
        assert!(s.tick_wallclock());
        assert!(!s.pending.is_empty());
        assert_eq!(s.peek_next(), Some(i));
    }
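
    // Illustrative sketches, not part of the original suite: the two tests
    // below exercise `record_submit` and `synthetic_park_handle`, which the
    // tests above leave uncovered. They rely only on items defined in this
    // file; the test names are made up for this example.
    #[test]
    fn record_submit_counts_only_registered_clients() {
        let mut s = XAudioState::default();
        let i = s.register(dummy_client(1)).unwrap();
        assert!(s.record_submit(i));
        assert!(s.record_submit(i));
        assert_eq!(s.submitted_frames(i), 2);
        // Empty and out-of-range slots are rejected and report zero frames.
        assert!(!s.record_submit(i + 1));
        assert!(!s.record_submit(XAUDIO_MAX_CLIENTS));
        assert_eq!(s.submitted_frames(i + 1), 0);
    }

    #[test]
    fn synthetic_park_handles_are_distinct_per_slot() {
        for i in 0..XAUDIO_MAX_CLIENTS {
            let h = synthetic_park_handle(i);
            // Each slot maps to base + slot index, so handles never collide
            // and never fall below the synthetic base.
            assert!(h >= XAUDIO_SYNTHETIC_HANDLE_BASE);
            assert_eq!(h - XAUDIO_SYNTHETIC_HANDLE_BASE, i as u32);
        }
    }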
}