fix(xam): XAMBUG-PRODUCER-001 — XamTaskSchedule spawns a real guest thread
Replaces the no-op stub at xam.rs:204 with a canary-faithful implementation
mirroring xenia-canary/src/xenia/kernel/xam/xam_task.cc:43-80. Allocates a
ThreadImage, allocates a KernelObject::Thread handle, and routes through
Scheduler::spawn with entry=callback and start_context=message_ptr (canary's
third positional XThread ctor arg). Stack size = max(0x4000, page-aligned
0x10_0000).

Producer-hypothesis outcome (500M --trace-handles-focus run): the call site at
0x824a9a10 is never reached during this boot horizon, so XamTaskSchedule cannot
be the missing producer for the 3 parked Event/Manual handles (0x1004, 0x100c,
0x15e4). The fix still lands: the stub was a real correctness bug that would
manifest the moment the boot advances past the current deadlock.

Next candidate per audit-findings.md: XAudioRegisterRenderDriverClient.

- Workspace tests: 561 → 562 green (new test
  xam::tests::xam_task_schedule_spawns_real_thread).
- --stable-digest -n 100M: instructions=100000002 unchanged from baseline;
  lockstep determinism preserved.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@@ -3974,3 +3974,50 @@ likely producer-class candidates for next session:
 4. **GPUBUG-FETCH-PATCH-001**: re-enable the PM4_TYPE0
    fetch-constant patch via a side-channel (GpuCommand variant)
    when draws actually start firing; relevant for bloom/blur N+1.
+
+## Producer-hunt session 2026-05-03
+
+### XAMBUG-PRODUCER-001: XamTaskSchedule was a no-op stub
+
+**Status:** fixed. Hypothesis falsified for the parked-waiter set.
+
+**Site:** `crates/xenia-kernel/src/xam.rs:204` (pre-fix).
+**Canary parity:** `xenia-canary/src/xenia/kernel/xam/xam_task.cc:43-80`.
+
+The pre-fix stub allocated a handle, logged it, and returned
+`STATUS_SUCCESS`; it never spawned a thread. It is replaced with a
+canary-faithful implementation that allocates a `ThreadImage`, allocates
+a `KernelObject::Thread` handle, and routes through
+`Scheduler::spawn` with `entry=callback`, `start_context=message_ptr`
+(canary's third positional `XThread` arg). The stack is sized as
+`max(0x4000, page-aligned 0x10_0000)`.
+
+**Verification:**
+- Unit test `xam::tests::xam_task_schedule_spawns_real_thread`
+  confirms the spawned thread's `pc == callback` and `gpr[3] == message_ptr`.
+- Workspace tests: 561 → 562 green.
+- `--stable-digest -n 100M` lockstep: `instructions=100000002`
+  unchanged from baseline (interpreter determinism preserved).
+- `--trace-handles-focus=0x1004,0x100c,0x15e4 -n 500M`: no
+  `kernel.calls{name=XamTaskSchedule}` counter appears, so the call
+  site at `0x824a9a10` is **never reached** within 500M
+  instructions. Boot stalls earlier on the parked handles.
+
+**Outcome:** the 3 focus handles still show
+`signal_attempts=0 (primary=0, ghost=0)` after 500M instructions.
+The XAM-task hypothesis is therefore **falsified for this run**:
+`XamTaskSchedule` cannot be the missing producer for these specific
+handles, because Sylpheed's only call site to it isn't reached
+before the deadlock.
+
+The fix lands regardless: the stub was a real correctness bug that
+will manifest the moment the call site is reached (post-deadlock-resolution).
+
+### Recommended next producer candidate
+
+`XAudioRegisterRenderDriverClient` (currently a one-shot stub, called
+once according to the metric counter). Audio buffer-complete callbacks are a
+known signal source on Xbox 360 audio engines; the stub may be
+hiding the producer for one of the 3 handles. If that lead is also
+falsified, escalate to file I/O completion (`signal_io_completion_event`
+is already implemented but possibly mis-routed) or Timer DPC delivery.
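The stack-size rule quoted in the notes, `max(0x4000, page-aligned 0x10_0000)`, can be checked in isolation. The sketch below is only an illustration: the constant and function names are hypothetical, and the 4 KiB page size is an assumption read off the `0xFFF` mask used in the handler.

```rust
/// Page size assumed by this sketch (4 KiB, matching the 0xFFF mask).
const PAGE_SIZE: u32 = 0x1000;
/// Minimum task stack, as quoted in the notes.
const MIN_STACK: u32 = 0x4000;
/// Default task stack before alignment, as quoted in the notes.
const DEFAULT_STACK: u32 = 0x10_0000;

/// Round `size` up to the next multiple of `PAGE_SIZE`.
/// `PAGE_SIZE` is a power of two, so masking off the low bits is enough.
/// Note: this would overflow for sizes within one page of `u32::MAX`.
fn page_align_up(size: u32) -> u32 {
    (size + (PAGE_SIZE - 1)) & !(PAGE_SIZE - 1)
}

/// The larger of the floor and the page-aligned request.
fn task_stack_size(requested: u32) -> u32 {
    MIN_STACK.max(page_align_up(requested))
}
```

With the default request the alignment is a no-op and the floor never wins, so `task_stack_size(DEFAULT_STACK)` stays at `0x10_0000`; the floor only matters for requests below `0x4000`.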
@@ -1,7 +1,10 @@
 //! HLE kernel export implementations (xam.xex).
 
-use crate::state::{KernelState, ModuleId};
+use crate::objects::KernelObject;
+use crate::state::{GuestMemoryPcr, KernelState, ModuleId};
+use crate::thread::allocate_thread_image;
 use xenia_cpu::PpcContext;
+use xenia_cpu::scheduler::SpawnParams;
 use xenia_memory::{GuestMemory, MemoryAccess};
 
 pub fn register_exports(state: &mut KernelState) {
@@ -201,10 +204,85 @@ fn xam_loader_terminate_title(ctx: &mut PpcContext, _mem: &GuestMemory, _state:
 
 // ===== Task =====
 
-fn xam_task_schedule(ctx: &mut PpcContext, _mem: &GuestMemory, state: &mut KernelState) {
-    let handle = state.alloc_handle();
-    tracing::info!("XamTaskSchedule: handle={:#x}", handle);
-    ctx.gpr[3] = 0;
+/// `XamTaskSchedule(callback, message, optional_ptr, handle_ptr_out)`:
+/// spawn a guest thread that runs `callback(message)` asynchronously.
+/// Mirrors xenia-canary's `XamTaskSchedule_entry` (xam_task.cc:43-80):
+/// stack is `max(0x4000, page-aligned default)`, the new thread enters at
+/// `callback` with `message` in r3, and the resulting thread handle is
+/// written to `handle_ptr_out`.
+fn xam_task_schedule(ctx: &mut PpcContext, mem: &GuestMemory, state: &mut KernelState) {
+    let callback = ctx.gpr[3] as u32;
+    let message_ptr = ctx.gpr[4] as u32;
+    let optional_ptr = ctx.gpr[5] as u32;
+    let handle_ptr = ctx.gpr[6] as u32;
+    let lr = ctx.lr as u32;
+
+    if optional_ptr != 0 {
+        let v1 = mem.read_u32(optional_ptr);
+        let v2 = mem.read_u32(optional_ptr + 4);
+        tracing::info!("XamTaskSchedule: args v1={:#010x} v2={:#010x}", v1, v2);
+    }
+
+    let stack_size = std::cmp::max(0x4000u32, (0x10_0000u32 + 0xFFF) & !0xFFF);
+
+    let Some(image) = allocate_thread_image(state, mem, stack_size, 0) else {
+        tracing::error!("XamTaskSchedule: failed to allocate thread image");
+        ctx.gpr[3] = 0xC000_009A; // STATUS_INSUFFICIENT_RESOURCES
+        return;
+    };
+
+    use std::sync::atomic::Ordering;
+    let tid = state.next_thread_id.fetch_add(1, Ordering::Relaxed);
+    let handle = state.alloc_handle_for(KernelObject::Thread {
+        id: tid,
+        hw_id: None,
+        exit_code: None,
+        waiters: Vec::new(),
+    });
+
+    let tls_slot_count = state.next_tls_index.load(Ordering::Relaxed);
+    let params = SpawnParams {
+        entry: callback,
+        start_context: message_ptr,
+        stack_base: image.stack_base,
+        stack_size: image.stack_size,
+        pcr_base: image.pcr_base,
+        tls_base: image.tls_base,
+        thread_handle: handle,
+        guest_tid: tid,
+        create_suspended: false,
+        is_initial: false,
+        tls_slot_count,
+        affinity_mask: 0,
+        priority: 0,
+        ideal_processor: None,
+    };
+    match state.scheduler.spawn(params, &mut GuestMemoryPcr(mem)) {
+        Ok(hw_id) => {
+            metrics::counter!("scheduler.spawn.ok").increment(1);
+            if let Some(KernelObject::Thread { hw_id: slot, .. }) = state.objects.get_mut(&handle) {
+                *slot = Some(hw_id);
+            }
+            if handle_ptr != 0 {
+                mem.write_u32(handle_ptr, handle);
+            }
+            state.audit_create(handle, "Thread", lr, "XamTaskSchedule");
+            tracing::info!(
+                "XamTaskSchedule: tid={} handle={:#x} hw={} callback={:#010x} message={:#010x}",
+                tid,
+                handle,
+                hw_id,
+                callback,
+                message_ptr,
+            );
+            ctx.gpr[3] = 0; // STATUS_SUCCESS
+        }
+        Err(_) => {
+            metrics::counter!("scheduler.spawn.rejected").increment(1);
+            tracing::error!("XamTaskSchedule: no free HW thread slot");
+            ctx.gpr[3] = 0xC000_009A;
+        }
+    }
 }
 
 // ===== Alloc =====
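The handler above registers the thread object first with `hw_id: None`, then backfills the slot once the scheduler accepts the spawn. A stripped-down model of that ordering might look like the following. Every type and method here (`Kernel`, `Object`, `schedule_task`, the handle stride of 4) is a hypothetical stand-in and not the emulator's real API; only the status value `0xC000_009A` is taken from the hunk.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the kernel object table entry.
#[derive(Debug, PartialEq)]
enum Object {
    Thread { id: u32, hw_id: Option<u8> },
}

/// Hypothetical stand-in for kernel state plus scheduler.
struct Kernel {
    objects: HashMap<u32, Object>,
    next_handle: u32,
    free_hw_slots: Vec<u8>,
}

impl Kernel {
    fn new(free_hw_slots: Vec<u8>) -> Self {
        Kernel { objects: HashMap::new(), next_handle: 0x1000, free_hw_slots }
    }

    /// Register an object and hand back its handle.
    fn alloc_handle_for(&mut self, obj: Object) -> u32 {
        let handle = self.next_handle;
        self.next_handle += 4; // assumed stride, for illustration only
        self.objects.insert(handle, obj);
        handle
    }

    /// Pretend scheduler: succeeds while a hardware slot is free.
    fn spawn(&mut self) -> Result<u8, ()> {
        self.free_hw_slots.pop().ok_or(())
    }

    /// Allocate the handle first, then backfill `hw_id` on success.
    /// Returns the handle, or an NTSTATUS-style error code.
    fn schedule_task(&mut self, tid: u32) -> Result<u32, u32> {
        let handle = self.alloc_handle_for(Object::Thread { id: tid, hw_id: None });
        match self.spawn() {
            Ok(hw) => {
                if let Some(Object::Thread { hw_id, .. }) = self.objects.get_mut(&handle) {
                    *hw_id = Some(hw);
                }
                Ok(handle)
            }
            // STATUS_INSUFFICIENT_RESOURCES, as in the hunk above.
            Err(()) => Err(0xC000_009A),
        }
    }
}
```

Like the hunk, this sketch leaves the object in the table when the spawn is rejected. Whether the real handler should release the handle on that path is not something this diff settles.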
@@ -326,3 +404,66 @@ fn xget_video_mode(ctx: &mut PpcContext, mem: &GuestMemory, _state: &mut KernelS
     }
     ctx.gpr[3] = 0;
 }
+
+#[cfg(test)]
+mod tests {
+    use super::*;
+    use xenia_memory::page_table::MemoryProtect;
+
+    const SCRATCH_BASE: u32 = 0x4000_0000;
+
+    fn fresh() -> (PpcContext, GuestMemory, KernelState) {
+        let mut mem = GuestMemory::new().expect("memory init");
+        mem.alloc(SCRATCH_BASE, 0x1000, MemoryProtect::READ | MemoryProtect::WRITE)
+            .expect("scratch page must commit");
+        let mut state = KernelState::new();
+        state.install_initial_thread(
+            PpcContext::default(),
+            0x7000_0000,
+            0x10_0000,
+            SCRATCH_BASE + 0x800,
+            SCRATCH_BASE + 0xC00,
+            0xF000_0001,
+            &mut mem,
+        );
+        state.scheduler.begin_slot_visit(0);
+        (PpcContext::default(), mem, state)
+    }
+
+    #[test]
+    fn xam_task_schedule_spawns_real_thread() {
+        let (mut ctx, mut mem, mut state) = fresh();
+
+        let callback_pc: u32 = 0x824a_93c8;
+        let message_ptr: u32 = SCRATCH_BASE + 0x100;
+        let handle_out: u32 = SCRATCH_BASE + 0x200;
+        ctx.gpr[3] = callback_pc as u64;
+        ctx.gpr[4] = message_ptr as u64;
+        ctx.gpr[5] = 0;
+        ctx.gpr[6] = handle_out as u64;
+        ctx.lr = 0x824a_9a14;
+
+        xam_task_schedule(&mut ctx, &mut mem, &mut state);
+
+        assert_eq!(ctx.gpr[3], 0, "XamTaskSchedule must return STATUS_SUCCESS");
+
+        let handle = mem.read_u32(handle_out);
+        assert!(handle >= 0x1000, "handle must be allocated, got {:#x}", handle);
+
+        let r = state
+            .scheduler
+            .find_by_handle(handle)
+            .expect("spawned thread must be findable by handle");
+        let new_ctx = state.scheduler.ctx_mut_ref(r);
+        assert_eq!(new_ctx.pc, callback_pc, "entry PC must be the callback");
+        assert_eq!(
+            new_ctx.gpr[3] as u32, message_ptr,
+            "r3 must hold the message pointer"
+        );
+
+        match state.objects.get(&handle) {
+            Some(KernelObject::Thread { hw_id: Some(_), .. }) => {}
+            other => panic!("expected Thread object with hw_id set, got {:?}", other),
+        }
+    }
+}
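The test seeds r3 through r6 as 64-bit register values, and the handler reads them back with `as u32`, which keeps only the low 32 bits. A minimal sketch of that decoding step follows, assuming a hypothetical `Regs` struct in place of `PpcContext` and a hypothetical `TaskArgs` grouping that the real code does not define.

```rust
/// Hypothetical stand-in for the general-purpose part of `PpcContext`.
struct Regs {
    gpr: [u64; 32],
}

/// Hypothetical grouping of the four arguments named in the doc comment.
#[derive(Debug, PartialEq)]
struct TaskArgs {
    callback: u32,
    message_ptr: u32,
    optional_ptr: u32,
    handle_ptr_out: u32,
}

/// Read the arguments from r3..r6, truncating each 64-bit register
/// to a 32-bit guest address (the upper half is discarded).
fn decode_task_args(regs: &Regs) -> TaskArgs {
    TaskArgs {
        callback: regs.gpr[3] as u32,
        message_ptr: regs.gpr[4] as u32,
        optional_ptr: regs.gpr[5] as u32,
        handle_ptr_out: regs.gpr[6] as u32,
    }
}
```

Because the cast truncates rather than checks, a register whose upper 32 bits are non-zero decodes to the same guest address as one whose upper bits are clear.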