xenia-rs/crates/xenia-kernel/src/silph_synth.rs
MechaCat02 40f208ea4e [2.BF] Silph WorkerCtx: install canary's real sub-vtable at [+0x2C][0]
Round-21 pivot of the audit-059 synth-spawn module. Round 20 made the
silph::WorkerCtx workers run by attaching a 32-slot stub sub-vtable
where every entry was a `li r3, 0; blr` stub — workers spawned but
spun forever because slots 15/17 short-circuited to NULL ("no work").

Round 21 reads canary's real sub-vtable VA out of the XEX `.rdata` —
`0x8200A168` — and points `[sub_object + 0]` at it directly. The
vtable bytes live in the static image both engines map, so no guest
memory is consumed and slot 15 (= `sub_824FCCC8`) and slot 17
(= `sub_824FCE38`) — the only slots `sub_82506B08` ever calls —
become working game methods.
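
For reference, the slot arithmetic behind that claim, as a minimal Rust
sketch. `slot_offset` and `slot_va` are hypothetical helper names used only
for illustration (they are not part of silph_synth.rs), and 4-byte slots are
assumed because guest pointers are 32-bit.

```rust
/// Sub-vtable VA reported by the round-21 probes.
const SUB_VTABLE_VA: u32 = 0x8200_A168;

/// Byte offset of a vtable slot, assuming 4-byte (32-bit pointer) entries.
fn slot_offset(slot: u32) -> u32 {
    slot * 4
}

/// Guest VA of the word holding the method pointer for `slot`.
fn slot_va(vtable_va: u32, slot: u32) -> u32 {
    vtable_va + slot_offset(slot)
}
```

Under that assumption slot 15 sits at byte offset 0x3C and slot 17 at 0x44,
which matches the `[r11+0x3C]` / `[r11+0x44]` loads documented for
`sub_82506B08` in the source below.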

Discovery method (canary probes in
`audit-runs/audit-059-handle-disambiguation/round21-subvtable-canary/`):
  1. `--audit_jit_prolog_pc=0x82506B08` to catch the first WorkerCtx
     virtual-dispatch entry; `[r3+0x2C]` revealed the sub-object VA.
  2. Re-run with `--audit_jit_prolog_mem_dump=<sub-obj VA>` to deref
     `[sub-object + 0]` = sub-vtable VA = 0x8200A168.
  3. PE inspection (`xex-text/xex-rdata` is the static image) reads
     all 31 slots; slot 15 -> sub_824FCCC8, slot 17 -> sub_824FCE38.
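
Step 3 can be sketched roughly as follows. This is an illustration under
stated assumptions, not the tooling actually used: `read_vtable_slots`, the
`section` byte slice and its base VA `section_va` are hypothetical, and slots
are assumed to be stored big-endian like the rest of the guest image.

```rust
/// Read `count` big-endian 32-bit vtable slots starting at `vtable_va`.
///
/// `section` holds the raw bytes of the section containing the vtable and
/// `section_va` is the guest VA of its first byte (both hypothetical inputs).
/// Returns `None` if the requested range falls outside the section.
fn read_vtable_slots(
    section: &[u8],
    section_va: u32,
    vtable_va: u32,
    count: usize,
) -> Option<Vec<u32>> {
    // Translate the guest VA into an offset within the section bytes.
    let start = vtable_va.checked_sub(section_va)? as usize;
    let end = start.checked_add(count.checked_mul(4)?)?;
    let bytes = section.get(start..end)?;
    Some(
        bytes
            .chunks_exact(4)
            .map(|w| u32::from_be_bytes([w[0], w[1], w[2], w[3]]))
            .collect(),
    )
}
```

Calling it with the `.rdata` bytes, the section's base VA, 0x8200A168 and a
count of 31 would yield the slot table; indices 15 and 17 are the ones of
interest here.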

Smoke metrics (50M instructions, `XENIA_CACHE_PERSIST=1
XENIA_SILPH_SYNTH=1`, audit-runs/audit-059-handle-disambiguation/
round21-real-vtable/):
  * 4/4 workers spawned, no crash, no new fault
  * KeSetEvent 633885 -> 431860 (-32%)
  * KeWaitForSingleObject 258441 -> 185762 (-28%)
  * Per-handle state unchanged on the focused stalled set
    (0x1020/0x1090 still `<NO_SIGNALS_DESPITE_WAITS>`,
    0x12a4/0x12ac/0x1218/0x1224 still `<UNCREATED>`).
  * No VdSwap/draws progression observed in this window.
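
The percentages above follow from the raw counters; a small sketch of the
arithmetic, where `percent_change` is a hypothetical helper and not part of
the audit tooling:

```rust
/// Relative change from `before` to `after`, in percent (negative = drop).
fn percent_change(before: u64, after: u64) -> f64 {
    (after as f64 - before as f64) / before as f64 * 100.0
}
```

Rounded to whole percent, `percent_change(633_885, 431_860)` gives the -32%
quoted for KeSetEvent and `percent_change(258_441, 185_762)` gives the -28%
quoted for KeWaitForSingleObject.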

Verdict: B (partial). The workers no longer spin in a stub-loop (internal
call density shifted), but the focused wedge handles are still never
signalled. Likely root cause: the workers now wait on the WorkerCtx's own
KEVENTs (which we synthesised at +0x54/+0x94) for upstream work that no
producer enqueues.
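
A minimal sketch of the header bytes behind those synthesised KEVENTs,
mirroring the layout the synth code writes (type byte at +0, signal_state at
+4, self-linked list head at +8/+12). `event_va` and `kevent_header` are
hypothetical helpers, and big-endian encoding is an assumption based on the
guest being big-endian; the real code writes these fields through
`mem.write_u8` / `mem.write_u32`.

```rust
/// Guest VA of event `i` in a bank of 16-byte events starting at `ctx + base`.
fn event_va(ctx: u32, base: u32, i: u32) -> u32 {
    ctx + base + i * 0x10
}

/// Encode the 16-byte header for one event located at guest VA `va`.
fn kevent_header(va: u32, auto_reset: bool, signaled: bool) -> [u8; 16] {
    let mut h = [0u8; 16];
    // +0x00 type: 1 = auto-reset event, 0 = manual-reset event.
    h[0] = u8::from(auto_reset);
    // +0x04 signal_state, big-endian (assumed).
    h[4..8].copy_from_slice(&u32::from(signaled).to_be_bytes());
    // +0x08/+0x0C list head: both links point at the head itself (no waiters).
    let head = (va + 8).to_be_bytes();
    h[8..12].copy_from_slice(&head);
    h[12..16].copy_from_slice(&head);
    h
}
```

With this layout the auto-reset bank spans +0x54..+0x84 and the manual-reset
bank spans +0x94..+0xC4. A worker blocked on one of the unsignalled auto-reset
events stays blocked until something sets that event.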

Net LOC: 29 ins / 31 del. Tests: workspace passes (lockstep app
tests, kernel 127/127, hir 288/288, scheduler 38/38).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-06-07 21:19:52 +02:00

281 lines
12 KiB
Rust
//! AUDIT-2.BF — synthetic spawn of the silph::WorkerCtx worker quartet.
//!
//! AUDIT-058/059 traced a 6-level static-caller ladder
//! (`sub_824F7800 ← sub_824F7CD0 ← sub_824F8398 ← sub_821B55D8 ← sub_821B6DF4`,
//! topped by virtual-dispatch from `sub_82172BA0+0x1E8`) that activates
//! `sub_825070F0` in canary at ~1× / 30 s, kicking off four worker threads
//! initialised against a single ~0x440-byte ctx. In ours none of those PCs
//! fire: audit-059 round 9 confirmed sub_821B6DF4 = 0×, and the real chain
//! entry, the virtual dispatch from sub_82172BA0+0x1E8, hits the wrong
//! vtable slot.
//!
//! Rather than chase the wrong-vtable break, this module reproduces the end
//! state directly: at the first observation of a load-bearing VFS path
//! (`dat/movie`), we synthesise the ctx structure in guest memory per
//! audit-059 round 5's live hexdump and spawn the four worker entry points
//! the same way AUDIT-048's audio host-pump spawns its dedicated client
//! worker.
//!
//! The ctx is opaque to the workers — only fields they dereference matter.
//! Per round 5 dump
//! (`audit-runs/audit-059-handle-disambiguation/round5-ctx-dump/canary.log`):
//!
//!   +0x00           vtable         = 0x8200A1E8 (XEX .rdata, valid in both engines)
//!   +0x04           self           = ctx
//!   +0x08           intrusive head = ctx
//!   +0x0C           init flag      = 1
//!   +0x10           packed byte    = 0x01000000
//!   +0x18           float ~1.0     = 0x3F7FCCCC
//!   +0x1C           float ~1.0     = 0x3F802D83
//!   +0x24           flag           = 1
//!   +0x28..+0x30    = three foreign pointers, NULL initially
//!   +0x54..+0x84    = 4× X_KEVENT auto-reset, state=0
//!   +0x94..+0xC4    = 4× X_KEVENT manual-reset, state=1
//!   +0x210..+0x250  = 4-entry intrusive work-ring, empty
//!
//! Worker entries (each takes r3 = ctx_ptr):
//!   0x82506528, 0x82506558, 0x82506588, 0x825065B8
use xenia_cpu::scheduler::{BlockReason, SpawnParams};
use xenia_cpu::ThreadRef;
use xenia_memory::{GuestMemory, MemoryAccess};

use crate::objects::KernelObject;
use crate::state::{GuestMemoryPcr, KernelState};
use crate::thread::allocate_thread_image;

/// XEX `.rdata` vtable for the silph::WorkerCtx singleton (audit-059 round 5).
const SILPH_CTX_VTABLE: u32 = 0x8200_A1E8;

/// 4-element fixed entry table: guest text PCs for the four worker bodies.
const SILPH_WORKER_ENTRIES: [u32; 4] = [
    0x8250_6528,
    0x8250_6558,
    0x8250_6588,
    0x8250_65B8,
];
/// Round 0x440 up to a page-ish size so the ctx alloc never straddles a page
/// boundary in heap_alloc's bookkeeping. Round 20 grew the alloc from 0x500
/// to 0x800 to make room for a synthesised sub-object at +0x300 and its
/// 32-slot vtable at +0x500 (= ctx + 0x500..0x580). Round 21 retains the
/// embedded sub-object but drops the synthesised vtable (we now point at
/// canary's real XEX-resident sub-vtable directly), so the 0x500..0x580
/// region is unused but harmless.
const SILPH_CTX_SIZE: u32 = 0x800;
/// Offset within the ctx allocation of the synthetic sub-object referenced
/// at `[ctx+0x2C]`. Canary's sub-object sits ~0x300 bytes above the ctx and
/// varies per-instance; we keep it embedded in the same alloc so a single
/// `heap_alloc` covers everything.
const SILPH_SUBOBJ_OFFSET: u32 = 0x300;
/// XEX `.rdata` VA of canary's real sub-object vtable (audit-059 round 21).
/// Discovered by:
///   1. Probing canary at `pc=0x82506B08` (= `sub_82506B08`, method 35 of
///      the WorkerCtx vtable, the first sub-object method called by every
///      `sub_82506528/58/88/B8` worker entry).
///   2. Capturing `[ctx+0x2C]` from the JIT-prolog dump (= sub-object VA
///      in canary's heap).
///   3. Re-running with `--audit_jit_prolog_mem_dump=<sub-obj VA>` to read
///      `[sub-object + 0]` = sub-vtable VA = **`0x8200A168`**.
///
/// PE inspection confirms slot 15 (called via `[r11+0x3C]` at
/// `sub_82506B08+0x44`) = `sub_824FCCC8` and slot 17 (`[r11+0x44]` at
/// `sub_82506B08+0x70`) = `sub_824FCE38`. Both are real game methods in
/// the same `.text` region as the rest of the worker dispatch surface.
const SILPH_SUB_VTABLE_SOURCE_VA: u32 = 0x8200_A168;
/// Round-19 XEX-resident wrapper constant observed at `[ctx+0x30]` in every
/// canary ctx (audit-059 round 7). Same value for all four ctxes — opaque
/// pointer / handle the worker passes through without dereferencing.
const SILPH_CTX_FIELD_30_CONST: u32 = 0xBE56_8F00;
/// 64 KiB worker stack (mirrors AUDIT-048 audio worker), half of canary's
/// 128 KiB default.
const SILPH_WORKER_STACK: u32 = 0x1_0000;
/// Idempotently synthesise the silph::WorkerCtx and spawn the four worker
/// threads it normally drives.
///
/// `suspended` controls whether the spawned threads enter the runqueue as
/// `Ready` (false) or as `Blocked(Suspended)` (true). Use `true` for
/// diagnostic baselines where you want the ctx materialised in guest memory
/// for downstream probes but don't want the worker bodies executing (e.g.
/// when round-5 ctx fields like the foreign-arena pointers at +0x28/+0x2C/
/// +0x30 are still NULL and the workers would fault on first dereference).
///
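/// Returns the ctx VA on the first call; on subsequent calls returns the
/// cached VA without re-spawning. Failures inside spawn are logged but the
/// `synth_done` latch is still flipped so we don't retry-loop. If the ctx
/// `heap_alloc` itself fails, this call returns `None`, but the latch stays
/// set, so later calls return `Some(state.silph_synth_ctx)` with whatever
/// value that field already held.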
///
/// Mirrors the AUDIT-048 audio-worker spawn pattern in
/// `xaudio_register_render_driver` (`exports.rs:3122`).
pub fn spawn_silph_workers(
    state: &mut KernelState,
    mem: &GuestMemory,
    suspended: bool,
) -> Option<u32> {
    if state.silph_synth_done {
        return Some(state.silph_synth_ctx);
    }
    state.silph_synth_done = true;

    let Some(ctx) = state.heap_alloc(SILPH_CTX_SIZE, mem) else {
        tracing::warn!("silph_synth: heap_alloc({:#x}) failed for ctx", SILPH_CTX_SIZE);
        return None;
    };
    state.silph_synth_ctx = ctx;

    // Zero the entire ctx allocation first: heap_alloc returns freshly mapped
    // memory but we want the audit-059-round-5 layout to be canonical
    // regardless of any future allocator behaviour change.
    for off in (0..SILPH_CTX_SIZE).step_by(4) {
        mem.write_u32(ctx + off, 0);
    }

    // ---- Header scalars (per audit-059 round 5 hexdump) ----
    mem.write_u32(ctx + 0x00, SILPH_CTX_VTABLE);
    mem.write_u32(ctx + 0x04, ctx); // self
    mem.write_u32(ctx + 0x08, ctx); // intrusive list head pointing at self
    mem.write_u32(ctx + 0x0C, 0x0000_0001); // init flag / refcount
    mem.write_u32(ctx + 0x10, 0x0100_0000); // packed byte field
    mem.write_u32(ctx + 0x18, 0x3F7F_CCCC); // float ~1.0 (UI rate A)
    mem.write_u32(ctx + 0x1C, 0x3F80_2D83); // float ~1.0 (UI rate B)
    mem.write_u32(ctx + 0x24, 0x0000_0001);

    // +0x28..+0x30 = three foreign pointers.
    //   +0x28: canary's first-fire snapshot has NULL here. Round-19 fault
    //          analysis shows worker bodies don't dereference this on
    //          first entry, so we leave it NULL too.
    //   +0x2C: sub-object pointer. Worker bodies do
    //          `lwz r3,44(rN); lwz r11,0(r3); lwz r11,60(r11); bctrl`,
    //          i.e. virtual-dispatch through slot 15 of the sub-object's
    //          vtable. Point this at our synthesised sub-object embedded
    //          at ctx + SILPH_SUBOBJ_OFFSET.
    //   +0x30: XEX-resident wrapper constant 0xBE568F00 (round 7). Opaque
    //          but identical across all four canary ctxes.
    let subobj_ptr = ctx + SILPH_SUBOBJ_OFFSET;
    mem.write_u32(ctx + 0x2C, subobj_ptr);
    mem.write_u32(ctx + 0x30, SILPH_CTX_FIELD_30_CONST);

    // ---- Embedded sub-object at +0x300 ----
    // Round-21 pivot: instead of synthesising a stub vtable that returns
    // NULL from every slot, point `[sub_object + 0]` directly at canary's
    // real XEX-resident sub-vtable VA. The vtable bytes are part of the
    // same static image both engines map, so referring to it costs zero
    // guest memory and gives the workers a working virtual-method surface
    // (slot 15 = sub_824FCCC8, slot 17 = sub_824FCE38, plus 29 other real
    // methods). Round-19 disassembly shows worker bodies only touch the
    // sub-object's vtable; the rest of the sub-object is opaque so we
    // leave it zero-filled.
    mem.write_u32(subobj_ptr, SILPH_SUB_VTABLE_SOURCE_VA);

    // ---- 4× X_KEVENT auto-reset at +0x54/+0x64/+0x74/+0x84, state = 0 ----
    // X_DISPATCH_HEADER layout (canary xobject.h:35):
    //   +0x00        type         (u8: 0=manual-event, 1=auto-event, 2=mutant, ...)
    //   +0x01        abandoned    (u8)
    //   +0x02        size         (u8 dwords)
    //   +0x03        inserted     (u8)
    //   +0x04        signal_state (u32 BE)
    //   +0x08..+0x0F list_head    (two pointers; self-link = empty list)
    for i in 0..4u32 {
        let off = ctx + 0x54 + (i * 0x10);
        mem.write_u8(off, 1); // type = auto-reset Event
        mem.write_u32(off + 4, 0); // signal_state = 0
        // List head self-link denotes empty waiter list.
        mem.write_u32(off + 8, off + 8);
        mem.write_u32(off + 12, off + 8);
    }

    // ---- 4× X_KEVENT manual-reset at +0x94..+0xC4, state = 1 (pre-signaled) ----
    for i in 0..4u32 {
        let off = ctx + 0x94 + (i * 0x10);
        mem.write_u8(off, 0); // type = manual-reset Event
        mem.write_u32(off + 4, 1); // signal_state = 1 (pre-signaled)
        mem.write_u32(off + 8, off + 8);
        mem.write_u32(off + 12, off + 8);
    }

    // ---- 4-entry intrusive work-ring at +0x210, initially empty ----
    // Each entry: [+0]=0x01000000 [+4]=0 [+8]=self_ptr [+0xC]=self_ptr.
    for i in 0..4u32 {
        let off = ctx + 0x210 + (i * 0x10);
        mem.write_u32(off, 0x0100_0000);
        mem.write_u32(off + 4, 0);
        mem.write_u32(off + 8, off + 8);
        mem.write_u32(off + 12, off + 8);
    }
    // +0x250 "XEN"-tagged descriptors and +0x2E0 resource-index table left
    // zero: they may be populated lazily by the workers themselves.

    // ---- Spawn the 4 worker guest threads ----
    use std::sync::atomic::Ordering;
    let mut spawned = 0usize;
    for (i, &entry) in SILPH_WORKER_ENTRIES.iter().enumerate() {
        let Some(image) = allocate_thread_image(state, mem, SILPH_WORKER_STACK, 0) else {
            tracing::warn!("silph_synth: allocate_thread_image failed for worker {}", i);
            continue;
        };
        let tid = state.next_thread_id.fetch_add(1, Ordering::Relaxed);
        let handle = state.alloc_handle_for(KernelObject::Thread {
            id: tid,
            hw_id: None,
            exit_code: None,
            waiters: Vec::new(),
        });
        let tls_slot_count = state.next_tls_index.load(Ordering::Relaxed);
        let params = SpawnParams {
            entry,
            start_context: ctx, // r3 = ctx_ptr
            stack_base: image.stack_base,
            stack_size: image.stack_size,
            pcr_base: image.pcr_base,
            tls_base: image.tls_base,
            thread_handle: handle,
            guest_tid: tid,
            create_suspended: suspended,
            is_initial: false,
            tls_slot_count,
            affinity_mask: 0,
            priority: 0,
            ideal_processor: None,
        };
        match state.scheduler.spawn(params, &mut GuestMemoryPcr(mem)) {
            Ok(hw_id) => {
                // Record the hardware slot on the thread object we just created.
                if let Some(KernelObject::Thread { hw_id: slot, .. }) =
                    state.objects.get_mut(&handle)
                {
                    *slot = Some(hw_id);
                }
                let tref = ThreadRef::new(
                    hw_id,
                    (state.scheduler.slots[hw_id as usize].runqueue.len() - 1) as u16,
                );
                state.silph_synth_handles[i] = Some(handle);
                state.silph_synth_refs[i] = Some(tref);
                spawned += 1;
                tracing::info!(
                    "silph_synth: spawned worker {} tid={} handle={:#x} entry={:#010x} ctx={:#010x}",
                    i, tid, handle, entry, ctx
                );
            }
            Err(_) => {
                tracing::warn!(
                    "silph_synth: scheduler.spawn failed for worker {} entry={:#010x}",
                    i, entry
                );
            }
        }
        // Avoid an unused-import warning if BlockReason isn't otherwise referenced.
        let _ = BlockReason::WaitAny {
            handles: Vec::new(),
            deadline: None,
        };
    }
    tracing::info!(
        "silph_synth: ctx={:#010x} workers_spawned={}/4",
        ctx, spawned
    );
    Some(ctx)
}