[iterate-4A] Milestone-2: XMA audio decoder + RE tooling (dispatch recorder, analyzer vtable-fix, non-perturbing probes)
Milestone-2 (intro video dat/movie/ADV.wmv) audio path + major RE tooling.

XMA AUDIO (built, working, deterministic, tested):
- APU MMIO 0x7FEA0000 + 320x64B register-mapped context array; real XMACreateContext/Release (xma.rs); real FFmpeg xma2 decoder XMA_CONTEXT_DATA->S16BE PCM (xma_decode.rs, xma2_codec.rs, ffmpeg-sys-next). Decode runs synchronously on the CPU thread (deterministic, no host thread).
- Audio-worker scheduler fix (main.rs LR_HALT restore + scheduler.rs): the XAudio render-callback worker was wrongly exited after ~2 deliveries; now survives -> guest drives XMA decode (70 kicks).
- XAudioSubmitRenderDriverFrame made faithful. Golden sylpheed_n50m re-baselined; tests pass.

RE TOOLING:
- Runtime indirect-dispatch recorder (dispatch_rec.rs): records (call-site->target, r3, lr); env-gated XENIA_DISPATCH_REC, filters XENIA_DISPATCH_REC_TARGETS/_SITES; deterministic, observe-only.
- Repaired static analyzer (vtables.rs): vtable extraction silently fragmented vtables with non-function head slots (missed the XMV engine vtable). Fixed via vptr-write-anchoring -> engine fully typed (vtables 722->1150 on rebuild).
- Fixed probe HEISENBUG (main.rs run_superblock): --audit-pc-probe-hex/--mem-watch no longer disable superblock chaining; probes fire inside the chain loop -> scheduling identical armed-vs-unarmed, movie subsystem now observable. Fixed a --quiet bug swallowing armed trace reports.

VIDEO still doesn't play (B, guest-side): the XMV engine never issues begin-playback (sub_825076F0, vtable 0x8200a1e8 slot21) -> never primes -> 2000ms timeout. Narrowed to the ARM2 engine-setup wrappers; no honest our-side gate-fix (masking forbidden).

See HANDOFF-iterate-4A-milestone2.md for new-machine setup (incl. the FFmpeg apt deps + sylpheed.db regeneration) and continuation pointers.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
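The vptr-write anchoring described above depends on reconstructing a 32-bit constant from a two-instruction immediate build. The sketch below is not code from this commit; the helper names `split_addis_addi`, `rebuild_addis_addi` and `rebuild_lis_ori` are hypothetical. It only illustrates the arithmetic a register tracker has to reproduce: `addi` sign-extends its 16-bit immediate, so the high half is one larger than `addr >> 16` whenever bit 15 of the low half is set, while `ori` zero-extends and needs no such adjustment.

```rust
/// Hypothetical helper: split `addr` into the immediates for
/// `addis rD,0,hi; addi rD,rD,lo`. Because `addi` sign-extends `lo`,
/// `hi` must compensate when `lo` is negative.
fn split_addis_addi(addr: u32) -> (u16, i16) {
    let lo = (addr & 0xFFFF) as u16 as i16;
    let hi = (addr.wrapping_sub(lo as i32 as u32) >> 16) as u16;
    (hi, lo)
}

/// Hypothetical helper: what a register tracker computes for addis + addi.
fn rebuild_addis_addi(hi: u16, lo: i16) -> u32 {
    ((hi as u32) << 16).wrapping_add(lo as i32 as u32)
}

/// Hypothetical helper: what a register tracker computes for lis + ori
/// (`ori` zero-extends its immediate).
fn rebuild_lis_ori(hi: u16, lo: u16) -> u32 {
    ((hi as u32) << 16) | lo as u32
}

fn main() {
    // Same vtable base as the unit test in this diff.
    let vt_base = 0x8200_A908u32;
    let (hi, lo) = split_addis_addi(vt_base);
    assert_eq!((hi, lo), (0x8201, -22264));
    assert_eq!(rebuild_addis_addi(hi, lo), vt_base);
    // The lis/ori form uses the plain upper and lower halves.
    assert_eq!(rebuild_lis_ori(0x8200, 0xA908), vt_base);
    println!("hi={hi:#06x} lo={lo} -> {:#010x}", rebuild_addis_addi(hi, lo));
}
```

This is why the ctor test encodes `addis r11,r0,0x8201` rather than `0x8200` for a base of `0x8200A908`.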
@@ -26,6 +26,14 @@ use xenia_xex::pe::PeSection;

use crate::demangle;

/// Maximum number of consecutive non-function slots tolerated inside an
/// anchor-recovered vtable before the run is considered terminated. MSVC
/// vtables can carry null / pure-virtual / unrecognised-thunk slots in their
/// head or interior; a small budget lets those through without merging two
/// physically-adjacent vtables. Kept small to avoid bridging the gap between
/// distinct tables.
const MAX_ANCHOR_GAP: usize = 2;

/// One detected vtable.
#[derive(Debug, Clone)]
pub struct Vtable {
@@ -56,6 +64,35 @@ pub fn analyze(
    image_base: u32,
    sections: &[PeSection],
    function_starts: &std::collections::BTreeSet<u32>,
) -> Vec<Vtable> {
    analyze_with_anchors(pe, image_base, sections, function_starts, &std::collections::BTreeSet::new())
}

/// Like [`analyze`], but additionally recovers vtables whose base address is
/// known a-priori from a constructor vptr-write store (an "anchor"). The
/// contiguity heuristic in pass 1 fragments any vtable whose head region
/// contains words that don't resolve to recognised function entries (null /
/// pure-virtual / unrecognised thunk slots); those vtables are never emitted
/// and the downstream typed-dispatch resolver can't type objects of that
/// class. An anchor is a *content-independent* vtable signal — the ctor
/// literally installs `vtable_base` into `this+0` via
/// `addis/addi (or lis/ori) → stw rX, 0(rThis)` — so for every anchor not
/// already covered by a pass-1 run we synthesise a vtable starting at that
/// base, reading the fnptr-array run while *tolerating* up to
/// [`MAX_ANCHOR_GAP`] consecutive non-function slots before terminating.
///
/// `anchors` are absolute VAs of vtable bases (from
/// [`scan_vptr_write_constants`]). Existing pass-1 vtables are kept unchanged
/// (no regression): an anchor that already coincides with a detected vtable
/// base is skipped, and an anchor that lands *inside* an existing run is also
/// skipped (it's a sub-object pointer, not a fresh table).
#[tracing::instrument(skip_all, fields(image_base = format_args!("{:#010x}", image_base)))]
pub fn analyze_with_anchors(
    pe: &[u8],
    image_base: u32,
    sections: &[PeSection],
    function_starts: &std::collections::BTreeSet<u32>,
    anchors: &std::collections::BTreeSet<u32>,
) -> Vec<Vtable> {
    let started = std::time::Instant::now();
    // Sections we'll scan for vtable bodies.
@@ -117,6 +154,120 @@ pub fn analyze(
        let _ = (va_start, va_end);
    }

    // --- Anchor-driven recovery (vptr-write-anchored vtables) ---
    //
    // Build a coverage interval set from pass-1 runs so we don't re-emit a
    // table for an anchor that already lies within an extracted vtable.
    let mut covered: Vec<(u32, u32)> = candidates
        .iter()
        .map(|v| (v.address, v.address + v.length * 4))
        .collect();
    covered.sort_unstable();

    let is_covered = |addr: u32, covered: &[(u32, u32)]| -> bool {
        covered.iter().any(|&(s, e)| addr >= s && addr < e)
    };

    // Section lookup for "which scan target contains this VA?"
    let scan_targets_va: Vec<(u32, u32, usize, usize)> = sections
        .iter()
        .filter(|s| matches!(s.name.as_str(), ".rdata" | ".data"))
        .map(|s| {
            let va = image_base + s.virtual_address;
            (
                va,
                va + s.virtual_size,
                s.virtual_address as usize,
                (s.virtual_address + s.virtual_size) as usize,
            )
        })
        .collect();

    // Cap a recovered run at the *next anchor* so two physically-adjacent
    // anchored vtables don't merge. We deliberately do NOT cap at pass-1
    // fragments: a fragment is a sub-run the contiguity scan carved out of a
    // larger table, and the anchor legitimately re-absorbs it (subsumed
    // fragments are removed afterwards).
    let anchor_bases: std::collections::BTreeSet<u32> = anchors.iter().copied().collect();

    let mut recovered = 0usize;
    let mut newly: Vec<Vtable> = Vec::new();
    for &anchor in anchors {
        if is_covered(anchor, &covered) { continue; }
        // Locate the containing .rdata/.data section.
        let Some(&(va_lo, va_hi, raw_lo, raw_hi)) =
            scan_targets_va.iter().find(|&&(lo, hi, _, _)| anchor >= lo && anchor < hi)
        else { continue };
        if anchor % 4 != 0 { continue; }
        let raw_hi = raw_hi.min(pe.len());
        // Read the fnptr-array run starting at the anchor. Tolerate small
        // gaps of non-function slots (null / pure-virtual / unrecognised),
        // but require the run to actually contain at least one real function
        // (otherwise it's just data, not a vtable).
        let next_base = anchor_bases.range((anchor + 4)..).next().copied();
        let mut methods: Vec<u32> = Vec::new();
        let mut gap = 0usize;
        let mut real_fns = 0usize;
        let mut off = (anchor - va_lo) as usize + raw_lo;
        let mut va = anchor;
        while off + 4 <= raw_hi && va < va_hi {
            if let Some(nb) = next_base && va >= nb { break; }
            let val = u32::from_be_bytes([pe[off], pe[off + 1], pe[off + 2], pe[off + 3]]);
            if function_starts.contains(&val) {
                methods.push(val);
                real_fns += 1;
                gap = 0;
            } else {
                // A non-function slot. Keep the slot (so downstream slot
                // indexing stays aligned) but count toward the gap budget.
                gap += 1;
                if gap > MAX_ANCHOR_GAP {
                    // Drop the trailing gap slots — they belong past the
                    // table's end.
                    methods.truncate(methods.len().saturating_sub(gap - 1));
                    break;
                }
                methods.push(val);
            }
            off += 4;
            va += 4;
        }
        // Trim any trailing non-function slots (the table ends at its last
        // real method).
        while methods.last().is_some_and(|&m| !function_starts.contains(&m)) {
            methods.pop();
        }
        if real_fns == 0 || methods.is_empty() { continue; }
        let length = methods.len() as u32;
        newly.push(Vtable {
            address: anchor,
            length,
            col_address: None,
            class_name: synth_anon_name(&methods),
            rtti_present: false,
            base_classes_json: None,
            methods,
        });
        recovered += 1;
    }
    if recovered > 0 {
        // Drop pass-1 fragments fully subsumed by a recovered (anchored)
        // vtable — the anchor base is authoritative and the fragment was a
        // contiguity-scan artifact of the same table. Keep fragments that
        // only partially overlap (defensive; shouldn't happen for true
        // sub-runs) so we never lose method coverage.
        let recovered_spans: Vec<(u32, u32)> =
            newly.iter().map(|v| (v.address, v.address + v.length * 4)).collect();
        candidates.retain(|v| {
            !recovered_spans
                .iter()
                .any(|&(s, e)| v.address >= s && v.address + v.length * 4 <= e)
        });
        candidates.extend(newly);
        tracing::info!(recovered, "vtables recovered from vptr-write anchors");
    }
    let _ = &covered;

    // RTTI walk: for each candidate, look at vtable[-1].
    let pe_image_base = image_base;
    for v in &mut candidates {
@@ -268,6 +419,98 @@ fn read_class_hierarchy(
    serde_json::to_string(&names).ok()
}

/// Pre-pass: discover candidate vtable *bases* from constructor vptr-write
/// stores, independent of the static contiguity heuristic. A vptr install is
/// the canonical `addis/addi` (or `lis/ori`) immediate build of a constant
/// pointing into `.rdata` / `.data`, followed by `stw rX, 0(rThis)` — i.e. the
/// ctor writing the vtable pointer to `this+0`. We return the set of such
/// constants; these are fed to [`analyze_with_anchors`] so a vtable with
/// non-function head words isn't lost.
///
/// We only consider stores at displacement 0 (the primary vptr; secondary
/// MI vptrs land at non-zero offsets and are handled by the existing
/// contiguity scan / typed-dispatch resolver well enough). The register
/// tracker mirrors the lis+addi propagation used elsewhere and is reset at
/// every basic-block boundary (`block_boundaries`).
pub fn scan_vptr_write_constants(
    pe: &[u8],
    image_base: u32,
    functions: &std::collections::BTreeMap<u32, (u32, bool)>, // start -> (end, is_saverestore)
    sections: &[PeSection],
    block_boundaries: &std::collections::HashSet<u32>,
) -> std::collections::BTreeSet<u32> {
    // Ranges that a vtable base may legitimately live in.
    let data_ranges: Vec<(u32, u32)> = sections
        .iter()
        .filter(|s| matches!(s.name.as_str(), ".rdata" | ".data"))
        .map(|s| (image_base + s.virtual_address, image_base + s.virtual_address + s.virtual_size))
        .collect();
    let in_data = |a: u32| data_ranges.iter().any(|&(s, e)| a >= s && a < e);

    const OP_ADDI: u32 = 14;
    const OP_ADDIS: u32 = 15;
    const OP_ORI: u32 = 24;
    const OP_STW: u32 = 36;
    const OP_X_FORM: u32 = 31;

    let read = |addr: u32| -> Option<u32> {
        let off = addr.wrapping_sub(image_base) as usize;
        if off + 4 > pe.len() { return None; }
        Some(u32::from_be_bytes([pe[off], pe[off + 1], pe[off + 2], pe[off + 3]]))
    };

    let mut anchors: std::collections::BTreeSet<u32> = std::collections::BTreeSet::new();
    for (&fn_start, &(fn_end, is_saverestore)) in functions {
        if is_saverestore { continue; }
        let mut reg: [Option<u32>; 32] = [None; 32];
        let mut pc = fn_start;
        while pc < fn_end {
            if pc != fn_start && block_boundaries.contains(&pc) {
                reg = [None; 32];
            }
            let Some(instr) = read(pc) else { break };
            let op = instr >> 26;
            let rd = ((instr >> 21) & 0x1F) as usize;
            let ra = ((instr >> 16) & 0x1F) as usize;
            let simm = ((instr & 0xFFFF) as i16) as i32;
            let uimm = instr & 0xFFFF;
            match op {
                OP_ADDIS if ra == 0 => reg[rd] = Some(uimm << 16),
                OP_ADDIS => reg[rd] = reg[ra].map(|b| b.wrapping_add(uimm << 16)),
                OP_ADDI if ra != 0 => reg[rd] = reg[ra].map(|b| b.wrapping_add(simm as u32)),
                OP_ADDI => reg[rd] = Some(simm as u32),
                OP_ORI => {
                    let rs = rd;
                    reg[ra] = reg[rs].map(|b| b | uimm);
                }
                OP_STW => {
                    // `stw rS, off(rA)` with displacement 0 = primary vptr install.
                    if ra != 0
                        && simm == 0
                        && let Some(val) = reg[rd]
                        && in_data(val)
                    {
                        anchors.insert(val);
                    }
                }
                32..=35 | 40..=43 | 48..=51 => reg[rd] = None,
                OP_X_FORM => {
                    let xo = (instr >> 1) & 0x3FF;
                    if xo != 444 && xo != 467 { reg[rd] = None; } // keep `or`(444=mr)/`mtspr`-ish
                }
                18 | 16 => {
                    if (instr & 1) != 0 {
                        for r in 0..=12 { reg[r] = None; }
                    }
                }
                _ => {}
            }
            pc = pc.wrapping_add(4);
        }
    }
    anchors
}

/// Synthetic name for an RTTI-stripped vtable, derived from a stable hash of
/// the sorted method-PC list. Two vtables with identical method ordering
/// collapse to the same anonymous name.
@@ -385,6 +628,112 @@ mod tests {
        assert!(!vtables[0].rtti_present);
    }

    #[test]
    fn anchor_recovers_vtable_with_nonfn_head() {
        // A vtable whose head has a null + an unrecognised word, so the
        // contiguity scan (≥3 contiguous known fns) fragments it. The anchor
        // (from a ctor vptr-write) must recover the whole table from its base.
        let image_base = 0x82000000u32;
        let rdata_va = 0x1000u32;
        let text_va = 0x2000u32;
        let rdata_size = 0x40u32;
        let text_size = 0x100u32;
        let total = (text_va + text_size) as usize;
        let mut pe = vec![0u8; total];

        let f0 = image_base + text_va;
        let f1 = image_base + text_va + 0x10;
        let f2 = image_base + text_va + 0x20;
        // Slots: [null, NONFN(0xDEAD), f0, f1, f2]
        let slots: [u32; 5] = [0, 0xDEADBEEF, f0, f1, f2];
        for (i, val) in slots.iter().enumerate() {
            pe[rdata_va as usize + i * 4..rdata_va as usize + (i + 1) * 4]
                .copy_from_slice(&val.to_be_bytes());
        }

        let sections = vec![
            PeSection {
                name: ".rdata".into(),
                virtual_address: rdata_va,
                virtual_size: rdata_size,
                raw_offset: rdata_va,
                raw_size: rdata_size,
                flags: 0x4000_0040,
            },
            PeSection {
                name: ".text".into(),
                virtual_address: text_va,
                virtual_size: text_size,
                raw_offset: text_va,
                raw_size: text_size,
                flags: 0x6000_0020,
            },
        ];
        let mut function_starts = std::collections::BTreeSet::new();
        for &pc in &[f0, f1, f2] { function_starts.insert(pc); }

        // Without an anchor: the head gap (null + nonfn = 2 slots) means the
        // contiguous run is only [f0,f1,f2]=3 starting at +0x08, so pass-1
        // still finds it but at the WRONG base (0x...1008), not the true base.
        let no_anchor = analyze(&pe, image_base, &sections, &function_starts);
        assert!(
            !no_anchor.iter().any(|v| v.address == image_base + rdata_va),
            "without anchor the table is not recovered at its true base"
        );

        // With the anchor at the true base:
        let mut anchors = std::collections::BTreeSet::new();
        anchors.insert(image_base + rdata_va);
        let with_anchor =
            analyze_with_anchors(&pe, image_base, &sections, &function_starts, &anchors);
        let v = with_anchor
            .iter()
            .find(|v| v.address == image_base + rdata_va)
            .expect("anchor must recover vtable at its true base");
        // length spans through f2 (slot 4): 5 slots.
        assert_eq!(v.length, 5, "table spans null/nonfn head through last fn");
        assert_eq!(v.methods[2], f0);
        assert_eq!(v.methods[4], f2);
    }

    #[test]
    fn scan_vptr_write_constants_finds_ctor_store() {
        // Encode a ctor: addis r11,r0,0x8201; addi r11,r11,lo; stw r11,0(r31)
        // installing vtable base 0x8200A908 into this+0.
        let image_base = 0x82000000u32;
        let ctor = 0x82001000u32;
        let mut pe = vec![0u8; 0x4000];
        // Lay out a tiny .rdata at 0x...A900 so the constant lands in-range.
        let vt_base = 0x8200A908u32; // 0x82010000 - 22264
        let addis = (15u32 << 26) | (11 << 21) | (0 << 16) | 0x8201;
        let lo = (vt_base & 0xFFFF) as i16; // -22264
        let addi = (14u32 << 26) | (11 << 21) | (0 << 16) | ((lo as u16) as u32);
        // addi r11,r0,lo would set r11=lo (sign-extended); we need addis+addi
        // chained. Re-encode addis into r11 from r0, then addi r11,r11,lo.
        let addi2 = (14u32 << 26) | (11 << 21) | (11 << 16) | ((lo as u16) as u32);
        let stw = (36u32 << 26) | (11 << 21) | (31 << 16) | 0; // stw r11,0(r31)
        let at = (ctor - image_base) as usize;
        pe[at..at + 4].copy_from_slice(&addis.to_be_bytes());
        pe[at + 4..at + 8].copy_from_slice(&addi2.to_be_bytes());
        pe[at + 8..at + 12].copy_from_slice(&stw.to_be_bytes());
        let _ = addi;

        let sections = vec![PeSection {
            name: ".rdata".into(),
            virtual_address: 0xA900,
            virtual_size: 0x200,
            raw_offset: 0xA900,
            raw_size: 0x200,
            flags: 0x4000_0040,
        }];
        let mut funcs: std::collections::BTreeMap<u32, (u32, bool)> = std::collections::BTreeMap::new();
        funcs.insert(ctor, (ctor + 0x40, false));
        let anchors = scan_vptr_write_constants(
            &pe, image_base, &funcs, &sections, &std::collections::HashSet::new(),
        );
        assert!(anchors.contains(&vt_base), "ctor vptr store must yield anchor {vt_base:#x}, got {anchors:?}");
    }

    #[test]
    fn rejects_2_method_run() {
        let image_base = 0x82000000u32;