Merge analysis-overhaul/m9-eh-flag (M8+M9+M10+M11+M12)

MechaCat02
2026-05-08 22:29:39 +02:00
11 changed files with 852 additions and 16 deletions


@@ -274,10 +274,134 @@ See `crates/xenia-analysis/src/lookup.rs`.
- 0 multiword / dcbz / byterev — these instructions exist in the binary
but are not in lis+addi-tracked code paths.
## Forward work (M8–M12, not yet landed)
## Layer M8 + M11 — Function-pointer arrays beyond vtables (landed)
- **M8** — dispatch-table heuristics beyond vtables (e.g. function-pointer arrays in `.data`).
- **M9** — `__CxxFrameHandler` exception scope-table parsing.
- **M10** — `.tls` section / TLS slot tracking.
- **M11** — `__xc_a` / `__xc_z` static-initializer driver detection.
- **M12** — comparative-PC-trace mode for canary diff (runtime side, not analyzer).
### Schema additions
- New table `function_pointer_arrays(address PK, length, kind)` where
`kind` is `'vtable'` (M3 re-emit), `'dispatch_table'` (M8), or
`'static_init'` (M11).
- New table `function_pointer_array_entries(array_address, slot,
function_address, PRIMARY KEY (array_address, slot))` — one row per
slot of every detected array (vtable + non-vtable).
- Indices on `function_pointer_arrays.kind` and
`function_pointer_array_entries.function_address`.
### What this layer does
- Walks `.rdata` (only — `.data` produces too many false positives) for
runs of ≥ 2 consecutive 4-byte BE values where each value is a known
function entry from M1's `functions` table.
- Skips runs whose start matches an M3 vtable head — those are re-emitted
in this table with `kind='vtable'` for unified queries but not
re-classified.
- Heuristically classifies non-vtable runs:
- `static_init` (M11): the run has at least 3 slots and every entry begins
with `mfspr rD, LR` (normally `r12`) followed by either
`stwu r1, -N(r1)` with `N ≤ 0x80` or a branch to a save stub. Mirrors the
typical C++ static-initialiser prologue.
- `dispatch_table` (M8): everything else.
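
The run detection described above can be sketched roughly as follows. This is a minimal illustration, not the analyzer's code: `find_pointer_runs` and its parameters are hypothetical names, and it assumes the section bytes are already sliced out and that the set of known function entries is available.

```rust
use std::collections::BTreeSet;

/// Hypothetical helper: returns (byte offset, entries) for every run of at
/// least `min_len` consecutive big-endian words that are known function entries.
fn find_pointer_runs(
    bytes: &[u8],
    known_functions: &BTreeSet<u32>,
    min_len: usize,
) -> Vec<(usize, Vec<u32>)> {
    let mut runs = Vec::new();
    let mut i = 0usize;
    while i + 4 <= bytes.len() {
        // Collect consecutive words for as long as each one is a known entry.
        let mut entries = Vec::new();
        let mut j = i;
        while j + 4 <= bytes.len() {
            let word = u32::from_be_bytes([bytes[j], bytes[j + 1], bytes[j + 2], bytes[j + 3]]);
            if !known_functions.contains(&word) {
                break;
            }
            entries.push(word);
            j += 4;
        }
        if entries.len() >= min_len {
            runs.push((i, entries));
            i = j; // resume just past the run
        } else {
            i += 4; // stay 4-byte aligned
        }
    }
    runs
}
```

A caller would then drop runs whose start address is an M3 vtable head and classify the rest.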
### What this layer does NOT do
- Does not parse symbol-table-bracketed regions like `__xc_a` / `__xc_z`
/ `__xi_a` / `__xi_z` directly — Sylpheed's symbol table is stripped.
- Does not chain multi-segment static-init drivers; future M11.5 could
walk the entry-point's static-init driver call chain to surface
ground-truth ctor PCs.
- Does not filter out 2-slot runs in `.rdata`, which may be false positives
where two struct fields happen to alias function VAs; downstream queries
should filter on length (`WHERE length >= 3`) when high precision matters.
### Sylpheed yield
- 722 vtables (M3 re-emit) + 388 dispatch_tables = 1,110 arrays in
`function_pointer_arrays`.
- 0 `static_init` detected: no run of 3 or more slots has every entry
matching the conservative prologue heuristic. Dispatch-table lengths
concentrate at 2 slots (typical of switch-case jump tables).
## Layer M9 — `has_eh` from `.pdata` exception flag (landed)
### Schema additions
- `functions.has_eh BOOLEAN NOT NULL` — true when `.pdata`'s exception-
handler-present bit (bit 31 of word 1, the high bit) is set.
- Index `idx_functions_has_eh`.
### What this layer does
- Derived directly from M1's already-parsed `pdata.flags` bit field (no
new parsing). The bit was always available in `pdata_entries.flags`;
this layer surfaces it as a first-class column on `functions`.
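
As a hedged illustration of the derivation: the text above places the exception bit at bit 31 of the second `.pdata` word, while the analyzer tests `flags & 0x2`. The two agree if the packed `flags` value is the top two bits of that word. That layout is an assumption made for this sketch, and all names below are hypothetical.

```rust
/// Exception-handler-present bit in the second `.pdata` word (bit 31).
const PDATA_EXCEPTION_BIT: u32 = 1 << 31;

/// ASSUMPTION: `flags` holds the top two bits of the word (`word >> 30`),
/// which would place the exception bit at mask 0x2.
fn packed_flags(word1: u32) -> u32 {
    word1 >> 30
}

/// Test the raw word directly.
fn has_eh_from_word(word1: u32) -> bool {
    (word1 & PDATA_EXCEPTION_BIT) != 0
}

/// Test the already-extracted flags value, as a derive-only pass would.
fn has_eh_from_flags(flags: u32) -> bool {
    (flags & 0x2) != 0
}
```

Under that assumption both predicates give the same answer for any word, so no new `.pdata` parsing is needed to populate `has_eh`.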
### What this layer does NOT do
- Does not parse the actual `__CxxFrameHandler` / `__C_specific_handler`
scope-table records that the exception bit gates. Walking those tables
would let us name try/catch ranges and per-state cleanup actions, but
is out of scope for a derive-only milestone.
### Sylpheed yield
- 2,975 of 23,073 pdata-validated functions have `has_eh=true` (12.9%) —
plausible MSVC C++ EH coverage rate. Largest EH function: 26,328 bytes
(`sub_823518F0`).
## Layer M10 — `.tls` section / TLS directory (landed)
### Schema additions
- New table `tls_info(raw_data_start, raw_data_end, index_address,
callback_array, zero_fill_size, characteristics)` — at most one row
(the IMAGE_TLS_DIRECTORY32).
- New table `tls_callbacks(slot PK, address)` — one row per resolved TLS
callback function.
### What this layer does
- Reads the first 24 bytes of the `.tls` section as an
`IMAGE_TLS_DIRECTORY32` and walks the zero-terminated callback array.
- All addresses stored as absolute VAs.
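
A rough sketch of the two steps above, assuming big-endian fields (an assumption, chosen to match the function-pointer scan) and a mapped image indexed by `VA - image_base`. The struct and function names are hypothetical; only the six-field, 24-byte `IMAGE_TLS_DIRECTORY32` layout comes from the PE format.

```rust
/// Hypothetical stand-in for the parsed directory; field names follow the
/// `tls_info` columns.
#[derive(Debug, PartialEq)]
struct TlsDirectory {
    raw_data_start: u32,
    raw_data_end: u32,
    index_address: u32,
    callback_array: u32,
    zero_fill_size: u32,
    characteristics: u32,
}

/// Read one big-endian u32, or None if it would run past the buffer.
fn read_be_u32(buf: &[u8], off: usize) -> Option<u32> {
    let end = off.checked_add(4)?;
    let b = buf.get(off..end)?;
    Some(u32::from_be_bytes([b[0], b[1], b[2], b[3]]))
}

/// Interpret the first 24 bytes of the section as the directory.
fn parse_tls_directory(section: &[u8]) -> Option<TlsDirectory> {
    Some(TlsDirectory {
        raw_data_start: read_be_u32(section, 0)?,
        raw_data_end: read_be_u32(section, 4)?,
        index_address: read_be_u32(section, 8)?,
        callback_array: read_be_u32(section, 12)?,
        zero_fill_size: read_be_u32(section, 16)?,
        characteristics: read_be_u32(section, 20)?,
    })
}

/// Walk the zero-terminated callback array; returns absolute VAs.
fn walk_callbacks(image: &[u8], image_base: u32, callback_array: u32) -> Vec<u32> {
    let mut out = Vec::new();
    if callback_array == 0 {
        return out; // no callback array
    }
    let mut off = match callback_array.checked_sub(image_base) {
        Some(o) => o as usize,
        None => return out, // VA below the image base
    };
    while let Some(va) = read_be_u32(image, off) {
        if va == 0 {
            break; // terminator
        }
        out.push(va);
        off += 4;
    }
    out
}
```

The walk stops at the first zero word or at the end of the image, whichever comes first, so a missing terminator cannot read out of bounds.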
### What this layer does NOT do
- Does not parse the raw TLS template content (the variable initialiser
block); just records its start/end VAs.
### Sylpheed yield
- 0 rows — Sylpheed has no `.tls` section. Infrastructure ready for any
binary that uses `__declspec(thread)` storage.
## Layer M12 — `--lr-trace` runtime canary-diff harness (landed)
### Runtime additions (no DB)
- New CLI flag `--lr-trace=PC[,PC,...]` on `exec` — comma-separated PCs
to capture as JSONL records on every fire. Symbolic tokens (`Class::method`)
resolve via M4's lookup against `--probe-db`. Settable via
`XENIA_LR_TRACE`.
- New CLI flag `--lr-trace-out=PATH` — writes JSONL to a file (one
record per line). Stdout when omitted. Settable via `XENIA_LR_TRACE_OUT`.
- New kernel state fields `lr_trace_pcs: HashSet<u32>` +
`lr_trace_writer: Option<Mutex<File>>` and helper
`KernelState::fire_lr_trace_if_match(hw_id)` invoked from the
per-instruction probe slot.
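
The flag handling can be sketched as below. The CLI-over-environment precedence follows the description above; the numeric token grammar (hex, with or without a `0x` prefix) is an assumption for illustration, and symbolic tokens are left out because they need the probe database. Function names are hypothetical.

```rust
use std::collections::HashSet;

/// The CLI value wins; a non-empty environment value is the fallback.
fn pick_source(cli: Option<&str>, env: Option<String>) -> Option<String> {
    match (cli, env) {
        (Some(s), _) => Some(s.to_string()),
        (None, Some(s)) if !s.is_empty() => Some(s),
        _ => None,
    }
}

/// Parse a comma-separated list of numeric PCs into a set.
/// ASSUMPTION: tokens are hexadecimal, optionally prefixed with `0x`.
fn parse_pc_list(list: &str) -> Result<HashSet<u32>, String> {
    let mut pcs = HashSet::new();
    for token in list.split(',').map(str::trim).filter(|s| !s.is_empty()) {
        let digits = token
            .strip_prefix("0x")
            .or_else(|| token.strip_prefix("0X"))
            .unwrap_or(token);
        let pc = u32::from_str_radix(digits, 16)
            .map_err(|e| format!("--lr-trace {:?}: {}", token, e))?;
        pcs.insert(pc);
    }
    Ok(pcs)
}
```

Empty tokens (for example a trailing comma) are skipped instead of being reported as errors.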
### JSONL record fields
`pc, tid, hw, cycle, r3, r4, r5, r6, lr` — superset of what
xenia-canary's `--log_lr_on_pc` patch emits, with a cycle counter added
for cross-run reproducibility.
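
For orientation, one record could be built like this. The field names and order come from the list above; the integer widths and the decimal number formatting are assumptions for this sketch, and `LrTraceRecord` is a hypothetical type.

```rust
/// Hypothetical in-memory form of one trace record.
/// ASSUMPTION: field widths are illustrative only.
struct LrTraceRecord {
    pc: u32,
    tid: u32,
    hw: u8,
    cycle: u64,
    r3: u32,
    r4: u32,
    r5: u32,
    r6: u32,
    lr: u32,
}

impl LrTraceRecord {
    /// Render the record as one JSON object with no trailing newline.
    /// ASSUMPTION: numbers are emitted in decimal.
    fn to_jsonl(&self) -> String {
        format!(
            "{{\"pc\":{},\"tid\":{},\"hw\":{},\"cycle\":{},\"r3\":{},\"r4\":{},\"r5\":{},\"r6\":{},\"lr\":{}}}",
            self.pc, self.tid, self.hw, self.cycle, self.r3, self.r4, self.r5, self.r6, self.lr
        )
    }
}
```

Writing each rendered string followed by `\n` gives one record per line, which keeps two runs easy to compare with ordinary line-based diff tools.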
### What this layer does NOT do
- Does not capture VMX / FP register state (only GPRs r3..r6).
- Does not buffer or batch records: one `write_all` per fire. For
high-frequency probes (e.g. tight loops at >1M fires/sec), redirect
to a file and use an SSD.
### Determinism
Lockstep digest unaffected: probe firing happens after the per-instr
hooks for ctor/branch probes and only emits side-channel output. Verified
end-of-session: `check sylpheed.iso --stable-digest -n 2M` ×2 produced
byte-identical digests (`instructions=2000005`).
---
## Forward work (not yet landed)
- **M5.5** — `this`-flow extension to M5. Resolve vtable dispatches via
`lwz vt, off(this)` patterns by tracing constructor-side vptr writes.
Highest-value future work for the audit-009 cluster renderer hunt.
- **M9.5** — full `__CxxFrameHandler` scope-table parsing (try/catch
range names, per-state cleanup actions).
- **M11.5** — walk the static-initialiser driver call chain from the
entry point to surface ground-truth ctor PCs.
- VMX/VMX128 vector-store xref emission (M6 follow-up).
- UTF-8 / shift_jis localised-string detection in `.rdata` (M7 follow-up).


@@ -306,7 +306,8 @@ impl DbWriter {
///
/// `vtables` is the M3 result; pass an empty slice when the caller has
/// not run the vtable scan (the tables are still created, just empty).
/// `strings` is the M7 result; same convention.
/// `strings` is the M7 result; same convention. `funcptr_arrays` is the
/// M8/M11 result.
#[tracing::instrument(skip_all, name = "db.write_analysis_results")]
pub fn write_analysis_results(
&mut self,
@@ -317,6 +318,7 @@ impl DbWriter {
xrefs: &XrefMap,
vtables: &[crate::vtables::Vtable],
strings: &[crate::strings::DetectedString],
funcptr_arrays: &[crate::funcptr_arrays::FuncPtrArray],
) -> anyhow::Result<()> {
self.conn.execute_batch("
CREATE TABLE functions (
@@ -328,7 +330,8 @@ impl DbWriter {
is_leaf BOOLEAN NOT NULL, -- true if the function has no outgoing calls
is_saverestore BOOLEAN NOT NULL, -- true if __savegprlr_* / __restgprlr_* stub
pdata_validated BOOLEAN NOT NULL, -- true if .pdata RUNTIME_FUNCTION exists at this VA
pdata_length BIGINT -- length in bytes per .pdata; NULL if no pdata entry
pdata_length BIGINT, -- length in bytes per .pdata; NULL if no pdata entry
has_eh BOOLEAN NOT NULL -- M9: pdata exception-flag bit set; function has C++ EH/SEH
);
CREATE TABLE pdata_entries (
@@ -377,6 +380,33 @@ impl DbWriter {
content VARCHAR NOT NULL -- UTF-8 representation of the string
);
CREATE TABLE tls_info (
raw_data_start BIGINT NOT NULL, -- VA of TLS template start
raw_data_end BIGINT NOT NULL, -- VA one-past-end of TLS template
index_address BIGINT NOT NULL, -- VA of u32 the loader writes the assigned slot index into
callback_array BIGINT NOT NULL, -- VA of zero-terminated callback array (0 if none)
zero_fill_size BIGINT NOT NULL, -- bytes of zero-fill appended after raw template
characteristics BIGINT NOT NULL -- IMAGE_TLS_DIRECTORY characteristics flags
);
CREATE TABLE tls_callbacks (
slot BIGINT PRIMARY KEY, -- 0-based index in the callback array
address BIGINT NOT NULL -- VA of callback function
);
CREATE TABLE function_pointer_arrays (
address BIGINT PRIMARY KEY, -- absolute VA of the array's first slot
length BIGINT NOT NULL, -- number of slots
kind VARCHAR NOT NULL -- 'vtable' (M3) | 'dispatch_table' (M8) | 'static_init' (M11)
);
CREATE TABLE function_pointer_array_entries (
array_address BIGINT NOT NULL, -- FK to function_pointer_arrays.address
slot BIGINT NOT NULL, -- 0-based slot index
function_address BIGINT NOT NULL, -- VA of the function this slot points at
PRIMARY KEY (array_address, slot)
);
CREATE TABLE demangled_names (
address BIGINT, -- VA the mangled name is associated with; NULL when from a non-address source (e.g. RTTI-only string)
mangled VARCHAR NOT NULL, -- original mangled symbol (e.g. ?Foo@Bar@@QEAAXXZ)
@@ -406,11 +436,13 @@ impl DbWriter {
insert_vtables(&self.conn, vtables, pe, info.image_base)?;
insert_methods_and_classes(&self.conn, vtables, labels)?;
insert_strings(&self.conn, strings)?;
insert_funcptr_arrays(&self.conn, funcptr_arrays)?;
insert_xrefs_streaming(&self.conn, xrefs, pe, info.image_base, func_analysis, labels)?;
let indices = [
("idx_functions_name", "CREATE INDEX idx_functions_name ON functions(name)"),
("idx_functions_pdata_validated", "CREATE INDEX idx_functions_pdata_validated ON functions(pdata_validated)"),
("idx_functions_has_eh", "CREATE INDEX idx_functions_has_eh ON functions(has_eh)"),
("idx_labels_kind", "CREATE INDEX idx_labels_kind ON labels(kind)"),
("idx_labels_name", "CREATE INDEX idx_labels_name ON labels(name)"),
("idx_demangled_address", "CREATE INDEX idx_demangled_address ON demangled_names(address)"),
@@ -420,6 +452,8 @@ impl DbWriter {
("idx_classes_rtti", "CREATE INDEX idx_classes_rtti ON classes(rtti_present)"),
("idx_strings_encoding", "CREATE INDEX idx_strings_encoding ON strings(encoding)"),
("idx_xrefs_addr_mode", "CREATE INDEX idx_xrefs_addr_mode ON xrefs(addr_mode)"),
("idx_fparrays_kind", "CREATE INDEX idx_fparrays_kind ON function_pointer_arrays(kind)"),
("idx_fpentries_function", "CREATE INDEX idx_fpentries_function ON function_pointer_array_entries(function_address)"),
("idx_xrefs_target", "CREATE INDEX idx_xrefs_target ON xrefs(target)"),
("idx_xrefs_source", "CREATE INDEX idx_xrefs_source ON xrefs(source)"),
("idx_xrefs_source_func", "CREATE INDEX idx_xrefs_source_func ON xrefs(source_func)"),
@@ -448,7 +482,39 @@ impl DbWriter {
xrefs: &XrefMap,
) -> anyhow::Result<()> {
self.ingest_instructions(pe, info, func_analysis, labels)?;
self.write_analysis_results(pe, info, func_analysis, labels, xrefs, &[], &[])?;
self.write_analysis_results(pe, info, func_analysis, labels, xrefs, &[], &[], &[])?;
Ok(())
}
/// M10 — write the parsed `.tls` directory + callback array. No-op
/// when `tls` is `None` (binary has no `.tls` section).
#[tracing::instrument(skip_all, name = "db.write_tls")]
pub fn write_tls(
&mut self,
tls: Option<&xenia_xex::tls::TlsInfo>,
) -> anyhow::Result<()> {
let Some(t) = tls else { return Ok(()); };
self.conn.execute(
"INSERT INTO tls_info (raw_data_start, raw_data_end, index_address,
callback_array, zero_fill_size, characteristics)
VALUES (?, ?, ?, ?, ?, ?)",
params![
t.raw_data_start as i64,
t.raw_data_end as i64,
t.index_address as i64,
t.callback_array as i64,
t.zero_fill_size as i64,
t.characteristics as i64,
],
)?;
let mut stmt = self.conn.prepare(
"INSERT INTO tls_callbacks (slot, address) VALUES (?, ?)"
)?;
for (i, cb) in t.callbacks.iter().enumerate() {
stmt.execute(params![i as i64, cb.address as i64])?;
}
metrics::counter!("db.rows", "table" => "tls_callbacks").increment(t.callbacks.len() as u64);
tracing::info!(rows = t.callbacks.len(), table = "tls_callbacks", "tls write complete");
Ok(())
}
@@ -755,8 +821,8 @@ fn insert_functions(
let mut stmt = conn.prepare(
"INSERT INTO functions
(address, name, end_address, frame_size, saved_gprs, is_leaf, is_saverestore,
pdata_validated, pdata_length)
VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)"
pdata_validated, pdata_length, has_eh)
VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)"
)?;
for (&addr, fi) in &func_analysis.functions {
let name = labels.get(&addr)
@@ -772,6 +838,7 @@ fn insert_functions(
fi.is_saverestore,
fi.pdata_validated,
fi.pdata_length.map(|n| n as i64),
fi.has_eh,
])?;
}
Ok(())
@@ -884,6 +951,37 @@ fn insert_strings(
Ok(())
}
fn insert_funcptr_arrays(
conn: &Connection,
arrays: &[crate::funcptr_arrays::FuncPtrArray],
) -> anyhow::Result<()> {
if arrays.is_empty() { return Ok(()); }
let mut stmt_arr = conn.prepare(
"INSERT INTO function_pointer_arrays (address, length, kind) VALUES (?, ?, ?)
ON CONFLICT DO NOTHING"
)?;
let mut stmt_ent = conn.prepare(
"INSERT INTO function_pointer_array_entries (array_address, slot, function_address)
VALUES (?, ?, ?) ON CONFLICT DO NOTHING"
)?;
let mut n_arr = 0u64;
let mut n_ent = 0u64;
for a in arrays {
let inserted = stmt_arr.execute(params![
a.address as i64, a.length as i64, a.kind,
])?;
if inserted > 0 { n_arr += 1; }
for (i, &fn_va) in a.entries.iter().enumerate() {
// Count only rows actually inserted; conflicts are skipped by ON CONFLICT DO NOTHING.
n_ent += stmt_ent.execute(params![a.address as i64, i as i64, fn_va as i64])? as u64;
}
}
metrics::counter!("db.rows", "table" => "function_pointer_arrays").increment(n_arr);
metrics::counter!("db.rows", "table" => "function_pointer_array_entries").increment(n_ent);
tracing::info!(arrays = n_arr, entries = n_ent, "function-pointer arrays insert complete");
Ok(())
}
fn insert_demangled_from_labels(
conn: &Connection,
labels: &HashMap<u32, String>,


@@ -39,6 +39,10 @@ pub struct FuncInfo {
/// Function size in bytes per `.pdata`'s `function_length` field, if known.
/// Absent (None) when this row is prologue-only.
pub pdata_length: Option<u32>,
/// True when `.pdata`'s exception-flag bit is set on this entry — the
/// function has a registered C++ EH (or SEH) frame handler. Always false
/// for entries without `.pdata` coverage. (M9)
pub has_eh: bool,
}
/// Result of the function analysis pass.
@@ -296,6 +300,8 @@ pub fn analyze_with_pdata(
if let Some(p) = pdata_entry {
fi.pdata_validated = true;
fi.pdata_length = Some(p.function_length);
// mask 0x2 (bit 1) of the packed flags = exception-handler-present
fi.has_eh = (p.flags & 0x2) != 0;
// If the prologue walk ended too early, trust pdata's length.
let pdata_end = p.begin_address.wrapping_add(p.function_length);
if pdata_end > fi.end {
@@ -317,6 +323,7 @@ pub fn analyze_with_pdata(
is_saverestore: false,
pdata_validated: true,
pdata_length: Some(p.function_length),
has_eh: (p.flags & 0x2) != 0,
},
);
}
@@ -326,6 +333,7 @@ pub fn analyze_with_pdata(
if let Some(sb) = save_base {
// The save block is one cascade: entry at each rN, falls through to blr
// Treat as a single function with the first entry point
let pe_sb = pdata_by_begin.get(&sb).copied();
functions.insert(sb, FuncInfo {
start: sb,
end: sb + 20 * 4, // 18 std + stw r12 + blr
@@ -333,11 +341,13 @@ pub fn analyze_with_pdata(
saved_gprs: 18,
is_leaf: true,
is_saverestore: true,
pdata_validated: pdata_by_begin.contains_key(&sb),
pdata_length: pdata_by_begin.get(&sb).map(|p| p.function_length),
pdata_validated: pe_sb.is_some(),
pdata_length: pe_sb.map(|p| p.function_length),
has_eh: pe_sb.map(|p| (p.flags & 0x2) != 0).unwrap_or(false),
});
}
if let Some(rb) = restore_base {
let pe_rb = pdata_by_begin.get(&rb).copied();
functions.insert(rb, FuncInfo {
start: rb,
end: rb + 21 * 4, // 18 ld + lwz r12 + mtspr LR + blr
@@ -345,8 +355,9 @@ pub fn analyze_with_pdata(
saved_gprs: 18,
is_leaf: true,
is_saverestore: true,
pdata_validated: pdata_by_begin.contains_key(&rb),
pdata_length: pdata_by_begin.get(&rb).map(|p| p.function_length),
pdata_validated: pe_rb.is_some(),
pdata_length: pe_rb.map(|p| p.function_length),
has_eh: pe_rb.map(|p| (p.flags & 0x2) != 0).unwrap_or(false),
});
}
@@ -498,6 +509,7 @@ fn analyze_function(
is_saverestore: false,
pdata_validated: false,
pdata_length: None,
has_eh: false,
})
}


@@ -0,0 +1,257 @@
//! Generic function-pointer array detection (M8 + M11).
//!
//! M3 already detects "vtable" candidates — runs of ≥3 contiguous function
//! pointers in `.rdata` / `.data` (with COL/RTTI walk on top). This module
//! widens the net:
//!
//! - **Dispatch tables** (M8): runs of ≥2 function pointers in `.rdata`
//! that are NOT already classified as vtables. Captures switch
//! jump tables, callback registries, command tables, gameplay state
//! machines, etc.
//! - **Static initialiser tables** (M11): function-pointer arrays in
//! `.rdata` whose entries all have classic constructor-like prologues
//! (small frame; either leaf or calling well-known runtime helpers).
//! The MSVC convention names the bracketing symbols `__xc_a` /
//! `__xc_z` (C++ ctors) and `__xi_a` / `__xi_z` (C runtime), but the
//! names are stripped from Sylpheed; we classify by structure.
//!
//! All findings are written to a single `function_pointer_arrays` table
//! with a `kind` column — `"vtable"`, `"dispatch_table"`, or `"static_init"`.
//! Vtable rows are duplicated from M3's `vtables` table for join
//! convenience (so a single query covers all classification kinds).
//!
//! ### What this module does NOT do
//!
//! - No alias-based classification — `static_init` is heuristic and may
//! include any function-pointer array near the binary's `__xc_*` region.
//! - Does not parse the bracket symbols' actual addresses — we'd need
//! debug symbols, which Sylpheed doesn't ship.
//! - Two-element runs in `.data` are common false positives (struct fields
//! that happen to alias function entries); we only emit `dispatch_table`
//! rows for `.rdata`.
use std::collections::BTreeSet;
use xenia_xex::pe::PeSection;
use crate::vtables::Vtable;
/// One detected function-pointer array.
#[derive(Debug, Clone)]
pub struct FuncPtrArray {
pub address: u32,
pub length: u32,
pub kind: &'static str, // "vtable" | "dispatch_table" | "static_init"
/// Array entries (function VAs).
pub entries: Vec<u32>,
}
/// Run the pass. `vtables` is the M3 result — those addresses are skipped
/// in the dispatch-table scan to avoid duplication. `function_starts` is
/// the M1 corrected function-start set (used to validate that each array
/// entry actually points at a known function).
#[tracing::instrument(skip_all, fields(image_base = format_args!("{:#010x}", image_base)))]
pub fn analyze(
pe: &[u8],
image_base: u32,
sections: &[PeSection],
function_starts: &BTreeSet<u32>,
vtables: &[Vtable],
) -> Vec<FuncPtrArray> {
let started = std::time::Instant::now();
let vtable_addrs: BTreeSet<u32> = vtables.iter().map(|v| v.address).collect();
let mut out: Vec<FuncPtrArray> = Vec::new();
// Re-emit vtables in this table for unified-query convenience.
for v in vtables {
out.push(FuncPtrArray {
address: v.address,
length: v.length,
kind: "vtable",
entries: v.methods.clone(),
});
}
// Scan only .rdata for dispatch tables — .data has too many false
// positives from struct fields aliasing function VAs.
for section in sections {
if section.name != ".rdata" { continue; }
let raw_start = section.virtual_address as usize;
let raw_end = (section.virtual_address + section.virtual_size) as usize;
if raw_end > pe.len() { continue; }
let bytes = &pe[raw_start..raw_end]; // raw_end <= pe.len() is guaranteed by the guard above
let va_base = image_base + section.virtual_address;
let mut i = 0usize;
while i + 8 <= bytes.len() {
if !i.is_multiple_of(4) { i += 1; continue; }
let mut entries: Vec<u32> = Vec::new();
let mut j = i;
while j + 4 <= bytes.len() {
let val = u32::from_be_bytes([bytes[j], bytes[j + 1], bytes[j + 2], bytes[j + 3]]);
if function_starts.contains(&val) {
entries.push(val);
j += 4;
} else {
break;
}
}
if entries.len() >= 2 {
let address = va_base + (i as u32);
if !vtable_addrs.contains(&address) {
let kind = classify_run(image_base, &entries, pe);
out.push(FuncPtrArray {
address,
length: entries.len() as u32,
kind,
entries,
});
}
i = j; // resume scanning just past the run
} else {
i += 4;
}
}
}
let elapsed_ms = started.elapsed().as_millis() as f64;
let n_vt = out.iter().filter(|a| a.kind == "vtable").count();
let n_dt = out.iter().filter(|a| a.kind == "dispatch_table").count();
let n_si = out.iter().filter(|a| a.kind == "static_init").count();
metrics::histogram!("analysis.phase_ms", "phase" => "funcptr_arrays").record(elapsed_ms);
tracing::info!(
total = out.len(), vtable = n_vt, dispatch_table = n_dt, static_init = n_si,
elapsed_ms,
"function-pointer array scan complete",
);
out
}
/// Classify a non-vtable function-pointer array. Currently distinguishes
/// only "static_init" (at least 3 entries, all with constructor-like
/// prologues: `mfspr rD, LR` followed by a small-frame `stwu` or a branch
/// to a save stub) from "dispatch_table" (anything else).
fn classify_run(image_base: u32, entries: &[u32], pe: &[u8]) -> &'static str {
// Heuristic: a static initialiser's prologue is small (frame ≤ 0x80,
// typically ≤ 0x40). If every entry starts with `mfspr rD, LR`
// (opcode 31, xo 339, spr 8) followed by a small stwu or a save-stub
// branch, and the run has at least 3 entries, classify as static_init.
let mut all_ctor = true;
let mut any_ctor = false;
for &fn_va in entries {
if !is_ctor_like(pe, image_base, fn_va) {
all_ctor = false;
} else {
any_ctor = true;
}
}
if all_ctor && any_ctor && entries.len() >= 3 {
"static_init"
} else {
"dispatch_table"
}
}
/// True if the function at `fn_va` looks like a tiny C++ static initialiser:
/// `mfspr rD, LR` immediately followed by either `stwu r1, -N(r1)` with
/// `N ≤ 0x80` or an opcode-18 branch (assumed to be a save-stub call).
fn is_ctor_like(pe: &[u8], image_base: u32, fn_va: u32) -> bool {
let off = fn_va.wrapping_sub(image_base) as usize;
if off + 8 > pe.len() { return false; }
let i0 = u32::from_be_bytes([pe[off], pe[off + 1], pe[off + 2], pe[off + 3]]);
let i1 = u32::from_be_bytes([pe[off + 4], pe[off + 5], pe[off + 6], pe[off + 7]]);
// i0: mfspr rD, LR — opcode 31, xo 339, spr 8.
let op0 = i0 >> 26;
let xo0 = (i0 >> 1) & 0x3FF;
let spr0 = (((i0 >> 11) & 0x1F) << 5) | ((i0 >> 16) & 0x1F);
if !(op0 == 31 && xo0 == 339 && spr0 == 8) { return false; }
// i1 must be stwu r1, -N(r1) with N ≤ 0x80, OR a `bl __savegprlr_*`
// followed eventually by stwu (full prologue). Allow either.
let op1 = i1 >> 26;
if op1 == 37 {
// stwu D-form: rS=1, rA=1
let rs = (i1 >> 21) & 0x1F;
let ra = (i1 >> 16) & 0x1F;
let d = ((i1 & 0xFFFF) as i16) as i32;
rs == 1 && ra == 1 && d <= 0 && (-d) <= 0x80
} else if op1 == 18 {
// Opcode 18 (I-form branch), assumed to be `bl __savegprlr_NN`:
// accept. The LK bit and branch target are not checked, and we
// can't easily verify the frame size without walking further.
true
} else {
false
}
}
#[cfg(test)]
mod tests {
use super::*;
use xenia_xex::pe::PeSection;
fn mk_section(name: &str, va: u32, size: u32) -> PeSection {
PeSection {
name: name.into(),
virtual_address: va,
virtual_size: size,
raw_offset: va,
raw_size: size,
flags: 0x4000_0040,
}
}
fn write_be_u32(buf: &mut [u8], at: usize, val: u32) {
buf[at..at + 4].copy_from_slice(&val.to_be_bytes());
}
#[test]
fn detects_dispatch_table_in_rdata() {
let image_base = 0x82000000u32;
let rdata_va = 0x1000u32;
let mut pe = vec![0u8; 0x4000];
// Two consecutive function pointers, no vtable shadowing them.
let pcs = [image_base + 0x2000, image_base + 0x2010];
for (i, p) in pcs.iter().enumerate() {
write_be_u32(&mut pe, rdata_va as usize + i * 4, *p);
}
let sections = vec![mk_section(".rdata", rdata_va, 0x100)];
let mut starts = BTreeSet::new();
for &p in &pcs { starts.insert(p); }
let arrs = analyze(&pe, image_base, &sections, &starts, &[]);
assert_eq!(arrs.len(), 1);
assert_eq!(arrs[0].kind, "dispatch_table");
assert_eq!(arrs[0].length, 2);
}
#[test]
fn vtable_overrides_dispatch_classification() {
let image_base = 0x82000000u32;
let rdata_va = 0x1000u32;
let mut pe = vec![0u8; 0x4000];
let pcs = [image_base + 0x2000, image_base + 0x2010, image_base + 0x2020];
for (i, p) in pcs.iter().enumerate() {
write_be_u32(&mut pe, rdata_va as usize + i * 4, *p);
}
let sections = vec![mk_section(".rdata", rdata_va, 0x100)];
let mut starts = BTreeSet::new();
for &p in &pcs { starts.insert(p); }
let vt = Vtable {
address: image_base + rdata_va,
length: 3,
col_address: None,
class_name: "ANON_test".into(),
rtti_present: false,
base_classes_json: None,
methods: pcs.to_vec(),
};
let arrs = analyze(&pe, image_base, &sections, &starts, &[vt]);
// Exactly one row: the M3 vtable is re-emitted with kind "vtable", and
// the scan skips the same address rather than re-classifying it.
assert_eq!(arrs.len(), 1);
assert_eq!(arrs[0].kind, "vtable");
}
}


@@ -374,6 +374,7 @@ mod tests {
is_saverestore: false,
pdata_validated: false,
pdata_length: None,
has_eh: false,
});
let func_analysis = FuncAnalysis {
functions,
@@ -414,6 +415,7 @@ mod tests {
is_saverestore: false,
pdata_validated: false,
pdata_length: None,
has_eh: false,
});
let func_analysis = FuncAnalysis {
functions,
@@ -448,6 +450,7 @@ mod tests {
is_saverestore: false,
pdata_validated: false,
pdata_length: None,
has_eh: false,
});
let func_analysis = FuncAnalysis {
functions,


@@ -11,6 +11,7 @@ pub mod vtables;
pub mod lookup;
pub mod indirect;
pub mod strings;
pub mod funcptr_arrays;
mod ordinals;
pub use ordinals::resolve_ordinal;


@@ -67,6 +67,7 @@ fn synthetic_func_analysis(image_base: u32) -> FuncAnalysis {
is_saverestore: false,
pdata_validated: false,
pdata_length: None,
has_eh: false,
},
);
FuncAnalysis {
@@ -106,7 +107,7 @@ fn db_schema_matches_expected_columns() {
w.write_base(&info).expect("write_base");
w.ingest_instructions(&pe, &info, &func_analysis, &labels)
.expect("ingest_instructions");
w.write_analysis_results(&pe, &info, &func_analysis, &labels, &xrefs, &[], &[])
w.write_analysis_results(&pe, &info, &func_analysis, &labels, &xrefs, &[], &[], &[])
.expect("write_analysis_results");
w.create_sql_views().expect("create_sql_views");
}
@@ -159,6 +160,7 @@ fn db_schema_matches_expected_columns() {
("is_saverestore", "BOOLEAN"),
("pdata_validated", "BOOLEAN"),
("pdata_length", "BIGINT"),
("has_eh", "BOOLEAN"),
]),
("pdata_entries", &[
("begin_address", "BIGINT"),
@@ -208,6 +210,28 @@ fn db_schema_matches_expected_columns() {
("length", "BIGINT"),
("content", "VARCHAR"),
]),
("tls_info", &[
("raw_data_start", "BIGINT"),
("raw_data_end", "BIGINT"),
("index_address", "BIGINT"),
("callback_array", "BIGINT"),
("zero_fill_size", "BIGINT"),
("characteristics", "BIGINT"),
]),
("tls_callbacks", &[
("slot", "BIGINT"),
("address", "BIGINT"),
]),
("function_pointer_arrays", &[
("address", "BIGINT"),
("length", "BIGINT"),
("kind", "VARCHAR"),
]),
("function_pointer_array_entries", &[
("array_address", "BIGINT"),
("slot", "BIGINT"),
("function_address", "BIGINT"),
]),
("xrefs", &[
("source", "BIGINT"),
("target", "BIGINT"),


@@ -230,6 +230,18 @@ enum Commands {
/// Default: `sylpheed.db` next to the .iso file when present.
#[arg(long)]
probe_db: Option<String>,
/// M12 — comma-separated PCs to capture as JSONL records on every
/// fire. Designed to diff against xenia-canary's `--log_lr_on_pc`
/// patch. Each record carries pc/tid/hw/cycle/r3/r4/r5/r6/lr.
/// Symbolic resolution (`Class::method`) is supported via M4 and
/// reads `--probe-db`. Settable via `XENIA_LR_TRACE`.
/// Read-only; lockstep digest unaffected.
#[arg(long)]
lr_trace: Option<String>,
/// M12 — write `--lr-trace` JSONL to this file (one record per
/// line). Stdout when omitted.
#[arg(long)]
lr_trace_out: Option<String>,
},
/// Browse XISO disc image contents
Browse {
@@ -391,6 +403,8 @@ fn main() -> Result<()> {
mem_watch,
dump_section,
probe_db,
lr_trace,
lr_trace_out,
} => cmd_exec(
&path,
max_instructions,
@@ -415,6 +429,8 @@ fn main() -> Result<()> {
mem_watch.as_deref(),
dump_section.as_deref(),
probe_db.as_deref(),
lr_trace.as_deref(),
lr_trace_out.as_deref(),
),
Commands::Browse { path } => cmd_browse(&path),
Commands::Info { path } => cmd_info(&path),
@@ -644,6 +660,8 @@ fn cmd_exec(
mem_watch: Option<&str>,
dump_section: Option<&str>,
probe_db: Option<&str>,
lr_trace: Option<&str>,
lr_trace_out: Option<&str>,
) -> Result<()> {
cmd_exec_inner(
path,
@@ -669,6 +687,8 @@ fn cmd_exec(
mem_watch,
dump_section,
probe_db,
lr_trace,
lr_trace_out,
None,
None,
false,
@@ -713,6 +733,8 @@ fn cmd_check(
None, // mem_watch — same
None, // dump_section — same
None, // probe_db — same
None, // lr_trace — same
None, // lr_trace_out — same
out,
expect,
stable_digest,
@@ -743,6 +765,8 @@ fn cmd_exec_inner(
mem_watch: Option<&str>,
dump_section: Option<&str>,
probe_db: Option<&str>,
lr_trace: Option<&str>,
lr_trace_out: Option<&str>,
digest_out: Option<&str>,
digest_expect: Option<&str>,
stable_digest: bool,
@@ -1080,6 +1104,50 @@ fn cmd_exec_inner(
}
}
// M12 — LR trace (canary-diff). Same token grammar as --pc-probe;
// optional `--lr-trace-out=PATH` redirects JSONL to a file.
let lr_trace_combined: Option<String> = match (
lr_trace, std::env::var("XENIA_LR_TRACE").ok(),
) {
(Some(s), _) => Some(s.to_string()),
(None, Some(s)) if !s.is_empty() => Some(s),
_ => None,
};
if let Some(list) = lr_trace_combined {
for token in list.split(',').map(str::trim).filter(|s| !s.is_empty()) {
let pcs = xenia_analysis::lookup::resolve_probe_token(probe_db_path.as_deref(), token)
.map_err(|e| anyhow::anyhow!("--lr-trace {token:?}: {e}"))?;
for pc in pcs {
kernel.lr_trace_pcs.insert(pc);
}
}
// Open the writer if --lr-trace-out is set.
let out_combined: Option<String> = match (
lr_trace_out.map(|s| s.to_string()),
std::env::var("XENIA_LR_TRACE_OUT").ok(),
) {
(Some(s), _) => Some(s),
(None, Some(s)) if !s.is_empty() => Some(s),
_ => None,
};
if let Some(p) = out_combined {
let f = std::fs::File::create(&p)
.map_err(|e| anyhow::anyhow!("--lr-trace-out {p:?}: {e}"))?;
kernel.lr_trace_writer = Some(std::sync::Mutex::new(f));
}
if !quiet && !kernel.lr_trace_pcs.is_empty() {
let mut pcs: Vec<u32> = kernel.lr_trace_pcs.iter().copied().collect();
pcs.sort_unstable();
let strs: Vec<String> = pcs.iter().map(|p| format!("{p:#010x}")).collect();
tracing::info!(
"lr-trace armed: {} ({}); sink={}",
kernel.lr_trace_pcs.len(),
strs.join(", "),
if kernel.lr_trace_writer.is_some() { "file" } else { "stdout" },
);
}
}
// Diagnostic. Parse `--dump-addr=0x828F3D08,...` (or
// `XENIA_DUMP_ADDR=...`) into `kernel.dump_addrs`. The contents
// are dumped at end-of-run by `dump_thread_diagnostic`. Pure
@@ -2131,6 +2199,7 @@ fn worker_prologue(
// the helper, no overhead on the hot path.
kernel.fire_ctor_probe_if_match(hw_id, mem);
kernel.fire_branch_probe_if_match(hw_id);
kernel.fire_lr_trace_if_match(hw_id);
if mem.has_mem_watch() {
let ctx = kernel.scheduler.ctx(hw_id);
@@ -4129,6 +4198,26 @@ fn cmd_dis(
let strings = xenia_analysis::strings::analyze(&pe_image, base, &sections);
info!(strings = strings.len(), "string scan complete");
// .tls directory parse (M10). None for binaries without a .tls section.
let tls_info = xenia_xex::tls::parse_tls(&pe_image, base, &sections);
if let Some(ref t) = tls_info {
info!(callbacks = t.callbacks.len(), "tls directory parsed");
} else {
info!("no .tls section present");
}
// Generic function-pointer-array scan (M8 + M11). Re-emits M3 vtables
// plus dispatch tables and static-init tables in `.rdata`.
let fparrays = xenia_analysis::funcptr_arrays::analyze(
&pe_image, base, &sections, &function_starts, &vtables,
);
info!(
funcptr_arrays = fparrays.len(),
dispatch_tables = fparrays.iter().filter(|a| a.kind == "dispatch_table").count(),
static_inits = fparrays.iter().filter(|a| a.kind == "static_init").count(),
"function-pointer array scan complete",
);
// Build DisasmInfo
let disasm_info = xenia_analysis::formatter::DisasmInfo {
image_base: base,
@@ -4154,7 +4243,9 @@ fn cmd_dis(
&xref_result.xrefs,
&vtables,
&strings,
&fparrays,
)?;
w.write_tls(tls_info.as_ref())?;
if matches!(analyze, AnalyzeMode::Sql | AnalyzeMode::Both) {
w.create_sql_views()?;
info!(db = %db, "SQL views created");
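
Review note on the `--dump-addr=0x828F3D08,...` comment in the hunk above: it describes a comma-separated list of hex guest addresses, but the parsing code itself is not part of this diff. A minimal sketch of that step, assuming a hypothetical `parse_hex_list` helper (not a function from this codebase) and an optional `0x` prefix on each entry, might look like this:

```rust
/// Hypothetical helper, not taken from the codebase: parse a
/// comma-separated list of hex addresses such as
/// "0x828F3D08,0x828F4070" into guest addresses.
pub fn parse_hex_list(arg: &str) -> Result<Vec<u32>, String> {
    arg.split(',')
        .map(str::trim)
        // Tolerate a trailing comma or stray whitespace between entries.
        .filter(|s| !s.is_empty())
        .map(|s| {
            // Accept entries with or without a "0x" / "0X" prefix.
            let digits = s
                .strip_prefix("0x")
                .or_else(|| s.strip_prefix("0X"))
                .unwrap_or(s);
            u32::from_str_radix(digits, 16)
                .map_err(|e| format!("bad address {s:?}: {e}"))
        })
        .collect()
}
```

Returning a `Result` rather than silently dropping bad entries keeps a mistyped address visible at startup; whether the real flag parser behaves that way is not shown in this diff.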


@@ -230,6 +230,17 @@ pub struct KernelState {
/// Distinct from `ctor_probe_pcs` because that helper emits 8
/// frames of back-chain per hit — too noisy for branch tracing.
pub branch_probe_pcs: std::collections::HashSet<u32>,
/// M12 — diagnostic. PCs at which to emit a structured JSONL record
/// per fire, designed for diffing against xenia-canary's
/// `--log_lr_on_pc` patch output. Each line carries
/// `(pc, tid, hw, cycle, r3, r4, r5, r6, lr)` — a superset of what
/// canary logs. Settable via `--lr-trace` / `XENIA_LR_TRACE`. Stdout
/// by default; redirect with `--lr-trace-out=PATH`. Read-only;
/// lockstep digest unaffected.
pub lr_trace_pcs: std::collections::HashSet<u32>,
/// M12 — optional file writer for `--lr-trace` output. `None` means
/// stdout.
pub lr_trace_writer: Option<std::sync::Mutex<std::fs::File>>,
/// Diagnostic. Guest addresses to dump (64 bytes each, hex + u32
/// lanes) at end-of-run. Populated from `--dump-addr=0x828F3D08,
/// 0x828F4070`. Used to inspect static dispatcher / job-queue /
@@ -297,6 +308,8 @@ impl KernelState {
ctor_probe_pcs: std::collections::HashSet::new(),
pc_probe_consumers: HashMap::new(),
branch_probe_pcs: std::collections::HashSet::new(),
lr_trace_pcs: std::collections::HashSet::new(),
lr_trace_writer: None,
dump_addrs: Vec::new(),
dump_section: None,
};
@@ -674,6 +687,46 @@ impl KernelState {
);
}
/// M12 — diagnostic. If the live PC for HW slot `hw_id` is in
/// `self.lr_trace_pcs`, emit one JSONL record. Format mirrors what
/// xenia-canary's `--log_lr_on_pc` patch emits, plus the cycle
/// counter. Read-only; lockstep digest unaffected.
pub fn fire_lr_trace_if_match(&self, hw_id: u8) {
if self.lr_trace_pcs.is_empty() {
return;
}
let ctx = self.scheduler.ctx(hw_id);
let pc = ctx.pc;
if !self.lr_trace_pcs.contains(&pc) {
return;
}
let tid = self.scheduler.tid(hw_id).unwrap_or(0);
let r3 = ctx.gpr[3] as u32;
let r4 = ctx.gpr[4] as u32;
let r5 = ctx.gpr[5] as u32;
let r6 = ctx.gpr[6] as u32;
let lr = ctx.lr as u32;
let cycle = ctx.cycle_count;
let line = format!(
"{{\"pc\":\"{:#010x}\",\"tid\":{},\"hw\":{},\"cycle\":{},\
\"r3\":\"{:#010x}\",\"r4\":\"{:#010x}\",\"r5\":\"{:#010x}\",\
\"r6\":\"{:#010x}\",\"lr\":\"{:#010x}\"}}\n",
pc, tid, hw_id, cycle, r3, r4, r5, r6, lr,
);
match &self.lr_trace_writer {
Some(mu) => {
if let Ok(mut f) = mu.lock() {
use std::io::Write;
let _ = f.write_all(line.as_bytes());
}
}
None => {
// Stdout path; small alloc, fine for diagnostic use.
print!("{line}");
}
}
}
/// Read a TLS slot for the currently running HW thread.
pub fn tls_get(&self, index: u32) -> u64 {
self.scheduler.tls_get(index)
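
Review note on `fire_lr_trace_if_match`: the record layout is easier to check in isolation. The sketch below reproduces the same format string on a standalone struct and adds a small field extractor of the kind a diff script against canary output could use. `LrTraceRecord` and `hex_field` are hypothetical names introduced for illustration, and the integer widths chosen for `tid` and `cycle` are assumptions, since their real types are not visible in this diff.

```rust
/// Hypothetical stand-in for the values captured per hit. The widths of
/// `tid` and `cycle` are assumptions, not taken from the codebase.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct LrTraceRecord {
    pub pc: u32,
    pub tid: u32,
    pub hw: u8,
    pub cycle: u64,
    pub r3: u32,
    pub r4: u32,
    pub r5: u32,
    pub r6: u32,
    pub lr: u32,
}

impl LrTraceRecord {
    /// Render one JSONL line using the same format string as the hunk
    /// above: addresses as quoted `0x`-prefixed, zero-padded hex, and
    /// counters as bare integers.
    pub fn to_jsonl(&self) -> String {
        format!(
            "{{\"pc\":\"{:#010x}\",\"tid\":{},\"hw\":{},\"cycle\":{},\
             \"r3\":\"{:#010x}\",\"r4\":\"{:#010x}\",\"r5\":\"{:#010x}\",\
             \"r6\":\"{:#010x}\",\"lr\":\"{:#010x}\"}}\n",
            self.pc, self.tid, self.hw, self.cycle,
            self.r3, self.r4, self.r5, self.r6, self.lr,
        )
    }
}

/// Pull a quoted hex field (for example "pc" or "lr") back out of a line
/// without a JSON dependency. Returns `None` if the field is missing or
/// is not valid hex.
pub fn hex_field(line: &str, name: &str) -> Option<u32> {
    let key = format!("\"{name}\":\"0x");
    let start = line.find(&key)? + key.len();
    let rest = &line[start..];
    let end = rest.find('"')?;
    u32::from_str_radix(&rest[..end], 16).ok()
}
```

Comparing on `pc` and `lr` alone, and ignoring `cycle`, is one plausible way to line up two traces whose cycle counters are not expected to agree; that choice is a suggestion, not something this commit prescribes.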


@@ -3,5 +3,6 @@ pub mod loader;
pub mod lzx;
pub mod pe;
pub mod pdata;
pub mod tls;
pub use header::Xex2Header;

crates/xenia-xex/src/tls.rs (new file, 172 lines)

@@ -0,0 +1,172 @@
//! `.tls` section parser for PE32 PowerPC.
//!
//! When MSVC links a binary that uses `__declspec(thread)` storage, it emits
//! a `.tls` section plus an IMAGE_TLS_DIRECTORY32 inside `.rdata`. The
//! directory points at:
//! - the raw initialised TLS data range (start, end VAs)
//! - the address of the index field (a u32 written at runtime by the
//! loader to identify which TLS slot was assigned)
//! - an array of TLS callback function pointers (null-terminated)
//! - the size of the zero-fill area appended after raw data
//!
//! Xbox 360 binaries follow the standard PE layout. Sylpheed has no `.tls`
//! section and no TLS directory — the parser simply returns `None` and
//! callers emit zero rows.
//!
//! Reference: Microsoft PE/COFF spec, IMAGE_TLS_DIRECTORY32 layout.
use crate::pe::PeSection;
/// One TLS callback function pointer extracted from the directory's
/// callback array.
#[derive(Debug, Clone, Copy)]
pub struct TlsCallback {
pub address: u32,
}
/// Parsed `.tls` directory information. All fields are absolute VAs.
#[derive(Debug, Clone)]
pub struct TlsInfo {
/// VA of the start of the initialised raw TLS data (template).
pub raw_data_start: u32,
/// VA of one-past-end of the raw TLS data.
pub raw_data_end: u32,
/// VA of the u32 the loader writes the assigned slot index into.
pub index_address: u32,
/// VA of the zero-terminated callback array; 0 when no callbacks.
pub callback_array: u32,
/// Bytes of zero-fill appended after the raw template at thread init.
pub zero_fill_size: u32,
/// Characteristics flags (alignment / etc).
pub characteristics: u32,
/// Resolved TLS callbacks (parsed from `callback_array`).
pub callbacks: Vec<TlsCallback>,
}
/// Parse the `.tls` section. Returns `None` if the binary has no `.tls`
/// section, if the directory would run past the end of the image, or if
/// the directory's data and index addresses are all zero.
pub fn parse_tls(pe: &[u8], image_base: u32, sections: &[PeSection]) -> Option<TlsInfo> {
// Find the `.tls` section. The IMAGE_TLS_DIRECTORY32 normally lives in
// `.rdata` and is located through the IMAGE_DATA_DIRECTORY entry in the
// optional header. As a shortcut we skip that lookup and treat the first
// 24 bytes of `.tls` as the directory, rejecting it only when its
// address fields are all zero.
//
// Per MS docs, IMAGE_TLS_DIRECTORY32 layout (24 bytes):
// +0x00 StartAddressOfRawData (VA, 4)
// +0x04 EndAddressOfRawData (VA, 4)
// +0x08 AddressOfIndex (VA, 4)
// +0x0C AddressOfCallBacks (VA, 4; array of fn ptrs, null-terminated)
// +0x10 SizeOfZeroFill (4)
// +0x14 Characteristics (4)
let tls_section = sections.iter().find(|s| s.name == ".tls")?;
let off = tls_section.virtual_address as usize;
if off + 24 > pe.len() { return None; }
// Xbox 360 PE bodies are big-endian; this is consistent with how we
// parse the PE elsewhere (e.g. xref scanning reads BE u32 from PE).
let read_u32 = |start: usize| -> u32 {
u32::from_be_bytes([pe[start], pe[start + 1], pe[start + 2], pe[start + 3]])
};
let raw_data_start = read_u32(off);
let raw_data_end = read_u32(off + 4);
let index_address = read_u32(off + 8);
let callback_array = read_u32(off + 12);
let zero_fill_size = read_u32(off + 16);
let characteristics = read_u32(off + 20);
// Sanity: a directory whose data and index addresses are all zero is
// treated as absent. No range check against the image is done here.
if raw_data_start == 0 && raw_data_end == 0 && index_address == 0 {
return None;
}
// Walk the callback array (zero-terminated array of u32 VAs).
let mut callbacks = Vec::new();
if callback_array != 0 {
let mut p = callback_array.wrapping_sub(image_base) as usize;
while p + 4 <= pe.len() {
let v = read_u32(p);
if v == 0 { break; }
callbacks.push(TlsCallback { address: v });
p += 4;
if callbacks.len() >= 64 { break; } // sanity cap
}
}
Some(TlsInfo {
raw_data_start,
raw_data_end,
index_address,
callback_array,
zero_fill_size,
characteristics,
callbacks,
})
}
#[cfg(test)]
mod tests {
use super::*;
use crate::pe::PeSection;
fn mk_section(name: &str, va: u32, size: u32) -> PeSection {
PeSection {
name: name.into(),
virtual_address: va,
virtual_size: size,
raw_offset: va,
raw_size: size,
flags: 0x4000_0040,
}
}
#[test]
fn returns_none_when_no_tls_section() {
let pe = vec![0u8; 0x100];
let sections = vec![mk_section(".text", 0x10, 0x40)];
assert!(parse_tls(&pe, 0x82000000, &sections).is_none());
}
#[test]
fn parses_directory_and_callback_array() {
let image_base = 0x82000000u32;
let mut pe = vec![0u8; 0x4000];
// Place the .tls section at RVA 0x100 with the directory.
let tls_va: u32 = 0x100;
let cb_va: u32 = 0x200;
// Directory fields:
let raw_start = 0x800u32;
let raw_end = 0x900u32;
let idx = 0x1000u32;
let zero_fill = 0x40u32;
let chars = 0x0u32;
let cb_array = image_base + cb_va;
for (i, v) in [
image_base + raw_start, image_base + raw_end,
image_base + idx, cb_array, zero_fill, chars,
].iter().enumerate() {
pe[tls_va as usize + i * 4..tls_va as usize + i * 4 + 4]
.copy_from_slice(&v.to_be_bytes());
}
// Two callbacks + null terminator at cb_va.
let cb1 = image_base + 0x500;
let cb2 = image_base + 0x600;
pe[cb_va as usize..cb_va as usize + 4].copy_from_slice(&cb1.to_be_bytes());
pe[cb_va as usize + 4..cb_va as usize + 8].copy_from_slice(&cb2.to_be_bytes());
// pe[cb_va + 8..cb_va + 12] already zero (terminator).
let sections = vec![mk_section(".tls", tls_va, 0x100)];
let info = parse_tls(&pe, image_base, &sections).expect("parses");
assert_eq!(info.raw_data_start, image_base + raw_start);
assert_eq!(info.raw_data_end, image_base + raw_end);
assert_eq!(info.index_address, image_base + idx);
assert_eq!(info.callback_array, cb_array);
assert_eq!(info.zero_fill_size, zero_fill);
assert_eq!(info.callbacks.len(), 2);
assert_eq!(info.callbacks[0].address, cb1);
assert_eq!(info.callbacks[1].address, cb2);
}
}
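
Review note on the callback walk in `parse_tls`: the loop relies on `wrapping_sub` and a manual `p + 4 <= pe.len()` guard. A bounds-checked variant is sketched below for comparison. `read_be_u32` and `walk_callbacks` are hypothetical names, and the sketch assumes, as the code above does, that the image buffer is indexed by RVA.

```rust
/// Read a big-endian u32 at `off`. Returns `None` when the read would
/// fall outside `image` or when `off + 4` overflows.
pub fn read_be_u32(image: &[u8], off: usize) -> Option<u32> {
    let end = off.checked_add(4)?;
    let b = image.get(off..end)?;
    Some(u32::from_be_bytes([b[0], b[1], b[2], b[3]]))
}

/// Walk a zero-terminated array of big-endian u32 VAs that starts at
/// `array_va`, collecting at most `cap` entries. The walk stops at the
/// first zero entry, at the end of the image, or at the cap.
pub fn walk_callbacks(image: &[u8], image_base: u32, array_va: u32, cap: usize) -> Vec<u32> {
    let mut out = Vec::new();
    // A VA below the image base cannot be turned into an offset.
    let mut off = match array_va.checked_sub(image_base) {
        Some(rva) => rva as usize,
        None => return out,
    };
    while out.len() < cap {
        match read_be_u32(image, off) {
            Some(0) | None => break,
            Some(va) => out.push(va),
        }
        // Cannot overflow: the successful read above proved that
        // off + 4 <= image.len().
        off += 4;
    }
    out
}
```

With `cap` set to 64 this yields the same callbacks as the loop in `parse_tls` for well-formed input. The difference is that a callback VA below `image_base` produces an empty list outright instead of depending on the wrapped offset failing the length check.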