M5.5: this-flow indirect-dispatch resolution via vptr-write inference

Closes the dominant case M5 could not resolve: the canonical C++
virtual-call sequence `lwz vt, off(this); lwz fn, slot(vt); mtctr fn;
bcctrl`. Implements class-membership inference, using constructor-side
vptr writes as an oracle for which vtables can appear at each offset.

## Algorithm

Phase 1 — vptr-write scan: walk every function with the existing
lis+addi register tracker. When `stw rA, off(rB)` stores a value that
the tracker resolves to a known M3 vtable address, record
`(vtable_addr, vptr_offset, writer_pc, writer_function)` as a
constructor-side vptr write.
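
A minimal sketch of the Phase 1 scan, assuming a toy instruction model.
The `Insn` enum, `VptrWrite` struct, and `scan_vptr_writes` function are
hypothetical names for illustration; the real tracker in this repo is
not shown in the commit and may differ.

```rust
use std::collections::HashSet;

/// Simplified instruction forms (hypothetical; a real decoder has many more).
#[derive(Clone, Copy, Debug)]
pub enum Insn {
    /// lis rD, imm  (rD = imm << 16)
    Lis { rd: usize, imm: u16 },
    /// addi rD, rA, simm  (rA = 0 means the literal 0, not r0)
    Addi { rd: usize, ra: usize, simm: i16 },
    /// stw rS, off(rA)
    Stw { rs: usize, off: i16, ra: usize },
    /// Anything else; `clobbers` is the GPR it overwrites, if any.
    Other { clobbers: Option<usize> },
}

#[derive(Debug, PartialEq, Eq)]
pub struct VptrWrite {
    pub vtable_addr: u32,
    pub vptr_offset: i16,
    pub writer_pc: u32,
    pub writer_function: u32,
}

/// Walk one function linearly, tracking constants built by lis+addi, and
/// record every stw whose source register holds a known vtable address.
pub fn scan_vptr_writes(
    func_start: u32,
    insns: &[Insn],
    vtables: &HashSet<u32>,
) -> Vec<VptrWrite> {
    let mut regs: [Option<u32>; 32] = [None; 32];
    let mut out = Vec::new();
    for (i, insn) in insns.iter().enumerate() {
        let pc = func_start + 4 * i as u32;
        match *insn {
            Insn::Lis { rd, imm } => regs[rd] = Some((imm as u32) << 16),
            Insn::Addi { rd, ra, simm } => {
                let base = if ra == 0 { Some(0) } else { regs[ra] };
                // Sign-extend the 16-bit immediate before adding.
                regs[rd] = base.map(|b| b.wrapping_add(simm as i32 as u32));
            }
            Insn::Stw { rs, off, .. } => {
                if let Some(value) = regs[rs] {
                    if vtables.contains(&value) {
                        out.push(VptrWrite {
                            vtable_addr: value,
                            vptr_offset: off,
                            writer_pc: pc,
                            writer_function: func_start,
                        });
                    }
                }
            }
            Insn::Other { clobbers } => {
                if let Some(r) = clobbers {
                    regs[r] = None;
                }
            }
        }
    }
    out
}
```

The sketch ignores control flow inside the function; it only shows how a
tracked constant turns a plain store into a recorded vptr write.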

Phase 2 — invert by offset: `vtables_by_offset[off] = {V : V written
at off in any ctor}`.
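
The inversion can be sketched as a map from offset to a set of vtable
addresses. The function name `invert_by_offset` and the
`(vtable_addr, vptr_offset)` tuple input are assumptions made for this
example, not the repo's actual signature.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Group vtable addresses by the object offset they were written to.
/// Duplicate writes of the same vtable at the same offset collapse.
pub fn invert_by_offset(writes: &[(u32, i16)]) -> BTreeMap<i16, BTreeSet<u32>> {
    let mut by_offset: BTreeMap<i16, BTreeSet<u32>> = BTreeMap::new();
    for &(vtable_addr, vptr_offset) in writes {
        by_offset.entry(vptr_offset).or_default().insert(vtable_addr);
    }
    by_offset
}
```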

Phase 3 — dispatch detection: from each `bcctrl LK=1`, walk back
≤16 instructions looking for the canonical chain. Bail on register
clobber, branch, or label (basic-block) boundary.
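
A rough sketch of the backward walk, under stated assumptions: `Op`,
`Dispatch`, and `match_dispatch` are hypothetical names, the instruction
model is reduced to what the chain needs, and the sketch does not check
that the base register of the first load actually holds `this`.

```rust
#[derive(Clone, Copy, Debug)]
pub enum Op {
    /// lwz rD, disp(rA)
    Lwz { rd: u8, disp: i16, ra: u8 },
    /// mtctr rS
    Mtctr { rs: u8 },
    /// bcctrl
    Bcctrl,
    /// Any other branch.
    Branch,
    /// Any other instruction; `writes` is the GPR it overwrites, if any.
    Other { writes: Option<u8> },
}

#[derive(Debug, PartialEq, Eq)]
pub struct Dispatch {
    pub vptr_offset: i16,
    pub slot: i16,
}

#[derive(Clone, Copy)]
enum State {
    WantMtctr,
    WantFnLoad { fn_reg: u8 },
    WantVtLoad { vt_reg: u8, slot: i16 },
}

const MAX_BACKWALK: usize = 16;

fn gpr_written(op: Op) -> Option<u8> {
    match op {
        Op::Lwz { rd, .. } => Some(rd),
        Op::Other { writes } => writes,
        _ => None,
    }
}

/// Walk back from the bcctrl at `call_idx`, matching
/// mtctr fn <- lwz fn, slot(vt) <- lwz vt, off(base).
pub fn match_dispatch(ops: &[Op], is_label: &[bool], call_idx: usize) -> Option<Dispatch> {
    if call_idx >= ops.len() || is_label.len() != ops.len() {
        return None;
    }
    if !matches!(ops[call_idx], Op::Bcctrl) {
        return None;
    }
    let mut state = State::WantMtctr;
    let lo = call_idx.saturating_sub(MAX_BACKWALK);
    for i in (lo..call_idx).rev() {
        // A label on the next instruction means control can enter there,
        // so nothing earlier is guaranteed to have executed.
        if is_label[i + 1] {
            return None;
        }
        match (state, ops[i]) {
            (_, Op::Branch) | (_, Op::Bcctrl) => return None,
            (State::WantMtctr, Op::Mtctr { rs }) => state = State::WantFnLoad { fn_reg: rs },
            (State::WantFnLoad { fn_reg }, Op::Lwz { rd, disp, ra }) if rd == fn_reg => {
                state = State::WantVtLoad { vt_reg: ra, slot: disp };
            }
            (State::WantFnLoad { fn_reg }, op) if gpr_written(op) == Some(fn_reg) => return None,
            (State::WantVtLoad { vt_reg, slot }, Op::Lwz { rd, disp, .. }) if rd == vt_reg => {
                return Some(Dispatch { vptr_offset: disp, slot });
            }
            (State::WantVtLoad { vt_reg, .. }, op) if gpr_written(op) == Some(vt_reg) => {
                return None;
            }
            _ => {}
        }
    }
    None
}
```

Unrelated instructions between the chain links (argument setup, for
example) are skipped; only a write to the register currently being
traced counts as a clobber.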

Phase 4 — edge emission: for `(dispatch_pc, vptr_off, slot)`, emit one
`xrefs.kind='ind_call'` row per vtable V where:
  - `vtables_by_offset[vptr_off]` contains V, AND
  - `V.length > slot` (V actually has a method at that slot)
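
The two conditions above can be sketched as a filter over the candidate
set. This is an illustration only: `IndCall` and `emit_edges` are
hypothetical names, vtables are modelled as a map from address to method
list, and `slot` is assumed to be a method index (a byte displacement
would first need dividing by the pointer size).

```rust
use std::collections::{BTreeMap, BTreeSet};

#[derive(Debug, PartialEq, Eq)]
pub struct IndCall {
    pub dispatch_pc: u32,
    pub vtable_address: u32,
    pub method_address: u32,
}

/// Emit one edge per vtable that was written at `vptr_off` in some
/// constructor and that is long enough to have a method at `slot`.
pub fn emit_edges(
    dispatch_pc: u32,
    vptr_off: i16,
    slot: usize,
    vtables_by_offset: &BTreeMap<i16, BTreeSet<u32>>,
    vtable_methods: &BTreeMap<u32, Vec<u32>>,
) -> Vec<IndCall> {
    let Some(candidates) = vtables_by_offset.get(&vptr_off) else {
        return Vec::new();
    };
    candidates
        .iter()
        .filter_map(|&vtable_address| {
            let methods = vtable_methods.get(&vtable_address)?;
            // `get` returns None when slot >= length, enforcing V.length > slot.
            let method_address = *methods.get(slot)?;
            Some(IndCall { dispatch_pc, vtable_address, method_address })
        })
        .collect()
}
```

The length of the returned vector corresponds to the site's candidate
count in this model.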

Multi-candidate sites (the common case at offset 0) are an
over-approximation. Downstream queries that need high confidence can
filter to single-candidate sites with `WHERE candidate_count = 1` on
`indirect_dispatch_sites`.

## Schema

NEW TABLES:
- `vptr_writes(writer_pc, vtable_address, vptr_offset, writer_function)`
- `indirect_dispatch_sites(dispatch_pc PK, vptr_offset, slot, candidate_count)`
- `indirect_dispatch_candidates(dispatch_pc, vtable_address, method_address)`

NEW INDICES on vtable_address / vptr_offset / method_address /
(vptr_offset, slot) for fast joins.

## Sylpheed yield

- 567 vptr writes / 214 vtables / 29 offsets (offset 0 = 88%).
- 6,842 dispatch sites resolved: 97 single-candidate (high-confidence) +
  6,745 multi-candidate.
- 687,963 ind_call xref rows.
- 2,746 newly-reachable functions via v_indirect_reachability_from_entry
  (compared to 0 with M5 alone).
- Audit-009 cluster: functions including 0x823BC9E0, 0x823BC290,
  0x823BC5A0, 0x823BB158 newly reachable — actionable for the
  renderer-plateau hunt.

Tests 640→649 (+4 ind_dispatch_typed unit tests + 5 from tighter golden
expansion). Schema golden + write_analysis_results signature updated.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
MechaCat02
2026-05-09 23:35:05 +02:00
parent d8766c6242
commit 56ffa40a6a
7 changed files with 854 additions and 8 deletions


@@ -4218,6 +4218,35 @@ fn cmd_dis(
        "function-pointer array scan complete",
    );
    // M5.5 — typed indirect-dispatch resolution (this->vptr → method).
    let typed_ind = xenia_analysis::ind_dispatch_typed::analyze(
        &pe_image, base, &func_analysis, &vtables, &xref_result.labels,
    );
    let single = typed_ind.dispatches.iter().filter(|d| d.candidate_vtables.len() == 1).count();
    let multi = typed_ind.dispatches.len() - single;
    let typed_edges: usize = typed_ind.dispatches.iter().map(|d| d.method_pcs.len()).sum();
    info!(
        vptr_writes = typed_ind.vptr_writes.len(),
        dispatches = typed_ind.dispatches.len(),
        single_candidate = single,
        multi_candidate = multi,
        edges = typed_edges,
        "M5.5 typed indirect-dispatch scan complete",
    );
    // Add ind_call edges for every (dispatch_pc, method) candidate.
    for d in &typed_ind.dispatches {
        for &method_pc in &d.method_pcs {
            xref_result.xrefs
                .entry(method_pc)
                .or_default()
                .push(xenia_analysis::xref::Xref {
                    source: d.dispatch_pc,
                    kind: xenia_analysis::xref::XrefKind::IndirectCall,
                    addr_mode: None,
                });
        }
    }
    // Build DisasmInfo
    let disasm_info = xenia_analysis::formatter::DisasmInfo {
        image_base: base,
@@ -4244,6 +4273,7 @@ fn cmd_dis(
        &vtables,
        &strings,
        &fparrays,
        Some(&typed_ind),
    )?;
    w.write_tls(tls_info.as_ref())?;
    if matches!(analyze, AnalyzeMode::Sql | AnalyzeMode::Both) {