M3: vtable scan + MSVC RTTI walk + 3 new tables

Adds detection of statically allocated MSVC vtables in .rdata/.data:

- New `xenia_analysis::vtables` walks read-only sections looking for runs of ≥3 contiguous big-endian u32 values where each value lands on a known function start (from M1's corrected functions table). 2-slot runs are rejected to keep the false-positive rate down.
- For each candidate, the MSVC RTTI walk vtable[-1] → CompleteObjectLocator → TypeDescriptor → mangled name is attempted; on success the demangled class name is recorded along with a best-effort RTTIClassHierarchyDescriptor walk to fill base_classes_json. On failure (RTTI stripped, which is common for shipped game binaries) the class is named ANON_Class_<fnv1a-hash> keyed by the sorted method-PC list, so identical vtables collapse to one entry.
- DB: new tables `vtables`, `methods`, `classes` with indices on function_address and rtti_present. `write_analysis_results` takes a `&[Vtable]` slice; `write_disasm` (back-compat) passes an empty slice.
- cmd_dis wires the scan after xref analysis using `func_analysis.functions.keys()` as the function-start oracle.

Validation on Sylpheed (RTTI stripped, as expected): 722 vtables / 499 unique classes / 5571 methods. Sanity invariant: every methods.function_address joins to functions.address (0 broken refs). Largest vtable: 131 slots.

Tests 617→621 (+4 vtable unit tests covering 3-slot detect, 2-slot reject, synth name stability, and synth name divergence).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

@@ -102,12 +102,45 @@ SELECT name FROM functions WHERE address = 2186674160; -- 0x824D29F0
- `msvc-demangler` crate (`https://docs.rs/msvc-demangler/0.11`).
- LLVM `MicrosoftDemangle.cpp` (the parser this crate ports).

## Layer M3 — Vtable + RTTI detection (planned)
## Layer M3 — Vtable + RTTI detection (landed)

Adds `vtables`, `methods`, `classes` tables. Heuristic vtable scan over
`.rdata` + `.data`, optional MSVC RTTI `CompleteObjectLocator → TypeDescriptor`
walk, anonymous-class fallback when RTTI is stripped. See
`crates/xenia-analysis/src/vtables.rs` (when landed).

### Schema additions
- `vtables(address PK, length, col_address NULL, class_name, rtti_present,
  base_classes_json NULL)`: every detected static vtable.
- `methods(vtable_address, slot, function_address, mangled_name NULL,
  demangled_name NULL, PRIMARY KEY (vtable_address, slot))`: one row per
  method slot.
- `classes(name PK, vtable_address, rtti_present, base_classes_json NULL)`:
  deduped by class name (first-detected vtable wins).
- Indices: `methods.function_address`, `classes.rtti_present`.
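
As a rough illustration of the row shapes above, the following Rust sketch mirrors the three tables as plain structs and shows the "first-detected vtable wins" dedupe rule plus the broken-reference check used during validation. The struct names, field types (`u32` addresses, `bool` flags), and the helpers `dedupe_classes` and `broken_refs` are assumptions made for this example; they are not taken from the crate.

```rust
use std::collections::{BTreeMap, HashSet};

/// One row of the `vtables` table (field types are assumptions).
#[derive(Debug, Clone, PartialEq)]
pub struct VtableRow {
    pub address: u32,
    pub length: u32,
    pub col_address: Option<u32>,
    pub class_name: String,
    pub rtti_present: bool,
    pub base_classes_json: Option<String>,
}

/// One row of the `methods` table; (vtable_address, slot) is the primary key.
#[derive(Debug, Clone, PartialEq)]
pub struct MethodRow {
    pub vtable_address: u32,
    pub slot: u32,
    pub function_address: u32,
    pub mangled_name: Option<String>,
    pub demangled_name: Option<String>,
}

/// One row of the `classes` table, keyed by class name.
#[derive(Debug, Clone, PartialEq)]
pub struct ClassRow {
    pub name: String,
    pub vtable_address: u32,
    pub rtti_present: bool,
    pub base_classes_json: Option<String>,
}

/// Collapse vtables to one class row per name. `vtables` is assumed to be in
/// detection order, so the first vtable seen for a name wins.
pub fn dedupe_classes(vtables: &[VtableRow]) -> Vec<ClassRow> {
    let mut by_name: BTreeMap<&str, ClassRow> = BTreeMap::new();
    for vt in vtables {
        by_name.entry(vt.class_name.as_str()).or_insert_with(|| ClassRow {
            name: vt.class_name.clone(),
            vtable_address: vt.address,
            rtti_present: vt.rtti_present,
            base_classes_json: vt.base_classes_json.clone(),
        });
    }
    // Rows come back sorted by class name.
    by_name.into_values().collect()
}

/// Count method rows whose `function_address` is not a known function start.
pub fn broken_refs(methods: &[MethodRow], functions: &HashSet<u32>) -> usize {
    methods
        .iter()
        .filter(|m| !functions.contains(&m.function_address))
        .count()
}
```

With rows shaped like this, the sanity invariant "every `methods.function_address` joins to `functions.address`" corresponds to `broken_refs` returning 0.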

### What this layer does

- Walks `.rdata` and `.data` looking for runs of ≥3 consecutive 4-byte BE
  values where each value is a known function start (from M1's corrected
  `functions` table). Two-slot runs are intentionally rejected to
  control the false-positive rate.
- Attempts the MSVC RTTI walk `vtable[-1] → CompleteObjectLocator → TypeDescriptor`
  for each candidate. When successful, the demangled `class ClassName`
  string fills `class_name` and a best-effort
  `RTTIClassHierarchyDescriptor` walk fills `base_classes_json` (JSON array
  of base class names).
- Falls back to `ANON_Class_<8-hex>` keyed by the FNV-1a hash of the sorted
  method-PC tuple when RTTI is absent (typical for shipped game binaries).
  Identical vtables across the binary (multiple instances) collapse to the
  same anonymous name.
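
The scan and the anonymous-name fallback described above can be sketched in a few lines of Rust. This is a minimal sketch, not the code in `vtables.rs`: it assumes the section is scanned at a 4-byte stride from an aligned base, that the hash is the 32-bit FNV-1a variant (inferred from the `<8-hex>` suffix), and that each sorted method PC is fed to the hash as big-endian bytes. The names `Candidate`, `scan_section`, and `anon_class_name` are hypothetical.

```rust
use std::collections::BTreeSet;

/// A candidate vtable: start address plus the function address in each slot.
#[derive(Debug, PartialEq)]
pub struct Candidate {
    pub address: u32,
    pub slots: Vec<u32>,
}

/// Runs shorter than this are rejected to limit false positives.
const MIN_SLOTS: usize = 3;

/// Scan `data` (mapped at `base`) for runs of at least MIN_SLOTS consecutive
/// big-endian u32 values that are all known function starts.
pub fn scan_section(base: u32, data: &[u8], function_starts: &BTreeSet<u32>) -> Vec<Candidate> {
    let mut out = Vec::new();
    let mut run: Vec<u32> = Vec::new();
    let mut run_start = base;
    for (i, word) in data.chunks_exact(4).enumerate() {
        let value = u32::from_be_bytes([word[0], word[1], word[2], word[3]]);
        if function_starts.contains(&value) {
            if run.is_empty() {
                run_start = base + (i as u32) * 4;
            }
            run.push(value);
        } else {
            flush(&mut out, run_start, &mut run);
        }
    }
    // A run may extend to the end of the section.
    flush(&mut out, run_start, &mut run);
    out
}

/// Emit the current run if it is long enough, then reset it.
fn flush(out: &mut Vec<Candidate>, address: u32, run: &mut Vec<u32>) {
    let slots = std::mem::take(run);
    if slots.len() >= MIN_SLOTS {
        out.push(Candidate { address, slots });
    }
}

/// 32-bit FNV-1a: XOR each byte into the hash, then multiply by the FNV prime.
fn fnv1a32(bytes: &[u8]) -> u32 {
    let mut hash: u32 = 0x811c_9dc5;
    for &b in bytes {
        hash ^= u32::from(b);
        hash = hash.wrapping_mul(0x0100_0193);
    }
    hash
}

/// Name an RTTI-less class from its sorted method PCs, so vtables holding
/// the same set of methods map to the same name.
pub fn anon_class_name(slots: &[u32]) -> String {
    let mut sorted = slots.to_vec();
    sorted.sort_unstable();
    let mut bytes = Vec::with_capacity(sorted.len() * 4);
    for pc in &sorted {
        bytes.extend_from_slice(&pc.to_be_bytes());
    }
    format!("ANON_Class_{:08x}", fnv1a32(&bytes))
}
```

Because the PCs are sorted before hashing, two vtables that hold the same methods in a different slot order also share a name under this sketch.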

### What this layer does NOT do

- Vtables built at runtime in heap-allocated memory (e.g. by ctors copying
  static templates) are out of scope; only static `.rdata`/`.data` content
  is scanned.
- Multiple-inheritance "extra" vftables (one per base subobject) are detected
  as independent vtables with no link between them.
- Inheritance-tree walking beyond `RTTIClassHierarchyDescriptor`'s direct
  base list is not attempted.

### Reference docs

- openrce.org "Reversing Microsoft Visual C++": RTTI layout articles
  (CompleteObjectLocator at vtable[-1]; TypeDescriptor at COL+0xC; mangled
  name at TD+0x8).
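
The offsets listed here are enough to sketch the pointer chase from a vtable to its mangled type name. The Rust below is an illustration under stated assumptions: 32-bit big-endian pointers, a single flat image region, and a `.?A` prefix check as a cheap sanity filter on the TypeDescriptor name. The `Image` type and `rtti_mangled_name` function are hypothetical, and demangling (for example with the `msvc-demangler` crate) is left out.

```rust
/// A flat view of one mapped image region (hypothetical helper type).
pub struct Image<'a> {
    pub base: u32,
    pub bytes: &'a [u8],
}

impl<'a> Image<'a> {
    /// Translate a guest address into an offset into `bytes`.
    fn offset(&self, addr: u32) -> Option<usize> {
        addr.checked_sub(self.base).map(|o| o as usize)
    }

    /// Read a big-endian u32, returning None when out of bounds.
    fn read_u32_be(&self, addr: u32) -> Option<u32> {
        let off = self.offset(addr)?;
        let b = self.bytes.get(off..off.checked_add(4)?)?;
        Some(u32::from_be_bytes([b[0], b[1], b[2], b[3]]))
    }

    /// Read a NUL-terminated string, returning None if unterminated or not UTF-8.
    fn read_cstr(&self, addr: u32) -> Option<&'a str> {
        let off = self.offset(addr)?;
        let tail = self.bytes.get(off..)?;
        let len = tail.iter().position(|&c| c == 0)?;
        std::str::from_utf8(&tail[..len]).ok()
    }
}

/// Follow vtable[-1] -> CompleteObjectLocator -> TypeDescriptor -> name.
/// Any failed read or implausible name yields None (treated as "RTTI absent").
pub fn rtti_mangled_name<'a>(img: &Image<'a>, vtable: u32) -> Option<&'a str> {
    // The COL pointer sits in the 4 bytes just before slot 0.
    let col = img.read_u32_be(vtable.checked_sub(4)?)?;
    // The TypeDescriptor pointer is at COL+0xC.
    let td = img.read_u32_be(col.checked_add(0xC)?)?;
    // The mangled name is at TD+0x8.
    let name = img.read_cstr(td.checked_add(0x8)?)?;
    // MSVC type-descriptor names start with ".?A" (".?AV" class, ".?AU" struct).
    if name.starts_with(".?A") {
        Some(name)
    } else {
        None
    }
}
```

Returning `None` on any out-of-range pointer keeps the walk safe to run on every candidate, including false positives and RTTI-stripped binaries.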

## Layer M4 — Class-aware probe targeting (planned)