Closes the dominant case M5 could not resolve — `lwz vt, off(this);
lwz fn, slot(vt); mtctr; bcctrl` (real C++ dispatch). Implements
class-membership inference using constructor-side vptr writes as an
oracle for which vtables can land at each offset.

## Algorithm

Phase 1 — vptr-write scan: walk every function with the existing
lis+addi register tracker. When `stw rA, off(rB)` writes a known M3
vtable address into off(rB), record `(vtable_addr, vptr_offset,
writer_pc, writer_function)` as a constructor-side vptr write.

Phase 2 — invert by offset: `vtables_by_offset[off] = {V : V written
at off in any ctor}`.

Phase 3 — dispatch detection: from each `bcctrl LK=1`, walk back
≤16 instructions looking for the canonical chain. Bail on register
clobber, branch, or label (basic-block) boundary.

Phase 4 — edge emission: for `(dispatch_pc, vptr_off, slot)`, emit one
`xrefs.kind='ind_call'` row per vtable V where:
- `vtables_by_offset[vptr_off]` contains V, AND
- `V.length > slot` (V actually has a method at that slot)

Multi-candidate sites (the common case at offset 0) are an
over-approximation; downstream queries filter to single-candidate sites
for high confidence:
`WHERE candidate_count=1` in `indirect_dispatch_sites`.

## Schema

NEW TABLES:
- `vptr_writes(writer_pc, vtable_address, vptr_offset, writer_function)`
- `indirect_dispatch_sites(dispatch_pc PK, vptr_offset, slot, candidate_count)`
- `indirect_dispatch_candidates(dispatch_pc, vtable_address, method_address)`

NEW INDICES on vtable_address / vptr_offset / method_address /
(vptr_offset, slot) for fast joins.

## Sylpheed yield

- 567 vptr writes / 214 vtables / 29 offsets (offset 0 = 88%).
- 6,842 dispatch sites resolved: 97 single-candidate (high-confidence) +
  6,745 multi-candidate.
- 687,963 ind_call xref rows.
- 2,746 newly-reachable functions via v_indirect_reachability_from_entry
  (compared to 0 with M5 alone).
- Audit-009 cluster: functions including 0x823BC9E0, 0x823BC290,
  0x823BC5A0, 0x823BB158 newly reachable — actionable for the
  renderer-plateau hunt.

Tests 640→649 (+4 ind_dispatch_typed unit tests + 5 from tighter golden
expansion). Schema golden + write_analysis_results signature updated.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
468 lines
22 KiB
Markdown
# `xenia-analysis` schema reference

Authoritative documentation for the DuckDB tables and SQL views produced by
`xenia-rs dis --db sylpheed.db`. Track schema changes here alongside any
update to the `db_schema_golden` test fixture.

The base + disasm tables (`metadata`, `sections`, `imports`, `functions`,
`labels`, `instructions`, `xrefs`, opt-in `exec_trace` / `import_calls` /
`branch_trace`) are documented inline in the `src/db.rs` doc comments. This file
collects layered analysis additions and forward-work notes.

---

## Layer M1 — `.pdata` boundary correction (landed)

### Schema additions
- `functions.pdata_validated BOOLEAN NOT NULL` — `true` when the row's
  `address` matches a `RUNTIME_FUNCTION.BeginAddress` from `.pdata`. Linker
  ground truth.
- `functions.pdata_length BIGINT NULL` — `function_length` (bytes) from the
  matching pdata entry; `NULL` when the row is prologue-only.
- New table `pdata_entries(begin_address BIGINT PRIMARY KEY, end_address
  BIGINT, function_length BIGINT, prolog_length BIGINT, flags BIGINT)` — every
  parsed `.pdata` `RUNTIME_FUNCTION` entry (raw, before any merge with
  prologue analysis).
- Index `idx_functions_pdata_validated` on `functions(pdata_validated)`.

### What this layer does
- Parses `.pdata` 8-byte `RUNTIME_FUNCTION` entries (PowerPC PE32 layout):
  word 0 `BeginAddress` (absolute VA), word 1 packed
  `{prolog_length:8, function_length:22, flags:2}`, both big-endian.
- Unions pdata `BeginAddress` values into the function-candidate set fed to
  the prologue walker, so functions our prologue heuristic missed still get
  rows.
- When pdata supplies a longer `function_length` than the prologue walk
  found, extends `end_address` to the pdata-implied end (catches mis-split
  where the walker stopped at an early `blr`).
- After the walker, performs a forward pass that trims `function.end` to the
  next start when they overlap (catches mis-merge where one row spanned two
  prologues — the audit-031 `sub_824D23B0` / `sub_824D29F0` case).

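The word-1 packing described above can be illustrated with a small decoder. This is a minimal sketch, not the project's parser: the struct and function names are hypothetical, and the bit positions (fields packed from the least-significant bit upward, so that the flags land in bits 30..=31) are an assumption chosen to agree with the M9 note that the exception bit is bit 31. The raw field values are returned as-is, with no unit conversion.

```rust
/// One decoded `.pdata` record (hypothetical type; field names mirror the
/// `pdata_entries` columns, values are raw bit-field contents).
#[derive(Debug, PartialEq, Eq)]
pub struct PdataEntry {
    pub begin_address: u32,
    pub prolog_length: u32,
    pub function_length: u32,
    pub flags: u32,
}

/// Decode one 8-byte big-endian RUNTIME_FUNCTION record.
/// ASSUMPTION: prolog_length occupies bits 0..=7, function_length bits
/// 8..=29 and flags bits 30..=31 of word 1.
pub fn decode_pdata_entry(raw: [u8; 8]) -> PdataEntry {
    let begin_address = u32::from_be_bytes([raw[0], raw[1], raw[2], raw[3]]);
    let word1 = u32::from_be_bytes([raw[4], raw[5], raw[6], raw[7]]);
    PdataEntry {
        begin_address,
        prolog_length: word1 & 0xFF,              // low 8 bits
        function_length: (word1 >> 8) & 0x3F_FFFF, // next 22 bits
        flags: word1 >> 30,                        // top 2 bits
    }
}
```

Under that assumed layout, a record whose second word is `0x80001004` decodes to `prolog_length = 4`, `function_length = 0x10`, and `flags = 0b10`.
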
### What this layer does NOT do
- Does not adjust prolog-derived `frame_size` / `saved_gprs` from `.pdata`'s
  `prolog_length` field — those remain prologue-only inferences.
- Does not classify functions further than the existing `is_leaf` /
  `is_saverestore` columns. Class membership is M3.
- Does not detect functions whose entries are missing from BOTH `.pdata`
  and the bl-target scan (extremely rare; would require executable-byte
  linear sweep).

### Reference docs
- Microsoft PE32+ exception data spec for PowerPC RUNTIME_FUNCTION.
- xenia-canary `src/xenia/cpu/xex_module.cc:1570-1587` — canary's reference
  parser (extracts `BeginAddress` only; we additionally decode word 1).

### Validation queries
```sql
-- All pdata entries found
SELECT COUNT(*) FROM pdata_entries; -- ~23073 for Sylpheed
-- Functions cross-validated against pdata
SELECT COUNT(*) FROM functions WHERE pdata_validated;
-- Functions detected ONLY by prologue (orphans of pdata)
SELECT COUNT(*) FROM functions WHERE NOT pdata_validated;
-- Pdata orphans NOT yet in functions (should be 0 after this layer)
SELECT COUNT(*) FROM pdata_entries p
  LEFT JOIN functions f ON f.address = p.begin_address
  WHERE f.address IS NULL;
-- Audit-031 mis-merge resolved: 0x824D29F0 should have its own row
SELECT name FROM functions WHERE address = 2186095088; -- 0x824D29F0
```

---

## Layer M2 — MSVC C++ name demangler (landed)

### Schema additions
- New table `demangled_names(address BIGINT NULL, mangled VARCHAR NOT NULL,
  raw_demangled VARCHAR NOT NULL, namespace_path VARCHAR NULL,
  class_name VARCHAR NULL, method_name VARCHAR NULL,
  params_signature VARCHAR NULL)`.
- Indices on `address`, `class_name`, `method_name`.

### What this layer does
- Wraps `msvc_demangler::demangle` (a Rust port of LLVM's
  `MicrosoftDemangle.cpp`) and splits the formatted output into structured
  fields via a heuristic top-level parser (handles templates and nested parens
  correctly).
- Populates `demangled_names` from any label whose name starts with `?` plus
  any import name that happens to be mangled (defensive — typical kernel
  imports use C names).

### What this layer does NOT do
- Does not parse the AST returned by `msvc_demangler::parse` — uses the formatted
  string and a heuristic split. Adequate for typical class member functions
  and RTTI strings; exotic template / lambda forms still get `raw_demangled`
  populated but may have NULL structured fields.
- Does not yet ingest RTTI strings discovered in `.rdata` — that's M3's job;
  M3 will append rows to this table at the addresses where it finds RTTI
  TypeDescriptors.

### Reference docs
- `msvc-demangler` crate (`https://docs.rs/msvc-demangler/0.11`).
- LLVM `MicrosoftDemangle.cpp` (the parser this crate ports).

## Layer M3 — Vtable + RTTI detection (landed)

### Schema additions
- `vtables(address PK, length, col_address NULL, class_name, rtti_present,
  base_classes_json NULL)` — every detected static vtable.
- `methods(vtable_address, slot, function_address, mangled_name NULL,
  demangled_name NULL, PRIMARY KEY (vtable_address, slot))` — one row per
  method slot.
- `classes(name PK, vtable_address, rtti_present, base_classes_json NULL)` —
  deduped by class name (first-detected vtable wins).
- Indices: `methods.function_address`, `classes.rtti_present`.

### What this layer does
- Walks `.rdata` and `.data` looking for runs of ≥3 consecutive 4-byte BE
  values where each value is a known function start (from M1's corrected
  `functions` table). One- and two-method vtables are intentionally rejected to
  control false-positive rate.
- Attempts the MSVC RTTI walk `vtable[-1] → CompleteObjectLocator → TypeDescriptor`
  for each candidate. When successful, the demangled `class ClassName`
  string fills `class_name` and a best-effort
  `RTTIClassHierarchyDescriptor` walk fills `base_classes_json` (JSON array
  of base class names).
- Falls back to `ANON_Class_<8-hex>` keyed by FNV-1a hash of the sorted
  method-PC tuple when RTTI is absent (typical for shipped game binaries).
  Identical vtables across the binary (multiple instances) collapse to the
  same anonymous name.

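The anonymous-name fallback can be sketched as follows. This is an illustration only: the function names are hypothetical, and the choice of the 32-bit FNV-1a variant, big-endian byte serialisation of each method PC, and upper-case hex output are assumptions; the text above only fixes "FNV-1a over the sorted method-PC tuple" and an 8-hex-digit suffix.

```rust
/// 32-bit FNV-1a over a byte slice (standard offset basis and prime).
fn fnv1a_32(bytes: &[u8]) -> u32 {
    let mut hash: u32 = 0x811C_9DC5;
    for &b in bytes {
        hash ^= u32::from(b);
        hash = hash.wrapping_mul(0x0100_0193);
    }
    hash
}

/// Derive an `ANON_Class_<8-hex>` name from a vtable's method PCs.
/// Sorting first makes the name independent of slot order.
/// ASSUMPTION: each PC is hashed as 4 big-endian bytes.
pub fn anon_class_name(method_pcs: &[u32]) -> String {
    let mut sorted = method_pcs.to_vec();
    sorted.sort_unstable();
    let mut bytes = Vec::with_capacity(sorted.len() * 4);
    for pc in &sorted {
        bytes.extend_from_slice(&pc.to_be_bytes());
    }
    format!("ANON_Class_{:08X}", fnv1a_32(&bytes))
}
```

Because the input is sorted before hashing, two vtables that hold the same method set map to the same name, which matches the collapse behaviour described in the last bullet.
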
### What this layer does NOT do
- Vtables built at runtime in heap-allocated memory (e.g. by ctors copying
  static templates) are out of scope — only static `.rdata`/`.data` content.
- Multiple-inheritance "extra" vftables (one per base subobject) are detected
  as independent vtables with no link between them.
- Inheritance-tree walking beyond `RTTIClassHierarchyDescriptor`'s direct
  base list is not attempted.

### Reference docs
- openrce.org "Reversing Microsoft Visual C++" — RTTI layout articles
  (CompleteObjectLocator at vtable[-1]; TypeDescriptor at COL+0xC; mangled
  name at TD+0x8).

## Layer M4 — Class-aware probe targeting (landed)

CLI extension only — no schema changes. The probe-token grammar adds three
symbolic forms on top of the existing `0xADDR` literal:

- `Class::method` — joins `classes` × `methods` × `demangled_names` to find
  every PC whose vtable belongs to that class and whose demangled
  `method_name` matches.
- `Class::*` — joins `classes` × `methods` to find every method PC of that
  class.
- `function_name` — falls back to `functions.name` lookup for free functions
  / saverestore stubs / labels.

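A token classifier for the four forms might look like the sketch below. The enum, its variant names, and `classify_token` are hypothetical and are not taken from `lookup.rs`; the precedence (hex literal first, then `::` forms, then plain name) is likewise an assumption. The sketch only classifies a token, it does not perform any DB lookup.

```rust
/// Hypothetical classification of one probe token.
#[derive(Debug, PartialEq, Eq)]
pub enum ProbeToken {
    /// `0xADDR` literal, resolved without touching the DB.
    Address(u32),
    /// `Class::method`
    ClassMethod { class: String, method: String },
    /// `Class::*`
    ClassAll { class: String },
    /// Anything else: treated as a `functions.name` lookup key.
    FunctionName(String),
}

pub fn classify_token(token: &str) -> ProbeToken {
    // Numeric form first, so hex literals never reach the symbolic path.
    if let Some(hex) = token.strip_prefix("0x").or_else(|| token.strip_prefix("0X")) {
        if let Ok(addr) = u32::from_str_radix(hex, 16) {
            return ProbeToken::Address(addr);
        }
    }
    // Split on the LAST `::` so namespaced classes keep their full path.
    if let Some((class, method)) = token.rsplit_once("::") {
        if !class.is_empty() && !method.is_empty() {
            return if method == "*" {
                ProbeToken::ClassAll { class: class.to_string() }
            } else {
                ProbeToken::ClassMethod {
                    class: class.to_string(),
                    method: method.to_string(),
                }
            };
        }
    }
    ProbeToken::FunctionName(token.to_string())
}
```
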
Numeric tokens never touch the DB (preserves zero-IO fast path; lockstep
digest unaffected). Symbolic tokens require the DuckDB at `--probe-db PATH`
or `XENIA_PROBE_DB`; default is `sylpheed.db` next to the .iso when present.

Resolution happens BEFORE guest exec begins, so it cannot affect the
lockstep digest.

See `crates/xenia-analysis/src/lookup.rs`.

---

## Layer M5 — Indirect-dispatch reachability (landed)

### Schema additions
- New value `'ind_call'` in the `xrefs.kind` set.
- New SQL view `v_indirect_reachability_from_entry` — strict superset of
  `v_reachability_from_entry`, taking `ind_call` edges in the BFS.

### What this layer does
- Walks each `FuncAnalysis.functions` entry with a per-basic-block register
  tracker. Recognises the canonical static-vtable pattern:
  `lis+addi → lwz off(rA) → mtctr → bcctrl`, where `rA` ends up holding a
  known vtable's start address from M3.
- Honours the PowerPC ABI: `bl`-style calls (op 18 / 16 with LK=1) clobber
  volatile r0..r12 + ctr but preserve non-volatile r13..r31, so a vtable
  pointer parked in r30/r31 before a call survives.
- Treats every M3 `loc_*` label as a basic-block boundary (kills register
  state) so jump-IN paths cannot induce false positives.

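The two kill rules above (call clobber and label boundary) can be sketched with a tiny constant tracker. `RegTracker` and its methods are hypothetical names for illustration; the real tracker also has to interpret `lis`, `addi`, `lwz`, and `mtctr`, which this sketch leaves out.

```rust
/// Hypothetical per-basic-block tracker of statically known register values.
#[derive(Debug, Clone, Default)]
pub struct RegTracker {
    gpr: [Option<u32>; 32],
    ctr: Option<u32>,
}

impl RegTracker {
    pub fn new() -> Self {
        Self::default()
    }

    pub fn set(&mut self, reg: usize, value: u32) {
        self.gpr[reg] = Some(value);
    }

    pub fn get(&self, reg: usize) -> Option<u32> {
        self.gpr[reg]
    }

    pub fn set_ctr(&mut self, value: u32) {
        self.ctr = Some(value);
    }

    pub fn ctr(&self) -> Option<u32> {
        self.ctr
    }

    /// `bl`-style call: volatile r0..=r12 and ctr become unknown,
    /// non-volatile r13..=r31 keep their tracked values.
    pub fn on_call(&mut self) {
        for r in 0..=12 {
            self.gpr[r] = None;
        }
        self.ctr = None;
    }

    /// Label (basic-block boundary): another path may jump in here,
    /// so every tracked value is dropped.
    pub fn on_label(&mut self) {
        *self = Self::default();
    }
}
```

With this model, a vtable address held in r31 is still known after `on_call()`, while the same address held in r3 is not.
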
### What this layer does NOT do (and observed impact)
- Vtable pointer loaded from a `this`-pointer field
  (`lwz r_vt, off(rA)` where `rA = this`) — by far the dominant pattern in
  real C++ — is unresolvable without alias / points-to analysis.
- On Sylpheed: the layer detects 0 edges. The binary's 1,001 lis+addi
  references into vtables are mostly constructor-side **vptr writes**
  (`stw rVtable, vptr_offset(this)`), not direct dispatches. The renderer
  hunt's audit-009 cluster therefore needs a future M5.5 with `this`-flow
  tracking before this layer surfaces it.

### Reference docs
- IBM PowerPC ABI: register-save convention (volatile r0..r12 + ctr,
  non-volatile r13..r31).

## Layer M7 — String / constant-pool detection (landed)

### Schema additions
- New table `strings(address PK, encoding, length, content)`.
- Index `idx_strings_encoding`.

### What this layer does
- Scans `.rdata` for runs of length ≥ 6 of printable ASCII bytes followed by
  a NUL terminator.
- Scans `.rdata` for UTF-16LE runs of length ≥ 6 code units (printable-ASCII
  basic plane only) followed by a u16 NUL terminator.
- Cross-reference is implicit: existing `xrefs.kind='ref'` rows whose
  `target` falls in `strings.address`'s exact match set name the referencing
  PCs. SQL: `SELECT s.content, x.source FROM xrefs x JOIN strings s
  ON s.address = x.target WHERE x.kind='ref'`.

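The ASCII half of the scan can be sketched as a single pass over the section bytes. The function name is hypothetical, and "printable" is taken here to mean the range 0x20..=0x7E; that is an assumption, since the actual scanner may also accept whitespace such as tabs or newlines.

```rust
/// Find NUL-terminated runs of printable ASCII of at least `min_len` bytes.
/// Returns `(address, text)` pairs, where `address = base + offset`.
/// ASSUMPTION: printable means 0x20..=0x7E only.
pub fn scan_ascii_strings(data: &[u8], base: u32, min_len: usize) -> Vec<(u32, String)> {
    let mut found = Vec::new();
    let mut start = 0usize; // offset where the current printable run began
    for (i, &b) in data.iter().enumerate() {
        if (0x20..=0x7E).contains(&b) {
            continue; // still inside a printable run
        }
        // A non-printable byte ends the run; keep it only if that byte is
        // the NUL terminator and the run is long enough.
        if b == 0 && i - start >= min_len {
            let text = String::from_utf8_lossy(&data[start..i]).into_owned();
            found.push((base + start as u32, text));
        }
        start = i + 1;
    }
    found
}
```

A run that is cut off by a non-NUL control byte, or that reaches the end of the buffer without a terminator, is dropped rather than reported.
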
### What this layer does NOT do
- No UTF-8 multibyte / non-ASCII basic plane in either encoding.
- No `.data` scan (read-only-section bias).
- No multi-byte CJK encodings — Japanese text in localised builds may be
  represented in shift_jis / utf-8 with non-printable bytes that this
  scanner skips.

### Sylpheed yield
- 6,311 ASCII strings (including full embedded HLSL shader source).
- 0 UTF-16LE strings (binary uses ASCII / native CJK encoding).
- 9,132 lis+addi sites cross-reference into the detected strings — names
  the source PCs that reference each string.

## Layer M6 — Extended store-class xrefs + `addr_mode` column (landed)

### Schema additions
- `xrefs.addr_mode VARCHAR NULL` — sub-classifies how the source instruction
  computes its target. NULL for control-flow edges (call / ind_call / j /
  br); one of the following tags for data edges:
  - `d_form` — standard signed-16 displacement (lwz/stw/lfs/stfs/etc.)
  - `lis_addi` — address materialised via `lis + addi` register tracking
  - `lis_ori` — address materialised via `lis + ori`
  - `multiword` — `lmw / stmw` (one xref per slot; up to 32-rS slots)
  - `x_form_indexed` — `stwx / stbx / sthx / stwux / stbux / sthux / stdx /
    stdux / lwzx / lbzx / lhzx / lhax / lwzux / lbzux / lhzux / lhaux / ldx /
    ldux` — emitted only when both rA and rB are tracked constants
  - `x_form_byterev` — `stwbrx / sthbrx / lwbrx / lhbrx`
  - `atomic` — `stwcx. / stdcx.` reservation-conditional stores
  - `dcbz` — cache-line clear (32-byte zero at rA+rB)
- Index `idx_xrefs_addr_mode`.

### What this layer does
- Tags every existing data xref with its addressing mode (`d_form` for the
  bulk; `lis_addi` / `lis_ori` for the lift-and-add cases that produce
  DataRef rows).
- Adds new dispatch for opcode 47 (`stmw`) and 46 (`lmw`), expanding to
  per-slot DataWrite / DataRead rows.
- Adds new dispatch for opcode 31 X-form: stores, atomic, byte-reverse,
  dcbz. X-form rows are emitted ONLY when both rA and rB resolve to known
  constants (otherwise the address is runtime-dependent and we skip).

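For readers less familiar with the `lis_addi` and `lis_ori` tags, the address arithmetic behind them is sketched below. The helper names are hypothetical. The arithmetic follows the PowerPC definitions: `lis` places its 16-bit immediate in the upper half of the register, `addi` adds a sign-extended immediate, and `ori` ORs in a zero-extended one.

```rust
/// Address produced by `lis rD, lis_imm` followed by `addi rD, rD, addi_imm`.
/// `addi` sign-extends its immediate, so a negative low half borrows from
/// the upper half (which is why compilers bump the `lis` value by one).
pub fn lis_addi(lis_imm: i16, addi_imm: i16) -> u32 {
    let high = (lis_imm as u16 as u32) << 16;
    high.wrapping_add(addi_imm as i32 as u32)
}

/// Address produced by `lis rD, lis_imm` followed by `ori rD, rD, ori_imm`.
/// `ori` zero-extends its immediate, so no borrow is involved.
pub fn lis_ori(lis_imm: i16, ori_imm: u16) -> u32 {
    ((lis_imm as u16 as u32) << 16) | u32::from(ori_imm)
}
```

For example, the address `0x8200F000` can be reached either as `lis 0x8201` plus `addi -0x1000`, or as `lis 0x8200` plus `ori 0xF000`.
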
### What this layer does NOT do
- VMX / VMX128 vector stores (opcode 31 with vector XO codes) are not
  emitted — they always have register-indexed addresses that the
  lis+addi tracker can't usually resolve, and detecting them adds noise
  without improving target resolution.
- The dominant runtime `stwx` pattern (rA = base, rB = runtime index) is
  not resolved — by design; mem-watch covers the runtime side per VERIFY-B.

### Sylpheed yield
- 28,834 `lis_addi` refs, 18,485 `d_form` reads, 3,288 `d_form` writes —
  the existing baseline now properly tagged.
- **442 newly-detected `x_form_indexed` reads** — primarily lwzx/lhzx
  reads from in-table dispatch (each pair (rA,rB) resolved statically).
- **40 newly-detected `atomic` writes** — every `stwcx.` site with a
  resolvable address; useful for reservation-table audits.
- 9 `lis_ori` refs.
- 0 multiword / dcbz / byterev — these instructions exist in the binary
  but are not in lis+addi-tracked code paths.

## Layer M8 + M11 — Function-pointer arrays beyond vtables (landed)

### Schema additions
- New table `function_pointer_arrays(address PK, length, kind)` where
  `kind` is `'vtable'` (M3 re-emit), `'dispatch_table'` (M8), or
  `'static_init'` (M11).
- New table `function_pointer_array_entries(array_address, slot,
  function_address, PRIMARY KEY (array_address, slot))` — one row per
  slot of every detected array (vtable + non-vtable).
- Indices on `function_pointer_arrays.kind` and
  `function_pointer_array_entries.function_address`.

### What this layer does
- Walks `.rdata` (only — `.data` produces too many false positives) for
  runs of ≥ 2 consecutive 4-byte BE values where each value is a known
  function entry from M1's `functions` table.
- Skips runs whose start matches an M3 vtable head — those are re-emitted
  in this table with `kind='vtable'` for unified queries but not
  re-classified.
- Heuristically classifies non-vtable runs:
  - `static_init` (M11): every entry's first instruction is `mfspr r12, LR`
    AND the next is `stwu r1, -N(r1)` with `N ≤ 0x80` (or a save-stub `bl`).
    Mirrors the typical C++ static-initialiser prologue.
  - `dispatch_table` (M8): everything else.

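The run detection in the first bullet can be sketched as a linear scan over big-endian words. `scan_pointer_runs` is a hypothetical name, and the sketch assumes the section starts 4-byte aligned; it does not perform the vtable-head skip or the `static_init` classification. Passing `min_len = 3` gives the stricter threshold that M3 uses for vtables.

```rust
use std::collections::HashSet;

/// Find runs of at least `min_len` consecutive big-endian words that are all
/// known function starts. Returns `(run_address, entries)` pairs.
/// ASSUMPTION: `section` begins on a 4-byte boundary at address `base`.
pub fn scan_pointer_runs(
    section: &[u8],
    base: u32,
    function_starts: &HashSet<u32>,
    min_len: usize,
) -> Vec<(u32, Vec<u32>)> {
    let mut runs = Vec::new();
    let mut current: Vec<u32> = Vec::new();
    let mut run_start = base;
    for (i, chunk) in section.chunks_exact(4).enumerate() {
        let value = u32::from_be_bytes([chunk[0], chunk[1], chunk[2], chunk[3]]);
        if function_starts.contains(&value) {
            if current.is_empty() {
                run_start = base + (i as u32) * 4;
            }
            current.push(value);
        } else {
            // A non-function word ends the run; keep it if long enough.
            if current.len() >= min_len {
                runs.push((run_start, std::mem::take(&mut current)));
            }
            current.clear();
        }
    }
    // A run may extend to the very end of the section.
    if current.len() >= min_len {
        runs.push((run_start, current));
    }
    runs
}
```
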
### What this layer does NOT do
- Does not parse symbol-table-bracketed regions like `__xc_a` / `__xc_z`
  / `__xi_a` / `__xi_z` directly — Sylpheed's symbol table is stripped.
- Does not chain multi-segment static-init drivers; future M11.5 could
  walk the entry-point's static-init driver call chain to surface
  ground-truth ctor PCs.
- 2-slot runs in `.rdata` may be false positives where two struct fields
  happen to alias function VAs; downstream queries should use a length
  filter (`WHERE length >= 3`) when high precision matters.

### Sylpheed yield
- 722 vtables (M3 re-emit) + 388 dispatch_tables = 1,110 arrays in
  `function_pointer_arrays`.
- 0 static_init detected — Sylpheed's ctors don't all match the
  conservative prologue heuristic. Lengths concentrate at 2 slots
  (typical of switch-case jump tables).

## Layer M9 — `has_eh` from `.pdata` exception flag (landed)

### Schema additions
- `functions.has_eh BOOLEAN NOT NULL` — true when `.pdata`'s exception-
  handler-present bit (bit 31 of word 1, the high bit) is set.
- Index `idx_functions_has_eh`.

### What this layer does
- Derived directly from M1's already-parsed `pdata.flags` bit field (no
  new parsing). The bit was always available in `pdata_entries.flags`;
  this layer surfaces it as a first-class column on `functions`.

### What this layer does NOT do
- Does not parse the actual `__CxxFrameHandler` / `__C_specific_handler`
  scope-table records that the exception bit gates. Walking those tables
  would let us name try/catch ranges and per-state cleanup actions, but
  is out of scope for a derive-only milestone.

### Sylpheed yield
- 2,975 of 23,073 pdata-validated functions have `has_eh=true` (12.9%) —
  plausible MSVC C++ EH coverage rate. Largest EH function: 26,328 bytes
  (`sub_823518F0`).

## Layer M10 — `.tls` section / TLS directory (landed)

### Schema additions
- New table `tls_info(raw_data_start, raw_data_end, index_address,
  callback_array, zero_fill_size, characteristics)` — at most one row
  (the IMAGE_TLS_DIRECTORY32).
- New table `tls_callbacks(slot PK, address)` — one row per resolved TLS
  callback function.

### What this layer does
- Reads the first 24 bytes of the `.tls` section as an
  `IMAGE_TLS_DIRECTORY32` and walks the zero-terminated callback array.
- All addresses stored as absolute VAs.

### What this layer does NOT do
- Does not parse the raw TLS template content (the variable initialiser
  block); just records its start/end VAs.

### Sylpheed yield
- 0 rows — Sylpheed has no `.tls` section. Infrastructure ready for any
  binary that uses `__declspec(thread)` storage.

## Layer M12 — `--lr-trace` runtime canary-diff harness (landed)

### Runtime additions (no DB)
- New CLI flag `--lr-trace=PC[,PC,...]` on `exec` — comma-separated PCs
  to capture as JSONL records on every fire. Symbolic tokens (`Class::method`)
  resolve via M4's lookup against `--probe-db`. Settable via
  `XENIA_LR_TRACE`.
- New CLI flag `--lr-trace-out=PATH` — writes JSONL to a file (one
  record per line). Stdout when omitted. Settable via `XENIA_LR_TRACE_OUT`.
- New kernel state fields `lr_trace_pcs: HashSet<u32>` +
  `lr_trace_writer: Option<Mutex<File>>` and helper
  `KernelState::fire_lr_trace_if_match(hw_id)` invoked from the
  per-instruction probe slot.

### JSONL record fields
`pc, tid, hw, cycle, r3, r4, r5, r6, lr` — superset of what
xenia-canary's `--log_lr_on_pc` patch emits, with a cycle counter added
for cross-run reproducibility.

### What this layer does NOT do
- Does not capture VMX / FP register state (only GPRs r3..r6).
- Does not buffer / batch records — one `write_all` per fire. For
  high-frequency probes (e.g. tight loops at >1M fires/sec), redirect
  to a file and use an SSD.

### Determinism
Lockstep digest unaffected: probe firing happens after the per-instr
hooks for ctor/branch probes and only emits side-channel output. Verified
end-of-session: `check sylpheed.iso --stable-digest -n 2M` ×2 produced
byte-identical digests (`instructions=2000005`).

---

## Layer M5.5 — `this`-flow indirect-dispatch resolution (landed)

### Schema additions
- New table `vptr_writes(writer_pc, vtable_address, vptr_offset, writer_function)` —
  every detected `stw rVtable, vptr_off(rThis)` site.
- New table `indirect_dispatch_sites(dispatch_pc PK, vptr_offset, slot, candidate_count)` —
  one row per resolved dispatch.
- New table `indirect_dispatch_candidates(dispatch_pc, vtable_address, method_address)` —
  one row per (dispatch × candidate vtable). Joined to existing
  `xrefs.kind='ind_call'` edges (one ind_call row per candidate).
- New indices on `vptr_writes.vtable_address`, `vptr_writes.vptr_offset`,
  `indirect_dispatch_candidates.method_address`,
  `indirect_dispatch_candidates.vtable_address`,
  `indirect_dispatch_sites.(vptr_offset, slot)`.

### What this layer does (class-membership inference)
1. **Phase 1 — vptr-write scan**: walk every function with the lis+addi
   tracker; whenever `stw rA, off(rB)` writes a known M3 vtable address,
   record `(vtable_addr, vptr_offset, writer_pc, writer_function)`.
2. **Phase 2 — invert**: build `vtables_by_offset[vptr_off] = {V}` for the
   set of vtables ever written at that offset.
3. **Phase 3 — dispatch detection**: walk back ≤16 instructions from each
   `bcctrl`/`bctrl` (LK=1), find the canonical
   `lwz vt, off(this); lwz fn, slot*4(vt); mtctr fn` chain. Extract
   `(vptr_off, slot)`. Bail on register clobber, branch, or label
   boundary.
4. **Phase 4 — emit**: for each `(dispatch_pc, vptr_off, slot)`, emit one
   `xrefs.kind='ind_call'` row per candidate vtable that has a
   matching slot. Multi-candidate rows are an over-approximation.

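Phases 2 and 4 reduce to a map inversion followed by a filtered join, which the sketch below illustrates. All type and function names are hypothetical, the structs carry only the columns needed here, and the `vtables` map (vtable address to its ordered method addresses) stands in for M3's `methods` table. Phases 1 and 3, which need instruction decoding, are not shown.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Subset of a `vptr_writes` row (hypothetical struct).
pub struct VptrWrite {
    pub vtable_address: u32,
    pub vptr_offset: i32,
}

/// One dispatch recovered by Phase 3 (hypothetical struct).
pub struct DispatchSite {
    pub dispatch_pc: u32,
    pub vptr_offset: i32,
    pub slot: usize,
}

/// One `indirect_dispatch_candidates` row.
#[derive(Debug, PartialEq, Eq)]
pub struct Candidate {
    pub dispatch_pc: u32,
    pub vtable_address: u32,
    pub method_address: u32,
}

/// Phase 2: for each vptr offset, the set of vtables ever written there.
pub fn invert_by_offset(writes: &[VptrWrite]) -> BTreeMap<i32, BTreeSet<u32>> {
    let mut by_offset: BTreeMap<i32, BTreeSet<u32>> = BTreeMap::new();
    for w in writes {
        by_offset
            .entry(w.vptr_offset)
            .or_default()
            .insert(w.vtable_address);
    }
    by_offset
}

/// Phase 4: one candidate per vtable written at the site's offset that
/// actually has a method at the requested slot (`V.length > slot`).
pub fn emit_candidates(
    site: &DispatchSite,
    by_offset: &BTreeMap<i32, BTreeSet<u32>>,
    vtables: &BTreeMap<u32, Vec<u32>>,
) -> Vec<Candidate> {
    let mut out = Vec::new();
    if let Some(candidates) = by_offset.get(&site.vptr_offset) {
        for &vt in candidates {
            // `get(slot)` returns None when the vtable is too short.
            if let Some(&method) = vtables.get(&vt).and_then(|m| m.get(site.slot)) {
                out.push(Candidate {
                    dispatch_pc: site.dispatch_pc,
                    vtable_address: vt,
                    method_address: method,
                });
            }
        }
    }
    out
}
```

In this model `candidate_count` is simply the length of the returned vector, so a high slot index that only one long vtable can satisfy yields a single-candidate site, while slot 0 at offset 0 matches every vtable written there.
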
### What this layer does NOT do
- No alias resolution at multi-candidate sites — emits one edge per
  matching vtable. Downstream queries should filter
  `indirect_dispatch_sites WHERE candidate_count=1` for high-confidence
  edges.
- No flow-sensitive analysis: register state is killed at every label
  (basic-block boundary) and at `bl`/`bcl` calls (volatile r0..r12 +
  ctr). We do NOT propagate values across calls in the chain-walker.
- No tracking of vptr writes via X-form indexed (`stwx`), VMX, or
  multiword stores. Only D-form `stw rA, off(rB)`.
- Does not synthesise vptr writes for inlined / elided constructors.
  If a class never has a writer at offset `vptr_off`, dispatches
  through that offset find no candidates.

### Sylpheed yield
- 567 vptr writes covering 214 distinct vtables (~30% of M3's 722).
- 29 distinct vptr offsets used; offset 0 dominates (501/567 = 88%,
  single-inheritance).
- **6,842 dispatch sites resolved**: 97 single-candidate
  (high-confidence) + 6,745 multi-candidate (over-approximation).
- 687,963 `ind_call` xref rows total.
- **2,746 newly-reachable functions** via the M5 BFS view
  (`v_indirect_reachability_from_entry`) compared to call/j/br alone.
- Audit-009 cluster (renderer plateau): functions newly visible
  include `0x823BC9E0`, `0x823BC290`, `0x823BC5A0`, `0x823BB158`,
  `0x823BB1E0`, `0x823BCAF0`, `0x823BC4C8` — actionable starting
  points for the cluster's reachability hunt.

### Reference docs
- IBM PowerPC ABI (volatile/non-volatile register partition).
- Itanium C++ ABI on vtable layout (offset-from-`this` model adapted
  by MSVC for Win32 PPC).

## Forward work (not yet landed)

- **M9.5** — full `__CxxFrameHandler` scope-table parsing (try/catch
  range names, per-state cleanup actions).
- **M11.5** — walk the static-initialiser driver call chain from the
  entry point to surface ground-truth ctor PCs.
- VMX/VMX128 vector-store xref emission (M6 follow-up).
- UTF-8 / shift_jis localised-string detection in `.rdata` (M7 follow-up).