M9.5 + M11.5 + VMX + SJIS/UTF-8: close the post-M5.5 deferred set

Closes the four remaining deferred follow-up items in one bundle.
All four are smaller-scope and additive; lockstep determinism
unaffected (analyzer-only changes).

## M9.5 — __CxxFrameHandler scope-table parsing

- New `xenia_analysis::eh_scope` module. Magic-scans .rdata for the
  three documented MSVC FuncInfo signatures (0x19930520/21/22) on
  4-byte alignment. Each match is parsed as the documented struct
  (BE u32 fields), with sanity caps on max_state / n_try_blocks /
  pointer validity.
- Walks pUnwindMap (UnwindMapEntry, 8 bytes) and pTryBlockMap
  (TryBlockMapEntry, 20 bytes) into one row each.
- New tables eh_funcinfo, eh_unwind_map, eh_try_blocks.
- Sylpheed yield: 2,588 FuncInfo (all version 0x19930522) /
  10,019 unwind entries / 315 try-blocks.

## M11.5 — Static-init driver chain detection

- New `xenia_analysis::static_init` module. Walks every function
  looking for the canonical _initterm loop: lwz cursor; mtctr;
  bcctrl; addi cursor, cursor, 4 bounded by a compare against another
  constant register. Extracts (array_start, array_end) and reads
  the array.
- Reuses `function_pointer_arrays` table — drivers' arrays land with
  kind='static_init' (replacing M11's prologue-heuristic output where
  the structurally-grounded pattern fires).
- Sylpheed yield: 0 drivers detected — the binary's static-init
  structure does not match the canonical CRT loop. Infrastructure
  ready; future M11.6 can relax.

## VMX vector-store xrefs (M6 follow-up)

- Adds AltiVec/VMX X-form load/store XOs to the M6 opcode-31
  dispatch: lvx/lvxl/lvebx/lvehx/lvewx (reads) and
  stvx/stvxl/stvebx/stvehx/stvewx (writes), all addr_mode=
  'x_form_indexed'. Static resolution still requires both rA and rB
  constant.
- Sylpheed yield: 110 newly-detected stvx writes.

## Shift_JIS + UTF-8 localised-string detection (M7 follow-up)

- Extends `xenia_analysis::strings::analyze` with scan_shift_jis (JIS
  X 0208 lead/trail byte ranges + half-width katakana pass-through)
  and scan_utf8 (2- and 3-byte sequences). At least one multi-byte
  unit required so pure-ASCII strings aren't double-counted.
- SJIS bytes rendered as \xHH escapes for diagnostic readability;
  full SJIS→UTF-8 decoding deferred.
- Sylpheed yield: 790 Shift_JIS strings (Japanese debug + UI text)
  + 39 UTF-8.

## Tests

- +2 EH (parses_minimal_funcinfo_v0, rejects_bogus_max_state)
- +2 static_init (detects_canonical_initterm_loop, rejects_function_without_pattern)
- +2 strings (detects_shift_jis_string, detects_utf8_multibyte_string)

Tests 649→655 (+6 unit tests). DB schema golden + write_analysis_results
signature updated for new EH parameter.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
MechaCat02
2026-05-10 00:36:53 +02:00
parent b03192c772
commit e428ce33aa
9 changed files with 1159 additions and 14 deletions

View File

@@ -107,7 +107,7 @@ fn db_schema_matches_expected_columns() {
w.write_base(&info).expect("write_base");
w.ingest_instructions(&pe, &info, &func_analysis, &labels)
.expect("ingest_instructions");
w.write_analysis_results(&pe, &info, &func_analysis, &labels, &xrefs, &[], &[], &[], None)
w.write_analysis_results(&pe, &info, &func_analysis, &labels, &xrefs, &[], &[], &[], None, &[])
.expect("write_analysis_results");
w.create_sql_views().expect("create_sql_views");
}
@@ -249,6 +249,33 @@ fn db_schema_matches_expected_columns() {
("vptr_offset", "BIGINT"),
("writer_function", "BIGINT"),
]),
("eh_funcinfo", &[
("address", "BIGINT"),
("magic", "BIGINT"),
("max_state", "BIGINT"),
("p_unwind_map", "BIGINT"),
("n_try_blocks", "BIGINT"),
("p_try_block_map", "BIGINT"),
("n_ip_map_entries", "BIGINT"),
("p_ip_to_state_map", "BIGINT"),
("p_es_type_list", "BIGINT"),
("eh_flags", "BIGINT"),
]),
("eh_unwind_map", &[
("funcinfo_address", "BIGINT"),
("state_index", "BIGINT"),
("to_state", "BIGINT"),
("action_pc", "BIGINT"),
]),
("eh_try_blocks", &[
("funcinfo_address", "BIGINT"),
("try_index", "BIGINT"),
("try_low", "BIGINT"),
("try_high", "BIGINT"),
("catch_high", "BIGINT"),
("n_catches", "BIGINT"),
("p_handler_array", "BIGINT"),
]),
("xrefs", &[
("source", "BIGINT"),
("target", "BIGINT"),