xenia-rs/migration/claude-memory/project_xenia_rs_disasm_unify_phase3.md
MechaCat02 e6d43a23ac chore: add migration/ bundle for cross-machine setup
Bundles state that lives OUTSIDE the xenia-rs repo so a fresh clone on
another machine can be brought up to identical configuration via
migration/setup.sh:

  - claude-memory/             ~/.claude/projects/-home-fabi-RE-Project-Sylpheed/memory/
                               (103 files, 1.1 MB - MEMORY.md + every
                                project_xenia_rs_*.md from audits
                                addis_signext through audit-058)
  - project-root/dot-claude/   <project-root>/.claude/settings.json
                               (Stop hook + permissions)
  - project-root/ppc-manual/   <project-root>/ppc-manual/
                               (PowerPC reference docs, 397 files, 3.7 MB)
  - project-root/run-canary.sh <project-root>/run-canary.sh
  - README.md                  Human-readable setup checklist
  - setup.sh                   Idempotent installer (also reclones
                               xenia-canary at pinned HEAD 6de80dffe)
  - MANIFEST.md                Per-file mapping + per-file-not-bundled
                               restoration recipe

Excluded from bundle (not shippable via git):
  - Sylpheed ISO (7.8 GB; copyright; manual copy required)
  - sylpheed.db (395 MB; regenerable from XEX via analysis tooling)
  - target/ build artifacts (rebuild on target)
  - audit-runs probe firehoses (.log/.stdout/.stderr ~11 GB; rerun if needed)
  - audit-runs memory dumps (.bin ~4.5 GB; rerun audit-026/027/029 if needed)
  - xenia-canary checkout (setup.sh reclones from
    git.mc02.dev/fabi/Xenia-Canary.git at HEAD 6de80dffe)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 21:38:38 +02:00


name: Disassembler unification - Phase 3 complete (2026-04-27)
description: db.rs split into ingest_instructions + write_analysis_results; new sql_views.rs with 5 views; --analyze=rust|sql|both CLI flag; target_hex column on instructions; Rust/SQL cross-check warning in `both` mode.
type: project
originSessionId: 680cc54c-e77a-4d2d-a11b-ca562e9a68ec

Phase 3 of disassembler unification is COMPLETE (2026-04-27, same session).

What's in place

Schema

  • instructions table gains a target_hex BIGINT column (written by sinks/duckdb.rs); the SQL side uses it to derive branch xrefs.

Split DbWriter API (crates/xenia-analysis/src/db.rs)

  • pub fn ingest_instructions(pe, info, func_analysis, labels) — creates instructions table + indices and streams rows via the iterator + duckdb sink. No analysis tables.
  • pub fn write_analysis_results(pe, info, func_analysis, labels, xrefs) — creates functions, labels, xrefs tables + indices. Populated from Rust pass output.
  • pub fn write_disasm(...) — back-compat wrapper that calls both. Existing callers (e.g. cmd_exec) keep working unchanged.
  • pub fn create_sql_views(&mut self) — runs the SQL view definitions from crate::sql_views::ALL_VIEWS.
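  • pub fn cross_check_branch_xrefs(&self) -> Result<(u64, u64)>: returns (sql_only, rust_only) row counts for the symmetric difference between v_branch_xrefs and the branch rows of xrefs, i.e. xrefs WHERE kind IN ('call','j','br'). These are the short tags that XrefKind::tag() emits for call / jump / branch.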
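
The (sql_only, rust_only) pair is a symmetric difference split into its two halves. A minimal in-memory sketch of that idea follows; it does not touch DuckDB, and the `BranchXref` tuple layout and the `cross_check` name are assumptions made for illustration, not the actual db.rs code.

```rust
use std::collections::HashSet;

/// One branch cross-reference as (source address, target address, kind tag).
/// This tuple layout is an assumption for the sketch.
type BranchXref = (u64, u64, &'static str);

/// Returns `(sql_only, rust_only)`: the number of tuples present on
/// exactly one side. `(0, 0)` means both sides agree on every tuple.
fn cross_check(
    sql_view: &HashSet<BranchXref>,
    rust_table: &HashSet<BranchXref>,
) -> (u64, u64) {
    // Tuples the SQL view derived that the Rust pass did not record.
    let sql_only = sql_view.difference(rust_table).count() as u64;
    // Tuples the Rust pass recorded that the SQL view did not derive.
    let rust_only = rust_table.difference(sql_view).count() as u64;
    (sql_only, rust_only)
}
```

Because the kind tag is part of the tuple, a pure naming disagreement (for example 'jump' on one side and 'j' on the other) counts as one row on each side even when source and target match.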

SQL views (crates/xenia-analysis/src/sql_views.rs) — 5 views

  • v_branch_xrefs: derived from an instructions.target_hex self-join. The CASE on mnemonic mirrors the xref.rs kind logic and emits the short tags stored in xrefs.kind: bl/bla → 'call', b/ba → 'j' (jump), bc* → 'br' (branch).
  • v_call_graph: xrefs ⨝ functions filtered to kind = 'call'. Surfaces caller/callee names.
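  • v_reachability_from_entry: recursive CTE at function level, seeded with the function that contains the entry_point label and walking function → instructions → xrefs → the target's enclosing function over call/jump/branch xrefs (stored in xrefs.kind as the short tags 'call', 'j', 'br'). UNION (not UNION ALL) handles call-graph cycles.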
  • v_function_first_instruction: functions ⨝ instructions ON address. Convenience for inspecting prologues.
  • v_imports_called: xrefs ⨝ labels filtered to xrefs.kind = 'call' AND labels.kind = 'import'. Per-function import call summary.

All views are CREATE OR REPLACE — re-running is idempotent.
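
The mnemonic-to-kind mapping that v_branch_xrefs encodes in its CASE expression can be written out as a small Rust function. This is a sketch under assumptions: `branch_kind_tag` is a hypothetical name, not the xref.rs API, and it assumes the caller has already restricted input to direct branches (rows with a non-null target_hex), so register-indirect forms never reach it.

```rust
/// Maps a direct-branch mnemonic to the short tag stored in `xrefs.kind`.
/// Hypothetical helper; only the bl/bla, b/ba and bc* cases from the
/// view description are covered.
fn branch_kind_tag(mnemonic: &str) -> Option<&'static str> {
    match mnemonic {
        // Branch-and-link forms are calls.
        "bl" | "bla" => Some("call"),
        // Unconditional branches are jumps.
        "b" | "ba" => Some("j"),
        // Conditional branches: anything spelled with the `bc` prefix.
        m if m.starts_with("bc") => Some("br"),
        // Not a branch mnemonic this sketch knows about.
        _ => None,
    }
}
```

Matching "bl" and "b" exactly, rather than by prefix, keeps the three cases from shadowing one another. Simplified mnemonics (such as beq) are not handled here; whether the formatter emits them is not stated in this note.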

CLI (crates/xenia-app/src/main.rs)

  • New AnalyzeMode enum (Rust / Sql / Both) that derives ValueEnum.
  • Dis { ..., analyze: AnalyzeMode } field with default_value_t = AnalyzeMode::Rust.
  • cmd_dis routes through:
    • Always: write_base → ingest_instructions → write_analysis_results (Rust passes always run, honoring constraint #3).
    • Sql or Both: also create_sql_views.
    • Both: also cross_check_branch_xrefs, then log the result (info if both counts are zero, warn on any disagreement).
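
The routing above can be summarized as a function from mode to an ordered list of steps. The sketch below is std-only and illustrative: the real enum derives clap's ValueEnum, whereas this version uses a hand-written FromStr, and `planned_steps` is a hypothetical helper that only names the steps instead of calling the DbWriter methods.

```rust
use std::str::FromStr;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum AnalyzeMode {
    Rust,
    Sql,
    Both,
}

impl FromStr for AnalyzeMode {
    type Err = String;

    /// Parses the lowercase flag values `rust`, `sql`, `both`.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "rust" => Ok(AnalyzeMode::Rust),
            "sql" => Ok(AnalyzeMode::Sql),
            "both" => Ok(AnalyzeMode::Both),
            other => Err(format!("unknown analyze mode: {}", other)),
        }
    }
}

/// Lists, in order, the steps cmd_dis would run for a given mode.
fn planned_steps(mode: AnalyzeMode) -> Vec<&'static str> {
    // The Rust passes run unconditionally (constraint #3).
    let mut steps = vec!["write_base", "ingest_instructions", "write_analysis_results"];
    // SQL views are an addition, never a replacement.
    if matches!(mode, AnalyzeMode::Sql | AnalyzeMode::Both) {
        steps.push("create_sql_views");
    }
    // Only `both` compares the two read surfaces.
    if mode == AnalyzeMode::Both {
        steps.push("cross_check_branch_xrefs");
    }
    steps
}
```

Every mode shares the same three-step prefix, which is what keeps the default `rust` behavior identical to the pre-flag behavior.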

Constraint #3 honored: Rust analysis stays default and functional

  • Default flag value is rust.
  • Rust passes (func.rs + xref.rs) ALWAYS run when --db is set. The analyze flag only controls whether SQL views are additionally created.
  • The xrefs table is always populated by Rust passes. v_branch_xrefs is an alternative read surface, not a replacement.
  • Data-ref pass (xref.rs lis+addi/ori register tracking) and function detection (func.rs prologue patterns) remain Rust-only — they are not cleanly relational.

Verification

  • cargo build --workspace: clean.
  • cargo test -p xenia-cpu / -p xenia-analysis: all green (10 disasm tests + 9 audit + 168 cpu).
  • xenia dis --analyze=both --db <out> smoke-tested end-to-end: 1.87M instructions written, 299,615 with target_hex (16%, the direct branches), all 5 views queryable, and the cross-check returns (0, 0), i.e. Rust and SQL agree on every (source, target, kind) tuple.
  • Sample reachability: 7,557 of 12,156 functions reachable from entry_point (62%) — sensible for a game with significant dead/unused code.

Bugs found and fixed during verification

  1. Kind-tag mismatch. XrefKind::tag() (xref.rs:21-29) returns the SHORT tags "call" / "j" / "br" (and "read" / "write" / "ref"). The first version of v_branch_xrefs and cross_check_branch_xrefs used the LONG names ('call' / 'jump' / 'branch'), which the comment in db.rs describes for the trace table, not for xrefs. The cross-check returned 195K SQL-only rows. Fixed by changing the CASE to 'call' / 'j' / 'br'. Don't trust the docstring at the top of db.rs: branch_trace.kind uses long names, but xrefs.kind uses short tags.

  2. Reachability view collapsed to 1 row. First version seeded with labels.address (a single instruction VA) and looked for xrefs.source = r.addr. But the entry-point address (mflr r12) has no outgoing xref — branches happen at later instructions of the function. Fixed by reformulating as function-level reachability: seed with the function containing the entry_point label, then walk function → instructions → xrefs → target's enclosing function. UNION handles call-graph cycles.
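
The function-level reformulation in item 2 is a graph traversal with a visited set, which is the role UNION plays in the recursive CTE. A minimal sketch, assuming the function → instructions → xrefs → enclosing-function walk has already been collapsed into a per-function edge map; `reachable_from` and the map layout are hypothetical and not part of sql_views.rs.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Computes the set of functions reachable from `entry`.
/// `edges` maps a function's start address to the start addresses of the
/// functions its branch xrefs land in (an assumed, pre-collapsed form).
fn reachable_from(entry: u64, edges: &HashMap<u64, Vec<u64>>) -> HashSet<u64> {
    let mut seen = HashSet::new();
    let mut queue = VecDeque::new();

    // Seed with the function containing the entry point, not with a
    // single instruction address.
    seen.insert(entry);
    queue.push_back(entry);

    while let Some(func) = queue.pop_front() {
        if let Some(callees) = edges.get(&func) {
            for &callee in callees {
                // `insert` returns false for already-seen functions, so
                // cycles terminate (the UNION-style deduplication).
                if seen.insert(callee) {
                    queue.push_back(callee);
                }
            }
        }
    }
    seen
}
```

Seeding with a function rather than one instruction address is the point of the fix: the outgoing branches sit at later instructions of the entry function, so an instruction-level seed finds no edges to follow.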

LOC delta (Phase 3)

  • xenia-analysis/src/db.rs: +60 (split write_disasm; new methods)
  • xenia-analysis/src/sql_views.rs: +120 (NEW)
  • xenia-analysis/src/sinks/duckdb.rs: +1 line (target_hex column write)
  • xenia-analysis/src/lib.rs: +1 line (mod sql_views)
  • xenia-app/src/main.rs: +35 (AnalyzeMode enum + flag + routing + cross-check log)
  • Net: +217 LOC.

Behavior changes visible to users

  1. New --analyze=rust|sql|both flag on xenia dis, default rust. Backward compatible — existing scripts behave the same.
  2. New target_hex BIGINT column on instructions table. Existing queries work; new column adds query power for SQL-side branch xref derivation.
  3. 5 SQL views available when --analyze is sql or both. Read-only, idempotent.
  4. Cross-check warning in both mode flags any drift between formatter mnemonic strings and xref.rs kind classification.

What's next (Phase 4)

Per /home/fabi/.claude/plans/ok-execute-your-proposed-refactored-dolphin.md:

  • Phase 4: Replace println-only audits with assert-based JSON-fixture goldens. Expand coverage to base + extended + VMX128 (silent-bug area) + DB schema golden + ISO-gated end-to-end consistency.