xenia-rs/migration/claude-memory/project_xenia_rs_disasm_unify_phase2.md
MechaCat02 e6d43a23ac chore: add migration/ bundle for cross-machine setup
Bundles state that lives OUTSIDE the xenia-rs repo so a fresh clone on
another machine can be brought up to identical configuration via
migration/setup.sh:

  - claude-memory/             ~/.claude/projects/-home-fabi-RE-Project-Sylpheed/memory/
                               (103 files, 1.1 MB - MEMORY.md + every
                                project_xenia_rs_*.md from audits
                                addis_signext through audit-058)
  - project-root/dot-claude/   <project-root>/.claude/settings.json
                               (Stop hook + permissions)
  - project-root/ppc-manual/   <project-root>/ppc-manual/
                               (PowerPC reference docs, 397 files, 3.7 MB)
  - project-root/run-canary.sh <project-root>/run-canary.sh
  - README.md                  Human-readable setup checklist
  - setup.sh                   Idempotent installer (also reclones
                               xenia-canary at pinned HEAD 6de80dffe)
  - MANIFEST.md                Per-file mapping + per-file-not-bundled
                               restoration recipe

Excluded from bundle (not shippable via git):
  - Sylpheed ISO (7.8 GB; copyright; manual copy required)
  - sylpheed.db (395 MB; regenerable from XEX via analysis tooling)
  - target/ build artifacts (rebuild on target)
  - audit-runs probe firehoses (.log/.stdout/.stderr ~11 GB; rerun if needed)
  - audit-runs memory dumps (.bin ~4.5 GB; rerun audit-026/027/029 if needed)
  - xenia-canary checkout (setup.sh reclones from
    git.mc02.dev/fabi/Xenia-Canary.git at HEAD 6de80dffe)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 21:38:38 +02:00


name: Disassembler unification - Phase 2 complete (2026-04-27)
description: Iterator + 3 sinks (text/JSON/DuckDB) layered over Phase 1's format(). New `xenia dis --json` flag. db.rs and formatter.rs both drive through enrich_section.
type: project
originSessionId: 680cc54c-e77a-4d2d-a11b-ca562e9a68ec

Phase 2 of disassembler unification is COMPLETE (2026-04-27, same session as Phase 1).

What's in place

xenia-cpu (decoder + iterator)

  • crates/xenia-cpu/src/disasm.rs adds:
    • pub struct DisasmItem { addr, raw, opcode, text } — yielded by the iterator.
    • pub fn iter_disasm(image, image_base, va_start, va_end) -> impl Iterator<Item = DisasmItem> — walks bytes in PPC big-endian, decodes via decoder::decode, formats via format, yields one DisasmItem per 4-byte word. Stops on truncated tail.
    • 2 new unit tests: iter_disasm_walks_byte_slice_in_order, iter_disasm_stops_on_truncated_tail.
  • crates/xenia-cpu/src/lib.rs re-exports DisasmItem, iter_disasm.
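
The word-walking part of iter_disasm can be sketched with std only. This is a minimal illustration, not the code in disasm.rs: WordItem and iter_words are hypothetical names, the decode and format steps are left out, and the clamping of the range to the image is an assumption.

```rust
/// Hypothetical stand-in for DisasmItem: only the address and raw word.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct WordItem {
    pub addr: u32,
    pub raw: u32,
}

/// Walks `image` in 4-byte big-endian words over [va_start, va_end).
/// The range is clamped to the image (an assumption of this sketch).
/// A trailing fragment shorter than 4 bytes is dropped, so the
/// iterator stops on a truncated tail.
pub fn iter_words(
    image: &[u8],
    image_base: u32,
    va_start: u32,
    va_end: u32,
) -> impl Iterator<Item = WordItem> + '_ {
    let end = (va_end.saturating_sub(image_base) as usize).min(image.len());
    let start = (va_start.saturating_sub(image_base) as usize).min(end);
    let first_addr = image_base.wrapping_add(start as u32);
    image[start..end]
        .chunks_exact(4) // never yields a partial word
        .enumerate()
        .map(move |(i, w)| WordItem {
            addr: first_addr.wrapping_add(i as u32 * 4),
            raw: u32::from_be_bytes([w[0], w[1], w[2], w[3]]),
        })
}
```

Because chunks_exact never yields a partial chunk, the truncated-tail behavior needs no explicit length check.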

xenia-analysis (enrichment + sinks)

  • crates/xenia-analysis/src/disasm.rs (NEW, ~50 LOC):
    • pub struct RichDisasmItem<'a> { item, section, function, label } — adds analysis context.
    • pub fn enrich_section(image, image_base, section_name, va_start, va_end, func_analysis, labels) -> impl Iterator<Item = RichDisasmItem<'a>> — wraps iter_disasm with rolling-window function tracking + label lookup.
  • crates/xenia-analysis/src/sinks/mod.rs (NEW): module declarations.
  • crates/xenia-analysis/src/sinks/duckdb.rs (NEW, ~30 LOC): append_instructions(appender, items) -> Result<u64> — DuckDB Appender call per row.
  • crates/xenia-analysis/src/sinks/json.rs (NEW, ~60 LOC): write_jsonl<W: Write>(out, items) -> io::Result<u64> — one JSON object per line. Internal JsonRow<'a> derives Serialize; uses #[serde(skip_serializing_if = "Option::is_none")] to keep rows compact.
  • crates/xenia-analysis/src/sinks/text.rs (NEW, ~50 LOC): write_instr_line<W: Write + ?Sized>(out, item, labels, sections, image_base, data_annotation) — renders one .asm line with branch-target / data-ref annotation. Uses the structured branch_target field (not a regex over the disasm string — cleaner than the old annotate_branch).
  • crates/xenia-analysis/src/lib.rs declares disasm and sinks modules; re-exports RichDisasmItem and enrich_section.
  • crates/xenia-analysis/Cargo.toml adds serde_json = { workspace = true } dep.
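
One way to read "rolling-window function tracking" is a cursor that moves forward through sorted function ranges as addresses ascend. The sketch below assumes that reading. FuncRange and tag_functions are hypothetical names, and the real enrich_section may track functions differently.

```rust
/// Hypothetical function range: [start, end) in guest virtual addresses.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct FuncRange {
    pub start: u32,
    pub end: u32,
}

/// Pairs each address with the start of its containing function, if any.
/// Assumes `addrs` is ascending and `funcs` is sorted by `start` and
/// non-overlapping, so the cursor only ever moves forward.
pub fn tag_functions<'a>(
    addrs: impl Iterator<Item = u32> + 'a,
    funcs: &'a [FuncRange],
) -> impl Iterator<Item = (u32, Option<u32>)> + 'a {
    let mut cursor = 0usize;
    addrs.map(move |addr| {
        // Skip functions that end at or before this address.
        while cursor < funcs.len() && funcs[cursor].end <= addr {
            cursor += 1;
        }
        // The candidate owns the address only if it has already started.
        let owner = funcs
            .get(cursor)
            .filter(|f| f.start <= addr)
            .map(|f| f.start);
        (addr, owner)
    })
}
```

The cursor makes the whole pass linear in instructions plus functions, with no per-instruction search.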

Refactored call sites

  • crates/xenia-analysis/src/db.rs insert_instructions_streaming collapsed from a 50-line byte loop into 12 lines: for section { let items = enrich_section(...); total += sinks::duckdb::append_instructions(&mut appender, items)?; }.
  • crates/xenia-analysis/src/formatter.rs code-section loop collapsed: now iterates enrich_section and calls write_instr_line for the per-line render. Orchestration (function headers, labels, xref comments, import annotations) stays in formatter.rs. The old annotate_branch helper is deleted — branch-target annotation lives in the text sink and uses branch_target: Option<u32> from DisasmText.
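
A minimal sketch of annotation driven by a structured branch_target: Option<u32> rather than by scanning the disasm string. The function name and the "; -> " comment format are assumptions for illustration and not the actual text-sink output.

```rust
use std::collections::HashMap;

/// Appends a target comment only when the instruction carries a
/// structured branch target. Non-branch operands that merely contain
/// hex are never inspected, so they cannot trigger an annotation.
pub fn annotate_line(
    disasm: &str,
    branch_target: Option<u32>,
    labels: &HashMap<u32, String>,
) -> String {
    match branch_target {
        Some(target) => match labels.get(&target) {
            // Known label: show its name.
            Some(name) => format!("{disasm}  ; -> {name}"),
            // Unknown target: fall back to the raw address.
            None => format!("{disasm}  ; -> 0x{target:08X}"),
        },
        None => disasm.to_string(),
    }
}
```

With this shape, a line such as `lis r3, 0x8200` passes through untouched because its branch_target is None, which is the false-positive case a "0x" search could hit.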

CLI

  • crates/xenia-app/src/main.rs: new --json <path> flag on dis subcommand. Writes JSON Lines via sinks::json::write_jsonl per code section. Wires through cmd_dis signature.
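
The JSON Lines shape can be illustrated without serde. The sketch below is std-only and hand-formats each row, whereas the real sink derives Serialize. JsonRowSketch, write_jsonl_sketch, the reduced field set, and the choice of decimal numbers for addr and raw are all assumptions.

```rust
use std::io::{self, Write};

/// Hypothetical reduced row; the real schema has more fields.
pub struct JsonRowSketch<'a> {
    pub addr: u32,
    pub raw: u32,
    pub disasm: &'a str,
    pub branch_target: Option<u32>,
    pub section: &'a str,
}

/// Escapes quotes, backslashes and control characters for a JSON string.
fn escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Writes one JSON object per line and returns the row count.
/// Optional fields are omitted when None, mirroring skip_serializing_if.
pub fn write_jsonl_sketch<'a, W: Write>(
    out: &mut W,
    rows: impl IntoIterator<Item = JsonRowSketch<'a>>,
) -> io::Result<u64> {
    let mut n = 0u64;
    for r in rows {
        write!(
            out,
            "{{\"addr\":{},\"raw\":{},\"disasm\":\"{}\"",
            r.addr,
            r.raw,
            escape(r.disasm)
        )?;
        if let Some(t) = r.branch_target {
            write!(out, ",\"branch_target\":{}", t)?;
        }
        writeln!(out, ",\"section\":\"{}\"}}", escape(r.section))?;
        n += 1;
    }
    Ok(n)
}
```

Since every row is a complete object on its own line, the file can be streamed and filtered line by line without loading it whole.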

Architecture

                 ┌──────────────┐
                 │ image bytes  │
                 │  + image_base│
                 └──────┬───────┘
                        │
                        ▼
   xenia-cpu::iter_disasm(image, base, range)
                        │  yields DisasmItem
                        ▼
xenia-analysis::enrich_section(...).map(|i| RichDisasmItem { i, section, function, label })
                        │  yields RichDisasmItem
                        ▼
       ┌────────────────┼────────────────┐
       ▼                ▼                ▼
sinks::duckdb     sinks::json     sinks::text
append_instructions write_jsonl    write_instr_line
       │                │                │
       ▼                ▼                ▼
   instructions    .jsonl            .asm
   table (DuckDB)   (one row/line)    (formatted)

DecodedInstr (8 bytes, in the decode cache) is unchanged. DisasmItem and RichDisasmItem exist only in the iterator and sink layers.

Constraint #1 honored: DecodedInstr unchanged

Same as Phase 1 — DecodedInstr is still the 8-byte cache-resident struct; DisasmItem is allocated only in the iterator/sink layer.
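
A constraint like this can be guarded at compile time. The sketch below shows the technique on a hypothetical 8-byte layout. The fields of DecodedInstrSketch are invented for illustration, because the real DecodedInstr layout is not described in this note.

```rust
use std::mem::size_of;

/// Hypothetical 8-byte layout (u32 + u16 + u16); the real fields differ.
#[repr(C)]
#[derive(Clone, Copy)]
pub struct DecodedInstrSketch {
    pub raw: u32,
    pub opcode: u16,
    pub flags: u16,
}

// Compilation fails if the struct ever stops being exactly 8 bytes,
// so a cache-resident type cannot grow unnoticed.
const _: () = assert!(size_of::<DecodedInstrSketch>() == 8);
```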

Verification

  • cargo build --workspace clean (one previously-existing analysis warning was fixed during refactor).
  • cargo test -p xenia-cpu — all 168 tests + 10 disasm tests pass (2 new for iter_disasm).
  • cargo test -p xenia-analysis — all 9 audit tests pass.
  • xenia disasm <iso> -n 8 smoke test: same extended-mnemonic output as Phase 1.
  • xenia dis --db --json --quiet <iso> end-to-end smoke test: PENDING (running at write time).

LOC delta (Phase 2)

  • xenia-cpu/src/disasm.rs: +60 (DisasmItem + iter_disasm + 2 tests)
  • xenia-cpu/src/lib.rs: +1
  • xenia-analysis/src/disasm.rs: +50 (new file)
  • xenia-analysis/src/sinks/{mod,duckdb,json,text}.rs: +160 (new files)
  • xenia-analysis/src/db.rs: -38 (collapsed loop)
  • xenia-analysis/src/formatter.rs: -15 (annotate_branch deleted, inner loop replaced)
  • xenia-analysis/Cargo.toml: +1 (serde_json dep)
  • xenia-app/src/main.rs: +20 (--json flag + sink call)
  • Net: ~+240 LOC (60 + 1 + 50 + 160 - 38 - 15 + 1 + 20 = +239; in line with the plan's "+250 / -250, net 0" estimate, modulo the new JSON sink, which had no prior counterpart).

Behavior changes visible to users

  1. New xenia dis --json <path> flag — emits one structured JSON object per instruction. Schema: addr, raw, mnemonic, operands, disasm, ext_mnemonic?, ext_operands?, ext_disasm?, branch_target?, section, function?, label?.
  2. Branch-target annotation in the .asm text output is now driven by the structured branch_target field (was a regex find of "0x" in the disasm string). Functionally equivalent for direct branches; immune to false-positive matches in non-branch operands containing hex.
  3. Each sink runs a single decode/format pass per instruction, but the sinks do not share that pass: producing db + json + asm output together decodes every instruction three times (once per sink). Phase 7 / future work could fan out from a single iterator if needed.
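
The fan-out mentioned in item 3 could look roughly like the sketch below, where one pass over the iterator feeds every sink. This illustrates the idea only and is not planned code: the Sink trait, fan_out, and the two toy sinks are hypothetical.

```rust
/// Hypothetical sink interface: consumes one item at a time.
pub trait Sink<T> {
    fn push(&mut self, item: &T);
}

/// Drives all sinks from a single pass over `items`, so the upstream
/// decode work happens once per item regardless of the sink count.
pub fn fan_out<T>(items: impl Iterator<Item = T>, sinks: &mut [&mut dyn Sink<T>]) -> u64 {
    let mut n = 0;
    for item in items {
        for s in sinks.iter_mut() {
            s.push(&item);
        }
        n += 1;
    }
    n
}

/// Toy sink that renders each raw word as hex text.
pub struct HexLines(pub Vec<String>);
impl Sink<u32> for HexLines {
    fn push(&mut self, item: &u32) {
        self.0.push(format!("{item:08X}"));
    }
}

/// Toy sink that only counts rows.
pub struct RowCount(pub u64);
impl Sink<u32> for RowCount {
    fn push(&mut self, _item: &u32) {
        self.0 += 1;
    }
}
```

The trade-off is that the sinks must accept items by reference and be driven in lockstep, which differs from the current design where each sink owns its iterator.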

What's next (Phases 3-4)

Per /home/fabi/.claude/plans/ok-execute-your-proposed-refactored-dolphin.md:

  • Phase 3: Split db.rs into ingest_instructions + write_analysis_results; add target_hex BIGINT column on instructions; add crates/xenia-analysis/src/sql_views.rs with v_branch_xrefs/v_call_graph/v_reachability_from_entry/v_function_first_instruction/v_imports_called; add --analyze=rust|sql|both flag (default rust). Rust passes (func.rs, xref.rs) stay default.
  • Phase 4: Replace println-only audits with assert-based JSON-fixture goldens. Expand coverage to base + extended + VMX128 (silent-bug area) + DB schema + ISO-gated end-to-end.