chore: add migration/ bundle for cross-machine setup

Bundles state that lives OUTSIDE the xenia-rs repo so a fresh clone on
another machine can be brought up to identical configuration via
migration/setup.sh:

  - claude-memory/             ~/.claude/projects/-home-fabi-RE-Project-Sylpheed/memory/
                               (103 files, 1.1 MB - MEMORY.md + every
                                project_xenia_rs_*.md from audits
                                addis_signext through audit-058)
  - project-root/dot-claude/   <project-root>/.claude/settings.json
                               (Stop hook + permissions)
  - project-root/ppc-manual/   <project-root>/ppc-manual/
                               (PowerPC reference docs, 397 files, 3.7 MB)
  - project-root/run-canary.sh <project-root>/run-canary.sh
  - README.md                  Human-readable setup checklist
  - setup.sh                   Idempotent installer (also reclones
                               xenia-canary at pinned HEAD 6de80dffe)
  - MANIFEST.md                Per-file mapping + per-file-not-bundled
                               restoration recipe

Excluded from bundle (not shippable via git):
  - Sylpheed ISO (7.8 GB; copyright; manual copy required)
  - sylpheed.db (395 MB; regenerable from XEX via analysis tooling)
  - target/ build artifacts (rebuild on target)
  - audit-runs probe firehoses (.log/.stdout/.stderr ~11 GB; rerun if needed)
  - audit-runs memory dumps (.bin ~4.5 GB; rerun audit-026/027/029 if needed)
  - xenia-canary checkout (setup.sh reclones from
    git.mc02.dev/fabi/Xenia-Canary.git at HEAD 6de80dffe)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
MechaCat02
2026-05-10 21:38:38 +02:00
parent 8e709b0a24
commit e6d43a23ac
505 changed files with 86028 additions and 0 deletions


@@ -0,0 +1,97 @@
---
name: Disassembler unification — Phase 2 complete (2026-04-27)
description: Iterator + 3 sinks (text/JSON/DuckDB) layered over Phase 1's format(). New `xenia dis --json` subcommand. db.rs and formatter.rs both drive through enrich_section.
type: project
originSessionId: 680cc54c-e77a-4d2d-a11b-ca562e9a68ec
---
**Phase 2 of disassembler unification is COMPLETE** (2026-04-27, same session as Phase 1).
## What's in place
### xenia-cpu (decoder + iterator)
- **[crates/xenia-cpu/src/disasm.rs](crates/xenia-cpu/src/disasm.rs)** adds:
  - `pub struct DisasmItem { addr, raw, opcode, text }` — yielded by the iterator.
  - `pub fn iter_disasm(image, image_base, va_start, va_end) -> impl Iterator<Item = DisasmItem>` — walks bytes in PPC big-endian, decodes via `decoder::decode`, formats via `format`, yields one `DisasmItem` per 4-byte word. Stops on truncated tail.
  - 2 new unit tests: `iter_disasm_walks_byte_slice_in_order`, `iter_disasm_stops_on_truncated_tail`.
- **[crates/xenia-cpu/src/lib.rs](crates/xenia-cpu/src/lib.rs)** re-exports `DisasmItem`, `iter_disasm`.
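
As a rough illustration of the word walk described above (not the actual `iter_disasm` code), the sketch below reads big-endian 4-byte words from a byte slice and ends at a truncated tail. `WordItem` and `iter_words` are hypothetical names introduced here; the real iterator also decodes and formats each word.

```rust
/// Hypothetical stand-in for the yielded item; the real `DisasmItem`
/// also carries the decoded opcode and the formatted text.
#[derive(Debug, PartialEq)]
struct WordItem {
    addr: u32,
    raw: u32,
}

/// Walk `[va_start, va_end)` in 4-byte steps, reading each word big-endian.
/// Iteration ends at the first address that lies below `image_base` or whose
/// 4 bytes do not fit entirely inside `image` (the truncated-tail case).
fn iter_words<'a>(
    image: &'a [u8],
    image_base: u32,
    va_start: u32,
    va_end: u32,
) -> impl Iterator<Item = WordItem> + 'a {
    (va_start..va_end).step_by(4).map_while(move |addr| {
        // Translate the virtual address into an offset within the image.
        let off = addr.checked_sub(image_base)? as usize;
        // `get` returns None when fewer than 4 bytes remain.
        let b = image.get(off..off.checked_add(4)?)?;
        let raw = u32::from_be_bytes([b[0], b[1], b[2], b[3]]);
        Some(WordItem { addr, raw })
    })
}
```

Called on a 10-byte image with a 12-byte address range, this yields two items and then stops, because the last two bytes do not form a complete word.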
### xenia-analysis (enrichment + sinks)
- **[crates/xenia-analysis/src/disasm.rs](crates/xenia-analysis/src/disasm.rs)** (NEW, ~50 LOC):
  - `pub struct RichDisasmItem<'a> { item, section, function, label }` — adds analysis context.
  - `pub fn enrich_section(image, image_base, section_name, va_start, va_end, func_analysis, labels) -> impl Iterator<Item = RichDisasmItem<'a>>` — wraps `iter_disasm` with rolling-window function tracking + label lookup.
- **[crates/xenia-analysis/src/sinks/mod.rs](crates/xenia-analysis/src/sinks/mod.rs)** (NEW): module declarations.
- **[crates/xenia-analysis/src/sinks/duckdb.rs](crates/xenia-analysis/src/sinks/duckdb.rs)** (NEW, ~30 LOC): `append_instructions(appender, items) -> Result<u64>` — DuckDB Appender call per row.
- **[crates/xenia-analysis/src/sinks/json.rs](crates/xenia-analysis/src/sinks/json.rs)** (NEW, ~60 LOC): `write_jsonl<W: Write>(out, items) -> io::Result<u64>` — one JSON object per line. Internal `JsonRow<'a>` derives Serialize; uses `#[serde(skip_serializing_if = "Option::is_none")]` to keep rows compact.
- **[crates/xenia-analysis/src/sinks/text.rs](crates/xenia-analysis/src/sinks/text.rs)** (NEW, ~50 LOC): `write_instr_line<W: Write + ?Sized>(out, item, labels, sections, image_base, data_annotation)` — renders one .asm line with branch-target / data-ref annotation. Uses the structured `branch_target` field (not a regex over the disasm string — cleaner than the old `annotate_branch`).
- **[crates/xenia-analysis/src/lib.rs](crates/xenia-analysis/src/lib.rs)** declares `disasm` and `sinks` modules; re-exports `RichDisasmItem` and `enrich_section`.
- **[crates/xenia-analysis/Cargo.toml](crates/xenia-analysis/Cargo.toml)** adds `serde_json = { workspace = true }` dep.
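
The note does not show how the rolling-window function tracking in `enrich_section` is implemented. One plausible shape, assuming functions are available as sorted, non-overlapping address ranges, is a forward-only cursor. `FuncRange` and `FuncCursor` are hypothetical names used only for this sketch.

```rust
/// Hypothetical function record: a half-open address range `[start, end)`.
#[derive(Debug, Clone, Copy, PartialEq)]
struct FuncRange {
    start: u32,
    end: u32,
}

/// Tracks which function contains the current address while addresses are
/// visited in ascending order. `funcs` must be sorted by `start` and must
/// not overlap.
struct FuncCursor<'a> {
    funcs: &'a [FuncRange],
    idx: usize,
}

impl<'a> FuncCursor<'a> {
    fn new(funcs: &'a [FuncRange]) -> Self {
        FuncCursor { funcs, idx: 0 }
    }

    /// Returns the function containing `addr`, if any. The cursor only moves
    /// forward, so a whole section walk costs O(instructions + functions)
    /// rather than one search per instruction.
    fn lookup(&mut self, addr: u32) -> Option<FuncRange> {
        // Skip every function that ends at or before the current address.
        while self.idx < self.funcs.len() && self.funcs[self.idx].end <= addr {
            self.idx += 1;
        }
        // The candidate only matches if it has already started.
        self.funcs.get(self.idx).copied().filter(|f| f.start <= addr)
    }
}
```

Because the cursor never rewinds, this only works when addresses arrive in ascending order, which holds for a linear walk over one section.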
### Refactored call sites
- **[crates/xenia-analysis/src/db.rs](crates/xenia-analysis/src/db.rs)** `insert_instructions_streaming` collapsed from a 50-line byte loop into 12 lines: `for section { let items = enrich_section(...); total += sinks::duckdb::append_instructions(&mut appender, items)?; }`.
- **[crates/xenia-analysis/src/formatter.rs](crates/xenia-analysis/src/formatter.rs)** code-section loop collapsed: now iterates `enrich_section` and calls `write_instr_line` for the per-line render. Orchestration (function headers, labels, xref comments, import annotations) stays in formatter.rs. The old `annotate_branch` helper is **deleted** — branch-target annotation lives in the text sink and uses `branch_target: Option<u32>` from `DisasmText`.
### CLI
- **[crates/xenia-app/src/main.rs](crates/xenia-app/src/main.rs)**: new `--json <path>` flag on `dis` subcommand. Writes JSON Lines via `sinks::json::write_jsonl` per code section. Wires through `cmd_dis` signature.
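
For readers unfamiliar with JSON Lines, the sketch below shows the output shape: one object per line, with optional fields omitted when absent. It is deliberately hand-rolled on top of `std` only, whereas the real sink derives `Serialize` and uses `serde_json`. The `Row` struct, its field subset, and the choice to print addresses as plain integers are assumptions made for illustration.

```rust
use std::io::{self, Write};

/// Hypothetical row with a subset of the schema; `function` and `label`
/// are optional and are left out of the output when `None`.
struct Row<'a> {
    addr: u32,
    raw: u32,
    disasm: &'a str,
    function: Option<u32>,
    label: Option<&'a str>,
}

/// Minimal JSON string escaping (quotes, backslashes, control characters).
fn escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Write one JSON object per line and return the number of rows written.
fn write_jsonl<'a, W: Write>(
    out: &mut W,
    rows: impl IntoIterator<Item = Row<'a>>,
) -> io::Result<u64> {
    let mut n = 0u64;
    for r in rows {
        write!(out, "{{\"addr\":{},\"raw\":{},\"disasm\":\"{}\"", r.addr, r.raw, escape(r.disasm))?;
        if let Some(f) = r.function {
            write!(out, ",\"function\":{}", f)?;
        }
        if let Some(l) = r.label {
            write!(out, ",\"label\":\"{}\"", escape(l))?;
        }
        writeln!(out, "}}")?;
        n += 1;
    }
    Ok(n)
}
```

Taking any `W: Write` lets the same function target a file, stdout, or an in-memory `Vec<u8>` in tests.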
## Architecture
```
                    ┌──────────────┐
                    │  image bytes │
                    │  + image_base│
                    └──────┬───────┘
      xenia-cpu::iter_disasm(image, base, range)
                           │ yields DisasmItem
  xenia-analysis::enrich_section(...).map(|i| RichDisasmItem { i, section, function, label })
                           │ yields RichDisasmItem
          ┌────────────────┼────────────────┐
          ▼                ▼                ▼
    sinks::duckdb     sinks::json      sinks::text
 append_instructions  write_jsonl   write_instr_line
          │                │                │
          ▼                ▼                ▼
    instructions        .jsonl            .asm
   table (DuckDB)   (one row/line)     (formatted)
```
`DecodedInstr` (8 bytes, in decode cache) is unchanged. `DisasmItem` and `RichDisasmItem` exist only in the iterator/sink layer.
## Constraint #1 honored: DecodedInstr unchanged
Same as Phase 1 — `DecodedInstr` is still the 8-byte cache-resident struct; `DisasmItem` is allocated only in the iterator/sink layer.
## Verification
- `cargo build --workspace` clean (one previously-existing analysis warning was fixed during refactor).
- `cargo test -p xenia-cpu` — all 168 tests + 10 disasm tests pass (2 new for `iter_disasm`).
- `cargo test -p xenia-analysis` — all 9 audit tests pass.
- `xenia disasm <iso> -n 8` smoke test: same extended-mnemonic output as Phase 1.
- `xenia dis --db --json --quiet <iso>` end-to-end smoke test: PENDING (running at write time).
## LOC delta (Phase 2)
- xenia-cpu/src/disasm.rs: +60 (DisasmItem + iter_disasm + 2 tests)
- xenia-cpu/src/lib.rs: +1
- xenia-analysis/src/disasm.rs: +50 (new file)
- xenia-analysis/src/sinks/{mod,duckdb,json,text}.rs: +160 (new files)
- xenia-analysis/src/db.rs: -38 (collapsed loop)
- xenia-analysis/src/formatter.rs: -15 (annotate_branch deleted, inner loop replaced)
- xenia-analysis/Cargo.toml: +1 (serde_json dep)
- xenia-app/src/main.rs: +20 (--json flag + sink call)
- **Net: ~+240 LOC** (in line with the plan's "+250 / -250 net 0" estimate, modulo the new JSON sink which had no prior counterpart).
## Behavior changes visible to users
1. **New `xenia dis --json <path>` flag** — emits one structured JSON object per instruction. Schema: `addr, raw, mnemonic, operands, disasm, ext_mnemonic?, ext_operands?, ext_disasm?, branch_target?, section, function?, label?`.
2. Branch-target annotation in the .asm text output is now driven by the structured `branch_target` field (was a regex find of "0x" in the disasm string). Functionally equivalent for direct branches; immune to false-positive matches in non-branch operands containing hex.
3. All three sinks share the same decode/format code path, but each sink drives its own iterator, so producing db+json+asm output decodes every instruction 3 times (once per sink). Phase 7 / future work could fan out from a single iterator if needed.
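
To make the difference in item 2 concrete, the sketch below contrasts a text-scanning heuristic with a structured field. It is an approximation: the deleted `annotate_branch` helper is not shown in this note, and `Line`, `target_from_text`, and `target_from_field` are hypothetical names. The disassembly strings in the usage note are illustrative, not captured tool output.

```rust
/// Hypothetical, trimmed-down view of one formatted instruction.
struct Line<'a> {
    disasm: &'a str,
    branch_target: Option<u32>,
}

/// Text heuristic: treat the first "0x..." token in the string as a target.
/// This also fires on hex immediates of non-branch instructions.
fn target_from_text(disasm: &str) -> Option<u32> {
    let start = disasm.find("0x")? + 2;
    let hex: String = disasm[start..]
        .chars()
        .take_while(|c| c.is_ascii_hexdigit())
        .collect();
    u32::from_str_radix(&hex, 16).ok()
}

/// Structured approach: only instructions whose decoder output carries a
/// target are annotated, regardless of what the text looks like.
fn target_from_field(line: &Line) -> Option<u32> {
    line.branch_target
}
```

For a direct branch such as `bl 0x82001000` both functions agree. For a line like `addi r3, r4, 0x10` the text heuristic reports `0x10` as a target, while the structured field stays `None`.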
## What's next (Phases 3-4)
Per the [plan](/home/fabi/.claude/plans/ok-execute-your-proposed-refactored-dolphin.md):
- **Phase 3**: Split `db.rs` into `ingest_instructions` + `write_analysis_results`; add `target_hex BIGINT` column on `instructions`; add `crates/xenia-analysis/src/sql_views.rs` with `v_branch_xrefs`/`v_call_graph`/`v_reachability_from_entry`/`v_function_first_instruction`/`v_imports_called`; add `--analyze=rust|sql|both` flag (default `rust`). Rust passes (`func.rs`, `xref.rs`) stay default.
- **Phase 4**: Replace println-only audits with assert-based JSON-fixture goldens. Expand coverage to base + extended + VMX128 (silent-bug area) + DB schema + ISO-gated end-to-end.