chore: add migration/ bundle for cross-machine setup

Bundles state that lives OUTSIDE the xenia-rs repo so a fresh clone on
another machine can be brought up to identical configuration via
migration/setup.sh:

  - claude-memory/             ~/.claude/projects/-home-fabi-RE-Project-Sylpheed/memory/
                               (103 files, 1.1 MB - MEMORY.md + every
                                project_xenia_rs_*.md from audits
                                addis_signext through audit-058)
  - project-root/dot-claude/   <project-root>/.claude/settings.json
                               (Stop hook + permissions)
  - project-root/ppc-manual/   <project-root>/ppc-manual/
                               (PowerPC reference docs, 397 files, 3.7 MB)
  - project-root/run-canary.sh <project-root>/run-canary.sh
  - README.md                  Human-readable setup checklist
  - setup.sh                   Idempotent installer (also reclones
                               xenia-canary at pinned HEAD 6de80dffe)
  - MANIFEST.md                Per-file mapping + restoration recipe for
                               each file not bundled

Excluded from bundle (not shippable via git):
  - Sylpheed ISO (7.8 GB; copyright; manual copy required)
  - sylpheed.db (395 MB; regenerable from XEX via analysis tooling)
  - target/ build artifacts (rebuild on target)
  - audit-runs probe firehoses (.log/.stdout/.stderr ~11 GB; rerun if needed)
  - audit-runs memory dumps (.bin ~4.5 GB; rerun audit-026/027/029 if needed)
  - xenia-canary checkout (setup.sh reclones from
    git.mc02.dev/fabi/Xenia-Canary.git at HEAD 6de80dffe)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
MechaCat02
2026-05-10 21:38:38 +02:00
parent 8e709b0a24
commit e6d43a23ac
505 changed files with 86028 additions and 0 deletions

@@ -0,0 +1,78 @@
---
name: Disassembler unification — Phase 3 complete (2026-04-27)
description: db.rs split into ingest_instructions + write_analysis_results; new sql_views.rs with 5 views; --analyze=rust|sql|both CLI flag; target_hex column on instructions; Rust/SQL cross-check warning in `both` mode.
type: project
originSessionId: 680cc54c-e77a-4d2d-a11b-ca562e9a68ec
---
**Phase 3 of disassembler unification is COMPLETE** (2026-04-27, same session).
## What's in place
### Schema
- **`instructions` table** gains a `target_hex BIGINT NULL` column populated from `DisasmText.branch_target`. Indexed via `idx_instructions_target_hex`. Documented in [crates/xenia-analysis/src/db.rs](crates/xenia-analysis/src/db.rs) module docstring.
- **DuckDB sink** ([crates/xenia-analysis/src/sinks/duckdb.rs](crates/xenia-analysis/src/sinks/duckdb.rs)) writes the new column.
### Split DbWriter API ([crates/xenia-analysis/src/db.rs](crates/xenia-analysis/src/db.rs))
- `pub fn ingest_instructions(pe, info, func_analysis, labels)` — creates `instructions` table + indices and streams rows via the iterator + duckdb sink. **No analysis tables.**
- `pub fn write_analysis_results(pe, info, func_analysis, labels, xrefs)` — creates `functions`, `labels`, `xrefs` tables + indices. Populated from Rust pass output.
- `pub fn write_disasm(...)` — back-compat wrapper that calls both. Existing callers (e.g. `cmd_exec`) keep working unchanged.
- `pub fn create_sql_views(&mut self)` — runs the SQL view definitions from `crate::sql_views::ALL_VIEWS`.
- `pub fn cross_check_branch_xrefs(&self) -> Result<(u64, u64)>`: returns `(sql_only, rust_only)` row counts for the symmetric difference between `v_branch_xrefs` and `xrefs WHERE kind IN ('call','j','br')` (the short tags that `XrefKind::tag()` emits).
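
The cross-check can be pictured as a set difference in both directions over `(source, target, kind)` tuples. The following is a minimal in-memory sketch, not the actual `DbWriter` method: the `Xref` alias and the `cross_check` function are hypothetical names, and the real implementation runs against DuckDB rather than slices.

```rust
use std::collections::HashSet;

/// One branch cross-reference: (source address, target address, kind tag).
/// Hypothetical alias used only for this sketch.
type Xref = (u64, u64, &'static str);

/// Returns `(sql_only, rust_only)`: the number of distinct tuples that
/// appear on exactly one side. `(0, 0)` means both sides agree.
fn cross_check(sql_rows: &[Xref], rust_rows: &[Xref]) -> (u64, u64) {
    let sql: HashSet<&Xref> = sql_rows.iter().collect();
    let rust: HashSet<&Xref> = rust_rows.iter().collect();
    let sql_only = sql.difference(&rust).count() as u64;
    let rust_only = rust.difference(&sql).count() as u64;
    (sql_only, rust_only)
}
```

Because the kind tag is part of the tuple, a row whose addresses match but whose tag differs is counted once on each side.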
### SQL views ([crates/xenia-analysis/src/sql_views.rs](crates/xenia-analysis/src/sql_views.rs)) — 5 views
- `v_branch_xrefs`: derived from an `instructions.target_hex` self-join. The CASE on mnemonic mirrors the `xref.rs` kind logic: `bl`/`bla` → call (`'call'`), `b`/`ba` → jump (`'j'`), `bc*` → branch (`'br'`).
- `v_call_graph`: `xrefs ⨝ functions` filtered to `kind = 'call'`. Surfaces caller/callee names.
- `v_reachability_from_entry`: recursive CTE seeded from `labels.name = 'entry_point'`, transitive over `xrefs.kind IN ('call','j','br')`. `UNION` (not `UNION ALL`) deduplicates rows, so the recursion terminates on call-graph cycles.
- `v_function_first_instruction`: `functions ⨝ instructions ON address`. Convenience for inspecting prologues.
- `v_imports_called`: `xrefs ⨝ labels` filtered to `xrefs.kind = 'call' AND labels.kind = 'import'`. Per-function import call summary.
All views are `CREATE OR REPLACE` — re-running is idempotent.
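
As a rough illustration of the classification that the `v_branch_xrefs` CASE expression has to mirror, here is a sketch in Rust. The function name `branch_kind_tag` is hypothetical. The tags are the short ones (`"call"` / `"j"` / `"br"`) that the bug notes in this file attribute to `XrefKind::tag()`. How the real code treats simplified or indirect branch mnemonics is not covered by this note, so the sketch classifies only the forms named above.

```rust
/// Maps a branch mnemonic to the short kind tag stored in `xrefs.kind`.
/// Returns `None` for anything this sketch does not classify.
/// Hypothetical helper; the real logic lives in `xref.rs` and the SQL view.
fn branch_kind_tag(mnemonic: &str) -> Option<&'static str> {
    match mnemonic {
        // Branch-and-link forms record a return address, so they are calls.
        "bl" | "bla" => Some("call"),
        // Unconditional branches without link are jumps.
        "b" | "ba" => Some("j"),
        // Any `bc*` form is treated as a conditional branch.
        m if m.starts_with("bc") => Some("br"),
        _ => None,
    }
}
```

The exact-match arms come before the prefix arm, so `b` and `bl` are never caught by the `bc*` rule.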
### CLI ([crates/xenia-app/src/main.rs](crates/xenia-app/src/main.rs))
- New `AnalyzeMode` enum (`Rust` / `Sql` / `Both`) deriving `ValueEnum`.
- `Dis { ..., analyze: AnalyzeMode }` field with `default_value_t = AnalyzeMode::Rust`.
- `cmd_dis` routes through:
  - Always: `write_base` → `ingest_instructions` → `write_analysis_results` (Rust passes always run, honoring constraint #3).
  - `Sql` or `Both`: also `create_sql_views`.
  - `Both`: also `cross_check_branch_xrefs`, then log the result (info if both counts are zero, warn otherwise).
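
The routing above can be sketched as follows. This is an illustration under assumptions, not the code in `main.rs`: it uses `std::str::FromStr` instead of clap's `ValueEnum` so it stays self-contained, and `planned_steps` is a hypothetical helper that returns step names rather than calling the real `DbWriter` methods.

```rust
use std::str::FromStr;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum AnalyzeMode {
    Rust,
    Sql,
    Both,
}

impl FromStr for AnalyzeMode {
    type Err = String;

    /// Parses the lowercase flag values `rust`, `sql`, and `both`.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "rust" => Ok(Self::Rust),
            "sql" => Ok(Self::Sql),
            "both" => Ok(Self::Both),
            other => Err(format!("unknown analyze mode: {other}")),
        }
    }
}

/// Lists, in order, the steps `cmd_dis` would run for a given mode.
fn planned_steps(mode: AnalyzeMode) -> Vec<&'static str> {
    // The Rust passes run unconditionally (constraint #3).
    let mut steps = vec!["write_base", "ingest_instructions", "write_analysis_results"];
    if matches!(mode, AnalyzeMode::Sql | AnalyzeMode::Both) {
        steps.push("create_sql_views");
    }
    if mode == AnalyzeMode::Both {
        steps.push("cross_check_branch_xrefs");
    }
    steps
}
```

The SQL steps are purely additive: every mode starts with the same three Rust-side steps, so `--analyze=rust` behaves exactly like the pre-flag command.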
## Constraint #3 honored: Rust analysis stays default and functional
- Default flag value is `rust`.
- Rust passes (`func.rs` + `xref.rs`) ALWAYS run when `--db` is set. The `analyze` flag only controls whether SQL views are *additionally* created.
- The `xrefs` table is always populated by Rust passes. `v_branch_xrefs` is an alternative read surface, not a replacement.
- Data-ref pass (xref.rs lis+addi/ori register tracking) and function detection (func.rs prologue patterns) remain Rust-only — they are not cleanly relational.
## Verification
- `cargo build --workspace`: clean.
- `cargo test -p xenia-cpu` / `-p xenia-analysis`: all green (10 disasm tests + 9 audit + 168 cpu).
- `xenia dis --analyze=both --db <out>` smoke verified end-to-end: 1.87M instructions written, 299,615 with `target_hex` (16% — direct branches), all 5 views queryable, cross-check returns `(0, 0)` — Rust and SQL agree on every (source, target, kind) tuple.
- Sample reachability: 7,557 of 12,156 functions reachable from entry_point (62%) — sensible for a game with significant dead/unused code.
### Bugs found and fixed during verification
1. **Kind-tag mismatch.** `XrefKind::tag()` ([xref.rs:21-29](crates/xenia-analysis/src/xref.rs)) returns the SHORT tags `"call"` / `"j"` / `"br"` (and `"read"` / `"write"` / `"ref"`). The first version of `v_branch_xrefs` and `cross_check_branch_xrefs` used the LONG names (`'call'` / `'jump'` / `'branch'`), which the comment in [db.rs](crates/xenia-analysis/src/db.rs) describes for the *trace* table, not `xrefs`. Cross-check returned 195K SQL-only rows. Fixed by changing CASE to `'call'` / `'j'` / `'br'`. **Don't trust the docstring at the top of db.rs:** `branch_trace.kind` uses long names but `xrefs.kind` uses short tags.
2. **Reachability view collapsed to 1 row.** First version seeded with `labels.address` (a single instruction VA) and looked for `xrefs.source = r.addr`. But the entry-point address (`mflr r12`) has no outgoing xref — branches happen at later instructions of the function. Fixed by reformulating as function-level reachability: seed with the function containing the entry_point label, then walk `function → instructions → xrefs → target's enclosing function`. `UNION` handles call-graph cycles.
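
The function-level walk from fix 2 can be sketched in plain Rust. This is an in-memory analogue of the recursive CTE, not the view itself. The names `enclosing` and `reachable_functions` are hypothetical, and the sketch assumes an address belongs to the function with the greatest start address at or below it, whereas the real schema may carry explicit function ranges. The `seen` set plays the role of `UNION`: each function is enqueued once, so cycles terminate.

```rust
use std::collections::{BTreeSet, HashSet, VecDeque};

/// Start address of the function assumed to contain `addr`
/// (greatest function start that is <= addr).
fn enclosing(starts: &BTreeSet<u64>, addr: u64) -> Option<u64> {
    starts.range(..=addr).next_back().copied()
}

/// Functions reachable from the function containing `entry`,
/// following (source, target) xrefs at function granularity.
fn reachable_functions(
    starts: &BTreeSet<u64>,
    xrefs: &[(u64, u64)],
    entry: u64,
) -> HashSet<u64> {
    let mut seen = HashSet::new();
    let mut queue = VecDeque::new();
    if let Some(f) = enclosing(starts, entry) {
        seen.insert(f);
        queue.push_back(f);
    }
    while let Some(func) = queue.pop_front() {
        for &(src, dst) in xrefs {
            // Only follow xrefs whose source instruction lies inside `func`.
            if enclosing(starts, src) != Some(func) {
                continue;
            }
            if let Some(callee) = enclosing(starts, dst) {
                // `insert` returns false for already-visited functions.
                if seen.insert(callee) {
                    queue.push_back(callee);
                }
            }
        }
    }
    seen
}
```

Seeding with the enclosing function instead of the entry instruction is what avoids the one-row collapse: the outgoing branch sits at a later address inside the same function, and it is still picked up.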
## LOC delta (Phase 3)
- xenia-analysis/src/db.rs: +60 (split write_disasm; new methods)
- xenia-analysis/src/sql_views.rs: +120 (NEW)
- xenia-analysis/src/sinks/duckdb.rs: +1 line (target_hex column write)
- xenia-analysis/src/lib.rs: +1 line (mod sql_views)
- xenia-app/src/main.rs: +35 (AnalyzeMode enum + flag + routing + cross-check log)
- **Net: +217 LOC**.
## Behavior changes visible to users
1. **New `--analyze=rust|sql|both` flag on `xenia dis`**, default `rust`. Backward compatible — existing scripts behave the same.
2. **New `target_hex BIGINT` column on `instructions` table**. Existing queries work; new column adds query power for SQL-side branch xref derivation.
3. **5 SQL views** available when `--analyze` is `sql` or `both`. Read-only, idempotent.
4. **Cross-check warning** in `both` mode flags any drift between formatter mnemonic strings and `xref.rs` kind classification.
## What's next (Phase 4)
Per the [plan](/home/fabi/.claude/plans/ok-execute-your-proposed-refactored-dolphin.md):
- **Phase 4**: Replace println-only audits with assert-based JSON-fixture goldens. Expand coverage to base + extended + VMX128 (silent-bug area) + DB schema golden + ISO-gated end-to-end consistency.