---
name: Disassembler unification - Phase 4 complete (2026-04-27)
description: Assert-based fixture goldens replace println-only audits. Three JSON snapshots locked, VMX128 silent-bug area covered by direct accessor unit tests + integration fixture, DB schema golden enforces 7-table column layout + 5 SQL views.
type: project
originSessionId: 680cc54c-e77a-4d2d-a11b-ca562e9a68ec
---

**Phase 4 of disassembler unification is COMPLETE** (2026-04-27, same session).

## What's in place

### Fixture-based goldens for the disassembler

- **[crates/xenia-cpu/tests/disasm_goldens.rs](crates/xenia-cpu/tests/disasm_goldens.rs)** (~280 LOC): three tests, each of which loads a JSON fixture and asserts that every field of `xenia_cpu::disasm::format(decode(raw, addr))` matches.
  - `tests/golden/base_mnemonics.json`: 77 cases covering common ALU / load-store / branch / compare / FPU forms.
  - `tests/golden/extended_mnemonics.json`: 51 cases covering the simplified-mnemonic priority order (`li`, `lis`, `subi`, `mr`, `not`, `nop`, `slwi`, `srwi`, `clrlwi`, `clrrwi`, `extlwi`, `clrldi`, `srdi`, `rotldi`, `cmpwi`, `cmpdi`, `cmplwi`, `blr`, `blrl`, `bctr`, `bctrl`, `beqlr`, `bnelr`, `beq`, `bne`, `blt`, `bge`, `bgt`, `ble`, `bdnz`, `bdz`, `b` from `bc 20`, `mflr`, `mfctr`, `mfxer`, `mtlr`, `mtctr`, `mtxer`, `crnot`, `crclr`, `crset`, `crmove`, `lwsync`, `trap`, `tdeqi`, `twlti`).
  - `tests/golden/vmx128_registers.json`: 16 cases covering standard VMX (5-bit regs) and the silent-bug VMX128 op6 vd128 high-bit area (vd128 = 96 and 127 with vrlimi128). Lower-bit encodings record what they actually decode to, since the op6 secondary key constrains bits 21-23.
- **Regen workflow**: `REGEN_GOLDENS=1 cargo test -p xenia-cpu --test disasm_goldens` overwrites all three fixtures from current `format()` output. A first run also auto-creates a fixture if the file is missing, then panics so that the developer has to inspect and commit it.
- **xenia-cpu Cargo.toml** gains `serde` + `serde_json` as `dev-dependencies` only. The production lib stays serde-free, honoring constraint #1.

### VMX128 silent-bug area: direct accessor unit tests

- **[crates/xenia-cpu/src/decoder.rs](crates/xenia-cpu/src/decoder.rs)** has 7 new unit tests in the existing `tests` module that pin the canonical bit positions for `va128`/`vb128`/`vd128`/`vs128`:
  - `vmx128_vd128_low_5_bits_only`: 32 iterations covering vd_lo = 0..31 with vd_b21 = vd_b22 = 0.
  - `vmx128_vd128_bit21_adds_32`: bit 21 = 1 produces vd128 = 32.
  - `vmx128_vd128_bit22_adds_64`: bit 22 = 1 produces vd128 = 64.
  - `vmx128_vd128_full_127`: vd_lo = 31 + bit 21 + bit 22 = 127.
  - `vmx128_va128_uses_bit29`: va128 = bits 6-10 + bit 29.
  - `vmx128_vb128_uses_bits28_and_30`: vb128 = bits 16-20 + bit 28 + bit 30.
  - `vmx128_vs128_aliases_vd128`: vs128 ≡ vd128 across {0, 31, 32, 64, 96, 127}.
- These pin decoder.rs as the canonical source. The pre-Phase-1 ppc.rs had different (wrong) positions; this test set guarantees the bug never returns silently.

### Analysis-side goldens (shim parity)

- **[crates/xenia-analysis/tests/disasm_goldens.rs](crates/xenia-analysis/tests/disasm_goldens.rs)** (~120 LOC): 4 tests that load the *same* cpu-side fixture JSON files (via `..` relative path) and verify:
  - `xenia_analysis::ppc::disasm(raw, addr).base` == `xenia_cpu::disasm::format(...).disasm`
  - `.ext` == `format().ext_disasm`
  - All structured fields (`mnemonic`/`operands`/`ext_*`/`branch_target`) match the fixture row.
  - `display()` returns extended form when present, base otherwise.
- The cpu fixtures are the single source of truth; analysis shim drift surfaces immediately.
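
As a worked check of the arithmetic that the VMX128 accessor tests listed earlier pin down, the sketch below composes a 7-bit register index from a 5-bit low field and two extension bits worth 32 and 64. The function and parameter names are hypothetical, and the sketch deliberately does not extract anything from a raw instruction word: the actual bit positions are defined in `decoder.rs`, not here.

```rust
/// Compose a 7-bit VMX128 register index from its three pieces.
///
/// `lo` is the 5-bit low field (0..=31). `ext32` and `ext64` stand in for the
/// two extension bits that the unit tests call "bit 21" (adds 32) and
/// "bit 22" (adds 64). These names are illustrative assumptions only.
fn compose_vmx128_index(lo: u32, ext32: bool, ext64: bool) -> u32 {
    (lo & 0x1F) | (u32::from(ext32) << 5) | (u32::from(ext64) << 6)
}

/// True when both extension bits are set in a composed index,
/// which is equivalent to the index lying in 96..=127.
fn both_ext_bits_set(index: u32) -> bool {
    ((index >> 5) & 0b11) == 0b11
}
```

With both extension bits set the index cannot fall below 96, which is the arithmetic behind the vrlimi128 fixture rows covering only vd128 = 96 and 127.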
### DB schema golden

- **[crates/xenia-analysis/tests/db_schema_golden.rs](crates/xenia-analysis/tests/db_schema_golden.rs)** (~230 LOC): one test that builds an in-memory 16-byte PE-shaped fixture (4 instructions: mflr / nop / blr / nop), runs the full `DbWriter` pipeline (`write_base` → `ingest_instructions` → `write_analysis_results` → `create_sql_views`), and asserts:
  - Every column name + type for all 7 tables (`metadata`, `sections`, `imports`, `instructions`, `functions`, `labels`, `xrefs`) via `PRAGMA table_info`.
  - Row counts (4 instructions, 0 with target_hex since the fixture is indirect-only).
  - All 5 SQL views (`v_branch_xrefs`, `v_call_graph`, `v_function_first_instruction`, `v_imports_called`, `v_reachability_from_entry`) exist after `create_sql_views`.
- Schema drift is caught immediately.
- **Caveat noted in comments**: SQL `LIKE 'v_%'` matches DuckDB's built-in `views` system view because `_` is a single-char wildcard. The test enumerates view names explicitly.

### Deletions

- **xenia-cpu/tests/disasm_audit.rs** (161 LOC): println-only, no assertions. Migrated to the assert-based goldens above.
- **xenia-analysis/tests/disasm_audit.rs** (164 LOC): same.

## Verification

`cargo test --workspace`: 29 test groups, 0 failures. All previously-passing tests still pass (176 cpu interpreter + 13 disasm + 7 VMX128 accessors + 4 analysis goldens + 1 schema golden + everything else).

```
$ cargo test --workspace 2>&1 | grep -c "test result: ok"
29
$ cargo test --workspace 2>&1 | grep -c "FAILED\|failed"
0
```

## Discoveries / fixture-author surprises

1. **VMX128 op6 vrlimi128 vd128 < 96 is not a valid encoding.** The secondary key uses bits 21-23 = 111, so the high two bits of vd128 (which share bits 21+22) MUST be 11 for the dispatch to land on vrlimi128. Lower-bit attempts decode as vsrw128 / vpermwi128 instead. The fixture records this exact behavior and labels it honestly, so future readers don't think these cases test what they don't.
2. **Sylpheed's real corpus only contains vrlimi128 with vd128 ∈ 96..=127** (consistent with the constraint above). The decoder has been emitting these correctly since Phase 1's silent-bug fix; the goldens now lock that behavior.
3. **`PRAGMA table_info` doesn't accept bind parameters** in DuckDB the way `WHERE` does; it uses the statement-level interpolation route. The test inlines the table name into the query string with a simple `format!`.
4. **DuckDB has a built-in `views` system view** that matches SQL `LIKE 'v_%'`: because `_` is a single-char wildcard, `views` fits as `v` + `i` (matched by `_`) + `ews` (matched by `%`). Always enumerate view names explicitly, or use `LIKE 'v\_%' ESCAPE '\'`.

## LOC delta (Phase 4)

- xenia-cpu/tests/disasm_goldens.rs: +388 (new)
- xenia-cpu/tests/golden/*.json: +28k bytes (~700 lines committed JSON across 3 files)
- xenia-cpu/src/decoder.rs: +95 (7 new VMX128 accessor unit tests)
- xenia-cpu/Cargo.toml: +4 (dev-dependencies serde+serde_json)
- xenia-cpu/tests/disasm_audit.rs: −161 (deleted)
- xenia-analysis/tests/disasm_goldens.rs: +120 (new)
- xenia-analysis/tests/db_schema_golden.rs: +245 (new)
- xenia-analysis/tests/disasm_audit.rs: −164 (deleted)
- **Net: +527 LOC (852 added as test code and manifest lines, 325 removed as useless println audits), plus ~700 lines of JSON fixtures.**

## Tooling for future authors

- **Adding new test cases**: edit the `cases: &[(u32, u32, &str)]` array inline in `tests/disasm_goldens.rs`, run `REGEN_GOLDENS=1 cargo test -p xenia-cpu --test disasm_goldens`, inspect the diff in the JSON fixture, commit.
- **Detecting drift**: any change to `format()` output that affects existing cases will fail the assertion test, naming the row label and showing the diff. Either the change is intentional (regen) or it's a regression (fix code).
- **Schema changes**: `db_schema_golden.rs` will fail if you add/remove/rename a column or change a type. Update the `expected` slice in the test.
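
The regen-or-assert flow described above can be sketched with the standard library alone. This is a minimal illustration, not the code in `disasm_goldens.rs`: `GoldenOutcome`, `check_golden`, and `regen_requested` are hypothetical names, the fixture is treated as an opaque string rather than parsed JSON, and reading `REGEN_GOLDENS` as "exactly `1`" is an assumption.

```rust
use std::{env, fs, io, path::Path};

/// Result of comparing freshly produced output against a fixture on disk.
#[derive(Debug, PartialEq)]
enum GoldenOutcome {
    /// Fixture exists and is identical to the current output.
    Matched,
    /// Fixture was written (regen requested, or the file was missing).
    Regenerated,
    /// Fixture exists but differs; both sides are kept for the failure message.
    Mismatch { expected: String, actual: String },
}

/// Assumption: regen is requested only when REGEN_GOLDENS is exactly "1".
fn regen_requested() -> bool {
    env::var("REGEN_GOLDENS").map(|v| v == "1").unwrap_or(false)
}

/// Compare `actual` with the fixture at `path`, or overwrite the fixture
/// when `regen` is set or the file does not exist yet.
fn check_golden(path: &Path, actual: &str, regen: bool) -> io::Result<GoldenOutcome> {
    if regen || !path.exists() {
        fs::write(path, actual)?;
        return Ok(GoldenOutcome::Regenerated);
    }
    let expected = fs::read_to_string(path)?;
    if expected == actual {
        Ok(GoldenOutcome::Matched)
    } else {
        Ok(GoldenOutcome::Mismatch { expected, actual: actual.to_owned() })
    }
}
```

A test built on this sketch would call `check_golden(path, &output, regen_requested())`, fail on `Mismatch`, and could also fail on a `Regenerated` result when no regen was requested, so that a freshly auto-created fixture still has to be inspected and committed.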
## End-of-phase status

All four phases of the disassembler unification are now complete:

- **Phase 1**: single-source-of-truth `format()` in xenia-cpu; analysis ppc.rs collapsed to a 30-line shim; VMX128 silent bug fixed.
- **Phase 2**: iterator + 3 sinks (text/JSON/DuckDB) layer; `--json` CLI flag.
- **Phase 3**: db.rs split into ingest/analyze; 5 additive SQL views; `--analyze=rust|sql|both` flag with cross-check warning.
- **Phase 4**: assert-based fixture goldens + VMX128 accessor unit tests + DB schema golden replacing the println-only audits.

The `DecodedInstr` struct stays at 8 bytes throughout; the decode cache stays at 1.3 MiB; Rust analysis (`func.rs`, `xref.rs`) remains the default and is unchanged. All three user constraints honored end-to-end.