chore: add migration/ bundle for cross-machine setup

Bundles state that lives OUTSIDE the xenia-rs repo so a fresh clone on
another machine can be brought up to identical configuration via
migration/setup.sh:

- claude-memory/               ~/.claude/projects/-home-fabi-RE-Project-Sylpheed/memory/ (103 files, 1.1 MB - MEMORY.md + every project_xenia_rs_*.md from audits addis_signext through audit-058)
- project-root/dot-claude/     <project-root>/.claude/settings.json (Stop hook + permissions)
- project-root/ppc-manual/     <project-root>/ppc-manual/ (PowerPC reference docs, 397 files, 3.7 MB)
- project-root/run-canary.sh   <project-root>/run-canary.sh
- README.md                    Human-readable setup checklist
- setup.sh                     Idempotent installer (also reclones xenia-canary at pinned HEAD 6de80dffe)
- MANIFEST.md                  Per-file mapping + per-file-not-bundled restoration recipe

Excluded from bundle (not shippable via git):

- Sylpheed ISO (7.8 GB; copyright; manual copy required)
- sylpheed.db (395 MB; regenerable from XEX via analysis tooling)
- target/ build artifacts (rebuild on target)
- audit-runs probe firehoses (.log/.stdout/.stderr ~11 GB; rerun if needed)
- audit-runs memory dumps (.bin ~4.5 GB; rerun audit-026/027/029 if needed)
- xenia-canary checkout (setup.sh reclones from git.mc02.dev/fabi/Xenia-Canary.git at HEAD 6de80dffe)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 21:38:38 +02:00
parent 8e709b0a24
commit e6d43a23ac
505 changed files with 86028 additions and 0 deletions
--- a/migration/project-root/ppc-manual/vmx/vaddfp.md
+++ b/migration/project-root/ppc-manual/vmx/vaddfp.md
@@ -0,0 +1,189 @@
+# `vaddfp` — Vector Add Floating Point
+
+> **Category:** [VMX (Altivec)](../categories/vmx.md) · **Form:** [VX](../forms/VX.md) · **Opcode:** `0x1000000a`
+
+<!-- GENERATED: BEGIN -->
+
+## Assembler Mnemonics
+
+| Mnemonic | XML entry | Flags | Description |
+| --- | --- | --- | --- |
+| `vaddfp` | `vaddfp` | — | Vector Add Floating Point |
+| `vaddfp128` | `vaddfp128` | — | Vector128 Add Floating Point |
+
+## Syntax
+
+```asm
+vaddfp [VD], [VA], [VB]
+vaddfp128 [VD], [VA], [VB]
+```
+
+## Encoding
+
+### `vaddfp` — form `VX`
+
+- **Opcode word:** `0x1000000a`
+- **Primary opcode (bits 0–5):** `4`
+- **Extended opcode:** `10`
+- **Synchronising:** no
+
+| Bits | Field | Meaning |
+| --- | --- | --- |
+| 0–5 | `OPCD` | primary opcode (4) |
+| 6–10 | `VRT/VD` | destination vector register |
+| 11–15 | `VRA/VA` | source A vector register |
+| 16–20 | `VRB/VB` | source B vector register |
+| 21–31 | `XO` | extended opcode (11 bits) |
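
The field table above can be turned into a small decoder. The following is a minimal sketch, not xenia-rs code: `VxFields`, `ppc_bits`, and `decode_vx` are hypothetical names introduced for illustration. It assumes PPC bit numbering, where bit 0 is the most significant bit of the 32-bit word.

```rust
/// Fields of a VX-form instruction word (hypothetical illustration type).
#[derive(Debug, PartialEq, Eq)]
struct VxFields {
    opcd: u32, // PPC bits 0-5: primary opcode
    vd: u32,   // PPC bits 6-10: destination vector register
    va: u32,   // PPC bits 11-15: source A vector register
    vb: u32,   // PPC bits 16-20: source B vector register
    xo: u32,   // PPC bits 21-31: extended opcode
}

/// Extract PPC bits `first..=last`, where bit 0 is the most significant bit.
fn ppc_bits(word: u32, first: u32, last: u32) -> u32 {
    let width = last - first + 1;
    (word >> (31 - last)) & ((1u32 << width) - 1)
}

/// Split a VX-form word into the fields listed in the table.
fn decode_vx(word: u32) -> VxFields {
    VxFields {
        opcd: ppc_bits(word, 0, 5),
        vd: ppc_bits(word, 6, 10),
        va: ppc_bits(word, 11, 15),
        vb: ppc_bits(word, 16, 20),
        xo: ppc_bits(word, 21, 31),
    }
}
```

For `vaddfp v3, v3, v4`, the word would be the opcode word `0x1000000a` with the register numbers OR-ed into their fields, and `decode_vx` should recover primary opcode 4 and extended opcode 10.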
+
+### `vaddfp128` — form `VX128`
+
+- **Opcode word:** `0x14000010`
+- **Primary opcode (bits 0–5):** `5`
+- **Extended opcode:** `16`
+- **Synchronising:** no
+
+| Bits | Field | Meaning |
+| --- | --- | --- |
+| 0–5 | `OPCD` | primary opcode (4 or 5) |
+| 6–10 | `VD128l` | destination low 5 bits |
+| 11–15 | `VA128l` | source A low 5 bits |
+| 16–20 | `VB128l` | source B low 5 bits |
+| 21 | `VA128H` | source A high bit |
+| 22 | `—` | reserved |
+| 23–25 | `VC` | optional VC / XO sub-field |
+| 26 | `VA128h` | source A middle bit |
+| 27 | `—` | reserved |
+| 28–29 | `VD128h` | destination high 2 bits |
+| 30–31 | `VB128h` | source B high 2 bits |
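
The split register fields above can be reassembled into 7-bit register numbers roughly as follows. This is a sketch under assumptions: the composition order (low 5 bits, then the "middle" bit, then the "high" bit for source A; low 5 bits plus 2 high bits for the destination and source B) is inferred from the labels in the table, and `ppc_bits` and `decode_vx128_regs` are hypothetical names, not xenia-rs functions.

```rust
/// Extract PPC bits `first..=last`, where bit 0 is the most significant bit.
fn ppc_bits(word: u32, first: u32, last: u32) -> u32 {
    let width = last - first + 1;
    (word >> (31 - last)) & ((1u32 << width) - 1)
}

/// Reassemble the (VD, VA, VB) register numbers of a VX128-form word.
/// Assumed layout, taken from the field table:
///   VD = VD128l | VD128h << 5
///   VA = VA128l | VA128h << 5 | VA128H << 6
///   VB = VB128l | VB128h << 5
fn decode_vx128_regs(word: u32) -> (u32, u32, u32) {
    let vd = ppc_bits(word, 6, 10) | (ppc_bits(word, 28, 29) << 5);
    let va = ppc_bits(word, 11, 15)
        | (ppc_bits(word, 26, 26) << 5) // VA128h: middle bit
        | (ppc_bits(word, 21, 21) << 6); // VA128H: high bit
    let vb = ppc_bits(word, 16, 20) | (ppc_bits(word, 30, 31) << 5);
    (vd, va, vb)
}
```

With this layout, each operand can name any of 128 registers, which is the only difference from the VX form that the table describes.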
+
+## Operands
+
+| Field | Role | Description |
+| --- | --- | --- |
+| `VA` | vaddfp: read; vaddfp128: read | Source A vector register. |
+| `VB` | vaddfp: read; vaddfp128: read | Source B vector register. |
+| `VD` | vaddfp: write; vaddfp128: write | Destination vector register. |
+
+## Register Effects
+
+### `vaddfp`
+
+- **Reads (always):** `VA`, `VB`
+- **Reads (conditional):** _none_
+- **Writes (always):** `VD`
+- **Writes (conditional):** _none_
+
+### `vaddfp128`
+
+- **Reads (always):** `VA`, `VB`
+- **Reads (conditional):** _none_
+- **Writes (always):** `VD`
+- **Writes (conditional):** _none_
+
+## Status-Register Effects
+
+_No condition-register or status-register effects._
+
+## Operation (pseudocode)
+
+```
+for each 32-bit float lane i in 0..3:
+    VD[i] <- VA[i] + VB[i]
+```
+
+## C Translation Example
+
+```c
+/* vaddfp VD, VA, VB — lane-wise float add                         */
+for (int i = 0; i < 4; ++i) v[insn.VD].f[i] = v[insn.VA].f[i] + v[insn.VB].f[i];
+```
+
+## Implementation References
+
+**`vaddfp`**
+- xenia-canary XML: [`tools/ppc-instructions.xml` — search for `mnem="vaddfp"`](../../xenia-canary/tools/ppc-instructions.xml)
+- xenia-canary emit: [`src/xenia/cpu/ppc/ppc_emit_altivec.cc:341`](../../xenia-canary/src/xenia/cpu/ppc/ppc_emit_altivec.cc#L341)
+- xenia-rs opcode: [`crates/xenia-cpu/src/opcode.rs:89`](../../xenia-rs/crates/xenia-cpu/src/opcode.rs#L89)
+- xenia-rs decoder: [`crates/xenia-cpu/src/decoder.rs:438`](../../xenia-rs/crates/xenia-cpu/src/decoder.rs#L438)
+- xenia-rs interpreter: [`crates/xenia-cpu/src/interpreter.rs:1984-1998`](../../xenia-rs/crates/xenia-cpu/src/interpreter.rs#L1984-L1998)
+<details><summary>xenia-rs interpreter body (frozen snapshot)</summary>
+
+```rust
+        PpcOpcode::vaddfp => {
+            // PPCBUG-435: VSCR.NJ=1 (Xbox 360 always boots with this set) requires
+            // flush-to-zero on subnormal inputs and outputs. Canary VMX float
+            // arithmetic flushes denormals unconditionally.
+            let a = ctx.vr[instr.ra()].as_f32x4();
+            let b = ctx.vr[instr.rb()].as_f32x4();
+            let mut r = [0f32; 4];
+            for i in 0..4 {
+                let ai = vmx::flush_denorm(a[i]);
+                let bi = vmx::flush_denorm(b[i]);
+                r[i] = vmx::flush_denorm(ai + bi);
+            }
+            ctx.vr[instr.rd()] = xenia_types::Vec128::from_f32x4_array(r);
+            ctx.pc += 4;
+        }
+```
+</details>
+
+**`vaddfp128`**
+- xenia-canary XML: [`tools/ppc-instructions.xml` — search for `mnem="vaddfp128"`](../../xenia-canary/tools/ppc-instructions.xml)
+- xenia-canary emit: [`src/xenia/cpu/ppc/ppc_emit_altivec.cc:344`](../../xenia-canary/src/xenia/cpu/ppc/ppc_emit_altivec.cc#L344)
+- xenia-rs opcode: [`crates/xenia-cpu/src/opcode.rs:89`](../../xenia-rs/crates/xenia-cpu/src/opcode.rs#L89)
+- xenia-rs decoder: [`crates/xenia-cpu/src/decoder.rs:610`](../../xenia-rs/crates/xenia-cpu/src/decoder.rs#L610)
+- xenia-rs interpreter: [`crates/xenia-cpu/src/interpreter.rs:1999-2011`](../../xenia-rs/crates/xenia-cpu/src/interpreter.rs#L1999-L2011)
+<details><summary>xenia-rs interpreter body (frozen snapshot)</summary>
+
+```rust
+        PpcOpcode::vaddfp128 => {
+            // PPCBUG-435: same as vaddfp.
+            let a = ctx.vr[instr.va128()].as_f32x4();
+            let b = ctx.vr[instr.vb128()].as_f32x4();
+            let mut r = [0f32; 4];
+            for i in 0..4 {
+                let ai = vmx::flush_denorm(a[i]);
+                let bi = vmx::flush_denorm(b[i]);
+                r[i] = vmx::flush_denorm(ai + bi);
+            }
+            ctx.vr[instr.vd128()] = xenia_types::Vec128::from_f32x4_array(r);
+            ctx.pc += 4;
+        }
+```
+</details>
+
+<!-- GENERATED: END -->
+
+## Extended Pseudocode
+
+```
+; Four independent lane-wise IEEE-754 single-precision adds
+for i in 0..3:
+    VD[i] <- VA[i] + VB[i]                       ; binary32, rounded to nearest
+
+; With VSCR[NJ] = 1, denormal inputs and denormal results are flushed to zero
+; No FPSCR update (VMX uses VSCR, which only has NJ / SAT; vaddfp doesn't saturate)
+```
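
The pseudocode above, combined with the `VSCR[NJ]` flushing that the interpreter snapshot applies, can be sketched as a standalone function. This is an illustration rather than the xenia-rs implementation: the body of `vmx::flush_denorm` is not shown on this page, so the `flush_denorm` below is a hypothetical stand-in that assumes subnormals flush to a zero of the same sign, and the `nj` parameter is an assumption used to model `VSCR[NJ]`.

```rust
/// Flush a subnormal value to a zero of the same sign.
/// Hypothetical stand-in for `vmx::flush_denorm`; sign preservation is assumed.
fn flush_denorm(x: f32) -> f32 {
    if x.is_subnormal() {
        0.0f32.copysign(x)
    } else {
        x
    }
}

/// Lane-wise single-precision add of two 4-lane vectors.
/// `nj` models VSCR[NJ]: when true, inputs and results are flushed.
fn vaddfp_lanes(a: [f32; 4], b: [f32; 4], nj: bool) -> [f32; 4] {
    let mut r = [0.0f32; 4];
    for i in 0..4 {
        r[i] = if nj {
            flush_denorm(flush_denorm(a[i]) + flush_denorm(b[i]))
        } else {
            a[i] + b[i]
        };
    }
    r
}
```

Flushing both before and after the add matters: two normal inputs of opposite sign can still produce a subnormal sum, which the output flush turns into zero.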
+
+## Special Cases & Edge Conditions
+
+- **Lane indexing is big-endian.** Lane 0 is the **most significant** 4 bytes of the 128-bit register (the lane that a `stvx` stores at the lowest byte offset). Xenia's `Vec128::as_f32x4()` already reads lanes in PPC order on x86-64. When writing C that manipulates individual lanes, treat `v.f[0]` as bytes 0..3 of the big-endian layout.
+- **Flush-denormals ("NJ") mode.** Altivec is independent of FPSCR: it has its own VSCR, in which only two bits are defined (`NJ` for non-Java mode and `SAT` for sticky saturation). VMX float operations honour `VSCR[NJ]`: when set (the Xenon boot default), denormal inputs and outputs are flushed to zero. The scalar FPU does not consult this bit; it has its own non-IEEE bit in FPSCR. Xenia sets `NJ = 1` at context creation ([`context.rs`](../../xenia-rs/crates/xenia-cpu/src/context.rs)).
+- **No exception, no trap.** Altivec floats never raise exceptions. NaN inputs produce NaN outputs; `(+∞) + (−∞)` yields a NaN; there is no VXISI-style status bit. `VSCR[SAT]` is **not** touched by `vaddfp` (it is a sticky flag set by saturating integer operations, not by float arithmetic).
+- **Four independent lanes.** Each lane's operation is unaffected by the others. Aliasing between `VA`, `VB`, and `VD` is legal and common (`vaddfp v3, v3, v4`).
+- **VMX128 sibling (`vaddfp128`).** Semantics identical; only the register encoding differs. VMX128 uses a 7-bit operand ID per source (and destination) built from two or three non-contiguous bit fields; see [`categories/vmx128.md`](../categories/vmx128.md). Any operand combination encodable in the 32-register VX form is also encodable in the VMX128 form. Both encodings are 32 bits wide, so compilers picked whichever form reached the needed register range.
+- **On x86-64 hosts.** A natural compilation uses `_mm_add_ps` or AVX `vaddps`. Because the add is lane-wise, the result is correct under any lane mapping that is applied consistently to both sources and the destination. PPC lane 0 maps to x86 lane 3 only if you treat the 128-bit value as "big-endian in memory", i.e. reverse all 16 bytes on load/store. With xenia's `_be` memory helpers, `_mm_add_ps` gives the right per-lane result.
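
The lane-ordering and NaN cases above can be checked with a short host-side sketch. It is not xenia-rs code: `lanes_from_be_bytes`, `lanes_to_be_bytes`, and `add_lanes` are hypothetical helpers that model guest memory as a 16-byte big-endian buffer, with lane 0 taken from bytes 0..3. Denormal flushing is left out to keep the example focused on lane order.

```rust
/// Read four f32 lanes from 16 bytes in PPC (big-endian) memory order.
/// Lane 0 comes from bytes 0..4, the lowest addresses.
fn lanes_from_be_bytes(bytes: [u8; 16]) -> [f32; 4] {
    let mut lanes = [0.0f32; 4];
    for i in 0..4 {
        let mut w = [0u8; 4];
        w.copy_from_slice(&bytes[i * 4..i * 4 + 4]);
        lanes[i] = f32::from_be_bytes(w);
    }
    lanes
}

/// Write four f32 lanes back in PPC (big-endian) memory order.
fn lanes_to_be_bytes(lanes: [f32; 4]) -> [u8; 16] {
    let mut bytes = [0u8; 16];
    for i in 0..4 {
        bytes[i * 4..i * 4 + 4].copy_from_slice(&lanes[i].to_be_bytes());
    }
    bytes
}

/// Plain lane-wise add; each lane is independent of the others.
fn add_lanes(a: [f32; 4], b: [f32; 4]) -> [f32; 4] {
    [a[0] + b[0], a[1] + b[1], a[2] + b[2], a[3] + b[3]]
}
```

Because each lane is computed independently, adding `+∞` and `−∞` in one lane produces a NaN in that lane only and leaves the other three results unaffected.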
+
+## Related Instructions
+
+- [`vsubfp`](vsubfp.md) — lane-wise float subtract.
+- [`vmaddfp`](vmaddfp.md) — lane-wise `(VA × VC) + VB` (fused multiply-add with single rounding).
+- [`vnmsubfp`](vnmsubfp.md) — `−((VA × VC) − VB)`.
+- [`vmaxfp`](vmaxfp.md), [`vminfp`](vminfp.md) — IEEE-754-aware max/min (NaN propagation).
+- [`vcmpeqfp`](vcmpeqfp.md), [`vcmpgtfp`](vcmpgtfp.md), [`vcmpgefp`](vcmpgefp.md), [`vcmpbfp`](vcmpbfp.md) — compares producing per-lane all-ones / all-zero masks.
+- [`vrfin`](vrfin.md), [`vrfim`](vrfim.md), [`vrfip`](vrfip.md), [`vrfiz`](vrfiz.md) — round to integer (to-nearest / down / up / toward-zero).
+- [`vmulfp`](vmulfp.md) — xenia's helper; not a native Altivec op, included for convenience. Hardware games use `vmaddfp v, va, vc, v0_zero` instead.
+
+## IBM Reference
+
+- [AIX 7.3 — `vaddfp` (Vector Add Floating Point)](https://www.ibm.com/docs/en/aix/7.3.0?topic=set-vaddfp-vector-add-floating-point-instruction)
+- [AltiVec Technology Programming Interface Manual (Motorola/Freescale, hosted by NXP), Chapter 5: Floating-Point Arithmetic](https://www.nxp.com/docs/en/reference-manual/ALTIVECPIM.pdf)