chore: add migration/ bundle for cross-machine setup

Bundles state that lives OUTSIDE the xenia-rs repo so a fresh clone on
another machine can be brought up to identical configuration via
migration/setup.sh:

  - claude-memory/             ~/.claude/projects/-home-fabi-RE-Project-Sylpheed/memory/
                               (103 files, 1.1 MB - MEMORY.md + every
                                project_xenia_rs_*.md from audits
                                addis_signext through audit-058)
  - project-root/dot-claude/   <project-root>/.claude/settings.json
                               (Stop hook + permissions)
  - project-root/ppc-manual/   <project-root>/ppc-manual/
                               (PowerPC reference docs, 397 files, 3.7 MB)
  - project-root/run-canary.sh <project-root>/run-canary.sh
  - README.md                  Human-readable setup checklist
  - setup.sh                   Idempotent installer (also reclones
                               xenia-canary at pinned HEAD 6de80dffe)
  - MANIFEST.md                Per-file mapping + per-file-not-bundled
                               restoration recipe

Excluded from bundle (not shippable via git):
  - Sylpheed ISO (7.8 GB; copyright; manual copy required)
  - sylpheed.db (395 MB; regenerable from XEX via analysis tooling)
  - target/ build artifacts (rebuild on target)
  - audit-runs probe firehoses (.log/.stdout/.stderr ~11 GB; rerun if needed)
  - audit-runs memory dumps (.bin ~4.5 GB; rerun audit-026/027/029 if needed)
  - xenia-canary checkout (setup.sh reclones from
    git.mc02.dev/fabi/Xenia-Canary.git at HEAD 6de80dffe)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
MechaCat02
2026-05-10 21:38:38 +02:00
parent 8e709b0a24
commit e6d43a23ac
505 changed files with 86028 additions and 0 deletions


@@ -0,0 +1,189 @@
# `vaddfp` — Vector Add Floating Point
> **Category:** [VMX (Altivec)](../categories/vmx.md) · **Form:** [VX](../forms/VX.md) · **Opcode:** `0x1000000a`
<!-- GENERATED: BEGIN -->
## Assembler Mnemonics
| Mnemonic | XML entry | Flags | Description |
| --- | --- | --- | --- |
| `vaddfp` | `vaddfp` | — | Vector Add Floating Point |
| `vaddfp128` | `vaddfp128` | — | Vector128 Add Floating Point |
## Syntax
```asm
vaddfp [VD], [VA], [VB]
vaddfp128 [VD], [VA], [VB]
```
## Encoding
### `vaddfp` — form `VX`
- **Opcode word:** `0x1000000a`
- **Primary opcode (bits 0–5):** `4`
- **Extended opcode:** `10`
- **Synchronising:** no
| Bits | Field | Meaning |
| --- | --- | --- |
| 0–5 | `OPCD` | primary opcode (4) |
| 6–10 | `VRT/VD` | destination vector register |
| 11–15 | `VRA/VA` | source A vector register |
| 16–20 | `VRB/VB` | source B vector register |
| 21–31 | `XO` | extended opcode (11 bits) |
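
As a rough illustration of the field table above, the sketch below splits a VX-form word into its fields with shifts and masks. The names `VxFields`, `decode_vx`, and `is_vaddfp` are hypothetical and are not taken from xenia-rs. PowerPC numbers bits from the most significant end, so bit *n* in the table sits at shift `31 - n` in the 32-bit word.

```rust
/// Hypothetical holder for the VX-form fields listed in the table above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct VxFields {
    pub opcd: u32, // bits 0-5: primary opcode
    pub vd: u32,   // bits 6-10: destination register
    pub va: u32,   // bits 11-15: source A register
    pub vb: u32,   // bits 16-20: source B register
    pub xo: u32,   // bits 21-31: extended opcode
}

/// Splits a 32-bit instruction word into VX-form fields.
pub fn decode_vx(word: u32) -> VxFields {
    VxFields {
        opcd: word >> 26,
        vd: (word >> 21) & 0x1f,
        va: (word >> 16) & 0x1f,
        vb: (word >> 11) & 0x1f,
        xo: word & 0x7ff,
    }
}

/// True when the word carries primary opcode 4 and extended opcode 10.
pub fn is_vaddfp(word: u32) -> bool {
    let f = decode_vx(word);
    f.opcd == 4 && f.xo == 10
}
```

Under this layout, `vaddfp v3, v3, v4` would be the word `0x1063200a`: the base opcode word `0x1000000a` with the three register numbers OR-ed into their fields.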
### `vaddfp128` — form `VX128`
- **Opcode word:** `0x14000010`
- **Primary opcode (bits 0–5):** `5`
- **Extended opcode:** `16`
- **Synchronising:** no
| Bits | Field | Meaning |
| --- | --- | --- |
| 0–5 | `OPCD` | primary opcode (4 or 5) |
| 6–10 | `VD128l` | destination low 5 bits |
| 11–15 | `VA128l` | source A low 5 bits |
| 16–20 | `VB128l` | source B low 5 bits |
| 21 | `VA128H` | source A high bit |
| 22 | `—` | reserved |
| 23–25 | `VC` | optional VC / XO sub-field |
| 26 | `VA128h` | source A middle bit |
| 27 | `—` | reserved |
| 28–29 | `VD128h` | destination high 2 bits |
| 30–31 | `VB128h` | source B high 2 bits |
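
The split register fields in the table above can be reassembled roughly as follows. This is a minimal sketch, not the xenia-rs decoder: `decode_vx128_regs` is a hypothetical name, and the way the pieces are combined (low 5 bits, then the "middle" bit at position 5, then the "high" bit at position 6) is an assumption read off the low/middle/high wording of the table.

```rust
/// Hypothetical decoder for the three 7-bit register numbers of a VX128 word.
/// Returns `(vd, va, vb)`. PowerPC bit n (MSB-first) sits at shift `31 - n`.
pub fn decode_vx128_regs(word: u32) -> (u32, u32, u32) {
    let vd_lo = (word >> 21) & 0x1f; // bits 6-10:  VD128l
    let va_lo = (word >> 16) & 0x1f; // bits 11-15: VA128l
    let vb_lo = (word >> 11) & 0x1f; // bits 16-20: VB128l
    let va_high = (word >> 10) & 0x1; // bit 21:    VA128H
    let va_mid = (word >> 5) & 0x1; // bit 26:      VA128h
    let vd_hi = (word >> 2) & 0x3; // bits 28-29:   VD128h
    let vb_hi = word & 0x3; // bits 30-31:          VB128h

    let vd = vd_lo | (vd_hi << 5);
    let va = va_lo | (va_mid << 5) | (va_high << 6);
    let vb = vb_lo | (vb_hi << 5);
    (vd, va, vb)
}
```

With all the high pieces clear, the function returns the same register numbers a VX-form decode would; the high pieces only come into play for registers above `v31`.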
## Operands
| Field | Role | Description |
| --- | --- | --- |
| `VA` | vaddfp: read; vaddfp128: read | Source A vector register. |
| `VB` | vaddfp: read; vaddfp128: read | Source B vector register. |
| `VD` | vaddfp: write; vaddfp128: write | Destination vector register. |
## Register Effects
### `vaddfp`
- **Reads (always):** `VA`, `VB`
- **Reads (conditional):** _none_
- **Writes (always):** `VD`
- **Writes (conditional):** _none_
### `vaddfp128`
- **Reads (always):** `VA`, `VB`
- **Reads (conditional):** _none_
- **Writes (always):** `VD`
- **Writes (conditional):** _none_
## Status-Register Effects
_No condition-register or status-register effects._
## Operation (pseudocode)
```
for each 32-bit float lane i in 0..3:
VD[i] <- VA[i] + VB[i]
```
## C Translation Example
```c
/* vaddfp VD, VA, VB — lane-wise float add */
for (int i = 0; i < 4; ++i) v[insn.VD].f[i] = v[insn.VA].f[i] + v[insn.VB].f[i];
```
## Implementation References
**`vaddfp`**
- xenia-canary XML: [`tools/ppc-instructions.xml` — search for `mnem="vaddfp"`](../../xenia-canary/tools/ppc-instructions.xml)
- xenia-canary emit: [`src/xenia/cpu/ppc/ppc_emit_altivec.cc:341`](../../xenia-canary/src/xenia/cpu/ppc/ppc_emit_altivec.cc#L341)
- xenia-rs opcode: [`crates/xenia-cpu/src/opcode.rs:89`](../../xenia-rs/crates/xenia-cpu/src/opcode.rs#L89)
- xenia-rs decoder: [`crates/xenia-cpu/src/decoder.rs:438`](../../xenia-rs/crates/xenia-cpu/src/decoder.rs#L438)
- xenia-rs interpreter: [`crates/xenia-cpu/src/interpreter.rs:1984-1998`](../../xenia-rs/crates/xenia-cpu/src/interpreter.rs#L1984-L1998)
<details><summary>xenia-rs interpreter body (frozen snapshot)</summary>
```rust
PpcOpcode::vaddfp => {
// PPCBUG-435: VSCR.NJ=1 (Xbox 360 always boots with this set) requires
// flush-to-zero on subnormal inputs and outputs. Canary VMX float
// arithmetic flushes denormals unconditionally.
let a = ctx.vr[instr.ra()].as_f32x4();
let b = ctx.vr[instr.rb()].as_f32x4();
let mut r = [0f32; 4];
for i in 0..4 {
let ai = vmx::flush_denorm(a[i]);
let bi = vmx::flush_denorm(b[i]);
r[i] = vmx::flush_denorm(ai + bi);
}
ctx.vr[instr.rd()] = xenia_types::Vec128::from_f32x4_array(r);
ctx.pc += 4;
}
```
</details>
**`vaddfp128`**
- xenia-canary XML: [`tools/ppc-instructions.xml` — search for `mnem="vaddfp128"`](../../xenia-canary/tools/ppc-instructions.xml)
- xenia-canary emit: [`src/xenia/cpu/ppc/ppc_emit_altivec.cc:344`](../../xenia-canary/src/xenia/cpu/ppc/ppc_emit_altivec.cc#L344)
- xenia-rs opcode: [`crates/xenia-cpu/src/opcode.rs:89`](../../xenia-rs/crates/xenia-cpu/src/opcode.rs#L89)
- xenia-rs decoder: [`crates/xenia-cpu/src/decoder.rs:610`](../../xenia-rs/crates/xenia-cpu/src/decoder.rs#L610)
- xenia-rs interpreter: [`crates/xenia-cpu/src/interpreter.rs:1999-2011`](../../xenia-rs/crates/xenia-cpu/src/interpreter.rs#L1999-L2011)
<details><summary>xenia-rs interpreter body (frozen snapshot)</summary>
```rust
PpcOpcode::vaddfp128 => {
// PPCBUG-435: same as vaddfp.
let a = ctx.vr[instr.va128()].as_f32x4();
let b = ctx.vr[instr.vb128()].as_f32x4();
let mut r = [0f32; 4];
for i in 0..4 {
let ai = vmx::flush_denorm(a[i]);
let bi = vmx::flush_denorm(b[i]);
r[i] = vmx::flush_denorm(ai + bi);
}
ctx.vr[instr.vd128()] = xenia_types::Vec128::from_f32x4_array(r);
ctx.pc += 4;
}
```
</details>
<!-- GENERATED: END -->
## Extended Pseudocode
```
; Four independent lane-wise IEEE-754 single-precision adds
for i in 0..3:
VD[i] <- VA[i] + VB[i] ; binary32, rounded to nearest
; No FPSCR update (VMX uses VSCR, which only has NJ / SAT — and vaddfp doesn't saturate)
```
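
The pseudocode above can be sketched in Rust as a plain lane-wise add. `vaddfp_lanes` is a hypothetical name used only for illustration; the sketch relies on Rust's `f32` addition being IEEE-754 binary32 with round-to-nearest-even, and it deliberately omits the denormal flushing that the interpreter snapshot applies.

```rust
/// Lane-wise binary32 add mirroring the pseudocode: VD[i] <- VA[i] + VB[i].
/// Illustrative only; no denormal flushing is applied here.
pub fn vaddfp_lanes(va: [f32; 4], vb: [f32; 4]) -> [f32; 4] {
    // Each lane is computed independently of the other three.
    std::array::from_fn(|i| va[i] + vb[i])
}
```

Because each lane depends only on the matching lanes of the inputs, passing the same array for both sources (the analogue of aliasing `VA` and `VB`) simply doubles every lane.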
## Special Cases & Edge Conditions
- **Lane indexing is big-endian.** Lane 0 is the **most significant** 4 bytes of the 128-bit register (the lane that appears at the lowest byte offset after a `stvx`). Xenia's `Vec128::as_f32x4()` already reads lanes in PPC order on x86-64. When writing C that manipulates individual lanes, treat `v.f[0]` as bytes 0..3 of the big-endian layout.
- **Flush-denormals ("NJ") mode.** Altivec is independent of FPSCR: it has its own VSCR, in which only two bits are defined (`NJ` for non-Java mode and `SAT` for sticky saturation). VMX float operations honour `VSCR[NJ]`: when set (the Xenon boot default), denormal inputs and outputs are flushed to zero. This is **opposite** to the scalar FPU, which is controlled by its own non-IEEE bit in FPSCR rather than by `VSCR`. Xenia sets `NJ = 1` at context creation ([`context.rs`](../../xenia-rs/crates/xenia-cpu/src/context.rs)).
- **No exception, no trap.** Altivec floats never raise exceptions. NaN inputs produce NaN outputs; `(+∞) + (−∞)` yields a NaN; there is no VXISI-style status bit. `VSCR[SAT]` is **not** touched by `vaddfp`: it is the sticky flag set by saturating operations, and float adds do not saturate.
- **Four independent lanes.** Each lane's operation is unaffected by the others. Aliasing between `VA`, `VB`, and `VD` is legal and common (`vaddfp v3, v3, v4`).
- **VMX128 sibling (`vaddfp128`).** Semantics identical; only the register encoding differs. VMX128 uses a 7-bit operand ID per source (and destination) built from two or three non-contiguous bit fields; see [`categories/vmx128.md`](../categories/vmx128.md). Both forms are 32-bit instruction words, so neither is shorter. Any register combination encodable in the 32-register VX form is also encodable in a VMX128 form, so compilers picked whichever form reached the needed register range.
- **On x86-64 hosts.** A natural compilation uses `_mm_add_ps` or AVX `vaddps`. Because the add is lane-wise, the result does not depend on how PPC lanes are numbered on the host, provided loads and stores agree on one convention. PPC lane 0 maps to x86 lane 3 only if you treat the 128-bit value as "big-endian in memory", i.e. byte-swap the whole register on load/store. With xenia's `_be` memory helpers, `_mm_add_ps` gives the right per-lane result.
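
The lane-order, flush-to-zero, and NaN cases in the list above can be exercised with a small sketch. All three function names are hypothetical. In particular `flush_denorm_sketch` is only a stand-in and is not the `vmx::flush_denorm` helper from the snapshot, whose exact behaviour is not shown here. The sketch assumes that a flushed value becomes a zero with the sign of the original.

```rust
/// Reads four f32 lanes from 16 bytes in PPC (big-endian) memory order.
/// Lane 0 comes from bytes 0..3, i.e. the lowest byte offset.
pub fn lanes_from_be_bytes(bytes: [u8; 16]) -> [f32; 4] {
    std::array::from_fn(|i| {
        f32::from_be_bytes([
            bytes[4 * i],
            bytes[4 * i + 1],
            bytes[4 * i + 2],
            bytes[4 * i + 3],
        ])
    })
}

/// Hypothetical flush-to-zero helper. Assumption: subnormals are replaced
/// by a zero that keeps the sign of the original value.
pub fn flush_denorm_sketch(x: f32) -> f32 {
    if x.is_subnormal() {
        0.0f32.copysign(x)
    } else {
        x
    }
}

/// Lane-wise add with NJ-style flushing of both inputs and of the result.
pub fn vaddfp_nj(va: [f32; 4], vb: [f32; 4]) -> [f32; 4] {
    std::array::from_fn(|i| {
        let a = flush_denorm_sketch(va[i]);
        let b = flush_denorm_sketch(vb[i]);
        flush_denorm_sketch(a + b)
    })
}
```

No status is reported for the `(+∞) + (−∞)` lane; the NaN is just written to the destination lane like any other result.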
## Related Instructions
- [`vsubfp`](vsubfp.md) — lane-wise float subtract.
- [`vmaddfp`](vmaddfp.md) — lane-wise `(VA × VC) + VB` (fused multiply-add with single rounding).
- [`vnmsubfp`](vnmsubfp.md) — `-((VA × VC) - VB)`.
- [`vmaxfp`](vmaxfp.md), [`vminfp`](vminfp.md) — IEEE-754-aware max/min (NaN propagation).
- [`vcmpeqfp`](vcmpeqfp.md), [`vcmpgtfp`](vcmpgtfp.md), [`vcmpgefp`](vcmpgefp.md), [`vcmpbfp`](vcmpbfp.md) — compares producing per-lane all-ones / all-zero masks.
- [`vrfin`](vrfin.md), [`vrfim`](vrfim.md), [`vrfip`](vrfip.md), [`vrfiz`](vrfiz.md) — round to integer (to-nearest / down / up / toward-zero).
- [`vmulfp`](vmulfp.md) — xenia's helper; not a native Altivec op, included for convenience. Games running on real hardware use `vmaddfp v, va, vc, v0_zero` (a multiply-add with a zero addend) instead.
## IBM Reference
- [AIX 7.3 — `vaddfp` (Vector Add Floating Point)](https://www.ibm.com/docs/en/aix/7.3.0?topic=set-vaddfp-vector-add-floating-point-instruction)
- [IBM AltiVec Technology Programmer's Interface Manual, Chapter 5 — Floating-Point Arithmetic](https://www.nxp.com/docs/en/reference-manual/ALTIVECPIM.pdf)