xenia-rs/migration/project-root/ppc-manual/vmx/vaddfp.md
MechaCat02 e6d43a23ac chore: add migration/ bundle for cross-machine setup
2026-05-10 21:38:38 +02:00

# `vaddfp` — Vector Add Floating Point
> **Category:** [VMX (Altivec)](../categories/vmx.md) · **Form:** [VX](../forms/VX.md) · **Opcode:** `0x1000000a`
<!-- GENERATED: BEGIN -->
## Assembler Mnemonics
| Mnemonic | XML entry | Flags | Description |
| --- | --- | --- | --- |
| `vaddfp` | `vaddfp` | — | Vector Add Floating Point |
| `vaddfp128` | `vaddfp128` | — | Vector128 Add Floating Point |
## Syntax
```asm
vaddfp [VD], [VA], [VB]
vaddfp128 [VD], [VA], [VB]
```
## Encoding
### `vaddfp` — form `VX`
- **Opcode word:** `0x1000000a`
- **Primary opcode (bits 0-5):** `4`
- **Extended opcode:** `10`
- **Synchronising:** no
| Bits | Field | Meaning |
| --- | --- | --- |
| 0-5 | `OPCD` | primary opcode (4) |
| 6-10 | `VRT/VD` | destination vector register |
| 11-15 | `VRA/VA` | source A vector register |
| 16-20 | `VRB/VB` | source B vector register |
| 21-31 | `XO` | extended opcode (11 bits) |
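
As a rough illustration of the field layout above, the sketch below pulls the VX fields out of a 32-bit instruction word. The `VxFields` struct and the `decode_vx` / `is_vaddfp` helpers are hypothetical names used only for this example and are not part of xenia-rs. The shifts follow from PowerPC's MSB-first bit numbering, where bits 6-10 sit at shift `31 - 10 = 21`.

```rust
/// Fields of a VX-form instruction word (illustrative type, not from xenia-rs).
#[derive(Debug, PartialEq, Eq)]
struct VxFields {
    opcd: u32, // bits 0-5
    vd: u32,   // bits 6-10
    va: u32,   // bits 11-15
    vb: u32,   // bits 16-20
    xo: u32,   // bits 21-31
}

/// Splits a 32-bit word into VX fields. PowerPC bit 0 is the most
/// significant bit, so a field ending at bit `e` is shifted right by `31 - e`.
fn decode_vx(word: u32) -> VxFields {
    VxFields {
        opcd: (word >> 26) & 0x3F,
        vd: (word >> 21) & 0x1F,
        va: (word >> 16) & 0x1F,
        vb: (word >> 11) & 0x1F,
        xo: word & 0x7FF,
    }
}

/// True when the word has primary opcode 4 and extended opcode 10.
fn is_vaddfp(word: u32) -> bool {
    let f = decode_vx(word);
    f.opcd == 4 && f.xo == 10
}
```

Under these assumptions, `decode_vx(0x1063200a)` describes `vaddfp v3, v3, v4`: the base opcode word `0x1000000a` with `VD = 3`, `VA = 3`, `VB = 4` filled in.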
### `vaddfp128` — form `VX128`
- **Opcode word:** `0x14000010`
- **Primary opcode (bits 0-5):** `5`
- **Extended opcode:** `16`
- **Synchronising:** no
| Bits | Field | Meaning |
| --- | --- | --- |
| 0-5 | `OPCD` | primary opcode (4 or 5) |
| 6-10 | `VD128l` | destination low 5 bits |
| 11-15 | `VA128l` | source A low 5 bits |
| 16-20 | `VB128l` | source B low 5 bits |
| 21 | `VA128H` | source A high bit |
| 22 | `—` | reserved |
| 23-25 | `VC` | optional VC / XO sub-field |
| 26 | `VA128h` | source A middle bit |
| 27 | `—` | reserved |
| 28-29 | `VD128h` | destination high 2 bits |
| 30-31 | `VB128h` | source B high 2 bits |
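
The split register fields in the table above can be reassembled into 7-bit register numbers along the following lines. This is a sketch only: `Vx128Regs` and `decode_vx128_regs` are hypothetical names, and the way the pieces are combined (low 5 bits, then the "middle" bit at position 5, then the "high" bit at position 6) is an assumption read off the table's low/middle/high descriptions rather than taken from the xenia-rs decoder.

```rust
/// Reassembled VMX128 register numbers, each in 0..=127 (illustrative type).
#[derive(Debug, PartialEq, Eq)]
struct Vx128Regs {
    vd: u32,
    va: u32,
    vb: u32,
}

/// Gathers the scattered VX128 register fields into full register numbers.
/// PowerPC bit `n` (MSB-first) corresponds to a right shift of `31 - n`.
fn decode_vx128_regs(word: u32) -> Vx128Regs {
    let vd_lo = (word >> 21) & 0x1F; // bits 6-10, VD128l
    let va_lo = (word >> 16) & 0x1F; // bits 11-15, VA128l
    let vb_lo = (word >> 11) & 0x1F; // bits 16-20, VB128l
    let va_top = (word >> 10) & 0x1; // bit 21, VA128H
    let va_mid = (word >> 5) & 0x1; // bit 26, VA128h
    let vd_hi = (word >> 2) & 0x3; // bits 28-29, VD128h
    let vb_hi = word & 0x3; // bits 30-31, VB128h

    Vx128Regs {
        vd: vd_lo | (vd_hi << 5),
        va: va_lo | (va_mid << 5) | (va_top << 6),
        vb: vb_lo | (vb_hi << 5),
    }
}
```

With this layout the destination and source B each combine two fields and source A combines three, which is why a plain VX decoder cannot be reused for `vaddfp128`.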
## Operands
| Field | Role | Description |
| --- | --- | --- |
| `VA` | vaddfp: read; vaddfp128: read | Source A vector register. |
| `VB` | vaddfp: read; vaddfp128: read | Source B vector register. |
| `VD` | vaddfp: write; vaddfp128: write | Destination vector register. |
## Register Effects
### `vaddfp`
- **Reads (always):** `VA`, `VB`
- **Reads (conditional):** _none_
- **Writes (always):** `VD`
- **Writes (conditional):** _none_
### `vaddfp128`
- **Reads (always):** `VA`, `VB`
- **Reads (conditional):** _none_
- **Writes (always):** `VD`
- **Writes (conditional):** _none_
## Status-Register Effects
_No condition-register or status-register effects._
## Operation (pseudocode)
```
for each 32-bit float lane i in 0..3:
    VD[i] <- VA[i] + VB[i]
```
## C Translation Example
```c
/* vaddfp VD, VA, VB — lane-wise float add */
for (int i = 0; i < 4; ++i) v[insn.VD].f[i] = v[insn.VA].f[i] + v[insn.VB].f[i];
```
## Implementation References
**`vaddfp`**
- xenia-canary XML: [`tools/ppc-instructions.xml` — search for `mnem="vaddfp"`](../../xenia-canary/tools/ppc-instructions.xml)
- xenia-canary emit: [`src/xenia/cpu/ppc/ppc_emit_altivec.cc:341`](../../xenia-canary/src/xenia/cpu/ppc/ppc_emit_altivec.cc#L341)
- xenia-rs opcode: [`crates/xenia-cpu/src/opcode.rs:89`](../../xenia-rs/crates/xenia-cpu/src/opcode.rs#L89)
- xenia-rs decoder: [`crates/xenia-cpu/src/decoder.rs:438`](../../xenia-rs/crates/xenia-cpu/src/decoder.rs#L438)
- xenia-rs interpreter: [`crates/xenia-cpu/src/interpreter.rs:1984-1998`](../../xenia-rs/crates/xenia-cpu/src/interpreter.rs#L1984-L1998)
<details><summary>xenia-rs interpreter body (frozen snapshot)</summary>
```rust
PpcOpcode::vaddfp => {
    // PPCBUG-435: VSCR.NJ=1 (Xbox 360 always boots with this set) requires
    // flush-to-zero on subnormal inputs and outputs. Canary VMX float
    // arithmetic flushes denormals unconditionally.
    let a = ctx.vr[instr.ra()].as_f32x4();
    let b = ctx.vr[instr.rb()].as_f32x4();
    let mut r = [0f32; 4];
    for i in 0..4 {
        let ai = vmx::flush_denorm(a[i]);
        let bi = vmx::flush_denorm(b[i]);
        r[i] = vmx::flush_denorm(ai + bi);
    }
    ctx.vr[instr.rd()] = xenia_types::Vec128::from_f32x4_array(r);
    ctx.pc += 4;
}
```
</details>
**`vaddfp128`**
- xenia-canary XML: [`tools/ppc-instructions.xml` — search for `mnem="vaddfp128"`](../../xenia-canary/tools/ppc-instructions.xml)
- xenia-canary emit: [`src/xenia/cpu/ppc/ppc_emit_altivec.cc:344`](../../xenia-canary/src/xenia/cpu/ppc/ppc_emit_altivec.cc#L344)
- xenia-rs opcode: [`crates/xenia-cpu/src/opcode.rs:89`](../../xenia-rs/crates/xenia-cpu/src/opcode.rs#L89)
- xenia-rs decoder: [`crates/xenia-cpu/src/decoder.rs:610`](../../xenia-rs/crates/xenia-cpu/src/decoder.rs#L610)
- xenia-rs interpreter: [`crates/xenia-cpu/src/interpreter.rs:1999-2011`](../../xenia-rs/crates/xenia-cpu/src/interpreter.rs#L1999-L2011)
<details><summary>xenia-rs interpreter body (frozen snapshot)</summary>
```rust
PpcOpcode::vaddfp128 => {
    // PPCBUG-435: same as vaddfp.
    let a = ctx.vr[instr.va128()].as_f32x4();
    let b = ctx.vr[instr.vb128()].as_f32x4();
    let mut r = [0f32; 4];
    for i in 0..4 {
        let ai = vmx::flush_denorm(a[i]);
        let bi = vmx::flush_denorm(b[i]);
        r[i] = vmx::flush_denorm(ai + bi);
    }
    ctx.vr[instr.vd128()] = xenia_types::Vec128::from_f32x4_array(r);
    ctx.pc += 4;
}
```
</details>
<!-- GENERATED: END -->
## Extended Pseudocode
```
; Four independent lane-wise IEEE-754 single-precision adds
for i in 0..3:
    VD[i] <- VA[i] + VB[i] ; binary32, rounded to nearest
; No FPSCR update (VMX uses VSCR, which only has NJ / SAT, and vaddfp doesn't saturate)
```
## Special Cases & Edge Conditions
- **Lane indexing is big-endian.** Lane 0 is the **most significant** 4 bytes of the 128-bit register (the bytes that land at the lowest address after a `stvx`). Xenia's `Vec128::as_f32x4()` already returns lanes in PPC order on x86-64. When writing C that manipulates individual lanes, treat `v.f[0]` as bytes 0..3 of the big-endian layout.
- **Flush-denormals ("NJ") mode.** Altivec is independent of FPSCR: it has its own VSCR, with two defined bits (`NJ` for non-Java mode and `SAT` for sticky saturation). VMX float operations honour `VSCR[NJ]`: when it is set (the Xenon boot default), denormal inputs and outputs are flushed to zero. The scalar FPU is controlled separately, through its own non-IEEE bit in FPSCR, so the two units can handle denormals differently. Xenia sets `NJ = 1` at context creation ([`context.rs`](../../xenia-rs/crates/xenia-cpu/src/context.rs)).
- **No exception, no trap.** Altivec floats never raise exceptions. NaN inputs produce NaN outputs; adding infinities of opposite sign, `(+∞) + (-∞)`, yields a NaN; there is no VXISI-style status bit. `VSCR[SAT]` is **not** touched by `vaddfp` (it is a sticky flag set by saturating operations, and a float add never saturates).
- **Four independent lanes.** Each lane's operation is unaffected by the others. Aliasing between `VA`, `VB`, and `VD` is legal and common (`vaddfp v3, v3, v4`).
- **VMX128 sibling (`vaddfp128`).** Semantics identical; only the register encoding differs. VMX128 uses a 7-bit operand ID per source (and destination) built from two or three non-contiguous bit fields; see [`categories/vmx128.md`](../categories/vmx128.md). Any register combination expressible in the 32-register VX form can also be expressed in the VMX128 form. Both encodings occupy one 32-bit word, so compilers chose between them based on the register range they needed to reach.
- **On x86-64 hosts.** A natural compilation uses `_mm_add_ps` or AVX `vaddps`. The add is lane-wise, so it gives the right answer wherever each PPC lane lands in the host register, provided both operands use the same layout. PPC lane 0 maps to x86 lane 3 only if the 128-bit value is kept "big-endian in memory", i.e. byte-swapped on load/store. With xenia's `_be` memory helpers, `_mm_add_ps` gives the right per-lane result.
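
The lane-ordering, flush-to-zero, and NaN points above can be exercised with a small host-side model. This is a minimal sketch under stated assumptions, not the xenia-rs implementation: `flush_denorm` here is a hypothetical stand-in for `vmx::flush_denorm` (whose body is not shown in this page) and is assumed to return a zero of the same sign, while `lanes_from_be_bytes` and `vaddfp_nj` are names invented for the example.

```rust
/// Hypothetical stand-in for `vmx::flush_denorm`: subnormal values become a
/// zero of the same sign (an assumption); zeros, normals, infinities and
/// NaNs pass through unchanged.
fn flush_denorm(x: f32) -> f32 {
    if x != 0.0 && x.abs() < f32::MIN_POSITIVE {
        0.0f32.copysign(x)
    } else {
        x
    }
}

/// Reads four float lanes from a 16-byte big-endian register image, as left
/// in memory by `stvx`: lane 0 is bytes 0..3, lane 3 is bytes 12..15.
fn lanes_from_be_bytes(bytes: [u8; 16]) -> [f32; 4] {
    let mut lanes = [0f32; 4];
    for (i, lane) in lanes.iter_mut().enumerate() {
        let word = [bytes[4 * i], bytes[4 * i + 1], bytes[4 * i + 2], bytes[4 * i + 3]];
        *lane = f32::from_be_bytes(word);
    }
    lanes
}

/// Lane-wise add with NJ = 1 behaviour: flush both inputs, add, flush the sum.
/// Each lane is computed independently of the others.
fn vaddfp_nj(a: [f32; 4], b: [f32; 4]) -> [f32; 4] {
    let mut r = [0f32; 4];
    for i in 0..4 {
        r[i] = flush_denorm(flush_denorm(a[i]) + flush_denorm(b[i]));
    }
    r
}
```

Feeding this model two subnormal inputs, or two normal inputs whose exact sum is subnormal, yields zero in that lane, and `(+∞) + (-∞)` yields a NaN without affecting the neighbouring lanes.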
## Related Instructions
- [`vsubfp`](vsubfp.md) — lane-wise float subtract.
- [`vmaddfp`](vmaddfp.md) — lane-wise `(VA × VC) + VB` (fused multiply-add with single rounding).
- [`vnmsubfp`](vnmsubfp.md) — `-((VA × VC) - VB)`.
- [`vmaxfp`](vmaxfp.md), [`vminfp`](vminfp.md) — IEEE-754-aware max/min (NaN propagation).
- [`vcmpeqfp`](vcmpeqfp.md), [`vcmpgtfp`](vcmpgtfp.md), [`vcmpgefp`](vcmpgefp.md), [`vcmpbfp`](vcmpbfp.md) — compares producing per-lane all-ones / all-zero masks.
- [`vrfin`](vrfin.md), [`vrfim`](vrfim.md), [`vrfip`](vrfip.md), [`vrfiz`](vrfiz.md) — round to integer (to-nearest / down / up / toward-zero).
- [`vmulfp`](vmulfp.md) — xenia's helper; not a native Altivec op, included for convenience. Games running on real hardware use `vmaddfp v, va, vc, v0_zero` instead.
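
The multiply-through-`vmaddfp` idiom mentioned in the last entry can be sketched on the host as follows. The function names `vmaddfp` and `vmul_via_vmaddfp` are illustrative only, and NJ flushing is deliberately left out to keep the example short. Rust's `f32::mul_add` is used because it performs a fused multiply-add with a single rounding.

```rust
/// Lane-wise fused multiply-add in the shape of `vmaddfp`:
/// (a[i] * c[i]) + b[i], rounded once per lane.
fn vmaddfp(a: [f32; 4], c: [f32; 4], b: [f32; 4]) -> [f32; 4] {
    let mut r = [0f32; 4];
    for i in 0..4 {
        r[i] = a[i].mul_add(c[i], b[i]);
    }
    r
}

/// Lane-wise multiply expressed as a multiply-add with an all-zero addend.
fn vmul_via_vmaddfp(a: [f32; 4], c: [f32; 4]) -> [f32; 4] {
    vmaddfp(a, c, [0.0; 4])
}
```

One caveat, stated as general IEEE-754 behaviour rather than anything from this page: with a `+0.0` addend, a product of `-0.0` comes out as `+0.0`, so the idiom is not bit-identical to a true multiply in that corner case.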
## IBM Reference
- [AIX 7.3 — `vaddfp` (Vector Add Floating Point)](https://www.ibm.com/docs/en/aix/7.3.0?topic=set-vaddfp-vector-add-floating-point-instruction)
- [IBM AltiVec Technology Programmer's Interface Manual, Chapter 5 — Floating-Point Arithmetic](https://www.nxp.com/docs/en/reference-manual/ALTIVECPIM.pdf)