# `vnmsubfp` — Vector Negative Multiply-Subtract Floating Point
> **Category:** [VMX (Altivec)](../categories/vmx.md) · **Form:** [VA](../forms/VA.md) · **Opcode:** `0x1000002f`
<!-- GENERATED: BEGIN -->
## Assembler Mnemonics
| Mnemonic | XML entry | Flags | Description |
| --- | --- | --- | --- |
| `vnmsubfp` | `vnmsubfp` | — | Vector Negative Multiply-Subtract Floating Point |
| `vnmsubfp128` | `vnmsubfp128` | — | Vector128 Negative Multiply-Subtract Floating Point |
## Syntax
```asm
vnmsubfp [VD], [VA], [VC], [VB]
vnmsubfp128 [VD], [VA], [VD], [VB]
```
## Encoding
### `vnmsubfp` — form `VA`
- **Opcode word:** `0x1000002f`
- **Primary opcode (bits 0–5):** `4`
- **Extended opcode:** `47`
- **Synchronising:** no
| Bits | Field | Meaning |
| --- | --- | --- |
| 0–5 | `OPCD` | primary opcode (4) |
| 6–10 | `VRT` | destination vector register |
| 11–15 | `VRA` | source A |
| 16–20 | `VRB` | source B |
| 21–25 | `VRC` | source C / shift |
| 26–31 | `XO` | extended opcode (6 bits) |
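
The field table above can be turned into a small decoder. The following is a minimal sketch, assuming PowerPC big-endian bit numbering (bit 0 is the most-significant bit of the 32-bit word); `VaFields` and `decode_vnmsubfp` are hypothetical names used for illustration and are not the xenia-rs decoder's own API.

```rust
/// Operand fields of a VA-form instruction word (hypothetical helper type,
/// not the xenia-rs decoder's own representation).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct VaFields {
    pub vd: u32, // bits 6-10  (VRT)
    pub va: u32, // bits 11-15 (VRA)
    pub vb: u32, // bits 16-20 (VRB)
    pub vc: u32, // bits 21-25 (VRC)
}

/// Decode `word` as `vnmsubfp`, or return `None` if the opcode fields differ.
/// PowerPC numbers bits from the most-significant end, so bit 0 is `1 << 31`.
pub fn decode_vnmsubfp(word: u32) -> Option<VaFields> {
    let opcd = word >> 26; // bits 0-5: primary opcode
    let xo = word & 0x3f; // bits 26-31: extended opcode
    if opcd != 4 || xo != 47 {
        return None;
    }
    Some(VaFields {
        vd: (word >> 21) & 0x1f,
        va: (word >> 16) & 0x1f,
        vb: (word >> 11) & 0x1f,
        vc: (word >> 6) & 0x1f,
    })
}
```

Under this layout, `vnmsubfp v3, v1, v2, v4` (operand order `VD, VA, VC, VB`) corresponds to the word `0x106120af`, which the sketch decodes to `vd = 3`, `va = 1`, `vb = 4`, `vc = 2`.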
### `vnmsubfp128` — form `VX128`
- **Opcode word:** `0x14000150`
- **Primary opcode (bits 0–5):** `5`
- **Extended opcode:** `336`
- **Synchronising:** no
| Bits | Field | Meaning |
| --- | --- | --- |
| 0–5 | `OPCD` | primary opcode (4 or 5) |
| 6–10 | `VD128l` | destination low 5 bits |
| 11–15 | `VA128l` | source A low 5 bits |
| 16–20 | `VB128l` | source B low 5 bits |
| 21 | `VA128H` | source A high bit |
| 22 | `—` | reserved |
| 23–25 | `VC` | optional VC / XO sub-field |
| 26 | `VA128h` | source A middle bit |
| 27 | `—` | reserved |
| 28–29 | `VD128h` | destination high 2 bits |
| 30–31 | `VB128h` | source B high 2 bits |
## Operands
| Field | Role | Description |
| --- | --- | --- |
| `VA` | vnmsubfp: read; vnmsubfp128: read | Source A vector register. |
| `VC` | vnmsubfp: read | Source C vector register / 3-bit selector. |
| `VB` | vnmsubfp: read; vnmsubfp128: read | Source B vector register. |
| `VD` | vnmsubfp: write; vnmsubfp128: read; vnmsubfp128: write | Destination vector register. |
## Register Effects
### `vnmsubfp`
- **Reads (always):** `VA`, `VC`, `VB`
- **Reads (conditional):** _none_
- **Writes (always):** `VD`
- **Writes (conditional):** _none_
### `vnmsubfp128`
- **Reads (always):** `VA`, `VD`, `VB`
- **Reads (conditional):** _none_
- **Writes (always):** `VD`
- **Writes (conditional):** _none_
## Status-Register Effects
_No condition-register or status-register effects._
## Operation (pseudocode)
```
for each 32-bit float lane i in 0..3:
    VD[i] <- -((VA[i] * VC[i]) - VB[i])
```
## C Translation Example
```c
/* C translation: the xenia-rs interpreter arm below in */
/* Implementation References is the authoritative semantic */
/* snapshot. Translate it line-by-line: */
/* - ctx.gpr[N] -> r[N] (or f[]/v[] for FPRs/VRs) */
/* - mem.read_uN / mem.write_uN -> mem_read_uN_be / mem_write_uN_be */
/* - ctx.update_cr_signed(fld, v) -> update_cr_signed(fld, v) */
/* - ctx.xer_ca / xer_ov / xer_so -> xer.CA / xer.OV / xer.SO */
/* The Register Effects and Status-Register Effects tables above */
/* enumerate every side effect a faithful translation must emit. */
```
## Implementation References
**`vnmsubfp`**
- xenia-canary XML: [`tools/ppc-instructions.xml` — search for `mnem="vnmsubfp"`](../../xenia-canary/tools/ppc-instructions.xml)
- xenia-canary emit: [`src/xenia/cpu/ppc/ppc_emit_altivec.cc:1154`](../../xenia-canary/src/xenia/cpu/ppc/ppc_emit_altivec.cc#L1154)
- xenia-rs opcode: [`crates/xenia-cpu/src/opcode.rs:110`](../../xenia-rs/crates/xenia-cpu/src/opcode.rs#L110)
- xenia-rs decoder: [`crates/xenia-cpu/src/decoder.rs:589`](../../xenia-rs/crates/xenia-cpu/src/decoder.rs#L589)
- xenia-rs interpreter: [`crates/xenia-cpu/src/interpreter.rs:2074-2089`](../../xenia-rs/crates/xenia-cpu/src/interpreter.rs#L2074-L2089)
<details><summary>xenia-rs interpreter body (frozen snapshot)</summary>
```rust
PpcOpcode::vnmsubfp => {
    // vD = -(vA * vC - vB) = vB - vA * vC. Same denorm-flush rule as vmaddfp.
    let a = ctx.vr[instr.ra()].as_f32x4();
    let b = ctx.vr[instr.rb()].as_f32x4();
    let c = ctx.vr[instr.rc()].as_f32x4();
    let mut r = [0f32; 4];
    for i in 0..4 {
        let ai = vmx::flush_denorm(a[i]);
        let bi = vmx::flush_denorm(b[i]);
        let ci = vmx::flush_denorm(c[i]);
        // PPCBUG-426: single FMA rounding instead of two-step (b - a*c).
        r[i] = vmx::flush_denorm(-ai.mul_add(ci, -bi));
    }
    ctx.vr[instr.rd()] = xenia_types::Vec128::from_f32x4_array(r);
    ctx.pc += 4;
}
```
</details>
**`vnmsubfp128`**
- xenia-canary XML: [`tools/ppc-instructions.xml` — search for `mnem="vnmsubfp128"`](../../xenia-canary/tools/ppc-instructions.xml)
- xenia-canary emit: [`src/xenia/cpu/ppc/ppc_emit_altivec.cc:1157`](../../xenia-canary/src/xenia/cpu/ppc/ppc_emit_altivec.cc#L1157)
- xenia-rs opcode: [`crates/xenia-cpu/src/opcode.rs:110`](../../xenia-rs/crates/xenia-cpu/src/opcode.rs#L110)
- xenia-rs decoder: [`crates/xenia-cpu/src/decoder.rs:615`](../../xenia-rs/crates/xenia-cpu/src/decoder.rs#L615)
- xenia-rs interpreter: [`crates/xenia-cpu/src/interpreter.rs:2090-2107`](../../xenia-rs/crates/xenia-cpu/src/interpreter.rs#L2090-L2107)
<details><summary>xenia-rs interpreter body (frozen snapshot)</summary>
```rust
PpcOpcode::vnmsubfp128 => {
    // VMX128 form: vD <- -((vA * vB) - vD) = vD - (vA * vB). Canary
    // routes through `InstrEmit_vnmsubfp_` with the same arg-swap,
    // which flushes all inputs unconditionally.
    let a = ctx.vr[instr.va128()].as_f32x4();
    let b = ctx.vr[instr.vb128()].as_f32x4();
    let d = ctx.vr[instr.vd128()].as_f32x4();
    let mut r = [0f32; 4];
    for i in 0..4 {
        let ai = vmx::flush_denorm(a[i]);
        let bi = vmx::flush_denorm(b[i]);
        let di = vmx::flush_denorm(d[i]);
        // PPCBUG-427: single FMA rounding.
        r[i] = vmx::flush_denorm(-ai.mul_add(bi, -di));
    }
    ctx.vr[instr.vd128()] = xenia_types::Vec128::from_f32x4_array(r);
    ctx.pc += 4;
}
```
</details>
<!-- GENERATED: END -->
## Special Cases & Edge Conditions
- **Lane-wise negative multiply-subtract.** Each of the four lanes computes `VD[i] = -((VA[i] × VC[i]) - VB[i])`, i.e. `VB[i] - VA[i] × VC[i]`. The PowerPC ISA specifies the multiply and subtract to behave *as if* fused (a single IEEE-754 rounding), and Xenon hardware rounds only once. A two-step implementation (multiply, subtract, then negate) rounds twice and can differ from hardware in the last bit; the xenia-rs interpreter snapshot above uses a single `mul_add` per lane for this reason (PPCBUG-426).
- **IEEE-754 binary32 lanes.** Follows `VSCR[NJ]`: denormal inputs/outputs flush to zero when `NJ = 1`. The xenia-rs interpreter snapshot above applies `vmx::flush_denorm` to every input and to the result unconditionally, without consulting `NJ`.
- **No VSCR[SAT] update.** VMX float ops never set saturation.
- **No FPSCR effect.** Unlike scalar `fnmsub[s]`, `vnmsubfp` does not touch FPSCR.
- **NaN propagation.** A NaN in any of `VA`, `VB`, or `VC` yields a NaN in the corresponding lane. Sign-of-NaN is unspecified but stable in xenia (matches the x86 host's `vfnmadd`-family output).
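- **Big-endian lane indexing.** Lane 0 occupies the four most-significant bytes of the 128-bit register.
- **VMX128 sibling: [`vnmsubfp128`](vnmsubfp128.md).** Same lane-wise operation with access to `v0..v127`, but with different operand wiring: `VD` doubles as the addend and `VB` is the multiplier, so each lane computes `VD[i] = VD[i] - VA[i] × VB[i]`.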
- **No `Rc` bit** on this opcode; it never touches CR.
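
The rounding, denormal-flush, and operand-wiring points above can be illustrated with a small lane model. This is a sketch under stated assumptions, not the xenia-rs implementation: `flush_denorm` here is a stand-in for `vmx::flush_denorm` (that it preserves the sign of the flushed zero is an assumption), and `per_lane`, `vnmsubfp_fused`, `vnmsubfp_two_step`, and `vnmsubfp128_fused` are hypothetical names.

```rust
/// Flush a subnormal to a same-signed zero. Stand-in for `vmx::flush_denorm`;
/// sign preservation is an assumption of this sketch.
fn flush_denorm(x: f32) -> f32 {
    if x.is_subnormal() { 0.0f32.copysign(x) } else { x }
}

/// Apply `op(a, c, b)` to each lane after flushing the inputs, then flush the result.
fn per_lane(
    va: [f32; 4],
    vc: [f32; 4],
    vb: [f32; 4],
    op: fn(f32, f32, f32) -> f32,
) -> [f32; 4] {
    let mut r = [0.0f32; 4];
    for i in 0..4 {
        let (a, c, b) = (flush_denorm(va[i]), flush_denorm(vc[i]), flush_denorm(vb[i]));
        r[i] = flush_denorm(op(a, c, b));
    }
    r
}

/// -((VA * VC) - VB) with one rounding per lane (fused multiply-add).
pub fn vnmsubfp_fused(va: [f32; 4], vc: [f32; 4], vb: [f32; 4]) -> [f32; 4] {
    per_lane(va, vc, vb, |a: f32, c: f32, b: f32| -a.mul_add(c, -b))
}

/// Same formula, but the product is rounded before the subtract (two roundings).
pub fn vnmsubfp_two_step(va: [f32; 4], vc: [f32; 4], vb: [f32; 4]) -> [f32; 4] {
    per_lane(va, vc, vb, |a: f32, c: f32, b: f32| -(a * c - b))
}

/// VMX128 wiring: VD is both addend and destination, VB is the multiplier.
pub fn vnmsubfp128_fused(vd: [f32; 4], va: [f32; 4], vb: [f32; 4]) -> [f32; 4] {
    vnmsubfp_fused(va, vb, vd)
}
```

One input that separates the two variants is `VA = VC = 1 + 2^-12` with `VB = 1 + 2^-11`: the exact product is `1 + 2^-11 + 2^-24`, which the two-step version rounds to `1 + 2^-11` before subtracting, giving zero, while the fused version keeps the `2^-24` term and returns `-2^-24`.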
## Related Instructions
- [`vmaddfp`](vmaddfp.md) — the non-negated fused multiply-add `(VA × VC) + VB`.
- [`vaddfp`](vaddfp.md), [`vsubfp`](vsubfp.md) — the underlying adds/subs.
- [`vmulfp`](vmulfp.md) — xenia-convenience lane-wise float multiply (no native Altivec form; usually encoded as `vmaddfp VD, VA, VC, v0_zero`).
- [`vrefp`](vrefp.md), [`vrsqrtefp`](vrsqrtefp.md) — estimate instructions whose Newton-Raphson refinement steps typically use `vnmsubfp` to compute the residual.
- [`vmaxfp`](vmaxfp.md), [`vminfp`](vminfp.md) — the other float-arithmetic primitives.
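
As an illustration of how `vnmsubfp` pairs with the estimate instructions listed above, here is a minimal scalar sketch of one Newton-Raphson step for a reciprocal. It models a single lane; `nmsub`, `madd`, and `refine_recip` are hypothetical helper names, and the starting estimate `y0` stands in for whatever `vrefp` would return (its actual precision is not modelled here).

```rust
/// Scalar model of one `vnmsubfp` lane: -((a * c) - b), rounded once.
fn nmsub(a: f32, c: f32, b: f32) -> f32 {
    -a.mul_add(c, -b)
}

/// Scalar model of one `vmaddfp` lane: (a * c) + b, rounded once.
fn madd(a: f32, c: f32, b: f32) -> f32 {
    a.mul_add(c, b)
}

/// One Newton-Raphson step towards 1/x from the estimate `y0`.
/// `y0` is a stand-in for a `vrefp` result (assumption of this sketch).
pub fn refine_recip(x: f32, y0: f32) -> f32 {
    let t = nmsub(y0, x, 1.0); // residual: t = 1 - y0 * x
    madd(y0, t, y0) // refined estimate: y1 = y0 + y0 * t
}
```

Each step roughly squares the relative error, so starting from `y0 = 0.33` for `x = 3.0`, one call brings the error below `1e-4` and a second call brings it below `1e-6`.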
## IBM Reference
- [AIX 7.3 — `vnmsubfp` (Vector Negative Multiply-Subtract Floating Point)](https://www.ibm.com/docs/en/aix/7.3.0?topic=set-vnmsubfp-vector-negative-multiply-subtract-floating-point-instruction)
- [IBM AltiVec Technology Programmer's Interface Manual, Chapter 5 — Floating-Point Arithmetic](https://www.nxp.com/docs/en/reference-manual/ALTIVECPIM.pdf)