# `fmaddx` — Floating Multiply-Add

> **Category:** [Floating-Point](../categories/fpu.md) · **Form:** [A](../forms/A.md) · **Opcode:** `0xfc00003a`

## Assembler Mnemonics

| Mnemonic | XML entry | Flags | Description |
| --- | --- | --- | --- |
| `fmadd` | `fmaddx` | — | Floating Multiply-Add |
| `fmadd.` | `fmaddx` | Rc=1 | Floating Multiply-Add |

## Syntax

```asm
fmadd[Rc] [FD], [FA], [FC], [FB]
```

## Encoding

### `fmaddx` — form `A`

- **Opcode word:** `0xfc00003a`
- **Primary opcode (bits 0–5):** `63`
- **Extended opcode:** `29`
- **Synchronising:** no

| Bits | Field | Meaning |
| --- | --- | --- |
| 0–5 | `OPCD` | primary opcode (59 or 63) |
| 6–10 | `FRT` | destination FPR |
| 11–15 | `FRA` | source A FPR |
| 16–20 | `FRB` | source B FPR |
| 21–25 | `FRC` | source C FPR (multiplier for madd-style ops) |
| 26–30 | `XO` | extended opcode (5 bits) |
| 31 | `Rc` | record-form flag (updates CR1) |

## Operands

| Field | Role | Description |
| --- | --- | --- |
| `FA` | fmaddx: read | Source A floating-point register (`fr0`–`fr31`). |
| `FC` | fmaddx: read | Source C floating-point register (for madd-style ops). |
| `FB` | fmaddx: read | Source B floating-point register. |
| `FD` | fmaddx: write | Destination floating-point register. |
| `CR` | fmaddx: write (conditional) | Condition-register update. When `Rc=1`, CR field 0 (or CR6 for vector compares, CR1 for FPU) is updated from the result. |
| `FPSCR` | fmaddx: write | Floating-Point Status and Control Register. |

## Register Effects

### `fmaddx`

- **Reads (always):** `FA`, `FC`, `FB`
- **Reads (conditional):** _none_
- **Writes (always):** `FD`, `FPSCR`
- **Writes (conditional):** `CR`

## Status-Register Effects

- `fmaddx`: **CR1** ← FPSCR[FX, FEX, VX, OX] when `Rc=1`; **FPSCR** updated per IEEE-754 flags (FX, FEX, FPRF, FR, FI, exceptions).
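
The field layout in the Encoding table and the `Rc=1` rule above can be sketched in a few lines of Rust. This is a minimal illustration, not the xenia-rs decoder: the names `AForm`, `decode_a_form`, `is_fmadd`, and `cr_with_cr1_from_fpscr` are hypothetical. It assumes IBM bit numbering (bit 0 is the most significant bit), with FX, FEX, VX, OX as FPSCR bits 0–3 and CR1 as CR bits 4–7.

```rust
/// Fields of an A-form floating-point instruction word.
/// Hypothetical type for illustration; not the xenia-rs decoder.
#[derive(Debug, PartialEq, Eq)]
struct AForm {
    opcd: u32, // bits 0-5
    frt: u32,  // bits 6-10
    fra: u32,  // bits 11-15
    frb: u32,  // bits 16-20
    frc: u32,  // bits 21-25
    xo: u32,   // bits 26-30
    rc: bool,  // bit 31
}

/// Split a 32-bit instruction word into its A-form fields.
/// IBM bit N corresponds to a right shift of (31 - N) for the field's last bit.
fn decode_a_form(word: u32) -> AForm {
    AForm {
        opcd: word >> 26,
        frt: (word >> 21) & 0x1F,
        fra: (word >> 16) & 0x1F,
        frb: (word >> 11) & 0x1F,
        frc: (word >> 6) & 0x1F,
        xo: (word >> 1) & 0x1F,
        rc: (word & 1) != 0,
    }
}

/// True for `fmadd` and `fmadd.` (primary opcode 63, extended opcode 29).
fn is_fmadd(word: u32) -> bool {
    let f = decode_a_form(word);
    f.opcd == 63 && f.xo == 29
}

/// Return `cr` with CR field 1 replaced by FPSCR[FX, FEX, VX, OX],
/// i.e. the top four bits of FPSCR copied into CR bits 4-7.
fn cr_with_cr1_from_fpscr(cr: u32, fpscr: u32) -> u32 {
    let top_nibble = fpscr >> 28;
    (cr & !0x0F00_0000) | (top_nibble << 24)
}
```

Decoding the bare opcode word `0xfc00003a` with this sketch yields primary opcode 63, extended opcode 29, and `Rc` clear, which agrees with the values listed under Encoding.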
## Operation (pseudocode)

```
FRT <- (FRA × FRC) + FRB
```

## C Translation Example

```c
/* C translation: the xenia-rs interpreter arm below in              */
/* Implementation References is the authoritative semantic           */
/* snapshot. Translate it line-by-line:                              */
/*   - ctx.gpr[N] -> r[N] (or f[]/v[] for FPRs/VRs)                  */
/*   - mem.read_u* / write_u* -> mem_read_u*_be / mem_write_u*_be    */
/*   - ctx.update_cr_signed(fld, v) -> update_cr_signed(fld, v)      */
/*   - ctx.xer_ca / xer_ov / xer_so -> xer.CA / xer.OV / xer.SO      */
/* The Register Effects and Status-Register Effects tables above     */
/* enumerate every side effect a faithful translation must emit.     */
```

## Implementation References

**`fmaddx`**

- xenia-canary XML: [`tools/ppc-instructions.xml` — search for `mnem="fmaddx"`](../../xenia-canary/tools/ppc-instructions.xml)
- xenia-canary emit: [`src/xenia/cpu/ppc/ppc_emit_fpu.cc:186`](../../xenia-canary/src/xenia/cpu/ppc/ppc_emit_fpu.cc#L186)
- xenia-rs opcode: [`crates/xenia-cpu/src/opcode.rs:28`](../../xenia-rs/crates/xenia-cpu/src/opcode.rs#L28)
- xenia-rs decoder: [`crates/xenia-cpu/src/decoder.rs:928`](../../xenia-rs/crates/xenia-cpu/src/decoder.rs#L928)
- xenia-rs interpreter: [`crates/xenia-cpu/src/interpreter.rs:2640-2652`](../../xenia-rs/crates/xenia-cpu/src/interpreter.rs#L2640-L2652)


xenia-rs interpreter body (frozen snapshot):

```rust
PpcOpcode::fmaddx => {
    // PPCBUG-202: VXISI from input properties (not from `a*c` which has wrong sign on overflow).
    let a = ctx.fpr[instr.ra()];
    let c = ctx.fpr[instr.rc()];
    let b = ctx.fpr[instr.rb()];
    fpscr::check_invalid_mul(ctx, a, c);
    fpscr::check_invalid_fma_add(ctx, a, c, b, false);
    let result = a.mul_add(c, b);
    ctx.fpr[instr.rd()] = result;
    fpscr::update_after_op(ctx, result, a.is_finite() && b.is_finite() && c.is_finite());
    if instr.rc_bit() {
        update_cr1_from_fpscr(ctx);
    }
    ctx.pc += 4;
}
```
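
The bodies of `fpscr::check_invalid_mul` and `fpscr::check_invalid_fma_add` are not reproduced on this page. As a rough, non-authoritative sketch of the kind of classification they imply, the following decides `VXIMZ`, `VXISI`, and `VXSNAN` purely from operand properties, in the spirit of the PPCBUG-202 comment. Everything here (`InvalidCauses`, `is_snan`, `classify_fmadd_invalid`) is hypothetical and not the xenia-rs API. It also assumes that `VXIMZ` is reported from the multiply operands alone; how hardware treats `0×∞` combined with a NaN addend is not covered by this page.

```rust
/// Invalid-operation causes relevant to a fused multiply-add.
/// Hypothetical type for illustration; not the xenia-rs `fpscr` API.
#[derive(Debug, Default, Clone, Copy, PartialEq, Eq)]
struct InvalidCauses {
    vxsnan: bool, // some operand is a signalling NaN
    vximz: bool,  // infinity multiplied by zero
    vxisi: bool,  // infinity minus infinity
}

/// A signalling NaN is a NaN whose most significant fraction bit (bit 51) is clear.
fn is_snan(x: f64) -> bool {
    x.is_nan() && (x.to_bits() & (1u64 << 51)) == 0
}

/// Classify (a × c) + b from the inputs only, without computing `a * c`,
/// so a finite product that merely overflows is never mistaken for ∞ − ∞.
fn classify_fmadd_invalid(a: f64, c: f64, b: f64) -> InvalidCauses {
    let mut causes = InvalidCauses::default();
    causes.vxsnan = is_snan(a) || is_snan(c) || is_snan(b);

    // 0 × ∞ in either operand order; `== 0.0` also matches -0.0.
    causes.vximz = (a == 0.0 && c.is_infinite()) || (a.is_infinite() && c == 0.0);

    // The exact product is infinite only if an operand is infinite,
    // neither operand is NaN, and the 0 × ∞ case does not apply.
    let product_is_inf = !causes.vximz
        && !a.is_nan()
        && !c.is_nan()
        && (a.is_infinite() || c.is_infinite());

    if product_is_inf && b.is_infinite() {
        // The product's sign is the XOR of the operand signs.
        let product_negative = a.is_sign_negative() != c.is_sign_negative();
        causes.vxisi = product_negative != b.is_sign_negative();
    }
    causes
}
```

For instance, `classify_fmadd_invalid(f64::MAX, f64::MAX, f64::NEG_INFINITY)` reports no `VXISI` in this sketch, because the exact product is finite even though `a * c` would overflow to `+∞` in `f64`.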
## Special Cases & Edge Conditions

- **Single rounding step.** `fmadd` computes `(FRA × FRC) + FRB` with one IEEE-754 rounding at the end. The intermediate product is never rounded on its own, so the result can differ from, and is generally more accurate than, a separate multiply followed by an add. xenia-rs uses Rust's `f64::mul_add`, which is specified to return the singly rounded fused result. On hosts with hardware FMA (x86_64 with FMA3, ARM with NEON-FMA) it can map to a native instruction; on hosts without it, Rust's stdlib falls back to a software FMA, so the semantic match is preserved.
- **Operand layout.** A-form: `FRT, FRA, FRC, FRB`. Note the assembler order: `FRC` (multiplier) comes before `FRB` (addend), whereas the encoding places the fields as `FRA` (11–15), `FRB` (16–20), `FRC` (21–25).
- **Invalid operations.** `0×∞ + finite` → `VXIMZ`; `∞×x + ∓∞` (the multiplication produces ±∞ whose sign opposes the infinite addend) → `VXISI`. In both cases the result is a quiet NaN and `FPSCR[VX, FX]` are set.
- **FPSCR side effects.** Hardware updates `FPRF`, `FR`, `FI`, `FX`, `OX`, `UX`, `XX`, `VXIMZ`, `VXISI`, `VXSNAN`. The xenia-rs snapshot above updates FPSCR through `fpscr::check_invalid_mul`, `fpscr::check_invalid_fma_add`, and `fpscr::update_after_op`; how completely those helpers cover the fields listed here is not shown on this page.
- **`Rc=1` (`fmadd.`)** copies `FPSCR[FX, FEX, VX, OX]` into CR1.
- **NaN propagation.** Quiet-NaN result for any NaN operand; signalling NaNs are quietened.
- **Use case.** Dot products, polynomial evaluation (Horner's method), matrix multiplies, Newton-Raphson divide/sqrt refinement. Hot-path PPC code is dense with `fmadd`.
- **Denormal flush.** Xenon boots with `FPSCR[NI]=1` (non-IEEE mode); xenia uses host IEEE behavior, so results involving denormals may differ from hardware.

## Related Instructions

- [`fmaddsx`](fmaddsx.md) — single-precision sibling.
- [`fmsubx`](fmsubx.md), [`fnmaddx`](fnmaddx.md), [`fnmsubx`](fnmsubx.md) — the other three fused multiply-add variants:
  - `fmsub` = `(A×C) − B`
  - `fnmadd` = `−((A×C) + B)`
  - `fnmsub` = `−((A×C) − B)`
- [`fmulx`](fmulx.md), [`faddx`](faddx.md) — non-fused decomposition (two rounding steps; less precise).
- [`fresx`](fresx.md), [`frsqrtex`](frsqrtex.md) — reciprocal helpers refined by `fmadd`/`fnmsub`.
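
The difference between the fused form and the `fmul` + `fadd` decomposition listed above can be made concrete with a small Rust sketch. It uses `f64::mul_add`, the same call that appears in the interpreter snapshot; the helper names (`mul_then_add`, `fmadd`, `fmsub`, `fnmadd`, `fnmsub`, `rounding_demo`) are hypothetical and model only the arithmetic. FPSCR updates and the NaN sign rules of the negated forms on real hardware are not modelled.

```rust
/// Non-fused reference: the product is rounded to f64 before the add,
/// like an `fmul` followed by an `fadd` (two roundings).
fn mul_then_add(a: f64, c: f64, b: f64) -> f64 {
    let product = a * c;
    product + b
}

/// (A×C) + B with a single rounding.
fn fmadd(a: f64, c: f64, b: f64) -> f64 {
    a.mul_add(c, b)
}

/// (A×C) − B
fn fmsub(a: f64, c: f64, b: f64) -> f64 {
    a.mul_add(c, -b)
}

/// −((A×C) + B)
fn fnmadd(a: f64, c: f64, b: f64) -> f64 {
    -(a.mul_add(c, b))
}

/// −((A×C) − B)
fn fnmsub(a: f64, c: f64, b: f64) -> f64 {
    -(a.mul_add(c, -b))
}

/// Returns (non-fused, fused) for inputs chosen so the two disagree.
/// With e = 2^-52, (1 + e)(1 − e) = 1 − e² exactly. The non-fused path
/// rounds that product to 1.0 and then adds −1.0; the fused path keeps
/// the −e² term until the single final rounding.
fn rounding_demo() -> (f64, f64) {
    let a = 1.0 + f64::EPSILON;
    let c = 1.0 - f64::EPSILON;
    let b = -1.0;
    (mul_then_add(a, c, b), fmadd(a, c, b))
}
```

With these inputs the non-fused path loses the entire result to the intermediate rounding, while the fused path returns the exact value `-(f64::EPSILON * f64::EPSILON)`.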
## IBM Reference

- [AIX 7.3 — `fmadd` (Floating Multiply-Add)](https://www.ibm.com/docs/en/aix/7.3.0?topic=set-fma-fmadd-floating-multiply-add-instruction)
- [PowerISA v2.07B, Book I, Chapter 4 — Floating-Point Processor](https://openpowerfoundation.org/specifications/isa/) (single-rounding fused multiply-add definition).