chore: add migration/ bundle for cross-machine setup
Bundles state that lives OUTSIDE the xenia-rs repo so a fresh clone on
another machine can be brought up to identical configuration via
migration/setup.sh:
- claude-memory/ -> ~/.claude/projects/-home-fabi-RE-Project-Sylpheed/memory/
  (103 files, 1.1 MB: MEMORY.md + every project_xenia_rs_*.md from audits
  addis_signext through audit-058)
- project-root/dot-claude/ -> <project-root>/.claude/settings.json
  (Stop hook + permissions)
- project-root/ppc-manual/ -> <project-root>/ppc-manual/
  (PowerPC reference docs, 397 files, 3.7 MB)
- project-root/run-canary.sh -> <project-root>/run-canary.sh
- README.md: human-readable setup checklist
- setup.sh: idempotent installer (also reclones xenia-canary at pinned HEAD 6de80dffe)
- MANIFEST.md: per-file mapping, plus a restoration recipe for each file not bundled
Excluded from bundle (not shippable via git):
- Sylpheed ISO (7.8 GB; copyright; manual copy required)
- sylpheed.db (395 MB; regenerable from XEX via analysis tooling)
- target/ build artifacts (rebuild on target)
- audit-runs probe firehoses (.log/.stdout/.stderr ~11 GB; rerun if needed)
- audit-runs memory dumps (.bin ~4.5 GB; rerun audit-026/027/029 if needed)
- xenia-canary checkout (setup.sh reclones from
git.mc02.dev/fabi/Xenia-Canary.git at HEAD 6de80dffe)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
121
migration/project-root/ppc-manual/fpu/fabsx.md
Normal file
@@ -0,0 +1,121 @@
# `fabsx` — Floating Absolute Value

> **Category:** [Floating-Point](../categories/fpu.md) · **Form:** [X](../forms/X.md) · **Opcode:** `0xfc000210`

<!-- GENERATED: BEGIN -->

## Assembler Mnemonics

| Mnemonic | XML entry | Flags | Description |
| --- | --- | --- | --- |
| `fabs` | `fabsx` | — | Floating Absolute Value |
| `fabs.` | `fabsx` | Rc=1 | Floating Absolute Value |

## Syntax

```asm
fabs[Rc] [FD], [FB]
```

## Encoding

### `fabsx` — form `X`

- **Opcode word:** `0xfc000210`
- **Primary opcode (bits 0–5):** `63`
- **Extended opcode:** `264`
- **Synchronising:** no

| Bits | Field | Meaning |
| --- | --- | --- |
| 0–5 | `OPCD` | primary opcode |
| 6–10 | `RT/FRT/VRT` | destination |
| 11–15 | `RA/FRA/VRA` | source A |
| 16–20 | `RB/FRB/VRB` | source B |
| 21–30 | `XO` | extended opcode (10 bits) |
| 31 | `Rc` | record-form flag |
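
The opcode word, primary opcode and extended opcode listed above are related by fixed shifts. A minimal Rust sketch, assuming IBM bit numbering (bit 0 is the most significant bit of the 32-bit word); the function names are illustrative and are not taken from xenia-rs:

```rust
/// Primary opcode: bits 0-5 in IBM numbering, i.e. the top six bits.
pub fn primary_opcode(word: u32) -> u32 {
    word >> 26
}

/// X-form extended opcode: bits 21-30, a 10-bit field just above Rc.
pub fn xo_x_form(word: u32) -> u32 {
    (word >> 1) & 0x3FF
}

/// A-form extended opcode: bits 26-30, a 5-bit field just above Rc.
pub fn xo_a_form(word: u32) -> u32 {
    (word >> 1) & 0x1F
}

/// Record-form flag: bit 31, the least significant bit.
pub fn rc_bit(word: u32) -> bool {
    word & 1 != 0
}
```

For the `fabs` opcode word `0xfc000210`, `primary_opcode` gives 63 and `xo_x_form` gives 264, matching the list above.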

## Operands

| Field | Role | Description |
| --- | --- | --- |
| `FB` | fabsx: read | Source B floating-point register. |
| `FD` | fabsx: write | Destination floating-point register. |
| `CR` | fabsx: write (conditional) | Condition-register update. When `Rc=1`, CR1 is updated from `FPSCR[FX, FEX, VX, OX]`. |

## Register Effects

### `fabsx`

- **Reads (always):** `FB`
- **Reads (conditional):** _none_
- **Writes (always):** `FD`
- **Writes (conditional):** `CR`

## Status-Register Effects

- `fabsx`: **CR1** ← FPSCR[FX, FEX, VX, OX] when `Rc=1`.

## Operation (pseudocode)

```
FRT <- clear_sign(FRB)
```

## C Translation Example

```c
/* C translation: the xenia-rs interpreter arm below in */
/* Implementation References is the authoritative semantic */
/* snapshot. Translate it line-by-line: */
/* - ctx.gpr[N] -> r[N] (or f[]/v[] for FPRs/VRs) */
/* - mem.read_u* / write_u* -> mem_read_u*_be / mem_write_u*_be */
/* - ctx.update_cr_signed(fld, v) -> update_cr_signed(fld, v) */
/* - ctx.xer_ca / xer_ov / xer_so -> xer.CA / xer.OV / xer.SO */
/* The Register Effects and Status-Register Effects tables above */
/* enumerate every side effect a faithful translation must emit. */
```

## Implementation References

**`fabsx`**
- xenia-canary XML: [`tools/ppc-instructions.xml` — search for `mnem="fabsx"`](../../xenia-canary/tools/ppc-instructions.xml)
- xenia-canary emit: [`src/xenia/cpu/ppc/ppc_emit_fpu.cc:478`](../../xenia-canary/src/xenia/cpu/ppc/ppc_emit_fpu.cc#L478)
- xenia-rs opcode: [`crates/xenia-cpu/src/opcode.rs:27`](../../xenia-rs/crates/xenia-cpu/src/opcode.rs#L27)
- xenia-rs decoder: [`crates/xenia-cpu/src/decoder.rs:909`](../../xenia-rs/crates/xenia-cpu/src/decoder.rs#L909)
- xenia-rs interpreter: [`crates/xenia-cpu/src/interpreter.rs:2757-2761`](../../xenia-rs/crates/xenia-cpu/src/interpreter.rs#L2757-L2761)
<details><summary>xenia-rs interpreter body (frozen snapshot)</summary>

```rust
PpcOpcode::fabsx => {
    ctx.fpr[instr.rd()] = ctx.fpr[instr.rb()].abs();
    if instr.rc_bit() { update_cr1_from_fpscr(ctx); }
    ctx.pc += 4;
}
```
</details>

<!-- GENERATED: END -->

## Special Cases & Edge Conditions

- **Bit-pattern operation, no rounding.** `fabs` clears the sign bit (bit 0 in IBM numbering, i.e. the most significant bit) of the source FPR's binary64 image and copies the remaining 63 bits to the destination unchanged. No precision is lost and no FPSCR exception bits are set. The mnemonic has no `s` variant: one form is used whether the operand holds a binary32 or a binary64 value.
- **NaN handling.** `fabs(NaN)` returns the same NaN with the sign bit cleared. The signalling/quiet bit is **not** modified, and `FPSCR[VXSNAN]` is **not** raised. xenia-rs uses `f64::abs`, which matches: it is bit-level `x & 0x7FFF_FFFF_FFFF_FFFF`.
- **Special values.** `fabs(±0) = +0`; `fabs(±∞) = +∞`; `fabs(±NaN)` = `+NaN` (sign cleared, payload preserved).
- **FPSCR is untouched.** The architecture specifies that `fabs` does **not** update `FPRF` and raises no exception bits, which is why the Register Effects list above has no `FPSCR` write. With `Rc=1` the instruction only reads FPSCR in order to fill CR1.
- **`Rc=1` (`fabs.`)** copies `FPSCR[FX, FEX, VX, OX]` into CR1. Because `fabs` never changes those bits itself, the copy reflects whatever earlier FPU instructions left there (often zero).
- **No `FRA` operand.** X-form, primary 63, XO 264. Reads `FRB` only; bits 11–15 are don't-care.
- **Common idiom.** `fabs` followed by `fcmpu` against a small constant for "near zero" magnitude tests, or paired with `fneg`/`fnabs` to force a known sign.
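
The bit-level behaviour described in the bullets above can be sketched in a few lines of Rust. This is an illustration working on the raw 64-bit register image, not the xenia-rs code; `fabs_bits` and `fabs` are hypothetical names:

```rust
const SIGN_BIT: u64 = 0x8000_0000_0000_0000;

/// `fabs` on the raw 64-bit FPR image: clear the sign, keep everything else.
/// Working on bits keeps NaN payloads and the signalling/quiet bit intact.
pub fn fabs_bits(fpr: u64) -> u64 {
    fpr & !SIGN_BIT
}

/// Convenience wrapper for callers that hold an ordinary f64.
pub fn fabs(x: f64) -> f64 {
    f64::from_bits(fabs_bits(x.to_bits()))
}
```

Applied to the negative signalling-NaN image `0xFFF0_0000_0000_0001`, `fabs_bits` returns `0x7FF0_0000_0000_0001`: only the sign changes.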

## Related Instructions

- [`fnegx`](fnegx.md) — flip sign bit.
- [`fnabsx`](fnabsx.md) — absolute value with sign **set** (always negative result).
- [`fmrx`](fmrx.md) — copy FPR (no sign manipulation).
- [`fselx`](fselx.md) — branch-free select; combined with `fabs` for `min`/`max`/`clamp` patterns.
- [`fcmpu`](fcmpu.md), [`fcmpo`](fcmpo.md) — compares often paired with `fabs` for magnitude tests.

## IBM Reference

- [AIX 7.3 — `fabs` (Floating Absolute Value)](https://www.ibm.com/docs/en/aix/7.3.0?topic=set-fabs-floating-absolute-value-instruction)
- [PowerISA v2.07B, Book I, Chapter 4 — Floating-Point Processor](https://openpowerfoundation.org/specifications/isa/) (sign-bit manipulation defined as bit-pattern, not arithmetic).
130
migration/project-root/ppc-manual/fpu/faddsx.md
Normal file
@@ -0,0 +1,130 @@
# `faddsx` — Floating Add Single

> **Category:** [Floating-Point](../categories/fpu.md) · **Form:** [A](../forms/A.md) · **Opcode:** `0xec00002a`

<!-- GENERATED: BEGIN -->

## Assembler Mnemonics

| Mnemonic | XML entry | Flags | Description |
| --- | --- | --- | --- |
| `fadds` | `faddsx` | — | Floating Add Single |
| `fadds.` | `faddsx` | Rc=1 | Floating Add Single |

## Syntax

```asm
fadds[Rc] [FD], [FA], [FB]
```

## Encoding

### `faddsx` — form `A`

- **Opcode word:** `0xec00002a`
- **Primary opcode (bits 0–5):** `59`
- **Extended opcode:** `21`
- **Synchronising:** no

| Bits | Field | Meaning |
| --- | --- | --- |
| 0–5 | `OPCD` | primary opcode (59 or 63) |
| 6–10 | `FRT` | destination FPR |
| 11–15 | `FRA` | source A FPR |
| 16–20 | `FRB` | source B FPR |
| 21–25 | `FRC` | source C FPR (multiplier for madd-style ops) |
| 26–30 | `XO` | extended opcode (5 bits) |
| 31 | `Rc` | record-form flag (updates CR1) |

## Operands

| Field | Role | Description |
| --- | --- | --- |
| `FA` | faddsx: read | Source A floating-point register (`fr0`–`fr31`). |
| `FB` | faddsx: read | Source B floating-point register. |
| `FD` | faddsx: write | Destination floating-point register. |
| `CR` | faddsx: write (conditional) | Condition-register update. When `Rc=1`, CR1 is updated from `FPSCR[FX, FEX, VX, OX]`. |
| `FPSCR` | faddsx: write | Floating-Point Status and Control Register. |

## Register Effects

### `faddsx`

- **Reads (always):** `FA`, `FB`
- **Reads (conditional):** _none_
- **Writes (always):** `FD`, `FPSCR`
- **Writes (conditional):** `CR`

## Status-Register Effects

- `faddsx`: **CR1** ← FPSCR[FX, FEX, VX, OX] when `Rc=1`; **FPSCR** updated per IEEE-754 flags (FX, FEX, FPRF, FR, FI, exceptions).
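
The `Rc=1` copy amounts to moving the top nibble of FPSCR into CR field 1. A minimal sketch, assuming both registers are held as plain `u32` images with IBM bit 0 in the most significant position; `cr1_from_fpscr` is a hypothetical name, and the real `update_cr1_from_fpscr` in xenia-rs is not shown in this document and may store CR differently:

```rust
/// Copy FPSCR[FX, FEX, VX, OX] (IBM bits 0-3) into CR field 1 (CR bits 4-7).
pub fn cr1_from_fpscr(cr: u32, fpscr: u32) -> u32 {
    let summary = fpscr >> 28; // FX, FEX, VX, OX as a 4-bit value
    (cr & !0x0F00_0000) | (summary << 24) // replace only the CR1 nibble
}
```

All other CR fields pass through unchanged, so the helper can be applied to a live CR image.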

## Operation (pseudocode)

```
FRT <- RoundToSingle(FRA + FRB) ; single-precision
```

## C Translation Example

```c
/* C translation: the xenia-rs interpreter arm below in */
/* Implementation References is the authoritative semantic */
/* snapshot. Translate it line-by-line: */
/* - ctx.gpr[N] -> r[N] (or f[]/v[] for FPRs/VRs) */
/* - mem.read_u* / write_u* -> mem_read_u*_be / mem_write_u*_be */
/* - ctx.update_cr_signed(fld, v) -> update_cr_signed(fld, v) */
/* - ctx.xer_ca / xer_ov / xer_so -> xer.CA / xer.OV / xer.SO */
/* The Register Effects and Status-Register Effects tables above */
/* enumerate every side effect a faithful translation must emit. */
```

## Implementation References

**`faddsx`**
- xenia-canary XML: [`tools/ppc-instructions.xml` — search for `mnem="faddsx"`](../../xenia-canary/tools/ppc-instructions.xml)
- xenia-canary emit: [`src/xenia/cpu/ppc/ppc_emit_fpu.cc:46`](../../xenia-canary/src/xenia/cpu/ppc/ppc_emit_fpu.cc#L46)
- xenia-rs opcode: [`crates/xenia-cpu/src/opcode.rs:27`](../../xenia-rs/crates/xenia-cpu/src/opcode.rs#L27)
- xenia-rs decoder: [`crates/xenia-cpu/src/decoder.rs:388`](../../xenia-rs/crates/xenia-cpu/src/decoder.rs#L388)
- xenia-rs interpreter: [`crates/xenia-cpu/src/interpreter.rs:2565-2574`](../../xenia-rs/crates/xenia-cpu/src/interpreter.rs#L2565-L2574)
<details><summary>xenia-rs interpreter body (frozen snapshot)</summary>

```rust
PpcOpcode::faddsx => {
    let a = ctx.fpr[instr.ra()];
    let b = ctx.fpr[instr.rb()];
    fpscr::check_invalid_add(ctx, a, b, false);
    let result = to_single(ctx, a + b);
    ctx.fpr[instr.rd()] = result;
    fpscr::update_after_op(ctx, result, a.is_finite() && b.is_finite());
    if instr.rc_bit() { update_cr1_from_fpscr(ctx); }
    ctx.pc += 4;
}
```
</details>

<!-- GENERATED: END -->

## Special Cases & Edge Conditions

- **Single precision via double FPRs.** The trailing `s` in the mnemonic means the result is rounded to IEEE-754 binary32 after the addition, then re-encoded into the 64-bit FPR using the binary64 representation of that single-precision value. xenia-rs computes `to_single(a + b)`; both source operands are read as full binary64.
- **FPSCR side effects.** Hardware updates `FPRF` (result class), `FR`/`FI` (rounding info), `FX`, and the exception bits: `OX` on overflow, `UX` on underflow, `XX` on inexact, `VXISI` when adding infinities of opposite sign, `VXSNAN` on a signalling-NaN input. The frozen xenia-rs snapshot above models at least part of this through `fpscr::check_invalid_add` and `fpscr::update_after_op`. Those helpers are not reproduced here, so check them before depending on a specific FPSCR bit being observable across instructions.
- **`Rc=1` (`fadds.`)** copies `FPSCR[FX, FEX, VX, OX]` into CR1. xenia models this via `update_cr1_from_fpscr()`.
- **NaN propagation.** Any NaN input yields a quiet NaN result; signalling NaNs are quietened (the quiet bit is set) per PowerISA. Host-native `f64 +` may not perform that quietening on every platform.
- **Infinities of opposite sign.** `(+∞) + (−∞)` is an invalid operation even though the instruction is an add: it sets `FPSCR[VXISI]` and, with the invalid-operation exception disabled, returns a quiet NaN.
- **`FPSCR[NI]` (non-IEEE / flush-to-zero)** is set at Xenon boot, so denormal results normally flush to zero. Xenia inherits host semantics, which are IEEE-compliant by default; titles tuned around flush-to-zero may see slightly different denormal results under xenia.
- **Rounding mode** uses `FPSCR[RN]` (00 nearest-even, 01 toward 0, 10 toward +∞, 11 toward −∞). Default is nearest-even and is rarely changed.
- **A-form encoding ignores `FRC`.** Bits 21–25 are don't-care for the add family.
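
The "round to binary32, store as binary64" step from the first bullet can be illustrated in Rust. This is a sketch only: it assumes a plain `as f32` cast stands in for the `to_single` helper (whose body is not shown in this document), always rounds to nearest-even, and updates no FPSCR bits. `fadds_sketch` is a hypothetical name:

```rust
/// Add two binary64 values, round the sum to binary32 precision, and
/// widen it back to binary64 (the FPR storage format).
/// Assumption: `as f32` replaces the real rounding helper; it ignores
/// FPSCR[RN] and sets no exception flags.
pub fn fadds_sketch(a: f64, b: f64) -> f64 {
    ((a + b) as f32) as f64
}
```

The returned value is always exactly representable as an `f32`, so casting it to `f32` and back leaves it unchanged. A sum beyond the binary32 range, such as `3.0e38 + 3.0e38`, becomes `+∞`.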

## Related Instructions

- [`faddx`](faddx.md) — double-precision sibling.
- [`fsubsx`](fsubsx.md), [`fmulsx`](fmulsx.md), [`fdivsx`](fdivsx.md) — other single-precision arithmetic ops.
- [`fmaddsx`](fmaddsx.md), [`fmsubsx`](fmsubsx.md), [`fnmaddsx`](fnmaddsx.md), [`fnmsubsx`](fnmsubsx.md) — fused multiply-add single-precision family (single rounding step).
- [`frspx`](frspx.md) — explicit double→single rounding helper; `fadds` is essentially `frsp(fadd)` fused into one rounding.
- [`mffsx`](mffsx.md), [`mtfsfx`](mtfsfx.md) — read/write FPSCR for rounding-mode and exception control.

## IBM Reference

- [AIX 7.3 — `fadds` (Floating Add Single)](https://www.ibm.com/docs/en/aix/7.3.0?topic=set-fadds-floating-add-single-instruction)
- [PowerISA v2.07B, Book I, Chapter 4 — Floating-Point Processor](https://openpowerfoundation.org/specifications/isa/) (single-precision rounding rules and FPSCR side effects).
143
migration/project-root/ppc-manual/fpu/faddx.md
Normal file
@@ -0,0 +1,143 @@
# `faddx` — Floating Add

> **Category:** [Floating-Point](../categories/fpu.md) · **Form:** [A](../forms/A.md) · **Opcode:** `0xfc00002a`

<!-- GENERATED: BEGIN -->

## Assembler Mnemonics

| Mnemonic | XML entry | Flags | Description |
| --- | --- | --- | --- |
| `fadd` | `faddx` | — | Floating Add |
| `fadd.` | `faddx` | Rc=1 | Floating Add |

## Syntax

```asm
fadd[Rc] [FD], [FA], [FB]
```

## Encoding

### `faddx` — form `A`

- **Opcode word:** `0xfc00002a`
- **Primary opcode (bits 0–5):** `63`
- **Extended opcode:** `21`
- **Synchronising:** no

| Bits | Field | Meaning |
| --- | --- | --- |
| 0–5 | `OPCD` | primary opcode (59 or 63) |
| 6–10 | `FRT` | destination FPR |
| 11–15 | `FRA` | source A FPR |
| 16–20 | `FRB` | source B FPR |
| 21–25 | `FRC` | source C FPR (multiplier for madd-style ops) |
| 26–30 | `XO` | extended opcode (5 bits) |
| 31 | `Rc` | record-form flag (updates CR1) |

## Operands

| Field | Role | Description |
| --- | --- | --- |
| `FA` | faddx: read | Source A floating-point register (`fr0`–`fr31`). |
| `FB` | faddx: read | Source B floating-point register. |
| `FD` | faddx: write | Destination floating-point register. |
| `CR` | faddx: write (conditional) | Condition-register update. When `Rc=1`, CR1 is updated from `FPSCR[FX, FEX, VX, OX]`. |
| `FPSCR` | faddx: write | Floating-Point Status and Control Register. |

## Register Effects

### `faddx`

- **Reads (always):** `FA`, `FB`
- **Reads (conditional):** _none_
- **Writes (always):** `FD`, `FPSCR`
- **Writes (conditional):** `CR`

## Status-Register Effects

- `faddx`: **CR1** ← FPSCR[FX, FEX, VX, OX] when `Rc=1`; **FPSCR** updated per IEEE-754 flags (FX, FEX, FPRF, FR, FI, exceptions).

## Operation (pseudocode)

```
FRT <- FRA + FRB ; double-precision
```

## C Translation Example

```c
/* fadd / fadd. — IEEE-754 double-precision add (A-form) */
f[insn.FRT] = f[insn.FRA] + f[insn.FRB];
if (insn.Rc) update_cr1_from_fpscr();
/* FPSCR[FPRF, FR, FI, FX, exceptions] implicitly updated by the FPU. */
```

## Implementation References

**`faddx`**
- xenia-canary XML: [`tools/ppc-instructions.xml` — search for `mnem="faddx"`](../../xenia-canary/tools/ppc-instructions.xml)
- xenia-canary emit: [`src/xenia/cpu/ppc/ppc_emit_fpu.cc:38`](../../xenia-canary/src/xenia/cpu/ppc/ppc_emit_fpu.cc#L38)
- xenia-rs opcode: [`crates/xenia-cpu/src/opcode.rs:27`](../../xenia-rs/crates/xenia-cpu/src/opcode.rs#L27)
- xenia-rs decoder: [`crates/xenia-cpu/src/decoder.rs:922`](../../xenia-rs/crates/xenia-cpu/src/decoder.rs#L922)
- xenia-rs interpreter: [`crates/xenia-cpu/src/interpreter.rs:2555-2564`](../../xenia-rs/crates/xenia-cpu/src/interpreter.rs#L2555-L2564)
<details><summary>xenia-rs interpreter body (frozen snapshot)</summary>

```rust
PpcOpcode::faddx => {
    let a = ctx.fpr[instr.ra()];
    let b = ctx.fpr[instr.rb()];
    fpscr::check_invalid_add(ctx, a, b, false);
    let result = a + b;
    ctx.fpr[instr.rd()] = result;
    fpscr::update_after_op(ctx, result, a.is_finite() && b.is_finite());
    if instr.rc_bit() { update_cr1_from_fpscr(ctx); }
    ctx.pc += 4;
}
```
</details>

<!-- GENERATED: END -->

## Extended Pseudocode

```
FRT <- round(FRA + FRB, FPSCR[RN])   ; double precision, current rounding mode

; FPSCR side-effects (always)
FPSCR[FPRF]  <- classify(FRT)         ; sign / class bits
FPSCR[FR,FI] <- round_info
if overflow       then FPSCR[OX]     <- 1; FPSCR[FX] <- 1
if underflow      then FPSCR[UX]     <- 1; FPSCR[FX] <- 1
if inexact        then FPSCR[XX]     <- 1; FPSCR[FX] <- 1
if SNaN input     then FPSCR[VXSNAN] <- 1; FPSCR[FX] <- 1
if (+∞) + (−∞)    then FPSCR[VXISI]  <- 1; FPSCR[FX] <- 1
FPSCR[FEX]   <- any-enabled-exception

if Rc then
    CR1 <- FPSCR[FX, FEX, VX, OX]     ; the four "summary" bits
```

## Special Cases & Edge Conditions

- **Double precision.** `fadd` always operates on IEEE-754 binary64 regardless of whether either source was produced by a single-precision instruction. Single-precision adds use [`faddsx`](faddsx.md) and automatically round the result to binary32 precision.
- **No immediate / carry / OE.** FPU arithmetic has no immediate forms, no carry, and no overflow-enable bit. `Rc` is the only modifier: it writes `CR1` from the four top FPSCR bits.
- **FPSCR is always updated.** Even the non-record form (`fadd`) updates `FPSCR[FPRF, FR, FI, FX, …]` as a side effect of execution. The frozen xenia-rs snapshot above models at least part of this through `fpscr::check_invalid_add` and `fpscr::update_after_op`; those helpers are not reproduced here, so a translation that observes FPSCR bits across a pair of FPU instructions should be checked against them. If your translator needs hardware-compatible FPSCR state, emit explicit updates. Otherwise the simplification is usually acceptable, since Xbox 360 titles rarely read FPSCR except via `mffs` for exception sanity checks.
- **NaN propagation.** Per IEEE-754, any NaN input produces a NaN output; PowerPC specifies that a signalling input is quietened, so the result is always a quiet NaN. Xenia uses host-native `f64 +`, which may not quieten on some platforms, so assume quietening for correctness.
- **Adding infinities of opposite sign is an invalid operation.** `(+∞) + (−∞)` produces a quiet NaN and sets `FPSCR[VXISI]`. Xenia emits the host-native NaN.
- **Denormal handling.** Xenon's default mode flushes denormal results to zero (FPSCR[NI] / "non-IEEE mode" bit set at boot). Xenia inherits host semantics, which keep denormals by default, so it matches hardware exactly only when title code explicitly clears NI (rare).
- **Rounding mode.** `FPSCR[RN]` selects one of four rounding modes (nearest-even, toward 0, toward +∞, toward −∞). Games rarely change RN from the default nearest-even. If your translator needs faithful rounding-mode support, emit `fesetround` around the operation.
- **Register encoding.** A-form: `FRT`, `FRA`, `FRB`, `FRC`, `Rc`, but `fadd` ignores `FRC` (the "C" multiplier operand used by `fmadd`-style ops). The `FRC` field is architecturally don't-care but typically encoded as 0.
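
The NaN and infinity rules in the bullets above come down to a few bit tests on the two operands. The following Rust sketch shows one way to classify them on raw binary64 images. It is an assumption about what a check such as `fpscr::check_invalid_add` needs to decide, not a copy of that helper; all names here are hypothetical:

```rust
const SIGN_BIT: u64 = 0x8000_0000_0000_0000;
const EXP_MASK: u64 = 0x7FF0_0000_0000_0000;
const FRAC_MASK: u64 = 0x000F_FFFF_FFFF_FFFF;
const QUIET_BIT: u64 = 0x0008_0000_0000_0000;

fn is_nan(bits: u64) -> bool {
    (bits & EXP_MASK) == EXP_MASK && (bits & FRAC_MASK) != 0
}

fn is_inf(bits: u64) -> bool {
    (bits & !SIGN_BIT) == EXP_MASK
}

/// A signalling NaN is a NaN whose quiet bit (top fraction bit) is clear.
pub fn is_snan(bits: u64) -> bool {
    is_nan(bits) && (bits & QUIET_BIT) == 0
}

/// Quieten a NaN by setting the quiet bit; other values pass through.
pub fn quieten(bits: u64) -> u64 {
    if is_nan(bits) { bits | QUIET_BIT } else { bits }
}

/// Invalid-operation causes for an add: (VXSNAN, VXISI).
pub fn invalid_add_flags(a: u64, b: u64) -> (bool, bool) {
    let vxsnan = is_snan(a) || is_snan(b);
    // VXISI: both operands infinite with opposite signs.
    let vxisi = is_inf(a) && is_inf(b) && ((a ^ b) & SIGN_BIT) != 0;
    (vxsnan, vxisi)
}
```

Working on `u64` images rather than `f64` values avoids relying on how the host passes signalling NaNs around.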

## Related Instructions

- [`faddsx`](faddsx.md) — single-precision add; result is rounded to binary32 then stored as binary64.
- [`fsubx`](fsubx.md), [`fsubsx`](fsubsx.md) — double / single subtract.
- [`fmulx`](fmulx.md), [`fmulsx`](fmulsx.md) — double / single multiply.
- [`fmaddx`](fmaddx.md), [`fmsubx`](fmsubx.md), [`fnmaddx`](fnmaddx.md), [`fnmsubx`](fnmsubx.md) — fused multiply-add family (single-rounding; preferred for dot products).
- [`mffsx`](mffsx.md), [`mtfsfx`](mtfsfx.md) — read/write FPSCR.

## IBM Reference

- [AIX 7.3 — `fadd` (Floating Add)](https://www.ibm.com/docs/en/aix/7.3.0?topic=set-fa-fadd-floating-add-instruction)
- [PowerISA v2.07B, Book I, Chapter 4 — Floating-Point Processor](https://openpowerfoundation.org/specifications/isa/) (complete FPSCR and NaN-propagation rules).
142
migration/project-root/ppc-manual/fpu/fcfidx.md
Normal file
@@ -0,0 +1,142 @@
# `fcfidx` — Floating Convert From Integer Doubleword

> **Category:** [Floating-Point](../categories/fpu.md) · **Form:** [X](../forms/X.md) · **Opcode:** `0xfc00069c`

<!-- GENERATED: BEGIN -->

## Assembler Mnemonics

| Mnemonic | XML entry | Flags | Description |
| --- | --- | --- | --- |
| `fcfid` | `fcfidx` | — | Floating Convert From Integer Doubleword |
| `fcfid.` | `fcfidx` | Rc=1 | Floating Convert From Integer Doubleword |

## Syntax

```asm
fcfid[Rc] [FD], [FB]
```

## Encoding

### `fcfidx` — form `X`

- **Opcode word:** `0xfc00069c`
- **Primary opcode (bits 0–5):** `63`
- **Extended opcode:** `846`
- **Synchronising:** no

| Bits | Field | Meaning |
| --- | --- | --- |
| 0–5 | `OPCD` | primary opcode |
| 6–10 | `RT/FRT/VRT` | destination |
| 11–15 | `RA/FRA/VRA` | source A |
| 16–20 | `RB/FRB/VRB` | source B |
| 21–30 | `XO` | extended opcode (10 bits) |
| 31 | `Rc` | record-form flag |

## Operands

| Field | Role | Description |
| --- | --- | --- |
| `FB` | fcfidx: read | Source B floating-point register. |
| `FD` | fcfidx: write | Destination floating-point register. |
| `CR` | fcfidx: write (conditional) | Condition-register update. When `Rc=1`, CR1 is updated from `FPSCR[FX, FEX, VX, OX]`. |
| `FPSCR` | fcfidx: write | Floating-Point Status and Control Register. |

## Register Effects

### `fcfidx`

- **Reads (always):** `FB`
- **Reads (conditional):** _none_
- **Writes (always):** `FD`, `FPSCR`
- **Writes (conditional):** `CR`

## Status-Register Effects

- `fcfidx`: **CR1** ← FPSCR[FX, FEX, VX, OX] when `Rc=1`; **FPSCR** updated per IEEE-754 flags (FX, FEX, FPRF, FR, FI, exceptions).

## Operation (pseudocode)

```
; Derived from the xenia-rs interpreter arm (see Implementation References).
FRT <- ConvertSInt64ToDouble(FRB[0:63], FPSCR[RN])   ; FRB bits read as a signed 64-bit integer
FPSCR[FPRF] <- classify(FRT)
if inexact then FPSCR[XX] <- 1; FPSCR[FX] <- 1
if Rc then CR1 <- FPSCR[FX, FEX, VX, OX]
; Consult the IBM AIX reference link under IBM Reference for
; canonical PPC-style pseudocode.
```

## C Translation Example

```c
/* C translation: the xenia-rs interpreter arm below in */
/* Implementation References is the authoritative semantic */
/* snapshot. Translate it line-by-line: */
/* - ctx.gpr[N] -> r[N] (or f[]/v[] for FPRs/VRs) */
/* - mem.read_u* / write_u* -> mem_read_u*_be / mem_write_u*_be */
/* - ctx.update_cr_signed(fld, v) -> update_cr_signed(fld, v) */
/* - ctx.xer_ca / xer_ov / xer_so -> xer.CA / xer.OV / xer.SO */
/* The Register Effects and Status-Register Effects tables above */
/* enumerate every side effect a faithful translation must emit. */
```

## Implementation References

**`fcfidx`**
- xenia-canary XML: [`tools/ppc-instructions.xml` — search for `mnem="fcfidx"`](../../xenia-canary/tools/ppc-instructions.xml)
- xenia-canary emit: [`src/xenia/cpu/ppc/ppc_emit_fpu.cc:253`](../../xenia-canary/src/xenia/cpu/ppc/ppc_emit_fpu.cc#L253)
- xenia-rs opcode: [`crates/xenia-cpu/src/opcode.rs:27`](../../xenia-rs/crates/xenia-cpu/src/opcode.rs#L27)
- xenia-rs decoder: [`crates/xenia-cpu/src/decoder.rs:914`](../../xenia-rs/crates/xenia-cpu/src/decoder.rs#L914)
- xenia-rs interpreter: [`crates/xenia-cpu/src/interpreter.rs:2872-2885`](../../xenia-rs/crates/xenia-cpu/src/interpreter.rs#L2872-L2885)
<details><summary>xenia-rs interpreter body (frozen snapshot)</summary>

```rust
PpcOpcode::fcfidx => {
    // Convert from integer doubleword: frD = (double)(int64_t)frB_as_bits.
    // PPCBUG-224: set XX when |i64| > 2^53 (precision loss in conversion).
    let bits = ctx.fpr[instr.rb()].to_bits();
    let i = bits as i64;
    let result = i as f64;
    if (result as i64) != i {
        fpscr::set_exception(ctx, fpscr::XX);
    }
    ctx.fpr[instr.rd()] = result;
    fpscr::set_fprf(ctx, fpscr::classify_fprf(result));
    if instr.rc_bit() { update_cr1_from_fpscr(ctx); }
    ctx.pc += 4;
}
```
</details>

<!-- GENERATED: END -->

## Special Cases & Edge Conditions

- **64-bit signed integer → binary64.** Reads `FRB` as a 64-bit signed integer (the bits, interpreted as `i64`) and converts it to IEEE-754 binary64. xenia-rs implements this as `bits as i64 as f64`.
- **Loss of precision.** binary64 has 53 bits of significand, so `i64` values with magnitude > 2^53 can lose low-order bits. On hardware an inexact conversion raises `FPSCR[XX, FX]`. The frozen snapshot above sets `XX` through `fpscr::set_exception` when the round trip `result as i64` differs from the input; the rounded value follows Rust's `as f64` cast (round-to-nearest-even).
- **Always exact for `|x| <= 2^53`.** Within ±9,007,199,254,740,992 the conversion is bit-exact.
- **Rounding mode.** Hardware uses `FPSCR[RN]`, which defaults to nearest-even. Rust's `i64 as f64` cast always rounds to nearest with ties to even, so the snapshot matches hardware only while `RN` is left at its default.
- **No NaN/∞ generation.** All `i64` inputs map to finite `f64` outputs (the largest `i64` is well below `f64::MAX`).
- **FPSCR side effects.** Hardware updates `FPRF` (result class) and may set `XX`/`FX` on inexact. The snapshot mirrors both through `fpscr::set_fprf` and `fpscr::set_exception`.
- **`Rc=1` (`fcfid.`)** copies `FPSCR[FX, FEX, VX, OX]` into CR1.
- **Encoding.** X-form, primary 63, XO 846. Reads `FRB` only.
- **Common pairing.** Used after `lfd` of a stored `i64` to bring an integer into the FP pipeline for arithmetic; the inverse direction is [`fctidx`](fctidx.md) / [`fctidzx`](fctidzx.md).
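
The conversion and its inexact test can be sketched as a standalone Rust function. This is an illustration, not the xenia-rs code: `fcfid_sketch` is a hypothetical name, and the comparison is done in `i128` on purpose. Rust's float-to-integer `as` cast saturates, so a round trip through `i64` compares equal for `i64::MAX` even though that input rounds up to 2^63; widening to `i128` avoids that corner.

```rust
/// Convert the raw FPR image, read as a signed 64-bit integer, to binary64.
/// Returns the converted value and whether the conversion was inexact.
pub fn fcfid_sketch(fpr_bits: u64) -> (f64, bool) {
    let i = fpr_bits as i64;
    let result = i as f64; // rounds to nearest, ties to even
    // Compare in i128 so a result of 2^63 is not clamped back to i64::MAX.
    let inexact = (result as i128) != (i as i128);
    (result, inexact)
}
```

For example, `2^53 + 1` converts to `2^53` and is reported as inexact, while `i64::MIN` (exactly −2^63) converts exactly.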

## Related Instructions

- [`fctidx`](fctidx.md), [`fctidzx`](fctidzx.md) — inverse direction (binary64 → 64-bit integer, current rounding / round-toward-zero).
- [`fctiwx`](fctiwx.md), [`fctiwzx`](fctiwzx.md) — 32-bit integer conversion variants.
- [`frspx`](frspx.md) — round to single precision; commonly chained after `fcfid` to produce a `float`.
- `lfd`, `stfd` — load/store doubleword used to move integer values between GPR and FPR via memory.
- [`mffsx`](mffsx.md), [`mtfsfx`](mtfsfx.md) — control rounding mode used by the conversion.

## IBM Reference

- [AIX 7.3 — `fcfid` (Floating Convert From Integer Doubleword)](https://www.ibm.com/docs/en/aix/7.3.0?topic=set-fcfid-floating-convert-from-integer-doubleword-instruction)
- [PowerISA v2.07B, Book I, Chapter 4 — Floating-Point Processor](https://openpowerfoundation.org/specifications/isa/) (integer→FP conversion semantics).
164
migration/project-root/ppc-manual/fpu/fcmpo.md
Normal file
@@ -0,0 +1,164 @@
# `fcmpo` — Floating Compare Ordered

> **Category:** [Floating-Point](../categories/fpu.md) · **Form:** [X](../forms/X.md) · **Opcode:** `0xfc000040`

<!-- GENERATED: BEGIN -->

## Assembler Mnemonics

| Mnemonic | XML entry | Flags | Description |
| --- | --- | --- | --- |
| `fcmpo` | `fcmpo` | — | Floating Compare Ordered |

## Syntax

```asm
fcmpo [CRFD], [FA], [FB]
```

## Encoding

### `fcmpo` — form `X`

- **Opcode word:** `0xfc000040`
- **Primary opcode (bits 0–5):** `63`
- **Extended opcode:** `32`
- **Synchronising:** no

| Bits | Field | Meaning |
| --- | --- | --- |
| 0–5 | `OPCD` | primary opcode |
| 6–8 | `CRFD` | destination CR field |
| 9–10 | (reserved) | not used by `fcmpo` |
| 11–15 | `FRA` | source A FPR |
| 16–20 | `FRB` | source B FPR |
| 21–30 | `XO` | extended opcode (10 bits) |
| 31 | (reserved) | `fcmpo` has no record form |

## Operands

| Field | Role | Description |
| --- | --- | --- |
| `FA` | fcmpo: read | Source A floating-point register (`fr0`–`fr31`). |
| `FB` | fcmpo: read | Source B floating-point register. |
| `CRFD` | fcmpo: write | CR destination field (`crf`, 0–7). |
| `FPSCR` | fcmpo: write | Floating-Point Status and Control Register. |

## Register Effects

### `fcmpo`

- **Reads (always):** `FA`, `FB`
- **Reads (conditional):** _none_
- **Writes (always):** `CRFD`, `FPSCR`
- **Writes (conditional):** _none_

## Status-Register Effects

- `fcmpo`: **CR[CRFD]** and **FPSCR[FPCC]** ← compare result (LT, GT, EQ, UN); **FPSCR[VXSNAN]** / **FPSCR[VXVC]** (and FX) set when an operand is a NaN.

## Operation (pseudocode)

```
; Architectural semantics; see Implementation References for the
; xenia-rs interpreter arm.
if FRA is NaN or FRB is NaN then c <- 0b0001    ; unordered (UN)
else if FRA < FRB           then c <- 0b1000    ; LT
else if FRA > FRB           then c <- 0b0100    ; GT
else                             c <- 0b0010    ; EQ
FPSCR[FPCC] <- c
CR[CRFD]    <- c
if FRA is SNaN or FRB is SNaN then
    FPSCR[VXSNAN] <- 1
    if FPSCR[VE] = 0 then FPSCR[VXVC] <- 1
else if FRA is QNaN or FRB is QNaN then
    FPSCR[VXVC] <- 1
; Consult the IBM AIX reference link under IBM Reference for
; canonical PPC-style pseudocode.
```
|
||||
|
||||
## C Translation Example
|
||||
|
||||
```c
|
||||
/* C translation: the xenia-rs interpreter arm below in */
|
||||
/* Implementation References is the authoritative semantic */
|
||||
/* snapshot. Translate it line-by-line: */
|
||||
/* - ctx.gpr[N] -> r[N] (or f[]/v[] for FPRs/VRs) */
|
||||
/* - mem.read_u* / write_u* -> mem_read_u*_be / mem_write_u*_be */
|
||||
/* - ctx.update_cr_signed(fld, v) -> update_cr_signed(fld, v) */
|
||||
/* - ctx.xer_ca / xer_ov / xer_so -> xer.CA / xer.OV / xer.SO */
|
||||
/* The Register Effects and Status-Register Effects tables above */
|
||||
/* enumerate every side effect a faithful translation must emit. */
|
||||
```
|
||||
|
||||
## Implementation References
|
||||
|
||||
**`fcmpo`**
|
||||
- xenia-canary XML: [`tools/ppc-instructions.xml` — search for `mnem="fcmpo"`](../../xenia-canary/tools/ppc-instructions.xml)
|
||||
- xenia-canary emit: [`src/xenia/cpu/ppc/ppc_emit_fpu.cc:362`](../../xenia-canary/src/xenia/cpu/ppc/ppc_emit_fpu.cc#L362)
|
||||
- xenia-rs opcode: [`crates/xenia-cpu/src/opcode.rs:27`](../../xenia-rs/crates/xenia-cpu/src/opcode.rs#L27)
|
||||
- xenia-rs decoder: [`crates/xenia-cpu/src/decoder.rs:901`](../../xenia-rs/crates/xenia-cpu/src/decoder.rs#L901)
|
||||
- xenia-rs interpreter: [`crates/xenia-cpu/src/interpreter.rs:3002-3032`](../../xenia-rs/crates/xenia-cpu/src/interpreter.rs#L3002-L3032)
|
||||
<details><summary>xenia-rs interpreter body (frozen snapshot)</summary>
|
||||
|
||||
```rust
|
||||
PpcOpcode::fcmpo => {
|
||||
// Ordered compare: like fcmpu but also sets VXVC on QNaN (or VXSNAN on SNaN).
|
||||
let fra = ctx.fpr[instr.ra()];
|
||||
let frb = ctx.fpr[instr.rb()];
|
||||
let crfd = instr.crfd();
|
||||
if fra.is_nan() || frb.is_nan() {
|
||||
ctx.cr[crfd] = crate::context::CrField { lt: false, gt: false, eq: false, so: true };
|
||||
if fpscr::is_snan(fra) || fpscr::is_snan(frb) {
|
||||
fpscr::set_exception(ctx, fpscr::VXSNAN | fpscr::VXVC);
|
||||
} else {
|
||||
fpscr::set_exception(ctx, fpscr::VXVC);
|
||||
}
|
||||
} else if fra < frb {
|
||||
ctx.cr[crfd] = crate::context::CrField { lt: true, gt: false, eq: false, so: false };
|
||||
} else if fra > frb {
|
||||
ctx.cr[crfd] = crate::context::CrField { lt: false, gt: true, eq: false, so: false };
|
||||
} else {
|
||||
ctx.cr[crfd] = crate::context::CrField { lt: false, gt: false, eq: true, so: false };
|
||||
}
|
||||
let fprf = if fra.is_nan() || frb.is_nan() {
|
||||
0b0_0001
|
||||
} else if fra < frb {
|
||||
0b0_1000
|
||||
} else if fra > frb {
|
||||
0b0_0100
|
||||
} else {
|
||||
0b0_0010
|
||||
};
|
||||
fpscr::set_fprf(ctx, fprf);
|
||||
ctx.pc += 4;
|
||||
}
|
||||
```
|
||||
</details>
|
||||
|
||||
<!-- GENERATED: END -->
|
||||
|
||||
## Special Cases & Edge Conditions
|
||||
|
||||
- **Ordered compare.** Same CR-field semantics as `fcmpu` (`LT/GT/EQ/SO`), but NaN inputs raise additional FPSCR exceptions:
  - Either operand a quiet NaN, and neither a signalling NaN → `FPSCR[VXVC] = 1` (invalid-operation: invalid compare).
  - Either operand a signalling NaN → `FPSCR[VXSNAN] = 1`, plus `FPSCR[VXVC] = 1` when the invalid-operation exception is disabled (`FPSCR[VE] = 0`).
  - All NaN cases also set `FX = 1` and `VX = 1`.
- **xenia behaviour.** In the frozen snapshot above, xenia-rs's `fcmpo` differs from `fcmpu` only on the NaN path: it raises `VXVC` for a quiet NaN and `VXSNAN | VXVC` for a signalling NaN through `fpscr::set_exception`. **xenia quirk:** the snapshot does not consult `FPSCR[VE]`, so with `VE = 1` a signalling NaN still sets `VXVC`, which hardware would leave clear. Title code that polls FPSCR for compare-class invalid-operation can observe it, assuming `fpscr::set_exception` (body not shown here) records the bits.
|
||||
- **CR field bits.**
|
||||
- `LT` (bit 0) — `FRA < FRB`
|
||||
- `GT` (bit 1) — `FRA > FRB`
|
||||
- `EQ` (bit 2) — `FRA == FRB`
|
||||
- `SO` (bit 3) — unordered (NaN involved)
|
||||
- **`+0` and `-0` compare equal.**
|
||||
- **No `Rc` bit.**
|
||||
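- **FPSCR side effects.** Hardware updates `FPSCR[FPCC]` on every compare and, on NaN operands, `FX`, `VX`, and `VXVC`/`VXSNAN`. The xenia-rs snapshot mirrors the result into the FPSCR through `fpscr::set_fprf` and raises the NaN exception bits through `fpscr::set_exception`; whether `FX` and `VX` follow depends on that helper, whose body is not shown here.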
|
||||
- **Use case.** Ordered compares are required by C/C++ semantics for `<`, `>`, `<=`, `>=` (which must signal on NaN per IEEE-754). `fcmpu` corresponds to the C `==`/`!=` semantics (which do not signal).
|
||||
- **Encoding.** X-form, primary 63, XO 32.
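
The NaN rules for an ordered compare can be sketched as a small standalone model. This is a minimal illustration under stated assumptions, not the xenia-rs code: `fcmpo_model`, the `CR_*` masks, and the two flag constants are hypothetical names for this example, the flag bit positions are arbitrary rather than real FPSCR positions, and `ve` stands in for `FPSCR[VE]`.

```rust
/// CR field image packed into 4 bits, LT as the most significant bit.
pub const CR_LT: u8 = 0b1000;
pub const CR_GT: u8 = 0b0100;
pub const CR_EQ: u8 = 0b0010;
pub const CR_SO: u8 = 0b0001;

/// Hypothetical exception flags for this sketch only
/// (arbitrary bit positions, not the xenia-rs `fpscr` constants).
pub const VXSNAN: u32 = 1 << 0;
pub const VXVC: u32 = 1 << 1;

/// A binary64 NaN is signalling when its top fraction bit (bit 51) is clear.
pub fn is_snan(x: f64) -> bool {
    x.is_nan() && (x.to_bits() & (1u64 << 51)) == 0
}

/// Returns `(cr_field, exception_flags)` for an ordered compare.
/// `ve` models FPSCR[VE], the invalid-operation exception enable.
pub fn fcmpo_model(fra: f64, frb: f64, ve: bool) -> (u8, u32) {
    if fra.is_nan() || frb.is_nan() {
        let flags = if is_snan(fra) || is_snan(frb) {
            // SNaN: VXSNAN always; VXVC only when the exception is disabled.
            if ve { VXSNAN } else { VXSNAN | VXVC }
        } else {
            // Quiet NaN only: invalid compare.
            VXVC
        };
        (CR_SO, flags)
    } else if fra < frb {
        (CR_LT, 0)
    } else if fra > frb {
        (CR_GT, 0)
    } else {
        // Covers +0 == -0 as well.
        (CR_EQ, 0)
    }
}
```

Unlike the frozen snapshot, this model withholds `VXVC` for a signalling NaN when `ve` is true; PowerISA specifies that `VXVC` accompanies `VXSNAN` only when `VE = 0`.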
|
||||
|
||||
## Related Instructions
|
||||
|
||||
- [`fcmpu`](fcmpu.md): unordered compare; identical CR result, no `VXVC`.
|
||||
- `mcrf`, `mcrfs`, `mfcr` — fan-out CR fields after compare.
|
||||
- `bc`, `bclr`, `bcctr` — conditional branches consume `LT/GT/EQ/SO`.
|
||||
- [`fselx`](fselx.md) — branch-free alternative for single-key compares.
|
||||
- [`mcrfs`](mcrfs.md), [`mffsx`](mffsx.md) — move FPSCR/CR.
|
||||
|
||||
## IBM Reference
|
||||
|
||||
- [AIX 7.3 — `fcmpo` (Floating Compare Ordered)](https://www.ibm.com/docs/en/aix/7.3.0?topic=set-fcmpo-floating-compare-ordered-instruction)
|
||||
- [PowerISA v2.07B, Book I, Chapter 4 — Floating-Point Processor](https://openpowerfoundation.org/specifications/isa/) (`fcmpo` raises `VXVC` on QNaN; both raise `VXSNAN` on SNaN).
|
||||
161
migration/project-root/ppc-manual/fpu/fcmpu.md
Normal file
@@ -0,0 +1,161 @@
|
||||
# `fcmpu` — Floating Compare Unordered
|
||||
|
||||
> **Category:** [Floating-Point](../categories/fpu.md) · **Form:** [X](../forms/X.md) · **Opcode:** `0xfc000000`
|
||||
|
||||
<!-- GENERATED: BEGIN -->
|
||||
|
||||
## Assembler Mnemonics
|
||||
|
||||
| Mnemonic | XML entry | Flags | Description |
|
||||
| --- | --- | --- | --- |
|
||||
| `fcmpu` | `fcmpu` | — | Floating Compare Unordered |
|
||||
|
||||
## Syntax
|
||||
|
||||
```asm
|
||||
fcmpu [CRFD], [FA], [FB]
|
||||
```
|
||||
|
||||
## Encoding
|
||||
|
||||
### `fcmpu` — form `X`
|
||||
|
||||
- **Opcode word:** `0xfc000000`
|
||||
- **Primary opcode (bits 0–5):** `63`
|
||||
- **Extended opcode:** `0`
|
||||
- **Synchronising:** no
|
||||
|
||||
| Bits | Field | Meaning |
|
||||
| --- | --- | --- |
|
||||
| 0–5 | `OPCD` | primary opcode |
|
||||
| 6–10 | `RT/FRT/VRT` | destination |
|
||||
| 11–15 | `RA/FRA/VRA` | source A |
|
||||
| 16–20 | `RB/FRB/VRB` | source B |
|
||||
| 21–30 | `XO` | extended opcode (10 bits) |
|
||||
| 31 | `Rc` | record-form flag |
|
||||
|
||||
## Operands
|
||||
|
||||
| Field | Role | Description |
|
||||
| --- | --- | --- |
|
||||
| `FA` | fcmpu: read | Source A floating-point register (`fr0`–`fr31`). |
|
||||
| `FB` | fcmpu: read | Source B floating-point register. |
|
||||
| `CRFD` | fcmpu: write | CR destination field (`crf`, 0–7). |
|
||||
| `FPSCR` | fcmpu: write | Floating-Point Status and Control Register. |
|
||||
|
||||
## Register Effects
|
||||
|
||||
### `fcmpu`
|
||||
|
||||
- **Reads (always):** `FA`, `FB`
|
||||
- **Reads (conditional):** _none_
|
||||
- **Writes (always):** `CRFD`, `FPSCR`
|
||||
- **Writes (conditional):** _none_
|
||||
|
||||
## Status-Register Effects
|
||||
|
||||
- `fcmpu`: **FPSCR** updated: `FPCC` is set from the comparison result; a signalling NaN operand additionally sets `FX` and `VXSNAN`.
|
||||
|
||||
## Operation (pseudocode)
|
||||
|
||||
```
|
||||
; Pseudocode derives directly from the xenia-rs interpreter
|
||||
; arm (see Implementation References). Operation semantics:
|
||||
; - Read source operands from the fields listed under Operands.
|
||||
; - Apply the arithmetic / logical / memory action described
|
||||
; in the Description field above.
|
||||
; - Write results to the destination register(s); update any
|
||||
; status bits enumerated under Status-Register Effects.
|
||||
; Consult the IBM AIX reference link under IBM Reference for
|
||||
; canonical PPC-style pseudocode where xenia's expression is
|
||||
; terse.
|
||||
```
|
||||
|
||||
## C Translation Example
|
||||
|
||||
```c
|
||||
/* C translation: the xenia-rs interpreter arm below in */
|
||||
/* Implementation References is the authoritative semantic */
|
||||
/* snapshot. Translate it line-by-line: */
|
||||
/* - ctx.gpr[N] -> r[N] (or f[]/v[] for FPRs/VRs) */
|
||||
/* - mem.read_u* / write_u* -> mem_read_u*_be / mem_write_u*_be */
|
||||
/* - ctx.update_cr_signed(fld, v) -> update_cr_signed(fld, v) */
|
||||
/* - ctx.xer_ca / xer_ov / xer_so -> xer.CA / xer.OV / xer.SO */
|
||||
/* The Register Effects and Status-Register Effects tables above */
|
||||
/* enumerate every side effect a faithful translation must emit. */
|
||||
```
|
||||
|
||||
## Implementation References
|
||||
|
||||
**`fcmpu`**
|
||||
- xenia-canary XML: [`tools/ppc-instructions.xml` — search for `mnem="fcmpu"`](../../xenia-canary/tools/ppc-instructions.xml)
|
||||
- xenia-canary emit: [`src/xenia/cpu/ppc/ppc_emit_fpu.cc:365`](../../xenia-canary/src/xenia/cpu/ppc/ppc_emit_fpu.cc#L365)
|
||||
- xenia-rs opcode: [`crates/xenia-cpu/src/opcode.rs:27`](../../xenia-rs/crates/xenia-cpu/src/opcode.rs#L27)
|
||||
- xenia-rs decoder: [`crates/xenia-cpu/src/decoder.rs:897`](../../xenia-rs/crates/xenia-cpu/src/decoder.rs#L897)
|
||||
- xenia-rs interpreter: [`crates/xenia-cpu/src/interpreter.rs:2972-3001`](../../xenia-rs/crates/xenia-cpu/src/interpreter.rs#L2972-L3001)
|
||||
<details><summary>xenia-rs interpreter body (frozen snapshot)</summary>
|
||||
|
||||
```rust
|
||||
PpcOpcode::fcmpu => {
|
||||
let fra = ctx.fpr[instr.ra()];
|
||||
let frb = ctx.fpr[instr.rb()];
|
||||
let crfd = instr.crfd();
|
||||
if fra.is_nan() || frb.is_nan() {
|
||||
ctx.cr[crfd] = crate::context::CrField { lt: false, gt: false, eq: false, so: true };
|
||||
// fcmpu: VXSNAN on SNaN input; no VXVC even on QNaN.
|
||||
if fpscr::is_snan(fra) || fpscr::is_snan(frb) {
|
||||
fpscr::set_exception(ctx, fpscr::VXSNAN);
|
||||
}
|
||||
} else if fra < frb {
|
||||
ctx.cr[crfd] = crate::context::CrField { lt: true, gt: false, eq: false, so: false };
|
||||
} else if fra > frb {
|
||||
ctx.cr[crfd] = crate::context::CrField { lt: false, gt: true, eq: false, so: false };
|
||||
} else {
|
||||
ctx.cr[crfd] = crate::context::CrField { lt: false, gt: false, eq: true, so: false };
|
||||
}
|
||||
// Also mirror the comparison result into FPSCR[FPRF (FL/FG/FE/FU)].
|
||||
let fprf = if fra.is_nan() || frb.is_nan() {
|
||||
0b0_0001
|
||||
} else if fra < frb {
|
||||
0b0_1000
|
||||
} else if fra > frb {
|
||||
0b0_0100
|
||||
} else {
|
||||
0b0_0010
|
||||
};
|
||||
fpscr::set_fprf(ctx, fprf);
|
||||
ctx.pc += 4;
|
||||
}
|
||||
```
|
||||
</details>
|
||||
|
||||
<!-- GENERATED: END -->
|
||||
|
||||
## Special Cases & Edge Conditions
|
||||
|
||||
- **Unordered compare.** "Unordered" means quiet NaN inputs do **not** signal an invalid-operation exception; they only set the unordered (`SO`) bit in the destination CR field. A signalling NaN still sets `VXSNAN`. Use [`fcmpo`](fcmpo.md) when a quiet NaN should also raise `VXVC`.
|
||||
- **CR field bits.** Writes the 4-bit CR field selected by `BF` (`crfd`):
|
||||
- `LT` (bit 0) — `FRA < FRB`
|
||||
- `GT` (bit 1) — `FRA > FRB`
|
||||
- `EQ` (bit 2) — `FRA == FRB`
|
||||
- `SO` (bit 3) — **unordered** (one or both operands is NaN)
|
||||
- **NaN handling.** Either operand NaN → set `SO=1`, clear `LT/GT/EQ`. xenia-rs matches.
|
||||
- **Signalling NaN.** Per PowerISA, `fcmpu` sets `FPSCR[VXSNAN]` if either operand is a signalling NaN, but does **not** set `FPSCR[VXVC]` (the difference vs `fcmpo`). The frozen xenia-rs snapshot above matches: it calls `fpscr::set_exception(ctx, fpscr::VXSNAN)` only when `fpscr::is_snan` reports a signalling operand, so `fcmpu` and `fcmpo` are distinguishable through FPSCR in xenia-rs.
|
||||
- **`+0` and `-0` compare equal.** Standard IEEE rule; xenia's host `<` / `>` on `f64` matches.
|
||||
- **No `Rc` bit.** The CR field destination is encoded in the instruction (`BF`); there's no record-form variant.
|
||||
- **FPSCR side effects.** Hardware updates `FPSCR[FPCC]` (the four-bit floating-point condition code) and, on a signalling NaN, `FPSCR[FX]`. The xenia-rs snapshot mirrors the compare result into the FPSCR through `fpscr::set_fprf` (`FL/FG/FE/FU`).
|
||||
- **Precision-agnostic.** Compares the full binary64 values; works equally for single-precision values stored in FPRs (they are bit-identical to their double-precision representation).
|
||||
- **Encoding.** X-form, primary 63, XO 0. `BF` occupies bits 6–8; bits 9–10, the rest of the 5-bit destination slot, are reserved and must be 0.
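
The field layout can be checked with a short decoder sketch. `XFormCompare`, `decode_compare`, and `encode_compare` are hypothetical helpers written for this example, not xenia-rs APIs; the shifts assume PowerPC big-endian bit numbering, where bit 0 is the most significant bit of the 32-bit instruction word.

```rust
/// Fields of an X-form floating-point compare (bit 0 = most significant bit).
#[derive(Debug, PartialEq, Eq)]
pub struct XFormCompare {
    pub opcd: u32,     // bits 0-5, primary opcode
    pub bf: u32,       // bits 6-8, destination CR field
    pub reserved: u32, // bits 9-10, must be 0
    pub fra: u32,      // bits 11-15
    pub frb: u32,      // bits 16-20
    pub xo: u32,       // bits 21-30, extended opcode
}

/// Split an instruction word into its X-form compare fields.
pub fn decode_compare(word: u32) -> XFormCompare {
    XFormCompare {
        opcd: word >> 26,
        bf: (word >> 23) & 0x7,
        reserved: (word >> 21) & 0x3,
        fra: (word >> 16) & 0x1F,
        frb: (word >> 11) & 0x1F,
        xo: (word >> 1) & 0x3FF,
    }
}

/// Build a primary-63 compare word; bit 31 (Rc) stays 0 because
/// the compares have no record form.
pub fn encode_compare(xo: u32, bf: u32, fra: u32, frb: u32) -> u32 {
    (63 << 26) | ((bf & 0x7) << 23) | ((fra & 0x1F) << 16) | ((frb & 0x1F) << 11) | ((xo & 0x3FF) << 1)
}
```

Decoding the two opcode words quoted on these pages, `0xfc000000` and `0xfc000040`, gives extended opcodes 0 and 32 respectively, which is how a decoder tells `fcmpu` from `fcmpo`.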
|
||||
|
||||
## Related Instructions
|
||||
|
||||
- [`fcmpo`](fcmpo.md): ordered compare; additionally raises `VXVC` on NaN operands.
|
||||
- `mcrf`, `mcrfs`, `mfcr` — copy CR fields, useful after `fcmpu` to fan out the result.
|
||||
- `bc`, `bclr`, `bcctr` — conditional branches consume the CR fields written by `fcmpu`.
|
||||
- [`fselx`](fselx.md) — branch-free alternative when only the sign of `FRA - FRB` is needed.
|
||||
- [`mcrfs`](mcrfs.md), [`mffsx`](mffsx.md) — move FPSCR data into the CR.
|
||||
|
||||
## IBM Reference
|
||||
|
||||
- [AIX 7.3 — `fcmpu` (Floating Compare Unordered)](https://www.ibm.com/docs/en/aix/7.3.0?topic=set-fcmpu-floating-compare-unordered-instruction)
|
||||
- [PowerISA v2.07B, Book I, Chapter 4 — Floating-Point Processor](https://openpowerfoundation.org/specifications/isa/) (compare semantics, `FPCC` updates, NaN/SNaN exception rules).
|
||||
149
migration/project-root/ppc-manual/fpu/fctidx.md
Normal file
@@ -0,0 +1,149 @@
|
||||
# `fctidx` — Floating Convert to Integer Doubleword
|
||||
|
||||
> **Category:** [Floating-Point](../categories/fpu.md) · **Form:** [X](../forms/X.md) · **Opcode:** `0xfc00065c`
|
||||
|
||||
<!-- GENERATED: BEGIN -->
|
||||
|
||||
## Assembler Mnemonics
|
||||
|
||||
| Mnemonic | XML entry | Flags | Description |
|
||||
| --- | --- | --- | --- |
|
||||
| `fctid` | `fctidx` | — | Floating Convert to Integer Doubleword |
|
||||
| `fctid.` | `fctidx` | Rc=1 | Floating Convert to Integer Doubleword |
|
||||
|
||||
## Syntax
|
||||
|
||||
```asm
|
||||
fctid[Rc] [FD], [FB]
|
||||
```
|
||||
|
||||
## Encoding
|
||||
|
||||
### `fctidx` — form `X`
|
||||
|
||||
- **Opcode word:** `0xfc00065c`
|
||||
- **Primary opcode (bits 0–5):** `63`
|
||||
- **Extended opcode:** `814`
|
||||
- **Synchronising:** no
|
||||
|
||||
| Bits | Field | Meaning |
|
||||
| --- | --- | --- |
|
||||
| 0–5 | `OPCD` | primary opcode |
|
||||
| 6–10 | `RT/FRT/VRT` | destination |
|
||||
| 11–15 | `RA/FRA/VRA` | source A |
|
||||
| 16–20 | `RB/FRB/VRB` | source B |
|
||||
| 21–30 | `XO` | extended opcode (10 bits) |
|
||||
| 31 | `Rc` | record-form flag |
|
||||
|
||||
## Operands
|
||||
|
||||
| Field | Role | Description |
|
||||
| --- | --- | --- |
|
||||
| `FB` | fctidx: read | Source B floating-point register. |
|
||||
| `FD` | fctidx: write | Destination floating-point register. |
|
||||
| `CR` | fctidx: write (conditional) | Condition-register update. When `Rc=1`, CR field 0 (or CR6 for vector compares, CR1 for FPU) is updated from the result. |
|
||||
| `FPSCR` | fctidx: write | Floating-Point Status and Control Register. |
|
||||
|
||||
## Register Effects
|
||||
|
||||
### `fctidx`
|
||||
|
||||
- **Reads (always):** `FB`
|
||||
- **Reads (conditional):** _none_
|
||||
- **Writes (always):** `FD`, `FPSCR`
|
||||
- **Writes (conditional):** `CR`
|
||||
|
||||
## Status-Register Effects
|
||||
|
||||
- `fctidx`: **CR1** ← `FPSCR[FX, FEX, VX, OX]`, when `Rc=1`; **FPSCR** updated per IEEE-754 flags (FX, FEX, FPRF, FR, FI, exceptions).
|
||||
|
||||
## Operation (pseudocode)
|
||||
|
||||
```
|
||||
; Pseudocode derives directly from the xenia-rs interpreter
|
||||
; arm (see Implementation References). Operation semantics:
|
||||
; - Read source operands from the fields listed under Operands.
|
||||
; - Apply the arithmetic / logical / memory action described
|
||||
; in the Description field above.
|
||||
; - Write results to the destination register(s); update any
|
||||
; status bits enumerated under Status-Register Effects.
|
||||
; Consult the IBM AIX reference link under IBM Reference for
|
||||
; canonical PPC-style pseudocode where xenia's expression is
|
||||
; terse.
|
||||
```
|
||||
|
||||
## C Translation Example
|
||||
|
||||
```c
|
||||
/* C translation: the xenia-rs interpreter arm below in */
|
||||
/* Implementation References is the authoritative semantic */
|
||||
/* snapshot. Translate it line-by-line: */
|
||||
/* - ctx.gpr[N] -> r[N] (or f[]/v[] for FPRs/VRs) */
|
||||
/* - mem.read_u* / write_u* -> mem_read_u*_be / mem_write_u*_be */
|
||||
/* - ctx.update_cr_signed(fld, v) -> update_cr_signed(fld, v) */
|
||||
/* - ctx.xer_ca / xer_ov / xer_so -> xer.CA / xer.OV / xer.SO */
|
||||
/* The Register Effects and Status-Register Effects tables above */
|
||||
/* enumerate every side effect a faithful translation must emit. */
|
||||
```
|
||||
|
||||
## Implementation References
|
||||
|
||||
**`fctidx`**
|
||||
- xenia-canary XML: [`tools/ppc-instructions.xml` — search for `mnem="fctidx"`](../../xenia-canary/tools/ppc-instructions.xml)
|
||||
- xenia-canary emit: [`src/xenia/cpu/ppc/ppc_emit_fpu.cc:280`](../../xenia-canary/src/xenia/cpu/ppc/ppc_emit_fpu.cc#L280)
|
||||
- xenia-rs opcode: [`crates/xenia-cpu/src/opcode.rs:27`](../../xenia-rs/crates/xenia-cpu/src/opcode.rs#L27)
|
||||
- xenia-rs decoder: [`crates/xenia-cpu/src/decoder.rs:912`](../../xenia-rs/crates/xenia-cpu/src/decoder.rs#L912)
|
||||
- xenia-rs interpreter: [`crates/xenia-cpu/src/interpreter.rs:2886-2906`](../../xenia-rs/crates/xenia-cpu/src/interpreter.rs#L2886-L2906)
|
||||
<details><summary>xenia-rs interpreter body (frozen snapshot)</summary>
|
||||
|
||||
```rust
|
||||
PpcOpcode::fctidx => {
|
||||
// Convert to integer doubleword (round per FPSCR[RN]).
|
||||
// PPCBUG-229: set XX on inexact (fractional input).
|
||||
let val = ctx.fpr[instr.rb()];
|
||||
let result = if val.is_nan() {
|
||||
fpscr::set_exception(ctx, fpscr::VXCVI | if fpscr::is_snan(val) { fpscr::VXSNAN } else { 0 });
|
||||
0x8000_0000_0000_0000u64
|
||||
} else if val >= (i64::MAX as f64) {
|
||||
fpscr::set_exception(ctx, fpscr::VXCVI);
|
||||
0x7FFF_FFFF_FFFF_FFFFu64
|
||||
} else if val < (i64::MIN as f64) {
|
||||
fpscr::set_exception(ctx, fpscr::VXCVI);
|
||||
0x8000_0000_0000_0000u64
|
||||
} else {
|
||||
if val != val.trunc() { fpscr::set_exception(ctx, fpscr::XX); }
|
||||
fpscr::round_to_i64(ctx, val) as u64
|
||||
};
|
||||
ctx.fpr[instr.rd()] = f64::from_bits(result);
|
||||
if instr.rc_bit() { update_cr1_from_fpscr(ctx); }
|
||||
ctx.pc += 4;
|
||||
}
|
||||
```
|
||||
</details>
|
||||
|
||||
<!-- GENERATED: END -->
|
||||
|
||||
## Special Cases & Edge Conditions
|
||||
|
||||
- **binary64 → 64-bit signed integer, current rounding mode.** Result is the integer rounded per `FPSCR[RN]`, packed into the 64-bit FPR as raw bits (the FPR is reinterpreted as an `i64` by subsequent `stfd`/integer code).
|
||||
- **Saturation on out-of-range.** Per PowerISA, a rounded value above `i64::MAX` yields `0x7FFF_FFFF_FFFF_FFFF`, a rounded value below `i64::MIN` yields `0x8000_0000_0000_0000`, and both set `FPSCR[VXCVI, VX, FX]`. The frozen xenia-rs snapshot saturates explicitly with the same two sentinels and raises `VXCVI` in each branch, before any Rust `as i64` cast is reached.
- **xenia round implementation.** The in-range path calls `fpscr::round_to_i64(ctx, val)`, which the snapshot's comment describes as rounding per `FPSCR[RN]`. The helper's body is not part of the snapshot, so its tie handling is unverified here. For reference, PowerISA round-to-nearest resolves ties to even (`0.5, 1.5, 2.5` → `0, 2, 2`), whereas Rust's `f64::round` resolves ties away from zero (`1, 2, 3`) and would be wrong for that mode.
- **Rounding mode.** PPC uses `FPSCR[RN]` for the rounding direction: `0b00` nearest-even, `0b01` toward zero, `0b10` toward +∞, `0b11` toward −∞. The snapshot passes `ctx` to `fpscr::round_to_i64`, so the helper has access to `RN`.
- **Inexact.** Hardware sets `FPSCR[XX, FX]` on any non-integer input. The snapshot raises `XX` when `val != val.trunc()` (PPCBUG-229).
- **NaN.** Returns sentinel `0x8000_0000_0000_0000` and sets `FPSCR[VXCVI]` (plus `VXSNAN` for a signalling NaN). The snapshot matches both the sentinel and the exception bits.
|
||||
- **`Rc=1` (`fctid.`)** copies `FPSCR[FX, FEX, VX, OX]` into CR1.
|
||||
- **Encoding.** X-form, primary 63, XO 814. Reads `FRB` only.
|
||||
- **Pair with `stfd`** to extract the `i64` value to memory or a GPR (Xbox 360 has no direct FPR↔GPR move; round-trip via stack).
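
The rounding and saturation rules can be sketched as a standalone model. This is an illustration under stated assumptions, not the body of `fpscr::round_to_i64`, which is not shown on this page: `RoundMode`, `round_mode_from_rn`, `round_integral`, `Converted`, and `fctid_model` are hypothetical names, and the `invalid` and `inexact` fields stand in for `VXCVI` and `XX`.

```rust
/// Rounding direction selected by the two-bit FPSCR[RN] field.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum RoundMode {
    NearestEven,
    TowardZero,
    TowardPosInf,
    TowardNegInf,
}

pub fn round_mode_from_rn(rn: u32) -> RoundMode {
    match rn & 0b11 {
        0b00 => RoundMode::NearestEven,
        0b01 => RoundMode::TowardZero,
        0b10 => RoundMode::TowardPosInf,
        _ => RoundMode::TowardNegInf,
    }
}

/// Round to an integral f64 in the given mode; infinities pass through.
pub fn round_integral(x: f64, mode: RoundMode) -> f64 {
    if !x.is_finite() {
        return x;
    }
    match mode {
        RoundMode::TowardZero => x.trunc(),
        RoundMode::TowardPosInf => x.ceil(),
        RoundMode::TowardNegInf => x.floor(),
        RoundMode::NearestEven => {
            let lo = x.floor();
            let diff = x - lo;
            if diff < 0.5 {
                lo
            } else if diff > 0.5 {
                lo + 1.0
            } else if lo % 2.0 == 0.0 {
                lo // exact tie: keep the even neighbour
            } else {
                lo + 1.0
            }
        }
    }
}

/// Result bits plus stand-ins for VXCVI (`invalid`) and XX (`inexact`).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Converted {
    pub bits: u64,
    pub invalid: bool,
    pub inexact: bool,
}

const TWO_POW_63: f64 = 9_223_372_036_854_775_808.0;
const INT_MAX_BITS: u64 = 0x7FFF_FFFF_FFFF_FFFF;
const INT_MIN_BITS: u64 = 0x8000_0000_0000_0000;

pub fn fctid_model(x: f64, mode: RoundMode) -> Converted {
    if x.is_nan() {
        return Converted { bits: INT_MIN_BITS, invalid: true, inexact: false };
    }
    let r = round_integral(x, mode);
    if r >= TWO_POW_63 {
        Converted { bits: INT_MAX_BITS, invalid: true, inexact: false }
    } else if r < -TWO_POW_63 {
        Converted { bits: INT_MIN_BITS, invalid: true, inexact: false }
    } else {
        Converted { bits: (r as i64) as u64, invalid: false, inexact: r != x }
    }
}
```

The tie-to-even branch is written out by hand so the sketch does not depend on a particular Rust release; it is the part that distinguishes the default PPC mode from `f64::round`.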
|
||||
|
||||
## Related Instructions
|
||||
|
||||
- [`fctidzx`](fctidzx.md) — same conversion but always rounds toward zero (truncation).
|
||||
- [`fctiwx`](fctiwx.md), [`fctiwzx`](fctiwzx.md) — 32-bit integer variants (saturate to `i32` range).
|
||||
- [`fcfidx`](fcfidx.md) — inverse direction (`i64` → binary64).
|
||||
- [`mffsx`](mffsx.md), [`mtfsfx`](mtfsfx.md) — control `FPSCR[RN]`.
|
||||
- `stfd`, `stfiwx` — store the integer-bits FPR to memory; `stfiwx` stores only the low 32 bits (use after `fctiwx` / `fctiwzx`).
|
||||
|
||||
## IBM Reference
|
||||
|
||||
- [AIX 7.3 — `fctid` (Floating Convert to Integer Doubleword)](https://www.ibm.com/docs/en/aix/7.3.0?topic=set-fctid-floating-convert-integer-doubleword-instruction)
|
||||
- [PowerISA v2.07B, Book I, Chapter 4 — Floating-Point Processor](https://openpowerfoundation.org/specifications/isa/) (`VXCVI` is the invalid-conversion exception bit; the saturation sentinels are `0x7FFF_FFFF_FFFF_FFFF` for positive overflow and `0x8000_0000_0000_0000` for negative overflow and NaN).
|
||||
151
migration/project-root/ppc-manual/fpu/fctidzx.md
Normal file
@@ -0,0 +1,151 @@
|
||||
# `fctidzx` — Floating Convert to Integer Doubleword with Round Toward Zero
|
||||
|
||||
> **Category:** [Floating-Point](../categories/fpu.md) · **Form:** [X](../forms/X.md) · **Opcode:** `0xfc00065e`
|
||||
|
||||
<!-- GENERATED: BEGIN -->
|
||||
|
||||
## Assembler Mnemonics
|
||||
|
||||
| Mnemonic | XML entry | Flags | Description |
|
||||
| --- | --- | --- | --- |
|
||||
| `fctidz` | `fctidzx` | — | Floating Convert to Integer Doubleword with Round Toward Zero |
|
||||
| `fctidz.` | `fctidzx` | Rc=1 | Floating Convert to Integer Doubleword with Round Toward Zero |
|
||||
|
||||
## Syntax
|
||||
|
||||
```asm
|
||||
fctidz[Rc] [FD], [FB]
|
||||
```
|
||||
|
||||
## Encoding
|
||||
|
||||
### `fctidzx` — form `X`
|
||||
|
||||
- **Opcode word:** `0xfc00065e`
|
||||
- **Primary opcode (bits 0–5):** `63`
|
||||
- **Extended opcode:** `815`
|
||||
- **Synchronising:** no
|
||||
|
||||
| Bits | Field | Meaning |
|
||||
| --- | --- | --- |
|
||||
| 0–5 | `OPCD` | primary opcode |
|
||||
| 6–10 | `RT/FRT/VRT` | destination |
|
||||
| 11–15 | `RA/FRA/VRA` | source A |
|
||||
| 16–20 | `RB/FRB/VRB` | source B |
|
||||
| 21–30 | `XO` | extended opcode (10 bits) |
|
||||
| 31 | `Rc` | record-form flag |
|
||||
|
||||
## Operands
|
||||
|
||||
| Field | Role | Description |
|
||||
| --- | --- | --- |
|
||||
| `FB` | fctidzx: read | Source B floating-point register. |
|
||||
| `FD` | fctidzx: write | Destination floating-point register. |
|
||||
| `CR` | fctidzx: write (conditional) | Condition-register update. When `Rc=1`, CR field 0 (or CR6 for vector compares, CR1 for FPU) is updated from the result. |
|
||||
| `FPSCR` | fctidzx: write | Floating-Point Status and Control Register. |
|
||||
|
||||
## Register Effects
|
||||
|
||||
### `fctidzx`
|
||||
|
||||
- **Reads (always):** `FB`
|
||||
- **Reads (conditional):** _none_
|
||||
- **Writes (always):** `FD`, `FPSCR`
|
||||
- **Writes (conditional):** `CR`
|
||||
|
||||
## Status-Register Effects
|
||||
|
||||
- `fctidzx`: **CR1** ← `FPSCR[FX, FEX, VX, OX]`, when `Rc=1`; **FPSCR** updated per IEEE-754 flags (FX, FEX, FPRF, FR, FI, exceptions).
|
||||
|
||||
## Operation (pseudocode)
|
||||
|
||||
```
|
||||
; Pseudocode derives directly from the xenia-rs interpreter
|
||||
; arm (see Implementation References). Operation semantics:
|
||||
; - Read source operands from the fields listed under Operands.
|
||||
; - Apply the arithmetic / logical / memory action described
|
||||
; in the Description field above.
|
||||
; - Write results to the destination register(s); update any
|
||||
; status bits enumerated under Status-Register Effects.
|
||||
; Consult the IBM AIX reference link under IBM Reference for
|
||||
; canonical PPC-style pseudocode where xenia's expression is
|
||||
; terse.
|
||||
```
|
||||
|
||||
## C Translation Example
|
||||
|
||||
```c
|
||||
/* C translation: the xenia-rs interpreter arm below in */
|
||||
/* Implementation References is the authoritative semantic */
|
||||
/* snapshot. Translate it line-by-line: */
|
||||
/* - ctx.gpr[N] -> r[N] (or f[]/v[] for FPRs/VRs) */
|
||||
/* - mem.read_u* / write_u* -> mem_read_u*_be / mem_write_u*_be */
|
||||
/* - ctx.update_cr_signed(fld, v) -> update_cr_signed(fld, v) */
|
||||
/* - ctx.xer_ca / xer_ov / xer_so -> xer.CA / xer.OV / xer.SO */
|
||||
/* The Register Effects and Status-Register Effects tables above */
|
||||
/* enumerate every side effect a faithful translation must emit. */
|
||||
```
|
||||
|
||||
## Implementation References
|
||||
|
||||
**`fctidzx`**
|
||||
- xenia-canary XML: [`tools/ppc-instructions.xml` — search for `mnem="fctidzx"`](../../xenia-canary/tools/ppc-instructions.xml)
|
||||
- xenia-canary emit: [`src/xenia/cpu/ppc/ppc_emit_fpu.cc:285`](../../xenia-canary/src/xenia/cpu/ppc/ppc_emit_fpu.cc#L285)
|
||||
- xenia-rs opcode: [`crates/xenia-cpu/src/opcode.rs:27`](../../xenia-rs/crates/xenia-cpu/src/opcode.rs#L27)
|
||||
- xenia-rs decoder: [`crates/xenia-cpu/src/decoder.rs:913`](../../xenia-rs/crates/xenia-cpu/src/decoder.rs#L913)
|
||||
- xenia-rs interpreter: [`crates/xenia-cpu/src/interpreter.rs:2907-2927`](../../xenia-rs/crates/xenia-cpu/src/interpreter.rs#L2907-L2927)
|
||||
<details><summary>xenia-rs interpreter body (frozen snapshot)</summary>
|
||||
|
||||
```rust
|
||||
PpcOpcode::fctidzx => {
|
||||
// Convert to integer doubleword (round toward zero).
|
||||
// PPCBUG-229: set XX on inexact.
|
||||
let val = ctx.fpr[instr.rb()];
|
||||
let result = if val.is_nan() {
|
||||
fpscr::set_exception(ctx, fpscr::VXCVI | if fpscr::is_snan(val) { fpscr::VXSNAN } else { 0 });
|
||||
0x8000_0000_0000_0000u64
|
||||
} else if val >= (i64::MAX as f64) {
|
||||
fpscr::set_exception(ctx, fpscr::VXCVI);
|
||||
0x7FFF_FFFF_FFFF_FFFFu64
|
||||
} else if val < (i64::MIN as f64) {
|
||||
fpscr::set_exception(ctx, fpscr::VXCVI);
|
||||
0x8000_0000_0000_0000u64
|
||||
} else {
|
||||
if val != val.trunc() { fpscr::set_exception(ctx, fpscr::XX); }
|
||||
(val.trunc() as i64) as u64
|
||||
};
|
||||
ctx.fpr[instr.rd()] = f64::from_bits(result);
|
||||
if instr.rc_bit() { update_cr1_from_fpscr(ctx); }
|
||||
ctx.pc += 4;
|
||||
}
|
||||
```
|
||||
</details>
|
||||
|
||||
<!-- GENERATED: END -->
|
||||
|
||||
## Special Cases & Edge Conditions
|
||||
|
||||
- **binary64 → 64-bit signed integer, round toward zero.** The "z" suffix forces truncation regardless of `FPSCR[RN]`. The xenia-rs snapshot computes `val.trunc() as i64` for in-range values, bypassing the FPSCR rounding mode entirely, which matches PPC `fctidz` semantics.
- **Saturation on out-of-range.** PowerISA: out-of-range values and NaN set `FPSCR[VXCVI, VX, FX]` and return a sentinel. The snapshot checks the range explicitly before casting and raises `VXCVI` in each case:
  - **+∞ or large positive → `i64::MAX`** (`0x7FFF_FFFF_FFFF_FFFF`), on PPC and under xenia.
  - **−∞ or large negative → `i64::MIN`** (`0x8000_0000_0000_0000`), on PPC and under xenia.
  - Rust's `as i64` has also been defined to saturate since 1.45, but the snapshot does not rely on it for out-of-range inputs.
- **NaN.** Returns sentinel `0x8000_0000_0000_0000` (matches PPC) and raises `VXCVI`, plus `VXSNAN` for a signalling NaN.
- **Inexact.** Hardware sets `FPSCR[XX, FX]` on any non-integer input. The snapshot raises `XX` when `val != val.trunc()` (PPCBUG-229).
|
||||
- **No `FPSCR[RN]` dependence.** `fctidz` always truncates; this is the right choice for C/C++ `(int64_t)` casts.
|
||||
- **`Rc=1` (`fctidz.`)** copies `FPSCR[FX, FEX, VX, OX]` into CR1.
|
||||
- **Encoding.** X-form, primary 63, XO 815. Reads `FRB` only.
|
||||
- **Common pairing.** Translation of C `(int64_t)d` casts; combined with `stfd` to move the value to integer memory.
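
The truncating conversion, the raw-bits handoff through an FPR, and the `Rc=1` copy into CR1 can be sketched together. This is a minimal model under stated assumptions: `fctidz_bits`, `fpr_after_fctidz`, `read_back_as_i64`, and `cr1_from_fpscr` are hypothetical names, and the FPSCR is taken to be a 32-bit image whose most significant four bits are `FX, FEX, VX, OX`.

```rust
const TWO_POW_63: f64 = 9_223_372_036_854_775_808.0;

/// Truncate toward zero, saturating to the same sentinels as the snapshot.
pub fn fctidz_bits(x: f64) -> u64 {
    if x.is_nan() || x < -TWO_POW_63 {
        0x8000_0000_0000_0000
    } else if x >= TWO_POW_63 {
        0x7FFF_FFFF_FFFF_FFFF
    } else {
        (x.trunc() as i64) as u64
    }
}

/// The integer is stored in the FPR as raw bits, not as a numeric f64.
pub fn fpr_after_fctidz(x: f64) -> f64 {
    f64::from_bits(fctidz_bits(x))
}

/// What an `stfd` followed by a 64-bit integer load would observe.
pub fn read_back_as_i64(fpr: f64) -> i64 {
    fpr.to_bits() as i64
}

/// Rc=1: CR1 receives FPSCR[FX, FEX, VX, OX], assumed here to be the
/// top four bits of a 32-bit FPSCR image.
pub fn cr1_from_fpscr(fpscr: u32) -> u8 {
    (fpscr >> 28) as u8
}
```

Reading the FPR back numerically would be a mistake: the bit pattern for a small integer such as 42 is a denormal when interpreted as binary64, which is why the result has to travel through memory as raw bits.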
|
||||
|
||||
## Related Instructions
|
||||
|
||||
- [`fctidx`](fctidx.md): same conversion but rounds per `FPSCR[RN]` (default nearest-even on PPC; xenia-rs delegates to `fpscr::round_to_i64`).
|
||||
- [`fctiwzx`](fctiwzx.md) — 32-bit truncating variant.
|
||||
- [`fctiwx`](fctiwx.md) — 32-bit `FPSCR[RN]`-rounded variant.
|
||||
- [`fcfidx`](fcfidx.md) — inverse direction.
|
||||
- `stfd` — store the integer-bits FPR to memory.
|
||||
|
||||
## IBM Reference
|
||||
|
||||
- [AIX 7.3 — `fctidz` (Floating Convert to Integer Doubleword with Round Toward Zero)](https://www.ibm.com/docs/en/aix/7.3.0?topic=set-fctidz-floating-convert-integer-doubleword-round-toward-zero-instruction)
|
||||
- [PowerISA v2.07B, Book I, Chapter 4 — Floating-Point Processor](https://openpowerfoundation.org/specifications/isa/).
|
||||
149
migration/project-root/ppc-manual/fpu/fctiwx.md
Normal file
@@ -0,0 +1,149 @@
|
||||
# `fctiwx` — Floating Convert to Integer Word
|
||||
|
||||
> **Category:** [Floating-Point](../categories/fpu.md) · **Form:** [X](../forms/X.md) · **Opcode:** `0xfc00001c`
|
||||
|
||||
<!-- GENERATED: BEGIN -->
|
||||
|
||||
## Assembler Mnemonics
|
||||
|
||||
| Mnemonic | XML entry | Flags | Description |
|
||||
| --- | --- | --- | --- |
|
||||
| `fctiw` | `fctiwx` | — | Floating Convert to Integer Word |
|
||||
| `fctiw.` | `fctiwx` | Rc=1 | Floating Convert to Integer Word |
|
||||
|
||||
## Syntax
|
||||
|
||||
```asm
|
||||
fctiw[Rc] [FD], [FB]
|
||||
```
|
||||
|
||||
## Encoding
|
||||
|
||||
### `fctiwx` — form `X`
|
||||
|
||||
- **Opcode word:** `0xfc00001c`
|
||||
- **Primary opcode (bits 0–5):** `63`
|
||||
- **Extended opcode:** `14`
|
||||
- **Synchronising:** no
|
||||
|
||||
| Bits | Field | Meaning |
|
||||
| --- | --- | --- |
|
||||
| 0–5 | `OPCD` | primary opcode |
|
||||
| 6–10 | `RT/FRT/VRT` | destination |
|
||||
| 11–15 | `RA/FRA/VRA` | source A |
|
||||
| 16–20 | `RB/FRB/VRB` | source B |
|
||||
| 21–30 | `XO` | extended opcode (10 bits) |
|
||||
| 31 | `Rc` | record-form flag |
|
||||
|
||||
## Operands
|
||||
|
||||
| Field | Role | Description |
|
||||
| --- | --- | --- |
|
||||
| `FB` | fctiwx: read | Source B floating-point register. |
|
||||
| `FD` | fctiwx: write | Destination floating-point register. |
|
||||
| `CR` | fctiwx: write (conditional) | Condition-register update. When `Rc=1`, CR field 0 (or CR6 for vector compares, CR1 for FPU) is updated from the result. |
|
||||
| `FPSCR` | fctiwx: write | Floating-Point Status and Control Register. |
|
||||
|
||||
## Register Effects
|
||||
|
||||
### `fctiwx`
|
||||
|
||||
- **Reads (always):** `FB`
|
||||
- **Reads (conditional):** _none_
|
||||
- **Writes (always):** `FD`, `FPSCR`
|
||||
- **Writes (conditional):** `CR`
|
||||
|
||||
## Status-Register Effects
|
||||
|
||||
- `fctiwx`: **CR1** ← `FPSCR[FX, FEX, VX, OX]`, when `Rc=1`; **FPSCR** updated per IEEE-754 flags (FX, FEX, FPRF, FR, FI, exceptions).
|
||||
|
||||
## Operation (pseudocode)

```
; Pseudocode derives directly from the xenia-rs interpreter
; arm (see Implementation References). Operation semantics:
; - Read source operands from the fields listed under Operands.
; - Apply the arithmetic / logical / memory action described
; in the Description field above.
; - Write results to the destination register(s); update any
; status bits enumerated under Status-Register Effects.
; Consult the IBM AIX reference link under IBM Reference for
; canonical PPC-style pseudocode where xenia's expression is
; terse.
```

## C Translation Example

```c
/* C translation: the xenia-rs interpreter arm below in */
/* Implementation References is the authoritative semantic */
/* snapshot. Translate it line-by-line: */
/* - ctx.gpr[N] -> r[N] (or f[]/v[] for FPRs/VRs) */
/* - mem.read_u* / write_u* -> mem_read_u*_be / mem_write_u*_be */
/* - ctx.update_cr_signed(fld, v) -> update_cr_signed(fld, v) */
/* - ctx.xer_ca / xer_ov / xer_so -> xer.CA / xer.OV / xer.SO */
/* The Register Effects and Status-Register Effects tables above */
/* enumerate every side effect a faithful translation must emit. */
```

## Implementation References

**`fctiwx`**
- xenia-canary XML: [`tools/ppc-instructions.xml` — search for `mnem="fctiwx"`](../../xenia-canary/tools/ppc-instructions.xml)
- xenia-canary emit: [`src/xenia/cpu/ppc/ppc_emit_fpu.cc:308`](../../xenia-canary/src/xenia/cpu/ppc/ppc_emit_fpu.cc#L308)
- xenia-rs opcode: [`crates/xenia-cpu/src/opcode.rs:27`](../../xenia-rs/crates/xenia-cpu/src/opcode.rs#L27)
- xenia-rs decoder: [`crates/xenia-cpu/src/decoder.rs:899`](../../xenia-rs/crates/xenia-cpu/src/decoder.rs#L899)
- xenia-rs interpreter: [`crates/xenia-cpu/src/interpreter.rs:2928-2948`](../../xenia-rs/crates/xenia-cpu/src/interpreter.rs#L2928-L2948)
<details><summary>xenia-rs interpreter body (frozen snapshot)</summary>

```rust
PpcOpcode::fctiwx => {
    // Convert to integer word (round per FPSCR[RN]).
    // PPCBUG-230: set XX on inexact.
    let val = ctx.fpr[instr.rb()];
    let result_u32: u32 = if val.is_nan() {
        fpscr::set_exception(ctx, fpscr::VXCVI | if fpscr::is_snan(val) { fpscr::VXSNAN } else { 0 });
        0x8000_0000
    } else if val > (i32::MAX as f64) {
        fpscr::set_exception(ctx, fpscr::VXCVI);
        0x7FFF_FFFF
    } else if val < (i32::MIN as f64) {
        fpscr::set_exception(ctx, fpscr::VXCVI);
        0x8000_0000
    } else {
        if val != val.trunc() { fpscr::set_exception(ctx, fpscr::XX); }
        fpscr::round_to_i32(ctx, val) as u32
    };
    ctx.fpr[instr.rd()] = f64::from_bits(result_u32 as u64);
    if instr.rc_bit() { update_cr1_from_fpscr(ctx); }
    ctx.pc += 4;
}
```
</details>

<!-- GENERATED: END -->

## Special Cases & Edge Conditions

- **binary64 → 32-bit signed integer, current rounding mode.** The result is rounded per `FPSCR[RN]` and placed in the low 32 bits of the destination FPR. The high 32 bits are architecturally undefined per PowerISA; xenia-rs writes a zero-extended `u32`, so they are 0.
- **Explicit saturation in xenia.** The snapshot compares the unrounded input against `i32::MAX as f64` and `i32::MIN as f64` and returns `0x7FFF_FFFF` or `0x8000_0000` for out-of-range finite inputs, which matches PPC's saturated results. Because the comparison precedes rounding, an input such as `2147483647.3` takes the `VXCVI` path although round-to-nearest brings it into range. The result word is the same, but hardware, which range-checks the rounded value, would be expected to raise only `XX`.
- **NaN sentinel.** xenia returns `0x0000_0000_8000_0000` for NaN inputs (i.e. `i32::MIN` in the low word), matching PPC's `VXCVI` result for NaN. The snapshot additionally raises `VXSNAN` for signalling NaNs.
- **Rounding implementation.** The snapshot delegates in-range rounding to `fpscr::round_to_i32(ctx, val)`, which its comment describes as rounding per `FPSCR[RN]`; the helper's body is not part of the snapshot. If that helper (or an older build) falls back to `f64::round`, half-cases round **away from zero** rather than to nearest-even: `0.5`/`1.5`/`2.5` produce `1`/`2`/`3` instead of the `0`/`2`/`2` of PPC default rounding.
- **`FPSCR[RN]` handling.** Whether non-default rounding modes are honored depends entirely on `fpscr::round_to_i32`. The snapshot passes `ctx` to it, so the mode field is at least available to the helper.
- **FPSCR side effects.** PPC sets `XX`/`FX` on inexact and `VXCVI` on NaN/out-of-range. The snapshot raises `VXCVI` on the NaN and out-of-range paths and, per PPCBUG-230, `XX` when `val != val.trunc()`. It shows no explicit `FR`/`FI` update.
- **`Rc=1` (`fctiw.`)** copies `FPSCR[FX, FEX, VX, OX]` into CR1.
- **Encoding.** X-form, primary 63, XO 14. Reads `FRB` only.
- **Common pairing.** Followed by `stfiwx` to store the low 32-bit integer to memory (`stfd` would write the whole doubleword, including the high bits that are undefined on hardware).
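
The rounding and saturation rules above can be sketched as a small standalone model. This is an illustration only, not the xenia-rs code: `ConvFlags`, `round_by_rn`, and `fctiw_model` are hypothetical names, the `rn` argument stands in for `FPSCR[RN]`, and the range check is applied to the rounded value (the PowerISA order), which differs from the snapshot's check on the unrounded input.

```rust
/// Exception bits a convert-to-integer may raise (hypothetical model type).
#[derive(Debug, Default, PartialEq, Eq)]
pub struct ConvFlags {
    pub vxcvi: bool,
    pub vxsnan: bool,
    pub xx: bool,
}

/// Round to an integral f64 under one of the four FPSCR[RN] encodings.
pub fn round_by_rn(x: f64, rn: u8) -> f64 {
    match rn & 3 {
        // 0b00: nearest, ties to even. `f64::round` rounds ties away from
        // zero, so exact .5 fractions are redirected to the even neighbour.
        0 => {
            if (x - x.trunc()).abs() == 0.5 {
                2.0 * (x / 2.0).round()
            } else {
                x.round()
            }
        }
        1 => x.trunc(), // 0b01: toward zero
        2 => x.ceil(),  // 0b10: toward +infinity
        _ => x.floor(), // 0b11: toward -infinity
    }
}

/// Model of `fctiw`: returns the 64-bit FPR image and the raised flags.
pub fn fctiw_model(val: f64, rn: u8) -> (u64, ConvFlags) {
    let mut flags = ConvFlags::default();
    if val.is_nan() {
        flags.vxcvi = true;
        // A signalling NaN has the quiet bit (fraction MSB, bit 51) clear.
        flags.vxsnan = (val.to_bits() & (1u64 << 51)) == 0;
        return (0x8000_0000, flags);
    }
    let rounded = round_by_rn(val, rn);
    if rounded > (i32::MAX as f64) {
        flags.vxcvi = true;
        return (0x7FFF_FFFF, flags);
    }
    if rounded < (i32::MIN as f64) {
        flags.vxcvi = true;
        return (0x8000_0000, flags);
    }
    flags.xx = rounded != val; // inexact if rounding changed the value
    ((rounded as i32 as u32) as u64, flags) // high 32 bits left as zero
}
```

Under this model `fctiw_model(2.5, 0)` yields 2 while `2.5f64.round()` yields 3.0, which is the half-case difference described in the rounding bullet.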

## Related Instructions

- [`fctiwzx`](fctiwzx.md) — 32-bit integer with round-toward-zero (truncation).
- [`fctidx`](fctidx.md), [`fctidzx`](fctidzx.md) — 64-bit integer variants.
- [`fcfidx`](fcfidx.md) — inverse direction (i64 → f64); for i32 → f64, sign-extend then `fcfid`.
- `stfiwx` — store low-32-bits FPR (the canonical companion to `fctiw`/`fctiwz`).
- [`mffsx`](mffsx.md), [`mtfsfx`](mtfsfx.md) — read and write `FPSCR`, including the `RN` field that selects this instruction's rounding mode.

## IBM Reference

- [AIX 7.3 — `fctiw` (Floating Convert to Integer Word)](https://www.ibm.com/docs/en/aix/7.3.0?topic=set-fctiw-floating-convert-integer-word-instruction)
- [PowerISA v2.07B, Book I, Chapter 4 — Floating-Point Processor](https://openpowerfoundation.org/specifications/isa/) (high 32 bits are architecturally undefined; only `stfiwx` is the spec-blessed consumer).
148
migration/project-root/ppc-manual/fpu/fctiwzx.md
Normal file
@@ -0,0 +1,148 @@
# `fctiwzx` — Floating Convert to Integer Word with Round Toward Zero

> **Category:** [Floating-Point](../categories/fpu.md) · **Form:** [X](../forms/X.md) · **Opcode:** `0xfc00001e`

<!-- GENERATED: BEGIN -->

## Assembler Mnemonics

| Mnemonic | XML entry | Flags | Description |
| --- | --- | --- | --- |
| `fctiwz` | `fctiwzx` | — | Floating Convert to Integer Word with Round Toward Zero |
| `fctiwz.` | `fctiwzx` | Rc=1 | Floating Convert to Integer Word with Round Toward Zero |

## Syntax

```asm
fctiwz[Rc] [FD], [FB]
```

## Encoding

### `fctiwzx` — form `X`

- **Opcode word:** `0xfc00001e`
- **Primary opcode (bits 0–5):** `63`
- **Extended opcode:** `15`
- **Synchronising:** no

| Bits | Field | Meaning |
| --- | --- | --- |
| 0–5 | `OPCD` | primary opcode |
| 6–10 | `RT/FRT/VRT` | destination |
| 11–15 | `RA/FRA/VRA` | source A |
| 16–20 | `RB/FRB/VRB` | source B |
| 21–30 | `XO` | extended opcode (10 bits) |
| 31 | `Rc` | record-form flag |

## Operands

| Field | Role | Description |
| --- | --- | --- |
| `FB` | fctiwzx: read | Source B floating-point register. |
| `FD` | fctiwzx: write | Destination floating-point register. |
| `CR` | fctiwzx: write (conditional) | Condition-register update. When `Rc=1`, CR1 is loaded from `FPSCR[FX, FEX, VX, OX]`. |
| `FPSCR` | fctiwzx: write | Floating-Point Status and Control Register. |

## Register Effects

### `fctiwzx`

- **Reads (always):** `FB`
- **Reads (conditional):** _none_
- **Writes (always):** `FD`, `FPSCR`
- **Writes (conditional):** `CR`

## Status-Register Effects

- `fctiwzx`: **CR1** ← FPSCR[FX, FEX, VX, OX] when `Rc=1`; **FPSCR** updated per IEEE-754 flags (FX, FEX, FPRF, FR, FI, exceptions).

## Operation (pseudocode)

```
; Pseudocode derives directly from the xenia-rs interpreter
; arm (see Implementation References). Operation semantics:
; - Read source operands from the fields listed under Operands.
; - Apply the arithmetic / logical / memory action described
; in the Description field above.
; - Write results to the destination register(s); update any
; status bits enumerated under Status-Register Effects.
; Consult the IBM AIX reference link under IBM Reference for
; canonical PPC-style pseudocode where xenia's expression is
; terse.
```

## C Translation Example

```c
/* C translation: the xenia-rs interpreter arm below in */
/* Implementation References is the authoritative semantic */
/* snapshot. Translate it line-by-line: */
/* - ctx.gpr[N] -> r[N] (or f[]/v[] for FPRs/VRs) */
/* - mem.read_u* / write_u* -> mem_read_u*_be / mem_write_u*_be */
/* - ctx.update_cr_signed(fld, v) -> update_cr_signed(fld, v) */
/* - ctx.xer_ca / xer_ov / xer_so -> xer.CA / xer.OV / xer.SO */
/* The Register Effects and Status-Register Effects tables above */
/* enumerate every side effect a faithful translation must emit. */
```

## Implementation References

**`fctiwzx`**
- xenia-canary XML: [`tools/ppc-instructions.xml` — search for `mnem="fctiwzx"`](../../xenia-canary/tools/ppc-instructions.xml)
- xenia-canary emit: [`src/xenia/cpu/ppc/ppc_emit_fpu.cc:313`](../../xenia-canary/src/xenia/cpu/ppc/ppc_emit_fpu.cc#L313)
- xenia-rs opcode: [`crates/xenia-cpu/src/opcode.rs:27`](../../xenia-rs/crates/xenia-cpu/src/opcode.rs#L27)
- xenia-rs decoder: [`crates/xenia-cpu/src/decoder.rs:900`](../../xenia-rs/crates/xenia-cpu/src/decoder.rs#L900)
- xenia-rs interpreter: [`crates/xenia-cpu/src/interpreter.rs:2949-2969`](../../xenia-rs/crates/xenia-cpu/src/interpreter.rs#L2949-L2969)
<details><summary>xenia-rs interpreter body (frozen snapshot)</summary>

```rust
PpcOpcode::fctiwzx => {
    // Convert to integer word (round toward zero).
    // PPCBUG-230: set XX on inexact.
    let val = ctx.fpr[instr.rb()];
    let result_u32: u32 = if val.is_nan() {
        fpscr::set_exception(ctx, fpscr::VXCVI | if fpscr::is_snan(val) { fpscr::VXSNAN } else { 0 });
        0x8000_0000
    } else if val > (i32::MAX as f64) {
        fpscr::set_exception(ctx, fpscr::VXCVI);
        0x7FFF_FFFF
    } else if val < (i32::MIN as f64) {
        fpscr::set_exception(ctx, fpscr::VXCVI);
        0x8000_0000
    } else {
        if val != val.trunc() { fpscr::set_exception(ctx, fpscr::XX); }
        val.trunc() as i32 as u32
    };
    ctx.fpr[instr.rd()] = f64::from_bits(result_u32 as u64);
    if instr.rc_bit() { update_cr1_from_fpscr(ctx); }
    ctx.pc += 4;
}
```
</details>

<!-- GENERATED: END -->

## Special Cases & Edge Conditions

- **binary64 → 32-bit signed integer, round toward zero.** Truncates regardless of `FPSCR[RN]`. The snapshot handles NaN and out-of-range inputs with explicit comparisons, then computes `val.trunc() as i32 as u32` for in-range values, matching PPC `fctiwz` semantics.
- **Most common conversion in compiled code.** Translates C/C++ `(int32_t)f` casts, which require truncation per the C standard.
- **Saturation on out-of-range.** Hardware saturates to `i32::MAX` for large positives, `i32::MIN` for large negatives or NaN, and sets `FPSCR[VXCVI, VX, FX]`. The snapshot reproduces the saturated values and calls `fpscr::set_exception(ctx, fpscr::VXCVI)` on each of those paths, adding `VXSNAN` for signalling NaNs. The comparison uses the untruncated input, so `2147483647.5` is flagged `VXCVI` although truncation gives the in-range `i32::MAX`; the result word is the same either way.
- **NaN sentinel.** xenia returns `0x0000_0000_8000_0000` (i.e. `i32::MIN` in low 32 bits). Matches PPC sentinel.
- **High 32 bits of FPR.** Architecturally undefined per PowerISA, but xenia produces zero-extended `u32`. Use `stfiwx` (store low 32 bits), never `stfd`, for the canonical "store this integer" idiom.
- **Inexact.** Hardware sets `FPSCR[XX, FX]` on any non-integer input. The snapshot raises `XX` when `val != val.trunc()` (PPCBUG-230).
- **`Rc=1` (`fctiwz.`)** copies `FPSCR[FX, FEX, VX, OX]` into CR1.
- **Encoding.** X-form, primary 63, XO 15. Reads `FRB` only.
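
A compact way to see the truncation, saturation, and `stfiwx` pairing is the sketch below. It is a simplified model under stated assumptions, not the xenia-rs implementation: `fctiwz_model` and `stfiwx_bytes` are hypothetical names, FPSCR flags are not modelled, and it leans on Rust's saturating float-to-int `as` cast instead of the snapshot's explicit comparisons.

```rust
/// Model of the 64-bit FPR image left by `fctiwz` (flags not modelled).
pub fn fctiwz_model(val: f64) -> u64 {
    let word: u32 = if val.is_nan() {
        // Rust's `NaN as i32` is 0, so the PPC sentinel needs its own arm.
        0x8000_0000
    } else {
        // Truncate toward zero; the `as i32` cast saturates at the i32 limits.
        val.trunc() as i32 as u32
    };
    word as u64 // high 32 bits zero, mirroring the snapshot
}

/// Hypothetical stand-in for `stfiwx`: the low word in big-endian byte order.
pub fn stfiwx_bytes(fpr_image: u64) -> [u8; 4] {
    (fpr_image as u32).to_be_bytes()
}
```

For example, `stfiwx_bytes(fctiwz_model(-2.0))` gives the four bytes `FF FF FF FE`, the big-endian two's-complement encoding of -2.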

## Related Instructions

- [`fctiwx`](fctiwx.md) — 32-bit integer with `FPSCR[RN]` rounding.
- [`fctidx`](fctidx.md), [`fctidzx`](fctidzx.md) — 64-bit integer variants.
- [`fcfidx`](fcfidx.md) — inverse direction (i64 → f64); for i32 → f64, sign-extend to i64 first.
- `stfiwx` — store low 32 bits of FPR; canonical companion.
- [`mffsx`](mffsx.md), [`mtfsfx`](mtfsfx.md) — FPSCR control (no effect on `fctiwz` since rounding mode is fixed to truncation).

## IBM Reference

- [AIX 7.3 — `fctiwz` (Floating Convert to Integer Word with Round Toward Zero)](https://www.ibm.com/docs/en/aix/7.3.0?topic=set-fctiwz-floating-convert-integer-word-round-toward-zero-instruction)
- [PowerISA v2.07B, Book I, Chapter 4 — Floating-Point Processor](https://openpowerfoundation.org/specifications/isa/).
143
migration/project-root/ppc-manual/fpu/fdivsx.md
Normal file
@@ -0,0 +1,143 @@
# `fdivsx` — Floating Divide Single

> **Category:** [Floating-Point](../categories/fpu.md) · **Form:** [A](../forms/A.md) · **Opcode:** `0xec000024`

<!-- GENERATED: BEGIN -->

## Assembler Mnemonics

| Mnemonic | XML entry | Flags | Description |
| --- | --- | --- | --- |
| `fdivs` | `fdivsx` | — | Floating Divide Single |
| `fdivs.` | `fdivsx` | Rc=1 | Floating Divide Single |

## Syntax

```asm
fdivs[Rc] [FD], [FA], [FB]
```

## Encoding

### `fdivsx` — form `A`

- **Opcode word:** `0xec000024`
- **Primary opcode (bits 0–5):** `59`
- **Extended opcode:** `18`
- **Synchronising:** no

| Bits | Field | Meaning |
| --- | --- | --- |
| 0–5 | `OPCD` | primary opcode (59 or 63) |
| 6–10 | `FRT` | destination FPR |
| 11–15 | `FRA` | source A FPR |
| 16–20 | `FRB` | source B FPR |
| 21–25 | `FRC` | source C FPR (multiplier for madd-style ops) |
| 26–30 | `XO` | extended opcode (5 bits) |
| 31 | `Rc` | record-form flag (updates CR1) |

## Operands

| Field | Role | Description |
| --- | --- | --- |
| `FA` | fdivsx: read | Source A floating-point register (`fr0`–`fr31`). |
| `FB` | fdivsx: read | Source B floating-point register. |
| `FD` | fdivsx: write | Destination floating-point register. |
| `CR` | fdivsx: write (conditional) | Condition-register update. When `Rc=1`, CR1 is loaded from `FPSCR[FX, FEX, VX, OX]`. |
| `FPSCR` | fdivsx: write | Floating-Point Status and Control Register. |

## Register Effects

### `fdivsx`

- **Reads (always):** `FA`, `FB`
- **Reads (conditional):** _none_
- **Writes (always):** `FD`, `FPSCR`
- **Writes (conditional):** `CR`

## Status-Register Effects

- `fdivsx`: **CR1** ← FPSCR[FX, FEX, VX, OX] when `Rc=1`; **FPSCR** updated per IEEE-754 flags (FX, FEX, FPRF, FR, FI, exceptions).
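
The `Rc=1` copy into CR1 amounts to moving the top four FPSCR bits into the second CR nibble. The sketch below assumes both registers are held as packed 32-bit words with PowerPC bit 0 as the most significant bit; `cr1_from_fpscr` and the constants are hypothetical names, and xenia-rs's `update_cr1_from_fpscr` may store CR fields differently.

```rust
// FPSCR bits 0..3 in PowerPC numbering, i.e. the top nibble of the word.
pub const FPSCR_FX: u32 = 0x8000_0000;
pub const FPSCR_FEX: u32 = 0x4000_0000;
pub const FPSCR_VX: u32 = 0x2000_0000;
pub const FPSCR_OX: u32 = 0x1000_0000;

/// Return `cr` with field CR1 (PowerPC bits 4..7) replaced by
/// FPSCR[FX, FEX, VX, OX]; all other CR fields are preserved.
pub fn cr1_from_fpscr(cr: u32, fpscr: u32) -> u32 {
    let nibble = (fpscr >> 28) & 0xF;
    (cr & !0x0F00_0000) | (nibble << 24)
}
```

With `FX` and `VX` set, the CR1 nibble becomes `0b1010`, so a subsequent branch on CR1 can test for a new exception or an invalid operation.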

## Operation (pseudocode)

```
; Pseudocode derives directly from the xenia-rs interpreter
; arm (see Implementation References). Operation semantics:
; - Read source operands from the fields listed under Operands.
; - Apply the arithmetic / logical / memory action described
; in the Description field above.
; - Write results to the destination register(s); update any
; status bits enumerated under Status-Register Effects.
; Consult the IBM AIX reference link under IBM Reference for
; canonical PPC-style pseudocode where xenia's expression is
; terse.
```

## C Translation Example

```c
/* C translation: the xenia-rs interpreter arm below in */
/* Implementation References is the authoritative semantic */
/* snapshot. Translate it line-by-line: */
/* - ctx.gpr[N] -> r[N] (or f[]/v[] for FPRs/VRs) */
/* - mem.read_u* / write_u* -> mem_read_u*_be / mem_write_u*_be */
/* - ctx.update_cr_signed(fld, v) -> update_cr_signed(fld, v) */
/* - ctx.xer_ca / xer_ov / xer_so -> xer.CA / xer.OV / xer.SO */
/* The Register Effects and Status-Register Effects tables above */
/* enumerate every side effect a faithful translation must emit. */
```

## Implementation References

**`fdivsx`**
- xenia-canary XML: [`tools/ppc-instructions.xml` — search for `mnem="fdivsx"`](../../xenia-canary/tools/ppc-instructions.xml)
- xenia-canary emit: [`src/xenia/cpu/ppc/ppc_emit_fpu.cc:71`](../../xenia-canary/src/xenia/cpu/ppc/ppc_emit_fpu.cc#L71)
- xenia-rs opcode: [`crates/xenia-cpu/src/opcode.rs:28`](../../xenia-rs/crates/xenia-cpu/src/opcode.rs#L28)
- xenia-rs decoder: [`crates/xenia-cpu/src/decoder.rs:386`](../../xenia-rs/crates/xenia-cpu/src/decoder.rs#L386)
- xenia-rs interpreter: [`crates/xenia-cpu/src/interpreter.rs:2627-2637`](../../xenia-rs/crates/xenia-cpu/src/interpreter.rs#L2627-L2637)
<details><summary>xenia-rs interpreter body (frozen snapshot)</summary>

```rust
PpcOpcode::fdivsx => {
    let a = ctx.fpr[instr.ra()];
    let b = ctx.fpr[instr.rb()];
    fpscr::check_invalid_div(ctx, a, b);
    fpscr::check_zero_divide(ctx, a, b);
    let result = to_single(ctx, a / b);
    ctx.fpr[instr.rd()] = result;
    fpscr::update_after_op(ctx, result, a.is_finite() && b.is_finite() && b != 0.0);
    if instr.rc_bit() { update_cr1_from_fpscr(ctx); }
    ctx.pc += 4;
}
```
</details>

<!-- GENERATED: END -->

## Special Cases & Edge Conditions

- **Single precision.** Result is rounded to IEEE-754 binary32 then re-encoded into the 64-bit FPR. xenia computes `to_single(ctx, a / b)`.
- **Divide by zero.** A finite, non-zero dividend over ±0 sets `FPSCR[ZX, FX]` and yields ±∞. xenia returns the host ±∞, and the snapshot calls `fpscr::check_zero_divide(ctx, a, b)` beforehand; whether `ZX` is actually raised depends on that helper, whose body is not part of the snapshot.
- **`0 / 0`** → `FPSCR[VXZDZ, VX, FX]`, quiet NaN result.
- **`±∞ / ±∞`** → `FPSCR[VXIDI, VX, FX]`, quiet NaN result.
- **FPSCR side effects.** Hardware updates `FPRF`, `FR`, `FI`, `FX`, plus exception bits `OX`, `UX`, `XX`, `ZX`, `VXZDZ`, `VXIDI`, `VXSNAN`. In the snapshot these are delegated to `fpscr::check_invalid_div`, `fpscr::check_zero_divide`, and `fpscr::update_after_op`.
- **`Rc=1` (`fdivs.`)** copies `FPSCR[FX, FEX, VX, OX]` into CR1.
- **NaN propagation.** Quiet-NaN result for any NaN operand; signalling NaNs are quietened.
- **Single-precision overflow** returns ±∞ and sets `OX`/`XX`/`FX`.
- **Performance.** Hardware divide is multi-cycle. Title code commonly uses `fres` + Newton-Raphson for hot loops, leaving `fdivs` for non-critical paths.
- **Denormal flush.** Xenon boots with `FPSCR[NI]=1`; xenia uses host IEEE behavior.
- **Encoding.** A-form, primary 59, XO 18.
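
The operand cases listed above can be separated with a small classifier, sketched here as an illustration. `DivCase`, `classify_div`, and `fdivs_model` are hypothetical names rather than xenia-rs helpers, and `fdivs_model` assumes that `to_single` behaves like a plain binary64-to-binary32 narrowing under round-to-nearest.

```rust
/// Which FPSCR exception class a divide falls into (hypothetical model type).
#[derive(Debug, PartialEq, Eq)]
pub enum DivCase {
    Ordinary,   // no divide-specific exception
    ZeroDivide, // finite non-zero / ±0  -> ZX
    ZeroByZero, // ±0 / ±0               -> VXZDZ
    InfByInf,   // ±inf / ±inf           -> VXIDI
    NanOperand, // NaN in, quiet NaN out
}

pub fn classify_div(a: f64, b: f64) -> DivCase {
    if a.is_nan() || b.is_nan() {
        DivCase::NanOperand
    } else if a == 0.0 && b == 0.0 {
        DivCase::ZeroByZero
    } else if a.is_infinite() && b.is_infinite() {
        DivCase::InfByInf
    } else if b == 0.0 && a.is_finite() {
        DivCase::ZeroDivide
    } else {
        DivCase::Ordinary
    }
}

/// Divide at binary64, narrow to binary32, and widen back into the FPR format.
pub fn fdivs_model(a: f64, b: f64) -> f64 {
    ((a / b) as f32) as f64
}
```

Note that `∞ / 0` is classified as `Ordinary`: the quotient is an exact infinity, so neither `ZX` nor an invalid-operation bit applies.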

## Related Instructions

- [`fdivx`](fdivx.md) — double-precision sibling.
- [`fresx`](fresx.md) — reciprocal estimate, used to build software divides.
- [`fmulsx`](fmulsx.md), [`faddsx`](faddsx.md), [`fsubsx`](fsubsx.md) — companion single-precision arithmetic.
- [`fmaddsx`](fmaddsx.md), [`fnmsubsx`](fnmsubsx.md) — Newton-Raphson refinement helpers.
- [`frspx`](frspx.md) — explicit double→single rounding.

## IBM Reference

- [AIX 7.3 — `fdivs` (Floating Divide Single)](https://www.ibm.com/docs/en/aix/7.3.0?topic=set-fdivs-floating-divide-single-instruction)
- [PowerISA v2.07B, Book I, Chapter 4 — Floating-Point Processor](https://openpowerfoundation.org/specifications/isa/).
133
migration/project-root/ppc-manual/fpu/fdivx.md
Normal file
@@ -0,0 +1,133 @@
# `fdivx` — Floating Divide

> **Category:** [Floating-Point](../categories/fpu.md) · **Form:** [A](../forms/A.md) · **Opcode:** `0xfc000024`

<!-- GENERATED: BEGIN -->

## Assembler Mnemonics

| Mnemonic | XML entry | Flags | Description |
| --- | --- | --- | --- |
| `fdiv` | `fdivx` | — | Floating Divide |
| `fdiv.` | `fdivx` | Rc=1 | Floating Divide |

## Syntax

```asm
fdiv[Rc] [FD], [FA], [FB]
```

## Encoding

### `fdivx` — form `A`

- **Opcode word:** `0xfc000024`
- **Primary opcode (bits 0–5):** `63`
- **Extended opcode:** `18`
- **Synchronising:** no

| Bits | Field | Meaning |
| --- | --- | --- |
| 0–5 | `OPCD` | primary opcode (59 or 63) |
| 6–10 | `FRT` | destination FPR |
| 11–15 | `FRA` | source A FPR |
| 16–20 | `FRB` | source B FPR |
| 21–25 | `FRC` | source C FPR (multiplier for madd-style ops) |
| 26–30 | `XO` | extended opcode (5 bits) |
| 31 | `Rc` | record-form flag (updates CR1) |

## Operands

| Field | Role | Description |
| --- | --- | --- |
| `FA` | fdivx: read | Source A floating-point register (`fr0`–`fr31`). |
| `FB` | fdivx: read | Source B floating-point register. |
| `FD` | fdivx: write | Destination floating-point register. |
| `CR` | fdivx: write (conditional) | Condition-register update. When `Rc=1`, CR1 is loaded from `FPSCR[FX, FEX, VX, OX]`. |
| `FPSCR` | fdivx: write | Floating-Point Status and Control Register. |

## Register Effects

### `fdivx`

- **Reads (always):** `FA`, `FB`
- **Reads (conditional):** _none_
- **Writes (always):** `FD`, `FPSCR`
- **Writes (conditional):** `CR`

## Status-Register Effects

- `fdivx`: **CR1** ← FPSCR[FX, FEX, VX, OX] when `Rc=1`; **FPSCR** updated per IEEE-754 flags (FX, FEX, FPRF, FR, FI, exceptions).

## Operation (pseudocode)

```
FRT <- FRA ÷ FRB
```

## C Translation Example

```c
/* C translation: the xenia-rs interpreter arm below in */
/* Implementation References is the authoritative semantic */
/* snapshot. Translate it line-by-line: */
/* - ctx.gpr[N] -> r[N] (or f[]/v[] for FPRs/VRs) */
/* - mem.read_u* / write_u* -> mem_read_u*_be / mem_write_u*_be */
/* - ctx.update_cr_signed(fld, v) -> update_cr_signed(fld, v) */
/* - ctx.xer_ca / xer_ov / xer_so -> xer.CA / xer.OV / xer.SO */
/* The Register Effects and Status-Register Effects tables above */
/* enumerate every side effect a faithful translation must emit. */
```

## Implementation References

**`fdivx`**
- xenia-canary XML: [`tools/ppc-instructions.xml` — search for `mnem="fdivx"`](../../xenia-canary/tools/ppc-instructions.xml)
- xenia-canary emit: [`src/xenia/cpu/ppc/ppc_emit_fpu.cc:55`](../../xenia-canary/src/xenia/cpu/ppc/ppc_emit_fpu.cc#L55)
- xenia-rs opcode: [`crates/xenia-cpu/src/opcode.rs:28`](../../xenia-rs/crates/xenia-cpu/src/opcode.rs#L28)
- xenia-rs decoder: [`crates/xenia-cpu/src/decoder.rs:920`](../../xenia-rs/crates/xenia-cpu/src/decoder.rs#L920)
- xenia-rs interpreter: [`crates/xenia-cpu/src/interpreter.rs:2616-2626`](../../xenia-rs/crates/xenia-cpu/src/interpreter.rs#L2616-L2626)
<details><summary>xenia-rs interpreter body (frozen snapshot)</summary>

```rust
PpcOpcode::fdivx => {
    let a = ctx.fpr[instr.ra()];
    let b = ctx.fpr[instr.rb()];
    fpscr::check_invalid_div(ctx, a, b);
    fpscr::check_zero_divide(ctx, a, b);
    let result = a / b;
    ctx.fpr[instr.rd()] = result;
    fpscr::update_after_op(ctx, result, a.is_finite() && b.is_finite() && b != 0.0);
    if instr.rc_bit() { update_cr1_from_fpscr(ctx); }
    ctx.pc += 4;
}
```
</details>

<!-- GENERATED: END -->

## Special Cases & Edge Conditions

- **Double precision.** Operates on IEEE-754 binary64; [`fdivsx`](fdivsx.md) is the single-precision sibling.
- **Divide by zero.** `FRA / ±0` (with `FRA` finite and non-zero) sets `FPSCR[ZX, FX]` and produces a correctly signed infinity. xenia-rs gets the same ±∞ from host `f64` division, and the snapshot calls `fpscr::check_zero_divide(ctx, a, b)` first. Whether title code that polls FPSCR observes `ZX` therefore depends on that helper, whose body is not part of the snapshot.
- **`0 / 0`** sets `FPSCR[VXZDZ, VX, FX]` and yields a quiet NaN.
- **`±∞ / ±∞`** sets `FPSCR[VXIDI, VX, FX]` and yields a quiet NaN.
- **FPSCR side effects.** Hardware updates `FPRF`, `FR`, `FI`, `FX` plus exception bits `OX`, `UX`, `XX`, `ZX`, `VXZDZ`, `VXIDI`, `VXSNAN`. In the snapshot these are delegated to `fpscr::check_invalid_div`, `fpscr::check_zero_divide`, and `fpscr::update_after_op`.
- **`Rc=1` (`fdiv.`)** copies `FPSCR[FX, FEX, VX, OX]` into CR1.
- **NaN propagation.** Quiet-NaN result for any NaN operand; signalling NaNs are quietened.
- **Performance.** Hardware divide is multi-cycle and not pipelined on Xenon. Many titles prefer `fres`/`frsqrte` followed by Newton-Raphson refinement (or by `fmadd` chains) to avoid the divider.
- **Denormal flush.** Xenon boots with `FPSCR[NI]=1`; xenia uses host IEEE.
- **Encoding.** A-form, primary 63, XO 18. `FRC` is don't-care.
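
The Newton-Raphson refinement mentioned under Performance can be sketched as follows. This is a generic illustration of the technique, not code taken from any title or from xenia-rs: `refine_recip` and `approx_div` are hypothetical names, the initial `estimate` stands in for the output of `fres`, and the result is an approximation that is not guaranteed to match the correctly rounded quotient `fdiv` produces.

```rust
/// One Newton-Raphson step for x ~ 1/b:  x' = x + x * (1 - b * x).
pub fn refine_recip(b: f64, x: f64) -> f64 {
    let err = (-b).mul_add(x, 1.0); // 1 - b*x, the shape of an `fnmsub`
    x.mul_add(err, x)               // x + x*err, the shape of an `fmadd`
}

/// Approximate a / b from a reciprocal estimate refined `steps` times.
pub fn approx_div(a: f64, b: f64, estimate: f64, steps: u32) -> f64 {
    let mut x = estimate;
    for _ in 0..steps {
        x = refine_recip(b, x);
    }
    a * x
}
```

Each step roughly squares the relative error of the estimate, so a starting error of 10% falls below binary64 precision after about five steps. Zero, infinite, and NaN divisors are not handled by this sketch.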

## Related Instructions

- [`fdivsx`](fdivsx.md) — single-precision divide.
- [`fresx`](fresx.md) — reciprocal estimate `~1/FRB`; combined with `fmul`/`fmadd` to implement reciprocal divides.
- [`fmulx`](fmulx.md), [`faddx`](faddx.md), [`fsubx`](fsubx.md) — companion arithmetic.
- [`fmaddx`](fmaddx.md), [`fnmsubx`](fnmsubx.md) — used in Newton-Raphson refinement steps.
- [`mffsx`](mffsx.md), [`mtfsfx`](mtfsfx.md) — FPSCR control (rounding mode, exception masks).

## IBM Reference

- [AIX 7.3 — `fdiv` (Floating Divide)](https://www.ibm.com/docs/en/aix/7.3.0?topic=set-fd-fdiv-floating-divide-instruction)
- [PowerISA v2.07B, Book I, Chapter 4 — Floating-Point Processor](https://openpowerfoundation.org/specifications/isa/) (divide-by-zero and invalid-operation rules).
147
migration/project-root/ppc-manual/fpu/fmaddsx.md
Normal file
@@ -0,0 +1,147 @@
# `fmaddsx` — Floating Multiply-Add Single

> **Category:** [Floating-Point](../categories/fpu.md) · **Form:** [A](../forms/A.md) · **Opcode:** `0xec00003a`

<!-- GENERATED: BEGIN -->

## Assembler Mnemonics

| Mnemonic | XML entry | Flags | Description |
| --- | --- | --- | --- |
| `fmadds` | `fmaddsx` | — | Floating Multiply-Add Single |
| `fmadds.` | `fmaddsx` | Rc=1 | Floating Multiply-Add Single |

## Syntax

```asm
fmadds[Rc] [FD], [FA], [FC], [FB]
```

## Encoding

### `fmaddsx` — form `A`

- **Opcode word:** `0xec00003a`
- **Primary opcode (bits 0–5):** `59`
- **Extended opcode:** `29`
- **Synchronising:** no

| Bits | Field | Meaning |
| --- | --- | --- |
| 0–5 | `OPCD` | primary opcode (59 or 63) |
| 6–10 | `FRT` | destination FPR |
| 11–15 | `FRA` | source A FPR |
| 16–20 | `FRB` | source B FPR |
| 21–25 | `FRC` | source C FPR (multiplier for madd-style ops) |
| 26–30 | `XO` | extended opcode (5 bits) |
| 31 | `Rc` | record-form flag (updates CR1) |
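
The A-form field layout in the table can be unpacked with plain shifts and masks. The sketch below is illustrative: `AForm` and `decode_a_form` are hypothetical names and are not the xenia-rs decoder. PowerPC numbers bits from the most significant end, so the field at bits 6–10 sits at a right shift of 21 in a conventional 32-bit word.

```rust
/// Decoded A-form fields (hypothetical model type).
#[derive(Debug, PartialEq, Eq)]
pub struct AForm {
    pub opcd: u32,
    pub frt: u32,
    pub fra: u32,
    pub frb: u32,
    pub frc: u32,
    pub xo: u32,
    pub rc: bool,
}

/// Split a 32-bit instruction word into its A-form fields.
pub fn decode_a_form(word: u32) -> AForm {
    AForm {
        opcd: word >> 26,           // bits 0-5
        frt: (word >> 21) & 0x1F,   // bits 6-10
        fra: (word >> 16) & 0x1F,   // bits 11-15
        frb: (word >> 11) & 0x1F,   // bits 16-20
        frc: (word >> 6) & 0x1F,    // bits 21-25
        xo: (word >> 1) & 0x1F,     // bits 26-30
        rc: (word & 1) != 0,        // bit 31
    }
}
```

As a usage example, the word `0xEC2220FA` decodes to primary opcode 59, `FRT=1`, `FRA=2`, `FRB=4`, `FRC=3`, XO 29, `Rc=0`, which corresponds to `fmadds f1, f2, f3, f4` given the `FD, FA, FC, FB` assembler order.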

## Operands

| Field | Role | Description |
| --- | --- | --- |
| `FA` | fmaddsx: read | Source A floating-point register (`fr0`–`fr31`). |
| `FC` | fmaddsx: read | Source C floating-point register (for madd-style ops). |
| `FB` | fmaddsx: read | Source B floating-point register. |
| `FD` | fmaddsx: write | Destination floating-point register. |
| `CR` | fmaddsx: write (conditional) | Condition-register update. When `Rc=1`, CR1 is loaded from `FPSCR[FX, FEX, VX, OX]`. |
| `FPSCR` | fmaddsx: write | Floating-Point Status and Control Register. |

## Register Effects

### `fmaddsx`

- **Reads (always):** `FA`, `FC`, `FB`
- **Reads (conditional):** _none_
- **Writes (always):** `FD`, `FPSCR`
- **Writes (conditional):** `CR`

## Status-Register Effects

- `fmaddsx`: **CR1** ← FPSCR[FX, FEX, VX, OX] when `Rc=1`; **FPSCR** updated per IEEE-754 flags (FX, FEX, FPRF, FR, FI, exceptions).

## Operation (pseudocode)

```
; Pseudocode derives directly from the xenia-rs interpreter
; arm (see Implementation References). Operation semantics:
; - Read source operands from the fields listed under Operands.
; - Apply the arithmetic / logical / memory action described
; in the Description field above.
; - Write results to the destination register(s); update any
; status bits enumerated under Status-Register Effects.
; Consult the IBM AIX reference link under IBM Reference for
; canonical PPC-style pseudocode where xenia's expression is
; terse.
```

## C Translation Example

```c
/* C translation: the xenia-rs interpreter arm below in */
/* Implementation References is the authoritative semantic */
/* snapshot. Translate it line-by-line: */
/* - ctx.gpr[N] -> r[N] (or f[]/v[] for FPRs/VRs) */
/* - mem.read_u* / write_u* -> mem_read_u*_be / mem_write_u*_be */
/* - ctx.update_cr_signed(fld, v) -> update_cr_signed(fld, v) */
/* - ctx.xer_ca / xer_ov / xer_so -> xer.CA / xer.OV / xer.SO */
/* The Register Effects and Status-Register Effects tables above */
/* enumerate every side effect a faithful translation must emit. */
```

## Implementation References

**`fmaddsx`**
- xenia-canary XML: [`tools/ppc-instructions.xml` — search for `mnem="fmaddsx"`](../../xenia-canary/tools/ppc-instructions.xml)
- xenia-canary emit: [`src/xenia/cpu/ppc/ppc_emit_fpu.cc:190`](../../xenia-canary/src/xenia/cpu/ppc/ppc_emit_fpu.cc#L190)
- xenia-rs opcode: [`crates/xenia-cpu/src/opcode.rs:28`](../../xenia-rs/crates/xenia-cpu/src/opcode.rs#L28)
- xenia-rs decoder: [`crates/xenia-cpu/src/decoder.rs:393`](../../xenia-rs/crates/xenia-cpu/src/decoder.rs#L393)
- xenia-rs interpreter: [`crates/xenia-cpu/src/interpreter.rs:2653-2665`](../../xenia-rs/crates/xenia-cpu/src/interpreter.rs#L2653-L2665)
<details><summary>xenia-rs interpreter body (frozen snapshot)</summary>

```rust
PpcOpcode::fmaddsx => {
    // PPCBUG-181: missing VXISI on add step.
    let a = ctx.fpr[instr.ra()];
    let c = ctx.fpr[instr.rc()];
    let b = ctx.fpr[instr.rb()];
    fpscr::check_invalid_mul(ctx, a, c);
    fpscr::check_invalid_fma_add(ctx, a, c, b, false);
    let result = to_single(ctx, a.mul_add(c, b));
    ctx.fpr[instr.rd()] = result;
    fpscr::update_after_op(ctx, result, a.is_finite() && b.is_finite() && c.is_finite());
    if instr.rc_bit() { update_cr1_from_fpscr(ctx); }
    ctx.pc += 4;
}
```
</details>

<!-- GENERATED: END -->

## Special Cases & Edge Conditions
|
||||
|
||||
- **Single rounding step then single-precision rounding.** PowerISA semantics: compute `(FRA × FRC) + FRB` to infinite precision, then round once to binary32. xenia-rs implements this as `to_single(a.mul_add(c, b))` — the `mul_add` is the single-step fused multiply-add at double precision, then `to_single` rounds the binary64 result to binary32. This matches PPC's "single rounding" requirement because the intermediate `mul_add` is already exact-rounded.
|
||||
- **Operand order.** Assembler: `FD, FA, FC, FB` (multiplier `FRC` before addend `FRB`).
|
||||
- **Invalid operations.** `0×∞ + finite` → `VXIMZ`; opposite-signed-∞ collision → `VXISI`. Quiet NaN result with `FPSCR[VX, FX]`.
|
||||
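- **FPSCR side effects.** Hardware updates `FPRF`, `FR`, `FI`, `FX`, `OX`, `UX`, `XX`, `VXIMZ`, `VXISI`, `VXSNAN`. The frozen snapshot above calls `fpscr::check_invalid_mul`, `fpscr::check_invalid_fma_add`, and `fpscr::update_after_op`, so xenia-rs models at least part of this; which of these bits it leaves untouched is not visible from the snapshot (xenia quirk).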
|
||||
- **`Rc=1` (`fmadds.`)** copies `FPSCR[FX, FEX, VX, OX]` into CR1.
|
||||
- **NaN propagation.** Quiet-NaN result for any NaN operand; signalling NaNs are quietened.
|
||||
- **Single-precision overflow** of the final rounded result returns ±∞ and sets `OX`/`XX`/`FX`.
|
||||
- **Use case.** Dominates single-precision graphics math: matrix–vector multiplies, dot products, lighting equations, normal-map blending. Xbox 360 titles emit `fmadds` constantly.
|
||||
- **Denormal flush.** Xenon boots with `FPSCR[NI]=1`; xenia uses host IEEE behavior.
|
||||
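
The `to_single(a.mul_add(c, b))` shape from the first bullet can be sketched in isolation. This is a minimal, value-only sketch and not the xenia-rs code: `round_to_single` is a hypothetical stand-in for `to_single` that relies on Rust's `as f32` cast (round-to-nearest-even), and nothing here touches FPSCR or CR1.

```rust
/// Hypothetical stand-in for xenia-rs's `to_single`: round a binary64 value
/// to binary32 (round-to-nearest-even) and widen it back for the 64-bit FPR.
fn round_to_single(x: f64) -> f64 {
    (x as f32) as f64
}

/// Value-only model of `fmadds`: a fused multiply-add rounded once to
/// binary64, followed by a second rounding to binary32.
/// FPSCR and CR1 side effects are deliberately not modelled.
fn fmadds_value(a: f64, c: f64, b: f64) -> f64 {
    round_to_single(a.mul_add(c, b))
}
```

With exactly representable inputs, `fmadds_value(1.5, 2.0, 0.25)` is `3.25`. A result beyond the binary32 range, such as `fmadds_value(3.0e38, 2.0, 0.0)`, comes back as `+∞`, which is the single-precision overflow case listed above.
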
|
||||
## Related Instructions
|
||||
|
||||
- [`fmaddx`](fmaddx.md) — double-precision sibling.
|
||||
- [`fmsubsx`](fmsubsx.md), [`fnmaddsx`](fnmaddsx.md), [`fnmsubsx`](fnmsubsx.md) — single-precision fused-multiply siblings:
|
||||
- `fmsubs` = `(A×C) − B`
|
||||
- `fnmadds` = `−((A×C) + B)`
|
||||
- `fnmsubs` = `−((A×C) − B)`
|
||||
- [`fmulsx`](fmulsx.md), [`faddsx`](faddsx.md) — non-fused decomposition.
|
||||
- [`fresx`](fresx.md), [`frsqrtex`](frsqrtex.md) — reciprocal helpers; Newton-Raphson refinement uses `fmadds`/`fnmsubs`.
|
||||
- [`frspx`](frspx.md) — explicit double→single rounding.
|
||||
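
The sibling formulas listed above can be written side by side with `f64::mul_add`. This is a value-only sketch with hypothetical function names; the single-precision forms would additionally round the result to binary32, and FPSCR/CR1 updates are not modelled.

```rust
/// (A×C) + B, rounded once.
fn fmadd(a: f64, c: f64, b: f64) -> f64 {
    a.mul_add(c, b)
}

/// (A×C) − B, rounded once.
fn fmsub(a: f64, c: f64, b: f64) -> f64 {
    a.mul_add(c, -b)
}

/// −((A×C) + B). Note: hardware does not flip the sign of a NaN result;
/// this sketch negates unconditionally and so ignores that detail.
fn fnmadd(a: f64, c: f64, b: f64) -> f64 {
    -(a.mul_add(c, b))
}

/// −((A×C) − B), with the same NaN-sign caveat as `fnmadd`.
fn fnmsub(a: f64, c: f64, b: f64) -> f64 {
    -(a.mul_add(c, -b))
}
```

For `a = 2`, `c = 3`, `b = 1` the four functions return `7`, `5`, `-7`, and `-5` respectively.
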
|
||||
## IBM Reference
|
||||
|
||||
- [AIX 7.3 — `fmadds` (Floating Multiply-Add Single)](https://www.ibm.com/docs/en/aix/7.3.0?topic=set-fmadds-floating-multiply-add-single-instruction)
|
||||
- [PowerISA v2.07B, Book I, Chapter 4 — Floating-Point Processor](https://openpowerfoundation.org/specifications/isa/).
|
||||
136
migration/project-root/ppc-manual/fpu/fmaddx.md
Normal file
@@ -0,0 +1,136 @@
|
||||
# `fmaddx` — Floating Multiply-Add
|
||||
|
||||
> **Category:** [Floating-Point](../categories/fpu.md) · **Form:** [A](../forms/A.md) · **Opcode:** `0xfc00003a`
|
||||
|
||||
<!-- GENERATED: BEGIN -->
|
||||
|
||||
## Assembler Mnemonics
|
||||
|
||||
| Mnemonic | XML entry | Flags | Description |
|
||||
| --- | --- | --- | --- |
|
||||
| `fmadd` | `fmaddx` | — | Floating Multiply-Add |
|
||||
| `fmadd.` | `fmaddx` | Rc=1 | Floating Multiply-Add |
|
||||
|
||||
## Syntax
|
||||
|
||||
```asm
|
||||
fmadd[Rc] [FD], [FA], [FC], [FB]
|
||||
```
|
||||
|
||||
## Encoding
|
||||
|
||||
### `fmaddx` — form `A`
|
||||
|
||||
- **Opcode word:** `0xfc00003a`
|
||||
- **Primary opcode (bits 0–5):** `63`
|
||||
- **Extended opcode:** `29`
|
||||
- **Synchronising:** no
|
||||
|
||||
| Bits | Field | Meaning |
|
||||
| --- | --- | --- |
|
||||
| 0–5 | `OPCD` | primary opcode (59 or 63) |
|
||||
| 6–10 | `FRT` | destination FPR |
|
||||
| 11–15 | `FRA` | source A FPR |
|
||||
| 16–20 | `FRB` | source B FPR |
|
||||
| 21–25 | `FRC` | source C FPR (multiplier for madd-style ops) |
|
||||
| 26–30 | `XO` | extended opcode (5 bits) |
|
||||
| 31 | `Rc` | record-form flag (updates CR1) |
|
||||
|
||||
## Operands
|
||||
|
||||
| Field | Role | Description |
|
||||
| --- | --- | --- |
|
||||
| `FA` | fmaddx: read | Source A floating-point register (`fr0`–`fr31`). |
|
||||
| `FC` | fmaddx: read | Source C floating-point register (for madd-style ops). |
|
||||
| `FB` | fmaddx: read | Source B floating-point register. |
|
||||
| `FD` | fmaddx: write | Destination floating-point register. |
|
||||
| `CR` | fmaddx: write (conditional) | Condition-register update. When `Rc=1`, CR field 0 (or CR6 for vector compares, CR1 for FPU) is updated from the result. |
|
||||
| `FPSCR` | fmaddx: write | Floating-Point Status and Control Register. |
|
||||
|
||||
## Register Effects
|
||||
|
||||
### `fmaddx`
|
||||
|
||||
- **Reads (always):** `FA`, `FC`, `FB`
|
||||
- **Reads (conditional):** _none_
|
||||
- **Writes (always):** `FD`, `FPSCR`
|
||||
- **Writes (conditional):** `CR`
|
||||
|
||||
## Status-Register Effects
|
||||
|
||||
- `fmaddx`: **CR1** ← FPSCR[FX, FEX, VX, OX] when `Rc=1`; **FPSCR** updated per IEEE-754 flags (FX, FEX, FPRF, FR, FI, exceptions).
|
||||
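
The CR1 update for `Rc=1` amounts to a four-bit copy. A minimal sketch, assuming both registers are held as plain `u32` values with PowerPC bit 0 as the most significant bit; `cr1_from_fpscr` is a hypothetical helper, not the `update_cr1_from_fpscr(ctx)` that xenia-rs calls.

```rust
/// Copy FPSCR[FX, FEX, VX, OX] into CR field 1 and return the new CR.
///
/// With bit 0 as the most significant bit, FPSCR bits 0..=3 are the top
/// nibble of the word, and CR1 (CR bits 4..=7) occupies host bits 27..=24.
fn cr1_from_fpscr(cr: u32, fpscr: u32) -> u32 {
    let nibble = fpscr >> 28; // FX, FEX, VX, OX
    (cr & !(0xFu32 << 24)) | (nibble << 24)
}
```

Only CR1 changes: `cr1_from_fpscr(0xFFFF_FFFF, 0)` clears just that field and leaves the other seven CR fields as they were.
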
|
||||
## Operation (pseudocode)
|
||||
|
||||
```
|
||||
FRT <- (FRA × FRC) + FRB
|
||||
```
|
||||
|
||||
## C Translation Example
|
||||
|
||||
```c
|
||||
/* C translation: the xenia-rs interpreter arm below in */
|
||||
/* Implementation References is the authoritative semantic */
|
||||
/* snapshot. Translate it line-by-line: */
|
||||
/* - ctx.gpr[N] -> r[N] (or f[]/v[] for FPRs/VRs) */
|
||||
/*   - mem.read_uN / write_uN  -> mem_read_uN_be / mem_write_uN_be */
|
||||
/* - ctx.update_cr_signed(fld, v) -> update_cr_signed(fld, v) */
|
||||
/* - ctx.xer_ca / xer_ov / xer_so -> xer.CA / xer.OV / xer.SO */
|
||||
/* The Register Effects and Status-Register Effects tables above */
|
||||
/* enumerate every side effect a faithful translation must emit. */
|
||||
```
|
||||
|
||||
## Implementation References
|
||||
|
||||
**`fmaddx`**
|
||||
- xenia-canary XML: [`tools/ppc-instructions.xml` — search for `mnem="fmaddx"`](../../xenia-canary/tools/ppc-instructions.xml)
|
||||
- xenia-canary emit: [`src/xenia/cpu/ppc/ppc_emit_fpu.cc:186`](../../xenia-canary/src/xenia/cpu/ppc/ppc_emit_fpu.cc#L186)
|
||||
- xenia-rs opcode: [`crates/xenia-cpu/src/opcode.rs:28`](../../xenia-rs/crates/xenia-cpu/src/opcode.rs#L28)
|
||||
- xenia-rs decoder: [`crates/xenia-cpu/src/decoder.rs:928`](../../xenia-rs/crates/xenia-cpu/src/decoder.rs#L928)
|
||||
- xenia-rs interpreter: [`crates/xenia-cpu/src/interpreter.rs:2640-2652`](../../xenia-rs/crates/xenia-cpu/src/interpreter.rs#L2640-L2652)
|
||||
<details><summary>xenia-rs interpreter body (frozen snapshot)</summary>
|
||||
|
||||
```rust
|
||||
PpcOpcode::fmaddx => {
|
||||
// PPCBUG-202: VXISI from input properties (not from `a*c` which has wrong sign on overflow).
|
||||
let a = ctx.fpr[instr.ra()];
|
||||
let c = ctx.fpr[instr.rc()];
|
||||
let b = ctx.fpr[instr.rb()];
|
||||
fpscr::check_invalid_mul(ctx, a, c);
|
||||
fpscr::check_invalid_fma_add(ctx, a, c, b, false);
|
||||
let result = a.mul_add(c, b);
|
||||
ctx.fpr[instr.rd()] = result;
|
||||
fpscr::update_after_op(ctx, result, a.is_finite() && b.is_finite() && c.is_finite());
|
||||
if instr.rc_bit() { update_cr1_from_fpscr(ctx); }
|
||||
ctx.pc += 4;
|
||||
}
|
||||
```
|
||||
</details>
|
||||
|
||||
<!-- GENERATED: END -->
|
||||
|
||||
## Special Cases & Edge Conditions
|
||||
|
||||
- **Single rounding step.** `fmadd` computes `(FRA × FRC) + FRB` with one IEEE-754 rounding at the end, so it is never less accurate than a separate multiply and add, which round twice. xenia-rs uses Rust's `f64::mul_add`, which is documented to compute the result with a single rounding. It typically compiles to a hardware FMA instruction when the build target enables one (for example x86_64 with the FMA feature, or AArch64) and otherwise to a slower software `fma` routine, so the semantic match is preserved either way.
|
||||
- **Operand layout.** A-form: `FRT, FRA, FRC, FRB`. Note the assembler order — `FRC` (multiplier) comes before `FRB` (addend). Encoding bit fields are `FRA` (11–15), `FRB` (16–20), `FRC` (21–25).
|
||||
- **Invalid operations.** `0×∞ + finite` → `VXIMZ`; `∞×x + ∓∞` (after multiplication produces ±∞ that opposes addend sign) → `VXISI`. Quiet NaN result with `FPSCR[VX, FX]` set.
|
||||
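- **FPSCR side effects.** Hardware updates `FPRF`, `FR`, `FI`, `FX`, `OX`, `UX`, `XX`, `VXIMZ`, `VXISI`, `VXSNAN`. The frozen snapshot above calls `fpscr::check_invalid_mul`, `fpscr::check_invalid_fma_add`, and `fpscr::update_after_op`, so xenia-rs models at least part of this; which of these bits it leaves untouched is not visible from the snapshot (xenia quirk).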
|
||||
- **`Rc=1` (`fmadd.`)** copies `FPSCR[FX, FEX, VX, OX]` into CR1.
|
||||
- **NaN propagation.** Quiet-NaN result for any NaN operand; signalling NaNs are quietened.
|
||||
- **Use case.** Dot products, polynomial evaluation (Horner's method), matrix multiplies, Newton-Raphson divide/sqrt refinement. Hot-path PPC code is dense with `fmadd`.
|
||||
- **Denormal flush.** Xenon boots with `FPSCR[NI]=1`; xenia uses host IEEE behavior.
|
||||
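
The A-form bit layout described under "Operand layout" can be checked with a small decoder. This is an illustrative sketch: `AForm`, `field`, and `decode_a_form` are hypothetical names and do not mirror the xenia-rs decoder.

```rust
/// Decoded A-form fields, using PowerPC bit numbering (bit 0 = MSB).
#[derive(Debug, PartialEq)]
struct AForm {
    opcd: u32, // bits 0..=5
    frt: u32,  // bits 6..=10
    fra: u32,  // bits 11..=15
    frb: u32,  // bits 16..=20 (addend)
    frc: u32,  // bits 21..=25 (multiplier)
    xo: u32,   // bits 26..=30
    rc: bool,  // bit 31
}

/// Extract the inclusive big-endian bit range `start..=end` from `word`.
fn field(word: u32, start: u32, end: u32) -> u32 {
    let width = end - start + 1;
    (word >> (31 - end)) & ((1u32 << width) - 1)
}

fn decode_a_form(word: u32) -> AForm {
    AForm {
        opcd: field(word, 0, 5),
        frt: field(word, 6, 10),
        fra: field(word, 11, 15),
        frb: field(word, 16, 20),
        frc: field(word, 21, 25),
        xo: field(word, 26, 30),
        rc: field(word, 31, 31) == 1,
    }
}
```

For example, `fmadd f1, f2, f3, f4` (assembler order `FRT, FRA, FRC, FRB`) encodes as `0xFC2220FA`, and decoding it yields `frc = 3` and `frb = 4`, which makes the swap between assembler order and encoding order visible.
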
|
||||
## Related Instructions
|
||||
|
||||
- [`fmaddsx`](fmaddsx.md) — single-precision sibling.
|
||||
- [`fmsubx`](fmsubx.md), [`fnmaddx`](fnmaddx.md), [`fnmsubx`](fnmsubx.md) — the other three fused multiply-add variants:
|
||||
- `fmsub` = `(A×C) − B`
|
||||
- `fnmadd` = `−((A×C) + B)`
|
||||
- `fnmsub` = `−((A×C) − B)`
|
||||
- [`fmulx`](fmulx.md), [`faddx`](faddx.md) — non-fused decomposition (two rounding steps; less precise).
|
||||
- [`fresx`](fresx.md), [`frsqrtex`](frsqrtex.md) — reciprocal helpers refined by `fmadd`/`fnmsub`.
|
||||
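
The difference between the fused form and the non-fused decomposition shows up whenever the exact product needs more than 53 significant bits. A minimal sketch, assuming the host's default round-to-nearest mode; the function names and the chosen inputs are illustrative only.

```rust
/// Two roundings: the product is rounded to binary64 before the add.
fn unfused(a: f64, c: f64, b: f64) -> f64 {
    a * c + b
}

/// One rounding: the exact product feeds the add.
fn fused(a: f64, c: f64, b: f64) -> f64 {
    a.mul_add(c, b)
}

/// Inputs chosen so that a*c = 1 - 2^-54 exactly, which is not
/// representable in binary64 and rounds to 1.0 on its own.
fn demo_inputs() -> (f64, f64, f64) {
    let eps = 1.0 / 134_217_728.0; // 2^-27, exact
    (1.0 + eps, 1.0 - eps, -1.0)
}
```

With these inputs `unfused` returns `0.0` because the rounded product is exactly `1.0`, while `fused` returns `-2^-54`, the exact value of `(a×c) − 1`.
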
|
||||
## IBM Reference
|
||||
|
||||
- [AIX 7.3 — `fmadd` (Floating Multiply-Add)](https://www.ibm.com/docs/en/aix/7.3.0?topic=set-fma-fmadd-floating-multiply-add-instruction)
|
||||
- [PowerISA v2.07B, Book I, Chapter 4 — Floating-Point Processor](https://openpowerfoundation.org/specifications/isa/) (single-rounding fused multiply-add definition).
|
||||
120
migration/project-root/ppc-manual/fpu/fmrx.md
Normal file
@@ -0,0 +1,120 @@
|
||||
# `fmrx` — Floating Move Register
|
||||
|
||||
> **Category:** [Floating-Point](../categories/fpu.md) · **Form:** [X](../forms/X.md) · **Opcode:** `0xfc000090`
|
||||
|
||||
<!-- GENERATED: BEGIN -->
|
||||
|
||||
## Assembler Mnemonics
|
||||
|
||||
| Mnemonic | XML entry | Flags | Description |
|
||||
| --- | --- | --- | --- |
|
||||
| `fmr` | `fmrx` | — | Floating Move Register |
|
||||
| `fmr.` | `fmrx` | Rc=1 | Floating Move Register |
|
||||
|
||||
## Syntax
|
||||
|
||||
```asm
|
||||
fmr[Rc] [FD], [FB]
|
||||
```
|
||||
|
||||
## Encoding
|
||||
|
||||
### `fmrx` — form `X`
|
||||
|
||||
- **Opcode word:** `0xfc000090`
|
||||
- **Primary opcode (bits 0–5):** `63`
|
||||
- **Extended opcode:** `72`
|
||||
- **Synchronising:** no
|
||||
|
||||
| Bits | Field | Meaning |
|
||||
| --- | --- | --- |
|
||||
| 0–5 | `OPCD` | primary opcode |
|
||||
| 6–10 | `RT/FRT/VRT` | destination |
|
||||
| 11–15 | `RA/FRA/VRA` | source A |
|
||||
| 16–20 | `RB/FRB/VRB` | source B |
|
||||
| 21–30 | `XO` | extended opcode (10 bits) |
|
||||
| 31 | `Rc` | record-form flag |
|
||||
|
||||
## Operands
|
||||
|
||||
| Field | Role | Description |
|
||||
| --- | --- | --- |
|
||||
| `FB` | fmrx: read | Source B floating-point register. |
|
||||
| `FD` | fmrx: write | Destination floating-point register. |
|
||||
| `CR` | fmrx: write (conditional) | Condition-register update. When `Rc=1`, CR field 0 (or CR6 for vector compares, CR1 for FPU) is updated from the result. |
|
||||
|
||||
## Register Effects
|
||||
|
||||
### `fmrx`
|
||||
|
||||
- **Reads (always):** `FB`
|
||||
- **Reads (conditional):** _none_
|
||||
- **Writes (always):** `FD`
|
||||
- **Writes (conditional):** `CR`
|
||||
|
||||
## Status-Register Effects
|
||||
|
||||
- `fmrx`: **CR1** ← FPSCR[FX, FEX, VX, OX] when `Rc=1`. FPSCR itself is not modified.
|
||||
|
||||
## Operation (pseudocode)
|
||||
|
||||
```
|
||||
FRT <- FRB
|
||||
```
|
||||
|
||||
## C Translation Example
|
||||
|
||||
```c
|
||||
/* C translation: the xenia-rs interpreter arm below in */
|
||||
/* Implementation References is the authoritative semantic */
|
||||
/* snapshot. Translate it line-by-line: */
|
||||
/* - ctx.gpr[N] -> r[N] (or f[]/v[] for FPRs/VRs) */
|
||||
/*   - mem.read_uN / write_uN  -> mem_read_uN_be / mem_write_uN_be */
|
||||
/* - ctx.update_cr_signed(fld, v) -> update_cr_signed(fld, v) */
|
||||
/* - ctx.xer_ca / xer_ov / xer_so -> xer.CA / xer.OV / xer.SO */
|
||||
/* The Register Effects and Status-Register Effects tables above */
|
||||
/* enumerate every side effect a faithful translation must emit. */
|
||||
```
|
||||
|
||||
## Implementation References
|
||||
|
||||
**`fmrx`**
|
||||
- xenia-canary XML: [`tools/ppc-instructions.xml` — search for `mnem="fmrx"`](../../xenia-canary/tools/ppc-instructions.xml)
|
||||
- xenia-canary emit: [`src/xenia/cpu/ppc/ppc_emit_fpu.cc:496`](../../xenia-canary/src/xenia/cpu/ppc/ppc_emit_fpu.cc#L496)
|
||||
- xenia-rs opcode: [`crates/xenia-cpu/src/opcode.rs:28`](../../xenia-rs/crates/xenia-cpu/src/opcode.rs#L28)
|
||||
- xenia-rs decoder: [`crates/xenia-cpu/src/decoder.rs:906`](../../xenia-rs/crates/xenia-cpu/src/decoder.rs#L906)
|
||||
- xenia-rs interpreter: [`crates/xenia-cpu/src/interpreter.rs:2752-2756`](../../xenia-rs/crates/xenia-cpu/src/interpreter.rs#L2752-L2756)
|
||||
<details><summary>xenia-rs interpreter body (frozen snapshot)</summary>
|
||||
|
||||
```rust
|
||||
PpcOpcode::fmrx => {
|
||||
ctx.fpr[instr.rd()] = ctx.fpr[instr.rb()];
|
||||
if instr.rc_bit() { update_cr1_from_fpscr(ctx); }
|
||||
ctx.pc += 4;
|
||||
}
|
||||
```
|
||||
</details>
|
||||
|
||||
<!-- GENERATED: END -->
|
||||
|
||||
## Special Cases & Edge Conditions
|
||||
|
||||
- **Bit-pattern copy, no rounding.** `fmr` copies the 64-bit binary representation of `FRB` into `FRT` unchanged. No precision loss, no FPSCR exception bits, no NaN quietening. xenia-rs implements this as a plain `f64` copy.
|
||||
- **NaN preserved verbatim.** Signalling/quiet bit, payload, and sign are all preserved exactly. Unlike arithmetic instructions, `fmr` does **not** quieten signalling NaNs.
|
||||
- **Special values.** All bit patterns pass through untouched, including ±0, ±∞, and any NaN. The destination receives an exact copy.
|
||||
- **FPSCR.** Hardware does **not** update `FPRF` or any exception bit. With `Rc=1` the instruction only reads the existing FPSCR contents to fill CR1; it never writes FPSCR.
|
||||
- **`Rc=1` (`fmr.`)** copies `FPSCR[FX, FEX, VX, OX]` into CR1.
|
||||
- **No `FRA`.** X-form, primary 63, XO 72. Reads `FRB` only.
|
||||
- **Cheaper than load-store.** Compilers emit `fmr` for FPR-to-FPR moves; transferring a value via memory (`stfd`/`lfd`) would be far more expensive.
|
||||
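
The "copy the bits, touch nothing else" behaviour can be sketched on a register file that stores raw 64-bit patterns. This is an assumption made for illustration: the text above says xenia-rs copies an `f64`, whereas this hypothetical model uses `u64` so that the verbatim copy holds by construction.

```rust
/// Hypothetical FPR file holding raw IEEE-754 binary64 bit patterns.
type Fprs = [u64; 32];

/// `fmr FRT, FRB`: copy the bit pattern unchanged.
fn fmr(fpr: &mut Fprs, frt: usize, frb: usize) {
    fpr[frt] = fpr[frb];
}

/// True if `bits` encodes a signalling NaN: exponent all ones,
/// non-zero fraction, and the quiet bit (fraction MSB) clear.
fn is_signalling_nan(bits: u64) -> bool {
    let exponent = (bits >> 52) & 0x7FF;
    let fraction = bits & 0x000F_FFFF_FFFF_FFFF;
    let quiet_bit = bits & (1u64 << 51);
    exponent == 0x7FF && fraction != 0 && quiet_bit == 0
}
```

Copying the pattern `0x7FF0_0000_0000_0001` through `fmr` leaves it a signalling NaN with the same payload, which is the property the NaN bullet above describes.
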
|
||||
## Related Instructions
|
||||
|
||||
- [`fabsx`](fabsx.md), [`fnegx`](fnegx.md), [`fnabsx`](fnabsx.md) — sign-bit variants of the move (clear / toggle / set).
|
||||
- [`fselx`](fselx.md) — branch-free select; like a conditional `fmr`.
|
||||
- [`mffsx`](mffsx.md) — read FPSCR into an FPR; complementary "FPR move" for a control register.
|
||||
- `stfd`/`lfd` — memory-mediated FPR transfer (much slower; used for register spills).
|
||||
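
The sign-bit variants named in the first bullet differ from `fmr` only in what they do to bit 63 of the pattern. A minimal sketch on raw bit patterns, with hypothetical function names:

```rust
/// Sign bit of an IEEE-754 binary64 pattern.
const SIGN: u64 = 1 << 63;

/// `fabs`: clear the sign bit.
fn fabs_bits(bits: u64) -> u64 {
    bits & !SIGN
}

/// `fneg`: toggle the sign bit.
fn fneg_bits(bits: u64) -> u64 {
    bits ^ SIGN
}

/// `fnabs`: set the sign bit.
fn fnabs_bits(bits: u64) -> u64 {
    bits | SIGN
}
```

Because these are pure bit operations, they apply to NaNs, infinities, and zeros exactly as they do to ordinary numbers.
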
|
||||
## IBM Reference
|
||||
|
||||
- [AIX 7.3 — `fmr` (Floating Move Register)](https://www.ibm.com/docs/en/aix/7.3.0?topic=set-fmr-floating-move-register-instruction)
|
||||
- [PowerISA v2.07B, Book I, Chapter 4 — Floating-Point Processor](https://openpowerfoundation.org/specifications/isa/) (move-class instructions explicitly bypass quietening and FPSCR side effects).
|
||||
144
migration/project-root/ppc-manual/fpu/fmsubsx.md
Normal file
@@ -0,0 +1,144 @@
|
||||
# `fmsubsx` — Floating Multiply-Subtract Single
|
||||
|
||||
> **Category:** [Floating-Point](../categories/fpu.md) · **Form:** [A](../forms/A.md) · **Opcode:** `0xec000038`
|
||||
|
||||
<!-- GENERATED: BEGIN -->
|
||||
|
||||
## Assembler Mnemonics
|
||||
|
||||
| Mnemonic | XML entry | Flags | Description |
|
||||
| --- | --- | --- | --- |
|
||||
| `fmsubs` | `fmsubsx` | — | Floating Multiply-Subtract Single |
|
||||
| `fmsubs.` | `fmsubsx` | Rc=1 | Floating Multiply-Subtract Single |
|
||||
|
||||
## Syntax
|
||||
|
||||
```asm
|
||||
fmsubs[Rc] [FD], [FA], [FC], [FB]
|
||||
```
|
||||
|
||||
## Encoding
|
||||
|
||||
### `fmsubsx` — form `A`
|
||||
|
||||
- **Opcode word:** `0xec000038`
|
||||
- **Primary opcode (bits 0–5):** `59`
|
||||
- **Extended opcode:** `28`
|
||||
- **Synchronising:** no
|
||||
|
||||
| Bits | Field | Meaning |
|
||||
| --- | --- | --- |
|
||||
| 0–5 | `OPCD` | primary opcode (59 or 63) |
|
||||
| 6–10 | `FRT` | destination FPR |
|
||||
| 11–15 | `FRA` | source A FPR |
|
||||
| 16–20 | `FRB` | source B FPR |
|
||||
| 21–25 | `FRC` | source C FPR (multiplier for madd-style ops) |
|
||||
| 26–30 | `XO` | extended opcode (5 bits) |
|
||||
| 31 | `Rc` | record-form flag (updates CR1) |
|
||||
|
||||
## Operands
|
||||
|
||||
| Field | Role | Description |
|
||||
| --- | --- | --- |
|
||||
| `FA` | fmsubsx: read | Source A floating-point register (`fr0`–`fr31`). |
|
||||
| `FC` | fmsubsx: read | Source C floating-point register (for madd-style ops). |
|
||||
| `FB` | fmsubsx: read | Source B floating-point register. |
|
||||
| `FD` | fmsubsx: write | Destination floating-point register. |
|
||||
| `CR` | fmsubsx: write (conditional) | Condition-register update. When `Rc=1`, CR field 0 (or CR6 for vector compares, CR1 for FPU) is updated from the result. |
|
||||
| `FPSCR` | fmsubsx: write | Floating-Point Status and Control Register. |
|
||||
|
||||
## Register Effects
|
||||
|
||||
### `fmsubsx`
|
||||
|
||||
- **Reads (always):** `FA`, `FC`, `FB`
|
||||
- **Reads (conditional):** _none_
|
||||
- **Writes (always):** `FD`, `FPSCR`
|
||||
- **Writes (conditional):** `CR`
|
||||
|
||||
## Status-Register Effects
|
||||
|
||||
- `fmsubsx`: **CR1** ← FPSCR[FX, FEX, VX, OX] when `Rc=1`; **FPSCR** updated per IEEE-754 flags (FX, FEX, FPRF, FR, FI, exceptions).
|
||||
|
||||
## Operation (pseudocode)
|
||||
|
||||
```
|
||||
FRT <- RoundToSingle((FRA × FRC) − FRB)
|
||||
```
|
||||
|
||||
## C Translation Example
|
||||
|
||||
```c
|
||||
/* C translation: the xenia-rs interpreter arm below in */
|
||||
/* Implementation References is the authoritative semantic */
|
||||
/* snapshot. Translate it line-by-line: */
|
||||
/* - ctx.gpr[N] -> r[N] (or f[]/v[] for FPRs/VRs) */
|
||||
/*   - mem.read_uN / write_uN  -> mem_read_uN_be / mem_write_uN_be */
|
||||
/* - ctx.update_cr_signed(fld, v) -> update_cr_signed(fld, v) */
|
||||
/* - ctx.xer_ca / xer_ov / xer_so -> xer.CA / xer.OV / xer.SO */
|
||||
/* The Register Effects and Status-Register Effects tables above */
|
||||
/* enumerate every side effect a faithful translation must emit. */
|
||||
```
|
||||
|
||||
## Implementation References
|
||||
|
||||
**`fmsubsx`**
|
||||
- xenia-canary XML: [`tools/ppc-instructions.xml` — search for `mnem="fmsubsx"`](../../xenia-canary/tools/ppc-instructions.xml)
|
||||
- xenia-canary emit: [`src/xenia/cpu/ppc/ppc_emit_fpu.cc:209`](../../xenia-canary/src/xenia/cpu/ppc/ppc_emit_fpu.cc#L209)
|
||||
- xenia-rs opcode: [`crates/xenia-cpu/src/opcode.rs:28`](../../xenia-rs/crates/xenia-cpu/src/opcode.rs#L28)
|
||||
- xenia-rs decoder: [`crates/xenia-cpu/src/decoder.rs:392`](../../xenia-rs/crates/xenia-cpu/src/decoder.rs#L392)
|
||||
- xenia-rs interpreter: [`crates/xenia-cpu/src/interpreter.rs:2679-2691`](../../xenia-rs/crates/xenia-cpu/src/interpreter.rs#L2679-L2691)
|
||||
<details><summary>xenia-rs interpreter body (frozen snapshot)</summary>
|
||||
|
||||
```rust
|
||||
PpcOpcode::fmsubsx => {
|
||||
// PPCBUG-182: missing VXISI on sub step.
|
||||
let a = ctx.fpr[instr.ra()];
|
||||
let c = ctx.fpr[instr.rc()];
|
||||
let b = ctx.fpr[instr.rb()];
|
||||
fpscr::check_invalid_mul(ctx, a, c);
|
||||
fpscr::check_invalid_fma_add(ctx, a, c, b, true);
|
||||
let result = to_single(ctx, a.mul_add(c, -b));
|
||||
ctx.fpr[instr.rd()] = result;
|
||||
fpscr::update_after_op(ctx, result, a.is_finite() && b.is_finite() && c.is_finite());
|
||||
if instr.rc_bit() { update_cr1_from_fpscr(ctx); }
|
||||
ctx.pc += 4;
|
||||
}
|
||||
```
|
||||
</details>
|
||||
|
||||
<!-- GENERATED: END -->
|
||||
|
||||
## Special Cases & Edge Conditions
|
||||
|
||||
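- **Single rounding step then round-to-single.** The architecture computes `(FRA × FRC) − FRB` exactly and rounds once to binary32. xenia-rs implements this as `to_single(a.mul_add(c, -b))`: one fused rounding at double precision, then a second rounding of the binary64 result to binary32, which agrees with hardware except in rare double-rounding cases.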
|
||||
- **Operand order.** Assembler: `FD, FA, FC, FB`. The multiplier `FRC` precedes the addend `FRB`.
|
||||
- **Invalid operations.** `0×∞ − finite` → `VXIMZ`; `(±∞×x) − ±∞` (same sign) → `VXISI`. Quiet NaN result with `FPSCR[VX, FX]`.
|
||||
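- **FPSCR side effects.** Hardware updates `FPRF`, `FR`, `FI`, `FX`, `OX`, `UX`, `XX`, `VXIMZ`, `VXISI`, `VXSNAN`. The frozen snapshot above calls `fpscr::check_invalid_mul`, `fpscr::check_invalid_fma_add`, and `fpscr::update_after_op`, so xenia-rs models at least part of this; which of these bits it leaves untouched is not visible from the snapshot (xenia quirk).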
|
||||
- **`Rc=1` (`fmsubs.`)** copies `FPSCR[FX, FEX, VX, OX]` into CR1.
|
||||
- **NaN propagation.** Quiet-NaN result for any NaN operand; signalling NaNs are quietened.
|
||||
- **Single-precision overflow** of the final rounded result returns ±∞ and sets `OX`/`XX`/`FX`.
|
||||
- **Use case.** Newton-Raphson refinement of `fres`: in `x_new = x*(2 - d*x)`, the factor `2 - d*x = -((d*x) - 2)` is an `fnmsubs`, while `fmsubs` itself yields the un-negated form `(d*x) - 2`. Also common in residual-correction loops.
|
||||
- **Denormal flush.** Xenon boots with `FPSCR[NI]=1`; xenia uses host IEEE behavior.
|
||||
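
The two invalid-operation conditions in the list above can be classified from the operands alone. The following is a hedged sketch: `InvalidFlags` and `classify_invalid_fma` are hypothetical and are not the `fpscr::check_invalid_mul` / `fpscr::check_invalid_fma_add` helpers from the snapshot, whose bodies are not shown. `VXSNAN` is not modelled.

```rust
#[derive(Debug, PartialEq)]
struct InvalidFlags {
    vximz: bool, // 0 × ∞
    vxisi: bool, // ∞ − ∞ in the add/subtract step
}

/// Classify `(a × c) + b` (or `(a × c) − b` when `subtract` is true).
/// Signs come from the inputs, never from a computed `a * c`.
fn classify_invalid_fma(a: f64, c: f64, b: f64, subtract: bool) -> InvalidFlags {
    let vximz = (a == 0.0 && c.is_infinite()) || (a.is_infinite() && c == 0.0);

    // The product is a genuine infinity only if one factor is infinite,
    // neither factor is NaN, and the other factor is non-zero.
    let product_is_inf =
        !vximz && (a.is_infinite() || c.is_infinite()) && !a.is_nan() && !c.is_nan();
    let product_negative = a.is_sign_negative() != c.is_sign_negative();

    // Subtracting b is the same as adding -b, so flip the addend's sign.
    let addend_negative = b.is_sign_negative() != subtract;

    let vxisi = product_is_inf && b.is_infinite() && product_negative != addend_negative;
    InvalidFlags { vximz, vxisi }
}
```

For the subtract form, `(+∞ × 1) − (+∞)` reports `vxisi`, whereas the same operands in the add form do not, matching the "same sign" wording in the bullet above.
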
|
||||
## Related Instructions
|
||||
|
||||
- [`fmsubx`](fmsubx.md) — double-precision sibling.
|
||||
- [`fmaddsx`](fmaddsx.md), [`fnmaddsx`](fnmaddsx.md), [`fnmsubsx`](fnmsubsx.md) — other single-precision fused-multiply variants.
|
||||
- [`fmulsx`](fmulsx.md), [`fsubsx`](fsubsx.md) — non-fused decomposition.
|
||||
- [`fresx`](fresx.md), [`frsqrtex`](frsqrtex.md) — reciprocal helpers refined by `fmsubs`/`fnmsubs`.
|
||||
- [`frspx`](frspx.md) — explicit double→single rounding.
|
||||
|
||||
## IBM Reference
|
||||
|
||||
- [AIX 7.3 — `fmsubs` (Floating Multiply-Subtract Single)](https://www.ibm.com/docs/en/aix/7.3.0?topic=set-fmsubs-floating-multiply-subtract-single-instruction)
|
||||
- [PowerISA v2.07B, Book I, Chapter 4 — Floating-Point Processor](https://openpowerfoundation.org/specifications/isa/).
|
||||
135
migration/project-root/ppc-manual/fpu/fmsubx.md
Normal file
@@ -0,0 +1,135 @@
|
||||
# `fmsubx` — Floating Multiply-Subtract
|
||||
|
||||
> **Category:** [Floating-Point](../categories/fpu.md) · **Form:** [A](../forms/A.md) · **Opcode:** `0xfc000038`
|
||||
|
||||
<!-- GENERATED: BEGIN -->
|
||||
|
||||
## Assembler Mnemonics
|
||||
|
||||
| Mnemonic | XML entry | Flags | Description |
|
||||
| --- | --- | --- | --- |
|
||||
| `fmsub` | `fmsubx` | — | Floating Multiply-Subtract |
|
||||
| `fmsub.` | `fmsubx` | Rc=1 | Floating Multiply-Subtract |
|
||||
|
||||
## Syntax
|
||||
|
||||
```asm
|
||||
fmsub[Rc] [FD], [FA], [FC], [FB]
|
||||
```
|
||||
|
||||
## Encoding
|
||||
|
||||
### `fmsubx` — form `A`
|
||||
|
||||
- **Opcode word:** `0xfc000038`
|
||||
- **Primary opcode (bits 0–5):** `63`
|
||||
- **Extended opcode:** `28`
|
||||
- **Synchronising:** no
|
||||
|
||||
| Bits | Field | Meaning |
|
||||
| --- | --- | --- |
|
||||
| 0–5 | `OPCD` | primary opcode (59 or 63) |
|
||||
| 6–10 | `FRT` | destination FPR |
|
||||
| 11–15 | `FRA` | source A FPR |
|
||||
| 16–20 | `FRB` | source B FPR |
|
||||
| 21–25 | `FRC` | source C FPR (multiplier for madd-style ops) |
|
||||
| 26–30 | `XO` | extended opcode (5 bits) |
|
||||
| 31 | `Rc` | record-form flag (updates CR1) |
|
||||
|
||||
## Operands
|
||||
|
||||
| Field | Role | Description |
|
||||
| --- | --- | --- |
|
||||
| `FA` | fmsubx: read | Source A floating-point register (`fr0`–`fr31`). |
|
||||
| `FC` | fmsubx: read | Source C floating-point register (for madd-style ops). |
|
||||
| `FB` | fmsubx: read | Source B floating-point register. |
|
||||
| `FD` | fmsubx: write | Destination floating-point register. |
|
||||
| `CR` | fmsubx: write (conditional) | Condition-register update. When `Rc=1`, CR field 0 (or CR6 for vector compares, CR1 for FPU) is updated from the result. |
|
||||
| `FPSCR` | fmsubx: write | Floating-Point Status and Control Register. |
|
||||
|
||||
## Register Effects
|
||||
|
||||
### `fmsubx`
|
||||
|
||||
- **Reads (always):** `FA`, `FC`, `FB`
|
||||
- **Reads (conditional):** _none_
|
||||
- **Writes (always):** `FD`, `FPSCR`
|
||||
- **Writes (conditional):** `CR`
|
||||
|
||||
## Status-Register Effects
|
||||
|
||||
- `fmsubx`: **CR1** ← FPSCR[FX, FEX, VX, OX] when `Rc=1`; **FPSCR** updated per IEEE-754 flags (FX, FEX, FPRF, FR, FI, exceptions).
|
||||
|
||||
## Operation (pseudocode)
|
||||
|
||||
```
|
||||
FRT <- (FRA × FRC) − FRB
|
||||
```
|
||||
|
||||
## C Translation Example
|
||||
|
||||
```c
|
||||
/* C translation: the xenia-rs interpreter arm below in */
|
||||
/* Implementation References is the authoritative semantic */
|
||||
/* snapshot. Translate it line-by-line: */
|
||||
/* - ctx.gpr[N] -> r[N] (or f[]/v[] for FPRs/VRs) */
|
||||
/*   - mem.read_uN / write_uN  -> mem_read_uN_be / mem_write_uN_be */
|
||||
/* - ctx.update_cr_signed(fld, v) -> update_cr_signed(fld, v) */
|
||||
/* - ctx.xer_ca / xer_ov / xer_so -> xer.CA / xer.OV / xer.SO */
|
||||
/* The Register Effects and Status-Register Effects tables above */
|
||||
/* enumerate every side effect a faithful translation must emit. */
|
||||
```
|
||||
|
||||
## Implementation References
|
||||
|
||||
**`fmsubx`**
|
||||
- xenia-canary XML: [`tools/ppc-instructions.xml` — search for `mnem="fmsubx"`](../../xenia-canary/tools/ppc-instructions.xml)
|
||||
- xenia-canary emit: [`src/xenia/cpu/ppc/ppc_emit_fpu.cc:205`](../../xenia-canary/src/xenia/cpu/ppc/ppc_emit_fpu.cc#L205)
|
||||
- xenia-rs opcode: [`crates/xenia-cpu/src/opcode.rs:28`](../../xenia-rs/crates/xenia-cpu/src/opcode.rs#L28)
|
||||
- xenia-rs decoder: [`crates/xenia-cpu/src/decoder.rs:927`](../../xenia-rs/crates/xenia-cpu/src/decoder.rs#L927)
|
||||
- xenia-rs interpreter: [`crates/xenia-cpu/src/interpreter.rs:2666-2678`](../../xenia-rs/crates/xenia-cpu/src/interpreter.rs#L2666-L2678)
|
||||
<details><summary>xenia-rs interpreter body (frozen snapshot)</summary>
|
||||
|
||||
```rust
|
||||
PpcOpcode::fmsubx => {
|
||||
// PPCBUG-203: missing VXISI on sub step.
|
||||
let a = ctx.fpr[instr.ra()];
|
||||
let c = ctx.fpr[instr.rc()];
|
||||
let b = ctx.fpr[instr.rb()];
|
||||
fpscr::check_invalid_mul(ctx, a, c);
|
||||
fpscr::check_invalid_fma_add(ctx, a, c, b, true);
|
||||
let result = a.mul_add(c, -b);
|
||||
ctx.fpr[instr.rd()] = result;
|
||||
fpscr::update_after_op(ctx, result, a.is_finite() && b.is_finite() && c.is_finite());
|
||||
if instr.rc_bit() { update_cr1_from_fpscr(ctx); }
|
||||
ctx.pc += 4;
|
||||
}
|
||||
```
|
||||
</details>
|
||||
|
||||
<!-- GENERATED: END -->
|
||||
|
||||
## Special Cases & Edge Conditions
|
||||
|
||||
- **Single rounding step.** `fmsub` computes `(FRA × FRC) − FRB` with one rounding at the end. xenia-rs implements this as `a.mul_add(c, -b)`, which is a true FMA on hosts that have hardware support and a software FMA on those that don't.
|
||||
- **Subtle: negate-then-FMA.** Negating `b` before passing it to the FMA matters for the sign of a zero result. In round-to-nearest, `(−0×+0) − (+0)` = `−0 + (−0)` = `−0`, but `(−0×+0) − (−0)` = `−0 + (+0)` = `+0`: the negation flips the addend's sign before the FMA. Standard IEEE rules apply.
|
||||
- **Operand order.** Assembler: `FD, FA, FC, FB`.
|
||||
- **Invalid operations.** `0×∞ − finite` → `VXIMZ`; same-signed infinity collision (e.g. `(+∞×+1) − (+∞)`) → `VXISI`. Quiet NaN result with `FPSCR[VX, FX]`.
|
||||
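- **FPSCR side effects.** Hardware updates `FPRF`, `FR`, `FI`, `FX`, `OX`, `UX`, `XX`, `VXIMZ`, `VXISI`, `VXSNAN`. The frozen snapshot above calls `fpscr::check_invalid_mul`, `fpscr::check_invalid_fma_add`, and `fpscr::update_after_op`, so xenia-rs models at least part of this; which of these bits it leaves untouched is not visible from the snapshot (xenia quirk).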
|
||||
- **`Rc=1` (`fmsub.`)** copies `FPSCR[FX, FEX, VX, OX]` into CR1.
|
||||
- **NaN propagation.** Quiet-NaN result for any NaN operand; signalling NaNs are quietened.
|
||||
- **Use case.** Newton-Raphson refinement of reciprocal estimates, `x_new = x*(2 - d*x)`, uses `fnmsub` for the factor `2 - d*x = -((d*x) - 2)`, but `fmsub` shows up wherever `(a*c) - b` appears (residuals, error correction).
|
||||
- **Denormal flush.** Xenon boots with `FPSCR[NI]=1`; xenia uses host IEEE behavior.
|
||||
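
The sign-of-zero behaviour of the `a.mul_add(c, -b)` formulation can be observed directly. A small sketch, assuming the host runs in the default round-to-nearest mode; `fmsub_value` and `is_negative_zero` are hypothetical helpers and FPSCR is not modelled.

```rust
/// Value-only model of `fmsub`: negate the addend, then fuse.
fn fmsub_value(a: f64, c: f64, b: f64) -> f64 {
    a.mul_add(c, -b)
}

/// True only for the pattern of -0.0.
fn is_negative_zero(x: f64) -> bool {
    x == 0.0 && x.is_sign_negative()
}
```

Here `fmsub_value(-0.0, 0.0, 0.0)` adds two negative zeros and stays `-0.0`, while `fmsub_value(-0.0, 0.0, -0.0)` adds zeros of opposite sign and yields `+0.0`.
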
|
||||
## Related Instructions
|
||||
|
||||
- [`fmsubsx`](fmsubsx.md) — single-precision sibling.
|
||||
- [`fmaddx`](fmaddx.md), [`fnmaddx`](fnmaddx.md), [`fnmsubx`](fnmsubx.md) — other fused multiply-add variants.
|
||||
- [`fmulx`](fmulx.md), [`fsubx`](fsubx.md) — non-fused decomposition (two rounding steps).
|
||||
- [`fresx`](fresx.md), [`frsqrtex`](frsqrtex.md) — reciprocal helpers refined by fused multiply-subtracts.
|
||||
- [`fnegx`](fnegx.md) — sign flip (the bit-pattern op behind `-FRB` in xenia's implementation).
|
||||
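
The Newton-Raphson reciprocal refinement mentioned above can be sketched with two fused operations per step. This is an illustrative sketch in double precision with hypothetical names; it starts from a caller-supplied estimate rather than modelling `fres`, and it uses the algebraically equivalent form `x + x*(1 - d*x)` of `x*(2 - d*x)`.

```rust
/// One refinement step for x ≈ 1/d:
///   e     = 1 - d*x    (the `fnmsub` shape: -((d*x) - 1))
///   x_new = x + x*e    (the `fmadd` shape:  (x*e) + x)
fn refine_reciprocal(d: f64, x: f64) -> f64 {
    let e = -(d.mul_add(x, -1.0));
    x.mul_add(e, x)
}

/// Apply `steps` refinement steps to an initial estimate of 1/d.
fn reciprocal(d: f64, estimate: f64, steps: u32) -> f64 {
    let mut x = estimate;
    for _ in 0..steps {
        x = refine_reciprocal(d, x);
    }
    x
}
```

The relative error roughly squares on every step, so a crude estimate such as `0.3` for `1/3` reaches double-precision accuracy within a handful of iterations.
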
|
||||
## IBM Reference
|
||||
|
||||
- [AIX 7.3 — `fmsub` (Floating Multiply-Subtract)](https://www.ibm.com/docs/en/aix/7.3.0?topic=set-fms-fmsub-floating-multiply-subtract-instruction)
|
||||
- [PowerISA v2.07B, Book I, Chapter 4 — Floating-Point Processor](https://openpowerfoundation.org/specifications/isa/).
|
||||
140
migration/project-root/ppc-manual/fpu/fmulsx.md
Normal file
@@ -0,0 +1,140 @@
|
||||
# `fmulsx` — Floating Multiply Single
|
||||
|
||||
> **Category:** [Floating-Point](../categories/fpu.md) · **Form:** [A](../forms/A.md) · **Opcode:** `0xec000032`
|
||||
|
||||
<!-- GENERATED: BEGIN -->
|
||||
|
||||
## Assembler Mnemonics
|
||||
|
||||
| Mnemonic | XML entry | Flags | Description |
|
||||
| --- | --- | --- | --- |
|
||||
| `fmuls` | `fmulsx` | — | Floating Multiply Single |
|
||||
| `fmuls.` | `fmulsx` | Rc=1 | Floating Multiply Single |
|
||||
|
||||
## Syntax
|
||||
|
||||
```asm
|
||||
fmuls[Rc] [FD], [FA], [FC]
|
||||
```
|
||||
|
||||
## Encoding
|
||||
|
||||
### `fmulsx` — form `A`
|
||||
|
||||
- **Opcode word:** `0xec000032`
|
||||
- **Primary opcode (bits 0–5):** `59`
|
||||
- **Extended opcode:** `25`
|
||||
- **Synchronising:** no
|
||||
|
||||
| Bits | Field | Meaning |
|
||||
| --- | --- | --- |
|
||||
| 0–5 | `OPCD` | primary opcode (59 or 63) |
|
||||
| 6–10 | `FRT` | destination FPR |
|
||||
| 11–15 | `FRA` | source A FPR |
|
||||
| 16–20 | `FRB` | source B FPR |
|
||||
| 21–25 | `FRC` | source C FPR (multiplier for madd-style ops) |
|
||||
| 26–30 | `XO` | extended opcode (5 bits) |
|
||||
| 31 | `Rc` | record-form flag (updates CR1) |
|
||||
|
||||
## Operands
|
||||
|
||||
| Field | Role | Description |
|
||||
| --- | --- | --- |
|
||||
| `FA` | fmulsx: read | Source A floating-point register (`fr0`–`fr31`). |
|
||||
| `FC` | fmulsx: read | Source C floating-point register (for madd-style ops). |
|
||||
| `FD` | fmulsx: write | Destination floating-point register. |
|
||||
| `CR` | fmulsx: write (conditional) | Condition-register update. When `Rc=1`, CR field 0 (or CR6 for vector compares, CR1 for FPU) is updated from the result. |
|
||||
| `FPSCR` | fmulsx: write | Floating-Point Status and Control Register. |
|
||||
|
||||
## Register Effects
|
||||
|
||||
### `fmulsx`
|
||||
|
||||
- **Reads (always):** `FA`, `FC`
|
||||
- **Reads (conditional):** _none_
|
||||
- **Writes (always):** `FD`, `FPSCR`
|
||||
- **Writes (conditional):** `CR`
|
||||
|
||||
## Status-Register Effects
|
||||
|
||||
- `fmulsx`: **CR1** ← FPSCR[FX, FEX, VX, OX] when `Rc=1`; **FPSCR** updated per IEEE-754 flags (FX, FEX, FPRF, FR, FI, exceptions).
|
||||
|
||||
## Operation (pseudocode)
|
||||
|
||||
```
|
||||
FRT <- RoundToSingle(FRA × FRC)
|
||||
```
|
||||
|
||||
## C Translation Example
|
||||
|
||||
```c
|
||||
/* C translation: the xenia-rs interpreter arm below in */
|
||||
/* Implementation References is the authoritative semantic */
|
||||
/* snapshot. Translate it line-by-line: */
|
||||
/* - ctx.gpr[N] -> r[N] (or f[]/v[] for FPRs/VRs) */
|
||||
/*   - mem.read_uN / write_uN  -> mem_read_uN_be / mem_write_uN_be */
|
||||
/* - ctx.update_cr_signed(fld, v) -> update_cr_signed(fld, v) */
|
||||
/* - ctx.xer_ca / xer_ov / xer_so -> xer.CA / xer.OV / xer.SO */
|
||||
/* The Register Effects and Status-Register Effects tables above */
|
||||
/* enumerate every side effect a faithful translation must emit. */
|
||||
```
|
||||
|
||||
## Implementation References
|
||||
|
||||
**`fmulsx`**
|
||||
- xenia-canary XML: [`tools/ppc-instructions.xml` — search for `mnem="fmulsx"`](../../xenia-canary/tools/ppc-instructions.xml)
|
||||
- xenia-canary emit: [`src/xenia/cpu/ppc/ppc_emit_fpu.cc:97`](../../xenia-canary/src/xenia/cpu/ppc/ppc_emit_fpu.cc#L97)
|
||||
- xenia-rs opcode: [`crates/xenia-cpu/src/opcode.rs:28`](../../xenia-rs/crates/xenia-cpu/src/opcode.rs#L28)
|
||||
- xenia-rs decoder: [`crates/xenia-cpu/src/decoder.rs:391`](../../xenia-rs/crates/xenia-cpu/src/decoder.rs#L391)
|
||||
- xenia-rs interpreter: [`crates/xenia-cpu/src/interpreter.rs:2606-2615`](../../xenia-rs/crates/xenia-cpu/src/interpreter.rs#L2606-L2615)
|
||||
<details><summary>xenia-rs interpreter body (frozen snapshot)</summary>
|
||||
|
||||
```rust
PpcOpcode::fmulsx => {
    let a = ctx.fpr[instr.ra()];
    let c = ctx.fpr[instr.rc()];
    fpscr::check_invalid_mul(ctx, a, c);
    let result = to_single(ctx, a * c);
    ctx.fpr[instr.rd()] = result;
    fpscr::update_after_op(ctx, result, a.is_finite() && c.is_finite());
    if instr.rc_bit() { update_cr1_from_fpscr(ctx); }
    ctx.pc += 4;
}
```
|
||||
</details>
|
||||
|
||||
<!-- GENERATED: END -->
|
||||
|
||||
## Special Cases & Edge Conditions

- **A-form quirk: multiplier is `FRC`.** Operands come from `FRA` (bits 11–15) and `FRC` (bits 21–25); `FRB` is unused. xenia-rs reads the multiplier via `instr.rc()` (not to be confused with `rc_bit()`, the record bit).
- **Single precision.** The product is rounded to IEEE-754 binary32 and then re-encoded into the 64-bit FPR. xenia-rs uses `to_single(ctx, a * c)`.
- **`0 × ±∞`** is invalid: it sets `FPSCR[VXIMZ, VX, FX]` and yields a quiet NaN.
- **FPSCR side effects.** Hardware updates `FPRF`, `FR`, `FI`, `FX` and exception bits `OX`, `UX`, `XX`, `VXIMZ`, `VXSNAN`. The interpreter snapshot above calls `fpscr::check_invalid_mul` and `fpscr::update_after_op`, so xenia-rs does model FPSCR updates for this instruction; how closely those helpers track hardware is not visible in the snapshot.
- **`Rc=1` (`fmuls.`)** copies `FPSCR[FX, FEX, VX, OX]` into CR1.
- **NaN propagation.** Quiet-NaN result for any NaN operand; signalling NaNs are quietened.
- **Single-precision overflow** returns ±∞ (under the default round-to-nearest mode) and sets `OX`/`XX`/`FX`.
- **Denormal flush.** Xenon boots with `FPSCR[NI]=1`; xenia inherits host IEEE behavior, so subnormal results may differ subtly from hardware.
- **Encoding.** A-form, primary 59, XO 25.

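
The single-precision rounding, overflow, and `0 × ∞` cases listed above can be sketched in a few lines of Rust. This is a value-only illustration, not the xenia-rs implementation: `fmuls_value` is a hypothetical name, FPSCR bookkeeping is omitted, and rounding is modelled with a plain `f64 -> f32 -> f64` cast. That cast rounds twice (to binary64, then to binary32); the result matches a single rounding when both inputs are already representable in binary32, and is only an approximation for arbitrary binary64 inputs.

```rust
/// Value-only model of `fmuls` (hypothetical helper, for illustration).
/// Multiplies in binary64, rounds to binary32, then widens back to the
/// 64-bit FPR format. FPSCR updates are deliberately left out.
fn fmuls_value(a: f64, c: f64) -> f64 {
    let product = a * c; // 0 × ∞ produces a NaN here
    (product as f32) as f64 // round to binary32, re-encode as binary64
}
```

A product that exceeds the binary32 range becomes an infinity of the matching sign, which is the overflow case described in the list.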
## Related Instructions
|
||||
|
||||
- [`fmulx`](fmulx.md) — double-precision multiply.
|
||||
- [`fmaddsx`](fmaddsx.md), [`fmsubsx`](fmsubsx.md), [`fnmaddsx`](fnmaddsx.md), [`fnmsubsx`](fnmsubsx.md) — single-precision fused multiply-add family (one rounding step; preferred for dot products).
|
||||
- [`faddsx`](faddsx.md), [`fsubsx`](fsubsx.md), [`fdivsx`](fdivsx.md) — companion single-precision arithmetic.
|
||||
- [`fresx`](fresx.md), [`frsqrtex`](frsqrtex.md) — reciprocal estimates often paired with `fmuls` to compute `a * (1/b)`.
|
||||
- [`frspx`](frspx.md) — explicit double→single rounding helper.
|
||||
|
||||
## IBM Reference
|
||||
|
||||
- [AIX 7.3 — `fmuls` (Floating Multiply Single)](https://www.ibm.com/docs/en/aix/7.3.0?topic=set-fmuls-floating-multiply-single-instruction)
|
||||
- [PowerISA v2.07B, Book I, Chapter 4 — Floating-Point Processor](https://openpowerfoundation.org/specifications/isa/).
|
||||
132
migration/project-root/ppc-manual/fpu/fmulx.md
Normal file
@@ -0,0 +1,132 @@
|
||||
# `fmulx` — Floating Multiply
|
||||
|
||||
> **Category:** [Floating-Point](../categories/fpu.md) · **Form:** [A](../forms/A.md) · **Opcode:** `0xfc000032`
|
||||
|
||||
<!-- GENERATED: BEGIN -->
|
||||
|
||||
## Assembler Mnemonics

| Mnemonic | XML entry | Flags | Description |
| --- | --- | --- | --- |
| `fmul` | `fmulx` | — | Floating Multiply |
| `fmul.` | `fmulx` | Rc=1 | Floating Multiply |

## Syntax

```asm
fmul[Rc] [FD], [FA], [FC]
```

## Encoding

### `fmulx` — form `A`

- **Opcode word:** `0xfc000032`
- **Primary opcode (bits 0–5):** `63`
- **Extended opcode:** `25`
- **Synchronising:** no

| Bits | Field | Meaning |
| --- | --- | --- |
| 0–5 | `OPCD` | primary opcode (59 or 63) |
| 6–10 | `FRT` | destination FPR |
| 11–15 | `FRA` | source A FPR |
| 16–20 | `FRB` | source B FPR |
| 21–25 | `FRC` | source C FPR (multiplier for madd-style ops) |
| 26–30 | `XO` | extended opcode (5 bits) |
| 31 | `Rc` | record-form flag (updates CR1) |

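
The field table above uses PowerPC bit numbering, where bit 0 is the most significant bit of the 32-bit word. The following sketch shows how those bit ranges translate into shifts and masks. `AForm` and `decode_a_form` are hypothetical names chosen for this example; they are not the xenia-rs decoder API.

```rust
/// Fields of an A-form instruction word (illustrative type, not xenia-rs).
#[derive(Debug, PartialEq)]
struct AForm {
    opcd: u32, // bits 0–5
    frt: u32,  // bits 6–10
    fra: u32,  // bits 11–15
    frb: u32,  // bits 16–20
    frc: u32,  // bits 21–25
    xo: u32,   // bits 26–30
    rc: bool,  // bit 31
}

/// Split a 32-bit word into A-form fields. PowerPC bit N corresponds to a
/// right shift of (31 - N) on the host integer.
fn decode_a_form(word: u32) -> AForm {
    AForm {
        opcd: word >> 26,
        frt: (word >> 21) & 0x1f,
        fra: (word >> 16) & 0x1f,
        frb: (word >> 11) & 0x1f,
        frc: (word >> 6) & 0x1f,
        xo: (word >> 1) & 0x1f,
        rc: (word & 1) == 1,
    }
}
```

Decoding the base opcode word `0xfc000032` gives primary opcode 63 and extended opcode 25, matching the values listed under Encoding.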
## Operands

| Field | Role | Description |
| --- | --- | --- |
| `FA` | fmulx: read | Source A floating-point register (`fr0`–`fr31`). |
| `FC` | fmulx: read | Source C floating-point register (the multiplier). |
| `FD` | fmulx: write | Destination floating-point register. |
| `CR` | fmulx: write (conditional) | Condition-register update. When `Rc=1`, CR1 is loaded from `FPSCR[FX, FEX, VX, OX]`. |
| `FPSCR` | fmulx: write | Floating-Point Status and Control Register. |

## Register Effects
|
||||
|
||||
### `fmulx`
|
||||
|
||||
- **Reads (always):** `FA`, `FC`
|
||||
- **Reads (conditional):** _none_
|
||||
- **Writes (always):** `FD`, `FPSCR`
|
||||
- **Writes (conditional):** `CR`
|
||||
|
||||
## Status-Register Effects

- `fmulx`: **CR1** ← `FPSCR[FX, FEX, VX, OX]` when `Rc=1`; **FPSCR** updated per IEEE-754 flags (`FX`, `FEX`, `FPRF`, `FR`, `FI`, exception bits).

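
The CR1 update amounts to copying the top four FPSCR bits into the second CR field. The sketch below assumes both registers are held as flat 32-bit integers in PowerPC bit order (bit 0 is the most significant bit); xenia-rs may store them differently, and the helper names are hypothetical.

```rust
/// Extract FPSCR[FX, FEX, VX, OX], which are FPSCR bits 0–3, i.e. the top
/// nibble of the 32-bit register. Hypothetical helper for illustration.
fn cr1_from_fpscr(fpscr: u32) -> u32 {
    fpscr >> 28
}

/// Store a 4-bit value into CR1 (CR bits 4–7), leaving the other seven
/// CR fields untouched. Assumes a flat 32-bit CR image.
fn write_cr1(cr: u32, field: u32) -> u32 {
    (cr & !0x0f00_0000) | ((field & 0xf) << 24)
}
```

With `FX` and `VX` set, `cr1_from_fpscr` returns `0b1010`, which `write_cr1` then places in the CR1 nibble.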
## Operation (pseudocode)

```
FRT <- FRA × FRC
```

## C Translation Example

```c
/* C translation: the xenia-rs interpreter arm below in */
/* Implementation References is the authoritative semantic */
/* snapshot. Translate it line-by-line: */
/* - ctx.gpr[N] -> r[N] (or f[]/v[] for FPRs/VRs) */
/* - mem.read_uN / mem.write_uN -> mem_read_uN_be / mem_write_uN_be */
/* - ctx.update_cr_signed(fld, v) -> update_cr_signed(fld, v) */
/* - ctx.xer_ca / xer_ov / xer_so -> xer.CA / xer.OV / xer.SO */
/* The Register Effects and Status-Register Effects tables above */
/* enumerate every side effect a faithful translation must emit. */
```

## Implementation References
|
||||
|
||||
**`fmulx`**
|
||||
- xenia-canary XML: [`tools/ppc-instructions.xml` — search for `mnem="fmulx"`](../../xenia-canary/tools/ppc-instructions.xml)
|
||||
- xenia-canary emit: [`src/xenia/cpu/ppc/ppc_emit_fpu.cc:89`](../../xenia-canary/src/xenia/cpu/ppc/ppc_emit_fpu.cc#L89)
|
||||
- xenia-rs opcode: [`crates/xenia-cpu/src/opcode.rs:28`](../../xenia-rs/crates/xenia-cpu/src/opcode.rs#L28)
|
||||
- xenia-rs decoder: [`crates/xenia-cpu/src/decoder.rs:925`](../../xenia-rs/crates/xenia-cpu/src/decoder.rs#L925)
|
||||
- xenia-rs interpreter: [`crates/xenia-cpu/src/interpreter.rs:2595-2605`](../../xenia-rs/crates/xenia-cpu/src/interpreter.rs#L2595-L2605)
|
||||
<details><summary>xenia-rs interpreter body (frozen snapshot)</summary>
|
||||
|
||||
```rust
PpcOpcode::fmulx => {
    // A-form: frD = frA * frC (frC is at rc() field, bits 21-25)
    let a = ctx.fpr[instr.ra()];
    let c = ctx.fpr[instr.rc()];
    fpscr::check_invalid_mul(ctx, a, c);
    let result = a * c;
    ctx.fpr[instr.rd()] = result;
    fpscr::update_after_op(ctx, result, a.is_finite() && c.is_finite());
    if instr.rc_bit() { update_cr1_from_fpscr(ctx); }
    ctx.pc += 4;
}
```
|
||||
</details>
|
||||
|
||||
<!-- GENERATED: END -->
|
||||
|
||||
## Special Cases & Edge Conditions

- **A-form quirk: multiplier is `FRC`, not `FRB`.** `fmul` reads its operands from the `FRA` (bits 11–15) and `FRC` (bits 21–25) fields, the same multiplier field the fused multiply-add family uses. xenia-rs reads it via `instr.rc()` (the FRC field, distinct from `rc_bit()` for the record bit).
- **Double precision.** Operates on IEEE-754 binary64; [`fmulsx`](fmulsx.md) rounds to binary32.
- **`0 × ±∞` is invalid.** Sets `FPSCR[VXIMZ, VX, FX]` and yields a quiet NaN.
- **FPSCR side effects.** Hardware updates `FPRF`, `FR`, `FI`, `FX` plus exception bits `OX` (overflow), `UX` (underflow), `XX` (inexact), `VXIMZ` (0×∞), `VXSNAN` (signalling NaN). The interpreter snapshot above calls `fpscr::check_invalid_mul` and `fpscr::update_after_op`, so xenia-rs does model FPSCR updates for this instruction; how closely those helpers track hardware is not visible in the snapshot.
- **`Rc=1` (`fmul.`)** copies `FPSCR[FX, FEX, VX, OX]` into CR1.
- **NaN propagation.** Any NaN operand yields a quiet NaN; signalling NaNs are quietened.
- **Sign of result.** Standard IEEE: `sign(a) XOR sign(c)`. `+0 × −0 = −0` and `−x × +∞ = −∞` for finite `x > 0`.
- **Denormal flush.** Xenon boots with `FPSCR[NI]=1` (flush-to-zero); xenia inherits host IEEE behavior, so multiplications that produce subnormal results may differ subtly from hardware.
- **Rounding mode** uses `FPSCR[RN]` (default nearest-even).

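
Two of the cases above, the invalid `0 × ∞` product and the sign rule, are easy to check against host IEEE-754 arithmetic. The helpers below are a minimal sketch with hypothetical names; they only classify operands and do not touch any FPSCR state.

```rust
/// True when a multiply hits the invalid `0 × ∞` case (the VXIMZ condition).
/// Hypothetical helper; a real implementation would also set FPSCR bits.
fn is_zero_times_infinity(a: f64, c: f64) -> bool {
    (a == 0.0 && c.is_infinite()) || (a.is_infinite() && c == 0.0)
}

/// Sign of a non-NaN product: the XOR of the operand sign bits.
fn product_sign_is_negative(a: f64, c: f64) -> bool {
    a.is_sign_negative() ^ c.is_sign_negative()
}
```

Because `a == 0.0` is true for both `+0` and `−0`, the first helper covers every signed combination of zero and infinity.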
## Related Instructions
|
||||
|
||||
- [`fmulsx`](fmulsx.md) — single-precision multiply.
|
||||
- [`fmaddx`](fmaddx.md), [`fmsubx`](fmsubx.md), [`fnmaddx`](fnmaddx.md), [`fnmsubx`](fnmsubx.md) — fused multiply-add family; share the same `FRA × FRC` core but add/subtract `FRB` with a single rounding step. Prefer fused forms for dot products and polynomial evaluation.
|
||||
- [`faddx`](faddx.md), [`fsubx`](fsubx.md), [`fdivx`](fdivx.md) — sibling double-precision arithmetic.
|
||||
- [`fresx`](fresx.md), [`frsqrtex`](frsqrtex.md) — reciprocal helpers commonly paired with `fmul` for reciprocal divides.
|
||||
- [`mffsx`](mffsx.md), [`mtfsfx`](mtfsfx.md) — FPSCR control.
|
||||
|
||||
## IBM Reference
|
||||
|
||||
- [AIX 7.3 — `fmul` (Floating Multiply)](https://www.ibm.com/docs/en/aix/7.3.0?topic=set-fm-fmul-floating-multiply-instruction)
|
||||
- [PowerISA v2.07B, Book I, Chapter 4 — Floating-Point Processor](https://openpowerfoundation.org/specifications/isa/).
|
||||
120
migration/project-root/ppc-manual/fpu/fnabsx.md
Normal file
@@ -0,0 +1,120 @@
|
||||
# `fnabsx` — Floating Negative Absolute Value
|
||||
|
||||
> **Category:** [Floating-Point](../categories/fpu.md) · **Form:** [X](../forms/X.md) · **Opcode:** `0xfc000110`
|
||||
|
||||
<!-- GENERATED: BEGIN -->
|
||||
|
||||
## Assembler Mnemonics

| Mnemonic | XML entry | Flags | Description |
| --- | --- | --- | --- |
| `fnabs` | `fnabsx` | — | Floating Negative Absolute Value |
| `fnabs.` | `fnabsx` | Rc=1 | Floating Negative Absolute Value |

## Syntax

```asm
fnabs[Rc] [FD], [FB]
```

## Encoding

### `fnabsx` — form `X`

- **Opcode word:** `0xfc000110`
- **Primary opcode (bits 0–5):** `63`
- **Extended opcode:** `136`
- **Synchronising:** no

| Bits | Field | Meaning |
| --- | --- | --- |
| 0–5 | `OPCD` | primary opcode |
| 6–10 | `RT/FRT/VRT` | destination |
| 11–15 | `RA/FRA/VRA` | source A |
| 16–20 | `RB/FRB/VRB` | source B |
| 21–30 | `XO` | extended opcode (10 bits) |
| 31 | `Rc` | record-form flag |

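
For the X-form layout above, the extended opcode is ten bits wide rather than five, so the mask differs from the A-form case. The sketch below extracts only the fields that `fnabs` and `fneg` use; `x_form_fields` is a hypothetical name, and the tuple return is chosen for brevity rather than mirroring the xenia-rs decoder.

```rust
/// Extract (OPCD, FRT, FRB, XO, Rc) from an X-form word. PowerPC bit 0 is
/// the most significant bit, so bits 21–30 sit one position above the LSB.
/// Illustrative helper, not the xenia-rs decoder API.
fn x_form_fields(word: u32) -> (u32, u32, u32, u32, bool) {
    let opcd = word >> 26;          // bits 0–5
    let frt = (word >> 21) & 0x1f;  // bits 6–10
    let frb = (word >> 11) & 0x1f;  // bits 16–20
    let xo = (word >> 1) & 0x3ff;   // bits 21–30 (10 bits)
    let rc = (word & 1) == 1;       // bit 31
    (opcd, frt, frb, xo, rc)
}
```

Applied to the opcode word `0xfc000110`, this yields primary opcode 63 and extended opcode 136, as listed under Encoding.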
## Operands

| Field | Role | Description |
| --- | --- | --- |
| `FB` | fnabsx: read | Source B floating-point register. |
| `FD` | fnabsx: write | Destination floating-point register. |
| `CR` | fnabsx: write (conditional) | Condition-register update. When `Rc=1`, CR1 is loaded from `FPSCR[FX, FEX, VX, OX]`. |

## Register Effects
|
||||
|
||||
### `fnabsx`
|
||||
|
||||
- **Reads (always):** `FB`
|
||||
- **Reads (conditional):** _none_
|
||||
- **Writes (always):** `FD`
|
||||
- **Writes (conditional):** `CR`
|
||||
|
||||
## Status-Register Effects

- `fnabsx`: **CR1** ← `FPSCR[FX, FEX, VX, OX]` when `Rc=1`. FPSCR itself is not modified.

## Operation (pseudocode)

```
FRT <- set_sign(FRB)
```

## C Translation Example

```c
/* C translation: the xenia-rs interpreter arm below in */
/* Implementation References is the authoritative semantic */
/* snapshot. Translate it line-by-line: */
/* - ctx.gpr[N] -> r[N] (or f[]/v[] for FPRs/VRs) */
/* - mem.read_uN / mem.write_uN -> mem_read_uN_be / mem_write_uN_be */
/* - ctx.update_cr_signed(fld, v) -> update_cr_signed(fld, v) */
/* - ctx.xer_ca / xer_ov / xer_so -> xer.CA / xer.OV / xer.SO */
/* The Register Effects and Status-Register Effects tables above */
/* enumerate every side effect a faithful translation must emit. */
```

## Implementation References
|
||||
|
||||
**`fnabsx`**
|
||||
- xenia-canary XML: [`tools/ppc-instructions.xml` — search for `mnem="fnabsx"`](../../xenia-canary/tools/ppc-instructions.xml)
|
||||
- xenia-canary emit: [`src/xenia/cpu/ppc/ppc_emit_fpu.cc:504`](../../xenia-canary/src/xenia/cpu/ppc/ppc_emit_fpu.cc#L504)
|
||||
- xenia-rs opcode: [`crates/xenia-cpu/src/opcode.rs:29`](../../xenia-rs/crates/xenia-cpu/src/opcode.rs#L29)
|
||||
- xenia-rs decoder: [`crates/xenia-cpu/src/decoder.rs:908`](../../xenia-rs/crates/xenia-cpu/src/decoder.rs#L908)
|
||||
- xenia-rs interpreter: [`crates/xenia-cpu/src/interpreter.rs:2767-2771`](../../xenia-rs/crates/xenia-cpu/src/interpreter.rs#L2767-L2771)
|
||||
<details><summary>xenia-rs interpreter body (frozen snapshot)</summary>
|
||||
|
||||
```rust
PpcOpcode::fnabsx => {
    ctx.fpr[instr.rd()] = -(ctx.fpr[instr.rb()].abs());
    if instr.rc_bit() { update_cr1_from_fpscr(ctx); }
    ctx.pc += 4;
}
```
|
||||
</details>
|
||||
|
||||
<!-- GENERATED: END -->
|
||||
|
||||
## Special Cases & Edge Conditions

- **Bit-pattern operation, no rounding.** `fnabs` **sets** the sign bit (bit 0) of the source binary64 value to 1, producing `-|FRB|`. No precision change, no exception bits. xenia-rs implements this as `-(b.abs())`: the `abs` clears the sign bit, then the negation sets it.
- **NaN handling.** Returns the source NaN with the sign bit set to 1; payload preserved; signalling/quiet bit unchanged. `FPSCR[VXSNAN]` is **not** raised.
- **Special values.** `fnabs(±0) = -0`; `fnabs(±∞) = -∞`; `fnabs(±NaN) = -NaN` (sign set, payload preserved).
- **FPSCR.** Hardware does not update `FPRF` and does not raise any exception bit. Sign-bit ops are not arithmetic.
- **`Rc=1` (`fnabs.`)** copies `FPSCR[FX, FEX, VX, OX]` into CR1.
- **No `FRA`.** X-form, primary 63, XO 136. Reads `FRB` only.
- **Use case.** Less common than `fabs`/`fneg`. Useful for unconditional negative-magnitude values, e.g. forcing a value to be on the negative side of zero before a subsequent compare or for bit-pattern setup.

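
Since `fnabs` is defined on the bit pattern rather than on the numeric value, it can be modelled directly on the raw 64-bit FPR image. The sketch below works on `u64` so that NaN payloads are never routed through host floating-point operations. `SIGN_BIT` and `fnabs_bits` are hypothetical names for illustration.

```rust
/// PowerPC bit 0 of the 64-bit FPR image, i.e. the IEEE-754 sign bit.
const SIGN_BIT: u64 = 1 << 63;

/// `fnabs` on the raw FPR image: force the sign bit to 1 and leave the
/// exponent and fraction (including any NaN payload) untouched.
/// Illustrative helper, not the xenia-rs implementation.
fn fnabs_bits(frb: u64) -> u64 {
    frb | SIGN_BIT
}
```

Both `+0` and `−0` map to the `−0` pattern, and a signalling NaN keeps its payload and its signalling state.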
## Related Instructions
|
||||
|
||||
- [`fabsx`](fabsx.md) — clear sign bit (positive).
|
||||
- [`fnegx`](fnegx.md) — toggle sign bit.
|
||||
- [`fmrx`](fmrx.md) — plain register copy.
|
||||
- [`fselx`](fselx.md) — branch-free select; with `fabs`/`fnabs` synthesises `copysign`-like helpers.
|
||||
|
||||
## IBM Reference
|
||||
|
||||
- [AIX 7.3 — `fnabs` (Floating Negative Absolute Value)](https://www.ibm.com/docs/en/aix/7.3.0?topic=set-fnabs-floating-negative-absolute-value-instruction)
|
||||
- [PowerISA v2.07B, Book I, Chapter 4 — Floating-Point Processor](https://openpowerfoundation.org/specifications/isa/).
|
||||
121
migration/project-root/ppc-manual/fpu/fnegx.md
Normal file
@@ -0,0 +1,121 @@
|
||||
# `fnegx` — Floating Negate
|
||||
|
||||
> **Category:** [Floating-Point](../categories/fpu.md) · **Form:** [X](../forms/X.md) · **Opcode:** `0xfc000050`
|
||||
|
||||
<!-- GENERATED: BEGIN -->
|
||||
|
||||
## Assembler Mnemonics

| Mnemonic | XML entry | Flags | Description |
| --- | --- | --- | --- |
| `fneg` | `fnegx` | — | Floating Negate |
| `fneg.` | `fnegx` | Rc=1 | Floating Negate |

## Syntax

```asm
fneg[Rc] [FD], [FB]
```

## Encoding

### `fnegx` — form `X`

- **Opcode word:** `0xfc000050`
- **Primary opcode (bits 0–5):** `63`
- **Extended opcode:** `40`
- **Synchronising:** no

| Bits | Field | Meaning |
| --- | --- | --- |
| 0–5 | `OPCD` | primary opcode |
| 6–10 | `RT/FRT/VRT` | destination |
| 11–15 | `RA/FRA/VRA` | source A |
| 16–20 | `RB/FRB/VRB` | source B |
| 21–30 | `XO` | extended opcode (10 bits) |
| 31 | `Rc` | record-form flag |

## Operands

| Field | Role | Description |
| --- | --- | --- |
| `FB` | fnegx: read | Source B floating-point register. |
| `FD` | fnegx: write | Destination floating-point register. |
| `CR` | fnegx: write (conditional) | Condition-register update. When `Rc=1`, CR1 is loaded from `FPSCR[FX, FEX, VX, OX]`. |

## Register Effects
|
||||
|
||||
### `fnegx`
|
||||
|
||||
- **Reads (always):** `FB`
|
||||
- **Reads (conditional):** _none_
|
||||
- **Writes (always):** `FD`
|
||||
- **Writes (conditional):** `CR`
|
||||
|
||||
## Status-Register Effects

- `fnegx`: **CR1** ← `FPSCR[FX, FEX, VX, OX]` when `Rc=1`. FPSCR itself is not modified.

## Operation (pseudocode)

```
FRT <- flip_sign(FRB)
```

## C Translation Example

```c
/* C translation: the xenia-rs interpreter arm below in */
/* Implementation References is the authoritative semantic */
/* snapshot. Translate it line-by-line: */
/* - ctx.gpr[N] -> r[N] (or f[]/v[] for FPRs/VRs) */
/* - mem.read_uN / mem.write_uN -> mem_read_uN_be / mem_write_uN_be */
/* - ctx.update_cr_signed(fld, v) -> update_cr_signed(fld, v) */
/* - ctx.xer_ca / xer_ov / xer_so -> xer.CA / xer.OV / xer.SO */
/* The Register Effects and Status-Register Effects tables above */
/* enumerate every side effect a faithful translation must emit. */
```

## Implementation References
|
||||
|
||||
**`fnegx`**
|
||||
- xenia-canary XML: [`tools/ppc-instructions.xml` — search for `mnem="fnegx"`](../../xenia-canary/tools/ppc-instructions.xml)
|
||||
- xenia-canary emit: [`src/xenia/cpu/ppc/ppc_emit_fpu.cc:515`](../../xenia-canary/src/xenia/cpu/ppc/ppc_emit_fpu.cc#L515)
|
||||
- xenia-rs opcode: [`crates/xenia-cpu/src/opcode.rs:29`](../../xenia-rs/crates/xenia-cpu/src/opcode.rs#L29)
|
||||
- xenia-rs decoder: [`crates/xenia-cpu/src/decoder.rs:903`](../../xenia-rs/crates/xenia-cpu/src/decoder.rs#L903)
|
||||
- xenia-rs interpreter: [`crates/xenia-cpu/src/interpreter.rs:2762-2766`](../../xenia-rs/crates/xenia-cpu/src/interpreter.rs#L2762-L2766)
|
||||
<details><summary>xenia-rs interpreter body (frozen snapshot)</summary>
|
||||
|
||||
```rust
PpcOpcode::fnegx => {
    ctx.fpr[instr.rd()] = -ctx.fpr[instr.rb()];
    if instr.rc_bit() { update_cr1_from_fpscr(ctx); }
    ctx.pc += 4;
}
```
|
||||
</details>
|
||||
|
||||
<!-- GENERATED: END -->
|
||||
|
||||
## Special Cases & Edge Conditions

- **Bit-pattern operation, no rounding.** `fneg` toggles the sign bit (bit 0) of the source binary64 value and writes the 64-bit pattern to the destination. No precision change, no exception bits.
- **NaN handling.** PowerISA specifies that `fneg` toggles the NaN sign bit (unlike `fnmadd`, which does **not**). xenia-rs uses Rust's unary `-`, which toggles the sign bit on binary64 NaN values, so the semantics match.
- **Special values.** `fneg(+0) = -0`; `fneg(-0) = +0`; `fneg(±∞) = ∓∞`. No `FPSCR[VXSNAN]` raised even on signalling NaN inputs (sign-bit ops are not arithmetic).
- **FPSCR.** Hardware does **not** update `FPRF` and does **not** raise any exception bit. With `Rc=1` the instruction only reads the existing FPSCR contents to update CR1; it never writes FPSCR.
- **`Rc=1` (`fneg.`)** copies `FPSCR[FX, FEX, VX, OX]` into CR1.
- **No `FRA`.** X-form, primary 63, XO 40. Reads `FRB` only.
- **Use as a free negate.** Common in compiled PPC code for `-x` or as part of negate-and-fma sequences when no fused negative variant exists.

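
As with the other sign-bit instructions, `fneg` can be modelled as a single XOR on the raw 64-bit FPR image. The sketch below uses `u64` throughout so that NaN inputs are handled purely as bit patterns. `SIGN_MASK` and `fneg_bits` are hypothetical names for illustration.

```rust
/// IEEE-754 binary64 sign bit (PowerPC bit 0 of the FPR image).
const SIGN_MASK: u64 = 0x8000_0000_0000_0000;

/// `fneg` on the raw FPR image: toggle the sign bit, leave everything else
/// (exponent, fraction, NaN payload) unchanged. Illustrative helper only.
fn fneg_bits(frb: u64) -> u64 {
    frb ^ SIGN_MASK
}
```

Applying the operation twice returns the original pattern, and a signalling NaN stays signalling with only its sign flipped.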
## Related Instructions
|
||||
|
||||
- [`fabsx`](fabsx.md) — clear sign bit (positive).
|
||||
- [`fnabsx`](fnabsx.md) — set sign bit (negative).
|
||||
- [`fmrx`](fmrx.md) — plain register copy.
|
||||
- [`fnmaddx`](fnmaddx.md), [`fnmsubx`](fnmsubx.md), [`fnmaddsx`](fnmaddsx.md), [`fnmsubsx`](fnmsubsx.md) — fused negate-multiply-add forms; eliminate the need for an explicit `fneg` after an FMA.
|
||||
- [`fselx`](fselx.md) — combined with `fneg` for branch-free `copysign`/clamp patterns.
|
||||
|
||||
## IBM Reference
|
||||
|
||||
- [AIX 7.3 — `fneg` (Floating Negate)](https://www.ibm.com/docs/en/aix/7.3.0?topic=set-fneg-floating-negate-instruction)
|
||||
- [PowerISA v2.07B, Book I, Chapter 4 — Floating-Point Processor](https://openpowerfoundation.org/specifications/isa/) (PPC's `fneg` toggles NaN sign — distinct from the `fnmadd` family).
|
||||
147
migration/project-root/ppc-manual/fpu/fnmaddsx.md
Normal file
@@ -0,0 +1,147 @@
|
||||
# `fnmaddsx` — Floating Negative Multiply-Add Single
|
||||
|
||||
> **Category:** [Floating-Point](../categories/fpu.md) · **Form:** [A](../forms/A.md) · **Opcode:** `0xec00003e`
|
||||
|
||||
<!-- GENERATED: BEGIN -->
|
||||
|
||||
## Assembler Mnemonics

| Mnemonic | XML entry | Flags | Description |
| --- | --- | --- | --- |
| `fnmadds` | `fnmaddsx` | — | Floating Negative Multiply-Add Single |
| `fnmadds.` | `fnmaddsx` | Rc=1 | Floating Negative Multiply-Add Single |

## Syntax

```asm
fnmadds[Rc] [FD], [FA], [FC], [FB]
```

## Encoding

### `fnmaddsx` — form `A`

- **Opcode word:** `0xec00003e`
- **Primary opcode (bits 0–5):** `59`
- **Extended opcode:** `31`
- **Synchronising:** no

| Bits | Field | Meaning |
| --- | --- | --- |
| 0–5 | `OPCD` | primary opcode (59 or 63) |
| 6–10 | `FRT` | destination FPR |
| 11–15 | `FRA` | source A FPR |
| 16–20 | `FRB` | source B FPR |
| 21–25 | `FRC` | source C FPR (multiplier for madd-style ops) |
| 26–30 | `XO` | extended opcode (5 bits) |
| 31 | `Rc` | record-form flag (updates CR1) |

## Operands

| Field | Role | Description |
| --- | --- | --- |
| `FA` | fnmaddsx: read | Source A floating-point register (`fr0`–`fr31`). |
| `FC` | fnmaddsx: read | Source C floating-point register (the multiplier). |
| `FB` | fnmaddsx: read | Source B floating-point register (the addend). |
| `FD` | fnmaddsx: write | Destination floating-point register. |
| `CR` | fnmaddsx: write (conditional) | Condition-register update. When `Rc=1`, CR1 is loaded from `FPSCR[FX, FEX, VX, OX]`. |
| `FPSCR` | fnmaddsx: write | Floating-Point Status and Control Register. |

## Register Effects
|
||||
|
||||
### `fnmaddsx`
|
||||
|
||||
- **Reads (always):** `FA`, `FC`, `FB`
|
||||
- **Reads (conditional):** _none_
|
||||
- **Writes (always):** `FD`, `FPSCR`
|
||||
- **Writes (conditional):** `CR`
|
||||
|
||||
## Status-Register Effects

- `fnmaddsx`: **CR1** ← `FPSCR[FX, FEX, VX, OX]` when `Rc=1`; **FPSCR** updated per IEEE-754 flags (`FX`, `FEX`, `FPRF`, `FR`, `FI`, exception bits).

## Operation (pseudocode)

```
FRT <- RoundToSingle(−((FRA × FRC) + FRB))
```

## C Translation Example

```c
/* C translation: the xenia-rs interpreter arm below in */
/* Implementation References is the authoritative semantic */
/* snapshot. Translate it line-by-line: */
/* - ctx.gpr[N] -> r[N] (or f[]/v[] for FPRs/VRs) */
/* - mem.read_uN / mem.write_uN -> mem_read_uN_be / mem_write_uN_be */
/* - ctx.update_cr_signed(fld, v) -> update_cr_signed(fld, v) */
/* - ctx.xer_ca / xer_ov / xer_so -> xer.CA / xer.OV / xer.SO */
/* The Register Effects and Status-Register Effects tables above */
/* enumerate every side effect a faithful translation must emit. */
```

## Implementation References
|
||||
|
||||
**`fnmaddsx`**
|
||||
- xenia-canary XML: [`tools/ppc-instructions.xml` — search for `mnem="fnmaddsx"`](../../xenia-canary/tools/ppc-instructions.xml)
|
||||
- xenia-canary emit: [`src/xenia/cpu/ppc/ppc_emit_fpu.cc:222`](../../xenia-canary/src/xenia/cpu/ppc/ppc_emit_fpu.cc#L222)
|
||||
- xenia-rs opcode: [`crates/xenia-cpu/src/opcode.rs:29`](../../xenia-rs/crates/xenia-cpu/src/opcode.rs#L29)
|
||||
- xenia-rs decoder: [`crates/xenia-cpu/src/decoder.rs:395`](../../xenia-rs/crates/xenia-cpu/src/decoder.rs#L395)
|
||||
- xenia-rs interpreter: [`crates/xenia-cpu/src/interpreter.rs:2706-2720`](../../xenia-rs/crates/xenia-cpu/src/interpreter.rs#L2706-L2720)
|
||||
<details><summary>xenia-rs interpreter body (frozen snapshot)</summary>
|
||||
|
||||
```rust
PpcOpcode::fnmaddsx => {
    // PPCBUG-181 + PPCBUG-183: VXISI + NaN sign preservation.
    let a = ctx.fpr[instr.ra()];
    let c = ctx.fpr[instr.rc()];
    let b = ctx.fpr[instr.rb()];
    fpscr::check_invalid_mul(ctx, a, c);
    fpscr::check_invalid_fma_add(ctx, a, c, b, false);
    let fma = a.mul_add(c, b);
    let neg = if fma.is_nan() { fma } else { -fma };
    let result = to_single(ctx, neg);
    ctx.fpr[instr.rd()] = result;
    fpscr::update_after_op(ctx, result, a.is_finite() && b.is_finite() && c.is_finite());
    if instr.rc_bit() { update_cr1_from_fpscr(ctx); }
    ctx.pc += 4;
}
```
|
||||
</details>
|
||||
|
||||
<!-- GENERATED: END -->
|
||||
|
||||
## Special Cases & Edge Conditions

- **Fused multiply-add, negate, round to single.** Computes `−((FRA × FRC) + FRB)` and rounds to binary32. The xenia-rs snapshot evaluates `a.mul_add(c, b)` in binary64, negates the result unless it is a NaN, and passes it to `to_single(ctx, …)` for the final rounding to binary32.
- **NaN sign behaviour.** PowerISA specifies that the negation does **not** flip the sign bit of a NaN result. The snapshot honours this (PPCBUG-183): the negation is skipped when `fma.is_nan()`, so the NaN sign is preserved.
- **Operand order.** Assembler: `FD, FA, FC, FB`, so the multiplier `FC` precedes the addend `FB`.
- **Invalid operations.** `0×∞` → `VXIMZ`; `∞ − ∞` in the add step → `VXISI`. Either case yields a quiet NaN and sets `FPSCR[VX, FX]`.
- **FPSCR side effects.** Hardware updates `FPRF`, `FR`, `FI`, `FX`, `OX`, `UX`, `XX`, `VXIMZ`, `VXISI`, `VXSNAN`. The snapshot calls `fpscr::check_invalid_mul`, `fpscr::check_invalid_fma_add`, and `fpscr::update_after_op`, so xenia-rs does model FPSCR updates here; how closely those helpers track hardware is not visible in the snapshot.
- **`Rc=1` (`fnmadds.`)** copies `FPSCR[FX, FEX, VX, OX]` into CR1.
- **NaN propagation.** Quiet-NaN result for any NaN operand; signalling NaNs are quietened.
- **Single-precision overflow** returns ±∞ (under the default round-to-nearest mode) and sets `OX`/`XX`/`FX`.
- **Use case.** Single-precision Newton-Raphson refinement and graphics-pipeline math where the negated product-sum form is convenient.
- **Denormal flush.** Xenon boots with `FPSCR[NI]=1`; xenia uses host IEEE behavior.

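
The value computation described above (fused multiply-add, NaN-preserving negation, rounding to single) can be sketched as a standalone function. This is an illustration only: `fnmadds_value` is a hypothetical name, FPSCR handling is omitted, and the final rounding is modelled with an `f64 -> f32 -> f64` cast, so the result is rounded once to binary64 and again to binary32 rather than once as on hardware.

```rust
/// Value-only model of `fnmadds` (hypothetical helper, for illustration).
/// Operand order follows the assembler: multiplier `c`, then addend `b`.
fn fnmadds_value(a: f64, c: f64, b: f64) -> f64 {
    // (a × c) + b with a single rounding to binary64.
    let fma = a.mul_add(c, b);
    // Negate, but leave a NaN result's sign bit alone.
    let neg = if fma.is_nan() { fma } else { -fma };
    // Round to binary32, then widen back to the 64-bit FPR format.
    (neg as f32) as f64
}
```

Note that a zero result is negated too, so `fnmadds_value(0.0, 1.0, 0.0)` produces `−0`.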
## Related Instructions
|
||||
|
||||
- [`fnmaddx`](fnmaddx.md) — double-precision sibling.
|
||||
- [`fmaddsx`](fmaddsx.md), [`fmsubsx`](fmsubsx.md), [`fnmsubsx`](fnmsubsx.md) — other single-precision fused-multiply variants.
|
||||
- [`fmulsx`](fmulsx.md), [`faddsx`](faddsx.md) — non-fused decomposition.
|
||||
- [`fnegx`](fnegx.md), [`fnabsx`](fnabsx.md) — sign-bit ops.
|
||||
- [`frspx`](frspx.md) — explicit double→single rounding.
|
||||
|
||||
## IBM Reference
|
||||
|
||||
- [AIX 7.3 — `fnmadds` (Floating Negative Multiply-Add Single)](https://www.ibm.com/docs/en/aix/7.3.0?topic=set-fnmadds-floating-negative-multiply-add-single-instruction)
|
||||
- [PowerISA v2.07B, Book I, Chapter 4 — Floating-Point Processor](https://openpowerfoundation.org/specifications/isa/).
|
||||
138
migration/project-root/ppc-manual/fpu/fnmaddx.md
Normal file
@@ -0,0 +1,138 @@
|
||||
# `fnmaddx` — Floating Negative Multiply-Add
|
||||
|
||||
> **Category:** [Floating-Point](../categories/fpu.md) · **Form:** [A](../forms/A.md) · **Opcode:** `0xfc00003e`
|
||||
|
||||
<!-- GENERATED: BEGIN -->
|
||||
|
||||
## Assembler Mnemonics

| Mnemonic | XML entry | Flags | Description |
| --- | --- | --- | --- |
| `fnmadd` | `fnmaddx` | — | Floating Negative Multiply-Add |
| `fnmadd.` | `fnmaddx` | Rc=1 | Floating Negative Multiply-Add |

## Syntax

```asm
fnmadd[Rc] [FD], [FA], [FC], [FB]
```

## Encoding

### `fnmaddx` — form `A`

- **Opcode word:** `0xfc00003e`
- **Primary opcode (bits 0–5):** `63`
- **Extended opcode:** `31`
- **Synchronising:** no

| Bits | Field | Meaning |
| --- | --- | --- |
| 0–5 | `OPCD` | primary opcode (59 or 63) |
| 6–10 | `FRT` | destination FPR |
| 11–15 | `FRA` | source A FPR |
| 16–20 | `FRB` | source B FPR |
| 21–25 | `FRC` | source C FPR (multiplier for madd-style ops) |
| 26–30 | `XO` | extended opcode (5 bits) |
| 31 | `Rc` | record-form flag (updates CR1) |

## Operands

| Field | Role | Description |
| --- | --- | --- |
| `FA` | fnmaddx: read | Source A floating-point register (`fr0`–`fr31`). |
| `FC` | fnmaddx: read | Source C floating-point register (the multiplier). |
| `FB` | fnmaddx: read | Source B floating-point register (the addend). |
| `FD` | fnmaddx: write | Destination floating-point register. |
| `CR` | fnmaddx: write (conditional) | Condition-register update. When `Rc=1`, CR1 is loaded from `FPSCR[FX, FEX, VX, OX]`. |
| `FPSCR` | fnmaddx: write | Floating-Point Status and Control Register. |

## Register Effects
|
||||
|
||||
### `fnmaddx`
|
||||
|
||||
- **Reads (always):** `FA`, `FC`, `FB`
|
||||
- **Reads (conditional):** _none_
|
||||
- **Writes (always):** `FD`, `FPSCR`
|
||||
- **Writes (conditional):** `CR`
|
||||
|
||||
## Status-Register Effects

- `fnmaddx`: **CR1** ← `FPSCR[FX, FEX, VX, OX]` when `Rc=1`; **FPSCR** updated per IEEE-754 flags (`FX`, `FEX`, `FPRF`, `FR`, `FI`, exception bits).

## Operation (pseudocode)

```
FRT <- −((FRA × FRC) + FRB)
```

## C Translation Example

```c
/* C translation: the xenia-rs interpreter arm below in */
/* Implementation References is the authoritative semantic */
/* snapshot. Translate it line-by-line: */
/* - ctx.gpr[N] -> r[N] (or f[]/v[] for FPRs/VRs) */
/* - mem.read_uN / mem.write_uN -> mem_read_uN_be / mem_write_uN_be */
/* - ctx.update_cr_signed(fld, v) -> update_cr_signed(fld, v) */
/* - ctx.xer_ca / xer_ov / xer_so -> xer.CA / xer.OV / xer.SO */
/* The Register Effects and Status-Register Effects tables above */
/* enumerate every side effect a faithful translation must emit. */
```

## Implementation References
|
||||
|
||||
**`fnmaddx`**
|
||||
- xenia-canary XML: [`tools/ppc-instructions.xml` — search for `mnem="fnmaddx"`](../../xenia-canary/tools/ppc-instructions.xml)
- xenia-canary emit: [`src/xenia/cpu/ppc/ppc_emit_fpu.cc:213`](../../xenia-canary/src/xenia/cpu/ppc/ppc_emit_fpu.cc#L213)
- xenia-rs opcode: [`crates/xenia-cpu/src/opcode.rs:29`](../../xenia-rs/crates/xenia-cpu/src/opcode.rs#L29)
- xenia-rs decoder: [`crates/xenia-cpu/src/decoder.rs:930`](../../xenia-rs/crates/xenia-cpu/src/decoder.rs#L930)
- xenia-rs interpreter: [`crates/xenia-cpu/src/interpreter.rs:2692-2705`](../../xenia-rs/crates/xenia-cpu/src/interpreter.rs#L2692-L2705)
<details><summary>xenia-rs interpreter body (frozen snapshot)</summary>

```rust
PpcOpcode::fnmaddx => {
    // PPCBUG-203: missing VXISI. PPCBUG-205: NaN sign preserved (no negation on NaN).
    let a = ctx.fpr[instr.ra()];
    let c = ctx.fpr[instr.rc()];
    let b = ctx.fpr[instr.rb()];
    fpscr::check_invalid_mul(ctx, a, c);
    fpscr::check_invalid_fma_add(ctx, a, c, b, false);
    let fma = a.mul_add(c, b);
    let result = if fma.is_nan() { fma } else { -fma };
    ctx.fpr[instr.rd()] = result;
    fpscr::update_after_op(ctx, result, a.is_finite() && b.is_finite() && c.is_finite());
    if instr.rc_bit() { update_cr1_from_fpscr(ctx); }
    ctx.pc += 4;
}
```
</details>

<!-- GENERATED: END -->

## Special Cases & Edge Conditions

- **Single rounding step, then sign flip.** Computes `−((FRA × FRC) + FRB)` with one fused rounding for the FMA; the final negation is a bit-pattern sign flip and introduces no additional rounding error. The frozen snapshot above computes `a.mul_add(c, b)` and negates the result unless it is a NaN.
- **Sign of NaN.** Per PowerISA, `fnmadd` does **not** flip the sign of a NaN result. An unconditional Rust negation (`f64::neg`) would flip the NaN sign bit, which is invisible to arithmetic comparisons but observable through bit-level inspection. The frozen snapshot above guards the negation with `fma.is_nan()` (PPCBUG-205) and so keeps the NaN sign; any code path that negates unconditionally is a **xenia quirk**, and title code that inspects NaN sign bits will diverge on it.
- **Operand order.** Assembler: `FD, FA, FC, FB`. The multiplier `FC` precedes the addend `FB`, although `FRB` sits before `FRC` in the encoding.
- **Invalid operations.** Same as `fmadd`: `VXIMZ` for `0×∞`; `VXISI` when the product and `FRB` are infinities of opposite sign (`∞ − ∞`). The result is a quiet NaN.
- **FPSCR side effects.** Hardware updates `FPRF`, `FR`, `FI`, `FX`, `OX`, `UX`, `XX`, `VXIMZ`, `VXISI`, `VXSNAN`. The frozen snapshot above raises invalid-operation flags through `fpscr::check_invalid_mul` and `fpscr::check_invalid_fma_add`, then calls `fpscr::update_after_op`; which of the remaining bits those helpers set is not visible from the snapshot, so any bit they skip is a xenia quirk.
- **`Rc=1` (`fnmadd.`)** copies `FPSCR[FX, FEX, VX, OX]` into CR1.
- **NaN propagation.** Quiet-NaN result for any NaN operand; signalling NaNs are quietened.
- **Use case.** Computes `-a*c - b` directly, without an intermediate negate. Useful in iterative solvers and when transforming polynomial coefficients.
- **Denormal flush.** Xenon boots with `FPSCR[NI]=1`; xenia uses host IEEE behavior.
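
The NaN-sign rule in the list above is easiest to see at the bit level. The sketch below models only the value path of `fnmadd`, with no FPSCR or CR1 updates. `negate_unless_nan` and `fnmadd_value` are hypothetical names that mirror the guarded negation in the frozen snapshot; they are not xenia-rs functions.

```rust
/// Negate a fused multiply-add result, leaving a NaN untouched so that
/// its sign bit survives (the behaviour PowerISA specifies for `fnmadd`).
pub fn negate_unless_nan(fma: f64) -> f64 {
    if fma.is_nan() { fma } else { -fma }
}

/// Value path of `fnmadd`: -((a * c) + b) with a single fused rounding.
/// FPSCR and CR1 side effects are deliberately left out of this sketch.
pub fn fnmadd_value(a: f64, c: f64, b: f64) -> f64 {
    negate_unless_nan(a.mul_add(c, b))
}
```

Comparing `negate_unless_nan(x).to_bits()` with `(-x).to_bits()` for a NaN `x` shows the one-bit difference that only bit-level inspection can observe.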

## Related Instructions

- [`fnmaddsx`](fnmaddsx.md) — single-precision sibling.
- [`fmaddx`](fmaddx.md), [`fmsubx`](fmsubx.md), [`fnmsubx`](fnmsubx.md) — other fused multiply-add variants:
  - `fmadd` = `(A×C) + B`
  - `fmsub` = `(A×C) − B`
  - `fnmsub` = `−((A×C) − B)`
- [`fnegx`](fnegx.md), [`fnabsx`](fnabsx.md) — sign-bit operations on FPRs.
- [`fmulx`](fmulx.md), [`faddx`](faddx.md) — non-fused decomposition.

## IBM Reference

- [AIX 7.3 — `fnmadd` (Floating Negative Multiply-Add)](https://www.ibm.com/docs/en/aix/7.3.0?topic=set-fnma-fnmadd-floating-negative-multiply-add-instruction)
- [PowerISA v2.07B, Book I, Chapter 4 — Floating-Point Processor](https://openpowerfoundation.org/specifications/isa/) (note: PowerISA specifies the negation does not flip NaN sign bits).
147
migration/project-root/ppc-manual/fpu/fnmsubsx.md
Normal file
@@ -0,0 +1,147 @@
# `fnmsubsx` — Floating Negative Multiply-Subtract Single

> **Category:** [Floating-Point](../categories/fpu.md) · **Form:** [A](../forms/A.md) · **Opcode:** `0xec00003c`

<!-- GENERATED: BEGIN -->

## Assembler Mnemonics

| Mnemonic | XML entry | Flags | Description |
| --- | --- | --- | --- |
| `fnmsubs` | `fnmsubsx` | — | Floating Negative Multiply-Subtract Single |
| `fnmsubs.` | `fnmsubsx` | Rc=1 | Floating Negative Multiply-Subtract Single |

## Syntax

```asm
fnmsubs[Rc] [FD], [FA], [FC], [FB]
```

## Encoding

### `fnmsubsx` — form `A`

- **Opcode word:** `0xec00003c`
- **Primary opcode (bits 0–5):** `59`
- **Extended opcode:** `30`
- **Synchronising:** no

| Bits | Field | Meaning |
| --- | --- | --- |
| 0–5 | `OPCD` | primary opcode (59 or 63) |
| 6–10 | `FRT` | destination FPR |
| 11–15 | `FRA` | source A FPR |
| 16–20 | `FRB` | source B FPR |
| 21–25 | `FRC` | source C FPR (multiplier for madd-style ops) |
| 26–30 | `XO` | extended opcode (5 bits) |
| 31 | `Rc` | record-form flag (updates CR1) |

## Operands

| Field | Role | Description |
| --- | --- | --- |
| `FA` | fnmsubsx: read | Source A floating-point register (`fr0`–`fr31`). |
| `FC` | fnmsubsx: read | Source C floating-point register (for madd-style ops). |
| `FB` | fnmsubsx: read | Source B floating-point register. |
| `FD` | fnmsubsx: write | Destination floating-point register. |
| `CR` | fnmsubsx: write (conditional) | Condition-register update. When `Rc=1`, CR1 is updated from `FPSCR[FX, FEX, VX, OX]`. |
| `FPSCR` | fnmsubsx: write | Floating-Point Status and Control Register. |

## Register Effects

### `fnmsubsx`

- **Reads (always):** `FA`, `FC`, `FB`
- **Reads (conditional):** _none_
- **Writes (always):** `FD`, `FPSCR`
- **Writes (conditional):** `CR`

## Status-Register Effects

- `fnmsubsx`: **CR1** ← FPSCR[FX, FEX, VX, OX] when `Rc=1`; **FPSCR** updated per IEEE-754 flags (FX, FEX, FPRF, FR, FI, exceptions).

## Operation (pseudocode)

```
; Fused multiply-subtract at double precision, negated, then rounded
; to single precision. A NaN result keeps its sign.
; Consult the IBM AIX reference link under IBM Reference for
; canonical PPC-style pseudocode.
FRT <- RoundToSingle(−((FRA × FRC) − FRB))
```

## C Translation Example

```c
/* C translation: the xenia-rs interpreter arm below in */
/* Implementation References is the authoritative semantic */
/* snapshot. Translate it line-by-line: */
/* - ctx.gpr[N] -> r[N] (or f[]/v[] for FPRs/VRs) */
/* - mem.read_u* / write_u* -> mem_read_u*_be / mem_write_u*_be */
/* - ctx.update_cr_signed(fld, v) -> update_cr_signed(fld, v) */
/* - ctx.xer_ca / xer_ov / xer_so -> xer.CA / xer.OV / xer.SO */
/* The Register Effects and Status-Register Effects tables above */
/* enumerate every side effect a faithful translation must emit. */
```

## Implementation References

**`fnmsubsx`**
- xenia-canary XML: [`tools/ppc-instructions.xml` — search for `mnem="fnmsubsx"`](../../xenia-canary/tools/ppc-instructions.xml)
- xenia-canary emit: [`src/xenia/cpu/ppc/ppc_emit_fpu.cc:241`](../../xenia-canary/src/xenia/cpu/ppc/ppc_emit_fpu.cc#L241)
- xenia-rs opcode: [`crates/xenia-cpu/src/opcode.rs:29`](../../xenia-rs/crates/xenia-cpu/src/opcode.rs#L29)
- xenia-rs decoder: [`crates/xenia-cpu/src/decoder.rs:394`](../../xenia-rs/crates/xenia-cpu/src/decoder.rs#L394)
- xenia-rs interpreter: [`crates/xenia-cpu/src/interpreter.rs:2735-2749`](../../xenia-rs/crates/xenia-cpu/src/interpreter.rs#L2735-L2749)
<details><summary>xenia-rs interpreter body (frozen snapshot)</summary>

```rust
PpcOpcode::fnmsubsx => {
    // PPCBUG-182 + PPCBUG-183: VXISI + NaN sign preservation.
    let a = ctx.fpr[instr.ra()];
    let c = ctx.fpr[instr.rc()];
    let b = ctx.fpr[instr.rb()];
    fpscr::check_invalid_mul(ctx, a, c);
    fpscr::check_invalid_fma_add(ctx, a, c, b, true);
    let fma = a.mul_add(c, -b);
    let neg = if fma.is_nan() { fma } else { -fma };
    let result = to_single(ctx, neg);
    ctx.fpr[instr.rd()] = result;
    fpscr::update_after_op(ctx, result, a.is_finite() && b.is_finite() && c.is_finite());
    if instr.rc_bit() { update_cr1_from_fpscr(ctx); }
    ctx.pc += 4;
}
```
</details>

<!-- GENERATED: END -->

## Special Cases & Edge Conditions

- **Single rounding, then negate, then round to single.** Computes `−((FRA × FRC) − FRB)` = `FRB − (FRA × FRC)` with one fused rounding at double precision, then rounds to binary32. The frozen snapshot above computes `a.mul_add(c, -b)`, negates the result unless it is a NaN, and passes it through `to_single`.
- **NaN sign behaviour.** PowerISA: the negation does **not** flip a NaN's sign bit. An unconditional Rust negation (`Neg`) would flip it, observable only by bit-level inspection. The frozen snapshot above guards the negation with `fma.is_nan()` (PPCBUG-183); any code path that negates unconditionally is a **xenia quirk**.
- **Operand order.** Assembler: `FD, FA, FC, FB`.
- **Invalid operations.** `0×∞` → `VXIMZ`; product and `FRB` infinities of the same sign, so that the subtraction is `∞ − ∞` → `VXISI`. The result is a quiet NaN.
- **FPSCR side effects.** Hardware updates `FPRF`, `FR`, `FI`, `FX`, `OX`, `UX`, `XX`, `VXIMZ`, `VXISI`, `VXSNAN`. The frozen snapshot above raises invalid-operation flags through `fpscr::check_invalid_mul` and `fpscr::check_invalid_fma_add`, then calls `fpscr::update_after_op`; which of the remaining bits those helpers set is not visible from the snapshot, so any bit they skip is a xenia quirk.
- **`Rc=1` (`fnmsubs.`)** copies `FPSCR[FX, FEX, VX, OX]` into CR1.
- **NaN propagation.** Quiet-NaN result for any NaN operand; signalling NaNs are quietened.
- **Single-precision overflow** returns ±∞ and sets `OX`/`XX`/`FX`.
- **Use case.** Single-precision Newton-Raphson divide refinement: `x_new = x*(2 - d*x)` is implemented as a `fnmsubs`/`fmuls` pair throughout Xbox 360 graphics code. `fnmsubs` with `FRA = d`, `FRC = x`, `FRB = 2` yields the `2 - d*x` factor, and `fmuls` applies it to `x`.
- **Denormal flush.** Xenon boots with `FPSCR[NI]=1`; xenia uses host IEEE behavior.
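
The Newton-Raphson use case in the list above can be sketched with value-only models of `fnmsubs` and `fmuls`. This is an illustration under stated assumptions, not emulator code: the function names are hypothetical, FPSCR and CR1 effects are omitted, rounding is round-to-nearest-even only, and operands are assumed to be already representable in single precision.

```rust
/// Round a double to single precision and widen it back, the way a
/// single-precision result sits in a 64-bit FPR.
pub fn round_to_single(x: f64) -> f64 {
    x as f32 as f64
}

/// Value path of `fnmsubs`: -((a * c) - b), fused, then rounded to single.
pub fn fnmsubs_value(a: f64, c: f64, b: f64) -> f64 {
    let fma = a.mul_add(c, -b);
    let neg = if fma.is_nan() { fma } else { -fma };
    round_to_single(neg)
}

/// Value path of `fmuls`: a * c rounded to single.
pub fn fmuls_value(a: f64, c: f64) -> f64 {
    round_to_single(a * c)
}

/// One Newton-Raphson step towards 1/d: x_new = x * (2 - d*x).
pub fn refine_reciprocal(d: f64, x: f64) -> f64 {
    let t = fnmsubs_value(d, x, 2.0); // 2 - d*x
    fmuls_value(x, t)
}
```

Starting from a rough guess such as `0.33` for `d = 3`, each call to `refine_reciprocal` moves the estimate closer to `1/3` until single-precision rounding dominates.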

## Related Instructions

- [`fnmsubx`](fnmsubx.md) — double-precision sibling.
- [`fmaddsx`](fmaddsx.md), [`fmsubsx`](fmsubsx.md), [`fnmaddsx`](fnmaddsx.md) — other single-precision fused-multiply variants.
- [`fresx`](fresx.md), [`frsqrtex`](frsqrtex.md) — reciprocal estimates whose Newton-Raphson refinement leans on `fnmsubs`.
- [`fmulsx`](fmulsx.md), [`fsubsx`](fsubsx.md) — non-fused decomposition.
- [`frspx`](frspx.md) — explicit double→single rounding.

## IBM Reference

- [AIX 7.3 — `fnmsubs` (Floating Negative Multiply-Subtract Single)](https://www.ibm.com/docs/en/aix/7.3.0?topic=set-fnmsubs-floating-negative-multiply-subtract-single-instruction)
- [PowerISA v2.07B, Book I, Chapter 4 — Floating-Point Processor](https://openpowerfoundation.org/specifications/isa/).
136
migration/project-root/ppc-manual/fpu/fnmsubx.md
Normal file
@@ -0,0 +1,136 @@
# `fnmsubx` — Floating Negative Multiply-Subtract

> **Category:** [Floating-Point](../categories/fpu.md) · **Form:** [A](../forms/A.md) · **Opcode:** `0xfc00003c`

<!-- GENERATED: BEGIN -->

## Assembler Mnemonics

| Mnemonic | XML entry | Flags | Description |
| --- | --- | --- | --- |
| `fnmsub` | `fnmsubx` | — | Floating Negative Multiply-Subtract |
| `fnmsub.` | `fnmsubx` | Rc=1 | Floating Negative Multiply-Subtract |

## Syntax

```asm
fnmsub[Rc] [FD], [FA], [FC], [FB]
```

## Encoding

### `fnmsubx` — form `A`

- **Opcode word:** `0xfc00003c`
- **Primary opcode (bits 0–5):** `63`
- **Extended opcode:** `30`
- **Synchronising:** no

| Bits | Field | Meaning |
| --- | --- | --- |
| 0–5 | `OPCD` | primary opcode (59 or 63) |
| 6–10 | `FRT` | destination FPR |
| 11–15 | `FRA` | source A FPR |
| 16–20 | `FRB` | source B FPR |
| 21–25 | `FRC` | source C FPR (multiplier for madd-style ops) |
| 26–30 | `XO` | extended opcode (5 bits) |
| 31 | `Rc` | record-form flag (updates CR1) |

## Operands

| Field | Role | Description |
| --- | --- | --- |
| `FA` | fnmsubx: read | Source A floating-point register (`fr0`–`fr31`). |
| `FC` | fnmsubx: read | Source C floating-point register (for madd-style ops). |
| `FB` | fnmsubx: read | Source B floating-point register. |
| `FD` | fnmsubx: write | Destination floating-point register. |
| `CR` | fnmsubx: write (conditional) | Condition-register update. When `Rc=1`, CR1 is updated from `FPSCR[FX, FEX, VX, OX]`. |
| `FPSCR` | fnmsubx: write | Floating-Point Status and Control Register. |

## Register Effects

### `fnmsubx`

- **Reads (always):** `FA`, `FC`, `FB`
- **Reads (conditional):** _none_
- **Writes (always):** `FD`, `FPSCR`
- **Writes (conditional):** `CR`

## Status-Register Effects

- `fnmsubx`: **CR1** ← FPSCR[FX, FEX, VX, OX] when `Rc=1`; **FPSCR** updated per IEEE-754 flags (FX, FEX, FPRF, FR, FI, exceptions).

## Operation (pseudocode)

```
FRT <- −((FRA × FRC) − FRB)
```

## C Translation Example

```c
/* C translation: the xenia-rs interpreter arm below in */
/* Implementation References is the authoritative semantic */
/* snapshot. Translate it line-by-line: */
/* - ctx.gpr[N] -> r[N] (or f[]/v[] for FPRs/VRs) */
/* - mem.read_u* / write_u* -> mem_read_u*_be / mem_write_u*_be */
/* - ctx.update_cr_signed(fld, v) -> update_cr_signed(fld, v) */
/* - ctx.xer_ca / xer_ov / xer_so -> xer.CA / xer.OV / xer.SO */
/* The Register Effects and Status-Register Effects tables above */
/* enumerate every side effect a faithful translation must emit. */
```

## Implementation References

**`fnmsubx`**
- xenia-canary XML: [`tools/ppc-instructions.xml` — search for `mnem="fnmsubx"`](../../xenia-canary/tools/ppc-instructions.xml)
- xenia-canary emit: [`src/xenia/cpu/ppc/ppc_emit_fpu.cc:232`](../../xenia-canary/src/xenia/cpu/ppc/ppc_emit_fpu.cc#L232)
- xenia-rs opcode: [`crates/xenia-cpu/src/opcode.rs:29`](../../xenia-rs/crates/xenia-cpu/src/opcode.rs#L29)
- xenia-rs decoder: [`crates/xenia-cpu/src/decoder.rs:929`](../../xenia-rs/crates/xenia-cpu/src/decoder.rs#L929)
- xenia-rs interpreter: [`crates/xenia-cpu/src/interpreter.rs:2721-2734`](../../xenia-rs/crates/xenia-cpu/src/interpreter.rs#L2721-L2734)
<details><summary>xenia-rs interpreter body (frozen snapshot)</summary>

```rust
PpcOpcode::fnmsubx => {
    // PPCBUG-203: VXISI. PPCBUG-205: NaN sign preservation.
    let a = ctx.fpr[instr.ra()];
    let c = ctx.fpr[instr.rc()];
    let b = ctx.fpr[instr.rb()];
    fpscr::check_invalid_mul(ctx, a, c);
    fpscr::check_invalid_fma_add(ctx, a, c, b, true);
    let fma = a.mul_add(c, -b);
    let result = if fma.is_nan() { fma } else { -fma };
    ctx.fpr[instr.rd()] = result;
    fpscr::update_after_op(ctx, result, a.is_finite() && b.is_finite() && c.is_finite());
    if instr.rc_bit() { update_cr1_from_fpscr(ctx); }
    ctx.pc += 4;
}
```
</details>

<!-- GENERATED: END -->

## Special Cases & Edge Conditions

- **Single rounding step, then sign flip.** Computes `−((FRA × FRC) − FRB)` = `FRB − (FRA × FRC)` with one fused rounding. The frozen snapshot above computes `a.mul_add(c, -b)` and negates the result unless it is a NaN, which is mathematically equivalent for every non-NaN result.
- **NaN sign behaviour.** PowerISA: the negation does **not** flip the sign of a NaN result. An unconditional Rust negation (`Neg`) would flip the sign bit on NaNs, observable only via bit-level inspection. The frozen snapshot above guards the negation with `fma.is_nan()` (PPCBUG-205); any code path that negates unconditionally is a **xenia quirk**.
- **Operand order.** Assembler: `FD, FA, FC, FB`.
- **Invalid operations.** `0×∞` → `VXIMZ`; same-signed-infinity collision between the product and `FRB` (e.g. `(+∞) − (+∞)`) → `VXISI`. The result is a quiet NaN.
- **FPSCR side effects.** Hardware updates `FPRF`, `FR`, `FI`, `FX`, `OX`, `UX`, `XX`, `VXIMZ`, `VXISI`, `VXSNAN`. The frozen snapshot above raises invalid-operation flags through `fpscr::check_invalid_mul` and `fpscr::check_invalid_fma_add`, then calls `fpscr::update_after_op`; which of the remaining bits those helpers set is not visible from the snapshot, so any bit they skip is a xenia quirk.
- **`Rc=1` (`fnmsub.`)** copies `FPSCR[FX, FEX, VX, OX]` into CR1.
- **NaN propagation.** Quiet-NaN result for any NaN operand; signalling NaNs are quietened.
- **Use case.** The canonical Newton-Raphson divide refinement step `x_new = x*(2 - d*x)`: `fnmsub` with `FRA = d`, `FRC = x`, `FRB = 2` yields the `2 - d*x` factor. This is a common operand pattern in compiled PPC graphics code that does software reciprocals.
- **Denormal flush.** Xenon boots with `FPSCR[NI]=1`; xenia uses host IEEE behavior.
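
The single-rounding property in the list above is what separates `fnmsub` from an `fmul` followed by `fsub`. The sketch below contrasts the two on the value path only. `fnmsub_fused` and `fnmsub_unfused` are hypothetical names, and the inputs suggested afterwards are chosen so that the product needs more than 53 significand bits.

```rust
/// `fnmsub` value path: -((a * c) - b) with one rounding (fused).
pub fn fnmsub_fused(a: f64, c: f64, b: f64) -> f64 {
    let fma = a.mul_add(c, -b);
    if fma.is_nan() { fma } else { -fma }
}

/// Non-fused decomposition: the product is rounded to double before the
/// subtraction, so low-order bits of a * c can be lost.
pub fn fnmsub_unfused(a: f64, c: f64, b: f64) -> f64 {
    -((a * c) - b)
}
```

With `e = 2^-30`, `a = 1 + e`, `c = 1 - e`, and `b = 1`, the exact product is `1 - 2^-60`. The fused form returns `2^-60`, while the decomposition rounds the product to `1.0` first and returns zero.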

## Related Instructions

- [`fnmsubsx`](fnmsubsx.md) — single-precision sibling.
- [`fmaddx`](fmaddx.md), [`fmsubx`](fmsubx.md), [`fnmaddx`](fnmaddx.md) — other fused multiply-add variants.
- [`fresx`](fresx.md) — reciprocal estimate; `fnmsub` is the workhorse of NR refinement of `fres` outputs.
- [`frsqrtex`](frsqrtex.md) — reciprocal-sqrt estimate; also refined with `fnmsub`-style chains.
- [`fmulx`](fmulx.md), [`fsubx`](fsubx.md) — non-fused decomposition.

## IBM Reference

- [AIX 7.3 — `fnmsub` (Floating Negative Multiply-Subtract)](https://www.ibm.com/docs/en/aix/7.3.0?topic=set-fnms-fnmsub-floating-negative-multiply-subtract-instruction)
- [PowerISA v2.07B, Book I, Chapter 4 — Floating-Point Processor](https://openpowerfoundation.org/specifications/isa/).
151
migration/project-root/ppc-manual/fpu/fresx.md
Normal file
@@ -0,0 +1,151 @@
# `fresx` — Floating Reciprocal Estimate Single

> **Category:** [Floating-Point](../categories/fpu.md) · **Form:** [A](../forms/A.md) · **Opcode:** `0xec000030`

<!-- GENERATED: BEGIN -->

## Assembler Mnemonics

| Mnemonic | XML entry | Flags | Description |
| --- | --- | --- | --- |
| `fres` | `fresx` | — | Floating Reciprocal Estimate Single |
| `fres.` | `fresx` | Rc=1 | Floating Reciprocal Estimate Single |

## Syntax

```asm
fres[Rc] [FD], [FB]
```

## Encoding

### `fresx` — form `A`

- **Opcode word:** `0xec000030`
- **Primary opcode (bits 0–5):** `59`
- **Extended opcode:** `24`
- **Synchronising:** no

| Bits | Field | Meaning |
| --- | --- | --- |
| 0–5 | `OPCD` | primary opcode (59 or 63) |
| 6–10 | `FRT` | destination FPR |
| 11–15 | `FRA` | source A FPR (ignored by `fres`) |
| 16–20 | `FRB` | source B FPR |
| 21–25 | `FRC` | source C FPR (ignored by `fres`) |
| 26–30 | `XO` | extended opcode (5 bits) |
| 31 | `Rc` | record-form flag (updates CR1) |

## Operands

| Field | Role | Description |
| --- | --- | --- |
| `FB` | fresx: read | Source B floating-point register. |
| `FD` | fresx: write | Destination floating-point register. |
| `CR` | fresx: write (conditional) | Condition-register update. When `Rc=1`, CR1 is updated from `FPSCR[FX, FEX, VX, OX]`. |
| `FPSCR` | fresx: write | Floating-Point Status and Control Register. |

## Register Effects

### `fresx`

- **Reads (always):** `FB`
- **Reads (conditional):** _none_
- **Writes (always):** `FD`, `FPSCR`
- **Writes (conditional):** `CR`

## Status-Register Effects

- `fresx`: **CR1** ← FPSCR[FX, FEX, VX, OX] when `Rc=1`; **FPSCR** updated per IEEE-754 flags (FX, FEX, FPRF, FR, FI, exceptions).

## Operation (pseudocode)

```
; Low-precision estimate of the reciprocal, delivered in single format.
; Consult the IBM AIX reference link under IBM Reference for
; canonical PPC-style pseudocode.
FRT <- ReciprocalEstimateSingle(FRB)    ; approximately 1 / FRB
```

## C Translation Example

```c
/* C translation: the xenia-rs interpreter arm below in */
/* Implementation References is the authoritative semantic */
/* snapshot. Translate it line-by-line: */
/* - ctx.gpr[N] -> r[N] (or f[]/v[] for FPRs/VRs) */
/* - mem.read_u* / write_u* -> mem_read_u*_be / mem_write_u*_be */
/* - ctx.update_cr_signed(fld, v) -> update_cr_signed(fld, v) */
/* - ctx.xer_ca / xer_ov / xer_so -> xer.CA / xer.OV / xer.SO */
/* The Register Effects and Status-Register Effects tables above */
/* enumerate every side effect a faithful translation must emit. */
```

## Implementation References

**`fresx`**
- xenia-canary XML: [`tools/ppc-instructions.xml` — search for `mnem="fresx"`](../../xenia-canary/tools/ppc-instructions.xml)
- xenia-canary emit: [`src/xenia/cpu/ppc/ppc_emit_fpu.cc:106`](../../xenia-canary/src/xenia/cpu/ppc/ppc_emit_fpu.cc#L106)
- xenia-rs opcode: [`crates/xenia-cpu/src/opcode.rs:29`](../../xenia-rs/crates/xenia-cpu/src/opcode.rs#L29)
- xenia-rs decoder: [`crates/xenia-cpu/src/decoder.rs:390`](../../xenia-rs/crates/xenia-cpu/src/decoder.rs#L390)
- xenia-rs interpreter: [`crates/xenia-cpu/src/interpreter.rs:2815-2835`](../../xenia-rs/crates/xenia-cpu/src/interpreter.rs#L2815-L2835)
<details><summary>xenia-rs interpreter body (frozen snapshot)</summary>

```rust
PpcOpcode::fresx => {
    // Single-precision reciprocal estimate: frD = 1.0 / frB.
    // PPCBUG-184: pre-quantize input to f32 to match canary's
    // `f.Recip(f.Convert(frB, FLOAT32_TYPE))` behavior. Hardware
    // produces a ~12-bit LUT estimate; both emulators produce a
    // fully-IEEE single reciprocal, but the f32 quantization at
    // least makes the input precision match.
    let b_full = ctx.fpr[instr.rb()];
    let b = b_full as f32 as f64;
    if b == 0.0 {
        fpscr::set_exception(ctx, fpscr::ZX);
    }
    if fpscr::is_snan(b_full) {
        fpscr::set_exception(ctx, fpscr::VXSNAN);
    }
    let result = to_single(ctx, 1.0 / b);
    ctx.fpr[instr.rd()] = result;
    fpscr::update_after_op(ctx, result, b.is_finite() && b != 0.0);
    if instr.rc_bit() { update_cr1_from_fpscr(ctx); }
    ctx.pc += 4;
}
```
</details>

<!-- GENERATED: END -->

## Special Cases & Edge Conditions

- **Single-precision reciprocal estimate.** PowerISA specifies a *low-precision* approximation of `1/FRB` accurate to roughly 12–14 bits of significand, intended as the seed for a Newton-Raphson refinement step. **xenia quirk:** the frozen snapshot above quantizes the input to `f32`, computes the *full-precision* `1.0 / b`, and rounds to single, so its result is far more accurate than hardware. Title code that relies on the limited precision of `fres` to drive refinement loops still works (the loops refine an already-accurate value), but bit-exact agreement with hardware is impossible.
- **Single precision result.** Final value is rounded to binary32 then re-encoded into the FPR.
- **Divide by zero.** `1/±0` → ±∞ and sets `FPSCR[ZX, FX]`. The frozen snapshot above returns the host ±∞ and raises `ZX` through `fpscr::set_exception`.
- **`fres(±∞) = ±0`** (correctly signed).
- **`fres(NaN) = NaN`**; signalling NaNs are quietened.
- **Overflow / underflow.** May set `OX`/`UX`/`XX`/`FX`. The frozen snapshot above leaves these to `fpscr::update_after_op`; whether that helper sets every one of them is not visible from the snapshot.
- **`Rc=1` (`fres.`)** copies `FPSCR[FX, FEX, VX, OX]` into CR1.
- **Encoding.** A-form, primary 59, XO 24. Reads `FRB` only; `FRA`/`FRC` are don't-care.
- **Use case.** Software reciprocal: `1/d ≈ x = fres(d); x = x*(2 - d*x);`. Each Newton-Raphson step roughly doubles the number of correct bits, so one step takes a 12-bit seed to about 24 bits (full single precision) and a second step reaches about 48 bits, close to but short of binary64's 53-bit significand. The `(2 - d*x)` step compiles to `fnmsub`.
- **Performance.** Cheap on Xenon (single-cycle issue): a divide built from `fres`, 1–2 NR steps, and an `fmul` is far faster than `fdiv`/`fdivs`.
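
The seed-and-refine pattern in the list above can be sketched as follows. The coarse estimate here is only a stand-in: it truncates the exact reciprocal to a chosen number of fraction bits and is not Xenon's lookup table. All three function names are hypothetical, and the sketch assumes a positive, normal `d` and `bits` between 1 and 51.

```rust
/// Stand-in for a low-precision estimate: the exact reciprocal with its
/// fraction truncated to `bits` bits. This is NOT the hardware table.
pub fn coarse_reciprocal(d: f64, bits: u32) -> f64 {
    let keep_mask = !((1u64 << (52 - bits)) - 1);
    f64::from_bits((1.0 / d).to_bits() & keep_mask)
}

/// One Newton-Raphson step: x_new = x * (2 - d*x). The bracketed factor
/// is the value an `fnmsub` with FRA = d, FRC = x, FRB = 2 produces.
pub fn refine(d: f64, x: f64) -> f64 {
    let t = -(d.mul_add(x, -2.0)); // 2 - d*x with one rounding
    x * t
}

/// Relative error |d*x - 1| of a reciprocal estimate, computed fused.
pub fn relative_error(d: f64, x: f64) -> f64 {
    d.mul_add(x, -1.0).abs()
}
```

Printing `relative_error` after each `refine` call for a 12-bit seed shows the error roughly squaring at every step, which is the bit-doubling behaviour the refinement loop relies on.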

## Related Instructions

- [`frsqrtex`](frsqrtex.md) — reciprocal-square-root estimate; same NR refinement approach.
- [`fdivx`](fdivx.md), [`fdivsx`](fdivsx.md) — true divide; alternative when refinement isn't needed.
- [`fnmsubx`](fnmsubx.md), [`fnmsubsx`](fnmsubsx.md) — the workhorse for the `(2 - d*x)` step.
- [`fmulx`](fmulx.md), [`fmulsx`](fmulsx.md) — final multiply to apply the reciprocal.
- [`fmaddsx`](fmaddsx.md) — alternate refinement formulation.

## IBM Reference

- [AIX 7.3 — `fres` (Floating Reciprocal Estimate Single)](https://www.ibm.com/docs/en/aix/7.3.0?topic=set-fres-floating-reciprocal-estimate-single-instruction)
- [PowerISA v2.07B, Book I, Chapter 4 — Floating-Point Processor](https://openpowerfoundation.org/specifications/isa/) (relative-error bound for `fres`; intended Newton-Raphson refinement pattern).
144
migration/project-root/ppc-manual/fpu/frspx.md
Normal file
@@ -0,0 +1,144 @@
# `frspx` — Floating Round to Single

> **Category:** [Floating-Point](../categories/fpu.md) · **Form:** [X](../forms/X.md) · **Opcode:** `0xfc000018`

<!-- GENERATED: BEGIN -->

## Assembler Mnemonics

| Mnemonic | XML entry | Flags | Description |
| --- | --- | --- | --- |
| `frsp` | `frspx` | — | Floating Round to Single |
| `frsp.` | `frspx` | Rc=1 | Floating Round to Single |

## Syntax

```asm
frsp[Rc] [FD], [FB]
```

## Encoding

### `frspx` — form `X`

- **Opcode word:** `0xfc000018`
- **Primary opcode (bits 0–5):** `63`
- **Extended opcode:** `12`
- **Synchronising:** no

| Bits | Field | Meaning |
| --- | --- | --- |
| 0–5 | `OPCD` | primary opcode |
| 6–10 | `RT/FRT/VRT` | destination |
| 11–15 | `RA/FRA/VRA` | source A |
| 16–20 | `RB/FRB/VRB` | source B |
| 21–30 | `XO` | extended opcode (10 bits) |
| 31 | `Rc` | record-form flag |

## Operands

| Field | Role | Description |
| --- | --- | --- |
| `FB` | frspx: read | Source B floating-point register. |
| `FD` | frspx: write | Destination floating-point register. |
| `CR` | frspx: write (conditional) | Condition-register update. When `Rc=1`, CR1 is updated from `FPSCR[FX, FEX, VX, OX]`. |
| `FPSCR` | frspx: write | Floating-Point Status and Control Register. |

## Register Effects

### `frspx`

- **Reads (always):** `FB`
- **Reads (conditional):** _none_
- **Writes (always):** `FD`, `FPSCR`
- **Writes (conditional):** `CR`

## Status-Register Effects

- `frspx`: **CR1** ← FPSCR[FX, FEX, VX, OX] when `Rc=1`; **FPSCR** updated per IEEE-754 flags (FX, FEX, FPRF, FR, FI, exceptions).

## Operation (pseudocode)

```
; Round the double-precision source to single precision using FPSCR[RN].
; Consult the IBM AIX reference link under IBM Reference for
; canonical PPC-style pseudocode.
FRT <- RoundToSingle(FRB)
```

## C Translation Example

```c
/* C translation: the xenia-rs interpreter arm below in */
/* Implementation References is the authoritative semantic */
/* snapshot. Translate it line-by-line: */
/* - ctx.gpr[N] -> r[N] (or f[]/v[] for FPRs/VRs) */
/* - mem.read_u* / write_u* -> mem_read_u*_be / mem_write_u*_be */
/* - ctx.update_cr_signed(fld, v) -> update_cr_signed(fld, v) */
/* - ctx.xer_ca / xer_ov / xer_so -> xer.CA / xer.OV / xer.SO */
/* The Register Effects and Status-Register Effects tables above */
/* enumerate every side effect a faithful translation must emit. */
```

## Implementation References

**`frspx`**
- xenia-canary XML: [`tools/ppc-instructions.xml` — search for `mnem="frspx"`](../../xenia-canary/tools/ppc-instructions.xml)
- xenia-canary emit: [`src/xenia/cpu/ppc/ppc_emit_fpu.cc:318`](../../xenia-canary/src/xenia/cpu/ppc/ppc_emit_fpu.cc#L318)
- xenia-rs opcode: [`crates/xenia-cpu/src/opcode.rs:29`](../../xenia-rs/crates/xenia-cpu/src/opcode.rs#L29)
- xenia-rs decoder: [`crates/xenia-cpu/src/decoder.rs:898`](../../xenia-rs/crates/xenia-cpu/src/decoder.rs#L898)
- xenia-rs interpreter: [`crates/xenia-cpu/src/interpreter.rs:2856-2871`](../../xenia-rs/crates/xenia-cpu/src/interpreter.rs#L2856-L2871)
<details><summary>xenia-rs interpreter body (frozen snapshot)</summary>

```rust
PpcOpcode::frspx => {
    // Round to single precision honouring FPSCR[RN].
    // PPCBUG-225: set XX on inexact rounding (almost every frsp call).
    let b = ctx.fpr[instr.rb()];
    if fpscr::is_snan(b) {
        fpscr::set_exception(ctx, fpscr::VXSNAN);
    }
    let result = to_single(ctx, b);
    if b.is_finite() && result.is_finite() && result != b {
        fpscr::set_exception(ctx, fpscr::XX);
    }
    ctx.fpr[instr.rd()] = result;
    fpscr::update_after_op(ctx, result, b.is_finite());
    if instr.rc_bit() { update_cr1_from_fpscr(ctx); }
    ctx.pc += 4;
}
```
</details>

<!-- GENERATED: END -->

## Special Cases & Edge Conditions

- **Round to single-precision.** Rounds the binary64 value in `FRB` to binary32 using `FPSCR[RN]`, then re-encodes the result back into the destination as a binary64 representation of that single value. xenia-rs uses `to_single(b)`, which performs `f64 → f32 → f64` round-trip (Rust's `as f32` uses round-to-nearest-even, matching the PPC default).
|
||||
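- **`FPSCR[RN]` handling in xenia.** A plain `as f32` cast always rounds to nearest-even regardless of `FPSCR[RN]`, which is a **xenia quirk** for non-default rounding modes. The frozen snapshot above passes `ctx` to `to_single` and its comment says the rounding honours `FPSCR[RN]`, so check the current `to_single` body before relying on either behaviour.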
|
||||
- **Overflow.** Values whose magnitude exceeds binary32's max (~3.4e38) round to ±∞ and set `FPSCR[OX, XX, FX]`.
|
||||
- **Underflow.** Values whose magnitude is below binary32's smallest normal (~1.2e-38) flush to zero or denormal per `FPSCR[NI]`; `UX`/`XX`/`FX` set on hardware. xenia uses host IEEE.
|
||||
- **NaN propagation.** Quiet NaNs pass through; signalling NaNs are quietened (the most-significant fraction bit is set) and raise `VXSNAN`, which the snapshot sets after testing `fpscr::is_snan(b)`. A host `as f32` cast is not guaranteed to reproduce PPC's NaN payload bit-for-bit; **xenia quirk** for code that inspects NaN bits.
|
||||
- **Inexact.** Most roundings are inexact and set `FPSCR[XX, FX]`. The frozen snapshot sets `XX` when a finite input rounds to a different finite value (PPCBUG-225) and then calls `fpscr::update_after_op`.
|
||||
- **`Rc=1` (`frsp.`)** copies `FPSCR[FX, FEX, VX, OX]` into CR1.
|
||||
- **Encoding.** X-form, primary 63, XO 12. Reads `FRB` only.
|
||||
- **Use case.** Compilers emit `frsp` after a chain of `fadd`/`fmul`/etc. when storing the value with `stfs` (store single). Without an explicit `frsp`, the in-FPR double would not match the `stfs`-rounded single.
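
The round-trip and the inexact/overflow conditions listed above can be sketched as follows. This is a minimal illustration that assumes round-to-nearest-even only; `RoundResult` and `round_to_single` are hypothetical names, not xenia-rs APIs, and the flag logic mirrors the shape of the condition in the frozen snapshot rather than full FPSCR behaviour.

```rust
/// Outcome of a hypothetical `frsp`-style rounding (illustration only).
#[derive(Debug, PartialEq)]
struct RoundResult {
    value: f64,     // binary32 value re-widened to binary64
    inexact: bool,  // hardware would set FPSCR[XX]
    overflow: bool, // hardware would set FPSCR[OX]
}

/// Round a binary64 value to binary32 and widen it back.
/// Rust's `as f32` rounds to nearest, ties to even, and yields
/// infinity when the magnitude exceeds binary32's range.
fn round_to_single(b: f64) -> RoundResult {
    let value = (b as f32) as f64;
    // Same shape as the snapshot's check: finite in, finite out, value changed.
    let inexact_finite = b.is_finite() && value.is_finite() && value != b;
    // A finite input that becomes infinite overflowed binary32.
    let overflow = b.is_finite() && value.is_infinite();
    RoundResult {
        value,
        inexact: inexact_finite || overflow,
        overflow,
    }
}
```

Calling `round_to_single(0.5)` reports an exact result, while `round_to_single(0.1)` reports inexact because `0.1` has no exact binary32 representation. Non-default rounding modes would need more than a cast and are deliberately left out of this sketch.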
|
||||
|
||||
## Related Instructions
|
||||
|
||||
- [`faddsx`](faddsx.md), [`fsubsx`](fsubsx.md), [`fmulsx`](fmulsx.md), [`fdivsx`](fdivsx.md) — single-precision arithmetic; equivalent to `frsp(double_op(...))`.
|
||||
- [`fmaddsx`](fmaddsx.md), [`fmsubsx`](fmsubsx.md), [`fnmaddsx`](fnmaddsx.md), [`fnmsubsx`](fnmsubsx.md) — single-precision fused FMA family.
|
||||
- `stfs` — store single; expects an FPR already rounded to single via `frsp` or via single-precision arithmetic.
|
||||
- [`fcfidx`](fcfidx.md) — `fcfid` + `frsp` is the standard `i64 → float` conversion.
|
||||
- [`mffsx`](mffsx.md), [`mtfsfx`](mtfsfx.md) — FPSCR rounding-mode control.
|
||||
|
||||
## IBM Reference
|
||||
|
||||
- [AIX 7.3 — `frsp` (Floating Round to Single)](https://www.ibm.com/docs/en/aix/7.3.0?topic=set-frsp-floating-round-single-precision-instruction)
|
||||
- [PowerISA v2.07B, Book I, Chapter 4 — Floating-Point Processor](https://openpowerfoundation.org/specifications/isa/) (single-precision rounding rules; SNaN quietening).
|
||||
147
migration/project-root/ppc-manual/fpu/frsqrtex.md
Normal file
@@ -0,0 +1,147 @@
|
||||
# `frsqrtex` — Floating Reciprocal Square Root Estimate
|
||||
|
||||
> **Category:** [Floating-Point](../categories/fpu.md) · **Form:** [A](../forms/A.md) · **Opcode:** `0xfc000034`
|
||||
|
||||
<!-- GENERATED: BEGIN -->
|
||||
|
||||
## Assembler Mnemonics
|
||||
|
||||
| Mnemonic | XML entry | Flags | Description |
|
||||
| --- | --- | --- | --- |
|
||||
| `frsqrte` | `frsqrtex` | — | Floating Reciprocal Square Root Estimate |
|
||||
| `frsqrte.` | `frsqrtex` | Rc=1 | Floating Reciprocal Square Root Estimate |
|
||||
|
||||
## Syntax
|
||||
|
||||
```asm
|
||||
frsqrte[Rc] [FD], [FB]
|
||||
```
|
||||
|
||||
## Encoding
|
||||
|
||||
### `frsqrtex` — form `A`
|
||||
|
||||
- **Opcode word:** `0xfc000034`
|
||||
- **Primary opcode (bits 0–5):** `63`
|
||||
- **Extended opcode:** `26`
|
||||
- **Synchronising:** no
|
||||
|
||||
| Bits | Field | Meaning |
|
||||
| --- | --- | --- |
|
||||
| 0–5 | `OPCD` | primary opcode (59 or 63) |
|
||||
| 6–10 | `FRT` | destination FPR |
|
||||
| 11–15 | `FRA` | source A FPR |
|
||||
| 16–20 | `FRB` | source B FPR |
|
||||
| 21–25 | `FRC` | source C FPR (multiplier for madd-style ops) |
|
||||
| 26–30 | `XO` | extended opcode (5 bits) |
|
||||
| 31 | `Rc` | record-form flag (updates CR1) |
|
||||
|
||||
## Operands
|
||||
|
||||
| Field | Role | Description |
|
||||
| --- | --- | --- |
|
||||
| `FB` | frsqrtex: read | Source B floating-point register. |
|
||||
| `FD` | frsqrtex: write | Destination floating-point register. |
|
||||
| `CR` | frsqrtex: write (conditional) | Condition-register update. When `Rc=1`, CR field 0 (or CR6 for vector compares, CR1 for FPU) is updated from the result. |
|
||||
| `FPSCR` | frsqrtex: write | Floating-Point Status and Control Register. |
|
||||
|
||||
## Register Effects
|
||||
|
||||
### `frsqrtex`
|
||||
|
||||
- **Reads (always):** `FB`
|
||||
- **Reads (conditional):** _none_
|
||||
- **Writes (always):** `FD`, `FPSCR`
|
||||
- **Writes (conditional):** `CR`
|
||||
|
||||
## Status-Register Effects
|
||||
|
||||
- `frsqrtex`: **CR1** ← FPSCR[FX, FEX, VX, OX] when `Rc=1`; **FPSCR** updated per IEEE-754 flags (FX, FEX, FPRF, FR, FI, exceptions).
|
||||
|
||||
## Operation (pseudocode)
|
||||
|
||||
```
|
||||
; Pseudocode derives directly from the xenia-rs interpreter
|
||||
; arm (see Implementation References). Operation semantics:
|
||||
; - Read source operands from the fields listed under Operands.
|
||||
; - Apply the arithmetic / logical / memory action described
|
||||
; in the Description field above.
|
||||
; - Write results to the destination register(s); update any
|
||||
; status bits enumerated under Status-Register Effects.
|
||||
; Consult the IBM AIX reference link under IBM Reference for
|
||||
; canonical PPC-style pseudocode where xenia's expression is
|
||||
; terse.
|
||||
```
|
||||
|
||||
## C Translation Example
|
||||
|
||||
```c
|
||||
/* C translation: the xenia-rs interpreter arm below in */
|
||||
/* Implementation References is the authoritative semantic */
|
||||
/* snapshot. Translate it line-by-line: */
|
||||
/* - ctx.gpr[N] -> r[N] (or f[]/v[] for FPRs/VRs) */
|
||||
/* - mem.read_uN / write_uN -> mem_read_uN_be / mem_write_uN_be */
|
||||
/* - ctx.update_cr_signed(fld, v) -> update_cr_signed(fld, v) */
|
||||
/* - ctx.xer_ca / xer_ov / xer_so -> xer.CA / xer.OV / xer.SO */
|
||||
/* The Register Effects and Status-Register Effects tables above */
|
||||
/* enumerate every side effect a faithful translation must emit. */
|
||||
```
|
||||
|
||||
## Implementation References
|
||||
|
||||
**`frsqrtex`**
|
||||
- xenia-canary XML: [`tools/ppc-instructions.xml` — search for `mnem="frsqrtex"`](../../xenia-canary/tools/ppc-instructions.xml)
|
||||
- xenia-canary emit: [`src/xenia/cpu/ppc/ppc_emit_fpu.cc:118`](../../xenia-canary/src/xenia/cpu/ppc/ppc_emit_fpu.cc#L118)
|
||||
- xenia-rs opcode: [`crates/xenia-cpu/src/opcode.rs:29`](../../xenia-rs/crates/xenia-cpu/src/opcode.rs#L29)
|
||||
- xenia-rs decoder: [`crates/xenia-cpu/src/decoder.rs:926`](../../xenia-rs/crates/xenia-cpu/src/decoder.rs#L926)
|
||||
- xenia-rs interpreter: [`crates/xenia-cpu/src/interpreter.rs:2836-2853`](../../xenia-rs/crates/xenia-cpu/src/interpreter.rs#L2836-L2853)
|
||||
<details><summary>xenia-rs interpreter body (frozen snapshot)</summary>
|
||||
|
||||
```rust
|
||||
PpcOpcode::frsqrtex => {
|
||||
// Reciprocal square root estimate: frD = 1.0 / sqrt(frB)
|
||||
let b = ctx.fpr[instr.rb()];
|
||||
if b == 0.0 {
|
||||
fpscr::set_exception(ctx, fpscr::ZX);
|
||||
}
|
||||
if b.is_sign_negative() && b != 0.0 && !b.is_nan() {
|
||||
fpscr::set_exception(ctx, fpscr::VXSQRT);
|
||||
}
|
||||
if fpscr::is_snan(b) {
|
||||
fpscr::set_exception(ctx, fpscr::VXSNAN);
|
||||
}
|
||||
let result = 1.0 / b.sqrt();
|
||||
ctx.fpr[instr.rd()] = result;
|
||||
fpscr::update_after_op(ctx, result, b.is_finite() && b > 0.0);
|
||||
if instr.rc_bit() { update_cr1_from_fpscr(ctx); }
|
||||
ctx.pc += 4;
|
||||
}
|
||||
```
|
||||
</details>
|
||||
|
||||
<!-- GENERATED: END -->
|
||||
|
||||
## Special Cases & Edge Conditions
|
||||
|
||||
- **Reciprocal-square-root estimate.** PowerISA: low-precision approximation of `1/sqrt(FRB)` accurate to roughly 12–14 bits, designed as the seed for Newton-Raphson refinement. **xenia quirk:** xenia-rs computes the *full-precision* `1.0 / b.sqrt()` (no rounding to single, since `frsqrte` is double-precision per the spec), so the result is far more accurate than hardware. Title code that refines the estimate still works because a more accurate seed only helps NR converge; code that depends on the exact hardware estimate bits will see different values.
|
||||
- **Double precision result.** Per PowerISA, `frsqrte` returns a binary64 estimate (not a single-rounded value, unlike `fres`).
|
||||
- **Negative input is invalid.** `frsqrte(x < 0)` (other than `-0`) sets `FPSCR[VXSQRT, VX, FX]` and yields a quiet NaN. xenia returns host NaN (Rust's `f64::sqrt` of a negative is NaN, then `1/NaN` is NaN); the frozen snapshot also sets `VXSQRT` through `fpscr::set_exception`.
|
||||
- **`frsqrte(+0) = +∞`** and sets `FPSCR[ZX]` per spec. **`frsqrte(-0) = -∞`**.
|
||||
- **`frsqrte(+∞) = +0`**.
|
||||
- **NaN propagation.** Quiet NaNs propagate to the result; signalling NaNs are quietened and raise `VXSNAN` (set explicitly in the snapshot).
|
||||
- **`Rc=1` (`frsqrte.`)** copies `FPSCR[FX, FEX, VX, OX]` into CR1.
|
||||
- **Encoding.** A-form, primary 63, XO 26. Reads `FRB` only; `FRA`/`FRC` are don't-care.
|
||||
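- **Use case.** The canonical `length`/`normalize` recipe: `inv_len = frsqrte(dot); inv_len = 0.5 * inv_len * (3 - dot * inv_len * inv_len);`. Each NR step roughly doubles the number of correct bits, so one step from a 12–14-bit seed reaches about single precision, and full double precision needs further steps. For a single-precision result, apply `frsp` afterwards.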
|
||||
- **Performance.** Cheap on Xenon. The `length`/`normalize` macro built on `frsqrte` commonly sits in hot inner loops of 3D Xbox 360 titles.
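
The Newton-Raphson recipe from the use-case bullet can be sketched as below, to show how quickly a coarse seed converges. The seed here is an assumption: `coarse_rsqrt` simply truncates the host's exact result to a chosen number of fraction bits as a stand-in for the hardware estimate, and it does not reproduce Xenon's actual lookup table. All function names are hypothetical.

```rust
/// Model a low-precision estimate by keeping only the top `bits` fraction
/// bits of the host's 1/sqrt(x). A stand-in for the hardware estimate,
/// not Xenon's real table. Requires 0 < bits < 52.
fn coarse_rsqrt(x: f64, bits: u32) -> f64 {
    let exact = 1.0 / x.sqrt();
    let mask = !((1u64 << (52 - bits)) - 1);
    f64::from_bits(exact.to_bits() & mask)
}

/// One Newton-Raphson step for y ≈ 1/sqrt(x): y' = 0.5 * y * (3 - x*y*y).
fn nr_step(x: f64, y: f64) -> f64 {
    0.5 * y * (3.0 - x * y * y)
}

/// Relative error of `y` against the host's full-precision 1/sqrt(x).
fn rel_err(x: f64, y: f64) -> f64 {
    let exact = 1.0 / x.sqrt();
    ((y - exact) / exact).abs()
}
```

If the seed has relative error `e`, one step leaves an error of about `1.5 * e * e`. With a 12-bit seed for `x = 2.0`, `rel_err` therefore drops to roughly single-precision level after one `nr_step` and approaches double-precision level only after a second one.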
|
||||
|
||||
## Related Instructions
|
||||
|
||||
- [`fresx`](fresx.md) — reciprocal estimate; same NR-refinement design pattern.
|
||||
- [`fsqrtx`](fsqrtx.md), [`fsqrtsx`](fsqrtsx.md) — full-precision square root (multi-cycle, non-pipelined).
|
||||
- [`fmulx`](fmulx.md), [`fmaddx`](fmaddx.md), [`fnmsubx`](fnmsubx.md) — the multiply/FMA ops that drive NR refinement.
|
||||
- [`frspx`](frspx.md) — round to single after `frsqrte` for graphics-pipeline producers expecting `float`.
|
||||
|
||||
## IBM Reference
|
||||
|
||||
- [AIX 7.3 — `frsqrte` (Floating Reciprocal Square Root Estimate)](https://www.ibm.com/docs/en/aix/7.3.0?topic=set-frsqrte-floating-reciprocal-square-root-estimate-instruction)
|
||||
- [PowerISA v2.07B, Book I, Chapter 4 — Floating-Point Processor](https://openpowerfoundation.org/specifications/isa/) (relative-error bound for `frsqrte`; canonical NR refinement step).
|
||||
143
migration/project-root/ppc-manual/fpu/fselx.md
Normal file
@@ -0,0 +1,143 @@
|
||||
# `fselx` — Floating Select
|
||||
|
||||
> **Category:** [Floating-Point](../categories/fpu.md) · **Form:** [A](../forms/A.md) · **Opcode:** `0xfc00002e`
|
||||
|
||||
<!-- GENERATED: BEGIN -->
|
||||
|
||||
## Assembler Mnemonics
|
||||
|
||||
| Mnemonic | XML entry | Flags | Description |
|
||||
| --- | --- | --- | --- |
|
||||
| `fsel` | `fselx` | — | Floating Select |
|
||||
| `fsel.` | `fselx` | Rc=1 | Floating Select |
|
||||
|
||||
## Syntax
|
||||
|
||||
```asm
|
||||
fsel[Rc] [FD], [FA], [FC], [FB]
|
||||
```
|
||||
|
||||
## Encoding
|
||||
|
||||
### `fselx` — form `A`
|
||||
|
||||
- **Opcode word:** `0xfc00002e`
|
||||
- **Primary opcode (bits 0–5):** `63`
|
||||
- **Extended opcode:** `23`
|
||||
- **Synchronising:** no
|
||||
|
||||
| Bits | Field | Meaning |
|
||||
| --- | --- | --- |
|
||||
| 0–5 | `OPCD` | primary opcode (59 or 63) |
|
||||
| 6–10 | `FRT` | destination FPR |
|
||||
| 11–15 | `FRA` | source A FPR |
|
||||
| 16–20 | `FRB` | source B FPR |
|
||||
| 21–25 | `FRC` | source C FPR (multiplier for madd-style ops) |
|
||||
| 26–30 | `XO` | extended opcode (5 bits) |
|
||||
| 31 | `Rc` | record-form flag (updates CR1) |
|
||||
|
||||
## Operands
|
||||
|
||||
| Field | Role | Description |
|
||||
| --- | --- | --- |
|
||||
| `FA` | fselx: read | Source A floating-point register (`fr0`–`fr31`). |
|
||||
| `FC` | fselx: read | Source C floating-point register (for madd-style ops). |
|
||||
| `FB` | fselx: read | Source B floating-point register. |
|
||||
| `FD` | fselx: write | Destination floating-point register. |
|
||||
| `CR` | fselx: write (conditional) | Condition-register update. When `Rc=1`, CR field 0 (or CR6 for vector compares, CR1 for FPU) is updated from the result. |
|
||||
|
||||
## Register Effects
|
||||
|
||||
### `fselx`
|
||||
|
||||
- **Reads (always):** `FA`, `FC`, `FB`
|
||||
- **Reads (conditional):** _none_
|
||||
- **Writes (always):** `FD`
|
||||
- **Writes (conditional):** `CR`
|
||||
|
||||
## Status-Register Effects
|
||||
|
||||
- `fselx`: **CR1** ← FPSCR[FX, FEX, VX, OX] when `Rc=1`.
|
||||
|
||||
## Operation (pseudocode)
|
||||
|
||||
```
|
||||
; Pseudocode derives directly from the xenia-rs interpreter
|
||||
; arm (see Implementation References). Operation semantics:
|
||||
; - Read source operands from the fields listed under Operands.
|
||||
; - Apply the arithmetic / logical / memory action described
|
||||
; in the Description field above.
|
||||
; - Write results to the destination register(s); update any
|
||||
; status bits enumerated under Status-Register Effects.
|
||||
; Consult the IBM AIX reference link under IBM Reference for
|
||||
; canonical PPC-style pseudocode where xenia's expression is
|
||||
; terse.
|
||||
```
|
||||
|
||||
## C Translation Example
|
||||
|
||||
```c
|
||||
/* C translation: the xenia-rs interpreter arm below in */
|
||||
/* Implementation References is the authoritative semantic */
|
||||
/* snapshot. Translate it line-by-line: */
|
||||
/* - ctx.gpr[N] -> r[N] (or f[]/v[] for FPRs/VRs) */
|
||||
/* - mem.read_uN / write_uN -> mem_read_uN_be / mem_write_uN_be */
|
||||
/* - ctx.update_cr_signed(fld, v) -> update_cr_signed(fld, v) */
|
||||
/* - ctx.xer_ca / xer_ov / xer_so -> xer.CA / xer.OV / xer.SO */
|
||||
/* The Register Effects and Status-Register Effects tables above */
|
||||
/* enumerate every side effect a faithful translation must emit. */
|
||||
```
|
||||
|
||||
## Implementation References
|
||||
|
||||
**`fselx`**
|
||||
- xenia-canary XML: [`tools/ppc-instructions.xml` — search for `mnem="fselx"`](../../xenia-canary/tools/ppc-instructions.xml)
|
||||
- xenia-canary emit: [`src/xenia/cpu/ppc/ppc_emit_fpu.cc:144`](../../xenia-canary/src/xenia/cpu/ppc/ppc_emit_fpu.cc#L144)
|
||||
- xenia-rs opcode: [`crates/xenia-cpu/src/opcode.rs:30`](../../xenia-rs/crates/xenia-cpu/src/opcode.rs#L30)
|
||||
- xenia-rs decoder: [`crates/xenia-cpu/src/decoder.rs:924`](../../xenia-rs/crates/xenia-cpu/src/decoder.rs#L924)
|
||||
- xenia-rs interpreter: [`crates/xenia-cpu/src/interpreter.rs:2774-2783`](../../xenia-rs/crates/xenia-cpu/src/interpreter.rs#L2774-L2783)
|
||||
<details><summary>xenia-rs interpreter body (frozen snapshot)</summary>
|
||||
|
||||
```rust
|
||||
PpcOpcode::fselx => {
|
||||
// frD = if frA >= 0.0 then frC else frB
|
||||
ctx.fpr[instr.rd()] = if ctx.fpr[instr.ra()] >= 0.0 {
|
||||
ctx.fpr[instr.rc()]
|
||||
} else {
|
||||
ctx.fpr[instr.rb()]
|
||||
};
|
||||
if instr.rc_bit() { update_cr1_from_fpscr(ctx); }
|
||||
ctx.pc += 4;
|
||||
}
|
||||
```
|
||||
</details>
|
||||
|
||||
<!-- GENERATED: END -->
|
||||
|
||||
## Special Cases & Edge Conditions
|
||||
|
||||
- **Non-IEEE branch-free select.** PowerPC-specific; not in the IEEE-754 spec. Semantics: `FRT = (FRA >= 0.0) ? FRC : FRB`. Used pervasively in compiled PPC for `min`/`max`/`clamp`/`copysign` without branches. xenia-rs uses Rust's `>=` which matches.
|
||||
- **`-0.0` selects `FRC`.** Per PowerISA, `-0` compares as `>= 0`, so it routes to `FRC` (the "true" branch). xenia's `-0.0 >= 0.0` evaluates true in Rust — semantic match.
|
||||
- **NaN selects `FRB`.** Per PowerISA, NaN does **not** satisfy `>= 0`, so the result is `FRB`. xenia: any comparison with NaN returns false in Rust, so `>= 0` is false → `FRB` selected. Match.
|
||||
- **No FPSCR side effects.** `fsel` does **not** raise `VXSNAN` even on signalling NaN inputs, and does **not** update `FPRF`. It is purely a data-movement op.
|
||||
- **`Rc=1` (`fsel.`)** copies `FPSCR[FX, FEX, VX, OX]` into CR1.
|
||||
- **A-form encoding.** Reads `FRA, FRB, FRC`, writes `FRT`. Assembler order: `fsel FD, FA, FC, FB` (note: `FRC` before `FRB`).
|
||||
- **Common idioms.**
|
||||
- `min(a,b) = fsel(a-b, b, a)`
|
||||
- `max(a,b) = fsel(a-b, a, b)`
|
||||
- `clamp(x, lo, hi) = fsel(x-lo, fsel(hi-x, x, hi), lo)`
|
||||
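- `copysign(x, y) = fsel(y, |x|, -|x|)` (using `fabs`/`fnabs`; unlike IEEE `copysign`, `y = -0.0` yields `|x|` and a NaN `y` yields `-|x|`, because `fsel` tests `>= 0` rather than the sign bit)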
|
||||
- **Optional ISA.** `fsel` is an optional PowerISA instruction; some implementations trap. Xenon implements it natively.
|
||||
- **No precision change.** Bit-pattern selection — no rounding regardless of source precision.
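
The select rule and the idioms listed above can be written out directly. This is a small sketch under the semantics stated in this section (`FRT = (FRA >= 0.0) ? FRC : FRB`); the function names are illustrative and do not come from xenia-rs.

```rust
/// `fsel` semantics: returns `c` when `a >= 0.0`, otherwise `b`.
/// Argument order follows the assembler: fsel FD, FA, FC, FB.
/// A NaN `a` fails the comparison and selects `b`; `-0.0` selects `c`.
fn fsel(a: f64, c: f64, b: f64) -> f64 {
    if a >= 0.0 { c } else { b }
}

/// min(a, b) = fsel(a - b, b, a)
fn min_fsel(a: f64, b: f64) -> f64 {
    fsel(a - b, b, a)
}

/// max(a, b) = fsel(a - b, a, b)
fn max_fsel(a: f64, b: f64) -> f64 {
    fsel(a - b, a, b)
}

/// clamp(x, lo, hi) = fsel(x - lo, fsel(hi - x, x, hi), lo)
fn clamp_fsel(x: f64, lo: f64, hi: f64) -> f64 {
    fsel(x - lo, fsel(hi - x, x, hi), lo)
}
```

Because the key is a subtraction, a NaN operand makes the key NaN and the idiom falls through to the `FRB` slot, so `min_fsel(NaN, 1.0)` returns the NaN. That differs from library `min`/`max` functions that ignore NaN operands.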
|
||||
|
||||
## Related Instructions
|
||||
|
||||
- [`fabsx`](fabsx.md), [`fnegx`](fnegx.md), [`fnabsx`](fnabsx.md) — sign-bit ops; common companions for `fsel`-based copysign/clamp idioms.
|
||||
- [`fsubx`](fsubx.md) — subtract is the standard way to produce the comparison key (`a - b`).
|
||||
- [`fcmpu`](fcmpu.md), [`fcmpo`](fcmpo.md) — IEEE compare into a CR field, followed by a conditional branch; the heavyweight alternative to `fsel`.
|
||||
- [`fmrx`](fmrx.md) — unconditional copy.
|
||||
|
||||
## IBM Reference
|
||||
|
||||
- [AIX 7.3 — `fsel` (Floating Select)](https://www.ibm.com/docs/en/aix/7.3.0?topic=set-fsel-floating-select-instruction)
|
||||
- [PowerISA v2.07B, Book I, Chapter 4 — Floating-Point Processor](https://openpowerfoundation.org/specifications/isa/) (note: `fsel` is non-IEEE and uses the `>= 0` convention, not `> 0`).
|
||||
143
migration/project-root/ppc-manual/fpu/fsqrtsx.md
Normal file
@@ -0,0 +1,143 @@
|
||||
# `fsqrtsx` — Floating Square Root Single
|
||||
|
||||
> **Category:** [Floating-Point](../categories/fpu.md) · **Form:** [A](../forms/A.md) · **Opcode:** `0xec00002c`
|
||||
|
||||
<!-- GENERATED: BEGIN -->
|
||||
|
||||
## Assembler Mnemonics
|
||||
|
||||
| Mnemonic | XML entry | Flags | Description |
|
||||
| --- | --- | --- | --- |
|
||||
| `fsqrts` | `fsqrtsx` | — | Floating Square Root Single |
|
||||
| `fsqrts.` | `fsqrtsx` | Rc=1 | Floating Square Root Single |
|
||||
|
||||
## Syntax
|
||||
|
||||
```asm
|
||||
fsqrts[Rc] [FD], [FB]
|
||||
```
|
||||
|
||||
## Encoding
|
||||
|
||||
### `fsqrtsx` — form `A`
|
||||
|
||||
- **Opcode word:** `0xec00002c`
|
||||
- **Primary opcode (bits 0–5):** `59`
|
||||
- **Extended opcode:** `22`
|
||||
- **Synchronising:** no
|
||||
|
||||
| Bits | Field | Meaning |
|
||||
| --- | --- | --- |
|
||||
| 0–5 | `OPCD` | primary opcode (59 or 63) |
|
||||
| 6–10 | `FRT` | destination FPR |
|
||||
| 11–15 | `FRA` | source A FPR |
|
||||
| 16–20 | `FRB` | source B FPR |
|
||||
| 21–25 | `FRC` | source C FPR (multiplier for madd-style ops) |
|
||||
| 26–30 | `XO` | extended opcode (5 bits) |
|
||||
| 31 | `Rc` | record-form flag (updates CR1) |
|
||||
|
||||
## Operands
|
||||
|
||||
| Field | Role | Description |
|
||||
| --- | --- | --- |
|
||||
| `FB` | fsqrtsx: read | Source B floating-point register. |
|
||||
| `FD` | fsqrtsx: write | Destination floating-point register. |
|
||||
| `CR` | fsqrtsx: write (conditional) | Condition-register update. When `Rc=1`, CR field 0 (or CR6 for vector compares, CR1 for FPU) is updated from the result. |
|
||||
| `FPSCR` | fsqrtsx: write | Floating-Point Status and Control Register. |
|
||||
|
||||
## Register Effects
|
||||
|
||||
### `fsqrtsx`
|
||||
|
||||
- **Reads (always):** `FB`
|
||||
- **Reads (conditional):** _none_
|
||||
- **Writes (always):** `FD`, `FPSCR`
|
||||
- **Writes (conditional):** `CR`
|
||||
|
||||
## Status-Register Effects
|
||||
|
||||
- `fsqrtsx`: **CR1** ← FPSCR[FX, FEX, VX, OX] when `Rc=1`; **FPSCR** updated per IEEE-754 flags (FX, FEX, FPRF, FR, FI, exceptions).
|
||||
|
||||
## Operation (pseudocode)
|
||||
|
||||
```
|
||||
; Pseudocode derives directly from the xenia-rs interpreter
|
||||
; arm (see Implementation References). Operation semantics:
|
||||
; - Read source operands from the fields listed under Operands.
|
||||
; - Apply the arithmetic / logical / memory action described
|
||||
; in the Description field above.
|
||||
; - Write results to the destination register(s); update any
|
||||
; status bits enumerated under Status-Register Effects.
|
||||
; Consult the IBM AIX reference link under IBM Reference for
|
||||
; canonical PPC-style pseudocode where xenia's expression is
|
||||
; terse.
|
||||
```
|
||||
|
||||
## C Translation Example
|
||||
|
||||
```c
|
||||
/* C translation: the xenia-rs interpreter arm below in */
|
||||
/* Implementation References is the authoritative semantic */
|
||||
/* snapshot. Translate it line-by-line: */
|
||||
/* - ctx.gpr[N] -> r[N] (or f[]/v[] for FPRs/VRs) */
|
||||
/* - mem.read_uN / write_uN -> mem_read_uN_be / mem_write_uN_be */
|
||||
/* - ctx.update_cr_signed(fld, v) -> update_cr_signed(fld, v) */
|
||||
/* - ctx.xer_ca / xer_ov / xer_so -> xer.CA / xer.OV / xer.SO */
|
||||
/* The Register Effects and Status-Register Effects tables above */
|
||||
/* enumerate every side effect a faithful translation must emit. */
|
||||
```
|
||||
|
||||
## Implementation References
|
||||
|
||||
**`fsqrtsx`**
|
||||
- xenia-canary XML: [`tools/ppc-instructions.xml` — search for `mnem="fsqrtsx"`](../../xenia-canary/tools/ppc-instructions.xml)
|
||||
- xenia-canary emit: [`src/xenia/cpu/ppc/ppc_emit_fpu.cc:168`](../../xenia-canary/src/xenia/cpu/ppc/ppc_emit_fpu.cc#L168)
|
||||
- xenia-rs opcode: [`crates/xenia-cpu/src/opcode.rs:30`](../../xenia-rs/crates/xenia-cpu/src/opcode.rs#L30)
|
||||
- xenia-rs decoder: [`crates/xenia-cpu/src/decoder.rs:389`](../../xenia-rs/crates/xenia-cpu/src/decoder.rs#L389)
|
||||
- xenia-rs interpreter: [`crates/xenia-cpu/src/interpreter.rs:2801-2814`](../../xenia-rs/crates/xenia-cpu/src/interpreter.rs#L2801-L2814)
|
||||
<details><summary>xenia-rs interpreter body (frozen snapshot)</summary>
|
||||
|
||||
```rust
|
||||
PpcOpcode::fsqrtsx => {
|
||||
let b = ctx.fpr[instr.rb()];
|
||||
if b.is_sign_negative() && b != 0.0 && !b.is_nan() {
|
||||
fpscr::set_exception(ctx, fpscr::VXSQRT);
|
||||
}
|
||||
if fpscr::is_snan(b) {
|
||||
fpscr::set_exception(ctx, fpscr::VXSNAN);
|
||||
}
|
||||
let result = to_single(ctx, b.sqrt());
|
||||
ctx.fpr[instr.rd()] = result;
|
||||
fpscr::update_after_op(ctx, result, b.is_finite());
|
||||
if instr.rc_bit() { update_cr1_from_fpscr(ctx); }
|
||||
ctx.pc += 4;
|
||||
}
|
||||
```
|
||||
</details>
|
||||
|
||||
<!-- GENERATED: END -->
|
||||
|
||||
## Special Cases & Edge Conditions
|
||||
|
||||
- **Single precision.** Result is rounded to IEEE-754 binary32 then re-encoded into the destination 64-bit FPR. xenia computes `to_single(b.sqrt())`.
|
||||
- **Negative inputs are invalid.** `sqrt(x < 0)` (other than `-0`) sets `FPSCR[VXSQRT, VX, FX]` and yields a quiet NaN. `sqrt(-0) = -0` per IEEE-754.
|
||||
- **`sqrt(+∞) = +∞`**, exact.
|
||||
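- **FPSCR side effects.** Hardware updates `FPRF`, `FR`, `FI`, `FX` plus exception bits `XX` (very common, since sqrt is rarely exact in single precision), `VXSQRT`, `VXSNAN`. The frozen xenia-rs snapshot sets `VXSQRT` and `VXSNAN` explicitly and calls `fpscr::update_after_op`; it shows no explicit `XX` update for inexact results.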
|
||||
- **`Rc=1` (`fsqrts.`)** copies `FPSCR[FX, FEX, VX, OX]` into CR1.
|
||||
- **NaN propagation.** Quiet-NaN result for any NaN operand; signalling NaNs are quietened.
|
||||
- **Performance.** `fsqrts` is a multi-cycle, non-pipelined operation on Xenon. Hot-path code commonly uses `frsqrte` + Newton-Raphson + `fmul`.
|
||||
- **Encoding.** A-form, primary 59, XO 22; reads `FRB` only.
|
||||
- **Rounding mode** uses `FPSCR[RN]` (default nearest-even).
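
The value-level edge cases above can be checked with a short sketch. It assumes round-to-nearest-even and omits all FPSCR handling; `sqrt_single` is a hypothetical helper, not the xenia-rs `to_single`.

```rust
/// Square root computed in binary64, rounded to binary32, then widened
/// back to binary64 for storage in a 64-bit FPR.
/// Round-to-nearest-even only; no FPSCR flags are modelled.
fn sqrt_single(b: f64) -> f64 {
    (b.sqrt() as f32) as f64
}
```

For example, `sqrt_single(-0.0)` keeps the sign of zero, `sqrt_single(f64::INFINITY)` stays infinite, and `sqrt_single(-1.0)` is NaN. `sqrt_single(2.0)` differs from `2.0f64.sqrt()` because the result has been rounded to binary32.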
|
||||
|
||||
## Related Instructions
|
||||
|
||||
- [`fsqrtx`](fsqrtx.md) — double-precision square root.
|
||||
- [`frsqrtex`](frsqrtex.md) — reciprocal-square-root estimate; yields `1/sqrt(x)` directly, or `sqrt(x)` cheaply when multiplied by `x` with `fmuls`.
|
||||
- [`fresx`](fresx.md) — reciprocal estimate; pairs with `fsqrts` to compute `1/sqrt(x)`.
|
||||
- [`fmulsx`](fmulsx.md), [`fmaddsx`](fmaddsx.md) — used in Newton-Raphson refinement of `frsqrte` outputs.
|
||||
- [`frspx`](frspx.md) — explicit double→single rounding.
|
||||
|
||||
## IBM Reference
|
||||
|
||||
- [AIX 7.3 — `fsqrts` (Floating Square Root Single)](https://www.ibm.com/docs/en/aix/7.3.0?topic=set-fsqrts-floating-square-root-single-instruction)
|
||||
- [PowerISA v2.07B, Book I, Chapter 4 — Floating-Point Processor](https://openpowerfoundation.org/specifications/isa/).
|
||||
144
migration/project-root/ppc-manual/fpu/fsqrtx.md
Normal file
@@ -0,0 +1,144 @@
|
||||
# `fsqrtx` — Floating Square Root
|
||||
|
||||
> **Category:** [Floating-Point](../categories/fpu.md) · **Form:** [A](../forms/A.md) · **Opcode:** `0xfc00002c`
|
||||
|
||||
<!-- GENERATED: BEGIN -->
|
||||
|
||||
## Assembler Mnemonics
|
||||
|
||||
| Mnemonic | XML entry | Flags | Description |
|
||||
| --- | --- | --- | --- |
|
||||
| `fsqrt` | `fsqrtx` | — | Floating Square Root |
|
||||
| `fsqrt.` | `fsqrtx` | Rc=1 | Floating Square Root |
|
||||
|
||||
## Syntax
|
||||
|
||||
```asm
|
||||
fsqrt[Rc] [FD], [FB]
|
||||
```
|
||||
|
||||
## Encoding
|
||||
|
||||
### `fsqrtx` — form `A`
|
||||
|
||||
- **Opcode word:** `0xfc00002c`
|
||||
- **Primary opcode (bits 0–5):** `63`
|
||||
- **Extended opcode:** `22`
|
||||
- **Synchronising:** no
|
||||
|
||||
| Bits | Field | Meaning |
|
||||
| --- | --- | --- |
|
||||
| 0–5 | `OPCD` | primary opcode (59 or 63) |
|
||||
| 6–10 | `FRT` | destination FPR |
|
||||
| 11–15 | `FRA` | source A FPR |
|
||||
| 16–20 | `FRB` | source B FPR |
|
||||
| 21–25 | `FRC` | source C FPR (multiplier for madd-style ops) |
|
||||
| 26–30 | `XO` | extended opcode (5 bits) |
|
||||
| 31 | `Rc` | record-form flag (updates CR1) |
|
||||
|
||||
## Operands
|
||||
|
||||
| Field | Role | Description |
|
||||
| --- | --- | --- |
|
||||
| `FB` | fsqrtx: read | Source B floating-point register. |
|
||||
| `FD` | fsqrtx: write | Destination floating-point register. |
|
||||
| `CR` | fsqrtx: write (conditional) | Condition-register update. When `Rc=1`, CR field 0 (or CR6 for vector compares, CR1 for FPU) is updated from the result. |
|
||||
| `FPSCR` | fsqrtx: write | Floating-Point Status and Control Register. |
|
||||
|
||||
## Register Effects
|
||||
|
||||
### `fsqrtx`
|
||||
|
||||
- **Reads (always):** `FB`
|
||||
- **Reads (conditional):** _none_
|
||||
- **Writes (always):** `FD`, `FPSCR`
|
||||
- **Writes (conditional):** `CR`
|
||||
|
||||
## Status-Register Effects
|
||||
|
||||
- `fsqrtx`: **CR1** ← FPSCR[FX, FEX, VX, OX] when `Rc=1`; **FPSCR** updated per IEEE-754 flags (FX, FEX, FPRF, FR, FI, exceptions).
|
||||
|
||||
## Operation (pseudocode)
|
||||
|
||||
```
|
||||
; Pseudocode derives directly from the xenia-rs interpreter
|
||||
; arm (see Implementation References). Operation semantics:
|
||||
; - Read source operands from the fields listed under Operands.
|
||||
; - Apply the arithmetic / logical / memory action described
|
||||
; in the Description field above.
|
||||
; - Write results to the destination register(s); update any
|
||||
; status bits enumerated under Status-Register Effects.
|
||||
; Consult the IBM AIX reference link under IBM Reference for
|
||||
; canonical PPC-style pseudocode where xenia's expression is
|
||||
; terse.
|
||||
```
|
||||
|
||||
## C Translation Example
|
||||
|
||||
```c
|
||||
/* C translation: the xenia-rs interpreter arm below in */
|
||||
/* Implementation References is the authoritative semantic */
|
||||
/* snapshot. Translate it line-by-line: */
|
||||
/* - ctx.gpr[N] -> r[N] (or f[]/v[] for FPRs/VRs) */
|
||||
/* - mem.read_uN / write_uN -> mem_read_uN_be / mem_write_uN_be */
|
||||
/* - ctx.update_cr_signed(fld, v) -> update_cr_signed(fld, v) */
|
||||
/* - ctx.xer_ca / xer_ov / xer_so -> xer.CA / xer.OV / xer.SO */
|
||||
/* The Register Effects and Status-Register Effects tables above */
|
||||
/* enumerate every side effect a faithful translation must emit. */
|
||||
```
|
||||
|
||||
## Implementation References
|
||||
|
||||
**`fsqrtx`**
|
||||
- xenia-canary XML: [`tools/ppc-instructions.xml` — search for `mnem="fsqrtx"`](../../xenia-canary/tools/ppc-instructions.xml)
|
||||
- xenia-canary emit: [`src/xenia/cpu/ppc/ppc_emit_fpu.cc:164`](../../xenia-canary/src/xenia/cpu/ppc/ppc_emit_fpu.cc#L164)
|
||||
- xenia-rs opcode: [`crates/xenia-cpu/src/opcode.rs:30`](../../xenia-rs/crates/xenia-cpu/src/opcode.rs#L30)
|
||||
- xenia-rs decoder: [`crates/xenia-cpu/src/decoder.rs:923`](../../xenia-rs/crates/xenia-cpu/src/decoder.rs#L923)
|
||||
- xenia-rs interpreter: [`crates/xenia-cpu/src/interpreter.rs:2786-2800`](../../xenia-rs/crates/xenia-cpu/src/interpreter.rs#L2786-L2800)
|
||||
<details><summary>xenia-rs interpreter body (frozen snapshot)</summary>
|
||||
|
||||
```rust
|
||||
PpcOpcode::fsqrtx => {
|
||||
let b = ctx.fpr[instr.rb()];
|
||||
// sqrt of negative (non-zero) is invalid operation → VXSQRT.
|
||||
if b.is_sign_negative() && b != 0.0 && !b.is_nan() {
|
||||
fpscr::set_exception(ctx, fpscr::VXSQRT);
|
||||
}
|
||||
if fpscr::is_snan(b) {
|
||||
fpscr::set_exception(ctx, fpscr::VXSNAN);
|
||||
}
|
||||
let result = b.sqrt();
|
||||
ctx.fpr[instr.rd()] = result;
|
||||
fpscr::update_after_op(ctx, result, b.is_finite());
|
||||
if instr.rc_bit() { update_cr1_from_fpscr(ctx); }
|
||||
ctx.pc += 4;
|
||||
}
|
||||
```
|
||||
</details>
|
||||
|
||||
<!-- GENERATED: END -->
|
||||
|
||||
## Special Cases & Edge Conditions
|
||||
|
||||
- **Double precision.** Operates on IEEE-754 binary64; [`fsqrtsx`](fsqrtsx.md) is the single-precision sibling. xenia delegates to host `f64::sqrt`.
|
||||
- **Negative inputs are invalid.** `sqrt(x < 0)` (other than `-0`) sets `FPSCR[VXSQRT, VX, FX]` and yields a quiet NaN. Note: `sqrt(-0) = -0` per IEEE-754 (preserves sign of zero) — host `f64::sqrt` matches.
|
||||
- **`sqrt(+∞) = +∞`**, exact.
|
||||
- **FPSCR side effects.** Hardware updates `FPRF`, `FR`, `FI`, `FX` plus exception bits `XX` (inexact, very common since `sqrt` is rarely exact), `VXSQRT`, `VXSNAN`. The frozen xenia-rs snapshot sets `VXSQRT` and `VXSNAN` explicitly and calls `fpscr::update_after_op`; it shows no explicit `XX` update for inexact results.
|
||||
- **`Rc=1` (`fsqrt.`)** copies `FPSCR[FX, FEX, VX, OX]` into CR1.
|
||||
- **NaN propagation.** Quiet-NaN result for any NaN operand; signalling NaNs are quietened.
|
||||
- **Performance / availability.** `fsqrt` is a Power-ISA optional instruction; some implementations trap as illegal-opcode. Xenon implements it natively. xenia-rs supports it directly.
|
||||
- **Encoding.** A-form, primary 63, XO 22; reads `FRB` only — `FRA` and `FRC` are don't-care.
|
||||
- **Rounding mode** uses `FPSCR[RN]`.
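
The invalid-operation rules above reduce to two input tests, sketched here on the raw bit pattern so that signalling NaNs are not altered in transit. `SqrtInvalid`, `is_snan_bits`, and `classify_sqrt_input` are hypothetical names for illustration; the body of the real `fpscr::is_snan` is not shown in the snapshot and may differ.

```rust
/// Invalid-operation causes a square root would report for an input.
/// Hypothetical struct for illustration; not the xenia-rs `fpscr` module.
#[derive(Debug, PartialEq, Default)]
struct SqrtInvalid {
    vxsqrt: bool, // negative, non-zero, non-NaN input
    vxsnan: bool, // signalling NaN input
}

/// A binary64 is a signalling NaN when the exponent is all ones, the
/// fraction is non-zero, and the most-significant fraction bit is clear.
fn is_snan_bits(bits: u64) -> bool {
    let exp_all_ones = ((bits >> 52) & 0x7FF) == 0x7FF;
    let fraction = bits & 0x000F_FFFF_FFFF_FFFF;
    let quiet_bit = bits & 0x0008_0000_0000_0000;
    exp_all_ones && fraction != 0 && quiet_bit == 0
}

/// Classify an input using the same VXSQRT condition as the snapshot.
fn classify_sqrt_input(bits: u64) -> SqrtInvalid {
    let b = f64::from_bits(bits);
    SqrtInvalid {
        vxsqrt: b.is_sign_negative() && b != 0.0 && !b.is_nan(),
        vxsnan: is_snan_bits(bits),
    }
}
```

With this classification, `-1.0` and `-∞` report `vxsqrt`, `-0.0` and quiet NaNs report nothing, and a pattern such as `0x7FF0_0000_0000_0001` reports `vxsnan` only.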
|
||||
|
||||
## Related Instructions
|
||||
|
||||
- [`fsqrtsx`](fsqrtsx.md) — single-precision square root.
|
||||
- [`frsqrtex`](frsqrtex.md) — reciprocal-square-root estimate (`~1/sqrt(x)`); preferred for normalize/length operations.
|
||||
- [`fresx`](fresx.md) — reciprocal estimate; pairs with `fsqrt` for `1/sqrt(x)`.
|
||||
- [`fmulx`](fmulx.md), [`fmaddx`](fmaddx.md) — used in Newton-Raphson refinement of `frsqrte` outputs.
|
||||
- [`mffsx`](mffsx.md), [`mtfsfx`](mtfsfx.md) — FPSCR control.
|
||||
|
||||
## IBM Reference
|
||||
|
||||
- [AIX 7.3 — `fsqrt` (Floating Square Root)](https://www.ibm.com/docs/en/aix/7.3.0?topic=set-fsqrt-floating-square-root-instruction)
|
||||
- [PowerISA v2.07B, Book I, Chapter 4 — Floating-Point Processor](https://openpowerfoundation.org/specifications/isa/) (square-root invalid-operation rules).
|
||||
140
migration/project-root/ppc-manual/fpu/fsubsx.md
Normal file
@@ -0,0 +1,140 @@
|
||||
# `fsubsx` — Floating Subtract Single
|
||||
|
||||
> **Category:** [Floating-Point](../categories/fpu.md) · **Form:** [A](../forms/A.md) · **Opcode:** `0xec000028`
|
||||
|
||||
<!-- GENERATED: BEGIN -->
|
||||
|
||||
## Assembler Mnemonics
|
||||
|
||||
| Mnemonic | XML entry | Flags | Description |
|
||||
| --- | --- | --- | --- |
|
||||
| `fsubs` | `fsubsx` | — | Floating Subtract Single |
|
||||
| `fsubs.` | `fsubsx` | Rc=1 | Floating Subtract Single |
|
||||
|
||||
## Syntax
|
||||
|
||||
```asm
|
||||
fsubs[Rc] [FD], [FA], [FB]
|
||||
```
|
||||
|
||||
## Encoding
|
||||
|
||||
### `fsubsx` — form `A`
|
||||
|
||||
- **Opcode word:** `0xec000028`
|
||||
- **Primary opcode (bits 0–5):** `59`
|
||||
- **Extended opcode:** `20`
|
||||
- **Synchronising:** no
|
||||
|
||||
| Bits | Field | Meaning |
| --- | --- | --- |
| 0–5 | `OPCD` | primary opcode (59 or 63) |
| 6–10 | `FRT` | destination FPR |
| 11–15 | `FRA` | source A FPR |
| 16–20 | `FRB` | source B FPR |
| 21–25 | `FRC` | source C FPR (multiplier for madd-style ops) |
| 26–30 | `XO` | extended opcode (5 bits) |
| 31 | `Rc` | record-form flag (updates CR1) |
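
The field layout in the table can be checked with a small decoder. This is a minimal sketch assuming the instruction is held as a host `u32` with PowerPC bit 0 as the most-significant bit; `AForm`, `bits`, and `decode_a_form` are hypothetical names and are not the xenia-rs decoder API.

```rust
/// Decoded A-form fields (hypothetical struct for illustration).
#[derive(Debug, PartialEq, Eq)]
struct AForm {
    opcd: u32,
    frt: u32,
    fra: u32,
    frb: u32,
    frc: u32,
    xo: u32,
    rc: bool,
}

/// Extract `width` bits starting at PowerPC bit `start` (bit 0 = MSB).
fn bits(word: u32, start: u32, width: u32) -> u32 {
    (word >> (32 - start - width)) & ((1u32 << width) - 1)
}

fn decode_a_form(word: u32) -> AForm {
    AForm {
        opcd: bits(word, 0, 6),    // primary opcode
        frt: bits(word, 6, 5),     // destination FPR
        fra: bits(word, 11, 5),    // source A
        frb: bits(word, 16, 5),    // source B
        frc: bits(word, 21, 5),    // don't-care for subtract
        xo: bits(word, 26, 5),     // extended opcode
        rc: bits(word, 31, 1) == 1, // record-form flag
    }
}
```

Decoding the base opcode word `0xec000028` gives `opcd = 59` and `xo = 20`, matching the values listed above; `0xfc000028` differs only in `opcd = 63`.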
|
||||
|
||||
## Operands
|
||||
|
||||
| Field | Role | Description |
| --- | --- | --- |
| `FA` | fsubsx: read | Source A floating-point register (`fr0`–`fr31`). |
| `FB` | fsubsx: read | Source B floating-point register. |
| `FD` | fsubsx: write | Destination floating-point register. |
| `CR` | fsubsx: write (conditional) | Condition-register update. When `Rc=1`, CR1 is updated from `FPSCR[FX, FEX, VX, OX]`. |
| `FPSCR` | fsubsx: write | Floating-Point Status and Control Register. |
|
||||
|
||||
## Register Effects
|
||||
|
||||
### `fsubsx`
|
||||
|
||||
- **Reads (always):** `FA`, `FB`
|
||||
- **Reads (conditional):** _none_
|
||||
- **Writes (always):** `FD`, `FPSCR`
|
||||
- **Writes (conditional):** `CR`
|
||||
|
||||
## Status-Register Effects
|
||||
|
||||
- `fsubsx`: **CR1** ← FPSCR[FX, FEX, VX, OX] when `Rc=1`; **FPSCR** updated per IEEE-754 flags (FX, FEX, FPRF, FR, FI, exceptions).
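
The `Rc=1` copy can be illustrated with plain bit manipulation. The sketch below assumes both registers are held as packed 32-bit values with PowerPC bit 0 as the most-significant bit, so `FX, FEX, VX, OX` occupy the top nibble of FPSCR and CR1 is the second nibble of CR. `cr1_from_fpscr` is a hypothetical helper; how xenia-rs's `update_cr1_from_fpscr` stores these registers is not shown in this document.

```rust
/// Copy FPSCR[FX, FEX, VX, OX] (FPSCR bits 0..=3) into CR field 1
/// (CR bits 4..=7), leaving every other CR field untouched.
/// Assumes packed u32 registers with PowerPC bit 0 = MSB.
fn cr1_from_fpscr(cr: u32, fpscr: u32) -> u32 {
    let summary = fpscr >> 28; // top nibble: FX, FEX, VX, OX
    (cr & !0x0F00_0000) | (summary << 24)
}
```

For example, an FPSCR with only `FX` and `VX` set (`0xA000_0000`) yields a CR1 nibble of `0b1010`.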
|
||||
|
||||
## Operation (pseudocode)
|
||||
|
||||
```
; Pseudocode derives directly from the xenia-rs interpreter
; arm (see Implementation References). Operation semantics:
;   - Read source operands from the fields listed under Operands.
;   - Apply the arithmetic / logical / memory action described
;     in the Description field above.
;   - Write results to the destination register(s); update any
;     status bits enumerated under Status-Register Effects.
; Consult the IBM AIX reference link under IBM Reference for
; canonical PPC-style pseudocode where xenia's expression is
; terse.
```
|
||||
|
||||
## C Translation Example
|
||||
|
||||
```c
/* C translation: the xenia-rs interpreter arm below in */
/* Implementation References is the authoritative semantic */
/* snapshot. Translate it line-by-line: */
/* - ctx.gpr[N] -> r[N] (or f[]/v[] for FPRs/VRs) */
/* - mem.read_u* / write_u* -> mem_read_u*_be / mem_write_u*_be */
/* - ctx.update_cr_signed(fld, v) -> update_cr_signed(fld, v) */
/* - ctx.xer_ca / xer_ov / xer_so -> xer.CA / xer.OV / xer.SO */
/* The Register Effects and Status-Register Effects tables above */
/* enumerate every side effect a faithful translation must emit. */
```
|
||||
|
||||
## Implementation References
|
||||
|
||||
**`fsubsx`**
|
||||
- xenia-canary XML: [`tools/ppc-instructions.xml` — search for `mnem="fsubsx"`](../../xenia-canary/tools/ppc-instructions.xml)
|
||||
- xenia-canary emit: [`src/xenia/cpu/ppc/ppc_emit_fpu.cc:135`](../../xenia-canary/src/xenia/cpu/ppc/ppc_emit_fpu.cc#L135)
|
||||
- xenia-rs opcode: [`crates/xenia-cpu/src/opcode.rs:30`](../../xenia-rs/crates/xenia-cpu/src/opcode.rs#L30)
|
||||
- xenia-rs decoder: [`crates/xenia-cpu/src/decoder.rs:387`](../../xenia-rs/crates/xenia-cpu/src/decoder.rs#L387)
|
||||
- xenia-rs interpreter: [`crates/xenia-cpu/src/interpreter.rs:2585-2594`](../../xenia-rs/crates/xenia-cpu/src/interpreter.rs#L2585-L2594)
|
||||
<details><summary>xenia-rs interpreter body (frozen snapshot)</summary>
|
||||
|
||||
```rust
PpcOpcode::fsubsx => {
    let a = ctx.fpr[instr.ra()];
    let b = ctx.fpr[instr.rb()];
    fpscr::check_invalid_add(ctx, a, b, true);
    let result = to_single(ctx, a - b);
    ctx.fpr[instr.rd()] = result;
    fpscr::update_after_op(ctx, result, a.is_finite() && b.is_finite());
    if instr.rc_bit() { update_cr1_from_fpscr(ctx); }
    ctx.pc += 4;
}
```
|
||||
</details>
|
||||
|
||||
<!-- GENERATED: END -->
|
||||
|
||||
## Special Cases & Edge Conditions
|
||||
|
||||
- **Single precision.** The result is rounded to IEEE-754 binary32 and then stored in the 64-bit destination FPR in binary64 format. xenia-rs uses `to_single(ctx, a - b)`, which performs the round trip `f64 -> f32 -> f64`.
- **`∞ − ∞`** (infinities of the same sign) sets `FPSCR[VXISI, VX, FX]` and yields a quiet NaN; `(+∞) − (−∞)` is valid and returns `+∞`.
- **FPSCR side effects.** Always updated on hardware: `FPRF`, `FR`, `FI`, `FX`, plus exception bits `OX`, `UX`, `XX`, `VXISI`, `VXSNAN`. The frozen interpreter snapshot above calls `fpscr::check_invalid_add` before the subtract and `fpscr::update_after_op` after it, so xenia-rs does update FPSCR on this path; which of these bits the helpers maintain is not visible in the snapshot.
- **`Rc=1` (`fsubs.`)** copies `FPSCR[FX, FEX, VX, OX]` into CR1.
- **NaN propagation.** Quiet-NaN result for any NaN operand; signalling NaNs are quietened.
- **Single-precision overflow.** A difference whose magnitude exceeds the binary32 range after rounding returns ±∞ (in round-to-nearest mode with the overflow exception disabled) and sets `OX`/`XX`/`FX`.
- **Denormal flush.** Xenon boots with `FPSCR[NI]=1`; hardware flushes single-precision denormals to zero. xenia inherits host IEEE semantics.
- **Rounding mode** is taken from `FPSCR[RN]`; default is nearest-even.
- **Encoding.** A-form, primary 59, XO 20. `FRC` is don't-care.
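
The value-level behaviour in the bullets above (binary32 rounding, `∞ − ∞`, overflow to infinity, denormal flush) can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the xenia-rs `to_single` helper, whose body is not shown here: `fsubs_value` and `fsubs_value_ni` are hypothetical names, the host is assumed to be in round-to-nearest mode (so `FPSCR[RN]` is ignored), no FPSCR bits are tracked, and keeping the sign of a flushed result is an assumption.

```rust
/// Value of `fsubs` under host round-to-nearest: subtract in f64, round to
/// binary32, then widen back to binary64 (the f64 -> f32 -> f64 round trip).
/// Hypothetical helper; FPSCR is not modelled.
fn fsubs_value(a: f64, b: f64) -> f64 {
    ((a - b) as f32) as f64
}

/// Same, but flushing binary32 subnormal results to a signed zero, as a
/// rough model of hardware with FPSCR[NI]=1 (sign handling is an assumption).
fn fsubs_value_ni(a: f64, b: f64) -> f64 {
    let single = (a - b) as f32;
    if single.is_subnormal() {
        0.0_f32.copysign(single) as f64
    } else {
        single as f64
    }
}
```

Comparing the two functions on a tiny input such as `1e-40` shows the difference the denormal-flush bullet describes: the IEEE path keeps a nonzero subnormal, while the flush path returns zero.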
|
||||
|
||||
## Related Instructions
|
||||
|
||||
- [`fsubx`](fsubx.md) — double-precision sibling.
- [`faddsx`](faddsx.md), [`fmulsx`](fmulsx.md), [`fdivsx`](fdivsx.md) — companion single-precision ops.
- [`fmsubsx`](fmsubsx.md), [`fnmsubsx`](fnmsubsx.md) — fused single-precision multiply-subtract.
- [`fnegx`](fnegx.md) — sign flip used to express subtract as add-of-negation.
- [`frspx`](frspx.md) — explicit double→single rounding; `fsubs` behaves like `fsub` followed by `frsp`, which mirrors how the xenia-rs arm computes it (`to_single(ctx, a - b)`).
|
||||
|
||||
## IBM Reference
|
||||
|
||||
- [AIX 7.3 — `fsubs` (Floating Subtract Single)](https://www.ibm.com/docs/en/aix/7.3.0?topic=set-fsubs-floating-subtract-single-instruction)
|
||||
- [PowerISA v2.07B, Book I, Chapter 4 — Floating-Point Processor](https://openpowerfoundation.org/specifications/isa/).
|
||||
130
migration/project-root/ppc-manual/fpu/fsubx.md
Normal file
@@ -0,0 +1,130 @@
|
||||
# `fsubx` — Floating Subtract
|
||||
|
||||
> **Category:** [Floating-Point](../categories/fpu.md) · **Form:** [A](../forms/A.md) · **Opcode:** `0xfc000028`
|
||||
|
||||
<!-- GENERATED: BEGIN -->
|
||||
|
||||
## Assembler Mnemonics
|
||||
|
||||
| Mnemonic | XML entry | Flags | Description |
| --- | --- | --- | --- |
| `fsub` | `fsubx` | — | Floating Subtract |
| `fsub.` | `fsubx` | Rc=1 | Floating Subtract |
|
||||
|
||||
## Syntax
|
||||
|
||||
```asm
fsub[Rc] [FD], [FA], [FB]
```
|
||||
|
||||
## Encoding
|
||||
|
||||
### `fsubx` — form `A`
|
||||
|
||||
- **Opcode word:** `0xfc000028`
|
||||
- **Primary opcode (bits 0–5):** `63`
|
||||
- **Extended opcode:** `20`
|
||||
- **Synchronising:** no
|
||||
|
||||
| Bits | Field | Meaning |
| --- | --- | --- |
| 0–5 | `OPCD` | primary opcode (59 or 63) |
| 6–10 | `FRT` | destination FPR |
| 11–15 | `FRA` | source A FPR |
| 16–20 | `FRB` | source B FPR |
| 21–25 | `FRC` | source C FPR (multiplier for madd-style ops) |
| 26–30 | `XO` | extended opcode (5 bits) |
| 31 | `Rc` | record-form flag (updates CR1) |
|
||||
|
||||
## Operands
|
||||
|
||||
| Field | Role | Description |
| --- | --- | --- |
| `FA` | fsubx: read | Source A floating-point register (`fr0`–`fr31`). |
| `FB` | fsubx: read | Source B floating-point register. |
| `FD` | fsubx: write | Destination floating-point register. |
| `CR` | fsubx: write (conditional) | Condition-register update. When `Rc=1`, CR1 is updated from `FPSCR[FX, FEX, VX, OX]`. |
| `FPSCR` | fsubx: write | Floating-Point Status and Control Register. |
|
||||
|
||||
## Register Effects
|
||||
|
||||
### `fsubx`
|
||||
|
||||
- **Reads (always):** `FA`, `FB`
|
||||
- **Reads (conditional):** _none_
|
||||
- **Writes (always):** `FD`, `FPSCR`
|
||||
- **Writes (conditional):** `CR`
|
||||
|
||||
## Status-Register Effects
|
||||
|
||||
- `fsubx`: **CR1** ← FPSCR[FX, FEX, VX, OX] when `Rc=1`; **FPSCR** updated per IEEE-754 flags (FX, FEX, FPRF, FR, FI, exceptions).
|
||||
|
||||
## Operation (pseudocode)
|
||||
|
||||
```
FRT <- FRA − FRB
```
|
||||
|
||||
## C Translation Example
|
||||
|
||||
```c
/* C translation: the xenia-rs interpreter arm below in */
/* Implementation References is the authoritative semantic */
/* snapshot. Translate it line-by-line: */
/* - ctx.gpr[N] -> r[N] (or f[]/v[] for FPRs/VRs) */
/* - mem.read_u* / write_u* -> mem_read_u*_be / mem_write_u*_be */
/* - ctx.update_cr_signed(fld, v) -> update_cr_signed(fld, v) */
/* - ctx.xer_ca / xer_ov / xer_so -> xer.CA / xer.OV / xer.SO */
/* The Register Effects and Status-Register Effects tables above */
/* enumerate every side effect a faithful translation must emit. */
```
|
||||
|
||||
## Implementation References
|
||||
|
||||
**`fsubx`**
|
||||
- xenia-canary XML: [`tools/ppc-instructions.xml` — search for `mnem="fsubx"`](../../xenia-canary/tools/ppc-instructions.xml)
|
||||
- xenia-canary emit: [`src/xenia/cpu/ppc/ppc_emit_fpu.cc:127`](../../xenia-canary/src/xenia/cpu/ppc/ppc_emit_fpu.cc#L127)
|
||||
- xenia-rs opcode: [`crates/xenia-cpu/src/opcode.rs:30`](../../xenia-rs/crates/xenia-cpu/src/opcode.rs#L30)
|
||||
- xenia-rs decoder: [`crates/xenia-cpu/src/decoder.rs:921`](../../xenia-rs/crates/xenia-cpu/src/decoder.rs#L921)
|
||||
- xenia-rs interpreter: [`crates/xenia-cpu/src/interpreter.rs:2575-2584`](../../xenia-rs/crates/xenia-cpu/src/interpreter.rs#L2575-L2584)
|
||||
<details><summary>xenia-rs interpreter body (frozen snapshot)</summary>
|
||||
|
||||
```rust
PpcOpcode::fsubx => {
    let a = ctx.fpr[instr.ra()];
    let b = ctx.fpr[instr.rb()];
    fpscr::check_invalid_add(ctx, a, b, true);
    let result = a - b;
    ctx.fpr[instr.rd()] = result;
    fpscr::update_after_op(ctx, result, a.is_finite() && b.is_finite());
    if instr.rc_bit() { update_cr1_from_fpscr(ctx); }
    ctx.pc += 4;
}
```
|
||||
</details>
|
||||
|
||||
<!-- GENERATED: END -->
|
||||
|
||||
## Special Cases & Edge Conditions
|
||||
|
||||
- **Double precision.** `fsub` operates on IEEE-754 binary64. The single-precision sibling is [`fsubsx`](fsubsx.md), which rounds the result to binary32 before re-encoding it into the 64-bit FPR.
- **`∞ − ∞` is the canonical invalid case.** Subtracting same-signed infinities (equivalently, adding opposite-signed ones) yields a quiet NaN and sets `FPSCR[VXISI, VX, FX]`; `(+∞) − (−∞)` is valid and returns `+∞`.
- **FPSCR side effects.** Hardware updates `FPRF`, `FR`, `FI`, `FX` plus exception bits `OX`, `UX`, `XX`, `VXISI`, `VXSNAN` as appropriate. The frozen interpreter snapshot above calls `fpscr::check_invalid_add` before the subtract and `fpscr::update_after_op` after it, so xenia-rs does update FPSCR on this path; which of these bits the helpers maintain is not visible in the snapshot.
- **`Rc=1` (`fsub.`)** writes `CR1` from `FPSCR[FX, FEX, VX, OX]`.
- **NaN propagation.** Any NaN operand yields a quiet NaN; a signalling NaN input is quietened (its most-significant fraction bit is set) per PowerISA. xenia-rs relies on the host `f64` subtraction for the result value.
- **Sign of zero.** `+0 − +0 = +0` in round-to-nearest, `−0` in round-toward-negative-infinity. xenia inherits host semantics.
- **Denormal flush.** Xenon boots with `FPSCR[NI]=1` (non-IEEE mode) so subnormal results flush to zero on hardware. Xenia produces IEEE-compliant denormals from the host FPU; titles relying on flush-to-zero typically see no observable difference for game logic but may see subtle differences in audio DSP.
- **Encoding.** A-form, primary 63, XO 20. `FRC` is don't-care for sub.
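
The invalid-operation and NaN cases in this list come down to a few bit-level predicates. The sketch below is a hedged illustration, not the body of `fpscr::check_invalid_add` (which this document does not show): `is_signalling_nan`, `quieten`, and `is_inf_minus_inf` are hypothetical helpers, and the binary64 quiet bit is taken to be fraction bit 51.

```rust
/// Most-significant fraction bit of binary64; set for quiet NaNs.
const QUIET_BIT: u64 = 1 << 51;

/// True for a NaN whose quiet bit is clear (would raise VXSNAN on hardware).
fn is_signalling_nan(x: f64) -> bool {
    x.is_nan() && (x.to_bits() & QUIET_BIT) == 0
}

/// Quieten a NaN by setting the quiet bit; other values pass through.
fn quieten(x: f64) -> f64 {
    if x.is_nan() {
        f64::from_bits(x.to_bits() | QUIET_BIT)
    } else {
        x
    }
}

/// True when `a - b` is the invalid same-signed-infinity case (VXISI).
fn is_inf_minus_inf(a: f64, b: f64) -> bool {
    a.is_infinite() && b.is_infinite() && a.is_sign_positive() == b.is_sign_positive()
}
```

A faithful FPSCR model would run checks like these before the host subtraction, since the host result alone does not reveal whether an input NaN was signalling.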
|
||||
|
||||
## Related Instructions
|
||||
|
||||
- [`fsubsx`](fsubsx.md) — single-precision subtract (rounds to binary32).
|
||||
- [`faddx`](faddx.md), [`faddsx`](faddsx.md) — add counterparts; subtract is implemented as add-with-negated-B on most cores.
|
||||
- [`fnegx`](fnegx.md) — sign flip (the bit-pattern operation behind `−FRB`).
|
||||
- [`fmsubx`](fmsubx.md), [`fnmsubx`](fnmsubx.md) — fused multiply-subtract (single rounding step).
|
||||
- [`mffsx`](mffsx.md), [`mtfsfx`](mtfsfx.md), [`mtfsb0x`](mtfsb0x.md), [`mtfsb1x`](mtfsb1x.md) — FPSCR control.
|
||||
|
||||
## IBM Reference
|
||||
|
||||
- [AIX 7.3 — `fsub` (Floating Subtract)](https://www.ibm.com/docs/en/aix/7.3.0?topic=set-fs-fsub-floating-subtract-instruction)
|
||||
- [PowerISA v2.07B, Book I, Chapter 4 — Floating-Point Processor](https://openpowerfoundation.org/specifications/isa/).
|
||||