xenia-rs/migration/project-root/ppc-manual/fpu/frsqrtex.md
MechaCat02 e6d43a23ac chore: add migration/ bundle for cross-machine setup
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 21:38:38 +02:00


frsqrtex — Floating Reciprocal Square Root Estimate

Category: Floating-Point · Form: A · Opcode: 0xfc000034

Assembler Mnemonics

| Mnemonic | XML entry | Flags | Description |
|---|---|---|---|
| frsqrte | frsqrtex | | Floating Reciprocal Square Root Estimate |
| frsqrte. | frsqrtex | Rc=1 | Floating Reciprocal Square Root Estimate |

Syntax

frsqrte[Rc] [FD], [FB]

Encoding

frsqrtex — form A

  • Opcode word: 0xfc000034
  • Primary opcode (bits 0–5): 63
  • Extended opcode: 26
  • Synchronising: no

| Bits | Field | Meaning |
|---|---|---|
| 0–5 | OPCD | primary opcode (59 or 63) |
| 6–10 | FRT | destination FPR |
| 11–15 | FRA | source A FPR |
| 16–20 | FRB | source B FPR |
| 21–25 | FRC | source C FPR (multiplier for madd-style ops) |
| 26–30 | XO | extended opcode (5 bits) |
| 31 | Rc | record-form flag (updates CR1) |
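
The field layout above can be sketched as a small decoder. This is a minimal illustration, not the xenia-rs decoder: the `AForm` struct and the `decode_a_form` and `is_frsqrte` helpers are hypothetical names introduced here. It relies only on the bit ranges in the table and on PowerPC numbering bits from the most significant end, so a field ending at bit b sits at shift 31 - b in a `u32`.

```rust
/// Fields of a PowerPC A-form instruction word.
/// Hypothetical helper type for illustration; not part of xenia-rs.
#[derive(Debug, PartialEq, Eq)]
struct AForm {
    opcd: u32, // bits 0-5
    frt: u32,  // bits 6-10
    fra: u32,  // bits 11-15
    frb: u32,  // bits 16-20
    frc: u32,  // bits 21-25
    xo: u32,   // bits 26-30
    rc: bool,  // bit 31
}

/// Split an instruction word into its A-form fields.
/// PowerPC bit 0 is the most significant bit of the word.
fn decode_a_form(word: u32) -> AForm {
    AForm {
        opcd: (word >> 26) & 0x3f,
        frt: (word >> 21) & 0x1f,
        fra: (word >> 16) & 0x1f,
        frb: (word >> 11) & 0x1f,
        frc: (word >> 6) & 0x1f,
        xo: (word >> 1) & 0x1f,
        rc: word & 1 != 0,
    }
}

/// True when the word has primary opcode 63 and extended opcode 26.
fn is_frsqrte(word: u32) -> bool {
    let f = decode_a_form(word);
    f.opcd == 63 && f.xo == 26
}
```

For example, `decode_a_form(0xfc20_1034)` yields FRT = 1 and FRB = 2, which corresponds to `frsqrte f1, f2` under this layout.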

Operands

| Field | Role | Description |
|---|---|---|
| FB | frsqrtex: read | Source B floating-point register. |
| FD | frsqrtex: write | Destination floating-point register. |
| CR | frsqrtex: write (conditional) | Condition-register update. When Rc=1, CR1 is updated from FPSCR[FX, FEX, VX, OX]. |
| FPSCR | frsqrtex: write | Floating-Point Status and Control Register. |

Register Effects

frsqrtex

  • Reads (always): FB
  • Reads (conditional): none
  • Writes (always): FD, FPSCR
  • Writes (conditional): CR

Status-Register Effects

  • frsqrtex: CR1 ← FPSCR[FX, FEX, VX, OX] when Rc=1. FPSCR is updated per IEEE-754 flags (FX, FEX, FPRF, FR, FI, exceptions).
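
The CR1 copy can be sketched as plain bit manipulation. This is a minimal illustration assuming the architectural layout: FX, FEX, VX and OX are FPSCR bits 0–3 (the top nibble of the 32-bit register), and CR1 is CR bits 4–7. The function name `cr1_from_fpscr` and its signature are hypothetical; the snapshot's `update_cr1_from_fpscr(ctx)` operates on the interpreter context instead.

```rust
/// Return `cr` with CR field 1 replaced by FPSCR[FX, FEX, VX, OX].
/// Hypothetical helper; assumes both registers are held as plain u32
/// values with PowerPC bit 0 as the most significant bit.
fn cr1_from_fpscr(cr: u32, fpscr: u32) -> u32 {
    // FPSCR bits 0-3 are the top four bits of the register.
    let nibble = fpscr >> 28;
    // CR1 occupies CR bits 4-7, i.e. mask 0x0f00_0000.
    (cr & !0x0f00_0000) | (nibble << 24)
}
```

Only the four CR1 bits change; the other seven CR fields pass through untouched.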

Operation (pseudocode)

; Pseudocode derives directly from the xenia-rs interpreter
; arm (see Implementation References). Operation semantics:
;   - Read source operands from the fields listed under Operands.
;   - Apply the arithmetic / logical / memory action described
;     in the Description field above.
;   - Write results to the destination register(s); update any
;     status bits enumerated under Status-Register Effects.
; Consult the IBM AIX reference link under IBM Reference for
; canonical PPC-style pseudocode where xenia's expression is
; terse.
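
As a concrete stand-in for the placeholder above, the data path can be sketched as a standalone function that mirrors the conditions in the interpreter snapshot under Implementation References. The `Flags` struct, `is_snan` and `frsqrte_model` are hypothetical names for this sketch; the real code records exceptions through `fpscr::set_exception` on the interpreter context, and the full FPSCR update (FPRF, FR, FI, summary bits) is omitted here.

```rust
/// Exception flags tracked by this sketch.
/// Hypothetical type; not the xenia-rs FPSCR representation.
#[derive(Debug, Default, PartialEq, Eq)]
struct Flags {
    zx: bool,     // zero divide
    vxsqrt: bool, // invalid square root
    vxsnan: bool, // signalling NaN operand
}

/// A NaN is signalling when its top fraction bit (the quiet bit) is clear.
fn is_snan(x: f64) -> bool {
    x.is_nan() && (x.to_bits() & (1u64 << 51)) == 0
}

/// Model of frsqrte as the snapshot computes it: full-precision
/// 1/sqrt(b), plus the three exception conditions it checks.
fn frsqrte_model(b: f64) -> (f64, Flags) {
    let mut flags = Flags::default();
    if b == 0.0 {
        flags.zx = true; // true for both +0 and -0
    }
    if b < 0.0 {
        flags.vxsqrt = true; // includes -inf; excludes -0 and NaN
    }
    if is_snan(b) {
        flags.vxsnan = true;
    }
    (1.0 / b.sqrt(), flags)
}
```

The comparison `b < 0.0` is equivalent to the snapshot's `b.is_sign_negative() && b != 0.0 && !b.is_nan()`, since both -0 and NaN compare false against zero.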

C Translation Example

/* C translation: the xenia-rs interpreter arm below in           */
/* Implementation References is the authoritative semantic        */
/* snapshot. Translate it line-by-line:                            */
/*   - ctx.gpr[N]  -> r[N]       (or f[]/v[] for FPRs/VRs)        */
/*   - mem.read_u*/write_u* -> mem_read_u*_be / mem_write_u*_be   */
/*   - ctx.update_cr_signed(fld, v) -> update_cr_signed(fld, v)   */
/*   - ctx.xer_ca / xer_ov / xer_so -> xer.CA / xer.OV / xer.SO   */
/* The Register Effects and Status-Register Effects tables above  */
/* enumerate every side effect a faithful translation must emit.  */

Implementation References

frsqrtex

xenia-rs interpreter body (frozen snapshot)
        PpcOpcode::frsqrtex => {
            // Reciprocal square root estimate: frD = 1.0 / sqrt(frB)
            let b = ctx.fpr[instr.rb()];
            if b == 0.0 {
                fpscr::set_exception(ctx, fpscr::ZX);
            }
            if b.is_sign_negative() && b != 0.0 && !b.is_nan() {
                fpscr::set_exception(ctx, fpscr::VXSQRT);
            }
            if fpscr::is_snan(b) {
                fpscr::set_exception(ctx, fpscr::VXSNAN);
            }
            let result = 1.0 / b.sqrt();
            ctx.fpr[instr.rd()] = result;
            fpscr::update_after_op(ctx, result, b.is_finite() && b > 0.0);
            if instr.rc_bit() { update_cr1_from_fpscr(ctx); }
            ctx.pc += 4;
        }

Special Cases & Edge Conditions

  • Reciprocal-square-root estimate. PowerISA: low-precision approximation of 1/sqrt(FRB) accurate to roughly 12–14 bits, designed as the seed for Newton-Raphson refinement. xenia quirk: xenia-rs computes the full-precision 1.0 / b.sqrt() (no rounding to single; frsqrte is double-precision per the spec), so the estimate is far more accurate than hardware and not bit-identical to it. Title code that only relies on the estimate as an NR seed still functions: a single NR step reaches single-precision accuracy on either platform.
  • Double precision result. Per PowerISA, frsqrte returns a binary64 estimate (not a single-rounded value, unlike fres).
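  • Negative input is invalid. frsqrte(x < 0) (other than -0) sets FPSCR[VXSQRT, VX, FX] and yields a quiet NaN. xenia-rs returns the host NaN (Rust's f64::sqrt of a negative is NaN, then 1/NaN is NaN). In the frozen snapshot under Implementation References, the arm raises VXSQRT through fpscr::set_exception for negative, non-zero, non-NaN inputs.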
  • frsqrte(+0) = +∞ and sets FPSCR[ZX] per spec. frsqrte(-0) = -∞, also with ZX set.
  • frsqrte(+∞) = +0.
  • NaN propagation. A quiet NaN input propagates to the result as a quiet NaN; a signalling NaN input is quietened and sets FPSCR[VXSNAN].
  • Rc=1 (frsqrte.) copies FPSCR[FX, FEX, VX, OX] into CR1.
  • Encoding. A-form, primary 63, XO 26. Reads FRB only; FRA/FRC are don't-care.
  • Use case. The canonical length/normalize recipe: inv_len = frsqrte(dot); inv_len = 0.5 * inv_len * (3 - dot * inv_len * inv_len);. Each NR step roughly doubles the number of correct bits, so one step takes a 12–14-bit hardware seed to about 24–28 bits, which is enough for single precision; full double precision needs further steps. For single precision use frsp after.
  • Performance. Cheap on Xenon. The length/normalize macro built on frsqrte is a hot inner loop in many 3D Xbox 360 games.

Related Instructions

  • fresx: reciprocal estimate; same NR-refinement design pattern.
  • fsqrtx, fsqrtsx: full-precision square root (multi-cycle, non-pipelined).
  • fmulx, fmaddx, fnmsubx: the multiply/FMA ops that drive NR refinement.
  • frspx: round to single after frsqrte for graphics-pipeline producers expecting float.
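
The Newton-Raphson refinement in the length/normalize recipe can be tried out in isolation. The sketch below is illustrative only: `coarse_rsqrt` is a hypothetical stand-in for a limited-precision estimate, built by truncating the exact value's fraction to a given number of bits. Real hardware derives its estimate differently, and xenia-rs returns the exact value, so treat the seed as an assumption. `nr_step` uses the same expression as the recipe.

```rust
/// One Newton-Raphson step for y ~ 1/sqrt(x), written as in the recipe:
/// y' = 0.5 * y * (3 - x * y * y).
fn nr_step(x: f64, y: f64) -> f64 {
    0.5 * y * (3.0 - x * y * y)
}

/// Hypothetical low-precision seed: the exact 1/sqrt(x) with its
/// fraction truncated to `bits` bits (requires 0 < bits < 52).
/// This only imitates a coarse estimate; it is not the hardware algorithm.
fn coarse_rsqrt(x: f64, bits: u32) -> f64 {
    let exact = 1.0 / x.sqrt();
    let mask = !((1u64 << (52 - bits)) - 1);
    f64::from_bits(exact.to_bits() & mask)
}

/// Relative error of `y` against the host's 1/sqrt(x).
fn rel_err(x: f64, y: f64) -> f64 {
    let exact = 1.0 / x.sqrt();
    ((y - exact) / exact).abs()
}
```

With `x = 2.0` and a 12-bit seed, the relative error after one `nr_step` falls below 2^-22 but is still well above double-precision rounding error; a second step brings it close to that limit. This matches the quadratic convergence of the iteration, where a relative error e becomes roughly 1.5·e² after each step.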

IBM Reference