From 9f88e275b8148928bb47b57ad3e2049fb9d8bf56 Mon Sep 17 00:00:00 2001 From: MechaCat02 Date: Sat, 2 May 2026 12:39:46 +0200 Subject: [PATCH] chore(audit): mark P5 PPCBUGs applied; append P5 progress section P5 phase merged at d39d0ba. Update audit-findings.md status fields (21 PPCBUGs marked applied) and append the P5 progress section to audit-report-2026-04-29.md. Co-Authored-By: Claude Sonnet 4.6 --- audit-findings.md | 42 +++++++++++++++++++------------------- audit-report-2026-04-29.md | 36 ++++++++++++++++++++++++++++++++ 2 files changed, 57 insertions(+), 21 deletions(-) diff --git a/audit-findings.md b/audit-findings.md index 7102fe6..aae7119 100644 --- a/audit-findings.md +++ b/audit-findings.md @@ -1761,7 +1761,7 @@ the ISA-canonical derivation. ### PPCBUG-181 — fmaddsx / fnmaddsx missing VXISI check for add-phase ±∞ collision - **Severity**: MEDIUM -- **Status**: open +- **Status**: applied (P5 d39d0ba, 2026-05-02) - **Locations**: interpreter.rs:2339-2348 (fmaddsx), 2383-2392 (fnmaddsx) - **Symptom**: When `FRA × FRC = +∞` and `FRB = -∞` (or vice versa), PowerISA §4.3.4 requires `FPSCR[VXISI]` to be set and the result to be a QNaN. The double-precision sibling @@ -1779,7 +1779,7 @@ the ISA-canonical derivation. ### PPCBUG-182 — fmsubsx / fnmsubsx missing VXISI check for subtract-phase ±∞ collision - **Severity**: MEDIUM -- **Status**: open +- **Status**: applied (P5 d39d0ba, 2026-05-02) - **Locations**: interpreter.rs:2361-2370 (fmsubsx), 2405-2414 (fnmsubsx) - **Symptom**: When `FRA × FRC = ±∞` and `FRB = ±∞` with the same sign, `(±∞) − (±∞)` should fire `FPSCR[VXISI]`. Neither `fmsubsx` nor `fnmsubsx` calls `check_invalid_add`. @@ -1794,7 +1794,7 @@ the ISA-canonical derivation. 
### PPCBUG-183 — fnmaddsx / fnmsubsx NaN sign bit incorrectly flipped by Rust unary `-` - **Severity**: MEDIUM -- **Status**: open +- **Status**: applied (P5 d39d0ba, 2026-05-02) - **Locations**: interpreter.rs:2388 (fnmaddsx), 2410 (fnmsubsx) - **Symptom**: `to_single(ctx, -(a.mul_add(c, b)))` — Rust's unary `-f64` always flips the IEEE sign bit, including when the value is NaN. PowerISA §4.3.2 specifies that the final @@ -1817,7 +1817,7 @@ the ISA-canonical derivation. ### PPCBUG-184 — fresx produces full-precision IEEE 1/b instead of ~12-bit hardware estimate - **Severity**: HIGH -- **Status**: open +- **Status**: applied (P5 d39d0ba, 2026-05-02) - **Location**: interpreter.rs:2481-2494 - **Symptom**: `fres` on Xenon hardware produces a reciprocal approximation via a 256-entry LUT with linear interpolation, accurate to roughly 1/4096 relative error (~12 mantissa @@ -1916,7 +1916,7 @@ Group 30 summary: **9 findings (PPCBUG-200..208). 2 MEDIUM cross-cutting, 3 MEDI ### PPCBUG-202 — fmaddx: non-FMA `a * c` used in check_invalid_add can spuriously raise/miss VXISI - **Severity**: MEDIUM -- **Status**: open +- **Status**: applied (P5 d39d0ba, 2026-05-02) - **Location**: `interpreter.rs:2332` - **Symptom**: `check_invalid_add(ctx, a * c, b, false)` uses a separate two-rounding multiply to approximate the FMA intermediate product. When the true FMA intermediate is finite but the standalone product overflows to ±∞, VXISI fires spuriously. When the true intermediate is ±∞ but the standalone product is finite (extreme cancellation), VXISI is missed. - **Fix**: Derive VXISI from input-value properties directly: if `(a.is_infinite() || c.is_infinite())` (product is mathematically infinite) and `b.is_infinite()` with opposing sign → VXISI. @@ -1924,7 +1924,7 @@ Group 30 summary: **9 findings (PPCBUG-200..208). 
2 MEDIUM cross-cutting, 3 MEDI ### PPCBUG-203 — fmsubx, fnmaddx, fnmsubx: VXISI never raised for ∞-collision in add/sub step - **Severity**: MEDIUM -- **Status**: open +- **Status**: applied (P5 d39d0ba, 2026-05-02) - **Locations**: `interpreter.rs:2354` (fmsubx), `2376` (fnmaddx), `2398` (fnmsubx) - **Symptom**: Same pattern as PPCBUG-181/182 for the double-precision variants. These three arms call only `check_invalid_mul` and omit `check_invalid_add`. Per ISA, all four FMA variants must raise VXISI when the add step yields ∞+∓∞. Example for fmsub: `A×C = +∞`, `B = +∞` → `+∞ − +∞` → VXISI. Currently the result NaN propagates silently with no FPSCR update. The fnmsub pattern is the canonical Newton-Raphson step — the most common FPU path in Xbox 360 graphics code. - **Fix**: Add `fpscr::check_invalid_add(ctx, a * c, b, true)` for `fmsubx`/`fnmsubx` and `fpscr::check_invalid_add(ctx, a * c, b, false)` for `fnmaddx` (apply PPCBUG-202 sign-fix simultaneously). @@ -1938,7 +1938,7 @@ Group 30 summary: **9 findings (PPCBUG-200..208). 2 MEDIUM cross-cutting, 3 MEDI ### PPCBUG-205 — fnmaddx / fnmsubx: Rust `−` flips NaN sign bit; ISA requires NaN sign preserved - **Severity**: MEDIUM -- **Status**: open +- **Status**: applied (P5 d39d0ba, 2026-05-02) - **Locations**: `interpreter.rs:2377` (fnmaddx), `interpreter.rs:2399` (fnmsubx) - **Symptom**: Same pattern as PPCBUG-183 for the double-precision variants. Rust's unary `-` applied to a NaN result flips the IEEE-754 sign bit. PowerISA Book I §4.3.4 states the negation is not applied to NaN results. Title code using NaN sentinels (audio middleware, debug fills) receives sign-flipped NaN payloads. 
- **Fix**: @@ -2008,7 +2008,7 @@ IDs PPCBUG-232..239 unallocated.** ### PPCBUG-221 — `fctidx` / `round_to_i64` NearestEven tie-breaking uses f64::EPSILON; broken for |v| > 2^52 - **Severity**: HIGH -- **Status**: open +- **Status**: applied (P5 d39d0ba, 2026-05-02) - **Location**: `fpscr.rs:220–238` (`round_to_i64`, `NearestEven` case) - **Symptom**: The tie-breaking code computes `diff = (v - v.trunc()).abs()` and tests `(diff - 0.5).abs() < f64::EPSILON` to detect a half-integer. Above `|v| = 2^52`, @@ -2040,7 +2040,7 @@ IDs PPCBUG-232..239 unallocated.** ### PPCBUG-223 — `fcmpo` omits FPSCR[VXSNAN] and FPSCR[VXVC] on NaN operands - **Severity**: MEDIUM -- **Status**: open +- **Status**: applied (P5 d39d0ba, 2026-05-02) - **Location**: `interpreter.rs:2645–2675` - **Symptom**: `fcmpo` body is identical to `fcmpu` — it sets FPRF and the CR field correctly but calls no `fpscr::set_exception`. PowerISA requires: QNaN → `FPSCR[VXVC, VX, FX]`; @@ -2064,7 +2064,7 @@ IDs PPCBUG-232..239 unallocated.** ### PPCBUG-224 — `fcfidx` does not set FPSCR[XX/FX] for inexact i64→f64 conversion - **Severity**: MEDIUM -- **Status**: open +- **Status**: applied (P5 d39d0ba, 2026-05-02) - **Location**: `interpreter.rs:2528–2536` - **Symptom**: Only FPRF is updated. Per ISA, `fcfid` sets `FPSCR[XX, FX]` (and FR/FI) when the i64 value has more than 53 significant bits and precision is lost. Any i64 with @@ -2075,7 +2075,7 @@ IDs PPCBUG-232..239 unallocated.** ### PPCBUG-225 — `frspx` does not set FPSCR[XX/FX/FR/FI] on inexact rounding - **Severity**: MEDIUM -- **Status**: open +- **Status**: applied (P5 d39d0ba, 2026-05-02) - **Location**: `interpreter.rs:2516–2527` - **Symptom**: `update_after_op` sets OX/UX only. The ISA requires FR/FI/XX/FX on any f64→f32 rounding that is not exact. 
`frsp` is the canonical double→single-precision narrowing idiom @@ -2086,7 +2086,7 @@ IDs PPCBUG-232..239 unallocated.** ### PPCBUG-227 — `fctiwx` rounding: `round_to_i32` inherits NearestEven defect via `round_to_i64` - **Severity**: LOW -- **Status**: open +- **Status**: applied (P5 d39d0ba, 2026-05-02) - **Location**: `fpscr.rs:241–243` - **Symptom**: `round_to_i32` calls `round_to_i64` then clamps. The PPCBUG-221 defect in `round_to_i64` does not manifest for i32-range values (the epsilon check accidentally works @@ -2109,7 +2109,7 @@ IDs PPCBUG-232..239 unallocated.** ### PPCBUG-229 — `fctidx` / `fctidzx` do not set FPSCR[XX/FX] for inexact inputs - **Severity**: LOW -- **Status**: open +- **Status**: applied (P5 d39d0ba, 2026-05-02) - **Locations**: `interpreter.rs:2537–2574` - **Symptom**: Per ISA, float-to-integer conversions set `FPSCR[XX, FX]` when the source value is not an integer (the fractional part is discarded). Neither opcode sets XX. @@ -2118,7 +2118,7 @@ IDs PPCBUG-232..239 unallocated.** ### PPCBUG-230 — `fctiwx` / `fctiwzx` do not set FPSCR[XX/FX] for inexact inputs - **Severity**: LOW -- **Status**: open +- **Status**: applied (P5 d39d0ba, 2026-05-02) - **Locations**: `interpreter.rs:2575–2612` - **Symptom**: Same omission as PPCBUG-229 for the word-width conversion pair. @@ -2624,7 +2624,7 @@ entries in decode_op6. ### PPCBUG-426 — vnmsubfp: two rounding steps instead of fused FMA; NaN sign may be flipped - **Severity**: MEDIUM -- **Status**: open +- **Status**: applied (P5 d39d0ba, 2026-05-02) - **Location**: `interpreter.rs:1786` (`r[i] = bi - ai * ci`) - **Symptom**: `vmaddfp` uses single-rounded `ai.mul_add(ci, bi)`, but `vnmsubfp` uses `bi - ai * ci` (two operations, two rounding steps). ISA specifies a single fused operation. @@ -2635,7 +2635,7 @@ entries in decode_op6. 
### PPCBUG-427 — vnmsubfp128: same two-rounding form as vnmsubfp - **Severity**: MEDIUM -- **Status**: open +- **Status**: applied (P5 d39d0ba, 2026-05-02) - **Location**: `interpreter.rs:1803` (`r[i] = di - ai * bi`) - **Symptom**: Same class as PPCBUG-426 for the VMX128 form. - **Fix**: `r[i] = -ai.mul_add(bi, -di);` @@ -2672,7 +2672,7 @@ entries in decode_op6. ### PPCBUG-432 — vrfin / vrfin128: Rust `round()` is round-half-away-from-zero; ISA requires round-to-nearest-even - **Severity**: MEDIUM -- **Status**: open +- **Status**: applied (P5 d39d0ba, 2026-05-02) - **Location**: `interpreter.rs:2172` (`r[i] = b[i].round()`) - **Symptom**: `vrfin(0.5)` → ISA = 0.0; Rust = 1.0. `vrfin(2.5)` → ISA = 2.0; Rust = 3.0. Canary uses SSE2 `ROUNDPS` which is round-to-nearest-even. @@ -2681,7 +2681,7 @@ entries in decode_op6. ### PPCBUG-433 — vctsxs / vcfpsxws128: NaN input returns 0 instead of saturating to INT_MIN (0x80000000) - **Severity**: MEDIUM -- **Status**: open +- **Status**: applied (P5 d39d0ba, 2026-05-02) - **Location**: `vmx.rs:217` (`if x.is_nan() { return (0, true); }`) - **Symptom**: AltiVec ISA: NaN in vctsxs saturates to INT_MIN (0x80000000). Xenia-rs returns 0. - **Fix**: `if x.is_nan() { return (i32::MIN, true); }` @@ -2696,7 +2696,7 @@ entries in decode_op6. ### PPCBUG-435 — vaddfp / vsubfp / vmulfp128: subnormal inputs not flushed when VSCR.NJ=1 - **Severity**: MEDIUM (latent — Xbox 360 always boots with NJ=1) -- **Status**: open +- **Status**: applied (P5 d39d0ba, 2026-05-02) - **Location**: `interpreter.rs:1713`, `1729`, `1812` - **Symptom**: VSCR.NJ=1 requires flush-to-zero for subnormal inputs. vmaddfp family correctly calls `vmx::flush_denorm()`; plain add/sub/mul do not check VSCR. @@ -2704,7 +2704,7 @@ entries in decode_op6. 
### PPCBUG-436 — vmsum3fp128 / vmsum4fp128: per-product intermediates not individually flushed - **Severity**: MEDIUM (latent) -- **Status**: open +- **Status**: applied (P5 d39d0ba, 2026-05-02) - **Location**: `interpreter.rs:4076`, `4083` - **Symptom**: `flush_denorm` on final sum only. Per-lane products can be subnormal and accumulate before the final flush. @@ -2712,7 +2712,7 @@ entries in decode_op6. ### PPCBUG-437 — vmaddfp / vmaddfp128 / vmaddcfp128 / vnmsubfp128: subnormal output not flushed - **Severity**: MEDIUM (latent) -- **Status**: open +- **Status**: applied (P5 d39d0ba, 2026-05-02) - **Location**: `interpreter.rs:1752–1754`, `1771–1773`, `4064–4067`, `1803–1805` - **Symptom**: VSCR.NJ=1 requires flushing subnormal results. Inputs flushed; outputs are not. diff --git a/audit-report-2026-04-29.md b/audit-report-2026-04-29.md index 734bb40..dfb8d2e 100644 --- a/audit-report-2026-04-29.md +++ b/audit-report-2026-04-29.md @@ -400,6 +400,42 @@ After applying Phase 1 alone, run `xenia-rs check sylpheed.iso -n 4B --parallel` --- +### P5 — FPU correctness (merged 2026-05-02, HEAD d39d0ba) + +**PPCBUGs fixed**: 21 IDs across the 5a-5f sub-sections. 
+- 5a (round-to-int): 221+227 (round_to_i64 NearestEven near 2^52, coupled), 432 (vrfin round-to-even) +- 5b (FMA VXISI + NaN sign): 181, 182 (single fmaddsx/fmsubsx VXISI), 202, 203 (double fmaddx/fmsubx/fnmaddx/fnmsubx VXISI), 183, 205 (NaN sign preservation in fnmaddx/fnmsubx and *sx siblings) +- 5c (XX-on-inexact): 223 (verified already correct), 224 (fcfidx XX), 225 (frspx XX), 229 (fctidx/fctidzx XX), 230 (fctiwx/fctiwzx XX) +- 5d (subnormal flush): 435 (vaddfp/vsubfp/vmulfp128 missing flush), 436 (vmsum3fp128/vmsum4fp128 per-product flush), 437 (vmaddfp family output flush) +- 5e (estimate precision): 184 (fresx canary parity via f32 input quantization) +- 5f (saturation + single-FMA): 426 (vnmsubfp single FMA), 427 (vnmsubfp128 single FMA), 433 (vctsxs NaN→INT_MIN) + +**Batches**: +- Batch 1 (f6a444b): 5a round-to-int + vrfin +- Batch 2 (26b9897): 5b FMA — new `check_invalid_fma_add` helper in fpscr.rs derives VXISI from input properties +- Batch 3 (49bf74f): 5c XX bit on conversions +- Batch 4 (538fa5a): 5d VSCR.NJ unconditional flush (matches Canary; Xbox 360 always boots NJ=1) +- Batch 5 (6ba8f83): 5e fresx pre-quantize input +- Batch 6 (6fe2cbf): 5f single-FMA + vctsxs NaN +- Review-fix nit (05f2f72): vrfin → stdlib `f32::round_ties_even()` + +**Deferred for focused sub-batches** (Status: open in audit-findings.md): +- PPCBUG-201 (FPSCR.RN for double arithmetic) — requires MXCSR set/restore wrappers around 10+ FPU arms +- PPCBUG-185 (FPSCR.NI flush for scalar FPU) — requires NI bit constant + post-op flush wrapper +- PPCBUG-180 + PPCBUG-200 (XX/FR/FI in update_after_op) — requires pre-vs-post-round comparison + +**Review findings**: +- Independent reviewer verdict: **MERGE-READY**. No blocking issues. 
+- Two non-blocking minor follow-ups noted: (a) `check_invalid_fma_add` doesn't catch the finite-product-overflow + infinite-b cancellation half of PPCBUG-202 (audit-acknowledged as rare); (b) vrfin used inline tie-breaker — replaced with stdlib `round_ties_even()` in 05f2f72. + +**Gate results**: +- `cargo test --workspace --release`: **498 passed, 0 failed** (up from 494 at P4 merge; 5 new regression tests across the batches) +- **Acid test** `-n 4B --parallel --reservations-table`: deferred per user direction + +**Conclusion**: P5 covers the FPU correctness foundation (round-to-int, VXISI, NaN preservation, XX bit, subnormal flush). Three substantive items deferred. Next: P6 — Other MEDIUM correctness (overflow.rs sweep, trap PC-after-advance, sc LEV, twi typed-trap, etc.). + +--- + ## Index — every PPCBUG referenced (in numerical order) This list intentionally includes every ID found in `audit-findings.md` so nothing is dropped. For each entry's full description / file:line / fix snippet / test recommendation, see the corresponding `### PPCBUG-NNN` heading in `audit-findings.md`.
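
Reviewer note (not part of the patch): the Batch 2 entry says VXISI is now derived from input properties, and PPCBUG-183/205 require that the final negation in the `fnm*` forms leave a NaN result untouched. A minimal Rust sketch of both ideas follows. The function names `fma_add_is_vxisi`, `negate_preserving_nan`, and `fnmadd` are hypothetical; the real `check_invalid_fma_add` signature is not shown in this patch, so nothing here should be read as the merged code.

```rust
/// Hypothetical helper: returns true when the add phase of a fused
/// multiply-add would compute (+inf) + (-inf) or (-inf) + (+inf),
/// which is the case where FPSCR[VXISI] should be raised.
/// `subtract` selects the fmsub/fnmsub form, where `b` is subtracted.
pub fn fma_add_is_vxisi(a: f64, c: f64, b: f64, subtract: bool) -> bool {
    // NaN operands are reported through other paths, not VXISI.
    if a.is_nan() || c.is_nan() || b.is_nan() {
        return false;
    }
    // The exact product is infinite only when an operand is infinite.
    let product_infinite = a.is_infinite() || c.is_infinite();
    // inf * 0 is an invalid multiply (VXIMZ), not an add-phase collision.
    let inf_times_zero =
        (a.is_infinite() && c == 0.0) || (c.is_infinite() && a == 0.0);
    if !product_infinite || inf_times_zero || !b.is_infinite() {
        return false;
    }
    // Sign of the product, and sign of the addend after optional negation.
    let product_negative = a.is_sign_negative() != c.is_sign_negative();
    let addend_negative = b.is_sign_negative() != subtract;
    // Opposite-signed infinities collide.
    product_negative != addend_negative
}

/// Hypothetical helper: negates `x` unless it is NaN, so a NaN result
/// keeps its sign bit instead of having it flipped by unary minus.
pub fn negate_preserving_nan(x: f64) -> f64 {
    if x.is_nan() { x } else { -x }
}

/// Sketch of the fnmadd value computation: -(a * c + b) with a single
/// rounding, using the NaN-preserving negation above.
pub fn fnmadd(a: f64, c: f64, b: f64) -> f64 {
    negate_preserving_nan(a.mul_add(c, b))
}
```

Because the check looks only at the operands, `fma_add_is_vxisi(f64::MAX, f64::MAX, f64::NEG_INFINITY, false)` is false even though a standalone `a * c` would overflow to infinity, which is the spurious case PPCBUG-202 describes. Like the helper described in Batch 2, the sketch treats the product as infinite only when an operand is infinite.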