fix(cpu): PPCBUG-424+425 vmaddfp128/vmaddcfp128 operand swap + va128 field fix

PPCBUG-424: vmaddfp128 computed VA×VB+VD instead of the ISA-mandated VA×VD+VB.
PPCBUG-425: vmaddcfp128 computed VD×VB+VA instead of the ISA-mandated VA×VD+VB.

Root cause, discovered while writing the operand-order regression tests: va128() was extracting PPC bits 6-10 (the same field as vd128's low 5 bits), not PPC bits 11-15 where VA lives in VX128 form. va128() therefore silently returned VD's register number, which is wrong for any instruction where VA != VD. The existing denorm-flush test used VA == VD == v2, so the operand swap was invisible there.

Fixes in this commit:

- decoder.rs: va128() now extracts PPC bits 11-15 (host bits 20-16) plus PPC bit 29. The vmx128_va128_uses_bit29 test encoding is updated to match the correct field.
- interpreter.rs: vmaddfp128 changed from ai.mul_add(bi,di) to ai.mul_add(di,bi) (VA×VD+VB). vmaddcfp128 changed from di.mul_add(bi,ai) to ai.mul_add(di,bi). vmaddfp128_flushes_denormal_inputs is redesigned with distinct VA/VD/VB registers (v1/v2/v3) so the flush test is independent of the accessor fix. New tests vmaddfp128_operand_order_va_times_vd_plus_vb and vmaddcfp128_operand_order_va_times_vd_plus_vb verify 2×3+10=16.
- disasm_goldens.rs + vmx128_registers.json: the vmaddfp128/vmaddcfp128/vnmsubfp128 golden raws are updated to encode VA at PPC bits 11-15 (new raws: 0x146328D4 / 0x14632914 / 0x14632954). The vperm128 / vsrw128 golden operands are updated to reflect the correct VA extraction (v4 instead of v3/v0).

Affects all VMX128 binary ops that call va128(): vaddfp128, vsubfp128, vmulfp128, vmaddfp128, vmaddcfp128, vnmsubfp128, vperm128, vsrw128 etc.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
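
For illustration only, here is a minimal sketch of the two fixes as the message describes them. It is not the code in decoder.rs or interpreter.rs: every function name is hypothetical, the accessor covers only the bits named above (the real va128() may combine further bits), the pre-fix accessor is assumed to have already used PPC bit 29, and the assignment VA=2, VD=3, VB=10 for the 2×3+10 check is an assumption.

```rust
/// Hypothetical stand-in for the fixed accessor: VA's low 5 bits come from
/// PPC bits 11-15 (host bits 20-16), and PPC bit 29 (host bit 2) supplies bit 5.
fn va128_fixed(raw: u32) -> u32 {
    let low = (raw >> 16) & 0x1F;
    let high = (raw >> 2) & 0x1;
    low | (high << 5)
}

/// Hypothetical stand-in for the pre-fix accessor: the low 5 bits were read
/// from PPC bits 6-10 (host bits 25-21), i.e. the VD field.
/// Assumption: bit 29 was already used as the high bit before the fix.
fn va128_old(raw: u32) -> u32 {
    let low = (raw >> 21) & 0x1F;
    let high = (raw >> 2) & 0x1;
    low | (high << 5)
}

/// Operand order after the fix, for both vmaddfp128 and vmaddcfp128:
/// VA×VD+VB. f32::mul_add(self, a, b) computes self * a + b.
fn madd_fixed(va: f32, vd: f32, vb: f32) -> f32 {
    va.mul_add(vd, vb)
}

/// Pre-fix vmaddfp128 order: VA×VB+VD.
fn vmaddfp128_old(va: f32, vd: f32, vb: f32) -> f32 {
    va.mul_add(vb, vd)
}

/// Pre-fix vmaddcfp128 order: VD×VB+VA.
fn vmaddcfp128_old(va: f32, vd: f32, vb: f32) -> f32 {
    vd.mul_add(vb, va)
}
```

With these definitions, va128_fixed(0x146328D4) yields 35, matching the v35 operand in the updated golden. The old raw 0x146028D4 yields 35 only through va128_old (reading VD=3); va128_fixed decodes it as 32, which is why the golden raws had to change. For inputs (2, 3, 10), madd_fixed gives 16, while the two pre-fix orders give 23 and 32.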
2026-05-02 10:33:24 +02:00
parent cedee3c385
commit 52ece4bd86
4 changed files with 71 additions and 40 deletions
--- a/crates/xenia-cpu/tests/disasm_goldens.rs
+++ b/crates/xenia-cpu/tests/disasm_goldens.rs
@@ -514,12 +514,12 @@ fn vmx128_registers() {
    //     vmaddcfp128  VD, VA, VD, VB  → "v3, v35, v3, v5"
    //     vnmsubfp128  VD, VA, VD, VB  → "v3, v35, v3, v5"
    let vmx128_4op = [
-        // vmaddfp128: bits 24=1, 25=1, 27=1, bit 29=1 (VA high), VB=5
-        (0x146028D4u32, 0x82000000, "vmaddfp128 v3, v35, v5, v3"),
-        // vmaddcfp128: bits 23=1, 27=1, bit 29=1, VB=5
-        (0x14602914u32, 0x82000000, "vmaddcfp128 v3, v35, v3, v5"),
-        // vnmsubfp128: bits 23=1, 25=1, 27=1, bit 29=1, VB=5
-        (0x14602954u32, 0x82000000, "vnmsubfp128 v3, v35, v3, v5"),
+        // vmaddfp128: vd=3(bits 6-10), va=35(bits 11-15=3 + bit29=1), vb=5(bits 16-20), key2=0b001101
+        (0x146328D4u32, 0x82000000, "vmaddfp128 v3, v35, v5, v3"),
+        // vmaddcfp128: same vd/va/vb layout, key2=0b010001
+        (0x14632914u32, 0x82000000, "vmaddcfp128 v3, v35, v3, v5"),
+        // vnmsubfp128: same vd/va/vb layout, key2=0b010101
+        (0x14632954u32, 0x82000000, "vnmsubfp128 v3, v35, v3, v5"),
    ];

    let mut all = Vec::new();