xenia-rs/audit-report-2026-04-29.md
MechaCat02 f424132a5b chore(audit): mark P3 PPCBUGs applied; append P3 progress section
P3 phase merged at f3ebaba. Update audit-findings.md status fields and
append the P3 progress section to audit-report-2026-04-29.md, including
the new PPCBUG-700 discovery (VMX128 register accessor canary-compliance).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-02 11:28:38 +02:00


PPC Instruction Audit — Triaged Report (2026-04-29)

Status: audit complete. No code modified. This file is the fix-order plan for the follow-up session. Source of truth: detailed bug entries (one heading per PPCBUG ID) live in audit-findings.md. This file references every entry by ID so nothing is lost — it does not duplicate the per-bug detail.

Counts

  • Total findings: 253 PPCBUG IDs, of which 5 are explicitly retracted/withdrawn (PPCBUG-220, 222, 226, 482, 483; see Notes section).
  • Net findings: ~248 actionable.
  • Severity breakdown (rough):
    • HIGH: ~55 (~22%)
    • MEDIUM: ~75 (~30%)
    • LOW (test gaps + cosmetic + informational): ~118 (~48%)

Headline findings (most likely Sylpheed-renderer-blockers)

  1. PPCBUG-107 cascade: ReservationTable::invalidate_for_write is defined and unit-tested but never called from any of the 50+ store opcodes in the interpreter. Under --parallel, every cross-thread atomic via lwarx/stwcx. is silently broken: spinlocks succeed without exclusion, atomic counters race, condition-variable handshakes never sync. Plausible direct cause of the 4-worker-thread renderer plateau (project_xenia_rs_sylpheed_stage3_2026_04_29.md). Fix is mechanical: one-line if t.has_active_reservers() { t.invalidate_for_write(ea) } before every mem.write_* in interpreter.rs.

  2. PPCBUG-053+054 cascade: bcx/bclrx CTR zero-test compares all 64 bits; mtspr CTR writes full 64-bit GPR. Combined with PPCBUG-006 (negx poisons GPR upper 32) → neg; mtctr; bdnz loops run forever.

  3. 8 decoder/field-extraction bugs collapse into 6 missing accessors + 1 wrong sh64 formula + 1 missing decode_op6 dot-form entry. The disassembler already has correct local versions. Single mechanical sweep.

  4. PPCBUG-046 (clrldi r3, r4, 32) — the canonical zero-extend-low-32 idiom is currently a no-op. Emitted constantly by 32-bit-ABI compilers.

  5. PPCBUG-510: stvewx128 corrupts 12 adjacent bytes per call.

  6. PPCBUG-424/425: vmaddfp128/vmaddcfp128 operand swap. Every D3D vertex/pixel shader using FMA with non-aliased operands gets wrong arithmetic.

  7. PPCBUG-360/363: vperm128 uses wrong control vector (every D3D shader swizzle); vpkd3d128 missing post-pack permutation (canonical D3D vertex-pack pack=1 always wrong).

  8. PPCBUG-275/420-422 — VC-form and VMX128_R-form rc_bit() reads bit 0 instead of bit 21/27 → CR6 never updated for ANY VMX vector compare dot form. Breaks every vcmpequb. + bc CR6_all_true early-exit loop in audio mixing, font rendering, string ops.
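
Headline item 2 can be illustrated with a small model. The sketch below is not the interpreter's code: `mtctr_32`, `bdnz_taken`, `loop_iterations` and `neg_poisoned` are hypothetical names, and only the `(!ra).wrapping_add(1)` shape is taken from the PPCBUG-006 entry. It shows that a 32-bit zero test on CTR terminates a `neg; mtctr; bdnz` loop even when the upper 32 bits of the GPR are poisoned.

```rust
/// Unfixed `negx` shape quoted under PPCBUG-006: a full 64-bit NOT plus one.
fn neg_poisoned(ra: u64) -> u64 {
    (!ra).wrapping_add(1)
}

/// Hypothetical `mtctr` with the defensive truncation: keep the low 32 bits only.
fn mtctr_32(gpr: u64) -> u64 {
    gpr as u32 as u64
}

/// Hypothetical `bdnz` step: decrement CTR, then test only the low 32 bits.
/// Returns true when the branch is taken.
fn bdnz_taken(ctr: &mut u64) -> bool {
    *ctr = ctr.wrapping_sub(1);
    (*ctr as u32) != 0
}

/// Count how often the body of a `do { ... } while (bdnz)` loop runs,
/// giving up after `limit` iterations.
fn loop_iterations(mut ctr: u64, limit: u64) -> u64 {
    let mut n = 0;
    while n < limit {
        n += 1;
        if !bdnz_taken(&mut ctr) {
            break;
        }
    }
    n
}
```

With a 32-bit-ABI input of -3 (0x0000_0000_FFFF_FFFD), the poisoned negate yields 0xFFFF_FFFF_0000_0003. A 64-bit zero test would not reach zero in any practical time, while the 32-bit test above exits after three iterations, with or without the `mtctr` truncation.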

The phases below are the recommended fix order for the follow-up session. Each phase is independently mergeable; later phases may reveal that earlier phases unblocked their symptoms (e.g. P1 by itself could be sufficient to break open the Sylpheed renderer plateau).

After each phase: cargo test --workspace --release (must stay at 506+ pass) AND xenia-rs check sylpheed.iso -n 100M (must not regress against the 2026-04-29 addis-fix baseline of swaps=2). The acid test is whether draws > 0 opens after P1 or P2.


Phase 1 — Cross-thread atomicity (PPCBUG-107 cascade)

Why first: highest confidence smoking-gun for the renderer plateau. Single, mechanical, low-risk fix. Largest leverage relative to size.

Coupled — must land together:

  • PPCBUG-107 (root: missing call from stores)
  • PPCBUG-130 (9 byte/halfword stores)
  • PPCBUG-140, 141, 142, 143, 144 (5 word stores: stw/stwu/stwx/stwux/stwbrx)
  • PPCBUG-150 (5 doubleword stores: std/stdu/stdx/stdux/stdbrx)
  • PPCBUG-160 (3 multiple/string stores: stmw/stswi/stswx)
  • PPCBUG-167 (9 FP stores)
  • PPCBUG-511, 512, 513, 514 (16 VMX stores)

Independent but related:

  • PPCBUG-151 (stwcx/stdcx reservation width discriminator) — separate fix; add reservation_width: u8 to PpcContext.
  • PPCBUG-108 (legacy per-context path: cross-thread invalidation impossible) — informational; --reservations-table mode bypasses.

Approach — one PR adds if t.has_active_reservers() { t.invalidate_for_write(ea) } before every mem.write_* call site. Scope:

mem.write_u8 / write_u16 / write_u32 / write_u64 / write_f32 / write_f64
mem.write_vec128 / write_vec128_aligned (for VMX)

~38 sites total. Add 1+ targeted concurrency tests (lwarx + cross-thread plain store + stwcx., expect EQ=0).
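
The approach above can be sketched as a toy model. Everything here is an assumption made for illustration: the real `ReservationTable` layout is not shown in this report, the 128-byte granule is assumed, and `reserve`, `try_commit` and `store_u32` are hypothetical names. Only `has_active_reservers` and `invalidate_for_write` are names taken from the plan.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

const NO_RESERVATION: u64 = u64::MAX;

/// Toy stand-in: one reservation slot per guest hardware thread.
pub struct ReservationTable {
    slots: Vec<AtomicU64>,
}

impl ReservationTable {
    pub fn new(threads: usize) -> Self {
        Self { slots: (0..threads).map(|_| AtomicU64::new(NO_RESERVATION)).collect() }
    }

    /// Assumption: reservations cover a 128-byte granule.
    fn granule(ea: u64) -> u64 {
        ea & !127
    }

    /// Models lwarx: record the reserved granule for `thread`.
    pub fn reserve(&self, thread: usize, ea: u64) {
        self.slots[thread].store(Self::granule(ea), Ordering::SeqCst);
    }

    pub fn has_active_reservers(&self) -> bool {
        self.slots.iter().any(|s| s.load(Ordering::SeqCst) != NO_RESERVATION)
    }

    /// Drop every reservation that covers the written granule.
    pub fn invalidate_for_write(&self, ea: u64) {
        let g = Self::granule(ea);
        for slot in &self.slots {
            let _ = slot.compare_exchange(g, NO_RESERVATION, Ordering::SeqCst, Ordering::SeqCst);
        }
    }

    /// Models stwcx.: always clears the slot, succeeds only if it still matched.
    pub fn try_commit(&self, thread: usize, ea: u64) -> bool {
        self.slots[thread].swap(NO_RESERVATION, Ordering::SeqCst) == Self::granule(ea)
    }
}

/// A plain store with the one-line guard placed before the memory write.
pub fn store_u32(table: &ReservationTable, mem: &mut [u8], ea: u64, value: u32) {
    if table.has_active_reservers() {
        table.invalidate_for_write(ea);
    }
    let i = ea as usize;
    mem[i..i + 4].copy_from_slice(&value.to_be_bytes());
}
```

In this model a plain store into a reserved granule makes the later `try_commit` fail, which is the EQ=0 outcome the targeted concurrency test expects. As a simplification, the sketch also clears the storing thread's own reservation.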


Phase 2 — Decoder/field-extraction structural sweep

Why second: single mechanical sweep, fixes 12 distinct HIGH-severity findings, unblocks correct execution of compiler-emitted code. Disassembler already has correct local extraction logic — promote/port.

Coupled — same commit:

  • PPCBUG-040 + PPCBUG-560 — fix sh64() bit order AND fix the test helper that was masking it
  • PPCBUG-046 + PPCBUG-561 — promote mb_md() from disasm.rs:1256 to decoder.rs; replace 6 inline-formula sites in interpreter.rs (rldicl/rldicr/rldic/rldimi/rldcl/rldcr)
  • PPCBUG-275 + PPCBUG-276 + PPCBUG-420 + PPCBUG-421 + PPCBUG-422 + PPCBUG-562 — add vc_rc_bit() (PPC bit 21) and vx128r_rc_bit() (PPC bit 27); replace instr.rc_bit() at all VMX compare dot-form sites
  • PPCBUG-315 + PPCBUG-563 — add vx128_4_z(), vx128_4_imm(); fix vrlimi128
  • PPCBUG-361 + PPCBUG-565 — add vx128_5_sh(); fix vsldoi128
  • PPCBUG-362 + PPCBUG-564 — add vx128_p_perm(); fix vpermwi128
  • PPCBUG-423 + PPCBUG-600 — add 5 odd-key entries to decode_op6 key4 for vcmp*fp128. dot forms

Independent in this phase:

  • PPCBUG-360 — vperm128 reads VC from vd128() instead of VX128_2 VC field at integer bits 6-8. Fix at the call site (or add vx128_2_vc() accessor).
  • PPCBUG-363 + PPCBUG-369 — vpkd3d128 missing post-pack permutation; add the pack/shift field handling per Canary.

Test fixture updates required (PPCBUG-560 lesson) — once sh64() is fixed, verify all disasm_goldens.rs test helpers encode shifts ISA-correctly. Don't trust the existing fixtures blindly.
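
For reference, a minimal sketch of the two MD-form accessors, assuming the instruction word is held as a host-order `u32` (PPC bit n is host bit 31-n). The names `sh64` and `mb_md` come from the plan; the bodies below are written from the PowerISA field layout, not copied from decoder.rs or disasm.rs. `rldicl` is a hypothetical helper added only to show the effect on `clrldi`.

```rust
/// 6-bit shift amount: sh[0:4] sits in PPC bits 16-20, sh[5] (the high bit) in PPC bit 30.
fn sh64(instr: u32) -> u32 {
    ((instr >> 11) & 0x1F) | (((instr >> 1) & 1) << 5)
}

/// 6-bit mask begin: mb[0:4] sits in PPC bits 21-25, mb[5] (the high bit) in PPC bit 26.
fn mb_md(instr: u32) -> u32 {
    ((instr >> 6) & 0x1F) | (((instr >> 5) & 1) << 5)
}

/// rldicl semantics: rotate left, then keep PPC bits mb..63 (the low 64-mb bits).
fn rldicl(rs: u64, sh: u32, mb: u32) -> u64 {
    rs.rotate_left(sh) & (u64::MAX >> mb)
}
```

`clrldi r3, r4, 32` assembles to 0x78830020, for which these accessors give sh=0 and mb=32, so the upper 32 bits are cleared. `sldi r3, r4, 32` assembles to 0x788307C6 and gives sh=32. Encodings like these are candidates for ISA-correct golden fixtures.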


Phase 3 — Other HIGH bugs (single targeted fixes)

Independent:

  • PPCBUG-510 — stvewx128 corrupting 12 bytes per call. Direct fix: align EA to word, write only 4 bytes.
  • PPCBUG-424 — vmaddfp128 operand order: change ai.mul_add(bi, di) to ai.mul_add(di, bi).
  • PPCBUG-425 — vmaddcfp128 operand order similarly.
  • PPCBUG-053 + PPCBUG-054 — bcx/bclrx CTR zero-test (32-bit) + mtspr CTR truncation (defensive firewall). Coupled.
  • PPCBUG-640 — fmt_bc spurious condition suffix on pure bdnz/bdz. Port the fmt_bclr pattern.
  • PPCBUG-641 — lwsync shows as sync in disassembler (re-assessment of PPCBUG-088). Same fix.
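
The PPCBUG-510 fix shape (align EA to word, write only 4 bytes) can be sketched against a flat byte slice. This is an illustration, not the interpreter's code: the `stvewx` signature, the slice-backed memory and the big-endian byte layout of the vector register are all assumptions.

```rust
/// Store one word element of a 16-byte vector register.
/// `vr` holds the register in guest (big-endian) byte order.
fn stvewx(mem: &mut [u8], ea: usize, vr: &[u8; 16]) {
    let ea = ea & !3; // word-align the effective address
    let lane = (ea & 0xF) >> 2; // which of the four words is selected
    // Write exactly four bytes; the neighbouring twelve stay untouched.
    mem[ea..ea + 4].copy_from_slice(&vr[lane * 4..lane * 4 + 4]);
}
```

A regression test for this shape would fill memory with a sentinel, perform the store, and assert that only the four addressed bytes changed.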

Phase 4 — 32-bit ABI writeback truncation sweep

Why this phase: cross-cutting, mechanical. Once ALL writebacks truncate via as u32 as u64, the systemic 32-bit-ABI invariant is restored and most CR0/CA helper-correctness concerns become moot.

4a — Active poisoning (every execution corrupts GPR upper bits)

These bugs corrupt GPR upper bits regardless of whether upstream sources are clean — typically because the implementation applies Rust's !u64 (full 64-bit NOT) somewhere:

  • PPCBUG-006 (negx — (!ra).wrapping_add(1))
  • PPCBUG-008 (subfex — (!ra).wrapping_add(rb).wrapping_add(ca))
  • PPCBUG-018 (subfzex)
  • PPCBUG-019 (subfmex)
  • PPCBUG-028 (orcx — rs | !rb)
  • PPCBUG-029 (norx — !(rs | rb) — the canonical not mnemonic, hot path)
  • PPCBUG-030 (nandx)
  • PPCBUG-031 (eqvx — !(rs ^ rb) — common eqv rA, rA, rA set-to-all-ones)
  • PPCBUG-033 (andcx via !rb)
  • PPCBUG-034 (extsbx — as i8 as i64 as u64)
  • PPCBUG-035 (extshx)

4b — Same-shape-as-addis (latent under clean inputs, active when upstream is poisoned)

  • PPCBUG-001 (addi), PPCBUG-002 (addic), PPCBUG-003 (addicx), PPCBUG-005 (subficx), PPCBUG-007 (subfcx CA), PPCBUG-008 (subfex CA — also in 4a)
  • PPCBUG-004 (mulli), PPCBUG-009 (mullwx)
  • PPCBUG-010 + PPCBUG-011 (divwx writeback + CR0 — must land together, not independently)
  • PPCBUG-041 + PPCBUG-042 + PPCBUG-043 (srawx/srawix writeback + CR0 coupling — must land together)
  • PPCBUG-095, 096, 097, 098 (lha/lhax/lhau/lhaux halfword sign-extension)
  • PPCBUG-105 (lwa/lwax/lwaux — note: 64-bit-mode-only; less common in 32-bit-ABI binaries)

4c — Latent writeback (only triggers if 4a/4b are unfixed)

These can be fixed in the same sweep but won't fire under clean inputs:

  • PPCBUG-012, 013, 014, 015, 016, 017 (addx/addcx/addex/addzex/addmex/subfx)
  • PPCBUG-032 (andx/orx/xorx)

4d — CR0 32-bit-ABI compare (cross-cutting catch-all)

PPCBUG-020 documents the catch-all; the per-opcode locations are referenced from there:

  • PPCBUG-020 (catch-all in groups 2-5)
  • PPCBUG-023 (andisx)
  • PPCBUG-024 (rlwinmx), PPCBUG-025 (rlwimix), PPCBUG-026 (rlwnmx)
  • PPCBUG-036 (extsbx), PPCBUG-037 (extshx) — must land with PPCBUG-034/035
  • PPCBUG-044 (slwx/srwx)

Fix shape — at every Rc=1 path, change update_cr_signed(0, result as i64) to update_cr_signed(0, result as u32 as i32 as i64). Once 4a/4b/4c land, both forms become equivalent and 4d becomes belt-and-suspenders (still recommended for resilience).
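
A minimal sketch of the two fix shapes named in this phase, under stated assumptions: `writeback32`, `cr0_32`, `nor_poisoned` and the `Cr0` enum are hypothetical, the SO bit is left out, and only the cast chains `as u32 as u64` and `as u32 as i32 as i64` come from the plan.

```rust
#[derive(Debug, PartialEq)]
enum Cr0 {
    Lt,
    Gt,
    Eq,
}

/// Writeback truncation: keep the low 32 bits, zero the upper 32.
fn writeback32(value: u64) -> u64 {
    value as u32 as u64
}

/// CR0 classification from the 32-bit signed view of the result.
fn cr0_32(result: u64) -> Cr0 {
    let v = result as u32 as i32 as i64;
    if v < 0 {
        Cr0::Lt
    } else if v > 0 {
        Cr0::Gt
    } else {
        Cr0::Eq
    }
}

/// Unfixed `norx` shape quoted under PPCBUG-029: a full 64-bit NOT.
fn nor_poisoned(rs: u64, rb: u64) -> u64 {
    !(rs | rb)
}
```

Because `cr0_32` looks only at the low 32 bits, it gives the same answer whether or not the upper half of the register has been poisoned.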


Phase 5 — FPU correctness (graphics middleware impact)

5a — Round-to-int and FPSCR.RN

  • PPCBUG-221 + PPCBUG-227 (round_to_i64 NearestEven broken near 2^52 — must land together; round_to_i32 delegates)
  • PPCBUG-201 (FPSCR.RN not honored for double arithmetic)
  • PPCBUG-432 (vrfin/vrfin128 round-half-away-from-zero vs round-to-nearest-even)
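
As a reference point for 5a, here is one way to round to nearest with ties to even without the usual add-0.5-and-floor shortcut. This is a sketch only: `round_nearest_even` is a hypothetical name, and this report does not spell out the exact failure mode of the existing round_to_i64.

```rust
/// Round to the nearest integer, ties to even, staying in f64.
fn round_nearest_even(x: f64) -> f64 {
    const TWO_POW_52: f64 = 4503599627370496.0;
    // NaN, infinities and anything at or above 2^52 in magnitude are returned unchanged;
    // finite values that large are already integral.
    if !x.is_finite() || x.abs() >= TWO_POW_52 {
        return x;
    }
    let f = x.floor();
    let diff = x - f; // exact for |x| < 2^52
    if diff > 0.5 {
        f + 1.0
    } else if diff < 0.5 {
        f
    } else if f % 2.0 == 0.0 {
        f // tie, floor is even
    } else {
        f + 1.0 // tie, floor is odd
    }
}
```

Test values worth covering include ties on both sides of zero, the largest value below 0.5, and ties just below 2^52, where the spacing between adjacent doubles is exactly 0.5.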

5b — VXISI / NaN / SNaN handling for FMA family

  • PPCBUG-181, 182 (single fmaddsx/fmsubsx/fnmaddsx/fnmsubsx VXISI)
  • PPCBUG-202, 203, 204 (double fmaddx/fmsubx/fnmaddx/fnmsubx VXISI — esp. 203 hot for Newton-Raphson)
  • PPCBUG-183, 205 (fnmadd/fnmsub Rust unary - flips NaN sign — fix: skip negation on NaN)
  • PPCBUG-186 (SNaN priority for FMA)
  • PPCBUG-128 (lfs SNaN quietening — bit-manipulation widening helper needed)
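
The "skip negation on NaN" fix for PPCBUG-183/205 can be sketched as follows. `negate_unless_nan` and `fnmadd` are hypothetical helpers; VXISI, SNaN priority and FPSCR updates are deliberately left out.

```rust
/// Negate a finished FMA result, but pass NaNs through with their sign bit intact.
fn negate_unless_nan(r: f64) -> f64 {
    if r.is_nan() {
        r
    } else {
        -r
    }
}

/// fnmadd shape: -(a * c + b), computed with a single rounding.
fn fnmadd(a: f64, c: f64, b: f64) -> f64 {
    negate_unless_nan(a.mul_add(c, b))
}
```

Rust's unary minus flips the sign bit of any float, NaN included, which is why the negation has to be guarded explicitly.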

5c — Inexact / FPSCR exception bits

  • PPCBUG-180 (single XX/FR/FI never set), PPCBUG-200 (double XX/FR/FI never set)
  • PPCBUG-223 (fcmpo VXSNAN/VXVC), PPCBUG-224 (fcfidx XX), PPCBUG-225 (frspx XX/FR/FI), PPCBUG-229 (fctidx/fctidzx XX/FX), PPCBUG-230 (fctiwx/fctiwzx XX/FX), PPCBUG-231 (frspx SNaN host dependency)
  • PPCBUG-165 + PPCBUG-166 + PPCBUG-168 (stfs* FPSCR + RN + SNaN)

5d — Subnormal flush (FPSCR.NI / VSCR.NJ)

  • PPCBUG-185 (FPU NI subnormal flush not modeled)
  • PPCBUG-435, 436, 437 (VMX NJ subnormal flush — vaddfp/vsubfp/vmulfp128, vmsum3fp128/vmsum4fp128 product intermediates, vmaddfp/vmaddfp128/vmaddcfp128/vnmsubfp128 outputs)

5e — Estimate precision (vs hardware ~12-bit)

  • PPCBUG-184 (fres)
  • PPCBUG-428..431 (vrefp, vrsqrtefp, vexptefp, vlogefp — same shape as fres)

5f — VMX float compares + saturation

  • PPCBUG-426, 427 (vnmsubfp/vnmsubfp128 double-rounding)
  • PPCBUG-433 (vctsxs/vcfpsxws128 NaN saturate to INT_MIN)

Phase 6 — Other MEDIUM correctness

  • PPCBUG-021 (overflow.rs OE checks at bit 63 — sub-register ops; partly covered by P4)
  • PPCBUG-022 (mulld_ov missing INT_MIN × -1)
  • PPCBUG-027 (rlwimix upper-32 ISA-deviation — auto-resolves once P4 lands)
  • PPCBUG-039 (cntlzdx 32-bit-ABI counts upper-zero — only matters if emitted)
  • PPCBUG-063 (trap pc-after-advance)
  • PPCBUG-064 (sc LEV field)
  • PPCBUG-065 (twi 31, r0, IMM typed-trap — relevant to Sylpheed C++ throw work, see project_xenia_rs_sylpheed_throw_2026_04_28.md)
  • PPCBUG-068 (mcrfs VX summary recomputation)
  • PPCBUG-078 (mtmsrd L=1 partial MSR-write)
  • PPCBUG-080 (mfvscr zero upper 96 bits)
  • PPCBUG-123 + PPCBUG-124 + PPCBUG-161 + PPCBUG-566 (XER TBC for lswx/stswx — coupled; add xer_tbc: u8 to PpcContext, wire into xer()/set_xer(); enables lswx and stswx)
  • PPCBUG-125 (lmw RA-in-destination skip)
  • PPCBUG-126 + PPCBUG-162 (lswi/stswi instr.rb() → instr.nb())
  • PPCBUG-487 + PPCBUG-495 (vsum* operand naming)
  • PPCBUG-515 (lvebx/lvehx/lvewx vs Canary divergence — document; xenia-rs is more ISA-faithful)
  • PPCBUG-516 (lvsr sh=0 case — add comment + debug_assert)
  • PPCBUG-601 (decode_op6 overlapping windows — document the invariant)
  • PPCBUG-642 (fmt_bcctr extended forms)
  • PPCBUG-643 + PPCBUG-644 (SIMM/D-form decimal vs hex — alignment with Canary disassembly)
  • PPCBUG-367 (vupkhpx/vupklpx channel replication vs zero-extend)
  • PPCBUG-368 (vpkpx pack_pixel_555 channel assignment unverified)
  • PPCBUG-366 (vspltisb/vspltish sign-extension idiom — fragile, not wrong)
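
For PPCBUG-022, the missing INT_MIN × -1 case falls out naturally if the overflow flag is taken from Rust's own overflow-aware multiply. A sketch, assuming a helper that returns the low 64 bits plus an OV flag (the real `mulld_ov` signature in overflow.rs may differ):

```rust
/// Low 64 bits of the signed product, plus whether the true product overflowed i64.
fn mulld_ov(ra: i64, rb: i64) -> (i64, bool) {
    // overflowing_mul reports i64::MIN * -1 as an overflow.
    ra.overflowing_mul(rb)
}
```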

Phase 7 — Frozen-snapshot drift (separate sweep)

8 opcodes' frozen snapshots in ppc-manual/<cat>/<op>.md differ from live code:

  • PPCBUG-066 (td/tdi/tw/twi)
  • PPCBUG-117 (ldarx)
  • PPCBUG-145 (stwcx)
  • PPCBUG-560 (already-listed: rldicl test helper bit-order)
  • Plus the implicit drift in addicx (PPCBUG-003), andisx (PPCBUG-023), cmp/cmpi (PPCBUG-050), extsbx/extshx (PPCBUG-036/037, PPCBUG-032 in batch 1)

Recommendation: regenerate frozen snapshots from current code for the entire ppc-manual after Phases 1-4 land. Add a CI check that compares snapshots vs live code on every PR.


Phase 8 — Test gap closure (broad)

Single PR per group is overkill; recommend bundling test additions with each Phase 1-6 PR (test the bug being fixed). The remaining LOW IDs are pure-test-gap entries — list:

  • PPCBUG-045 (shift), 047 (rld), 055 (branch), 067 (trap+sc), 070 (CR logical)
  • PPCBUG-081, 082, 083, 084, 085 (SPR/MSR/TB/FPSCR/VSCR moves), 089 (cache+sync)
  • PPCBUG-091 (lbz), 100 (lha), 109, 110, 111 (lwa/lwbrx/lwarx), 118 (ld), 127 (lmw/lswi/lswx), 129 (lfs/lfd)
  • PPCBUG-132 (stb/sth), 146, 147 (stw/stwcx), 153 (std/stdcx), 163 (stmw/stswi/stswx), 171 (stfs/stfd)
  • PPCBUG-187 (FPU single), 208 (FPU double), 228 (FPU misc convert)
  • PPCBUG-240 (VMX add/sub), 243 (VMX sat helpers)
  • PPCBUG-277, 278, 279 (VMX compare/min/max/avg)
  • PPCBUG-316, 317, 320, 321, 322, 323, 324, 325 (VMX shift/rotate/logical)
  • PPCBUG-370, 371, 372, 373, 374, 375, 376, 377, 378 (VMX permute/pack)
  • PPCBUG-438, 439, 440 (VMX float compare/round/convert)
  • PPCBUG-490, 491, 492, 493, 494 (VMX multiply-sum)
  • PPCBUG-517, 518, 519 (VMX load/store)
  • PPCBUG-567 (decoder accessors)
  • PPCBUG-604 (decoder dispatch tables)
  • PPCBUG-649, 650, 652 (golden fixtures for branches/VMX128)

Notes & administrative

Withdrawn / retracted

  • PPCBUG-220: fctiwx strict-> threshold actually correct (i32::MAX exactly representable in f64). Retracted by group-31 subagent.
  • PPCBUG-222: fctidx positive-overflow sentinel 0x7FFF_FFFF_FFFF_FFFF is the correct ISA value. Retracted.
  • PPCBUG-226 — FPRF 5-bit codes for fcmpu/fcmpo are correct per PowerISA. Retracted.
  • PPCBUG-482: vmhaddshs shift >>15 is correct per spec snapshots. Retracted.
  • PPCBUG-483: vmhraddshs shift >>15 is correct per spec snapshots. Retracted.

Wontfix / informational (not retracted but no fix needed)

  • PPCBUG-038 — extswx ISA-correct, intentional 64-bit sign-extension. Document the asymmetry with extsb/extsh after PPCBUG-034/035 land.
  • PPCBUG-090, 099, 152 — invalid-form (rD==rA) silently destroys load/store result. Per ISA: undefined behavior. No compiler emits these; matches Canary. Optional debug_assert!.
  • PPCBUG-106, 115, 131, 169, 170, 206, 207, 318, 319, 364, 365, 434, 651, 653, 645, 646, 648 — informational confirmations that the implementation is correct, no change needed.
  • PPCBUG-069 — test comment OX(so)=0 is wrong but the assert is correct.
  • PPCBUG-602, 603, 605 — undocumented decoder dispatch quirks; correct but should add comments.
  • PPCBUG-647, 654 — disassembler edge-case behavior on invalid encodings; not-a-bug for valid input.

Coupling matrix (must-land-together)

| Group | IDs | Reason |
|---|---|---|
| divwx | 010, 011 | Quotient zero-extension changes the CR0 sign view |
| srawx/srawix | 041, 042, 043 | Writeback truncation invalidates the CR0 view |
| extsbx/extshx | 034+036, 035+037 | Same coupling shape as srawx |
| sh64 | 040, 560 | Test helper is wrong in the inverse direction |
| mb_md sweep | 046, 561 | Promote disasm.rs accessor first |
| VC-form Rc | 275, 276, 420, 421, 562 | All consume the same new accessor |
| VMX128_R Rc | 422, 562 | Same accessor sweep |
| vrlimi128 | 315, 563 | Field accessor + caller fix |
| vsldoi128 | 361, 565 | Field accessor + caller fix |
| vpermwi128 | 362, 564 | Field accessor + caller fix |
| vcmp*fp128. | 423, 600 | decode_op6 odd keys + opcode mapping |
| XER TBC | 123, 124, 161, 566 | Add field, wire xer()/set_xer(), enables lswx/stswx |
| round_to_i64 | 221, 227 | round_to_i32 delegates |
| stfs FPSCR | 165, 166, 168 | Single fix shape covers all three |

Dependency on the addis fix

The addis fix (project_xenia_rs_addis_signext_root_cause_2026_04_29.md) is already in place. Phase 4 generalizes that fix systematically; without it, the writeback-truncation invariant would still be incomplete.

Anticipated impact on the Sylpheed renderer plateau

Strong candidates for direct cause of the plateau:

  • PPCBUG-107 — broken atomics. Workers wait forever on never-signaled events; classical broken-spinlock symptom.
  • PPCBUG-053+054 — broken bdnz loops; could explain workers parked indefinitely.
  • PPCBUG-046 (clrldi r3, r4, 32) — pollution propagation in 32-bit ABI; could break any pointer-clean-up sequence.

After applying Phase 1 alone, run xenia-rs check sylpheed.iso -n 4B --parallel and check whether draws > 0. If yes, the plateau was atomics; if no, proceed to P2/P3.


Progress log

P1 — Cross-thread atomicity sweep (merged 2026-05-01, HEAD ca5b90b)

PPCBUGs fixed: 107, 130, 140, 141, 142, 143, 144, 150, 160, 167, 511, 512, 513, 514, 151, 108. Plus review-fix additions: dcbz, dcbz128, stswi two-line, stswx two-line (merged in review-fix commit c9f194d).

Gate results:

  • cargo test --workspace --release: 449 passed, 0 failed
  • -n 100M lockstep: swaps=2, clean
  • -n 100M --parallel --reservations-table: swaps=2, clean
  • Acid test -n 4B --parallel --reservations-table: swaps=2, draws=0, no RtlRaiseException, no panics

Conclusion: P1 did NOT unblock the Sylpheed renderer. draws remains 0. The renderer plateau is not caused by broken cross-thread atomics alone. Proceeding to P2 (decoder/field-extraction sweep). The strongest remaining candidate per the plan is PPCBUG-046 (clrldi r3, r4, 32 no-op).


P2 — Decoder/field-extraction structural sweep (merged 2026-05-01, HEAD see git log master --oneline -1)

PPCBUGs fixed: 040, 046, 275, 276, 315, 360, 361, 362, 363, 369, 420, 421, 422, 423, 560, 561, 562, 563, 564, 565, 600.

Batches:

  • Batch 1: PPCBUG-040+560 — sh64() bit-order fix (XS-form SH split) + rldicl test helper encoding
  • Batch 2: PPCBUG-046+561 — mb_md() accessor; all 6 rld* MB fields corrected (clrldi was a no-op)
  • Batch 3: PPCBUG-275+276+420+421+422+423+562+600 — vc_rc_bit()/vx128r_rc_bit() Rc accessors; 13 vcmp interpreter sites; 5 decode_op6 dot-form entries
  • Batch 4: PPCBUG-315+563 — vrlimi128 vx128_4_z/imm field extraction
  • Batch 5: PPCBUG-361+565 — vsldoi128 vx128_5_sh field extraction
  • Batch 6: PPCBUG-362+564 — vpermwi128 vx128_p_perm field extraction
  • Batch 7: PPCBUG-360 — vperm128 vc128_2() accessor (was erroneously vd128())
  • Batch 8: PPCBUG-363+369 — vpkd3d128 post-pack permutation (MakePermuteMask tables from canary)

Gate results:

  • cargo test --workspace --release: 201 (cpu) + 6 (disasm goldens) + 144 + 76 + 16 + 8 + … passed, 0 failed
  • Independent code reviewer: all 9 check items OK
  • -n 100M lockstep smoke: ISO not available in CI environment; last known good at P1 HEAD was swaps=2
  • Acid test -n 4B --parallel --reservations-table: pending (ISO not in CI environment)

Conclusion: All P2 fixes applied and reviewed. Decoder field extraction is now correct for all audited VMX128 and MD/XS-form instructions. Whether P2 unblocks the renderer (draws > 0) requires the sylpheed.iso acid test on the user's machine. PPCBUG-046 (clrldi no-op fix) was the highest-probability P2 renderer-unblock candidate. Next: P3 — isolated HIGH bugs (PPCBUG-510, 424/425, 053+054, 640, 641).


P3 — Isolated HIGH bugs (merged 2026-05-02, HEAD f3ebaba)

PPCBUGs fixed: 053+054 (coupled CTR 32-bit), 424+425 (vmaddfp128/vmaddcfp128 operand swap), 510 (stvewx128 corruption), 640+650 (bdnz/bdz suffix), 641+649 (sync/lwsync), 700 (NEW).

Batches:

  • Batch 1: PPCBUG-510 — stvewx128 16-byte corruption fixed (word-align EA, extract lane, write 4 bytes)
  • Batch 2: PPCBUG-424+425 + PPCBUG-700 partial (va128 PPC[11-15] partial fix) — vmaddfp128/vmaddcfp128 operand swap to VA*VD+VB
  • Batch 3: PPCBUG-053+054 — bcx/bclrx 32-bit CTR compare + mtspr CTR truncation
  • Batch 4: PPCBUG-640+650 — fmt_bc spurious bdnzge/bdzge suffix gated on !uncond
  • Batch 5: PPCBUG-641+649 — sync/lwsync L-field disambiguation
  • Phase review fix: PPCBUG-700 (NEW) — VMX128 register accessors (va128/vb128/vd128/vx128r_rc_bit) rewritten to canary's bitfield positions. Audit's "confirmed-clean" line-2958 assessment was based on miscounting LSB-first packed C++ bitfields. Per canary (xenia-canary/src/xenia/cpu/ppc/ppc_decode_data.h:484-663):
    • VA128 = PPC[11-15] | PPC[26]<<5 | PPC[21]<<6 (3 fields, 7 bits)
    • VB128 = PPC[16-20] | PPC[30-31]<<5
    • VD128 = PPC[6-10] | PPC[28-29]<<5
    • VX128_R Rc = PPC[25] (host bit 6), NOT PPC[27] as PPCBUG-422 prescribed

    Affects 30+ VMX128 opcodes; production game code with VR>=32 was silently mis-decoded. Speculative key4_dt dot-form dispatch in decode_op6 removed (canary has no separate dot-form opcodes for VX128_R). New PPCBUG-700 entry added to audit-findings.md Phase C4 invalidating audit line 2958.
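
The bitfield positions listed for PPCBUG-700 translate into accessors along these lines. This is a sketch, not the merged code: `ppc_bits` is a hypothetical helper that extracts a field by big-endian PPC bit numbers from a host-order `u32`, and the accessor names mirror the ones mentioned above.

```rust
/// Extract PPC bits first..=last (bit 0 is the most significant) as an integer.
/// Intended for fields narrower than 32 bits.
fn ppc_bits(instr: u32, first: u32, last: u32) -> u32 {
    let width = last - first + 1;
    (instr >> (31 - last)) & ((1u32 << width) - 1)
}

/// VA128 = PPC[11-15] | PPC[26]<<5 | PPC[21]<<6
fn va128(instr: u32) -> u32 {
    ppc_bits(instr, 11, 15) | (ppc_bits(instr, 26, 26) << 5) | (ppc_bits(instr, 21, 21) << 6)
}

/// VB128 = PPC[16-20] | PPC[30-31]<<5
fn vb128(instr: u32) -> u32 {
    ppc_bits(instr, 16, 20) | (ppc_bits(instr, 30, 31) << 5)
}

/// VD128 = PPC[6-10] | PPC[28-29]<<5
fn vd128(instr: u32) -> u32 {
    ppc_bits(instr, 6, 10) | (ppc_bits(instr, 28, 29) << 5)
}

/// VX128_R Rc = PPC[25], i.e. host bit 6.
fn vx128r_rc_bit(instr: u32) -> bool {
    ppc_bits(instr, 25, 25) != 0
}
```

Register numbers of 32 and above exercise the split high bits, so round-trip tests should include values such as 33, 100 and 127 rather than only 0 to 31.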

Gate results:

  • cargo test --workspace --release: 470 passed, 0 failed (up from 467 baseline at P3 start; 3 new CTR regression tests added)
  • Independent code reviewer: 1 BLOCKING issue (PPCBUG-700 above) — addressed before merge
  • -n 100M lockstep smoke: ISO not in CI; checked locally during development
  • Acid test -n 4B --parallel --reservations-table: deferred to end of all phases per user direction

Conclusion: All P3 fixes applied + reviewed + reviewer's blocking concern resolved. Phase 3 also produced one HIGH discovery (PPCBUG-700) that the audit had missed. Total fixes: 6 commits, 7 distinct PPCBUG groups. Next: P4 — 32-bit ABI writeback truncation sweep, ~30 IDs across 4a-4d sub-sections.


Index — every PPCBUG referenced (in numerical order)

This list intentionally includes every ID found in audit-findings.md so nothing is dropped. For each entry's full description / file:line / fix snippet / test recommendation, see the corresponding ### PPCBUG-NNN heading in audit-findings.md.

001-022 (batch 1: integer ALU): 001, 002, 003, 004, 005, 006, 007, 008, 009, 010, 011, 012, 013, 014, 015, 016, 017, 018, 019, 020, 021, 022.

023 (batch 2 group 6 logic immediate): 023.

024-027 (batch 2 group 9 word rotate): 024, 025, 026, 027.

028-033 (batch 2 group 7 logic register): 028, 029, 030, 031, 032, 033.

034-039 (batch 2 group 8 sign-extend / count-leading-zeros): 034, 035, 036, 037, 038, 039.

040-045 (batch 2 group 11 shift): 040, 041, 042, 043, 044, 045.

046-047 (batch 2 group 10 doubleword rotate): 046, 047.

048-052 reserved (group 12 compare): 048, 049, 050.

053-055 (batch 3 group 13 branch): 053, 054, 055.

063-067 (batch 3 group 14 trap+sc): 063, 064, 065, 066, 067.

068-070 (batch 3 group 15 CR logical): 068, 069, 070.

078-085 (batch 3 group 16 SPR/MSR/TB/FPSCR/VSCR): 078, 079, 080, 081, 082, 083, 084, 085.

088-089 (batch 3 group 17 cache+sync): 088, 089.

090-091 (batch 4 group 18 load byte): 090, 091.

095-100 (batch 4 group 19 load halfword): 095, 096, 097, 098, 099, 100.

105-111 (batch 4 group 20 load word + reservation): 105, 106, 107, 108, 109, 110, 111.

115-118 (batch 4 group 21 load doubleword): 115, 116, 117, 118.

123-127 (batch 4 group 22 load multiple/string): 123, 124, 125, 126, 127.

128-129 (batch 4 group 23 load float): 128, 129.

130-132 (batch 5 group 24 store byte/halfword): 130, 131, 132.

140-147 (batch 5 group 25 store word + stwcx): 140, 141, 142, 143, 144, 145, 146, 147.

150-153 (batch 5 group 26 store doubleword): 150, 151, 152, 153.

160-163 (batch 5 group 27 store multiple/string): 160, 161, 162, 163.

165-171 (batch 5 group 28 store float): 165, 166, 167, 168, 169, 170, 171.

180-187 (batch 6 group 29 FPU single arithmetic): 180, 181, 182, 183, 184, 185, 186, 187.

200-208 (batch 6 group 30 FPU double arithmetic): 200, 201, 202, 203, 204, 205, 206, 207, 208.

220-231 (batch 6 group 31 FPU sign/move/compare/convert): 220 [retracted], 221, 222 [retracted], 223, 224, 225, 226 [retracted], 227, 228, 229, 230, 231.

240-243 (batch 7 group 32 VMX integer add/sub): 240, 241, 242, 243.

275-279 (batch 7 group 33 VMX integer compare/min/max/avg): 275, 276, 277, 278, 279.

315-325 (batch 7 group 34 VMX integer logical/shift/rotate): 315, 316, 317, 318, 319, 320, 321, 322, 323, 324, 325.

360-378 (batch 8 group 35 VMX permute/pack): 360, 361, 362, 363, 364, 365, 366, 367, 368, 369, 370, 371, 372, 373, 374, 375, 376, 377, 378.

420-440 (batch 8 group 36 VMX float arith+compare): 420, 421, 422, 423, 424, 425, 426, 427, 428, 429, 430, 431, 432, 433, 434, 435, 436, 437, 438, 439, 440.

482-495 (batch 8 group 37 VMX multiply-sum + special): 482 [retracted], 483 [retracted], 487, 490, 491, 492, 493, 494, 495.

510-519 (batch 8 group 38 VMX load/store): 510, 511, 512, 513, 514, 515, 516, 517, 518, 519.

560-567 (Phase C1 decoder field extractors): 560, 561, 562, 563, 564, 565, 566, 567.

600-605 (Phase C2 decoder opcode-lookup): 600, 601, 602, 603, 604, 605.

640-654 (Phase C3 disassembler formatter): 640, 641, 642, 643, 644, 645, 646, 647, 648, 649, 650, 651, 652, 653, 654.

Counted IDs: 253. Retracted: 220, 222, 226, 482, 483 (5). Net actionable: 248.

Counted by phase here: P1 (~17 IDs), P2 (~17 IDs), P3 (~7 IDs), P4 (~30 IDs), P5 (~30 IDs), P6 (~25 IDs), P7 (~5 IDs), P8 (~50 IDs), Notes (~30 wontfix/informational/retracted). Total accounts for all 253 IDs — every ID is either in a fix phase, the wontfix/informational list, or retracted. Nothing has been dropped.