xenia-rs/audit-report-2026-04-29.md
MechaCat02 f424132a5b chore(audit): mark P3 PPCBUGs applied; append P3 progress section
P3 phase merged at f3ebaba. Update audit-findings.md status fields and
append the P3 progress section to audit-report-2026-04-29.md, including
the new PPCBUG-700 discovery (VMX128 register accessor canary-compliance).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-02 11:28:38 +02:00

# PPC Instruction Audit — Triaged Report (2026-04-29)
**Status**: audit complete. **No code modified.** This file is the fix-order plan for the follow-up session.
**Source of truth**: detailed bug entries (one heading per PPCBUG ID) live in `audit-findings.md`. This file references every entry by ID so nothing is lost — it does not duplicate the per-bug detail.
## Counts
- **Total findings**: 253 PPCBUG IDs, of which 5 are explicitly retracted/withdrawn (PPCBUG-220, 222, 226, 482, 483; see Notes section).
- **Net findings**: ~248 actionable.
- **Severity breakdown** (rough):
- HIGH: ~55 (~22%)
- MEDIUM: ~75 (~30%)
- LOW (test gaps + cosmetic + informational): ~118 (~48%)
## Headline findings (most likely Sylpheed-renderer-blockers)
1. **PPCBUG-107 cascade**: `ReservationTable::invalidate_for_write` is defined and unit-tested but never called from any of the **50+ store opcodes** in the interpreter. Under `--parallel`, every cross-thread atomic via `lwarx`/`stwcx.` is silently broken: spinlocks succeed without exclusion, atomic counters race, condition-variable handshakes never sync. Plausible direct cause of the 4-worker-thread renderer plateau (`project_xenia_rs_sylpheed_stage3_2026_04_29.md`). **Fix is mechanical**: one-line `if t.has_active_reservers() { t.invalidate_for_write(ea) }` before every `mem.write_*` in interpreter.rs.
2. **PPCBUG-053+054 cascade**: `bcx`/`bclrx` CTR zero-test compares all 64 bits; `mtspr CTR` writes full 64-bit GPR. Combined with PPCBUG-006 (`negx` poisons GPR upper 32) → **`neg; mtctr; bdnz` loops run forever**.
3. **8 decoder/field-extraction bugs collapse into 6 missing accessors** + 1 wrong sh64 formula + 1 missing decode_op6 dot-form entry. The disassembler already has correct local versions. Single mechanical sweep.
4. **PPCBUG-046 (`clrldi r3, r4, 32`)**: the canonical zero-extend-low-32 idiom is currently a no-op. Emitted constantly by 32-bit-ABI compilers.
5. **PPCBUG-510**: `stvewx128` corrupts 12 adjacent bytes per call.
6. **PPCBUG-424/425**: `vmaddfp128`/`vmaddcfp128` operand swap. Every D3D vertex/pixel shader using FMA with non-aliased operands gets wrong arithmetic.
7. **PPCBUG-360/363**: `vperm128` uses wrong control vector (every D3D shader swizzle); `vpkd3d128` missing post-pack permutation (canonical D3D vertex-pack `pack=1` always wrong).
8. **PPCBUG-275/420-422**: VC-form and VMX128_R-form `rc_bit()` reads bit 0 instead of bit 21/27 → **CR6 never updated for ANY VMX vector compare dot form**. Breaks every `vcmpequb. + bc CR6_all_true` early-exit loop in audio mixing, font rendering, string ops.
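
The failure mode in headline item 2 can be made concrete with a minimal sketch. The function names and the iteration cap below are illustrative assumptions, not the interpreter's actual code; the sketch only contrasts a 64-bit CTR zero test with a 32-bit one after a poisoning `neg`.

```rust
/// `neg` in the shape described for PPCBUG-006: a full 64-bit negate
/// sets the upper 32 bits when the input is a zero-extended 32-bit value.
pub fn neg_64(ra: u64) -> u64 {
    (!ra).wrapping_add(1)
}

/// Buggy shape: decrement CTR and test all 64 bits.
pub fn bdnz_taken_64(ctr: &mut u64) -> bool {
    *ctr = ctr.wrapping_sub(1);
    *ctr != 0
}

/// Fixed shape: decrement and test only the low 32 bits.
pub fn bdnz_taken_32(ctr: &mut u64) -> bool {
    *ctr = (*ctr as u32).wrapping_sub(1) as u64;
    *ctr != 0
}

/// Run a `body; bdnz` loop and count body executions,
/// giving up after `cap` iterations (hypothetical test harness).
pub fn loop_iterations(mut ctr: u64, taken: fn(&mut u64) -> bool, cap: u64) -> u64 {
    let mut n = 0;
    loop {
        n += 1; // loop body
        if !taken(&mut ctr) || n == cap {
            return n;
        }
    }
}
```

With `ctr = neg_64(0xFFFF_FFFD)` the low word is 3 but the upper word is all ones, so the 32-bit test exits after three iterations while the 64-bit test only stops at the cap.
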
## Recommended fix order
The phases below are the recommended fix order for the follow-up session. Each phase is **independently mergeable**; an earlier phase may also clear symptoms attributed to later phases (e.g. P1 by itself could be sufficient to break open the Sylpheed renderer plateau).
After each phase: `cargo test --workspace --release` (must stay at 506+ pass) AND `xenia-rs check sylpheed.iso -n 100M` (must not regress against the 2026-04-29 addis-fix baseline of `swaps=2`). The acid test is whether `draws > 0` opens after P1 or P2.
---
### Phase 1 — Cross-thread atomicity (PPCBUG-107 cascade)
**Why first**: highest confidence smoking-gun for the renderer plateau. Single, mechanical, low-risk fix. Largest leverage relative to size.
**Coupled — must land together**:
- PPCBUG-107 (root: missing call from stores)
- PPCBUG-130 (9 byte/halfword stores)
- PPCBUG-140, 141, 142, 143, 144 (5 word stores: stw/stwu/stwx/stwux/stwbrx)
- PPCBUG-150 (5 doubleword stores: std/stdu/stdx/stdux/stdbrx)
- PPCBUG-160 (3 multiple/string stores: stmw/stswi/stswx)
- PPCBUG-167 (9 FP stores)
- PPCBUG-511, 512, 513, 514 (16 VMX stores)
**Independent but related**:
- PPCBUG-151 (stwcx/stdcx reservation width discriminator) — separate fix; add `reservation_width: u8` to PpcContext.
- PPCBUG-108 (legacy per-context path: cross-thread invalidation impossible) — informational; --reservations-table mode bypasses.
**Approach** — one PR adds `if t.has_active_reservers() { t.invalidate_for_write(ea) }` before every `mem.write_*` call site. Scope:
```
mem.write_u8 / write_u16 / write_u32 / write_u64 / write_f32 / write_f64
mem.write_vec128 / write_vec128_aligned (for VMX)
```
~38 sites total. Add 1+ targeted concurrency tests (lwarx + cross-thread plain store + stwcx., expect EQ=0).
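
As a rough illustration of the approach, the sketch below models a toy reservation table and one store site. Only `has_active_reservers` and `invalidate_for_write` are names taken from this report; the slot layout, the 128-byte granule, `reserve`, `try_commit`, and `store_u32` are assumptions made for the example and do not describe the real `ReservationTable`.

```rust
use std::sync::atomic::{AtomicU64, AtomicUsize, Ordering};

const NONE: u64 = u64::MAX; // sentinel: slot holds no reservation
const GRANULE_MASK: u64 = !127; // assumed 128-byte reservation granule

/// Toy reservation table with one slot per guest thread.
pub struct ReservationTable {
    slots: Vec<AtomicU64>,
    active: AtomicUsize,
}

impl ReservationTable {
    pub fn new(threads: usize) -> Self {
        Self {
            slots: (0..threads).map(|_| AtomicU64::new(NONE)).collect(),
            active: AtomicUsize::new(0),
        }
    }

    /// lwarx: record a reservation for `thread` on the granule of `ea`.
    /// The counter is bumped first so it never under-counts.
    pub fn reserve(&self, thread: usize, ea: u64) {
        self.active.fetch_add(1, Ordering::SeqCst);
        if self.slots[thread].swap(ea & GRANULE_MASK, Ordering::SeqCst) != NONE {
            self.active.fetch_sub(1, Ordering::SeqCst); // replaced an older one
        }
    }

    /// Cheap check so stores skip the scan when nobody holds a reservation.
    pub fn has_active_reservers(&self) -> bool {
        self.active.load(Ordering::SeqCst) != 0
    }

    /// Plain store: drop every reservation on the written granule.
    pub fn invalidate_for_write(&self, ea: u64) {
        let g = ea & GRANULE_MASK;
        for slot in &self.slots {
            if slot.compare_exchange(g, NONE, Ordering::SeqCst, Ordering::SeqCst).is_ok() {
                self.active.fetch_sub(1, Ordering::SeqCst);
            }
        }
    }

    /// stwcx.: succeeds only if the reservation on `ea` is still intact.
    pub fn try_commit(&self, thread: usize, ea: u64) -> bool {
        let old = self.slots[thread].swap(NONE, Ordering::SeqCst);
        if old != NONE {
            self.active.fetch_sub(1, Ordering::SeqCst);
        }
        old == (ea & GRANULE_MASK)
    }
}

/// The Phase 1 pattern at a single store site (big-endian guest memory).
pub fn store_u32(t: &ReservationTable, mem: &mut [u8], ea: u64, value: u32) {
    if t.has_active_reservers() {
        t.invalidate_for_write(ea);
    }
    let i = ea as usize;
    mem[i..i + 4].copy_from_slice(&value.to_be_bytes());
}
```

A targeted test in this style reserves on thread 0, performs a plain store to the same granule, and then expects the conditional store to fail. A real implementation would also have to make the `stwcx.` check and its memory write atomic with respect to other stores, which this sketch does not attempt.
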
---
### Phase 2 — Decoder/field-extraction structural sweep
**Why second**: single mechanical sweep, fixes 12 distinct HIGH-severity findings, unblocks correct execution of compiler-emitted code. Disassembler already has correct local extraction logic — promote/port.
**Coupled — same commit**:
- PPCBUG-040 + PPCBUG-560 — fix `sh64()` bit order AND fix the test helper that was masking it
- PPCBUG-046 + PPCBUG-561 — promote `mb_md()` from `disasm.rs:1256` to `decoder.rs`; replace 6 inline-formula sites in interpreter.rs (rldicl/rldicr/rldic/rldimi/rldcl/rldcr)
- PPCBUG-275 + PPCBUG-276 + PPCBUG-420 + PPCBUG-421 + PPCBUG-422 + PPCBUG-562 — add `vc_rc_bit()` (PPC bit 21) and `vx128r_rc_bit()` (PPC bit 27); replace `instr.rc_bit()` at all VMX compare dot-form sites
- PPCBUG-315 + PPCBUG-563 — add `vx128_4_z()`, `vx128_4_imm()`; fix `vrlimi128`
- PPCBUG-361 + PPCBUG-565 — add `vx128_5_sh()`; fix `vsldoi128`
- PPCBUG-362 + PPCBUG-564 — add `vx128_p_perm()`; fix `vpermwi128`
- PPCBUG-423 + PPCBUG-600 — add 5 odd-key entries to `decode_op6` key4 for `vcmp*fp128.` dot forms
**Independent in this phase**:
- PPCBUG-360 — `vperm128` reads VC from `vd128()` instead of VX128_2 VC field at integer bits 6-8. Fix at the call site (or add `vx128_2_vc()` accessor).
- PPCBUG-363 + PPCBUG-369 — `vpkd3d128` missing post-pack permutation; add the `pack`/`shift` field handling per Canary.
**Test fixture updates required** (PPCBUG-560 lesson) — once `sh64()` is fixed, verify all `disasm_goldens.rs` test helpers encode shifts ISA-correctly. Don't trust the existing fixtures blindly.
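
For orientation, a minimal sketch of the two split-field accessors follows. The bit positions reflect the Power ISA MD/XS-form layout (PPC bit `p` is host bit `31 - p`); the signatures taking a raw `u32` and the `rldicl` helper are assumptions for illustration and may not match `decoder.rs`. The word `0x7883_0020` used in the note is a hand-assembled `clrldi r3, r4, 32`.

```rust
/// 6-bit shift amount of MD/XS-form instructions: the low five bits sit
/// in PPC bits 16-20 and the high bit of the value sits in PPC bit 30.
pub fn sh64(instr: u32) -> u32 {
    ((instr >> 11) & 0x1F) | (((instr >> 1) & 1) << 5)
}

/// 6-bit mask-begin of MD-form instructions: the low five bits sit in
/// PPC bits 21-25 and the high bit of the value sits in PPC bit 26.
pub fn mb_md(instr: u32) -> u32 {
    ((instr >> 6) & 0x1F) | (((instr >> 5) & 1) << 5)
}

/// rldicl: rotate left by `sh`, then clear the `mb` high-order bits.
pub fn rldicl(rs: u64, instr: u32) -> u64 {
    rs.rotate_left(sh64(instr)) & (u64::MAX >> mb_md(instr))
}
```

Decoding `0x7883_0020` with these helpers yields `sh = 0` and `mb = 32`, so `rldicl` clears the upper word, which is the zero-extend behaviour PPCBUG-046 reports as missing.
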
---
### Phase 3 — Other HIGH bugs (single targeted fixes)
**Independent**:
- PPCBUG-510 — `stvewx128` corrupting 12 bytes per call. Direct fix: align EA to word, write only 4 bytes.
- PPCBUG-424 (`vmaddfp128` operand order): change `ai.mul_add(bi, di)` → `ai.mul_add(di, bi)`.
- PPCBUG-425 — `vmaddcfp128` operand order similarly.
- PPCBUG-053 + PPCBUG-054 — `bcx`/`bclrx` CTR zero-test (32-bit) + `mtspr CTR` truncation (defensive firewall). Coupled.
- PPCBUG-640 — `fmt_bc` spurious condition suffix on pure `bdnz`/`bdz`. Port the `fmt_bclr` pattern.
- PPCBUG-641 — `lwsync` shows as `sync` in disassembler (re-assessment of PPCBUG-088). Same fix.
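
The PPCBUG-510 fix shape (align the EA to a word, write only 4 bytes) could look roughly like the sketch below. The flat `&mut [u8]` memory view, the `[u8; 16]` register representation in guest byte order, and the function name are assumptions for the example, not the interpreter's real types.

```rust
/// Store the word element of `vs` selected by the effective address.
/// `vs` holds the register bytes in guest (big-endian) order and `mem`
/// is a flat view of guest memory; both are illustrative types.
pub fn stvewx(mem: &mut [u8], ea: u64, vs: &[u8; 16]) {
    let ea = (ea & !3) as usize; // align down to a 4-byte boundary
    let lane = (ea & 0xF) / 4; // word element 0..=3 within the quadword
    mem[ea..ea + 4].copy_from_slice(&vs[lane * 4..lane * 4 + 4]);
}
```

A regression test for this bug would fill memory with a sentinel pattern, perform one store, and assert that exactly four bytes changed.
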
---
### Phase 4 — 32-bit ABI writeback truncation sweep
**Why this phase**: cross-cutting, mechanical. Once ALL writebacks truncate via `as u32 as u64`, the systemic 32-bit-ABI invariant is restored and most CR0/CA helper-correctness concerns become moot.
#### 4a — Active poisoning (every execution corrupts GPR upper bits)
These bugs corrupt GPR upper bits **regardless** of whether upstream sources are clean — typically because the implementation applies Rust's `!u64` (full 64-bit NOT) somewhere:
- PPCBUG-006 (negx — `(!ra).wrapping_add(1)`)
- PPCBUG-008 (subfex — `(!ra).wrapping_add(rb).wrapping_add(ca)`)
- PPCBUG-018 (subfzex)
- PPCBUG-019 (subfmex)
- PPCBUG-028 (orcx — `rs | !rb`)
- PPCBUG-029 (norx — `!(rs | rb)` — the canonical `not` mnemonic, hot path)
- PPCBUG-030 (nandx)
- PPCBUG-031 (eqvx — `!(rs ^ rb)` — common `eqv rA, rA, rA` set-to-all-ones)
- PPCBUG-033 (andcx via `!rb`)
- PPCBUG-034 (extsbx — `as i8 as i64 as u64`)
- PPCBUG-035 (extshx)
#### 4b — Same-shape-as-addis (latent under clean inputs, active when upstream is poisoned)
- PPCBUG-001 (addi), PPCBUG-002 (addic), PPCBUG-003 (addicx), PPCBUG-005 (subficx), PPCBUG-007 (subfcx CA), PPCBUG-008 (subfex CA — also in 4a)
- PPCBUG-004 (mulli), PPCBUG-009 (mullwx)
- PPCBUG-010 + PPCBUG-011 (divwx writeback + CR0 — **must land together**, not independently)
- PPCBUG-041 + PPCBUG-042 + PPCBUG-043 (srawx/srawix writeback + CR0 coupling — **must land together**)
- PPCBUG-095, 096, 097, 098 (lha/lhax/lhau/lhaux halfword sign-extension)
- PPCBUG-105 (lwa/lwax/lwaux — note: 64-bit-mode-only; less common in 32-bit-ABI binaries)
#### 4c — Latent writeback (only triggers if 4a/4b are unfixed)
These can be fixed in the same sweep but won't fire under clean inputs:
- PPCBUG-012, 013, 014, 015, 016, 017 (addx/addcx/addex/addzex/addmex/subfx)
- PPCBUG-032 (andx/orx/xorx)
#### 4d — CR0 32-bit-ABI compare (cross-cutting catch-all)
PPCBUG-020 documents the catch-all; the per-opcode locations are referenced from there:
- PPCBUG-020 (catch-all in groups 2-5)
- PPCBUG-023 (andisx)
- PPCBUG-024 (rlwinmx), PPCBUG-025 (rlwimix), PPCBUG-026 (rlwnmx)
- PPCBUG-036 (extsbx), PPCBUG-037 (extshx) — **must land with PPCBUG-034/035**
- PPCBUG-044 (slwx/srwx)
**Fix shape** — at every Rc=1 path, change `update_cr_signed(0, result as i64)` to `update_cr_signed(0, result as u32 as i32 as i64)`. Once 4a/4b/4c land, both forms become equivalent and 4d becomes belt-and-suspenders (still recommended for resilience).
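
A small sketch of the two casts involved, using `neg` as the 4a example. The helper names are hypothetical and CR0 is reduced to an `(LT, GT, EQ)` tuple for brevity; the real `update_cr_signed` also handles SO.

```rust
/// 4a shape: the 64-bit NOT sets the upper 32 bits on every execution.
pub fn neg_poisoning(ra: u64) -> u64 {
    (!ra).wrapping_add(1)
}

/// Proposed writeback: truncate to 32 bits, then zero-extend.
pub fn neg_truncated(ra: u64) -> u64 {
    (!ra).wrapping_add(1) as u32 as u64
}

/// CR0 (LT, GT, EQ) computed from the full 64-bit value.
pub fn cr0_64(result: u64) -> (bool, bool, bool) {
    let v = result as i64;
    (v < 0, v > 0, v == 0)
}

/// CR0 (LT, GT, EQ) computed from the low 32 bits, sign-extended.
pub fn cr0_32(result: u64) -> (bool, bool, bool) {
    let v = result as u32 as i32 as i64;
    (v < 0, v > 0, v == 0)
}
```

The value `0x8000_0000` shows why 4d matters on its own: viewed as 64 bits it is positive, but as a 32-bit result it is negative, so the two CR0 helpers disagree until the compare is narrowed.
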
---
### Phase 5 — FPU correctness (graphics middleware impact)
#### 5a — Round-to-int and FPSCR.RN
- PPCBUG-221 + PPCBUG-227 (`round_to_i64` NearestEven broken near 2^52 — must land together; `round_to_i32` delegates)
- PPCBUG-201 (FPSCR.RN not honored for double arithmetic)
- PPCBUG-432 (vrfin/vrfin128 round-half-away-from-zero vs round-to-nearest-even)
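
To illustrate why round-to-nearest-even needs care near 2^52, the sketch below contrasts a common add-half-and-floor shortcut with a tie-aware version. The shortcut is shown only for contrast and is not a quote of the audited `round_to_i64`; both function names are hypothetical.

```rust
const TWO_52: f64 = 4503599627370496.0; // 2^52: every f64 at or above this is integral

/// Shortcut that misbehaves: `x + 0.5` is itself rounded once the
/// magnitude reaches 2^52, and exact ties always round upward.
pub fn round_naive(x: f64) -> f64 {
    (x + 0.5).floor()
}

/// Round to nearest, ties to even, without the intermediate add.
pub fn round_nearest_even(x: f64) -> f64 {
    if !x.is_finite() || x.abs() >= TWO_52 {
        return x; // NaN, infinity, or already an integer
    }
    let lo = x.floor();
    let frac = x - lo; // exact for |x| < 2^52
    if frac > 0.5 || (frac == 0.5 && lo % 2.0 != 0.0) {
        lo + 1.0
    } else {
        lo
    }
}
```

For the input 2^52 + 1, which is already an integer, the shortcut returns 2^52 + 2 because the intermediate sum is not representable, while the tie-aware version returns the input unchanged.
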
#### 5b — VXISI / NaN / SNaN handling for FMA family
- PPCBUG-181, 182 (single fmaddsx/fmsubsx/fnmaddsx/fnmsubsx VXISI)
- PPCBUG-202, 203, 204 (double fmaddx/fmsubx/fnmaddx/fnmsubx VXISI — esp. 203 hot for Newton-Raphson)
- PPCBUG-183, 205 (fnmadd/fnmsub Rust unary `-` flips NaN sign — fix: skip negation on NaN)
- PPCBUG-186 (SNaN priority for FMA)
- PPCBUG-128 (lfs SNaN quietening — bit-manipulation widening helper needed)
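
The "skip negation on NaN" fix for PPCBUG-183/205 can be sketched as follows. The helper names and the plain `f64` signature are assumptions; the sketch leaves out FPSCR updates and the VXISI and SNaN handling covered by the other entries in this group.

```rust
/// Negate a result unless it is a NaN, so the NaN sign bit is left alone.
pub fn negate_unless_nan(r: f64) -> f64 {
    if r.is_nan() {
        r
    } else {
        -r
    }
}

/// fnmadd sketch: -(a * c + b), with the negation skipped for NaN results.
pub fn fnmadd(a: f64, c: f64, b: f64) -> f64 {
    negate_unless_nan(a.mul_add(c, b))
}
```

Rust's unary `-` flips the sign bit unconditionally, which is why the NaN check has to come before the negation rather than after it.
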
#### 5c — Inexact / FPSCR exception bits
- PPCBUG-180 (single XX/FR/FI never set), PPCBUG-200 (double XX/FR/FI never set)
- PPCBUG-223 (fcmpo VXSNAN/VXVC), PPCBUG-224 (fcfidx XX), PPCBUG-225 (frspx XX/FR/FI), PPCBUG-229 (fctidx/fctidzx XX/FX), PPCBUG-230 (fctiwx/fctiwzx XX/FX), PPCBUG-231 (frspx SNaN host dependency)
- PPCBUG-165 + PPCBUG-166 + PPCBUG-168 (stfs* FPSCR + RN + SNaN)
#### 5d — Subnormal flush (FPSCR.NI / VSCR.NJ)
- PPCBUG-185 (FPU NI subnormal flush not modeled)
- PPCBUG-435, 436, 437 (VMX NJ subnormal flush — vaddfp/vsubfp/vmulfp128, vmsum3fp128/vmsum4fp128 product intermediates, vmaddfp/vmaddfp128/vmaddcfp128/vnmsubfp128 outputs)
#### 5e — Estimate precision (vs hardware ~12-bit)
- PPCBUG-184 (fres)
- PPCBUG-428..431 (vrefp, vrsqrtefp, vexptefp, vlogefp — same shape as fres)
#### 5f — VMX float compares + saturation
- PPCBUG-426, 427 (vnmsubfp/vnmsubfp128 double-rounding)
- PPCBUG-433 (vctsxs/vcfpsxws128 NaN saturate to INT_MIN)
---
### Phase 6 — Other MEDIUM correctness
- PPCBUG-021 (overflow.rs OE checks at bit 63 — sub-register ops; partly covered by P4)
- PPCBUG-022 (`mulld_ov` missing INT_MIN × -1)
- PPCBUG-027 (rlwimix upper-32 ISA-deviation — auto-resolves once P4 lands)
- PPCBUG-039 (cntlzdx 32-bit-ABI counts upper-zero — only matters if emitted)
- PPCBUG-063 (trap pc-after-advance)
- PPCBUG-064 (sc LEV field)
- PPCBUG-065 (twi 31, r0, IMM typed-trap — relevant to Sylpheed C++ throw work, see `project_xenia_rs_sylpheed_throw_2026_04_28.md`)
- PPCBUG-068 (mcrfs VX summary recomputation)
- PPCBUG-078 (mtmsrd L=1 partial MSR-write)
- PPCBUG-080 (mfvscr zero upper 96 bits)
- PPCBUG-123 + PPCBUG-124 + PPCBUG-161 + PPCBUG-566 (XER TBC for lswx/stswx — coupled; add `xer_tbc: u8` to PpcContext, wire into xer()/set_xer(); enables lswx and stswx)
- PPCBUG-125 (lmw RA-in-destination skip)
- PPCBUG-126 + PPCBUG-162 (lswi/stswi `instr.rb()` → `instr.nb()`)
- PPCBUG-487 + PPCBUG-495 (vsum* operand naming)
- PPCBUG-515 (lvebx/lvehx/lvewx vs Canary divergence — document; xenia-rs is more ISA-faithful)
- PPCBUG-516 (lvsr sh=0 case — add comment + debug_assert)
- PPCBUG-601 (decode_op6 overlapping windows — document the invariant)
- PPCBUG-642 (fmt_bcctr extended forms)
- PPCBUG-643 + PPCBUG-644 (SIMM/D-form decimal vs hex — alignment with Canary disassembly)
- PPCBUG-367 (vupkhpx/vupklpx channel replication vs zero-extend)
- PPCBUG-368 (vpkpx pack_pixel_555 channel assignment unverified)
- PPCBUG-366 (vspltisb/vspltish sign-extension idiom — fragile, not wrong)
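
For PPCBUG-022, one way to cover the `INT_MIN × -1` case is to lean on the standard library's overflow detection. The signature below is hypothetical and may not match the existing `mulld_ov` helper in overflow.rs.

```rust
/// Signed 64-bit multiply returning the wrapped product and an OV flag.
/// `i64::overflowing_mul` reports overflow for every out-of-range
/// product, including `i64::MIN * -1`.
pub fn mulld_ov(ra: u64, rb: u64) -> (u64, bool) {
    let (product, ov) = (ra as i64).overflowing_mul(rb as i64);
    (product as u64, ov)
}
```
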
---
### Phase 7 — Frozen-snapshot drift (separate sweep)
8 opcodes' frozen snapshots in `ppc-manual/<cat>/<op>.md` differ from live code:
- PPCBUG-066 (td/tdi/tw/twi)
- PPCBUG-117 (ldarx)
- PPCBUG-145 (stwcx)
- PPCBUG-560 (already-listed: rldicl test helper bit-order)
- Plus the implicit drift in addicx (PPCBUG-003), andisx (PPCBUG-023), cmp/cmpi (PPCBUG-050), extsbx/extshx (PPCBUG-036/037, PPCBUG-032 in batch 1)
**Recommendation**: regenerate frozen snapshots from current code for the entire ppc-manual after Phases 1-4 land. Add a CI check that compares snapshots vs live code on every PR.
---
### Phase 8 — Test gap closure (broad)
A separate PR per group is overkill; bundle test additions with each Phase 1-6 PR instead (test the bug being fixed). The remaining LOW IDs are pure test-gap entries:
- PPCBUG-045 (shift), 047 (rld), 055 (branch), 067 (trap+sc), 070 (CR logical)
- PPCBUG-081, 082, 083, 084, 085 (SPR/MSR/TB/FPSCR/VSCR moves), 089 (cache+sync)
- PPCBUG-091 (lbz), 100 (lha), 109, 110, 111 (lwa/lwbrx/lwarx), 118 (ld), 127 (lmw/lswi/lswx), 129 (lfs/lfd)
- PPCBUG-132 (stb/sth), 146, 147 (stw/stwcx), 153 (std/stdcx), 163 (stmw/stswi/stswx), 171 (stfs/stfd)
- PPCBUG-187 (FPU single), 208 (FPU double), 228 (FPU misc convert)
- PPCBUG-240 (VMX add/sub), 243 (VMX sat helpers)
- PPCBUG-277, 278, 279 (VMX compare/min/max/avg)
- PPCBUG-316, 317, 320, 321, 322, 323, 324, 325 (VMX shift/rotate/logical)
- PPCBUG-370, 371, 372, 373, 374, 375, 376, 377, 378 (VMX permute/pack)
- PPCBUG-438, 439, 440 (VMX float compare/round/convert)
- PPCBUG-490, 491, 492, 493, 494 (VMX multiply-sum)
- PPCBUG-517, 518, 519 (VMX load/store)
- PPCBUG-567 (decoder accessors)
- PPCBUG-604 (decoder dispatch tables)
- PPCBUG-649, 650, 652 (golden fixtures for branches/VMX128)
---
## Notes & administrative
### Withdrawn / retracted
- **PPCBUG-220** — `fctiwx` strict-`>` threshold actually correct (`i32::MAX` exactly representable in f64). Retracted by group-31 subagent.
- **PPCBUG-222** — `fctidx` positive-overflow sentinel `0x7FFF_FFFF_FFFF_FFFF` is the correct ISA value. Retracted.
- **PPCBUG-226** — FPRF 5-bit codes for fcmpu/fcmpo are correct per PowerISA. Retracted.
- **PPCBUG-482** — `vmhaddshs` shift `>>15` is correct per spec snapshots. Retracted.
- **PPCBUG-483** — `vmhraddshs` shift `>>15` is correct per spec snapshots. Retracted.
### Wontfix / informational (not retracted but no fix needed)
- **PPCBUG-038** — extswx ISA-correct, intentional 64-bit sign-extension. Document the asymmetry with extsb/extsh after PPCBUG-034/035 land.
- **PPCBUG-090, 099, 152** — invalid-form (rD==rA) silently destroys load/store result. Per ISA: undefined behavior. No compiler emits these; matches Canary. Optional `debug_assert!`.
- **PPCBUG-106, 115, 131, 169, 170, 206, 207, 318, 319, 364, 365, 434, 651, 653, 645, 646, 648** — informational confirmations that the implementation is correct, no change needed.
- **PPCBUG-069** — test comment OX(so)=0 is wrong but the assert is correct.
- **PPCBUG-602, 603, 605** — undocumented decoder dispatch quirks; correct but should add comments.
- **PPCBUG-647, 654** — disassembler edge-case behavior on invalid encodings; not-a-bug for valid input.
### Coupling matrix (must-land-together)
| Group | IDs | Reason |
|---|---|---|
| divwx | 010, 011 | Quotient zero-extension changes the CR0 sign view |
| srawx/srawix | 041, 042, 043 | Writeback truncation invalidates the CR0 view |
| extsbx/extshx | 034+036, 035+037 | Same coupling shape as srawx |
| sh64 | 040, 560 | Test helper is wrong in the inverse direction |
| mb_md sweep | 046, 561 | Promote disasm.rs accessor first |
| VC-form Rc | 275, 276, 420, 421, 562 | All consume the same new accessor |
| VMX128_R Rc | 422, 562 | Same accessor sweep |
| vrlimi128 | 315, 563 | Field accessor + caller fix |
| vsldoi128 | 361, 565 | Field accessor + caller fix |
| vpermwi128 | 362, 564 | Field accessor + caller fix |
| vcmp*fp128. | 423, 600 | decode_op6 odd keys + opcode mapping |
| XER TBC | 123, 124, 161, 566 | Add field, wire xer()/set_xer(), enables lswx/stswx |
| round_to_i64 | 221, 227 | round_to_i32 delegates |
| stfs FPSCR | 165, 166, 168 | Single fix shape covers all three |
### Dependency on the addis fix
The addis fix (`project_xenia_rs_addis_signext_root_cause_2026_04_29.md`) is already in place. Phase 4 generalizes that fix systematically; without it, the writeback-truncation invariant would still be incomplete.
### Anticipated impact on the Sylpheed renderer plateau
Strong candidates for direct cause of the plateau:
- **PPCBUG-107** — broken atomics. Workers wait forever on never-signaled events; classical broken-spinlock symptom.
- **PPCBUG-053+054** — broken `bdnz` loops; could explain workers parked indefinitely.
- **PPCBUG-046 (`clrldi r3, r4, 32`)** — pollution propagation in 32-bit ABI; could break any pointer-clean-up sequence.
After applying Phase 1 alone, run `xenia-rs check sylpheed.iso -n 4B --parallel` and check whether `draws > 0`. If yes, the plateau was atomics; if no, proceed to P2/P3.
---
## Progress log
### P1 — Cross-thread atomicity sweep (merged 2026-05-01, HEAD ca5b90b)
**PPCBUGs fixed**: 107, 130, 140, 141, 142, 143, 144, 150, 160, 167, 511, 512, 513, 514, 151, 108. Plus review-fix additions: dcbz, dcbz128, stswi two-line, stswx two-line (merged in review-fix commit c9f194d).
**Gate results**:
- `cargo test --workspace --release`: 449 passed, 0 failed
- `-n 100M` lockstep: swaps=2, clean
- `-n 100M --parallel --reservations-table`: swaps=2, clean
- **Acid test** `-n 4B --parallel --reservations-table`: swaps=2, draws=**0**, no RtlRaiseException, no panics
**Conclusion**: P1 did NOT unblock the Sylpheed renderer. `draws` remains 0. The renderer plateau is not caused by broken cross-thread atomics alone. Proceeding to P2 (decoder/field-extraction sweep). The strongest remaining candidate per the plan is PPCBUG-046 (`clrldi r3, r4, 32` no-op).
---
### P2 — Decoder/field-extraction structural sweep (merged 2026-05-01, HEAD see `git log master --oneline -1`)
**PPCBUGs fixed**: 040, 046, 275, 276, 315, 360, 361, 362, 363, 369, 420, 421, 422, 423, 560, 561, 562, 563, 564, 565, 600.
**Batches**:
- Batch 1: PPCBUG-040+560 — sh64() bit-order fix (XS-form SH split) + rldicl test helper encoding
- Batch 2: PPCBUG-046+561 — mb_md() accessor; all 6 rld* MB fields corrected (clrldi was a no-op)
- Batch 3: PPCBUG-275+276+420+421+422+423+562+600 — vc_rc_bit()/vx128r_rc_bit() Rc accessors; 13 vcmp interpreter sites; 5 decode_op6 dot-form entries
- Batch 4: PPCBUG-315+563 — vrlimi128 vx128_4_z/imm field extraction
- Batch 5: PPCBUG-361+565 — vsldoi128 vx128_5_sh field extraction
- Batch 6: PPCBUG-362+564 — vpermwi128 vx128_p_perm field extraction
- Batch 7: PPCBUG-360 — vperm128 vc128_2() accessor (was erroneously vd128())
- Batch 8: PPCBUG-363+369 — vpkd3d128 post-pack permutation (MakePermuteMask tables from canary)
**Gate results**:
- `cargo test --workspace --release`: 201 (cpu) + 6 (disasm goldens) + 144 + 76 + 16 + 8 + … passed, 0 failed
- Independent code reviewer: all 9 check items OK
- `-n 100M` lockstep smoke: ISO not available in CI environment; last known good at P1 HEAD was swaps=2
- **Acid test** `-n 4B --parallel --reservations-table`: pending (ISO not in CI environment)
**Conclusion**: All P2 fixes applied and reviewed. Decoder field extraction is now correct for all audited VMX128 and MD/XS-form instructions. Whether P2 unblocks the renderer (`draws > 0`) requires the sylpheed.iso acid test on the user's machine. PPCBUG-046 (clrldi no-op fix) was the highest-probability P2 renderer-unblock candidate. Next: P3 — isolated HIGH bugs (PPCBUG-510, 424/425, 053+054, 640, 641).
---
### P3 — Isolated HIGH bugs (merged 2026-05-02, HEAD f3ebaba)
**PPCBUGs fixed**: 053+054 (coupled CTR 32-bit), 424+425 (vmaddfp128/vmaddcfp128 operand swap), 510 (stvewx128 corruption), 640+650 (bdnz/bdz suffix), 641+649 (sync/lwsync), **700 (NEW)**.
**Batches**:
- Batch 1: PPCBUG-510 — stvewx128 16-byte corruption fixed (word-align EA, extract lane, write 4 bytes)
- Batch 2: PPCBUG-424+425 + PPCBUG-700 partial (va128 PPC[11-15] partial fix) — vmaddfp128/vmaddcfp128 operand swap to VA*VD+VB
- Batch 3: PPCBUG-053+054 — bcx/bclrx 32-bit CTR compare + mtspr CTR truncation
- Batch 4: PPCBUG-640+650 — fmt_bc spurious bdnzge/bdzge suffix gated on `!uncond`
- Batch 5: PPCBUG-641+649 — sync/lwsync L-field disambiguation
- Phase review fix: **PPCBUG-700 (NEW)** — VMX128 register accessors (va128/vb128/vd128/vx128r_rc_bit) rewritten to canary's bitfield positions. Audit's "confirmed-clean" line-2958 assessment was based on miscounting LSB-first packed C++ bitfields. Per canary (`xenia-canary/src/xenia/cpu/ppc/ppc_decode_data.h:484-663`):
- VA128 = PPC[11-15] | PPC[26]<<5 | PPC[21]<<6 (3 fields, 7 bits)
- VB128 = PPC[16-20] | PPC[30-31]<<5
- VD128 = PPC[6-10] | PPC[28-29]<<5
- VX128_R Rc = PPC[25] (host bit 6) — NOT PPC[27] as PPCBUG-422 prescribed
Affects 30+ VMX128 opcodes; production game code with VR>=32 was silently mis-decoded. Speculative `key4_dt` dot-form dispatch in `decode_op6` removed (canary has no separate dot-form opcodes for VX128_R). New PPCBUG-700 entry added to `audit-findings.md` Phase C4 invalidating audit line 2958.
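
The bitfield positions listed above translate into shifts and masks as in the sketch below (PPC bit `p` is host bit `31 - p`). The free-function signatures over a raw `u32` are an assumption for illustration; the real accessors live on the decoder's instruction type.

```rust
/// VA128 = PPC[11-15] | PPC[26] << 5 | PPC[21] << 6
pub fn va128(i: u32) -> u32 {
    ((i >> 16) & 0x1F) | (((i >> 5) & 1) << 5) | (((i >> 10) & 1) << 6)
}

/// VB128 = PPC[16-20] | PPC[30-31] << 5
pub fn vb128(i: u32) -> u32 {
    ((i >> 11) & 0x1F) | ((i & 0x3) << 5)
}

/// VD128 = PPC[6-10] | PPC[28-29] << 5
pub fn vd128(i: u32) -> u32 {
    ((i >> 21) & 0x1F) | (((i >> 2) & 0x3) << 5)
}

/// VX128_R Rc = PPC[25], i.e. host bit 6
pub fn vx128r_rc_bit(i: u32) -> bool {
    ((i >> 6) & 1) != 0
}
```

Setting a single high bit in a test word is a quick way to check each accessor: host bit 5 alone should decode as VA128 = 32, and host bit 10 alone as VA128 = 64.
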
**Gate results**:
- `cargo test --workspace --release`: **470 passed, 0 failed** (up from 467 baseline at P3 start; 3 new CTR regression tests added)
- Independent code reviewer: 1 BLOCKING issue (PPCBUG-700 above) — addressed before merge
- `-n 100M` lockstep smoke: ISO not in CI; checked locally during development
- **Acid test** `-n 4B --parallel --reservations-table`: **deferred to end of all phases** per user direction
**Conclusion**: All P3 fixes applied + reviewed + reviewer's blocking concern resolved. Phase 3 also produced one HIGH discovery (PPCBUG-700) that the audit had missed. Total fixes: 6 commits, 7 distinct PPCBUG groups. Next: P4 — 32-bit ABI writeback truncation sweep, ~30 IDs across 4a-4d sub-sections.
---
## Index — every PPCBUG referenced (in numerical order)
This list intentionally includes every ID found in `audit-findings.md` so nothing is dropped. For each entry's full description / file:line / fix snippet / test recommendation, see the corresponding `### PPCBUG-NNN` heading in `audit-findings.md`.
001-022 (batch 1: integer ALU): 001, 002, 003, 004, 005, 006, 007, 008, 009, 010, 011, 012, 013, 014, 015, 016, 017, 018, 019, 020, 021, 022.
023 (batch 2 group 6 logic immediate): 023.
024-027 (batch 2 group 9 word rotate): 024, 025, 026, 027.
028-033 (batch 2 group 7 logic register): 028, 029, 030, 031, 032, 033.
034-039 (batch 2 group 8 sign-extend / count-leading-zeros): 034, 035, 036, 037, 038, 039.
040-045 (batch 2 group 11 shift): 040, 041, 042, 043, 044, 045.
046-047 (batch 2 group 10 doubleword rotate): 046, 047.
048-052 reserved (group 12 compare): 048, 049, 050.
053-055 (batch 3 group 13 branch): 053, 054, 055.
063-067 (batch 3 group 14 trap+sc): 063, 064, 065, 066, 067.
068-070 (batch 3 group 15 CR logical): 068, 069, 070.
078-085 (batch 3 group 16 SPR/MSR/TB/FPSCR/VSCR): 078, 079, 080, 081, 082, 083, 084, 085.
088-089 (batch 3 group 17 cache+sync): 088, 089.
090-091 (batch 4 group 18 load byte): 090, 091.
095-100 (batch 4 group 19 load halfword): 095, 096, 097, 098, 099, 100.
105-111 (batch 4 group 20 load word + reservation): 105, 106, 107, 108, 109, 110, 111.
115-118 (batch 4 group 21 load doubleword): 115, 116, 117, 118.
123-127 (batch 4 group 22 load multiple/string): 123, 124, 125, 126, 127.
128-129 (batch 4 group 23 load float): 128, 129.
130-132 (batch 5 group 24 store byte/halfword): 130, 131, 132.
140-147 (batch 5 group 25 store word + stwcx): 140, 141, 142, 143, 144, 145, 146, 147.
150-153 (batch 5 group 26 store doubleword): 150, 151, 152, 153.
160-163 (batch 5 group 27 store multiple/string): 160, 161, 162, 163.
165-171 (batch 5 group 28 store float): 165, 166, 167, 168, 169, 170, 171.
180-187 (batch 6 group 29 FPU single arithmetic): 180, 181, 182, 183, 184, 185, 186, 187.
200-208 (batch 6 group 30 FPU double arithmetic): 200, 201, 202, 203, 204, 205, 206, 207, 208.
220-231 (batch 6 group 31 FPU sign/move/compare/convert): 220 [retracted], 221, 222 [retracted], 223, 224, 225, 226 [retracted], 227, 228, 229, 230, 231.
240-243 (batch 7 group 32 VMX integer add/sub): 240, 241, 242, 243.
275-279 (batch 7 group 33 VMX integer compare/min/max/avg): 275, 276, 277, 278, 279.
315-325 (batch 7 group 34 VMX integer logical/shift/rotate): 315, 316, 317, 318, 319, 320, 321, 322, 323, 324, 325.
360-378 (batch 8 group 35 VMX permute/pack): 360, 361, 362, 363, 364, 365, 366, 367, 368, 369, 370, 371, 372, 373, 374, 375, 376, 377, 378.
420-440 (batch 8 group 36 VMX float arith+compare): 420, 421, 422, 423, 424, 425, 426, 427, 428, 429, 430, 431, 432, 433, 434, 435, 436, 437, 438, 439, 440.
482-495 (batch 8 group 37 VMX multiply-sum + special): 482 [retracted], 483 [retracted], 487, 490, 491, 492, 493, 494, 495.
510-519 (batch 8 group 38 VMX load/store): 510, 511, 512, 513, 514, 515, 516, 517, 518, 519.
560-567 (Phase C1 decoder field extractors): 560, 561, 562, 563, 564, 565, 566, 567.
600-605 (Phase C2 decoder opcode-lookup): 600, 601, 602, 603, 604, 605.
640-654 (Phase C3 disassembler formatter): 640, 641, 642, 643, 644, 645, 646, 647, 648, 649, 650, 651, 652, 653, 654.
**Counted IDs**: 253. **Retracted**: 220, 222, 226, 482, 483 (5). **Net actionable**: 248.
**Counted by phase here**: P1 (~17 IDs), P2 (~17 IDs), P3 (~7 IDs), P4 (~30 IDs), P5 (~30 IDs), P6 (~25 IDs), P7 (~5 IDs), P8 (~50 IDs), Notes (~30 wontfix/informational/retracted). Total accounts for all 253 IDs — every ID is either in a fix phase, the wontfix/informational list, or retracted. **Nothing has been dropped.**