xenia-rs/migration/project-root/ppc-manual/alu/mullwx.md
MechaCat02 e6d43a23ac chore: add migration/ bundle for cross-machine setup
Bundles state that lives OUTSIDE the xenia-rs repo so a fresh clone on
another machine can be brought up to identical configuration via
migration/setup.sh:

  - claude-memory/             ~/.claude/projects/-home-fabi-RE-Project-Sylpheed/memory/
                               (103 files, 1.1 MB - MEMORY.md + every
                                project_xenia_rs_*.md from audits
                                addis_signext through audit-058)
  - project-root/dot-claude/   <project-root>/.claude/settings.json
                               (Stop hook + permissions)
  - project-root/ppc-manual/   <project-root>/ppc-manual/
                               (PowerPC reference docs, 397 files, 3.7 MB)
  - project-root/run-canary.sh <project-root>/run-canary.sh
  - README.md                  Human-readable setup checklist
  - setup.sh                   Idempotent installer (also reclones
                               xenia-canary at pinned HEAD 6de80dffe)
  - MANIFEST.md                Per-file mapping + per-file-not-bundled
                               restoration recipe

Excluded from bundle (not shippable via git):
  - Sylpheed ISO (7.8 GB; copyright; manual copy required)
  - sylpheed.db (395 MB; regenerable from XEX via analysis tooling)
  - target/ build artifacts (rebuild on target)
  - audit-runs probe firehoses (.log/.stdout/.stderr ~11 GB; rerun if needed)
  - audit-runs memory dumps (.bin ~4.5 GB; rerun audit-026/027/029 if needed)
  - xenia-canary checkout (setup.sh reclones from
    git.mc02.dev/fabi/Xenia-Canary.git at HEAD 6de80dffe)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 21:38:38 +02:00


mullwx — Multiply Low Word

Category: Integer ALU · Form: XO · Opcode: 0x7c0001d6

Assembler Mnemonics

| Mnemonic | XML entry | Flags | Description |
| --- | --- | --- | --- |
| mullw | mullwx | | Multiply Low Word |
| mullwo | mullwx | OE=1 | Multiply Low Word |
| mullw. | mullwx | Rc=1 | Multiply Low Word |
| mullwo. | mullwx | OE=1, Rc=1 | Multiply Low Word |

Syntax

mullw[OE][Rc] [RD], [RA], [RB]

Encoding

mullwx — form XO

  • Opcode word: 0x7c0001d6
  • Primary opcode (bits 0-5): 31
  • Extended opcode: 235
  • Synchronising: no

| Bits | Field | Meaning |
| --- | --- | --- |
| 0-5 | OPCD | primary opcode (31) |
| 6-10 | RT | destination GPR |
| 11-15 | RA | source A |
| 16-20 | RB | source B |
| 21 | OE | overflow-enable flag |
| 22-30 | XO | extended opcode (9 bits) |
| 31 | Rc | record-form flag |
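
The field layout above can be turned into a small decoder. The following is a minimal sketch, not the xenia-rs decoder: the `xo_form` struct and the `decode_xo` / `is_mullwx` names are hypothetical and exist only for illustration. PowerPC numbers bits from the most significant end, so a field ending at bit b sits at shift 31 - b.

```c
#include <stdbool.h>
#include <stdint.h>

/* Decoded XO-form fields (illustrative names, not from xenia-rs). */
typedef struct {
    unsigned opcd; /* bits 0-5:   primary opcode       */
    unsigned rt;   /* bits 6-10:  destination GPR      */
    unsigned ra;   /* bits 11-15: source A             */
    unsigned rb;   /* bits 16-20: source B             */
    bool     oe;   /* bit 21:     overflow-enable flag */
    unsigned xo;   /* bits 22-30: extended opcode      */
    bool     rc;   /* bit 31:     record-form flag     */
} xo_form;

/* Split a 32-bit instruction word into its XO-form fields. */
static xo_form decode_xo(uint32_t w) {
    xo_form f;
    f.opcd = (w >> 26) & 0x3Fu;
    f.rt   = (w >> 21) & 0x1Fu;
    f.ra   = (w >> 16) & 0x1Fu;
    f.rb   = (w >> 11) & 0x1Fu;
    f.oe   = ((w >> 10) & 1u) != 0;
    f.xo   = (w >> 1) & 0x1FFu;
    f.rc   = (w & 1u) != 0;
    return f;
}

/* Match any mullwx variant: the mask keeps OPCD and XO only,
   so RT/RA/RB, OE and Rc are ignored. */
static bool is_mullwx(uint32_t w) {
    return (w & 0xFC0003FEu) == 0x7C0001D6u;
}
```

For example, `mullw r3, r4, r5` assembles to 0x7C6429D6, which decodes to OPCD=31, RT=3, RA=4, RB=5, OE=0, XO=235, Rc=0.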

Operands

| Field | Role | Description |
| --- | --- | --- |
| RA | mullwx: read | Source GPR (r0-r31). |
| RB | mullwx: read | Source GPR. |
| RD | mullwx: write | Destination GPR. |
| CR | mullwx: write (conditional) | Condition-register update. When Rc=1, CR field 0 is updated from the result (CR6 and CR1 are used by vector compares and FPU record forms, not by mullwx). |
| OE | mullwx: write (conditional) | Overflow-enable bit. When OE=1, the instruction writes XER[OV] and, on signed overflow, sets the sticky XER[SO]. |

Register Effects

mullwx

  • Reads (always): RA, RB
  • Reads (conditional): none
  • Writes (always): RD
  • Writes (conditional): CR0 (when Rc=1); XER[OV] and XER[SO] (when OE=1)

Status-Register Effects

  • mullwx, when Rc=1: CR0 ← signed-compare(result, 0), with the SO bit copied from XER[SO].
  • mullwx, when OE=1: XER[OV] ← signed-overflow(result); XER[SO] ← XER[SO] | XER[OV] (sticky).

Operation (pseudocode)

RT <- ((RA)[32:63]) * ((RB)[32:63])    ; signed 32×32 → 64

C Translation Example

/* C translation: the xenia-rs interpreter arm below in           */
/* Implementation References is the authoritative semantic        */
/* snapshot. Translate it line-by-line:                            */
/*   - ctx.gpr[N]  -> r[N]       (or f[]/v[] for FPRs/VRs)        */
/*   - mem.read_u*/write_u* -> mem_read_u*_be / mem_write_u*_be   */
/*   - ctx.update_cr_signed(fld, v) -> update_cr_signed(fld, v)   */
/*   - ctx.xer_ca / xer_ov / xer_so -> xer.CA / xer.OV / xer.SO   */
/* The Register Effects and Status-Register Effects tables above  */
/* enumerate every side effect a faithful translation must emit.  */
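
Following the mapping in the comment above, a line-by-line C rendering of the interpreter arm might look like the sketch below. It makes several assumptions: the `ppc_state` layout, the CR field encoding (LT=8, GT=4, EQ=2, SO=1) and the `exec_mullwx` name are hypothetical, and the OE branch assumes that `overflow::mullw_ov` reports "product does not fit in 32 bits" and that `overflow::apply` writes OV and ORs it into SO, since neither body is shown in the snapshot. The narrowing casts rely on the usual two's-complement wrap, which is implementation-defined in C but holds on mainstream compilers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical guest state, shaped after the mapping comment. */
typedef struct {
    uint64_t r[32];                  /* ctx.gpr[N] -> r[N]            */
    struct { bool CA, OV, SO; } xer; /* ctx.xer_*  -> xer.*           */
    uint8_t  cr[8];                  /* per field: LT=8 GT=4 EQ=2 SO=1 */
    uint64_t pc;
} ppc_state;

/* Signed compare against zero, then copy XER[SO] into the SO bit. */
static void update_cr_signed(ppc_state *s, unsigned fld, int64_t v) {
    uint8_t bits = (uint8_t)(v < 0 ? 8 : v > 0 ? 4 : 2);
    s->cr[fld] = (uint8_t)(bits | (s->xer.SO ? 1 : 0));
}

static void exec_mullwx(ppc_state *s, unsigned rd, unsigned ra,
                        unsigned rb, bool oe, bool rc) {
    /* Low 32 bits of each source, sign-extended to 64 bits. */
    int64_t a = (int32_t)(uint32_t)s->r[ra];
    int64_t b = (int32_t)(uint32_t)s->r[rb];
    int64_t product = a * b;         /* magnitude <= 2^62, no overflow */

    /* As in the snapshot: keep only the low 32 bits, zero-extended. */
    s->r[rd] = (uint32_t)product;

    if (oe) {
        /* Assumed semantics of overflow::mullw_ov + overflow::apply. */
        bool ov = product != (int64_t)(int32_t)(uint32_t)product;
        s->xer.OV = ov;
        s->xer.SO = s->xer.SO || ov;
    }
    if (rc) {
        /* CR0 from the low 32 bits of the result, sign-extended. */
        update_cr_signed(s, 0, (int32_t)(uint32_t)s->r[rd]);
    }
    s->pc += 4;
}
```

Calling `exec_mullwx(&s, 3, 4, 5, true, true)` corresponds to executing `mullwo. r3, r4, r5` on the state `s`.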

Implementation References

mullwx

xenia-rs interpreter body (frozen snapshot)
        PpcOpcode::mullwx => {
            // PPCBUG-009: 32-bit ABI. Truncate product to u32 — overflow detection
            // (mullw_ov) still uses the full i64 product to catch the overflow.
            let ra = ctx.gpr[instr.ra()] as i32 as i64;
            let rb = ctx.gpr[instr.rb()] as i32 as i64;
            let product = ra.wrapping_mul(rb);
            ctx.gpr[instr.rd()] = product as u32 as u64;
            if instr.oe() {
                overflow::apply(ctx, overflow::mullw_ov(product));
            }
            if instr.rc_bit() {
                ctx.update_cr_signed(0, ctx.gpr[instr.rd()] as u32 as i32 as i64);
            }
            ctx.pc += 4;
        }

Extended Pseudocode

prod64 <- sign_extend_32_to_64((RA)[32:63]) *s sign_extend_32_to_64((RB)[32:63])
RT <- prod64                                         ; 64-bit result
if OE then
    XER[OV] <- (prod64 ≠ sign_extend_32_to_64(prod64[32:63]))  ; set when product doesn't fit in 32 bits
    XER[SO] <- XER[SO] | XER[OV]
if Rc then
    CR0 <- signed_compare(RT, 0) || XER[SO]
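
The pseudocode above can be checked with a small C model. This is a sketch under stated assumptions: `mullw_result`, `mullw_arch` and `cr0_bits` are hypothetical names, CR0 is modelled as LT=8, GT=4, EQ=2 without the SO bit, and the narrowing casts assume two's-complement wrap. It keeps the full 64-bit product in RT, as the pseudocode does, so it can also show where a CR0 computed from the low 32 bits would disagree.

```c
#include <stdbool.h>
#include <stdint.h>

/* Result of the modelled multiply: RT value plus the OV predicate. */
typedef struct {
    uint64_t rt; /* full 64-bit product                     */
    bool     ov; /* true when the product needs > 32 bits   */
} mullw_result;

static mullw_result mullw_arch(uint64_t ra, uint64_t rb) {
    /* sign_extend_32_to_64 of the low words of RA and RB. */
    int64_t a = (int32_t)(uint32_t)ra;
    int64_t b = (int32_t)(uint32_t)rb;
    int64_t prod64 = a * b;

    mullw_result res;
    res.rt = (uint64_t)prod64;
    /* OV: product differs from its own low word sign-extended. */
    res.ov = prod64 != (int64_t)(int32_t)(uint32_t)res.rt;
    return res;
}

/* LT/GT/EQ part of a CR field for a signed compare against zero. */
static unsigned cr0_bits(int64_t v) {
    return v < 0 ? 8u : v > 0 ? 4u : 2u;
}
```

With RA = RB = 0x10000 the product is 0x1_0000_0000: OV is set, `cr0_bits` on the full 64-bit RT yields GT, and `cr0_bits` on the sign-extended low word yields EQ.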

Special Cases & Edge Conditions

  • Inputs are the low 32 bits. mullw only looks at RA[32:63] and RB[32:63]; the high 32 bits of each source are ignored. This is a 32-bit × 32-bit → 64-bit signed multiply. For full 64-bit operands use mulldx.
  • Result is the full 64-bit product. The signed 64-bit product always fits in a 64-bit GPR without loss, so the pseudocode above writes all of it to RT. 32-bit consumers see RT[32:63], the low 32 bits of the product. The high 32 bits are available separately: use mulhwx for the high word of the signed product or mulhwux for the high word of the unsigned product. Note that the xenia-rs snapshot above stores only the low 32 bits, zero-extended (product as u32 as u64; see PPCBUG-009).
  • OE overflow test is 32-bit. XER[OV] is set iff the 64-bit signed product cannot be represented in 32 bits; equivalently, iff bits 0:32 of the product are not all equal (the high 32 bits are not a sign extension of bit 32). In the interpreter snapshot above, OE=1 is handled by overflow::apply(ctx, overflow::mullw_ov(product)), which receives the full i64 product.
  • Xenia-rs CR0 update bug footprint. The interpreter computes CR0 from the low 32 bits of the result, sign-extended (as u32 as i32 as i64). For a 32×32→64 multiply the high 32 bits of the product may be non-zero even when the low 32 bits are zero (for example, 0x10000 × 0x10000 = 0x1_0000_0000), so xenia-rs's CR0 can differ from the spec's, which compares the full 64-bit product to zero. In practice this matters only for code that relies on mullw. to detect overflow via CR0, which is extremely rare.
  • Latency. On the Xenon, mullw has higher latency than add/sub; many hot inner loops avoid it by strength-reduction or shift-add chains. This is irrelevant for correctness but sometimes explains surprising instruction sequences in disassembly.

Related Instructions

  • mulhwx: signed high 32 bits of the same 32×32 product.
  • mulhwux: unsigned high 32 bits of a 32×32 product.
  • mulli: D-form; RT ← (RA) × EXTS(SIMM), keeping the low 64 bits of the signed product.
  • mulldx, mulhdx, mulhdux: 64-bit multiplies (low/high, signed/unsigned).
  • divwx, divwux: 32-bit signed / unsigned division.

IBM Reference