MechaCat02 e6d43a23ac chore: add migration/ bundle for cross-machine setup

Bundles state that lives OUTSIDE the xenia-rs repo so a fresh clone on
another machine can be brought up to identical configuration via
migration/setup.sh:

  - claude-memory/             ~/.claude/projects/-home-fabi-RE-Project-Sylpheed/memory/
                               (103 files, 1.1 MB - MEMORY.md + every
                                project_xenia_rs_*.md from audits
                                addis_signext through audit-058)
  - project-root/dot-claude/   <project-root>/.claude/settings.json
                               (Stop hook + permissions)
  - project-root/ppc-manual/   <project-root>/ppc-manual/
                               (PowerPC reference docs, 397 files, 3.7 MB)
  - project-root/run-canary.sh <project-root>/run-canary.sh
  - README.md                  Human-readable setup checklist
  - setup.sh                   Idempotent installer (also reclones
                               xenia-canary at pinned HEAD 6de80dffe)
  - MANIFEST.md                Per-file mapping + per-file-not-bundled
                               restoration recipe

Excluded from bundle (not shippable via git):
  - Sylpheed ISO (7.8 GB; copyright; manual copy required)
  - sylpheed.db (395 MB; regenerable from XEX via analysis tooling)
  - target/ build artifacts (rebuild on target)
  - audit-runs probe firehoses (.log/.stdout/.stderr ~11 GB; rerun if needed)
  - audit-runs memory dumps (.bin ~4.5 GB; rerun audit-026/027/029 if needed)
  - xenia-canary checkout (setup.sh reclones from
    git.mc02.dev/fabi/Xenia-Canary.git at HEAD 6de80dffe)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-05-10 21:38:38 +02:00

`vaddfp` — Vector Add Floating Point

Category: VMX (Altivec) · Form: VX · Opcode: 0x1000000a

Assembler Mnemonics

Mnemonic	XML entry	Flags	Description
`vaddfp`	`vaddfp`	—	Vector Add Floating Point
`vaddfp128`	`vaddfp128`	—	Vector128 Add Floating Point

Syntax

vaddfp [VD], [VA], [VB]
vaddfp128 [VD], [VA], [VB]

Encoding

`vaddfp` — form `VX`

Opcode word: 0x1000000a
Primary opcode (bits 0–5): 4
Extended opcode: 10
Synchronising: no

Bits	Field	Meaning
0–5	`OPCD`	primary opcode (4)
6–10	`VRT/VD`	destination vector register
11–15	`VRA/VA`	source A vector register
16–20	`VRB/VB`	source B vector register
21–31	`XO`	extended opcode (11 bits)
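
The field table above can be exercised with a small encoder/decoder. This is a minimal sketch, not xenia-rs code: `VxFields`, `encode_vaddfp` and `decode_vaddfp` are hypothetical names introduced for illustration, and the shifts follow from PPC bit 0 being the most significant bit of the 32-bit word.

```rust
/// Base opcode word for `vaddfp` (primary opcode 4, extended opcode 10).
const VADDFP_BASE: u32 = 0x1000_000a;

/// Register fields of a decoded VX-form word (hypothetical helper type).
#[derive(Debug, PartialEq, Eq)]
struct VxFields {
    vd: u32,
    va: u32,
    vb: u32,
}

/// Pack three 5-bit register numbers into a `vaddfp` instruction word.
/// PPC bits 6-10 sit at shift 21, bits 11-15 at 16, bits 16-20 at 11.
fn encode_vaddfp(vd: u32, va: u32, vb: u32) -> u32 {
    assert!(vd < 32 && va < 32 && vb < 32, "VX form only reaches v0-v31");
    VADDFP_BASE | (vd << 21) | (va << 16) | (vb << 11)
}

/// Return the register fields if `word` is a `vaddfp`, otherwise `None`.
fn decode_vaddfp(word: u32) -> Option<VxFields> {
    let opcd = word >> 26; // bits 0-5
    let xo = word & 0x7ff; // bits 21-31
    if opcd != 4 || xo != 10 {
        return None;
    }
    Some(VxFields {
        vd: (word >> 21) & 0x1f,
        va: (word >> 16) & 0x1f,
        vb: (word >> 11) & 0x1f,
    })
}
```

For instance, `encode_vaddfp(3, 3, 4)` (that is, `vaddfp v3, v3, v4`) yields `0x1063200a`, and decoding that word returns the same three register numbers.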

`vaddfp128` — form `VX128`

Opcode word: 0x14000010
Primary opcode (bits 0–5): 5
Extended opcode: 16
Synchronising: no

Bits	Field	Meaning
0–5	`OPCD`	primary opcode (5 for `vaddfp128`)
6–10	`VD128l`	destination low 5 bits
11–15	`VA128l`	source A low 5 bits
16–20	`VB128l`	source B low 5 bits
21	`VA128H`	source A high bit
22	`—`	reserved
23–25	`VC`	optional VC / XO sub-field
26	`VA128h`	source A middle bit
27	`—`	unnamed in the generic form; set for `vaddfp128` (the 0x10 in the opcode word)
28–29	`VD128h`	destination high 2 bits
30–31	`VB128h`	source B high 2 bits
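
The split fields in the table above can be reassembled into 7-bit register numbers as sketched below. The helper names `ppc_bits` and `decode_vx128_regs` are hypothetical, and the sketch assumes each register number is formed as low 5 bits, then the middle or high field from bit 5 upward (for source A: `VA128l`, then `VA128h` as bit 5, then `VA128H` as bit 6), which is how the "low / middle / high" wording in the table reads.

```rust
/// Extract `len` bits starting at PPC bit `start` (bit 0 = most significant).
fn ppc_bits(word: u32, start: u32, len: u32) -> u32 {
    (word >> (32 - start - len)) & ((1 << len) - 1)
}

/// Assemble the three 7-bit VMX128 register numbers as (vd, va, vb).
fn decode_vx128_regs(word: u32) -> (u32, u32, u32) {
    // VD128l (bits 6-10) plus VD128h (bits 28-29) as the top two bits.
    let vd = ppc_bits(word, 6, 5) | (ppc_bits(word, 28, 2) << 5);
    // VA128l (bits 11-15), VA128h (bit 26) as bit 5, VA128H (bit 21) as bit 6.
    let va = ppc_bits(word, 11, 5)
        | (ppc_bits(word, 26, 1) << 5)
        | (ppc_bits(word, 21, 1) << 6);
    // VB128l (bits 16-20) plus VB128h (bits 30-31) as the top two bits.
    let vb = ppc_bits(word, 16, 5) | (ppc_bits(word, 30, 2) << 5);
    (vd, va, vb)
}
```

Applied to the bare opcode word `0x14000010`, this returns `(0, 0, 0)`, since none of the register fields overlap the bit that distinguishes the opcode.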

Operands

Field	Role	Description
`VA`	vaddfp: read; vaddfp128: read	Source A vector register.
`VB`	vaddfp: read; vaddfp128: read	Source B vector register.
`VD`	vaddfp: write; vaddfp128: write	Destination vector register.

Register Effects

`vaddfp`

Reads (always): VA, VB
Reads (conditional): none
Writes (always): VD
Writes (conditional): none

`vaddfp128`

Reads (always): VA, VB
Reads (conditional): none
Writes (always): VD
Writes (conditional): none

Status-Register Effects

No condition-register or status-register effects.

Operation (pseudocode)

for each 32-bit float lane i in 0..3:
    VD[i] <- VA[i] + VB[i]

C Translation Example

/* vaddfp VD, VA, VB: lane-wise float add (VSCR[NJ] denormal flushing not modelled) */
for (int i = 0; i < 4; ++i)
    v[insn.VD].f[i] = v[insn.VA].f[i] + v[insn.VB].f[i];

Implementation References

vaddfp

xenia-canary XML: tools/ppc-instructions.xml — search for mnem="vaddfp"
xenia-canary emit: src/xenia/cpu/ppc/ppc_emit_altivec.cc:341
xenia-rs opcode: crates/xenia-cpu/src/opcode.rs:89
xenia-rs decoder: crates/xenia-cpu/src/decoder.rs:438
xenia-rs interpreter: crates/xenia-cpu/src/interpreter.rs:1984-1998

xenia-rs interpreter body (frozen snapshot)

        PpcOpcode::vaddfp => {
            // PPCBUG-435: VSCR.NJ=1 (Xbox 360 always boots with this set) requires
            // flush-to-zero on subnormal inputs and outputs. Canary VMX float
            // arithmetic flushes denormals unconditionally.
            let a = ctx.vr[instr.ra()].as_f32x4();
            let b = ctx.vr[instr.rb()].as_f32x4();
            let mut r = [0f32; 4];
            for i in 0..4 {
                let ai = vmx::flush_denorm(a[i]);
                let bi = vmx::flush_denorm(b[i]);
                r[i] = vmx::flush_denorm(ai + bi);
            }
            ctx.vr[instr.rd()] = xenia_types::Vec128::from_f32x4_array(r);
            ctx.pc += 4;
        }

vaddfp128

xenia-canary XML: tools/ppc-instructions.xml — search for mnem="vaddfp128"
xenia-canary emit: src/xenia/cpu/ppc/ppc_emit_altivec.cc:344
xenia-rs opcode: crates/xenia-cpu/src/opcode.rs:89
xenia-rs decoder: crates/xenia-cpu/src/decoder.rs:610
xenia-rs interpreter: crates/xenia-cpu/src/interpreter.rs:1999-2011

xenia-rs interpreter body (frozen snapshot)

        PpcOpcode::vaddfp128 => {
            // PPCBUG-435: same as vaddfp.
            let a = ctx.vr[instr.va128()].as_f32x4();
            let b = ctx.vr[instr.vb128()].as_f32x4();
            let mut r = [0f32; 4];
            for i in 0..4 {
                let ai = vmx::flush_denorm(a[i]);
                let bi = vmx::flush_denorm(b[i]);
                r[i] = vmx::flush_denorm(ai + bi);
            }
            ctx.vr[instr.vd128()] = xenia_types::Vec128::from_f32x4_array(r);
            ctx.pc += 4;
        }

Extended Pseudocode

; Four independent lane-wise IEEE-754 single-precision adds
for i in 0..3:
    VD[i] <- VA[i] + VB[i]                       ; binary32, rounded to nearest

; No FPSCR update (VMX uses VSCR, which only has NJ / SAT — and vaddfp doesn't saturate)

Special Cases & Edge Conditions

Lane indexing is big-endian. Lane 0 is the most significant 4 bytes of the 128-bit register (the one that appears at the lowest byte offset after a stvx). Xenia's Vec128::as_f32x4() already reads lanes in PPC order on x86-64. When writing C that manipulates individual lanes, treat v.f[0] as bytes 0..3 of the big-endian layout.
Flush-denormals ("NJ") mode. Altivec is independent of FPSCR: it has its own VSCR, which defines just two bits (NJ for non-Java mode and SAT for sticky saturation). VMX float operations honour VSCR[NJ]: when set (the Xenon boot default), denormal inputs and outputs are flushed to zero. This control is separate from the scalar FPU, which has its own non-IEEE bit in FPSCR. Xenia sets NJ = 1 at context creation (context.rs).
No exception, no trap. Altivec floats never raise exceptions. NaN inputs produce NaN outputs; (+∞) + (−∞) yields a NaN; there is no VXISI-style status bit. VSCR[SAT] is not touched by vaddfp (it records saturation in integer and float-to-integer conversion ops, not in float arithmetic).
Four independent lanes. Each lane's operation is unaffected by the others. Aliasing between VA, VB, and VD is legal and common (vaddfp v3, v3, v4).
VMX128 sibling (vaddfp128). Semantics identical; only the register encoding differs. VMX128 uses a 7-bit operand ID per source (and destination), built from two or three non-contiguous bit fields (see categories/vmx128.md). Any register combination encodable in the 32-register VX form is also encodable in the VMX128 form, so a compiler can pick whichever form reaches the registers it needs.
On x86-64 hosts. A natural compilation uses _mm_add_ps or AVX vaddps. The add treats every lane identically, so it gives the right per-lane result under any lane order as long as loads and stores agree with each other. PPC lane 0 maps to x86 lane 3 only if you treat the 128-bit value as "big-endian in memory", i.e. byte-swap the whole vector on load/store. With xenia's _be memory helpers, _mm_add_ps gives the right per-lane result.
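
The lane-order, NJ and NaN points above can be checked with a scalar model. This is a sketch under stated assumptions, not the xenia-rs implementation: `flush_denorm` here is a hypothetical stand-in for the `vmx::flush_denorm` helper referenced earlier, and it assumes a flushed value keeps its sign; `lanes_from_be_bytes` and `vaddfp_lanes` are illustrative names.

```rust
/// Flush a subnormal value to zero (models VSCR[NJ] = 1).
/// Assumption: the resulting zero keeps the sign of the input.
fn flush_denorm(x: f32) -> f32 {
    if x.is_subnormal() {
        0.0f32.copysign(x)
    } else {
        x
    }
}

/// Read four lanes from 16 bytes of guest memory; lane 0 is bytes 0..4,
/// each lane stored big-endian.
fn lanes_from_be_bytes(bytes: [u8; 16]) -> [f32; 4] {
    let mut lanes = [0.0f32; 4];
    for (i, lane) in lanes.iter_mut().enumerate() {
        let mut word = [0u8; 4];
        word.copy_from_slice(&bytes[i * 4..i * 4 + 4]);
        *lane = f32::from_be_bytes(word);
    }
    lanes
}

/// Lane-wise add with flush-to-zero applied to both inputs and the result.
fn vaddfp_lanes(a: [f32; 4], b: [f32; 4]) -> [f32; 4] {
    let mut r = [0.0f32; 4];
    for i in 0..4 {
        r[i] = flush_denorm(flush_denorm(a[i]) + flush_denorm(b[i]));
    }
    r
}
```

With these helpers, adding two subnormal lanes gives zero rather than a subnormal sum, a lane holding +∞ added to −∞ gives NaN, and neither affects the other lanes. Passing the same array for both sources mirrors the register aliasing case, since the inputs are read before the result is written.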

Related Instructions

vsubfp — lane-wise float subtract.
vmaddfp — lane-wise (VA × VC) + VB (fused multiply-add with single rounding).
vnmsubfp — −((VA × VC) − VB).
vmaxfp, vminfp — IEEE-754-aware max/min (NaN propagation).
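vcmpeqfp, vcmpgtfp, vcmpgefp — compares producing per-lane all-ones / all-zero masks.
vcmpbfp — bounds compare; sets only the two most significant bits of each lane (out-of-bounds flags) rather than a full mask.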
vrfin, vrfim, vrfip, vrfiz — round to integer (to-nearest / down / up / toward-zero).
vmulfp — xenia's helper; not a native Altivec op, included for convenience. Hardware games use vmaddfp v, va, vc, v0_zero instead.
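
The multiply-through-vmaddfp idiom mentioned in the last entry can be sketched with Rust's fused `f32::mul_add`. The function names are hypothetical, and NJ flushing is left out to keep the sketch short.

```rust
/// Lane-wise fused multiply-add: (a * c) + b with a single rounding per lane,
/// following the vmaddfp operand roles described above.
fn vmaddfp_lanes(a: [f32; 4], c: [f32; 4], b: [f32; 4]) -> [f32; 4] {
    let mut r = [0.0f32; 4];
    for i in 0..4 {
        r[i] = a[i].mul_add(c[i], b[i]);
    }
    r
}

/// Plain lane-wise multiply expressed as a multiply-add with a zero addend.
fn vmul_via_vmaddfp(a: [f32; 4], c: [f32; 4]) -> [f32; 4] {
    vmaddfp_lanes(a, c, [0.0; 4])
}
```

Calling `vmul_via_vmaddfp([1.5, 2.0, -3.0, 0.5], [2.0; 4])` gives `[3.0, 4.0, -6.0, 1.0]`, the same values a direct lane-wise multiply would produce for these inputs.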
