Bundles state that lives OUTSIDE the xenia-rs repo so a fresh clone on
another machine can be brought up to identical configuration via
migration/setup.sh:
- claude-memory/ ~/.claude/projects/-home-fabi-RE-Project-Sylpheed/memory/
(103 files, 1.1 MB - MEMORY.md + every
project_xenia_rs_*.md from audits
addis_signext through audit-058)
- project-root/dot-claude/ <project-root>/.claude/settings.json
(Stop hook + permissions)
- project-root/ppc-manual/ <project-root>/ppc-manual/
(PowerPC reference docs, 397 files, 3.7 MB)
- project-root/run-canary.sh <project-root>/run-canary.sh
- README.md Human-readable setup checklist
- setup.sh Idempotent installer (also reclones
xenia-canary at pinned HEAD 6de80dffe)
- MANIFEST.md Per-file mapping + per-file-not-bundled
restoration recipe
Excluded from bundle (not shippable via git):
- Sylpheed ISO (7.8 GB; copyright; manual copy required)
- sylpheed.db (395 MB; regenerable from XEX via analysis tooling)
- target/ build artifacts (rebuild on target)
- audit-runs probe firehoses (.log/.stdout/.stderr ~11 GB; rerun if needed)
- audit-runs memory dumps (.bin ~4.5 GB; rerun audit-026/027/029 if needed)
- xenia-canary checkout (setup.sh reclones from
git.mc02.dev/fabi/Xenia-Canary.git at HEAD 6de80dffe)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
8.5 KiB
vaddfp — Vector Add Floating Point
Category: VMX (Altivec) · Form: VX · Opcode: 0x1000000a
Assembler Mnemonics
| Mnemonic | XML entry | Flags | Description |
|---|---|---|---|
| vaddfp | vaddfp | — | Vector Add Floating Point |
| vaddfp128 | vaddfp128 | — | Vector128 Add Floating Point |
Syntax
vaddfp [VD], [VA], [VB]
vaddfp128 [VD], [VA], [VB]
Encoding
vaddfp — form VX
- Opcode word: 0x1000000a
- Primary opcode (bits 0–5): 4
- Extended opcode: 10
- Synchronising: no
| Bits | Field | Meaning |
|---|---|---|
| 0–5 | OPCD | primary opcode (4) |
| 6–10 | VRT/VD | destination vector register |
| 11–15 | VRA/VA | source A vector register |
| 16–20 | VRB/VB | source B vector register |
| 21–31 | XO | extended opcode (11 bits) |
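
The field table above translates directly into shifts and masks. The following is a minimal sketch, not the xenia-rs decoder: `VxFields`, `ppc_bits`, `decode_vx`, and `is_vaddfp` are hypothetical names introduced here for illustration. It assumes PowerPC bit numbering, where bit 0 is the most significant bit of the 32-bit word.

```rust
/// Fields of a VX-form instruction word, as listed in the table above.
/// (Hypothetical type for illustration; not part of xenia-rs.)
#[derive(Debug, PartialEq, Eq)]
pub struct VxFields {
    pub opcd: u32, // bits 0-5:   primary opcode
    pub vd: u32,   // bits 6-10:  destination vector register
    pub va: u32,   // bits 11-15: source A vector register
    pub vb: u32,   // bits 16-20: source B vector register
    pub xo: u32,   // bits 21-31: extended opcode
}

/// Extract the inclusive PPC bit range [first, last], where bit 0 is the MSB.
/// Assumes the range is narrower than 32 bits, which holds for every VX field.
fn ppc_bits(word: u32, first: u32, last: u32) -> u32 {
    let width = last - first + 1;
    (word >> (31 - last)) & ((1u32 << width) - 1)
}

/// Split a 32-bit instruction word into its VX-form fields.
pub fn decode_vx(word: u32) -> VxFields {
    VxFields {
        opcd: ppc_bits(word, 0, 5),
        vd: ppc_bits(word, 6, 10),
        va: ppc_bits(word, 11, 15),
        vb: ppc_bits(word, 16, 20),
        xo: ppc_bits(word, 21, 31),
    }
}

/// True if the word has primary opcode 4 and extended opcode 10 (vaddfp).
pub fn is_vaddfp(word: u32) -> bool {
    let f = decode_vx(word);
    f.opcd == 4 && f.xo == 10
}
```

For instance, `vaddfp v3, v3, v4` is the base opcode word `0x1000000a` with 3, 3, and 4 placed in the VD, VA, and VB fields; passing that word to `decode_vx` recovers the three register numbers.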
vaddfp128 — form VX128
- Opcode word: 0x14000010
- Primary opcode (bits 0–5): 5
- Extended opcode: 16
- Synchronising: no
| Bits | Field | Meaning |
|---|---|---|
| 0–5 | OPCD | primary opcode (4 or 5) |
| 6–10 | VD128l | destination low 5 bits |
| 11–15 | VA128l | source A low 5 bits |
| 16–20 | VB128l | source B low 5 bits |
| 21 | VA128H | source A high bit |
| 22 | — | reserved |
| 23–25 | VC | optional VC / XO sub-field |
| 26 | VA128h | source A middle bit |
| 27 | — | reserved |
| 28–29 | VD128h | destination high 2 bits |
| 30–31 | VB128h | source B high 2 bits |
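
To make the split register fields concrete, here is a sketch of how the 7-bit register numbers could be reassembled from a VX128 word. `decode_vx128_regs` is a hypothetical helper, not the xenia-rs decoder. The composition order (low 5 bits, then the "middle" bit at position 5, then the "high" bit at position 6) is an assumption read off the labels in the table above.

```rust
/// Reassemble the 7-bit VD, VA, and VB register numbers from a VX128-form word.
/// (Hypothetical helper; bit positions follow the table above, PPC bit 0 = MSB,
/// so PPC bit n sits at shift 31 - n.)
pub fn decode_vx128_regs(word: u32) -> (u32, u32, u32) {
    let vd_lo = (word >> 21) & 0x1f; // bits 6-10:  VD128l
    let va_lo = (word >> 16) & 0x1f; // bits 11-15: VA128l
    let vb_lo = (word >> 11) & 0x1f; // bits 16-20: VB128l
    let va_high = (word >> 10) & 0x1; // bit 21:    VA128H
    let va_mid = (word >> 5) & 0x1; // bit 26:      VA128h
    let vd_hi = (word >> 2) & 0x3; // bits 28-29:   VD128h
    let vb_hi = word & 0x3; // bits 30-31:          VB128h

    // Assumption: the low field supplies register bits 0-4 and the extra
    // fields extend the number upward from bit 5.
    let vd = vd_lo | (vd_hi << 5);
    let va = va_lo | (va_mid << 5) | (va_high << 6);
    let vb = vb_lo | (vb_hi << 5);
    (vd, va, vb)
}
```

Under these assumptions, a word with VD128l = 4 and VD128h = 3 names register 100, which is out of reach of the 5-bit VX form.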
Operands
| Field | Role | Description |
|---|---|---|
| VA | vaddfp: read; vaddfp128: read | Source A vector register. |
| VB | vaddfp: read; vaddfp128: read | Source B vector register. |
| VD | vaddfp: write; vaddfp128: write | Destination vector register. |
Register Effects
vaddfp
- Reads (always): VA, VB
- Reads (conditional): none
- Writes (always): VD
- Writes (conditional): none
vaddfp128
- Reads (always): VA, VB
- Reads (conditional): none
- Writes (always): VD
- Writes (conditional): none
Status-Register Effects
No condition-register or status-register effects.
Operation (pseudocode)
for each 32-bit float lane i in 0..3:
    VD[i] <- VA[i] + VB[i]
C Translation Example
/* vaddfp VD, VA, VB — lane-wise float add */
for (int i = 0; i < 4; ++i) v[insn.VD].f[i] = v[insn.VA].f[i] + v[insn.VB].f[i];
Implementation References
vaddfp
- xenia-canary XML: tools/ppc-instructions.xml (search for mnem="vaddfp")
- xenia-canary emit: src/xenia/cpu/ppc/ppc_emit_altivec.cc:341
- xenia-rs opcode: crates/xenia-cpu/src/opcode.rs:89
- xenia-rs decoder: crates/xenia-cpu/src/decoder.rs:438
- xenia-rs interpreter: crates/xenia-cpu/src/interpreter.rs:1984-1998
xenia-rs interpreter body (frozen snapshot)
PpcOpcode::vaddfp => {
    // PPCBUG-435: VSCR.NJ=1 (Xbox 360 always boots with this set) requires
    // flush-to-zero on subnormal inputs and outputs. Canary VMX float
    // arithmetic flushes denormals unconditionally.
    let a = ctx.vr[instr.ra()].as_f32x4();
    let b = ctx.vr[instr.rb()].as_f32x4();
    let mut r = [0f32; 4];
    for i in 0..4 {
        let ai = vmx::flush_denorm(a[i]);
        let bi = vmx::flush_denorm(b[i]);
        r[i] = vmx::flush_denorm(ai + bi);
    }
    ctx.vr[instr.rd()] = xenia_types::Vec128::from_f32x4_array(r);
    ctx.pc += 4;
}
vaddfp128
- xenia-canary XML: tools/ppc-instructions.xml (search for mnem="vaddfp128")
- xenia-canary emit: src/xenia/cpu/ppc/ppc_emit_altivec.cc:344
- xenia-rs opcode: crates/xenia-cpu/src/opcode.rs:89
- xenia-rs decoder: crates/xenia-cpu/src/decoder.rs:610
- xenia-rs interpreter: crates/xenia-cpu/src/interpreter.rs:1999-2011
xenia-rs interpreter body (frozen snapshot)
PpcOpcode::vaddfp128 => {
    // PPCBUG-435: same as vaddfp.
    let a = ctx.vr[instr.va128()].as_f32x4();
    let b = ctx.vr[instr.vb128()].as_f32x4();
    let mut r = [0f32; 4];
    for i in 0..4 {
        let ai = vmx::flush_denorm(a[i]);
        let bi = vmx::flush_denorm(b[i]);
        r[i] = vmx::flush_denorm(ai + bi);
    }
    ctx.vr[instr.vd128()] = xenia_types::Vec128::from_f32x4_array(r);
    ctx.pc += 4;
}
Extended Pseudocode
; Four independent lane-wise IEEE-754 single-precision adds
for i in 0..3:
    VD[i] <- VA[i] + VB[i] ; binary32, rounded to nearest
; No FPSCR update (VMX uses VSCR, which only has NJ / SAT — and vaddfp doesn't saturate)
Special Cases & Edge Conditions
- Lane indexing is big-endian. Lane 0 is the most significant 4 bytes of the 128-bit register (the one that appears at the lowest byte offset after a stvx). Xenia's Vec128::as_f32x4() already reads lanes in PPC order on x86-64. When writing C that manipulates individual lanes, treat v.f[0] as bytes 0..3 of the big-endian layout.
- Flush-denormals ("NJ") mode. Altivec is independent of FPSCR: it has its own VSCR with two defined bits (NJ for non-Java mode, SAT for sticky saturation). VMX float operations honour VSCR[NJ]: when set (the Xenon boot default), denormal inputs and outputs are flushed to zero. This differs from the scalar FPU, which has its own non-IEEE bit in FPSCR. Xenia sets NJ = 1 at context creation (context.rs).
- No exception, no trap. Altivec floats never raise exceptions. NaN inputs produce NaN outputs; adding infinities of opposite sign, (+∞) + (−∞), yields a NaN; there is no VXISI-style status bit. VSCR[SAT] is not touched by vaddfp (it records saturation in integer ops, not float arithmetic).
- Four independent lanes. Each lane's operation is unaffected by the others. Aliasing between VA, VB, and VD is legal and common (vaddfp v3, v3, v4).
- VMX128 sibling (vaddfp128). Semantics identical; only the register encoding differs. VMX128 uses a 7-bit operand ID per source (and destination) built from two or three non-contiguous bit fields; see categories/vmx128.md. Any operand combination encodable in the 32-register VX form is also encodable in the VMX128 form, so compilers picked the form that reached the needed register range.
- On x86-64 hosts. A natural compilation uses _mm_add_ps or AVX vaddps. Because the add is lane-wise, every lane comes out right regardless of which x86 lane holds which PPC lane, provided both sources and the destination use the same layout. PPC lane 0 maps to x86 lane 3 only if you treat the 128-bit value as "big-endian in memory", i.e. byte-swap on load/store. With xenia's _be memory helpers, _mm_add_ps gives the right per-lane result.
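
The edge cases above can be exercised with a small standalone sketch. It is not the xenia-rs code: `flush_denorm` here is a hypothetical stand-in for `vmx::flush_denorm`, and preserving the sign of the flushed zero is an assumption about that helper's behaviour. `lanes_from_be_bytes` is likewise a made-up name showing the big-endian lane order, with lane 0 at the lowest byte offset.

```rust
/// Hypothetical stand-in for a flush-to-zero helper: subnormal values become
/// a zero of the same sign (assumption), everything else passes through.
pub fn flush_denorm(x: f32) -> f32 {
    if x.is_subnormal() {
        0.0f32.copysign(x)
    } else {
        x
    }
}

/// Lane-wise add with NJ-style flushing of subnormal inputs and outputs.
pub fn vaddfp_lanes(a: [f32; 4], b: [f32; 4]) -> [f32; 4] {
    let mut r = [0.0f32; 4];
    for i in 0..4 {
        r[i] = flush_denorm(flush_denorm(a[i]) + flush_denorm(b[i]));
    }
    r
}

/// Read four f32 lanes from 16 bytes laid out as a stvx would store them:
/// lane 0 occupies bytes 0..3, each lane big-endian.
pub fn lanes_from_be_bytes(bytes: [u8; 16]) -> [f32; 4] {
    let mut lanes = [0.0f32; 4];
    for i in 0..4 {
        let mut w = [0u8; 4];
        w.copy_from_slice(&bytes[i * 4..i * 4 + 4]);
        lanes[i] = f32::from_be_bytes(w);
    }
    lanes
}
```

With this sketch, a subnormal input contributes zero to its lane, a sum that would land in the subnormal range is returned as zero, and adding `f32::INFINITY` to `f32::NEG_INFINITY` produces a NaN in that lane without affecting the other three.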
Related Instructions
- vsubfp: lane-wise float subtract.
- vmaddfp: lane-wise (VA × VC) + VB (fused multiply-add with single rounding).
- vnmsubfp: −((VA × VC) − VB).
- vmaxfp, vminfp: IEEE-754-aware max/min (NaN propagation).
- vcmpeqfp, vcmpgtfp, vcmpgefp, vcmpbfp: compares producing per-lane all-ones / all-zero masks.
- vrfin, vrfim, vrfip, vrfiz: round to integer (to-nearest / down / up / toward-zero).
- vmulfp: xenia's helper; not a native Altivec op, included for convenience. Games on real hardware use vmaddfp v, va, vc, v0_zero instead.
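
As an illustration of the fused forms listed above, the sketch below models vmaddfp and vnmsubfp per lane with Rust's `f32::mul_add`, which rounds once. The function names are hypothetical, and denormal flushing is deliberately left out to keep the arithmetic visible.

```rust
/// Lane-wise (VA × VC) + VB with a single rounding, modelled with f32::mul_add.
/// (Hypothetical helper; no NJ flushing applied.)
pub fn vmaddfp_lanes(a: [f32; 4], c: [f32; 4], b: [f32; 4]) -> [f32; 4] {
    let mut r = [0.0f32; 4];
    for i in 0..4 {
        r[i] = a[i].mul_add(c[i], b[i]);
    }
    r
}

/// Lane-wise −((VA × VC) − VB), also with a single rounding.
pub fn vnmsubfp_lanes(a: [f32; 4], c: [f32; 4], b: [f32; 4]) -> [f32; 4] {
    let mut r = [0.0f32; 4];
    for i in 0..4 {
        r[i] = -(a[i].mul_add(c[i], -b[i]));
    }
    r
}
```

Passing an all-zero VB to `vmaddfp_lanes` gives a plain lane-wise multiply, which is the pattern the vmulfp entry describes. The single rounding matters when the exact product has low-order bits that a separate multiply would round away before the add.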