chore: add migration/ bundle for cross-machine setup

Bundles state that lives OUTSIDE the xenia-rs repo so a fresh clone on
another machine can be brought up to identical configuration via
migration/setup.sh:

  - claude-memory/             ~/.claude/projects/-home-fabi-RE-Project-Sylpheed/memory/
                               (103 files, 1.1 MB - MEMORY.md + every
                                project_xenia_rs_*.md from audits
                                addis_signext through audit-058)
  - project-root/dot-claude/   <project-root>/.claude/settings.json
                               (Stop hook + permissions)
  - project-root/ppc-manual/   <project-root>/ppc-manual/
                               (PowerPC reference docs, 397 files, 3.7 MB)
  - project-root/run-canary.sh <project-root>/run-canary.sh
  - README.md                  Human-readable setup checklist
  - setup.sh                   Idempotent installer (also reclones
                               xenia-canary at pinned HEAD 6de80dffe)
  - MANIFEST.md                Per-file mapping + per-file-not-bundled
                               restoration recipe

Excluded from bundle (not shippable via git):
  - Sylpheed ISO (7.8 GB; copyright; manual copy required)
  - sylpheed.db (395 MB; regenerable from XEX via analysis tooling)
  - target/ build artifacts (rebuild on target)
  - audit-runs probe firehoses (.log/.stdout/.stderr ~11 GB; rerun if needed)
  - audit-runs memory dumps (.bin ~4.5 GB; rerun audit-026/027/029 if needed)
  - xenia-canary checkout (setup.sh reclones from
    git.mc02.dev/fabi/Xenia-Canary.git at HEAD 6de80dffe)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
MechaCat02
2026-05-10 21:38:38 +02:00
parent 8e709b0a24
commit e6d43a23ac
505 changed files with 86028 additions and 0 deletions

@@ -0,0 +1,137 @@
# `vcfpsxws128` — Vector128 Convert From Floating-Point to Signed Fixed-Point Word Saturate
> **Category:** [VMX128](../categories/vmx128.md) · **Form:** [VX128_3](../forms/VX128_3.md) · **Opcode:** `0x18000230`
<!-- GENERATED: BEGIN -->
## Assembler Mnemonics
| Mnemonic | XML entry | Flags | Description |
| --- | --- | --- | --- |
| `vcfpsxws128` | `vcfpsxws128` | — | Vector128 Convert From Floating-Point to Signed Fixed-Point Word Saturate |
## Syntax
```asm
vcfpsxws128 [VD], [VB], [UIMM]
```
## Encoding
### `vcfpsxws128` — form `VX128_3`
- **Opcode word:** `0x18000230`
- **Primary opcode (bits 0–5):** `6`
- **Extended opcode:** `560`
- **Synchronising:** no
| Bits | Field | Meaning |
| --- | --- | --- |
| 0–5 | `OPCD` | primary opcode (6) |
| 6–10 | `VD128l` | destination low 5 bits |
| 11–15 | `IMM` | 5-bit immediate |
| 16–20 | `VB128l` | source B low 5 bits |
| 21–27 | `XO` | extended opcode |
| 28–29 | `VD128h` | destination high 2 bits |
| 30–31 | `VB128h` | source B high 2 bits |
## Operands
| Field | Role | Description |
| --- | --- | --- |
| `VB` | vcfpsxws128: read | Source B vector register. |
| `UIMM` | vcfpsxws128: read | 5-bit unsigned immediate (0..31), taken from the `IMM` field. Zero-extended. |
| `VD` | vcfpsxws128: write | Destination vector register. |
| `VSCR` | vcfpsxws128: write | Vector Status and Control Register (NJ/SAT bits). |
## Register Effects
### `vcfpsxws128`
- **Reads (always):** `VB`, `UIMM`
- **Reads (conditional):** _none_
- **Writes (always):** `VD`, `VSCR`
- **Writes (conditional):** _none_
## Status-Register Effects
- `vcfpsxws128`: **VSCR[SAT]** may be stickied on saturating vector operations.
## Operation (pseudocode)
```
; Derived from the xenia-rs interpreter arm (see Implementation References).
uimm <- instr[11:15]                          ; 5-bit scale, 0..31
for i in 0..3:
    x         <- VB.f32[i] * 2^uimm
    VD.s32[i] <- saturate_s32(truncate(x))    ; clamp to [-2^31, 2^31-1]
    if clamped: VSCR[SAT] <- 1                ; sticky
; NaN handling lives inside cvt_f32_to_i32_sat (see Special Cases).
```
## C Translation Example
```c
/* C translation: the xenia-rs interpreter arm below in */
/* Implementation References is the authoritative semantic */
/* snapshot. Translate it line-by-line: */
/* - ctx.gpr[N] -> r[N] (or f[]/v[] for FPRs/VRs) */
/* - mem.read_uN / write_uN -> mem_read_uN_be / mem_write_uN_be */
/* - ctx.update_cr_signed(fld, v) -> update_cr_signed(fld, v) */
/* - ctx.xer_ca / xer_ov / xer_so -> xer.CA / xer.OV / xer.SO */
/* The Register Effects and Status-Register Effects tables above */
/* enumerate every side effect a faithful translation must emit. */
```
## Implementation References
**`vcfpsxws128`**
- xenia-canary XML: [`tools/ppc-instructions.xml` — search for `mnem="vcfpsxws128"`](../../xenia-canary/tools/ppc-instructions.xml)
- xenia-canary emit: [`src/xenia/cpu/ppc/ppc_emit_altivec.cc:539`](../../xenia-canary/src/xenia/cpu/ppc/ppc_emit_altivec.cc#L539)
- xenia-rs opcode: [`crates/xenia-cpu/src/opcode.rs:93`](../../xenia-rs/crates/xenia-cpu/src/opcode.rs#L93)
- xenia-rs decoder: [`crates/xenia-cpu/src/decoder.rs:656`](../../xenia-rs/crates/xenia-cpu/src/decoder.rs#L656)
- xenia-rs interpreter: [`crates/xenia-cpu/src/interpreter.rs:4323-4334`](../../xenia-rs/crates/xenia-cpu/src/interpreter.rs#L4323-L4334)
<details><summary>xenia-rs interpreter body (frozen snapshot)</summary>
```rust
PpcOpcode::vcfpsxws128 => {
    let uimm = (instr.raw >> 16) & 0x1F;
    let b = ctx.vr[instr.vb128()].as_f32x4();
    let mut r = [0i32; 4]; let mut sat = false;
    for i in 0..4 {
        let (v, s) = crate::vmx::cvt_f32_to_i32_sat(b[i], uimm);
        r[i] = v; sat |= s;
    }
    if sat { ctx.set_vscr_sat(true); }
    ctx.vr[instr.vd128()] = crate::vmx::from_i32x4(r);
    ctx.pc += 4;
}
```
</details>
<!-- GENERATED: END -->
## Special Cases & Edge Conditions
- **Float → signed fixed-point (int32) with explicit scale.** Each lane computes `VD.w[i] = sat_int32(VB[i] * 2^UIMM)`, truncating toward zero and clamping to `[-2^31, 2^31-1]`. `UIMM` is a 5-bit unsigned bias (range 0..31) that specifies a power-of-two pre-scale on the float value.
- **Use case: fixed-point pipelines.** The `UIMM` pre-scale lets game code convert a `[0.0, 1.0]` float channel into a `uint16`-range fixed-point value in one instruction (e.g. `UIMM = 15` → scale by 32768).
- **Sticky VSCR[SAT]** set whenever a lane clamps (including NaN inputs, which xenia's `cvt_f32_to_i32_sat` treats as 0 and flags saturation).
- **`VSCR[NJ]` honoured** on the float input side.
- **VMX128 register-fusion** applies to `VD` and `VB`: 7-bit register IDs via `VD128l ‖ VD128h` and `VB128l ‖ VB128h`.
- **No IBM AIX entry** — this is Xenon-only. The closest standard Altivec op is [`vctsxs`](../vmx/vctsxs.md).
- **No `Rc`, no XER / FPSCR.**
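
The per-lane behaviour listed above can be sketched in Rust. This is a minimal illustration, not the project's code: `cvt_f32_to_i32_sat_sketch` and `vcfpsxws_lanes` are hypothetical names, and the body of the real `crate::vmx::cvt_f32_to_i32_sat` is not shown in this file, so the NaN handling simply follows the note above (0 with saturation flagged).

```rust
/// Hypothetical stand-in for `crate::vmx::cvt_f32_to_i32_sat` (assumed behaviour).
/// Returns the converted lane and whether it saturated.
fn cvt_f32_to_i32_sat_sketch(x: f32, uimm: u32) -> (i32, bool) {
    if x.is_nan() {
        return (0, true); // NaN -> 0 with SAT, per the notes above
    }
    // Scale in f64: multiplying an f32 by 2^uimm is exact there.
    let scaled = (x as f64) * ((1u64 << (uimm & 0x1F)) as f64);
    let t = scaled.trunc(); // truncate toward zero
    if t > i32::MAX as f64 {
        (i32::MAX, true)
    } else if t < i32::MIN as f64 {
        (i32::MIN, true)
    } else {
        (t as i32, false)
    }
}

/// Applies the conversion to four lanes and ORs the saturation flags,
/// mirroring the loop in the interpreter snapshot.
fn vcfpsxws_lanes(b: [f32; 4], uimm: u32) -> ([i32; 4], bool) {
    let mut r = [0i32; 4];
    let mut sat = false;
    for i in 0..4 {
        let (v, s) = cvt_f32_to_i32_sat_sketch(b[i], uimm);
        r[i] = v;
        sat |= s;
    }
    (r, sat)
}
```

With `UIMM = 15`, an input of `1.0` yields `32768` without saturation, while `1.0` with `UIMM = 31` clamps to `i32::MAX` and would set the sticky flag.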
## Related Instructions
- [`vctsxs`](../vmx/vctsxs.md) — the standard Altivec equivalent (same semantics, 32-register file).
- [`vcfpuxws128`](vcfpuxws128.md) — unsigned variant (clamps to `uint32`).
- [`vcsxwfp128`](vcsxwfp128.md), [`vcuxwfp128`](vcuxwfp128.md) — the inverse (int → float with scale).
- [`vrfiz`](../vmx/vrfiz.md) — plain truncate-to-float-integer without scale.
## IBM Reference
- No IBM AIX entry — this instruction is exclusive to the Xbox 360's VMX128 extension.
- Xbox 360 XDK, Altivec-128 (VMX128) extensions (Microsoft internal documentation); semantics cross-referenced with [IBM AltiVec Technology Programmer's Interface Manual §`vctsxs`](https://www.nxp.com/docs/en/reference-manual/ALTIVECPIM.pdf).

@@ -0,0 +1,137 @@
# `vcfpuxws128` — Vector128 Convert From Floating-Point to Unsigned Fixed-Point Word Saturate
> **Category:** [VMX128](../categories/vmx128.md) · **Form:** [VX128_3](../forms/VX128_3.md) · **Opcode:** `0x18000270`
<!-- GENERATED: BEGIN -->
## Assembler Mnemonics
| Mnemonic | XML entry | Flags | Description |
| --- | --- | --- | --- |
| `vcfpuxws128` | `vcfpuxws128` | — | Vector128 Convert From Floating-Point to Unsigned Fixed-Point Word Saturate |
## Syntax
```asm
vcfpuxws128 [VD], [VB], [UIMM]
```
## Encoding
### `vcfpuxws128` — form `VX128_3`
- **Opcode word:** `0x18000270`
- **Primary opcode (bits 0–5):** `6`
- **Extended opcode:** `624`
- **Synchronising:** no
| Bits | Field | Meaning |
| --- | --- | --- |
| 0–5 | `OPCD` | primary opcode (6) |
| 6–10 | `VD128l` | destination low 5 bits |
| 11–15 | `IMM` | 5-bit immediate |
| 16–20 | `VB128l` | source B low 5 bits |
| 21–27 | `XO` | extended opcode |
| 28–29 | `VD128h` | destination high 2 bits |
| 30–31 | `VB128h` | source B high 2 bits |
## Operands
| Field | Role | Description |
| --- | --- | --- |
| `VB` | vcfpuxws128: read | Source B vector register. |
| `UIMM` | vcfpuxws128: read | 5-bit unsigned immediate (0..31), taken from the `IMM` field. Zero-extended. |
| `VD` | vcfpuxws128: write | Destination vector register. |
| `VSCR` | vcfpuxws128: write | Vector Status and Control Register (NJ/SAT bits). |
## Register Effects
### `vcfpuxws128`
- **Reads (always):** `VB`, `UIMM`
- **Reads (conditional):** _none_
- **Writes (always):** `VD`, `VSCR`
- **Writes (conditional):** _none_
## Status-Register Effects
- `vcfpuxws128`: **VSCR[SAT]** may be stickied on saturating vector operations.
## Operation (pseudocode)
```
; Derived from the xenia-rs interpreter arm (see Implementation References).
uimm <- instr[11:15]                          ; 5-bit scale, 0..31
for i in 0..3:
    x         <- VB.f32[i] * 2^uimm
    VD.u32[i] <- saturate_u32(truncate(x))    ; clamp to [0, 2^32-1]
    if clamped: VSCR[SAT] <- 1                ; sticky
; NaN handling lives inside cvt_f32_to_u32_sat (see Special Cases).
```
## C Translation Example
```c
/* C translation: the xenia-rs interpreter arm below in */
/* Implementation References is the authoritative semantic */
/* snapshot. Translate it line-by-line: */
/* - ctx.gpr[N] -> r[N] (or f[]/v[] for FPRs/VRs) */
/* - mem.read_uN / write_uN -> mem_read_uN_be / mem_write_uN_be */
/* - ctx.update_cr_signed(fld, v) -> update_cr_signed(fld, v) */
/* - ctx.xer_ca / xer_ov / xer_so -> xer.CA / xer.OV / xer.SO */
/* The Register Effects and Status-Register Effects tables above */
/* enumerate every side effect a faithful translation must emit. */
```
## Implementation References
**`vcfpuxws128`**
- xenia-canary XML: [`tools/ppc-instructions.xml` — search for `mnem="vcfpuxws128"`](../../xenia-canary/tools/ppc-instructions.xml)
- xenia-canary emit: [`src/xenia/cpu/ppc/ppc_emit_altivec.cc:557`](../../xenia-canary/src/xenia/cpu/ppc/ppc_emit_altivec.cc#L557)
- xenia-rs opcode: [`crates/xenia-cpu/src/opcode.rs:93`](../../xenia-rs/crates/xenia-cpu/src/opcode.rs#L93)
- xenia-rs decoder: [`crates/xenia-cpu/src/decoder.rs:657`](../../xenia-rs/crates/xenia-cpu/src/decoder.rs#L657)
- xenia-rs interpreter: [`crates/xenia-cpu/src/interpreter.rs:4335-4346`](../../xenia-rs/crates/xenia-cpu/src/interpreter.rs#L4335-L4346)
<details><summary>xenia-rs interpreter body (frozen snapshot)</summary>
```rust
PpcOpcode::vcfpuxws128 => {
    let uimm = (instr.raw >> 16) & 0x1F;
    let b = ctx.vr[instr.vb128()].as_f32x4();
    let mut r = [0u32; 4]; let mut sat = false;
    for i in 0..4 {
        let (v, s) = crate::vmx::cvt_f32_to_u32_sat(b[i], uimm);
        r[i] = v; sat |= s;
    }
    if sat { ctx.set_vscr_sat(true); }
    ctx.vr[instr.vd128()] = xenia_types::Vec128::from_u32x4_array(r);
    ctx.pc += 4;
}
```
</details>
<!-- GENERATED: END -->
## Special Cases & Edge Conditions
- **Float → unsigned fixed-point (uint32) with explicit scale.** Each lane computes `VD.w[i] = sat_uint32(VB[i] * 2^UIMM)`, truncating toward zero and clamping to `[0, 2^32-1]`. `UIMM` is a 5-bit unsigned bias (range 0..31).
- **Negative floats clamp to 0** and sticky-set `VSCR[SAT]`.
- **NaN inputs** → 0 with `VSCR[SAT]` set (xenia's `cvt_f32_to_u32_sat`).
- **`VSCR[NJ]` honoured** for denormal inputs.
- **VMX128 register-fusion** applies to `VD` and `VB` (7-bit IDs).
- **No IBM AIX entry** — Xenon-only.
- **No `Rc`, no XER / FPSCR.**
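
A minimal Rust sketch of the unsigned clamp described above. `cvt_f32_to_u32_sat_sketch` is a hypothetical stand-in: the real `crate::vmx::cvt_f32_to_u32_sat` is not reproduced in this file, so the NaN rule follows the bullet above, and the treatment of inputs in `(-1, 0)` (which truncate to zero before the range check here) is an assumption of this sketch.

```rust
/// Hypothetical stand-in for `crate::vmx::cvt_f32_to_u32_sat` (assumed behaviour).
/// Returns the converted lane and whether it saturated.
fn cvt_f32_to_u32_sat_sketch(x: f32, uimm: u32) -> (u32, bool) {
    if x.is_nan() {
        return (0, true); // NaN -> 0 with SAT, per the notes above
    }
    // Scale in f64 so the power-of-two multiply is exact, then truncate.
    let t = ((x as f64) * ((1u64 << (uimm & 0x1F)) as f64)).trunc();
    if t < 0.0 {
        (0, true) // negative results clamp to 0
    } else if t > u32::MAX as f64 {
        (u32::MAX, true) // too large for uint32
    } else {
        (t as u32, false)
    }
}
```

Note that `1.0` with `UIMM = 31` fits (`0x80000000`), which is the main observable difference from the signed variant.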
## Related Instructions
- [`vctuxs`](../vmx/vctuxs.md) — the standard Altivec equivalent (uint32 clamp with scale).
- [`vcfpsxws128`](vcfpsxws128.md) — signed variant.
- [`vcuxwfp128`](vcuxwfp128.md) — the inverse (uint → float with scale).
- [`vrfiz`](../vmx/vrfiz.md) — plain truncate-to-integer-float without scale.
## IBM Reference
- No IBM AIX entry — this instruction is exclusive to the Xbox 360's VMX128 extension.
- Xbox 360 XDK, Altivec-128 (VMX128) extensions (Microsoft internal documentation); cross-referenced with [IBM AltiVec Technology Programmer's Interface Manual §`vctuxs`](https://www.nxp.com/docs/en/reference-manual/ALTIVECPIM.pdf).

@@ -0,0 +1,131 @@
# `vcsxwfp128` — Vector128 Convert From Signed Fixed-Point Word to Floating-Point
> **Category:** [VMX128](../categories/vmx128.md) · **Form:** [VX128_3](../forms/VX128_3.md) · **Opcode:** `0x180002b0`
<!-- GENERATED: BEGIN -->
## Assembler Mnemonics
| Mnemonic | XML entry | Flags | Description |
| --- | --- | --- | --- |
| `vcsxwfp128` | `vcsxwfp128` | — | Vector128 Convert From Signed Fixed-Point Word to Floating-Point |
## Syntax
```asm
vcsxwfp128 [VD], [VB], [UIMM]
```
## Encoding
### `vcsxwfp128` — form `VX128_3`
- **Opcode word:** `0x180002b0`
- **Primary opcode (bits 0–5):** `6`
- **Extended opcode:** `688`
- **Synchronising:** no
| Bits | Field | Meaning |
| --- | --- | --- |
| 0–5 | `OPCD` | primary opcode (6) |
| 6–10 | `VD128l` | destination low 5 bits |
| 11–15 | `IMM` | 5-bit immediate |
| 16–20 | `VB128l` | source B low 5 bits |
| 21–27 | `XO` | extended opcode |
| 28–29 | `VD128h` | destination high 2 bits |
| 30–31 | `VB128h` | source B high 2 bits |
## Operands
| Field | Role | Description |
| --- | --- | --- |
| `VB` | vcsxwfp128: read | Source B vector register. |
| `UIMM` | vcsxwfp128: read | 5-bit unsigned immediate (0..31), taken from the `IMM` field. Zero-extended. |
| `VD` | vcsxwfp128: write | Destination vector register. |
## Register Effects
### `vcsxwfp128`
- **Reads (always):** `VB`, `UIMM`
- **Reads (conditional):** _none_
- **Writes (always):** `VD`
- **Writes (conditional):** _none_
## Status-Register Effects
_No condition-register or status-register effects._
## Operation (pseudocode)
```
; Derived from the xenia-rs interpreter arm (see Implementation References).
uimm <- instr[11:15]                          ; 5-bit scale, 0..31
for i in 0..3:
    VD.f32[i] <- float(VB.s32[i]) * 2^-uimm   ; round-to-nearest
; No saturation; VSCR is not written.
```
## C Translation Example
```c
/* C translation: the xenia-rs interpreter arm below in */
/* Implementation References is the authoritative semantic */
/* snapshot. Translate it line-by-line: */
/* - ctx.gpr[N] -> r[N] (or f[]/v[] for FPRs/VRs) */
/* - mem.read_uN / write_uN -> mem_read_uN_be / mem_write_uN_be */
/* - ctx.update_cr_signed(fld, v) -> update_cr_signed(fld, v) */
/* - ctx.xer_ca / xer_ov / xer_so -> xer.CA / xer.OV / xer.SO */
/* The Register Effects and Status-Register Effects tables above */
/* enumerate every side effect a faithful translation must emit. */
```
## Implementation References
**`vcsxwfp128`**
- xenia-canary XML: [`tools/ppc-instructions.xml` — search for `mnem="vcsxwfp128"`](../../xenia-canary/tools/ppc-instructions.xml)
- xenia-canary emit: [`src/xenia/cpu/ppc/ppc_emit_altivec.cc:503`](../../xenia-canary/src/xenia/cpu/ppc/ppc_emit_altivec.cc#L503)
- xenia-rs opcode: [`crates/xenia-cpu/src/opcode.rs:98`](../../xenia-rs/crates/xenia-cpu/src/opcode.rs#L98)
- xenia-rs decoder: [`crates/xenia-cpu/src/decoder.rs:658`](../../xenia-rs/crates/xenia-cpu/src/decoder.rs#L658)
- xenia-rs interpreter: [`crates/xenia-cpu/src/interpreter.rs:4347-4354`](../../xenia-rs/crates/xenia-cpu/src/interpreter.rs#L4347-L4354)
<details><summary>xenia-rs interpreter body (frozen snapshot)</summary>
```rust
PpcOpcode::vcsxwfp128 => {
    let uimm = (instr.raw >> 16) & 0x1F;
    let b = crate::vmx::as_i32x4(ctx.vr[instr.vb128()]);
    let mut r = [0f32; 4];
    for i in 0..4 { r[i] = crate::vmx::cvt_i32_to_f32(b[i], uimm); }
    ctx.vr[instr.vd128()] = xenia_types::Vec128::from_f32x4_array(r);
    ctx.pc += 4;
}
```
</details>
<!-- GENERATED: END -->
## Special Cases & Edge Conditions
- **Signed fixed-point (int32) → float with explicit scale.** Each lane computes `VD[i] = (float)VB.w[i] * 2^-UIMM` (equivalently `(int32)VB.w[i] / 2^UIMM`). `UIMM` is a 5-bit unsigned bias that specifies a post-scale — the inverse direction of [`vcfpsxws128`](vcfpsxws128.md), so the `UIMM`s should match for a round-trip.
- **IEEE-754 binary32 output, round-to-nearest.** Values outside the exactly-representable range (`|x| > 2^24`) lose low-order bits; no saturation on the float side.
- **No `VSCR[SAT]` effect** — conversion in this direction never saturates.
- **`VSCR[NJ]` does not affect the int → float path.**
- **VMX128 register-fusion** applies (7-bit register IDs).
- **No IBM AIX entry** — Xenon-only. Closest standard Altivec op is [`vcfsx`](../vmx/vcfsx.md).
- **No `Rc`, no XER / FPSCR.**
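
The scale and precision notes above can be illustrated with a short Rust sketch. `cvt_i32_to_f32_sketch` is a hypothetical name; the real `crate::vmx::cvt_i32_to_f32` is not shown in this file, so this only assumes the formula stated in the first bullet.

```rust
/// Hypothetical stand-in for `crate::vmx::cvt_i32_to_f32` (assumed behaviour):
/// convert the signed word to f32, then scale by 2^-uimm.
fn cvt_i32_to_f32_sketch(v: i32, uimm: u32) -> f32 {
    // 2^uimm and its reciprocal are exact powers of two in f32, so the only
    // rounding happens in the int -> float cast (round-to-nearest-even).
    let inv_scale = 1.0f32 / ((1u64 << (uimm & 0x1F)) as f32);
    (v as f32) * inv_scale
}
```

For example, `32768` with `UIMM = 15` gives `1.0`, undoing a `vcfpsxws128` conversion that used the same `UIMM`, while `2^24 + 1` with `UIMM = 0` comes back as `2^24` because the low bit does not fit in the 24-bit significand.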
## Related Instructions
- [`vcfsx`](../vmx/vcfsx.md) — the standard Altivec `int32 → float` with scale.
- [`vcuxwfp128`](vcuxwfp128.md) — unsigned-int variant.
- [`vcfpsxws128`](vcfpsxws128.md), [`vcfpuxws128`](vcfpuxws128.md) — the inverse (float → int with scale).
## IBM Reference
- No IBM AIX entry — Xbox 360 VMX128 extension only.
- Xbox 360 XDK, Altivec-128 (VMX128) extensions; cross-referenced with [IBM AltiVec Technology Programmer's Interface Manual §`vcfsx`](https://www.nxp.com/docs/en/reference-manual/ALTIVECPIM.pdf).

@@ -0,0 +1,131 @@
# `vcuxwfp128` — Vector128 Convert From Unsigned Fixed-Point Word to Floating-Point
> **Category:** [VMX128](../categories/vmx128.md) · **Form:** [VX128_3](../forms/VX128_3.md) · **Opcode:** `0x180002f0`
<!-- GENERATED: BEGIN -->
## Assembler Mnemonics
| Mnemonic | XML entry | Flags | Description |
| --- | --- | --- | --- |
| `vcuxwfp128` | `vcuxwfp128` | — | Vector128 Convert From Unsigned Fixed-Point Word to Floating-Point |
## Syntax
```asm
vcuxwfp128 [VD], [VB], [UIMM]
```
## Encoding
### `vcuxwfp128` — form `VX128_3`
- **Opcode word:** `0x180002f0`
- **Primary opcode (bits 0–5):** `6`
- **Extended opcode:** `752`
- **Synchronising:** no
| Bits | Field | Meaning |
| --- | --- | --- |
| 0–5 | `OPCD` | primary opcode (6) |
| 6–10 | `VD128l` | destination low 5 bits |
| 11–15 | `IMM` | 5-bit immediate |
| 16–20 | `VB128l` | source B low 5 bits |
| 21–27 | `XO` | extended opcode |
| 28–29 | `VD128h` | destination high 2 bits |
| 30–31 | `VB128h` | source B high 2 bits |
## Operands
| Field | Role | Description |
| --- | --- | --- |
| `VB` | vcuxwfp128: read | Source B vector register. |
| `UIMM` | vcuxwfp128: read | 5-bit unsigned immediate (0..31), taken from the `IMM` field. Zero-extended. |
| `VD` | vcuxwfp128: write | Destination vector register. |
## Register Effects
### `vcuxwfp128`
- **Reads (always):** `VB`, `UIMM`
- **Reads (conditional):** _none_
- **Writes (always):** `VD`
- **Writes (conditional):** _none_
## Status-Register Effects
_No condition-register or status-register effects._
## Operation (pseudocode)
```
; Derived from the xenia-rs interpreter arm (see Implementation References).
uimm <- instr[11:15]                          ; 5-bit scale, 0..31
for i in 0..3:
    VD.f32[i] <- float(VB.u32[i]) * 2^-uimm   ; round-to-nearest
; No saturation; VSCR is not written.
```
## C Translation Example
```c
/* C translation: the xenia-rs interpreter arm below in */
/* Implementation References is the authoritative semantic */
/* snapshot. Translate it line-by-line: */
/* - ctx.gpr[N] -> r[N] (or f[]/v[] for FPRs/VRs) */
/* - mem.read_uN / write_uN -> mem_read_uN_be / mem_write_uN_be */
/* - ctx.update_cr_signed(fld, v) -> update_cr_signed(fld, v) */
/* - ctx.xer_ca / xer_ov / xer_so -> xer.CA / xer.OV / xer.SO */
/* The Register Effects and Status-Register Effects tables above */
/* enumerate every side effect a faithful translation must emit. */
```
## Implementation References
**`vcuxwfp128`**
- xenia-canary XML: [`tools/ppc-instructions.xml` — search for `mnem="vcuxwfp128"`](../../xenia-canary/tools/ppc-instructions.xml)
- xenia-canary emit: [`src/xenia/cpu/ppc/ppc_emit_altivec.cc:521`](../../xenia-canary/src/xenia/cpu/ppc/ppc_emit_altivec.cc#L521)
- xenia-rs opcode: [`crates/xenia-cpu/src/opcode.rs:98`](../../xenia-rs/crates/xenia-cpu/src/opcode.rs#L98)
- xenia-rs decoder: [`crates/xenia-cpu/src/decoder.rs:659`](../../xenia-rs/crates/xenia-cpu/src/decoder.rs#L659)
- xenia-rs interpreter: [`crates/xenia-cpu/src/interpreter.rs:4355-4362`](../../xenia-rs/crates/xenia-cpu/src/interpreter.rs#L4355-L4362)
<details><summary>xenia-rs interpreter body (frozen snapshot)</summary>
```rust
PpcOpcode::vcuxwfp128 => {
    let uimm = (instr.raw >> 16) & 0x1F;
    let b = ctx.vr[instr.vb128()].as_u32x4();
    let mut r = [0f32; 4];
    for i in 0..4 { r[i] = crate::vmx::cvt_u32_to_f32(b[i], uimm); }
    ctx.vr[instr.vd128()] = xenia_types::Vec128::from_f32x4_array(r);
    ctx.pc += 4;
}
```
</details>
<!-- GENERATED: END -->
## Special Cases & Edge Conditions
- **Unsigned fixed-point (uint32) → float with explicit scale.** Each lane computes `VD[i] = (float)VB.w[i] * 2^-UIMM`. Treats the 32-bit input as unsigned, so values ≥ `0x80000000` produce positive floats (unlike `vcsxwfp128` which would produce negatives).
- **IEEE-754 binary32 output, round-to-nearest.** Precision loss above `2^24`.
- **No `VSCR[SAT]` effect.**
- **`VSCR[NJ]` does not affect the uint → float path.**
- **VMX128 register-fusion** applies.
- **No IBM AIX entry** — Xenon-only. Closest standard is [`vcfux`](../vmx/vcfux.md).
- **No `Rc`, no XER / FPSCR.**
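
The signed/unsigned distinction noted above is easiest to see side by side. The following Rust sketch uses hypothetical helper names and assumes only the formula from the first bullet; it is not the project's `crate::vmx::cvt_u32_to_f32`.

```rust
/// Hypothetical stand-in for `crate::vmx::cvt_u32_to_f32` (assumed behaviour):
/// treat the word as unsigned, convert, then scale by 2^-uimm.
fn cvt_u32_to_f32_sketch(v: u32, uimm: u32) -> f32 {
    let inv_scale = 1.0f32 / ((1u64 << (uimm & 0x1F)) as f32);
    (v as f32) * inv_scale
}

/// Same bit pattern reinterpreted as signed, for comparison with the
/// `vcsxwfp128` reading of the word (illustrative helper only).
fn same_bits_as_signed_sketch(v: u32, uimm: u32) -> f32 {
    let inv_scale = 1.0f32 / ((1u64 << (uimm & 0x1F)) as f32);
    ((v as i32) as f32) * inv_scale
}
```

With `UIMM = 31`, the word `0x80000000` converts to `1.0` here but to `-1.0` under the signed reading.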
## Related Instructions
- [`vcfux`](../vmx/vcfux.md) — the standard Altivec `uint32 → float` with scale.
- [`vcsxwfp128`](vcsxwfp128.md) — signed-int variant.
- [`vcfpuxws128`](vcfpuxws128.md), [`vcfpsxws128`](vcfpsxws128.md) — the inverse (float → int with scale).
## IBM Reference
- No IBM AIX entry — Xbox 360 VMX128 extension only.
- Xbox 360 XDK, Altivec-128 (VMX128) extensions; cross-referenced with [IBM AltiVec Technology Programmer's Interface Manual §`vcfux`](https://www.nxp.com/docs/en/reference-manual/ALTIVECPIM.pdf).

@@ -0,0 +1,148 @@
# `vmaddcfp128` — Vector128 Multiply Add Floating Point
> **Category:** [VMX128](../categories/vmx128.md) · **Form:** [VX128](../forms/VX128.md) · **Opcode:** `0x14000110`
<!-- GENERATED: BEGIN -->
## Assembler Mnemonics
| Mnemonic | XML entry | Flags | Description |
| --- | --- | --- | --- |
| `vmaddcfp128` | `vmaddcfp128` | — | Vector128 Multiply Add Floating Point |
## Syntax
```asm
vmaddcfp128 [VD], [VA], [VD], [VB]
```
## Encoding
### `vmaddcfp128` — form `VX128`
- **Opcode word:** `0x14000110`
- **Primary opcode (bits 0–5):** `5`
- **Extended opcode:** `272`
- **Synchronising:** no
| Bits | Field | Meaning |
| --- | --- | --- |
| 0–5 | `OPCD` | primary opcode (4 or 5) |
| 6–10 | `VD128l` | destination low 5 bits |
| 11–15 | `VA128l` | source A low 5 bits |
| 16–20 | `VB128l` | source B low 5 bits |
| 21 | `VA128H` | source A high bit |
| 22 | `—` | reserved |
| 23–25 | `VC` | optional VC / XO sub-field |
| 26 | `VA128h` | source A middle bit |
| 27 | `—` | reserved |
| 28–29 | `VD128h` | destination high 2 bits |
| 30–31 | `VB128h` | source B high 2 bits |
## Operands
| Field | Role | Description |
| --- | --- | --- |
| `VA` | vmaddcfp128: read | Source A vector register. |
| `VD` | vmaddcfp128: read; vmaddcfp128: write | Destination vector register. |
| `VB` | vmaddcfp128: read | Source B vector register. |
## Register Effects
### `vmaddcfp128`
- **Reads (always):** `VA`, `VD`, `VB`
- **Reads (conditional):** _none_
- **Writes (always):** `VD`
- **Writes (conditional):** _none_
## Status-Register Effects
_No condition-register or status-register effects._
## Operation (pseudocode)
```
; Derived from the xenia-rs interpreter arm (see Implementation References).
for i in 0..3:
    a <- flush_denorm(VA.f32[i])
    b <- flush_denorm(VB.f32[i])
    d <- flush_denorm(VD.f32[i])              ; VD is read before it is written
    VD.f32[i] <- flush_denorm(fma(a, d, b))   ; (VA * VD) + VB, single rounding
```
## C Translation Example
```c
/* C translation: the xenia-rs interpreter arm below in */
/* Implementation References is the authoritative semantic */
/* snapshot. Translate it line-by-line: */
/* - ctx.gpr[N] -> r[N] (or f[]/v[] for FPRs/VRs) */
/* - mem.read_uN / write_uN -> mem_read_uN_be / mem_write_uN_be */
/* - ctx.update_cr_signed(fld, v) -> update_cr_signed(fld, v) */
/* - ctx.xer_ca / xer_ov / xer_so -> xer.CA / xer.OV / xer.SO */
/* The Register Effects and Status-Register Effects tables above */
/* enumerate every side effect a faithful translation must emit. */
```
## Implementation References
**`vmaddcfp128`**
- xenia-canary XML: [`tools/ppc-instructions.xml` — search for `mnem="vmaddcfp128"`](../../xenia-canary/tools/ppc-instructions.xml)
- xenia-canary emit: [`src/xenia/cpu/ppc/ppc_emit_altivec.cc:812`](../../xenia-canary/src/xenia/cpu/ppc/ppc_emit_altivec.cc#L812)
- xenia-rs opcode: [`crates/xenia-cpu/src/opcode.rs:100`](../../xenia-rs/crates/xenia-cpu/src/opcode.rs#L100)
- xenia-rs decoder: [`crates/xenia-cpu/src/decoder.rs:614`](../../xenia-rs/crates/xenia-cpu/src/decoder.rs#L614)
- xenia-rs interpreter: [`crates/xenia-cpu/src/interpreter.rs:4492-4509`](../../xenia-rs/crates/xenia-cpu/src/interpreter.rs#L4492-L4509)
<details><summary>xenia-rs interpreter body (frozen snapshot)</summary>
```rust
PpcOpcode::vmaddcfp128 => {
    // ISA: (VD) <- (VA × VD) + VB. Canary InstrEmit_vmaddcfp128 (cc:819): MulAdd(VA, VD, VB).
    // Previous code computed di.mul_add(bi, ai) = VD×VB+VA: both operands wrong
    // (PPCBUG-425). Fix: ai.mul_add(di, bi) = VA×VD+VB.
    let a = ctx.vr[instr.va128()].as_f32x4();
    let b = ctx.vr[instr.vb128()].as_f32x4();
    let d = ctx.vr[instr.vd128()].as_f32x4();
    let mut r = [0f32; 4];
    for i in 0..4 {
        let ai = vmx::flush_denorm(a[i]);
        let bi = vmx::flush_denorm(b[i]);
        let di = vmx::flush_denorm(d[i]);
        // PPCBUG-437: flush subnormal output too.
        r[i] = vmx::flush_denorm(ai.mul_add(di, bi));
    }
    ctx.vr[instr.vd128()] = xenia_types::Vec128::from_f32x4_array(r);
    ctx.pc += 4;
}
```
</details>
<!-- GENERATED: END -->
## Special Cases & Edge Conditions
- **Xbox-specific fused multiply-add variant.** Each lane computes `VD[i] = VA[i] * VD[i] + VB[i]`, as in the interpreter snapshot (`ai.mul_add(di, bi)`) and canary's `MulAdd(VA, VD, VB)`. `VD` is both source and destination (xenia reads `VD` first, then writes). Relative to the standard [`vmaddfp`](../vmx/vmaddfp.md), which computes `(VA × VC) + VB`, the second multiplicand comes from `VD` instead of a separate `VC` operand; the addend is still `VB`. The mnemonic's trailing `c` marks `VD` standing in for that `VC` operand.
- **Fused, single-rounding.** Xenia uses `f32::mul_add`, which rounds once and compiles to a host FMA instruction when the target has one; without hardware FMA the result is the same but slower. Inputs and the result are passed through `vmx::flush_denorm`. xenia-canary's emitter issues the equivalent `MulAdd` node.
- **IEEE-754 binary32 lanes; `VSCR[NJ]` honoured.**
- **No VSCR[SAT], no FPSCR update.**
- **NaN propagation** per IEEE-754.
- **VMX128 register-fusion** (7-bit IDs on `VA`, `VB`, `VD`).
- **No IBM AIX entry** — Xenon-only.
- **No `Rc`, no XER.**
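
The operand order is easy to get wrong (see PPCBUG-425 in the snapshot), so here is a small standalone Rust sketch of the lane loop. `flush_denorm_sketch` is a hypothetical stand-in for `vmx::flush_denorm`, assumed here to replace subnormals with a zero of the same sign; `vmaddcfp_lanes` is likewise an illustrative name.

```rust
/// Hypothetical stand-in for `vmx::flush_denorm` (assumed behaviour):
/// subnormal values become zero with the original sign.
fn flush_denorm_sketch(x: f32) -> f32 {
    if x.is_subnormal() { 0.0f32.copysign(x) } else { x }
}

/// Lane-wise (VA * VD) + VB with a single rounding per lane.
/// `d` is the old contents of VD; the return value is the new VD.
fn vmaddcfp_lanes(a: [f32; 4], b: [f32; 4], d: [f32; 4]) -> [f32; 4] {
    let mut r = [0f32; 4];
    for i in 0..4 {
        let ai = flush_denorm_sketch(a[i]);
        let bi = flush_denorm_sketch(b[i]);
        let di = flush_denorm_sketch(d[i]);
        // mul_add(self, x, y) computes self * x + y, fused.
        r[i] = flush_denorm_sketch(ai.mul_add(di, bi));
    }
    r
}
```

With `VA = 2`, `VB = 10` and `VD = 3` in every lane the result is `16`; the pre-fix order `VD × VB + VA` would have produced `32`, which makes this a convenient regression check.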
## Related Instructions
- [`vmaddfp`](../vmx/vmaddfp.md), [`vmaddfp128`](../vmx/vmaddfp.md) — standard fused `(VA × VC) + VB`.
- [`vmulfp128`](vmulfp128.md) — plain lane-wise float multiply.
- [`vnmsubfp`](../vmx/vnmsubfp.md) — negative-multiply-subtract.
- [`vmsum3fp128`](vmsum3fp128.md), [`vmsum4fp128`](vmsum4fp128.md) — dot-product reductions.
## IBM Reference
- No IBM AIX entry: this is an Xbox 360 VMX128 extension. Its semantics differ from the base Altivec [`vmaddfp`](../vmx/vmaddfp.md) in that the second multiplicand comes from `VD` rather than from a separate `VC` operand.
- Xbox 360 XDK, Altivec-128 (VMX128) extensions.
- [IBM AltiVec Technology Programmer's Interface Manual, Chapter 5 — Floating-Point Arithmetic](https://www.nxp.com/docs/en/reference-manual/ALTIVECPIM.pdf) for the base FMA semantics.

@@ -0,0 +1,141 @@
# `vmsum3fp128` — Vector128 Multiply Sum 3-way Floating Point
> **Category:** [VMX128](../categories/vmx128.md) · **Form:** [VX128](../forms/VX128.md) · **Opcode:** `0x14000190`
<!-- GENERATED: BEGIN -->
## Assembler Mnemonics
| Mnemonic | XML entry | Flags | Description |
| --- | --- | --- | --- |
| `vmsum3fp128` | `vmsum3fp128` | — | Vector128 Multiply Sum 3-way Floating Point |
## Syntax
```asm
vmsum3fp128 [VD], [VA], [VB]
```
## Encoding
### `vmsum3fp128` — form `VX128`
- **Opcode word:** `0x14000190`
- **Primary opcode (bits 0–5):** `5`
- **Extended opcode:** `400`
- **Synchronising:** no
| Bits | Field | Meaning |
| --- | --- | --- |
| 0–5 | `OPCD` | primary opcode (4 or 5) |
| 6–10 | `VD128l` | destination low 5 bits |
| 11–15 | `VA128l` | source A low 5 bits |
| 16–20 | `VB128l` | source B low 5 bits |
| 21 | `VA128H` | source A high bit |
| 22 | `—` | reserved |
| 23–25 | `VC` | optional VC / XO sub-field |
| 26 | `VA128h` | source A middle bit |
| 27 | `—` | reserved |
| 28–29 | `VD128h` | destination high 2 bits |
| 30–31 | `VB128h` | source B high 2 bits |
## Operands
| Field | Role | Description |
| --- | --- | --- |
| `VA` | vmsum3fp128: read | Source A vector register. |
| `VB` | vmsum3fp128: read | Source B vector register. |
| `VD` | vmsum3fp128: write | Destination vector register. |
## Register Effects
### `vmsum3fp128`
- **Reads (always):** `VA`, `VB`
- **Reads (conditional):** _none_
- **Writes (always):** `VD`
- **Writes (conditional):** _none_
## Status-Register Effects
_No condition-register or status-register effects._
## Operation (pseudocode)
```
; Derived from the xenia-rs interpreter arm (see Implementation References).
p0 <- flush_denorm(VA.f32[0] * VB.f32[0])
p1 <- flush_denorm(VA.f32[1] * VB.f32[1])
p2 <- flush_denorm(VA.f32[2] * VB.f32[2])
s  <- flush_denorm((p0 + p1) + p2)            ; lane 3 is ignored
VD.f32[0..3] <- s                             ; broadcast to every lane
```
## C Translation Example
```c
/* C translation: the xenia-rs interpreter arm below in */
/* Implementation References is the authoritative semantic */
/* snapshot. Translate it line-by-line: */
/* - ctx.gpr[N] -> r[N] (or f[]/v[] for FPRs/VRs) */
/* - mem.read_uN / write_uN -> mem_read_uN_be / mem_write_uN_be */
/* - ctx.update_cr_signed(fld, v) -> update_cr_signed(fld, v) */
/* - ctx.xer_ca / xer_ov / xer_so -> xer.CA / xer.OV / xer.SO */
/* The Register Effects and Status-Register Effects tables above */
/* enumerate every side effect a faithful translation must emit. */
```
## Implementation References
**`vmsum3fp128`**
- xenia-canary XML: [`tools/ppc-instructions.xml` — search for `mnem="vmsum3fp128"`](../../xenia-canary/tools/ppc-instructions.xml)
- xenia-canary emit: [`src/xenia/cpu/ppc/ppc_emit_altivec.cc:1067`](../../xenia-canary/src/xenia/cpu/ppc/ppc_emit_altivec.cc#L1067)
- xenia-rs opcode: [`crates/xenia-cpu/src/opcode.rs:106`](../../xenia-rs/crates/xenia-cpu/src/opcode.rs#L106)
- xenia-rs decoder: [`crates/xenia-cpu/src/decoder.rs:616`](../../xenia-rs/crates/xenia-cpu/src/decoder.rs#L616)
- xenia-rs interpreter: [`crates/xenia-cpu/src/interpreter.rs:4513-4523`](../../xenia-rs/crates/xenia-cpu/src/interpreter.rs#L4513-L4523)
<details><summary>xenia-rs interpreter body (frozen snapshot)</summary>
```rust
PpcOpcode::vmsum3fp128 => {
    // PPCBUG-436: flush per-product intermediates (not just the final sum).
    let a = ctx.vr[instr.va128()].as_f32x4();
    let b = ctx.vr[instr.vb128()].as_f32x4();
    let p0 = vmx::flush_denorm(a[0] * b[0]);
    let p1 = vmx::flush_denorm(a[1] * b[1]);
    let p2 = vmx::flush_denorm(a[2] * b[2]);
    let s = vmx::flush_denorm(p0 + p1 + p2);
    ctx.vr[instr.vd128()] = xenia_types::Vec128::from_f32x4(s, s, s, s);
    ctx.pc += 4;
}
```
</details>
<!-- GENERATED: END -->
## Special Cases & Edge Conditions
- **3-way float dot product.** Computes `s = VA[0]*VB[0] + VA[1]*VB[1] + VA[2]*VB[2]` (ignoring lane 3 — the "w" component of a homogeneous vector) and **broadcasts `s` to every lane of `VD`**. Typical call site: 3D vector dot products where the w-component is padding.
- **Scalar-result-splatted-across-lanes.** Consuming code can then use any lane of `VD` as the dot-product result.
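- **Rounding.** Xenia rounds each product to binary32 and then performs two sequential adds, `(p0 + p1) + p2`; Rust has no fused triple-add, so every step rounds separately. The order matches the spec, but because float addition is not associative, a different summation order can change the low-order bits of the result, and considerably more when large terms cancel. Games that need deterministic cross-host behaviour typically pre-scale their inputs.
- **IEEE-754 binary32; `VSCR[NJ]` honoured.** In the snapshot, each product and the final sum pass through `vmx::flush_denorm`; the source lanes themselves are not flushed first.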
- **No VSCR[SAT], no FPSCR update.**
- **VMX128 register-fusion** (7-bit IDs on `VA`, `VB`, `VD`).
- **No IBM AIX entry** — Xenon-only.
- **No `Rc`, no XER.**
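
The dot-product and broadcast behaviour listed above can be sketched as a standalone function. This is a minimal sketch modelled on the interpreter snapshot, not xenia-rs code: vectors are plain `[f32; 4]` arrays, and `flush_denorm` is a hypothetical stand-in for `vmx::flush_denorm` that assumes a subnormal is replaced by a zero of the same sign.

```rust
/// Hypothetical stand-in for `vmx::flush_denorm` (assumed behaviour):
/// a value whose exponent field is zero becomes a zero of the same sign.
fn flush_denorm(x: f32) -> f32 {
    let bits = x.to_bits();
    if bits & 0x7f80_0000 == 0 {
        // Keep only the sign bit, giving +0.0 or -0.0.
        f32::from_bits(bits & 0x8000_0000)
    } else {
        x
    }
}

/// 3-way dot product over lanes 0..=2, broadcast to all four lanes.
/// Lane 3 of both inputs is never read.
fn vmsum3fp128(a: [f32; 4], b: [f32; 4]) -> [f32; 4] {
    let p0 = flush_denorm(a[0] * b[0]);
    let p1 = flush_denorm(a[1] * b[1]);
    let p2 = flush_denorm(a[2] * b[2]);
    // Two sequential adds, evaluated left to right: (p0 + p1) + p2.
    let s = flush_denorm(p0 + p1 + p2);
    [s; 4]
}
```

Because lane 3 is never read, a NaN or garbage value in the w component does not affect the result.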
## Related Instructions
- [`vmsum4fp128`](vmsum4fp128.md) — 4-way dot-product (includes the w-lane).
- [`vmulfp128`](vmulfp128.md), [`vaddfp`](../vmx/vaddfp.md) — the building blocks.
- [`vmaddcfp128`](vmaddcfp128.md), [`vmaddfp`](../vmx/vmaddfp.md) — fused MAC variants.
- [`vsumsws`](../vmx/vsumsws.md) — integer sum-reduction analogue.
## IBM Reference
- No IBM AIX entry — Xbox 360 VMX128 extension only.
- Xbox 360 XDK, Altivec-128 (VMX128) extensions. A 3-way dot product is a direct mirror of D3D9's `float3 dot`.
- [IBM AltiVec Technology Programmer's Interface Manual, Chapter 5 — Floating-Point Arithmetic](https://www.nxp.com/docs/en/reference-manual/ALTIVECPIM.pdf) for the base float arithmetic semantics.

@@ -0,0 +1,142 @@
# `vmsum4fp128` — Vector128 Multiply Sum 4-way Floating-Point
> **Category:** [VMX128](../categories/vmx128.md) · **Form:** [VX128](../forms/VX128.md) · **Opcode:** `0x140001d0`
<!-- GENERATED: BEGIN -->
## Assembler Mnemonics
| Mnemonic | XML entry | Flags | Description |
| --- | --- | --- | --- |
| `vmsum4fp128` | `vmsum4fp128` | — | Vector128 Multiply Sum 4-way Floating-Point |
## Syntax
```asm
vmsum4fp128 [VD], [VA], [VB]
```
## Encoding
### `vmsum4fp128` — form `VX128`
- **Opcode word:** `0x140001d0`
- **Primary opcode (bits 0–5):** `5`
- **Extended opcode:** `464`
- **Synchronising:** no
| Bits | Field | Meaning |
| --- | --- | --- |
| 0–5 | `OPCD` | primary opcode (4 or 5) |
| 6–10 | `VD128l` | destination low 5 bits |
| 11–15 | `VA128l` | source A low 5 bits |
| 16–20 | `VB128l` | source B low 5 bits |
| 21 | `VA128H` | source A high bit |
| 22 | `—` | reserved |
| 23–25 | `VC` | optional VC / XO sub-field |
| 26 | `VA128h` | source A middle bit |
| 27 | `—` | reserved |
| 28–29 | `VD128h` | destination high 2 bits |
| 30–31 | `VB128h` | source B high 2 bits |
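
The split register fields in the table can be reassembled into 7-bit register numbers as sketched below. This is an illustrative sketch, not the xenia-rs decoder: `be_field` and `decode_vx128_regs` are hypothetical names, bit positions use the table's numbering (bit 0 is the most significant bit of the word), and the combination order (low 5 bits, then the "middle" bit, then the "high" bit) is an assumption read off the field descriptions.

```rust
/// Extracts the field spanning `first_bit..=last_bit`, where bit 0 is the
/// most significant bit of the 32-bit instruction word, right-aligned.
fn be_field(raw: u32, first_bit: u32, last_bit: u32) -> u32 {
    let width = last_bit - first_bit + 1;
    (raw >> (31 - last_bit)) & ((1u32 << width) - 1)
}

/// Returns the 7-bit (VD, VA, VB) register numbers of a VX128-form word.
fn decode_vx128_regs(raw: u32) -> (u32, u32, u32) {
    // VD = VD128l | VD128h << 5
    let vd = be_field(raw, 6, 10) | (be_field(raw, 28, 29) << 5);
    // VA = VA128l | VA128h << 5 | VA128H << 6
    let va = be_field(raw, 11, 15)
        | (be_field(raw, 26, 26) << 5)
        | (be_field(raw, 21, 21) << 6);
    // VB = VB128l | VB128h << 5
    let vb = be_field(raw, 16, 20) | (be_field(raw, 30, 31) << 5);
    (vd, va, vb)
}
```

Seven bits per operand is what lets VMX128 instructions address 128 vector registers instead of the 32 reachable with a plain 5-bit field.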
## Operands
| Field | Role | Description |
| --- | --- | --- |
| `VA` | vmsum4fp128: read | Source A vector register. |
| `VB` | vmsum4fp128: read | Source B vector register. |
| `VD` | vmsum4fp128: write | Destination vector register. |
## Register Effects
### `vmsum4fp128`
- **Reads (always):** `VA`, `VB`
- **Reads (conditional):** _none_
- **Writes (always):** `VD`
- **Writes (conditional):** _none_
## Status-Register Effects
_No condition-register or status-register effects._
## Operation (pseudocode)
```
; Derived from the xenia-rs interpreter arm (see Implementation References).
p0 <- flush_denorm(VA[0] * VB[0])
p1 <- flush_denorm(VA[1] * VB[1])
p2 <- flush_denorm(VA[2] * VB[2])
p3 <- flush_denorm(VA[3] * VB[3])
s  <- flush_denorm(((p0 + p1) + p2) + p3)
VD[0..3] <- s                        ; broadcast s to every lane
; No status bits are updated.
```
## C Translation Example
```c
/* C translation: the xenia-rs interpreter arm below in */
/* Implementation References is the authoritative semantic */
/* snapshot. Translate it line-by-line: */
/* - ctx.gpr[N] -> r[N] (or f[]/v[] for FPRs/VRs) */
/* - mem.read_uN / write_uN -> mem_read_uN_be / mem_write_uN_be */
/* - ctx.update_cr_signed(fld, v) -> update_cr_signed(fld, v) */
/* - ctx.xer_ca / xer_ov / xer_so -> xer.CA / xer.OV / xer.SO */
/* The Register Effects and Status-Register Effects tables above */
/* enumerate every side effect a faithful translation must emit. */
```
## Implementation References
**`vmsum4fp128`**
- xenia-canary XML: [`tools/ppc-instructions.xml` — search for `mnem="vmsum4fp128"`](../../xenia-canary/tools/ppc-instructions.xml)
- xenia-canary emit: [`src/xenia/cpu/ppc/ppc_emit_altivec.cc:1077`](../../xenia-canary/src/xenia/cpu/ppc/ppc_emit_altivec.cc#L1077)
- xenia-rs opcode: [`crates/xenia-cpu/src/opcode.rs:106`](../../xenia-rs/crates/xenia-cpu/src/opcode.rs#L106)
- xenia-rs decoder: [`crates/xenia-cpu/src/decoder.rs:617`](../../xenia-rs/crates/xenia-cpu/src/decoder.rs#L617)
- xenia-rs interpreter: [`crates/xenia-cpu/src/interpreter.rs:4524-4535`](../../xenia-rs/crates/xenia-cpu/src/interpreter.rs#L4524-L4535)
<details><summary>xenia-rs interpreter body (frozen snapshot)</summary>
```rust
PpcOpcode::vmsum4fp128 => {
    // PPCBUG-436.
    let a = ctx.vr[instr.va128()].as_f32x4();
    let b = ctx.vr[instr.vb128()].as_f32x4();
    let p0 = vmx::flush_denorm(a[0] * b[0]);
    let p1 = vmx::flush_denorm(a[1] * b[1]);
    let p2 = vmx::flush_denorm(a[2] * b[2]);
    let p3 = vmx::flush_denorm(a[3] * b[3]);
    let s = vmx::flush_denorm(p0 + p1 + p2 + p3);
    ctx.vr[instr.vd128()] = xenia_types::Vec128::from_f32x4(s, s, s, s);
    ctx.pc += 4;
}
```
</details>
<!-- GENERATED: END -->
## Special Cases & Edge Conditions
- **4-way float dot product.** Computes `s = VA[0]*VB[0] + VA[1]*VB[1] + VA[2]*VB[2] + VA[3]*VB[3]` (the full xyzw dot) and **broadcasts `s` to every lane of `VD`**.
- **Scalar-result-splatted-across-lanes.** Direct mirror of HLSL/GLSL's `float4 dot`.
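- **Rounding.** Three sequential adds, `((p0 + p1) + p2) + p3`, each rounded to binary32; this is not an FMA in xenia. Because float addition is not associative, the summation order can change the low-order bits of the result, and considerably more when large terms cancel.
- **IEEE-754 binary32; `VSCR[NJ]` honoured.** In the snapshot, each product and the final sum pass through `vmx::flush_denorm`; the source lanes themselves are not flushed first.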
- **No VSCR[SAT], no FPSCR update.**
- **VMX128 register-fusion** (7-bit IDs on `VA`, `VB`, `VD`).
- **No IBM AIX entry** — Xenon-only.
- **No `Rc`, no XER.**
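
The sensitivity to summation order noted under Rounding can be reproduced with a small model of the instruction. This is a sketch under stated assumptions, not xenia-rs code: vectors are plain `[f32; 4]` arrays and `flush_denorm` is a hypothetical stand-in for `vmx::flush_denorm` that assumes a subnormal becomes a same-signed zero.

```rust
/// Hypothetical stand-in for `vmx::flush_denorm` (assumed behaviour):
/// a value whose exponent field is zero becomes a zero of the same sign.
fn flush_denorm(x: f32) -> f32 {
    let bits = x.to_bits();
    if bits & 0x7f80_0000 == 0 {
        f32::from_bits(bits & 0x8000_0000)
    } else {
        x
    }
}

/// 4-way dot product, summed left to right and broadcast to all lanes.
fn vmsum4fp128(a: [f32; 4], b: [f32; 4]) -> [f32; 4] {
    let p0 = flush_denorm(a[0] * b[0]);
    let p1 = flush_denorm(a[1] * b[1]);
    let p2 = flush_denorm(a[2] * b[2]);
    let p3 = flush_denorm(a[3] * b[3]);
    // Three sequential adds: ((p0 + p1) + p2) + p3.
    let s = flush_denorm(p0 + p1 + p2 + p3);
    [s; 4]
}
```

With `b = [1.0; 4]`, the input `[1.0e8, -1.0e8, 1.0, 1.0]` sums to `2.0`, while the same values ordered as `[1.0e8, 1.0, -1.0e8, 1.0]` sum to `1.0`, because `1.0e8 + 1.0` rounds back to `1.0e8` in binary32. An emulator that reorders the adds would therefore not be bit-exact on such inputs.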
## Related Instructions
- [`vmsum3fp128`](vmsum3fp128.md) — 3-way dot-product (ignores the w-lane).
- [`vmulfp128`](vmulfp128.md), [`vaddfp`](../vmx/vaddfp.md) — the building blocks.
- [`vmaddcfp128`](vmaddcfp128.md), [`vmaddfp`](../vmx/vmaddfp.md) — fused MAC variants.
- [`vsumsws`](../vmx/vsumsws.md) — integer sum-reduction analogue.
## IBM Reference
- No IBM AIX entry — Xbox 360 VMX128 extension only.
- Xbox 360 XDK, Altivec-128 (VMX128) extensions. Directly mirrors D3D9's `float4 dot`.
- [IBM AltiVec Technology Programmer's Interface Manual, Chapter 5 — Floating-Point Arithmetic](https://www.nxp.com/docs/en/reference-manual/ALTIVECPIM.pdf) for base float semantics.

@@ -0,0 +1,142 @@
# `vmulfp128` — Vector128 Multiply Floating-Point
> **Category:** [VMX128](../categories/vmx128.md) · **Form:** [VX128](../forms/VX128.md) · **Opcode:** `0x14000090`
<!-- GENERATED: BEGIN -->
## Assembler Mnemonics
| Mnemonic | XML entry | Flags | Description |
| --- | --- | --- | --- |
| `vmulfp128` | `vmulfp128` | — | Vector128 Multiply Floating-Point |
## Syntax
```asm
vmulfp128 [VD], [VA], [VB]
```
## Encoding
### `vmulfp128` — form `VX128`
- **Opcode word:** `0x14000090`
- **Primary opcode (bits 0–5):** `5`
- **Extended opcode:** `144`
- **Synchronising:** no
| Bits | Field | Meaning |
| --- | --- | --- |
| 0–5 | `OPCD` | primary opcode (4 or 5) |
| 6–10 | `VD128l` | destination low 5 bits |
| 11–15 | `VA128l` | source A low 5 bits |
| 16–20 | `VB128l` | source B low 5 bits |
| 21 | `VA128H` | source A high bit |
| 22 | `—` | reserved |
| 23–25 | `VC` | optional VC / XO sub-field |
| 26 | `VA128h` | source A middle bit |
| 27 | `—` | reserved |
| 28–29 | `VD128h` | destination high 2 bits |
| 30–31 | `VB128h` | source B high 2 bits |
## Operands
| Field | Role | Description |
| --- | --- | --- |
| `VA` | vmulfp128: read | Source A vector register. |
| `VB` | vmulfp128: read | Source B vector register. |
| `VD` | vmulfp128: write | Destination vector register. |
## Register Effects
### `vmulfp128`
- **Reads (always):** `VA`, `VB`
- **Reads (conditional):** _none_
- **Writes (always):** `VD`
- **Writes (conditional):** _none_
## Status-Register Effects
_No condition-register or status-register effects._
## Operation (pseudocode)
```
; Derived from the xenia-rs interpreter arm (see Implementation References).
for i in 0..3:
    a     <- flush_denorm(VA[i])
    b     <- flush_denorm(VB[i])
    VD[i] <- flush_denorm(a * b)     ; binary32 multiply, one lane at a time
; No status bits are updated.
```
## C Translation Example
```c
/* C translation: the xenia-rs interpreter arm below in */
/* Implementation References is the authoritative semantic */
/* snapshot. Translate it line-by-line: */
/* - ctx.gpr[N] -> r[N] (or f[]/v[] for FPRs/VRs) */
/* - mem.read_uN / write_uN -> mem_read_uN_be / mem_write_uN_be */
/* - ctx.update_cr_signed(fld, v) -> update_cr_signed(fld, v) */
/* - ctx.xer_ca / xer_ov / xer_so -> xer.CA / xer.OV / xer.SO */
/* The Register Effects and Status-Register Effects tables above */
/* enumerate every side effect a faithful translation must emit. */
```
## Implementation References
**`vmulfp128`**
- xenia-canary XML: [`tools/ppc-instructions.xml` — search for `mnem="vmulfp128"`](../../xenia-canary/tools/ppc-instructions.xml)
- xenia-canary emit: [`src/xenia/cpu/ppc/ppc_emit_altivec.cc:1126`](../../xenia-canary/src/xenia/cpu/ppc/ppc_emit_altivec.cc#L1126)
- xenia-rs opcode: [`crates/xenia-cpu/src/opcode.rs:108`](../../xenia-rs/crates/xenia-cpu/src/opcode.rs#L108)
- xenia-rs decoder: [`crates/xenia-cpu/src/decoder.rs:612`](../../xenia-rs/crates/xenia-cpu/src/decoder.rs#L612)
- xenia-rs interpreter: [`crates/xenia-cpu/src/interpreter.rs:2108-2120`](../../xenia-rs/crates/xenia-cpu/src/interpreter.rs#L2108-L2120)
<details><summary>xenia-rs interpreter body (frozen snapshot)</summary>
```rust
PpcOpcode::vmulfp128 => {
    // PPCBUG-435 + PPCBUG-437.
    let a = ctx.vr[instr.va128()].as_f32x4();
    let b = ctx.vr[instr.vb128()].as_f32x4();
    let mut r = [0f32; 4];
    for i in 0..4 {
        let ai = vmx::flush_denorm(a[i]);
        let bi = vmx::flush_denorm(b[i]);
        r[i] = vmx::flush_denorm(ai * bi);
    }
    ctx.vr[instr.vd128()] = xenia_types::Vec128::from_f32x4_array(r);
    ctx.pc += 4;
}
```
</details>
<!-- GENERATED: END -->
## Special Cases & Edge Conditions
- **Lane-wise float multiply — Xenon-only.** Base Altivec has no dedicated `vmulfp`; the pattern on traditional PowerPC is `vmaddfp vD, vA, vC, v_zero`. Xenon adds this direct instruction, saving the zero-register setup.
- **IEEE-754 binary32, round-to-nearest.** Each of the four lanes computes `VD[i] = VA[i] * VB[i]`.
- **`VSCR[NJ]` honoured** (denormals flush to zero). In the snapshot, both source lanes and the product pass through `vmx::flush_denorm`.
- **NaN propagation** per IEEE-754.
- **No VSCR[SAT], no FPSCR update, no exceptions.**
- **VMX128 register-fusion** (7-bit IDs).
- **No IBM AIX entry** — Xbox-specific; contrast with the `vmaddfp`-with-zero workaround used on non-Xenon Altivec.
- **No `Rc`, no XER.**
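
The lane-wise multiply and its flush-to-zero handling can be modelled as below. This is a minimal sketch patterned on the interpreter snapshot, not xenia-rs code: vectors are plain `[f32; 4]` arrays, and `flush_denorm` is a hypothetical stand-in for `vmx::flush_denorm` that assumes a subnormal becomes a same-signed zero.

```rust
/// Hypothetical stand-in for `vmx::flush_denorm` (assumed behaviour):
/// a value whose exponent field is zero becomes a zero of the same sign.
fn flush_denorm(x: f32) -> f32 {
    let bits = x.to_bits();
    if bits & 0x7f80_0000 == 0 {
        f32::from_bits(bits & 0x8000_0000)
    } else {
        x
    }
}

/// Lane-wise multiply: inputs are flushed, multiplied, and the product
/// is flushed again, mirroring the order used in the snapshot.
fn vmulfp128(a: [f32; 4], b: [f32; 4]) -> [f32; 4] {
    let mut r = [0.0f32; 4];
    for i in 0..4 {
        let ai = flush_denorm(a[i]);
        let bi = flush_denorm(b[i]);
        r[i] = flush_denorm(ai * bi);
    }
    r
}
```

Flushing the inputs matters: a subnormal times a large value would otherwise produce a normal, non-zero product, whereas here the subnormal input is zeroed before the multiply.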
## Related Instructions
- [`vmaddfp`](../vmx/vmaddfp.md), [`vmaddcfp128`](vmaddcfp128.md) — fused MAC forms.
- [`vaddfp`](../vmx/vaddfp.md), [`vsubfp`](../vmx/vsubfp.md) — lane-wise float add/sub.
- [`vmsum3fp128`](vmsum3fp128.md), [`vmsum4fp128`](vmsum4fp128.md) — dot-product reductions.
## IBM Reference
- No IBM AIX entry — this instruction is exclusive to the Xbox 360's VMX128 extension.
- Xbox 360 XDK, Altivec-128 (VMX128) extensions. Non-Xenon Altivec code emits `vmaddfp vD, vA, vC, v_zero` to achieve the same effect.
- [IBM AltiVec Technology Programmer's Interface Manual §`vmaddfp`](https://www.nxp.com/docs/en/reference-manual/ALTIVECPIM.pdf) for the underlying float semantics.

@@ -0,0 +1,139 @@
# `vpermwi128` — Vector128 Permutate Word Immediate
> **Category:** [VMX128](../categories/vmx128.md) · **Form:** [VX128_P](../forms/VX128_P.md) · **Opcode:** `0x18000210`
<!-- GENERATED: BEGIN -->
## Assembler Mnemonics
| Mnemonic | XML entry | Flags | Description |
| --- | --- | --- | --- |
| `vpermwi128` | `vpermwi128` | — | Vector128 Permutate Word Immediate |
## Syntax
```asm
vpermwi128 [VD], [VB], [UIMM]
```
## Encoding
### `vpermwi128` — form `VX128_P`
- **Opcode word:** `0x18000210`
- **Primary opcode (bits 0–5):** `6`
- **Extended opcode:** `528`
- **Synchronising:** no
| Bits | Field | Meaning |
| --- | --- | --- |
| 0–5 | `OPCD` | primary opcode (6) |
| 6–10 | `VD128l` | destination low 5 bits |
| 11–15 | `PERMl` | permute selector low 5 bits |
| 16–20 | `VB128l` | source B low 5 bits |
| 21–22 | `—` | reserved |
| 23–25 | `PERMh` | permute selector high 3 bits |
| 28–29 | `VD128h` | destination high 2 bits |
| 30–31 | `VB128h` | source B high 2 bits |
## Operands
| Field | Role | Description |
| --- | --- | --- |
| `VB` | vpermwi128: read | Source B vector register. |
| `UIMM` | vpermwi128: read | 8-bit unsigned immediate (`PERMh ‖ PERMl`), read as four 2-bit word selectors. |
| `VD` | vpermwi128: write | Destination vector register. |
## Register Effects
### `vpermwi128`
- **Reads (always):** `VB`, `UIMM`
- **Reads (conditional):** _none_
- **Writes (always):** `VD`
- **Writes (conditional):** _none_
## Status-Register Effects
_No condition-register or status-register effects._
## Operation (pseudocode)
```
; Derived from the xenia-rs interpreter arm (see Implementation References).
PERM <- PERMh || PERMl                    ; 8-bit immediate
for i in 0..3:
    sel   <- (PERM >> (2 * (3 - i))) & 3  ; 2-bit selector for output lane i
    VD[i] <- VB[sel]                      ; 32-bit word lanes
; No status bits are updated.
```
## C Translation Example
```c
/* C translation: the xenia-rs interpreter arm below in */
/* Implementation References is the authoritative semantic */
/* snapshot. Translate it line-by-line: */
/* - ctx.gpr[N] -> r[N] (or f[]/v[] for FPRs/VRs) */
/* - mem.read_uN / write_uN -> mem_read_uN_be / mem_write_uN_be */
/* - ctx.update_cr_signed(fld, v) -> update_cr_signed(fld, v) */
/* - ctx.xer_ca / xer_ov / xer_so -> xer.CA / xer.OV / xer.SO */
/* The Register Effects and Status-Register Effects tables above */
/* enumerate every side effect a faithful translation must emit. */
```
## Implementation References
**`vpermwi128`**
- xenia-canary XML: [`tools/ppc-instructions.xml` — search for `mnem="vpermwi128"`](../../xenia-canary/tools/ppc-instructions.xml)
- xenia-canary emit: [`src/xenia/cpu/ppc/ppc_emit_altivec.cc:1207`](../../xenia-canary/src/xenia/cpu/ppc/ppc_emit_altivec.cc#L1207)
- xenia-rs opcode: [`crates/xenia-cpu/src/opcode.rs:112`](../../xenia-rs/crates/xenia-cpu/src/opcode.rs#L112)
- xenia-rs decoder: [`crates/xenia-cpu/src/decoder.rs:642`](../../xenia-rs/crates/xenia-cpu/src/decoder.rs#L642)
- xenia-rs interpreter: [`crates/xenia-cpu/src/interpreter.rs:4537-4548`](../../xenia-rs/crates/xenia-cpu/src/interpreter.rs#L4537-L4548)
<details><summary>xenia-rs interpreter body (frozen snapshot)</summary>
```rust
PpcOpcode::vpermwi128 => {
    let imm = instr.vx128_p_perm();
    let b = ctx.vr[instr.vb128()].as_u32x4();
    let mut r = [0u32; 4];
    // Output lane i ← b[(imm >> (2 * (3-i))) & 3]
    for i in 0..4 {
        let sel = ((imm >> (2 * (3 - i))) & 3) as usize;
        r[i] = b[sel];
    }
    ctx.vr[instr.vd128()] = xenia_types::Vec128::from_u32x4_array(r);
    ctx.pc += 4;
}
```
</details>
<!-- GENERATED: END -->
## Special Cases & Edge Conditions
- **Word-level 4-way permute via an 8-bit immediate.** The 8-bit `PERM` immediate (carried in fields `PERMh ‖ PERMl` of the encoding) is treated as **four 2-bit selectors**, one per output word lane. Each 2-bit field selects which of `VB`'s 4 word lanes is copied to the corresponding output lane.
- **Bit layout of the immediate.** Numbering the 8-bit `PERM` value from its least significant bit (bit 0), output lane 0 (the big-endian most significant word) is selected by bits 6–7; lane 1 by bits 4–5; lane 2 by bits 2–3; lane 3 by bits 0–1. (In xenia: `sel = (imm >> (2 * (3 - i))) & 3`.)
- **Super-set of [`vspltw`](../vmx/vspltw.md).** A splat is `vpermwi128 vD, vB, 0x00` (all lanes = word 0), `0x55` (all = word 1), `0xAA` (all = word 2), `0xFF` (all = word 3). Arbitrary shuffles like "xyzw → wzyx" are a single-instruction operation.
- **Immediate-only.** No dynamic selector vector; contrast with [`vperm`](../vmx/vperm.md).
- **Single-source.** Unlike `vperm`/`vperm128`, `vpermwi128` only reshuffles one register (`VB`); it cannot interleave two operands.
- **VMX128 register-fusion** on `VD` and `VB` (7-bit IDs).
- **No IBM AIX entry** — Xenon-only.
- **No `Rc`, no XER, no VSCR.**
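
The selector layout described above can be exercised with a small model. This is an illustrative sketch, not xenia-rs code: `perm_from_raw`, `perm_imm` and the array-based `vpermwi128` are hypothetical helpers, and the raw-word extraction assumes the `PERMh ‖ PERMl` split from the encoding table (bits 23–25 and 11–15, counting bit 0 as the most significant bit).

```rust
/// Reassembles the 8-bit PERM immediate from a raw instruction word.
/// PERMl sits at bits 11..=15 and PERMh at bits 23..=25 (MSB = bit 0).
fn perm_from_raw(raw: u32) -> u8 {
    let perm_l = (raw >> 16) & 0x1f;
    let perm_h = (raw >> 6) & 0x7;
    ((perm_h << 5) | perm_l) as u8
}

/// Builds a PERM immediate from four selectors, where `sel[i]` names the
/// source word copied to output lane `i`.
fn perm_imm(sel: [u8; 4]) -> u8 {
    let mut imm = 0u8;
    for i in 0..4 {
        imm |= (sel[i] & 3) << (2 * (3 - i));
    }
    imm
}

/// Word permute of a single source vector under an 8-bit immediate.
fn vpermwi128(b: [u32; 4], imm: u8) -> [u32; 4] {
    let mut r = [0u32; 4];
    for i in 0..4 {
        let sel = ((imm >> (2 * (3 - i))) & 3) as usize;
        r[i] = b[sel];
    }
    r
}
```

Under this layout the identity permute is `0x1B` and the full reversal ("xyzw" to "wzyx") is `0xE4`, which is the opposite of the x86 `shufps` convention where `0xE4` is the identity.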
## Related Instructions
- [`vperm`](../vmx/vperm.md), [`vperm128`](../vmx/vperm.md) — general byte-granularity permute (two-source).
- [`vspltw`](../vmx/vspltw.md), [`vspltw128`](../vmx/vspltw.md) — single-word splat (special case of `vpermwi128`).
- [`vsldoi`](../vmx/vsldoi.md) — static-immediate byte rotate of two registers.
- [`vrlimi128`](vrlimi128.md) — rotate + mask-insert (per-word rotate with an insert mask).
## IBM Reference
- No IBM AIX entry — this instruction is exclusive to the Xbox 360's VMX128 extension.
- Xbox 360 XDK, Altivec-128 (VMX128) extensions. Functionally equivalent to HLSL's `.xyzw`-suffix swizzle on `float4`.
- [IBM AltiVec Technology Programmer's Interface Manual, Chapter 6 — Permute and Formatting](https://www.nxp.com/docs/en/reference-manual/ALTIVECPIM.pdf) for the base permute semantics.

@@ -0,0 +1,185 @@
# `vpkd3d128` — Vector128 Pack D3Dtype, Rotate Left Immediate and Mask Insert
> **Category:** [VMX128](../categories/vmx128.md) · **Form:** [VX128_4](../forms/VX128_4.md) · **Opcode:** `0x18000610`
<!-- GENERATED: BEGIN -->
## Assembler Mnemonics
| Mnemonic | XML entry | Flags | Description |
| --- | --- | --- | --- |
| `vpkd3d128` | `vpkd3d128` | — | Vector128 Pack D3Dtype, Rotate Left Immediate and Mask Insert |
## Syntax
```asm
(no disassembly template)
```
## Encoding
### `vpkd3d128` — form `VX128_4`
- **Opcode word:** `0x18000610`
- **Primary opcode (bits 0–5):** `6`
- **Extended opcode:** `1552`
- **Synchronising:** no
| Bits | Field | Meaning |
| --- | --- | --- |
| 0–5 | `OPCD` | primary opcode (6) |
| 6–10 | `VD128l` | destination low 5 bits |
| 11–15 | `IMM` | 5-bit immediate |
| 16–20 | `VB128l` | source B low 5 bits |
| 21–23 | `XO` | extended opcode |
| 24–25 | `z` | sub-operation selector |
| 28–29 | `VD128h` | destination high 2 bits |
| 30–31 | `VB128h` | source B high 2 bits |
## Operands
| Field | Role | Description |
| --- | --- | --- |
| `VB` | vpkd3d128: read | Source B vector register. |
| `VD` | vpkd3d128: write | Destination vector register. |
## Register Effects
### `vpkd3d128`
- **Reads (always):** `VB`
- **Reads (conditional):** `VD` (only when `IMM & 3` is non-zero, for the post-pack merge)
- **Writes (always):** `VD`
- **Writes (conditional):** _none_
## Status-Register Effects
_No condition-register or status-register effects._
## Operation (pseudocode)
```
; Derived from the xenia-rs interpreter arm (see Implementation References).
type  <- IMM >> 2                ; D3D pack type (3 bits)
pack  <- IMM & 3                 ; merge mode, 0 = overwrite VD
shift <- z                       ; merge position (2 bits)
out   <- pack_as(type, VB)       ; unhandled types pass VB through
if pack = 0 then
    VD <- out
else
    VD <- merge(old VD, out, pack, shift)  ; lane table in the snapshot
; No status bits are updated.
```
## C Translation Example
```c
/* C translation: the xenia-rs interpreter arm below in */
/* Implementation References is the authoritative semantic */
/* snapshot. Translate it line-by-line: */
/* - ctx.gpr[N] -> r[N] (or f[]/v[] for FPRs/VRs) */
/* - mem.read_uN / write_uN -> mem_read_uN_be / mem_write_uN_be */
/* - ctx.update_cr_signed(fld, v) -> update_cr_signed(fld, v) */
/* - ctx.xer_ca / xer_ov / xer_so -> xer.CA / xer.OV / xer.SO */
/* The Register Effects and Status-Register Effects tables above */
/* enumerate every side effect a faithful translation must emit. */
```
## Implementation References
**`vpkd3d128`**
- xenia-canary XML: [`tools/ppc-instructions.xml` — search for `mnem="vpkd3d128"`](../../xenia-canary/tools/ppc-instructions.xml)
- xenia-canary emit: [`src/xenia/cpu/ppc/ppc_emit_altivec.cc:2088`](../../xenia-canary/src/xenia/cpu/ppc/ppc_emit_altivec.cc#L2088)
- xenia-rs opcode: [`crates/xenia-cpu/src/opcode.rs:112`](../../xenia-rs/crates/xenia-cpu/src/opcode.rs#L112)
- xenia-rs decoder: [`crates/xenia-cpu/src/decoder.rs:648`](../../xenia-rs/crates/xenia-cpu/src/decoder.rs#L648)
- xenia-rs interpreter: [`crates/xenia-cpu/src/interpreter.rs:4191-4248`](../../xenia-rs/crates/xenia-cpu/src/interpreter.rs#L4191-L4248)
<details><summary>xenia-rs interpreter body (frozen snapshot)</summary>
```rust
PpcOpcode::vpkd3d128 => {
    use crate::vmx::D3dPackType;
    let uimm = crate::decoder::extract_vx128_uimm5(instr.raw);
    let pack = (uimm & 3) as usize;
    let shift = instr.vx128_4_z() as usize;
    let ty = D3dPackType::from_immediate(uimm >> 2);
    let src = ctx.vr[instr.vb128()];
    let out = match ty {
        D3dPackType::D3dColor => crate::vmx::pack_d3dcolor(src),
        D3dPackType::NormShort2 => crate::vmx::pack_normshort2(src),
        D3dPackType::NormPacked32 => crate::vmx::pack_normpacked32(src),
        D3dPackType::Float16_2 => crate::vmx::pack_float16_2(src),
        D3dPackType::NormShort4 => crate::vmx::pack_normshort4(src),
        D3dPackType::Float16_4 => crate::vmx::pack_float16_4(src),
        D3dPackType::NormPacked64 => crate::vmx::pack_normpacked64(src),
        D3dPackType::Other(t) => {
            tracing::warn!(
                raw = format_args!("{:#010x}", instr.raw),
                uimm,
                ty = t,
                "vpkd3d128: unhandled pack type at {:#010x}",
                ctx.pc,
            );
            src
        }
    };
    // Post-pack permutation: merge packed `out` into previous `vd`
    // per canary ppc_emit_altivec.cc:2126-2188 MakePermuteMask tables.
    // MakePermuteMask(r0,l0, r1,l1, r2,l2, r3,l3): result[i] = if ri==0 { prev[li] } else { out[li] }
    let result = if pack == 0 {
        out
    } else {
        // (source_reg, lane): 0=prev vd, 1=packed out
        const PERM: [[[(u8, u8); 4]; 4]; 3] = [
            // pack=1 (VPACK_32): places out[3] at lane (3-shift)
            [[(0,0),(0,1),(0,2),(1,3)], [(0,0),(0,1),(1,3),(0,3)],
             [(0,0),(1,3),(0,2),(0,3)], [(1,3),(0,1),(0,2),(0,3)]],
            // pack=2 (64-bit): places out[2..3] at lanes (2-shift)..(3-shift)
            [[(0,0),(0,1),(1,2),(1,3)], [(0,0),(1,2),(1,3),(0,3)],
             [(1,2),(1,3),(0,2),(0,3)], [(1,3),(0,1),(0,2),(0,3)]],
            // pack=3 (64-bit): same as pack=2 except shift=3 selects out[2] at lane 3
            [[(0,0),(0,1),(1,2),(1,3)], [(0,0),(1,2),(1,3),(0,3)],
             [(1,2),(1,3),(0,2),(0,3)], [(0,0),(0,1),(0,2),(1,2)]],
        ];
        let prev = ctx.vr[instr.vd128()];
        let pw = prev.as_u32x4();
        let ow = out.as_u32x4();
        let sel = PERM[pack - 1][shift];
        xenia_types::Vec128::from_u32x4_array([
            if sel[0].0 == 0 { pw[sel[0].1 as usize] } else { ow[sel[0].1 as usize] },
            if sel[1].0 == 0 { pw[sel[1].1 as usize] } else { ow[sel[1].1 as usize] },
            if sel[2].0 == 0 { pw[sel[2].1 as usize] } else { ow[sel[2].1 as usize] },
            if sel[3].0 == 0 { pw[sel[3].1 as usize] } else { ow[sel[3].1 as usize] },
        ])
    };
    ctx.vr[instr.vd128()] = result;
    ctx.pc += 4;
}
```
</details>
<!-- GENERATED: END -->
## Special Cases & Edge Conditions
- **Pack the float lanes of `VB` into a D3D vertex/pixel format.** In the snapshot the 5-bit `IMM` field (bits 11–15) is split in two: `IMM >> 2` selects the D3D format and `IMM & 3` selects the merge mode (`pack`). The 2-bit `z` field (bits 24–25) is the merge position (`shift`), not part of the format selection.
  - `D3dColor`: packs four float lanes into a 32-bit ARGB8 value (A in the high byte, B in the low byte), the canonical Direct3D 9 `D3DCOLOR` layout. Xenia's helper is `vmx::pack_d3dcolor`.
  - `NormShort2`, `NormPacked32`, `Float16_2`, `NormShort4`, `Float16_4` and `NormPacked64` are dispatched to their own `vmx::pack_*` helpers.
  - Any other type value is not implemented in xenia-rs; the interpreter logs a warning and passes `VB` through unchanged.
- **Also performs the mask-insert step.** The mnemonic is "Pack D3Dtype, Rotate Left Immediate and Mask Insert": the packed result is merged into the existing `VD`. In the snapshot, `pack == 0` overwrites `VD` wholesale; for `pack` values 1 to 3 the packed words are placed into `VD` at a lane position chosen by `shift`, following a table derived from canary's `MakePermuteMask` calls, and the remaining lanes keep their old `VD` contents.
- **Field budget.** `IMM` (5 bits) plus `z` (2 bits) gives 7 control bits: 3 for the format, 2 for `pack` and 2 for `shift`. The practical set used by Xenon games is small (D3DCOLOR is the dominant one).
- **No saturation signal.** The packer clamps out-of-range floats silently; `VSCR[SAT]` is not touched.
- **VMX128 register-fusion** on `VD` and `VB`.
- **No IBM AIX entry** — Xenon-only.
- **No `Rc`, no XER.**
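
The control-field split and the post-pack merge can be sketched separately from the format-specific packers. This is an illustrative sketch, not xenia-rs code: `PackControl`, `decode_control` and `merge` are hypothetical names, vectors are plain `[u32; 4]` arrays, and the `PERM` table is copied from the interpreter snapshot on the assumption that the snapshot is current.

```rust
/// Control fields of `vpkd3d128`, split the way the snapshot splits them.
#[derive(Debug, PartialEq)]
struct PackControl {
    pack_type: u32, // IMM >> 2: which D3D format to produce
    pack: usize,    // IMM & 3: 0 overwrites VD, 1..=3 merge into it
    shift: usize,   // z: where the packed words land
}

fn decode_control(imm5: u32, z: u32) -> PackControl {
    PackControl {
        pack_type: (imm5 >> 2) & 7,
        pack: (imm5 & 3) as usize,
        shift: (z & 3) as usize,
    }
}

/// (source, lane) per output lane: source 0 = old VD, 1 = packed result.
/// Indexed as PERM[pack - 1][shift]; copied from the snapshot.
const PERM: [[[(u8, u8); 4]; 4]; 3] = [
    [[(0, 0), (0, 1), (0, 2), (1, 3)], [(0, 0), (0, 1), (1, 3), (0, 3)],
     [(0, 0), (1, 3), (0, 2), (0, 3)], [(1, 3), (0, 1), (0, 2), (0, 3)]],
    [[(0, 0), (0, 1), (1, 2), (1, 3)], [(0, 0), (1, 2), (1, 3), (0, 3)],
     [(1, 2), (1, 3), (0, 2), (0, 3)], [(1, 3), (0, 1), (0, 2), (0, 3)]],
    [[(0, 0), (0, 1), (1, 2), (1, 3)], [(0, 0), (1, 2), (1, 3), (0, 3)],
     [(1, 2), (1, 3), (0, 2), (0, 3)], [(0, 0), (0, 1), (0, 2), (1, 2)]],
];

/// Merges the packed words `out` into the previous destination `prev`.
fn merge(prev: [u32; 4], out: [u32; 4], pack: usize, shift: usize) -> [u32; 4] {
    if pack == 0 {
        return out; // no merge: the packed result replaces VD
    }
    let sel = PERM[pack - 1][shift];
    let mut r = [0u32; 4];
    for i in 0..4 {
        let (src, lane) = sel[i];
        r[i] = if src == 0 { prev[lane as usize] } else { out[lane as usize] };
    }
    r
}
```

For example, with `pack = 1` and `shift = 0` only lane 3 of `VD` is replaced (by packed word 3), which lets a sequence of packs fill one destination register lane by lane.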
## Related Instructions
- [`vupkd3d128`](vupkd3d128.md) — the inverse (unpack a D3D-format word back into 4 floats).
- [`vpkpx`](../vmx/vpkpx.md) — the standard Altivec 1-5-5-5 pixel pack.
- [`vpkshus`](../vmx/vpkshus.md), [`vpkuhus`](../vmx/vpkuhus.md) — byte-range saturating packs (an alternative colour-packing path).
- [`vcfpsxws128`](vcfpsxws128.md), [`vcfpuxws128`](vcfpuxws128.md) — conversion with explicit scale; software sometimes pre-scales floats to `[0, 255]` before using these in place of `vpkd3d128`.
## IBM Reference
- No IBM AIX entry — Xbox 360 VMX128 extension only. The "D3D" in the mnemonic refers directly to Direct3D 9 vertex/pixel formats (the `D3DDECLTYPE_*` enumeration).
- Xbox 360 XDK, Altivec-128 (VMX128) extensions.
- Microsoft D3D9 documentation: `D3DDECLTYPE_D3DCOLOR`, `D3DDECLTYPE_UBYTE4N`, etc.

@@ -0,0 +1,141 @@
# `vrlimi128` — Vector128 Rotate Left Immediate and Mask Insert
> **Category:** [VMX128](../categories/vmx128.md) · **Form:** [VX128_4](../forms/VX128_4.md) · **Opcode:** `0x18000710`
<!-- GENERATED: BEGIN -->
## Assembler Mnemonics
| Mnemonic | XML entry | Flags | Description |
| --- | --- | --- | --- |
| `vrlimi128` | `vrlimi128` | — | Vector128 Rotate Left Immediate and Mask Insert |
## Syntax
```asm
vrlimi128 [VD], [VB], [IMM], [z]
```
## Encoding
### `vrlimi128` — form `VX128_4`
- **Opcode word:** `0x18000710`
- **Primary opcode (bits 0–5):** `6`
- **Extended opcode:** `1808`
- **Synchronising:** no
| Bits | Field | Meaning |
| --- | --- | --- |
| 0–5 | `OPCD` | primary opcode (6) |
| 6–10 | `VD128l` | destination low 5 bits |
| 11–15 | `IMM` | 5-bit immediate |
| 16–20 | `VB128l` | source B low 5 bits |
| 21–23 | `XO` | extended opcode |
| 24–25 | `z` | sub-operation selector |
| 28–29 | `VD128h` | destination high 2 bits |
| 30–31 | `VB128h` | source B high 2 bits |
## Operands
| Field | Role | Description |
| --- | --- | --- |
| `VB` | vrlimi128: read | Source B vector register. |
| `VD` | vrlimi128: write | Destination vector register. |
## Register Effects
### `vrlimi128`
- **Reads (always):** `VB`
- **Reads (conditional):** `VD` (lanes whose mask bit is 0 keep their old value)
- **Writes (always):** `VD`
- **Writes (conditional):** _none_
## Status-Register Effects
_No condition-register or status-register effects._
## Operation (pseudocode)
```
; Derived from the xenia-rs interpreter arm (see Implementation References).
shift <- z                           ; rotate amount, 0..3 word positions
mask  <- IMM                         ; low 4 bits are the insert mask
for i in 0..3:
    rot <- VB[(shift + i) mod 4]     ; VB rotated left by `shift` words
    if (mask >> (3 - i)) & 1 = 1 then
        VD[i] <- rot                 ; otherwise VD[i] is left unchanged
; No status bits are updated.
```
## C Translation Example
```c
/* C translation: the xenia-rs interpreter arm below in */
/* Implementation References is the authoritative semantic */
/* snapshot. Translate it line-by-line: */
/* - ctx.gpr[N] -> r[N] (or f[]/v[] for FPRs/VRs) */
/* - mem.read_uN / write_uN -> mem_read_uN_be / mem_write_uN_be */
/* - ctx.update_cr_signed(fld, v) -> update_cr_signed(fld, v) */
/* - ctx.xer_ca / xer_ov / xer_so -> xer.CA / xer.OV / xer.SO */
/* The Register Effects and Status-Register Effects tables above */
/* enumerate every side effect a faithful translation must emit. */
```
## Implementation References
**`vrlimi128`**
- xenia-canary XML: [`tools/ppc-instructions.xml` — search for `mnem="vrlimi128"`](../../xenia-canary/tools/ppc-instructions.xml)
- xenia-canary emit: [`src/xenia/cpu/ppc/ppc_emit_altivec.cc:1315`](../../xenia-canary/src/xenia/cpu/ppc/ppc_emit_altivec.cc#L1315)
- xenia-rs opcode: [`crates/xenia-cpu/src/opcode.rs:119`](../../xenia-rs/crates/xenia-cpu/src/opcode.rs#L119)
- xenia-rs decoder: [`crates/xenia-cpu/src/decoder.rs:649`](../../xenia-rs/crates/xenia-cpu/src/decoder.rs#L649)
- xenia-rs interpreter: [`crates/xenia-cpu/src/interpreter.rs:3962-3977`](../../xenia-rs/crates/xenia-cpu/src/interpreter.rs#L3962-L3977)
<details><summary>xenia-rs interpreter body (frozen snapshot)</summary>
```rust
PpcOpcode::vrlimi128 => {
    let shift = instr.vx128_4_z() as usize;
    let mask = instr.vx128_4_imm();
    let b = ctx.vr[instr.vb128()].as_u32x4();
    let d = ctx.vr[instr.vd128()].as_u32x4();
    let rot = [b[shift % 4], b[(shift + 1) % 4], b[(shift + 2) % 4], b[(shift + 3) % 4]];
    let mut r = [0u32; 4];
    for i in 0..4 {
        // mask bit 3 corresponds to word 0 (BE-first). Use rot when
        // the corresponding mask bit is set.
        let use_rot = (mask >> (3 - i)) & 1 == 1;
        r[i] = if use_rot { rot[i] } else { d[i] };
    }
    ctx.vr[instr.vd128()] = xenia_types::Vec128::from_u32x4_array(r);
    ctx.pc += 4;
}
```
</details>
<!-- GENERATED: END -->
## Special Cases & Edge Conditions
- **Rotate-left-word + mask-insert in one step.** `VB` is rotated left by `z` word positions (word-granular, 0..3, not bits); `z` is the 2-bit field at bits 24–25 of the encoding. The rotated vector is merged into the pre-existing `VD` under control of a 4-bit insert mask taken from the low 4 bits of `IMM` (bits 11–15): a mask bit of 1 takes the lane from the rotated `VB`; a mask bit of 0 keeps the lane from the old `VD`.
- **Destructive destination.** `VD` is both source and destination — software must preserve its value or pre-initialise it.
- **Typical use: selective-lane overwrite.** Games use this to "rewrite lane `n` of a vector with a shuffled component" without a full permute. A common pattern is "insert a scalar into lane `i` of a vector" where the scalar has been pre-loaded to a known word of `VB`.
- **Mask bit ↔ lane mapping.** Big-endian: mask bit 3 (MSB of the 4-bit mask) controls lane 0; bit 0 controls lane 3. (In xenia: `use_rot = (mask >> (3 - i)) & 1`.)
- **VMX128 register-fusion** on `VD` and `VB`.
- **No IBM AIX entry** — Xenon-only.
- **No `Rc`, no XER, no VSCR.**
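
The rotate-and-insert behaviour described above can be modelled on plain arrays. This is a minimal sketch patterned on the interpreter snapshot, not xenia-rs code: the function takes the old `VD`, `VB`, the 4-bit mask and the word rotate amount as explicit arguments, which is an illustrative signature rather than the interpreter's.

```rust
/// Rotates `b` left by `shift` word positions and inserts the rotated
/// lanes into `d` wherever the corresponding mask bit is set.
/// Mask bit 3 controls lane 0; mask bit 0 controls lane 3.
fn vrlimi128(d: [u32; 4], b: [u32; 4], mask: u32, shift: usize) -> [u32; 4] {
    let mut r = [0u32; 4];
    for i in 0..4 {
        let rotated = b[(shift + i) % 4];
        let use_rot = (mask >> (3 - i)) & 1 == 1;
        r[i] = if use_rot { rotated } else { d[i] };
    }
    r
}
```

To insert word 0 of `VB` into lane 2 of `VD`, choose `shift = 2` so that word 0 rotates into lane 2, and `mask = 0b0010` so that only lane 2 is replaced.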
## Related Instructions
- [`vrlw`](../vmx/vrlw.md), [`vrlw128`](../vmx/vrlw.md) — per-lane bit-level rotate (bits rotate within each word; the lanes themselves do not move).
- [`vpermwi128`](vpermwi128.md) — immediate 4-way word permute (no merge).
- [`vsel`](../vmx/vsel.md), [`vsel128`](../vmx/vsel.md) — general bit-select; `vrlimi128` is the specialised "rotate + insert" equivalent.
- [`vsldoi`](../vmx/vsldoi.md) — byte-level immediate shift.
## IBM Reference
- No IBM AIX entry — this instruction is exclusive to the Xbox 360's VMX128 extension. The mnemonic is an adaptation of the scalar `rlwimi` (rotate-left-word-immediate-mask-insert) pattern for vectors.
- Xbox 360 XDK, Altivec-128 (VMX128) extensions.
- [IBM AltiVec Technology Programmer's Interface Manual, Chapter 4 — Integer Shift / Rotate](https://www.nxp.com/docs/en/reference-manual/ALTIVECPIM.pdf) for the base rotate semantics.

@@ -0,0 +1,154 @@
# `vupkd3d128` — Vector128 Unpack D3Dtype
> **Category:** [VMX128](../categories/vmx128.md) · **Form:** [VX128_3](../forms/VX128_3.md) · **Opcode:** `0x180007f0`
<!-- GENERATED: BEGIN -->
## Assembler Mnemonics
| Mnemonic | XML entry | Flags | Description |
| --- | --- | --- | --- |
| `vupkd3d128` | `vupkd3d128` | — | Vector128 Unpack D3Dtype |
## Syntax
```asm
(no disassembly template)
```
## Encoding
### `vupkd3d128` — form `VX128_3`
- **Opcode word:** `0x180007f0`
- **Primary opcode (bits 0–5):** `6`
- **Extended opcode:** `2032`
- **Synchronising:** no
| Bits | Field | Meaning |
| --- | --- | --- |
| 0–5 | `OPCD` | primary opcode (6) |
| 6–10 | `VD128l` | destination low 5 bits |
| 11–15 | `IMM` | 5-bit immediate |
| 16–20 | `VB128l` | source B low 5 bits |
| 21–27 | `XO` | extended opcode |
| 28–29 | `VD128h` | destination high 2 bits |
| 30–31 | `VB128h` | source B high 2 bits |
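
As an illustration of the field table, the following Rust sketch extracts each field from a raw instruction word. It assumes PowerPC bit numbering (bit 0 is the most significant bit) and assumes that the 7-bit register index is formed as `low | (high << 5)`, which is what the "low 5 bits" / "high 2 bits" labels suggest. The `Fields` struct and both function names are hypothetical and are not taken from the xenia-rs decoder.

```rust
/// Decoded VX128_3 fields. Struct and function names are illustrative.
#[derive(Debug, PartialEq)]
struct Fields {
    opcd: u32,
    vd: u32,
    imm: u32,
    vb: u32,
    xo: u32,
}

/// Extract the inclusive PowerPC bit range `first..=last` from `raw`.
/// PowerPC numbers bits from the MSB (bit 0) to the LSB (bit 31).
/// Only valid for ranges narrower than 32 bits.
fn ppc_bits(raw: u32, first: u32, last: u32) -> u32 {
    let width = last - first + 1;
    (raw >> (31 - last)) & ((1u32 << width) - 1)
}

fn decode_vx128_3(raw: u32) -> Fields {
    let vd_lo = ppc_bits(raw, 6, 10);
    let vd_hi = ppc_bits(raw, 28, 29);
    let vb_lo = ppc_bits(raw, 16, 20);
    let vb_hi = ppc_bits(raw, 30, 31);
    Fields {
        opcd: ppc_bits(raw, 0, 5),
        // ASSUMPTION: the high bits extend the low 5 bits to a 7-bit index.
        vd: vd_lo | (vd_hi << 5),
        imm: ppc_bits(raw, 11, 15),
        vb: vb_lo | (vb_hi << 5),
        xo: ppc_bits(raw, 21, 27),
    }
}
```

Applied to the bare opcode word `0x180007f0`, this yields primary opcode 6 and a 7-bit `XO` of `0x7f`; shifted back into place (`0x7f << 4`) that is the extended opcode 2032 listed above.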
## Operands
| Field | Role | Description |
| --- | --- | --- |
| `VB` | vupkd3d128: read | Source B vector register. |
| `VD` | vupkd3d128: write | Destination vector register. |
## Register Effects
### `vupkd3d128`
- **Reads (always):** `VB`
- **Reads (conditional):** _none_
- **Writes (always):** `VD`
- **Writes (conditional):** _none_
## Status-Register Effects
_No condition-register or status-register effects._
## Operation (pseudocode)
```
; Derived from the xenia-rs interpreter arm (see Implementation
; References).
uimm <- IMM                      ; 5-bit immediate, encoding bits 11-15
type <- uimm >> 2                ; upper 3 bits select the pack type
src  <- VR[VB]
if type is a recognised D3D pack type then
    VR[VD] <- unpack(type, src)  ; D3DCOLOR, NORMSHORT2, FLOAT16_4, ...
else
    VR[VD] <- src                ; xenia-rs logs a warning, passes VB through
; No CR, XER, or VSCR bits are updated.
```
## C Translation Example
```c
/* C translation: the xenia-rs interpreter arm below in */
/* Implementation References is the authoritative semantic */
/* snapshot. Translate it line-by-line: */
/* - ctx.gpr[N] -> r[N] (or f[]/v[] for FPRs/VRs) */
/*   - mem.read_u* / write_u*  -> mem_read_u*_be / mem_write_u*_be   */
/* - ctx.update_cr_signed(fld, v) -> update_cr_signed(fld, v) */
/* - ctx.xer_ca / xer_ov / xer_so -> xer.CA / xer.OV / xer.SO */
/* The Register Effects and Status-Register Effects tables above */
/* enumerate every side effect a faithful translation must emit. */
```
## Implementation References
**`vupkd3d128`**
- xenia-canary XML: [`tools/ppc-instructions.xml` — search for `mnem="vupkd3d128"`](../../xenia-canary/tools/ppc-instructions.xml)
- xenia-canary emit: [`src/xenia/cpu/ppc/ppc_emit_altivec.cc:2194`](../../xenia-canary/src/xenia/cpu/ppc/ppc_emit_altivec.cc#L2194)
- xenia-rs opcode: [`crates/xenia-cpu/src/opcode.rs:128`](../../xenia-rs/crates/xenia-cpu/src/opcode.rs#L128)
- xenia-rs decoder: [`crates/xenia-cpu/src/decoder.rs:670`](../../xenia-rs/crates/xenia-cpu/src/decoder.rs#L670)
- xenia-rs interpreter: [`crates/xenia-cpu/src/interpreter.rs:4249-4275`](../../xenia-rs/crates/xenia-cpu/src/interpreter.rs#L4249-L4275)
<details><summary>xenia-rs interpreter body (frozen snapshot)</summary>
```rust
PpcOpcode::vupkd3d128 => {
use crate::vmx::D3dPackType;
let uimm = crate::decoder::extract_vx128_uimm5(instr.raw);
let ty = D3dPackType::from_immediate(uimm >> 2);
let src = ctx.vr[instr.vb128()];
let out = match ty {
D3dPackType::D3dColor => crate::vmx::unpack_d3dcolor(src),
D3dPackType::NormShort2 => crate::vmx::unpack_normshort2(src),
D3dPackType::NormPacked32 => crate::vmx::unpack_normpacked32(src),
D3dPackType::Float16_2 => crate::vmx::unpack_float16_2(src),
D3dPackType::NormShort4 => crate::vmx::unpack_normshort4(src),
D3dPackType::Float16_4 => crate::vmx::unpack_float16_4(src),
D3dPackType::NormPacked64 => crate::vmx::unpack_normpacked64(src),
D3dPackType::Other(t) => {
tracing::warn!(
raw = format_args!("{:#010x}", instr.raw),
uimm,
ty = t,
"vupkd3d128: unhandled pack type at {:#010x}",
ctx.pc,
);
src
}
};
ctx.vr[instr.vd128()] = out;
ctx.pc += 4;
}
```
</details>
<!-- GENERATED: END -->
## Special Cases & Edge Conditions
- **Unpack a D3D-format word into 4 float lanes.** The `IMM` field in the encoding selects the source format; xenia-rs derives the format code as `IMM >> 2`, i.e. the upper three bits of the 5-bit immediate:
- `D3dColor` — decode a 32-bit RGBA8 (`D3DCOLOR`) into 4 float lanes in `[0.0, 1.0]`. Xenia's helper is `vmx::unpack_d3dcolor`.
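- The frozen interpreter snapshot above also dispatches `NormShort2`, `NormPacked32`, `Float16_2`, `NormShort4`, `Float16_4`, and `NormPacked64` to their own `vmx::unpack_*` helpers. Only an unrecognised format code (`D3dPackType::Other`) makes the interpreter log a warning and pass `VB` through unchanged.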
- **Inverse of [`vpkd3d128`](vpkd3d128.md).** The same format code used to pack must be used to unpack.
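- **Source width depends on the format.** For `D3DCOLOR` the source is a single 32-bit word of `VB` (typically lane 0; the helpers read the appropriate component) and the other three input word lanes are ignored. The four-component 16-bit formats (`NormShort4`, `Float16_4`) and `NormPacked64` carry 64 bits of packed data, so they cannot be read from one word alone.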
- **IEEE-754 binary32 outputs.** For `D3DCOLOR` each lane is already normalised to `[0.0, 1.0]` (the 8-bit channel value is converted to float and then divided by 255).
- **No `VSCR[SAT]` effect**, no FPSCR, no exceptions.
- **VMX128 register-fusion** on `VD` and `VB`.
- **No IBM AIX entry** — Xenon-only.
- **No `Rc`, no XER.**
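
The format selection and the `D3DCOLOR` normalisation described in this list can be sketched as follows. This is a hedged illustration, not the body of `vmx::unpack_d3dcolor`: both function names are hypothetical, the A,R,G,B byte order within the source word and the R,G,B,A output lane order are assumptions, and the divide-by-255 scaling simply follows the description above. The real helper may order lanes or scale values differently.

```rust
/// Format selector as the interpreter snapshot derives it: the upper
/// three bits of the 5-bit immediate (`uimm >> 2`).
fn pack_type_code(uimm5: u32) -> u32 {
    (uimm5 & 0x1f) >> 2
}

/// Hypothetical D3DCOLOR unpack following the description in this list.
/// ASSUMPTIONS: the word holds A, R, G, B from most to least significant
/// byte, and the output lanes are emitted in R, G, B, A order.
fn unpack_d3dcolor_sketch(word: u32) -> [f32; 4] {
    // Convert one 8-bit channel to float first, then normalise to [0, 1].
    let channel = |shift: u32| ((word >> shift) & 0xff) as f32 / 255.0;
    [channel(16), channel(8), channel(0), channel(24)]
}
```

Converting to `f32` before dividing matters: an integer division by 255 would collapse every channel value below 255 to zero.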
## Related Instructions
- [`vpkd3d128`](vpkd3d128.md) — the inverse pack.
- [`vupkhpx`](../vmx/vupkhpx.md), [`vupklpx`](../vmx/vupklpx.md) — standard Altivec 1-5-5-5 pixel unpacks.
- [`vupkhsb`](../vmx/vupkhsb.md), [`vupklsb`](../vmx/vupklsb.md) — sign-extending byte→half-word unpacks (the integer analogue).
- [`vcsxwfp128`](vcsxwfp128.md), [`vcuxwfp128`](vcuxwfp128.md) — int → float with scale; sometimes used as an alternate decode path.
## IBM Reference
- No IBM AIX entry — Xbox 360 VMX128 extension only. "D3D" denotes the Direct3D 9 vertex/pixel format catalogue (`D3DDECLTYPE_*`).
- Xbox 360 XDK, Altivec-128 (VMX128) extensions.
- Microsoft D3D9 documentation: `D3DDECLTYPE_D3DCOLOR`, `D3DDECLTYPE_UBYTE4N`, etc.