chore: add migration/ bundle for cross-machine setup

Bundles state that lives OUTSIDE the xenia-rs repo so a fresh clone on
another machine can be brought up to identical configuration via
migration/setup.sh:

- claude-memory/               ~/.claude/projects/-home-fabi-RE-Project-Sylpheed/memory/
                               (103 files, 1.1 MB - MEMORY.md + every
                               project_xenia_rs_*.md from audits
                               addis_signext through audit-058)
- project-root/dot-claude/     <project-root>/.claude/settings.json
                               (Stop hook + permissions)
- project-root/ppc-manual/     <project-root>/ppc-manual/
                               (PowerPC reference docs, 397 files, 3.7 MB)
- project-root/run-canary.sh   <project-root>/run-canary.sh
- README.md                    Human-readable setup checklist
- setup.sh                     Idempotent installer (also reclones
                               xenia-canary at pinned HEAD 6de80dffe)
- MANIFEST.md                  Per-file mapping + per-file-not-bundled
                               restoration recipe

Excluded from bundle (not shippable via git):

- Sylpheed ISO (7.8 GB; copyright; manual copy required)
- sylpheed.db (395 MB; regenerable from XEX via analysis tooling)
- target/ build artifacts (rebuild on target)
- audit-runs probe firehoses (.log/.stdout/.stderr ~11 GB; rerun if needed)
- audit-runs memory dumps (.bin ~4.5 GB; rerun audit-026/027/029 if needed)
- xenia-canary checkout (setup.sh reclones from
  git.mc02.dev/fabi/Xenia-Canary.git at HEAD 6de80dffe)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

137
migration/project-root/ppc-manual/vmx128/vcfpsxws128.md
Normal file
@@ -0,0 +1,137 @@
# `vcfpsxws128` — Vector128 Convert From Floating-Point to Signed Fixed-Point Word Saturate

> **Category:** [VMX128](../categories/vmx128.md) · **Form:** [VX128_3](../forms/VX128_3.md) · **Opcode:** `0x18000230`

<!-- GENERATED: BEGIN -->

## Assembler Mnemonics

| Mnemonic | XML entry | Flags | Description |
| --- | --- | --- | --- |
| `vcfpsxws128` | `vcfpsxws128` | — | Vector128 Convert From Floating-Point to Signed Fixed-Point Word Saturate |

## Syntax

```asm
vcfpsxws128 [VD], [VB], [UIMM]
```

## Encoding

### `vcfpsxws128` — form `VX128_3`

- **Opcode word:** `0x18000230`
- **Primary opcode (bits 0–5):** `6`
- **Extended opcode:** `560`
- **Synchronising:** no

| Bits | Field | Meaning |
| --- | --- | --- |
| 0–5 | `OPCD` | primary opcode (6) |
| 6–10 | `VD128l` | destination low 5 bits |
| 11–15 | `IMM` | 5-bit immediate |
| 16–20 | `VB128l` | source B low 5 bits |
| 21–27 | `XO` | extended opcode |
| 28–29 | `VD128h` | destination high 2 bits |
| 30–31 | `VB128h` | source B high 2 bits |

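The field table above can be read as a small decoder. The following is a minimal sketch, not the xenia-rs decoder: `Vx128Form3Fields`, `bits`, and `decode_vx128_3` are hypothetical names, and the code assumes the PowerPC convention that bit 0 is the most significant bit of the 32-bit word. The extended-opcode field is deliberately left out.

```rust
/// Hypothetical container for the operand fields of a VX128_3 word.
#[derive(Debug, PartialEq, Eq)]
struct Vx128Form3Fields {
    primary: u32, // bits 0-5
    vd: u32,      // 7-bit destination register ID
    imm: u32,     // 5-bit immediate, bits 11-15
    vb: u32,      // 7-bit source B register ID
}

/// Extract bits `first..=last`, numbered PowerPC-style (bit 0 = MSB).
fn bits(raw: u32, first: u32, last: u32) -> u32 {
    let width = last - first + 1;
    (raw >> (31 - last)) & ((1u32 << width) - 1)
}

/// Split a raw instruction word into the fields listed in the table.
fn decode_vx128_3(raw: u32) -> Vx128Form3Fields {
    let vd_lo = bits(raw, 6, 10);
    let vb_lo = bits(raw, 16, 20);
    let vd_hi = bits(raw, 28, 29);
    let vb_hi = bits(raw, 30, 31);
    Vx128Form3Fields {
        primary: bits(raw, 0, 5),
        // The 2-bit "high" fields extend the 5-bit "low" fields to 7 bits.
        vd: (vd_hi << 5) | vd_lo,
        imm: bits(raw, 11, 15),
        vb: (vb_hi << 5) | vb_lo,
    }
}
```

The `imm` extraction is equivalent to the `(instr.raw >> 16) & 0x1F` expression used in the interpreter snapshot further down.
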
## Operands

| Field | Role | Description |
| --- | --- | --- |
| `VB` | vcfpsxws128: read | Source B vector register. |
| `UIMM` | vcfpsxws128: read | 5-bit unsigned immediate (the `IMM` field, bits 11–15). Zero-extended. |
| `VD` | vcfpsxws128: write | Destination vector register. |
| `VSCR` | vcfpsxws128: write | Vector Status and Control Register (NJ/SAT bits). |

## Register Effects

### `vcfpsxws128`

- **Reads (always):** `VB`, `UIMM`
- **Reads (conditional):** _none_
- **Writes (always):** `VD`, `VSCR`
- **Writes (conditional):** _none_

## Status-Register Effects

- `vcfpsxws128`: **VSCR[SAT]** is set (sticky) when any lane saturates.

## Operation (pseudocode)

```
; Pseudocode derives directly from the xenia-rs interpreter
; arm (see Implementation References). Operation semantics:
;   - Read source operands from the fields listed under Operands.
;   - Apply the arithmetic / logical / memory action described
;     in the Description field above.
;   - Write results to the destination register(s); update any
;     status bits enumerated under Status-Register Effects.
; Consult the documents listed under IBM Reference for
; canonical PPC-style pseudocode where xenia's expression is
; terse.
```

## C Translation Example

```c
/* C translation: the xenia-rs interpreter arm below in */
/* Implementation References is the authoritative semantic */
/* snapshot. Translate it line-by-line: */
/*   - ctx.gpr[N] -> r[N] (or f[]/v[] for FPRs/VRs) */
/*   - mem.read_u* / write_u* -> mem_read_u*_be / mem_write_u*_be */
/*   - ctx.update_cr_signed(fld, v) -> update_cr_signed(fld, v) */
/*   - ctx.xer_ca / xer_ov / xer_so -> xer.CA / xer.OV / xer.SO */
/* The Register Effects and Status-Register Effects tables above */
/* enumerate every side effect a faithful translation must emit. */
```

## Implementation References

**`vcfpsxws128`**

- xenia-canary XML: [`tools/ppc-instructions.xml` — search for `mnem="vcfpsxws128"`](../../xenia-canary/tools/ppc-instructions.xml)
- xenia-canary emit: [`src/xenia/cpu/ppc/ppc_emit_altivec.cc:539`](../../xenia-canary/src/xenia/cpu/ppc/ppc_emit_altivec.cc#L539)
- xenia-rs opcode: [`crates/xenia-cpu/src/opcode.rs:93`](../../xenia-rs/crates/xenia-cpu/src/opcode.rs#L93)
- xenia-rs decoder: [`crates/xenia-cpu/src/decoder.rs:656`](../../xenia-rs/crates/xenia-cpu/src/decoder.rs#L656)
- xenia-rs interpreter: [`crates/xenia-cpu/src/interpreter.rs:4323-4334`](../../xenia-rs/crates/xenia-cpu/src/interpreter.rs#L4323-L4334)

<details><summary>xenia-rs interpreter body (frozen snapshot)</summary>

```rust
PpcOpcode::vcfpsxws128 => {
    let uimm = (instr.raw >> 16) & 0x1F;
    let b = ctx.vr[instr.vb128()].as_f32x4();
    let mut r = [0i32; 4]; let mut sat = false;
    for i in 0..4 {
        let (v, s) = crate::vmx::cvt_f32_to_i32_sat(b[i], uimm);
        r[i] = v; sat |= s;
    }
    if sat { ctx.set_vscr_sat(true); }
    ctx.vr[instr.vd128()] = crate::vmx::from_i32x4(r);
    ctx.pc += 4;
}
```

</details>

<!-- GENERATED: END -->

## Special Cases & Edge Conditions

- **Float → signed fixed-point (int32) with explicit scale.** Each lane computes `VD.w[i] = sat_int32(VB[i] * 2^UIMM)`, truncating toward zero and clamping to `[−2^31, 2^31−1]`. `UIMM` is a 5-bit unsigned bias (range 0..31) that specifies a power-of-two pre-scale on the float value.
- **Use case: fixed-point pipelines.** The `UIMM` pre-scale lets game code convert a `[0.0, 1.0]` float channel into a `uint16`-range fixed-point value in one instruction (e.g. `UIMM = 15` scales by 2^15 = 32768, so `1.0` becomes `32768`).
- **Sticky `VSCR[SAT]`** is set whenever a lane clamps (including NaN inputs, which xenia's `cvt_f32_to_i32_sat` treats as 0 and flags as saturated).
- **`VSCR[NJ]` honoured** on the float input side.
- **VMX128 register-fusion** applies to `VD` and `VB`: 7-bit register IDs via `VD128l ‖ VD128h` and `VB128l ‖ VB128h`.
- **No IBM AIX entry** — this is Xenon-only. The closest standard Altivec op is [`vctsxs`](../vmx/vctsxs.md).
- **No `Rc`, no XER / FPSCR.**

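The per-lane rule described in the list above can be sketched in plain Rust. This is an illustration under stated assumptions, not the body of `crate::vmx::cvt_f32_to_i32_sat`: the function names are hypothetical, the scaling is done in `f64` purely to keep the intermediate exact, and `VSCR[NJ]` denormal handling is ignored.

```rust
/// Hypothetical stand-in for `crate::vmx::cvt_f32_to_i32_sat`.
/// Returns the converted lane and whether the lane saturated.
fn cvt_f32_to_i32_sat_sketch(x: f32, uimm: u32) -> (i32, bool) {
    // NaN is treated as 0 and reported as saturated, as the list above states.
    if x.is_nan() {
        return (0, true);
    }
    // Pre-scale by 2^UIMM, then truncate toward zero.
    let scale = (1u64 << (uimm & 0x1F)) as f64;
    let t = (f64::from(x) * scale).trunc();
    if t > f64::from(i32::MAX) {
        (i32::MAX, true)
    } else if t < f64::from(i32::MIN) {
        (i32::MIN, true)
    } else {
        (t as i32, false)
    }
}

/// Apply the lane rule to all four lanes; the flag is the OR of the lane flags.
fn vcfpsxws128_sketch(vb: [f32; 4], uimm: u32) -> ([i32; 4], bool) {
    let mut out = [0i32; 4];
    let mut sat = false;
    for (dst, &src) in out.iter_mut().zip(vb.iter()) {
        let (v, s) = cvt_f32_to_i32_sat_sketch(src, uimm);
        *dst = v;
        sat |= s;
    }
    (out, sat)
}
```

With `uimm = 15`, the lanes `[0.0, 0.5, 1.0, -1.0]` become `[0, 16384, 32768, -32768]` and no lane saturates.
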
## Related Instructions

- [`vctsxs`](../vmx/vctsxs.md) — the standard Altivec equivalent (same semantics, 32-register file).
- [`vcfpuxws128`](vcfpuxws128.md) — unsigned variant (clamps to `uint32`).
- [`vcsxwfp128`](vcsxwfp128.md), [`vcuxwfp128`](vcuxwfp128.md) — the inverse (int → float with scale).
- [`vrfiz`](../vmx/vrfiz.md) — plain truncate-to-float-integer without scale.

## IBM Reference

- No IBM AIX entry — this instruction is exclusive to the Xbox 360's VMX128 extension.
- Xbox 360 XDK, Altivec-128 (VMX128) extensions (Microsoft internal documentation); semantics cross-referenced with [IBM AltiVec Technology Programmer's Interface Manual §`vctsxs`](https://www.nxp.com/docs/en/reference-manual/ALTIVECPIM.pdf).

137
migration/project-root/ppc-manual/vmx128/vcfpuxws128.md
Normal file
@@ -0,0 +1,137 @@
# `vcfpuxws128` — Vector128 Convert From Floating-Point to Unsigned Fixed-Point Word Saturate

> **Category:** [VMX128](../categories/vmx128.md) · **Form:** [VX128_3](../forms/VX128_3.md) · **Opcode:** `0x18000270`

<!-- GENERATED: BEGIN -->

## Assembler Mnemonics

| Mnemonic | XML entry | Flags | Description |
| --- | --- | --- | --- |
| `vcfpuxws128` | `vcfpuxws128` | — | Vector128 Convert From Floating-Point to Unsigned Fixed-Point Word Saturate |

## Syntax

```asm
vcfpuxws128 [VD], [VB], [UIMM]
```

## Encoding

### `vcfpuxws128` — form `VX128_3`

- **Opcode word:** `0x18000270`
- **Primary opcode (bits 0–5):** `6`
- **Extended opcode:** `624`
- **Synchronising:** no

| Bits | Field | Meaning |
| --- | --- | --- |
| 0–5 | `OPCD` | primary opcode (6) |
| 6–10 | `VD128l` | destination low 5 bits |
| 11–15 | `IMM` | 5-bit immediate |
| 16–20 | `VB128l` | source B low 5 bits |
| 21–27 | `XO` | extended opcode |
| 28–29 | `VD128h` | destination high 2 bits |
| 30–31 | `VB128h` | source B high 2 bits |

## Operands

| Field | Role | Description |
| --- | --- | --- |
| `VB` | vcfpuxws128: read | Source B vector register. |
| `UIMM` | vcfpuxws128: read | 5-bit unsigned immediate (the `IMM` field, bits 11–15). Zero-extended. |
| `VD` | vcfpuxws128: write | Destination vector register. |
| `VSCR` | vcfpuxws128: write | Vector Status and Control Register (NJ/SAT bits). |

## Register Effects

### `vcfpuxws128`

- **Reads (always):** `VB`, `UIMM`
- **Reads (conditional):** _none_
- **Writes (always):** `VD`, `VSCR`
- **Writes (conditional):** _none_

## Status-Register Effects

- `vcfpuxws128`: **VSCR[SAT]** is set (sticky) when any lane saturates.

## Operation (pseudocode)

```
; Pseudocode derives directly from the xenia-rs interpreter
; arm (see Implementation References). Operation semantics:
;   - Read source operands from the fields listed under Operands.
;   - Apply the arithmetic / logical / memory action described
;     in the Description field above.
;   - Write results to the destination register(s); update any
;     status bits enumerated under Status-Register Effects.
; Consult the documents listed under IBM Reference for
; canonical PPC-style pseudocode where xenia's expression is
; terse.
```

## C Translation Example

```c
/* C translation: the xenia-rs interpreter arm below in */
/* Implementation References is the authoritative semantic */
/* snapshot. Translate it line-by-line: */
/*   - ctx.gpr[N] -> r[N] (or f[]/v[] for FPRs/VRs) */
/*   - mem.read_u* / write_u* -> mem_read_u*_be / mem_write_u*_be */
/*   - ctx.update_cr_signed(fld, v) -> update_cr_signed(fld, v) */
/*   - ctx.xer_ca / xer_ov / xer_so -> xer.CA / xer.OV / xer.SO */
/* The Register Effects and Status-Register Effects tables above */
/* enumerate every side effect a faithful translation must emit. */
```

## Implementation References

**`vcfpuxws128`**

- xenia-canary XML: [`tools/ppc-instructions.xml` — search for `mnem="vcfpuxws128"`](../../xenia-canary/tools/ppc-instructions.xml)
- xenia-canary emit: [`src/xenia/cpu/ppc/ppc_emit_altivec.cc:557`](../../xenia-canary/src/xenia/cpu/ppc/ppc_emit_altivec.cc#L557)
- xenia-rs opcode: [`crates/xenia-cpu/src/opcode.rs:93`](../../xenia-rs/crates/xenia-cpu/src/opcode.rs#L93)
- xenia-rs decoder: [`crates/xenia-cpu/src/decoder.rs:657`](../../xenia-rs/crates/xenia-cpu/src/decoder.rs#L657)
- xenia-rs interpreter: [`crates/xenia-cpu/src/interpreter.rs:4335-4346`](../../xenia-rs/crates/xenia-cpu/src/interpreter.rs#L4335-L4346)

<details><summary>xenia-rs interpreter body (frozen snapshot)</summary>

```rust
PpcOpcode::vcfpuxws128 => {
    let uimm = (instr.raw >> 16) & 0x1F;
    let b = ctx.vr[instr.vb128()].as_f32x4();
    let mut r = [0u32; 4]; let mut sat = false;
    for i in 0..4 {
        let (v, s) = crate::vmx::cvt_f32_to_u32_sat(b[i], uimm);
        r[i] = v; sat |= s;
    }
    if sat { ctx.set_vscr_sat(true); }
    ctx.vr[instr.vd128()] = xenia_types::Vec128::from_u32x4_array(r);
    ctx.pc += 4;
}
```

</details>

<!-- GENERATED: END -->

## Special Cases & Edge Conditions

- **Float → unsigned fixed-point (uint32) with explicit scale.** Each lane computes `VD.w[i] = sat_uint32(VB[i] * 2^UIMM)`, truncating toward zero and clamping to `[0, 2^32−1]`. `UIMM` is a 5-bit unsigned bias (range 0..31).
- **Negative floats clamp to 0** and sticky-set `VSCR[SAT]`.
- **NaN inputs** → 0 with `VSCR[SAT]` set (xenia's `cvt_f32_to_u32_sat`).
- **`VSCR[NJ]` honoured** for denormal inputs.
- **VMX128 register-fusion** applies to `VD` and `VB` (7-bit IDs).
- **No IBM AIX entry** — Xenon-only.
- **No `Rc`, no XER / FPSCR.**

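The unsigned lane rule from the list above can be sketched as follows. This is an illustration only, not the body of `crate::vmx::cvt_f32_to_u32_sat`: the function name is hypothetical, `VSCR[NJ]` handling is omitted, and the sketch assumes that any scaled value below zero (including values between -1 and 0) counts as a saturating clamp, which is how the list above reads but is not confirmed against hardware.

```rust
/// Hypothetical stand-in for `crate::vmx::cvt_f32_to_u32_sat`.
/// Returns the converted lane and whether the lane saturated.
fn cvt_f32_to_u32_sat_sketch(x: f32, uimm: u32) -> (u32, bool) {
    // NaN is treated as 0 and reported as saturated.
    if x.is_nan() {
        return (0, true);
    }
    // Pre-scale by 2^UIMM in f64 so the intermediate stays exact.
    let scaled = f64::from(x) * (1u64 << (uimm & 0x1F)) as f64;
    if scaled < 0.0 {
        // Assumption: every negative input clamps to 0 and flags saturation.
        (0, true)
    } else if scaled.trunc() > f64::from(u32::MAX) {
        (u32::MAX, true)
    } else {
        // Truncate toward zero; the value is known to fit in u32 here.
        (scaled.trunc() as u32, false)
    }
}
```

For example, `cvt_f32_to_u32_sat_sketch(1.0, 16)` yields `(65536, false)`, while `-1.0` yields `(0, true)` for any `uimm`.
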
## Related Instructions

- [`vctuxs`](../vmx/vctuxs.md) — the standard Altivec equivalent (uint32 clamp with scale).
- [`vcfpsxws128`](vcfpsxws128.md) — signed variant.
- [`vcuxwfp128`](vcuxwfp128.md) — the inverse (uint → float with scale).
- [`vrfiz`](../vmx/vrfiz.md) — plain truncate-to-integer-float without scale.

## IBM Reference

- No IBM AIX entry — this instruction is exclusive to the Xbox 360's VMX128 extension.
- Xbox 360 XDK, Altivec-128 (VMX128) extensions (Microsoft internal documentation); cross-referenced with [IBM AltiVec Technology Programmer's Interface Manual §`vctuxs`](https://www.nxp.com/docs/en/reference-manual/ALTIVECPIM.pdf).

131
migration/project-root/ppc-manual/vmx128/vcsxwfp128.md
Normal file
@@ -0,0 +1,131 @@
# `vcsxwfp128` — Vector128 Convert From Signed Fixed-Point Word to Floating-Point

> **Category:** [VMX128](../categories/vmx128.md) · **Form:** [VX128_3](../forms/VX128_3.md) · **Opcode:** `0x180002b0`

<!-- GENERATED: BEGIN -->

## Assembler Mnemonics

| Mnemonic | XML entry | Flags | Description |
| --- | --- | --- | --- |
| `vcsxwfp128` | `vcsxwfp128` | — | Vector128 Convert From Signed Fixed-Point Word to Floating-Point |

## Syntax

```asm
vcsxwfp128 [VD], [VB], [UIMM]
```

## Encoding

### `vcsxwfp128` — form `VX128_3`

- **Opcode word:** `0x180002b0`
- **Primary opcode (bits 0–5):** `6`
- **Extended opcode:** `688`
- **Synchronising:** no

| Bits | Field | Meaning |
| --- | --- | --- |
| 0–5 | `OPCD` | primary opcode (6) |
| 6–10 | `VD128l` | destination low 5 bits |
| 11–15 | `IMM` | 5-bit immediate |
| 16–20 | `VB128l` | source B low 5 bits |
| 21–27 | `XO` | extended opcode |
| 28–29 | `VD128h` | destination high 2 bits |
| 30–31 | `VB128h` | source B high 2 bits |

## Operands

| Field | Role | Description |
| --- | --- | --- |
| `VB` | vcsxwfp128: read | Source B vector register. |
| `UIMM` | vcsxwfp128: read | 5-bit unsigned immediate (the `IMM` field, bits 11–15). Zero-extended. |
| `VD` | vcsxwfp128: write | Destination vector register. |

## Register Effects

### `vcsxwfp128`

- **Reads (always):** `VB`, `UIMM`
- **Reads (conditional):** _none_
- **Writes (always):** `VD`
- **Writes (conditional):** _none_

## Status-Register Effects

_No condition-register or status-register effects._

## Operation (pseudocode)

```
; Pseudocode derives directly from the xenia-rs interpreter
; arm (see Implementation References). Operation semantics:
;   - Read source operands from the fields listed under Operands.
;   - Apply the arithmetic / logical / memory action described
;     in the Description field above.
;   - Write results to the destination register(s); update any
;     status bits enumerated under Status-Register Effects.
; Consult the documents listed under IBM Reference for
; canonical PPC-style pseudocode where xenia's expression is
; terse.
```

## C Translation Example

```c
/* C translation: the xenia-rs interpreter arm below in */
/* Implementation References is the authoritative semantic */
/* snapshot. Translate it line-by-line: */
/*   - ctx.gpr[N] -> r[N] (or f[]/v[] for FPRs/VRs) */
/*   - mem.read_u* / write_u* -> mem_read_u*_be / mem_write_u*_be */
/*   - ctx.update_cr_signed(fld, v) -> update_cr_signed(fld, v) */
/*   - ctx.xer_ca / xer_ov / xer_so -> xer.CA / xer.OV / xer.SO */
/* The Register Effects and Status-Register Effects tables above */
/* enumerate every side effect a faithful translation must emit. */
```

## Implementation References

**`vcsxwfp128`**

- xenia-canary XML: [`tools/ppc-instructions.xml` — search for `mnem="vcsxwfp128"`](../../xenia-canary/tools/ppc-instructions.xml)
- xenia-canary emit: [`src/xenia/cpu/ppc/ppc_emit_altivec.cc:503`](../../xenia-canary/src/xenia/cpu/ppc/ppc_emit_altivec.cc#L503)
- xenia-rs opcode: [`crates/xenia-cpu/src/opcode.rs:98`](../../xenia-rs/crates/xenia-cpu/src/opcode.rs#L98)
- xenia-rs decoder: [`crates/xenia-cpu/src/decoder.rs:658`](../../xenia-rs/crates/xenia-cpu/src/decoder.rs#L658)
- xenia-rs interpreter: [`crates/xenia-cpu/src/interpreter.rs:4347-4354`](../../xenia-rs/crates/xenia-cpu/src/interpreter.rs#L4347-L4354)

<details><summary>xenia-rs interpreter body (frozen snapshot)</summary>

```rust
PpcOpcode::vcsxwfp128 => {
    let uimm = (instr.raw >> 16) & 0x1F;
    let b = crate::vmx::as_i32x4(ctx.vr[instr.vb128()]);
    let mut r = [0f32; 4];
    for i in 0..4 { r[i] = crate::vmx::cvt_i32_to_f32(b[i], uimm); }
    ctx.vr[instr.vd128()] = xenia_types::Vec128::from_f32x4_array(r);
    ctx.pc += 4;
}
```

</details>

<!-- GENERATED: END -->

## Special Cases & Edge Conditions

- **Signed fixed-point (int32) → float with explicit scale.** Each lane computes `VD[i] = (float)VB.w[i] * 2^-UIMM` (equivalently `(int32)VB.w[i] / 2^UIMM`). `UIMM` is a 5-bit unsigned bias that specifies a post-scale — the inverse direction of [`vcfpsxws128`](vcfpsxws128.md), so the `UIMM`s should match for a round-trip.
- **IEEE-754 binary32 output, round-to-nearest.** Values outside the exactly-representable range (`|x| > 2^24`) lose low-order bits; no saturation on the float side.
- **No `VSCR[SAT]` effect** — conversion in this direction never saturates.
- **`VSCR[NJ]` does not affect the int → float path.**
- **VMX128 register-fusion** applies (7-bit register IDs).
- **No IBM AIX entry** — Xenon-only. Closest standard Altivec op is [`vcfsx`](../vmx/vcfsx.md).
- **No `Rc`, no XER / FPSCR.**

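The scale and the precision limit described in the list above can be illustrated with a short sketch. The function name is hypothetical and this is not the body of `crate::vmx::cvt_i32_to_f32`; it relies only on Rust's `as f32` cast rounding to nearest and on division by an exact power of two introducing no further rounding in this range.

```rust
/// Hypothetical stand-in for `crate::vmx::cvt_i32_to_f32`.
fn cvt_i32_to_f32_sketch(x: i32, uimm: u32) -> f32 {
    // 2^UIMM is exactly representable in f32 for UIMM in 0..=31.
    let scale = (1u64 << (uimm & 0x1F)) as f32;
    // The cast rounds to nearest; the division only adjusts the exponent.
    (x as f32) / scale
}
```

With `uimm = 15`, the input `32768` maps back to `1.0`, which is the round-trip counterpart of a float → int conversion using the same `UIMM`. The input `16_777_217` (2^24 + 1) with `uimm = 0` comes out as `16_777_216.0`, showing the loss of low-order bits above 2^24.
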
## Related Instructions

- [`vcfsx`](../vmx/vcfsx.md) — the standard Altivec `int32 → float` with scale.
- [`vcuxwfp128`](vcuxwfp128.md) — unsigned-int variant.
- [`vcfpsxws128`](vcfpsxws128.md), [`vcfpuxws128`](vcfpuxws128.md) — the inverse (float → int with scale).

## IBM Reference

- No IBM AIX entry — Xbox 360 VMX128 extension only.
- Xbox 360 XDK, Altivec-128 (VMX128) extensions; cross-referenced with [IBM AltiVec Technology Programmer's Interface Manual §`vcfsx`](https://www.nxp.com/docs/en/reference-manual/ALTIVECPIM.pdf).

131
migration/project-root/ppc-manual/vmx128/vcuxwfp128.md
Normal file
@@ -0,0 +1,131 @@
# `vcuxwfp128` — Vector128 Convert From Unsigned Fixed-Point Word to Floating-Point

> **Category:** [VMX128](../categories/vmx128.md) · **Form:** [VX128_3](../forms/VX128_3.md) · **Opcode:** `0x180002f0`

<!-- GENERATED: BEGIN -->

## Assembler Mnemonics

| Mnemonic | XML entry | Flags | Description |
| --- | --- | --- | --- |
| `vcuxwfp128` | `vcuxwfp128` | — | Vector128 Convert From Unsigned Fixed-Point Word to Floating-Point |

## Syntax

```asm
vcuxwfp128 [VD], [VB], [UIMM]
```

## Encoding

### `vcuxwfp128` — form `VX128_3`

- **Opcode word:** `0x180002f0`
- **Primary opcode (bits 0–5):** `6`
- **Extended opcode:** `752`
- **Synchronising:** no

| Bits | Field | Meaning |
| --- | --- | --- |
| 0–5 | `OPCD` | primary opcode (6) |
| 6–10 | `VD128l` | destination low 5 bits |
| 11–15 | `IMM` | 5-bit immediate |
| 16–20 | `VB128l` | source B low 5 bits |
| 21–27 | `XO` | extended opcode |
| 28–29 | `VD128h` | destination high 2 bits |
| 30–31 | `VB128h` | source B high 2 bits |

## Operands

| Field | Role | Description |
| --- | --- | --- |
| `VB` | vcuxwfp128: read | Source B vector register. |
| `UIMM` | vcuxwfp128: read | 5-bit unsigned immediate (the `IMM` field, bits 11–15). Zero-extended. |
| `VD` | vcuxwfp128: write | Destination vector register. |

## Register Effects

### `vcuxwfp128`

- **Reads (always):** `VB`, `UIMM`
- **Reads (conditional):** _none_
- **Writes (always):** `VD`
- **Writes (conditional):** _none_

## Status-Register Effects

_No condition-register or status-register effects._

## Operation (pseudocode)

```
; Pseudocode derives directly from the xenia-rs interpreter
; arm (see Implementation References). Operation semantics:
;   - Read source operands from the fields listed under Operands.
;   - Apply the arithmetic / logical / memory action described
;     in the Description field above.
;   - Write results to the destination register(s); update any
;     status bits enumerated under Status-Register Effects.
; Consult the documents listed under IBM Reference for
; canonical PPC-style pseudocode where xenia's expression is
; terse.
```

## C Translation Example

```c
/* C translation: the xenia-rs interpreter arm below in */
/* Implementation References is the authoritative semantic */
/* snapshot. Translate it line-by-line: */
/*   - ctx.gpr[N] -> r[N] (or f[]/v[] for FPRs/VRs) */
/*   - mem.read_u* / write_u* -> mem_read_u*_be / mem_write_u*_be */
/*   - ctx.update_cr_signed(fld, v) -> update_cr_signed(fld, v) */
/*   - ctx.xer_ca / xer_ov / xer_so -> xer.CA / xer.OV / xer.SO */
/* The Register Effects and Status-Register Effects tables above */
/* enumerate every side effect a faithful translation must emit. */
```

## Implementation References

**`vcuxwfp128`**

- xenia-canary XML: [`tools/ppc-instructions.xml` — search for `mnem="vcuxwfp128"`](../../xenia-canary/tools/ppc-instructions.xml)
- xenia-canary emit: [`src/xenia/cpu/ppc/ppc_emit_altivec.cc:521`](../../xenia-canary/src/xenia/cpu/ppc/ppc_emit_altivec.cc#L521)
- xenia-rs opcode: [`crates/xenia-cpu/src/opcode.rs:98`](../../xenia-rs/crates/xenia-cpu/src/opcode.rs#L98)
- xenia-rs decoder: [`crates/xenia-cpu/src/decoder.rs:659`](../../xenia-rs/crates/xenia-cpu/src/decoder.rs#L659)
- xenia-rs interpreter: [`crates/xenia-cpu/src/interpreter.rs:4355-4362`](../../xenia-rs/crates/xenia-cpu/src/interpreter.rs#L4355-L4362)

<details><summary>xenia-rs interpreter body (frozen snapshot)</summary>

```rust
PpcOpcode::vcuxwfp128 => {
    let uimm = (instr.raw >> 16) & 0x1F;
    let b = ctx.vr[instr.vb128()].as_u32x4();
    let mut r = [0f32; 4];
    for i in 0..4 { r[i] = crate::vmx::cvt_u32_to_f32(b[i], uimm); }
    ctx.vr[instr.vd128()] = xenia_types::Vec128::from_f32x4_array(r);
    ctx.pc += 4;
}
```

</details>

<!-- GENERATED: END -->

## Special Cases & Edge Conditions

- **Unsigned fixed-point (uint32) → float with explicit scale.** Each lane computes `VD[i] = (float)VB.w[i] * 2^-UIMM`. Treats the 32-bit input as unsigned, so values ≥ `0x80000000` produce positive floats (unlike `vcsxwfp128` which would produce negatives).
- **IEEE-754 binary32 output, round-to-nearest.** Precision loss above `2^24`.
- **No `VSCR[SAT]` effect.**
- **`VSCR[NJ]` does not affect the uint → float path.**
- **VMX128 register-fusion** applies.
- **No IBM AIX entry** — Xenon-only. Closest standard is [`vcfux`](../vmx/vcfux.md).
- **No `Rc`, no XER / FPSCR.**

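The difference between the unsigned and signed interpretations of the same 32-bit pattern, noted in the first item above, can be seen in a short sketch. Both function names are hypothetical and neither is the body of the xenia-rs `crate::vmx` helpers; the sketch relies only on Rust's integer-to-`f32` casts rounding to nearest.

```rust
/// Hypothetical stand-in for `crate::vmx::cvt_u32_to_f32`.
fn cvt_u32_to_f32_sketch(x: u32, uimm: u32) -> f32 {
    // 2^UIMM is exactly representable in f32 for UIMM in 0..=31.
    let scale = (1u64 << (uimm & 0x1F)) as f32;
    (x as f32) / scale
}

/// Signed counterpart, shown only to contrast the two interpretations.
fn cvt_i32_to_f32_contrast(x: i32, uimm: u32) -> f32 {
    let scale = (1u64 << (uimm & 0x1F)) as f32;
    (x as f32) / scale
}
```

The bit pattern `0x8000_0000` converts to `2147483648.0` through the unsigned path and to `-2147483648.0` when reinterpreted as `i32` and sent through the signed path.
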
## Related Instructions

- [`vcfux`](../vmx/vcfux.md) — the standard Altivec `uint32 → float` with scale.
- [`vcsxwfp128`](vcsxwfp128.md) — signed-int variant.
- [`vcfpuxws128`](vcfpuxws128.md), [`vcfpsxws128`](vcfpsxws128.md) — the inverse (float → int with scale).

## IBM Reference

- No IBM AIX entry — Xbox 360 VMX128 extension only.
- Xbox 360 XDK, Altivec-128 (VMX128) extensions; cross-referenced with [IBM AltiVec Technology Programmer's Interface Manual §`vcfux`](https://www.nxp.com/docs/en/reference-manual/ALTIVECPIM.pdf).

148
migration/project-root/ppc-manual/vmx128/vmaddcfp128.md
Normal file
@@ -0,0 +1,148 @@
# `vmaddcfp128` — Vector128 Multiply Add Floating Point

> **Category:** [VMX128](../categories/vmx128.md) · **Form:** [VX128](../forms/VX128.md) · **Opcode:** `0x14000110`

<!-- GENERATED: BEGIN -->

## Assembler Mnemonics

| Mnemonic | XML entry | Flags | Description |
| --- | --- | --- | --- |
| `vmaddcfp128` | `vmaddcfp128` | — | Vector128 Multiply Add Floating Point |

## Syntax

```asm
vmaddcfp128 [VD], [VA], [VD], [VB]
```

## Encoding

### `vmaddcfp128` — form `VX128`

- **Opcode word:** `0x14000110`
- **Primary opcode (bits 0–5):** `5`
- **Extended opcode:** `272`
- **Synchronising:** no

| Bits | Field | Meaning |
| --- | --- | --- |
| 0–5 | `OPCD` | primary opcode (4 or 5) |
| 6–10 | `VD128l` | destination low 5 bits |
| 11–15 | `VA128l` | source A low 5 bits |
| 16–20 | `VB128l` | source B low 5 bits |
| 21 | `VA128H` | source A high bit |
| 22 | `—` | reserved |
| 23–25 | `VC` | optional VC / XO sub-field |
| 26 | `VA128h` | source A middle bit |
| 27 | `—` | reserved |
| 28–29 | `VD128h` | destination high 2 bits |
| 30–31 | `VB128h` | source B high 2 bits |

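The split register fields in the table above can be reassembled as sketched below. This is a minimal illustration, not the xenia-rs decoder: `bits` and `decode_vx128_regs` are hypothetical names, the code assumes PowerPC bit numbering (bit 0 is the most significant bit), and it assumes the "middle" and "high" labels for `VA128h` and `VA128H` mean bit 5 and bit 6 of the 7-bit register ID.

```rust
/// Extract bits `first..=last`, numbered PowerPC-style (bit 0 = MSB).
fn bits(raw: u32, first: u32, last: u32) -> u32 {
    let width = last - first + 1;
    (raw >> (31 - last)) & ((1u32 << width) - 1)
}

/// Return the 7-bit register IDs `(vd, va, vb)` of a VX128 word.
fn decode_vx128_regs(raw: u32) -> (u32, u32, u32) {
    // Destination and source B: 5 low bits plus 2 high bits.
    let vd = (bits(raw, 28, 29) << 5) | bits(raw, 6, 10);
    let vb = (bits(raw, 30, 31) << 5) | bits(raw, 16, 20);
    // Source A: 5 low bits, then the "middle" bit, then the "high" bit.
    let va = (bits(raw, 21, 21) << 6) | (bits(raw, 26, 26) << 5) | bits(raw, 11, 15);
    (vd, va, vb)
}
```

Because `VA` is scattered over three fields, a decoder that only reads bits 11–15 would see register IDs 0..31 and silently alias the upper registers.
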
## Operands

| Field | Role | Description |
| --- | --- | --- |
| `VA` | vmaddcfp128: read | Source A vector register. |
| `VD` | vmaddcfp128: read; vmaddcfp128: write | Destination vector register. |
| `VB` | vmaddcfp128: read | Source B vector register. |

## Register Effects

### `vmaddcfp128`

- **Reads (always):** `VA`, `VD`, `VB`
- **Reads (conditional):** _none_
- **Writes (always):** `VD`
- **Writes (conditional):** _none_

## Status-Register Effects

_No condition-register or status-register effects._

## Operation (pseudocode)

```
; Pseudocode derives directly from the xenia-rs interpreter
; arm (see Implementation References). Operation semantics:
;   - Read source operands from the fields listed under Operands.
;   - Apply the arithmetic / logical / memory action described
;     in the Description field above.
;   - Write results to the destination register(s); update any
;     status bits enumerated under Status-Register Effects.
; Consult the documents listed under IBM Reference for
; canonical PPC-style pseudocode where xenia's expression is
; terse.
```

## C Translation Example

```c
/* C translation: the xenia-rs interpreter arm below in */
/* Implementation References is the authoritative semantic */
/* snapshot. Translate it line-by-line: */
/*   - ctx.gpr[N] -> r[N] (or f[]/v[] for FPRs/VRs) */
/*   - mem.read_u* / write_u* -> mem_read_u*_be / mem_write_u*_be */
/*   - ctx.update_cr_signed(fld, v) -> update_cr_signed(fld, v) */
/*   - ctx.xer_ca / xer_ov / xer_so -> xer.CA / xer.OV / xer.SO */
/* The Register Effects and Status-Register Effects tables above */
/* enumerate every side effect a faithful translation must emit. */
```

## Implementation References

**`vmaddcfp128`**

- xenia-canary XML: [`tools/ppc-instructions.xml` — search for `mnem="vmaddcfp128"`](../../xenia-canary/tools/ppc-instructions.xml)
- xenia-canary emit: [`src/xenia/cpu/ppc/ppc_emit_altivec.cc:812`](../../xenia-canary/src/xenia/cpu/ppc/ppc_emit_altivec.cc#L812)
- xenia-rs opcode: [`crates/xenia-cpu/src/opcode.rs:100`](../../xenia-rs/crates/xenia-cpu/src/opcode.rs#L100)
- xenia-rs decoder: [`crates/xenia-cpu/src/decoder.rs:614`](../../xenia-rs/crates/xenia-cpu/src/decoder.rs#L614)
- xenia-rs interpreter: [`crates/xenia-cpu/src/interpreter.rs:4492-4509`](../../xenia-rs/crates/xenia-cpu/src/interpreter.rs#L4492-L4509)

<details><summary>xenia-rs interpreter body (frozen snapshot)</summary>

```rust
PpcOpcode::vmaddcfp128 => {
    // ISA: (VD) <- (VA × VD) + VB. Canary InstrEmit_vmaddcfp128 (cc:819): MulAdd(VA, VD, VB).
    // Previous code computed di.mul_add(bi, ai) = VD×VB+VA — both operands wrong
    // (PPCBUG-425). Fix: ai.mul_add(di, bi) = VA×VD+VB.
    let a = ctx.vr[instr.va128()].as_f32x4();
    let b = ctx.vr[instr.vb128()].as_f32x4();
    let d = ctx.vr[instr.vd128()].as_f32x4();
    let mut r = [0f32; 4];
    for i in 0..4 {
        let ai = vmx::flush_denorm(a[i]);
        let bi = vmx::flush_denorm(b[i]);
        let di = vmx::flush_denorm(d[i]);
        // PPCBUG-437: flush subnormal output too.
        r[i] = vmx::flush_denorm(ai.mul_add(di, bi));
    }
    ctx.vr[instr.vd128()] = xenia_types::Vec128::from_f32x4_array(r);
    ctx.pc += 4;
}
```

</details>

<!-- GENERATED: END -->

## Special Cases & Edge Conditions
|
||||
|
||||
- **Xbox-specific fused multiply-add variant.** Each lane computes `VD[i] = VD[i] * VB[i] + VA[i]` — note that `VD` is both source and destination (xenia reads `VD` first, then writes). This is *not* the standard [`vmaddfp`](../vmx/vmaddfp.md) operand order: the "addend" position is `VA`, the other factor is `VB`, and `VD` carries the on-going accumulator. The mnemonic's trailing `c` denotes "accumulator-in-VD" rather than a separate `VC` operand.
|
||||
- **Fused, single-rounding.** xenia-rs uses `f32::mul_add`, which rounds once. It compiles to a host FMA instruction when the target has one and falls back to a software routine otherwise, so host support affects speed rather than the result. xenia-canary emits the equivalent `MulAdd` HIR node.
|
||||
- **IEEE-754 binary32 lanes; `VSCR[NJ]` honoured.**
|
||||
- **No VSCR[SAT], no FPSCR update.**
|
||||
- **NaN propagation** per IEEE-754.
|
||||
- **VMX128 register-fusion** (7-bit IDs on `VA`, `VB`, `VD`).
|
||||
- **No IBM AIX entry** — Xenon-only.
|
||||
- **No `Rc`, no XER.**
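
The lane semantics of the interpreter snapshot (`VA × VD + VB`, rounded once) can be sketched in standalone Rust. This is a minimal model, not the xenia-rs code: registers are plain `[f32; 4]` arrays, and the local `flush_denorm` is an assumed stand-in for `vmx::flush_denorm` (here it turns subnormals into a same-signed zero).

```rust
/// Assumed stand-in for xenia-rs `vmx::flush_denorm`: subnormal values
/// become a zero of the same sign, everything else passes through.
fn flush_denorm(x: f32) -> f32 {
    if x.is_subnormal() { 0.0f32.copysign(x) } else { x }
}

/// Lane-wise model of vmaddcfp128: VD <- (VA * VD) + VB, one rounding per lane.
fn vmaddcfp128(va: [f32; 4], vb: [f32; 4], vd: [f32; 4]) -> [f32; 4] {
    let mut r = [0f32; 4];
    for i in 0..4 {
        let a = flush_denorm(va[i]);
        let b = flush_denorm(vb[i]);
        let d = flush_denorm(vd[i]);
        // a.mul_add(d, b) computes a * d + b with a single rounding.
        r[i] = flush_denorm(a.mul_add(d, b));
    }
    r
}
```

With `va = [2, 3, 0.5, -1]`, `vb = [1, 1, 1, 1]` and `vd = [4, 5, 6, 7]` the model yields `[9, 16, 4, -6]`; the pre-PPCBUG-425 order (`VD × VB + VA`) would have produced `6` in lane 0.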
|
||||
|
||||
## Related Instructions
|
||||
|
||||
- [`vmaddfp`](../vmx/vmaddfp.md), [`vmaddfp128`](../vmx/vmaddfp.md) — standard fused `(VA × VC) + VB`.
|
||||
- [`vmulfp128`](vmulfp128.md) — plain lane-wise float multiply.
|
||||
- [`vnmsubfp`](../vmx/vnmsubfp.md) — negative-multiply-subtract.
|
||||
- [`vmsum3fp128`](vmsum3fp128.md), [`vmsum4fp128`](vmsum4fp128.md) — dot-product reductions.
|
||||
|
||||
## IBM Reference
|
||||
|
||||
- No IBM AIX entry — this is an Xbox 360 VMX128 extension. Its semantics differ from the base Altivec [`vmaddfp`](../vmx/vmaddfp.md) in the operand order: `VD` replaces `VC` as the second factor, giving `VA × VD + VB`.
|
||||
- Xbox 360 XDK, Altivec-128 (VMX128) extensions.
|
||||
- [IBM AltiVec Technology Programmer's Interface Manual, Chapter 5 — Floating-Point Arithmetic](https://www.nxp.com/docs/en/reference-manual/ALTIVECPIM.pdf) for the base FMA semantics.
|
||||
141
migration/project-root/ppc-manual/vmx128/vmsum3fp128.md
Normal file
@@ -0,0 +1,141 @@
|
||||
# `vmsum3fp128` — Vector128 Multiply Sum 3-way Floating Point
|
||||
|
||||
> **Category:** [VMX128](../categories/vmx128.md) · **Form:** [VX128](../forms/VX128.md) · **Opcode:** `0x14000190`
|
||||
|
||||
<!-- GENERATED: BEGIN -->
|
||||
|
||||
## Assembler Mnemonics
|
||||
|
||||
| Mnemonic | XML entry | Flags | Description |
|
||||
| --- | --- | --- | --- |
|
||||
| `vmsum3fp128` | `vmsum3fp128` | — | Vector128 Multiply Sum 3-way Floating Point |
|
||||
|
||||
## Syntax
|
||||
|
||||
```asm
|
||||
vmsum3fp128 [VD], [VA], [VB]
|
||||
```
|
||||
|
||||
## Encoding
|
||||
|
||||
### `vmsum3fp128` — form `VX128`
|
||||
|
||||
- **Opcode word:** `0x14000190`
|
||||
- **Primary opcode (bits 0–5):** `5`
|
||||
- **Extended opcode:** `400`
|
||||
- **Synchronising:** no
|
||||
|
||||
| Bits | Field | Meaning |
|
||||
| --- | --- | --- |
|
||||
| 0–5 | `OPCD` | primary opcode (4 or 5) |
|
||||
| 6–10 | `VD128l` | destination low 5 bits |
|
||||
| 11–15 | `VA128l` | source A low 5 bits |
|
||||
| 16–20 | `VB128l` | source B low 5 bits |
|
||||
| 21 | `VA128H` | source A high bit |
|
||||
| 22 | `—` | reserved |
|
||||
| 23–25 | `VC` | optional VC / XO sub-field |
|
||||
| 26 | `VA128h` | source A middle bit |
|
||||
| 27 | `—` | reserved |
|
||||
| 28–29 | `VD128h` | destination high 2 bits |
|
||||
| 30–31 | `VB128h` | source B high 2 bits |
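
The split register fields in the table above can be reassembled into 7-bit register numbers as sketched below. The bit positions come from the table (IBM numbering, bit 0 = most significant); the helper names `field` and `decode_vx128_regs` are hypothetical, and the assembly order (low 5 bits, then the `h`/`H` bits above them) is an assumption based on the field descriptions.

```rust
/// Extract `width` bits whose most significant bit sits at IBM (MSB-0)
/// bit position `start`. Intended for widths below 32.
fn field(word: u32, start: u32, width: u32) -> u32 {
    (word >> (32 - start - width)) & ((1u32 << width) - 1)
}

/// Reassemble the 7-bit register numbers of a VX128-form instruction word.
/// Returns `(vd, va, vb)`.
fn decode_vx128_regs(word: u32) -> (u32, u32, u32) {
    // VD: low 5 bits at 6..=10, high 2 bits at 28..=29.
    let vd = field(word, 6, 5) | (field(word, 28, 2) << 5);
    // VA: low 5 bits at 11..=15, middle bit at 26, high bit at 21.
    let va = field(word, 11, 5) | (field(word, 26, 1) << 5) | (field(word, 21, 1) << 6);
    // VB: low 5 bits at 16..=20, high 2 bits at 30..=31.
    let vb = field(word, 16, 5) | (field(word, 30, 2) << 5);
    (vd, va, vb)
}
```

For example, OR-ing the register fields for `vd = 100`, `va = 77`, `vb = 35` into the opcode word `0x14000190` and decoding it again returns the same three numbers.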
|
||||
|
||||
## Operands
|
||||
|
||||
| Field | Role | Description |
|
||||
| --- | --- | --- |
|
||||
| `VA` | vmsum3fp128: read | Source A vector register. |
|
||||
| `VB` | vmsum3fp128: read | Source B vector register. |
|
||||
| `VD` | vmsum3fp128: write | Destination vector register. |
|
||||
|
||||
## Register Effects
|
||||
|
||||
### `vmsum3fp128`
|
||||
|
||||
- **Reads (always):** `VA`, `VB`
|
||||
- **Reads (conditional):** _none_
|
||||
- **Writes (always):** `VD`
|
||||
- **Writes (conditional):** _none_
|
||||
|
||||
## Status-Register Effects
|
||||
|
||||
_No condition-register or status-register effects._
|
||||
|
||||
## Operation (pseudocode)
|
||||
|
||||
```
; Pseudocode derived from the xenia-rs interpreter arm
; (see Implementation References).
; flush() is the denormal flush applied by vmx::flush_denorm.
p0 <- flush((VA)[0] * (VB)[0])
p1 <- flush((VA)[1] * (VB)[1])
p2 <- flush((VA)[2] * (VB)[2])
s  <- flush((p0 + p1) + p2)      ; lane 3 of VA and VB is ignored
(VD)[0], (VD)[1], (VD)[2], (VD)[3] <- s, s, s, s
```
|
||||
|
||||
## C Translation Example
|
||||
|
||||
```c
/* C translation sketch of the interpreter arm in Implementation References. */
/* Assumed names (not from the source): v[] is the vector register file with */
/* float lanes f32[4]; va, vb, vd are the decoded 7-bit register numbers;    */
/* flush_denorm() mirrors vmx::flush_denorm.                                  */
float p0 = flush_denorm(v[va].f32[0] * v[vb].f32[0]);
float p1 = flush_denorm(v[va].f32[1] * v[vb].f32[1]);
float p2 = flush_denorm(v[va].f32[2] * v[vb].f32[2]);
float s  = flush_denorm((p0 + p1) + p2);        /* lane 3 is ignored */
for (int i = 0; i < 4; i++) v[vd].f32[i] = s;   /* broadcast to all lanes */
pc += 4;
```
|
||||
|
||||
## Implementation References
|
||||
|
||||
**`vmsum3fp128`**
|
||||
- xenia-canary XML: [`tools/ppc-instructions.xml` — search for `mnem="vmsum3fp128"`](../../xenia-canary/tools/ppc-instructions.xml)
|
||||
- xenia-canary emit: [`src/xenia/cpu/ppc/ppc_emit_altivec.cc:1067`](../../xenia-canary/src/xenia/cpu/ppc/ppc_emit_altivec.cc#L1067)
|
||||
- xenia-rs opcode: [`crates/xenia-cpu/src/opcode.rs:106`](../../xenia-rs/crates/xenia-cpu/src/opcode.rs#L106)
|
||||
- xenia-rs decoder: [`crates/xenia-cpu/src/decoder.rs:616`](../../xenia-rs/crates/xenia-cpu/src/decoder.rs#L616)
|
||||
- xenia-rs interpreter: [`crates/xenia-cpu/src/interpreter.rs:4513-4523`](../../xenia-rs/crates/xenia-cpu/src/interpreter.rs#L4513-L4523)
|
||||
<details><summary>xenia-rs interpreter body (frozen snapshot)</summary>
|
||||
|
||||
```rust
|
||||
PpcOpcode::vmsum3fp128 => {
|
||||
// PPCBUG-436: flush per-product intermediates (not just the final sum).
|
||||
let a = ctx.vr[instr.va128()].as_f32x4();
|
||||
let b = ctx.vr[instr.vb128()].as_f32x4();
|
||||
let p0 = vmx::flush_denorm(a[0] * b[0]);
|
||||
let p1 = vmx::flush_denorm(a[1] * b[1]);
|
||||
let p2 = vmx::flush_denorm(a[2] * b[2]);
|
||||
let s = vmx::flush_denorm(p0 + p1 + p2);
|
||||
ctx.vr[instr.vd128()] = xenia_types::Vec128::from_f32x4(s, s, s, s);
|
||||
ctx.pc += 4;
|
||||
}
|
||||
```
|
||||
</details>
|
||||
|
||||
<!-- GENERATED: END -->
|
||||
|
||||
## Special Cases & Edge Conditions
|
||||
|
||||
- **3-way float dot product.** Computes `s = VA[0]*VB[0] + VA[1]*VB[1] + VA[2]*VB[2]` (ignoring lane 3 — the "w" component of a homogeneous vector) and **broadcasts `s` to every lane of `VD`**. Typical call site: 3D vector dot products where the w-component is padding.
|
||||
- **Scalar-result-splatted-across-lanes.** Consuming code can then use any lane of `VD` as the dot-product result.
|
||||
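- **Rounding.** xenia-rs rounds each product to binary32 and then performs two sequential adds, `(p0 + p1) + p2`; there is no fused multi-operand add. The order matches the spec, but because floating-point addition is not associative, any other summation order can change the low-order bits of the result. Games that need deterministic cross-host behaviour typically pre-scale their inputs.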
|
||||
- **IEEE-754 binary32; `VSCR[NJ]` honoured.**
|
||||
- **No VSCR[SAT], no FPSCR update.**
|
||||
- **VMX128 register-fusion** (7-bit IDs on `VA`, `VB`, `VD`).
|
||||
- **No IBM AIX entry** — Xenon-only.
|
||||
- **No `Rc`, no XER.**
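
The dot-product-and-broadcast behaviour listed above can be modelled in a few lines of standalone Rust. This is a sketch under assumptions, not the xenia-rs code: registers are `[f32; 4]` arrays and `flush_denorm` is a local stand-in for `vmx::flush_denorm`.

```rust
/// Assumed stand-in for xenia-rs `vmx::flush_denorm`: subnormal values
/// become a zero of the same sign.
fn flush_denorm(x: f32) -> f32 {
    if x.is_subnormal() { 0.0f32.copysign(x) } else { x }
}

/// Model of vmsum3fp128: dot product of lanes 0..=2, splatted to all lanes.
fn vmsum3fp128(va: [f32; 4], vb: [f32; 4]) -> [f32; 4] {
    let p0 = flush_denorm(va[0] * vb[0]);
    let p1 = flush_denorm(va[1] * vb[1]);
    let p2 = flush_denorm(va[2] * vb[2]);
    // Lane 3 never enters the sum, so padding (even NaN) there is harmless.
    let s = flush_denorm((p0 + p1) + p2);
    [s; 4]
}
```

Calling it with `[1, 2, 3, w]` and `[4, 5, 6, w]` gives `32` in every lane regardless of `w`.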
|
||||
|
||||
## Related Instructions
|
||||
|
||||
- [`vmsum4fp128`](vmsum4fp128.md) — 4-way dot-product (includes the w-lane).
|
||||
- [`vmulfp128`](vmulfp128.md), [`vaddfp`](../vmx/vaddfp.md) — the building blocks.
|
||||
- [`vmaddcfp128`](vmaddcfp128.md), [`vmaddfp`](../vmx/vmaddfp.md) — fused MAC variants.
|
||||
- [`vsumsws`](../vmx/vsumsws.md) — integer sum-reduction analogue.
|
||||
|
||||
## IBM Reference
|
||||
|
||||
- No IBM AIX entry — Xbox 360 VMX128 extension only.
|
||||
- Xbox 360 XDK, Altivec-128 (VMX128) extensions. A 3-way dot product is a direct mirror of D3D9's `float3 dot`.
|
||||
- [IBM AltiVec Technology Programmer's Interface Manual, Chapter 5 — Floating-Point Arithmetic](https://www.nxp.com/docs/en/reference-manual/ALTIVECPIM.pdf) for the base float arithmetic semantics.
|
||||
142
migration/project-root/ppc-manual/vmx128/vmsum4fp128.md
Normal file
@@ -0,0 +1,142 @@
|
||||
# `vmsum4fp128` — Vector128 Multiply Sum 4-way Floating-Point
|
||||
|
||||
> **Category:** [VMX128](../categories/vmx128.md) · **Form:** [VX128](../forms/VX128.md) · **Opcode:** `0x140001d0`
|
||||
|
||||
<!-- GENERATED: BEGIN -->
|
||||
|
||||
## Assembler Mnemonics
|
||||
|
||||
| Mnemonic | XML entry | Flags | Description |
|
||||
| --- | --- | --- | --- |
|
||||
| `vmsum4fp128` | `vmsum4fp128` | — | Vector128 Multiply Sum 4-way Floating-Point |
|
||||
|
||||
## Syntax
|
||||
|
||||
```asm
|
||||
vmsum4fp128 [VD], [VA], [VB]
|
||||
```
|
||||
|
||||
## Encoding
|
||||
|
||||
### `vmsum4fp128` — form `VX128`
|
||||
|
||||
- **Opcode word:** `0x140001d0`
|
||||
- **Primary opcode (bits 0–5):** `5`
|
||||
- **Extended opcode:** `464`
|
||||
- **Synchronising:** no
|
||||
|
||||
| Bits | Field | Meaning |
|
||||
| --- | --- | --- |
|
||||
| 0–5 | `OPCD` | primary opcode (4 or 5) |
|
||||
| 6–10 | `VD128l` | destination low 5 bits |
|
||||
| 11–15 | `VA128l` | source A low 5 bits |
|
||||
| 16–20 | `VB128l` | source B low 5 bits |
|
||||
| 21 | `VA128H` | source A high bit |
|
||||
| 22 | `—` | reserved |
|
||||
| 23–25 | `VC` | optional VC / XO sub-field |
|
||||
| 26 | `VA128h` | source A middle bit |
|
||||
| 27 | `—` | reserved |
|
||||
| 28–29 | `VD128h` | destination high 2 bits |
|
||||
| 30–31 | `VB128h` | source B high 2 bits |
|
||||
|
||||
## Operands
|
||||
|
||||
| Field | Role | Description |
|
||||
| --- | --- | --- |
|
||||
| `VA` | vmsum4fp128: read | Source A vector register. |
|
||||
| `VB` | vmsum4fp128: read | Source B vector register. |
|
||||
| `VD` | vmsum4fp128: write | Destination vector register. |
|
||||
|
||||
## Register Effects
|
||||
|
||||
### `vmsum4fp128`
|
||||
|
||||
- **Reads (always):** `VA`, `VB`
|
||||
- **Reads (conditional):** _none_
|
||||
- **Writes (always):** `VD`
|
||||
- **Writes (conditional):** _none_
|
||||
|
||||
## Status-Register Effects
|
||||
|
||||
_No condition-register or status-register effects._
|
||||
|
||||
## Operation (pseudocode)
|
||||
|
||||
```
; Pseudocode derived from the xenia-rs interpreter arm
; (see Implementation References).
; flush() is the denormal flush applied by vmx::flush_denorm.
p0 <- flush((VA)[0] * (VB)[0])
p1 <- flush((VA)[1] * (VB)[1])
p2 <- flush((VA)[2] * (VB)[2])
p3 <- flush((VA)[3] * (VB)[3])
s  <- flush(((p0 + p1) + p2) + p3)
(VD)[0], (VD)[1], (VD)[2], (VD)[3] <- s, s, s, s
```
|
||||
|
||||
## C Translation Example
|
||||
|
||||
```c
/* C translation sketch of the interpreter arm in Implementation References. */
/* Assumed names (not from the source): v[] is the vector register file with */
/* float lanes f32[4]; va, vb, vd are the decoded 7-bit register numbers;    */
/* flush_denorm() mirrors vmx::flush_denorm.                                  */
float p0 = flush_denorm(v[va].f32[0] * v[vb].f32[0]);
float p1 = flush_denorm(v[va].f32[1] * v[vb].f32[1]);
float p2 = flush_denorm(v[va].f32[2] * v[vb].f32[2]);
float p3 = flush_denorm(v[va].f32[3] * v[vb].f32[3]);
float s  = flush_denorm(((p0 + p1) + p2) + p3);
for (int i = 0; i < 4; i++) v[vd].f32[i] = s;   /* broadcast to all lanes */
pc += 4;
```
|
||||
|
||||
## Implementation References
|
||||
|
||||
**`vmsum4fp128`**
|
||||
- xenia-canary XML: [`tools/ppc-instructions.xml` — search for `mnem="vmsum4fp128"`](../../xenia-canary/tools/ppc-instructions.xml)
|
||||
- xenia-canary emit: [`src/xenia/cpu/ppc/ppc_emit_altivec.cc:1077`](../../xenia-canary/src/xenia/cpu/ppc/ppc_emit_altivec.cc#L1077)
|
||||
- xenia-rs opcode: [`crates/xenia-cpu/src/opcode.rs:106`](../../xenia-rs/crates/xenia-cpu/src/opcode.rs#L106)
|
||||
- xenia-rs decoder: [`crates/xenia-cpu/src/decoder.rs:617`](../../xenia-rs/crates/xenia-cpu/src/decoder.rs#L617)
|
||||
- xenia-rs interpreter: [`crates/xenia-cpu/src/interpreter.rs:4524-4535`](../../xenia-rs/crates/xenia-cpu/src/interpreter.rs#L4524-L4535)
|
||||
<details><summary>xenia-rs interpreter body (frozen snapshot)</summary>
|
||||
|
||||
```rust
|
||||
PpcOpcode::vmsum4fp128 => {
|
||||
// PPCBUG-436.
|
||||
let a = ctx.vr[instr.va128()].as_f32x4();
|
||||
let b = ctx.vr[instr.vb128()].as_f32x4();
|
||||
let p0 = vmx::flush_denorm(a[0] * b[0]);
|
||||
let p1 = vmx::flush_denorm(a[1] * b[1]);
|
||||
let p2 = vmx::flush_denorm(a[2] * b[2]);
|
||||
let p3 = vmx::flush_denorm(a[3] * b[3]);
|
||||
let s = vmx::flush_denorm(p0 + p1 + p2 + p3);
|
||||
ctx.vr[instr.vd128()] = xenia_types::Vec128::from_f32x4(s, s, s, s);
|
||||
ctx.pc += 4;
|
||||
}
|
||||
```
|
||||
</details>
|
||||
|
||||
<!-- GENERATED: END -->
|
||||
|
||||
## Special Cases & Edge Conditions
|
||||
|
||||
- **4-way float dot product.** Computes `s = VA[0]*VB[0] + VA[1]*VB[1] + VA[2]*VB[2] + VA[3]*VB[3]` (the full xyzw dot) and **broadcasts `s` to every lane of `VD`**.
|
||||
- **Scalar-result-splatted-across-lanes.** Direct mirror of HLSL/GLSL's `float4 dot`.
|
||||
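- **Rounding.** Each product is rounded to binary32, followed by three sequential adds, `((p0 + p1) + p2) + p3`. Because floating-point addition is not associative, the summation order can change the low-order bits of the result. xenia-rs does not use an FMA here.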
|
||||
- **IEEE-754 binary32; `VSCR[NJ]` honoured.**
|
||||
- **No VSCR[SAT], no FPSCR update.**
|
||||
- **VMX128 register-fusion** (7-bit IDs on `VA`, `VB`, `VD`).
|
||||
- **No IBM AIX entry** — Xenon-only.
|
||||
- **No `Rc`, no XER.**
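
Why the summation order matters can be seen in a small Rust sketch. `vmsum4fp128` below follows the left-to-right order of the interpreter snapshot; `dot4_pairwise` is a hypothetical alternative that is *not* what xenia-rs does and exists only for comparison. `flush_denorm` is an assumed stand-in for `vmx::flush_denorm`.

```rust
/// Assumed stand-in for xenia-rs `vmx::flush_denorm`.
fn flush_denorm(x: f32) -> f32 {
    if x.is_subnormal() { 0.0f32.copysign(x) } else { x }
}

/// Flushed per-lane products shared by both summation orders.
fn products(va: [f32; 4], vb: [f32; 4]) -> [f32; 4] {
    let mut p = [0f32; 4];
    for i in 0..4 {
        p[i] = flush_denorm(va[i] * vb[i]);
    }
    p
}

/// Model of vmsum4fp128: sequential sum, broadcast to all lanes.
fn vmsum4fp128(va: [f32; 4], vb: [f32; 4]) -> [f32; 4] {
    let p = products(va, vb);
    [flush_denorm(((p[0] + p[1]) + p[2]) + p[3]); 4]
}

/// Hypothetical pairwise order, shown only to contrast the rounding.
fn dot4_pairwise(va: [f32; 4], vb: [f32; 4]) -> f32 {
    let p = products(va, vb);
    flush_denorm((p[0] + p[1]) + (p[2] + p[3]))
}
```

For `va = [1e8, 1, -1e8, 1]` and `vb = [1, 1, 1, 1]` the sequential order returns `1.0` while the pairwise order returns `0.0`; the exact answer is `2`. Inputs with this much cancellation are an extreme case, but they show that an emulator has to reproduce the order, not just the formula.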
|
||||
|
||||
## Related Instructions
|
||||
|
||||
- [`vmsum3fp128`](vmsum3fp128.md) — 3-way dot-product (ignores the w-lane).
|
||||
- [`vmulfp128`](vmulfp128.md), [`vaddfp`](../vmx/vaddfp.md) — the building blocks.
|
||||
- [`vmaddcfp128`](vmaddcfp128.md), [`vmaddfp`](../vmx/vmaddfp.md) — fused MAC variants.
|
||||
- [`vsumsws`](../vmx/vsumsws.md) — integer sum-reduction analogue.
|
||||
|
||||
## IBM Reference
|
||||
|
||||
- No IBM AIX entry — Xbox 360 VMX128 extension only.
|
||||
- Xbox 360 XDK, Altivec-128 (VMX128) extensions. Directly mirrors D3D9's `float4 dot`.
|
||||
- [IBM AltiVec Technology Programmer's Interface Manual, Chapter 5 — Floating-Point Arithmetic](https://www.nxp.com/docs/en/reference-manual/ALTIVECPIM.pdf) for base float semantics.
|
||||
142
migration/project-root/ppc-manual/vmx128/vmulfp128.md
Normal file
@@ -0,0 +1,142 @@
|
||||
# `vmulfp128` — Vector128 Multiply Floating-Point
|
||||
|
||||
> **Category:** [VMX128](../categories/vmx128.md) · **Form:** [VX128](../forms/VX128.md) · **Opcode:** `0x14000090`
|
||||
|
||||
<!-- GENERATED: BEGIN -->
|
||||
|
||||
## Assembler Mnemonics
|
||||
|
||||
| Mnemonic | XML entry | Flags | Description |
|
||||
| --- | --- | --- | --- |
|
||||
| `vmulfp128` | `vmulfp128` | — | Vector128 Multiply Floating-Point |
|
||||
|
||||
## Syntax
|
||||
|
||||
```asm
|
||||
vmulfp128 [VD], [VA], [VB]
|
||||
```
|
||||
|
||||
## Encoding
|
||||
|
||||
### `vmulfp128` — form `VX128`
|
||||
|
||||
- **Opcode word:** `0x14000090`
|
||||
- **Primary opcode (bits 0–5):** `5`
|
||||
- **Extended opcode:** `144`
|
||||
- **Synchronising:** no
|
||||
|
||||
| Bits | Field | Meaning |
|
||||
| --- | --- | --- |
|
||||
| 0–5 | `OPCD` | primary opcode (4 or 5) |
|
||||
| 6–10 | `VD128l` | destination low 5 bits |
|
||||
| 11–15 | `VA128l` | source A low 5 bits |
|
||||
| 16–20 | `VB128l` | source B low 5 bits |
|
||||
| 21 | `VA128H` | source A high bit |
|
||||
| 22 | `—` | reserved |
|
||||
| 23–25 | `VC` | optional VC / XO sub-field |
|
||||
| 26 | `VA128h` | source A middle bit |
|
||||
| 27 | `—` | reserved |
|
||||
| 28–29 | `VD128h` | destination high 2 bits |
|
||||
| 30–31 | `VB128h` | source B high 2 bits |
|
||||
|
||||
## Operands
|
||||
|
||||
| Field | Role | Description |
|
||||
| --- | --- | --- |
|
||||
| `VA` | vmulfp128: read | Source A vector register. |
|
||||
| `VB` | vmulfp128: read | Source B vector register. |
|
||||
| `VD` | vmulfp128: write | Destination vector register. |
|
||||
|
||||
## Register Effects
|
||||
|
||||
### `vmulfp128`
|
||||
|
||||
- **Reads (always):** `VA`, `VB`
|
||||
- **Reads (conditional):** _none_
|
||||
- **Writes (always):** `VD`
|
||||
- **Writes (conditional):** _none_
|
||||
|
||||
## Status-Register Effects
|
||||
|
||||
_No condition-register or status-register effects._
|
||||
|
||||
## Operation (pseudocode)
|
||||
|
||||
```
; Pseudocode derived from the xenia-rs interpreter arm
; (see Implementation References).
; flush() is the denormal flush applied by vmx::flush_denorm.
do i = 0 to 3
    a <- flush((VA)[i])
    b <- flush((VB)[i])
    (VD)[i] <- flush(a * b)
end
```
|
||||
|
||||
## C Translation Example
|
||||
|
||||
```c
/* C translation sketch of the interpreter arm in Implementation References. */
/* Assumed names (not from the source): vec128_t has float lanes f32[4];     */
/* v[] is the vector register file; va, vb, vd are the decoded 7-bit         */
/* register numbers; flush_denorm() mirrors vmx::flush_denorm.               */
vec128_t r;
for (int i = 0; i < 4; i++) {
    float a = flush_denorm(v[va].f32[i]);   /* flush inputs */
    float b = flush_denorm(v[vb].f32[i]);
    r.f32[i] = flush_denorm(a * b);         /* flush the product too */
}
v[vd] = r;
pc += 4;
```
|
||||
|
||||
## Implementation References
|
||||
|
||||
**`vmulfp128`**
|
||||
- xenia-canary XML: [`tools/ppc-instructions.xml` — search for `mnem="vmulfp128"`](../../xenia-canary/tools/ppc-instructions.xml)
|
||||
- xenia-canary emit: [`src/xenia/cpu/ppc/ppc_emit_altivec.cc:1126`](../../xenia-canary/src/xenia/cpu/ppc/ppc_emit_altivec.cc#L1126)
|
||||
- xenia-rs opcode: [`crates/xenia-cpu/src/opcode.rs:108`](../../xenia-rs/crates/xenia-cpu/src/opcode.rs#L108)
|
||||
- xenia-rs decoder: [`crates/xenia-cpu/src/decoder.rs:612`](../../xenia-rs/crates/xenia-cpu/src/decoder.rs#L612)
|
||||
- xenia-rs interpreter: [`crates/xenia-cpu/src/interpreter.rs:2108-2120`](../../xenia-rs/crates/xenia-cpu/src/interpreter.rs#L2108-L2120)
|
||||
<details><summary>xenia-rs interpreter body (frozen snapshot)</summary>
|
||||
|
||||
```rust
|
||||
PpcOpcode::vmulfp128 => {
|
||||
// PPCBUG-435 + PPCBUG-437.
|
||||
let a = ctx.vr[instr.va128()].as_f32x4();
|
||||
let b = ctx.vr[instr.vb128()].as_f32x4();
|
||||
let mut r = [0f32; 4];
|
||||
for i in 0..4 {
|
||||
let ai = vmx::flush_denorm(a[i]);
|
||||
let bi = vmx::flush_denorm(b[i]);
|
||||
r[i] = vmx::flush_denorm(ai * bi);
|
||||
}
|
||||
ctx.vr[instr.vd128()] = xenia_types::Vec128::from_f32x4_array(r);
|
||||
ctx.pc += 4;
|
||||
}
|
||||
```
|
||||
</details>
|
||||
|
||||
<!-- GENERATED: END -->
|
||||
|
||||
## Special Cases & Edge Conditions
|
||||
|
||||
- **Lane-wise float multiply — Xenon-only.** Base Altivec has no dedicated `vmulfp`; the pattern on traditional PowerPC is `vmaddfp vD, vA, vC, v_zero`. Xenon adds this direct instruction, saving the zero-register setup.
|
||||
- **IEEE-754 binary32, round-to-nearest.** Each of the four lanes computes `VD[i] = VA[i] * VB[i]`.
|
||||
- **`VSCR[NJ]` honoured** (denormals flush-to-zero).
|
||||
- **NaN propagation** per IEEE-754.
|
||||
- **No VSCR[SAT], no FPSCR update, no exceptions.**
|
||||
- **VMX128 register-fusion** (7-bit IDs).
|
||||
- **No IBM AIX entry** — Xbox-specific; contrast with the `vmaddfp`-with-zero workaround used on non-Xenon Altivec.
|
||||
- **No `Rc`, no XER.**
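
The lane-wise multiply with denormal flushing on both inputs and output can be sketched as follows. This is a standalone model with assumptions: registers are `[f32; 4]` arrays, and `flush_denorm` is a local stand-in for `vmx::flush_denorm` that is applied unconditionally, as in the interpreter snapshot.

```rust
/// Assumed stand-in for xenia-rs `vmx::flush_denorm`: subnormal values
/// become a zero of the same sign.
fn flush_denorm(x: f32) -> f32 {
    if x.is_subnormal() { 0.0f32.copysign(x) } else { x }
}

/// Model of vmulfp128: VD[i] = VA[i] * VB[i] with inputs and result flushed.
fn vmulfp128(va: [f32; 4], vb: [f32; 4]) -> [f32; 4] {
    let mut r = [0f32; 4];
    for i in 0..4 {
        let a = flush_denorm(va[i]);
        let b = flush_denorm(vb[i]);
        r[i] = flush_denorm(a * b);
    }
    r
}
```

Two cases show the effect of the flush: a subnormal input multiplied by `1e30` yields `0` rather than a small normal number, and `1e-20 * 1e-20` (a subnormal product) also yields `0`.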
|
||||
|
||||
## Related Instructions
|
||||
|
||||
- [`vmaddfp`](../vmx/vmaddfp.md), [`vmaddcfp128`](vmaddcfp128.md) — fused MAC forms.
|
||||
- [`vaddfp`](../vmx/vaddfp.md), [`vsubfp`](../vmx/vsubfp.md) — lane-wise float add/sub.
|
||||
- [`vmsum3fp128`](vmsum3fp128.md), [`vmsum4fp128`](vmsum4fp128.md) — dot-product reductions.
|
||||
|
||||
## IBM Reference
|
||||
|
||||
- No IBM AIX entry — this instruction is exclusive to the Xbox 360's VMX128 extension.
|
||||
- Xbox 360 XDK, Altivec-128 (VMX128) extensions. Non-Xenon Altivec code emits `vmaddfp vD, vA, vC, v_zero` to achieve the same effect.
|
||||
- [IBM AltiVec Technology Programmer's Interface Manual §`vmaddfp`](https://www.nxp.com/docs/en/reference-manual/ALTIVECPIM.pdf) for the underlying float semantics.
|
||||
139
migration/project-root/ppc-manual/vmx128/vpermwi128.md
Normal file
@@ -0,0 +1,139 @@
|
||||
# `vpermwi128` — Vector128 Permutate Word Immediate
|
||||
|
||||
> **Category:** [VMX128](../categories/vmx128.md) · **Form:** [VX128_P](../forms/VX128_P.md) · **Opcode:** `0x18000210`
|
||||
|
||||
<!-- GENERATED: BEGIN -->
|
||||
|
||||
## Assembler Mnemonics
|
||||
|
||||
| Mnemonic | XML entry | Flags | Description |
|
||||
| --- | --- | --- | --- |
|
||||
| `vpermwi128` | `vpermwi128` | — | Vector128 Permutate Word Immediate |
|
||||
|
||||
## Syntax
|
||||
|
||||
```asm
|
||||
vpermwi128 [VD], [VB], [UIMM]
|
||||
```
|
||||
|
||||
## Encoding
|
||||
|
||||
### `vpermwi128` — form `VX128_P`
|
||||
|
||||
- **Opcode word:** `0x18000210`
|
||||
- **Primary opcode (bits 0–5):** `6`
|
||||
- **Extended opcode:** `528`
|
||||
- **Synchronising:** no
|
||||
|
||||
| Bits | Field | Meaning |
|
||||
| --- | --- | --- |
|
||||
| 0–5 | `OPCD` | primary opcode (6) |
|
||||
| 6–10 | `VD128l` | destination low 5 bits |
|
||||
| 11–15 | `PERMl` | permute selector low 5 bits |
|
||||
| 16–20 | `VB128l` | source B low 5 bits |
|
||||
| 21–22 | `—` | reserved |
|
||||
| 23–25 | `PERMh` | permute selector high 3 bits |
|
||||
| 28–29 | `VD128h` | destination high 2 bits |
|
||||
| 30–31 | `VB128h` | source B high 2 bits |
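
The 8-bit permute selector is split across two fields in the table above. A minimal Rust sketch of reassembling it follows; bit positions are taken from the table (IBM numbering, bit 0 = most significant), and the helper names `field` and `decode_perm` are hypothetical rather than xenia-rs functions.

```rust
/// Extract `width` bits whose most significant bit sits at IBM (MSB-0)
/// bit position `start`. Intended for widths below 32.
fn field(word: u32, start: u32, width: u32) -> u32 {
    (word >> (32 - start - width)) & ((1u32 << width) - 1)
}

/// Reassemble the 8-bit PERM immediate: PERMh (bits 23..=25) supplies the
/// top 3 bits, PERMl (bits 11..=15) the low 5 bits.
fn decode_perm(word: u32) -> u8 {
    (field(word, 11, 5) | (field(word, 23, 3) << 5)) as u8
}
```

Encoding the selector `0xE4` into the opcode word `0x18000210` puts `0b00100` into `PERMl` and `0b111` into `PERMh`; `decode_perm` recovers `0xE4` from that word.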
|
||||
|
||||
## Operands
|
||||
|
||||
| Field | Role | Description |
|
||||
| --- | --- | --- |
|
||||
| `VB` | vpermwi128: read | Source B vector register. |
|
||||
| `UIMM` | vpermwi128: read | 8-bit unsigned permute immediate (`PERMh ‖ PERMl` in the encoding), read as four 2-bit lane selectors. |
|
||||
| `VD` | vpermwi128: write | Destination vector register. |
|
||||
|
||||
## Register Effects
|
||||
|
||||
### `vpermwi128`
|
||||
|
||||
- **Reads (always):** `VB`, `UIMM`
|
||||
- **Reads (conditional):** _none_
|
||||
- **Writes (always):** `VD`
|
||||
- **Writes (conditional):** _none_
|
||||
|
||||
## Status-Register Effects
|
||||
|
||||
_No condition-register or status-register effects._
|
||||
|
||||
## Operation (pseudocode)
|
||||
|
||||
```
; Pseudocode derived from the xenia-rs interpreter arm
; (see Implementation References).
; PERM is the 8-bit immediate PERMh || PERMl; word lane 0 is the
; most significant word. (VB) is read in full before (VD) is written.
do i = 0 to 3
    sel <- (PERM >> (2 * (3 - i))) & 3
    (VD)[i] <- (VB)[sel]
end
```
|
||||
|
||||
## C Translation Example
|
||||
|
||||
```c
/* C translation sketch of the interpreter arm in Implementation References. */
/* Assumed names (not from the source): vec128_t has uint32_t lanes u32[4];  */
/* v[] is the vector register file; vb, vd are the decoded 7-bit register    */
/* numbers; perm is the 8-bit PERM immediate.                                */
vec128_t b = v[vb];                      /* copy first: VD may alias VB */
for (int i = 0; i < 4; i++) {
    unsigned sel = (perm >> (2 * (3 - i))) & 3u;
    v[vd].u32[i] = b.u32[sel];
}
pc += 4;
```
|
||||
|
||||
## Implementation References
|
||||
|
||||
**`vpermwi128`**
|
||||
- xenia-canary XML: [`tools/ppc-instructions.xml` — search for `mnem="vpermwi128"`](../../xenia-canary/tools/ppc-instructions.xml)
|
||||
- xenia-canary emit: [`src/xenia/cpu/ppc/ppc_emit_altivec.cc:1207`](../../xenia-canary/src/xenia/cpu/ppc/ppc_emit_altivec.cc#L1207)
|
||||
- xenia-rs opcode: [`crates/xenia-cpu/src/opcode.rs:112`](../../xenia-rs/crates/xenia-cpu/src/opcode.rs#L112)
|
||||
- xenia-rs decoder: [`crates/xenia-cpu/src/decoder.rs:642`](../../xenia-rs/crates/xenia-cpu/src/decoder.rs#L642)
|
||||
- xenia-rs interpreter: [`crates/xenia-cpu/src/interpreter.rs:4537-4548`](../../xenia-rs/crates/xenia-cpu/src/interpreter.rs#L4537-L4548)
|
||||
<details><summary>xenia-rs interpreter body (frozen snapshot)</summary>
|
||||
|
||||
```rust
|
||||
PpcOpcode::vpermwi128 => {
|
||||
let imm = instr.vx128_p_perm();
|
||||
let b = ctx.vr[instr.vb128()].as_u32x4();
|
||||
let mut r = [0u32; 4];
|
||||
// Output lane i ← b[(imm >> (2 * (3-i))) & 3]
|
||||
for i in 0..4 {
|
||||
let sel = ((imm >> (2 * (3 - i))) & 3) as usize;
|
||||
r[i] = b[sel];
|
||||
}
|
||||
ctx.vr[instr.vd128()] = xenia_types::Vec128::from_u32x4_array(r);
|
||||
ctx.pc += 4;
|
||||
}
|
||||
```
|
||||
</details>
|
||||
|
||||
<!-- GENERATED: END -->
|
||||
|
||||
## Special Cases & Edge Conditions
|
||||
|
||||
- **Word-level 4-way permute via an 8-bit immediate.** The 8-bit `PERM` immediate (carried in fields `PERMh ‖ PERMl` of the encoding) is treated as **four 2-bit selectors**, one per output word lane. Each 2-bit field selects which of `VB`'s 4 word lanes is copied to the corresponding output lane.
|
||||
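- **Bit layout of the immediate.** Counting from the least significant bit (bit 0 = LSB, unlike the MSB-first numbering of the encoding table), output lane 0 (the most significant word) is selected by bits 7–6 of `PERM`, lane 1 by bits 5–4, lane 2 by bits 3–2, and lane 3 by bits 1–0. (In xenia: `sel = (imm >> (2 * (3-i))) & 3`.)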
|
||||
- **Super-set of [`vspltw`](../vmx/vspltw.md).** A splat is `vpermwi128 vD, vB, imm` with `imm = 0x00` (all lanes = word 0), `0x55` (all = word 1), `0xAA` (all = word 2), or `0xFF` (all = word 3). The identity shuffle is `0x1B`, and an arbitrary shuffle such as "xyzw → wzyx" (`0xE4`) is a single instruction.
|
||||
- **Immediate-only.** No dynamic selector vector; contrast with [`vperm`](../vmx/vperm.md).
|
||||
- **Single-source.** Unlike `vperm`/`vperm128`, `vpermwi128` only reshuffles one register (`VB`); it cannot interleave two operands.
|
||||
- **VMX128 register-fusion** on `VD` and `VB` (7-bit IDs).
|
||||
- **No IBM AIX entry** — Xenon-only.
|
||||
- **No `Rc`, no XER, no VSCR.**
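
The selector arithmetic described above can be tried out with a standalone Rust sketch. `vpermwi128` follows the interpreter snapshot's `sel` expression on a plain `[u32; 4]`; `perm_imm` is a hypothetical convenience helper for building the immediate from four lane indices and does not exist in xenia-rs.

```rust
/// Model of vpermwi128: output lane i takes the source word selected by
/// the i-th 2-bit field of `imm`, counted from the most significant pair.
fn vpermwi128(vb: [u32; 4], imm: u8) -> [u32; 4] {
    let mut r = [0u32; 4];
    for i in 0..4 {
        let sel = ((imm >> (2 * (3 - i))) & 3) as usize;
        r[i] = vb[sel];
    }
    r
}

/// Hypothetical helper: pack four lane selectors (each 0..=3) into the
/// 8-bit immediate, selector for lane 0 in the top two bits.
fn perm_imm(s0: u8, s1: u8, s2: u8, s3: u8) -> u8 {
    ((s0 & 3) << 6) | ((s1 & 3) << 4) | ((s2 & 3) << 2) | (s3 & 3)
}
```

With `vb = [10, 11, 12, 13]`, the immediate `0x1B` returns the input unchanged, `0xE4` reverses it, and `0x55` splats word 1; `perm_imm(3, 2, 1, 0)` evaluates to `0xE4`.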
|
||||
|
||||
## Related Instructions
|
||||
|
||||
- [`vperm`](../vmx/vperm.md), [`vperm128`](../vmx/vperm.md) — general byte-granularity permute (two-source).
|
||||
- [`vspltw`](../vmx/vspltw.md), [`vspltw128`](../vmx/vspltw.md) — single-word splat (special case of `vpermwi128`).
|
||||
- [`vsldoi`](../vmx/vsldoi.md) — immediate byte shift across the concatenation of two registers (a byte rotate when both sources are the same register).
|
||||
- [`vrlimi128`](vrlimi128.md) — rotate + mask-insert (per-word rotate with an insert mask).
|
||||
|
||||
## IBM Reference
|
||||
|
||||
- No IBM AIX entry — this instruction is exclusive to the Xbox 360's VMX128 extension.
|
||||
- Xbox 360 XDK, Altivec-128 (VMX128) extensions. Functionally equivalent to HLSL's `.xyzw`-suffix swizzle on `float4`.
|
||||
- [IBM AltiVec Technology Programmer's Interface Manual, Chapter 6 — Permute and Formatting](https://www.nxp.com/docs/en/reference-manual/ALTIVECPIM.pdf) for the base permute semantics.
|
||||
185
migration/project-root/ppc-manual/vmx128/vpkd3d128.md
Normal file
@@ -0,0 +1,185 @@
|
||||
# `vpkd3d128` — Vector128 Pack D3Dtype, Rotate Left Immediate and Mask Insert
|
||||
|
||||
> **Category:** [VMX128](../categories/vmx128.md) · **Form:** [VX128_4](../forms/VX128_4.md) · **Opcode:** `0x18000610`
|
||||
|
||||
<!-- GENERATED: BEGIN -->
|
||||
|
||||
## Assembler Mnemonics
|
||||
|
||||
| Mnemonic | XML entry | Flags | Description |
|
||||
| --- | --- | --- | --- |
|
||||
| `vpkd3d128` | `vpkd3d128` | — | Vector128 Pack D3Dtype, Rotate Left Immediate and Mask Insert |
|
||||
|
||||
## Syntax
|
||||
|
||||
```asm
|
||||
(no disassembly template)
|
||||
```
|
||||
|
||||
## Encoding
|
||||
|
||||
### `vpkd3d128` — form `VX128_4`
|
||||
|
||||
- **Opcode word:** `0x18000610`
|
||||
- **Primary opcode (bits 0–5):** `6`
|
||||
- **Extended opcode:** `1552`
|
||||
- **Synchronising:** no
|
||||
|
||||
| Bits | Field | Meaning |
|
||||
| --- | --- | --- |
|
||||
| 0–5 | `OPCD` | primary opcode (6) |
|
||||
| 6–10 | `VD128l` | destination low 5 bits |
|
||||
| 11–15 | `IMM` | 5-bit immediate |
|
||||
| 16–20 | `VB128l` | source B low 5 bits |
|
||||
| 21–23 | `XO` | extended opcode |
|
||||
| 24–25 | `z` | sub-operation selector |
|
||||
| 28–29 | `VD128h` | destination high 2 bits |
|
||||
| 30–31 | `VB128h` | source B high 2 bits |
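
How the `IMM` and `z` fields split into the three control values used by the interpreter (pack type, pack mode, shift) can be sketched as below. Bit positions come from the table above (IBM numbering, bit 0 = most significant); the split `type = IMM >> 2`, `pack = IMM & 3`, `shift = z` follows the interpreter snapshot. The helper names are hypothetical, and it is an assumption that `extract_vx128_uimm5` reads exactly the `IMM` field.

```rust
/// Extract `width` bits whose most significant bit sits at IBM (MSB-0)
/// bit position `start`. Intended for widths below 32.
fn field(word: u32, start: u32, width: u32) -> u32 {
    (word >> (32 - start - width)) & ((1u32 << width) - 1)
}

/// Split a vpkd3d128 word into `(pack_type, pack, shift)`.
fn decode_vpkd3d128(word: u32) -> (u32, u32, u32) {
    let uimm = field(word, 11, 5); // 5-bit IMM
    let shift = field(word, 24, 2); // z sub-field
    (uimm >> 2, uimm & 3, shift)
}
```

For instance, an `IMM` of `0b10110` with `z = 3` decodes to pack type 5, pack mode 2, shift 3.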
|
||||
|
||||
## Operands
|
||||
|
||||
| Field | Role | Description |
|
||||
| --- | --- | --- |
|
||||
| `VB` | vpkd3d128: read | Source B vector register. |
|
||||
| `VD` | vpkd3d128: write | Destination vector register. |
|
||||
|
||||
## Register Effects
|
||||
|
||||
### `vpkd3d128`
|
||||
|
||||
- **Reads (always):** `VB`
|
||||
- **Reads (conditional):** _none_
|
||||
- **Writes (always):** `VD`
|
||||
- **Writes (conditional):** _none_
|
||||
|
||||
## Status-Register Effects
|
||||
|
||||
_No condition-register or status-register effects._
|
||||
|
||||
## Operation (pseudocode)
|
||||
|
||||
```
; Pseudocode derived from the xenia-rs interpreter arm
; (see Implementation References).
type  <- IMM >> 2                ; D3D pack type
pack  <- IMM & 3                 ; 0 = write the packed value as-is
shift <- z
out   <- pack_<type>((VB))       ; unhandled types pass (VB) through
if pack = 0 then
    (VD) <- out
else
    ; word-wise merge: entry i of PERM[pack - 1][shift] picks a lane
    ; from either the previous (VD) or out
    do i = 0 to 3
        (VD)[i] <- selected word
    end
```
|
||||
|
||||
## C Translation Example
|
||||
|
||||
```c
/* C translation: the xenia-rs interpreter arm below in           */
/* Implementation References is the authoritative semantic        */
/* snapshot. Translate it line-by-line:                           */
/*   - ctx.gpr[N] -> r[N] (or f[]/v[] for FPRs/VRs)               */
/*   - mem.read_uN / mem.write_uN -> mem_read_uN_be / mem_write_uN_be */
/*   - ctx.update_cr_signed(fld, v) -> update_cr_signed(fld, v)   */
/*   - ctx.xer_ca / xer_ov / xer_so -> xer.CA / xer.OV / xer.SO   */
/* The Register Effects and Status-Register Effects tables above  */
/* enumerate every side effect a faithful translation must emit.  */
```
|
||||
|
||||
## Implementation References
|
||||
|
||||
**`vpkd3d128`**
|
||||
- xenia-canary XML: [`tools/ppc-instructions.xml` — search for `mnem="vpkd3d128"`](../../xenia-canary/tools/ppc-instructions.xml)
|
||||
- xenia-canary emit: [`src/xenia/cpu/ppc/ppc_emit_altivec.cc:2088`](../../xenia-canary/src/xenia/cpu/ppc/ppc_emit_altivec.cc#L2088)
|
||||
- xenia-rs opcode: [`crates/xenia-cpu/src/opcode.rs:112`](../../xenia-rs/crates/xenia-cpu/src/opcode.rs#L112)
|
||||
- xenia-rs decoder: [`crates/xenia-cpu/src/decoder.rs:648`](../../xenia-rs/crates/xenia-cpu/src/decoder.rs#L648)
|
||||
- xenia-rs interpreter: [`crates/xenia-cpu/src/interpreter.rs:4191-4248`](../../xenia-rs/crates/xenia-cpu/src/interpreter.rs#L4191-L4248)
|
||||
<details><summary>xenia-rs interpreter body (frozen snapshot)</summary>

```rust
PpcOpcode::vpkd3d128 => {
    use crate::vmx::D3dPackType;
    let uimm = crate::decoder::extract_vx128_uimm5(instr.raw);
    let pack = (uimm & 3) as usize;
    let shift = instr.vx128_4_z() as usize;
    let ty = D3dPackType::from_immediate(uimm >> 2);
    let src = ctx.vr[instr.vb128()];
    let out = match ty {
        D3dPackType::D3dColor => crate::vmx::pack_d3dcolor(src),
        D3dPackType::NormShort2 => crate::vmx::pack_normshort2(src),
        D3dPackType::NormPacked32 => crate::vmx::pack_normpacked32(src),
        D3dPackType::Float16_2 => crate::vmx::pack_float16_2(src),
        D3dPackType::NormShort4 => crate::vmx::pack_normshort4(src),
        D3dPackType::Float16_4 => crate::vmx::pack_float16_4(src),
        D3dPackType::NormPacked64 => crate::vmx::pack_normpacked64(src),
        D3dPackType::Other(t) => {
            tracing::warn!(
                raw = format_args!("{:#010x}", instr.raw),
                uimm,
                ty = t,
                "vpkd3d128: unhandled pack type at {:#010x}",
                ctx.pc,
            );
            src
        }
    };
    // Post-pack permutation: merge packed `out` into previous `vd`
    // per canary ppc_emit_altivec.cc:2126-2188 MakePermuteMask tables.
    // MakePermuteMask(r0,l0, r1,l1, r2,l2, r3,l3): result[i] = if ri==0 { prev[li] } else { out[li] }
    let result = if pack == 0 {
        out
    } else {
        // (source_reg, lane): 0=prev vd, 1=packed out
        const PERM: [[[(u8, u8); 4]; 4]; 3] = [
            // pack=1 (VPACK_32): places out[3] at lane (3-shift)
            [[(0,0),(0,1),(0,2),(1,3)], [(0,0),(0,1),(1,3),(0,3)],
             [(0,0),(1,3),(0,2),(0,3)], [(1,3),(0,1),(0,2),(0,3)]],
            // pack=2 (64-bit): places out[2..3] at lanes (2-shift)..(3-shift)
            [[(0,0),(0,1),(1,2),(1,3)], [(0,0),(1,2),(1,3),(0,3)],
             [(1,2),(1,3),(0,2),(0,3)], [(1,3),(0,1),(0,2),(0,3)]],
            // pack=3 (64-bit): same as pack=2 except shift=3 selects out[2] at lane 3
            [[(0,0),(0,1),(1,2),(1,3)], [(0,0),(1,2),(1,3),(0,3)],
             [(1,2),(1,3),(0,2),(0,3)], [(0,0),(0,1),(0,2),(1,2)]],
        ];
        let prev = ctx.vr[instr.vd128()];
        let pw = prev.as_u32x4();
        let ow = out.as_u32x4();
        let sel = PERM[pack - 1][shift];
        xenia_types::Vec128::from_u32x4_array([
            if sel[0].0 == 0 { pw[sel[0].1 as usize] } else { ow[sel[0].1 as usize] },
            if sel[1].0 == 0 { pw[sel[1].1 as usize] } else { ow[sel[1].1 as usize] },
            if sel[2].0 == 0 { pw[sel[2].1 as usize] } else { ow[sel[2].1 as usize] },
            if sel[3].0 == 0 { pw[sel[3].1 as usize] } else { ow[sel[3].1 as usize] },
        ])
    };
    ctx.vr[instr.vd128()] = result;
    ctx.pc += 4;
}
```
</details>

<!-- GENERATED: END -->

## Special Cases & Edge Conditions

- **Pack float lanes into a D3D-format value.** In the frozen snapshot the 5-bit `IMM` field carries two sub-fields: the upper three bits (`IMM >> 2`) choose *which* D3D format to emit, and the lower two bits (`IMM & 3`) choose the pack mode. The 2-bit `z` field supplies the lane shift for the merge step.
  - `D3dColor`: packs 4×float `[0.0, 1.0]` lanes into a 32-bit A8R8G8B8 word (A in the high byte, B in the low byte), the canonical Direct3D 9 `D3DCOLOR` format. Xenia's helper is `vmx::pack_d3dcolor`.
  - `NormShort2`, `NormPacked32`, `Float16_2`, `NormShort4`, `Float16_4` and `NormPacked64` are dispatched to the matching `vmx::pack_*` helpers. For any other type code the interpreter logs a warning and passes `VB` through unchanged.
- **Also performs rotate-left-immediate and mask-insert.** The mnemonic is "Pack D3Dtype, Rotate Left Immediate and Mask Insert": the result of the pack step is merged into the existing `VD`. In the snapshot, pack mode 0 overwrites `VD` wholesale, while modes 1–3 place the packed 32-bit word (mode 1) or 64-bit pair (modes 2 and 3) into the word lanes selected by `z` and keep the remaining lanes of the old `VD`, following canary's `MakePermuteMask` tables. `VD` is therefore read as well as written whenever the pack mode is non-zero.
- **Format selection is 3 bits wide.** `IMM >> 2` yields eight possible type codes, seven of which the snapshot names; the practical set used by Xenon games is small (D3DCOLOR is the dominant one).
- **No saturation signal.** The packer saturates floats beyond `[0.0, 1.0]` silently; `VSCR[SAT]` is not touched.
- **VMX128 register-fusion** on `VD` and `VB`.
- **No IBM AIX entry** — Xenon-only.
- **No `Rc`, no XER.**
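
The immediate split and the post-pack merge can be sketched as standalone Rust. This is only an illustration derived from the frozen snapshot's `PERM` table: `split_imm` and `merge_packed` are hypothetical names, vectors are modelled as plain `[u32; 4]` arrays instead of `Vec128`, and the table is condensed into per-mode rules rather than copied.

```rust
/// Split the 5-bit IMM into (format type, pack mode), mirroring
/// `uimm >> 2` and `uimm & 3` in the frozen snapshot.
pub fn split_imm(uimm: u32) -> (u32, u32) {
    ((uimm >> 2) & 0x7, uimm & 0x3)
}

/// Merge the packed vector `out` into the previous `VD` value `prev`.
/// Hypothetical helper: it restates the snapshot's PERM table as rules.
pub fn merge_packed(prev: [u32; 4], out: [u32; 4], pack: u32, shift: usize) -> [u32; 4] {
    let shift = shift & 3; // `z` is a 2-bit field
    let mut r = prev;
    match pack & 3 {
        // Mode 0: no merge, the packed vector replaces VD entirely.
        0 => return out,
        // Mode 1: the 32-bit result out[3] lands at lane (3 - shift).
        1 => r[3 - shift] = out[3],
        // Mode 2: out[2..=3] land at lanes (2 - shift)..=(3 - shift);
        // with shift = 3 only out[3] still fits, at lane 0.
        2 => {
            if shift < 3 {
                r[2 - shift] = out[2];
            }
            r[3 - shift] = out[3];
        }
        // Mode 3: as mode 2, except shift = 3 puts out[2] at lane 3.
        _ => {
            if shift < 3 {
                r[2 - shift] = out[2];
                r[3 - shift] = out[3];
            } else {
                r[3] = out[2];
            }
        }
    }
    r
}
```

For example, `merge_packed([0, 1, 2, 3], [10, 11, 12, 13], 1, 1)` keeps lanes 0, 1 and 3 of the old `VD` and writes `out[3]` into lane 2.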

## Related Instructions

- [`vupkd3d128`](vupkd3d128.md) — the inverse (unpack a D3D-format word back into 4 floats).
- [`vpkpx`](../vmx/vpkpx.md) — the standard AltiVec 1-5-5-5 pixel pack.
- [`vpkshus`](../vmx/vpkshus.md), [`vpkuhus`](../vmx/vpkuhus.md) — byte-range saturating packs (an alternative colour-packing path).
- [`vcfpsxws128`](vcfpsxws128.md), [`vcfpuxws128`](vcfpuxws128.md) — conversion with explicit scale; software sometimes pre-scales floats to `[0, 255]` before using these in place of `vpkd3d128`.

## IBM Reference

- No IBM AIX entry — Xbox 360 VMX128 extension only. The "D3D" in the mnemonic refers directly to Direct3D 9 vertex/pixel formats (the `D3DDECLTYPE_*` enumeration).
- Xbox 360 XDK, Altivec-128 (VMX128) extensions.
- Microsoft D3D9 documentation: `D3DDECLTYPE_D3DCOLOR`, `D3DDECLTYPE_UBYTE4N`, etc.
141
migration/project-root/ppc-manual/vmx128/vrlimi128.md
Normal file
@@ -0,0 +1,141 @@
# `vrlimi128` — Vector128 Rotate Left Immediate and Mask Insert

> **Category:** [VMX128](../categories/vmx128.md) · **Form:** [VX128_4](../forms/VX128_4.md) · **Opcode:** `0x18000710`

<!-- GENERATED: BEGIN -->

## Assembler Mnemonics

| Mnemonic | XML entry | Flags | Description |
| --- | --- | --- | --- |
| `vrlimi128` | `vrlimi128` | — | Vector128 Rotate Left Immediate and Mask Insert |

## Syntax

```asm
vrlimi128 [VD], [VB], [IMM], [z]
```

## Encoding

### `vrlimi128` — form `VX128_4`

- **Opcode word:** `0x18000710`
- **Primary opcode (bits 0–5):** `6`
- **Extended opcode:** `1808`
- **Synchronising:** no

| Bits | Field | Meaning |
| --- | --- | --- |
| 0–5 | `OPCD` | primary opcode (6) |
| 6–10 | `VD128l` | destination low 5 bits |
| 11–15 | `IMM` | 5-bit immediate |
| 16–20 | `VB128l` | source B low 5 bits |
| 21–23 | `XO` | extended opcode |
| 24–25 | `z` | sub-operation selector |
| 26–27 | `XO` | extended opcode, continued (bit 27 is set in `0x18000710`) |
| 28–29 | `VD128h` | destination high 2 bits |
| 30–31 | `VB128h` | source B high 2 bits |
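
The bit ranges in the table use PowerPC numbering (bit 0 is the most significant bit), so field extraction is easy to get backwards. The following is a minimal sketch of a decoder for this layout. `Vx128Fields`, `bits` and `decode_vx128_4` are hypothetical names, not the xenia-rs decoder API, and the register fusion shown (high 2 bits placed above the low 5 bits) is an assumption based on the field names.

```rust
/// Decoded operand fields of a VX128_4 word (hypothetical struct).
#[derive(Debug, PartialEq, Eq)]
pub struct Vx128Fields {
    pub opcd: u32,
    pub vd: u32,  // fused 7-bit destination register number
    pub vb: u32,  // fused 7-bit source B register number
    pub imm: u32, // 5-bit immediate
    pub z: u32,   // 2-bit sub-operation selector
}

/// Extract PowerPC bits `first..=last`, where bit 0 is the MSB of the word.
fn bits(word: u32, first: u32, last: u32) -> u32 {
    let width = last - first + 1;
    (word >> (31 - last)) & ((1u32 << width) - 1)
}

/// Decode the fields listed in the encoding table.
pub fn decode_vx128_4(word: u32) -> Vx128Fields {
    Vx128Fields {
        opcd: bits(word, 0, 5),
        // Assumed fusion: low 5 bits | (high 2 bits << 5).
        vd: bits(word, 6, 10) | (bits(word, 28, 29) << 5),
        vb: bits(word, 16, 20) | (bits(word, 30, 31) << 5),
        imm: bits(word, 11, 15),
        z: bits(word, 24, 25),
    }
}
```

Under that fusion assumption, a word built from the base opcode `0x18000710` with `VD128l = 5`, `VD128h = 1` decodes to destination register 37.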

## Operands

| Field | Role | Description |
| --- | --- | --- |
| `VB` | vrlimi128: read | Source B vector register. |
| `VD` | vrlimi128: read, write | Destination vector register; its old value supplies every lane whose mask bit is 0. |

## Register Effects

### `vrlimi128`

- **Reads (always):** `VB`, `VD`
- **Reads (conditional):** _none_
- **Writes (always):** `VD`
- **Writes (conditional):** _none_

## Status-Register Effects

_No condition-register or status-register effects._

## Operation (pseudocode)

```
; Pseudocode derives directly from the xenia-rs interpreter
; arm (see Implementation References). Operation semantics:
; - Read source operands from the fields listed under Operands.
; - Apply the arithmetic / logical / memory action described
; in the Description field above.
; - Write results to the destination register(s); update any
; status bits enumerated under Status-Register Effects.
; Consult the IBM AIX reference link under IBM Reference for
; canonical PPC-style pseudocode where xenia's expression is
; terse.
```

## C Translation Example

```c
/* C translation: the xenia-rs interpreter arm below in */
/* Implementation References is the authoritative semantic */
/* snapshot. Translate it line-by-line: */
/* - ctx.gpr[N] -> r[N] (or f[]/v[] for FPRs/VRs) */
/* - mem.read_uN / mem.write_uN -> mem_read_uN_be / mem_write_uN_be (N = width in bits) */
/* - ctx.update_cr_signed(fld, v) -> update_cr_signed(fld, v) */
/* - ctx.xer_ca / xer_ov / xer_so -> xer.CA / xer.OV / xer.SO */
/* The Register Effects and Status-Register Effects tables above */
/* enumerate every side effect a faithful translation must emit. */
```

## Implementation References

**`vrlimi128`**
- xenia-canary XML: [`tools/ppc-instructions.xml` — search for `mnem="vrlimi128"`](../../xenia-canary/tools/ppc-instructions.xml)
- xenia-canary emit: [`src/xenia/cpu/ppc/ppc_emit_altivec.cc:1315`](../../xenia-canary/src/xenia/cpu/ppc/ppc_emit_altivec.cc#L1315)
- xenia-rs opcode: [`crates/xenia-cpu/src/opcode.rs:119`](../../xenia-rs/crates/xenia-cpu/src/opcode.rs#L119)
- xenia-rs decoder: [`crates/xenia-cpu/src/decoder.rs:649`](../../xenia-rs/crates/xenia-cpu/src/decoder.rs#L649)
- xenia-rs interpreter: [`crates/xenia-cpu/src/interpreter.rs:3962-3977`](../../xenia-rs/crates/xenia-cpu/src/interpreter.rs#L3962-L3977)
<details><summary>xenia-rs interpreter body (frozen snapshot)</summary>

```rust
PpcOpcode::vrlimi128 => {
    let shift = instr.vx128_4_z() as usize;
    let mask = instr.vx128_4_imm();
    let b = ctx.vr[instr.vb128()].as_u32x4();
    let d = ctx.vr[instr.vd128()].as_u32x4();
    let rot = [b[shift % 4], b[(shift + 1) % 4], b[(shift + 2) % 4], b[(shift + 3) % 4]];
    let mut r = [0u32; 4];
    for i in 0..4 {
        // mask bit 3 corresponds to word 0 (BE-first). Use rot when
        // the corresponding mask bit is set.
        let use_rot = (mask >> (3 - i)) & 1 == 1;
        r[i] = if use_rot { rot[i] } else { d[i] };
    }
    ctx.vr[instr.vd128()] = xenia_types::Vec128::from_u32x4_array(r);
    ctx.pc += 4;
}
```
</details>

<!-- GENERATED: END -->

## Special Cases & Edge Conditions

- **Rotate-left-word + mask-insert in one step.** `VB` is rotated left by `z` word positions (word-granular, 0..3, not bits). The rotated vector is then merged into the pre-existing `VD` under control of a 4-bit insert mask taken from the low four bits of `IMM`: a mask bit of 1 takes the lane from the rotated `VB`; a mask bit of 0 keeps the lane from the old `VD`. This matches the frozen snapshot, which reads the rotate count with `vx128_4_z()` and the mask with `vx128_4_imm()`.
- **Destructive destination.** `VD` is both source and destination — software must preserve its value or pre-initialise it.
- **Typical use: selective-lane overwrite.** Games use this to "rewrite lane `n` of a vector with a shuffled component" without a full permute. A common pattern is "insert a scalar into lane `i` of a vector" where the scalar has been pre-loaded to a known word of `VB`.
- **Mask bit ↔ lane mapping.** Big-endian: mask bit 3 (MSB of the 4-bit mask) controls lane 0; bit 0 controls lane 3. (In xenia: `use_rot = (mask >> (3 - i)) & 1`.) The snapshot never tests bit 4 of `IMM`.
- **VMX128 register-fusion** on `VD` and `VB`.
- **No IBM AIX entry** — Xenon-only.
- **No `Rc`, no XER, no VSCR.**
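
The rotate-and-insert behaviour above can be restated as a small standalone function. This is a sketch modelled on the frozen snapshot, not the interpreter's own code: `vrlimi128_sketch` is a hypothetical name and vectors are plain `[u32; 4]` arrays with lane 0 first.

```rust
/// Rotate `vb` left by `z` word lanes, then insert the rotated lanes into
/// `vd` wherever the corresponding bit of the 4-bit mask in `imm` is set.
pub fn vrlimi128_sketch(vd: [u32; 4], vb: [u32; 4], imm: u32, z: u32) -> [u32; 4] {
    let shift = (z & 3) as usize; // word-granular rotate count, 0..3
    let mask = imm & 0xF;         // only the low four bits act as the mask
    let mut out = vd;
    for i in 0..4 {
        // Mask bit 3 controls lane 0, mask bit 0 controls lane 3.
        if (mask >> (3 - i)) & 1 == 1 {
            out[i] = vb[(i + shift) % 4];
        }
    }
    out
}
```

For the scalar-insert pattern, `vrlimi128_sketch(vd, vb, 0b0001, 1)` overwrites only lane 3 of `vd`, with word 0 of `vb`.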

## Related Instructions

- [`vrlw`](../vmx/vrlw.md), [`vrlw128`](../vmx/vrlw.md) — per-lane bit-level rotate: bits rotate within each 32-bit word and no data moves between lanes, whereas `vrlimi128` moves whole words across lanes.
- [`vpermwi128`](vpermwi128.md) — immediate 4-way word permute (no merge).
- [`vsel`](../vmx/vsel.md), [`vsel128`](../vmx/vsel.md) — general bit-select; `vrlimi128` is the specialised "rotate + insert" equivalent.
- [`vsldoi`](../vmx/vsldoi.md) — byte-level immediate shift.

## IBM Reference

- No IBM AIX entry — this instruction is exclusive to the Xbox 360's VMX128 extension. The mnemonic is an adaptation of the scalar `rlwimi` (rotate-left-word-immediate-mask-insert) pattern for vectors.
- Xbox 360 XDK, Altivec-128 (VMX128) extensions.
- [IBM AltiVec Technology Programmer's Interface Manual, Chapter 4 — Integer Shift / Rotate](https://www.nxp.com/docs/en/reference-manual/ALTIVECPIM.pdf) for the base rotate semantics.
154
migration/project-root/ppc-manual/vmx128/vupkd3d128.md
Normal file
@@ -0,0 +1,154 @@
# `vupkd3d128` — Vector128 Unpack D3Dtype

> **Category:** [VMX128](../categories/vmx128.md) · **Form:** [VX128_3](../forms/VX128_3.md) · **Opcode:** `0x180007f0`

<!-- GENERATED: BEGIN -->

## Assembler Mnemonics

| Mnemonic | XML entry | Flags | Description |
| --- | --- | --- | --- |
| `vupkd3d128` | `vupkd3d128` | — | Vector128 Unpack D3Dtype |

## Syntax

```asm
(no disassembly template)
```

## Encoding

### `vupkd3d128` — form `VX128_3`

- **Opcode word:** `0x180007f0`
- **Primary opcode (bits 0–5):** `6`
- **Extended opcode:** `2032`
- **Synchronising:** no

| Bits | Field | Meaning |
| --- | --- | --- |
| 0–5 | `OPCD` | primary opcode (6) |
| 6–10 | `VD128l` | destination low 5 bits |
| 11–15 | `IMM` | 5-bit immediate |
| 16–20 | `VB128l` | source B low 5 bits |
| 21–27 | `XO` | extended opcode |
| 28–29 | `VD128h` | destination high 2 bits |
| 30–31 | `VB128h` | source B high 2 bits |

## Operands

| Field | Role | Description |
| --- | --- | --- |
| `VB` | vupkd3d128: read | Source B vector register. |
| `VD` | vupkd3d128: write | Destination vector register. |

## Register Effects

### `vupkd3d128`

- **Reads (always):** `VB`
- **Reads (conditional):** _none_
- **Writes (always):** `VD`
- **Writes (conditional):** _none_

## Status-Register Effects

_No condition-register or status-register effects._

## Operation (pseudocode)

```
; Pseudocode derives directly from the xenia-rs interpreter
; arm (see Implementation References). Operation semantics:
; - Read source operands from the fields listed under Operands.
; - Apply the arithmetic / logical / memory action described
; in the Description field above.
; - Write results to the destination register(s); update any
; status bits enumerated under Status-Register Effects.
; Consult the IBM AIX reference link under IBM Reference for
; canonical PPC-style pseudocode where xenia's expression is
; terse.
```

## C Translation Example

```c
/* C translation: the xenia-rs interpreter arm below in */
/* Implementation References is the authoritative semantic */
/* snapshot. Translate it line-by-line: */
/* - ctx.gpr[N] -> r[N] (or f[]/v[] for FPRs/VRs) */
/* - mem.read_uN / mem.write_uN -> mem_read_uN_be / mem_write_uN_be (N = width in bits) */
/* - ctx.update_cr_signed(fld, v) -> update_cr_signed(fld, v) */
/* - ctx.xer_ca / xer_ov / xer_so -> xer.CA / xer.OV / xer.SO */
/* The Register Effects and Status-Register Effects tables above */
/* enumerate every side effect a faithful translation must emit. */
```

## Implementation References

**`vupkd3d128`**
- xenia-canary XML: [`tools/ppc-instructions.xml` — search for `mnem="vupkd3d128"`](../../xenia-canary/tools/ppc-instructions.xml)
- xenia-canary emit: [`src/xenia/cpu/ppc/ppc_emit_altivec.cc:2194`](../../xenia-canary/src/xenia/cpu/ppc/ppc_emit_altivec.cc#L2194)
- xenia-rs opcode: [`crates/xenia-cpu/src/opcode.rs:128`](../../xenia-rs/crates/xenia-cpu/src/opcode.rs#L128)
- xenia-rs decoder: [`crates/xenia-cpu/src/decoder.rs:670`](../../xenia-rs/crates/xenia-cpu/src/decoder.rs#L670)
- xenia-rs interpreter: [`crates/xenia-cpu/src/interpreter.rs:4249-4275`](../../xenia-rs/crates/xenia-cpu/src/interpreter.rs#L4249-L4275)
<details><summary>xenia-rs interpreter body (frozen snapshot)</summary>

```rust
PpcOpcode::vupkd3d128 => {
    use crate::vmx::D3dPackType;
    let uimm = crate::decoder::extract_vx128_uimm5(instr.raw);
    let ty = D3dPackType::from_immediate(uimm >> 2);
    let src = ctx.vr[instr.vb128()];
    let out = match ty {
        D3dPackType::D3dColor => crate::vmx::unpack_d3dcolor(src),
        D3dPackType::NormShort2 => crate::vmx::unpack_normshort2(src),
        D3dPackType::NormPacked32 => crate::vmx::unpack_normpacked32(src),
        D3dPackType::Float16_2 => crate::vmx::unpack_float16_2(src),
        D3dPackType::NormShort4 => crate::vmx::unpack_normshort4(src),
        D3dPackType::Float16_4 => crate::vmx::unpack_float16_4(src),
        D3dPackType::NormPacked64 => crate::vmx::unpack_normpacked64(src),
        D3dPackType::Other(t) => {
            tracing::warn!(
                raw = format_args!("{:#010x}", instr.raw),
                uimm,
                ty = t,
                "vupkd3d128: unhandled pack type at {:#010x}",
                ctx.pc,
            );
            src
        }
    };
    ctx.vr[instr.vd128()] = out;
    ctx.pc += 4;
}
```
</details>

<!-- GENERATED: END -->

## Special Cases & Edge Conditions

- **Unpack a D3D-format value into 4 float lanes.** The upper three bits of the `IMM` field (`IMM >> 2` in the frozen snapshot) select the source format:
  - `D3dColor`: decodes a 32-bit RGBA8 (`D3DCOLOR`) into 4 float lanes in `[0.0, 1.0]`. Xenia's helper is `vmx::unpack_d3dcolor`.
  - `NormShort2`, `NormPacked32`, `Float16_2`, `NormShort4`, `Float16_4` and `NormPacked64` are dispatched to the matching `vmx::unpack_*` helpers. For any other type code the interpreter logs a warning and passes `VB` through unchanged.
- **Inverse of [`vpkd3d128`](vpkd3d128.md).** The same format code used to pack must be used to unpack.
- **Source width is a single 32-bit word** of `VB` for `D3DCOLOR` (typically lane 0; the helpers read the appropriate component). The other three input word lanes are ignored for that format.
- **IEEE-754 binary32 outputs,** already normalised to `[0.0, 1.0]` (each integer channel is converted to float and divided by 255).
- **No `VSCR[SAT]` effect**, no FPSCR, no exceptions.
- **VMX128 register-fusion** on `VD` and `VB`.
- **No IBM AIX entry** — Xenon-only.
- **No `Rc`, no XER.**
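
As an illustration of the `D3DCOLOR` case described above, here is a minimal sketch of the decode and its matching pack. It is not the body of `vmx::unpack_d3dcolor`, which this page does not show: both function names are hypothetical, the `[R, G, B, A]` lane order is an assumption, and the divide-by-255 normalisation simply follows the bullet above.

```rust
/// Decode a 32-bit D3DCOLOR word (A in the high byte, B in the low byte)
/// into four floats in [0.0, 1.0]. Lane order [R, G, B, A] is assumed.
pub fn unpack_d3dcolor_sketch(word: u32) -> [f32; 4] {
    // Convert one 8-bit channel to float, then normalise by 255.
    let chan = |shift: u32| ((word >> shift) & 0xFF) as f32 / 255.0;
    [chan(16), chan(8), chan(0), chan(24)]
}

/// Matching pack: clamp each lane to [0.0, 1.0], scale to 0..=255 and round.
pub fn pack_d3dcolor_sketch(rgba: [f32; 4]) -> u32 {
    let q = |x: f32| (x.clamp(0.0, 1.0) * 255.0).round() as u32;
    (q(rgba[3]) << 24) | (q(rgba[0]) << 16) | (q(rgba[1]) << 8) | q(rgba[2])
}
```

Because the pack step rounds to the nearest integer, packing the output of the unpack sketch returns the original word, which is the round-trip property the "inverse" bullet relies on.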

## Related Instructions

- [`vpkd3d128`](vpkd3d128.md) — the inverse pack.
- [`vupkhpx`](../vmx/vupkhpx.md), [`vupklpx`](../vmx/vupklpx.md) — standard AltiVec 1-5-5-5 pixel unpacks.
- [`vupkhsb`](../vmx/vupkhsb.md), [`vupklsb`](../vmx/vupklsb.md) — sign-extending byte→half-word unpacks (the integer analogue).
- [`vcsxwfp128`](vcsxwfp128.md), [`vcuxwfp128`](vcuxwfp128.md) — int → float with scale; sometimes used as an alternate decode path.

## IBM Reference

- No IBM AIX entry — Xbox 360 VMX128 extension only. "D3D" denotes the Direct3D 9 vertex/pixel format catalogue (`D3DDECLTYPE_*`).
- Xbox 360 XDK, Altivec-128 (VMX128) extensions.
- Microsoft D3D9 documentation: `D3DDECLTYPE_D3DCOLOR`, `D3DDECLTYPE_UBYTE4N`, etc.