# `vperm` — Vector Permute
> **Category:** [VMX (Altivec)](../categories/vmx.md) · **Form:** [VA](../forms/VA.md) · **Opcode:** `0x1000002b`
<!-- GENERATED: BEGIN -->
## Assembler Mnemonics
| Mnemonic | XML entry | Flags | Description |
| --- | --- | --- | --- |
| `vperm` | `vperm` | — | Vector Permute |
| `vperm128` | `vperm128` | — | Vector128 Permute |
## Syntax
```asm
vperm [VD], [VA], [VB], [VC]
vperm128 [VD], [VA], [VB], [VC]
```
## Encoding
### `vperm` — form `VA`
- **Opcode word:** `0x1000002b`
- **Primary opcode (bits 0-5):** `4`
- **Extended opcode:** `43`
- **Synchronising:** no

| Bits | Field | Meaning |
| --- | --- | --- |
| 0-5 | `OPCD` | primary opcode (4) |
| 6-10 | `VRT` | destination vector register |
| 11-15 | `VRA` | source A |
| 16-20 | `VRB` | source B |
| 21-25 | `VRC` | source C (permute control vector) |
| 26-31 | `XO` | extended opcode (6 bits) |
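
As a rough illustration of the field layout above, the following Rust sketch splits a VA-form instruction word into its fields. The names `VaFields`, `field`, and `decode_va` are invented for this example and are not xenia-rs APIs; bit numbers follow the table's big-endian convention, where bit 0 is the most significant bit of the word.

```rust
/// Hypothetical holder for the VA-form fields (not a xenia-rs type).
#[derive(Debug, PartialEq)]
struct VaFields {
    opcd: u32,
    vrt: u32,
    vra: u32,
    vrb: u32,
    vrc: u32,
    xo: u32,
}

/// Extract big-endian-numbered bits `first..=last` (bit 0 = MSB) from `word`.
fn field(word: u32, first: u32, last: u32) -> u32 {
    let width = last - first + 1;
    (word >> (31 - last)) & ((1u32 << width) - 1)
}

/// Split a VA-form word into the fields listed in the encoding table.
fn decode_va(word: u32) -> VaFields {
    VaFields {
        opcd: field(word, 0, 5),
        vrt: field(word, 6, 10),
        vra: field(word, 11, 15),
        vrb: field(word, 16, 20),
        vrc: field(word, 21, 25),
        xo: field(word, 26, 31),
    }
}
```

With this sketch, the word `0x1000002b | (3 << 21) | (1 << 16) | (2 << 11) | (7 << 6)` decodes to primary opcode 4, extended opcode 43, and register fields 3, 1, 2, 7, i.e. a `vperm` with `VD`=3, `VA`=1, `VB`=2, `VC`=7.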
### `vperm128` — form `VX128_2`
- **Opcode word:** `0x14000000`
- **Primary opcode (bits 0-5):** `5`
- **Extended opcode:** `0`
- **Synchronising:** no

| Bits | Field | Meaning |
| --- | --- | --- |
| 0-5 | `OPCD` | primary opcode (5) |
| 6-10 | `VD128l` | destination low 5 bits |
| 11-15 | `VA128l` | source A low 5 bits |
| 16-20 | `VB128l` | source B low 5 bits |
| 21 | `VA128H` | source A high bit |
| 23-25 | `VC` | source C 3-bit field |
| 26 | `VA128h` | source A middle bit |
| 28-29 | `VD128h` | destination high 2 bits |
| 30-31 | `VB128h` | source B high 2 bits |
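
The split register fields can be easier to follow in code. The Rust sketch below reassembles the 7-bit `VD`, `VA`, and `VB` numbers and the 3-bit `VC` field from a `vperm128` word. It assumes the pieces concatenate exactly as the table labels them (low 5 bits, then the "middle" bit at position 5, then the "high" bit at position 6); `Vperm128Operands`, `field`, and `decode_vperm128` are hypothetical names, not the xenia-rs decoder.

```rust
/// Hypothetical decoded operands for `vperm128` (not a xenia-rs type).
#[derive(Debug, PartialEq)]
struct Vperm128Operands {
    vd: u32, // 0..=127
    va: u32, // 0..=127
    vb: u32, // 0..=127
    vc: u32, // 0..=7
}

/// Extract big-endian-numbered bits `first..=last` (bit 0 = MSB) from `word`.
fn field(word: u32, first: u32, last: u32) -> u32 {
    let width = last - first + 1;
    (word >> (31 - last)) & ((1u32 << width) - 1)
}

/// Reassemble the split VX128_2 register fields, assuming the
/// low/middle/high labelling in the encoding table.
fn decode_vperm128(word: u32) -> Vperm128Operands {
    let vd = field(word, 6, 10) | (field(word, 28, 29) << 5);
    let va = field(word, 11, 15)
        | (field(word, 26, 26) << 5)  // VA128h: middle bit
        | (field(word, 21, 21) << 6); // VA128H: high bit
    let vb = field(word, 16, 20) | (field(word, 30, 31) << 5);
    let vc = field(word, 23, 25);
    Vperm128Operands { vd, va, vb, vc }
}
```

Under these assumptions, a word built with `VD128l`=6, `VD128h`=2, `VA128l`=4, `VA128h`=1, `VA128H`=1, `VB128l`=13, `VB128h`=1, and `VC`=5 decodes to `vd` 70, `va` 100, `vb` 45, `vc` 5.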
## Operands
| Field | Role | Description |
| --- | --- | --- |
| `VA` | vperm: read; vperm128: read | Source A vector register. |
| `VB` | vperm: read; vperm128: read | Source B vector register. |
| `VC` | vperm: read; vperm128: read | Source C vector register / 3-bit selector. |
| `VD` | vperm: write; vperm128: write | Destination vector register. |
## Register Effects
### `vperm`
- **Reads (always):** `VA`, `VB`, `VC`
- **Reads (conditional):** _none_
- **Writes (always):** `VD`
- **Writes (conditional):** _none_
### `vperm128`
- **Reads (always):** `VA`, `VB`, `VC`
- **Reads (conditional):** _none_
- **Writes (always):** `VD`
- **Writes (conditional):** _none_
## Status-Register Effects
_No condition-register or status-register effects._
## Operation (pseudocode)
```
; Derived from the xenia-rs interpreter arm (see Implementation References).
; vperm VD, VA, VB, VC   (vperm128 is identical once operands are decoded)
; Bytes are numbered big-endian: b[0] is the most significant byte.
src[0..31] <- VA.b[0..15] || VB.b[0..15]   ; 32-byte concatenation
for i in 0..15:
    sel     <- VC.b[i] & 0x1F               ; low 5 bits; upper 3 ignored
    VD.b[i] <- src[sel]                     ; sel < 16 reads VA, else VB
; No CR, XER, or VSCR bits are updated.
; Consult the IBM AIX reference link under IBM Reference for
; canonical PPC-style pseudocode.
```
## C Translation Example
```c
#include <stdint.h>

/* Direct C rendering of the interpreter arm under Implementation References. */
/* vec128_t is an illustrative type; b[0] is the most significant byte. */
typedef struct { uint8_t b[16]; } vec128_t;

static void vperm(vec128_t *vd, const vec128_t *va,
                  const vec128_t *vb, const vec128_t *vc) {
    vec128_t r; /* build in a temporary so vd may alias a source */
    for (int i = 0; i < 16; i++) {
        uint8_t sel = vc->b[i] & 0x1F; /* low 5 bits; upper 3 ignored */
        r.b[i] = (sel < 16) ? va->b[sel] : vb->b[sel - 16];
    }
    *vd = r; /* no CR, XER, or VSCR side effects */
}
/* The interpreter additionally advances the program counter by 4. */
```
## Implementation References
**`vperm`**
- xenia-canary XML: [`tools/ppc-instructions.xml` — search for `mnem="vperm"`](../../xenia-canary/tools/ppc-instructions.xml)
- xenia-canary emit: [`src/xenia/cpu/ppc/ppc_emit_altivec.cc:1199`](../../xenia-canary/src/xenia/cpu/ppc/ppc_emit_altivec.cc#L1199)
- xenia-rs opcode: [`crates/xenia-cpu/src/opcode.rs:112`](../../xenia-rs/crates/xenia-cpu/src/opcode.rs#L112)
- xenia-rs decoder: [`crates/xenia-cpu/src/decoder.rs:586`](../../xenia-rs/crates/xenia-cpu/src/decoder.rs#L586)
- xenia-rs interpreter: [`crates/xenia-cpu/src/interpreter.rs:2278-2302`](../../xenia-rs/crates/xenia-cpu/src/interpreter.rs#L2278-L2302)
<details><summary>xenia-rs interpreter body (frozen snapshot)</summary>
```rust
PpcOpcode::vperm | PpcOpcode::vperm128 => {
    let (va, vb, vd);
    let vc;
    if matches!(instr.opcode, PpcOpcode::vperm128) {
        va = instr.va128();
        vb = instr.vb128();
        vd = instr.vd128();
        vc = instr.vc128_2();
    } else {
        va = instr.ra();
        vb = instr.rb();
        vd = instr.rd();
        vc = instr.rc();
    }
    let a_bytes = ctx.vr[va].as_bytes();
    let b_bytes = ctx.vr[vb].as_bytes();
    let c_bytes = ctx.vr[vc].as_bytes();
    let mut r = [0u8; 16];
    for i in 0..16 {
        let idx = (c_bytes[i] & 0x1F) as usize;
        r[i] = if idx < 16 { a_bytes[idx] } else { b_bytes[idx - 16] };
    }
    ctx.vr[vd] = xenia_types::Vec128::from_bytes(r);
    ctx.pc += 4;
}
```
</details>
**`vperm128`**
- xenia-canary XML: [`tools/ppc-instructions.xml` — search for `mnem="vperm128"`](../../xenia-canary/tools/ppc-instructions.xml)
- xenia-canary emit: [`src/xenia/cpu/ppc/ppc_emit_altivec.cc:1202`](../../xenia-canary/src/xenia/cpu/ppc/ppc_emit_altivec.cc#L1202)
- xenia-rs opcode: [`crates/xenia-cpu/src/opcode.rs:112`](../../xenia-rs/crates/xenia-cpu/src/opcode.rs#L112)
- xenia-rs decoder: [`crates/xenia-cpu/src/decoder.rs:605`](../../xenia-rs/crates/xenia-cpu/src/decoder.rs#L605)
- xenia-rs interpreter: [`crates/xenia-cpu/src/interpreter.rs:2278-2302`](../../xenia-rs/crates/xenia-cpu/src/interpreter.rs#L2278-L2302)
<details><summary>xenia-rs interpreter body (frozen snapshot)</summary>
```rust
PpcOpcode::vperm | PpcOpcode::vperm128 => {
    let (va, vb, vd);
    let vc;
    if matches!(instr.opcode, PpcOpcode::vperm128) {
        va = instr.va128();
        vb = instr.vb128();
        vd = instr.vd128();
        vc = instr.vc128_2();
    } else {
        va = instr.ra();
        vb = instr.rb();
        vd = instr.rd();
        vc = instr.rc();
    }
    let a_bytes = ctx.vr[va].as_bytes();
    let b_bytes = ctx.vr[vb].as_bytes();
    let c_bytes = ctx.vr[vc].as_bytes();
    let mut r = [0u8; 16];
    for i in 0..16 {
        let idx = (c_bytes[i] & 0x1F) as usize;
        r[i] = if idx < 16 { a_bytes[idx] } else { b_bytes[idx - 16] };
    }
    ctx.vr[vd] = xenia_types::Vec128::from_bytes(r);
    ctx.pc += 4;
}
```
</details>
<!-- GENERATED: END -->
## Special Cases & Edge Conditions
- **Per-byte selector drives a cross-vector permute.** Each byte of `VC` supplies a 5-bit selector (the low 5 bits; the upper 3 bits are ignored). The top bit of that 5-bit field (value 0x10, which is bit 3 of the byte in big-endian bit numbering) chooses the source: 0 selects `VA`, 1 selects `VB`. The low 4 bits index a byte within the chosen 16-byte operand.
- **`vperm` is the universal "16-byte reshuffle" primitive.** It can express any byte-level selection from the 32 source bytes (`VA ‖ VB`) into the 16 destination bytes, including duplicates and drops.
- **Big-endian byte indexing.** `VC.b[0]` controls `VD.b[0]` (the most significant byte). Selector value 0 picks `VA.b[0]`, value 15 picks `VA.b[15]`, value 16 picks `VB.b[0]`, value 31 picks `VB.b[15]`.
- **Upper 3 bits of each `VC` byte are ignored.** Only bits 3..7 in big-endian bit numbering (the low 5) are consulted, so values like 0x1F and 0x5F both mean "byte 15 of `VB`". Software can use those upper bits for its own tagging.
- **Pair with [`lvsl`](lvsl.md) / [`lvsr`](lvsr.md) for unaligned 16-byte loads.** `lvsl` produces the selector that shifts "left" by `EA & 0xF` bytes; feeding that into `vperm` with two aligned `lvx` results yields the unaligned 16-byte view.
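- **Aliasing legal.** `VD` may equal `VA`, `VB`, or `VC`; all three sources are read before the result is written (the interpreter builds the result in a temporary array).
- **VMX128 sibling [`vperm128`](vperm128.md).** Same operation, with 7-bit register numbers for `VD`, `VA`, and `VB`. The VMX128 encoding carries `VC` in the 3-bit `VC` sub-field of the `VX128_2` form, which only lets `VC` select one of **8** registers, not 128. In the xenia-rs interpreter this field is read with `instr.vc128_2()`.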
- **No flags, no VSCR side-effect.**
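
The selector rules and the `lvsl` pairing described above can be sketched in a few lines of Rust. This is a simplified model, not xenia-rs code: `V128`, `vperm`, `lvsl`, `lvx`, and `load_unaligned` are illustrative names, guest memory is assumed to be a plain byte slice, and the second aligned load is taken at `ea + 15` so that an already aligned address reads the same block twice instead of reading past it.

```rust
/// Illustrative 16-byte vector; index 0 is the most significant byte.
type V128 = [u8; 16];

/// Byte-wise model of `vperm`: the low 5 bits of each control byte
/// index into the 32-byte concatenation of `va` and `vb`.
fn vperm(va: &V128, vb: &V128, vc: &V128) -> V128 {
    let mut r = [0u8; 16];
    for i in 0..16 {
        let sel = (vc[i] & 0x1F) as usize; // upper 3 bits ignored
        r[i] = if sel < 16 { va[sel] } else { vb[sel - 16] };
    }
    r
}

/// Model of the `lvsl` result: bytes sh, sh+1, ..., sh+15 with sh = EA & 0xF.
fn lvsl(ea: usize) -> V128 {
    let sh = (ea & 0xF) as u8;
    let mut m = [0u8; 16];
    for i in 0..16 {
        m[i] = sh + i as u8;
    }
    m
}

/// Model of `lvx`: load the aligned 16-byte block that contains `ea`.
fn lvx(mem: &[u8], ea: usize) -> V128 {
    let base = ea & !0xF;
    let mut v = [0u8; 16];
    v.copy_from_slice(&mem[base..base + 16]);
    v
}

/// Unaligned 16-byte load: two aligned loads combined by a permute.
fn load_unaligned(mem: &[u8], ea: usize) -> V128 {
    let first = lvx(mem, ea);
    let second = lvx(mem, ea + 15);
    vperm(&first, &second, &lvsl(ea))
}
```

For a buffer whose byte at offset `n` holds the value `n`, `load_unaligned(&mem, 5)` returns the bytes 5 through 20. Passing a control vector of all 0x1F or all 0x5F to `vperm` gives the same result, since only the low 5 bits are consulted.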
## Related Instructions
- [`vsldoi`](vsldoi.md) — static-shift-by-`SHB` form; when the shift is a compile-time constant this is cheaper than `lvsl`+`vperm`.
- [`lvsl`](lvsl.md), [`lvsr`](lvsr.md) — generate the permute mask from an effective address.
- [`vmrghb`](vmrghb.md), [`vmrglb`](vmrglb.md), [`vmrghh`](vmrghh.md), [`vmrglh`](vmrglh.md), [`vmrghw`](vmrghw.md), [`vmrglw`](vmrglw.md) — dedicated merges that are a subset of `vperm`.
- [`vspltb`](vspltb.md), [`vsplth`](vsplth.md), [`vspltw`](vspltw.md) — splat-from-lane, also expressible via `vperm` + a constant mask.
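- [`vpkuhum`](vpkuhum.md) and other `vpk*` — narrower-lane packs whose pattern can also be encoded in `vperm`. This holds for the modulo packs, which only select bytes; the saturating `vpk*` variants clamp values and are not pure permutes.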
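
To make the "subset of `vperm`" relationships above concrete, the Rust sketch below builds constant control vectors that reproduce `vsldoi`, `vspltb`, `vmrghb`, and `vpkuhum` through a byte-wise `vperm` model. All function names and the `V128` alias are hypothetical helpers written for this page; the masks follow the big-endian byte order used throughout (byte 0 is the most significant).

```rust
/// Illustrative 16-byte vector; index 0 is the most significant byte.
type V128 = [u8; 16];

/// Byte-wise model of `vperm` over the concatenation of `va` and `vb`.
fn vperm(va: &V128, vb: &V128, vc: &V128) -> V128 {
    let mut r = [0u8; 16];
    for i in 0..16 {
        let sel = (vc[i] & 0x1F) as usize;
        r[i] = if sel < 16 { va[sel] } else { vb[sel - 16] };
    }
    r
}

/// Control vector matching `vsldoi` with a shift of `sh` bytes (0..=15):
/// result byte i is byte sh + i of the 32-byte concatenation.
fn vsldoi_mask(sh: u8) -> V128 {
    let mut m = [0u8; 16];
    for i in 0..16 {
        m[i] = (sh & 0xF) + i as u8;
    }
    m
}

/// Control vector matching `vspltb`: every lane picks byte `n` of the
/// first operand.
fn vspltb_mask(n: u8) -> V128 {
    [n & 0xF; 16]
}

/// Control vector matching `vmrghb`: interleave the first 8 bytes of
/// each operand (A0, B0, A1, B1, ...).
fn vmrghb_mask() -> V128 {
    let mut m = [0u8; 16];
    for i in 0..8 {
        m[2 * i] = i as u8;
        m[2 * i + 1] = 16 + i as u8;
    }
    m
}

/// Control vector matching `vpkuhum`: keep the low-order byte of each
/// of the 16 halfwords in the concatenation (bytes 1, 3, 5, ..., 31).
fn vpkuhum_mask() -> V128 {
    let mut m = [0u8; 16];
    for i in 0..16 {
        m[i] = (2 * i + 1) as u8;
    }
    m
}
```

If the two inputs hold the byte values 0 to 15 and 16 to 31, so that each byte's value equals its position in the concatenation, then `vperm(&a, &b, &vsldoi_mask(3))` yields bytes 3 through 18 and `vperm(&a, &b, &vmrghb_mask())` yields 0, 16, 1, 17, and so on. The dedicated instructions remain preferable in practice because they need no control vector in a register.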
## IBM Reference
- [AIX 7.3 — `vperm` (Vector Permute)](https://www.ibm.com/docs/en/aix/7.3.0?topic=set-vperm-vector-permute-instruction)
- [IBM AltiVec Technology Programmer's Interface Manual, Chapter 6 — Permute and Formatting](https://www.nxp.com/docs/en/reference-manual/ALTIVECPIM.pdf)