xenia-rs/migration/project-root/ppc-manual/vmx/lvsl.md
MechaCat02 e6d43a23ac chore: add migration/ bundle for cross-machine setup
2026-05-10 21:38:38 +02:00

# `lvsl` — Load Vector for Shift Left Indexed
> **Category:** [VMX (Altivec)](../categories/vmx.md) · **Form:** [X](../forms/X.md) · **Opcode:** `0x7c00000c`
<!-- GENERATED: BEGIN -->
## Assembler Mnemonics
| Mnemonic | XML entry | Flags | Description |
| --- | --- | --- | --- |
| `lvsl` | `lvsl` | — | Load Vector for Shift Left Indexed |
| `lvsl128` | `lvsl128` | — | Load Vector for Shift Left Indexed 128 |
## Syntax
```asm
lvsl [VD], [RA0], [RB]
lvsl128 [VD], [RA0], [RB]
```
## Encoding
### `lvsl` — form `X`
- **Opcode word:** `0x7c00000c`
- **Primary opcode (bits 0-5):** `31`
- **Extended opcode:** `6`
- **Synchronising:** no
| Bits | Field | Meaning |
| --- | --- | --- |
| 0-5 | `OPCD` | primary opcode |
| 6-10 | `RT/FRT/VRT` | destination |
| 11-15 | `RA/FRA/VRA` | source A |
| 16-20 | `RB/FRB/VRB` | source B |
| 21-30 | `XO` | extended opcode (10 bits) |
| 31 | `Rc` | record-form flag |
### `lvsl128` — form `VX128_1`
- **Opcode word:** `0x10000003`
- **Primary opcode (bits 0-5):** `4`
- **Extended opcode:** `3`
- **Synchronising:** no
| Bits | Field | Meaning |
| --- | --- | --- |
| 0-5 | `OPCD` | primary opcode (4) |
| 6-10 | `VD128l` | destination low 5 bits |
| 11-15 | `RA` | address register |
| 16-20 | `RB` | offset register |
| 21-27 | `XO` | extended opcode |
| 28-29 | `VD128h` | destination high 2 bits |
| 30-31 | `XO` (low) | low 2 bits of the extended opcode (`0b11` in the opcode word `0x10000003`) |
## Operands
| Field | Role | Description |
| --- | --- | --- |
| `RA0` | lvsl: read; lvsl128: read | Source GPR; when the encoded register number is 0 the operand is the literal 64-bit zero, **not** `r0`. |
| `RB` | lvsl: read; lvsl128: read | Source GPR. |
| `VD` | lvsl: write; lvsl128: write | Destination vector register. |
## Register Effects
### `lvsl`
- **Reads (always):** `RA0`, `RB`
- **Reads (conditional):** _none_
- **Writes (always):** `VD`
- **Writes (conditional):** _none_
### `lvsl128`
- **Reads (always):** `RA0`, `RB`
- **Reads (conditional):** _none_
- **Writes (always):** `VD`
- **Writes (conditional):** _none_
## Status-Register Effects
_No condition-register or status-register effects._
## Operation (pseudocode)
```
addr_lo <- ((RA|0) + (RB))[60:63]
for i in 0..15: VD[i] <- addr_lo + i
```
## C Translation Example
```c
/* lvsl VD, RA, RB — load-shift-left permute control */
uint64_t base = (insn.RA == 0) ? 0 : r[insn.RA];
uint8_t sh = (uint8_t)((base + r[insn.RB]) & 0xF);
for (int i = 0; i < 16; ++i) v[insn.VD].b[i] = sh + i;
```
## Implementation References
**`lvsl`**
- xenia-canary XML: [`tools/ppc-instructions.xml` — search for `mnem="lvsl"`](../../xenia-canary/tools/ppc-instructions.xml)
- xenia-canary emit: [`src/xenia/cpu/ppc/ppc_emit_altivec.cc:111`](../../xenia-canary/src/xenia/cpu/ppc/ppc_emit_altivec.cc#L111)
- xenia-rs opcode: [`crates/xenia-cpu/src/opcode.rs:46`](../../xenia-rs/crates/xenia-cpu/src/opcode.rs#L46)
- xenia-rs decoder: [`crates/xenia-cpu/src/decoder.rs:751`](../../xenia-rs/crates/xenia-cpu/src/decoder.rs#L751)
- xenia-rs interpreter: [`crates/xenia-cpu/src/interpreter.rs:2520-2529`](../../xenia-rs/crates/xenia-cpu/src/interpreter.rs#L2520-L2529)
<details><summary>xenia-rs interpreter body (frozen snapshot)</summary>
```rust
PpcOpcode::lvsl | PpcOpcode::lvsl128 => {
    let ea = if instr.ra() == 0 { 0u64 } else { ctx.gpr[instr.ra()] };
    let ea = ea.wrapping_add(ctx.gpr[instr.rb()]);
    let sh = (ea & 0xF) as u8;
    let mut r = [0u8; 16];
    for i in 0..16 { r[i] = sh + i as u8; }
    let vd = if matches!(instr.opcode, PpcOpcode::lvsl128) { instr.vd128() } else { instr.rd() };
    ctx.vr[vd] = xenia_types::Vec128::from_bytes(r);
    ctx.pc += 4;
}
```
</details>
**`lvsl128`**
- xenia-canary XML: [`tools/ppc-instructions.xml` — search for `mnem="lvsl128"`](../../xenia-canary/tools/ppc-instructions.xml)
- xenia-canary emit: [`src/xenia/cpu/ppc/ppc_emit_altivec.cc:114`](../../xenia-canary/src/xenia/cpu/ppc/ppc_emit_altivec.cc#L114)
- xenia-rs opcode: [`crates/xenia-cpu/src/opcode.rs:46`](../../xenia-rs/crates/xenia-cpu/src/opcode.rs#L46)
- xenia-rs decoder: [`crates/xenia-cpu/src/decoder.rs:412`](../../xenia-rs/crates/xenia-cpu/src/decoder.rs#L412)
- xenia-rs interpreter: [`crates/xenia-cpu/src/interpreter.rs:2520-2529`](../../xenia-rs/crates/xenia-cpu/src/interpreter.rs#L2520-L2529)
<details><summary>xenia-rs interpreter body (frozen snapshot)</summary>
```rust
PpcOpcode::lvsl | PpcOpcode::lvsl128 => {
    let ea = if instr.ra() == 0 { 0u64 } else { ctx.gpr[instr.ra()] };
    let ea = ea.wrapping_add(ctx.gpr[instr.rb()]);
    let sh = (ea & 0xF) as u8;
    let mut r = [0u8; 16];
    for i in 0..16 { r[i] = sh + i as u8; }
    let vd = if matches!(instr.opcode, PpcOpcode::lvsl128) { instr.vd128() } else { instr.rd() };
    ctx.vr[vd] = xenia_types::Vec128::from_bytes(r);
    ctx.pc += 4;
}
```
</details>
<!-- GENERATED: END -->
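The two encoding tables above can be turned into field extractors. The following is a minimal sketch, not the xenia decoder: the struct and function names are hypothetical, and the shifts assume the IBM bit numbering used in the tables (bit 0 is the most-significant bit of the 32-bit instruction word).

```c
#include <stdint.h>

/* Hypothetical container for the three register fields. */
typedef struct { unsigned vd, ra, rb; } lvsl_fields;

/* X-form `lvsl`: VD in bits 6-10, RA in bits 11-15, RB in bits 16-20. */
static lvsl_fields decode_lvsl(uint32_t w) {
    lvsl_fields f = { (w >> 21) & 0x1F, (w >> 16) & 0x1F, (w >> 11) & 0x1F };
    return f;
}

/* VX128_1-form `lvsl128`: VD128l in bits 6-10, VD128h in bits 28-29.
   The 7-bit register number is VD128h (high) joined with VD128l (low). */
static lvsl_fields decode_lvsl128(uint32_t w) {
    unsigned vd_lo = (w >> 21) & 0x1F;
    unsigned vd_hi = (w >> 2) & 0x3;
    lvsl_fields f = { vd_lo | (vd_hi << 5), (w >> 16) & 0x1F, (w >> 11) & 0x1F };
    return f;
}
```

For instance, the word `0x7c64280c` (the `lvsl` opcode word with `VD = 3`, `RA = 4`, `RB = 5` filled in) decodes to those three values, and `0x1081100f` decodes as `lvsl128` with `VD = 100`.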
## Extended Pseudocode
```
; lvsl VD, RA, RB — load vector for shift left (generates a permute mask)
EA <- (RA|0) + (RB) ; full 64-bit EA; only the low 4 bits matter
sh <- EA[60:63] ; bits 60..63 of EA (the misalignment)
for i in 0..15:
    VD[i] <- sh + i ; bytes 0..15 of VD = {sh, sh+1, …, sh+15}
```
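The pseudocode above can be sketched as a standalone C function. This is an illustration under stated assumptions: `lvsl_mask` is a hypothetical name, and the `gpr` array stands in for whatever register file an emulator uses.

```c
#include <stdint.h>

/* Hypothetical helper. `ra_field` and `rb_field` are the encoded register
   numbers; `gpr` is an assumed 32-entry general-purpose register file. */
static void lvsl_mask(unsigned ra_field, unsigned rb_field,
                      const uint64_t gpr[32], uint8_t out[16]) {
    /* (RA|0): an encoded RA of 0 means the literal zero, not r0. */
    uint64_t base = (ra_field == 0) ? 0 : gpr[ra_field];
    /* Unsigned addition wraps modulo 2^64, matching the 64-bit EA. */
    uint64_t ea = base + gpr[rb_field];
    uint8_t sh = (uint8_t)(ea & 0xF);
    for (int i = 0; i < 16; ++i)
        out[i] = (uint8_t)(sh + i);   /* at most 15 + 15 = 30 */
}
```

With `gpr[5] = 0x1003` and an encoded `RA` of 0, the mask is `{3, 4, …, 18}` regardless of what `gpr[0]` holds.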
## Special Cases & Edge Conditions
- **No memory is actually read.** Despite the name, `lvsl` / `lvsr` do **not** touch memory. They consume the effective address only to extract the low four bits (the alignment offset) and materialise a 16-byte permute control vector in `VD`. They are pure "address → permute-mask" converters.
- **Big-endian byte indexing.** `VD[0]` is the most-significant byte of the 128-bit register (lane 0). When `EA & 0xF == 0` the output is `{0, 1, 2, …, 15}`, i.e. the identity permute. When `EA & 0xF == 3` the output is `{3, 4, …, 18}`: no modulo is applied, so the values *do* exceed 15. That is intentional. When the mask is fed into [`vperm`](vperm.md) (`vperm VD, VA, VB, VC`), byte selectors 0..15 index into `VA` and 16..31 index into `VB`. The sequence of one `lvsl`, two aligned `lvx` loads of consecutive 16-byte blocks, and one `vperm` therefore reconstructs the unaligned 16-byte vector at `EA`.
- **Pair with [`lvsr`](lvsr.md) for the opposite direction.** `lvsl` produces the mask for a "left" shift of the concatenated pair, so bytes at higher addresses move toward byte index 0; `lvsr` produces the mask for a "right" shift. Which one to pick depends on which aligned block you're starting from, as in the standard idiom below.
- **Standard unaligned-load idiom.**
```
addi  rT, rA, 16        ; rT = EA + 16 (rT is any scratch GPR; rA holds EA)
lvx   vAL, 0, rA        ; aligned block at EA & ~0xF
lvx   vAH, 0, rT        ; next aligned block
lvsl  vC, 0, rA         ; permute mask from misalignment
vperm vD, vAL, vAH, vC  ; the unaligned 16 bytes starting at EA
```
- **`RA0` semantics.** When `RA = 0` the base is the literal zero, so `lvsl vD, 0, rB` derives the mask from `rB & 0xF`.
- **VMX128 sibling (`lvsl128`).** Same semantics; only the `VD` register is encoded differently, as a 7-bit VMX128 register number (`VD128h ‖ VD128l`), so `vD` may be `v0..v127`.
- **No flags, no side effects** beyond writing `VD`. Trivial to move and schedule.
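
The unaligned-load idiom in the list above can be modelled in plain C to show why mask values above 15 are needed. This is a sketch under assumptions: guest memory is modelled as a flat byte array indexed by `ea`, vectors are 16-byte arrays with byte 0 at the lowest address, and all function names are hypothetical stand-ins for the instructions they imitate.

```c
#include <stdint.h>
#include <string.h>

/* lvsl: mask {sh, sh+1, ..., sh+15} from the low 4 bits of the address. */
static void lvsl_mask(uint64_t ea, uint8_t mask[16]) {
    uint8_t sh = (uint8_t)(ea & 0xF);
    for (int i = 0; i < 16; ++i) mask[i] = (uint8_t)(sh + i);
}

/* lvx: load the aligned 16-byte block containing ea (low 4 bits ignored). */
static void lvx(const uint8_t *mem, uint64_t ea, uint8_t out[16]) {
    memcpy(out, mem + (ea & ~(uint64_t)0xF), 16);
}

/* vperm: selectors 0..15 pick from a, 16..31 pick from b. */
static void vperm(const uint8_t a[16], const uint8_t b[16],
                  const uint8_t c[16], uint8_t out[16]) {
    for (int i = 0; i < 16; ++i) {
        uint8_t sel = c[i] & 0x1F;   /* only the low 5 bits select */
        out[i] = (sel < 16) ? a[sel] : b[sel - 16];
    }
}

/* The idiom: two aligned loads, one mask, one permute. */
static void load_unaligned(const uint8_t *mem, uint64_t ea, uint8_t out[16]) {
    uint8_t lo[16], hi[16], mask[16];
    lvx(mem, ea, lo);
    lvx(mem, ea + 16, hi);
    lvsl_mask(ea, mask);
    vperm(lo, hi, mask, out);
}
```

With `ea = 19`, the mask is `{3, …, 18}`: selectors 3..15 take the tail of the block at offset 16, and selectors 16..18 take the first three bytes of the block at offset 32, which together are the 16 bytes starting at offset 19.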
## Related Instructions
- [`lvsr`](lvsr.md) — the mirror: `VD[i] = 16 - sh + i`.
- [`vperm`](vperm.md) — consumes the mask to perform arbitrary byte-level permutation across two vectors.
- [`lvx`](lvx.md), [`lvlx`](lvlx.md), [`lvrx`](lvrx.md) — the actual memory loads used alongside the mask.
- [`vsldoi`](vsldoi.md) — static-offset shift-double; when the shift is compile-time known, this is cheaper than the `lvsl`/`vperm` pair.
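
To make the relationships in this list concrete, the sketch below computes the `lvsr` mask next to the `lvsl` mask and models `vsldoi` as the fixed-shift equivalent of `vperm` with an `lvsl` mask. The function names are hypothetical, and vectors are modelled as 16-byte arrays with byte 0 first.

```c
#include <stdint.h>

/* lvsl mask: byte i = sh + i, where sh = EA & 0xF. */
static void lvsl_mask(uint64_t ea, uint8_t out[16]) {
    uint8_t sh = (uint8_t)(ea & 0xF);
    for (int i = 0; i < 16; ++i) out[i] = (uint8_t)(sh + i);
}

/* lvsr mask: byte i = 16 - sh + i. */
static void lvsr_mask(uint64_t ea, uint8_t out[16]) {
    uint8_t sh = (uint8_t)(ea & 0xF);
    for (int i = 0; i < 16; ++i) out[i] = (uint8_t)(16 - sh + i);
}

/* vsldoi: bytes sh..sh+15 of the 32-byte concatenation a || b,
   with sh given as an immediate instead of a mask register. */
static void vsldoi(const uint8_t a[16], const uint8_t b[16],
                   unsigned sh, uint8_t out[16]) {
    for (int i = 0; i < 16; ++i) {
        unsigned sel = (sh & 0xF) + (unsigned)i;
        out[i] = (sel < 16) ? a[sel] : b[sel - 16];
    }
}
```

For `sh = 3`, `lvsl_mask` gives `{3, …, 18}` and `lvsr_mask` gives `{13, …, 28}`. For `sh = 0`, `lvsr_mask` gives `{16, …, 31}`, which selects the whole second `vperm` source.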
## IBM Reference
- [AIX 7.3 — `lvsl` (Load Vector for Shift Left Indexed)](https://www.ibm.com/docs/en/aix/7.3.0?topic=set-lvsl-load-vector-shift-left-indexed)
- [AltiVec Technology Programming Interface Manual (Freescale/NXP): unaligned-load idiom](https://www.nxp.com/docs/en/reference-manual/ALTIVECPIM.pdf)