# `vmhaddshs` — Vector Multiply-High and Add Signed Half Word Saturate
> **Category:** [VMX (Altivec)](../categories/vmx.md) · **Form:** [VA](../forms/VA.md) · **Opcode:** `0x10000020`
<!-- GENERATED: BEGIN -->
## Assembler Mnemonics
| Mnemonic | XML entry | Flags | Description |
| --- | --- | --- | --- |
| `vmhaddshs` | `vmhaddshs` | — | Vector Multiply-High and Add Signed Half Word Saturate |
## Syntax
```asm
vmhaddshs [VD], [VA], [VB], [VC]
```
## Encoding
### `vmhaddshs` — form `VA`
- **Opcode word:** `0x10000020`
- **Primary opcode (bits 0-5):** `4`
- **Extended opcode:** `32`
- **Synchronising:** no
| Bits | Field | Meaning |
| --- | --- | --- |
| 0-5 | `OPCD` | primary opcode (4) |
| 6-10 | `VRT` | destination vector register |
| 11-15 | `VRA` | source A |
| 16-20 | `VRB` | source B |
| 21-25 | `VRC` | source C (per-lane addend) |
| 26-31 | `XO` | extended opcode (6 bits) |
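
The field table above can be turned into a small decoder. The following is a minimal sketch, assuming IBM bit numbering (bit 0 is the most-significant bit of the 32-bit word); `VaFields`, `field`, `decode_va` and `is_vmhaddshs` are hypothetical names for illustration and are not the xenia-rs decoder API.

```rust
/// Fields of a VA-form instruction word (hypothetical struct, illustration only).
#[derive(Debug, PartialEq, Eq)]
struct VaFields {
    opcd: u32,
    vrt: u32,
    vra: u32,
    vrb: u32,
    vrc: u32,
    xo: u32,
}

/// Extract bits `first..=last` using IBM numbering (bit 0 = MSB of the word).
/// Only intended for fields narrower than 32 bits.
fn field(word: u32, first: u32, last: u32) -> u32 {
    let width = last - first + 1;
    (word >> (31 - last)) & ((1u32 << width) - 1)
}

/// Split a 32-bit instruction word into the VA-form fields from the table.
fn decode_va(word: u32) -> VaFields {
    VaFields {
        opcd: field(word, 0, 5),
        vrt: field(word, 6, 10),
        vra: field(word, 11, 15),
        vrb: field(word, 16, 20),
        vrc: field(word, 21, 25),
        xo: field(word, 26, 31),
    }
}

/// True when the word has primary opcode 4 and extended opcode 32.
fn is_vmhaddshs(word: u32) -> bool {
    let f = decode_va(word);
    f.opcd == 4 && f.xo == 32
}
```

Under these assumptions, `vmhaddshs v1, v2, v3, v4` assembles to `0x10221920`, which `decode_va` splits back into `vrt = 1`, `vra = 2`, `vrb = 3`, `vrc = 4`.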
## Operands
| Field | Role | Description |
| --- | --- | --- |
| `VA` | vmhaddshs: read | Source A vector register. |
| `VB` | vmhaddshs: read | Source B vector register. |
| `VC` | vmhaddshs: read | Source C vector register (per-lane addend). |
| `VD` | vmhaddshs: write | Destination vector register. |
| `VSCR` | vmhaddshs: write | Vector Status and Control Register (NJ/SAT bits). |
## Register Effects
### `vmhaddshs`
- **Reads (always):** `VA`, `VB`, `VC`
- **Reads (conditional):** _none_
- **Writes (always):** `VD`, `VSCR`
- **Writes (conditional):** _none_
## Status-Register Effects
- `vmhaddshs`: **VSCR[SAT]** is set if the result of any lane is clamped. The bit is sticky: this instruction never clears it.
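
The sticky behaviour can be modelled in a few lines. This is a sketch only: `Vscr`, `note_saturation` and `mtvscr` are hypothetical names, not the xenia-rs context API, and the model assumes SAT is the least-significant bit of the 32-bit VSCR.

```rust
/// Assumed position of SAT: least-significant bit of the 32-bit VSCR.
const VSCR_SAT: u32 = 0x0000_0001;

/// Minimal stand-in for the VSCR register (illustration only).
struct Vscr(u32);

impl Vscr {
    /// Called after a saturating vector op with "did any lane clamp?".
    /// ORs the flag in, so a later non-saturating op leaves SAT set.
    fn note_saturation(&mut self, saturated: bool) {
        if saturated {
            self.0 |= VSCR_SAT;
        }
    }

    /// Current value of the SAT bit.
    fn sat(&self) -> bool {
        self.0 & VSCR_SAT != 0
    }

    /// Models `mtvscr`: an explicit write of the whole register,
    /// which is how guest code clears SAT.
    fn mtvscr(&mut self, value: u32) {
        self.0 = value;
    }
}
```

A sequence of saturating operations therefore reports "at least one lane clamped since the last `mtvscr`", not the status of the most recent instruction.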
## Operation (pseudocode)
```
; Derived from the xenia-rs interpreter arm (see Implementation
; References). Halfword lanes i = 0..7; lane 0 is most significant.
for i in 0..7:
    prod  = (EXTS(VA[i]) * EXTS(VB[i])) >> 15   ; arithmetic shift, truncating
    sum   = prod + EXTS(VC[i])                  ; 32-bit intermediate, not yet clamped
    VD[i] = Clamp(sum, -32768, +32767)
    if VD[i] != sum: VSCR[SAT] = 1              ; sticky
; Consult the IBM AIX reference link under IBM Reference for
; canonical PPC-style pseudocode.
```
## C Translation Example
```c
/* Sketch of a C translation of the xenia-rs interpreter arm shown  */
/* under Implementation References. Assumes >> on a negative        */
/* int32_t is an arithmetic shift (true on mainstream compilers).   */
#include <stdbool.h>
#include <stdint.h>

/* Returns true if any lane saturated; the caller ORs it into VSCR[SAT]. */
static bool vmhaddshs(int16_t vd[8], const int16_t va[8],
                      const int16_t vb[8], const int16_t vc[8]) {
    bool sat = false;
    for (int i = 0; i < 8; i++) {
        int32_t sum = (((int32_t)va[i] * vb[i]) >> 15) + vc[i];
        int32_t r = sum > INT16_MAX ? INT16_MAX : sum < INT16_MIN ? INT16_MIN : sum;
        sat |= (r != sum);
        vd[i] = (int16_t)r;
    }
    return sat;
}
```
## Implementation References
**`vmhaddshs`**
- xenia-canary XML: [`tools/ppc-instructions.xml` — search for `mnem="vmhaddshs"`](../../xenia-canary/tools/ppc-instructions.xml)
- xenia-canary emit: [`src/xenia/cpu/ppc/ppc_emit_altivec.cc:883`](../../xenia-canary/src/xenia/cpu/ppc/ppc_emit_altivec.cc#L883)
- xenia-rs opcode: [`crates/xenia-cpu/src/opcode.rs:102`](../../xenia-rs/crates/xenia-cpu/src/opcode.rs#L102)
- xenia-rs decoder: [`crates/xenia-cpu/src/decoder.rs:576`](../../xenia-rs/crates/xenia-cpu/src/decoder.rs#L576)
- xenia-rs interpreter: [`crates/xenia-cpu/src/interpreter.rs:3519-3533`](../../xenia-rs/crates/xenia-cpu/src/interpreter.rs#L3519-L3533)
<details><summary>xenia-rs interpreter body (frozen snapshot)</summary>
```rust
PpcOpcode::vmhaddshs => {
    // vD[i] = sat_i16((vA[i] * vB[i]) >> 15 + vC[i])
    let a = crate::vmx::as_i16x8(ctx.vr[instr.ra()]);
    let b = crate::vmx::as_i16x8(ctx.vr[instr.rb()]);
    let c = crate::vmx::as_i16x8(ctx.vr[instr.rc()]);
    let mut r = [0i16; 8]; let mut sat = false;
    for i in 0..8 {
        let prod = (a[i] as i32 * b[i] as i32) >> 15;
        let (v, s) = crate::vmx::sat_i32_to_i16(prod + c[i] as i32);
        r[i] = v; sat |= s;
    }
    if sat { ctx.set_vscr_sat(true); }
    ctx.vr[instr.rd()] = crate::vmx::from_i16x8(r);
    ctx.pc += 4;
}
```
</details>
<!-- GENERATED: END -->
## Special Cases & Edge Conditions
- **Q15 fixed-point multiply-add, saturating.** Eight half-word lanes; per lane:
```
prod = (int16(VA[i]) * int16(VB[i])) >> 15 ; truncating, no rounding
VD[i] = clamp(prod + int16(VC[i]), -32768, +32767)
```
The "h" in the mnemonic is "high half" — only the upper 17 bits of the 32-bit signed product survive (after >>15), then the accumulator is added.
- **Truncating, not rounding.** The low 15 bits of the product are discarded with no rounding; because the shift is arithmetic, negative products round toward negative infinity. Use [`vmhraddshs`](vmhraddshs.md) when half-up rounding is needed (it adds `0x4000` to the product before the shift). The two are otherwise identical.
- **`VSCR[SAT]` is sticky-set** if `prod + VC[i]` overflows `int16`. Cleared only by [`mtvscr`](mtvscr.md). Xenia uses `crate::vmx::sat_i32_to_i16` ([`crates/xenia-cpu/src/vmx.rs`](../../xenia-rs/crates/xenia-cpu/src/vmx.rs)).
- **Pathological case `(0x8000 * 0x8000) >> 15`.** The 32-bit product of `-32768 * -32768` is `0x40000000`; after the shift it is `0x8000` = `+32768`, which is already outside the `int16` range before `VC` is added. If `VC[i] >= 0`, the clamp produces `+32767` and sets SAT; a negative `VC[i]` pulls the sum back into range and nothing saturates. This is the classic Q15 "minus-one-times-minus-one" gotcha.
- **Big-endian half lanes.** Lane 0 is the most-significant half.
- **No XER changes, no exceptions.**
- **No VMX128 sibling.**
- **Common usage.** Q15 IIR / FIR filter taps, fixed-point matrix-vector multiplies for audio.
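
The edge cases listed above can be checked against a standalone sketch of the lane operation. The code below is an illustration written for this page, not the xenia-rs implementation: `vmhaddshs_lane` and `vmhaddshs_bytes` are hypothetical names, and the byte-level helper assumes the register is held as 16 bytes in guest (big-endian) order.

```rust
/// One lane: truncating Q15 multiply, add, then clamp to i16.
/// Returns (result, saturated).
fn vmhaddshs_lane(a: i16, b: i16, c: i16) -> (i16, bool) {
    // |a * b| <= 2^30, so the product and the sum both fit in i32.
    let prod = (i32::from(a) * i32::from(b)) >> 15; // arithmetic shift, no rounding
    let sum = prod + i32::from(c);
    let clamped = sum.clamp(i32::from(i16::MIN), i32::from(i16::MAX));
    (clamped as i16, clamped != sum)
}

/// Whole-register helper over 16 bytes in big-endian order:
/// lane 0 occupies bytes 0..2 (the most-significant half).
/// Returns the result bytes and whether any lane saturated.
fn vmhaddshs_bytes(va: [u8; 16], vb: [u8; 16], vc: [u8; 16]) -> ([u8; 16], bool) {
    let lane = |v: &[u8; 16], i: usize| i16::from_be_bytes([v[2 * i], v[2 * i + 1]]);
    let mut out = [0u8; 16];
    let mut sat = false;
    for i in 0..8 {
        let (r, s) = vmhaddshs_lane(lane(&va, i), lane(&vb, i), lane(&vc, i));
        out[2 * i..2 * i + 2].copy_from_slice(&r.to_be_bytes());
        sat |= s;
    }
    (out, sat)
}
```

With this sketch, `vmhaddshs_lane(i16::MIN, i16::MIN, 0)` returns `(32767, true)`, while `vmhaddshs_lane(i16::MIN, i16::MIN, -1)` returns `(32767, false)` because the addend brings the sum back into range. `vmhaddshs_lane(-1, 1, 0)` returns `(-1, false)`, which shows that truncation rounds toward negative infinity rather than toward zero.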
## Related Instructions
- [`vmhraddshs`](vmhraddshs.md) — same operation with rounded multiply (`+0x4000` before `>> 15`).
- [`vmladduhm`](vmladduhm.md) — same shape, modulo (no shift, no saturate), unsigned half lanes.
- [`vmsumshs`](vmsumshs.md), [`vmsumshm`](vmsumshm.md) — multiply-sum across pairs of lanes.
- [`vaddshs`](vaddshs.md), [`vmaxsh`](vmaxsh.md) — saturating add and max at the same lane width, useful in the same DSP kernels.
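
The difference between `vmhaddshs` and `vmhraddshs` is confined to the product step. A minimal sketch of the two variants, with `mulhi_trunc` and `mulhi_round` as hypothetical helper names introduced here for comparison:

```rust
/// Product step of `vmhaddshs`: the low 15 bits are dropped.
fn mulhi_trunc(a: i16, b: i16) -> i32 {
    (i32::from(a) * i32::from(b)) >> 15
}

/// Product step of `vmhraddshs`: 0x4000 is added before the shift,
/// which rounds the Q15 result half-up.
fn mulhi_round(a: i16, b: i16) -> i32 {
    // |a * b| <= 2^30, so adding 0x4000 cannot overflow i32.
    (i32::from(a) * i32::from(b) + 0x4000) >> 15
}
```

For `a = 1`, `b = 0x4000` (0.5 in Q15) the truncating form yields `0` and the rounding form yields `1`; for `a = -1`, `b = 0x4000` they yield `-1` and `0` respectively. Both variants feed the same add-and-saturate step.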
## IBM Reference
- [AIX 7.3 — `vmhaddshs` (Vector Multiply-High and Add Signed Half Word Saturate)](https://www.ibm.com/docs/en/aix/7.3.0?topic=set-vmhaddshs-vector-multiply-high-add-signed-half-word-saturate-instruction)
- [AltiVec Technology Programming Interface Manual (Motorola/Freescale, hosted by NXP), Chapter 6 — Multiply-Add Family](https://www.nxp.com/docs/en/reference-manual/ALTIVECPIM.pdf)