xenia-rs/migration/project-root/ppc-manual/generator/README.md
MechaCat02 e6d43a23ac chore: add migration/ bundle for cross-machine setup
Bundles state that lives OUTSIDE the xenia-rs repo so a fresh clone on
another machine can be brought up to identical configuration via
migration/setup.sh:

  - claude-memory/             ~/.claude/projects/-home-fabi-RE-Project-Sylpheed/memory/
                               (103 files, 1.1 MB - MEMORY.md + every
                                project_xenia_rs_*.md from audits
                                addis_signext through audit-058)
  - project-root/dot-claude/   <project-root>/.claude/settings.json
                               (Stop hook + permissions)
  - project-root/ppc-manual/   <project-root>/ppc-manual/
                               (PowerPC reference docs, 397 files, 3.7 MB)
  - project-root/run-canary.sh <project-root>/run-canary.sh
  - README.md                  Human-readable setup checklist
  - setup.sh                   Idempotent installer (also reclones
                               xenia-canary at pinned HEAD 6de80dffe)
  - MANIFEST.md                Per-file mapping + per-file-not-bundled
                               restoration recipe

Excluded from bundle (not shippable via git):
  - Sylpheed ISO (7.8 GB; copyright; manual copy required)
  - sylpheed.db (395 MB; regenerable from XEX via analysis tooling)
  - target/ build artifacts (rebuild on target)
  - audit-runs probe firehoses (.log/.stdout/.stderr ~11 GB; rerun if needed)
  - audit-runs memory dumps (.bin ~4.5 GB; rerun audit-026/027/029 if needed)
  - xenia-canary checkout (setup.sh reclones from
    git.mc02.dev/fabi/Xenia-Canary.git at HEAD 6de80dffe)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 21:38:38 +02:00


# Manual generator
Python scripts that build the `ppc-manual/` tree from three sources in
this repository. The first two are authoritative; the third is only
referenced by line number:
- `xenia-canary/tools/ppc-instructions.xml` — metadata for all 455
Xbox 360 PPC instructions (mnemonic, form, group, opcode, in/out
fields, disasm template).
- `xenia-rs/crates/xenia-cpu/src/` — the Rust interpreter. Individual
instruction semantics live in `interpreter.rs` match arms.
- `xenia-canary/src/xenia/cpu/ppc/ppc_emit_*.cc` — the C++ emit
functions; referenced by line number only.
## Files
| File | Purpose |
| --- | --- |
| `generate_manual.py` | Main entry point. Parses XML, builds families, renders pages, writes `index.json`. |
| `xml_model.py` | XML parser + `expand_runtime_variants()` (produces the set of Rc/OE/LK-expanded mnemonics a single XML entry covers). |
| `bit_layout.py` | Per-form bit-field tables (rendered into the Encoding section of every page and into `forms/*.md`). |
| `rust_scraper.py` | Locates each `PpcOpcode::<mnem>` enum variant, decoder arm, and interpreter match-arm line range. |
| `cxx_scraper.py` | Locates `InstrEmit_<mnem>` in the xenia-canary emit `.cc` files. |
## Running
```bash
python3 ppc-manual/generator/generate_manual.py # full generate
python3 ppc-manual/generator/generate_manual.py --dry-run # parse + consistency checks only
python3 ppc-manual/generator/generate_manual.py --out /tmp/out # alternate output root
python3 ppc-manual/generator/generate_manual.py --xml /path/to/ppc-instructions.xml
```
No third-party dependencies; Python 3.10+ standard library only.
## Idempotency
The generator is re-runnable without data loss:
1. Each page has a pair of sentinel comments:
   - `<!-- GENERATED: BEGIN -->`
   - `<!-- GENERATED: END -->`
2. On re-run, only the text **between** the sentinels is rewritten.
   Everything after `END` (Special Cases, Related Instructions, IBM
   Reference) is preserved verbatim.
3. If the `END` sentinel is missing, the generator assumes a reviewer
   has fully taken over the file and skips it entirely.
## Consistency checks (enforced by `--dry-run` as well)
- **XML entry count ≡ 455** — warns if the XML has been modified.
- **family membership total ≡ XML entry count** — every XML entry
must land in exactly one family.
- **index coverage ≡ runtime-expanded mnemonic count** — the JSON
index must contain a key for every runtime variant (`add`, `add.`,
`addo`, `addo.`, `bclr`, `bclrl`, …).
## Family grouping rules
Three rules are applied in order (see `_family_head` in
`generate_manual.py`):
1. If a mnemonic ends in `128` and the non-128 sibling exists, it
joins the sibling's family. So `vaddfp128` is consolidated into
the `vaddfp` page.
2. For memory ops (group `m`), trailing `u`, `x`, or `ux` suffixes
are stripped when the base exists. So `lwz`, `lwzu`, `lwzx`,
`lwzux` all land on the `lwz` page.
3. Otherwise the mnemonic is its own family head.
All other flag variants (`Rc`, `OE`, `LK`) are **runtime** — they are
NOT separate XML entries; they are listed in the page's "Assembler
Mnemonics" table.
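
The three grouping rules can be sketched as follows. This is an illustration written from the description above, not a copy of `_family_head`; the signature and the `known` parameter (the set of all XML mnemonics) are assumptions.

```python
def family_head(mnemonic: str, group: str, known: set[str]) -> str:
    """Return the family head for an XML mnemonic.

    `group` is the XML group letter and `known` is the set of all XML
    mnemonics; both parameter names are hypothetical.
    """
    # Rule 1: fold a *128 mnemonic into its non-128 sibling, if one exists.
    if mnemonic.endswith("128") and mnemonic[:-3] in known:
        return mnemonic[:-3]
    # Rule 2: memory ops drop a trailing ux/u/x when the base exists.
    if group == "m":
        # Try the longest suffix first so that lwzux maps to lwz, not lwzu.
        for suffix in ("ux", "u", "x"):
            base = mnemonic[: -len(suffix)]
            if mnemonic.endswith(suffix) and base in known:
                return base
    # Rule 3: otherwise the mnemonic is its own family head.
    return mnemonic
```

The suffix order matters: checking `x` before `ux` would send `lwzux` to the `lwzu` family, because `lwzu` is itself a known mnemonic.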
## Category mapping
| XML group | Category dir | Notes |
| --- | --- | --- |
| `i` (integer) | `alu/` | |
| `m` (memory) | `memory/` | |
| `b` (branch) | `branch/` | Includes `sc` and traps |
| `c` (control) | `control/` | CR logical, SPR, sync |
| `f` (fpu) | `fpu/` | |
| `v` (vector) | `vmx/` or `vmx128/` | Split by form: `VX128*` → `vmx128/`, otherwise `vmx/` |
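
The table translates into a small lookup plus one special case for the vector group. The sketch below assumes the XML group is available as the single letter shown in the table and that VMX128 form names start with `VX128`; the function and constant names are hypothetical.

```python
# Direct group-to-directory mappings taken from the table above.
GROUP_TO_DIR = {
    "i": "alu",
    "m": "memory",
    "b": "branch",
    "c": "control",
    "f": "fpu",
}


def category_dir(group: str, form: str) -> str:
    """Return the category directory for an instruction's group and form."""
    if group == "v":
        # The vector group is split by form: VX128* forms get their own directory.
        return "vmx128" if form.startswith("VX128") else "vmx"
    # An unknown group raises KeyError, which surfaces new XML groups early.
    return GROUP_TO_DIR[group]
```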
## Extending the generator
- **Pseudocode seeds.** The `PSEUDOCODE_SEEDS` dict in
`generate_manual.py` maps an XML mnemonic to a PPC-style pseudocode
block. Add entries here to pre-fill the Operation section for
additional mnemonics. Phase 2 reviewers can still override by
writing content outside the sentinels.
- **C translation seeds.** Similar dict of C snippets keyed by family
head.
- **Field descriptions.** `FIELD_DESCRIPTIONS` maps XML field names to
IBM-style prose. Missing entries are marked "_Phase 2: document
this field._"
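
For illustration, a seed entry and the missing-field fallback might look like the sketch below. The dict contents, the `addx` key, the `RA` description, and the `describe_field` helper are all assumptions; only the dict names and the placeholder text come from this README.

```python
# Hypothetical seed: the key is the XML mnemonic, the value a PPC-style
# pseudocode block for the Operation section.
PSEUDOCODE_SEEDS = {
    "addx": "RT <- (RA) + (RB)",
}

# Hypothetical field description keyed by XML field name.
FIELD_DESCRIPTIONS = {
    "RA": "Source general-purpose register A.",
}

PLACEHOLDER = "_Phase 2: document this field._"


def describe_field(name: str) -> str:
    """Return the prose for a field, or the Phase 2 placeholder if none exists."""
    return FIELD_DESCRIPTIONS.get(name, PLACEHOLDER)
```

Under this sketch, adding coverage for another mnemonic or field is a one-line dict entry, and any field without an entry stays visibly marked for Phase 2 review.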
## Known limitations
- Extended-opcode extraction in `xml_model.Instruction.extended_opcode`
is best-effort per form. For VMX128 variants the extracted value may
not match the exact pattern used by xenia's decoder tree — the page
still shows it as a reference but the decoder source (linked on
every page) is authoritative.
- `rust_scraper` uses a naive brace counter to delimit interpreter
match arms. It works for the current interpreter because the match
arms use balanced braces and no string literals with unbalanced
braces. If the interpreter ever adopts such literals the scraper
will need a Rust-aware parser.
- The generator treats a trailing `x` on a mnemonic as a xenia naming
convention ("extended/XO form") and strips it for assembly display.
The memory group is the exception: there `x` is the natural
indexed-form suffix. If a future xenia XML adds another group where
`x` is structural, the heuristic in
`xml_model.expand_runtime_variants` needs updating.
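
To make the brace-counter limitation concrete, here is a minimal sketch of a naive counter and an input that defeats it. It is not the code in `rust_scraper.py`; the function name and the Rust snippets in the usage note are invented for illustration.

```python
def arm_end_line(lines: list[str], start: int) -> int:
    """Return the index of the line on which the braces opened at or
    after `start` balance again.

    Every `{` and `}` character is counted, including those inside
    string literals and comments, which is exactly the weakness
    described above.
    """
    depth = 0
    opened = False
    for i in range(start, len(lines)):
        for ch in lines[i]:
            if ch == "{":
                depth += 1
                opened = True
            elif ch == "}":
                depth -= 1
        if opened and depth == 0:
            return i
    raise ValueError("unbalanced braces")
```

With balanced code the counter finds the closing line of the arm. A string literal containing a lone `}` makes it stop early:

```python
ok = [
    "PpcOpcode::addx => {",
    "    let v = a + b;",
    "    if rc { update_cr0(v); }",
    "}",
]
assert arm_end_line(ok, 0) == 3

tricky = [
    "PpcOpcode::foo => {",
    '    log("}");',
    "    more();",
    "}",
]
assert arm_end_line(tricky, 0) == 1  # truncated: the real arm ends on line 3
```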