Bundles state that lives OUTSIDE the xenia-rs repo so a fresh clone on
another machine can be brought up to an identical configuration via
migration/setup.sh:
- claude-memory/              ~/.claude/projects/-home-fabi-RE-Project-Sylpheed/memory/
                              (103 files, 1.1 MB - MEMORY.md + every
                              project_xenia_rs_*.md from audits
                              addis_signext through audit-058)
- project-root/dot-claude/    <project-root>/.claude/settings.json
                              (Stop hook + permissions)
- project-root/ppc-manual/    <project-root>/ppc-manual/
                              (PowerPC reference docs, 397 files, 3.7 MB)
- project-root/run-canary.sh  <project-root>/run-canary.sh
- README.md                   Human-readable setup checklist
- setup.sh                    Idempotent installer (also reclones
                              xenia-canary at pinned HEAD 6de80dffe)
- MANIFEST.md                 Per-file mapping + per-file-not-bundled
                              restoration recipe

Excluded from bundle (not shippable via git):
- Sylpheed ISO (7.8 GB; copyright; manual copy required)
- sylpheed.db (395 MB; regenerable from XEX via analysis tooling)
- target/ build artifacts (rebuild on target)
- audit-runs probe firehoses (.log/.stdout/.stderr ~11 GB; rerun if needed)
- audit-runs memory dumps (.bin ~4.5 GB; rerun audit-026/027/029 if needed)
- xenia-canary checkout (setup.sh reclones from
  git.mc02.dev/fabi/Xenia-Canary.git at HEAD 6de80dffe)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
115 lines
4.9 KiB
Markdown
# Manual generator

Python scripts that build the `ppc-manual/` tree from two authoritative
sources in this repository, plus a third that is referenced by line
number only:

- `xenia-canary/tools/ppc-instructions.xml`: metadata for all 455
  Xbox 360 PPC instructions (mnemonic, form, group, opcode, in/out
  fields, disasm template).
- `xenia-rs/crates/xenia-cpu/src/`: the Rust interpreter. Individual
  instruction semantics live in `interpreter.rs` match arms.
- `xenia-canary/src/xenia/cpu/ppc/ppc_emit_*.cc`: the C++ emit
  functions; referenced by line number only.

## Files

| File | Purpose |
| --- | --- |
| `generate_manual.py` | Main entry point. Parses XML, builds families, renders pages, writes `index.json`. |
| `xml_model.py` | XML parser + `expand_runtime_variants()` (produces the set of Rc/OE/LK-expanded mnemonics a single XML entry covers). |
| `bit_layout.py` | Per-form bit-field tables (rendered into the Encoding section of every page and into `forms/*.md`). |
| `rust_scraper.py` | Locates each `PpcOpcode::<mnem>` enum variant, decoder arm, and interpreter match-arm line range. |
| `cxx_scraper.py` | Locates `InstrEmit_<mnem>` in the xenia-canary emit `.cc` files. |

## Running

```bash
python3 ppc-manual/generator/generate_manual.py                  # full generate
python3 ppc-manual/generator/generate_manual.py --dry-run        # parse + consistency checks only
python3 ppc-manual/generator/generate_manual.py --out /tmp/out   # alternate output root
python3 ppc-manual/generator/generate_manual.py --xml /path/to/ppc-instructions.xml
```

No third-party dependencies; Python 3.10+ standard library only.

## Idempotency

The generator is re-runnable without data loss:

1. Each page has a pair of sentinel comments:
   - `<!-- GENERATED: BEGIN -->`
   - `<!-- GENERATED: END -->`
2. On re-run, only the text **between** the sentinels is rewritten.
   Everything after `END` (Special Cases, Related Instructions, IBM
   Reference) is preserved verbatim.
3. If the `END` sentinel is missing, the generator assumes a reviewer
   has fully taken over the file and skips it entirely.

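
The sentinel rules above can be sketched as a small pure function. This is a minimal illustration, not the generator's actual code: the name `rewrite_generated`, the whitespace handling, and the decision to also skip pages with a missing or misplaced `BEGIN` sentinel are assumptions.

```python
from typing import Optional

BEGIN = "<!-- GENERATED: BEGIN -->"
END = "<!-- GENERATED: END -->"


def rewrite_generated(existing: str, fresh: str) -> Optional[str]:
    """Return the page with its generated region replaced, or None to skip.

    Hypothetical helper: the real generator's function names and exact
    whitespace handling are not documented in this README.
    """
    begin = existing.find(BEGIN)
    end = existing.find(END)
    if end == -1:
        # Rule 3: no END sentinel means a reviewer owns the whole file.
        return None
    if begin == -1 or begin > end:
        # Assumption: a page with a missing or misplaced BEGIN is left alone.
        return None
    head = existing[: begin + len(BEGIN)]  # text up to and including BEGIN
    tail = existing[end:]                  # END plus all reviewer-written text
    return f"{head}\n{fresh.strip()}\n{tail}"
```

Returning `None` rather than raising keeps the "skip this page" case cheap for a caller that loops over many pages.
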
## Consistency checks (enforced by `--dry-run` as well)

- **XML entry count ≡ 455**: warns when the count differs, which
  indicates the XML has been modified.
- **Family membership total ≡ XML entry count**: every XML entry
  must land in exactly one family.
- **Index coverage ≡ runtime-expanded mnemonic count**: the JSON
  index must contain a key for every runtime variant (`add`, `add.`,
  `addo`, `addo.`, `bclr`, `bclrl`, …).

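
A rough sketch of how these three checks could be expressed. The function name, parameter shapes, and message wording are all hypothetical. The index check here reports missing keys, which is a slightly stricter reading than a pure count comparison.

```python
def check_consistency(xml_mnemonics, families, runtime_variants, index_keys,
                      expected_entries=455):
    """Return a list of problem descriptions; an empty list means all pass.

    Parameter shapes are assumptions made for illustration:
      xml_mnemonics    - list of mnemonics, one per XML entry
      families         - dict mapping family head -> list of member mnemonics
      runtime_variants - set of Rc/OE/LK-expanded mnemonics
      index_keys       - set of keys present in index.json
    """
    problems = []
    # Check 1: the XML should still have the expected number of entries.
    if len(xml_mnemonics) != expected_entries:
        problems.append(
            f"XML entry count is {len(xml_mnemonics)}, expected {expected_entries}"
        )
    # Check 2: family membership total must equal the XML entry count.
    member_total = sum(len(members) for members in families.values())
    if member_total != len(xml_mnemonics):
        problems.append(
            f"family membership total is {member_total}, "
            f"expected {len(xml_mnemonics)}"
        )
    # Check 3: every runtime variant needs a key in the index.
    missing = set(runtime_variants) - set(index_keys)
    if missing:
        problems.append(f"index is missing runtime variants: {sorted(missing)}")
    return problems
```
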
## Family grouping rules

Three rules are applied in order (see `_family_head` in
`generate_manual.py`):

1. If a mnemonic ends in `128` and the non-128 sibling exists, it
   joins the sibling's family. So `vaddfp128` is consolidated into
   the `vaddfp` page.
2. For memory ops (group `m`), trailing `u`, `x`, or `ux` suffixes
   are stripped when the base exists. So `lwz`, `lwzu`, `lwzx`,
   `lwzux` all land on the `lwz` page.
3. Otherwise the mnemonic is its own family head.

All other flag variants (`Rc`, `OE`, `LK`) are **runtime** variants:
they are NOT separate XML entries, and they are listed in the page's
"Assembler Mnemonics" table.

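
The three rules can be illustrated with a short function. This is a sketch under stated assumptions, not the real `_family_head`: the signature, the `known` set of XML mnemonics, and the first-match-wins behaviour (rules are not chained) are guesses based on the description above.

```python
# Longest suffix first, so "lwzux" resolves to "lwz" in one step.
MEMORY_SUFFIXES = ("ux", "u", "x")


def family_head(mnemonic: str, group: str, known: set) -> str:
    """Sketch of the three grouping rules; the first matching rule wins."""
    # Rule 1: fold a "...128" entry into its non-128 sibling when one exists.
    if mnemonic.endswith("128") and mnemonic[:-3] in known:
        return mnemonic[:-3]
    # Rule 2: memory ops drop a trailing u / x / ux when the base exists.
    if group == "m":
        for suffix in MEMORY_SUFFIXES:
            base = mnemonic[: -len(suffix)]
            if mnemonic.endswith(suffix) and base in known:
                return base
    # Rule 3: everything else heads its own family.
    return mnemonic
```

For example, with `known = {"lwz", "lwzu", "lwzx", "lwzux"}`, all four mnemonics map to `lwz` when `group` is `"m"`, while a memory mnemonic whose stripped base is not in `known` stays its own family head.
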
## Category mapping

| XML group | Category dir | Notes |
| --- | --- | --- |
| `i` (integer) | `alu/` | |
| `m` (memory) | `memory/` | |
| `b` (branch) | `branch/` | Includes `sc` and traps |
| `c` (control) | `control/` | CR logical, SPR, sync |
| `f` (fpu) | `fpu/` | |
| `v` (vector) | `vmx/` or `vmx128/` | Split by form: `VX128*` → `vmx128/` |

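
As a sketch, the table amounts to a dictionary lookup plus one special case for the vector group. The names `GROUP_DIRS` and `category_dir` are hypothetical, and the assumption that a form string can be tested with a `VX128` prefix comes only from the `VX128*` note in the table.

```python
# Non-vector groups map directly to one category directory.
GROUP_DIRS = {
    "i": "alu",
    "m": "memory",
    "b": "branch",
    "c": "control",
    "f": "fpu",
}


def category_dir(group: str, form: str) -> str:
    """Map an XML group (and, for vectors, the form) to a category directory."""
    if group == "v":
        # Vector entries are split by form: VX128* forms go to vmx128/.
        return "vmx128" if form.startswith("VX128") else "vmx"
    return GROUP_DIRS[group]
```

An unknown group raises `KeyError` in this sketch, which surfaces a new XML group early instead of filing it silently.
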
## Extending the generator

- **Pseudocode seeds.** The `PSEUDOCODE_SEEDS` dict in
  `generate_manual.py` maps an XML mnemonic to a PPC-style pseudocode
  block. Add entries here to pre-fill the Operation section for
  additional mnemonics. Phase 2 reviewers can still override by
  writing content outside the sentinels.
- **C translation seeds.** A similar dict of C snippets, keyed by
  family head.
- **Field descriptions.** `FIELD_DESCRIPTIONS` maps XML field names to
  IBM-style prose. Missing entries are marked "_Phase 2: document
  this field._"

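
The field-description fallback could look roughly like this. The dict entries and the helper name `describe_field` are illustrative only; the real keys and wording in `generate_manual.py` are not shown in this README.

```python
# Illustrative entries only; the real dict lives in generate_manual.py
# and its actual keys and prose are not reproduced here.
FIELD_DESCRIPTIONS = {
    "RA": "Source general-purpose register A.",
    "RB": "Source general-purpose register B.",
}

PHASE2_PLACEHOLDER = "_Phase 2: document this field._"


def describe_field(name: str) -> str:
    """Return the prose for an XML field, or the Phase 2 placeholder."""
    return FIELD_DESCRIPTIONS.get(name, PHASE2_PLACEHOLDER)
```
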
## Known limitations

- Extended-opcode extraction in `xml_model.Instruction.extended_opcode`
  is best-effort per form. For VMX128 variants the extracted value may
  not match the exact pattern used by xenia's decoder tree. The page
  still shows it as a reference, but the decoder source (linked on
  every page) is authoritative.
- `rust_scraper` uses a naive brace counter to delimit interpreter
  match arms. It works for the current interpreter because the match
  arms use balanced braces and contain no string literals with
  unbalanced braces. If the interpreter ever adopts such literals, the
  scraper will need a Rust-aware parser.
- The generator treats a trailing `x` on a mnemonic as a xenia
  convention ("extended/XO form") and strips it for assembly display,
  except in the memory group, where `x` is the natural indexed-form
  suffix. If a future xenia XML adds a new group where `x` is
  structural, the heuristic in `xml_model.expand_runtime_variants`
  needs updating.
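
To make the brace-counter limitation concrete, here is a minimal sketch of that kind of scanner. It is not the code in `rust_scraper.py`; the function name and the line-list input are assumptions.

```python
def match_arm_end(lines, start):
    """Return the index of the line that closes the block opened at `start`.

    Naive by design: braces inside string literals or comments are
    counted too, which is exactly the limitation described above.
    """
    depth = 0
    opened = False
    for i in range(start, len(lines)):
        for ch in lines[i]:
            if ch == "{":
                depth += 1
                opened = True
            elif ch == "}":
                depth -= 1
        # The arm ends on the first line where every opened brace is closed.
        if opened and depth == 0:
            return i
    raise ValueError("unbalanced braces")
```

Given an arm whose body contains a line such as `let s = "}";`, this counter stops at that line instead of at the arm's real closing brace, which is why a Rust-aware parser would be needed if such literals appear.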