From a4926c73f40b2fa574f9bb6aca837e543218f506 Mon Sep 17 00:00:00 2001
From: MechaCat02
Date: Fri, 5 Jun 2026 07:21:32 +0200
Subject: [PATCH] docs: handoff report for continuing on another machine

Snapshot of repo layout, branch map, gitignore policy for the heavy local
artifacts, and the iterate-2.BC investigation state (next steps 2.BD handle
disambiguation -> 2.BE host-driven ISR delivery).

Co-Authored-By: Claude Opus 4.8 (1M context)
---
 HANDOFF.md | 119 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 119 insertions(+)
 create mode 100644 HANDOFF.md

diff --git a/HANDOFF.md b/HANDOFF.md
new file mode 100644
index 0000000..6c537f5
--- /dev/null
+++ b/HANDOFF.md
@@ -0,0 +1,119 @@
+# Handoff — Project Sylpheed RE / xenia-rs
+
+_Generated 2026-06-05 to continue work on another machine._
+
+## TL;DR
+
+Reverse-engineering **Project Sylpheed — Arc of Deception** boot under a Rust
+Xbox 360 emulator (`xenia-rs`), using **Xenia Canary** as the reference
+("canary") oracle. The game boots but **wedges after ~2–6 frames**: a render /
+VSync-event producer stops firing post-boot, so guest threads block forever.
+Investigation is at **iterate 2.BC**; next step is **2.BD (handle
+disambiguation)**, then **2.BE (architecture fix)**.
+
+## Repo / machine layout
+
+Workspace root: `/home/fabi/RE - Project Sylpheed/` (NOT a git repo itself).
+
+| Dir | Git remote | Purpose |
+|-----|-----------|---------|
+| `xenia-rs/` | `git.mc02.dev/fabi/xenia-rs.git` | **Main project** — the Rust emulator + all RE work |
+| `/home/fabi/Xenia-Canary/` | `git.mc02.dev/fabi/Xenia-Canary.git` | Reference Canary build (branch `xenia-rs`) |
+| `xenia-canary/` | `github.com/xenia-canary/xenia-canary` | Upstream canary checkout (incl. `third_party/snappy` submodule) |
+| `xenia-canary/third_party/snappy` | `git.mc02.dev/fabi/Snappy.git` (fork) | snappy + cross-build patch (see below) |
+| `xenia/` | upstream xenia | Reference only |
+| `sylpheed-reborn/` | — | **DEAD — ignore** |
+
+The big game asset (`*.iso`, 7.8 GB) and `*.pe`/`*.xex.json` live at workspace
+root and are **not** in git.
+
+### What is and isn't in git (xenia-rs)
+
+`.gitignore` now excludes the heavy, regenerable local artifacts so the repo
+stays portable:
+
+- **Committed:** all source, plus `audit-runs/**` analysis **notes**
+  (`.md`/`.txt`/small `.json` digests, ~6 MB).
+- **Ignored:** `audit-runs/**` raw traces (`.jsonl`, `.jsonl.gz`, `.gz`, `.csv`,
+  `.stdout`, `.stderr`, `.log` — **~146 GB**), `.claude/` agent worktrees
+  (~66 GB), `*.bin` dumps, `exit-thread-state.json`, `*.bak`.
+
+To regenerate traces on the new machine, re-run the emulator/diff harnesses
+(see `docs/` and the per-iterate `audit-runs/iterate-*/` note files).
+
+## Branch map (xenia-rs, all pushed)
+
+- `master` — golden baseline (`sylpheed_n50m` golden, post-AUDIT-054).
+- `chore/portable-snapshot` — **active line**; HEAD `ef93a4f` carries the
+  dormant parity fixes (`nt_create_event` polarity, MMIO VSync hardcode) +
+  iterate notes.
+- `iterate-2AT-deref`, `iterate-2AU-xaudio`, `iterate-2AZ-vsync` — throwaway
+  probe instrumentation, preserved as-is (inert per findings; do not merge
+  blindly).
+- `worktree-agent-a0848e51cc0d72503` — stale worktree ref (no unique work).
+
+## Where the investigation stands (iterate 2.BC)
+
+The authoritative running log is the persistent memory at
+`~/.claude/projects/-home-fabi-RE---Project-Sylpheed/memory/` (`MEMORY.md`
+index + topic files). Key state:
+
+- **The wedge is a genuine producer bug, independent of cadence mode.** Running
+  the game `--parallel` (wall-clock 60 Hz VSync) also wedges after ~2 frames
+  (iterate 2.BB), so it is **not** a lockstep artifact. Cadence-clock direction
+  is a dead end.
+- **Canary's frame pacing = a host "GPU Frame limiter" thread** (canary tid=2,
+  `graphics_system.cc:146`) that calls `NtSetEvent` ~4660× at 60 Hz and runs
+  the guest VSync ISR **synchronously on the host thread**
+  (`MarkVblank → DispatchInterruptCallback → EmulateCPInterruptDPC →
+  processor_->Execute`), scheduler-independent (iterate 2.BA).
+- **Ours has no host frame-limiter.** It injects the ISR onto a guest *victim*
+  thread (`try_inject_graphics_interrupt`, `crates/xenia-app/src/main.rs`
+  ~3729). Once the guest blocks/idles after boot, ISR delivery stops — ours
+  fires the signal path ~96× early-boot then **stops**.
+- **`opt_callback` signal path IS wired in ours** (iterate 2.BC, falsifies the
+  earlier 2.AT "NULL delegate" claim): `sub_822F2248` body = 3 parts; part C
+  `@0x822F22CC` calls `bl 0x822F13B0` (singleton `0x828f3844`) →
+  `NtSetEvent(ev0)` via `0x824AA2F0`. Runtime: this reaches `NtSetEvent` 96× on
+  **handle 0x108c** then stops. So divergence = **cadence/delivery
+  architecture**, not a missing delegate.
+
+### Open question to resolve FIRST — iterate 2.BD (~0 LOC)
+
+**Handle disambiguation.** `opt_callback` signals **0x108c**, but `tid=1` was
+recorded wedging on **0x10e8**. Are these the same event or different?
+- If `tid=1`'s wait is really 0x108c (0x10e8 a mislabel) → the cadence/delivery
+  fix unwedges tid=1.
+- If 0x10e8 is a separate event → it needs its own producer.
+
+Map who-waits / who-signals for `0x108c / 0x1090 / 0x10e8 / 0x1004` in **both**
+ours and canary before writing any fix. (`0x1004` = tid=12 DPC work-queue wake,
+also dead post-boot.)
+
+### Then iterate 2.BE — architecture fix (~20–60 LOC, MEDIUM)
+
+Replace victim-thread ISR injection with **host-driven synchronous ISR
+delivery** mirroring canary's `EmulateCPInterruptDPC` frame-limiter, so VSync
+keeps firing after the guest blocks. Fix surface:
+`crates/xenia-kernel/src/interrupts.rs` + `crates/xenia-app/src/main.rs`.
+This is why the 2.AZ clock-swap was inert — the gap is *delivery
+architecture*, not the clock.
+
+## Workflow notes
+
+- **The user drives dispatch cadence.** After a research iterate completes,
+  sync memory + report concisely, then **pause for explicit go** — do not
+  auto-dispatch the next sub-agent.
+- Methodology rule earned the hard way (#44/#46): before claiming "X never
+  fires / signal missing", trace the **whole** function body at runtime
+  (`--lr-trace`) and verify the reference engine actually has it non-null —
+  an empty slot is only a bug if canary's is populated.
+
+## Verify the checkout on the new machine
+
+```sh
+cd xenia-rs
+git checkout chore/portable-snapshot
+cargo test # 300 + 230 + 149 + 11 suites expected green
+# determinism baseline: sylpheed_n50m golden should be bit-identical to master
+```
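
Note on iterate 2.BD: the who-waits / who-signals map that the report asks for could be tabulated offline from wait/signal trace records, roughly as sketched below. This is a minimal sketch, not code from `xenia-rs`. The `TraceEvent`, `Op`, `HandleUsage`, `usage_by_handle`, and `orphaned_waits` names are hypothetical, and the real trace schema under `audit-runs/` is not shown in the report, so the record layout here is an assumption.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical operation kinds; the real trace format may record more.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Op {
    Wait,
    Signal,
}

/// Hypothetical trace record: which guest thread touched which handle, and how.
#[derive(Debug, Clone, Copy)]
struct TraceEvent {
    tid: u32,
    handle: u32,
    op: Op,
}

/// Per-handle summary: the set of thread ids that wait on it and that signal it.
#[derive(Debug, Default, PartialEq, Eq)]
struct HandleUsage {
    waiters: BTreeSet<u32>,
    signalers: BTreeSet<u32>,
}

/// Fold a trace into a handle -> usage map, ordered by handle value.
fn usage_by_handle(events: &[TraceEvent]) -> BTreeMap<u32, HandleUsage> {
    let mut map: BTreeMap<u32, HandleUsage> = BTreeMap::new();
    for ev in events {
        let usage = map.entry(ev.handle).or_default();
        match ev.op {
            Op::Wait => {
                usage.waiters.insert(ev.tid);
            }
            Op::Signal => {
                usage.signalers.insert(ev.tid);
            }
        }
    }
    map
}

/// Handles that some thread waits on but that no thread ever signals in the trace.
fn orphaned_waits(map: &BTreeMap<u32, HandleUsage>) -> Vec<u32> {
    map.iter()
        .filter(|(_, usage)| !usage.waiters.is_empty() && usage.signalers.is_empty())
        .map(|(handle, _)| *handle)
        .collect()
}
```

Running the same fold over a trace from ours and a trace from canary, then comparing the two maps for `0x108c`, `0x1090`, `0x10e8`, and `0x1004`, would show whether the handle `tid=1` waits on has any producer at all in either engine.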