Bundles state that lives OUTSIDE the xenia-rs repo so a fresh clone on
another machine can be brought up to identical configuration via
migration/setup.sh:
- claude-memory/ → ~/.claude/projects/-home-fabi-RE-Project-Sylpheed/memory/ (103 files, 1.1 MB: MEMORY.md + every project_xenia_rs_*.md from audits addis_signext through audit-058)
- project-root/dot-claude/ → <project-root>/.claude/settings.json (Stop hook + permissions)
- project-root/ppc-manual/ → <project-root>/ppc-manual/ (PowerPC reference docs, 397 files, 3.7 MB)
- project-root/run-canary.sh → <project-root>/run-canary.sh
- README.md: Human-readable setup checklist
- setup.sh: Idempotent installer (also reclones xenia-canary at pinned HEAD 6de80dffe)
- MANIFEST.md: Per-file mapping + per-file-not-bundled restoration recipe
Excluded from bundle (not shippable via git):
- Sylpheed ISO (7.8 GB; copyright; manual copy required)
- sylpheed.db (395 MB; regenerable from XEX via analysis tooling)
- target/ build artifacts (rebuild on target)
- audit-runs probe firehoses (.log/.stdout/.stderr ~11 GB; rerun if needed)
- audit-runs memory dumps (.bin ~4.5 GB; rerun audit-026/027/029 if needed)
- xenia-canary checkout (setup.sh reclones from git.mc02.dev/fabi/Xenia-Canary.git at HEAD 6de80dffe)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
| name | description | type | originSessionId |
|---|---|---|---|
| xenia-rs --ui architecture (stable facts) | Threading/bridge design, shader pipeline, GPU integration, HUD — stable across sessions. History + live state in `project_xenia_rs_current_state.md`. | project | 1e348be4-7f53-438a-9c1b-e0c2fcb7ec0d |
Threading & bridge
`exec --ui` runs winit on the main thread and the scheduler/interpreter on a worker thread. Cross-thread communication: `Arc`-shared atomics + `EventLoopProxy` user events. `KernelState::ui: Option<UiBridge>` carries closures that (a) read host gamepad and (b) post `SwapInfo` + frontbuffer bytes to the UI. `GuestMemory` stays pinned to the interpreter thread; only cooked bytes cross.
Why: winit 0.30+ `ApplicationHandler` requires the main thread and wgpu's `Surface` is tied to `Window`. The interpreter is single-threaded (6 cooperative HW slots); making it multithread-safe would require `Arc<RwLock<GuestMemory>>` on every guest instruction.
How to apply: when adding cross-thread UI state, extend `SwapInfo` (post-swap) or add an atomic on `UiHandles`; don't reach across threads directly.
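The split described above can be sketched with the standard library alone. This is a minimal illustration, not the project's code: the `SwapInfo` and `UiHandles` fields are hypothetical, and an `mpsc` channel stands in for the `EventLoopProxy` user events so that the example needs no windowing crate.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::mpsc::{Receiver, Sender};
use std::sync::Arc;
use std::thread;

/// Hypothetical stand-in for `SwapInfo`: only cooked bytes cross threads.
#[derive(Debug, Clone, PartialEq)]
pub struct SwapInfo {
    pub width: u32,
    pub height: u32,
    pub frontbuffer: Vec<u8>,
}

/// Hypothetical stand-in for `UiHandles`: Arc-shared atomics the UI can poll.
#[derive(Default)]
pub struct UiHandles {
    pub instructions: AtomicU64,
}

/// Plays the interpreter's role on a worker thread. Guest memory would stay
/// local to this closure; the UI only ever sees the posted `SwapInfo` values.
pub fn run_worker(
    handles: Arc<UiHandles>,
    tx: Sender<SwapInfo>,
    frames: u32,
) -> thread::JoinHandle<()> {
    thread::spawn(move || {
        for frame in 0..frames {
            // Pretend 1000 guest instructions ran before this swap.
            handles.instructions.fetch_add(1_000, Ordering::Relaxed);
            let swap = SwapInfo {
                width: 2,
                height: 1,
                frontbuffer: vec![frame as u8; 8], // 2x1 pixels, 4 bytes each
            };
            if tx.send(swap).is_err() {
                break; // UI side went away
            }
        }
        // `tx` is dropped here, which ends the receiver's iteration.
    })
}

/// Plays the UI's role: collects every swap until the worker hangs up.
pub fn drain_ui(rx: &Receiver<SwapInfo>) -> Vec<SwapInfo> {
    rx.iter().collect()
}
```

New cross-thread state in this model is either another field on the posted struct or another atomic on the shared handles, matching the "how to apply" rule.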
GPU pipeline (P2–P7 stable)
- `xenia-gpu::GpuSystem`: one per `KernelState`. Owns the `RegisterFile`, the `RingBufferView` (+ IB stack for nested `PM4_INDIRECT_BUFFER`), the `TextureCache`/`RenderTargetCache` (P4/P5), and the `GpuMmio` atomic mailbox exposed via the `0x7FC8_0000` MMIO aperture (Canary `graphics_system.cc:141`). Per scheduler round: `sync_with_mmio()` then `execute_one()` of whatever's ready.
- Type-3 packet coverage: every non-draw Type-3 opcode is implemented (NOP, INDIRECT_BUFFER[_PFD], WAIT_REG_MEM, REG_RMW, REG_TO_MEM, MEM_WRITE, COND_WRITE, EVENT_WRITE[_SHD/_EXT/_ZPD], SET_CONSTANT[2], SET_SHADER_CONSTANTS, LOAD_ALU_CONSTANT, IM_LOAD[_IMMEDIATE], CONTEXT_UPDATE, INVALIDATE_STATE, VIZ_QUERY, ME_INIT, SET_BIN_MASK/SELECT, INTERRUPT, XE_SWAP). `DRAW_INDX*` captures `DrawState` + `ProcessedPrimitive` + metrics.
- WGSL shader interpreter (P3b/c + P7): `xenia-gpu::ucode` decoder + `pack_for_wgsl` dense layout; `xenos_interp.wgsl` (~465 LOC) implements the CF walker + 13 vec ALU ops + 6 scalar ops + R32G32B32A32_FLOAT vertex fetch + texture sampling. `XenosPipeline::new` builds two bind groups; uploads shader+constants+vertex before each batch in `dispatch_xenos_draws`. P7 added a direct Xenos→WGSL translator for when shader-bug isolation is needed.
- Texture cache (P5): page-version invalidated via `GuestMemory::page_version`. Formats supported: `K8888`, `K565`, `Dxt1`, `Dxt2_3`, `Dxt4_5` (M5). Host side `texture_cache_host.rs` maps each to `Rgba8Unorm` / `Bc{1,2,3}RgbaUnorm` with format-aware `bytes_per_row`.
- Render target cache (P4): EDRAM resolve handler `handle_event_initiator` wired into all four `PM4_EVENT_WRITE*` variants. On event code 15 (`TILE_FLUSH`), snapshots `RB_COPY_*` into `last_resolve`, bumps `stats.resolves_total`. Actual EDRAM→memory byte copy still deferred.
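To make "format-aware `bytes_per_row`" concrete, here is a small sketch of the host-side mapping. The enum and function names are hypothetical. The pairing of DXT1/DXT2-3/DXT4-5 with BC1/BC2/BC3 follows the usual block-compression naming, and the sketch assumes `K565` is expanded to 4 bytes per texel before upload, since it is listed as mapping to `Rgba8Unorm`.

```rust
/// Guest formats listed as supported by the texture cache.
#[allow(non_camel_case_types)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum GuestFormat {
    K8888,
    K565,
    Dxt1,
    Dxt2_3,
    Dxt4_5,
}

/// Host formats named in the text (mirrors the wgpu format names).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum HostFormat {
    Rgba8Unorm,
    Bc1RgbaUnorm,
    Bc2RgbaUnorm,
    Bc3RgbaUnorm,
}

/// Assumed guest-to-host mapping (hypothetical helper).
pub fn host_format(format: GuestFormat) -> HostFormat {
    match format {
        // Assumption: K565 is converted to RGBA8 on the CPU before upload.
        GuestFormat::K8888 | GuestFormat::K565 => HostFormat::Rgba8Unorm,
        GuestFormat::Dxt1 => HostFormat::Bc1RgbaUnorm,
        GuestFormat::Dxt2_3 => HostFormat::Bc2RgbaUnorm,
        GuestFormat::Dxt4_5 => HostFormat::Bc3RgbaUnorm,
    }
}

/// Bytes in one row of texels (uncompressed) or one row of 4x4 blocks (BC).
pub fn bytes_per_row(format: HostFormat, width: u32) -> u32 {
    let blocks_wide = (width + 3) / 4; // round up to whole 4x4 blocks
    match format {
        HostFormat::Rgba8Unorm => width * 4,
        HostFormat::Bc1RgbaUnorm => blocks_wide * 8, // 8 bytes per BC1 block
        HostFormat::Bc2RgbaUnorm | HostFormat::Bc3RgbaUnorm => blocks_wide * 16,
    }
}
```

The point of the split is that a compressed row is measured in blocks, so a width that is not a multiple of 4 still pays for a full block column.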
MMIO aperture (stable)
- Base `0x7FC8_0000`, mask `0xFFFF_0000`, size `0x0001_0000`. Install via `MmioRegion` on `GuestMemory`.
- Registers served (others trace+zero): `CP_RB_WPTR`, `CP_RB_RPTR`, `CP_INT_STATUS`, `CP_INT_ACK` (0x071D, write-echo), `D1MODE_VBLANK_VLINE_STATUS` (0x1951 / byte offset `0x6544`, W1TC on bit 0).
- Bit 0 of `D1MODE_VBLANK_VLINE_STATUS` is set by the app main loop on every synthetic vsync tick; Sylpheed's callback `rlwinm. r,r,0,31,31; bc 12,2,skip` gates all vsync work on it.
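The address arithmetic and the write-1-to-clear behaviour above can be checked with a few pure functions. This is a sketch with hypothetical function names; it assumes registers are 4 bytes wide (consistent with index 0x1951 sitting at byte offset 0x6544) and that only bit 0 of the status register is W1TC.

```rust
pub const MMIO_BASE: u32 = 0x7FC8_0000;
pub const MMIO_MASK: u32 = 0xFFFF_0000;
pub const D1MODE_VBLANK_VLINE_STATUS: u32 = 0x1951;

/// Maps a guest address to a register index, or None outside the aperture.
pub fn register_index(addr: u32) -> Option<u32> {
    if (addr & MMIO_MASK) == MMIO_BASE {
        Some((addr & !MMIO_MASK) / 4) // byte offset -> dword register index
    } else {
        None
    }
}

/// W1TC on bit 0: writing a 1 to bit 0 clears it, writing 0 leaves it alone.
pub fn w1tc_write(current: u32, written: u32) -> u32 {
    current & !(written & 1)
}

/// Models the guest's `rlwinm. r,r,0,31,31; bc 12,2,skip` gate.
pub fn vsync_work_runs(status: u32) -> bool {
    // rlwinm with SH=0, MB=31, ME=31 keeps only the least significant bit
    // and, in record form, sets CR0 from the result.
    let masked = status & 1;
    // bc 12,2 branches to `skip` when CR0[EQ] is set, i.e. when masked == 0.
    masked != 0
}
```

For example, a guest read at `0x7FC8_6544` resolves to register `0x1951`, and the vsync callback does its work only while bit 0 is still set.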
Scheduler + interrupts
- `HwState` variants: `Idle`, `Ready`, `Blocked(BlockReason)`, `Exited(code)`, `ServicingIrq(BlockReason)`. `ServicingIrq` is used by the graphics-interrupt injector to stash a block reason while running the callback; `wake()` and `round_schedule` both treat `ServicingIrq` as runnable.
- Graphics interrupt injection (post-M8): `try_inject_graphics_interrupt` picks any non-`Idle`/`Exited` HW slot (prefers `Ready`, falls back to `Blocked`). `InterruptState::injected_hw` tracks which slot ran the callback. The LR-sentinel return path restores pre-injection ctx and re-blocks with the stashed reason (unless a `wake()` during the callback cleared it).
- Deadlock recovery: when all live threads are `Blocked`/`Idle`/`Exited` and no timer is pending, force-wake every blocked thread with `STATUS_TIMEOUT` in `gpr[3]`. `scheduler.deadlock_recoveries` counter tracks this.
- Main thread exit is NOT a halt: when `tid=1` hits `LR_HALT_SENTINEL` we mark it `Exited` and continue; the outer loop halts only when `has_live_thread()` is false. Sylpheed's design spawns workers then returns from main.
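The state machine above can be sketched as follows. The `HwState` variants come from the list; everything else is a hypothetical simplification: the `BlockReason` members, the `Slot` struct, and the free functions. In particular, this sketch counts `Ready`, `Blocked`, and `ServicingIrq` slots as live, which is an assumption about what `has_live_thread()` checks.

```rust
/// Hypothetical block reasons; the real set is not given in the text.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BlockReason {
    WaitObject,
    Sleep,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum HwState {
    Idle,
    Ready,
    Blocked(BlockReason),
    Exited(u32),
    ServicingIrq(BlockReason),
}

impl HwState {
    /// `ServicingIrq` is treated as runnable, like `Ready`.
    pub fn is_runnable(self) -> bool {
        matches!(self, HwState::Ready | HwState::ServicingIrq(_))
    }
}

pub const STATUS_TIMEOUT: u32 = 0x0000_0102;

/// Hypothetical per-slot record: scheduler state plus guest r3.
pub struct Slot {
    pub state: HwState,
    pub gpr3: u32,
}

/// Injection target: prefer a `Ready` slot, fall back to a `Blocked` one.
pub fn pick_injection_slot(slots: &[Slot]) -> Option<usize> {
    slots
        .iter()
        .position(|s| s.state == HwState::Ready)
        .or_else(|| slots.iter().position(|s| matches!(s.state, HwState::Blocked(_))))
}

/// Deadlock recovery: if nothing is runnable and no timer is pending,
/// wake every blocked slot with STATUS_TIMEOUT. Returns how many were woken.
pub fn recover_deadlock(slots: &mut [Slot], timer_pending: bool) -> usize {
    if timer_pending || slots.iter().any(|s| s.state.is_runnable()) {
        return 0;
    }
    let mut woken = 0;
    for slot in slots.iter_mut() {
        if matches!(slot.state, HwState::Blocked(_)) {
            slot.state = HwState::Ready;
            slot.gpr3 = STATUS_TIMEOUT;
            woken += 1;
        }
    }
    woken
}

/// The outer loop keeps running while any slot is neither Idle nor Exited.
pub fn has_live_thread(slots: &[Slot]) -> bool {
    slots
        .iter()
        .any(|s| !matches!(s.state, HwState::Idle | HwState::Exited(_)))
}
```

Note that an exited main thread alone does not stop the loop in this model: a single blocked worker keeps `has_live_thread` true.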
HLE primitives (stable)
- Pseudo-handle resolution `resolve_pseudo_handle(state, h)`: `0xFFFFFFFE` → current thread handle, `0xFFFFFFFF` → 0, others pass through. Called at top of every `Ob*`/`Nt*Wait*` export.
- PKEVENT shim `ensure_dispatcher_object(state, mem, ptr)`: `Ke*` sync functions take `PKEVENT` pointers; first touch reads Xenon DISPATCHER_HEADER (type byte + SignalState at +4 + Limit at +0x10 for semaphores) and mints a shadow `KernelObject` keyed by the pointer. `refresh_pkevent_shadow_from_guest` re-syncs `SignalState` on each wait.
- WaitAny handle-index return: Canary's `WaitMultiple` returns `STATUS_WAIT_0 + index` for WaitAny. `do_wait_multiple` matches; `set_wake_status_for_waitany` updates `gpr[3]` on wake.
- I/O completion signaling: `signal_io_completion_event(state, event_handle)` fires at every completion path of `NtReadFile`/`NtWriteFile` (r4 = event).
- Empty-path / root-device opens (`NtCreateFile("game:\")` etc.): synth a zero-byte `KernelObject::File` with empty `path`. `NtQueryInformationFile` class 5 reports `Directory=1` for empty, `/`-tail, or `:`-tail paths; class 34 (`FileNetworkOpenInformation`, 56 B) reports `FILE_ATTRIBUTE_DIRECTORY` at offset +48.
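Three of the primitives above reduce to small pure functions, sketched here under stated assumptions. The signatures are simplified and hypothetical: the real `resolve_pseudo_handle` takes the kernel state, whereas this version takes the current thread handle directly. The directory buffer assumes the guest's big-endian byte order and the standard `FILE_ATTRIBUTE_DIRECTORY` value of 0x10.

```rust
pub const STATUS_WAIT_0: u32 = 0x0000_0000;
pub const STATUS_TIMEOUT: u32 = 0x0000_0102;
pub const FILE_ATTRIBUTE_DIRECTORY: u32 = 0x0000_0010;

/// Pseudo-handle resolution as described: 0xFFFFFFFE is the current thread,
/// 0xFFFFFFFF resolves to 0, anything else passes through unchanged.
pub fn resolve_pseudo_handle(current_thread_handle: u32, h: u32) -> u32 {
    match h {
        0xFFFF_FFFE => current_thread_handle,
        0xFFFF_FFFF => 0,
        other => other,
    }
}

/// WaitAny result: STATUS_WAIT_0 + index of the first signaled object,
/// or STATUS_TIMEOUT if none is signaled (hypothetical helper).
pub fn wait_any_status(signaled: &[bool]) -> u32 {
    match signaled.iter().position(|&s| s) {
        Some(index) => STATUS_WAIT_0 + index as u32,
        None => STATUS_TIMEOUT,
    }
}

/// A 56-byte FileNetworkOpenInformation reply for a synthesized directory:
/// everything zero except the attributes dword at offset +48.
pub fn network_open_info_for_directory() -> [u8; 56] {
    let mut buf = [0u8; 56];
    // The guest is big-endian, so the attribute is stored most significant byte first.
    buf[48..52].copy_from_slice(&FILE_ATTRIBUTE_DIRECTORY.to_be_bytes());
    buf
}
```

The value returned by `wait_any_status` is what would land in `gpr[3]` when the waiter is woken.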
HUD
6 rows, well-spaced, cyan accents:
- Title + uptime + instr/kIPS (live counter via `instructions_counter` atomic).
- Swaps.
- GPU stats (packets, draws_total, resolves_total, interrupts).
- Last-draw prim/verts.
- Pad state.
- Render path: `xdispatch: xlated=N interp=M xlated-pipelines=P tex-cache=T fb=WxH`.
One-shot `tracing::info!` latches: "first Xenos draw dispatched" and "first translator pipeline compiled".
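A one-shot latch and the render-path row can be sketched with the standard library. Both the `OneShot` type and the `hud_render_path` helper are hypothetical; the real code logs through `tracing::info!`, which this sketch leaves to the caller so that it stays dependency-free.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Fires exactly once, however many times (or threads) call `fire`.
pub struct OneShot(AtomicBool);

impl OneShot {
    pub const fn new() -> Self {
        OneShot(AtomicBool::new(false))
    }

    /// Returns true only for the first caller; `swap` makes this race-free.
    pub fn fire(&self) -> bool {
        !self.0.swap(true, Ordering::Relaxed)
    }
}

/// Formats the HUD's render-path row in the layout shown in the list above.
pub fn hud_render_path(
    xlated: u64,
    interp: u64,
    pipelines: u64,
    tex_cache: u64,
    fb: (u32, u32),
) -> String {
    format!(
        "xdispatch: xlated={} interp={} xlated-pipelines={} tex-cache={} fb={}x{}",
        xlated, interp, pipelines, tex_cache, fb.0, fb.1
    )
}
```

A caller would write `if FIRST_DRAW.fire() { /* log once */ }` against a `static FIRST_DRAW: OneShot = OneShot::new();`.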
Observability defaults
The default filter silences wgpu/winit/naga/gilrs at `warn` (wgpu at `error`). Override via `--log-filter='info,wgpu_core=trace'` during bring-up. `--trace-chrome PATH` captures a Chrome/Perfetto trace; `--profile PATH.svg` emits a flamegraph.
Interpreter performance (post-Tier-3)
~10 MIPS end-to-end on Sylpheed. Three wins stacked: moving the per-instruction `metrics::counter!` off the hot path; a direct-mapped 64k `DecodeCache` keyed by PC with page-version invalidation; and a `Debugger::wants_hooks()` short-circuit plus a `trace_enabled = false` default (the previous O(n²) `Vec::remove(0)` on the trace log was the real bottleneck, not metrics).
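A direct-mapped decode cache with page-version invalidation might look like the sketch below. The entry layout, the `Decoded` payload, and the hit/miss counters are hypothetical. The slot index assumes 4-byte-aligned PowerPC instructions, so the low two PC bits are dropped before masking.

```rust
/// Hypothetical decoded-instruction payload.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Decoded {
    pub opcode: u32,
}

#[derive(Clone, Copy)]
struct Entry {
    pc: u32,
    page_version: u32,
    decoded: Decoded,
}

const SLOTS: usize = 64 * 1024; // "64k" direct-mapped slots

pub struct DecodeCache {
    entries: Vec<Option<Entry>>,
    pub hits: u64,
    pub misses: u64,
}

impl DecodeCache {
    pub fn new() -> Self {
        DecodeCache { entries: vec![None; SLOTS], hits: 0, misses: 0 }
    }

    /// Direct mapping: each PC has exactly one candidate slot.
    fn slot(pc: u32) -> usize {
        ((pc >> 2) as usize) & (SLOTS - 1)
    }

    /// Returns the cached decode if both the PC tag and the page version
    /// match; otherwise decodes again and overwrites the slot.
    pub fn get_or_decode(
        &mut self,
        pc: u32,
        page_version: u32,
        decode: impl FnOnce(u32) -> Decoded,
    ) -> Decoded {
        let slot = Self::slot(pc);
        if let Some(entry) = self.entries[slot] {
            if entry.pc == pc && entry.page_version == page_version {
                self.hits += 1;
                return entry.decoded;
            }
        }
        self.misses += 1;
        let decoded = decode(pc);
        self.entries[slot] = Some(Entry { pc, page_version, decoded });
        decoded
    }
}
```

Invalidation needs no explicit flush in this design: a write that bumps the page version makes every stale entry for that page fail the tag check on its next lookup.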
Deferred Tier 4 — threaded-code dispatch / JIT. Only worth doing after the shader translator + HLE coverage gaps narrow; fast-but-wrong produces fast-wrong output.
Phase history
Complete roadmap P1–P8 + perf Tiers 1–3 + first-pixels M1–M9 all landed. Details deliberately elided here — they're in the individual commit messages and the project_xenia_rs_current_state.md next-steps file. This doc stays focused on stable facts a new session needs before touching the code.