xenia-rs/migration/claude-memory/project_xenia_rs_ui.md
MechaCat02 e6d43a23ac chore: add migration/ bundle for cross-machine setup
Bundles state that lives OUTSIDE the xenia-rs repo so a fresh clone on
another machine can be brought up to identical configuration via
migration/setup.sh:

  - claude-memory/             ~/.claude/projects/-home-fabi-RE-Project-Sylpheed/memory/
                               (103 files, 1.1 MB - MEMORY.md + every
                                project_xenia_rs_*.md from audits
                                addis_signext through audit-058)
  - project-root/dot-claude/   <project-root>/.claude/settings.json
                               (Stop hook + permissions)
  - project-root/ppc-manual/   <project-root>/ppc-manual/
                               (PowerPC reference docs, 397 files, 3.7 MB)
  - project-root/run-canary.sh <project-root>/run-canary.sh
  - README.md                  Human-readable setup checklist
  - setup.sh                   Idempotent installer (also reclones
                               xenia-canary at pinned HEAD 6de80dffe)
  - MANIFEST.md                Per-file mapping + per-file-not-bundled
                               restoration recipe

Excluded from bundle (not shippable via git):
  - Sylpheed ISO (7.8 GB; copyright; manual copy required)
  - sylpheed.db (395 MB; regenerable from XEX via analysis tooling)
  - target/ build artifacts (rebuild on target)
  - audit-runs probe firehoses (.log/.stdout/.stderr ~11 GB; rerun if needed)
  - audit-runs memory dumps (.bin ~4.5 GB; rerun audit-026/027/029 if needed)
  - xenia-canary checkout (setup.sh reclones from
    git.mc02.dev/fabi/Xenia-Canary.git at HEAD 6de80dffe)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 21:38:38 +02:00


name: xenia-rs --ui architecture (stable facts)
description: Threading/bridge design, shader pipeline, GPU integration, HUD. Stable across sessions. History + live state in `project_xenia_rs_current_state.md`.
type: project
originSessionId: 1e348be4-7f53-438a-9c1b-e0c2fcb7ec0d

Threading & bridge

exec --ui runs winit on the main thread and the scheduler/interpreter on a worker thread. Cross-thread communication: Arc-shared atomics + EventLoopProxy user events. KernelState::ui: Option<UiBridge> carries closures that (a) read host gamepad and (b) post SwapInfo + frontbuffer bytes to the UI. GuestMemory stays pinned to the interpreter thread; only cooked bytes cross.

Why: winit 0.30+ ApplicationHandler requires the main thread, and wgpu's Surface is tied to the Window. The interpreter is single-threaded (6 cooperative HW slots); making it multithread-safe would mean taking a lock on an Arc<RwLock<GuestMemory>> for every guest instruction.

How to apply: when adding cross-thread UI state, extend SwapInfo (post-swap) or add an atomic on UiHandles — don't reach across threads directly.
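
The bridge shape described above can be sketched with the standard library alone. This is a minimal sketch, not the project's code: a `std::sync::mpsc` channel stands in for winit's `EventLoopProxy`, and the field names on `SwapInfo`, `UiHandles` and `UiBridge`, plus the helpers `make_bridge` and `run_one_frame`, are hypothetical.

```rust
use std::sync::atomic::{AtomicU32, Ordering};
use std::sync::mpsc::{channel, Receiver, Sender};
use std::sync::Arc;
use std::thread;

/// Hypothetical payload: only cooked bytes cross the thread boundary.
#[derive(Debug, Clone, PartialEq)]
pub struct SwapInfo {
    pub width: u32,
    pub height: u32,
    pub frontbuffer: Vec<u8>,
}

/// Hypothetical shared handles: plain atomics, no locks.
#[derive(Default)]
pub struct UiHandles {
    pub pad_buttons: AtomicU32,
}

/// Closures carried by the interpreter side (field names are illustrative).
pub struct UiBridge {
    pub read_pad: Box<dyn Fn() -> u32 + Send>,
    pub post_swap: Box<dyn Fn(SwapInfo) + Send>,
}

pub fn make_bridge(handles: Arc<UiHandles>, tx: Sender<SwapInfo>) -> UiBridge {
    UiBridge {
        // (a) read host gamepad state through a shared atomic
        read_pad: Box::new(move || handles.pad_buttons.load(Ordering::Relaxed)),
        // (b) post swap info + frontbuffer bytes to the UI thread
        post_swap: Box::new(move |info| {
            // A closed receiver just means the UI is gone; drop the frame.
            let _ = tx.send(info);
        }),
    }
}

/// Runs a stand-in "interpreter" on a worker thread and returns what the
/// UI side observes: the pad value the worker read and the posted swap.
pub fn run_one_frame(buttons: u32) -> (u32, SwapInfo) {
    let handles = Arc::new(UiHandles::default());
    handles.pad_buttons.store(buttons, Ordering::Relaxed);
    let (tx, rx): (Sender<SwapInfo>, Receiver<SwapInfo>) = channel();
    let bridge = make_bridge(Arc::clone(&handles), tx);
    let worker = thread::spawn(move || {
        let seen = (bridge.read_pad)();
        (bridge.post_swap)(SwapInfo { width: 2, height: 1, frontbuffer: vec![0; 8] });
        seen
    });
    let seen = worker.join().expect("worker panicked");
    let info = rx.recv().expect("no swap posted");
    (seen, info)
}
```

The point of the shape is that guest memory never appears in any type that crosses threads; the worker copies bytes out and hands over an owned `Vec<u8>`.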

GPU pipeline (P2-P7 stable)

  • xenia-gpu::GpuSystem — one per KernelState. Owns the RegisterFile, the RingBufferView (+ IB stack for nested PM4_INDIRECT_BUFFER), the TextureCache / RenderTargetCache (P4/P5), and the GpuMmio atomic mailbox exposed via the 0x7FC8_0000 MMIO aperture (Canary graphics_system.cc:141). Per scheduler round: sync_with_mmio() then execute_one() of whatever's ready.
  • Type-3 packet coverage: every non-draw Type-3 opcode is implemented (NOP, INDIRECT_BUFFER[_PFD], WAIT_REG_MEM, REG_RMW, REG_TO_MEM, MEM_WRITE, COND_WRITE, EVENT_WRITE[_SHD/_EXT/_ZPD], SET_CONSTANT[2], SET_SHADER_CONSTANTS, LOAD_ALU_CONSTANT, IM_LOAD[_IMMEDIATE], CONTEXT_UPDATE, INVALIDATE_STATE, VIZ_QUERY, ME_INIT, SET_BIN_MASK/SELECT, INTERRUPT, XE_SWAP). DRAW_INDX* captures DrawState + ProcessedPrimitive + metrics.
  • WGSL shader interpreter (P3b/c + P7): xenia-gpu::ucode decoder + pack_for_wgsl dense layout; xenos_interp.wgsl (~465 LOC) implements the CF walker + 13 vec ALU ops + 6 scalar ops + R32G32B32A32_FLOAT vertex fetch + texture sampling. XenosPipeline::new builds two bind groups; uploads shader+constants+vertex before each batch in dispatch_xenos_draws. P7 added a direct Xenos→WGSL translator for when shader-bug isolation is needed.
  • Texture cache (P5): page-version invalidated via GuestMemory::page_version. Formats supported: K8888, K565, Dxt1, Dxt2_3, Dxt4_5 (M5). Host side texture_cache_host.rs maps each to Rgba8Unorm/Bc{1,2,3}RgbaUnorm with format-aware bytes_per_row.
  • Render target cache (P4): EDRAM resolve handler handle_event_initiator wired into all four PM4_EVENT_WRITE* variants. On event code 15 (TILE_FLUSH), snapshots RB_COPY_* into last_resolve, bumps stats.resolves_total. Actual EDRAM→memory byte copy still deferred.
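
For orientation when reading the Type-3 dispatch, a header decode can be sketched as below. The field layout (type in bits 31:30, count minus one in bits 29:16, opcode in bits 14:8, predicate in bit 0) is an assumption based on how Canary's command processor unpacks PM4 headers; `Type3Header` and `decode_type3` are hypothetical names, not the crate's API.

```rust
/// Decoded fields of a PM4 Type-3 packet header (illustrative struct).
#[derive(Debug, PartialEq, Eq)]
pub struct Type3Header {
    pub opcode: u8,
    /// Number of payload dwords that follow the header.
    pub count: u32,
    pub predicated: bool,
}

/// Returns `None` when the header is not a Type-3 packet.
pub fn decode_type3(header: u32) -> Option<Type3Header> {
    // Packet type lives in the top two bits.
    if header >> 30 != 3 {
        return None;
    }
    Some(Type3Header {
        opcode: ((header >> 8) & 0x7F) as u8,
        // The field stores count - 1.
        count: ((header >> 16) & 0x3FFF) + 1,
        predicated: header & 1 != 0,
    })
}
```

A dispatcher would then match on `opcode` and consume exactly `count` dwords, which is what keeps unknown or unimplemented opcodes from desynchronising the ring walk.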

MMIO aperture (stable)

  • Base 0x7FC8_0000, mask 0xFFFF_0000, size 0x0001_0000. Install via MmioRegion on GuestMemory.
  • Registers served (others trace+zero): CP_RB_WPTR, CP_RB_RPTR, CP_INT_STATUS, CP_INT_ACK (0x071D, write-echo), D1MODE_VBLANK_VLINE_STATUS (0x1951 / byte offset 0x6544, W1TC on bit 0).
  • Bit 0 of D1MODE_VBLANK_VLINE_STATUS is set by the app main loop on every synthetic vsync tick. Sylpheed's callback tests it with rlwinm. r,r,0,31,31; bc 12,2,skip, so all vsync work is skipped when the bit is clear.
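
The aperture match and the W1TC (write-1-to-clear) behaviour can be sketched as follows. The constants come from the notes above; `register_index` and `VblankStatus` are hypothetical helpers for illustration, and the real mailbox uses atomics rather than a plain field.

```rust
pub const MMIO_BASE: u32 = 0x7FC8_0000;
pub const MMIO_MASK: u32 = 0xFFFF_0000;
pub const D1MODE_VBLANK_VLINE_STATUS: u32 = 0x1951;

/// Maps a guest address to a dword register index, or `None` if the
/// address falls outside the aperture.
pub fn register_index(addr: u32) -> Option<u32> {
    if addr & MMIO_MASK != MMIO_BASE {
        return None;
    }
    // Byte offset within the 64 KiB window, divided by the dword size.
    Some((addr & !MMIO_MASK) / 4)
}

/// Minimal model of a status register whose bit 0 is write-1-to-clear.
#[derive(Default)]
pub struct VblankStatus {
    bits: u32,
}

impl VblankStatus {
    /// Host side: raise bit 0 on each synthetic vsync tick.
    pub fn on_vsync_tick(&mut self) {
        self.bits |= 1;
    }

    pub fn read(&self) -> u32 {
        self.bits
    }

    /// Guest write: a 1 in bit 0 clears it, a 0 leaves it untouched.
    pub fn write(&mut self, value: u32) {
        self.bits &= !(value & 1);
    }
}
```

Byte offset 0x6544 divided by 4 gives 0x1951, which is why the notes list both numbers for the same register.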

Scheduler + interrupts

  • HwState variants: Idle, Ready, Blocked(BlockReason), Exited(code), ServicingIrq(BlockReason). ServicingIrq is used by the graphics-interrupt injector to stash a block reason while running the callback; wake() and round_schedule both treat ServicingIrq as runnable.
  • Graphics interrupt injection (post-M8): try_inject_graphics_interrupt picks any non-Idle/Exited HW slot (prefers Ready, falls back to Blocked). InterruptState::injected_hw tracks which slot ran the callback. The LR-sentinel return path restores pre-injection ctx and re-blocks with the stashed reason (unless a wake() during the callback cleared it).
  • Deadlock recovery: when all live threads are Blocked/Idle/Exited and no timer is pending, force-wake every blocked thread with STATUS_TIMEOUT in gpr[3]. scheduler.deadlock_recoveries counter tracks this.
  • Main thread exit is NOT a halt: when tid=1 hits LR_HALT_SENTINEL we mark it Exited and continue; the outer loop halts only when has_live_thread() is false. Sylpheed's design spawns workers then returns from main.
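
The state rules above can be condensed into a small sketch. The `HwState` variants are taken from the notes; the `BlockReason` variants, the free functions, and the parallel `gpr3` slice are hypothetical simplifications, and `STATUS_TIMEOUT` uses the standard NTSTATUS value 0x102.

```rust
/// Hypothetical block reasons; the real enum has its own variants.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BlockReason {
    WaitObject,
    Sleep,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum HwState {
    Idle,
    Ready,
    Blocked(BlockReason),
    Exited(u32),
    /// Running an injected IRQ callback; carries the stashed block reason.
    ServicingIrq(BlockReason),
}

impl HwState {
    /// ServicingIrq counts as runnable, same as Ready.
    pub fn is_runnable(self) -> bool {
        matches!(self, HwState::Ready | HwState::ServicingIrq(_))
    }
}

pub const STATUS_TIMEOUT: u32 = 0x0000_0102;

/// Injection target: prefer a Ready slot, fall back to a Blocked one.
pub fn pick_injection_slot(slots: &[HwState]) -> Option<usize> {
    slots
        .iter()
        .position(|s| *s == HwState::Ready)
        .or_else(|| slots.iter().position(|s| matches!(s, HwState::Blocked(_))))
}

/// If nothing is runnable and no timer is pending, force-wake every blocked
/// slot with STATUS_TIMEOUT in its r3. Returns the number of slots woken.
pub fn recover_deadlock(slots: &mut [HwState], gpr3: &mut [u32], timer_pending: bool) -> usize {
    if timer_pending || slots.iter().any(|s| s.is_runnable()) {
        return 0;
    }
    let mut woken = 0;
    for (slot, r3) in slots.iter_mut().zip(gpr3.iter_mut()) {
        if matches!(*slot, HwState::Blocked(_)) {
            *slot = HwState::Ready;
            *r3 = STATUS_TIMEOUT;
            woken += 1;
        }
    }
    woken
}
```

In this sketch the return value of `recover_deadlock` is what a counter such as `scheduler.deadlock_recoveries` would be derived from.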

HLE primitives (stable)

  • Pseudo-handle resolution resolve_pseudo_handle(state, h): 0xFFFFFFFE → current thread handle, 0xFFFFFFFF → 0, others pass through. Called at top of every Ob*/Nt*Wait* export.
  • PKEVENT shim ensure_dispatcher_object(state, mem, ptr): Ke* sync functions take PKEVENT pointers; first touch reads Xenon DISPATCHER_HEADER (type byte + SignalState at +4 + Limit at +0x10 for semaphores) and mints a shadow KernelObject keyed by the pointer. refresh_pkevent_shadow_from_guest re-syncs SignalState on each wait.
  • WaitAny handle-index return: Canary's WaitMultiple returns STATUS_WAIT_0 + index for WaitAny. do_wait_multiple matches; set_wake_status_for_waitany updates gpr[3] on wake.
  • I/O completion signaling: signal_io_completion_event(state, event_handle) fires at every completion path of NtReadFile/NtWriteFile (r4 = event).
  • Empty-path / root-device opens (NtCreateFile("game:\") etc.): synth a zero-byte KernelObject::File with empty path. NtQueryInformationFile class 5 reports Directory=1 for empty///:-tail paths; class 34 (FileNetworkOpenInformation, 56 B) reports FILE_ATTRIBUTE_DIRECTORY at offset +48.
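
Two of the primitives above are small enough to sketch directly. This is a simplified illustration: the real `resolve_pseudo_handle` takes the kernel state, whereas this version takes the current thread handle as a plain argument, and `wait_any_status` is a hypothetical helper showing only the `STATUS_WAIT_0 + index` rule.

```rust
pub const PSEUDO_CURRENT_THREAD: u32 = 0xFFFF_FFFE;
pub const PSEUDO_NULL: u32 = 0xFFFF_FFFF;
pub const STATUS_WAIT_0: u32 = 0;

/// Maps pseudo-handles to real ones; everything else passes through.
pub fn resolve_pseudo_handle(current_thread_handle: u32, h: u32) -> u32 {
    match h {
        PSEUDO_CURRENT_THREAD => current_thread_handle,
        PSEUDO_NULL => 0,
        other => other,
    }
}

/// WaitAny result: STATUS_WAIT_0 plus the index of the first signaled
/// object, or `None` if the caller must block.
pub fn wait_any_status(signaled: &[bool]) -> Option<u32> {
    signaled
        .iter()
        .position(|s| *s)
        .map(|i| STATUS_WAIT_0 + i as u32)
}
```

Resolving at the top of each wait export keeps the rest of the wait path free of special cases for the two sentinel values.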

HUD

6 rows, well-spaced, cyan accents:

  1. Title + uptime + instr/kIPS (live counter via instructions_counter atomic).
  2. Swaps.
  3. GPU stats (packets, draws_total, resolves_total, interrupts).
  4. Last-draw prim/verts.
  5. Pad state.
  6. Render path: xdispatch: xlated=N interp=M xlated-pipelines=P tex-cache=T fb=WxH.

One-shot tracing::info! latches: "first Xenos draw dispatched" and "first translator pipeline compiled".

Observability defaults

The default filter quiets winit/naga/gilrs to warn and wgpu to error. Override via --log-filter='info,wgpu_core=trace' during bring-up. --trace-chrome PATH captures a Chrome/Perfetto trace; --profile PATH.svg emits a flamegraph.

Interpreter performance (post-Tier-3)

~10 MIPS end-to-end on Sylpheed. Three wins stacked: moving the per-instruction metrics::counter! off the hot path; a direct-mapped 64k DecodeCache keyed by PC with page-version invalidation; and a Debugger::wants_hooks() short-circuit plus trace_enabled = false by default (the previous O(n²) Vec::remove(0) on the trace log was the real bottleneck, not metrics).
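
A direct-mapped cache of that shape can be sketched as below. This is an illustration under stated assumptions, not the project's `DecodeCache`: the decoded instruction is stood in for by a `u32`, the slot index is assumed to be the PC's word index masked to 16 bits, and the caller is assumed to pass the current page version for the PC's page.

```rust
const CACHE_ENTRIES: usize = 1 << 16; // 64k direct-mapped slots

#[derive(Clone, Copy)]
struct Entry {
    pc: u32,
    page_version: u32,
    decoded: u32, // stand-in for a decoded instruction
    valid: bool,
}

pub struct DecodeCache {
    entries: Vec<Entry>,
    pub hits: u64,
    pub misses: u64,
}

impl DecodeCache {
    pub fn new() -> Self {
        let empty = Entry { pc: 0, page_version: 0, decoded: 0, valid: false };
        DecodeCache { entries: vec![empty; CACHE_ENTRIES], hits: 0, misses: 0 }
    }

    /// Instructions are 4 bytes, so drop the low two bits before masking.
    fn slot(pc: u32) -> usize {
        ((pc >> 2) as usize) & (CACHE_ENTRIES - 1)
    }

    /// Returns the cached decode when both the PC tag and the page version
    /// match; otherwise calls `decode` and overwrites the slot.
    pub fn get_or_decode(
        &mut self,
        pc: u32,
        page_version: u32,
        decode: impl FnOnce() -> u32,
    ) -> u32 {
        let idx = Self::slot(pc);
        let e = self.entries[idx];
        if e.valid && e.pc == pc && e.page_version == page_version {
            self.hits += 1;
            return e.decoded;
        }
        self.misses += 1;
        let decoded = decode();
        self.entries[idx] = Entry { pc, page_version, decoded, valid: true };
        decoded
    }
}
```

Comparing the stored page version on lookup means a write to guest code invalidates stale entries lazily, with no need to walk the cache when a page changes.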

Deferred Tier 4: threaded-code dispatch / JIT. Only worth doing after the shader translator and HLE coverage gaps narrow; a faster interpreter that is still wrong just produces wrong output faster.

Phase history

The complete roadmap (P1-P8), perf Tiers 1-3, and first-pixels M1-M9 have all landed. Details are deliberately elided here; they are in the individual commit messages and the project_xenia_rs_current_state.md next-steps file. This doc stays focused on stable facts a new session needs before touching the code.