xenia-rs/migration/claude-memory/project_xenia_rs_concurrency_m1_progress.md
MechaCat02 e6d43a23ac chore: add migration/ bundle for cross-machine setup
Bundles state that lives OUTSIDE the xenia-rs repo so a fresh clone on
another machine can be brought up to identical configuration via
migration/setup.sh:

  - claude-memory/             ~/.claude/projects/-home-fabi-RE-Project-Sylpheed/memory/
                               (103 files, 1.1 MB - MEMORY.md + every
                                project_xenia_rs_*.md from audits
                                addis_signext through audit-058)
  - project-root/dot-claude/   <project-root>/.claude/settings.json
                               (Stop hook + permissions)
  - project-root/ppc-manual/   <project-root>/ppc-manual/
                               (PowerPC reference docs, 397 files, 3.7 MB)
  - project-root/run-canary.sh <project-root>/run-canary.sh
  - README.md                  Human-readable setup checklist
  - setup.sh                   Idempotent installer (also reclones
                               xenia-canary at pinned HEAD 6de80dffe)
  - MANIFEST.md                Per-file mapping + per-file-not-bundled
                               restoration recipe

Excluded from bundle (not shippable via git):
  - Sylpheed ISO (7.8 GB; copyright; manual copy required)
  - sylpheed.db (395 MB; regenerable from XEX via analysis tooling)
  - target/ build artifacts (rebuild on target)
  - audit-runs probe firehoses (.log/.stdout/.stderr ~11 GB; rerun if needed)
  - audit-runs memory dumps (.bin ~4.5 GB; rerun audit-026/027/029 if needed)
  - xenia-canary checkout (setup.sh reclones from
    git.mc02.dev/fabi/Xenia-Canary.git at HEAD 6de80dffe)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 21:38:38 +02:00


name: xenia-rs concurrency rollout — M1 complete (2026-04-26)
description: All 10 M1 sub-steps landed. Default GPU backend is now threaded (worker thread on its own); `--gpu-inline` is the rollback. 395 workspace tests pass; sylpheed -n 2M golden matches in both modes; VdSwap=1/=2 fire end-to-end under threaded mode.
type: project
originSessionId: af90c866-579c-4506-af85-cd5a5030af85

What's landed (M1.1–M1.10)

All 10 M1 sub-steps complete. Default GPU backend at runtime is threaded (GpuBackend::Threaded); --gpu-inline (or --ui, or XENIA_GPU_INLINE=1) selects the legacy synchronous path.

Key types and modules

  • xenia_gpu::GpuBackend — enum Inline(GpuSystem) | Threaded(GpuHandle). Forwarding methods: mmio(), as_inline[_mut](), initialize_ring_buffer, enable_rptr_writeback, extend_write_ptr_by, drain_to_current_wptr, notify_xe_swap, has_pending_interrupts, take_pending_interrupts, digest_snapshot. (crates/xenia-gpu/src/handle.rs)

  • GpuCommand variants: InitializeRing, EnableRptrWriteback, DrainFence{target_wptr, reply_tx}, NotifyXeSwap{frontbuffer_phys, width, height}, Shutdown.

  • GpuHandle::send_cmd(cmd) wraps the raw cmd_tx.send with the M1.7 parker discipline: store wake_pending = true with Release ordering, then unpark() the worker thread.

  • GpuWorker::run(Arc<GuestMemory>) — registers self as wake target, drains commands, syncs MMIO + executes packets in batches of 64, refreshes Arc<Mutex<GpuDigestSnapshot>> for the CPU-side digest, drains pending_interrupts → int_tx, parks via park_timeout(16ms) when idle.

  • spawn_gpu_worker(worker, Arc<GuestMemory>) -> JoinHandle spawns the worker; shutdown_and_join_with_timeout joins with 1 s defensive timeout.
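
The send_cmd / park_timeout handshake described above can be sketched with the standard library alone. This is a minimal illustration, not the code in crates/xenia-gpu: `Cmd`, `Handle`, `spawn_worker` and `join_with_timeout` are hypothetical names, `std::sync::mpsc` stands in for crossbeam_channel, and the poll-based join is only one plausible way to get a bounded shutdown.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::mpsc::{channel, Receiver, Sender};
use std::sync::{Arc, Mutex};
use std::thread::{self, JoinHandle, Thread};
use std::time::{Duration, Instant};

// Hypothetical, trimmed-down command set (the note lists more variants).
enum Cmd {
    Work(u32),
    Shutdown,
}

struct Handle {
    cmd_tx: Sender<Cmd>,
    wake_pending: Arc<AtomicBool>,
    worker_thread: Arc<Mutex<Option<Thread>>>,
}

impl Handle {
    /// Enqueue the command, publish the wake flag with Release, then unpark.
    fn send_cmd(&self, cmd: Cmd) -> bool {
        let sent = self.cmd_tx.send(cmd).is_ok();
        self.wake_pending.store(true, Ordering::Release);
        if let Some(t) = self.worker_thread.lock().unwrap().as_ref() {
            t.unpark();
        }
        sent
    }
}

fn spawn_worker() -> (Handle, JoinHandle<u32>) {
    let (cmd_tx, cmd_rx): (Sender<Cmd>, Receiver<Cmd>) = channel();
    let wake_pending = Arc::new(AtomicBool::new(false));
    let worker_thread = Arc::new(Mutex::new(None));
    let (wp, wt) = (wake_pending.clone(), worker_thread.clone());
    let join = thread::spawn(move || {
        // Register this thread as the wake target before the first park.
        *wt.lock().unwrap() = Some(thread::current());
        let mut total = 0u32;
        loop {
            // Clear the flag first so a wake that arrives mid-drain is kept.
            let _ = wp.swap(false, Ordering::AcqRel);
            while let Ok(cmd) = cmd_rx.try_recv() {
                match cmd {
                    Cmd::Work(n) => total += n,
                    Cmd::Shutdown => return total,
                }
            }
            // Park only when nobody signalled since the swap; the 16 ms
            // timeout is a backstop, not the primary wake mechanism.
            if !wp.load(Ordering::Acquire) {
                thread::park_timeout(Duration::from_millis(16));
            }
        }
    });
    (Handle { cmd_tx, wake_pending, worker_thread }, join)
}

/// Poll-based join with a deadline (std has no timed join).
fn join_with_timeout(join: JoinHandle<u32>, timeout: Duration) -> Option<u32> {
    let deadline = Instant::now() + timeout;
    while !join.is_finished() {
        if Instant::now() >= deadline {
            return None;
        }
        thread::sleep(Duration::from_millis(1));
    }
    join.join().ok()
}
```

Setting the flag before calling unpark() means a command sent between the worker's drain and its park is never lost: either the worker sees the flag and skips the park, or the unpark token makes park_timeout return immediately.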

Memory model

  • GuestMemory.page_table: Vec<AtomicU64> with per-page Acquire/Release. alloc, is_mapped, page_entry, write_bulk, translate_virtual_mut all &self.
  • GuestMemory.writes_total: AtomicU64 + page_versions: Vec<AtomicU64> with Release on bump, Acquire on read.
  • MemoryAccess::write_u32_fence / read_u32_fence (M1.8) — Release fence before the write / Acquire fence after the read. Migrated EVENT_WRITE_SHD and writeback_read_ptr to use the fenced variants.
  • All MemoryAccess writes take &self after the M1.4(b) handoff. ~140 &mut GuestMemory callsites swept across 10 files. GuestMemoryPcr<'_> callsites still use &mut because PcrWriter::write_pcr_id takes &mut self.
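
The fenced accessors follow the usual fence-plus-relaxed-access pattern. The sketch below assumes a hypothetical `Words` type backed by `AtomicU32` cells; the real GuestMemory layout is not shown in this note, so only the ordering discipline (Release fence before the write, Acquire fence after the read) is taken from the text.

```rust
use std::sync::atomic::{fence, AtomicU32, Ordering};
use std::sync::Arc;
use std::thread;

/// Hypothetical stand-in for guest memory: a shared array of 32-bit words.
struct Words(Vec<AtomicU32>);

impl Words {
    fn new(len: usize) -> Self {
        Words((0..len).map(|_| AtomicU32::new(0)).collect())
    }

    /// Plain (unfenced) write and read.
    fn write_u32(&self, idx: usize, v: u32) {
        self.0[idx].store(v, Ordering::Relaxed);
    }
    fn read_u32(&self, idx: usize) -> u32 {
        self.0[idx].load(Ordering::Relaxed)
    }

    /// Release fence before the write: everything written earlier is
    /// visible to a thread that observes this value via read_u32_fence.
    fn write_u32_fence(&self, idx: usize, v: u32) {
        fence(Ordering::Release);
        self.0[idx].store(v, Ordering::Relaxed);
    }

    /// Acquire fence after the read: later reads cannot be reordered
    /// before the load that observed the published value.
    fn read_u32_fence(&self, idx: usize) -> u32 {
        let v = self.0[idx].load(Ordering::Relaxed);
        fence(Ordering::Acquire);
        v
    }
}

/// Producer writes a payload, then a fenced flag; the consumer spins on
/// the fenced flag and only then reads the payload.
fn publish_and_observe() -> u32 {
    let mem = Arc::new(Words::new(2));
    let producer = {
        let mem = mem.clone();
        thread::spawn(move || {
            mem.write_u32(0, 0xC0DE); // payload
            mem.write_u32_fence(1, 1); // "ready" flag
        })
    };
    while mem.read_u32_fence(1) == 0 {
        std::hint::spin_loop();
    }
    let payload = mem.read_u32(0);
    producer.join().unwrap();
    payload
}
```

This is the same shape as an EVENT_WRITE_SHD-style writeback: the payload goes out with ordinary writes and only the final "done" word needs the fenced variant.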

Concurrency primitives (live in production)

  • MMIO mailboxes (Arc<AtomicU32> × 5): cp_rb_wptr, cp_rb_rptr, cp_int_status, cp_int_ack, d1mode_vblank_vline_status. Release on writer / Acquire on reader.
  • GpuMmio.wake_pending: Arc<AtomicBool> + worker_thread: Arc<Mutex<Option<Thread>>>. The WPTR write callback sets the flag and unpark()s the worker; the worker swaps the flag, then parks.
  • crossbeam_channel::unbounded for cmd_tx/cmd_rx and int_tx/int_rx.
  • bounded(1) reply channels for DrainFence (CPU's recv_timeout(1s) + worker's Instant-based 900 ms internal deadline).
  • Arc<Mutex<GpuDigestSnapshot>> refreshed once per worker iteration; CPU reads via digest_snapshot().
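
The DrainFence round trip with its two deadlines can be sketched as follows. This is an illustration under stated assumptions: `std::sync::mpsc::sync_channel(1)` stands in for crossbeam's `bounded(1)`, `spawn_drain_worker` and `drain_to` are hypothetical names, and replying with the read pointer actually reached is a guess at what the worker reports.

```rust
use std::sync::mpsc::{channel, sync_channel, RecvTimeoutError, Sender, SyncSender};
use std::thread;
use std::time::{Duration, Instant};

/// Hypothetical request mirroring DrainFence{target_wptr, reply_tx}.
struct DrainFence {
    target_wptr: u32,
    reply_tx: SyncSender<u32>,
}

/// Worker: advance rptr toward the target by `step` per iteration and give
/// up at its own internal deadline, which is shorter than the CPU's wait.
fn spawn_drain_worker(step: u32, budget: Duration) -> Sender<DrainFence> {
    let (tx, rx) = channel::<DrainFence>();
    thread::spawn(move || {
        let mut rptr = 0u32;
        for req in rx {
            let deadline = Instant::now() + budget;
            while rptr < req.target_wptr && Instant::now() < deadline {
                rptr = (rptr + step).min(req.target_wptr);
            }
            // Capacity 1 holds the single reply, so try_send never blocks
            // even if the CPU side has already stopped waiting.
            let _ = req.reply_tx.try_send(rptr);
        }
    });
    tx
}

/// CPU side: send the fence, then wait a bounded time for the reply.
fn drain_to(
    tx: &Sender<DrainFence>,
    target_wptr: u32,
    wait: Duration,
) -> Result<u32, RecvTimeoutError> {
    let (reply_tx, reply_rx) = sync_channel(1);
    if tx.send(DrainFence { target_wptr, reply_tx }).is_err() {
        return Err(RecvTimeoutError::Disconnected);
    }
    reply_rx.recv_timeout(wait)
}
```

Keeping the worker's budget (900 ms in the note) below the CPU's recv_timeout (1 s) means a stalled drain is normally reported by the worker itself before the CPU side times out.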

CLI / env defaults

default                               → threaded
--gpu-inline (or XENIA_GPU_INLINE=1)  → inline
--gpu-thread (or XENIA_GPU_THREAD=1)  → threaded (explicit)
--ui                                  → forces inline (UI worker not yet shared-mem-aware)
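
The precedence in this table can be written down as a small pure function. The sketch is hypothetical: `Flags`, `GpuMode` and `resolve_gpu_mode` are illustrative names, and letting an inline request win over `--gpu-thread` when both are given is an assumption the note does not confirm.

```rust
#[derive(Debug, PartialEq, Clone, Copy)]
enum GpuMode {
    Inline,
    Threaded,
}

/// Hypothetical parsed CLI flags.
#[derive(Default)]
struct Flags {
    gpu_inline: bool,
    gpu_thread: bool,
    ui: bool,
}

/// `env_inline` is the value of XENIA_GPU_INLINE, if set.
fn resolve_gpu_mode(flags: &Flags, env_inline: Option<&str>) -> GpuMode {
    // --ui always forces the inline path.
    if flags.ui {
        return GpuMode::Inline;
    }
    // Explicit rollback request via flag or environment.
    // Assumption: this wins over --gpu-thread if both are present.
    if flags.gpu_inline || env_inline == Some("1") {
        return GpuMode::Inline;
    }
    // --gpu-thread and the default resolve to the same backend.
    let _explicit = flags.gpu_thread;
    GpuMode::Threaded
}
```

A caller would typically pass `std::env::var("XENIA_GPU_INLINE").ok().as_deref()` as the second argument.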

Verification (all green)

| Check | Result |
|---|---|
| cargo build --workspace | clean |
| cargo test --workspace | 395 passed, 0 failed |
| xenia-rs check sylpheed.iso -n 2_000_000 --expect golden/sylpheed_n2m.json (default = threaded) | matches |
| Same with --gpu-inline | matches |
| xenia-rs exec sylpheed.iso -n 30_000_000 --halt-on-deadlock (default = threaded) | exit 0 |
| VdSwap=1 + VdSwap=2 under threaded mode | both fire (~18M + ~28M cycles) |
| GPU worker shutdown clean within 1 s | yes |

Beyond ~50M instructions, both threaded and inline modes hit the same pre-existing RtlRaiseException bug, which is unrelated to the concurrency rollout.

Known limitations / deferred

  • --ui + threaded backend: cmd_exec_inner panics if both are set; --ui auto-forces inline. Rationale: run_with_ui consumes GuestMemory by value; migrating it to Arc<GuestMemory> is a separate work item.
  • Inline path retained: kept as the rollback rail and the --ui path. M1.10 cleanup deferred to post-M3 per plan.
  • Beyond ~50M instructions: both modes hit a pre-existing RtlRaiseException. Not a regression.

Next milestone (M2)

M2 covers the KernelStateInner + Arc<Mutex<...>> refactor, per-slot Mutex<HwSlot>, ThreadRef generation packing, and a ReservationTable for lwarx/stwcx. pairs. Some M2 work was pulled forward by M1.4 (page_table atomization) and is already complete.

Files of note