Root cause (physical-mirror aliasing gap → GPU read wrong region → ring never truly drained → render worker ring-space wait → no frame → no draw):

The Xbox 360 maps its 512 MB of physical DRAM into several virtual mirror windows that differ only in cache policy: bare physical (0x0xxxxxxx), write-combine (0x4xxxxxxx), and cached (0xA/0xC/0xExxxxxxx). All of them alias addr & 0x1FFF_FFFF. Our implementation has one flat membase, and `heap_alloc` (MmAllocatePhysicalMemoryEx) commits physical backing in the 0x4xxxxxxx window. The guest masks its CP-ring allocation base to bare physical (0x4adcc000 & 0x1FFFFFFF = 0x0adcc000) before handing it to VdInitializeRingBuffer, and PM4 INDIRECT_BUFFER / writeback / resolve pointers are likewise bare-physical. Our implementation stored those verbatim and read `membase + 0x0adcc000`, a never-committed zero-filled page. As a result the GPU drained ~718k zero PM4 headers and never executed the real Type3/DRAW stream, and the RPtr writeback landed on a zero page rather than the location the render worker (tid=8) polls, freezing that worker forever.

Fix (GPU/Vd-boundary translation, not memory-layer): add `physical_to_backing(addr)`, which derives the committed backing exactly from `heap_alloc`'s placement (0x4000_0000 | (addr & 0x1FFF_FFFF); idempotent for the WC window, flat for non-physical code/stack). Apply it at every point where the GPU/kernel consumes a guest physical address:
- ring base (initialize_ring_buffer)
- RPtr writeback (enable_rptr_writeback)
- PM4 INDIRECT_BUFFER pointer
- WAIT_REG_MEM / COND_WRITE memory poll+write
- REG_TO_MEM / MEM_WRITE / EVENT_WRITE* / LOAD_ALU_CONSTANT / IM_LOAD addresses
- the resolve dest write
- the vd_swap frontbuffer present read

This was chosen over memory-layer aliasing because the latter re-projects every CPU load/store and corrupts the guest's flat 0xA/0xC/0xE accesses (it caused an early PC=0xfffffffc fault).
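The translation described above can be sketched as follows. This is a minimal illustration, not the crate's actual code: the `u32` signature, the `PHYSICAL_OFFSET_MASK` name, and the `is_physical_alias` helper are assumptions, and so is the choice to treat only the bare-physical and write-combine windows as physical while passing every other address through unchanged.

```rust
/// Base of the write-combine window where `heap_alloc` commits backing
/// (value taken from the commit text; the crate's real definition is not shown).
pub const PHYSICAL_BACKING_BASE: u32 = 0x4000_0000;

/// Hypothetical name: mask selecting the 512 MB physical offset.
const PHYSICAL_OFFSET_MASK: u32 = 0x1FFF_FFFF;

/// ASSUMPTION: only bare-physical (0x0000_0000..=0x1FFF_FFFF) and
/// write-combine (0x4000_0000..=0x5FFF_FFFF) addresses count as physical.
/// The top three address bits identify the 512 MB window.
fn is_physical_alias(addr: u32) -> bool {
    matches!(addr >> 29, 0b000 | 0b010)
}

/// Map a guest physical address to the committed backing address.
/// Idempotent for the WC window; other addresses are returned as-is.
pub fn physical_to_backing(addr: u32) -> u32 {
    if is_physical_alias(addr) {
        PHYSICAL_BACKING_BASE | (addr & PHYSICAL_OFFSET_MASK)
    } else {
        addr
    }
}
```

Under these assumptions the masked ring base 0x0adcc000 translates back to 0x4adcc000, an address already in the WC window is unchanged, and an address such as 0x82000000 passes through untouched.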
Two adjacent GPU-backend gates this exposed and also fixed (canary-faithful):
- WaitCmp::from_wait_info was off by one vs canary's MatchValueAndRef selector (it decoded wait_info&7==3 as NotEqual instead of Equal), inverting the standard CP coherency wait so the GPU parked forever on the first INDIRECT_BUFFER. Remapped to 1=Less..7=Always, 0=Never.
- Added MakeCoherent: a WAIT polling COHER_STATUS_HOST clears the status bit (mirrors command_processor.cc:801-838) so the coherency handshake resolves.

Result: the GPU now decodes the real Type3 packets at 0x4adcc000 (ME_INIT, INDIRECT_BUFFER → real Type0/WAIT_REG_MEM at 0x4adf5080) instead of zero-headers; RPtr at 0x408619fc advances (0x13, 0x16, … written by the GPU worker); the frame loop sub_822F1AA8 actively writes the controller at 0x40d09a40 (0x20→0x21→0x23); no fault, full 200M/1B budget runs clean.

draws_seen is still 0: the remaining gate is upstream and separate. The main frame loop never sets controller bit-28 (frame-ready) at [0x40d09a40] (it stalls at 0x23, the known iterate-2C state-divergence gate), so the guest never enqueues a render IB; the GPU only ever replays the init IB. This fix correctly unblocks the GPU ring/IB/RPtr data path (gate-2 GPU backend); the bit-28 frame-ready gate is the next target.

Stable golden (sylpheed_n50m) unchanged (draws/swaps/RTs/shaders identical at 50M); regenerated twice byte-identical.
cargo test --workspace: 672 passed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
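As a rough illustration of the remapped WAIT_REG_MEM selector (0=Never, 1=Less, 3=Equal, 7=Always, as stated in the message), the decode could look like the sketch below. The variant names and codes for 2, 4, 5, and 6, the `matches` helper, and the `u32` signatures are assumptions based on the usual PM4 compare-function ordering, not a copy of the crate's `WaitCmp`.

```rust
/// Sketch of a compare selector decoded from the low three bits of wait_info.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum WaitCmp {
    Never,
    Less,
    LessEqual,    // assumed position (code 2)
    Equal,
    NotEqual,     // assumed position (code 4)
    GreaterEqual, // assumed position (code 5)
    Greater,      // assumed position (code 6)
    Always,
}

impl WaitCmp {
    /// Decode the selector; codes 0, 1, 3 and 7 follow the commit text.
    pub fn from_wait_info(wait_info: u32) -> Self {
        match wait_info & 7 {
            0 => WaitCmp::Never,
            1 => WaitCmp::Less,
            2 => WaitCmp::LessEqual,
            3 => WaitCmp::Equal,
            4 => WaitCmp::NotEqual,
            5 => WaitCmp::GreaterEqual,
            6 => WaitCmp::Greater,
            _ => WaitCmp::Always,
        }
    }

    /// Hypothetical helper: does the polled value satisfy the wait?
    pub fn matches(self, value: u32, reference: u32) -> bool {
        match self {
            WaitCmp::Never => false,
            WaitCmp::Less => value < reference,
            WaitCmp::LessEqual => value <= reference,
            WaitCmp::Equal => value == reference,
            WaitCmp::NotEqual => value != reference,
            WaitCmp::GreaterEqual => value >= reference,
            WaitCmp::Greater => value > reference,
            WaitCmp::Always => true,
        }
    }
}
```

With this mapping a coherency wait encoded as selector 3 resolves once the polled value equals the reference, rather than parking while they are equal.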
50 lines
1.6 KiB
Rust
//! Xenos GPU emulation for xenia-rs.
//!
//! Modules:
//! - [`pm4`]: packet format decoder + Type-3 opcode set.
//! - [`ring_view`]: ring-buffer bookkeeping (base/size/read/write pointers).
//! - [`register_file`]: 0x6000-entry register array backing the CP + state.
//! - [`gpu_system`]: top-level `GpuSystem` + PM4 executor running one packet
//!   per call (see the plan's P2 for the design rationale).
//!
//! Legacy modules `ring_drain` and `command_processor` are retained while P3+
//! migrations finish; they will be removed once every caller is on
//! [`gpu_system::GpuSystem`].

pub mod command_processor;
pub mod draw_state;
pub mod edram;
pub mod gpu_system;
pub mod handle;
pub mod mmio_region;
pub mod pm4;
pub mod primitive;
pub mod register_file;
pub mod ring_drain;
pub mod ring_view;
pub mod render_target_cache;
pub mod resolve;
pub mod shader_metrics;
pub mod shaders;
pub mod texture_cache;
pub mod tiled_address;
pub mod translator;
pub mod ucode;
pub mod xenos_constants;

pub use gpu_system::{
    ExecOutcome, GpuBlock, GpuMmio, GpuStats, GpuSystem, InterruptSource, PendingInterrupt,
    PHYSICAL_BACKING_BASE, ShaderBlob, SwapNotification, WaitCmp, physical_to_backing,
};
pub use handle::{
    DrainReply, GpuBackend, GpuCommand, GpuDigestSnapshot, GpuHandle, GpuWorker,
    shutdown_and_join_with_timeout, spawn_gpu_worker, spawn_noop_worker,
};
pub use mmio_region::build_region as build_mmio_region;
pub use pm4::{
    PacketHeader, PacketKind, PM4_INTERRUPT, PM4_NOP, PM4_XE_SWAP, SWAP_SIGNATURE,
    type3_opcode_name,
};
pub use ring_drain::{DrainResult, drain};
pub use ring_view::RingBufferView;