Files
xenia-rs/crates/xenia-gpu/src/lib.rs
MechaCat02 2bdb93e51e [iterate-2K] GPU physical-mirror aliasing: ring/IB/RPtr/resolve read wrong host region
Root cause (physical-mirror aliasing gap → GPU read wrong region → ring never
truly drained → render worker ring-space wait → no frame → no draw):

The Xbox 360 maps its 512 MB of physical DRAM into several virtual mirror
windows differing only in cache policy — bare physical (0x0xxxxxxx),
write-combine (0x4xxxxxxx), and cached 0xA/0xC/0xExxxxxxx — all aliasing
addr & 0x1FFF_FFFF. Ours has one flat membase and `heap_alloc`
(MmAllocatePhysicalMemoryEx) commits physical backing in the 0x4xxxxxxx
window. The guest masks its CP-ring allocation base to bare physical
(0x4adcc000 & 0x1FFFFFFF = 0x0adcc000) before handing it to
VdInitializeRingBuffer, and PM4 INDIRECT_BUFFER / writeback / resolve
pointers are likewise bare-physical. Ours stored those verbatim and read
`membase + 0x0adcc000`, a never-committed zero-filled page — so the GPU
drained ~718k zero PM4 headers, never executed the real Type3/DRAW stream,
and the RPtr writeback landed on a zero page the render worker (tid=8) polls,
freezing it forever.

Fix (GPU/Vd-boundary translation, not memory-layer): add
`physical_to_backing(addr)` deriving the committed backing exactly from
`heap_alloc`'s placement (0x4000_0000 | (addr & 0x1FFF_FFFF), idempotent for
the WC window, flat for non-physical code/stack). Apply it at every point the
GPU/kernel consumes a guest physical address: ring base
(initialize_ring_buffer), RPtr writeback (enable_rptr_writeback), PM4
INDIRECT_BUFFER pointer, WAIT_REG_MEM / COND_WRITE memory poll+write,
REG_TO_MEM / MEM_WRITE / EVENT_WRITE* / LOAD_ALU_CONSTANT / IM_LOAD addresses,
the resolve dest write, and the vd_swap frontbuffer present read. This was
chosen over memory-layer aliasing because the latter re-projects every CPU
load/store and corrupts the guest's flat 0xA/0xC/0xE accesses (it caused an
early PC=0xfffffffc fault).

Two adjacent GPU-backend gates this exposed and also fixed (canary-faithful):
- WaitCmp::from_wait_info was off by one vs canary's MatchValueAndRef
  selector (it decoded wait_info&7==3 as NotEqual instead of Equal),
  inverting the standard CP coherency wait so the GPU parked forever on the
  first INDIRECT_BUFFER. Remapped to 1=Less..7=Always, 0=Never.
- Added MakeCoherent: a WAIT polling COHER_STATUS_HOST clears the status bit
  (mirrors command_processor.cc:801-838) so the coherency handshake resolves.

Result: the GPU now decodes the real Type3 packets at 0x4adcc000 (ME_INIT,
INDIRECT_BUFFER → real Type0/WAIT_REG_MEM at 0x4adf5080) instead of
zero-headers; RPtr at 0x408619fc advances (0x13, 0x16, … written by the GPU
worker); the frame loop sub_822F1AA8 actively writes the controller at
0x40d09a40 (0x20→0x21→0x23); no fault, full 200M/1B budget runs clean.

draws_seen is still 0: the remaining gate is upstream and separate — the main
frame loop never sets controller bit-28 (frame-ready) at [0x40d09a40] (stalls
at 0x23, the known iterate-2C state-divergence gate), so the guest never
enqueues a render IB; the GPU only ever replays the init IB. This fix
correctly unblocks the GPU ring/IB/RPtr data path (gate-2 GPU backend); the
bit-28 frame-ready gate is the next target.

Stable golden (sylpheed_n50m) unchanged (draws/swaps/RTs/shaders identical at
50M); regenerated twice byte-identical. cargo test --workspace: 672 passed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-13 13:39:57 +02:00

50 lines
1.6 KiB
Rust

//! Xenos GPU emulation for xenia-rs.
//!
//! Modules:
//! - [`pm4`]: packet format decoder + Type-3 opcode set.
//! - [`ring_view`]: ring-buffer bookkeeping (base/size/read/write pointers).
//! - [`register_file`]: 0x6000-entry register array backing the CP + state.
//! - [`gpu_system`]: top-level `GpuSystem` + PM4 executor running one packet
//! per call (see the plan's P2 for the design rationale).
//!
//! Legacy module `ring_drain` and `command_processor` are retained while P3+
//! migrations finish; they will be removed once every caller is on
//! [`gpu_system::GpuSystem`].
pub mod command_processor;
pub mod draw_state;
pub mod edram;
pub mod gpu_system;
pub mod handle;
pub mod mmio_region;
pub mod pm4;
pub mod primitive;
pub mod register_file;
pub mod ring_drain;
pub mod ring_view;
pub mod render_target_cache;
pub mod resolve;
pub mod shader_metrics;
pub mod shaders;
pub mod texture_cache;
pub mod tiled_address;
pub mod translator;
pub mod ucode;
pub mod xenos_constants;
pub use gpu_system::{
ExecOutcome, GpuBlock, GpuMmio, GpuStats, GpuSystem, InterruptSource, PendingInterrupt,
PHYSICAL_BACKING_BASE, ShaderBlob, SwapNotification, WaitCmp, physical_to_backing,
};
pub use handle::{
DrainReply, GpuBackend, GpuCommand, GpuDigestSnapshot, GpuHandle, GpuWorker,
shutdown_and_join_with_timeout, spawn_gpu_worker, spawn_noop_worker,
};
pub use mmio_region::build_region as build_mmio_region;
pub use pm4::{
PacketHeader, PacketKind, PM4_INTERRUPT, PM4_NOP, PM4_XE_SWAP, SWAP_SIGNATURE,
type3_opcode_name,
};
pub use ring_drain::{DrainResult, drain};
pub use ring_view::RingBufferView;