[iterate-3O] Real-render slice: replay guest geometry in --ui (Route A)

Replace the synthetic placeholder triangle in the --ui window with the
splash's REAL guest geometry, proving the faithful-render pipe end to end.

Architecture: Route A (UI-side replay). A per-draw capture channel carries
each PM4_DRAW_INDX*'s real state to the UI, which replays it through the
existing wgpu Xenos pipeline. The deterministic headless core is untouched:
capture is gated on an Option<Vec<DrawCapture>> that is None in headless
mode and only enabled on the --ui path, so the --gpu-inline n50m golden is
byte-identical (verified 2x).

The hard part was sourcing real vertices. The WGSL VS already does
format-aware vertex fetch from the b4 storage buffer at the address from the
fetch constant -- but b4 was never populated and the fetch address is an
absolute guest dword address. The slice:
  * xenia-gpu/draw_capture.rs: parse the active VS, find its first vertex
    fetch, read that fetch constant, copy a bounded window of guest memory
    at the fetch base. Best-effort: has_real_vertices=false falls back to
    procedural geometry (never fabricated pixels).
  * gpu_system.rs: accumulate one DrawCapture per draw into frame_captures.
  * exports.rs (vd_swap): drain + publish the frame's captures to the UI.
  * ui_bridge/bridge.rs: new publish_geometry channel + UiHandles.geometry.
  * WGSL (interp + translator): rebase the absolute fetch address by a new
    DrawConstants.vertex_base_dwords so it indexes the uploaded window.
  * render.rs: dispatch_xenos_captures uploads each draw's real vertex
    window + matching shader, issues real DrawRequests (real prim type,
    host vertex count, vs/ps keys).
  * app.rs: prefer the real-capture replay; HUD adds real-geo=N counter.

Verified in --ui on Sylpheed: "first Xenos capture batch replayed (real
geometry) captures=24 real_vertex_draws=24" -- all draws resolved a real
guest vertex window; WGSL compiles; no validation errors over 1616 swaps.

Still synthetic-free but not yet pixel-perfect: textures/UVs, DMA index
buffers (auto-index only for now), and kCopy resolve routing are staged
for follow-ups. Faithful: real vertex data, prim types, shaders, constants.

cargo test --workspace green; n50m golden unchanged (2x byte-identical).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This commit is contained in:
MechaCat02
2026-06-17 22:38:46 +02:00
parent 6bb4355e3d
commit 504592ac13
12 changed files with 509 additions and 65 deletions

View File

@@ -36,7 +36,9 @@ struct DrawConstants {
draw_index: u32,
vertex_count: u32,
prim_kind: u32,
_pad: u32,
/// iterate-3O: guest dword base of the uploaded `vertex_buffer` window.
/// The WGSL subtracts this from the absolute vertex-fetch address.
vertex_base_dwords: u32,
}
/// Submitted to [`XenosPipeline::render_one`] to render one captured draw.
@@ -48,6 +50,9 @@ pub struct DrawRequest {
pub vertex_count: u32,
/// Xenos primitive-type code; shader may branch on it in P3b+.
pub prim_kind: u32,
/// iterate-3O: guest dword base of the per-draw vertex window uploaded to
/// `vertex_buffer` (b4). 0 = no real vertex window (procedural fallback).
pub vertex_base_dwords: u32,
}
/// Reasonable upper bound on a single shader blob (dwords). Most Xbox 360
@@ -193,7 +198,7 @@ impl XenosPipeline {
draw_index: 0,
vertex_count: 3,
prim_kind: 4,
_pad: 0,
vertex_base_dwords: 0,
};
let draw_ctx_buffer = device.create_buffer_init(&wgpu::util::BufferInitDescriptor {
label: Some("xenos draw ctx"),
@@ -480,7 +485,7 @@ impl XenosPipeline {
draw_index: req.draw_index,
vertex_count: req.vertex_count.max(3),
prim_kind: req.prim_kind,
_pad: 0,
vertex_base_dwords: req.vertex_base_dwords,
};
queue.write_buffer(&self.draw_ctx_buffer, 0, bytemuck::bytes_of(&cb));
@@ -606,7 +611,7 @@ impl XenosPipeline {
draw_index: req.draw_index,
vertex_count: req.vertex_count.max(3),
prim_kind: req.prim_kind,
_pad: 0,
vertex_base_dwords: req.vertex_base_dwords,
};
queue.write_buffer(&self.draw_ctx_buffer, 0, bytemuck::bytes_of(&cb));