[iterate-3O] Real-render slice: replay guest geometry in --ui (Route A)
Replace the synthetic placeholder triangle in the --ui window with the
splash's REAL guest geometry, proving the faithful-render pipe end to end.
Architecture: Route A (UI-side replay). A per-draw capture channel carries
each PM4_DRAW_INDX*'s real state to the UI, which replays it through the
existing wgpu Xenos pipeline. The deterministic headless core is untouched:
capture is gated on an Option<Vec<DrawCapture>> that is None in headless
mode and only enabled on the --ui path, so the --gpu-inline n50m golden is
byte-identical (verified 2x).
The hard part was sourcing real vertices. The WGSL VS already does
format-aware vertex fetch from the b4 storage buffer at the address from the
fetch constant -- but b4 was never populated and the fetch address is an
absolute guest dword address. The slice:
* xenia-gpu/draw_capture.rs: parse the active VS, find its first vertex
fetch, read that fetch constant, copy a bounded window of guest memory
at the fetch base. Best-effort: has_real_vertices=false falls back to
procedural geometry (never fabricated pixels).
* gpu_system.rs: accumulate one DrawCapture per draw into frame_captures.
* exports.rs (vd_swap): drain + publish the frame's captures to the UI.
* ui_bridge/bridge.rs: new publish_geometry channel + UiHandles.geometry.
* WGSL (interp + translator): rebase the absolute fetch address by a new
DrawConstants.vertex_base_dwords so it indexes the uploaded window.
* render.rs: dispatch_xenos_captures uploads each draw's real vertex
window + matching shader, issues real DrawRequests (real prim type,
host vertex count, vs/ps keys).
* app.rs: prefer the real-capture replay; HUD adds real-geo=N counter.
Verified in --ui on Sylpheed: "first Xenos capture batch replayed (real
geometry) captures=24 real_vertex_draws=24" -- all draws resolved a real
guest vertex window; WGSL compiles; no validation errors over 1616 swaps.
Still synthetic-free but not yet pixel-perfect: textures/UVs, DMA index
buffers (auto-index only for now), and kCopy resolve routing are staged
for follow-ups. Faithful: real vertex data, prim types, shaders, constants.
cargo test --workspace green; n50m golden unchanged (2x byte-identical).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This commit is contained in:
@@ -436,6 +436,12 @@ pub struct GpuSystem {
|
||||
/// `GpuSystem::new` and lives for the whole GPU lifetime — no
|
||||
/// per-frame churn.
|
||||
pub edram: crate::edram::ShadowEdram,
|
||||
/// UI-only: when `Some`, every `PM4_DRAW_INDX*` appends a
|
||||
/// [`crate::draw_capture::DrawCapture`] here so the host UI can replay the
|
||||
/// real guest geometry. `None` in headless/deterministic mode — the
|
||||
/// `--gpu-inline` golden never enables this, so capture is entirely inert
|
||||
/// for `check`. Drained (taken) by `vd_swap` at each present.
|
||||
pub frame_captures: Option<Vec<crate::draw_capture::DrawCapture>>,
|
||||
}
|
||||
|
||||
impl GpuSystem {
|
||||
@@ -463,6 +469,15 @@ impl GpuSystem {
|
||||
texture_cache: crate::texture_cache::TextureCache::new(),
|
||||
last_draw_textures: Vec::new(),
|
||||
edram: crate::edram::ShadowEdram::new(),
|
||||
frame_captures: None,
|
||||
}
|
||||
}
|
||||
|
||||
/// Enable per-draw geometry capture for the host UI. Inert (and never
|
||||
/// called) in headless/deterministic mode. Idempotent.
|
||||
pub fn enable_frame_capture(&mut self) {
|
||||
if self.frame_captures.is_none() {
|
||||
self.frame_captures = Some(Vec::new());
|
||||
}
|
||||
}
|
||||
|
||||
@@ -1295,8 +1310,56 @@ impl GpuSystem {
|
||||
"gpu: DRAW_INDX captured"
|
||||
);
|
||||
self.last_draw = Some(ds);
|
||||
let host_vertex_count = processed.host_vertex_count;
|
||||
self.last_primitive = Some(processed);
|
||||
|
||||
// iterate-3O: UI-only per-draw geometry capture. Resolves the
|
||||
// real guest vertex window behind this draw (from the active
|
||||
// VS's vertex-fetch constant) so the host UI can replay the
|
||||
// actual splash geometry instead of synthetic shapes. Entirely
|
||||
// inert in headless/deterministic mode (`frame_captures` is
|
||||
// `None`), so the `--gpu-inline` golden is unaffected.
|
||||
if self.frame_captures.is_some() {
|
||||
let vs_key = self.active_vs_key.unwrap_or(0);
|
||||
let ps_key = self.active_ps_key.unwrap_or(0);
|
||||
let parsed_vs = self
|
||||
.active_vs_key
|
||||
.and_then(|k| self.shader_blobs.get(&k))
|
||||
.map(|b| crate::ucode::parse_shader(&b.dwords));
|
||||
let (idx_src, idx_size) = match ds.index_source {
|
||||
crate::draw_state::IndexSource::Dma { index_size, .. } => {
|
||||
(ds.index_source, index_size)
|
||||
}
|
||||
crate::draw_state::IndexSource::Immediate { index_size } => {
|
||||
(ds.index_source, index_size)
|
||||
}
|
||||
crate::draw_state::IndexSource::AutoIndex => {
|
||||
(ds.index_source, crate::draw_state::IndexSize::Sixteen)
|
||||
}
|
||||
};
|
||||
let cap = crate::draw_capture::build(
|
||||
self.stats.draws_seen as u32,
|
||||
ds.primitive,
|
||||
host_vertex_count,
|
||||
idx_src,
|
||||
idx_size,
|
||||
vs_key,
|
||||
ps_key,
|
||||
parsed_vs.as_ref(),
|
||||
&self.register_file,
|
||||
mem,
|
||||
);
|
||||
if let Some(caps) = self.frame_captures.as_mut() {
|
||||
// Bound the per-frame list so a runaway frame can't grow
|
||||
// host memory without limit; keep the most recent.
|
||||
const MAX_CAPS: usize = 4096;
|
||||
if caps.len() >= MAX_CAPS {
|
||||
caps.remove(0);
|
||||
}
|
||||
caps.push(cap);
|
||||
}
|
||||
}
|
||||
|
||||
// P5b: decode the textures the *active pixel shader* actually
|
||||
// samples. Parse the bound PS, collect its `tfetch`
|
||||
// fetch-constant slots, read each 6-dword fetch constant from
|
||||
|
||||
Reference in New Issue
Block a user