xenia-rs/crates/xenia-kernel/src/ui_bridge.rs
MechaCat02 504592ac13 [iterate-3O] Real-render slice: replay guest geometry in --ui (Route A)
Replace the synthetic placeholder triangle in the --ui window with the
splash's REAL guest geometry, proving the faithful-render pipe end to end.

Architecture: Route A (UI-side replay). A per-draw capture channel carries
each PM4_DRAW_INDX*'s real state to the UI, which replays it through the
existing wgpu Xenos pipeline. The deterministic headless core is untouched:
capture is gated on an Option<Vec<DrawCapture>> that is None in headless
mode and only enabled on the --ui path, so the --gpu-inline n50m golden is
byte-identical (verified 2x).
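
The Option-gated capture described above might look roughly like the sketch below. It is an illustration only: `GpuSketch`, `enable_capture`, `on_draw` and `drain_frame_captures` are hypothetical names, and `DrawCapture` is reduced to two stand-in fields rather than the real struct from xenia-gpu.

```rust
/// Hypothetical stand-in for the real `DrawCapture`; only two fields are modelled.
#[derive(Clone, Debug, PartialEq)]
pub struct DrawCapture {
    pub prim_type: u32,
    pub vertex_count: u32,
}

/// Hypothetical stand-in for the GPU system that executes PM4 draws.
#[derive(Default)]
pub struct GpuSketch {
    /// `None` in headless mode, so draws are counted but never recorded.
    frame_captures: Option<Vec<DrawCapture>>,
    pub draws_total: u64,
}

impl GpuSketch {
    /// Opt in to per-draw capture (assumed to happen only on the --ui path).
    pub fn enable_capture(&mut self) {
        self.frame_captures = Some(Vec::new());
    }

    /// Called once per draw packet. The counter is updated in both modes;
    /// the capture is pushed only when capture has been enabled.
    pub fn on_draw(&mut self, prim_type: u32, vertex_count: u32) {
        self.draws_total += 1;
        if let Some(caps) = self.frame_captures.as_mut() {
            caps.push(DrawCapture { prim_type, vertex_count });
        }
    }

    /// Drain this frame's captures at swap time. Returns an empty `Vec`
    /// when capture is disabled, leaving the headless state untouched.
    pub fn drain_frame_captures(&mut self) -> Vec<DrawCapture> {
        match self.frame_captures.as_mut() {
            Some(caps) => std::mem::take(caps),
            None => Vec::new(),
        }
    }
}
```

In this sketch the headless path never allocates or stores a capture, which is the property the byte-identical golden relies on.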

The hard part was sourcing real vertices. The WGSL VS already does
format-aware vertex fetch from the b4 storage buffer at the address from the
fetch constant -- but b4 was never populated and the fetch address is an
absolute guest dword address. The slice:
  * xenia-gpu/draw_capture.rs: parse the active VS, find its first vertex
    fetch, read that fetch constant, copy a bounded window of guest memory
    at the fetch base. Best-effort: has_real_vertices=false falls back to
    procedural geometry (never fabricated pixels).
  * gpu_system.rs: accumulate one DrawCapture per draw into frame_captures.
  * exports.rs (vd_swap): drain + publish the frame's captures to the UI.
  * ui_bridge/bridge.rs: new publish_geometry channel + UiHandles.geometry.
  * WGSL (interp + translator): rebase the absolute fetch address by a new
    DrawConstants.vertex_base_dwords so it indexes the uploaded window.
  * render.rs: dispatch_xenos_captures uploads each draw's real vertex
    window + matching shader, issues real DrawRequests (real prim type,
    host vertex count, vs/ps keys).
  * app.rs: prefer the real-capture replay; HUD adds real-geo=N counter.

Verified in --ui on Sylpheed: "first Xenos capture batch replayed (real
geometry) captures=24 real_vertex_draws=24" -- all draws resolved a real
guest vertex window; WGSL compiles; no validation errors over 1616 swaps.

Still synthetic-free but not yet pixel-perfect: textures/UVs, DMA index
buffers (auto-index only for now), and kCopy resolve routing are staged
for follow-ups. Faithful: real vertex data, prim types, shaders, constants.

cargo test --workspace green; n50m golden unchanged (2x byte-identical).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-17 22:38:46 +02:00


//! Bridge between the kernel (CPU-thread side) and a host UI (main-thread side).
//!
//! The kernel side needs to:
//! - snapshot the latest host gamepad each time a guest calls
//! `XamInputGetState`, and
//! - signal the UI when the guest calls `VdSwap` so the UI can upload the
//! guest's frontbuffer to a wgpu texture and present it.
//!
//! Both directions are expressed as trait-object closures so that `xenia-kernel`
//! does not have to depend on winit/wgpu/gilrs. The [`UiBridge`] is installed
//! on [`KernelState::ui`] by `cmd_exec` when `--ui` is passed.
use std::collections::HashMap;
use std::sync::Arc;
use std::sync::atomic::{AtomicBool, AtomicU64};
use xenia_gpu::draw_capture::DrawCapture;
use xenia_gpu::texture_cache::TextureKey;
use xenia_gpu::xenos_constants::XenosConstantsBlock;
use xenia_hid::GamepadState;
use xenia_memory::MemoryAccess;
/// Information surfaced to the UI each time the guest presents a frame.
///
/// Fields mirror the seven "interesting" arguments to `VdSwap` in
/// `xenia-canary/src/xenia/kernel/xboxkrnl/xboxkrnl_video.cc`: the raw
/// frontbuffer pointer, its dimensions, and the format/color-space enum values
/// the guest passed through.
#[derive(Clone, Copy, Debug)]
pub struct SwapInfo {
    /// Guest physical/virtual address of the frontbuffer to present.
    pub frontbuffer_addr: u32,
    /// Width in pixels as reported by the guest.
    pub width: u32,
    /// Height in pixels as reported by the guest.
    pub height: u32,
    /// Xenos texture format enum (the guest passes a pointer; we dereference
    /// it here). 0 means "unknown / guest passed a null pointer".
    pub texture_format: u32,
    /// Color-space enum (sRGB / BT.709 / …).
    pub color_space: u32,
    /// Monotonically increasing frame counter maintained by the kernel; useful
    /// for HUD display and deduping.
    pub frame_index: u64,
    /// Total PM4 `DRAW_INDX*` packets the GPU has captured since boot.
    /// Surfaced so the UI HUD can show progress even before the full
    /// uber-shader pipeline is wired in.
    pub draws_total: u64,
    /// Total PM4 packets executed, across all opcodes — useful signal for
    /// "is the GPU actually getting anything at all to consume?".
    pub packets_total: u64,
    /// Most-recent draw's Xenos primitive-type code (0 = none yet).
    pub last_draw_prim: u32,
    /// Most-recent draw's vertex count.
    pub last_draw_vertex_count: u32,
    /// Indirect-buffer jumps so far (useful "is the game driving the ring
    /// buffer through IBs?" signal).
    pub indirect_buffer_jumps: u64,
    /// WAIT_REG_MEM stalls observed on the GPU slot.
    pub wait_reg_mem_blocks: u64,
    /// Summed CPU instruction count across all 6 HW threads. Mirrors the
    /// `cycle_count` field each `PpcContext` maintains; gives the HUD a live
    /// "how far has the guest run?" readout.
    pub instructions_total: u64,
    /// Active VS shader blob key at the most recent DRAW_INDX* (0 = none).
    /// P3b: the UI uses this to index into `handles.shader_blobs` so the
    /// Xenos uber-shader interpreter can upload the matching microcode.
    pub vs_blob_key: u32,
    /// Active PS shader blob key at the most recent DRAW_INDX*.
    pub ps_blob_key: u32,
    /// P4: total EDRAM→memory resolves fired since boot (TILE_FLUSH
    /// events). Non-zero means the game is committing pixels.
    pub resolves_total: u64,
    /// Subset of `resolves_total` whose byte-copy path succeeded and wrote
    /// at least one sample into guest memory.
    pub resolves_copied_total: u64,
    /// Subset of `resolves_total` that were skipped by the byte-copy path
    /// due to an unsupported format / MSAA mode / 3D destination.
    pub resolves_skipped_total: u64,
    /// P4: unique RT keys seen (from the GPU's internal render-target
    /// cache). Grows as the game exercises new RT footprints.
    pub unique_render_targets: u64,
    /// P6: total graphics-interrupt callbacks delivered (v-sync + CP).
    /// Non-zero means `VdSetGraphicsInterruptCallback` has been wired end
    /// to end and callbacks are actually running.
    pub interrupts_delivered: u64,
    /// P6: graphics-interrupts queued but dropped (callback unset,
    /// thread 0 blocked, or already inside another callback).
    pub interrupts_dropped: u64,
}
/// Handles the kernel uses to talk to a running host UI.
///
/// None of the closures are allowed to block for long — they are called from
/// the CPU interpreter thread on the hot path.
#[derive(Clone)]
pub struct UiBridge {
    /// Snapshot the host gamepad. Called from `XamInputGetState`.
    pub gamepad: Arc<dyn Fn() -> GamepadState + Send + Sync>,
    /// Report that the guest completed a frame. The closure gets the swap
    /// metadata plus a borrow of guest memory so it can copy the frontbuffer
    /// bytes into a UI-owned staging buffer before returning. Called from
    /// `VdSwap` on the CPU thread.
    pub post_swap: Arc<dyn Fn(SwapInfo, &dyn MemoryAccess) + Send + Sync>,
    /// Indicates the UI wants the CPU loop to stop. Checked periodically by
    /// the interpreter loop.
    pub shutdown: Arc<AtomicBool>,
    /// Set to `true` when a gamepad is present. `XamInputGetState` returns
    /// `ERROR_DEVICE_NOT_CONNECTED` when this is `false`.
    pub gamepad_connected: Arc<AtomicBool>,
    /// Live CPU instruction counter mirror. The app's run loop publishes
    /// the sum of `ctx.cycle_count` across HW threads here every ~8k
    /// instructions so the HUD can report progress between VdSwap events.
    pub instructions_counter: Arc<AtomicU64>,
    /// P3b asset publish: `vd_swap` snapshots the GPU's `shader_blobs` and
    /// constants register region and feeds them to the UI so the Xenos
    /// uber-shader interpreter has the microcode + constants needed to
    /// execute the guest draw. Split from `post_swap` so the asset wire
    /// stays optional — if the UI doesn't need them (headless mode) the
    /// closure is a no-op.
    pub publish_xenos_assets:
        Arc<dyn Fn(HashMap<u32, Vec<u32>>, XenosConstantsBlock) + Send + Sync>,
    /// P4 frontbuffer publish: at each `VdSwap`, the kernel detiles the
    /// guest frontbuffer (k_8_8_8_8 Tiled2D) on the CPU into a linear
    /// RGBA8 buffer and hands it to the UI. The closure receives
    /// `(width, height, bytes)`; the UI uploads the bytes as a texture.
    pub publish_frontbuffer:
        Arc<dyn Fn(u32, u32, Vec<u8>) + Send + Sync>,
    /// P5 primary texture publish: at each `VdSwap`, the kernel thread
    /// decodes the PS shader's primary-texture fetch constant (slot 0
    /// for now) and hands the decoded linear bytes + key to the UI so
    /// the xenos pipeline can bind a real texture at `@group(1)`.
    /// Receives `(TextureKey, bytes)`; when `None` is sent the UI
    /// reverts to its magenta stub.
    pub publish_texture:
        Arc<dyn Fn(Option<(TextureKey, Vec<u8>)>) + Send + Sync>,
    /// iterate-3O real-render slice: at each `VdSwap`, the kernel hands the
    /// UI the per-draw geometry captured this frame (one [`DrawCapture`] per
    /// `PM4_DRAW_INDX*`), including the real guest vertex window. The UI
    /// replays them through the Xenos wgpu pipeline so the splash renders its
    /// actual geometry instead of synthetic placeholder shapes. Empty in the
    /// degenerate case (no draws or capture disabled).
    pub publish_geometry:
        Arc<dyn Fn(Vec<DrawCapture>) + Send + Sync>,
}
impl UiBridge {
    /// Snapshot input state (user 0 only; higher indices are unconnected).
    pub fn snapshot_gamepad(&self) -> GamepadState {
        (self.gamepad)()
    }

    /// True iff `user_index` is 0 and a gamepad is connected. Only user 0
    /// is ever reported as connected.
    pub fn is_connected(&self, user_index: u32) -> bool {
        user_index == 0
            && self
                .gamepad_connected
                .load(std::sync::atomic::Ordering::Relaxed)
    }

    /// Push a swap event to the UI thread.
    pub fn notify_swap(&self, info: SwapInfo, mem: &dyn MemoryAccess) {
        (self.post_swap)(info, mem);
    }

    /// Snapshot current shader blobs + constants and hand them to the UI.
    /// Call from `vd_swap` so the UI has the matching assets for every
    /// draw captured in this frame.
    pub fn publish_assets(
        &self,
        blobs: HashMap<u32, Vec<u32>>,
        constants: XenosConstantsBlock,
    ) {
        (self.publish_xenos_assets)(blobs, constants);
    }

    /// True iff the UI asked for shutdown.
    pub fn should_shutdown(&self) -> bool {
        self.shutdown.load(std::sync::atomic::Ordering::Relaxed)
    }

    /// Hand a detiled frontbuffer frame to the UI. Called at most once per
    /// `VdSwap`. `bytes` must hold exactly `width * height * 4` bytes in
    /// `Rgba8Unorm` order (the UI pipeline's expected layout).
    pub fn publish_frontbuffer(&self, width: u32, height: u32, bytes: Vec<u8>) {
        (self.publish_frontbuffer)(width, height, bytes);
    }

    /// Hand one decoded guest texture to the UI. `Some` = update the bound
    /// slot; `None` = revert to the magenta stub.
    pub fn publish_texture(&self, tex: Option<(TextureKey, Vec<u8>)>) {
        (self.publish_texture)(tex);
    }

    /// Hand this frame's captured per-draw geometry to the UI.
    pub fn publish_geometry(&self, caps: Vec<DrawCapture>) {
        (self.publish_geometry)(caps);
    }
}
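
As an illustration of the closure-based design, a UI could back a publish closure with a channel while headless mode installs a no-op. The sketch below is a reduced, standalone model under stated assumptions: `MiniBridge`, `ui_bridge` and `headless_bridge` are hypothetical names, and `Vec<u32>` stands in for `Vec<DrawCapture>`. It is not how `cmd_exec` actually wires the real `UiBridge`.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::mpsc::{channel, Receiver};
use std::sync::{Arc, Mutex};

/// Hypothetical, reduced bridge: one publish closure plus the shutdown flag.
#[derive(Clone)]
pub struct MiniBridge {
    pub publish_geometry: Arc<dyn Fn(Vec<u32>) + Send + Sync>,
    pub shutdown: Arc<AtomicBool>,
}

impl MiniBridge {
    /// True iff the UI asked for shutdown.
    pub fn should_shutdown(&self) -> bool {
        self.shutdown.load(Ordering::Relaxed)
    }
}

/// UI-side construction: the closure forwards each batch into an unbounded
/// channel, so the CPU thread does not wait for the UI to consume it.
pub fn ui_bridge() -> (MiniBridge, Receiver<Vec<u32>>) {
    let (tx, rx) = channel::<Vec<u32>>();
    // The Mutex makes the captured sender shareable across threads
    // regardless of whether `Sender` itself is `Sync` on the toolchain.
    let tx = Mutex::new(tx);
    let bridge = MiniBridge {
        publish_geometry: Arc::new(move |caps: Vec<u32>| {
            if let Ok(tx) = tx.lock() {
                // A closed receiver means the UI is gone; drop the batch.
                let _ = tx.send(caps);
            }
        }),
        shutdown: Arc::new(AtomicBool::new(false)),
    };
    (bridge, rx)
}

/// Headless construction: publishing is a no-op.
pub fn headless_bridge() -> MiniBridge {
    MiniBridge {
        publish_geometry: Arc::new(|_caps: Vec<u32>| {}),
        shutdown: Arc::new(AtomicBool::new(false)),
    }
}
```

The kernel side only ever calls `(bridge.publish_geometry)(caps)`, so it needs no knowledge of which of the two constructions is installed.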