The 3O→3R real-render slice ran the guest's real translated VS/PS on real
captured vertices at full boot speed, but the --ui window stayed blank.
Narrowed it down with an env-gated frontbuffer readback + per-vertex NDC dump
(both removed): the captured splash quads (RectangleList, k_32_32_FLOAT,
3 verts) were non-zero and sane, so this was a transform/decode chain of
bugs, not missing geometry. Four coupled root causes:
- GPUBUG-106 (ucode/alu.rs): decode_alu read EVERY field out of w2, but
canary's AluInstruction lays out dest/write-mask/export/scalar-opcode in
w0, the vector opcode + source regs in w2, swizzle/negate/pred in w1. The
misread made every *export* ALU decode with vector_write_mask=0 → no
oPos/oColor export emitted → the translated VS collapsed every vertex to
the clip origin. Rewrote the field map to match ucode.h:2036-2086.
- GPUBUG-107 (ucode/fetch.rs + translator.rs): the translator hardcoded
R32G32B32A32_FLOAT (4 floats, stride 4); the splash quads are
k_32_32_FLOAT (2 floats, stride 2). Over-striding read the next vertex's
X into .w → negative W → the rectangle was clipped behind the camera. Decode
the real VertexFormat + dword stride and emit the matching component
read (1/2/3/4 float formats; others reject to the interpreter).
- GPUBUG-108 (translator.rs + xenos_interp.wgsl): the vfetch recomputed
the buffer base from xenos_consts.fetch[], but that uniform carries the
last-published per-frame fetch constant, not this draw's (stale
0x8a000002 vs the real base). The captured window already begins at the
fetch base, so index from 0 (vertex i at i*stride) when a real window is
present; only the synthetic fallback consults the uniform.
- iterate-3S NDC transform (draw_capture.rs + xenos_pipeline.rs + WGSL):
the guest VS emits screen-space pixel coords (clip disabled, VTE viewport
scale/offset off). Added compute_ndc_xy (mirrors canary
GetHostViewportInfo): rescales render-target pixels to [-1,1] clip with
the Y-flip for wgpu, plumbed per-draw into DrawConstants and applied in
both the translated and interpreter VS.
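Illustrative sketch of the GPUBUG-107/108 fetch rule and the 3S rescale
(not the code in this commit). fetch_vertex and pixel_to_ndc are
hypothetical names; the real compute_ndc_xy signature is not shown here.
Assumptions: tightly packed float formats (dword stride == component
count), missing components default to (0, 0, 0, 1), and a viewport with
no offset that spans the whole render target.

```rust
/// Hypothetical helper: read vertex `index` from a captured window that
/// already begins at the fetch base, so vertex i lives at i * stride.
/// `components` is the float count of the format (2 for k_32_32_FLOAT).
/// Assumption: stride equals the component count (tightly packed).
pub fn fetch_vertex(window: &[f32], index: usize, components: usize) -> Option<[f32; 4]> {
    // Only 1/2/3/4-float formats are handled; anything else is rejected,
    // standing in for the fallback to the interpreter.
    if !(1..=4).contains(&components) {
        return None;
    }
    let start = index.checked_mul(components)?;
    let end = start.checked_add(components)?;
    // Out-of-range reads return None instead of touching a neighbour.
    let src = window.get(start..end)?;
    // Assumed defaults for components the format does not supply.
    let mut out: [f32; 4] = [0.0, 0.0, 0.0, 1.0];
    out[..components].copy_from_slice(src);
    Some(out)
}

/// Hypothetical helper: map render-target pixel coordinates to [-1, 1]
/// clip space, flipping Y because pixel Y grows downward while NDC Y
/// grows upward in wgpu.
pub fn pixel_to_ndc(x: f32, y: f32, width: f32, height: f32) -> (f32, f32) {
    ((x * 2.0) / width - 1.0, 1.0 - (y * 2.0) / height)
}
```

With a 2-float format, fetch_vertex(&window, 1, 2) reads floats 2..4 and
leaves W at the assumed default of 1.0, so a neighbouring vertex can no
longer leak into .w. pixel_to_ndc maps the top-left pixel of the target
to the top-left corner of clip space.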
Result (env-gated readback, since removed): the real splash geometry now
fills ~50% of the frontbuffer in a clean triangular coverage pattern, real
positions from real guest vertices through the real translated shaders
(textures are the next stage — sampled color is still the magenta/white
texture stub, tex-cache=0). Headless-inert: draw_capture is only built
when frame_captures is Some (--ui); the changed decoders feed only the UI
translator/metrics. Golden byte-identical (check -n50m --gpu-inline
--stable-digest exit 0); 679 workspace tests green.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>