[iterate-3W] draw_capture: walk CF exec sequence to find the real vertex fetch

Fix the UI-side vertex-window resolver (`resolve_vertex_window`) so it
identifies vertex fetches via the control-flow `Exec` clause `sequence`
bitmap instead of blindly decoding every 3-dword triple.

Root cause (GPUBUG-109): the Xenos instruction block packs ALU and fetch
instructions identically (96 bits each); only the owning `Exec` clause's
`sequence` bitmap (2 bits per instruction; bit[2*i]: 1 = fetch, 0 = ALU) tells
them apart. The old resolver scanned every triple and trusted the first
that happened to decode as a vertex fetch, gated by a `dword0 & 3 == 3`
"type" guard. On real shaders this mis-decoded ALU triples as fetches and
either picked a garbage fetch-constant slot or rejected the clause before
reaching the true vertex fetch. Now walk the CF exec clauses exactly as
the translator does (`translator.rs::emit_exec`) and take the first
sequence-flagged *vertex* fetch.
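
A minimal sketch of the classification rule described above, assuming the
`sequence` bitmap fits in a `u32` and `2 * count` stays below 32. The
names `SlotKind` and `classify_exec` are hypothetical and do not exist in
the tree; the real walk lives in `resolve_vertex_window` and
`translator.rs::emit_exec`.

```rust
/// Kind of one 96-bit instruction slot, as selected by the owning Exec clause.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SlotKind {
    Alu,
    Fetch,
}

/// Classify the `count` instructions of one Exec clause.
/// `sequence` carries 2 bits per instruction; bit[2*i] set means fetch,
/// clear means ALU. Returns (dword offset into the instruction block, kind).
/// Assumption: 2 * count < 32, so the shift never overflows.
fn classify_exec(address: u32, count: u32, sequence: u32) -> Vec<(usize, SlotKind)> {
    (0..count as usize)
        .map(|i| {
            let kind = if (sequence >> (2 * i)) & 1 == 1 {
                SlotKind::Fetch
            } else {
                SlotKind::Alu
            };
            // Each instruction occupies 3 dwords, so slot `address + i`
            // starts at dword (address + i) * 3.
            ((address as usize + i) * 3, kind)
        })
        .collect()
}
```

Only slots classified as fetch are worth handing to `decode_fetch`; an
ALU slot may still decode as a fetch by accident, which is the failure
the old triple scan hit.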

Measured (env-gated probes, removed before commit): the resolver now
reaches the real fetch on every splash VS. The RectangleList draws
(vs 0x36660986 / 0xd4c14f46) keep resolving real geometry (valid fetch
const 0). The publisher-logo QuadList (vs 0x03b7b020) is correctly seen
to fetch from a fetch constant whose dword0 = 0x1 (no vertex buffer) —
i.e. its geometry is NOT sourced from a memory vertex buffer, so it still
(correctly) falls to the procedural path. That remaining gap (the logo's
auto-generated/index-derived geometry) is the next milestone-1 step; this
commit removes the decoder defect that masked it.
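
A minimal sketch of the fetch-constant check behind the result above,
assuming (as the old guard's comment did) that the low two bits of dword0
hold the constant type and that 3 means vertex. The value given to
`CONST_BASE_FETCH` here is a placeholder, and both helper names are
hypothetical.

```rust
/// Placeholder base register index; the project defines the real value.
const CONST_BASE_FETCH: u32 = 0x4800;

/// Register index of dword0 for fetch-constant slot `fc`.
/// Each slot spans 6 dwords, matching `CONST_BASE_FETCH + fc * 6`.
fn fetch_const_reg(fc: u32) -> u32 {
    CONST_BASE_FETCH + fc * 6
}

/// True when dword0's low two bits carry type 3 (vertex).
fn is_vertex_fetch_const(dword0: u32) -> bool {
    (dword0 & 0x3) == 3
}
```

Under these assumptions the publisher-logo constant (dword0 = 0x1) fails
the check, consistent with that draw falling to the procedural path.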

Determinism: UI-only. `resolve_vertex_window` runs only when
`frame_captures` is `Some` (i.e. `--ui`); the headless `--gpu-inline`
core never calls it. `check -n50000000 --gpu-inline --stable-digest`
exits 0 and is byte-identical run-to-run. `cargo test --workspace`: 681 green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
MechaCat02
2026-06-18 18:50:13 +02:00
parent da7c29b6d2
commit 39723dfe37


@@ -191,29 +191,42 @@ fn resolve_vertex_window(
     rf: &RegisterFile,
     mem: &dyn MemoryAccess,
 ) -> Option<(Vec<u32>, u32)> {
-    // The instruction block is 3 dwords per ALU/fetch triple. We don't have
-    // per-triple kind flags here, so we scan every triple and accept the
-    // first one that decodes as a *vertex* fetch with a plausible constant.
+    // iterate-3W (GPUBUG-109): the instruction block packs ALU and fetch
+    // instructions identically (96 bits / 3 dwords each); ONLY the owning
+    // `Exec` control-flow clause's `sequence` bitmap (2 bits per instruction,
+    // bit[2*i]=fetch/ALU) tells them apart. The previous blind triple-walk
+    // decoded ALU triples as fetches → garbage fetch-constant indices and a
+    // bogus `type==3` guard, never reaching the real vertex fetch. Walk the CF
+    // exec clauses exactly as the translator does (`translator.rs::emit_exec`)
+    // and take the FIRST sequence-flagged *vertex* fetch.
     let instrs = &parsed_vs.instructions;
     let mut fetch_const: Option<u8> = None;
-    let mut t = 0usize;
-    while t + 2 < instrs.len() {
-        let w0 = instrs[t];
-        let w1 = instrs[t + 1];
-        let w2 = instrs[t + 2];
-        if let crate::ucode::fetch::FetchInstruction::Vertex(vf) =
-            crate::ucode::fetch::decode_fetch([w0, w1, w2])
-        {
-            // Validate the referenced fetch constant is a real vertex fetch
-            // (type==3, kVertex) before trusting it.
-            let fc = vf.fetch_const as u32;
-            let dword0 = rf.read(CONST_BASE_FETCH + fc * 6);
-            if dword0 & 0x3 == 3 {
-                fetch_const = Some(vf.fetch_const);
+    'clauses: for clause in &parsed_vs.cf {
+        let crate::ucode::control_flow::ControlFlowInstruction::Exec {
+            address,
+            count,
+            sequence,
+            ..
+        } = *clause
+        else {
+            continue;
+        };
+        for i in 0..(count as usize) {
+            // bit[2*i] of the sequence bitmap: 1 = fetch, 0 = ALU.
+            if (sequence >> (i * 2)) & 1 == 0 {
+                continue;
+            }
+            let base = (address as usize + i) * 3;
+            if base + 2 >= instrs.len() {
                 break;
             }
+            if let crate::ucode::fetch::FetchInstruction::Vertex(vf) =
+                crate::ucode::fetch::decode_fetch([instrs[base], instrs[base + 1], instrs[base + 2]])
+            {
+                fetch_const = Some(vf.fetch_const);
+                break 'clauses;
+            }
         }
-        t += 3;
     }
     let fc = fetch_const? as u32;
     let dword0 = rf.read(CONST_BASE_FETCH + fc * 6);