[iterate-3W] draw_capture: walk CF exec sequence to find the real vertex fetch

Fix the UI-side vertex-window resolver (`resolve_vertex_window`) so it identifies vertex fetches via the control-flow `Exec` clause `sequence` bitmap instead of blindly decoding every 3-dword triple.

Root cause (GPUBUG-109): the Xenos instruction block packs ALU and fetch instructions identically (96 bits each); only the owning `Exec` clause's `sequence` bitmap (2 bits per instruction, bit[2*i] = fetch/ALU) tells them apart. The old resolver scanned every triple and trusted the first that happened to decode as a vertex fetch, gated by a `dword0 & 3 == 3` "type" guard. On real shaders this mis-decoded ALU triples as fetches and either picked a garbage fetch-constant slot or rejected the clause before reaching the true vertex fetch. Now walk the CF exec clauses exactly as the translator does (`translator.rs::emit_exec`) and take the first sequence-flagged *vertex* fetch.

Measured (env-gated probes, removed before commit): the resolver now reaches the real fetch on every splash VS. The RectangleList draws (vs 0x36660986 / 0xd4c14f46) keep resolving real geometry (valid fetch const 0). The publisher-logo QuadList (vs 0x03b7b020) is correctly seen to fetch from a fetch constant whose dword0 = 0x1 (no vertex buffer), i.e. its geometry is NOT sourced from a memory vertex buffer, so it still (correctly) falls to the procedural path. That remaining gap (the logo's auto-generated/index-derived geometry) is the next milestone-1 step; this commit removes the decoder defect that masked it.

Determinism: UI-only. `resolve_vertex_window` runs only when `frame_captures` is `Some` (i.e. `--ui`); the headless `--gpu-inline` core never calls it. `check -n50000000 --gpu-inline --stable-digest` exit 0 and byte-identical run-to-run.

cargo test --workspace: 681 green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@@ -191,29 +191,42 @@ fn resolve_vertex_window(
     rf: &RegisterFile,
     mem: &dyn MemoryAccess,
 ) -> Option<(Vec<u32>, u32)> {
-    // The instruction block is 3 dwords per ALU/fetch triple. We don't have
-    // per-triple kind flags here, so we scan every triple and accept the
-    // first one that decodes as a *vertex* fetch with a plausible constant.
+    // iterate-3W (GPUBUG-109): the instruction block packs ALU and fetch
+    // instructions identically (96 bits / 3 dwords each); ONLY the owning
+    // `Exec` control-flow clause's `sequence` bitmap (2 bits per instruction,
+    // bit[2*i]=fetch/ALU) tells them apart. The previous blind triple-walk
+    // decoded ALU triples as fetches → garbage fetch-constant indices and a
+    // bogus `type==3` guard, never reaching the real vertex fetch. Walk the CF
+    // exec clauses exactly as the translator does (`translator.rs::emit_exec`)
+    // and take the FIRST sequence-flagged *vertex* fetch.
     let instrs = &parsed_vs.instructions;
     let mut fetch_const: Option<u8> = None;
-    let mut t = 0usize;
-    while t + 2 < instrs.len() {
-        let w0 = instrs[t];
-        let w1 = instrs[t + 1];
-        let w2 = instrs[t + 2];
-        if let crate::ucode::fetch::FetchInstruction::Vertex(vf) =
-            crate::ucode::fetch::decode_fetch([w0, w1, w2])
-        {
-            // Validate the referenced fetch constant is a real vertex fetch
-            // (type==3, kVertex) before trusting it.
-            let fc = vf.fetch_const as u32;
-            let dword0 = rf.read(CONST_BASE_FETCH + fc * 6);
-            if dword0 & 0x3 == 3 {
-                fetch_const = Some(vf.fetch_const);
+    'clauses: for clause in &parsed_vs.cf {
+        let crate::ucode::control_flow::ControlFlowInstruction::Exec {
+            address,
+            count,
+            sequence,
+            ..
+        } = *clause
+        else {
+            continue;
+        };
+        for i in 0..(count as usize) {
+            // bit[2*i] of the sequence bitmap: 1 = fetch, 0 = ALU.
+            if (sequence >> (i * 2)) & 1 == 0 {
+                continue;
+            }
+            let base = (address as usize + i) * 3;
+            if base + 2 >= instrs.len() {
                 break;
             }
+            if let crate::ucode::fetch::FetchInstruction::Vertex(vf) =
+                crate::ucode::fetch::decode_fetch([instrs[base], instrs[base + 1], instrs[base + 2]])
+            {
+                fetch_const = Some(vf.fetch_const);
+                break 'clauses;
+            }
         }
-        t += 3;
     }
     let fc = fetch_const? as u32;
     let dword0 = rf.read(CONST_BASE_FETCH + fc * 6);
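
For readers without the crate at hand, the sequence-bitmap walk in the hunk above can be sketched in isolation. This is a minimal illustration, not the project's code: `ExecClause` and `fetch_triple_offsets` are hypothetical names standing in for `ControlFlowInstruction::Exec` and the loop inside `resolve_vertex_window`, and the sketch only locates sequence-flagged triples. It does not call `decode_fetch`, so it cannot tell a vertex fetch from any other kind of fetch.

```rust
/// Hypothetical stand-in for the `Exec` clause fields used in the diff.
#[derive(Clone, Copy)]
struct ExecClause {
    address: u32,  // index of the clause's first instruction, in triples
    count: u32,    // number of instructions owned by the clause
    sequence: u32, // 2 bits per instruction; bit[2*i] set means "fetch"
}

/// Returns the dword offset of every sequence-flagged fetch triple, in
/// clause order. `instr_dwords` is the length of the instruction block.
fn fetch_triple_offsets(clauses: &[ExecClause], instr_dwords: usize) -> Vec<usize> {
    let mut out = Vec::new();
    for c in clauses {
        for i in 0..c.count as usize {
            // bit[2*i] of the sequence bitmap: 1 = fetch, 0 = ALU.
            if (c.sequence >> (i * 2)) & 1 == 0 {
                continue;
            }
            // Each instruction is 3 dwords, so triple index -> dword offset.
            let base = (c.address as usize + i) * 3;
            // Same bounds check as the diff: stop on a triple that would
            // run past the end of the block.
            if base + 2 >= instr_dwords {
                break;
            }
            out.push(base);
        }
    }
    out
}

fn main() {
    let clauses = [
        // Three instructions at triple 0: ALU, fetch, ALU.
        ExecClause { address: 0, count: 3, sequence: 0b00_01_00 },
        // Two instructions at triple 3: fetch, fetch.
        ExecClause { address: 3, count: 2, sequence: 0b01_01 },
    ];
    // A 15-dword block holds five triples.
    println!("{:?}", fetch_triple_offsets(&clauses, 15)); // prints [3, 9, 12]
}
```

The sketch assumes `count` stays small enough that `i * 2` never reaches the width of `sequence`; a larger count would need a checked shift.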