[iterate-3P] Real splash geometry in --ui: fix CF predication decode + translator op coverage

Stage 1 of the iterate-3O resume plan: make the P7 translator actually
compile the splash's real VS/PS so real per-vertex POSITIONS render via the
host wgpu pipeline, instead of every draw falling to the interpreter (which
emits a placeholder triangle). Two coupled fixes, both faithful (Route A):

1. ucode/control_flow.rs (GPUBUG-103): clause-level predication was decoded
   from payload bits 28/29, which fall inside the exec clause's `sequence_`/
   `vc_hi_` fields, NOT the predicate flag. That stamped `predicated=true`
   on plain `kExec` clauses, so the translator rejected EVERY splash VS as
   `cf_cond`. Per canary ucode.h, clause predication is determined by the
   *opcode* (only kCondExecPred* = 5/6/13/14 are predicate-register-gated;
   their `condition_` is at word1 bit 9 = payload bit 41). kExec/kExecEnd
   (1/2) run unconditionally; kCondExec (3/4) is bool-constant-gated (not
   modeled). Diagnosed live in --ui: reject reason cf_cond on all 7 splash
   shader pairs → after fix, predicated=false and CF passes.

2. translator.rs: with CF passing, the next reject was `scl_op_unsupported`
   for scalar opcodes 4 (kMulsPrev2 / LIT emul) and 8 (kSgts); vector-op
   coverage was thin as well. Expanded vector_expr + scalar_expr to mirror
   the runtime interpreter's op set (which itself mirrors canary
   AluVectorOpcode/AluScalarOpcode):
   CND_EQ/GE/GT, TRUNC, MAX4, DST for vectors; the full SEQS/SGTS/SGES/SNES
   comparison set, MULS_PREV2 (with the -FLT_MAX / non-finite / b<=0 guard),
   SUBS(_PREV), EXP/LOG/RCP/RSQ/SQRT/SIN/COS, FRCS/TRUNCS/FLOORS for scalars.
   Side-effecting ops (setp*/kills*/maxas*) are still rejected and fall back
   to the interpreter (honest).
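
The bit arithmetic behind fix 1 can be checked in isolation. The sketch
below is a minimal standalone illustration, not the code in
ucode/control_flow.rs: `ClausePredication`, `make_payload`,
`decode_predication` and `old_predicated_bit28` are hypothetical names
introduced here, and the payload layout (word0 = bits 0..31, low 16 bits of
word1 = bits 32..47, opcode in the top 4 bits) is taken from the comments in
the diff.

```rust
/// Hypothetical container for the two predication flags of one CF clause.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct ClausePredication {
    predicated: bool,
    predicate_condition: bool,
}

/// Assemble a 48-bit CF payload: word0 occupies bits 0..31 and the low
/// 16 bits of word1 occupy bits 32..47 (assumed layout, per the diff).
fn make_payload(word0: u32, word1_lo16: u16) -> u64 {
    (word0 as u64) | ((word1_lo16 as u64) << 32)
}

/// Opcode-driven decode: only opcodes 5/6/13/14 (the kCondExecPred family)
/// are gated by the predicate register; their condition is payload bit 41.
fn decode_predication(payload: u64) -> ClausePredication {
    let opcode = ((payload >> 44) & 0xF) as u8;
    let is_pred_gated = matches!(opcode, 5 | 6 | 13 | 14);
    ClausePredication {
        predicated: is_pred_gated,
        predicate_condition: is_pred_gated && ((payload >> 41) & 1) != 0,
    }
}

/// The pre-fix extraction, kept only to show the misread: bit 28 belongs
/// to other exec-clause fields, not to a predicate flag.
fn old_predicated_bit28(payload: u64) -> bool {
    ((payload >> 28) & 1) != 0
}

fn main() {
    // word1 bit 9 lands on payload bit 41 (32 + 9).
    assert_eq!(make_payload(0, 1 << 9), 1u64 << 41);

    // Plain kExec (opcode 1) whose word0 happens to have bits 28/29 set.
    let exec = make_payload(0x3000_0000, 1 << 12);
    assert!(old_predicated_bit28(exec)); // old decode: wrongly predicated
    assert!(!decode_predication(exec).predicated); // opcode-driven decode

    // kCondExecPred (opcode 5) with its condition bit set.
    let pred = make_payload(0, (5 << 12) | (1 << 9));
    let decoded = decode_predication(pred);
    assert!(decoded.predicated && decoded.predicate_condition);

    println!("exec: {:?}", decode_predication(exec));
    println!("pred: {:?}", decoded);
}
```

The kExec case is the one that matters for the splash: any clause whose
word0 happened to have bit 28 set was flagged as predicated by the old
extraction, which is enough to trigger the `cf_cond` reject.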

Result (--ui, measured): xlated-pipelines 0→6, all draws served by the
translator (served_interp=0) — real VS/PS now run on the host GPU. The
splash is still not visibly correct because the captured guest vertex
windows read all-zero: the vertex-buffer base VA (~0x0adf_xxxx) is UNMAPPED
in guest memory (mem.translate()==None). That is a CPU/kernel memory-mapping
gap, not a GPU-render gap — the next stage.

Determinism: both files are in xenia-gpu core, but the CF `predicated` field
only feeds the UI translator and a metric tag, never deterministic state.
Verified: `check -n50000000 --gpu-inline --stable-digest` matches the golden
byte-for-byte (exit 0); 679 tests green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
MechaCat02
2026-06-18 15:07:06 +02:00
parent 504592ac13
commit 6ff184694d
2 changed files with 78 additions and 20 deletions


@@ -96,10 +96,26 @@ pub fn decode_cf_pair(word0: u32, word1: u32, word2: u32) -> (ControlFlowInstruc
 fn decode_single(payload: u64) -> ControlFlowInstruction {
     // Top 4 bits of the 48-bit payload.
     let opcode = ((payload >> 44) & 0xF) as u8;
-    // Predicate bit + condition live at the 28..30 range for exec/jmp. Rough
-    // extraction — good enough for the interpreter, which logs unknowns.
-    let predicated = ((payload >> 28) & 1) != 0;
-    let predicate_condition = ((payload >> 29) & 1) != 0;
+    // GPUBUG-103 (iterate-3P): clause-level predication is determined by the
+    // *opcode*, not by free bits. The 48-bit CF payload is word0 = bits 0..31,
+    // word1 = bits 32..47. Per canary `ucode.h`:
+    //   * `ControlFlowExecInstruction` (kExec/kExecEnd, opcodes 1/2): NOT
+    //     predicate-gated — it runs unconditionally.
+    //   * `ControlFlowCondExecInstruction` (kCondExec/kCondExecEnd, 3/4): gated
+    //     by a *bool constant*, `condition_` at word1 bit 10 = payload bit 42.
+    //     We don't model bool-constant gating in the WGSL paths (the bool is
+    //     virtually always set for these), so treat as unconditional.
+    //   * `ControlFlowCondExecPredInstruction` (kCondExecPred/...End/Clean...,
+    //     5/6/13/14): gated by the *predicate register*; `condition_` at word1
+    //     bit 9 = payload bit 41.
+    // The prior code read bits 28/29 (which fall inside `sequence_`/`vc_hi_`)
+    // and stamped `predicated=true` on plenty of plain `kExec` clauses — which
+    // made the P7 translator reject EVERY splash VS as `cf_cond`, forcing the
+    // interpreter (placeholder geometry) for all draws.
+    let is_pred_gated = matches!(opcode, 5 | 6 | 13 | 14);
+    let predicated = is_pred_gated;
+    let predicate_condition = is_pred_gated && ((payload >> 41) & 1) != 0;
     // Xenos `ControlFlowOpcode` (canary `ucode.h:86-160`):
     // 0 kNop, 1 kExec, 2 kExecEnd, 3 kCondExec, 4 kCondExecEnd,