The silph title state machine (tid13) blocked on event 0x10a0, never signaled.

Root: the event's producer chain runs on the silph worker (entry 0x821C4AD0, our tid14), which was starved. tid14 shares a HW slot with a guest spinlock/barrier participant (sub_824D1328, entry 0x824D2940) that busy-spins on the db16cyc hint `or r31,r31,r31` (encoding 0x7FFFFB78) at 0x824D140C. Under our round-robin lockstep the spinner consumed its whole block every round and starved the co-located tid14 (only 9 progress hits over 200M instr), so the producer never reached the event-create/duplicate/signal dance the canary oracle performs (handle F80000E8 set by the submitter F8000044 via a duplicated handle).

Fix (canary-faithful): recognize the db16cyc spin hint exactly as canary's InstrEmit_orx does (code 0x7FFFFB78 -> DelayExecution) and surface it as a new StepResult::Yield. The scheduler's yield_current() promotes every Ready peer on the slot past STARVE_LIMIT so begin_slot_visit picks one next round, then they reset and the spinner reclaims the slot: fair alternation, no priority inversion, pure function of slot state (deterministic).

Result (lockstep, cache-persist, -n 200M): tid14 progresses past its old stall into a real wait; tid13 advances off 0x10a0 to a new event; hub/submitter re-enter their wait loops. imports 280k->592k, packets 124M->164M, swaps 1->2. draws still 0 (the splash's first draw is a further-upstream gate).

Determinism preserved (two cold n50m runs byte-identical). n50m golden re-baselined (imports 90296->339766, swaps 1->2; draws unchanged 0). n2m golden unchanged (db16cyc not reached in first 2M). Tests 670/670.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Round 34 — silph_ui_synth.rs (cluster B sibling) — DEFERRED PLAN
Background
Rounds 23-33 drove γ-cluster #2 down to the actual gate: sub_821741C8 (silph worker-dispatch loop) fires 0× in ours / 471× in canary (tid=6). It's invoked via dynamic vtable slot 9 from sub_821752C0 thunk. The vtable writer is in the audit-050 unreachability island — there's no static caller chain to hook into.
The fix shape is a synth module analogous to silph_synth.rs (rounds 18-21):
- Synthesize a singleton-like object with the right vtable
- Spawn a guest thread at the right entry with this object as r3
- Let the dispatch chain do the rest
Landing cluster A's analog took four rounds (18-21) and ended at "workers run live but idle" because of missing foreign-pointer fields. Cluster B will face similar challenges.
Sub-round breakdown (estimated 5-8 rounds)
34.α — Probe canary's dispatcher singleton (1 round)
Capture canary's runtime state at sub_821741C8 entry:
- r3 = 0xBCA44C00 (canary tid=6's dispatcher singleton)
- Dump r3..r3+0x80 to identify all fields
- Note vtable address at [r3+0]
WINEDEBUG=-all wine xenia_canary.exe --mute=true --audit_handle_lifecycle=true \
--audit_jit_prolog_pc=0x821741C8 --audit_jit_prolog_r3_bytes=128 \
--audit_jit_prolog_mem_dump=<vtable_va_from_r3+0> \
...
34.β — Probe full vtable layout (1 round)
Read the vtable bytes statically from the PE (canary's [r3+0] IS a static XEX VA — same trick as round 21):
- Read 32-64 slots from PE at file offset = vtable VA - 0x82000000
- Confirm slot 9 = sub_821C7CB8 and vtable+0x24 thunk to sub_821741C8
- Look at all other slots: do any reference deep guest code that needs more init?
Cross-reference each slot's DB reach. If a slot is the dispatcher's own method body, it'll be called from within the chain — needs to exist.
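The offset rule above can be sketched in a few lines of Rust. This is only an illustration under stated assumptions: `image` is assumed to be a flat byte buffer laid out so that file offset = VA - 0x82000000 (as the plan states), slots are assumed to be 4-byte big-endian guest pointers, and the names `read_vtable_slots` and `slot_offset` are hypothetical, not existing helpers in the repo.

```rust
/// Image base taken from the plan's "file offset = vtable VA - 0x82000000" rule.
const IMAGE_BASE: u32 = 0x8200_0000;

/// Byte offset of slot `n` from the vtable base, assuming 4-byte slots.
fn slot_offset(n: u32) -> u32 {
    n * 4
}

/// Reads `count` big-endian 32-bit vtable slots from a flat image buffer.
/// Returns None if the VA is below the image base or the requested range
/// runs past the end of the buffer.
fn read_vtable_slots(image: &[u8], vtable_va: u32, count: usize) -> Option<Vec<u32>> {
    let start = vtable_va.checked_sub(IMAGE_BASE)? as usize;
    let end = start.checked_add(count.checked_mul(4)?)?;
    let bytes = image.get(start..end)?;
    Some(
        bytes
            .chunks_exact(4)
            .map(|w| u32::from_be_bytes([w[0], w[1], w[2], w[3]]))
            .collect(),
    )
}
```

With this shape, slot 9 is element 9 of the returned vector, and `slot_offset(9)` gives the 0x24 displacement mentioned in the checklist.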
34.γ — Skeleton synth + thread spawn (1 round)
Create crates/xenia-kernel/src/silph_ui_synth.rs mirroring silph_synth.rs structure:
pub fn spawn_silph_ui_dispatcher(
    state: &mut KernelState,
    mem: &GuestMemory,
    scheduler: &mut Scheduler,
) -> Result<u32, &'static str> {
    if state.silph_ui_synth_done {
        return Ok(state.silph_ui_synth_ctx);
    }
    // Allocate ~0x100-0x200 bytes for the dispatcher singleton
    let ctx = state.heap_alloc(0x200, 16)?;
    mem.write_zeros(ctx, 0x200);
    // Install static-XEX vtable at [+0]
    mem.write_u32(ctx + 0x00, VTABLE_VA); // discovered in 34.β
    // Other init fields from 34.α dump
    // ...
    // Spawn dispatcher thread at sub_821748F0 with r3=ctx
    scheduler.spawn(SpawnParams{
        entry: 0x821748F0,
        start_context: ctx,
        create_suspended: false,
        ...
    })?;
    state.silph_ui_synth_done = true;
    state.silph_ui_synth_ctx = ctx;
    Ok(ctx)
}
Hook point: first reach of sub_821CB030 in the existing silph factory chain (the call site that should normally trigger this dispatcher's creation in canary).
Add 3-mode env gate: XENIA_SILPH_UI_SYNTH={unset|=suspend|=1}.
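A minimal sketch of how the 3-mode gate could be parsed, for illustration only. The `SynthMode` enum and both function names are hypothetical, and two behaviors here are assumptions rather than anything the plan specifies: unrecognized values fall back to off, and the `suspend` mode is presumed to mean "synthesize but spawn the thread suspended".

```rust
/// Hypothetical representation of the three gate states.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SynthMode {
    /// Variable unset (or unrecognized, by assumption): default behavior.
    Off,
    /// XENIA_SILPH_UI_SYNTH=suspend
    Suspend,
    /// XENIA_SILPH_UI_SYNTH=1
    On,
}

/// Pure parser, kept separate from the environment read so it can be unit-tested.
fn parse_synth_mode(raw: Option<&str>) -> SynthMode {
    match raw {
        Some("1") => SynthMode::On,
        Some("suspend") => SynthMode::Suspend,
        _ => SynthMode::Off,
    }
}

/// Reads the gate from the process environment.
fn synth_mode_from_env() -> SynthMode {
    parse_synth_mode(std::env::var("XENIA_SILPH_UI_SYNTH").ok().as_deref())
}
```

Keeping the parser pure means the opt-in default (unset means unchanged behavior) can be covered by a workspace test without touching the process environment.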
34.δ — Run + diagnose first crash (1 round)
The first run will almost certainly crash on a NULL deref of one of the singleton's fields. Use round 19's pattern:
- Probe at thread entry + early BB heads
- Identify the offset that's accessed
- Compare to canary's value at that offset
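The comparison step can be sketched as a word-by-word diff of two dumps of the singleton. This is a hedged illustration: it assumes both dumps are raw byte slices of the same region (for example the 0x80 bytes captured in 34.α) and that fields are big-endian 32-bit words; `diff_fields` is a hypothetical helper name, not part of the existing probe infrastructure.

```rust
/// Compares two dumps as big-endian 32-bit words and returns
/// (byte offset, our value, canary value) for every word that differs.
/// Trailing bytes beyond the shorter dump, or a partial final word, are ignored.
fn diff_fields(ours: &[u8], canary: &[u8]) -> Vec<(usize, u32, u32)> {
    ours.chunks_exact(4)
        .zip(canary.chunks_exact(4))
        .enumerate()
        .filter_map(|(i, (a, b))| {
            let x = u32::from_be_bytes([a[0], a[1], a[2], a[3]]);
            let y = u32::from_be_bytes([b[0], b[1], b[2], b[3]]);
            (x != y).then_some((i * 4, x, y))
        })
        .collect()
}
```

Each returned tuple is a candidate field to fill; pointer-looking values on the canary side still need to be classified as static XEX addresses or heap addresses before they are copied.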
34.ε..η — Iterate on field fills (2-4 rounds)
Each crash identifies one more required field. Fill it. Re-run. Continue until workers idle (verdict D analog).
34.θ — Producer-side seeding (1 round)
Even with the dispatcher running, work-items may not flow. Per round 32 it's pool 3 that's starved (271 fires in canary). The producers are sub_821CBEA8 / sub_821D24A0 / sub_821CD458 — they may need their own bootstrap. Probe what triggers them in canary.
Verification at each stage
After every commit:
- cargo test --release --workspace (765/765 must pass)
- XENIA_CACHE_PERSIST=1 XENIA_SILPH_UI_SYNTH=1 ./target/release/xenia-rs exec <ISO> -n 50000000 --trace-handles-focus=0x1218,0x1224,0x12a4,0x12ac
- Check:
  - No crash
  - sub_821741C8 fires
  - sub_82450b68 r4=3 fires increase
  - Handle 0x1224 / 0x1218 transition out of NO_SIGNALS_DESPITE_WAITS
  - Eventually: VdSwap > 1, draws > 0
Risk register
- High: dispatcher singleton may require many more fields than the analog WorkerCtx (rounds 18-21 needed 8 KEVENTs + ring + descriptors + index table; UI dispatcher likely has similar scope)
- High: foreign-arena pointers in canary's heap (similar to round 19's [+0x28/+0x2C/+0x30]) may need their own synthesis
- Medium: cluster B's worker may itself spawn threads which need contexts which need... cascading scope
- Low: workspace tests breaking (probe infrastructure is solid)
- Low: existing iterate-2BE work regressing (it's on a separate branch)
Off-ramps
If we hit a wall at any sub-round, the off-ramps are:
- Land the infrastructure as opt-in (rounds 18-21 pattern) and ship cluster A + cluster B both as opt-in env vars
- Drop cluster B entirely and PR the iterate-2BE work to master (production-ready architectural fix)
- Pivot to lockstep diff of inflate function (round 30 hypothesis (i)) if cluster B keeps producing crash-fix layers
Branch plan
New branch: iterate-2BF/silph-ui-synth off iterate-2BF/synthetic-silph-spawn HEAD 40f208e. Each sub-round = 1 commit. All commits opt-in via env var; default behavior unchanged.
When ready to execute
Dispatch with the prompt recommended by the round-33 agent, starting at sub-round 34.α.