xenia-rs

Author	SHA1	Message	Date
MechaCat02	de21c7a544	[iterate-2G] db16cyc spin-hint cooperative yield: unblock title-screen 0x10a0 gate The silph title state machine (tid13) blocked on event 0x10a0, never signaled. Root: the event's producer chain runs on the silph worker (entry 0x821C4AD0, our tid14), which was starved. tid14 shares a HW slot with a guest spinlock/barrier participant (sub_824D1328, entry 0x824D2940) that busy-spins on the db16cyc hint `or r31,r31,r31` (encoding 0x7FFFFB78) at 0x824D140C. Under our round-robin lockstep the spinner consumed its whole block every round and starved the co-located tid14 (only 9 progress hits over 200M instr), so the producer never reached the event-create/duplicate/signal dance the canary oracle performs (handle F80000E8 set by the submitter F8000044 via a duplicated handle). Fix (canary-faithful): recognize the db16cyc spin hint exactly as canary's InstrEmit_orx does (code 0x7FFFFB78 -> DelayExecution) and surface it as a new StepResult::Yield. The scheduler's yield_current() promotes every Ready peer on the slot past STARVE_LIMIT so begin_slot_visit picks one next round, then they reset and the spinner reclaims the slot: fair alternation, no priority inversion, pure function of slot state (deterministic). Result (lockstep, cache-persist, -n 200M): tid14 progresses past its old stall into a real wait; tid13 advances off 0x10a0 to a new event; hub/submitter re-enter their wait loops. imports 280k->592k, packets 124M->164M, swaps 1->2. draws still 0 (the splash's first draw is a further-upstream gate). Determinism preserved (two cold n50m runs byte-identical). n50m golden re-baselined (imports 90296->339766, swaps 1->2; draws unchanged 0). n2m golden unchanged (db16cyc not reached in first 2M). Tests 670/670. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-13 10:38:17 +02:00
MechaCat02	db90ad0f7d	[AUDIT-059 R-D2] Phase D auto-signal POC confirms audit-049 wedge diagnosis Hook NtCreateEvent for the silph::UImpl tid=13 chain (entry=0x821748F0, start_context=0x4024a840, frame-1 LR=0x821CB15C inside sub_821CB030+0x128) and auto-signal the resulting handle after XENIA_SILPH_UI_AUTOSIGNAL_DELAY instructions. Env-gated; default off. SR4 verdict B (partial unwedge): - handle 0x1078 signal_attempts 0->1 - tid=13 Blocked(WaitAny[0x1078]) -> Ready pc=0x824a9108 - ExCreateThread 10 -> 12 (new silph::UImpl tid=14, worker tid=15) - New downstream wedges 0x1084 + 0x1088 - cxx_throw runtime_error on tid=5 inside R26 dispatcher (BST not-registered instance lhs=0x715a7af0) - VdSwap stays 1; no draws (POC is diagnostic, not final fix) Confirms Phase C diagnosis end-to-end. The real signaler must (a) drive NtSetEvent on the silph KEVENT AND (b) register the dispatcher's BST instance upstream; this POC only does (a). Reading-error class #20: ctx.lr at kernel export entry is the thunk wrapper's return slot, NOT the guest caller's post-bl PC. Walk back-chain 1 step to get frames[1].lr. Reading-error class #21: --parallel and lockstep have SEPARATE outer loops in main.rs (run_execution_parallel line 2928 vs run_execution line 2706). Per-round hooks must be wired in BOTH paths. Files: - crates/xenia-cpu/src/scheduler.rs: GuestThread.start_entry/start_context fields + spawn() population + current_thread_entry_and_ctx() helper - crates/xenia-kernel/src/state.rs: AutoSignalPending struct, env-parsed silph_autosignal_delay, pending Vec, last_cycle_hint, set_now_cycle_hint, maybe_register_silph_autosignal (walks back-chain), fire_due_silph_autosignals - crates/xenia-kernel/src/exports.rs: hook in nt_create_event - crates/xenia-app/src/main.rs: fire-site + cycle hint in both outer loops - audit-runs/audit-059-handle-disambiguation/round-D2-autosignal-poc/FINDINGS.md Tests 655/655 green. Default behavior byte-identical when env unset. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-06-11 18:38:38 +02:00
MechaCat02	481591fdb2	[AUDIT-059 R-C1] Phase C: bit-28 setter hypothesis REFUTED via dump-addr Phase A's diagnosis (bit 28 of [0x40d09a40] gets set to exit sub_822F1AA8's loop) is falsified by direct probe + --dump-addr in 4 sub-rounds. Key evidence: - sub_821B55D8 candidate fn fires 0× in ours; sub_824AA858 (XamInputSetState wrapper) fires 0× in canary too — chain is dead code in both engines. - end-of-run dump shows [0x40d09a40+0] = 0x00000021, same as at entry — bit 28 is NEVER set. - bcctrl at PC 0x822F1B4C (sub_822F1AA8+0xA4) fires (LR=0x822F1B50) but the post-bcctrl BB head 0x822F1B50 fires 0× — bcctrl never returns. - sub_82173990 (vtable[0] of singleton at [0x828E1F08]) is the call target; tid=1 wedges inside this 768-byte function on a thread-join to handle 0x1070 (= tid=13's thread handle). - tid=13 (entry=sub_821748F0, ctx=0x4024a840, handle=0x1070) reaches sub_821C4EB0 (silph::UImpl@GamePart_Title) at cycle 1882 → audit-049 cluster IS reached, wedges on handle 0x1078 there. C.2 force-clear POC NOT EXECUTED — would be no-op since bit 28 is never set. Per plan stopping criterion, hand back instead of proceeding blind. Adds reading-error class #19: disasm-pattern-match without runtime verification (Phase A scanned 49 oris-0x1000 sites and declared one the setter without ever observing the bit get set). No xenia-rs source changes. Canary repo also unchanged (config edit reverted clean). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-06-11 17:57:27 +02:00
MechaCat02	52c30d82a7	[AUDIT-059 R-A] Phase A backward-trace: divergence is sub_822F1AA8 loop exit, not factory/registry Round-37 anchor reframe: both engines install the SAME static .rdata vtable 0x820A183C at [0x828E1F08]. Instance VAs differ only because of ε-class allocator divergence (audit-043). vtable bytes byte-identical; the user prompt's "factory/registry" framing was falsified. Phase A walkthrough (rounds A1..A8): - A.1 canary --audit_jit_prolog_pc=0x821741C8: tid=6, r3=0xBCCC4A80 (= inner sub-object of [0x828E1F08]'s singleton), LR=0x822F1D5C (return-from-bctrl inside sub_822F1AA8) - A.2 found tid=6 spawn site sub_821746B0 at PC 0x82174824 spawning entry=sub_821748F0 ctx=BC365700/BC366DA0. sub_822F1AA8 ALSO spawns a second thread (entry=sub_822F1EE0 ctx=BCE24A40) at PC 0x822F1B08 - A.3 sub_822F1AA8 has 2 callers, both in sub_8216EA68 (its sole caller is sub_824AB748 = entry_point) - A.4 ours mirror probe: sub_821746B0 enters, [0x828E2B14] gate passes, ExCreateThread fires returning handle 0x1070 (= tid=13). Ours' tid=13 IS the same logical thread as canary's spawned silph initializer - A.5 canary --audit_jit_prolog_pc=0x821749C0: fires only 2× on short-lived tid=17, tid=26 (the spawned initializers, NOT tid=6) - A.6 canary --audit_jit_prolog_pc=0x822F1AA8: fires 1× on tid=6 with r3=0xBCE24A40 LR=0x8216EE14 (the second sub_822F1AA8 call site) - A.7 canary --audit_jit_prolog_pc=0x824AB748 (entry_point): fires on tid=00000006. CONFIRMS canary's tid=6 = canary's main thread. Verdict: identical call chain entry_point → sub_8216EA68 → sub_822F1AA8 in both engines; same controller (ε-divergent VA, byte-identical fields). Canary's main thread stays in sub_822F1AA8's dispatcher loop firing sub_821741C8 ~1678×/30s. Ours' main thread exits the loop and thread-joins on the spawned initializer (tid=13), which is itself wedged on handle 0x1078 forever. Loop exit is gated by bit 28 of [r30+0] (the controller's flag word). Same value 0x21 at function entry in both engines. Some code between entry and loop check sets bit 28 in ours but not in canary. Mem-watch on 0x40d09a40 shows zero guest stores in ours' 50M parallel run; setter is either a kernel-side store, computed alias, or probe-quantum-elided JIT store. Phase B classification: Class 3a (state-divergence on controller object). The vtable is the same; the controller's bit 28 evolves differently during sub_822F1AA8 setup. Class 4 (synthesis) is now less attractive since we correctly reach the dispatcher with the right inputs; we just exit too soon. Phase C will need either JIT instrumentation to identify the bit-28 setter, or a kernel-side hook to clear bit 28 on entry to the loop check site. Findings notes: - round-A4b-ours-spawn-gate/FINDINGS.md (spawn topology + tid mapping) - round-A8-ours-822F1AA8-trace/FINDINGS.md (full loop structure + bit-28 gate) New reading-error class #18: probe-output anchor misframing (singleton[VA]=X vtable=Y was misread as "Y is canary-only vtable" when Y is the same .rdata vtable in both engines). Branch: iterate-2C/silph-ui-spawn-trace off master @ `229b46c`. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-06-11 17:02:20 +02:00
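
Illustration for de21c7a544 (db16cyc spin-hint yield). The sketch below is not the xenia-rs source; it only shows the two ideas the commit message describes: matching the `or r31,r31,r31` encoding 0x7FFFFB78 and turning it into a yield, then letting starved Ready peers on the same slot run before the spinner reclaims it. The names `StepResult::Yield`, `yield_current` and `STARVE_LIMIT` come from the commit message. Everything else is an assumption made for the example: the `Thread` and `Slot` types, `begin_visit`, the limit value 4, tid 20 for the spinner, and the "first Ready thread wins by default" rule standing in for whatever ordering the real scheduler uses.

```rust
/// Encoding of `or r31,r31,r31`, the db16cyc hint named in the commit message.
const DB16CYC_HINT: u32 = 0x7FFF_FB78;
/// Hypothetical threshold; the real value is not given in the log.
const STARVE_LIMIT: u32 = 4;

#[derive(Debug, PartialEq, Eq)]
enum StepResult {
    Continue,
    Yield,
}

/// Assemble an X-form `or rA,rS,rB` (primary opcode 31, extended opcode 444).
fn encode_or(ra: u32, rs: u32, rb: u32) -> u32 {
    (31 << 26) | (rs << 21) | (ra << 16) | (rb << 11) | (444 << 1)
}

/// Simplified interpreter step: only the spin hint is special-cased.
fn step(instr: u32) -> StepResult {
    if instr == DB16CYC_HINT {
        StepResult::Yield
    } else {
        StepResult::Continue
    }
}

#[derive(Clone, Copy, PartialEq, Eq)]
enum State {
    Ready,
    Blocked,
}

struct Thread {
    tid: u32,
    state: State,
    starve: u32,
}

/// One hardware slot shared by several guest threads (hypothetical layout).
struct Slot {
    threads: Vec<Thread>,
    current: usize,
}

impl Slot {
    fn new(tids: Vec<u32>) -> Self {
        let threads = tids
            .into_iter()
            .map(|tid| Thread { tid, state: State::Ready, starve: 0 })
            .collect();
        Slot { threads, current: 0 }
    }

    /// Promote every Ready peer of the current thread past STARVE_LIMIT.
    fn yield_current(&mut self) {
        let cur = self.current;
        for (i, t) in self.threads.iter_mut().enumerate() {
            if i != cur && t.state == State::Ready {
                t.starve = STARVE_LIMIT + 1;
            }
        }
    }

    /// Pick the thread for the next round: a starved Ready thread if any,
    /// otherwise the first Ready thread. The pick's counter is reset.
    fn begin_visit(&mut self) -> Option<u32> {
        let is_ready = |t: &Thread| t.state == State::Ready;
        let starved = self
            .threads
            .iter()
            .position(|t| is_ready(t) && t.starve > STARVE_LIMIT);
        let pick = starved.or_else(|| self.threads.iter().position(|t| is_ready(t)))?;
        self.threads[pick].starve = 0;
        self.current = pick;
        Some(self.threads[pick].tid)
    }
}
```

With a slot holding a spinner (tid 20) and the worker (tid 14), the picks alternate 20, 14, 20 when the spinner yields after its first visit. The choice depends only on slot state, which is the determinism property the commit message calls out.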

4 Commits
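
Illustration for 52c30d82a7 and 481591fdb2 (the bit-28 gate). A minimal sketch of the arithmetic behind the "oris-0x1000" scan and the 0x21 flag word, assuming "bit 28" is counted from the least significant bit so that it matches the mask an `oris rA,rS,0x1000` would OR in. The helper names `bit28_set` and `clear_bit28` are hypothetical and do not appear in the repository.

```rust
/// `oris` shifts its 16-bit immediate left by 16 before OR-ing it in,
/// so an immediate of 0x1000 yields the mask 0x1000_0000.
const BIT28_MASK: u32 = 0x1000 << 16;

/// True when the controller flag word has bit 28 set (LSB-0 numbering assumed).
fn bit28_set(flag_word: u32) -> bool {
    flag_word & BIT28_MASK != 0
}

/// Hypothetical force-clear of bit 28, the kind of operation a
/// "clear bit 28 at the loop check" hook would perform.
fn clear_bit28(flag_word: u32) -> u32 {
    flag_word & !BIT28_MASK
}
```

Applied to the dumped value 0x00000021, `bit28_set` is false and `clear_bit28` returns the word unchanged, which is consistent with the Phase C note that a force-clear would have been a no-op.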