# Phase C+23 cold-vs-cold result (2026-05-18)

## Outcome: ENGINE FIX LANDED

`addis` sign-extension fix at `xenia-cpu/src/interpreter.rs` resolves D-NEW-2 (ε-class timeout sign-extension on the canary tid=12 → ours tid=7 sister chain). 5 LOC effective. Determinism preserved (3× cold runs byte-identical post-fix).

## Matched-prefix table (vs C+22 baseline)

| chain                          | C+22    | C+23 (fresh) | delta  |
|--------------------------------|---------|--------------|--------|
| canary tid=6 → ours tid=1 main | 104,607 | 104,607      | 0      |
| canary tid=4 → ours tid=11     | 11      | 11           | 0      |
| canary tid=7 → ours tid=2      | 32      | 32           | 0      |
| canary tid=12 → ours tid=7     | 3       | **4**        | **+1** |
| canary tid=14 → ours tid=9     | 41      | 41           | 0      |
| canary tid=15 → ours tid=10    | 16      | 16           | 0      |

## Floating-event absorption counts (fresh c23)

| chain                          | floating_create (c/o) | floating_wait (c/o) |
|--------------------------------|-----------------------|---------------------|
| canary tid=6 → ours tid=1 main | 2 / 0                 | 3 / 0               |
| canary tid=15 → ours tid=10    | 0 / 1                 | 0 / 0               |
| others                         | 0 / 0                 | 0 / 0               |

C+18 absorber engaged on main chain (2 canary handle.create floated) and on tid=15→10 (1 ours handle.create floated). C+21 absorber engaged on main chain (3 canary wait.begin events floated; this canary cold sample took the contended slow path 3 times).

## Cold-stable invariants

- **ours-cold byte-identical (det-fields) across 3 runs**: digest `23cf4c4cbf61a577caa4118ab2308ba6`. Replaces C+22's `e1dfcb1559f987b35012a7f2dc6d93f5` baseline (digest moved due to engine source change). New baseline anchored here.
- **Event count** unchanged: 121,569 ours events (matches C+22).
- **Phase B `image_canonical_sha256` = `ea8d160e9369328a5b922258a92113efb8d7ce3e1a5c12cc521e375985c91c18`**: UNCHANGED. Image-loading path untouched.
- **Engine source change**: `xenia-cpu/src/interpreter.rs::addis` (5 LOC effective, ~25 LOC including comment + commented-out truncation). No `xenia-canary` source changes. No diff-tool changes.
- **Tests**: kernel 204 unchanged; cpu 288 → 291 (3 new regression tests for the addis fix).

## Direct fix-verification at the divergence point

ours-cold post-fix, tid=7 events 0-4:

```
[0] import.call KeWaitForSingleObject
[1] kernel.call KeWaitForSingleObject
[2] handle.create sid=6e3d96c5a52bf429
[3] wait.begin {timeout_ns: -30000000, alertable: false, wait_type: any}
[4] kernel.return return_value=0 status=0x00000000
```

canary-cold, tid=12 events 0-4:

```
[0] import.call KeWaitForSingleObject
[1] kernel.call KeWaitForSingleObject
[2] handle.create sid=c49d8f0ab90401ea (different SID, absorbed)
[3] wait.begin {timeout_ns: -30000000, alertable: false, wait_type: any}
[4] kernel.return return_value=258 status=0x00000102 (TIMEOUT)
```

`timeout_ns: -30000000` MATCHES across engines (was `429466729600` pre-fix).

## New downstream divergence at idx=4 (C+23 → C+24+ target)

The advance reveals the next-class issue at idx=4:

```
canary: [4] kernel.return KeWaitForSingleObject return_value=258 (TIMEOUT)
ours:   [4] kernel.return KeWaitForSingleObject return_value=0 (SUCCESS)
```

Classification: **(A) scheduler-determinism**, same family as the C+20 and C+22 escalations. The monolithic-thread runner in ours does not let the 30 ms timeout window elapse with no signaler, so the wait returns SUCCESS. Two explanations remain open: the event was already signaled at entry, or the wait was implicitly fast-served. Canary's contended scheduler lets the timeout fire. An engine-side fix requires the parallel scheduler-determinism track (multi-session refactor).

## Verification that the fix is NOT diff-tool jitter

Several independent lines of evidence:

1. **Direct ours-cold inspection**: the `wait.begin.timeout_ns` field is read directly from ours-cold.jsonl (no diff-tool interpretation), and it is now -30000000.
2. **Unit tests**: `lis_ori_std_negative_timeout_writes_sign_extended_doubleword` in xenia-cpu asserts the architectural fact directly.
3. **Determinism**: 3× cold runs produce a byte-identical det-fields digest.
   The fix isn't a race that flickered on this one sample.
4. **Phase B image hash unchanged**: the fix is purely behavioral on the JIT layer, not a re-link or image change.

## Cascade outcome

- A=verify canary's timeout read logic: PASS (identical formula).
- B=identify encoding bug class: PASS, (d) sign-extension.
- C=land fix: PASS, 5 LOC + 3 tests.
- D=tid=12→7 advances past 3: PASS (3 → 4).
- E=no regression on main or other sisters: PASS (all preserved).

## Files

- `investigation.md`
- `cold-vs-cold-result.md` (this file)
- `diff-cold-vs-cold.md`
- `re-validation.md`
- `ours-cold.jsonl` / `ours-cold-stdout.log` / `ours-cold-stderr.log`
- `canary-cold-trunc.jsonl` / `canary-cold-stdout.log`
- `canary-binary-cache-pre-wipe.tar.gz` / `canary-xdg-cache-pre-wipe.tar.gz`
- `digest-cold-stable-1.json` / `-2.json` / `-3.json`
- `fix.diff`

## Next-target recommendation

- **C+24 = D-NEW-3** (canary tid=14 → ours tid=9 idx=41): canary calls `XAudioGetVoiceCategoryVolumeChangeMask`; ours calls `RtlEnterCriticalSection`. Likely missing/stubbed XAudio export in ours causing fallback. Independent of scheduler-determinism.
- **Parallel scheduler-determinism track**: tackle the C+20/C+22 + the newly-surfaced C+23-idx=4 family at the root via a per-CS-pointer expected-contention inference layer. Multi-session.
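
For reference, the arithmetic behind the D-NEW-2 fix can be sketched as below. This is a minimal illustration and not the code in `xenia-cpu/src/interpreter.rs`. All four helper functions are hypothetical. The immediates `0xFFFB` / `0x6C20` are assumed values chosen because they reproduce the observed numbers. The 100 ns tick unit is assumed from the NT-style kernel timeout convention. `addis_truncated` is only a guess at how the pre-fix behaviour could be modelled (result cut to 32 bits, then zero-extended).

```rust
// Sketch only: hypothetical helpers, not the xenia-cpu implementation.

/// PowerPC `addis rD, rA, SIMM`: rD = (rA|0) + EXTS(SIMM || 0x0000).
/// `ra` is the already-resolved operand (0 when rA is r0, as in `lis`).
fn addis_sign_extended(ra: u64, simm: i16) -> u64 {
    // Widen the signed immediate to 64 bits first, then shift it into
    // the upper halfword so the sign fills bits 32..63 as well.
    ra.wrapping_add(((simm as i64) << 16) as u64)
}

/// Assumed model of the pre-fix behaviour: the sum is cut to 32 bits
/// and zero-extended, which loses the sign of a negative immediate.
fn addis_truncated(ra: u64, simm: i16) -> u64 {
    (addis_sign_extended(ra, simm) as u32) as u64
}

/// PowerPC `ori rA, rS, UIMM`: OR the zero-extended immediate into the
/// low halfword.
fn ori(rs: u64, uimm: u16) -> u64 {
    rs | uimm as u64
}

/// Assumption: the guest timeout is counted in 100 ns ticks.
fn ticks_to_ns(ticks: i64) -> i64 {
    ticks * 100
}

fn main() {
    // lis r, 0xFFFB ; ori r, r, 0x6C20  (assumed encoding of -300000)
    let hi = 0xFFFB_u16 as i16;
    let lo: u16 = 0x6C20;

    let fixed = ori(addis_sign_extended(0, hi), lo) as i64;
    let broken = ori(addis_truncated(0, hi), lo) as i64;

    println!("sign-extended: {} ticks = {} ns", fixed, ticks_to_ns(fixed));
    println!("truncated:     {} ticks = {} ns", broken, ticks_to_ns(broken));
}
```

Under these assumptions the sign-extended path yields -300000 ticks (-30000000 ns) and the truncated path yields 4294667296 ticks (429466729600 ns), which are the post-fix and pre-fix `timeout_ns` values recorded in this report.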