# Iterate 2.N: Clean re-baseline post-2.F/2.H/2.L/2.M (writer report)

**Date:** 2026-05-28.
**LOC delta:** engine **0**, canary **0**, tooling **0**. Pure recon, no source modifications.
**Tests:** N/A (no code change).
**Cascade:** N/A: recon class per tripstone #39; #40 explicitly NOT claiming any cascade fix.

## Headline

**BASELINE-CLEAN-DIVERGENCE-CHARACTERIZED.** Categorized first-divergence detected on the main chain (canary tid=6 → ours tid=1) at matched-prefix position **105,286**, **bit-identical to the Phase C+23 baseline** (was 105,286 before yesterday's fixes; remains 105,286 today). Engine fixes 2.F (VdSwap drain) + 2.H (vA0000000 bucket) + tooling fix 2.L (categorized harness) + 2.M (always-on exit-state.json) all verified operating as designed. ours wedge geometry **bit-identical to 2.K/2.M** (`exit-thread-state.json` diff-clean vs 2.M). One previously-hidden `[return_value mismatch]` surfaces on the sister chain (canary tid=12 → ours tid=7) at idx=4: `KeWaitForSingleObject` returns `258` (STATUS_TIMEOUT) on canary vs `0` (SUCCESS) on ours, a return-value class divergence that the categorized harness now flags explicitly.

## Mode

Pure measurement, ZERO LOC change. Invocation (identical to 2.J/2.K/2.M):

```
XENIA_CACHE_WIPE=1 timeout 600 ./target/release/xenia-rs exec \
  -n 50000000 --quiet \
  --phase-a-event-log audit-runs/iterate-2N-rebaseline/ours-cold.jsonl \
  ""
```

XDG cache directory `~/.local/share/xenia-rs/cache/` empty at run start (belt-and-braces; `XENIA_CACHE_WIPE=1` already redirects to per-pid tmpdir).

Canary trace **reused** from `phase-c23-keWait-timeout-encoding/canary-cold-trunc.jsonl` (cold-cache capture from 2026-05-18; matches the canonical Phase C+23 baseline used by 2.J/2.L). No fresh canary run needed.

Categorized harness invocation:

```
python3 tools/diff-events/diff_events.py \
  --canary audit-runs/phase-c23-keWait-timeout-encoding/canary-cold-trunc.jsonl \
  --ours audit-runs/iterate-2N-rebaseline/ours-cold.jsonl \
  --out audit-runs/iterate-2N-rebaseline/diff-report.md
```

## Infrastructure gate verification

| infrastructure | expectation | observed | result |
|---|---|---|---|
| **2.F** VdSwap drain (900ms → 1ms) | host_ns at idx 105,285 ≈ 0.67s (not 1.6s) | 0.670s | **PASS** |
| **2.H** vA0000000 physical heap bucket | `0xbe8cbb3c`, `0xbd184a40`, `0xbc6c5640` thread ctx_ptrs | identical 3 vA-bucket ctx_ptrs | **PASS** |
| **2.L** categorized tags surface | `[return_value mismatch]` / `[status mismatch]` / `[args_resolved.path mismatch]` greppable | 1 `[return_value mismatch]` on sister chain tid=12→7 | **PASS** |
| **2.L** raw idx surfaced both sides | `canary raw tid_event_idx=N, ours raw tid_event_idx=M` in report | present on every divergence line | **PASS** |
| **2.M** `exit-thread-state.json` auto-emit | sibling file in trace dir, no flag needed | 9651 bytes, 13 threads + 10 wedge entries | **PASS** |
| **2.M** stderr emission notice | `exit-thread-state: wrote 13 thread(s), 10 wedge entr(ies)` | identical line emitted | **PASS** |

All four infrastructure pieces working as designed. No regression.

## Cascade questions (recon-only, no fix claim)

### (a) Has the matched prefix grown post-fixes?

**No: matched prefix is bit-identical to Phase C+23 baseline at 105,286.** This is the same prefix length 2.J/2.L/2.M produced. The cache-wipe + physical-heap + VdSwap-drain fixes did NOT advance the matched-prefix length on the main chain; they shifted *what's running* within the prefix (cache-rebuild tid=4 from 160 → 2,075 events, wedge PC geometry from 1.7s spin → 0.7s natural-end) but did not extend it.
The divergence at the post-VdSwap control-flow boundary (`VdGetCurrentDisplayGamma` canary vs `KeAcquireSpinLockAtRaisedIrql` ours) was NOT what cache-wipe / heap-bucket / VdSwap-drain were addressing. Conclusion: Phase C+23 cap remains the next-frontier on the main chain.

### (b) Is the first divergence still at idx 102424?

**No: 102,424 (the NtQueryFullAttributesFile cache-probe SUCCESS/NO_SUCH_FILE inversion) is CLOSED. Main chain advances to 105,286.** All 8 ours-side `NtQueryFullAttributesFile` cache-probe returns now equal `0xc000000f` (STATUS_NO_SUCH_FILE), bit-aligned with canary's cold-cache returns (verified by enumerating every `cache:\*` probe in ours-cold.jsonl). The 2.J finding holds: cache-state parity restored, cache-probe inversion absent, harness correctly advances to the next-downstream divergence.

### (c) What is the new first divergence's category + signature?

**Main chain (canary tid=6 → ours tid=1) at matched-prefix 105,286** (canary raw idx=105298, ours raw idx=105286): **`payload.ord` mismatch, NOT a categorized return_value / status / args case.** Canary fires `import.call VdGetCurrentDisplayGamma` (ord=441) immediately after `kernel.return VdSwap`; ours fires `import.call KeAcquireSpinLockAtRaisedIrql` (ord=77) at the same position. Different functions called from the same matched-prefix tail = control-flow branch divergence inside the post-VdSwap guest code.

Pre-context (last 5 matching events): `VdGetSystemCommandBuffer` (call+return) → `VdSwap` (import.call, kernel.call, kernel.return). After `kernel.return VdSwap`, the two engines branch.

**Sister chain (canary tid=12 → ours tid=7) at idx=4** (FIRST iteration where 2.L's category tag actually fires): **`[return_value mismatch] kernel.return name=KeWaitForSingleObject: canary=258 ours=0`.** Canary returns `STATUS_TIMEOUT` (`0x102` = 258); ours returns `SUCCESS` (0).
Pre-context (idx 0-3 match exactly): import.call + kernel.call `KeWaitForSingleObject` → `handle.create` (different SIDs: canary `c49d8f0ab90401ea` vs ours `9559797117e919f0`, but absorbed by Phase C+18 cross-tid SID matching) → `wait.begin` with `timeout_ns=-30000000` (30ms relative wait, IDENTICAL on both sides) → divergent return. This is the AUDIT-069 / phase-C+23 KeWaitForSingleObject timeout-encoding family but at a NEW position the categorized harness now exposes. ours's wait returns SUCCESS where canary times out, implying ours's wait object is signaled within the 30ms window where canary's is not, the opposite of the audio underrun class (#34/#35). Worth investigating in next iterate as a new lead.

### (d) Does ours's exit-state show same 5 blocked tids at PC=0x824ac578?

**Yes: bit-identical to 2.M.** `diff -q iterate-2N-rebaseline/exit-thread-state.json iterate-2M-exit-state-dump/exit-thread-state.json` returns silent (no differences). 13 alive threads, 10 wedge entries. Blocked tids at PC `0x824ac578`: **tid 1, 13, 4, 5, 3** (same 5 as 2.K/2.M). Wedge map:

```
tid=1  → Thread(id=13)            (handle 0x000012c8, signaler=13 → circular)
tid=13 → Event(sig=false)         (handle 0x000012d0, signaler=null)
tid=4  → Semaphore(0/2147483647)  (handle 0x00001028, signaler=null = AUDIT-069 work-sem)
tid=5  → Event(sig=false)         (handle 0x000012e4, signaler=null)
tid=3  → Event(sig=false)         (handle 0x00001020, signaler=null)
tid=11 → Event(sig=false) × 2     (handles 0x828a3244, 0x828a3220)
tid=2  → Event(sig=false)         (handle 0x8287093c)
tid=8  → Event(sig=false)         (handle 0x000010ec)
tid=8  → Semaphore(0/2147483647)  (handle 0x000010d8)
```

Wedge geometry stable across 2.M ↔ 2.N (deterministic).

### (e) ours's thread set vs canary at same wallclock: what is missing?

ours's 10 thread.create entry_pcs:

```
0x82181830, 0x8245a5d0, 0x82450a28, 0x82457ef0, 0x824cd458,
0x822f1ee0, 0x824d2878, 0x824d2940, 0x82178950, 0x821748f0
```

Canary's spawns up to canary host_ns ≤ 1.698s (matched-prefix tail +100ms slack): the **SAME 8 entry_pcs in the SAME order** (ours's 9th + 10th spawns happen slightly past the matched-prefix-tail wallclock in ours, but canary spawns those same two at host_ns=1.897s/1.902s, also past the matched-prefix tail). At the matched-prefix boundary the thread sets are bit-identical entry-pc-wise.

Canary diverges by spawning **8 additional threads** in the full 97s capture window:

```
0x821c4ad0 @ 1.924s tid=17
0x822c6870 @ 1.928s tid=18 (× 2)
0x824563e0 @ 2.050s tid=6 (spawned-by)
0x82170430 @ 2.064s tid=6
0x823dde30 @ 2.082s tid=6
0x823ddb50 @ 2.083s tid=6 (× 2)
```

These are the canary-only `sub_825070F0` worker fan-out family + `0x821c4ad0` renderer + `0x822c6870` audio classes that the AUDIT-049/2.K/2.M lineage documents; ours never reaches the install epoch that spawns them because ours wedges/budget-ends at host_ns=1.008s (50M-instr budget cap), while the install epoch fires on canary at host_ns≈9.4s per AUDIT-068.
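
The entry_pc comparison described above can be reproduced with a few lines of Python. This is a minimal sketch, not part of the harness: the function name `thread_set_gap` is hypothetical, the entry_pc values are copied from this report, and it assumes that `(× 2)` means the entry_pc was spawned twice and that canary's first ten spawns follow ours's order.

```python
from collections import Counter

# entry_pc values copied from this report.
OURS = [0x82181830, 0x8245a5d0, 0x82450a28, 0x82457ef0, 0x824cd458,
        0x822f1ee0, 0x824d2878, 0x824d2940, 0x82178950, 0x821748f0]
# Assumption: "(x 2)" rows are listed twice, giving 8 canary-only spawns.
CANARY_EXTRA = [0x821c4ad0, 0x822c6870, 0x822c6870, 0x824563e0,
                0x82170430, 0x823dde30, 0x823ddb50, 0x823ddb50]
CANARY = OURS + CANARY_EXTRA


def thread_set_gap(canary_pcs, ours_pcs):
    """Compare spawn lists keyed on entry_pc (never on raw tid integers).

    Returns (shared_prefix_len, canary_only, ours_only), where the last two
    are Counters so repeated spawns of one entry_pc are kept.
    """
    shared = 0
    for c, o in zip(canary_pcs, ours_pcs):
        if c != o:
            break
        shared += 1
    canary_only = Counter(canary_pcs) - Counter(ours_pcs)
    ours_only = Counter(ours_pcs) - Counter(canary_pcs)
    return shared, canary_only, ours_only


shared, canary_only, ours_only = thread_set_gap(CANARY, OURS)
print("shared prefix:", shared)
print("canary-only spawns:", sum(canary_only.values()))
print("ours-only spawns:", sum(ours_only.values()))
```

Keying on entry_pc rather than tid follows tripstone #28: raw tid integers are not stable across engines, while entry points are.
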
**Thread-set gap is the same as documented; no new missing-thread class surfaced.**

## Comparison table: Phase C+23 baseline vs 2.N

| metric | Phase C+23 (2.J/2.L) | 2.N | delta |
|---|---|---|---|
| Main-chain matched prefix | 105,286 | **105,286** | **0** (bit-identical) |
| Main-chain first divergence kind | `payload.ord` (import.call) | `payload.ord` (import.call) | UNCHANGED |
| Main-chain divergence: canary fn | `VdGetCurrentDisplayGamma` (ord 441) | `VdGetCurrentDisplayGamma` (ord 441) | UNCHANGED |
| Main-chain divergence: ours fn | `KeAcquireSpinLockAtRaisedIrql` (ord 77) | `KeAcquireSpinLockAtRaisedIrql` (ord 77) | UNCHANGED |
| Cache-probe inversions (NtQueryFullAttributesFile) | 0 (closed by 2.I/2.J) | **0** | UNCHANGED |
| ours total events | 121,569 (2.J/2.M) | **121,569** | **0** (bit-identical) |
| ours thread.create count | 10 (2.J/2.M) | **10** | **0** (bit-identical) |
| ours wedge map size | 10 entries (2.M) | **10** | bit-identical to 2.M |
| ours blocked tids at PC=0x824ac578 | {1,13,4,5,3} (2.M) | **{1,13,4,5,3}** | bit-identical to 2.M |
| Categorized `[return_value mismatch]` count | n/a (pre-2.L) | **1** (sister chain tid=12→7) | newly visible |
| Categorized `[status mismatch]` count | n/a | 0 | n/a |
| Categorized `[args_resolved.* mismatch]` count | n/a | 0 | n/a |
| `exit-thread-state.json` auto-emit | n/a (pre-2.M) | **YES** (no flag) | infrastructure-new |

## Confidence

- **HIGH** that ours's 2.N trace is deterministic vs 2.M (121,569 events bit-equal payload-wise; only host_ns and post-divergence guest_cycle differ on 6 of 121,569 lines).
- **HIGH** that infrastructure 2.F/2.H/2.L/2.M all operate as designed (gate table all-PASS).
- **HIGH** that matched-prefix length is 105,286 (categorized harness output explicit, raw idx printed on both sides per 2.L's reading-error #41 closure).
- **HIGH** that wedge geometry is unchanged from 2.M (bit-identical `exit-thread-state.json`).
- **HIGH** that the main-chain divergence is a `payload.ord` (not a return_value / status / args) class, i.e., the categorized harness correctly does NOT misclassify it.
- **MEDIUM-HIGH** that the sister-chain `[return_value mismatch]` at tid=12→7 idx=4 (KeWaitForSingleObject 258 vs 0) is a NEW finding worth investigating. The categorized harness made this visible at-a-glance for the first time. Pre-2.L it would have surfaced as a generic `payload.return_value` line, not greppable as a return-value class.

## Tripstone audit

- **#28** (cross-engine tid stability): comparisons keyed on entry_pc. Main chain identified by (canary tid=6, ours tid=1); these are stable cross-engine identities established by the harness's alignment, not by raw tid integers. Wedge map intra-run tids acceptable (ours-only).
- **#39** (composite progression): recon class, NO progression claim made. VdSwap count UNCHANGED (1), draw count UNCHANGED (0).
- **#40** (single-keystone framing): explicitly NOT proposing any fix. This iterate verifies prior fixes are clean; it does NOT assert any one-step cascade unblock.
- **#41** (silent test-harness state leak): CLOSED at the harness output level by 2.L. Verified empirically: categorized tags emit on return-value mismatch (sister chain tid=12→7 idx=4), and raw per-tid idx surfaced on both sides of every divergence.
- **#42** (Phase-A blind to blocked-forever waits): CLOSED at the output level by 2.M. Verified empirically: `exit-thread-state.json` auto-emitted with full 13-thread snapshot + 10-entry wedge map. No flag required; no manual diag dump needed.

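
To illustrate the distinction this report relies on (a categorized `[return_value mismatch]` versus an uncategorized `payload.ord` mismatch), here is a minimal sketch of how one pair of aligned events could be classified. It is NOT the code in `tools/diff-events/diff_events.py`: the function name `categorize` and the event shape (a dict with `kind` and `payload` keys, and an optional `args_resolved` dict inside the payload) are assumptions made for illustration only.

```python
def categorize(canary, ours):
    """Return a tag for the first differing field, or None if the events match.

    Hypothetical event shape: {"kind": str, "payload": {...}}.
    Structural differences (kind, ord) are checked first so that two
    different functions are never reported as a return-value or args case.
    """
    if canary.get("kind") != ours.get("kind"):
        return "kind mismatch"
    cp = canary.get("payload", {})
    op = ours.get("payload", {})
    if cp.get("ord") != op.get("ord"):
        return "payload.ord mismatch"  # control-flow class, left uncategorized
    for field in ("return_value", "status"):
        if cp.get(field) != op.get(field):
            return f"[{field} mismatch]"
    ca = cp.get("args_resolved", {})
    oa = op.get("args_resolved", {})
    for key in sorted(set(ca) | set(oa)):
        if ca.get(key) != oa.get(key):
            return f"[args_resolved.{key} mismatch]"
    return None


# The two divergences from this report, re-expressed in the assumed shape.
sister = categorize(
    {"kind": "kernel.return",
     "payload": {"name": "KeWaitForSingleObject", "return_value": 258}},
    {"kind": "kernel.return",
     "payload": {"name": "KeWaitForSingleObject", "return_value": 0}},
)
main = categorize(
    {"kind": "import.call",
     "payload": {"name": "VdGetCurrentDisplayGamma", "ord": 441}},
    {"kind": "import.call",
     "payload": {"name": "KeAcquireSpinLockAtRaisedIrql", "ord": 77}},
)
print(sister)
print(main)
```

Checking `ord` before the categorized fields is the design choice that keeps the main-chain branch divergence from being mislabeled as an argument or return-value case.
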
## Next-iterate recommendation (single sentence, no fix proposal)

Two clean leads with the new visibility, in priority order: **(1)** the sister-chain `[return_value mismatch]` at tid=12→7 idx=4 (KeWaitForSingleObject canary=258 STATUS_TIMEOUT vs ours=0 SUCCESS) is brand-new actionable data the categorized harness uncovered, worth a ~0-LOC investigation iterate to localize which wait object differs and why ours's signaler races canary's by < 30ms; **(2)** the main-chain post-VdSwap branch at 105,286 is the same blocker as Phase C+23 and remains the strategic target, but is downstream of the install-epoch gap and likely needs the longer-budget replay (`-n 500000000`) plus the install-chain investigation already outlined in 2.K/2.J writer-reports.

## Artifacts

Under `xenia-rs/audit-runs/iterate-2N-rebaseline/`:

- `ours-cold.jsonl` (Phase-A trace, 121,569 events, ~28MB, payload bit-equal to 2.J/2.M)
- `ours-cold.stdout.log` (empty, quiet mode)
- `ours-cold.stderr.log` (single line: 2.M emission notice)
- `exit-thread-state.json` (13 threads + 10 wedge entries; bit-equal to 2.M's)
- `diff-report.md` (categorized harness output: 4 first-divergence blocks, 1 `[return_value mismatch]` tag, all with raw idx both sides)
- `writer-report.md` (this file)

Canary trace REUSED (not re-captured): `xenia-rs/audit-runs/phase-c23-keWait-timeout-encoding/canary-cold-trunc.jsonl` (132MB, 565,773 events, cold-cache 2026-05-18 capture).
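
The event and tag counts quoted in the artifact list can be re-checked independently of the harness. The following is a minimal sketch under two assumptions: that the `.jsonl` trace holds one JSON object per line, and that category tags appear verbatim in `diff-report.md`. The helper names `count_events` and `count_tag` are hypothetical, and the demo runs on throwaway files rather than the real artifacts.

```python
import json
import tempfile
from pathlib import Path


def count_events(jsonl_path):
    """Count non-empty lines, failing loudly if any line is not valid JSON."""
    n = 0
    with open(jsonl_path, encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line:
                json.loads(line)  # raises ValueError on a corrupt line
                n += 1
    return n


def count_tag(report_path, tag):
    """Count literal occurrences of a category tag in a report file."""
    return Path(report_path).read_text(encoding="utf-8").count(tag)


# Demo on synthetic stand-ins for ours-cold.jsonl and diff-report.md.
with tempfile.TemporaryDirectory() as tmp:
    trace = Path(tmp) / "trace.jsonl"
    report = Path(tmp) / "report.md"
    trace.write_text('{"kind": "import.call"}\n{"kind": "kernel.return"}\n\n',
                     encoding="utf-8")
    report.write_text("[return_value mismatch] kernel.return\n", encoding="utf-8")
    demo_events = count_events(trace)
    demo_tags = count_tag(report, "[return_value mismatch]")

print(demo_events, demo_tags)
```

Pointed at the real files, the report's own numbers (121,569 events in `ours-cold.jsonl`, one `[return_value mismatch]` tag in `diff-report.md`) are the values to compare against.
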