Source changes (dormant parity infra, retained from iterate 2.AI/2.AO):
- xenia-kernel/exports.rs: nt_create_event manual_reset polarity + related event wiring
- xenia-gpu/mmio_region.rs: D1MODE_VBLANK_VLINE_STATUS hardcode parity

Also lands the audit-runs/ analysis notes (.md/.txt/.json digests) for the iterate 2.x VSync/0x10e8/0x1004 wedge investigation. Raw trace dumps (.jsonl/.gz/.csv/.stdout) and agent worktrees (.claude/) are gitignored as regenerable local artifacts; see memory + HANDOFF for the running findings.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Phase C+17 cold-vs-cold result (2026-05-14)
Matched-prefix table
| canary_tid | ours_tid | C+16 | C+17 | delta | first_divergence_at | kind |
|---|---|---|---|---|---|---|
| 6 | 1 | 102,171 | 102,553 | +382 | 102,553 | NtDuplicateObject no handle.create (NEW-1) |
| 4 | 11 | 8 | 11 | +3 | — | no divergence in 11 events (ours stalls) |
| 7 | 2 | 30 | 32 | +2 | — | no divergence in 32 events |
| 12 | 7 | 2 | 3 | +1 | 3 | timeout_ns differs in wait.begin (NEW-2) |
| 14 | 9 | 2 | 41 | +39 | 41 | unrelated XAudioGetVoiceCategoryVolumeChangeMask |
| 15 | 10 | 16 | 2 | -14 | 2 | ordering: ours emits handle.create on first thread-touch of shared dispatcher (NEW-3) |
Main chain advanced +382 (D-2/D-3/D-4 root cause resolved). 4 of 5 sister
chains advanced. The tid=15→10 chain regressed by 14 events due to a
cross-thread-caching ordering side-effect (see broad-impact.md / NEW-3); the
underlying state alignment is the SAME root cause, so the regression is
"observation-side" — canary's GetNativeObject is process-global, so the
adoption happens on whichever thread touches the dispatcher first.
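
As a rough illustration of how the matched-prefix and first-divergence columns relate, here is a minimal sketch. The `Event` struct and `matched_prefix` function are hypothetical simplifications, not the diff tool's actual types; real events carry more fields, and the real tool applies policies such as SID skipping that are not modeled here.

```rust
/// Hypothetical, simplified trace event (illustration only).
#[derive(Debug, Clone, PartialEq)]
pub struct Event {
    pub kind: String, // e.g. "import.call", "handle.create"
    pub name: String, // e.g. "NtDuplicateObject"
}

/// Returns (matched_prefix_len, first_divergence_index).
/// The index is `None` when one stream simply ends before any mismatch,
/// which corresponds to the "no divergence in N events" rows.
pub fn matched_prefix(canary: &[Event], ours: &[Event]) -> (usize, Option<usize>) {
    // Count leading positions where both streams hold equal events.
    let matched = canary
        .iter()
        .zip(ours.iter())
        .take_while(|(c, o)| c == o)
        .count();
    // A divergence exists only if both streams still have an event at `matched`.
    let diverged = matched < canary.len() && matched < ours.len();
    (matched, if diverged { Some(matched) } else { None })
}
```

With zero-based indices, a matched prefix of N events and a first divergence at index N are the same number, which is why the two columns agree in the rows that do diverge.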
New first divergence on main (idx=102,553)
canary: [102551] import.call NtDuplicateObject
ours: [102551] import.call NtDuplicateObject
canary: [102552] kernel.call NtDuplicateObject
ours: [102552] kernel.call NtDuplicateObject
canary: [102553] handle.create sid=df686b147b291902 (object_type=1)
ours: [102553] kernel.return NtDuplicateObject
canary: [102554] kernel.return NtDuplicateObject
ours: [102554] import.call RtlEnterCriticalSection
Canary's NtDuplicateObject_entry calls ObjectTable::DuplicateHandle, which
fires AddHandle for the new slot and so emits handle.create. Our
nt_duplicate_object short-circuits via handle aliasing (AUDIT-062's
dup_id=source_id design) and does NOT emit a new handle.create. This is
D-NEW-1 (HIGH), the first C+18 target.
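
The difference between the two duplication strategies can be sketched as follows. This is a toy model under stated assumptions: `HandleTable`, `duplicate_aliased`, and `duplicate_new_slot` are hypothetical names, the `events` vector stands in for the trace emitter, and the handle increment of 4 is arbitrary. It is not the actual xenia-kernel or canary code.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified handle table (illustration only).
pub struct HandleTable {
    next_handle: u32,
    slots: HashMap<u32, u64>, // handle -> object id
    pub events: Vec<String>,  // stand-in for the trace emitter
}

impl HandleTable {
    pub fn new() -> Self {
        HandleTable { next_handle: 0x1000, slots: HashMap::new(), events: Vec::new() }
    }

    /// Allocates a fresh slot and records a handle.create event.
    pub fn add_handle(&mut self, object_id: u64) -> u32 {
        let h = self.next_handle;
        self.next_handle += 4; // arbitrary step for the sketch
        self.slots.insert(h, object_id);
        self.events.push(format!("handle.create handle={h:#x}"));
        h
    }

    /// Aliasing strategy: hand back the source handle itself.
    /// No new slot is created, so no event is recorded.
    pub fn duplicate_aliased(&self, source: u32) -> Option<u32> {
        self.slots.contains_key(&source).then_some(source)
    }

    /// New-slot strategy: the duplicate gets its own slot via add_handle,
    /// so a handle.create event is recorded for it.
    pub fn duplicate_new_slot(&mut self, source: u32) -> Option<u32> {
        let object_id = *self.slots.get(&source)?;
        Some(self.add_handle(object_id))
    }
}
```

In this model only the new-slot path produces an event between the call and the return, which is the shape of the canary trace at idx=102,553.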
Acceptance gates
- Gate 1 (default-off digest): PASS. 3× reproducible at `e1dfcb1559f987b35012a7f2dc6d93f5` (unchanged from C+13/C+15-α/C+16 baseline). The fix is observation-only at the digest level; the new shadow-handle refcount entries do not feed back into guest behavior inside the 50M-instruction window.
- Gate 2 (cvar-on emit): PASS. Ours 121,544 events (was 121,537 in C+16, +7 from new lazy `handle.create` emits in the main chain bring-up); canary 3,059,463 events in ~90s. Both JSONL parse cleanly.
- Gate 3 (diff tool): PASS. Diff tool produces 6-chain report with the new SID-skip semantics for `wait.begin.handles_semantic_ids`.
- Gate 4 (cold-vs-cold): PASS. Main matched prefix advances 102,171 → 102,553 (+382). 4 of 5 sister chains advance; 1 minor regression on tid=15→10 (NEW-3, observation-side).
- Gate 5 (build clean): PASS. `cargo build --release` clean (1 pre-existing dead_code warning unrelated).
- Gate 6 (tests): PASS. 186 → 191 (added 5 new lifecycle tests for `ensure_dispatcher_object`; all pass + entire workspace green).
- Gate 7 (Phase B image hash): PASS. `image_loaded_sha256=ea8d160e9369328a5b922258a92113efb8d7ce3e1a5c12cc521e375985c91c18` (unchanged).
- Gate 8 (event-log determinism): PASS. `handle.create` event stream (post-strip of `host_ns`) is bit-identical across 3 cold runs: md5 `0bd91b4c61dea52d72859e7d9c3541ba`.
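
The Gate 8 check (strip `host_ns`, then compare digests across runs) could look roughly like the sketch below. Several things here are assumptions: the events are taken to be flat, compact JSON lines with a numeric `host_ns` field, the helper names are hypothetical, and the standard library's `DefaultHasher` is used as a stand-in because MD5 is not available without an external crate. A real tool would use a JSON parser and the actual MD5 digest.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

/// Removes a `"host_ns":<number>` field from one flat, compact JSON line.
/// Assumes no whitespace after the colon and no nesting (sketch only).
pub fn strip_host_ns(line: &str) -> String {
    let key = "\"host_ns\":";
    let Some(start) = line.find(key) else {
        return line.to_string(); // field absent: line is unchanged
    };
    let after = &line[start + key.len()..];
    // The numeric value runs up to the next ',' or '}'.
    let value_len = after
        .find(|c: char| c == ',' || c == '}')
        .unwrap_or(after.len());
    let mut rest = &after[value_len..];
    let mut head = &line[..start];
    if rest.starts_with(',') {
        rest = &rest[1..]; // field was not last: drop its trailing comma
    } else if head.ends_with(',') {
        head = &head[..head.len() - 1]; // field was last: drop the preceding comma
    }
    format!("{head}{rest}")
}

/// Digest of a whole event stream with host timestamps removed.
/// DefaultHasher::new() uses fixed keys, so equal input gives equal output.
pub fn stream_digest<'a>(lines: impl IntoIterator<Item = &'a str>) -> u64 {
    let mut hasher = DefaultHasher::new();
    for line in lines {
        hasher.write(strip_host_ns(line).as_bytes());
        hasher.write_u8(b'\n');
    }
    hasher.finish()
}
```

Two cold runs whose events differ only in `host_ns` then produce the same digest, while any change to another field changes it.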
Sister-chain analysis
All 5 sister chains' first divergences are no longer "wait.begin with SID=0":
- tid=4→11: was `KeWaitForMultipleObjects` at idx=8 with empty SIDs; now goes 11 events deep with NO divergence (ours stalls, but for reasons unrelated to D-2/D-3/D-4).
- tid=7→2: was `KeWaitForSingleObject` at idx=30 with SID=0; now 32 events with NO divergence.
- tid=12→7: was at idx=2 with SID=0; now idx=3. The `handle.create` matches (SID skipped per diff-tool policy); the divergence is now a `timeout_ns` mismatch (-30000000 vs 429466729600), a real game-side wait-quantum mismatch.
- tid=14→9: was at idx=2 with SID=0; now idx=41, reaching a real `XAudioGetVoiceCategoryVolumeChangeMask` divergence (sister-chain audio export the boot doesn't reach in ours).
- tid=15→10: was at idx=16 (no divergence in 16 events); now diverges at idx=2 because ours emits `handle.create` on this thread's first touch of a globally-shared semaphore dispatcher at `0x828a3230`, while canary emitted it earlier on another thread. Observation-side ordering issue; underlying state model is the same. See NEW-3 below.
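
One arithmetic observation about the two `timeout_ns` values in the tid=12→7 chain may be worth checking before C+18. NT-style wait timeouts are expressed in 100 ns ticks, with negative values meaning relative waits. -30000000 ns is -300000 ticks, and 429466729600 equals (2^32 - 300000) × 100, which is what results if the low 32 bits of -300000 are read as unsigned before scaling to nanoseconds. The sketch below only demonstrates that relationship; the function names are hypothetical, and it does not establish which side performs such a conversion or whether that is the actual cause.

```rust
/// Converts a signed count of 100 ns ticks to nanoseconds.
pub fn ticks_to_ns(ticks: i64) -> i64 {
    ticks * 100
}

/// Same scaling, but first keeps only the low 32 bits of the tick count
/// and treats them as unsigned (zero-extended back to 64 bits).
pub fn ticks_to_ns_low32_unsigned(ticks: i64) -> i64 {
    let low32 = ticks as u32; // truncating cast: keeps the low 32 bits
    (low32 as i64) * 100      // u32 -> i64 zero-extends
}
```

Feeding -300000 ticks through each function reproduces the two values reported in the divergence.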
Refcount leak risk audit
The fix initializes state.handle_refcount[ptr] to 1 for each first-touch shadow.
Three concerns and mitigations:
- Leak risk: no code path currently destroys these shadows (`ensure_dispatcher_object` adoptions). Canary's design has the same property: `GetNativeObject`-synthesized `XObject`s survive until process exit. No leak relative to canary's behavior.
- Double-bump risk: the early-return guard at the top of `ensure_dispatcher_object` (`state.objects.contains_key(&ptr)`) ensures the refcount entry is initialized exactly once per pointer. Test `ensure_dispatcher_object_is_idempotent_on_repeated_touch` verifies this.
- Refcount underflow risk: if a future change wires `handle.destroy` on shadow removal (e.g., when `NtClose` is somehow called on a guest dispatcher pointer), the refcount must not underflow. The `or_insert(1)` form preserves any pre-existing refcount (e.g., if the same pointer was previously allocated via `alloc_handle_for`, though that's impossible since `next_handle` starts at `0x1000` and pointers live above `0x1_0000`).
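
The guard and the `or_insert(1)` behavior described above can be sketched as follows. The `State` struct here is a hypothetical, pared-down stand-in that borrows the field names from the notes; the placeholder object type, the `events` vector, and the boolean return value are assumptions for illustration and not the real signature.

```rust
use std::collections::HashMap;

/// Hypothetical, pared-down kernel state (illustration only).
#[derive(Default)]
pub struct State {
    pub objects: HashMap<u32, &'static str>, // guest ptr -> placeholder shadow object
    pub handle_refcount: HashMap<u32, u32>,  // guest ptr -> refcount
    pub events: Vec<String>,                 // stand-in for the trace emitter
}

/// Adopts a guest dispatcher pointer on first touch.
/// Returns true only when this call performed the adoption.
pub fn ensure_dispatcher_object(state: &mut State, ptr: u32) -> bool {
    // Early-return guard: a pointer that is already adopted is left untouched.
    if state.objects.contains_key(&ptr) {
        return false;
    }
    state.objects.insert(ptr, "shadow");
    // or_insert(1) initializes a missing entry and keeps any existing value.
    state.handle_refcount.entry(ptr).or_insert(1);
    state.events.push(format!("handle.create ptr={ptr:#x}"));
    true
}
```

Calling the function twice on the same pointer leaves one object, one event, and a refcount of 1; a refcount that already exists for the pointer is preserved rather than reset.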