Phase C+15-α — New Divergence Catalog (2026-05-14)
Surfaced by the schema-v1.1 wiring of handle.create/destroy,
thread.create/exit, wait.begin in both engines.
Cold-vs-cold matched-prefix table (post-wiring)
| canary_tid | ours_tid | matched | first_divergence_at | divergence kind |
|---|---|---|---|---|
| 6 | 1 | 102,168 | 102,168 | extra handle.destroy in ours (XamTaskCloseHandle refcount mismatch) |
| 15 | 10 | 16 | — | no divergence in 16 evts (canary 3.6M, ours stalls) |
| 7 | 2 | 30 | 30 | KeWaitForSingleObject native-obj handle (class E) |
| 4 | 11 | 8 | 8 | KeWaitForMultipleObjects native-obj handle (class E) |
| 12 | 7 | 2 | 2 | KeWaitForSingleObject native-obj handle (class E) |
| 14 | 9 | 2 | 2 | KeWaitForSingleObject native-obj handle (class E) |
Main matched prefix dropped from 104,574 (C+13/C+14) to 102,168, a regression of 2,406 events. This is the expected outcome: state divergences that the earlier event kinds did not record are now visible to the diff.
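To make the `matched` and `first_divergence_at` columns concrete, here is a minimal sketch of how a matched prefix could be computed for one chain. The `Event` struct and `matched_prefix` function are hypothetical illustrations, not the diff tool's actual types; the sketch assumes engine-specific payload fields have already been stripped before comparison.

```rust
/// Hypothetical, simplified event record: only the fields compared here.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Event {
    pub kind: String,    // e.g. "handle.destroy", "kernel.return"
    pub payload: String, // assumed already stripped of engine-specific fields
}

/// Returns (matched, first_divergence_at).
/// The second value is None when one stream is a prefix of the other,
/// i.e. the shorter stream ended before any mismatch was seen.
pub fn matched_prefix(canary: &[Event], ours: &[Event]) -> (usize, Option<usize>) {
    let matched = canary
        .iter()
        .zip(ours.iter())
        .take_while(|(c, o)| c == o)
        .count();
    let diverged = matched < canary.len() && matched < ours.len();
    (matched, if diverged { Some(matched) } else { None })
}
```

Under this reading, a row such as canary tid=15 ↔ ours tid=10 (matched 16, no divergence) corresponds to the `None` case: ours ends at 16 events without ever disagreeing with canary.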
Cataloged divergences (priority-ordered for future iterate)
D-1 (HIGH) — main chain idx=102,168: extra handle.destroy on XamTaskCloseHandle
- Chain: canary tid=6 ↔ ours tid=1.
- Event:
  - ours: `handle.destroy sid=b53a312c0ac30f49` then `kernel.return XamTaskCloseHandle return=1`
  - canary: `kernel.return XamTaskCloseHandle return=1` (no `handle.destroy`)
- Hypothesis: Ours's `xam_task_close_handle` (xam.rs:300-344) decrements refcount and destroys the handle when it reaches 0. Canary's `XamTaskCloseHandle_entry` → `NtClose` → `ObjectTable::ReleaseHandle` only destroys when refcount reaches 0; canary's spawned thread keeps an additional ref on the thread handle (`object->Retain()` in `XThread::Create` line 408 via `RetainHandle()`). Ours's refcount of 1 at this point is wrong; it should be 2 (user ref + spawned-thread ref). Ours destroys prematurely.
- Impact: leaks downstream divergences; spawned thread now has a dangling handle reference.
- Fix scope: ~20 LOC in `xam_task_schedule` / `ex_create_thread`: add explicit `state.handle_refcount[handle] += 1` after spawn for the XThread's own ref. Verify against canary's `RetainHandle()` semantics.
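The refcount hypothesis can be illustrated with a small model. This is a sketch only: `HandleTable` and its methods are hypothetical stand-ins, and the real `state.handle_refcount` layout may differ. It shows that a close at refcount 1 emits `handle.destroy`, while the same close after an extra retain (the spawned thread's own ref) does not.

```rust
use std::collections::HashMap;

/// Hypothetical handle table; not the engine's actual data structure.
#[derive(Default)]
pub struct HandleTable {
    refcount: HashMap<u32, u32>,
    pub events: Vec<String>, // emitted event kinds, for illustration
}

impl HandleTable {
    /// Create a handle holding the caller's (user) reference.
    pub fn create(&mut self, handle: u32) {
        self.refcount.insert(handle, 1);
        self.events.push("handle.create".to_string());
    }

    /// Take an extra reference, e.g. the spawned thread's own ref.
    pub fn retain(&mut self, handle: u32) {
        if let Some(rc) = self.refcount.get_mut(&handle) {
            *rc += 1;
        }
    }

    /// Drop one reference; emit `handle.destroy` only when the count hits 0.
    /// Returns true if the handle was destroyed.
    pub fn release(&mut self, handle: u32) -> bool {
        let Some(rc) = self.refcount.get_mut(&handle) else {
            return false; // unknown handle: nothing to release
        };
        *rc -= 1;
        if *rc > 0 {
            return false;
        }
        self.refcount.remove(&handle);
        self.events.push("handle.destroy".to_string());
        true
    }
}
```

With only `create` followed by `release`, the table emits `handle.destroy` (the behaviour observed in ours). Adding a `retain` after the spawn leaves the count at 1 after the close, so no `handle.destroy` appears, matching the canary stream.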
D-2 (HIGH) — chain tid=4 / canary, tid=11 / ours: ours stops at idx=8
- Chain: canary tid=4 ↔ ours tid=11.
- Event:
  - ours: `kernel.return KeWaitForMultipleObjects status=0` at idx=8, then stream ends (9 total events).
  - canary: `handle.create sid=bcaf14d76932b128 (Event)` at idx=8, then `handle.create sid=0760e947bacff199` at idx=9, then continues for 151,690 events.
- Hypothesis (class E asymmetry): Canary's `KeWaitForMultipleObjects_entry` iterates the object pointer array and calls `XObject::GetNativeObject<XObject>(kernel_state, object_ptr, -1, true)` for each; when the object has not yet been wrapped in an `XObject*`, this creates a new XObject (and thus a new handle). Ours's `do_wait_multiple` uses `resolve_pseudo_handle`, which does not create a new XObject; it looks up the existing handle. The "handle for the native dispatcher object" is an engine-architectural difference: canary lazily wraps, ours pre-registers.
- Impact: every KeWait that takes object pointers (not handles) creates N extra `handle.create` events on the canary side. Ours emits none.
- Fix scope: this is class E (intentional asymmetry). Recommended action: add `Ke{Wait,Set,Reset,...}*Object*` exports that take object pointers to a diff-tool suppress-handle-create-side-effect list, or have ours emit a synthetic `handle.create` when `resolve_pseudo_handle` first encounters a new pointer. The latter aligns better with canary's view. ~30-50 LOC.
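The second option (a synthetic `handle.create` on first encounter) could look roughly like the sketch below. `NativeObjectTracker`, `note_pointer`, and the event string format are all hypothetical; the point is only the first-seen check, which mimics canary's lazy wrapping by emitting one create per new object pointer before the wait returns.

```rust
use std::collections::HashSet;

/// Hypothetical tracker for guest object pointers already seen by waits.
#[derive(Default)]
pub struct NativeObjectTracker {
    seen: HashSet<u32>,
    pub events: Vec<String>,
}

impl NativeObjectTracker {
    /// Called once per object pointer passed to a KeWait*-style export.
    /// Emits a synthetic `handle.create` only the first time a pointer is seen.
    /// Returns true if this call emitted the event.
    pub fn note_pointer(&mut self, object_ptr: u32) -> bool {
        let first = self.seen.insert(object_ptr);
        if first {
            self.events
                .push(format!("handle.create ptr={object_ptr:#010x}"));
        }
        first
    }

    /// Simplified wait: note every pointer, then record the return event.
    pub fn wait_multiple(&mut self, object_ptrs: &[u32]) {
        for &ptr in object_ptrs {
            self.note_pointer(ptr);
        }
        self.events
            .push("kernel.return KeWaitForMultipleObjects".to_string());
    }
}
```

A first wait on two new pointers would yield two `handle.create` events followed by the return, which is the shape of the canary stream at idx=8 and idx=9; a later wait on an already-seen pointer yields only the return.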
D-3 (HIGH) — same class on chains 7→2 (idx=30), 12→7 (idx=2), 14→9 (idx=2)
Same root cause as D-2 — KeWaitForSingleObject with raw object pointer.
Canary's xeKeWaitForSingleObject calls GetNativeObject which creates a
handle for the dispatcher; ours's resolve_pseudo_handle does not.
Group all 4 chains under one fix in D-2.
D-4 (MEDIUM) — wait.begin SID 0000000000000000 on tid=10 of ours
- Chain: canary tid=15 ↔ ours tid=10 (the only thread where prefix didn't regress — but ours stalls at idx=16).
- Event at idx=2: both engines emit `wait.begin`, but ours's `handles_semantic_ids = ["0000000000000000"]` while canary's is real.
- Hypothesis: SID = 0 means `lookup_handle_semantic_id` returned 0 (handle not registered). The handle being waited on must have been created before the event_log SID registry was active (during boot / init), or it's a pseudo-handle from `resolve_pseudo_handle`. Pseudo-handles aren't real handles in our model.
- Fix scope: when `lookup_handle_semantic_id(h) == 0`, lazy-emit a synthetic `handle.create` for `h` (with a default object_type per `state.objects[h]`'s schema kind). Aligns with D-2 fix. ~10 LOC.
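A minimal sketch of the lazy-emit idea, under stated assumptions: `SidRegistry` and `sid_for_wait` are hypothetical names, and `derive_sid` stands in for however the engine actually computes semantic IDs, which this note does not specify. Only the "0 means unregistered" convention is taken from the text above.

```rust
use std::collections::HashMap;

/// Hypothetical SID registry keyed by handle value.
#[derive(Default)]
pub struct SidRegistry {
    sids: HashMap<u32, u64>,
    pub events: Vec<String>,
}

impl SidRegistry {
    /// Mirrors the documented convention: 0 means "handle not registered".
    pub fn lookup_handle_semantic_id(&self, handle: u32) -> u64 {
        self.sids.get(&handle).copied().unwrap_or(0)
    }

    /// Returns a registered SID for `handle`, registering it first and
    /// emitting a synthetic `handle.create` if the lookup returned 0.
    /// `derive_sid` is a placeholder for the engine's real SID derivation.
    pub fn sid_for_wait(&mut self, handle: u32, derive_sid: impl Fn(u32) -> u64) -> u64 {
        let existing = self.lookup_handle_semantic_id(handle);
        if existing != 0 {
            return existing;
        }
        let sid = derive_sid(handle);
        self.sids.insert(handle, sid);
        self.events.push(format!("handle.create sid={sid:016x}"));
        sid
    }
}
```

The synthetic event is emitted at most once per handle, so repeated waits on the same boot-time handle would not inflate the stream.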
D-5 (LOW) — chains 7→2, 12→7, 14→9: ours streams truncated
- Ours's tid=2/7/9/10 streams are 32/4/76/16 events long; canary's are 32/27,834/4,733,192/3,610,535. Ours's worker threads stall early.
- Hypothesis: Downstream of D-2 / D-1 — once the main thread or peer workers diverge, downstream threads block on signals that never come.
- Fix scope: deferred until D-1/D-2 land; likely no separate fix needed.
Acceptance gate status
- Gate 1 (default-off digest): PASS. 3× reproducible at `e1dfcb1559f987b35012a7f2dc6d93f5` (unchanged from C+13 baseline).
- Gate 2 (cvar-on emit): PASS. Both engines produce 14M+ / 121K events respectively; JSONL parses cleanly; all new kinds present.
- Gate 3 (diff tool): PASS. Diff tool consumes new kinds, produces 6-chain divergence report. Cross-engine SID skip-comparison documented in `SKIP_PAYLOAD_FIELDS_BY_KIND`.
- Gate 4 (cold-vs-cold): PASS (with regression as designed). Main chain prefix 104,574 → 102,168 (-2,406 events). Divergence catalog produced.
- Gate 5 (build clean): PASS. Canary + ours both build.
- Gate 6 (tests): PASS. 181 → 181 passing (no new tests added; existing unchanged).
Reading-error class avoided
Class #29 — per-host-thread tid_event_idx counter for shared synthetic tids:
canary's pre-session thread_local uint64_t t_tid_event_idx was correct for
guest-tid events (1 tid : 1 host_thread) but broken for boot-time emissions
with tid=0 because boot init runs on multiple host threads. Symptom: the
diff tool rejected the canary log with "events out of order at index 8".
Fixed via tid-keyed global map (matches ours's design).
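The tid-keyed design can be sketched as follows. This is an illustration in Rust, not the canary (C++) fix itself, and `TidEventIndex` / `next_idx` are hypothetical names. The counter is keyed by guest tid and shared across host threads, so several host threads emitting under the synthetic tid=0 draw from one sequence instead of each starting at 0.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

/// Per-guest-tid event counters shared by all host threads.
#[derive(Default)]
pub struct TidEventIndex {
    next: Mutex<HashMap<u32, u64>>,
}

impl TidEventIndex {
    /// Returns the next index for `tid` and advances that tid's counter.
    /// The mutex makes the read-and-increment atomic across host threads.
    pub fn next_idx(&self, tid: u32) -> u64 {
        let mut map = self.next.lock().unwrap();
        let slot = map.entry(tid).or_insert(0);
        let idx = *slot;
        *slot += 1;
        idx
    }
}
```

One design point worth noting: unique indices alone do not guarantee that the log file is in index order. For that, the index would presumably need to be assigned inside the same critical section that appends the event to the log.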