# Phase C+18 Investigation: Shared-global first-toucher race (2026-05-14)

## Framing verification (reading-error #28 discipline)

C+17 result: main matched-prefix advanced 102,171 → 102,553 (+382) when ours's `ensure_dispatcher_object` started emitting `handle.create` for synthesized shadows. But sister chain `tid=15→10` REGRESSED from 16 → 2:

```
canary tid=15:                            ours tid=10:
  [0] import.call KeWaitForSingleObject     [0] import.call KeWaitForSingleObject
  [1] kernel.call KeWaitForSingleObject     [1] kernel.call KeWaitForSingleObject
  [2] wait.begin sid=66ae1b598f928969       [2] handle.create sid=b9e6799594b746ee
  [3] kernel.return                         [3] wait.begin sid=b9e6799594b746ee
                                            [4] kernel.return
```

The two engines disagree at idx=2: canary's tid=15 has `wait.begin`, ours's tid=10 has `handle.create`. The SIDs are different too (`66ae1b598f928969` vs `b9e6799594b746ee`), but the diff tool already SKIPS SID fields per C+15-α schema-v1.

## Root cause: shared-global first-toucher race

The dispatcher at guest pointer `0x828a3230` is a **process-global KSEMAPHORE** (object_type=3) that is touched by MULTIPLE guest threads during boot:

- Canary: some thread other than tid=15 (likely the main boot thread, tid=6) touches it first → emits `handle.create` there. By the time tid=15 reaches `KeWaitForSingleObject`, the wrapper exists, so `XObject::GetNativeObject` short-circuits via the `kXObjSignature` marker and emits NO additional event. Canary tid=15's stream is 4 events long: import → kernel.call → wait.begin → kernel.return.
- Ours: tid=10 happens to be the first toucher → ours's `ensure_dispatcher_object` emits `handle.create` on tid=10. Ours tid=10's stream is 5 events long: import → kernel.call → **handle.create** → wait.begin → kernel.return.

Both engines do the right thing semantically; whichever thread wins the "first toucher" race depends on thread scheduling, which is NOT bit-identical across engines (different host schedulers, JIT, etc.).
The diff tool sees one extra event on one side and reports it as a divergence, but it is **observation-side**, not behavioral. This is C+17 D-NEW-3.

## Verified via static + dynamic evidence

1. Both ours's `ensure_dispatcher_object` (exports.rs:4363) and canary's `XObject::GetNativeObject` (xobject.cc:397-483) are **per-pointer idempotent**: re-entry on a pointer that already has the `kXObjSignature` marker short-circuits without emit.
2. The shared `objects` table is process-global in both engines (`KernelState::objects` map; canary's `KernelState::object_table()`).
3. In the ours-cold log, `0x828a3230` appears in exactly ONE `handle.create` (on tid=10), confirming the per-pointer idempotence:

   ```
   $ grep '"raw_handle_id":"0x828a3230"' ours-cold.jsonl
   {"kind":"handle.create","tid":10,"tid_event_idx":2,...}
   ```
4. The canary diff side reports `[2] wait.begin` with a SID that refers to a dispatcher whose `handle.create` was already emitted elsewhere (likely on canary tid=6 main chain or a worker).
5. The SID computation in both engines uses `semantic_id(create_site_pc=0, creating_tid, idx_at_creation, object_type)`. Both `creating_tid` and `idx_at_creation` depend on WHICH thread did the first touch, so even if both engines wrapped the same dispatcher, their SIDs would still differ.

## Class of bug

Class η: **harness observation-side asymmetry on scheduling-non-deterministic process-global state**. Not a real engine bug; both engines are doing the right thing. The harness (per-tid sequence diff) is the wrong abstraction for this class of event.

## Fix shape

Two coordinated changes, both small and additive:

### (A) Engine: scheduling-invariant SID for process-global dispatchers

Add `event_log::semantic_id_shared_global(pointer, object_type)` (ours and canary), a SID recipe keyed only on `(pointer, object_type)`.
Inputs to the existing FNV-1a:

```
create_site_pc = SHARED_GLOBAL_SID_MARKER  (= 0xC01AB005, fixed sentinel)
creating_tid   = 0
tid_event_idx  = pointer as u64
object_type    = object_type
```

The marker constant sits outside any plausible guest-PC range (PPC text 0x82000000-0x82FFFFFF; XEX header 0x3001xxxx; heap 0x4xxxxxxx) so it NEVER collides with regular per-thread SIDs (which use real PCs).

`ensure_dispatcher_object` (ours) and `XObject::GetNativeObject` (canary) route their `handle.create` emit through this recipe instead of the per-thread `semantic_id`. Both engines compute the **same SID** for the same dispatcher pointer regardless of which guest thread wins the first-toucher race.

### (B) Diff tool: cross-tid floating `handle.create` matching

Pre-pass: collect the set of shared-global SIDs across BOTH engines and ALL tids. A `handle.create` event is detected as shared-global by recomputing the deterministic SID from its `(raw_handle_id, object_type)` payload and matching against `handle_semantic_id`.

When per-tid comparison finds a kind mismatch where one side has a `handle.create` whose SID is in the floating set:

- Advance only that side's stream pointer past the floating event.
- Re-compare at the same canonical position.

This handles the "extra event on tid=10 but not tid=15" case symmetrically.

Subsequent `wait.begin` events whose `handles_semantic_ids` element matches a shared-global SID continue to align via the schema-v1 strict-equality rule (SID fields are already skipped per the C+15-α SKIP_PAYLOAD_FIELDS_BY_KIND policy, but the underlying object alignment is preserved by the deterministic recipe, which is useful for future passes that re-enable SID comparison).

### Why this is the right fix (not over-suppression)

- **Pointer-derived SIDs are unique per object identity**. Two distinct dispatchers at the same pointer with different `object_type` get distinct SIDs (defense in depth).
- **Regular per-thread `handle.create` events keep strict alignment**.
  Only events whose SID matches the deterministic shared-global recipe are eligible for cross-tid absorption. A regular file-handle create (allocated via `alloc_handle_for`/`AddHandle`) uses the per-(tid, idx) SID recipe and CANNOT match the shared-global hash by construction.
- **The diff tool still reports real divergences**. Tests confirm:
  - `test_non_floating_real_divergence_still_caught`: an unrelated extra event on ours's side IS reported.
  - `test_strict_alignment_without_floating`: when the floating set is empty, legacy strict behavior holds.
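
The two changes can be illustrated together with a minimal Python sketch, which is not the actual diff tool. It assumes 64-bit FNV-1a over four little-endian u64 fields in the order listed in (A), an integer `object_type` payload field, and a hex-string `raw_handle_id`; the real serialization is not stated here and may differ. The helper names `is_floating`, `collect_floating_sids`, and `matched_prefix` are hypothetical, and the comparison looks only at `kind`, whereas the real tool also compares payload fields.

```python
import struct

FNV64_OFFSET = 0xCBF29CE484222325
FNV64_PRIME = 0x100000001B3
SHARED_GLOBAL_SID_MARKER = 0xC01AB005  # fixed sentinel from fix (A)


def fnv1a_64(data: bytes) -> int:
    h = FNV64_OFFSET
    for byte in data:
        h ^= byte
        h = (h * FNV64_PRIME) & 0xFFFFFFFFFFFFFFFF
    return h


def semantic_id(create_site_pc, creating_tid, tid_event_idx, object_type):
    # ASSUMPTION: four little-endian u64 fields, hashed in this order.
    payload = struct.pack("<4Q", create_site_pc, creating_tid,
                          tid_event_idx, object_type)
    return f"{fnv1a_64(payload):016x}"


def semantic_id_shared_global(pointer, object_type):
    # Keyed only on (pointer, object_type): no tid, no per-thread index.
    return semantic_id(SHARED_GLOBAL_SID_MARKER, 0, pointer, object_type)


def is_floating(ev):
    """True if ev is a handle.create whose SID matches the shared-global recipe."""
    if ev.get("kind") != "handle.create":
        return False
    try:
        pointer = int(ev["raw_handle_id"], 16)
        expected = semantic_id_shared_global(pointer, ev["object_type"])
    except (KeyError, TypeError, ValueError, struct.error):
        return False
    return ev.get("handle_semantic_id") == expected


def collect_floating_sids(*engines):
    """Pre-pass over every engine's {tid: [events]} map."""
    return {ev["handle_semantic_id"]
            for streams in engines
            for events in streams.values()
            for ev in events if is_floating(ev)}


def matched_prefix(canary, ours, floating):
    """Count aligned events, stepping past floating handle.create on either side."""
    def absorbable(ev):
        return (ev["kind"] == "handle.create"
                and ev.get("handle_semantic_id") in floating)

    i = j = matched = 0
    while i < len(canary) and j < len(ours):
        if canary[i]["kind"] == ours[j]["kind"]:
            i, j, matched = i + 1, j + 1, matched + 1
        elif absorbable(ours[j]):
            j += 1  # advance only ours, re-compare at the same canary position
        elif absorbable(canary[i]):
            i += 1  # symmetric case
        else:
            break   # real divergence
    return matched


# The tid=15 / tid=10 chains from the framing section, reduced to `kind`.
shared_sid = semantic_id_shared_global(0x828A3230, 3)
canary_tid15 = [{"kind": k} for k in
                ("import.call", "kernel.call", "wait.begin", "kernel.return")]
ours_tid10 = [
    {"kind": "import.call"},
    {"kind": "kernel.call"},
    {"kind": "handle.create", "raw_handle_id": "0x828a3230",
     "object_type": 3, "handle_semantic_id": shared_sid},
    {"kind": "wait.begin"},
    {"kind": "kernel.return"},
]
floating = collect_floating_sids({15: canary_tid15}, {10: ours_tid10})

print(matched_prefix(canary_tid15, ours_tid10, floating))  # 4
print(matched_prefix(canary_tid15, ours_tid10, set()))     # 2
```

With the floating set populated, the extra `handle.create` on ours tid=10 is absorbed and all four canary events align; with an empty set the sketch falls back to strict alignment and stops at idx=2, which mirrors the regression described in the framing section.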