Phase C+2: Additive canonicalization in tools/diff-events/diff_events.py.

This file is untracked in git (added during the Phase A harness session 2026-05-13 and never committed). Below is the additive delta applied this session (Path α: diff-tool canonicalization of allocator returns).

```diff
--- a/tools/diff-events/diff_events.py (pre-C+2)
+++ b/tools/diff-events/diff_events.py (post-C+2)
@@ Module-level (additive, after SKIP_PAYLOAD_FIELDS_BY_KIND) @@
+# Allocator-returning kernel exports whose `kernel.return.payload.return_value`
+# is a host-allocator-dependent guest VA. Canary and ours legitimately route
+# allocations to different heap regions (e.g. canary `MmAllocatePhysicalMemoryEx`
+# returns `0xBC220000` from `vC0000000` while ours returns `0x40105000` from
+# its single user-heap region — see AUDIT-043 "ε host-allocator address-space
+# divergence" and Phase B `report.md` ε-class). Comparing raw VAs would always
+# diverge at the first allocator call.
+#
+# Canonicalization: per `(tid, export_name)` we assign a stable ordinal
+# (0, 1, 2, …) to each successive `kernel.return.return_value`, replacing
+# both sides' value with the sentinel string `_>`
+# before payload comparison. As long as both engines call the same
+# allocator the same number of times in the same order on a given thread,
+# the comparison treats them as equivalent.
+#
+# Limitations (documented):
+#   * If one engine calls an allocator more times than the other, ordinals
+#     drift and subsequent allocator returns appear as divergences. That's
+#     the correct outcome — ordinal-count mismatch IS a behavioral
+#     divergence.
+#   * `payload.status` is left untouched: it's a copy of the raw VA in
+#     hex-string form, useful in diff context.
+#   * Other payload fields that happen to embed an allocator VA (e.g. a
+#     future `args_resolved.base_address` in a free-call) are NOT
+#     canonicalized — out of scope for this divergence. Extend the set
+#     below as new divergence classes surface.
+ALLOCATOR_RETURN_FNS = frozenset(
+    [
+        "MmAllocatePhysicalMemoryEx",
+        "MmAllocatePhysicalMemory",
+        "NtAllocateVirtualMemory",
+        "RtlAllocateHeap",
+        "MmCreateKernelStack",
+    ]
+)
+
+
+def canonicalize_allocator_returns(events_by_tid: dict) -> None:
+    """In-place: rewrite `payload.return_value` for every kernel.return whose
+    `payload.name` is in ALLOCATOR_RETURN_FNS, replacing the raw VA with
+    `_>`. Ordinals are per (tid, name) and assigned
+    in event order.
+
+    Called on each engine's stream independently; because ordinals are
+    assigned deterministically by per-tid call order, equivalent streams
+    produce equivalent sentinels."""
+    for tid, evs in events_by_tid.items():
+        # name -> next ordinal to assign on this tid
+        counters: dict[str, int] = {}
+        for ev in evs:
+            if ev.get("kind") != "kernel.return":
+                continue
+            payload = ev.get("payload") or {}
+            name = payload.get("name")
+            if name not in ALLOCATOR_RETURN_FNS:
+                continue
+            ordinal = counters.get(name, 0)
+            counters[name] = ordinal + 1
+            sentinel = f""
+            payload["return_value"] = sentinel
+            # `payload.status` mirrors `return_value` as a hex string for
+            # allocator entries (xboxkrnl trampoline doesn't distinguish
+            # NTSTATUS from pointer-typed returns). Canonicalize together
+            # so they stay in lockstep.
+            if "status" in payload:
+                payload["status"] = sentinel
+
@@ main() arg parsing (after --validate-identical) @@
+    ap.add_argument(
+        "--no-canonicalize-allocators",
+        action="store_true",
+        help="Disable per-tid ordinal canonicalization of allocator return "
+        "values (default: enabled). See ALLOCATOR_RETURN_FNS for the "
+        "covered set. Disabling reproduces the raw-VA comparison.",
+    )
@@ main() body (after load_events) @@
+    if not args.no_canonicalize_allocators:
+        canonicalize_allocator_returns(canary_evs)
+        canonicalize_allocator_returns(ours_evs)
```

End of patch. Net additive surface: ~70 LOC.
Existing diff behavior is preserved via the `--no-canonicalize-allocators` flag (verified to reproduce the baseline 161-match summary byte-identically; see re-validation.md gate 6).
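
To make the per-`(tid, export_name)` ordinal idea concrete, here is a minimal standalone sketch, not the tool's actual code. It assumes a simplified event shape (`kind` plus a `payload` with `name` and `return_value`), uses a hypothetical sentinel format `ALLOC[name#ordinal]`, and covers only a subset of the export set; the `ret` helper and every VA other than `0xBC220000` and `0x40105000` are made up for illustration.

```python
import copy

# Illustrative subset of the covered exports.
ALLOCATOR_RETURN_FNS = frozenset(["MmAllocatePhysicalMemoryEx", "RtlAllocateHeap"])


def canonicalize(events_by_tid):
    """Replace allocator return values in place with per-(tid, name) ordinals."""
    for evs in events_by_tid.values():
        counters = {}  # export name -> next ordinal on this tid
        for ev in evs:
            if ev.get("kind") != "kernel.return":
                continue
            payload = ev.get("payload") or {}
            name = payload.get("name")
            if name not in ALLOCATOR_RETURN_FNS:
                continue
            ordinal = counters.get(name, 0)
            counters[name] = ordinal + 1
            # Hypothetical sentinel format, chosen for this sketch only.
            payload["return_value"] = f"ALLOC[{name}#{ordinal}]"
    return events_by_tid


def ret(name, va):
    """Build a minimal kernel.return event (assumed shape)."""
    return {"kind": "kernel.return", "payload": {"name": name, "return_value": va}}


# First VA on each side is from the patch comment; the second is made up.
canary = {7: [ret("MmAllocatePhysicalMemoryEx", 0xBC220000),
              ret("MmAllocatePhysicalMemoryEx", 0xBC230000)]}
ours = {7: [ret("MmAllocatePhysicalMemoryEx", 0x40105000),
            ret("MmAllocatePhysicalMemoryEx", 0x40106000)]}

raw_equal = canary == ours  # raw VAs differ between the two engines
canon_canary = canonicalize(copy.deepcopy(canary))
canon_ours = canonicalize(copy.deepcopy(ours))
canon_equal = canon_canary == canon_ours  # ordinals line up

# An extra allocator call on one side stays visible after canonicalization.
ours_extra = copy.deepcopy(ours)
ours_extra[7].append(ret("MmAllocatePhysicalMemoryEx", 0x40107000))
extra_equal = canonicalize(ours_extra) == canon_canary

print(raw_equal, canon_equal, extra_equal)          # False True False
print(canon_ours[7][1]["payload"]["return_value"])  # ALLOC[MmAllocatePhysicalMemoryEx#1]
```

The sketch works on deep copies so the raw streams stay available for comparison. The patch itself rewrites the loaded streams in place, which is fine there because the raw comparison remains reachable through `--no-canonicalize-allocators`.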