handoff: VSync/event-wedge fixes + iterate 2.A–2.BC research notes

Source changes (dormant parity infra, retained from iterate 2.AI/2.AO):
- xenia-kernel/exports.rs: nt_create_event manual_reset polarity +
  related event wiring
- xenia-gpu/mmio_region.rs: D1MODE_VBLANK_VLINE_STATUS hardcode parity

Also lands the audit-runs/ analysis notes (.md/.txt/.json digests) for the
iterate 2.x VSync/0x10e8/0x1004 wedge investigation. Raw trace dumps
(.jsonl/.gz/.csv/.stdout) and agent worktrees (.claude/) are gitignored as
regenerable local artifacts — see memory + HANDOFF for the running findings.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
MechaCat02
2026-06-05 07:19:08 +02:00
parent acd1656753
commit ef93a4fa14
620 changed files with 108303 additions and 1 deletions


@@ -0,0 +1,96 @@
Phase C+2 — Additive canonicalization in tools/diff-events/diff_events.py.
This file is untracked in git (added during the Phase A harness session
2026-05-13 and never committed). Below is the additive delta applied this
session (Path α — diff-tool canonicalization of allocator returns).
--- a/tools/diff-events/diff_events.py (pre-C+2)
+++ b/tools/diff-events/diff_events.py (post-C+2)
@@ Module-level (additive, after SKIP_PAYLOAD_FIELDS_BY_KIND) @@
+# Allocator-returning kernel exports whose `kernel.return.payload.return_value`
+# is a host-allocator-dependent guest VA. Canary and ours legitimately route
+# allocations to different heap regions (e.g. canary `MmAllocatePhysicalMemoryEx`
+# returns `0xBC220000` from `vC0000000` while ours returns `0x40105000` from
+# its single user-heap region — see AUDIT-043 "ε host-allocator address-space
+# divergence" and Phase B `report.md` ε-class). Comparing raw VAs would always
+# diverge at the first allocator call.
+#
+# Canonicalization: per `(tid, export_name)` we assign a stable ordinal
+# (0, 1, 2, …) to each successive `kernel.return.payload.return_value`,
+# replacing both sides' value with the sentinel string
+# `<ALLOC_<NAME>_<ORDINAL>>` before payload comparison. As long as both
+# engines call the same allocator the same number of times in the same
+# order on a given thread, the comparison treats them as equivalent.
+#
+# Limitations (documented):
+#   * If one engine calls an allocator more times than the other, ordinals
+#     drift and subsequent allocator returns appear as divergences. That's
+#     the correct outcome: an ordinal-count mismatch IS a behavioral
+#     divergence.
+#   * `payload.status` is canonicalized together with `return_value`: for
+#     allocator entries it is a copy of the raw VA in hex-string form, so
+#     it is rewritten to the same sentinel to keep both fields in lockstep.
+#   * Other payload fields that happen to embed an allocator VA (e.g. a
+#     future `args_resolved.base_address` in a free-call) are NOT
+#     canonicalized; that is out of scope for this divergence. Extend the
+#     set below as new divergence classes surface.
+ALLOCATOR_RETURN_FNS = frozenset(
+    [
+        "MmAllocatePhysicalMemoryEx",
+        "MmAllocatePhysicalMemory",
+        "NtAllocateVirtualMemory",
+        "RtlAllocateHeap",
+        "MmCreateKernelStack",
+    ]
+)
+
+
+def canonicalize_allocator_returns(events_by_tid: dict) -> None:
+    """In-place: rewrite `payload.return_value` for every kernel.return whose
+    `payload.name` is in ALLOCATOR_RETURN_FNS, replacing the raw VA with
+    `<ALLOC_<NAME>_<ORDINAL>>`. Ordinals are per (tid, name) and assigned
+    in event order.
+
+    Called on each engine's stream independently; because ordinals are
+    assigned deterministically by per-tid call order, equivalent streams
+    produce equivalent sentinels."""
+    for tid, evs in events_by_tid.items():
+        # name -> next ordinal to assign on this tid
+        counters: dict[str, int] = {}
+        for ev in evs:
+            if ev.get("kind") != "kernel.return":
+                continue
+            payload = ev.get("payload") or {}
+            name = payload.get("name")
+            if name not in ALLOCATOR_RETURN_FNS:
+                continue
+            ordinal = counters.get(name, 0)
+            counters[name] = ordinal + 1
+            sentinel = f"<ALLOC_{name}_{ordinal}>"
+            payload["return_value"] = sentinel
+            # `payload.status` mirrors `return_value` as a hex string for
+            # allocator entries (xboxkrnl trampoline doesn't distinguish
+            # NTSTATUS from pointer-typed returns). Canonicalize together
+            # so they stay in lockstep.
+            if "status" in payload:
+                payload["status"] = sentinel
+
+
@@ main() arg parsing (after --validate-identical) @@
+    ap.add_argument(
+        "--no-canonicalize-allocators",
+        action="store_true",
+        help="Disable per-tid ordinal canonicalization of allocator return "
+        "values (default: enabled). See ALLOCATOR_RETURN_FNS for the "
+        "covered set. Disabling reproduces the raw-VA comparison.",
+    )
@@ main() body (after load_events) @@
+    if not args.no_canonicalize_allocators:
+        canonicalize_allocator_returns(canary_evs)
+        canonicalize_allocator_returns(ours_evs)
End of patch. Net additive surface: ~70 LOC. Existing diff behavior preserved
via `--no-canonicalize-allocators` flag (verified to reproduce the baseline
161-match summary byte-identically — see re-validation.md gate 6).
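
Illustrative sketch (not part of the patch): the snippet below exercises a trimmed copy of `canonicalize_allocator_returns` on two toy streams to show both the intended equivalence and the ordinal-drift limitation. The event shape, the `ret` helper, tid `6`, and every address other than `0xBC220000` / `0x40105000` are assumptions made for illustration; real traces carry more fields.

```python
# Trimmed copy of the patched helper, reduced to one allocator name.
ALLOCATOR_RETURN_FNS = frozenset(["MmAllocatePhysicalMemoryEx"])


def canonicalize_allocator_returns(events_by_tid: dict) -> None:
    for evs in events_by_tid.values():
        counters = {}  # name -> next ordinal on this tid
        for ev in evs:
            if ev.get("kind") != "kernel.return":
                continue
            payload = ev.get("payload") or {}
            name = payload.get("name")
            if name not in ALLOCATOR_RETURN_FNS:
                continue
            ordinal = counters.get(name, 0)
            counters[name] = ordinal + 1
            sentinel = f"<ALLOC_{name}_{ordinal}>"
            payload["return_value"] = sentinel
            if "status" in payload:
                payload["status"] = sentinel


def ret(name: str, va: int) -> dict:
    """Hypothetical minimal kernel.return event (assumed shape)."""
    return {
        "kind": "kernel.return",
        "payload": {"name": name, "return_value": va, "status": hex(va)},
    }


FN = "MmAllocatePhysicalMemoryEx"

# Same call sequence on both engines, different heap regions.
canary = {6: [ret(FN, 0xBC220000), ret(FN, 0xBC230000)]}
ours = {6: [ret(FN, 0x40105000), ret(FN, 0x40115000)]}

raw_equal = canary == ours  # raw VAs differ
canonicalize_allocator_returns(canary)
canonicalize_allocator_returns(ours)
canonical_equal = canary == ours  # sentinels line up
first_sentinel = canary[6][0]["payload"]["return_value"]

# Ordinal drift: "ours" makes one extra allocator call on the same tid,
# so its final allocation gets ordinal 1 while canary's gets ordinal 0.
drift_canary = {6: [ret(FN, 0xBC220000)]}
drift_ours = {6: [ret(FN, 0x40100000), ret(FN, 0x40105000)]}
canonicalize_allocator_returns(drift_canary)
canonicalize_allocator_returns(drift_ours)
drifted = drift_canary[6][-1] != drift_ours[6][-1]

print(raw_equal, canonical_equal, first_sentinel, drifted)
```

The drift case is the behavior the limitations list describes as intended: a call-count mismatch surfaces as a divergence instead of being masked.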