xenia-rs/audit-runs/phase-c25-mm-allocator-family/investigation.md
MechaCat02 ef93a4fa14 handoff: VSync/event-wedge fixes + iterate 2.A–2.BC research notes
Source changes (dormant parity infra, retained from iterate 2.AI/2.AO):
- xenia-kernel/exports.rs: nt_create_event manual_reset polarity +
  related event wiring
- xenia-gpu/mmio_region.rs: D1MODE_VBLANK_VLINE_STATUS hardcode parity

Also lands the audit-runs/ analysis notes (.md/.txt/.json digests) for the
iterate 2.x VSync/0x10e8/0x1004 wedge investigation. Raw trace dumps
(.jsonl/.gz/.csv/.stdout) and agent worktrees (.claude/) are gitignored as
regenerable local artifacts — see memory + HANDOFF for the running findings.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-05 07:19:08 +02:00


Phase C+25 — MmGetPhysicalAddress canonicalization

Step 1 — Framing verification (per reading-error #28)

From phase-w-wedge-reattack/diff-postfix.md at canary tid=6 → ours tid=1 idx 105,112:

canary: [105119] kernel.return MmGetPhysicalAddress return_value=353042432  status=0x150b0000
ours:   [105112] kernel.return MmGetPhysicalAddress return_value=182251520  status=0x0adcf000

Decoded:

  • canary 353042432 = 0x150B0000. Per xenia-canary/src/xenia/memory.cc:2317-2325 (PhysicalHeap::GetPhysicalAddress): address -= heap_base_; if (heap_base_ >= 0xE0000000) address += 0x1000;. To produce 0x150B0000 from the vE0000000 heap (heap_base 0xE0000000), the input VA must be 0xF50AF000: 0xF50AF000 - 0xE0000000 + 0x1000 = 0x150B0000. ✓
  • ours 182251520 = 0x0ADCF000. Per exports.rs:985-988 (mm_get_physical_address): ctx.gpr[3] &= 0x1FFF_FFFF. To produce 0x0ADCF000 from the unified heap region 0x40000000+, the input VA must be 0x4ADCF000: 0x4ADCF000 & 0x1FFF_FFFF = 0x0ADCF000. ✓
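
The decoding above can be re-checked with a few lines of Python. This is a minimal sketch that transcribes only the two formulas quoted in the bullets; the function names are hypothetical and are not the engines' own code.

```python
def canary_physical_address(va: int, heap_base: int) -> int:
    """Transcription of the quoted PhysicalHeap::GetPhysicalAddress arithmetic."""
    pa = va - heap_base
    if heap_base >= 0xE0000000:
        pa += 0x1000
    return pa


def ours_physical_address(va: int) -> int:
    """Transcription of the quoted mm_get_physical_address mask."""
    return va & 0x1FFF_FFFF


# Input VAs inferred in the bullets above.
canary_pa = canary_physical_address(0xF50AF000, heap_base=0xE0000000)
ours_pa = ours_physical_address(0x4ADCF000)

print(f"canary: {canary_pa} = {canary_pa:#010x}")  # canary: 353042432 = 0x150b0000
print(f"ours:   {ours_pa} = {ours_pa:#010x}")      # ours:   182251520 = 0x0adcf000
```

Both results match the return_value fields in the trace excerpt, which supports the reading that each engine is applying its own formula correctly to a different input VA.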

Pre-context: identical sequence of MmAllocatePhysicalMemoryEx (canonicalized to shared sentinel) → MmGetPhysicalAddress. Next event after divergence: VdInitializeRingBuffer — the GPU consumes the PA opaquely.

Both engines' translations are SELF-CONSISTENT: within each engine, the same input VA always maps to the same PA, and any subsequent GPU command pointing at that PA gets read back from the same host backing store. The divergence at the diff layer is a host-allocator-region symptom, not a semantic bug.

Step 2 — Classification

Four candidates:

  • (A) Per-call value bug. NO — both formulas are correct for their respective heap layouts. Canary's PhysicalHeap::GetPhysicalAddress is the authoritative implementation for the three-heap memory model; our & 0x1FFF_FFFF mask is the documented equivalent for the unified heap (KRNBUG-Mm-04 noted at exports.rs:3771).
  • (B) Allocator-region routing bug. YES, but this is the C+2 Path β deferral — ours has a single KernelState::heap_alloc cursor at 0x40000000; canary has three physical heaps at vA0/vC0/vE0 routed by page size via LookupHeapByType. Estimated >100 LOC and would change boot trajectory unpredictably. OUT OF SCOPE per Phase C+2 scope discipline.
  • (C) Canonicalization gap. YES — MmGetPhysicalAddress is a VA→PA translator whose return is consumed opaquely by GPU/audio subsystems. The same per-(tid,name) ordinal sentinel scheme that covers MmAllocatePhysicalMemoryEx (C+2) applies here. Fix: extend ALLOCATOR_RETURN_FNS.
  • (D) Upstream. NO — the predecessor kernel.call MmGetPhysicalAddress matched cleanly on both engines.

Selected: (C) — diff-tool canonicalization.
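
To make candidate (B) concrete, the sketch below classifies the two inferred input VAs by heap base. It assumes each canary physical heap extends up to the next base (and vE0000000 to the top of the 32-bit space); that extent, and the helper names, are illustrative assumptions rather than anything taken from either engine.

```python
# Heap bases named in candidate (B). Extents are assumed, not sourced.
CANARY_PHYSICAL_HEAPS = [
    ("vA0000000", 0xA0000000),
    ("vC0000000", 0xC0000000),
    ("vE0000000", 0xE0000000),
]
OURS_UNIFIED_HEAP_BASE = 0x40000000


def canary_heap_for(va: int):
    """Return the heap with the highest base not above va, or None."""
    match = None
    for name, base in CANARY_PHYSICAL_HEAPS:
        if va >= base:
            match = name
    return match


def in_ours_unified_heap(va: int) -> bool:
    """True when va sits at or above the single heap_alloc cursor base."""
    return va >= OURS_UNIFIED_HEAP_BASE


print(canary_heap_for(0xF50AF000))       # vE0000000
print(canary_heap_for(0x4ADCF000))       # None
print(in_ours_unified_heap(0x4ADCF000))  # True
```

The VA our engine hands to the translator does not fall in any of canary's physical heaps, which is why the divergence is attributed to allocator-region routing rather than to the translation formula.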

Step 3 — Fix

Extended ALLOCATOR_RETURN_FNS in xenia-rs/tools/diff-events/diff_events.py with "MmGetPhysicalAddress" and a 20-line comment block explaining the deferred-Path-β rationale. Zero engine LOC.

Per-(tid,name) ordinal sentinels (<ALLOC_MmGetPhysicalAddress_N>) reuse the existing canonicalize_allocator_returns machinery. As long as both engines call the translator the same number of times in the same per-tid order, the ordinals line up. A translation-count mismatch correctly surfaces as a divergence (ordinal drift → distinct sentinels at that position).

The payload.status field is auto-mirrored (existing behavior of the canonicalizer, since the trampoline doesn't distinguish NTSTATUS from pointer-typed returns).
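
The sentinel scheme and the status mirroring can be sketched as follows. This is not the body of diff_events.py: the event dictionary shape, the zero-based ordinal, and the function body are assumptions for illustration; only the names ALLOCATOR_RETURN_FNS, canonicalize_allocator_returns, and the sentinel format come from the notes above.

```python
import copy
from collections import defaultdict

ALLOCATOR_RETURN_FNS = {"MmAllocatePhysicalMemoryEx", "MmGetPhysicalAddress"}


def canonicalize_allocator_returns(events):
    """Replace allocator-style return values with per-(tid, name) ordinal sentinels."""
    counters = defaultdict(int)  # (tid, name) -> next ordinal (assumed zero-based)
    out = []
    for ev in events:
        ev = copy.deepcopy(ev)  # leave the caller's trace untouched
        if ev["kind"] == "kernel.return" and ev["name"] in ALLOCATOR_RETURN_FNS:
            key = (ev["tid"], ev["name"])
            sentinel = f"<ALLOC_{ev['name']}_{counters[key]}>"
            counters[key] += 1
            ev["payload"]["return_value"] = sentinel
            if "status" in ev["payload"]:
                # status mirrors the raw return, so it gets the same sentinel
                ev["payload"]["status"] = sentinel
        out.append(ev)
    return out


canary = [{"tid": 6, "kind": "kernel.return", "name": "MmGetPhysicalAddress",
           "payload": {"return_value": 353042432, "status": 0x150B0000}}]
ours = [{"tid": 1, "kind": "kernel.return", "name": "MmGetPhysicalAddress",
         "payload": {"return_value": 182251520, "status": 0x0ADCF000}}]

canary_c = canonicalize_allocator_returns(canary)
ours_c = canonicalize_allocator_returns(ours)
print(canary_c[0]["payload"] == ours_c[0]["payload"])  # True
```

Because the sentinel text carries the function name and ordinal but not the tid, the canary tid=6 and ours tid=1 events compare equal after canonicalization, while a second call on either side would receive a different ordinal.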

Step 4 — Tests added

test_diff_events.py gains 4 unit tests (lines added at top of main()):

  1. test_mm_get_physical_address_in_allocator_set — registry guard.
  2. test_mm_get_physical_address_canonicalization — two-call per-tid ordinal.
  3. test_mm_get_physical_address_cross_engine_alignment — end-to-end: the exact C+25 divergence (0x150B0000 vs 0x0ADCF000) canonicalizes to the same sentinel on both sides.
  4. test_mm_get_physical_address_count_mismatch_still_diverges — ordinal-drift negative test.

39 baseline tests + 4 new = 43 total, all PASS.

Why no engine fix

Per project_phase_c2_MmAllocatePhysicalMemoryEx_2026_05_13.md's "Future work: β-class engine fix (deferred)" section:

If a future Phase C+N session surfaces a divergence whose causal chain goes through region-arithmetic on a MmAllocatePhysicalMemoryEx return value (e.g. MmGetPhysicalAddress yielding bus-incompatible addresses for GPU command buffers), escalate to engine-side: add 3 physical heaps in xenia-memory / KernelState, route MmAllocatePhysicalMemoryEx through page-size lookup. Estimated 100-200 LOC + GPU/audio bridge re-validation; out of scope for single-session work.

This C+25 divergence IS the predicted scenario. The GPU is in-process here — both engines independently consume the PA they themselves emitted, so the opaque-pass-through invariant holds. The PA values diverge between engines but neither is wrong in its own coordinate space.

Engine fix is deferred to a dedicated Path β session (estimated 100-200 LOC + multi-subsystem re-validation across GPU command buffer mappings, XMA audio context mapping via MmMapIoSpace, and any guest code paths doing PA arithmetic). Tripstone #3 explicitly forbids in-session escalation here.

Why progression metric is not expected to move

Phase W documented the wedge: tid=1 (main) joins on tid=13, tid=13 waits on worker event 0x12d0 that never gets signaled. The wedge is upstream of any GPU activity. Advancing matched-prefix past MmGetPhysicalAddress does NOT exercise any new game-logic branch — it just allows the diff harness to continue measuring beyond a previously-occluded translator-return divergence.

Per task spec: "If only the secondary metric moves and the primary remains pinned (swaps=1, draws=0), document candidly: 'matched-prefix advanced but no game progression — wedge persists per Phase W finding'." That's exactly what happens here.