xenia-rs/audit-runs/phase-c25-mm-allocator-family/investigation.md

# Phase C+25 — MmGetPhysicalAddress canonicalization

## Step 1 — Framing verification (per reading-error #28)

From `phase-w-wedge-reattack/diff-postfix.md`, thread pairing `canary tid=6 → ours tid=1`, at ours idx 105112 (canary idx 105119):

```
canary: [105119] kernel.return MmGetPhysicalAddress return_value=353042432  status=0x150b0000
ours:   [105112] kernel.return MmGetPhysicalAddress return_value=182251520  status=0x0adcf000
```

Decoded:
- canary 353042432 = `0x150B0000`. Per `xenia-canary/src/xenia/memory.cc:2317-2325`
  (`PhysicalHeap::GetPhysicalAddress`): `address -= heap_base_; if (heap_base_ >=
  0xE0000000) address += 0x1000;`. For the `vE0000000` heap (`heap_base_` =
  `0xE0000000`), the input VA that yields this value is `0xF50AF000`:
  `0xF50AF000 - 0xE0000000 + 0x1000 = 0x150B0000`. ✓
- ours 182251520 = `0x0ADCF000`. Per `exports.rs:985-988` (`mm_get_physical_address`):
  `ctx.gpr[3] &= 0x1FFF_FFFF`. For the unified heap region at `0x40000000+`, the
  input VA that yields this value is `0x4ADCF000`:
  `0x4ADCF000 & 0x1FFF_FFFF = 0x0ADCF000`. ✓
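
The two decodings above can be re-checked with a few lines of Python. This is a minimal sketch, not code from either engine: the function names `canary_physical_address` and `ours_physical_address` are hypothetical, and the input VAs are the ones inferred above rather than values read from the trace.

```python
# Hypothetical re-statement of the two VA -> PA formulas quoted above.
# Neither function is taken from xenia-canary or xenia-rs.


def canary_physical_address(va: int, heap_base: int) -> int:
    """Mirror the quoted PhysicalHeap::GetPhysicalAddress arithmetic."""
    pa = va - heap_base
    if heap_base >= 0xE0000000:
        pa += 0x1000
    return pa & 0xFFFFFFFF  # keep the result in 32 bits, as a guest address


def ours_physical_address(va: int) -> int:
    """Mirror the quoted `ctx.gpr[3] &= 0x1FFF_FFFF` mask."""
    return va & 0x1FFF_FFFF


# Input VAs are the inferred ones from the bullets above (assumption).
canary_pa = canary_physical_address(0xF50AF000, heap_base=0xE0000000)
ours_pa = ours_physical_address(0x4ADCF000)

print(f"canary: return_value={canary_pa} status={canary_pa:#010x}")
print(f"ours:   return_value={ours_pa} status={ours_pa:#010x}")
```

Both printed pairs should agree with the `return_value` and `status` fields in the trace excerpt.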

Pre-context: identical sequence of `MmAllocatePhysicalMemoryEx` (canonicalized to
shared sentinel) → `MmGetPhysicalAddress`. Next event after divergence:
`VdInitializeRingBuffer` — the GPU consumes the PA opaquely.

Both engines' translations are SELF-CONSISTENT: within each engine, the same input
VA always maps to the same PA, and any subsequent GPU command pointing at that PA
gets read back from the same host backing store. The divergence at the diff layer
is a host-allocator-region symptom, not a semantic bug.

## Step 2 — Classification

Four candidates:

- **(A)** Per-call value bug. NO — both formulas are correct for their respective
  heap layouts. Canary's `PhysicalHeap::GetPhysicalAddress` is the authoritative
  implementation for the three-heap memory model; the `& 0x1FFF_FFFF` mask in ours is
  the documented equivalent for the unified heap (KRNBUG-Mm-04 noted at
  `exports.rs:3771`).
- **(B)** Allocator-region routing bug. YES, but this is the C+2 Path β deferral —
  ours has a single `KernelState::heap_alloc` cursor at `0x40000000`; canary has
  three physical heaps at `vA0/vC0/vE0` routed by page size via
  `LookupHeapByType`. An engine fix is estimated at 100-200 LOC and would change
  the boot trajectory unpredictably. **OUT OF SCOPE per Phase C+2 scope discipline.**
- **(C)** Canonicalization gap. YES — `MmGetPhysicalAddress` is a VA→PA translator
  whose return is consumed opaquely by GPU/audio subsystems. The same per-(tid,name)
  ordinal sentinel scheme that covers `MmAllocatePhysicalMemoryEx` (C+2) applies
  here. Fix: extend `ALLOCATOR_RETURN_FNS`.
- **(D)** Upstream. NO — the predecessor `kernel.call MmGetPhysicalAddress`
  matched cleanly on both engines.

**Selected: (C) — diff-tool canonicalization.**

## Step 3 — Fix

Extended `ALLOCATOR_RETURN_FNS` in `xenia-rs/tools/diff-events/diff_events.py`
with `"MmGetPhysicalAddress"` and a 20-line comment block explaining the
deferred-Path-β rationale. Zero engine LOC.

Per-(tid,name) ordinal sentinels (`<ALLOC_MmGetPhysicalAddress_N>`) reuse the
existing `canonicalize_allocator_returns` machinery. As long as both engines
call the translator the same number of times in the same per-tid order, the
ordinals line up. A translation-count mismatch correctly surfaces as a
divergence (ordinal drift → distinct sentinels at that position).

The `payload.status` field is mirrored automatically. This is existing
canonicalizer behavior: the trampoline does not distinguish NTSTATUS returns
from pointer-typed returns, so both fields carry the same value.
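
The ordinal scheme can be illustrated with a small standalone sketch. This is not the code in `diff_events.py`: the event dictionary shape (`kind`, `tid`, `name`, `payload`), the function body, the helper `ret`, and the choice to start ordinals at 0 are all assumptions made for illustration. Only the names `ALLOCATOR_RETURN_FNS` and `canonicalize_allocator_returns` and the sentinel format `<ALLOC_MmGetPhysicalAddress_N>` come from the text above.

```python
from collections import defaultdict

# Assumed registry contents; the real set lives in diff_events.py.
ALLOCATOR_RETURN_FNS = {"MmAllocatePhysicalMemoryEx", "MmGetPhysicalAddress"}


def canonicalize_allocator_returns(events):
    """Replace allocator-style returns with per-(tid, name) ordinal sentinels."""
    counters = defaultdict(int)  # (tid, name) -> next ordinal
    out = []
    for ev in events:
        if ev["kind"] == "kernel.return" and ev["name"] in ALLOCATOR_RETURN_FNS:
            key = (ev["tid"], ev["name"])
            sentinel = f"<ALLOC_{ev['name']}_{counters[key]}>"
            counters[key] += 1
            # Mirror the sentinel into status, since status carries the same value.
            payload = dict(ev["payload"], return_value=sentinel, status=sentinel)
            ev = dict(ev, payload=payload)
        out.append(ev)
    return out


def ret(tid, value):
    """Hypothetical helper: build one MmGetPhysicalAddress return event."""
    return {
        "kind": "kernel.return",
        "tid": tid,
        "name": "MmGetPhysicalAddress",
        "payload": {"return_value": value, "status": f"{value:#010x}"},
    }


# The C+25 divergence: different raw PAs, same ordinal on each side.
canary = canonicalize_allocator_returns([ret(6, 353042432)])
ours = canonicalize_allocator_returns([ret(1, 182251520)])
print(canary[0]["payload"] == ours[0]["payload"])

# Count mismatch: canary translates twice, ours once, so canary's second
# event carries ordinal 1 and has no counterpart on our side.
canary2 = canonicalize_allocator_returns([ret(6, 353042432), ret(6, 353046528)])
print(canary2[1]["payload"]["return_value"])
```

Because the sentinel omits the tid, payloads from paired threads compare equal once the ordinals line up, while an extra or missing call shifts every later ordinal on that thread.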

## Step 4 — Tests added

`test_diff_events.py` gains 4 unit tests (lines added at top of `main()`):

1. `test_mm_get_physical_address_in_allocator_set` — registry guard.
2. `test_mm_get_physical_address_canonicalization` — two-call per-tid ordinal.
3. `test_mm_get_physical_address_cross_engine_alignment` — end-to-end: the
   exact C+25 divergence (`0x150B0000` vs `0x0ADCF000`) canonicalizes to the
   same sentinel on both sides.
4. `test_mm_get_physical_address_count_mismatch_still_diverges` — ordinal-drift
   negative test.

39 baseline tests + 4 new = 43 total, all PASS.

## Why no engine fix

Per `project_phase_c2_MmAllocatePhysicalMemoryEx_2026_05_13.md`'s "Future work:
β-class engine fix (deferred)" section:

> If a future Phase C+N session surfaces a divergence whose causal chain goes
> through region-arithmetic on a `MmAllocatePhysicalMemoryEx` return value
> (e.g. `MmGetPhysicalAddress` yielding bus-incompatible addresses for GPU
> command buffers), escalate to engine-side: add 3 physical heaps in
> `xenia-memory` / `KernelState`, route `MmAllocatePhysicalMemoryEx` through
> page-size lookup. Estimated 100-200 LOC + GPU/audio bridge re-validation;
> out of scope for single-session work.

This C+25 divergence is the scenario that note anticipated, minus the
bus-incompatibility: the GPU is in-process here, and each engine consumes only
the PA it emitted itself, so the opaque-pass-through invariant holds. The PA
values differ between engines, but neither is wrong in its own coordinate space.

The engine fix is deferred to a dedicated Path β session (estimated 100-200 LOC
plus multi-subsystem re-validation across GPU command buffer mappings, XMA audio
context mapping via `MmMapIoSpace`, and any guest code paths doing PA
arithmetic). Tripstone #3 explicitly forbids in-session escalation here.

## Why progression metric is not expected to move

Phase W documented the wedge: tid=1 (main) joins on tid=13, tid=13 waits on
worker event `0x12d0` that never gets signaled. The wedge is upstream of any
GPU activity. Advancing matched-prefix past `MmGetPhysicalAddress` does NOT
exercise any new game-logic branch — it just allows the diff harness to
continue measuring beyond a previously-occluded translator-return divergence.

Per task spec: "If only the secondary metric moves and the primary remains
pinned (`swaps=1, draws=0`), document candidly: 'matched-prefix advanced but
no game progression — wedge persists per Phase W finding'." That's exactly
what happens here.