xenia-rs/audit-runs/phase-c25-mm-allocator-family/investigation.md
MechaCat02 ef93a4fa14 handoff: VSync/event-wedge fixes + iterate 2.A–2.BC research notes
Source changes (dormant parity infra, retained from iterate 2.AI/2.AO):
- xenia-kernel/exports.rs: nt_create_event manual_reset polarity +
  related event wiring
- xenia-gpu/mmio_region.rs: D1MODE_VBLANK_VLINE_STATUS hardcode parity

Also lands the audit-runs/ analysis notes (.md/.txt/.json digests) for the
iterate 2.x VSync/0x10e8/0x1004 wedge investigation. Raw trace dumps
(.jsonl/.gz/.csv/.stdout) and agent worktrees (.claude/) are gitignored as
regenerable local artifacts — see memory + HANDOFF for the running findings.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-05 07:19:08 +02:00

# Phase C+25 — MmGetPhysicalAddress canonicalization
## Step 1 — Framing verification (per reading-error #28)
From `phase-w-wedge-reattack/diff-postfix.md` at `canary tid=6 → ours tid=1` idx 105,112:
```
canary: [105119] kernel.return MmGetPhysicalAddress return_value=353042432 status=0x150b0000
ours: [105112] kernel.return MmGetPhysicalAddress return_value=182251520 status=0x0adcf000
```
Decoded:
- canary 353042432 = `0x150B0000`. Per `xenia-canary/src/xenia/memory.cc:2317-2325`
(`PhysicalHeap::GetPhysicalAddress`): `address -= heap_base_; if (heap_base_ >=
0xE0000000) address += 0x1000;`. To produce `0x150B0000` from `vE0000000` (heap_base
`0xE0000000`): input VA `0xF50AF000`, since `0xF50AF000 - 0xE0000000 + 0x1000 = 0x150B0000`. ✓
- ours `0x0ADCF000`. Per `exports.rs:985-988` (`mm_get_physical_address`):
`ctx.gpr[3] &= 0x1FFF_FFFF`. To produce `0x0ADCF000` from the unified heap region
`0x40000000+`: input VA `0x4ADCF000`, since `0x4ADCF000 & 0x1FFF_FFFF = 0x0ADCF000`. ✓
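The two decodings above can be re-checked mechanically. The following is a minimal sketch that transcribes the two quoted formulas into Python; the function names are illustrative and do not exist in either engine, and the input VAs are the ones inferred above rather than values read from a trace.

```python
# Sketch only: transcriptions of the two quoted VA->PA formulas.
# Function names are hypothetical; input VAs are the inferred ones from the text.

def canary_physical_address(va: int, heap_base: int) -> int:
    """Arithmetic quoted from PhysicalHeap::GetPhysicalAddress."""
    address = va - heap_base
    if heap_base >= 0xE0000000:
        address += 0x1000
    return address & 0xFFFFFFFF  # keep the result in 32 bits


def ours_physical_address(va: int) -> int:
    """Mask quoted from mm_get_physical_address."""
    return va & 0x1FFF_FFFF


canary_pa = canary_physical_address(0xF50AF000, 0xE0000000)
ours_pa = ours_physical_address(0x4ADCF000)

print(f"canary: {canary_pa} = 0x{canary_pa:08X}")
print(f"ours:   {ours_pa} = 0x{ours_pa:08X}")
```

Both results reproduce the decimal `return_value` fields from the diff excerpt, which supports the reading that each engine applies its own formula correctly.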
Pre-context: identical sequence of `MmAllocatePhysicalMemoryEx` (canonicalized to
shared sentinel) → `MmGetPhysicalAddress`. Next event after divergence:
`VdInitializeRingBuffer` — the GPU consumes the PA opaquely.
Both engines' translations are SELF-CONSISTENT: within each engine, the same input
VA always maps to the same PA, and any subsequent GPU command pointing at that PA
gets read back from the same host backing store. The divergence at the diff layer
is a host-allocator-region symptom, not a semantic bug.
## Step 2 — Classification
Four candidates:
- **(A)** Per-call value bug. NO — both formulas are correct for their respective
heap layouts. Canary's `PhysicalHeap::GetPhysicalAddress` is the authoritative
implementation for the three-heap memory model; the `& 0x1FFF_FFFF` mask in ours is
the documented equivalent for the unified heap (KRNBUG-Mm-04 noted at
`exports.rs:3771`).
- **(B)** Allocator-region routing bug. YES, but this is the C+2 Path β deferral —
ours has a single `KernelState::heap_alloc` cursor at `0x40000000`; canary has
three physical heaps at `vA0/vC0/vE0` routed by page size via
`LookupHeapByType`. Estimated >100 LOC and would change boot trajectory
unpredictably. **OUT OF SCOPE per Phase C+2 scope discipline.**
- **(C)** Canonicalization gap. YES — `MmGetPhysicalAddress` is a VA→PA translator
whose return is consumed opaquely by GPU/audio subsystems. The same per-(tid,name)
ordinal sentinel scheme that covers `MmAllocatePhysicalMemoryEx` (C+2) applies
here. Fix: extend `ALLOCATOR_RETURN_FNS`.
- **(D)** Upstream. NO — the predecessor `kernel.call MmGetPhysicalAddress`
matched cleanly on both engines.
**Selected: (C) — diff-tool canonicalization.**
## Step 3 — Fix
Extended `ALLOCATOR_RETURN_FNS` in `xenia-rs/tools/diff-events/diff_events.py`
with `"MmGetPhysicalAddress"` and a 20-line comment block explaining the
deferred-Path-β rationale. Zero engine LOC.
Per-(tid,name) ordinal sentinels (`<ALLOC_MmGetPhysicalAddress_N>`) reuse the
existing `canonicalize_allocator_returns` machinery. As long as both engines
call the translator the same number of times in the same per-tid order, the
ordinals line up. A translation-count mismatch correctly surfaces as a
divergence (ordinal drift → distinct sentinels at that position).
The `payload.status` field is auto-mirrored (existing behavior of the
canonicalizer, since the trampoline does not distinguish NTSTATUS from
pointer-typed returns).
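The per-(tid,name) ordinal scheme described above can be sketched as follows. This is not the actual `canonicalize_allocator_returns` code: the event shape (`tid`, `kind`, `name`, `payload`), the zero-based ordinal, the helper names, and the assumption that tids have already been mapped to a shared namespace are all illustrative.

```python
from collections import defaultdict

# Assumed subset of the registry; the real set may contain more names.
SKETCH_ALLOCATOR_FNS = {"MmAllocatePhysicalMemoryEx", "MmGetPhysicalAddress"}


def canonicalize_sketch(events):
    """Replace allocator-style returns with per-(tid, name) ordinal sentinels."""
    counters = defaultdict(int)
    out = []
    for ev in events:
        if ev["kind"] == "kernel.return" and ev["name"] in SKETCH_ALLOCATOR_FNS:
            key = (ev["tid"], ev["name"])
            sentinel = f"<ALLOC_{ev['name']}_{counters[key]}>"
            counters[key] += 1
            # status mirrors return_value, so both fields get the sentinel
            payload = {**ev["payload"], "return_value": sentinel, "status": sentinel}
            ev = {**ev, "payload": payload}
        out.append(ev)
    return out


def ret(tid, name, value):
    """Build a hypothetical kernel.return event."""
    return {"tid": tid, "kind": "kernel.return", "name": name,
            "payload": {"return_value": value, "status": f"0x{value:08x}"}}


# Same call count and order: the differing PAs collapse to one sentinel.
canary = canonicalize_sketch([ret(1, "MmGetPhysicalAddress", 0x150B0000)])
ours = canonicalize_sketch([ret(1, "MmGetPhysicalAddress", 0x0ADCF000)])
aligned = canary == ours

# Count mismatch: canary translated twice, ours only once by position 1.
canary_two = canonicalize_sketch([ret(1, "MmGetPhysicalAddress", 0x150B0000),
                                  ret(1, "MmGetPhysicalAddress", 0x150C0000)])
ours_one = canonicalize_sketch([ret(1, "NtClose", 0),
                                ret(1, "MmGetPhysicalAddress", 0x0ADCF000)])
drifted = canary_two[1]["payload"] != ours_one[1]["payload"]

print(aligned, drifted)
```

Keying the counter on `(tid, name)` rather than on a global counter keeps one thread's extra allocation from shifting the ordinals seen by every other thread.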
## Step 4 — Tests added
`test_diff_events.py` gains 4 unit tests (lines added at top of `main()`):
1. `test_mm_get_physical_address_in_allocator_set` — registry guard.
2. `test_mm_get_physical_address_canonicalization` — two-call per-tid ordinal.
3. `test_mm_get_physical_address_cross_engine_alignment` — end-to-end: the
exact C+25 divergence (`0x150B0000` vs `0x0ADCF000`) canonicalizes to the
same sentinel on both sides.
4. `test_mm_get_physical_address_count_mismatch_still_diverges` — ordinal-drift
negative test.
39 baseline tests + 4 new = 43 total, all PASS.
## Why no engine fix
Per `project_phase_c2_MmAllocatePhysicalMemoryEx_2026_05_13.md`'s "Future work:
β-class engine fix (deferred)" section:
> If a future Phase C+N session surfaces a divergence whose causal chain goes
> through region-arithmetic on a `MmAllocatePhysicalMemoryEx` return value
> (e.g. `MmGetPhysicalAddress` yielding bus-incompatible addresses for GPU
> command buffers), escalate to engine-side: add 3 physical heaps in
> `xenia-memory` / `KernelState`, route `MmAllocatePhysicalMemoryEx` through
> page-size lookup. Estimated 100-200 LOC + GPU/audio bridge re-validation;
> out of scope for single-session work.
This C+25 divergence IS the predicted scenario. The GPU is in-process here —
both engines independently consume the PA they themselves emitted, so the
opaque-pass-through invariant holds. The PA values diverge between engines
but neither is wrong in its own coordinate space.
Engine fix is deferred to a dedicated Path β session (estimated 100-200 LOC +
multi-subsystem re-validation across GPU command buffer mappings, XMA audio
context mapping via `MmMapIoSpace`, and any guest code paths doing PA
arithmetic). Tripstone #3 explicitly forbids in-session escalation here.
## Why the progression metric is not expected to move
Phase W documented the wedge: tid=1 (main) joins on tid=13, tid=13 waits on
worker event `0x12d0` that never gets signaled. The wedge is upstream of any
GPU activity. Advancing matched-prefix past `MmGetPhysicalAddress` does NOT
exercise any new game-logic branch — it just allows the diff harness to
continue measuring beyond a previously-occluded translator-return divergence.
Per task spec: "If only the secondary metric moves and the primary remains
pinned (`swaps=1, draws=0`), document candidly: 'matched-prefix advanced but
no game progression — wedge persists per Phase W finding'." That's exactly
what happens here.
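To make the distinction between the two metrics concrete, here is a minimal sketch of a matched-prefix count. It assumes the metric is the number of leading events that compare equal between the two engines; the real harness may define it differently. The `(name, return)` tuples are placeholders, and the `None` return for `VdInitializeRingBuffer` is a stand-in, not trace data.

```python
def matched_prefix_length(canary_events, ours_events):
    """Count leading positions at which both event streams agree."""
    count = 0
    for canary_ev, ours_ev in zip(canary_events, ours_events):
        if canary_ev != ours_ev:
            break
        count += 1
    return count


ALLOC = ("MmAllocatePhysicalMemoryEx", "<ALLOC_MmAllocatePhysicalMemoryEx_0>")
RING = ("VdInitializeRingBuffer", None)  # placeholder return value

# Before the fix: raw PAs differ, so the prefix stops at the translator.
raw_canary = [ALLOC, ("MmGetPhysicalAddress", 0x150B0000), RING]
raw_ours = [ALLOC, ("MmGetPhysicalAddress", 0x0ADCF000), RING]

# After the fix: both sides carry the same sentinel.
SENTINEL = ("MmGetPhysicalAddress", "<ALLOC_MmGetPhysicalAddress_0>")
canon_canary = [ALLOC, SENTINEL, RING]
canon_ours = [ALLOC, SENTINEL, RING]

before = matched_prefix_length(raw_canary, raw_ours)
after = matched_prefix_length(canon_canary, canon_ours)
print(before, after)
```

The count grows only because the comparison no longer stops at the translator return; neither input stream contains any event it did not contain before, which is why the primary metric stays pinned.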