# Phase C+25 — MmGetPhysicalAddress canonicalization

## Step 1 — Framing verification (per reading-error #28)

From `phase-w-wedge-reattack/diff-postfix.md` at `canary tid=6 → ours tid=1` idx 105,112:

```
canary: [105119] kernel.return MmGetPhysicalAddress return_value=353042432 status=0x150b0000
ours: [105112] kernel.return MmGetPhysicalAddress return_value=182251520 status=0x0adcf000
```

Decoded:
- canary 353042432 = `0x150B0000`. Per `xenia-canary/src/xenia/memory.cc:2317-2325`
  (`PhysicalHeap::GetPhysicalAddress`):
  `address -= heap_base_; if (heap_base_ >= 0xE0000000) address += 0x1000;`.
  To produce `0x150B0000` from `vE0000000` (heap_base `0xE0000000`):
  input VA `0xF50AF000` → `0xF50AF000 - 0xE0000000 + 0x1000 = 0x150B0000`. ✓
- ours 182251520 = `0x0ADCF000`. Per `exports.rs:985-988` (`mm_get_physical_address`):
  `ctx.gpr[3] &= 0x1FFF_FFFF`. To produce `0x0ADCF000` from the unified heap region
  `0x40000000+`: input VA `0x4ADCF000` → `0x4ADCF000 & 0x1FFF_FFFF = 0x0ADCF000`. ✓

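
The two decodes above can be checked mechanically. The following is a minimal sketch that mirrors only the arithmetic quoted from `memory.cc` and `exports.rs`; the function and constant names are hypothetical and do not exist in either codebase.

```python
# Assumed constants, taken from the decode above rather than from either codebase.
CANARY_HEAP_BASE = 0xE0000000  # heap base of the vE0000000 physical heap
UNIFIED_PA_MASK = 0x1FFF_FFFF  # mask quoted from our mm_get_physical_address


def canary_get_physical_address(va: int, heap_base: int = CANARY_HEAP_BASE) -> int:
    """Mirror of the quoted PhysicalHeap::GetPhysicalAddress arithmetic."""
    pa = va - heap_base
    if heap_base >= 0xE0000000:
        pa += 0x1000
    return pa


def ours_get_physical_address(va: int) -> int:
    """Mirror of the quoted `ctx.gpr[3] &= 0x1FFF_FFFF` mask."""
    return va & UNIFIED_PA_MASK


canary_pa = canary_get_physical_address(0xF50AF000)
ours_pa = ours_get_physical_address(0x4ADCF000)
print(hex(canary_pa), canary_pa)  # 0x150b0000 353042432
print(hex(ours_pa), ours_pa)      # 0xadcf000 182251520
```

Both results reproduce the `return_value` fields in the trace excerpt, which is the sense in which each formula is correct for its own heap layout.
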
Pre-context: identical sequence of `MmAllocatePhysicalMemoryEx` (canonicalized to
shared sentinel) → `MmGetPhysicalAddress`. Next event after divergence:
`VdInitializeRingBuffer`, where the GPU consumes the PA opaquely.

Both engines' translations are SELF-CONSISTENT: within each engine, the same input
VA always maps to the same PA, and any subsequent GPU command pointing at that PA
gets read back from the same host backing store. The divergence at the diff layer
is a host-allocator-region symptom, not a semantic bug.

## Step 2 — Classification

Four candidates:

- **(A)** Per-call value bug. NO: both formulas are correct for their respective
  heap layouts. Canary's `PhysicalHeap::GetPhysicalAddress` is the authoritative
  implementation for the three-heap memory model; our `& 0x1FFF_FFFF` mask is
  the documented equivalent for the unified heap (KRNBUG-Mm-04 noted at
  `exports.rs:3771`).
- **(B)** Allocator-region routing bug. YES, but this is the C+2 Path β deferral:
  ours has a single `KernelState::heap_alloc` cursor at `0x40000000`; canary has
  three physical heaps at `vA0/vC0/vE0` routed by page size via
  `LookupHeapByType`. Estimated >100 LOC and would change boot trajectory
  unpredictably. **OUT OF SCOPE per Phase C+2 scope discipline.**
- **(C)** Canonicalization gap. YES: `MmGetPhysicalAddress` is a VA→PA translator
  whose return is consumed opaquely by GPU/audio subsystems. The same per-(tid,name)
  ordinal sentinel scheme that covers `MmAllocatePhysicalMemoryEx` (C+2) applies
  here. Fix: extend `ALLOCATOR_RETURN_FNS`.
- **(D)** Upstream. NO: the predecessor `kernel.call MmGetPhysicalAddress`
  matched cleanly on both engines.

**Selected: (C), diff-tool canonicalization.**

## Step 3 — Fix

Extended `ALLOCATOR_RETURN_FNS` in `xenia-rs/tools/diff-events/diff_events.py`
with `"MmGetPhysicalAddress"` and a 20-line comment block explaining the
deferred-Path-β rationale. Zero engine LOC.

Per-(tid,name) ordinal sentinels (`<ALLOC_MmGetPhysicalAddress_N>`) reuse the
existing `canonicalize_allocator_returns` machinery. As long as both engines
call the translator the same number of times in the same per-tid order, the
ordinals line up. A translation-count mismatch correctly surfaces as a
divergence (ordinal drift → distinct sentinels at that position).

The `payload.status` field is auto-mirrored (existing behavior of the
canonicalizer, since the trampoline does not distinguish NTSTATUS from
pointer-typed returns).

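
The sentinel scheme described in this step can be sketched as follows. This is an illustrative stand-in, not the code in `diff_events.py`: the event shape (`tid`, `kind`, `name`, `payload`), the zero-based ordinal, and the function name `canonicalize_sketch` are assumptions, and the registry lists only the two function names these notes confirm.

```python
from collections import defaultdict

# Hypothetical stand-in for the registry; the real set may hold more names.
ALLOCATOR_RETURN_FNS = {"MmAllocatePhysicalMemoryEx", "MmGetPhysicalAddress"}


def canonicalize_sketch(events):
    """Replace allocator-style returns with per-(tid, name) ordinal sentinels."""
    counters = defaultdict(int)  # (tid, name) -> next ordinal (zero-based here)
    out = []
    for ev in events:
        ev = dict(ev)  # shallow copy so the caller's events stay untouched
        if ev["kind"] == "kernel.return" and ev["name"] in ALLOCATOR_RETURN_FNS:
            key = (ev["tid"], ev["name"])
            sentinel = f"<ALLOC_{ev['name']}_{counters[key]}>"
            counters[key] += 1
            # Mirror the sentinel into status as well as return_value.
            ev["payload"] = dict(ev["payload"], return_value=sentinel, status=sentinel)
        out.append(ev)
    return out


canary = [{"tid": 6, "kind": "kernel.return", "name": "MmGetPhysicalAddress",
           "payload": {"return_value": 353042432, "status": 0x150B0000}}]
ours = [{"tid": 1, "kind": "kernel.return", "name": "MmGetPhysicalAddress",
         "payload": {"return_value": 182251520, "status": 0x0ADCF000}}]

canary_payload = canonicalize_sketch(canary)[0]["payload"]
ours_payload = canonicalize_sketch(ours)[0]["payload"]
print(canary_payload == ours_payload)
```

Each side is canonicalized independently, so the raw PA values never meet: only the ordinal position of the call within its thread is compared.
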
## Step 4 — Tests added

`test_diff_events.py` gains 4 unit tests (lines added at top of `main()`):

1. `test_mm_get_physical_address_in_allocator_set` — registry guard.
2. `test_mm_get_physical_address_canonicalization` — two-call per-tid ordinal.
3. `test_mm_get_physical_address_cross_engine_alignment` — end-to-end: the
   exact C+25 divergence (`0x150B0000` vs `0x0ADCF000`) canonicalizes to the
   same sentinel on both sides.
4. `test_mm_get_physical_address_count_mismatch_still_diverges` — ordinal-drift
   negative test.

39 baseline tests + 4 new = 43 total, all PASS.

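
The ordinal-drift negative case is the least obvious of the four, so here is a hedged illustration of the property it guards. It is not the body of the real test: `canonical_stream` and `first_divergence` are hypothetical helpers, and the extra translator call on our side is invented purely to trigger the drift.

```python
from collections import Counter
from itertools import zip_longest

TRANSLATOR = "MmGetPhysicalAddress"


def canonical_stream(call_names):
    """Map one thread's call sequence to comparable tokens.

    Translator returns become ordinal sentinels; other calls keep their name.
    """
    seen = Counter()
    tokens = []
    for name in call_names:
        if name == TRANSLATOR:
            tokens.append(f"<ALLOC_{name}_{seen[name]}>")
            seen[name] += 1
        else:
            tokens.append(name)
    return tokens


def first_divergence(a, b):
    """Return the first index where the two token streams differ, else None."""
    for idx, (x, y) in enumerate(zip_longest(a, b)):
        if x != y:
            return idx
    return None


canary_calls = [TRANSLATOR, "VdInitializeRingBuffer"]
ours_calls = [TRANSLATOR, TRANSLATOR, "VdInitializeRingBuffer"]  # invented extra call

matched = first_divergence(canonical_stream(canary_calls), canonical_stream(canary_calls))
drifted = first_divergence(canonical_stream(canary_calls), canonical_stream(ours_calls))
print(matched, drifted)  # None 1
```

Canonicalization hides value differences but not count differences: the surplus call occupies a position the other stream fills with a different token, so the divergence is still reported.
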
## Why no engine fix

Per `project_phase_c2_MmAllocatePhysicalMemoryEx_2026_05_13.md`'s "Future work:
β-class engine fix (deferred)" section:

> If a future Phase C+N session surfaces a divergence whose causal chain goes
> through region-arithmetic on a `MmAllocatePhysicalMemoryEx` return value
> (e.g. `MmGetPhysicalAddress` yielding bus-incompatible addresses for GPU
> command buffers), escalate to engine-side: add 3 physical heaps in
> `xenia-memory` / `KernelState`, route `MmAllocatePhysicalMemoryEx` through
> page-size lookup. Estimated 100-200 LOC + GPU/audio bridge re-validation;
> out of scope for single-session work.

This C+25 divergence IS the predicted scenario. The GPU is in-process here:
both engines independently consume the PA they themselves emitted, so the
opaque-pass-through invariant holds. The PA values diverge between engines,
but neither is wrong in its own coordinate space.

The engine fix is deferred to a dedicated Path β session (estimated 100-200 LOC +
multi-subsystem re-validation across GPU command buffer mappings, XMA audio
context mapping via `MmMapIoSpace`, and any guest code paths doing PA
arithmetic). Tripstone #3 explicitly forbids in-session escalation here.

## Why progression metric is not expected to move

Phase W documented the wedge: tid=1 (main) joins on tid=13, and tid=13 waits on
worker event `0x12d0`, which is never signaled. The wedge is upstream of any
GPU activity. Advancing matched-prefix past `MmGetPhysicalAddress` does NOT
exercise any new game-logic branch; it only lets the diff harness
continue measuring beyond a previously occluded translator-return divergence.

Per task spec: "If only the secondary metric moves and the primary remains
pinned (`swaps=1, draws=0`), document candidly: 'matched-prefix advanced but
no game progression — wedge persists per Phase W finding'." That is exactly
what happens here.