handoff: VSync/event-wedge fixes + iterate 2.A–2.BC research notes
Source changes (dormant parity infra, retained from iterate 2.AI/2.AO):
- xenia-kernel/exports.rs: nt_create_event manual_reset polarity + related event wiring
- xenia-gpu/mmio_region.rs: D1MODE_VBLANK_VLINE_STATUS hardcode parity

Also lands the audit-runs/ analysis notes (.md/.txt/.json digests) for the iterate 2.x VSync/0x10e8/0x1004 wedge investigation. Raw trace dumps (.jsonl/.gz/.csv/.stdout) and agent worktrees (.claude/) are gitignored as regenerable local artifacts; see memory + HANDOFF for the running findings.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
audit-runs/phase-c25-mm-allocator-family/investigation.md
@@ -0,0 +1,117 @@
# Phase C+25: MmGetPhysicalAddress canonicalization

## Step 1: Framing verification (per reading-error #28)

From `phase-w-wedge-reattack/diff-postfix.md` at `canary tid=6 → ours tid=1`, idx 105112:

```
canary: [105119] kernel.return MmGetPhysicalAddress return_value=353042432 status=0x150b0000
ours: [105112] kernel.return MmGetPhysicalAddress return_value=182251520 status=0x0adcf000
```

Decoded:
- canary 353042432 = `0x150B0000`. Per `xenia-canary/src/xenia/memory.cc:2317-2325`
  (`PhysicalHeap::GetPhysicalAddress`):
  `address -= heap_base_; if (heap_base_ >= 0xE0000000) address += 0x1000;`.
  To produce `0x150B0000` from the `vE0000000` heap (heap_base `0xE0000000`):
  input VA `0xF50AF000` → `0xF50AF000 - 0xE0000000 + 0x1000 = 0x150B0000`. ✓
- ours 182251520 = `0x0ADCF000`. Per `exports.rs:985-988` (`mm_get_physical_address`):
  `ctx.gpr[3] &= 0x1FFF_FFFF`. To produce `0x0ADCF000` from the unified heap region
  `0x40000000+`: input VA `0x4ADCF000` → `0x4ADCF000 & 0x1FFF_FFFF = 0x0ADCF000`. ✓

Pre-context: identical sequence of `MmAllocatePhysicalMemoryEx` (canonicalized to
shared sentinel) → `MmGetPhysicalAddress`. Next event after divergence:
`VdInitializeRingBuffer`, where the GPU consumes the PA opaquely.

Both engines' translations are SELF-CONSISTENT: within each engine, the same input
VA always maps to the same PA, and any subsequent GPU command pointing at that PA
gets read back from the same host backing store. The divergence at the diff layer
is a host-allocator-region symptom, not a semantic bug.
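
The two decodings in Step 1 can be reproduced with a few lines of Python. This is a minimal sketch, not code from either engine: the helper names `canary_physical_address` and `ours_physical_address` are hypothetical, and the canary helper only models the arithmetic quoted above for the heap with base `0xE0000000`.

```python
# Hypothetical helpers mirroring the two formulas quoted in Step 1.
# They illustrate the arithmetic only; they are not the engines' code.

VE0_HEAP_BASE = 0xE0000000  # heap base quoted for canary's vE0000000 heap


def canary_physical_address(va: int, heap_base: int = VE0_HEAP_BASE) -> int:
    """Model of the quoted PhysicalHeap::GetPhysicalAddress arithmetic."""
    address = va - heap_base
    if heap_base >= 0xE0000000:
        address += 0x1000
    return address


def ours_physical_address(va: int) -> int:
    """Model of the quoted `ctx.gpr[3] &= 0x1FFF_FFFF` mask."""
    return va & 0x1FFF_FFFF


canary_pa = canary_physical_address(0xF50AF000)
ours_pa = ours_physical_address(0x4ADCF000)

print(f"canary: {canary_pa} = {canary_pa:#010x}")  # canary: 353042432 = 0x150b0000
print(f"ours:   {ours_pa} = {ours_pa:#010x}")      # ours:   182251520 = 0x0adcf000
```

Both results match the `return_value` and `status` fields in the trace excerpt, which is what the ✓ marks assert.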

## Step 2: Classification

Four candidates:

- **(A)** Per-call value bug. NO: both formulas are correct for their respective
  heap layouts. Canary's `PhysicalHeap::GetPhysicalAddress` is the authoritative
  implementation for the three-heap memory model; our `& 0x1FFF_FFFF` mask is
  the documented equivalent for the unified heap (KRNBUG-Mm-04 noted at
  `exports.rs:3771`).
- **(B)** Allocator-region routing bug. YES, but this is the C+2 Path β deferral:
  ours has a single `KernelState::heap_alloc` cursor at `0x40000000`; canary has
  three physical heaps at `vA0/vC0/vE0` routed by page size via
  `LookupHeapByType`. An engine fix is estimated at >100 LOC and would change
  the boot trajectory unpredictably. **OUT OF SCOPE per Phase C+2 scope discipline.**
- **(C)** Canonicalization gap. YES: `MmGetPhysicalAddress` is a VA→PA translator
  whose return is consumed opaquely by GPU/audio subsystems. The same per-(tid,name)
  ordinal sentinel scheme that covers `MmAllocatePhysicalMemoryEx` (C+2) applies
  here. Fix: extend `ALLOCATOR_RETURN_FNS`.
- **(D)** Upstream. NO: the predecessor `kernel.call MmGetPhysicalAddress`
  matched cleanly on both engines.

**Selected: (C), diff-tool canonicalization.**

## Step 3: Fix

Extended `ALLOCATOR_RETURN_FNS` in `xenia-rs/tools/diff-events/diff_events.py`
with `"MmGetPhysicalAddress"` and a 20-line comment block explaining the
deferred-Path-β rationale. Zero engine LOC.

Per-(tid,name) ordinal sentinels (`<ALLOC_MmGetPhysicalAddress_N>`) reuse the
existing `canonicalize_allocator_returns` machinery. As long as both engines
call the translator the same number of times in the same per-tid order, the
ordinals line up. A translation-count mismatch correctly surfaces as a
divergence (ordinal drift → distinct sentinels at that position).

The `payload.status` field is auto-mirrored (existing behavior of the
canonicalizer, since the trampoline does not distinguish NTSTATUS from
pointer-typed returns).
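
The sentinel scheme described in Step 3 can be sketched as follows. This is an illustration under stated assumptions, not the contents of `diff_events.py`: the event schema (`kind`, `name`, `tid`, `payload`), the zero-based ordinal, and the body of `canonicalize_allocator_returns` are all guesses made for the example.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Assumed registry contents; the real set lives in diff_events.py.
ALLOCATOR_RETURN_FNS = {"MmAllocatePhysicalMemoryEx", "MmGetPhysicalAddress"}


def canonicalize_allocator_returns(events: List[dict]) -> List[dict]:
    """Swap allocator-family return values for per-(tid, name) ordinal sentinels."""
    counters: Dict[Tuple[int, str], int] = defaultdict(int)
    out = []
    for ev in events:
        # Copy the event and its payload so the input list is left intact.
        ev = {**ev, "payload": dict(ev.get("payload", {}))}
        name = ev.get("name")
        if ev.get("kind") == "kernel.return" and name in ALLOCATOR_RETURN_FNS:
            key = (ev["tid"], name)
            sentinel = f"<ALLOC_{name}_{counters[key]}>"
            counters[key] += 1
            ev["payload"]["return_value"] = sentinel
            # Mirror into status, which carries the same raw value in the trace.
            if "status" in ev["payload"]:
                ev["payload"]["status"] = sentinel
        out.append(ev)
    return out


# The C+25 divergence, written in the assumed schema.
canary = [{"kind": "kernel.return", "name": "MmGetPhysicalAddress", "tid": 6,
           "payload": {"return_value": 353042432, "status": "0x150b0000"}}]
ours = [{"kind": "kernel.return", "name": "MmGetPhysicalAddress", "tid": 1,
         "payload": {"return_value": 182251520, "status": "0x0adcf000"}}]

c = canonicalize_allocator_returns(canary)
o = canonicalize_allocator_returns(ours)
print(c[0]["payload"] == o[0]["payload"])  # True
```

Because the counter is keyed by `(tid, name)` but the sentinel text omits the tid, the first translator return on canary's tid=6 and on ours' tid=1 both become `<ALLOC_MmGetPhysicalAddress_0>`, so the payloads compare equal.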

## Step 4: Tests added

`test_diff_events.py` gains 4 unit tests (lines added at top of `main()`):

1. `test_mm_get_physical_address_in_allocator_set`: registry guard.
2. `test_mm_get_physical_address_canonicalization`: two-call per-tid ordinal.
3. `test_mm_get_physical_address_cross_engine_alignment`: end-to-end; the
   exact C+25 divergence (`0x150B0000` vs `0x0ADCF000`) canonicalizes to the
   same sentinel on both sides.
4. `test_mm_get_physical_address_count_mismatch_still_diverges`: ordinal-drift
   negative test.

39 baseline tests + 4 new = 43 total, all PASS.
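
The negative case (test 4) can be illustrated in isolation. The sketch below does not reproduce the real test body; `sentinel_stream` and `first_divergence` are hypothetical helpers, and each event is reduced to a `(tid, name)` pair for brevity.

```python
from collections import Counter
from typing import List, Optional, Tuple


def sentinel_stream(calls: List[Tuple[int, str]]) -> List[str]:
    """Assign a per-(tid, name) ordinal sentinel to each return event."""
    seen: Counter = Counter()
    out = []
    for tid, name in calls:
        out.append(f"<ALLOC_{name}_{seen[(tid, name)]}>")
        seen[(tid, name)] += 1
    return out


def first_divergence(a: List[str], b: List[str]) -> Optional[int]:
    """Index of the first mismatching position, or None if the streams agree."""
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i
    return None if len(a) == len(b) else min(len(a), len(b))


# canary translates three times; ours allocates in place of the second call.
canary = sentinel_stream([(6, "MmGetPhysicalAddress"),
                          (6, "MmGetPhysicalAddress"),
                          (6, "MmGetPhysicalAddress")])
ours = sentinel_stream([(1, "MmGetPhysicalAddress"),
                        (1, "MmAllocatePhysicalMemoryEx"),
                        (1, "MmGetPhysicalAddress")])

print(first_divergence(canary, ours))  # 1
print(canary[2], ours[2])
```

The streams first differ at index 1, and the later translator returns at index 2 carry different ordinals (`_2` on canary, `_1` on ours), so a count mismatch stays visible instead of being masked by canonicalization.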

## Why no engine fix

Per `project_phase_c2_MmAllocatePhysicalMemoryEx_2026_05_13.md`'s "Future work:
β-class engine fix (deferred)" section:

> If a future Phase C+N session surfaces a divergence whose causal chain goes
> through region-arithmetic on a `MmAllocatePhysicalMemoryEx` return value
> (e.g. `MmGetPhysicalAddress` yielding bus-incompatible addresses for GPU
> command buffers), escalate to engine-side: add 3 physical heaps in
> `xenia-memory` / `KernelState`, route `MmAllocatePhysicalMemoryEx` through
> page-size lookup. Estimated 100-200 LOC + GPU/audio bridge re-validation;
> out of scope for single-session work.

This C+25 divergence IS the predicted scenario. The GPU is in-process here:
both engines independently consume the PA they themselves emitted, so the
opaque-pass-through invariant holds. The PA values diverge between engines
but neither is wrong in its own coordinate space.

The engine fix is deferred to a dedicated Path β session (estimated 100-200 LOC +
multi-subsystem re-validation across GPU command buffer mappings, XMA audio
context mapping via `MmMapIoSpace`, and any guest code paths doing PA
arithmetic). Tripstone #3 explicitly forbids in-session escalation here.

## Why the progression metric is not expected to move

Phase W documented the wedge: tid=1 (main) joins on tid=13, tid=13 waits on
worker event `0x12d0` that never gets signaled. The wedge is upstream of any
GPU activity. Advancing matched-prefix past `MmGetPhysicalAddress` does NOT
exercise any new game-logic branch; it just allows the diff harness to
continue measuring beyond a previously-occluded translator-return divergence.

Per task spec: "If only the secondary metric moves and the primary remains
pinned (`swaps=1, draws=0`), document candidly: 'matched-prefix advanced but
no game progression - wedge persists per Phase W finding'." That's exactly
what happens here.