xenia-rs/audit-runs/phase-c2-MmAllocatePhysicalMemoryEx/investigation.md
MechaCat02 ef93a4fa14 handoff: VSync/event-wedge fixes + iterate 2.A–2.BC research notes
Source changes (dormant parity infra, retained from iterate 2.AI/2.AO):
- xenia-kernel/exports.rs: nt_create_event manual_reset polarity +
  related event wiring
- xenia-gpu/mmio_region.rs: D1MODE_VBLANK_VLINE_STATUS hardcode parity

Also lands the audit-runs/ analysis notes (.md/.txt/.json digests) for the
iterate 2.x VSync/0x10e8/0x1004 wedge investigation. Raw trace dumps
(.jsonl/.gz/.csv/.stdout) and agent worktrees (.claude/) are gitignored as
regenerable local artifacts — see memory + HANDOFF for the running findings.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-05 07:19:08 +02:00

# Phase C+2 — investigation: `MmAllocatePhysicalMemoryEx` at idx=161
## Divergence
| | canary | ours |
|---|---|---|
| `payload.return_value` (idx=161) | `18446744072570929152` = sign-ext `0xFFFFFFFF_BC220000` | `1074810880` = `0x40105000` |
| `payload.status` | `0xbc220000` | `0x40105000` |
| Memory region | physical heap `vA0000000` (base `0xA0000000`, size `0x20000000`, 64KB pages), the only physical heap whose range contains `0xBC220000` | user heap (single bump region `0x40000000` to `0x6FFFFFFF`) |
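
The canary `return_value` in the table is the 32-bit result sign-extended into a 64-bit register value. A minimal Rust sketch of that arithmetic follows; the helper names `sign_extend_u32` and `low32` are illustrative and do not come from either code base.

```rust
/// Sign-extend a 32-bit guest value to 64 bits, as happens when a 32-bit
/// return value with bit 31 set is logged from a 64-bit GPR.
fn sign_extend_u32(value: u32) -> u64 {
    // u32 -> i32 reinterprets the bits, i32 -> i64 sign-extends,
    // i64 -> u64 reinterprets again.
    value as i32 as i64 as u64
}

/// Recover the 32-bit guest VA from a logged 64-bit value.
fn low32(value: u64) -> u32 {
    value as u32
}
```

Applied to the table, `sign_extend_u32(0xBC22_0000)` gives `0xFFFF_FFFF_BC22_0000`, while `0x4010_5000` has bit 31 clear and is unchanged by the extension.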
## Step 1 — Locate `MmAllocatePhysicalMemoryEx` in both engines
### Canary
`xenia-canary/src/xenia/kernel/xboxkrnl/xboxkrnl_memory.cc:415-503`
```cpp
uint32_t xeMmAllocatePhysicalMemoryEx(uint32_t flags, uint32_t region_size,
                                      uint32_t protect_bits,
                                      uint32_t min_addr_range,
                                      uint32_t max_addr_range,
                                      uint32_t alignment) {
  ...
  // page_size = 4096 | 64KB | 16MB based on X_MEM_LARGE_PAGES / X_MEM_16MB_PAGES
  ...
  auto heap = static_cast<PhysicalHeap*>(
      kernel_memory()->LookupHeapByType(true, page_size));
  ...
  heap->AllocRange(heap_min_addr, heap_max_addr, adjusted_size,
                   adjusted_alignment, allocation_type, protect, top_down,
                   &base_address);
  return base_address;
}
```
`LookupHeapByType(physical=true, page_size)` returns one of three physical
heaps based on page_size (`xenia-canary/src/xenia/memory.cc:467-475`):
* `page_size ≤ 4096` → `vE0000000` (base `0xE0000000`, size `0x1FD00000`, 4KB pages)
* `page_size ≤ 64*1024` → `vA0000000` (base `0xA0000000`, size `0x20000000`, 64KB pages)
* else (i.e. 16MB) → `vC0000000` (base `0xC0000000`, size `0x20000000`, 16MB pages)
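
Canary returned `0xBC220000`, which is just below `0xC0000000` and therefore inside `vA0000000` (`0xA0000000` to `0xBFFFFFFF`), near the top of that heap because `top_down=true`. The address is 64KB-aligned but not 16MB-aligned, and `vC0000000` starts at `0xC0000000`, so the request appears to have taken the 64KB-page path (`X_MEM_LARGE_PAGES`) rather than `X_MEM_16MB_PAGES`.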
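
The selection rule above can be modelled in a few lines of Rust. This is a sketch built only from the bases, sizes, and page sizes listed in this note. The `PhysicalHeap` struct and both function names are hypothetical and are not canary's or ours's API.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct PhysicalHeap {
    name: &'static str,
    base: u32,
    size: u32,
    page_size: u32,
}

// Values copied from the list above.
const PHYSICAL_HEAPS: [PhysicalHeap; 3] = [
    PhysicalHeap { name: "vA0000000", base: 0xA000_0000, size: 0x2000_0000, page_size: 64 * 1024 },
    PhysicalHeap { name: "vC0000000", base: 0xC000_0000, size: 0x2000_0000, page_size: 16 * 1024 * 1024 },
    PhysicalHeap { name: "vE0000000", base: 0xE000_0000, size: 0x1FD0_0000, page_size: 4096 },
];

/// Mirrors the three-way page-size split described for LookupHeapByType.
fn lookup_physical_heap(page_size: u32) -> PhysicalHeap {
    if page_size <= 4096 {
        PHYSICAL_HEAPS[2]
    } else if page_size <= 64 * 1024 {
        PHYSICAL_HEAPS[0]
    } else {
        PHYSICAL_HEAPS[1]
    }
}

/// Returns the physical heap whose [base, base + size) range holds `addr`.
fn heap_containing(addr: u32) -> Option<PhysicalHeap> {
    PHYSICAL_HEAPS
        .iter()
        .copied()
        .find(|h| addr >= h.base && addr - h.base < h.size)
}
```

`heap_containing` is useful for checking which heap a logged return value actually falls into, independently of the flags the caller is assumed to have passed.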
### Ours
`xenia-rs/crates/xenia-kernel/src/exports.rs:650-682`:
```rust
fn mm_allocate_physical_memory_ex(ctx, mem, state) {
    let flags = ctx.gpr[3] as u32;
    let size = ctx.gpr[4] as u32;
    if size == 0 { ctx.gpr[3] = 0; return; }
    match state.heap_alloc(size, mem) {
        Some(addr) => ctx.gpr[3] = addr as u64,
        None => ctx.gpr[3] = 0,
    }
}
```
Routes to `KernelState::heap_alloc` (`state.rs:956-974`):
```rust
pub fn heap_alloc(&mut self, size: u32, mem) -> Option<u32> {
    let aligned_size = (size + 0xFFF) & !0xFFF;
    let base = self.heap_cursor.fetch_add(aligned_size, ...); // starts at 0x40000000
    if new_top > 0x6FFF_FFFF { return None; }
    mem.alloc(base, aligned_size, RW)?;
    Some(base)
}
```
`heap_cursor` is initialized to `0x40000000` (`state.rs:325`). At idx=161 the
cursor had advanced to `0x40105000`, i.e. `0x105000` bytes consumed by ~16 prior allocations of roughly 64KB each. Note that `heap_alloc` itself only rounds sizes up to 4KB.
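
To make the cursor arithmetic concrete, here is a simplified, single-threaded model of the bump allocation quoted above. It is a sketch under assumptions: the real code uses an atomic `fetch_add` and also maps pages via `mem.alloc`, both of which are omitted here. `BumpHeap` is a hypothetical name, and `new_top` is assumed to be `base + aligned_size`.

```rust
const HEAP_BASE: u32 = 0x4000_0000;
const HEAP_LIMIT: u32 = 0x6FFF_FFFF;
const PAGE_MASK: u32 = 0xFFF; // 4KB rounding, as in the excerpt

struct BumpHeap {
    cursor: u32,
}

impl BumpHeap {
    fn new() -> Self {
        Self { cursor: HEAP_BASE }
    }

    /// Returns the base of a fresh 4KB-rounded block, or None when the
    /// request is empty or would push the cursor past HEAP_LIMIT.
    fn alloc(&mut self, size: u32) -> Option<u32> {
        if size == 0 {
            return None;
        }
        // checked_add guards the rounding against u32 overflow.
        let aligned = size.checked_add(PAGE_MASK)? & !PAGE_MASK;
        let base = self.cursor;
        let new_top = base.checked_add(aligned)?; // assumed definition
        if new_top > HEAP_LIMIT {
            return None;
        }
        self.cursor = new_top;
        Some(base)
    }
}
```

Every result is `HEAP_BASE` plus the running total of rounded sizes, so the returned VA depends only on allocation history and never on flags or page size.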
## Step 2 — Map both engines' memory layouts
### Canary (`xenia-canary/src/xenia/memory.h:598-608`, `memory.cc:215-242`)
| region | base | size | page | type | purpose |
|---|---|---|---|---|---|
| `v00000000` | `0x00000000` | `0x40000000` | 4KB | virtual | low system / zero page protected |
| `v40000000` | `0x40000000` | `0x3F000000` | 64KB | virtual | user-virtual `NtAllocateVirtualMemory` |
| `v80000000` | `0x80000000` | `0x10000000` | 64KB | XEX image | code+data |
| `v90000000` | `0x90000000` | `0x10000000` | 4KB | XEX image | (alt) |
| `physical` | `0x00000000` | `0x20000000` | 4KB | physical-bus | bus-address space |
| `vA0000000` | `0xA0000000` | `0x20000000` | 64KB | **physical** | 64KB-page physical alloc |
| `vC0000000` | `0xC0000000` | `0x20000000` | 16MB | **physical** | 16MB-page physical alloc |
| `vE0000000` | `0xE0000000` | `0x1FD00000` | 4KB | **physical** | 4KB-page physical alloc |

* `MmAllocatePhysicalMemoryEx` → one of the three physical heaps, chosen by page size.
* `NtAllocateVirtualMemory` → one of the two virtual heaps, chosen by page size.
### Ours (`xenia-rs/crates/xenia-kernel/src/state.rs:325-326`, `:956-985`)
| region | base | size | purpose |
|---|---|---|---|
| `heap_cursor` | `0x40000000` | up to `0x6FFFFFFF` | unified bump-alloc for ALL kernel allocs |
| `stack_cursor` | `0x71000000` | ascending | stack pages |
Ours has a **single** unified user-heap-style bump region. There is **no
distinct physical-memory region**. Both `MmAllocatePhysicalMemoryEx` and
`NtAllocateVirtualMemory` route through `heap_alloc`. The host-side page
table (`xenia-memory` / `heap.rs`) does have `HeapType::GuestPhysical`
defined but the `KernelState` allocator only uses one cursor.
## Step 3 — (α) vs (β) classification
The fundamental question: is ours's memory layout a deliberate simplification
that is later canonicalizable, OR is it a memory-model bug?
**Evidence for (β) — wrong region**:
* Xbox 360 architecturally distinguishes physical-VA regions
  (`0xA0000000`+, `0xC0000000`+, `0xE0000000`+) from virtual-VA regions
  (`0x40000000`+). Game code obtains bus addresses through
  `MmGetPhysicalAddress`, which in effect masks the VA with `& 0x1FFF_FFFF`
  (the Xbox 360 has a 512MB physical bus). Guest VAs in different regions
  can therefore map to different physical addresses, which GPU command
  buffers consume directly.
  * `MmGetPhysicalAddress(0xBC220000) & 0x1FFFFFFF = 0x1C220000`
  * `MmGetPhysicalAddress(0x40105000) & 0x1FFFFFFF = 0x00105000`
  * These are different bus addresses. If the game stores the address in a
    command-buffer descriptor consumed by ours's GPU, the GPU will read
    different memory than the canary's GPU would.
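
The two bus addresses above follow from a single mask. A minimal sketch, assuming the mask-only model stated in this note (the real `MmGetPhysicalAddress` may apply further per-region adjustments; `bus_address` is an illustrative name):

```rust
/// 512MB physical bus: only the low 29 bits of a guest VA survive.
const PHYSICAL_BUS_MASK: u32 = 0x1FFF_FFFF;

/// Mask-only model of the guest-VA to bus-address translation.
fn bus_address(guest_va: u32) -> u32 {
    guest_va & PHYSICAL_BUS_MASK
}
```

Under this model `bus_address(0xBC22_0000)` is `0x1C22_0000` and `bus_address(0x4010_5000)` is `0x0010_5000`, which is the mismatch the (β) argument rests on.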
**Evidence for (α) — same memory model, host-VA drift only**:
* AUDIT-043 (2026-05-09) established that within a single region (canary's
pool at `0xBC32C880` vs ours's pool at `0x40541xxx`), *same logical
allocation* maps to *different guest VAs*. The "same VA backs different
data" tripstone is universal — true within a region, true across
regions. From the diff tool's perspective, both are "host-allocator
divergence ε".
* Phase B's `report.md` explicitly classifies ε as "catalog only".
* Even if (β) is the real issue, fixing it in ours requires:
  - Adding physical-heap regions to `xenia-memory` / `KernelState`.
  - Wiring `MmAllocatePhysicalMemoryEx` to route by page size.
  - Re-validating all downstream code (GPU command buffer, kernel
    objects, audio mixer buffers, etc.) that touched the unified heap.
  - Likely > 100 LOC and changes ours's boot trajectory unpredictably.
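
For scoping purposes only, the routing part of Path β could look roughly like the sketch below. Everything here is hypothetical: `Region`, `PhysicalAllocator`, and the plain top-down bump are stand-ins, and canary's `AllocRange` additionally honours min/max ranges, alignment, and page-table state that this model ignores.

```rust
/// One physical heap, allocated from the top of its range downwards.
struct Region {
    base: u32,
    top: u32, // exclusive upper bound of the remaining free space
    page_size: u32,
}

impl Region {
    fn new(base: u32, size: u32, page_size: u32) -> Self {
        Self { base, top: base + size, page_size }
    }

    fn alloc_top_down(&mut self, size: u32) -> Option<u32> {
        if size == 0 {
            return None;
        }
        let mask = self.page_size - 1; // page sizes are powers of two
        let aligned = size.checked_add(mask)? & !mask;
        let addr = self.top.checked_sub(aligned)?;
        if addr < self.base {
            return None;
        }
        self.top = addr;
        Some(addr)
    }
}

struct PhysicalAllocator {
    small: Region, // 4KB pages
    large: Region, // 64KB pages
    huge: Region,  // 16MB pages
}

impl PhysicalAllocator {
    fn new() -> Self {
        Self {
            small: Region::new(0xE000_0000, 0x1FD0_0000, 4096),
            large: Region::new(0xA000_0000, 0x2000_0000, 64 * 1024),
            huge: Region::new(0xC000_0000, 0x2000_0000, 16 * 1024 * 1024),
        }
    }

    /// Route by page size, using the same three-way split as canary.
    fn alloc(&mut self, size: u32, page_size: u32) -> Option<u32> {
        let region = if page_size <= 4096 {
            &mut self.small
        } else if page_size <= 64 * 1024 {
            &mut self.large
        } else {
            &mut self.huge
        };
        region.alloc_top_down(size)
    }
}
```

Even this toy version produces addresses in the `0xBxxxxxxx` range for 64KB-page requests, but matching canary's exact values would still require reproducing its allocation order and range handling.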
**Decision: this session lands Path α (diff-tool canonicalization)**.
Rationale:
1. The task brief explicitly authorizes Path α for ε-class divergences:
"either (a) canonicalize the comparison (mask out heap-address fields,
similar to image canonicalization for import slots), or (b) align
ours's allocator region with canary's. AUDIT-043 already noted this
is fundamental for emulator pool allocators; class ε is structural."
2. Per "if it requires more than ~100 LOC or touches the core memory
model significantly, STOP and report" — Path β is plausibly that
scope. The honest move is to land the canonicalization (which extends
the matched prefix substantially, see re-validation.md) and leave a
clear marker that Path β is the deeper architectural cleanup, to be
scoped as its own multi-session effort.
3. Path α is **falsifiable**: if downstream divergences at idx 102014+
surface evidence that the unified-region routing actually broke game
logic (e.g. GPU command-buffer corruption, MmGetPhysicalAddress
mismatch in payload data), that's prima facie reason to escalate to
Path β. This session creates the conditions for that observation;
it does not pre-commit to a model rewrite.
**Mixed-case acknowledgement**: ours's `MmAllocatePhysicalMemoryEx`
*may* mis-route in a way that breaks downstream code (β-leak). The
matched-prefix metric (161 → 102014; see re-validation.md) is a *positive* signal that
this is NOT the case for at least the first ~102K events: the game's
boot sequence does not (yet) do region-arithmetic that distinguishes
`0xBC220000` from `0x40105000`. If a later divergence (e.g. at 102014,
`RtlImageXexHeaderField` — out of scope for this session) does turn
out to be a downstream consequence of the wrong region, that's the
trigger to escalate.
## Allocator function set covered by Path α
The canonicalization covers the allocators listed below. This does not widen
the scope of the fix: the divergence at idx=161 is the only one this session
targets. Listing the other allocators only ensures the canonicalization is
uniform and does not surface false ordinal drift later:
* `MmAllocatePhysicalMemoryEx` — the immediate target
* `MmAllocatePhysicalMemory` — same family
* `NtAllocateVirtualMemory` — sibling allocator (returns user-heap VA)
* `RtlAllocateHeap` — Rtl-side heap (returns user-heap VA)
* `MmCreateKernelStack` — stack allocator
If any of these *also* diverge in raw-VA form but the surrounding code
agrees (same ordinal call sequence), they'll silently canonicalize. If
they diverge on call-count ordering, the ordinals drift and the
divergence surfaces correctly at the first drifted call. That's the
right behavior.
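
One way to picture the canonicalization is to replace each allocator's raw return value with a per-function call ordinal before diffing. The sketch below is an assumption about shape, not the diff tool's actual code: `Event`, `CanonicalReturn`, and `canonicalize` are hypothetical names, and keeping a zero return raw (so a failed allocation is not hidden) is a design choice made for this example.

```rust
use std::collections::HashMap;

/// Allocators whose return values are host-allocator dependent.
const ALLOCATORS: [&str; 5] = [
    "MmAllocatePhysicalMemoryEx",
    "MmAllocatePhysicalMemory",
    "NtAllocateVirtualMemory",
    "RtlAllocateHeap",
    "MmCreateKernelStack",
];

/// Hypothetical trace event: export name plus logged return value.
struct Event {
    name: &'static str,
    return_value: u64,
}

#[derive(Debug, Clone, PartialEq, Eq)]
enum CanonicalReturn {
    Raw(u64),
    /// The n-th successful call (0-based) to this allocator in the trace.
    AllocOrdinal(u32),
}

fn canonicalize(events: &[Event]) -> Vec<(&'static str, CanonicalReturn)> {
    let mut counts: HashMap<&'static str, u32> = HashMap::new();
    events
        .iter()
        .map(|e| {
            if ALLOCATORS.contains(&e.name) && e.return_value != 0 {
                let n = counts.entry(e.name).or_insert(0);
                let ordinal = *n;
                *n += 1;
                (e.name, CanonicalReturn::AllocOrdinal(ordinal))
            } else {
                (e.name, CanonicalReturn::Raw(e.return_value))
            }
        })
        .collect()
}
```

Two traces that call the same allocators in the same order compare equal after this step regardless of the VAs returned. A trace that calls a different allocator, or the same ones in a different order, still differs in the name sequence and is reported at the first drifted call.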