handoff: VSync/event-wedge fixes + iterate 2.A–2.BC research notes
Source changes (dormant parity infra, retained from iterate 2.AI/2.AO):
- xenia-kernel/exports.rs: nt_create_event manual_reset polarity + related event wiring
- xenia-gpu/mmio_region.rs: D1MODE_VBLANK_VLINE_STATUS hardcode parity

Also lands the audit-runs/ analysis notes (.md/.txt/.json digests) for the iterate 2.x VSync/0x10e8/0x1004 wedge investigation. Raw trace dumps (.jsonl/.gz/.csv/.stdout) and agent worktrees (.claude/) are gitignored as regenerable local artifacts; see memory + HANDOFF for the running findings.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
189
audit-runs/phase-c2-MmAllocatePhysicalMemoryEx/investigation.md
Normal file
@@ -0,0 +1,189 @@
# Phase C+2 — investigation: `MmAllocatePhysicalMemoryEx` at idx=161
## Divergence

| | canary | ours |
|---|---|---|
| `payload.return_value` (idx=161) | `18446744072570929152` = sign-ext `0xFFFFFFFF_BC220000` | `1074810880` = `0x40105000` |
| `payload.status` | `0xbc220000` | `0x40105000` |
| Memory region | physical heap `vA0000000` (base `0xA0000000`, size `0x20000000`, 64KB pages; `0xBC220000` lies below `0xC0000000`) | user heap (single bump region `0x40000000`–`0x6FFFFFFF`) |

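
The two spellings of canary's return value in the table can be checked with a few lines of arithmetic. The sketch below assumes the trace logger widens the 32-bit return register to a signed 64-bit field; the helper names `sign_extend_u32` and `low32` are illustrative and do not come from either engine.

```rust
/// Sign-extend a 32-bit guest value to 64 bits, as a logger would if it
/// stored the 32-bit return register in a signed 64-bit field (assumption).
pub fn sign_extend_u32(value: u32) -> u64 {
    value as i32 as i64 as u64
}

/// Recover the 32-bit guest value by truncating the 64-bit field.
pub fn low32(value: u64) -> u32 {
    value as u32
}
```

Under that assumption, `sign_extend_u32(0xBC22_0000)` yields `18446744072570929152`, while `0x40105000` has its top bit clear and is unchanged by the widening, which is why only the canary column shows the `0xFFFFFFFF` prefix.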
## Step 1 — Locate `MmAllocatePhysicalMemoryEx` in both engines
### Canary

`xenia-canary/src/xenia/kernel/xboxkrnl/xboxkrnl_memory.cc:415-503`

```cpp
uint32_t xeMmAllocatePhysicalMemoryEx(uint32_t flags, uint32_t region_size,
                                      uint32_t protect_bits,
                                      uint32_t min_addr_range,
                                      uint32_t max_addr_range,
                                      uint32_t alignment) {
  ...
  // page_size = 4096 | 64KB | 16MB based on X_MEM_LARGE_PAGES / X_MEM_16MB_PAGES
  ...
  auto heap = static_cast<PhysicalHeap*>(
      kernel_memory()->LookupHeapByType(true, page_size));
  ...
  heap->AllocRange(heap_min_addr, heap_max_addr, adjusted_size,
                   adjusted_alignment, allocation_type, protect, top_down,
                   &base_address);
  return base_address;
}
```

`LookupHeapByType(physical=true, page_size)` returns one of three physical
heaps based on `page_size` (`xenia-canary/src/xenia/memory.cc:467-475`):

* `page_size ≤ 4096` → `vE0000000` (base `0xE0000000`, size `0x1FD00000`, 4KB pages)
* `page_size ≤ 64*1024` → `vA0000000` (base `0xA0000000`, size `0x20000000`, 64KB pages)
* else (i.e. 16MB) → `vC0000000` (base `0xC0000000`, size `0x20000000`, 16MB pages)

Canary returned `0xBC220000`, which lies just below `0xC0000000`, consistent
with `top_down=true`. By the ranges above, that address falls inside
`vA0000000` (`0xA0000000`–`0xBFFFFFFF`, 64KB pages) rather than `vC0000000`,
and it is not 16MB-aligned, so the request most likely used
`X_MEM_LARGE_PAGES` (64KB pages), not `X_MEM_16MB_PAGES`.

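
The page-size rule and the heap ranges listed above can be sketched together to check which heap a returned address belongs to. This is a minimal illustration built only from the bases and sizes quoted in this note; `PHYSICAL_HEAPS`, `physical_heap_for_page_size`, and `physical_heap_containing` are hypothetical names, not canary's or ours's API.

```rust
/// Physical heaps as quoted in this note: (name, base, size, page size).
pub static PHYSICAL_HEAPS: [(&str, u32, u32, u32); 3] = [
    ("vA0000000", 0xA000_0000, 0x2000_0000, 64 * 1024),
    ("vC0000000", 0xC000_0000, 0x2000_0000, 16 * 1024 * 1024),
    ("vE0000000", 0xE000_0000, 0x1FD0_0000, 4096),
];

/// Mirrors the rule described for `LookupHeapByType(true, page_size)`.
pub fn physical_heap_for_page_size(page_size: u32) -> &'static str {
    if page_size <= 4096 {
        "vE0000000"
    } else if page_size <= 64 * 1024 {
        "vA0000000"
    } else {
        "vC0000000"
    }
}

/// Returns the heap whose `[base, base + size)` range contains `va`, if any.
pub fn physical_heap_containing(va: u32) -> Option<&'static str> {
    let va = u64::from(va);
    PHYSICAL_HEAPS
        .iter()
        .find(|(_, base, size, _)| {
            // Widen to u64 so `base + size` cannot overflow.
            let start = u64::from(*base);
            let end = start + u64::from(*size);
            va >= start && va < end
        })
        .map(|(name, _, _, _)| *name)
}
```

With these ranges, `physical_heap_containing(0xBC22_0000)` reports `vA0000000`, and `physical_heap_containing(0x4010_5000)` reports no physical heap at all.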
### Ours
`xenia-rs/crates/xenia-kernel/src/exports.rs:650-682`:

```rust
// Abridged excerpt; parameter types elided.
fn mm_allocate_physical_memory_ex(ctx, mem, state) {
    let flags = ctx.gpr[3] as u32;
    let size = ctx.gpr[4] as u32;
    if size == 0 { ctx.gpr[3] = 0; return; }
    // Every request goes through the single bump allocator.
    match state.heap_alloc(size, mem) {
        Some(addr) => ctx.gpr[3] = addr as u64,
        None => ctx.gpr[3] = 0,
    }
}
```

Routes to `KernelState::heap_alloc` (`state.rs:956-974`):

```rust
// Abridged excerpt; `new_top` is computed in the elided code.
pub fn heap_alloc(&mut self, size: u32, mem) -> Option<u32> {
    let aligned_size = (size + 0xFFF) & !0xFFF; // round up to 4KB
    let base = self.heap_cursor.fetch_add(aligned_size, ...); // starts at 0x40000000
    if new_top > 0x6FFF_FFFF { return None; }
    mem.alloc(base, aligned_size, RW)?;
    Some(base)
}
```

`heap_cursor` is initialized to `0x40000000` (`state.rs:325`). At idx=161 the
cursor had advanced to `0x40105000`, i.e. prior allocations had consumed
`0x105000` bytes (about 16 × 64KB), each rounded up to a 4KB multiple by
`heap_alloc`.

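
The cursor arithmetic can be reproduced with a self-contained bump allocator. The sketch below is a hypothetical stand-in, not the `KernelState` code: `BumpHeap` and its allocation sizes are made up for illustration, the atomic cursor is replaced by a plain field, and the host-side `mem.alloc` call is left out. It keeps the 4KB rounding and the `0x6FFF_FFFF` limit check from the excerpt.

```rust
/// Hypothetical stand-in for the unified bump region described above.
pub struct BumpHeap {
    cursor: u32,
}

impl BumpHeap {
    pub const BASE: u32 = 0x4000_0000;
    pub const LIMIT: u32 = 0x6FFF_FFFF;

    pub fn new() -> Self {
        Self { cursor: Self::BASE }
    }

    /// Rounds `size` up to a 4KB multiple and bumps the cursor.
    /// Returns `None` for zero-sized requests, overflow, or exhaustion.
    pub fn alloc(&mut self, size: u32) -> Option<u32> {
        if size == 0 {
            return None;
        }
        let aligned = size.checked_add(0xFFF)? & !0xFFF;
        let base = self.cursor;
        let new_top = base.checked_add(aligned)?;
        // Same comparison as the excerpt above.
        if new_top > Self::LIMIT {
            return None;
        }
        self.cursor = new_top;
        Some(base)
    }
}
```

For example, after one `0x100000`-byte allocation and one `0x4001`-byte allocation (rounded up to `0x5000`), the next call returns `0x40105000`. The page-size flags never enter the calculation, so the result always lands in the `0x40000000` region.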
## Step 2 — Map both engines' memory layouts
### Canary (`xenia-canary/src/xenia/memory.h:598-608`, `memory.cc:215-242`)
| region | base | size | page | type | purpose |
|---|---|---|---|---|---|
| `v00000000` | `0x00000000` | `0x40000000` | 4KB | virtual | low system / zero page protected |
| `v40000000` | `0x40000000` | `0x3F000000` | 64KB | virtual | user-virtual `NtAllocateVirtualMemory` |
| `v80000000` | `0x80000000` | `0x10000000` | 64KB | XEX image | code+data |
| `v90000000` | `0x90000000` | `0x10000000` | 4KB | XEX image | (alt) |
| `physical` | `0x00000000` | `0x20000000` | 4KB | physical-bus | bus-address space |
| `vA0000000` | `0xA0000000` | `0x20000000` | 64KB | **physical** | 64KB-page physical alloc |
| `vC0000000` | `0xC0000000` | `0x20000000` | 16MB | **physical** | 16MB-page physical alloc |
| `vE0000000` | `0xE0000000` | `0x1FD00000` | 4KB | **physical** | 4KB-page physical alloc |

* `MmAllocatePhysicalMemoryEx` → one of the three physical heaps based on page size.
* `NtAllocateVirtualMemory` → one of the two virtual heaps based on page size.

### Ours (`xenia-rs/crates/xenia-kernel/src/state.rs:325-326`, `:956-985`)
| region | base | size | purpose |
|---|---|---|---|
| `heap_cursor` | `0x40000000` | up to `0x6FFFFFFF` | unified bump-alloc for ALL kernel allocs |
| `stack_cursor` | `0x71000000` | ascending | stack pages |

Ours has a **single** unified user-heap-style bump region. There is **no
distinct physical-memory region**. Both `MmAllocatePhysicalMemoryEx` and
`NtAllocateVirtualMemory` route through `heap_alloc`. The host-side page
table (`xenia-memory` / `heap.rs`) does have `HeapType::GuestPhysical`
defined, but the `KernelState` allocator only uses one cursor.

## Step 3 — (α) vs (β) classification

The fundamental question: is ours's memory layout a deliberate simplification
that can be canonicalized later, or is it a memory-model bug?

**Evidence for (β): wrong region**

* Xbox 360 architecturally distinguishes physical-VA regions
  (`0xA0000000`+, `0xC0000000`+, `0xE0000000`+) from virtual-VA regions
  (`0x40000000`+). Game code that uses `MmGetPhysicalAddress` effectively
  masks the VA with `& 0x1FFF_FFFF` (Xbox 360 has a 512MB physical bus).
  Different *guest* VAs in different *regions* therefore map to different
  *physical* addresses, which GPU command buffers consume directly.
  * `MmGetPhysicalAddress(0xBC220000) & 0x1FFFFFFF = 0x1C220000`
  * `MmGetPhysicalAddress(0x40105000) & 0x1FFFFFFF = 0x00105000`
  * These are different bus addresses. If the game stores the VA in a
    command-buffer descriptor consumed by ours's GPU, the GPU will read
    different memory than the canary's GPU would.

**Evidence for (α): same memory model, host-VA drift only**

* AUDIT-043 (2026-05-09) established that within a single region (canary's
  pool at `0xBC32C880` vs ours's pool at `0x40541xxx`), the *same logical
  allocation* maps to *different guest VAs*. The "same VA backs different
  data" tripstone is universal: true within a region, true across
  regions. From the diff tool's perspective, both are "host-allocator
  divergence ε".
* Phase B's `report.md` explicitly classifies ε as "catalog only".
* Even if (β) is the real issue, fixing it in ours requires:
  - Adding physical-heap regions to `xenia-memory` / `KernelState`.
  - Wiring `MmAllocatePhysicalMemoryEx` to route by page size.
  - Re-validating all downstream code (GPU command buffer, kernel
    objects, audio mixer buffers, etc.) that touched the unified heap.
  - Likely > 100 LOC, with unpredictable changes to ours's boot trajectory.

**Decision: this session lands Path α (diff-tool canonicalization)**.

Rationale:

1. The task brief explicitly authorizes Path α for ε-class divergences:
   "either (a) canonicalize the comparison (mask out heap-address fields,
   similar to image canonicalization for import slots), or (b) align
   ours's allocator region with canary's. AUDIT-043 already noted this
   is fundamental for emulator pool allocators; class ε is structural."
2. Per "if it requires more than ~100 LOC or touches the core memory
   model significantly, STOP and report", Path β is plausibly in that
   scope. The honest move is to land the canonicalization (which extends
   the matched prefix substantially, see re-validation.md) and leave a
   clear marker that Path β is the deeper architectural cleanup, to be
   scoped as its own multi-session effort.
3. Path α is **falsifiable**: if downstream divergences at idx 102014+
   surface evidence that the unified-region routing actually broke game
   logic (e.g. GPU command-buffer corruption, `MmGetPhysicalAddress`
   mismatch in payload data), that is prima facie reason to escalate to
   Path β. This session creates the conditions for that observation;
   it does not pre-commit to a model rewrite.

**Mixed-case acknowledgement**: ours's `MmAllocatePhysicalMemoryEx`
*may* mis-route in a way that breaks downstream code (β-leak). The
matched-prefix metric (161 → 102014; see re-validation.md) is a
*positive* signal that this is NOT the case for at least the first ~102K
events: the game's boot sequence does not (yet) do region arithmetic
that distinguishes `0xBC220000` from `0x40105000`. If a later divergence
(e.g. at 102014, `RtlImageXexHeaderField`, out of scope for this session)
does turn out to be a downstream consequence of the wrong region, that
is the trigger to escalate.

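
The region arithmetic discussed in this step can be written out as a small check. The sketch below assumes, as the (β) evidence does, that a bus address is the guest VA masked with `0x1FFF_FFFF`; `bus_address` and `in_physical_window` are illustrative names, and the window bounds are taken from the canary layout table in Step 2.

```rust
/// 512MB physical bus: only the low 29 bits of a guest VA select bus memory.
pub const PHYSICAL_BUS_MASK: u32 = 0x1FFF_FFFF;

/// Bus address implied by a guest VA under the masking assumption above.
pub fn bus_address(guest_va: u32) -> u32 {
    guest_va & PHYSICAL_BUS_MASK
}

/// True if `guest_va` lies in one of canary's physical-VA heaps
/// (`vA0000000`, `vC0000000`, `vE0000000`), which are contiguous.
pub fn in_physical_window(guest_va: u32) -> bool {
    (0xA000_0000u32..0xFFD0_0000u32).contains(&guest_va)
}
```

Any guest code that performs either computation would see `0xBC220000` and `0x40105000` differently: the bus addresses are `0x1C220000` and `0x00105000`, and only the first lies in the physical window. That is the kind of observation that would justify escalating to Path β.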
## Allocator function set covered by Path α

The canonicalization covers the full allocator set listed here. This does
not widen the scope of the fix: the divergence at idx=161 is the only one
this session targets. Listing the other allocators only ensures that the
canonicalization is uniform and does not surface false ordinal drift later.

* `MmAllocatePhysicalMemoryEx`: the immediate target
* `MmAllocatePhysicalMemory`: same family
* `NtAllocateVirtualMemory`: sibling allocator (returns user-heap VA)
* `RtlAllocateHeap`: Rtl-side heap (returns user-heap VA)
* `MmCreateKernelStack`: stack allocator

If any of these *also* diverge in raw-VA form but the surrounding code
agrees (same ordinal call sequence), they'll silently canonicalize. If
they diverge on call-count ordering, the ordinals drift and the
divergence surfaces correctly at the first drifted call. That's the
right behavior.
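
As a rough illustration of this behavior, the sketch below canonicalizes the return value of the listed allocators before comparing two traces. It is not the diff tool's implementation: the `Event` shape, the `ev` helper, and the choice to collapse every non-zero return value to `1` (so that a zero versus non-zero difference still shows up) are all assumptions made for the example.

```rust
/// Hypothetical trace event: export name plus logged return value.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Event {
    pub name: String,
    pub return_value: u64,
}

/// Convenience constructor for the example.
pub fn ev(name: &str, return_value: u64) -> Event {
    Event { name: name.to_string(), return_value }
}

/// Allocator set covered by Path α, as listed above.
const ALLOCATORS: [&str; 5] = [
    "MmAllocatePhysicalMemoryEx",
    "MmAllocatePhysicalMemory",
    "NtAllocateVirtualMemory",
    "RtlAllocateHeap",
    "MmCreateKernelStack",
];

/// Collapses allocator return values to 0 or 1 so raw VAs never compare.
pub fn canonicalize(event: &Event) -> Event {
    let mut out = event.clone();
    if ALLOCATORS.contains(&event.name.as_str()) {
        out.return_value = u64::from(event.return_value != 0);
    }
    out
}

/// Number of leading events that agree after canonicalization.
pub fn matched_prefix(a: &[Event], b: &[Event]) -> usize {
    a.iter()
        .zip(b.iter())
        .take_while(|(x, y)| canonicalize(x) == canonicalize(y))
        .count()
}
```

Because the export name is still compared, a trace in which one engine makes an extra allocator call stops matching at the first drifted ordinal, while a pure raw-VA difference such as `0xBC220000` versus `0x40105000` no longer ends the prefix.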