
MechaCat02 ef93a4fa14 handoff: VSync/event-wedge fixes + iterate 2.A–2.BC research notes

Source changes (dormant parity infra, retained from iterate 2.AI/2.AO):
- xenia-kernel/exports.rs: nt_create_event manual_reset polarity +
  related event wiring
- xenia-gpu/mmio_region.rs: D1MODE_VBLANK_VLINE_STATUS hardcode parity

Also lands the audit-runs/ analysis notes (.md/.txt/.json digests) for the
iterate 2.x VSync/0x10e8/0x1004 wedge investigation. Raw trace dumps
(.jsonl/.gz/.csv/.stdout) and agent worktrees (.claude/) are gitignored as
regenerable local artifacts — see memory + HANDOFF for the running findings.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

2026-06-05 07:19:08 +02:00

Phase C+2 — investigation: `MmAllocatePhysicalMemoryEx` at idx=161

Divergence

| | canary | ours |
|---|---|---|
| `payload.return_value` (idx=161) | `18446744072570929152` = sign-ext `0xFFFFFFFF_BC220000` | `1074810880` = `0x40105000` |
| `payload.status` | `0xbc220000` | `0x40105000` |
| Memory region | physical heap `vA0000000` (base `0xA0000000`, size `0x20000000`, 64KB pages); `0xBC220000` lies below `0xC0000000`, so it cannot belong to `vC0000000` | user heap (single bump region `0x40000000`–`0x6FFFFFFF`) |
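The two spellings of the canary return value are the same 32-bit guest address viewed through a sign-extending 64-bit register. A minimal sketch of that arithmetic follows, assuming the trace serializes the raw 64-bit GPR. The helper names are illustrative and do not come from either codebase.

```rust
/// Sign-extend a 32-bit guest value to 64 bits, which is how a 64-bit
/// register holding a "negative" 32-bit result shows up in a raw dump.
fn sign_extend_u32(value: u32) -> u64 {
    value as i32 as i64 as u64
}

/// Recover the 32-bit guest address from a 64-bit register dump.
fn low_u32(reg: u64) -> u32 {
    reg as u32
}
```

Comparing `low_u32` of both return values removes the sign-extension noise, but not the region difference itself.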

Step 1 — Locate `MmAllocatePhysicalMemoryEx` in both engines

Canary

xenia-canary/src/xenia/kernel/xboxkrnl/xboxkrnl_memory.cc:415-503

uint32_t xeMmAllocatePhysicalMemoryEx(uint32_t flags, uint32_t region_size,
                                      uint32_t protect_bits,
                                      uint32_t min_addr_range,
                                      uint32_t max_addr_range,
                                      uint32_t alignment) {
  ...
  // page_size = 4096 | 64KB | 16MB based on X_MEM_LARGE_PAGES / X_MEM_16MB_PAGES
  ...
  auto heap = static_cast<PhysicalHeap*>(
      kernel_memory()->LookupHeapByType(true, page_size));
  ...
  heap->AllocRange(heap_min_addr, heap_max_addr, adjusted_size,
                   adjusted_alignment, allocation_type, protect, top_down,
                   &base_address);
  return base_address;
}

LookupHeapByType(physical=true, page_size) returns one of three physical heaps based on page_size (xenia-canary/src/xenia/memory.cc:467-475):

- page_size ≤ 4096 → vE0000000 (base 0xE0000000, size 0x1FD00000, 4KB pages)
- page_size ≤ 64*1024 → vA0000000 (base 0xA0000000, size 0x20000000, 64KB pages)
- else (i.e. 16MB) → vC0000000 (base 0xC0000000, size 0x20000000, 16MB pages)

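Canary returned 0xBC220000, which sits just below 0xC0000000 because top_down=true. That address falls inside vA0000000 (0xA0000000–0xBFFFFFFF), the 64KB-page heap, not vC0000000. By the mapping above, the request therefore appears to have selected 64KB pages (X_MEM_LARGE_PAGES) rather than X_MEM_16MB_PAGES.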
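The selection rule and the heap ranges listed above can be sketched in Rust as follows. This is an illustrative model built only from the bases and sizes quoted in this note. The enum and function names are hypothetical and are not canary's or ours's API.

```rust
/// Physical heaps, named after the canary regions listed above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum PhysicalHeap {
    VA0000000, // 64KB pages
    VC0000000, // 16MB pages
    VE0000000, // 4KB pages
}

/// Mirror of the page-size rule described for LookupHeapByType(physical=true).
fn lookup_physical_heap(page_size: u32) -> PhysicalHeap {
    if page_size <= 4096 {
        PhysicalHeap::VE0000000
    } else if page_size <= 64 * 1024 {
        PhysicalHeap::VA0000000
    } else {
        PhysicalHeap::VC0000000
    }
}

/// Which physical heap, if any, contains a given guest VA.
/// Ranges are base..=base+size-1 using the sizes quoted in this note.
fn heap_containing(addr: u32) -> Option<PhysicalHeap> {
    match addr {
        0xA000_0000..=0xBFFF_FFFF => Some(PhysicalHeap::VA0000000),
        0xC000_0000..=0xDFFF_FFFF => Some(PhysicalHeap::VC0000000),
        0xE000_0000..=0xFFCF_FFFF => Some(PhysicalHeap::VE0000000),
        _ => None,
    }
}
```

Running a returned address such as 0xBC220000 through `heap_containing` is a quick way to cross-check which page size a trace entry implies.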

Ours

xenia-rs/crates/xenia-kernel/src/exports.rs:650-682:

fn mm_allocate_physical_memory_ex(ctx, mem, state) {
    let flags = ctx.gpr[3] as u32;
    let size = ctx.gpr[4] as u32;
    if size == 0 { ctx.gpr[3] = 0; return; }
    match state.heap_alloc(size, mem) {
        Some(addr) => ctx.gpr[3] = addr as u64,
        None       => ctx.gpr[3] = 0,
    }
}

Routes to KernelState::heap_alloc (state.rs:956-974):

pub fn heap_alloc(&mut self, size: u32, mem) -> Option<u32> {
    let aligned_size = (size + 0xFFF) & !0xFFF;
    let base = self.heap_cursor.fetch_add(aligned_size, ...);  // starts at 0x40000000
    if new_top > 0x6FFF_FFFF { return None; }
    mem.alloc(base, aligned_size, RW)?;
    Some(base)
}

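heap_cursor is initialized to 0x40000000 (state.rs:325). At idx=161 the cursor had advanced to 0x40105000, i.e. 0x105000 bytes had been consumed by prior allocations (about sixteen 64KB blocks' worth plus 0x5000). Because heap_alloc rounds sizes up to 4KB only, the returned address is 4KB-aligned rather than 64KB-aligned.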
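The cursor arithmetic can be reproduced with a small stand-alone model. This is a sketch under assumptions, not the code in state.rs: it omits the backing `mem.alloc` call, and it uses `fetch_update` instead of `fetch_add` so that a failed request does not move the cursor. The type and constant names are hypothetical.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

const HEAP_BASE: u32 = 0x4000_0000;
const HEAP_LIMIT: u32 = 0x6FFF_FFFF; // last usable byte of the bump region

/// Minimal model of a single-cursor bump allocator.
struct BumpHeap {
    cursor: AtomicU32,
}

impl BumpHeap {
    fn new() -> Self {
        Self { cursor: AtomicU32::new(HEAP_BASE) }
    }

    /// Round `size` up to 4KB and hand out the current cursor position.
    /// Returns None for zero-sized requests or when the region is exhausted.
    fn alloc(&self, size: u32) -> Option<u32> {
        if size == 0 {
            return None;
        }
        let aligned = size.checked_add(0xFFF)? & !0xFFF;
        self.cursor
            .fetch_update(Ordering::SeqCst, Ordering::SeqCst, |cur| {
                let end = cur.checked_add(aligned)?;
                if end - 1 > HEAP_LIMIT { None } else { Some(end) }
            })
            .ok() // Ok(previous cursor) is the base of the new block
    }
}
```

Once 0x105000 bytes have been consumed, the next request in this model lands at 0x40105000 regardless of which export asked for it, which is the behaviour observed at idx=161.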

Step 2 — Map both engines' memory layouts

Canary (`xenia-canary/src/xenia/memory.h:598-608`, `memory.cc:215-242`)

| region | base | size | page | type | purpose |
|---|---|---|---|---|---|
| `v00000000` | `0x00000000` | `0x40000000` | 4KB | virtual | low system / zero page protected |
| `v40000000` | `0x40000000` | `0x3F000000` | 64KB | virtual | user-virtual `NtAllocateVirtualMemory` |
| `v80000000` | `0x80000000` | `0x10000000` | 64KB | XEX image | code+data |
| `v90000000` | `0x90000000` | `0x10000000` | 4KB | XEX image | (alt) |
| `physical` | `0x00000000` | `0x20000000` | 4KB | physical-bus | bus-address space |
| `vA0000000` | `0xA0000000` | `0x20000000` | 64KB | physical | 64KB-page physical alloc |
| `vC0000000` | `0xC0000000` | `0x20000000` | 16MB | physical | 16MB-page physical alloc |
| `vE0000000` | `0xE0000000` | `0x1FD00000` | 4KB | physical | 4KB-page physical alloc |

MmAllocatePhysicalMemoryEx → one of the three physical heaps based on page size. NtAllocateVirtualMemory → one of the two virtual heaps based on page size.

Ours (`xenia-rs/crates/xenia-kernel/src/state.rs:325-326`, `:956-985`)

| region | base | size | purpose |
|---|---|---|---|
| `heap_cursor` | `0x40000000` | up to `0x6FFFFFFF` | unified bump-alloc for ALL kernel allocs |
| `stack_cursor` | `0x71000000` | ascending | stack pages |

Ours has a single unified user-heap-style bump region. There is no distinct physical-memory region. Both MmAllocatePhysicalMemoryEx and NtAllocateVirtualMemory route through heap_alloc. The host-side page table (xenia-memory / heap.rs) does have HeapType::GuestPhysical defined but the KernelState allocator only uses one cursor.

Step 3 — (α) vs (β) classification

The fundamental question: is ours's memory layout a deliberate simplification that is later canonicalizable, OR is it a memory-model bug?

Evidence for (β) — wrong region:

The Xbox 360 architecturally distinguishes physical-VA regions (0xA0000000+, 0xC0000000+, 0xE0000000+) from virtual-VA regions (0x40000000+). MmGetPhysicalAddress derives a bus address by masking the guest VA with 0x1FFF_FFFF (the Xbox 360 has a 512MB physical bus). Guest VAs in different regions therefore map to different physical addresses, which GPU command buffers consume directly:
- 0xBC220000 & 0x1FFFFFFF = 0x1C220000
- 0x40105000 & 0x1FFFFFFF = 0x00105000

These are different bus addresses. If the game stores the VA in a command-buffer descriptor consumed by ours's GPU, the GPU will read different memory than canary's GPU would.
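The bus-address arithmetic above can be checked with a short sketch. It assumes the plain 512MB mask quoted in this note and ignores any per-region adjustment a real MmGetPhysicalAddress implementation might apply. The constant and function names are illustrative.

```rust
/// 512MB physical bus: only the low 29 bits of a guest VA select a bus address.
const PHYSICAL_BUS_MASK: u32 = 0x1FFF_FFFF;

/// Simplified model of the VA -> bus-address reduction described above.
fn physical_bus_address(guest_va: u32) -> u32 {
    guest_va & PHYSICAL_BUS_MASK
}

/// True when two guest VAs would alias the same bus address under this model.
fn same_bus_address(a: u32, b: u32) -> bool {
    physical_bus_address(a) == physical_bus_address(b)
}
```

Under this model the canary and ours return values never alias, so any consumer that works on bus addresses would see the two runs touching different memory.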

Evidence for (α) — same memory model, host-VA drift only:

AUDIT-043 (2026-05-09) established that within a single region (canary's pool at 0xBC32C880 vs ours's pool at 0x40541xxx), the same logical allocation maps to different guest VAs. The "same VA backs different data" tripstone is universal: it holds within a region and across regions. From the diff tool's perspective, both are "host-allocator divergence ε".
Phase B's report.md explicitly classifies ε as "catalog only".
Even if (β) is the real issue, fixing it in ours requires:
- Adding physical-heap regions to xenia-memory / KernelState.
- Wiring MmAllocatePhysicalMemoryEx to route by page size.
- Re-validating all downstream code (GPU command buffer, kernel objects, audio mixer buffers, etc.) that touched the unified heap.
- Likely > 100 LOC and changes ours's boot trajectory unpredictably.

Decision: this session lands Path α (diff-tool canonicalization). Rationale:

The task brief explicitly authorizes Path α for ε-class divergences: "either (a) canonicalize the comparison (mask out heap-address fields, similar to image canonicalization for import slots), or (b) align ours's allocator region with canary's. AUDIT-043 already noted this is fundamental for emulator pool allocators; class ε is structural."
Per "if it requires more than ~100 LOC or touches the core memory model significantly, STOP and report" — Path β is plausibly that scope. The honest move is to land the canonicalization (which extends the matched prefix substantially, see re-validation.md) and leave a clear marker that Path β is the deeper architectural cleanup, to be scoped as its own multi-session effort.
Path α is falsifiable: if downstream divergences at idx 102014+ surface evidence that the unified-region routing actually broke game logic (e.g. GPU command-buffer corruption, MmGetPhysicalAddress mismatch in payload data), that's prima facie reason to escalate to Path β. This session creates the conditions for that observation; it does not pre-commit to a model rewrite.

Mixed-case acknowledgement: ours's MmAllocatePhysicalMemoryEx may mis-route in a way that breaks downstream code (β-leak). The matched-prefix metric below (161 → 102014) is a positive signal that this is NOT the case for at least the first ~102K events: the game's boot sequence does not (yet) do region-arithmetic that distinguishes 0xBC220000 from 0x40105000. If a later divergence (e.g. at 102014, RtlImageXexHeaderField — out of scope for this session) does turn out to be a downstream consequence of the wrong region, that's the trigger to escalate.

Allocator function set covered by Path α

The canonicalization covers the whole allocator family for uniformity. This does not widen the scope of the fix: the divergence at idx=161 is the only one this session targets. Listing the other allocators only ensures that the canonicalization is applied consistently and does not surface false ordinal drift later.

- MmAllocatePhysicalMemoryEx: the immediate target
- MmAllocatePhysicalMemory: same family
- NtAllocateVirtualMemory: sibling allocator (returns user-heap VA)
- RtlAllocateHeap: Rtl-side heap (returns user-heap VA)
- MmCreateKernelStack: stack allocator

If any of these also diverge in raw-VA form but the surrounding code agrees (same ordinal call sequence), they'll silently canonicalize. If they diverge on call-count ordering, the ordinals drift and the divergence surfaces correctly at the first drifted call. That's the right behavior.
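One way to picture the ordinal-based canonicalization is the sketch below. It is a hypothetical model, not the diff tool's actual code: the type names, the per-export counter, and the choice to keep a zero return value raw (so that a failed allocation still diverges from a successful one) are all assumptions made for illustration.

```rust
use std::collections::HashMap;

/// Exports whose return value depends on the host allocator (list from this note).
const ALLOCATOR_EXPORTS: [&str; 5] = [
    "MmAllocatePhysicalMemoryEx",
    "MmAllocatePhysicalMemory",
    "NtAllocateVirtualMemory",
    "RtlAllocateHeap",
    "MmCreateKernelStack",
];

/// Comparison key for one traced call's return value.
#[derive(Debug, Clone, PartialEq, Eq)]
enum ReturnKey {
    /// Non-allocator exports and failed allocations compare by raw value.
    Raw(u64),
    /// Successful allocations compare as "n-th call to this export".
    Alloc { export: String, ordinal: u32 },
}

/// One instance per trace; counts allocator calls per export.
#[derive(Default)]
struct Canonicalizer {
    calls: HashMap<String, u32>,
}

impl Canonicalizer {
    fn key(&mut self, export: &str, return_value: u64) -> ReturnKey {
        if !ALLOCATOR_EXPORTS.contains(&export) {
            return ReturnKey::Raw(return_value);
        }
        // Every call advances the ordinal, even a failed one.
        let counter = self.calls.entry(export.to_string()).or_insert(0);
        let ordinal = *counter;
        *counter += 1;
        if return_value == 0 {
            ReturnKey::Raw(0)
        } else {
            ReturnKey::Alloc { export: export.to_string(), ordinal }
        }
    }
}
```

With one `Canonicalizer` per engine, two traces that make the same allocator calls in the same order produce equal keys even though the raw VAs differ. A missing or extra call shifts the ordinals and shows up as a divergence at the first drifted call.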
