xenia-rs/audit-runs/phase-c3-RtlImageXexHeaderField/investigation.md
MechaCat02 ef93a4fa14 handoff: VSync/event-wedge fixes + iterate 2.A–2.BC research notes
Source changes (dormant parity infra, retained from iterate 2.AI/2.AO):
- xenia-kernel/exports.rs: nt_create_event manual_reset polarity +
  related event wiring
- xenia-gpu/mmio_region.rs: D1MODE_VBLANK_VLINE_STATUS hardcode parity

Also lands the audit-runs/ analysis notes (.md/.txt/.json digests) for the
iterate 2.x VSync/0x10e8/0x1004 wedge investigation. Raw trace dumps
(.jsonl/.gz/.csv/.stdout) and agent worktrees (.claude/) are gitignored as
regenerable local artifacts — see memory + HANDOFF for the running findings.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-05 07:19:08 +02:00


Phase C+3 — investigation: RtlImageXexHeaderField at idx=102014

Divergence

| | canary | ours (pre-fix) |
|---|---|---|
| payload.return_value (idx=102014, tid=6→1) | 805433576 = 0x3001F0E8 | 0 |
| payload.status | 0x3001f0e8 | 0x00000000 |
| Game thread | tid=6 main | tid=1 main |
| Next event (idx=102015) | NtCreateFile | NtCreateFile (matches) |

Surrounding context (idx 102009..102013): RtlLeaveCriticalSection → RtlImageXexHeaderField.

0x3001F0E8 is in canary's virtual-heap region (0x30xxxxxx) — the Memory::SystemHeapAlloc band — so the value is a guest VA pointing inside canary's in-guest XEX header copy (allocated in user_module.cc:224 as guest_xex_header_). Ours returns 0 because its stub rtl_image_xex_header_field (exports.rs:2391-2395) returned 0 unconditionally.

Step 1 — Event context at idx=102014

From canary's existing Phase A capture (xenia-rs/audit-runs/phase-c-first-divergence/phase-a/canary.jsonl), canary's tid=6 makes only two RtlImageXexHeaderField calls in the matched prefix:

| event idx | event kind | payload |
|---|---|---|
| 0 | import.call | RtlImageXexHeaderField |
| 1 | kernel.call | RtlImageXexHeaderField (args:{}; schema-v1 doesn't capture args) |
| 2 | kernel.return | return_value=0 status=0x00000000 |
| 102012 | import.call | RtlImageXexHeaderField |
| 102013 | kernel.call | RtlImageXexHeaderField |
| 102014 | kernel.return | return_value=805433576 status=0x3001f0e8 |

Ours pre-fix makes the same call sequence (verified by capture in phase-c1-keQuerySystemTime/ours.jsonl) — both RtlImageXexHeaderField calls returned 0.

Schema-v1 records empty args:{}, so field_key (r4) and xex_header_ptr (r3) aren't directly readable from the JSONL. A one-shot eprintln in ours's stub revealed both calls pass:

  • call #1: xex_header_ptr=0x00000000 field_key=0x00020401 (DEFAULT_HEAP_SIZE — not present in this XEX, so even with a valid header pointer the result would be 0)
  • call #2: xex_header_ptr=0x00000000 field_key=0x00040006 (EXECUTION_INFO — low byte 0x06, "else" class, returns header_base + offset(0x10E8))

xenia-rs/target/release/xenia-rs info against the ISO confirms the in-XEX optional-header table. Key 0x00040006 is present with value 0x000010E8; key 0x00020401 is not present. So canary's 0x3001F0E8 = 0x3001E000 + 0x10E8 — canary's guest_xex_header_ lives at 0x3001E000. The game queries EXECUTION_INFO and uses the returned VA to read media_id / title_id / disc_number / disc_count.
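The arithmetic above can be checked mechanically. The following is a minimal sketch, not code from either engine; the helper names `split_header_va` and `key_class` are hypothetical and exist only for illustration.

```rust
/// Low byte of an optional-header key. Per the analysis above this byte
/// selects how the table entry is interpreted (0x06 falls in the "else" class).
fn key_class(key: u32) -> u8 {
    (key & 0xFF) as u8
}

/// Split a returned guest VA into (header_base, offset), given the offset
/// stored in the optional-header table for the queried key.
/// Returns None if the VA is smaller than the offset (cannot be base + offset).
fn split_header_va(returned_va: u32, field_offset: u32) -> Option<(u32, u32)> {
    returned_va
        .checked_sub(field_offset)
        .map(|base| (base, field_offset))
}
```

Calling `split_header_va(0x3001_F0E8, 0x10E8)` recovers the header base that the text attributes to canary's guest_xex_header_.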

Step 2 — Source-read both engines

Canary

xenia-canary/src/xenia/kernel/xboxkrnl/xboxkrnl_rtl.cc:501-515:

pointer_result_t RtlImageXexHeaderField_entry(pointer_t<xex2_header> xex_header,
                                              dword_t field_dword) {
  uint32_t field_value = 0;
  uint32_t field = field_dword;
  if (!xex_header) {
    return field_value;
  }
  UserModule::GetOptHeader(kernel_memory(), xex_header, xex2_header_keys(field),
                           &field_value);
  return field_value;
}

UserModule::GetOptHeader (user_module.cc:335-369):

for (uint32_t i = 0; i < header->header_count; i++) {
  auto& opt_header = header->headers[i];
  if (opt_header.key != key) continue;
  switch (opt_header.key & 0xFF) {
    case 0x00: field_value = opt_header.value; break;
    case 0x01: field_value = memory->HostToGuestVirtual(&opt_header.value); break;
    default:   field_value = memory->HostToGuestVirtual(header) + opt_header.offset; break;
  }
  break;
}
*out = field_value;

The argument xex_header is a guest VA pointing at the in-guest copy of the raw XEX header bytes (allocated by user_module.cc:223-227's guest_xex_header_ = SystemHeapAlloc(header->header_size); memcpy(...)). The game reaches it via *XexExecutableModuleHandle → hmodule_ptr → *(hmodule + 0x58) = xex_header_base (canary xmodule.h:49).
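To make the three key classes concrete, here is a rough Rust rendering of the lookup over a raw header byte slice. It is a sketch under assumptions, not the xenia-rs implementation: it assumes the XEX2 header stores header_count as a big-endian u32 at byte 0x14, followed at 0x18 by 8-byte (key, value-or-offset) entries, and the names `read_be_u32` and `opt_header_field` are hypothetical.

```rust
/// Assumed byte offset of `header_count` inside the XEX2 header.
const HEADER_COUNT_OFFSET: usize = 0x14;
/// Assumed byte offset of the first optional-header entry.
const OPT_HEADERS_OFFSET: usize = 0x18;

/// Read a big-endian u32, returning None when the slice is too short.
fn read_be_u32(bytes: &[u8], offset: usize) -> Option<u32> {
    let chunk = bytes.get(offset..offset.checked_add(4)?)?;
    Some(u32::from_be_bytes([chunk[0], chunk[1], chunk[2], chunk[3]]))
}

/// Look up `key` in the optional-header table of `header`, where `header_va`
/// is the guest VA at which these bytes live. Returns 0 when the key is absent.
fn opt_header_field(header: &[u8], header_va: u32, key: u32) -> u32 {
    let count = match read_be_u32(header, HEADER_COUNT_OFFSET) {
        Some(c) => c as usize,
        None => return 0,
    };
    for i in 0..count {
        let entry = OPT_HEADERS_OFFSET + i * 8;
        let (entry_key, value) =
            match (read_be_u32(header, entry), read_be_u32(header, entry + 4)) {
                (Some(k), Some(v)) => (k, v),
                _ => return 0, // table runs past the end of the buffer
            };
        if entry_key != key {
            continue;
        }
        return match key & 0xFF {
            // Class 0x00: the entry holds the value itself.
            0x00 => value,
            // Class 0x01: return the guest VA of the entry's value field.
            0x01 => header_va.wrapping_add((entry + 4) as u32),
            // Everything else: the entry holds an offset from the header base.
            _ => header_va.wrapping_add(value),
        };
    }
    0
}
```

With a header placed at 0x3001E000 and an entry (0x00040006, 0x10E8), this returns 0x3001F0E8, which matches the value canary reports at idx=102014.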

Ours

xenia-rs/crates/xenia-kernel/src/exports.rs:2391-2395 (pre-fix):

fn rtl_image_xex_header_field(ctx, _mem, _state) {
    // r3 = xex_header_ptr, r4 = field_id
    // Return 0 for all fields
    ctx.gpr[3] = 0;
}

A complete stub. The entire function body is wrong.

xenia-rs/crates/xenia-app/src/main.rs:1440-1442 (pre-fix):

("xboxkrnl.exe", 0x0193) => {
    // XexExecutableModuleHandle -> image base
    mem.write_u32(addr, base);
}

Writes image_base (e.g. 0x82000000) at the variable slot instead of a guest VA pointing to an X_LDR_DATA_TABLE_ENTRY. The game's CRT derefs *XexExecutableModuleHandle = base, then walks *(base + 0x58) which reads PE OptionalHeader bytes (0x61602063 for this ISO). Game treats that as invalid → falls through to call RtlImageXexHeaderField with r3=NULL regardless of which key it wants to query.
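The two-step dereference can be illustrated with a toy memory model. This is only a sketch: `ToyMem` and `walk_module_handle` are hypothetical names, the memory is a map of already-decoded words rather than real guest memory, and how the game validates the resulting pointer is not modelled.

```rust
use std::collections::HashMap;

/// Offset of the XEX header pointer inside the loader entry
/// (the 0x58 cited above from canary's xmodule.h).
const XEX_HEADER_BASE_OFFSET: u32 = 0x58;

/// Toy guest memory: maps guest VAs to already-decoded u32 words.
struct ToyMem {
    words: HashMap<u32, u32>,
}

impl ToyMem {
    /// Read one word; None models an unmapped address.
    fn read_u32(&self, va: u32) -> Option<u32> {
        self.words.get(&va).copied()
    }
}

/// Follow *handle_slot -> hmodule -> *(hmodule + 0x58) and return whatever
/// word is found there, which the guest would treat as the XEX header base.
fn walk_module_handle(mem: &ToyMem, handle_slot: u32) -> Option<u32> {
    let hmodule = mem.read_u32(handle_slot)?;
    mem.read_u32(hmodule.wrapping_add(XEX_HEADER_BASE_OFFSET))
}
```

If the slot holds 0x82000000 and the word at 0x82000058 is 0x61602063, the walk yields PE header bytes rather than a header pointer, which is the pre-fix situation described above.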

Step 3 — Classification

This is class (B-extreme): not "missing handler for one field key" but "the entire function body is a stub returning 0". The XEX header data IS parsed by ours's loader (xenia-xex/src/header.rs defines Vec<Xex2OptionalHeader>), but never made available to the kernel import handler.

The upstream LDR chain is also wrong: XexExecutableModuleHandle doesn't point to a real LDR_DATA_TABLE_ENTRY. Fixing that, however, turned out to regress the Phase A matched prefix; see below.

Sub-finding: LDR fix shifts boot trajectory

The first fix attempt (initial commit: replace mem.write_u32(addr, base) with a proper X_LDR_DATA_TABLE_ENTRY allocation that pointed to a copy of the XEX header) BROKE the matched-prefix metric:

| approach | tid=6→tid=1 matched |
|---|---|
| pre-fix (C+2 baseline) | 102014 |
| with full LDR setup (first attempt) | 0 (regression) |
| header-bytes-only, KernelState fallback in handler (final) | 102032 (+18 past 102014) |

Reason: ours's CRT entry path examines *XexExecutableModuleHandle. When it's 0x82000000 (image base), the CRT takes the "module not yet queryable" path which makes an early RtlImageXexHeaderField(NULL, key) probe (returning 0 — matches canary). When *XexExecutableModuleHandle is 0x4xxxxxxx (a real LDR allocated by KernelState::heap_alloc), the CRT takes the "module queryable" path and skips the early probe call entirely. The two engines' event sequences then drift starting at idx=0.

Canary's hmodule_ptr is a heap allocation too (via Memory::SystemHeapAlloc, which yields 0x30xxxxxx in canary's virtual heap, whereas ours lands in 0x4xxxxxxx). Both should fall in the same "queryable" address class, yet canary's CRT still makes the early probe. One possible explanation is a timing difference in when *XexExecutableModuleHandle receives the final hmodule_ptr value: canary writes it during LaunchModule, which is called after some PreLaunch initialization; ours writes it during xenia-app's Phase 3 variable-import patcher, which runs before any guest code. This was too deep to chase in this session.

Final approach preserves the pre-fix CRT branch (game still passes ptr=NULL on most calls) by keeping *XexExecutableModuleHandle = base, then routes the handler through a KernelState fallback to recover the correct return value. The handler now returns xex_header_va + 0x10E8 for the EXECUTION_INFO query at idx=102014.

Step 4 — Pick the fix

Three deltas:

  1. KernelState::xex_header_guest_ptr: u32 — record where the guest-memory copy of the raw XEX header lives.
  2. xenia-app::cmd_exec at the XexExecutableModuleHandle patcher: keep *XexExecutableModuleHandle = base (don't disturb the CRT branch), but additionally allocate header.header_size bytes in guest memory and mem.write_bulk(&data[..header_size]) to copy the raw header in. Record the resulting guest VA in kernel.xex_header_guest_ptr.
  3. rtl_image_xex_header_field — implement the lookup mirroring canary's UserModule::GetOptHeader. Fall back to state.xex_header_guest_ptr when the caller passes NULL.
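The NULL fallback in delta 3 can be sketched as follows. This is an illustration under assumptions rather than the landed handler: `KernelState` here is a stand-in carrying only the field from delta 1, and `effective_header_ptr` and `resolve_else_class` are hypothetical helpers that take the table offset as an already-looked-up value.

```rust
/// Minimal stand-in for the kernel state; only the field named in delta 1.
struct KernelState {
    /// Guest VA of the raw XEX header copy, or 0 if none was recorded.
    xex_header_guest_ptr: u32,
}

/// Pick the header VA to use: the caller's pointer when non-NULL,
/// otherwise the copy recorded in the kernel state.
fn effective_header_ptr(caller_ptr: u32, state: &KernelState) -> Option<u32> {
    let ptr = if caller_ptr != 0 {
        caller_ptr
    } else {
        state.xex_header_guest_ptr
    };
    if ptr != 0 {
        Some(ptr)
    } else {
        None
    }
}

/// Resolve an "else"-class key: `table_offset` is the offset found in the
/// optional-header table, or None when the key is absent from the XEX.
fn resolve_else_class(caller_ptr: u32, state: &KernelState, table_offset: Option<u32>) -> u32 {
    match (effective_header_ptr(caller_ptr, state), table_offset) {
        (Some(base), Some(offset)) => base.wrapping_add(offset),
        _ => 0, // no usable header pointer, or key not present
    }
}
```

Keeping the fallback inside the handler leaves *XexExecutableModuleHandle untouched, so the guest's CRT branch is unchanged while the NULL-pointer query still resolves.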

Plus a Python-side canonicalization addition:

  1. diff_events.py — add RtlImageXexHeaderField to ALLOCATOR_RETURN_FNS. The return value for "else"-class keys is a guest VA inside the in-guest XEX header copy, which is host-allocator-dependent (0x30xxxxxx in canary, 0x4xxxxxxx in ours). Per-(tid, name) ordinal sentinels mask the VA divergence — same pattern as Phase C+2's allocator canonicalization.
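The sentinel idea can be sketched as below. The real tool is the Python script diff_events.py, whose internals are not shown here; this Rust rendering is only an illustration, and the `Canonicalizer` type, the label format, and the choice to leave NULL returns unmasked are all assumptions of the sketch. It also assumes thread IDs have already been mapped to a common numbering.

```rust
use std::collections::HashMap;

/// Functions whose return value is an allocator-dependent guest VA.
const ALLOCATOR_RETURN_FNS: &[&str] = &["RtlImageXexHeaderField"];

/// Tracks how many times each (tid, function) pair has been seen.
struct Canonicalizer {
    ordinals: HashMap<(u32, String), u32>,
}

impl Canonicalizer {
    fn new() -> Self {
        Canonicalizer { ordinals: HashMap::new() }
    }

    /// Return a comparable form of `return_value`: listed functions get a
    /// per-(tid, name) ordinal label instead of the raw VA; others stay as hex.
    fn canonical_return(&mut self, tid: u32, name: &str, return_value: u64) -> String {
        if !ALLOCATOR_RETURN_FNS.contains(&name) {
            return format!("{:#x}", return_value);
        }
        let counter = self.ordinals.entry((tid, name.to_string())).or_insert(0);
        let ordinal = *counter;
        *counter += 1;
        if return_value == 0 {
            // Assumption of this sketch: keep NULL visible so that a
            // NULL-versus-pointer divergence is not masked.
            return "0x0".to_string();
        }
        format!("<{}:tid{}#{}>", name, tid, ordinal)
    }
}
```

Running one `Canonicalizer` per engine, a 0x30xxxxxx return in canary and a 0x4xxxxxxx return in ours for the same call ordinal produce identical labels, while other functions still compare by raw value.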

Total: ~80 LOC, 4 files.

Cross-validation

  • Pre-fix eprintln trace confirms xex_header_ptr=0 for both ours calls; field keys are 0x00020401 (not in XEX → returns 0) and 0x00040006 (in XEX, "else" class → returns header_base + 0x10E8).
  • Canary's idx=102014 return value 0x3001F0E8 = 0x3001E000 + 0x10E8 confirms canary's guest_xex_header_ is at 0x3001E000 and key 0x00040006's offset entry is 0x10E8.
  • ours's xenia-rs info against the ISO confirms key 0x00040006 is present with value 0x000010E8.

All three independent evidence sources converge on the same field semantics.