# Phase C+3 — investigation: `RtlImageXexHeaderField` at idx=102014

## Divergence

| | canary | ours (pre-fix) |
|---|---|---|
| `payload.return_value` (idx=102014, tid=6→1) | `805433576` = `0x3001F0E8` | `0` |
| `payload.status` | `0x3001f0e8` | `0x00000000` |
| Surrounding context (idx 102009..102013): `RtlLeaveCriticalSection` → `RtlImageXexHeaderField` | | |
| Game thread | tid=6 main | tid=1 main |
| Next event (idx=102015) | `NtCreateFile` | `NtCreateFile` (matches) |

`0x3001F0E8` lies in canary's virtual-heap region (`0x30xxxxxx`, the
`Memory::SystemHeapAlloc` band), so the value is a guest VA pointing
inside canary's in-guest XEX header copy (allocated in
`user_module.cc:224` as `guest_xex_header_`). Ours returned 0 because
its stub `rtl_image_xex_header_field` (`exports.rs:2391-2395`) returned
0 unconditionally.

## Step 1 — Event context at idx=102014

From canary's existing Phase A capture
(`xenia-rs/audit-runs/phase-c-first-divergence/phase-a/canary.jsonl`),
canary's tid=6 makes only **two** `RtlImageXexHeaderField` calls in the
matched prefix:

| event idx | event kind | payload |
|---|---|---|
| 0 | import.call | `RtlImageXexHeaderField` |
| 1 | kernel.call | `RtlImageXexHeaderField` (`args:{}`; schema-v1 doesn't capture args) |
| 2 | kernel.return | `return_value=0 status=0x00000000` |
| 102012 | import.call | `RtlImageXexHeaderField` |
| 102013 | kernel.call | `RtlImageXexHeaderField` |
| 102014 | kernel.return | `return_value=805433576 status=0x3001f0e8` |

Pre-fix, ours makes the same call sequence (verified by the capture in
`phase-c1-keQuerySystemTime/ours.jsonl`), and both
`RtlImageXexHeaderField` calls returned 0.

Schema-v1 records empty `args:{}`, so `field_key` (r4) and `xex_header_ptr`
(r3) aren't directly readable from the JSONL. A one-shot `eprintln` in
ours's stub showed the arguments of both calls:

* call #1: `xex_header_ptr=0x00000000 field_key=0x00020401` (`DEFAULT_HEAP_SIZE`; not present in this XEX, so even with a valid header pointer the result would be 0)
* call #2: `xex_header_ptr=0x00000000 field_key=0x00040006` (`EXECUTION_INFO`; low byte `0x06`, "else" class, returns `header_base + offset`, with offset `0x10E8`)

Running `xenia-rs/target/release/xenia-rs info` against the ISO confirms
the contents of the in-XEX optional-header table: key `0x00040006` is
present with value `0x000010E8`; key `0x00020401` is not present. So
canary's `0x3001F0E8` = `0x3001E000 + 0x10E8`, which places canary's
`guest_xex_header_` at `0x3001E000`. The game queries `EXECUTION_INFO`
and uses the returned VA to read media_id / title_id / disc_number /
disc_count.

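The key classes and the base-address arithmetic used in this step can be double-checked with a few lines of Rust. This is only a sketch: `KeyClass`, `classify_key`, and `header_base_from_return` are hypothetical names, not identifiers from either engine; the constants are the ones quoted in this step.

```rust
/// How the low byte of an optional-header key selects the lookup result
/// (the three classes distinguished by canary's `GetOptHeader`).
#[derive(Debug, PartialEq, Eq)]
enum KeyClass {
    /// Low byte 0x00: the entry's value is returned directly.
    InlineValue,
    /// Low byte 0x01: the guest address of the entry's value field is returned.
    PointerToValue,
    /// Any other low byte ("else" class): `header_base + offset` is returned.
    OffsetIntoHeader,
}

/// Classify a field key by its low byte.
fn classify_key(key: u32) -> KeyClass {
    match key & 0xFF {
        0x00 => KeyClass::InlineValue,
        0x01 => KeyClass::PointerToValue,
        _ => KeyClass::OffsetIntoHeader,
    }
}

/// Recover the header base from an "else"-class return value and its offset.
fn header_base_from_return(return_value: u32, offset: u32) -> u32 {
    return_value.wrapping_sub(offset)
}
```

With these definitions, `classify_key(0x0004_0006)` is `KeyClass::OffsetIntoHeader`, and `header_base_from_return(0x3001_F0E8, 0x10E8)` gives `0x3001_E000`.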
## Step 2 — Source-read both engines

### Canary

`xenia-canary/src/xenia/kernel/xboxkrnl/xboxkrnl_rtl.cc:501-515`:

```cpp
pointer_result_t RtlImageXexHeaderField_entry(pointer_t<xex2_header> xex_header,
                                              dword_t field_dword) {
  uint32_t field_value = 0;
  uint32_t field = field_dword;
  if (!xex_header) {
    return field_value;
  }
  UserModule::GetOptHeader(kernel_memory(), xex_header, xex2_header_keys(field),
                           &field_value);
  return field_value;
}
```

`UserModule::GetOptHeader` (`user_module.cc:335-369`):

```cpp
for (uint32_t i = 0; i < header->header_count; i++) {
  auto& opt_header = header->headers[i];
  if (opt_header.key != key) continue;
  switch (opt_header.key & 0xFF) {
    case 0x00: field_value = opt_header.value; break;
    case 0x01: field_value = memory->HostToGuestVirtual(&opt_header.value); break;
    default: field_value = memory->HostToGuestVirtual(header) + opt_header.offset; break;
  }
  break;
}
*out = field_value;
```

The argument `xex_header` is a guest VA pointing at the in-guest copy of
the raw XEX header bytes (allocated by `user_module.cc:223-227`'s
`guest_xex_header_ = SystemHeapAlloc(header->header_size); memcpy(...)`).
The game reaches it via `*XexExecutableModuleHandle → hmodule_ptr →
*(hmodule + 0x58) = xex_header_base` (canary `xmodule.h:49`).

### Ours

`xenia-rs/crates/xenia-kernel/src/exports.rs:2391-2395` (pre-fix, parameter types elided):

```rust
fn rtl_image_xex_header_field(ctx, _mem, _state) {
    // r3 = xex_header_ptr, r4 = field_id
    // Return 0 for all fields
    ctx.gpr[3] = 0;
}
```

A complete stub. The entire function body is wrong.

`xenia-rs/crates/xenia-app/src/main.rs:1440-1442` (pre-fix):

```rust
("xboxkrnl.exe", 0x0193) => {
    // XexExecutableModuleHandle -> image base
    mem.write_u32(addr, base);
}
```

This writes `image_base` (e.g. `0x82000000`) into the variable slot
instead of a guest VA pointing to an `X_LDR_DATA_TABLE_ENTRY`. The game's
CRT dereferences `*XexExecutableModuleHandle` (getting `base`), then reads
`*(base + 0x58)`, which lands on PE OptionalHeader bytes (`0x61602063`
for this ISO). The game treats that value as invalid and falls through to
calling `RtlImageXexHeaderField` with `r3=NULL`, regardless of which key
it wants to query.

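To make the broken chain concrete, here is a toy model of the two-step dereference. It is only an illustrative sketch: `ToyMem`, `header_via_module_handle`, `prefix_layout`, and `ldr_layout` are hypothetical, and the slot and entry addresses are invented. Only the image base `0x82000000`, the offset `0x58`, and the observed word `0x61602063` come from the text.

```rust
use std::collections::HashMap;

/// Toy guest memory: a sparse map from guest address to a 32-bit word.
/// Unmapped addresses read as 0.
struct ToyMem(HashMap<u32, u32>);

impl ToyMem {
    fn read_u32(&self, addr: u32) -> u32 {
        *self.0.get(&addr).unwrap_or(&0)
    }
}

/// Offset of the XEX header pointer inside the module entry.
const XEX_HEADER_BASE_OFFSET: u32 = 0x58;

/// Follow `*handle_slot -> module entry -> *(entry + 0x58)`.
fn header_via_module_handle(mem: &ToyMem, handle_slot: u32) -> u32 {
    let entry = mem.read_u32(handle_slot);
    mem.read_u32(entry.wrapping_add(XEX_HEADER_BASE_OFFSET))
}

/// Pre-fix layout: the slot holds the image base, so `base + 0x58`
/// reads whatever PE bytes happen to live there.
fn prefix_layout(slot: u32) -> ToyMem {
    let mut m = HashMap::new();
    m.insert(slot, 0x8200_0000);
    m.insert(0x8200_0058, 0x6160_2063);
    ToyMem(m)
}

/// Canary-style layout: the slot holds a module entry whose +0x58 field
/// points at the in-guest header copy.
fn ldr_layout(slot: u32, entry: u32, header_va: u32) -> ToyMem {
    let mut m = HashMap::new();
    m.insert(slot, entry);
    m.insert(entry.wrapping_add(XEX_HEADER_BASE_OFFSET), header_va);
    ToyMem(m)
}
```

In the pre-fix layout the walk yields `0x61602063`, which is not a plausible header pointer; in the canary-style layout it yields the header VA that was stored in the entry.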
## Step 3 — Classification

This is **class (B-extreme)**: not "missing handler for one field key"
but "the entire function body is a stub returning 0". The XEX header
data IS parsed by ours's loader (`xenia-xex/src/header.rs` defines
`Vec<Xex2OptionalHeader>`), but it is never made available to the kernel
import handler.

The upstream LDR chain is also wrong: `XexExecutableModuleHandle`
doesn't point to a real `X_LDR_DATA_TABLE_ENTRY`. Fixing that, however,
turned out to regress Phase A (see the sub-finding below).

### Sub-finding: LDR fix shifts boot trajectory

The first fix attempt (initial commit: replacing `mem.write_u32(addr, base)`
with a proper `X_LDR_DATA_TABLE_ENTRY` allocation that pointed to a
copy of the XEX header) **broke** the matched-prefix metric:

| approach | tid=6→tid=1 matched |
|---|---|
| pre-fix (C+2 baseline) | 102014 |
| with full LDR setup (first attempt) | **0** (regression) |
| header-bytes-only, KernelState fallback in handler (final) | **102032** (+18 past 102014) |

Reason: under ours, the game's CRT entry path examines
`*XexExecutableModuleHandle`. When it's `0x82000000` (image base), the
CRT takes the "module not yet queryable" path, which makes an early
`RtlImageXexHeaderField(NULL, key)` probe (returning 0, which matches
canary). When `*XexExecutableModuleHandle` is `0x4xxxxxxx` (a real LDR
entry allocated by `KernelState::heap_alloc`), the CRT takes the "module
queryable" path and skips the early probe call entirely. The two
engines' event sequences then drift starting at idx=0.

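The matched-prefix metric in the table can be read as the length of the longest common prefix of the two event streams. The sketch below assumes events have already been reduced to comparable values; the `Event` struct, its fields, and the `ev` helper are hypothetical and do not reflect the actual JSONL schema or the real diff tool.

```rust
/// A simplified, already-canonicalized trace event (hypothetical shape).
#[derive(Debug, Clone, PartialEq, Eq)]
struct Event {
    kind: String,    // e.g. "kernel.call"
    name: String,    // e.g. "RtlImageXexHeaderField"
    payload: String, // canonicalized payload text
}

/// Convenience constructor for examples.
fn ev(kind: &str, name: &str, payload: &str) -> Event {
    Event {
        kind: kind.to_string(),
        name: name.to_string(),
        payload: payload.to_string(),
    }
}

/// Number of leading events that are identical in both streams.
/// With zero-based indices this is also the index of the first divergence.
fn matched_prefix_len(a: &[Event], b: &[Event]) -> usize {
    a.iter().zip(b.iter()).take_while(|(x, y)| x == y).count()
}
```

Under this reading, a change that removes or adds a single event at the very start of one stream drops the metric to 0 even if everything after it is otherwise identical, which is the shape of the regression in the second table row.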
Canary's `hmodule_ptr` is also a heap allocation (`Memory::SystemHeapAlloc`
places it in the `0x30xxxxxx` virtual heap; ours lands in `0x4xxxxxxx`).
Either way it should fall in the same "queryable" address class, yet
canary's CRT still makes the early probe. This is possibly due to timing
differences in when `*XexExecutableModuleHandle` gets the final
`hmodule_ptr` value: canary writes it during `LaunchModule`, which is
called after some PreLaunch initialization, whereas ours writes it during
xenia-app's Phase 3 variable-import patcher, which runs before any
guest code. This is too deep to chase in this session.

The final approach **preserves** the pre-fix CRT branch (the game still
passes ptr=NULL on most calls) by keeping `*XexExecutableModuleHandle = base`,
then routes the handler through a KernelState fallback to recover the
correct return value. The handler now returns `xex_header_va + 0x10E8`
for the `EXECUTION_INFO` query at idx=102014.

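A rough sketch of the fallback-plus-lookup logic follows. It is not the code in `exports.rs`: guest memory is modeled as a byte slice holding just the header copy, and `lookup_opt_header`, `image_xex_header_field`, and `fake_header` are hypothetical names. The layout constants (big-endian fields, `header_count` at offset `0x14`, 8-byte `{key, value}` entries starting at `0x18`) are an assumption about the XEX2 header format and should be checked against `xenia-xex/src/header.rs`.

```rust
/// Assumed XEX2 layout (big-endian): header_count at 0x14, then
/// 8-byte {key, value_or_offset} entries starting at 0x18.
const HEADER_COUNT_OFFSET: usize = 0x14;
const ENTRIES_OFFSET: usize = 0x18;

fn read_be_u32(bytes: &[u8], off: usize) -> Option<u32> {
    let b = bytes.get(off..off.checked_add(4)?)?;
    Some(u32::from_be_bytes([b[0], b[1], b[2], b[3]]))
}

/// Mirrors the three cases of canary's `GetOptHeader`.
/// `header` is the raw header copy and `header_va` its guest address.
/// Returns 0 when the key is absent or the copy is truncated.
fn lookup_opt_header(header: &[u8], header_va: u32, key: u32) -> u32 {
    let count = read_be_u32(header, HEADER_COUNT_OFFSET).unwrap_or(0) as usize;
    for i in 0..count {
        let entry = ENTRIES_OFFSET + i * 8;
        let (k, v) = match (read_be_u32(header, entry), read_be_u32(header, entry + 4)) {
            (Some(k), Some(v)) => (k, v),
            _ => return 0, // truncated header copy
        };
        if k != key {
            continue;
        }
        return match key & 0xFF {
            0x00 => v,                                         // inline value
            0x01 => header_va.wrapping_add(entry as u32 + 4), // address of the value field
            _ => header_va.wrapping_add(v),                    // header base + offset
        };
    }
    0
}

/// Handler core: use the caller's pointer if non-NULL, else the recorded VA.
/// Simplification: a non-NULL caller pointer is assumed to refer to the
/// same header bytes as `header`.
fn image_xex_header_field(header: &[u8], recorded_va: u32, caller_ptr: u32, key: u32) -> u32 {
    let va = if caller_ptr != 0 { caller_ptr } else { recorded_va };
    if va == 0 {
        return 0;
    }
    lookup_opt_header(header, va, key)
}

/// Build a minimal fake header (zeroed except count and entries) for experiments.
fn fake_header(entries: &[(u32, u32)]) -> Vec<u8> {
    let mut h = vec![0u8; ENTRIES_OFFSET];
    h[HEADER_COUNT_OFFSET..HEADER_COUNT_OFFSET + 4]
        .copy_from_slice(&(entries.len() as u32).to_be_bytes());
    for &(k, v) in entries {
        h.extend_from_slice(&k.to_be_bytes());
        h.extend_from_slice(&v.to_be_bytes());
    }
    h
}
```

For a fake header containing `(0x00040006, 0x10E8)` and a recorded VA of `0x40000000` (an arbitrary example address), a NULL-pointer query for `0x00040006` returns `0x400010E8`, while a query for the absent key `0x00020401` returns 0.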
## Step 4 — Pick the fix

Three deltas:

1. **`KernelState::xex_header_guest_ptr: u32`**: records where the
   guest-memory copy of the raw XEX header lives.
2. **`xenia-app::cmd_exec`** at the `XexExecutableModuleHandle` patcher:
   keep `*XexExecutableModuleHandle = base` (don't disturb the CRT
   branch), but additionally allocate `header.header_size` bytes in
   guest memory and `mem.write_bulk(&data[..header_size])` to copy the
   raw header in. Record the resulting guest VA in
   `kernel.xex_header_guest_ptr`.
3. **`rtl_image_xex_header_field`**: implement the lookup, mirroring
   canary's `UserModule::GetOptHeader`. Fall back to
   `state.xex_header_guest_ptr` when the caller passes NULL.

Plus a Python-side canonicalization addition:

4. **`diff_events.py`**: add `RtlImageXexHeaderField` to
   `ALLOCATOR_RETURN_FNS`. The return value for "else"-class keys is a
   guest VA inside the in-guest XEX header copy, which is
   host-allocator-dependent (`0x30xxxxxx` in canary,
   `0x4xxxxxxx` in ours). Per-(tid, name) ordinal sentinels mask the
   VA divergence, the same pattern as Phase C+2's allocator
   canonicalization.

Total: ~80 LOC, 4 files.

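The ordinal-sentinel idea behind the `diff_events.py` change can be illustrated as follows. The real change lives in the Python tool; this Rust sketch only shows the concept. The `Canonicalizer` type, the sentinel string format, and the choice to leave zero returns unmasked are all assumptions made for illustration, not the tool's actual behavior.

```rust
use std::collections::HashMap;

/// Functions whose return values are allocator-dependent guest VAs.
/// Only `RtlImageXexHeaderField` is taken from the text.
const ALLOCATOR_RETURN_FNS: &[&str] = &["RtlImageXexHeaderField"];

/// Tracks, per (tid, function name), how many returns have been seen so far.
#[derive(Default)]
struct Canonicalizer {
    next_ordinal: HashMap<(u32, String), u32>,
}

impl Canonicalizer {
    /// Replace an allocator-dependent return value with an ordinal sentinel.
    /// The sentinel omits the tid so that traces whose thread ids differ
    /// still compare equal.
    fn canonical_return(&mut self, tid: u32, name: &str, return_value: u32) -> String {
        if !ALLOCATOR_RETURN_FNS.contains(&name) {
            return format!("{:#010x}", return_value);
        }
        let ordinal = self.next_ordinal.entry((tid, name.to_string())).or_insert(0);
        let n = *ordinal;
        *ordinal += 1;
        if return_value == 0 {
            // Assumption: NULL results stay visible so a real 0-vs-pointer
            // divergence is not hidden by the masking.
            "0x00000000".to_string()
        } else {
            format!("<{}#{}>", name, n)
        }
    }
}
```

Each trace gets its own `Canonicalizer`; after masking, canary's `0x3001F0E8` and a `0x4xxxxxxx` value from ours at the same call ordinal both become the same sentinel string.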
## Cross-validation

* Pre-fix `eprintln` trace confirms `xex_header_ptr=0` for both calls
  in ours; field keys are `0x00020401` (not in XEX → returns 0) and
  `0x00040006` (in XEX, "else" class → returns `header_base + 0x10E8`).
* Canary's idx=102014 return value `0x3001F0E8 = 0x3001E000 + 0x10E8`
  confirms canary's `guest_xex_header_` is at `0x3001E000` and key
  `0x00040006`'s offset entry is `0x10E8`.
* Ours's `xenia-rs info` against the ISO confirms key `0x00040006`
  is present with value `0x000010E8`.

All three independent evidence sources converge on the same field
semantics.