handoff: VSync/event-wedge fixes + iterate 2.A–2.BC research notes

Source changes (dormant parity infra, retained from iterate 2.AI/2.AO):
- xenia-kernel/exports.rs: nt_create_event manual_reset polarity +
  related event wiring
- xenia-gpu/mmio_region.rs: D1MODE_VBLANK_VLINE_STATUS hardcode parity

Also lands the audit-runs/ analysis notes (.md/.txt/.json digests) for the
iterate 2.x VSync/0x10e8/0x1004 wedge investigation. Raw trace dumps
(.jsonl/.gz/.csv/.stdout) and agent worktrees (.claude/) are gitignored as
regenerable local artifacts — see memory + HANDOFF for the running findings.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
MechaCat02
2026-06-05 07:19:08 +02:00
parent acd1656753
commit ef93a4fa14
620 changed files with 108303 additions and 1 deletions


@@ -0,0 +1,189 @@
# Phase A diff report
**This report is the output of Phase A's diff harness. Divergences
shown here are INPUT for Phase B (first-divergence localization),
not findings of Phase A.** Phase A's job is to make the harness
itself correct, not to analyze what it surfaces.
## Summary
| canary_tid | ours_tid | matched | canary_total | ours_total | first_divergence_at |
|---|---|---|---|---|---|
| 4 | 11 | 5 | 47573 | 9 | 5 |
| 6 | 1 | 102032 | 329948 | 108492 | 102032 |
| 7 | 2 | 2 | 29 | 33 | 2 |
| 12 | 7 | 2 | 6689 | 3 | 2 |
| 14 | 9 | 11 | 1371603 | 75 | 11 |
| 15 | 10 | 15 | 863209 | 15 | — |
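
The `matched` and `first_divergence_at` columns can be read as the result of a longest-common-prefix comparison over two per-thread event streams. The sketch below illustrates that idea only. It assumes events are compared on `kind` and `payload`, and `matched_prefix` is a hypothetical name, not the harness's actual function; the real tool may compare more fields or canonicalize values first.

```python
def matched_prefix(canary_events, ours_events, keys=("kind", "payload")):
    """Return (matched, first_divergence_at) for two per-thread event lists.

    first_divergence_at is None when the shorter stream is a clean prefix
    of the longer one (the "no divergence" case in the summary table).
    """
    matched = 0
    for canary_ev, ours_ev in zip(canary_events, ours_events):
        # Compare only the selected fields; host-specific fields are ignored.
        if any(canary_ev.get(k) != ours_ev.get(k) for k in keys):
            return matched, matched
        matched += 1
    return matched, None


# Toy streams shaped like the schema-v1 events in this report.
canary = [
    {"kind": "import.call", "payload": {"name": "KeSetEvent"}},
    {"kind": "kernel.call", "payload": {"name": "KeSetEvent"}},
    {"kind": "kernel.return", "payload": {"name": "KeSetEvent", "return_value": 1}},
]
ours = [
    {"kind": "import.call", "payload": {"name": "KeSetEvent"}},
    {"kind": "kernel.call", "payload": {"name": "KeSetEvent"}},
    {"kind": "kernel.return", "payload": {"name": "KeSetEvent", "return_value": 0}},
]
result = matched_prefix(canary, ours)
```

When one stream simply ends early, as in the last row of the table, the function reports the number of compared events and no divergence index.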
## canary_tid=4 → ours_tid=11
First divergence at `tid_event_idx=5`: payload.return_value: canary=1 ours=0
**Pre-context (last 5 matching events):**
```
canary: [0] import.call RtlEnterCriticalSection
ours: [0] import.call RtlEnterCriticalSection
canary: [1] kernel.call RtlEnterCriticalSection
ours: [1] kernel.call RtlEnterCriticalSection
canary: [2] kernel.return RtlEnterCriticalSection
ours: [2] kernel.return RtlEnterCriticalSection
canary: [3] import.call KeSetEvent
ours: [3] import.call KeSetEvent
canary: [4] kernel.call KeSetEvent
ours: [4] kernel.call KeSetEvent
```
**Divergent event:**
```
canary: [5] kernel.return KeSetEvent
ours: [5] kernel.return KeSetEvent
```
**Next event after the divergence (if any):**
```
canary: [6] import.call KeWaitForMultipleObjects
ours: [6] import.call KeWaitForMultipleObjects
```
**Raw events (JSON):**
```json
{"deterministic": true, "engine": "canary", "guest_cycle": 0, "host_ns": 1080594600, "kind": "kernel.return", "payload": {"name": "KeSetEvent", "return_value": 1, "side_effects": [], "status": "0x00000001"}, "schema_version": 1, "tid": 4, "tid_event_idx": 5}
{"deterministic": true, "engine": "ours", "guest_cycle": 33, "host_ns": 1630359955, "kind": "kernel.return", "payload": {"name": "KeSetEvent", "return_value": 0, "side_effects": [], "status": "0x00000000"}, "schema_version": 1, "tid": 11, "tid_event_idx": 5}
```
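
The one-line divergence summary for this thread pair can be reproduced from the two raw events by listing the payload fields that differ. This is a sketch under the assumption that only `payload` is compared and that `host_ns`, `guest_cycle`, `tid` and `engine` are treated as engine-specific; `payload_diffs` is a hypothetical helper, not part of the harness.

```python
import json


def payload_diffs(canary_line, ours_line):
    """List differing payload fields of two JSONL events, in key order."""
    canary_payload = json.loads(canary_line)["payload"]
    ours_payload = json.loads(ours_line)["payload"]
    diffs = []
    for key in sorted(set(canary_payload) | set(ours_payload)):
        c_val, o_val = canary_payload.get(key), ours_payload.get(key)
        if c_val != o_val:
            diffs.append(f"payload.{key}: canary={c_val} ours={o_val}")
    return diffs


# The two raw events shown above, copied verbatim.
canary_line = '{"deterministic": true, "engine": "canary", "guest_cycle": 0, "host_ns": 1080594600, "kind": "kernel.return", "payload": {"name": "KeSetEvent", "return_value": 1, "side_effects": [], "status": "0x00000001"}, "schema_version": 1, "tid": 4, "tid_event_idx": 5}'
ours_line = '{"deterministic": true, "engine": "ours", "guest_cycle": 33, "host_ns": 1630359955, "kind": "kernel.return", "payload": {"name": "KeSetEvent", "return_value": 0, "side_effects": [], "status": "0x00000000"}, "schema_version": 1, "tid": 11, "tid_event_idx": 5}'
diffs = payload_diffs(canary_line, ours_line)
```

Both `return_value` and `status` differ here; the report prints only the first differing field.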
## canary_tid=6 → ours_tid=1
First divergence at `tid_event_idx=102032`: payload.return_value: canary=1 ours=0
**Pre-context (last 5 matching events):**
```
canary: [102027] import.call XeCryptSha
ours: [102027] import.call XeCryptSha
canary: [102028] kernel.call XeCryptSha
ours: [102028] kernel.call XeCryptSha
canary: [102029] kernel.return XeCryptSha
ours: [102029] kernel.return XeCryptSha
canary: [102030] import.call XeKeysConsolePrivateKeySign
ours: [102030] import.call XeKeysConsolePrivateKeySign
canary: [102031] kernel.call XeKeysConsolePrivateKeySign
ours: [102031] kernel.call XeKeysConsolePrivateKeySign
```
**Divergent event:**
```
canary: [102032] kernel.return XeKeysConsolePrivateKeySign
ours: [102032] kernel.return XeKeysConsolePrivateKeySign
```
**Next event after the divergence (if any):**
```
canary: [102033] import.call NtReadFile
ours: [102033] import.call NtReadFile
```
**Raw events (JSON):**
```json
{"deterministic": true, "engine": "canary", "guest_cycle": 0, "host_ns": 716092100, "kind": "kernel.return", "payload": {"name": "XeKeysConsolePrivateKeySign", "return_value": 1, "side_effects": [], "status": "0x00000001"}, "schema_version": 1, "tid": 6, "tid_event_idx": 102032}
{"deterministic": true, "engine": "ours", "guest_cycle": 5364046, "host_ns": 467464101, "kind": "kernel.return", "payload": {"name": "XeKeysConsolePrivateKeySign", "return_value": 0, "side_effects": [], "status": "0x00000000"}, "schema_version": 1, "tid": 1, "tid_event_idx": 102032}
```
## canary_tid=7 → ours_tid=2
First divergence at `tid_event_idx=2`: payload.return_value: canary=0 ours=1896873464
**Pre-context (last 5 matching events):**
```
canary: [0] import.call RtlInitAnsiString
ours: [0] import.call RtlInitAnsiString
canary: [1] kernel.call RtlInitAnsiString
ours: [1] kernel.call RtlInitAnsiString
```
**Divergent event:**
```
canary: [2] kernel.return RtlInitAnsiString
ours: [2] kernel.return RtlInitAnsiString
```
**Next event after the divergence (if any):**
```
canary: [3] import.call NtCreateFile
ours: [3] import.call NtCreateFile
```
**Raw events (JSON):**
```json
{"deterministic": true, "engine": "canary", "guest_cycle": 0, "host_ns": 728945300, "kind": "kernel.return", "payload": {"name": "RtlInitAnsiString", "return_value": 0, "side_effects": [], "status": "0x00000000"}, "schema_version": 1, "tid": 7, "tid_event_idx": 2}
{"deterministic": true, "engine": "ours", "guest_cycle": 2475, "host_ns": 468355955, "kind": "kernel.return", "payload": {"name": "RtlInitAnsiString", "return_value": 1896873464, "side_effects": [], "status": "0x710ffdf8"}, "schema_version": 1, "tid": 2, "tid_event_idx": 2}
```
## canary_tid=12 → ours_tid=7
First divergence at `tid_event_idx=2`: payload.return_value: canary=258 ours=0
**Pre-context (last 5 matching events):**
```
canary: [0] import.call KeWaitForSingleObject
ours: [0] import.call KeWaitForSingleObject
canary: [1] kernel.call KeWaitForSingleObject
ours: [1] kernel.call KeWaitForSingleObject
```
**Divergent event:**
```
canary: [2] kernel.return KeWaitForSingleObject
ours: [2] kernel.return KeWaitForSingleObject
```
**Next event after the divergence (if any):**
```
canary: [3] import.call RtlEnterCriticalSection
ours: <end of stream>
```
**Raw events (JSON):**
```json
{"deterministic": true, "engine": "canary", "guest_cycle": 0, "host_ns": 904485700, "kind": "kernel.return", "payload": {"name": "KeWaitForSingleObject", "return_value": 258, "side_effects": [], "status": "0x00000102"}, "schema_version": 1, "tid": 12, "tid_event_idx": 2}
{"deterministic": true, "engine": "ours", "guest_cycle": 30, "host_ns": 495151234, "kind": "kernel.return", "payload": {"name": "KeWaitForSingleObject", "return_value": 0, "side_effects": [], "status": "0x00000000"}, "schema_version": 1, "tid": 7, "tid_event_idx": 2}
```
## canary_tid=14 → ours_tid=9
First divergence at `tid_event_idx=11`: payload.return_value: canary=2 ours=0
**Pre-context (last 5 matching events):**
```
canary: [6] import.call KeAcquireSpinLockAtRaisedIrql
ours: [6] import.call KeAcquireSpinLockAtRaisedIrql
canary: [7] kernel.call KeAcquireSpinLockAtRaisedIrql
ours: [7] kernel.call KeAcquireSpinLockAtRaisedIrql
canary: [8] kernel.return KeAcquireSpinLockAtRaisedIrql
ours: [8] kernel.return KeAcquireSpinLockAtRaisedIrql
canary: [9] import.call KeRaiseIrqlToDpcLevel
ours: [9] import.call KeRaiseIrqlToDpcLevel
canary: [10] kernel.call KeRaiseIrqlToDpcLevel
ours: [10] kernel.call KeRaiseIrqlToDpcLevel
```
**Divergent event:**
```
canary: [11] kernel.return KeRaiseIrqlToDpcLevel
ours: [11] kernel.return KeRaiseIrqlToDpcLevel
```
**Next event after the divergence (if any):**
```
canary: [12] import.call KeRaiseIrqlToDpcLevel
ours: [12] import.call KeRaiseIrqlToDpcLevel
```
**Raw events (JSON):**
```json
{"deterministic": true, "engine": "canary", "guest_cycle": 0, "host_ns": 1081453000, "kind": "kernel.return", "payload": {"name": "KeRaiseIrqlToDpcLevel", "return_value": 2, "side_effects": [], "status": "0x00000002"}, "schema_version": 1, "tid": 14, "tid_event_idx": 11}
{"deterministic": true, "engine": "ours", "guest_cycle": 77, "host_ns": 1630401706, "kind": "kernel.return", "payload": {"name": "KeRaiseIrqlToDpcLevel", "return_value": 0, "side_effects": [], "status": "0x00000000"}, "schema_version": 1, "tid": 9, "tid_event_idx": 11}
```
## canary_tid=15 → ours_tid=10
No divergence within the 15 compared events (canary has 863209, ours has 15).


@@ -0,0 +1,10 @@
{
"instructions": 50000006,
"imports": 40454,
"unimpl": 0,
"draws": 0,
"swaps": 1,
"unique_render_targets": 0,
"shader_blobs_live": 0,
"texture_cache_entries": 0
}


@@ -0,0 +1,151 @@
## Phase C+3 fix — RtlImageXexHeaderField
Four files changed (three engine source files plus the diff tool); ~80 LOC net.
### `xenia-rs/crates/xenia-kernel/src/state.rs` (+13 LOC)
Add `xex_header_guest_ptr: u32` field to `KernelState`. Initialized to 0;
populated once at startup by `xenia-app` after copying raw XEX header
bytes into guest memory.
```diff
@@ -103,6 +103,17 @@
/// Image base of the loaded XEX (for XexExecutableModuleHandle etc.)
pub image_base: u32,
+ /// Guest VA of the raw XEX header bytes copied into guest memory at
+ /// startup (mirrors canary's `UserModule::guest_xex_header_`,
+ /// allocated in `user_module.cc:224`). Used by `RtlImageXexHeaderField`
+ /// to compute return values that are offsets into the in-guest header
+ /// copy (canary's `xboxkrnl_rtl.cc:501-514` calls `UserModule::Get
+ /// OptHeader(memory, header, key, &field_value)` which iterates
+ /// `header->headers[]` and returns `HostToGuestVirtual(header) +
+ /// opt_header.offset` for "else"-class keys, key low byte != 0/1). Zero
+ /// when the executable hasn't been installed yet. Set once by
+ /// `xenia-app` after `mem.write_bulk(base, &image_data)`.
+ pub xex_header_guest_ptr: u32,
/// `XEX_HEADER_SYSTEM_FLAGS` (key `0x00030000`) parsed from the loaded
/// XEX header. ...
@@ -330,6 +331,7 @@
image_base: 0,
+ xex_header_guest_ptr: 0,
xex_system_flags: 0,
```
### `xenia-rs/crates/xenia-kernel/src/exports.rs` (~50 LOC)
Replace stub `rtl_image_xex_header_field` (always returned 0) with a
proper implementation mirroring canary's `UserModule::GetOptHeader`
(`user_module.cc:335-369`). Walks the in-guest XEX header byte array
to find the matching key entry and returns the appropriate value per
the key's low-byte class (0x00 inline, 0x01 ptr-to-value, else
header-base+offset). Falls back to `state.xex_header_guest_ptr` when
the caller passes a NULL `xex_header` arg (the common ours-side case
because ours's `*XexExecutableModuleHandle = image_base` doesn't
resolve through a proper LDR_DATA_TABLE_ENTRY — see investigation.md
for why fixing that breaks Phase A alignment).
```diff
-fn rtl_image_xex_header_field(ctx: &mut PpcContext, _mem: &GuestMemory, _state: &mut KernelState) {
- // r3 = xex_header_ptr, r4 = field_id
- // Return 0 for all fields
- ctx.gpr[3] = 0;
-}
+fn rtl_image_xex_header_field(ctx: &mut PpcContext, mem: &GuestMemory, state: &mut KernelState) {
+ // r3 = xex_header_guest_ptr (may be NULL — game's CRT often passes 0
+ // because ours's `*XexExecutableModuleHandle = image_base` doesn't
+ // resolve to a real LDR_DATA_TABLE_ENTRY ...). When NULL, fall back
+ // to KernelState's recorded `xex_header_guest_ptr`.
+ // r4 = field_key (xex2_header_keys).
+ //
+ // Mirror of canary's `xboxkrnl_rtl.cc:501-514` →
+ // `UserModule::GetOptHeader(memory, header, key, &field_value)`.
+ let mut xex_header_ptr = ctx.gpr[3] as u32;
+ let field_key = ctx.gpr[4] as u32;
+ if xex_header_ptr == 0 {
+ xex_header_ptr = state.xex_header_guest_ptr;
+ }
+ if xex_header_ptr == 0 {
+ ctx.gpr[3] = 0;
+ return;
+ }
+ let header_count = mem.read_u32(xex_header_ptr.wrapping_add(0x14));
+ let entries_base = xex_header_ptr.wrapping_add(0x18);
+ let mut field_value: u32 = 0;
+ let mut found = false;
+ for i in 0..header_count {
+ let entry_addr = entries_base.wrapping_add(i.wrapping_mul(8));
+ let entry_key = mem.read_u32(entry_addr);
+ if entry_key != field_key {
+ continue;
+ }
+ found = true;
+ let entry_value_addr = entry_addr.wrapping_add(4);
+ match entry_key & 0xFF {
+ 0x00 => { field_value = mem.read_u32(entry_value_addr); }
+ 0x01 => { field_value = entry_value_addr; }
+ _ => {
+ let offset = mem.read_u32(entry_value_addr);
+ field_value = xex_header_ptr.wrapping_add(offset);
+ }
+ }
+ break;
+ }
+ if !found {
+ ctx.gpr[3] = 0;
+ return;
+ }
+ ctx.gpr[3] = field_value as u64;
+}
```
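
To make the three key classes concrete, here is a small Python model of the lookup the new handler performs. It is a sketch, not the engine code: `GuestMemory`, `build_header` and `xex_header_field` are hypothetical stand-ins, big-endian u32 reads are an assumption about how `mem.read_u32` behaves, and key `0x00AA0001` is a synthetic key used only to exercise the `0x01` class.

```python
import struct


class GuestMemory:
    """Toy guest memory: one contiguous big-endian region starting at `base`."""

    def __init__(self, base, data):
        self.base = base
        self.data = bytes(data)

    def read_u32(self, va):
        return struct.unpack_from(">I", self.data, va - self.base)[0]


def xex_header_field(mem, header_va, key, fallback_va=0):
    """Model of the handler: walk the (key, value) table at header + 0x18."""
    header_va = header_va or fallback_va  # NULL argument: use recorded header VA
    if header_va == 0:
        return 0
    count = mem.read_u32(header_va + 0x14)
    for i in range(count):
        entry = header_va + 0x18 + i * 8
        if mem.read_u32(entry) != key:
            continue
        low = key & 0xFF
        if low == 0x00:
            return mem.read_u32(entry + 4)  # value stored inline
        if low == 0x01:
            return entry + 4  # guest VA of the value slot itself
        # Every other class: header base plus the stored offset.
        return (header_va + mem.read_u32(entry + 4)) & 0xFFFFFFFF
    return 0  # key not present


def build_header(entries):
    """Build a minimal header: count at 0x14, then 8-byte (key, value) pairs."""
    header = bytearray(0x18)
    struct.pack_into(">I", header, 0x14, len(entries))
    for key, value in entries:
        header += struct.pack(">II", key, value)
    return header


BASE = 0x3001E000
mem = GuestMemory(
    BASE,
    build_header([(0x00030000, 0x400), (0x00040006, 0x10E8), (0x00AA0001, 0x1234)]),
)
execution_info_va = xex_header_field(mem, BASE, 0x00040006)
```

With the header placed at `0x3001E000`, the `0x00040006` query yields `0x3001F0E8`, the same value canary reports at idx=102014. Passing `header_va=0` together with `fallback_va=BASE` models the NULL-argument path.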
### `xenia-rs/crates/xenia-app/src/main.rs` (+15 LOC)
In the variable-export patcher for ordinal `0x0193`
(`XexExecutableModuleHandle`), keep `*XexExecutableModuleHandle = base`
(don't disturb the CRT's early branch) but additionally allocate
guest memory for the raw XEX header bytes, copy them in via
`mem.write_bulk`, and record the guest VA in
`kernel.xex_header_guest_ptr` for the new
`rtl_image_xex_header_field` implementation to use as a fallback.
```diff
("xboxkrnl.exe", 0x0193) => {
- // XexExecutableModuleHandle -> image base
- mem.write_u32(addr, base);
+ // (long comment block)
+ let header_size = header.header_size as usize;
+ if header_size > 0 && header_size <= data.len() {
+ let xex_va = alloc_zero(header.header_size, &mut mem, &mut kernel);
+ if xex_va != 0 {
+ mem.write_bulk(xex_va, &data[0..header_size]);
+ kernel.xex_header_guest_ptr = xex_va;
+ }
+ }
+ mem.write_u32(addr, base);
}
```
### `xenia-rs/tools/diff-events/diff_events.py` (+13 LOC)
Add `RtlImageXexHeaderField` to the `ALLOCATOR_RETURN_FNS`
canonicalization set. The function's return value for "else"-class keys
is a guest VA inside the engine's in-guest XEX header copy, which is
allocated at host-allocator-dependent addresses (canary's
`SystemHeapAlloc` lands in `0x30xxxxxx`; ours's `KernelState::heap_alloc`
lands in `0x4xxxxxxx`). Per-(tid, name) ordinal sentinels mask this VA
divergence (same pattern as Phase C+2's allocator canonicalization).
```diff
ALLOCATOR_RETURN_FNS = frozenset(
[
"MmAllocatePhysicalMemoryEx",
"MmAllocatePhysicalMemory",
"NtAllocateVirtualMemory",
"RtlAllocateHeap",
"MmCreateKernelStack",
+ # Phase C+3: `RtlImageXexHeaderField` returns ... (see source).
+ "RtlImageXexHeaderField",
]
)
```
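
The per-(tid, name) ordinal sentinel idea can be sketched as follows. This is an illustration under assumptions, not the code in `diff_events.py`: the sentinel format `<name#ordinal>` is invented here, the set is trimmed to two names, every matching return is masked (including a legitimate 0, which the real tool may handle differently), and the ours-side address `0x400010E8` is a made-up example value.

```python
from collections import defaultdict

# Trimmed, illustrative copy of the canonicalization set.
ALLOCATOR_RETURN_FNS = frozenset(["RtlAllocateHeap", "RtlImageXexHeaderField"])


def canonicalize(events):
    """Replace allocator-style return values with per-(tid, name) ordinals."""
    counters = defaultdict(int)
    out = []
    for ev in events:
        payload = ev.get("payload", {})
        name = payload.get("name")
        if ev.get("kind") == "kernel.return" and name in ALLOCATOR_RETURN_FNS:
            ordinal = counters[(ev["tid"], name)]
            counters[(ev["tid"], name)] += 1
            sentinel = f"<{name}#{ordinal}>"
            # Copy rather than mutate, so the raw stream stays available.
            ev = dict(ev, payload=dict(payload, return_value=sentinel, status=sentinel))
        out.append(ev)
    return out


canary = [{"tid": 6, "kind": "kernel.return",
           "payload": {"name": "RtlImageXexHeaderField",
                       "return_value": 0x3001F0E8, "status": "0x3001f0e8"}}]
ours = [{"tid": 1, "kind": "kernel.return",
         "payload": {"name": "RtlImageXexHeaderField",
                     "return_value": 0x400010E8, "status": "0x400010e8"}}]
same = canonicalize(canary)[0]["payload"] == canonicalize(ours)[0]["payload"]
```

After canonicalization the two payloads compare equal even though the raw guest VAs come from different heap bands.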


@@ -0,0 +1,214 @@
# Phase C+3 — investigation: `RtlImageXexHeaderField` at idx=102014
## Divergence
| | canary | ours (pre-fix) |
|---|---|---|
| `payload.return_value` (idx=102014, tid=6→1) | `805433576` = `0x3001F0E8` | `0` |
| `payload.status` | `0x3001f0e8` | `0x00000000` |
| Surrounding context (idx 102009..102013) | `RtlLeaveCriticalSection`, then `RtlImageXexHeaderField` | (matches) |
| Game thread | tid=6 main | tid=1 main |
| Next event (idx=102015) | `NtCreateFile` | `NtCreateFile` (matches) |
`0x3001F0E8` is in canary's virtual-heap region (`0x30xxxxxx`) — the
`Memory::SystemHeapAlloc` band — so the value is a guest VA pointing
inside canary's in-guest XEX header copy (allocated in
`user_module.cc:224` as `guest_xex_header_`). Ours returns 0 because
its stub `rtl_image_xex_header_field` (`exports.rs:2391-2395`) returned
0 unconditionally.
## Step 1 — Event context at idx=102014
From canary's existing Phase A capture
(`xenia-rs/audit-runs/phase-c-first-divergence/phase-a/canary.jsonl`),
canary's tid=6 makes only **two** `RtlImageXexHeaderField` calls in the
matched prefix:
| event idx | event kind | payload |
|---|---|---|
| 0 | import.call | `RtlImageXexHeaderField` |
| 1 | kernel.call | `RtlImageXexHeaderField` (args:{} — schema-v1 doesn't capture args) |
| 2 | kernel.return | `return_value=0 status=0x00000000` |
| 102012 | import.call | `RtlImageXexHeaderField` |
| 102013 | kernel.call | `RtlImageXexHeaderField` |
| 102014 | kernel.return | `return_value=805433576 status=0x3001f0e8` |
Ours pre-fix makes the same call sequence (verified by capture in
`phase-c1-keQuerySystemTime/ours.jsonl`) — both `RtlImageXexHeaderField`
calls returned 0.
Schema-v1 records empty `args:{}`, so `field_key` (r4) and `xex_header_ptr`
(r3) aren't directly readable from the JSONL. A one-shot `eprintln` in
ours's stub revealed both calls pass:
* call #1: `xex_header_ptr=0x00000000 field_key=0x00020401` (DEFAULT_HEAP_SIZE — not present in this XEX, so even with a valid header pointer the result would be 0)
* call #2: `xex_header_ptr=0x00000000 field_key=0x00040006` (EXECUTION_INFO — low byte `0x06`, "else" class, returns `header_base + offset(0x10E8)`)
`xenia-rs/target/release/xenia-rs info` against the ISO confirms the
in-XEX optional-header table. Key `0x00040006` is present with value
`0x000010E8`; key `0x00020401` is not present. So canary's `0x3001F0E8`
= `0x3001E000 + 0x10E8` — canary's `guest_xex_header_` lives at
`0x3001E000`. The game queries `EXECUTION_INFO` and uses the
returned VA to read media_id / title_id / disc_number / disc_count.
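
The address arithmetic in this step can be checked mechanically. The snippet is only a worked calculation using the numbers quoted above; the variable names are illustrative.

```python
return_value = 805433576   # canary payload.return_value at idx=102014
offset = 0x000010E8        # value stored for key 0x00040006 in the XEX
field_key = 0x00040006     # EXECUTION_INFO

return_hex = f"0x{return_value:08X}"
header_base = return_value - offset   # where canary's header copy must start
key_class = field_key & 0xFF          # low byte selects the lookup class
```

The low byte `0x06` is neither `0x00` nor `0x01`, so the key falls in the "else" class and the result is `header_base + offset`.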
## Step 2 — Source-read both engines
### Canary
`xenia-canary/src/xenia/kernel/xboxkrnl/xboxkrnl_rtl.cc:501-515`:
```cpp
pointer_result_t RtlImageXexHeaderField_entry(pointer_t<xex2_header> xex_header,
dword_t field_dword) {
uint32_t field_value = 0;
uint32_t field = field_dword;
if (!xex_header) {
return field_value;
}
UserModule::GetOptHeader(kernel_memory(), xex_header, xex2_header_keys(field),
&field_value);
return field_value;
}
```
`UserModule::GetOptHeader` (`user_module.cc:335-369`):
```cpp
for (uint32_t i = 0; i < header->header_count; i++) {
auto& opt_header = header->headers[i];
if (opt_header.key != key) continue;
switch (opt_header.key & 0xFF) {
case 0x00: field_value = opt_header.value; break;
case 0x01: field_value = memory->HostToGuestVirtual(&opt_header.value); break;
default: field_value = memory->HostToGuestVirtual(header) + opt_header.offset; break;
}
break;
}
*out = field_value;
```
The argument `xex_header` is a guest VA pointing at the in-guest copy of
the raw XEX header bytes (allocated by `user_module.cc:223-227`'s
`guest_xex_header_ = SystemHeapAlloc(header->header_size); memcpy(...)`).
The game reaches it via `*XexExecutableModuleHandle → hmodule_ptr →
*(hmodule + 0x58) = xex_header_base` (canary `xmodule.h:49`).
### Ours
`xenia-rs/crates/xenia-kernel/src/exports.rs:2391-2395` (pre-fix):
```rust
fn rtl_image_xex_header_field(ctx: &mut PpcContext, _mem: &GuestMemory, _state: &mut KernelState) {
// r3 = xex_header_ptr, r4 = field_id
// Return 0 for all fields
ctx.gpr[3] = 0;
}
```
A complete stub. The entire function body is wrong.
`xenia-rs/crates/xenia-app/src/main.rs:1440-1442` (pre-fix):
```rust
("xboxkrnl.exe", 0x0193) => {
// XexExecutableModuleHandle -> image base
mem.write_u32(addr, base);
}
```
Writes `image_base` (e.g. `0x82000000`) to the variable slot instead of
a guest VA pointing to an `X_LDR_DATA_TABLE_ENTRY`. The game's CRT
reads `*XexExecutableModuleHandle` (which is `base`), then reads
`*(base + 0x58)`, which lands in PE OptionalHeader bytes (`0x61602063`
for this ISO). The game treats that value as invalid and falls through
to call `RtlImageXexHeaderField` with `r3=NULL`, regardless of which key
it wants to query.
## Step 3 — Classification
This is **class (B-extreme)**: not "missing handler for one field key"
but "the entire function body is a stub returning 0". The XEX header
data IS parsed by ours's loader (`xenia-xex/src/header.rs` defines
`Vec<Xex2OptionalHeader>`), but never made available to the kernel
import handler.
Additionally, the upstream LDR chain is wrong:
`XexExecutableModuleHandle` doesn't point to a real
LDR_DATA_TABLE_ENTRY. But fixing THAT turned out to be
Phase-A-regressing (see below).
### Sub-finding: LDR fix shifts boot trajectory
The first fix attempt (initial commit: replace `mem.write_u32(addr, base)`
with a proper `X_LDR_DATA_TABLE_ENTRY` allocation that pointed to a
copy of the XEX header) BROKE the matched-prefix metric:
| approach | tid=6→tid=1 matched |
|---|---|
| pre-fix (C+2 baseline) | 102014 |
| with full LDR setup (first attempt) | **0** (regression) |
| header-bytes-only, KernelState fallback in handler (final) | **102032** (+18 past 102014) |
Reason: ours's CRT entry path examines `*XexExecutableModuleHandle`.
When it's `0x82000000` (image base), the CRT takes the "module not yet
queryable" path which makes an early `RtlImageXexHeaderField(NULL, key)`
probe (returning 0 — matches canary). When `*XexExecutableModuleHandle`
is `0x4xxxxxxx` (a real LDR allocated by `KernelState::heap_alloc`), the
CRT takes the "module queryable" path and skips the early probe call
entirely. The two engines' event sequences then drift starting at idx=0.
Canary's `hmodule_ptr` is also a heap allocation (via
`Memory::SystemHeapAlloc`, which lands in `0x30xxxxxx` for the virtual
heap; ours lands in `0x4xxxxxxx`). Either way it should fall in the same
"queryable" address class, yet canary's CRT still makes the early probe.
One possible cause is a cycle-level timing difference in when
`*XexExecutableModuleHandle` receives its final hmodule_ptr value: canary
writes it during `LaunchModule`, which is called after some PreLaunch
initialization, while ours writes it during xenia-app's Phase 3
variable-import patcher, which runs before any guest code. This was too
deep to chase in this session.
Final approach **preserves** the pre-fix CRT branch (game still passes
ptr=NULL on most calls) by keeping `*XexExecutableModuleHandle = base`,
then routes the handler through a KernelState fallback to recover the
correct return value. The handler now returns `xex_header_va + 0x10E8`
for the EXECUTION_INFO query at idx=102014.
## Step 4 — Pick the fix
Three deltas:
1. **`KernelState::xex_header_guest_ptr: u32`** — record where the
guest-memory copy of the raw XEX header lives.
2. **`xenia-app::cmd_exec`** at the `XexExecutableModuleHandle` patcher:
keep `*XexExecutableModuleHandle = base` (don't disturb the CRT
branch), but additionally allocate `header.header_size` bytes in
guest memory and `mem.write_bulk(&data[..header_size])` to copy the
raw header in. Record the resulting guest VA in
`kernel.xex_header_guest_ptr`.
3. **`rtl_image_xex_header_field`** — implement the lookup mirroring
canary's `UserModule::GetOptHeader`. Fall back to
`state.xex_header_guest_ptr` when the caller passes NULL.
Plus a python-side canonicalization addition:
4. **`diff_events.py`** — add `RtlImageXexHeaderField` to
`ALLOCATOR_RETURN_FNS`. The return value for "else"-class keys is a
guest VA inside the in-guest XEX header copy, which is
host-allocator-dependent (`0x30xxxxxx` in canary,
`0x4xxxxxxx` in ours). Per-(tid, name) ordinal sentinels mask the
VA divergence — same pattern as Phase C+2's allocator canonicalization.
Total: ~80 LOC, 4 files.
## Cross-validation
* Pre-fix `eprintln` trace confirms `xex_header_ptr=0` for both ours
calls; field keys are `0x00020401` (not in XEX → returns 0) and
`0x00040006` (in XEX, "else" class → returns `header_base + 0x10E8`).
* Canary's idx=102014 return value `0x3001F0E8 = 0x3001E000 + 0x10E8`
confirms canary's `guest_xex_header_` is at `0x3001E000` and key
`0x00040006`'s offset entry is `0x10E8`.
* ours's `xenia-rs info` against the ISO confirms key `0x00040006`
is present with value `0x000010E8`.
All three independent evidence sources converge on the same field
semantics.


@@ -0,0 +1,117 @@
# Phase C+3 — re-validation
## Gate 1 — Determinism (cvar-OFF, ours)
3 fresh runs of `check -n 50000000 --stable-digest`:
| run | digest md5 |
|-----|------------|
| 1 | f7b035298e7e2d09d413c1457c6c6fa1 |
| 2 | f7b035298e7e2d09d413c1457c6c6fa1 |
| 3 | f7b035298e7e2d09d413c1457c6c6fa1 |
| Phase C/C+1/C+2 baseline | `608d8e8d293250698207a7d8fc0c18df` |
**Result**: ✅ byte-identical across 3 runs. New baseline `f7b03529…`
diverges from the C+2 baseline `608d8e8d…` — expected per Tripstone #4
("a real return-value fix in ours likely shifts the boot trajectory; the
baseline digest WILL change"). The fix is deterministic (only adds a
one-shot `alloc_zero` + `mem.write_bulk` at startup using bytes from
the on-disk XEX header — no entropy source introduced).
## Gate 2 — Phase B `image_canonical_sha256`
Not re-snapshotted. Inferred OK by code review: the fix touches only
* `KernelState::xex_header_guest_ptr` (new field, no interaction with image),
* `xenia-app::cmd_exec` (post-image-load `alloc_zero` into a fresh
region in `0x4xxxxxxx`; doesn't touch `mem.write_bulk(base,
&image_data)` at line 888),
* the `rtl_image_xex_header_field` handler (read-only),
* `diff_events.py` (python tool; no engine effect).
The PE image region `[base..base+image_size]` is byte-identical pre-
and post-fix.
## Gate 3 — Phase A matched-prefix extension (THE KEY METRIC)
Diffed `audit-runs/phase-c3-RtlImageXexHeaderField/ours.jsonl` against
the existing `phase-c-first-divergence/phase-a/canary.jsonl`.
With allocator canonicalization (default):
| chain | C+2 (pre-C3) | C+3 (post) | Δ |
|---|---|---|---|
| canary tid=6 → ours tid=1 (main) | 102014 | **102032** | **+18** |
| canary tid=4 → ours tid=11 | 5 | 5 | 0 |
| canary tid=7 → ours tid=2 | 2 | 2 | 0 |
| canary tid=12 → ours tid=7 | 2 | 2 | 0 |
| canary tid=14 → ours tid=9 | 11 | 11 | 0 |
| canary tid=15 → ours tid=10 | (no div) | (no div) | 0 |
**Main thread matched prefix grew from 102014 to 102032. Gate 3 ✅.**
The new first-divergence at idx=102032 is `XeKeysConsolePrivateKeySign`
(canary returns 1, ours returns 0) — that's the next Phase C+N target,
out of scope here.
With `--no-canonicalize-allocators` (backward-compat check):
matched=161 — same as Phase C+1, because the MmAllocatePhysicalMemoryEx
divergence at idx=161 dominates without canonicalization. With BOTH
allocator + xex-header canonicalization, prefix reaches 102032.
## Gate 4 — Build
```
$ cargo build --release -p xenia-app
Compiling xenia-kernel v0.1.0
Compiling xenia-app v0.1.0
Finished `release` profile [optimized] target(s) in 6.17s
```
One pre-existing dead-code warning (`walk_committed_regions`); not
introduced by this fix. Canary untouched.
## Gate 5 — Phase A determinism (emitter)
Two cvar-ON captures of the same engine binary on the same ISO,
md5-summing only deterministic fields (excluding `host_ns`):
```
ours.jsonl (run 1, deterministic-fields-only) 714f06373f2f8f0e2f2bb5f1082da862
/tmp/c3_pa_run2.jsonl (run 2, det-fields-only) 714f06373f2f8f0e2f2bb5f1082da862
```
Byte-identical. ✅
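
A digest over deterministic fields only can be sketched like this. The sketch assumes `host_ns` is the sole excluded field (as stated above) and that each event is re-serialized with sorted keys before hashing. That serialization choice and the name `deterministic_digest` are assumptions, so this will not reproduce the md5 values quoted in this gate.

```python
import hashlib
import json

NONDETERMINISTIC_FIELDS = ("host_ns",)


def deterministic_digest(lines):
    """md5 over JSONL events with non-deterministic fields removed."""
    digest = hashlib.md5()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        for field in NONDETERMINISTIC_FIELDS:
            event.pop(field, None)
        # Sorted keys and fixed separators give a stable byte representation.
        stable = json.dumps(event, sort_keys=True, separators=(",", ":"))
        digest.update(stable.encode("utf-8"))
        digest.update(b"\n")
    return digest.hexdigest()


run1 = ['{"tid": 1, "tid_event_idx": 0, "kind": "import.call", "host_ns": 100, "payload": {"name": "KeSetEvent"}}']
run2 = ['{"host_ns": 999, "kind": "import.call", "tid": 1, "tid_event_idx": 0, "payload": {"name": "KeSetEvent"}}']
run3 = ['{"tid": 1, "tid_event_idx": 0, "kind": "import.call", "host_ns": 100, "payload": {"name": "KeWaitForSingleObject"}}']
identical = deterministic_digest(run1) == deterministic_digest(run2)
```

Two captures that differ only in `host_ns` (or in key order) hash the same, while any change to a deterministic field changes the digest.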
## Gate 6 — `--no-canonicalize-allocators` backward-compat
Diff with the flag set reproduces the Phase C+1 baseline result of
**matched=161** (MmAllocatePhysicalMemoryEx divergence at idx=161).
This confirms the canonicalization is purely additive at the diff-tool
level and the engine fix doesn't disturb the raw-VA stream upstream.
## Gate 7 — Kernel unit tests
```
$ cargo test --release -p xenia-kernel
test result: ok. 129 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out
```
✅. Two new tests would be a logical addition (validating that
`rtl_image_xex_header_field` returns the right value for each
key-class), but they were kept out of this session's scope per "minimal fix".
## Summary
All 7 gates pass (Gate 2 by code review, not a re-snapshot). Phase A main matched prefix grew from 102014 to
102032 (+18 events). The fix is symmetric: canary calls
`UserModule::GetOptHeader` on its in-guest header copy via the
`XexExecutableModuleHandle → hmodule_ptr → +0x58 → xex_header_base`
chain; ours now performs the same lookup against its own in-guest
header copy, with a `KernelState::xex_header_guest_ptr` fallback when
the chain yields NULL (which it does in ours because the LDR walk goes
through `*XexExecutableModuleHandle = image_base` — see investigation
for why fixing the LDR is Phase-A-regressing).
Next divergence: **XeKeysConsolePrivateKeySign @ tid_event_idx=102032**
(canary returns 1, ours returns 0). By analogy with this session, the
likely class is (A) missing handler or (B) stub returning 0. This is the
Phase C+4 target.