Phase C+6 — investigation: call-name divergence at idx=102132
Divergence
| | canary | ours (pre-fix) |
|---|---|---|
| import.call at idx=102132, tid=6→1 | NtClose (ord 207) | IoDismountVolumeByFileHandle (ord 60) |
Phase 0 — Rule out ord→name lookup bug
Both engines map ord 0x3C (60) to IoDismountVolumeByFileHandle and
ord 0xCF (207) to NtClose. Cross-check:
- xenia-canary/src/xenia/kernel/xboxkrnl/xboxkrnl_table.inc:74: XE_EXPORT(xboxkrnl, 0x0000003C, IoDismountVolumeByFileHandle, kFunction)
- xenia-canary/src/xenia/kernel/xboxkrnl/xboxkrnl_table.inc:221: XE_EXPORT(xboxkrnl, 0x000000CF, NtClose, kFunction)
- xenia-rs/crates/xenia-kernel/src/exports.rs:31: register_export(Xboxkrnl, 0x3C, "IoDismountVolumeByFileHandle", stub_success)
- xenia-rs/crates/xenia-kernel/src/exports.rs:91: register_export(Xboxkrnl, 0xCF, "NtClose", nt_close)
Ord→name mapping is byte-identical. Phase 0 rules out emitter name-lookup bug. This is a real event-stream divergence.
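The Phase 0 cross-check can be mechanized. The following is a minimal sketch, assuming both tables have already been extracted into ordinal→name maps; `name_mismatches` and `phase0_tables` are hypothetical helpers, not part of either codebase, and the fixture holds only the two ordinals quoted above.

```rust
use std::collections::BTreeMap;

/// Returns every ordinal whose name differs between the two tables as
/// (ordinal, canary_name, ours_name). Ordinals present on only one side
/// are skipped; only mislabels are reported.
fn name_mismatches<'a>(
    canary: &BTreeMap<u32, &'a str>,
    ours: &BTreeMap<u32, &'a str>,
) -> Vec<(u32, &'a str, &'a str)> {
    canary
        .iter()
        .filter_map(|(&ord, &c_name)| match ours.get(&ord) {
            Some(&o_name) if o_name != c_name => Some((ord, c_name, o_name)),
            _ => None,
        })
        .collect()
}

/// Fixture: the two ordinals involved in this divergence, as both sources list them.
fn phase0_tables() -> (BTreeMap<u32, &'static str>, BTreeMap<u32, &'static str>) {
    let entries = [(0x3C, "IoDismountVolumeByFileHandle"), (0xCF, "NtClose")];
    (
        entries.iter().cloned().collect(),
        entries.iter().cloned().collect(),
    )
}
```

An empty result for these two ordinals is what lets Phase 0 rule out a name-lookup bug.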
Phase 1 — Capture context around idx=102132 in both streams
audit-runs/phase-c5-NtWriteFile/ours.jsonl lines 102134..102137
(idx 102132..102135) and phase-c-first-divergence/phase-a/canary.jsonl
matching tid=6 events:
ours[102132] import.call IoDismountVolumeByFileHandle (ord 60)
ours[102133] kernel.call IoDismountVolumeByFileHandle
ours[102134] kernel.return IoDismountVolumeByFileHandle returns 0
ours[102135] import.call NtClose (ord 207)
ours[102136] kernel.call NtClose
ours[102137] kernel.return NtClose returns 0
canary[102132] import.call NtClose (ord 207)
canary[102133] kernel.call NtClose
canary[102134] kernel.return NtClose returns 0
canary[102135] import.call NtOpenFile (ord 223)
Observation: events 102135..102137 in ours (the NtClose triple) are
bit-identical to canary's events 102132..102134. Ours has 3 EXTRA
events (the IoDismountVolumeByFileHandle triple) injected at idx=102132
that canary's stream does NOT contain. After this 3-event surplus,
both streams realign: ours[102138] = canary[102135] = NtOpenFile.
So the game DOES call IoDismountVolumeByFileHandle in BOTH engines; the difference is purely whether the Phase A emitter fires.
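The realignment check can be sketched as a search for the smallest skip that makes the two streams match again. This assumes each event has been reduced to a comparable string; `surplus_at` and `sample_streams` are hypothetical names, and the fixture reproduces only the events listed above, re-indexed from 0.

```rust
/// Returns the number of surplus events `ours` holds at `idx`: the smallest
/// k in 1..=max_skip for which ours[idx + k..] equals canary[idx..] over
/// `window` events. Returns None if no such k realigns the streams.
fn surplus_at(
    ours: &[&str],
    canary: &[&str],
    idx: usize,
    window: usize,
    max_skip: usize,
) -> Option<usize> {
    (1..=max_skip).find(|&k| {
        let o = ours.get(idx + k..idx + k + window);
        let c = canary.get(idx..idx + window);
        match (o, c) {
            (Some(o), Some(c)) => o == c,
            _ => false, // one of the windows runs past the end of its stream
        }
    })
}

/// Fixture: the events around idx=102132, with index 0 standing in for 102132.
fn sample_streams() -> (Vec<&'static str>, Vec<&'static str>) {
    let ours = vec![
        "import.call IoDismountVolumeByFileHandle",
        "kernel.call IoDismountVolumeByFileHandle",
        "kernel.return IoDismountVolumeByFileHandle",
        "import.call NtClose",
        "kernel.call NtClose",
        "kernel.return NtClose",
        "import.call NtOpenFile",
    ];
    let canary = vec![
        "import.call NtClose",
        "kernel.call NtClose",
        "kernel.return NtClose",
        "import.call NtOpenFile",
    ];
    (ours, canary)
}
```

With a window of 4 events, the fixture realigns at a skip of 3, matching the +3 surplus described above.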
Phase 2 — Source-read both emitter paths
Canary emit logic (path A: declared export)
xenia-canary/src/xenia/kernel/util/shim_utils.h:597-602:
const bool phase_a_on = phase_a_bridge::Enabled();
if (phase_a_on) {
  phase_a_bridge::EmitImportAndCall(
      phase_a_bridge::KernelModuleIdName(MODULE), ORDINAL,
      export_entry->name);
}
This block sits inside Trampoline and is only reachable when a
DECLARE_XBOXKRNL_EXPORT shim wires
export_entry->function_data.trampoline = &X::Trampoline.
Canary emit logic (path B: table-entry-only, no DECLARE)
xenia-canary/src/xenia/cpu/xex_module.cc:1310-1335 import-thunk
generator: when kernel_export->function_data.trampoline == nullptr
(no DECLARE shim), the thunk is rewritten to sc 2; blr — the syscall
form. The "extern handler" wired in SetupExtern(handler=nullptr, ...)
forwards the call to PPCFrontend::SyscallHandler
(ppc_frontend.cc:83-92):
void SyscallHandler(PPCContext* ppc_context, void* arg0, void* arg1) {
  uint64_t syscall_number = ppc_context->r[0];
  switch (syscall_number) {
    default:
      assert_unhandled_case(syscall_number);
      XELOGE("Unhandled syscall {}!", syscall_number);
      break;
  }
}
No phase_a_bridge::EmitImportAndCall call. Canary emits NO Phase A
events for table-entry-only exports. Verified by grepping
xenia-canary for DECLARE_XBOXKRNL_EXPORT declarations — only 287
ords have implementations; IoDismountVolumeByFileHandle is NOT among
them. (See /tmp/canary_decl.txt snapshot in investigation.md's
work artifacts.)
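The two canary paths can be summarized in a small model. This is a simplified sketch written in Rust for consistency with the xenia-rs side, not a transcription of canary's C++; `ExportEntry`, `has_trampoline`, and `canary_phase_a_events` are hypothetical names, and the event strings only mimic the trace listing above.

```rust
/// Simplified model of one canary export-table entry.
struct ExportEntry {
    name: &'static str,
    ordinal: u32,
    /// True when a DECLARE_XBOXKRNL_EXPORT shim wired a trampoline (path A).
    has_trampoline: bool,
}

/// Returns the Phase A events canary would emit for one call to `entry`.
fn canary_phase_a_events(entry: &ExportEntry) -> Vec<String> {
    if !entry.has_trampoline {
        // Path B: the import thunk becomes `sc 2; blr` and never reaches
        // the emitter inside Trampoline, so nothing is logged.
        return Vec::new();
    }
    // Path A: the declared shim runs through Trampoline and emits the triple.
    vec![
        format!("import.call {} (ord {})", entry.name, entry.ordinal),
        format!("kernel.call {}", entry.name),
        format!("kernel.return {}", entry.name),
    ]
}
```

Under this model NtClose (declared) yields three events per call, while IoDismountVolumeByFileHandle (table entry only) yields none.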
Ours emit logic
xenia-rs/crates/xenia-kernel/src/state.rs:585-632 (pre-fix):
if let Some(&(name, func)) = self.exports.get(&(module, ordinal)) {
    ...
    let phase_a_on = crate::event_log::is_enabled();
    ...
    if phase_a_on {
        crate::event_log::emit_import_call(...);
        crate::event_log::emit_kernel_call(...);
    }
    func(&mut ctx, mem, self);
    if phase_a_on {
        let return_value = if is_void { 0 } else { ctx.gpr[3] };
        crate::event_log::emit_kernel_return(...);
    }
}
Ours emits import.call/kernel.call/kernel.return for EVERY
registered ord, regardless of whether canary has a real shim or
a syscall-thunk. IoDismountVolumeByFileHandle is registered as
stub_success and therefore generates 3 spurious Phase A events.
Phase 3 — Classification
Class (E) Phase A emitter framing — Phase A coverage gap. Ours's
emitter fires for stubs that canary leaves silent (canary uses the
syscall thunk for table-entry-only exports, which does not reach
Trampoline).
Not class (A): there is no guest-code branch flip, since events realign at +3. Not class (α): this is not canonicalization but an emitter asymmetry that we can fix at source. Not class (D): there is no deferred-item interaction, as neither heap region nor clock is involved.
Sister bugs surfaced (out-of-scope — documented for follow-up)
comm -23 <ours-xboxkrnl> <canary-DECLARE_XBOXKRNL_EXPORT> lists
12 xboxkrnl ords ours registers that canary doesn't have a shim
for. Of those, the following actually fire in the current 50M run
and would also drift Phase A alignment if their callers reached them:
- IoDismountVolumeByFileHandle (ord 0x3C, called 1× tid=1 main; fixed in this session)
- StfsCreateDevice (ord 0x259, called 1× tid=2; drives tid=7→tid=2 divergence at idx=15; out of scope per session-scope rule)
The other 10 (including DbgPrint, RtlCaptureContext, RtlUnwind, sprintf, _vsnprintf, __C_specific_handler, XeKeysConsoleSignatureVerification, StfsControlDevice) are not yet called in the 50M run; they will become relevant only after the game progresses further.
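The `comm -23` step is a plain set difference. Below is a minimal sketch of the same computation, assuming both name lists are already loaded into sets; `registered_without_shim` is a hypothetical helper, and the sample names in the usage note are a small subset chosen for illustration.

```rust
use std::collections::BTreeSet;

/// Names that ours registers but canary has no DECLARE_XBOXKRNL_EXPORT shim
/// for, in sorted order (the equivalent of `comm -23 ours canary`).
fn registered_without_shim<'a>(
    ours: &BTreeSet<&'a str>,
    canary_declared: &BTreeSet<&'a str>,
) -> Vec<&'a str> {
    ours.difference(canary_declared).cloned().collect()
}
```

For example, with ours = {NtClose, IoDismountVolumeByFileHandle, StfsCreateDevice} and canary's declared set = {NtClose}, the function returns IoDismountVolumeByFileHandle and StfsCreateDevice.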
Sister bug (different class): ord 0x82 is KeQueryInterruptTime
in canary but ours mis-labels it KeQueryIdealProcessor; ord 0x98 is
KeSetBackgroundProcessors in canary but ours mis-labels it
KeSetIdealProcessor. These are name-lookup bugs (Stage 2 cleanup
class) and are NOT addressed here; fixing them would require renaming
the function-pointer fn and either dropping the existing semantics or
moving them to a different ord.
Reading-error class #26 (additive)
Phase A emitter coverage asymmetry across the "table-entry-only"
vs "shim-implemented" axis. Canary's emitter fires only from
Trampoline (wired by DECLARE_XBOXKRNL_EXPORT). Ours's emitter
fires for every register_export regardless of canary equivalence.
When a guest import is in canary's table but has no DECLARE shim,
canary routes it through a no-op syscall thunk with no Phase A
emission — ours, by registering a stub_success, injects 3 spurious
events per call.
Discipline addition: when registering a kernel export as
stub_success/stub_return_zero etc., grep canary for
DECLARE_XBOXKRNL_EXPORT(<name> first. If absent, use
register_unimplemented_export instead of register_export so the
Phase A emitter stays silent (matching canary). Reading-error #21
(C+1's gpr[3]-as-return-value for void exports) and #25 (C+5's
wrong-register read) were both kernel-export-emitter bugs; this is
the third in that family.
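The discipline above can be illustrated with a toy registry. This is a sketch under stated assumptions: the real `register_export` and `register_unimplemented_export` signatures in xenia-kernel are not shown in this note, so the `Registry` type, the `ExportKind` enum, and `phase_a_event_count` below are hypothetical stand-ins that model only the emit/silent distinction.

```rust
use std::collections::HashMap;

/// Whether canary has a real shim for this export (hypothetical model).
#[derive(Clone, Copy, PartialEq, Debug)]
enum ExportKind {
    /// Canary has a DECLARE_XBOXKRNL_EXPORT shim; Phase A events are emitted.
    Implemented,
    /// Canary has only a table entry; the emitter must stay silent.
    Unimplemented,
}

struct Registry {
    exports: HashMap<u32, (&'static str, ExportKind)>,
}

impl Registry {
    fn new() -> Self {
        Registry { exports: HashMap::new() }
    }

    fn register_export(&mut self, ord: u32, name: &'static str) {
        self.exports.insert(ord, (name, ExportKind::Implemented));
    }

    fn register_unimplemented_export(&mut self, ord: u32, name: &'static str) {
        self.exports.insert(ord, (name, ExportKind::Unimplemented));
    }

    /// Number of Phase A events one call to `ord` would emit:
    /// import.call + kernel.call + kernel.return, or nothing.
    fn phase_a_event_count(&self, ord: u32) -> usize {
        match self.exports.get(&ord) {
            Some(&(_, ExportKind::Implemented)) => 3,
            _ => 0,
        }
    }
}
```

Registering IoDismountVolumeByFileHandle through the unimplemented path in this model yields zero events per call, which is the behaviour that matches canary's syscall-thunk route.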