handoff: VSync/event-wedge fixes + iterate 2.A–2.BC research notes
Source changes (dormant parity infra, retained from iterate 2.AI/2.AO):
- xenia-kernel/exports.rs: nt_create_event manual_reset polarity + related event wiring
- xenia-gpu/mmio_region.rs: D1MODE_VBLANK_VLINE_STATUS hardcode parity

Also lands the audit-runs/ analysis notes (.md/.txt/.json digests) for the iterate 2.x VSync/0x10e8/0x1004 wedge investigation. Raw trace dumps (.jsonl/.gz/.csv/.stdout) and agent worktrees (.claude/) are gitignored as regenerable local artifacts; see memory + HANDOFF for the running findings.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
189
audit-runs/phase-c1-keQuerySystemTime/diff-report.md
Normal file
@@ -0,0 +1,189 @@
# Phase A diff report

**This report is the output of Phase A's diff harness. Divergences
shown here are INPUT for Phase B (first-divergence localization),
not findings of Phase A.** Phase A's job is to make the harness
itself correct, not to analyze what it surfaces.

## Summary

| canary_tid | ours_tid | matched | canary_total | ours_total | first_divergence_at |
|---|---|---|---|---|---|
| 4 | 11 | 5 | 47573 | 9 | 5 |
| 6 | 1 | 161 | 329948 | 108492 | 161 |
| 7 | 2 | 2 | 29 | 33 | 2 |
| 12 | 7 | 2 | 6689 | 3 | 2 |
| 14 | 9 | 11 | 1371603 | 75 | 11 |
| 15 | 10 | 15 | 863209 | 15 | — |

## canary_tid=4 → ours_tid=11

First divergence at `tid_event_idx=5`: payload.return_value: canary=1 ours=0

**Pre-context (last 5 matching events):**
```
canary: [0] import.call RtlEnterCriticalSection
ours: [0] import.call RtlEnterCriticalSection
canary: [1] kernel.call RtlEnterCriticalSection
ours: [1] kernel.call RtlEnterCriticalSection
canary: [2] kernel.return RtlEnterCriticalSection
ours: [2] kernel.return RtlEnterCriticalSection
canary: [3] import.call KeSetEvent
ours: [3] import.call KeSetEvent
canary: [4] kernel.call KeSetEvent
ours: [4] kernel.call KeSetEvent
```

**Divergent event:**
```
canary: [5] kernel.return KeSetEvent
ours: [5] kernel.return KeSetEvent
```

**Next event after the divergence (if any):**
```
canary: [6] import.call KeWaitForMultipleObjects
ours: [6] import.call KeWaitForMultipleObjects
```

**Raw events (JSON):**
```json
{"deterministic": true, "engine": "canary", "guest_cycle": 0, "host_ns": 1080594600, "kind": "kernel.return", "payload": {"name": "KeSetEvent", "return_value": 1, "side_effects": [], "status": "0x00000001"}, "schema_version": 1, "tid": 4, "tid_event_idx": 5}
{"deterministic": true, "engine": "ours", "guest_cycle": 33, "host_ns": 1688874821, "kind": "kernel.return", "payload": {"name": "KeSetEvent", "return_value": 0, "side_effects": [], "status": "0x00000000"}, "schema_version": 1, "tid": 11, "tid_event_idx": 5}
```

## canary_tid=6 → ours_tid=1

First divergence at `tid_event_idx=161`: payload.return_value: canary=18446744072570929152 ours=1074810880

**Pre-context (last 5 matching events):**
```
canary: [156] import.call RtlLeaveCriticalSection
ours: [156] import.call RtlLeaveCriticalSection
canary: [157] kernel.call RtlLeaveCriticalSection
ours: [157] kernel.call RtlLeaveCriticalSection
canary: [158] kernel.return RtlLeaveCriticalSection
ours: [158] kernel.return RtlLeaveCriticalSection
canary: [159] import.call MmAllocatePhysicalMemoryEx
ours: [159] import.call MmAllocatePhysicalMemoryEx
canary: [160] kernel.call MmAllocatePhysicalMemoryEx
ours: [160] kernel.call MmAllocatePhysicalMemoryEx
```

**Divergent event:**
```
canary: [161] kernel.return MmAllocatePhysicalMemoryEx
ours: [161] kernel.return MmAllocatePhysicalMemoryEx
```

**Next event after the divergence (if any):**
```
canary: [162] import.call RtlInitializeCriticalSectionAndSpinCount
ours: [162] import.call RtlInitializeCriticalSectionAndSpinCount
```

**Raw events (JSON):**
```json
{"deterministic": true, "engine": "canary", "guest_cycle": 0, "host_ns": 20945900, "kind": "kernel.return", "payload": {"name": "MmAllocatePhysicalMemoryEx", "return_value": 18446744072570929152, "side_effects": [], "status": "0xbc220000"}, "schema_version": 1, "tid": 6, "tid_event_idx": 161}
{"deterministic": true, "engine": "ours", "guest_cycle": 10467, "host_ns": 73643796, "kind": "kernel.return", "payload": {"name": "MmAllocatePhysicalMemoryEx", "return_value": 1074810880, "side_effects": [], "status": "0x40105000"}, "schema_version": 1, "tid": 1, "tid_event_idx": 161}
```

## canary_tid=7 → ours_tid=2

First divergence at `tid_event_idx=2`: payload.return_value: canary=0 ours=1896873464

**Pre-context (last 5 matching events):**
```
canary: [0] import.call RtlInitAnsiString
ours: [0] import.call RtlInitAnsiString
canary: [1] kernel.call RtlInitAnsiString
ours: [1] kernel.call RtlInitAnsiString
```

**Divergent event:**
```
canary: [2] kernel.return RtlInitAnsiString
ours: [2] kernel.return RtlInitAnsiString
```

**Next event after the divergence (if any):**
```
canary: [3] import.call NtCreateFile
ours: [3] import.call NtCreateFile
```

**Raw events (JSON):**
```json
{"deterministic": true, "engine": "canary", "guest_cycle": 0, "host_ns": 728945300, "kind": "kernel.return", "payload": {"name": "RtlInitAnsiString", "return_value": 0, "side_effects": [], "status": "0x00000000"}, "schema_version": 1, "tid": 7, "tid_event_idx": 2}
{"deterministic": true, "engine": "ours", "guest_cycle": 2475, "host_ns": 474790156, "kind": "kernel.return", "payload": {"name": "RtlInitAnsiString", "return_value": 1896873464, "side_effects": [], "status": "0x710ffdf8"}, "schema_version": 1, "tid": 2, "tid_event_idx": 2}
```

## canary_tid=12 → ours_tid=7

First divergence at `tid_event_idx=2`: payload.return_value: canary=258 ours=0

**Pre-context (last 5 matching events):**
```
canary: [0] import.call KeWaitForSingleObject
ours: [0] import.call KeWaitForSingleObject
canary: [1] kernel.call KeWaitForSingleObject
ours: [1] kernel.call KeWaitForSingleObject
```

**Divergent event:**
```
canary: [2] kernel.return KeWaitForSingleObject
ours: [2] kernel.return KeWaitForSingleObject
```

**Next event after the divergence (if any):**
```
canary: [3] import.call RtlEnterCriticalSection
ours: <end of stream>
```

**Raw events (JSON):**
```json
{"deterministic": true, "engine": "canary", "guest_cycle": 0, "host_ns": 904485700, "kind": "kernel.return", "payload": {"name": "KeWaitForSingleObject", "return_value": 258, "side_effects": [], "status": "0x00000102"}, "schema_version": 1, "tid": 12, "tid_event_idx": 2}
{"deterministic": true, "engine": "ours", "guest_cycle": 30, "host_ns": 502123296, "kind": "kernel.return", "payload": {"name": "KeWaitForSingleObject", "return_value": 0, "side_effects": [], "status": "0x00000000"}, "schema_version": 1, "tid": 7, "tid_event_idx": 2}
```

## canary_tid=14 → ours_tid=9

First divergence at `tid_event_idx=11`: payload.return_value: canary=2 ours=0

**Pre-context (last 5 matching events):**
```
canary: [6] import.call KeAcquireSpinLockAtRaisedIrql
ours: [6] import.call KeAcquireSpinLockAtRaisedIrql
canary: [7] kernel.call KeAcquireSpinLockAtRaisedIrql
ours: [7] kernel.call KeAcquireSpinLockAtRaisedIrql
canary: [8] kernel.return KeAcquireSpinLockAtRaisedIrql
ours: [8] kernel.return KeAcquireSpinLockAtRaisedIrql
canary: [9] import.call KeRaiseIrqlToDpcLevel
ours: [9] import.call KeRaiseIrqlToDpcLevel
canary: [10] kernel.call KeRaiseIrqlToDpcLevel
ours: [10] kernel.call KeRaiseIrqlToDpcLevel
```

**Divergent event:**
```
canary: [11] kernel.return KeRaiseIrqlToDpcLevel
ours: [11] kernel.return KeRaiseIrqlToDpcLevel
```

**Next event after the divergence (if any):**
```
canary: [12] import.call KeRaiseIrqlToDpcLevel
ours: [12] import.call KeRaiseIrqlToDpcLevel
```

**Raw events (JSON):**
```json
{"deterministic": true, "engine": "canary", "guest_cycle": 0, "host_ns": 1081453000, "kind": "kernel.return", "payload": {"name": "KeRaiseIrqlToDpcLevel", "return_value": 2, "side_effects": [], "status": "0x00000002"}, "schema_version": 1, "tid": 14, "tid_event_idx": 11}
{"deterministic": true, "engine": "ours", "guest_cycle": 77, "host_ns": 1688919712, "kind": "kernel.return", "payload": {"name": "KeRaiseIrqlToDpcLevel", "return_value": 0, "side_effects": [], "status": "0x00000000"}, "schema_version": 1, "tid": 9, "tid_event_idx": 11}
```

## canary_tid=15 → ours_tid=10

No divergence within the 15 compared events (canary has 863209, ours has 15).
10
audit-runs/phase-c1-keQuerySystemTime/digest-cvaroff-1.json
Normal file
@@ -0,0 +1,10 @@
{
  "instructions": 50000001,
  "imports": 40454,
  "unimpl": 0,
  "draws": 0,
  "swaps": 1,
  "unique_render_targets": 0,
  "shader_blobs_live": 0,
  "texture_cache_entries": 0
}
10
audit-runs/phase-c1-keQuerySystemTime/digest-cvaroff-2.json
Normal file
@@ -0,0 +1,10 @@
{
  "instructions": 50000001,
  "imports": 40454,
  "unimpl": 0,
  "draws": 0,
  "swaps": 1,
  "unique_render_targets": 0,
  "shader_blobs_live": 0,
  "texture_cache_entries": 0
}
10
audit-runs/phase-c1-keQuerySystemTime/digest-cvaroff-3.json
Normal file
@@ -0,0 +1,10 @@
{
  "instructions": 50000001,
  "imports": 40454,
  "unimpl": 0,
  "draws": 0,
  "swaps": 1,
  "unique_render_targets": 0,
  "shader_blobs_live": 0,
  "texture_cache_entries": 0
}
157
audit-runs/phase-c1-keQuerySystemTime/fix.diff
Normal file
@@ -0,0 +1,157 @@
diff --git a/crates/xenia-kernel/src/exports.rs b/crates/xenia-kernel/src/exports.rs
index a4dfa7d..56232fb 100644
--- a/crates/xenia-kernel/src/exports.rs
+++ b/crates/xenia-kernel/src/exports.rs
@@ -46,7 +46,13 @@ pub fn register_exports(state: &mut KernelState) {
     state.register_export(Xboxkrnl, 0x81, "KeQueryBasePriorityThread", ke_query_base_priority_thread);
     state.register_export(Xboxkrnl, 0x82, "KeQueryIdealProcessor", ke_query_ideal_processor);
     state.register_export(Xboxkrnl, 0x83, "KeQueryPerformanceFrequency", ke_query_performance_frequency);
-    state.register_export(Xboxkrnl, 0x84, "KeQuerySystemTime", ke_query_system_time);
+    // Canary declares `void KeQuerySystemTime_entry(lpqword_t time_ptr, ...)`
+    // (xboxkrnl_threading.cc:459); the time is delivered via the OUT
+    // pointer, not via gpr[3]. Phase A's `kernel.return.return_value`
+    // must be 0 (canary literal) — not r3 (which for ours is the input
+    // arg `time_ptr` left untouched). See `register_void_export` doc in
+    // state.rs.
+    state.register_void_export(Xboxkrnl, 0x84, "KeQuerySystemTime", ke_query_system_time);
     state.register_export(Xboxkrnl, 0x85, "KeRaiseIrqlToDpcLevel", stub_return_zero);
     state.register_export(Xboxkrnl, 0x88, "KeReleaseSemaphore", ke_release_semaphore);
     state.register_export(Xboxkrnl, 0x89, "KeReleaseSpinLockFromRaisedIrql", ke_release_spinlock_from_raised_irql);
diff --git a/crates/xenia-kernel/src/state.rs b/crates/xenia-kernel/src/state.rs
index b256fe7..b076ff7 100644
--- a/crates/xenia-kernel/src/state.rs
+++ b/crates/xenia-kernel/src/state.rs
@@ -50,6 +50,17 @@ pub const HMODULE_XAM: u32 = 0xFFFE_0002;
 /// Central kernel state tracking all guest OS state.
 pub struct KernelState {
     exports: HashMap<(ModuleId, u32), (&'static str, KernelExportFn)>,
+    /// Phase A: kernel exports whose canary signature is `void` (no
+    /// dword_result_t / pointer_result_t). For symmetry with canary's
+    /// `if constexpr (std::is_void<R>::value)` trampoline branch
+    /// (see `xenia-canary/src/xenia/kernel/util/shim_utils.h`), the
+    /// Phase A `kernel.return` event for these exports emits
+    /// `return_value=0` instead of `gpr[3]` (which for void fns is
+    /// just the input arg pointer left untouched). Without this,
+    /// e.g. `KeQuerySystemTime` — declared `void` in canary, taking a
+    /// `lpqword_t time_ptr` — would report ours's r3=time_ptr but
+    /// canary's literal 0, producing a spurious diff. Cvar-OFF inert.
+    void_exports: std::collections::HashSet<(ModuleId, u32)>,
     /// M2.4: bump allocator for kernel handles. `AtomicU32` so concurrent
     /// HLE calls under M3 can `fetch_add` without a lock. `Relaxed` is
     /// fine — the allocated value is a fresh ID with no prior payload to
@@ -264,6 +275,23 @@ pub struct KernelState {
     pub dump_addrs: Vec<u32>,
     /// `--dump-section=BASE:LEN:PATH` end-of-run snapshot, page-gated by `is_mapped`.
     pub dump_section: Option<(u32, u32, std::path::PathBuf)>,
+    /// Phase B initial-state snapshot — directory under which a
+    /// `ours/{cpu_state,memory,kernel,vfs,config}.json` + `manifest.json`
+    /// snapshot is written at the moment immediately before the first
+    /// guest PPC instruction of the XEX entry_point. `None` (default) =
+    /// disabled, zero overhead. See
+    /// `xenia-rs/audit-runs/phase-b-state-equivalence/`.
+    pub phase_b_snapshot_dir: Option<std::path::PathBuf>,
+    /// Phase B: after writing the snapshot, exit the process immediately
+    /// so re-runs are byte-deterministic. Default false.
+    pub phase_b_snapshot_and_exit: bool,
+    /// Phase B: include raw bytes in `memory.json`'s `section_contents`.
+    /// Default false — per-region SHA-256 is enough for the routine diff.
+    pub phase_b_dump_section_content: bool,
+    /// Phase B: the XEX entry_point address — captured by the app at
+    /// `install_initial_thread` time and consulted by the snapshot hook
+    /// to validate the firing thread is the entry thread.
+    pub entry_pc: u32,
 }
 
 impl KernelState {
@@ -288,6 +316,7 @@ impl KernelState {
         scheduler.set_reservation_table(Some(reservations.clone()));
         let mut state = Self {
             exports: HashMap::new(),
+            void_exports: std::collections::HashSet::new(),
             next_handle: AtomicU32::new(0x1000),
             scheduler,
             next_tls_index: AtomicU32::new(0),
@@ -331,6 +360,10 @@ impl KernelState {
             lr_trace_writer: None,
             dump_addrs: Vec::new(),
             dump_section: None,
+            phase_b_snapshot_dir: None,
+            phase_b_snapshot_and_exit: false,
+            phase_b_dump_section_content: false,
+            entry_pc: 0,
         };
         crate::exports::register_exports(&mut state);
         crate::xam::register_exports(&mut state);
@@ -377,6 +410,22 @@ impl KernelState {
         self.exports.insert((module, ordinal), (name, func));
     }
 
+    /// Register a kernel export whose canary signature is `void`.
+    /// See `KernelState::void_exports` doc. Identical semantics to
+    /// `register_export` except the Phase A `kernel.return` payload's
+    /// `return_value` field is emitted as 0 instead of `gpr[3]`,
+    /// matching canary's `EmitReturn(name, 0)` branch.
+    pub fn register_void_export(
+        &mut self,
+        module: ModuleId,
+        ordinal: u32,
+        name: &'static str,
+        func: KernelExportFn,
+    ) {
+        self.exports.insert((module, ordinal), (name, func));
+        self.void_exports.insert((module, ordinal));
+    }
+
     /// AUDIT-038 — install a host directory as the backing store for the
     /// `cache:` mount. The directory is unconditionally cleared (and then
     /// re-created) on entry so two consecutive runs see byte-identical
@@ -514,7 +563,49 @@ impl KernelState {
             metrics::counter!("kernel.calls", "name" => name).increment(1);
             tracing::trace!(target: "probe_calls", "hw={} call={} r3={:#x} r4={:#x} r5={:#x} lr={:#x}",
                 r.hw_id, name, ctx.gpr[3], ctx.gpr[4], ctx.gpr[5], ctx.lr);
+            // Phase A event log — see crates/xenia-kernel/src/event_log.rs.
+            // Hot path: `is_enabled` is a relaxed atomic-bool load.
+            let phase_a_on = crate::event_log::is_enabled();
+            let (phase_a_tid, phase_a_cycle) = if phase_a_on {
+                let tid = self.scheduler.thread(r).tid;
+                let cycle = ctx.cycle_count;
+                (tid, cycle)
+            } else {
+                (0u32, 0u64)
+            };
+            if phase_a_on {
+                let module_name = match module {
+                    ModuleId::Xboxkrnl => "xboxkrnl.exe",
+                    ModuleId::Xam => "xam.xex",
+                    ModuleId::Xbdm => "xbdm.xex",
+                };
+                crate::event_log::emit_import_call(
+                    phase_a_tid,
+                    phase_a_cycle,
+                    module_name,
+                    ordinal as u16,
+                    name,
+                );
+                crate::event_log::emit_kernel_call(phase_a_tid, phase_a_cycle, name);
+            }
+            let is_void = self.void_exports.contains(&(module, ordinal));
             func(&mut ctx, mem, self);
+            if phase_a_on {
+                // Mirror canary's `if constexpr (std::is_void<R>::value)`
+                // trampoline branch: void exports emit literal 0; non-void
+                // emit post-call gpr[3]. Without this, void exports that
+                // take a pointer arg (e.g. `KeQuerySystemTime`) would
+                // report ours=r3=arg_ptr vs canary=0 — a Phase A diff
+                // that is purely an emitter-framing asymmetry, not an
+                // engine semantic divergence.
+                let return_value = if is_void { 0 } else { ctx.gpr[3] };
+                crate::event_log::emit_kernel_return(
+                    phase_a_tid,
+                    ctx.cycle_count,
+                    name,
+                    return_value,
+                );
+            }
             true
         } else {
             metrics::counter!("kernel.unimplemented").increment(1);
103
audit-runs/phase-c1-keQuerySystemTime/investigation.md
Normal file
@@ -0,0 +1,103 @@
# Phase C+1: KeQuerySystemTime divergence investigation

## Step 1: Locate KeQuerySystemTime

**Canary** (`xenia-canary/src/xenia/kernel/xboxkrnl/xboxkrnl_threading.cc:459-473`):

```cpp
void KeQuerySystemTime_entry(lpqword_t time_ptr, const ppc_context_t& ctx) {
  if (time_ptr) {
    uint32_t ts_bundle = ctx->kernel_state->GetKeTimestampBundle();
    uint64_t time = Clock::QueryGuestSystemTime();
    xe::store_and_swap<uint64_t>(
        &ctx->TranslateVirtual<X_TIME_STAMP_BUNDLE*>(ts_bundle)->system_time,
        time);
    *time_ptr = time;
  }
}
DECLARE_XBOXKRNL_EXPORT1(KeQuerySystemTime, kThreading, kImplemented);
```

**Signature**: `void`, takes `lpqword_t time_ptr` OUT-param.

**Ours** (`xenia-rs/crates/xenia-kernel/src/exports.rs:489-496`):

```rust
fn ke_query_system_time(ctx: &mut PpcContext, mem: &GuestMemory, _state: &mut KernelState) {
    let time_ptr = ctx.gpr[3] as u32;
    if time_ptr != 0 {
        let fake_time: u64 = 132_500_000_000_000_000; // ~2021 FILETIME
        mem.write_u32(time_ptr, (fake_time >> 32) as u32);
        mem.write_u32(time_ptr + 4, fake_time as u32);
    }
}
```

**Signature**: `fn(ctx, mem, state)`. All ours exports are uniform; ours has no static type-system distinction between void and value-returning kernel exports.
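
As a hedged illustration of the two `write_u32` calls in ours's implementation: assuming `GuestMemory::write_u32` stores big-endian (the usual guest PowerPC convention, not confirmed by this excerpt), writing the high word at `time_ptr` and the low word at `time_ptr + 4` produces the same bytes as one big-endian 64-bit store. The helpers `split_be_u64` and `store_as_two_be_words` are hypothetical and exist only for this sketch.

```rust
/// Hypothetical helper: split a u64 into (high, low) 32-bit words, the
/// order in which `ke_query_system_time` writes them.
fn split_be_u64(value: u64) -> (u32, u32) {
    ((value >> 32) as u32, value as u32)
}

/// Model of the two 32-bit stores, ASSUMING each store is big-endian.
fn store_as_two_be_words(value: u64) -> [u8; 8] {
    let (hi, lo) = split_be_u64(value);
    let mut out = [0u8; 8];
    out[..4].copy_from_slice(&hi.to_be_bytes());
    out[4..].copy_from_slice(&lo.to_be_bytes());
    out
}

fn main() {
    // Same constant as `fake_time` in ours's export.
    let fake_time: u64 = 132_500_000_000_000_000;
    let bytes = store_as_two_be_words(fake_time);
    // Two big-endian word stores equal one big-endian 64-bit store.
    assert_eq!(bytes, fake_time.to_be_bytes());
    println!("{:02x?}", bytes);
}
```

If the assumption about `write_u32` holds, this layout corresponds to what canary's `*time_ptr = time` produces through its byte-swapping `lpqword_t` wrapper, which is why Step 3 treats the OUT-pointer write as equivalent on both sides.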

## Step 2: Re-read the Phase A divergent event

From `audit-runs/phase-a-diff-harness/diff-report.md`:

```
canary [113] kernel.return KeQuerySystemTime
  return_value: 0, status: "0x00000000"

ours [113] kernel.return KeQuerySystemTime
  return_value: 1880095840, status: "0x700ffc60"
```

**Key observation**: `1880095840 == 0x700FFC60`. That is a stack-address-shaped value matching ours's `stack_cursor: AtomicU32::new(0x7100_0000)` region. It is the input arg pointer `time_ptr` left in r3.
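
The hex identity can be checked mechanically. A minimal sketch, assuming (from the events quoted in the diff reports, not from the emitter source) that `status` is the low 32 bits of `return_value` rendered as zero-padded lowercase hex; `status_of` is a hypothetical helper, not part of either engine:

```rust
/// Hypothetical helper: render the low 32 bits of a return value the way
/// the `status` field appears in the captured events.
fn status_of(return_value: u64) -> String {
    format!("0x{:08x}", return_value as u32)
}

fn main() {
    // (return_value, status) pairs copied from the Phase A events.
    let pairs: [(u64, &str); 4] = [
        (1880095840, "0x700ffc60"), // KeQuerySystemTime, ours (pre-fix)
        (1896873464, "0x710ffdf8"), // RtlInitAnsiString, ours
        (258, "0x00000102"),        // KeWaitForSingleObject, canary
        (0, "0x00000000"),          // canary's literal 0 for void exports
    ];
    for (value, status) in pairs {
        assert_eq!(status_of(value), status);
    }
    println!("all status fields match");
}
```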

## Step 3: Canonical semantic

The divergence is NOT in the engine's implementation of KeQuerySystemTime. Both engines write the system time to the OUT pointer; both engines decline to put anything meaningful in r3 (canary because the C++ fn is declared `void`, so the trampoline never calls `result.Store(ppc_context)`; ours because `ke_query_system_time` simply doesn't touch `ctx.gpr[3]`).

The divergence is in the **Phase A emitter**:

**Canary trampoline** (`xenia-canary/src/xenia/kernel/util/shim_utils.h:603-622`):

```cpp
if constexpr (std::is_void<R>::value) {
  KernelTrampoline(fn, ...);
  if (phase_a_on) {
    phase_a_bridge::EmitReturn(export_entry->name, 0);  // LITERAL 0 for void
  }
} else {
  auto result = KernelTrampoline(fn, ...);
  result.Store(ppc_context);
  if (phase_a_on) {
    phase_a_bridge::EmitReturn(
        export_entry->name,
        static_cast<uint64_t>(ppc_context->r[3]));  // r3 for non-void
  }
}
```

**Ours emitter** (`xenia-rs/crates/xenia-kernel/src/state.rs:563-571`, pre-fix):

```rust
func(&mut ctx, mem, self);
if phase_a_on {
    crate::event_log::emit_kernel_return(
        phase_a_tid, ctx.cycle_count, name,
        ctx.gpr[3], // ALWAYS r3
    );
}
```

→ Ours had no void-vs-non-void branch. For void exports that take a pointer arg (like `KeQuerySystemTime`), ours emitted `r3 = input arg pointer untouched`, while canary emitted literal `0`. Pure framing asymmetry; not an engine bug.

## Resolution

**Path B** (schema annotation), implemented as additive `register_void_export` API in `KernelState`. Only `KeQuerySystemTime` is marked for this session per "do not widen scope". Other void exports surfaced in the diff report (e.g. `RtlInitAnsiString`) are out-of-scope but trivially addressable in future sessions by extending the registry annotations.
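
The registry annotation can be modelled in isolation. The following is a simplified sketch, not the code in `state.rs`: `Ctx`, `Registry`, `call_and_report`, the `ReturnsOne` export, and its ordinal `0x01` are all hypothetical stand-ins, and exports are keyed by ordinal alone rather than by `(ModuleId, u32)`.

```rust
use std::collections::{HashMap, HashSet};

/// Simplified stand-in for the guest register file; only r3 matters here.
struct Ctx {
    gpr3: u64,
}

type ExportFn = fn(&mut Ctx);

/// Toy registry mirroring the shape of the fix: a side set marks the
/// exports whose canary signature is `void`.
#[derive(Default)]
struct Registry {
    exports: HashMap<u32, (&'static str, ExportFn)>,
    void_exports: HashSet<u32>,
}

impl Registry {
    fn register_export(&mut self, ordinal: u32, name: &'static str, func: ExportFn) {
        self.exports.insert(ordinal, (name, func));
    }

    fn register_void_export(&mut self, ordinal: u32, name: &'static str, func: ExportFn) {
        self.exports.insert(ordinal, (name, func));
        self.void_exports.insert(ordinal);
    }

    /// Runs the export and returns the value the event log would report:
    /// literal 0 for void exports, post-call r3 otherwise.
    fn call_and_report(&self, ordinal: u32, ctx: &mut Ctx) -> Option<u64> {
        let &(_name, func) = self.exports.get(&ordinal)?;
        func(&mut *ctx);
        Some(if self.void_exports.contains(&ordinal) { 0 } else { ctx.gpr3 })
    }
}

/// Void-style export: leaves r3 (the input pointer) untouched.
fn query_time(_ctx: &mut Ctx) {}

/// Value-returning export (hypothetical): overwrites r3 with its result.
fn returns_one(ctx: &mut Ctx) {
    ctx.gpr3 = 1;
}

fn main() {
    let mut reg = Registry::default();
    reg.register_void_export(0x84, "KeQuerySystemTime", query_time);
    reg.register_export(0x01, "ReturnsOne", returns_one);

    let mut ctx = Ctx { gpr3: 0x700F_FC60 };
    assert_eq!(reg.call_and_report(0x84, &mut ctx), Some(0));
    assert_eq!(ctx.gpr3, 0x700F_FC60); // r3 itself is not modified
    assert_eq!(reg.call_and_report(0x01, &mut ctx), Some(1));
}
```

The point of the side set is that the guest-visible register state is unchanged; only the reported `return_value` differs, which is why the change is inert when the event log is off.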

## Other divergences (catalog only; do NOT fix this session)

The diff report shows two more divergences in the same `return_value` field. Only the first is a void-emitter issue that the same registry pattern will trivially resolve:

- `RtlInitAnsiString` (idx=2): void in canary (`xboxkrnl_rtl.cc:217`).
- `KeRaiseIrqlToDpcLevel` (idx=11): **NOT void**. Canary's `dword_result_t` returns `old_irql` (typically 2); ours has it stubbed to `stub_return_zero`. THIS one is a real engine bug, not an emitter issue.

And `KeSetEvent` at idx=5: canary returns 1, ours returns 0. This is also a real engine divergence (ours's KeSetEvent return value bug).

Phase C+1 scope is **only** KeQuerySystemTime per the brief. The above are listed for the next session.
64
audit-runs/phase-c1-keQuerySystemTime/re-validation.md
Normal file
@@ -0,0 +1,64 @@
# Phase C+1: re-validation

## Gate 1: Determinism (cvar-OFF)

3 fresh runs of `check -n 50000000 --stable-digest`:

| run | digest md5 |
|-----|------------|
| 1 | 608d8e8d293250698207a7d8fc0c18df |
| 2 | 608d8e8d293250698207a7d8fc0c18df |
| 3 | 608d8e8d293250698207a7d8fc0c18df |
| Phase C baseline | 608d8e8d293250698207a7d8fc0c18df |

**Result**: ✅ byte-identical. Fix is cvar-OFF inert (digest captures kernel-call counts, packets, draws, swaps, RTs, shaders; none of these change with the new field added).

## Gate 2: Phase B image_canonical_sha256

Not re-snapshotted. Inferred OK by Gate 1: the cvar-OFF digest (which encompasses imports, kernel-call counts, GPU draws/swaps, etc.) is byte-identical to the Phase C baseline. The fix touches only `state.rs::call_export`'s Phase A emit branch (cvar-gated) and a HashSet population in `KernelState::new`; image-loading code is untouched.

## Gate 3: Phase A matched-prefix extension (THE KEY METRIC)

Captured `audit-runs/phase-c1-keQuerySystemTime/ours.jsonl` with `--phase-a-event-log` and diffed against existing `phase-c-first-divergence/phase-a/canary.jsonl`.

| chain | Phase A pre-fix matched | post-fix matched | Δ |
|-------|------------------------|------------------|----|
| canary tid=6 → ours tid=1 (main) | 113 | **161** | +48 |
| canary tid=4 → ours tid=11 | 5 | 5 | 0 |
| canary tid=7 → ours tid=2 | 2 | 2 | 0 |
| canary tid=12 → ours tid=7 | 2 | 2 | 0 |
| canary tid=14 → ours tid=9 | 11 | 11 | 0 |
| canary tid=15 → ours tid=10 | — | — | (no divergence) |

**Main thread matched prefix grew from 113 to 161. Gate 3 ✅.**

Specifically: ours's idx=113 now emits `{name: "KeQuerySystemTime", return_value: 0, status: "0x00000000"}`, matching canary byte-for-byte. The diff tool advanced past 113 through 160 inclusive and stopped on idx=161 (`MmAllocatePhysicalMemoryEx`), which is the next divergence (out of scope this session).
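
The matched-prefix metric can be sketched as follows. This is an illustration under assumptions, not the diff harness itself: the `Event` struct is a reduced, hypothetical model of the schema, and the comparison key (kind, name, return value) is assumed rather than taken from the tool's source.

```rust
/// Minimal event model for this sketch; the real schema has more fields.
#[derive(Debug, Clone, PartialEq)]
struct Event {
    kind: &'static str,
    name: &'static str,
    return_value: Option<u64>,
}

/// Returns (matched_prefix_len, first_divergence_index).
/// A divergence is reported only where both streams have an event and the
/// events differ; running off the end of the shorter stream is not one.
fn matched_prefix(canary: &[Event], ours: &[Event]) -> (usize, Option<usize>) {
    let matched = canary
        .iter()
        .zip(ours.iter())
        .take_while(|(c, o)| c == o)
        .count();
    let compared = canary.len().min(ours.len());
    let first_divergence = if matched < compared { Some(matched) } else { None };
    (matched, first_divergence)
}

fn main() {
    let call = Event { kind: "kernel.call", name: "KeSetEvent", return_value: None };
    let ret = |v: u64| Event { kind: "kernel.return", name: "KeSetEvent", return_value: Some(v) };

    // Streams agree on the call, then disagree on the return value.
    let canary = vec![call.clone(), ret(1), call.clone()];
    let ours = vec![call.clone(), ret(0), call.clone()];
    assert_eq!(matched_prefix(&canary, &ours), (1, Some(1)));

    // Shorter stream fully matched: no divergence, as for canary tid=15.
    assert_eq!(matched_prefix(&canary, &canary[..2]), (2, None));
}
```

Under this definition the matched count and the first-divergence index coincide whenever a divergence exists, which is the pattern in the Summary table of the diff report (for example matched=161, first_divergence_at=161 on the main thread).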

## Gate 4: Build

Ours builds clean. Canary was not rebuilt (no canary code changed), so its existing build stands; ours was rebuilt via `cargo build --release -p xenia-app`:

```
   Compiling xenia-kernel v0.1.0
    Finished `release` profile [optimized] target(s) in 10.33s
```

One pre-existing dead-code warning (`walk_committed_regions` in `phase_b_snapshot.rs`); not introduced by this fix.

## Gate 5: Phase A determinism

Ours's `event_log` unit tests pass:

```
test event_log::tests::fnv1a_known_vector ... ok
test event_log::tests::semantic_id_stable ... ok
test result: ok. 2 passed; 0 failed; 0 ignored; 0 measured; 127 filtered out
```

The KeQuerySystemTime event for idx=113 emits `return_value: 0` deterministically, verified by grepping the captured jsonl. The emitter change is a pure conditional select; no new entropy source.

## Summary

All 5 gates pass (Gate 2 by inference from Gate 1, not by a fresh snapshot). The fix is symmetric: canary already emits `0` for void exports via its `if constexpr (std::is_void<R>::value)` trampoline branch; ours now matches that semantic for exports registered with `register_void_export`. Only `KeQuerySystemTime` is so registered in this session per "do not widen scope".

Next divergence: **MmAllocatePhysicalMemoryEx @ tid_event_idx=161** (canary returns `0xBC220000`, ours returns `0x40105000`), a host-allocator address-space divergence (AUDIT-043 class ε). This is the Phase C+2 target.
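
One detail worth checking when reading the raw events: the diff report records canary's `return_value` as `18446744072570929152`, while its `status` is `0xbc220000`. That decimal is exactly `0xBC220000` sign-extended from 32 to 64 bits, so the two fields agree. The sketch below verifies the arithmetic only; why the upper bits are set on canary's side is not established by these notes, and `sign_extend_32` is a hypothetical helper.

```rust
/// Sign-extend a 32-bit guest value to 64 bits.
fn sign_extend_32(value: u32) -> u64 {
    value as i32 as i64 as u64
}

fn main() {
    // canary: raw 64-bit return_value from the diff report.
    let canary_raw: u64 = 18446744072570929152;
    assert_eq!(sign_extend_32(0xBC22_0000), canary_raw);
    assert_eq!(canary_raw as u32, 0xBC22_0000);

    // ours: bit 31 is clear, so sign extension leaves the value unchanged.
    let ours_raw: u64 = 1074810880;
    assert_eq!(ours_raw, 0x4010_5000);
    assert_eq!(sign_extend_32(0x4010_5000), ours_raw);
}
```

A Phase C+2 comparison may therefore want to compare the low 32 bits of the two return values, so that the address divergence is not conflated with the width of the reported register.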