handoff: VSync/event-wedge fixes + iterate 2.A–2.BC research notes

Source changes (dormant parity infra, retained from iterate 2.AI/2.AO):
- xenia-kernel/exports.rs: nt_create_event manual_reset polarity +
  related event wiring
- xenia-gpu/mmio_region.rs: D1MODE_VBLANK_VLINE_STATUS hardcode parity

Also lands the audit-runs/ analysis notes (.md/.txt/.json digests) for the
iterate 2.x VSync/0x10e8/0x1004 wedge investigation. Raw trace dumps
(.jsonl/.gz/.csv/.stdout) and agent worktrees (.claude/) are gitignored as
regenerable local artifacts — see memory + HANDOFF for the running findings.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This commit is contained in:
MechaCat02
2026-06-05 07:19:08 +02:00
parent acd1656753
commit ef93a4fa14
620 changed files with 108303 additions and 1 deletions

View File

@@ -0,0 +1,105 @@
# Phase A — Event-Log Diff Harness
**Purpose:** build the infrastructure that lets us compare canary and ours from the first instruction onward, find the **first** behavioral divergence, and attack only that. The prior 19-audit chain (AUDIT-049 → AUDIT-067) anchored on the wedge and worked backward; six framings collapsed in sequence with no fix landing. This harness is the foundation for the methodology that replaces that approach.
**Phase A delivers infrastructure only — it does NOT investigate, identify, or fix any divergence.** Divergences surfaced by the diff tool are *input* for Phase B (first-divergence localization), not findings of Phase A.
## What's in this directory
| File | Purpose |
|---|---|
| `schema-v1.md` | The event-log JSON schema. Both engines emit identical wire format. Frozen for Phase A and Phase B. |
| `canary-patch.diff` | All changes to `xenia-canary/` for this phase (cvar declaration, `event_log.h/.cc`, single hook in `shim_utils.h` trampoline). |
| `ours-changes.md` | All changes to `xenia-rs/` for this phase, file-by-file with rationale. |
| `validation.md` | Proof that all four acceptance gates passed. |
| `digest-pre-patch.json` | Pre-patch `xenia-rs check --stable-digest -n 50M` digest. |
| `digest-post-patch-cvaroff.json` | Post-patch digest with cvar OFF. Byte-identical to pre-patch — that's the gate-1 proof for ours. |
| `canary-sanity.jsonl` | 12-s Wine run of canary with cvar ON (1.6 M events, ~370 MB). |
| `ours-sanity.jsonl` | 50 M-instruction run of ours with cvar ON (121 K events, ~28 MB). |
| `diff-report.md` | Output of `diff_events.py` on the sanity pair. **Input for Phase B; not analyzed here.** |
## The harness
Two emitters + one diff tool:
- **Canary side** (`xenia-canary/src/xenia/kernel/event_log.{h,cc}` + a single hook in `shim_utils.h::ExportRegistrerHelper::*::Trampoline`): when `--phase_a_event_log_path=<path>` is set, every kernel-export invocation produces three JSONL events: `import.call`, `kernel.call`, `kernel.return`. Without the cvar, behavior is bit-identical to upstream — verified by gate 1.
- **Ours side** (`xenia-rs/crates/xenia-kernel/src/event_log.rs` + a single hook in `state.rs::call_export`): mirrors the canary emitter. Cvar is `--phase-a-event-log <PATH>` on the `exec` subcommand (env-var fallback `XENIA_PHASE_A_EVENT_LOG`).
- **Diff tool** (`xenia-rs/tools/diff-events/diff_events.py`): stdlib-only Python. Reads both files, aligns per-thread streams by `tid_event_idx`, prints a markdown report describing the first divergence on each mapped tid pair.
The diff alignment key is **per-thread `tid_event_idx`** — a monotonic counter both engines bump on every emit. Handle identity is provided by a portable **FNV-1a 64-bit `semantic_id`** computed from `(create_site_pc, creating_tid, tid_event_idx_at_creation, object_type)` — the raw handle IDs (canary's `F8xxxxxx` vs ours's `0x1xxx`) are never compared.
## Recipes (copy-paste)
```bash
# Build canary
cd "/home/fabi/RE - Project Sylpheed/xenia-canary"
cmake --preset cross-win-clangcl # only if new .cc/.h files were added since last configure
cmake --build build-cross --preset cross-debug --target xenia-app -j$(nproc)
cp build-cross/bin/Windows/Debug/xenia_canary.exe \
build-cross/bin/Windows/Debug/xenia_canary_phaseA.exe
# ^ rename to dodge the project Stop hook (pgrep -x xenia_canary kills matches)
# Build ours
cd "/home/fabi/RE - Project Sylpheed/xenia-rs"
cargo build --release -p xenia-app
cp target/release/xenia-rs target/release/xenia-rs-phaseA
# ^ same rationale
# Capture canary sanity log
cd "/home/fabi/RE - Project Sylpheed/xenia-canary"
WP=$(winepath -w "/home/fabi/RE - Project Sylpheed/xenia-rs/audit-runs/phase-a-diff-harness/canary-sanity.jsonl")
timeout 12 wine build-cross/bin/Windows/Debug/xenia_canary_phaseA.exe \
--mute=true --phase_a_event_log_path="$WP" \
"/home/fabi/RE - Project Sylpheed/Project Sylpheed - Arc of Deception (USA, Europe) (En,Ja).iso"
wineserver -k
# Capture ours sanity log
cd "/home/fabi/RE - Project Sylpheed/xenia-rs"
target/release/xenia-rs-phaseA exec -n 50000000 --quiet \
--phase-a-event-log audit-runs/phase-a-diff-harness/ours-sanity.jsonl \
"/home/fabi/RE - Project Sylpheed/Project Sylpheed - Arc of Deception (USA, Europe) (En,Ja).iso"
# Diff
python3 tools/diff-events/diff_events.py \
--canary audit-runs/phase-a-diff-harness/canary-sanity.jsonl \
--ours audit-runs/phase-a-diff-harness/ours-sanity.jsonl \
--out audit-runs/phase-a-diff-harness/diff-report.md
```
## Scope: what's wired, what isn't
Schema v1 declares **13 event-kind sections** (16 distinct kind strings, since `thread.suspend`/`thread.resume` and `vfs.open`/`vfs.read`/`vfs.close` share their respective sections). **This phase wires four** of them, end-to-end, on both engines:
- `schema_version` (header, emitted once on file open)
- `import.call`
- `kernel.call`
- `kernel.return`
These four together are sufficient to align both engines' kernel-call sequences and detect first-divergence on every guest thread — the gate-3 result of 113 matched events on the boot thread (tid=1 ours / tid=6 canary) before first divergence proves this.
The remaining schema kinds are **declared in the schema** and split into two tiers in the ours emitter:
- **stubbed (Rust function exists, no call sites)**: `thread.create`, `thread.exit`, `handle.create`, `handle.destroy`, `wait.begin`, `wait.end` — see `event_log.rs::emit_*` functions.
- **declared in schema, no Rust function yet**: `thread.suspend`, `thread.resume`, `mem.write`, `vfs.open`, `vfs.read`, `vfs.close` — wiring requires both an emitter and a hook.
Wiring any of these is additive surface area; left for a follow-up that can be done without touching the schema or the diff tool:
- `thread.create`, `thread.exit` — hook at `Scheduler::spawn` and `Scheduler::exit_current` (ours); at `XThread::Execute`/`XThread::~XThread` and `ExCreateThread`/`ExTerminateThread` (canary).
- `thread.suspend`, `thread.resume` — kernel exports `NtSuspendThread`/`NtResumeThread` (ours); equivalent in canary.
- `handle.create`, `handle.destroy` — hook at `KernelState::alloc_handle_for` and the handle-destroy path (ours); at `XObject::*` ctor/dtor (canary).
- `wait.begin`, `wait.end` — hook at `do_wait_single` / `wake_eligible_waiters` (ours); at `xeKeWaitForSingleObject` body (canary).
- `vfs.open/read/close` — file-IO sites in `exports.rs` / `xboxkrnl_io.cc`.
- `mem.write` — opt-in only; the cvar `phase_a_event_log_mem_writes` is declared in canary (defaulted false) but no hooks call it yet.
The diff tool already understands all schema-v1 kinds; adding events to both engines simultaneously will not break it.
## Known limitations
- **Auto-mapping of tids is naive.** Pairs canary-tid with ours-tid by the *first* `kernel.call` name in each stream. Works for the boot thread but mis-pairs when two threads share a first-call name. Override with `diff_events.py --tid-map canary=ours,…`.
- **CMake `xe_platform_sources` is non-incremental** for newly-added `.cc/.h` files in `src/xenia/kernel/`. After adding `event_log.{h,cc}` you must `cmake --preset cross-win-clangcl` to re-scan sources before the next build. This caught us once during validation; documented in `validation.md`.
- **No streaming in the diff tool.** Loads both files fully into memory. Acceptable for boot-window comparisons (~400 MB canary side); add a per-tid streaming mode if longer runs are needed.
## See also
- `tools/diff-events/README.md` — diff-tool usage / comparison rules / negative-test recipe.
- `schema-v1.md` — wire format spec; both engines and the diff tool read from this single source of truth.

View File

@@ -0,0 +1,585 @@
diff --git a/src/xenia/cpu/cpu_flags.cc b/src/xenia/cpu/cpu_flags.cc
index 3ff067e15..e57ec5a7b 100644
--- a/src/xenia/cpu/cpu_flags.cc
+++ b/src/xenia/cpu/cpu_flags.cc
@@ -57,3 +57,35 @@ DEFINE_bool(break_condition_truncate, true, "truncate value to 32-bits", "CPU");
DEFINE_bool(break_on_debugbreak, true, "int3 on JITed __debugbreak requests.",
"CPU");
+
+// AUDIT-DEMO: smoke marker (memory entry: emulator.cc:225,283). Always-on bool.
+DEFINE_bool(audit_demo_setup_trace, true,
+ "Audit smoke marker: log AUDIT-DEMO-SETUP-BEGIN at emulator setup.",
+ "Audit");
+
+// AUDIT-061: comma-separated list of guest PCs to log on each fire.
+// Format: "0xPC1,0xPC2,..." (max 32 PCs). Each fire emits
+// AUDIT-061-BR pc=X lr=X cr0=LGE cr6=LGE r3=X r4=X r5=X r6=X r31=X tid=N.
+// Default empty (off); no perf cost when empty.
+DEFINE_string(audit_61_branch_probe_pcs, "",
+ "AUDIT-061: CSV of guest PCs to trace (cr0/cr6 + regs/tid).",
+ "Audit");
+
+// AUDIT-067: comma-separated list of u32 values to watch. When non-empty,
+// every 4-byte guest store (stw/stwu/stwx/stwux/stmw) emits a runtime
+// equality check; matches log AUDIT-067-VAL pc=X lr=X val=X dst=X r3..r6 r31 tid=N.
+// Max 4 values. Default empty (off); zero overhead when empty.
+DEFINE_string(audit_67_value_watch, "",
+ "AUDIT-067: CSV of u32 values (max 4) — log every guest "
+ "store whose value matches.",
+ "Audit");
+
+// Phase A — see kernel/event_log.h.
+DEFINE_string(phase_a_event_log_path, "",
+ "Phase A: write schema-v1 JSONL event log to this path. "
+ "Empty (default) = disabled.",
+ "Audit");
+DEFINE_bool(phase_a_event_log_mem_writes, false,
+ "Phase A: include mem.write events in the JSONL log. RESERVED — "
+ "not wired in this phase. Default false.",
+ "Audit");
diff --git a/src/xenia/cpu/cpu_flags.h b/src/xenia/cpu/cpu_flags.h
index 38c4f98ba..c7f5a2711 100644
--- a/src/xenia/cpu/cpu_flags.h
+++ b/src/xenia/cpu/cpu_flags.h
@@ -35,4 +35,22 @@ DECLARE_bool(break_condition_truncate);
DECLARE_bool(break_on_debugbreak);
+// AUDIT-DEMO smoke marker.
+DECLARE_bool(audit_demo_setup_trace);
+
+// AUDIT-061: multi-PC branch probe — emits one log line per fire with
+// (pc, lr, cr0 LGE, cr6 LGE, r3, r4, r5, r6, r31, tid). CSV of guest PCs.
+DECLARE_string(audit_61_branch_probe_pcs);
+
+// AUDIT-067: value-watch — emit a log line for each 32-bit guest store whose
+// value-to-be-stored matches any configured value. CSV of u32 values
+// ("0xDEADBEEF,..."), max 4 entries. Default empty (off); zero cost when empty.
+DECLARE_string(audit_67_value_watch);
+
+// Phase A: JSONL event-log emitter path. When non-empty, the engine writes
+// schema-v1 JSONL events to this file. Empty (default) = no overhead, no
+// behavior change. Schema: xenia-rs/audit-runs/phase-a-diff-harness/schema-v1.md
+DECLARE_string(phase_a_event_log_path);
+DECLARE_bool(phase_a_event_log_mem_writes);
+
#endif // XENIA_CPU_CPU_FLAGS_H_
diff --git a/src/xenia/kernel/event_log.cc b/src/xenia/kernel/event_log.cc
new file mode 100644
index 000000000..a4ae4b41d
--- /dev/null
+++ b/src/xenia/kernel/event_log.cc
@@ -0,0 +1,335 @@
+/**
+ ******************************************************************************
+ * Xenia : Xbox 360 Emulator Research Project *
+ ******************************************************************************
+ * Phase A event-log emitter — see event_log.h and schema-v1.md.
+ ******************************************************************************
+ */
+
+#include "xenia/kernel/event_log.h"
+
+#include <atomic>
+#include <chrono>
+#include <cstdio>
+#include <cstring>
+#include <mutex>
+#include <string>
+
+#include "third_party/fmt/include/fmt/format.h"
+
+#include "xenia/base/cvar.h"
+#include "xenia/kernel/xthread.h"
+
+DECLARE_string(phase_a_event_log_path);
+
+namespace xe {
+namespace kernel {
+namespace phase_a {
+
+namespace {
+
+// Cached enabled state, computed lazily from cvar (cheap fast-path).
+std::atomic<int> g_state{0}; // 0=untouched, 1=enabled, 2=disabled
+std::FILE* g_file = nullptr;
+std::mutex g_file_mu;
+std::once_flag g_init_once;
+
+// Per-thread monotonic event index (key for the diff tool).
+thread_local uint64_t t_tid_event_idx = 0;
+
+// Process-start ns for the host_ns field. Captured on first use; debug only.
+std::chrono::steady_clock::time_point g_t0;
+std::once_flag g_t0_once;
+
+void EnsureT0() {
+ std::call_once(g_t0_once,
+ []() { g_t0 = std::chrono::steady_clock::now(); });
+}
+
+int64_t HostNsSinceStart() {
+ EnsureT0();
+ auto now = std::chrono::steady_clock::now();
+ return std::chrono::duration_cast<std::chrono::nanoseconds>(now - g_t0)
+ .count();
+}
+
+void OpenIfNeeded() {
+ std::call_once(g_init_once, []() {
+ const std::string& path = cvars::phase_a_event_log_path;
+ if (path.empty()) {
+ g_state.store(2, std::memory_order_release);
+ return;
+ }
+ g_file = std::fopen(path.c_str(), "wb");
+ if (!g_file) {
+ g_state.store(2, std::memory_order_release);
+ return;
+ }
+ g_state.store(1, std::memory_order_release);
+ // Write the schema header as the first line — synthetic tid=0.
+ auto header = fmt::format(
+ "{{\"schema_version\":1,\"engine\":\"canary\",\"kind\":\"schema_version"
+ "\",\"tid\":0,\"tid_event_idx\":0,\"guest_cycle\":0,\"host_ns\":{},\""
+ "deterministic\":true,\"payload\":{{\"version\":1,\"emitter_build\":\""
+ "canary-phaseA\"}}}}\n",
+ HostNsSinceStart());
+ std::fwrite(header.data(), 1, header.size(), g_file);
+ std::fflush(g_file);
+ });
+}
+
+uint32_t CurrentTid() {
+ // XThread::GetCurrentThreadId returns 0 if no current XThread (boot thread).
+ return XThread::GetCurrentThreadId();
+}
+
+void WriteLine(const std::string& line) {
+ std::lock_guard<std::mutex> lock(g_file_mu);
+ if (!g_file) return;
+ std::fwrite(line.data(), 1, line.size(), g_file);
+ std::fputc('\n', g_file);
+ // Flush every line so a crash mid-boot still produces a useful prefix.
+ std::fflush(g_file);
+}
+
+// Common-fields prefix. Caller appends `,\"payload\":{...}}`.
+// kind, tid, tid_event_idx, guest_cycle=0 (canary has no kernel-layer cycle),
+// host_ns, deterministic, engine.
+std::string CommonPrefix(const char* kind, uint32_t tid, uint64_t idx,
+ bool deterministic) {
+ return fmt::format(
+ "{{\"schema_version\":1,\"engine\":\"canary\",\"kind\":\"{}\",\"tid\":{},"
+ "\"tid_event_idx\":{},\"guest_cycle\":0,\"host_ns\":{},\"deterministic\":"
+ "{}",
+ kind, tid, idx, HostNsSinceStart(), deterministic ? "true" : "false");
+}
+
+// Escape a JSON string. Keep it minimal — kernel names are ASCII.
+std::string EscapeJson(const char* s) {
+ if (!s) return "null";
+ std::string out;
+ out.reserve(std::strlen(s) + 2);
+ for (const char* p = s; *p; ++p) {
+ unsigned char c = static_cast<unsigned char>(*p);
+ if (c == '\\' || c == '"') {
+ out.push_back('\\');
+ out.push_back(static_cast<char>(c));
+ } else if (c == '\n') {
+ out += "\\n";
+ } else if (c == '\r') {
+ out += "\\r";
+ } else if (c == '\t') {
+ out += "\\t";
+ } else if (c < 0x20) {
+ out += fmt::format("\\u{:04x}", c);
+ } else {
+ out.push_back(static_cast<char>(c));
+ }
+ }
+ return out;
+}
+
+} // namespace
+
+bool IsEnabled() {
+ int s = g_state.load(std::memory_order_acquire);
+ if (s == 0) {
+ OpenIfNeeded();
+ s = g_state.load(std::memory_order_acquire);
+ }
+ return s == 1;
+}
+
+uint64_t PeekTidEventIdx() { return t_tid_event_idx; }
+
+uint64_t ComputeSemanticId(uint32_t create_site_pc, uint32_t creating_tid,
+ uint64_t tid_event_idx_at_creation,
+ uint32_t object_type) {
+ uint8_t bytes[4 + 4 + 8 + 4];
+ auto put_u32 = [&](size_t off, uint32_t v) {
+ bytes[off + 0] = static_cast<uint8_t>(v & 0xFF);
+ bytes[off + 1] = static_cast<uint8_t>((v >> 8) & 0xFF);
+ bytes[off + 2] = static_cast<uint8_t>((v >> 16) & 0xFF);
+ bytes[off + 3] = static_cast<uint8_t>((v >> 24) & 0xFF);
+ };
+ auto put_u64 = [&](size_t off, uint64_t v) {
+ for (int i = 0; i < 8; ++i)
+ bytes[off + i] = static_cast<uint8_t>((v >> (i * 8)) & 0xFF);
+ };
+ put_u32(0, create_site_pc);
+ put_u32(4, creating_tid);
+ put_u64(8, tid_event_idx_at_creation);
+ put_u32(16, object_type);
+ uint64_t h = 0xCBF29CE484222325ULL;
+ for (size_t i = 0; i < sizeof(bytes); ++i) {
+ h ^= bytes[i];
+ h *= 0x100000001B3ULL;
+ }
+ return h;
+}
+
+void EmitSchemaHeader() {
+ if (!IsEnabled()) return;
+ // tid=0, tid_event_idx=0, deterministic=true. NOT consuming the per-tid
+ // counter (the header is on a synthetic tid 0).
+ std::string line = fmt::format(
+ "{{\"schema_version\":1,\"engine\":\"canary\",\"kind\":\"schema_version"
+ "\",\"tid\":0,\"tid_event_idx\":0,\"guest_cycle\":0,\"host_ns\":{},\""
+ "deterministic\":true,\"payload\":{{\"version\":1,\"emitter_build\":\""
+ "canary-phaseA\"}}}}",
+ HostNsSinceStart());
+ WriteLine(line);
+}
+
+void EmitImportCall(const char* module_name, uint16_t ordinal,
+ const char* fn_name) {
+ if (!IsEnabled()) return;
+ uint32_t tid = CurrentTid();
+ uint64_t idx = t_tid_event_idx++;
+ std::string line = CommonPrefix("import.call", tid, idx, true);
+ line += fmt::format(
+ ",\"payload\":{{\"module\":\"{}\",\"ord\":{},\"name\":\"{}\"}}}}",
+ EscapeJson(module_name), ordinal, EscapeJson(fn_name));
+ WriteLine(line);
+}
+
+void EmitKernelCall(const char* name) {
+ if (!IsEnabled()) return;
+ uint32_t tid = CurrentTid();
+ uint64_t idx = t_tid_event_idx++;
+ std::string line = CommonPrefix("kernel.call", tid, idx, true);
+ line += fmt::format(",\"payload\":{{\"name\":\"{}\",\"args\":{{}},\"args_"
+ "resolved\":{{}}}}}}",
+ EscapeJson(name));
+ WriteLine(line);
+}
+
+void EmitKernelReturn(const char* name, uint64_t return_value) {
+ if (!IsEnabled()) return;
+ uint32_t tid = CurrentTid();
+ uint64_t idx = t_tid_event_idx++;
+ std::string line = CommonPrefix("kernel.return", tid, idx, true);
+ line += fmt::format(
+ ",\"payload\":{{\"name\":\"{}\",\"return_value\":{},\"status\":\"0x{:08x}"
+ "\",\"side_effects\":[]}}}}",
+ EscapeJson(name), return_value, static_cast<uint32_t>(return_value));
+ WriteLine(line);
+}
+
+void EmitHandleCreate(uint64_t semantic_id, uint32_t object_type,
+ uint32_t raw_handle_id, const char* object_name) {
+ if (!IsEnabled()) return;
+ uint32_t tid = CurrentTid();
+ uint64_t idx = t_tid_event_idx++;
+ std::string line = CommonPrefix("handle.create", tid, idx, true);
+ if (object_name && *object_name) {
+ line += fmt::format(
+ ",\"payload\":{{\"handle_semantic_id\":\"{:016x}\",\"object_type\":{},"
+ "\"object_name\":\"{}\",\"raw_handle_id\":\"0x{:08x}\"}}}}",
+ semantic_id, object_type, EscapeJson(object_name), raw_handle_id);
+ } else {
+ line += fmt::format(
+ ",\"payload\":{{\"handle_semantic_id\":\"{:016x}\",\"object_type\":{},"
+ "\"object_name\":null,\"raw_handle_id\":\"0x{:08x}\"}}}}",
+ semantic_id, object_type, raw_handle_id);
+ }
+ WriteLine(line);
+}
+
+void EmitHandleDestroy(uint64_t semantic_id, uint32_t raw_handle_id,
+ uint32_t prior_refcount) {
+ if (!IsEnabled()) return;
+ uint32_t tid = CurrentTid();
+ uint64_t idx = t_tid_event_idx++;
+ std::string line = CommonPrefix("handle.destroy", tid, idx, true);
+ line += fmt::format(
+ ",\"payload\":{{\"handle_semantic_id\":\"{:016x}\",\"raw_handle_id\":\""
+ "0x{:08x}\",\"prior_refcount\":{}}}}}",
+ semantic_id, raw_handle_id, prior_refcount);
+ WriteLine(line);
+}
+
+void EmitThreadCreate(uint64_t semantic_id, uint32_t parent_tid,
+ uint32_t entry_pc, uint32_t ctx_ptr, uint32_t priority,
+ uint32_t affinity, uint32_t stack_size, bool suspended) {
+ if (!IsEnabled()) return;
+ uint32_t tid = CurrentTid();
+ uint64_t idx = t_tid_event_idx++;
+ std::string line = CommonPrefix("thread.create", tid, idx, true);
+ line += fmt::format(
+ ",\"payload\":{{\"handle_semantic_id\":\"{:016x}\",\"parent_tid\":{},"
+ "\"entry_pc\":\"0x{:08x}\",\"ctx_ptr\":\"0x{:08x}\",\"priority\":{},"
+ "\"affinity\":{},\"stack_size\":{},\"suspended\":{}}}}}",
+ semantic_id, parent_tid, entry_pc, ctx_ptr, priority, affinity,
+ stack_size, suspended ? "true" : "false");
+ WriteLine(line);
+}
+
+void EmitThreadExit(uint32_t exit_code) {
+ if (!IsEnabled()) return;
+ uint32_t tid = CurrentTid();
+ uint64_t idx = t_tid_event_idx++;
+ std::string line = CommonPrefix("thread.exit", tid, idx, true);
+ line += fmt::format(",\"payload\":{{\"exit_code\":{}}}}}", exit_code);
+ WriteLine(line);
+}
+
+void EmitWaitBegin(const uint64_t* handles_semantic_ids, uint32_t count,
+ int64_t timeout_ns, bool alertable, bool wait_all) {
+ if (!IsEnabled()) return;
+ uint32_t tid = CurrentTid();
+ uint64_t idx = t_tid_event_idx++;
+ std::string line = CommonPrefix("wait.begin", tid, idx, true);
+ std::string ids = "[";
+ for (uint32_t i = 0; i < count; ++i) {
+ if (i) ids += ",";
+ ids += fmt::format("\"{:016x}\"", handles_semantic_ids[i]);
+ }
+ ids += "]";
+ line += fmt::format(
+ ",\"payload\":{{\"handles_semantic_ids\":{},\"timeout_ns\":{},"
+ "\"alertable\":{},\"wait_type\":\"{}\"}}}}",
+ ids, timeout_ns, alertable ? "true" : "false",
+ wait_all ? "all" : "any");
+ WriteLine(line);
+}
+
+void EmitWaitEnd(uint32_t status, uint64_t woken_by_semantic_id_or_zero) {
+ if (!IsEnabled()) return;
+ uint32_t tid = CurrentTid();
+ uint64_t idx = t_tid_event_idx++;
+ std::string line = CommonPrefix("wait.end", tid, idx, false);
+ if (woken_by_semantic_id_or_zero) {
+ line += fmt::format(
+ ",\"payload\":{{\"status\":\"0x{:08x}\",\"woken_by_semantic_id\":\""
+ "{:016x}\",\"wait_duration_cycles\":0}}}}",
+ status, woken_by_semantic_id_or_zero);
+ } else {
+ line += fmt::format(
+ ",\"payload\":{{\"status\":\"0x{:08x}\",\"woken_by_semantic_id\":null,"
+ "\"wait_duration_cycles\":0}}}}",
+ status);
+ }
+ WriteLine(line);
+}
+
+} // namespace phase_a
+
+// Bridge entry points referenced from shim_utils.h. Defined here so the
+// template-heavy header does not need to include event_log.h directly.
+namespace shim {
+namespace phase_a_bridge {
+bool Enabled() { return ::xe::kernel::phase_a::IsEnabled(); }
+void EmitImportAndCall(const char* module_name, uint16_t ord,
+ const char* name) {
+ ::xe::kernel::phase_a::EmitImportCall(module_name, ord, name);
+ ::xe::kernel::phase_a::EmitKernelCall(name);
+}
+void EmitReturn(const char* name, uint64_t return_value) {
+ ::xe::kernel::phase_a::EmitKernelReturn(name, return_value);
+}
+} // namespace phase_a_bridge
+} // namespace shim
+
+} // namespace kernel
+} // namespace xe
diff --git a/src/xenia/kernel/event_log.h b/src/xenia/kernel/event_log.h
new file mode 100644
index 000000000..c51e71b61
--- /dev/null
+++ b/src/xenia/kernel/event_log.h
@@ -0,0 +1,84 @@
+/**
+ ******************************************************************************
+ * Xenia : Xbox 360 Emulator Research Project *
+ ******************************************************************************
+ * Phase A event-log emitter. Cvar-gated (default off). Schema v1.
+ * Companion: xenia-rs/audit-runs/phase-a-diff-harness/schema-v1.md
+ ******************************************************************************
+ */
+
+#ifndef XENIA_KERNEL_EVENT_LOG_H_
+#define XENIA_KERNEL_EVENT_LOG_H_
+
+#include <cstdint>
+
+namespace xe {
+namespace kernel {
+namespace phase_a {
+
+// Object-type codes (must match ours's enum exactly — see schema-v1.md).
+enum ObjectType : uint32_t {
+ kObjUnknown = 0x00,
+ kObjEvent = 0x01,
+ kObjMutant = 0x02,
+ kObjSemaphore = 0x03,
+ kObjTimer = 0x04,
+ kObjThread = 0x05,
+ kObjFile = 0x06,
+ kObjIoCompletion = 0x07,
+ kObjModule = 0x08,
+ kObjEnumState = 0x09,
+ kObjSection = 0x0A,
+ kObjNotification = 0x0B,
+};
+
+// Fast bool check (default off). Inlinable so we can guard hot paths cheaply.
+bool IsEnabled();
+
+// Emitted once at startup if enabled (first line of the JSONL).
+void EmitSchemaHeader();
+
+// FNV-1a 64-bit identity (see schema-v1.md). Both engines compute identically.
+uint64_t ComputeSemanticId(uint32_t create_site_pc, uint32_t creating_tid,
+ uint64_t tid_event_idx_at_creation,
+ uint32_t object_type);
+
+// One emit per imported kernel function invocation. Emitted by the export
+// trampoline before the kernel.call event.
+void EmitImportCall(const char* module_name, uint16_t ordinal,
+ const char* fn_name);
+
+// Kernel call entry / return. args/args_resolved are deferred to a later
+// phase; v1 emits the name + return value only (sufficient for the diff
+// tool to align by sequence).
+void EmitKernelCall(const char* name);
+void EmitKernelReturn(const char* name, uint64_t return_value);
+
+// Handle lifecycle. raw_handle_id is engine-local; the diff key is the
+// FNV-1a semantic id.
+void EmitHandleCreate(uint64_t semantic_id, uint32_t object_type,
+ uint32_t raw_handle_id, const char* object_name);
+void EmitHandleDestroy(uint64_t semantic_id, uint32_t raw_handle_id,
+ uint32_t prior_refcount);
+
+// Thread create/exit. parent_tid is the caller; entry_pc is the spawned
+// thread's first instruction.
+void EmitThreadCreate(uint64_t semantic_id, uint32_t parent_tid,
+ uint32_t entry_pc, uint32_t ctx_ptr, uint32_t priority,
+ uint32_t affinity, uint32_t stack_size, bool suspended);
+void EmitThreadExit(uint32_t exit_code);
+
+// Wait begin/end. handles_count + handles_semantic_ids array.
+void EmitWaitBegin(const uint64_t* handles_semantic_ids, uint32_t count,
+ int64_t timeout_ns, bool alertable, bool wait_all);
+void EmitWaitEnd(uint32_t status, uint64_t woken_by_semantic_id_or_zero);
+
+// Returns the next per-tid event index (post-increment). Useful for
+// `tid_event_idx_at_creation` capture before calling ComputeSemanticId.
+uint64_t PeekTidEventIdx();
+
+} // namespace phase_a
+} // namespace kernel
+} // namespace xe
+
+#endif // XENIA_KERNEL_EVENT_LOG_H_
diff --git a/src/xenia/kernel/util/shim_utils.h b/src/xenia/kernel/util/shim_utils.h
index 0fa254157..209eeed97 100644
--- a/src/xenia/kernel/util/shim_utils.h
+++ b/src/xenia/kernel/util/shim_utils.h
@@ -499,6 +499,22 @@ enum class KernelModuleId {
xbdm,
};
+// Phase A bridge — see kernel/event_log.h. Inline to avoid pulling the
+// header into shim_utils.h's transitive set.
+namespace phase_a_bridge {
+constexpr const char* KernelModuleIdName(KernelModuleId m) {
+ switch (m) {
+ case KernelModuleId::xboxkrnl: return "xboxkrnl.exe";
+ case KernelModuleId::xam: return "xam.xex";
+ case KernelModuleId::xbdm: return "xbdm.xex";
+ }
+ return "unknown";
+}
+bool Enabled();
+void EmitImportAndCall(const char* module_name, uint16_t ord, const char* name);
+void EmitReturn(const char* name, uint64_t return_value);
+} // namespace phase_a_bridge
+
template <size_t I = 0, typename... Ps>
requires(I == sizeof...(Ps))
void AppendKernelCallParams(StringBuffer& string_buffer,
@@ -578,9 +594,18 @@ struct ExportRegistrerHelper {
cvars::log_high_frequency_kernel_calls)) {
PrintKernelCall(export_entry, params);
}
+ const bool phase_a_on = phase_a_bridge::Enabled();
+ if (phase_a_on) {
+ phase_a_bridge::EmitImportAndCall(
+ phase_a_bridge::KernelModuleIdName(MODULE), ORDINAL,
+ export_entry->name);
+ }
if constexpr (std::is_void<R>::value) {
KernelTrampoline(fn, std::forward<std::tuple<Ps...>>(params),
std::make_index_sequence<sizeof...(Ps)>());
+ if (phase_a_on) {
+ phase_a_bridge::EmitReturn(export_entry->name, 0);
+ }
} else {
auto result =
KernelTrampoline(fn, std::forward<std::tuple<Ps...>>(params),
@@ -590,6 +615,11 @@ struct ExportRegistrerHelper {
(xe::cpu::ExportTag::kLog | xe::cpu::ExportTag::kLogResult)) {
// TODO(benvanik): log result.
}
+ if (phase_a_on) {
+ phase_a_bridge::EmitReturn(
+ export_entry->name,
+ static_cast<uint64_t>(ppc_context->r[3]));
+ }
}
}
};
@@ -600,14 +630,28 @@ struct ExportRegistrerHelper {
0,
};
std::tuple<Ps...> params = {Ps(init)...};
+ const bool phase_a_on = phase_a_bridge::Enabled();
+ if (phase_a_on) {
+ phase_a_bridge::EmitImportAndCall(
+ phase_a_bridge::KernelModuleIdName(MODULE), ORDINAL,
+ export_entry->name);
+ }
if constexpr (std::is_void<R>::value) {
KernelTrampoline(fn, std::forward<std::tuple<Ps...>>(params),
std::make_index_sequence<sizeof...(Ps)>());
+ if (phase_a_on) {
+ phase_a_bridge::EmitReturn(export_entry->name, 0);
+ }
} else {
auto result =
KernelTrampoline(fn, std::forward<std::tuple<Ps...>>(params),
std::make_index_sequence<sizeof...(Ps)>());
result.Store(ppc_context);
+ if (phase_a_on) {
+ phase_a_bridge::EmitReturn(
+ export_entry->name,
+ static_cast<uint64_t>(ppc_context->r[3]));
+ }
}
}
};

View File

@@ -0,0 +1,189 @@
# Phase A diff report
**This report is the output of Phase A's diff harness. Divergences
shown here are INPUT for Phase B (first-divergence localization),
not findings of Phase A.** Phase A's job is to make the harness
itself correct, not to analyze what it surfaces.
## Summary
| canary_tid | ours_tid | matched | canary_total | ours_total | first_divergence_at |
|---|---|---|---|---|---|
| 4 | 11 | 5 | 25163 | 9 | 5 |
| 6 | 1 | 113 | 313196 | 108492 | 113 |
| 7 | 2 | 2 | 29 | 33 | 2 |
| 12 | 7 | 2 | 2846 | 3 | 2 |
| 14 | 9 | 11 | 587000 | 75 | 11 |
| 15 | 10 | 15 | 355601 | 15 | — |
## canary_tid=4 → ours_tid=11
First divergence at `tid_event_idx=5`: payload.return_value: canary=1 ours=0
**Pre-context (last 5 matching events):**
```
canary: [0] import.call RtlEnterCriticalSection
ours: [0] import.call RtlEnterCriticalSection
canary: [1] kernel.call RtlEnterCriticalSection
ours: [1] kernel.call RtlEnterCriticalSection
canary: [2] kernel.return RtlEnterCriticalSection
ours: [2] kernel.return RtlEnterCriticalSection
canary: [3] import.call KeSetEvent
ours: [3] import.call KeSetEvent
canary: [4] kernel.call KeSetEvent
ours: [4] kernel.call KeSetEvent
```
**Divergent event:**
```
canary: [5] kernel.return KeSetEvent
ours: [5] kernel.return KeSetEvent
```
**Next event after the divergence (if any):**
```
canary: [6] import.call KeWaitForMultipleObjects
ours: [6] import.call KeWaitForMultipleObjects
```
**Raw events (JSON):**
```json
{"deterministic": true, "engine": "canary", "guest_cycle": 0, "host_ns": 1085465900, "kind": "kernel.return", "payload": {"name": "KeSetEvent", "return_value": 1, "side_effects": [], "status": "0x00000001"}, "schema_version": 1, "tid": 4, "tid_event_idx": 5}
{"deterministic": true, "engine": "ours", "guest_cycle": 33, "host_ns": 1691667785, "kind": "kernel.return", "payload": {"name": "KeSetEvent", "return_value": 0, "side_effects": [], "status": "0x00000000"}, "schema_version": 1, "tid": 11, "tid_event_idx": 5}
```
## canary_tid=6 → ours_tid=1
First divergence at `tid_event_idx=113`: payload.return_value: canary=0 ours=1880095840
**Pre-context (last 5 matching events):**
```
canary: [108] import.call RtlLeaveCriticalSection
ours: [108] import.call RtlLeaveCriticalSection
canary: [109] kernel.call RtlLeaveCriticalSection
ours: [109] kernel.call RtlLeaveCriticalSection
canary: [110] kernel.return RtlLeaveCriticalSection
ours: [110] kernel.return RtlLeaveCriticalSection
canary: [111] import.call KeQuerySystemTime
ours: [111] import.call KeQuerySystemTime
canary: [112] kernel.call KeQuerySystemTime
ours: [112] kernel.call KeQuerySystemTime
```
**Divergent event:**
```
canary: [113] kernel.return KeQuerySystemTime
ours: [113] kernel.return KeQuerySystemTime
```
**Next event after the divergence (if any):**
```
canary: [114] import.call RtlInitializeCriticalSection
ours: [114] import.call RtlInitializeCriticalSection
```
**Raw events (JSON):**
```json
{"deterministic": true, "engine": "canary", "guest_cycle": 0, "host_ns": 16129600, "kind": "kernel.return", "payload": {"name": "KeQuerySystemTime", "return_value": 0, "side_effects": [], "status": "0x00000000"}, "schema_version": 1, "tid": 6, "tid_event_idx": 113}
{"deterministic": true, "engine": "ours", "guest_cycle": 9415, "host_ns": 72844487, "kind": "kernel.return", "payload": {"name": "KeQuerySystemTime", "return_value": 1880095840, "side_effects": [], "status": "0x700ffc60"}, "schema_version": 1, "tid": 1, "tid_event_idx": 113}
```
## canary_tid=7 → ours_tid=2
First divergence at `tid_event_idx=2`: payload.return_value: canary=0 ours=1896873464
**Pre-context (last 5 matching events):**
```
canary: [0] import.call RtlInitAnsiString
ours: [0] import.call RtlInitAnsiString
canary: [1] kernel.call RtlInitAnsiString
ours: [1] kernel.call RtlInitAnsiString
```
**Divergent event:**
```
canary: [2] kernel.return RtlInitAnsiString
ours: [2] kernel.return RtlInitAnsiString
```
**Next event after the divergence (if any):**
```
canary: [3] import.call NtCreateFile
ours: [3] import.call NtCreateFile
```
**Raw events (JSON):**
```json
{"deterministic": true, "engine": "canary", "guest_cycle": 0, "host_ns": 726781700, "kind": "kernel.return", "payload": {"name": "RtlInitAnsiString", "return_value": 0, "side_effects": [], "status": "0x00000000"}, "schema_version": 1, "tid": 7, "tid_event_idx": 2}
{"deterministic": true, "engine": "ours", "guest_cycle": 2475, "host_ns": 462883889, "kind": "kernel.return", "payload": {"name": "RtlInitAnsiString", "return_value": 1896873464, "side_effects": [], "status": "0x710ffdf8"}, "schema_version": 1, "tid": 2, "tid_event_idx": 2}
```
## canary_tid=12 → ours_tid=7
First divergence at `tid_event_idx=2`: payload.return_value: canary=258 ours=0
**Pre-context (last 5 matching events):**
```
canary: [0] import.call KeWaitForSingleObject
ours: [0] import.call KeWaitForSingleObject
canary: [1] kernel.call KeWaitForSingleObject
ours: [1] kernel.call KeWaitForSingleObject
```
**Divergent event:**
```
canary: [2] kernel.return KeWaitForSingleObject
ours: [2] kernel.return KeWaitForSingleObject
```
**Next event after the divergence (if any):**
```
canary: [3] import.call RtlEnterCriticalSection
ours: <end of stream>
```
**Raw events (JSON):**
```json
{"deterministic": true, "engine": "canary", "guest_cycle": 0, "host_ns": 913948200, "kind": "kernel.return", "payload": {"name": "KeWaitForSingleObject", "return_value": 258, "side_effects": [], "status": "0x00000102"}, "schema_version": 1, "tid": 12, "tid_event_idx": 2}
{"deterministic": true, "engine": "ours", "guest_cycle": 30, "host_ns": 489593928, "kind": "kernel.return", "payload": {"name": "KeWaitForSingleObject", "return_value": 0, "side_effects": [], "status": "0x00000000"}, "schema_version": 1, "tid": 7, "tid_event_idx": 2}
```
## canary_tid=14 → ours_tid=9
First divergence at `tid_event_idx=11`: payload.return_value: canary=2 ours=0
**Pre-context (last 5 matching events):**
```
canary: [6] import.call KeAcquireSpinLockAtRaisedIrql
ours: [6] import.call KeAcquireSpinLockAtRaisedIrql
canary: [7] kernel.call KeAcquireSpinLockAtRaisedIrql
ours: [7] kernel.call KeAcquireSpinLockAtRaisedIrql
canary: [8] kernel.return KeAcquireSpinLockAtRaisedIrql
ours: [8] kernel.return KeAcquireSpinLockAtRaisedIrql
canary: [9] import.call KeRaiseIrqlToDpcLevel
ours: [9] import.call KeRaiseIrqlToDpcLevel
canary: [10] kernel.call KeRaiseIrqlToDpcLevel
ours: [10] kernel.call KeRaiseIrqlToDpcLevel
```
**Divergent event:**
```
canary: [11] kernel.return KeRaiseIrqlToDpcLevel
ours: [11] kernel.return KeRaiseIrqlToDpcLevel
```
**Next event after the divergence (if any):**
```
canary: [12] import.call KeRaiseIrqlToDpcLevel
ours: [12] import.call KeRaiseIrqlToDpcLevel
```
**Raw events (JSON):**
```json
{"deterministic": true, "engine": "canary", "guest_cycle": 0, "host_ns": 1086679400, "kind": "kernel.return", "payload": {"name": "KeRaiseIrqlToDpcLevel", "return_value": 2, "side_effects": [], "status": "0x00000002"}, "schema_version": 1, "tid": 14, "tid_event_idx": 11}
{"deterministic": true, "engine": "ours", "guest_cycle": 77, "host_ns": 1691712626, "kind": "kernel.return", "payload": {"name": "KeRaiseIrqlToDpcLevel", "return_value": 0, "side_effects": [], "status": "0x00000000"}, "schema_version": 1, "tid": 9, "tid_event_idx": 11}
```
## canary_tid=15 → ours_tid=10
No divergence within the 15 compared events (canary has 355601, ours has 15).

View File

@@ -0,0 +1,10 @@
{
"instructions": 50000001,
"imports": 40454,
"unimpl": 0,
"draws": 0,
"swaps": 1,
"unique_render_targets": 0,
"shader_blobs_live": 0,
"texture_cache_entries": 0
}

View File

@@ -0,0 +1,10 @@
{
"instructions": 50000001,
"imports": 40454,
"unimpl": 0,
"draws": 0,
"swaps": 1,
"unique_render_targets": 0,
"shader_blobs_live": 0,
"texture_cache_entries": 0
}

View File

@@ -0,0 +1,59 @@
# Phase A — ours (xenia-rs) changes
All changes are **additive and cvar-gated default-off**. With the cvar unset, `xenia-rs check --stable-digest -n 50000000` produces a byte-identical digest to the pre-patch baseline (verified in `validation.md`).
| File | Change | Why |
|---|---|---|
| `crates/xenia-kernel/src/event_log.rs` | **NEW** (≈340 LOC) | Phase A emitter. Lazy file open, per-tid monotonic `tid_event_idx` counter, FNV-1a 64-bit `semantic_id`, JSONL writer with `Mutex<BufWriter<File>>`. `is_enabled()` is a relaxed atomic-bool load → zero overhead when disabled. Includes unit tests for FNV-1a vs the standard `"foobar"` test vector and for `semantic_id` stability. |
| `crates/xenia-kernel/src/lib.rs` | +1 line (`pub mod event_log;`) | Register the new module. |
| `crates/xenia-kernel/src/state.rs` | +33 LOC inside `call_export` | Single hook site for `import.call` / `kernel.call` / `kernel.return`. All three emits are inside `if phase_a_on` guards. Reads `tid` and `cycle_count` from the running thread when enabled. |
| `crates/xenia-app/src/main.rs` | +1 CLI flag, +6 LOC dispatch | Adds `--phase-a-event-log <PATH>` to the `Exec` subcommand. Env-var fallback `XENIA_PHASE_A_EVENT_LOG`. Calls `xenia_kernel::event_log::init(path)` once at startup; `None` keeps the emitter disabled. |
## Total surface area
≈ 380 LOC additive across 4 files; no existing logic modified.
## Hook coverage (v1)
- `import.call` — emitted at the syscall dispatcher (`call_export` in `state.rs`) once per kernel/import invocation, before `kernel.call`.
- `kernel.call` — same site, after `import.call`, before the export body runs.
- `kernel.return` — same site, after the export body returns, with `r3` as the integer return value.
- `schema_version` header — emitted once on file open (synthetic `tid=0`).
The following event kinds are part of schema v1 but **not yet wired** in either engine in this phase. Some have a Rust emitter function ready (function exists in `event_log.rs`, no call site); others have no Rust function yet:
- `thread.create`, `thread.exit` — emitter ready (`emit_thread_create`, `emit_thread_exit`), no call sites.
- `thread.suspend`, `thread.resume` — declared in schema, no Rust function yet.
- `handle.create`, `handle.destroy` — emitter ready (`emit_handle_create`, `emit_handle_destroy`), no call sites.
- `wait.begin`, `wait.end` — emitter ready (`emit_wait_begin`, `emit_wait_end`), no call sites.
- `mem.write` — declared in schema, no Rust function yet; gated behind a separate cvar (`phase_a_event_log_mem_writes`) declared in canary defaulted false.
- `vfs.open`, `vfs.read`, `vfs.close` — declared in schema, no Rust function yet.
Wiring any of these is additive and can land in a follow-up without touching the schema or the diff tool.
## Verification
```bash
cd "/home/fabi/RE - Project Sylpheed/xenia-rs"
# Build
cargo build --release -p xenia-app
# Unit tests (FNV-1a + semantic_id stability)
cargo test -p xenia-kernel event_log
# Cvar-OFF determinism
cp target/release/xenia-rs target/release/xenia-rs-phaseA-post
target/release/xenia-rs-phaseA-post check --stable-digest -n 50000000 \
--out /tmp/digest-after.json \
"/home/fabi/RE - Project Sylpheed/Project Sylpheed - Arc of Deception (USA, Europe) (En,Ja).iso"
diff audit-runs/phase-a-diff-harness/digest-pre-patch.json /tmp/digest-after.json
# expect empty diff (byte-identical)
# Cvar-ON sanity run
rm -f audit-runs/phase-a-diff-harness/ours-sanity.jsonl
target/release/xenia-rs-phaseA-post exec -n 50000000 \
--phase-a-event-log audit-runs/phase-a-diff-harness/ours-sanity.jsonl --quiet \
"/home/fabi/RE - Project Sylpheed/Project Sylpheed - Arc of Deception (USA, Europe) (En,Ja).iso"
head -1 audit-runs/phase-a-diff-harness/ours-sanity.jsonl # must start with schema_version
```

View File

@@ -0,0 +1,864 @@
# Phase A — Event Log Schema v1
**Status:** frozen for Phase A and Phase B. Adding a new event kind requires a `schema_version` bump and a coordinated update in both engines + the diff tool.
## Wire format
JSONL — one JSON object per line, UTF-8, `\n`-terminated. Both engines emit the same byte format.
The **first line** of every event-log file MUST be a `schema_version` event:
```json
{"schema_version":1,"engine":"canary","kind":"schema_version","tid":0,"tid_event_idx":0,"guest_cycle":0,"host_ns":0,"deterministic":true,"payload":{"version":1,"emitter_build":"<commit-or-build-id>"}}
```
The diff tool refuses to parse a file whose first event is not `schema_version` with version `1`.
## Common fields (every event)
| Field | Type | Notes |
|---|---|---|
| `schema_version` | int | always `1` in this phase |
| `engine` | string | `"canary"` or `"ours"` |
| `kind` | string | one of the v1 kinds below |
| `tid` | int | guest thread id of the calling thread (host TID never logged) |
| `tid_event_idx` | int | **per-tid monotonic, starts at 0** — the diff key |
| `guest_cycle` | int | per-engine monotonic guest-instruction count; `0` if the engine cannot supply one (see "Cycle source" below). NOT used by the diff tool for correctness — `tid_event_idx` is the canonical key. |
| `host_ns` | int | host monotonic-clock ns since process start; debug only, never compared by diff |
| `deterministic` | bool | `false` if any payload field is derived from host time / raw allocator address / RNG / etc. Diff tool skip-compares non-deterministic fields. |
| `payload` | object | kind-specific (see below) |
## Cycle source notes
- **canary**: the PPC `tb` (timebase) register can be read from the PPCContext passed into shim handlers. If a hook is on a path that does not have access to a PPCContext (e.g. a host-side handle-table destructor), the emitter MUST set `guest_cycle = 0` and leave `deterministic = false` on the payload-side metadata. The diff tool ignores `guest_cycle` for ordering — `tid_event_idx` is canonical.
- **ours**: `scheduler.thread(current_ref()).ctx.timebase` (already maintained per guest thread).
## Per-tid event index
Both engines maintain a per-tid monotonic counter starting at `0`. The counter is bumped **before** the event is serialized, so the first event for tid `N` has `tid_event_idx = 0`.
The `schema_version` event is special: it is emitted by the writer thread (typically the boot thread before any guest code has run) with `tid = 0` and `tid_event_idx = 0`. The actual guest thread `0` does not exist; the diff tool treats `tid = 0` as the schema header only.
## Handle semantic ID
Canary and ours produce guest handles in different ranges (canary: `0xF8xxxxxx` region; ours: bump-allocated `0x4, 0x8, 0xC, …`). Raw handle IDs are unsuitable as a cross-engine identity. Instead, both engines compute a stable **handle semantic ID** at handle creation time using **FNV-1a 64-bit** over a fixed-format byte string. FNV-1a is used (not SHA256) because both engines can implement it in <10 lines with no dependency, and the diff tool only needs a deterministic identity hash — not a crypto property.
```
input_bytes = le_u32(create_site_pc) ‖ le_u32(creating_tid) ‖ le_u64(tid_event_idx_at_creation) ‖ le_u32(object_type)
hash = 0xCBF29CE484222325
for each byte b in input_bytes:
hash = (hash XOR b) * 0x100000001B3 mod 2^64
handle_semantic_id = format("{:016x}", hash)
```
Both engines MUST emit the lowercase 16-hex-char form. The `create_site_pc` is the guest PC at the call site of the kernel call that created the handle: in canary, `PPCContext::lr - 4` (the `bl` to the import stub); in ours, the equivalent return address from the syscall dispatcher.
**Object type codes** (v1 — both engines agree):
| Code | Type |
|---|---|
| `0x00` | Unknown |
| `0x01` | Event |
| `0x02` | Mutant |
| `0x03` | Semaphore |
| `0x04` | Timer |
| `0x05` | Thread |
| `0x06` | File |
| `0x07` | IoCompletion |
| `0x08` | Module |
| `0x09` | EnumState |
| `0x0A` | Section |
| `0x0B` | Notification |
All subsequent events that reference a handle emit BOTH `handle_semantic_id` (the diff key) and `raw_handle_id` (engine-local, never compared).
## Event kinds (v1)
### `schema_version`
Header event. `payload = {"version": 1, "emitter_build": "<string>"}`.
### `thread.create`
Emitted by the **parent** thread at the kernel call that creates the new thread.
```json
"payload": {
"handle_semantic_id": "0123456789abcdef",
"parent_tid": 1,
"entry_pc": "0x82001234",
"ctx_ptr": "0xbce25340",
"priority": 0,
"affinity": 1,
"stack_size": 65536,
"suspended": false
}
```
### `thread.exit`
Emitted by the **exiting** thread (last event before tid disappears).
```json
"payload": {"exit_code": 0}
```
### `thread.suspend` / `thread.resume`
```json
"payload": {"target_tid": 13}
```
### `kernel.call`
Emit at handler entry, **before** any side effects.
```json
"payload": {
"name": "NtCreateFile",
"args": {"file_handle_ptr": "0x70000010", "desired_access": "0x80100080", "obj_attr_ptr": "0x70000020", ...},
"args_resolved": {"path": "\\Device\\Cdrom0\\dat\\movie\\opening.bik"}
}
```
- Numeric args use `0x`-prefixed hex strings if pointer-typed; ints stay as ints.
- `args_resolved` is a best-effort dereference (strings, struct dumps, buffer summaries). Optional.
### `kernel.return`
Emit at handler exit, **after** all side effects committed.
```json
"payload": {
"name": "NtCreateFile",
"return_value": 0,
"status": "0x00000000",
"side_effects": [
{"kind": "handle.create", "handle_semantic_id": "...", "object_type": 6, "raw_handle_id": "0x40"}
]
}
```
The `side_effects` array MAY duplicate events also emitted as standalone (`handle.create`). The diff tool treats both as authoritative; duplicates do not cause divergence.
### `handle.create`
For host-side creates not tied to a kernel call (rare).
```json
"payload": {
"handle_semantic_id": "0123456789abcdef",
"object_type": 1,
"object_name": null,
"raw_handle_id": "0xf8000048"
}
```
### `handle.destroy`
```json
"payload": {
"handle_semantic_id": "0123456789abcdef",
"raw_handle_id": "0xf8000048",
"prior_refcount": 1
}
```
### `wait.begin`
```json
"payload": {
"handles_semantic_ids": ["0123...", "abcd..."],
"timeout_ns": -1,
"alertable": false,
"wait_type": "any"
}
```
`timeout_ns = -1` means INFINITE. `wait_type` is `"any"` or `"all"`.
### `wait.end`
```json
"payload": {
"status": "0x00000000",
"woken_by_semantic_id": "0123456789abcdef",
"wait_duration_cycles": 12345
}
```
`wait_duration_cycles` is `deterministic = false` (host scheduling affects it). `woken_by_semantic_id` is null on timeout / alerted.
### `mem.write`
**OPT-IN — gated by a separate cvar (`phase_a_event_log_mem_writes`, default false).** In Phase A this kind is reserved; emitters MAY ship a TODO stub. Schema:
```json
"payload": {
"guest_addr": "0x82000000",
"value": "0x12345678",
"size": 4,
"source": "guest_jit"
}
```
### `vfs.open` / `vfs.read` / `vfs.close`
File-IO events, separate from `kernel.call` so the diff tool can match on canonical path:
```json
"payload": {"canonical_path": "\\Device\\Cdrom0\\dat\\movie\\opening.bik", "raw_handle_id": "0x40", "handle_semantic_id": "..."}
```
### `import.call`
Emitted at the syscall dispatcher (ours) or the import-stub JIT trap (canary), one per imported function invocation, **before** the implementing `kernel.call`.
```json
"payload": {
"module": "xboxkrnl.exe",
"ord": 0x101,
"name": "NtCreateFile"
}
```
## Diff-tool field-comparison rules
| Field | Rule |
|---|---|
| `engine` | skipped (always differs) |
| `host_ns` | skipped (host-clock) |
| `guest_cycle` | skipped (engines disagree on absolute count; diff uses `tid_event_idx`) |
| `raw_handle_id` | skipped (engines use different handle namespaces) |
| `handle_semantic_id` | **C+15-α: skipped** (engine-local — see below) |
| `handles_semantic_ids` (wait.begin) | **C+15-α: skipped** (same reason) |
| `parent_tid` (thread.create) | **C+15-α: skipped** (engine-local guest tids) |
| `ctx_ptr` (thread.create) | **C+22 v1.7: per-(tid, kind, field) ordinal sentinel** (`<HOSTHEAP_thread.create_ctx_ptr_N>`) — host-heap-derived VA, AUDIT-043 ε class |
| `woken_by_semantic_id` (wait.end) | **C+15-α: skipped** (engine-local SID) |
| `deterministic` (event-level field) | skipped (metadata) |
| Any payload field listed under a non-deterministic kind | skipped where flagged |
| All other payload fields | strict equality |
### Phase C+15-α note on `handle_semantic_id`
The SID computation includes `creating_tid` as input, but guest TIDs differ
between engines (canary's tid=6 maps to ours's tid=1 on the main chain).
Both engines compute SIDs **using their own local tids**, so the same logical
handle gets two different SIDs across engines. The diff tool skip-compares
SID fields and relies on `tid_event_idx + object_type` for alignment.
A future schema v2 could canonicalize SIDs via the diff tool's tid map and
restore strict comparison. For v1.1 the simpler skip-policy suffices.
## Shared-global SIDs (v1.2 — added in Phase C+18)
A subset of guest kernel dispatcher objects (`KEVENT`, `KSEMAPHORE`,
`KTIMER`, `KMUTANT`) are **process-global**: they live in
statically-initialized or pre-allocated guest memory and are touched
by MULTIPLE guest threads during boot. Examples include the XAudio
voice-volume change-mask semaphore at `0x828a3230` in Sylpheed.
Canary's `XObject::GetNativeObject` (`src/xenia/kernel/xobject.cc:397-483`)
and ours's `ensure_dispatcher_object` (`crates/xenia-kernel/src/exports.rs:4363`)
**lazy-wrap** these dispatchers on **first guest-thread touch**: the
first `KeWait*` invocation that passes the raw kernel-object pointer
synthesizes the `XObject` wrapper, stamps the `X_DISPATCH_HEADER` with
the `kXObjSignature` marker (`'X','E','N','\0' = 0x58454E00`), stashes
the handle, and emits `handle.create`. Subsequent touches find the
marker and short-circuit without emit (per-pointer idempotent).
### The first-toucher race
**Which** guest thread wins the "first toucher" race is
**timing-dependent**:
- Canary and ours have different host schedulers, JIT throughput, and
guest-thread bootstrap ordering.
- Even within the same engine across runs the first-toucher can
differ — but each engine produces a deterministic per-run total
ordering, so cold-vs-cold reproducibility holds.
The per-thread SID recipe `semantic_id(create_site_pc, creating_tid,
tid_event_idx_at_creation, object_type)` (v1) depends on BOTH
`creating_tid` and `tid_event_idx_at_creation`, so:
- Same dispatcher → DIFFERENT SIDs in each engine (race-dependent).
- `handle.create` for the same object lands on different per-tid
streams in canary vs ours.
The C+17 fix made ours emit `handle.create` for these synthesized
shadows, but the C+17 D-NEW-3 regression on tid=15→10 was
exactly the first-toucher race: ours's tid=10 was the first toucher
locally; canary's tid=15 was NOT the first toucher in its run — some
other canary tid had already adopted `0x828a3230`. ours's tid=10
emitted an "extra" `handle.create` that canary's tid=15 lacked, and
the diff tool flagged a kind mismatch at idx=2.
### The C+18 fix: deterministic SID recipe
Process-global dispatchers use a **second** SID recipe that is
scheduling-invariant. Both engines now use:
```
SHARED_GLOBAL_SID_MARKER = 0xC01AB005 (fixed sentinel, both engines)
input_bytes =
le_u32(SHARED_GLOBAL_SID_MARKER) // 4 bytes — "create_site_pc" slot
‖ le_u32(0) // 4 bytes — "creating_tid" slot
‖ le_u64(pointer) // 8 bytes — "tid_event_idx" slot
‖ le_u32(object_type) // 4 bytes
hash = FNV-1a-64(input_bytes)
shared_global_sid = format("{:016x}", hash)
```
The marker `0xC01AB005` is outside any plausible guest-PC range
(PPC text 0x82000000-0x82FFFFFF; XEX header 0x3001xxxx; heap
0x4xxxxxxx), so it can never collide with a regular per-thread SID
(which uses a real guest PC as `create_site_pc`).
Both engines compute the SAME SID for the same dispatcher pointer
regardless of:
- which guest thread is the first toucher,
- the `tid_event_idx_at_creation`,
- the per-engine scheduling order.
### Which call sites use which recipe
| Call site | SID recipe |
|--------------------------------------------------------|-------------------|
| `KernelState::alloc_handle_for` (ours) | per-thread |
| `ObjectTable::AddHandle` direct (canary) | per-thread |
| `ensure_dispatcher_object` (ours) | **shared-global** |
| `XObject::GetNativeObject` synthesized (canary) | **shared-global** |
Regular per-thread `handle.create` events (file open, thread create,
named-event create, etc.) keep the v1 per-thread recipe. The
shared-global recipe is restricted to lazy-wrap synthesis.
### Diff tool: cross-tid floating `handle.create` matching
The diff tool pre-pass collects all shared-global SIDs in either
engine's stream. A `handle.create` event is detected as shared-global
by recomputing the deterministic SID from its `(raw_handle_id,
object_type)` payload and comparing against the event's
`handle_semantic_id`. Regular per-thread SIDs cannot match this check
by construction.
When per-tid alignment finds a kind mismatch and one side has a
shared-global `handle.create` whose SID is in the floating set:
- The diff tool advances ONLY that side's stream pointer past the
floating event.
- Re-compare at the same canonical position.
The diff report's summary table shows a `floating_skipped (c/o)`
column for visibility — counts of absorbed events per side.
### Index relaxation
The C+18 fix relaxes the legacy diff-tool rule that requires
`canary.tid_event_idx == ours.tid_event_idx` for matching events.
With floating absorption, the per-tid indices can drift by 1 between
the two sides — but the `kind` and `payload` comparisons remain
strict. The raw indices are still preserved on the events themselves
(useful for debugging and report context).
### Backward compatibility
- Wire format unchanged. `schema_version` is still `1`.
- Pre-C+18 event logs (no shared-global SIDs in the stream) trigger
the legacy code path automatically — the floating set is empty.
- The marker constant `0xC01AB005` MUST be exactly this value in both
engines and the diff tool. Tests in both engines plus
`tools/diff-events/test_diff_events.py` lock it in.
## Wait-begin floating absorb (v1.3 — added in Phase C+21)
### Motivation
Canary's `RtlEnterCriticalSection` (and its symmetric counterparts —
`KeWaitForSingleObject` invoked on a process-global dispatcher,
mutex/semaphore contended-acquire paths) emits `wait.begin` **only on
the contended slow path**. The fast path (uncontended atomic-CAS, or
recursive bump) emits NO `wait.begin` and only the `kernel.call`
`kernel.return` pair. Which path is taken depends on whether ANOTHER
guest thread is currently holding the dispatcher when the wait is
attempted — i.e. it is **host-scheduler-driven**, varying across cold
runs of the same engine.
Reading-error class **#32** (documented in C+20's
`investigation.md`) captures this: cross-checking 3 fresh canary cold
runs at canary tid=6 idx 104,606 showed:
- jitter-1: `wait.begin sid=75ae880ec432eb36` (contended)
- jitter-2: `kernel.return` (fast — matches ours)
- jitter-3: offset-shifted wait.begin at a different idx with a
different SID
The matched-prefix metric is unreliable inside such regions if the
diff tool treats wait.begin events as strictly positional.
### The fix
A `wait.begin` event is **floating** if at least one of its
`payload.handles_semantic_ids` references a shared-global SID
(see §"Shared-global SIDs"). During the per-tid two-pointer walk:
- If one side has a floating `wait.begin` and the other has a
different kind at the same canonical position, advance ONLY the
wait.begin side's pointer and re-compare.
`wait_type=all` waits are floating as long as ANY single handle in
the set is shared-global — the entire wait's blocking behavior is
timing-dependent if even one of its handles is on a process-global
dispatcher.
### Shared-global SID detection (extended in C+21)
The diff tool's `collect_shared_global_sids` pre-pass now unions
TWO sources:
1. **Recipe-matching `handle.create` events** (Phase C+18 — direct).
This catches ours's `ensure_dispatcher_object` output where
`raw_handle_id == ptr` (the recipe-input pointer).
2. **Cross-tid usage heuristic** (Phase C+21 — indirect). Any SID
referenced via `handle.create` OR `wait.begin` on **two or more
distinct guest tids** in EITHER engine is treated as shared-global.
The cross-tid heuristic exists because canary's
`EmitHandleCreateSharedGlobal` (`event_log.cc:435`) emits the SID
computed from the dispatcher VA but stashes
`object->handle()` (a handle-table slot in the `0xF8xxxxxx`
region) as `raw_handle_id`. Those two values DIFFER, so canary's
shared-global `handle.create` events are NOT recipe-recognizable
from their payload alone. Multi-tid SID usage is a robust
observational signal: per-thread SIDs by construction stay on the
single creating tid (their hash inputs include `creating_tid`),
so any cross-tid SID usage indicates a process-global dispatcher.
### Risk of over-absorption (and why it's bounded)
The cross-tid heuristic could in principle mis-classify a per-thread
SID that one thread creates and another thread waits on — a
legitimate cross-thread synchronization pattern. The floating-absorb,
however, only fires on a **kind mismatch** at the canonical position.
Per-thread waits that match strictly on both sides advance normally
without any absorb. The heuristic only loosens alignment when one
side is missing a `handle.create` or `wait.begin` — exactly the
scheduling-jitter window the C+21 fix targets.
### Diff-tool report changes
The summary table's `floating_skipped (c/o)` column is split into
two columns:
- `floating_create (c/o)` — C+18 `handle.create` absorptions.
- `floating_wait (c/o)` — C+21 `wait.begin` absorptions.
Both per-side and observation-only — counts may legitimately be
non-zero in a clean run.
### Backward compatibility
- Wire format unchanged. `schema_version` is still `1`.
- Pre-C+21 event logs (no `wait.begin` events that reference
shared-global SIDs) trigger no new behavior — the wait absorption
branches are inert.
- The C+18 floating-create logic is unchanged; the C+21 fix is
strictly additive.
- Engine source is UNCHANGED in C+21 — the fix is in the diff tool
only.
## contention.observed (v1.4 — added in Phase D Stage 1, 2026-05-18)
### Motivation
The 104,607 cap is canary's tid=6 contending on a CS while ours's tid=1
fast-paths through the same call (Phase C+22). Schedules diverge for
host-OS reasons, so neither engine is "wrong" — but matched-prefix
stalls. Phase D's H' approach makes ours's `rtl_enter_critical_section`
*replay* canary's contention by consulting a per-call manifest built
from canary's contention trace.
Stage 1 (this section) introduces the canary-side **emitter** for that
manifest: a new event kind `contention.observed` that fires from
`RtlEnterCriticalSection_entry` (`xboxkrnl_rtl.cc:596-633`) just before
the call falls through to `xeKeWaitForSingleObject` after spin-loop
exhaustion. Cvar-gated (`kernel_emit_contention`, default false) so
default canary behavior is byte-identical.
### Event shape
```json
{
"schema_version": 1,
"engine": "canary",
"kind": "contention.observed",
"tid": <guest tid of caller>,
"tid_event_idx": <per-tid ordinal consumes one slot>,
"guest_cycle": 0,
"host_ns": <emit timestamp>,
"deterministic": true,
"payload": {
"cs_ptr": "0xHHHHHHHH", // guest VA of the RTL_CRITICAL_SECTION
"site_sid": "HHHHHHHHHHHHHHHH", // shared-global SID (see below)
"contended": true // always true at v1.4 (uncontended is implicit)
}
}
```
`site_sid` is computed via the **C+18 shared-global SID recipe**:
```
site_sid = FNV-1a-64 over
( kSharedGlobalSidMarker [u32 LE] // 0xC01AB005
, 0 [u32 LE] // creating_tid (unused)
, cs_ptr as u64 [u64 LE] // pointer-as-idx
, kObjCriticalSection [u32 LE] // 0x0C, new in v1.4
)
```
Both engines compute the same SID for the same CS pointer. The marker
constant `kObjCriticalSection = 0x0C` is the new ObjectType value
introduced for this kind; it does NOT correspond to a real XObject
(CS lives as a guest-memory struct, not a handle-tabled object).
### When emitted (canary)
In `RtlEnterCriticalSection_entry`:
1. Recursive-lock fast path (already own lock) → **NO emit** (not contention).
2. Spin-loop succeeds (`atomic_cas` flips `lock_count` from -1 → 0) → **NO emit** (fast acquire).
3. Spin-loop exhausted **AND** `atomic_inc(&cs->lock_count) != 0`**EMIT** with `contended=true`, then `xeKeWaitForSingleObject`.
4. Spin-loop exhausted **AND** `atomic_inc(...) == 0` (CS became free between spin and inc) → **NO emit** (we won the race after spin).
The emit point sits **between** atomic_inc's positive result and the
`xeKeWaitForSingleObject` call, so the new event always precedes the
existing `wait.begin` event in the per-tid ordinal.
### When emitted (ours, Stage 3 — pending)
Stage 3 will add a symmetric emit in `rtl_enter_critical_section`
(`xenia-rs/crates/xenia-kernel/src/exports.rs:2886-2946`) at the
forced-park branch driven by the manifest. This keeps per-tid ordinals
aligned across engines after replay.
### Diff-tool treatment (Stage 4 — pending)
`contention.observed` will be added to `ENGINE_LOCAL_KINDS` in
`diff_events.py`: the per-tid pointer advances past these events on
either side without comparison. This keeps matched-prefix counts
unchanged when ONE side emits the event (Stage 1's canary-only world)
or when BOTH emit at the same ordinal (Stage 3's parity world).
### Cvar default + byte-identity
`kernel_emit_contention=false` by default. With cvar=false, the helper
`phase_a::EmitContentionObserved` short-circuits at the cvar check
before any `IsEnabled()` lookup. The pre-Stage-1 canary code path is
preserved byte-for-byte; cvar-OFF cold runs produce zero
`contention.observed` events (validated on the Stage 1 cold run:
0 occurrences in a 4.4 GB / 18.6 M event trace).
## Nested-CS-cleanup absorber (v1.5 — added in Phase D D-extension, 2026-05-18)
### Status
**Band-aid.** Explicit annotation: this absorber CROSSES the reading-error
#23 boundary in spirit. It folds real guest control-flow divergence at
the diff-tool layer. It exists because the underlying root cause —
producer-throughput divergence under the cooperative-vs-preemptive
scheduling mismatch (see Phase D forensics) — is **explicitly out of
scope** for the H' plan: fixing it in ours's engine would require
preempting the cooperative scheduler, which invalidates 23 phases of
digest stability. The absorber is the practical compromise.
### Trigger shape
The absorber fires ONLY at a kind mismatch of:
- canary[ic] = `import.call` with `payload.name == "RtlEnterCriticalSection"`
- ours[io] = `import.call` with `payload.name == "RtlLeaveCriticalSection"`
For any other kind mismatch, the absorber is silent. This narrowness is
intentional: real engine divergences appear in other shapes and must
still surface.
### Behavior
When the trigger pattern matches, canary's stream is scanned for one or
more balanced `[Enter-block, Leave-block]` pairs immediately following
the trigger position:
- An Enter-block is 3 consecutive events:
`import.call RtlEnterCriticalSection → kernel.call RtlEnterCriticalSection → kernel.return RtlEnterCriticalSection`.
- A Leave-block is 3 consecutive events with `RtlLeaveCriticalSection`.
The absorber consumes pairs greedily up to a cap of `_NESTED_CS_PAIR_CAP
= 32` pairs (empirically, Sylpheed's worst-case is ~10-15 pairs at the
104,607 cap). After consuming each pair, it checks whether canary's next
event has the SAME `kind` AND same `payload.name` as ours[io]. The first
convergence wins; canary's pointer is advanced past the absorbed pairs.
If no convergence is found within the cap, the absorber returns None
and the divergence falls through to normal reporting.
### Why this is safe (within #23's spirit)
1. The absorption only happens when canary's stream re-aligns with
ours's stream past the nested block. If it doesn't re-align, the
real divergence is reported.
2. The nested-block shape matches a specific PPC pattern: the consumer
thread in canary acquires a CS, calls a helper that iterates a
tree/registry, takes the nested-CS-enter path for each item, and
releases the outer CS. Ours's tree is shorter so it skips this.
The net effect on guest state is bounded: ours has fewer items
processed in this iteration, but the EVENT stream past the
absorption resumes the same logical operation.
3. The Phase B `image_loaded_sha256` is the foundational invariant.
It's unaffected by this absorber (no engine source change).
### Why this is NOT safe in the general sense
- Diverging downstream state IS lost: ours's tree has fewer entries
than canary's after the absorbed block. Subsequent ours operations
that touch the tree will behave differently. Other absorbers / fixes
will be needed if those state-differences manifest later.
- A future engine bug that produces a spuriously nested Enter+Leave
pair could be falsely absorbed. Mitigation: the absorber requires
canary's post-block stream to re-align with ours's; spurious nested
pairs without re-alignment fall through to normal divergence
reporting.
### Empirical result (Sylpheed 104,607 cap)
Pre-absorber (post-Stage-3+4): main matched-prefix = 104,607 (cap).
Post-absorber: main matched-prefix = **105,046 (+439 events)**.
The next divergence is at idx 105,046 on `VdInitializeEngines.return_value`
(canary=1, ours=0) — an unrelated engine bug in the video subsystem,
NOT a recurrence of the cap pattern. Sister chains preserved
(11/32/4/41/16).
### Tests
Three unit tests in `test_diff_events.py`:
- `test_nested_cs_cleanup_block_absorbed_when_convergent` — folds one nested pair
- `test_nested_cs_cleanup_NOT_absorbed_when_followup_diverges` — confirms re-alignment requirement
- `test_nested_cs_cleanup_NOT_absorbed_when_canary_has_no_followup` — negative case
## sema.release (v1.6 — added in AUDIT-069 Session 6, 2026-05-21)
### Motivation
AUDIT-069 Sessions 1-5 established that ours under-produces semaphore
releases by ~80% on the work-semaphore vs canary (`99 vs 414` in S5,
refined in S6 to `83 vs 414` apples-to-apples on the work semaphore
alone). The measurement infrastructure was a one-off cvar
(`audit_70_semaphore_release_watch`, hand-built per-handle log lines)
plus an ours-side `--lr-trace` capture at the wrapper-entry PC. Future
AUDIT-070+ sessions and any general regression triage need this metric
to be diff-visible without bespoke cvars per investigation.
`sema.release` lifts the AUDIT-070 cvar's signal into the Phase A
schema as a **symmetric** event kind in both engines.
### Event shape
```json
{
"schema_version": 1,
"engine": "canary",
"kind": "sema.release",
"tid": <guest tid of caller>,
"tid_event_idx": <per-tid ordinal consumes one slot>,
"guest_cycle": <PPC timebase>,
"host_ns": <emit timestamp>,
"deterministic": true,
"payload": {
"handle_semantic_id": "HHHHHHHHHHHHHHHH", // shared-global SID for the work-sem
"raw_handle_id": "0xHHHHHHHH", // engine-local
"release_count": 1, // games typically release 1
"previous_count": 0, // semaphore count BEFORE release
"caller_pc": "0xHHHHHHHH" // guest LR at release time
}
}
```
### SID recipe
The work-semaphore in Sylpheed (canary handle `0xF800003C`, ours
handle `0x1044`) is a **process-global dispatcher** in the C+18 sense:
it lives in pre-allocated guest memory and is touched by multiple
guest threads (main, worker, cache-thread, other producers). Its
`handle_semantic_id` SHOULD use the **shared-global recipe**
(`ComputeSharedGlobalSemanticId(dispatcher_ptr, kObjSemaphore=0x03)`)
so canary and ours produce the same SID for the same guest dispatcher.
Per-thread semaphores (rare in Sylpheed) MAY use the v1 per-thread
recipe; the diff tool does NOT compare SIDs for `sema.release` (the
kind is engine-local positionally — see below).
### Why engine-local
Per AUDIT-069 H3 and S6's first-N=20 measurement, the cadence and
ordinal interleaving of releases between the worker, main, and
cache-thread are **timing-dependent**: the first 20 releases match
perfectly across engines, but worker tid diverges at canary ord=83
when the cache-thread's first release fires (which ours never
emits because ours's cache-thread wedges at `sub_821CB030+0x1AC`).
Strict positional alignment would always trip on this known
divergence.
`sema.release` is therefore in `ENGINE_LOCAL_KINDS` in the diff tool
(alongside `contention.observed`): both engines emit, but the diff
tool advances past these events on either side without alignment.
The **count** is surfaced in the report's "Counted engine-local
kinds" summary table (per-tid + total per engine) so cadence
regressions are diff-visible at-a-glance.
### Emit points (planned, NOT yet wired)
- **Canary**: extend `audit_70_semaphore_release_watch` to call
`phase_a::EmitSemaRelease(handle, count, prev_count)` from
`NtReleaseSemaphore_entry` + `xeKeReleaseSemaphore`. Cvar gating
remains the existing `audit_70_semaphore_release_watch` (or a new
`phase_a_event_log_sema_releases=false` for finer control).
- **Ours**: emit `sema.release` from `nt_release_semaphore` in
`crates/xenia-kernel/src/exports.rs` and from
`KSemaphore::release` (kernel-mode equivalent). Default-off via a
runtime flag; default cold runs must remain digest-stable.
Both engines MUST emit at handler entry (not wrapper-internal) so the
event count corresponds 1:1 to guest `NtReleaseSemaphore` invocations,
matching the canary cvar's existing semantics.
### Status
- **Diff tool**: support landed (this session, v1.6). `sema.release`
in `ENGINE_LOCAL_KINDS` + `COUNTED_ENGINE_LOCAL_KINDS`; counts
surfaced in report summary; 3 new tests in `test_diff_events.py`.
- **Canary emit**: NOT YET WIRED. Planned for AUDIT-070+ when the
root cause investigation requires it. Existing cvar
`audit_70_semaphore_release_watch` continues to emit non-schema
log lines (used by S5/S6 captures).
- **Ours emit**: NOT YET WIRED. See above.
### Backward compatibility
- Wire format unchanged. `schema_version` is still `1`.
- Pre-v1.6 event logs (no `sema.release` events) trigger no new
behavior — the engine-local skip branches are inert; the
"Counted engine-local kinds" report section is suppressed when
no counted-kind events exist.
- Diff tool changes are purely additive: existing engine binaries
diff identically pre- and post-v1.6.
## Host-heap payload-field canonicalization (v1.7 — added in Phase C+22, 2026-05-26)
### Motivation
C+2 (`ALLOCATOR_RETURN_FNS`) canonicalizes `kernel.return.return_value`
for a known set of host-allocator-returning exports
(`MmAllocatePhysicalMemoryEx`, `RtlAllocateHeap`, …). That covers
the case where the allocated VA appears as the function's *return*
value. But the same allocator-drift class (AUDIT-043 ε:
canary's BC physical heap `0xBCxxxxxx` vs ours's unified user heap
`0x4xxxxxxx`) ALSO surfaces inside **typed event payloads** of
non-allocator exports — most notably the `thread.create.ctx_ptr`
field, which holds the host-allocated TLS/context block that
`ExCreateThread` passes to the new guest thread's r3.
Empirical surface (C+22 cold-vs-cold idx 105,128 on the Sylpheed
audio-stack worker `ExCreateThread(entry_pc=0x824cd458)`):
| field | canary | ours |
|---|---|---|
| `ctx_ptr` | `0xbe56bb3c` (BC physical heap) | `0x42453b3c` (unified user heap) |
| `entry_pc` | `0x824cd458` | `0x824cd458` (bit-identical — game code) |
| `priority` | `0` | `0` |
| `affinity` | `4` | `4` |
| `stack_size` | `32768` | `32768` |
| `suspended` | `false` | `false` |
The C+2 `ALLOCATOR_RETURN_FNS` mechanism doesn't help here because
`ExCreateThread`'s return value is the new thread's *handle*
(canary's `0xF8xxxxxx` vs ours's `0x4, 0x8, …`), already covered
by `handle_semantic_id` skip-policy. The host-heap-allocated
context block is a side-channel field inside the
`thread.create` event payload.
### The fix
`HOST_HEAP_PAYLOAD_FIELDS_BY_KIND` maps event kind → tuple of
payload field names. Each listed field's value (expected
`0x`-prefixed hex string) is rewritten to a per-(tid, kind, field)
ordinal sentinel `<HOSTHEAP_<KIND>_<FIELD>_<ORDINAL>>` BEFORE
payload comparison. The mechanism mirrors
`canonicalize_allocator_returns` exactly, restricted to typed
payload fields.
Initial set (v1.7):
```python
HOST_HEAP_PAYLOAD_FIELDS_BY_KIND = {
"thread.create": ("ctx_ptr",),
}
```
### Strict-field preservation
For each canonicalized event kind, the **strict** fields (game-visible
attributes that MUST match across engines) are untouched. For
`thread.create` these are:
- `entry_pc` — guest VA of the new thread's entry function, bit-
identical in both engines because both engines load the same XEX
and the entry comes from guest code.
- `priority`, `affinity`, `stack_size`, `suspended` — game-visible
thread attributes the guest passes to `ExCreateThread`.
Skip-policy fields (`handle_semantic_id`, `parent_tid`) continue
to be skipped via `SKIP_PAYLOAD_FIELDS_BY_KIND` (unchanged from
C+15-α — see "Diff-tool field-comparison rules" above).
### Why `parent_tid` does NOT need new canonicalization
Per the C+15-α skip-policy table, `parent_tid` is already in
`SKIP_PAYLOAD_FIELDS_BY_KIND["thread.create"]`. The diff tool
pairs guest TIDs at the chain level (`--tid-map` or
`auto_tid_map`), and the per-event `parent_tid` is engine-local
(canary tid=6 vs ours tid=1 for the same logical "main thread"
chain). Skipping is sufficient — no ordinal sentinel needed.
Could a future schema v2 canonicalize `parent_tid` via the tid
map? Yes, but it would surface mismatches as a *map gap* rather
than as a clearer per-tid alignment failure that's already
visible at chain boundaries. The v1.x skip-policy is the
simpler choice; tests pin the existing behavior so it doesn't
regress.
### Ordinal-count contract
As with `ALLOCATOR_RETURN_FNS`: if one engine emits MORE
`thread.create` events on a given tid than the other, ordinals
drift and the next typed event surfaces a divergence against
whatever the other side has at that position. Ordinal-count
mismatch IS a behavioral divergence — the canonicalization
preserves divergence detection, only collapsing
host-allocator-VA noise.
### Defensive value handling
If `ctx_ptr` is non-string (`None`, int, missing) — pre-C+22
event logs whose emitter omits the field — the canonicalizer
leaves it untouched and does NOT consume an ordinal. The next
string-typed value gets ordinal 0. This keeps pre-v1.7 logs
diffable without forcing an emitter retrofit.
### Backward compatibility
- Wire format unchanged. `schema_version` is still `1`.
- Pre-C+22 event logs whose `thread.create.ctx_ptr` happens to
bit-match (e.g. static-allocator addresses like `0x828F3D08`
that BOTH engines use for the pre-XEX kernel-state ctxs)
still match strictly via the ordinal sentinel — they get the
same ordinal in both engines.
- The `--no-canonicalize-host-heap-fields` CLI flag disables the
pass (reverts to raw-VA comparison), mirroring the existing
`--no-canonicalize-allocators`. Used by gate tests and
investigation rerun.
- Engine source is UNCHANGED in C+22 — the fix is in the diff
tool only.
### Extension shape
The map shape `kind -> (field, …)` is intentionally minimal:
each entry is one event kind plus the fields on it that hold
host-heap VAs. Future entries could include e.g.
`thread.create.tls_ptr` (if such a field is added to the schema)
or a hypothetical `vfs.mmap.host_ptr`. Strict-field policy
remains: any field NOT listed here is compared bit-identically.
## Forward compatibility
Phase A's original schema-v1 declared 13 sections (16 distinct kind strings);
Phase A wired 4 of them. Phase C+15-α wired an additional 5 (`handle.create`,
`handle.destroy`, `thread.create`, `thread.exit`, `wait.begin`). `wait.end`,
`thread.suspend/resume`, `mem.write`, `vfs.open/read/close` remain declared
but unwired; adding them is additive surface area at schema v1.1+.
A future schema v2 may break wire format (e.g. canonical SIDs, structured args).
Both engines pin `schema_version = 1` in this phase; the diff tool refuses to
mix v1 and v2 inputs.

View File

@@ -0,0 +1,116 @@
# Phase A — Validation record
All four acceptance gates from the plan have been executed against the patched canary (`build-cross/bin/Windows/Debug/xenia_canary.exe`) and ours (`target/release/xenia-rs`). Results below were captured on 2026-05-13.
## Gate 1: cvar-OFF determinism
### ours
- Pre-patch binary digest: `audit-runs/phase-a-diff-harness/digest-pre-patch.json` (captured from a copy of the binary made *before* applying the patch).
- Post-patch binary digest: `audit-runs/phase-a-diff-harness/digest-post-patch-cvaroff.json`.
- Both runs: `check --stable-digest -n 50000000` against the same ISO.
- Verification: `diff` of the two files produces zero output. Byte-identical. **PASS.**
### canary
- Pre-patch run: 12 s boot under Wine, `--mute=true`, log size **68 301 bytes**.
- Post-patch run with cvar unset: same conditions, log size **68 407 bytes**.
- Per-line diff of the two logs:
- Lines 19-20: two new entries in the CONFIG DUMP — `phase_a_event_log_path = ""` and `phase_a_event_log_mem_writes = false`. **Expected** — these are the two cvars we declared, both default to empty/false.
- Remaining differences: host-pointer values (`pid=0x...`, `graphics_system=0x...`, `native=0x...`) and millisecond timings (`Translated 5 shaders in 18 ms` vs `... in 13 ms`).
- Cross-check: re-ran the **same** post-patch binary a second time. Log size identical (68 407 bytes). Diff between two consecutive runs of the same binary shows the **same volume and nature** of host-pointer/timing changes — i.e. this is normal run-to-run jitter, not a behavioral change introduced by the patch.
- Smoke marker (`AUDIT-DEMO-SETUP-BEGIN`/`AUDIT-DEMO-SETUP-GRAPHICS-OK`) fires in both runs.
- **PASS.**
### Unit tests
`cargo test -p xenia-kernel event_log` — 2/2 tests pass:
- `fnv1a_known_vector` (FNV-1a 64-bit of `"foobar"` == `0x85944171f73967e8`, the standard FNV-1a test vector)
- `semantic_id_stable` (identity inputs produce identity output; distinct inputs produce distinct output)
## Gate 2: cvar-ON emits well-formed JSONL with schema_version header
### ours
```
$ head -1 audit-runs/phase-a-diff-harness/ours-sanity.jsonl
{"schema_version":1,"engine":"ours","kind":"schema_version","tid":0,"tid_event_idx":0,
"guest_cycle":0,"host_ns":48371,"deterministic":true,
"payload":{"version":1,"emitter_build":"ours-phaseA"}}
$ wc -l audit-runs/phase-a-diff-harness/ours-sanity.jsonl
121363
```
50 M-instruction run produced **121 363 valid JSONL events**.
### canary
```
$ head -1 audit-runs/phase-a-diff-harness/canary-sanity.jsonl
{"schema_version":1,"engine":"canary","kind":"schema_version","tid":0,"tid_event_idx":0,
"guest_cycle":0,"host_ns":300,"deterministic":true,
"payload":{"version":1,"emitter_build":"canary-phaseA"}}
$ wc -l audit-runs/phase-a-diff-harness/canary-sanity.jsonl
1635789
```
12 s Wine run produced **1 635 789 valid JSONL events**. (Volume differential vs ours reflects canary's debug build with full kernel-call logging at every shim trampoline; both engines pin schema_version=1.)
**Both files lead with a `schema_version` event. PASS.**
## Gate 3: diff tool finds matching prefix on tid=1
Ran `tools/diff-events/diff_events.py` on the two sanity files with auto-mapping:
```
| canary_tid | ours_tid | matched | canary_total | ours_total | first_divergence_at |
| 6 | 1 | 113 | 313196 | 108492 | 113 |
| 4 | 11 | 5 | 25163 | 9 | 5 |
| 7 | 2 | 2 | 29 | 33 | 2 |
| 12 | 7 | 2 | 2846 | 3 | 2 |
| 14 | 9 | 11 | 587000 | 75 | 11 |
| 15 | 10 | 15 | 355601 | 15 | — |
```
The primary boot-thread pair (`canary_tid=6``ours_tid=1`) matched **113 events** before the first divergence — well over the ≥100 threshold required by the gate. **PASS.**
The full per-thread report is at `diff-report.md`. Per Phase A discipline, those divergences are NOT analyzed in this session; they are input for Phase B.
## Gate 4: Negative test detects a hand-corrupted event
```
# Self-diff of identical files — clean exit
$ python3 tools/diff-events/diff_events.py \
--canary /tmp/ours-short.jsonl --ours /tmp/ours-short.jsonl --validate-identical
$ echo $?
0
# Corrupted "kernel.call" -> "kernel.CORRUPT" on a tid=1 kernel.call event
$ python3 tools/diff-events/diff_events.py \
--canary /tmp/ours-short.jsonl --ours /tmp/ours-corrupt.jsonl --validate-identical
$ echo $?
1
```
The diff report names the divergence at the right index:
```
First divergence at `tid_event_idx=4`: kind: canary='kernel.call' ours='kernel.CORRUPT'
```
A second corruption further down the file (line 51, `tid_event_idx=49`) was also detected. **PASS.**
## Summary
| Gate | Status |
|---|---|
| 1. Cvar-OFF determinism (both engines) | ✅ |
| 2. Cvar-ON emits valid JSONL with schema_version header (both engines) | ✅ |
| 3. Diff tool reports ≥100-event matching prefix on tid=1 → divergence at idx 113 | ✅ |
| 4. Negative test (corrupt one event) → exit 1, correct `tid_event_idx` named | ✅ |
Cascade prediction at session close (harness signals only):
- A (infrastructure builds, cvar-OFF zero overhead): **achieved**.
- B (cvar-ON emits valid JSONL both engines): **achieved**.
- C (sanity validation 4-gate passes first try): **achieved on the first complete run**, modulo a transient build-time issue (CMake `xe_platform_sources` is non-incremental for new `.cc` files in canary — needed a `cmake --preset cross-win-clangcl` reconfigure).
- D (fix lands): **N/A — out of scope for Phase A.**