handoff: VSync/event-wedge fixes + iterate 2.A–2.BC research notes

Source changes (dormant parity infra, retained from iterate 2.AI/2.AO):
- xenia-kernel/exports.rs: nt_create_event manual_reset polarity +
  related event wiring
- xenia-gpu/mmio_region.rs: D1MODE_VBLANK_VLINE_STATUS hardcode parity

Also lands the audit-runs/ analysis notes (.md/.txt/.json digests) for the
iterate 2.x VSync/0x10e8/0x1004 wedge investigation. Raw trace dumps
(.jsonl/.gz/.csv/.stdout) and agent worktrees (.claude/) are gitignored as
regenerable local artifacts — see memory + HANDOFF for the running findings.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
MechaCat02
2026-06-05 07:19:08 +02:00
parent acd1656753
commit ef93a4fa14
620 changed files with 108303 additions and 1 deletions


@@ -0,0 +1,304 @@
diff --git a/src/xenia/cpu/cpu_flags.cc b/src/xenia/cpu/cpu_flags.cc
index 3ff067e15..e6f412f91 100644
--- a/src/xenia/cpu/cpu_flags.cc
+++ b/src/xenia/cpu/cpu_flags.cc
@@ -57,3 +57,110 @@ DEFINE_bool(break_condition_truncate, true, "truncate value to 32-bits", "CPU");
DEFINE_bool(break_on_debugbreak, true, "int3 on JITed __debugbreak requests.",
"CPU");
+
+// AUDIT-DEMO: smoke marker (memory entry: emulator.cc:225,283). Always-on bool.
+DEFINE_bool(audit_demo_setup_trace, true,
+ "Audit smoke marker: log AUDIT-DEMO-SETUP-BEGIN at emulator setup.",
+ "Audit");
+
+// AUDIT-061: comma-separated list of guest PCs to log on each fire.
+// Format: "0xPC1,0xPC2,..." (max 32 PCs). Each fire emits
+// AUDIT-061-BR pc=X lr=X cr0=LGE cr6=LGE r3=X r4=X r5=X r6=X r31=X tid=N.
+// Default empty (off); no perf cost when empty.
+DEFINE_string(audit_61_branch_probe_pcs, "",
+ "AUDIT-061: CSV of guest PCs to trace (cr0/cr6 + regs/tid).",
+ "Audit");
+
+// AUDIT-067: comma-separated list of u32 values to watch. When non-empty,
+// every 4-byte guest store (stw/stwu/stwx/stwux/stmw) emits a runtime
+// equality check; matches log AUDIT-067-VAL pc=X lr=X val=X dst=X r3..r6 r31 tid=N.
+// Max 4 values. Default empty (off); zero overhead when empty.
+DEFINE_string(audit_67_value_watch, "",
+ "AUDIT-067: CSV of u32 values (max 4) — log every guest "
+ "store whose value matches.",
+ "Audit");
+
+// AUDIT-068: host-side memory-write watch. See cpu_flags.h header for format.
+// Mirrors AUDIT-067 but covers host-side writes (xe::store_and_swap<T>,
+// Memory::Zero/Fill/Copy). Empty default = zero cost.
+DEFINE_string(audit_68_host_mem_watch_values, "",
+ "AUDIT-068: CSV of u32 values (max 8) — log every host-side "
+ "guest-memory write whose value matches.",
+ "Audit");
+DEFINE_string(audit_68_host_mem_watch_addrs, "",
+ "AUDIT-068: CSV of guest VAs or VA ranges 'START-END' (max 8) "
+ "— log every host-side guest-memory write whose guest VA falls "
+ "within the configured set.",
+ "Audit");
+
+// AUDIT-068 Session 3: read-mode probe. See cpu_flags.h for format.
+DEFINE_string(audit_68_host_mem_read_probe, "",
+ "AUDIT-068 Session 3: CSV of 'VA:SIZE:PERIOD_NS' tuples (max 8) "
+ "— a dedicated poll thread reads the value at each VA every "
+ "PERIOD_NS and emits AUDIT-068-READ-CHANGE on transition.",
+ "Audit");
+
+// AUDIT-069: see cpu_flags.h header. Empty default = zero cost.
+DEFINE_string(audit_69_event_signal_watch, "",
+ "AUDIT-069: CSV of guest event-handle IDs (max 4) — log each "
+ "XEvent::Set / Ke*Event / Nt*Event fire whose target matches.",
+ "Audit");
+DEFINE_string(audit_69_event_signal_native_ptr, "",
+ "AUDIT-069: CSV of guest event native VAs (X_KEVENT*) (max 4) "
+ "— log each set fire whose native pointer matches.",
+ "Audit");
+DEFINE_bool(audit_69_log_all_sets, false,
+ "AUDIT-069: when true, log EVERY XEvent::Set/Pulse fire (used "
+ "for one-run wait→signal correlation across handle drift). "
+ "Default false; use only with --mute=true.",
+ "Audit");
+
+// AUDIT-070 (S5 of AUDIT-069 family): semaphore-release watch. See header.
+DEFINE_string(audit_70_semaphore_release_watch, "",
+ "AUDIT-070: CSV of guest semaphore handle IDs (max 4) — log "
+ "each NtReleaseSemaphore / xeKeReleaseSemaphore fire whose "
+ "target matches.",
+ "Audit");
+DEFINE_bool(audit_70_log_all_releases, false,
+ "AUDIT-070: when true, log EVERY NtReleaseSemaphore / "
+ "xeKeReleaseSemaphore fire (used to identify the work-semaphore "
+ "handle on first run). Default false; use only with --mute=true.",
+ "Audit");
+
+// Phase A — see kernel/event_log.h.
+DEFINE_string(phase_a_event_log_path, "",
+ "Phase A: write schema-v1 JSONL event log to this path. "
+ "Empty (default) = disabled.",
+ "Audit");
+DEFINE_bool(phase_a_event_log_mem_writes, false,
+ "Phase A: include mem.write events in the JSONL log. RESERVED — "
+ "not wired in this phase. Default false.",
+ "Audit");
+
+// Phase D Stage 1 — see kernel/event_log.h `EmitContentionObserved`.
+DEFINE_bool(kernel_emit_contention, false,
+ "Phase D Stage 1: emit `contention.observed` events when "
+ "RtlEnterCriticalSection's spin loop is exhausted and the call "
+ "falls through to xeKeWaitForSingleObject. Default false (zero "
+ "cost when disabled). Requires --phase_a_event_log_path to be "
+ "set as well.",
+ "Audit");
+
+// Phase B — see kernel/phase_b_snapshot.h.
+DEFINE_string(phase_b_snapshot_dir, "",
+ "Phase B: write 5-file structured state snapshot to "
+ "<dir>/canary/ at the moment immediately before the first "
+ "guest PPC instruction of entry_point. Empty (default) = "
+ "disabled, zero overhead.",
+ "Audit");
+DEFINE_bool(phase_b_snapshot_and_exit, false,
+ "Phase B: after writing the snapshot, exit the process "
+ "immediately (std::_Exit(0)) so re-runs are byte-deterministic.",
+ "Audit");
+DEFINE_bool(phase_b_dump_section_content, false,
+ "Phase B: in memory.json, populate section_contents[].content_b64 "
+ "with raw bytes of every committed XEX-image region. Default "
+ "false — per-region SHA-256 is enough for the routine diff; "
+ "this is the escape hatch for the STOP-and-report condition "
+ "(image_loaded_sha256 mismatch).",
+ "Audit");
diff --git a/src/xenia/cpu/cpu_flags.h b/src/xenia/cpu/cpu_flags.h
index 38c4f98ba..95fe8cb22 100644
--- a/src/xenia/cpu/cpu_flags.h
+++ b/src/xenia/cpu/cpu_flags.h
@@ -35,4 +35,76 @@ DECLARE_bool(break_condition_truncate);
DECLARE_bool(break_on_debugbreak);
+// AUDIT-DEMO smoke marker.
+DECLARE_bool(audit_demo_setup_trace);
+
+// AUDIT-061: multi-PC branch probe — emits one log line per fire with
+// (pc, lr, cr0 LGE, cr6 LGE, r3, r4, r5, r6, r31, tid). CSV of guest PCs.
+DECLARE_string(audit_61_branch_probe_pcs);
+
+// AUDIT-067: value-watch — emit a log line for each 32-bit guest store whose
+// value-to-be-stored matches any configured value. CSV of u32 values
+// ("0xDEADBEEF,..."), max 4 entries. Default empty (off); zero cost when empty.
+DECLARE_string(audit_67_value_watch);
+
+// AUDIT-068: host-side memory-write watch — emit a log line for each host-side
+// write to guest memory whose VALUE matches any configured u32 value, or whose
+// guest VA falls within any configured ADDR or ADDR-range. Mirrors AUDIT-067
+// but covers the host-side write paths (xe::store_and_swap<T>, Memory::Zero/
+// Fill/Copy) that AUDIT-067's JIT store-opcode hooks cannot see.
+//
+// VALUES: CSV of u32 values, max 8 entries; e.g. "0x8200A208,0x8200A928".
+// ADDRS: CSV of guest VAs or VA ranges, max 8 entries; range form is
+// "0xSTART-0xEND" (inclusive). e.g. "0x42500000-0x42600000,0xBCE25340".
+// Default empty (off); zero cost on the hot path when both are empty.
+DECLARE_string(audit_68_host_mem_watch_values);
+DECLARE_string(audit_68_host_mem_watch_addrs);
+
+// AUDIT-068 Session 3: read-mode probe. CSV of "VA:SIZE:PERIOD_NS" tuples
+// (max 8). A dedicated low-priority thread polls each VA every PERIOD_NS and
+// emits AUDIT-068-READ-CHANGE when the value transitions. SIZE in {1,2,4,8}.
+// Example: "0xBCE25340:4:1000000" = poll u32 at 0xBCE25340 every 1 ms.
+// Default empty (off); the poll thread is not spawned when empty.
+DECLARE_string(audit_68_host_mem_read_probe);
+
+// AUDIT-069: event-signal watch. CSV of guest handle IDs (e.g. "0xF8000098")
+// to log on every XEvent::Set / KeSetEvent / NtSetEvent / KePulseEvent /
+// NtPulseEvent fire whose target matches. Max 4 entries. Default empty (off);
+// zero cost on the hot path when empty.
+DECLARE_string(audit_69_event_signal_watch);
+// AUDIT-069: event-signal watch by native guest VA (X_KEVENT*). CSV of guest
+// VAs (max 4). Default empty (off). Use when the handle id varies across
+// boots but the native dispatcher pointer is stable.
+DECLARE_string(audit_69_event_signal_native_ptr);
+// AUDIT-069: when true, log EVERY XEvent::Set / XEvent::Pulse fire (subject
+// to the slowpath gate). Use only with --mute=true and short windows — high
+// volume. Default false (off).
+DECLARE_bool(audit_69_log_all_sets);
+
+// AUDIT-070 (S5 of AUDIT-069 family): semaphore-release watch. CSV of guest
+// handle IDs (e.g. "0xF8000098") to log on every NtReleaseSemaphore /
+// xeKeReleaseSemaphore fire whose target matches. Max 4 entries. Default
+// empty (off); zero cost on the hot path when empty.
+DECLARE_string(audit_70_semaphore_release_watch);
+// AUDIT-070: when true, log EVERY NtReleaseSemaphore / xeKeReleaseSemaphore
+// fire. Use only with --mute=true and short windows — used to identify the
+// canary work-semaphore handle on first run. Default false (off).
+DECLARE_bool(audit_70_log_all_releases);
+
+// Phase A: JSONL event-log emitter path. When non-empty, the engine writes
+// schema-v1 JSONL events to this file. Empty (default) = no overhead, no
+// behavior change. Schema: xenia-rs/audit-runs/phase-a-diff-harness/schema-v1.md
+DECLARE_string(phase_a_event_log_path);
+DECLARE_bool(phase_a_event_log_mem_writes);
+
+// Phase B: initial-state snapshot. When the dir cvar is non-empty, the
+// engine writes a five-file structured state snapshot (cpu_state.json,
+// memory.json, kernel.json, vfs.json, config.json, plus manifest.json) to
+// `<dir>/canary/` at the moment immediately before the first guest PPC
+// instruction of the XEX entry_point executes. See
+// `xenia-rs/audit-runs/phase-b-state-equivalence/`.
+DECLARE_string(phase_b_snapshot_dir);
+DECLARE_bool(phase_b_snapshot_and_exit);
+DECLARE_bool(phase_b_dump_section_content);
+
#endif // XENIA_CPU_CPU_FLAGS_H_
diff --git a/src/xenia/kernel/xboxkrnl/xboxkrnl_threading.cc b/src/xenia/kernel/xboxkrnl/xboxkrnl_threading.cc
index ced21a600..e1c74d7ec 100644
--- a/src/xenia/kernel/xboxkrnl/xboxkrnl_threading.cc
+++ b/src/xenia/kernel/xboxkrnl/xboxkrnl_threading.cc
@@ -12,6 +12,8 @@
#include "xenia/base/clock.h"
#include "xenia/base/platform.h"
#include "xenia/cpu/processor.h"
+#include "xenia/kernel/audit_70_semaphore_release_watch.h"
+#include "xenia/kernel/event_log.h"
#include "xenia/kernel/util/shim_utils.h"
#include "xenia/kernel/xboxkrnl/xboxkrnl_private.h"
#include "xenia/kernel/xsemaphore.h"
@@ -147,6 +149,25 @@ uint32_t ExCreateThread(xe::be<uint32_t>* handle_ptr, uint32_t stack_size,
if (thread_id_ptr) {
*thread_id_ptr = thread->thread_id();
}
+ // Phase C+15-α: schema-v1 `thread.create` event. Symmetric with
+ // ours's `ex_create_thread`. Emitted by the **parent** thread.
+ // handle.create for the thread handle itself was already emitted
+ // via ObjectTable::AddHandle inside XThread::Create. Here we
+ // surface the spawn-specific metadata.
+ if (phase_a::IsEnabled()) {
+ uint64_t sid = phase_a::LookupHandleSemanticId(thread->handle());
+ XThread* parent = XThread::TryGetCurrentThread();
+ uint32_t parent_tid = 0;
+ if (parent) {
+ parent_tid = static_cast<uint32_t>(
+ parent->guest_object<X_KTHREAD>()->thread_id);
+ }
+ uint32_t affinity = (creation_flags >> 24) & 0xFF;
+ bool suspended = (creation_flags & 0x1) != 0;
+ phase_a::EmitThreadCreate(sid, parent_tid, start_address, start_context,
+ /* priority */ 0, affinity, actual_stack_size,
+ suspended);
+ }
}
return result;
}
@@ -165,6 +186,9 @@ DECLARE_XBOXKRNL_EXPORT1(ExCreateThread, kThreading, kImplemented);
uint32_t ExTerminateThread(uint32_t exit_code) {
XThread* thread = XThread::GetCurrentThread();
+ // Phase C+15-α: schema-v1 `thread.exit` is emitted inside
+ // `XThread::Exit` (covers both explicit ExTerminateThread and
+ // implicit thread-entry returns).
// NOTE: this kills us right now. We won't return from it.
return thread->Exit(exit_code);
@@ -718,6 +742,9 @@ uint32_t xeKeReleaseSemaphore(X_KSEMAPHORE* semaphore_ptr, uint32_t increment,
int32_t previous_count = 0;
[[maybe_unused]] bool success =
sem->ReleaseSemaphore(adjustment, &previous_count);
+ // AUDIT-070: log Ke-form release fires whose target handle matches.
+ audit_70::check_release(sem->handle(), "xeKeReleaseSemaphore",
+ static_cast<int32_t>(adjustment), previous_count);
return static_cast<uint32_t>(previous_count);
}
@@ -786,6 +813,13 @@ dword_result_t NtReleaseSemaphore_entry(dword_t sem_handle,
uint32_t(release_count), previous_count);
result = X_STATUS_SEMAPHORE_LIMIT_EXCEEDED;
}
+ // AUDIT-070: log Nt-form release fires whose target handle matches.
+ // Logged regardless of success/limit-exceeded — distinguished by
+ // result/previous_count in subsequent analysis.
+ audit_70::check_release(static_cast<uint32_t>(sem_handle),
+ "NtReleaseSemaphore",
+ static_cast<int32_t>(release_count),
+ previous_count);
} else {
result = X_STATUS_INVALID_HANDLE;
}
@@ -954,6 +988,19 @@ uint32_t xeKeWaitForSingleObject(void* object_ptr, uint32_t wait_reason,
return X_STATUS_ABANDONED_WAIT_0;
}
+ // Phase C+15-α: schema-v1 `wait.begin` event. Symmetric with ours's
+ // `ke_wait_for_single_object`. Resolve the SID via the object's
+ // first registered handle.
+ if (phase_a::IsEnabled()) {
+ uint64_t sid = 0;
+ if (!object->handles().empty()) {
+ sid = phase_a::LookupHandleSemanticId(object->handles()[0]);
+ }
+ int64_t timeout_ns = timeout_ptr ? (static_cast<int64_t>(*timeout_ptr) * 100) : -1;
+ phase_a::EmitWaitBegin(&sid, 1, timeout_ns, alertable != 0,
+ /* wait_all */ false);
+ }
+
X_STATUS result =
object->Wait(wait_reason, processor_mode, alertable, timeout_ptr);
if (alertable) {
@@ -980,6 +1027,16 @@ uint32_t NtWaitForSingleObjectEx(uint32_t object_handle, uint32_t wait_mode,
uint32_t alertable, uint64_t* timeout_ptr) {
X_STATUS result = X_STATUS_SUCCESS;
+ // Phase C+15-α: schema-v1 `wait.begin` event. Symmetric with ours's
+ // `nt_wait_for_single_object_ex`. Resolve SID directly from the
+ // handle.
+ if (phase_a::IsEnabled()) {
+ uint64_t sid = phase_a::LookupHandleSemanticId(object_handle);
+ int64_t timeout_ns = timeout_ptr ? (static_cast<int64_t>(*timeout_ptr) * 100) : -1;
+ phase_a::EmitWaitBegin(&sid, 1, timeout_ns, alertable != 0,
+ /* wait_all */ false);
+ }
+
auto object =
kernel_state()->object_table()->LookupObject<XObject>(object_handle);
if (object) {


@@ -0,0 +1,206 @@
diff --git a/src/xenia/cpu/cpu_flags.cc b/src/xenia/cpu/cpu_flags.cc
index 3ff067e15..e024bfb26 100644
--- a/src/xenia/cpu/cpu_flags.cc
+++ b/src/xenia/cpu/cpu_flags.cc
@@ -57,3 +57,98 @@ DEFINE_bool(break_condition_truncate, true, "truncate value to 32-bits", "CPU");
DEFINE_bool(break_on_debugbreak, true, "int3 on JITed __debugbreak requests.",
"CPU");
+
+// AUDIT-DEMO: smoke marker (memory entry: emulator.cc:225,283). Always-on bool.
+DEFINE_bool(audit_demo_setup_trace, true,
+ "Audit smoke marker: log AUDIT-DEMO-SETUP-BEGIN at emulator setup.",
+ "Audit");
+
+// AUDIT-061: comma-separated list of guest PCs to log on each fire.
+// Format: "0xPC1,0xPC2,..." (max 32 PCs). Each fire emits
+// AUDIT-061-BR pc=X lr=X cr0=LGE cr6=LGE r3=X r4=X r5=X r6=X r31=X tid=N.
+// Default empty (off); no perf cost when empty.
+DEFINE_string(audit_61_branch_probe_pcs, "",
+ "AUDIT-061: CSV of guest PCs to trace (cr0/cr6 + regs/tid).",
+ "Audit");
+
+// AUDIT-067: comma-separated list of u32 values to watch. When non-empty,
+// every 4-byte guest store (stw/stwu/stwx/stwux/stmw) emits a runtime
+// equality check; matches log AUDIT-067-VAL pc=X lr=X val=X dst=X r3..r6 r31 tid=N.
+// Max 4 values. Default empty (off); zero overhead when empty.
+DEFINE_string(audit_67_value_watch, "",
+ "AUDIT-067: CSV of u32 values (max 4) — log every guest "
+ "store whose value matches.",
+ "Audit");
+
+// AUDIT-068: host-side memory-write watch. See cpu_flags.h header for format.
+// Mirrors AUDIT-067 but covers host-side writes (xe::store_and_swap<T>,
+// Memory::Zero/Fill/Copy). Empty default = zero cost.
+DEFINE_string(audit_68_host_mem_watch_values, "",
+ "AUDIT-068: CSV of u32 values (max 8) — log every host-side "
+ "guest-memory write whose value matches.",
+ "Audit");
+DEFINE_string(audit_68_host_mem_watch_addrs, "",
+ "AUDIT-068: CSV of guest VAs or VA ranges 'START-END' (max 8) "
+ "— log every host-side guest-memory write whose guest VA falls "
+ "within the configured set.",
+ "Audit");
+
+// AUDIT-068 Session 3: read-mode probe. See cpu_flags.h for format.
+DEFINE_string(audit_68_host_mem_read_probe, "",
+ "AUDIT-068 Session 3: CSV of 'VA:SIZE:PERIOD_NS' tuples (max 8) "
+ "— a dedicated poll thread reads the value at each VA every "
+ "PERIOD_NS and emits AUDIT-068-READ-CHANGE on transition.",
+ "Audit");
+
+// AUDIT-069: see cpu_flags.h header. Empty default = zero cost.
+DEFINE_string(audit_69_event_signal_watch, "",
+ "AUDIT-069: CSV of guest event-handle IDs (max 4) — log each "
+ "XEvent::Set / Ke*Event / Nt*Event fire whose target matches.",
+ "Audit");
+DEFINE_string(audit_69_event_signal_native_ptr, "",
+ "AUDIT-069: CSV of guest event native VAs (X_KEVENT*) (max 4) "
+ "— log each set fire whose native pointer matches.",
+ "Audit");
+DEFINE_bool(audit_69_log_all_sets, false,
+ "AUDIT-069: when true, log EVERY XEvent::Set/Pulse fire (used "
+ "for one-run wait→signal correlation across handle drift). "
+ "Default false; use only with --mute=true.",
+ "Audit");
+
+// Phase A — see kernel/event_log.h.
+DEFINE_string(phase_a_event_log_path, "",
+ "Phase A: write schema-v1 JSONL event log to this path. "
+ "Empty (default) = disabled.",
+ "Audit");
+DEFINE_bool(phase_a_event_log_mem_writes, false,
+ "Phase A: include mem.write events in the JSONL log. RESERVED — "
+ "not wired in this phase. Default false.",
+ "Audit");
+
+// Phase D Stage 1 — see kernel/event_log.h `EmitContentionObserved`.
+DEFINE_bool(kernel_emit_contention, false,
+ "Phase D Stage 1: emit `contention.observed` events when "
+ "RtlEnterCriticalSection's spin loop is exhausted and the call "
+ "falls through to xeKeWaitForSingleObject. Default false (zero "
+ "cost when disabled). Requires --phase_a_event_log_path to be "
+ "set as well.",
+ "Audit");
+
+// Phase B — see kernel/phase_b_snapshot.h.
+DEFINE_string(phase_b_snapshot_dir, "",
+ "Phase B: write 5-file structured state snapshot to "
+ "<dir>/canary/ at the moment immediately before the first "
+ "guest PPC instruction of entry_point. Empty (default) = "
+ "disabled, zero overhead.",
+ "Audit");
+DEFINE_bool(phase_b_snapshot_and_exit, false,
+ "Phase B: after writing the snapshot, exit the process "
+ "immediately (std::_Exit(0)) so re-runs are byte-deterministic.",
+ "Audit");
+DEFINE_bool(phase_b_dump_section_content, false,
+ "Phase B: in memory.json, populate section_contents[].content_b64 "
+ "with raw bytes of every committed XEX-image region. Default "
+ "false — per-region SHA-256 is enough for the routine diff; "
+ "this is the escape hatch for the STOP-and-report condition "
+ "(image_loaded_sha256 mismatch).",
+ "Audit");
diff --git a/src/xenia/cpu/cpu_flags.h b/src/xenia/cpu/cpu_flags.h
index 38c4f98ba..cf5719b8b 100644
--- a/src/xenia/cpu/cpu_flags.h
+++ b/src/xenia/cpu/cpu_flags.h
@@ -35,4 +35,66 @@ DECLARE_bool(break_condition_truncate);
DECLARE_bool(break_on_debugbreak);
+// AUDIT-DEMO smoke marker.
+DECLARE_bool(audit_demo_setup_trace);
+
+// AUDIT-061: multi-PC branch probe — emits one log line per fire with
+// (pc, lr, cr0 LGE, cr6 LGE, r3, r4, r5, r6, r31, tid). CSV of guest PCs.
+DECLARE_string(audit_61_branch_probe_pcs);
+
+// AUDIT-067: value-watch — emit a log line for each 32-bit guest store whose
+// value-to-be-stored matches any configured value. CSV of u32 values
+// ("0xDEADBEEF,..."), max 4 entries. Default empty (off); zero cost when empty.
+DECLARE_string(audit_67_value_watch);
+
+// AUDIT-068: host-side memory-write watch — emit a log line for each host-side
+// write to guest memory whose VALUE matches any configured u32 value, or whose
+// guest VA falls within any configured ADDR or ADDR-range. Mirrors AUDIT-067
+// but covers the host-side write paths (xe::store_and_swap<T>, Memory::Zero/
+// Fill/Copy) that AUDIT-067's JIT store-opcode hooks cannot see.
+//
+// VALUES: CSV of u32 values, max 8 entries; e.g. "0x8200A208,0x8200A928".
+// ADDRS: CSV of guest VAs or VA ranges, max 8 entries; range form is
+// "0xSTART-0xEND" (inclusive). e.g. "0x42500000-0x42600000,0xBCE25340".
+// Default empty (off); zero cost on the hot path when both are empty.
+DECLARE_string(audit_68_host_mem_watch_values);
+DECLARE_string(audit_68_host_mem_watch_addrs);
+
+// AUDIT-068 Session 3: read-mode probe. CSV of "VA:SIZE:PERIOD_NS" tuples
+// (max 8). A dedicated low-priority thread polls each VA every PERIOD_NS and
+// emits AUDIT-068-READ-CHANGE when the value transitions. SIZE in {1,2,4,8}.
+// Example: "0xBCE25340:4:1000000" = poll u32 at 0xBCE25340 every 1 ms.
+// Default empty (off); the poll thread is not spawned when empty.
+DECLARE_string(audit_68_host_mem_read_probe);
+
+// AUDIT-069: event-signal watch. CSV of guest handle IDs (e.g. "0xF8000098")
+// to log on every XEvent::Set / KeSetEvent / NtSetEvent / KePulseEvent /
+// NtPulseEvent fire whose target matches. Max 4 entries. Default empty (off);
+// zero cost on the hot path when empty.
+DECLARE_string(audit_69_event_signal_watch);
+// AUDIT-069: event-signal watch by native guest VA (X_KEVENT*). CSV of guest
+// VAs (max 4). Default empty (off). Use when the handle id varies across
+// boots but the native dispatcher pointer is stable.
+DECLARE_string(audit_69_event_signal_native_ptr);
+// AUDIT-069: when true, log EVERY XEvent::Set / XEvent::Pulse fire (subject
+// to the slowpath gate). Use only with --mute=true and short windows — high
+// volume. Default false (off).
+DECLARE_bool(audit_69_log_all_sets);
+
+// Phase A: JSONL event-log emitter path. When non-empty, the engine writes
+// schema-v1 JSONL events to this file. Empty (default) = no overhead, no
+// behavior change. Schema: xenia-rs/audit-runs/phase-a-diff-harness/schema-v1.md
+DECLARE_string(phase_a_event_log_path);
+DECLARE_bool(phase_a_event_log_mem_writes);
+
+// Phase B: initial-state snapshot. When the dir cvar is non-empty, the
+// engine writes a five-file structured state snapshot (cpu_state.json,
+// memory.json, kernel.json, vfs.json, config.json, plus manifest.json) to
+// `<dir>/canary/` at the moment immediately before the first guest PPC
+// instruction of the XEX entry_point executes. See
+// `xenia-rs/audit-runs/phase-b-state-equivalence/`.
+DECLARE_string(phase_b_snapshot_dir);
+DECLARE_bool(phase_b_snapshot_and_exit);
+DECLARE_bool(phase_b_dump_section_content);
+
#endif // XENIA_CPU_CPU_FLAGS_H_
diff --git a/src/xenia/kernel/xevent.cc b/src/xenia/kernel/xevent.cc
index b583bf732..f8bf47952 100644
--- a/src/xenia/kernel/xevent.cc
+++ b/src/xenia/kernel/xevent.cc
@@ -11,6 +11,7 @@
#include "xenia/base/byte_stream.h"
#include "xenia/base/logging.h"
+#include "xenia/kernel/audit_69_event_signal_watch.h"
namespace xe {
namespace kernel {
@@ -58,12 +59,19 @@ void XEvent::InitializeNative(void* native_ptr, X_DISPATCH_HEADER* header) {
}
int32_t XEvent::Set(uint32_t priority_increment, bool wait) {
+ // AUDIT-069: log event-signal fires whose target matches the configured
+ // handle ID or native VA. Hot path is a single relaxed atomic load when
+ // the cvars are empty (default).
+ audit_69::check_event_set(this->handle(), this->guest_object(),
+ "XEvent::Set");
set_priority_increment(priority_increment);
event_->Set();
return 1;
}
int32_t XEvent::Pulse(uint32_t priority_increment, bool wait) {
+ audit_69::check_event_set(this->handle(), this->guest_object(),
+ "XEvent::Pulse");
set_priority_increment(priority_increment);
event_->Pulse();
return 1;


@@ -0,0 +1,143 @@
# AUDIT-069 Session 3 — handle-sequence diff (ours tid=5 vs canary tid=10)
Both engines run the γ-signaler family on the same guest thread (entry=0x82450A28, ctx=0x828F3B68).
ours labels this thread tid=5; canary labels it tid=10 (a cross-engine tid mismatch, see AUDIT-068 reading-error #28).
## Fire-count summary
| caller LR | symbol | wrapper PC | ours fires | canary fires | ratio |
|---|---|---|---|---|---|
| 0x8245DA44 | γ-D-A (sub_8245D9D8) | 0x824AA2F0 (NtSetEvent) | 5 | 23 | 22% |
| 0x8245DB08 | γ-D-B (sub_8245DA78) | 0x824AA2F0 (NtSetEvent) | 1 | 8 | 12% |
| 0x8245DC5C | γ-DB40 (sub_8245DB40) | 0x824AAF50 (Ke wrapper) | 75 | 461 | 16% |
| **TOTAL tid=5/tid=10 signaler work** | | | **81** | **492** | **16%** |
**Headline divergence**: ours completes ~16% of canary's producer-loop iterations.
Not (only) "wrong handles" — ours produces FAR fewer signals.
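
The percentages in the table can be re-derived from the raw fire counts. The following is a minimal C++ sketch, not code from either engine: the `FireCounts` struct and the helper names are hypothetical, and only the counts themselves are taken from the table above.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Fire counts for one caller LR, copied from the summary table.
// (Struct and function names are illustrative, not from the codebase.)
struct FireCounts {
  std::uint32_t ours;
  std::uint32_t canary;
};

// Sum the per-LR rows into a single total row.
inline FireCounts Total(const std::vector<FireCounts>& rows) {
  FireCounts sum{0, 0};
  for (const FireCounts& r : rows) {
    sum.ours += r.ours;
    sum.canary += r.canary;
  }
  return sum;
}

// Fraction of canary's fires that ours reached.
inline double ShareOfCanary(FireCounts c) {
  assert(c.canary != 0 && "canary count must be non-zero");
  return static_cast<double>(c.ours) / static_cast<double>(c.canary);
}

// The three tid=5/tid=10 signaler rows: gamma-D-A, gamma-D-B, gamma-DB40.
inline std::vector<FireCounts> SignalerRows() {
  return {{5, 23}, {1, 8}, {75, 461}};
}
```

With these inputs, `Total(SignalerRows())` gives 81 and 492, and `ShareOfCanary` of that total is about 0.165, which means canary fires roughly six times as often as ours.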
## Per-LR position-aligned sequence (handle = r3)
Note: ours uses the normal slot-id handle namespace (0x10xx); canary uses the pseudo-handle namespace (0xF8000xxx).
Handles therefore cannot be compared by raw ID. Compare by position in the per-LR sequence and by call args (size in r5).
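
The position-aligned comparison described above can be sketched as a small helper. This is an illustrative C++17 sketch under the assumption that each trace has already been reduced to one comparable argument per fire (for example the r5 size column); `FirstMismatch` is a hypothetical name, not a function from either engine.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Returns the first position at which two per-LR sequences of a comparable
// argument (for example r5 = size) disagree. If one sequence is a strict
// prefix of the other, the length of the shorter one is returned. Returns
// std::nullopt when both sequences are identical.
// Handles (r3) and buffer pointers (r4) should not be fed in here: they
// depend on the handle namespace and memory layout of each engine.
inline std::optional<std::size_t> FirstMismatch(
    const std::vector<std::uint32_t>& ours,
    const std::vector<std::uint32_t>& canary) {
  const std::size_t common = std::min(ours.size(), canary.size());
  for (std::size_t i = 0; i < common; ++i) {
    if (ours[i] != canary[i]) return i;
  }
  if (ours.size() != canary.size()) return common;
  return std::nullopt;
}
```

Fed the r5 column of the first six γ-DB40 rows listed in the next subsection, this helper reports position 2 (ours 0x19000, canary 0x800).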
### γ-DB40 dispatch (lr=0x8245DC5C) — Ke wrapper @ 0x824AAF50
Args: r3=handle, r4=buf_ptr, r5=size, r6=0
| pos | ours r3 | ours r5(size) | ours r4(buf) | canary r3 | canary r5(size) | canary r4(buf) |
|---:|---|---|---|---|---|---|
| 0 | 0x00001040 | 0x00000800 | 0x41a01cd0 | 0xf8000030 | 0x00000800 | 0xbdb18cd0 |
| 1 | 0x0000105c | 0x00000800 | 0x41a01cd0 | 0xf8000034 | 0x00000800 | 0xbdb19cd0 |
| 2 | 0x00001098 | 0x00019000 | 0x42c12090 | 0xf8000044 | 0x00000800 | 0xbdb19cd0 |
| 3 | 0x000010ac | 0x00000800 | 0x41a01cd0 | 0xf8000044 | 0x00019000 | 0xbed2a090 |
| 4 | 0x000010d0 | 0x0001c000 | 0x431520d0 | 0xf8000078 | 0x0001c000 | 0xbf26a0d0 |
| 5 | 0x000010e0 | 0x00020000 | 0x4c946800 | 0xf8000078 | 0x00000800 | 0xbdb19cd0 |
| 6 | 0x000010e0 | 0x00020000 | 0x4c966800 | 0xf8000078 | 0x00020000 | 0xb2cb0800 |
| 7 | 0x000010e0 | 0x00020000 | 0x4c986800 | 0xf8000078 | 0x00020000 | 0xb2cd0800 |
| 8 | 0x000010e0 | 0x00020000 | 0x4c9a6800 | 0xf8000078 | 0x00020000 | 0xb2cf0800 |
| 9 | 0x000010e0 | 0x00020000 | 0x4c9c6800 | 0xf8000078 | 0x00020000 | 0xb2d10800 |
| 10 | 0x000010e0 | 0x00020000 | 0x4c9e6800 | 0xf8000078 | 0x00020000 | 0xb2d30800 |
| 11 | 0x000010e0 | 0x00020000 | 0x4ca06800 | 0xf8000078 | 0x00020000 | 0xb2d50800 |
| 12 | 0x000010e0 | 0x00020000 | 0x4ca26800 | 0xf8000078 | 0x00020000 | 0xb2d70800 |
| 13 | 0x000010e0 | 0x00020000 | 0x4ca46800 | 0xf8000078 | 0x00020000 | 0xb2d90800 |
| 14 | 0x000010e0 | 0x00020000 | 0x4ca66800 | 0xf8000078 | 0x00020000 | 0xb2db0800 |
| 15 | 0x000010e0 | 0x00020000 | 0x4ca86800 | 0xf8000078 | 0x00020000 | 0xb2dd0800 |
| 16 | 0x000010e0 | 0x00020000 | 0x4caa6800 | 0xf8000078 | 0x00020000 | 0xb2df0800 |
| 17 | 0x000010e0 | 0x00020000 | 0x4cac6800 | 0xf8000078 | 0x00020000 | 0xb2e10800 |
| 18 | 0x000010e0 | 0x00020000 | 0x4cae6800 | 0xf8000078 | 0x00020000 | 0xb2e30800 |
| 19 | 0x000010e0 | 0x00020000 | 0x4cb06800 | 0xf8000078 | 0x00020000 | 0xb2e50800 |

Only the first 20 positions are shown (ours total 75, canary total 461).
### γ-D-A dispatch (lr=0x8245DA44) — NtSetEvent wrapper @ 0x824AA2F0
Args: r3=handle, r4=2(SignalKind=Set), r5=handle (dup), r6=ctx
| pos | ours r3 | ours r4 | canary r3 | canary r4 |
|---:|---|---|---|---|
| 0 | 0x00001054 | 0x00000002 | 0xf8000044 | 0x00000002 |
| 1 | 0x00001064 | 0x00000002 | 0xf8000048 | 0x00000002 |
| 2 | 0x000010a0 | 0x00000002 | 0xf8000074 | 0x00000002 |
| 3 | 0x000010b4 | 0x00000002 | 0xf8000080 | 0x00000002 |
| 4 | 0x000010ec | 0x00000002 | 0xf8000098 | 0x00000002 |
| 5 | --- | --- | 0xf80000a8 | 0x00000002 |
| 6 | --- | --- | 0xf80000b8 | 0x00000002 |
| 7 | --- | --- | 0xf80000c4 | 0x00000002 |
| 8 | --- | --- | 0xf80000d4 | 0x00000002 |
| 9 | --- | --- | 0xf80000e0 | 0x00000002 |
| 10 | --- | --- | 0xf80000e8 | 0x00000002 |
| 11 | --- | --- | 0xf80000f0 | 0x00000002 |
| 12 | --- | --- | 0xf80000f8 | 0x00000002 |
| 13 | --- | --- | 0xf80000fc | 0x00000002 |
| 14 | --- | --- | 0xf80000c4 | 0x00000002 |
| 15 | --- | --- | 0xf800009c | 0x00000002 |
| 16 | --- | --- | 0xf80000d4 | 0x00000002 |
| 17 | --- | --- | 0xf80000d4 | 0x00000002 |
| 18 | --- | --- | 0xf80000d4 | 0x00000002 |
| 19 | --- | --- | 0xf80000d0 | 0x00000002 |
| 20 | --- | --- | 0xf80000d0 | 0x00000002 |
| 21 | --- | --- | 0xf80000d0 | 0x00000002 |
| 22 | --- | --- | 0xf8000124 | 0x00000002 |

All fires are shown (ours total 5, canary total 23).
### γ-D-B dispatch (lr=0x8245DB08) — NtSetEvent wrapper @ 0x824AA2F0
Args (as read from the position-0 fire): r3=handle, r4=ctx pointer, r5=signal kind (0x2)
| pos | ours r3 | ours r4 | canary r3 | canary r4 |
|---:|---|---|---|---|
| 0 | 0x000010d8 | 0x7116fc40 | 0xf8000044 | 0x7033fc10 |
| 1 | --- | --- | 0xf8000080 | 0x7033fc10 |
| 2 | --- | --- | 0xf80000c0 | 0x7033fc10 |
| 3 | --- | --- | 0xf80000d0 | 0x7033fc10 |
| 4 | --- | --- | 0xf80000b4 | 0x7033fc10 |
| 5 | --- | --- | 0xf80000d4 | 0x7033fc10 |
| 6 | --- | --- | 0xf80000d0 | 0x7033fc10 |
| 7 | --- | --- | 0xf80000c8 | 0x7033fc10 |
## First-mismatch identification
Per-LR position 0:
- γ-DB40 pos[0]: ours r3=0x1040 r5=0x800 r4=0x41a01cd0 | canary r3=0xF8000030 r5=0x800 r4=0xBDB18CD0
- **r5 (size) MATCHES** = 0x800.
- r4 (buf pointer) DIFFERS in absolute address (0x41a01cd0 vs 0xBDB18CD0) — different memory layouts, expected.
- r3 different namespace — to be expected (pseudo-handle vs slot id).
- γ-D-A pos[0]: ours r3=0x1054 r4=0x2 | canary r3=0xF8000044 r4=0x2
- r4 (signal-kind=Set) MATCHES.
- Args structurally match.
- γ-D-B pos[0]: ours r3=0x10D8 r4=0x7116FC40 r5=0x2 | canary r3=0xF8000044 r4=0x7033FC10 r5=0x2
- r5 (signal-kind) MATCHES.
- r4 (ctx pointer) DIFFERS in absolute address — different stack layout.
Position-0 invocations are STRUCTURALLY consistent. The divergence in per-LR fire COUNT (5 vs 23, 1 vs 8, 75 vs 461) means ours's producer LOOP runs roughly 6× fewer iterations overall (81 vs 492) before exiting.
## Wedge handle status in ours
**AUDIT-062 archive** (~9 days old) recorded ours wedge handles `0x12AC` and `0x12B8` (kind=Event/Auto)
with `<NO_SIGNALS_DESPITE_WAITS>` annotation.
In THIS run's ours lr-trace: handle 0x12AC count = **0**, handle 0x12B8 count = **0**.
Max handle seen in lr-trace: 0x121C (cache file handle).
The wedge handles `0x12AC`/`0x12B8` were NOT created in this 5B-instruction run — boot terminates early.
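
The two figures quoted above (the occurrence count of a given handle and the highest handle seen) amount to a single pass over the r3 column of the lr-trace. A minimal sketch, assuming the trace has already been parsed into a vector of handle values; `HandleScan` and `ScanHandles` are hypothetical names, and any sample input used with them here is illustrative rather than the full trace.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct HandleScan {
  std::size_t hits = 0;          // how many fires targeted `wanted`
  std::uint32_t max_handle = 0;  // highest handle value seen (0 if trace is empty)
};

// Single pass over the traced r3 values: count matches for one handle and
// track the largest handle value encountered.
inline HandleScan ScanHandles(const std::vector<std::uint32_t>& r3_values,
                              std::uint32_t wanted) {
  HandleScan out;
  for (std::uint32_t h : r3_values) {
    if (h == wanted) ++out.hits;
    if (h > out.max_handle) out.max_handle = h;
  }
  return out;
}
```

A result with `hits == 0` and `max_handle` below the wanted value is what distinguishes "handle never created in this run" from "handle created but never signaled".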
## Boot-termination evidence
- ours was run to 1.5B instructions (47 s wall clock) and to 5B instructions (159 s wall clock); both runs show the same handle universe.
- `--halt-on-deadlock` did NOT trigger.
- import_calls = 39,290, identical on both runs.
- tid=5 producer fires 81 events then goes quiet; consumer threads remain blocked on existing handles indefinitely.
- The `0x12AC`/`0x12B8` wedge from the AUDIT-062 archive likely formed on a deeper-boot trajectory (NtCreateEvent calls after a graphics-frame-tick or similar event that does not fire here).
## Classification: missing-signal vs race
**ours produces 81 signals where canary produces 492 from the SAME caller chain on the SAME guest thread.**
This is a **producer-loop-underrun** classification:
- The signaler thread (tid=5) runs the EXACT SAME guest-code path (PCs match, LRs match).
- Position-0 args match structurally.
- But the loop ITERATES far fewer times before going idle.
The "wrong handles" framing from AUDIT-062 is partial: the bigger problem is that **the loop exits early** — most of the work that canary completes never gets touched by ours.
Mechanism: sub_82450A68 dispatch loop reads work from a guest-memory work queue. Each iteration enqueues a new task once the previous fires. If the producer FEEDING that queue under-fires, the dispatch loop's read-head reaches the tail early and the loop exits (or blocks on a dispatcher event with no pending work).


@@ -0,0 +1,209 @@
# AUDIT-069 Session 4 — divergence analysis
Date: 2026-05-20
xenia-rs HEAD: `e6d43a23ac393004d2e5adf2f0395fd0b5e6448b` (UNCHANGED)
## Headline (HIGH confidence — direct per-iteration measurement)
The S3 framing of "producer-loop underrun" was directionally right but
mis-located the divergence. The loop in `sub_82450A68` **does not take
an early-exit branch in either engine** — neither ours nor canary ever
reaches `0x82450B50` (the exit path). Both stay in the loop indefinitely.
The divergence is **WHAT the NtWaitForMultipleObjectsEx call returns at
each iteration**:
- **Ours: r3 = 1 (WAIT_OBJECT_0+1, semaphore signaled) EVERY iteration.**
- **Canary: r3 = 0x102 (WAIT_TIMEOUT) mostly, r3 = 1 occasionally.**
This refines the producer-loop classification: it is NOT loop-underrun
(both engines' loops run continuously). It is a **semaphore-state
divergence** — ours's work semaphore is over-released or never properly
drained; canary's drains correctly and the wait times out per 16ms tick.
## Loop structure (sub_82450A68 disasm at s4/sub_82450A68-disasm.txt)
```
0x82450A28: sub_82450A28 = thread entry (KeSetThreadPriority(-2, 3); bl sub_82450A68)
0x82450A68: prolog (mflr, alloc 128B frame, r31=ctx_arg)
0x82450A78-94: stack handle array [r1+80]=[r31+88]=handle[0]=STOP_EVENT (=0x104C in ours),
[r1+84]=[r31+92]=handle[1]=WORK_SEMAPHORE (=0x1050 in ours).
0x82450A98: bl 0x824AB240 ; NtWaitForMultipleObjectsEx wrapper, 16ms timeout
0x82450A9C-A0: cmplwi/beq cr6, r3, 0 → 0x82450B50 [EXIT-WAIT1: r3==0 → exit (stop signaled)]
0x82450AA4-A8: li r29,0; li r28,4 [FIRST-ITER body entry]
0x82450AAC: lwz r11, 212(r31) [BACK-EDGE TARGET; reads "fast-path flag"]
0x82450AB0-BC: cntlzw / extrwi / cmplwi / bne cr6, 0xAC8 [BR-A: flag@212==0 → search path]
0x82450AC0-C4: li r4,5; b 0xB2C [BR-B: flag@212!=0 → direct dispatch w/ r4=5]
0x82450AC8-CC: mr r30,r29; addi r11,r31,112 [search-path setup]
0x82450AD0-E0: lwz r10,0(r11); cntlzw; extrwi; cmplwi; beq cr6, 0xAF8 [BR-C: candidate found]
0x82450AE4-F0: addi r30,1; addi r11,20; cmplwi cr6, r30, 5; blt cr6, 0xAD0 [BR-D: search continue]
0x82450AF4: b 0xB34 [BR-E: search exhausted → skip dispatch, re-wait]
0x82450AF8: lwz r11, 224(r31) [budget check]
0x82450AFC-00: cmplwi cr6, r11, 0; beq cr6, 0xB28 [BR-F: budget@224==0 → skip refresh]
0x82450B04-0C: lwz r11, 220(r31); cmpw cr6, r11, r30; bge cr6, 0xB28 [BR-G: budget cmp]
0x82450B10: bl 0x824AA830 [KeQueryPerformanceCounter; sub_824AA830]
0x82450B14-1C: lwz r11,224(r31); cmplw cr6,r3,r11; blt cr6, 0xB34 [BR-H: budget exceeded → re-wait]
0x82450B20-24: stw r28, 220(r31); stw r29, 224(r31)
0x82450B28: mr r4, r30
0x82450B2C-30: mr r3, r31; bl 0x82450B68 [DISPATCH: calls γ-signaler family]
0x82450B34-44: li r6,16; li r5,0; addi r4,r1,80; li r3,2; bl 0x824AB240 [RE-WAIT]
0x82450B48-4C: cmplwi cr6, r3, 0; bne cr6, 0x82450AAC [BACK-EDGE: r3!=0 → loop]
0x82450B50-58: li r3,0; addi r1,r1,128; b 0x825F0FD8 [EXIT path]
```
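The `cntlzw` / `extrwi rX, rX, 1, 26` / `cmplwi` sequence at 0x82450AB0 and 0x82450AD4 is a common compiler idiom for comparing a word against zero. A minimal Python sketch of the bit arithmetic, where the helper names are illustrative and not taken from either emulator:

```python
def cntlzw(x: int) -> int:
    """Count leading zero bits of a 32-bit word (returns 32 when x == 0)."""
    x &= 0xFFFFFFFF
    return 32 - x.bit_length()


def extrwi(x: int, n: int, b: int) -> int:
    """Extract n bits starting at big-endian bit b (bit 0 = MSB), right-justified."""
    shift = 32 - (b + n)
    return (x >> shift) & ((1 << n) - 1)


def is_zero_flag(word: int) -> int:
    """cntlzw followed by extrwi rX, rX, 1, 26: yields 1 iff word == 0."""
    # cntlzw is in 0..32; bit 26 (value 32) is set only when the count is 32.
    return extrwi(cntlzw(word), 1, 26)
```

Read this way, a `bne cr6` following the `cmplwi cr6, rX, 0` is taken when the loaded word is zero, and a `beq cr6` is taken when it is nonzero.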
## Handle slots (ours, mem-watch confirmed)
```
[r31+88] = [0x828F3BC0] written at PC 0x8244FFB0 from NtCreateEvent → ours handle 0x104C
[r31+92] = [0x828F3BC4] written at PC 0x8244FFCC from NtCreateSemaphore → ours handle 0x1050
```
Created in `sub_8244FF50` (the ExCreateThread caller) BEFORE the ExCreateThread call:
- handle[0] = NtCreateEvent(EventType=NotificationEvent, InitialState=0)
- handle[1] = NtCreateSemaphore(InitialCount=0, MaximumCount=0x7FFFFFFF)
This is a **stop-event + work-semaphore** pattern, NOT two events.
NtWaitForMultipleObjectsEx with WaitAny:
- r3 = WAIT_OBJECT_0 = 0 → handle[0] (stop event) signaled → EXIT
- r3 = WAIT_OBJECT_0+1 = 1 → handle[1] (semaphore) acquired (decremented) → DO WORK
- r3 = WAIT_TIMEOUT = 0x102 → 16ms elapsed with no signal → continue (poll)
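The three outcomes above can be sketched as a small classifier. The constant names follow the usual NT convention; `classify_wait` and its labels are hypothetical helpers for reading probe output, not emulator code:

```python
WAIT_OBJECT_0 = 0x000
WAIT_TIMEOUT = 0x102


def classify_wait(r3: int, handle_count: int = 2) -> str:
    """Label a WaitAny return value for the [stop_event, work_semaphore] array."""
    if r3 == WAIT_TIMEOUT:
        return "timeout"          # 16ms elapsed, nothing signaled
    index = r3 - WAIT_OBJECT_0
    if index == 0:
        return "stop"             # handle[0], the stop event
    if 0 < index < handle_count:
        return "work"             # handle[1], the work semaphore
    return "other"                # any status not covered above
```

Note that the guest loop only tests `r3 != 0` at 0x82450B48, so both the `work` and `timeout` outcomes take the back-edge to 0x82450AAC.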
## Per-PC iteration counts (HIGH confidence, direct branch-probe)
| PC | path | ours fires | canary fires | ratio |
|---|---|---:|---:|---:|
| 0x82450AA4 | FIRST-ITER entry | 1 | 1 | 1× |
| 0x82450AAC | BACK-EDGE target | 91 | 4 | (canary crashed early) |
| 0x82450AC0 | BR-B: flag@212!=0 direct-dispatch r4=5 | 2 | 0 | — |
| 0x82450AC8 | BR-A: flag@212==0 search path | 90 | 4 | — |
| 0x82450AE4 | inner-search continue | 72 | 17 | — |
| 0x82450AF4 | BR-E: search exhausted | 8 | 3 | — |
| 0x82450AF8 | BR-C: candidate found | 82 | 1 | — |
| 0x82450B04 | BR-F: budget skip | 81 | 0 | — |
| 0x82450B10 | budget refresh (KeQuery) | 8 | 0 | — |
| 0x82450B28 | dispatch entry (r4=r30) | 74 | 1 | — |
| 0x82450B34 | re-wait entry | 92 | 4 | — |
| **0x82450B50** | **EXIT path** | **0** | **0** | **never exits** |
Canary's run was cut short at ~5 iterations by a vkd3d-proton fault on
exit. The relevant signal is in the **r3 distribution at the back-edge**,
not the absolute counts.
## r3 distribution at the back-edge (HIGH confidence)
### Ours (91 captures at PC=0x82450AAC, lr=0x82450B48)
```
r3=0x00000001 × 91/91 (100%)
r3=0x00000102 × 0/91 (0%)
```
### Canary (4 captures at PC=0x82450AAC, lr=0x82450B48)
```
r3=0x00000001 × 1/4 (25%)
r3=0x00000102 × 3/4 (75%)
```
Pattern visible in canary trace: first re-wait returns 0x1 (work
available immediately), subsequent re-waits return 0x102 (timeout).
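A minimal sketch of how the distribution above can be tallied from a list of captured r3 values. The capture lists are rebuilt here from the counts reported above rather than read from the trace files, and `r3_distribution` is a hypothetical helper:

```python
from collections import Counter


def r3_distribution(captures):
    """Map each r3 value to (count, rounded percentage of all captures)."""
    counts = Counter(captures)
    total = len(captures)
    return {value: (n, round(100 * n / total)) for value, n in counts.items()}


# Rebuilt from the reported counts: 91 captures in ours, 4 in canary.
ours = [0x1] * 91
canary = [0x1, 0x102, 0x102, 0x102]

ours_dist = r3_distribution(ours)
canary_dist = r3_distribution(canary)
```

With only 4 canary captures the percentages are indicative; the qualitative contrast (ours never times out, canary mostly does) is what the comparison rests on.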
## The divergent guest-memory location
The "divergent load" the user's framing predicted (a guest load reading
some flag whose value differs ours-vs-canary) is **the wait return
value, computed inside the kernel** — not a guest-memory load. The
return r3 comes from `NtWaitForMultipleObjectsEx` (a kernel import).
The kernel-side state that differs is the **WORK SEMAPHORE COUNT**:
- Ours: count > 0 at every wait → wait succeeds (decrement, r3=1)
- Canary: count = 0 at most waits → wait times out (r3=0x102)
The semaphore count is influenced by:
- `NtReleaseSemaphore(handle[1], 1)` calls (increments count by 1)
- `NtWaitForMultipleObjectsEx` success on handle[1] (decrements by 1)
So either:
- (a) ours's NtReleaseSemaphore is called more aggressively than canary's
- (b) ours's NtWaitForMultipleObjectsEx doesn't decrement on success (kernel bug)
- (c) ours's NtCreateSemaphore creates with InitialCount > 0 (creation bug)
- (d) ours's NtReleaseSemaphore over-releases per call (adds more than the requested count)
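To make the four candidates concrete, here is a toy counting-semaphore model. It is a sketch for reasoning about the observed r3 pattern under stated assumptions (one wait per loop iteration, releases applied just before each wait); it is not the xenia-rs or canary implementation, and all names are hypothetical:

```python
WAIT_WORK = 0x1        # WAIT_OBJECT_0+1, semaphore acquired
WAIT_TIMEOUT = 0x102   # no signal within the timeout


class ToySemaphore:
    def __init__(self, initial=0, maximum=0x7FFFFFFF, decrement_on_wait=True):
        self.count = initial
        self.maximum = maximum
        self.decrement_on_wait = decrement_on_wait

    def release(self, n=1):
        # Clamp at the maximum instead of modelling an error status.
        self.count = min(self.count + n, self.maximum)

    def wait(self):
        if self.count > 0:
            if self.decrement_on_wait:
                self.count -= 1
            return WAIT_WORK
        return WAIT_TIMEOUT


def run(sem, releases_before_each_wait):
    """Apply the given number of releases, then wait once, per iteration."""
    results = []
    for n in releases_before_each_wait:
        if n:
            sem.release(n)
        results.append(sem.wait())
    return results


healthy = run(ToySemaphore(), [1, 0, 0, 0])
frequent_release = run(ToySemaphore(), [1, 1, 1, 1])                      # (a)
no_decrement = run(ToySemaphore(decrement_on_wait=False), [1, 0, 0, 0])   # (b)
bad_initial = run(ToySemaphore(initial=4), [0, 0, 0, 0])                  # (c)
over_release = run(ToySemaphore(), [4, 0, 0, 0])                          # (d)
```

In this model `healthy` reproduces the canary-like pattern (one `0x1`, then timeouts), while each of the other four runs returns `0x1` on every wait, so the r3 pattern alone does not discriminate between the candidates.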
## NtReleaseSemaphore callers (18 unique fns from sylpheed.db xrefs)
```
sub_822c6748, sub_822c6808, sub_822c8b50 (×6 inline call sites),
sub_822f2328,
sub_823dd770, sub_823dd838, sub_823de4b8 (×3),
sub_823df320,
sub_82450218 ← in dispatch-loop module (callers: sub_82452DC0 ×2)
sub_824503A0 ← in dispatch-loop module (callers: sub_82452690, sub_8245E1D8)
sub_82450B68 ← THE DISPATCH FUNCTION ITSELF (×2 internal release sites at 0xCDC, 0xD28)
sub_824569C0 (j-call), sub_82457FE0, sub_82458468, sub_824591C0,
sub_8245AAF0, sub_8245ABD8, sub_8245AD00
```
The most-suspicious sites for this audit are the three in the
dispatch-loop module: `sub_82450218`, `sub_824503A0`, and the
self-release in `sub_82450B68`.
## Most-recent kernel calls before the divergent load (ours tid=5)
The "divergent load" is the kernel-side return of `NtWaitForMultipleObjectsEx`.
No guest-memory load is the proximate cause. Most-recent kernel calls
before each wait on ours tid=5 (from S3's ours-lr-trace data):
- `sub_824AB158` → `NtReleaseSemaphore` (via wrapper)
- `sub_824AA2F0` → `NtSetEvent`
- `sub_824AAF50` → `KeSetEvent`-style with ptr+size args
- `sub_824AA830` → `KeQueryPerformanceCounter`-like
- `sub_824AB240` → `NtWaitForMultipleObjectsEx` itself
## Hypothesis (MEDIUM-HIGH confidence)
The semaphore is being **over-released** in ours. Specifically, one of
the producer-side enqueue paths (sub_82452DC0, sub_82452690, sub_8245E1D8,
or any of the 22 other release-call sites) is firing release more often
than the dispatch loop is consuming work — OR — ours's wait kernel
handler in `xenia-kernel/src/exports.rs` is not atomically decrementing
the semaphore count on WAIT_OBJECT_0+N.
Ranked S5 leads:
1. **Audit ours's `NtWaitForMultipleObjectsEx` handler implementation**:
does it decrement the semaphore on success? (Likely yes — would
regress many things otherwise. Test with a small probe.)
2. **Probe `NtReleaseSemaphore` call rate on handle 0x1050** in ours.
Compare to canary on equivalent handle (some F8000xxx in canary).
Hypothesis: ours releases more often per dispatch.
3. **Cross-check the canary equivalent handle**: canary uses
`XSemaphore::native_object()` pseudo-handle for handle[1]. Use
`audit_69_event_signal_watch` extension (or grep S1's
`signal-probe-correlated.log` for KeReleaseSemaphore + the relevant
ptr) to identify canary's semaphore handle ID, then run the same probe.
## Classification
NOT a loop-exit-branch divergence (neither engine exits).
NOT a missing-thread / missing-spawn divergence (S2 closed that).
NOT a wrong-handle-selection divergence (S3 confirmed args match).
It IS a **semaphore-state divergence**: ours's NtWaitForMultipleObjectsEx
keeps returning WAIT_OBJECT_0+1 (semaphore signaled) where canary's
returns WAIT_TIMEOUT. The semaphore count is non-zero at wait-entry in
ours; zero in canary.
## Confidence flags
| finding | confidence | reasoning |
|---|---|---|
| both loops never exit (B50 never fires) | HIGH | direct measurement |
| ours r3=1 always at back-edge | HIGH | 91/91 captures direct measurement |
| canary r3=0x102 mostly at back-edge | HIGH | 3/4 captures direct measurement |
| handle[1] is NtCreateSemaphore w/ InitialCount=0, Max=0x7FFFFFFF | HIGH | mem-watch + disasm confirmed |
| handle[0] is NtCreateEvent | HIGH | disasm confirmed at 0x824A9F18 |
| ours handle slot values 0x104C, 0x1050 | HIGH | mem-watch confirmed |
| no exit-branch divergence in matching iter | HIGH | exit branch never taken in either |
| semaphore-state divergence root cause | MEDIUM-HIGH | r3 differs → wait kernel return differs → semaphore state must differ; haven't directly proved which (over-release vs no-decrement vs wrong-init) |
| S5 path-1 (NtWaitForMultiple decrement bug) | MEDIUM | most likely culprit given kernel-side state divergence pattern, but other hypotheses still open |

@@ -0,0 +1,80 @@
0x82450a28: mflr r12
0x82450a2c: stw r12, -8(r1)
0x82450a30: std r31, -16(r1)
0x82450a34: stwu r1, -96(r1)
0x82450a38: mr r31, r3
0x82450a3c: li r4, 3
0x82450a40: li r3, -2
0x82450a44: bl 0x824AA658
0x82450a48: mr r3, r31
0x82450a4c: bl 0x82450A68
0x82450a50: addi r1, r1, 96
0x82450a54: lwz r12, -8(r1)
0x82450a58: mtlr r12
0x82450a5c: ld r31, -16(r1)
0x82450a60: blr
0x82450a64: .long 0x00000000
0x82450a68: mflr r12
0x82450a6c: bl 0x825F0F88
0x82450a70: stwu r1, -128(r1)
0x82450a74: mr r31, r3
0x82450a78: li r6, 16
0x82450a7c: li r5, 0
0x82450a80: addi r4, r1, 80
0x82450a84: li r3, 2
0x82450a88: lwz r11, 88(r31)
0x82450a8c: stw r11, 80(r1)
0x82450a90: lwz r11, 92(r31)
0x82450a94: stw r11, 84(r1)
0x82450a98: bl 0x824AB240
0x82450a9c: cmplwi cr6, r3, 0x0
0x82450aa0: beq cr6, 0x82450B50
0x82450aa4: li r29, 0
0x82450aa8: li r28, 4
0x82450aac: lwz r11, 212(r31)
0x82450ab0: cntlzw r11, r11
0x82450ab4: extrwi r11, r11, 1, 26
0x82450ab8: cmplwi cr6, r11, 0x0
0x82450abc: bne cr6, 0x82450AC8
0x82450ac0: li r4, 5
0x82450ac4: b 0x82450B2C
0x82450ac8: mr r30, r29
0x82450acc: addi r11, r31, 112
0x82450ad0: lwz r10, 0(r11)
0x82450ad4: cntlzw r10, r10
0x82450ad8: extrwi r10, r10, 1, 26
0x82450adc: cmplwi cr6, r10, 0x0
0x82450ae0: beq cr6, 0x82450AF8
0x82450ae4: addi r30, r30, 1
0x82450ae8: addi r11, r11, 20
0x82450aec: cmplwi cr6, r30, 0x5
0x82450af0: blt cr6, 0x82450AD0
0x82450af4: b 0x82450B34
0x82450af8: lwz r11, 224(r31)
0x82450afc: cmplwi cr6, r11, 0x0
0x82450b00: beq cr6, 0x82450B28
0x82450b04: lwz r11, 220(r31)
0x82450b08: cmpw cr6, r11, r30
0x82450b0c: bge cr6, 0x82450B28
0x82450b10: bl 0x824AA830
0x82450b14: lwz r11, 224(r31)
0x82450b18: cmplw cr6, r3, r11
0x82450b1c: blt cr6, 0x82450B34
0x82450b20: stw r28, 220(r31)
0x82450b24: stw r29, 224(r31)
0x82450b28: mr r4, r30
0x82450b2c: mr r3, r31
0x82450b30: bl 0x82450B68
0x82450b34: li r6, 16
0x82450b38: li r5, 0
0x82450b3c: addi r4, r1, 80
0x82450b40: li r3, 2
0x82450b44: bl 0x824AB240
0x82450b48: cmplwi cr6, r3, 0x0
0x82450b4c: bne cr6, 0x82450AAC
0x82450b50: li r3, 0
0x82450b54: addi r1, r1, 128
0x82450b58: b 0x825F0FD8
0x82450b5c: .long 0x00000000
0x82450b60: lwz r18, 9792(r31)
0x82450b64: lwz r16, 13880(r14)

@@ -0,0 +1,202 @@
Disassembly from requested address 0x82450b68 (200 instructions):
0x82450b68: mflr r12
0x82450b6c: bl 0x825F0F74
0x82450b70: subi r31, r1, 176
0x82450b74: stwu r1, -176(r1)
0x82450b78: mr r29, r4
0x82450b7c: mr r27, r3
0x82450b80: cmpwi cr6, r29, 5
0x82450b84: bne cr6, 0x82450B94
0x82450b88: addi r28, r27, 196
0x82450b8c: addi r26, r27, 28
0x82450b90: b 0x82450BAC
0x82450b94: slwi r11, r29, 2
0x82450b98: mr r26, r27
0x82450b9c: add r11, r29, r11
0x82450ba0: slwi r11, r11, 2
0x82450ba4: add r11, r11, r27
0x82450ba8: addi r28, r11, 96
0x82450bac: addi r23, r27, 56
0x82450bb0: mr r3, r23
0x82450bb4: stw r23, 84(r31)
0x82450bb8: bl 0x8284DCFC
0x82450bbc: mr r3, r26
0x82450bc0: bl 0x8284DCFC
0x82450bc4: lwz r7, 16(r28)
0x82450bc8: cntlzw r11, r7
0x82450bcc: extrwi r11, r11, 1, 26
0x82450bd0: cmplwi cr6, r11, 0x0
0x82450bd4: beq cr6, 0x82450BEC
0x82450bd8: mr r3, r26
0x82450bdc: bl 0x8284DD0C
0x82450be0: mr r3, r23
0x82450be4: bl 0x8284DD0C
0x82450be8: b 0x82450EE8
0x82450bec: lwz r11, 12(r28)
0x82450bf0: lwz r9, 8(r28)
0x82450bf4: srwi r10, r11, 2
0x82450bf8: clrlwi r8, r11, 30
0x82450bfc: cmplw cr6, r9, r10
0x82450c00: bgt cr6, 0x82450C08
0x82450c04: sub r10, r10, r9
0x82450c08: lwz r9, 4(r28)
0x82450c0c: slwi r10, r10, 2
0x82450c10: slwi r8, r8, 2
0x82450c14: lwz r6, 8(r28)
0x82450c18: addi r11, r11, 1
0x82450c1c: slwi r6, r6, 2
0x82450c20: li r24, 0
0x82450c24: lwzx r10, r10, r9
0x82450c28: cmplw cr6, r6, r11
0x82450c2c: lwzx r30, r10, r8
0x82450c30: stw r11, 12(r28)
0x82450c34: stw r30, 80(r31)
0x82450c38: bgt cr6, 0x82450C40
0x82450c3c: stw r24, 12(r28)
0x82450c40: subic. r11, r7, 1
0x82450c44: stw r11, 16(r28)
0x82450c48: bne 0x82450C50
0x82450c4c: stw r24, 12(r28)
0x82450c50: addi r25, r27, 28
0x82450c54: mr r3, r25
0x82450c58: bl 0x8284DCFC
0x82450c5c: mr r3, r25
0x82450c60: stw r30, 216(r27)
0x82450c64: bl 0x8284DD0C
0x82450c68: mr r3, r26
0x82450c6c: bl 0x8284DD0C
0x82450c70: lwz r11, 28(r30)
0x82450c74: clrlwi r11, r11, 31
0x82450c78: cmplwi cr6, r11, 0x0
0x82450c7c: bne cr6, 0x82450D30
0x82450c80: lwz r11, 8(r30)
0x82450c84: cmplwi cr6, r11, 0x1
0x82450c88: blt cr6, 0x82450CE4
0x82450c8c: bne cr6, 0x82450D3C
0x82450c90: lwz r11, 28(r30)
0x82450c94: rlwinm r11, r11, 0, 29, 29
0x82450c98: cmplwi cr6, r11, 0x0
0x82450c9c: beq cr6, 0x82450CB0
0x82450ca0: mr r4, r30
0x82450ca4: mr r3, r27
0x82450ca8: bl 0x824510E0
0x82450cac: b 0x82450CBC
0x82450cb0: mr r4, r30
0x82450cb4: mr r3, r27
0x82450cb8: bl 0x824517B0
0x82450cbc: stw r29, 220(r27)
0x82450cc0: bl 0x824AA830
0x82450cc4: mr r11, r3
0x82450cc8: lwz r3, 92(r27)
0x82450ccc: li r5, 0
0x82450cd0: addi r11, r11, 66
0x82450cd4: li r4, 1
0x82450cd8: stw r11, 224(r27)
0x82450cdc: bl 0x824AB158
0x82450ce0: b 0x82450D3C
0x82450ce4: lwz r11, 28(r30)
0x82450ce8: mr r4, r30
0x82450cec: mr r3, r27
0x82450cf0: rlwinm r11, r11, 0, 29, 29
0x82450cf4: cmplwi cr6, r11, 0x0
0x82450cf8: beq cr6, 0x82450D04
0x82450cfc: bl 0x82450F68
0x82450d00: b 0x82450D08
0x82450d04: bl 0x82451238
0x82450d08: stw r29, 220(r27)
0x82450d0c: bl 0x824AA830
0x82450d10: mr r11, r3
0x82450d14: lwz r3, 92(r27)
0x82450d18: li r5, 0
0x82450d1c: addi r11, r11, 66
0x82450d20: li r4, 1
0x82450d24: stw r11, 224(r27)
0x82450d28: bl 0x824AB158
0x82450d2c: b 0x82450D3C
0x82450d30: lwz r11, 28(r30)
0x82450d34: ori r11, r11, 0x2
0x82450d38: stw r11, 28(r30)
0x82450d3c: lwz r11, 8(r30)
0x82450d40: mr r29, r24
0x82450d44: cmpwi cr6, r11, 2
0x82450d48: blt cr6, 0x82450E08
0x82450d4c: cmpwi cr6, r11, 3
0x82450d50: ble cr6, 0x82450DA0
0x82450d54: cmpwi cr6, r11, 4
0x82450d58: bne cr6, 0x82450E08
0x82450d5c: lwz r11, 28(r30)
0x82450d60: rlwinm r11, r11, 0, 29, 29
0x82450d64: cmplwi cr6, r11, 0x0
0x82450d68: bne cr6, 0x82450D98
0x82450d6c: lwz r29, 36(r30)
0x82450d70: mr r3, r29
0x82450d74: lwz r11, 0(r29)
0x82450d78: lwz r11, 4(r11)
0x82450d7c: mtctr r11
0x82450d80: bctrl
0x82450d84: clrlwi r11, r3, 24
0x82450d88: cmplwi cr6, r11, 0x0
0x82450d8c: beq cr6, 0x82450D98
0x82450d90: mr r3, r29
0x82450d94: bl 0x8244FB38
0x82450d98: li r29, 1
0x82450d9c: b 0x82450E28
0x82450da0: addi r3, r30, 40
0x82450da4: bl 0x82451DB8
0x82450da8: lwz r11, 32(r30)
0x82450dac: cmplwi cr6, r11, 0x0
0x82450db0: beq cr6, 0x82450DCC
0x82450db4: rlwinm r11, r11, 0, 0, 31
0x82450db8: lwz r10, 4(r30)
0x82450dbc: lwz r11, 4(r11)
0x82450dc0: cmplw cr6, r10, r11
0x82450dc4: li r11, 1
0x82450dc8: beq cr6, 0x82450DD0
0x82450dcc: mr r11, r24
0x82450dd0: clrlwi r11, r11, 24
0x82450dd4: cmplwi cr6, r11, 0x0
0x82450dd8: beq cr6, 0x82450E00
0x82450ddc: lwz r4, 8(r30)
0x82450de0: lwz r5, 0(r30)
0x82450de4: lwz r3, 32(r30)
0x82450de8: cmpwi cr6, r4, 1
0x82450dec: ble cr6, 0x82450DFC
0x82450df0: bl 0x8245D9D8
0x82450df4: li r29, 1
0x82450df8: b 0x82450E28
0x82450dfc: stw r4, 8(r3)
0x82450e00: li r29, 1
0x82450e04: b 0x82450E28
0x82450e08: mr r3, r26
0x82450e0c: stw r26, 88(r31)
0x82450e10: bl 0x8284DCFC
0x82450e14: addi r4, r31, 80
0x82450e18: mr r3, r28
0x82450e1c: bl 0x823232C0
0x82450e20: mr r3, r26
0x82450e24: bl 0x8284DD0C
0x82450e28: clrlwi r11, r29, 24
0x82450e2c: cmplwi cr6, r11, 0x0
0x82450e30: beq cr6, 0x82450ECC
0x82450e34: lwz r11, 28(r30)
0x82450e38: rlwinm r11, r11, 0, 30, 30
0x82450e3c: cmplwi cr6, r11, 0x0
0x82450e40: beq cr6, 0x82450E68
0x82450e44: mr r3, r26
0x82450e48: stw r26, 88(r31)
0x82450e4c: bl 0x8284DCFC
0x82450e50: addi r4, r31, 80
0x82450e54: mr r3, r28
0x82450e58: bl 0x823232C0
0x82450e5c: mr r3, r26
0x82450e60: bl 0x8284DD0C
0x82450e64: b 0x82450ECC
0x82450e68: lwz r11, 40(r30)
0x82450e6c: cmplwi cr6, r11, 0x0
0x82450e70: beq cr6, 0x82450EA4
0x82450e74: rlwinm r3, r11, 0, 0, 31
0x82450e78: bl 0x82458A70
0x82450e7c: lwz r29, 40(r30)
0x82450e80: lwz r3, 0(r29)
0x82450e84: bl 0x824583E8

@@ -0,0 +1,192 @@
# AUDIT-069 Session 2 — writer report v2
Date: 2026-05-20
xenia-rs HEAD: `e6d43a23ac393004d2e5adf2f0395fd0b5e6448b` (UNCHANGED from S1)
`git diff HEAD | sha256sum`: `ed30fd526643918f67311caff0a10d1346d73fd0c0323e02477883cf5ff20357` (UNCHANGED from S1 end)
No canary instrumentation added this session.
## Headline
**S1's framing is FALSIFIED.** ours does NOT lack a "canary-tid=10
equivalent" thread. The spawn chain executes identically:
main (ours tid=1) → sub_8244FEA8 → sub_8244FF50
→ ExCreateThread(entry=0x82450A28, ctx=0x828F3B68)
→ ours tid=5 starts
→ sub_82450A28 (1×) → sub_82450A68 (1×)
→ γ-signaler family (sub_8245D9D8 6×, sub_8245DA78 1×, sub_8245DB40 75×)
This is bit-equivalent to canary's chain, modulo the tid label
(canary calls it tid=10, ours calls it tid=5 — same entry, same ctx,
same dispatch loop, same γ-signaler family fires from inside it).
The signaler spawn-chain is NOT the bug. S1's "the bug is at the
thread-spawn layer" hypothesis is wrong.
## Spawn chain (DB-derived, READ-ONLY DuckDB)
| Fn | callers in DB | role |
|---|---|---|
| 0x82450A28 | 1 ref-edge from 0x8244FFF8 (sub_8244FF50+0xA8) | thread entry (data ptr only) |
| 0x8244FF50 | 1 call-edge from 0x8244FEE8 (sub_8244FEA8+0x40) | ExCreateThread caller |
| 0x8244FEA8 | 11 call-edges (10 unique callers across sub_821A5150, sub_821CB968, sub_821CC2E8, sub_821D2850, sub_82237EC8, sub_8225EE20, sub_822E0350, sub_824528A8, sub_82452DC0 (2×), sub_8245E528) | spawn helper |
## Per-PC fire counts (ours-cold, 1.5B instr, fresh today)
| PC | symbol | fires | tid |
|---|---|---|---|
| 0x8244FEA8 | sub_8244FEA8 (spawn helper) | 7 | 1 |
| 0x8244FF50 | sub_8244FF50 (ExCreateThread caller) | 1 | 1 |
| 0x82450A28 | sub_82450A28 (thread entry) | 1 | 5 |
| 0x82450A68 | sub_82450A68 (worker dispatch loop) | 1 | 5 |
| 0x8245D9D8 | γ-signaler D | 6 | 5 |
| 0x8245DA78 | γ-signaler D-B | 1 | 5 |
| 0x8245DB40 | γ-signaler D-NEW | 75 | 5 |
Spawn event log confirms `ExCreateThread: tid=5 handle=0x1050 entry=0x82450a28 start_ctx=0x828f3b68`.
Total `kernel.calls{name=ExCreateThread} = 10`.
## Comparison with canary (S1 data — fresh today, not stale)
| metric | canary | ours |
|---|---|---|
| thread with entry=0x82450A28 | tid=10 | tid=5 |
| start_ctx | 0x828F3B68 | 0x828F3B68 |
| γ-D family signaler firings | all on tid=10 | all on tid=5 |
| NtSetEvent fires from γ-D (via wrapper 0x824AA2F0) | confirmed | confirmed |
The spawn chain and γ-signaler invocation match. The only divergence at the
signaler call site is **which handle gets signaled**, not whether the
signaler runs.
## Divergence point (parent fires, child also fires)
NONE — every node in the spawn chain fires in ours. The S1-prescribed
"first ancestor that fires while child does not" never materialises because
the entire chain is reached identically.
The actual divergence is downstream of the spawn-chain — at the
**handle-selection** step inside the γ-signaler family, per AUDIT-062's
prior finding ("ours's γ-signalers signal WRONG handles — neighbors of the
wedge handle, not the wedge itself").
## Gate condition
There is no gate that ours fails. The control flow reaches the γ-signaler
and invokes the NtSetEvent wrapper (`sub_824AA2F0`) with bit-identical
control flow. The argument to NtSetEvent (the handle) is the
divergent term.
In the AUDIT-062 archive ours-ntset.jsonl, the γ-D signaler on ours tid=5
calls NtSetEvent on handles `0x103C`, `0x1068`, `0x106C`, `0x1094`, ...
These are guest-side handle slots that the *waiter* is NOT waiting on.
Per S1, canary's wedge waiter (tid=17, tid=26) waits on `F80000A4` and
`F8000110`. Note that canary's handles are *pseudo-handles* (high-bit
encoded), while ours's slot allocator hands out normal `0x10xx` IDs —
a known cross-engine handle convention mismatch already documented
in AUDIT-019/043/062.
The semantic question is therefore: **what does the producer compute as
the "next handle to signal", and is the computation reading
a different value of the bookkeeping struct in ours vs canary?**
This is the question AUDIT-062 hit and parked; it must be re-opened
now that S1 has clarified the producer thread is reached identically.
## ours-side analog status
The relevant kernel handlers are:
- `NtSetEvent` — ours `xenia-kernel/src/exports.rs` is per-AUDIT-062 archive
bit-equivalent to canary in semantics (signals the event, schedules wakeup).
Returns SUCCESS in both.
- `ExCreateThread` — ours bit-equivalent (S2 spawn matches canary trajectory
ctx + entry + suspended flag).
- `xeKeWaitForSingleObject` (wedge wait at 0x821CB1DC) — ours behaviour
matches per AUDIT-049/065 prior work; the WAIT itself is fine, what
remains broken is the signaler picking the right handle on tid=5.
Net: NO kernel handler bug. The divergence is **guest-state computed
inside the γ-signaler family at sub_8245D9D8 / sub_8245DA78 /
sub_8245DB40** — i.e. data that lives in the queue/list dispatched
by sub_82450A68.
## Reading-error #28 reclassification
S1 inadvertently committed the same class of error documented as #28 in
prior audit memory: "treating per-engine tid label numerically across
engines without a tid-mapping translation." S1 used canary's "tid=10"
verbatim and AUDIT-062's "tid=10: 0 fires" verbatim, concluding "ours's
thread set lacks the canary-tid=10 equivalent." In reality the same
guest thread exists on both, with renumbered host-side tid labels.
The correct cross-engine identity is `(entry_pc, start_ctx)`, not the
tid integer. S2 re-validates by `entry=0x82450a28 ∧ ctx=0x828f3b68`,
which uniquely identifies the spawn on both engines.
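A minimal sketch of that translation, matching threads across engines on `(entry_pc, start_ctx)` rather than on the tid integer. The record layout and the `map_tids` helper are assumptions made for illustration:

```python
def map_tids(canary_threads, ours_threads):
    """Return {canary_tid: ours_tid or None}, keyed on (entry, ctx) identity."""
    ours_by_identity = {(t["entry"], t["ctx"]): t["tid"] for t in ours_threads}
    return {
        t["tid"]: ours_by_identity.get((t["entry"], t["ctx"]))
        for t in canary_threads
    }


# Values taken from the spawn comparison above; the dict shape is hypothetical.
canary_threads = [{"tid": 10, "entry": 0x82450A28, "ctx": 0x828F3B68}]
ours_threads = [{"tid": 5, "entry": 0x82450A28, "ctx": 0x828F3B68}]

tid_map = map_tids(canary_threads, ours_threads)
```

A `None` in the result would mean the spawn genuinely has no counterpart; a differing integer only means the engines numbered the same guest thread differently.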
Do NOT register a new reading-error #; this is the existing #28 surface.
## Session 3 recommendation (refined)
Drop the spawn-chain investigation entirely. The producer thread runs.
**Path A (RECOMMENDED, ~80 LOC ours-only)**: build a probe of the
**handle-passed-to-NtSetEvent** on tid=5 (ours) inside the γ-signaler
PCs, paired with the symmetric `audit_69_event_signal_watch` capture
from S1 in canary. Compare the *sequence of handle IDs* per signaler
invocation. The first mismatch identifies the guest-state divergence
that drives wrong-handle selection.
Plumbing path: extend `--lr-trace` in ours (`crates/xenia-app/src/main.rs:233-243`)
to also capture `r3` snapshot at multiple PCs, matching canary's
audit_69 wrapper-entry capture. The capture already exists (M12 lr_trace lists
pc/tid/hw/cycle/r3/r4/r5/r6/lr). Probe ours `0x824AA2F0` and `0x824AAF50`
entry PCs.
**Path B (~50 LOC diff-tool)**: extend the diff-events JSONL absorber to
treat the canary→ours handle-ID mapping as a runtime-discovered alias
when the underlying dispatcher pointer matches. Doesn't fix the bug,
absorbs the symptom.
**Path C (root-cause, larger)**: walk sub_82450A68 dispatch loop body
disassembly + AUDIT-062 archive to identify which guest-memory struct
holds the queue of "handles to signal." The wrong handles on ours mean
this struct gets populated wrong somewhere upstream of tid=5's dispatch
loop — likely from sub_8244FEA8's 7 fires (which call sites enqueue
work, and what data is enqueued).
LOC budget for S3: Path A ~80, Path B ~50, Path C unknown (~200+).
## Cascade A/B/C/D
- **A** (DB-derived spawn chain): PASS (11 call-edges into FEA8, 1 unique call edge to FF50).
- **B** (per-fn fire counts ours+canary): PASS (ours fresh, canary from S1 fresh).
- **C** (divergence-point identification): N/A — no divergence in spawn chain;
S1 framing falsified. Re-direction recommended.
- **D** (kernel-handler bit-equivalence check): PASS (NtSetEvent / ExCreateThread
per AUDIT-062 archive; no new kernel bug detected).
Net: 3/4 PASS, 1/4 N/A (because the postulated divergence wasn't there).
## Discipline
- xenia-rs HEAD UNCHANGED (sha256 of `git diff HEAD` matches S1 end).
- No canary instrumentation added this session — S1's data is fresh.
- ours-rs ran with `--ctor-probe` (read-only, lockstep-digest-unaffected
flag already in main.rs:194).
- No source modifications to ours.
- ours-rs cache (none on this host); no canary run, no canary cache to wipe.
## Artifacts
```
audit-runs/audit-069-wait-signal-producer/
session-2-spawn-walk.log (combined probe + DB queries + fires table)
writer-report-v2.md (this file)
s2/ours-probe.stdout (780 lines, 91 CTOR-PROBE records)
s2/ours-probe.stderr (241 lines, all spawn events + summary)
```
No `fix-canary-v2.diff` (no canary instrumentation added).

@@ -0,0 +1,229 @@
# AUDIT-069 Session 3 — writer report v3
Date: 2026-05-20
xenia-rs HEAD: `e6d43a23ac393004d2e5adf2f0395fd0b5e6448b` (UNCHANGED from S1/S2)
`git diff HEAD | sha256sum`: `ed30fd526643918f67311caff0a10d1346d73fd0c0323e02477883cf5ff20357`
(UNCHANGED at start AND end of S3)
No canary instrumentation added this session.
No ours source modifications. `--lr-trace` is a runtime flag (main.rs:233-243).
## Headline (HIGH confidence, direct measurement)
ours's tid=5 (= canary tid=10 by entry/ctx identity) fires the γ-signaler
family from the SAME guest LRs as canary — but **only 81 times where
canary fires 492 times (16%)**. This is NOT a "wrong-handle" bug — it is
a **producer-loop underrun**. The dispatch loop in `sub_82450A68` exits
early or starves; consumer threads then block on events that ours never
gets to signal.
S2's "the producer fires identically, just selects wrong handles" framing
is REFINED, not falsified: the producer reaches the wrappers via the
EXACT same call sites but completes ~6× fewer iterations.
## Method
Read-only `--lr-trace=0x824AA2F0,0x824AAF50` on cold ours boot, 1.5B
instructions / 47 s wallclock (and re-validated at 5B / 159s — same 81
fires, same handle universe, same import_calls=39290 → no new work after
the producer's initial burst). JSONL output to s3/ours-lr-trace.jsonl.
Cross-engine paired against S1's `signal-probe-correlated.log` (canary
data, fresh 2026-05-20).
## Per-LR fire counts
| caller LR | symbol | wrapper PC | canary tid=10 | ours tid=5 | ratio |
|---|---|---|---:|---:|---:|
| 0x8245DA44 | γ-D-A (sub_8245D9D8) | 0x824AA2F0 | 23 | 5 | 22% |
| 0x8245DB08 | γ-D-B (sub_8245DA78) | 0x824AA2F0 | 8 | 1 | 12% |
| 0x8245DC5C | γ-DB40 (sub_8245DB40 NEW) | 0x824AAF50 | 461 | 75 | 16% |
| **TOTAL** | | | **492** | **81** | **16%** |
ours runs the same producer code, but the loop terminates early. S2's per-PC
fire-count table also shows ours = 6/1/75 for the three γ-fns — this S3
data agrees with S2 for the wrapper-entry side too.
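The percentages in the table follow directly from the raw counts. A short sketch of the arithmetic, with the counts copied from the table and `summarize` as a hypothetical helper:

```python
# (canary tid=10 fires, ours tid=5 fires) per caller LR, from the table above.
fires = {
    0x8245DA44: (23, 5),
    0x8245DB08: (8, 1),
    0x8245DC5C: (461, 75),
}


def summarize(fires):
    """Return (canary_total, ours_total, {lr: ours/canary ratio})."""
    canary_total = sum(c for c, _ in fires.values())
    ours_total = sum(o for _, o in fires.values())
    per_lr = {lr: o / c for lr, (c, o) in fires.items()}
    return canary_total, ours_total, per_lr


canary_total, ours_total, per_lr = summarize(fires)
overall = ours_total / canary_total
```

The total works out to 81/492, about 16%, which is the same as canary firing roughly 6 times as often.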
## Handle namespaces are incomparable by raw ID
- canary uses `XEvent::native_object()` pseudo-handles `F8000xxx` (high bit
set, encodes a synthetic ID assigned by `XObject::GetNativeObject`).
- ours uses normal slot IDs `0x10xx` from the handle-slot allocator.
Comparison must be by (a) **position in the per-LR sequence** and (b)
**call args** (size r5, signal-kind r4).
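A minimal sketch of a position-aligned comparison. The record fields (`handle`, `r5`) and the sample values are illustrative assumptions and do not reproduce the actual trace schema; only the comparison strategy is the point:

```python
from itertools import zip_longest


def first_mismatch(canary_seq, ours_seq, key):
    """Return (position, reason) of the first divergence, or None if aligned.

    Records are compared through `key`, so raw handle IDs (which live in
    different namespaces on each engine) never enter the comparison.
    """
    for pos, (c, o) in enumerate(zip_longest(canary_seq, ours_seq)):
        if c is None or o is None:
            return pos, "length"      # one engine stopped firing first
        if key(c) != key(o):
            return pos, "args"        # same position, different call args
    return None


# Illustrative per-LR sequences; handles differ by namespace, r5 matches.
canary_seq = [{"handle": 0xF80000C8, "r5": 0x800}, {"handle": 0xF80000C8, "r5": 0x800}]
ours_seq = [{"handle": 0x10E0, "r5": 0x800}]

result = first_mismatch(canary_seq, ours_seq, key=lambda rec: rec["r5"])
```

A `"length"` result at position N means the args agreed for the first N invocations and one side simply produced fewer of them.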
## Position-0 args MATCH (HIGH confidence, direct measurement)
| LR | r5 (size / kind) | matches? |
|---|---|---|
| 0x8245DC5C | ours=0x800 / canary=0x800 | YES |
| 0x8245DA44 | ours=2 (Set) / canary=2 | YES |
| 0x8245DB08 | ours=2 / canary=2 | YES |
r4 (buffer/ctx pointers) DIFFER in absolute address (different memory
layouts) but TYPE-shaped identically. The first invocation of each
signaler is structurally identical. The divergence is in COUNT of
subsequent loop iterations, not in handle-selection of position-0.
See `s3/handle-sequence-diff.md` for full position-aligned table.
## γ-DB40 signal-target distribution (the 461-vs-75 case)
| canary handle | count | ours handle | count |
|---|---:|---|---:|
| F80000C8 | 229 | 0x000010E0 | 69 |
| F80000DC | 79 | 0x00001040 | 1 |
| F8000078 | 71 | 0x0000105C | 1 |
| F80000BC | 39 | 0x00001098 | 1 |
| F800012C | 28 | 0x000010AC | 1 |
| F80000B4 | 7 | 0x000010D0 | 1 |
| F8000044 | 4 | 0x0000121C | 1 |
Shape: both have one dominant handle, but it absorbs about half the signals
in canary (229/461=50%) and nearly all of them in ours (69/75=92%). ours's
tail is truncated: only 7 distinct handles in γ-DB40 vs canary's 10+.
This is consistent with **the producer enqueues the same kinds of work
items but the upstream feeder under-fires**, so the dominant work-item
(handle `0x10E0` ↔ `F80000C8` by position) gets some iterations,
the next-most-common items get truncated to 1×, and the long tail
(canary's `F80000DC` 79× / `F8000078` 71×) is mostly missing.
## Wedge handle status (HIGH confidence)
AUDIT-062 archive recorded ours wedge handles `0x12AC` and `0x12B8` with
`<NO_SIGNALS_DESPITE_WAITS>` annotation in a deeper-boot run.
In S3's lr-trace: **handle 0x12AC count = 0, handle 0x12B8 count = 0**.
**No handle ≥ 0x121C appears in tid=5's signal trace at all.**
Max handle observed in this run: 0x121C (cache:/aab216c3 NtCreateFile).
The wedge handles are NEVER allocated in this 5B-instruction run, because
boot terminates **before** the trajectory that would create them. The
producer fires 81 times, then tid=5 goes quiet; the import_call counter
freezes at 39,290; `--halt-on-deadlock` does NOT trigger (consumers wait
on existing events that were never the wedge in this run).
**This is a stronger statement than "the wedge handle is never signaled":
the wedge handle is never even CREATED, because the boot never reaches
the point of creating it.** ours's boot trajectory is truncated by the
producer underrun upstream.
## Classification: producer-loop underrun (HIGH confidence)
NOT a race (timing-dependent), NOT a wrong-handle bug (the args at
matching positions are structurally identical), NOT a missing-kernel-
handler bug (the signals that DO fire pass through bit-equivalent
wrappers).
It is **producer-loop underrun**: sub_82450A68's dispatch loop iterates
fewer times. Either:
1. The work queue (read from guest memory by sub_82450A68) is populated
with fewer items by some upstream feeder.
2. The dispatch loop's exit condition trips early.
3. The thread blocks on a dispatcher event that never gets re-signaled.
Mechanism candidates (S4 to discriminate):
- **upstream feeder**: callers of sub_8244FEA8 (11 sites in DB) — one
enqueues less work in ours. Most likely the audio cluster
(sub_8225EE20) or sub_82452DC0 (2 calls) given they relate to APUBUG-
PRODUCER-001 territory.
- **dispatch loop exit**: the loop reads a flag from the dispatcher
struct at `0x828F3B68 + offset`; a state divergence there exits early.
- **inner KeWait at 0x824AB240** (mentioned in S1 spawn-chain notes):
if this wait times out / fails differently in ours, the loop exits.
## Reading-error registry
NO new reading-error class needed. This session confirms one existing
class:
- **#28 cross-engine tid label mismatch** — used correctly here
(compared by entry/ctx, not by tid integer).
- **AUDIT-062 "wrong handles" framing** is a SYMPTOM of the producer
underrun (fewer signals → some handles signaled, others starved),
not a separate bug.
## Cascade
- **A** (capture ours per-PC signaler firings): PASS (137 records, 81 on tid=5).
- **B** (parallel canary sequence from S1): PASS (492 records on tid=10).
- **C** (first-mismatch identification): PASS — divergence is in iteration
count, not in handle-at-position-0. Position-0 args match structurally.
- **D** (race-vs-missing-signal classification): PASS — neither pure race
nor pure missing-signal. It is **producer-loop underrun** (boot doesn't
reach the wedge-handle-creating subsystem).
Net 4/4 PASS.
## S4 recommendation (refined)
**Drop the "wrong-handles-from-γ-signaler" framing.** Focus upstream on
WHY tid=5's dispatch loop runs ~5× fewer iterations.
### Path A (RECOMMENDED, ~30 LOC ours-only diagnostic, no source mod)
Use `--lr-trace=0x82450A68` (the dispatch-loop body PC) plus the existing
`--branch-probe` to see WHERE in the loop body ours exits. If the loop has
a backward branch at offset X and ours's last fire is at offset Y < X, the
loop is exiting early. Pair with the inner `bl 0x824AB240` (KeWaitForMultipleObjects)
to see if the loop blocks on a wait that returns differently than canary.
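The early-exit test described in Path A can be sketched roughly as follows. This is a minimal illustration, assuming the probe fires have already been reduced to a time-ordered list of guest PCs; the function name and the sample offsets are hypothetical and are not taken from the captured data.

```python
def exits_before_back_edge(fire_pcs, fn_base, back_edge_offset):
    """True if the last probed PC that fired lies before the loop's backward branch.

    fire_pcs is a time-ordered list of guest PCs (hypothetical input shape).
    """
    if not fire_pcs:
        return False  # no fires at all: nothing to classify
    last_offset = fire_pcs[-1] - fn_base
    return last_offset < back_edge_offset


# Hypothetical offsets, for illustration only.
FN_BASE = 0x82450A68
early = exits_before_back_edge([FN_BASE + 0x3C, FN_BASE + 0x60], FN_BASE, 0xDC)
full = exits_before_back_edge([FN_BASE + 0x3C, FN_BASE + 0xDC], FN_BASE, 0xDC)
```

A `True` result only says the thread went quiet before the back-edge; it does not by itself distinguish an early exit from a blocked inner wait.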
### Path B (~80 LOC ours-only) — feeder-side capture
`--lr-trace=0x8244FEA8` on cold ours AND canary. The spawn helper has 11
static call sites in the DB-derived caller list; at runtime it fired 7× in
S2's ours run. Pair r3/r4 (the spawned thread's start_ctx args) with canary's
equivalent. ours may be missing one or more enqueues; if so, the missing
enqueue would be the upstream root cause.
### Path C (~250 LOC, larger) — work-queue struct disassembly
Disassemble sub_82450A68 body, identify the work-queue struct it reads
from (likely at `[r29 + N]` where r29 = start_ctx 0x828F3B68 or a derived
pointer). Watch the struct with `--mem-watch` to identify the populator
(which fn writes the queue items). Trace that populator upstream.
LOC budget for S4: Path A ~30, Path B ~80, Path C ~250.
**Path A first**: it gives the precise exit condition (loop-body branch vs
inner-wait timeout) at the lowest cost (~30 LOC, no source modification).
## Discipline
- xenia-rs HEAD UNCHANGED (sha256 of `git diff HEAD` matches S1/S2 end).
- No source modifications.
- `--lr-trace` is read-only, lockstep-digest-unaffected (per state.rs:1463-1500).
- No canary run this session (S1's data is fresh).
- No canary cache to wipe (no canary run).
- ours runs cold (no cache pre-population).
## Artifacts
```
audit-runs/audit-069-wait-signal-producer/s3/
ours-lr-trace.jsonl (137 records, both PCs, all tids)
ours-lr-trace.stderr (run log + counters)
ours-lr-trace.stdout (empty under --quiet)
ours-lr-trace-824AA2F0.log (60 records, NtSetEvent wrapper)
ours-lr-trace-824AAF50.log (77 records, Ke wrapper)
ours-lr-trace-extended.{jsonl,stderr,stdout} (5B-instr re-validation: same 81 fires)
handle-sequence-diff.md (parallel comparison + first-mismatch table)
writer-report-v3.md (this file)
```
No fresh canary run was needed — S1's `signal-probe-correlated.log`
(154,187 lines) carries all canary signal-probe data.
## Summary of S1 → S2 → S3 progression
- **S1**: identified canary's tid=10 as the signaler; claimed ours lacks
this thread (FALSIFIED by S2).
- **S2**: spawn-chain runs identically on ours tid=5; refined to "wrong-
handle selection" downstream (REFINED by S3).
- **S3**: ours runs identical PC/LR chain but with ~5× fewer iterations.
Loop underrun classification. Wedge handle never even gets created in
ours's truncated boot trajectory.
The bug is **upstream of the γ-signaler**: in WHAT the dispatch loop
reads from the work queue, or in the loop's exit condition.

@@ -0,0 +1,357 @@
# AUDIT-069 Session 4 — writer report v4
Date: 2026-05-20
xenia-rs HEAD: `e6d43a23ac393004d2e5adf2f0395fd0b5e6448b` (UNCHANGED from S1/S2/S3)
`git diff HEAD | sha256sum`: `ed30fd526643918f67311caff0a10d1346d73fd0c0323e02477883cf5ff20357`
(UNCHANGED at start AND end of S4)
No ours source modifications. No canary instrumentation added.
Canary `audit_61_branch_probe_pcs` cvar used (pre-existing from S1).
Canary cache restored from `/tmp/canary-cache-bak-audit-068`.
## Headline (HIGH confidence — direct per-iteration measurement)
S3's "producer-loop underrun" framing pointed in the right direction
but mis-located the divergence. **Neither engine ever takes the
exit-branch in `sub_82450A68` (PC=0x82450B50, the LR=epilog path)**.
Both engines' dispatch threads stay in the loop indefinitely (no
deadlock; just waiting).
The actual divergence is in the **return value of the
`NtWaitForMultipleObjectsEx` call at PC=0x82450B44**:
- **Ours: r3 = 0x00000001 in 91/91 captures (100%)** — semaphore acquired.
- **Canary: r3 = 0x00000102 in 3/4 captures (75%)** — WAIT_TIMEOUT.
The two handles being waited on are:
- **handle[0] = NtCreateEvent** at `[r31+88]` — the STOP event (signal → exit).
- **handle[1] = NtCreateSemaphore(InitialCount=0, MaximumCount=0x7FFFFFFF)**
at `[r31+92]` — the WORK semaphore (signal → process work).
Both created by `sub_8244FF50` (spawn helper) BEFORE `ExCreateThread`.
mem-watch confirms handle slots in ours: `0x104C` (event) / `0x1050`
(semaphore) at run-1; absolute IDs drift across runs but the slot
layout is invariant.
This is **NOT an exit-branch divergence and NOT loop-underrun in the
literal sense; it is a SEMAPHORE-STATE divergence**. In ours the
work semaphore is available at every captured wait (either the count is
already non-zero at wait entry or a release arrives within the 16 ms
timeout, so the wait always returns success); in canary the count is zero
at most wait entries and stays zero for the full 16 ms relative timeout,
so the wait times out.
## Method (READ-ONLY, no source mod)
1. Disassembled `sub_82450A68` body (80 instructions) via
`xenia-rs disasm --at 0x82450A68 -n 200`. Saved to
`s4/sub_82450A68-disasm.txt`.
2. Identified loop topology: prolog → wait-#1 → body (with inner search
over 5-slot table at [r31+112..212]) → dispatch (bl 0x82450B68 →
γ-signaler family) → re-wait → back-edge OR exit.
3. Ran ours-cold with `--branch-probe=` on 14 BB-entry PCs covering all
loop-body paths. Captured 696 records over ~80s wallclock /
91 loop iterations.
4. Ran canary-cold (cache wiped → restored from
`/tmp/canary-cache-bak-audit-068`) with same `audit_61_branch_probe_pcs`
cvar set. Canary process faulted in vkd3d-proton at ~10s wallclock;
captured 35 records / 4 loop iterations. Sufficient to surface the
r3 distribution.
5. Used `--mem-watch=0x828F3BC0,0x828F3BC4` to identify which ours
handle IDs live in slots `[r31+88]` and `[r31+92]`. Then
disassembled `sub_8244FF50` to confirm event-vs-semaphore types via
the import jumps (`NtCreateEvent` at 0x824A9F18, `NtCreateSemaphore`
at 0x824AB0C0).
6. Cross-checked ours's kernel handlers (`nt_wait_for_multiple_objects_ex`,
`do_wait_multiple`, `handle_consume`, `nt_release_semaphore`,
`try_release_semaphore`, `wake_eligible_waiters`) — code looks
correct in isolation; the divergence is NOT in these handlers
directly.
## Per-PC iteration counts
| PC | path | ours fires | canary fires | note |
|---|---|---:|---:|---|
| 0x82450AA4 | first-iter entry | 1 | 1 | both entered once |
| 0x82450AAC | back-edge target | 91 | 4 | canary crashed early |
| 0x82450AC0 | flag@212==0 → r4=5 | 2 | 0 | rare path |
| 0x82450AC8 | flag@212!=0 → search | 90 | 4 | dominant |
| 0x82450AE4 | inner-search continue | 72 | 17 | |
| 0x82450AF4 | search-exhausted | 8 | 3 | no candidate found |
| 0x82450AF8 | candidate-found | 82 | 1 | |
| 0x82450B04 | budget skip | 81 | 0 | |
| 0x82450B10 | budget refresh | 8 | 0 | |
| 0x82450B28 | dispatch entry | 74 | 1 | bl 0x82450B68 |
| 0x82450B34 | re-wait entry | 92 | 4 | |
| **0x82450B50** | **EXIT (epilog)** | **0** | **0** | **never reached** |
## r3 at back-edge (the divergence signal)
| | ours | canary |
|---|---|---|
| r3=0x1 | 91/91 (100%) | 1/4 (25%) |
| r3=0x102 (TIMEOUT) | 0/91 (0%) | 3/4 (75%) |
| r3=0x0 (handle[0] signaled) | 0/91 | 0/4 |
| r3=other | 0/91 | 0/4 |
This is the **per-iteration measurement** the user's framing predicted.
The matching iterations show different r3 values at the SAME PC. The
"load feeding the predicate" is, however, NOT a guest-memory load — it
is the kernel-side return of `NtWaitForMultipleObjectsEx`. The
divergent KERNEL STATE is the work-semaphore count.
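The r3 tally above could be reproduced from the canary log with a small parser along these lines. It is a sketch only: it assumes the `AUDIT-061-BR` records are space-separated `key=value` tokens with hexadecimal values, as the cvar's documented format suggests, and the sample lines are hypothetical rather than copied from the capture.

```python
import re
from collections import Counter

# key=value tokens as they appear in an AUDIT-061-BR record (assumed layout)
FIELD = re.compile(r"(\w+)=(\S+)")


def parse_probe_line(line):
    """Return the record's fields as a dict, or None for unrelated log lines."""
    if "AUDIT-061-BR" not in line:
        return None
    return dict(FIELD.findall(line))


def r3_histogram(lines, pc):
    """Tally r3 values over the records captured at one probe PC."""
    hist = Counter()
    for line in lines:
        rec = parse_probe_line(line)
        if rec is not None and int(rec["pc"], 16) == pc:
            hist[int(rec["r3"], 16)] += 1
    return hist


# Hypothetical sample lines, shaped like the documented record format.
SAMPLE = [
    "AUDIT-061-BR pc=0x82450AAC lr=0x82450B48 cr0=G cr6=G r3=0x00000001 tid=10",
    "AUDIT-061-BR pc=0x82450AAC lr=0x82450B48 cr0=G cr6=G r3=0x00000102 tid=10",
    "AUDIT-061-BR pc=0x82450AAC lr=0x82450B48 cr0=G cr6=G r3=0x00000102 tid=10",
    "unrelated log line",
    "AUDIT-061-BR pc=0x82450B34 lr=0x82450B30 cr0=G cr6=G r3=0x00000001 tid=10",
]
hist = r3_histogram(SAMPLE, 0x82450AAC)
```

Filtering on the back-edge target PC keeps the tally aligned with one loop iteration per record.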
## Wait wrapper chain (disasm-derived)
```
sub_824AB240:
li r7, 0 ; alertable = 0
b 0x824AB190 ; tail-jump
sub_824AB190(r3=numObj, r4=&handles, r5=WaitMode, r6=Timeout(=16 ms), r7=Alertable):
...
bl 0x824ACA88 ; converts r4=16 ms → LARGE_INTEGER -160000 (relative 100-ns ticks)
...
bl 0x8284E08C ; NtWaitForMultipleObjectsEx (ord 254, import @ VA 0x8284E08C)
; returns NTSTATUS in r3:
; 0 = WAIT_OBJECT_0 = handle[0] (stop event) signaled
; 1 = WAIT_OBJECT_0+1 = handle[1] (work semaphore) acquired (atomically decrements count by 1)
; 0x102 = WAIT_TIMEOUT = 16 ms elapsed with no signal
```
`sub_82450A68` branches on this:
- `cmplwi cr6, r3, 0; beq cr6, 0xB50` → r3 == 0 → EXIT (stop event signaled)
- `cmplwi cr6, r3, 0; bne cr6, 0xAAC` → r3 != 0 (including 0x102) → CONTINUE
- r3 == 1 → at least one work-item is available → run the inner table search
- r3 == 0x102 → just a 16ms timer wake; inner search will likely find no candidate
and the loop just re-waits
In canary's brief 4-iteration captured window, only iteration-0 had real
work (`r3=1`); iterations 1-3 were timer-wakes (`r3=0x102`). In ours's
91-iteration window, all back-edges saw `r3=1`: someone has released
the semaphore at least once between each consume.
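The branch logic above maps each wait return value to one loop outcome. A minimal sketch of that mapping, applied to the captured r3 distributions, might look like this; the function and label names are illustrative, and only the three status values and the 91/91 and 1-of-4 counts come from the report.

```python
from collections import Counter

WAIT_OBJECT_0 = 0x000  # handle[0], the stop event, was signaled
WAIT_TIMEOUT = 0x102   # timeout elapsed with no signal


def classify_wait_return(r3):
    """Map the wait's r3 to the loop outcome described in the text."""
    if r3 == WAIT_OBJECT_0:
        return "exit"        # beq cr6 path: stop event signaled
    if r3 == WAIT_OBJECT_0 + 1:
        return "work"        # handle[1], the work semaphore, acquired
    if r3 == WAIT_TIMEOUT:
        return "timer-wake"  # loop continues but likely finds no candidate
    return "other"


def summarize(r3_values):
    """Count loop outcomes over a sequence of captured r3 values."""
    return dict(Counter(classify_wait_return(v) for v in r3_values))


ours = summarize([0x1] * 91)
canary = summarize([0x1, 0x102, 0x102, 0x102])
```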
## Handle slot identification (HIGH confidence)
Via `--mem-watch=0x828F3BC0,0x828F3BC4`:
```
MEM-WATCH addr=0x828f3bc0 old=0x00000000 new=0x0000104c
store_addr=0x828f3bc0 store_len=4 tid=1 pc=0x8244ffb0 lr=0x8244ffb0
MEM-WATCH addr=0x828f3bc4 old=0x00000000 new=0x00001050
store_addr=0x828f3bc4 store_len=4 tid=1 pc=0x8244ffcc lr=0x8244ffcc
```
Static disasm of writer PCs:
```
0x8244FFAC: bl 0x824A9F18 ; NtCreateEvent wrapper
0x8244FFB0: stw r3, 88(r30) ; handle[0] = event = ours 0x104C
0x8244FFC8: bl 0x824AB0C0 ; NtCreateSemaphore wrapper (r4=0=Initial, r5=0x7FFFFFFF=Max)
0x8244FFCC: stw r3, 92(r30) ; handle[1] = semaphore = ours 0x1050
```
The semaphore is created with **InitialCount=0**. So if no one ever
calls `NtReleaseSemaphore` on it, the wait will only ever time out
(`STATUS_TIMEOUT`) unless the stop event fires. Canary's behavior (mostly
0x102, occasionally 0x1) matches a producer side that releases the
semaphore less often than once per 16 ms wait window (1 of 4 captured
waits was satisfied). Ours's behavior (always 0x1) means **a release is
available at every wait: producers release the semaphore at least as fast
as the consumer drains it**.
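A toy model can illustrate how the release cadence relative to the 16 ms timeout shapes the r3 distribution. This is a hedged sketch, not a reproduction of either engine's scheduler: it assumes a single producer releasing at a fixed period, integer millisecond time, and a fixed per-iteration work cost, all of which are hypothetical simplifications.

```python
def simulate(release_period_ms, iterations, timeout_ms=16, work_ms=1):
    """Return the wait result (0x1 or 0x102) for each loop iteration."""
    t = 0          # current time in ms
    released = 0   # releases already credited to the semaphore
    count = 0      # semaphore count (InitialCount=0)
    results = []
    for _ in range(iterations):
        # Credit every release that happened up to the wait entry.
        due = t // release_period_ms
        count += due - released
        released = due
        if count > 0:
            count -= 1                # token already available
            results.append(0x1)
            t += work_ms
            continue
        next_release = (released + 1) * release_period_ms
        if next_release <= t + timeout_ms:
            t = next_release          # wait is satisfied when the release lands
            released += 1             # that token is consumed immediately
            results.append(0x1)
            t += work_ms
        else:
            t += timeout_ms           # nothing arrived: WAIT_TIMEOUT
            results.append(0x102)
    return results


slow_producer = simulate(release_period_ms=64, iterations=4)
fast_producer = simulate(release_period_ms=4, iterations=8)
```

In this model any release period at or below the timeout yields 0x1 on every wait, so an all-0x1 trace constrains the release cadence but does not by itself prove the count is accumulating.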
## NtReleaseSemaphore call graph (xrefs to wrapper sub_824AB158)
Wrapper sub_824AB158 calls NtReleaseSemaphore (ord 243, import @
VA 0x8284E07C). The listing below shows 26 call sites across 18 functions:
```
0x822c6770 fn=0x822c6748
0x822c6848 fn=0x822c6808
0x822c95c4 .. 0x822c9718 fn=0x822c8b50 (×6 inline call sites)
0x822f23e8 fn=0x822f2328
0x823dd7f8 fn=0x823dd770
0x823dda3c fn=0x823dd838
0x823df008..1b4 fn=0x823de4b8 (×3)
0x823df604 fn=0x823df320
0x82450310 fn=0x82450218 ← dispatcher-module enqueuer (callers: sub_82452DC0 ×2)
0x824504c4 fn=0x824503A0 ← dispatcher-module enqueuer (callers: sub_82452690, sub_8245E1D8)
0x82450cdc fn=0x82450b68 ← THE DISPATCH FUNCTION itself (self-release)
0x82450d28 fn=0x82450b68 ← THE DISPATCH FUNCTION itself (self-release)
0x82456b48 fn=0x824569c0 (jump form)
0x82458020 fn=0x82457fe0
0x824584c8 fn=0x82458468
0x82459424 fn=0x824591c0
0x8245ab6c fn=0x8245aaf0
0x8245ac6c fn=0x8245abd8
0x8245ade0 fn=0x8245ad00
```
**Critical observation**: the dispatch function `sub_82450B68`
contains TWO release sites (at offsets 0xCDC, 0xD28). Each successful
dispatch run can release the semaphore again. If both branches release
+1 token, and the wait consumes only -1 per iteration, the count would
drift up. This is consistent with the "ours over-released" hypothesis.
Some sub_82450B68 branches release the semaphore via `lwz r3, 92(r27)`
which is `handle[1]` of the dispatcher itself. So the dispatch function
re-fills its own pipe.
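The token arithmetic behind the drift argument can be sketched as a simple balance: each wait takes one token and each dispatch may put some back. The model below is illustrative only; the seed of one external release and the fixed number of self-releases per dispatch are assumptions, not measured values.

```python
def count_after(iterations, releases_per_dispatch, seed=1):
    """Semaphore count after N wait/dispatch rounds in a toy token model."""
    count = seed  # one external release to start the loop (assumption)
    for _ in range(iterations):
        if count == 0:
            break  # the next wait would block or time out
        count -= 1                      # the wait acquires one token
        count += releases_per_dispatch  # the dispatch body self-releases
    return count


steady = count_after(91, releases_per_dispatch=1)    # one in, one out
drifting = count_after(91, releases_per_dispatch=2)  # both release sites fire
starved = count_after(91, releases_per_dispatch=0)   # no self-release
```

Only the two-release case accumulates tokens; a single self-release per dispatch keeps the loop fed without any upward drift.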
## Hypothesis (MEDIUM-HIGH confidence)
The semaphore is being over-released in ours due to a divergent
**dispatch-loop control flow inside `sub_82450B68`** that
differentially decides whether to fire the self-release. Either:
(a) ours takes a sub_82450B68 branch that releases when canary's doesn't
(this is the dual of S3's question: which sub-branches differ?), OR
(b) ours's parse_timeout scales the 16 ms relative timeout by /100
(exports.rs:4495 — `magnitude.max(1) / 100`), turning a 16 ms wall-clock
timeout into 1,600 emulator-ticks. This may differentially interact
with how often the semaphore gets a release between wait entries.
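The arithmetic in (b) can be checked with a short worked example. This is a sketch under stated assumptions: the helper names are hypothetical, and taking the absolute value to obtain `magnitude` is a guess at what the cited Rust expression operates on; only the 16 ms input, the -160000 tick encoding, and the `/ 100` scaling come from the text.

```python
TICKS_PER_MS = 10_000  # number of 100 ns ticks in one millisecond


def relative_timeout_ticks(ms):
    """Relative timeouts are encoded as negative 100 ns tick counts."""
    return -ms * TICKS_PER_MS


def scaled_magnitude(ticks):
    """Integer mirror of the `magnitude.max(1) / 100` expression cited above."""
    magnitude = abs(ticks)  # assumption: magnitude is the absolute tick count
    return max(magnitude, 1) // 100


ticks = relative_timeout_ticks(16)
emulator_ticks = scaled_magnitude(ticks)
```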
The exit-branch-at-matching-iteration framing from the user's task spec
does NOT apply here: there IS no exit-branch divergence (both never
exit). The divergence is in the wait return value, which has no
proximate guest-memory load. The "load feeding the predicate" is a
kernel-state read (the semaphore count) performed inside the kernel
import handler itself.
## Most-recent kernel calls (tid=5 in ours, from S3 lr-trace data + S4 cross-check)
Most-recent kernel calls before each wait at PC=0x82450B44 (re-wait
site), on ours tid=5:
- `NtReleaseSemaphore(handle=0x1050, count=1)` via wrapper
sub_824AB158, lr=0x82450CDC OR lr=0x82450D28 (both inside sub_82450B68
dispatch body) — self-release in the dispatch tail.
- `KeSetEvent(handle=0x10xx)` via wrapper sub_824AA2F0 OR sub_824AAF50 —
γ-signaler family fires (the audit's original signaler PCs from S1/S3).
- `KeQueryPerformanceCounter`-like via sub_824AA830 — used in budget
refresh path.
In **canary**, the equivalent sequence per S1's signal-probe-correlated.log
(180s window) is similar (γ-signalers fire 492× on tid=10), but the
SELF-RELEASE rate matters more — that determines whether the consumer
keeps seeing a non-zero semaphore.
## S5 recommendation (refined)
The right next step is **NOT** to walk further upstream in the
γ-signaler chain (S3's lead). It is to **measure the per-branch flow
inside `sub_82450B68` itself** — find which of its many branches
release the semaphore and how that branch is selected.
### Path A (RECOMMENDED, ~0 LOC, read-only)
`--branch-probe` covering `sub_82450B68` body (PCs 0x82450B68 ..
0x82451238, the dispatch body). Want to capture:
1. Frequency at the two release sites `0x82450CDC` and `0x82450D28`
(per-call cumulative count on tid=5).
2. Frequency at the OTHER exit sites in sub_82450B68 (e.g. the early
return at `0x82450EE8` which does NOT release).
If ours's release-rate at CDC/D28 is significantly higher than canary's,
that confirms (a). If similar, then (b) becomes the next theory.
### Path B (~80 LOC ours-side probe, requires source instrumentation)
Confirm whether the magnitude/100 scaling in
`xenia_kernel::exports::parse_timeout` actually causes the divergence.
`--branch-probe` cannot reach it: parse_timeout is host-side Rust, not
guest code, so this path needs source instrumentation. Mid-priority.
### Path C (~30 LOC canary diagnostic)
Add canary cvar `audit_69_semaphore_count_probe = VA` that emits the
post-Set count for the semaphore at native VA matching ours's
[r31+92]'s underlying X_KSEMAPHORE. Compare per-iteration count
progression canary-vs-ours.
LOC budget for S5: Path A = 0, Path B = ~80, Path C = ~30.
**Path A first** — narrows the divergence to specific sub_82450B68
sub-branch behavior at zero LOC cost.
## Cascade
- **A** (disasm sub_82450A68): PASS (HIGH) — 80-instruction body,
3 BB-paths, 12 BB-entries identified.
- **B** (ours per-iteration loop-branch trace): PASS (HIGH) —
91 back-edge captures, all r3=0x1.
- **C** (canary same trace): PARTIAL (MEDIUM) — canary crashed at
4 iterations in vkd3d-proton on exit; 4 captures sufficient to surface
r3=0x102 dominance, but not a long-window comparison.
- **D** (identify divergent load): PARTIAL (MEDIUM) — no guest-memory
load is the proximate cause; the divergence is in the kernel-side
semaphore-count state. The "load" is conceptually inside
`do_wait_multiple`'s read of `KernelObject::Semaphore.count`.
Net 2/4 PASS-HIGH, 2/4 PARTIAL-MEDIUM. Methodology learned: when both
engines stay in a loop, "which branch did ours take differently" is the
WRONG question — ask "what's different at the SAME branch."
## Confidence flags (summary)
| finding | confidence |
|---|---|
| Both engines never take exit-branch (B50) | HIGH |
| ours back-edge r3=1 always (91/91) | HIGH |
| canary back-edge r3=0x102 mostly (3/4) | HIGH |
| handle[1] is NtCreateSemaphore w/ InitialCount=0 | HIGH |
| handle[0] is NtCreateEvent | HIGH |
| Divergence is kernel-side semaphore-count state | MEDIUM-HIGH |
| sub_82450B68 self-release over-fires in ours | MEDIUM |
| parse_timeout /100 scaling is contributing | LOW-MEDIUM |
## Discipline
- xenia-rs HEAD `e6d43a23ac393004d2e5adf2f0395fd0b5e6448b` UNCHANGED
(sha256 of `git diff HEAD` matches S1/S2/S3 end at session start AND end).
- READ-ONLY ours. No source mod. `--branch-probe` / `--lr-trace` /
`--mem-watch` / `--trace-handles-focus` are runtime read-only flags
documented as "lockstep digest unaffected" (state.rs comments).
- Canary `audit_61_branch_probe_pcs` cvar enabled with our PC set; set
back to "" at session end. Verified.
- Canary `mute = true` set during run, restored to `false` at session end.
- Canary cache wiped before cold canary run, restored from
`/tmp/canary-cache-bak-audit-068` at session end.
## Artifacts
```
audit-runs/audit-069-wait-signal-producer/s4/
sub_82450A68-disasm.txt (80 ins disasm: sub_82450A28 entry + body)
ours-loop-branch-trace.stdout (696 BRANCH-PROBE records, ours-cold)
ours-loop-branch-trace.stderr (empty under --quiet)
canary-loop-branch-trace.stdout (1074 lines, 35 AUDIT-061-BR records)
canary-loop-branch-trace.stderr (89 lines, wine/vkd3d setup + final fault)
ours-mem-watch.stderr (2 MEM-WATCH records identifying handle slots)
ours-mem-watch.stdout (empty)
ours-signaler.jsonl (95 lr-trace records on wrapper PCs)
ours-handles.{stdout,stderr} (probe for handle dump; --halt-on-deadlock didn't trigger)
ours-trace-handles-summary.log (21 lines: focus startup + 8 ExCreateThread spawns)
divergence-analysis.md (per-iter table, hypothesis, S5 leads)
writer-report-v4.md (this file)
```
No canary instrumentation diff this session. No `fix-canary-s4.diff`.
## Summary of S1 → S2 → S3 → S4 arc
- **S1** (2026-05-20 AM): identified canary tid=10 as the signaler;
claimed ours lacks this thread (FALSIFIED by S2).
- **S2** (2026-05-20 noon): spawn-chain runs identically on ours tid=5;
refined to "wrong-handle selection" downstream (REFINED by S3).
- **S3** (2026-05-20 PM): ours runs identical PC/LR chain but with
~5× fewer iterations. Producer-loop underrun classification.
Wedge handle never even created in ours's truncated boot.
- **S4** (2026-05-20 evening): per-iteration branch-probe shows
**NEITHER engine ever exits the loop**. Divergence is in
`NtWaitForMultipleObjectsEx` return: ours r3=1 always (semaphore
acquired), canary r3=0x102 mostly (timeout). Root cause is
**semaphore-count state divergence** — ours's work-semaphore is
over-released relative to consume rate, OR ours's timeout never
fires before signal. Hypothesis: divergence inside `sub_82450B68`
dispatch body's self-release logic.
The S5 question is no longer "which earlier kernel call differs" —
it is "which sub-branch of `sub_82450B68` releases the semaphore in
ours that canary's doesn't release in." Read-only branch-probe on
sub_82450B68 body PCs.

@@ -0,0 +1,122 @@
# AUDIT-069 Session 5 — writer report (RECOVERED from captured data; agent timed out before authoring)
Date: 2026-05-20.
Status: The dispatched agent (`a9380b477f5cb4b3f`) ran ~50 min and timed out via API stream-idle error. The instrumentation, builds, and capture runs completed. The agent did NOT author the final analysis. This report is composed by the parent agent from the captured artifact files (canary-release-trace.log, ours-release-trace.jsonl, fix-canary-s5.diff).
xenia-rs HEAD `e6d43a23ac393004d2e5adf2f0395fd0b5e6448b` UNCHANGED. `sha256(git diff HEAD)` = `ed30fd526643918f67311caff0a10d1346d73fd0c0323e02477883cf5ff20357` UNCHANGED (matches S1-S4 end).
## Canary handle identification
Canary's work-semaphore: handle `0xF800003C` (a single semaphore, released across all 414 events). The canary-side hook records every release with `lr=0x824AB168` (the post-call PC inside `sub_824AB158`), so the lr field does not identify the guest caller. Getting the GUEST-side caller LR would require probing at the wrapper-entry PC and capturing the caller's LR there; this was not done in this session.
## Per-tid release counts
### Canary (`canary-release-trace.log`, 414 events)
| tid | count | role |
|---:|---:|---|
| 10 | 382 | worker (self-release inside dispatch fn) |
| 18 | 14 | producer |
| 17 | 9 | producer |
| 6 | 7 | main thread |
| 16 | 1 | producer |
| 26 | 1 | producer |
### Ours (`ours-release-trace.jsonl`, 99 events)
| tid | count | role |
|---:|---:|---|
| 5 | 90 | worker (= canary tid=10 by entry/ctx identity) |
| 1 | 8 | main thread (= canary tid=6) |
| 13 | 1 | producer (the wedged thread) |
## Per-LR release counts (ours only — canary lr field captured wrapper-internal addr, not useful)
| ours lr | count | likely site |
|---|---:|---|
| 0x82450ce0 | 68 | inside sub_82450B68 dispatch fn (the dominant self-release) |
| 0x82450d2c | 7 | second self-release in same fn |
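| 0x82450314 | 7 | sub_82450218 (call site 0x82450310 per S4 xrefs; producer A) |
| 0x8245ab70 | 7 | sub_8245AAF0 (call site 0x8245ab6c per S4 xrefs; producer B) |
| 0x824584cc | 4 | sub_82458468 (call site 0x824584c8 per S4 xrefs; producer C) |
| 0x82458024 | 4 | sub_82457FE0 (call site 0x82458020 per S4 xrefs; producer D) |
| 0x824504c8 | 1 | sub_824503A0 (call site 0x824504c4 per S4 xrefs; producer E) |
| 0x822f23ec | 1 | sub_822F2328 (call site 0x822f23e8 per S4 xrefs; main-thread producer F) |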
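Each captured lr can be tied back to a release site by subtracting one instruction: on PowerPC, `bl` stores the address of the following instruction in LR. The sketch below resolves the eight lr values in the table against the S4 xref listing for `sub_824AB158`; the dictionary is a hand-copied subset of that listing and the helper names are illustrative.

```python
# call site -> containing function, copied from the S4 xref listing (subset)
XREF_FN_BY_CALL_SITE = {
    0x82450CDC: 0x82450B68,
    0x82450D28: 0x82450B68,
    0x82450310: 0x82450218,
    0x8245AB6C: 0x8245AAF0,
    0x824584C8: 0x82458468,
    0x82458020: 0x82457FE0,
    0x824504C4: 0x824503A0,
    0x822F23E8: 0x822F2328,
}


def call_site_of(lr):
    """`bl` sets LR to the instruction after the call, 4 bytes past the call site."""
    return lr - 4


def containing_fn(lr):
    """Function start for a captured lr, or None if the site is not listed."""
    return XREF_FN_BY_CALL_SITE.get(call_site_of(lr))


CAPTURED_LRS = [0x82450CE0, 0x82450D2C, 0x82450314, 0x8245AB70,
                0x824584CC, 0x82458024, 0x824504C8, 0x822F23EC]
resolved = {lr: containing_fn(lr) for lr in CAPTURED_LRS}
```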
## Hypothesis verdict
- **H1 (ours over-releases the work-semaphore)**: **FALSIFIED.** Ours releases 99 total vs canary 414 (24% of canary's rate). The worker self-release shows 90 in ours vs 382 in canary (24%). Ours does NOT over-release.
- **H2 (canary processes a batch per iteration)**: **PARTIALLY SUPPORTED but insufficient.** Per-iteration rates (combining S4's iteration data):
- Canary: 4 iterations in ~10s with 382 worker releases ≈ 95 releases per iteration (HIGH variance, n=4 is too small)
- Ours: 91 iterations in ~80s (per S4) with 90 worker releases ≈ 1 release per iteration
The per-iteration ratio is suggestive but the canary sample size remains too thin for a HIGH-confidence claim.
- **H3 (new): SYSTEMIC under-production of work in ours.** Producer-tid releases:
- Canary: 32 events across 5 producer tids (16, 17, 18, 26 + main 6)
- Ours: 9 events across 2 producer tids (1, 13)
Ours has fewer producer threads contributing AND fewer events per producer. The bug isn't localized to a single fn or handle — it's distributed across the production-side of the work-queue. Ratio ~28%, consistent with the worker self-release ratio.
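The three ratios quoted for H1 and H3 follow directly from the per-tid tables. The worked example below recomputes them; the counts are copied from those tables, while the variable and function names are illustrative.

```python
# Per-tid release counts copied from the tables above.
canary = {10: 382, 18: 14, 17: 9, 6: 7, 16: 1, 26: 1}
ours = {5: 90, 1: 8, 13: 1}

CANARY_WORKER, OURS_WORKER = 10, 5  # worker tids per the entry/ctx mapping


def percent(ours_n, canary_n):
    """Ours's count as a rounded percentage of canary's."""
    return round(100 * ours_n / canary_n)


total_pct = percent(sum(ours.values()), sum(canary.values()))
worker_pct = percent(ours[OURS_WORKER], canary[CANARY_WORKER])
producer_pct = percent(
    sum(ours.values()) - ours[OURS_WORKER],
    sum(canary.values()) - canary[CANARY_WORKER],
)
```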
## Reconciliation with S3
S3 measured γ-signals: ours 81 / canary 492 (16%). S5 measures semaphore releases: ours 99 / canary 414 (24%). Same shape of disparity, slightly different ratio because the two events are at different points in the dispatch path. Both consistent with H3.
## Confidence labels
- Per-tid release counts (ours): HIGH (n=99 measured directly).
- Per-tid release counts (canary): HIGH for the count itself (n=414 measured), MEDIUM for "which canary tid is the worker" (relies on S2's entry/ctx-identity mapping).
- H1 falsification: HIGH.
- H2 partial support: LOW (canary iteration data still n=4).
- H3 (systemic under-production): MEDIUM-HIGH (consistent across two independent measurements — γ-signals from S3, releases from S5).
## Methodology pattern note
S1→S5 has been a sequence of progressively refined framings, each falsifying the prior:
- S1: "spawn-layer bug" — falsified by S2.
- S2: "wrong-handle queue" (per archive) — falsified by S3.
- S3: "producer-loop underrun" — refined by S4 (it's not underrun, it's overrun per S4's branch-probe).
- S4: "ours self-releases too much" → H1 — FALSIFIED by S5.
- S5: H3 — "systemic under-production" — at least testable across multiple measurements, NOT yet a fix point.
S5's H3 is not a localized bug. It says "ours's entire work-queue-producer ecosystem fires at only ~24-28% of canary's rate". That's a symptom description, not a root cause. The next session needs to identify WHICH producer fn fails to fire as often, and WHY.
## S6 recommendation
Given S5's H3, the next session should **identify the specific producer-tid divergence**, not continue investigating the dispatch fn. Compare:
- Canary tid=18 (14 releases) vs ours's analog tid — does ours have an analog? Per-tid count divergence at the producer level.
- Canary tid=17 (9 releases) — note: per S1, canary tid=17 is the thread that completes 16+ `sub_821CB030` calls (the wedge wait site). It contributes 9 work-semaphore releases as a producer. Ours's analog is tid=13 (the wedged thread, releases 1).
**The wedge IS the producer divergence**: ours's tid=13 is wedged in `sub_821CB030+0x1AC` and can only release the semaphore 1× before blocking. Canary's tid=17 completes its loop and releases ~9×. So the system has been circular all along:
- Worker (tid=5/10) needs work-items enqueued by producers.
- One major producer is tid=13/17 (the cache thread).
- tid=13 wedges in ours at sub_821CB030 because the worker doesn't process enough items to wake it.
- Worker doesn't process enough items because tid=13 doesn't produce enough.
This is **self-consistent with the AUDIT-049 framing**: the wedge is a producer-consumer ladder where one side can't progress without the other, and they share the work-semaphore at handle 0x1050.
The TRUE first divergence point is upstream of all this: **whatever bootstraps the system so that tid=17 (canary's cache thread) completes its initial work cycle.** Canary's first releases at host_ns=6600 and 9503200 (tid=6 main) happen before tid=10 starts. Ours's tid=1 main also fires releases. The QUESTION: does ours's tid=1 release the right semaphore at the right host_ns?
## S6 path
Capture the **first N=20 release events on each engine, time-ordered**. Compare wallclock + tid + LR. Find the first event canary fires that ours does NOT fire (or vice versa). That's the bootstrap divergence.
LOC: 0 ours, 0 canary (data already captured). Just analysis of the existing logs.
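The comparison could be sketched as below. This is a hedged outline, not the session's tooling: it assumes each log has been reduced to `(timestamp, tid)` pairs, compares only the ordered tid sequence (timestamps are not comparable across engines, and canary's lr field is wrapper-internal), and uses the tid correspondence from the per-tid tables. The sample events are hypothetical apart from the two canary host_ns values quoted above.

```python
# ours tid -> canary tid, per the role column of the per-tid tables
OURS_TO_CANARY_TID = {5: 10, 1: 6, 13: 17}


def first_divergence(canary_events, ours_events, n=20):
    """Index and tids of the first mismatch in the first n releases, or None."""
    canary_tids = [tid for _, tid in sorted(canary_events)[:n]]
    ours_tids = [OURS_TO_CANARY_TID.get(tid) for _, tid in sorted(ours_events)[:n]]
    for i, (c_tid, o_tid) in enumerate(zip(canary_tids, ours_tids)):
        if c_tid != o_tid:
            return i, c_tid, o_tid
    if len(canary_tids) != len(ours_tids):
        i = min(len(canary_tids), len(ours_tids))
        c_tid = canary_tids[i] if i < len(canary_tids) else None
        o_tid = ours_tids[i] if i < len(ours_tids) else None
        return i, c_tid, o_tid  # one engine ran out of events first
    return None


# Hypothetical event lists, for illustration only.
canary_events = [(6600, 6), (9503200, 6), (12000000, 10)]
ours_events = [(100, 1), (200, 1), (300, 13)]
divergence = first_divergence(canary_events, ours_events)
```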
## Cascade outcome
- A (canary cvar implemented + captured): PASS HIGH
- B (ours captured): PASS HIGH (existing --lr-trace)
- C (cadence comparison): PASS MEDIUM (H1 falsified high-confidence; H2 partial-low; H3 medium-high)
- D (root cause identified): N/A — narrowed but not pinpointed.
3 PASS / 1 N/A.
## Discipline
- xenia-rs HEAD UNCHANGED.
- Canary instrumentation: 2 new files, cvar-gated and default-off (audit_70_semaphore_release_watch.h + .cc).
- Canary cache will need restore from `/tmp/canary-cache-bak-audit-068` (agent timed out before doing so — manual cleanup needed).
- `--mute=true` honored on canary runs.

@@ -0,0 +1,271 @@
# AUDIT-069 Session 1 — wait-signal producer identification
Date: 2026-05-20
Status: **LANDED — signaler tid + caller fns identified; AUDIT-066 circular framing FALSIFIED**
## Headline
The wait at `sub_821CB030+0x1AC` (PC `0x821CB1DC`) — the canonical
AUDIT-049/065 wedge wait — fires in canary on two tids (worker tid=17 and
cache-loader tid=26). Both wedges are signaled by **tid=10**, a worker
thread spawned EARLY (via `sub_8244FF50` → `ExCreateThread(entry=sub_82450A28)`),
NOT by any of the four workers spawned by `sub_825070F0`. This refutes
AUDIT-066's circular framing ("γ-signaler running inside the 4 workers
spawned by sub_825070F0"): the actual signaler reaches the production
phase WITHOUT depending on sub_825070F0 firing.
## Step 1 — wait site capture (canary)
Probe: `--audit_61_branch_probe_pcs=0x821CB1DC --mute=true`, 180s cold.
| tid | r3 (handle) | r4 (timeout) | r5 (wait_mode) | r6 (ctx) | r31 (stack) | lr |
|----:|------------:|-------------:|---------------:|---------:|------------:|---:|
| 17 | `F80000A4` | `FFFFFFFF` | `0` (auto) | `BC65CEC0` | `7064FA70` | `0x821CB1D0` |
| 26 | `F8000110` | `FFFFFFFF` | `0` (auto) | `BC667F80` | `708FF990` | `0x821CB1D0` |
Two distinct fires (one per logical caller). Both have r4=INFINITE timeout
matching dossier. The lr=`0x821CB1D0` is `sub_821CB030+0x1A0` = the
instruction AFTER the bl-wait — consistent with branch-probe firing at the
basic-block-entry following the wait-call's return.
Handle drift across cold runs is real: Step 1 vs Step 3 vs Step 4 trajectories
produced wait handles `{F80000A0,F8000108}` / `{F80000A0,F8000108}` /
`{F80000A4,F8000110}`. Per-run handles are still deterministic; the absolute
ID is not.
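Because absolute handle IDs drift between cold runs, cross-run comparison is easier on a normalized form. One possible sketch, assuming each run has been reduced to an ordered list of handle values, replaces every handle with the index of its first appearance. The handle values are the ones quoted above; the sequences built from them are hypothetical.

```python
def normalize_handles(handle_sequence):
    """Replace each handle by the index of its first appearance in the run."""
    first_seen = {}
    normalized = []
    for handle in handle_sequence:
        if handle not in first_seen:
            first_seen[handle] = len(first_seen)
        normalized.append(first_seen[handle])
    return normalized


# Hypothetical event orderings built from the handle pairs quoted above.
run_a = [0xF80000A0, 0xF8000108, 0xF8000108]
run_b = [0xF80000A4, 0xF8000110, 0xF8000110]
same_shape = normalize_handles(run_a) == normalize_handles(run_b)
```

Two runs with the same normalized sequence touch their handles in the same pattern even when the raw IDs differ.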
**Important framing correction**: The brief expected "~16 fires" per
AUDIT-065. This was already partly retracted by AUDIT-066 (which observed
that thid=17 "terminates via `ExTerminateThread(0)` WITHOUT ever calling
Wait inside its cache loop"). Step 1 confirms AUDIT-066's correction:
the wait at `+0x1AC` fires ~2× per boot (one for the work-queue load
that ANON_Class_713383D7 work goes through; one for the cache-loader
sister-flow). Not 16. The wait is the WORK-QUEUE wait, not a per-cache-file
IO wait.
Confidence: HIGH (probe fired, r3/r4/r5 match expected wait-call ABI,
two distinct logical fires reproducible across cold runs).
## Step 2 — instrumentation (canary, ~280 LOC additive)
New `audit_69_*` cvars + slowpath module:
- **cpu_flags.{h,cc}** (+23/+48 LOC cumulative, of which ~30 LOC are new in this session):
- `--audit_69_event_signal_watch` (CSV of guest handle IDs, max 4)
- `--audit_69_event_signal_native_ptr` (CSV of guest VAs, max 4)
- `--audit_69_log_all_sets` (bool — log EVERY XEvent::Set/Pulse fire)
- **xenia-kernel/audit_69_event_signal_watch.h** (51 LOC) — fwd decls,
hot-path inline wrapper (single relaxed atomic load + branch).
- **xenia-kernel/audit_69_event_signal_watch.cc** (193 LOC) — lazy parse +
UINT32_MAX sentinel + `XThread::TryGetCurrentThread()` for lr/tid capture.
Mirrors AUDIT-068's static-init gate pattern.
- **xenia-kernel/xevent.cc** (+9 LOC) — hook at `XEvent::Set` and
`XEvent::Pulse` (the deepest convergence of Ke/Nt set + pulse paths).
Reading-error registration: `XThread::GetCurrentThread()` asserts on host
threads; first iteration used it and crashed. Fixed by switching to
`TryGetCurrentThread()`. (Same lesson as AUDIT-067's bool-vs-pointer
asymmetry but in a different fn.)
Cumulative cross-run canary additions retained in tree (AUDIT-061/067/068/069).
## Step 3 — correlated capture
Run: cold, 180s, `--mute=true --audit_61_branch_probe_pcs=0x821CB1DC,0x824AA2F0,0x824AAF50 --audit_69_log_all_sets=true`.
Volume: 122,165 log lines (Step 3) / 155,627 lines (Step 4 with wrapper probes).
Wait fires (Step 4): 2 (tid=17, tid=26, as in Step 1 but with handle drift to F80000A4/F8000110).
Signals on wedge handles (Step 4):
| wedge handle (waited on) | wait tid | signal fires | signal lr | signaling fn | signal tid |
|---|---|---|---|---|---|
| `0xF80000A4` | 17 | **1** | `0x824AA304` | `sub_824AA2F0` (NtSetEvent wrapper) | **10** |
| `0xF8000110` | 26 | **100** | `0x824AAFC8` | `sub_824AAF50` (a generic event-set-with-arg wrapper) | **10** |
The 100 fires on F8000110 are mostly repeats: an auto-reset event releases
the waiter on the first signal, and the later sets do not wake the wedge
wait again. Volume reflects how often the work-queue processes items
targeting this synchronizer.
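The auto-reset behavior referred to here can be illustrated with a toy model. It is a simplification with hypothetical names: waiting is modeled as a non-blocking poll, and a set on an already-signaled event is counted as redundant.

```python
class AutoResetEvent:
    """Toy auto-reset event: one successful wait consumes the signal."""

    def __init__(self):
        self.signaled = False
        self.redundant_sets = 0

    def set(self):
        if self.signaled:
            self.redundant_sets += 1  # already signaled: nothing changes
        self.signaled = True

    def wait(self):
        """Non-blocking poll that consumes the signal if it is present."""
        if self.signaled:
            self.signaled = False
            return True
        return False


event = AutoResetEvent()
for _ in range(100):
    event.set()
first_wait = event.wait()
second_wait = event.wait()
```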
## Step 4 — signaler-fn resolution (sylpheed.db cross-check)
Wrapper-entry probe data for these two NtSet wrappers, filtered to tid=10:
| wrapper | lr-of-caller | caller fn | tid=10 fire count |
|---|---|---|---|
| `sub_824AA2F0` (NtSetEvent wrapper) | `0x8245DA44` | **`sub_8245D9D8`** (γ-signaler D-A per AUDIT-062) | 23 |
| `sub_824AA2F0` (NtSetEvent wrapper) | `0x8245DB08` | **`sub_8245DA78`** (γ-signaler D-B per AUDIT-062) | 8 |
| `sub_824AAF50` (Ke-style wrapper) | `0x8245DC5C` | **`sub_8245DB40`** (NEW — not previously named) | 461 |
The disassembly of `sub_824AAF50` still needs follow-up, but
lr=0x824AAFC8 (`sub_824AAF50+0x78`) is consistent with a
`bl xeKeSetEvent` followed by a status check in an N-arg helper. The
wrapper takes `(handle, ptr, size)`, and the event it signals internally
has a different handle from the input.
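The lr-to-caller attribution used in the table can be sketched as a nearest-preceding-start lookup. The function start addresses below are the ones named in this report. The helper name `resolve_lr` is hypothetical, and a real lookup would also need function end addresses (for example from `sylpheed.db`) to reject an lr that falls past the end of the nearest function.

```python
from bisect import bisect_right

# Function start addresses taken from the tables in this report.
FUNC_STARTS = sorted([
    0x8245D9D8, 0x8245DA78, 0x8245DB40,  # gamma-signaler callers
    0x824AA2F0, 0x824AAF50,              # event-set wrappers
])

def resolve_lr(lr, starts=FUNC_STARTS):
    """Map a link-register value to 'sub_XXXXXXXX+0xNN'.

    Picks the nearest known function start at or below lr. Returns None
    when lr lies below every known start.
    """
    i = bisect_right(starts, lr)
    if i == 0:
        return None
    base = starts[i - 1]
    return f"sub_{base:08X}+0x{lr - base:X}"

caller_a = resolve_lr(0x8245DA44)
caller_b = resolve_lr(0x824AAFC8)
```

With these inputs `caller_a` is `sub_8245D9D8+0x6C` and `caller_b` is `sub_824AAF50+0x78`, matching the offsets quoted in this report.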
Containing-fn cross-check (`sylpheed.db`):
- `sub_8245D9D8` and `sub_8245DA78` are in the worker cluster
(0x82450000-0x8245C000). Per AUDIT-062: both are γ-signaler-D family,
hot from worker-side, missed by AUDIT-059/060 enumeration.
- `sub_8245DB40` is in the same cluster; callers are `sub_824528A8+0x54`
and `sub_8245EE50+0x20` (both worker-cluster internal).
- All three are reached from tid=10's body fn `sub_82450A68`, the
trampoline body for the entry `sub_82450A28` (which `ExCreateThread`
registers via `sub_8244FF50`).
**tid=10 caller chain (canary)**:
```
sub_8244FEA8 (caller of sub_8244FF50; itself called from 11 sites)
→ sub_8244FF50 (spawner — calls ExCreateThread w/ entry=sub_82450A28)
→ sub_82450A28 (thread-entry trampoline:
KeSetThreadPriority(-2, 3); bl sub_82450A68)
→ sub_82450A68 (worker dispatch loop)
→ ... γ-signalers D9D8 / DA78 / DB40
```
`sub_82450A28` is referenced as a data pointer at `0x8244FFF8` (inside
`sub_8244FF50`). No call edges to it — it's purely a thread-entry data
constant passed to ExCreateThread.
## Step 5 — ours cross-reference
All identified signaler fns (`sub_8245D9D8`, `sub_8245DA78`, `sub_8245DB40`,
`sub_824AA2F0`, `sub_824AAF50`, `sub_82450A28`, `sub_8244FF50`) are GAME
(XEX) code — not kernel-imports. In ours these execute under the JIT, with
no host-side analog to compare. The relevant question is whether the
trajectory in ours REACHES these PCs.
Direct evidence from prior runs:
**AUDIT-062 ours `--lr-trace=0x824aa2f0`** trace (`ours-ntset.jsonl`, 136
fires across cold boot up to deadlock):
- tid=6: 82 NtSet fires
- tid=1: 28 fires
- tid=5: 22 fires
- tid=8: 2 fires
- tid=13: 2 fires
- **tid=10: 0 fires**
In this trace ours shows no fires from a canary-tid=10 equivalent, which
suggests ours never spawns the
`sub_8244FF50/sub_82450A28/sub_82450A68` worker (not yet confirmed by a
direct probe of the entry). This is consistent with AUDIT-057's
"thread-gap" finding: ours has fewer threads than canary.
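The per-tid breakdown above is a simple group-by over the trace. This is a minimal sketch, assuming each line of `ours-ntset.jsonl` is a JSON object with a `tid` field; that field name is an assumption, since the trace schema is not shown here. The sample records are synthetic and only reproduce the reported distribution.

```python
import json
from collections import Counter

def fires_per_tid(jsonl_lines):
    """Count trace records per thread id (assumes a 'tid' key per record)."""
    counts = Counter()
    for raw in jsonl_lines:
        raw = raw.strip()
        if not raw:
            continue  # tolerate blank lines
        record = json.loads(raw)
        counts[record["tid"]] += 1
    return counts

# Synthetic records mirroring the distribution reported above.
reported = [(6, 82), (1, 28), (5, 22), (8, 2), (13, 2)]
trace_sample = [json.dumps({"tid": t}) for t, n in reported for _ in range(n)]

tid_counts = fires_per_tid(trace_sample)
total_fires = sum(tid_counts.values())
```

The five per-tid counts sum to the 136 fires quoted for the trace, and a `Counter` returns 0 for tid=10 because no record carries it. Thread ids are assigned per engine, so a zero here shows only that this tid never called the wrapper, not which thread is missing.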
Within ours, the γ-signalers DO fire — but on tid=5 (calling sub_824AA2F0
from lr=`0x8245DA44` = `sub_8245D9D8+0x6C`) per AUDIT-062's
`ours-ntset.jsonl:line 1`. AUDIT-062 already established these signal
WRONG handles in ours (neighbors of `0x12AC` are signaled; the wedge
handle itself is not).
**Conclusion**: ours's signaler PCs exist and run, but on the wrong tids
(no tid=10 equivalent), and target the wrong handles. The PRODUCER →
SIGNALER chain in ours is structurally broken at the **thread-spawn**
layer, not the kernel-import layer.
Confidence (Step 5): MEDIUM-HIGH for the chain identification (data is
internally consistent and matches AUDIT-062's prior independent capture).
LOW on the ours-side resolution mechanism (this audit did not re-run
ours; cross-ref is read-only against prior dumps which may be stale
relative to current ours HEAD `e6d43a23…`).
## AUDIT-066 framing refutation
AUDIT-066 stated:
> the producer-side signal for THAT event comes from a γ-signaler running
> inside the 4 workers spawned by sub_825070F0 — per AUDIT-063's
> static-reachability survey of NtSet wrapper callers.
This is **falsified** by AUDIT-069 Step 3+4 evidence:
1. The signaler runs on **tid=10**, spawned by `sub_8244FF50` via
`ExCreateThread(entry=sub_82450A28)`. This is NOT one of sub_825070F0's
4 workers.
2. sub_8244FF50's caller chain does NOT require ANON_Class_713383D7's
vtable to be installed; it does NOT require sub_825070F0 to fire.
3. The circular-bootstrap concern AUDIT-066 raised ("workers can't signal
until they spawn; they can't spawn until the wedge clears") was
structurally correct framing IF the signaler were inside the
sub_825070F0 4-worker family. Since the actual signaler is tid=10
(independently spawned), the circle is **broken** — the signaler IS
reachable without the wedge clearing.
Reading-error class **#37**: static-reachability surveys (AUDIT-063 walked
12 hops from sub_82452DC0 to NtSet wrapper callers) are scoped to a
particular caller chain; they miss alternative producer paths reached via
unrelated thread-spawn sites. Always probe at the runtime SIGNAL site to
confirm which exact caller fired, not just which static path could fire.
## Cascade outcome
- **A** (capture wait site PC + r3=handle in canary): **PASS**. PC
`0x821CB1DC`, r3 captures the handle on first fire reproducibly.
- **B** (capture signal fires on the wait targets): **PASS**. 1 fire on
F80000A4 (wedge handle 1), 100 fires on F8000110 (wedge handle 2).
- **C** (resolve signaling fn + immediate caller fn): **PASS**.
`sub_824AA2F0` ← `sub_8245D9D8` / `sub_8245DA78` (γ-signaler D family);
`sub_824AAF50` ← `sub_8245DB40` (new). All on tid=10.
- **D** (ours-side cross-ref): **PARTIAL**. tid=10 IS missing in ours
per existing AUDIT-062 data; γ-signalers DO fire but on wrong tids.
Did not re-run ours in this session (per task discipline; cross-ref
read-only against prior dumps).
Net 3/4 PASS, 1/4 PARTIAL.
## Discipline
- xenia-rs HEAD `e6d43a23ac393004d2e5adf2f0395fd0b5e6448b` UNCHANGED.
`git diff HEAD | sha256sum` at session start =
`ed30fd526643918f67311caff0a10d1346d73fd0c0323e02477883cf5ff20357`
and at session end IDENTICAL.
- Canary patch is purely additive, cvar-gated default-off, UINT32_MAX
sentinel + std::once parse pattern (per AUDIT-068 discipline).
- Every canary run used `--mute=true`.
- Cache wiped before each cold run (5 cold runs total: Step 1 90s,
Step 1 180s rerun, Step 3 with handle watch, Step 3 with log_all_sets,
Step 4 with wrapper probes). Each cache moved to `/tmp/_audit_069_step*`
before next cold run.
- Cache restoration from `/tmp/canary-cache-bak-audit-068` deferred to
session end (done after this report).
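The unchanged-tree check in the first bullet can be scripted so the start and end digests are compared mechanically. This is a sketch under the assumption that `git` is on the PATH and the check runs inside the xenia-rs working tree. The helper names are hypothetical.

```python
import hashlib
import subprocess

def digest_bytes(data: bytes) -> str:
    """SHA-256 hex digest of raw bytes, matching `sha256sum` output."""
    return hashlib.sha256(data).hexdigest()

def diff_digest(repo_dir: str = ".") -> str:
    """Digest of `git diff HEAD`, i.e. `git diff HEAD | sha256sum`."""
    out = subprocess.run(
        ["git", "diff", "HEAD"],
        cwd=repo_dir,
        check=True,            # raise if git itself fails
        capture_output=True,
    ).stdout
    return digest_bytes(out)

def tree_unchanged(start_digest: str, repo_dir: str = ".") -> bool:
    """True when the working-tree diff still hashes to the recorded value."""
    return diff_digest(repo_dir) == start_digest
```

Recording `diff_digest()` at session start and calling `tree_unchanged()` at session end reproduces the comparison described above. `git diff HEAD` does not cover untracked files, so newly created files would not change the digest.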
## Artifacts
```
xenia-rs/audit-runs/audit-069-wait-signal-producer/
step1-wait-probe.log (90s baseline; 2 wait fires)
step1-wait-probe.stdout
step1-wait-probe-180s.log (180s rerun; 2 wait fires)
step1-wait-probe-180s.stdout
step3-signal-probe.log (180s; first signal-watch test;
handles drifted, partial correlation)
step3-signal-probe.stdout
step3-correlated.log (180s; log_all_sets; 120k signal fires)
step3-correlated.stdout
step4-wrapper-callers.log (180s; log_all_sets + wrapper entries;
155k events; correlated lr-to-caller)
step4-wrapper-callers.stdout
fix-canary.diff (cumulative canary diff vs 6de80dffe)
writer-report.md (this file)
```
## Session 2 recommendation
Two paths, each under 100 LOC:
**Path 1 (ours read-only probe + targeted root-cause)**: re-run ours with
`--ctor-probe=0x82450A28` (the canary-tid=10 entry) — confirm it never
fires. Then `--ctor-probe=0x8244FF50` (the spawner). If sub_8244FF50 also
never fires, walk up its 11 callers in sylpheed.db — likely one of them
gates on a flag/event that's not set in ours's early-boot trajectory.
**Path 2 (canary additional capture)**: probe canary's tid=10 spawn
sequence in detail. Add `audit_69_thread_spawn_watch` cvar that logs
every ExCreateThread call with (entry_pc, ctx, suspend_flag, caller_lr).
~40 LOC. Compare to ours's spawn list — find which call goes
unfired in ours.
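The spawn-list comparison in Path 2 reduces to a set difference keyed on the thread entry PC. This is a minimal sketch, assuming both engines can emit one record per ExCreateThread call with an `entry_pc` field; the record shape follows the tuple proposed above but is otherwise hypothetical. Apart from `0x82450A28`, the addresses in the sample are placeholders, not captured data.

```python
def unfired_spawns(canary_spawns, ours_spawns):
    """Return canary spawn records whose entry_pc never appears in ours.

    Results keep canary's first-occurrence order, so the earliest gap in
    the boot sequence comes first.
    """
    ours_entries = {rec["entry_pc"] for rec in ours_spawns}
    reported = set()
    gaps = []
    for rec in canary_spawns:
        pc = rec["entry_pc"]
        if pc not in ours_entries and pc not in reported:
            reported.add(pc)
            gaps.append(rec)
    return gaps

# Placeholder data: only 0x82450A28 comes from this report.
canary_list = [
    {"entry_pc": 0x82000100, "caller_lr": 0x0},
    {"entry_pc": 0x82450A28, "caller_lr": 0x0},
    {"entry_pc": 0x82000200, "caller_lr": 0x0},
]
ours_list = [
    {"entry_pc": 0x82000100, "caller_lr": 0x0},
    {"entry_pc": 0x82000200, "caller_lr": 0x0},
]
spawn_gaps = unfired_spawns(canary_list, ours_list)
```

Keying on `entry_pc` rather than tid avoids depending on thread numbering, which differs between the two engines.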
Both paths are cheaper than continuing on the wedge directly. Path 1 is
preferred: it stays on the ours side which is the failing engine.
Predicted Session 2 cascade:
- A (find sub_82450A28's first-non-fire ancestor in ours): 75-85%
- B (identify the missing precondition for that ancestor): 50-60%
- C (fix LOC in ours ≤ 50): 30-40%
- D (draws>0): 15-25% (single wedge unlock)