Files
xenia-rs/audit-runs/phase-c7-keSetEvent/investigation.md
MechaCat02 ef93a4fa14 handoff: VSync/event-wedge fixes + iterate 2.A–2.BC research notes
Source changes (dormant parity infra, retained from iterate 2.AI/2.AO):
- xenia-kernel/exports.rs: nt_create_event manual_reset polarity +
  related event wiring
- xenia-gpu/mmio_region.rs: D1MODE_VBLANK_VLINE_STATUS hardcode parity

Also lands the audit-runs/ analysis notes (.md/.txt/.json digests) for the
iterate 2.x VSync/0x10e8/0x1004 wedge investigation. Raw trace dumps
(.jsonl/.gz/.csv/.stdout) and agent worktrees (.claude/) are gitignored as
regenerable local artifacts — see memory + HANDOFF for the running findings.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-05 07:19:08 +02:00

4.5 KiB

Phase C+7 KeSetEvent — Phase 1 investigation

Headline (surprise)

The task description (and the C+6½ memory note) framed this as "canary returns prior signaled state; ours returns STATUS_SUCCESS". That framing is incorrect for canary. Canary's XEvent::Set unconditionally return 1; — see xenia-canary/src/xenia/kernel/xevent.cc:60-64:

int32_t XEvent::Set(uint32_t priority_increment, bool wait) {
  set_priority_increment(priority_increment);
  event_->Set();
  return 1;
}

Canary's xeKeSetEvent returns the result of ev->Set(...) — i.e. constant 1. The Phase A trampoline (shim_utils.h:620) emits ppc_context->r[3] post-Store, so the kernel.return event carries return_value = 1 for every successful KeSetEvent call. That matches the canary side of every observed divergence (idx 5 on tid=4, idx 26 on tid=7, …).

Ours's ke_set_event (exports.rs:4136-4152) ALREADY computes prior signaled state and returns it via ctx.gpr[3] = previous as u64. On a freshly-initialized event whose signaled = false, this writes 0 into r3 — which is the observed ours-side value.

True semantic gap: canary's KeSetEvent return == constant 1 on success; ours's KeSetEvent return == prior signaled state (0 or 1). Canary is technically wrong by NT semantics, but it is the reference oracle, so ours must match.

NtSetEvent has a DIFFERENT shape — canary returns NTSTATUS (always STATUS_SUCCESS on success) AND writes was_signalled (return of ev->Set() = constant 1) to the optional out-pointer (xboxkrnl_threading.cc:610-628). Ours's nt_set_event (exports.rs:4168-4185) already returns STATUS_SUCCESS and writes previous as u32 to the out-pointer. For canary parity, ours must write 1 (not prior) to the out-pointer.

Source evidence

Canary

  • xenia-canary/src/xenia/kernel/xevent.cc:60-64XEvent::Set returns constant 1.
  • xenia-canary/src/xenia/kernel/xboxkrnl/xboxkrnl_threading.cc:537-551xeKeSetEvent returns ev->Set(increment, !!wait) = pass-through of the constant.
  • xenia-canary/src/xenia/kernel/xboxkrnl/xboxkrnl_threading.cc:610-633xeNtSetEvent/NtSetEvent_entry returns X_STATUS_SUCCESS, writes was_signalled (= ev->Set return = 1) to previous_state_ptr if non-null.
  • xenia-canary/src/xenia/kernel/util/shim_utils.h:615-624 — Phase A trampoline emits kernel.return with return_value = r[3] after result.Store(ppc_context).

Ours

  • xenia-rs/crates/xenia-kernel/src/exports.rs:4136-4152ke_set_event writes previous as u64 (prior signaled state) to ctx.gpr[3].
  • xenia-rs/crates/xenia-kernel/src/exports.rs:4168-4185nt_set_event writes previous as u32 (prior signaled state) to the optional out pointer; returns STATUS_SUCCESS in r3.

Event records (from audit-runs/phase-c6half-xam-audit/diff-report.md)

tid=4 → tid=11 idx=5 (chain 5/47573/9)

canary: kernel.return KeSetEvent return_value=1
ours:   kernel.return KeSetEvent return_value=0

tid=7 → tid=2 idx=26 (chain 26/29/30 — close to full alignment)

canary: kernel.return KeSetEvent return_value=1
ours:   kernel.return KeSetEvent return_value=0

tid=12 → tid=7 idx=2 is a DIFFERENT bug

KeWaitForSingleObject return_value=258 vs 0 — that's a wait-timeout return (X_STATUS_TIMEOUT = 0x102), not the KeSetEvent bug. Out of scope for this session.

Fix shape

Two body changes, both in crates/xenia-kernel/src/exports.rs:

  1. ke_set_event: compute previous for internal bookkeeping (audit_signal, wake_eligible_waiters), but write constant 1 (or specifically: 1 if the event was found, 0 if handle invalid) to ctx.gpr[3]. Match canary's hardcoded XEvent::Set return.

  2. nt_set_event: continue writing STATUS_SUCCESS to r3. Change the out-pointer write from previous as u32 to 1u32 when the handle is valid — match canary's was_signalled = ev->Set() = constant 1.

Tests will pin both behaviors. Existing tests assert that the shadow signaled flag flips to true — those still pass. New/updated tests verify return values.

Note on NT semantics

Real NT KeSetEvent does return prior signaled state. Canary's choice to hardcode 1 is technically a bug — but documented and stable behavior the game has been built/run against. Once we land Phase C+7 the kernel-export emitter parity is restored; if real NT semantics ever become required (e.g. real-hw oracle is added), this becomes a revisit point. For now: parity > correctness.