handoff: VSync/event-wedge fixes + iterate 2.A–2.BC research notes

Source changes (dormant parity infra, retained from iterate 2.AI/2.AO):
- xenia-kernel/exports.rs: nt_create_event manual_reset polarity +
  related event wiring
- xenia-gpu/mmio_region.rs: D1MODE_VBLANK_VLINE_STATUS hardcode parity

Also lands the audit-runs/ analysis notes (.md/.txt/.json digests) for the
iterate 2.x VSync/0x10e8/0x1004 wedge investigation. Raw trace dumps
(.jsonl/.gz/.csv/.stdout) and agent worktrees (.claude/) are gitignored as
regenerable local artifacts — see memory + HANDOFF for the running findings.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This commit is contained in:
MechaCat02
2026-06-05 07:19:08 +02:00
parent acd1656753
commit ef93a4fa14
620 changed files with 108303 additions and 1 deletions

View File

@@ -0,0 +1,116 @@
# Phase C+7 KeSetEvent — Phase 1 investigation
## Headline (surprise)
The task description (and the C+6½ memory note) framed this as
"canary returns prior signaled state; ours returns STATUS_SUCCESS".
**That framing is incorrect** for canary. Canary's `XEvent::Set`
unconditionally `return 1;` — see
`xenia-canary/src/xenia/kernel/xevent.cc:60-64`:
```cpp
int32_t XEvent::Set(uint32_t priority_increment, bool wait) {
set_priority_increment(priority_increment);
event_->Set();
return 1;
}
```
Canary's `xeKeSetEvent` returns the result of `ev->Set(...)` — i.e.
constant **1**. The Phase A trampoline (`shim_utils.h:620`) emits
`ppc_context->r[3]` post-Store, so the kernel.return event carries
`return_value = 1` for every successful KeSetEvent call. That matches
the canary side of every observed divergence (idx 5 on tid=4, idx 26
on tid=7, …).
Ours's `ke_set_event` (`exports.rs:4136-4152`) ALREADY computes prior
signaled state and returns it via `ctx.gpr[3] = previous as u64`. On
a freshly-initialized event whose `signaled = false`, this writes 0
into r3 — which is the observed ours-side value.
**True semantic gap**: canary's KeSetEvent return == constant 1 on
success; ours's KeSetEvent return == prior signaled state (0 or 1).
Canary is technically wrong by NT semantics, but it is the reference
oracle, so ours must match.
NtSetEvent has a DIFFERENT shape — canary returns NTSTATUS (always
STATUS_SUCCESS on success) AND writes `was_signalled` (return of
`ev->Set()` = constant 1) to the optional out-pointer
(`xboxkrnl_threading.cc:610-628`). Ours's `nt_set_event`
(`exports.rs:4168-4185`) already returns STATUS_SUCCESS and writes
`previous as u32` to the out-pointer. For canary parity, ours must
write **1** (not prior) to the out-pointer.
## Source evidence
### Canary
* `xenia-canary/src/xenia/kernel/xevent.cc:60-64``XEvent::Set`
returns constant `1`.
* `xenia-canary/src/xenia/kernel/xboxkrnl/xboxkrnl_threading.cc:537-551`
`xeKeSetEvent` returns `ev->Set(increment, !!wait)` =
pass-through of the constant.
* `xenia-canary/src/xenia/kernel/xboxkrnl/xboxkrnl_threading.cc:610-633`
`xeNtSetEvent`/`NtSetEvent_entry` returns `X_STATUS_SUCCESS`,
writes `was_signalled` (= `ev->Set` return = 1) to
`previous_state_ptr` if non-null.
* `xenia-canary/src/xenia/kernel/util/shim_utils.h:615-624` — Phase A
trampoline emits `kernel.return` with `return_value = r[3]` after
`result.Store(ppc_context)`.
### Ours
* `xenia-rs/crates/xenia-kernel/src/exports.rs:4136-4152``ke_set_event`
writes `previous as u64` (prior signaled state) to `ctx.gpr[3]`.
* `xenia-rs/crates/xenia-kernel/src/exports.rs:4168-4185``nt_set_event`
writes `previous as u32` (prior signaled state) to the optional out
pointer; returns `STATUS_SUCCESS` in r3.
## Event records (from `audit-runs/phase-c6half-xam-audit/diff-report.md`)
### tid=4 → tid=11 idx=5 (chain 5/47573/9)
```
canary: kernel.return KeSetEvent return_value=1
ours: kernel.return KeSetEvent return_value=0
```
### tid=7 → tid=2 idx=26 (chain 26/29/30 — close to full alignment)
```
canary: kernel.return KeSetEvent return_value=1
ours: kernel.return KeSetEvent return_value=0
```
### tid=12 → tid=7 idx=2 is a DIFFERENT bug
`KeWaitForSingleObject return_value=258 vs 0` — that's a wait-timeout
return (`X_STATUS_TIMEOUT = 0x102`), not the KeSetEvent bug. Out of
scope for this session.
## Fix shape
Two body changes, both in `crates/xenia-kernel/src/exports.rs`:
1. `ke_set_event`: compute `previous` for internal bookkeeping
(audit_signal, wake_eligible_waiters), but write **constant 1**
(or specifically: 1 if the event was found, 0 if handle invalid)
to `ctx.gpr[3]`. Match canary's hardcoded `XEvent::Set` return.
2. `nt_set_event`: continue writing STATUS_SUCCESS to r3. Change the
out-pointer write from `previous as u32` to **`1u32`** when the
handle is valid — match canary's `was_signalled = ev->Set()` =
constant 1.
Tests will pin both behaviors. Existing tests assert that the shadow
signaled flag flips to true — those still pass. New/updated tests
verify return values.
## Note on NT semantics
Real NT `KeSetEvent` does return prior signaled state. Canary's
choice to hardcode 1 is technically a bug — but documented and stable
behavior the game has been built/run against. Once we land Phase C+7
the kernel-export emitter parity is restored; if real NT semantics
ever become required (e.g. real-hw oracle is added), this becomes a
revisit point. For now: parity > correctness.