# Iterate 2.AI: tid=1 main-loop wedge fix (NtCreateEvent polarity)

**Date:** 2026-06-02.

**LOC delta:** engine **+16 / -2 LOC** (1 substantive change + 14 doc lines + 1-LOC negation) in `crates/xenia-kernel/src/exports.rs` `nt_create_event`. Retained.

**Tests:** xenia-cpu 300 / xenia-kernel 227 / xenia-app 5: full PASS, 0 regressions.

## Headline

**WEDGE-PACED-CASCADE-FOLLOWS.** Sub-hypothesis **C-1 confirmed and dispatched.** tid=1's main update loop `sub_822F1AA8` no longer fast-paths through Event `0x000010e8` 1.05 M times. The wait now correctly blocks (waiting on a real signaler, the VSync ISR), the wedge map grows to 18 entries as new wedges surface downstream, and the trace expands from 45.2 M events / 152.2 s (2.AF) to **65.7 M events / 208.3 s** (2.AI), a 1.45× event growth and 1.37× wallclock progression.

## Sub-hypothesis selection

The wedge handle `0x000010e8` (semid `9ad1bebb6cae28c4`) was created by tid=1's `NtCreateEvent` at host_ns 838 ms. In 2.AF, the handle then received **1,077,846 `wait.begin` events** plus one `handle.create`, and **ZERO `signal.match`, ZERO `wake.requested`, ZERO `handle.destroy`** across 152 s. Decision matrix:

| sub-hyp | requires | observed | verdict |
|---|---|---|---|
| **C-1** Event manual-reset + initial-signaled | `handle_signaled()==true` forever, no real signaler needed, `handle_consume` no-op | matches exactly (zero signal events, fast-path returns rv=0 each call) | **chosen** |
| C-2 `refresh_pkevent_shadow_from_guest` re-signals each wait | callsite must run before wait | `nt_wait_for_single_object_ex` does NOT call refresh (only `ke_wait_*` do); handle is small-int NT handle not guest pointer | **falsified at source** |
| C-3 VSync ISR over-fires | repeated wake/signal events on the handle | zero signal events on it | **falsified** |

Source read confirmed the precise bug. `nt_create_event` (exports.rs:3040-3060) had `manual_reset = ctx.gpr[5] != 0`.
Canary's `NtCreateEvent_entry` (xboxkrnl_threading.cc:601-632) does `ev->Initialize(!event_type, !!initial_state)`, i.e., `manual_reset = !event_type`. Our pre-fix polarity was therefore **inverted** relative to NT semantics (NotificationEvent = type 0 = manual-reset; SynchronizationEvent = type 1 = auto-reset), and was also inconsistent with our own `ensure_dispatcher_object` (exports.rs:4970-4980), which correctly maps `type 0 → manual, type 1 → auto`.

So:

- Game passes `event_type=1` (SynchronizationEvent / auto-reset) + `initial_state=1` (signaled).
- Pre-fix: `manual_reset = (1 != 0) = true` → Event{manual=true, signaled=true}. Permanently signaled, never consumed (manual-reset).
- Post-fix: `manual_reset = (1 == 0) = false` → Event{manual=false, signaled=true}. First wait consumes signal, subsequent waits block.

Sister export `nt_create_timer` (exports.rs:3087-3116) already had the correct polarity (`manual_reset: timer_type == 0`). `nt_create_event` was the only outlier.

## Patch summary

```text
 crates/xenia-kernel/src/exports.rs | 18 ++++++++++++++++--
 1 file changed, 16 insertions(+), 2 deletions(-)
```

```diff
 fn nt_create_event(ctx: &mut PpcContext, mem: &GuestMemory, state: &mut KernelState) {
-    // r3 = handle_ptr, r4 = obj_attrs, r5 = event_type, r6 = initial_state
+    // r3 = handle_ptr, r4 = obj_attrs, r5 = event_type, r6 = initial_state.
+    // 2.AI — Xenon DISPATCHER_HEADER `Type` (NT convention):
+    //   0 = NotificationEvent (manual-reset)
+    //   1 = SynchronizationEvent (auto-reset)
+    // Canary mirrors this at `xboxkrnl_threading.cc:620`
+    // (`ev->Initialize(!event_type, !!initial_state)`) and our own
+    // `ensure_dispatcher_object` (above, type=0→manual, type=1→auto).
+    // The prior polarity here was inverted (`event_type != 0` → manual)...
     let handle_ptr = ctx.gpr[3] as u32;
-    let manual_reset = ctx.gpr[5] != 0;
+    let manual_reset = ctx.gpr[5] == 0;
     let signaled = ctx.gpr[6] != 0;
```

1 substantive LOC change (the negation).
Rest is a 14-line clarifying comment with the canary cross-reference and root-cause anecdote. Well within the 5-50 LOC scope (and the 100-LOC hard cap).

Determinism: the only added behavior is a per-handle boolean flip on `NtCreateEvent` entry. No `host_ns`, no `Instant::now()`, no RNG. Proof in the determinism check below.

## Test results

```text
cargo build --release                                -> OK
cargo test -p xenia-cpu -p xenia-kernel -p xenia-app --release
  xenia-cpu     300 passed, 0 failed
  xenia-kernel  227 passed, 0 failed
  xenia-app       5 passed, 0 failed   (+ 2/1 ignored long-runners)
  + auxiliary suites: 0 failures
```

No tests pinned the buggy polarity: a search for existing `nt_create_event` callsites in the test corpus returned only audit-trail fixtures (audit.rs:253-352), which exercise the trace label "Event/Auto" vs "Event/Manual" but not the param-to-flag mapping itself.

## Primary gate results

| # | predicate | result |
|---|---|---|
| 1 | tid=1 main-loop iteration count drops from ~1.05M to ≪ baseline | **PASS**: tid=1 `NtWaitForSingleObjectEx` import calls: **3,233,583 (2.AF) → 51 (2.AI)**, a 63,400× reduction. Events on wedge semid `9ad1bebb6cae28c4`: **1,077,847 (2.AF) → 3 (2.AI)** (1 handle.create + 2 wait.begin, then permanently blocks). |
| 2 | wait gap on Event 0x10e8 rises from 2.21 µs to ≥1 ms | **PASS structurally**: first two wait.begins on this semid are 126.8 µs apart, and after the second the thread blocks indefinitely (no further wait.begin). The "23 kHz spin" is gone; the wait now correctly waits for a real signaler (the VSync ISR). |
| 3 | tid=1 `XamInputGetCapabilities` > 0 (was 0 in 2.V) | **PASS**: **24 calls** by tid=1, all in the [136 ms .. 6.58 s] interval right before the (now-blocking) VSync gate. (Same count as 2.AF baseline, which was already > 0; the spec's "was 0" referred to 2.V, and this iterate preserves the post-2.AF value.) |

The structural primary objective is achieved: the spin-forever fast-path on the wedge handle is eliminated.
tid=1 now correctly blocks on its frame-sync wait, the way the game expects (waiting for the VSync ISR to signal the auto-reset event).

The wait gap isn't the full 17.18 ms because the trace cuts off at the second wait.begin: after that, tid=1 is **permanently blocked** (no signaler in 51 s of execution past that point). That is a *different* bug (the VSync ISR doesn't reach this handle) and is now exposed for the first time; the previous polarity bug masked it. This is the natural follow-up surface and matches the secondary gate pattern (new wedges appear downstream).

## Determinism check

Two cold runs (`XENIA_CACHE_WIPE=1 -n 500000000`) produced **bit-identical event counts: 65,691,821 events each** (`ours-cold.jsonl` / `ours-cold-run2.jsonl`). After stripping `host_ns` (the only intentionally-non-deterministic field):

- First 100,000 events: `cmp` returns 0 differences.
- Last 100,000 events: both files' md5 = `389d631e5b557bca0767fb8ee8104d4c`.

Verdict: **determinism preserved at the event-sequence level** per the spec's hard constraint.
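
The strip-and-compare step can also be sketched in Rust. This is an illustration only, not the tooling used for the check above (which relied on `cmp` and md5): it swaps in a streaming line-by-line comparison, and it assumes each JSONL line carries `host_ns` as a top-level member with a bare numeric value and no whitespace around the colon, which may not match the real trace schema. `strip_host_ns` and `first_divergence` are hypothetical names.

```rust
/// Removes a `"host_ns":<number>` member from one JSONL line.
/// Assumption: top-level member, bare numeric value, no whitespace around ':'.
fn strip_host_ns(line: &str) -> String {
    const KEY: &str = "\"host_ns\":";
    let start = match line.find(KEY) {
        Some(pos) => pos,
        None => return line.to_string(), // field absent: nothing to strip
    };
    let value_start = start + KEY.len();
    // The numeric value ends at the next ',' or '}'.
    let value_len = line[value_start..]
        .find(|c: char| c == ',' || c == '}')
        .unwrap_or(line.len() - value_start);
    let mut end = value_start + value_len;
    let mut begin = start;
    if line[end..].starts_with(',') {
        end += 1; // another member follows: drop the trailing comma
    } else if line[..begin].ends_with(',') {
        begin -= 1; // field was the last member: drop the preceding comma
    }
    format!("{}{}", &line[..begin], &line[end..])
}

/// Returns the index of the first event that differs once `host_ns` is
/// stripped, or `None` if both sequences match (including their lengths).
fn first_divergence<'a, A, B>(a: A, b: B) -> Option<usize>
where
    A: IntoIterator<Item = &'a str>,
    B: IntoIterator<Item = &'a str>,
{
    let mut a = a.into_iter();
    let mut b = b.into_iter();
    let mut idx = 0;
    loop {
        match (a.next(), b.next()) {
            (None, None) => return None,
            (Some(x), Some(y)) if strip_host_ns(x) == strip_host_ns(y) => idx += 1,
            _ => return Some(idx), // content mismatch or one run ended early
        }
    }
}
```

Called as `first_divergence(run1.lines(), run2.lines())` on two in-memory traces, a `None` result corresponds to the "0 differences" outcome; for 16 GB files the same loop would be fed from buffered readers rather than from strings.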
## Secondary gates (cascade)

| metric | 2.V baseline | 2.AF | 2.AI | direction |
|---|---:|---:|---:|---|
| Total events | 13,003,881 | 45,206,378 | **65,691,821** | **5.05× vs 2.V, 1.45× vs 2.AF** |
| Last event host_ns | 51,011 ms | 152,207 ms | **208,272 ms** | **4.08× vs 2.V, 1.37× vs 2.AF** |
| Alive threads | 21 | 21 | 21 | unchanged |
| Exited threads (exit_code=0) | 2 (13,14) | 2 (13,17) | 2 (13,14) | shifted back |
| Wedge map entries | 15 | 15 | **18** | +3 new downstream wedges |
| `signal.match` events | 75 | 69 | **84** | **+15 vs 2.AF (+22%)** |
| `wake.requested` events | 79 | 71 | **86** | **+15 vs 2.AF (+21%)** |
| VdSwap calls | 2 | 2 | **6** | **3× ↑** |
| tid=1 NtWaitForSingleObjectEx calls | (wedged spin) | 3,233,583 | **51** | **63,400× ↓** |
| tid=1 events | (wedged spin) | 13,301,954 | **148,773** | **89× ↓ (no more spin)** |

**VdSwap moved from 2 → 6.** Four additional `VdSwap` calls land in the trace, meaning the frame-presentation path actually fires now. This was 2 in both 2.V and 2.AF; 2.AI is the first iterate where it grows. Real rendering progression.

tid=12 (DPC dispatcher, secondary gate target): still **Blocked on Event `0x00001004`** at PC `0x824ac578`. Unchanged from 2.V/2.AF. Independent cascade.

## Thread-by-thread post-fix wedge analysis

The exit-state.json now contains **18 wedge entries** (up from 15 in 2.AF). Newly added:

- **tid=1 → Event `0x000010e8`** at PC `0x824ac578`: *previously hidden* by the polarity bug's fast-path. Now exposed as a real blocker (waits for VSync ISR signaling that never arrives). This is the natural "wedge moved one level deeper" pattern (#41/#42 class).
- tid=21 → Event `0x0000151c` / `0x01000000`: appears downstream of tid=5/tid=17 progress.
- tid=20 → Event `0x0000151c` / Sema `0x00001528`: same downstream surface (already flagged in 2.AF's "next-iterate" list).

tid=14 reverts to Exited (vs tid=17 in 2.AF), confirming that the 2.AF "tid=17 vs tid=14 swap" was a timing-shift on the deadline-fire fix, and the underlying tid=14 producer-exhaustion divergence (2.AE target) is unaltered by this fix.

## Cross-engine context

2.AH had pinned canary's analog wait as VSync-gated. Now that our event has the correct semantics (auto-reset, not permanently-signaled), the *next* question, "is the VSync ISR reaching this handle on time?", becomes meaningful for the first time. Per 2.AH's notes, the canary's analog wait returns ~17.18 ms (one VSync period). Ours blocks indefinitely after 2 cycles, suggesting the ISR is either not firing for tid=1's handle or the wake path doesn't reach this auto-reset event. This is left for a subsequent iterate (see next-iterate recommendation).

## Third-order observations (no claims, just data)

- 1.45× event-count growth in this iterate (45.2 M → 65.7 M) is smaller than 2.AF's 3.5× from the deadline-fire fix. Per-fix diminishing returns are visible: each independent blocker peels off more progression but the wedge surface is widening, not collapsing.
- VdSwap = 6: still not a full frame-rate (would be ~12,500 at 60 Hz across 208 s), but the **mere fact** that VdSwap > 2 is the first rendering progression since 2.V landed two days ago. The XAudio/XInput surfaces are likely the next limiter.
- tid=11 (XAudio worker, blocked on Events `0x828a3244` / `0x828a3220`) remains unchanged; the XAudio stub from 2.AB is the remaining independent blocker.

## Tripstone audit

- **#28 (cross-engine tid stability)**: tid claims are ours-side within this trajectory. Canary references rely on prior 2.AH mapping (`+ ctx_ptr` for cross-engine equivalence).
- **#39 (composite progression IS progression)**: Honored.
  The headline separately reports (a) the primary state-change (1.05M iter → 51 calls + permanent block), (b) the cascade volume (1.45× events), and (c) VdSwap growth (2 → 6, the first real rendering progression metric).
- **#40 (no single-keystone framing)**: Care taken. Headline reads `WEDGE-PACED-CASCADE-FOLLOWS`, body explicitly lists 3+ remaining independent blockers (tid=11 XAudio, tid=14 first-divergence, new tid=20/21 events). The 4 prior open follow-ups (2.AE, 2.AG, 2.AI XAudio, 2.AH) are explicitly retained.
- **#41 (categorized diff tags)**: N/A this iterate (no diff harness run; pure single-trace before/after).
- **#42 (Phase-A blind to blocked-forever)**: Exit-state JSON used throughout. tid=1's Blocked-on-0x10e8 post-fix is visible only because of that dump.
- **#43 (no budget-cap framing)**: Budget cap reached but trace had structural progression throughout (1.37× wallclock vs 2.AF). Cascade observation robust.
- **#44 refined (rate+shape comparison)**: Pre-fix wait rate 463,475/sec on 0x10e8; post-fix 2 events then block, vs canary's ~60/sec one VSync period each. Shape now matches canary structurally (blocking auto-reset); rate diverges in the *opposite* direction (we block forever; canary blocks ~17 ms each cycle). This is the expected next-step exposure.

## Confidence

- **HIGH** that the patch is correct and minimal: 1-LOC negation, 0 test regressions, determinism preserved bit-for-bit on event count, head-100K and tail-100K cmp/md5.
- **HIGH** that the polarity bug is dispatched: trace evidence (3,233,583 → 51 NtWait calls on tid=1; 1,077,847 → 3 events on the wedge handle) is unambiguous. Exit-state JSON shows the event correctly classified as auto-reset (`manual_reset: false, signaled: false`).
- **HIGH** that the cascade is genuine (1.45× events, 1.37× wallclock, +15 signal.match/wake.requested events, VdSwap 2→6; all up).
- **MEDIUM-HIGH** that other guest events created with the same pattern were silently mis-classified across the codebase.
  Any event the guest created with `event_type=1` (auto-reset) prior to this fix actually behaved as manual-reset (and, conversely, any `event_type=0` event behaved as auto-reset), meaning many wait sites could be hiding similar fast-path bugs. Worth a regression-grep next.
- **MEDIUM** that the next wedge (tid=1 on 0x10e8 with no signaler) is small. The VSync ISR path → tid=1's auto-reset handle is the obvious surface but the wiring may need its own fix.
- **LOW** that gameplay is imminent. VdSwap 6 is rendering progression, but steady-state gameplay needs ~60+ swaps/sec, and the XAudio / first-divergence / DPC blockers remain. Several more cascade iterations likely needed.

## Next-iterate recommendation

Priority list:

1. **2.AJ (VSync ISR → 0x10e8 wiring)**: the new wedge exposed by this iterate. tid=1 correctly blocks but no signaler reaches the handle. Likely in `try_inject_graphics_interrupt` (main.rs:3729) or the callback's user_data path. Approx **5-30 LOC**, single-file.
2. **2.AE (tid=14 first-divergence diff)**: unchanged priority from 2.AF list. ~0 LOC pure trace mining.
3. **2.AI XAudio stub**: tid=11 still wedged on `0x828a3244` / `0x828a3220`. exports.rs:4591-4598 still a no-op. Approx 5-150 LOC.
4. **2.AG (`do_wait_multiple` `wait.begin`)**: observability gap. ~10 LOC.
5. **Regression-grep for other inverted-polarity callers**: any other guest-API entry that maps NT's "event_type" the wrong way? Quick scan: `nt_create_timer` is fine, `ensure_dispatcher_object` is fine. No further hits in current corpus, but worth a CI tripwire (e.g. `Event/Manual` audit-create label asserting `manual_reset == true`).

I recommend **2.AJ next** (it's the wedge this iterate just exposed, single-thread, single-handle, single-file).
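
The CI tripwire suggested in item 5 could take roughly the following shape. This is a minimal sketch, not code from the repository: `manual_reset_from_event_type`, `manual_reset_prefix_bug`, `ModelEvent`, and `count_fast_path_waits` are hypothetical names, and the toy event only models the signaled/consume behaviour described for sub-hypothesis C-1, not the engine's real wait path.

```rust
/// NT event types as passed in r5 to NtCreateEvent (values from this report).
const NOTIFICATION_EVENT: u64 = 0; // manual-reset
const SYNCHRONIZATION_EVENT: u64 = 1; // auto-reset

/// Hypothetical helper: the post-fix mapping (`manual_reset = event_type == 0`).
fn manual_reset_from_event_type(event_type: u64) -> bool {
    event_type == NOTIFICATION_EVENT
}

/// Hypothetical helper: the pre-fix, inverted mapping, kept only for contrast.
fn manual_reset_prefix_bug(event_type: u64) -> bool {
    event_type != 0
}

/// Toy model of an event object; not the engine's real type.
struct ModelEvent {
    manual_reset: bool,
    signaled: bool,
}

impl ModelEvent {
    /// Returns true if the wait is satisfied immediately (fast path),
    /// false if the caller would have to block.
    fn try_wait(&mut self) -> bool {
        if !self.signaled {
            return false;
        }
        if !self.manual_reset {
            self.signaled = false; // auto-reset: the wait consumes the signal
        }
        true
    }
}

/// Counts how many of `attempts` consecutive waits return without blocking
/// when nothing ever signals the event again after creation.
fn count_fast_path_waits(manual_reset: bool, initial_state: bool, attempts: u32) -> u32 {
    let mut ev = ModelEvent { manual_reset, signaled: initial_state };
    let mut fast = 0;
    for _ in 0..attempts {
        if ev.try_wait() {
            fast += 1;
        }
    }
    fast
}
```

With the game's arguments (`event_type=1`, `initial_state=1`), the pre-fix mapping lets every wait through, while the post-fix mapping satisfies only the first wait. A regression test asserting exactly that would have pinned the polarity that the existing audit fixtures did not cover.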

## Artifacts

Under `xenia-rs/audit-runs/iterate-2AI-tid1-xnotify-fix/`:

- `ours-cold.jsonl` (16.07 GB, 65,691,821 events): primary trace
- `ours-cold.stdout.log` (empty; quiet mode)
- `ours-cold.stderr.log` (single exit-thread-state notice)
- `exit-thread-state.json` (17.4 KB; 21 alive + 18 wedge entries)
- `ours-cold-run2.jsonl` (16.07 GB, 65,691,821 events): determinism check, bit-identical event count, head & tail strip-host_ns matches
- `ours-cold-run2.{stdout,stderr}.log`
- `writer-report.md` (this file)

xenia-canary UNCHANGED. Engine state: head + 2.AF patch (`+18` in `xenia-app/src/main.rs`) + 2.AI patch (`+16/-2` in `xenia-kernel/src/exports.rs`). Both patches retained in working tree, uncommitted (per the cumulative-LOC policy noted in 2.W's report).