Files
xenia-rs/audit-runs/iterate-2T-wake-requested/writer-report.md
MechaCat02 ef93a4fa14 handoff: VSync/event-wedge fixes + iterate 2.A–2.BC research notes
Source changes (dormant parity infra, retained from iterate 2.AI/2.AO):
- xenia-kernel/exports.rs: nt_create_event manual_reset polarity +
  related event wiring
- xenia-gpu/mmio_region.rs: D1MODE_VBLANK_VLINE_STATUS hardcode parity

Also lands the audit-runs/ analysis notes (.md/.txt/.json digests) for the
iterate 2.x VSync/0x10e8/0x1004 wedge investigation. Raw trace dumps
(.jsonl/.gz/.csv/.stdout) and agent worktrees (.claude/) are gitignored as
regenerable local artifacts — see memory + HANDOFF for the running findings.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-05 07:19:08 +02:00

16 KiB
Raw Blame History

Iterate 2.T — wake.requested instrumentation in wake_eligible_waiters (writer report)

Date: 2026-05-28. LOC delta: engine ~95 (event_log.rs +55, exports.rs ~40 across helpers + 2 in-loop snapshot/emit blocks), tooling +1 (diff_events.py ENGINE_LOCAL_KINDS extension). All retained, cvar-gated default-off via existing event_log::is_enabled(). Tests: 227 kernel + 19 path + 149 cpu + 300 main + ~30 smaller = full suite PASS, 0 regressions. Cascade: N/A — observability class, no semantics changed.

Headline

C-2B-SCHEDULER-PICK-BUG-IDENTIFIED. The kernel-side wake primitive (wake_eligible_waiters) correctly transitions tid=6 from Blocked → Ready EVERY TIME a signal targets it. 5 wake.requested events fire for target_tid=6 on handle 0x000010b4 over [485, 764]ms, all with transitioned=true, new_state="Ready", target_cpu=5. After the 5th wake at 764.29 ms, tid=6 is Ready on CPU5 but never executes again — last tid=6 event remains 663.60 ms — even though the trace continues until 766.86 ms (tids 4, 5, 13 keep emitting events). The exit-thread-state confirms tid=6 finishes in Ready on hw_id=5, priority=0 — sharing CPU5 with tid=10 at priority=15 (and tid=12 at priority=0). The pick_runnable rule in xenia-cpu/src/scheduler.rs:211-218 is strict-priority within a slot: max_by_key(|(i, t)| (t.priority, -(*i as i64))). tid=10's priority=15 deterministically beats tid=6's priority=0. C-2a (kernel wake-call bug) is FALSIFIED; C-2b (scheduler-pick-skipping Ready tid on CPU) is CONFIRMED, with the specific mechanism being strict-priority starvation by a same-CPU peer (tid=10, priority=15).

Mode

Pure observability emitter. No semantic change to wake_eligible_waiters or any caller. Cvar-gated via existing event_log::is_enabled() and implicitly skipped when zero waiters are eligible (the wake loop's existing early-returns happen before the prior-state snapshot). ENGINE_LOCAL in the diff tool — does not affect matched-prefix.

Invocation (identical to 2.Q):

XENIA_CACHE_WIPE=1 timeout 600 ./target/release/xenia-rs exec \
  -n 50000000 --quiet \
  --phase-a-event-log audit-runs/iterate-2T-wake-requested/ours-cold.jsonl \
  "../Project Sylpheed - Arc of Deception (USA, Europe) (En,Ja).iso"

Exit code 0. Output: ours-cold.jsonl (28.7 MB, 121,641 events — 121,569 baseline + 36 signal.match + 36 wake.requested), exit-thread-state.json (9651 bytes, bit-identical to 2.M/2.N/2.Q/2.S), ours-cold.stderr.log (single 2.M emission-notice line), ours-cold.stdout.log (empty — quiet mode).

Patch summary

file purpose LOC added
crates/xenia-kernel/src/event_log.rs new emit_wake_requested(signaling_tid, cycle, target_tid, target_handle, wait_kind, transitioned, new_state, target_cpu) — emits schema-v1 wake.requested event mirroring the signal.match shape from 2.Q ~55
crates/xenia-kernel/src/exports.rs wake_classify_state(HwState) -> (wait_kind, state_name) + wake_signaling_ctx(state) -> (tid, cycle) + emit_wake_requested_for(state, signaler, cycle, target, handle, prior_kind, prior_name) shim (~40 LOC), plus 2 in-loop snapshot blocks inside wake_eligible_waiters (~14 LOC: manual-fan-out path + auto-winner path) capturing prior state BEFORE set_wake_status_for_waitany + wake_ref and emitting AFTER ~40
crates/xenia-kernel/src/exports.rs let (signaling_tid, signaling_cycle) = wake_signaling_ctx(state); once at entry to wake_eligible_waiters (captures the signal-call caller's identity for the entire fan-out attribution) 1
tools/diff-events/diff_events.py ENGINE_LOCAL_KINDS += {"wake.requested"} so the new kind consumes a per-tid idx slot on the emitter side without alignment cost (analog to 2.Q's signal.match addition) 1

Total engine ~95 LOC, tooling +1. Within the 80-150 target / 200 hard cap.

Schema-v1 event emitted:

{"schema_version":1,"engine":"ours","kind":"wake.requested",
 "tid":<signaling tid>,"tid_event_idx":<idx>,"guest_cycle":<cycle>,
 "host_ns":<ns>,"deterministic":true,
 "payload":{"target_tid":<woken tid>,"target_handle":"0x<8hex>",
            "wait_kind":"WaitAny"|"WaitAll"|"WaitSingle"|"Other",
            "transitioned":true|false,
            "new_state":"Ready"|"StillBlocked"|"AlreadyReady"|"Exited"|"ServicingIrq"|"Idle"|"Other",
            "target_cpu":<hw_id>|null}}

Emit policy: exactly ONE event per waiter the kernel touches (the wake loop's early-returns naturally skip zero-eligible-waiter cases per spec). Snapshot of prior state captured immediately BEFORE set_wake_status_for_waitany + wake_ref; post-state read AFTER. transitioned is the strict prior=Blocked && post=Ready predicate.

Test results

cargo build --release  -> OK (1 pre-existing dead_code warning unrelated)
cargo test --release   -> all suites PASS:
  xenia-kernel    227 passed, 0 failed
  xenia-cpu       149 passed, 0 failed
  xenia-app       300 passed, 0 failed
  xenia-path       19 passed, 0 failed
  + 30+ smaller suites, 0 failures total

Baseline event count UNCHANGED at 121,569 (vs 2.J/2.K/2.Q/2.S); signal.match count UNCHANGED at 36 (vs 2.Q/2.S); 36 NEW wake.requested events emitted. Total 121,641 = 121,569 + 36 + 36.

wake.requested summary (all 36 events)

field distribution
transitioned true: 36 — every wake successfully transitioned its target from Blocked → Ready
new_state "Ready": 36 — no StillBlocked / AlreadyReady / etc.
wait_kind "WaitAny": 36 — all single-WaitAny parks (consistent with NtWaitForSingleObjectEx routing through WaitAny{handles:[h]} per do_wait_single)
signaler tids tid=5:19, tid=1:9, tid=6:3, tid=2:1, tid=11:1, tid=9:1, tid=8:1, tid=13:1
target tids tid=5:11, tid=1:9, tid=4:7, tid=6:5, tid=8:2, tid=9:1, tid=10:1

Every signal that targeted a parked waiter produced a successful Blocked→Ready transition. The kernel wake primitive plumbing is sound for this trajectory.

tid=6 deep-dive — the C-2 case

Five wake.requested events for target_tid=6, paired 1:1 with the 5 signal.match events on handle 0x000010b4 (same timestamps to ms precision):

ns (ms) signaler handle transitioned new_state target_cpu tid=6 next event
485.03 tid=5 0x10b4 true Ready 5 (tid=6 runs through ~520 ms)
485.14 tid=5 0x10b4 true Ready 5 (back-to-back wake; coalesces)
510.64 tid=5 0x10b4 true Ready 5 (tid=6 runs through ~660 ms)
659.00 tid=5 0x10b4 true Ready 5 tid=6 last event @ 663.60 ms
764.29 tid=5 0x10b4 true Ready 5 NEVER — tid=6 emits no further events; trace continues to 766.86 ms

The first 4 wakes are followed by tid=6 execution. The 5th wake (764.29 ms) transitions tid=6 Blocked→Ready on CPU5 — exactly as expected — but tid=6 never gets picked by the scheduler before the 50M-instruction budget cuts off ~2.6 ms later. Other tids on the trace continue emitting events past 764.29 ms:

tid last event ns (ms) vs tid=6 quiescence
1 760.12 predecessor
4 766.84 +103 ms after tid=6 last event; +2.55 ms after tid=6's final wake
5 766.86 +103 ms after; the signaler still alive
13 763.04 +99 ms after

So the trace is NOT truncated; tid=6 specifically is starved post-wake.

CPU5 contention — the C-2b mechanism

Exit-thread-state CPU5 (hw_id=5) Ready set:

tid state hw_id affinity priority pc
6 Ready 5 0x20 0 0x824ab214
10 Ready 5 0x20 15 0x824d1404
12 Ready 5 0x20 0 0x824aa6a4
3 Blocked 5 0x20 0

Per xenia-cpu/src/scheduler.rs:211-218:

pub fn pick_runnable(&self) -> Option<usize> {
    self.runqueue.iter().enumerate()
        .filter(|(_, t)| matches!(t.state, HwState::Ready | HwState::ServicingIrq(_)))
        .max_by_key(|(i, t)| (t.priority, -(*i as i64)))
        .map(|(i, _)| i)
}

Strict-priority within a slot. tid=10 (priority=15) deterministically beats tid=6 (priority=0) whenever tid=10 is also Ready. tid=10's last trace event is 727.92 ms, but tid=10 is still Ready in the exit state — meaning tid=10 is executing past 727.92 ms in a path that doesn't cross kernel boundaries (no further kernel.call, import.call, etc.), almost certainly a CPU-bound loop. That loop starves tid=6 on CPU5 indefinitely.

This is the C-2b mechanism with concrete identification:

  • NOT a wake-call bug (wake.requested fires 5× with transitioned=true, new_state=Ready on tid=6) — falsifies C-2a.
  • IS a scheduler-pick-skip but specifically a strict-priority starvation: when a same-slot peer is at priority ≥1, lower-priority Ready threads on that slot never run.

Disambiguation result vs goal-spec

outcome gate predicate result conclusion
C-2A-KERNEL-WAKE-BUG wake.requested for tid=6 absent OR transitioned=false NO (5 events, all transitioned) FALSIFIED
C-2B-SCHEDULER-PICK-BUG wake.requested fires for tid=6 with transitioned=true, new_state=Ready AND tid=6 still doesn't execute YES (5 events, last at 764.29 ms; tid=6 last event 663.60 ms; never runs after the 5th wake) CONFIRMED
NEITHER-CLEAN-NEW-HYPOTHESIS neither sub-hypothesis matches NO NOT TRIGGERED
RUN-FAILED non-zero exit / crash / hang / no wake.requested events NO (EXIT=0, 36 events) NOT TRIGGERED

Result: C-2b CONFIRMED with concrete mechanism — strict-priority starvation on CPU5 by tid=10 (priority=15) blocking tid=6 (priority=0).

Tripstone audit

  • #28 (cross-engine tid stability): No cross-engine tid claims. All reported tids are ours-side scheduler tids; stable within this trajectory (verified per-tid counts intact vs 2.Q/2.S baseline).
  • #39 (composite progression IS progression): NO progression claim. VdSwap=1, draws=0, render_targets=0 — bit-identical to 2.J/2.K/2.Q/2.N/2.S. The C-2b confirmation is a root-cause identification, not a progression event.
  • #40 (single-keystone framing): Care taken. The headline names strict-priority starvation as the SPECIFIC mechanism behind C-2b (not just "the scheduler is buggy"). The data isolates the priority asymmetry (15 vs 0) and same-slot pinning (0x20=CPU5 only). Open follow-ups are not collapsed: why is tid=10 priority=15 (canary parity? game intent?), and why does the strict-priority scheduler not implement starvation-avoidance (intentional simplification or oversight)?
  • #41 (categorized diff tags): wake.requested is ENGINE_LOCAL in the diff harness — doesn't affect matched-prefix.
  • #42 (Phase-A blind to blocked-forever waits): Used the always-on exit-thread-state.json to confirm tid=6's final Ready state post-trace-end (Phase-A alone would have shown only the 5 wakes without the final unscheduled outcome).
  • #43 (no budget-cap framing): 2.S already established the 50M budget reproduces the phenomenon; this iterate uses 50M and confirms the C-2b mechanism is structural and reproducible in the smaller budget. The 2.6 ms gap between the 5th wake (764.29) and tid=5/4's final events (~766.86) is sufficient to show the scheduler had ample opportunity to pick tid=6 and didn't.

Comparison: 2.K → 2.Q → 2.S → 2.T

gate 2.K (500M, baseline) 2.Q (50M + signal.match) 2.S (500M + signal.match) 2.T (50M + signal.match + wake.requested)
total events 121,569 121,605 121,605 121,641
baseline events 121,569 121,569 121,569 121,569
signal.match events n/a 36 36 36
wake.requested events n/a n/a n/a 36
wake.requested for tid=6 (transitioned=true) n/a n/a n/a 5
exit-state size 9651 9651 9651 9651
tid=6 final state Ready Ready Ready Ready
Termination budget hit (50M) budget hit (500M) budget hit (50M)

Baseline + signal.match counts bit-identical to 2.Q/2.S. New 36 wake.requested events 1:1 with the 36 signal.match events (every successful signal that found a waiter resulted in a wake call that transitioned the waiter — sanity check of the kernel wake plumbing across all signaler tids and target handles).

Confidence

  • HIGH that the patch is correct and observability-only: 0 test regressions; wake_eligible_waiters body unchanged except for the snapshot reads and post-call emits.
  • HIGH that wake.requested events fire as designed: 36 events emitted, all with transitioned=true && new_state=Ready (which matches the expected branch in the wake primitive — every waker pulled off a non-empty waiter queue should transition Blocked→Ready).
  • HIGH that tid=6 is woken 5× by tid=5 on handle 0x10b4 with successful transition — grepped exhaustively by exact handle string
    • target_tid filter.
  • HIGH that tid=6 emits no events after 663.60 ms despite the 5th wake at 764.29 ms (per-tid scan).
  • HIGH that exit-state thread geometry matches 2.M/2.N/2.Q/2.S bit-identically (9651 bytes; tid=6 final Ready/hw_id=5/affinity=0x20).
  • HIGH that C-2a (kernel wake-call bug) is FALSIFIED: the wake-call IS issued AND IS transitioning tid=6 Blocked→Ready correctly.
  • HIGH that C-2b (scheduler-pick-skip) is CONFIRMED with strict- priority starvation as the specific mechanism, given CPU5's exit- state Ready set has tid=10 at priority=15 (vs tid=6 at priority=0) pinned to affinity 0x20.
  • MEDIUM-HIGH that tid=10 is in a CPU-bound non-kernel-crossing loop after 727.92 ms (inferred from "still Ready at exit" + "no trace events after 727.92 ms"). Could also be a tight kernel.call loop where the trace happens to not have a corresponding emit — but on CPU5 with priority advantage, tid=10 keeps the slot.

Next-iterate recommendation

Three complementary directions, priority order:

(1) 2.U — investigate why tid=10 is priority=15 / what tid=10 is doing post-727.92ms (~0 LOC observability OR ~50 LOC PC-trace + KeSetBasePriorityThread hook). Pull from existing audit DB (sylpheed.db) the basic block containing PC 0x824d1404 (tid=10's final PC) — find the loop body. Compare with canary's analog tid to see whether canary's same-PC thread also runs at priority=15. If canary runs the same code at a different priority OR has a yield/sleep in the loop, that's the divergence point. If canary is identical, then the problem is ours's scheduler lacks fair preemption that canary's host-OS-thread scheduler provides naturally (canary multiplexes guest threads onto OS threads, getting OS-level fairness for free; ours's cooperative-pick-runnable model needs explicit anti-starvation).

(2) 2.V — add starvation-avoidance to pick_runnable (~30-80 LOC engine, semantic change). E.g., dynamic priority aging or a quantum-expiration override that occasionally picks the highest-priority not-recently-run thread. Risk: changes scheduling semantics — matched-prefix and Phase-A trace counts will shift; needs careful parity check vs canary. NOT recommended as next without first establishing canary baseline (2.U).

(3) 2.W — canary-side wake.requested mirror (~30-60 LOC C++) for cross-engine parity diff. Lower urgency than 2.U because the ours-side data alone is sufficient to identify the scheduler-pick mechanism; canary parity becomes relevant only if 2.U finds tid=10 is intentionally high-priority and canary handles it via OS-level fairness.

Recommended: 2.U first (~30 min DB lookup + cross-engine check), then 2.V or 2.W depending on what 2.U reveals.

Artifacts

Under xenia-rs/audit-runs/iterate-2T-wake-requested/:

  • ours-cold.jsonl (28.7 MB, 121,641 events = 121,569 baseline + 36 signal.match + 36 wake.requested)
  • ours-cold.stdout.log (empty — quiet mode)
  • ours-cold.stderr.log (single 2.M emission-notice line)
  • exit-thread-state.json (9651 bytes; bit-identical to 2.M/2.N/2.Q/2.S — 13 threads + 10 wedge entries)
  • writer-report.md (this file)

Engine HEAD e6d43a23ac393004d2e5adf2f0395fd0b5e6448b + uncommitted 2.Q signal.match patch + this iterate's 2.T wake.requested patch. xenia-canary UNCHANGED.