Priority aging in xenia-cpu/scheduler.rs:pick_runnable
(effective_priority = base + age_bonus(now_round - last_run_round),
capped at +31, AGING_ROUNDS_PER_BONUS=1). Strict-priority scheduling was
parking priority=0 threads behind the CPU-bound priority=15 audio mixer
(sub_824D1328 guest spinwait at PC=0x824d1404 on CPU5). Aging
eventually picks the starved thread, breaking the producer-consumer
cycle that has caused the 5-tid wedge at PC=0x824ac578 since AUDIT-049 (10 May).
Cascade observed: tid=13 clean exit; events 121K -> 13M (107x); last
host_ns 767ms -> 51,011ms (66x); 8 new threads spawn; VdSwap 1 -> 2.
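
The aging rule above can be illustrated with a minimal sketch. It is written in Python for readability and is not the Rust code in xenia-cpu/scheduler.rs. The Thread class, the tie-break on lowest tid, and the reading of "capped at +31" as a cap on the bonus (rather than on the effective priority) are all assumptions made for the example.

```python
from dataclasses import dataclass

AGING_ROUNDS_PER_BONUS = 1  # value quoted in the commit text
MAX_AGE_BONUS = 31          # "capped at +31"; assumed here to cap the bonus


@dataclass
class Thread:
    # Hypothetical stand-in for the scheduler's per-thread record.
    tid: int
    base_priority: int
    last_run_round: int = 0


def age_bonus(rounds_waiting: int) -> int:
    """One bonus point per AGING_ROUNDS_PER_BONUS rounds waited, capped."""
    return min(max(rounds_waiting, 0) // AGING_ROUNDS_PER_BONUS, MAX_AGE_BONUS)


def effective_priority(thread: Thread, now_round: int) -> int:
    return thread.base_priority + age_bonus(now_round - thread.last_run_round)


def pick_runnable(threads, now_round: int):
    """Pick the runnable thread with the highest effective priority.

    Ties go to the lowest tid (an assumption). The chosen thread's
    last_run_round is reset, so its bonus starts again from zero.
    """
    if not threads:
        return None
    chosen = max(threads, key=lambda t: (effective_priority(t, now_round), -t.tid))
    chosen.last_run_round = now_round
    return chosen
```

Under these assumptions, a priority=15 thread that runs every round holds an effective priority of 16, so a priority=0 thread that has never run is first picked in round 17. With strict priority and no bonus it would never be picked.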
Complete two-day iteration sequence (2026-05-27 -> 2026-05-28):
- 2.F: VdSwap drain timeout 900ms -> 1ms (xenia-gpu/handle.rs); 876x
perf win on VdSwap kernel callback
- 2.H: vA0000000 physical heap bucket added (state.rs, exports.rs);
ctx_ptrs now in 0xA0000000-0xBFFFFFFF range matching canary
- 2.L: Phase-A diff harness categorized [return_value mismatch],
[status mismatch], [args_resolved.path mismatch] tags
(tools/diff-events/diff_events.py); closes reading-error #41
(silent test-harness state leak invalidating trace diffs)
- 2.M: always-on exit-thread-state.json sibling to Phase-A JSONL
(event_log.rs + xenia-app/main.rs); closes reading-error #42
(Phase-A blind to blocked-forever waits)
- 2.Q: signal.match kernel instrumentation in NtSetEvent /
NtReleaseSemaphore / KeSetEvent / KeReleaseSemaphore
(exports.rs); emits target_handle + waiter_count + waiter_tids
- 2.T: wake.requested kernel instrumentation in wake_eligible_waiters
(exports.rs); emits target_tid + transition + new_state
- 2.V: scheduler priority aging (xenia-cpu/scheduler.rs) [keystone]
Plus accumulated WIP from earlier May (contention_manifest,
phase_b_snapshot, xam/xaudio enhancements, analysis db, xex loader,
xenia-app main loop, etc.). Audit-runs/ artifacts remain untracked
per project convention.
Tests: 300 xenia-cpu / 227 xenia-kernel / 5 xenia-app / 19 xenia-path
/ 30+ smaller suites -- all PASS, 0 regressions. Determinism preserved
(2x cold runs bit-identical at 13,003,881 events post-2.V).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
diff_events.py — Phase A event-log diff tool
A stdlib-only Python tool that diffs two schema-v1 JSONL event logs (one per engine) and reports the first behavioral divergence per guest thread. Built for the Phase A diff harness — see audit-runs/phase-a-diff-harness/README.md and schema-v1.md.
What it does
- Reads two JSONL files. Validates each begins with a schema_version=1 header event.
- Builds per-thread streams keyed by tid_event_idx (the schema's per-tid monotonic counter).
- Maps canary-tid ↔ ours-tid (auto-pairs by first kernel.call name in each stream, or manual via --tid-map).
- Walks each mapped pair in parallel, comparing events with rules from the schema (raw_handle_id skipped, host_ns skipped, wait_duration_cycles skipped, etc.).
- On first divergence: prints 5-event pre-context + the divergent event + the next event from each. Stops that thread's walk.
- Writes a markdown report.
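
The stream-building and auto-pairing steps above can be sketched roughly as follows. This is an illustration, not an excerpt from diff_events.py. The field names tid and name, and the choice to skip events that carry no tid (such as the header), are assumptions; only tid_event_idx, kind, and kernel.call come from the text above.

```python
import json
from collections import defaultdict


def build_streams(lines):
    """Group events by tid and order each stream by tid_event_idx."""
    streams = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        if "tid" not in event:  # assumed: the header event has no tid
            continue
        streams[event["tid"]].append(event)
    for events in streams.values():
        events.sort(key=lambda e: e["tid_event_idx"])
    return dict(streams)


def first_call_name(events):
    """Name of the first kernel.call event in a stream, or None."""
    for event in events:
        if event.get("kind") == "kernel.call":
            return event.get("name")  # "name" is an assumed field
    return None


def auto_tid_map(canary, ours):
    """Pair canary tids with ours tids that share the same first call name."""
    ours_by_name = {}
    for tid in sorted(ours):
        name = first_call_name(ours[tid])
        if name is not None:
            ours_by_name.setdefault(name, tid)  # first tid with that name wins
    mapping, used = {}, set()
    for tid in sorted(canary):
        candidate = ours_by_name.get(first_call_name(canary[tid]))
        if candidate is not None and candidate not in used:
            mapping[tid] = candidate
            used.add(candidate)
    return mapping
```

Because the sketch pairs on the first call name alone, two threads that begin with the same call cannot both be paired; that is the case a manual --tid-map covers.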
Usage
# Default — auto-map tids, write markdown to stdout
python3 diff_events.py --canary canary.jsonl --ours ours.jsonl
# Write report to a file
python3 diff_events.py --canary c.jsonl --ours o.jsonl --out report.md
# Manual tid map
python3 diff_events.py --canary c.jsonl --ours o.jsonl --tid-map 6=1,7=2
# Negative-test mode — exit non-zero on ANY divergence (gate-4)
python3 diff_events.py --canary c.jsonl --ours o.jsonl --validate-identical
How it compares
These fields are skipped when comparing payloads:
- Top-level: engine, host_ns, guest_cycle, deterministic.
- handle.create / handle.destroy: raw_handle_id, handle_semantic_id (engine-local).
- wait.begin: handles_semantic_ids (engine-local SIDs).
- wait.end: wait_duration_cycles (depends on host scheduling), woken_by_semantic_id.
The tid_event_idx field is the alignment key. Two events at the same tid_event_idx on a mapped pair of tids are expected to be the same logical event. The kind must match; the payload must match field-by-field (except skipped fields).
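
A minimal sketch of this comparison rule, assuming each event is a flat JSON object whose payload fields sit beside kind. The function names are hypothetical, and skipping tid is an added assumption (tids are mapped between engines, so they are not expected to be equal); the skip lists themselves are the ones given above.

```python
TOP_LEVEL_SKIP = {"engine", "host_ns", "guest_cycle", "deterministic", "tid"}
KIND_SKIP = {
    "handle.create": {"raw_handle_id", "handle_semantic_id"},
    "handle.destroy": {"raw_handle_id", "handle_semantic_id"},
    "wait.begin": {"handles_semantic_ids"},
    "wait.end": {"wait_duration_cycles", "woken_by_semantic_id"},
}
_MISSING = object()  # distinguishes "field absent" from "field is null"


def comparable(event):
    """Drop the fields that are skipped for this event's kind."""
    skip = TOP_LEVEL_SKIP | KIND_SKIP.get(event.get("kind"), set())
    return {key: value for key, value in event.items() if key not in skip}


def first_mismatch(canary_event, ours_event):
    """Return None if the events match, else the first differing field name."""
    if canary_event.get("kind") != ours_event.get("kind"):
        return "kind"
    a, b = comparable(canary_event), comparable(ours_event)
    for key in sorted(set(a) | set(b)):
        if a.get(key, _MISSING) != b.get(key, _MISSING):
            return key
    return None
```

Returning the field name rather than a boolean makes it straightforward to label a divergence by the field that differed.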
Phase C+18 — Cross-tid floating handle.create (shared-global dispatchers)
Process-global kernel dispatcher objects (KEVENT/KSEMAPHORE etc. that game code creates with KeInitializeEvent or allocates statically and shares across multiple guest threads) are lazy-wrapped on first guest-thread touch, by XObject::GetNativeObject in canary and by ensure_dispatcher_object in ours. Whichever thread touches the dispatcher first synthesizes the wrapper and emits the handle.create event. Which thread wins is timing-dependent, so canary and ours may disagree.
The SID for these synthesized handles is computed via a scheduling-invariant recipe keyed on (pointer, object_type) only (see schema-v1.md §"Shared-global SIDs"). The same dispatcher therefore yields the same SID in both engines regardless of the first-toucher thread.
The diff tool detects shared-global handle.create events by recomputing the deterministic SID from the event's (raw_handle_id, object_type) payload and matching against the emitted handle_semantic_id. When per-tid alignment finds one side has an "extra" handle.create event whose SID is in the global set, the tool advances only that side's stream pointer past the floating event and re-compares — preserving strict alignment for everything else.
The summary table shows per-pair floating_skipped (c/o) counts so you can see how many events were absorbed by this mechanism.
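
The skip-and-re-compare walk can be sketched as below. This is an illustration under stated assumptions, not the tool's code: the set of shared-global SIDs is passed in precomputed (the real recipe lives in schema-v1.md and is not reproduced here), same is any caller-supplied event comparator, and the function names are hypothetical.

```python
def is_floating(event, global_sids):
    """True for a handle.create whose SID belongs to the shared-global set."""
    return (event.get("kind") == "handle.create"
            and event.get("handle_semantic_id") in global_sids)


def walk_with_floating(canary, ours, global_sids, same):
    """Walk two mapped streams, absorbing floating handle.create events.

    Returns (divergence, skipped). divergence is None, or the (canary_index,
    ours_index) position of the first real mismatch. skipped counts the
    floating events absorbed on each side. Trailing events left over when
    one stream ends first are not examined in this sketch.
    """
    i = j = 0
    skipped = {"canary": 0, "ours": 0}
    while i < len(canary) and j < len(ours):
        if same(canary[i], ours[j]):
            i += 1
            j += 1
        elif is_floating(canary[i], global_sids):
            i += 1  # advance only the canary pointer, then re-compare
            skipped["canary"] += 1
        elif is_floating(ours[j], global_sids):
            j += 1  # advance only the ours pointer, then re-compare
            skipped["ours"] += 1
        else:
            return (i, j), skipped
    return None, skipped
```

The two counters correspond to the c/o halves of the floating_skipped column in the summary table.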
Known limitations (v1)
- Auto tid-map is naive: pairs canary-tid with ours-tid by the first kernel.call name on each thread. Works for boot when the same initial call happens on each engine's primary thread; can mis-pair if two threads start with the same first-call name or if a thread spawns earlier on one engine. Use --tid-map to override.
- No streaming: loads both files fully into memory. Acceptable for boot-window runs; the canary log is ~370 MB for a 12 s run.
- First-divergence only: per-thread walk stops at first divergence. Subsequent divergences on the same thread are not reported (a sliding-window mode could be added later if needed).
- Schema v1 only: refuses to parse v2 inputs (forward-incompat is intentional).
Files
- diff_events.py: single-file CLI, stdlib only (json, argparse, pathlib).
- README.md: this file.
Test it
# Self-diff (compare a file against itself) should report 0 divergences.
python3 diff_events.py --canary x.jsonl --ours x.jsonl --validate-identical
echo "exit=$?" # expect 0
# Negative test: corrupt one event and confirm the tool reports it.
sed '50s/"kernel.call"/"kernel.CORRUPT"/' x.jsonl > /tmp/x-corrupt.jsonl
python3 diff_events.py --canary x.jsonl --ours /tmp/x-corrupt.jsonl --validate-identical
echo "exit=$?" # expect 1