ITERATE-2.V: scheduler priority aging closes 18-day AUDIT-049 wedge

Priority aging in xenia-cpu/scheduler.rs:pick_runnable
(effective_priority = base + age_bonus(now_round - last_run_round),
capped at +31, AGING_ROUNDS_PER_BONUS=1). Strict-priority was parking
priority=0 threads behind CPU-bound priority=15 audio mixer
(sub_824D1328 guest spinwait at PC=0x824d1404 on CPU5). Aging
eventually picks the starved thread, breaking the producer-consumer
cycle that caused 5-tid wedge at PC=0x824ac578 since AUDIT-049 (10 May).

Cascade observed: tid=13 clean exit; events 121K -> 13M (107x); last
host_ns 767ms -> 51,011ms (66x); 8 new threads spawn; VdSwap 1 -> 2.

Complete two-day iterate sequence (2026-05-27 -> 2026-05-28):
- 2.F: VdSwap drain timeout 900ms -> 1ms (xenia-gpu/handle.rs); 876x
       perf win on VdSwap kernel callback
- 2.H: vA0000000 physical heap bucket added (state.rs, exports.rs);
       ctx_ptrs now in 0xA0000000-0xBFFFFFFF range matching canary
- 2.L: Phase-A diff harness categorized [return_value mismatch],
       [status mismatch], [args_resolved.path mismatch] tags
       (tools/diff-events/diff_events.py); closes reading-error #41
       (silent test-harness state leak invalidating trace diffs)
- 2.M: always-on exit-thread-state.json sibling to Phase-A JSONL
       (event_log.rs + xenia-app/main.rs); closes reading-error #42
       (Phase-A blind to blocked-forever waits)
- 2.Q: signal.match kernel instrumentation in NtSetEvent /
       NtReleaseSemaphore / KeSetEvent / KeReleaseSemaphore
       (exports.rs); emits target_handle + waiter_count + waiter_tids
- 2.T: wake.requested kernel instrumentation in wake_eligible_waiters
       (exports.rs); emits target_tid + transition + new_state
- 2.V: scheduler priority aging (xenia-cpu/scheduler.rs) [keystone]

Plus accumulated WIP from earlier May (contention_manifest,
phase_b_snapshot, xam/xaudio enhancements, analysis db, xex loader,
xenia-app main loop, etc.). Audit-runs/ artifacts remain untracked
per project convention.

Tests: 300 xenia-cpu / 227 xenia-kernel / 5 xenia-app / 19 xenia-path
/ 30+ smaller suites -- all PASS, 0 regressions. Determinism preserved
(2x cold runs bit-identical at 13,003,881 events post-2.V).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
This commit is contained in:
MechaCat02
2026-05-29 07:27:26 +02:00
parent e6d43a23ac
commit ad45873a1b
50 changed files with 14389 additions and 506 deletions

View File

@@ -0,0 +1,50 @@
---
address: 0x82458B90
classification: normal_callee
confidence: high
last_audit: 060
aliases:
- "canary γ-wedge signaler A"
---
# sub_82458B90 — canary γ-wedge signaler A (NtSetEvent caller from tid=6 thread_proc body)
## Synopsis
A function that wraps `bl 0x824AA2F0` (NtSetEvent wrapper) at an internal PC near `+0x180` (canary LR `0x82458D14`). In canary, this is one of two NtSetEvent caller-LRs that signal the AUDIT-059 file-IO completion wedge dup handle (per `(tid, r31)` cross-run invariant). Reached only from [sub_82457EF0](sub_82457EF0.md)+0x24, which is itself the **tid=6 thread_proc entry**. The "1 static caller, 0 callers above" chain in `xrefs` is structurally correct for a fn invoked from a thread loop's body.
## Evidence
- AUDIT-059 Probe C canary: at LR `0x82458D14` (=`sub_82458B90+0x184` or similar post-`bl 0x824AA2F0` internal PC), signals the wedge dup handle (matched cross-run via `r31` stack invariant — thread `F8000054` / frame `0x7036FDC0`).
- AUDIT-060 Probe O ours: fires **1× in ours** (`--ctor-probe`), called from `sub_82457EF0+0x24` (PC `0x82457f18`).
- Static caller chain in DB: `sub_82458B90 ← sub_82457EF0` (1 caller); `sub_82457EF0` itself has 0 static callers — it is the tid=6 thread_proc entry.
## Activation
Direct `bl` from `sub_82457EF0+0x24` (single static caller). [sub_82457EF0](sub_82457EF0.md) is a `thread_proc`, so the activation chain is:
1. Some boot-site calls `ExCreateThread(entry=sub_82457EF0)` — installing tid=6's thread_proc.
2. Thread tid=6 starts; PPC entry-LR sentinel `0xbcbcbcbc` indicates "first instruction of thread_proc".
3. `sub_82457EF0` body calls this fn via `bl` at `+0x24`.
## Static graph
- Static callers (`bl`): 1 site = `sub_82457EF0+0x24` (PC `0x82457f18`).
- Callees: `bl 0x824AA2F0` (NtSetEvent wrapper) internal.
## Audit log
- **AUDIT-060 (2026-05-12)** — confirmed alive in ours (1 fire on tid=6). AUDIT-059's "fires 1× off-wedge" wording was technically correct but misleading; the function IS active, just signaling a different KEVENT instance per call. [confirmed alive]
- **AUDIT-059 (2026-05-11)** — identified as canary NtSetEvent signaler A for the wedge dup handle via cross-run `r31` invariant. Static reachability claim ("only-caller has 0 callers — fnptr-array only") flagged as suspect; AUDIT-060 confirms the chain is correct but the conclusion ("unreachable") was wrong. [confirmed for canary signaler role]
## Open questions
- What r3 (handle) does `sub_82458B90` pass to `bl 0x824AA2F0` in ours's 1 fire vs canary's signaling fires? Probe entry of `sub_824AA2F0` filtered by caller=`sub_82458B90`.
- Is `sub_82457EF0`'s thread body a "wait on queue, dequeue work, signal completion" loop? If yes, what queue? And is the queue empty in ours but populated in canary?
## Cross-references
- Caller (thread_proc): [sub_82457EF0](sub_82457EF0.md).
- NtSetEvent wrapper: `sub_824AA2F0` (not yet dossierd).
- Sibling canary signaler: [sub_8245EC10](sub_8245EC10.md).
- Audits: 059, 060.
- Artifacts: `audit-runs/audit-059-gamma-wedge/canary-setwrapper.log`, `audit-runs/audit-060-fnptr-array-bootstrap/`.