ITERATE-2.V: scheduler priority aging closes 18-day AUDIT-049 wedge

Priority aging in xenia-cpu/scheduler.rs:pick_runnable
(effective_priority = base + age_bonus(now_round - last_run_round),
capped at +31, AGING_ROUNDS_PER_BONUS=1). Strict-priority scheduling was
parking priority=0 threads behind the CPU-bound priority=15 audio mixer
(sub_824D1328 guest spinwait at PC=0x824d1404 on CPU5). Aging
eventually picks the starved thread, breaking the producer-consumer
cycle that had caused the 5-tid wedge at PC=0x824ac578 since AUDIT-049
(10 May).

Cascade observed: tid=13 clean exit; events 121K -> 13M (107x); last
host_ns 767ms -> 51,011ms (66x); 8 new threads spawn; VdSwap 1 -> 2.
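The aging rule above can be illustrated with a minimal sketch. This is not the code in xenia-cpu/scheduler.rs; only the formula, the +31 cap, and AGING_ROUNDS_PER_BONUS=1 come from this message. The `Thread` struct, `MAX_AGE_BONUS`, the lowest-tid tie-break, the mixer tid, and the helpers `demo_threads` / `first_round_picked` are assumptions made for illustration, and the cap is assumed to apply to the bonus rather than to the sum.

```rust
use std::cmp::Reverse;

// From the commit message.
const AGING_ROUNDS_PER_BONUS: u64 = 1;
// Assumption: "+31" caps the bonus, not the effective priority.
const MAX_AGE_BONUS: u64 = 31;

// Hypothetical thread record; the real scheduler state differs.
#[derive(Debug, Clone)]
struct Thread {
    tid: u32,
    base_priority: u32,
    last_run_round: u64,
    runnable: bool,
}

// One bonus point per AGING_ROUNDS_PER_BONUS rounds waited, capped.
fn age_bonus(rounds_waited: u64) -> u32 {
    (rounds_waited / AGING_ROUNDS_PER_BONUS).min(MAX_AGE_BONUS) as u32
}

fn effective_priority(t: &Thread, now_round: u64) -> u32 {
    t.base_priority + age_bonus(now_round.saturating_sub(t.last_run_round))
}

// Pick the runnable thread with the highest effective priority.
// Ties go to the lowest tid (an assumption, chosen to keep the pick deterministic).
fn pick_runnable(threads: &mut [Thread], now_round: u64) -> Option<u32> {
    let idx = threads
        .iter()
        .enumerate()
        .filter(|(_, t)| t.runnable)
        .max_by_key(|(_, t)| (effective_priority(t, now_round), Reverse(t.tid)))
        .map(|(i, _)| i)?;
    threads[idx].last_run_round = now_round;
    Some(threads[idx].tid)
}

// Hypothetical scenario: an always-runnable priority-15 spinner (tid 5 is made up)
// and a priority-0 thread (tid 13) that strict priority would never schedule.
fn demo_threads() -> Vec<Thread> {
    vec![
        Thread { tid: 5, base_priority: 15, last_run_round: 0, runnable: true },
        Thread { tid: 13, base_priority: 0, last_run_round: 0, runnable: true },
    ]
}

// First round (1-based) in which `target_tid` is picked, if any.
fn first_round_picked(threads: &mut [Thread], target_tid: u32, max_rounds: u64) -> Option<u64> {
    (1..=max_rounds).find(|&round| pick_runnable(threads, round) == Some(target_tid))
}
```

In this toy scenario the spinner is re-picked every round, so its bonus stays at 1, while the starved thread's bonus grows by one per round until it overtakes. Under strict priority (no bonus) the priority-0 thread would never be chosen.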

Complete two-day iterate sequence (2026-05-27 -> 2026-05-28):
- 2.F: VdSwap drain timeout 900ms -> 1ms (xenia-gpu/handle.rs); 876x
  perf win on VdSwap kernel callback
- 2.H: vA0000000 physical heap bucket added (state.rs, exports.rs);
  ctx_ptrs now in 0xA0000000-0xBFFFFFFF range matching canary
- 2.L: Phase-A diff harness categorized [return_value mismatch],
  [status mismatch], [args_resolved.path mismatch] tags
  (tools/diff-events/diff_events.py); closes reading-error #41
  (silent test-harness state leak invalidating trace diffs)
- 2.M: always-on exit-thread-state.json sibling to Phase-A JSONL
  (event_log.rs + xenia-app/main.rs); closes reading-error #42
  (Phase-A blind to blocked-forever waits)
- 2.Q: signal.match kernel instrumentation in NtSetEvent /
  NtReleaseSemaphore / KeSetEvent / KeReleaseSemaphore
  (exports.rs); emits target_handle + waiter_count + waiter_tids
- 2.T: wake.requested kernel instrumentation in wake_eligible_waiters
  (exports.rs); emits target_tid + transition + new_state
- 2.V: scheduler priority aging (xenia-cpu/scheduler.rs) [keystone]

Plus accumulated WIP from earlier May (contention_manifest,
phase_b_snapshot, xam/xaudio enhancements, analysis db, xex loader,
xenia-app main loop, etc.). Artifacts under audit-runs/ remain
untracked per project convention.

Tests: 300 xenia-cpu / 227 xenia-kernel / 5 xenia-app / 19 xenia-path
/ 30+ smaller suites -- all PASS, 0 regressions. Determinism preserved
(2x cold runs bit-identical at 13,003,881 events post-2.V).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
50
docs/functions/sub_824F7800.md
Normal file
@@ -0,0 +1,50 @@
---
address: 0x824F7800
classification: normal_callee
confidence: high
last_audit: 064
aliases:
  - "AUDIT-058 caller-ladder fn #2 (bctrl-dispatch site for sub_825070F0)"
---

# sub_824F7800 — dispatch caller for ANON_Class_713383D7 vtable slot 1

## Synopsis

Normal callee that performs the `bctrl` invoking [sub_825070F0](sub_825070F0.md) (slot 1 of the `ANON_Class_713383D7` vtable at `0x8200A208`). Bottom of a 4-fn linear call chain (`sub_824F8398 → sub_824F7CD0 → sub_824F7800 → [bctrl] → sub_825070F0`) that runs once per game-loop activation pass. AUDIT-064 verified canary fires this fn 1× at ~60s wallclock; ours fires it 0× because the entire chain sits downstream of tid=13's audit-049 wedge.

## Evidence

- Disasm prolog at `0x824F7800`: `mflr r12; bl 0x825F0F60 (frame helper); stwu r1, -336(r1); mr r22, r3; ...` — standard normal-callee prolog. NOT MSVC EH-handler shape (no `subi r31, r12, N`).
- Function size: 1232 bytes / 308 insns. `has_eh=False`, `frame_size=336`.
- Static caller xref: 1 — `bl` from PC `0x824F8314` inside [sub_824F7CD0](sub_824F7CD0.md). No other refs (only `.pdata` entry at file offset `0x1347B0` — standard unwind metadata).
- AUDIT-064 canary 60s probe (`--audit_61_branch_probe_pcs=0x824F7800,...`): fires 1× with `lr=0x824F8318 r3=BE568F00 r4=701CF5B0 r5=BCA44D40 r6=BCA44DE0` on tid=6. Reproduced bit-identical at 120s and 180s wallclock.
- AUDIT-064 ours `--ctor-probe=0x824F7800` -n 500M: **0 fires**.
- The `bctrl` at PC `0x824F7B20` (= `sub_824F7800+0x320`, slot 1 of `0x8200A208` vtable) is where [sub_825070F0](sub_825070F0.md) is dispatched from.
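
The address arithmetic in the evidence list can be cross-checked mechanically. The sketch below is illustrative only: the constants are copied from the bullets above, while the helper names (`insn_count`, `offset_in`, `return_address`, `contains`) are hypothetical and not part of any project tool. It relies on two PowerPC facts: instructions are a fixed 4 bytes, and `bl` stores the address of the following instruction in `lr`.

```rust
// Values copied from the evidence list above.
const FN_START: u32 = 0x824F_7800;
const FN_SIZE_BYTES: u32 = 1232;
const BCTRL_PC: u32 = 0x824F_7B20;
const CALLER_FN: u32 = 0x824F_7CD0;
const CALLER_BL_PC: u32 = 0x824F_8314;
const OBSERVED_LR: u32 = 0x824F_8318;

// PowerPC instructions are fixed-width, 4 bytes each.
const PPC_INSN_BYTES: u32 = 4;

// Number of instructions in a function of the given byte size.
fn insn_count(size_bytes: u32) -> u32 {
    size_bytes / PPC_INSN_BYTES
}

// Byte offset of `pc` from the start of its function.
fn offset_in(fn_start: u32, pc: u32) -> u32 {
    pc - fn_start
}

// `bl` writes the address of the next instruction into lr.
fn return_address(bl_pc: u32) -> u32 {
    bl_pc + PPC_INSN_BYTES
}

// True if `pc` lies inside [fn_start, fn_start + size_bytes).
fn contains(fn_start: u32, size_bytes: u32, pc: u32) -> bool {
    pc >= fn_start && pc < fn_start + size_bytes
}
```

With these helpers, 1232 bytes gives 308 instructions, the `bctrl` sits at offset `0x320` inside the function, the caller's `bl` sits at `sub_824F7CD0+0x644`, and the canary's `lr=0x824F8318` is the instruction immediately after that `bl`.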

## Activation

Direct `bl` from `sub_824F7CD0+0x644` (PC `0x824F8314`). Both engines see the same single static caller.

## Static graph

- Static callers (from `xrefs.source_func`):
  - PC `0x824F8314` inside `sub_824F7CD0` (the only caller).
- Callees include the `bctrl` at PC `0x824F7B20` that dispatches to `sub_825070F0` via vtable slot 1 of `ANON_Class_713383D7` (vtable `0x8200A208`).

## Audit log

- **AUDIT-064 (2026-05-12)** — disasm confirms normal-callee prolog (refutes "another EH handler" hypothesis). Canary probe fires 1× / ours 0×. Static-DB caller is the runtime caller (no surprise bctrl divergence here). The chain runs downstream of [sub_822F1AA8](sub_822F1AA8.md)'s vtable[0] dispatch through sub_82173990 — which waits on tid=13 — so ours never reaches it because tid=13 is blocked on the AUDIT-049 wedge. [confirmed]
- **AUDIT-058 (2026-05-10)** — flagged as part of the static caller ladder for sub_825070F0. [confirmed at this level; ladder framing partially preserved — see sub_821B6DF4 for the EH-thunk caveat one step further up]

## Open questions

- Why does the bctrl at `0x824F7B20` always dispatch to `sub_825070F0` (slot 1 of vtable `0x8200A208`) at this point? Investigate where the `r3` instance pointer comes from — likely a class member loaded via the slot-1 ctor path of `ANON_Class_713383D7`.
- The 4-fn linear chain (`sub_824F8398 → sub_824F7CD0 → sub_824F7800 → bctrl`) is rigid and runs end-to-end without branching in canary. Confirm no early-exit branches inside the chain in ours (irrelevant if we resolve the audit-049 wedge first).

## Cross-references

- Callees: `sub_825070F0` via slot 1 of vtable `0x8200A208` at `bctrl` PC `0x824F7B20`.
- Callers: `sub_824F7CD0+0x644`.
- Audits: 058, 064.
- Artifacts: `audit-runs/audit-064-activation-ladder/canary-{60,120,180}s.log`, `audit-runs/audit-064-activation-ladder/ours-500M.stdout`.