Files
xenia-rs/docs/functions/sub_822F1AA8.md
MechaCat02 ad45873a1b ITERATE-2.V: scheduler priority aging closes 18-day AUDIT-049 wedge
Priority aging in xenia-cpu/scheduler.rs:pick_runnable
(effective_priority = base + age_bonus(now_round - last_run_round),
capped at +31, AGING_ROUNDS_PER_BONUS=1). Strict-priority was parking
priority=0 threads behind CPU-bound priority=15 audio mixer
(sub_824D1328 guest spinwait at PC=0x824d1404 on CPU5). Aging
eventually picks the starved thread, breaking the producer-consumer
cycle that caused 5-tid wedge at PC=0x824ac578 since AUDIT-049 (10 May).

Cascade observed: tid=13 clean exit; events 121K -> 13M (107x); last
host_ns 767ms -> 51,011ms (66x); 8 new threads spawn; VdSwap 1 -> 2.

Complete two-day iterate sequence (2026-05-27 -> 2026-05-28):
- 2.F: VdSwap drain timeout 900ms -> 1ms (xenia-gpu/handle.rs); 876x
       perf win on VdSwap kernel callback
- 2.H: vA0000000 physical heap bucket added (state.rs, exports.rs);
       ctx_ptrs now in 0xA0000000-0xBFFFFFFF range matching canary
- 2.L: Phase-A diff harness categorized [return_value mismatch],
       [status mismatch], [args_resolved.path mismatch] tags
       (tools/diff-events/diff_events.py); closes reading-error #41
       (silent test-harness state leak invalidating trace diffs)
- 2.M: always-on exit-thread-state.json sibling to Phase-A JSONL
       (event_log.rs + xenia-app/main.rs); closes reading-error #42
       (Phase-A blind to blocked-forever waits)
- 2.Q: signal.match kernel instrumentation in NtSetEvent /
       NtReleaseSemaphore / KeSetEvent / KeReleaseSemaphore
       (exports.rs); emits target_handle + waiter_count + waiter_tids
- 2.T: wake.requested kernel instrumentation in wake_eligible_waiters
       (exports.rs); emits target_tid + transition + new_state
- 2.V: scheduler priority aging (xenia-cpu/scheduler.rs) [keystone]

Plus accumulated WIP from earlier May (contention_manifest,
phase_b_snapshot, xam/xaudio enhancements, analysis db, xex loader,
xenia-app main loop, etc.). Audit-runs/ artifacts remain untracked
per project convention.

Tests: 300 xenia-cpu / 227 xenia-kernel / 5 xenia-app / 19 xenia-path
/ 30+ smaller suites -- all PASS, 0 regressions. Determinism preserved
(2x cold runs bit-identical at 13,003,881 events post-2.V).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-29 07:27:26 +02:00

56 lines
6.1 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
---
address: 0x822F1AA8
classification: normal_callee
confidence: high
last_audit: 065
aliases:
- "tid=1 post-init dispatch loop (calls sub_82173990 via vtable[0])"
---
# sub_822F1AA8 — tid=1 post-init game-loop dispatcher
## Synopsis
Normal-callee invoked by tid=1's `entry_point → sub_8216EA68 → sub_822F1AA8`. Contains the per-frame game-loop pump for the post-init subsystem-dispatch tree (audit-064 chain to [sub_825070F0](sub_825070F0.md)). Runs an outer loop: (a) `KeWaitForSingleObject` infinite at PC `0x822F1DFC`, (b) dispatches vtable[0] of object at `*(0x828E1F08)` at PC `0x822F1B4C bctrl` — which is `sub_82175330` → tail-jump → `sub_82173990`. Canary executes the body 4040× in 60s (per-frame). Ours executes the function entry 1× then **blocks immediately inside sub_82173990 (the vtable[0] callee) at sub_82173990+0x2D0** — KeWaitForSingleObject INFINITE on handle `0x12A4` (= tid=13's thread handle = AUDIT-049 wedge).
## Evidence
- Disasm prolog: `mflr r12; bl 0x825F0F60; stfd f30, -136(r1); stfd f31, -128(r1); subi r31, r1, 256; stwu r1, -256(r1); mr r30, r3; ...` — normal-callee.
- Function size: 996 bytes / 249 insns. `has_eh=True`, `frame_size=0` per DB (dynamic 256-byte frame).
- Static caller: `sub_8216EA68+0x3AC` via `bl` (the post-init dispatcher).
- AUDIT-064 ours `--ctor-probe=0x822F1AA8` -n 500M fires 1× at tid=1, cycle=6,171,801, lr=0x8216ee14, r3=0x40d09a40. Back-chain: tid=1 thread_proc → entry_point → sub_8216EA68+0x3AC → sub_822F1AA8.
- AUDIT-064 ours fine probe at BB-entries `0x822F1ACC/0x822F1AEC/0x822F1B20/0x822F1B30/0x822F1B38` all fire 1× — execution does pass through the function body to PC `0x822F1B38`.
- AUDIT-064 ours `--ctor-probe=0x822F1B50` fires **0×**. The bctrl at PC `0x822F1B4C` DOES execute (sub_82175330 fires 1× per separate probe), but never returns — because sub_82175330 tail-jumps to sub_82173990 which blocks at +0x2D0.
- AUDIT-064 canary 60s probe (`--audit_61_branch_probe_pcs`): all probes in body fire — `0x822F1B5C/0x822F1B78/0x822F1BB8` fire 1×, `0x822F1BCC` (outer-loop body) fires 4040×, `0x822F1D58` (the inner bctrl → sub_821741C8) fires 4030×, `0x822F1DFC` (outer KeWait) fires 4040×.
- Global `0x828E1F08` is the slot holding the object pointer that the vtable[0] bctrl dispatches off. Its writers are `sub_822F14D8+0xF0` (PC `0x822F15A4`, observed in ours) and `sub_822F1638+0x84` (PC `0x822F16BC`). At cycle ~6,171,800 in ours, `[0x828E1F08]` is set to `0x40111890`; `[0x40111890+0]` evolves through multiple vtable values (`0x820AD894`, `0x820A183C`, ...) before the bctrl fires; final value at bctrl is `0x820A183C` (slot 0 = `sub_82175330`).
## Activation
Direct `bl` from `sub_8216EA68+0x3AC` on tid=1. One-shot at boot per game session — but the function itself contains an outer loop that iterates `KeWaitForSingleObject` waits until termination.
## Static graph
- Static callers: `sub_8216EA68+0x3AC` via `bl` (sole).
- Direct callees: `sub_822F13B0`, `sub_824AA2F0` (NtSetEvent wrapper), `sub_82172370`, `sub_824AA3E0`, `sub_824C1910` (leaf), `sub_824AA8B0`, `sub_82456B58`, `sub_824AA330` (KeWaitForSingleObject wrapper), `sub_824574C0`, `sub_82457038`, `sub_8284E45C` (kernel import thunk).
- Indirect: `bctrl` at PC `0x822F1B4C` (vtable[0] of `*(0x828E1F08)`) and `bctrl` at PC `0x822F1D58` (vtable[2] of same).
- Reads `0x828E1F08` at PCs `0x822F1B3C, 0x822F1BE8, 0x822F1D40, 0x822F1E44, ...` (11 reads).
## Audit log
- **AUDIT-065 (2026-05-12)** — refined the dispatch-target understanding. The vtable[0] callee at PC `0x822F1B4C bctrl` is `sub_82175330` (2-insn tail-jump to `sub_82173990`). `sub_82173990` is a **synchronous task-spawn-and-join helper** — not an outer game loop. Canary fires this function **exactly 1× per boot** (not 4040× as the synopsis previously suggested) — the 4040× metric in audit-064 referred to PCs *downstream of* sub_82173990's return into sub_822F1AA8's outer loop (PCs `0x822F1BCC`/`0x822F1D58`/`0x822F1DFC`). Per AUDIT-065 sub_82173990 dossier, the wait at sub_82173990+0x2D0 IS a thread-join, and the body of sub_82173990 itself is clean — only the worker spawned via `sub_821746B0` (which becomes ours's tid=13) is wedged inside `sub_821CB030`. [confirmed]
- **AUDIT-064 (2026-05-12)** — identified as the immediate dispatch chain origin for the 4-fn ladder leading to sub_825070F0. Disasm + ours fine-grained BB probes localize the wedge: tid=1 enters function (1×), passes through PCs 0x822F1ACC/0x822F1AEC/0x822F1B20/0x822F1B30/0x822F1B38, executes bctrl at PC `0x822F1B4C` → sub_82175330 → sub_82173990 → KeWaitForSingleObject(handle=0x12A4 = tid=13 thread handle) → STALL. Canary instead returns from that wait and enters the outer game-loop body (`0x822F1BCC` fires 4040× in 60s). **First divergence between canary and ours is at sub_82173990's wait inside vtable[0] of `*(0x828E1F08)` — same AUDIT-049 wedge.** [confirmed]
## Open questions
- The vptr at `[0x40111890+0]` mutates multiple times before the bctrl fires (writes from sub_82152XXX, sub_8244e850, sub_8244e8bc, sub_82155b4c, sub_82460c40, sub_822F2758, sub_8216F110, ...). Is the final value `0x820A183C` (which has slot 0 = sub_82175330) the same as canary's final value? Run the same `--mem-watch` against canary to verify.
- Why does canary's tid=13 finish (allowing tid=1's join wait on handle 0x12A4 to complete) while ours's tid=13 stalls? That's the AUDIT-049 root question — separately tracked.
## Cross-references
- Direct callers: `sub_8216EA68+0x3AC`.
- Callees of interest: `sub_82173990` (via vtable[0] thunk `sub_82175330`) — where tid=1's stall occurs.
- Downstream (when activated): `sub_82173990``sub_821741C8``sub_82172BA0``sub_821B55D8``sub_824F8398``sub_824F7CD0``sub_824F7800``sub_825070F0`.
- Object dispatch: `*(0x828E1F08) = 0x40111890`, vptr `[0x40111890+0] = 0x820A183C` (vtable), slot 0 = `sub_82175330`.
- Audits: 049 (the underlying wedge), 064.
- Artifacts: `audit-runs/audit-064-activation-ladder/ours-fine-822F1AA8.stdout`, `ours-bb-822F1AA8.stdout`, `ours-vtable820a183c.stdout`, `ours-vptr-time.log`, `canary-inside-822F1AA8.log`.