ITERATE-2.V: scheduler priority aging closes 18-day AUDIT-049 wedge
Add priority aging to xenia-cpu/scheduler.rs:pick_runnable
(effective_priority = base + age_bonus(now_round - last_run_round),
capped at +31, AGING_ROUNDS_PER_BONUS=1). Strict-priority scheduling was
parking priority=0 threads behind the CPU-bound priority=15 audio mixer
(sub_824D1328 guest spinwait at PC=0x824d1404 on CPU5). Aging
eventually picks the starved thread, breaking the producer-consumer
cycle that had caused the 5-tid wedge at PC=0x824ac578 since AUDIT-049 (10 May).
Cascade observed: tid=13 clean exit; events 121K -> 13M (107x); last
host_ns 767ms -> 51,011ms (66x); 8 new threads spawn; VdSwap 1 -> 2.
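
For readers unfamiliar with aging, the rule above can be sketched roughly as follows. This is a minimal illustration, not the code in scheduler.rs: the `Thread` struct, `MAX_AGE_BONUS`, and the tie-breaking behaviour are assumptions. Only `pick_runnable`, `effective_priority`, `age_bonus`, and `AGING_ROUNDS_PER_BONUS` are names taken from this message. It also assumes that a larger number means higher priority and that the +31 cap applies to the bonus rather than to the sum.

```rust
// Hypothetical sketch of priority aging; not the actual xenia-cpu implementation.
const AGING_ROUNDS_PER_BONUS: u64 = 1; // value quoted in the commit message
const MAX_AGE_BONUS: u32 = 31; // assumed name for the "+31" cap

#[derive(Debug, Clone)]
struct Thread {
    tid: u32,
    base_priority: u32,
    last_run_round: u64,
    runnable: bool,
}

/// Bonus grows by one per AGING_ROUNDS_PER_BONUS rounds waited, capped at MAX_AGE_BONUS.
fn age_bonus(rounds_waited: u64) -> u32 {
    (rounds_waited / AGING_ROUNDS_PER_BONUS).min(MAX_AGE_BONUS as u64) as u32
}

/// effective_priority = base + age_bonus(now_round - last_run_round)
fn effective_priority(t: &Thread, now_round: u64) -> u32 {
    t.base_priority + age_bonus(now_round.saturating_sub(t.last_run_round))
}

/// Pick the runnable thread with the highest effective priority and
/// reset its age by recording the round in which it ran.
fn pick_runnable(threads: &mut [Thread], now_round: u64) -> Option<u32> {
    let winner = threads
        .iter_mut()
        .filter(|t| t.runnable)
        .max_by_key(|t| effective_priority(t, now_round))?;
    winner.last_run_round = now_round;
    Some(winner.tid)
}
```

Under these assumptions, a priority=0 thread competing with an always-runnable priority=15 thread is picked after at most 17 rounds, whereas strict priority would never pick it.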
Complete two-day iterate sequence (2026-05-27 -> 2026-05-28):
- 2.F: VdSwap drain timeout 900ms -> 1ms (xenia-gpu/handle.rs); 876x
  perf win on VdSwap kernel callback
- 2.H: vA0000000 physical heap bucket added (state.rs, exports.rs);
  ctx_ptrs now in 0xA0000000-0xBFFFFFFF range matching canary
- 2.L: Phase-A diff harness categorized [return_value mismatch],
  [status mismatch], [args_resolved.path mismatch] tags
  (tools/diff-events/diff_events.py); closes reading-error #41
  (silent test-harness state leak invalidating trace diffs)
- 2.M: always-on exit-thread-state.json sibling to Phase-A JSONL
  (event_log.rs + xenia-app/main.rs); closes reading-error #42
  (Phase-A blind to blocked-forever waits)
- 2.Q: signal.match kernel instrumentation in NtSetEvent /
  NtReleaseSemaphore / KeSetEvent / KeReleaseSemaphore
  (exports.rs); emits target_handle + waiter_count + waiter_tids
- 2.T: wake.requested kernel instrumentation in wake_eligible_waiters
  (exports.rs); emits target_tid + transition + new_state
- 2.V: scheduler priority aging (xenia-cpu/scheduler.rs) [keystone]
Plus accumulated WIP from earlier in May (contention_manifest,
phase_b_snapshot, xam/xaudio enhancements, analysis db, xex loader,
xenia-app main loop, etc.). The audit-runs/ artifacts remain untracked
per project convention.
Tests: 300 xenia-cpu / 227 xenia-kernel / 5 xenia-app / 19 xenia-path
/ 30+ smaller suites -- all PASS, 0 regressions. Determinism preserved
(2x cold runs bit-identical at 13,003,881 events post-2.V).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
76
docs/functions/sub_821C4EB0.md
Normal file
@@ -0,0 +1,76 @@
---
address: 0x821C4EB0
classification: vtable_method
confidence: high
last_audit: 061
aliases:
  - "silph::GamePart_Title::UImpl member fn"
  - "AUDIT-056 early-exit (falsified by 061)"
---

# sub_821C4EB0 — silph::GamePart_Title::UImpl member fn (AUDIT-061: NOT a branch-divergence gate)

## Synopsis

Member function on class `silph::GamePart_Title::UImpl` (vtable `0x820a3e00`). **AUDIT-061 falsified the "conditional-branch divergence in `[+0x44, +0xE0]`" framing**: no branch in that range diverges. B1 is decided **bit-identically** in canary and ours, and B2/B3/B4 are never reached in ours because the function stalls before them. The actual divergence is the call `bl 0x821CC3F8` at PC `0x821C4F14`: in canary the call returns to `0x821C4F18` and the rest of sub_821C4EB0 executes through the 5 `bl 0x821CEDF8` sites at +0x198..+0x240; in ours the call enters the chain `sub_821CC3F8 → sub_821CBA08 → sub_821CB030` and never returns (tid=13 wedge inside sub_821CB030 = AUDIT-049 wedge handle, `NtCreateEvent` at +0x128 → INFINITE wait). AUDIT-056's "5× canary / 0× ours" callsite count is an indirect consequence of the upstream wedge, not a branch-decision asymmetry in this fn.

## Evidence

- AUDIT-049: appears in tid=13 thread-create chain — `sub_821748F0 → sub_821C4EB0 (UImpl@GamePart_Title@silph) → sub_821CC3F8 → sub_821CBA08 → sub_821CB030`.
- AUDIT-056: caller-LR `0x821C4F2C / 0x821C5014 / 0x821C5048` are post-`bl` PCs inside this fn. Reported `sub_821CEDF8` 5× canary / 0× ours.
- AUDIT-059: in the wedge's wait-thread frame-4 saved-r29 the vtable is `0x820a3e00 = .?AUImpl@GamePart_Title@silph@@`, confirming class membership.
- AUDIT-061 (READ-ONLY canary multi-PC probe @ ~2:00 wallclock; ours `--branch-probe` @ -n 500M):
  - Both engines call sub_821C4EB0 exactly **1×** at this horizon. Same caller LR=0x82174A80 (canary tid=17, ours tid=13).
  - Canary probe fires 17× covering entry + post-bl block entries + all 4 cond-branches: B1 `beq cr6 NOT taken` (cr6=.G., r3=0xBC220008≠0), B2 `bne cr6 NOT taken` (cr6=..E, lbz @ 0x828F3284 = 0), B3 `beq cr6 TAKEN` (cr6=..E, lwz r3,92(r30) == 0), B4 `bgt cr6 TAKEN` (cr6=.G., [r27+4] > 4). Reaches 0x821C5048 (1st `bl 0x821CEDF8`) and 0x821C504C (returned).
  - Ours probe fires 4× covering entry + 3 post-bl: 0x821C4EB0 → 0x821C4EB8 → 0x821C4ED0 → 0x821C4EEC (r3=0x40105004 returned from `bl 0x82150EF8`; cr6=.G., **same direction as canary**). After 0x821C4EEC: **never reaches 0x821C4F18 or anywhere later in the function**.
  - Chain probe (separate run) confirms ours's tid=13 enters sub_821CC3F8 (cycle 2069) → sub_821CC3F8+0x38 post-alloc (2249) → sub_821CBA08 (2258) → sub_821CB030 (3242), then stalls. Canary's tid=17 returns out of all four and reaches 0x821CC454 (post-bl-sub_821CBA08) and 0x821C4F18 (post-bl-sub_821CC3F8) cleanly.
  - First divergent INSTRUCTION (not branch): `bl 0x821CC3F8` at PC `0x821C4F14`. First divergent state: ours's r3 at function entry to sub_821CC3F8 is `0x40105004` (40xxxxxx host-allocator region) vs canary's `0xBC220008` (BCxxxxxx region) — but this VA difference is the AUDIT-043 ε-class (allocator region drift) and is BENIGN here; sub_821CC3F8 dereferences r3 as a pool handle the same way in both engines and downstream allocation succeeds (sub_82150EF8 returns valid pointer in both).
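
The probe comparison above amounts to finding where two ordered lists of probe hits stop agreeing. A minimal sketch of that comparison follows. It is illustrative only: `Divergence` and `compare_probe_hits` are hypothetical names, and nothing here reflects the real output format of `--branch-probe` or of the canary cvar.

```rust
/// Result of comparing two ordered probe-hit traces (guest PCs).
/// Hypothetical type, introduced only for this illustration.
#[derive(Debug, PartialEq)]
enum Divergence {
    /// Both traces contain exactly the same PCs in the same order.
    Identical,
    /// Both traces continue past the common prefix but hit different PCs.
    Mismatch { index: usize, canary_pc: u32, ours_pc: u32 },
    /// Ours ends while canary continues: the stall pattern described above.
    OursStalled { last_common: Option<u32>, canary_next: u32 },
    /// Canary ends while ours continues.
    CanaryStalled { last_common: Option<u32>, ours_next: u32 },
}

fn compare_probe_hits(canary: &[u32], ours: &[u32]) -> Divergence {
    // Length of the longest common prefix of the two traces.
    let common = canary.iter().zip(ours).take_while(|(c, o)| c == o).count();
    let last_common = common.checked_sub(1).map(|i| canary[i]);
    match (canary.get(common), ours.get(common)) {
        (None, None) => Divergence::Identical,
        (Some(&c), Some(&o)) => Divergence::Mismatch { index: common, canary_pc: c, ours_pc: o },
        (Some(&c), None) => Divergence::OursStalled { last_common, canary_next: c },
        (None, Some(&o)) => Divergence::CanaryStalled { last_common, ours_next: o },
    }
}
```

Fed the four hits listed for ours and an abbreviated canary trace that continues to `0x821C4F18`, this reports `OursStalled` with `last_common = 0x821C4EEC`, which matches the "non-returning call, not a divergent branch" reading.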

## Activation

Vtable method. Reached via `bctrl` from class-owning code in the boot UI / GamePart_Title state machine. Indirect; the dispatch site PC and vtable slot index need DB cross-reference (see Open questions).

## Static graph

- Caller chain at the wedge site (AUDIT-049):
  - `sub_821C4EB0 ← sub_821748F0` (top-level)
  - flows down to `sub_821CC3F8 (GamePart_Title)` → `sub_821CBA08` → `sub_821CB030` (where the wedge fires)
- Callees in source order:
  - `0x821C4EB4 bl 0x825F0F7C` — save-GPRs prologue helper
  - `0x821C4ECC bl 0x8284DA7C` — XAM import `XNotifyPositionUI` (xam.xex ord 652); r3=0xA → returns 0 in both engines.
  - `0x821C4EE8 bl 0x82150EF8` — pool allocator (called with allocator table @ `[0x828E0000+11028]`, size=4); returns pointer in both engines (canary 0xBC220008, ours 0x40105004).
  - `0x821C4F14 bl 0x821CC3F8` — **first divergent instruction (AUDIT-061)**: returns in canary, wedges in ours.
  - `0x821C4F2C bl 0x82187C30` — only reached in canary at this horizon.
  - `0x821C4F60 bl 0x82172370` — only reached in canary.
  - `0x821C4F74 bl 0x824AA3E0` — conditional on prior beq; canary takes the SKIP-bl path (B3 = taken).
  - `0x821C5048 / 0x821C5074 / 0x821C50A0 / 0x821C50C8 / 0x821C50F0 bl 0x821CEDF8` — 5 sites in the bgt-taken path; only reached in canary.
- Conditional branches in `[+0x44, +0xE0]` (enumerated AUDIT-061):
  - B1 `0x821C4EF8 beq cr6, 0x821C4F20` — after `cmplwi cr6, r3, 0` (r3 = sub_82150EF8 return). Decided NOT taken in both.
  - B2 `0x821C4F3C bne cr6, 0x821C4F7C` — after `lbz r10, 12932(0x828F0000)` + `cmplwi r10, 0`. Decided NOT taken in canary; UNREACHED in ours.
  - B3 `0x821C4F70 beq cr6, 0x821C4F78` — after `lwz r3, 92(r30)`. Decided TAKEN in canary; UNREACHED in ours.
  - B4 `0x821C4F90 bgt cr6, 0x821C5000` — after `cmplwi cr6, r11, 3`, r11 = `[r27+4]−1`. Decided TAKEN in canary; UNREACHED in ours.
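
As a worked check of the B4 condition, the decrement followed by an unsigned compare against 3 can be modelled as below. This is a sketch that assumes the decrement behaves as a 32-bit wrapping subtract and that `cmplwi` compares the low 32 bits as unsigned; `b4_taken` is a hypothetical helper, not code from either engine.

```rust
/// Models B4: r11 = [r27+4] - 1; `cmplwi cr6, r11, 3`; `bgt cr6` (taken when r11 > 3, unsigned).
/// Hypothetical helper: `field` stands for the word loaded from [r27+4].
fn b4_taken(field: u32) -> bool {
    let r11 = field.wrapping_sub(1); // assumed 32-bit wrapping decrement
    r11 > 3 // unsigned "greater than", as cmplwi + bgt would decide
}
```

Under these assumptions the branch falls through only for values 1 through 4. It is taken for any value above 4, which is the canary case recorded above, and also for 0, because the decrement wraps to `0xFFFFFFFF`.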

## Audit log

- **AUDIT-061 (2026-05-12)** — Multi-PC branch probe in both engines (new canary cvar `audit_61_branch_probe_pcs`, ours `--branch-probe`). All 4 conditional branches in `[+0x44, +0xE0]` decided **bit-identically** (B1 NOT-taken in both; B2/B3/B4 UNREACHED in ours because the function stalls earlier). First divergent BEHAVIOR is the call `bl 0x821CC3F8` at PC `0x821C4F14` — returns in canary, wedges in ours. The wedge is INSIDE `sub_821CB030` (chain `sub_821C4EB0 → sub_821CC3F8 → sub_821CBA08 → sub_821CB030`); tid=13 reaches sub_821CB030 at cycle 3242 and blocks indefinitely. Confirms AUDIT-049 wedge premise; matches AUDIT-059 γ-class missing-signaler. AUDIT-056's "5× sub_821CEDF8 canary / 0× ours" is an indirect consequence (those 5 sites are at +0x198..+0x240, downstream of the wedge). [confirmed — sub_821C4EB0 is NOT a branch-divergence gate]
- **AUDIT-060 (2026-05-12)** — convergence confirmed this fn as the AUDIT-061 target after AUDIT-058/059's "missing activator" framing was refuted. [superseded by 061 — actual divergence is non-returning call, not a branch]
- **AUDIT-056 (2026-05-10)** — identified as the primary divergence-introducer. Caller-LR is IDENTICAL canary/ours but body chooses a different path. [partially falsified by 061 — the "different path" framing was true at a high level, but it's because of a non-returning call, not a divergent conditional-branch decision in `[+0x44, +0xE0]`. The 5 sub_821CEDF8 callsites are downstream of the wedge.]
- **AUDIT-049 (2026-05-10)** — placed on the tid=13 chain that ultimately creates the wedge handle. [confirmed — AUDIT-061 directly observed tid=13 entering sub_821CB030 in ours]

## Open questions

- ~~Enumerate every conditional branch PC in `[0x821C4EF4, 0x821C4F90]`~~. **DONE in AUDIT-061**: B1/B2/B3/B4 enumerated; none divergent in decision.
- ~~For each branch: capture cr0/cr6/cr-of-interest...~~. **DONE in AUDIT-061**.
- ~~What input register controls the first divergent branch?~~ **Moot — no branch diverges in this fn.**
- **NEW (AUDIT-062 target):** Where INSIDE sub_821CB030 does ours's tid=13 stall? AUDIT-049 hypothesized the wait at the event handle created at +0x128. Probe sub_821CB030's basic-block entries to find the highest-PC reached by tid=13 before stall; cross-reference with the NtCreateEvent / KeWaitForSingleObject sites.
- Which vtable slot is `sub_821C4EB0` at in vtable `0x820a3e00`? (still open; cross-ref `xrefs` table for `target = 0x821C4EB0` with `kind = 'read'` or `'ref'` in `.data`/`.rdata`).

## Cross-references

- Vtable: `0x820a3e00 = .?AUImpl@GamePart_Title@silph@@` (class)
- Sibling class vtable: `0x820a3dc8 = .?AVGamePart_Title@silph@@` (parent? aggregate?)
- Callees: `sub_821CC3F8` (first-divergent-call AUDIT-061), `sub_821CEDF8` (5× sites at +0x198..+0x240, only reached in canary)
- Callers: `sub_821748F0` (top of tid=13 chain; lr=0x82174A80 seen in both engines AUDIT-061)
- Wedge chain: [sub_821CB030](sub_821CB030.md) is where ours's tid=13 stalls per AUDIT-061's chain probe.
- Audits: 049, 056, 057, 058, 059, 060, **061**
- Artifacts: `audit-runs/audit-056-producer-trace/`, `audit-runs/audit-059-gamma-wedge/`, `audit-runs/audit-061-sub821C4EB0-branch-diff/`