xenia-rs/docs/functions/sub_825070F0.md
MechaCat02 ad45873a1b ITERATE-2.V: scheduler priority aging closes 18-day AUDIT-049 wedge
Priority aging in xenia-cpu/scheduler.rs:pick_runnable
(effective_priority = base + age_bonus(now_round - last_run_round),
capped at +31, AGING_ROUNDS_PER_BONUS=1). Strict-priority was parking
priority=0 threads behind CPU-bound priority=15 audio mixer
(sub_824D1328 guest spinwait at PC=0x824d1404 on CPU5). Aging
eventually picks the starved thread, breaking the producer-consumer
cycle that caused 5-tid wedge at PC=0x824ac578 since AUDIT-049 (10 May).
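
A minimal sketch of the aging rule described above, not the actual `scheduler.rs` code. It assumes that a larger number means a higher priority and that scheduling rounds are counted in a `u64`. The `Thread` struct, its field names, and the tie-break are hypothetical; only `effective_priority`, `age_bonus`, the +31 cap, and `AGING_ROUNDS_PER_BONUS = 1` come from the text.

```rust
use std::cmp::Reverse;

/// Rounds a thread must wait to earn one bonus point (value from the commit text).
const AGING_ROUNDS_PER_BONUS: u64 = 1;
/// Upper bound on the aging bonus (value from the commit text).
const MAX_AGE_BONUS: u64 = 31;

/// Hypothetical runnable-thread record; the real scheduler's type may differ.
#[derive(Debug, Clone, PartialEq)]
pub struct Thread {
    pub tid: u32,
    pub base_priority: i32,
    pub last_run_round: u64,
}

/// Bonus grows with the number of rounds spent waiting, capped at +31.
pub fn age_bonus(rounds_waiting: u64) -> i32 {
    (rounds_waiting / AGING_ROUNDS_PER_BONUS).min(MAX_AGE_BONUS) as i32
}

/// Base priority plus the bonus earned since the thread last ran.
pub fn effective_priority(t: &Thread, now_round: u64) -> i32 {
    t.base_priority + age_bonus(now_round.saturating_sub(t.last_run_round))
}

/// Picks the runnable thread with the highest effective priority.
/// Ties go to the thread that ran longest ago (an assumption of this sketch).
pub fn pick_runnable(threads: &[Thread], now_round: u64) -> Option<&Thread> {
    threads
        .iter()
        .max_by_key(|t| (effective_priority(t, now_round), Reverse(t.last_run_round)))
}
```

Under these assumptions a priority-15 thread that runs every round keeps an effective priority of 16, so a priority-0 thread overtakes it once its own bonus grows past that gap.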

Cascade observed: tid=13 clean exit; events 121K -> 13M (107x); last
host_ns 767ms -> 51,011ms (66x); 8 new threads spawn; VdSwap 1 -> 2.

Complete two-day iterate sequence (2026-05-27 -> 2026-05-28):
- 2.F: VdSwap drain timeout 900ms -> 1ms (xenia-gpu/handle.rs); 876x
       perf win on VdSwap kernel callback
- 2.H: vA0000000 physical heap bucket added (state.rs, exports.rs);
       ctx_ptrs now in 0xA0000000-0xBFFFFFFF range matching canary
- 2.L: Phase-A diff harness categorized [return_value mismatch],
       [status mismatch], [args_resolved.path mismatch] tags
       (tools/diff-events/diff_events.py); closes reading-error #41
       (silent test-harness state leak invalidating trace diffs)
- 2.M: always-on exit-thread-state.json sibling to Phase-A JSONL
       (event_log.rs + xenia-app/main.rs); closes reading-error #42
       (Phase-A blind to blocked-forever waits)
- 2.Q: signal.match kernel instrumentation in NtSetEvent /
       NtReleaseSemaphore / KeSetEvent / KeReleaseSemaphore
       (exports.rs); emits target_handle + waiter_count + waiter_tids
- 2.T: wake.requested kernel instrumentation in wake_eligible_waiters
       (exports.rs); emits target_tid + transition + new_state
- 2.V: scheduler priority aging (xenia-cpu/scheduler.rs) [keystone]

Plus accumulated WIP from earlier May (contention_manifest,
phase_b_snapshot, xam/xaudio enhancements, analysis db, xex loader,
xenia-app main loop, etc.). Audit-runs/ artifacts remain untracked
per project convention.

Tests: 300 xenia-cpu / 227 xenia-kernel / 5 xenia-app / 19 xenia-path
/ 30+ smaller suites -- all PASS, 0 regressions. Determinism preserved
(2x cold runs bit-identical at 13,003,881 events post-2.V).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-29 07:27:26 +02:00


| address | classification | confidence | last_audit | aliases |
|---|---|---|---|---|
| 0x825070F0 | vtable_method | high | 067 | ANON_Class_713383D7 vtable slot 1; AUDIT-057 top missing-thread spawner |

sub_825070F0 — ANON_Class_713383D7 vtable slot 1 (worker spawner)

Synopsis

Slot 1 of class ANON_Class_713383D7 vtable (located at 0x8200A208 and clone at 0x8200A928). When invoked, initializes 4 worker threads with shared context r3=0xBCE25340 (canary). The thread entry points are 0x82506528 / 0x82506558 / 0x82506588 / 0x825065B8. In canary, this fn fires 1× at ~60s wallclock immediately after DiscImageDevice::ResolvePath(\\dat\\movie) (post-intro file open). In ours, it fires 0× at any horizon probed so far.
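
The four entry points listed above are evenly spaced, which a small check makes explicit. The sketch below only does address arithmetic on the values from this dossier; the function and constant names are illustrative. It does not show that the entries are stubs forwarding to a common worker, which would need disassembly.

```rust
/// Worker entry points as listed in the synopsis.
pub const WORKER_ENTRIES: [u32; 4] = [0x8250_6528, 0x8250_6558, 0x8250_6588, 0x8250_65B8];

/// Size of one PowerPC instruction in bytes (fixed-width ISA).
const PPC_INSN_BYTES: u32 = 4;

/// Returns the common distance between consecutive entries,
/// or `None` if there are fewer than two entries or the spacing is uneven.
pub fn uniform_stride(entries: &[u32]) -> Option<u32> {
    let mut strides = entries.windows(2).map(|w| w[1].wrapping_sub(w[0]));
    let first = strides.next()?;
    strides.all(|s| s == first).then_some(first)
}

/// Number of instructions that fit in one stride.
pub fn insns_per_stride(stride: u32) -> u32 {
    stride / PPC_INSN_BYTES
}
```

For these entries the stride is 0x30 bytes, which leaves room for 12 instructions per entry.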

Evidence

  • AUDIT-058 Linux Debug canary: fires 1× at ~60s wallclock with pc=0x825070F0 lr=0x824F7B24 r3=BCE25340 r4=701CF3C0 r5=BCE25AC0.
  • AUDIT-060 Probe C-Win Windows Debug canary: same probe (--log_lr_on_pc=0x825070F0, 90s) → 1 fire, lr=0x824F7B24, bit-identical to Linux Debug, validating the new Wine canary oracle.
  • LR 0x824F7B24 resolves to sub_824F7800+0x324, the return address immediately after the vtable bctrl dispatch at 0x824F7B20 (sub_824F7800+0x320).
  • Class ANON_Class_713383D7 lives at vtables 0x8200A208 (and clone 0x8200A928); both are 7-method tables. Slot 1 is this fn. Zero recorded vptr_writes in DB — the ctor that writes this vtable is in an unreachability island OR is a computed-store-only ctor.
  • AUDIT-067 (2026-05-12) strengthens this: zero vptr_writes, zero xrefs, zero u32-byte occurrences of 0x8200A208/0x8200A928 in the .pe file, zero addis+addi/ori pairs materializing the value. Runtime mem-watch of all 16 guest store opcodes (stw/std/stwx/stwbrx/stwcx./stmw/stvx/stvewx/etc.) for 211 s wallclock in canary produces 0 hits for these values — though sub_825070F0 itself fires 1× at ~25 s wallclock with *r3 = 0x8200A208 implicit at the bctrl. The install is host-side, not guest-side.
  • AUDIT-057 named this as the top missing-thread spawner: 4 missing thread spawns in ours.
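
To make the static search in the AUDIT-067 bullet concrete, the sketch below shows which immediates a scanner would have to look for if guest code materialized 0x8200A208 with a two-instruction sequence. It relies only on standard PowerPC semantics (`addi` sign-extends its 16-bit immediate, `ori` zero-extends it); the function names are illustrative and this is not the project's scanner.

```rust
/// Halves for `lis rD, hi` followed by `ori rD, rD, lo`.
/// `ori` zero-extends its immediate, so the halves are the plain upper and lower 16 bits.
pub fn split_for_ori(value: u32) -> (u16, u16) {
    ((value >> 16) as u16, value as u16)
}

/// Halves for `lis rD, hi` followed by `addi rD, rD, lo`.
/// `addi` sign-extends its immediate, so when the low half is negative
/// the high half must be one larger to compensate.
pub fn split_for_addi(value: u32) -> (u16, i16) {
    let lo = value as u16 as i16;
    let hi = (value.wrapping_sub(lo as i32 as u32) >> 16) as u16;
    (hi, lo)
}

/// Recombines a `lis`/`addi` pair the way the CPU would (low 32 bits only).
pub fn join_addi(hi: u16, lo: i16) -> u32 {
    ((hi as u32) << 16).wrapping_add(lo as i32 as u32)
}

/// Big-endian byte pattern of the value, as it would appear as data in the image.
pub fn be_bytes(value: u32) -> [u8; 4] {
    value.to_be_bytes()
}
```

For 0x8200A208 the `ori` form uses the halves 0x8200 and 0xA208, while the `addi` form uses 0x8201 and a negative low half, so a scanner that only matches the upper half 0x8200 would miss the `addi` form.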

Activation

Vtable dispatch from the bctrl at sub_824F7800+0x320 (PC 0x824F7B20; slot 1 of vtable 0x8200A208). AUDIT-064 fully classified the ladder: sub_824F7800, sub_824F7CD0, sub_824F8398, sub_821B55D8 are ALL normal_callee (NOT EH thunks). Only sub_821B6DF4 is the EH catch-handler; it is a secondary entry path, not the runtime activation route.

Full runtime activation chain (in canary; identified by AUDIT-064 via lr-resolution at each fire): tid=1 entry_point → sub_8216EA68 → sub_822F1AA8 (post-init dispatcher) → bctrl vtable[0] of *(0x828E1F08) → sub_82175330 (2-insn thunk) → tail-jump → sub_82173990 → … → sub_821741C8 → sub_82172BA0 (array-walk dispatcher) → bctrl vtable[6] → sub_821B55D8 → sub_824F8398 → sub_824F7CD0 → sub_824F7800 → bctrl vtable[1] → sub_825070F0.

Wedge in ours (AUDIT-064): tid=1 successfully enters sub_822F1AA8, reaches the bctrl at 0x822F1B4C, dispatches to sub_82175330 → sub_82173990 → blocks at sub_82173990+0x2D0 on KeWaitForSingleObject INFINITE on handle 0x12A4 = tid=13's thread handle. Tid=13 itself is blocked on the AUDIT-049 wedge (event 0x12AC inside the audit-009 cluster). The 5-fn ladder downstream of sub_82172BA0 is NEVER reached because tid=1 hasn't returned from the thread-join wait.
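
The wedge above is a two-link wait chain: tid=1 waits for tid=13 to exit, and tid=13 waits on an event that is never signalled. The sketch below models that as a wait-for map and walks it to the root blocker. `WaitTarget` and `root_blocker` are hypothetical names for illustration, not types from the emulator.

```rust
use std::collections::{HashMap, HashSet};

/// What a blocked thread is waiting on (hypothetical model).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum WaitTarget {
    /// Waiting for another thread to exit (a wait on its thread handle).
    ThreadExit(u32),
    /// Waiting on an event or semaphore handle.
    Handle(u32),
}

/// Follows thread-join edges from `start`.
/// Returns the chain of tids visited and, if the chain ends on a
/// non-thread handle, that handle.
pub fn root_blocker(waits: &HashMap<u32, WaitTarget>, start: u32) -> (Vec<u32>, Option<u32>) {
    let mut chain = vec![start];
    let mut seen = HashSet::from([start]);
    let mut cur = start;
    loop {
        match waits.get(&cur) {
            Some(WaitTarget::ThreadExit(next)) => {
                // Stop if the joins form a cycle instead of looping forever.
                if !seen.insert(*next) {
                    return (chain, None);
                }
                chain.push(*next);
                cur = *next;
            }
            Some(WaitTarget::Handle(h)) => return (chain, Some(*h)),
            // The current thread is not blocked at all.
            None => return (chain, None),
        }
    }
}
```

Fed with the two waits described above, the walk starting at tid=1 ends at handle 0x12AC, which is why resolving the tid=13 wait is the gate for everything downstream.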

Static graph

  • Static callers (direct bl): 0 — it's reached only via bctrl. The bctrl site is 0x824F7B20.
  • Callees: spawns 4 worker threads via ExCreateThread (or equivalent) with entries 0x82506528/58/88/B8.
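
The dispatch site can be derived from the logged link register, since a branch-and-link instruction leaves the address of the following instruction in LR. A minimal sketch, assuming fixed 4-byte instructions and that a `sub_XXXXXXXX` function starts at the address in its name; the helper names are illustrative.

```rust
/// Distance from a branch-and-link instruction to the return address it leaves in LR.
const LINK_DELTA: u32 = 4;

/// Address of the branch-and-link instruction (bl, bctrl, ...) that produced `lr`.
pub fn call_site_from_lr(lr: u32) -> u32 {
    lr.wrapping_sub(LINK_DELTA)
}

/// Offset of `pc` inside the function starting at `fn_start`,
/// or `None` if `pc` lies before the function start.
pub fn offset_in_fn(fn_start: u32, pc: u32) -> Option<u32> {
    pc.checked_sub(fn_start)
}
```

With the logged lr=0x824F7B24, `call_site_from_lr` gives 0x824F7B20, the bctrl site listed above.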

Audit log

  • Phase Non-match Investigation (2026-05-19): Phase A thread.create events directly corroborate the AUDIT-058 framing using runtime evidence (previously only static + ctor-probe). Canary cold trace canary-jitter-1.jsonl (4.4 GB, 18.7M events) contains EXACTLY 4 thread.create events at host time 10.382912900 / 10.383282200 / 10.383647200 / 10.384161700 s (gaps of 369,300 / 365,000 / 514,500 ns, on tid=6 = guest main) with entries 0x82506528 / 0x82506558 / 0x82506588 / 0x825065B8, shared ctx_ptr=0xBCE251C0, stack=65,536, suspended=true, affinity=0. These match the dossier's listed worker entries 1:1 and are bit-identical-in-structure to the AUDIT-058 fire (modulo ctx_ptr arena drift: AUDIT-058 cited 0xBCE25340, this jitter sample has 0xBCE251C0; both lie inside the 0xBCE25xxx arena allocated by the same fn). FIFO-matched child tids: 0x82506528 → tid=28 (3.26M events, file IO + heavy RtlEnterCS), 0x82506558 → tid=27 (36k events), 0x82506588 → tid=29 (91k events), 0x825065B8 → never started in the 90s window. Same canary-vs-ours digest comparison shows ours-postfix.jsonl has 0 occurrences of 0xBCE251C0 and 0 thread.create events after spawn #10 (1.727 s). The full set of static-analysis-invisible properties (0 vptr_writes, 0 xrefs, 0 indirect_dispatch_candidates targeting vtables 0x8200A208 / 0x8200A928) was re-verified against current sylpheed.db; AUDIT-067's conclusion stands. New artifacts at audit-runs/phase-nonmatch-investigation/. Recommended next probe: the AUDIT-068 host-side mem-watch was deferred; re-attempt it now with Phase A event correlation (the 10.382 s spawn burst is the precise wall-clock window to hook). [confirmed runtime; framing intact]
  • AUDIT-067 (2026-05-12): runtime mem-watch via new canary cvar audit_67_value_watch (~422 LOC additive instrumentation; default-empty / zero-cost; kept in canary tree per policy). Hooked all 16 store opcodes: stw, stwu, stwx, stwux, stwbrx, stwcx., stmw, std, stdu, stdux, stdx, stdbrx, stdcx., stvx/stvxl/128, stvewx/128. Each hook emits CompareEQ(val32, watch) → TrapTrue(_,250+idx); trap handler logs pc/lr/val/dst/regs/tid. Sanity test with watch=0x00000000 → 103,321 hits in 30s (instrumentation verified). Main run watch=0x8200A208,0x8200A928 for 211 s wallclock: 0 hits despite AUDIT-061-BR pc=0x825070F0 firing 1× with r3=0xBCE25340 r4=0x701CF3C0 r5=0xBCE25AC0 (bit-identical to AUDIT-058/060). CONCLUSION: the vtable address 0x8200A208 is never stored to guest memory via any guest PowerPC store opcode in canary; the install is host-side (most likely a kernel-import direct memory write via xe::store_and_swap<uint32_t>(memory + addr, val), OR an XEX loader operation, OR a RtlCopyMemory-style host helper). Path A (static binary search) also yielded 0 matches: no vptr_writes, no xrefs, no addis+addi/ori pair (with or without mr-chain register propagation) materializing the value, no u32 occurrence anywhere in the .pe file. Reading-error #19: the assumption that meaningful guest-memory writes go through guest PPC code is false, because kernel imports and the image loader perform direct host writes that bypass the JIT. AUDIT-068 must hook at the Memory::write* / store_and_swap<*> level instead. [confirmed: negative result; structural finding]
  • AUDIT-064 (2026-05-12): classified all 4 unclassified ladder fns: sub_824F7800, sub_824F7CD0, sub_824F8398, sub_821B55D8. All 4 are normal_callee, NOT EH thunks (refutes the worst-case hypothesis from AUDIT-060 that the whole chain might be EH metadata). Probed canary at 60s/120s/180s: all 4 fire 1× each, bit-identical context. Walked upward: real runtime caller of sub_821B55D8 is sub_82172BA0+0x1E8 bctrl (PC 0x82172D88), NOT the static-DB-listed sub_821B6DF4 EH branch. Identified the full upstream activation chain: tid=1 entry_point → sub_8216EA68 → sub_822F1AA8 → vtable[0] of *(0x828E1F08) = sub_82175330 (2-insn thunk) → tail-jump to sub_82173990 → … (canary continues through sub_821741C8 → sub_82172BA0 → vtable[6]=sub_821B55D8 → sub_824F8398 → sub_824F7CD0 → sub_824F7800 → bctrl → sub_825070F0). First divergence in ours: tid=1 enters sub_82173990 via the vtable[0] dispatch but blocks at sub_82173990+0x2D0 bl 0x824AA330 (KeWaitForSingleObject INFINITE) on handle 0x12A4 = tid=13's thread handle. This is the same AUDIT-049 wedge: tid=13 itself is blocked on handle 0x12AC waiting for the audit-009-cluster signal. Activation of sub_825070F0 is gated on resolving the tid=13 wait, NOT on any divergence in the ladder fns themselves. [confirmed]
  • AUDIT-060 (2026-05-12) — verified canary fire reproduces under Windows Debug oracle. Caller chain caveat added: sub_821B6DF4 ladder-top is EH, not normal call edge. Other ladder fns need individual classification. [confirmed for canary fire; caveat on the upstream chain]
  • AUDIT-058 (2026-05-10) — captured canary fire context, walked static caller ladder, found all 6 ladder fns fire 0× in ours. Concluded "activation phase doesn't activate in ours". [STATUS: ladder framing partially falsified by AUDIT-060 — at least sub_821B6DF4 is EH; the real gate is the AUDIT-056 sub_821C4EB0 throughput gap, upstream.]
  • AUDIT-057 (2026-05-10) — flagged as top missing-thread spawner (4 of 13 missing thread spawns). [confirmed quantitatively]
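
The AUDIT-067 hook logic (compare the stored 32-bit value against each watch, trap with code 250+idx on a match) can be modelled in a few lines. This is a sketch of the idea only: the canary instrumentation is C++ emitted into the JIT, and `StoreEvent`, `ValueWatch`, and `on_store` are hypothetical names introduced here.

```rust
/// Trap codes start here; watch index `idx` raises `TRAP_BASE + idx`
/// (the "250+idx" from the AUDIT-067 entry).
const TRAP_BASE: u16 = 250;

/// Context a trap handler would log for one guest store (hypothetical layout).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct StoreEvent {
    pub pc: u32,
    pub lr: u32,
    pub value: u32,
    pub dst: u32,
    pub tid: u32,
}

/// Watches guest stores for a small set of 32-bit values.
#[derive(Debug, Default)]
pub struct ValueWatch {
    watches: Vec<u32>,
    /// (trap code, store context) for every matching store seen so far.
    pub hits: Vec<(u16, StoreEvent)>,
}

impl ValueWatch {
    pub fn new(watches: Vec<u32>) -> Self {
        Self { watches, hits: Vec::new() }
    }

    /// Called for every hooked store. Returns the trap code on a match.
    /// With an empty watch list the scan does no comparisons.
    pub fn on_store(&mut self, ev: StoreEvent) -> Option<u16> {
        let idx = self.watches.iter().position(|&w| w == ev.value)?;
        let code = TRAP_BASE + idx as u16;
        self.hits.push((code, ev));
        Some(code)
    }
}
```

A watch like this only sees stores that pass through the hooked guest opcodes, which is the limitation Reading-error #19 describes.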

Open questions

  • What spawns the 4 worker threads exactly? Disassemble body. The threads have entries 0x82506528/58/88/B8 — are these consecutive 0x30-byte stubs that all forward to a common worker fn?
  • What class instance triggers the slot-1 dispatch? Is it a silph::GamePart_Title instance? The wallclock context (post-\\dat\\movie ResolvePath) suggests so.
  • (AUDIT-067 result) What host-side mechanism installs 0x8200A208 at 0xBCE25340+0? Candidates: xboxkrnl_rtl* direct-write helpers (RtlCopyMemory/RtlFillMemory/RtlInitializeCriticalSection etc.), XEX loader image-rewrites, or kernel-import factory helpers. Next probe: AUDIT-068 host-side mem-watch — hook Memory::write* and/or xe::store_and_swap<*> in canary.
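
For the proposed host-side probe, the idea is to move the value check from the guest store opcodes to the host routine that writes guest memory. The sketch below uses a hypothetical flat `GuestMemory` model with a byte-swapping 32-bit store. It does not reproduce canary's `Memory::write*` or `xe::store_and_swap` signatures; every name in it is an assumption for illustration.

```rust
/// Hypothetical flat guest-memory model; the real emulator's memory type differs.
pub struct GuestMemory {
    base: u32,
    bytes: Vec<u8>,
    watch: Vec<u32>,
    /// (guest address, value, caller tag) for each watched value written host-side.
    pub hits: Vec<(u32, u32, &'static str)>,
}

impl GuestMemory {
    pub fn new(base: u32, len: usize, watch: Vec<u32>) -> Self {
        Self { base, bytes: vec![0; len], watch, hits: Vec::new() }
    }

    /// Host-side 32-bit store: writes `value` in the guest's big-endian layout
    /// and records a hit if the value is watched. Returns false if out of range.
    pub fn store_u32_be(&mut self, addr: u32, value: u32, caller: &'static str) -> bool {
        let off = match addr.checked_sub(self.base) {
            Some(o) => o as usize,
            None => return false,
        };
        match self.bytes.get_mut(off..off + 4) {
            Some(slot) => slot.copy_from_slice(&value.to_be_bytes()),
            None => return false,
        }
        if self.watch.contains(&value) {
            self.hits.push((addr, value, caller));
        }
        true
    }

    /// Reads back a big-endian 32-bit value, or `None` if out of range.
    pub fn load_u32_be(&self, addr: u32) -> Option<u32> {
        let off = addr.checked_sub(self.base)? as usize;
        let b = self.bytes.get(off..off + 4)?;
        Some(u32::from_be_bytes([b[0], b[1], b[2], b[3]]))
    }
}
```

A check on the stored value only catches word-sized writes. A byte-wise copy helper would need the destination range inspected after the copy instead, which is one reason the candidate list above matters for choosing hook points.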

Cross-references

  • Vtable: 0x8200A208 (primary), 0x8200A928 (clone), class ANON_Class_713383D7, slot 1.
  • Dispatch site: sub_824F7800+0x320 bctrl (PC 0x824F7B20); post-bctrl PC 0x824F7B24.
  • Worker thread entries spawned: 0x82506528, 0x82506558, 0x82506588, 0x825065B8.
  • Real runtime activation chain (AUDIT-064): tid=1 entry_point → sub_8216EA68 → [sub_822F1AA8](sub_822F1AA8.md) → bctrl vtable[0]={sub_82175330 tail-jump→sub_82173990} → … → sub_821741C8 → [sub_82172BA0](sub_82172BA0.md) → bctrl vtable[6] → [sub_821B55D8](sub_821B55D8.md) → [sub_824F8398](sub_824F8398.md) → [sub_824F7CD0](sub_824F7CD0.md) → [sub_824F7800](sub_824F7800.md) → bctrl vtable[1] → sub_825070F0.
  • Wedge in ours: tid=1 blocks at sub_82173990+0x2D0 on KeWaitForSingleObject(handle=0x12A4 = tid=13's thread handle); tid=13 itself blocks on event 0x12AC, created at sub_821CB030+0x128 (the AUDIT-049 wedge).
  • Old static-DB ladder (AUDIT-058, partly EH): sub_824F7800 ← sub_824F7CD0 ← sub_824F8398 ← sub_821B55D8 ← [sub_821B6DF4](sub_821B6DF4.md) (EH catch-handler — secondary EH-only entry path).
  • Audits: 057, 058, 060, 064, 067.
  • Artifacts: audit-runs/audit-058-sub825070F0-activation/, audit-runs/audit-060-fnptr-array-bootstrap/canary-sanity-825070F0.log, audit-runs/audit-064-activation-ladder/, audit-runs/audit-067-vptr-install-mem-watch/.