xenia-rs/docs/functions/sub_825070F0.md
MechaCat02 ad45873a1b ITERATE-2.V: scheduler priority aging closes 18-day AUDIT-049 wedge
Priority aging in xenia-cpu/scheduler.rs:pick_runnable
(effective_priority = base + age_bonus(now_round - last_run_round),
capped at +31, AGING_ROUNDS_PER_BONUS=1). Strict-priority was parking
priority=0 threads behind CPU-bound priority=15 audio mixer
(sub_824D1328 guest spinwait at PC=0x824d1404 on CPU5). Aging
eventually picks the starved thread, breaking the producer-consumer
cycle that caused 5-tid wedge at PC=0x824ac578 since AUDIT-049 (10 May).
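
A minimal sketch of the aging rule described above, not the actual `pick_runnable` code. The `Thread` struct, its field names, and the tie-breaking behaviour are assumptions; only the formula, the +31 cap and `AGING_ROUNDS_PER_BONUS=1` come from this message. Higher numbers are assumed to mean higher priority.

```rust
const AGING_ROUNDS_PER_BONUS: u64 = 1;
const MAX_AGE_BONUS: u32 = 31;

// Hypothetical thread record; the real scheduler state is not shown here.
struct Thread {
    base_priority: u32,
    last_run_round: u64,
    runnable: bool,
}

/// One bonus point per AGING_ROUNDS_PER_BONUS rounds waited, capped at +31.
fn age_bonus(waited_rounds: u64) -> u32 {
    (waited_rounds / AGING_ROUNDS_PER_BONUS).min(MAX_AGE_BONUS as u64) as u32
}

fn effective_priority(t: &Thread, now_round: u64) -> u32 {
    t.base_priority + age_bonus(now_round.saturating_sub(t.last_run_round))
}

/// Index of the runnable thread with the highest effective priority.
/// Ties go to the later index (a property of `max_by_key`, assumed here).
fn pick_runnable(threads: &[Thread], now_round: u64) -> Option<usize> {
    threads
        .iter()
        .enumerate()
        .filter(|(_, t)| t.runnable)
        .max_by_key(|(_, t)| effective_priority(t, now_round))
        .map(|(i, _)| i)
}
```

Under these assumptions, a priority=15 thread that ran in the previous round scores 16, so a priority=0 thread that has never run overtakes it once it has waited 17 rounds.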

Cascade observed: tid=13 clean exit; events 121K -> 13M (107x); last
host_ns 767ms -> 51,011ms (66x); 8 new threads spawn; VdSwap 1 -> 2.

Complete two-day iterate sequence (2026-05-27 -> 2026-05-28):
- 2.F: VdSwap drain timeout 900ms -> 1ms (xenia-gpu/handle.rs); 876x
       perf win on VdSwap kernel callback
- 2.H: vA0000000 physical heap bucket added (state.rs, exports.rs);
       ctx_ptrs now in 0xA0000000-0xBFFFFFFF range matching canary
- 2.L: Phase-A diff harness categorized [return_value mismatch],
       [status mismatch], [args_resolved.path mismatch] tags
       (tools/diff-events/diff_events.py); closes reading-error #41
       (silent test-harness state leak invalidating trace diffs)
- 2.M: always-on exit-thread-state.json sibling to Phase-A JSONL
       (event_log.rs + xenia-app/main.rs); closes reading-error #42
       (Phase-A blind to blocked-forever waits)
- 2.Q: signal.match kernel instrumentation in NtSetEvent /
       NtReleaseSemaphore / KeSetEvent / KeReleaseSemaphore
       (exports.rs); emits target_handle + waiter_count + waiter_tids
- 2.T: wake.requested kernel instrumentation in wake_eligible_waiters
       (exports.rs); emits target_tid + transition + new_state
- 2.V: scheduler priority aging (xenia-cpu/scheduler.rs) [keystone]

Plus accumulated WIP from earlier May (contention_manifest,
phase_b_snapshot, xam/xaudio enhancements, analysis db, xex loader,
xenia-app main loop, etc.). Audit-runs/ artifacts remain untracked
per project convention.

Tests: 300 xenia-cpu / 227 xenia-kernel / 5 xenia-app / 19 xenia-path
/ 30+ smaller suites -- all PASS, 0 regressions. Determinism preserved
(2x cold runs bit-identical at 13,003,881 events post-2.V).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-29 07:27:26 +02:00

---
address: 0x825070F0
classification: vtable_method
confidence: high
last_audit: 067
aliases:
- "ANON_Class_713383D7 vtable slot 1"
- "AUDIT-057 top missing-thread spawner"
---
# sub_825070F0 — ANON_Class_713383D7 vtable slot 1 (worker spawner)
## Synopsis
Slot 1 of the `ANON_Class_713383D7` vtable (primary at `0x8200A208`, clone at `0x8200A928`). When invoked, it spawns 4 worker threads that share one context pointer (`r3=0xBCE25340` in the AUDIT-058 canary run). The thread entry points are `0x82506528 / 0x82506558 / 0x82506588 / 0x825065B8`. In canary, this fn fires 1× at ~60s wallclock immediately after `DiscImageDevice::ResolvePath(\\dat\\movie)` (post-intro file open). In ours, it fires 0× at any horizon probed so far.
## Evidence
- AUDIT-058 Linux Debug canary: fires 1× at ~60s wallclock with `pc=0x825070F0 lr=0x824F7B24 r3=BCE25340 r4=701CF3C0 r5=BCE25AC0`.
- AUDIT-060 Probe C-Win Windows Debug canary: same probe (`--log_lr_on_pc=0x825070F0`, 90s) → 1 fire, `lr=0x824F7B24`, **bit-identical to Linux Debug**, validating the new Wine canary oracle.
- LR `0x824F7B24` resolves to `sub_824F7800+0x324`, the return address of the vtable `bctrl` dispatch at `0x824F7B20` (`sub_824F7800+0x320`).
- Class `ANON_Class_713383D7` lives at vtables `0x8200A208` (and clone `0x8200A928`); both are 7-method tables. Slot 1 is this fn. **Zero recorded vptr_writes in DB** — the ctor that writes this vtable is in an unreachability island OR is a computed-store-only ctor.
- **AUDIT-067 (2026-05-12)** strengthens this: **zero `vptr_writes`, zero `xrefs`, zero u32-byte occurrences of `0x8200A208`/`0x8200A928` in the `.pe` file, zero `addis+addi/ori` pairs materializing the value**. Runtime mem-watch of all 16 guest store opcodes (`stw`/`std`/`stwx`/`stwbrx`/`stwcx.`/`stmw`/`stvx`/`stvewx`/etc.) for 211 s wallclock in canary produces **0 hits** for these values — though `sub_825070F0` itself fires 1× at ~25 s wallclock with `*r3 = 0x8200A208` implicit at the bctrl. The install is **host-side**, not guest-side.
- AUDIT-057 named this as the top missing-thread spawner: 4 missing thread spawns in ours.
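
The LR-to-function resolution used in the evidence above is plain address arithmetic: `0x824F7B24 - 0x824F7800 = 0x324`. A minimal sketch follows, assuming a sorted table of function start addresses. The table and the helper names are hypothetical, and a real function database may contain other function starts between those listed.

```rust
/// Resolve a guest PC to (function start, byte offset) by finding the last
/// function start that is <= pc. `starts` must be sorted ascending.
fn resolve_pc(starts: &[u32], pc: u32) -> Option<(u32, u32)> {
    // partition_point returns how many starts satisfy `s <= pc`.
    let idx = starts.partition_point(|&s| s <= pc);
    let start = *starts.get(idx.checked_sub(1)?)?;
    Some((start, pc - start))
}

/// Format a PC in the `sub_XXXXXXXX+0xNN` style used in this dossier.
fn format_site(starts: &[u32], pc: u32) -> String {
    match resolve_pc(starts, pc) {
        Some((start, off)) => format!("sub_{:08X}+0x{:X}", start, off),
        None => format!("0x{:08X} (unresolved)", pc),
    }
}
```

With a table holding only `0x824F7800`, `0x824F7CD0` and `0x825070F0`, the LR `0x824F7B24` formats as `sub_824F7800+0x324` and the `bctrl` PC `0x824F7B20` as `sub_824F7800+0x320`.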
## Activation
Vtable dispatch from the `bctrl` at `sub_824F7800+0x320` (PC `0x824F7B20`; slot 1 of vtable `0x8200A208`). **AUDIT-064 fully classified the ladder**: `sub_824F7800`, `sub_824F7CD0`, `sub_824F8398`, `sub_821B55D8` are ALL `normal_callee` (NOT EH thunks). Only `sub_821B6DF4` is the EH catch-handler; it's a secondary entry path, not the runtime activation route.
**Full runtime activation chain (in canary; identified by AUDIT-064 via lr-resolution at each fire)**: tid=1 `entry_point → sub_8216EA68 → sub_822F1AA8` (post-init dispatcher) → `bctrl vtable[0] of *(0x828E1F08)` → `sub_82175330` (2-insn thunk) → tail-jump → `sub_82173990` → … → `sub_821741C8` → `sub_82172BA0` (array-walk dispatcher) → `bctrl vtable[6]` → `sub_821B55D8` → `sub_824F8398` → `sub_824F7CD0` → `sub_824F7800` → `bctrl vtable[1]` → `sub_825070F0`.
**Wedge in ours (AUDIT-064)**: tid=1 successfully enters `sub_822F1AA8`, reaches the bctrl at `0x822F1B4C`, dispatches to `sub_82175330` → `sub_82173990`, then blocks at `sub_82173990+0x2D0` on `KeWaitForSingleObject` INFINITE on handle `0x12A4` = tid=13's thread handle. Tid=13 itself is blocked on the AUDIT-049 wedge (event 0x12AC inside the audit-009 cluster). The 5-fn ladder downstream of `sub_82172BA0` is NEVER reached because tid=1 hasn't returned from the thread-join wait.
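
The wedge is a two-link wait chain: tid=1 joins on tid=13, and tid=13 waits on an event. A minimal sketch of walking such a chain to its root blocker is shown below. The `WaitTarget` type and the map layout are hypothetical, and thread handles are assumed to be already resolved to tids (here `0x12A4` to tid=13).

```rust
use std::collections::{HashMap, HashSet};

// Hypothetical description of what a blocked thread is waiting on.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum WaitTarget {
    Thread(u32), // join on another thread, identified by tid
    Event(u32),  // wait on an event handle
}

/// Follow thread-join waits from `start` until reaching a thread that is not
/// waiting on another thread. Returns that tid and what it waits on, if anything.
fn root_blocker(waits: &HashMap<u32, WaitTarget>, start: u32) -> (u32, Option<WaitTarget>) {
    let mut seen = HashSet::new();
    let mut tid = start;
    loop {
        // A revisited tid means a join cycle; stop instead of looping forever.
        if !seen.insert(tid) {
            return (tid, waits.get(&tid).copied());
        }
        match waits.get(&tid) {
            Some(WaitTarget::Thread(next)) => tid = *next,
            other => return (tid, other.copied()),
        }
    }
}
```

Fed with `1 → Thread(13)` and `13 → Event(0x12AC)`, the walk ends at tid=13 blocked on event `0x12AC`, which is why resolving that event wait, rather than anything in the ladder fns, gates the activation.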
## Static graph
- Static callers (direct `bl`): 0 — it's reached only via `bctrl`. The `bctrl` site is `0x824F7B20`.
- Callees: spawns 4 worker threads via `ExCreateThread` (or equivalent) with entries `0x82506528/58/88/B8`.
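
The shorthand `0x82506528/58/88/B8` above replaces only the low hex digits of the first address. The sketch below expands it and checks whether the entries are evenly spaced. Both helper names are hypothetical and exist only for illustration.

```rust
/// Expand shorthand such as "0x82506528/58/88/B8" into full addresses:
/// each later part overwrites the low hex digits of the first address.
fn expand_shorthand(s: &str) -> Option<Vec<u32>> {
    let mut parts = s.split('/');
    let first = parts.next()?.trim();
    let base = u32::from_str_radix(first.trim_start_matches("0x"), 16).ok()?;
    let mut out = vec![base];
    for p in parts {
        let p = p.trim();
        let bits = 4 * p.len() as u32; // number of low bits being replaced
        if bits == 0 || bits >= 32 {
            return None;
        }
        let low = u32::from_str_radix(p, 16).ok()?;
        let mask = (1u32 << bits) - 1;
        out.push((base & !mask) | low);
    }
    Some(out)
}

/// Return the common difference if consecutive addresses are evenly spaced.
fn uniform_stride(addrs: &[u32]) -> Option<u32> {
    let mut diffs = addrs.windows(2).map(|w| w[1].wrapping_sub(w[0]));
    let first = diffs.next()?;
    diffs.all(|d| d == first).then_some(first)
}
```

For the four worker entries the stride is `0x30` bytes, which is 12 fixed-width 4-byte PowerPC instructions per entry. That spacing alone does not show what the entries do; the bodies still need disassembly.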
## Audit log
- **Phase Non-match Investigation (2026-05-19)**: Phase A `thread.create` events directly corroborate the AUDIT-058 framing using **runtime** evidence (previously only static + ctor-probe). Canary cold trace `canary-jitter-1.jsonl` (4.4 GB, 18.7M events) contains EXACTLY 4 `thread.create` events at `host_ns = 10.382912900 / 10.383282200 / 10.383647200 / 10.384161700` (shown in seconds; consecutive gaps of 369.3 / 365.0 / 514.5 µs, on tid=6 = guest main) with entries `0x82506528 / 0x82506558 / 0x82506588 / 0x825065B8`, shared `ctx_ptr=0xBCE251C0`, stack=65,536, `suspended=true`, affinity=0. These match the dossier's listed worker entries 1:1 and are bit-identical-in-structure to the AUDIT-058 fire (modulo `ctx_ptr` arena drift: AUDIT-058 cited `0xBCE25340`, this jitter sample has `0xBCE251C0`; both lie inside the `0xBCE25xxx` arena allocated by the same fn). FIFO-matched child tids: `0x82506528 → tid=28` (3.26M events, file IO + heavy RtlEnterCS), `0x82506558 → tid=27` (36k events), `0x82506588 → tid=29` (91k events), `0x825065B8 → never started` in the 90s window. The same canary-vs-ours digest comparison shows `ours-postfix.jsonl` has **0 occurrences** of `0xBCE251C0` and **0 thread.create events** after spawn #10 (1.727 s). The full set of static-analysis-invisible properties (0 vptr_writes, 0 xrefs, 0 indirect_dispatch_candidates targeting vtables `0x8200A208` / `0x8200A928`) was re-verified against current sylpheed.db, so AUDIT-067's conclusion stands. New artifacts at `audit-runs/phase-nonmatch-investigation/`. **Recommended next probe**: AUDIT-068 host-side mem-watch was deferred; re-attempt now with Phase A event correlation (the 10.382 s spawn burst is the precise wall-clock window to hook). [confirmed runtime; framing intact]
- **AUDIT-067 (2026-05-12)**: runtime mem-watch via new canary cvar `audit_67_value_watch` (~422 LOC additive instrumentation; default-empty / zero-cost; kept in canary tree per policy). Hooked **all 16 store opcodes**: `stw`, `stwu`, `stwx`, `stwux`, `stwbrx`, `stwcx.`, `stmw`, `std`, `stdu`, `stdux`, `stdx`, `stdbrx`, `stdcx.`, `stvx`/`stvxl`/`128`, `stvewx`/`128`. Each hook emits `CompareEQ(val32, watch) → TrapTrue(_,250+idx)`; trap handler logs `pc/lr/val/dst/regs/tid`. Sanity test with `watch=0x00000000` → **103,321 hits in 30s** (instrumentation verified). Main run `watch=0x8200A208,0x8200A928` for **211 s wallclock**: **0 hits** despite `AUDIT-061-BR pc=0x825070F0` firing 1× with `r3=0xBCE25340 r4=0x701CF3C0 r5=0xBCE25AC0` (bit-identical to AUDIT-058/060). **CONCLUSION**: the vtable address `0x8200A208` is never stored to guest memory via any guest PowerPC store opcode in canary, so the install is **host-side** (most likely a kernel-import direct memory write via `xe::store_and_swap<uint32_t>(memory + addr, val)`, OR an XEX loader operation, OR a `RtlCopyMemory`-style host helper). Path A (static binary search) also yielded 0 matches: no `vptr_writes`, no `xrefs`, no `addis`+`addi/ori` pair (with or without mr-chain register propagation) materializing the value, no u32 occurrence anywhere in the `.pe` file. **Reading-error #19**: the assumption that meaningful guest-memory writes go through guest PPC code is false; kernel imports and the image loader perform direct host writes that bypass the JIT. AUDIT-068 must hook at the `Memory::write*` / `store_and_swap<*>` level instead. [confirmed: negative result; structural finding]
- **AUDIT-064 (2026-05-12)**: classified all 4 previously unclassified ladder fns (`sub_824F7800`, `sub_824F7CD0`, `sub_824F8398`, `sub_821B55D8`): **all 4 are normal_callee**, NOT EH thunks (refutes the worst-case hypothesis from AUDIT-060 that the whole chain might be EH metadata). Probed canary at 60s/120s/180s: all 4 fire 1× each, bit-identical context. Walked upward: real runtime caller of `sub_821B55D8` is `sub_82172BA0+0x1E8 bctrl` (PC `0x82172D88`), NOT the static-DB-listed `sub_821B6DF4` EH branch. **Identified the full upstream activation chain**: tid=1 entry_point → `sub_8216EA68` → [`sub_822F1AA8`](sub_822F1AA8.md) → vtable[0] of `*(0x828E1F08)` = `sub_82175330` (2-insn thunk) → tail-jump to `sub_82173990` → … (canary continues through `sub_821741C8` → [`sub_82172BA0`](sub_82172BA0.md) → vtable[6]=[`sub_821B55D8`](sub_821B55D8.md) → [`sub_824F8398`](sub_824F8398.md) → [`sub_824F7CD0`](sub_824F7CD0.md) → [`sub_824F7800`](sub_824F7800.md) → bctrl → `sub_825070F0`). **First divergence in ours**: tid=1 enters `sub_82173990` via the vtable[0] dispatch but blocks at `sub_82173990+0x2D0 bl 0x824AA330` (KeWaitForSingleObject INFINITE) on handle `0x12A4` = tid=13's thread handle. This is the **same AUDIT-049 wedge**: tid=13 itself is blocked on handle `0x12AC` waiting for the audit-009-cluster signal. Activation of `sub_825070F0` is gated on resolving the tid=13 wait, NOT on any divergence in the ladder fns themselves. [confirmed]
- **AUDIT-060 (2026-05-12)** — verified canary fire reproduces under Windows Debug oracle. Caller chain caveat added: `sub_821B6DF4` ladder-top is EH, not normal call edge. Other ladder fns need individual classification. [confirmed for canary fire; caveat on the upstream chain]
- **AUDIT-058 (2026-05-10)** — captured canary fire context, walked static caller ladder, found all 6 ladder fns fire 0× in ours. Concluded "activation phase doesn't activate in ours". [STATUS: ladder framing partially falsified by AUDIT-060 — at least `sub_821B6DF4` is EH; the *real* gate is the AUDIT-056 sub_821C4EB0 throughput gap, upstream.]
- **AUDIT-057 (2026-05-10)** — flagged as top missing-thread spawner (4 of 13 missing thread spawns). [confirmed quantitatively]
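
The AUDIT-067 hook logic (`CompareEQ(val32, watch) → TrapTrue(_,250+idx)`) can be sketched outside the JIT as a plain predicate. This is an illustration only: `WatchHit` and `check_store` are hypothetical names, and `idx` is assumed to be the position of the matching value in the watch list, which the log entry does not state explicitly.

```rust
const TRAP_BASE: u16 = 250;

// Hypothetical record of what the trap handler would log for one hit.
#[derive(Debug, PartialEq, Eq)]
struct WatchHit {
    trap_code: u16,
    pc: u32,
    lr: u32,
    val: u32,
    dst: u32,
    tid: u32,
}

/// Compare the 32-bit value being stored against every watched value.
/// A match yields trap code 250 + index of the matching watch entry.
fn check_store(watches: &[u32], pc: u32, lr: u32, val: u32, dst: u32, tid: u32) -> Option<WatchHit> {
    let idx = watches.iter().position(|&w| w == val)?;
    Some(WatchHit {
        trap_code: TRAP_BASE + idx as u16,
        pc,
        lr,
        val,
        dst,
        tid,
    })
}
```

A check of this shape only sees stores that pass through a hooked guest opcode, which is why the 0-hit result points at a write path outside guest code.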
## Open questions
- What spawns the 4 worker threads exactly? Disassemble body. The threads have entries `0x82506528/58/88/B8` — are these consecutive 0x30-byte stubs that all forward to a common worker fn?
- What class instance triggers the slot-1 dispatch? Is it a `silph::GamePart_Title` instance? The wallclock context (post-`\\dat\\movie` ResolvePath) suggests so.
- **(AUDIT-067 result)** What host-side mechanism installs `0x8200A208` at `0xBCE25340+0`? Candidates: `xboxkrnl_rtl*` direct-write helpers (`RtlCopyMemory`/`RtlFillMemory`/`RtlInitializeCriticalSection` etc.), XEX loader image-rewrites, or kernel-import factory helpers. Next probe: AUDIT-068 host-side mem-watch — hook `Memory::write*` and/or `xe::store_and_swap<*>` in canary.
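
For the last question, a host-level watch would sit on the write path itself rather than on guest opcodes. The sketch below shows the idea on a plain byte buffer, assuming guest memory is big-endian as on the Xbox 360. `store_be_u32_watched` is a hypothetical helper, not canary's `Memory::write*` or `xe::store_and_swap` API.

```rust
/// Write `val` into `mem` at `offset` in big-endian byte order and report
/// whether the value is on the watch list. Returns None if out of bounds.
fn store_be_u32_watched(mem: &mut [u8], offset: usize, val: u32, watches: &[u32]) -> Option<bool> {
    let end = offset.checked_add(4)?;
    // get_mut returns None when the 4-byte range does not fit in the buffer.
    mem.get_mut(offset..end)?.copy_from_slice(&val.to_be_bytes());
    Some(watches.contains(&val))
}
```

Because the comparison happens at the point where bytes land in memory, it would catch a vptr install regardless of which host helper performed it, provided that helper is routed through the hooked function.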
## Cross-references
- Vtable: `0x8200A208` (primary), `0x8200A928` (clone), class `ANON_Class_713383D7`, slot 1.
- Dispatch site: `sub_824F7800+0x320 bctrl` (PC `0x824F7B20`); post-bctrl PC `0x824F7B24`.
- Worker thread entries spawned: `0x82506528, 0x82506558, 0x82506588, 0x825065B8`.
- **Real runtime activation chain (AUDIT-064)**: `tid=1 entry_point → sub_8216EA68 → [sub_822F1AA8](sub_822F1AA8.md) → bctrl vtable[0]={sub_82175330 tail-jump→sub_82173990} → … → sub_821741C8 → [sub_82172BA0](sub_82172BA0.md) → bctrl vtable[6] → [sub_821B55D8](sub_821B55D8.md) → [sub_824F8398](sub_824F8398.md) → [sub_824F7CD0](sub_824F7CD0.md) → [sub_824F7800](sub_824F7800.md) → bctrl vtable[1] → sub_825070F0`.
- **Wedge in ours**: tid=1 blocks at `sub_82173990+0x2D0` on KeWaitForSingleObject(handle=0x12A4 = tid=13's thread handle); tid=13 itself blocks on event 0x12AC, created at `sub_821CB030+0x128` (the AUDIT-049 wedge).
- Old static-DB ladder (AUDIT-058, partly EH): `sub_824F7800 ← sub_824F7CD0 ← sub_824F8398 ← sub_821B55D8 ← [sub_821B6DF4](sub_821B6DF4.md) (EH catch-handler — secondary EH-only entry path)`.
- Audits: 057, 058, 060, 064, 067.
- Artifacts: `audit-runs/audit-058-sub825070F0-activation/`, `audit-runs/audit-060-fnptr-array-bootstrap/canary-sanity-825070F0.log`, `audit-runs/audit-064-activation-ladder/`, `audit-runs/audit-067-vptr-install-mem-watch/`.