docs(audit): KRNBUG-AUDIT-008 + KRNBUG-AUDIT-009 diagnostics — renderer cluster fully unreached
Captures two consecutive read-only diagnostic sessions:

AUDIT-008 (2026-05-05): IO-003 model reset. The 0x100c / 0x1004 / 0x15e0 workers ARE spawned post-IO-003; the IO-003 prediction-scorecard's "UNCREATED" markers were misclassifications (handle audit already showed the workers parked on lifecycle events, just unlinked from dispatcher addresses). Hypothesized the gate among the 5 non-create-chain callers of sub_821800D8 whose parents live in 0x82287000-0x82292FFF.

AUDIT-009 (2026-05-05): falsifies AUDIT-008's β-hypothesis. A 21-PC --branch-probe (6 parents + 5 shims + dispatcher + 9 audit-005 producer-callsites) shows 0/21 firings at -n 500M — the entire 0x82287000-0x82294000 cluster is unreached. Static analysis confirms the cluster's level-1 roots have zero non-call xrefs in sylpheed.db. The gate is structurally above the cluster (vtable / function-pointer that's never written). Stop condition 1 triggered; discipline gate fails on box 1 + box 3; no fix this session.

Also updates audit-runs/audit-006/canary_export_queue.md to reflect the AUDIT-009 evidence: 3 canary-only exports remain REAL_BUT_UNREACHED (ExTerminateThread, KeReleaseSemaphore, XamUserReadProfileSettings) — none is the immediate gate.

No code changes; --branch-probe machinery from AUDIT-007 sufficed. Trace artifacts left untracked under audit-runs/audit-008/ + audit-runs/audit-009/ (consistent with prior audit-runs/* convention).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@@ -5265,3 +5265,217 @@ PROBE_LIST="0x824a9aa0,0x824a9128,0x824a9710,0x824a9778,0x824a9788,0x824a9790,0x
### Next session candidates

The 0x100c worker still doesn't spawn. Three of audit-006's canary-only entries (`ExTerminateThread`, `KeReleaseSemaphore`, `XamUserReadProfileSettings`) remain canary-only — any of them may be the next downstream gate. Re-running `--branch-probe` against `sub_824A9710` would now show a new exit branch (the priv-11 site fires, so the failure mode has shifted).

## KRNBUG-AUDIT-008 — IO-003 model reset; next gate is β-class job-submitter unreached (DIAGNOSTIC 2026-05-05)

### Outcome

**Model reset on IO-003 cascade.** Branch-probe trace at the post-priv-11 cluster decisively shows the 0x100c worker IS spawned as `tid=3` with `ctx=0x828F3D08, entry=0x82181830`, parked on lifecycle event handle `0x1020` (signals=0). The IO-003 audit memory's "0x100c UNCREATED" claim was wrong; the handle audit already had `handle=0x00001020 waiters(tid)=[3]` but the trace dump didn't connect tid=3 to the 0x100c dispatcher. Same correction applies to the 0x1004 worker (tid=11).

The actual next gate is **β-class** (internal-sub unreached): the 5 non-create-chain callers of `sub_821800D8` (job-submitter shims with the pattern `bl outer_getter; lwz r3, 80(r3); bl sub_824AA1D8`) are never called. Their parent functions live in the **0x82287000-0x82292FFF module range** — likely renderer / scene-graph subsystem.

### Decisive runtime evidence

`audit-runs/audit-008/branch-probe.trace`:

```
BRANCH-PROBE pc=0x824a9a14 tid=1 cycle=5378562 -- main: post-XamTaskSchedule
BRANCH-PROBE pc=0x824a93c8 tid=2 cycle=0 r3=0x828a28f0 -- spawned thread enters callback (matches canary's 0x824A93C8/0x828A28F0)
BRANCH-PROBE pc=0x824a9540 tid=2 cycle=4232 -- spawned thread post-StfsCreateDevice cmpi
BRANCH-PROBE pc=0x824a9a44 tid=1 cycle=5378576 -- main: post-KeWaitForSingleObject(0x8287094C)
BRANCH-PROBE pc=0x824a9a4c tid=1 cycle=5378579 -- main: post-KeResetEvent
BRANCH-PROBE pc=0x824a9a98 tid=1 cycle=5378596 -- main: sub_824A9710 epilogue
BRANCH-PROBE pc=0x824a9acc tid=1 cycle=5378609 -- main: sub_824A9AA0 return
BRANCH-PROBE pc=0x8216eaa0 tid=1 cycle=5378617 -- main: bl sub_82181C28 callsite
BRANCH-PROBE pc=0x82181c28 tid=1 cycle=5378618 -- main entered sub_82181C28
BRANCH-PROBE pc=0x821800d8 tid=1 cycle=5378630 -- main entered sub_821800D8 (singleton getter for 0x100c)
BRANCH-PROBE pc=0x82181750 tid=1 cycle=5378645 r3=0x828f3d08 -- main entered sub_82181750 ctor
BRANCH-PROBE pc=0x821817c0 tid=1 cycle=5378712 r3=0x00001020 -- post-sub_824A9F18 (lifecycle event=0x1020)
BRANCH-PROBE pc=0x82181830 tid=3 cycle=0 r3=0x828f3d08 lr=0xbcbcbcbc -- 0x100C WORKER SPAWNED
BRANCH-PROBE pc=0x82181838 tid=3 cycle=1 -- past entry thunk
BRANCH-PROBE pc=0x821817fc tid=1 cycle=5378786 r3=0x00001024 -- main: post-sub_82172370, thread handle=0x1024
BRANCH-PROBE pc=0x82180120 tid=1 cycle=5378951 -- main: post-atexit
BRANCH-PROBE pc=0x82181c58 tid=1 cycle=5378957 r3=0x828f3d08 -- main: bl sub_821800D8 returned
```

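
For post-processing a trace like the one above, a small parser can pull out the numeric fields. This is only a sketch: the field layout is inferred from the sample lines, and `parse_probe_line` and `threads_entering` are hypothetical helper names, not part of xenia-rs.

```python
import re

# Field layout inferred from the trace lines above; not an official format.
LINE_RE = re.compile(
    r"BRANCH-PROBE pc=(0x[0-9a-f]+) tid=(\d+) cycle=(\d+)"
    r"(?: r3=(0x[0-9a-f]+))?(?: lr=(0x[0-9a-f]+))?"
)

def parse_probe_line(line):
    """Return a dict for one BRANCH-PROBE line, or None if it does not match."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    pc, tid, cycle, r3, lr = m.groups()
    return {
        "pc": int(pc, 16),
        "tid": int(tid),
        "cycle": int(cycle),
        "r3": int(r3, 16) if r3 else None,   # optional register dump
        "lr": int(lr, 16) if lr else None,   # optional link register
    }

def threads_entering(lines, entry_pc):
    """List the tids that hit entry_pc, e.g. a worker's entry point."""
    hits = [parse_probe_line(line) for line in lines]
    return sorted({h["tid"] for h in hits if h and h["pc"] == entry_pc})

sample = [
    "BRANCH-PROBE pc=0x821817c0 tid=1 cycle=5378712 r3=0x00001020 -- post-sub_824A9F18 (lifecycle event=0x1020)",
    "BRANCH-PROBE pc=0x82181830 tid=3 cycle=0 r3=0x828f3d08 lr=0xbcbcbcbc -- 0x100C WORKER SPAWNED",
    "BRANCH-PROBE pc=0x82181838 tid=3 cycle=1 -- past entry thunk",
]
print(threads_entering(sample, 0x82181830))  # prints [3]
```

Asking which tid reaches the worker entry `0x82181830` is the same question the trace answers by eye: the 0x100c worker runs as tid=3.
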
### Mechanical chain (cross-checked vs disasm)

1. main (sub_8216EA68) returns from sub_824A9AA0 at cycle 5378609.
2. main calls `sub_82181C28` at `0x8216eaa0` (cycle 5378617). `sub_82181C28` is a Meyers singleton getter that checks `[0x828F3D98]` flag.
3. First call → flag is 0 → falls through to `bl sub_821800D8` at `0x82181c54`.
4. `sub_821800D8` is the 0x100c singleton getter. Checks `[0x828F3D78]` flag bit 0. First call → bit 0 is 0 → falls through to `bl sub_82181750` at `0x82180110`.
5. `sub_82181750` is the constructor, called with `r3 = this = 0x828F3D08` (the dispatcher).
6. Constructor calls `bl sub_824A9F18` (allocates a lifecycle event); returns r3=0x1020.
7. Constructor calls `bl sub_82172370` at `0x821817f8` (the ExCreateThread wrapper) with r3=0x20000 (stack), r4=0x82181830 (entry), r5=0x828F3D08 (ctx).
8. Worker thread spawns as tid=3 at PC=0x82181830 → through the entry thunk → 0x82181838 (worker body).
9. Worker body reads `[0x828F3D08+76] = 0x1020` (lifecycle event handle), waits on it.
10. **Wait never satisfied** — handle 0x1020 has `signals=0, waits=1, wakes=0` in the handle-audit dump.

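
The shape of the chain above (flag-guarded getter, constructor that allocates an event and spawns a worker, worker that parks on the event) can be modelled in a few lines. This is a toy sketch, not the guest code: `Dispatcher`, `get_dispatcher`, and the use of `threading.Event` as a stand-in for handle 0x1020 are all assumptions made for illustration.

```python
import threading

class Dispatcher:
    """Toy stand-in for the object at 0x828F3D08 (names are hypothetical)."""

    def __init__(self):
        self.lifecycle_event = threading.Event()  # plays the role of handle 0x1020
        self.woke = False
        # daemon=True so the parked worker never blocks interpreter exit
        self.worker = threading.Thread(target=self._worker_body, daemon=True)
        self.worker.start()

    def _worker_body(self):
        self.lifecycle_event.wait()  # parks until someone signals the event
        self.woke = True

_instance = None  # plays the role of the guard flag plus the storage slot

def get_dispatcher():
    """Flag-guarded getter: construct on the first call, reuse afterwards."""
    global _instance
    if _instance is None:
        _instance = Dispatcher()
    return _instance

d1 = get_dispatcher()
d2 = get_dispatcher()
d1.worker.join(timeout=0.2)  # give the worker a chance to run
print(d1 is d2, d1.worker.is_alive(), d1.woke)  # prints True True False
```

Nothing in this model ever calls `lifecycle_event.set()`, so the worker stays alive and parked, which mirrors the `signals=0, waits=1, wakes=0` state in step 10.
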
### Where the gate actually is

`sub_821800D8` xrefs:

| caller PC | from func | role |
|-----------|-----------|------|
| 0x82181c54 | sub_82181C28 | **create chain** (ran successfully — see trace above) |
| 0x821802d8 | sub_82180158 | job-submitter shim |
| 0x821806e0 | sub_821805C8 | job-submitter shim |
| 0x82180b28 | sub_82180A10 | job-submitter shim |
| 0x82180ea0 | sub_82180D90 | job-submitter shim |
| 0x82181254 | sub_821810E0 | job-submitter shim |

Each shim is a 5-instruction leaf (`bl getter; lwz r3, 80(r3); bl sub_824AA1D8`) — the canonical "get-then-enqueue" pattern. `sub_824AA1D8` is the universal dispatcher-submit primitive that signals the lifecycle event.

The six parent functions of the 5 shims are in the **0x82287000-0x82292FFF module range** (sub_82292838, sub_822878A8, sub_8228D760, sub_822900A8, sub_822919C8, sub_8228FDB8). This module is downstream of the cache-init code we've been working on and is almost certainly renderer/scene-graph related.

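
To make the "get-then-enqueue" pattern concrete, here is a toy model of a shim handing work to a dispatcher whose worker is parked on an event. It is a sketch under assumptions: the queue, the job name `"draw-frame"`, and the functions `submit` and `shim` are hypothetical, and the field load (`lwz r3, 80(r3)`) between the getter and the submit call is glossed over.

```python
import queue
import threading

class Dispatcher:
    """Toy dispatcher: one worker parked on a lifecycle event."""

    def __init__(self):
        self.jobs = queue.Queue()
        self.lifecycle_event = threading.Event()
        self.done = []
        self.worker = threading.Thread(target=self._run, daemon=True)
        self.worker.start()

    def _run(self):
        self.lifecycle_event.wait()        # parked until a submit arrives
        while not self.jobs.empty():
            self.done.append(self.jobs.get())

def submit(dispatcher, job):
    """Enqueue, then signal: the role the text assigns to sub_824AA1D8."""
    dispatcher.jobs.put(job)
    dispatcher.lifecycle_event.set()

def shim(get_dispatcher, job):
    """Get-then-enqueue: fetch the singleton, then hand the job to submit()."""
    submit(get_dispatcher(), job)

d = Dispatcher()
shim(lambda: d, "draw-frame")              # hypothetical job name
d.worker.join(timeout=1.0)
print(d.done)                              # prints ['draw-frame']
```

In this model the worker only ever wakes because a shim ran. If no shim is ever called, the event is never set, which is the situation the xref table points at.
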
### Discipline gate

Per task brief (audit-008 session), gate fails on:

- Box 1: gate is NOT a single stubbed import (β-class, not α-class).
- Box 4: no sharp 4-dim cascade prediction can be written without first identifying which submitter should fire first.

**Hand back. No fix this session.**

### Follow-up probe set for next session

Probe the parent functions, the 5 shims themselves, and the dispatcher-submit primitive to find which path actually fires:

```
PROBE_LIST="0x82292838,0x822878a8,0x8228d760,0x822900a8,0x822919c8,0x8228fdb8,\
0x82180158,0x821805c8,0x82180a10,0x82180d90,0x821810e0,\
0x824aa1d8"
```

Whichever target fires identifies the producer path; whichever doesn't names the gate.

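
Splitting a probe list into fired and silent targets is mechanical once the fired PCs are known. The sketch below assumes the fired set has already been extracted from a trace; `split_probes` is a hypothetical helper, and the empty `fired_pcs` is purely a placeholder input, not a recorded result.

```python
def split_probes(probe_list, fired_pcs):
    """Partition a comma-separated probe list into (fired, silent) PC lists."""
    probes = [int(p, 16) for p in probe_list.split(",") if p.strip()]
    fired = [p for p in probes if p in fired_pcs]
    silent = [p for p in probes if p not in fired_pcs]
    return fired, silent

PROBE_LIST = (
    "0x82292838,0x822878a8,0x8228d760,0x822900a8,0x822919c8,0x8228fdb8,"
    "0x82180158,0x821805c8,0x82180a10,0x82180d90,0x821810e0,"
    "0x824aa1d8"
)

# Placeholder: substitute the set of PCs seen in BRANCH-PROBE lines.
fired, silent = split_probes(PROBE_LIST, fired_pcs=set())
print(len(fired), len(silent))  # prints 0 12
```
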
### Trace artifacts

- `audit-runs/audit-008/branch-probe.trace` — 17 BRANCH-PROBE lines (clean extract).
- `audit-runs/audit-008/probe-100m.log` — full stdout.
- `audit-runs/audit-008/probe-100m.err` — full stderr trace.

### Files modified

None. KRNBUG-AUDIT-007's `--branch-probe` machinery was sufficient. No git commit — no code changes.

## KRNBUG-AUDIT-009 — renderer cluster fully unreached; gate is structurally above 0x82287xxx-0x82294xxx (DIAGNOSTIC 2026-05-05)

### Outcome

**Stop condition 1 triggered.** Branch-probed 21 PCs: the 12 proposed by AUDIT-008 (6 renderer-cluster parents + 5 shims + dispatcher) plus the 9-PC AUDIT-005 producer-callsite set. **0/21 fired** at -n 500M. The 0x82287000-0x82294000 cluster is not entered at all. The gate is structurally above the cluster, outside its call boundary, so a deeper renderer-side probe would land on dead code. Per task brief stop condition 1, hand back with a higher-up probe set; no Phase 2 attempted.

### Decisive runtime evidence

`audit-runs/audit-009/probe-500m.err`:

- `branch probes armed: 21 (0x8216f9d4 ... 0x824aa1d8)`
- `BRANCH-PROBE` line count in stderr: **0**.
- `instructions=500000010 import_calls=5629676 unimplemented=0` — completed without halt.
- Final state, main: `tid=1 hw=0 state=Ready pc=0x822f1c60 lr=0x822f1be0` — inside `sub_822F1AA8` (frame-poll loop, between two `XNotifyGetNext` calls at 0x822f1bdc / 0x822f1c14). LR-of-LR points back into the same function.
- Counters: `XNotifyGetNext=1,489,741`, `NtWaitForSingleObjectEx=1,489,801`, `NtWaitForMultipleObjectsEx=865,493`, `RtlEnter/LeaveCriticalSection=889,109` each, `VdSwap=2`. Main is stuck polling in its service loop; no forward progress past frame-poll.
- 18 worker threads spawned (parity with the audit-008 baseline, plus 4 new entry trampolines at 0x822c6870 / 0x824563e0 / 0x823dde30 / 0x823ddb50 that weren't catalogued before): tid=3 (0x100c worker, ctx=0x828F3D08, parked on lifecycle event 0x1020), tid=11 (0x1004 worker, ctx=0x828F3EC0, parked on event 0x1004), tid=17 (0x15e0 worker, ctx=0x828F4070, parked on event 0x15F4; this confirms the post-IO-003 spawn at the new tid).
- canary-only kernel exports unchanged from audit-008: `{ExTerminateThread, KeReleaseSemaphore, XamUserReadProfileSettings}` (3 entries).
- `signal_attempts=0` on parked handles 0x1004, 0x100c (= event 0x1020), 0x15e0 (= event 0x15F4), 0x10c4. Same parked state as audit-008.

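
The summary line and counters above lend themselves to a quick sanity calculation. The sketch below parses the `key=value` summary and divides total instructions by the `XNotifyGetNext` count. `parse_summary` is a hypothetical helper, and the resulting ratio is an average over the whole run, not a measured loop length.

```python
def parse_summary(line):
    """Parse space-separated key=value integer fields into a dict."""
    out = {}
    for token in line.split():
        key, sep, value = token.partition("=")
        if sep and value.isdigit():
            out[key] = int(value)
    return out

stats = parse_summary("instructions=500000010 import_calls=5629676 unimplemented=0")
notify_calls = 1_489_741  # XNotifyGetNext counter quoted above

# Average guest instructions executed per XNotifyGetNext call.
per_call = stats["instructions"] / notify_calls
print(stats["unimplemented"], round(per_call))  # prints 0 336
```

A few hundred instructions per notification poll is consistent with a tight polling loop rather than substantial work between polls, though the instruction count may include other threads.
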
### Mechanical interpretation

- **Group 1 of the 12 PCs (parents):** `sub_82292838, sub_822878A8, sub_8228D760, sub_822900A8, sub_822919C8, sub_8228FDB8` — never entered.
- **Group 2 of the 12 PCs (shims):** `sub_82180158, sub_821805C8, sub_82180A10, sub_82180D90, sub_821810E0` — never entered. (These are leaf shims with the `bl outer_getter; lwz r3, OFFSET(r3); bl sub_824AA1D8` pattern.)
- **Group 3 (universal dispatcher):** `sub_824AA1D8` — never entered. The dispatcher serves both 0x100c and 0x15e0 clusters; its non-entry confirms NEITHER cluster's job-submit path runs, not just the 0x100c side.
- **AUDIT-005 9-PC producer callsites (5 × 0x100c shims + 4 × 0x15e0 shims):** never entered.

This eliminates the audit-008 working hypothesis that the gate sat among the 5 non-create-chain callers of `sub_821800D8`. The gate is at least one level higher, above the cluster's external entry boundary.

### Cluster shape (from sylpheed.db xrefs)

The 0x82287000-0x82294000 cluster is **internally cohesive but externally unreachable via direct calls**. Its level-1 root functions (where call hierarchy starts within the cluster) have only self-call xrefs — i.e. the cluster is reached only via indirect calls (function pointers / vtables) from outside. The 6 candidate parents from audit-008 sit deep enough that ANY upstream gate looks the same from their level.

External entry points worth probing next:

- `sub_82293448` (0x82293448) — level-1 root, only self-recursion xrefs.
- `sub_822919C8` (0x822919C8) — level-1 root, only self-recursion xrefs.
- `sub_82288028` (0x82288028) — 8 callers, all in-cluster, but a hub.
- `sub_82292D80` (0x82292D80) — 1 caller, in-cluster (sub_82293448).
- `sub_822851E0` (0x822851E0) — has 2 in-cluster callers (sub_82284BA0, sub_82290BC8); reached transitively from `sub_82289FD0`.
- `sub_82286BC8` (0x82286BC8) — only sub_822851E0 calls it.

NEW thread entry trampolines spawned post-IO-003 (these didn't exist in audit-008's tid set; mid-run kernel-call telemetry shows ExCreateThread at these PCs):

- 0x822c6870 (tid=14 + tid=15, parallel duplicates, ctx=0x828f3300)
- 0x824563e0 (tid=16, ctx=0x828f3e70)
- 0x823dde30 (tid=18, ctx=0x828f3c4c)
- 0x823ddb50 (tid=19 + tid=20, parallel duplicates, ctx=0x828f3c88)

These are likely XAM/system-event dispatchers, not renderer producers, but their entries are unprobed — worth folding into the next probe set to confirm they are not the missing edge.

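
The "level-1 root" notion used above can be stated as a small graph query: a function in the address range with no direct caller other than itself. The sketch below runs on an in-memory edge list; it does not read `sylpheed.db` (whose schema is not shown here), and `direct_call_roots` is a hypothetical helper. The two edges are taken from the bullets above and are deliberately partial.

```python
from collections import defaultdict

def direct_call_roots(edges, lo, hi):
    """Return in-range functions with no direct caller other than themselves."""
    callers = defaultdict(set)
    funcs = set()
    for caller, callee in edges:
        funcs.update((caller, callee))
        if caller != callee:  # self-recursion does not count as a caller
            callers[callee].add(caller)
    return sorted(f for f in funcs if lo <= f < hi and not callers[f])

# Partial (caller, callee) edge list taken from the bullets above.
edges = [
    (0x82293448, 0x82293448),  # sub_82293448: only self-recursion xrefs
    (0x82293448, 0x82292D80),  # sub_82292D80: single in-cluster caller
]

roots = direct_call_roots(edges, 0x82287000, 0x82294000)
print([hex(r) for r in roots])  # prints ['0x82293448']
```

A root found this way has no direct-call path leading to it, so if it runs at all it must be reached through an indirect call, which is why the gate is suspected to be an unwritten function pointer or vtable slot.
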
### Why main parks at sub_822F1AA8

main's call sequence (from xrefs of sub_8216EA68): the priv-11/cache-init cluster (`sub_824A9AA0`), the 0x100c create chain (`sub_82181C28`), `sub_82181298` (a 964-byte function — likely 0x1004 create chain), then a series of `sub_8216E858 / sub_82448470 / sub_8216F218 / sub_82448XXX` calls (probably config / xconfig / atexit), then finally:

```
0x8216ecc4: sub_822F17F0 (684 bytes — pre-poll setup, calls sub_82611CD8/sub_825F1000/sub_825F14D0/sub_824C1A38/sub_824BD460×2/sub_824BD580×2/sub_824B3798/sub_824B40B0/sub_824C2BF8/sub_824CE348/sub_824C76D0/sub_824CE4D0)
0x8216eccc: sub_822F1AA8 (frame-poll #1) ← we are here, looping forever
0x8216ee10: sub_822F1AA8 (frame-poll #2) — never reached
```

Two interpretations are plausible:

1. **sub_822F1AA8 is a finite poll** that exits when XNotifyGetNext returns a particular notification (e.g. dashboard signin completion / profile load). Some XAM event main expects is never delivered.
2. **sub_822F1AA8 is an event pump for the FIRST half of init**, calling work-items that should drive the renderer subsystem. If the work-items are dispatched here and the dispatch path goes via an indirect call into the 0x82287xxx cluster, then the missing edge is a function-pointer/vtable that's never populated.

Both interpretations are consistent with the 0/21 probe data. Probing the entry of sub_822F1AA8's CALLEE list (the calls inside the 1.49M-iteration loop) will discriminate.

### Discipline gate

| # | Condition | Pass? |
|---|---|---|
| 1 | Phase 1 named a single failing kernel/xam import (α) or a narrow internal-sub bug | **NO** — 0 PCs fired |
| 2 | Canary impl small (<80 LOC) | N/A |
| 3 | Sharp 4-dim cascade prediction | **NO** — no candidate fix |
| 4 | No new ABI plumbing | N/A |
| 5 | Fix doesn't touch renderer subsystem | N/A |

**Gate fails on box 1 + 3. STOP. Hand back per stop condition 1.** No code changes this session.

### Follow-up probe set for next session

```
# Renderer-cluster level-1 roots (never entered if gate is above):
PROBE_LIST="0x82293448,0x822919c8,0x82288028,0x82292d80,0x822851e0,0x82286bc8"
# Newly spawned thread entry trampolines (unprobed, may be system-side):
PROBE_LIST="$PROBE_LIST,0x822c6870,0x824563e0,0x823dde30,0x823ddb50"
# Main's frame-poll loop entry + its callee list (XNotifyGetNext consumer):
PROBE_LIST="$PROBE_LIST,0x822f1aa8,0x822f1be0,0x822f1c14,0x822f1d00"
# Main's continuation (only fires if main exits frame-poll #1):
PROBE_LIST="$PROBE_LIST,0x822f1638,0x821506b8,0x8216f088,0x82150ef8"
PROBE_LIST="$PROBE_LIST,0x82173360,0x82173530,0x8216f170,0x824a9ad8"
```

Whichever entries fire bound the live path; whichever don't bound the gate.

- If `sub_822F1AA8` fires once but never exits → main is stuck waiting for a XAM notification or critical-section signal. Look for which `XamNotifyCreateListener`-registered ID the loop expects.
- If `sub_822F1AA8` fires AND exits → main reaches `sub_822F1638` etc.; gate is further down.
- If the cluster level-1 roots fire → gate is INSIDE the cluster (renderer β-recursion), and the brief's "no renderer fixes" rule binds.

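
The three cases above can be written as a small decision function over the set of fired PCs. This is a sketch with two assumptions that the text does not state: the cases are checked in the order shown (cluster roots first), and "exits the frame-poll" is approximated by the `sub_822F1638` probe firing. `diagnose` is a hypothetical name.

```python
CLUSTER_ROOTS = {
    0x82293448, 0x822919c8, 0x82288028,
    0x82292d80, 0x822851e0, 0x82286bc8,
}
POLL_ENTRY = 0x822f1aa8     # sub_822F1AA8, frame-poll loop
CONTINUATION = 0x822f1638   # sub_822F1638, first post-poll probe

def diagnose(fired):
    """Map a set of fired probe PCs to one of the cases listed above."""
    if fired & CLUSTER_ROOTS:
        return "gate is inside the cluster"
    if POLL_ENTRY in fired and CONTINUATION in fired:
        return "main exits frame-poll; gate is further down"
    if POLL_ENTRY in fired:
        return "main stuck in frame-poll"
    return "frame-poll entry never observed"

print(diagnose({0x822f1aa8}))
# prints: main stuck in frame-poll
```
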
### Trace artifacts

- `audit-runs/audit-009/probe-500m.log` — final state + thread diag + handle audit + full counter table.
- `audit-runs/audit-009/probe-500m.err` — full stderr trace (kernel call log, 187 KB).
- `audit-runs/audit-009/branch-probe.trace` — empty (0 BRANCH-PROBE lines emitted).

Re-run command:

```
cd xenia-rs
PROBE="0x82292838,0x822878a8,0x8228d760,0x822900a8,0x822919c8,0x8228fdb8,\
0x82180158,0x821805c8,0x82180a10,0x82180d90,0x821810e0,0x824aa1d8,\
0x821802d8,0x821806e0,0x82180b28,0x82180ea0,0x82181254,\
0x8216f9d4,0x8216fc08,0x821700b8,0x821700f4"
./target/release/xenia-rs exec sylpheed.iso \
  --halt-on-deadlock --branch-probe="$PROBE" \
  --trace-handles-focus=0x1004,0x100c,0x15e0,0x1020,0x10c4 \
  -n 500000000 \
  > audit-runs/audit-009/probe-500m.log 2> audit-runs/audit-009/probe-500m.err
```

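
As a cross-check on the probe list in the re-run command, the sketch below rebuilds it from its groups and confirms the count and address bounds reported by the `branch probes armed` line. The group names are labels chosen here for illustration. In particular, `other_audit005_callsites` is an assumed label for the four PCs that are not `sub_821800D8` callsites.

```python
groups = {
    "parents": ["0x82292838", "0x822878a8", "0x8228d760",
                "0x822900a8", "0x822919c8", "0x8228fdb8"],
    "shims": ["0x82180158", "0x821805c8", "0x82180a10",
              "0x82180d90", "0x821810e0"],
    "dispatcher": ["0x824aa1d8"],
    "sub_821800D8_callsites": ["0x821802d8", "0x821806e0", "0x82180b28",
                               "0x82180ea0", "0x82181254"],
    "other_audit005_callsites": ["0x8216f9d4", "0x8216fc08",
                                 "0x821700b8", "0x821700f4"],
}

# Join in insertion order to reproduce the comma-separated PROBE value.
probe = ",".join(pc for pcs in groups.values() for pc in pcs)
all_pcs = [int(pc, 16) for pc in probe.split(",")]

print(len(all_pcs), hex(min(all_pcs)), hex(max(all_pcs)))
# prints: 21 0x8216f9d4 0x824aa1d8
```

The count and the lowest and highest addresses match `branch probes armed: 21 (0x8216f9d4 ... 0x824aa1d8)`.
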
### Files modified

None. KRNBUG-AUDIT-007's `--branch-probe` machinery was sufficient. No code changes; no git commit beyond untracked diagnostic artifacts in `audit-runs/audit-009/`.