# KRNBUG-AUDIT-010 — XNotify delivery diff (2026-05-05, READ-ONLY)

## Branch classification

**(α): canary delivers a SPECIFIC set of notifications we don't, at a
specific point. Synthesis side is identifiable.**

## Verified ground truth

### Static — our impl
- `crates/xenia-kernel/src/xam.rs:358-361`: `xam_notify_create_listener`
  is a stub that allocates a handle but **stores no listener object,
  no queue, no mask**.
- `crates/xenia-kernel/src/xam.rs:363-366`: `xnotify_get_next` is a
  stub that **always returns `r3 = 0`** (FALSE / no notifications).
- `crates/xenia-kernel/src/objects.rs:14-77`: the `KernelObject` enum has
  variants `Event/Semaphore/File/Thread/Timer/Mutex` only. **No
  `NotifyListener` variant exists** in our object model.
- No code anywhere in `crates/xenia-kernel/src/` references
  `BroadcastNotification`, `EnqueueNotification`, or any notification
  queue.

### Static — canary
- `xenia-canary/src/xenia/kernel/xam/xam_notify.cc:22-46`:
  `XamNotifyCreateListener_entry` constructs an `XNotifyListener`,
  calls `Initialize(mask, is_system, max_version)`, and returns its
  handle.
- `xenia-canary/src/xenia/kernel/xam/xam_notify.cc:57-95`:
  `XNotifyGetNext_entry` looks up the listener by handle, calls
  `DequeueNotification(&id, &param)`, and returns 1 if an entry was
  dequeued.
- `xenia-canary/src/xenia/kernel/xnotifylistener.cc:25-51`:
  `Initialize` creates a manual-reset event and registers the listener
  with `KernelState::RegisterNotifyListener(this)`. `EnqueueNotification`
  appends to the queue and signals the wait_handle (mask + version
  filtered).
- `xenia-canary/src/xenia/kernel/kernel_state.cc:1013-1033`:
  **`RegisterNotifyListener` enqueues 4 startup notifications on the
  first listener whose mask covers `kXNotifySystem` / `kXNotifyLive`**:
  - `kXNotificationSystemUI = 0x00000009`, data = `IsUIActive()`
  - `kXNotificationSystemSignInChanged = 0x0000000A`, data = `1`
  - `kXNotificationLiveConnectionChanged = 0x02000001`, data = `0x001510F1`
    (`X_ONLINE_S_LOGON_DISCONNECTED`)
  - `kXNotificationLiveLinkStateChanged = 0x02000003`, data = `0`

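The "mask + version filtered" step in `EnqueueNotification` can be sketched in Rust as below. This is a minimal sketch, not canary's code: the bit layout (area index in bits 25-30 of the ID, version in bits 16-24) and the helper names `mask_index`, `version` and `passes_filter` are assumptions made for illustration. Only the notification IDs and the `0x2F` mask come from this audit.

```rust
/// Area index of a notification ID.
/// ASSUMPTION: the area lives in bits 25..=30 of the ID.
fn mask_index(id: u32) -> u32 {
    (id >> 25) & 0x3F
}

/// Version field of a notification ID.
/// ASSUMPTION: the version lives in bits 16..=24 of the ID.
fn version(id: u32) -> u32 {
    (id >> 16) & 0x1FF
}

/// True if a listener with this mask and max_version would accept `id`.
fn passes_filter(mask: u64, max_version: u32, id: u32) -> bool {
    let area_bit = 1u64 << mask_index(id);
    (mask & area_bit) != 0 && version(id) <= max_version
}
```

Under these assumptions, `0x00000009` and `0x0000000A` fall in area 0 and `0x02000001` / `0x02000003` in area 1, so a listener created with mask `0x2F` and `max_version = 0` would accept all four startup notifications.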
### Runtime — canary
From `/home/fabi/xenia_canary_windows/xenia.log`:
- Line 1395: `XamNotifyCreateListener(0x000000000000002F, 0x00000000)`,
  i.e. mask `0x2F = kXNotifySystem|kXNotifyLive|kXNotifyFriends|kXNotifyCustom|kXNotifyXmp`,
  max_version=0. The mask covers both `kXNotifySystem` and `kXNotifyLive`,
  so all 4 startup notifications are queued by `RegisterNotifyListener`.
- Line 2787: `XamUserReadProfileSettings(0, 0, 0, 0, 8, 701CEC80=10040015, ...)`
  fires AFTER the listener is created and the notifications are
  delivered. This strongly suggests it is the result of the SignInChanged
  dispatch (the sign-in handler reads the now-signed-in user's profile).
- Line 1426: `VdInitializeEngines(...)`, then continued progression
  through the renderer cluster, `MmAllocatePhysicalMemoryEx` for the
  ring buffer, `VdEnableRingBufferRPtrWriteBack`, etc. (the full boot
  trajectory).

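The decomposition of the `0x2F` listener mask can be checked with a small decoder. This is only a sketch: the flag names come from the log analysis, but the numeric bit values in `FLAGS` are assumptions chosen so that the five flags combine to `0x2F`, and `decode_mask` is a hypothetical helper.

```rust
/// Listener-mask flags. The names appear in the findings; the bit
/// values are ASSUMPTIONS (they OR together to 0x2F).
const FLAGS: [(u64, &str); 5] = [
    (0x01, "kXNotifySystem"),
    (0x02, "kXNotifyLive"),
    (0x04, "kXNotifyFriends"),
    (0x08, "kXNotifyCustom"),
    (0x20, "kXNotifyXmp"),
];

/// Return the names of all known flags set in `mask`, in table order.
fn decode_mask(mask: u64) -> Vec<&'static str> {
    FLAGS
        .iter()
        .filter(|entry| mask & entry.0 != 0)
        .map(|entry| entry.1)
        .collect()
}
```

With these values, `decode_mask(0x2F)` lists all five flags, including the two that gate the startup notifications.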
### Runtime — ours (audit-009 / -n 500M)
From `audit-runs/audit-009/probe-500m.err`:
- `kernel.calls{name=XamNotifyCreateListener} = 1`: the stub fires.
- `kernel.calls{name=XNotifyGetNext} = 1,489,741`: the main poll loop
  hammers it ~1.5M times in 500M instructions.
- `kernel.calls{name=XNotifyPositionUI} = 1`: fires once.
- 0/21 renderer-cluster + producer-callsite probe PCs fire.
- **Canary-only exports unchanged: `ExTerminateThread`,
  `KeReleaseSemaphore`, `XamUserReadProfileSettings`.** `XamUserReadProfileSettings`
  is the prediction target after the fix.

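To put the poll count in perspective, the average number of guest instructions between `XNotifyGetNext` calls can be worked out directly. The sketch below assumes `-n 500M` means exactly 500,000,000 instructions; `instructions_per_call` is a hypothetical helper, not part of the probe tooling.

```rust
/// Average guest instructions per call, rounded down.
/// Returns None when `calls` is zero.
fn instructions_per_call(total_instructions: u64, calls: u64) -> Option<u64> {
    total_instructions.checked_div(calls)
}
```

For this run, `instructions_per_call(500_000_000, 1_489_741)` gives `Some(335)`, i.e. roughly one poll every 335 instructions on average over the whole run.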
## Consumer side — Sylpheed's notification dispatch path

Static analysis of `sub_822F1AA8` (the main frame-poll loop), via
`sylpheed.db`:

```
0x822f1bcc  addi   r6, r31, 84        ; r6 = &param
0x822f1bd0  lwz    r3, 132(r30)       ; r3 = block.listener_handle ([block+132])
0x822f1bd4  addi   r5, r31, 88        ; r5 = &id
0x822f1bd8  addi   r4, r0, 0          ; r4 = match_id = 0 (any)
0x822f1bdc  bl     0x8284E45C         ; XNotifyGetNext(handle, 0, &id, &param)
0x822f1be0  cmpi   cr6, 0, r3, 0
0x822f1be4  bc     ..., 0x822F1C20    ; if r3==0, jump past the dispatch block
0x822f1be8  lwz    r3, 7944(r25)      ; r25=0x828E0000 → r3 = mem[0x828E1F08] = outer
0x822f1bec  lwz    r5, 84(r31)        ; r5 = param/data (the slot passed via r6)
0x822f1bf0  lwz    r4, 88(r31)        ; r4 = id (the slot passed via r5)
0x822f1bf4  lwz    r11, 0(r3)         ; r11 = outer.vtable
0x822f1bf8  lwz    r11, 4(r11)        ; r11 = vtable[1] = OnNotify(this, id, data)
0x822f1bfc  mtspr  CTR, r11
0x822f1c00  bcctrl 20, lt             ; call OnNotify
0x822f1c04  ...                       ; loop back to 0x822F1BE8 (drain queue)
```

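At a higher level, the disassembly above reads as a poll-and-dispatch loop. The Rust sketch below is an interpretation, not recovered source: the `Dispatcher` trait, `poll_notifications` and `Recorder` are hypothetical names, the closure stands in for `XNotifyGetNext(handle, 0, &id, &param)`, and it assumes the loop keeps polling until the call reports nothing left. Register-level argument order is not modeled.

```rust
/// Hypothetical stand-in for the guest's dispatcher object ("outer").
trait Dispatcher {
    fn on_notify(&mut self, id: u32, data: u32);
}

/// Poll `get_next` until it reports an empty queue, forwarding each
/// (id, data) pair to the dispatcher. Returns how many were delivered.
fn poll_notifications<F, D>(mut get_next: F, dispatcher: &mut D) -> usize
where
    F: FnMut() -> Option<(u32, u32)>,
    D: Dispatcher,
{
    let mut delivered = 0;
    while let Some((id, data)) = get_next() {
        dispatcher.on_notify(id, data);
        delivered += 1;
    }
    delivered
}

/// Test double that records every notification it receives.
struct Recorder(Vec<(u32, u32)>);

impl Dispatcher for Recorder {
    fn on_notify(&mut self, id: u32, data: u32) {
        self.0.push((id, data));
    }
}
```

With a `get_next` that always returns `None`, which is what our current stub amounts to, `poll_notifications` never reaches the dispatcher, matching the observation that nothing downstream of the dispatch site fires.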
### Construction chain
- `sub_8216EA68` (main) at 0x8216ECAC: `bl sub_822F2758(r3=&outer_on_stack)`.
- `sub_822F2758`:
  - 0x822f2788: `stw r11, 0(r30)` with `r11 = 0x820AD894` →
    **outer.vtable = 0x820AD894**.
  - 0x822f2790: `bl sub_82150EF8` allocates a 288-byte block.
  - if non-null, 0x822f27a4: `bl sub_822F14D8(block, outer)`.
  - 0x822f27b8: `stw block, 4(outer)` → outer[4] = block.
- `sub_822F14D8`:
  - 0x822f15a0: `bl sub_826124A0` (= tail-jump to `XamNotifyCreateListener`
    with `r3 = 0x2F = mask`, `r4 = 0` set inside the trampoline).
  - 0x822f15a8: `stw listener_handle, 132(block)`.
  - 0x822f15c4-c8: `stw outer, 7944(0x828E0000)` →
    **mem[0x828E1F08] = outer** (the dispatcher slot).
- `sub_822F1638` is the destructor; it clears the slot (mem[0x828E1F08] = 0).

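The dispatcher-slot address follows from ordinary PowerPC base-plus-displacement arithmetic, which the sketch below reproduces. `addis_high` and `effective_address` are hypothetical helper names; the sketch assumes 32-bit addresses and a sign-extended 16-bit displacement.

```rust
/// Value produced by `addis rD, 0, imm`: the immediate placed in the
/// upper 16 bits of a 32-bit register.
fn addis_high(imm: u16) -> u32 {
    (imm as u32) << 16
}

/// Effective address of a D-form load/store: base plus the
/// sign-extended 16-bit displacement, wrapping at 32 bits.
fn effective_address(base: u32, disp: i16) -> u32 {
    base.wrapping_add(disp as i32 as u32)
}
```

For example, `effective_address(addis_high(0x828E), 7944)` evaluates to `0x828E1F08`, the slot that `sub_822F14D8` fills and the poll loop reads back.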
### vtable resolution (read directly from .pe at file offset 0xAD894)
- `outer.vtable = 0x820AD894`.
  - vtable[0] = `0x825ED990`
  - vtable[1] = `0x825ED990` ← OnNotify
  - vtable[2,3] = `0x825ED990`
  - vtable[4] = `0x824C8F00` (= `bclr 20, lt`, an empty 1-instruction return)
  - vtable[5,6] = `0x825ED990`
  - vtable[7] = `0x824C8F00`

`sub_825ED990` body (`addr 0x825ED990`):
```
mfspr  r12, LR
stw    r12, -8(r1)
stwu   r1, -96(r1)
addis  r11, r0, 0x828A
lwz    r11, 23420(r11)     ; r11 = mem[0x828A5B7C]
cmpli  0, r11, 0x0
bc     12, eq, 0x825ED9B4  ; skip indirect call if pointer is null
mtspr  CTR, r11
bcctrl 20, lt              ; call mem[0x828A5B7C](r3=this, r4=id, r5=data)
addi   r3, r0, 25          ; r3 = 25
bl     0x825F6B90          ; ?
addi   r4, r0, 1
addi   r3, r0, 0
bl     0x825F50D0          ; (atomic-OR on mem[0x82887808])
bl     0x825F5020          ; (long function, looks like _exit)
return
```

**Static reading is suspicious**: vtable[1] looks like a "must-override"
base-class abort handler (`__purecall`-style) that calls a registered
debug callback and then runs an exit code path; six of the eight vtable
slots point at it. Yet canary runs Sylpheed fine through this dispatch
site, so either (i) `mem[0x828A5B7C]` is populated with the real
dispatcher and the post-call sequence does not exit, or (ii) some
derived-class instance overwrites the vtable between construction and
first dispatch (no such write was visible in xrefs to mem[0x828E1F08]
beyond the constructor and destructor).

## Discipline gate

| Box | Status | Notes |
|-----|--------|-------|
| 1. Specific missing notification with canary file:line | ✅ | 4 IDs, kernel_state.cc:1013-1033, xnotifylistener.cc:25-51, xam_notify.cc:57-95 |
| 2. Synthesis is small kernel/xam-side change <80 LOC | ✅ | est. ~70 LOC: KernelObject::NotifyListener variant + register-listener auto-enqueue + xnotify_get_next dequeue |
| 3. Sharp 4-dim cascade prediction can be written | ❌ | Cannot name renderer L1 root: outer's vtable[1] resolves statically to an apparent abort handler (`sub_825ED990`); actual runtime dispatch target is opaque without instrumentation |
| 4. No renderer/GPU code changes | ✅ | Pure kernel-state/xam change |

**Box 3 fails. STOP. Hand back diagnostic-only.**

## Next-session plan

This is the first session past the kernel-boundary cascade. The
evidence for the notification gap is overwhelming (1.49M unanswered
`XNotifyGetNext` calls), but **the dispatcher's vtable[1] target
needs a one-shot runtime probe before we can write a sharp prediction**.

1. **Phase 1 first**: temporarily patch `xnotify_get_next` to
   return one synthetic notification (e.g. `id=0x0A` (SignInChanged),
   `data=1`) on the first call. Add `--pc-probe=0x822f1bfc,0x822f1c00`
   to capture the actual vtable[1] target via the `bcctrl`. Re-run
   with `-n 100M`. Read off the dispatcher target. Revert the temporary
   stub.
   - If target ≠ 0x825ED990: the vtable was dynamically replaced;
     follow the chain to find the real handler and identify which
     renderer L1 root is downstream.
   - If target = 0x825ED990: the abort handler IS the real
     dispatcher (likely with a debug callback installed at
     `mem[0x828A5B7C]`); inspect what `mem[0x828A5B7C]` is
     populated with at boot.
2. **Phase 2 (the session after next, once the prediction is sharp)**:
   - Add `KernelObject::NotifyListener { mask: u64, max_version: u32, is_system: bool, queue: VecDeque<(u32, u32)> }`.
   - Track all listeners on `KernelState` (`Vec<handle>`).
   - `xam_notify_create_listener`: build the listener, store it on
     state; on the first kXNotifySystem-mask listener auto-enqueue
     (0x9, IsUIActive=0), (0xA, 1); on the first kXNotifyLive-mask
     listener auto-enqueue (0x02000001, 0x001510F1), (0x02000003, 0).
   - `xnotify_get_next`: look up the listener by handle, dequeue the head
     (or the matching id if `match_id != 0`), write `*id_ptr` and
     `*param_ptr`, return 1 if dequeued else 0.
   - Update `--trace-imports` golden + lockstep `sylpheed_n*m.json`
     (instr count and downstream counts will shift).

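The Phase 2 steps can be sketched in isolation as follows. This is a minimal sketch under stated assumptions, not the planned patch: `ListenerRegistry` is a hypothetical stand-in for the listener tracking on `KernelState`, the two mask constants use assumed bit values, and handle lookup, guest-memory writes and version filtering are left out.

```rust
use std::collections::VecDeque;

const K_XNOTIFY_SYSTEM: u64 = 0x1; // ASSUMED bit value
const K_XNOTIFY_LIVE: u64 = 0x2; // ASSUMED bit value

/// Listener payload; field names follow the plan above.
#[derive(Debug, Default)]
struct NotifyListener {
    mask: u64,
    max_version: u32,
    is_system: bool,
    queue: VecDeque<(u32, u32)>, // (id, data)
}

impl NotifyListener {
    /// Dequeue the head, or the first entry whose id equals `match_id`
    /// when `match_id != 0`. Returns None when nothing matches.
    fn dequeue(&mut self, match_id: u32) -> Option<(u32, u32)> {
        if match_id == 0 {
            return self.queue.pop_front();
        }
        let pos = self.queue.iter().position(|&(id, _)| id == match_id)?;
        self.queue.remove(pos)
    }
}

/// Hypothetical registry that remembers which startup sets were seeded.
#[derive(Default)]
struct ListenerRegistry {
    system_seeded: bool,
    live_seeded: bool,
}

impl ListenerRegistry {
    /// Seed startup notifications on the first listener covering each area.
    fn register(&mut self, listener: &mut NotifyListener) {
        if !self.system_seeded && (listener.mask & K_XNOTIFY_SYSTEM) != 0 {
            self.system_seeded = true;
            listener.queue.push_back((0x0000_0009, 0)); // SystemUI, IsUIActive = 0
            listener.queue.push_back((0x0000_000A, 1)); // SignInChanged
        }
        if !self.live_seeded && (listener.mask & K_XNOTIFY_LIVE) != 0 {
            self.live_seeded = true;
            listener.queue.push_back((0x0200_0001, 0x0015_10F1)); // LiveConnectionChanged
            listener.queue.push_back((0x0200_0003, 0)); // LiveLinkStateChanged
        }
    }
}
```

Registering a listener with mask `0x2F` seeds all four entries; a second listener registered afterwards gets an empty queue, because each startup set is handed out only once.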
## Cascade prediction (provisional, for next session)

After the fix lands, expected:
- **Renderer L1 root**: TBD (requires the Phase 1 probe to resolve the
  dispatcher's actual vtable[1] target).
- **Canary-only export to fire**: `XamUserReadProfileSettings`
  (canary log line 2787, which fires after listener creation; the
  SignInChanged handler reads the just-signed-in user's profile).
- **signal_attempts**: renderer subsystem may activate without
  parked-handle interaction this step (notification handlers
  typically run on the calling thread, not via signal).
- **draws delta**: none expected this step. We're moving the boot
  horizon one hop, not yet to a draw-emitting subsystem.

## Files touched

This session: read-only. No code changes. New directory
`audit-runs/audit-010/` with this `findings.md` only.