xenia-rs/audit-runs/audit-010/findings.md
MechaCat02 8e709b0a24 chore: track audit-runs summary artifacts (md/csv/diff/txt/json/etc)
Snapshot of every non-log artifact under audit-runs/ from audits 003
through 058: findings.md per audit, comparison CSVs, probe diffs,
schema docs, register-dump txts, lr-trace JSONL streams, the saved
canary patch diffs, etc. ~284 files / ~52 MB total.

Excluded (per .gitignore): probe stdout/stderr/log streams (the raw
firehose), guest-memory dumps under audit-026/027/029 (4.5 GB of
.bin files; *.bin pattern added to .gitignore this commit).

Also adds the orphan audit-058-sub825070F0-activation directory that
a subagent accidentally created at project-root instead of
under xenia-rs/audit-runs/; relocated to its proper home.

Purpose: cross-machine continuity. With these summaries committed,
a fresh clone gives the next session the full per-audit context
(findings + tables + cascade predictions) without dependence on
local-only working tree.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 21:36:41 +02:00

# KRNBUG-AUDIT-010 — XNotify delivery diff (2026-05-05, READ-ONLY)
## Branch classification
**(α) — canary delivers a SPECIFIC set of notifications we don't, at a
specific point. Synthesis side is identifiable.**
## Verified ground truth
### Static — our impl
- `crates/xenia-kernel/src/xam.rs:358-361`: `xam_notify_create_listener`
is a stub that allocates a handle but **stores no listener object,
no queue, no mask**.
- `crates/xenia-kernel/src/xam.rs:363-366`: `xnotify_get_next` is a
stub that **always returns `r3 = 0`** (FALSE / no notifications).
- `crates/xenia-kernel/src/objects.rs:14-77`: the `KernelObject` enum has
variants `Event/Semaphore/File/Thread/Timer/Mutex` only. **No
`NotifyListener` variant exists** in our object model.
- No code anywhere in `crates/xenia-kernel/src/` references
`BroadcastNotification`, `EnqueueNotification`, or any notification
queue.
### Static — canary
- `xenia-canary/src/xenia/kernel/xam/xam_notify.cc:22-46`
`XamNotifyCreateListener_entry` constructs an `XNotifyListener`,
calls `Initialize(mask, is_system, max_version)`, returns its
handle.
- `xenia-canary/src/xenia/kernel/xam/xam_notify.cc:57-95`
`XNotifyGetNext_entry` looks up the listener by handle, calls
`DequeueNotification(&id, &param)`, returns 1 if dequeued.
- `xenia-canary/src/xenia/kernel/xnotifylistener.cc:25-51`
`Initialize` creates a manual-reset event, registers the listener
with `KernelState::RegisterNotifyListener(this)`. `EnqueueNotification`
appends to the queue and signals the wait_handle (mask + version
filtered).
- `xenia-canary/src/xenia/kernel/kernel_state.cc:1013-1033`
**`RegisterNotifyListener` enqueues 4 startup notifications on the
first listener whose mask covers `kXNotifySystem` / `kXNotifyLive`**:
  - `kXNotificationSystemUI = 0x00000009`, data = `IsUIActive()`
  - `kXNotificationSystemSignInChanged = 0x0000000A`, data = `1`
  - `kXNotificationLiveConnectionChanged = 0x02000001`, data = `0x001510F1`
    (`X_ONLINE_S_LOGON_DISCONNECTED`)
  - `kXNotificationLiveLinkStateChanged = 0x02000003`, data = `0`
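
The startup set above can be sketched as a pure function of the listener mask. This is a minimal sketch, not canary's code: the mask bit values (`0x1` for `kXNotifySystem`, `0x2` for `kXNotifyLive`) are assumptions, the function and constant names are hypothetical, and the "first listener only" gating is left out.

```rust
// Mask bits. The numeric values are ASSUMED, not read from canary here.
pub const K_XNOTIFY_SYSTEM: u64 = 0x1;
pub const K_XNOTIFY_LIVE: u64 = 0x2;

// Notification IDs and payloads as listed in this audit.
pub const SYSTEM_UI: u32 = 0x0000_0009;
pub const SYSTEM_SIGN_IN_CHANGED: u32 = 0x0000_000A;
pub const LIVE_CONNECTION_CHANGED: u32 = 0x0200_0001;
pub const LIVE_LINK_STATE_CHANGED: u32 = 0x0200_0003;
pub const X_ONLINE_S_LOGON_DISCONNECTED: u32 = 0x0015_10F1;

/// Returns the (id, data) pairs to queue for a newly registered listener.
/// Hypothetical helper; `ui_active` stands in for canary's `IsUIActive()`.
pub fn startup_notifications(mask: u64, ui_active: bool) -> Vec<(u32, u32)> {
    let mut out = Vec::new();
    if (mask & K_XNOTIFY_SYSTEM) != 0 {
        out.push((SYSTEM_UI, ui_active as u32));
        out.push((SYSTEM_SIGN_IN_CHANGED, 1));
    }
    if (mask & K_XNOTIFY_LIVE) != 0 {
        out.push((LIVE_CONNECTION_CHANGED, X_ONLINE_S_LOGON_DISCONNECTED));
        out.push((LIVE_LINK_STATE_CHANGED, 0));
    }
    out
}
```

With the mask `0x2F` and the UI inactive, this yields all four entries in the order listed.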
### Runtime — canary
From `/home/fabi/xenia_canary_windows/xenia.log`:
- Line 1395: `XamNotifyCreateListener(0x000000000000002F, 0x00000000)`
— mask `0x2F = kXNotifySystem|kXNotifyLive|kXNotifyFriends|kXNotifyCustom|kXNotifyXmp`,
max_version=0. Mask covers both `kXNotifySystem` and `kXNotifyLive`,
so all 4 startup notifications are queued by `RegisterNotifyListener`.
- Line 2787: `XamUserReadProfileSettings(0, 0, 0, 0, 8, 701CEC80=10040015, ...)`
fires AFTER the listener is created and the notifications are
delivered. This strongly suggests it is the result of the SignInChanged
dispatch (the sign-in handler reads the now-signed-in user's profile).
- Line 1426: `VdInitializeEngines(...)` then continued progression
through the renderer cluster + `MmAllocatePhysicalMemoryEx` for
ring buffer + `VdEnableRingBufferRPtrWriteBack` etc. — full boot
trajectory.
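
The mask decomposition in the first bullet can be checked mechanically. A small sketch, assuming the per-flag bit values below (they are chosen so that the five named flags sum to the logged `0x2F`; they are not quoted from canary's headers here), with `decode_mask` as a hypothetical helper:

```rust
/// Flag table. Bit values are ASSUMPTIONS consistent with 0x2F.
const MASK_BITS: [(u64, &str); 5] = [
    (0x01, "kXNotifySystem"),
    (0x02, "kXNotifyLive"),
    (0x04, "kXNotifyFriends"),
    (0x08, "kXNotifyCustom"),
    (0x20, "kXNotifyXmp"),
];

/// Splits a listener mask into known flag names plus any leftover bits.
pub fn decode_mask(mask: u64) -> (Vec<&'static str>, u64) {
    let mut names = Vec::new();
    let mut rest = mask;
    for &(bit, name) in MASK_BITS.iter() {
        if (mask & bit) != 0 {
            names.push(name);
            rest &= !bit; // clear each bit we could name
        }
    }
    (names, rest)
}
```

A non-zero second element would mean the title asked for a notification area this table does not know about.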
### Runtime — ours (audit-009 / -n 500M)
From `audit-runs/audit-009/probe-500m.err`:
- `kernel.calls{name=XamNotifyCreateListener} = 1` — the stub fires.
- `kernel.calls{name=XNotifyGetNext} = 1,489,741` — main poll loop
hammers it ~1.5M times in 500M instructions.
- `kernel.calls{name=XNotifyPositionUI} = 1` — fires once.
- 0/21 renderer-cluster + producer-callsite probe PCs fire.
- **Canary-only exports unchanged (still never called on our side):
`ExTerminateThread`, `KeReleaseSemaphore`, `XamUserReadProfileSettings`.**
`XamUserReadProfileSettings` is the prediction target after the fix.
## Consumer side — Sylpheed's notification dispatch path
Static analysis of `sub_822F1AA8` (the main frame-poll loop), via
`sylpheed.db`:
```
0x822f1bcc addi r6, r31, 84 ; r6 = &param
0x822f1bd0 lwz r3, 132(r30) ; r3 = block.listener_handle ([block+132])
0x822f1bd4 addi r5, r31, 88 ; r5 = &id
0x822f1bd8 addi r4, r0, 0 ; r4 = match_id = 0 (any)
0x822f1bdc bl 0x8284E45C ; XNotifyGetNext(handle, 0, &id, &param)
0x822f1be0 cmpi cr6, 0, r3, 0
0x822f1be4 bc ..., 0x822F1C20 ; if r3==0, jump past the dispatch block
0x822f1be8 lwz r3, 7944(r25) ; r25=0x828E0000 → r3 = mem[0x828E1F08] = outer
0x822f1bec lwz r5, 84(r31) ; r5 = param/data (slot passed as &param in r6)
0x822f1bf0 lwz r4, 88(r31) ; r4 = id (slot passed as &id in r5)
0x822f1bf4 lwz r11, 0(r3) ; r11 = outer.vtable
0x822f1bf8 lwz r11, 4(r11) ; r11 = vtable[1] = OnNotify(this, id, data)
0x822f1bfc mtspr CTR, r11
0x822f1c00 bcctrl 20, lt ; call OnNotify
0x822f1c04 ... ; loop back to 0x822F1BE8 (drain queue)
```
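
Read as control flow, the listing above is a drain loop: poll, and dispatch each dequeued entry to the object in the dispatcher slot. The sketch below restates that shape only. The two traits and the function name are hypothetical stand-ins, and the argument order of `on_notify` is illustrative rather than a claim about the guest ABI.

```rust
use std::collections::VecDeque;

/// Hypothetical stand-in for the guest's XNotifyGetNext import.
pub trait NotifySource {
    /// Returns Some((id, param)) when a notification was dequeued.
    fn get_next(&mut self, handle: u32, match_id: u32) -> Option<(u32, u32)>;
}

/// Hypothetical stand-in for the object held in the dispatcher slot.
pub trait NotifySink {
    fn on_notify(&mut self, id: u32, param: u32);
}

/// One pass of the poll loop: drain the queue, dispatching each entry.
pub fn drain_notifications<S: NotifySource, K: NotifySink>(
    src: &mut S,
    handle: u32,
    sink: &mut K,
) -> usize {
    let mut dispatched = 0;
    // match_id = 0 asks for any notification, as at 0x822f1bd8.
    while let Some((id, param)) = src.get_next(handle, 0) {
        sink.on_notify(id, param);
        dispatched += 1;
    }
    dispatched
}

// Test doubles: a plain queue as the source, a Vec as the recording sink.
impl NotifySource for VecDeque<(u32, u32)> {
    fn get_next(&mut self, _handle: u32, _match_id: u32) -> Option<(u32, u32)> {
        self.pop_front()
    }
}

impl NotifySink for Vec<(u32, u32)> {
    fn on_notify(&mut self, id: u32, param: u32) {
        self.push((id, param));
    }
}
```

With a source that always returns `None`, which is what our current stub amounts to, the loop body never runs and the dispatch site is never reached.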
### Construction chain
- `sub_8216EA68` (main) at 0x8216ECAC: `bl sub_822F2758(r3=&outer_on_stack)`.
- `sub_822F2758`:
  - 0x822f2788: `stw r11, 0(r30)` with `r11 = 0x820AD894` →
    **outer.vtable = 0x820AD894**.
  - 0x822f2790: `bl sub_82150EF8` allocates a 288-byte block.
  - If non-null, 0x822f27a4: `bl sub_822F14D8(block, outer)`.
  - 0x822f27b8: `stw block, 4(outer)` → outer[4] = block.
- `sub_822F14D8`:
  - 0x822f15a0: `bl sub_826124A0` (= tail-jump to `XamNotifyCreateListener`
    with `r3 = 0x2F = mask`, `r4 = 0` set inside the trampoline).
  - 0x822f15a8: `stw listener_handle, 132(block)`.
  - 0x822f15c4-c8: `stw outer, 7944(0x828E0000)` →
    **mem[0x828E1F08] = outer** (the dispatcher slot).
- `sub_822F1638` is the destructor; clears mem[0x828E1F08]=0.
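
The net effect of the chain is four word stores plus one clear in the destructor. The sketch below replays them against a sparse map so the resulting memory state is easy to inspect. `GuestMem`, `construct`, and `destruct` are hypothetical names, and the `outer`, `block`, and handle values a caller passes in are placeholders, not observed addresses.

```rust
use std::collections::HashMap;

pub const OUTER_VTABLE: u32 = 0x820A_D894;
pub const DISPATCHER_SLOT: u32 = 0x828E_1F08; // 0x828E0000 + 7944
pub const BLOCK_LISTENER_OFFSET: u32 = 132;

/// Sparse, word-granular guest memory (illustrative model only).
pub type GuestMem = HashMap<u32, u32>;

/// Replays the stores made by sub_822F2758 / sub_822F14D8.
pub fn construct(mem: &mut GuestMem, outer: u32, block: u32, listener_handle: u32) {
    mem.insert(outer, OUTER_VTABLE); // 0x822f2788
    mem.insert(block + BLOCK_LISTENER_OFFSET, listener_handle); // 0x822f15a8
    mem.insert(DISPATCHER_SLOT, outer); // 0x822f15c4-c8
    mem.insert(outer + 4, block); // 0x822f27b8
}

/// Replays the destructor's effect on the dispatcher slot (sub_822F1638).
pub fn destruct(mem: &mut GuestMem) {
    mem.insert(DISPATCHER_SLOT, 0);
}
```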
### vtable resolution (read directly from .pe at file offset 0xAD894)
- `outer.vtable = 0x820AD894`.
- vtable[0] = `0x825ED990`
- vtable[1] = `0x825ED990` ← OnNotify
- vtable[2,3] = `0x825ED990`
- vtable[4] = `0x824C8F00` (= `bclr 20, lt` — empty 1-instruction return)
- vtable[5,6] = `0x825ED990`
- vtable[7] = `0x824C8F00`
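
Reading the slots back out of the image is a matter of big-endian word loads. A sketch under two assumptions: the image base is `0x82000000`, and virtual addresses map flat onto file offsets (both inferred from the vtable at `0x820AD894` sitting at file offset `0xAD894`). The helper names are hypothetical.

```rust
/// ASSUMED image base; inferred from VA 0x820AD894 <-> file offset 0xAD894.
pub const IMAGE_BASE: u32 = 0x8200_0000;

/// Maps a guest virtual address to a file offset under the flat assumption.
pub fn file_offset(va: u32) -> Option<usize> {
    va.checked_sub(IMAGE_BASE).map(|o| o as usize)
}

/// Reads `slots` consecutive big-endian u32 entries starting at `va`.
/// Returns None if the range falls outside the image bytes.
pub fn read_vtable(image: &[u8], va: u32, slots: usize) -> Option<Vec<u32>> {
    let start = file_offset(va)?;
    let end = start.checked_add(slots.checked_mul(4)?)?;
    let bytes = image.get(start..end)?;
    Some(
        bytes
            .chunks_exact(4)
            .map(|c| u32::from_be_bytes([c[0], c[1], c[2], c[3]]))
            .collect(),
    )
}
```

Called as `read_vtable(&image, 0x820AD894, 8)` on the loaded image bytes, this would return the eight slots tabulated above if the flat-mapping assumption holds.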
`sub_825ED990` body (`addr 0x825ED990`):
```
mfspr r12, LR
stw r12, -8(r1)
stwu r1, -96(r1)
addis r11, r0, 0x828A
lwz r11, 23420(r11) ; r11 = mem[0x828A5B7C]
cmpli 0, r11, 0x0
bc 12, eq, 0x825ED9B4 ; skip indirect call if pointer is null
mtspr CTR, r11
bcctrl 20, lt ; call mem[0x828A5B7C](r3=this, r4=id, r5=data), args passed through unchanged
addi r3, r0, 25 ; r3 = 25
bl 0x825F6B90 ; ?
addi r4, r0, 1
addi r3, r0, 0
bl 0x825F50D0 ; (atomic-OR on mem[0x82887808])
bl 0x825F5020 ; (long function — looks like _exit)
return
```
**Static reading is suspicious**: vtable[1] looks like a "must-override"
base-class abort handler (`__purecall`-style) that calls a registered
debug callback then runs an exit code path. Yet canary runs Sylpheed
fine through this dispatch site, so either (i) `mem[0x828A5B7C]` is
populated with the real dispatcher and the post-call sequence does
not exit, or (ii) some derived-class instance overwrites the vtable
between construction and first dispatch (no such write was visible in
xrefs to mem[0x828E1F08] beyond the constructor and destructor).
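
The absolute addresses quoted in this section (`mem[0x828E1F08]`, `mem[0x828A5B7C]`) come from standard PowerPC D-form arithmetic: base plus a sign-extended 16-bit displacement, with `addis rD, r0, imm` producing `imm << 16`. A short sketch for re-deriving them; the helper names are hypothetical and only the low 32 bits are modelled.

```rust
/// Effective address of a D-form load/store:
/// base + sign-extended 16-bit displacement (32-bit wraparound).
pub fn d_form_ea(base: u32, disp: i16) -> u32 {
    base.wrapping_add(disp as i32 as u32)
}

/// Low 32 bits of `addis rD, r0, imm` (rA = 0 means the literal zero).
pub fn addis_zero(imm: u16) -> u32 {
    (imm as u32) << 16
}
```

For example, `d_form_ea(0x828E0000, 7944)` gives the dispatcher slot, and `d_form_ea(addis_zero(0x828A), 23420)` gives the callback pointer read by `sub_825ED990`.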
## Discipline gate
| Box | Status | Notes |
|-----|--------|-------|
| 1. Specific missing notification with canary file:line | ✅ | 4 IDs, kernel_state.cc:1013-1033, xnotifylistener.cc:25-51, xam_notify.cc:57-95 |
| 2. Synthesis is small kernel/xam-side change <80 LOC | ✅ | est. ~70 LOC: KernelObject::NotifyListener variant + register-listener auto-enqueue + xnotify_get_next dequeue |
| 3. Sharp 4-dim cascade prediction can be written | ❌ | Cannot name renderer L1 root — outer's vtable[1] resolves statically to an apparent abort handler (`sub_825ED990`); actual runtime dispatch target is opaque without instrumentation |
| 4. No renderer/GPU code changes | ✅ | Pure kernel-state/xam change |
**Box 3 fails. STOP. Hand back diagnostic-only.**
## Next-session plan
This is the first session past the kernel-boundary cascade. The
evidence for the notification gap is overwhelming (1.49M unanswered
`XNotifyGetNext` calls), but **the dispatcher's vtable[1] target
needs a one-shot runtime probe before we can write a sharp prediction**.
1. **Phase 1 first**: temporarily patch `xnotify_get_next` to
return one synthetic notification (e.g. `id=0x0A=SignInChanged,
data=1`) on the first call. Add `--pc-probe=0x822f1bfc,0x822f1c00`
to capture the actual vtable[1] target via the `bcctrl`. Re-run
with -n 100M. Read off the dispatcher target. Revert the temporary
patch.
   - If target ≠ 0x825ED990: the vtable was dynamically replaced;
     follow the chain to find the real handler and identify which
     renderer L1 root is downstream.
   - If target = 0x825ED990: the abort handler IS the real
     dispatcher (likely with a debug callback installed at
     `mem[0x828A5B7C]`); inspect what `mem[0x828A5B7C]` is
     populated with at boot.
2. **Phase 2 (next-next session, after the prediction is sharp)**:
   - Add `KernelObject::NotifyListener { mask: u64, max_version: u32, is_system: bool, queue: VecDeque<(u32, u32)> }`.
   - Track all listeners on `KernelState` (`Vec<handle>`).
   - `xam_notify_create_listener`: build the listener and store it on
     state. On the first listener whose mask includes kXNotifySystem,
     auto-enqueue (0x9, IsUIActive=0) and (0xA, 1); on the first whose
     mask includes kXNotifyLive, auto-enqueue (0x02000001, 0x001510F1)
     and (0x02000003, 0).
   - `xnotify_get_next`: look up the listener by handle, dequeue the head
     (or the entry whose id matches if `match_id != 0`), write `*id_ptr`
     and `*param_ptr`, and return 1 if dequeued, else 0.
   - Update the `--trace-imports` golden + lockstep `sylpheed_n*m.json`
     (instr count and downstream counts will shift).
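
A minimal sketch of the Phase 2 queue semantics, assuming the field layout proposed above. It covers only enqueue and the head-or-matching-id dequeue; mask and version filtering, handle lookup, and the first-listener auto-enqueue are left out, and the free function only mirrors the shape of the `xnotify_get_next` export (out-params as `&mut u32` rather than guest pointers).

```rust
use std::collections::VecDeque;

#[derive(Debug, Default)]
pub struct NotifyListener {
    pub mask: u64,
    pub max_version: u32,
    pub is_system: bool,
    pub queue: VecDeque<(u32, u32)>, // (id, data), oldest first
}

impl NotifyListener {
    pub fn new(mask: u64, max_version: u32, is_system: bool) -> Self {
        Self { mask, max_version, is_system, queue: VecDeque::new() }
    }

    /// Appends a notification. Filtering by mask/version is NOT modelled.
    pub fn enqueue(&mut self, id: u32, data: u32) {
        self.queue.push_back((id, data));
    }

    /// match_id == 0: pop the head. Otherwise remove the oldest entry
    /// whose id equals match_id, leaving the rest in order.
    pub fn dequeue(&mut self, match_id: u32) -> Option<(u32, u32)> {
        if match_id == 0 {
            return self.queue.pop_front();
        }
        let pos = self.queue.iter().position(|&(id, _)| id == match_id)?;
        self.queue.remove(pos)
    }
}

/// Shape of the export: returns the r3 value (1 = dequeued, 0 = nothing)
/// and writes the out-params only on success.
pub fn xnotify_get_next(
    listener: Option<&mut NotifyListener>,
    match_id: u32,
    id_out: &mut u32,
    param_out: &mut u32,
) -> u32 {
    match listener.and_then(|l| l.dequeue(match_id)) {
        Some((id, param)) => {
            *id_out = id;
            *param_out = param;
            1
        }
        None => 0,
    }
}
```

Passing `None` for the listener models a failed handle lookup, which returns 0 just like an empty queue.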
## Cascade prediction (provisional, for next session)
After the fix lands, expected:
- **Renderer L1 root**: TBD (requires the Phase 1 probe to resolve the
dispatcher's actual vtable[1] target).
- **Canary-only export to fire**: `XamUserReadProfileSettings`
(canary line 2787 — fires post-listener-create; SignInChanged
handler reads the just-signed-in user's profile).
- **signal_attempts**: renderer subsystem may activate without
parked-handle interaction this step (notification handlers
typically run on the calling thread, not via signal).
- **draws delta**: none expected this step. We're moving the boot
horizon one hop, not yet to a draw-emitting subsystem.
## Files touched
This session: read-only. No code changes. New directory
`audit-runs/audit-010/` with this `findings.md` only.