xenia-rs/audit-runs/audit-010/findings.md
MechaCat02 8e709b0a24 chore: track audit-runs summary artifacts (md/csv/diff/txt/json/etc)
Snapshot of every non-log artifact under audit-runs/ from audits 003
through 058: findings.md per audit, comparison CSVs, probe diffs,
schema docs, register-dump txts, lr-trace JSONL streams, the saved
canary patch diffs, etc. ~284 files / ~52 MB total.

Excluded (per .gitignore): probe stdout/stderr/log streams (the raw
firehose), guest-memory dumps under audit-026/027/029 (4.5 GB of
.bin files; *.bin pattern added to .gitignore this commit).

Also adds the orphan audit-058-sub825070F0-activation directory that
a subagent accidentally created at project-root instead of
under xenia-rs/audit-runs/; relocated to its proper home.

Purpose: cross-machine continuity. With these summaries committed,
a fresh clone gives the next session the full per-audit context
(findings + tables + cascade predictions) without dependence on
local-only working tree.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 21:36:41 +02:00


KRNBUG-AUDIT-010 — XNotify delivery diff (2026-05-05, READ-ONLY)

Branch classification

(α) — canary delivers a SPECIFIC set of notifications we don't, at a specific point. Synthesis side is identifiable.

Verified ground truth

Static — our impl

  • crates/xenia-kernel/src/xam.rs:358-361: xam_notify_create_listener is a stub that allocates a handle but stores no listener object, no queue, no mask.
  • crates/xenia-kernel/src/xam.rs:363-366: xnotify_get_next is a stub that always returns r3 = 0 (FALSE / no notifications).
  • crates/xenia-kernel/src/objects.rs:14-77: the KernelObject enum has only the variants Event/Semaphore/File/Thread/Timer/Mutex. No NotifyListener variant exists in our object model.
  • No code anywhere in crates/xenia-kernel/src/ references BroadcastNotification, EnqueueNotification, or any notification queue.

Static — canary

  • xenia-canary/src/xenia/kernel/xam/xam_notify.cc:22-46: XamNotifyCreateListener_entry constructs an XNotifyListener, calls Initialize(mask, is_system, max_version), and returns its handle.
  • xenia-canary/src/xenia/kernel/xam/xam_notify.cc:57-95: XNotifyGetNext_entry looks up the listener by handle, calls DequeueNotification(&id, &param), and returns 1 if a notification was dequeued.
  • xenia-canary/src/xenia/kernel/xnotifylistener.cc:25-51: Initialize creates a manual-reset event and registers the listener with KernelState::RegisterNotifyListener(this). EnqueueNotification appends to the queue and signals the wait_handle (mask + version filtered).
  • xenia-canary/src/xenia/kernel/kernel_state.cc:1013-1033: RegisterNotifyListener enqueues 4 startup notifications on the first listener whose mask covers kXNotifySystem / kXNotifyLive:
    • kXNotificationSystemUI = 0x00000009, data = IsUIActive()
    • kXNotificationSystemSignInChanged = 0x0000000A, data = 1
    • kXNotificationLiveConnectionChanged = 0x02000001, data = 0x001510F1 (X_ONLINE_S_LOGON_DISCONNECTED)
    • kXNotificationLiveLinkStateChanged = 0x02000003, data = 0

Runtime — canary

From /home/fabi/xenia_canary_windows/xenia.log:

  • Line 1395: XamNotifyCreateListener(0x000000000000002F, 0x00000000) — mask 0x2F = kXNotifySystem|kXNotifyLive|kXNotifyFriends|kXNotifyCustom|kXNotifyXmp, max_version=0. Mask covers both kXNotifySystem and kXNotifyLive, so all 4 startup notifications are queued by RegisterNotifyListener.
  • Line 2787: XamUserReadProfileSettings(0, 0, 0, 0, 8, 701CEC80=10040015, ...) fires AFTER the listener is created and the notifications are delivered. This strongly suggests it is the result of the SignInChanged dispatch (the sign-in handler reads the now-signed-in user's profile).
  • Line 1426: VdInitializeEngines(...) then continued progression through the renderer cluster + MmAllocatePhysicalMemoryEx for ring buffer + VdEnableRingBufferRPtrWriteBack etc. — full boot trajectory.
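The mask decomposition at line 1395 can be checked mechanically. The sketch below is illustrative only: the individual bit values are an assumption (they are the ones that make the five named areas sum to 0x2F, and are not quoted from the canary headers here), and `decode_mask` is a hypothetical helper, not existing xenia-rs code.

```rust
// Assumed bit values for the notification areas named above.
const K_XNOTIFY_SYSTEM: u64 = 0x01;
const K_XNOTIFY_LIVE: u64 = 0x02;
const K_XNOTIFY_FRIENDS: u64 = 0x04;
const K_XNOTIFY_CUSTOM: u64 = 0x08;
const K_XNOTIFY_XMP: u64 = 0x20;

const AREAS: [(&str, u64); 5] = [
    ("kXNotifySystem", K_XNOTIFY_SYSTEM),
    ("kXNotifyLive", K_XNOTIFY_LIVE),
    ("kXNotifyFriends", K_XNOTIFY_FRIENDS),
    ("kXNotifyCustom", K_XNOTIFY_CUSTOM),
    ("kXNotifyXmp", K_XNOTIFY_XMP),
];

/// Returns the names of every known area whose bit is set in `mask`,
/// in the order they appear in `AREAS`.
fn decode_mask(mask: u64) -> Vec<&'static str> {
    AREAS
        .iter()
        .filter(|&&(_, bit)| (mask & bit) != 0)
        .map(|&(name, _)| name)
        .collect()
}
```

Under these assumed values, `decode_mask(0x2F)` names all five areas, which is what makes both the kXNotifySystem and the kXNotifyLive startup pairs eligible for this listener.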

Runtime — ours (audit-009 / -n 500M)

From audit-runs/audit-009/probe-500m.err:

  • kernel.calls{name=XamNotifyCreateListener} = 1 — the stub fires.
  • kernel.calls{name=XNotifyGetNext} = 1,489,741 — main poll loop hammers it ~1.5M times in 500M instructions.
  • kernel.calls{name=XNotifyPositionUI} = 1 — fires once.
  • 0/21 renderer-cluster + producer-callsite probe PCs fire.
  • canary-only exports unchanged: ExTerminateThread, KeReleaseSemaphore, XamUserReadProfileSettings. XamUserReadProfileSettings is the prediction target after the fix.

Consumer side — Sylpheed's notification dispatch path

Static analysis of sub_822F1AA8 (the main frame-poll loop), via sylpheed.db:

0x822f1bcc  addi    r6, r31, 84         ; r6 = &param  
0x822f1bd0  lwz     r3, 132(r30)         ; r3 = block.listener_handle ([block+132])
0x822f1bd4  addi    r5, r31, 88          ; r5 = &id
0x822f1bd8  addi    r4, r0, 0            ; r4 = match_id = 0 (any)
0x822f1bdc  bl      0x8284E45C           ; XNotifyGetNext(handle, 0, &id, &param)
0x822f1be0  cmpi    cr6, 0, r3, 0
0x822f1be4  bc      ..., 0x822F1C20      ; if r3==0, jump past the dispatch block
0x822f1be8  lwz     r3, 7944(r25)        ; r25=0x828E0000 → r3 = mem[0x828E1F08] = outer
0x822f1bec  lwz     r5, 84(r31)          ; r5 = param/data (slot passed as &param at 0x822f1bcc)
0x822f1bf0  lwz     r4, 88(r31)          ; r4 = id (slot passed as &id at 0x822f1bd4)
0x822f1bf4  lwz     r11, 0(r3)           ; r11 = outer.vtable
0x822f1bf8  lwz     r11, 4(r11)          ; r11 = vtable[1] = OnNotify(this, id, data)
0x822f1bfc  mtspr   CTR, r11
0x822f1c00  bcctrl  20, lt                ; call OnNotify
0x822f1c04  ...                          ; loop back to 0x822F1BE8 (drain queue)
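Read at a higher level, the listing is a poll-and-dispatch loop. The sketch below restates it in Rust under one stated assumption: that each iteration re-polls XNotifyGetNext before dispatching again (the elided tail of the listing does not show this). `drain_notifications` and both closures are hypothetical names for illustration, not guest or emulator symbols.

```rust
/// Polls `get_next` with match_id = 0 ("any notification") and forwards each
/// dequeued (id, param) pair to `on_notify` until the queue reports empty.
/// Returns how many notifications were dispatched.
fn drain_notifications<G, H>(mut get_next: G, mut on_notify: H) -> usize
where
    G: FnMut(u32) -> Option<(u32, u32)>, // stands in for XNotifyGetNext
    H: FnMut(u32, u32),                  // stands in for the vtable[1] call
{
    let mut dispatched = 0;
    // `None` corresponds to r3 == 0, i.e. the branch to 0x822F1C20.
    while let Some((id, param)) = get_next(0) {
        on_notify(id, param);
        dispatched += 1;
    }
    dispatched
}
```

With our current stub, `get_next` behaves like `|_| None`, so the dispatch arm is never reached. That is consistent with the 1,489,741 polls and zero downstream activity recorded above.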

Construction chain

  • sub_8216EA68 (main) at 0x8216ECAC: bl sub_822F2758(r3=&outer_on_stack).
  • sub_822F2758:
    • 0x822f2788: stw r11, 0(r30) with r11 = 0x820AD894 → outer.vtable = 0x820AD894.
    • 0x822f2790: bl sub_82150EF8 allocates 288-byte block.
    • if non-null, 0x822f27a4: bl sub_822F14D8(block, outer).
    • 0x822f27b8: stw block, 4(outer) → outer[4] = block.
  • sub_822F14D8:
    • 0x822f15a0: bl sub_826124A0 (= tail-jump to XamNotifyCreateListener with r3 = 0x2F = mask, r4 = 0 set inside the trampoline).
    • 0x822f15a8: stw listener_handle, 132(block).
    • 0x822f15c4-c8: stw outer, 7944(0x828E0000) → mem[0x828E1F08] = outer (the dispatcher slot).
  • sub_822F1638 is the destructor; clears mem[0x828E1F08]=0.

vtable resolution (read directly from .pe at file offset 0xAD894)

  • outer.vtable = 0x820AD894.
  • vtable[0] = 0x825ED990
  • vtable[1] = 0x825ED990 ← OnNotify
  • vtable[2,3] = 0x825ED990
  • vtable[4] = 0x824C8F00 (= bclr 20, lt — empty 1-instruction return)
  • vtable[5,6] = 0x825ED990
  • vtable[7] = 0x824C8F00
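For repeatability, the slot read can be scripted. This is a minimal sketch assuming the image is loaded at 0x82000000 and that file offsets equal RVAs (both inferred from 0x820AD894 mapping to offset 0xAD894, not verified against the section table). `va_to_offset` and `read_vtable` are hypothetical helpers.

```rust
/// Converts a guest virtual address to an offset into the image bytes,
/// assuming a flat mapping from `base`. Returns None if `va` is below `base`.
fn va_to_offset(base: u32, va: u32) -> Option<usize> {
    va.checked_sub(base).map(|off| off as usize)
}

/// Reads `count` consecutive big-endian 32-bit slots starting at `va`.
/// Returns None if the range falls outside `image`.
fn read_vtable(image: &[u8], base: u32, va: u32, count: usize) -> Option<Vec<u32>> {
    let start = va_to_offset(base, va)?;
    let end = start.checked_add(count.checked_mul(4)?)?;
    let bytes = image.get(start..end)?;
    Some(
        bytes
            .chunks_exact(4)
            // Guest code is big-endian PowerPC, so slots are decoded as BE.
            .map(|c| u32::from_be_bytes([c[0], c[1], c[2], c[3]]))
            .collect(),
    )
}
```

Calling `read_vtable(&image, 0x8200_0000, 0x820A_D894, 8)` on the title image would be expected to reproduce the eight slots listed above, provided the flat-mapping assumption holds.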

sub_825ED990 body (addr 0x825ED990):

mfspr   r12, LR
stw     r12, -8(r1)
stwu    r1, -96(r1)
addis   r11, r0, 0x828A
lwz     r11, 23420(r11)        ; r11 = mem[0x828A5B7C]
cmpli   0, r11, 0x0
bc      12, eq, 0x825ED9B4      ; skip indirect call if pointer is null
mtspr   CTR, r11
bcctrl  20, lt                  ; call mem[0x828A5B7C](r3=this, r4=id, r5=data)
addi    r3, r0, 25              ; r3 = 25
bl      0x825F6B90              ; ?
addi    r4, r0, 1
addi    r3, r0, 0
bl      0x825F50D0              ; (atomic-OR on mem[0x82887808])
bl      0x825F5020              ; (long function — looks like _exit)
return

Static reading is suspicious: vtable[1] looks like a "must-override" base-class abort handler (__purecall-style) that calls a registered debug callback then runs an exit code path. Yet canary runs Sylpheed fine through this dispatch site, so either (i) mem[0x828A5B7C] is populated with the real dispatcher and the post-call sequence does not exit, or (ii) some derived-class instance overwrites the vtable between construction and first dispatch (no such write was visible in xrefs to mem[0x828E1F08] beyond the constructor and destructor).
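Both global slots in this audit (mem[0x828A5B7C] and mem[0x828E1F08]) are formed the same way: a high half loaded by addis (or already held in a register) plus a signed 16-bit displacement on the load/store. A small helper makes that arithmetic checkable. `addis_disp_ea` is a hypothetical name, and the sketch covers only the rA = 0 form of addis used in the listing above.

```rust
/// Effective address of a D-form load/store whose base register was set by
/// `addis rX, 0, hi`: (hi << 16) plus the sign-extended 16-bit displacement.
fn addis_disp_ea(hi: u16, disp: i16) -> u32 {
    // Sign-extend the displacement to 32 bits, then add with wraparound,
    // matching 32-bit guest address arithmetic.
    ((hi as u32) << 16).wrapping_add(disp as i32 as u32)
}
```

For example, `addis_disp_ea(0x828A, 23420)` yields 0x828A5B7C (23420 = 0x5B7C), and `addis_disp_ea(0x828E, 7944)` yields 0x828E1F08 (7944 = 0x1F08).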

Discipline gate

| Box | Status | Notes |
|---|---|---|
| 1. Specific missing notification with canary file:line | pass | 4 IDs, kernel_state.cc:1013-1033, xnotifylistener.cc:25-51, xam_notify.cc:57-95 |
| 2. Synthesis is small kernel/xam-side change <80 LOC | pass | est. ~70 LOC: KernelObject::NotifyListener variant + register-listener auto-enqueue + xnotify_get_next dequeue |
| 3. Sharp 4-dim cascade prediction can be written | FAIL | Cannot name renderer L1 root: outer's vtable[1] resolves statically to an apparent abort handler (sub_825ED990); actual runtime dispatch target is opaque without instrumentation |
| 4. No renderer/GPU code changes | pass | Pure kernel-state/xam change |

Box 3 fails. STOP. Hand back diagnostic-only.

Next-session plan

This is the first session past the kernel-boundary cascade. The evidence for the notification gap is overwhelming (1.49M unanswered XNotifyGetNext calls), but the dispatcher's vtable[1] target needs a one-shot runtime probe before we can write a sharp prediction.

  1. Phase 1 first: temporarily patch xnotify_get_next to return one synthetic notification (e.g. id=0x0A=SignInChanged, data=1) on the first call. Add --pc-probe=0x822f1bfc,0x822f1c00 to capture the actual vtable[1] target via the bcctrl. Re-run -n 100M. Read off the dispatcher target. Revert the temporary stub.
    • If target ≠ 0x825ED990: vtable was dynamically replaced; follow the chain to find the real handler, identify which renderer L1 root is downstream.
    • If target = 0x825ED990: the abort handler IS the real dispatcher (likely with a debug callback installed at mem[0x828A5B7C]); inspect what mem[0x828A5B7C] is populated with at boot.
  2. Phase 2 (next-next session, after the prediction is sharp):
    • Add KernelObject::NotifyListener { mask: u64, max_version: u32, is_system: bool, queue: VecDeque<(u32, u32)> }.
    • Track all listeners on KernelState (Vec).
    • xam_notify_create_listener: build the listener, store on state, on first kXNotifySystem-mask listener auto-enqueue (0x9, IsUIActive=0), (0xA, 1); on first kXNotifyLive-mask auto-enqueue (0x02000001, 0x001510F1), (0x02000003, 0).
    • xnotify_get_next: lookup listener by handle, dequeue head (or matching id if match_id != 0), write *id_ptr and *param_ptr, return 1 if dequeued else 0.
    • Update --trace-imports golden + lockstep sylpheed_n*m.json (instr count and downstream counts will shift).
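The Phase 2 bullets can be sketched as a standalone model before touching the real crate. Everything below is hypothetical: `NotifyState`, the `*_seeded` flags, and the use of a Vec index as the handle are illustration choices, not the existing KernelState API. IsUIActive is assumed to be 0, as in the plan above, and the mask bit values are assumed.

```rust
use std::collections::VecDeque;

const K_XNOTIFY_SYSTEM: u64 = 0x1; // assumed bit value
const K_XNOTIFY_LIVE: u64 = 0x2; // assumed bit value

#[allow(dead_code)]
struct NotifyListener {
    mask: u64,
    queue: VecDeque<(u32, u32)>, // (id, data) pairs
}

#[derive(Default)]
struct NotifyState {
    listeners: Vec<NotifyListener>,
    system_seeded: bool,
    live_seeded: bool,
}

impl NotifyState {
    /// Creates a listener and seeds the startup notifications on the first
    /// listener covering each area. Returns the listener's index as a handle.
    fn create_listener(&mut self, mask: u64) -> usize {
        let mut queue = VecDeque::new();
        if !self.system_seeded && (mask & K_XNOTIFY_SYSTEM) != 0 {
            queue.push_back((0x0000_0009, 0)); // SystemUI, IsUIActive = 0
            queue.push_back((0x0000_000A, 1)); // SystemSignInChanged
            self.system_seeded = true;
        }
        if !self.live_seeded && (mask & K_XNOTIFY_LIVE) != 0 {
            queue.push_back((0x0200_0001, 0x0015_10F1)); // LiveConnectionChanged
            queue.push_back((0x0200_0003, 0)); // LiveLinkStateChanged
            self.live_seeded = true;
        }
        self.listeners.push(NotifyListener { mask, queue });
        self.listeners.len() - 1
    }

    /// Dequeues the head, or the first entry whose id equals `match_id`
    /// when `match_id != 0`. None maps to the export returning 0.
    fn get_next(&mut self, handle: usize, match_id: u32) -> Option<(u32, u32)> {
        let queue = &mut self.listeners.get_mut(handle)?.queue;
        if match_id == 0 {
            queue.pop_front()
        } else {
            let pos = queue.iter().position(|&(id, _)| id == match_id)?;
            queue.remove(pos)
        }
    }
}
```

In the real export, a `Some((id, data))` result would be written through *id_ptr and *param_ptr before returning 1. The model leaves out the wait-handle signalling and version filtering that canary's EnqueueNotification performs.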

Cascade prediction (provisional, for next session)

After the fix lands, expected:

  • Renderer L1 root: TBD (requires the Phase 1 probe to resolve the dispatcher's actual vtable[1] target).
  • Canary-only export to fire: XamUserReadProfileSettings (canary line 2787 — fires post-listener-create; SignInChanged handler reads the just-signed-in user's profile).
  • signal_attempts: renderer subsystem may activate without parked-handle interaction this step (notification handlers typically run on the calling thread, not via signal).
  • draws delta: none expected this step. We're moving the boot horizon one hop, not yet to a draw-emitting subsystem.

Files touched

This session: read-only. No code changes. New directory audit-runs/audit-010/ with this findings.md only.