# AUDIT-069 Session 3: writer report v3

Date: 2026-05-20
xenia-rs HEAD: `e6d43a23ac393004d2e5adf2f0395fd0b5e6448b` (UNCHANGED from S1/S2)
`git diff HEAD | sha256sum`: `ed30fd526643918f67311caff0a10d1346d73fd0c0323e02477883cf5ff20357` (UNCHANGED at start AND end of S3)

No canary instrumentation added this session. No ours source modifications.
`--lr-trace` is a runtime flag (main.rs:233-243).

## Headline (HIGH confidence, direct measurement)

ours's tid=5 (= canary tid=10 by entry/ctx identity) fires the γ-signaler family from the SAME guest LRs as canary, but **only 81 times where canary fires 492 times (16%)**.

This is NOT a "wrong-handle" bug; it is a **producer-loop underrun**. The dispatch loop in `sub_82450A68` exits early or starves; consumer threads then block on events that ours never gets to signal. S2's "the producer fires identically, just selects wrong handles" framing is REFINED, not falsified: the producer reaches the wrappers via the EXACT same call sites but completes ~6× fewer iterations (81 vs 492).

## Method

Read-only `--lr-trace=0x824AA2F0,0x824AAF50` on cold ours boot, 1.5B instructions / 47 s wallclock (and re-validated at 5B / 159 s: same 81 fires, same handle universe, same import_calls=39290, so no new work after the producer's initial burst). JSONL output to `s3/ours-lr-trace.jsonl`. Cross-engine paired against S1's `signal-probe-correlated.log` (canary data, fresh 2026-05-20).

## Per-LR fire counts

| caller LR | symbol | wrapper PC | canary tid=10 | ours tid=5 | ratio |
|---|---|---|---:|---:|---:|
| 0x8245DA44 | γ-D-A (sub_8245D9D8) | 0x824AA2F0 | 23 | 5 | 22% |
| 0x8245DB08 | γ-D-B (sub_8245DA78) | 0x824AA2F0 | 8 | 1 | 12% |
| 0x8245DC5C | γ-DB40 (sub_8245DB40 NEW) | 0x824AAF50 | 461 | 75 | 16% |
| **TOTAL** | | | **492** | **81** | **16%** |

ours runs the same producer code, but the loop terminates early. S2's per-PC fire-count table also shows ours = 6/1/75 for the three γ-fns, so this S3 data agrees with S2 for the wrapper-entry side too.
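The ratio column is simply ours/canary per caller LR. A minimal Rust sketch of that tally follows, assuming the trace has already been reduced to (tid, caller LR) pairs. The `Fire` struct and both helper functions are hypothetical names used for illustration; they are not xenia-rs APIs, and nothing here assumes the actual JSONL field layout emitted by `--lr-trace`.

```rust
use std::collections::BTreeMap;

/// One wrapper-entry record, reduced to the two fields the table needs.
/// Hypothetical shape: the real trace schema is not assumed here.
#[derive(Clone, Copy)]
pub struct Fire {
    pub tid: u32,
    pub lr: u32,
}

/// Count wrapper fires per caller LR for a single thread.
pub fn count_by_lr(fires: &[Fire], tid: u32) -> BTreeMap<u32, u64> {
    let mut counts = BTreeMap::new();
    for f in fires.iter().filter(|f| f.tid == tid) {
        *counts.entry(f.lr).or_insert(0u64) += 1;
    }
    counts
}

/// ours/canary as a percentage; None when canary never fired from that LR.
pub fn ratio_percent(ours: u64, canary: u64) -> Option<f64> {
    if canary == 0 {
        None
    } else {
        Some(ours as f64 * 100.0 / canary as f64)
    }
}
```

Fed the totals from the table, `ratio_percent(81, 492)` evaluates to roughly 16.5, which is the 16% figure in the TOTAL row.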
## Handle namespaces are incomparable by raw ID

- canary uses `XEvent::native_object()` pseudo-handles `F8000xxx` (high bit set, encodes a synthetic ID assigned by `XObject::GetNativeObject`).
- ours uses normal slot IDs `0x10xx` from the handle-slot allocator.

Comparison must be by (a) **position in the per-LR sequence** and (b) **call args** (size r5, signal-kind r4).

## Position-0 args MATCH (HIGH confidence, direct measurement)

| LR | r5 (size / kind) | matches? |
|---|---|---|
| 0x8245DC5C | ours=0x800 / canary=0x800 | YES |
| 0x8245DA44 | ours=2 (Set) / canary=2 | YES |
| 0x8245DB08 | ours=2 / canary=2 | YES |

r4 (buffer/ctx pointers) DIFFER in absolute address (different memory layouts) but are TYPE-shaped identically. The first invocation of each signaler is structurally identical. The divergence is in the COUNT of subsequent loop iterations, not in handle selection at position 0.

See `s3/handle-sequence-diff.md` for the full position-aligned table.

## γ-DB40 signal-target distribution (the 461-vs-75 case)

Rows are paired by rank (most-signaled first), not by handle identity.

| canary handle | count | ours handle | count |
|---|---:|---|---:|
| F80000C8 | 229 | 0x000010E0 | 69 |
| F80000DC | 79 | 0x00001040 | 1 |
| F8000078 | 71 | 0x0000105C | 1 |
| F80000BC | 39 | 0x00001098 | 1 |
| F800012C | 28 | 0x000010AC | 1 |
| F80000B4 | 7 | 0x000010D0 | 1 |
| F8000044 | 4 | 0x0000121C | 1 |

Shape: both have one dominant handle (canary 229/461 = 50%, ours 69/75 = 92%) and a tail. ours's tail is truncated: only 7 distinct handles in γ-DB40 vs canary's 10+.

This is consistent with **the producer enqueues the same kinds of work items but the upstream feeder under-fires**, so the dominant work-item (handle `0x10E0` ≈ `F80000C8` by position) gets some iterations, the next-most-common items get truncated to 1×, and the long tail (canary's `F80000DC` 79× / `F8000078` 71×) is mostly missing.
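The "dominant handle plus truncated tail" shape can be summarized per engine without ever comparing raw handle IDs across namespaces. The sketch below is a minimal illustration, assuming the signal targets for one caller LR are already available as a slice of handle values. `Shape` and `shape_of` are hypothetical names, not part of xenia-rs.

```rust
use std::collections::HashMap;

/// Summary of one engine's signal-target distribution for a single caller LR.
#[derive(Debug, PartialEq)]
pub struct Shape {
    pub total: usize,
    pub distinct: usize,
    pub dominant_handle: u32,
    pub dominant_count: usize,
    /// Number of handles signaled exactly once (the truncated tail).
    pub singletons: usize,
}

impl Shape {
    /// Share of all signals absorbed by the dominant handle, in percent.
    pub fn dominant_percent(&self) -> f64 {
        self.dominant_count as f64 * 100.0 / self.total as f64
    }
}

/// Build the histogram and summarize it; None for an empty sequence.
pub fn shape_of(handles: &[u32]) -> Option<Shape> {
    let mut hist: HashMap<u32, usize> = HashMap::new();
    for &h in handles {
        *hist.entry(h).or_insert(0) += 1;
    }
    // Highest count wins; ties go to the lower handle value so the
    // result does not depend on HashMap iteration order.
    let (&dominant_handle, &dominant_count) = hist
        .iter()
        .max_by(|a, b| a.1.cmp(b.1).then(b.0.cmp(a.0)))?;
    Some(Shape {
        total: handles.len(),
        distinct: hist.len(),
        dominant_handle,
        dominant_count,
        singletons: hist.values().filter(|&&c| c == 1).count(),
    })
}
```

Applied to the ours column of the table (69 signals to `0x10E0` plus six handles signaled once each), this yields 7 distinct handles, 6 singletons, and a dominant share of 92%. Comparing `distinct` and `singletons` between engines captures the tail truncation while staying independent of the two handle namespaces.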
## Wedge handle status (HIGH confidence)

AUDIT-062 archive recorded ours wedge handles `0x12AC` and `0x12B8` with `` annotation in a deeper-boot run. In S3's lr-trace: **handle 0x12AC count = 0, handle 0x12B8 count = 0**. **No handle above 0x121C appears in tid=5's signal trace at all.** Max handle observed in this run: 0x121C (cache:/aab216c3 NtCreateFile).

The wedge handles are NEVER allocated in this 5B-instruction run, because boot terminates **before** the trajectory that would create them. The producer fires 81 times, then tid=5 goes quiet; the import_call counter freezes at 39,290; `--halt-on-deadlock` does NOT trigger (consumers wait on existing events that were never the wedge in this run).

**This is a stronger statement than "the wedge handle is never signaled": the wedge handle is never even CREATED, because the boot never reaches the point of creating it.** ours's boot trajectory is truncated by the producer underrun upstream.

## Classification: producer-loop underrun (HIGH confidence)

NOT a race (timing-dependent), NOT a wrong-handle bug (the args at matching positions are structurally identical), NOT a missing-kernel-handler bug (the signals that DO fire pass through bit-equivalent wrappers). It is **producer-loop underrun**: sub_82450A68's dispatch loop iterates fewer times. Either:

1. The work queue (read from guest memory by sub_82450A68) is populated with fewer items by some upstream feeder.
2. The dispatch loop's exit condition trips early.
3. The thread blocks on a dispatcher event that never gets re-signaled.

Mechanism candidates (S4 to discriminate):

- **upstream feeder**: callers of sub_8244FEA8 (11 sites in DB), one of which enqueues less work in ours. Most likely the audio cluster (sub_8225EE20) or sub_82452DC0 (2 calls), given they relate to APUBUG-PRODUCER-001 territory.
- **dispatch loop exit**: the loop reads a flag from the dispatcher struct at `0x828F3B68 + offset`; a state divergence there exits early.
- **inner KeWait at 0x824AB240** (mentioned in S1 spawn-chain notes): if this wait times out / fails differently in ours, the loop exits.

## Reading-error registry

NO new reading-error class needed. This session confirms one existing class:

- **#28 cross-engine tid label mismatch**: used correctly here (compared by entry/ctx, not by tid integer).
- **AUDIT-062 "wrong handles" framing** is a SYMPTOM of the producer underrun (fewer signals → some handles signaled, others starved), not a separate bug.

## Cascade

- **A** (capture ours per-PC signaler firings): PASS (137 records, 81 on tid=5).
- **B** (parallel canary sequence from S1): PASS (492 records on tid=10).
- **C** (first-mismatch identification): PASS. Divergence is in iteration count, not in handle-at-position-0. Position-0 args match structurally.
- **D** (race-vs-missing-signal classification): PASS. Neither pure race nor pure missing-signal. It is **producer-loop underrun** (boot doesn't reach the wedge-handle-creating subsystem).

Net 4/4 PASS.

## S4 recommendation (refined)

**Drop the "wrong-handles-from-γ-signaler" framing.** Focus upstream on WHY tid=5's dispatch loop runs ~6× fewer iterations (81 vs 492).

### Path A (RECOMMENDED, ~30 LOC ours-only diagnostic, no source mod)

Use `--lr-trace=0x82450A68` (the dispatch-loop body PC) plus the existing `--branch-probe` to see WHERE in the loop body ours exits. If the loop has a backward branch at offset X and ours's last fire is at offset Y < X, the loop is exiting early. Pair with the inner `bl 0x824AB240` (KeWaitForMultipleObjects) to see if the loop blocks on a wait that returns differently than canary.

### Path B (~80 LOC ours-only): feeder-side capture

`--lr-trace=0x8244FEA8` on cold ours AND canary. The spawn-helper has 11 static call sites in the DB-derived caller list; at runtime it fires 7× in S2's ours run. Pair r3/r4 (the spawned thread's start_ctx args) with canary's equivalent.
ours may be missing one or more enqueues; the missing enqueue is the upstream root cause.

### Path C (~250 LOC, larger): work-queue struct disassembly

Disassemble sub_82450A68 body, identify the work-queue struct it reads from (likely at `[r29 + N]` where r29 = start_ctx 0x828F3B68 or a derived pointer). Watch the struct with `--mem-watch` to identify the populator (which fn writes the queue items). Trace that populator upstream.

LOC budget for S4: Path A ~30, Path B ~80, Path C ~250. **Path A first**: it gives the precise exit-condition (loop-body branch vs inner-wait timeout) at zero LOC cost.

## Discipline

- xenia-rs HEAD UNCHANGED (sha256 of `git diff HEAD` matches S1/S2 end).
- No source modifications.
- `--lr-trace` is read-only, lockstep-digest-unaffected (per state.rs:1463-1500).
- No canary run this session (S1's data is fresh).
- No canary cache to wipe (no canary run).
- ours runs cold (no cache pre-population).

## Artifacts

```
audit-runs/audit-069-wait-signal-producer/s3/
  ours-lr-trace.jsonl            (137 records, both PCs, all tids)
  ours-lr-trace.stderr           (run log + counters)
  ours-lr-trace.stdout           (empty under --quiet)
  ours-lr-trace-824AA2F0.log     (60 records, NtSetEvent wrapper)
  ours-lr-trace-824AAF50.log     (77 records, Ke wrapper)
  ours-lr-trace-extended.{jsonl,stderr,stdout}
                                 (5B-instr re-validation: same 81 fires)
  handle-sequence-diff.md        (parallel comparison + first-mismatch table)
  writer-report-v3.md            (this file)
```

No fresh canary run was needed: S1's `signal-probe-correlated.log` (154,187 lines) carries all canary signal-probe data.

## Summary of S1 → S2 → S3 progression

- **S1**: identified canary's tid=10 as the signaler; claimed ours lacks this thread (FALSIFIED by S2).
- **S2**: spawn-chain runs identically on ours tid=5; refined to "wrong-handle selection" downstream (REFINED by S3).
- **S3**: ours runs identical PC/LR chain but with ~6× fewer iterations (81 vs 492). Loop underrun classification.
  Wedge handle never even gets created in ours's truncated boot trajectory.

The bug is **upstream of the γ-signaler**: in WHAT the dispatch loop reads from the work queue, or in the loop's exit condition.