Absorber inventory (Phase absorber-review, 2026-05-19)
The diff tool currently lands three absorbers that cross reading-error #23 (matching genuinely different guest behavior at the diff layer). Each is documented below — trigger, match heuristic, rationale, what is silenced.
The investigation goal is to determine whether any of them is hiding
signal flow that would explain the AUDIT-049 wedge (tid=13 blocked on
Event/Auto handle 0x12d0, sister wedges 0x1020/0x1040/0x10A8/0x10E4/0x12B8,
sub_825070F0 worker spawner fires 0×).
A) Shared-global handle.create floating absorb (Phase C+18)
- File: `diff_events.py::diff_one_tid`, branch guarded by `is_shared_global_handle_create` + `cross_tid_floating_sids`.
- Trigger condition: at a kind mismatch, exactly one side has `handle.create` whose SID is in the cross-tid floating set.
- Match heuristic: the SID equals the deterministic `shared_global_sid(pointer, object_type)` recipe (FNV-1a over marker `0xC01AB005`, pointer, object_type) OR appears across ≥2 distinct tids in either engine's stream (cross-tid usage heuristic).
- Rationale: process-global dispatcher objects (XAudio voice-volume semaphores, shared CSes, shared KEVENTs) get lazy-wrapped by whichever guest thread is the first toucher; that thread differs between cold runs. The SID recipe is scheduling-invariant so the diff can absorb the `handle.create` on the "wrong" tid.
- What's silenced: `handle.create` events for process-global dispatchers. Per-thread (`alloc_handle_for`/`AddHandle`) `handle.create` events are NOT silenced because their SID uses the per-(tid, idx) recipe.
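The match heuristic for absorber A can be sketched as follows. This is a minimal illustration, not the code in `diff_events.py`: the 64-bit FNV-1a variant, the little-endian u32 packing of marker/pointer/object_type, the integer `object_type`, and the event dict shape (`tid`, `kind`, `sid` keys) are all assumptions made for the example.

```python
import struct
from collections import defaultdict

FNV64_OFFSET = 0xCBF29CE484222325  # standard 64-bit FNV offset basis
FNV64_PRIME = 0x100000001B3        # standard 64-bit FNV prime
SHARED_GLOBAL_MARKER = 0xC01AB005  # marker named in the recipe above


def fnv1a_64(data: bytes) -> int:
    """64-bit FNV-1a: xor each byte in, then multiply by the prime."""
    h = FNV64_OFFSET
    for byte in data:
        h ^= byte
        h = (h * FNV64_PRIME) & 0xFFFFFFFFFFFFFFFF
    return h


def shared_global_sid(pointer: int, object_type: int) -> int:
    # ASSUMPTION: three little-endian u32 fields, hashed with FNV-1a 64.
    # The real recipe may pack or size the fields differently.
    packed = struct.pack(
        "<III",
        SHARED_GLOBAL_MARKER,
        pointer & 0xFFFFFFFF,
        object_type & 0xFFFFFFFF,
    )
    return fnv1a_64(packed)


def cross_tid_floating_sids(events) -> set:
    """SIDs that appear on at least two distinct tids in one stream."""
    tids_by_sid = defaultdict(set)
    for ev in events:
        sid = ev.get("sid")
        if sid is not None:
            tids_by_sid[sid].add(ev["tid"])
    return {sid for sid, tids in tids_by_sid.items() if len(tids) >= 2}


def should_absorb_handle_create(ours_ev, canary_ev, floating) -> bool:
    """True when exactly one side is a floating handle.create at a kind mismatch."""
    if ours_ev["kind"] == canary_ev["kind"]:
        return False

    def is_floating(ev):
        return ev["kind"] == "handle.create" and ev.get("sid") in floating

    return is_floating(ours_ev) != is_floating(canary_ev)
```

Because the hash depends only on the pointer and object type, the same SID comes out regardless of which thread performs the lazy wrap, which is the property that lets the diff treat the event as floating.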
B) Shared-global wait.begin floating absorb (Phase C+21)
- File: `diff_events.py::diff_one_tid`, branch guarded by `is_shared_global_wait_begin`.
- Trigger condition: at a kind mismatch, exactly one side has `wait.begin` whose `handles_semantic_ids` list includes at least one SID in the shared-global set.
- Match heuristic: any of the wait's handles matches the shared-global SID criterion above. For `wait_type=all`, ANY single shared-global handle is enough to classify the whole wait as floating (heuristic risk: a wait on one shared + multiple per-thread handles is fully absorbed).
- Rationale: contention on shared dispatchers is host-scheduler driven. One cold run may emit `wait.begin` (slow path) while another fast-paths past it without ever blocking. Reading-error #32.
- What's silenced: `wait.begin` events that touch shared-global dispatchers. The associated `wait.end` (which has its own field skips per `SKIP_PAYLOAD_FIELDS_BY_KIND`) still aligns positionally.
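The heuristic risk noted for absorber B is easier to see in code. The sketch below is illustrative only: the event dict shape is an assumption, and `collateral_per_thread_sids` is a hypothetical helper (not part of the diff tool) that lists the per-thread handles silenced alongside the shared one.

```python
def is_shared_global_wait_begin(ev, shared_global_sids) -> bool:
    """Classify a wait.begin as floating if ANY handle is shared-global."""
    if ev.get("kind") != "wait.begin":
        return False
    return any(sid in shared_global_sids
               for sid in ev.get("handles_semantic_ids", ()))


def collateral_per_thread_sids(ev, shared_global_sids) -> list:
    """Hypothetical audit helper: per-thread SIDs that ride along with an absorb."""
    if not is_shared_global_wait_begin(ev, shared_global_sids):
        return []
    return [sid for sid in ev["handles_semantic_ids"]
            if sid not in shared_global_sids]


# A wait_type=all wait over one shared handle (SID 2) and two per-thread ones.
example = {
    "kind": "wait.begin",
    "wait_type": "all",
    "handles_semantic_ids": [1, 2, 3],
}
floating = is_shared_global_wait_begin(example, {2})
collateral = collateral_per_thread_sids(example, {2})
```

Here the whole wait is classified as floating, and the two per-thread handles are absorbed with it. A non-empty collateral list is the case worth inspecting during the wedge hunt.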
C) Nested-CS-cleanup absorber, Phase D D-extension (v1.5)
- File: `diff_events.py::_try_absorb_nested_cs_cleanup`, invoked from `diff_one_tid`.
- Trigger condition: kind mismatch where canary has `import.call RtlEnterCriticalSection` while ours has `import.call RtlLeaveCriticalSection`. Pattern is exact: NO other kind-mismatch shape engages this absorber.
- Match heuristic: walks canary forward consuming balanced `[Enter-block(3), Leave-block(3)]` pairs (each pair = 6 events: import.call, kernel.call, kernel.return for Enter; same triple for Leave). Cap `_NESTED_CS_PAIR_CAP = 32`. After each pair, checks whether canary's next event has the SAME kind AND payload `name` as ours's current event; first convergence wins (greedy).
- Rationale: the 104,607 cap is a producer-throughput divergence: canary's preemptive host-OS scheduling lets a peer tid insert more work items into a CS-protected registry/tree during a notification-event wait window than ours's cooperative scheduler does. Canary then iterates `[E L]` cleanups over those entries; ours has fewer entries and fast-Leaves. Per Phase D forensics, this is a real guest-behavior divergence, not jitter.
- What's silenced: contiguous `[E L]` blocks on canary's side at the specific Enter-vs-Leave mismatch site (~+439 events at the 104,607→105,046 advance per the D-extension memory).
- Stated caveat: this explicitly crosses reading-error #23. The band-aid was approved because the underlying root cause requires preempting the cooperative scheduler (invalidates 23 phases of digest stability; out of scope per H' plan).
Cross-references for wedge hunt
Per Phase W ground truth, the unsignaled handles at deadlock are:
0x00001020 Event/Manual waiters=1 signals=0 waits=1 wakes=0
0x00001040 Event/Auto waiters=0 signals=0 waits=32 wakes=0
0x000010b0 Event/Auto waiters=0 signals=0 waits=7 wakes=0
0x000010ec Event/Manual waiters=1 signals=0 waits=2 wakes=0
0x000012d0 Event/Auto waiters=1 signals=0 waits=1 wakes=0 ← THE WEDGE
0x000012e4 Event/Auto waiters=1 signals=0 waits=1 wakes=0
Per the dossier caveat (AUDIT-049 era ID 0x1288 → Phase W ID 0x12d0),
handle IDs are allocator-ordinal-dependent and do NOT match across
engines. The lookup therefore goes through canary's analogous handles in
the canary event stream: any Event/Auto whose tid+site equals canary's
analog of ours's tid=13 sub_821CB030+0x1B0 worker create call. Per
Phase W's table, canary tid=14/15 are the worker cluster (1.9M / 995K
events). If an absorbed event on canary is a worker-cluster
handle.create/wait.begin for an event-like object, it is
wedge-relevant.
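The wedge-relevance test above could be applied to a list of absorbed canary events with a filter along these lines. This is only a sketch: the record shape is hypothetical, and the `object_type` field holding strings such as `Event/Auto` is an assumption about how absorbed events would be logged.

```python
WORKER_CLUSTER_TIDS = {14, 15}               # canary worker cluster per Phase W
WEDGE_KINDS = {"handle.create", "wait.begin"}
EVENT_LIKE_PREFIX = "Event/"                 # covers Event/Auto and Event/Manual


def wedge_relevant(absorbed_events) -> list:
    """Keep absorbed canary events that could be hiding wedge signal flow."""
    return [
        ev for ev in absorbed_events
        if ev["tid"] in WORKER_CLUSTER_TIDS
        and ev["kind"] in WEDGE_KINDS
        # ASSUMPTION: object_type is a string field on the absorbed record.
        and str(ev.get("object_type", "")).startswith(EVENT_LIKE_PREFIX)
    ]


sample = [
    {"tid": 14, "kind": "handle.create", "object_type": "Event/Auto"},
    {"tid": 14, "kind": "handle.create", "object_type": "Semaphore"},
    {"tid": 3, "kind": "wait.begin", "object_type": "Event/Manual"},
    {"tid": 15, "kind": "wait.begin", "object_type": "Event/Manual"},
]
hits = wedge_relevant(sample)
```

Only the first and last sample records survive the filter; the semaphore create and the non-worker wait are dropped.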