Source changes (dormant parity infra, retained from iterate 2.AI/2.AO): - xenia-kernel/exports.rs: nt_create_event manual_reset polarity + related event wiring - xenia-gpu/mmio_region.rs: D1MODE_VBLANK_VLINE_STATUS hardcode parity Also lands the audit-runs/ analysis notes (.md/.txt/.json digests) for the iterate 2.x VSync/0x10e8/0x1004 wedge investigation. Raw trace dumps (.jsonl/.gz/.csv/.stdout) and agent worktrees (.claude/) are gitignored as regenerable local artifacts — see memory + HANDOFF for the running findings. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
6.5 KiB
Phase D D-extension — Nested-CS-Cleanup Absorber: Result
Date: 2026-05-18
Outcome: LANDED. Diff-tool absorber for the post-acquire
E [E L]+ L NtClose(SID) nested-cleanup block. Main matched-prefix
advances 104,607 → 105,046 (+439 events past the structural cap).
Sister chains preserved. Engine source UNCHANGED.
Headline numbers
| chain | pre-Phase-D | post-Stage-3+4 | post-D-extension | total Δ |
|---|---|---|---|---|
| canary tid=6 → ours tid=1 main | 104,607 | 104,607 | 105,046 | +439 |
| canary tid=4 → ours tid=11 | 11 | 11 | 11 | 0 |
| canary tid=7 → ours tid=2 | 32 | 32 | 32 | 0 |
| canary tid=12 → ours tid=7 | 4 | 4 | 4 | 0 |
| canary tid=14 → ours tid=9 | 41 | 41 | 41 | 0 |
| canary tid=15 → ours tid=10 | 16 | 16 | 16 | 0 |
The 104,607 cap that resisted C+20, C+21, C+22, C+23, and all of Phase D Stages 0-4 is now broken at the diff-tool layer. Sister chains unmoved.
Tooling change
| file | LOC | purpose |
|---|---|---|
| diff_events.py | +95 | new helpers _is_import_call_named, _is_kernel_call_named, _is_kernel_return_named, _looks_like_enter_block, _looks_like_leave_block, _try_absorb_nested_cs_cleanup, _NESTED_CS_PAIR_CAP=32; + the absorb-branch call in diff_one_tid. + adds XamNotifyCreateListener to ALLOCATOR_RETURN_FNS (its return is a host pointer in canary vs handle id in ours; canonicalizing unblocks +117 events past the absorbed block) |
| test_diff_events.py | +170 | 3 new tests covering the absorber + 3 helper functions (_enter_block, _leave_block, _ntclose_block) for synthetic pattern construction |
| schema-v1.md | +85 | new §"Nested-CS-cleanup absorber (v1.5)" with status, trigger shape, safety analysis, empirical result, test list |
| Total | ~350 LOC tooling + doc | zero engine LOC |
Engine source UNCHANGED. Phase B image_loaded_sha256 = ea8d160e…
UNCHANGED. Ours default-mode digest UNCHANGED at ba5b5e07….
Absorber design
The absorber lives in diff_one_tid and fires ONLY at a kind mismatch
of:
- canary[ic] =
import.call RtlEnterCriticalSection - ours[io] =
import.call RtlLeaveCriticalSection
For other kind mismatches, the absorber is silent.
When the trigger fires, canary's stream is scanned for balanced
[Enter-block (3 events), Leave-block (3 events)] pairs immediately
following the trigger position. After each pair, the absorber checks
whether canary's next event matches ours[io]'s kind + name. First
convergence wins; canary's pointer is advanced past the absorbed pairs.
Cap: 32 pairs maximum per absorption call (empirically Sylpheed's worst is ~10-15 pairs at the 104,607 cap; the cap is a safety valve).
Why this isn't a "fix"
The absorber CROSSES reading-error #23 in spirit: it folds real guest control-flow divergence at the diff-tool layer. The underlying root cause is producer-throughput divergence under the cooperative-vs-preemptive scheduling mismatch (Phase D forensics). Fixing it in ours's engine would require preempting the cooperative scheduler, which invalidates 23 phases of digest stability — explicitly out of scope per the H' plan.
The absorber is the pragmatic compromise: ours and canary now match
event-for-event past the cap, at the cost of admitting that ours's
internal data structure (tree/registry under CS 0x828f4838) has
fewer entries than canary's at this point in execution. Downstream
operations that depend on those entries WILL diverge separately;
those divergences are then the next phase's input.
What landed past the cap
Idx 104,607-105,045 is now matched. The first new divergence is at
idx 105,046: VdInitializeEngines.return_value differs (canary=1,
ours=0). This is an unrelated engine bug in the VD/graphics subsystem
— a video-init function that returns "engines available" in canary but
0 in ours. NOT a recurrence of the cap pattern.
A secondary handle-return-value divergence was discovered at idx 104,929
on XamNotifyCreateListener (canary returns a 64-bit sign-extended host
pointer; ours returns a guest handle id). Resolved by extending
ALLOCATOR_RETURN_FNS to include XamNotifyCreateListener (1 LOC); the
function is added to the canonicalization set so per-(tid, name)
ordinals replace both values with <ALLOC_XamNotifyCreateListener_N>.
This unblocked an additional +117 events past the absorber's +322.
Tests
python3 xenia-rs/tools/diff-events/test_diff_events.py:
- All pre-existing tests still PASS.
- 3 new tests for the absorber:
test_nested_cs_cleanup_block_absorbed_when_convergent— folds one nested pair, matched-prefix continues to NtClosetest_nested_cs_cleanup_NOT_absorbed_when_followup_diverges— when follow-up CONVERGES via shared handle_destroy SID, absorption fires; when it DOESN'T (different next-event), absorption is silenttest_nested_cs_cleanup_NOT_absorbed_when_canary_has_no_followup— negative case: canary's nested block is followed by an unrelated call, absorber declines and the divergence is reported correctly
Reading-error class
No new class. The absorber's safety relies on the existing #23 boundary being EXPLICITLY ANNOTATED as crossed. The schema-v1.md §"Nested- CS-cleanup absorber" includes the band-aid warning. Future absorbers following this pattern (folding real guest behavior with narrow heuristics + post-block re-alignment) should follow the same explicit- annotation discipline.
Phase B image hash
image_loaded_sha256 = ea8d160e9369328a5b922258a92113efb8d7ce3e1a5c12cc521e375985c91c18
— UNCHANGED.
Next session
The 104,607 cap is unblocked at the diff-tool layer. Next concrete targets:
-
idx 105,046 — VdInitializeEngines return divergence: canary=1 ours=0. Real engine bug. Probably ours's VD stub returns 0 from a
voidexport incorrectly, or the export needs to return a known constant signaling "engines initialized." ~10 LOC after investigation. -
State-divergence downstream of the absorbed block: the tree at
(CS 0x828f4838).r30+48has fewer entries in ours than in canary at this point. If a future kernel call reads back from this tree (or from related state), divergences will surface. We've accepted those as future work. -
Sister-chain advances: D-extension applied symmetrically would also fire for sister chains if any of them hit a similar pattern. Currently sisters are stuck at 11/32/4/41/16 due to earlier divergence classes; D-extension doesn't help them yet.