Source changes (dormant parity infra, retained from iterate 2.AI/2.AO): - xenia-kernel/exports.rs: nt_create_event manual_reset polarity + related event wiring - xenia-gpu/mmio_region.rs: D1MODE_VBLANK_VLINE_STATUS hardcode parity Also lands the audit-runs/ analysis notes (.md/.txt/.json digests) for the iterate 2.x VSync/0x10e8/0x1004 wedge investigation. Raw trace dumps (.jsonl/.gz/.csv/.stdout) and agent worktrees (.claude/) are gitignored as regenerable local artifacts — see memory + HANDOFF for the running findings. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Phase C+22 — Payload-field canonicalization for host-heap-derived guest VAs
Date: 2026-05-26

Mode: WRITE (diff-tool only; no engine source changes).

Status: LANDED. Main matched-prefix 105,128 → 105,138 (+10).
TL;DR
The pre-C+22 first divergence at canary tid=6 ↔ ours tid=1 idx 105,128 is a
thread.create.ctx_ptr mismatch:
canary: thread.create {parent_tid=6, entry_pc=0x824cd458, ctx_ptr=0xbe56bb3c, ...}
ours: thread.create {parent_tid=1, entry_pc=0x824cd458, ctx_ptr=0x42453b3c, ...}
- `parent_tid` was ALREADY skipped via `SKIP_PAYLOAD_FIELDS_BY_KIND["thread.create"]` (line 245 of `diff_events.py`, in place since C+15-α). The task framing that it needed new canonicalization was a misread; tests now pin the existing behavior so it doesn't regress.
- `ctx_ptr` IS the actual divergence at this index. Canary's `0xbe56bb3c` is in the BC physical heap; our `0x42453b3c` is in the unified user heap. Same AUDIT-043 ε class as C+2's `MmAllocatePhysicalMemoryEx`.
Why C+2's ALLOCATOR_RETURN_FNS doesn't cover this
C+2 canonicalizes kernel.return.return_value for a known set of host-
allocator-returning exports. ExCreateThread's return value is the new
thread's handle (already covered by handle_semantic_id skip-policy), but
the host-allocated TLS/context block VA appears in a typed payload field
(thread.create.ctx_ptr) — a side channel C+2 doesn't see.
The fix
The fix adds a `HOST_HEAP_PAYLOAD_FIELDS_BY_KIND` map and a
`canonicalize_host_heap_payload_fields` helper. Both are an exact mirror of
`ALLOCATOR_RETURN_FNS` / `canonicalize_allocator_returns`, restricted to
typed payload fields. Initial set:
HOST_HEAP_PAYLOAD_FIELDS_BY_KIND = {
"thread.create": ("ctx_ptr",),
}
Sentinel format: <HOSTHEAP_<KIND>_<FIELD>_<ORDINAL>> — distinct namespace
from <ALLOC_*_*> so the two passes don't collide.
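The pass described above can be sketched roughly as follows. This is a minimal illustration, not the code in `diff_events.py`: the event shape (`kind`, `tid`, `payload` keys), the upper-cased spelling of `<KIND>` and `<FIELD>`, and the choice to key the ordinal counter on `(tid, kind, field)` are all assumptions made for the example.

```python
from collections import defaultdict

# Same initial set as shown above.
HOST_HEAP_PAYLOAD_FIELDS_BY_KIND = {
    "thread.create": ("ctx_ptr",),
}


def canonicalize_host_heap_payload_fields(events):
    """Return copies of `events` with host-heap VAs replaced by sentinels.

    Assumed event shape: {"kind": str, "tid": int, "payload": dict}.
    """
    counters = defaultdict(int)  # (tid, kind, field) -> next ordinal
    out = []
    for event in events:
        kind = event.get("kind")
        fields = HOST_HEAP_PAYLOAD_FIELDS_BY_KIND.get(kind, ())
        payload = dict(event.get("payload") or {})
        for field in fields:
            value = payload.get(field)
            if not isinstance(value, str):
                # None / missing: leave as-is and do not consume an ordinal.
                continue
            key = (event.get("tid"), kind, field)
            ordinal = counters[key]
            counters[key] += 1
            tag = kind.upper().replace(".", "_")  # assumed <KIND> spelling
            payload[field] = f"<HOSTHEAP_{tag}_{field.upper()}_{ordinal}>"
        out.append({**event, "payload": payload})
    return out


canary = canonicalize_host_heap_payload_fields([
    {"kind": "thread.create", "tid": 6,
     "payload": {"entry_pc": "0x824cd458", "ctx_ptr": "0xbe56bb3c"}},
])
ours = canonicalize_host_heap_payload_fields([
    {"kind": "thread.create", "tid": 1,
     "payload": {"entry_pc": "0x824cd458", "ctx_ptr": "0x42453b3c"}},
])
print(canary[0]["payload"] == ours[0]["payload"])
```

Because each engine numbers its own allocations in event order, the two raw VAs collapse to the same sentinel while `entry_pc` is compared untouched. If one engine emitted an extra `thread.create`, the ordinals would skew and the sentinels would differ, which keeps the divergence visible.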
Strict fields preserved (THE tripstone)
thread.create's game-visible attributes MUST stay strict — they're not
host-heap-derived and any divergence is a real bug. Tests verify each:
| field | canary | ours | strict? |
|---|---|---|---|
| `entry_pc` | `0x824cd458` | `0x824cd458` | YES: guest VA from XEX, bit-identical |
| `priority` | 0 | 0 | YES: game-visible |
| `affinity` | 4 | 4 | YES: game-visible |
| `stack_size` | 32768 | 32768 | YES: game-visible |
| `suspended` | false | false | YES: game-visible |
| `parent_tid` | 6 | 1 | NO: already skipped (C+15-α) |
| `handle_semantic_id` | engine-local | engine-local | NO: already skipped (C+15-α) |
| `ctx_ptr` | `0xbe56bb3c` | `0x42453b3c` | NEW: canonicalized via ordinal (C+22 v1.7) |
5 negative tests in test_diff_events.py mutate each strict field one at a
time and confirm the divergence still surfaces, guarding against over-suppression.
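The shape of those negative tests can be illustrated with a small standalone check. This is a sketch under assumptions: `diverging_fields`, `mutate`, and the field tuples are hypothetical stand-ins for the real comparison logic, and `ctx_ptr` is shown already canonicalized.

```python
# Hypothetical field sets for illustration; the real lists live in diff_events.py.
STRICT_FIELDS = ("entry_pc", "priority", "affinity", "stack_size", "suspended")
SKIPPED_FIELDS = ("parent_tid", "handle_semantic_id")

BASELINE = {
    "entry_pc": "0x824cd458", "priority": 0, "affinity": 4,
    "stack_size": 32768, "suspended": False,
    "parent_tid": 6, "handle_semantic_id": "engine-local",
    "ctx_ptr": "<HOSTHEAP_THREAD_CREATE_CTX_PTR_0>",
}


def diverging_fields(canary, ours):
    """Names of non-skipped fields whose values differ."""
    keys = (set(canary) | set(ours)) - set(SKIPPED_FIELDS)
    return sorted(k for k in keys if canary.get(k) != ours.get(k))


def mutate(value):
    """Produce a value guaranteed to differ from `value`."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return not value
    if isinstance(value, int):
        return value + 1
    return value + "_mutated"


def check_strict_fields():
    # Identical payloads never diverge.
    assert diverging_fields(BASELINE, dict(BASELINE)) == []
    # Each strict field, mutated alone, must surface as the only divergence.
    for field in STRICT_FIELDS:
        ours = dict(BASELINE, **{field: mutate(BASELINE[field])})
        assert diverging_fields(BASELINE, ours) == [field], field
    # A skipped field may differ without surfacing.
    assert diverging_fields(BASELINE, dict(BASELINE, parent_tid=1)) == []


check_strict_fields()
```

Mutating one field per case keeps a failure attributable to a single attribute, which is the point of the over-suppression guard.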
Verification matrix
| canary file | pre-C+22 matched | post-C+22 matched | Δ |
|---|---|---|---|
| `canary-jitter-1.jsonl` (4.4 GB, 476,943 events on tid=6) | 105,128 | 105,138 | +10 |
| `canary-jitter-2.jsonl` (3.5 GB, 441,027 events on tid=6) | 105,128 | 105,138 | +10 |
| `canary-jitter-3.jsonl` (3.7 GB, 445,578 events on tid=6) | 105,128 | 105,138 | +10 |
All three jitter runs advance to the SAME new divergence: idx 105,138,
kernel.return VdQueryVideoFlags:
canary: payload.return_value = 3 (status "0x00000003")
ours: payload.return_value = 0 (status "0x00000000")
This is a genuine Vd subsystem divergence (UNRELATED to canonicalization), out of C+22's scope — surfaces correctly as a real first-divergence.
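For readers unfamiliar with the metric, the matched-prefix count and first divergence can be computed along these lines. This is a hedged sketch: `matched_prefix` is a hypothetical helper, and the events are assumed to be already canonicalized and directly comparable with `==`.

```python
from itertools import zip_longest

_MISSING = object()  # marks the side that ran out of events first


def matched_prefix(canary_events, our_events):
    """Return (matched_count, first_divergence).

    first_divergence is None when the sequences are identical; otherwise it
    is a (canary_event, our_event) pair, with None for an exhausted side.
    The matched count equals the zero-based index of the first divergence.
    """
    matched = 0
    pairs = zip_longest(canary_events, our_events, fillvalue=_MISSING)
    for a, b in pairs:
        if a != b:
            return matched, (None if a is _MISSING else a,
                             None if b is _MISSING else b)
        matched += 1
    return matched, None


canary = [
    {"kind": "thread.create", "ctx_ptr": "<HOSTHEAP_THREAD_CREATE_CTX_PTR_0>"},
    {"kind": "kernel.return", "name": "VdQueryVideoFlags", "return_value": 3},
]
ours = [
    {"kind": "thread.create", "ctx_ptr": "<HOSTHEAP_THREAD_CREATE_CTX_PTR_0>"},
    {"kind": "kernel.return", "name": "VdQueryVideoFlags", "return_value": 0},
]
count, divergence = matched_prefix(canary, ours)
print(count, divergence)
```

In this toy input the canonicalized `thread.create` matches, so the prefix stops at the `VdQueryVideoFlags` return, mirroring how the real run advances to idx 105,138.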
Tests
8 new tests in test_diff_events.py:
1. `test_thread_create_ctx_ptr_in_host_heap_set`: registration sanity.
2. `test_host_heap_field_canonicalization_ordinals`: ordinals assigned per-tid in event order, sentinel format correct, strict fields untouched.
3. `test_host_heap_field_cross_engine_alignment`: divergent raw VAs collapse to identical sentinels; `compare_event` reports no divergence.
4. `test_host_heap_field_real_divergence_still_caught`: parameterized over `entry_pc` / `priority` / `affinity` / `stack_size` / `suspended`; each strict-field mutation surfaces correctly.
5. `test_host_heap_field_count_mismatch_still_diverges`: ordinal-count skew produces distinct sentinels (divergence-preserving contract).
6. `test_host_heap_field_non_string_value_left_alone`: `None` / missing values leave the ordinal counter unincremented; the first string-typed value gets ordinal 0.
7. `test_parent_tid_already_skipped`: pins the C+15-α behavior so future refactors don't accidentally remove `parent_tid` from `SKIP_PAYLOAD_FIELDS_BY_KIND`.
8. (covered in #2) Strict-field preservation as positive assertion.
Total: previous 33 tests + 8 new = 41 tests, all PASS.
Files touched
- `xenia-rs/tools/diff-events/diff_events.py` (+~70 LOC additive)
  - `HOST_HEAP_PAYLOAD_FIELDS_BY_KIND` constant
  - `canonicalize_host_heap_payload_fields()` function
  - `--no-canonicalize-host-heap-fields` CLI flag
  - Call site in `main()` (mirrors `--no-canonicalize-allocators`)
- `xenia-rs/tools/diff-events/test_diff_events.py` (+~290 LOC tests)
- `xenia-rs/audit-runs/phase-a-diff-harness/schema-v1.md` (+~110 LOC)
  - New §"Host-heap payload-field canonicalization (v1.7 …)"
  - Updated `ctx_ptr` row in field-comparison rules table

NO engine source touched. xenia-rs HEAD unchanged. Phase B
image_loaded_sha256 ε class boundary unchanged.
Backward compatibility
- Wire format unchanged (`schema_version = 1`).
- Pre-C+22 event logs whose `thread.create.ctx_ptr` is non-string (`None` / missing) parse cleanly; the canonicalizer is defensive.
- Pre-C+22 event logs whose `ctx_ptr` happens to bit-match (static-allocator VAs both engines use, e.g. `0x828F3D08`) still match identically post-canonicalization (same ordinal in both engines).
- `--no-canonicalize-host-heap-fields` reverts to raw-VA comparison for investigation/debugging.
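One plausible way to wire an opt-out flag like this with `argparse` is shown below. Only the flag name comes from this report; the `dest` name, the help text, and the `build_parser` helper are assumptions for illustration, and the actual wiring in `main()` may differ.

```python
import argparse


def build_parser():
    """Build a parser with canonicalization on by default (assumed layout)."""
    parser = argparse.ArgumentParser(prog="diff_events.py")
    parser.add_argument(
        "--no-canonicalize-host-heap-fields",
        dest="canonicalize_host_heap_fields",  # hypothetical attribute name
        action="store_false",  # store_false implies a default of True
        help="compare raw host-heap VAs instead of ordinal sentinels",
    )
    return parser


default_args = build_parser().parse_args([])
raw_args = build_parser().parse_args(["--no-canonicalize-host-heap-fields"])
print(default_args.canonicalize_host_heap_fields,
      raw_args.canonicalize_host_heap_fields)
```

Using `store_false` keeps canonicalization the default, so existing invocations pick up the new pass without changes and the raw-VA comparison stays one flag away.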
Cascade
- A (design): PASS — minimal extension of C+2 pattern, no new mechanism class.
- B (implement + test): PASS — 8 new tests, 41 total PASS.
- C (3-jitter verification): PASS — all three jitters advance 105,128 → 105,138 (+10), same downstream divergence.
- D (fresh canary measurement, main > 105,128): PASS using archived jitter cold runs (105,138 > 105,128 ✓ on all 3). A fresh canary cold run was NOT initiated this session — the 3-jitter archived set is the protocol-honored substitute when canary is wedged or build is slow (per phase-c25-mm-allocator-family precedent).
Next divergence (C+23 candidate)
kernel.return VdQueryVideoFlags at idx 105,138:
- canary returns `3` (status `0x00000003`)
- ours returns `0` (status `0x00000000`)

VdQueryVideoFlags is a Vd-subsystem export that returns a bitmask of
video-mode capabilities (HDTV, widescreen, anti-aliasing). The
divergence is a real bug downstream of C+22, NOT a canonicalization
class. C+23+ scope.