xenia-rs/audit-runs/phase-c-first-divergence/classification.md
MechaCat02 ef93a4fa14 handoff: VSync/event-wedge fixes + iterate 2.A–2.BC research notes
Source changes (dormant parity infra, retained from iterate 2.AI/2.AO):
- xenia-kernel/exports.rs: nt_create_event manual_reset polarity +
  related event wiring
- xenia-gpu/mmio_region.rs: D1MODE_VBLANK_VLINE_STATUS hardcode parity

Also lands the audit-runs/ analysis notes (.md/.txt/.json digests) for the
iterate 2.x VSync/0x10e8/0x1004 wedge investigation. Raw trace dumps
(.jsonl/.gz/.csv/.stdout) and agent worktrees (.claude/) are gitignored as
regenerable local artifacts — see memory + HANDOFF for the running findings.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-05 07:19:08 +02:00


Phase C — first-divergence classification

The raw first byte-diff

| Field | Value |
|---|---|
| Guest VA | 0x82000600 |
| File offset | 0x00000600 |
| Section | .rdata (start of section, virtual_address = 0x600) |
| canary byte | 0xde (start of de ad c0 de poison pattern) |
| ours byte | 0x00 |
| .pe byte | 0x00 |

The diff is the xam.xex variable-import slot table

xex.json lists 52 record_type=0 imports for xam.xex, each at a sequential 4-byte slot starting at address = 0x82000600:

xam.xex ord=652 rt=0 addr=0x82000600
xam.xex ord=700 rt=0 addr=0x82000604
xam.xex ord=705 rt=0 addr=0x82000608
xam.xex ord=725 rt=0 addr=0x8200060c
...

The next 204 − 52 = 152 record_type=0 slots belong to xboxkrnl.exe, continuing at 0x820006D0..0x82000934.
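The slot arithmetic for the xam.xex table can be checked with a short sketch. It assumes only what the listing above shows: the slots are contiguous and 4 bytes each. The helper name `slot_addresses` is hypothetical and not part of the audit tooling.

```python
VAR_SLOT_SIZE = 4  # record_type=0 slots are 4 bytes each


def slot_addresses(start: int, count: int, size: int = VAR_SLOT_SIZE) -> list[int]:
    """Return the guest addresses of `count` contiguous slots starting at `start`."""
    return [start + i * size for i in range(count)]


xam_slots = slot_addresses(0x82000600, 52)
xam_end = xam_slots[-1] + VAR_SLOT_SIZE  # first byte after the xam.xex table

print(hex(xam_slots[3]))  # 0x8200060c, the ord=725 slot in the listing
print(hex(xam_end))       # 0x820006d0, where the xboxkrnl.exe slots begin
```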

What each engine writes at these slots

| | record_type=0 (var slot, 4 bytes) | record_type=1 (thunk, 16 bytes) |
|---|---|---|
| canary | de ad c0 de (poison sentinel) | host-shim bytes: 44 00 00 42 / 4e 80 00 20 / 60 00 00 00 / 60 00 00 00 (sc; blr; nop; nop) |
| ours | 00 00 00 00 (zero) | leaves .pe bytes in place (01 00 ord_hi ord_lo / 02 00 ord_hi ord_lo / mtspr ctr,r11 / bctr) |
| .pe | XEX import-record tag: 00 00 ord_hi ord_lo | template thunk: 01 00 ord_hi ord_lo / 02 00 ord_hi ord_lo / mtspr ctr,r11 / bctr |
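As an illustration of the record_type=0 column, the sketch below guesses which producer wrote a given 4-byte slot. The function name `classify_var_slot` and its return labels are hypothetical. It also assumes the ordinal is non-zero, since a tag for ordinal 0 would be indistinguishable from a zeroed slot.

```python
POISON = bytes.fromhex("deadc0de")  # canary's sentinel for variable imports
ZERO = bytes(4)                     # what our engine writes


def classify_var_slot(slot: bytes) -> str:
    """Guess which producer wrote a 4-byte record_type=0 slot."""
    if len(slot) != 4:
        raise ValueError("record_type=0 slots are 4 bytes")
    if slot == POISON:
        return "canary"
    if slot == ZERO:
        return "ours"
    if slot[:2] == b"\x00\x00":
        return "pe"  # 00 00 ord_hi ord_lo import-record tag
    return "unknown"


# Ordinal 652 is 0x028c, so the untouched .pe tag would read 00 00 02 8c.
print(classify_var_slot(bytes.fromhex("0000028c")))  # pe
print(classify_var_slot(bytes.fromhex("deadc0de")))  # canary
```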

Classification: import-thunk / ε-class allocator drift

This matches tripstone #2 of the Phase C brief verbatim:

Import thunks are legitimately engine-specific. If first byte-diff is in a thunk, canonicalize and re-find first diff.

The two engines implement different HLE dispatch strategies:

  • canary: in-place thunk patching. Overwrites the guest XEX bytes with host-shim instructions; record_type=0 slots get 0xDEADC0DE poison (canary panics if a guest dereferences an unimplemented import variable).
  • ours: HLE dispatch happens at the JIT translation layer, not by patching the thunk. record_type=1 thunks keep their original .pe bytes; record_type=0 slots are zeroed (still distinguishable from the .pe ordinal-tag content if guest code reads them).

Both are valid engine implementation choices.
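The "re-find first diff" step from the brief can be sketched as a byte comparison that ignores known engine-specific ranges. This is a minimal illustration, not the audit tool's code; `first_diff` and its `skip` parameter (half-open `[start, end)` file offsets) are hypothetical names.

```python
from typing import Iterable, Optional, Tuple


def first_diff(a: bytes, b: bytes,
               skip: Iterable[Tuple[int, int]] = ()) -> Optional[int]:
    """Return the first offset where a and b differ, ignoring [start, end) ranges."""
    if len(a) != len(b):
        raise ValueError("images must have the same length")
    ignored = set()
    for start, end in skip:
        ignored.update(range(start, end))
    for off, (x, y) in enumerate(zip(a, b)):
        if x != y and off not in ignored:
            return off
    return None


# Toy images: a 4-byte import slot at offset 2, then one genuinely different byte.
a = bytes.fromhex("aabb" "deadc0de" "cc")
b = bytes.fromhex("aabb" "00000000" "cd")

print(first_diff(a, b))                 # 2, inside the import slot
print(first_diff(a, b, skip=[(2, 6)]))  # 6, the first diff outside it
```

A set of ignored offsets is fine at this scale (the masked region discussed here is a few kilobytes); a larger mask would be better served by sorted intervals.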

After canonicalization — the real check

Mask all import-slot bytes (record_type=0 = 4 bytes per slot, record_type=1 = 16 bytes per slot, total 3920 bytes across 398 slots) to 0xCD in canary, ours, AND .pe. Then compare:

canary canonical sha256: 62c51908e2df705583fe81a084f39bd399196f9000cfa7bffd56127b41a4ab96
ours   canonical sha256: 62c51908e2df705583fe81a084f39bd399196f9000cfa7bffd56127b41a4ab96
pe     canonical sha256: 62c51908e2df705583fe81a084f39bd399196f9000cfa7bffd56127b41a4ab96

All three match. Bytes differing after canonicalization: 0.
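The masking step can be sketched as follows. This is an illustration under stated assumptions, not the code in diff_state.py: the image base of 0x82000000 is inferred from guest VA 0x82000600 mapping to file offset 0x600, and the `(record_type, address)` tuple shape and the name `canonical_sha256` are hypothetical. The quoted totals are consistent with 204 record_type=0 slots plus 194 record_type=1 thunks (204·4 + 194·16 = 3920 bytes, 204 + 194 = 398 slots).

```python
import hashlib

IMAGE_BASE = 0x82000000    # assumption: guest VA 0x82000600 <-> file offset 0x600
SLOT_SIZE = {0: 4, 1: 16}  # bytes per import slot, keyed by record_type
MASK_BYTE = 0xCD


def canonical_sha256(image: bytes, imports: list[tuple[int, int]]) -> str:
    """Mask every import slot to 0xCD, then hash the result.

    `imports` holds (record_type, guest_address) pairs.
    """
    buf = bytearray(image)
    for record_type, address in imports:
        size = SLOT_SIZE[record_type]
        off = address - IMAGE_BASE
        if off < 0 or off + size > len(buf):
            raise ValueError(f"import slot at {address:#x} is outside the image")
        buf[off:off + size] = bytes([MASK_BYTE]) * size
    return hashlib.sha256(buf).hexdigest()


# Toy images: one variable slot at +4 and one thunk at +8, then shared content.
imports = [(0, 0x82000004), (1, 0x82000008)]
canary = (bytes(4) + bytes.fromhex("deadc0de")
          + bytes.fromhex("44000042 4e800020 60000000 60000000") + b"tail")
ours = (bytes(4) + bytes(4)
        + bytes.fromhex("0100028c 0200028c 7d6903a6 4e800420") + b"tail")

print(canary == ours)                                                      # False
print(canonical_sha256(canary, imports) == canonical_sha256(ours, imports))  # True
```

Masking to a fixed byte rather than deleting the slots keeps every file offset stable, so any remaining diff can still be reported at its true address.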

Conclusion

There is NO real engine divergence at the image-load layer.

  • Both engines decode the XEX2 file correctly.
  • Both load it into guest memory at the correct virtual addresses.
  • Both produce byte-identical content outside the import-patch region.
  • Even .pe (an independent third-party offline XEX2 decoder) produces the exact same canonical content.

The Phase B image_loaded_sha256 δ-content-STOP was a false positive caused by an overly strict invariant: hashing engine-specific runtime patches as if they were XEX content.

What the fix is

The fix is in the comparison framework, not the engines:

  1. diff_state.py: relaxed STOP invariant — when --xex-json is provided AND both snapshots contain image.bin, compute and check image_canonical_sha256 (engine-mask agnostic) as the real STOP key. The raw image_loaded_sha256 is still reported but is informational.
  2. phase_b_snapshot.{rs,cc}: when phase_b_dump_section_content is set, emit image.bin (raw bytes of the XEX image region) so the diff tool can perform canonicalization. Default-off; cvar-OFF binary digest is byte-identical to pre-Phase-C baseline.
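The relaxed invariant in item 1 might look roughly like the sketch below. The snapshot dict shape and the function name `image_stop_check` are hypothetical. Falling back to the raw hash when canonicalization is unavailable is also an assumption, since the text only describes the case where `--xex-json` and both `image.bin` files are present.

```python
def image_stop_check(a: dict, b: dict, have_xex_json: bool) -> dict:
    """Decide whether the image-load comparison should STOP, and on which key."""
    can_canonicalize = (
        have_xex_json
        and "image_canonical_sha256" in a
        and "image_canonical_sha256" in b
    )
    # Assumption: without canonical hashes, the raw hash remains the STOP key.
    key = "image_canonical_sha256" if can_canonicalize else "image_loaded_sha256"
    return {
        "stop_key": key,
        "stop": a[key] != b[key],
        # Still reported, but informational only.
        "raw_differs": a["image_loaded_sha256"] != b["image_loaded_sha256"],
    }


snap_canary = {"image_loaded_sha256": "aa", "image_canonical_sha256": "cc"}
snap_ours = {"image_loaded_sha256": "bb", "image_canonical_sha256": "cc"}

print(image_stop_check(snap_canary, snap_ours, have_xex_json=True))
print(image_stop_check(snap_canary, snap_ours, have_xex_json=False))
```

With `--xex-json` the raw mismatch is reported without stopping; without it, the same pair of snapshots still trips the original strict check.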

What this implies for downstream divergences

The Phase B catalog's 57 remaining divergences (post-image-load) are still meaningful — they describe real differences in stack/PCR/TLS allocation strategy, heap layout, kernel-object population, and exports-table state. These are now interpretable on a verified canonically-equivalent image baseline.

The Phase A diff's first runtime divergence at tid_event_idx=113 (KeQuerySystemTime return_value) is the next Phase C+1 target. It is not a downstream symptom of the image-load mismatch; it is the next genuine engine divergence in the kernel-call sequence.