# Phase C: first-divergence classification

## The raw first byte-diff

| Field | Value |
|---|---|
| Guest VA | `0x82000600` |
| File offset | `0x00000600` |
| Section | `.rdata` (start of section, virtual_address = 0x600) |
| canary byte | `0xde` (start of `de ad c0 de` poison pattern) |
| ours byte | `0x00` |
| .pe byte | `0x00` |

## The diff is the xam.xex variable-import slot table

`xex.json` lists 52 `record_type=0` imports for `xam.xex`, each at a sequential 4-byte slot starting at `address = 0x82000600`:

```
xam.xex ord=652 rt=0 addr=0x82000600
xam.xex ord=700 rt=0 addr=0x82000604
xam.xex ord=705 rt=0 addr=0x82000608
xam.xex ord=725 rt=0 addr=0x8200060c
...
```

The next 204−52 = 152 `record_type=0` slots are for `xboxkrnl.exe`, continuing at `0x820006D0..0x82000934`.

## What each engine writes at these slots

| Engine | record_type=0 (var slot, 4 bytes) | record_type=1 (thunk, 16 bytes) |
|---|---|---|
| canary | `de ad c0 de` (poison sentinel) | host-shim bytes: `44 00 00 42 / 4e 80 00 20 / 60 00 00 00 / 60 00 00 00` (`sc; blr; nop; nop`) |
| ours | `00 00 00 00` (zero) | leaves .pe bytes in place (`01 00 ord_hi ord_lo / 02 00 ord_hi ord_lo / mtspr ctr,r11 / bctr`) |
| .pe | XEX import-record tag: `00 00 ord_hi ord_lo` | template thunk: `01 00 ord_hi ord_lo / 02 00 ord_hi ord_lo / mtspr ctr,r11 / bctr` |

## Classification: **import-thunk / ε-class allocator drift**

This matches **tripstone #2** of the Phase C brief verbatim:

> Import thunks are legitimately engine-specific. If first byte-diff is
> in a thunk, canonicalize and re-find first diff.

The two engines implement different HLE dispatch strategies:

- **canary**: in-place thunk patching. Overwrites the guest XEX bytes with host-shim instructions; `record_type=0` slots get `0xDEADC0DE` poison (canary panics if a guest dereferences an unimplemented import variable).
- **ours**: HLE dispatch happens at the JIT translation layer, not by patching the thunk.
  `record_type=1` thunks keep their original `.pe` bytes; `record_type=0` slots get zeroed (still distinguishable from the .pe ordinal-tag content if guest code reads them).

Both are valid engine implementation choices.

## After canonicalization: the real check

Mask all import-slot bytes (`record_type=0` = 4 bytes per slot, `record_type=1` = 16 bytes per slot, total 3920 bytes across 398 slots) to `0xCD` in canary, ours, AND .pe. Then compare:

```
canary canonical sha256: 62c51908e2df705583fe81a084f39bd399196f9000cfa7bffd56127b41a4ab96
ours   canonical sha256: 62c51908e2df705583fe81a084f39bd399196f9000cfa7bffd56127b41a4ab96
pe     canonical sha256: 62c51908e2df705583fe81a084f39bd399196f9000cfa7bffd56127b41a4ab96
```

**All three match.** Bytes differing after canonicalization: **0**.

## Conclusion

There is **NO real engine divergence** at the image-load layer.

- Both engines decode the XEX2 file correctly.
- Both load it into guest memory at the correct virtual addresses.
- Both produce byte-identical content outside the import-patch region.
- Even .pe (an independent third-party offline XEX2 decoder) produces the exact same canonical content.

The Phase B `image_loaded_sha256` δ-content-STOP was a **false positive** caused by an overly strict invariant: it hashed engine-specific runtime patches as if they were XEX content.

## What the fix is

The fix is in the **comparison framework**, not the engines:

1. `diff_state.py`: relaxed STOP invariant. When `--xex-json` is provided AND both snapshots contain `image.bin`, compute and check `image_canonical_sha256` (engine-mask agnostic) as the real STOP key. The raw `image_loaded_sha256` is still reported, but only as informational output.
2. `phase_b_snapshot.{rs,cc}`: when `phase_b_dump_section_content` is set, emit `image.bin` (raw bytes of the XEX image region) so the diff tool can perform canonicalization. The cvar is off by default; with it off, the binary digest is byte-identical to the pre-Phase-C baseline.
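The canonical check described under "After canonicalization" can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the actual `diff_state.py` code: the function names are hypothetical, the import list is assumed to be a sequence of dicts with integer `record_type` and `address` fields (the real `xex.json` schema may differ), and the image base of `0x82000000` is inferred from the guest VA `0x82000600` mapping to file offset `0x600`.

```python
import hashlib

# Bytes covered by each import record type (4-byte var slot, 16-byte thunk).
SLOT_SIZE = {0: 4, 1: 16}
MASK_BYTE = 0xCD  # fill value applied to every import slot before hashing


def canonicalize(image: bytes, imports, image_base: int) -> bytes:
    """Return a copy of `image` with every import slot overwritten by MASK_BYTE.

    `imports` is assumed to be an iterable of dicts with integer
    `record_type` and `address` keys (hypothetical schema).
    """
    out = bytearray(image)
    for imp in imports:
        size = SLOT_SIZE[imp["record_type"]]
        off = imp["address"] - image_base
        if off < 0 or off + size > len(out):
            raise ValueError(f"import slot at {imp['address']:#x} lies outside the image")
        out[off:off + size] = bytes([MASK_BYTE]) * size
    return bytes(out)


def canonical_sha256(image: bytes, imports, image_base: int) -> str:
    """SHA-256 hex digest of the image after masking all import slots."""
    return hashlib.sha256(canonicalize(image, imports, image_base)).hexdigest()


def first_diff(a: bytes, b: bytes):
    """Offset of the first differing byte, or None if the buffers are identical."""
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i
    if len(a) != len(b):
        return min(len(a), len(b))
    return None
```

With helpers like these, two snapshots count as equivalent at the image-load layer when their `canonical_sha256` values agree, and `first_diff` applied to the canonicalized buffers re-finds the first divergence outside the import slots, or returns `None` when there is none.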
## What this implies for downstream divergences

The Phase B catalog's 57 remaining divergences (post-image-load) are still meaningful: they describe real differences in stack/PCR/TLS allocation strategy, heap layout, kernel-object population, and exports-table state. These are now interpretable on a verified canonically-equivalent image baseline.

The Phase A diff's first runtime divergence at `tid_event_idx=113` (`KeQuerySystemTime return_value`) is the next Phase C+1 target. It is **not** a downstream symptom of the image-load mismatch; it is the next genuine engine divergence in the kernel-call sequence.