xenia-rs/audit-runs/phase-c18-shared-global-race/cold-vs-cold-result.md
MechaCat02 ef93a4fa14 handoff: VSync/event-wedge fixes + iterate 2.A–2.BC research notes
Source changes (dormant parity infra, retained from iterate 2.AI/2.AO):
- xenia-kernel/exports.rs: nt_create_event manual_reset polarity +
  related event wiring
- xenia-gpu/mmio_region.rs: D1MODE_VBLANK_VLINE_STATUS hardcode parity

Also lands the audit-runs/ analysis notes (.md/.txt/.json digests) for the
iterate 2.x VSync/0x10e8/0x1004 wedge investigation. Raw trace dumps
(.jsonl/.gz/.csv/.stdout) and agent worktrees (.claude/) are gitignored as
regenerable local artifacts — see memory + HANDOFF for the running findings.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-05 07:19:08 +02:00


Phase C+18 cold-vs-cold result (2026-05-14)

Matched-prefix table

| canary_tid | ours_tid | C+17 | C+18 | delta | first_divergence_at | kind |
|---|---|---|---|---|---|---|
| 6 | 1 | 102,553 | 102,553 | 0 | 102,553 | NtDuplicateObject no handle.create (D-NEW-1, unchanged) |
| 4 | 11 | 11 | 11 | 0 | | no divergence in 11 events |
| 7 | 2 | 32 | 32 | 0 | | no divergence in 32 events |
| 12 | 7 | 3 | 3 | 0 | 3 | timeout_ns mismatch (D-NEW-2, unchanged) |
| 14 | 9 | 41 | 41 | 0 | 41 | unrelated XAudioGetVoiceCategoryVolumeChangeMask |
| 15 | 10 | 2 | 16 | +14 | | regression RESTORED (D-NEW-3 fix landed) |

tid=15→10 RESTORED: the matched prefix advances 2 → 16 (+14), back to the C+16 baseline. One ours-side floating handle.create event is absorbed by the Phase C+18 cross-tid SID matching (floating_skipped column = 0/1, i.e. 0 canary-side and 1 ours-side). No other chain regresses. The main chain is unchanged at 102,553.
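The absorb step described above can be sketched as follows. This is a minimal illustration, not the actual diff tool: the `Event` shape, the `matched_prefix` name, and the two "created elsewhere" SID sets are assumptions made for the example. It walks two per-thread event streams in lockstep and skips a `handle.create` only when its SID is known to be created on a different thread of the peer engine.

```python
from typing import NamedTuple


class Event(NamedTuple):
    kind: str  # event kind, e.g. "handle.create" or "import.call"
    sid: str   # stable object identifier; "" when the event has none


def matched_prefix(canary, ours, canary_elsewhere, ours_elsewhere):
    """Return (matched, canary_skipped, ours_skipped) for one tid pair.

    canary_elsewhere / ours_elsewhere are hypothetical sets holding the SIDs
    that each engine created on some *other* thread. A handle.create whose
    SID appears in the peer's set is treated as floating and skipped rather
    than counted as a divergence.
    """
    i = j = matched = c_skip = o_skip = 0
    while i < len(canary) and j < len(ours):
        c, o = canary[i], ours[j]
        if c == o:
            matched += 1
            i += 1
            j += 1
        elif o.kind == "handle.create" and o.sid in canary_elsewhere:
            o_skip += 1  # ours-side floating create, absorbed
            j += 1
        elif c.kind == "handle.create" and c.sid in ours_elsewhere:
            c_skip += 1  # canary-side floating create, absorbed
            i += 1
        else:
            break  # first real divergence
    return matched, c_skip, o_skip
```

With an empty "elsewhere" set the same walk stops at the floating create, which is the shape of the 2-event prefix reported before the fix; with the SID present in the peer's set the walk continues past it and reports it in the skipped counter.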

Acceptance gates

  • Gate 1 (default-off digest): PASS — 3× reproducible at e1dfcb1559f987b35012a7f2dc6d93f5 (unchanged from C+13/C+15-α/C+16/C+17 baseline). The fix is observation-only at the digest level; the new SID recipe is a string change in the event-log emit, NOT a guest-behavior change.
  • Gate 2 (cvar-on emit): PASS — ours 121,544 events (unchanged from C+17 — same emit count, different SID values), canary 3,517,980 events in ~90s.
  • Gate 3 (diff tool): PASS — produces 6-chain report with new floating_skipped (c/o) column. tid=15→10 shows 0/1 — exactly one ours-side floating create absorbed.
  • Gate 4 (cold-vs-cold): PASS — main matched prefix unchanged at 102,553, tid=15→10 restored from 2 → 16, all other sister chains unchanged.
  • Gate 5 (build): PASS — both engines build clean (only the pre-existing dead_code warning on walk_committed_regions).
  • Gate 6 (tests): PASS — ours kernel tests 191 → 193 (+2 new SID determinism tests). Diff-tool tests: 14/14 PASS (new test_diff_events.py).
  • Gate 7 (Phase B image hash): PASS — image_loaded_sha256 = ea8d160e9369328a5b922258a92113efb8d7ce3e1a5c12cc521e375985c91c18 (unchanged).
  • Gate 8 (event-log determinism): PASS — handle.create emit count unchanged (121,544 → 121,544 in ours). The new SID recipe is bit-deterministic over (pointer, object_type).
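Gates 1 and 8 both rest on the SID being a pure function of (pointer, object_type), with no thread ID in the input. The note does not spell out the recipe, so the sketch below is an assumption: `stable_sid`, the SHA-256 hash, the little-endian 64-bit packing, and the 16-character truncation are all hypothetical choices made only to show the property.

```python
import hashlib
import struct


def stable_sid(pointer: int, object_type: str) -> str:
    """Derive an identifier from (pointer, object_type) only.

    Hypothetical recipe: the real one is not given in this note. The point
    is that no thread ID, counter, or timestamp enters the hash, so the same
    dispatcher gets the same SID regardless of which thread touches it first.
    """
    payload = struct.pack("<Q", pointer) + b"\x00" + object_type.encode("utf-8")
    return hashlib.sha256(payload).hexdigest()[:16]
```

Because the output depends only on its two arguments, repeated runs and different emitting threads produce identical strings, which is what lets the emit count stay fixed while the SID values change.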

Sister-chain analysis

  • tid=4→11 (no divergence): unchanged 11 events matched.
  • tid=7→2 (no divergence): unchanged 32 events matched.
  • tid=12→7 (timeout_ns mismatch): unchanged at idx=3. D-NEW-2 is next-after-D-NEW-1 in the queue.
  • tid=14→9 (audio export): unchanged at idx=41. No D-NEW number has been assigned yet; to be triaged later.
  • tid=15→10 (RESTORED): the diff tool's floating-create absorb pulled tid=15's matched count back up to 16, which equals the full length of the canary tid=15 stream within the 20k truncation cap. The next divergence, if any, is presumably beyond the truncation window.

Refcount and stability audit

The fix touches the SID computation only — the handle_refcount and state.objects insertion logic is unchanged. The C+17 refcount-leak risk audit (5 tests) continues to apply unchanged.

The deterministic SID is a fresh value computed at first-touch and overwrites the registry entry. The old per-tid SID is never seen by the diff tool. No double-insertion or stale-mapping issues.

D-NEW-3 status

RESOLVED. The race is now invisible to the diff tool. Both engines emit the same SID for the same dispatcher; the diff tool absorbs the floating-tid mismatch via the cross-tid match.

Next target

C+19 = D-NEW-1 (NtDuplicateObject handle.create), unchanged from the C+17 plan. Canary's ObjectTable::DuplicateHandle allocates a fresh slot via AddHandle, which emits handle.create. Our nt_duplicate_object instead aliases the source (dup_id=source_id, per AUDIT-062) and does NOT emit a new event. The tradeoff is between mirroring canary (~30-40 LOC, with the risk of an AUDIT-062 worker-cluster regression) and suppressing the event in the diff tool (a band-aid).
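The difference between the two duplicate strategies can be shown with a toy model. Nothing here is taken from either engine: `ToyObjectTable`, its method names, and the handle numbering are hypothetical, and the model only illustrates why one strategy produces an extra handle.create event while the other does not.

```python
class ToyObjectTable:
    """Toy handle table that records a handle.create event per new slot."""

    def __init__(self):
        self.slots = {}           # handle -> object
        self.events = []          # emitted (kind, handle) tuples
        self._next_handle = 0x1000  # arbitrary starting value for the sketch

    def add_handle(self, obj):
        handle = self._next_handle
        self._next_handle += 4
        self.slots[handle] = obj
        self.events.append(("handle.create", handle))
        return handle

    def duplicate_mirror(self, source):
        # Fresh-slot strategy: a new handle for the same object, so a
        # second handle.create event is emitted.
        return self.add_handle(self.slots[source])

    def duplicate_alias(self, source):
        # Alias strategy: hand back the source handle itself, so no new
        # slot is allocated and no event is emitted.
        if source not in self.slots:
            raise KeyError(source)
        return source
```

In this model a mirrored duplicate leaves two handle.create events in the log and an aliased one leaves a single event, which is the one-event gap the diff reports at the D-NEW-1 divergence.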