xenia-rs/audit-runs/phase-c18-shared-global-race/cold-vs-cold-result.md
MechaCat02 ef93a4fa14 handoff: VSync/event-wedge fixes + iterate 2.A–2.BC research notes
Source changes (dormant parity infra, retained from iterate 2.AI/2.AO):
- xenia-kernel/exports.rs: nt_create_event manual_reset polarity +
  related event wiring
- xenia-gpu/mmio_region.rs: D1MODE_VBLANK_VLINE_STATUS hardcode parity

Also lands the audit-runs/ analysis notes (.md/.txt/.json digests) for the
iterate 2.x VSync/0x10e8/0x1004 wedge investigation. Raw trace dumps
(.jsonl/.gz/.csv/.stdout) and agent worktrees (.claude/) are gitignored as
regenerable local artifacts — see memory + HANDOFF for the running findings.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-05 07:19:08 +02:00

# Phase C+18 cold-vs-cold result (2026-05-14)
## Matched-prefix table
| canary_tid | ours_tid | C+17 | C+18 | delta | first_divergence_at | kind |
|------------|----------|---------|---------|----------|---------------------|------|
| 6 | 1 | 102,553 | 102,553 | 0 | 102,553 | `NtDuplicateObject` no `handle.create` (D-NEW-1, unchanged) |
| 4 | 11 | 11 | 11 | 0 | — | no divergence in 11 events |
| 7 | 2 | 32 | 32 | 0 | — | no divergence in 32 events |
| 12 | 7 | 3 | 3 | 0 | 3 | `timeout_ns` mismatch (D-NEW-2, unchanged) |
| 14 | 9 | 41 | 41 | 0 | 41 | unrelated `XAudioGetVoiceCategoryVolumeChangeMask` |
| 15 | 10 | 2 | **16** | **+14** | — | **regression RESTORED** (D-NEW-3 fix landed) |

**tid=15→10 RESTORED**: the matched prefix advances from 2 to 16 (+14),
back to the C+16 baseline. One ours-side floating `handle.create` event
is absorbed by the Phase C+18 cross-tid SID matching (`floating_skipped`
column = `0/1`). No other chain regresses, and the main chain is
unchanged at 102,553.
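
As a rough illustration of how a matched prefix with floating-create absorption could be computed, here is a minimal Python sketch. It is not the actual diff tool: the `Event` shape, the `matched_prefix` function, and the `floating_sids` set (SIDs whose `handle.create` is assumed to appear on a different tid in the peer engine) are all hypothetical names introduced for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """Hypothetical event shape: an event kind plus a stable object id (SID)."""
    kind: str
    sid: str


def matched_prefix(canary, ours, floating_sids):
    """Return (matched, canary_skipped, ours_skipped) for two per-tid streams.

    A `handle.create` whose SID is in `floating_sids` is skipped instead of
    being counted as a divergence. Any other mismatch ends the prefix.
    """
    i = j = matched = c_skip = o_skip = 0
    while i < len(canary) and j < len(ours):
        if canary[i] == ours[j]:
            matched += 1
            i += 1
            j += 1
        elif ours[j].kind == "handle.create" and ours[j].sid in floating_sids:
            o_skip += 1  # absorb an ours-side floating create
            j += 1
        elif canary[i].kind == "handle.create" and canary[i].sid in floating_sids:
            c_skip += 1  # absorb a canary-side floating create
            i += 1
        else:
            break  # first real divergence
    return matched, c_skip, o_skip
```

With a three-event canary stream and an ours stream that carries one extra floating create in the middle, this sketch reports three matched events and a skip count of `0/1`, which is the shape of the `floating_skipped (c/o)` column described above.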
## Acceptance gates
- **Gate 1 (default-off digest)**: PASS — 3× reproducible at
`e1dfcb1559f987b35012a7f2dc6d93f5` (unchanged from
C+13/C+15-α/C+16/C+17 baseline). The fix is observation-only at the
digest level; the new SID recipe is a string change in the event-log
emit, NOT a guest-behavior change.
- **Gate 2 (cvar-on emit)**: PASS — ours 121,544 events (unchanged
from C+17 — same emit count, different SID values), canary
3,517,980 events in ~90s.
- **Gate 3 (diff tool)**: PASS — produces 6-chain report with new
`floating_skipped (c/o)` column. tid=15→10 shows `0/1` — exactly
one ours-side floating create absorbed.
- **Gate 4 (cold-vs-cold)**: PASS — main matched prefix unchanged at
102,553, tid=15→10 restored from 2 → 16, all other sister chains
unchanged.
- **Gate 5 (build)**: PASS — both engines build clean (only the
pre-existing `dead_code` warning on `walk_committed_regions`).
- **Gate 6 (tests)**: PASS — ours kernel tests 191 → 193 (+2 new SID
determinism tests). Diff-tool tests: 14/14 PASS (new
`test_diff_events.py`).
- **Gate 7 (Phase B image hash)**: PASS — `image_loaded_sha256` =
`ea8d160e9369328a5b922258a92113efb8d7ce3e1a5c12cc521e375985c91c18`
(unchanged).
- **Gate 8 (event-log determinism)**: PASS — `handle.create` emit
count unchanged (121,544 → 121,544 in ours). The new SID recipe is
bit-deterministic over `(pointer, object_type)`.
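
To make the Gate 8 property concrete, the sketch below shows one way an SID could be derived purely from `(pointer, object_type)`. The real recipe is not reproduced in this note, so the function name `stable_sid`, the string layout, and the use of SHA-256 are assumptions chosen only for illustration.

```python
import hashlib


def stable_sid(pointer: int, object_type: str) -> str:
    """Hypothetical SID recipe: a hash of (pointer, object_type) only.

    No thread id, timestamp, or counter is mixed in, so any thread that
    touches the same object first derives the same SID.
    """
    payload = f"{object_type}:{pointer:08x}".encode("utf-8")
    return hashlib.sha256(payload).hexdigest()[:16]
```

Because the function takes no tid argument, the emitting thread cannot influence the value; that is the property a determinism test would want to pin down.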
## Sister-chain analysis
- **tid=4→11** (no divergence): unchanged 11 events matched.
- **tid=7→2** (no divergence): unchanged 32 events matched.
- **tid=12→7** (`timeout_ns` mismatch): unchanged at idx=3. D-NEW-2 is
next-after-D-NEW-1 in the queue.
- **tid=14→9** (audio export): unchanged at idx=41. No D-NEW id has been
assigned yet; to be triaged later.
- **tid=15→10** (RESTORED): the diff tool's floating-create absorb
pulled tid=15's matched count back up to 16, which equals the full
canary tid=15 stream length within the 20k truncation cap. Any further
divergence is presumably beyond the truncation window.
## Refcount and stability audit
The fix touches the SID computation only — the `handle_refcount` and
`state.objects` insertion logic is unchanged. The C+17 refcount-leak
risk audit (5 tests) continues to apply unchanged.

The deterministic SID is a fresh value computed at first touch; it
overwrites the registry entry, so the diff tool never sees the old
per-tid SID. There are no double-insertion or stale-mapping issues.
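
The first-touch overwrite can be pictured with a small Python model. This is a sketch under assumptions, not the kernel code: the `Registry` class, its `seed_per_tid` and `first_touch` methods, and the SID string formats are hypothetical stand-ins.

```python
class Registry:
    """Toy model of an object registry keyed by guest pointer."""

    def __init__(self):
        self.sids = {}        # pointer -> SID currently on record
        self.touched = set()  # pointers whose SID is already deterministic

    def seed_per_tid(self, pointer, tid):
        # Models the old behavior: an SID that depends on the creating tid.
        self.sids[pointer] = f"tid{tid}#{pointer:08x}"

    def first_touch(self, pointer, object_type):
        # On first touch, replace whatever is on record with a value derived
        # only from (pointer, object_type). Later touches are no-ops.
        if pointer not in self.touched:
            self.sids[pointer] = f"{object_type}@{pointer:08x}"
            self.touched.add(pointer)
        return self.sids[pointer]
```

In this model the registry keeps exactly one entry per pointer, and a second touch returns the same SID, which corresponds to the absence of double insertion and stale mappings claimed above.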
## D-NEW-3 status
**RESOLVED**. The race is now invisible to the diff tool. Both engines
emit the same SID for the same dispatcher; the diff tool absorbs the
floating-tid mismatch via the cross-tid match.
## Next target
**C+19 = D-NEW-1 (`NtDuplicateObject` `handle.create`)**, unchanged
from the C+17 plan. Canary's `ObjectTable::DuplicateHandle` allocates a
fresh slot via `AddHandle`, which emits `handle.create`; ours aliases
the handle in `nt_duplicate_object` via `dup_id=source_id` (per
AUDIT-062) and does NOT emit a new event. The choice is between
mirroring canary's behavior (~30-40 LOC, at the risk of an AUDIT-062
worker-cluster regression) and suppressing the event in the diff tool
(a band-aid).
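
For comparison, the diff-tool suppress option might look roughly like the Python filter below. It is only a sketch of the band-aid approach: the dict-based event shape, the `kind` field, and the rule "drop a `handle.create` that directly follows an `NtDuplicateObject` event" are assumptions, not the diff tool's actual logic.

```python
def suppress_dup_creates(events):
    """Drop `handle.create` events that directly follow `NtDuplicateObject`.

    `events` is a list of dicts with at least a "kind" key (hypothetical
    shape). The input list is not modified; a filtered copy is returned.
    """
    out = []
    prev_kind = None
    for ev in events:
        is_dup_create = (
            ev["kind"] == "handle.create" and prev_kind == "NtDuplicateObject"
        )
        if not is_dup_create:
            out.append(ev)
        prev_kind = ev["kind"]
    return out
```

A filter like this would hide the divergence on the canary side without changing either engine, which is why it counts as a band-aid rather than a parity fix.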