handoff: VSync/event-wedge fixes + iterate 2.A–2.BC research notes

Source changes (dormant parity infra, retained from iterate 2.AI/2.AO):
- xenia-kernel/exports.rs: nt_create_event manual_reset polarity +
  related event wiring
- xenia-gpu/mmio_region.rs: D1MODE_VBLANK_VLINE_STATUS hardcode parity

Also lands the audit-runs/ analysis notes (.md/.txt/.json digests) for the
iterate 2.x VSync/0x10e8/0x1004 wedge investigation. Raw trace dumps
(.jsonl/.gz/.csv/.stdout) and agent worktrees (.claude/) are gitignored as
regenerable local artifacts — see memory + HANDOFF for the running findings.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
MechaCat02
2026-06-05 07:19:08 +02:00
parent acd1656753
commit ef93a4fa14
620 changed files with 108303 additions and 1 deletions

@@ -0,0 +1,83 @@
# Phase C+18 cold-vs-cold result (2026-05-14)
## Matched-prefix table
| canary_tid | ours_tid | C+17 | C+18 | delta | first_divergence_at | kind |
|------------|----------|---------|---------|----------|---------------------|------|
| 6 | 1 | 102,553 | 102,553 | 0 | 102,553 | `NtDuplicateObject` no `handle.create` (D-NEW-1, unchanged) |
| 4 | 11 | 11 | 11 | 0 | — | no divergence in 11 events |
| 7 | 2 | 32 | 32 | 0 | — | no divergence in 32 events |
| 12 | 7 | 3 | 3 | 0 | 3 | `timeout_ns` mismatch (D-NEW-2, unchanged) |
| 14 | 9 | 41 | 41 | 0 | 41 | unrelated `XAudioGetVoiceCategoryVolumeChangeMask` |
| 15 | 10 | 2 | **16** | **+14** | — | **regression RESTORED** (D-NEW-3 fix landed) |
**tid=15→10 RESTORED**: matched-prefix advances 2 → 16 (+14), back to
the C+16 baseline. One ours-side floating `handle.create` event is
absorbed by Phase C+18 cross-tid SID matching (`floating_skipped`
column = `0/1`). No other chains regress. Main chain unchanged at 102,553.
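
The absorb step can be pictured with a minimal sketch. This is not the diff tool's actual code: the `Event` shape, the `matched_prefix` helper, and the `floating_sids` set (SIDs whose `handle.create` is known to appear on a different tid in the peer engine) are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    kind: str  # e.g. "handle.create", "wait" (illustrative kinds)
    sid: str   # stable id of the object the event refers to


def matched_prefix(canary, ours, floating_sids):
    """Return (matched, skipped_canary, skipped_ours) for one tid pair.

    Assumption: a `handle.create` whose SID is in `floating_sids` may be
    skipped on either side instead of ending the matched prefix.
    """
    i = j = matched = skipped_c = skipped_o = 0
    while i < len(canary) and j < len(ours):
        c, o = canary[i], ours[j]
        if c == o:
            matched += 1
            i += 1
            j += 1
        elif o.kind == "handle.create" and o.sid in floating_sids:
            skipped_o += 1  # absorb an ours-side floating create
            j += 1
        elif c.kind == "handle.create" and c.sid in floating_sids:
            skipped_c += 1  # absorb a canary-side floating create
            i += 1
        else:
            break  # genuine divergence
    return matched, skipped_c, skipped_o


canary = [Event("wait", "s1"), Event("signal", "s1"), Event("wait", "s2")]
ours = [Event("wait", "s1"), Event("signal", "s1"),
        Event("handle.create", "s9"), Event("wait", "s2")]

print(matched_prefix(canary, ours, set()))   # (2, 0, 0): stops at the create
print(matched_prefix(canary, ours, {"s9"}))  # (3, 0, 1): create absorbed
```

The toy streams reproduce the shape of the tid=15→10 result: without the absorb the prefix stops at 2, and with it the prefix continues while `floating_skipped` reads `0/1`.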
## Acceptance gates
- **Gate 1 (default-off digest)**: PASS — 3× reproducible at
`e1dfcb1559f987b35012a7f2dc6d93f5` (unchanged from
C+13/C+15-α/C+16/C+17 baseline). The fix is observation-only at the
digest level; the new SID recipe is a string change in the event-log
emit, NOT a guest-behavior change.
- **Gate 2 (cvar-on emit)**: PASS — ours 121,544 events (unchanged
from C+17 — same emit count, different SID values), canary
3,517,980 events in ~90s.
- **Gate 3 (diff tool)**: PASS — produces 6-chain report with new
`floating_skipped (c/o)` column. tid=15→10 shows `0/1` — exactly
one ours-side floating create absorbed.
- **Gate 4 (cold-vs-cold)**: PASS — main matched prefix unchanged at
102,553, tid=15→10 restored from 2 → 16, all other sister chains
unchanged.
- **Gate 5 (build)**: PASS — both engines build clean (only the
pre-existing `dead_code` warning on `walk_committed_regions`).
- **Gate 6 (tests)**: PASS — ours kernel tests 191 → 193 (+2 new SID
determinism tests). Diff-tool tests: 14/14 PASS (new
`test_diff_events.py`).
- **Gate 7 (Phase B image hash)**: PASS — `image_loaded_sha256` =
`ea8d160e9369328a5b922258a92113efb8d7ce3e1a5c12cc521e375985c91c18`
(unchanged).
- **Gate 8 (event-log determinism)**: PASS — `handle.create` emit
count unchanged (121,544 → 121,544 in ours). The new SID recipe is
bit-deterministic over `(pointer, object_type)`.
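
As an illustration of what "bit-deterministic over `(pointer, object_type)`" means, here is a minimal sketch. The real recipe lives in the kernel and is not shown in this note; the hash choice (SHA-256), the 16-hex-digit truncation, and the name `deterministic_sid` are assumptions made purely for the example.

```python
import hashlib
import struct


def deterministic_sid(pointer: int, object_type: str) -> str:
    """Hypothetical SID recipe: depends only on (pointer, object_type).

    No tid, timestamp, or counter is mixed in, so two engines that see the
    same dispatcher at the same guest address produce the same string.
    """
    payload = struct.pack("<Q", pointer) + object_type.encode("ascii")
    return hashlib.sha256(payload).hexdigest()[:16]


# Same inputs always give the same SID, regardless of which thread asks.
a = deterministic_sid(0x82001000, "Event")
b = deterministic_sid(0x82001000, "Event")
c = deterministic_sid(0x82001000, "Semaphore")
print(a == b, a == c)
```

Because the function takes no thread id, a determinism test reduces to calling it twice with the same arguments and comparing the results, which is the kind of check the two new SID tests in Gate 6 presumably perform.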
## Sister-chain analysis
- **tid=4→11** (no divergence): unchanged 11 events matched.
- **tid=7→2** (no divergence): unchanged 32 events matched.
- **tid=12→7** (`timeout_ns` mismatch): unchanged at idx=3. D-NEW-2 is
next-after-D-NEW-1 in the queue.
- **tid=14→9** (audio export): unchanged at idx=41. No D-NEW id
assigned yet; to be triaged later.
- **tid=15→10** (RESTORED): the diff tool's floating-create absorb
pulled tid=15's matched count back up to 16, which equals the full
canary tid=15 stream length within the 20k truncation cap. Any further
divergence presumably lies beyond the truncation window.
## Refcount and stability audit
The fix touches the SID computation only — the `handle_refcount` and
`state.objects` insertion logic is unchanged. The C+17 refcount-leak
risk audit (5 tests) continues to apply unchanged.
The deterministic SID is a fresh value computed at first-touch and
overwrites the registry entry. The old per-tid SID is never seen by
the diff tool. No double-insertion or stale-mapping issues.
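
The overwrite-at-first-touch behavior can be sketched as follows. This is a toy model, not the kernel's registry: `SidRegistry`, its method names, and both SID string formats are hypothetical.

```python
class SidRegistry:
    """Toy registry keyed by guest pointer (assumed shape)."""

    def __init__(self):
        self._sid_by_pointer = {}

    def insert_per_tid(self, pointer, tid, seq):
        # Old-style SID that depends on the creating thread.
        self._sid_by_pointer[pointer] = f"tid{tid}:{seq}"

    def first_touch(self, pointer, object_type):
        # Deterministic SID replaces the entry in place: same key, so
        # there is never a second entry for the same object.
        sid = f"{object_type}@{pointer:#010x}"
        self._sid_by_pointer[pointer] = sid
        return sid

    def sid_for_emit(self, pointer):
        return self._sid_by_pointer[pointer]

    def __len__(self):
        return len(self._sid_by_pointer)


reg = SidRegistry()
reg.insert_per_tid(0x1000, tid=10, seq=1)
reg.first_touch(0x1000, "Event")
print(len(reg), reg.sid_for_emit(0x1000))  # 1 Event@0x00001000
```

Since the registry is keyed by pointer and the write is a plain overwrite, the per-tid value is gone before any event referencing the object is emitted, and repeating `first_touch` is idempotent.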
## D-NEW-3 status
**RESOLVED**. The race is now invisible to the diff tool. Both engines
emit the same SID for the same dispatcher; the diff tool absorbs the
floating-tid mismatch via the cross-tid match.
## Next target
**C+19 = D-NEW-1 (`NtDuplicateObject` `handle.create`)**, unchanged
from C+17 plan. Canary's `ObjectTable::DuplicateHandle` allocates a
fresh slot via `AddHandle` (emits `handle.create`); ours'
`nt_duplicate_object` aliases via `dup_id=source_id` per AUDIT-062 and
does NOT emit a new event. The tradeoff is between mirroring canary
(~30-40 LOC, risk = AUDIT-062 worker-cluster regression) and
suppressing the event in the diff tool (a band-aid).
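
The two duplicate strategies differ only in whether a new slot, and therefore a new event, is produced. The sketch below is a toy model: `ObjectTable`, its methods, and the handle numbering are hypothetical stand-ins for canary's `ObjectTable::DuplicateHandle` and ours' `nt_duplicate_object`, not their actual code.

```python
class ObjectTable:
    """Toy handle table that logs `handle.create` on every new slot."""

    def __init__(self):
        self.slots = {}       # handle id -> object
        self.events = []      # emitted event log
        self._next = 0x1000   # arbitrary starting handle value

    def add_handle(self, obj):
        hid = self._next
        self._next += 4
        self.slots[hid] = obj
        self.events.append(("handle.create", hid))
        return hid

    def duplicate_mirror(self, source_id):
        # Canary-style: fresh slot for the same object, emits an event.
        return self.add_handle(self.slots[source_id])

    def duplicate_alias(self, source_id):
        # Ours-style (dup_id = source_id): no new slot, no event.
        return source_id


table = ObjectTable()
src = table.add_handle(object())
alias = table.duplicate_alias(src)
print(alias == src, len(table.events))    # True 1
mirror = table.duplicate_mirror(src)
print(mirror == src, len(table.events))   # False 2
```

In this model the event-log gap behind D-NEW-1 is exactly the one extra `handle.create` that the mirror path emits and the alias path does not.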