handoff: VSync/event-wedge fixes + iterate 2.A–2.BC research notes

Source changes (dormant parity infra, retained from iterate 2.AI/2.AO):
- xenia-kernel/exports.rs: nt_create_event manual_reset polarity +
  related event wiring
- xenia-gpu/mmio_region.rs: D1MODE_VBLANK_VLINE_STATUS hardcode parity

Also lands the audit-runs/ analysis notes (.md/.txt/.json digests) for the
iterate 2.x VSync/0x10e8/0x1004 wedge investigation. Raw trace dumps
(.jsonl/.gz/.csv/.stdout) and agent worktrees (.claude/) are gitignored as
regenerable local artifacts — see memory + HANDOFF for the running findings.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
MechaCat02
2026-06-05 07:19:08 +02:00
parent acd1656753
commit ef93a4fa14
620 changed files with 108303 additions and 1 deletions


@@ -0,0 +1,108 @@
# Phase C+19 cold-vs-cold result (2026-05-14)
## Verified resolution of D-NEW-1
Direct inspection of `ours-cold.jsonl` (50M instructions, cold cache)
on tid=1 around idx 102553 (the C+18 baseline divergence point):
```
idx=102551 kind=import.call name=NtDuplicateObject
idx=102552 kind=kernel.call name=NtDuplicateObject
idx=102553 kind=handle.create [NEW in C+19 — fresh dup slot]
idx=102554 kind=kernel.return name=NtDuplicateObject ret=0
```
**D-NEW-1 RESOLVED at the source.** Ours now emits `handle.create`
between `kernel.call NtDuplicateObject` and `kernel.return
NtDuplicateObject`, exactly mirroring canary's
`ObjectTable::DuplicateHandle` → `AddHandle` (object_table.cc:210-223
→ 148-208). The new `handle.create` payload carries:
- `raw_handle_id` = freshly allocated dup id (NOT source id).
- `object_type` = same as source's `KernelObject` variant.
- `handle_semantic_id` = per-tid SID at the allocation point.
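
The ordering described above can be checked mechanically instead of by eye. The following is a minimal Python sketch, not part of the diff tool: it assumes each trace line is one JSON object carrying the `idx`, `kind`, `name` and `tid` fields seen in the excerpt (the real schema may differ), and `load_tid` and `dup_calls_missing_create` are hypothetical helper names. The `pre_fix_shape` list is synthetic, built by dropping the new event, and is not a copy of the C+18 log.

```python
import json

def load_tid(path, tid):
    """Load one thread's events from a JSONL trace (assumed: one JSON object per line)."""
    with open(path) as fh:
        events = (json.loads(line) for line in fh if line.strip())
        return [ev for ev in events if ev.get("tid") == tid]

def dup_calls_missing_create(events):
    """Return the idx of every NtDuplicateObject kernel.call that reaches its
    kernel.return without a handle.create in between."""
    missing = []
    open_call = None   # idx of the in-flight NtDuplicateObject kernel.call
    saw_create = False
    for ev in events:
        kind, name = ev.get("kind"), ev.get("name")
        if kind == "kernel.call" and name == "NtDuplicateObject":
            open_call, saw_create = ev["idx"], False
        elif kind == "handle.create" and open_call is not None:
            saw_create = True
        elif kind == "kernel.return" and name == "NtDuplicateObject" and open_call is not None:
            # Assumption: every dup is expected to allocate; failed calls are not special-cased.
            if not saw_create:
                missing.append(open_call)
            open_call = None
    return missing

# The four events from the excerpt above.
c19 = [
    {"idx": 102551, "kind": "import.call", "name": "NtDuplicateObject"},
    {"idx": 102552, "kind": "kernel.call", "name": "NtDuplicateObject"},
    {"idx": 102553, "kind": "handle.create"},
    {"idx": 102554, "kind": "kernel.return", "name": "NtDuplicateObject", "ret": 0},
]
# Synthetic pre-fix shape: same events without the handle.create.
pre_fix_shape = [ev for ev in c19 if ev["kind"] != "handle.create"]

print(dup_calls_missing_create(c19))            # []
print(dup_calls_missing_create(pre_fix_shape))  # [102552]
```

On a real log the call would look like `dup_calls_missing_create(load_tid("ours-cold.jsonl", 1))`, again assuming the field names above.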
## Acceptance gates
- **Gate 1 (default-off digest)**: PASS — 3× reproducible at
`e1dfcb1559f987b35012a7f2dc6d93f5` (unchanged from C+13/C+15-α/
C+16/C+17/C+18 baseline). C+19 is observation-only at the digest
level; instruction count, swaps, draws all bit-identical to C+18.
- **Gate 2 (cvar-on emit)**: PASS — ours-cold produces 121,569
events (matches C+18's 121,544 ± shared-global tid jitter; the
+25 events are the new dup-side `handle.create` and balancing
per-slot `handle.destroy` events).
- **Gate 3 (diff tool runs)**: PASS — produces 6-chain report.
- **Gate 4 (cold-vs-cold matched prefix)**: PARTIAL PASS — see
"Canary cache jitter" below.
- **Gate 5 (build)**: PASS — both engines build clean (only the
pre-existing `dead_code` warning on `walk_committed_regions`).
- **Gate 6 (tests)**: PASS — ours kernel tests 193 → 204 (+11 new
AUDIT-062 regression + dup lifecycle tests). Workspace tests all
pass.
- **Gate 7 (Phase B image hash)**: PASS — `image_loaded_sha256` =
`ea8d160e9369328a5b922258a92113efb8d7ce3e1a5c12cc521e375985c91c18`
(unchanged).
- **Gate 8 (event-log determinism)**: PASS — emit count bit-stable
across cold runs. The new `handle.create` and per-slot
`handle.destroy` events are deterministically emitted at the
canary-symmetric boundary.
- **Gate 9 (AUDIT-062 regression)**: PASS — see
`audit062-regression-check.md`. All 11 new tests guard the
signal-on-dup-wakes-wait-on-source invariant.
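
Gates 2 and 8 both rest on comparing event counts between runs. A per-kind count delta makes that comparison explicit. The sketch below is illustrative only: `kind_delta` is a hypothetical helper, the `kind` field name is taken from the excerpt above, and the two event lists are synthetic placeholders, not the real C+18 or C+19 logs.

```python
from collections import Counter

def kind_delta(before, after):
    """Per-kind event-count change from `before` to `after`; unchanged kinds are dropped."""
    b = Counter(ev["kind"] for ev in before)
    a = Counter(ev["kind"] for ev in after)
    return {k: a[k] - b[k] for k in sorted(set(a) | set(b)) if a[k] != b[k]}

# Synthetic stand-ins for two runs.
before = [{"kind": "kernel.call"}, {"kind": "kernel.return"}]
after = before + [{"kind": "handle.create"}, {"kind": "handle.destroy"}]

delta = kind_delta(before, after)
print(delta)                # {'handle.create': 1, 'handle.destroy': 1}
print(sum(delta.values()))  # 2
```

Applied to the real logs, a Gate 2 pass would show a total of +25 confined to `handle.create` and `handle.destroy`, and a Gate 8 pass would show an empty delta between two cold runs of the same build.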
## Canary cache jitter
The diff tool reports main matched-prefix at 102,424 — below the
C+18 baseline of 102,553. Investigation shows this is **canary-side
cache jitter, not a regression of the C+19 fix**:
```
C+18 baseline canary tid=6 idx=102424: status=0xc000000f (NO_SUCH_FILE)
C+19 canary v2 tid=6 idx=102424: status=0x00000000 (SUCCESS)
C+19 canary v3 tid=6 idx=102424: status=0x00000000 (SUCCESS)
C+19 canary v5 tid=6 idx=102424: status=0x00000000 (SUCCESS)
```
On the canary side, `NtQueryFullAttributesFile` returned different
statuses across cold runs, depending on cache state. The status on ours
at this idx is unchanged (`0xc000000f` in both the C+18 and C+19 baselines).
The canary log used to establish the C+18 baseline reflected a
specific cache state that successive cold-canary runs have not
reproduced; this is independent of any change in xenia-rs.
The C+19 fix's true effect is verified by direct inspection of
ours-cold.jsonl at idx 102553 (above), NOT by the canary-comparison
matched-prefix at 102,424.
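
To see why one flipped status on the canary side moves the matched prefix without any change in ours, consider a minimal Python sketch of a prefix comparison. This is not the diff tool's implementation: `matched_prefix`, the compared keys, and the three placeholder events are assumptions made for illustration.

```python
def matched_prefix(canary, ours, keys=("kind", "name", "status")):
    """Count the leading events that agree on every compared key."""
    n = 0
    for c, o in zip(canary, ours):
        if any(c.get(k) != o.get(k) for k in keys):
            break
        n += 1
    return n

NO_SUCH_FILE = 0xC000000F
SUCCESS = 0x00000000

def query(status):
    return {"kind": "kernel.return", "name": "NtQueryFullAttributesFile", "status": status}

# Three placeholder events stand in for the shared prefix before the query.
shared = [{"kind": "kernel.call", "name": "Placeholder", "status": None}] * 3

ours = shared + [query(NO_SUCH_FILE)]          # same in both baselines
canary_old = shared + [query(NO_SUCH_FILE)]    # cache state of the baseline log
canary_new = shared + [query(SUCCESS)]         # cache state of the later cold runs

print(matched_prefix(canary_old, ours))  # 4
print(matched_prefix(canary_new, ours))  # 3
```

The prefix stops at the index of the first disagreeing event, so a canary-only status change at idx 102424 caps the matched prefix at 102,424 regardless of what ours does later.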
## Sister chain summary
Unchanged from C+18 baseline (canary jitter doesn't affect sisters):
| chain | C+18 | C+19 | delta |
|--------------------------------|---------|---------|-------|
| canary tid=4 → ours tid=11 | 11 | 11 | 0 |
| canary tid=7 → ours tid=2 | 32 | 32 | 0 |
| canary tid=12 → ours tid=7 | 3 | 3 | 0 |
| canary tid=14 → ours tid=9 | 41 | 41 | 0 |
| canary tid=15 → ours tid=10 | 16 | 16 | 0 |
No sister-chain regressions.
## Conclusion
- Direct verification: D-NEW-1 RESOLVED.
- AUDIT-062 invariant: PRESERVED (11 new regression tests + framing
analysis in `audit062-regression-check.md`).
- Cold-stable digest: UNCHANGED.
- Build + tests: PASS.
- Sister chains: UNCHANGED.
- Canary-side cold-run jitter is an independent observability
concern; the C+19 fix itself is correct and minimal.
## Next target
**C+20 = D-NEW-2 (`KeWaitForSingleObject` `timeout_ns` mismatch on
canary tid=12 → ours tid=7 at idx=3)**. ε-class encoding divergence:
canary=`-30000000` ns, ours=`429466729600` ns. Likely a sign/scale
asymmetry in the timeout payload emitter.
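
One arithmetic observation supports the sign/scale hypothesis. NT timeouts are expressed in 100 ns ticks, with negative values meaning a relative wait. Both reported values are reproduced by a tick count of -300,000 (a 30 ms relative wait) if one path scales the signed value and the other scales only its low 32 bits read as unsigned. The Python sketch below shows the arithmetic; the function names are hypothetical and the emitter code itself has not been inspected, so this is a lead for C+20 rather than a confirmed root cause.

```python
TICK_NS = 100  # one NT timeout tick is 100 ns

def ns_signed(ticks):
    """Scale a signed tick count to nanoseconds."""
    return ticks * TICK_NS

def ns_low32_unsigned(ticks):
    """Scale only the low 32 bits of the tick count, read as unsigned (suspected bug shape)."""
    return (ticks & 0xFFFF_FFFF) * TICK_NS

ticks = -300_000  # relative wait of 300,000 ticks = 30 ms

print(ns_signed(ticks))          # -30000000
print(ns_low32_unsigned(ticks))  # 429466729600
```

If the emitter in ours truncates or zero-extends the timeout before scaling, a fix would sign-extend the full 64-bit value first; that would need to be confirmed against the payload code.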