handoff: VSync/event-wedge fixes + iterate 2.A–2.BC research notes

Source changes (dormant parity infra, retained from iterate 2.AI/2.AO):
- xenia-kernel/exports.rs: nt_create_event manual_reset polarity +
  related event wiring
- xenia-gpu/mmio_region.rs: D1MODE_VBLANK_VLINE_STATUS hardcode parity

Also lands the audit-runs/ analysis notes (.md/.txt/.json digests) for the
iterate 2.x VSync/0x10e8/0x1004 wedge investigation. Raw trace dumps
(.jsonl/.gz/.csv/.stdout) and agent worktrees (.claude/) are gitignored as
regenerable local artifacts — see memory + HANDOFF for the running findings.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
MechaCat02
2026-06-05 07:19:08 +02:00
parent acd1656753
commit ef93a4fa14
620 changed files with 108303 additions and 1 deletions


@@ -0,0 +1,91 @@
# Phase C+16 cold-vs-cold result (2026-05-14)
## Matched-prefix table
| canary_tid | ours_tid | C+15-α | C+16 | delta | first_divergence_at | kind |
|------------|----------|---------|---------|-------|---------------------|-----------------------------------|
| 6 | 1 | 102,168 | 102,171 | **+3**| 102,171 | `handle.create` (class E) |
| 4 | 11 | 8 | 8 | 0 | 8 | `handle.create` (class E) |
| 7 | 2 | 30 | 30 | 0 | 30 | `handle.create` (class E) |
| 12 | 7 | 2 | 2 | 0 | 2 | `handle.create` (class E) |
| 14 | 9 | 2 | 2 | 0 | 2 | `handle.create` (class E) |
| 15 | 10 | 16 | 16 | — | — | no divergence |

Main matched prefix advanced 102,168 → 102,171 (+3). All 5 sister
chains unchanged.
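The matched-prefix and first-divergence columns can be read as the output of a pairwise walk over the two event streams of one thread chain. The following is a minimal sketch of that computation, not the actual diff tool: the `Event` struct, its two fields, and `diff_chain` are hypothetical names, and it assumes payloads are already normalized so that plain equality is meaningful. It also assumes that a chain where one stream is a prefix of the other is reported as "no divergence".

```rust
/// Hypothetical, simplified event record (not the real JSONL schema).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Event {
    /// Event kind, e.g. "import.call", "handle.create", "wait.begin".
    pub kind: String,
    /// Payload, assumed to be canonicalized before comparison.
    pub payload: String,
}

/// Result of comparing one canary chain against one of our chains.
#[derive(Debug, PartialEq, Eq)]
pub struct ChainDiff {
    /// Number of leading events that are identical in both streams.
    pub matched_prefix: usize,
    /// Zero-based index of the first differing event, if any.
    pub first_divergence_at: Option<usize>,
}

pub fn diff_chain(canary: &[Event], ours: &[Event]) -> ChainDiff {
    // Count identical leading pairs; stops at the first mismatch.
    let matched = canary
        .iter()
        .zip(ours.iter())
        .take_while(|(a, b)| a == b)
        .count();

    // If the walk stopped before the shorter stream ended, the event at
    // index `matched` differs. Otherwise one stream is a prefix of the
    // other, which this sketch treats as "no divergence" (an assumption).
    let first_divergence_at = if matched < canary.len().min(ours.len()) {
        Some(matched)
    } else {
        None
    };

    ChainDiff { matched_prefix: matched, first_divergence_at }
}
```

With zero-based indexes the two columns coincide whenever a divergence exists, which is consistent with the table: a matched prefix of 102,171 events means the first differing event sits at index 102,171.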
## New first divergence (idx=102,171)
```
canary: [102169] import.call KeWaitForSingleObject
ours: [102169] import.call KeWaitForSingleObject
canary: [102170] kernel.call KeWaitForSingleObject
ours: [102170] kernel.call KeWaitForSingleObject
canary: [102171] handle.create sid=68fec8909ea5d1f5
ours: [102171] wait.begin {'handles_semantic_ids': ['0000000000000000'], ...}
canary: [102172] wait.begin {'handles_semantic_ids': ['68fec8909ea5d1f5'], ...}
ours: [102172] kernel.return KeWaitForSingleObject
```
This is **class E**, the same root cause as D-2/D-3/D-4 in the C+15-α
catalog. Canary's `xeKeWaitForSingleObject` calls
`XObject::GetNativeObject<XObject>(...)`, which creates a new handle for
the native dispatcher object on first encounter. Our
`resolve_pseudo_handle` does not, so `handles_semantic_ids` in our
`wait.begin` is `0000000000000000`. This is the next target, for
Phase C+17.
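The difference between the two engines can be illustrated with a get-or-create lookup versus a lookup-only one. This is a hedged sketch, not canary's or our actual code: `NativeHandleTable`, `get_or_create`, `lookup_only`, and `wait_begin` are hypothetical names, the table is keyed by an assumed guest pointer, and a simple counter stands in for the real semantic-id derivation, which this report does not describe.

```rust
use std::collections::HashMap;

/// Hypothetical table mapping a native dispatcher object (identified here
/// by its guest address) to a semantic id. Illustration only.
#[derive(Default)]
pub struct NativeHandleTable {
    by_guest_ptr: HashMap<u32, u64>,
    next_id: u64,
    /// Events emitted so far, in order.
    pub trace: Vec<String>,
}

impl NativeHandleTable {
    /// Canary-like behavior as described above: the first encounter of a
    /// native object creates a handle and emits `handle.create`.
    pub fn get_or_create(&mut self, guest_ptr: u32) -> u64 {
        if let Some(&id) = self.by_guest_ptr.get(&guest_ptr) {
            return id; // Already known: no new event.
        }
        // Assumption: a counter replaces the real semantic-id scheme.
        self.next_id += 1;
        let id = self.next_id;
        self.by_guest_ptr.insert(guest_ptr, id);
        self.trace.push(format!("handle.create sid={:016x}", id));
        id
    }

    /// Lookup-only behavior: an unseen object yields id 0 and no event.
    pub fn lookup_only(&self, guest_ptr: u32) -> u64 {
        self.by_guest_ptr.get(&guest_ptr).copied().unwrap_or(0)
    }
}

/// Formats a simplified `wait.begin` line for a single handle.
pub fn wait_begin(sid: u64) -> String {
    format!("wait.begin handles_semantic_ids=[{:016x}]", sid)
}
```

Under these assumptions, the lookup-only path reproduces the all-zero id seen in our trace, while the get-or-create path emits one extra `handle.create` event before `wait.begin`, which is the one-event shift visible at idx=102,171.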
## Acceptance gates
- **Gate 1 (default-off digest)**: PASS — 3× reproducible at
`e1dfcb1559f987b35012a7f2dc6d93f5` (unchanged from C+13/C+15-α
baseline). The refcount fix is observation-only at the digest level;
guest behavior is unchanged because no actual code path depends on
the precise destruction timing of the closed-but-still-running thread
handle within the 50M-instruction window.
- **Gate 2 (cvar-on emit)**: PASS — both engines produce JSONL cleanly
(ours 121,537 events; canary 2,512,481 events in 90s).
- **Gate 3 (diff tool)**: PASS — diff tool consumes events, produces
6-chain divergence report; main divergence at 102,171 (was 102,168).
- **Gate 4 (cold-vs-cold)**: PASS — main matched prefix advances +3,
no sister-chain regressions.
- **Gate 5 (build clean)**: PASS — `cargo build --release` clean
(1 pre-existing dead_code warning unrelated).
- **Gate 6 (tests)**: PASS — 181 → 186 (added 5 refcount lifecycle
tests; all pass).
- **Gate 7 (Phase B image hash)**: NOT EXECUTED (no engine change
reaches XEX load); inferred unchanged from invariant cold-stable
digest.
## Sister-chain analysis
No sister chain advanced beyond its C+15-α matched prefix. tid=4→11,
tid=7→2, tid=12→7, and tid=14→9 all diverge at the same indexes as
before, because the C+16 refcount fix is on a different code path from
the class-E `KeWaitForSingleObject` native-object handle creation.
C+17 must address class E to advance those chains.
## Reading-error class
None new. Reading-error #28 discipline (verify framing first) was
followed; canary source was read end-to-end for `XThread::Create`,
`XObject::RetainHandle`/`ReleaseHandle`, `ObjectTable::AddHandle`/
`RetainHandle`/`ReleaseHandle`/`RemoveHandle`, and
`XamTaskSchedule_entry`/`XamTaskCloseHandle_entry` before any code
change.
## Refcount leak risk audit
Three test cases cover the lifecycle balance:
1. `ex_create_then_close_then_exit_balances_refcount` — close first,
then exit. Refcount 2→1→0. Handle destroyed. No leak.
2. `xam_task_schedule_close_then_thread_exit_destroys_handle` — same
ordering via XAM path.
3. `xam_task_thread_exit_then_close_destroys_handle` — exit first,
then close. Refcount 2→1→0. Handle destroyed. No leak.
The remaining case (exit only, no close) leaves the refcount at 1
(creator-only), which is correct: the handle slot remains until the
creator explicitly closes it. This matches canary's behavior, where the
guest is responsible for closing the handles it allocated.
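The lifecycle balance described above can be sketched as a two-reference model: one reference held by the creator and one by the running thread. This is a simplified illustration under stated assumptions, not the engine's implementation: `HandleTable`, `create_thread_handle`, `close`, `thread_exit`, and `refcount` are hypothetical names, and the sketch does not track which owner releases a reference.

```rust
use std::collections::HashMap;

/// Hypothetical handle table tracking only reference counts.
#[derive(Default)]
pub struct HandleTable {
    refcounts: HashMap<u32, u32>,
}

impl HandleTable {
    /// Thread creation: one reference for the creator, one for the
    /// running thread, so the count starts at 2.
    pub fn create_thread_handle(&mut self, handle: u32) {
        self.refcounts.insert(handle, 2);
    }

    /// Drops one reference. Returns true if the slot was destroyed.
    fn release(&mut self, handle: u32) -> bool {
        let remaining = match self.refcounts.get_mut(&handle) {
            Some(count) => {
                *count -= 1; // Stored counts are always >= 1.
                *count
            }
            None => return false, // Unknown or already destroyed.
        };
        if remaining == 0 {
            self.refcounts.remove(&handle);
            true
        } else {
            false
        }
    }

    /// Creator closes its handle.
    pub fn close(&mut self, handle: u32) -> bool {
        self.release(handle)
    }

    /// Thread exits and drops its self-reference.
    pub fn thread_exit(&mut self, handle: u32) -> bool {
        self.release(handle)
    }

    /// Current count, or None once the slot has been destroyed.
    pub fn refcount(&self, handle: u32) -> Option<u32> {
        self.refcounts.get(&handle).copied()
    }
}
```

In this model both orderings reach zero and destroy the slot, while exit without close leaves the count at 1 until the creator closes the handle.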