xenia-rs/audit-runs/phase-c16-XamTaskCloseHandle-refcount/cold-vs-cold-result.md
MechaCat02 ef93a4fa14 handoff: VSync/event-wedge fixes + iterate 2.A–2.BC research notes
Source changes (dormant parity infra, retained from iterate 2.AI/2.AO):
- xenia-kernel/exports.rs: nt_create_event manual_reset polarity +
  related event wiring
- xenia-gpu/mmio_region.rs: D1MODE_VBLANK_VLINE_STATUS hardcode parity

Also lands the audit-runs/ analysis notes (.md/.txt/.json digests) for the
iterate 2.x VSync/0x10e8/0x1004 wedge investigation. Raw trace dumps
(.jsonl/.gz/.csv/.stdout) and agent worktrees (.claude/) are gitignored as
regenerable local artifacts — see memory + HANDOFF for the running findings.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-05 07:19:08 +02:00


Phase C+16 cold-vs-cold result (2026-05-14)

Matched-prefix table

| canary_tid | ours_tid | C+15-α | C+16 | delta | first_divergence_at | kind |
|---|---|---|---|---|---|---|
| 6 | 1 | 102,168 | 102,171 | +3 | 102,171 | handle.create (class E) |
| 4 | 11 | 8 | 8 | 0 | 8 | handle.create (class E) |
| 7 | 2 | 30 | 30 | 0 | 30 | handle.create (class E) |
| 12 | 7 | 2 | 2 | 0 | 2 | handle.create (class E) |
| 14 | 9 | 2 | 2 | 0 | 2 | handle.create (class E) |
| 15 | 10 | 16 | 16 | | no divergence | |

Main matched prefix advanced 102,168 → 102,171 (+3). All 5 sister chains unchanged.
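
As a rough illustration of how the matched prefix and first-divergence index in the table relate, here is a minimal sketch. The `Event` shape and both function names are hypothetical and are not taken from the actual diff tool, which may compare only selected semantic fields rather than whole events.

```rust
/// One trace event, reduced to two fields (hypothetical shape, not the real JSONL schema).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Event {
    pub kind: String,    // e.g. "import.call", "handle.create", "wait.begin"
    pub payload: String, // e.g. a function name or a semantic id
}

/// Index of the first position where the two chains differ,
/// or None if they agree over the length of the shorter chain.
pub fn first_divergence(canary: &[Event], ours: &[Event]) -> Option<usize> {
    canary.iter().zip(ours.iter()).position(|(c, o)| c != o)
}

/// Number of leading events that match. When a divergence exists this equals
/// its index, which is why the C+16 and first_divergence_at columns agree.
pub fn matched_prefix(canary: &[Event], ours: &[Event]) -> usize {
    first_divergence(canary, ours).unwrap_or_else(|| canary.len().min(ours.len()))
}
```

Under this model, a chain pair with no divergence reports the shorter chain's length as its matched prefix.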

New first divergence (idx=102,171)

canary: [102169] import.call KeWaitForSingleObject
ours:   [102169] import.call KeWaitForSingleObject
canary: [102170] kernel.call KeWaitForSingleObject
ours:   [102170] kernel.call KeWaitForSingleObject
canary: [102171] handle.create sid=68fec8909ea5d1f5
ours:   [102171] wait.begin {'handles_semantic_ids': ['0000000000000000'], ...}
canary: [102172] wait.begin {'handles_semantic_ids': ['68fec8909ea5d1f5'], ...}
ours:   [102172] kernel.return KeWaitForSingleObject

This is class E: the same root cause as D-2/D-3/D-4 in the C+15-α catalog. Canary's xeKeWaitForSingleObject calls XObject::GetNativeObject<XObject>(...), which creates a new handle for the native dispatcher object on first encounter. Our resolve_pseudo_handle does not, so handles_semantic_ids in our wait.begin is 0000000000000000. This is the Phase C+17 target.
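
The difference between the two behaviors can be sketched with a toy handle table. Everything below is a hypothetical model (type, method names, id scheme, and event strings are all assumptions for illustration); it is not the canary or xenia-rs implementation, and real semantic ids are not sequential counters.

```rust
use std::collections::HashMap;

/// Toy handle table keyed by the guest address of a native dispatcher object.
#[derive(Default)]
pub struct HandleTable {
    by_guest_ptr: HashMap<u32, u64>,
    next_id: u64,
    pub events: Vec<String>,
}

impl HandleTable {
    /// Lookup only: yields 0 when the object was never registered.
    pub fn resolve_existing(&self, guest_ptr: u32) -> u64 {
        self.by_guest_ptr.get(&guest_ptr).copied().unwrap_or(0)
    }

    /// Get-or-create: registers the object on first encounter and
    /// records a handle.create event before returning its id.
    pub fn get_or_create(&mut self, guest_ptr: u32) -> u64 {
        if let Some(&id) = self.by_guest_ptr.get(&guest_ptr) {
            return id;
        }
        self.next_id += 1;
        let id = self.next_id;
        self.by_guest_ptr.insert(guest_ptr, id);
        self.events.push(format!("handle.create sid={:016x}", id));
        id
    }
}

/// Wait path that only looks up the handle (the behavior described for ours).
pub fn wait_lookup_only(table: &mut HandleTable, guest_ptr: u32) -> u64 {
    let id = table.resolve_existing(guest_ptr);
    table.events.push(format!("wait.begin sid={:016x}", id));
    id
}

/// Wait path that creates the handle on first encounter (the behavior described for canary).
pub fn wait_get_or_create(table: &mut HandleTable, guest_ptr: u32) -> u64 {
    let id = table.get_or_create(guest_ptr);
    table.events.push(format!("wait.begin sid={:016x}", id));
    id
}
```

In this model the lookup-only path emits a single wait.begin with an all-zero id, while the get-or-create path emits handle.create followed by wait.begin, which mirrors the event ordering seen at index 102,171.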

Acceptance gates

  • Gate 1 (default-off digest): PASS — 3× reproducible at e1dfcb1559f987b35012a7f2dc6d93f5 (unchanged from C+13/C+15-α baseline). The refcount fix is observation-only at the digest level; guest behavior is unchanged because no actual code path depends on the precise destruction timing of the closed-but-still-running thread handle within the 50M-instruction window.
  • Gate 2 (cvar-on emit): PASS — both engines produce JSONL cleanly (ours 121,537 events; canary 2,512,481 events in 90s).
  • Gate 3 (diff tool): PASS — diff tool consumes events, produces 6-chain divergence report; main divergence at 102,171 (was 102,168).
  • Gate 4 (cold-vs-cold): PASS — main matched prefix advances +3, no sister-chain regressions.
  • Gate 5 (build clean): PASS — cargo build --release clean (1 pre-existing dead_code warning unrelated).
  • Gate 6 (tests): PASS — 181 → 186 (added 5 refcount lifecycle tests; all pass).
  • Gate 7 (Phase B image hash): NOT EXECUTED (no engine change reaches XEX load); inferred unchanged from invariant cold-stable digest.

Sister-chain analysis

No sister chain advanced beyond its C+15-α matched prefix. tid=4→11, tid=7→2, tid=12→7, and tid=14→9 all diverge at the same indexes as before, because the C+16 refcount fix is on a different code path from the class-E KeWaitForSingleObject native-object handle creation. C+17 must address class E to advance those chains.

Reading-error class

None new. Reading-error #28 discipline (verify framing first) was followed; canary source was read end-to-end for XThread::Create, XObject::RetainHandle/ReleaseHandle, ObjectTable::AddHandle/RetainHandle/ReleaseHandle/RemoveHandle, and XamTaskSchedule_entry/XamTaskCloseHandle_entry before any code change.

Refcount leak risk audit

Three test cases cover the lifecycle balance:

  1. ex_create_then_close_then_exit_balances_refcount — close first, then exit. Refcount 2→1→0. Handle destroyed. No leak.
  2. xam_task_schedule_close_then_thread_exit_destroys_handle — same ordering via XAM path.
  3. xam_task_thread_exit_then_close_destroys_handle — exit first, then close. Refcount 2→1→0. Handle destroyed. No leak.

The remaining case (no close, only exit) leaves the refcount at 1 (creator-only), which is correct: the handle slot remains until the creator explicitly closes it. This matches canary's behavior, since the guest is responsible for closing the handles it allocated.
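
The lifecycle balance described above can be modeled in a few lines. This is a minimal sketch that assumes the initial count of 2 is one reference for the creator and one for the running thread; `ThreadHandle` and its methods are hypothetical names and do not correspond to the actual test fixtures or kernel types.

```rust
/// Toy model of a thread handle with two owners: the creator and the running thread.
#[derive(Debug)]
pub struct ThreadHandle {
    refcount: u32,
    destroyed: bool,
}

impl ThreadHandle {
    /// Starts at 2: one reference for the creator, one for the thread itself (assumed).
    pub fn new() -> Self {
        Self { refcount: 2, destroyed: false }
    }

    /// Drops one reference and destroys the handle when the count reaches zero.
    fn release(&mut self) {
        assert!(self.refcount > 0, "release on an already destroyed handle");
        self.refcount -= 1;
        if self.refcount == 0 {
            self.destroyed = true;
        }
    }

    /// The creator closes its handle.
    pub fn close(&mut self) {
        self.release();
    }

    /// The thread exits and drops its self-reference.
    pub fn thread_exit(&mut self) {
        self.release();
    }

    pub fn refcount(&self) -> u32 {
        self.refcount
    }

    pub fn is_destroyed(&self) -> bool {
        self.destroyed
    }
}
```

Both orderings reach zero and destroy the handle, while exit without close leaves the count at 1. The model does not distinguish which owner released, so it would not catch a double close.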