xenia-rs/audit-runs/phase-c16-XamTaskCloseHandle-refcount/investigation.md
MechaCat02 ef93a4fa14 handoff: VSync/event-wedge fixes + iterate 2.A–2.BC research notes
Source changes (dormant parity infra, retained from iterate 2.AI/2.AO):
- xenia-kernel/exports.rs: nt_create_event manual_reset polarity +
  related event wiring
- xenia-gpu/mmio_region.rs: D1MODE_VBLANK_VLINE_STATUS hardcode parity

Also lands the audit-runs/ analysis notes (.md/.txt/.json digests) for the
iterate 2.x VSync/0x10e8/0x1004 wedge investigation. Raw trace dumps
(.jsonl/.gz/.csv/.stdout) and agent worktrees (.claude/) are gitignored as
regenerable local artifacts — see memory + HANDOFF for the running findings.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-05 07:19:08 +02:00


Phase C+16 Investigation — XamTaskCloseHandle refcount (2026-05-14)

Framing verification (reading-error #28 discipline)

C+15-α's catalog D-1 hypothesis was: "canary's spawned thread keeps an additional ref on the thread handle (object->Retain() in XThread::Create line 408 via RetainHandle())". Verified against canary source.

Canary's refcount model (xobject.cc + util/object_table.cc)

Two separate refcounts on XObject:

  1. pointer_ref_count_ — the C++ object pointer refcount.
    • Bumped by XObject::Retain() / dropped by XObject::Release().
    • AddHandle() calls object->Retain() once when inserting into table.
  2. handle_ref_count in ObjectTableEntry — the per-handle (per-slot) refcount that determines when the object is removed from the object table.
    • Initialized to 1 in AddHandle() (object_table.cc:164).
    • Bumped by RetainHandle() (object_table.cc:218-228 → entry->handle_ref_count++).
    • Decremented by ReleaseHandle() (object_table.cc:230-249).
    • On reaching 0, calls RemoveHandle() which emits handle.destroy and releases the pointer ref (object->Release()).
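
The two-level model above can be sketched in Rust as a toy. This is not canary's code: `Object`, `Entry`, `ObjectTable`, `pointer_refs`, and the handle numbering are hypothetical, and `Rc`'s strong count stands in for pointer_ref_count_. Only the counting rules come from the notes above.

```rust
use std::collections::HashMap;
use std::rc::Rc;

/// Hypothetical stand-in for an XObject; Rc's strong count plays pointer_ref_count_.
struct Object;

/// Hypothetical stand-in for ObjectTableEntry.
struct Entry {
    object: Rc<Object>,
    handle_ref_count: u32,
}

#[derive(Default)]
struct ObjectTable {
    entries: HashMap<u32, Entry>,
    next_handle: u32,
    events: Vec<String>,
}

impl ObjectTable {
    /// AddHandle: the table takes one pointer ref and the slot starts at 1.
    fn add_handle(&mut self, object: &Rc<Object>) -> u32 {
        self.next_handle += 4; // arbitrary numbering, assumption for the sketch
        let handle = self.next_handle;
        let entry = Entry { object: Rc::clone(object), handle_ref_count: 1 };
        self.entries.insert(handle, entry);
        self.events.push(format!("handle.create {handle:#x}"));
        handle
    }

    /// RetainHandle: bump the per-slot count only; pointer refs are untouched.
    fn retain_handle(&mut self, handle: u32) -> bool {
        match self.entries.get_mut(&handle) {
            Some(entry) => {
                entry.handle_ref_count += 1;
                true
            }
            None => false,
        }
    }

    /// ReleaseHandle: drop the per-slot count; at 0 the slot is removed, which
    /// emits handle.destroy and drops the table's pointer ref.
    fn release_handle(&mut self, handle: u32) -> bool {
        let remaining = match self.entries.get_mut(&handle) {
            Some(entry) => {
                entry.handle_ref_count -= 1;
                entry.handle_ref_count
            }
            None => return false,
        };
        if remaining == 0 {
            self.entries.remove(&handle); // dropping the Entry drops its Rc
            self.events.push(format!("handle.destroy {handle:#x}"));
        }
        true
    }

    /// Pointer refs currently held on the object behind `handle`, if any.
    fn pointer_refs(&self, handle: u32) -> Option<usize> {
        self.entries.get(&handle).map(|e| Rc::strong_count(&e.object))
    }
}
```

In this sketch a retained handle survives one release: only the second release removes the slot, and only then does the object's pointer count drop.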

Canary's XThread lifecycle

  • XObject ctor → AddHandle(this) → handle_ref_count = 1, emits handle.create.
  • XThread::Create() (xthread.cc:414) → RetainHandle() → handle_ref_count = 2. Comment: "Always retain when starting - the thread owns itself until exited."
  • User calls NtClose(handle) → ReleaseHandle() → handle_ref_count = 1. Object SURVIVES; no handle.destroy emitted.
  • Thread exits via XThread::Exit() (xthread.cc:524) → ReleaseHandle() → handle_ref_count = 0 → RemoveHandle() → emits handle.destroy + drops pointer ref → object destroyed.

Canary's XAM task lifecycle (xam/xam_task.cc:43-94)

XamTaskSchedule_entry creates an XThread (which adds it to the object table) then calls thread->Create() (xthread.cc:315) which adds the self-ref via RetainHandle(). The handle written to handle_ptr is 12345 (a stub!), not the real thread handle. The actual thread handle lives on the XThread object.

XamTaskCloseHandle_entry calls xboxkrnl::NtClose(obj_handle). If obj_handle were the 12345 stub, NtClose would reject it with X_STATUS_INVALID_HANDLE and the function would return false. Our test data, however, shows it returning 1 (success) on both engines, which indicates that the shim-vs-game handle plumbing produces a valid handle in practice on the main chain (possibly the game passes the actual thread handle).

The crucial behavior: after NtClose, canary's handle_ref_count went 2→1, so no handle.destroy event was emitted. Ours went 1→0 and emitted the extra handle.destroy. Hypothesis confirmed.

Ours's pre-fix state

  • alloc_handle_for → handle_refcount.insert(h, 1).
  • ex_create_thread / xam_task_schedule after spawn → no retain.
  • nt_close / xam_task_close_handle → decrement, destroy on 0.
  • ex_terminate_thread → marks scheduler Exited, wakes joiners, does NOT release the (missing) self-ref.
  • Main thread (install_initial_thread) — refcount=1, never closed.

So ours's spawned threads had handle_refcount = 1 (creator only). Any guest NtClose on a thread handle destroyed it.
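
To make the divergence concrete, here is a minimal sketch that replays create → guest close → thread exit with and without the thread's self-reference. The `Step` enum and `close_then_exit` function are hypothetical illustration, not code from either engine. `Close` and `Exit` are just markers for where the guest NtClose and the thread exit happen, so the position of `Destroy` relative to them is the point.

```rust
/// Markers for the replay; hypothetical names, not schema events.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Step {
    Create,
    Close,
    Exit,
    Destroy,
}

/// Replays create -> guest NtClose -> thread exit for one thread handle.
/// `self_ref = true` models canary (the thread retains its own handle at start);
/// `self_ref = false` models the pre-fix behaviour described above.
fn close_then_exit(self_ref: bool) -> Vec<Step> {
    let mut steps = vec![Step::Create];
    let mut refcount: u32 = 1; // creator's reference
    if self_ref {
        refcount += 1; // self-reference taken when the thread starts
    }

    // Guest closes the handle.
    steps.push(Step::Close);
    refcount -= 1;
    if refcount == 0 {
        steps.push(Step::Destroy);
    }

    // Thread exits; only a thread that took a self-reference has one to drop.
    steps.push(Step::Exit);
    if self_ref {
        refcount -= 1;
        if refcount == 0 {
            steps.push(Step::Destroy);
        }
    }
    steps
}
```

With `self_ref = true` the destroy lands after the exit; with `self_ref = false` it lands immediately after the close, which is the extra early handle.destroy seen in the trace.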

Fix design

Mirror canary precisely:

  1. After successful spawn in ex_create_thread + xam_task_schedule: call state.retain_handle(handle) (refcount 1 → 2).
  2. In ex_terminate_thread (explicit ExTerminateThread) and in the main-loop LR-sentinel implicit-exit path (main.rs): call state.release_handle(handle) after the scheduler exit_current bookkeeping.
  3. Main thread (install_initial_thread): symmetric retain (canary's main also goes through Create()::RetainHandle()). Released at the LR-sentinel path on main thread shutdown.

New helpers in state.rs:

  • KernelState::retain_handle(handle) -> u32 — saturating increment; returns new refcount.
  • KernelState::release_handle(handle) -> bool — saturating decrement; on hitting zero: removes object, scrubs async_file_handles + disarm_timer, emits handle.destroy, returns true. False if other refs remain.
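
A minimal sketch of the two helpers' semantics, assuming a plain map from handle to refcount. `SketchState`, its fields, and `insert_handle` are hypothetical and do not reflect the real KernelState layout. The async_file_handles and disarm_timer scrubbing is reduced to a comment, and the behaviour for an unknown handle (retain returns 0, release returns false) is an assumption of the sketch.

```rust
use std::collections::HashMap;

/// Hypothetical miniature of the kernel state; not the real KernelState.
#[derive(Default)]
struct SketchState {
    handle_refcount: HashMap<u32, u32>,
    objects: HashMap<u32, &'static str>,
    events: Vec<String>,
}

impl SketchState {
    /// Hypothetical allocation helper: every new handle starts at refcount 1.
    fn insert_handle(&mut self, handle: u32, object: &'static str) {
        self.objects.insert(handle, object);
        self.handle_refcount.insert(handle, 1);
    }

    /// Saturating increment; returns the new refcount (0 for an unknown
    /// handle, which is an assumption of this sketch).
    fn retain_handle(&mut self, handle: u32) -> u32 {
        match self.handle_refcount.get_mut(&handle) {
            Some(count) => {
                *count = (*count).saturating_add(1);
                *count
            }
            None => 0,
        }
    }

    /// Saturating decrement; returns true only when this call destroyed the handle.
    fn release_handle(&mut self, handle: u32) -> bool {
        let remaining = match self.handle_refcount.get_mut(&handle) {
            Some(count) => {
                *count = (*count).saturating_sub(1);
                *count
            }
            None => return false, // assumption: unknown handles are a no-op
        };
        if remaining > 0 {
            return false; // other references remain
        }
        self.handle_refcount.remove(&handle);
        self.objects.remove(&handle);
        // The real helper would also scrub async_file_handles and disarm timers here.
        self.events.push(format!("handle.destroy {handle:#x}"));
        true
    }
}
```

Returning `bool` from the release lets callers such as a close path and an exit path share one teardown routine while still telling which of them performed the final release.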

The implicit-exit path in main.rs also gained the missing thread.exit schema event (previously only ex_terminate_thread emitted it; canary's XThread::Exit covers both explicit and implicit paths, so this is a symmetry fix even though it didn't cause the C+16 divergence directly).

Code summary

~100 LOC across 4 files; purely additive, no refactor:

  • crates/xenia-kernel/src/state.rs — retain_handle + release_handle helpers. +50 LOC.
  • crates/xenia-kernel/src/exports.rs — retain in ex_create_thread, release in ex_terminate_thread. +20 LOC.
  • crates/xenia-kernel/src/xam.rs — retain in xam_task_schedule. +10 LOC.
  • crates/xenia-app/src/main.rs — implicit-exit path: emit thread.exit, release self-ref; install_initial_thread post-call retain. +20 LOC.

Tests: +5 new (181 → 186 total), plus 1 existing test extended.

  • xam_task_schedule_close_then_thread_exit_destroys_handle — refcount lifecycle balance (close-first).
  • xam_task_thread_exit_then_close_destroys_handle — refcount lifecycle balance (exit-first).
  • xam_task_schedule_then_close_round_trip_returns_one — extended with refcount asserts (post-spawn=2, post-close=1).
  • ex_create_thread_installs_self_reference — verifies refcount=2 after spawn.
  • ex_terminate_thread_releases_self_reference — verifies refcount=1 after terminate.
  • ex_create_then_close_then_exit_balances_refcount — end-to-end three-step lifecycle.