# Phase C+16 Investigation — XamTaskCloseHandle refcount (2026-05-14)
## Framing verification (reading-error #28 discipline)

C+15-α's catalog D-1 hypothesis was: "canary's spawned thread keeps an
additional ref on the thread handle (`object->Retain()` in `XThread::Create`
line 408 via `RetainHandle()`)". Verified against canary source.

### Canary's refcount model (xobject.cc + util/object_table.cc)

Two separate refcounts on `XObject`:

1. **`pointer_ref_count_`**: the C++ object pointer refcount.
   - Bumped by `XObject::Retain()` / dropped by `XObject::Release()`.
   - `AddHandle()` calls `object->Retain()` once when inserting into table.
2. **`handle_ref_count`** in `ObjectTableEntry`: the per-handle (per-slot)
   refcount that determines when the object is removed from the object
   table.
   - Initialized to 1 in `AddHandle()` (object_table.cc:164).
   - Bumped by `RetainHandle()` (object_table.cc:218-228 → `entry->handle_ref_count++`).
   - Decremented by `ReleaseHandle()` (object_table.cc:230-249).
   - On reaching 0, calls `RemoveHandle()` which emits `handle.destroy`
     and releases the pointer ref (`object->Release()`).

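The two counters described above can be sketched as a small Rust model. This is an illustration only, not canary's code: the `ObjectTable`, `Entry`, `events`, and `freed` names are hypothetical, the event strings carry the handle number purely for readability, and only the table's own pointer ref is modeled.

```rust
use std::collections::HashMap;

/// Simplified stand-in for one `ObjectTableEntry` plus the object it points at.
struct Entry {
    /// Per-slot count; decides when the handle leaves the table.
    handle_ref_count: u32,
    /// Pointer refs held on the object (only the table's own ref is modeled).
    pointer_ref_count: u32,
}

#[derive(Default)]
struct ObjectTable {
    entries: HashMap<u32, Entry>,
    next_handle: u32,
    /// Emitted event names, in order.
    events: Vec<String>,
    /// Handles whose object lost its last modeled pointer ref.
    freed: Vec<u32>,
}

impl ObjectTable {
    /// Insert a new object: the table takes one pointer ref and the slot starts at 1.
    fn add_handle(&mut self) -> u32 {
        self.next_handle += 1;
        let handle = self.next_handle;
        self.entries.insert(handle, Entry { handle_ref_count: 1, pointer_ref_count: 1 });
        self.events.push(format!("handle.create {handle}"));
        handle
    }

    /// Current per-slot count, or `None` once the handle is gone.
    fn handle_ref_count(&self, handle: u32) -> Option<u32> {
        self.entries.get(&handle).map(|e| e.handle_ref_count)
    }

    /// Bump the per-slot count. Returns false for an unknown handle.
    fn retain_handle(&mut self, handle: u32) -> bool {
        match self.entries.get_mut(&handle) {
            Some(entry) => {
                entry.handle_ref_count += 1;
                true
            }
            None => false,
        }
    }

    /// Drop one per-slot ref. At zero the slot is removed, `handle.destroy`
    /// is recorded, and the table's pointer ref is released.
    fn release_handle(&mut self, handle: u32) -> bool {
        let Some(entry) = self.entries.get_mut(&handle) else {
            return false;
        };
        entry.handle_ref_count -= 1;
        if entry.handle_ref_count > 0 {
            return true;
        }
        let mut removed = self.entries.remove(&handle).expect("entry was just found");
        self.events.push(format!("handle.destroy {handle}"));
        removed.pointer_ref_count -= 1;
        if removed.pointer_ref_count == 0 {
            self.freed.push(handle);
        }
        true
    }
}
```

In this model a handle that has been retained once survives the first `release_handle` call without any `handle.destroy` event; only the second release removes the slot and frees the object.
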
### Canary's XThread lifecycle

- `XObject` ctor → `AddHandle(this)` → `handle_ref_count = 1`,
  emits `handle.create`.
- `XThread::Create()` (xthread.cc:414) → `RetainHandle()` →
  `handle_ref_count = 2`. Comment: *"Always retain when starting - the
  thread owns itself until exited."*
- User calls `NtClose(handle)` → `ReleaseHandle()` → `handle_ref_count = 1`.
  Object SURVIVES; no `handle.destroy` emitted.
- Thread exits via `XThread::Exit()` (xthread.cc:524) → `ReleaseHandle()`
  → `handle_ref_count = 0` → `RemoveHandle()` → emits `handle.destroy`
  and drops the pointer ref → object destroyed.

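The lifecycle above can be replayed with a tiny counter simulation. This is a minimal sketch under stated assumptions: `Step` and `replay` are hypothetical names, and the `self_ref` flag stands for whether the thread took the extra `RetainHandle()` at start (and therefore releases it on exit).

```rust
/// One guest-visible operation on a thread handle.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Step {
    /// `NtClose` on the handle: always releases one ref.
    Close,
    /// Thread exit: releases a ref only if a self-ref was taken at start.
    Exit,
}

/// Replay `steps` against a single thread handle.
///
/// Returns the refcount after each step and the index of the step at which
/// `handle.destroy` would be emitted (if any).
fn replay(self_ref: bool, steps: &[Step]) -> (Vec<u32>, Option<usize>) {
    // Creator ref, plus the self-ref when the thread "owns itself".
    let mut count: u32 = if self_ref { 2 } else { 1 };
    let mut trace = Vec::new();
    let mut destroyed_at = None;

    for (i, step) in steps.iter().enumerate() {
        let releases = match step {
            Step::Close => true,
            Step::Exit => self_ref,
        };
        if releases && count > 0 {
            count -= 1;
            if count == 0 && destroyed_at.is_none() {
                destroyed_at = Some(i);
            }
        }
        trace.push(count);
    }
    (trace, destroyed_at)
}
```

With `self_ref = true`, both `[Close, Exit]` and `[Exit, Close]` yield the trace `[1, 0]` with the destroy at the second step. With `self_ref = false`, `[Close, Exit]` yields `[0, 0]` with the destroy already at the `Close`, which is the early `handle.destroy` this investigation is about.
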
### Canary's XAM task lifecycle (xam/xam_task.cc:43-94)

`XamTaskSchedule_entry` creates an `XThread` (which adds it to the
object table) then calls `thread->Create()` (xthread.cc:315) which adds
the self-ref via `RetainHandle()`. The handle written to `handle_ptr`
is `12345` (a stub!), not the real thread handle. The actual thread
handle lives on the `XThread` object.

`XamTaskCloseHandle_entry` calls `xboxkrnl::NtClose(obj_handle)`. If
`obj_handle` were the stub `12345`, `NtClose` on that invalid handle would
return `X_STATUS_INVALID_HANDLE` and the function would return false. But
our test data shows it returns 1 (success) on both engines, indicating
that the SHIM-VS-GAME handle plumbing produces a valid handle in practice
on the main chain. (Possibly the game passes the actual thread handle.)

The crucial behavior: after `NtClose`, canary's refcount went 2→1,
so no `handle.destroy` event was emitted. Our refcount went 1→0, emitting
the extra `handle.destroy`. **Hypothesis confirmed.**

## Our pre-fix state

- `alloc_handle_for` → `handle_refcount.insert(h, 1)`.
- `ex_create_thread` / `xam_task_schedule` after spawn → no retain.
- `nt_close` / `xam_task_close_handle` → decrement, destroy on 0.
- `ex_terminate_thread` → marks scheduler Exited, wakes joiners, does
  NOT release the (missing) self-ref.
- Main thread (`install_initial_thread`): refcount=1, never closed.

So our spawned threads had `handle_refcount = 1` (creator only). Any
guest `NtClose` on a thread handle destroyed it.

## Fix design

Mirror canary precisely:

1. After successful spawn in `ex_create_thread` + `xam_task_schedule`:
   call `state.retain_handle(handle)` (refcount 1 → 2).
2. In `ex_terminate_thread` (explicit `ExTerminateThread`) and in the
   main-loop LR-sentinel implicit-exit path (`main.rs`): call
   `state.release_handle(handle)` after the scheduler `exit_current`
   bookkeeping.
3. Main thread (`install_initial_thread`): symmetric retain (canary's
   main also goes through `Create()` → `RetainHandle()`). Released at the
   LR-sentinel path on main thread shutdown.

New helpers in `state.rs`:

- `KernelState::retain_handle(handle) -> u32`: saturating increment;
  returns new refcount.
- `KernelState::release_handle(handle) -> bool`: saturating decrement;
  on hitting zero it removes the object, scrubs `async_file_handles` +
  `disarm_timer`, emits `handle.destroy`, and returns true. False if other
  refs remain.

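A minimal sketch of how the two helpers could look, assuming a `KernelState` that keeps refcounts in a `handle_refcount` map. Everything beyond the two method names and their return types is an assumption made for illustration: the field types, the `objects` and `armed_timers` stand-ins (the latter modeling the `disarm_timer` step), the simplified `alloc_handle_for` signature, the event string format, and the choice to return `0` / `false` for an unknown handle.

```rust
use std::collections::{HashMap, HashSet};

#[derive(Default)]
struct KernelState {
    handle_refcount: HashMap<u32, u32>,
    objects: HashMap<u32, String>,    // hypothetical object storage
    async_file_handles: HashSet<u32>, // hypothetical container type
    armed_timers: HashSet<u32>,       // hypothetical stand-in for timer state
    events: Vec<String>,              // recorded schema events
}

impl KernelState {
    /// Register a handle with the creator's single ref (signature simplified).
    fn alloc_handle_for(&mut self, handle: u32, name: &str) {
        self.objects.insert(handle, name.to_string());
        self.handle_refcount.insert(handle, 1);
    }

    /// Saturating increment; returns the new refcount (0 if the handle is unknown).
    fn retain_handle(&mut self, handle: u32) -> u32 {
        match self.handle_refcount.get_mut(&handle) {
            Some(count) => {
                *count = (*count).saturating_add(1);
                *count
            }
            None => 0,
        }
    }

    /// Saturating decrement; returns true only when this call destroyed the handle.
    fn release_handle(&mut self, handle: u32) -> bool {
        let Some(count) = self.handle_refcount.get_mut(&handle) else {
            return false;
        };
        *count = (*count).saturating_sub(1);
        if *count > 0 {
            return false; // other refs remain
        }
        // Last ref gone: remove the object and scrub side tables.
        self.handle_refcount.remove(&handle);
        self.objects.remove(&handle);
        self.async_file_handles.remove(&handle);
        self.armed_timers.remove(&handle);
        self.events.push(format!("handle.destroy {handle}"));
        true
    }
}
```

Under these assumptions, a spawned thread would go through `alloc_handle_for` (refcount 1), then `retain_handle` (refcount 2); a guest close then returns `false` from `release_handle`, and the later exit-path release returns `true` and records the single `handle.destroy`.
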
The implicit-exit path in `main.rs` also gained the missing
`thread.exit` schema event (previously only `ex_terminate_thread`
emitted it; canary's `XThread::Exit` covers both explicit and implicit
paths, so this is a symmetry fix even though it didn't cause the C+16
divergence directly).

## Code summary

~75 LOC across 4 files; purely additive, no refactor:

- `crates/xenia-kernel/src/state.rs`: `retain_handle` + `release_handle`
  helpers. +50 LOC.
- `crates/xenia-kernel/src/exports.rs`: retain in `ex_create_thread`,
  release in `ex_terminate_thread`. +20 LOC.
- `crates/xenia-kernel/src/xam.rs`: retain in `xam_task_schedule`.
  +10 LOC.
- `crates/xenia-app/src/main.rs`: implicit-exit path: emit `thread.exit`,
  release self-ref; `install_initial_thread` post-call retain. +20 LOC.

Tests: +5 (181 → 186 total).

- `xam_task_schedule_close_then_thread_exit_destroys_handle`:
  refcount lifecycle balance (close-first).
- `xam_task_thread_exit_then_close_destroys_handle`:
  refcount lifecycle balance (exit-first).
- `xam_task_schedule_then_close_round_trip_returns_one`: extended
  with refcount asserts (post-spawn=2, post-close=1).
- `ex_create_thread_installs_self_reference`: verifies refcount=2
  after spawn.
- `ex_terminate_thread_releases_self_reference`: verifies refcount=1
  after terminate.
- `ex_create_then_close_then_exit_balances_refcount`: end-to-end
  three-step lifecycle.