handoff: VSync/event-wedge fixes + iterate 2.A–2.BC research notes
Source changes (dormant parity infra, retained from iterate 2.AI/2.AO):
- xenia-kernel/exports.rs: nt_create_event manual_reset polarity + related event wiring
- xenia-gpu/mmio_region.rs: D1MODE_VBLANK_VLINE_STATUS hardcode parity

Also lands the audit-runs/ analysis notes (.md/.txt/.json digests) for the
iterate 2.x VSync/0x10e8/0x1004 wedge investigation. Raw trace dumps
(.jsonl/.gz/.csv/.stdout) and agent worktrees (.claude/) are gitignored as
regenerable local artifacts; see memory + HANDOFF for the running findings.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@@ -0,0 +1,124 @@
# Phase C+16 Investigation — XamTaskCloseHandle refcount (2026-05-14)

## Framing verification (reading-error #28 discipline)

C+15-α's catalog D-1 hypothesis was: "canary's spawned thread keeps an
additional ref on the thread handle (`object->Retain()` in `XThread::Create`
line 408 via `RetainHandle()`)". Verified against canary source.

### Canary's refcount model (xobject.cc + util/object_table.cc)

Two separate refcounts on `XObject`:

1. **`pointer_ref_count_`**: the C++ object pointer refcount.
   - Bumped by `XObject::Retain()` / dropped by `XObject::Release()`.
   - `AddHandle()` calls `object->Retain()` once when inserting into the table.
2. **`handle_ref_count`** in `ObjectTableEntry`: the per-handle (per-slot)
   refcount that determines when the object is removed from the object
   table.
   - Initialized to 1 in `AddHandle()` (object_table.cc:164).
   - Bumped by `RetainHandle()` (object_table.cc:218-228 → `entry->handle_ref_count++`).
   - Decremented by `ReleaseHandle()` (object_table.cc:230-249).
   - On reaching 0, calls `RemoveHandle()`, which emits `handle.destroy`
     and releases the pointer ref (`object->Release()`).

### Canary's XThread lifecycle

- `XObject` ctor → `AddHandle(this)` → `handle_ref_count = 1`,
  emits `handle.create`.
- `XThread::Create()` (xthread.cc:414) → `RetainHandle()` →
  `handle_ref_count = 2`. Comment: *"Always retain when starting - the
  thread owns itself until exited."*
- User calls `NtClose(handle)` → `ReleaseHandle()` → `handle_ref_count = 1`.
  Object SURVIVES; no `handle.destroy` emitted.
- Thread exits via `XThread::Exit()` (xthread.cc:524) → `ReleaseHandle()`
  → `handle_ref_count = 0` → `RemoveHandle()` → emits `handle.destroy`
  and drops the pointer ref → object destroyed.
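
The refcount model and the four lifecycle steps above can be replayed in a small, self-contained Rust sketch. It is a toy model, not canary's code: `Object`, `Entry`, `ObjectTable`, and `thread_lifecycle` are hypothetical names, only the pointer ref taken by the table is modeled, and the event log is a plain vector of strings.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for an `XObject`; only the table's pointer ref is modeled.
#[derive(Debug, Default)]
struct Object {
    pointer_ref_count: u32,
}

/// Hypothetical stand-in for `ObjectTableEntry`.
#[derive(Debug)]
struct Entry {
    object: Object,
    handle_ref_count: u32,
}

/// Hypothetical stand-in for the object table plus an in-memory event log.
#[derive(Debug, Default)]
struct ObjectTable {
    entries: HashMap<u32, Entry>,
    next_handle: u32,
    events: Vec<&'static str>,
}

impl ObjectTable {
    /// Models `AddHandle()`: retain the object once, start the slot at 1.
    fn add_handle(&mut self, mut object: Object) -> u32 {
        object.pointer_ref_count += 1;
        self.next_handle += 1;
        let handle = self.next_handle;
        self.entries.insert(handle, Entry { object, handle_ref_count: 1 });
        self.events.push("handle.create");
        handle
    }

    /// Models `RetainHandle()`; returns false for an unknown handle.
    fn retain_handle(&mut self, handle: u32) -> bool {
        match self.entries.get_mut(&handle) {
            Some(entry) => {
                entry.handle_ref_count += 1;
                true
            }
            None => false,
        }
    }

    /// Models `ReleaseHandle()`; at zero it performs the `RemoveHandle()` step.
    fn release_handle(&mut self, handle: u32) -> bool {
        let Some(entry) = self.entries.get_mut(&handle) else {
            return false;
        };
        entry.handle_ref_count -= 1;
        if entry.handle_ref_count == 0 {
            let mut removed = self.entries.remove(&handle).expect("entry exists");
            removed.object.pointer_ref_count -= 1; // the table's pointer ref is dropped
            self.events.push("handle.destroy");
        }
        true
    }

    /// Current per-handle refcount, or `None` once the slot is gone.
    fn handle_ref_count(&self, handle: u32) -> Option<u32> {
        self.entries.get(&handle).map(|e| e.handle_ref_count)
    }
}

/// Replays ctor, `Create`, `NtClose`, `Exit` and records the count after each step.
fn thread_lifecycle() -> (Vec<Option<u32>>, Vec<&'static str>) {
    let mut table = ObjectTable::default();
    let mut counts = Vec::new();
    let handle = table.add_handle(Object::default()); // XObject ctor
    counts.push(table.handle_ref_count(handle));
    table.retain_handle(handle); // XThread::Create self-ref
    counts.push(table.handle_ref_count(handle));
    table.release_handle(handle); // guest NtClose
    counts.push(table.handle_ref_count(handle));
    table.release_handle(handle); // XThread::Exit
    counts.push(table.handle_ref_count(handle));
    (counts, table.events)
}
```

In this model the recorded counts are 1, 2, 1, and then no entry at all, and the only events are one `handle.create` and one `handle.destroy`, with the destroy coming from the exit step rather than from `NtClose`.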

### Canary's XAM task lifecycle (xam/xam_task.cc:43-94)

`XamTaskSchedule_entry` creates an `XThread` (which adds it to the
object table), then calls `thread->Create()` (xthread.cc:315), which adds
the self-ref via `RetainHandle()`. The handle written to `handle_ptr`
is `12345` (a stub!), not the real thread handle. The actual thread
handle lives on the `XThread` object.

`XamTaskCloseHandle_entry` calls `xboxkrnl::NtClose(obj_handle)`. With the
stub value `obj_handle=12345`, `NtClose` would see an invalid handle and
return `X_STATUS_INVALID_HANDLE`, and the function would return false. But
our test data shows it returns 1 (success) on both engines, indicating that
the shim-vs-game handle plumbing produces a valid handle in practice on
the main chain. (Possibly the game passes the actual thread handle.)

The crucial behavior is: after `NtClose`, canary's refcount went 2→1,
so no `handle.destroy` event was emitted. Our refcount went 1→0, emitting
the extra `handle.destroy`. **Hypothesis confirmed.**

## Our pre-fix state

- `alloc_handle_for` → `handle_refcount.insert(h, 1)`.
- `ex_create_thread` / `xam_task_schedule` after spawn → no retain.
- `nt_close` / `xam_task_close_handle` → decrement, destroy on 0.
- `ex_terminate_thread` → marks scheduler Exited, wakes joiners, does
  NOT release the (missing) self-ref.
- Main thread (`install_initial_thread`): refcount=1, never closed.

So our spawned threads had `handle_refcount = 1` (creator only). Any
guest `NtClose` on a thread handle destroyed it.
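
To make the divergence concrete, the sketch below reduces both engines to a single counter and reports which operation triggers `handle.destroy`. It is a simplified model under stated assumptions: `Op` and `destroy_index` are hypothetical names, and the pre-fix engine is modeled as "creator ref only, thread exit releases nothing", as described in the list above.

```rust
/// Operations that can touch a thread handle's per-handle refcount.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Op {
    NtClose,
    ThreadExit,
}

/// Returns the index of the op that emits `handle.destroy`, if any.
///
/// `self_ref = true` models canary (creator ref plus thread self-ref);
/// `self_ref = false` models the pre-fix state (creator ref only).
fn destroy_index(self_ref: bool, ops: &[Op]) -> Option<usize> {
    let mut refcount: u32 = if self_ref { 2 } else { 1 };
    for (i, op) in ops.iter().enumerate() {
        let releases = match op {
            Op::NtClose => true,
            // Without a self-ref there is nothing for thread exit to release.
            Op::ThreadExit => self_ref,
        };
        if releases && refcount > 0 {
            refcount -= 1;
            if refcount == 0 {
                return Some(i);
            }
        }
    }
    None
}
```

For the close-then-exit order, `destroy_index(true, ..)` points at the exit while `destroy_index(false, ..)` points at the close, which is the early `handle.destroy` seen in the trace. For exit-then-close, both variants of this model destroy on the close, so in this model only the close-first ordering exposes the difference.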

## Fix design

Mirror canary precisely:

1. After successful spawn in `ex_create_thread` + `xam_task_schedule`:
   call `state.retain_handle(handle)` (refcount 1 → 2).
2. In `ex_terminate_thread` (explicit `ExTerminateThread`) and in the
   main-loop LR-sentinel implicit-exit path (`main.rs`): call
   `state.release_handle(handle)` after the scheduler `exit_current`
   bookkeeping.
3. Main thread (`install_initial_thread`): symmetric retain (canary's
   main also goes through `Create()` → `RetainHandle()`). Released at the
   LR-sentinel path on main thread shutdown.

New helpers in `state.rs`:

- `KernelState::retain_handle(handle) -> u32`: saturating increment;
  returns the new refcount.
- `KernelState::release_handle(handle) -> bool`: saturating decrement;
  on hitting zero it removes the object, scrubs `async_file_handles` +
  `disarm_timer`, emits `handle.destroy`, and returns true. Returns false
  if other refs remain.

The implicit-exit path in `main.rs` also gained the missing
`thread.exit` schema event. Previously only `ex_terminate_thread`
emitted it; canary's `XThread::Exit` covers both explicit and implicit
paths, so this is a symmetry fix even though it didn't cause the C+16
divergence directly.
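
The two helpers described above might look roughly like the following. This is a minimal sketch, not the code in `state.rs`: the `KernelState` here is a trimmed-down hypothetical with guessed field types, the event sink is a plain `Vec<String>`, the timer disarm step is left out, and returning 0 from `retain_handle` for an unknown handle is an assumption.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical, trimmed-down `KernelState`; field types are assumptions.
#[derive(Default)]
struct KernelState {
    handle_refcount: HashMap<u32, u32>,
    objects: HashMap<u32, String>,    // stand-in for the real object map
    async_file_handles: HashSet<u32>, // container type is a guess
    events: Vec<String>,              // stand-in for the schema event sink
}

impl KernelState {
    /// Saturating increment; returns the new refcount.
    /// Assumption: an unknown handle yields 0 and is not created.
    fn retain_handle(&mut self, handle: u32) -> u32 {
        match self.handle_refcount.get_mut(&handle) {
            Some(count) => {
                *count = count.saturating_add(1);
                *count
            }
            None => 0,
        }
    }

    /// Saturating decrement; returns true only if this call destroyed the handle.
    fn release_handle(&mut self, handle: u32) -> bool {
        let Some(count) = self.handle_refcount.get_mut(&handle) else {
            return false;
        };
        *count = count.saturating_sub(1);
        if *count > 0 {
            return false; // other refs remain
        }
        self.handle_refcount.remove(&handle);
        self.objects.remove(&handle);
        self.async_file_handles.remove(&handle);
        // The note above also mentions a timer disarm step; it is not modeled here.
        self.events.push(format!("handle.destroy {handle:#x}"));
        true
    }
}
```

With a handle seeded at refcount 1 (the `alloc_handle_for` state), a post-spawn `retain_handle` returns 2, the first `release_handle` returns false, and the second returns true and logs a single `handle.destroy`, regardless of whether the close or the thread exit comes first.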

## Code summary

~75 LOC across 4 files; purely additive, no refactor (the per-file
figures below sum to 100, so treat these counts as rough):

- `crates/xenia-kernel/src/state.rs`: `retain_handle` + `release_handle`
  helpers. +50 LOC.
- `crates/xenia-kernel/src/exports.rs`: retain in `ex_create_thread`,
  release in `ex_terminate_thread`. +20 LOC.
- `crates/xenia-kernel/src/xam.rs`: retain in `xam_task_schedule`.
  +10 LOC.
- `crates/xenia-app/src/main.rs`: implicit-exit path emits `thread.exit` and
  releases the self-ref; `install_initial_thread` post-call retain. +20 LOC.

Tests: +5 new, 1 extended (181 → 186 total).

- `xam_task_schedule_close_then_thread_exit_destroys_handle`:
  refcount lifecycle balance (close-first).
- `xam_task_thread_exit_then_close_destroys_handle`:
  refcount lifecycle balance (exit-first).
- `xam_task_schedule_then_close_round_trip_returns_one`: extended
  with refcount asserts (post-spawn=2, post-close=1).
- `ex_create_thread_installs_self_reference`: verifies refcount=2
  after spawn.
- `ex_terminate_thread_releases_self_reference`: verifies refcount=1
  after terminate.
- `ex_create_then_close_then_exit_balances_refcount`: end-to-end
  three-step lifecycle.