handoff: VSync/event-wedge fixes + iterate 2.A–2.BC research notes

Source changes (dormant parity infra, retained from iterate 2.AI/2.AO):
- xenia-kernel/exports.rs: nt_create_event manual_reset polarity +
  related event wiring
- xenia-gpu/mmio_region.rs: D1MODE_VBLANK_VLINE_STATUS hardcode parity

Also lands the audit-runs/ analysis notes (.md/.txt/.json digests) for the
iterate 2.x VSync/0x10e8/0x1004 wedge investigation. Raw trace dumps
(.jsonl/.gz/.csv/.stdout) and agent worktrees (.claude/) are gitignored as
regenerable local artifacts — see memory + HANDOFF for the running findings.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
MechaCat02
2026-06-05 07:19:08 +02:00
parent acd1656753
commit ef93a4fa14
620 changed files with 108303 additions and 1 deletions

@@ -0,0 +1,124 @@
# Phase C+16 Investigation — XamTaskCloseHandle refcount (2026-05-14)
## Framing verification (reading-error #28 discipline)
C+15-α's catalog D-1 hypothesis was: "canary's spawned thread keeps an
additional ref on the thread handle (`object->Retain()` in `XThread::Create`
line 408 via `RetainHandle()`)". Verified against canary source.
### Canary's refcount model (xobject.cc + util/object_table.cc)
Two separate refcounts on `XObject`:
1. **`pointer_ref_count_`** — the C++ object pointer refcount.
   - Bumped by `XObject::Retain()` / dropped by `XObject::Release()`.
   - `AddHandle()` calls `object->Retain()` once when inserting into the table.
2. **`handle_ref_count`** in `ObjectTableEntry` — the per-handle (per-slot)
   refcount that determines when the object is removed from the object
   table.
   - Initialized to 1 in `AddHandle()` (object_table.cc:164).
   - Bumped by `RetainHandle()` (object_table.cc:218-228 → `entry->handle_ref_count++`).
   - Decremented by `ReleaseHandle()` (object_table.cc:230-249).
   - On reaching 0, calls `RemoveHandle()`, which emits `handle.destroy`
     and releases the pointer ref (`object->Release()`).
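
The two-counter model above can be sketched in Rust roughly as follows. This is a minimal illustration, not canary's code: `Object`, `Entry`, `ObjectTable`, the `events` log and the handle numbering are all hypothetical names chosen for the sketch, and only the counting rules are taken from the notes above.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for `XObject`; only the pointer refcount is modeled.
#[derive(Debug)]
struct Object {
    pointer_ref_count: u32,
}

/// Hypothetical stand-in for `ObjectTableEntry`.
#[derive(Debug)]
struct Entry {
    object_id: usize,
    handle_ref_count: u32,
}

#[derive(Debug, Default)]
struct ObjectTable {
    objects: Vec<Object>,
    entries: HashMap<u32, Entry>,
    next_handle: u32,
    events: Vec<String>,
}

impl ObjectTable {
    /// Inserts a new object: the table takes one pointer ref and the
    /// per-handle refcount starts at 1.
    fn add_handle(&mut self) -> u32 {
        let object_id = self.objects.len();
        self.objects.push(Object { pointer_ref_count: 1 });
        self.next_handle += 1;
        let handle = self.next_handle;
        self.entries.insert(handle, Entry { object_id, handle_ref_count: 1 });
        self.events.push(format!("handle.create {handle}"));
        handle
    }

    /// Bumps the per-handle refcount; returns false for an unknown handle.
    fn retain_handle(&mut self, handle: u32) -> bool {
        match self.entries.get_mut(&handle) {
            Some(entry) => {
                entry.handle_ref_count += 1;
                true
            }
            None => false,
        }
    }

    /// Drops one per-handle ref; removes the entry when the count hits 0.
    /// Returns false for an unknown (or already removed) handle.
    fn release_handle(&mut self, handle: u32) -> bool {
        let Some(entry) = self.entries.get_mut(&handle) else {
            return false;
        };
        entry.handle_ref_count -= 1;
        if entry.handle_ref_count == 0 {
            self.remove_handle(handle);
        }
        true
    }

    /// Removes the table slot, logs `handle.destroy`, and releases the
    /// pointer ref that the table was holding.
    fn remove_handle(&mut self, handle: u32) {
        if let Some(entry) = self.entries.remove(&handle) {
            self.events.push(format!("handle.destroy {handle}"));
            self.objects[entry.object_id].pointer_ref_count -= 1;
        }
    }
}
```

In this sketch the pointer refcount only changes when the table slot is created or removed, while the per-handle count is what decides when `handle.destroy` is logged.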
### Canary's XThread lifecycle
- `XObject` ctor → `AddHandle(this)` → `handle_ref_count = 1`,
  emits `handle.create`.
- `XThread::Create()` (xthread.cc:414) → `RetainHandle()` →
  `handle_ref_count = 2`. Comment: *"Always retain when starting - the
  thread owns itself until exited."*
- User calls `NtClose(handle)` → `ReleaseHandle()` → `handle_ref_count = 1`.
  Object SURVIVES; no `handle.destroy` emitted.
- Thread exits via `XThread::Exit()` (xthread.cc:524) → `ReleaseHandle()` →
  `handle_ref_count = 0` → `RemoveHandle()` → emits `handle.destroy`
  and drops the pointer ref → object destroyed.
### Canary's XAM task lifecycle (xam/xam_task.cc:43-94)
`XamTaskSchedule_entry` creates an `XThread` (which adds it to the
object table) then calls `thread->Create()` (xthread.cc:315) which adds
the self-ref via `RetainHandle()`. The handle written to `handle_ptr`
is `12345` (a stub!), not the real thread handle. The actual thread
handle lives on the `XThread` object.
`XamTaskCloseHandle_entry` calls `xboxkrnl::NtClose(obj_handle)`. If
`obj_handle` were the `12345` stub, `NtClose` would see an invalid handle,
return `X_STATUS_INVALID_HANDLE`, and the function would return false.
Our test data instead shows a return value of 1 (success) on both
engines, which indicates that the shim-vs-game handle plumbing produces a
valid handle in practice on the main chain. (Possibly the game passes the
actual thread handle.)
The crucial behavior is: after `NtClose`, canary's refcount went 2→1,
so it emitted no `handle.destroy` event. Our refcount went 1→0, which
emitted the extra `handle.destroy`. **Hypothesis confirmed.**
## Ours's pre-fix state
- `alloc_handle_for` → `handle_refcount.insert(h, 1)`.
- `ex_create_thread` / `xam_task_schedule` after spawn → no retain.
- `nt_close` / `xam_task_close_handle` → decrement, destroy on 0.
- `ex_terminate_thread` → marks scheduler Exited, wakes joiners, does
NOT release the (missing) self-ref.
- Main thread (`install_initial_thread`) — refcount=1, never closed.
So our spawned threads had `handle_refcount = 1` (creator only), and any
guest `NtClose` on a thread handle destroyed it.
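
To make the difference concrete, the following sketch replays close and exit steps against a single thread handle, with and without the self-reference. It is a toy model under stated assumptions: `Step`, `replay` and `destroyed_at` are hypothetical names, and the pre-fix case treats thread exit as releasing nothing, matching the list above.

```rust
/// The two operations that can release a thread handle reference
/// (hypothetical names for this sketch).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Step {
    /// Guest `NtClose` / `XamTaskCloseHandle` on the thread handle.
    Close,
    /// Thread exit, which releases the self-reference if one was taken.
    Exit,
}

/// Replays `steps` against one thread handle and returns the refcount
/// after each step. `self_ref` selects whether spawn retained the handle
/// (canary's model, starting at 2) or not (pre-fix model, starting at 1).
fn replay(self_ref: bool, steps: &[Step]) -> Vec<u32> {
    let mut refcount: u32 = if self_ref { 2 } else { 1 };
    let mut trace = Vec::new();
    for step in steps {
        let releases = match step {
            Step::Close => true,
            // Without a self-reference, exit has nothing to release.
            Step::Exit => self_ref,
        };
        if releases {
            refcount = refcount.saturating_sub(1);
        }
        trace.push(refcount);
    }
    trace
}

/// Index of the step at which the handle is destroyed, i.e. where the
/// refcount first reaches 0.
fn destroyed_at(trace: &[u32]) -> Option<usize> {
    trace.iter().position(|&count| count == 0)
}
```

With the self-reference, `replay(true, &[Step::Close, Step::Exit])` and `replay(true, &[Step::Exit, Step::Close])` both reach 0 only at the second step. Without it, `replay(false, &[Step::Close, Step::Exit])` reaches 0 at the first step, which corresponds to the extra `handle.destroy`.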
## Fix design
Mirror canary precisely:
1. After successful spawn in `ex_create_thread` + `xam_task_schedule`:
call `state.retain_handle(handle)` (refcount 1 → 2).
2. In `ex_terminate_thread` (explicit `ExTerminateThread`) and in the
main-loop LR-sentinel implicit-exit path (`main.rs`): call
`state.release_handle(handle)` after the scheduler `exit_current`
bookkeeping.
3. Main thread (`install_initial_thread`): symmetric retain (canary's
main also goes through `Create()::RetainHandle()`). Released at the
LR-sentinel path on main thread shutdown.
New helpers in `state.rs`:
- `KernelState::retain_handle(handle) -> u32` — saturating increment;
returns new refcount.
- `KernelState::release_handle(handle) -> bool` — saturating decrement;
  on hitting zero: removes object, scrubs `async_file_handles` +
  `disarm_timer`, emits `handle.destroy`, returns true. Returns false if
  other refs remain.
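
A rough sketch of what these two helpers might look like is shown below. It is not the code in `state.rs`: the `KernelState` here is a trimmed-down, hypothetical struct, the `objects`, `armed_timers` and `events` fields are assumptions, `disarm_timer` is assumed to be a method call, and the behavior for unknown handles (return 0 / false) is also an assumption.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical, trimmed-down `KernelState`. Only `handle_refcount` and
/// `async_file_handles` are names taken from the notes; the rest is assumed.
#[derive(Default)]
struct KernelState {
    handle_refcount: HashMap<u32, u32>,
    objects: HashMap<u32, String>,
    async_file_handles: HashSet<u32>,
    armed_timers: HashSet<u32>,
    events: Vec<String>,
}

impl KernelState {
    /// Saturating increment; returns the new refcount.
    /// Assumption: an unknown handle is left untouched and reports 0.
    fn retain_handle(&mut self, handle: u32) -> u32 {
        match self.handle_refcount.get_mut(&handle) {
            Some(count) => {
                *count = count.saturating_add(1);
                *count
            }
            None => 0,
        }
    }

    /// Saturating decrement; returns true only if this call destroyed the
    /// handle. Assumption: an unknown handle reports false.
    fn release_handle(&mut self, handle: u32) -> bool {
        let Some(count) = self.handle_refcount.get_mut(&handle) else {
            return false;
        };
        *count = count.saturating_sub(1);
        if *count > 0 {
            return false; // other refs remain
        }
        // Last reference gone: remove the object and scrub side tables.
        self.handle_refcount.remove(&handle);
        self.objects.remove(&handle);
        self.async_file_handles.remove(&handle);
        self.disarm_timer(handle);
        self.events.push(format!("handle.destroy {handle}"));
        true
    }

    /// Stand-in for the real timer teardown (assumed shape).
    fn disarm_timer(&mut self, handle: u32) {
        self.armed_timers.remove(&handle);
    }
}
```

Returning the new count from `retain_handle` and a destroyed flag from `release_handle` lets tests assert the post-spawn and post-close values directly.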
The implicit-exit path in `main.rs` also gained the missing
`thread.exit` schema event (previously only `ex_terminate_thread`
emitted it; canary's `XThread::Exit` covers both explicit and implicit
paths, so this is a symmetry fix even though it didn't cause the C+16
divergence directly).
## Code summary
~75 LOC across 4 files; purely additive, no refactor:
- `crates/xenia-kernel/src/state.rs` — `retain_handle` + `release_handle`
  helpers. +50 LOC.
- `crates/xenia-kernel/src/exports.rs` — retain in `ex_create_thread`,
release in `ex_terminate_thread`. +20 LOC.
- `crates/xenia-kernel/src/xam.rs` — retain in `xam_task_schedule`.
+10 LOC.
- `crates/xenia-app/src/main.rs` — implicit-exit path: emit `thread.exit`,
release self-ref; `install_initial_thread` post-call retain. +20 LOC.
Tests: +5 new, 1 extended (181 → 186 total).
- `xam_task_schedule_close_then_thread_exit_destroys_handle` —
  refcount lifecycle balance (close-first).
- `xam_task_thread_exit_then_close_destroys_handle` —
  refcount lifecycle balance (exit-first).
- `xam_task_schedule_then_close_round_trip_returns_one` — extended
with refcount asserts (post-spawn=2, post-close=1).
- `ex_create_thread_installs_self_reference` — verifies refcount=2
after spawn.
- `ex_terminate_thread_releases_self_reference` — verifies refcount=1
after terminate.
- `ex_create_then_close_then_exit_balances_refcount` — end-to-end
three-step lifecycle.