xenia-rs/audit-runs/phase-c17-keWait-native-object/investigation.md
MechaCat02 ef93a4fa14 handoff: VSync/event-wedge fixes + iterate 2.A–2.BC research notes
Source changes (dormant parity infra, retained from iterate 2.AI/2.AO):
- xenia-kernel/exports.rs: nt_create_event manual_reset polarity +
  related event wiring
- xenia-gpu/mmio_region.rs: D1MODE_VBLANK_VLINE_STATUS hardcode parity

Also lands the audit-runs/ analysis notes (.md/.txt/.json digests) for the
iterate 2.x VSync/0x10e8/0x1004 wedge investigation. Raw trace dumps
(.jsonl/.gz/.csv/.stdout) and agent worktrees (.claude/) are gitignored as
regenerable local artifacts — see memory + HANDOFF for the running findings.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-05 07:19:08 +02:00

# Phase C+17 Investigation — KeWait native-object handle synthesis (2026-05-14)
## Framing verification (reading-error #28 discipline)
C+15-α / C+16 catalog D-2/D-3/D-4 hypothesis: ours's `KeWait*` doesn't emit
`handle.create` when passed a raw native dispatcher object pointer (PKEVENT /
PKSEMAPHORE), while canary's `xeKeWaitForSingleObject` /
`KeWaitForMultipleObjects_entry` call `XObject::GetNativeObject` which
lazy-synthesizes an `XEvent`/`XSemaphore`/`XMutant`/`XTimer` wrapper and
inserts it in the object table — `ObjectTable::AddHandle` fires
`phase_a::EmitHandleCreateAuto` (object_table.cc:191-198).
### Canary's `GetNativeObject` semantics (xobject.cc:397-483)
Triggered by: `KeWait*` (and family) is called with a raw kernel-object
pointer. The first action of `xeKeWaitForSingleObject` is to call
`XObject::GetNativeObject<XObject>(kernel_state, object_ptr)`
(threading.cc:972, threading.cc:1070).
`GetNativeObject(kernel_state, native_ptr, as_type=-1, already_locked=false)`:
1. Read `X_DISPATCH_HEADER` at `native_ptr`. `as_type` defaults to
`header->type` (the dispatcher-type byte: 0=manual event, 1=auto event,
2=mutant, 5=semaphore, 8/9=timer).
2. Check the `wait_list.flink_ptr` magic: if it equals `kXObjSignature`
(`'X','E','N','\0'` = 0x58454E00) the dispatcher has already been adopted;
read the existing handle from `wait_list.blink_ptr` and return the existing
`XObject` via `LookupObject<XObject>(handle, true)`.
3. Otherwise (FIRST USE), synthesize:
   - case 0 / 1: `new XEvent(kernel_state)` → calls
     `XEvent::InitializeNative(native_ptr, header)` then assigns to result.
   - case 2: `new XMutant` + `InitializeNative` (but the body asserts:
     unsupported).
   - case 5: `new XSemaphore` + `InitializeNative` (semaphore->limit /
     signal_state).
   - case 3/4/6/7/8/9/18..24: `assert_always()`. Timer not handled here.
4. After construction, call `StashHandle(header, object->handle())` — writes
`kXObjSignature` to `wait_list.flink_ptr` and the new handle to
`wait_list.blink_ptr`. This guarantees idempotency: next call returns the
same handle.
Crucially, the `XObject` ctor `XObject(KernelState*, Type, host_object)`
(xobject.cc:35-48) **always** calls `kernel_state->object_table()->AddHandle(this, nullptr)`,
which (C+15-α-wired) **emits `handle.create`** via
`phase_a::EmitHandleCreateAuto` (object_table.cc:148-201).
So: first call → 1× `handle.create` emit; subsequent calls (signature
matches) → 0 emits.
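
The adopt-once behavior described above can be sketched in Rust as a small model. This is an illustrative sketch, not canary's or ours's actual code: the `Kernel` struct, its fields, the handle numbering, and the type byte living at offset 0 are all assumptions made for the example. The `+0x08`/`+0x0C` offsets follow the stamp location this note gives for ours. Unsupported types return `None` here, where canary would assert.

```rust
use std::collections::HashMap;

/// 'X','E','N','\0', the signature written into wait_list.flink_ptr.
const XOBJ_SIGNATURE: u32 = 0x5845_4E00;
// Assumed header layout for this sketch: type byte at +0x00,
// wait_list.flink_ptr at +0x08, wait_list.blink_ptr at +0x0C.
const TYPE_OFF: usize = 0x00;
const FLINK_OFF: usize = 0x08;
const BLINK_OFF: usize = 0x0C;

/// Hypothetical stand-in for guest memory plus an object table.
struct Kernel {
    mem: Vec<u8>,
    objects: HashMap<u32, u8>, // handle -> dispatcher type
    next_handle: u32,
    handle_create_emits: u32, // counts simulated handle.create events
}

impl Kernel {
    fn new(mem_size: usize) -> Self {
        Kernel {
            mem: vec![0; mem_size],
            objects: HashMap::new(),
            next_handle: 0x100,
            handle_create_emits: 0,
        }
    }

    /// Guest memory is big-endian.
    fn read_u32(&self, addr: usize) -> u32 {
        let b = &self.mem[addr..addr + 4];
        u32::from_be_bytes([b[0], b[1], b[2], b[3]])
    }

    fn write_u32(&mut self, addr: usize, value: u32) {
        self.mem[addr..addr + 4].copy_from_slice(&value.to_be_bytes());
    }

    /// Returns the handle wrapping the dispatcher at `native_ptr`.
    fn get_native_object(&mut self, native_ptr: usize) -> Option<u32> {
        // Already adopted: the handle was stashed in blink_ptr.
        if self.read_u32(native_ptr + FLINK_OFF) == XOBJ_SIGNATURE {
            return Some(self.read_u32(native_ptr + BLINK_OFF));
        }
        // First use: only events (0/1) and semaphores (5) are modelled.
        let ty = self.mem[native_ptr + TYPE_OFF];
        if !matches!(ty, 0 | 1 | 5) {
            return None;
        }
        let handle = self.next_handle;
        self.next_handle += 4;
        self.objects.insert(handle, ty);
        self.handle_create_emits += 1; // AddHandle -> handle.create
        // StashHandle: makes every later call take the early return.
        self.write_u32(native_ptr + FLINK_OFF, XOBJ_SIGNATURE);
        self.write_u32(native_ptr + BLINK_OFF, handle);
        Some(handle)
    }
}
```

Calling `get_native_object` twice on the same pointer yields the same handle and bumps `handle_create_emits` only once, which is the 1×/0× emit pattern stated above.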
### Canary KeWaitForSingleObject entry ordering (threading.cc:969-1013)
```
xeKeWaitForSingleObject(object_ptr, ...):
  auto object = XObject::GetNativeObject<XObject>(kernel_state(), object_ptr);
  // ^^^ emits handle.create on first use (object_type=1 / 3 / etc)
  if (!object) { return X_STATUS_ABANDONED_WAIT_0; }
  if (phase_a::IsEnabled()) {
    uint64_t sid = 0;
    if (!object->handles().empty()) {
      sid = phase_a::LookupHandleSemanticId(object->handles()[0]);
    }
    phase_a::EmitWaitBegin(&sid, 1, ...);  // wait.begin with real SID
  }
  result = object->Wait(...);
```
So canary's emit order on first use is: `handle.create` → `wait.begin`,
exactly as observed on the cold log (idx=102171 → 102172).
### Lifetime / refcount
The synthesized `XObject` lives until its `handle_ref_count` reaches 0. Since
`AddHandle` initializes it to 1, and there's no balancing `RemoveHandle`
elsewhere in the lazy-wrap path, the wrapper survives for the rest of the
session (no `handle.destroy` is emitted by canary either — confirmed by
absence in canary's log post-102171). This is structurally consistent with
canary's "stash the handle in the dispatcher; reuse forever" pattern.
For ours we mirror this: emit one `handle.create` on first
`ensure_dispatcher_object` adoption; no `handle.destroy` thereafter.
### Object-type mapping
| dispatcher header.type | canary symbol | ours `KernelObject` variant | ours object_type code (event_log) |
|------------------------|-------------------------|------------------------------|------------------------------------|
| 0 (manual event) | XEvent (notification) | Event { manual_reset=true } | EVENT = 1 |
| 1 (auto event) | XEvent (synchronization)| Event { manual_reset=false } | EVENT = 1 |
| 5 (semaphore) | XSemaphore | Semaphore { .. } | SEMAPHORE = 3 |
| 8 (notif timer) | XTimer (canary asserts) | Timer { manual_reset=true } | TIMER = 4 |
| 9 (sync timer) | XTimer (canary asserts) | Timer { manual_reset=false } | TIMER = 4 |
| 2 (mutant) | XMutant (canary asserts)| (no shadow — return early) | n/a |
Note canary's `GetNativeObject` `assert_always()`s for timer types 8/9 — it
panics on unsupported dispatcher types. Sylpheed apparently never hits these
in canary (canary keeps running, so the assert is never tripped in our cold
log). Ours's `ensure_dispatcher_object` historically supports timer/8/9 via
the shadow path; we keep that for ours's robustness and emit
`object_type=TIMER` for them. Cross-engine SID matching only matters for
codes both engines emit; ours's extra timer emits would surface as new
divergences (acceptable per the catalog).
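
The mapping in the table can be written as a single lookup. The sketch below is illustrative only: `Shadow` is a simplified stand-in for ours's `KernelObject` (payload fields other than `manual_reset` are omitted), and the function name is hypothetical. The numeric codes are the ones listed in the table.

```rust
// object_type codes as listed in the table (event_log).
const EVENT: u32 = 1;
const SEMAPHORE: u32 = 3;
const TIMER: u32 = 4;

/// Simplified stand-in for the shadow variants; not the real enum.
#[derive(Debug, PartialEq)]
enum Shadow {
    Event { manual_reset: bool },
    Semaphore,
    Timer { manual_reset: bool },
}

/// Maps a dispatcher header type byte to (shadow variant, object_type code).
/// Returns None where the table says no shadow is created.
fn shadow_for_dispatcher_type(ty: u8) -> Option<(Shadow, u32)> {
    match ty {
        0 => Some((Shadow::Event { manual_reset: true }, EVENT)),
        1 => Some((Shadow::Event { manual_reset: false }, EVENT)),
        5 => Some((Shadow::Semaphore, SEMAPHORE)),
        // Timers: shadowed by ours, while canary asserts on these types.
        8 => Some((Shadow::Timer { manual_reset: true }, TIMER)),
        9 => Some((Shadow::Timer { manual_reset: false }, TIMER)),
        // 2 (mutant) and anything else: no shadow, return early.
        _ => None,
    }
}
```

Keeping the variant and the emitted code in one `match` arm makes it harder for the shadow type and the logged `object_type` to drift apart.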
## Ours's pre-fix behavior
- `resolve_pseudo_handle` (exports.rs:4321): only translates the magic
`0xFFFF_FFFF` / `0xFFFF_FFFE` self-handle. For any other value it's a
pass-through. Native dispatcher pointers and real handles both reach the
next step unchanged.
- `ensure_dispatcher_object` (exports.rs:4363): on first encounter of a guest
pointer (`ptr >= 0x1_0000` and not already in `state.objects`), reads the
dispatcher header, creates the shadow `KernelObject::{Event, Semaphore,
Timer}`, inserts into `state.objects`, stamps `kXObjSignature` at
`+0x08/+0x0C`. **Does NOT emit `handle.create`.** **Does NOT bump
`handle_refcount`** (entry stays absent).
- `ke_wait_for_single_object` (exports.rs:4954): calls `resolve_pseudo_handle` →
`ensure_dispatcher_object` → `refresh_pkevent_shadow_from_guest` →
emits `wait.begin` with `lookup_handle_semantic_id(handle) = 0`
(since no SID was ever registered) → calls `do_wait_single`.
Result observed at idx=102171: ours emits `wait.begin
handles_semantic_ids=['0000000000000000']` and zero `handle.create` events.
## Fix shape
Symmetric: extend `ensure_dispatcher_object` to do the equivalent of
canary's `XObject::AddHandle` post-construction emit. Specifically:
1. After inserting the shadow into `state.objects` (existing line ~4409),
**and** when this is a fresh adoption (the inserted-before check is the
guard at line 4367), seed `handle_refcount.insert(ptr, 1)` for lifecycle
symmetry (no canary-side `handle.destroy` is expected, but consistency
with `alloc_handle_for` is worth ~1 LOC).
2. When `event_log::is_enabled()`, call
`event_log::emit_handle_create_auto(tid, cycle, /* pc */ 0, object_type,
raw_handle_id=ptr, object_name=None)`. The chosen `object_type` matches
the variant: Event=1, Semaphore=3, Timer=4. This both emits the event AND
registers the SID in the registry so the subsequent `wait.begin` resolves
non-zero.
Order in `ke_wait_for_single_object` already matches canary: synth (now
emits `handle.create`) before `wait.begin`. No re-ordering needed.
For `ke_wait_for_multiple_objects` the same applies — the loop already calls
`ensure_dispatcher_object` per pointer (exports.rs:5022). Each first
adoption emits one `handle.create` and the SID array used by `wait.begin`
becomes non-zero per element.
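
A minimal model of the fix shape, for orientation only. It does not reproduce `exports.rs`: the `State` fields, the monotonic SID counter, the in-memory `log`, and passing `object_type` as a parameter (the real code reads it from the dispatcher header) are all assumptions of this sketch.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
enum Event {
    HandleCreate { object_type: u32, raw_handle_id: u32, sid: u64 },
    WaitBegin { sids: Vec<u64> },
}

/// Hypothetical, reduced kernel state.
#[derive(Default)]
struct State {
    objects: HashMap<u32, u32>,         // guest ptr -> object_type code
    handle_refcount: HashMap<u32, u32>, // guest ptr -> refcount
    sid_registry: HashMap<u32, u64>,    // guest ptr -> semantic id
    next_sid: u64,
    log_enabled: bool,
    log: Vec<Event>,
}

impl State {
    fn ensure_dispatcher_object(&mut self, ptr: u32, object_type: u32) {
        // Guard: not a guest pointer, or already adopted -> no emit.
        if ptr < 0x1_0000 || self.objects.contains_key(&ptr) {
            return;
        }
        self.objects.insert(ptr, object_type);
        // Step 1: seed the refcount on fresh adoption.
        self.handle_refcount.insert(ptr, 1);
        // Step 2: emit handle.create and register the SID, only when logging.
        if self.log_enabled {
            self.next_sid += 1;
            let sid = self.next_sid;
            self.sid_registry.insert(ptr, sid);
            self.log.push(Event::HandleCreate { object_type, raw_handle_id: ptr, sid });
        }
    }

    /// Models the multi-object wait: adopt each pointer, then emit wait.begin.
    fn wait_multiple(&mut self, ptrs: &[(u32, u32)]) {
        for &(ptr, object_type) in ptrs {
            self.ensure_dispatcher_object(ptr, object_type);
        }
        if self.log_enabled {
            let sids: Vec<u64> = ptrs
                .iter()
                .map(|(p, _)| self.sid_registry.get(p).copied().unwrap_or(0))
                .collect();
            self.log.push(Event::WaitBegin { sids });
        }
    }
}
```

In this model a first wait on two fresh pointers logs two `HandleCreate` entries followed by one `WaitBegin` whose SIDs are all non-zero; a second wait on an already-adopted pointer logs only `WaitBegin`.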
### Idempotency / refcount lifecycle
- First-touch: shadow inserted + `handle_refcount[ptr] = 1` + emit
`handle.create`.
- Re-touch (same pointer): early return at the `contains_key` guard → no
emit, no refcount change. Matches canary's "already-initialized" branch.
- Destroy: there is no path that destroys these shadows in ours today
(parity with canary). If someone later wires `handle.destroy` on
shadow-removal, the refcount will be present and decrement-to-zero will
fire the symmetric event. Not in scope here.
### Scope
C+17 strictly addresses D-2/D-3/D-4. We **do not** touch:
- `NtWait*` (handle-based; already SID-resolves through the registry once
the underlying `Nt*Create*` emit fires `handle.create`).
- `Ke{Set,Reset,Pulse}Event` / `KeReleaseSemaphore` paths that also call
`ensure_dispatcher_object`. Their code is unchanged, but because they share
the helper they will now emit `handle.create` on first touch. That is
EXPECTED engine-symmetric behavior and matches canary (every entry into
`GetNativeObject` may emit). The wait-side has pre-context emits in both
engines, so observable order is preserved.
## Tripstone register
- Reading-error #28 (canary semantics first): VERIFIED.
- Reading-error #23 (widely-used primitive flip): MITIGATED via cold-vs-cold
gate and HARD-REVERT-IF-MAIN-REGRESSES discipline.
- Reading-error #19 (host-side emits): event_log::is_enabled() guard
preserved on every new emit — default-off zero cost.
- Refcount semantics: matches canary's "stash forever" lazy-wrap pattern;
not symmetric with `alloc_handle_for`'s NtClose-balanced lifecycle (which
is correct — these are different kinds of handles).
## Cascade prediction (for the run)
- A = verify canary's `GetNativeObject` semantics: DONE.
- B = land symmetric ~30-50 LOC fix: PENDING.
- C = main matched-prefix > 102,171: ~75%.
- D = sister chains advance (4 chains): ~75%.
- E = NEW divergences surface (downstream): ~80% (intended).