handoff: VSync/event-wedge fixes + iterate 2.A–2.BC research notes

Source changes (dormant parity infra, retained from iterate 2.AI/2.AO):
- xenia-kernel/exports.rs: nt_create_event manual_reset polarity +
  related event wiring
- xenia-gpu/mmio_region.rs: D1MODE_VBLANK_VLINE_STATUS hardcode parity

Also lands the audit-runs/ analysis notes (.md/.txt/.json digests) for the
iterate 2.x VSync/0x10e8/0x1004 wedge investigation. Raw trace dumps
(.jsonl/.gz/.csv/.stdout) and agent worktrees (.claude/) are gitignored as
regenerable local artifacts — see memory + HANDOFF for the running findings.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
MechaCat02
2026-06-05 07:19:08 +02:00
parent acd1656753
commit ef93a4fa14
620 changed files with 108303 additions and 1 deletions

@@ -0,0 +1,187 @@
# Phase C+17 Investigation — KeWait native-object handle synthesis (2026-05-14)
## Framing verification (reading-error #28 discipline)
C+15-α / C+16 catalog D-2/D-3/D-4 hypothesis: ours's `KeWait*` doesn't emit
`handle.create` when passed a raw native dispatcher object pointer (PKEVENT /
PKSEMAPHORE), while canary's `xeKeWaitForSingleObject` /
`KeWaitForMultipleObjects_entry` call `XObject::GetNativeObject` which
lazy-synthesizes an `XEvent`/`XSemaphore`/`XMutant`/`XTimer` wrapper and
inserts it in the object table — `ObjectTable::AddHandle` fires
`phase_a::EmitHandleCreateAuto` (object_table.cc:191-198).
### Canary's `GetNativeObject` semantics (xobject.cc:397-483)
Triggered by: `KeWait*` (and family) is called with a raw kernel-object
pointer. The first action of `xeKeWaitForSingleObject` is to call
`XObject::GetNativeObject<XObject>(kernel_state, object_ptr)`
(threading.cc:972, threading.cc:1070).
`GetNativeObject(kernel_state, native_ptr, as_type=-1, already_locked=false)`:
1. Read `X_DISPATCH_HEADER` at `native_ptr`. `as_type` defaults to
`header->type` (the dispatcher-type byte: 0=manual event, 1=auto event,
2=mutant, 5=semaphore, 8/9=timer).
2. Check the `wait_list.flink_ptr` magic: if it equals `kXObjSignature`
(`'X','E','N','\0'` = 0x58454E00) the dispatcher has already been adopted;
read the existing handle from `wait_list.blink_ptr` and return the existing
`XObject` via `LookupObject<XObject>(handle, true)`.
3. Otherwise FIRST USE — synthesize:
- case 0 / 1: `new XEvent(kernel_state)` → calls
`XEvent::InitializeNative(native_ptr, header)` then assigns to result.
- case 2: `new XMutant` + `InitializeNative` (but body asserts —
unsupported).
- case 5: `new XSemaphore` + `InitializeNative` (semaphore->limit /
signal_state).
- case 3/4/6/7/8/9/18..24: `assert_always()`. Timer not handled here.
4. After construction, call `StashHandle(header, object->handle())` — writes
`kXObjSignature` to `wait_list.flink_ptr` and the new handle to
`wait_list.blink_ptr`. This guarantees idempotency: next call returns the
same handle.
Crucially, the `XObject` ctor `XObject(KernelState*, Type, host_object)`
(xobject.cc:35-48) **always** calls `kernel_state->object_table()->AddHandle(this, nullptr)`,
which (C+15-α-wired) **emits `handle.create`** via
`phase_a::EmitHandleCreateAuto` (object_table.cc:148-201).
So: first call → 1× `handle.create` emit; subsequent calls (signature
matches) → 0 emits.
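
The adopt-once behavior above can be sketched in Rust as follows. This is a minimal model, not canary's code: `DispatchHeader`, `ObjectTable`, and the handle numbering are hypothetical stand-ins, and unsupported types return `None` where canary would assert.

```rust
/// Value stashed in `wait_list.flink_ptr` once a dispatcher is adopted.
const K_XOBJ_SIGNATURE: u32 = 0x5845_4E00; // 'X','E','N','\0'

/// Hypothetical, simplified view of a guest dispatcher header.
#[derive(Default)]
struct DispatchHeader {
    type_byte: u8,
    flink_ptr: u32,
    blink_ptr: u32,
}

/// Hypothetical object table that only counts `handle.create` emits.
#[derive(Default)]
struct ObjectTable {
    next_handle: u32,
    handle_create_emits: u32,
}

impl ObjectTable {
    /// Allocates a handle and records one emit (handle numbering is made up).
    fn add_handle(&mut self) -> u32 {
        self.next_handle += 4;
        self.handle_create_emits += 1;
        self.next_handle
    }
}

/// Returns the handle for a dispatcher, adopting it on first use.
fn get_native_object(table: &mut ObjectTable, header: &mut DispatchHeader) -> Option<u32> {
    // Already adopted: the handle was stashed in blink_ptr, no new emit.
    if header.flink_ptr == K_XOBJ_SIGNATURE {
        return Some(header.blink_ptr);
    }
    match header.type_byte {
        // Events (0/1) and semaphores (5) are the supported first-use cases.
        0 | 1 | 5 => {
            let handle = table.add_handle();
            header.flink_ptr = K_XOBJ_SIGNATURE;
            header.blink_ptr = handle;
            Some(handle)
        }
        // Canary asserts for these; the sketch just declines.
        _ => None,
    }
}
```

Calling `get_native_object` twice on the same header returns the same handle and bumps `handle_create_emits` only once, which is the 1× / 0× emit pattern described above.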
### Canary KeWaitForSingleObject entry ordering (threading.cc:969-1013)
```
xeKeWaitForSingleObject(object_ptr, ...):
  auto object = XObject::GetNativeObject<XObject>(kernel_state(), object_ptr);
  // ^^^ emits handle.create on first use (object_type=1 / 3 / etc)
  if (!object) { return X_STATUS_ABANDONED_WAIT_0; }
  if (phase_a::IsEnabled()) {
    uint64_t sid = 0;
    if (!object->handles().empty()) {
      sid = phase_a::LookupHandleSemanticId(object->handles()[0]);
    }
    phase_a::EmitWaitBegin(&sid, 1, ...);  // wait.begin with real SID
  }
  result = object->Wait(...);
```
So canary's emit order on first use is: `handle.create` → `wait.begin`,
exactly as observed on the cold log (idx=102171 → 102172).
### Lifetime / refcount
The synthesized `XObject` lives until its `handle_ref_count` reaches 0. Since
`AddHandle` initializes it to 1, and there's no balancing `RemoveHandle`
elsewhere in the lazy-wrap path, the wrapper survives for the rest of the
session (no `handle.destroy` is emitted by canary either — confirmed by
absence in canary's log post-102171). This is structurally consistent with
canary's "stash the handle in the dispatcher; reuse forever" pattern.
For ours we mirror this: emit one `handle.create` on first
`ensure_dispatcher_object` adoption; no `handle.destroy` thereafter.
### Object-type mapping
| dispatcher header.type | canary symbol | ours `KernelObject` variant | ours object_type code (event_log) |
|------------------------|-------------------------|------------------------------|------------------------------------|
| 0 (manual event) | XEvent (notification) | Event { manual_reset=true } | EVENT = 1 |
| 1 (auto event) | XEvent (synchronization)| Event { manual_reset=false } | EVENT = 1 |
| 5 (semaphore) | XSemaphore | Semaphore { .. } | SEMAPHORE = 3 |
| 8 (notif timer) | XTimer (canary asserts) | Timer { manual_reset=true } | TIMER = 4 |
| 9 (sync timer) | XTimer (canary asserts) | Timer { manual_reset=false } | TIMER = 4 |
| 2 (mutant) | XMutant (canary asserts)| (no shadow — return early) | n/a |
Note canary's `GetNativeObject` `assert_always()`s for timer types 8/9 — it
panics on unsupported dispatcher types. Sylpheed apparently never hits these
in canary (canary keeps running, so the assert is never tripped in our cold
log). Ours's `ensure_dispatcher_object` historically supports timer types 8/9
via the shadow path; we keep that for robustness and emit
`object_type=TIMER` for them. Cross-engine SID matching only matters for
codes both engines emit; ours's extra timer emits would surface as new
divergences (acceptable per the catalog).
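
A compact way to read the table is as a single match on the header type byte. The sketch below uses the codes from the table; the function names are hypothetical and not taken from exports.rs.

```rust
/// Object-type codes written to the event log (values from the mapping table).
const EVENT: u32 = 1;
const SEMAPHORE: u32 = 3;
const TIMER: u32 = 4;

/// Maps a dispatcher `header.type` byte to ours's event-log object_type code.
/// Returns `None` where no shadow is created (mutant and unlisted types).
fn object_type_code(header_type: u8) -> Option<u32> {
    match header_type {
        0 | 1 => Some(EVENT),
        5 => Some(SEMAPHORE),
        8 | 9 => Some(TIMER),
        _ => None,
    }
}

/// True for the types canary's lazy-wrap path also handles without asserting,
/// i.e. the codes where cross-engine SID matching applies.
fn canary_also_emits(header_type: u8) -> bool {
    matches!(header_type, 0 | 1 | 5)
}
```

Types where `object_type_code` is `Some` but `canary_also_emits` is false (8 and 9) are exactly the ours-only timer emits noted above.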
## Ours's pre-fix behavior
- `resolve_pseudo_handle` (exports.rs:4321): only translates the magic
`0xFFFF_FFFF` / `0xFFFF_FFFE` self-handle. For any other value it's a
pass-through. Native dispatcher pointers and real handles both reach the
next step unchanged.
- `ensure_dispatcher_object` (exports.rs:4363): on first encounter of a guest
pointer (`ptr >= 0x1_0000` and not already in `state.objects`), reads the
dispatcher header, creates the shadow `KernelObject::{Event, Semaphore,
Timer}`, inserts into `state.objects`, stamps `kXObjSignature` at
`+0x08/+0x0C`. **Does NOT emit `handle.create`.** **Does NOT bump
`handle_refcount`** (entry stays absent).
- `ke_wait_for_single_object` (exports.rs:4954): calls `resolve_pseudo_handle` →
`ensure_dispatcher_object` → `refresh_pkevent_shadow_from_guest` →
emits `wait.begin` with `lookup_handle_semantic_id(handle) = 0`
(since no SID was ever registered) → calls `do_wait_single`.
Result observed at idx=102171: ours emits `wait.begin
handles_semantic_ids=['0000000000000000']` and zero `handle.create` events.
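
The two entry checks above can be sketched as follows. This is an illustration only: the process/thread split of the two magic values follows the usual NT convention and is an assumption about ours, and `needs_adoption` is a hypothetical name for the first-encounter guard.

```rust
use std::collections::HashSet;

// Magic values named in the notes; which one means process vs. thread is an
// assumption based on the NT convention, not read from exports.rs.
const PSEUDO_CURRENT_PROCESS: u32 = 0xFFFF_FFFF;
const PSEUDO_CURRENT_THREAD: u32 = 0xFFFF_FFFE;

/// Translates only the two magic values; every other value passes through,
/// so native dispatcher pointers and real handles are left unchanged.
fn resolve_pseudo_handle(value: u32, current_process: u32, current_thread: u32) -> u32 {
    match value {
        PSEUDO_CURRENT_PROCESS => current_process,
        PSEUDO_CURRENT_THREAD => current_thread,
        other => other,
    }
}

/// First-encounter guard: a guest pointer that has no shadow object yet.
fn needs_adoption(ptr: u32, objects: &HashSet<u32>) -> bool {
    ptr >= 0x1_0000 && !objects.contains(&ptr)
}
```

A value such as `0x8200_1000` passes through `resolve_pseudo_handle` untouched and then satisfies `needs_adoption` exactly once, which is the point where the pre-fix code created the shadow without emitting anything.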
## Fix shape
Symmetric: extend `ensure_dispatcher_object` to do the equivalent of
canary's `XObject::AddHandle` post-construction emit. Specifically:
1. After inserting the shadow into `state.objects` (existing line ~4409),
**and** when this is a fresh adoption (the inserted-before check is the
guard at line 4367), seed `handle_refcount.insert(ptr, 1)` for lifecycle
symmetry (no canary-side `handle.destroy` is expected, but consistency
with `alloc_handle_for` is worth ~1 LOC).
2. When `event_log::is_enabled()`, call
`event_log::emit_handle_create_auto(tid, cycle, /* pc */ 0, object_type,
raw_handle_id=ptr, object_name=None)`. The chosen `object_type` matches
the variant: Event=1, Semaphore=3, Timer=4. This both emits the event AND
registers the SID in the registry so the subsequent `wait.begin` resolves
non-zero.
Order in `ke_wait_for_single_object` already matches canary: synth (now
emits `handle.create`) before `wait.begin`. No re-ordering needed.
For `ke_wait_for_multiple_objects` the same applies — the loop already calls
`ensure_dispatcher_object` per pointer (exports.rs:5022). Each first
adoption emits one `handle.create` and the SID array used by `wait.begin`
becomes non-zero per element.
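
A minimal sketch of the fix shape, under stated assumptions: `State`, its `sids` map, the string log, and the counter-based SID are hypothetical stand-ins for the real registry and `event_log`, and the shadow kind is passed in rather than read from the guest header.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq)]
enum ShadowKind { Event, Semaphore, Timer }

impl ShadowKind {
    /// Event-log object_type code for the variant (Event=1, Semaphore=3, Timer=4).
    fn object_type(self) -> u32 {
        match self {
            ShadowKind::Event => 1,
            ShadowKind::Semaphore => 3,
            ShadowKind::Timer => 4,
        }
    }
}

#[derive(Default)]
struct State {
    objects: HashMap<u32, ShadowKind>,
    handle_refcount: HashMap<u32, u32>,
    sids: HashMap<u32, u64>, // hypothetical SID registry
    log: Vec<String>,        // hypothetical event log
    log_enabled: bool,
    next_sid: u64,           // placeholder SID source, not the real scheme
}

impl State {
    fn ensure_dispatcher_object(&mut self, ptr: u32, kind: ShadowKind) {
        // Re-touch (or not a guest pointer): no emit, no refcount change.
        if ptr < 0x1_0000 || self.objects.contains_key(&ptr) {
            return;
        }
        self.objects.insert(ptr, kind);
        self.handle_refcount.insert(ptr, 1); // seed for lifecycle symmetry
        if self.log_enabled {
            // Emit handle.create and register the SID in one step.
            self.next_sid += 1;
            self.sids.insert(ptr, self.next_sid);
            self.log.push(format!("handle.create type={} raw={:#x}", kind.object_type(), ptr));
        }
    }

    /// Adopts every pointer first, then emits wait.begin with the resolved SIDs.
    fn wait_multiple(&mut self, ptrs: &[(u32, ShadowKind)]) -> Vec<u64> {
        for &(ptr, kind) in ptrs {
            self.ensure_dispatcher_object(ptr, kind);
        }
        let sids: Vec<u64> = ptrs
            .iter()
            .map(|(p, _)| self.sids.get(p).copied().unwrap_or(0))
            .collect();
        if self.log_enabled {
            self.log.push(format!("wait.begin sids={:?}", sids));
        }
        sids
    }
}
```

With logging enabled, a first wait on two fresh pointers logs two `handle.create` entries followed by one `wait.begin` with non-zero SIDs; a second wait on the same pointer logs only `wait.begin`.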
### Idempotency / refcount lifecycle
- First-touch: shadow inserted + `handle_refcount[ptr] = 1` + emit
`handle.create`.
- Re-touch (same pointer): early return at the `contains_key` guard → no
emit, no refcount change. Matches canary's "already-initialized" branch.
- Destroy: there is no path that destroys these shadows in ours today
(parity with canary). If someone later wires `handle.destroy` on
shadow-removal, the refcount will be present and decrement-to-zero will
fire the symmetric event. Not in scope here.
### Scope
C+17 strictly addresses D-2/D-3/D-4. We **do not** touch:
- `NtWait*` (handle-based; already SID-resolves through the registry once
the underlying `Nt*Create*` emit fires `handle.create`).
- `Ke{Set,Reset,Pulse}Event` / `KeReleaseSemaphore` paths that also call
`ensure_dispatcher_object`. These will now emit `handle.create` on their
first-touch — that's EXPECTED engine-symmetric behavior, and matches
canary (every entry into `GetNativeObject` may emit). The wait-side has
pre-context emits in both engines, so observable order is preserved.
## Tripstone register
- Reading-error #28 (canary semantics first): VERIFIED.
- Reading-error #23 (widely-used primitive flip): MITIGATED via cold-vs-cold
gate and HARD-REVERT-IF-MAIN-REGRESSES discipline.
- Reading-error #19 (host-side emits): event_log::is_enabled() guard
preserved on every new emit — default-off zero cost.
- Refcount semantics: matches canary's "stash forever" lazy-wrap pattern;
not symmetric with `alloc_handle_for`'s NtClose-balanced lifecycle (which
is correct — these are different kinds of handles).
## Cascade prediction (for the run)
A=verify canary's GetNativeObject semantics: DONE.
B=land symmetric ~30-50 LOC fix: PENDING.
C=main matched-prefix > 102,171: ~75%.
D=sister chains advance (4 chains): ~75%.
E=NEW divergences surface (downstream): ~80% (intended).