handoff: VSync/event-wedge fixes + iterate 2.A–2.BC research notes
Source changes (dormant parity infra, retained from iterate 2.AI/2.AO):
- xenia-kernel/exports.rs: nt_create_event manual_reset polarity + related event wiring
- xenia-gpu/mmio_region.rs: D1MODE_VBLANK_VLINE_STATUS hardcode parity

Also lands the audit-runs/ analysis notes (.md/.txt/.json digests) for the iterate 2.x VSync/0x10e8/0x1004 wedge investigation. Raw trace dumps (.jsonl/.gz/.csv/.stdout) and agent worktrees (.claude/) are gitignored as regenerable local artifacts; see memory + HANDOFF for the running findings.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@@ -0,0 +1,102 @@
# AUDIT-062 regression check (Phase C+19)

## What AUDIT-062 verified

AUDIT-062 (2026-05-12, dossier:
`xenia-rs/docs/functions/sub_821CB030.md`, memory:
`project_xenia_rs_audit_062_worker_wake_gap_2026_05_12.md`) traced
the worker-cluster wedge to "the producer never signals the
worker-idle event". It explicitly RULED OUT NtDuplicate aliasing as the
bug, citing the live `ours-ntdup.jsonl` trace:

> ours DOES dup the wedge (kernel-aliasing hypothesis falsified):
> `--lr-trace=0x8284DF7C` captured
> `tid=13 cycle=26711 r3=0x000012ac r4=0x40541E80` (out_ptr). Per ours's
> `crates/xenia-kernel/src/exports.rs:4263`, NtDup aliases:
> dup_id = source_id = 0x12AC, refcount++. NOT a kernel bug.

The load-bearing invariant from AUDIT-062 is:
**signal-on-dup wakes wait-on-source.**

Pre-C+19 mechanism: dup_id collided with source_id, so the same
`state.objects` entry was hit by both paths.

Post-C+19 mechanism: dup_id is a fresh slot mapped to source_id via
`state.handle_aliases`; every lookup through `resolve_handle`
canonicalizes to source_id, hitting the same `state.objects` entry.
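The post-C+19 lookup path can be sketched in a few lines. This is a minimal illustration, not the code in `exports.rs`: only `state.objects`, `state.handle_aliases` and `resolve_handle` are named in these notes, so the `State` and `Event` layouts, the id allocator, and the `duplicate`, `set_event` and `is_signaled` helpers are all assumptions made for the example.

```rust
use std::collections::HashMap;

/// Simplified stand-in for a kernel Event object (assumed layout).
#[derive(Default)]
struct Event {
    signaled: bool,
}

/// Hypothetical miniature of the kernel state; only the two maps named in
/// the notes are modeled.
#[derive(Default)]
struct State {
    objects: HashMap<u32, Event>,      // canonical id -> object
    handle_aliases: HashMap<u32, u32>, // dup id -> source id
    next_id: u32,
}

impl State {
    /// Hands out a fresh slot id (the stride of 4 is arbitrary).
    fn fresh_id(&mut self) -> u32 {
        self.next_id += 4;
        self.next_id
    }

    fn create_event(&mut self) -> u32 {
        let id = self.fresh_id();
        self.objects.insert(id, Event::default());
        id
    }

    /// Canonicalizes a handle: an alias maps to its source, anything else
    /// is returned unchanged.
    fn resolve_handle(&self, h: u32) -> u32 {
        self.handle_aliases.get(&h).copied().unwrap_or(h)
    }

    /// Post-C+19 style duplicate: a fresh slot aliasing the canonical source.
    fn duplicate(&mut self, source: u32) -> Option<u32> {
        let canonical = self.resolve_handle(source);
        if !self.objects.contains_key(&canonical) {
            return None;
        }
        let dup = self.fresh_id();
        self.handle_aliases.insert(dup, canonical);
        Some(dup)
    }

    /// Signals whichever object the handle canonicalizes to.
    fn set_event(&mut self, h: u32) -> bool {
        let canonical = self.resolve_handle(h);
        match self.objects.get_mut(&canonical) {
            Some(ev) => {
                ev.signaled = true;
                true
            }
            None => false,
        }
    }

    fn is_signaled(&self, h: u32) -> Option<bool> {
        self.objects.get(&self.resolve_handle(h)).map(|ev| ev.signaled)
    }
}
```

In this sketch, signaling through the dup id and then reading through the source id touches the same `objects` entry, which is the AUDIT-062 invariant. With an empty alias map, `resolve_handle` is the identity, so handles that were never duplicated behave as they did before.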

## Risk assessment

| Behavior at risk | Pre-C+19 | Post-C+19 |
|------------------|----------|-----------|
| Signal-on-dup wakes wait-on-source | YES (id collision) | YES (alias canonicalize) |
| File ops on dup work | YES (id collision) | YES (alias canonicalize) |
| Thread suspend/resume on dup | YES (id collision) | YES (alias canonicalize) |
| Close-dup keeps source alive | partial (refcount sharing) | YES (per-slot refcount + canonical_slot_count) |
| Close-source keeps dup alive | partial | YES |
| handle.destroy emitted per slot | NO (one per object) | YES (one per slot, canary parity) |
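
The lifetime rows of the table (close-dup, close-source, one `handle.destroy` per slot) can be illustrated with a small model. This is a sketch under stated assumptions: `canonical_slot_count` and the per-slot refcount are named in these notes, but the `HandleTable` type, its field layout, the `destroyed` log standing in for `handle.destroy` events, and every method below are hypothetical.

```rust
use std::collections::HashMap;

/// Hypothetical handle table; field names echo the notes, the layout is assumed.
#[derive(Default)]
struct HandleTable {
    objects: HashMap<u32, &'static str>,     // canonical id -> placeholder object
    handle_aliases: HashMap<u32, u32>,       // dup slot -> canonical id
    slot_refcount: HashMap<u32, u32>,        // per-slot refcount
    canonical_slot_count: HashMap<u32, u32>, // canonical id -> live slot count
    destroyed: Vec<u32>,                     // stand-in for handle.destroy events
    next_id: u32,
}

impl HandleTable {
    /// Allocates a slot; `canonical = None` means the slot is its own canonical id.
    fn alloc_slot(&mut self, canonical: Option<u32>) -> u32 {
        self.next_id += 4; // arbitrary stride
        let slot = self.next_id;
        let canon = canonical.unwrap_or(slot);
        if canon != slot {
            self.handle_aliases.insert(slot, canon);
        }
        self.slot_refcount.insert(slot, 1);
        *self.canonical_slot_count.entry(canon).or_insert(0) += 1;
        slot
    }

    fn create(&mut self, payload: &'static str) -> u32 {
        let slot = self.alloc_slot(None);
        self.objects.insert(slot, payload);
        slot
    }

    /// Canonical id for a live slot, or None if the slot has been closed.
    fn resolve(&self, h: u32) -> Option<u32> {
        self.slot_refcount.get(&h)?;
        Some(self.handle_aliases.get(&h).copied().unwrap_or(h))
    }

    fn duplicate(&mut self, h: u32) -> Option<u32> {
        let canon = self.resolve(h)?;
        Some(self.alloc_slot(Some(canon)))
    }

    fn lookup(&self, h: u32) -> Option<&'static str> {
        self.objects.get(&self.resolve(h)?).copied()
    }

    /// Closes one slot; the object dies only when its last slot goes away.
    fn close(&mut self, h: u32) -> bool {
        let Some(canon) = self.resolve(h) else { return false };
        let rc = self.slot_refcount.get_mut(&h).expect("live slot");
        *rc -= 1;
        if *rc > 0 {
            return true;
        }
        self.slot_refcount.remove(&h);
        self.handle_aliases.remove(&h);
        self.destroyed.push(h); // one destroy event per slot
        let live = self.canonical_slot_count.get_mut(&canon).expect("counted");
        *live -= 1;
        if *live == 0 {
            self.canonical_slot_count.remove(&canon);
            self.objects.remove(&canon);
        }
        true
    }
}
```

Closing the source slot in this model leaves the object reachable through the dup, because the object is keyed by canonical id while slot liveness is tracked separately. The object and its `canonical_slot_count` entry are removed only once the last slot is closed.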

## Tests proving AUDIT-062 invariant survives

11 new unit tests in `xenia-kernel/src/exports.rs::tests`:

1. `nt_duplicate_object_allocates_fresh_handle_id`: dup != source.
2. **`nt_duplicate_object_signal_on_dup_wakes_wait_on_source`**:
   **THE AUDIT-062 REGRESSION GUARD**. Creates an Event, dups it,
   signals the dup, and asserts the source Event's `signaled == true`. If
   this test ever fails, the C+19 fix has broken AUDIT-062's
   worker-cluster wedge resolution.
3. `nt_duplicate_object_signal_on_source_visible_via_dup`: the symmetric case.
4. `nt_duplicate_object_refcount_lifecycle`: per-slot refcount =
   1 for both source and dup; canonical_slot_count = 2; alias map
   has `dup → source`.
5. `nt_duplicate_object_then_close_dup_keeps_source_live`:
   close dup, source still live and signalable.
6. `nt_duplicate_object_then_close_source_keeps_dup_live`:
   close source, dup still live and signalable (incl. signal
   propagation test).
7. `nt_duplicate_object_close_both_destroys_underlying`:
   close both → object gone; canonical_slot_count entry pruned.
8. `nt_duplicate_object_with_close_source_flag`:
   DUPLICATE_CLOSE_SOURCE atomically dups and closes source.
9. `nt_duplicate_object_invalid_handle_returns_invalid_handle`.
10. `nt_duplicate_object_dup_of_dup_canonicalizes`:
    transitive aliasing flattens to original source.
11. `nt_duplicate_object_works_for_semaphore`: non-Event type works
    identically.

All 11 pass. Kernel tests: 193 → 204 (+11). Full workspace test
suite passes.

## End-to-end runtime verification

Direct inspection of `ours-cold.jsonl` at tid=1 idx=102553:

```
idx=102551 kind=import.call    name=NtDuplicateObject
idx=102552 kind=kernel.call    name=NtDuplicateObject
idx=102553 kind=handle.create  name=  (FRESH slot)        ← C+19 NEW
idx=102554 kind=kernel.return  name=NtDuplicateObject ret=0
```

The `handle.create` at idx=102553 is the canary-symmetric event that
was missing pre-C+19. This verifies that the fix lands at the observable
boundary.
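
The same check can be automated rather than read off by eye. The sketch below works on the whitespace-separated `key=value` rendering shown in the excerpt, not on the raw `.jsonl` records, whose field names are not given in these notes. The helper names `field` and `dup_calls_emit_handle_create` are hypothetical, and the input is assumed to be pre-filtered to a single thread.

```rust
/// Extracts the value of `key=` from a whitespace-separated `key=value` record.
fn field<'a>(line: &'a str, key: &str) -> Option<&'a str> {
    line.split_whitespace()
        .find_map(|tok| tok.strip_prefix(key)?.strip_prefix('='))
}

/// Returns true when every NtDuplicateObject kernel call in the trace emits
/// at least one `handle.create` before its matching `kernel.return`, and at
/// least one such call is present.
fn dup_calls_emit_handle_create(trace: &str) -> bool {
    let mut in_dup = false; // inside an NtDuplicateObject kernel call
    let mut saw_create = false; // handle.create seen since that call began
    let mut saw_any = false; // at least one complete call was checked
    for line in trace.lines() {
        let kind = field(line, "kind").unwrap_or("");
        let name = field(line, "name").unwrap_or("");
        match kind {
            "kernel.call" if name == "NtDuplicateObject" => {
                in_dup = true;
                saw_create = false;
            }
            "handle.create" if in_dup => saw_create = true,
            "kernel.return" if in_dup && name == "NtDuplicateObject" => {
                if !saw_create {
                    return false;
                }
                in_dup = false;
                saw_any = true;
            }
            _ => {}
        }
    }
    saw_any && !in_dup
}
```

Fed the four-line excerpt above, the function returns `true`; with the `handle.create` line removed (the pre-C+19 shape described in these notes), it returns `false`.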

## Conclusion

AUDIT-062's load-bearing invariant (signal-on-dup wakes
wait-on-source) is PRESERVED by the C+19 fix. The invariant
relies on canonical kernel-object sharing, which is now achieved
via the alias map rather than id collision. The mechanism shift
is observationally equivalent for upstream callers: they pass dup_id
to Nt*/Ke* functions; ours resolves dup_id → source_id at lookup
time; the same `KernelObject::Event` (or whatever type) is
mutated regardless of which slot id the caller named.

The pre-C+19 mechanism (id collision) is a special case of the
post-C+19 mechanism (alias map): if no dup_id is ever allocated,
`handle_aliases.get(h)` returns `None`, `resolve_handle(h)` returns
`h` unchanged, and every lookup behaves exactly as it did before.

No AUDIT-062 regression detected.