handoff: VSync/event-wedge fixes + iterate 2.A–2.BC research notes
Source changes (dormant parity infra, retained from iterate 2.AI/2.AO):
- xenia-kernel/exports.rs: nt_create_event manual_reset polarity + related event wiring
- xenia-gpu/mmio_region.rs: D1MODE_VBLANK_VLINE_STATUS hardcode parity

Also lands the audit-runs/ analysis notes (.md/.txt/.json digests) for the
iterate 2.x VSync/0x10e8/0x1004 wedge investigation. Raw trace dumps
(.jsonl/.gz/.csv/.stdout) and agent worktrees (.claude/) are gitignored as
regenerable local artifacts — see memory + HANDOFF for the running findings.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
63
audit-runs/phase-w-wedge-reattack/current-state.md
Normal file
@@ -0,0 +1,63 @@
# Phase W — current ours state (Phase W.1 ground truth)

Captured 2026-05-19. `-n 500000000` cold run with `XENIA_CACHE_WIPE=1
--halt-on-deadlock --trace-handles --ctor-probe=0x825070F0`.

## Headline: wedge is STRUCTURALLY UNCHANGED from AUDIT-049/058/059/062/065

- **tid=1** (main) `state=Blocked(WaitAny { handles: [4808] })` =
  handle `0x12c8` = `Thread(id=13, exit=None)`. PC `0x824ac578`
  (`do_wait_single`), `r12 = 0x82173c64` = `sub_82173990+0x2D4`
  (post-wait PC). Same join-on-tid=13 as AUDIT-049.
- **tid=13** (worker) `state=Blocked(WaitAny { handles: [4816] })` =
  handle `0x12d0` = `Event/Auto`. PC `0x824ac578`, `r12 = 0x821cb1e0`
  = `sub_821CB030+0x1B0` (post-wait PC). Same wait site as AUDIT-059.
- **`sub_825070F0` fires 0×** (no `CTOR-PROBE` line in stderr). Still
  the same activation gate that has not budged across all of
  Phase C + Phase D.
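
The decimal handles and raw return addresses in the bullets above map to the hex handles and `sub_<base>+offset` sites by plain arithmetic. A minimal Rust sketch of that conversion, assuming the function base addresses quoted in these notes; the helper names `site` and `handle_hex` are hypothetical and not part of the emulator:

```rust
/// Render a guest address as `sub_<BASE>+0x<OFFSET>`.
/// `func_base` is assumed to be the start of the enclosing function
/// (taken from the notes, not looked up in a symbol table).
/// Returns `None` if `pc` lies below `func_base`.
fn site(pc: u32, func_base: u32) -> Option<String> {
    let offset = pc.checked_sub(func_base)?;
    Some(format!("sub_{:08X}+0x{:X}", func_base, offset))
}

/// Thread states print handles in decimal; handle tables print them in hex.
fn handle_hex(handle: u32) -> String {
    format!("{:#x}", handle)
}
```

With the values above, `site(0x82173c64, 0x82173990)` yields `sub_82173990+0x2D4`, and `handle_hex(4816)` yields `0x12d0`.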

## Wedge handle identity drift (AUDIT-049 era → today)

| Audit | Wedge handle | Wedge thread | Site |
|---|---|---|---|
| AUDIT-049 | `0x1288` | tid=13 | `sub_821CB030+0x1AC` |
| AUDIT-059 | `0x12AC` | tid=13 | `sub_821CB030+0x1AC` |
| AUDIT-062 | `0x12AC` | tid=13 | `sub_821CB030+0x1AC` |
| AUDIT-065 | `0x12AC` | tid=13 | `sub_821CB030+0x1AC` |
| **Phase W** | **`0x12d0`** | **tid=13** | **`sub_821CB030+0x1B0`** |

Handle ID drift is expected per dossier caveat (allocator ordinals
differ between runs); the site, thread, and "Event/Auto created by
tid=13 itself via `lr=0x824a9f6c src=NtCreateEvent`" all match. (The
Phase W row lists the post-wait PC `+0x1B0`, which is 4 bytes past the
`+0x1AC` recorded by the earlier audits.)

## All `<NO_SIGNALS_DESPITE_WAITS>` handles at deadlock

```
handle=0x00001020 kind=Event/Manual waiters=1 signals=0 waits=1 wakes=0
handle=0x00001040 kind=Event/Auto waiters=0 signals=0 waits=32 wakes=0
handle=0x000010b0 kind=Event/Auto waiters=0 signals=0 waits=7 wakes=0
handle=0x000010ec kind=Event/Manual waiters=1 signals=0 waits=2 wakes=0
handle=0x000012d0 kind=Event/Auto waiters=1 signals=0 waits=1 wakes=0 ← THE WEDGE
handle=0x000012e4 kind=Event/Auto waiters=1 signals=0 waits=1 wakes=0
```

`0x12d0` SID `d5e23609d3948568` does NOT appear in any canary cold trace
(SID is per-tid per-PC and the run-to-run handle/PC numbering precludes
matching). This confirms reading-error #30: the shared-global SID recipe is
NOT applicable to NtCreateEvent on a per-call basis — these are
worker-local Events, not process-global dispatchers.
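
The handle lines above can be filtered mechanically. The following is a minimal Rust sketch, assuming the `key=value` layout shown in the listing and assuming the flag means "zero signals with at least one wait" (a rule inferred from the listed counters, not taken from the tool's source). `HandleStats`, `parse_stats`, and `no_signals_despite_waits` are hypothetical names:

```rust
#[derive(Debug, PartialEq)]
struct HandleStats {
    handle: u32,
    kind: String,
    waiters: u32,
    signals: u32,
    waits: u32,
    wakes: u32,
}

/// Parse one `handle=0x... kind=... waiters=N signals=N waits=N wakes=N` line.
/// Tokens without `=` (such as a trailing annotation) are ignored.
fn parse_stats(line: &str) -> Option<HandleStats> {
    let (mut handle, mut kind) = (None, None);
    let (mut waiters, mut signals, mut waits, mut wakes) = (None, None, None, None);
    for tok in line.split_whitespace() {
        let Some((key, val)) = tok.split_once('=') else { continue };
        match key {
            "handle" => handle = u32::from_str_radix(val.trim_start_matches("0x"), 16).ok(),
            "kind" => kind = Some(val.to_string()),
            "waiters" => waiters = val.parse().ok(),
            "signals" => signals = val.parse().ok(),
            "waits" => waits = val.parse().ok(),
            "wakes" => wakes = val.parse().ok(),
            _ => {}
        }
    }
    Some(HandleStats {
        handle: handle?,
        kind: kind?,
        waiters: waiters?,
        signals: signals?,
        waits: waits?,
        wakes: wakes?,
    })
}

/// Assumed rule: the handle was waited on but never signalled.
fn no_signals_despite_waits(s: &HandleStats) -> bool {
    s.signals == 0 && s.waits > 0
}
```

Under that assumed rule, the `0x12d0` line is flagged, while a handle with `signals=1` is not.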

## Thread / event totals: the worker-cluster gap

| canary_tid | ours_tid | matched | canary_total | ours_total | first_div |
|---|---|---|---|---|---|
| 4 | 11 | 11 | 75,287 | 11 | — |
| 6 | 1 | 105,112 | 351,340 | 108,507 | 105,112 |
| 7 | 2 | 32 | 32 | 33 | — |
| 12 | 7 | 4 | 9,264 | 5 | 4 |
| 14 | 9 | 41 | 1,904,055 | 77 | 41 |
| 15 | 10 | 16 | 995,517 | 17 | — |

**Canary tid=14 / tid=15 emit 1.9M / 995K events; ours's mapped tids
emit 77 / 17.** The worker cluster (cf. AUDIT-057 thread-gap) never
wakes up in ours, exactly as documented at every audit since.
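
The gap in the table can be expressed as a simple per-thread ratio check. A minimal Rust sketch, assuming rows of `(canary_tid, canary_total, ours_total)`; the `starved_tids` helper and its percentage threshold are assumptions of this sketch, not harness settings:

```rust
/// One row of the totals table: (canary_tid, canary_total, ours_total).
type Row = (u32, u64, u64);

/// Return the canary tids whose mapped thread in ours emitted fewer than
/// `min_percent` percent of canary's events.
fn starved_tids(rows: &[Row], min_percent: u64) -> Vec<u32> {
    rows.iter()
        .filter(|&&(_, canary, ours)| ours * 100 < canary * min_percent)
        .map(|&(tid, _, _)| tid)
        .collect()
}
```

With a 1% threshold, the table rows above single out canary tids 4, 12, 14, and 15; tid=6 and tid=7 stay above the threshold.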
137
audit-runs/phase-w-wedge-reattack/diff-postfix.md
Normal file
@@ -0,0 +1,137 @@
# Phase A diff report

**This report is the output of Phase A's diff harness. Divergences
shown here are INPUT for Phase B (first-divergence localization),
not findings of Phase A.** Phase A's job is to make the harness
itself correct, not to analyze what it surfaces.

## Summary

| canary_tid | ours_tid | matched | canary_total | ours_total | first_divergence_at | floating_create (c/o) | floating_wait (c/o) |
|---|---|---|---|---|---|---|---|
| 4 | 11 | 11 | 75287 | 11 | — | 0/0 | 0/0 |
| 6 | 1 | 105112 | 351340 | 108507 | 105112 | 0/0 | 1/0 |
| 7 | 2 | 32 | 32 | 33 | — | 0/0 | 0/0 |
| 12 | 7 | 4 | 9264 | 5 | 4 | 0/0 | 0/0 |
| 14 | 9 | 41 | 1904055 | 77 | 41 | 0/0 | 0/0 |
| 15 | 10 | 16 | 995517 | 17 | — | 0/1 | 0/0 |

*`floating_create (c/o)` counts shared-global `handle.create` events absorbed by Phase C+18 cross-tid SID matching. `floating_wait (c/o)` counts `wait.begin` events on shared-global dispatchers absorbed by Phase C+21 (scheduling-jitter window — canary's contention slow path may fire while ours fast-paths or vice versa). See schema-v1.md §"Shared-global SIDs" and §"Wait-begin floating absorb".*
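
The `matched` and `first_divergence_at` columns can be read as the result of a lockstep prefix comparison. The sketch below is a deliberately simplified Rust illustration of that idea: it compares events field by field and ignores the floating absorbers and semantic SID matching that the real harness applies. The `Event` shape and all function names are hypothetical:

```rust
#[derive(Debug, Clone, PartialEq)]
struct Event {
    kind: String,
    name: String,
    return_value: Option<i64>,
}

/// Convenience constructor for the sketch.
fn ev(kind: &str, name: &str, return_value: Option<i64>) -> Event {
    Event { kind: kind.to_string(), name: name.to_string(), return_value }
}

/// Index of the first position where the two streams disagree, or `None`
/// if they agree over the whole compared prefix (the shorter stream).
fn first_divergence(canary: &[Event], ours: &[Event]) -> Option<usize> {
    canary.iter().zip(ours).position(|(c, o)| c != o)
}

/// Number of leading events that match in both streams.
fn matched(canary: &[Event], ours: &[Event]) -> usize {
    first_divergence(canary, ours).unwrap_or(canary.len().min(ours.len()))
}
```

Under this reading, a row such as tid=4 (matched 11, no divergence) means ours ran out of events before any mismatch appeared, not that the two threads behaved identically to the end.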

## canary_tid=4 → ours_tid=11

No divergence within the 11 compared events (canary has 75287, ours has 11).

## canary_tid=6 → ours_tid=1

First divergence at `tid_event_idx=105112`: payload.return_value: canary=353042432 ours=182251520

**Pre-context (last 5 matching events):**
```
canary: [105114] import.call MmAllocatePhysicalMemoryEx
ours: [105107] import.call MmAllocatePhysicalMemoryEx
canary: [105115] kernel.call MmAllocatePhysicalMemoryEx
ours: [105108] kernel.call MmAllocatePhysicalMemoryEx
canary: [105116] kernel.return MmAllocatePhysicalMemoryEx
ours: [105109] kernel.return MmAllocatePhysicalMemoryEx
canary: [105117] import.call MmGetPhysicalAddress
ours: [105110] import.call MmGetPhysicalAddress
canary: [105118] kernel.call MmGetPhysicalAddress
ours: [105111] kernel.call MmGetPhysicalAddress
```

**Divergent event:**
```
canary: [105119] kernel.return MmGetPhysicalAddress
ours: [105112] kernel.return MmGetPhysicalAddress
```

**Next event after the divergence (if any):**
```
canary: [105120] import.call VdInitializeRingBuffer
ours: [105113] import.call VdInitializeRingBuffer
```

**Raw events (JSON):**
```json
{"deterministic": true, "engine": "canary", "guest_cycle": 0, "host_ns": 1641525000, "kind": "kernel.return", "payload": {"name": "MmGetPhysicalAddress", "return_value": 353042432, "side_effects": [], "status": "0x150b0000"}, "schema_version": 1, "tid": 6, "tid_event_idx": 105119}
{"deterministic": true, "engine": "ours", "guest_cycle": 5543165, "host_ns": 494678106, "kind": "kernel.return", "payload": {"name": "MmGetPhysicalAddress", "return_value": 182251520, "side_effects": [], "status": "0x0adcf000"}, "schema_version": 1, "tid": 1, "tid_event_idx": 105112}
```

## canary_tid=7 → ours_tid=2

No divergence within the 32 compared events (canary has 32, ours has 33).

## canary_tid=12 → ours_tid=7

First divergence at `tid_event_idx=4`: payload.return_value: canary=258 ours=0

**Pre-context (last 5 matching events):**
```
canary: [0] import.call KeWaitForSingleObject
ours: [0] import.call KeWaitForSingleObject
canary: [1] kernel.call KeWaitForSingleObject
ours: [1] kernel.call KeWaitForSingleObject
canary: [2] handle.create sid=c49d8f0ab90401ea
ours: [2] handle.create sid=6e3d96c5a52bf429
canary: [3] wait.begin {'handles_semantic_ids': ['c49d8f0ab90401ea'], 'timeout_ns': -30000000, 'alertable': False, 'wait_type': 'any'}
ours: [3] wait.begin {'handles_semantic_ids': ['6e3d96c5a52bf429'], 'timeout_ns': -30000000, 'alertable': False, 'wait_type': 'any'}
```

**Divergent event:**
```
canary: [4] kernel.return KeWaitForSingleObject
ours: [4] kernel.return KeWaitForSingleObject
```

**Next event after the divergence (if any):**
```
canary: [5] import.call RtlEnterCriticalSection
ours: <end of stream>
```

**Raw events (JSON):**
```json
{"deterministic": true, "engine": "canary", "guest_cycle": 0, "host_ns": 1676368000, "kind": "kernel.return", "payload": {"name": "KeWaitForSingleObject", "return_value": 258, "side_effects": [], "status": "0x00000102"}, "schema_version": 1, "tid": 12, "tid_event_idx": 4}
{"deterministic": true, "engine": "ours", "guest_cycle": 30, "host_ns": 494789418, "kind": "kernel.return", "payload": {"name": "KeWaitForSingleObject", "return_value": 0, "side_effects": [], "status": "0x00000000"}, "schema_version": 1, "tid": 7, "tid_event_idx": 4}
```
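
The `return_value` pair 258 vs 0 in the tid=12 divergence is easier to read as NTSTATUS codes: 258 is `0x00000102` (`STATUS_TIMEOUT`) and 0 is `STATUS_SUCCESS`, so canary's wait timed out while ours returned as satisfied. A minimal Rust sketch of that decoding follows; the constants are the standard NTSTATUS values, while `describe_wait_status` is a hypothetical helper name:

```rust
const STATUS_SUCCESS: u32 = 0x0000_0000;
const STATUS_USER_APC: u32 = 0x0000_00C0;
const STATUS_ALERTED: u32 = 0x0000_0101;
const STATUS_TIMEOUT: u32 = 0x0000_0102;

/// Map the status of a single-object wait to a readable label.
fn describe_wait_status(status: u32) -> &'static str {
    match status {
        STATUS_SUCCESS => "STATUS_SUCCESS (wait satisfied)",
        STATUS_USER_APC => "STATUS_USER_APC",
        STATUS_ALERTED => "STATUS_ALERTED",
        STATUS_TIMEOUT => "STATUS_TIMEOUT (deadline elapsed)",
        _ => "other status",
    }
}
```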

## canary_tid=14 → ours_tid=9

First divergence at `tid_event_idx=41`: payload.ord: canary=503 ours=293

**Pre-context (last 5 matching events):**
```
canary: [36] kernel.call KeReleaseSpinLockFromRaisedIrql
ours: [36] kernel.call KeReleaseSpinLockFromRaisedIrql
canary: [37] kernel.return KeReleaseSpinLockFromRaisedIrql
ours: [37] kernel.return KeReleaseSpinLockFromRaisedIrql
canary: [38] import.call KfLowerIrql
ours: [38] import.call KfLowerIrql
canary: [39] kernel.call KfLowerIrql
ours: [39] kernel.call KfLowerIrql
canary: [40] kernel.return KfLowerIrql
ours: [40] kernel.return KfLowerIrql
```

**Divergent event:**
```
canary: [41] import.call XAudioGetVoiceCategoryVolumeChangeMask
ours: [41] import.call RtlEnterCriticalSection
```

**Next event after the divergence (if any):**
```
canary: [42] kernel.call XAudioGetVoiceCategoryVolumeChangeMask
ours: [42] kernel.call RtlEnterCriticalSection
```

**Raw events (JSON):**
```json
{"deterministic": true, "engine": "canary", "guest_cycle": 0, "host_ns": 1898677900, "kind": "import.call", "payload": {"module": "xboxkrnl.exe", "name": "XAudioGetVoiceCategoryVolumeChangeMask", "ord": 503}, "schema_version": 1, "tid": 14, "tid_event_idx": 41}
{"deterministic": true, "engine": "ours", "guest_cycle": 417, "host_ns": 1694886289, "kind": "import.call", "payload": {"module": "xboxkrnl.exe", "name": "RtlEnterCriticalSection", "ord": 293}, "schema_version": 1, "tid": 9, "tid_event_idx": 41}
```

## canary_tid=15 → ours_tid=10

No divergence within the 16 compared events (canary has 995517, ours has 17).
137
audit-runs/phase-w-wedge-reattack/diff-prefix.md
Normal file
@@ -0,0 +1,137 @@
# Phase A diff report

**This report is the output of Phase A's diff harness. Divergences
shown here are INPUT for Phase B (first-divergence localization),
not findings of Phase A.** Phase A's job is to make the harness
itself correct, not to analyze what it surfaces.

## Summary

| canary_tid | ours_tid | matched | canary_total | ours_total | first_divergence_at | floating_create (c/o) | floating_wait (c/o) |
|---|---|---|---|---|---|---|---|
| 4 | 11 | 11 | 75287 | 11 | — | 0/0 | 0/0 |
| 6 | 1 | 105046 | 351340 | 108507 | 105046 | 0/0 | 1/0 |
| 7 | 2 | 32 | 32 | 33 | — | 0/0 | 0/0 |
| 12 | 7 | 4 | 9264 | 5 | 4 | 0/0 | 0/0 |
| 14 | 9 | 41 | 1904055 | 77 | 41 | 0/0 | 0/0 |
| 15 | 10 | 16 | 995517 | 17 | — | 0/1 | 0/0 |

*`floating_create (c/o)` counts shared-global `handle.create` events absorbed by Phase C+18 cross-tid SID matching. `floating_wait (c/o)` counts `wait.begin` events on shared-global dispatchers absorbed by Phase C+21 (scheduling-jitter window — canary's contention slow path may fire while ours fast-paths or vice versa). See schema-v1.md §"Shared-global SIDs" and §"Wait-begin floating absorb".*

## canary_tid=4 → ours_tid=11

No divergence within the 11 compared events (canary has 75287, ours has 11).

## canary_tid=6 → ours_tid=1

First divergence at `tid_event_idx=105046`: payload.return_value: canary=1 ours=0

**Pre-context (last 5 matching events):**
```
canary: [105048] import.call RtlInitializeCriticalSection
ours: [105041] import.call RtlInitializeCriticalSection
canary: [105049] kernel.call RtlInitializeCriticalSection
ours: [105042] kernel.call RtlInitializeCriticalSection
canary: [105050] kernel.return RtlInitializeCriticalSection
ours: [105043] kernel.return RtlInitializeCriticalSection
canary: [105051] import.call VdInitializeEngines
ours: [105044] import.call VdInitializeEngines
canary: [105052] kernel.call VdInitializeEngines
ours: [105045] kernel.call VdInitializeEngines
```

**Divergent event:**
```
canary: [105053] kernel.return VdInitializeEngines
ours: [105046] kernel.return VdInitializeEngines
```

**Next event after the divergence (if any):**
```
canary: [105054] import.call VdShutdownEngines
ours: [105047] import.call VdShutdownEngines
```

**Raw events (JSON):**
```json
{"deterministic": true, "engine": "canary", "guest_cycle": 0, "host_ns": 1637248100, "kind": "kernel.return", "payload": {"name": "VdInitializeEngines", "return_value": 1, "side_effects": [], "status": "0x00000001"}, "schema_version": 1, "tid": 6, "tid_event_idx": 105053}
{"deterministic": true, "engine": "ours", "guest_cycle": 5541402, "host_ns": 523232070, "kind": "kernel.return", "payload": {"name": "VdInitializeEngines", "return_value": 0, "side_effects": [], "status": "0x00000000"}, "schema_version": 1, "tid": 1, "tid_event_idx": 105046}
```

## canary_tid=7 → ours_tid=2

No divergence within the 32 compared events (canary has 32, ours has 33).

## canary_tid=12 → ours_tid=7

First divergence at `tid_event_idx=4`: payload.return_value: canary=258 ours=0

**Pre-context (last 5 matching events):**
```
canary: [0] import.call KeWaitForSingleObject
ours: [0] import.call KeWaitForSingleObject
canary: [1] kernel.call KeWaitForSingleObject
ours: [1] kernel.call KeWaitForSingleObject
canary: [2] handle.create sid=c49d8f0ab90401ea
ours: [2] handle.create sid=6e3d96c5a52bf429
canary: [3] wait.begin {'handles_semantic_ids': ['c49d8f0ab90401ea'], 'timeout_ns': -30000000, 'alertable': False, 'wait_type': 'any'}
ours: [3] wait.begin {'handles_semantic_ids': ['6e3d96c5a52bf429'], 'timeout_ns': -30000000, 'alertable': False, 'wait_type': 'any'}
```

**Divergent event:**
```
canary: [4] kernel.return KeWaitForSingleObject
ours: [4] kernel.return KeWaitForSingleObject
```

**Next event after the divergence (if any):**
```
canary: [5] import.call RtlEnterCriticalSection
ours: <end of stream>
```

**Raw events (JSON):**
```json
{"deterministic": true, "engine": "canary", "guest_cycle": 0, "host_ns": 1676368000, "kind": "kernel.return", "payload": {"name": "KeWaitForSingleObject", "return_value": 258, "side_effects": [], "status": "0x00000102"}, "schema_version": 1, "tid": 12, "tid_event_idx": 4}
{"deterministic": true, "engine": "ours", "guest_cycle": 30, "host_ns": 523599940, "kind": "kernel.return", "payload": {"name": "KeWaitForSingleObject", "return_value": 0, "side_effects": [], "status": "0x00000000"}, "schema_version": 1, "tid": 7, "tid_event_idx": 4}
```

## canary_tid=14 → ours_tid=9

First divergence at `tid_event_idx=41`: payload.ord: canary=503 ours=293

**Pre-context (last 5 matching events):**
```
canary: [36] kernel.call KeReleaseSpinLockFromRaisedIrql
ours: [36] kernel.call KeReleaseSpinLockFromRaisedIrql
canary: [37] kernel.return KeReleaseSpinLockFromRaisedIrql
ours: [37] kernel.return KeReleaseSpinLockFromRaisedIrql
canary: [38] import.call KfLowerIrql
ours: [38] import.call KfLowerIrql
canary: [39] kernel.call KfLowerIrql
ours: [39] kernel.call KfLowerIrql
canary: [40] kernel.return KfLowerIrql
ours: [40] kernel.return KfLowerIrql
```

**Divergent event:**
```
canary: [41] import.call XAudioGetVoiceCategoryVolumeChangeMask
ours: [41] import.call RtlEnterCriticalSection
```

**Next event after the divergence (if any):**
```
canary: [42] kernel.call XAudioGetVoiceCategoryVolumeChangeMask
ours: [42] kernel.call RtlEnterCriticalSection
```

**Raw events (JSON):**
```json
{"deterministic": true, "engine": "canary", "guest_cycle": 0, "host_ns": 1898677900, "kind": "import.call", "payload": {"module": "xboxkrnl.exe", "name": "XAudioGetVoiceCategoryVolumeChangeMask", "ord": 503}, "schema_version": 1, "tid": 14, "tid_event_idx": 41}
{"deterministic": true, "engine": "ours", "guest_cycle": 417, "host_ns": 1753797001, "kind": "import.call", "payload": {"module": "xboxkrnl.exe", "name": "RtlEnterCriticalSection", "ord": 293}, "schema_version": 1, "tid": 9, "tid_event_idx": 41}
```

## canary_tid=15 → ours_tid=10

No divergence within the 16 compared events (canary has 995517, ours has 17).
10
audit-runs/phase-w-wedge-reattack/digest-500M.json
Normal file
@@ -0,0 +1,10 @@
{
  "instructions": 500000007,
  "imports": 40390,
  "unimpl": 0,
  "draws": 0,
  "swaps": 1,
  "unique_render_targets": 0,
  "shader_blobs_live": 0,
  "texture_cache_entries": 0
}
10
audit-runs/phase-w-wedge-reattack/digest-baseline-500M.json
Normal file
@@ -0,0 +1,10 @@
{
  "instructions": 500000007,
  "imports": 40390,
  "unimpl": 0,
  "draws": 0,
  "swaps": 1,
  "unique_render_targets": 0,
  "shader_blobs_live": 0,
  "texture_cache_entries": 0
}
10
audit-runs/phase-w-wedge-reattack/digest-baseline-50M.json
Normal file
@@ -0,0 +1,10 @@
{
  "instructions": 50000007,
  "imports": 40390,
  "unimpl": 0,
  "draws": 0,
  "swaps": 1,
  "unique_render_targets": 0,
  "shader_blobs_live": 0,
  "texture_cache_entries": 0
}
10
audit-runs/phase-w-wedge-reattack/digest-rep1.json
Normal file
@@ -0,0 +1,10 @@
{
  "instructions": 50000007,
  "imports": 40390,
  "unimpl": 0,
  "draws": 0,
  "swaps": 1,
  "unique_render_targets": 0,
  "shader_blobs_live": 0,
  "texture_cache_entries": 0
}
10
audit-runs/phase-w-wedge-reattack/digest-rep2.json
Normal file
@@ -0,0 +1,10 @@
{
  "instructions": 50000007,
  "imports": 40390,
  "unimpl": 0,
  "draws": 0,
  "swaps": 1,
  "unique_render_targets": 0,
  "shader_blobs_live": 0,
  "texture_cache_entries": 0
}
10
audit-runs/phase-w-wedge-reattack/digest-rep3.json
Normal file
@@ -0,0 +1,10 @@
{
  "instructions": 50000007,
  "imports": 40390,
  "unimpl": 0,
  "draws": 0,
  "swaps": 1,
  "unique_render_targets": 0,
  "shader_blobs_live": 0,
  "texture_cache_entries": 0
}
132
audit-runs/phase-w-wedge-reattack/escalation.md
Normal file
@@ -0,0 +1,132 @@
# Phase W escalation — wedge unbroken by accumulated tooling

**Outcome category: (C) — escalating cleanly.**

The Phase W mini-fix landed (`VdInitializeEngines` returns 1 vs old
0; matches canary `xboxkrnl_video.cc:271-279`). This is a real
correctness fix that advances Phase A matched-prefix
**105,046 → 105,112 (+66 events)**. But on the brief's actual gate —
`swaps > 1` / `draws > 0` / `texture_cache_entries > 0` — the
`check --stable-digest -n 500000000` run is **byte-identical to
baseline**: `draws=0, swaps=1, render_targets=0`. The fix does not
unblock progression.

## What we verified afresh

1. The wedge is structurally identical to AUDIT-049/058/059/062/065:
   * tid=1 join-waits tid=13 at `sub_82173990+0x2D0` (handle `0x12c8`).
   * tid=13 wedges at `sub_821CB030+0x1B0` on Event `0x12d0`
     (`<NO_SIGNALS_DESPITE_WAITS>`).
   * `sub_825070F0` (vtable[1] worker-spawner) fires 0×.
   * 4 of 5 canary worker tids (canary's tid=14, tid=15, tid=4, and
     others) emit tens of thousands to millions of events; ours's
     equivalents emit ≤80. AUDIT-057 thread-gap PERSISTS.

2. New tooling (handle.create/destroy, thread.create/exit,
   wait.begin, shared-global SID absorbers) was applied. It surfaces
   normal cold-vs-cold divergences past 105K but does NOT illuminate
   a new signal-flow gap on the wedge handle itself.

3. The wedge handle's SID `d5e23609d3948568` has zero matches in any
   canary cold trace. The per-tid-PC SID recipe yields different SIDs
   for what is *logically* the same Event across engines, because
   create-site PC + tid + tid_event_idx all participate in the hash.
   This is by design (it's NOT a process-global dispatcher), but it
   means the new wait.begin events cannot directly identify "which
   canary NtSetEvent call should signal this".
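
To illustrate why such SIDs cannot line up across engines, here is a toy Rust sketch of a hash over the three inputs named in point 3. It is not the project's actual SID recipe: `toy_sid`, the use of `DefaultHasher`, and the sample event index are assumptions made purely for illustration.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Toy stand-in for a per-tid, per-PC semantic ID.
/// Any change to the create-site PC, the tid, or the per-thread event
/// index changes the resulting value.
fn toy_sid(create_pc: u32, tid: u32, tid_event_idx: u64) -> u64 {
    let mut hasher = DefaultHasher::new();
    (create_pc, tid, tid_event_idx).hash(&mut hasher);
    hasher.finish()
}
```

Even with the same create site (`lr=0x824a9f6c`), hashing ours's tid=13 and a different canary tid gives unrelated values in this toy model, which mirrors why the wedge handle's SID finds no counterpart in the canary traces.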

## Why this is hard — the structural impasse

The matched-prefix metric and the progression metric measure
different things. Matched-prefix tracks the **tid=1-only** event
sequence in lockstep up to the first divergence. The wedge is on
**tid=13** waiting for a signal that would come from a
**worker-cluster thread that never spawns**. The two threads barely
overlap in the matched-prefix view (tid=1 is fine for 105K events
*because* it hasn't reached the join-wait yet from Phase A's
perspective — `sub_82173990+0x2D0` is past idx 105,112 in canary's
tid=6 stream).

Every Phase C fix has correctly advanced matched-prefix while
leaving the wedge untouched, because the wedge needs the worker
cluster to bootstrap, and the worker cluster's activation chain
(`sub_822F1AA8 → sub_82173990 → sub_821746B0 → sub_821748F0 →
sub_821C4EB0 → sub_821CC3F8 → sub_821CBA08 → sub_821CB030` and
in parallel `→ sub_82172BA0 → sub_821B55D8 → sub_824F8398 →
sub_824F7CD0 → sub_824F7800 → sub_825070F0 → 4 worker spawns`) is
gated on the tid=13 wait completing, which is gated on a worker
signal, which is gated on the worker cluster bootstrapping. This
is the **same self-referential lock** AUDIT-063 documented.
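
The circular gating described above can be checked mechanically once each step is written down as "X is gated on Y". A minimal Rust sketch, assuming every step has at most one gate; the node labels are shorthand for the prose above and `find_cycle` is a hypothetical helper:

```rust
use std::collections::HashMap;

/// Follow "is gated on" edges from `start`. If the walk returns to a node
/// already visited, return the nodes that form the loop.
fn find_cycle<'a>(deps: &HashMap<&'a str, &'a str>, start: &'a str) -> Option<Vec<&'a str>> {
    let mut path = vec![start];
    let mut cur = start;
    while let Some(&next) = deps.get(cur) {
        if let Some(pos) = path.iter().position(|&n| n == next) {
            return Some(path[pos..].to_vec());
        }
        path.push(next);
        cur = next;
    }
    None
}
```

Feeding it the four edges from the paragraph (activation chain → tid=13 wait → worker signal → worker-cluster bootstrap → activation chain) returns all four nodes, which is the self-referential lock in graph form.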

## What new information Phase W produced

1. **VdInitializeEngines stub fix** (the landing). Trivially
   correct, advances matched-prefix +66, does not move progression.
   Worth keeping in canon for cold-vs-cold parity. New stable digest
   `73e99d60029128b4d5c3dd98e540457d82a52b8a962e7495132be2be31411aca`
   × 3 byte-identical.
2. **Confirmed via the new wait.begin events**: canary's tid=9
   (= ours's tid=13 logical role) calls `wait.begin` on shared-global
   dispatcher Event `0xf800004c` (SID `c9f426cc34f55865`) at idx 321
   *immediately* after `RtlEnterCriticalSection` issues — proving
   that CS contention on canary's side awakens via the shared-global
   path while ours's per-tid Event takes the explicit
   `NtCreateEvent+NtWaitForSingleObjectEx` path. **These are two
   different objects, not one waiting for the same signal.** The
   tooling correctly says so.
3. **The brief's hypothesis is correct**: matched-prefix is no
   longer the right metric. Progression has not moved across 25
   phases.

## Recommended next steps (ranked)

### Path 1 (recommended) — accept C+25 fallback and continue normal iteration

Dispatch C+25 = `MmAllocatePhysicalMemoryEx` / `MmGetPhysicalAddress`
deterministic allocator (the new first divergence at idx 105,112 is
in this family). Normal Phase C cadence; advances matched-prefix
without claiming wedge unblocking. **Be honest in memory notes that
matched-prefix is the only metric moving.**

### Path 2 — re-examine the absorbers

The C+18/C+21/D-extension absorbers all explicitly fold "scheduling
jitter" classes. Per the brief's Path B suggestion: is any absorber
HIDING a signal that would resolve the wedge? Specifically:
* C+18 shared-global SID absorber folds canary's
  `aafae4c71fd42890` work-queue semaphore creation into ours's
  emission window even when ours never creates the equivalent. If
  ours's worker fails to *enqueue* something canary's worker awaits,
  we'd never see the gap because the matched-prefix isn't on the
  worker tid in the first place.
* The D-extension absorber folds nested-CS cleanup blocks. If
  canary's `Enter/Leave` block contains the NtSetEvent that signals
  the wedge handle (via descendant `xeKeSetEvent`), the absorber
  hides that.

Concrete: un-absorb, re-diff, look for the first FOLDED canary block
that contains an `NtSetEvent` whose SID resolves to the wedge handle.
~3-5 hours of analysis, no LOC change.
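
The "look for the first FOLDED canary block" step could be scripted along the following lines. This is a minimal Rust sketch under stated assumptions: the `FoldedBlock` and `FoldedEvent` shapes are hypothetical, not the harness's data model, and the canary-side SID that corresponds to the wedge handle is assumed to be known already, which the notes above say is not directly the case.

```rust
/// A canary-side event inside a folded (absorbed) block. Hypothetical shape.
struct FoldedEvent {
    name: String,
    target_sid: Option<String>,
}

/// A run of canary events that an absorber folded away. Hypothetical shape.
struct FoldedBlock {
    first_idx: u64,
    events: Vec<FoldedEvent>,
}

/// Return the starting index of the first folded block that contains an
/// `NtSetEvent` aimed at `wedge_sid`, if any.
fn first_signaling_block(blocks: &[FoldedBlock], wedge_sid: &str) -> Option<u64> {
    blocks
        .iter()
        .find(|b| {
            b.events.iter().any(|e| {
                e.name == "NtSetEvent" && e.target_sid.as_deref() == Some(wedge_sid)
            })
        })
        .map(|b| b.first_idx)
}
```

A `None` result would suggest the absorbers are not hiding the signal, at least for the SID mapping that was supplied.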

### Path 3 — install host-side mem-watch + diff on wedge handle's guest memory

AUDIT-067 established that vtable installs go through host-side
writes invisible to guest-PC traces. By the same logic, the wedge
handle's kernel object header may be mutated by host code (the
canary scheduler / dispatcher) in ways ours doesn't replicate. Hook
`Memory::write*` in canary on the wedge handle's address; compare
against ours.

### Path 4 — scheduler determinism investment

This is the unfunded `scheduler_determinism_plan` artifact (per memory).
Stage 0 was a null result; the contention manifest stages landed but
didn't move the cap. The PLAN doc explicitly notes the wedge is upstream
of contention, so this is unlikely to help WITHOUT additional work.

## Honesty note

19 prior audits attacked this same wedge and failed. Phase W is the
20th. We landed a correct mini-fix, but the wedge itself is
unchanged. The user's instinct to call this honest fallback is the
correct posture.
30
audit-runs/phase-w-wedge-reattack/fix.diff
Normal file
@@ -0,0 +1,30 @@
--- a/xenia-rs/crates/xenia-kernel/src/exports.rs
+++ b/xenia-rs/crates/xenia-kernel/src/exports.rs
@@ stub_return_zero …
 fn stub_return_zero(ctx: &mut PpcContext, _mem: &GuestMemory, _state: &mut KernelState) {
     ctx.gpr[3] = 0;
 }

+/// Phase W: a literal `return 1`. Matches canary's
+/// `VdInitializeEngines_entry` in `xboxkrnl_video.cc:271-279` which
+/// returns `1` (truthy success token) rather than STATUS_SUCCESS=0.
+/// Sylpheed-side guest code branches on this non-zero, so returning
+/// 0 made the game skip the VdInitializeRingBuffer-and-after init
+/// sequence and never set up the post-init render-target state.
+fn stub_return_one(ctx: &mut PpcContext, _mem: &GuestMemory, _state: &mut KernelState) {
+    ctx.gpr[3] = 1;
+}
+
@@ exports table …
-    state.register_export(Xboxkrnl, 0x01C2, "VdInitializeEngines", stub_success);
+    state.register_export(Xboxkrnl, 0x01C2, "VdInitializeEngines", stub_return_one);

@@ tests mod …
+    /// Phase W: ensure `VdInitializeEngines` writes `r3=1` …
+    #[test]
+    fn vd_initialize_engines_returns_one() {
+        let (mut ctx, mem, mut state) = fresh();
+        ctx.gpr[3] = 0xDEAD_BEEF; // sentinel — must be overwritten
+        stub_return_one(&mut ctx, &mem, &mut state);
+        assert_eq!(ctx.gpr[3], 1, "stub_return_one must put 1 in r3");
+    }
35
audit-runs/phase-w-wedge-reattack/halt-on-deadlock-dump.txt
Normal file
@@ -0,0 +1,35 @@
=== Thread diagnostics ===
hw=0 idx=0 tid=1 state=Blocked(WaitAny { handles: [4808], deadline: None }) pc=0x824ac578 lr=0x824ac578 sp=0x700ff6e0
hw=0 idx=1 tid=11 state=Blocked(WaitAny { handles: [2190094916, 2190094880], deadline: None }) pc=0x824d2a94 lr=0x824d2a94 sp=0x71497d90
hw=1 idx=0 tid=2 state=Blocked(WaitAny { handles: [2189887804], deadline: None }) pc=0x824a95f8 lr=0x824a95f8 sp=0x710ffd20
hw=1 idx=1 tid=13 state=Blocked(WaitAny { handles: [4816], deadline: None }) pc=0x824ac578 lr=0x824ac578 sp=0x715a7a20
hw=2 idx=0 tid=7 state=Blocked(WaitAny { handles: [1111833436], deadline: Some(3000) }) pc=0x824cd4f4 lr=0x824cd4f4 sp=0x71187e60
hw=2 idx=1 tid=8 state=Blocked(WaitAny { handles: [4332, 4312], deadline: None }) pc=0x824ab214 lr=0x824ab214 sp=0x71287c90
hw=3 idx=0 tid=4 state=Blocked(WaitAny { handles: [4136], deadline: None }) pc=0x824ac578 lr=0x824ac578 sp=0x7112fb80
hw=3 idx=1 tid=5 state=Blocked(WaitAny { handles: [4836], deadline: None }) pc=0x824ac578 lr=0x824ac578 sp=0x7116fbe0
hw=4 idx=0 tid=9 state=Ready pc=0x824d1404 lr=0x824d22b4 sp=0x71387df0
hw=5 idx=0 tid=3 state=Blocked(WaitAny { handles: [4128], deadline: None }) pc=0x824ac578 lr=0x824ac578 sp=0x7111fdf0
hw=5 idx=1 tid=6 state=Ready pc=0x824ab214 lr=0x824ab214 sp=0x7117fc60
hw=5 idx=2 tid=10 state=Ready pc=0x824d1404 lr=0x824d22b4 sp=0x71487e00
hw=5 idx=3 tid=12 state=Ready pc=0x824aa6a4 lr=0x824aa6a4 sp=0x714a7da0
-- Handle waiter lists --
handle=0x000010d8 Semaphore(0/2147483647) waiters(tid)=[8]
handle=0x828a3220 Event(sig=false, mr=true) waiters(tid)=[11]
handle=0x00001028 Semaphore(0/2147483647) waiters(tid)=[4]
handle=0x000012e4 Event(sig=false, mr=false) waiters(tid)=[5]
handle=0x42453b5c Event(sig=false, mr=true) waiters(tid)=[7]
handle=0x828a3244 Event(sig=false, mr=false) waiters(tid)=[11]
handle=0x8287093c Event(sig=false, mr=false) waiters(tid)=[2]
handle=0x000010ec Event(sig=false, mr=true) waiters(tid)=[8]
handle=0x000012d0 Event(sig=false, mr=false) waiters(tid)=[13]
handle=0x00001020 Event(sig=false, mr=true) waiters(tid)=[3]
handle=0x000012c8 Thread(id=13, exit=None) waiters(tid)=[1]
handle=0x00001020 kind=Event/Manual waiters=1 signals=0 waits=1 wakes=0 <NO_SIGNALS_DESPITE_WAITS>
handle=0x00001040 kind=Event/Auto waiters=0 signals=0 waits=32 wakes=0 <NO_SIGNALS_DESPITE_WAITS>
handle=0x000010b0 kind=Event/Auto waiters=0 signals=0 waits=7 wakes=0 <NO_SIGNALS_DESPITE_WAITS>
handle=0x000010dc kind=Event/Manual waiters=0 signals=1 waits=1 wakes=1 <SUSPECT>
handle=0x000010ec kind=Event/Manual waiters=1 signals=0 waits=2 wakes=0 <NO_SIGNALS_DESPITE_WAITS>
handle=0x000010fc kind=Event/Auto waiters=0 signals=1 waits=1 wakes=1 <SUSPECT>
handle=0x00001104 kind=Event/Auto waiters=0 signals=1 waits=0 wakes=0 <SUSPECT>
handle=0x000012d0 kind=Event/Auto waiters=1 signals=0 waits=1 wakes=0 <NO_SIGNALS_DESPITE_WAITS>
handle=0x000012e4 kind=Event/Auto waiters=1 signals=0 waits=1 wakes=0 <NO_SIGNALS_DESPITE_WAITS>