handoff: VSync/event-wedge fixes + iterate 2.A–2.BC research notes
Source changes (dormant parity infra, retained from iterate 2.AI/2.AO):
- xenia-kernel/exports.rs: nt_create_event manual_reset polarity + related event wiring
- xenia-gpu/mmio_region.rs: D1MODE_VBLANK_VLINE_STATUS hardcode parity

Also lands the audit-runs/ analysis notes (.md/.txt/.json digests) for the iterate 2.x VSync/0x10e8/0x1004 wedge investigation. Raw trace dumps (.jsonl/.gz/.csv/.stdout) and agent worktrees (.claude/) are gitignored as regenerable local artifacts; see memory + HANDOFF for the running findings.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
162
audit-runs/iterate-2D-fire-pattern-diff/report.md
Normal file
@@ -0,0 +1,162 @@
# Iterate 2.D fire-pattern diff: report

**Date**: 2026-05-27. **Mode**: read-only re-analysis of cached iterate-2D-peer-producer-trace JSONLs. Zero LOC engine/canary changes.

## Headline

**DIVERGENT-FIRE-PATTERN-FOUND: multiple distinct producer LRs fire in canary with ZERO ours analog across ALL tids.**

- Canary total NtSetEvent+NtReleaseSemaphore fires: **79,014** across **33** distinct (op,lr) tuples.
- Ours total (IAT + LR thunk trace): **303** across **19** distinct (op,lr) tuples.
- Tuples in canary with **zero** ours analog: **28** carrying **29,441** canary fires (37.3% of canary's volume).
- Matched tuples: **5** carrying **49,573** canary fires.
- Extra-in-ours tuples (not in canary): **14** (sanity tally only).

## Reading-error #28 discipline

The diff key omits tid: we ask 'does this canary (op, lr, handle-class) fire at all in ours, on ANY tid'. Tids are tracked separately per key for context but never used as cross-engine identity.
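
The tid-free keying described above can be sketched as follows. This is a minimal illustration and not the contents of `diff.py`: the event field names (`op`, `lr`, `tid`) and the toy events are assumptions, and the LR `0x82000000` is a hypothetical extra-in-ours site.

```python
from collections import Counter

def tuple_counts(events):
    """Count fires per (op, lr). tid is deliberately left out of the key."""
    return Counter((e["op"], e["lr"]) for e in events)

def diff_fire_patterns(canary_events, ours_events):
    """Split tuples into missing-in-ours, matched, and extra-in-ours."""
    canary = tuple_counts(canary_events)
    ours = tuple_counts(ours_events)
    missing = {k: n for k, n in canary.items() if k not in ours}
    matched = {k: (n, ours[k]) for k, n in canary.items() if k in ours}
    extra = {k: n for k, n in ours.items() if k not in canary}
    return missing, matched, extra

# Toy events (assumed schema); tids differ between engines on purpose.
canary_events = [
    {"op": "set", "lr": 0x824D292C, "tid": 14},
    {"op": "set", "lr": 0x824D292C, "tid": 14},
    {"op": "set", "lr": 0x824AA304, "tid": 6},
    {"op": "release", "lr": 0x824AB168, "tid": 10},
]
ours_events = [
    {"op": "set", "lr": 0x824AA304, "tid": 1},
    {"op": "release", "lr": 0x824AB168, "tid": 5},
    {"op": "set", "lr": 0x82000000, "tid": 1},  # hypothetical LR
]

missing, matched, extra = diff_fire_patterns(canary_events, ours_events)
```

Because tid never enters the key, a canary fire on tid 6 still matches an ours fire on tid 1 as long as op and LR agree.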

## Top divergent tuples (canary fires N, ours fires 0)

| # | op | LR | canary fires | canary tids | canary handles | likely fn |
|--:|----|----|--:|--|--|---|
| 1 | set | `0x824d292c` | 16,452 | 14 | 0xf800007c | sub_824D2878 (AUDIO worker entry-2 γ-set) |
| 2 | set | `0x82506c90` | 2,378 | 27 | 0xf8000180 | sub_82506B08+0x188 (worker dispatch γ-set) |
| 3 | set | `0x82508510` | 2,373 | 14,15 | 0xf8000184 | sub_82508400+0x110 (worker dispatch γ-set) |
| 4 | set | `0x82508524` | 2,373 | 14,15 | 0xf8000180 | sub_82508400+0x124 (worker dispatch γ-set) |
| 5 | set | `0x82506f9c` | 2,355 | 28 | 0xf800017c | sub_82506DE8 (worker dispatch γ-set) |
| 6 | set | `0x82508358` | 2,350 | 13 | 0xf8000188 | sub_825078D8+0xa80 (worker dispatch γ-set) |
| 7 | set | `0x824aafc8` | 1,113 | 6,10,27,28 | 0xf800004c, 0xf8000050, 0xf8000078 (+29) | sub_824AAF50 (KeSetEvent game wrapper) |
| 8 | set | `0x827e843c` | 15 | 14 | 0xf80000ac | (unknown) |
| 9 | set | `0x82178d9c` | 6 | 16 | 0xf8000104 | (unknown) |
| 10 | set | `0x824d0868` | 5 | 16 | 0xf8000168 | (unknown) |
| 11 | set | `0x824d0c6c` | 3 | 16 | 0xf8000168 | (unknown) |
| 12 | set | `0x824d08c0` | 2 | 14,16 | 0xf8000168 | (unknown) |
| 13 | set | `0x824d091c` | 1 | 6 | 0xf8000168 | (unknown) |
| 14 | set | `0x822d30ec` | 1 | 6 | 0xf80000c8 | (unknown) |
| 15 | set | `0x82507abc` | 1 | 13 | 0xf8000178 | (unknown) |

## Top under-firing matched tuples (canary >>> ours)

| op | LR | canary | ours | ratio (ours/canary) | likely fn |
|----|----|--:|--:|--:|---|
| release | `0x824d229c` | 16,452 | 1 | 0.01% | sub_824D21F0+0xAC (AUDIO dispatch γ-release) |
| set | `0x824d2a44` | 16,452 | 1 | 0.01% | sub_824D29F0 (AUDIO worker entry γ-set) |
| set | `0x824aa304` | 15,765 | 60 | 0.38% | sub_824AA2F0 (NtSetEvent game wrapper) |
| release | `0x824ab168` | 903 | 90 | 9.97% | sub_824AB158 (NtReleaseSemaphore game wrapper) |

## γ-signaler family intersection (AUDIT-069 S3/S2)

| LR | op | canary | ours | likely fn |
|----|----|--:|--:|---|
| `0x824aa304` | set | 15,765 | 60 | sub_824AA2F0 (NtSetEvent game wrapper) |
| `0x824aafc8` | set | 1,113 | 0 | sub_824AAF50 (KeSetEvent game wrapper) |
| `0x824ab168` | release | 903 | 90 | sub_824AB158 (NtReleaseSemaphore game wrapper) |
| `0x824d229c` | release | 16,452 | 1 | sub_824D21F0+0xAC (AUDIO dispatch γ-release) |
| `0x824d292c` | set | 16,452 | 0 | sub_824D2878 (AUDIO worker entry-2 γ-set) |
| `0x824d2a44` | set | 16,452 | 1 | sub_824D29F0 (AUDIO worker entry γ-set) |
| `0x82506c90` | set | 2,378 | 0 | sub_82506B08+0x188 (worker dispatch γ-set) |
| `0x82506f9c` | set | 2,355 | 0 | sub_82506DE8 (worker dispatch γ-set) |
| `0x82508358` | set | 2,350 | 0 | sub_825078D8+0xa80 (worker dispatch γ-set) |
| `0x82508510` | set | 2,373 | 0 | sub_82508400+0x110 (worker dispatch γ-set) |
| `0x82508524` | set | 2,373 | 0 | sub_82508400+0x124 (worker dispatch γ-set) |

Of 28 missing-in-ours tuples: **7** intersect the AUDIT-069 γ-signaler family, **21** lie outside it (fresh chains).
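
The intersection count can be re-derived from the LRs published in this report. The sketch below is only a cross-check, under the assumption that set membership on the LR alone is enough (every missing tuple listed here has op = set). It covers the 15 missing LRs from the top-15 table, not the full 28, so it reproduces the 7 in-family hits but only 8 of the 21 outside tuples.

```python
# LRs in the AUDIT-069 γ-signaler family, copied from the table above.
GAMMA_FAMILY_LRS = {
    0x824AA304, 0x824AAFC8, 0x824AB168, 0x824D229C, 0x824D292C, 0x824D2A44,
    0x82506C90, 0x82506F9C, 0x82508358, 0x82508510, 0x82508524,
}

# Missing-in-ours LRs, copied from the top-15 divergent-tuples table.
MISSING_TOP15_LRS = {
    0x824D292C, 0x82506C90, 0x82508510, 0x82508524, 0x82506F9C,
    0x82508358, 0x824AAFC8, 0x827E843C, 0x82178D9C, 0x824D0868,
    0x824D0C6C, 0x824D08C0, 0x824D091C, 0x822D30EC, 0x82507ABC,
}

in_family = MISSING_TOP15_LRS & GAMMA_FAMILY_LRS       # missing and in the γ-family
outside_family = MISSING_TOP15_LRS - GAMMA_FAMILY_LRS  # missing, fresh chains

print(len(in_family), len(outside_family))
```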

## Wedge-related LRs (cache-thread / worker-dispatch self-release)

These LRs are the deep game-side call sites that route through `sub_824AB158` (NtReleaseSemaphore wrapper) and ultimately feed the wedge's wait predicate (work-semaphore handle 0x1050 at guest VA [0x828F3BC4]). **Note: canary's audit_70 hook fires at the IAT-thunk depth and reports the wrapper-return LR (`0x824AB168`) for ALL NtReleaseSemaphore fires**, so it never sees deeper game-side LRs. The `--lr-trace=0x824AB158` probe in ours captures one level deeper (the game-wrapper caller). The canary count in this table is therefore always 0; the table's value is the **ours count**, which shows which of these game-side sites still execute at all in ours:

| LR | op | canary | ours | likely fn |
|----|----|--:|--:|---|
| `0x82450314` | release | 0 | 6 | sub_82450218+0xFC (cache-thread release site) |
| `0x82450ce0` | release | 0 | 68 | sub_82450B68+0x178 (worker self-release path 1) |
| `0x82450d2c` | release | 0 | 6 | sub_82450B68+0x1C4 (worker self-release path 2) |

Comparable apples-to-apples roll-up: canary's 903 fires at LR `0x824AB168` (NtReleaseSemaphore wrapper return) ↔ the 90 fires in ours at the SAME LR (IAT trace). Ratio = **9.97%**. The shortfall is dominated by under-firing on the worker tid=5 (75/903) and cache-thread tid=13 (1/903) in ours, per AUDIT-069 S6.
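
As a worked check of the ratios quoted in this report, the snippet below recomputes ours/canary as a percentage for the four matched tuples in the under-firing table. The counts are copied from that table; the helper name `ratio_pct` is made up for illustration.

```python
# (op, lr) -> (canary fires, ours fires), from the under-firing table.
matched = {
    ("release", 0x824D229C): (16452, 1),
    ("set", 0x824D2A44): (16452, 1),
    ("set", 0x824AA304): (15765, 60),
    ("release", 0x824AB168): (903, 90),
}

def ratio_pct(canary, ours):
    """Ours fires as a percentage of canary fires."""
    return 100.0 * ours / canary

# Sort worst under-firing first and print rounded percentages.
for (op, lr), (canary, ours) in sorted(
    matched.items(), key=lambda kv: ratio_pct(*kv[1])
):
    print(f"{op:8s} {lr:#010x} {ratio_pct(canary, ours):6.2f}%")
```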

## Canary-only tids (entry-PC bucket inferred via release-LR clustering)

Per-tid breakdown of canary fires. Tids marked NO in the last column have ZERO matched ours analog at the (op,lr) level:

| canary tid | total fires | matched-LR fires | missing-LR fires | distinct LRs | analog in ours? |
|--:|--:|--:|--:|--:|---|
| 14 | 33,546 | 16,452 | 17,094 | 6 | YES |
| 4 | 16,452 | 16,452 | 0 | 1 | YES |
| 6 | 10,965 | 10,945 | 20 | 7 | YES |
| 2 | 5,268 | 5,268 | 0 | 1 | YES |
| 15 | 4,120 | 0 | 4,120 | 2 | NO |
| 27 | 2,726 | 0 | 2,726 | 2 | NO |
| 28 | 2,724 | 0 | 2,724 | 2 | NO |
| 13 | 2,356 | 5 | 2,351 | 3 | YES |
| 10 | 800 | 419 | 381 | 4 | YES |
| 16 | 24 | 2 | 22 | 13 | YES |
| 18 | 16 | 14 | 2 | 3 | YES |
| 17 | 8 | 8 | 0 | 1 | YES |
| 11 | 5 | 5 | 0 | 1 | YES |
| 0 | 1 | 0 | 1 | 1 | NO |
| 21 | 1 | 1 | 0 | 1 | YES |
| 7 | 1 | 1 | 0 | 1 | YES |
| 26 | 1 | 1 | 0 | 1 | YES |

## Canary thread families with no ours analog (entire-thread divergence)

Per the 'Canary-only tids' table above, **three high-volume canary tids (15, 27, 28) have ZERO matched-LR fires**: every event they produce is on an LR ours never visits. (Tid 0 also has no matched fires, but it fires only once.) Their fire patterns:

- **canary tid=15** (4,120 fires): exclusively on LRs `0x82508510` (sub_82508400+0x110) and `0x82508524` (sub_82508400+0x124), the paired worker-dispatch γ-set sites. Each LR totals 2,373 canary fires; that total is shared with canary tid=14, which co-fires on the same LRs.
- **canary tid=27** (2,726 fires): exclusively on LR `0x82506c90` (2,378×, sub_82506B08+0x188, worker dispatch) + `0x824AAFC8` (348×, KeSetEvent wrapper).
- **canary tid=28** (2,724 fires): exclusively on LR `0x82506f9c` (2,355×, sub_82506DE8, worker dispatch) + `0x824AAFC8` (369×, KeSetEvent wrapper).

**Conclusion: canary tids 15/27/28 are members of the sub_825070F0 worker fan-out cluster that ours fails to spawn or whose start ctx is mis-initialized.** This matches the Review A Step 1 force-spawn-workers diagnosis (workers spawn but fault on `[ctx+44] = 0xBCE25640` unmapped read).

Canary tid=14 (33,546 fires, the audio worker A) HAS a partial ours analog (ours tids 9/10/11 fire 3 total events on the audio LRs), confirming that ours DOES spawn the audio threads but they wedge after 1 iteration (per iterate-2D investigation §Step 3).
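
The per-tid breakdown behind the 'Canary-only tids' table can be sketched as below. This is an illustration under assumed field names (`tid`, `op`, `lr`) with toy events, not the code in `diff.py`. A canary tid has "no analog" when none of its (op, lr) tuples appears in ours on any tid.

```python
from collections import defaultdict

def per_tid_breakdown(canary_events, ours_keys):
    """Per canary tid: total, matched-LR, and missing-LR fire counts."""
    rows = defaultdict(
        lambda: {"total": 0, "matched": 0, "missing": 0, "lrs": set()}
    )
    for e in canary_events:
        row = rows[e["tid"]]
        row["total"] += 1
        row["lrs"].add(e["lr"])
        # A fire is matched if ours fires the same (op, lr) on ANY tid.
        bucket = "matched" if (e["op"], e["lr"]) in ours_keys else "missing"
        row[bucket] += 1
    return dict(rows)

def tids_without_analog(rows):
    """Canary tids whose every fire is on an (op, lr) ours never visits."""
    return sorted(t for t, r in rows.items() if r["matched"] == 0)

# Toy events (assumed schema), loosely shaped like the tables above.
canary_events = [
    {"tid": 14, "op": "set", "lr": 0x824D2A44},
    {"tid": 14, "op": "set", "lr": 0x824D292C},
    {"tid": 15, "op": "set", "lr": 0x82508510},
    {"tid": 15, "op": "set", "lr": 0x82508524},
    {"tid": 27, "op": "set", "lr": 0x82506C90},
]
ours_keys = {("set", 0x824D2A44)}

rows = per_tid_breakdown(canary_events, ours_keys)
print(tids_without_analog(rows))
```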

## Outcome class

**Class (C) Many distributed producers missing** (confirms iterate-2D's outcome). Not a single (lr, handle) tuple: at least 15 distinct call sites in canary (28 (op,lr) tuples in total) have zero ours analog on any tid.

## Recommendation

**DROP-TO-OPTION-2 (boot-time delta replay), NOT force-spawn crowbar.**

Why not the crowbar (Option-C from goal): Review A Step 1 attempted exactly that on 2026-05-27 (`review-a-step1-force-spawn/progression-result.md`) and FAILED the PRIMARY progression gate. The 4 workers spawn under `--force-spawn-workers` but fault ~159 instructions in at `vtable[35..38]` dispatching on `[ctx+44]=0xBCE25640`, an unmapped VA in ours's allocator namespace. Force-spawning without first fixing the upstream ctx-state-installer chain is futile.

Why Option-2: iterate-2D §Step 3 documented a **1.3 s upstream timing skew** (canary first audio fire at host_ns=278 ms; ours first audio fire at 1,587 ms, 5.7× later). The 28 missing producer LRs found here are downstream consequences of that delay. Diffing the first ~1200 phase-a events to find the single early-init kernel-call divergence is cheaper, adds no engine LOC, and likely cascades to most of the 28 LRs at once. The canary's tid=6 ↔ ours's tid=1 main-thread bootstrap matches for 20 releases (per AUDIT-069 S6) then diverges; that is the right window to inspect.

Concrete next iterate: `iterate-2E-boot-delta-replay`, at ~0 engine LOC and ~100 LOC investigation. Read existing phase-a event logs at `xenia-rs/audit-runs/phase-a-diff-harness/` (dated 2026-05-26) for both engines, time-bucket by host_ns, diff at first kernel-import-call mismatch. If the harness's diff path already covers this, the analysis may be pure data work.
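
One possible shape for that first-mismatch diff is sketched below. It is a sketch under assumptions, not the harness's actual code: the `host_ns` field name comes from this report, but the `import` field and the toy call sequences are hypothetical. Because of the timing skew noted above, the sketch compares call order within each engine and ignores absolute timestamps.

```python
from itertools import zip_longest

def call_names(events):
    """Kernel-import names in host_ns order for one engine."""
    ordered = sorted(events, key=lambda e: e["host_ns"])
    return [e["import"] for e in ordered]

def first_divergence(canary_calls, ours_calls):
    """Return (index, canary_name, ours_name) at the first mismatch, else None."""
    pairs = zip_longest(canary_calls, ours_calls)
    for i, (c, o) in enumerate(pairs):
        if c != o:
            return i, c, o
    return None

# Toy event logs (assumed schema). Timestamps differ; only order matters.
canary_log = [
    {"host_ns": 100, "import": "NtCreateEvent"},
    {"host_ns": 250, "import": "NtSetEvent"},
    {"host_ns": 180, "import": "NtReleaseSemaphore"},
]
ours_log = [
    {"host_ns": 900, "import": "NtCreateEvent"},
    {"host_ns": 1500, "import": "NtReleaseSemaphore"},
    {"host_ns": 2100, "import": "NtReleaseSemaphore"},
]

result = first_divergence(call_names(canary_log), call_names(ours_log))
print(result)
```

A real comparison would probably run per thread pair (for example canary tid=6 against ours tid=1) rather than over a merged log, since interleaving across threads is not expected to match between engines.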

## Cross-check vs γ-signaler family

γ-family LRs (defined per AUDIT-069 S3/S2) have **7** representatives among the missing-in-ours set. The remaining **21** missing tuples lie outside the γ-family; these are fresh producer chains the AUDIT-069 work never characterized. The eight that appear in the top-15 table above are listed here (the full list is in `divergent-tuples.csv`):

- `0x827e843c` (set, canary=15 fires, tids=[14])
- `0x82178d9c` (set, canary=6 fires, tids=[16])
- `0x824d0868` (set, canary=5 fires, tids=[16])
- `0x824d0c6c` (set, canary=3 fires, tids=[16])
- `0x824d08c0` (set, canary=2 fires, tids=[14, 16])
- `0x824d091c` (set, canary=1 fire, tids=[6])
- `0x822d30ec` (set, canary=1 fire, tids=[6])
- `0x82507abc` (set, canary=1 fire, tids=[13])

## Cascade check

- A (acquire both engines' fire data): **PASS** (cached canary 79,014 events + ours 153 events).
- B (build cross-engine tuple key respecting reading-error #28): **PASS** (keyed on (op, lr); handle namespace differences absorbed by structural LR identity).
- C (identify divergent tuples): **PASS** (see top-15 table above).
- D (attribute cause): **PASS MEDIUM** (class (C) structural ladder; not a single bug).
- E (recommend next iterate): **PASS** (Option-2 boot-time delta replay, per iterate-2D's investigation §Step 5).

## Tripstones honored

- **#28** (per-engine tid stability): tids omitted from diff key.
- **#39** (composite progression metric): not relevant; this is an investigation, not a progression iterate.
- **#40** (single-keystone framing): explicitly checked and falsified.

## Artifacts

Under `xenia-rs/audit-runs/iterate-2D-fire-pattern-diff/`:

- `diff.py`: this analysis script.
- `report.md`: this report.
- `divergent-tuples.csv`: full list of missing-in-ours tuples for further xref.
- `matched-tuples.csv`: full list of matched tuples with canary/ours counts.