xenia-rs/audit-runs/iterate-2D-fire-pattern-diff/report.md
MechaCat02 ef93a4fa14 handoff: VSync/event-wedge fixes + iterate 2.A–2.BC research notes
Source changes (dormant parity infra, retained from iterate 2.AI/2.AO):
- xenia-kernel/exports.rs: nt_create_event manual_reset polarity +
  related event wiring
- xenia-gpu/mmio_region.rs: D1MODE_VBLANK_VLINE_STATUS hardcode parity

Also lands the audit-runs/ analysis notes (.md/.txt/.json digests) for the
iterate 2.x VSync/0x10e8/0x1004 wedge investigation. Raw trace dumps
(.jsonl/.gz/.csv/.stdout) and agent worktrees (.claude/) are gitignored as
regenerable local artifacts — see memory + HANDOFF for the running findings.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-05 07:19:08 +02:00

# Iterate 2.D fire-pattern diff — report
**Date**: 2026-05-27. **Mode**: read-only re-analysis of cached iterate-2D-peer-producer-trace JSONLs. Zero LOC engine/canary changes.
## Headline
**DIVERGENT-FIRE-PATTERN-FOUND — multiple distinct producer LRs fire in canary with ZERO ours analog across ALL tids.**
Canary total NtSetEvent+NtReleaseSemaphore fires: **79,014** across **33** distinct (op,lr) tuples.
Ours total (IAT + LR thunk trace): **303** across **19** distinct (op,lr) tuples.
Tuples in canary with **zero** ours analog: **28** carrying **29,441** canary fires (37.3% of canary's volume).
Matched tuples: **5** carrying **49,573** canary fires.
Extra-in-ours tuples (not in canary): **14** (sanity tally only).
## Reading-error #28 discipline
Diff key omits tid — we ask 'does this canary (op, lr, handle-class) fire at all in ours, on ANY tid'. Tids are tracked separately per key for context but never used as cross-engine identity.
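A minimal sketch of the tid-free diff described above. It assumes each trace event has already been parsed into a dict with `op`, `lr`, and `tid` fields; those field names and the toy events are illustrative, not the actual `diff.py` schema or data.

```python
from collections import Counter

def fire_counts(events):
    """Count fires per (op, lr); tid is deliberately left out of the key."""
    return Counter((e["op"], e["lr"]) for e in events)

def classify(canary_events, ours_events):
    """Split tuples into missing-in-ours, matched, and extra-in-ours."""
    canary = fire_counts(canary_events)
    ours = fire_counts(ours_events)
    missing = {k: n for k, n in canary.items() if k not in ours}
    matched = {k: (n, ours[k]) for k, n in canary.items() if k in ours}
    extra = {k: n for k, n in ours.items() if k not in canary}
    return missing, matched, extra

# Toy events (hypothetical): tids differ across engines, keys still match.
canary_events = [
    {"op": "set", "lr": 0x824D292C, "tid": 14},
    {"op": "set", "lr": 0x824D292C, "tid": 14},
    {"op": "release", "lr": 0x824AB168, "tid": 6},
]
ours_events = [
    {"op": "release", "lr": 0x824AB168, "tid": 5},
]

missing, matched, extra = classify(canary_events, ours_events)
print(missing)  # tuples canary fires that ours never fires on any tid
print(matched)  # (canary count, ours count) per shared tuple
```

Because tid never enters the key, a release on canary tid=6 and one on ours tid=5 land in the same bucket, which is the behavior the discipline asks for.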
## Top divergent tuples (canary fires N, ours fires 0)
| # | op | LR | canary fires | canary tids | canary handles | likely fn |
|--:|----|----|--:|--|--|---|
| 1 | set | `0x824d292c` | 16,452 | 14 | 0xf800007c | sub_824D2878 (AUDIO worker entry-2 γ-set) |
| 2 | set | `0x82506c90` | 2,378 | 27 | 0xf8000180 | sub_82506B08+0x188 (worker dispatch γ-set) |
| 3 | set | `0x82508510` | 2,373 | 14,15 | 0xf8000184 | sub_82508400+0x110 (worker dispatch γ-set) |
| 4 | set | `0x82508524` | 2,373 | 14,15 | 0xf8000180 | sub_82508400+0x124 (worker dispatch γ-set) |
| 5 | set | `0x82506f9c` | 2,355 | 28 | 0xf800017c | sub_82506DE8 (worker dispatch γ-set) |
| 6 | set | `0x82508358` | 2,350 | 13 | 0xf8000188 | sub_825078D8+0xa80 (worker dispatch γ-set) |
| 7 | set | `0x824aafc8` | 1,113 | 6,10,27,28 | 0xf800004c, 0xf8000050, 0xf8000078 (+29) | sub_824AAF50 (KeSetEvent game wrapper) |
| 8 | set | `0x827e843c` | 15 | 14 | 0xf80000ac | (unknown) |
| 9 | set | `0x82178d9c` | 6 | 16 | 0xf8000104 | (unknown) |
| 10 | set | `0x824d0868` | 5 | 16 | 0xf8000168 | (unknown) |
| 11 | set | `0x824d0c6c` | 3 | 16 | 0xf8000168 | (unknown) |
| 12 | set | `0x824d08c0` | 2 | 14,16 | 0xf8000168 | (unknown) |
| 13 | set | `0x824d091c` | 1 | 6 | 0xf8000168 | (unknown) |
| 14 | set | `0x822d30ec` | 1 | 6 | 0xf80000c8 | (unknown) |
| 15 | set | `0x82507abc` | 1 | 13 | 0xf8000178 | (unknown) |
## Top under-firing matched tuples (canary >>> ours)
| op | LR | canary | ours | ratio (ours / canary) | likely fn |
|----|----|--:|--:|--:|---|
| release | `0x824d229c` | 16,452 | 1 | 0.01% | sub_824D21F0+0xAC (AUDIO dispatch γ-release) |
| set | `0x824d2a44` | 16,452 | 1 | 0.01% | sub_824D29F0 (AUDIO worker entry γ-set) |
| set | `0x824aa304` | 15,765 | 60 | 0.38% | sub_824AA2F0 (NtSetEvent game wrapper) |
| release | `0x824ab168` | 903 | 90 | 9.97% | sub_824AB158 (NtReleaseSemaphore game wrapper) |
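As a quick check, the ratio column can be reproduced as ours divided by canary, expressed as a percentage. The counts below are copied from the table; the helper name `ratio_pct` is illustrative.

```python
# (op, lr, canary fires, ours fires), copied from the table above.
rows = [
    ("release", 0x824D229C, 16452, 1),
    ("set", 0x824D2A44, 16452, 1),
    ("set", 0x824AA304, 15765, 60),
    ("release", 0x824AB168, 903, 90),
]

def ratio_pct(canary, ours):
    """Ours fires as a percentage of canary fires."""
    return 100.0 * ours / canary

formatted = [f"{ratio_pct(c, o):.2f}%" for _, _, c, o in rows]
for (op, lr, c, o), pct in zip(rows, formatted):
    print(f"{op:8s} {lr:#010x} {c:>6d} {o:>3d} {pct}")
```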
## γ-signaler family intersection (AUDIT-069 S3/S2)
| LR | op | canary | ours | likely fn |
|----|----|--:|--:|---|
| `0x824aa304` | set | 15,765 | 60 | sub_824AA2F0 (NtSetEvent game wrapper) |
| `0x824aafc8` | set | 1,113 | 0 | sub_824AAF50 (KeSetEvent game wrapper) |
| `0x824ab168` | release | 903 | 90 | sub_824AB158 (NtReleaseSemaphore game wrapper) |
| `0x824d229c` | release | 16,452 | 1 | sub_824D21F0+0xAC (AUDIO dispatch γ-release) |
| `0x824d292c` | set | 16,452 | 0 | sub_824D2878 (AUDIO worker entry-2 γ-set) |
| `0x824d2a44` | set | 16,452 | 1 | sub_824D29F0 (AUDIO worker entry γ-set) |
| `0x82506c90` | set | 2,378 | 0 | sub_82506B08+0x188 (worker dispatch γ-set) |
| `0x82506f9c` | set | 2,355 | 0 | sub_82506DE8 (worker dispatch γ-set) |
| `0x82508358` | set | 2,350 | 0 | sub_825078D8+0xa80 (worker dispatch γ-set) |
| `0x82508510` | set | 2,373 | 0 | sub_82508400+0x110 (worker dispatch γ-set) |
| `0x82508524` | set | 2,373 | 0 | sub_82508400+0x124 (worker dispatch γ-set) |
Of 28 missing-in-ours tuples: **7** intersect the AUDIT-069 γ-signaler family, **21** lie outside it (fresh chains).
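The 7 / 21 split can be re-derived from the table: a γ-family LR is in the missing set exactly when its ours count is zero. This sketch uses the ours counts from the table and takes the total of 28 missing tuples from the headline.

```python
# γ-family LR -> ours fire count, copied from the table above.
gamma_ours = {
    0x824AA304: 60, 0x824AAFC8: 0, 0x824AB168: 90, 0x824D229C: 1,
    0x824D292C: 0, 0x824D2A44: 1, 0x82506C90: 0, 0x82506F9C: 0,
    0x82508358: 0, 0x82508510: 0, 0x82508524: 0,
}
TOTAL_MISSING = 28  # missing-in-ours tuples, from the headline

# γ-family members that never fire in ours.
gamma_missing = sorted(lr for lr, ours in gamma_ours.items() if ours == 0)
outside_gamma = TOTAL_MISSING - len(gamma_missing)

print(len(gamma_missing), outside_gamma)
print([hex(lr) for lr in gamma_missing])
```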
## Wedge-related LRs (cache-thread / worker-dispatch self-release)
These LRs are the deep game-side call sites that route through `sub_824AB158` (NtReleaseSemaphore wrapper) and ultimately feed the wedge's wait predicate (work-semaphore handle 0x1050 at guest VA [0x828F3BC4]). **Note: canary's audit_70 hook fires at the IAT-thunk depth and reports the wrapper-return LR (`0x824AB168`) for ALL NtReleaseSemaphore fires** — it never sees deeper game-side LRs. Ours's `--lr-trace=0x824AB158` probe captures one level deeper (the game-wrapper caller). So canary count here is always 0; the value of this table is **ours's count** showing which of these game-side sites still execute at all in ours:
| LR | op | canary | ours | likely fn |
|----|----|--:|--:|---|
| `0x82450314` | release | 0 | 6 | sub_82450218+0xFC (cache-thread release site) |
| `0x82450ce0` | release | 0 | 68 | sub_82450B68+0x178 (worker self-release path 1) |
| `0x82450d2c` | release | 0 | 6 | sub_82450B68+0x1C4 (worker self-release path 2) |
Comparable apples-to-apples roll-up: canary's 903 fires at LR `0x824AB168` (NtReleaseSemaphore wrapper return) ↔ ours's 90 fires at the SAME LR (IAT trace). Ratio = **9.97%**. The shortfall is dominated by ours's worker tid=5 (75/903) and cache-thread tid=13 (1/903) under-firing per AUDIT-069 S6.
## Canary-only tids (entry-PC bucket inferred via release-LR clustering)
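The table lists every canary tid; the last column marks whether any of its (op,lr) tuples also fires in ours. Tids marked NO have ZERO matched ours analog at the (op,lr) level: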
| canary tid | total fires | matched-LR fires | missing-LR fires | distinct LRs | analog in ours? |
|--:|--:|--:|--:|--:|---|
| 14 | 33,546 | 16,452 | 17,094 | 6 | YES |
| 4 | 16,452 | 16,452 | 0 | 1 | YES |
| 6 | 10,965 | 10,945 | 20 | 7 | YES |
| 2 | 5,268 | 5,268 | 0 | 1 | YES |
| 15 | 4,120 | 0 | 4,120 | 2 | NO |
| 27 | 2,726 | 0 | 2,726 | 2 | NO |
| 28 | 2,724 | 0 | 2,724 | 2 | NO |
| 13 | 2,356 | 5 | 2,351 | 3 | YES |
| 10 | 800 | 419 | 381 | 4 | YES |
| 16 | 24 | 2 | 22 | 13 | YES |
| 18 | 16 | 14 | 2 | 3 | YES |
| 17 | 8 | 8 | 0 | 1 | YES |
| 11 | 5 | 5 | 0 | 1 | YES |
| 0 | 1 | 0 | 1 | 1 | NO |
| 21 | 1 | 1 | 0 | 1 | YES |
| 7 | 1 | 1 | 0 | 1 | YES |
| 26 | 1 | 1 | 0 | 1 | YES |
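The per-tid rows can be cross-checked against the headline totals (79,014 total, 49,573 matched, 29,441 missing). The sketch below copies the table values and also filters the tids with no matched-LR fires; the variable names are illustrative.

```python
# (canary tid, total fires, matched-LR fires, missing-LR fires) from the table.
rows = [
    (14, 33546, 16452, 17094), (4, 16452, 16452, 0), (6, 10965, 10945, 20),
    (2, 5268, 5268, 0), (15, 4120, 0, 4120), (27, 2726, 0, 2726),
    (28, 2724, 0, 2724), (13, 2356, 5, 2351), (10, 800, 419, 381),
    (16, 24, 2, 22), (18, 16, 14, 2), (17, 8, 8, 0), (11, 5, 5, 0),
    (0, 1, 0, 1), (21, 1, 1, 0), (7, 1, 1, 0), (26, 1, 1, 0),
]

# Every row must split cleanly into matched + missing.
row_ok = all(total == matched + missing for _, total, matched, missing in rows)

total_fires = sum(r[1] for r in rows)
matched_fires = sum(r[2] for r in rows)
missing_fires = sum(r[3] for r in rows)

# Tids whose every fire is on an LR ours never visits.
no_analog = [tid for tid, _, matched, _ in rows if matched == 0]

print(row_ok, total_fires, matched_fires, missing_fires)
print(no_analog)
```

The filter returns tids 15, 27, 28, and also tid 0, whose single fire is unmatched.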
## Canary thread families with no ours analog (entire-thread divergence)
Per the 'Canary-only tids' table above, **three high-volume canary tids (15, 27, 28) have ZERO matched-LR fires**: every event they produce is on an LR ours never visits. (Tid 0 is also unmatched, but with a single fire.) Their fire patterns:
- **canary tid=15** (4,120 fires): exclusively on LRs `0x82508510` (sub_82508400+0x110) and `0x82508524` (sub_82508400+0x124), the paired worker-dispatch γ-set sites. Each LR totals 2,373 canary fires, shared with canary tid=14, which co-fires on the same LRs.
- **canary tid=27** (2,726 fires): only on LRs `0x82506c90` (2,378×, sub_82506B08+0x188, worker dispatch) and `0x824AAFC8` (348×, KeSetEvent wrapper).
- **canary tid=28** (2,724 fires): only on LRs `0x82506f9c` (2,355×, sub_82506DE8, worker dispatch) and `0x824AAFC8` (369×, KeSetEvent wrapper).
**Conclusion: canary tids 15/27/28 are members of the sub_825070F0 worker fan-out cluster that ours fails to spawn or whose start ctx is mis-initialized.** This matches the Review A Step 1 force-spawn-workers diagnosis (workers spawn but fault on `[ctx+44] = 0xBCE25640` unmapped read).
Canary tid=14 (33,546 fires, the audio worker A) HAS a partial ours analog (ours tids 9/10/11 fire 3 total events on the audio LRs), confirming that ours DOES spawn the audio threads but they wedge after 1 iteration (per iterate-2D investigation §Step 3).
## Outcome class
**Class (C) Many distributed producers missing** (confirms iterate-2D's outcome). Not a single (lr, handle) tuple: at least 15 distinct call sites in canary have zero ours analog on any tid.
## Recommendation
**DROP-TO-OPTION-2 (boot-time delta replay), NOT force-spawn crowbar.**
Why not the crowbar (Option-C from goal): Review A Step 1 attempted exactly that on 2026-05-27 (`review-a-step1-force-spawn/progression-result.md`) and FAILED the PRIMARY progression gate. The 4 workers spawn under `--force-spawn-workers` but fault ~159 instructions in at `vtable[35..38]` dispatching on `[ctx+44]=0xBCE25640` — an unmapped VA in ours's allocator namespace. Force-spawning without first fixing the upstream ctx-state-installer chain is futile.
Why Option-2: iterate-2D §Step 3 documented a **1.3 s upstream timing skew** (canary first audio fire at host_ns=278 ms; ours first audio fire at 1,587 ms — 5.7× later). The 28 missing producer LRs found here are downstream consequences of that delay. Diffing the first ~1200 phase-a events to find the single early-init kernel-call divergence is cheaper, doesn't add LOC, and likely cascades to most of the 28 LRs at once. The canary's tid=6 ↔ ours's tid=1 main-thread bootstrap matches for 20 releases (per AUDIT-069 S6) then diverges — that's the right window to inspect.
Concrete next iterate: `iterate-2E-boot-delta-replay` — ~0 engine LOC, ~100 LOC investigation. Read existing phase-a event logs at `xenia-rs/audit-runs/phase-a-diff-harness/` (dated 2026-05-26) for both engines, time-bucket by host_ns, diff at first kernel-import-call mismatch. If the harness's diff path already covers this, the analysis may be pure data work.
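A rough sketch of the first-mismatch diff proposed for `iterate-2E-boot-delta-replay`. It assumes each phase-a event is a dict with `host_ns` and `call` fields; that schema, the function name, and the toy events are hypothetical and would need to be adapted to the real harness logs. Ordering all threads globally by `host_ns` is also an assumption and may need per-thread handling in practice.

```python
from itertools import zip_longest

def first_mismatch(canary, ours, key="call"):
    """Return (index, canary_event, ours_event) at the first divergence, or None."""
    a = sorted(canary, key=lambda e: e["host_ns"])
    b = sorted(ours, key=lambda e: e["host_ns"])
    for i, (ea, eb) in enumerate(zip_longest(a, b)):
        # A shorter log counts as a divergence at the first absent event.
        if ea is None or eb is None or ea[key] != eb[key]:
            return i, ea, eb
    return None

# Toy logs (hypothetical): timestamps differ, only the call order is compared.
canary_log = [
    {"host_ns": 100, "call": "NtCreateEvent"},
    {"host_ns": 250, "call": "NtSetEvent"},
    {"host_ns": 400, "call": "NtReleaseSemaphore"},
]
ours_log = [
    {"host_ns": 900, "call": "NtCreateEvent"},
    {"host_ns": 1500, "call": "NtSetEvent"},
    {"host_ns": 2100, "call": "NtSetEvent"},
]

result = first_mismatch(canary_log, ours_log)
print(result)
```

Comparing call order rather than absolute timestamps keeps the known timing skew between the engines from masking the first structural divergence.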
## Cross-check vs γ-signaler family
γ-family LRs (defined per AUDIT-069 S3/S2) have **7** representatives among the missing-in-ours set. The remaining **21** missing tuples lie outside the γ-family; these are fresh producer chains the AUDIT-069 work never characterized. The eight highest-volume ones are listed here, and the full set is in `divergent-tuples.csv`:
- `0x827e843c` (set, canary=15 fires, tids=[14])
- `0x82178d9c` (set, canary=6 fires, tids=[16])
- `0x824d0868` (set, canary=5 fires, tids=[16])
- `0x824d0c6c` (set, canary=3 fires, tids=[16])
- `0x824d08c0` (set, canary=2 fires, tids=[14, 16])
- `0x824d091c` (set, canary=1 fires, tids=[6])
- `0x822d30ec` (set, canary=1 fires, tids=[6])
- `0x82507abc` (set, canary=1 fires, tids=[13])
## Cascade check
- A (acquire both engines' fire data): **PASS** — cached canary 79,014 events + ours 153 events.
- B (build cross-engine tuple key respecting reading-error #28): **PASS** — keyed on (op, lr); handle namespace differences absorbed by structural LR identity.
- C (identify divergent tuples): **PASS** — see top-15 table above.
- D (attribute cause): **PASS MEDIUM** — class (C) structural ladder; not a single bug.
- E (recommend next iterate): **PASS** — Option-2 boot-time delta replay (per iterate-2D's investigation §Step 5).
## Tripstones honored
- **#28** (per-engine tid stability): tids omitted from diff key.
- **#39** (composite progression metric): not relevant — this is an investigation, not a progression iterate.
- **#40** (single-keystone framing): explicitly checked and falsified.
## Artifacts
Under `xenia-rs/audit-runs/iterate-2D-fire-pattern-diff/`:
- `diff.py` — this analysis script.
- `report.md` — this report.
- `divergent-tuples.csv` — full list of missing-in-ours tuples for further xref.
- `matched-tuples.csv` — full list of matched tuples with canary/ours counts.