xenia-rs/audit-runs/audit-069-session6-phase-a-bridge/first-20-release-diff.md
MechaCat02 ef93a4fa14 handoff: VSync/event-wedge fixes + iterate 2.A–2.BC research notes
Source changes (dormant parity infra, retained from iterate 2.AI/2.AO):
- xenia-kernel/exports.rs: nt_create_event manual_reset polarity +
  related event wiring
- xenia-gpu/mmio_region.rs: D1MODE_VBLANK_VLINE_STATUS hardcode parity

Also lands the audit-runs/ analysis notes (.md/.txt/.json digests) for the
iterate 2.x VSync/0x10e8/0x1004 wedge investigation. Raw trace dumps
(.jsonl/.gz/.csv/.stdout) and agent worktrees (.claude/) are gitignored as
regenerable local artifacts — see memory + HANDOFF for the running findings.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-05 07:19:08 +02:00

# AUDIT-069 Session 6 — Time-ordered first-N release diff
## Source data
- Canary: `xenia-rs/audit-runs/audit-069-wait-signal-producer/s5/canary-release-trace.log` (414 `AUDIT-070-RELEASE` events on work-semaphore handle `0xF800003C`).
- Ours: `xenia-rs/audit-runs/audit-069-wait-signal-producer/s5/ours-release-trace.jsonl` (99 `--lr-trace` events at PC `0x824ab158`, the `NtReleaseSemaphore` wrapper entry; 83 on handle `0x1044` which is ours's work-semaphore analog of canary's `0xF800003C`).

Apples-to-apples comparison uses the work semaphore only: **canary 414 ↔ ours 83** (handle `0xF800003C` in canary, its analog `0x1044` in ours). Ratio = 83/414 = **20.0%**, slightly below the S5-reported "24%" headline figure, which counted ours's 99 releases across ALL handles against canary's 414 on a single handle (99/414 ≈ 23.9%).
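
The two ratios can be reproduced directly from the event counts quoted in this section. The snippet below is only an arithmetic check; the variable names are illustrative and not taken from any tool in the repository.

```python
# Counts taken from the "Source data" section of this note.
canary_work_sem = 414   # AUDIT-070-RELEASE events on handle 0xF800003C
ours_all_handles = 99   # --lr-trace events at PC 0x824ab158, any handle
ours_work_sem = 83      # subset of the 99 that are on handle 0x1044


def pct(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


handle_matched = pct(ours_work_sem, canary_work_sem)   # like-for-like ratio
s5_headline = pct(ours_all_handles, canary_work_sem)   # S5's mixed-handle ratio
print(handle_matched, s5_headline)
```

The gap between the two figures comes entirely from the 16 ours releases on handles other than `0x1044`.
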
## Per-tid release totals
| Canary tid | Role | Releases | Ours tid (map) | Releases | Delta |
|---:|---|---:|---:|---:|---:|
| 6 | main | 7 | 1 | 7 | 0 |
| 10 | worker | 382 | 5 | 75 | **+307 canary** |
| 17 | cache producer | 9 | 13 | 1 | **+8 canary** |
| 18 | (other producer) | 14 | — | 0 | +14 canary (no ours analog) |
| 16 | (other producer) | 1 | — | 0 | +1 canary |
| 26 | (other producer) | 1 | — | 0 | +1 canary |

Main thread releases **match exactly (7=7)**; ours's main is bit-equivalent on this path.
## FIRST-N=20 time-ordered diff
Time-ordered by canary `host_ns` and aligned to ours via the AUDIT-068/069 documented tid map (`6↔1`, `10↔5`, `17↔13`):
| can_ord | can_tid | ours_tid | ours_ord (on tid) | status | canary host_ns |
|---:|---:|---:|---:|---|---:|
| 0 | 6 | 1 | 0 | MATCHED | 6,600 |
| 1 | 10 | 5 | 0 | MATCHED | 9,503,200 |
| 2 | 6 | 1 | 1 | MATCHED | 44,374,500 |
| 3 | 10 | 5 | 1 | MATCHED | 45,152,800 |
| 4 | 6 | 1 | 2 | MATCHED | 56,846,700 |
| 5 | 10 | 5 | 2 | MATCHED | 105,855,200 |
| 6 | 6 | 1 | 3 | MATCHED | 188,211,400 |
| 7 | 10 | 5 | 3 | MATCHED | 192,596,400 |
| 8 | 6 | 1 | 4 | MATCHED | 194,344,500 |
| 9 | 10 | 5 | 4 | MATCHED | 195,199,800 |
| 10 | 6 | 1 | 5 | MATCHED | 196,786,900 |
| 11 | 10 | 5 | 5 | MATCHED | 197,419,200 |
| 12 | 6 | 1 | 6 | MATCHED | 335,050,200 |
| 13 | 10 | 5 | 6 | MATCHED | 336,046,100 |
| 14 | 10 | 5 | 7 | MATCHED | 337,214,700 |
| 15 | 10 | 5 | 8 | MATCHED | 337,443,900 |
| 16 | 10 | 5 | 9 | MATCHED | 337,674,900 |
| 17 | 10 | 5 | 10 | MATCHED | 337,900,800 |
| 18 | 10 | 5 | 11 | MATCHED | 338,123,800 |
| 19 | 10 | 5 | 12 | MATCHED | 338,350,000 |

**All 20 match. Bootstrap is identical for first 20 releases.**
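
One plausible reading of the alignment rule behind the table is per-tid ordinal matching: the k-th release on a canary tid counts as MATCHED when the mapped ours tid produced more than k releases in total. The sketch below implements that reading. The function name `first_unmatched` and the data layout are assumptions made for illustration, not the actual diff tool.

```python
from collections import Counter
from typing import Optional

# Canary tid -> ours tid, per the AUDIT-068/069 tid map quoted above.
TID_MAP = {6: 1, 10: 5, 17: 13}


def first_unmatched(canary_tids, ours_counts, tid_map=TID_MAP) -> Optional[int]:
    """Walk canary releases in time order and return the first unmatched ordinal.

    The k-th release on a canary tid is matched if the mapped ours tid has
    more than k releases in total. Returns None when every event matches.
    """
    seen = Counter()  # releases consumed so far, per canary tid
    for ordinal, tid in enumerate(canary_tids):
        ours_tid = tid_map.get(tid)
        if ours_tid is None or seen[tid] >= ours_counts.get(ours_tid, 0):
            return ordinal
        seen[tid] += 1
    return None


# can_tid column of the first-20 table, in can_ord order.
FIRST_20 = [6, 10, 6, 10, 6, 10, 6, 10, 6, 10, 6, 10, 6, 10,
            10, 10, 10, 10, 10, 10]
# Ours per-tid totals from the "Per-tid release totals" table.
OURS_COUNTS = {1: 7, 5: 75, 13: 1}

print(first_unmatched(FIRST_20, OURS_COUNTS))  # None: all 20 match
```

Under this rule a canary release on a tid with no ours analog is unmatched immediately, which is why the tid map has to be fixed before the walk starts.
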
## First divergence
Extending the walk past the first 20:
```
First time-ordered canary event NOT matched in ours:
canary ord = 83 tid = 10 (worker) host_ns = 372,415,500
reason = ours's tid=5 worker has produced ALL of its 75 releases by this point
```
But the **causal** divergence is one ord earlier:
```
canary ord = 82 tid = 17 (cache-thread) host_ns = 372,105,500 lr = 0x824AB168
→ canary's tid=17 emits its FIRST work-sem release at 372 ms
→ ours's tid=13 (cache-thread analog) emitted its only release at cycle=26,803 (LR 0x82450314)
early in bootstrap, then NEVER releases again — it wedges at sub_821CB030+0x1AC
(per AUDIT-069 S1 wait-site, AUDIT-049 wedge family).
```
Canary tid=17's 9 releases (ords 82, 84, 86, 88, 92, 93, 94, 95, 96) feed the work semaphore between roughly 372 ms and 399 ms of host time. These supply work items to canary's worker tid=10, which then produces the remaining ~300 releases (382 - 75 = 307) as it processes the queued items.

Ours's tid=13 is silent after its bootstrap-time release. The worker tid=5 runs out of work and halts at 75 releases, the moment it finishes consuming the items produced before tid=13 wedged.
## Interpretation vs S5 H3 ("systemic under-production")
H3 predicted a "systemic" under-production across all producers. The first-20 diff REFINES H3:
- **First 20 releases match cleanly across both engines.** The system is NOT broken at boot.
- The under-production is **concentrated on the cache-thread (canary tid=17 / ours tid=13).** That thread's failure to produce 8 more releases (after its 1st) cascades into a missing ~300 worker releases.
- Canary tids 18/16/26 (14+1+1 = 16 additional releases from "other producers") have no observable ours analog. Whether ours never spawned analogs or those threads exist but never reach the release site is not determined by this measurement.

**H3 is therefore PARTIALLY CONFIRMED with refinement:** the dominant under-production source is the cache-thread (tid=17/13), not a generic systemic deficit. The remaining 16 releases from canary-only producer tids (18/16/26) are the secondary contribution.
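
The split between the cache-thread chain and the canary-only producers can be checked against the per-tid totals. The snippet below is a bookkeeping sketch only: it groups the worker shortfall with the cache-thread shortfall because this note argues the former is caused by the latter, and the arithmetic by itself does not establish that causality.

```python
# (canary tid, ours tid or None, canary releases, ours releases),
# copied from the "Per-tid release totals" table.
ROWS = [
    (6, 1, 7, 7),
    (10, 5, 382, 75),
    (17, 13, 9, 1),
    (18, None, 14, 0),
    (16, None, 1, 0),
    (26, None, 1, 0),
]


def deficit(canary_tids) -> int:
    """Sum of (canary - ours) releases over the given canary tids."""
    return sum(c - o for tid, _, c, o in ROWS if tid in canary_tids)


total_deficit = deficit({tid for tid, _, _, _ in ROWS})  # 414 - 83
cache_chain = deficit({10, 17})     # worker shortfall plus cache-thread shortfall
canary_only = deficit({18, 16, 26}) # producers with no ours analog

print(total_deficit, cache_chain, canary_only)
print(round(100.0 * cache_chain / total_deficit, 1))
```

On these numbers the cache-thread chain accounts for 315 of the 331 missing releases and the canary-only producers for the other 16.
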
## Recommended AUDIT-070 next steps
1. **Probe ours tid=13 between cycle 26,803 (its first release) and its wedge at `sub_821CB030+0x1AC`.** Identify why the cache-thread releases once in ours but 9 times in canary. AUDIT-069 S4's hypothesis (work-sem over-release causing producer to never re-enter wait) is now FALSIFIED by S5+S6 data; the producer simply never gets back to its release site.
2. **Inventory canary tids 18/16/26.** Identify their entry PCs in canary, then check whether ours spawns analogs at all (`thread.create` events in a Phase A event log).
3. **The schema bridge wired in this session** (see `summary.md`) makes future regressions in semaphore-release cadence diff-visible without ad-hoc cvars.
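
Step 2 could be scripted roughly as follows. This is a sketch under loud assumptions: the Phase A event-log schema is not described in this note, so the JSONL layout and the field names `kind` and `entry_pc` are hypothetical, as are both helper functions. Only the event name `thread.create` comes from the text above, and the entry PCs in the usage lines are placeholders, not measured values.

```python
import json


def spawned_entry_pcs(event_log_text: str) -> set:
    """Collect entry PCs from thread.create events (hypothetical schema)."""
    pcs = set()
    for line in event_log_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        event = json.loads(line)
        if event.get("kind") == "thread.create":  # "kind" is an assumed field
            pcs.add(event["entry_pc"])            # "entry_pc" is an assumed field
    return pcs


def missing_analogs(canary_entry_pcs: dict, ours_log_text: str) -> dict:
    """Return the canary tids whose entry PC never appears in ours's log."""
    spawned = spawned_entry_pcs(ours_log_text)
    return {tid: pc for tid, pc in canary_entry_pcs.items() if pc not in spawned}


# Placeholder values only; the real entry PCs still have to be identified.
canary_entry_pcs = {18: "0xPLACEHOLDER_A", 16: "0xPLACEHOLDER_B"}
ours_log = '{"kind": "thread.create", "entry_pc": "0xPLACEHOLDER_A"}\n'
print(missing_analogs(canary_entry_pcs, ours_log))
```

A tid that shows up in the result was never spawned in ours; a tid that does not show up was spawned but may still fail to reach the release site, which is the second case left open in the interpretation above.
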