xenia-rs/audit-runs/audit-069-session6-phase-a-bridge/first-20-release-diff.md
MechaCat02 ef93a4fa14 handoff: VSync/event-wedge fixes + iterate 2.A–2.BC research notes
Source changes (dormant parity infra, retained from iterate 2.AI/2.AO):
- xenia-kernel/exports.rs: nt_create_event manual_reset polarity +
  related event wiring
- xenia-gpu/mmio_region.rs: D1MODE_VBLANK_VLINE_STATUS hardcode parity

Also lands the audit-runs/ analysis notes (.md/.txt/.json digests) for the
iterate 2.x VSync/0x10e8/0x1004 wedge investigation. Raw trace dumps
(.jsonl/.gz/.csv/.stdout) and agent worktrees (.claude/) are gitignored as
regenerable local artifacts — see memory + HANDOFF for the running findings.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-05 07:19:08 +02:00


AUDIT-069 Session 6 — Time-ordered first-N release diff

Source data

  • Canary: xenia-rs/audit-runs/audit-069-wait-signal-producer/s5/canary-release-trace.log (414 AUDIT-070-RELEASE events on work-semaphore handle 0xF800003C).
  • Ours: xenia-rs/audit-runs/audit-069-wait-signal-producer/s5/ours-release-trace.jsonl (99 --lr-trace events at PC 0x824ab158, the NtReleaseSemaphore wrapper entry; 83 on handle 0x1044 which is ours's work-semaphore analog of canary's 0xF800003C).

Apples-to-apples comparison uses canary 414 ↔ ours 83 on the work semaphore (equivalent handles: 0xF800003C in canary, 0x1044 in ours). Ratio = 83/414 = 20.0%, slightly below the S5-reported "24%" headline figure (99/414 = 23.9%, which counted ours's 99 releases across ALL handles against canary's 414 on a single handle).

Per-tid release totals

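| Canary tid | Role | Canary releases | Ours tid (map) | Ours releases | Delta |
|---|---|---|---|---|---|
| 6 | main | 7 | 1 | 7 | 0 |
| 10 | worker | 382 | 5 | 75 | +307 canary |
| 17 | cache producer | 9 | 13 | 1 | +8 canary |
| 18 | (other producer) | 14 | | 0 | +14 canary (no ours analog) |
| 16 | (other producer) | 1 | | 0 | +1 canary |
| 26 | (other producer) | 1 | | 0 | +1 canary |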

Main thread releases match exactly (7=7) — ours's main is bit-equivalent on this path.
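The per-tid totals above can be reproduced with a short tally. This is a minimal sketch, assuming each trace has already been reduced to a list of releasing tids on the work semaphore; the helper name `per_tid_delta` and the row layout are illustrative and not part of the audit tooling. The tid lists are rebuilt from the totals in the table rather than read from the trace files.

```python
from collections import Counter

# Canary tid -> ours tid, per the AUDIT-068/069 tid map.
# Canary tids absent from this map have no known ours analog.
TID_MAP = {6: 1, 10: 5, 17: 13}

def per_tid_delta(canary_tids, ours_tids, tid_map=TID_MAP):
    """Return rows of (canary_tid, canary_n, ours_tid, ours_n, delta)."""
    canary_counts = Counter(canary_tids)
    ours_counts = Counter(ours_tids)
    rows = []
    for c_tid, c_n in canary_counts.items():
        o_tid = tid_map.get(c_tid)  # None when ours has no analog
        o_n = ours_counts.get(o_tid, 0) if o_tid is not None else 0
        rows.append((c_tid, c_n, o_tid, o_n, c_n - o_n))
    return rows

# Tid lists rebuilt from the table totals (not parsed from the raw traces).
canary = [6] * 7 + [10] * 382 + [17] * 9 + [18] * 14 + [16] * 1 + [26] * 1
ours = [1] * 7 + [5] * 75 + [13] * 1

rows = per_tid_delta(canary, ours)
for row in rows:
    print(row)
```

The deltas sum to 414 - 83 = 331, which is the total release deficit on the work semaphore.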

FIRST-N=20 time-ordered diff

Time-ordered by canary host_ns and aligned to ours via the AUDIT-068/069 documented tid map (6↔1, 10↔5, 17↔13):

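| can_ord | can_tid | ours_tid | ours_ord (on tid) | status | canary host_ns |
|---|---|---|---|---|---|
| 0 | 6 | 1 | 0 | MATCHED | 6,600 |
| 1 | 10 | 5 | 0 | MATCHED | 9,503,200 |
| 2 | 6 | 1 | 1 | MATCHED | 44,374,500 |
| 3 | 10 | 5 | 1 | MATCHED | 45,152,800 |
| 4 | 6 | 1 | 2 | MATCHED | 56,846,700 |
| 5 | 10 | 5 | 2 | MATCHED | 105,855,200 |
| 6 | 6 | 1 | 3 | MATCHED | 188,211,400 |
| 7 | 10 | 5 | 3 | MATCHED | 192,596,400 |
| 8 | 6 | 1 | 4 | MATCHED | 194,344,500 |
| 9 | 10 | 5 | 4 | MATCHED | 195,199,800 |
| 10 | 6 | 1 | 5 | MATCHED | 196,786,900 |
| 11 | 10 | 5 | 5 | MATCHED | 197,419,200 |
| 12 | 6 | 1 | 6 | MATCHED | 335,050,200 |
| 13 | 10 | 5 | 6 | MATCHED | 336,046,100 |
| 14 | 10 | 5 | 7 | MATCHED | 337,214,700 |
| 15 | 10 | 5 | 8 | MATCHED | 337,443,900 |
| 16 | 10 | 5 | 9 | MATCHED | 337,674,900 |
| 17 | 10 | 5 | 10 | MATCHED | 337,900,800 |
| 18 | 10 | 5 | 11 | MATCHED | 338,123,800 |
| 19 | 10 | 5 | 12 | MATCHED | 338,350,000 |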

All 20 match. Bootstrap is identical for the first 20 releases.
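The alignment behind this table can be sketched as follows. The match rule used here is an assumption inferred from the table columns: a canary release counts as matched when ours's mapped tid emitted at least as many releases as the canary event's per-tid ordinal requires. The function name `align_releases` and the `UNMATCHED` label are hypothetical; only `MATCHED` appears in the audit output.

```python
from collections import defaultdict

# Canary tid -> ours tid, per the AUDIT-068/069 tid map.
TID_MAP = {6: 1, 10: 5, 17: 13}

def align_releases(canary_events, ours_counts, tid_map=TID_MAP):
    """Align canary releases to ours by per-thread ordinal.

    canary_events: iterable of (host_ns, canary_tid)
    ours_counts:   {ours_tid: releases ours emitted on that tid}
    """
    next_ord = defaultdict(int)  # next per-tid ordinal, keyed by canary tid
    rows = []
    # Sorting the tuples orders events by host_ns, i.e. canary time order.
    for can_ord, (host_ns, can_tid) in enumerate(sorted(canary_events)):
        ours_tid = tid_map.get(can_tid)
        ours_ord = next_ord[can_tid]
        next_ord[can_tid] += 1
        matched = ours_tid is not None and ours_ord < ours_counts.get(ours_tid, 0)
        status = "MATCHED" if matched else "UNMATCHED"
        rows.append((can_ord, can_tid, ours_tid, ours_ord, status, host_ns))
    return rows

# First 20 canary events, copied from the table: (host_ns, tid).
FIRST_20 = [
    (6_600, 6), (9_503_200, 10), (44_374_500, 6), (45_152_800, 10),
    (56_846_700, 6), (105_855_200, 10), (188_211_400, 6), (192_596_400, 10),
    (194_344_500, 6), (195_199_800, 10), (196_786_900, 6), (197_419_200, 10),
    (335_050_200, 6), (336_046_100, 10), (337_214_700, 10), (337_443_900, 10),
    (337_674_900, 10), (337_900_800, 10), (338_123_800, 10), (338_350_000, 10),
]
OURS_COUNTS = {1: 7, 5: 75, 13: 1}  # per-tid release totals for ours

rows = align_releases(FIRST_20, OURS_COUNTS)
for row in rows:
    print(*row)
```

Aligning on per-tid ordinals rather than on timestamps avoids comparing canary host_ns against ours's cycle counts, which are on different clocks.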

First divergence

Extending the walk past the first 20:

First time-ordered canary event NOT matched in ours:
    canary ord = 83   tid = 10 (worker)   host_ns = 372,415,500
    reason     = ours's tid=5 worker has produced ALL of its 75 releases by this point

But the causal divergence is one ord earlier:

canary ord = 82   tid = 17 (cache-thread)   host_ns = 372,105,500   lr = 0x824AB168
    → canary's tid=17 emits its FIRST work-sem release at 372 ms
    → ours's tid=13 (cache-thread analog) emitted its only release at cycle=26,803 (LR 0x82450314)
       early in bootstrap, then NEVER releases again — it wedges at sub_821CB030+0x1AC
       (per AUDIT-069 S1 wait-site, AUDIT-049 wedge family).

Canary tid=17's 9 releases (ords 82, 84, 86, 88, 92, 93, 94, 95, 96) feed the work-semaphore between roughly 372 ms and 399 ms of host time. These supply work-items to canary's worker tid=10, which then produces another ~300 releases as it processes the queued items.

Ours's tid=13 is silent after its bootstrap-time release. The worker tid=5 runs out of work and halts at 75 releases — the moment it finishes consuming items produced before tid=13 wedged.
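The position of the first unmatched event follows from the counts alone, which the sketch below checks. It assumes the same per-tid ordinal match rule as the first-20 table; `first_unmatched` is a hypothetical helper. The tid sequence for ords 20 to 81 is inferred rather than read from the trace: main's 7 releases are all inside the first 20, and tid=17 first appears at ord 82, so those 62 events must all belong to worker tid=10.

```python
# Canary tid -> ours tid, and ours's per-tid release totals, from the audit.
TID_MAP = {6: 1, 10: 5, 17: 13}
OURS_COUNTS = {1: 7, 5: 75, 13: 1}

def first_unmatched(canary_tids, ours_counts=OURS_COUNTS, tid_map=TID_MAP):
    """Return (can_ord, can_tid) of the first canary release ours cannot match."""
    used = {}  # releases consumed so far, keyed by canary tid
    for can_ord, can_tid in enumerate(canary_tids):
        ours_tid = tid_map.get(can_tid)
        n = used.get(can_tid, 0)
        if ours_tid is None or n >= ours_counts.get(ours_tid, 0):
            return can_ord, can_tid
        used[can_tid] = n + 1
    return None  # every canary event had an ours counterpart

first_20 = [6, 10] * 6 + [6] + [10] * 7  # ords 0..19, from the table
inferred = [10] * 62                     # ords 20..81, inferred from counts
tail = [17, 10]                          # ords 82 and 83, from the walk above

result = first_unmatched(first_20 + inferred + tail)
print(result)  # (83, 10)
```

Ord 82 itself still matches under this rule, because ours's tid=13 has exactly one release to pair with canary tid=17's first one. That is why the first unmatched event (ord 83) sits one ord after the causal divergence.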

Interpretation vs S5 H3 ("systemic under-production")

H3 predicted a "systemic" under-production across all producers. The first-20 diff REFINES H3:

  • First 20 releases match cleanly across both engines. The system is NOT broken at boot.
  • The under-production is concentrated on the cache-thread (canary tid=17 / ours tid=13). That thread's failure to produce 8 more releases (after its 1st) cascades into a missing ~300 worker releases.
  • Canary tids 18/16/26 (14+1+1 = 16 additional releases from "other producers") have no observable ours analog. Whether ours never spawned analogs or those threads exist but never reach the release site is not determined by this measurement.

H3 is therefore PARTIALLY CONFIRMED with refinement: the dominant under-production source is the cache-thread (tid=17/13), not a generic systemic deficit. The remaining 16 releases from canary-only producer tids (18/16/26) are the secondary contribution.
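The split between the two contributions can be checked with a few lines of arithmetic. This sketch uses only the per-tid totals reported above; grouping the worker deficit with the cache-thread deficit reflects this note's interpretation (the worker starves because the cache-thread stops producing), not an independent measurement.

```python
canary_total, ours_total = 414, 83
deficit = canary_total - ours_total  # releases missing in ours

worker_gap = 382 - 75        # canary tid 10 vs ours tid 5
cache_gap = 9 - 1            # canary tid 17 vs ours tid 13
canary_only_gap = 14 + 1 + 1  # canary tids 18, 16, 26 (no ours analog)

# The three gaps must account for the whole deficit.
assert worker_gap + cache_gap + canary_only_gap == deficit

# Interpretation: the worker gap is downstream of the cache-thread wedge.
cache_chain = worker_gap + cache_gap

print(f"ratio ours/canary : {ours_total / canary_total:.1%}")   # 20.0%
print(f"total deficit     : {deficit}")                         # 331
print(f"cache-thread chain: {cache_chain} ({cache_chain / deficit:.1%})")          # 315 (95.2%)
print(f"canary-only tids  : {canary_only_gap} ({canary_only_gap / deficit:.1%})")  # 16 (4.8%)
```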

  1. Probe ours tid=13 between cycle 26,803 (its first and only release) and its wedge at sub_821CB030+0x1AC. Identify why the cache-thread reaches its release site once in ours but 9 times in canary. AUDIT-069 S4's hypothesis (work-sem over-release causing the producer to never re-enter its wait) is now FALSIFIED by S5+S6 data; the producer simply never gets back to its release site.
  2. Inventory canary tids 18/16/26. Identify their entry PCs in canary, then check whether ours spawns analogs at all (thread.create events in a Phase A event log).
  3. The schema bridge wired in this session (see summary.md) makes future regressions in semaphore-release cadence diff-visible without ad-hoc cvars.