xenia-rs/audit-runs/phase-host-audio-eager/re-validation.md
MechaCat02 ef93a4fa14 handoff: VSync/event-wedge fixes + iterate 2.A–2.BC research notes
Source changes (dormant parity infra, retained from iterate 2.AI/2.AO):
- xenia-kernel/exports.rs: nt_create_event manual_reset polarity +
  related event wiring
- xenia-gpu/mmio_region.rs: D1MODE_VBLANK_VLINE_STATUS hardcode parity

Also lands the audit-runs/ analysis notes (.md/.txt/.json digests) for the
iterate 2.x VSync/0x10e8/0x1004 wedge investigation. Raw trace dumps
(.jsonl/.gz/.csv/.stdout) and agent worktrees (.claude/) are gitignored as
regenerable local artifacts — see memory + HANDOFF for the running findings.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-05 07:19:08 +02:00


Phase Host-Audio-Eager — Re-validation (2026-05-19)

Progression metric (primary gate)

| metric | pre-fix baseline | post-fix | delta |
|--------|------------------|----------|-------|
| swaps  | 1                | 1        | 0     |
| draws  | 0                | 0        | 0     |

The progression metric did NOT move. Although the eager-seed implementation landed cleanly and reproduced identically across 3 runs, neither swaps nor draws advanced. This matches the prior agent's diagnosis: the audio worker ordering issue is real, but the deeper root cause is voice-struct state divergence. In our build, the audio callback at 0x824D6640 blocks on KeWaitForMultipleObjects([0x82928B04, 0x82928AE0]) immediately, because the voice-struct field at [r31+356] reads 0x01 (ours) vs 0x00 (canary). Pre-seeding 8 fires lets try_inject_audio_callback deliver the first callback earlier, but that callback still blocks on the same guest dispatchers, so fires 2-8 sit in the queue because interrupts.is_in_callback() stays true.
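The queueing behaviour described above can be illustrated with a toy model. This is a minimal sketch, not the xenia-rs code: the `Interrupts` and `AudioTicker` types, the `pending_fires` field, and `simulate_blocked_callback` are hypothetical, and only the names `try_inject_audio_callback` and `is_in_callback` are borrowed from the text. It assumes the injector delivers at most one fire per attempt and refuses to deliver while a callback is in flight.

```rust
use std::collections::VecDeque;

/// Toy stand-in for the interrupt state; only the in-callback flag is modeled.
struct Interrupts {
    in_callback: bool,
}

impl Interrupts {
    fn is_in_callback(&self) -> bool {
        self.in_callback
    }
}

/// Hypothetical model of the audio ticker's pending-fire queue.
struct AudioTicker {
    pending_fires: VecDeque<u32>,
    delivered: Vec<u32>,
}

impl AudioTicker {
    /// Eager seed: pre-populate `count` fires at register time.
    fn seed(count: u32) -> Self {
        AudioTicker {
            pending_fires: (1..=count).collect(),
            delivered: Vec::new(),
        }
    }

    /// Deliver at most one fire, and only when no callback is in flight.
    /// Returns true if a fire was delivered.
    fn try_inject_audio_callback(&mut self, interrupts: &mut Interrupts) -> bool {
        if interrupts.is_in_callback() {
            return false;
        }
        match self.pending_fires.pop_front() {
            Some(fire) => {
                // In this model the callback parks in a guest wait and never
                // returns, so the in-callback flag is never cleared.
                interrupts.in_callback = true;
                self.delivered.push(fire);
                true
            }
            None => false,
        }
    }
}

/// Run `steps` injection attempts; returns (fires delivered, fires still queued).
fn simulate_blocked_callback(seeded: u32, steps: usize) -> (usize, usize) {
    let mut interrupts = Interrupts { in_callback: false };
    let mut ticker = AudioTicker::seed(seeded);
    for _ in 0..steps {
        ticker.try_inject_audio_callback(&mut interrupts);
    }
    (ticker.delivered.len(), ticker.pending_fires.len())
}

fn main() {
    let (delivered, queued) = simulate_blocked_callback(8, 1000);
    println!("delivered={delivered} queued={queued}");
}
```

Under these assumptions, seeding more fires changes only when the first callback arrives; the number of attempts has no effect on how many fires get through while the first callback stays blocked.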

3× determinism

73e99d60029128b4d5c3dd98e540457d82a52b8a962e7495132be2be31411aca  /tmp/digest_eager_1.json
73e99d60029128b4d5c3dd98e540457d82a52b8a962e7495132be2be31411aca  /tmp/digest_eager_2.json
73e99d60029128b4d5c3dd98e540457d82a52b8a962e7495132be2be31411aca  /tmp/digest_eager_3.json

All three cold runs produce byte-identical digest JSON. The seed-at-register implementation is fully deterministic in lockstep mode (the ticker accumulator gets pre-populated synchronously inside the register handler, no host-thread non-determinism).
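The determinism check amounts to comparing the hash column of the listing above. A minimal sketch, assuming `sha256sum`-style lines of the form `<hash>  <path>`; the helper names `hashes` and `runs_are_identical` are illustrative and not part of the repository.

```rust
/// Extract the hash column from `sha256sum`-style lines ("<hash>  <path>").
/// Blank lines yield no token and are skipped.
fn hashes(listing: &str) -> Vec<&str> {
    listing
        .lines()
        .filter_map(|line| line.split_whitespace().next())
        .collect()
}

/// True when at least `min_runs` hashes are present and all of them are equal.
fn runs_are_identical(listing: &str, min_runs: usize) -> bool {
    let hs = hashes(listing);
    hs.len() >= min_runs && hs.windows(2).all(|pair| pair[0] == pair[1])
}

fn main() {
    // The three digest lines reported in this section.
    let listing = "\
73e99d60029128b4d5c3dd98e540457d82a52b8a962e7495132be2be31411aca  /tmp/digest_eager_1.json
73e99d60029128b4d5c3dd98e540457d82a52b8a962e7495132be2be31411aca  /tmp/digest_eager_2.json
73e99d60029128b4d5c3dd98e540457d82a52b8a962e7495132be2be31411aca  /tmp/digest_eager_3.json";
    println!("identical across 3 runs: {}", runs_are_identical(listing, 3));
}
```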

Digest JSON

{
  "instructions": 50000007,
  "imports": 40390,
  "unimpl": 0,
  "draws": 0,
  "swaps": 1,
  "unique_render_targets": 0,
  "shader_blobs_live": 0,
  "texture_cache_entries": 0
}

imports/unimpl unchanged from the C+22 baseline (40390/0).
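The gate logic applied to the digest can be written down explicitly. This is a sketch under stated assumptions: the `Digest` struct and the functions `progressed` and `imports_stable` are hypothetical helpers, the field names follow the digest JSON keys, and the values are the ones reported in this document (the baseline instruction count is not stated, so that field is omitted).

```rust
/// Subset of the digest counters used by the gates; names follow the JSON keys.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Digest {
    imports: u64,
    unimpl: u64,
    draws: u64,
    swaps: u64,
}

/// Primary gate: progression requires swaps or draws to increase.
fn progressed(pre: &Digest, post: &Digest) -> bool {
    post.swaps > pre.swaps || post.draws > pre.draws
}

/// Secondary check: imports and unimpl must match the baseline exactly.
fn imports_stable(pre: &Digest, post: &Digest) -> bool {
    post.imports == pre.imports && post.unimpl == pre.unimpl
}

fn main() {
    // Baseline and post-fix values as reported in this document.
    let pre = Digest { imports: 40390, unimpl: 0, draws: 0, swaps: 1 };
    let post = Digest { imports: 40390, unimpl: 0, draws: 0, swaps: 1 };
    println!(
        "progressed={} imports_stable={}",
        progressed(&pre, &post),
        imports_stable(&pre, &post)
    );
}
```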

Phase B invariant

image_loaded_sha256 = ea8d160e9369328a5b922258a92113efb8d7ce3e1a5c12cc521e375985c91c18

UNCHANGED — Phase B is not affected by audio-runtime changes.

Per-chain matched-prefix

100M-instruction cold trace vs canary baseline (xenia-rs/audit-runs/phase-d-stage1/canary-cvaroff-trunc.jsonl — pre Phase D D-extension absorber, so main reads at the C+18 102,424 value, NOT the post-D-extension 105,046).

| chain | pre-fix | post-fix | delta | first divergence |
|-------|---------|----------|-------|------------------|
| canary tid=4 → ours tid=11 | 11 | 11 | 0 | (preserved) |
| canary tid=6 → ours tid=1 | 102,424 | 102,424 | 0 | NtQueryFullAttributesFile (C+18-era) |
| canary tid=7 → ours tid=2 | 32 | 32 | 0 | (preserved) |
| canary tid=12 → ours tid=7 | 4 | 4 | 0 | C+23 idx=4 |
| canary tid=14 → ours tid=9 | 41 | 41 | 0 | (no advance — primary target) |
| canary tid=15 → ours tid=10 | 16 | 16 | 0 | (no advance — primary target) |

The two primary targets (tid=14→9 and tid=15→10) were the audio worker guest threads spinning on the uninitialized voice struct. Their matched-prefix did NOT advance.
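For reference, a matched-prefix count of the kind reported above can be computed as the length of the longest common prefix of two per-thread event sequences. This is a minimal sketch, not the project's trace tooling: `matched_prefix` and `first_divergence` are hypothetical names, and the event sequences in `main` are invented for illustration rather than taken from a real trace.

```rust
/// Length of the longest common prefix of two per-thread event sequences.
fn matched_prefix<T: PartialEq>(canary: &[T], ours: &[T]) -> usize {
    canary
        .iter()
        .zip(ours.iter())
        .take_while(|(a, b)| a == b)
        .count()
}

/// Index of the first differing event, or None when one sequence is a
/// prefix of the other (including the case where both are equal).
fn first_divergence<T: PartialEq>(canary: &[T], ours: &[T]) -> Option<usize> {
    let n = matched_prefix(canary, ours);
    if n < canary.len() && n < ours.len() {
        Some(n)
    } else {
        None
    }
}

fn main() {
    // Illustrative sequences only; not data from the 100M-instruction trace.
    let canary = ["NtCreateEvent", "KeSetEvent", "KeWaitForMultipleObjects", "NtClose"];
    let ours = ["NtCreateEvent", "KeSetEvent", "NtClose"];
    println!(
        "matched_prefix={} first_divergence={:?}",
        matched_prefix(&canary, &ours),
        first_divergence(&canary, &ours)
    );
}
```

A fix that unblocks a thread should raise its matched-prefix value; an unchanged value, as in the table above, means the thread diverges at the same event as before.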

Kernel tests

  • Pre: 217 passed
  • Post: 221 passed (+4 new seed_fires_for_* tests)
  • Failures: 0
  • All existing tests pass

Build

cargo build --release clean. One pre-existing dead-code warning unrelated to this fix.

Total LOC

| file | added | removed |
|------|-------|---------|
| crates/xenia-kernel/src/xaudio.rs | 86 | 0 |
| crates/xenia-kernel/src/exports.rs | 18 | 5 |
| total | 104 | 5 |

Net ~100 LOC, of which ~60 LOC are tests + doc comments. Engine logic delta is ~25 LOC.

Conclusion

The implementation lands cleanly:

  • 3× cold-deterministic
  • Phase B unchanged
  • All tests pass
  • Sister chains preserved

But the progression metric (swaps/draws) did NOT move. This is an HONEST NEGATIVE RESULT: the eager-seed approach addresses the symptom (the ticker delays the first callback) but not the root cause. The callback at 0x824D6640 still blocks on guest dispatchers that only tid=9/10 can signal, and tid=9/10 are stuck on a voice-struct field that the callback would need to clear. It does not, because ours blocks first, while canary's callback takes a DIFFERENT control-flow path that doesn't reach the KeWaitForMultipleObjects in the first place.

The deeper fix requires either:

  • Identifying the guest write that initializes [r31+356] to 0 in canary's boot path and ensuring ours produces the same write.
  • A true host-side audio worker thread that can run the callback in a host context (substantial threading-model rework).

Both are out of scope for this session per the brief's "Don't widen scope" tripstone.