xenia-rs/audit-runs/review-a-step1-force-spawn/spec.md
MechaCat02 ef93a4fa14 handoff: VSync/event-wedge fixes + iterate 2.A–2.BC research notes
Source changes (dormant parity infra, retained from iterate 2.AI/2.AO):
- xenia-kernel/exports.rs: nt_create_event manual_reset polarity +
  related event wiring
- xenia-gpu/mmio_region.rs: D1MODE_VBLANK_VLINE_STATUS hardcode parity

Also lands the audit-runs/ analysis notes (.md/.txt/.json digests) for the
iterate 2.x VSync/0x10e8/0x1004 wedge investigation. Raw trace dumps
(.jsonl/.gz/.csv/.stdout) and agent worktrees (.claude/) are gitignored as
regenerable local artifacts — see memory + HANDOFF for the running findings.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-05 07:19:08 +02:00


Review A Step 1 — --force-spawn-workers crowbar spec

Date: 2026-05-27 Status: LANDED; PRIMARY gate FAIL (progression metric unmoved).

This run re-validates and documents the pre-existing v1/v2/v3 crowbar implementation under the canonical "Review A Step 1" framing. The implementation already lives in the working tree (treated as committed, but not yet staged with git add). This session re-runs the gates and records the default-OFF determinism result and the PRIMARY-gate verdict on the present HEAD.

Implementation surface (already in working tree)

  • crates/xenia-kernel/src/exports.rs

    • CROWBAR_WORKER_ENTRIES = [0x82506528, 0x82506558, 0x82506588, 0x825065B8]
    • CROWBAR_VTABLE_BASE = 0x8200_A1E8 (reading-error #37 honoured: this is the vtable BASE, not slot-N)
    • CROWBAR_STACK_SIZE = 65_536
    • crowbar_spawn_one_worker(): allocates thread image, allocates handle, spawns via state.scheduler.spawn(SpawnParams { ... }) with create_suspended=true, affinity=0, priority=0, retains self-ref.
    • crowbar_dump_vtable_region(): read-only diag dumping 128 vtable u32 slots so we see slots 35-38 (offsets 140/144/148/152) used by the worker entry stubs.
    • crowbar_maybe_install_vtable_from_file(): v2 opt-in via env var XENIA_CROWBAR_VTABLE_BIN; no-op if unset (this run leaves it unset because our build already has the vtable populated; see results).
    • crowbar_maybe_install_ctx_from_file(): v3 opt-in via env var XENIA_CROWBAR_CTX_BIN; installs canary-captured 64-byte ctx blob (vptr / self / self / refcount / sentinels / secondary-obj-ptr / float).
    • crowbar_force_spawn_workers(): orchestrator. Allocates a 0x1000 ctx page, installs {vptr, self, self, refcount=1} POD-copy head, optionally installs vtable + ctx blobs, spawns 4 workers, resumes 4 workers. Returns count resumed.
  • crates/xenia-kernel/src/state.rs

    • KernelState::crowbar_workers_enabled (bool, default false)
    • KernelState::crowbar_workers_trigger_instr (u64, default 20_000_000)
    • KernelState::crowbar_workers_fired (bool latch)
    • KernelState::try_fire_crowbar_workers(&mut self, &GuestMemory, instruction_count): at-most-once helper; no-op when disabled, when already fired, or before threshold.
  • crates/xenia-app/src/main.rs

    • --force-spawn-workers CLI flag on the Exec subcommand (line ~278) → sets XENIA_CROWBAR_WORKERS=1 for downstream wire-up (line ~455).
    • Env-var wire-up in cmd_exec_inner (~line 1212): reads XENIA_CROWBAR_WORKERS=1 and XENIA_CROWBAR_TRIGGER_INSTR=N.
    • The trigger call is at coord_pre_round (~line 2479) inside the per-round prologue, gated on kernel.crowbar_workers_enabled && !kernel.crowbar_workers_fired.
    • check subcommand has NO --force-spawn-workers flag; activation via env var works for both exec and check.
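The slot/offset pairs quoted for crowbar_dump_vtable_region() can be checked with a few lines of arithmetic. The following is a minimal sketch, not the code in exports.rs: it assumes 4-byte (u32) slots as stated above, and the helper names slot_offset and slot_address are hypothetical.

```rust
/// Vtable base address quoted in the spec (the BASE, not slot N).
const CROWBAR_VTABLE_BASE: u32 = 0x8200_A1E8;
/// Number of u32 slots covered by the diagnostic dump.
const DUMPED_SLOTS: u32 = 128;
/// Size in bytes of one vtable slot (a 32-bit guest pointer).
const SLOT_BYTES: u32 = 4;

/// Byte offset of `slot` from the vtable base (hypothetical helper).
fn slot_offset(slot: u32) -> u32 {
    slot * SLOT_BYTES
}

/// Guest address of `slot`, or `None` when the slot lies outside the
/// 128-slot window that the dump covers (hypothetical helper).
fn slot_address(slot: u32) -> Option<u32> {
    if slot >= DUMPED_SLOTS {
        return None;
    }
    CROWBAR_VTABLE_BASE.checked_add(slot_offset(slot))
}
```

Under these assumptions, slots 35 through 38 map to offsets 140, 144, 148 and 152, all inside the dumped window.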

Crowbar firing-moment choice

Option β = fixed instruction-count threshold of 20M instructions.

20M instructions ≈ 3 s of wall-clock time at lockstep cadence, well past:

  • the 10-thread initial spawn burst that peaks around boot-init VdSwap, and
  • the AUDIT-049 wedge crystallisation at host_ns ≈ 1.728 s (~12-15M instr).

The trigger fires once and latches crowbar_workers_fired = true so the helper is at-most-once per process lifetime.
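The at-most-once behaviour can be sketched as a small latch. This is an illustration only: CrowbarLatch is a hypothetical stand-in for the three crowbar fields on KernelState, it omits the GuestMemory argument and the actual spawn, and the >= comparison at the threshold is an assumption (the spec only says the helper is a no-op "before threshold").

```rust
/// Hypothetical stand-in for the crowbar fields on KernelState.
struct CrowbarLatch {
    enabled: bool,      // mirrors crowbar_workers_enabled
    trigger_instr: u64, // mirrors crowbar_workers_trigger_instr
    fired: bool,        // mirrors crowbar_workers_fired
}

impl CrowbarLatch {
    /// New latch with the default 20M-instruction threshold.
    fn new(enabled: bool) -> Self {
        Self { enabled, trigger_instr: 20_000_000, fired: false }
    }

    /// Returns true only on the single call that first reaches the
    /// threshold; a no-op when disabled, already fired, or too early.
    fn try_fire(&mut self, instruction_count: u64) -> bool {
        if !self.enabled || self.fired || instruction_count < self.trigger_instr {
            return false;
        }
        // Latch before doing any work so a re-entrant call cannot fire twice.
        self.fired = true;
        true
    }
}
```

Calling try_fire once per round with the running instruction count would fire on the first round at or past the threshold and never again in the same process.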

Default-off invariant

  • crowbar_workers_enabled defaults to false in KernelState::with_gpu().
  • The trigger condition kernel.crowbar_workers_enabled && !kernel.crowbar_workers_fired short-circuits the helper when disabled.
  • Env-var read returns false when XENIA_CROWBAR_WORKERS is unset.
  • Therefore: zero behaviour change in normal runs. 3× OFF cold runs are bit-identical (see re-validation.md).
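The default-off env-var read might look roughly like the sketch below. It is not the wire-up in cmd_exec_inner: the function names are hypothetical, and two behaviours are assumptions rather than documented facts, namely that only the exact value "1" enables the crowbar and that a missing or malformed XENIA_CROWBAR_TRIGGER_INSTR falls back to the 20M default.

```rust
/// Default threshold from KernelState (20M instructions).
const DEFAULT_TRIGGER_INSTR: u64 = 20_000_000;

/// Enabled only when the variable is set to exactly "1" (assumption);
/// unset therefore yields false, preserving the default-off invariant.
fn parse_enabled(raw: Option<&str>) -> bool {
    raw == Some("1")
}

/// Parses the threshold; falls back to the default when the variable is
/// unset or not a valid u64 (assumed fallback behaviour).
fn parse_trigger(raw: Option<&str>) -> u64 {
    raw.and_then(|s| s.trim().parse::<u64>().ok())
        .unwrap_or(DEFAULT_TRIGGER_INSTR)
}

/// Reads both variables from the process environment.
fn read_crowbar_env() -> (bool, u64) {
    let enabled = std::env::var("XENIA_CROWBAR_WORKERS").ok();
    let trigger = std::env::var("XENIA_CROWBAR_TRIGGER_INSTR").ok();
    (
        parse_enabled(enabled.as_deref()),
        parse_trigger(trigger.as_deref()),
    )
}
```

Keeping the parsing separate from the environment read makes the unset case easy to exercise without mutating process state.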

Determinism under crowbar-on

3× ON cold runs are bit-identical (instructions=20000159, identical fault PC/LR/CTR/r3/r4/r29/r30/r31, identical tid=16). The crowbar fires deterministically at the threshold instruction count, the 4 spawned tids are bit-stable across runs, and the fault site is bit-stable.
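A bit-identity check across cold runs could be expressed as below. This is a hedged sketch: RunSummary and first_divergence are hypothetical names, the field set is a reduced illustration of the values listed above, and how the summaries are extracted from run output is left out.

```rust
/// Hypothetical per-run digest; fields loosely follow the values the
/// spec compares (instruction count, fault registers, tids).
#[derive(Debug, Clone, PartialEq, Eq)]
struct RunSummary {
    instructions: u64,
    fault_pc: u32,
    fault_lr: u32,
    fault_tid: u32,
    spawned_tids: Vec<u32>,
}

/// Returns the index of the first run that differs from run 0, or
/// `None` when every run matches (or fewer than two runs are given).
fn first_divergence(runs: &[RunSummary]) -> Option<usize> {
    let (reference, rest) = runs.split_first()?;
    rest.iter()
        .position(|run| run != reference)
        .map(|i| i + 1) // shift back to an index into `runs`
}
```

With three ON runs collected into a slice, a result of None corresponds to the "bit-identical" verdict; any Some(i) names the first run to inspect.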