xenia-rs/audit-runs/review-a-step1-force-spawn/spec.md
MechaCat02 ef93a4fa14 handoff: VSync/event-wedge fixes + iterate 2.A–2.BC research notes
Source changes (dormant parity infra, retained from iterate 2.AI/2.AO):
- xenia-kernel/exports.rs: nt_create_event manual_reset polarity +
  related event wiring
- xenia-gpu/mmio_region.rs: D1MODE_VBLANK_VLINE_STATUS hardcode parity

Also lands the audit-runs/ analysis notes (.md/.txt/.json digests) for the
iterate 2.x VSync/0x10e8/0x1004 wedge investigation. Raw trace dumps
(.jsonl/.gz/.csv/.stdout) and agent worktrees (.claude/) are gitignored as
regenerable local artifacts — see memory + HANDOFF for the running findings.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-05 07:19:08 +02:00

# Review A Step 1 — `--force-spawn-workers` crowbar spec
**Date**: 2026-05-27

**Status**: LANDED; PRIMARY gate FAIL (progression metric unmoved).

This run re-validates and documents the pre-existing v1/v2/v3 crowbar
implementation under the canonical "Review A Step 1" framing. The
implementation already lives in the working tree (treated as committed,
but not yet `git add`-ed). This session re-runs the gates and records the
default-OFF determinism result and the PRIMARY-gate verdict on the
present HEAD.

## Implementation surface (already in working tree)
- `crates/xenia-kernel/src/exports.rs`
  - `CROWBAR_WORKER_ENTRIES = [0x82506528, 0x82506558, 0x82506588, 0x825065B8]`
  - `CROWBAR_VTABLE_BASE = 0x8200_A1E8` (reading-error #37 honoured: this is the vtable BASE, not slot-N)
  - `CROWBAR_STACK_SIZE = 65_536`
  - `crowbar_spawn_one_worker()`: allocates the thread image, allocates a
    handle, spawns via `state.scheduler.spawn(SpawnParams { ... })` with
    `create_suspended=true, affinity=0, priority=0`, and retains a self-ref.
  - `crowbar_dump_vtable_region()`: read-only diagnostic that dumps 128
    vtable u32 slots, so that slots 35-38 (byte offsets 140/144/148/152),
    which the worker entry stubs use, are visible.
  - `crowbar_maybe_install_vtable_from_file()`: v2 opt-in via env var
    `XENIA_CROWBAR_VTABLE_BIN`; no-op if unset (this run leaves it unset
    because ours already has the vtable populated; see results).
  - `crowbar_maybe_install_ctx_from_file()`: v3 opt-in via env var
    `XENIA_CROWBAR_CTX_BIN`; installs the canary-captured 64-byte ctx
    blob (vptr / self / self / refcount / sentinels /
    secondary-obj-ptr / float).
  - `crowbar_force_spawn_workers()`: orchestrator. Allocates a 0x1000
    ctx page, installs the `{vptr, self, self, refcount=1}` POD-copy
    head, optionally installs the vtable + ctx blobs, spawns 4 workers,
    then resumes all 4. Returns the count resumed.
- `crates/xenia-kernel/src/state.rs`
  - `KernelState::crowbar_workers_enabled` (bool, default false)
  - `KernelState::crowbar_workers_trigger_instr` (u64, default `20_000_000`)
  - `KernelState::crowbar_workers_fired` (bool latch)
  - `KernelState::try_fire_crowbar_workers(&mut self, &GuestMemory, instruction_count)`:
    at-most-once helper; no-op when disabled, when already fired, or
    before the threshold.
- `crates/xenia-app/src/main.rs`
  - `--force-spawn-workers` CLI flag on the `Exec` subcommand
    (line ~278) → sets `XENIA_CROWBAR_WORKERS=1` for downstream wire-up
    (line ~455).
  - Env-var wire-up in `cmd_exec_inner` (~line 1212): reads
    `XENIA_CROWBAR_WORKERS=1` and `XENIA_CROWBAR_TRIGGER_INSTR=N`.
  - **The trigger call** is at `coord_pre_round` (~line 2479) inside
    the per-round prologue, gated on
    `kernel.crowbar_workers_enabled && !kernel.crowbar_workers_fired`.
  - The `check` subcommand has NO `--force-spawn-workers` flag; activation
    via env var works for both `exec` and `check`.
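
The slot numbers and byte offsets quoted for `crowbar_dump_vtable_region()` are tied together by the 4-byte width of a u32 slot. The following is a minimal sketch of that arithmetic, assuming slot N sits at `CROWBAR_VTABLE_BASE + 4 * N`. The names `SLOT_SIZE`, `DUMP_SLOTS`, `slot_offset`, `slot_address` and `dump_end` are hypothetical and are not taken from `exports.rs`.

```rust
/// Vtable base address quoted in this spec (the BASE, not slot N).
const CROWBAR_VTABLE_BASE: u32 = 0x8200_A1E8;
/// Each vtable slot is one u32, i.e. 4 bytes (assumption stated in the lead-in).
const SLOT_SIZE: u32 = 4;
/// Number of slots the diagnostic dump covers, per this spec.
const DUMP_SLOTS: u32 = 128;

/// Byte offset of `slot` relative to the vtable base.
fn slot_offset(slot: u32) -> u32 {
    slot * SLOT_SIZE
}

/// Guest address of `slot`.
fn slot_address(slot: u32) -> u32 {
    CROWBAR_VTABLE_BASE + slot_offset(slot)
}

/// First guest address past the dumped region (exclusive end).
fn dump_end() -> u32 {
    slot_address(DUMP_SLOTS)
}
```

Under these assumptions, slots 35 through 38 land at byte offsets 140, 144, 148 and 152, matching the figures above, and all four fall inside the 128-slot (512-byte) dump window.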
## Crowbar firing-moment choice
**Option β = fixed instruction-count threshold of 20M instructions.**

20M instructions ≈ 3 s wallclock at lockstep cadence, well past:
- the 10-thread initial spawn burst that peaks around boot-init
  VdSwap, and
- the AUDIT-049 wedge crystallisation at host_ns ≈ 1.728 s
  (~12-15M instr).

The trigger fires once and latches `crowbar_workers_fired = true`, so
the helper runs at most once per process lifetime.
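
The at-most-once behaviour can be pictured as a small latch. This is a sketch only: `CrowbarLatch` is a hypothetical stand-in for the three crowbar fields on `KernelState`, and the choice of `>=` rather than `>` at the threshold is an assumption, since the spec only says the helper is a no-op "before threshold".

```rust
/// Hypothetical stand-in for the crowbar fields on `KernelState`.
struct CrowbarLatch {
    enabled: bool,
    trigger_instr: u64,
    fired: bool,
}

impl CrowbarLatch {
    /// Builds a latch with the spec's default threshold of 20M instructions.
    fn new(enabled: bool) -> Self {
        Self {
            enabled,
            trigger_instr: 20_000_000,
            fired: false,
        }
    }

    /// Returns `true` on the first call made while enabled with
    /// `instruction_count` at or past the threshold, and `false` on
    /// every other call (disabled, already fired, or too early).
    fn try_fire(&mut self, instruction_count: u64) -> bool {
        if !self.enabled || self.fired || instruction_count < self.trigger_instr {
            return false;
        }
        // Latch before doing any work so a re-entrant call cannot fire twice.
        self.fired = true;
        true
    }
}
```

Because the check is polled from the per-round prologue rather than on every instruction, the first firing call may see a count somewhat above the threshold; the latch only guarantees that it happens once.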
## Default-off invariant
- `crowbar_workers_enabled` defaults to `false` in `KernelState::with_gpu()`.
- The trigger condition `kernel.crowbar_workers_enabled && !kernel.crowbar_workers_fired`
short-circuits the helper when disabled.
- Env-var read returns `false` when `XENIA_CROWBAR_WORKERS` is unset.
- Therefore: zero behaviour change in normal runs. 3× OFF cold runs
are bit-identical (see `re-validation.md`).
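
The default-off reading of the two environment variables could look roughly like the sketch below. It is not the code in `cmd_exec_inner`: the helper names are hypothetical, treating only the literal value `1` as "enabled" is an assumption based on the `XENIA_CROWBAR_WORKERS=1` wording, and falling back to the 20M default on an unparsable `XENIA_CROWBAR_TRIGGER_INSTR` is likewise an assumption.

```rust
use std::env;

/// Default threshold quoted in this spec.
const DEFAULT_TRIGGER_INSTR: u64 = 20_000_000;

/// Hypothetical helper: enabled only when the variable is exactly "1".
/// An unset variable (`None`) therefore yields `false`.
fn crowbar_enabled_from(value: Option<&str>) -> bool {
    value == Some("1")
}

/// Hypothetical helper: parse the threshold, falling back to the default
/// when the variable is unset or not a valid decimal u64 (assumption).
fn trigger_instr_from(value: Option<&str>) -> u64 {
    value
        .and_then(|v| v.trim().parse::<u64>().ok())
        .unwrap_or(DEFAULT_TRIGGER_INSTR)
}

/// Reads both variables from the process environment.
fn read_crowbar_env() -> (bool, u64) {
    let enabled = env::var("XENIA_CROWBAR_WORKERS").ok();
    let trigger = env::var("XENIA_CROWBAR_TRIGGER_INSTR").ok();
    (
        crowbar_enabled_from(enabled.as_deref()),
        trigger_instr_from(trigger.as_deref()),
    )
}
```

Keeping the parsing in pure functions that take `Option<&str>` makes the unset case directly testable without mutating the process environment.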
## Determinism under crowbar-on
3× ON cold runs are bit-identical (`instructions=20000159`, identical
fault PC/LR/CTR/r3/r4/r29/r30/r31, identical tid=16). The crowbar
fires deterministically at the threshold instruction count, the 4
spawned tids are bit-stable across runs, and the fault site is
bit-stable.
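
A bit-identity check over repeated cold runs can be expressed as a field-for-field comparison of per-run digests. The sketch below is illustrative: `RunSummary` and `runs_are_identical` are hypothetical names, and the struct only mirrors the values this section lists (instruction count, fault PC/LR/CTR, r3/r4/r29/r30/r31, and tid). It does not describe the actual digest format used by the audit tooling.

```rust
/// Hypothetical per-run digest mirroring the values compared in this section.
#[derive(Debug, Clone, PartialEq, Eq)]
struct RunSummary {
    instructions: u64,
    fault_pc: u32,
    fault_lr: u32,
    fault_ctr: u32,
    /// r3, r4, r29, r30, r31 at the fault site, in that order.
    gprs: [u64; 5],
    fault_tid: u32,
}

/// Returns `true` when every adjacent pair of runs compares equal, which
/// by transitivity means all runs are equal. Slices with fewer than two
/// runs are trivially identical.
fn runs_are_identical(runs: &[RunSummary]) -> bool {
    runs.windows(2).all(|pair| pair[0] == pair[1])
}
```

Any single differing field, including an instruction count that is off by one, is enough to make the check fail.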