handoff: VSync/event-wedge fixes + iterate 2.A–2.BC research notes

Source changes (dormant parity infra, retained from iterate 2.AI/2.AO):
- xenia-kernel/exports.rs: nt_create_event manual_reset polarity +
  related event wiring
- xenia-gpu/mmio_region.rs: D1MODE_VBLANK_VLINE_STATUS hardcode parity

Also lands the audit-runs/ analysis notes (.md/.txt/.json digests) for the
iterate 2.x VSync/0x10e8/0x1004 wedge investigation. Raw trace dumps
(.jsonl/.gz/.csv/.stdout) and agent worktrees (.claude/) are gitignored as
regenerable local artifacts — see memory + HANDOFF for the running findings.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
MechaCat02
2026-06-05 07:19:08 +02:00
parent acd1656753
commit ef93a4fa14
620 changed files with 108303 additions and 1 deletions

@@ -0,0 +1,85 @@
# Review A Step 1 — `--force-spawn-workers` crowbar spec
**Date**: 2026-05-27
**Status**: LANDED; PRIMARY gate FAIL (progression metric unmoved).
This run re-validates and documents the pre-existing v1/v2/v3 crowbar
implementation under the canonical "Review A Step 1" framing. The
implementation already lives in the working tree (complete, but not
yet staged with `git add`). This session re-runs the gates and lands the
default-OFF determinism result and the PRIMARY-gate verdict on the present HEAD.
## Implementation surface (already in working tree)
- `crates/xenia-kernel/src/exports.rs`
  - `CROWBAR_WORKER_ENTRIES = [0x82506528, 0x82506558, 0x82506588, 0x825065B8]`
  - `CROWBAR_VTABLE_BASE = 0x8200_A1E8` (reading-error #37 honoured: this is the vtable BASE, not slot-N)
  - `CROWBAR_STACK_SIZE = 65_536`
  - `crowbar_spawn_one_worker()`: allocates the thread image, allocates a
    handle, spawns via `state.scheduler.spawn(SpawnParams { ... })` with
    `create_suspended=true, affinity=0, priority=0`, and retains a self-ref.
  - `crowbar_dump_vtable_region()`: read-only diagnostic that dumps 128 vtable
    u32 slots so we can see slots 35-38 (offsets 140/144/148/152) used by
    the worker entry stubs.
  - `crowbar_maybe_install_vtable_from_file()`: v2 opt-in via env var
    `XENIA_CROWBAR_VTABLE_BIN`; no-op if unset (this run leaves it unset
    because our build already has the vtable populated; see results).
  - `crowbar_maybe_install_ctx_from_file()`: v3 opt-in via env var
    `XENIA_CROWBAR_CTX_BIN`; installs the canary-captured 64-byte ctx
    blob (vptr / self / self / refcount / sentinels /
    secondary-obj-ptr / float).
  - `crowbar_force_spawn_workers()`: orchestrator. Allocates a 0x1000
    ctx page, installs the `{vptr, self, self, refcount=1}` POD-copy
    head, optionally installs the vtable + ctx blobs, spawns 4 workers,
    then resumes them. Returns the count resumed.
- `crates/xenia-kernel/src/state.rs`
  - `KernelState::crowbar_workers_enabled` (bool, default false)
  - `KernelState::crowbar_workers_trigger_instr` (u64, default
    `20_000_000`)
  - `KernelState::crowbar_workers_fired` (bool latch)
  - `KernelState::try_fire_crowbar_workers(&mut self, &GuestMemory,
    instruction_count)`: at-most-once helper; no-op when disabled, when
    already fired, or before threshold.
- `crates/xenia-app/src/main.rs`
  - `--force-spawn-workers` CLI flag on the `Exec` subcommand
    (line ~278) → sets `XENIA_CROWBAR_WORKERS=1` for downstream wire-up
    (line ~455).
  - Env-var wire-up in `cmd_exec_inner` (~line 1212): reads
    `XENIA_CROWBAR_WORKERS=1` and `XENIA_CROWBAR_TRIGGER_INSTR=N`.
  - **The trigger call** is at `coord_pre_round` (~line 2479) inside
    the per-round prologue, gated on
    `kernel.crowbar_workers_enabled && !kernel.crowbar_workers_fired`.
  - `check` subcommand has NO `--force-spawn-workers` flag; activation
    via env var works for both `exec` and `check`.
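
The slot/offset pairing quoted for the vtable dump (slots 35-38 at offsets 140/144/148/152) follows from 4-byte slots counted from the vtable BASE. A minimal sketch of that arithmetic is given here; `slot_offset`, `slot_address` and `DUMP_SLOTS` are hypothetical names for illustration, not identifiers from `exports.rs`.

```rust
/// Vtable base quoted in the spec: the BASE address, not the address of slot N.
const CROWBAR_VTABLE_BASE: u32 = 0x8200_A1E8;
/// Each vtable slot holds one 32-bit guest pointer.
const SLOT_SIZE: u32 = 4;
/// Number of slots the read-only dump covers, per the spec (name is hypothetical).
const DUMP_SLOTS: u32 = 128;

/// Byte offset of slot `n` from the vtable base (hypothetical helper).
fn slot_offset(n: u32) -> u32 {
    n * SLOT_SIZE
}

/// Guest address of slot `n` (hypothetical helper).
fn slot_address(n: u32) -> u32 {
    CROWBAR_VTABLE_BASE + slot_offset(n)
}

/// True when slot `n` falls inside the dumped region (hypothetical helper).
fn slot_is_dumped(n: u32) -> bool {
    n < DUMP_SLOTS
}
```

Keeping the base and the per-slot offset as separate steps is one way to avoid the base-versus-slot confusion that reading-error #37 refers to.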
## Crowbar firing-moment choice
**Option β = fixed instruction-count threshold of 20M instructions.**
20M instructions ≈ 3 s wallclock at lockstep cadence, well past:
- the 10-thread initial spawn burst that peaks around boot-init
  VdSwap, and
- the AUDIT-049 wedge crystallisation at host_ns ≈ 1.728 s
  (~12-15M instructions).
The trigger fires once and latches `crowbar_workers_fired = true`, so
the helper runs at most once per process lifetime.
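
The at-most-once behaviour described above can be sketched as a small latch. This is an illustrative stand-in, not the code in `state.rs`: `CrowbarLatch` and `try_fire` are hypothetical names, the real helper also takes a `&GuestMemory`, and firing on `>=` the threshold (rather than `>`) is an assumption.

```rust
/// Illustrative stand-in for the three crowbar fields on `KernelState`.
#[derive(Debug)]
struct CrowbarLatch {
    enabled: bool,
    trigger_instr: u64,
    fired: bool,
}

impl CrowbarLatch {
    /// Builds a latch with the 20M-instruction default threshold from the spec.
    fn new(enabled: bool) -> Self {
        Self { enabled, trigger_instr: 20_000_000, fired: false }
    }

    /// Returns `true` on the first call where the latch is enabled and the
    /// instruction count has reached the threshold; `false` on every other call.
    /// ASSUMPTION: the comparison is `>=`; the source only says "before threshold".
    fn try_fire(&mut self, instruction_count: u64) -> bool {
        if !self.enabled || self.fired || instruction_count < self.trigger_instr {
            return false;
        }
        // Latch before doing any work so a re-entrant call cannot fire twice.
        self.fired = true;
        true
    }
}
```

Because the check runs in a per-round prologue, the count observed at the firing moment can sit slightly above the threshold, which would be consistent with a reported value such as `instructions=20000159`.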
## Default-off invariant
- `crowbar_workers_enabled` defaults to `false` in `KernelState::with_gpu()`.
- The trigger condition `kernel.crowbar_workers_enabled && !kernel.crowbar_workers_fired`
short-circuits the helper when disabled.
- Env-var read returns `false` when `XENIA_CROWBAR_WORKERS` is unset.
- Therefore: zero behaviour change in normal runs. 3× OFF cold runs
are bit-identical (see `re-validation.md`).
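
As a rough illustration of the default-off chain, the env-var read might look like the sketch below. The variable names `XENIA_CROWBAR_WORKERS` and `XENIA_CROWBAR_TRIGGER_INSTR` come from the spec; `CrowbarConfig`, `parse_crowbar_config` and `crowbar_config_from_env` are hypothetical, and the choices that only the literal `1` enables the crowbar and that a malformed threshold falls back to the default are assumptions.

```rust
/// Default trigger threshold named in the spec.
const DEFAULT_TRIGGER_INSTR: u64 = 20_000_000;

/// Hypothetical parsed view of the two crowbar environment variables.
#[derive(Debug, PartialEq, Eq)]
struct CrowbarConfig {
    enabled: bool,
    trigger_instr: u64,
}

/// Pure parsing step, kept separate from the environment so it is easy to test.
/// ASSUMPTION: only the exact value "1" enables the crowbar.
/// ASSUMPTION: a missing or unparsable threshold falls back to the default.
fn parse_crowbar_config(workers: Option<&str>, trigger: Option<&str>) -> CrowbarConfig {
    let enabled = workers == Some("1");
    let trigger_instr = trigger
        .and_then(|s| s.trim().parse::<u64>().ok())
        .unwrap_or(DEFAULT_TRIGGER_INSTR);
    CrowbarConfig { enabled, trigger_instr }
}

/// Reads the process environment; an unset variable yields `enabled == false`.
#[allow(dead_code)]
fn crowbar_config_from_env() -> CrowbarConfig {
    let workers = std::env::var("XENIA_CROWBAR_WORKERS").ok();
    let trigger = std::env::var("XENIA_CROWBAR_TRIGGER_INSTR").ok();
    parse_crowbar_config(workers.as_deref(), trigger.as_deref())
}
```

With both variables unset, the parsed config is disabled with the 20M default, which matches the "zero behaviour change in normal runs" claim.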
## Determinism under crowbar-on
3× ON cold runs are bit-identical (`instructions=20000159`, identical
fault PC/LR/CTR/r3/r4/r29/r30/r31, identical tid=16). The crowbar
fires deterministically at the threshold instruction count, the 4
spawned tids are bit-stable across runs, and the fault site is
bit-stable.
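
One way to mechanise the "3× runs are bit-identical" check is to compare per-run summaries for exact equality. The sketch below is an assumption about how such a check could be written, not the tooling used for this run; `RunSummary`, its field selection and `all_bit_identical` are hypothetical.

```rust
/// Hypothetical per-run digest; the real gate compares more registers
/// (CTR, r3, r4, r29, r30, r31) than are modelled here.
#[derive(Debug, Clone, PartialEq, Eq)]
struct RunSummary {
    instructions: u64,
    fault_pc: u32,
    fault_lr: u32,
    fault_tid: u32,
    spawned_tids: Vec<u32>,
}

/// True when every run equals its neighbour, and therefore all runs are equal.
/// An empty or single-element slice is vacuously identical.
fn all_bit_identical(runs: &[RunSummary]) -> bool {
    runs.windows(2).all(|pair| pair[0] == pair[1])
}
```

Comparing adjacent pairs is enough because equality is transitive, so three cold runs need only two comparisons.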