handoff: VSync/event-wedge fixes + iterate 2.A–2.BC research notes

Source changes (dormant parity infra, retained from iterate 2.AI/2.AO):
- xenia-kernel/exports.rs: nt_create_event manual_reset polarity +
  related event wiring
- xenia-gpu/mmio_region.rs: D1MODE_VBLANK_VLINE_STATUS hardcode parity

Also lands the audit-runs/ analysis notes (.md/.txt/.json digests) for the
iterate 2.x VSync/0x10e8/0x1004 wedge investigation. Raw trace dumps
(.jsonl/.gz/.csv/.stdout) and agent worktrees (.claude/) are gitignored as
regenerable local artifacts — see memory + HANDOFF for the running findings.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
MechaCat02
2026-06-05 07:19:08 +02:00
parent acd1656753
commit ef93a4fa14
620 changed files with 108303 additions and 1 deletions


@@ -0,0 +1,110 @@
+
+/// Crowbar v3 Step 2 — optionally install full ctx bytes at `ctx_ptr`
+/// from a binary file specified by `XENIA_CROWBAR_CTX_BIN`. Bytes are
+/// written verbatim (no byte-swap) via `write_u8` because the file is
+/// expected to be a raw guest-endian (big-endian) capture of the ctx
+/// layout from canary's runtime memory. Logs the first 16 u32 slots
+/// after install for verification. At most 256 bytes are installed. If
+/// the env var is unset, this only logs a warning and returns, so v2
+/// behaviour is preserved exactly.
+///
+/// Captured via canary's `audit_68_host_mem_read_probe` cvar — see
+/// `xenia-rs/audit-runs/review-a-step1c-crowbar-v3/canary-probe-run1.log`.
+///
+/// **Option γ (per v3 brief)**: install verbatim, including canary-VA
+/// pointer fields like `[ctx+44]=0xBCE25640`. These VAs may be unmapped
+/// in ours's address space — if a worker dereferences one and faults,
+/// that confirms the case-(C) recursion is required (v4 work).
+fn crowbar_maybe_install_ctx_from_file(mem: &GuestMemory, ctx_ptr: u32) {
+ let path = match std::env::var("XENIA_CROWBAR_CTX_BIN") {
+ Ok(p) if !p.is_empty() => p,
+ _ => {
+ tracing::warn!(
+ "CROWBAR: XENIA_CROWBAR_CTX_BIN not set — skipping ctx install \
+ (v2 behaviour; only +0/+4/+8/+12 are populated; \
+ workers will likely fault on [ctx+44] dispatch)"
+ );
+ return;
+ }
+ };
+ let bytes = match std::fs::read(&path) {
+ Ok(b) => b,
+ Err(e) => {
+ tracing::error!(
+ "CROWBAR: failed to read ctx bin {:?}: {} — skipping install",
+ path,
+ e,
+ );
+ return;
+ }
+ };
+ let n = bytes.len().min(256);
+ for (i, b) in bytes.iter().take(n).enumerate() {
+ mem.write_u8(ctx_ptr + i as u32, *b);
+ }
+ tracing::warn!(
+ "CROWBAR: installed {} bytes at ctx_ptr={:#010x} from {:?}",
+ n,
+ ctx_ptr,
+ path,
+ );
+ // Verify: log the first 16 u32 slots after install.
+ for slot in 0..16u32 {
+ let off = slot * 4;
+ let v = mem.read_u32(ctx_ptr + off);
+ tracing::warn!(
+ "CROWBAR: post-ctx-install ctx[+{:>3}] (={:#010x}) = {:#010x}",
+ off,
+ ctx_ptr + off,
+ v,
+ );
+ }
+}
+
+/// Crowbar entry point — allocate the worker ctx, install the vtable
+/// + self-pointer doubly-linked-list head pattern that AUDIT-068 S3
+/// captured, spawn all 4 workers suspended, then resume each one.
+/// Returns the number of workers successfully resumed (0..=4).
+///
+/// **Reading-error #37 discipline**: the value written at `ctx+0` is the
+/// vtable BASE `0x8200A1E8`, NOT the slot-N address `0x8200A208` cited
+/// in older audits. Per AUDIT-068 S3 measurement.
+pub fn crowbar_force_spawn_workers(state: &mut KernelState, mem: &GuestMemory) -> u32 {
+ // 0. Crowbar v2 Step 0 diagnostic — dump 256 bytes at the vtable base
+ // BEFORE doing anything else. Distinguishes case (A) vtable .rdata
+ // is missing/zero in ours vs case (B) .rdata present but vtable[35]
+ // is not statically populated (= runtime install needed). Per
+ // Reading-error #37: 0x8200A1E8 is vtable BASE; slot N is at base+4*N.
+ // For workers we care about slots 35/36/37/38 (offsets 140/144/148/152).
+ // Bump dump to 512 bytes (128 slots) so we see vtable[64] which is read
+ // by the slot-35 callee `sub_82506B08` at +256.
+ crowbar_dump_vtable_region(mem, CROWBAR_VTABLE_BASE, 512);
+
+ // 1. Allocate ctx struct (one heap page is plenty; the real struct is
--
+
+ // 2c. Crowbar v3 Step 2 — full ctx-bytes install.
+ // If the env var `XENIA_CROWBAR_CTX_BIN=<path>` is set AND the file
+ // exists, read up to 256 bytes from it and write them at ctx_ptr.
+ // The file should be a raw guest-endian (big-endian) capture of the
+ // ctx layout — see canary's `audit_68_host_mem_read_probe` cvar.
+ // The v2 init at +0/+4/+8/+12 above is intentionally retained as a
+ // fallback when the env var is unset; the file install overwrites
+ // those four slots verbatim (the bytes match the v2 pattern).
+ //
+ // **Option γ (per v3 brief)**: canary-VA pointer fields like
+ // `[ctx+44]=0xBCE25640` are written as-is even if unmapped in
+ // ours — diagnostic intent is to OBSERVE the fault PC, not avoid
+ // it.
+ crowbar_maybe_install_ctx_from_file(mem, ctx_ptr);
+
+ // 3. Spawn the 4 workers suspended (matching canary jitter sample).
+ let mut handles: [u32; 4] = [0; 4];
+ let mut spawned = 0u32;
+ for (i, entry) in CROWBAR_WORKER_ENTRIES.iter().enumerate() {
+ if let Some(h) = crowbar_spawn_one_worker(state, mem, *entry, ctx_ptr) {
+ handles[i] = h;
+ spawned += 1;
+ }
+ }
+
+ // 4. Resume each spawned worker directly through the scheduler.


@@ -0,0 +1,199 @@
# Crowbar v3 — ctx-state install verbatim
**Date**: 2026-05-21
**Predecessor**: v2 at `audit-runs/review-a-step1b-crowbar-v2/`.
**Status**: LANDED. Hypothesis FALSIFIED: wedge is NOT crowbar-soluble at
the ctx-state-only level. Case (D) needed (recursive secondary-object
install). v3 produces same composite progression score as OFF baseline.
## TL;DR
- v2 found case (C): `[ctx+44]` is a secondary-object pointer.
vtable[36] reads it and dispatches through it.
- v3 captured canary's **actual `[ctx+44]` value** = `0xBCE25640` (via
the `audit_68_host_mem_read_probe` cvar) along with the rest of the
64-byte ctx head, then installed that state verbatim in ours.
- Worker tid=15 now passes the `[ctx+44]` load (loads `0xBCE25640`
into r3) but **`0xBCE25640` is unmapped in ours's address space**
(ours's allocator returns 0x4D1Dxxxx VAs; canary's xenon-arena VAs
in the `0xBCExxxxx` range have no equivalent in ours).
- Reading `[0xBCE25640]` returns 0 → `CTR=0` → `bctrl` faults at PC=0
with `r3=0xbce25640` (was `r3=0x0` in v2 — confirming the install
worked, just deeper recursion needed).
- 3x OFF / 3x ON runs deterministic: `swaps=1, draws=0,
unique_render_targets=0` identical. **Composite progression Δ = 0.**
## Captured canary ctx state
Canary cold run (90s, `--mute=true`), with cvars:
```
--audit_61_branch_probe_pcs=0x825070F0
--audit_68_host_mem_read_probe=0xBCE251C0:8:1000000,0xBCE251C8:8:1000000,
0xBCE251D0:8:1000000,0xBCE251D8:8:1000000,
0xBCE251E0:8:1000000,0xBCE251E8:8:1000000,
0xBCE251F0:8:1000000,0xBCE251F8:8:1000000
```
AUDIT-061-BR confirmed ctx_ptr=`0xBCE251C0` (per AUDIT-068 S3 expectation;
no arena drift in this run). Read probe captured the install timeline:
| host time | event |
|--------:|-------|
| 9.556 s | Install starts: `[ctx+0]=0x8200A1E8` (vtable), `[ctx+4]=ctx`, `[ctx+8]=ctx`, `[ctx+12]=1` (refcount), `[ctx+16]=0x01000000`, `[ctx+32]=0xFFFFFFFF` |
| 9.571 s | `[ctx+44]=0xBCE25640` written, `[ctx+48]=0xBE568F00` written (looks float-ish) |
| 9.754 s | Transient `[ctx+32]=1` and `[ctx+40]=0x30057018` writes that are cleared next probe tick — likely temporary scratch during a function call |
| 9.755 s | Stable post-install state |
Final ctx bytes (saved at `ctx-canary.bin`):
```
+ 0: 82 00 A1 E8 BC E2 51 C0 BC E2 51 C0 00 00 00 01 <- vptr / self / self / refcount
+ 16: 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
+ 32: FF FF FF FF 00 00 00 00 00 00 00 00 BC E2 56 40 <- ...sentinel... / [ctx+44]=0xBCE25640
+ 48: BE 56 8F 00 00 00 00 00 00 00 00 00 00 00 00 00 <- [ctx+48]=0xBE568F00 (-0.21f?)
```
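
The dump above can be cross-checked mechanically. The following is a minimal sketch, not code from the repository: `CTX_HEAD` is a hand transcription of the four rows shown above, and `be_u32_at` is a hypothetical helper that reads one guest-endian (big-endian) slot. It also reinterprets `[ctx+48]` as an IEEE-754 single to test the "float-ish" reading.

```rust
/// Hand transcription of the 64-byte ctx head shown in the dump above.
const CTX_HEAD: [u8; 64] = [
    // +0: vptr / self / self / refcount
    0x82, 0x00, 0xA1, 0xE8, 0xBC, 0xE2, 0x51, 0xC0,
    0xBC, 0xE2, 0x51, 0xC0, 0x00, 0x00, 0x00, 0x01,
    // +16
    0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
    // +32: sentinel ... [ctx+44]
    0xFF, 0xFF, 0xFF, 0xFF, 0x00, 0x00, 0x00, 0x00,
    0x00, 0x00, 0x00, 0x00, 0xBC, 0xE2, 0x56, 0x40,
    // +48: [ctx+48]
    0xBE, 0x56, 0x8F, 0x00, 0x00, 0x00, 0x00, 0x00,
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
];

/// Read the big-endian u32 at byte offset `off` (hypothetical helper).
/// Returns None if the 4-byte slot does not fit inside `bytes`.
fn be_u32_at(bytes: &[u8], off: usize) -> Option<u32> {
    let end = off.checked_add(4)?;
    let s = bytes.get(off..end)?;
    Some(u32::from_be_bytes([s[0], s[1], s[2], s[3]]))
}

/// Reinterpret the slot at `off` as an IEEE-754 single-precision float.
fn be_f32_at(bytes: &[u8], off: usize) -> Option<f32> {
    be_u32_at(bytes, off).map(f32::from_bits)
}
```

Under this transcription, `be_u32_at(&CTX_HEAD, 44)` yields `0xBCE25640`, and `be_f32_at(&CTX_HEAD, 48)` comes out near -0.2095, consistent with the "-0.21f?" annotation. Whether the field really is a float is not established by this check.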
## Install path in ours
v3 adds `crowbar_maybe_install_ctx_from_file()` (~63 LOC) that reads
the binary at `$XENIA_CROWBAR_CTX_BIN` and writes the bytes via
`mem.write_u8(ctx_ptr + i, byte)` — same pattern as v2's
`crowbar_maybe_install_vtable_from_file()`. Plus ~12 LOC of comments
and the call-site addition. ~75 LOC additive over v2.
The 64-byte ctx file overwrites the v2 init at `+0/+4/+8/+12` with
identical values (verified — they match), and fills `+16..+63` with
the captured state.
Post-install log confirms exact write:
```
CROWBAR: installed 64 bytes at ctx_ptr=0x4d1d9000
CROWBAR: post-ctx-install ctx[+ 0] (=0x4d1d9000) = 0x8200a1e8
CROWBAR: post-ctx-install ctx[+ 32] (=0x4d1d9020) = 0xffffffff
CROWBAR: post-ctx-install ctx[+ 44] (=0x4d1d902c) = 0xbce25640 <-- secondary obj ptr installed
CROWBAR: post-ctx-install ctx[+ 48] (=0x4d1d9030) = 0xbe568f00
```
## The fault (v3)
Identical fault PC, different r3 — that's the smoking gun:
| | v1 (no ctx install) | v2 (init +0..+12 only) | v3 (full 64 bytes) |
|-|-|-|-|
| FAULT PC | 0 | 0 | 0 |
| LR | 0x82506e38 | 0x82506e38 | 0x82506e38 |
| CTR | 0 | 0 | 0 |
| **r3** | (any) | **0x0** | **0xbce25640** |
| r30 (ctx_ptr) | 0x4D1D9000 | 0x4D1D9000 | 0x4D1D9000 |
| tid | 15 | 15 | 15 |
The `lwz r11, 0(r3)` at PC `0x82506e28` (per v2's disasm) loads from
`r3 = [ctx+44]`. In v2, `r3=0`, so reads `[0]=0`. In v3, `r3=0xBCE25640`,
so reads `[0xBCE25640]`. Both reads return 0 because:
- v2: address 0 may or may not be mapped, but either way the load yields 0.
- v3: the page containing `0xBCE25640` is **definitely** unmapped in ours.
Ours's heap is at `0..0x6FFFFFFF` (per `KernelState::heap_alloc`). The
xenon physical-region VAs (`0xBC000000..0xC0000000`) never appear in
ours's allocator namespace — `MmAllocatePhysicalMemoryEx` just calls
`heap_alloc()` which returns low VAs.
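
The v2/v3 difference in the table can be reproduced with a toy model. This is a sketch under stated assumptions, not the emulator's code: `ToyMem` is a hypothetical stand-in for guest memory in which unmapped addresses read as 0 (the behaviour the notes report), and the second load through `r11` at `slot_off` is an assumption, since the notes only show the first `lwz r11, 0(r3)`.

```rust
use std::collections::HashMap;

/// Hypothetical toy guest memory: a sparse map of u32 words.
/// Unmapped addresses read as 0, as the notes report for ours.
struct ToyMem {
    words: HashMap<u32, u32>,
}

impl ToyMem {
    fn read_u32(&self, addr: u32) -> u32 {
        self.words.get(&addr).copied().unwrap_or(0)
    }
}

/// Model of the dispatch chain. Returns `(r3, ctr)`:
///   r3  = [ctx+44]          (secondary-object pointer)
///   r11 = [r3]              (the `lwz r11, 0(r3)` from the notes)
///   ctr = [r11 + slot_off]  (ASSUMED second load; offset is hypothetical)
/// A `ctr` of 0 corresponds to the `bctrl` fault at PC=0.
fn dispatch_target(mem: &ToyMem, ctx_ptr: u32, slot_off: u32) -> (u32, u32) {
    let r3 = mem.read_u32(ctx_ptr.wrapping_add(44));
    let r11 = mem.read_u32(r3);
    let ctr = mem.read_u32(r11.wrapping_add(slot_off));
    (r3, ctr)
}
```

With nothing at `ctx+44` the model returns `(0, 0)`, matching the v2 column. With `0xBCE25640` installed at `ctx+44` but nothing mapped behind it, the model returns `(0xBCE25640, 0)`, matching the v3 column: same fault PC, different `r3`.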
## Why this falsifies the v3 hypothesis
The brief's hypothesis: "with the full ctx state pre-installed AND the
4 workers spawned, ours produces `swaps≥2` or `draws≥1`."
Outcome: ctx state IS installed, 4 workers ARE spawned and resumed,
but the dispatch on the secondary object fails because the secondary
object's VA isn't mappable.
This is exactly **case (γ) → fault at new structural location** that
the brief predicted. The new fault PC isn't actually new (still 0),
but the new fault PRIMARY CAUSE is different: in v2 the cause was
"ctx+44 not initialized"; in v3 it's "ctx+44 points to an unmapped VA."
## Composite progression score
Per brief's option 6 metric (excluding the matched_prefix term, which
needs canary cross-comparison not available in `check` digests):
```
score = 1*swaps + 10*draws + 100*unique_render_targets
```
| Run | swaps | draws | unique_RT | score | instructions |
|-|-:|-:|-:|-:|-:|
| OFF-1 | 1 | 0 | 0 | **1** | 25,000,000 |
| OFF-2 | 1 | 0 | 0 | **1** | 25,000,000 |
| OFF-3 | 1 | 0 | 0 | **1** | 25,000,000 |
| ON-1 | 1 | 0 | 0 | **1** | 20,000,167 (faulted) |
| ON-2 | 1 | 0 | 0 | **1** | 20,000,167 (faulted) |
| ON-3 | 1 | 0 | 0 | **1** | 20,000,167 (faulted) |
**Δ = 0**. The instruction count dropped from 25,000,000 to 20,000,167 in ON
runs because the fault halts the run early at `instr=20000167`, ~167 instr
after the crowbar trigger (threshold=20M). This confirms the workers cannot
complete even one meaningful iteration before faulting.
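
For reference, the score and Δ above can be computed as follows. This is a minimal sketch: the `Digest` struct and the function names are hypothetical, and only the three counters named in the formula are modelled (the matched_prefix term is excluded, as in the notes).

```rust
/// Counters read from one `check` digest (only the three the score uses).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Digest {
    swaps: u64,
    draws: u64,
    unique_render_targets: u64,
}

/// score = 1*swaps + 10*draws + 100*unique_render_targets
fn composite_score(d: Digest) -> u64 {
    d.swaps + 10 * d.draws + 100 * d.unique_render_targets
}

/// Mean score over a set of runs; None for an empty set.
fn mean_score(runs: &[Digest]) -> Option<f64> {
    if runs.is_empty() {
        return None;
    }
    let total: u64 = runs.iter().map(|d| composite_score(*d)).sum();
    Some(total as f64 / runs.len() as f64)
}

/// Δ = mean(ON) - mean(OFF); None if either set is empty.
fn score_delta(off: &[Digest], on: &[Digest]) -> Option<f64> {
    Some(mean_score(on)? - mean_score(off)?)
}
```

Every run in the table has `swaps=1, draws=0, unique_render_targets=0`, so each scores 1 and the delta between the ON and OFF sets is 0.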
## LOC delta
- `crates/xenia-kernel/src/exports.rs`: +63 LOC (helper)
+ 13 LOC (call-site comments + wire-up) = +76 LOC over v2.
- `audit-runs/review-a-step1c-crowbar-v3/`: artifacts (ctx-canary.bin,
canary-probe-run1.log, off-{1,2,3}.json, on-{1,2,3}.json, this doc,
summary.md, re-validation.md, fix.diff).
- No tests added: the helper is structurally identical to v2's
`crowbar_maybe_install_vtable_from_file`, which has no test (it's a
diagnostic, opt-in via env var).
- canary instrumentation: **0 LOC** (reused existing
`audit_68_host_mem_read_probe` cvar).
## What this confirms
1. v2's case (C) framing is structurally correct: `[ctx+44]` IS a
secondary-object pointer that vtable[36] dispatches through.
2. Cross-engine pointer-VA mismatch is real and non-trivial:
ours's allocator namespace doesn't include `0xBCxxxxxx` VAs.
3. The wedge is **≥4-deep** (vtable + ctx primary + ctx secondary
pointer + secondary object's own vtable + fn-pointer slot). Crowbar
approach saturates without much deeper state capture.
## What this does NOT confirm
- That the actual canary VA `0xBCE25640` is the ONLY secondary object.
There may be more pointers in deeper ctx slots (we only captured 64
bytes; the full struct may be larger).
- That installing the secondary object would suffice. The secondary
object likely has its own pointer fields (head node of a linked
list — looks like a queue/work-list given the doubly-linked-list
pattern at +4/+8).
## Recommendation
**Stop the crowbar approach.** The wedge is structurally too deep
for state synthesis to be cheaper than fixing the natural-activation
gap. Per Q5 of the boot-state review (methodology-assessment.md): the
matched-prefix metric is on the wrong thread, and the wedge is
**inherently a thread-activation problem**, not a state-construction
problem.
Pivot recommendations (in order of cost):
1. **AUDIT-069 follow-up** — the 25 vs 1 "other producers" gap from
Session 5 is more actionable than the worker-spawn gap. The XAudio
thread resume at canary 1.726 s is a candidate trigger that
produces 8-24 helpers ahead of the wedge.
2. **Recursive ctx-state capture** (option β from brief) — write a
probe-graph tool that captures canary's pointer-reachable closure
from ctx_ptr (BFS via `audit_68_host_mem_read_probe`, follow each
pointer field that's in the BC arena, capture another 64 bytes,
repeat). Estimate: 200-400 LOC tooling + needs ours-side memory
allocator extension to map BC-arena VAs. High complexity vs gain.
3. **Pointer-translation table** (option α) — map canary BC-VAs to
ours allocator-VAs on install. Needs canary-vs-ours linked allocator
walk; ~300 LOC.
The natural-activation path (Step 2 of the boot-state roadmap) is
likely cheaper than any of these crowbar extensions.
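
To give a sense of what pivot 2 (recursive ctx-state capture) would involve, here is a minimal sketch of the breadth-first walk. Everything in it is an assumption for illustration: `read_chunk` is a hypothetical callback standing in for one 64-byte probe capture, the arena bounds are the `0xBC000000..0xC0000000` range quoted earlier, and any 4-byte-aligned word inside that range is treated as a candidate pointer.

```rust
use std::collections::{BTreeMap, HashSet, VecDeque};

/// True if `p` looks like a pointer into the arena quoted in the notes
/// (heuristic: in range and 4-byte aligned).
fn in_arena(p: u32) -> bool {
    (0xBC00_0000u32..0xC000_0000u32).contains(&p) && p % 4 == 0
}

/// Breadth-first capture of the pointer-reachable closure from `root`.
/// `read_chunk(addr)` is a HYPOTHETICAL probe returning 64 bytes at `addr`,
/// or None if that address could not be captured. Stops after
/// `max_objects` captures so a bad heuristic cannot run away.
fn capture_closure<F>(
    root: u32,
    max_objects: usize,
    mut read_chunk: F,
) -> BTreeMap<u32, [u8; 64]>
where
    F: FnMut(u32) -> Option<[u8; 64]>,
{
    let mut captured = BTreeMap::new();
    let mut seen = HashSet::new();
    let mut queue = VecDeque::new();
    seen.insert(root);
    queue.push_back(root);

    while let Some(addr) = queue.pop_front() {
        if captured.len() >= max_objects {
            break;
        }
        let bytes = match read_chunk(addr) {
            Some(b) => b,
            None => continue, // probe failed: skip this object
        };
        // Scan every big-endian word for further arena pointers.
        for w in bytes.chunks_exact(4) {
            let p = u32::from_be_bytes([w[0], w[1], w[2], w[3]]);
            if in_arena(p) && seen.insert(p) {
                queue.push_back(p);
            }
        }
        captured.insert(addr, bytes);
    }
    captured
}
```

The `seen` set keeps the self-pointers at `+4/+8` from looping. One caveat of the range heuristic: `0xBE568F00` at `[ctx+48]` also falls inside the arena range, so this walk would probe it even though the notes suspect it is a float.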


@@ -0,0 +1,16 @@
{
"path": "/home/fabi/RE - Project Sylpheed/Project Sylpheed - Arc of Deception (USA, Europe) (En,Ja).iso",
"instructions": 25000000,
"imports": 39290,
"unimpl": 0,
"packets": 21485949,
"draws": 0,
"swaps": 1,
"resolves": 0,
"unique_render_targets": 0,
"shader_blobs_live": 0,
"interrupts_delivered": 54,
"interrupts_dropped": 109,
"texture_cache_entries": 0,
"texture_decodes": 0
}


@@ -0,0 +1,16 @@
{
"path": "/home/fabi/RE - Project Sylpheed/Project Sylpheed - Arc of Deception (USA, Europe) (En,Ja).iso",
"instructions": 25000000,
"imports": 39290,
"unimpl": 0,
"packets": 21770730,
"draws": 0,
"swaps": 1,
"resolves": 0,
"unique_render_targets": 0,
"shader_blobs_live": 0,
"interrupts_delivered": 54,
"interrupts_dropped": 109,
"texture_cache_entries": 0,
"texture_decodes": 0
}


@@ -0,0 +1,16 @@
{
"path": "/home/fabi/RE - Project Sylpheed/Project Sylpheed - Arc of Deception (USA, Europe) (En,Ja).iso",
"instructions": 25000000,
"imports": 39290,
"unimpl": 0,
"packets": 20827204,
"draws": 0,
"swaps": 1,
"resolves": 0,
"unique_render_targets": 0,
"shader_blobs_live": 0,
"interrupts_delivered": 54,
"interrupts_dropped": 109,
"texture_cache_entries": 0,
"texture_decodes": 0
}


@@ -0,0 +1,16 @@
{
"path": "/home/fabi/RE - Project Sylpheed/Project Sylpheed - Arc of Deception (USA, Europe) (En,Ja).iso",
"instructions": 20000167,
"imports": 39290,
"unimpl": 0,
"packets": 18695547,
"draws": 0,
"swaps": 1,
"resolves": 0,
"unique_render_targets": 0,
"shader_blobs_live": 0,
"interrupts_delivered": 54,
"interrupts_dropped": 76,
"texture_cache_entries": 0,
"texture_decodes": 0
}


@@ -0,0 +1,16 @@
{
"path": "/home/fabi/RE - Project Sylpheed/Project Sylpheed - Arc of Deception (USA, Europe) (En,Ja).iso",
"instructions": 20000167,
"imports": 39290,
"unimpl": 0,
"packets": 18706950,
"draws": 0,
"swaps": 1,
"resolves": 0,
"unique_render_targets": 0,
"shader_blobs_live": 0,
"interrupts_delivered": 54,
"interrupts_dropped": 76,
"texture_cache_entries": 0,
"texture_decodes": 0
}


@@ -0,0 +1,16 @@
{
"path": "/home/fabi/RE - Project Sylpheed/Project Sylpheed - Arc of Deception (USA, Europe) (En,Ja).iso",
"instructions": 20000167,
"imports": 39290,
"unimpl": 0,
"packets": 19099150,
"draws": 0,
"swaps": 1,
"resolves": 0,
"unique_render_targets": 0,
"shader_blobs_live": 0,
"interrupts_delivered": 54,
"interrupts_dropped": 76,
"texture_cache_entries": 0,
"texture_decodes": 0
}


@@ -0,0 +1,56 @@
# Re-validation — Crowbar v3 composite progression metric
**Date**: 2026-05-21
**Method**: 3x OFF + 3x ON cold runs, 25M-instruction `check` digests.
## Protocol
- Binary: `target/release/xrs-crowbar3` (release build of HEAD + v3 patch).
- Command: `xrs-crowbar3 check <ISO> -n 25000000 --gpu-thread --out <json>`.
- OFF: no env vars set (crowbar default-off).
- ON: `XENIA_CROWBAR_WORKERS=1 XENIA_CROWBAR_CTX_BIN=ctx-canary.bin`.
- ISO: Sylpheed master image; no cache wipes between ours-runs (the
crowbar trigger is per-instr-count, deterministic).
## Results
| Run | instructions | swaps | draws | unique_RT | score | notes |
|-|-:|-:|-:|-:|-:|-|
| OFF-1 | 25,000,000 | 1 | 0 | 0 | **1** | natural halt at limit |
| OFF-2 | 25,000,000 | 1 | 0 | 0 | **1** | natural halt at limit |
| OFF-3 | 25,000,000 | 1 | 0 | 0 | **1** | natural halt at limit |
| ON-1 | 20,000,167 | 1 | 0 | 0 | **1** | FAULT @ pc=0 lr=0x82506e38 r3=0xBCE25640 tid=15 |
| ON-2 | 20,000,167 | 1 | 0 | 0 | **1** | FAULT @ pc=0 lr=0x82506e38 r3=0xBCE25640 tid=15 |
| ON-3 | 20,000,167 | 1 | 0 | 0 | **1** | FAULT @ pc=0 lr=0x82506e38 r3=0xBCE25640 tid=15 |
## Composite score (option 6 from boot-state review)
```
score = 1*swaps + 10*draws + 100*unique_render_targets
```
- OFF mean: 1.0
- ON mean: 1.0
- **Δ = 0**
## Determinism
Across 3 ON runs:
- Fault PC, LR, CTR, r3, r4, r29, r30, r31 are bit-identical.
- Fault cycle is bit-identical (cycle=20000167).
- `instructions=20000167` matches across all 3 runs (deterministic
scheduling under the GPU thread backend).
- `packets` varies slightly (18.70M / 18.71M / 19.10M), as expected: this
is the GPU thread race documented as ±2-8% jitter.
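
The size of the `packets` jitter can be checked against the digests directly. The sketch below uses one possible definition, (max - min) / min in percent; the notes do not say how the "±2-8%" figure was derived, so this definition is an assumption, and `relative_spread_pct` is a hypothetical helper.

```rust
/// Relative spread of a set of counters, as (max - min) / min in percent.
/// Returns None for an empty set or when the minimum is 0.
/// NOTE: this definition of "jitter" is an assumption, not the notes' own.
fn relative_spread_pct(samples: &[u64]) -> Option<f64> {
    let min = *samples.iter().min()?;
    let max = *samples.iter().max()?;
    if min == 0 {
        return None;
    }
    Some((max - min) as f64 / min as f64 * 100.0)
}

/// `packets` values from the three ON digests.
const ON_PACKETS: [u64; 3] = [18_695_547, 18_706_950, 19_099_150];

/// `packets` values from the three OFF digests.
const OFF_PACKETS: [u64; 3] = [21_485_949, 21_770_730, 20_827_204];
```

Under this definition the ON runs spread by roughly 2.2% and the OFF runs by roughly 4.5%, both inside the documented band.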
## Conclusion
3x replication confirms the v3 crowbar deterministically faults at
the same dispatch site with the same register state. The
`XENIA_CROWBAR_CTX_BIN` install path is structurally sound — it writes
exactly the canary-captured bytes (verified via the post-install u32
slot dump in stderr) — but the secondary-object recursion is now the
blocker.
**The hypothesis "wedge is crowbar-soluble with full ctx state +
worker spawn" is FALSIFIED at the case-(D) recursion level.**