Source changes (dormant parity infra, retained from iterate 2.AI/2.AO):

- xenia-kernel/exports.rs: nt_create_event manual_reset polarity + related event wiring
- xenia-gpu/mmio_region.rs: D1MODE_VBLANK_VLINE_STATUS hardcode parity

Also lands the audit-runs/ analysis notes (.md/.txt/.json digests) for the iterate 2.x VSync/0x10e8/0x1004 wedge investigation. Raw trace dumps (.jsonl/.gz/.csv/.stdout) and agent worktrees (.claude/) are gitignored as regenerable local artifacts; see memory + HANDOFF for the running findings.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
110 lines
4.7 KiB
Markdown
# Step 0 — framing verification

Read-only checks of the crowbar's expected parameters against
`xenia-rs/audit-runs/phase-nonmatch-investigation/create-thread-events.json`,
the AUDIT-068 S3/S4 memory dossier (write epoch 9.4-9.6 s, vtable
base `0x8200A1E8`), and ours's `ExCreateThread`
(`crates/xenia-kernel/src/exports.rs:294`).

## The 4 thread.create events (from canary-jitter-1.jsonl)

| Index | host_ns        | tid (creator) | entry_pc   | ctx_ptr    | stack (bytes) | suspended | affinity | priority |
|------:|---------------:|--------------:|-----------:|-----------:|--------------:|----------:|---------:|---------:|
| 20    | 10,382,912,900 | 6             | 0x82506528 | 0xBCE251C0 | 65536         | true      | 0        | 0        |
| 21    | 10,383,282,200 | 6             | 0x82506558 | 0xBCE251C0 | 65536         | true      | 0        | 0        |
| 22    | 10,383,647,200 | 6             | 0x82506588 | 0xBCE251C0 | 65536         | true      | 0        | 0        |
| 23    | 10,384,161,700 | 6             | 0x825065B8 | 0xBCE251C0 | 65536         | true      | 0        | 0        |

All 4 share `ctx_ptr=0xBCE251C0` and are spaced ~365–515 µs apart
(host_ns deltas of 369,300, 365,000 and 514,500) on canary tid=6
(main). `affinity=0` means the scheduler chooses; `priority=0` is the
default.

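The spacing can be rechecked straight from the `host_ns` column. The sketch below is only an illustration: the constant and the `gaps_ns` helper are hypothetical names, not part of the repo, and the timestamps are copied from the table.

```rust
/// host_ns timestamps of thread.create events 20..=23, copied from the table.
const HOST_NS: [u64; 4] = [
    10_382_912_900,
    10_383_282_200,
    10_383_647_200,
    10_384_161_700,
];

/// Gaps between consecutive timestamps, in nanoseconds.
/// (Hypothetical helper; assumes the slice is sorted ascending.)
fn gaps_ns(ts: &[u64]) -> Vec<u64> {
    ts.windows(2).map(|w| w[1] - w[0]).collect()
}
```

Calling `gaps_ns(&HOST_NS)` yields three gaps in the hundreds of thousands of nanoseconds, i.e. hundreds of microseconds, and the whole 4-spawn burst spans about 1.25 ms.
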
Canary's natural resume happens "later" via `NtResumeThread` from
worker code (not captured in this jsonl excerpt; deferred). For the
crowbar we resume directly after the 4-spawn burst, since the natural
resume gate is downstream of the wedge.

## The ctx layout @ ctx_ptr (per AUDIT-068 S3/S4)

At install epoch host_ns ≈ 9.416 s on canary tid=6, three u32 slots
are written simultaneously by the POD-copy at guest PC
`sub_824FD240+0x24`; the fourth slot (refcount) is only observed at a
later epoch:

```
[ctx_ptr + 0x00] = 0x8200A1E8   (vtable BASE; class ANON_Class_713383D7)
[ctx_ptr + 0x04] = ctx_ptr      (self pointer; doubly-linked list head)
[ctx_ptr + 0x08] = ctx_ptr      (self pointer; doubly-linked list head)
[ctx_ptr + 0x0C] = (refcount, observed = 1 at later epoch per S4)
```

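As a rough sketch of what the host side would have to write, the layout can be serialized into a 16-byte image. This assumes guest memory is big-endian (Xbox 360 PowerPC) and takes the refcount as a parameter, since the dossier only reports it as observed = 1 at a later epoch. `ctx_header` is a hypothetical helper name.

```rust
/// Vtable base installed at ctx_ptr + 0x00 (AUDIT-068 S3 measurement).
const VTABLE_BASE: u32 = 0x8200_A1E8;

/// Build the 16-byte ctx header as big-endian guest bytes.
/// `refcount` is an assumption of this sketch: S4 only observed the
/// value 1 at a later epoch, not at install time.
fn ctx_header(ctx_ptr: u32, refcount: u32) -> [u8; 16] {
    let words = [VTABLE_BASE, ctx_ptr, ctx_ptr, refcount];
    let mut out = [0u8; 16];
    for (i, word) in words.iter().enumerate() {
        // Each u32 slot occupies 4 bytes at offset i * 4.
        out[i * 4..i * 4 + 4].copy_from_slice(&word.to_be_bytes());
    }
    out
}
```

For the crowbar, `ctx_header(ctx_ptr, 1)` would be copied to the freshly allocated ctx page, with `ctx_ptr` being whatever address the allocator returned rather than canary's `0xBCE251C0`.
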
**Reading-error #37 discipline**: the value `0x8200A1E8` is the
vtable BASE, NOT a slot-N address. The `0x8200A208` cited in the older
AUDIT-058/060/067 is `base + 0x20`, i.e. the slot-8 address within the
vtable, which those audits mistook for the base. The install value
is `0x8200A1E8` per the AUDIT-068 S3 measurement.

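The base-versus-slot distinction is plain pointer arithmetic with 4-byte slots. A minimal sketch, with hypothetical helper names, that makes the conversion explicit in both directions:

```rust
/// Vtable base per AUDIT-068 S3.
const VTABLE_BASE: u32 = 0x8200_A1E8;

/// Guest address of vtable slot `slot` (4-byte function pointers).
fn slot_addr(base: u32, slot: u32) -> u32 {
    base + slot * 4
}

/// Slot index that `addr` refers to, or None if `addr` lies below the
/// base or is not 4-byte aligned relative to it.
fn slot_of(base: u32, addr: u32) -> Option<u32> {
    let offset = addr.checked_sub(base)?;
    if offset % 4 == 0 {
        Some(offset / 4)
    } else {
        None
    }
}
```

With these, `slot_of(VTABLE_BASE, 0x8200_A208)` is `Some(8)`, which is the misreading described above: an address inside the vtable, not its base.
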
## Worker entry stubs

Per `sub_825070F0.md`, each of the 4 entries (`0x82506528`, +0x30,
+0x60, +0x90) is a thin stub that does:

```
lwz   r11, 0(r3)      ; load vtable base from ctx
lwz   r11, 140(r11)   ; load fn ptr from vtable[35]
                      ; (each entry uses a different slot: 35/36/37/38)
mtctr r11
bctr
```

So the workers dispatch through ctx's vtable. If the vtable's
slots 35-38 are not populated (or if `0x8200A1E8` is in `.rdata`, so
the slot reads themselves are valid), the workers will jump to
whatever guest code is at those addresses. The dossier says the vtable
has "7 entries", but the worker stubs read at offsets 140/144/148/152,
so the actual class has at least 39 vtable entries (consistent with
AUDIT-058's "this is a wider parent class" framing).

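The "at least 39 entries" bound follows from the stub offsets. The sketch below tabulates the four stubs, assuming the entries map to slots 35..38 in ascending address order (the note only shows the first stub's `lwz` explicitly). All names are hypothetical.

```rust
const FIRST_ENTRY: u32 = 0x8250_6528; // first worker stub
const STUB_STRIDE: u32 = 0x30;        // spacing between stubs
const FIRST_SLOT: u32 = 35;           // slot read by the first stub

struct Stub {
    entry_pc: u32,
    slot: u32,
    vtable_offset: u32, // displacement used in `lwz r11, off(r11)`
}

/// The four worker stubs, assuming slots ascend with entry address.
fn worker_stubs() -> Vec<Stub> {
    (0..4u32)
        .map(|i| Stub {
            entry_pc: FIRST_ENTRY + i * STUB_STRIDE,
            slot: FIRST_SLOT + i,
            vtable_offset: (FIRST_SLOT + i) * 4,
        })
        .collect()
}

/// Smallest vtable length that makes every stub's slot read in-bounds.
fn min_vtable_entries(stubs: &[Stub]) -> u32 {
    stubs.iter().map(|s| s.slot + 1).max().unwrap_or(0)
}
```

The highest slot read is 38, so the table must hold at least 39 entries for the reads to stay inside it.
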
The risk that the workers fault on a bad vtable load is REAL but
HONEST: the crowbar's job is to test this exact thing.

## What ours's `ex_create_thread` does today

`crates/xenia-kernel/src/exports.rs:294-405`. Takes 6 PPC regs,
allocates the thread image (stack + PCR + TLS), allocates a thread
handle, calls `scheduler.spawn(SpawnParams { ... })`, installs the
self-ref via `state.retain_handle(handle)`, and writes the handle to
`r3` and the tid to `r5`. The Phase A `thread.create` event is emitted
when `event_log::is_enabled()`.

The host-side analog therefore only needs:
1. Allocate the ctx page via `state.heap_alloc(0x1000, mem)`, then
   write the 4 u32s described above into it.
2. For each of the 4 entries: call a host-side `ex_create_thread`-like
   helper that takes (entry, ctx_ptr, stack_size, suspended, affinity,
   priority) directly, skipping the PPC-reg-marshalling.
3. Resume each of the 4 spawned threads via the scheduler's
   `resume_ref`.

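Steps 2 and 3 can be sketched against a stand-in interface. Everything here is hypothetical: the `Spawner` trait, `SpawnRequest`, and `run_crowbar` are illustration-only names, not the real `scheduler.spawn` / `resume_ref` signatures, and the ctx page from step 1 is assumed to be already allocated and filled.

```rust
/// Parameters the host-side helper would take directly (hypothetical).
#[derive(Debug, Clone, PartialEq)]
struct SpawnRequest {
    entry_pc: u32,
    ctx_ptr: u32,
    stack_size: u32,
    suspended: bool,
    affinity: u32,
    priority: u32,
}

/// Stand-in for the kernel scheduler; not the real API.
trait Spawner {
    fn spawn(&mut self, req: &SpawnRequest) -> u32; // returns a handle
    fn resume(&mut self, handle: u32);
}

/// Spawn the 4 workers suspended, then resume them after the burst.
fn run_crowbar<S: Spawner>(s: &mut S, ctx_ptr: u32) -> Vec<u32> {
    let handles: Vec<u32> = (0..4u32)
        .map(|i| {
            s.spawn(&SpawnRequest {
                entry_pc: 0x8250_6528 + i * 0x30,
                ctx_ptr,
                stack_size: 65536,
                suspended: true,
                affinity: 0,
                priority: 0,
            })
        })
        .collect();
    for &h in &handles {
        s.resume(h);
    }
    handles
}

/// Test double that records calls instead of creating threads.
#[derive(Default)]
struct RecordingSpawner {
    spawned: Vec<SpawnRequest>,
    resumed: Vec<u32>,
}

impl Spawner for RecordingSpawner {
    fn spawn(&mut self, req: &SpawnRequest) -> u32 {
        self.spawned.push(req.clone());
        0x100 + self.spawned.len() as u32 // made-up handle values
    }
    fn resume(&mut self, handle: u32) {
        self.resumed.push(handle);
    }
}
```

Keeping the spawn parameters in one plain struct mirrors the `thread.create` event fields, which makes it straightforward to diff the crowbar's spawns against canary's four events.
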
## Trigger choice

`coord_pre_round` in `xenia-app/src/main.rs:2038` runs once per outer
round and has access to both `KernelState` and `ExecStats`. Adding a
one-shot check on `stats.instruction_count >= threshold` is
trivially additive.

Threshold default = 20_000_000. At ~6.7M instr/sec lockstep that's
~3 s wallclock: well past the 10-thread initial spawn burst (which
peaks around the boot-init swap) but still early enough for the
workers to have time to run before the 200M cap.

Configurable via env `XENIA_CROWBAR_TRIGGER_INSTR=N`.

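A minimal sketch of such a one-shot trigger follows. Only the env var name and the 20_000_000 default come from this note; `CrowbarTrigger`, `parse_threshold`, and the fall-back-to-default behaviour on unparsable input are assumptions of the sketch.

```rust
/// Default trigger point, in guest instructions.
const DEFAULT_TRIGGER_INSTR: u64 = 20_000_000;

/// Parse the env value; missing or unparsable input falls back to the
/// default (an assumption of this sketch, not a documented behaviour).
fn parse_threshold(raw: Option<&str>) -> u64 {
    raw.and_then(|s| s.trim().parse::<u64>().ok())
        .unwrap_or(DEFAULT_TRIGGER_INSTR)
}

/// One-shot latch checked once per outer round.
struct CrowbarTrigger {
    threshold: u64,
    fired: bool,
}

impl CrowbarTrigger {
    fn new(threshold: u64) -> Self {
        Self { threshold, fired: false }
    }

    /// Read XENIA_CROWBAR_TRIGGER_INSTR from the process environment.
    fn from_env() -> Self {
        let raw = std::env::var("XENIA_CROWBAR_TRIGGER_INSTR").ok();
        Self::new(parse_threshold(raw.as_deref()))
    }

    /// True exactly once: the first time the count reaches the threshold.
    fn should_fire(&mut self, instruction_count: u64) -> bool {
        if self.fired || instruction_count < self.threshold {
            return false;
        }
        self.fired = true;
        true
    }
}
```

In `coord_pre_round` this would reduce to a single `if trigger.should_fire(stats.instruction_count) { ... }` guard around the crowbar call, so later rounds pay only one comparison.
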
## LOC estimate

- `xenia-kernel/src/exports.rs` `host_spawn_worker_thread` helper: ~50 LOC
- `xenia-kernel/src/state.rs` crowbar `CrowbarConfig` field + `tick_crowbar`: ~40 LOC
- `xenia-app/src/main.rs` cvar + trigger wire-up: ~30 LOC
- Tests: ~50 LOC

Total ~170 LOC; trim by inlining the helper or sharing
`SpawnParams` boilerplate. Target ≤150 LOC.