Source changes (dormant parity infra, retained from iterate 2.AI/2.AO): - xenia-kernel/exports.rs: nt_create_event manual_reset polarity + related event wiring - xenia-gpu/mmio_region.rs: D1MODE_VBLANK_VLINE_STATUS hardcode parity Also lands the audit-runs/ analysis notes (.md/.txt/.json digests) for the iterate 2.x VSync/0x10e8/0x1004 wedge investigation. Raw trace dumps (.jsonl/.gz/.csv/.stdout) and agent worktrees (.claude/) are gitignored as regenerable local artifacts — see memory + HANDOFF for the running findings. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
252 lines
12 KiB
Markdown
# Iterate 2.K — Longer-budget cache-wipe replay (writer report)

**Date:** 2026-05-28. **LOC delta:** engine **0**, canary **0**. Pure
measurement.
**Tests:** N/A (no source modifications).

## Headline

**INSTALL-CHAIN-ABSENT-NEW-BLOCKER.** The 500M-instruction budget run
(10× 2.J's 50M) reaches the budget cap cleanly at wallclock=13.96s
**but emits ZERO new Phase-A events past 2.J's terminus.** The event count
(121,569) and tid=1's max guest_cycle (9,169,116) are both bit-identical
to 2.J. The keystone `sub_824F8398` install chain still has **0 fires**;
the `sub_825070F0` worker fan-out still has **0 fires**. The final-state dump
shows **9 of the 13 live threads parked in `Blocked(WaitAny ...)` waits
(8 of them with `deadline: None`), 5 of them at PC `0x824ac578`**, the exact
AUDIT-049 wedge PC; the remaining 4 are `Ready` but never rescheduled.
The 2.J "wedge moved / wait returns success" observation was
a budget-truncation artifact: under a longer budget, the engine re-converges
to a deadlock at the same call site. **2.J's `NtWaitForSingleObjectEx
return=0` events are the wrapper successfully returning on prior
iterations of a tight `wait → return → wait` loop; the FINAL wait of
each tid blocks forever and never emits a `kernel.return`.** Cache
parity was load-bearing but is NOT the keystone. The next blocker is
upstream of the install chain, at the wedge-loop level.

## Mode

ZERO LOC. Invocation:

```
XENIA_CACHE_WIPE=1 timeout 600 ./xenia-rs/target/release/xenia-rs exec \
  -n 500000000 --quiet \
  --phase-a-event-log audit-runs/iterate-2K-longer-budget-replay/ours-cold.jsonl \
  "Project Sylpheed - Arc of Deception (USA, Europe) (En,Ja).iso"
```

Identical to 2.J except `-n 50000000` → `-n 500000000`. The XDG cache was
already absent (no `/home/fabi/.local/share/xenia-rs/cache/`) before the run;
`XENIA_CACHE_WIPE=1` was set for belt-and-braces.

The run completed with `EXIT=0` at wallclock 13.96s. Final reason from the
non-quiet diagnostic re-run: `reached max instruction count limit=500000000`
(instruction budget hit, not a panic/fault/timeout). Total instructions
executed: 500,000,004.

## Primary gate results

| gate | 2.J | 2.K | result |
|------|-----|-----|--------|
| `sub_824F8398` install-chain fires | 0 | **0** | UNCHANGED |
| `sub_825070F0` worker fan-out fires | 0 | **0** | UNCHANGED |

Grep against full ours-cold.jsonl (case-insensitive on hex literal,
plus per-tid first-kernel-call signature): zero hits for either symbol
across all kinds (thread.create, import.call, kernel.call,
kernel.return, payload fields). The canary's tids 15/27/28 (the
`sub_825070F0` family workers) and tid 14 (audio worker
`sub_824D2878`-driven) are **structurally absent from ours's thread
fan-out at this trajectory point**, even given 10× the instruction
budget.

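A minimal sketch of the zero-hit check described above, assuming the log stores addresses as hex text somewhere on each JSONL line (for example `0x824F8398` or `sub_824F8398`). The function name and the sample lines are hypothetical; the scan works on raw lines rather than parsed fields so that payload fields are covered without knowing the event schema.

```python
def count_symbol_hits(lines, addresses):
    """Count lines mentioning each address, case-insensitively.

    Matches the bare 8-digit hex form, so "0x824F8398", "sub_824f8398"
    and "824F8398" all count as hits for address 0x824F8398.
    """
    needles = {addr: f"{addr:08x}" for addr in addresses}
    hits = {addr: 0 for addr in addresses}
    for line in lines:
        lowered = line.lower()
        for addr, needle in needles.items():
            if needle in lowered:
                hits[addr] += 1
    return hits


# Hypothetical sample lines; the real field names may differ.
SAMPLE_LINES = [
    '{"kind": "thread.create", "entry_pc": "0x824D2878"}',
    '{"kind": "kernel.call", "name": "NtWaitForSingleObjectEx"}',
]
SAMPLE_HITS = count_symbol_hits(SAMPLE_LINES, [0x824F8398, 0x825070F0, 0x824D2878])
```

If the log encoded addresses as decimal integers instead, this substring match would miss them and the check would have to parse the fields.
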
## Secondary cascade gate results
### Thread set

**10 thread.create entries, bit-identical to 2.J** (same entry_pcs,
same ctx_ptrs). Per tripstone #28 (don't key on integer tid):

| entry_pc | ctx_ptr | canary analog |
|----------|---------|---------------|
| 0x82181830 | 0x828f3d08 | main bootstrap |
| 0x8245a5d0 | 0x828f4838 | early helper |
| 0x82450a28 | 0x828f3b68 | producer (AUDIT-069) |
| 0x82457ef0 | 0x828f3b08 | dispatcher tid=5 |
| 0x824cd458 | 0xbe8cbb3c | per-AUDIT-068 sister |
| 0x822f1ee0 | 0xbd184a40 | helper |
| 0x824d2878 | 0x00000000 | audio worker (no kernel calls) |
| 0x824d2940 | 0x00000000 | audio companion (no kernel calls) |
| 0x82178950 | 0x828f3ec0 | input/lifecycle |
| 0x821748f0 | 0xbc6c5640 | early helper |

NB: `sub_824D2878` IS in the spawn set but its tid emits no kernel
calls in the entire 500M-instruction run (matches 2.J). Workers
`sub_825070F0` × 4 + secondary-burst tids never spawn.

### VdSwap / draws (gameplay progression — tripstone #39)

- **VdSwap = 1** (same single swap at cycle=5,577,303 / host_ns=493.5ms
  as 2.J). Bit-identical timestamp.
- **Draws = 0** (no `*Draw*` kernel name emitted).
- **Gameplay progression NOT achieved.** Honest "no" per #39.

### Total event count

- **121,569 events** (bit-identical to 2.J).
- File size 28,724,871 bytes vs 2.J's 28,667,xxx (approximate); content is
  identical up to floating host_ns jitter, i.e. structurally equal.
- Implication: between 50M and 500M instructions (roughly 3× more
  wallclock, 13.96s vs ~5s), the engine emitted **0 new kernel calls,
  0 new wait.begin, 0 new handle events**. The host clock advanced but
  the guest committed no observable progress.

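The "identical up to host_ns jitter" comparison can be sketched as follows. This assumes each log line is a flat JSON object and that `host_ns` is a top-level key; both are assumptions about the Phase-A schema, and the function names are hypothetical.

```python
import json

# Fields expected to differ between otherwise identical runs (assumption).
VOLATILE_FIELDS = {"host_ns"}


def structural_key(line):
    """Canonical form of one event with volatile fields removed."""
    event = json.loads(line)
    for field in VOLATILE_FIELDS:
        event.pop(field, None)
    return json.dumps(event, sort_keys=True)


def first_divergence(lines_a, lines_b):
    """Index of the first structurally different event, or None if equal."""
    for index, (a, b) in enumerate(zip(lines_a, lines_b)):
        if structural_key(a) != structural_key(b):
            return index
    if len(lines_a) != len(lines_b):
        # One log is a strict prefix of the other.
        return min(len(lines_a), len(lines_b))
    return None
```

Returning an index rather than a boolean makes a mismatch directly actionable: the divergent event can be inspected in both logs.
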
### Wedge state (final-state dump, non-quiet diagnostic re-run)

At budget exhaustion, 9 of the 13 live threads are blocked and the other
4 are `Ready` but not advancing:

| tid | PC | LR | state | handle waiting on |
|-----|----|----|-------|-------------------|
| 1 | 0x824ac578 | 0x824ac578 | Blocked(WaitAny, no deadline) | 0x12c8 = Thread(id=13) |
| 11 | 0x824d2a94 | 0x824d2a94 | Blocked(WaitAny, no deadline) | 0x828a3244 = Event(sig=false) |
| 2 | 0x824a95f8 | 0x824a95f8 | Blocked(WaitAny, no deadline) | 0x8287093c = Event(sig=false) |
| 13 | 0x824ac578 | 0x824ac578 | Blocked(WaitAny, no deadline) | 0x12d0 = Event(sig=false) |
| 7 | 0x824cd4f4 | 0x824cd4f4 | Blocked(WaitAny, deadline=3000) | 0xbe8cbb5c = Event |
| 8 | 0x824ab214 | 0x824ab214 | Blocked(WaitAny, no deadline) | 0x10d8 = Semaphore(0/2^31-1) |
| 4 | 0x824ac578 | 0x824ac578 | Blocked(WaitAny, no deadline) | 0x1028 = Semaphore(0/2^31-1) |
| 5 | 0x824ac578 | 0x824ac578 | Blocked(WaitAny, no deadline) | 0x12e4 = Event(sig=false) |
| 9 | 0x824d1404 | 0x824d22b4 | Ready | none |
| 6 | 0x824ab214 | 0x824ab214 | Ready | none |
| 10 | 0x824d1404 | 0x824d22b4 | Ready | none |
| 12 | 0x824aa6a4 | 0x824aa6a4 | Ready | none |
| 3 | 0x824ac578 | 0x824ac578 | Blocked(WaitAny, no deadline) | 0x1020 = Event(sig=false) |

**5 of 13 tids parked at PC `0x824ac578`** (the AUDIT-049 wedge),
including the canonical tid=1 → Thread(id=13) → Event wait chain.
4 tids are in `Ready` state but are never re-scheduled to advance.

tid=1's last `kernel.return` in the Phase-A log shows
`NtWaitForSingleObjectEx return_value=0 status=0x00000000` at
cycle=9,169,116, but this is from an **earlier** iteration of the
wait loop, NOT the wait it is currently blocked on. The final wait
(handle 0x12c8 = tid=13 thread handle) NEVER returned; no
`kernel.return` event was emitted for it because the wrapper is parked
indefinitely.

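The tid=1 wait chain can be walked mechanically from the final-state dump. The sketch below hard-codes the blocked rows of the table above; the data layout and function name are hypothetical, and only the one thread handle the dump identifies (`0x12c8` = tid 13) is mapped to an owner. Whether the chain ever closes into a cycle depends on who signals the final Event, which the dump does not show.

```python
# Blocked tid -> handle it waits on (copied from the final-state dump).
WAITS = {
    1: "0x12c8", 11: "0x828a3244", 2: "0x8287093c", 13: "0x12d0",
    7: "0xbe8cbb5c", 8: "0x10d8", 4: "0x1028", 5: "0x12e4", 3: "0x1020",
}
# Handles known to be thread objects -> the tid they represent.
THREAD_HANDLES = {"0x12c8": 13}


def wait_chain(tid, waits, thread_handles):
    """Follow waiter -> handle -> owning thread until the chain ends."""
    chain = [f"tid={tid}"]
    seen = {tid}
    while tid in waits:
        handle = waits[tid]
        chain.append(handle)
        owner = thread_handles.get(handle)
        if owner is None:
            break  # Event/Semaphore: signaler unknown from the dump alone.
        if owner in seen:
            chain.append(f"tid={owner} (cycle)")
            break
        seen.add(owner)
        chain.append(f"tid={owner}")
        tid = owner
    return chain


print(" -> ".join(wait_chain(1, WAITS, THREAD_HANDLES)))
# prints: tid=1 -> 0x12c8 -> tid=13 -> 0x12d0
```

Adding signaler information (which tid is expected to set each Event or release each Semaphore) as further edges would turn this into a full wait-for graph.
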
### Reading-error #41 candidate (new this iterate)

**Phase-A "kernel.return success" events do NOT imply forward progress
when the call site is a tight wait-loop.** 2.J's report observed "tid=1
NtWait returns success, wedge moved or absent", but the events
captured were prior loop iterations that **fed back into the SAME wait
call** which then blocks forever. The honest interpretation is "wait
wrapper made N successful round-trips, then the (N+1)th call blocked
indefinitely." Recommend registering: **return-success in Phase-A does
not prove wedge resolution; cross-check against final-state thread
diagnostic dump under the longest available budget.**

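One way to guard against this reading error from the log side is to look for kernel calls that never return. The sketch below assumes each event is a JSON object with `kind`, `tid`, and `name` keys (hypothetical field names) and that the log records a `kernel.call` for the final, blocking wait. If that last call is not logged, this check reports nothing and the final-state dump remains the authority.

```python
import json
from collections import defaultdict


def open_kernel_calls(lines):
    """Return {tid: [names of kernel calls with no matching return]}."""
    open_calls = defaultdict(list)
    for line in lines:
        event = json.loads(line)
        kind = event.get("kind")
        tid = event.get("tid")
        if kind == "kernel.call":
            open_calls[tid].append(event.get("name"))
        elif kind == "kernel.return" and open_calls[tid]:
            # Pair the return with the most recent open call on this tid.
            open_calls[tid].pop()
    return {tid: calls for tid, calls in open_calls.items() if calls}
```

A tid whose log ends with N matched call/return pairs followed by one unmatched call is the "N successful round-trips, then a terminal block" pattern described above.
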
## Comparison: 2.H → 2.J → 2.K

| gate | 2.H (no wipe) | 2.J (wipe, 50M) | 2.K (wipe, 500M) |
|------|-----|-----|-----|
| cache probe `0xc000000f` | FAIL | PASS (9/9) | PASS (9/9) |
| total events | 118,149 | 121,569 | **121,569** |
| tid=4 events | 160 | 2,075 | **2,075** |
| thread.create count | 10 | 10 | **10** |
| tid=1 last cycle | 9,140,200 | 9,169,116 | **9,169,116** |
| VdSwap count | 1 | 1 | **1** |
| draws | 0 | 0 | **0** |
| `sub_824F8398` fires | 0 | 0 | **0** |
| `sub_825070F0` fires | 0 | 0 | **0** |
| wedge PC `0x824ac578` parked | yes | "moved" (budget short) | **5 tids parked there** |
| termination | 50M budget | 50M budget | 500M budget cleanly |
| wallclock to terminate | ~5s | ~5s | **13.96s** |

**Critical finding: 2.J ≡ 2.K at the Phase-A event level.** All
gates are identical to 2.J. The 10× budget bought roughly 3× more
wallclock (13.96s vs ~5s) but zero additional observable guest progress.
The engine is genuinely wedged from somewhere between cycle 9,140,200
and 9,169,116 onward.

## Tripstone audit

- **#28** (cross-engine tid stability): All ours-internal claims are keyed
  on entry_pc, not integer tid. 2.J ↔ 2.K are both ours-side, so integer tid
  is stable; entry_pc/ctx_ptr columns are bit-stable.
- **#39** (gameplay progression IS progression): Headline does NOT
  claim progression. VdSwap=1, draws=0, same as 2.J. The PASS claim is on
  *characterization* of the wedge (now visible at the same PC as
  AUDIT-049), not on cascade.
- **#40** (single-keystone framing): The 2.J framing "cache parity is
  the keystone, longer budget will reveal the install chain" is
  **FALSIFIED** by 2.K. Neither cache parity nor longer budget
  unblocks `sub_824F8398`. The reading-error #40 class repeats
  (this iterate's expectation that 10× budget unblocks the chain).
  Recommend registering reading-error **#41**: Phase-A
  `kernel.return success` events do not prove wedge resolution when
  the call site is a tight wait-loop with N successful spins before
  the (N+1)th terminal block.

## Confidence

- **HIGH** that 2.K reached 500M instructions cleanly (`exec complete
  wall_ms=13959 instructions=500000004` in diagnostic re-run).
- **HIGH** that the Phase-A event log is bit-identical to 2.J at the
  structural level (count, last tid_event_idx, last guest_cycle).
- **HIGH** that 5 tids are parked at `0x824ac578` at budget exhaustion
  (final-state dump is direct evidence).
- **HIGH** that `sub_824F8398` and `sub_825070F0` have 0 fires (grepped
  across all event kinds + payload fields).
- **HIGH** that wallclock grew roughly 3× between 2.J and 2.K while the
  event count stayed flat: the engine is consuming host time without
  making guest-observable progress, i.e. spinning in the JIT loop on
  re-execution of already-blocked waits or busy-loops.

## Next iterate recommendation

**Iterate 2.L should be ONE of:**

1. **Walk the wedge backward from `0x824ac578` to find the missing
   signaler** (~0-50 LOC instrumentation). Each parked tid is waiting
   on a specific event/semaphore handle. Identify per-tid: (a) who in
   canary signals that handle and when; (b) whether the signaler tid
   exists in ours; (c) if it exists, why it doesn't reach the signal
   site. The wedge handles in this run are:
   - tid=1 → 0x12c8 = Thread(id=13), waiting for tid=13 to exit
   - tid=13 → 0x12d0 = Event, needs an external signaler
   - tid=3,4,5 → 0x1020 (Event), 0x1028 (Semaphore), 0x12e4 (Event)
   - tid=8 → 0x10d8 = Semaphore (the AUDIT-069 work-semaphore class;
     parked at `0x824ab214`, not at the wedge PC)

   This is essentially AUDIT-069 territory: producer-underrun at the
   work-semaphore. ~0 LOC if reusing existing `--lr-trace` /
   `--branch-probe` infra.

2. **Push budget further (`-n 5000000000`, 10× 2.K's budget) to see if
   anything eventually fires** (~0 LOC, ~2.5 min wallclock estimate,
   decisive negative). LOW PRIORITY: based on 2.K's flat-zero events
   50M-500M, strongly predict 0 events 500M-5000M.

3. **2.D-style diff re-measure** of (op, lr) missing-tuple count from
   the IAT producer LR side (~0-30 LOC). 2.J said "expected unchanged
   at 28/28". 2.K confirms structurally identical to 2.J, so
   missing-tuple count is also expected unchanged. Re-measure to
   CONFIRM (and to refresh the producer-rate at LR 0x824AB168
   which was 9.97% in 2.D). Useful as cascade-sanity even if
   negative.

**Recommended priority: (1)**, the direct per-handle waiter→signaler
walk on the 5 parked tids at `0x824ac578`. It will identify the most
upstream missing signaler and likely lead to either AUDIT-069's
producer-underrun root or a new state-parity divergence upstream of
the install epoch. ~0-50 LOC, ~30-60 min.

**DO NOT pursue (2)** without first attempting (1): the structural
evidence (event count flat, max-cycle flat, final-state genuine
wedge) makes "longer budget" a high-confidence negative.

## Artifacts

Under `xenia-rs/audit-runs/iterate-2K-longer-budget-replay/`:

- `ours-cold.jsonl` (121,569 events, 500M-instr quiet run, ~28MB)
- `ours-cold.stdout.log` / `ours-cold.stderr.log` (empty: quiet mode)
- `exit-diag-full.log` (390 lines, non-quiet diagnostic re-run
  capturing budget-hit message, final-state dump, thread diagnostics,
  and metrics summary)
- `exit-diag.log` (50-line tail of first diagnostic run)
- `exit-diag-head.log` (100-line head of second diagnostic run)
- `writer-report.md` (this file)

Cache wiped via `XENIA_CACHE_WIPE=1` env (per-process tmpdir at
`/tmp/xenia-rs-cache-244570-0/`). No XDG cache pre-existed.