xenia-rs/audit-runs/iterate-2K-longer-budget-replay/writer-report.md
MechaCat02 ef93a4fa14 handoff: VSync/event-wedge fixes + iterate 2.A–2.BC research notes
Source changes (dormant parity infra, retained from iterate 2.AI/2.AO):
- xenia-kernel/exports.rs: nt_create_event manual_reset polarity +
  related event wiring
- xenia-gpu/mmio_region.rs: D1MODE_VBLANK_VLINE_STATUS hardcode parity

Also lands the audit-runs/ analysis notes (.md/.txt/.json digests) for the
iterate 2.x VSync/0x10e8/0x1004 wedge investigation. Raw trace dumps
(.jsonl/.gz/.csv/.stdout) and agent worktrees (.claude/) are gitignored as
regenerable local artifacts — see memory + HANDOFF for the running findings.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-05 07:19:08 +02:00

# Iterate 2.K — Longer-budget cache-wipe replay (writer report)
**Date:** 2026-05-28. **LOC delta:** engine **0**, canary **0**. Pure
measurement.
**Tests:** N/A (no source modifications).
## Headline
**INSTALL-CHAIN-ABSENT-NEW-BLOCKER.** The 500M-instruction budget run
(10× 2.J's 50M) reaches the budget cap cleanly at wallclock=13.96s
**but emits ZERO new Phase-A events past 2.J's terminus.** Event count
121,569 bit-identical to 2.J. tid=1 max guest_cycle 9,169,116 bit-identical
to 2.J. The keystone `sub_824F8398` install chain still **0 fires**;
`sub_825070F0` worker fan-out still **0 fires**. The final-state dump
shows **9 of 13 threads parked in `Blocked(WaitAny ...)` waits (8 with
`deadline: None`), 5 of them at PC `0x824ac578`**, the exact AUDIT-049
wedge PC; the remaining 4 are `Ready` but never rescheduled. The 2.J
"wedge moved / wait returns success" observation was a budget-truncated
artifact: under a longer budget, the engine re-converges to a deadlock at
the same call site. **2.J's `NtWaitForSingleObjectEx return=0` events are
the wrapper successfully returning on prior iterations of a tight
`wait → return → wait` loop; the FINAL wait of each tid blocks forever
and never emits a `kernel.return`.** Cache parity was load-bearing but
is NOT THE keystone. The next blocker is upstream of the install chain,
at the wedge-loop level.
## Mode
ZERO LOC. Invocation:
```
XENIA_CACHE_WIPE=1 timeout 600 ./xenia-rs/target/release/xenia-rs exec \
-n 500000000 --quiet \
--phase-a-event-log audit-runs/iterate-2K-longer-budget-replay/ours-cold.jsonl \
"Project Sylpheed - Arc of Deception (USA, Europe) (En,Ja).iso"
```
Identical to 2.J except `-n 50000000` → `-n 500000000`. XDG cache
already absent (no `/home/fabi/.local/share/xenia-rs/cache/`) before run;
`XENIA_CACHE_WIPE=1` set for belt-and-braces.
Run completed `EXIT=0` at wallclock 13.96s. Final reason from non-quiet
diagnostic re-run: `reached max instruction count limit=500000000`
(instruction budget hit, not a panic/fault/timeout). Total instructions
executed: 500,000,004.
## Primary gate results
| gate | 2.J | 2.K | result |
|------|-----|-----|--------|
| `sub_824F8398` install-chain fires | 0 | **0** | UNCHANGED |
| `sub_825070F0` worker fan-out fires | 0 | **0** | UNCHANGED |
Grep against full ours-cold.jsonl (case-insensitive on hex literal,
plus per-tid first-kernel-call signature): zero hits for either symbol
across all kinds (thread.create, import.call, kernel.call,
kernel.return, payload fields). The canary's tids 15/27/28 (the
`sub_825070F0` family workers) and tid 14 (audio worker
`sub_824D2878`-driven) are **structurally absent from ours's thread
fan-out at this trajectory point**, even given 10× the instruction
budget.
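
The grep described above can be approximated with a short script. This is a minimal sketch, not the command actually used for this report: it assumes only that `ours-cold.jsonl` holds one event per line, and it matches on raw text so that hits in any event kind or payload field are counted. The function name and the sample lines are hypothetical.

```python
def count_symbol_hits(lines, hex_literals):
    """Count lines that contain each hex literal, case-insensitively."""
    needles = [h.lower() for h in hex_literals]
    hits = {n: 0 for n in needles}
    for line in lines:
        lowered = line.lower()
        for n in needles:
            if n in lowered:
                hits[n] += 1
    return hits


# Synthetic stand-in lines; the real event schema is not assumed here.
sample = [
    '{"kind": "thread.create", "entry_pc": "0x824D2878"}',
    '{"kind": "kernel.call", "name": "NtWaitForSingleObjectEx"}',
]
hits = count_symbol_hits(sample, ["824F8398", "825070F0", "824d2878"])
print(hits)
```

Against the real log, the same function can be fed an open file handle (`with open("ours-cold.jsonl") as f: count_symbol_hits(f, [...])`), which streams the ~28MB file line by line.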
## Secondary cascade gate results
### Thread set
**10 thread.create entries, bit-identical to 2.J** (same entry_pcs,
same ctx_ptrs). Per tripstone #28 (don't key on integer tid):
| entry_pc | ctx_ptr | canary analog |
|----------|---------|---------------|
| 0x82181830 | 0x828f3d08 | main bootstrap |
| 0x8245a5d0 | 0x828f4838 | early helper |
| 0x82450a28 | 0x828f3b68 | producer (AUDIT-069) |
| 0x82457ef0 | 0x828f3b08 | dispatcher tid=5 |
| 0x824cd458 | 0xbe8cbb3c | per-AUDIT-068 sister |
| 0x822f1ee0 | 0xbd184a40 | helper |
| 0x824d2878 | 0x00000000 | audio worker (no kernel calls) |
| 0x824d2940 | 0x00000000 | audio companion (no kernel calls) |
| 0x82178950 | 0x828f3ec0 | input/lifecycle |
| 0x821748f0 | 0xbc6c5640 | early helper |
NB: `sub_824D2878` IS in the spawn set but its tid emits no kernel
calls in the entire 500M-instruction run (matches 2.J). Workers
`sub_825070F0` × 4 + secondary-burst tids never spawn.
### VdSwap / draws (gameplay progression — tripstone #39)
- **VdSwap = 1** (same single swap at cycle=5,577,303 / host_ns=493.5ms
as 2.J). Bit-identical timestamp.
- **Draws = 0** (no `*Draw*` kernel name emitted).
- **Gameplay progression NOT achieved.** Honest "no" per #39.
### Total event count
- **121,569 events** (bit-identical to 2.J).
- File size 28,724,871 bytes vs 2.J 28,667,xxx ish — content identical
up to floating host_ns jitter; structurally equal.
- Implication: between 50M and 500M instructions (~2.8× the wallclock,
~5s vs 13.96s), the engine emitted **0 new kernel calls, 0 new
wait.begin, 0 new handle events**. The host clock advanced but the
guest committed no observable progress.
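
One way to make "content identical up to host_ns jitter" checkable is to strip the volatile field and compare the two logs event by event. The sketch below assumes each line is a JSON object and that the timing field is a top-level key named `host_ns`; both the key placement and the helper names are assumptions for illustration.

```python
import json
from itertools import zip_longest

VOLATILE_FIELDS = {"host_ns"}  # assumed name of the host-clock field


def structural_key(line, volatile=VOLATILE_FIELDS):
    """Canonical form of one event with volatile top-level fields removed."""
    event = json.loads(line)
    for field in volatile:
        event.pop(field, None)
    return json.dumps(event, sort_keys=True)


def first_structural_diff(lines_a, lines_b):
    """Index of the first differing event, or None if the logs match."""
    for i, (a, b) in enumerate(zip_longest(lines_a, lines_b)):
        if a is None or b is None:
            return i  # one log is longer than the other
        if structural_key(a) != structural_key(b):
            return i
    return None


# Synthetic logs that differ only in host_ns.
run_j = ['{"kind": "kernel.call", "tid": 1, "host_ns": 100}']
run_k = ['{"kind": "kernel.call", "tid": 1, "host_ns": 137}']
print(first_structural_diff(run_j, run_k))
```

A `None` result over the full 2.J and 2.K logs would back the "structurally equal" claim directly instead of inferring it from matching counts.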
### Wedge state (final-state dump, non-quiet diagnostic re-run)
At budget exhaustion, no thread is advancing: 9 are blocked and 4 are
`Ready` but never rescheduled:
| tid | PC | LR | state | handle waiting on |
|-----|----|----|-------|-------------------|
| 1 | 0x824ac578 | 0x824ac578 | Blocked(WaitAny, no deadline) | 0x12c8 = Thread(id=13) |
| 11 | 0x824d2a94 | 0x824d2a94 | Blocked(WaitAny, no deadline) | 0x828a3244 = Event(sig=false) |
| 2 | 0x824a95f8 | 0x824a95f8 | Blocked(WaitAny, no deadline) | 0x8287093c = Event(sig=false) |
| 13 | 0x824ac578 | 0x824ac578 | Blocked(WaitAny, no deadline) | 0x12d0 = Event(sig=false) |
| 7 | 0x824cd4f4 | 0x824cd4f4 | Blocked(WaitAny, deadline=3000) | 0xbe8cbb5c = Event |
| 8 | 0x824ab214 | 0x824ab214 | Blocked(WaitAny, no deadline) | 0x10d8 = Semaphore(0/2^31-1) |
| 4 | 0x824ac578 | 0x824ac578 | Blocked(WaitAny, no deadline) | 0x1028 = Semaphore(0/2^31-1) |
| 5 | 0x824ac578 | 0x824ac578 | Blocked(WaitAny, no deadline) | 0x12e4 = Event(sig=false) |
| 9 | 0x824d1404 | 0x824d22b4 | Ready | — |
| 6 | 0x824ab214 | 0x824ab214 | Ready | — |
| 10 | 0x824d1404 | 0x824d22b4 | Ready | — |
| 12 | 0x824aa6a4 | 0x824aa6a4 | Ready | — |
| 3 | 0x824ac578 | 0x824ac578 | Blocked(WaitAny, no deadline) | 0x1020 = Event(sig=false) |
**5 of 13 tids parked at PC `0x824ac578`** (the AUDIT-049 wedge),
including the canonical tid=1 → Thread(id=13) → Event wait chain.
4 tids are in `Ready` state but are never rescheduled to advance.
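
The counts quoted here can be re-derived from the table. The sketch below transcribes the tid, PC and state columns by hand (it does not parse `exit-diag-full.log`) and tallies them.

```python
from collections import Counter

WEDGE_PC = 0x824AC578  # AUDIT-049 wedge PC

# (tid, pc, state) transcribed from the final-state table above.
FINAL_STATE = [
    (1, 0x824AC578, "Blocked"), (11, 0x824D2A94, "Blocked"),
    (2, 0x824A95F8, "Blocked"), (13, 0x824AC578, "Blocked"),
    (7, 0x824CD4F4, "Blocked"), (8, 0x824AB214, "Blocked"),
    (4, 0x824AC578, "Blocked"), (5, 0x824AC578, "Blocked"),
    (9, 0x824D1404, "Ready"), (6, 0x824AB214, "Ready"),
    (10, 0x824D1404, "Ready"), (12, 0x824AA6A4, "Ready"),
    (3, 0x824AC578, "Blocked"),
]

by_state = Counter(state for _, _, state in FINAL_STATE)
at_wedge = sorted(
    tid for tid, pc, state in FINAL_STATE
    if pc == WEDGE_PC and state == "Blocked"
)
print(dict(by_state))  # {'Blocked': 9, 'Ready': 4}
print(at_wedge)        # [1, 3, 4, 5, 13]
```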
tid=1's last `kernel.return` in the Phase-A log shows
`NtWaitForSingleObjectEx return_value=0 status=0x00000000` at
cycle=9,169,116, but this belongs to an **earlier** iteration of the
wait loop, NOT the wait it is currently blocked on. The final wait
(handle 0x12c8 = tid=13 thread handle) NEVER returned; no
`kernel.return` event was emitted for it because the wrapper is parked
indefinitely.
### Reading-error #41 candidate (new this iterate)
**Phase-A "kernel.return success" events do NOT imply forward progress
when the call site is a tight wait-loop.** 2.J's report observed "tid=1
NtWait returns success, wedge moved or absent" — but the events
captured were prior loop iterations that **fed back into the SAME wait
call** which then blocks forever. The honest interpretation is "wait
wrapper made N successful round-trips, then the (N+1)th call blocked
indefinitely." Recommend registering: **return-success in Phase-A does
not prove wedge resolution; cross-check against final-state thread
diagnostic dump under the longest available budget.**
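
A log-side cross-check for this reading error is to look for calls that never get a matching return. The sketch below is illustrative only: it assumes each event is a JSON object with `kind`, `tid` and `name` keys (the key names are assumptions; the `kernel.call` / `kernel.return` kinds come from this report), that the blocking call does emit a `kernel.call`, and that kernel calls do not nest within a tid.

```python
import json


def open_calls_by_tid(lines):
    """Map tid -> name of a kernel call still awaiting its return at end of log."""
    pending = {}
    for line in lines:
        event = json.loads(line)
        kind, tid = event.get("kind"), event.get("tid")
        if kind == "kernel.call":
            pending[tid] = event.get("name")
        elif kind == "kernel.return":
            pending.pop(tid, None)
    return pending


# Synthetic trace: one successful round-trip, then a call that never returns.
trace = [
    '{"kind": "kernel.call", "tid": 1, "name": "NtWaitForSingleObjectEx"}',
    '{"kind": "kernel.return", "tid": 1, "name": "NtWaitForSingleObjectEx"}',
    '{"kind": "kernel.call", "tid": 1, "name": "NtWaitForSingleObjectEx"}',
]
print(open_calls_by_tid(trace))
```

A tid that shows up in the result has a successful return earlier in the log and an open call at the end, which is the pattern 2.J misread as progress. The final-state dump remains the authoritative check.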
## Comparison: 2.H → 2.J → 2.K
| gate | 2.H (no wipe) | 2.J (wipe, 50M) | 2.K (wipe, 500M) |
|------|-----|-----|-----|
| cache probe `0xc000000f` | FAIL | PASS (9/9) | PASS (9/9) |
| total events | 118,149 | 121,569 | **121,569** |
| tid=4 events | 160 | 2,075 | **2,075** |
| thread.create count | 10 | 10 | **10** |
| tid=1 last cycle | 9,140,200 | 9,169,116 | **9,169,116** |
| VdSwap count | 1 | 1 | **1** |
| draws | 0 | 0 | **0** |
| `sub_824F8398` fires | 0 | 0 | **0** |
| `sub_825070F0` fires | 0 | 0 | **0** |
| wedge PC `0x824ac578` parked | yes | "moved" (budget short) | **5 tids parked there** |
| termination | 50M budget | 50M budget | 500M budget cleanly |
| wallclock to terminate | ~5s | ~5s | **13.96s** |
**Critical finding: 2.J ≡ 2.K at the Phase-A event level.** All
gates identical to 2.J. The 10× budget bought ~2.8× more wallclock
(~5s → 13.96s) but zero additional observable guest progress. The
engine is genuinely wedged from somewhere between cycle 9,140,200 and
9,169,116 onward.
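
The ratios behind this finding are simple arithmetic on figures already in this report. In the sketch below the 2.J wallclock is the approximate "~5s" from the table, so the wallclock ratio is an estimate.

```python
instr_2j, instr_2k = 50_000_000, 500_000_004
wall_2j_s, wall_2k_s = 5.0, 13.959  # 2.J value is approximate (~5s)
events_2j, events_2k = 121_569, 121_569

instr_ratio = instr_2k / instr_2j   # about 10x the instruction budget
wall_ratio = wall_2k_s / wall_2j_s  # about 2.8x the wallclock
new_events = events_2k - events_2j  # events gained by the longer run

print(f"instructions x{instr_ratio:.1f}, wallclock x{wall_ratio:.1f}, "
      f"new events {new_events}")
```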
## Tripstone audit
- **#28** (cross-engine tid stability): All ours-internal claims keyed
on entry_pc, not integer tid. 2.J ↔ 2.K both ours-side so integer tid
stable; entry_pc/ctx_ptr columns bit-stable.
- **#39** (gameplay progression IS progression): Headline does NOT
claim progression. VdSwap=1, draws=0 — same as 2.J. PASS claim is on
*characterization* of the wedge (now visible at the same PC as
AUDIT-049), not on cascade.
- **#40** (single-keystone framing): The 2.J framing "cache parity is
the keystone, longer budget will reveal the install chain" is
**FALSIFIED** by 2.K. Neither cache parity nor longer budget
unblocks `sub_824F8398`. Reading-error #40 class repeats again
(this iterate's expectation that 10× budget unblocks the chain).
Recommend registering reading-error **#41**: Phase-A
`kernel.return success` events do not prove wedge resolution when
the call site is a tight wait-loop with N successful spins before
the (N+1)th terminal block.
## Confidence
- **HIGH** that 2.K reached 500M instructions cleanly (`exec complete
wall_ms=13959 instructions=500000004` in diagnostic re-run).
- **HIGH** that Phase-A event log is bit-identical to 2.J at the
structural level (count, last tid_event_idx, last guest_cycle).
- **HIGH** that 5 tids parked at `0x824ac578` at budget exhaustion
(final-state dump direct evidence).
- **HIGH** that `sub_824F8398` and `sub_825070F0` are 0 fires (grepped
across all event kinds + payload fields).
- **HIGH** that wallclock grew ~2.8× between 2.J and 2.K while the
event count stayed flat: the engine is consuming host time without
making guest-observable progress, i.e. spinning in the JIT loop on
re-execution of already-blocked waits or busy-loops.
## Next iterate recommendation
**Iterate 2.L should be ONE of:**
1. **Walk the wedge backward from `0x824ac578` to find the missing
signaler** (~0-50 LOC instrumentation). Each parked tid is waiting
on a specific event/semaphore handle. Identify per-tid: (a) who in
canary signals that handle and when; (b) whether the signaler tid
exists in ours; (c) if it exists, why doesn't it reach the signal
site. The wedge handles in this run are:
- tid=1 → 0x12c8 = Thread(id=13) — waiting for tid=13 to exit
- tid=13 → 0x12d0 = Event — needs an external signaler
- tid=3,4,5 → various Event/Semaphore handles
- tid=8 → 0x10d8 = Semaphore (the AUDIT-069 work-semaphore class)
This is essentially AUDIT-069 territory: producer-underrun at the
work-semaphore. ~0 LOC if reusing existing `--lr-trace` /
`--branch-probe` infra.
2. **Push budget further (-n 5000000000, 10× 2.K's budget) to see if
anything eventually fires** (~0 LOC, ~2.5 min wallclock estimate,
decisive negative). LOW PRIORITY: based on 2.K's flat-zero events
50M-500M, strongly predict 0 events 500M-5000M.
3. **2.D-style diff re-measure** of (op, lr) missing-tuple count from
the IAT producer LR side (~0-30 LOC). 2.J said "expected unchanged
at 28/28". 2.K confirms structurally identical to 2.J, so
missing-tuple count is also expected unchanged. Re-measure to
CONFIRM (and to refresh the producer-rate at LR 0x824AB168
which was 9.97% in 2.D). Useful as cascade-sanity even if
negative.
**Recommended priority: (1)** — direct per-handle waiter→signaler
walk on the 5 parked tids at `0x824ac578`. Will identify the most
upstream missing signaler and likely lead to either AUDIT-069's
producer-underrun root or a new state-parity divergence upstream of
the install epoch. ~0-50 LOC, ~30-60 min.
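
The waiter-to-signaler walk in option (1) can start from the handles in the final-state table. The sketch below hand-transcribes the waits for the five tids parked at `0x824ac578` and follows thread-handle edges until it reaches a handle that is not a thread, which is the one whose signaler has to be found. The data layout and function name are hypothetical, and no signaler information is encoded because this run does not provide any.

```python
# tid -> (handle, object kind, target tid if the handle is a thread)
WAITS_ON = {
    1: ("0x12c8", "Thread", 13),
    13: ("0x12d0", "Event", None),
    4: ("0x1028", "Semaphore", None),
    5: ("0x12e4", "Event", None),
    3: ("0x1020", "Event", None),
}


def walk_chain(tid, waits_on):
    """Follow thread-handle waits from tid; stop at the first non-thread handle."""
    chain, seen = [], set()
    while tid in waits_on and tid not in seen:
        seen.add(tid)  # guards against a genuinely circular wait
        handle, kind, target = waits_on[tid]
        chain.append((tid, handle, kind))
        if target is None:
            break
        tid = target
    return chain


chain = walk_chain(1, WAITS_ON)
print(chain)  # [(1, '0x12c8', 'Thread'), (13, '0x12d0', 'Event')]
```

For tid=1 the walk ends at Event 0x12d0, so under these assumptions the first signaler to look for in canary is whoever sets that event.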
**DO NOT pursue (2)** without first attempting (1) — the structural
evidence (event count flat, max-cycle flat, final-state genuine
wedge) makes "longer budget" a high-confidence negative.
## Artifacts
Under `xenia-rs/audit-runs/iterate-2K-longer-budget-replay/`:
- `ours-cold.jsonl` (121,569 events, 500M-instr quiet run, ~28MB)
- `ours-cold.stdout.log` / `ours-cold.stderr.log` (empty — quiet mode)
- `exit-diag-full.log` (390 lines, non-quiet diagnostic re-run
capturing budget-hit message + final-state dump + thread diagnostics
+ metrics summary)
- `exit-diag.log` (50-line tail of first diagnostic run)
- `exit-diag-head.log` (100-line head of second diagnostic run)
- `writer-report.md` (this file)
Cache wiped via `XENIA_CACHE_WIPE=1` env (per-process tmpdir at
`/tmp/xenia-rs-cache-244570-0/`). No XDG cache pre-existed.