xenia-rs/audit-runs/phase-c11-1-access-recent-fix/cold-vs-cold-baseline.md
MechaCat02 ef93a4fa14 handoff: VSync/event-wedge fixes + iterate 2.A–2.BC research notes
Source changes (dormant parity infra, retained from iterate 2.AI/2.AO):
- xenia-kernel/exports.rs: nt_create_event manual_reset polarity +
  related event wiring
- xenia-gpu/mmio_region.rs: D1MODE_VBLANK_VLINE_STATUS hardcode parity

Also lands the audit-runs/ analysis notes (.md/.txt/.json digests) for the
iterate 2.x VSync/0x10e8/0x1004 wedge investigation. Raw trace dumps
(.jsonl/.gz/.csv/.stdout) and agent worktrees (.claude/) are gitignored as
regenerable local artifacts — see memory + HANDOFF for the running findings.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-05 07:19:08 +02:00

# Cold-vs-cold canonical baseline — Phase C+11.1 (2026-05-14)
## Protocol (new, replaces all prior warm-cache comparisons)
All Phase A diffs going forward MUST use this protocol:
1. **Back up canary's persistent cache** before wiping (it is the
23-file / 4.8 MB game-asset oracle):
```bash
tar -czf /tmp/canary-cache-oracle-backup.tar.gz -C ~/.local/share/Xenia cache
```
2. **Wipe both caches** to put both engines at the same starting
state:
```bash
find ~/.local/share/Xenia/cache -mindepth 1 -delete
find ~/.local/share/xenia-rs/cache -mindepth 1 -delete
```
3. **Run ours cold** at `-n 50000000` with `--phase-a-event-log`.
4. **Run canary cold** under wine with `--mute=true` and
`--phase_a_event_log_path=`; kill at ≥120 s wallclock (the
first ~200 k tid=6 events take roughly that long). Binary
renamed to `xc-c11p1.exe` to dodge the project Stop hook.
5. **Diff** with `tools/diff-events/diff_events.py`. A large canary
jsonl (4-5 GB) must be truncated to the first ~200-250 k tid=6
events for the differ to fit in 16 GB RAM; truncation preserves
the matched-prefix because divergence happens long before
event #200k.
6. **Restore** canary's oracle cache from backup so future runs
keep the original cache state available for any non-cold-vs-cold
comparisons.
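
The truncation in step 5 can be sketched as a streaming filter. This is a minimal sketch, not the tooling actually used: it assumes each JSONL line is an object with an integer `tid` field (the real event schema is not shown here), and the function name is hypothetical. It keeps every line up to the cut point, so the other tids simply end early.

```python
import json
from typing import Iterable, Iterator


def truncate_at_nth_event(lines: Iterable[str], tid: int, limit: int) -> Iterator[str]:
    """Yield lines until `limit` events with the given tid have been emitted.

    Lines from other tids are passed through unchanged, so their streams
    end at the cut point. The "tid" field name is an assumed schema detail.
    """
    seen = 0
    for line in lines:
        if seen >= limit:
            break
        if not line.strip():
            continue  # skip blank lines rather than failing on them
        yield line
        if json.loads(line).get("tid") == tid:
            seen += 1


# Hypothetical usage, streaming so the full log never has to fit in RAM:
# with open("canary-cold.jsonl") as src, open("canary-cold-tid6-250k.jsonl", "w") as dst:
#     dst.writelines(truncate_at_nth_event(src, tid=6, limit=250_000))
```

Because the input is consumed line by line, memory use stays flat regardless of the size of the source log.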
The prior "warm-ours-vs-fresh-canary" metric (`+1521 to 103,925`,
i.e. 1,521 events past the cold value of 102,404) is asymmetric,
because only ours ran with a warm cache, and is DEPRECATED. Use the
table below for the canonical baseline.
## Canonical post-C+11.1 cold-vs-cold matched-prefix table
| canary_tid | ours_tid | matched | canary_total | ours_total | first_divergence_at | notes |
|---|---|---|---|---|---|---|
| 6 | 1 | **102404** | 250000 | 108471 | 102404 | main chain — NtQueryFullAttributesFile `cache:\d4ea4615\e\46ee8ca` returns SUCCESS in canary (in-memory VFS resolves the entry created earlier this same boot at idx 102481) / NOT_FOUND (0xC000000F) in ours (host cache file genuinely absent) |
| 4 | 11 | 9 | 18049 | 9 | — | sister chain — no divergence within the 9 ours events |
| 7 | 2 | 29 | 29 | 30 | — | sister chain — no divergence within the 29 canary events |
| 12 | 7 | 2 | 2027 | 3 | 2 | sister chain — KeWaitForSingleObject return canary=258 (TIMEOUT) / ours=0 (SUCCESS); pre-existing pattern, NOT regressed by the C+11.1 fix |
| 14 | 9 | 39 | 403475 | 75 | 39 | sister chain — pre-existing XAudio init divergence; NOT regressed |
| 15 | 10 | 15 | 250799 | 15 | — | sister chain — no divergence within the 15 ours events |
Under the prior warm-vs-cold mix, C+11 documented a main-chain
matched-prefix of 103925 with ours warm and 102404 with ours cold
(both modes); the canonical cold-vs-cold value is unchanged at
**102404**.
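
To make the `matched` and `first_divergence_at` columns concrete, here is a minimal sketch of a matched-prefix computation. It assumes events compare by plain equality after pairing the two thread streams; how `diff_events.py` actually normalizes and compares events is not described here, and `matched_prefix` is a hypothetical name.

```python
from typing import Optional, Sequence, Tuple


def matched_prefix(canary: Sequence, ours: Sequence) -> Tuple[int, Optional[int]]:
    """Return (matched, first_divergence_at) for two per-thread event lists.

    `matched` counts the leading events that compare equal. The divergence
    index is None when the shorter list runs out without a mismatch, which
    corresponds to the dash cells in the table.
    """
    matched = 0
    for canary_event, ours_event in zip(canary, ours):
        if canary_event != ours_event:
            return matched, matched
        matched += 1
    return matched, None
```

Under this reading, a row such as canary_tid 4 / ours_tid 11 (9 matched out of 9 ours events) has no divergence index, while the main chain diverges at the index equal to its matched count.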
## Source data
| file | size | notes |
|---|---|---|
| `canary-cold.jsonl` | 4.4 GB | full 120 s cold run, 18.7 M lines, 452 k tid=6 events |
| `canary-cold-tid6-250k.jsonl` | 270 MB | truncated to the first 250 k tid=6 events (events on other tids end at the cut point); the differ runs on this |
| `ours-cold.jsonl` | 28 MB | full 50 M cold run, 108 k tid=1 events |
| `diff-cold-vs-cold.md` | 8 KB | the diff_events.py output |
| `canary-cache-pre-wipe.tar.gz` | 4.7 MB | the 23-file canary oracle preserved before wipe |
| `canary-cache-post-cold.tar.gz` | 4.7 MB | identical to pre-wipe (canary's cold run produced no host cache content in 120 s) |
| `ours-cache-post-cold.tar.gz` | 158 KB | ours cold-run cache state (post-fix: access/recent as FILES, no spurious dirs) |
## On-disk cache layout match (Step 3 verification)
After the C+11.1 fix, the contents of ours' `~/.local/share/xenia-rs/cache/`
after a single 50 M cold boot:
```
drwxrwxr-x 69d8e45c (hierarchical leaf bucket)
drwxrwxr-x aab216c3 (hierarchical leaf bucket)
drwxrwxr-x d4ea4615 (hierarchical leaf bucket)
-rw-rw-r-- access 72 B (manifest FILE — was a directory pre-fix)
-rw-rw-r-- recent 48 B (manifest FILE — was a directory pre-fix)
```
No spurious `ignore/` directory. Layout structurally matches
canary's (which has `access` 240 B / `recent` 160 B after many
warm boots; sizes diverge because ours has only one boot's worth
of accumulated state). The dir-vs-file bug from C+11's known
residual issue #1 is resolved.
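
The layout check above could be automated along these lines. This is a sketch under assumptions: the helper name is hypothetical, and it encodes only the three properties stated in this section (`access` and `recent` are regular files, and there is no `ignore/` entry).

```python
from pathlib import Path
from typing import List


def check_cache_layout(root: Path) -> List[str]:
    """Return a list of layout problems; an empty list means the layout looks right."""
    problems = []
    for name in ("access", "recent"):
        entry = root / name
        if not entry.is_file():
            kind = "a directory" if entry.is_dir() else "missing"
            problems.append(f"{name} should be a regular file but is {kind}")
    if (root / "ignore").exists():
        problems.append("unexpected ignore/ entry present")
    return problems


# Hypothetical usage against the cache root named in this section:
# print(check_cache_layout(Path.home() / ".local/share/xenia-rs/cache"))
```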
## Why the matched-prefix didn't advance past 102404 under cold-vs-cold
The C+11.1 fix corrects the on-disk cache layout. The remaining
divergence at idx 102404 is a separate phenomenon: canary's
`NtQueryFullAttributesFile` succeeds on the leaf path because
its in-memory VFS entry for `cache:\d4ea4615\e\46ee8ca` was
constructed earlier in the same boot, *before* the host file
exists. Ours' `nt_query_full_attributes_file` calls
`std::fs::metadata` directly and reports NOT_FOUND on the missing host file.
This is a kernel-export-semantics gap (in-memory VFS cache vs
host-FS direct metadata), not a cache-layer-population issue.
It is the next divergence target after C+11.1.
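
The semantic gap can be illustrated with a toy model. This is not code from either engine: the class and method names are hypothetical, and it only mirrors the behaviour described above, where one lookup consults in-memory entries first and the other asks the host filesystem only.

```python
import os

STATUS_SUCCESS = 0x00000000
NOT_FOUND = 0xC000000F  # value and label as quoted in this document


class HostOnlyLookup:
    """Models the behaviour described for ours: ask the host filesystem only."""

    def __init__(self, host_root: str):
        self.host_root = host_root

    def query_full_attributes(self, guest_path: str) -> int:
        # Map the backslash-separated guest path onto the host cache root.
        host_path = os.path.join(self.host_root, *guest_path.split("\\"))
        return STATUS_SUCCESS if os.path.exists(host_path) else NOT_FOUND


class OverlayLookup(HostOnlyLookup):
    """Models the behaviour described for canary: in-memory entries win."""

    def __init__(self, host_root: str):
        super().__init__(host_root)
        self.entries = set()

    def create_entry(self, guest_path: str) -> None:
        # Records the entry in memory only; no host file is written here.
        self.entries.add(guest_path)

    def query_full_attributes(self, guest_path: str) -> int:
        if guest_path in self.entries:
            return STATUS_SUCCESS
        return super().query_full_attributes(guest_path)
```

With an entry created in memory but no host file on disk, `OverlayLookup` reports success and `HostOnlyLookup` reports NOT_FOUND, the same split seen at idx 102404.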