# Plan — `cache:\` subsystem fix for Phase C+11 main-chain advance
## Context
Phase C+10 (2026-05-14) escalated the `cache:\` divergence at Phase A idx=102404:
```
canary[6][102403] NtQueryFullAttributesFile path="cache:\d4ea4615\e\46ee8ca"
ours [1][102403] NtQueryFullAttributesFile path="cache:\d4ea4615\e\46ee8ca"
canary[6][102404] return=0 (file resolved in persistent cache)
ours [1][102404] return=0xC0000034 (file missing from per-process tmpdir)
```
Both engines query the same path byte-for-byte (C+10 emitter extension confirms). Canary's cache mount `~/.local/share/Xenia/cache/` is **pre-populated** with 23 files / 4.8 MB across 16 hash buckets, accumulated over prior boots. Ours's cache mount is a per-process tmpdir at `/tmp/xenia-rs-cache-PID-N`, wiped per AUDIT-038 lockstep discipline (or, since AUDIT-054, `$HOME/.local/share/xenia-rs/cache` when `XENIA_CACHE_PERSIST=1`).
The escalation criteria flagged "cache-population infrastructure" as out-of-scope for the C+10 session and deferred to this planning session.
## Headline finding
The cache divergence is not "missing files" — it is **two specific engine bugs** in ours that prevent Sylpheed from building its own cache correctly:
1. **`NtSetInformationFile` class 10 (`XFileRenameInformation`) is a no-op stub** in ours. Canary properly implements it via `file->Rename(target_path)` ([xboxkrnl_io_info.cc:226-243](xenia-canary/src/xenia/kernel/xboxkrnl/xboxkrnl_io_info.cc#L226-L243)). Ours falls through to the catch-all arm that returns `STATUS_SUCCESS` without renaming ([exports.rs:1820-1905](xenia-rs/crates/xenia-kernel/src/exports.rs#L1820-L1905)): line 1820 lists class 10 in `min_length`, but the `match info_class` body at 1847-1905 has no arm for it, so the `_ => (STATUS_SUCCESS, min_length)` arm catches it.
2. **`cache:\access`, `cache:\ignore`, `cache:\recent` are created as directories** in ours when they should be files. After running ours with `XENIA_CACHE_PERSIST=1`, these top-level cache entries appear in the host filesystem as empty directories (`4096 B` each), whereas canary's cache has them as files (`access` = 240 B host file; `recent` = 160 B). The bug is in [exports.rs::open_cache_file](xenia-rs/crates/xenia-kernel/src/exports.rs#L1023-L1196)'s `is_dir_open` discriminator (lines 1041-1051) misclassifying these create requests. Suspected cause: `want_dir = (create_options & FILE_DIRECTORY_FILE) != 0` is true on Sylpheed's first `NtCreateFile cache:\access` call. Either Sylpheed actually sets bit 0x1 (which canary tolerates without creating a directory because its HostPathDevice respects the disposition differently), or ours's `create_options` arg-position read is wrong for the calls in question. Needs instrumentation to confirm.
Together these bugs produce the observed asymmetry:
* Canary's cache (warm, populated from prior boots) has 23 hierarchical leaf files (bucket/letter/remainder form, e.g. `d4ea4615/e/46ee8ca`), top-level `access` (240 B) and `recent` (160 B) manifests, and **zero `.tmp` files**.
* Ours's persistent cache after one 50M boot has 7 flat `.tmp` journals at the cache root (flat form, e.g. `d4ea4615e46ee8ca.tmp`; total 1.4 MB), 7 empty hash subdirectories, and `access`/`ignore`/`recent` as **directories instead of files**.
* Persistence experiment confirms: even with `XENIA_CACHE_PERSIST=1` and a warm boot (the `.tmp` files already present from a prior cold run), main matched-prefix is **still 102404** (unchanged from C+10's default-tmpdir result). Persistence alone does not advance the matched-prefix because the hierarchical leaf file `cache:\d4ea4615\e\46ee8ca` never materializes — the `.tmp` rename to leaf path is silently dropped by ours's stubbed `XFileRenameInformation`.
These findings reframe AUDIT-038/052/053/054's debate. The cache-population problem is not "ours needs canary's cache content" or "ours needs Sylpheed's cache-build logic implemented from scratch" — it is "ours has bugs in two existing kernel exports that block Sylpheed's own cache-build logic from completing". Sylpheed's cache-build path **already fires in ours** (visible as `.tmp` writes, directory creates, `NtSetInformationFile` calls); it just cannot promote `.tmp` to leaf because of bug #1, and writes garbage state for the top-level manifests because of bug #2.
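The shape of bug #1 can be reproduced in a few lines. The following is a minimal sketch, not the code in `exports.rs`: the function names, the `performed` flag, and the class-14 arm are hypothetical and exist only to show how a length table that knows about class 10 combines with a catch-all arm to report success without doing any work.

```rust
// Hypothetical constants for illustration; only class 10 and its 16-byte
// minimum length come from the plan text.
const STATUS_SUCCESS: u32 = 0x0000_0000;
const STATUS_INFO_LENGTH_MISMATCH: u32 = 0xC000_0004;

/// Stand-in for the min_length table: class 10 is listed here.
fn min_length(info_class: u32) -> Option<u32> {
    match info_class {
        10 => Some(16), // XFileRenameInformation
        14 => Some(8),  // a class that does have a real body in this sketch
        _ => None,
    }
}

/// Returns (status, performed). `performed` is false when no work was done.
fn set_information_stub(info_class: u32, length: u32) -> (u32, bool) {
    let Some(min) = min_length(info_class) else {
        return (STATUS_SUCCESS, false);
    };
    if length < min {
        return (STATUS_INFO_LENGTH_MISMATCH, false);
    }
    match info_class {
        14 => (STATUS_SUCCESS, true),
        // Class 10 passes validation, then lands here: success, no rename.
        _ => (STATUS_SUCCESS, false),
    }
}
```

A guest that calls this with class 10 sees `STATUS_SUCCESS` and has no way to tell that the rename never happened.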
## Investigation summary (verified facts)
### Canary's cache (from disk enumeration of `~/.local/share/Xenia/cache/`)
| top-level | type | size | notes |
|---|---|---|---|
| `access` | file | 240 B | 20 × 12-byte records: `(hash1, hash2, refcount)` manifest |
| `recent` | file | 160 B | 20 × 8-byte records: `(hash1, hash2)` recently-used list |
| `d4ea4615/` | dir | — | 1 leaf (`e/46ee8ca`, 400 B Shift-JIS Japanese localization text with `[SYSTEM]`/`[LANGUAGE]`/`XC_LANGUAGE_*` table) |
| `69d8e45c/` | dir | — | 9 leaves across 7 sub-letters (40 B–114 KB; `IPFB`-magic binary blobs) |
| `87719002/` | dir | — | 7 leaves across 4 sub-letters (38 KB–2.7 MB; largest blob is 2.7 MB asset) |
| `aab216c3/` | dir | — | 3 leaves across 2 sub-letters (2 KB–102 KB) |
Total: 23 files / 4.8 MB. **Zero `.tmp` files.**
Cache content is **game-asset cache**, not shader/PSO cache: localization text, font/asset binary blobs (`IPFB` magic suggests Japanese game-asset format), and the two manifest files (`access` enumerates known hashes; `recent` tracks recently used).
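For inspecting the `access` manifest, a small record splitter is enough. This is a sketch under two assumptions that the enumeration above does not confirm: each field is a 32-bit integer, and the fields are big-endian (the guest's native order). The type and function names are hypothetical.

```rust
/// One `access` record as described in the table: (hash1, hash2, refcount).
#[derive(Debug, PartialEq, Eq)]
struct AccessRecord {
    hash1: u32,
    hash2: u32,
    refcount: u32,
}

/// Splits a manifest into 12-byte records.
/// Returns None when the length is not a multiple of 12.
/// ASSUMPTION: three big-endian u32 fields per record.
fn parse_access(bytes: &[u8]) -> Option<Vec<AccessRecord>> {
    if bytes.len() % 12 != 0 {
        return None;
    }
    let be = |b: &[u8]| u32::from_be_bytes([b[0], b[1], b[2], b[3]]);
    Some(
        bytes
            .chunks_exact(12)
            .map(|c| AccessRecord {
                hash1: be(&c[0..4]),
                hash2: be(&c[4..8]),
                refcount: be(&c[8..12]),
            })
            .collect(),
    )
}
```

A 240-byte file yields 20 records, which matches the size reported for canary's `access`.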
### Canary's cache code (from canary source read)
* Mount registered in [xenia-canary/src/xenia/app/xenia_main.cc:612-652](xenia-canary/src/xenia/app/xenia_main.cc#L612-L652): three `HostPathDevice` mounts (`\\CACHE0`, `\\CACHE1`, `\\CACHE`) with symbolic-link aliases `cache0:`, `cache1:`, `cache:` — registered in that order because `VirtualFileSystem::ResolvePath` does `starts_with` matching.
* Cache root = `storage_root / "cache"`. `storage_root` defaults to `$XDG_DATA_HOME/Xenia` or `$HOME/.local/share/Xenia` on POSIX ([filesystem_posix.cc:76-97](xenia-canary/src/xenia/base/filesystem_posix.cc#L76-L97)).
* Cache is **persistent**: no wipe logic exists anywhere in canary source. Directories created on-demand by `HostPathDevice::Initialize` if missing ([host_path_device.cc:31-48](xenia-canary/src/xenia/vfs/devices/host_path_device.cc#L31-L48)).
* `NtQueryFullAttributesFile` ([xboxkrnl_io.cc:474-513](xenia-canary/src/xenia/kernel/xboxkrnl/xboxkrnl_io.cc#L474-L513)) returns `X_STATUS_SUCCESS` when `file_system()->ResolvePath()` returns an entry; `X_STATUS_NO_SUCH_FILE` otherwise. (Note: canary uses `NO_SUCH_FILE = 0xC000000F`; ours returns `OBJECT_NAME_NOT_FOUND = 0xC0000034`. Both are negative NTSTATUS values; both treated equivalently by Sylpheed.)
* `NtCreateFile` ([xboxkrnl_io.cc:39-111](xenia-canary/src/xenia/kernel/xboxkrnl/xboxkrnl_io.cc#L39-L111)) routes through `FileSystem::OpenFile` → `HostPathEntry::CreateEntryInternal` which calls `std::filesystem::create_directories` for the parent + `OpenFile("wb")` for the file ([host_path_entry.cc:78-98](xenia-canary/src/xenia/vfs/devices/host_path_entry.cc#L78-L98)).
* All file IO is synchronous; canary's `XFile::Write` calls `WriteSync` unconditionally ([xfile.cc:262-293](xenia-canary/src/xenia/kernel/xfile.cc#L262-L293)).
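The note that both miss codes are "negative NTSTATUS values" can be checked mechanically. The sketch below applies the usual NT convention (a status is a success when it is non-negative as a signed 32-bit value). Whether Sylpheed tests only the sign is the plan's claim and is not demonstrated by this snippet.

```rust
const STATUS_SUCCESS: u32 = 0x0000_0000;
const STATUS_NO_SUCH_FILE: u32 = 0xC000_000F; // canary's miss code
const STATUS_OBJECT_NAME_NOT_FOUND: u32 = 0xC000_0034; // ours's miss code

/// NT convention: success when the status is >= 0 as an i32.
fn nt_success(status: u32) -> bool {
    (status as i32) >= 0
}
```

Under this convention both miss codes are failures, so a caller that only checks the sign cannot distinguish them.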
### Ours's cache code (from current tree read)
* [`KernelState::resolve_default_cache_root()`](xenia-rs/crates/xenia-kernel/src/state.rs#L1235-L1273) at state.rs:1235-1273: defaults to per-process tmpdir + wipe; honors `XENIA_CACHE_ROOT=` (no wipe) and `XENIA_CACHE_PERSIST=1` (`$XDG_DATA_HOME/xenia-rs/cache` or `$HOME/.local/share/xenia-rs/cache`, no wipe). Called from [`KernelState::new_with_gpu`](xenia-rs/crates/xenia-kernel/src/state.rs#L418-L425) at state.rs:418-425, before any guest code runs.
* [`init_cache_root`](xenia-rs/crates/xenia-kernel/src/state.rs#L499-L510) at state.rs:499-510: when `wipe=true`, calls `remove_dir_all` then `create_dir_all`; when `wipe=false`, only `create_dir_all`.
* [`open_cache_file`](xenia-rs/crates/xenia-kernel/src/exports.rs#L1023-L1196) at exports.rs:1023-1196: AUDIT-054's `FILE_DIRECTORY_FILE`-bit handling lives here. `is_dir_open` logic (lines 1041-1051) decides file-vs-directory based on `FILE_DIRECTORY_FILE` bit (0x1) and `host_path.is_dir()`. Has a suspicious fallback `host_path == state.cache_root.as_deref().unwrap_or(host_path)` that is a tautology when `cache_root` is `None`.
* [`nt_set_information_file`](xenia-rs/crates/xenia-kernel/src/exports.rs#L1809-L1909) at exports.rs:1809-1909: validates `min_length` for class 10 (correctly 16 bytes) but has **no match-arm for class 10**; falls through to `_ => (STATUS_SUCCESS, min_length)` catch-all at line 1904. **This is the rename bug.**
* C+10 emitter extension at [`call_export`](xenia-rs/crates/xenia-kernel/src/state.rs#L657-L687) state.rs:657-687: wired for `NtQueryFullAttributesFile`, `NtOpenSymbolicLinkObject`, `NtCreateFile`, `NtOpenFile`. Not wired for `NtSetInformationFile` (the rename target path is in the info buffer, not in OBJECT_ATTRIBUTES, so this is the right design — but it means the rename target won't show up in `args_resolved.path`; a separate emitter hook would be needed if we want diff visibility on rename targets).
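The tautology noted for `open_cache_file` is easy to demonstrate in isolation. The following sketch mirrors only the shape of the quoted comparison; the function names are hypothetical, and the strict variant is one possible alternative rather than a fix taken from the tree.

```rust
use std::path::Path;

/// Same shape as the quoted expression: with `cache_root == None`,
/// `unwrap_or(host_path)` compares `host_path` against itself.
fn is_cache_root(host_path: &Path, cache_root: Option<&Path>) -> bool {
    host_path == cache_root.unwrap_or(host_path)
}

/// Hypothetical stricter variant: an unset cache root never matches.
fn is_cache_root_strict(host_path: &Path, cache_root: Option<&Path>) -> bool {
    cache_root.map_or(false, |root| host_path == root)
}
```

With no cache root configured, the first form reports every path as the cache root, while the strict form reports none.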
### Sylpheed's cache-build flow (from disassembly + event logs)
* Dispatcher `sub_82452DC0` at PC 0x82452DEC tries **primary data first** (`sub_82452068`, `sub_82452200`). If primary returns 0 (not found), falls back to cache via `sub_8245B000` at PC 0x82452E1C. (This reverses the AUDIT-052 framing slightly: cache is the *fallback*, not the primary path.)
* Cache gate `sub_8245B000` validates the hash-key struct, then calls `sub_8245AD00`, which formats the path via `sub_82459130` (using `sprintf` to render the hierarchical `cache:\` path, e.g. `cache:\d4ea4615\e\46ee8ca`) and queries via `sub_82612A78` (NtQueryFullAttributesFile wrapper). On miss (`r3 == -1`), branches to failure path PC 0x8245ADFC; on hit, enters critical section, calls `sub_8245B1F8` (cache file processor), and returns 1.
* **Cache-write path is NOT in sub_82452DC0**. The agent that disassembled the dispatcher did not find any `NtCreateFile` calls in the cache-miss branch. So the cache-build is in a different code path — likely fired by `sub_82452068`/`sub_82452200` (the "primary data" handlers) which, on first-time access, both compute the data AND write it to cache. The Sylpheed binary references the strings `cache:\access` (0x820B5794), `cache:\recent` (0x820B5774), `%s%08x%08x.tmp` (0x820B57AC), `cache:\ignore` (0x820B5784), `cache:\*.tmp` (0x820B5764), and `cache:\` (0x820B57A4) — confirming the game DOES manage these files itself.
* **Event-log evidence confirms cache-build fires in ours**: ours.jsonl tid=4 events at idx 28-484 show the full sequence: `NtCreateFile cache:\access` → `NtCreateFile cache:\ignore` → `NtCreateFile cache:\recent` → `NtCreateFile cache:\d4ea4615e46ee8ca.tmp` → `NtCreateFile cache:\d4ea4615` (dir, AUDIT-054 path) → `NtCreateFile cache:\d4ea4615\e` (subdir) → `NtOpenFile cache:\d4ea4615e46ee8ca.tmp` → ... → 111 total `NtSetInformationFile` calls. Canary's same trace has **0 `NtSetInformationFile` events** in the 50M window because canary's cache is warm and doesn't fire the build path.
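The relationship between the flat journal name and the hierarchical leaf path can be written down from the one pair seen in the event log (`cache:\d4ea4615e46ee8ca.tmp` and `cache:\d4ea4615\e\46ee8ca`). The sketch below is inferred from that single example plus the `%s%08x%08x.tmp` format string. Two things are assumptions: that the `%s` prefix is `cache:\`, and that the leaf splits the second hash after its first hex digit. Both function names are hypothetical.

```rust
/// Flat journal name, following `%s%08x%08x.tmp` with an assumed `cache:\` prefix.
fn tmp_name(hash1: u32, hash2: u32) -> String {
    format!("cache:\\{:08x}{:08x}.tmp", hash1, hash2)
}

/// Hierarchical leaf path. INFERRED layout: hash1 as the bucket, the first hex
/// digit of hash2 as the sub-letter, the remaining seven digits as the file name.
fn leaf_path(hash1: u32, hash2: u32) -> String {
    let h2 = format!("{:08x}", hash2);
    format!("cache:\\{:08x}\\{}\\{}", hash1, &h2[..1], &h2[1..])
}
```

If the real formatter does not zero-pad the second hash, leaf names for hashes with leading zeros would differ from this sketch.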
### Persistence experiment (cold + warm boot, 50M each)
* **Boot 1 (cold, `XENIA_CACHE_PERSIST=1`)**: digest `instructions=50000003, imports=40485, swaps=1, draws=0`. Differs from C+10 default-tmpdir baseline (`50000002`, `40465`) by +1 instruction / +20 imports — the persistence path takes slightly more guest cycles. Resulting on-disk cache: 7 `.tmp` flat journals (1.4 MB total), 7 empty hash subdirectories, 3 empty directories named `access`/`ignore`/`recent`.
* **Boot 2 (warm)**: digest unchanged from boot 1 (`instructions=50000003, imports=40485`). No cxx_throw regression at 50M (AUDIT-053's regression was at 500M+; not reproduced in this window). `.tmp` files **grew** (e.g. `d4ea4615e46ee8ca.tmp`: 2400 B → 2800 B; `aab216c3a2c8c185.tmp`: 614 KB → 717 KB) — confirming AUDIT-053's "journal appends per boot" finding.
* **Boot 2 diff vs C+10 canary baseline**: `canary_tid=6 → ours_tid=1` matched=**102404** (unchanged); divergence at the same `NtQueryFullAttributesFile` return-value (canary=0 SUCCESS, ours=0xC0000034 NOT_FOUND). Persistence alone does not advance matched-prefix.
This experiment validates: enabling persistence is necessary but **not sufficient**. The `.tmp` files are produced but the rename-to-leaf step is broken, so the next boot's NtQuery for the leaf still returns NOT_FOUND.
## Approaches considered
I considered five approaches, scored on lockstep digest impact, AUDIT-038 oracle-state risk, LOC, first-boot vs subsequent-boot behavior, and risk of regressing matched-prefix.
### (a) Flip default to `XENIA_CACHE_PERSIST=1` only
* **What**: Change `resolve_default_cache_root` so persistence is on by default.
* **Won't work alone**: experiment proves matched-prefix stays at 102404 because the `.tmp`-to-leaf promotion is broken (bug #1). Necessary but not sufficient.
### (b) Implement Sylpheed's cache-generation logic in the engine
* **What**: Write engine-side code that mirrors what Sylpheed's primary-data path does (build cache from XGD assets).
* **Don't need it**: Sylpheed's binary already does this — the cache-build path fires in ours; it just doesn't finish because of bug #1 (rename). Reverse-engineering Sylpheed's asset extractor would be hundreds of LOC and is not necessary. The game does the work; ours just needs to honor the rename so the leaf file appears.
### (c) Seed-from-canary at startup
* **What**: Copy canary's `~/.local/share/Xenia/cache/*` to ours's cache root at boot.
* **Disqualified per user direction**: AUDIT-038 oracle-state violation. The user's task explicitly says "Disqualify this option unless there's a strong-enough caveat". No such caveat applies here because approach (e), the engine-bug-fix route to what (b) was after, is feasible. Keep this option as a last-resort fallback.
### (d) Synthesize on-demand
* **What**: Intercept `NtQueryFullAttributesFile` for `cache:\` paths and lie SUCCESS even when the file is missing.
* **Doesn't work**: canary follows the query with `NtCreateFile` at idx 102481 (78 events later) to actually open and read the file. A SUCCESS lie without backing bytes only postpones the divergence by 78 events.
### (e) **Fix the two engine bugs that block Sylpheed's own cache-build (RECOMMENDED)**
* **What**:
1. Implement `NtSetInformationFile` class 10 (`XFileRenameInformation`) properly — mirror canary's `file->Rename(target_path)` for cache:-backed handles.
2. Fix `open_cache_file`'s file-vs-directory misclassification for top-level cache files (`access`, `ignore`, `recent`).
3. Flip default to persistent cache so the cache survives across boots and the build path can complete over N iterations. Keep `XENIA_CACHE_WIPE=1` as opt-out.
4. Extend Phase A emitter to capture `NtSetInformationFile` class-10 rename target paths (~60 LOC across both engines) so future rename divergences are diff-visible.
* **Why it's right**:
* No oracle state — ours builds its own cache from the same primary game data.
* Cache convergence is **deterministic** because cache content is derived from XEX assets, not engine-specific behavior. After N boots ours's cache should be byte-identical to canary's.
* Two engine bugs are documented + reproducible; both have direct canary mirrors to copy semantics from.
* AUDIT-053 warm-start cxx_throw regression was at 500M and is NOT reproduced at 50M; the Phase A diff harness window is 50M, so the regression is not blocking for the diff-harness use-case. (Document the regression as a separate known-issue for 500M+ runs.)
* **LOC estimate**: ~150-200 across 4-5 files. Breakdown below.
* **Lockstep digest impact**: NEW baseline. Both engines should be re-baselined together with `XENIA_CACHE_PERSIST=1` enabled and a deterministic cache-warmup procedure.
* **Risk of matched-prefix regression (reading-error #23)**: LOW. The fix only adds behavior on previously-no-op kernel paths; it doesn't change existing successful paths. Determinism gate validates.
## Recommended approach: (e)
Implement the two engine-side bug fixes and flip the persistence default. Let Sylpheed build its own cache over N boots. No oracle state, no `.tmp`-to-leaf magic, no cache seeding.
## Implementation stages
Each stage is independently landable and verifiable.
### Stage 1 — Implement `NtSetInformationFile` class 10 (`XFileRenameInformation`) + extend emitter to surface rename target
* **Files**:
* Ours: [exports.rs](xenia-rs/crates/xenia-kernel/src/exports.rs) (~40 LOC body); [path.rs](xenia-rs/crates/xenia-kernel/src/path.rs) (~10 LOC info-buffer parser); [state.rs](xenia-rs/crates/xenia-kernel/src/state.rs) `call_export` dispatch (~15 LOC); [event_log.rs](xenia-rs/crates/xenia-kernel/src/event_log.rs) (re-use `emit_kernel_call_with_path` — 0 LOC).
* Canary: [xboxkrnl_io_info.cc](xenia-canary/src/xenia/kernel/xboxkrnl/xboxkrnl_io_info.cc) is already correct (no change needed for body); `event_log.cc`'s `EmitImportAndCallWithCtx` dispatch (~30 LOC) — extend to dispatch on `name == "NtSetInformationFile"` and read the rename target ANSI_STRING from the info buffer when info_class==10.
* Total: ~95 LOC additive across both engines.
* **Scope (body fix, ours only)**:
* Add a `10 =>` arm to the `match info_class` in `nt_set_information_file` (around line 1847).
* Parse the `X_FILE_RENAME_INFORMATION` struct at `info_ptr`: skip `replace_if_exists`/`root_directory` (per canary, ignored on Xbox); read the trailing ANSI_STRING name.
* Translate the new name via the same `cache:\`-aware path resolver used by `open_cache_file`.
* If the source handle has `host_path = Some(_)`, call `std::fs::rename(src, dst)` and update the handle's stored `path` + `host_path` + `size` fields.
* If the source handle is VFS-backed (not cache:), return STATUS_INVALID_PARAMETER or NOT_IMPLEMENTED — Sylpheed only renames cache: files.
* Create parent directories for `dst` as needed (`create_dir_all(dst.parent())`).
* Honor the source handle's open-mode (close + re-open if necessary for write-renames).
* **Scope (emitter extension, both engines)**:
* Add a new helper `info_buffer_rename_target_raw(mem, info_ptr, info_length)` in [path.rs](xenia-rs/crates/xenia-kernel/src/path.rs) (ours) and an equivalent `ReadFileRenameInformationTarget` in canary's `event_log.cc`. Both return the raw trimmed target path without normalization, mirroring the C+10 design for `object_attributes_raw_name`.
* In `call_export`'s dispatch (state.rs:657-687 ours; `phase_a_bridge::EmitImportAndCallWithCtx` in canary), add: when `name == "NtSetInformationFile"` and `gpr[7] == 10` (info_class) and `gpr[6] >= 16` (info_length), resolve target via the helper and call `emit_kernel_call_with_path`. Otherwise legacy form.
* No schema version bump — `args_resolved.path` is already declared free-form.
* **Validation**:
* New unit test in `exports.rs`: create `cache:\foo.tmp`, write some bytes, call NtSetInformationFile class 10 with target `cache:\bar`, verify the host filesystem has `bar` under the cache root with the correct bytes and no `foo.tmp`.
* Determinism gate (3× `--stable-digest` 50M): with cvar OFF (no Phase A emitter), digest unchanged from baseline `b8fa0e0460359a4f660adb7605e053de`. With cvar ON, Phase A emitter det-fields stable across 2 runs but differ from C+10's `7489e90e…` (because rename-target paths are now in det signature).
* Re-run persistence experiment: after Stage 1, ours's cache after 50M boot should produce hierarchical leaf files (e.g. `d4ea4615/e/46ee8ca`) instead of flat `.tmp` files.
* Phase A diff: re-run `tools/diff-events/diff_events.py` with new ours run vs new canary run; expected matched-prefix advance.
* **Rollback criterion**: if cvar-OFF determinism digest changes from baseline, or if any of the 165 existing unit tests fail, revert.
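The host-side part of the Stage 1 body fix can be sketched as follows. This is a minimal illustration, assuming a handle that carries a host path and a cached size; `CacheHandle` and `rename_cache_file` are hypothetical names, and the real handle type in `exports.rs` will differ. Guest-path translation, the VFS-backed rejection case, and any close/re-open handling are left out.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical stand-in for a cache-backed file handle.
struct CacheHandle {
    host_path: PathBuf,
    size: u64,
}

/// Renames the host file behind `handle` to `dst`.
/// Parent directories of `dst` are created first, then the handle's
/// stored path and size are refreshed from the destination.
fn rename_cache_file(handle: &mut CacheHandle, dst: &Path) -> io::Result<()> {
    if let Some(parent) = dst.parent() {
        fs::create_dir_all(parent)?;
    }
    fs::rename(&handle.host_path, dst)?;
    handle.host_path = dst.to_path_buf();
    handle.size = fs::metadata(dst)?.len();
    Ok(())
}
```

`std::fs::rename` replaces an existing destination file on POSIX hosts, which is worth keeping in mind because the plan skips the `replace_if_exists` field.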
### Stage 2 — Fix top-level cache file misclassification
* **Files**: [exports.rs](xenia-rs/crates/xenia-kernel/src/exports.rs) `open_cache_file` (~10-20 LOC at lines 1041-1051).
* **Scope**:
* Instrument first: add a one-shot tracing log at top of `open_cache_file` printing `path`, `create_options`, `create_disposition`, `want_dir`, `host_path.is_dir()`, and the final `is_dir_open` value. Run ours with persistence + check the log for the cache:\access call.
* Two likely fixes depending on what instrumentation shows:
* **Option 2a (canary parity)**: if Sylpheed passes `FILE_DIRECTORY_FILE` (bit 0x1) for these files, canary presumably tolerates it because its disposition or non-directory handling takes precedence (for example, `(create_options & FILE_DIRECTORY_FILE) != 0` being treated as authoritative only when `FILE_NON_DIRECTORY_FILE`, bit 0x40, is not also set). Cross-check the bit handling in canary's NtCreateFile_entry.
* **Option 2b (arg-reading fix)**: if ours is reading `create_options` from the wrong slot (similar to AUDIT-053's r7→r8 mistake), correct it.
* Add explicit unit test: `NtCreateFile cache:\access` with the bit-pattern Sylpheed uses must result in a host file, not a directory.
* **Validation**:
* After Stage 2, persistent run of ours should produce `access`, `ignore`, and `recent` under the cache root as files (matching canary), not directories.
* Phase A diff: should not regress matched-prefix.
* **Rollback criterion**: same as Stage 1.
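One way the Option 2a hypothesis could look as a discriminator is sketched below. It is not confirmed behavior of either engine: it assumes the standard NT values `FILE_DIRECTORY_FILE = 0x1` and `FILE_NON_DIRECTORY_FILE = 0x40`, and it assumes the non-directory bit wins when both are set. Instrumentation still has to show which bits Sylpheed actually passes.

```rust
const FILE_DIRECTORY_FILE: u32 = 0x0000_0001;
const FILE_NON_DIRECTORY_FILE: u32 = 0x0000_0040;

/// Decides whether an open is treated as a directory open.
/// ASSUMPTION: an explicit non-directory request overrides everything else;
/// otherwise the directory bit or an existing host directory decides.
fn is_dir_open(create_options: u32, host_is_dir: bool) -> bool {
    let want_dir = (create_options & FILE_DIRECTORY_FILE) != 0;
    let want_file = (create_options & FILE_NON_DIRECTORY_FILE) != 0;
    if want_file {
        return false;
    }
    want_dir || host_is_dir
}
```

A stricter implementation would return an error when the non-directory bit is set and the host path is already a directory; the sketch collapses that case to `false` for brevity.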
### Stage 3 — Flip default to persistent cache + re-baseline
* **Files**: [state.rs](xenia-rs/crates/xenia-kernel/src/state.rs) `resolve_default_cache_root` (~10 LOC); related unit test `cache_root_cleared_on_init` may need updating.
* **Scope**:
* Change default: `(default_persistent_path(), false)` instead of `(tmpdir_path(), true)`. Persistent cache becomes the new default for both `cargo run` and CI Phase A runs.
* Add `XENIA_CACHE_WIPE=1` opt-out (re-enables AUDIT-038 tmpdir-wipe behavior). Document in state.rs:1235's docstring as "preserved for emergency lockstep-state-reset scenarios; not recommended for diff-harness runs because the C+10 path emitter now makes cache divergences diff-visible regardless".
* Confirm both `XENIA_CACHE_ROOT=` and `XENIA_CACHE_PERSIST=1` retain their prior semantics (the latter becomes a no-op when default is already persistent, but keep it for backwards compat).
* Re-baseline both engines' Phase A digests under the new default. Run a "cache warmup" of e.g. 5 sequential 50M boots so the cache stabilizes, then capture the new C+11 baseline.
* Update existing test `cache_root_cleared_on_init` to use `XENIA_CACHE_WIPE=1` explicitly (its determinism-gate purpose is preserved).
* **Validation**:
* Determinism: 3× 50M runs with default settings must produce the same `--stable-digest` (post-warmup).
* Phase A: re-run diff. Expected behavior: matched-prefix advances **dramatically** past 102404 (canary's `cache:\d4ea4615\e\46ee8ca` query returns SUCCESS in both engines on a warm cache; the next ~16 cache-hash queries also resolve; matched-prefix advances by hundreds-to-thousands of events until a non-cache divergence appears).
* Phase B `image_loaded_sha256`: unchanged (`ea8d160e…`) — cache state doesn't affect image hash.
* Unit tests: all 165 pass.
* **Rollback criterion**: if the new baseline is non-deterministic (3 runs produce different digests) or if matched-prefix REGRESSES below 102404, revert and investigate.
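The proposed Stage 3 precedence can be sketched as a pure function. This is an illustration only: the environment is passed in as a lookup closure so the logic can be tested without touching process state, the tmpdir name is simplified to `/tmp/xenia-rs-cache-<pid>`, and falling back to the wiped tmpdir when neither `XDG_DATA_HOME` nor `HOME` is set is an assumption. `resolve_cache_root` is a hypothetical name, not the signature of `resolve_default_cache_root`.

```rust
use std::path::PathBuf;

/// Returns (cache_root, wipe) under the proposed persistent-by-default policy.
fn resolve_cache_root(env: impl Fn(&str) -> Option<String>, pid: u32) -> (PathBuf, bool) {
    // An explicit root always wins and is never wiped.
    if let Some(root) = env("XENIA_CACHE_ROOT") {
        return (PathBuf::from(root), false);
    }
    // Opt-out: restore the per-process tmpdir plus wipe.
    if env("XENIA_CACHE_WIPE").as_deref() == Some("1") {
        return (PathBuf::from(format!("/tmp/xenia-rs-cache-{pid}")), true);
    }
    // Default: persistent cache under the XDG data directory.
    let data_home = env("XDG_DATA_HOME")
        .map(PathBuf::from)
        .or_else(|| env("HOME").map(|h| PathBuf::from(h).join(".local/share")));
    match data_home {
        Some(dir) => (dir.join("xenia-rs").join("cache"), false),
        // ASSUMPTION: with no home directory, fall back to tmpdir + wipe.
        None => (PathBuf::from(format!("/tmp/xenia-rs-cache-{pid}")), true),
    }
}
```

In production the closure would be `|k| std::env::var(k).ok()`. `XENIA_CACHE_PERSIST=1` needs no branch here because persistence is already the default.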
### Stage 4 (optional, deferred) — Re-test AUDIT-053 warm-start regression at 500M
* **Scope**: Run ours `XENIA_CACHE_PERSIST=1` for 500M instructions across 5 successive boots; check for `cxx_throw` events from version-header mismatch (the AUDIT-053 / AUDIT-054 regression). If reproduced, investigate `.tmp` journal truncation logic. If not reproduced (AUDIT-054's FILE_DIRECTORY_FILE fix + Stage 1's rename fix together resolve it), update memory entries accordingly.
* **Validation**: 5×500M sequential boots with no cxx_throw regression; cache content stabilizes (no unbounded `.tmp` growth).
* **Why deferred**: Stage 1-3 unblock the 50M Phase A diff window which is the immediate goal. 500M warm-start is a separate property to validate but not on the critical path for Phase C+11.
## Out of scope / deferred
* **STFS / SVOD content packages** — separate VFS subsystem; not touched.
* **XAM content packages** (DLC, themes, gamerpics) — handled by separate content_root, not by `cache:`.
* **Save games** — separate `content:` mount, not by `cache:`.
* **GPU shader cache** — handled by `cache_root` cvar for `graphics_system_` in canary; ours does not yet implement this (and Sylpheed at 50M doesn't fire the shader-cache path). Deferred.
* **Sylpheed binary writers for `access`/`recent` manifests** — investigation found string refs but did not locate the writers in 50M event window. Bug fixes in this plan should be sufficient because the writers will fire eventually when ours's cache hierarchy supports them.
* **`cache0:` and `cache1:` aliases** — canary mounts three; ours currently funnels all three to one cache root via `resolve_cache_path` prefix-strip (state.rs:534-543). If Sylpheed uses cache0/cache1 distinctly, a follow-up may need to separate them. Not yet known whether Sylpheed does.
* **Phase A emitter for `NtSetInformationFile` rename target path**: scoped under Stage 1 above, but optional and not blocking for the body fix. Schema-v1 already supports `args_resolved.path`; the emitter only needs to dispatch on info_class==10 and read the X_FILE_RENAME_INFORMATION name.
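Reading the rename target out of the info buffer could look like the sketch below. The layout is an assumption consistent with the 16-byte minimum length noted earlier: two big-endian `u32` fields (replace flag, root directory handle) followed by an ANSI_STRING of `u16 length`, `u16 maximum_length`, `u32 buffer pointer`. Guest memory is modeled as a flat byte slice indexed by guest address, which is a simplification, and the function is a stand-in for the planned `info_buffer_rename_target_raw` helper rather than its actual implementation.

```rust
/// Bounds-checked big-endian reads from a flat guest-memory slice.
fn be_u16(mem: &[u8], at: usize) -> Option<u16> {
    let b = mem.get(at..at.checked_add(2)?)?;
    Some(u16::from_be_bytes([b[0], b[1]]))
}

fn be_u32(mem: &[u8], at: usize) -> Option<u32> {
    let b = mem.get(at..at.checked_add(4)?)?;
    Some(u32::from_be_bytes([b[0], b[1], b[2], b[3]]))
}

/// Returns the raw, un-normalized rename target, or None on a short buffer
/// or out-of-range pointer.
/// ASSUMED layout: +0 u32 replace, +4 u32 root_dir, +8 u16 length,
/// +10 u16 maximum_length, +12 u32 pointer to the name bytes.
fn info_buffer_rename_target_raw(mem: &[u8], info_ptr: usize, info_length: u32) -> Option<String> {
    if info_length < 16 {
        return None;
    }
    let len = be_u16(mem, info_ptr.checked_add(8)?)? as usize;
    let ptr = be_u32(mem, info_ptr.checked_add(12)?)? as usize;
    let bytes = mem.get(ptr..ptr.checked_add(len)?)?;
    // Trim trailing NULs; decode byte-for-byte so no input is dropped.
    let end = bytes.iter().rposition(|&b| b != 0).map_or(0, |i| i + 1);
    Some(bytes[..end].iter().map(|&b| b as char).collect())
}
```

Decoding each byte as one `char` keeps the result lossless for diffing; it does not attempt Shift-JIS or any other code page.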
## Validation strategy ("done enough" for iteration to resume)
The cache subsystem is "done enough" when:
1. **Phase A diff matched-prefix advances past 102,404** by at least several hundred events on the main chain (canary tid=6 ↔ ours tid=1). Cascading cache-hash resolutions should advance the matched-prefix by ~100s to ~1000s of events each; the next non-cache divergence appears past idx ~110K.
2. **All 6 sister chains hold or advance** (no regression on tid=4↔11, tid=7↔2, tid=12↔7, tid=14↔9, tid=15↔10).
3. **165 existing unit tests pass**; ~3 new tests land for cache rename + cache top-level files.
4. **Phase A determinism digest reproducible**: 3× `--stable-digest` runs at 50M produce identical digest. New C+11 baseline captured.
5. **Phase B `image_loaded_sha256` unchanged**: `ea8d160e…` still matches.
6. **Both engines build clean** (`cargo build --release` for ours, `xenia-canary` MSVC Debug for canary).
7. **On-disk cache content (post Stage 3) approximately matches canary's**: same 16 top-level hash buckets, same hierarchical leaf structure, same `access`/`recent` manifests as files (byte-identical content not required because game-data-derived).
If matched-prefix advances past 102,404 but stops at a NEW cache-related divergence (e.g. a 17th hash bucket that wasn't in the original 16), this counts as in-scope continuation. If matched-prefix stops at a non-cache divergence (a different kernel export, a thread-scheduling difference), the cache subsystem is complete and the next session inherits the new divergence.
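The matched-prefix figure used throughout these criteria can be stated precisely. The sketch below assumes it means the number of leading events that compare equal on their deterministic fields. It is a Rust restatement for clarity, not the logic of `tools/diff-events/diff_events.py`, and the event representation is left generic.

```rust
/// Number of leading events that are equal in both traces.
/// The first index where the traces differ (or where the shorter one ends)
/// is the returned value.
fn matched_prefix<T: PartialEq>(canary: &[T], ours: &[T]) -> usize {
    canary
        .iter()
        .zip(ours)
        .take_while(|(a, b)| a == b)
        .count()
}
```

Criterion 1 then reads as `matched_prefix(canary_tid6, ours_tid1) > 102_404`, with the event at the returned index naming the next divergence to classify as cache-related or not.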
## Critical files to read before implementation
* [exports.rs:1023-1196](xenia-rs/crates/xenia-kernel/src/exports.rs#L1023-L1196) — `open_cache_file` (Stage 2 target)
* [exports.rs:1809-1909](xenia-rs/crates/xenia-kernel/src/exports.rs#L1809-L1909) — `nt_set_information_file` (Stage 1 target)
* [exports.rs:6830-6980](xenia-rs/crates/xenia-kernel/src/exports.rs#L6830-L6980) — cache test suite (Stage 1/2 add tests here)
* [state.rs:1235-1273](xenia-rs/crates/xenia-kernel/src/state.rs#L1235-L1273) — `resolve_default_cache_root` (Stage 3 target)
* [xenia-canary/src/xenia/kernel/xboxkrnl/xboxkrnl_io_info.cc:226-243](xenia-canary/src/xenia/kernel/xboxkrnl/xboxkrnl_io_info.cc#L226-L243) — canary's XFileRenameInformation impl (mirror semantics)
## Reading-error class
No new class. Existing classes re-affirmed:
* Class #28 (oracle source supersedes spec): verified canary's `NtSetInformationFile` implementation by reading [xboxkrnl_io_info.cc:226-243](xenia-canary/src/xenia/kernel/xboxkrnl/xboxkrnl_io_info.cc#L226-L243); not assumed.
* Class #15 / ζ (VFS layout aliasing per AUDIT-053): the AUDIT-054 fix was correct but didn't catch this sibling bug (rename) or the top-level-file-as-directory bug. Both are now identified.
A possible *future* class would be "stub-by-min-length-validation": ours's `nt_set_information_file` validated `min_length` for class 10 in its lookup table but had no actual implementation, so calls returned `STATUS_SUCCESS` without performing the operation. This is a candidate for reading-error class #29 ("validation table claims support that the body doesn't deliver"); defer the formal naming until a second instance is found.
## Open questions (for next implementation session, NOT this plan)
1. Does Sylpheed actually call NtSetInformationFile class 10, or does it use NtDeleteFile + NtCreateFile to "rename"? Stage 1 instrumentation should confirm class 10 is hit; if not, the bug is elsewhere. (Strong indirect evidence says class 10: canary properly implements it, Sylpheed binary references rename-style cache:\ patterns, ours has 111 NtSetInformationFile calls per boot but 0 in canary.)
2. Does Sylpheed write `cache:\access` and `cache:\recent` from the same 50M window, or does that fire later (e.g. after cache-build cycle completes)? If later, those files only appear after Stage 3's multi-boot warmup.
3. Are `cache:\access` and `cache:\recent` deterministic byte-for-byte across engines, or do they include host-allocator addresses / timestamps / RNG state? If non-deterministic, matching ours's cache to canary's content would require canonicalization in the diff tool (similar to AUDIT-043's ALLOCATOR_RETURN_FNS).
4. Should Stage 3 introduce a "cache warmup harness" (run N boots automatically) or leave warmup to the developer? Probably the latter — keep tests simple, document the procedure.
## Deliverables expected after this plan is approved
* `xenia-rs/audit-runs/cache-subsystem-plan/plan.md` — this plan (copied from `/home/fabi/.claude/plans/you-are-starting-a-inherited-pizza.md`)
* `xenia-rs/audit-runs/cache-subsystem-plan/investigation.md` — investigation notes captured here (canary cache enumeration, Sylpheed disassembly summary, persistence experiment result)
* `xenia-rs/audit-runs/cache-subsystem-plan/canary-cache-listing.csv` — already collected (23 files / 4.8 MB enumerated)
* `xenia-rs/audit-runs/cache-subsystem-plan/persistent-experiment.md` — already collected (cold-vs-warm 50M digest table, .tmp growth observation, matched-prefix unchanged result)
* `xenia-rs/audit-runs/cache-subsystem-plan/persist-warm-events.jsonl` — already collected (121,450 events from `XENIA_CACHE_PERSIST=1` warm boot)
* Memory entry: `project_cache_subsystem_plan_2026_05_14.md` — summary + recommendation + sized roadmap
* `MEMORY.md` index update — one line