Cache subsystem investigation — Phase C+11 planning (2026-05-14)
Scope
This investigation informs the plan at plan.md. It was run as a dedicated planning session after Phase C+10 escalated the cache divergence at idx 102404. Findings are READ-ONLY observations; no source modified.
1. Canary's cache enumeration
Canary's mount: ~/.local/share/Xenia/cache/ (the POSIX storage_root / "cache"
convention; canary's xenia-canary/src/xenia/app/xenia_main.cc:612-652 registers
three HostPathDevice mounts at \\CACHE0, \\CACHE1, \\CACHE aliased to
cache0:, cache1:, cache: symbolic links).
State at session start: 23 files / 4.8 MB across 16 hash buckets. Pre-populated across many prior canary boots. Full enumeration in canary-cache-listing.csv.
Notable properties:
- Zero .tmp files: canary's cache holds only resolved hierarchical leaves (<H1>/<X>/<H2> form) plus two top-level manifests (access, recent). The .tmp flat-journal files Sylpheed uses for staging are renamed/removed before they persist.
- Top-level access and recent are files, not directories. Layouts:
  - access: 20×12-byte records (hash1 u32 BE, hash2 u32 BE, refcount u32). The 240 B file enumerates the 20 known cache entries (note: 23 files total on disk but only 20 manifest entries; three of the on-disk files are not indexed, possibly recent-only or orphans).
  - recent: 20×8-byte records (hash1 u32 BE, hash2 u32 BE). Recently-used ordering of the same hash pairs.
- Cache content is game-asset cache: Shift-JIS Japanese localization text (d4ea4615/e/46ee8ca: [SYSTEM]/[LANGUAGE]/XC_LANGUAGE_* table); IPFB-magic binary blobs (game-asset format, likely font/sprite/level data); large blobs up to 2.7 MB. This is NOT shader cache or PSO cache.
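The manifest layouts above are simple enough to decode by hand when comparing engines. The following is a minimal sketch, not code from either emulator: the names AccessRecord, parse_access and parse_recent are hypothetical, and the refcount field is assumed to be big-endian like the two hashes (the notes above do not state its byte order).

```rust
/// One 12-byte record of the `access` manifest, per the layout described above.
#[derive(Debug, PartialEq, Eq)]
pub struct AccessRecord {
    pub hash1: u32,
    pub hash2: u32,
    pub refcount: u32,
}

/// Read a big-endian u32 from the first four bytes of `b`.
fn be_u32(b: &[u8]) -> u32 {
    u32::from_be_bytes([b[0], b[1], b[2], b[3]])
}

/// Decode an `access` manifest. Returns None when the length is not a
/// multiple of 12, which would indicate a different layout than assumed.
pub fn parse_access(bytes: &[u8]) -> Option<Vec<AccessRecord>> {
    if bytes.len() % 12 != 0 {
        return None;
    }
    Some(
        bytes
            .chunks_exact(12)
            .map(|r| AccessRecord {
                hash1: be_u32(&r[0..4]),
                hash2: be_u32(&r[4..8]),
                // ASSUMPTION: refcount shares the hashes' big-endian order.
                refcount: be_u32(&r[8..12]),
            })
            .collect(),
    )
}

/// Decode a `recent` manifest (8-byte records: hash1, hash2).
pub fn parse_recent(bytes: &[u8]) -> Option<Vec<(u32, u32)>> {
    if bytes.len() % 8 != 0 {
        return None;
    }
    Some(
        bytes
            .chunks_exact(8)
            .map(|r| (be_u32(&r[0..4]), be_u32(&r[4..8])))
            .collect(),
    )
}
```

With this layout a 240-byte access file decodes to 20 records, matching the count reported above.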
2. Canary's cache code (xenia-canary)
Mount/init:
- xenia-canary/src/xenia/app/xenia_main.cc:612-652: registers three HostPathDevice mounts.
- xenia-canary/src/xenia/base/filesystem_posix.cc:76-97: POSIX path resolution for storage_root via $XDG_DATA_HOME then $HOME/.local/share.
- xenia-canary/src/xenia/vfs/devices/host_path_device.cc:31-48: creates the host directory if missing (std::filesystem::create_directories). No wipe logic anywhere in canary source. Cache survives across boots.
- xenia-canary/src/xenia/vfs/devices/host_path_entry.cc:78-98: CreateEntryInternal calls create_directories(parent) + OpenFile("wb").
NT IO handlers:
- xenia-canary/src/xenia/kernel/xboxkrnl/xboxkrnl_io.cc:39-111: NtCreateFile routes through FileSystem::OpenFile with is_directory = (create_options & FILE_DIRECTORY_FILE) != 0 and is_non_directory = (create_options & FILE_NON_DIRECTORY_FILE) != 0.
- xenia-canary/src/xenia/kernel/xboxkrnl/xboxkrnl_io.cc:474-513: NtQueryFullAttributesFile returns X_STATUS_SUCCESS (0) on ResolvePath hit; X_STATUS_NO_SUCH_FILE (0xC000000F) on miss.
- xenia-canary/src/xenia/kernel/xboxkrnl/xboxkrnl_io_info.cc:226-243: NtSetInformationFile class 10 (XFileRenameInformation) correctly implemented:

        case XFileRenameInformation: {
          auto info = info_ptr.as<X_FILE_RENAME_INFORMATION*>();
          std::filesystem::path target_path =
              util::TranslateAnsiPath(kernel_memory(), &info->ansi_string);
          if (!IsValidPath(target_path.string(), false)) {
            return X_STATUS_OBJECT_NAME_INVALID;
          }
          if (!target_path.has_filename()) {
            return X_STATUS_INVALID_PARAMETER;
          }
          file->Rename(target_path);
          out_length = sizeof(*info);
          break;
        }
All file IO is synchronous on the host (XFile::Write → WriteSync →
std::fwrite).
3. Ours's cache code (xenia-rs current HEAD)
Mount/init:
- xenia-rs/crates/xenia-kernel/src/state.rs:1235-1273 (resolve_default_cache_root):
  - Default: per-process tmpdir std::env::temp_dir()/xenia-rs-cache-{pid}-{counter} with wipe=true (AUDIT-038).
  - XENIA_CACHE_ROOT=<path> env: explicit path, no wipe.
  - XENIA_CACHE_PERSIST=1 (or "true" case-insensitive): $XDG_DATA_HOME/xenia-rs/cache or $HOME/.local/share/xenia-rs/cache, no wipe.
- xenia-rs/crates/xenia-kernel/src/state.rs:499-510 (init_cache_root): conditionally wipes and recreates.
- xenia-rs/crates/xenia-kernel/src/state.rs:519-554 (resolve_cache_path): case-insensitive prefix-match on cache:\, cache:/, cache0:\, cache0:/, cache1:\, cache1:/; backslash → forward slash normalization; "..", "." and empty components filtered for traversal safety. Funnels all three (cache, cache0, cache1) to a single backing root, unlike canary, which has three separate HostPathDevice mounts.
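The resolution rules listed for resolve_cache_path can be illustrated with a short sketch. This is a reconstruction from the description above, not the body at state.rs:519-554. The function name resolve_cache_path_sketch and the CACHE_PREFIXES table are hypothetical, and it assumes filtered components are silently dropped rather than rejected.

```rust
use std::path::{Path, PathBuf};

/// Guest prefixes that all funnel to one backing root (per the notes above).
const CACHE_PREFIXES: [&str; 6] = [
    "cache0:\\", "cache0:/", "cache1:\\", "cache1:/", "cache:\\", "cache:/",
];

/// Map a guest cache path onto `root`, or return None for non-cache paths.
pub fn resolve_cache_path_sketch(root: &Path, guest: &str) -> Option<PathBuf> {
    // Case-insensitive prefix match; ASCII lowercasing preserves byte offsets.
    let lower = guest.to_ascii_lowercase();
    let prefix = CACHE_PREFIXES.iter().find(|p| lower.starts_with(**p))?;
    let rest = &guest[prefix.len()..];

    let mut out = root.to_path_buf();
    // Treat both separators alike and drop "..", "." and empty components
    // so the result can never escape `root`.
    for part in rest.split(|c: char| c == '\\' || c == '/') {
        if part.is_empty() || part == "." || part == ".." {
            continue;
        }
        out.push(part);
    }
    Some(out)
}
```

Under these assumptions cache:\d4ea4615\e and cache0:/d4ea4615/e resolve to the same host path, which is the single-root behaviour that differs from canary's three mounts.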
NT IO handlers:
- xenia-rs/crates/xenia-kernel/src/exports.rs:1023-1196 (open_cache_file): AUDIT-054 FILE_DIRECTORY_FILE-bit handling at lines 1041-1051. The is_dir_open decision uses (create_options & FILE_DIRECTORY_FILE) != 0 || host_path.is_dir() || host_path == state.cache_root.unwrap_or(host_path). The last term is a tautology when cache_root is None (it reduces to host_path == host_path, which is always true), but harmless when cache_root is Some(_).
- xenia-rs/crates/xenia-kernel/src/exports.rs:1354-1373 (nt_create_file): reads create_options from sp + 0x54 (per AUDIT-054's shim_utils.h:49-50 citation). r5=obj_attrs, r10=create_disposition.
- xenia-rs/crates/xenia-kernel/src/exports.rs:1375-1405 (nt_open_file): reads open_options from r7 (AUDIT-053's r8→r7 fix, Phase C+5).
- xenia-rs/crates/xenia-kernel/src/exports.rs:1809-1909 (nt_set_information_file): validates min_length for class 10 at line 1822 (10 => 16), but the match body at 1847-1905 has no case-arm for class 10. The _ => (STATUS_SUCCESS, min_length) catch-all at line 1904 fires for class 10, returning success without performing the rename. This is bug #1 in the plan's headline finding.
- xenia-rs/crates/xenia-kernel/src/exports.rs:1913-1990 (nt_query_full_attributes_file): cache short-circuit at lines 1930-1957 uses std::fs::metadata(&hp) directly; returns STATUS_OBJECT_NAME_NOT_FOUND (0xC0000034) on miss. Different value than canary's 0xC000000F but treated equivalently by Sylpheed.
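The tautology in the is_dir_open expression is easy to miss when reading it inline, so here is a small model of it. This is a sketch under assumptions: cache_root is modelled as Option<&Path> (the real field type was not checked here), both function names are hypothetical, and is_dir_open_strict is one possible rewrite rather than the fix chosen by the plan.

```rust
use std::path::Path;

/// NT create option bit requesting a directory open.
pub const FILE_DIRECTORY_FILE: u32 = 0x0000_0001;

/// Mirrors the expression quoted above, with cache_root modelled as Option<&Path>.
pub fn is_dir_open_as_quoted(
    create_options: u32,
    host_path: &Path,
    cache_root: Option<&Path>,
) -> bool {
    (create_options & FILE_DIRECTORY_FILE) != 0
        || host_path.is_dir()
        // When cache_root is None this compares host_path with itself.
        || host_path == cache_root.unwrap_or(host_path)
}

/// Hypothetical variant: the root comparison only applies when a root exists.
pub fn is_dir_open_strict(
    create_options: u32,
    host_path: &Path,
    cache_root: Option<&Path>,
) -> bool {
    (create_options & FILE_DIRECTORY_FILE) != 0
        || host_path.is_dir()
        || cache_root.map_or(false, |root| host_path == root)
}
```

With cache_root set to None, the quoted form classifies every path as a directory open, while the strict form falls back to the flag and the on-disk check.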
C+10 emitter extension:
- xenia-rs/crates/xenia-kernel/src/state.rs:657-687: call_export dispatches by name to object_attributes_raw_name (path.rs:109-115) for the 4 OBJECT_ATTRIBUTES*-taking exports: NtQueryFullAttributesFile (r3), NtOpenSymbolicLinkObject (r4), NtCreateFile (r5), NtOpenFile (r5). Calls emit_kernel_call_with_path (event_log.rs:202-229). Not wired for NtSetInformationFile (info buffer has the path, not OBJECT_ATTRIBUTES). Stage 1 of the plan extends this dispatch to class-10 rename targets.
Tests:
- xenia-rs/crates/xenia-kernel/src/exports.rs:6830-6980: 5 cache-specific tests: cache_create_write_read_roundtrip, cache_file_create_collision, cache_file_open_missing, cache_root_cleared_on_init, cache_resolve_strips_path_traversal. Plus 3 async/sync file tests.
- No tests cover NtSetInformationFile class 10. Stage 1 of the plan adds this test.
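To make the missing class-10 behaviour concrete, the sketch below shows the shape a rename arm could take next to a catch-all like the one described above. It is illustrative only: set_information_sketch is a hypothetical name, the real handler works on guest handles and an info buffer rather than host paths, and mapping every I/O error to STATUS_UNSUCCESSFUL is an assumption, not canary's behaviour.

```rust
use std::fs;
use std::path::Path;

pub const STATUS_SUCCESS: u32 = 0x0000_0000;
pub const STATUS_UNSUCCESSFUL: u32 = 0xC000_0001;
/// Information class 10 (XFileRenameInformation in canary).
pub const FILE_RENAME_INFORMATION_CLASS: u32 = 10;

/// Dispatch on the information class, performing a host rename for class 10.
/// `from` and `to` are already-resolved host paths in this simplified model.
pub fn set_information_sketch(info_class: u32, from: &Path, to: &Path) -> u32 {
    match info_class {
        FILE_RENAME_INFORMATION_CLASS => match fs::rename(from, to) {
            Ok(()) => STATUS_SUCCESS,
            // ASSUMPTION: collapse all host errors into one failure status.
            Err(_) => STATUS_UNSUCCESSFUL,
        },
        // Other classes keep the "report success" catch-all described above.
        _ => STATUS_SUCCESS,
    }
}
```

A test along the lines of the one Stage 1 proposes would write a .tmp file, issue class 10 with a leaf target, and then check that the leaf exists and the .tmp is gone.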
4. Sylpheed's cache code (guest PPC binary)
Disassembly of the cache-fallback dispatcher chain (via xenia-rs disasm + sylpheed.db):
- sub_82452DC0 (PC 0x82452DC0–0x82453024): high-level dispatcher.
  - 0x82452DEC: tries primary data via sub_82452068 + sub_82452200.
  - 0x82452E08: checks r3 == 0. On not-found, branches to cache fallback at 0x82452E1C.
  - 0x82452E1C: calls cache gate sub_8245B000.
  - 0x82452E28: if cache returns 0 (miss), branches to 0x82452E88 (skip cache).
  - 0x82452E30: cache hit → call callback sub_8245B078.
- sub_8245B000 (cache gate): validates hash key, calls sub_8245AD00.
- sub_8245AD00 (cache query): formats path via sub_82459130 (sprintf cache:\<H1>\<X>\<H2>); queries via sub_82612A78 (NtQueryFullAttributesFile wrapper). On miss (r3 == -1 at 0x8245AD90), branches to failure 0x8245ADFC. On hit, enters critical section + calls sub_8245B1F8 (cache reader).
- sub_82459130 (path formatter): pure sprintf, no cache write.
- sub_82612A78 (NtQueryFullAttributesFile wrapper): wraps the kernel import; converts STATUS to -1 on error.
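The wrapper's "STATUS to -1 on error" conversion is why the two engines' different miss codes look identical to the game. The sketch below models that with the standard NT_SUCCESS convention (a status is a failure when its sign bit is set). Whether sub_82612A78 tests the status exactly this way is an assumption; the disassembly notes above only say it returns -1 on error. Both function names are hypothetical.

```rust
/// NT_SUCCESS convention: statuses with the high bit clear are successes.
pub fn nt_success(status: u32) -> bool {
    (status as i32) >= 0
}

/// Models the wrapper's error collapse: any failing status becomes a miss.
pub fn collapses_to_miss(status: u32) -> bool {
    !nt_success(status)
}
```

Under this model 0xC0000034 (ours) and 0xC000000F (canary) both collapse to a miss, while 0 is a hit.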
Cache-write path was NOT located in sub_82452DC0's disassembly. The dispatcher
agent reported no NtCreateFile in the miss branch. Likely the cache build fires
from a different code path (probably inside sub_82452068/sub_82452200, the
"primary data" handlers, which on first-time access compute the data AND write
it to cache).
Sylpheed binary string references (all confirmed via .pe text-search):
- cache:\access at 0x820B5794
- cache:\recent at 0x820B5774
- cache:\ignore at 0x820B5784
- cache:\*.tmp at 0x820B5764
- cache:\ at 0x820B57A4
- %s%08x%08x.tmp at 0x820B57AC (format string for cache:\<H1><H2>.tmp flat journal)
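The two naming schemes can be reproduced from a hash pair. The journal name follows the %s%08x%08x.tmp format string directly. The leaf split is an inference from the observed paths in these notes (d4ea4615/e/46ee8ca next to d4ea4615e46ee8ca.tmp), namely that <X> is the top hex digit of the second hash and <H2> is its remaining seven digits. The function names and the zero-padding of the seven-digit part are assumptions.

```rust
/// Flat journal name, following the `%s%08x%08x.tmp` format string.
pub fn journal_path(prefix: &str, h1: u32, h2: u32) -> String {
    format!("{}{:08x}{:08x}.tmp", prefix, h1, h2)
}

/// Hierarchical leaf `<H1>\<X>\<H2>`.
/// ASSUMPTION (inferred from observed paths): X is the top nibble of `h2`
/// and H2 is the low 28 bits printed as seven hex digits.
pub fn leaf_path(prefix: &str, h1: u32, h2: u32) -> String {
    format!("{}{:08x}\\{:x}\\{:07x}", prefix, h1, h2 >> 28, h2 & 0x0FFF_FFFF)
}
```

For the pair (0xd4ea4615, 0xe46ee8ca) this yields cache:\d4ea4615e46ee8ca.tmp and cache:\d4ea4615\e\46ee8ca, matching the names seen in the event log and in canary's cache listing.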
Conclusion: Sylpheed manages its own cache content. The game has both the
read path (sub_82452DC0 dispatcher) and the write path (currently unlocated,
likely in primary-data handlers). The write path is what creates .tmp files
and (we infer) calls NtSetInformationFile class 10 to rename them to
hierarchical leaves.
5. Event-log evidence (Phase A jsonl)
From xenia-rs/audit-runs/phase-c10-NtQueryFullAttributesFile/ours.jsonl,
tid=4's cache-build sequence on COLD cache:
| idx | event | path |
|---|---|---|
| 13 | NtOpenFile | cache:\ (probe mount root) |
| 19 | NtClose | (close root probe) |
| 28 | NtCreateFile | cache:\access → returns 0xC0000034 NOT_FOUND on cold |
| 37 | NtCreateFile | cache:\ignore → returns 0xC0000034 |
| 46 | NtCreateFile | cache:\recent → returns 0xC0000034 |
| 64 | NtCreateFile | cache:\d4ea4615e46ee8ca.tmp (flat journal, FILE_CREATE) |
| 69 | NtSetInformationFile | (class TBD; ours emitter doesn't capture info_class) |
| 196 | NtCreateFile | cache:\d4ea4615 (DIR, post-AUDIT-054) |
| 205 | NtCreateFile | cache:\d4ea4615\e (subdir) |
| 214 | NtOpenFile | cache:\d4ea4615e46ee8ca.tmp (reopen flat journal) |
| 286 | NtCreateFile | cache:\69d8e45ce534ffea.tmp (next flat journal) |
| 325 | NtOpenFile | cache:\ |
| 409 | NtCreateFile | cache:\access (retry) |
| 466 | NtCreateFile | cache:\69d8e45c (DIR) |
| 475 | NtCreateFile | cache:\69d8e45c\e (subdir) |
Statistics across the 50M window:
- Ours emits 69 cache: events on tid=4, plus the main-chain divergent events on tid=1.
- Ours emits 111 NtSetInformationFile calls; canary emits 0. Canary's cache is warm, so it skips cache-build entirely.
6. Persistence experiment
See persistent-experiment.md for the full table and per-boot cache-content delta. Headline result:
- XENIA_CACHE_PERSIST=1 + 50M boot 1 (cold): digest instructions=50000003 imports=40485 swaps=1 draws=0. Differs from C+10 default-tmpdir baseline (50000002, 40465) by +1 instruction / +20 imports. Persistent path is slightly different from tmpdir.
- XENIA_CACHE_PERSIST=1 + 50M boot 2 (warm): same digest. No cxx_throw regression at 50M.
- On-disk cache after boot 2: 7 .tmp flat journals (grew on each boot from +400 B to +114 KB per file); access, ignore, recent as DIRECTORIES (bug #2); zero hierarchical leaf files (bug #1 prevents promotion).
- Phase A diff vs canary baseline: matched-prefix on canary_tid=6 → ours_tid=1 main chain = 102404 (unchanged from C+10's default-tmpdir result). Divergence at the same NtQueryFullAttributesFile return-value (canary=0 SUCCESS, ours=0xC0000034 NOT_FOUND).
Persistence alone does not advance the matched-prefix. The .tmp files
exist but the hierarchical leaf doesn't, so the leaf NtQuery still misses.
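For readers unfamiliar with the metric, the matched-prefix figure is the number of leading events on the paired threads that agree before the first divergence. The sketch below shows that computation in isolation. It is not diff_events.py: the function name is hypothetical and events are modelled as (name, return value) pairs, which is an assumed simplification of the real event records.

```rust
/// Length of the common prefix of two event sequences, i.e. the index of
/// the first event at which they diverge (or the shorter length if none do).
pub fn matched_prefix<T: PartialEq>(a: &[T], b: &[T]) -> usize {
    a.iter()
        .zip(b.iter())
        .take_while(|(x, y)| x == y)
        .count()
}
```

In this model, a canary event ("NtQueryFullAttributesFile", 0) compared against an ours event ("NtQueryFullAttributesFile", 0xC0000034) ends the prefix at that index, which is the kind of return-value divergence reported at 102404.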
7. Discipline / methodology checks
- --mute=true: not used in this session because no canary runs were required (the C+10 canary.jsonl was reused as-is for the matched-prefix comparison). Future re-baselines under the plan must use --mute=true.
- Binary rename for stop hook: ours run via xrs-c10 (pre-existing from C+10). No background long-run; the experiments completed in <3 s wall-clock on the test host.
- Reading-error #28 (oracle source supersedes spec): verified canary's NtSetInformationFile class-10 implementation by reading xboxkrnl_io_info.cc:226-243; did not assume from docs.
- No source touched: this session was read-only-by-design. Plan-mode kept the tree clean; the only file-system side effects were Phase A event log output to audit-runs/cache-subsystem-plan/persist-warm-events.jsonl and this directory's deliverables.
8. Confidence ratings
| claim | source | confidence |
|---|---|---|
| Bug #1: nt_set_information_file class 10 is a no-op stub | direct source read of exports.rs:1809-1909 | HIGH |
| Bug #1 prevents .tmp-to-leaf promotion | indirect: ours's cache has .tmp + no leaf; canary's has leaf + no .tmp; canary properly implements class 10 | HIGH (3 independent confirmations) |
| Bug #2: top-level cache files mis-created as directories | direct on-disk observation post-experiment | HIGH |
| Bug #2 root cause: is_dir_open discriminator misclassification | source-read inference; not yet instrumented | MEDIUM (Stage 2 instrumentation required) |
| Persistence alone doesn't advance matched-prefix | experimentally verified via diff_events.py | HIGH |
| AUDIT-053 cxx_throw regression not reproduced at 50M | experimentally verified (2 sequential boots, same digest) | MEDIUM (AUDIT-053's regression was at 500M; this window is too short to fully rule it out) |
| Sylpheed has its own cache-build path that already fires in ours | event-log evidence (69 cache: events on tid=4) | HIGH |
| The two engine bugs are the ONLY blockers | inferred from the above; could be additional bugs uncovered post-Stage 1 | MEDIUM (Stages are independently rollback-able; if a Stage doesn't advance matched-prefix, investigate further) |
9. Open questions
See plan §"Open questions". Critical ones to resolve during implementation:
- Confirm via instrumentation that Sylpheed actually calls NtSetInformationFile class 10 for the .tmp→leaf rename. If it uses a different path (NtDeleteFile + NtCreateFile, or some custom flow), Stage 1's fix won't fully solve the problem.
- Confirm via instrumentation whether cache:\access/ignore/recent creates have FILE_DIRECTORY_FILE set in create_options, or whether ours's arg-position read is wrong.
- Validate whether access and recent manifest contents are deterministic byte-for-byte across engines, or whether they include host-allocator addresses / timestamps that need diff-tool canonicalization.
10. Recommended next session
See plan §"Recommended approach" and §"Implementation stages". Three landable stages, ~150-200 LOC total, with an expected matched-prefix advance of hundreds to thousands of events after Stage 3.