audit-038: persistent cache:/* VFS via host-FS backing

Replaces the "Synthesized empty file" stub for cache:/* paths with a
real host-FS mount in the style of canary's HostPathDevice. Each
KernelState gets its own directory, xenia-rs-cache-<pid>-<id>, under
the host temp dir (std::env::temp_dir(), typically /tmp on Linux). The
directory is cleared on init for lockstep determinism. This mirrors
canary's xenia_main.cc:649 RegisterSymbolicLink("cache:", "\\CACHE")
and the HostPathDevice in
xenia-canary/src/xenia/vfs/devices/host_path_device.cc.

NtCreateFile now honours create_disposition for cache: paths:
  FILE_OPEN          -> NOT_FOUND if missing
  FILE_CREATE        -> NAME_COLLISION if present
  FILE_OPEN_IF       -> open or create
  FILE_OVERWRITE_IF  -> create or truncate
  FILE_OVERWRITE     -> NOT_FOUND if missing, else truncate
  FILE_SUPERSEDE     -> create or truncate
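
For illustration only, the table above can be sketched on top of
std::fs::OpenOptions. This is not the code from this commit: the
`Disposition` enum and `open_with_disposition` are hypothetical names,
and the sketch assumes NOT_FOUND and NAME_COLLISION are derived from
io::ErrorKind::NotFound and io::ErrorKind::AlreadyExists by the caller.

```rust
use std::fs::{File, OpenOptions};
use std::io;
use std::path::Path;

/// Hypothetical enum; variant names follow the list in the commit message.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Disposition {
    Supersede,
    Open,
    Create,
    OpenIf,
    Overwrite,
    OverwriteIf,
}

/// Open `path` read/write according to `d`.
/// A missing file surfaces as `ErrorKind::NotFound`, an unexpected
/// existing file as `ErrorKind::AlreadyExists`.
pub fn open_with_disposition(path: &Path, d: Disposition) -> io::Result<File> {
    let mut opts = OpenOptions::new();
    opts.read(true).write(true);
    match d {
        // Must already exist; contents are kept.
        Disposition::Open => {}
        // Must not exist yet.
        Disposition::Create => {
            opts.create_new(true);
        }
        // Open if present, otherwise create empty.
        Disposition::OpenIf => {
            opts.create(true);
        }
        // Must already exist; contents are discarded.
        Disposition::Overwrite => {
            opts.truncate(true);
        }
        // Create if missing, otherwise discard contents.
        Disposition::Supersede | Disposition::OverwriteIf => {
            opts.create(true).truncate(true);
        }
    }
    opts.open(path)
}
```

Delegating the exists/missing checks to OpenOptions keeps each case a
single open call, so there is no separate existence probe that could
race with the open.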

NtReadFile / NtWriteFile / NtSetInformationFile (XFileEndOfFileInformation)
/ NtQueryInformationFile / NtQueryFullAttributesFile route through
std::fs against the per-handle host_path; non-cache paths keep their
legacy semantics (read-only disc image, synth-empty stubs).
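
As a rough sketch of what routing through std::fs can look like for two
of these calls (the helper names `set_end_of_file` and `read_at` are
hypothetical and not taken from this commit; the real handlers also
deal with guest memory and NTSTATUS codes, which are omitted here):

```rust
use std::fs::{File, OpenOptions};
use std::io::{self, Read, Seek, SeekFrom};
use std::path::Path;

/// Resize the backing file, as an end-of-file information update would.
/// `File::set_len` truncates when shrinking and zero-fills when growing.
pub fn set_end_of_file(host_path: &Path, new_len: u64) -> io::Result<()> {
    let file = OpenOptions::new().write(true).open(host_path)?;
    file.set_len(new_len)
}

/// Read up to `buf.len()` bytes starting at `offset`.
/// Returns the number of bytes actually read (0 at end of file).
pub fn read_at(host_path: &Path, offset: u64, buf: &mut [u8]) -> io::Result<usize> {
    let mut file = File::open(host_path)?;
    file.seek(SeekFrom::Start(offset))?;
    file.read(buf)
}
```

Reopening by path on every call is the simplest form of the per-handle
host_path approach; caching an open File per handle would be an
alternative design.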

Verified by audit-037 cascade:
- sub_82459D18 (cache-miss restore): 0 fires (was firing constantly)
- sub_8245D230 (resize/zero-fill):  0 fires (was firing constantly)
- 105+ real cache-file writes per 500M run; 4+ MB of game data persisting
  to disk per boot; cache:/recent, cache:/access, cache:/d4ea*.tmp, etc.
- Lockstep deterministic at instructions=100000004 / imports=987485
  across 3+ reruns (digest shifted as expected; goldens re-baselined).
- swaps=2 plateau still in place; cluster L1 unactivated. Cascade
  dimension D (cluster activation) — UNKNOWN, no L1 fires.

Tests 640 -> 645 (+5 cache-specific unit tests; full workspace green).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
MechaCat02
2026-05-09 14:34:27 +02:00
parent 9028021936
commit 77034b6cbf
4 changed files with 672 additions and 18 deletions

@@ -113,6 +113,16 @@ pub struct KernelState {
/// the disc image or host directory into this slot; file I/O handlers
/// route all reads through it.
pub vfs: Option<Box<dyn VfsDevice>>,
/// AUDIT-038 — host directory backing the persistent `cache:` mount
/// (mirrors canary's `cache:` → `\CACHE` symlink in xenia_main.cc:649,
/// implemented atop `HostPathDevice`). When `Some`, opens of `cache:\*`
/// paths go through real `std::fs` I/O against this directory; when
/// `None`, they fall back to the legacy "Synthesized empty file" stub
/// (which doesn't persist writes — see audit-037 for the record-layout
/// divergence that motivated this fix). Set up by [`init_cache_root`]
/// at startup; cleared at the same time so lockstep digests stay
/// reproducible across reruns.
pub cache_root: Option<std::path::PathBuf>,
/// Bridge to the host UI. `None` when running headless. Installed by
/// `cmd_exec` when the user passes `--ui`.
pub ui: Option<UiBridge>,
@@ -292,6 +302,7 @@ impl KernelState {
has_notified_live_startup: false,
next_thread_id: AtomicU32::new(1),
vfs: None,
cache_root: None,
ui: None,
interrupts: crate::interrupts::InterruptState::default(),
xaudio: crate::xaudio::XAudioState::default(),
@@ -315,6 +326,27 @@ impl KernelState {
};
crate::exports::register_exports(&mut state);
crate::xam::register_exports(&mut state);
// AUDIT-038 — set up a deterministic per-process cache root by
// default. Each new `KernelState` lives in its own tmpdir, named
// with the host pid + a monotonic counter so concurrent tests
// don't collide. Errors here are non-fatal (cache I/O degrades
// to the legacy synth-stub fallback) but logged.
static NEXT_CACHE_ID: std::sync::atomic::AtomicU64 =
std::sync::atomic::AtomicU64::new(0);
let id = NEXT_CACHE_ID.fetch_add(1, std::sync::atomic::Ordering::Relaxed);
let root = std::env::temp_dir().join(format!(
"xenia-rs-cache-{}-{}",
std::process::id(),
id
));
if let Err(e) = state.init_cache_root(root.clone()) {
tracing::warn!(
"Failed to initialise cache root at {:?}: {} — cache:/* opens \
will fall back to the synth-empty-file stub",
root,
e
);
}
state
}
@@ -334,6 +366,66 @@ impl KernelState {
self.exports.insert((module, ordinal), (name, func));
}
/// AUDIT-038 — install a host directory as the backing store for the
/// `cache:` mount. The directory is unconditionally cleared (and then
/// re-created) on entry so two consecutive runs see byte-identical
/// initial state — required for the `sylpheed_n*m.json` lockstep
/// goldens. Mirrors canary's `xenia_main.cc:611-651` setup, which
/// calls `RegisterSymbolicLink("cache:", "\\CACHE")` against a
/// per-emulator host path.
///
/// Returns `Ok(())` on success; bubbles up any I/O error from the
/// clear/create dance so the caller can surface it.
pub fn init_cache_root(&mut self, root: std::path::PathBuf) -> std::io::Result<()> {
// Clear-then-recreate. Determinism beats incremental persistence
// here: Sylpheed's cache subsystem treats a missing/empty cache
// identically to a stale one (cache-miss → reconstruct), so
// wiping is safe and gives reproducible boots.
if root.exists() {
std::fs::remove_dir_all(&root)?;
}
std::fs::create_dir_all(&root)?;
self.cache_root = Some(root);
Ok(())
}
/// Resolve a guest VFS path (e.g. `cache:\d4ea4615e46ee8ca.tmp`) to
/// the host-FS path that backs it. Returns `None` if the path doesn't
/// have a `cache:` prefix or if no cache root is mounted (legacy
/// synth-stub fallback).
///
/// Path-traversal guard: empty, `.` and `..` components are dropped
/// wherever they appear (not only at the start), so a malicious guest
/// can't escape the cache directory. Backslashes are normalised to `/`
/// before the path is split into components.
pub fn resolve_cache_path(&self, raw: &str) -> Option<std::path::PathBuf> {
let root = self.cache_root.as_ref()?;
let lower = raw.to_ascii_lowercase();
// Match any of the writable cache prefixes (case-insensitive).
// canary uses separate `\CACHE0`/`\CACHE1` host dirs for cache0:/
// cache1:, but Sylpheed only references `cache:`; collapse all
// three to one backing root until a future game splits them.
let after_prefix = [
"cache:\\", "cache:/", "cache0:\\", "cache0:/", "cache1:\\", "cache1:/",
]
.iter()
.find_map(|prefix| lower.strip_prefix(*prefix))
// ASCII lowercasing preserves byte length, so the suffix offset in
// `lower` is also valid in `raw`.
.map(|rest| &raw[raw.len() - rest.len()..])?;
let normalised = after_prefix.replace('\\', "/");
// Strip leading slashes + path-traversal segments.
let clean: std::path::PathBuf = normalised
.split('/')
.filter(|s| !s.is_empty() && *s != "." && *s != "..")
.collect();
Some(root.join(clean))
}
/// Record an import-thunk address resolved at load time. Called once
/// per `record_type==1` import in xenia-app's Phase 1. Idempotent: a
/// duplicate ordinal overwrites (later wins; in practice the loader