# Phase C+10 - NtQueryFullAttributesFile - Investigation

## Phase 1: Emitter extension (LANDED)

### Problem

C+9 left the divergence with no resolved path string:

```
canary[6][102404] kernel.return NtQueryFullAttributesFile return_value=0
ours  [1][102404] kernel.return NtQueryFullAttributesFile return_value=0xC0000034
```

`payload.args` and `payload.args_resolved` were both empty objects, so we had no way to identify WHICH file the engine was querying.

### Shape of the fix

Schema v1 already declares `args_resolved` as a free-form object attached to `kernel.call` (schema-v1.md:108-117), and the existing example explicitly shows `{"path":"..."}`. The emitter just wasn't populating it. The extension is pure schema-v1 compliance, with no version bump.

#### Ours-side (event_log.rs / path.rs / state.rs)

- Added `event_log::emit_kernel_call_with_path(tid, cycle, name, Option<&str>)`: same byte format as `emit_kernel_call`, but when `path` is `Some(non_empty)` it emits `args_resolved:{"path":"..."}`. When `path` is `None` or empty, it degrades to the existing `args_resolved:{}` form, so unrelated exports' output is byte-identical to pre-extension.
- Added `path::object_attributes_raw_name(mem, ptr) -> Option<String>`: returns the RAW path string (trimmed of whitespace, NO prefix-strip / no case-fold) so the diff surfaces upstream prefix-form differences instead of masking them via normalization. The pre-existing `object_attributes_to_vfs_path` (which DOES normalize) is kept as-is for VFS lookup callers; the emitter uses the new raw helper.
- `state.rs::call_export`, inside the `phase_a_on` guarded block: a new `match name` resolves `OBJECT_ATTRIBUTES*` from the right GPR position. Argument positions were verified against canary's `xboxkrnl/xboxkrnl_io.cc` signatures:
  - `NtQueryFullAttributesFile` → r3 = obj_attrs
  - `NtOpenSymbolicLinkObject` → r4 = obj_attrs
  - `NtCreateFile`, `NtOpenFile` → r5 = obj_attrs

  It then calls `emit_kernel_call_with_path(..., resolved.as_deref())` instead of `emit_kernel_call(...)`.
  All other exports fall through to `None` and the legacy form.

#### Canary-side (event_log.h / event_log.cc / util/shim_utils.h)

- `event_log.h`: declared `EmitKernelCallWithPath(name, path)`.
- `event_log.cc`: implemented the same as ours (degrades to the legacy form for an empty path).
- `event_log.cc::phase_a_bridge::EmitImportAndCallWithCtx(module, ord, name, ppc_context)`: new bridge function. `PPCContext` is passed as `void*` to keep the header's transitive include footprint small (the bridge .cc casts it back to `PPCContext*` internally). Inside the bridge, the helper `ReadObjectAttributesRawName(ptr)` reads `X_OBJECT_ATTRIBUTES.name_ptr`, then the `X_ANSI_STRING` bytes directly out of guest memory (no `util::TranslateAnsiPath` normalization). It trims whitespace and trailing NULs to match ours's semantics byte-for-byte.
- `util/shim_utils.h`: both export trampolines (`X::Trampoline` / `Y::Trampoline`) switched the `phase_a_bridge::EmitImportAndCall` call to `phase_a_bridge::EmitImportAndCallWithCtx`, passing the existing `ppc_context` argument that is already in scope. The legacy `EmitImportAndCall` stays declared and defined for any future callers that don't have a `PPCContext`.

### Verification

- Build both engines clean.
- Determinism 3x: digest md5 = `b8fa0e0460359a4f660adb7605e053de` (identical to the C+9 baseline; the extension is cvar-OFF zero-cost).
- Phase A emitter determinism 2x: det-fields md5 = `7489e90e…`, byte-identical. (Different from C+9's `0b299c37…` because the path field IS in the deterministic signature, but stable across runs.)

## Phase 2: Re-run + capture path string

After the extension, both engines emit the path at `kernel.call.args_resolved.path`:

```
canary[6][102403] NtQueryFullAttributesFile path = "cache:\d4ea4615\e\46ee8ca"
ours  [1][102403] NtQueryFullAttributesFile path = "cache:\d4ea4615\e\46ee8ca"
```

Both engines query the **same path**. There is no upstream divergence: the ANSI_STRING content matches byte-for-byte.

## Phase 3: Why does ours say NOT_FOUND?
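The cold-versus-warm behaviour that this phase traces can be reproduced with a minimal sketch. It is not the code in `exports.rs`: the function names, the `cache_root` parameter, and the simplified normalization (backslash to slash, strip the `cache:/` prefix) are assumptions made for illustration only.

```rust
use std::path::{Path, PathBuf};

const STATUS_SUCCESS: u32 = 0;
const STATUS_OBJECT_NAME_NOT_FOUND: u32 = 0xC000_0034;

/// Hypothetical stand-in for the cache-path resolution step: map a guest
/// path such as `cache:\d4ea4615\e\46ee8ca` onto a host directory.
/// Assumption: normalization is just backslash -> slash plus prefix strip.
fn resolve_cache_path(cache_root: &Path, guest_path: &str) -> Option<PathBuf> {
    let normalized = guest_path.replace('\\', "/");
    let rest = normalized.strip_prefix("cache:/")?;
    Some(cache_root.join(rest))
}

/// Hypothetical stand-in for the query itself: succeed only when the
/// resolved host path exists, otherwise report NOT_FOUND.
fn query_full_attributes(cache_root: &Path, guest_path: &str) -> u32 {
    match resolve_cache_path(cache_root, guest_path) {
        Some(host) if std::fs::metadata(&host).is_ok() => STATUS_SUCCESS,
        _ => STATUS_OBJECT_NAME_NOT_FOUND,
    }
}
```

With an empty `cache_root` (the wiped, cold case) the same guest path yields `0xC0000034`; once the file exists under the root (the warm case) it yields `0`. The guest-side input is identical in both cases, only the host directory state differs.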
### Trace through ours's `nt_query_full_attributes_file`

`exports.rs:1913-1990`:

1. Read OBJECT_ATTRIBUTES → path = `"cache:/d4ea4615/e/46ee8ca"` (after `normalize_path`).
2. `state.resolve_cache_path(&path)` returns `Some(/xenia-rs-cache--0/d4ea4615/e/46ee8ca)`.
3. `std::fs::metadata(host_path)` returns `Err(NotFound)`.
4. Return `STATUS_OBJECT_NAME_NOT_FOUND` (`0xC0000034`).

The host path doesn't exist because ours's `init_cache_root` (`state.rs:499-510`) **clears** the cache directory on every boot (AUDIT-038 line: per-process tmpdir + full wipe so two consecutive runs see byte-identical initial state).

### Why does canary's NOT fail?

`xenia-canary/src/xenia/kernel/xboxkrnl/xboxkrnl_io.cc:474-513`:

1. Read OBJECT_ATTRIBUTES → target_path via TranslateAnsiPath.
2. `kernel_state()->file_system()->ResolvePath(target_path)`.
3. If `entry` is found, populate file_info and return `X_STATUS_SUCCESS`.
4. Else return `X_STATUS_NO_SUCH_FILE` (`0xC000000F`).

Canary returns 0, so the entry was found. Canary's cache mount is at `/home/fabi/.local/share/Xenia/cache/` (a persistent host directory populated over prior boots).

### Verification of canary's cache state

```
$ ls /home/fabi/.local/share/Xenia/cache/d4ea4615/e/
-rw-rw-r-- 1 fabi fabi 400 May 11 21:01 46ee8ca
```

Single 400-byte file. Total cache: 23 files, ~5 MB across 16 distinct top-level hash directories.

### Sibling-cache observations

ours.jsonl shows the SAME `NtQueryFullAttributesFile` call firing for multiple cache paths within the 50M window, all returning `0xC0000034`. Example: idx 103810 queries `cache:\69d8e45c\8\3421153`. So the divergence is not a single missing file but a class of 16+ missing hashes.

## Phase 4: Classification + scope decision

Per the plan, the classes are:

* **(A) Missing file**: a single plant fixes it (small).
* **(B) Path-normalization bug**: string operation (small).
* **(C) VFS mount missing**: add the mount (small-medium).
* **(D) Subsystem-required**: STFS or similar, **ESCALATE**.
* **(E) Upstream divergence**: walk back.

This is **NOT (B)**: both engines normalize identically (verified by matching `args_resolved.path`).

This is **NOT (E)**: upstream is bit-identical for 102,403 events.

This is **NOT (A)** for any single file: the game queries 16+ distinct cache hashes, so planting one only postpones the divergence.

This is **closest to a hybrid (C+D)**:

* **(C)-ish**: canary's cache MOUNT resolves to a populated host dir; ours's mount resolves to a wiped tmp dir.
* **(D)-ish**: canary's cache is populated because it ran the game before and the game **built** the cache. To match canary's state on a fresh boot, we either:
  - implement the game's cache-build logic (subsystem),
  - copy canary's pre-built cache (oracle state, an AUDIT-038 violation),
  - or accept that ours runs cold and the divergence is a fundamental cold-vs-warm asymmetry.

### AUDIT-053 cross-check (warm-start regression risk)

Per AUDIT-053 memo:

> Phase 2 permanent fix REVERTED — warm-start regression from VFS
> layout aliasing: `open_cache_file` treats all `NtCreateFile` as
> files, but `cache:\d4ea4615 disp=CREATE` is meant as a DIRECTORY.

AUDIT-054 fixed that specific aliasing (FILE_DIRECTORY_FILE bit threading). But there is still the AUDIT-053 secondary concern: Sylpheed's `cache:\.tmp` journal-style writes append on each boot, making naive persistence self-inconsistent across boots. Whether AUDIT-054's fix fully unblocks persistence is **NOT RE-VERIFIED** in this session. Re-testing the AUDIT-053 regression with AUDIT-054's fix in tree is itself a follow-up.

### Scope per user direction

User said:

> If the fix requires major VFS work, STFS subsystem
> implementation, or cache-population infrastructure: ESCALATE.

Of the choices in `escalation.md`, Choice 1 does not solve the problem, and Choices 2-4 all qualify as "cache-population infrastructure":

* Choice 1 (single file plant) won't solve the problem (16+ hashes).
* Choice 2 (seed from canary) is oracle state + warm-start regression risk per AUDIT-053.
* Choice 3 (synthesize cache reads) is a multi-export semantic change.
* Choice 4 (build cache from scratch) is a full subsystem.

**ESCALATION declared.** The Phase 1 emitter extension landed as the session's permanent infrastructure contribution.

## Discipline check

* **Reading-error #28** (canary source-of-truth): verified canary's actual `NtQueryFullAttributesFile_entry` body (`xboxkrnl_io.cc:474-513`), did not assume.
* **Reading-error #23** (downstream regression): no fix landed, so no regression risk. The emitter extension is cvar-OFF zero-cost.
* **Escalation discipline**: triggered cleanly; explicit memo; contributing infrastructure (emitter path resolution) kept.
* **Path encoding**: ANSI_STRING raw bytes captured; both engines agree byte-for-byte; no Unicode issues for the queried path.
* **AUDIT-054 deferred-item**: not re-touched. Cache persistence remains opt-in via `XENIA_CACHE_PERSIST=1`. The default keeps the AUDIT-038 wipe behavior.
* **`--mute=true`**: every canary run.
* **Renamed binaries**: `xrs-c10` / `xc-c10.exe`.

## Confidence

* **Phase 1 emitter extension**: HIGH. Schema-compliant, additive, cvar-OFF zero-cost verified via determinism.
* **Phase 4 classification**: HIGH. Three independent observations agree (canary cache populated, ours cache wiped, multiple hashes).
* **Cascade prediction at 102,404**: a cache fix resolves only the FIRST divergence in a series; the next cache hash will be the next divergence. Likely net delta of several hundred to a few thousand matched events per cache slot resolved, until a non-cache divergence appears.
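
To make the "additive" claim for the Phase 1 emitter extension concrete, the following is a minimal sketch of the degrade-to-legacy rule for `args_resolved`. It is not the code in `event_log.rs`: the function names `args_resolved_fragment` and `json_escape` are hypothetical, and the escaping shown is an assumption about how a raw guest path would be made JSON-safe.

```rust
/// Hypothetical helper: build the `args_resolved` JSON fragment.
/// A non-empty path yields `{"path":"..."}`; `None` or an empty string
/// falls back to the legacy `{}` form so other exports are unchanged.
fn args_resolved_fragment(path: Option<&str>) -> String {
    match path {
        Some(p) if !p.is_empty() => format!("{{\"path\":\"{}\"}}", json_escape(p)),
        _ => "{}".to_string(),
    }
}

/// Assumed escaping: quotes, backslashes, and control characters only.
/// Guest paths use backslash separators, so each one is doubled.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}
```

Because the fallback branch returns exactly `{}`, any export that never supplies a path serializes the same bytes as before, which is consistent with the unchanged baseline digest reported in the verification step.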