Phase C+10 — NtQueryFullAttributesFile — ESCALATION
Outcome
Phase 1 (emitter extension) — LANDED. Phase 4 fix (cache-state seeding) — ESCALATED, deferred to a dedicated cache-subsystem session.
The Phase A emitter now resolves OBJECT_ATTRIBUTES path arguments on both engines (cvar-gated, default-off, behaviorally inert when off). That permanent infrastructure win surfaces the divergence string for this and every future file-IO divergence.
The actual cache-seeding fix needed to advance main matched-prefix past 102,404 is out of scope per the user's escalation criteria.
Captured framing (post-extension)
Both engines now log the resolved path at kernel.call.args_resolved:
canary[6][102403]: NtQueryFullAttributesFile args_resolved.path = "cache:\\d4ea4615\\e\\46ee8ca"
ours [1][102403]: NtQueryFullAttributesFile args_resolved.path = "cache:\\d4ea4615\\e\\46ee8ca"
canary[6][102404]: kernel.return return_value = 0 (STATUS_SUCCESS)
ours [1][102404]: kernel.return return_value = 0xC0000034 (STATUS_OBJECT_NAME_NOT_FOUND)
Both engines query the same path. Canary returns SUCCESS because
its cache directory (/home/fabi/.local/share/Xenia/cache/) is
pre-populated with 23 files (~5 MB) accumulated over prior
Sylpheed boots. Ours's cache directory is fresh-wiped per AUDIT-038.
After this query, canary follows up with NtCreateFile for the same
path (idx 102481) — it actually reads the cached data. So just lying
SUCCESS without backing bytes would only push the divergence ~78
events forward.
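The "matched-prefix" figure used throughout can be read as the count of leading events on which the two logs agree. The sketch below shows that comparison on a toy pair of logs. The `Event` struct and its two fields are hypothetical simplifications; the real Phase A records carry more fields and the actual diff tool may compare them differently.

```rust
/// One Phase A event, reduced to two fields for illustration.
/// Hypothetical shape: the real log records carry more than this.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Event {
    kind: String,    // e.g. "kernel.call" or "kernel.return"
    payload: String, // e.g. export name plus resolved path, or a return value
}

/// Convenience constructor for the examples below.
fn ev(kind: &str, payload: &str) -> Event {
    Event { kind: kind.to_string(), payload: payload.to_string() }
}

/// Number of leading events on which both logs agree.
/// The first divergence sits at this index (0-based).
fn matched_prefix(canary: &[Event], ours: &[Event]) -> usize {
    canary
        .iter()
        .zip(ours.iter())
        .take_while(|(c, o)| c == o)
        .count()
}

fn main() {
    let call = ev("kernel.call", r"NtQueryFullAttributesFile cache:\d4ea4615\e\46ee8ca");
    let canary = vec![call.clone(), ev("kernel.return", "0x00000000")];
    let ours = vec![call, ev("kernel.return", "0xC0000034")];

    // Both logs agree on the call and split on the return value.
    println!("matched prefix = {}", matched_prefix(&canary, &ours));
}
```

Returning a fake SUCCESS would make the return events equal, so the count would grow only until the next event that depends on real cached bytes.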
Classification (per plan Phase 4)
(A) Missing file — narrowly true (this single cache entry), but (D) Subsystem-required — actual scope.
Choices considered:
1. Plant a single file: would only push the divergence to the next cache-existence query (16+ distinct hashes in `cache:\<HASH1>\<X>\<HASH2>` form). Canary's cache holds 23 files, most of which follow this pattern, so after each plant the next query still misses.
2. Seed ours's cache from canary's: 23 files, ~5 MB. Mechanically easy (~30 LOC `copy_dir_all`) but crosses AUDIT-038's no-oracle-state line AND runs into AUDIT-053's documented warm-start regression (Sylpheed's `cache:\*.tmp` journal-style writes append per boot, making a naive persistent seed self-inconsistent after the second boot; `runtime_error` throws from version-check on reload).
3. Lie SUCCESS on `cache:` existence + lie SUCCESS on subsequent NtCreateFile + return zero-byte file: changes Nt semantics game-wide, likely breaks any read that expects valid content.
4. Implement the game's cache-generation logic: that's the shader/PSO/material cache build subsystem, a multi-hundred-LOC generative subsystem, not in scope.
The user's escalation criteria explicitly call out "cache-population infrastructure" as ESCALATION. Choices 2-4 fit that. Choice 1 doesn't solve the problem.
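For reference, the "mechanically easy" part of the seeding choice is roughly the recursive copy below. This is a minimal std-only sketch, not code from the tree: the return value (a file count) and the temp-directory layout in `main` are assumptions made for illustration, and the copy does nothing about the AUDIT-038 or AUDIT-053 concerns that rule the choice out.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Recursively copy every file under `src` into `dst`, creating directories
/// as needed. Returns the number of regular files copied.
/// Sketch only: symlinks to directories are not handled.
fn copy_dir_all(src: &Path, dst: &Path) -> io::Result<u64> {
    fs::create_dir_all(dst)?;
    let mut copied = 0;
    for entry in fs::read_dir(src)? {
        let entry = entry?;
        let target = dst.join(entry.file_name());
        if entry.file_type()?.is_dir() {
            copied += copy_dir_all(&entry.path(), &target)?;
        } else {
            fs::copy(entry.path(), &target)?;
            copied += 1;
        }
    }
    Ok(copied)
}

fn main() -> io::Result<()> {
    // Throwaway directories standing in for the two cache roots.
    let base = std::env::temp_dir().join(format!("seed-sketch-{}", std::process::id()));
    let src = base.join("canary-cache");
    let dst = base.join("ours-cache");

    // One placeholder entry in the cache:\<HASH1>\<X>\<HASH2> layout.
    fs::create_dir_all(src.join("d4ea4615").join("e"))?;
    fs::write(src.join("d4ea4615").join("e").join("46ee8ca"), b"placeholder")?;

    let n = copy_dir_all(&src, &dst)?;
    println!("copied {n} file(s)");

    fs::remove_dir_all(&base)?;
    Ok(())
}
```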
What was landed (Phase 1 only)
Permanent emitter extension on both engines, schema-v1-compatible
(args_resolved was already part of v1, this just populates it for
OBJECT_ATTRIBUTES*-taking exports).
Ours side (~50 LOC additive)
- `xenia-rs/crates/xenia-kernel/src/event_log.rs`:
  - New `emit_kernel_call_with_path(tid, cycle, name, Option<&str>)` that mirrors `emit_kernel_call` but adds `args_resolved:{"path":"..."}` when the path is non-empty. Degrades to the existing empty-object form otherwise so output is byte-identical to pre-extension when the path is null.
- `xenia-rs/crates/xenia-kernel/src/path.rs`:
  - New `object_attributes_raw_name(mem, ptr) -> Option<String>` that returns the raw trimmed path (no prefix-strip, no case-fold). The emitter uses raw form so the diff surfaces upstream differences (e.g. if one engine called with one prefix and the other with a different prefix), not just post-normalize differences.
- `xenia-rs/crates/xenia-kernel/src/state.rs`:
  - In `call_export`, when `phase_a_on` and `name` matches one of `{NtCreateFile, NtOpenFile, NtQueryFullAttributesFile, NtOpenSymbolicLinkObject}`, resolve OBJECT_ATTRIBUTES* from the appropriate gpr position (verified against canary's xboxkrnl_io.cc signatures) and call `emit_kernel_call_with_path`. Otherwise call the legacy `emit_kernel_call`.
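The three changes above fit together roughly as in the following sketch. It is an illustration under stated assumptions, not the landed code: the JSON record layout (everything except `args_resolved`), the helper names `args_resolved`, `kernel_call_line`, `object_attributes_gpr` and `emit_for_call`, and the register numbers (r3 for NtQueryFullAttributesFile, r4 for NtOpenSymbolicLinkObject, r5 for NtCreateFile and NtOpenFile) are all assumptions and should be checked against the actual export signatures.

```rust
/// Minimal JSON string escaping: quotes, backslashes, control characters.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len() + 2);
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// `{"path":"..."}` for a non-empty path, the empty object otherwise, so a
/// missing path produces the same bytes as the pre-extension emitter.
fn args_resolved(path: Option<&str>) -> String {
    match path {
        Some(p) if !p.is_empty() => format!("{{\"path\":\"{}\"}}", json_escape(p)),
        _ => "{}".to_string(),
    }
}

/// Hypothetical record layout; only `args_resolved` comes from the notes.
fn kernel_call_line(tid: u32, cycle: u64, name: &str, path: Option<&str>) -> String {
    format!(
        "{{\"kind\":\"kernel.call\",\"tid\":{},\"cycle\":{},\"name\":\"{}\",\"args_resolved\":{}}}",
        tid,
        cycle,
        json_escape(name),
        args_resolved(path)
    )
}

/// Register holding the OBJECT_ATTRIBUTES pointer for each path-bearing
/// export. ASSUMPTION: positions are illustrative, not copied from the tree.
fn object_attributes_gpr(name: &str) -> Option<usize> {
    match name {
        "NtQueryFullAttributesFile" => Some(3),
        "NtOpenSymbolicLinkObject" => Some(4),
        "NtCreateFile" | "NtOpenFile" => Some(5),
        _ => None,
    }
}

/// Emit nothing when the cvar is off; otherwise emit the path-bearing form
/// for known exports and the plain form for everything else.
fn emit_for_call(
    phase_a_on: bool,
    tid: u32,
    cycle: u64,
    name: &str,
    gpr: &[u64; 32],
    resolve: impl Fn(u32) -> Option<String>,
) -> Option<String> {
    if !phase_a_on {
        return None;
    }
    let path = object_attributes_gpr(name).and_then(|r| resolve(gpr[r] as u32));
    Some(kernel_call_line(tid, cycle, name, path.as_deref()))
}

fn main() {
    let mut gpr = [0u64; 32];
    gpr[3] = 0x1000; // pretend r3 holds an OBJECT_ATTRIBUTES pointer
    let resolve = |_ptr: u32| Some(r"cache:\d4ea4615\e\46ee8ca".to_string());
    if let Some(line) = emit_for_call(true, 1, 0, "NtQueryFullAttributesFile", &gpr, resolve) {
        println!("{line}");
    }
}
```

Passing the resolver in as a closure keeps the emitter independent of how guest memory is modeled, which is a choice made for this sketch only.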
Canary side (~80 LOC additive)
- `xenia-canary/src/xenia/kernel/event_log.h`:
  - New `EmitKernelCallWithPath(name, path)` mirroring ours.
- `xenia-canary/src/xenia/kernel/event_log.cc`:
  - Implementation of `EmitKernelCallWithPath`.
  - New `phase_a_bridge::EmitImportAndCallWithCtx(module, ord, name, ppc_context)` that dispatches by `name` to read OBJECT_ATTRIBUTES from the PPCContext gpr and call the path-bearing form. Falls back to the legacy form when name doesn't match.
  - Helper `ReadObjectAttributesRawName(obj_attrs_ptr)` that mirrors ours's `object_attributes_raw_name` semantically (raw trimmed, no normalization).
- `xenia-canary/src/xenia/kernel/util/shim_utils.h`:
  - Both trampolines (`X::Trampoline` / `Y::Trampoline`) switched from `EmitImportAndCall(...)` to `EmitImportAndCallWithCtx(..., ppc_context)`. PPCContext is already in scope at that call site (it's the first argument the trampoline receives).
Total: ~50 LOC on ours, ~80 LOC on canary. Both are behaviorally inert when the cvar is OFF.
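Both raw-name helpers (`object_attributes_raw_name` on ours, `ReadObjectAttributesRawName` on canary) walk the same guest structures. The sketch below shows one way to do that walk in Rust. It assumes guest memory is a flat byte slice, that OBJECT_ATTRIBUTES is three big-endian 32-bit fields with the name pointer second, that the name is an ANSI_STRING (16-bit length, 16-bit maximum length, 32-bit buffer pointer), and that "trimmed" means stripping NULs and whitespace at both ends. None of these details are confirmed by the notes above.

```rust
/// Read a big-endian u16 from guest memory, or None if out of range.
fn be16(mem: &[u8], addr: usize) -> Option<u16> {
    let b = mem.get(addr..addr.checked_add(2)?)?;
    Some(u16::from_be_bytes([b[0], b[1]]))
}

/// Read a big-endian u32 from guest memory, or None if out of range.
fn be32(mem: &[u8], addr: usize) -> Option<u32> {
    let b = mem.get(addr..addr.checked_add(4)?)?;
    Some(u32::from_be_bytes([b[0], b[1], b[2], b[3]]))
}

/// Return the path exactly as the guest passed it: no prefix stripping and
/// no case folding, only NUL/whitespace trimming.
/// ASSUMED layout: OBJECT_ATTRIBUTES { root_dir: u32, name_ptr: u32, attrs: u32 },
/// ANSI_STRING { length: u16, max_length: u16, buffer: u32 }, all big-endian.
fn object_attributes_raw_name(mem: &[u8], ptr: u32) -> Option<String> {
    if ptr == 0 {
        return None;
    }
    let oa = ptr as usize;
    let name_ptr = be32(mem, oa.checked_add(4)?)? as usize;
    if name_ptr == 0 {
        return None;
    }
    let len = be16(mem, name_ptr)? as usize;
    let buf = be32(mem, name_ptr.checked_add(4)?)? as usize;
    let bytes = mem.get(buf..buf.checked_add(len)?)?;

    // Lossy decoding keeps the sketch total; the guest strings are ANSI.
    let s = String::from_utf8_lossy(bytes);
    let trimmed = s.trim_matches(|c: char| c == '\0' || c.is_whitespace());
    if trimmed.is_empty() {
        None
    } else {
        Some(trimmed.to_string())
    }
}

fn main() {
    // Lay out an OBJECT_ATTRIBUTES at 0x10, its ANSI_STRING at 0x20 and the
    // character buffer at 0x30 inside a small fake guest memory.
    let path = br"cache:\d4ea4615\e\46ee8ca";
    let mut mem = vec![0u8; 0x100];
    mem[0x14..0x18].copy_from_slice(&0x20u32.to_be_bytes());
    mem[0x20..0x22].copy_from_slice(&(path.len() as u16).to_be_bytes());
    mem[0x24..0x28].copy_from_slice(&0x30u32.to_be_bytes());
    mem[0x30..0x30 + path.len()].copy_from_slice(path);

    println!("{:?}", object_attributes_raw_name(&mem, 0x10));
}
```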
Gates (Phase 1 extension only — all pass)
| # | gate | result |
|---|---|---|
| 1 | cvar-OFF determinism 50M (3 runs) | PASS: all 3 = b8fa0e0460359a4f660adb7605e053de (matches C+9 baseline, unchanged) |
| 2 | Phase B image_loaded_sha256 | PASS: ea8d160e9369328a5b922258a92113efb8d7ce3e1a5c12cc521e375985c91c18 (matches baseline) |
| 3 | Phase A main matched-prefix | UNCHANGED: 102404 (extension was framing-only; no fix landed; no advance expected) |
| 4 | Both engines build clean | PASS |
| 5 | Phase A emitter det fields (2 runs) | PASS: both = 7489e90ef4c9be629af8c9fabb1cbdd7 (new; replaces C+9's 0b299c37… because the new args_resolved.path field is part of the det signature) |
| 6 | Unit tests | PASS: 165 → 165 (no new, no regressions) |
Schema status
The args_resolved field is part of schema-v1 already; this Phase only populates it for a subset of exports. No schema version bump.
The schema-v1 example (schema-v1.md:112) shows exactly the form we
emit. We are now compliant with the documented schema for path-bearing
exports rather than emitting an empty stub.
Cascade prediction (resolution / next steps)
| stage | predicted | outcome |
|---|---|---|
| A=extend emitter cleanly | ~80% | LANDED |
| B=capture path string both engines | ~85% | LANDED — cache:\d4ea4615\e\46ee8ca matched both engines |
| C=classify root cause | ~75% | DONE — Class D (subsystem-required) |
| D=land fix in scope | ~55% | ESCALATED — fix is choice 2-4 above |
| E=main chain advances past 102404 | ~50% | NOT THIS SESSION |
Reading-error class
NO new class. Existing classes #15 / ζ (VFS layout aliasing, AUDIT-053) and AUDIT-038 (no oracle state) are re-affirmed:
- Class #15 ζ (AUDIT-053): persistent cache + journal `.tmp` writes create a warm-start regression.
- AUDIT-038 line: oracle state is forbidden in default boot.
Both rules together make the cache-seeding fix subsystem-tier, not single-fix-tier.
Handoff to dedicated cache-subsystem session
The next session targeting this divergence should:
1. Decide cache-state strategy:
   - (a) Implement Sylpheed's cache-generation logic so ours builds its own cache from scratch (matches canary's own bootstrap experience, but multi-hundred-LOC).
   - (b) Seed-once-then-persist: copy canary's cache into ours's `cache_root` behind a new cvar `--cache-seed-from=<path>`, then enable persistence. AUDIT-053's warm-start regression must be re-tested with AUDIT-054's FILE_DIRECTORY_FILE fix in tree (it landed AFTER 053's regression was observed).
   - (c) Hybrid: synthesize a stub success at NtQueryFullAttributesFile for known-good cache hashes, then synthesize NtCreateFile/Read responses with bytes captured from canary's cache files. Closest to a "single missing file plant" but for 23 files.
2. Re-validate after the fix that the warm-start regression identified in AUDIT-053 doesn't recur (AUDIT-054 may have fixed it; needs explicit re-test).
3. Expect cascading Phase A divergences, one for each cache hash the game looks up in turn: the divergence at 102,404 is only the FIRST. After `cache:\d4ea4615` is resolved, the game queries `cache:\69d8e45c` (idx 103810 already visible in ours.jsonl) and so on through 16+ distinct hashes per AUDIT-052.
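To size the cascade before picking a strategy, the next session could list the distinct leading hashes among the resolved `cache:` paths in a Phase A log. The sketch below shows only the path-splitting and de-duplication step, assuming the paths have already been extracted as plain strings in the `cache:\<HASH1>\<X>\<HASH2>` form. The function names are hypothetical, and the tail of the second sample path is a placeholder, since only its leading hash appears in the logs quoted here.

```rust
/// Split a `cache:\<HASH1>\<X>\<HASH2>` path into its three components.
/// Returns None for anything that does not have exactly that shape.
fn parse_cache_path(path: &str) -> Option<(&str, &str, &str)> {
    let rest = path.strip_prefix("cache:\\")?;
    let mut parts = rest.split('\\');
    let h1 = parts.next()?;
    let x = parts.next()?;
    let h2 = parts.next()?;
    if parts.next().is_some() || h1.is_empty() || x.is_empty() || h2.is_empty() {
        return None;
    }
    Some((h1, x, h2))
}

/// Distinct leading hashes in first-seen order, which is also the order in
/// which the corresponding divergences would be expected to surface.
fn distinct_first_hashes<'a>(paths: impl IntoIterator<Item = &'a str>) -> Vec<&'a str> {
    let mut seen: Vec<&'a str> = Vec::new();
    for p in paths {
        if let Some((h1, _, _)) = parse_cache_path(p) {
            if !seen.contains(&h1) {
                seen.push(h1);
            }
        }
    }
    seen
}

fn main() {
    let paths = [
        r"cache:\d4ea4615\e\46ee8ca",
        // Leading hash from the notes; the rest is a placeholder.
        r"cache:\69d8e45c\x\0000000",
        r"cache:\d4ea4615\e\46ee8ca",
    ];
    for h in distinct_first_hashes(paths) {
        println!("{h}");
    }
}
```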
Files in this audit run
| file | content |
|---|---|
| `escalation.md` | this file |
| `investigation.md` | Phase 1-4 walkthrough |
| `re-validation.md` | gate results (Phase 1 extension only) |
| `ours.jsonl`, `ours-determ.jsonl`, `canary.jsonl` | Phase A logs with new `args_resolved` field |
| `diff-report.md` | re-run with path field populated |
| `snap/ours/` | Phase B snapshot (unchanged from C+9) |
| `digest-cvaroff-{1,2,3}.json` | 3× determinism (all = C+9 baseline) |
Next target
Same idx 102,404 NtQueryFullAttributesFile, but in a dedicated cache-subsystem session. Path framing is now captured for the next investigator's first read.