# Phase C+10: NtQueryFullAttributesFile (ESCALATION)

## Outcome

**Phase 1 (emitter extension): LANDED**.
**Phase 4 fix (cache-state seeding): ESCALATED**, deferred to a
dedicated cache-subsystem session.

The Phase A emitter now resolves OBJECT_ATTRIBUTES path arguments on
both engines (cvar-gated, default-off, behaviorally inert when off).
That permanent infrastructure win surfaces the divergence string for
this and every future file-IO divergence.

The actual cache-seeding fix needed to advance main matched-prefix
past 102,404 is out of scope per the user's escalation criteria.

## Captured framing (post-extension)

Both engines now log the resolved path at `kernel.call.args_resolved`:

```
canary[6][102403]: NtQueryFullAttributesFile args_resolved.path = "cache:\\d4ea4615\\e\\46ee8ca"
ours [1][102403]: NtQueryFullAttributesFile args_resolved.path = "cache:\\d4ea4615\\e\\46ee8ca"

canary[6][102404]: kernel.return return_value = 0 (STATUS_SUCCESS)
ours [1][102404]: kernel.return return_value = 0xC0000034 (STATUS_OBJECT_NAME_NOT_FOUND)
```

Both engines query the **same path**. Canary returns SUCCESS because
its cache directory (`/home/fabi/.local/share/Xenia/cache/`) is
**pre-populated** with 23 files (~5 MB) accumulated over prior
Sylpheed boots. Ours's cache directory is fresh-wiped per AUDIT-038.

After this query, canary follows up with `NtCreateFile` for the same
path (idx 102481) and actually reads the cached data. Faking SUCCESS
without backing bytes would therefore only push the divergence ~78
events forward.

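
The "matched-prefix" figure used throughout this note is the count of leading events on which the two logs agree. A minimal sketch of that computation follows, assuming each log has already been parsed into comparable records; the `Event` struct and its field names are illustrative and are not the real schema-v1 keys or the actual diff tool.

```rust
/// One Phase A event reduced to the fields compared by the diff.
/// ASSUMPTION: this shape is illustrative, not the schema-v1 layout.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Event {
    pub kind: String,    // e.g. "kernel.call" or "kernel.return"
    pub payload: String, // e.g. export name plus resolved path, or a return value
}

/// Number of leading events that are identical in both logs.
/// If the logs diverge, the first divergent event sits at this index.
pub fn matched_prefix(canary: &[Event], ours: &[Event]) -> usize {
    canary
        .iter()
        .zip(ours.iter())
        .take_while(|(c, o)| c == o)
        .count()
}
```

Under this definition, a matched prefix of 102,404 means events 0 through 102,403 agree (including the `args_resolved.path` at 102403) and the return value at index 102404 is the first mismatch.
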
## Classification (per plan Phase 4)

**(A) Missing file: narrowly true (this single cache entry), but**
**(D) Subsystem-required: actual scope**.

Choices considered:

1. **Plant a single file**: would only push the divergence to the
   next cache-existence query (16+ distinct hashes in
   `cache:\<HASH1>\<X>\<HASH2>` form). Most of the 23 files in
   canary's cache follow this pattern, so after each plant the next
   query still misses.

2. **Seed ours's cache from canary's**: 23 files, ~5 MB. Mechanically
   easy (~30 LOC `copy_dir_all`) but violates AUDIT-038's
   no-oracle-state line AND re-triggers AUDIT-053's documented
   warm-start regression (Sylpheed's `cache:\*.tmp` journal-style
   writes append per boot, making a naive persistent seed
   self-inconsistent after the second boot; `runtime_error` throws
   from version-check on reload).

3. **Fake SUCCESS on `cache:` existence queries, fake SUCCESS on the
   subsequent NtCreateFile, and return a zero-byte file**: changes Nt
   semantics game-wide, likely breaks any read that expects valid
   content.

4. **Implement the game's cache-generation logic**: that's the
   shader/PSO/material cache build subsystem, a multi-hundred-LOC
   generative subsystem, not in scope.

The user's escalation criteria explicitly call out
"cache-population infrastructure" as ESCALATION. Choices 2-4 fit
that. Choice 1 doesn't solve the problem.

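
For reference, the mechanical part of choice 2 is a plain recursive directory copy. The sketch below shows what a `copy_dir_all` helper of roughly that size could look like using only `std::fs`; it is an illustration of the mechanics, not code from the tree, and it does nothing about the AUDIT-038 / AUDIT-053 objections that rule the choice out.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Recursively copy the contents of `src` into `dst`, creating `dst` if needed.
/// Existing files in `dst` with the same name are overwritten.
/// Symlinks are not handled specially: `fs::copy` follows them, so a
/// symlink that points at a directory makes the copy fail with an error.
pub fn copy_dir_all(src: &Path, dst: &Path) -> io::Result<()> {
    fs::create_dir_all(dst)?;
    for entry in fs::read_dir(src)? {
        let entry = entry?;
        let target = dst.join(entry.file_name());
        if entry.file_type()?.is_dir() {
            copy_dir_all(&entry.path(), &target)?;
        } else {
            fs::copy(entry.path(), &target)?;
        }
    }
    Ok(())
}
```
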
## What was landed (Phase 1 only)

Permanent emitter extension on both engines, schema-v1-compatible
(`args_resolved` was already part of v1, this just populates it for
`OBJECT_ATTRIBUTES*`-taking exports).

### Ours side (~50 LOC additive)

- `xenia-rs/crates/xenia-kernel/src/event_log.rs`:
  - New `emit_kernel_call_with_path(tid, cycle, name, Option<&str>)`
    that mirrors `emit_kernel_call` but adds
    `args_resolved:{"path":"..."}` when the path is non-empty.
    Degrades to the existing empty-object form otherwise so output
    is byte-identical to pre-extension when the path is null.

- `xenia-rs/crates/xenia-kernel/src/path.rs`:
  - New `object_attributes_raw_name(mem, ptr) -> Option<String>`
    that returns the **raw** trimmed path (no prefix-strip, no
    case-fold). The emitter uses raw form so the diff surfaces
    upstream differences (e.g. if one engine called with one prefix
    and the other with a different prefix), not just post-normalize
    differences.

- `xenia-rs/crates/xenia-kernel/src/state.rs`:
  - In `call_export`, when `phase_a_on` and `name` matches one of
    `{NtCreateFile, NtOpenFile, NtQueryFullAttributesFile,
    NtOpenSymbolicLinkObject}`, resolve `OBJECT_ATTRIBUTES*` from the
    appropriate gpr position (verified against canary's
    xboxkrnl_io.cc signatures) and call
    `emit_kernel_call_with_path`. Otherwise call the legacy
    `emit_kernel_call`.

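
The degrade-to-empty-object behavior described for `emit_kernel_call_with_path` can be sketched as a pure formatting function. This is a minimal illustration only: `format_kernel_call`, `json_escape`, and the `kind` / `tid` / `cycle` / `name` key names and ordering are assumptions, and the real emitter's line format may differ. The one property taken from the text above is that a missing or empty path yields the same output as the legacy empty `args_resolved` object.

```rust
/// Escape a string for embedding inside a JSON string literal.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Build one kernel.call log line.
/// ASSUMPTION: key names and ordering are hypothetical, not schema-v1.
pub fn format_kernel_call(tid: u32, cycle: u64, name: &str, path: Option<&str>) -> String {
    // A missing or empty path falls back to the legacy empty object,
    // so the line is byte-identical to what a path-less emitter prints.
    let args = match path {
        Some(p) if !p.is_empty() => format!("{{\"path\":\"{}\"}}", json_escape(p)),
        _ => "{}".to_string(),
    };
    format!(
        "{{\"kind\":\"kernel.call\",\"tid\":{},\"cycle\":{},\"name\":\"{}\",\"args_resolved\":{}}}",
        tid,
        cycle,
        json_escape(name),
        args
    )
}
```

Backslashes in guest paths are doubled by the escaping step, which is consistent with the `cache:\\d4ea4615\\e\\46ee8ca` form seen in the captured log lines.
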
### Canary side (~80 LOC additive)

- `xenia-canary/src/xenia/kernel/event_log.h`:
  - New `EmitKernelCallWithPath(name, path)` mirroring ours.

- `xenia-canary/src/xenia/kernel/event_log.cc`:
  - Implementation of `EmitKernelCallWithPath`.
  - New `phase_a_bridge::EmitImportAndCallWithCtx(module, ord, name,
    ppc_context)` that dispatches by `name` to read OBJECT_ATTRIBUTES
    from the PPCContext gpr and call the path-bearing form. Falls
    back to the legacy form when name doesn't match.
  - Helper `ReadObjectAttributesRawName(obj_attrs_ptr)` that mirrors
    ours's `object_attributes_raw_name` semantically (raw trimmed,
    no normalization).

- `xenia-canary/src/xenia/kernel/util/shim_utils.h`:
  - Both trampolines (X::Trampoline / Y::Trampoline) switched from
    `EmitImportAndCall(...)` to `EmitImportAndCallWithCtx(...,
    ppc_context)`. PPCContext is already in scope at that call site
    (it's the first argument the trampoline receives).

Total: ~50 LOC on ours, ~80 LOC on canary. Both behaviorally inert when cvar OFF.

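
Both sides dispatch on the export name to decide which register holds the OBJECT_ATTRIBUTES pointer. The sketch below shows the shape of that lookup on the Rust side. The argument positions in the table are assumptions made for illustration and must be checked against the xboxkrnl_io.cc signatures; the function names and the `[u64; 32]` register-file representation are also hypothetical. The only convention relied on is that PowerPC passes the first integer argument in r3.

```rust
/// PowerPC passes the first integer argument in r3,
/// so argument N of an export lives in gpr[3 + N].
const FIRST_ARG_GPR: usize = 3;

/// Zero-based argument position of the OBJECT_ATTRIBUTES* parameter.
/// ASSUMPTION: these positions are illustrative; verify each one
/// against the export signatures in xboxkrnl_io.cc before relying on it.
pub fn object_attributes_arg_index(name: &str) -> Option<usize> {
    match name {
        "NtCreateFile" | "NtOpenFile" => Some(2),
        "NtQueryFullAttributesFile" => Some(0),
        "NtOpenSymbolicLinkObject" => Some(1),
        _ => None, // not a path-bearing export: caller uses the legacy emitter
    }
}

/// Guest pointer to OBJECT_ATTRIBUTES, if the export is path-bearing
/// and the pointer is non-null.
pub fn object_attributes_ptr(name: &str, gpr: &[u64; 32]) -> Option<u32> {
    let idx = object_attributes_arg_index(name)?;
    let ptr = gpr[FIRST_ARG_GPR + idx] as u32; // guest pointers are 32-bit
    if ptr == 0 {
        None
    } else {
        Some(ptr)
    }
}
```

Returning `None` for both unknown exports and null pointers keeps the caller's fallback simple: anything other than `Some(ptr)` takes the legacy path, which is what keeps the output unchanged for exports outside the list.
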
## Gates (Phase 1 extension only; all pass)

| # | gate | result |
|---|---|---|
| 1 | cvar-OFF determinism 50M (3 runs) | PASS: all 3 = `b8fa0e0460359a4f660adb7605e053de` (matches C+9 baseline, unchanged) |
| 2 | Phase B `image_loaded_sha256` | PASS: `ea8d160e9369328a5b922258a92113efb8d7ce3e1a5c12cc521e375985c91c18` (matches baseline) |
| 3 | Phase A main matched-prefix | UNCHANGED: 102404 (extension was framing-only; no fix landed; no advance expected) |
| 4 | Both engines build clean | PASS |
| 5 | Phase A emitter det fields (2 runs) | PASS: both = `7489e90ef4c9be629af8c9fabb1cbdd7` (new; replaces C+9's `0b299c37…` because the new args_resolved.path field is part of the det signature) |
| 6 | Unit tests | PASS: 165 → 165 (no new, no regressions) |

## Schema status

The args_resolved field is part of schema-v1 already; this phase only
**populates** it for a subset of exports. No schema version bump.

The schema-v1 example (`schema-v1.md:112`) shows exactly the form we
emit. We are now compliant with the documented schema for path-bearing
exports rather than emitting an empty stub.

## Cascade prediction (resolution / next steps)

| stage | predicted | outcome |
|---|---|---|
| A=extend emitter cleanly | ~80% | LANDED |
| B=capture path string both engines | ~85% | LANDED: `cache:\d4ea4615\e\46ee8ca` matched both engines |
| C=classify root cause | ~75% | DONE: Class D (subsystem-required) |
| D=land fix in scope | ~55% | **ESCALATED**: fix is choice 2-4 above |
| E=main chain advances past 102404 | ~50% | NOT THIS SESSION |

## Reading-error class

NO new class. Existing classes #15 / ζ (VFS layout aliasing,
AUDIT-053) and AUDIT-038 (no oracle state) are re-affirmed:

* Class #15 ζ (AUDIT-053): persistent cache + journal `.tmp` writes
  create a warm-start regression.
* AUDIT-038 line: oracle state is forbidden in default boot.

Both rules together make the cache-seeding fix subsystem-tier, not
single-fix-tier.

## Handoff to dedicated cache-subsystem session

The next session targeting this divergence should:

1. **Decide cache-state strategy**:
   - (a) Implement Sylpheed's cache-generation logic so ours builds
     its own cache from scratch (matches canary's own bootstrap
     experience, but multi-hundred-LOC).
   - (b) Seed-once-then-persist: copy canary's cache into ours's
     cache_root behind a new cvar `--cache-seed-from=<path>`, then
     enable persistence. AUDIT-053's warm-start regression must be
     re-tested with AUDIT-054's FILE_DIRECTORY_FILE fix in tree
     (it landed AFTER 053's regression was observed).
   - (c) Hybrid: synthesize a stub success at NtQueryFullAttributesFile
     for known-good cache hashes, then synthesize NtCreateFile/Read
     responses with bytes captured from canary's cache files. Closest
     to a "single missing file plant" but for 23 files.

2. **Re-validate after the fix** that the warm-start regression
   identified in AUDIT-053 doesn't recur (AUDIT-054 may have fixed
   it; needs explicit re-test).

3. **Expect cascading Phase A divergences**, one for each cache hash
   the game looks up in turn: the divergence at 102,404 is only the
   FIRST. After `cache:\d4ea4615` is resolved, the game queries
   `cache:\69d8e45c` (idx 103810 already visible in ours.jsonl) and
   so on through 16+ distinct hashes per AUDIT-052.

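
To make option (c) concrete, the sketch below models the lookup it implies: an in-memory table keyed by raw guest path that answers the existence query with an NTSTATUS and hands back captured bytes for the follow-up read. `CapturedCache` and its methods are hypothetical names, exact-match keys are an assumption (no prefix-strip or case-fold is attempted), and nothing here addresses how such a table would be gated to respect AUDIT-038. It is a shape for discussion, not a recommendation of option (c) over (a) or (b).

```rust
use std::collections::HashMap;

pub const STATUS_SUCCESS: u32 = 0;
pub const STATUS_OBJECT_NAME_NOT_FOUND: u32 = 0xC000_0034;

/// Hypothetical table of cache files captured from a reference run,
/// keyed by the raw guest path (exact match; no normalization).
#[derive(Default)]
pub struct CapturedCache {
    files: HashMap<String, Vec<u8>>,
}

impl CapturedCache {
    /// Register the captured bytes for one guest path.
    pub fn insert(&mut self, path: &str, bytes: Vec<u8>) {
        self.files.insert(path.to_string(), bytes);
    }

    /// Status an existence query would return for `path`.
    pub fn query_attributes(&self, path: &str) -> u32 {
        if self.files.contains_key(path) {
            STATUS_SUCCESS
        } else {
            STATUS_OBJECT_NAME_NOT_FOUND
        }
    }

    /// Backing bytes for a follow-up open/read, if the path is known.
    pub fn read(&self, path: &str) -> Option<&[u8]> {
        self.files.get(path).map(|v| v.as_slice())
    }
}
```

Keeping the existence answer and the backing bytes in the same table avoids the failure mode noted under choice 3, where a query succeeds but the subsequent read has no valid content.
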
## Files in this audit run

| file | content |
|---|---|
| `escalation.md` | this file |
| `investigation.md` | Phase 1-4 walkthrough |
| `re-validation.md` | gate results (Phase 1 extension only) |
| `ours.jsonl`, `ours-determ.jsonl`, `canary.jsonl` | Phase A logs with new args_resolved field |
| `diff-report.md` | re-run with path field populated |
| `snap/ours/` | Phase B snapshot (unchanged from C+9) |
| `digest-cvaroff-{1,2,3}.json` | 3× determinism (all = C+9 baseline) |

## Next target

**Same idx 102,404 NtQueryFullAttributesFile**, but in a dedicated
cache-subsystem session. Path framing is now captured for the next
investigator's first read.