From 767503508212897194986306c86257ec7df254aa Mon Sep 17 00:00:00 2001
From: MechaCat02
Date: Mon, 4 May 2026 21:01:25 +0200
Subject: [PATCH] =?UTF-8?q?fix(kernel):=20KRNBUG-IO-002=20=E2=80=94=20vol-?=
 =?UTF-8?q?info=20class-3=20returns=200x10000=20alloc=20unit=20(canary=20N?=
 =?UTF-8?q?ullDevice)?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

`nt_query_volume_information_file` class-3 (`FileFsSizeInformation`) was
returning sectors_per_unit=1, bytes_per_sector=2048 (alloc unit 2048).
Replaced with canary's NullDevice byte-identical values sectors=0x80,
bps=0x200 (alloc unit 0x10000), with total / available allocation units
lowered to 0x10 / 0x10 to match.

Reference: xenia-canary/src/xenia/vfs/devices/null_device.h:38-46
(`NullDevice::sectors_per_allocation_unit()` and `bytes_per_sector()`);
consumed by canary's `NtQueryVolumeInformationFile_entry` at
xenia-canary/src/xenia/kernel/xboxkrnl/xboxkrnl_io_info.cc:355-365.

Tests 591 → 592 (added
`nt_query_volume_information_file_class3_returns_64k_alloc_unit`).
Lockstep `instructions=100000010, swaps=2, draws=0` deterministic across
two `--stable-digest -n 100M` reruns. sylpheed_n50m oracle still matches
its existing golden — observably a no-op at -n 50M.

The audit-006-predicted 7→0 cascade did NOT fire (canary-only exports
still 7, identical set; XexCheckExecutablePrivilege still priv=0xA only;
XamTaskSchedule still 0). All 16 NtQueryVolumeInformationFile calls in
our 500M trace originate from a single LR 0x82611f38 and complete
successfully — vol-info is therefore not the priv-11 gate. The fix value
is correct (canary-byte-identical) but is not load-bearing for the gate;
landing it anyway because it's the right value and unblocks no
regression.

Stop condition triggered per the IO-002 task brief — no second fix this
session. Next-session: --pc-probe on sub_824A9710 entry to find the
actual upstream gate.

See `audit-findings.md` (KRNBUG-IO-002 entry) and
`audit-runs/post-IO-002/` for the full diagnostic trail.

Co-Authored-By: Claude Opus 4.7 (1M context)
---
 audit-findings.md                  | 80 ++++++++++++++++++++++++++++++
 crates/xenia-kernel/src/exports.rs | 39 +++++++++++++--
 2 files changed, 115 insertions(+), 4 deletions(-)

diff --git a/audit-findings.md b/audit-findings.md
index 2e2dd74..1b41add 100644
--- a/audit-findings.md
+++ b/audit-findings.md
@@ -4999,3 +4999,83 @@ for n in sorted(set(canary) - set(ours)):
     print(f' {canary[n]:>5} {n}')
 PY
 ```
+
+### KRNBUG-IO-002 — `nt_query_volume_information_file` block size (LANDED, gate hypothesis FALSIFIED)
+
+**Status:** applied (branch `xboxkrnl-vol-allocunit/p0-65536-cluster`,
+single squash commit). Tests 591 → 592. Lockstep
+`instructions=100000010, swaps=2, draws=0` deterministic across two
+reruns. sylpheed_n50m oracle still matches its existing golden.
+
+**The fix.** `crates/xenia-kernel/src/exports.rs:1241-1269`,
+`nt_query_volume_information_file` class-3 (FileFsSizeInformation)
+branch, was returning `(total=0x100000, free=0,
+sectors_per_unit=1, bytes_per_sector=2048)`. Replaced with the
+canary-NullDevice byte-identical `(total=0x10, free=0x10,
+sectors_per_unit=0x80, bytes_per_sector=0x200)` (product = 65536).
+Reference: `xenia-canary/src/xenia/vfs/devices/null_device.h:38-46`.
+
+**The cascade hypothesis.** AUDIT-006 predicted that fixing this would
+unblock seven canary-only kernel exports — the priv-11 query at
+`sub_824A9710` would fire, `XamTaskSchedule` at `lr=0x824a9a10` would
+fire, the Cache0 callback thread would spawn, and dispatcher 0x100c's
+producer would finally fire (closing the 6-session producer hunt).
+
+**The cascade DID NOT FIRE.** Fresh 500 M trace at
+`audit-runs/post-IO-002/ours.log` (692 MB, 5.6 M lines):
+
+| Metric | Pre-IO-002 (audit-006) | Post-IO-002 |
+|---|---|---|
+| canary-only kernel exports | 7 | **7 (identical set)** |
+| `XexCheckExecutablePrivilege` calls | 1 (priv=0xA only) | **1** (still no priv=0xB) |
+| `XamTaskSchedule` calls | 0 | **0** |
+| `KeResetEvent / ObCreateSymbolicLink / KeReleaseSemaphore / ExTerminateThread / XamTaskCloseHandle / XamUserReadProfileSettings` | 0 | **0** |
+| `NtQueryVolumeInformationFile` calls | 16 | **16** (no new sites reached) |
+| `swaps` | 2 | 2 |
+| `draws` | 0 | 0 |
+| Worker thread spawns | 19 | 18 (within noise) |
+| `imports` at -n 100M (stable digest) | 987686 | **987630** (-56) |
+
+**Diagnostic.** All 16 `NtQueryVolumeInformationFile` calls in our trace
+originate from a single LR `0x82611f38`, a downstream consumer that
+**completes successfully** in both pre- and post-fix runs. The audit-006
+premise that `sub_824ABA98`/`sub_824A9710` consume the volume-info reply
+at the priv-11 gate is therefore likely incorrect, *or* the gate consumes
+a different information class via a different export entirely.
+
+**Stop-condition triggered.** Per the IO-002 task brief, this session
+landed the correct fix (it makes our reply byte-identical to canary's
+NullDevice and survives every test we have) but did not pivot to a
+second fix. The branch is kept because the value change is correct
+and unblocks no regression; it is, however, **not load-bearing for
+the priv-11 gate**.
+
+**Next-session next-gate hypothesis (untested, ranked by likelihood):**
+
+1. **`sub_824A9710` early-exit probe.** Per AUDIT-005 instrumentation
+   the priv-11 site has never fired in any session. Use `--pc-probe` on
+   the entry of `sub_824A9710` and probe each conditional branch within
+   it; whichever branch exits the function before the priv-11
+   `XexCheckExecutablePrivilege` call site is the actual gate.
+2. **Different info-class.** `nt_query_information_file` (class 5
+   `FileStandardInformation`, class 22 etc.) or
+   `nt_query_full_attributes_file` may be the actual consumer. The
+   16 calls at LR `0x82611f38` are *not* the gate even though they
+   complete successfully.
+3. **Mis-attributed disasm.** AUDIT-005's identification of
+   `sub_824ABA98 = VerifyDirBlockSize` came from disasm reading; IO-001's
+   runtime trace already invalidated parts of that attribution.
+   Re-disassemble `sub_824A9710` with `xenia-rs dis --json --at 0x824a9710`
+   and walk every conditional that might exit before the priv-11 query.
+4. **A different IOCTL.** `NtDeviceIoControlFile` is now reachable
+   (KRNBUG-IO-001 unblocked it); some FsCtl response we return may be
+   the new gate.
+
+**Trace artifacts:**
+- `audit-runs/post-IO-002/ours.log` — 500 M trace, post-fix
+- `audit-runs/post-IO-002/canary.log` — copy of the audit-006 canary oracle
+- `audit-runs/post-IO-002/diff.py` — copy of audit-006 diff tool
+- `audit-runs/post-IO-002/lock_n100m_run{1,2}.json` — bit-identical lockstep digests
+- `audit-runs/post-IO-002/canary_only.txt` — set-difference output (the 7-entry list)
+- `audit-runs/post-IO-002/canary_exports.txt`, `ours_exports.txt` — sorted unique export names
+
diff --git a/crates/xenia-kernel/src/exports.rs b/crates/xenia-kernel/src/exports.rs
index 2e77fb0..90ff2c7 100644
--- a/crates/xenia-kernel/src/exports.rs
+++ b/crates/xenia-kernel/src/exports.rs
@@ -1250,10 +1250,10 @@ fn nt_query_volume_information_file(ctx: &mut PpcContext, mem: &GuestMemory, _st
     // SectorsPerAllocationUnit(u32), BytesPerSector(u32)
     let written: u32 = match class {
         3 if length >= 24 => {
-            mem.write_u64(info, 0x10_0000); // ~2GB at 2KB sectors
-            mem.write_u64(info + 8, 0);
-            mem.write_u32(info + 16, 1);
-            mem.write_u32(info + 20, 2048);
+            mem.write_u64(info, 0x10);
+            mem.write_u64(info + 8, 0x10);
+            mem.write_u32(info + 16, 0x80);
+            mem.write_u32(info + 20, 0x200);
             24
         }
         _ => {
@@ -4331,6 +4331,37 @@ mod tests {
         assert_eq!(mem.read_u8(info_buf + 21), 0, "normal file not directory");
     }
 
+    #[test]
+    fn nt_query_volume_information_file_class3_returns_64k_alloc_unit() {
+        let (mut ctx, mut mem, mut state) = fresh();
+        let h = state.alloc_handle_for(KernelObject::File {
+            path: String::new(),
+            size: 0,
+            position: 0,
+            data: std::sync::Arc::new(Vec::new()),
+            dir_enum_pos: None,
+        });
+        let iosb = SCRATCH_BASE;
+        let info_buf = SCRATCH_BASE + 0x100;
+        ctx.gpr[3] = h as u64;
+        ctx.gpr[4] = iosb as u64;
+        ctx.gpr[5] = info_buf as u64;
+        ctx.gpr[6] = 24;
+        ctx.gpr[7] = 3; // FileFsSizeInformation
+        nt_query_volume_information_file(&mut ctx, &mut mem, &mut state);
+
+        assert_eq!(ctx.gpr[3], STATUS_SUCCESS as u64);
+        let sectors_per_unit = mem.read_u32(info_buf + 16);
+        let bytes_per_sector = mem.read_u32(info_buf + 20);
+        assert_eq!(sectors_per_unit, 0x80);
+        assert_eq!(bytes_per_sector, 0x200);
+        assert_eq!(
+            sectors_per_unit * bytes_per_sector,
+            0x10000,
+            "alloc unit must be 64 KiB to match canary NullDevice",
+        );
+    }
+
     // ===== PKEVENT shim =====
 
     /// Write a DISPATCHER_HEADER at the given guest pointer.