xenia-rs

Author	SHA1	Message	Date
MechaCat02	b78e6fd205	fix(kernel): KRNBUG-IO-004 — real XamNotifyCreateListener + XNotifyGetNext per canary Canary's RegisterNotifyListener (kernel_state.cc:1013-1033) auto-enqueues four startup notifications on the first listener whose mask covers kXNotifySystem (SystemUI=0x09 + SystemSignInChanged=0x0A) and kXNotifyLive (LiveConnectionChanged=0x02000001 + LiveLinkStateChanged=0x02000003). XNotifyGetNext (xam_notify.cc:22-96) pops the queue with mask + version filtering on enqueue per xnotifylistener.cc:38-51. Our prior stubs returned 0 forever; the dispatch loop at 0x822f1be8 in sub_822F1AA8 was thus bypassed indefinitely. Implementation: - KernelObject::NotifyListener { mask, max_version, queue, waiters } variant. - KernelState::has_notified_startup + has_notified_live_startup gates. - xam_notify_create_listener: mask=r3 (qword), max_version=r4 (clamped <=10), alloc handle, conditional 4-tuple startup enqueue. - xnotify_get_next: handle/match_id/id_ptr/param_ptr in r3..r6; pop_front (or scan-by-id), with mask + version filter applied at enqueue time. - 5 unit tests covering: full-mask 4 startup notifications, second-listener no re-fire, system-only mask filtering, max_version=0 too-new drop, unknown handle returning 0. Tests: 594 -> 599. Lockstep `-n 100M` instructions=100000012 deterministic across 2 reruns; bit-identical run-to-run diff. Cascade (verified at -n 500M): - dispatch arm 0x822f1be8 fires; sub_82173DC8 entered. - 3/21 renderer-cluster L1 PCs newly reached: 0x822c6870 (2 workers), 0x824563e0, 0x823ddb50. - canary-only export delta 7 -> 3 (reclassified to fired: KeResetEvent, ObCreateSymbolicLink, XamTaskCloseHandle, XamTaskSchedule). - worker thread count 18 -> 20. - signal_attempts on handle 0x15e0 = 1 (primary=1), was 0. - draws=0 still expected at this step. LOC: 119 (97 impl + 22 scaffolding pattern matches across main.rs / objects.rs / state.rs) <= 120. 
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-06 16:55:51 +02:00
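The enqueue-time filter described in KRNBUG-IO-004 can be sketched as follows. This is a minimal illustration, not the code in the commit. The bit layout used to decode a notification id (mask index in bits 25..30, version in bits 16..24), the values of `NOTIFY_SYSTEM` / `NOTIFY_LIVE`, and the "match_id == 0 means pop the front" convention are all assumptions. The one-shot `has_notified_startup` gates are omitted.

```rust
use std::collections::VecDeque;

// Assumed mask bits; the commit only names kXNotifySystem and kXNotifyLive.
pub const NOTIFY_SYSTEM: u64 = 1 << 0;
pub const NOTIFY_LIVE: u64 = 1 << 1;

// The four startup notification ids listed in the commit message.
pub const STARTUP_IDS: [u32; 4] = [0x09, 0x0A, 0x0200_0001, 0x0200_0003];

pub struct NotifyListener {
    pub mask: u64,
    pub max_version: u32,
    pub queue: VecDeque<(u32, u32)>, // (id, param)
}

impl NotifyListener {
    pub fn new(mask: u64, max_version: u32) -> Self {
        // The commit clamps max_version to <= 10.
        NotifyListener { mask, max_version: max_version.min(10), queue: VecDeque::new() }
    }

    /// Filters on enqueue: drops ids outside the mask or newer than max_version.
    pub fn enqueue(&mut self, id: u32, param: u32) -> bool {
        let mask_index = (id >> 25) & 0x3F; // assumed layout
        let version = (id >> 16) & 0x1FF; // assumed layout
        if self.mask & (1u64 << mask_index) == 0 || version > self.max_version {
            return false;
        }
        self.queue.push_back((id, param));
        true
    }

    /// Pops the front entry, or scans for a specific id when match_id != 0.
    pub fn get_next(&mut self, match_id: u32) -> Option<(u32, u32)> {
        if match_id == 0 {
            return self.queue.pop_front();
        }
        let pos = self.queue.iter().position(|&(id, _)| id == match_id)?;
        self.queue.remove(pos)
    }
}
```

Because filtering happens on enqueue, `get_next` never needs to re-check the mask. A system-only listener simply never sees the two Live ids.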
MechaCat02	a1a7265f29	fix(kernel): KRNBUG-IO-003 — NtDeviceIoControlFile real impl mirroring NullDevice::IoControl Replace the stub_success registration of NtDeviceIoControlFile at exports.rs:90 with a real handler for FsCtlCodes 0x70000 (drive geometry) and 0x74004 (partition info), mirroring xenia-canary xboxkrnl_io.cc:645-678 + null_device.{h,cc}. The 16-byte 0x74004 response with cache_size=0xFF000 at OUT+8 is the gate that lets sub_824ABD88 return SUCCESS and sub_824A9710 reach the priv-11 XexCheckExecutablePrivilege site identified by KRNBUG-AUDIT-007. Stack args 9-10 (OutputBuffer, OutputBufferLength) read from the caller's parameter save area at [sp+0x54] / [sp+0x5C] per the Xbox 360 PowerPC EABI (linkage area sp+0..sp+8, 8-quadword spill area sp+0x14..sp+0x54, then stack args every 8 bytes). First HLE export in the codebase to need 9+ args. Cascade vs. KRNBUG-AUDIT-007 prediction (5/8 held): - XexCheckExecutablePrivilege count 1 → 2 (priv=0xA + priv=0xB) ✓ - XamTaskSchedule count 0 → 1 ✓ - canary-only exports 7 → 3 (audit predicted ≤3) ✓ - 0x15e0 semaphore signal_attempts 0 → 1 (bonus) - 0x100c worker spawn DID NOT fire (still UNCREATED) ✗ - 0x1004 signal_attempts unchanged ✗ - Worker spawn count unchanged at 19 ✗ Tests: 592 → 594. Lockstep deterministic at -n 100M (run1 ≡ run2 ≡ run3, byte-identical). instructions=100000010 → 100000019, imports 407417 → 987524 (+2.4×). swaps=2 draws=0 plateau persists. sylpheed_n50m golden re-baselined instructions=50000004→50000003, imports=407362→407255. sylpheed_n2m unchanged. Still canary-only after this fix: ExTerminateThread, KeReleaseSemaphore, XamUserReadProfileSettings. The next downstream gate is somewhere past XamTaskSchedule's completion path. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 22:00:12 +02:00
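The stack-argument addressing and the 0x74004 response from KRNBUG-IO-003 can be sketched as below. The helper names are hypothetical. The offsets 0x54 and 0x5C and the 8-byte stride come from the commit message, while the 32-bit big-endian width of `cache_size` is an assumption.

```rust
/// Offset of argument 9 (the first stack-passed argument) from the caller's sp.
const FIRST_STACK_ARG_OFFSET: u32 = 0x54;
/// Stack arguments are spaced 8 bytes apart.
const STACK_ARG_STRIDE: u32 = 8;

/// Guest address of 1-based argument `n`, or None for register-passed
/// arguments (n < 9). Hypothetical helper, not the project's API.
pub fn stack_arg_addr(sp: u32, n: u32) -> Option<u32> {
    if n < 9 {
        return None;
    }
    let offset = FIRST_STACK_ARG_OFFSET.wrapping_add((n - 9).wrapping_mul(STACK_ARG_STRIDE));
    Some(sp.wrapping_add(offset))
}

/// 16-byte response for FsCtlCode 0x74004: zero-filled except for
/// cache_size = 0xFF000 at OUT+8 (assumed to be a big-endian u32).
pub fn partition_info_response() -> [u8; 16] {
    let mut out = [0u8; 16];
    out[8..12].copy_from_slice(&0x000F_F000u32.to_be_bytes());
    out
}
```

With these offsets, argument 9 (OutputBuffer) resolves to `sp + 0x54` and argument 10 (OutputBufferLength) to `sp + 0x5C`.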
MechaCat02	c51f51f9cb	feat(kernel): KRNBUG-AUDIT-007 — --branch-probe instrumentation; sub_824A9710 exit gate identified Sister to --pc-probe / --ctor-probe but emits a single compact one-line BRANCH-PROBE record per fire (pc, tid, hw, cycle, r3, lr, cr0/cr6 flags) with no back-chain. Designed for tracing every conditional-branch fire inside a candidate-gate function so the last PC reached before the function epilogue identifies the exit branch. Runtime trace at audit-runs/audit-007/sub_824A9710-trace.log decisively identifies the priv-11 gate: - Exit branch: 0x824a9944 (post bl sub_824ABD88 first call) - Responsible kernel call: NtDeviceIoControlFile, FsCtlCode=0x74004 (registered as stub_success at exports.rs:90) - Mechanical chain: stub returns 0/SUCCESS without writing OUT, game reads [out_buf+8], finds zero, assigns hardcoded 0xC0000034 (STATUS_OBJECT_NAME_NOT_FOUND) at sub_824ABD88:0x824abea8-ac, exits via 0x824a9944's lt branch before priv-11 site at 0x824a99a0. 592→592 tests; lockstep instructions=100000010, swaps=2, draws=0 deterministic across reruns. Read-only diagnostic — no fix this session. Next session: KRNBUG-IO-003 (real NtDeviceIoControlFile per canary NullDevice::IoControl for FsCtlCodes 0x70000 + 0x74004). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 21:35:10 +02:00
MechaCat02	7675035082	fix(kernel): KRNBUG-IO-002 — vol-info class-3 returns 0x10000 alloc unit (canary NullDevice) `nt_query_volume_information_file` class-3 (`FileFsSizeInformation`) was returning sectors_per_unit=1, bytes_per_sector=2048 (alloc unit 2048). Replaced with canary's NullDevice byte-identical values sectors=0x80, bps=0x200 (alloc unit 0x10000), with total / available allocation units lowered to 0x10 / 0x10 to match. Reference: xenia-canary/src/xenia/vfs/devices/null_device.h:38-46 (`NullDevice::sectors_per_allocation_unit()` and `bytes_per_sector()`); consumed by canary's `NtQueryVolumeInformationFile_entry` at xenia-canary/src/xenia/kernel/xboxkrnl/xboxkrnl_io_info.cc:355-365. Tests 591 → 592 (added `nt_query_volume_information_file_class3_returns_64k_alloc_unit`). Lockstep `instructions=100000010, swaps=2, draws=0` deterministic across two `--stable-digest -n 100M` reruns. sylpheed_n50m oracle still matches its existing golden — observably a no-op at -n 50M. The audit-006-predicted 7→0 cascade did NOT fire (canary-only exports still 7, identical set; XexCheckExecutablePrivilege still priv=0xA only; XamTaskSchedule still 0). All 16 NtQueryVolumeInformationFile calls in our 500M trace originate from a single LR 0x82611f38 and complete successfully — vol-info is therefore not the priv-11 gate. The fix value is correct (canary-byte-identical) but is not load-bearing for the gate; landing it anyway because it's the right value and introduces no regression. Stop condition triggered per the IO-002 task brief — no second fix this session. Next session: --pc-probe on sub_824A9710 entry to find the actual upstream gate. See `audit-findings.md` (KRNBUG-IO-002 entry) and `audit-runs/post-IO-002/` for the full diagnostic trail. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 21:01:25 +02:00
MechaCat02	bef9793aec	feat(kernel): KRNBUG-IO-001 — NtReadFile on synth-empty file returns SUCCESS+0, not EOF AUDIT-005's static attribution to sub_824ABA98 was wrong. The 0xC0000011 (STATUS_END_OF_FILE) at lr=0x824a97e4 traces to the NtReadFile call at 0x824a9810 inside sub_824A9710 — the cache-loader reads 1024 B from offset 2048 of `\Device\Harddisk0\partition0`. Our synth-empty fallback returned EOF (start_pos 2048 > size 0), so the function bailed via RtlNtStatusToDosError before sub_824ABA98 was ever called. Canary mounts partition0 to a NullDevice; `NullFile::ReadSync` ([null_file.cc:24-31](xenia-canary/src/xenia/vfs/devices/null_file.cc)) returns X_STATUS_SUCCESS with bytes_read=0 and never touches the buffer. Sylpheed's caller pre-zeroes the 1024-byte stack buffer (`memset(sp+208, 0, 1024)` at sub_824A9710 prologue), validates a "Josh" magic on the first read, and falls back to the cache-recreate path when the magic doesn't match. The fix mirrors NullFile semantics: when the open synthesized a zero-length file (`data.is_empty() && size == 0`), NtReadFile returns SUCCESS with information=0 and the buffer untouched. 
Effects (chain-of-effects verification at -n 500M): - tests: 590 → 591 (added regression covering NullDevice semantics) - lockstep: deterministic across 3 reruns (same instructions=100000010, swaps=2) - sylpheed_n50m golden re-baselined: instructions 50000004→50000000, imports 407416→407362 - canary kernel-call diff: 10 → 7 missing exports (XeCryptSha + XeKeysConsolePrivateKeySign + NtDeviceIoControlFile now run; the cache-recreate path executes through to NtWriteFile) - boot reaches silph::Silph::Impl::OnInit: 19 worker threads spawn (was 6 before the fix) - parked-handle 0x1004 still signal_attempts=0; the original 0x100c and 0x15e0 are now <UNCREATED> because cascade walked past them and the handle assignments shifted; new parked sites: 0x12fc/0x1600/ 0x1040/0x10b8/0x15e8/0x1014/0x101c/0x10bc/0x1044 - draws=0 plateau persists; renderer is multi-causal blocked Next blocker: per the canary-only diff, XamTaskSchedule + the cluster of XAM exports (XamTaskCloseHandle, XamUserReadProfileSettings, ObCreateSymbolicLink) and the post-thread-exit chain (ExTerminateThread, KeReleaseSemaphore, KeResetEvent) are the next-up frontier.	2026-05-04 20:20:10 +02:00
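The NullFile-style read semantics from KRNBUG-IO-001 can be sketched like this. The function name and signature are hypothetical. Only the `data.is_empty() && size == 0` condition and the two status codes are taken from the commit message.

```rust
pub const STATUS_SUCCESS: u32 = 0x0000_0000;
pub const STATUS_END_OF_FILE: u32 = 0xC000_0011;

/// Returns (status, bytes_read). `data` is the backing content and `size`
/// the reported file size; `buf` is the caller's buffer.
pub fn read_file(data: &[u8], size: u64, offset: u64, buf: &mut [u8]) -> (u32, usize) {
    // Synthesized zero-length file: mirror NullFile::ReadSync by reporting
    // SUCCESS with zero bytes read and leaving the buffer untouched.
    if data.is_empty() && size == 0 {
        return (STATUS_SUCCESS, 0);
    }
    // Regular file: reading at or past the end is EOF.
    if offset >= data.len() as u64 {
        return (STATUS_END_OF_FILE, 0);
    }
    let start = offset as usize;
    let n = buf.len().min(data.len() - start);
    buf[..n].copy_from_slice(&data[start..start + n]);
    (STATUS_SUCCESS, n)
}
```

The early return is what lets a caller that pre-zeroes its buffer fall through to its own "magic mismatch" path instead of bailing on an EOF status.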
MechaCat02	19659d7f76	feat(kernel): KRNBUG-XAM-001 — XGetAVPack returns 8 (HDMI), not 0x16 Mirrors canary's cvars::avpack default (xam_info.cc:35) and Sylpheed's accepted set {3,4,6,8} (xam_info.cc:250-251). With KRNBUG-XEX-001 having flipped the priv-10 gate, XGetAVPack now reaches its caller in sub_824AB578; returning 0x16 caused Sylpheed to abort the AV/crypto block before XeCryptSha. Cascade walks one step (canary-only export list 11 → 10); sub_824ABA98 is the next candidate. Tests: 589 → 590. Goldens re-baselined (n50m: 50000005→50000004, imports 407417→407416). Lockstep deterministic across 3 reruns at -n 100M (instructions=100000010, import_calls=987686 +2.4×, swaps=2). 9-PC producer probe still 0×; parked handles 0x1004/0x100c/0x15e0 still signal_attempts=0. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 18:54:24 +02:00
MechaCat02	1a892d4641	feat(kernel): KRNBUG-XEX-001 — real XexCheckExecutablePrivilege from XEX header bitmap Replace stub_return_zero with a canary-faithful implementation that returns bit `priv` of the loaded XEX's XEX_HEADER_SYSTEM_FLAGS (key 0x00030000) bitmap. Mirrors xenia-canary xboxkrnl_modules.cc:22-39: `(flags >> priv) & 1` for priv < 32, else 0. Plumbing: - xenia-xex: header_keys::SYSTEM_FLAGS const + get_system_flags() accessor. - xenia-kernel/state.rs: pub xex_system_flags: u32 + xex_priv_logged HashSet for one-shot per-priv tracing. - xenia-app: kernel.xex_system_flags wired in cmd_exec_inner. - xenia-kernel/exports.rs: real export body + unit test covering bits 10/11/0/64 + zero-flags case. Sylpheed's bitmap is 0x00000400 (only XEX_SYSTEM_PAL50_INCOMPATIBLE, bit 10). At -n 500M with the fix: - XGetAVPack: 0 -> 1 (priv-10 gate at lr=0x824ab598 flipped). - 10 other canary-only exports + 9 producer PCs + 3 parked handles unchanged. Priv-11 site at sub_824A9710 is downstream and still not reached — AV/crypto block aborts after XGetAVPack returns our placeholder 0x16 (canary returns 8/HDMI; Sylpheed accepts only 3/4/6/8 per xenia-canary xam_info.cc:250-251). Tests 588 -> 589. Lockstep deterministic (3 reruns identical): n50m goes 50000008 -> 50000005 instr / 407415 -> 407417 imp / swaps=2 / draws=0. Goldens re-baselined (sylpheed_n50m, sylpheed_n2m); oracle test green. Full chain-of-effects + next-frontier hand-off in audit-findings.md under KRNBUG-XEX-001. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 18:32:51 +02:00
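The privilege lookup in KRNBUG-XEX-001 reduces to a single bit test. The sketch below uses a hypothetical free function, not the actual export body, which reads its arguments from guest registers.

```rust
/// Returns bit `privilege` of the XEX system-flags bitmap, or 0 when the
/// index falls outside the 32-bit bitmap.
pub fn check_executable_privilege(system_flags: u32, privilege: u32) -> u32 {
    if privilege < 32 {
        (system_flags >> privilege) & 1
    } else {
        0
    }
}
```

With Sylpheed's bitmap of 0x00000400, privilege 10 yields 1 and privilege 11 yields 0, which matches the gate behavior the commit reports.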
MechaCat02	3e2fc1ec88	feat(kernel): KRNBUG-AUDIT-005 — --pc-probe extension + canary diff identifies XexCheckExecutablePrivilege stub cascade Extends `--ctor-probe` machinery into `--pc-probe` (clap alias) with the optional `PC@DISPATCHER:OFFSET` token form: on a hit, the helper additionally logs `[disp+off]` — what the producer's `lwz r3, OFFSET(r3)` is about to read. Reuses `parse_hex_u32`; both flags share parser + storage. Read-only diagnostic. Lockstep digest preserved (`run digest matches golden` at -n 50M `--stable-digest`). 588 tests green. Decisive findings (full deliverable in `audit-findings.md` / `audit-runs/audit-005/`): - Failure mode α confirmed for KRNBUG-AUDIT-004: all 9 producer call sites for handles 0x100c (5 sites) and 0x15e0 (4 sites) fire 0× at -n 500M. The producer code path is not reached. - Set-diff of kernel-call sequences (canary.log oracle vs ours.log at -n 500M) identifies 11 exports canary calls and we don't: XGetAVPack, XeCryptSha, XeKeysConsolePrivateKeySign, ObCreateSymbolicLink, NtDeviceIoControlFile (×2), XamUserReadProfileSettings (×2), XamTaskSchedule, XamTaskCloseHandle, KeReleaseSemaphore (×268), KeResetEvent, ExTerminateThread (×2). - XGetAVPack has exactly one caller (sub_824AB578 at 0x824AB5A0). The 4 instructions immediately preceding it are: `addi r3, r0, 10` (privilege bit 10); `bl XexCheckExecutablePrivilege`; `cmpli 0, r3, 0`; `bc 12, eq, 0x824AB724` (if r3==0, skip whole block). - exports.rs:193 registers XexCheckExecutablePrivilege as stub_return_zero. Always returning 0 -> guest takes the branch and skips the entire AV/crypto/save-data init block. - The other call site (sub_824A9710 at 0x824A99A0) queries privilege 11 with opposite polarity (bne) -> gates XamTaskSchedule on the privilege-NOT-set arm. With both stubs returning 0, the guest walks the wrong arm of every privilege-gated branch. 
- This explains why the dispatcher fields read zero ([0x828F3D08+0x50]=0, [0x828F4070+0x24]=0 from AUDIT-004 dumps): the ctors run, but the producers that would populate those fields with a non-zero handle never execute. Next session: replace XexCheckExecutablePrivilege stub with real priv-bit lookup from XEX header. See audit-findings.md KRNBUG-AUDIT-005 for the validation matrix. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 18:06:22 +02:00
MechaCat02	7108d6d131	feat(kernel): KRNBUG-AUDIT-004 — --ctor-probe PC hook + --dump-addr struct dump Diagnostic-only, read-only. Lockstep `instructions=100000002` preserved bit-exact at -n 100M --stable-digest. 586 → 588 tests. Adds two read-only diagnostics for the parked-waiter producer hunt: * `--ctor-probe=0x8217C850,0x...` — at every interpreter step, if `ctx.pc` is in the configured set, print one `CTOR-PROBE` line capturing live r3 (= `this` in MSVC PPC ctors), lr (= return site), sp, plus an 8-frame back-chain with saved-r31/r30 per frame. Fires once per hit, exactly what the 8-instance-pool probe needed. * `--dump-addr=0x828F3D08,0x828F4070,0x828F3EC0,...` — at end of run (after the FOCUS report in `dump_thread_diagnostic`), each address gets a 128-byte hex + be32 + ASCII dump. Used to inspect the static dispatcher / job-queue struct layouts AUDIT-003 identified. Both gated default-off; empty set is a single `is_empty()` test on the hot path. No guest state is mutated, so the `sylpheed_nm.json` lockstep digest is preserved. KRNBUG-AUDIT-004 findings (corrects KRNBUG-AUDIT-002/003): 1. The "8-instance pool" hypothesis for handle 0x1004 is FALSE. Probing the inner per-instance ctors `[0x821783D8, 0x82181750, 0x821701C8]` at -n 50M shows each fires EXACTLY ONCE with r3 = `[0x828F3EC0, 0x828F3D08, 0x828F4070]` respectively. All three handles are Meyers-style singletons with one dispatcher each. The "called 8 times" claim came from miscounting raw entries to the OUTER getter sub_8217C850 — but that getter is itself a Meyers-singleton-getter; only the FIRST entry cascades through to bl 0x821783D8 (gated on `[0x828F48D8] bit 0`). 2. The producer indirection layer is the singleton-getter itself. Static byte-scan of .rdata / .data shows 0 hits for the dispatcher addresses — no static registry table holds them. 
But the xrefs table for the OUTER getters reveals 5–6 callers each, MOSTLY non-create-chain, sharing the canonical producer pattern: `bl outer_singleton_getter; lwz r3, OFFSET(r3); bl 0x824AA1D8` (with OFFSET=80 for 0x100c, =36 for 0x15e0). So the AUDIT-003 xref audit was necessary but not sufficient — it correctly saw "no direct producer references" but missed the singleton-getter indirection layer. 3. Dispatcher struct layouts (128-byte dumps captured at -n 50M --halt-on-deadlock): - 0x828F3D08 (handle 0x100c): event_handle at +0x4C (0x100c), thread_handle at +0x48 (0x1010), self-pointer at +0x74, capacity 7 at +0x28, queue empty (+0/+3C = -1). - 0x828F4070 (handle 0x15e0): event_handle at +0x20 (0x15e0), sibling-handle 0x15E4 at +0x1C, queue empty (+0x10 = -1). - 0x828F3EC0 (handle 0x1004): event_handle at +0x78 (0x1004), 4 guest-heap sub-buffers at +0x20/+0x3C/+0x44/+0x50 in 0x4xxxxxxx range — noticeably different layout from the other two pure POD job queues. Files: crates/xenia-kernel/src/state.rs ctor_probe_pcs / dump_addrs + fire_ctor_probe_if_match + 2 tests crates/xenia-app/src/main.rs Exec --ctor-probe / --dump-addr CLI parsing, prologue hook, end-of-run struct dumper audit-findings.md KRNBUG-AUDIT-004 entry audit-runs/audit-004/ 50M probe runs (v1 outer-getter hits, v2 inner-ctor hits proving the singleton hypothesis) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 17:09:47 +02:00
MechaCat02	f84e947547	feat(kernel): KRNBUG-AUDIT-003 — vtable/RTTI class probe at handle creation + wait Adds a read-only MSVC RTTI traversal helper (`read_class_at_this`) and a `probe_create_stack_classes` integration that walks each captured back-chain frame for handle creates in `--trace-handles-focus` and probes each frame's most-likely `this` candidate (live r31/r30/r3 for frame 0; saved-r31/r30 from the prologue spill area at [fp-12]/ [fp-16] for deeper frames). False-positive guard rejects the CRT static-init iterator pattern (vtable's first two slots must be image- range function pointers — PPC instruction words like `mflr r12` are not in 0x82xxxxxx). `dump_thread_diagnostic` now takes `&GuestMemory` so the FOCUS report prints, for each parked waiter, a WAIT-THREAD block with full back- chain frames and per-slot saved-register dump for offline lookup. End-to-end finding (-n 500M producer-trace): * Handle 0x100c dispatcher = 0x828F3D08 (image rdata; verified by sub_82181750 disasm + xref table). [this+0] = -1 sentinel — POD job queue, NOT a C++ polymorphic class. * Handle 0x15e0 dispatcher = 0x828F4070 (same shape). * Handle 0x1004's 8-instance pool members still TBD (MSVC ctors didn't preserve `this` in r31). * 0x42450b5c is a separate audit class (heap-allocated, parks via non-`do_wait_single` path). Decisive xref audit: every reference to 0x828F3D08 / 0x828F4070 in the static analysis is in a ctor or the CRT init driver. NO producer code references either dispatcher base. Confirms `signal_attempts=0` is unreachable-producer, not broken-producer. Tests: 581 → 586 green (+5: RTTI-intact / RTTI-stripped / non-object / cstring / probe_create_stack integration). `--stable-digest -n 100M` instructions=100000002 unchanged. Master HEAD prior: `6440261`. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-03 21:14:56 +02:00
MechaCat02	2a9fd1fc86	feat(kernel): KRNBUG-AUDIT-002 — multi-frame guest stack capture at handle creation Adds `walk_guest_back_chain` (PPC EABI back-chain walker) and a `record_create_with_stack` audit hook gated on `--trace-handles-focus`. NtCreateEvent / NtCreateSemaphore / NtCreateTimer / XamTaskSchedule now route through the new helper so focused handles capture up to 6 stack frames at allocation time. Diagnostic-only, read-only memory access: unfocused handles pay one HashSet lookup, focused ones pay six back-chain dereferences. Lockstep determinism preserved. End-to-end finding: handles 0x1004 (8-instance pool via static ctor at 0x8280F810), 0x100c (singleton built inside main()), 0x15e0 (singleton in distinct cluster) are silph-framework dispatcher objects whose producer code is unreached at -n 500M. The producer hunt now has class ownership; vtable/RTTI readout is the next step. Tests: 576 → 581 green. `--stable-digest -n 100M` instructions=100000002 unchanged. Master HEAD prior: `9d45efe`. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-03 20:41:06 +02:00
MechaCat02	07068e7616	feat(audio): APUBUG-PRODUCER-001 — XAudio register driver client + opt-in callback ticker Replace the three XAudio kernel-export stubs (Register/Unregister/SubmitFrame) with canary-faithful implementations and add a periodic buffer-complete callback ticker reusing the existing SavedCallbackCtx injection machinery. Canary parity: - xboxkrnl_audio.cc:56-93 — read callback_ptr[0..1], wrap callback_arg in a 4-byte big-endian guest heap buffer (`wrapped_callback_arg`), write `0x4155_xxxx` to driver_ptr. - audio_system.cc:139-141 — guest callback receives r3 = wrapped pointer, not raw callback_arg. - audio_driver.h:21-24 — frame rate 256 samples / 48 kHz ≈ 5.33 ms. Implementation: - New `crates/xenia-kernel/src/xaudio.rs` — `XAudioClient`, `XAudioState` (8-slot table, pending FIFO, dual-mode ticker), `XAUDIO_INSTR_PERIOD = 48_000` (lockstep) and `XAUDIO_PERIOD = 5.333 ms` (--parallel), same pattern as KRNBUG-D08 v-sync. - `try_inject_audio_callback` in xenia-app mirrors `try_inject_graphics_interrupt`, shares `interrupts.saved` slot for mutex with graphics callbacks. Gating: ticker + injector run only when `--xaudio-tick` / `XENIA_XAUDIO_TICK=1`. Default off because Sylpheed's audio callback enters an infinite `KeWaitForSingleObject` loop on first invocation (canary's host worker thread provides the buffer-completion fence we don't model), which hijacks a guest HW thread and regresses `swaps=2 → 1`. Default-off preserves the lockstep `sylpheed_nm.json` goldens exactly. Producer hunt outcome (FALSIFIED for parked handles 0x1004/0x100c/0x15e4): at `-n 500M --xaudio-tick` all 3 handles still show `signal_attempts=0 (primary=0, ghost=0)`. Audio callback is not the missing producer. Next candidate per audit-findings.md is Timer DPC delivery (KeSetTimer / KeInsertQueueDpc). Tests: 562 → 576 green (10 in `xaudio.rs`, 4 in `exports.rs`). 
Lockstep `--stable-digest -n 100M` default-off: instructions=100000002, swaps=2 (matches pre-change baseline byte-for-byte). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-03 19:50:22 +02:00
MechaCat02	691404e36e	fix(xam): XAMBUG-PRODUCER-001 — XamTaskSchedule spawns a real guest thread Replaces the no-op stub at xam.rs:204 with a canary-faithful implementation mirroring xenia-canary/src/xenia/kernel/xam/xam_task.cc:43-80. Allocates a ThreadImage, allocates a KernelObject::Thread handle, and routes through Scheduler::spawn with entry=callback and start_context=message_ptr (canary's third positional XThread ctor arg). Stack size = max(0x4000, page-aligned 0x10_0000). Producer-hypothesis outcome (500M --trace-handles-focus run): the call site at 0x824a9a10 is never reached during this boot horizon, so XamTaskSchedule cannot be the missing producer for the 3 parked Event/Manual handles (0x1004, 0x100c, 0x15e4). The fix still lands — the stub was a real correctness bug that would manifest the moment the boot advances past the current deadlock. Next candidate per audit-findings.md: XAudioRegisterRenderDriverClient. - Workspace tests: 561 → 562 green (new test xam::tests::xam_task_schedule_spawns_real_thread). - --stable-digest -n 100M: instructions=100000002 unchanged from baseline; lockstep determinism preserved. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-03 18:32:40 +02:00
MechaCat02	27d3608174	fix(kernel): KRNBUG-D08 — wall-clock v-sync under --parallel The synthetic v-sync ticker used a per-instruction proxy (VSYNC_INSTR_PERIOD = 150 k) tuned for ~10 MIPS lockstep throughput → 60 Hz. Audit M11 observed this drifts under `--parallel`: with 6 worker threads sharing the kernel mutex, the dispatcher executes more PPC instructions per tick callback, so the accumulator never crosses 150 k. Result: ~629 v-syncs/100M lockstep → ~2 v-syncs/100M --parallel. Hybrid solution preserves lockstep determinism (which the goldens depend on) while fixing --parallel: * `tick_vsync_instr(instr_count)` — legacy instruction-count ticker, used by lockstep. Bit-stable across runs. * `tick_vsync_wallclock()` — new Instant-based ticker. Fires `floor(elapsed / VSYNC_PERIOD)` v-syncs since the anchor and advances the anchor by that many full periods (no lazy backlog). Capped at INTERRUPT_QUEUE_CAP per call so a forward-jumping clock can't overflow the FIFO. * `KernelState.parallel_active` flag set at startup from `--parallel` / `XENIA_PARALLEL=1`. Read by `coord_pre_round` in main.rs to choose between the two tickers. Verification: * cargo test --workspace --release: 561 passing (+3 new wall-clock tests vs prior 558 baseline). * lockstep -n 100M --stable-digest: BIT-IDENTICAL to pre-Phase-3 baseline. interrupts_delivered preserved at ~630 (was ~629 pre-fix). * --parallel --reservations-table -n 30M: interrupts_delivered rose from ~2 to 17. (FIFO INTERRUPT_QUEUE_CAP=4 still caps burst delivery; that's a separate bottleneck — addressed by raising cap when --parallel queue depth becomes the next blocker.) Trade-off: --parallel runs are non-deterministic at the v-sync rate by design (per audit M05 PPCBUG-703 already). Lockstep stays bit-identical, so the `sylpheed_n*m.json` goldens are untouched. Audit IDs: KRNBUG-D08 (closed). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-03 17:34:30 +02:00
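The wall-clock ticker arithmetic from KRNBUG-D08 can be sketched as below. To keep the example deterministic it takes `now` as a `Duration` offset rather than reading `Instant::now()`. The signature is hypothetical, and the choice to advance the anchor by all elapsed periods even when the returned count is capped is an assumption based on the "no lazy backlog" wording.

```rust
use std::time::Duration;

/// Returns how many v-syncs to fire for this call and advances `anchor`
/// by the number of full periods that elapsed. The result is capped at
/// `cap` so a forward-jumping clock cannot overflow the interrupt FIFO.
pub fn tick_vsync_wallclock(
    anchor: &mut Duration,
    now: Duration,
    period: Duration,
    cap: u32,
) -> u32 {
    if period.as_nanos() == 0 {
        return 0;
    }
    let elapsed = now.saturating_sub(*anchor);
    // floor(elapsed / period), clamped so the cast to u32 cannot wrap.
    let periods = (elapsed.as_nanos() / period.as_nanos()).min(u32::MAX as u128) as u32;
    // Advance by whole periods only; the fractional remainder carries over.
    *anchor += period * periods;
    periods.min(cap)
}
```

Advancing the anchor in whole periods keeps the fractional remainder for the next call, so the long-run rate stays at one v-sync per period when calls arrive more often than the cap would bind.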
MechaCat02	d1105aafae	diag(audit): KRNBUG-AUDIT-001 — focused parked-waiter ghost-trail diagnostic Adds a one-run diagnostic that distinguishes "guest never called Nt/KeSetEvent on this handle" from "signal landed but waiter wasn't woken", for any handle named via `--trace-handles-focus`. Parked-waiter context (project_xenia_rs_sylpheed_stage3_2026_04_29): four worker threads block Sylpheed past `draws=0` on handles 0x1004 / 0x100c / 0x15e4 / 0x42450b5c (mr=true, sig=false). The pre-existing audit dropped signal-attempts that targeted handles without a primary trail, so we couldn't tell whether the producer was unreachable in the guest or whether the signal landed but missed its waiter. Three changes: * audit.rs: `HandleAudit` gains `focus: HashSet<u32>` and `ghost_trails: HashMap<u32, GhostTrail>`. `record_signal` auto-falls-through to a new `record_signal_attempt_ghost` when no primary trail exists AND the handle is in `focus`. Bounded by AUDIT_RING_CAPACITY per handle. Two new tests cover the focus ghost-trail and no-double-record invariants. * main.rs: new `--trace-handles-focus=<LIST>` flag (hex 0x or decimal, comma-separated) populates `kernel.audit.focus`. Implies `--trace-handles`. New "=== Handle audit (focus) ===" section in `dump_thread_diagnostic` emits per-handle: - signal_attempts (primary + ghost), waits, wakes - merged cycle-sorted timeline (last 16) - GuestExport / KernelInternal classification - <AUDIT_BLIND> marker when waiter_count > 0 but the audit saw no waits (i.e. waiter parked via a non-audit path — CS / spinlock / DPC). - DIAGNOSIS conclusion that selects between five branches. * `cmd_check` passes None for focus → goldens unaffected. 
Empirical run output at -n 500M lockstep with `--trace-handles-focus=0x1004,0x100c,0x15e4,0x42450b5c`: handle=0x00001004 kind=Event/Manual waiters=1 signaled=false signal_attempts=0 (primary=0, ghost=0) waits=1 wakes=0 created cycle=0 tid=1 lr=0x824a9f6c src=NtCreateEvent => producer is a missing kernel signal source (or BST-paradox upstream) ... (same shape for 0x100c, 0x15e4) handle=0x42450b5c kind=<UNCREATED> waiters=1 signal_attempts=0 waits=0 wakes=0 <AUDIT_BLIND> => waiter parked via non-audited path Conclusion: hypothesis (A) confirmed for all 4 handles. Producer is NOT a wake/eligibility bug — it is a genuinely missing kernel signal source. The 3 Event/Manual handles share a creator (lr=0x824a9f6c, tid=1) and the same wait-call wrapper at lr=0x824ac578 — these are 3 worker threads all parked on "work-available" notifications that never come. Verification: * cargo test --workspace --release: 558 passing (+2 new ghost-trail tests vs prior 556 baseline) * lockstep -n 100M --stable-digest: bit-identical to master HEAD Audit IDs: KRNBUG-AUDIT-001 (closed — diagnostic instrumentation). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-03 17:22:14 +02:00
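The ghost-trail fall-through described in KRNBUG-AUDIT-001 can be sketched as follows. The struct mirrors the field names in the commit message (`focus`, `ghost_trails`), but the trail representation (a bounded `Vec<u64>` of cycle stamps), the `primary` map, and the capacity value are assumptions made for illustration.

```rust
use std::collections::{HashMap, HashSet};

/// Assumed per-handle bound; the commit names the constant but not its value.
pub const AUDIT_RING_CAPACITY: usize = 16;

#[derive(Default)]
pub struct HandleAudit {
    pub focus: HashSet<u32>,
    pub primary: HashMap<u32, Vec<u64>>,
    pub ghost_trails: HashMap<u32, Vec<u64>>,
}

impl HandleAudit {
    /// Registers a primary trail when a handle is created.
    pub fn record_create(&mut self, handle: u32) {
        self.primary.entry(handle).or_insert_with(Vec::new);
    }

    /// Records a signal on the primary trail if one exists. Otherwise it
    /// falls through to a ghost trail, but only for focused handles.
    pub fn record_signal(&mut self, handle: u32, cycle: u64) {
        if let Some(trail) = self.primary.get_mut(&handle) {
            push_bounded(trail, cycle);
        } else if self.focus.contains(&handle) {
            let trail = self.ghost_trails.entry(handle).or_insert_with(Vec::new);
            push_bounded(trail, cycle);
        }
    }
}

/// Keeps at most AUDIT_RING_CAPACITY entries, dropping the oldest first.
fn push_bounded(trail: &mut Vec<u64>, cycle: u64) {
    if trail.len() == AUDIT_RING_CAPACITY {
        trail.remove(0);
    }
    trail.push(cycle);
}
```

A signal is recorded on exactly one trail, which is the no-double-record invariant the commit's tests cover. Unfocused handles without a primary trail are still dropped.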
MechaCat02	7a1b6b3306	fix(gpu): GPUBUG-DRAIN-001 — silence VdSwap PM4 fallback under --parallel The Phase-C VdSwap PM4 ring path (commit `82f3d61`) emits two "PM4_XE_SWAP not consumed by drain" warnings when running: exec sylpheed.iso --ui --quiet --halt-on-deadlock \ --parallel --reservations-table Lockstep -n 100M never trips it. Two distinct race windows: (a) Inline backend (--ui forces it): drain(mem, 4096) hit its fixed packet cap before reaching the PM4_XE_SWAP we'd just injected at the WPTR tail. With 6 CPU threads, the ring accumulates >4096 packets between vd_swap callbacks. (b) Threaded backend (--parallel without --ui): the worker's DrainFence handler has a 900 ms deadline and game-batched IBs (8-10 M packets observed) keep it from reaching the tail in any reasonable budget. If the worker eventually drained past the injected packet later, the safety-net direct notify would double-count. Three changes: * gpu_system.rs: new `drain_until_wptr(target, time_budget)` draining by the canary `WorkerThreadMain` predicate (read_offset != target) instead of a fixed packet count. 900 ms deadline mirrors the threaded DrainFence handler. * handle.rs: inline `drain_to_current_wptr` switches to `drain_until_wptr`. DrainFence handler publishes the digest mirror BEFORE replying so the CPU's post-drain `digest_snapshot` sees fresh stats. * exports.rs (vd_swap): skip the PM4 ring injection unconditionally and route swap notification through `notify_xe_swap` directly. Tail-injection is unreliable under --parallel for both backends. The slot-0 fetch-constant patch is deferred (GPUBUG-FETCH-PATCH-001); draws=0 today so a stale slot 0 has no observable effect. Verification: * cargo test --workspace --release: 556 passing (unchanged). * Lockstep -n 100M --stable-digest: bit-identical to pre-fix master HEAD `aa3f1d3`. {instructions:100000002, imports:987685, unimpl:0, draws:0, swaps:2, ...} * check --parallel --reservations-table -n 30M: 0 warnings (was 2). swaps=2. 
* exec --gpu-inline --parallel --reservations-table -n 30M: 0 warnings (was 2 with drained=8M-10M observed). swaps=2. Audit IDs: GPUBUG-DRAIN-001 (closed), GPUBUG-FETCH-PATCH-001 (filed, deferred). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-03 17:12:15 +02:00
MechaCat02	e7d0fcf2c9	fix(kernel): KRNBUG-017 — real KfSpinLock + KeReleaseSpinLockFromRaisedIrql The Kf-family spinlock exports were registered as stubs: KfAcquireSpinLock → stub_return_zero (didn't write lock) KfReleaseSpinLock → stub_success (didn't clear lock) KeReleaseSpinLockFromRaisedIrql → stub_success (same) KeTryToAcquireSpinLockAtRaisedIrql → returned 1 but didn't set lock value Guest code that read the lock value back (e.g. nested acquire/release sanity checks, debug assertions) saw 0 even after "acquiring", and could enter critical regions without contention serialization. Under `--parallel` the coarse Arc<Mutex<KernelState>> already serializes us, so the audit's P0-under-parallel ranking is about correctness of the lock value visible to guest code, not mutual-exclusion (which is provided by the host mutex). Implementation mirrors canary's `xenia-canary/src/xenia/kernel/xboxkrnl/xboxkrnl_threading.cc`: - KfAcquireSpinLock: write 1 to SpinLock, return 0 (old IRQL) - KfReleaseSpinLock: write 0 to SpinLock - KeReleaseSpinLockFromRaisedIrql: write 0 to SpinLock - KeTryToAcquireSpinLockAtRaisedIrql: write 1 to *SpinLock, return 1 Single-threaded HLE: contention can never be observed (we never run two guest threads simultaneously without holding the kernel mutex), so the spin-loop can degenerate to an unconditional acquire. Verification at -n 100M lockstep: swaps: 2 → 2 (unchanged) draws: 0 → 0 (gated by F2/F3/G) packets: ~59M (within noise) Tests: 76 kernel pass (no count change; existing harness covers the new write semantics implicitly via guest-memory smoke tests). Closes KRNBUG-017 (P0 under --parallel). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-03 14:24:47 +02:00
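The write semantics listed for KRNBUG-017 can be sketched over a toy guest-memory buffer. `GuestMem` here is a hypothetical stand-in for the project's memory type, and treating the lock as a 32-bit big-endian word is an assumption.

```rust
/// Toy big-endian guest memory; a stand-in, not the project's GuestMemory.
pub struct GuestMem {
    bytes: Vec<u8>,
}

impl GuestMem {
    pub fn new(size: usize) -> Self {
        GuestMem { bytes: vec![0; size] }
    }
    pub fn write_u32(&mut self, addr: usize, value: u32) {
        self.bytes[addr..addr + 4].copy_from_slice(&value.to_be_bytes());
    }
    pub fn read_u32(&self, addr: usize) -> u32 {
        let mut b = [0u8; 4];
        b.copy_from_slice(&self.bytes[addr..addr + 4]);
        u32::from_be_bytes(b)
    }
}

/// Writes 1 to the lock and returns the old IRQL (always 0 in this sketch).
pub fn kf_acquire_spin_lock(mem: &mut GuestMem, lock: usize) -> u32 {
    mem.write_u32(lock, 1);
    0
}

/// Clears the lock.
pub fn kf_release_spin_lock(mem: &mut GuestMem, lock: usize) {
    mem.write_u32(lock, 0);
}

/// Clears the lock; same write as kf_release_spin_lock.
pub fn ke_release_spin_lock_from_raised_irql(mem: &mut GuestMem, lock: usize) {
    mem.write_u32(lock, 0);
}

/// Unconditional acquire: writes 1 and reports success (1).
pub fn ke_try_to_acquire_spin_lock_at_raised_irql(mem: &mut GuestMem, lock: usize) -> u32 {
    mem.write_u32(lock, 1);
    1
}
```

The point of the fix is visible in the read-back: guest code that inspects the lock word after acquiring now sees 1 instead of 0.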
MechaCat02	82f3d611e2	fix(gpu,kernel): KRNBUG-Vd-04 / GPUBUG-001 / XMODBUG-013 — VdSwap PM4 ring path The pre-fix VdSwap zero-filled the guest's reserved buffer with NOPs and called `state.gpu.notify_xe_swap` directly — bypassing the ring, leaving the PM4_XE_SWAP handler at gpu_system.rs:1232 dead code, and skipping the PM4_TYPE0(SHADER_CONSTANT_FETCH_00_0, 6) patch. Sylpheed's bloom/blur "sample frame N for frame N+1" path samples fetch-constant slot 0 expecting the frontbuffer descriptor; without the patch, slot 0 stayed stale and any shader sampling it read garbage. This commit writes the canary VdSwap PM4 sequence directly into the primary ring at the current write pointer (read via the shared MMIO atomic), then advances WPTR over the injection. The natural CP drain consumes PM4_XE_SWAP — bumping `swaps_seen` and patching fetch-constant slot 0 — without going through any direct kernel→GPU bypass. Sequence per xenia-canary VdSwap_entry (xboxkrnl_video.cc:438-521): 1) PM4_TYPE0(0x4800, count=6) + 6 fetch-header dwords (with base_address re-patched from virtual to physical >> 12). 2) PM4_TYPE3(PM4_XE_SWAP, count=4) + signature + frontbuffer_phys + width + height. Mechanism notes: - buffer_ptr in xenia-rs is in the system command buffer, NOT the primary ring (verified empirically: buffer_ptr=0x4acd4df8 vs ring_base=0x0accb000, size 4 KB). Canary's VdSwap writes to buffer_ptr because its ring layout maps the reserved slot inside the ring; xenia-rs's doesn't, so we have to write at the actual ring WPTR address (cached on KernelState.ring_base from VdInitializeRingBuffer). - The original "buffer_ptr zero-fill + bump WPTR by 64" path is preserved before the injection — it exposes any game-batched PM4 packets and keeps the buffer_ptr region skippable per existing game compat behavior. - A safety-net fallback at the end calls `notify_xe_swap` directly if swaps_seen didn't advance during the drain (e.g. a ring-arithmetic edge case). Idempotent — only fires when the PM4 path didn't. - KRNBUG-Mm-04 deferred: virt→phys uses the masked stub `virt & 0x1FFF_FFFF`, sufficient for the standard heap. Mechanical changes: - crates/xenia-gpu/src/pm4.rs: add make_packet_type0 / type2 / type3 helpers + round-trip unit test (mirrors canary xenos.h:1682-1709). - crates/xenia-gpu/src/handle.rs: add mmio_cp_rb_wptr_load accessor (Acquire-load) so the kernel can compute ring offsets. - crates/xenia-kernel/src/state.rs: cache ring_base / ring_size_dwords on KernelState (set by VdInitializeRingBuffer). - crates/xenia-kernel/src/exports.rs: rewrite the vd_swap PM4-emit block; patch fetch_dwords[1] base_address virt→phys before injection. Verification at -n 100M lockstep: swaps: 2 → 2 (game fires VdSwap exactly twice) draws: 0 → 0 (gated by Phases D+E) fallback warning: 0 occurrences (PM4 path consumed both swaps) instructions: ~100M Tests: 552 passing (553 with new pm4 round-trip test). Lockstep stable-fields determinism: byte-identical across two 100M runs. The "swaps > 2" prediction in the audit's plan assumed the game would fire VdSwap more often once the path worked; empirically Sylpheed only calls VdSwap twice within 100M instructions (this is the renderer plateau the audit identified). The success criterion for Phase C is that the PM4 path is now operational, which Phases D+E require for visible draws. Closes KRNBUG-Vd-04, GPUBUG-001, XMODBUG-013. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-03 14:00:23 +02:00
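The VdSwap injection described in this commit can be sketched as follows. The function names echo the helpers named in the message, but the bodies are assumptions. The header bit layout is the commonly documented PM4 encoding (packet type in bits 31:30, count minus one in bits 29:16, register index or opcode in the low bits) rather than a copy of pm4.rs. `build_swap_packets` and `inject_into_ring` are hypothetical helpers, and the swap opcode is passed in as a parameter instead of being hard-coded.

```rust
/// Type-0 header: write `count` consecutive registers starting at `index`.
/// Assumed layout: type (0) in bits 31:30, count-1 in bits 29:16, index in bits 14:0.
pub fn make_packet_type0(index: u16, count: u16) -> u32 {
    assert!(index <= 0x7FFF);
    assert!((1..=0x4000).contains(&count));
    ((count as u32 - 1) << 16) | index as u32
}

/// Type-2 header: a single-dword NOP filler.
pub fn make_packet_type2() -> u32 {
    2u32 << 30
}

/// Type-3 header: opcode packet followed by `count` payload dwords.
/// Assumed layout: type (3) in bits 31:30, count-1 in bits 29:16, opcode in bits 14:8.
pub fn make_packet_type3(opcode: u8, count: u16) -> u32 {
    assert!(opcode <= 0x7F);
    assert!((1..=0x4000).contains(&count));
    (3u32 << 30) | ((count as u32 - 1) << 16) | ((opcode as u32) << 8)
}

/// Masked virt→phys stub quoted in the commit message.
pub fn virt_to_phys(virt: u32) -> u32 {
    virt & 0x1FFF_FFFF
}

/// Build the two-packet swap sequence: a type-0 write of six fetch-constant
/// dwords at register 0x4800, then a type-3 swap packet with four payload dwords.
pub fn build_swap_packets(
    fetch_dwords: [u32; 6],
    swap_opcode: u8, // hypothetical parameter; the real opcode lives in the GPU crate
    signature: u32,
    frontbuffer_phys: u32,
    width: u32,
    height: u32,
) -> Vec<u32> {
    let mut out = Vec::with_capacity(12);
    out.push(make_packet_type0(0x4800, 6));
    out.extend_from_slice(&fetch_dwords);
    out.push(make_packet_type3(swap_opcode, 4));
    out.extend_from_slice(&[signature, frontbuffer_phys, width, height]);
    out
}

/// Copy `packet` into the ring starting at `wptr` (a dword index), wrapping at
/// the end of the ring, and return the advanced write pointer.
pub fn inject_into_ring(ring: &mut [u32], wptr: usize, packet: &[u32]) -> usize {
    let size = ring.len();
    assert!(size > 0 && packet.len() <= size);
    let mut w = wptr % size;
    for &dword in packet {
        ring[w] = dword;
        w = (w + 1) % size;
    }
    w
}
```

Applying the masked stub to the buffer_ptr from the message gives `virt_to_phys(0x4acd4df8) == 0x0acd4df8`, which lies outside the 4 KB ring starting at 0x0accb000. That is consistent with the note that buffer_ptr does not point into the primary ring.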
MechaCat02	5f0d6487ea	xenia-kernel: HLE expansion, scheduler integration, audit + UI bridge Major HLE buildout in exports.rs: KeInitializeSemaphore now seeds count/limit, XexGet{Module,Procedure}Address use distinct HMODULE_XBOXKRNL/HMODULE_XAM pseudo-handles with a reverse (ModuleId,ordinal)→thunk_addr map, plus sweeping additions across sync primitives, file I/O, semaphores, events, threads, and allocator paths needed to advance Sylpheed past VdSwap=2. New modules: - thread.rs — ThreadRef + per-thread suspension/wake plumbing - interrupts.rs — IRQ delivery, pending-IRQ slots, IPI helpers - path.rs — guest path normalization (D:\\, game:\\, etc.) - audit.rs — --trace-handles harness backing the handle audit - ui_bridge.rs — kernel-side endpoint of the xenia-ui bridge (input snapshots, framebuffer publish handles) state.rs grows to own the HW-slot scheduler state, the new audit / UI bridge handles, and the per-handle reverse maps. xam.rs and objects.rs follow suit for the HLE additions. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-01 16:29:00 +02:00
MechaCat02	c694bb3f43	Initial commit: xenia-rs workspace for Xbox 360 RE Rust reimplementation of the xenia Xbox 360 emulator targeting reverse-engineering and preservation, initially scoped to Project Sylpheed. Includes: - XEX2 loader (LZX decompression, AES decryption, PE parsing) - XISO / XGD2 disc image VFS - PPC interpreter with 200+ opcodes and VMX128 decoding - Static analyzer: functions, cross-references, labels, asm + SQLite output - HLE kernel covering the xboxkrnl/xam subset used by Sylpheed init - Debugger with in-memory and SQLite-backed execution tracing - `xenia-rs` CLI with extract/dis/exec commands that produce cumulative, superset SQLite databases and opt-in instruction/import/branch traces Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-16 23:14:56 +02:00

20 Commits