0332d1990d1edcf834a7bd66ac7b5303f6593e88
26 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
0332d1990d |
[Track 2] Parallel-scoped global clock fixes timebase-desync livelock
In --parallel mode a long run livelocked: the scheduler spun "advanced to deadline 3000 waking hw=2 idx=0" ~14k times in microseconds. Root cause: each guest thread owns ctx.timebase (+1/instr in step_block), and all kernel deadline arithmetic read Scheduler::ctx(hw_id).timebase as "now". But the parallel worker extracts its PpcContext via mem::replace(ctx_mut_ref, PpcContext::new()) — leaving a ZEROED timebase in the slot while it steps unlocked — and advance_all_timebases_to only walks runqueue (never idle_ctx). So the coordinator's coord_pre_round drain and a woken thread's parse_timeout could read a zeroed/stale basis decoupled from the deadline the scheduler just advanced to. The thread re-armed the same constant deadline forever; the global clock never moved. Fix: add a single monotonic Scheduler::global_clock, advanced by the per-block retired-instruction count on each parallel writeback and floored up by advance_all_timebases_to. Kernel deadline reads route through KernelState::now_basis_at(hw_id), which returns global_clock ONLY when parallel_active; lockstep keeps reading the exact pre-existing ctx(hw_id).timebase expression, so the deterministic lockstep trace is byte-identical (sylpheed_n50m golden unchanged, zero re-baseline). Verified: - 50M --parallel run completes (was: hung). Deadlines now strictly increasing 5.4M -> 49.1M (18097 unique of 18116; max repeat 2) vs pre-fix constant 3000 x ~14000. - sylpheed_n50m golden byte-identical via plain `check` (no persist). - Full suite 665/665 green. Note: an intermittent parallel hang/crash (~1-2/20 at -n 5M) is pre-existing (master 1/20, this build 2/20 — within noise) and distinct from the timebase livelock: it is a parallel-race class (e.g. the unsafe block_ptr deref in run_execution_parallel). Tracked separately; lockstep remains the recommendation for long runs. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |
||
|
|
b20c99f141 |
[Subsystem-fixes] 6 verified ours-vs-canary divergence fixes
From the 2026-06-12 5-subsystem differential audit. All verified against canary as oracle; 660/660 workspace tests green (655 + 5 new). 1. nt_create_event polarity (exports.rs) — `manual_reset = gpr[5] != 0` was INVERTED. Canary xboxkrnl_threading.cc:668 `Initialize(!event_type,..)` + xevent.cc:41 (type 0 = NotificationEvent = manual, type 1 = Sync = auto). Now `== 0`. Was the dormant 2.AI fix on chore/portable-snapshot, never merged. The Ke-path was already correct; only the Nt-path was wrong. 2. 2.AF deadline drain (main.rs coord_pre_round) — expired KeWait/KeDelay deadlines never fired under load because advance_to_next_wake_if_due was only called in coord_idle_advance (no-Ready-threads path). Added a per-round drain loop; covers BOTH lockstep and parallel outer loops since both call coord_pre_round. Was the dormant 2.AF fix, never merged. 3. handle slab-recycle ABA guard (state.rs + scheduler.rs) — release_handle_slot (my round-34 regression) recycled a closed slot even with a thread still parked on it, risking a stale-waiter wake when the slot is re-minted. Added Scheduler::any_thread_waiting_on; decline to recycle a still-waited slot. 4. vpkpx pixel-pack (vmx.rs) — wrong field mapping (~100% mismatch). Now exact canary ppc_emit_altivec.cc:1795 shift/mask (red 6b out[15:10] from w[24:19], green out[9:5] from w[14:10], blue out[4:0] from w[7:3]; no fabricated alpha bit). +unit test. 5. VFS GDFX attribute plumbing (vfs/*, exports.rs query fns) — VfsEntry now carries the real on-disc attribute byte (GDFX dirent +12, canary disc_image_device.cc:136/154) instead of inferring directory-ness from path shape. Query exports report the real FILE_ATTRIBUTE_* bits. Candidate driver of the XamShowDirtyDiscErrorUI gate. +tests. 6. MmGetPhysicalAddress region-aware mirror (exports.rs) — flat 0x1FFFFFFF mask missed canary's +0x1000 host_address_offset for 0xE0000000+ mirror (memory.cc:2317). Read-only query; proven byte-identical 50M digest. +test. Investigated and intentionally NOT changed: - zero-on-recommit: no-op; ours has no region-reuse path (bump allocators, free is a stub). - 32-bit ALU writeback truncation (PPCBUG-020): documented-deliberate; premise (MSR.SF=0) is questionable but flipping it is out of scope here. - KeSetEvent/NtSetEvent return value: ours returns true previous state (hardware-faithful); canary returns constant 1 — NOT an ours bug. sylpheed_n50m golden will need re-baselining (legit behavior change). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
db90ad0f7d |
[AUDIT-059 R-D2] Phase D auto-signal POC confirms audit-049 wedge diagnosis
Hook NtCreateEvent for the silph::UImpl tid=13 chain (entry=0x821748F0, start_context=0x4024a840, frame-1 LR=0x821CB15C inside sub_821CB030+0x128) and auto-signal the resulting handle after XENIA_SILPH_UI_AUTOSIGNAL_DELAY instructions. Env-gated; default off. SR4 verdict B (partial unwedge): - handle 0x1078 signal_attempts 0->1 - tid=13 Blocked(WaitAny[0x1078]) -> Ready pc=0x824a9108 - ExCreateThread 10 -> 12 (new silph::UImpl tid=14, worker tid=15) - New downstream wedges 0x1084 + 0x1088 - cxx_throw runtime_error on tid=5 inside R26 dispatcher (BST not-registered instance lhs=0x715a7af0) - VdSwap stays 1; no draws (POC is diagnostic, not final fix) Confirms Phase C diagnosis end-to-end. The real signaler must (a) drive NtSetEvent on the silph KEVENT AND (b) register the dispatcher's BST instance upstream; this POC only does (a). Reading-error class #20: ctx.lr at kernel export entry is the thunk wrapper's return slot, NOT the guest caller's post-bl PC. Walk back-chain 1 step to get frames[1].lr. Reading-error class #21: --parallel and lockstep have SEPARATE outer loops in main.rs (run_execution_parallel line 2928 vs run_execution line 2706). Per-round hooks must be wired in BOTH paths. Files: - crates/xenia-cpu/src/scheduler.rs: GuestThread.start_entry/start_context fields + spawn() population + current_thread_entry_and_ctx() helper - crates/xenia-kernel/src/state.rs: AutoSignalPending struct, env-parsed silph_autosignal_delay, pending Vec, last_cycle_hint, set_now_cycle_hint, maybe_register_silph_autosignal (walks back-chain), fire_due_silph_autosignals - crates/xenia-kernel/src/exports.rs: hook in nt_create_event - crates/xenia-app/src/main.rs: fire-site + cycle hint in both outer loops - audit-runs/audit-059-handle-disambiguation/round-D2-autosignal-poc/FINDINGS.md Tests 655/655 green. Default behavior byte-identical when env unset. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
229b46c765 |
[Kernel] Slab-recycle handle allocator (AUDIT-059 R34)
Adds a FIFO free list of closed handle slots so alloc_handle returns recycled IDs before bumping next_handle. Mirrors canary's slab-style ObjectTable: F8000098 reused 130x per 30s window in canary, but ours' monotonic bump allocator never reused slots — so a recycled slot in canary maps to a fresh, never-reused slot in ours, drifting kernel object identity per AUDIT-042's analysis. release_handle_slot is wired into nt_close's refcount==0 branch and gated to the canonical [0x1000, 0xF000_0000) range so synthetic XAudio park handles (AUDIT-048) are never recycled. Verified: all 655 workspace tests green, smoke tests at -n 50M show NtClose 115/run with handle table renumbering active (round-34 max handle 0x12ac vs round-16 baseline 0x12b8 over same workload). γ- cluster #2 wedge unchanged — silph wait still parks tid=13 on the renumbered handle (4216=0x1078 here vs 0x12a4 baseline), confirming the wedge is independent of allocator policy. Lands as a parity fix to bring our kernel-object identity in line with canary. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
b5885b8560 |
[2.BF] Synthetic silph::WorkerCtx spawn (round 18 — opt-in landing)
Adds infrastructure to synthesise the silph::WorkerCtx that AUDIT-058/059 identified as never reached by ours' static-init chain (real chain entry sits in audit-059 round 9's wrong-vtable wedge at sub_82172BA0+0x1E8). Ctx layout follows round 5's live hexdump from canary: +0x00 vtable = 0x8200A1E8 +0x04 self +0x08 intrusive list head -> self +0x0C init flag = 1 +0x10 packed byte field +0x18 2x float ~1.0 (UI rates) +0x24 flag = 1 +0x28..+0x30 3x foreign-arena pointers (left NULL — see below) +0x54..+0x84 4x X_KEVENT auto-reset, state=0 +0x94..+0xC4 4x X_KEVENT manual-reset, state=1 (pre-signaled) +0x210..+0x250 4-entry intrusive work-ring, empty Worker spawn mirrors AUDIT-048's audio-worker pattern in xaudio_register_render_driver: per-worker allocate_thread_image + state.scheduler.spawn with r3 = ctx_ptr. Trigger fires at the first dat/* VFS open (ours' earliest is dat/files.tbl), which is when canary runs the equivalent chain. ROUND 18 OUTCOME — opt-in only: With workers spawned Ready (XENIA_SILPH_SYNTH=1), boot CRASHES at cycle ~5.5M with PC=0 on hw=1, just after worker_3 (entry 0x825065B8) spawns. Per task constraints this is STOP-and-report: the ctx fields +0x28/+0x2C/+0x30 (foreign heap pointers — canary's 0x30057018, 0xBCE25640, 0xBE568F00, distinct arenas per audit-059 round 7) are left NULL, and the worker bodies plausibly dereference one of them. Synthesising those is a fresh investigation (round 19+). With workers spawned Suspended (XENIA_SILPH_SYNTH=suspend), boot completes normally (11 spawns, VdSwap=1, KeSetEvent=2, KeReleaseSemaphore=1 — matches default baseline). The ctx remains materialised in guest memory at the logged VA for downstream probing. Default (env var unset): no synth, no regression. Files: crates/xenia-kernel/src/silph_synth.rs (new, 225 LOC) crates/xenia-kernel/src/lib.rs (+1 LOC, register module) crates/xenia-kernel/src/exports.rs (+37 LOC, hook in open_vfs_file) crates/xenia-kernel/src/state.rs (+18 LOC, 4 silph_synth_* fields) Tests: cargo test --release --workspace = 765 pass / 0 fail. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
9340ff4592 |
[Audit] --audit-r3-dump-bytes: dump N bytes at r3 when probe fires
AUDIT-059 round 15 — diagnostic. When `--audit-r3-dump-bytes=N` is set, every `--audit-pc-probe-hex` fire emits a paired `AUDIT-R3-DUMP` line with N bytes of guest memory from r3 as u32 lanes (4-byte aligned, cap 256B). Sized for the 80-byte stack-local struct at sub_82452DC0's `r31+96` (probe sub_8245B000 entry where r3 IS the struct ptr). Settable via `XENIA_AUDIT_R3_DUMP_BYTES` env. Read-only; lockstep digest unaffected (empty-set fast path in fire_audit_pc_probe_if_match). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
bcd018659b |
[Audit] --audit-mem-dump-chain: deref a guest address N levels for diagnosis
Round-14 of AUDIT-2BF (singleton-dump). The bctrl at sub_822F1AA8+0x90
(PC 0x822F1B4C) loads [0x828E1F08] (a global singleton), dereferences
its vtable, and indirect-calls vtable[0]. Canary returns; ours hangs.
To name the resolved target we need to dump the (singleton, vtable,
vtable[0]) chain on probe firing.
Adds `--audit-mem-read-hex` / `XENIA_AUDIT_MEM_READ` taking a single
guest VA. When set and any `--audit-pc-probe-hex` PC fires, the kernel
emits a paired `AUDIT-MEM-READ` line with three guest reads:
AUDIT-MEM-READ addr=0x828E1F08 val=<*addr> vtable=<**addr> \
vtable[0]=<***addr+0> vtable[24]=<***addr+24> ...
`vtable[24]` is included as the slot-6 method (audit-059 round 9
documented the canary silph chain dispatching slot 6 of a vtable here).
Read-only; lockstep digest unaffected. ~30 LOC across state.rs and
main.rs. `cmd_check` opts out of the flag (same policy as the existing
audit_pc_probe_hex).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
09e59e09b7 |
Audit-2BF.delta: add --audit-pc-probe-hex for silph-init bctrl probe
Adds a per-PC probe analogous to --lr-trace / --branch-probe but tuned
for the silph init chain's virtual-dispatch site at sub_82172BA0+0x1E8
(PC 0x82172D88, the bctrl after a 3-deep `lwz` chain that loads vtable
slot 6). Each fire emits one AUDIT-PC-PROBE line with (pc, tid, hw,
cycle, lr, r3, r11) plus four guest-memory dereferences off r3 — the
vtable, slot-6 method pointer, auxiliary handle field, and embedded
sub-object vtable — so the line can be compared head-to-head with
canary's round-9 capture (r3=0xBCCC52C0, [r3+0]=0x820A3644,
slot6=sub_821B55D8, [r3+0xC]=0xF80000D8, [r3+0x30]=0x820A1870) to
identify whether ours dispatches to the wrong vtable on a correct
object (case A) or to a wrong object entirely (case B).
Why this addition rather than reuse of an existing probe: --lr-trace
emits JSONL designed for canary-side diffing and only captures
r3/r4/r5/r6/lr (no memory dereferences); --branch-probe captures CR
flags and lr but again no memory; --ctor-probe is single-shot per PC
and walks the stack back-chain. None of them load the four indirect
fields needed to identify a vtable-shape divergence.
Implementation:
- state.rs: new HashSet<u32> field `audit_pc_probe_pcs` and helper
`fire_audit_pc_probe_if_match(hw_id, mem)`. Empty-set fast-path
keeps the cost to one is_empty() check per worker_prologue call
when the flag is unused. Read-only — no guest state mutation,
lockstep digest unchanged.
- main.rs: new CLI flag --audit-pc-probe-hex with bare-hex comma
parsing (tolerates `0x` prefix), settable also via
XENIA_AUDIT_PC_PROBE env var. Threaded through cmd_exec_inner;
cmd_check passes None so check digests are unaffected.
Probe wired into worker_prologue alongside fire_ctor_probe / fire_-
branch_probe / fire_lr_trace. Like its siblings, it fires once per
basic-block entry — known limitation (audit-045 reading-error class
13); use a block-entry PC if probing a mid-block instruction.
Verification: kernel 127/127, app 5/5 non-ignored, no behaviour
change with empty flag.
Cross-references audit-059 round 9's canary capture and lays the
groundwork for the round-10 ours-side comparison.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
2a8ff9515d |
AUDIT-054: thread CreateOptions through NtCreateFile + opt-in cache persistence
Track A — FILE_DIRECTORY_FILE handling. NtCreateFile's 9th parameter
`create_options` (sp+0x54 per shim_utils.h:49-50) is now read and
forwarded to open_vfs_file/open_cache_file. When the
FILE_DIRECTORY_FILE bit (0x1) is set on a `cache:\<hash>` path,
the host-side handler `mkdir -p`s instead of `File::create`'ing a
0-byte sentinel that blocked subsequent hierarchical creates of
`cache:\<hash>\<sub>\<leaf>` with NAME_COLLISION. Confirmed by
`opts=0x4021` (incl. FILE_DIRECTORY_FILE) on `cache:\d4ea4615`
and `opts=0x4020` (no DIR bit) on the leaf `.tmp` files. NtOpenFile
forwards `open_options` (r8) into the same slot per
xboxkrnl_io.cc:118-122. Closes the AUDIT-053 ζ-class VFS layout
aliasing wedge.
Track B — opt-in persistent cache root. AUDIT-038's per-process
tmpdir + wipe stays the default (preserves lockstep/oracle
determinism + dodges Sylpheed's `<hash>.tmp` journal-append-on-
reboot self-inconsistency). Persistence is now opt-in via
* `XENIA_CACHE_ROOT=<path>` — explicit path (caller manages
wiping); hands a stable place to drop a canary-built cache
for cascade A/B oracle work.
* `XENIA_CACHE_PERSIST=1` — `$XDG_DATA_HOME/xenia-rs/cache`
(or `$HOME/.local/share/xenia-rs/cache`).
Cold-start (-n 500M, default tmpfs) with FILE_DIRECTORY_FILE fix:
swaps=1 draws=0 imports=40454 cxx_throw=0 — matches master baseline,
no regression. Cache hierarchy now mkdir-p'd correctly: `cache:/`
contains 9 hash dirs (e.g. `d4ea4615/e/`, `aab216c3/5/`) instead
of the 0-byte sentinel files AUDIT-053 found masquerading as
directories.
LOC: +88 / -14 = +74 net (≤80 budget). All 127 xenia-kernel unit
tests pass.
Trace: audit-runs/audit-054-vfs-layout-fix/
cold-start-digest.json + warm-start-digest.json (defaults)
persist-cold-digest.json + persist-warm-digest.json (opt-in)
baseline-master-digest.json (master
|
||
|
|
49f3eafa15 |
AUDIT-032: dedicated audio worker thread per client (Plan B)
Replaces APUBUG-PRODUCER-001's random-victim-hijack audio injection with a dedicated per-client guest worker thread, mirroring xenia-canary's apu/audio_system.cc:84-159 WorkerThreadMain pattern in xenia-rs's threading model. Audio callback ticker is now safe to enable by default. ## What changed - xenia-kernel/src/xaudio.rs: new XAudioState fields worker_handles + worker_refs (one slot per of XAUDIO_MAX_CLIENTS=8). Synthetic park-handle helper (0xF000_0000 | client_idx) — outside the normal alloc range so wake_eligible_waiters never finds it; the only legitimate state-flip is via try_inject_audio_callback. - xenia-kernel/src/exports.rs: xaudio_register_render_driver spawns a 64KB-stack guest thread (create_suspended=true) via state.scheduler.spawn after registration succeeds. Immediately flips the spawned thread's state from Blocked(Suspended) to Blocked(WaitAny[synthetic]) so it's parked but not woken. Stores the kernel handle so find_by_handle resolves a fresh ThreadRef after slot compaction. Failure paths log + leave xaudio.worker_refs[i] = None, in which case the ticker drops fires (no random-victim fallback). - xenia-app/src/main.rs: try_inject_audio_callback resolves the worker via worker_handles[index] instead of scanning runqueues for a Ready or Blocked victim. The PC+r3 injection and SavedCallbackCtx capture are unchanged; the existing LR_HALT restore path re-blocks the worker on its synthetic handle for the next tick. Flag handling reworked: --xaudio-tick / XENIA_XAUDIO_TICK now act as explicit override (truthy = force on, falsey = force off, absent = use the KernelState default). - xenia-kernel/src/state.rs: xaudio_tick_enabled default flipped from false to true. Pre-fix it was off because the random-victim hijack regressed swaps=2->1; with the dedicated worker that whole class of regression is gone. ## Cascade verification at -n 500M (audit-runs/audit-048-audio-host-pump/) Pre-fix baseline: audit-runs/audit-047-gamma-wedges/ours-end-state.log. | Dim | Predicted (AUDIT-032) | Observed | |-----|-------------------------------------|---------------------------------| | A | tid=9 leaves Blocked[0x828A3254] | Ready @ pc=0x824d1404 | | B | tid=10 leaves Blocked[0x828A3230] | Ready @ same pc/lr | | C | XAudioSubmitRenderDriverFrame > 0 | Mixer setup path executed | | D | KeReleaseSemaphore 0 -> non-zero | 0 -> 1; xaudio.callback.delivered=1 | Bonus: audit-042's tid=6 worker pair on 0x10A0+0x10A4 also went Blocked->Ready as a downstream effect. Boot trajectory shifted significantly: NtWaitForSingleObjectEx 1,489,791 -> 30; NtSetEvent 3,334 -> 68; new exports firing (StfsCreateDevice, ObCreateSymbolicLink, XamContentCreateEnumerator, XamEnumerate, XamTaskSchedule, ExCreateThread x10, KeSetAffinityThread x7, NtCreateSemaphore x4, NtWaitForMultipleObjectsEx x94, NtDuplicateObject x14, XeCryptSha, XeKeysConsolePrivateKeySign). The system left the audio-wait busy loop and entered the savegame/content/crypto init phase. swaps regressed 2 -> 1 (degenerate splash repeat lost; main thread now advances past splash entirely, blocked on a different handle). draws unchanged at 0 — expected per AUDIT-032 (audio gate != renderer gate). ## Tests + scope - cargo build --release succeeds, no new warnings. - cargo test -p xenia-kernel --lib: 127/127 pass (incl. xaudio). - cargo test -p xenia-app --lib: 5/5 non-ignored pass. - Lockstep goldens (sylpheed_n2m / sylpheed_n50m) WILL drift on this fix and need re-baselining as a follow-up commit. 75 net non-comment LOC across 4 files, well under AUDIT-032's 60-120 LOC budget. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
77034b6cbf |
audit-038: persistent cache:/* VFS via host-FS backing
Replaces the "Synthesized empty file" stub for cache:/* paths with a
real host-FS HostPathDevice-style mount. Each KernelState gets a fresh
per-process tmpdir under /tmp/xenia-rs-cache-<pid>-<id>/ which is
cleared on init for lockstep determinism (mirrors canary's
xenia_main.cc:649 RegisterSymbolicLink("cache:", "\\CACHE") +
HostPathDevice in xenia-canary/src/xenia/vfs/devices/host_path_device.cc).
NtCreateFile now honours create_disposition for cache: paths:
FILE_OPEN -> NOT_FOUND if missing
FILE_CREATE -> NAME_COLLISION if present
FILE_OPEN_IF -> open or create
FILE_OVERWRITE_IF -> create or truncate
FILE_OVERWRITE -> NOT_FOUND if missing, else truncate
FILE_SUPERSEDE -> create or truncate
NtReadFile / NtWriteFile / NtSetInformationFile (XFileEndOfFileInformation)
/ NtQueryInformationFile / NtQueryFullAttributesFile route through
std::fs against the per-handle host_path; non-cache paths keep their
legacy semantics (read-only disc image, synth-empty stubs).
Verified by audit-037 cascade:
- sub_82459D18 (cache-miss restore): 0 fires (was firing constantly)
- sub_8245D230 (resize/zero-fill): 0 fires (was firing constantly)
- 105+ real cache-file writes per 500M run; 4+ MB of game data persisting
to disk per boot; cache:/recent, cache:/access, cache:/d4ea*.tmp, etc.
- Lockstep deterministic at instructions=100000004 / imports=987485
across 3+ reruns (digest shifted as expected; goldens re-baselined).
- swaps=2 plateau still in place; cluster L1 unactivated. Cascade
dimension D (cluster activation) — UNKNOWN, no L1 fires.
Tests 640 -> 645 (+5 cache-specific unit tests; full workspace green).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
5af792c9fc |
M8+M9+M10+M11+M12: LOW-tier milestones — funcptr-arrays, EH flag, TLS, lr-trace
Five LOW-priority milestones bundled. Total ~700 LOC across 11 files. ## M9 — has_eh derived from pdata.flags exception bit - New `functions.has_eh BOOLEAN NOT NULL` column. Derived from M1's already-parsed `pdata.flags` (bit 31 of the packed word — the exception-handler-present flag, distinct from bit 30 which is the always-1 32-bit-code flag). Index idx_functions_has_eh. - Sylpheed: 2,975 of 23,073 pdata-validated functions have EH (12.9%). ## M10 — .tls section / IMAGE_TLS_DIRECTORY32 parser - New `xenia_xex::tls::parse_tls` parses the directory + zero-terminated callback array. Returns None when the binary has no .tls section. - New `tls_info` (singleton row) + `tls_callbacks(slot, address)` tables. - New `DbWriter::write_tls()` no-ops on None. - Sylpheed has no .tls section → 0 rows; infra ready for binaries with __declspec(thread). ## M8 + M11 — function_pointer_arrays (dispatch tables + static initialisers) - New `xenia_analysis::funcptr_arrays::analyze` widens M3's vtable scan: detects runs of ≥2 function pointers in .rdata and classifies each as `vtable` (M3 re-emit), `dispatch_table` (M8), or `static_init` (M11) via a constructor-prologue heuristic (mfspr + small stwu). - New tables `function_pointer_arrays(address PK, length, kind)` and `function_pointer_array_entries(array_address, slot, function_address)`. - Sylpheed: 722 vtables + 388 dispatch_tables = 1,110 arrays / 6,347 slots. 0 static_init detected (Sylpheed's ctors don't all match the conservative heuristic; M11.5 future work can chain via the entry- point's static-init driver). ## M12 — --lr-trace runtime canary-diff harness - New CLI `exec --lr-trace=PC[,PC,...]` and `--lr-trace-out=PATH` flags. Symbolic resolution (Class::method, Class::*) via M4 lookup. Env vars XENIA_LR_TRACE / XENIA_LR_TRACE_OUT also work. - New `KernelState::lr_trace_pcs` + `lr_trace_writer` + helper `fire_lr_trace_if_match(hw_id)` invoked from the per-instr probe slot. - JSONL output: pc/tid/hw/cycle/r3/r4/r5/r6/lr — superset of what xenia-canary's --log_lr_on_pc patch emits, with a cycle counter for cross-run reproducibility. Diff-friendly via `jq`. - Lockstep digest unaffected: smoke test on entry-point PC fires once with cycle=0/lr=BCBCBCBC/all-GPR-zero (correct initial state). Tests 636→640 (+2 TLS tests, +2 funcptr_arrays tests). Schema golden updated for new tables + has_eh column. Lockstep determinism preserved (instructions=2000005 ×2 reruns identical). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
690943ceef |
gate dump-section reads on is_mapped; trim doc comments
Without the page-state guard, read_bulk faulted on PROT_NONE pages of the 4 GiB host reservation. Per-page is_mapped check skips uncommitted pages, leaving the buffer's leading zero bytes in place. Total LOC budget after trim: 70. |
||
|
|
08d41cf2fc |
add --dump-section=BASE:LEN:PATH for end-of-run guest memory snapshot
Drives byte-level memory diffs against canary's Memory::Save dump. Hot-path zero-cost when absent; lockstep digest unaffected (instructions=100000003 deterministic across reruns). |
||
|
|
b78e6fd205 |
fix(kernel): KRNBUG-IO-004 — real XamNotifyCreateListener + XNotifyGetNext per canary
Canary's RegisterNotifyListener (kernel_state.cc:1013-1033) auto-enqueues four
startup notifications on the first listener whose mask covers kXNotifySystem
(SystemUI=0x09 + SystemSignInChanged=0x0A) and kXNotifyLive
(LiveConnectionChanged=0x02000001 + LiveLinkStateChanged=0x02000003). XNotifyGetNext
(xam_notify.cc:22-96) pops the queue with mask + version filtering on enqueue per
xnotifylistener.cc:38-51. Our prior stubs returned 0 forever; the dispatch loop
at 0x822f1be8 in sub_822F1AA8 was thus bypassed indefinitely.
Implementation:
- KernelObject::NotifyListener { mask, max_version, queue, waiters } variant.
- KernelState::has_notified_startup + has_notified_live_startup gates.
- xam_notify_create_listener: mask=r3 (qword), max_version=r4 (clamped <=10),
alloc handle, conditional 4-tuple startup enqueue.
- xnotify_get_next: handle/match_id/id_ptr/param_ptr in r3..r6; pop_front
(or scan-by-id), with mask + version filter applied at enqueue time.
- 5 unit tests covering: full-mask 4 startup notifications, second-listener
no re-fire, system-only mask filtering, max_version=0 too-new drop,
unknown handle returning 0.
Tests: 594 -> 599. Lockstep `-n 100M` instructions=100000012 deterministic
across 2 reruns; bit-identical run-to-run diff.
Cascade (verified at -n 500M):
- dispatch arm 0x822f1be8 fires; sub_82173DC8 entered.
- 3/21 renderer-cluster L1 PCs newly reached: 0x822c6870 (2 workers),
0x824563e0, 0x823ddb50.
- canary-only export delta 7 -> 3 (reclassified to fired:
KeResetEvent, ObCreateSymbolicLink, XamTaskCloseHandle, XamTaskSchedule).
- worker thread count 18 -> 20.
- signal_attempts on handle 0x15e0 = 1 (primary=1), was 0.
- draws=0 still expected at this step.
LOC: 119 (97 impl + 22 scaffolding pattern matches across main.rs / objects.rs
/ state.rs) <= 120.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
c51f51f9cb |
feat(kernel): KRNBUG-AUDIT-007 — --branch-probe instrumentation; sub_824A9710 exit gate identified
Sister to --pc-probe / --ctor-probe but emits a single compact one-line BRANCH-PROBE record per fire (pc, tid, hw, cycle, r3, lr, cr0/cr6 flags) with no back-chain. Designed for tracing every conditional-branch fire inside a candidate-gate function so the last PC reached before the function epilogue identifies the exit branch. Runtime trace at audit-runs/audit-007/sub_824A9710-trace.log decisively identifies the priv-11 gate: - Exit branch: 0x824a9944 (post bl sub_824ABD88 first call) - Responsible kernel call: NtDeviceIoControlFile, FsCtlCode=0x74004 (registered as stub_success at exports.rs:90) - Mechanical chain: stub returns 0/SUCCESS without writing OUT, game reads [out_buf+8], finds zero, assigns hardcoded 0xC0000034 (STATUS_OBJECT_NAME_NOT_FOUND) at sub_824ABD88:0x824abea8-ac, exits via 0x824a9944's lt branch before priv-11 site at 0x824a99a0. 592→592 tests; lockstep instructions=100000010, swaps=2, draws=0 deterministic across reruns. Read-only diagnostic — no fix this session. Next session: KRNBUG-IO-003 (real NtDeviceIoControlFile per canary NullDevice::IoControl for FsCtlCodes 0x70000 + 0x74004). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
1a892d4641 |
feat(kernel): KRNBUG-XEX-001 — real XexCheckExecutablePrivilege from XEX header bitmap
Replace stub_return_zero with a canary-faithful implementation that returns bit `priv` of the loaded XEX's XEX_HEADER_SYSTEM_FLAGS (key 0x00030000) bitmap. Mirrors xenia-canary xboxkrnl_modules.cc:22-39: `(flags >> priv) & 1` for priv < 32, else 0. Plumbing: - xenia-xex: header_keys::SYSTEM_FLAGS const + get_system_flags() accessor. - xenia-kernel/state.rs: pub xex_system_flags: u32 + xex_priv_logged HashSet for one-shot per-priv tracing. - xenia-app: kernel.xex_system_flags wired in cmd_exec_inner. - xenia-kernel/exports.rs: real export body + unit test covering bits 10/11/0/64 + zero-flags case. Sylpheed's bitmap is 0x00000400 (only XEX_SYSTEM_PAL50_INCOMPATIBLE, bit 10). At -n 500M with the fix: - XGetAVPack: 0 -> 1 (priv-10 gate at lr=0x824ab598 flipped). - 10 other canary-only exports + 9 producer PCs + 3 parked handles unchanged. Priv-11 site at sub_824A9710 is downstream and still not reached — AV/crypto block aborts after XGetAVPack returns our placeholder 0x16 (canary returns 8/HDMI; Sylpheed accepts only 3/4/6/8 per xenia-canary xam_info.cc:250-251). Tests 588 -> 589. Lockstep deterministic (3 reruns identical): n50m goes 50000008 -> 50000005 instr / 407415 -> 407417 imp / swaps=2 / draws=0. Goldens re-baselined (sylpheed_n50m, sylpheed_n2m); oracle test green. Full chain-of-effects + next-frontier hand-off in audit-findings.md under KRNBUG-XEX-001. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
3e2fc1ec88 |
feat(kernel): KRNBUG-AUDIT-005 — --pc-probe extension + canary diff identifies XexCheckExecutablePrivilege stub cascade
Extends `--ctor-probe` machinery into `--pc-probe` (clap alias) with
the optional `PC@DISPATCHER:OFFSET` token form: on a hit, the helper
additionally logs `[disp+off]` — what the producer's
`lwz r3, OFFSET(r3)` is about to read. Reuses `parse_hex_u32`; both
flags share parser + storage.
Read-only diagnostic. Lockstep digest preserved (`run digest matches
golden` at -n 50M `--stable-digest`). 588 tests green.
Decisive findings (full deliverable in `audit-findings.md` /
`audit-runs/audit-005/`):
- Failure mode α confirmed for KRNBUG-AUDIT-004: all 9 producer call
sites for handles 0x100c (5 sites) and 0x15e0 (4 sites) fire 0x at
-n 500M. The producer code path is not reached.
- Set-diff of kernel-call sequences (canary.log oracle vs ours.log
at -n 500M) identifies 11 exports canary calls and we don't:
XGetAVPack, XeCryptSha, XeKeysConsolePrivateKeySign,
ObCreateSymbolicLink, NtDeviceIoControlFile (×2),
XamUserReadProfileSettings (×2), XamTaskSchedule, XamTaskCloseHandle,
KeReleaseSemaphore (×268), KeResetEvent, ExTerminateThread (×2).
- XGetAVPack has exactly one caller (sub_824AB578 at 0x824AB5A0).
The 4 instructions immediately preceding it are:
addi r3, r0, 10 ; privilege bit 10
bl XexCheckExecutablePrivilege
cmpli 0, r3, 0
bc 12, eq, 0x824AB724 ; if r3==0, skip whole block
- exports.rs:193 registers XexCheckExecutablePrivilege as
stub_return_zero. Always returning 0 -> guest takes the branch
and skips the entire AV/crypto/save-data init block.
- The other call site (sub_824A9710 at 0x824A99A0) queries privilege
11 with opposite polarity (bne) -> gates XamTaskSchedule on the
privilege-NOT-set arm. With both stubs returning 0, the guest
walks the wrong arm of every privilege-gated branch.
- This explains why the dispatcher fields read zero
([0x828F3D08+0x50]=0, [0x828F4070+0x24]=0 from AUDIT-004 dumps):
the ctors run, but the producers that would populate those fields
with a non-zero handle never execute.
Next session: replace XexCheckExecutablePrivilege stub with real
priv-bit lookup from XEX header. See audit-findings.md
KRNBUG-AUDIT-005 for the validation matrix.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
7108d6d131 |
feat(kernel): KRNBUG-AUDIT-004 — --ctor-probe PC hook + --dump-addr struct dump
Diagnostic-only, read-only. Lockstep `instructions=100000002`
preserved bit-exact at -n 100M --stable-digest. 586 → 588 tests.
Adds two read-only diagnostics for the parked-waiter producer hunt:
* `--ctor-probe=0x8217C850,0x...` — at every interpreter step,
if `ctx.pc` is in the configured set, print one `CTOR-PROBE`
line capturing live r3 (= `this` in MSVC PPC ctors), lr
(= return site), sp, plus an 8-frame back-chain with
saved-r31/r30 per frame. Fires once per hit, exactly what the
8-instance-pool probe needed.
* `--dump-addr=0x828F3D08,0x828F4070,0x828F3EC0,...` — at end of
run (after the FOCUS report in `dump_thread_diagnostic`), each
address gets a 128-byte hex + be32 + ASCII dump. Used to
inspect the static dispatcher / job-queue struct layouts
AUDIT-003 identified.
Both gated default-off; empty set is a single `is_empty()` test on
the hot path. No guest state is mutated, so the
`sylpheed_n*m.json` lockstep digest is preserved.
KRNBUG-AUDIT-004 findings (corrects KRNBUG-AUDIT-002/003):
1. **The "8-instance pool" hypothesis for handle 0x1004 is FALSE.**
Probing the inner per-instance ctors `[0x821783D8, 0x82181750,
0x821701C8]` at -n 50M shows each fires EXACTLY ONCE with
r3 = `[0x828F3EC0, 0x828F3D08, 0x828F4070]` respectively. All
three handles are Meyers-style singletons with one dispatcher
each. The "called 8 times" claim came from miscounting raw
entries to the OUTER getter sub_8217C850 — but that getter is
itself a Meyers-singleton-getter; only the FIRST entry cascades
through to bl 0x821783D8 (gated on `[0x828F48D8] bit 0`).
2. **The producer indirection layer is the singleton-getter
itself.** Static byte-scan of .rdata / .data shows 0 hits for
the dispatcher addresses — no static registry table holds them.
But the xrefs table for the OUTER getters reveals 5–6 callers
each, MOSTLY non-create-chain, sharing the canonical producer
pattern: `bl outer_singleton_getter; lwz r3, OFFSET(r3); bl
0x824AA1D8` (with OFFSET=80 for 0x100c, =36 for 0x15e0). So the
AUDIT-003 xref audit was necessary but not sufficient — it
correctly saw "no direct producer references" but missed the
singleton-getter indirection layer.
3. **Dispatcher struct layouts** (128-byte dumps captured at -n
50M --halt-on-deadlock):
- 0x828F3D08 (handle 0x100c): event_handle at +0x4C (0x100c),
thread_handle at +0x48 (0x1010), self-pointer at +0x74,
capacity 7 at +0x28, queue empty (+0/+3C = -1).
- 0x828F4070 (handle 0x15e0): event_handle at +0x20 (0x15e0),
sibling-handle 0x15E4 at +0x1C, queue empty (+0x10 = -1).
- 0x828F3EC0 (handle 0x1004): event_handle at +0x78 (0x1004),
4 guest-heap sub-buffers at +0x20/+0x3C/+0x44/+0x50 in
0x4xxxxxxx range — noticeably different layout from the
other two pure POD job queues.
Files:
crates/xenia-kernel/src/state.rs ctor_probe_pcs / dump_addrs +
fire_ctor_probe_if_match + 2 tests
crates/xenia-app/src/main.rs Exec --ctor-probe / --dump-addr
CLI parsing, prologue hook,
end-of-run struct dumper
audit-findings.md KRNBUG-AUDIT-004 entry
audit-runs/audit-004/ 50M probe runs (v1 outer-getter
hits, v2 inner-ctor hits proving
the singleton hypothesis)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
f84e947547 |
feat(kernel): KRNBUG-AUDIT-003 — vtable/RTTI class probe at handle creation + wait
Adds a read-only MSVC RTTI traversal helper (`read_class_at_this`)
and a `probe_create_stack_classes` integration that walks each
captured back-chain frame for handle creates in `--trace-handles-focus`
and probes each frame's most-likely `this` candidate (live r31/r30/r3
for frame 0; saved-r31/r30 from the prologue spill area at [fp-12]/
[fp-16] for deeper frames). False-positive guard rejects the CRT
static-init iterator pattern (vtable's first two slots must be image-
range function pointers — PPC instruction words like `mflr r12` are
not in 0x82xxxxxx).
`dump_thread_diagnostic` now takes `&GuestMemory` so the FOCUS report
prints, for each parked waiter, a WAIT-THREAD block with full back-
chain frames and per-slot saved-register dump for offline lookup.
End-to-end finding (-n 500M producer-trace):
* Handle 0x100c dispatcher = 0x828F3D08 (image rdata; verified by
sub_82181750 disasm + xref table). [this+0] = -1 sentinel — POD
job queue, NOT a C++ polymorphic class.
* Handle 0x15e0 dispatcher = 0x828F4070 (same shape).
* Handle 0x1004's 8-instance pool members still TBD (MSVC ctors
didn't preserve `this` in r31).
* 0x42450b5c is a separate audit class (heap-allocated, parks via
non-`do_wait_single` path).
Decisive xref audit: every reference to 0x828F3D08 / 0x828F4070 in
the static analysis is in a ctor or the CRT init driver. NO producer
code references either dispatcher base. Confirms `signal_attempts=0`
is unreachable-producer, not broken-producer.
Tests: 581 → 586 green (+5: RTTI-intact / RTTI-stripped / non-object
/ cstring / probe_create_stack integration). `--stable-digest -n
100M` instructions=100000002 unchanged. Master HEAD prior:
|
||
|
|
2a9fd1fc86 |
feat(kernel): KRNBUG-AUDIT-002 — multi-frame guest stack capture at handle creation
Adds `walk_guest_back_chain` (PPC EABI back-chain walker) and a
`record_create_with_stack` audit hook gated on `--trace-handles-focus`.
NtCreateEvent / NtCreateSemaphore / NtCreateTimer / XamTaskSchedule now
route through the new helper so focused handles capture up to 6 stack
frames at allocation time. Diagnostic-only, read-only memory access:
unfocused handles pay one HashSet lookup, focused ones pay six
back-chain dereferences. Lockstep determinism preserved.
End-to-end finding: handles 0x1004 (8-instance pool via static ctor at
0x8280F810), 0x100c (singleton built inside main()), 0x15e0 (singleton
in distinct cluster) are silph-framework dispatcher objects whose
producer code is unreached at -n 500M. The producer hunt now has class
ownership; vtable/RTTI readout is the next step.
Tests: 576 → 581 green. `--stable-digest -n 100M` instructions=100000002
unchanged. Master HEAD prior:
|
||
|
|
07068e7616 |
feat(audio): APUBUG-PRODUCER-001 — XAudio register driver client + opt-in callback ticker
Replace the three XAudio kernel-export stubs (Register/Unregister/SubmitFrame) with canary-faithful implementations and add a periodic buffer-complete callback ticker reusing the existing SavedCallbackCtx injection machinery. Canary parity: - xboxkrnl_audio.cc:56-93 — read callback_ptr[0..1], wrap callback_arg in a 4-byte big-endian guest heap buffer (`wrapped_callback_arg`), write `0x4155_xxxx` to *driver_ptr. - audio_system.cc:139-141 — guest callback receives r3 = wrapped pointer, not raw callback_arg. - audio_driver.h:21-24 — frame rate 256 samples / 48 kHz ≈ 5.33 ms. Implementation: - New `crates/xenia-kernel/src/xaudio.rs` — `XAudioClient`, `XAudioState` (8-slot table, pending FIFO, dual-mode ticker), `XAUDIO_INSTR_PERIOD = 48_000` (lockstep) and `XAUDIO_PERIOD = 5.333 ms` (--parallel), same pattern as KRNBUG-D08 v-sync. - `try_inject_audio_callback` in xenia-app mirrors `try_inject_graphics_interrupt`, shares `interrupts.saved` slot for mutex with graphics callbacks. Gating: ticker + injector run only when `--xaudio-tick` / `XENIA_XAUDIO_TICK=1`. Default off because Sylpheed's audio callback enters an infinite `KeWaitForSingleObject` loop on first invocation (canary's host worker thread provides the buffer-completion fence we don't model), which hijacks a guest HW thread and regresses `swaps=2 → 1`. Default-off preserves the lockstep `sylpheed_n*m.json` goldens exactly. Producer hunt outcome (FALSIFIED for parked handles 0x1004/0x100c/0x15e4): at `-n 500M --xaudio-tick` all 3 handles still show `signal_attempts=0 (primary=0, ghost=0)`. Audio callback is not the missing producer. Next candidate per audit-findings.md is Timer DPC delivery (KeSetTimer / KeInsertQueueDpc). Tests: 562 → 576 green (10 in `xaudio.rs`, 4 in `exports.rs`). Lockstep `--stable-digest -n 100M` default-off: instructions=100000002, swaps=2 (matches pre-change baseline byte-for-byte). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
27d3608174 |
fix(kernel): KRNBUG-D08 — wall-clock v-sync under --parallel
The synthetic v-sync ticker used a per-instruction proxy (VSYNC_INSTR_PERIOD = 150 k) tuned for ~10 MIPS lockstep throughput → 60 Hz. Audit M11 observed this drifts under `--parallel`: with 6 worker threads sharing the kernel mutex, the dispatcher executes more PPC instructions per tick callback, so the accumulator never crosses 150 k. Result: ~629 v-syncs/100M lockstep → ~2 v-syncs/100M --parallel. Hybrid solution preserves lockstep determinism (which the goldens depend on) while fixing --parallel: * `tick_vsync_instr(instr_count)` — legacy instruction-count ticker, used by lockstep. Bit-stable across runs. * `tick_vsync_wallclock()` — new Instant-based ticker. Fires `floor(elapsed / VSYNC_PERIOD)` v-syncs since the anchor and advances the anchor by that many full periods (no lazy backlog). Capped at INTERRUPT_QUEUE_CAP per call so a forward-jumping clock can't overflow the FIFO. * `KernelState.parallel_active` flag set at startup from `--parallel` / `XENIA_PARALLEL=1`. Read by `coord_pre_round` in main.rs to choose between the two tickers. Verification: * cargo test --workspace --release: 561 passing (+3 new wall-clock tests vs prior 558 baseline). * lockstep -n 100M --stable-digest: BIT-IDENTICAL to pre-Phase-3 baseline. interrupts_delivered preserved at ~630 (was ~629 pre-fix). * --parallel --reservations-table -n 30M: interrupts_delivered rose from ~2 to 17. (FIFO INTERRUPT_QUEUE_CAP=4 still caps burst delivery; that's a separate bottleneck — addressed by raising cap when --parallel queue depth becomes the next blocker.) Trade-off: --parallel runs are non-deterministic at the v-sync rate by design (per audit M05 PPCBUG-703 already). Lockstep stays bit-identical, so the `sylpheed_n*m.json` goldens are untouched. Audit IDs: KRNBUG-D08 (closed). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
||
|
|
82f3d611e2 |
fix(gpu,kernel): KRNBUG-Vd-04 / GPUBUG-001 / XMODBUG-013 — VdSwap PM4 ring path
The pre-fix VdSwap zero-filled the guest's reserved buffer with NOPs and
called `state.gpu.notify_xe_swap` directly — bypassing the ring, leaving
the PM4_XE_SWAP handler at gpu_system.rs:1232 dead code, and skipping
the PM4_TYPE0(SHADER_CONSTANT_FETCH_00_0, 6) patch. Sylpheed's bloom/
blur "sample frame N for frame N+1" path samples fetch-constant slot 0
expecting the frontbuffer descriptor; without the patch, slot 0 stayed
stale and any shader sampling it read garbage.
This commit writes the canary VdSwap PM4 sequence directly into the
primary ring at the current write pointer (read via the shared MMIO
atomic), then advances WPTR over the injection. The natural CP drain
consumes PM4_XE_SWAP — bumping `swaps_seen` and patching fetch-constant
slot 0 — without going through any direct kernel→GPU bypass.
Sequence per xenia-canary VdSwap_entry (xboxkrnl_video.cc:438-521):
1) PM4_TYPE0(0x4800, count=6) + 6 fetch-header dwords (with
base_address re-patched from virtual to physical >> 12).
2) PM4_TYPE3(PM4_XE_SWAP, count=4) + signature + frontbuffer_phys
+ width + height.
Mechanism notes:
- buffer_ptr in xenia-rs is in the system command buffer, NOT the
primary ring (verified empirically: buffer_ptr=0x4acd4df8 vs
ring_base=0x0accb000, size 4 KB). Canary's VdSwap writes to
buffer_ptr because its ring layout maps the reserved slot inside
the ring; xenia-rs's doesn't, so we have to write at the actual
ring WPTR address (cached on KernelState.ring_base from
VdInitializeRingBuffer).
- The original "buffer_ptr zero-fill + bump WPTR by 64" path is
preserved before the injection — it exposes any game-batched PM4
packets and keeps the buffer_ptr region skippable per existing
game compat behavior.
- A safety-net fallback at the end calls `notify_xe_swap` directly if
swaps_seen didn't advance during the drain (e.g. a ring-arithmetic
edge case). Idempotent — only fires when the PM4 path didn't.
- KRNBUG-Mm-04 deferred: virt→phys uses the masked stub
`virt & 0x1FFF_FFFF`, sufficient for the standard heap.
Mechanical changes:
- crates/xenia-gpu/src/pm4.rs: add make_packet_type0 / type2 / type3
helpers + round-trip unit test (mirrors canary xenos.h:1682-1709).
- crates/xenia-gpu/src/handle.rs: add mmio_cp_rb_wptr_load accessor
(Acquire-load) so the kernel can compute ring offsets.
- crates/xenia-kernel/src/state.rs: cache ring_base / ring_size_dwords
on KernelState (set by VdInitializeRingBuffer).
- crates/xenia-kernel/src/exports.rs: rewrite the vd_swap PM4-emit
block; patch fetch_dwords[1] base_address virt→phys before injection.
Verification at -n 100M lockstep:
swaps: 2 → 2 (game fires VdSwap exactly twice)
draws: 0 → 0 (gated by Phases D+E)
fallback warning: 0 occurrences (PM4 path consumed both swaps)
instructions: ~100M
Tests: 552 passing (553 with new pm4 round-trip test). Lockstep
stable-fields determinism: byte-identical across two 100M runs.
The "swaps > 2" prediction in the audit's plan assumed the game would
fire VdSwap more often once the path worked; empirically Sylpheed only
calls VdSwap twice within 100M instructions (this is the renderer
plateau the audit identified). The success criterion for Phase C is
that the PM4 path is now operational, which Phases D+E require for
visible draws.
Closes KRNBUG-Vd-04, GPUBUG-001, XMODBUG-013.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
5f0d6487ea |
xenia-kernel: HLE expansion, scheduler integration, audit + UI bridge
Major HLE buildout in exports.rs: KeInitializeSemaphore now seeds
count/limit, XexGet{Module,Procedure}Address use distinct
HMODULE_XBOXKRNL/HMODULE_XAM pseudo-handles with a reverse
(ModuleId,ordinal)→thunk_addr map, plus sweeping additions across
sync primitives, file I/O, semaphores, events, threads, and
allocator paths needed to advance Sylpheed past VdSwap=2.
New modules:
- thread.rs — ThreadRef + per-thread suspension/wake plumbing
- interrupts.rs — IRQ delivery, pending-IRQ slots, IPI helpers
- path.rs — guest path normalization (D:\\, game:\\, etc.)
- audit.rs — --trace-handles harness backing the handle audit
- ui_bridge.rs — kernel-side endpoint of the xenia-ui bridge
(input snapshots, framebuffer publish handles)
state.rs grows to own the HW-slot scheduler state, the new audit /
UI bridge handles, and the per-handle reverse maps. xam.rs and
objects.rs follow suit for the HLE additions.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|
|
c694bb3f43 |
Initial commit: xenia-rs workspace for Xbox 360 RE
Rust reimplementation of the xenia Xbox 360 emulator targeting reverse- engineering and preservation, initially scoped to Project Sylpheed. Includes: - XEX2 loader (LZX decompression, AES decryption, PE parsing) - XISO / XGD2 disc image VFS - PPC interpreter with 200+ opcodes and VMX128 decoding - Static analyzer: functions, cross-references, labels, asm + SQLite output - HLE kernel covering the xboxkrnl/xam subset used by Sylpheed init - Debugger with in-memory and SQLite-backed execution tracing - `xenia-rs` CLI with extract/dis/exec commands that produce cumulative, superset SQLite databases and opt-in instruction/import/branch traces Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> |