Compare commits
24 Commits
iterate-2A
...
iterate-2J
| Author | SHA1 | Date | |
|---|---|---|---|
| | ed2e0e72fd | | |
| | f75bc96d17 | | |
| | de21c7a544 | | |
| | f3b7e8b760 | | |
| | 7e2603a9e5 | | |
| | 5aaadfec36 | | |
| | 0332d1990d | | |
| | 6271ba1f55 | | |
| | 48b19e490f | | |
| | 341196a111 | | |
| | b20c99f141 | | |
| | db90ad0f7d | | |
| | 481591fdb2 | | |
| | 52c30d82a7 | | |
| | 229b46c765 | | |
| | 40f208ea4e | | |
| | 8683fb59ed | | |
| | b5885b8560 | | |
| | 9340ff4592 | | |
| | bcd018659b | | |
| | 09e59e09b7 | | |
| | 5a8fe21ad5 | | |
| | 51489e34db | | |
| | 9a93152981 | | |
5
.gitignore
vendored
@@ -11,3 +11,8 @@ audit-*.md
*.stdout
*.stderr
*.log

# Runtime cache artifacts (vkd3d-proton / DXVK shader caches dropped into the
# working dir by the Wine canary build)
vkd3d-proton.cache*
*.dxvk-cache
0
audit-runs/audit-009/branch-probe.trace
Normal file
131
audit-runs/audit-059-handle-disambiguation/FINDINGS.md
Normal file
@@ -0,0 +1,131 @@
# AUDIT-059 — handle disambiguation (iterate 2.BD)

**Date:** 2026-06-06. **Engines:** ours `target/release/xenia-rs -n 50M` (3.9 s wall, 50M instr, 40k import calls), canary Wine `xenia_canary.exe --mute=true --audit_handle_lifecycle=true` (~35 s wall, 34k log lines, 0 fatals).

## Verdict — HANDOFF's wedge handles are stale

HANDOFF said: *"opt_callback signals 0x108c, tid=1 wedges on 0x10e8."* Both IDs are now `<UNCREATED>` in ours, along with `0x1090 / 0x10dc / 0x10fc / 0x1104` (also in HANDOFF's adjacent list). The allocation order shifted since that snapshot.

## Real wedges, current code state

| Handle | Kind | Engine state | Waiter | Notes |
|---|---|---|---|---|
| **0x12a4** | `<UNCREATED>` | `<AUDIT_BLIND>`, waiters=1 | **tid=1 main**, pc=0x824ac578 | Wait went via `do_wait_single` but creation never hit `NtCreateEvent` — `KeInitializeEvent` path. **This is the iterate-2.BC wedge** (recorded as "0x10e8" in HANDOFF — same site, different ID). |
| **0x12ac** | Event/Auto | `<NO_SIGNALS_DESPITE_WAITS>`, waiters=1 | **tid=13** silph UI cluster, pc=0x824ac578 lr=0x821cb1e0 | Frame trail: `0x821cb1e0 → 0x821cbae0 → 0x821cc454 → 0x821c4f18 → 0x82174a80`. Frames 3-5 carry `silph::UImpl@GamePart_Title` / `silph::VGamePart_Title` vtables — **audit-049's cluster, unchanged**. |
| 0x12b8 | Event/Auto | NO_SIGNALS, waiters=1 | (tid TBD) | Sibling, 0xC bytes from 0x12ac. |
| 0x1020 | Event/Manual | NO_SIGNALS, waiters=1 | — | γ-class. |
| 0x1040 | Event/Auto | NO_SIGNALS, waits=32 (hot poll) | — | Heavy wait, no signal. |
| 0x10a8 | Event/Auto | NO_SIGNALS, waits=7 | — | γ-class. |
| 0x10e4 | Event/Manual | NO_SIGNALS, waiters=1, waits=2 | — | γ-class. |

**Working handles** (sanity baseline): 0x1028 (Sema, 8 waits / 7 signals / 7 wakes), 0x10d0 (Sema, 2 waits / 1 signal / 1 wake), 0x10f0 (Event/Auto, 1/1/1 ✓ marked `<SUSPECT>` but actually fine), 0x10e0 (Event/Manual, 32 primary signals from somewhere).

## GPU interrupt delivery — the iterate-2.BC delta confirmed

| Engine | gpu.interrupt.delivered (vsync) | EmulateCPInterruptDPC / vblank pump |
|---|---:|---:|
| **ours** | 54 (source=0) + 1 (source=1) | — |
| **canary** | — | **4712** in 30 s ≈ 157 Hz |

**~87× ratio.** Confirms HANDOFF's diagnosis: ours' victim-thread injector dies once guest threads all park; canary's host frame-limiter thread keeps firing regardless.
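A minimal sketch of the arithmetic behind the two figures above, using only the counts from the table. The helper names `rate_hz` and `delivery_ratio` are illustrative and not part of either engine. The ratio compares raw delivery counts as reported (54 source=0 deliveries in ours against 4712 in canary), not per-second rates.

```rust
/// Counts taken from the GPU interrupt delivery table.
const OURS_VSYNC_DELIVERIES: u64 = 54; // source=0 only
const CANARY_VBLANK_PUMPS: u64 = 4712; // observed in a 30 s window
const CANARY_WINDOW_SECS: f64 = 30.0;

/// Events per second over a sampling window.
fn rate_hz(events: u64, window_secs: f64) -> f64 {
    events as f64 / window_secs
}

/// How many times more often `theirs` fired than `ours` (raw counts).
fn delivery_ratio(theirs: u64, ours: u64) -> f64 {
    theirs as f64 / ours as f64
}

/// Convenience wrapper: (canary pump rate in Hz, canary/ours count ratio).
fn vsync_summary() -> (f64, f64) {
    (
        rate_hz(CANARY_VBLANK_PUMPS, CANARY_WINDOW_SECS),
        delivery_ratio(CANARY_VBLANK_PUMPS, OURS_VSYNC_DELIVERIES),
    )
}
```

Calling `vsync_summary()` gives roughly 157 Hz and a ratio of about 87, matching the table. Including the single source=1 delivery (55 total) would lower the ratio slightly.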

## Canary signaler attribution

Top KeSetEvent guest_ptrs in canary (30 s window):

| guest_ptr | KeSetEvent fires | Inferred role |
|---|---:|---|
| `0x828A3254` | 5729 | Audio host-pump worker (per AUDIT-032: `r3=0x828A3230` region) |
| `0x828A3244` | 5728 | Audio host-pump sibling |
| `0x828A3244` + 16-byte stride | — | Static XEX-image audio event struct |
| `0xBCE25234` | 1301 | **silph UI cluster PKEVENT** (heap-allocated, 0x10 stride). Likely ours' 0x12ac analog. |
| `0xBCE25214 / 0xBCE25244 / 0xBCE25224` | 648 / 603 / 603 | Sibling silph UI PKEVENTs (0x10 stride struct). Likely ours' 0x12a4 / 0x12b8 / 0x1040 analogs. |

Ours signals every one of those equivalents **0 times**.

## Round 2 — LR-extended probes name the producer

Extended the canary probes with guest-LR capture (5 sites in `xboxkrnl_threading.cc`, 10 LOC). Re-ran the harness. Now each `KeSetEvent` line carries the guest function that signaled the event. Result for the silph UI cluster:

| PKEVENT | KeSetEvent count | Producer LR(s) |
|---|---:|---|
| `0xBCE25214` | 574 | `0x82508510` (single producer) |
| `0xBCE25224` | 565 | `0x82508358` (single producer) |
| `0xBCE25234` | 1153 | `0x82506C90` (579) + `0x82508524` (574) |
| `0xBCE25244` | 570 | `0x82506F9C` (single producer) |
| `0xBCE25284` | 1 | `0x82507ABC` (one-shot 5th-worker init?) |

All 6 producer LRs sit in `0x82506000–0x82509000`. **This is exactly the `sub_825070F0` worker thread cluster** that audit-057/058 already named:

> *audit-057: "sub_825070F0 (4 missing, initializes 4 workers w/ shared ctx 0xBCE25340, entries 0x82506528/58/88/B8)"*

The 4 worker entries (`0x82506528/58/88/B8`) are inside `sub_82506xxx` — exactly where the producer LRs `0x82506C90`/`0x82506F9C` live. The other producer LRs `0x825083xx` / `0x825085xx` are in downstream callees (workers call deeper code which itself calls KeSetEvent).
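The range claim above can be checked mechanically. The sketch below hard-codes the six producer LRs from the Round 2 table and tests them against `0x82506000–0x82509000`, treated here as a half-open range (an assumption; the note does not say whether the upper bound is inclusive). The names `in_cluster` and `all_in_cluster` are illustrative.

```rust
/// Guest address window that the note attributes to the sub_825070F0 worker cluster.
/// Treated as half-open [start, end); that choice is an assumption of this sketch.
const CLUSTER_START: u32 = 0x8250_6000;
const CLUSTER_END: u32 = 0x8250_9000;

/// The six producer LRs listed in the Round 2 table.
const PRODUCER_LRS: [u32; 6] = [
    0x8250_8510,
    0x8250_8358,
    0x8250_6C90,
    0x8250_8524,
    0x8250_6F9C,
    0x8250_7ABC,
];

/// True if a guest LR falls inside the cluster window.
fn in_cluster(lr: u32) -> bool {
    (CLUSTER_START..CLUSTER_END).contains(&lr)
}

/// True if every LR in the slice falls inside the cluster window.
fn all_in_cluster(lrs: &[u32]) -> bool {
    lrs.iter().all(|&lr| in_cluster(lr))
}
```

The audio host-pump producers (`0x824D2A44`, `0x824D292C`) fall outside this window, which is what separates the two signaler groups.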

For comparison the audio host-pump pair gets a single sharp producer too:

- `0x828A3254` × 5271 ← `lr=0x824D2A44`
- `0x828A3244` × 5271 ← `lr=0x824D292C`

(These match AUDIT-032's PC `0x824D229C / r3=0x828A3230` region — already-understood audio host-pump.)

## Verdict — 2.BE is INSUFFICIENT for the silph UI wedge

The silph UI PKEVENTs are signaled exclusively by threads spawned by `sub_825070F0`. Per audit-057/058, **`sub_825070F0` fires 0× in ours** — those 4 worker threads never spawn. Therefore the PKEVENTs are never signaled. Therefore tid=13 (`0x12ac` in ours) wedges forever.

**`sub_825070F0`'s call chain is gated by the audit-009 "unreachability island"** — a CRT-driven fnptr-array bootstrap that ours fails to enumerate. VSync delivery is irrelevant to that bootstrap; the host frame-limiter thread does not drive CRT initializers.

Therefore:
- **2.BE alone CANNOT unwedge tid=13.** It will close the 54-vs-4712 VSync delivery gap and may unblock things downstream of vsync, but the silph UI wedge has an independent missing-signaler root cause.
- **2.BE may still unwedge tid=1 main on `0x12a4`** — that wait went via `KeInitializeEvent` (handle never hit `NtCreateEvent` in ours, hence `<AUDIT_BLIND>`). Whether `0x12a4`'s signaler depends on VSync is unknown without further probing.

## Implications for next moves

A single fix won't take us to draws > 0. We need at least two:

1. **2.BE (VSync delivery)** — still worth landing for the architectural correctness it brings, AND because it's the only fix that can unwedge tid=1 main's `0x12a4` if that's vsync-derived. ~60–80 LOC per Agent C's plan.
2. **2.BF (sub_825070F0 activation)** — this is the audit-058 unfinished business. Options:
   - (a) **Static work:** trace canary's CRT-driven fnptr-array path that activates the silph UI bootstrap; backport the missing init into ours. High info, slow. Requires more probing.
   - (b) **Direct synthetic spawn:** ours injects host-side `ExCreateThread` calls for the 4 worker entries at boot completion, mirroring AUDIT-048's audio-host-pump precedent. Pragmatic; ~40 LOC; risks getting context (`0xBCE25340`) wrong.

A possible third move:

3. **Re-probe with LR on Wait paths** (we already added it but didn't grep for it) — to tell us whether tid=1's wait on `0x12a4` is the same LR as `sub_825070F0`-chain or a totally different signaler. If different, it's a 3rd missing producer.

## Round 4 — wait-side guest LR via one-frame back-chain walk

After fixing the PPC stack-walk offset (Xbox 360 stores saved LR at `[prev_sp - 8]`, not the `+4` AIX convention), wait-side LR comes through cleanly.
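A minimal sketch of the one-frame walk described above, assuming the back-chain word (the caller's stack pointer) sits at `[sp]` and the saved LR at `[prev_sp - 8]` as stated. The `GuestRead` trait and `caller_lr` function are hypothetical stand-ins for whatever memory accessor the probe actually uses; a `HashMap` plays the role of guest memory so the sketch runs on its own.

```rust
use std::collections::HashMap;

/// Hypothetical read-only view of guest memory (values already byte-swapped to host order).
trait GuestRead {
    fn read_u32(&self, addr: u32) -> Option<u32>;
}

/// Toy backing store for the sketch: address -> 32-bit word.
impl GuestRead for HashMap<u32, u32> {
    fn read_u32(&self, addr: u32) -> Option<u32> {
        self.get(&addr).copied()
    }
}

/// One-frame back-chain walk: follow the back-chain word at [sp] to the
/// caller's frame, then load the saved LR from [prev_sp - 8].
fn caller_lr<M: GuestRead>(mem: &M, sp: u32) -> Option<u32> {
    let prev_sp = mem.read_u32(sp)?; // back-chain word (assumed at [sp])
    if prev_sp == 0 {
        return None; // end of chain, no caller frame
    }
    let lr_slot = prev_sp.checked_sub(8)?; // saved-LR slot per the note above
    mem.read_u32(lr_slot)
}
```

Returning `Option` keeps an unmapped or terminated chain from producing a bogus LR in the probe output.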

**Canary's top wait sites:**

| canary handle | wait count | guest_lr | LR region | mapping |
|---|---:|---|---|---|
| `F800005C` | 1635 | `0x8216EE14` | kernel early-boot infra | unrelated |
| `F800000C` | 1597 | `0x824AFFC4` | xboxkrnl wrapper (scheduler / work-queue?) | unrelated |
| **`F80000DC`** | **476** | **`0x821C7D3C`** | **silph::UImpl/GamePart** | **= ours' 0x12ac silph UI wedge** |
| `F80000B0` | 6 across | `0x821CBAE0` + `0x821CC19C` + `0x822DFE2x/D0` | **exact match with audit-049's frame trail** | sibling silph UI wait |

Identity proof: ours' audit-049 frame trail for the silph UI wedge was `0x821cb1e0 / 0x821cbae0 / 0x821cc454 / 0x821c4f18 / 0x82174a80`. Round 4 captures `0x821CBAE0` and `0x821CC19C` (adjacent PCs) as wait LRs in canary — same cluster, same code.

**Refined verdict.** ours' `0x12a4` (tid=1 main, AUDIT_BLIND) and `0x12ac` (tid=13 silph UI) are 8 bytes apart — likely sibling KEVENT fields in the same silph UI struct. canary's analogs are in the `F80000xx` namespace, similarly clustered. The single fix that addresses both:

> **2.BF (b)** — synthetic host-side spawn of `sub_825070F0`'s 4 workers at the audit-058-identified context (`0xBCE25340`), entries `0x82506528/58/88/B8`. Once those workers run, they signal the silph UI PKEVENT cluster, unwedging BOTH tid=1 main and tid=13 silph UI in one shot.

2.BE (host-driven VSync ISR delivery) becomes follow-on work after the UI bootstrap completes and frame pacing actually matters.

## Open questions for iterate 2.BD′ / 2.BE planning

1. **Does 2.BE alone unwedge tid=13?** Cheapest verification path: land 2.BE and re-run audit-059, see whether `0x12ac` signal count goes 0 → non-zero.
2. **What is the LR-pattern of canary's `KeSetEvent guest_ptr=0xBCE25234` callers?** The current probe doesn't capture LR — extending the cvar to do so on a filtered subset would let us name the producer function in canary's namespace.
3. **Does the GPU frame-limiter's CP interrupt actually walk into the silph UI cluster?** I.e., does `EmulateCPInterruptDPC` → `interrupt_callback` → guest code ever hit `sub_821CB030` or its callees? An LR probe inside `EmulateCPInterruptDPC` would answer this.

## Artifacts

- `canary.log` 2.2 MB / 34,095 lines / 32,977 AUDIT-HLC lines
- `canary.stdout` 2.2 MB (duplicate of canary.log due to log_file fallback)
- `canary.stderr` 8.4 KB (Wine diagnostics)
- `ours.log` 479 lines (focus ledger + thread diagnostics + final state)
- `ours.stderr` 317 lines (kernel-call counters)
- `vkd3d-proton.cache.write` 15 KB (build artifact, ignored)

Commits in play (xenia-canary, fork-local only):
- `03362b59f` cross-build-wine (cross-compile toolchain)
- `d031d7c51` audit-handle-lifecycle-probes (this audit's probes)
116
audit-runs/audit-059-handle-disambiguation/ROUND_34_PLAN.md
Normal file
@@ -0,0 +1,116 @@
# Round 34 — silph_ui_synth.rs (cluster B sibling) — DEFERRED PLAN

## Background

Rounds 23-33 drove γ-cluster #2 down to the actual gate: **`sub_821741C8`** (silph worker-dispatch loop) fires 0× in ours / 471× in canary (tid=6). It's invoked via dynamic vtable slot 9 from `sub_821752C0` thunk. The vtable writer is in the audit-050 unreachability island — there's no static caller chain to hook into.

The fix shape is a synth module analogous to `silph_synth.rs` (rounds 18-21):
- Synthesize a singleton-like object with the right vtable
- Spawn a guest thread at the right entry with this object as r3
- Let the dispatch chain do the rest

Rounds 18-21 took 4 rounds to land cluster A's analog and ended at "workers run live but idle" because of missing foreign-pointer fields. Cluster B will face similar challenges.

## Sub-round breakdown (estimated 5-8 rounds)

### 34.α — Probe canary's dispatcher singleton (1 round)
Capture canary's runtime state at `sub_821741C8` entry:
- `r3 = 0xBCA44C00` (canary tid=6's dispatcher singleton)
- Dump `r3..r3+0x80` to identify all fields
- Note vtable address at `[r3+0]`

```bash
WINEDEBUG=-all wine xenia_canary.exe --mute=true --audit_handle_lifecycle=true \
  --audit_jit_prolog_pc=0x821741C8 --audit_jit_prolog_r3_bytes=128 \
  --audit_jit_prolog_mem_dump=<vtable_va_from_r3+0> \
  ...
```

### 34.β — Probe full vtable layout (1 round)
Read the vtable bytes statically from the PE (canary's `[r3+0]` IS a static XEX VA — same trick as round 21):
- Read 32-64 slots from PE at file offset = vtable VA - 0x82000000
- Confirm slot 9 = `sub_821C7CB8` and `vtable+0x24` thunk to `sub_821741C8`
- Look at all other slots — do any reference deep guest code that needs more init?

Cross-reference each slot's DB reach. If a slot is the dispatcher's own method body, it'll be called from within the chain — needs to exist.
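The static read in 34.β can be sketched as below, assuming the image bytes are already loaded flat so that the plan's rule "file offset = vtable VA - 0x82000000" holds, and that slots are 4-byte big-endian guest pointers. `read_vtable_slots` and `slot_offset` are hypothetical helper names, not existing functions in the codebase.

```rust
/// Image base from the plan's "file offset = vtable VA - 0x82000000" rule.
const IMAGE_BASE: u32 = 0x8200_0000;

/// Byte offset of a vtable slot, assuming 4-byte guest pointers (slot 9 -> +0x24).
fn slot_offset(slot: usize) -> usize {
    slot * 4
}

/// Read `count` big-endian u32 slots starting at guest VA `vtable_va`.
/// Returns None if the VA is below the image base or the range runs past the buffer.
fn read_vtable_slots(image: &[u8], vtable_va: u32, count: usize) -> Option<Vec<u32>> {
    let start = vtable_va.checked_sub(IMAGE_BASE)? as usize;
    let len = count.checked_mul(4)?;
    let end = start.checked_add(len)?;
    let bytes = image.get(start..end)?;
    Some(
        bytes
            .chunks_exact(4)
            .map(|c| u32::from_be_bytes([c[0], c[1], c[2], c[3]]))
            .collect(),
    )
}
```

With the slots in hand, the "confirm slot 9" step is a comparison of `slots[9]` against the expected VA, and `slot_offset(9)` gives the `+0x24` displacement quoted above.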

### 34.γ — Skeleton synth + thread spawn (1 round)
Create `crates/xenia-kernel/src/silph_ui_synth.rs` mirroring `silph_synth.rs` structure:
```rust
pub fn spawn_silph_ui_dispatcher(state: &mut KernelState, mem: &GuestMemory, scheduler: &mut Scheduler) -> Result<u32, &'static str> {
    if state.silph_ui_synth_done { return Ok(state.silph_ui_synth_ctx); }

    // Allocate ~0x100-0x200 bytes for the dispatcher singleton
    let ctx = state.heap_alloc(0x200, 16)?;
    mem.write_zeros(ctx, 0x200);

    // Install static-XEX vtable at [+0]
    mem.write_u32(ctx + 0x00, VTABLE_VA); // discovered in 34.β

    // Other init fields from 34.α dump
    // ...

    // Spawn dispatcher thread at sub_821748F0 with r3=ctx
    scheduler.spawn(SpawnParams{
        entry: 0x821748F0,
        start_context: ctx,
        create_suspended: false,
        ...
    })?;

    state.silph_ui_synth_done = true;
    state.silph_ui_synth_ctx = ctx;
    Ok(ctx)
}
```

Hook point: first reach of `sub_821CB030` in the existing silph factory chain (the call site that should normally trigger this dispatcher's creation in canary).

Add 3-mode env gate: `XENIA_SILPH_UI_SYNTH={unset|=suspend|=1}`.
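One possible shape for the 3-mode gate, as a sketch. The variable name and its three values come from the line above; the `SynthMode` enum, the function names, and the choice to treat any unrecognized value as off are assumptions of this example.

```rust
/// Three gate states for the synth: disabled, spawn suspended, spawn running.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SynthMode {
    Off,
    Suspend,
    On,
}

/// Map the raw env value to a mode. Unset or unrecognized values fall back
/// to Off so default behavior stays unchanged (an assumption of this sketch).
fn parse_synth_mode(raw: Option<&str>) -> SynthMode {
    match raw {
        Some("1") => SynthMode::On,
        Some("suspend") => SynthMode::Suspend,
        _ => SynthMode::Off,
    }
}

/// Read XENIA_SILPH_UI_SYNTH from the process environment.
fn synth_mode_from_env() -> SynthMode {
    parse_synth_mode(std::env::var("XENIA_SILPH_UI_SYNTH").ok().as_deref())
}
```

Splitting the parse from the environment read keeps the mapping testable without mutating process state.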

### 34.δ — Run + diagnose first crash (1 round)
Almost certainly crashes on a NULL deref of one of the singleton's fields. Use round 19's pattern:
- Probe at thread entry + early BB heads
- Identify the offset that's accessed
- Compare to canary's value at that offset

### 34.ε..η — Iterate on field fills (2-4 rounds)
Each crash identifies one more required field. Fill it. Re-run. Continue until workers idle (verdict D analog).

### 34.θ — Producer-side seeding (1 round)
Even with the dispatcher running, work-items may not flow. Per round 32 it's pool 3 that's starved (271 fires in canary). The producers are `sub_821CBEA8 / sub_821D24A0 / sub_821CD458` — they may need their own bootstrap. Probe what triggers them in canary.

## Verification at each stage

After every commit:
- `cargo test --release --workspace` — 765/765 must pass
- `XENIA_CACHE_PERSIST=1 XENIA_SILPH_UI_SYNTH=1 ./target/release/xenia-rs exec <ISO> -n 50000000 --trace-handles-focus=0x1218,0x1224,0x12a4,0x12ac`
- Check:
  - No crash
  - `sub_821741C8` fires
  - `sub_82450b68` r4=3 fires increase
  - Handle 0x1224 / 0x1218 transition out of NO_SIGNALS_DESPITE_WAITS
  - Eventually: `VdSwap > 1, draws > 0`

## Risk register

- **High**: dispatcher singleton may require many more fields than the analog WorkerCtx (rounds 18-21 needed 8 KEVENTs + ring + descriptors + index table; UI dispatcher likely has similar scope)
- **High**: foreign-arena pointers in canary's heap (similar to round 19's `[+0x28/+0x2C/+0x30]`) may need their own synthesis
- **Medium**: cluster B's worker may itself spawn threads which need contexts which need... cascading scope
- **Low**: workspace tests breaking (probe infrastructure is solid)
- **Low**: existing iterate-2BE work regressing (it's on a separate branch)

## Off-ramps

If we hit a wall at any sub-round, the off-ramps are:
1. Land the infrastructure as opt-in (rounds 18-21 pattern) and ship cluster A + cluster B both as opt-in env vars
2. Drop cluster B entirely and PR the iterate-2BE work to master (production-ready architectural fix)
3. Pivot to lockstep diff of inflate function (round 30 hypothesis (i)) if cluster B keeps producing crash-fix layers

## Branch plan

New branch: `iterate-2BF/silph-ui-synth` off `iterate-2BF/synthetic-silph-spawn` HEAD `40f208e`. Each sub-round = 1 commit. All commits opt-in via env var; default behavior unchanged.

## When ready to execute

Dispatch with the prompt at the round-33 agent's recommendation, starting at sub-round 34.α.
File diff suppressed because it is too large
@@ -0,0 +1,66 @@
AUDIT-PC-PROBE pc=0x8216ea68 tid=1 hw=0 cycle=5362918 lr=0x824ab8e0 r3=0x00000000 r11=0x00000000 [r3+0]=0x00000000 [[r3+0]+24]=0x00000000 [r3+0x0C]=0x00000000 [r3+0x30]=0x00000000
AUDIT-PC-PROBE pc=0x822f1aa8 tid=1 hw=0 cycle=6181256 lr=0x8216ee14 r3=0x40d09a40 r11=0x40111910 [r3+0]=0x00000021 [[r3+0]+24]=0x00000000 [r3+0x0C]=0x40541a40 [r3+0x30]=0x00000000
AUDIT-PC-PROBE pc=0x822f1b38 tid=1 hw=0 cycle=6181641 lr=0x822f1b38 r3=0x00000001 r11=0x824b0000 [r3+0]=0x00000000 [[r3+0]+24]=0x00000000 [r3+0x0C]=0x00000000 [r3+0x30]=0x00000000
AUDIT-PC-PROBE pc=0x821746b0 tid=1 hw=0 cycle=9229300 lr=0x82173c38 r3=0x40ba9a80 r11=0x00000000 [r3+0]=0x40111910 [[r3+0]+24]=0x00000000 [r3+0x0C]=0x00000000 [r3+0x30]=0x00000000
AUDIT-PC-PROBE pc=0x821748f0 tid=13 hw=1 cycle=0 lr=0xbcbcbcbc r3=0x4024a840 r11=0x00000000 [r3+0]=0x40ba9a80 [[r3+0]+24]=0x00000000 [r3+0x0C]=0x00000000 [r3+0x30]=0x4250dec0

=== Final State ===
PC: 0x00000000
LR: 0xbcbcbcbc
CTR: 0x00000000
CR: 0x00000000
XER: CA=0 OV=0 SO=0

=== Thread diagnostics ===
hw=0 idx=0 tid=1 state=Blocked(WaitAny { handles: [4208], deadline: None }) pc=0x824ac578 lr=0x824ac578 sp=0x700ff6e0
r0=0x82153bf0 r3=0x00000000 r4=0x00000001 r5=0x00000000 r6=0x00000000 r7=0x03a72328
r8=0x43b77284 r9=0x43b77328 r10=0x00000001 r11=0x00000103 r12=0x82173c64 r13=0x7fff0000
hw=0 idx=1 tid=11 state=Blocked(WaitAny { handles: [2190094916, 2190094880], deadline: None }) pc=0x824d2a94 lr=0x824d2a94 sp=0x71497d90
r0=0x00000000 r3=0x00000000 r4=0x71497de0 r5=0x00000001 r6=0x00000003 r7=0x00000001
r8=0x00000000 r9=0x00000000 r10=0x71497df0 r11=0x828a3244 r12=0xbcbcbcbc r13=0x4b9f1000
hw=1 idx=0 tid=2 state=Blocked(WaitAny { handles: [2189887804], deadline: None }) pc=0x824a95f8 lr=0x824a95f8 sp=0x710ffd20
r0=0x0000030c r3=0x00000000 r4=0x00000003 r5=0x00000001 r6=0x00000000 r7=0x00000000
r8=0x00000001 r9=0x6f000000 r10=0x824a9178 r11=0x82870000 r12=0x824a94f0 r13=0x4acc3000
hw=1 idx=1 tid=13 state=Blocked(WaitAny { handles: [4216], deadline: None }) pc=0x824ac578 lr=0x824ac578 sp=0x715a7a20
r0=0x821511d0 r3=0x00000000 r4=0x00000001 r5=0x00000000 r6=0x00000000 r7=0x03a723d0
r8=0x43b77334 r9=0x43b77334 r10=0x40541f80 r11=0x00000001 r12=0x821cb1e0 r13=0x4d1d4000
hw=2 idx=0 tid=7 state=Blocked(WaitAny { handles: [1111821148], deadline: Some(42946672) }) pc=0x824cd4f4 lr=0x824cd4f4 sp=0x71187e60
r0=0x00000000 r3=0x00000000 r4=0x00000003 r5=0x00000001 r6=0x00000000 r7=0x71187eb0
r8=0x00000000 r9=0x00000000 r10=0x00000002 r11=0x00000002 r12=0xbcbcbcbc r13=0x4b1d6000
hw=2 idx=1 tid=8 state=Blocked(WaitAny { handles: [4176, 4128], deadline: None }) pc=0x824ab214 lr=0x824ab214 sp=0x71287c90
r0=0x00000000 r3=0x00000000 r4=0x71287cf0 r5=0x00000001 r6=0x00000001 r7=0x00000000
r8=0x00000000 r9=0x00009030 r10=0x00000002 r11=0x00000020 r12=0x822f1ff0 r13=0x4b90a000
hw=3 idx=0 tid=4 state=Blocked(WaitAny { handles: [4120], deadline: None }) pc=0x824ac578 lr=0x824ac578 sp=0x7112fb80
r0=0x821511a0 r3=0x00000000 r4=0x00000001 r5=0x00000000 r6=0x00000000 r7=0x03a723d0
r8=0x43b7732c r9=0x828f0000 r10=0x00000008 r11=0x00000000 r12=0x8245a660 r13=0x4adc6000
hw=3 idx=1 tid=5 state=Blocked(WaitAny { handles: [4224], deadline: None }) pc=0x824ac578 lr=0x824ac578 sp=0x7116fbe0
r0=0x821511a0 r3=0x00000000 r4=0x00000001 r5=0x00000000 r6=0x00000000 r7=0x03a723d0
r8=0x43b7732c r9=0x828f0000 r10=0x00000001 r11=0x00000000 r12=0x82458b34 r13=0x4adc8000
hw=4 idx=0 tid=9 state=Ready pc=0x824d140c lr=0x824d22b4 sp=0x71387df0
r0=0x00000000 r3=0x4250dedc r4=0x4250e040 r5=0x00000001 r6=0x00000000 r7=0x00000000
r8=0x4b9ec000 r9=0x01010000 r10=0x01010000 r11=0x00000000 r12=0x824d22a8 r13=0x4b9ec000
hw=5 idx=0 tid=3 state=Blocked(WaitAny { handles: [4112], deadline: None }) pc=0x824ac578 lr=0x824ac578 sp=0x7111fdf0
r0=0x82153bf0 r3=0x00000000 r4=0x00000001 r5=0x00000000 r6=0x00000000 r7=0x00000a10
r8=0x00000010 r9=0x00000000 r10=0x00009030 r11=0x00000000 r12=0x82181988 r13=0x4adc4000
hw=5 idx=1 tid=6 state=Ready pc=0x824ab214 lr=0x824ab214 sp=0x7117fc60
r0=0x821511a0 r3=0x00000001 r4=0x7117fcc0 r5=0x00000001 r6=0x00000001 r7=0x00000000
r8=0x7117fcb0 r9=0x00009030 r10=0x00000002 r11=0x00000020 r12=0x82458d68 r13=0x4adca000
hw=5 idx=2 tid=10 state=Ready pc=0x824d1404 lr=0x824d22b4 sp=0x71487e00
r0=0x00000000 r3=0x4250dedc r4=0x4250e040 r5=0x00000001 r6=0x00000000 r7=0x00000000
r8=0x4b9ee000 r9=0x01010000 r10=0x01010000 r11=0x00000000 r12=0x824d22a8 r13=0x4b9ee000
hw=5 idx=3 tid=12 state=Ready pc=0x824aa6a4 lr=0x824aa6a4 sp=0x714a7da0
r0=0x00000000 r3=0x000000ff r4=0x00000020 r5=0x714a7df4 r6=0x00000000 r7=0x00000000
r8=0x00000000 r9=0x00000000 r10=0x00000000 r11=0x00000001 r12=0x8217898c r13=0x4d1d2000

-- Handle waiter lists --
handle=0x00001020 Semaphore(0/2147483647) waiters(tid)=[8]
handle=0x42450b5c Event(sig=false, mr=true) waiters(tid)=[7]
handle=0x828a3244 Event(sig=false, mr=false) waiters(tid)=[11]
handle=0x00001018 Semaphore(0/2147483647) waiters(tid)=[4]
handle=0x8287093c Event(sig=false, mr=false) waiters(tid)=[2]
handle=0x00001070 Thread(id=13, exit=None) waiters(tid)=[1]
handle=0x00001080 Event(sig=false, mr=false) waiters(tid)=[5]
handle=0x00001078 Event(sig=false, mr=false) waiters(tid)=[13]
handle=0x828a3220 Event(sig=false, mr=true) waiters(tid)=[11]
handle=0x00001050 Event(sig=false, mr=true) waiters(tid)=[8]
handle=0x00001010 Event(sig=false, mr=true) waiters(tid)=[3]
@@ -0,0 +1,167 @@
# Round-A1..A4 findings — canary tid=6 spawn chain & divergence frontier

## Anchor reframe (round-37 misread corrected)

The "factory/registry layer divergence at [0x828E1F08]" framing is falsified.
Both engines install the SAME static-XEX `.rdata` vtable `0x820A183C` at the
singleton's `[+0]`. The instance VAs differ only because of ε-class allocator
divergence (audit-043).

| Probe | Canary | Ours |
|----------------------------|----------------------|----------------------|
| `[0x828E1F08]` | 0xBC22C910 (heap) | 0x40111910 (heap) |
| `[[0x828E1F08]+0]` vtable | 0x820A183C | 0x820A183C (SAME) |
| `vtable[+0]` thunk | 0x82175330 | 0x82175330 (SAME) |
| `vtable[+8]` thunk | 0x82175340 → b sub_821741C8 | SAME (vtable bytes from XEX `.rdata`) |

The thunks at 0x82175330+ are 8-byte `lwz r3, 8(r3); b <real_method>`
trampolines. Slot 2 (`+0x08`) is the worker dispatch entry that round 33
identified as 471× in canary tid=6 / 0× in ours.

## A.1 — Canary dispatcher loop is in sub_822F1AA8 on tid=6

Probe `--audit_jit_prolog_pc=0x821741C8 --audit_jit_prolog_r3_bytes=256` on
canary (35 s):

- ~1678 fires of sub_821741C8 on **tid=6**
- r3 at entry = `0xBCCC4A80` (the inner sub-object of the silph::UImpl
  singleton — extracted via the thunk's `lwz r3, 8(r3)`)
- LR at entry = `0x822F1D5C` (return PC after the `bctrl` at 0x822F1D58 inside
  sub_822F1AA8)
- Singleton's `[+C0..+D0]` UTF-16 spells "HF Frequency" (a UI label)

The dispatch site in canary (the `bctrl`) is at PC 0x822F1D58 inside
sub_822F1AA8:
```
0x822F1D40: lwz r3, 7944(r25) ; r3 = [r25+0x1F08] = [0x828E1F08]
0x822F1D4C: lwz r11, 0(r3) ; vtable
0x822F1D50: lwz r11, 8(r11) ; vtable[+8] = thunk 0x82175340
0x822F1D54: mtctr r11
0x822F1D58: bctrl ; → 0x82175340 → b 0x821741C8
```

## A.2 — Canary tid=6 spawn site is sub_821746B0 at PC 0x82174824

Enumeration of `ExCreateThread` calls in canary (35 s, 21 unique tuples):

```
entry=821748F0 start_ctx=BC365700 lr=824AC5F0 guest_lr=82174828 ← silph dispatcher #1
entry=821748F0 start_ctx=BC366DA0 lr=824AC5F0 guest_lr=82174828 ← silph dispatcher #2
```

PC `0x82174824` is the `bl 0x82172370` (the `ExCreateThread` thunk) inside
`sub_821746B0`. The setup is:
```
0x8217480C: lis r11, 0x8217
0x82174810: li r7, 0
0x82174814: li r6, 4 ; priority
0x82174818: mr r5, r29 ; start_ctx
0x8217481C: addi r4, r11, 18672 ; r4 = 0x821748F0 (entry)
0x82174820: li r3, 0
0x82174824: bl 0x82172370 ; ExCreateThread
```

The entry `0x821748F0` is a thread main that calls `bl 0x821749C0` (the
inner dispatch).
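The address comments in the listings above follow from the usual PowerPC `lis` + 16-bit signed displacement pairing. A small sketch of that arithmetic, useful for re-checking annotated targets by hand; `lis_addi` is an illustrative helper name, and the `r25 = 0x828E0000` base used for the A.1 load is inferred from that listing's own comment rather than from a visible `lis`.

```rust
/// Effective value of `lis rD, hi` followed by `addi rD, rD, lo`
/// (or the effective address of a load/store that uses `lo` as its displacement).
/// `lo` is a signed 16-bit immediate, so it is sign-extended before the add.
fn lis_addi(hi: u16, lo: i16) -> u32 {
    ((hi as u32) << 16).wrapping_add(lo as i32 as u32)
}

/// Thread entry computed at 0x8217480C / 0x8217481C: lis 0x8217 + addi 18672.
fn silph_dispatcher_entry() -> u32 {
    lis_addi(0x8217, 18672)
}

/// Global slot read at 0x822F1D40: base 0x828E0000 (inferred) + displacement 7944.
fn singleton_slot() -> u32 {
    lis_addi(0x828E, 7944)
}
```

Displacements at or above 0x8000 encode as negative values, so the `lis` half is one higher than the top 16 bits of the final address in those cases.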

## A.3 — sub_822F1AA8 spawns a SECOND thread at 0x822F1B08

The dispatch-loop function `sub_822F1AA8` itself ALSO spawns a thread at
PC 0x822F1B08 with entry=`sub_822F1EE0` and `start_ctx=BCE24A40`:
```
0x822F1AEC: lis r11, 0x822F
0x822F1AFC: addi r4, r11, 7904 ; r4 = 0x822F1EE0
0x822F1B08: bl 0x82172370 ; ExCreateThread
```

sub_822F1EE0 → sub_822F1F20 contains its own atomic state-machine + wait loop.

## A.3' — sub_822F1AA8 has exactly 2 callers, both in sub_8216EA68

```
source=0x8216ECCC source_func=0x8216EA68 kind=call
source=0x8216EE10 source_func=0x8216EA68 kind=call
```

So sub_8216EA68 is the only function that drives sub_822F1AA8.

## A.4 — Ours' divergence is INSIDE the spawned thread, NOT at the spawn

Mirror-probed ours at `sub_821746B0` body BB heads (parallel mode, 50M
instructions, XENIA_CACHE_PERSIST=1):

| PC | Fires | Notes |
|-------------|-------|------------------------------------------------|
| 0x821746B0 | 1 | Entry. r3=0x40ba9a80 |
| 0x821746E0 | 1 | After `bl 0x8284DCFC` (critical-section) |
| 0x82174798 | 1 | After the early `beq` (r28==0 branch) |
| 0x821747B8 | 1 | **Past the gate**: `[0x828E2B14]=0x40105000` non-NULL; `bl 0x82150EF8` returned r3=0x4024a840 (NON-NULL) |
| 0x821747D8 | 1 | After the inner `bl 0x821723F0` |
| 0x8217480C | 1 | Enters the spawn block |
| 0x82174828 | 1 | **Post-`bl ExCreateThread`**, r3=0x1070 = thread handle |

**OURS DOES SPAWN THE THREAD VIA THIS SITE.** The returned handle 0x1070 is
**tid=13's thread handle** (per round 37 final state). So **ours' tid=13 IS
the same logical thread as canary's tid=6** — spawned by the identical call
site with the same entry (0x821748F0).

## A.4 — Divergence is INSIDE the spawned thread's body

Round 37's frame trail for ours' tid=13 wedge:
`0x821CB1E0 → 0x821CBAE0 → 0x821CC454 → 0x821C4F18 → 0x82174A80`

The LAST frame `0x82174A80` is **inside sub_821749C0** (= the inner dispatch
called from sub_821748F0). It's right after the vtable dispatch at
0x82174A78 (`bctrl` on `[r30+vtable][+16]`):

```
0x82174a64: mr r3, r30 ; r3 = some object
0x82174a68: lwz r11, 0(r30)
0x82174a6c: lwz r4, 4(r29)
0x82174a70: lwz r5, 8(r31)
0x82174a74: lwz r11, 16(r11) ; r11 = vtable[+0x10]
0x82174a78: mtctr r11
0x82174a7c: bctrl ; dispatch
0x82174a80: lwz r3, 0(r29) ; ← wedge frame top (LR after bctrl)
```

So `sub_821749C0`'s vtable[+0x10] dispatch on tid=13/tid=6's `r30` object
lands at audit-049 territory in ours (chain through sub_821CB030+0x128 that
ends waiting forever on handle 0x1078). In canary, the same dispatch on the
same object SHOULD land somewhere that ultimately reaches sub_822F1AA8's
dispatch loop and runs sub_821741C8 1678× via vtable[+8].

**The object `r30` is the result of `bl 0x821CF3F0`** at PC 0x821749DC. So
sub_821CF3F0 returns a registry-lookup object; the vtable on this object's
slot +0x10 method's body determines whether the thread wedges or runs.

## Phase B classification

Class 3 — **Missing init-time precondition**. Ours reaches the spawn site,
ours' tid=13 enters the chain, ours' tid=13 enters sub_821749C0, but the
vtable[+0x10] dispatch at PC 0x82174A78 in ours lands in audit-049 territory
(wait forever on 0x1078) rather than continuing through the canonical chain
toward sub_822F1AA8's outer dispatch loop.

Possible classes to refine in next round:
- **3a**: same vtable but state-dependent — `r30`'s field at a specific offset
  differs in ours vs canary, causing the method body to take a different
  branch.
- **3b**: the vtable in `r30` is DIFFERENT in ours vs canary (e.g., ours has
  a base-class vtable but canary has a derived-class vtable).
- **4**: synthesis fallback — spawn a SECOND thread that runs sub_822F1AA8's
  dispatch loop directly, bypassing the wedged sub_821749C0 chain.

## Next probe (A.4.5)

Probe both engines at sub_821749C0 entry filtering tid=13 (ours) / tid=6
(canary), capturing:
- `r3` and `r4` at entry (the factory-output object and the ctx)
- After the `bl 0x821CF3F0` at 0x821749DC: capture r30 (= sub_821CF3F0
  return — the object whose vtable is dispatched at 0x82174A78)
- At PC 0x82174A78 (the divergent bctrl): r30 + r30+0 (vtable) + vtable[+0x10]
  (the dispatch target)

If ours and canary have IDENTICAL `vtable[+0x10]` targets but the method
body's behavior differs → class 3a (state divergence). If targets differ →
class 3b (vtable identity divergence).
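The decision rule for A.4.5 is small enough to state as code. This is a sketch only: `DispatchSample`, `DivergenceClass`, and `classify` are hypothetical names, and it assumes the two captured `vtable[+0x10]` targets are directly comparable guest VAs (true if both point into the static XEX image).

```rust
/// The two refinements of class 3 named in the Phase B section.
#[derive(Debug, PartialEq, Eq)]
enum DivergenceClass {
    /// 3a: same dispatch target, behavior differs because object state differs.
    StateDivergence,
    /// 3b: the dispatch targets themselves differ.
    VtableIdentityDivergence,
}

/// One engine's capture at the bctrl site (PC 0x82174A78).
struct DispatchSample {
    /// Value loaded from vtable[+0x10], i.e. the address bctrl jumps to.
    target: u32,
}

/// Apply the A.4.5 rule: identical targets -> 3a, different targets -> 3b.
fn classify(ours: &DispatchSample, canary: &DispatchSample) -> DivergenceClass {
    if ours.target == canary.target {
        DivergenceClass::StateDivergence
    } else {
        DivergenceClass::VtableIdentityDivergence
    }
}
```

If the targets turn out to be heap addresses rather than static XEX VAs, a raw comparison would not be meaningful and the rule would need a normalization step first.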
@@ -0,0 +1,91 @@
AUDIT-PC-PROBE pc=0x821746b0 tid=1 hw=0 cycle=9228833 lr=0x82173c38 r3=0x40ba9a80 r11=0x00000000 [r3+0]=0x40111910 [[r3+0]+24]=0x00000000 [r3+0x0C]=0x00000000 [r3+0x30]=0x00000000
AUDIT-MEM-READ addr=0x828e2b14 val=0x40105000 vtable=0x40105004 vtable[0]=0x40105008 vtable[24]=0x40105020 pc=0x821746b0 tid=1 cycle=9228833
AUDIT-PC-PROBE pc=0x821746e0 tid=1 hw=0 cycle=9228856 lr=0x821746e0 r3=0x00000000 r11=0x00000000 [r3+0]=0x00000000 [[r3+0]+24]=0x00000000 [r3+0x0C]=0x00000000 [r3+0x30]=0x00000000
AUDIT-MEM-READ addr=0x828e2b14 val=0x40105000 vtable=0x40105004 vtable[0]=0x40105008 vtable[24]=0x40105020 pc=0x821746e0 tid=1 cycle=9228856
AUDIT-PC-PROBE pc=0x82174798 tid=1 hw=0 cycle=9228859 lr=0x821746e0 r3=0x00000000 r11=0x00000000 [r3+0]=0x00000000 [[r3+0]+24]=0x00000000 [r3+0x0C]=0x00000000 [r3+0x30]=0x00000000
AUDIT-MEM-READ addr=0x828e2b14 val=0x40105000 vtable=0x40105004 vtable[0]=0x40105008 vtable[24]=0x40105020 pc=0x82174798 tid=1 cycle=9228859
AUDIT-PC-PROBE pc=0x821747b8 tid=1 hw=0 cycle=9229012 lr=0x821747ac r3=0x4024a840 r11=0x4024a840 [r3+0]=0x4024ace0 [[r3+0]+24]=0x43777290 [r3+0x0C]=0x4024a820 [r3+0x30]=0x4250dec0
AUDIT-MEM-READ addr=0x828e2b14 val=0x40105000 vtable=0x40105004 vtable[0]=0x40105008 vtable[24]=0x40105020 pc=0x821747b8 tid=1 cycle=9229012
AUDIT-PC-PROBE pc=0x821747d8 tid=1 hw=0 cycle=9229440 lr=0x821747cc r3=0x4024a840 r11=0xffffffff [r3+0]=0x40ba9a80 [[r3+0]+24]=0x00000000 [r3+0x0C]=0x00000000 [r3+0x30]=0x4250dec0
AUDIT-MEM-READ addr=0x828e2b14 val=0x40105000 vtable=0x40105004 vtable[0]=0x40105008 vtable[24]=0x40105020 pc=0x821747d8 tid=1 cycle=9229440
AUDIT-PC-PROBE pc=0x8217480c tid=1 hw=0 cycle=9229443 lr=0x821747cc r3=0x4024a840 r11=0xffffffff [r3+0]=0x40ba9a80 [[r3+0]+24]=0x00000000 [r3+0x0C]=0x00000000 [r3+0x30]=0x4250dec0
AUDIT-MEM-READ addr=0x828e2b14 val=0x40105000 vtable=0x40105004 vtable[0]=0x40105008 vtable[24]=0x40105020 pc=0x8217480c tid=1 cycle=9229443
|
||||
AUDIT-PC-PROBE pc=0x82174828 tid=1 hw=0 cycle=9229509 lr=0x82174828 r3=0x00001070 r11=0x824b0000 [r3+0]=0x00000000 [[r3+0]+24]=0x00000000 [r3+0x0C]=0x00000000 [r3+0x30]=0x00000000
|
||||
AUDIT-MEM-READ addr=0x828e2b14 val=0x40105000 vtable=0x40105004 vtable[0]=0x40105008 vtable[24]=0x40105020 pc=0x82174828 tid=1 cycle=9229509
|
||||
|
||||
=== Final State ===
|
||||
PC: 0x824ac578
|
||||
LR: 0x824ac578
|
||||
CTR: 0x82153bf0
|
||||
CR: 0x24000028
|
||||
XER: CA=0 OV=0 SO=0
|
||||
r0 : 0x0000000082153bf0
|
||||
r1 : 0x00000000700ff6e0
|
||||
r2 : 0x0000000020000000
|
||||
r4 : 0x0000000000000001
|
||||
r7 : 0x0000000003a72328
|
||||
r8 : 0x0000000043b77284
|
||||
r9 : 0x0000000043b77328
|
||||
r10: 0x0000000000000001
|
||||
r11: 0x0000000000000103
|
||||
r12: 0x0000000082173c64
|
||||
r13: 0x000000007fff0000
|
||||
r18: 0x0000000040d09a7c
|
||||
r23: 0x00000000828f3844
|
||||
r26: 0x000000004024a620
|
||||
r27: 0x00000000820a17a8
|
||||
r31: 0x0000000000001070
|
||||
|
||||
=== Thread diagnostics ===
|
||||
hw=0 idx=0 tid=1 state=Blocked(WaitAny { handles: [4208], deadline: None }) pc=0x824ac578 lr=0x824ac578 sp=0x700ff6e0
|
||||
r0=0x82153bf0 r3=0x00000000 r4=0x00000001 r5=0x00000000 r6=0x00000000 r7=0x03a72328
|
||||
r8=0x43b77284 r9=0x43b77328 r10=0x00000001 r11=0x00000103 r12=0x82173c64 r13=0x7fff0000
|
||||
hw=0 idx=1 tid=11 state=Blocked(WaitAny { handles: [2190094916, 2190094880], deadline: None }) pc=0x824d2a94 lr=0x824d2a94 sp=0x71497d90
|
||||
r0=0x00000000 r3=0x00000000 r4=0x71497de0 r5=0x00000001 r6=0x00000003 r7=0x00000001
|
||||
r8=0x00000000 r9=0x00000000 r10=0x71497df0 r11=0x828a3244 r12=0xbcbcbcbc r13=0x4b9f1000
|
||||
hw=1 idx=0 tid=2 state=Blocked(WaitAny { handles: [2189887804], deadline: None }) pc=0x824a95f8 lr=0x824a95f8 sp=0x710ffd20
|
||||
r0=0x0000030c r3=0x00000000 r4=0x00000003 r5=0x00000001 r6=0x00000000 r7=0x00000000
|
||||
r8=0x00000001 r9=0x6f000000 r10=0x824a9178 r11=0x82870000 r12=0x824a94f0 r13=0x4acc3000
|
||||
hw=1 idx=1 tid=13 state=Blocked(WaitAny { handles: [4216], deadline: None }) pc=0x824ac578 lr=0x824ac578 sp=0x715a7a20
|
||||
r0=0x821511d0 r3=0x00000000 r4=0x00000001 r5=0x00000000 r6=0x00000000 r7=0x03a723d0
|
||||
r8=0x43b77334 r9=0x43b77334 r10=0x40541f80 r11=0x00000001 r12=0x821cb1e0 r13=0x4d1d4000
|
||||
hw=2 idx=0 tid=7 state=Blocked(WaitAny { handles: [1111821148], deadline: Some(42946672) }) pc=0x824cd4f4 lr=0x824cd4f4 sp=0x71187e60
|
||||
r0=0x00000000 r3=0x00000000 r4=0x00000003 r5=0x00000001 r6=0x00000000 r7=0x71187eb0
|
||||
r8=0x00000000 r9=0x00000000 r10=0x00000002 r11=0x00000002 r12=0xbcbcbcbc r13=0x4b1d6000
|
||||
hw=2 idx=1 tid=8 state=Blocked(WaitAny { handles: [4176, 4132], deadline: None }) pc=0x824ab214 lr=0x824ab214 sp=0x71287c90
|
||||
r0=0x00000000 r3=0x00000000 r4=0x71287cf0 r5=0x00000001 r6=0x00000001 r7=0x00000000
|
||||
r8=0x00000000 r9=0x00009030 r10=0x00000002 r11=0x00000020 r12=0x822f1ff0 r13=0x4b90a000
|
||||
hw=3 idx=0 tid=4 state=Blocked(WaitAny { handles: [4120], deadline: None }) pc=0x824ac578 lr=0x824ac578 sp=0x7112fb80
|
||||
r0=0x821511a0 r3=0x00000000 r4=0x00000001 r5=0x00000000 r6=0x00000000 r7=0x03a723d0
|
||||
r8=0x43b7732c r9=0x828f0000 r10=0x00000008 r11=0x00000000 r12=0x8245a660 r13=0x4adc6000
|
||||
hw=3 idx=1 tid=5 state=Blocked(WaitAny { handles: [4224], deadline: None }) pc=0x824ac578 lr=0x824ac578 sp=0x7116fbe0
|
||||
r0=0x821511a0 r3=0x00000000 r4=0x00000001 r5=0x00000000 r6=0x00000000 r7=0x03a723d0
|
||||
r8=0x43b7732c r9=0x828f0000 r10=0x00000001 r11=0x00000000 r12=0x82458b34 r13=0x4adc8000
|
||||
hw=4 idx=0 tid=9 state=Ready pc=0x824d140c lr=0x824d22b4 sp=0x71387df0
|
||||
r0=0x00000000 r3=0x4250dedc r4=0x4250e040 r5=0x00000001 r6=0x00000000 r7=0x00000000
|
||||
r8=0x4b9ec000 r9=0x01010000 r10=0x01010000 r11=0x00000000 r12=0x824d22a8 r13=0x4b9ec000
|
||||
hw=5 idx=0 tid=3 state=Blocked(WaitAny { handles: [4112], deadline: None }) pc=0x824ac578 lr=0x824ac578 sp=0x7111fdf0
|
||||
r0=0x82153bf0 r3=0x00000000 r4=0x00000001 r5=0x00000000 r6=0x00000000 r7=0x00000a10
|
||||
r8=0x00000010 r9=0x00000000 r10=0x00009030 r11=0x00000000 r12=0x82181988 r13=0x4adc4000
|
||||
hw=5 idx=1 tid=6 state=Ready pc=0x824ab214 lr=0x824ab214 sp=0x7117fc60
|
||||
r0=0x821511a0 r3=0x00000001 r4=0x7117fcc0 r5=0x00000001 r6=0x00000001 r7=0x00000000
|
||||
r8=0x7117fcb0 r9=0x00009030 r10=0x00000002 r11=0x00000020 r12=0x82458d68 r13=0x4adca000
|
||||
hw=5 idx=2 tid=10 state=Ready pc=0x824d140c lr=0x824d22b4 sp=0x71487e00
|
||||
r0=0x00000000 r3=0x4250dedc r4=0x4250e040 r5=0x00000001 r6=0x00000000 r7=0x00000000
|
||||
r8=0x4b9ee000 r9=0x01010000 r10=0x01010000 r11=0x00000000 r12=0x824d22a8 r13=0x4b9ee000
|
||||
hw=5 idx=3 tid=12 state=Ready pc=0x824aa6a4 lr=0x824aa6a4 sp=0x714a7da0
|
||||
r0=0x00000000 r3=0x000000ff r4=0x00000020 r5=0x714a7df4 r6=0x00000000 r7=0x00000000
|
||||
r8=0x00000000 r9=0x00000000 r10=0x00000000 r11=0x00000001 r12=0x8217898c r13=0x4d1d2000
|
||||
|
||||
-- Handle waiter lists --
|
||||
handle=0x00001024 Semaphore(0/2147483647) waiters(tid)=[8]
|
||||
handle=0x00001010 Event(sig=false, mr=true) waiters(tid)=[3]
|
||||
handle=0x00001070 Thread(id=13, exit=None) waiters(tid)=[1]
|
||||
handle=0x00001080 Event(sig=false, mr=false) waiters(tid)=[5]
|
||||
handle=0x828a3244 Event(sig=false, mr=false) waiters(tid)=[11]
|
||||
handle=0x00001018 Semaphore(0/2147483647) waiters(tid)=[4]
|
||||
handle=0x00001050 Event(sig=false, mr=true) waiters(tid)=[8]
|
||||
handle=0x00001078 Event(sig=false, mr=false) waiters(tid)=[13]
|
||||
handle=0x8287093c Event(sig=false, mr=false) waiters(tid)=[2]
|
||||
handle=0x828a3220 Event(sig=false, mr=true) waiters(tid)=[11]
|
||||
handle=0x42450b5c Event(sig=false, mr=true) waiters(tid)=[7]
|
||||
File diff suppressed because it is too large
File diff suppressed because it is too large
File diff suppressed because it is too large
@@ -0,0 +1,136 @@
|
||||
# Phase A synthesis — canary tid=6 IS the main thread; the wedge is sub_822F1AA8's loop exit
|
||||
|
||||
## Top-line finding
|
||||
|
||||
**Canary's `tid=6` is canary's main thread.** Confirmed by probing `entry_point`
|
||||
(`sub_824AB748`) with `--audit_jit_prolog_pc=0x824AB748`: fires 1× on
|
||||
`tid=00000006` with `lr=BCBCBCBC` (= OS-initial / no caller). Ours numbers
|
||||
its main thread `tid=1`. Same logical thread; different label.
|
||||
|
||||
Therefore "tid=6 fires sub_821741C8 471×" (round 33) means **the main thread**
|
||||
loops inside `sub_822F1AA8` firing `sub_821741C8` ~1678×/30s in canary. In
|
||||
ours, the main thread (tid=1) runs `sub_822F1AA8` ONCE, exits the loop, and
|
||||
proceeds to thread-join on the spawned init thread (handle 0x1070 = tid=13),
|
||||
which is itself blocked forever on handle 0x1078.
|
||||
|
||||
## Call chain (identical in both engines, different runtime behavior)
|
||||
|
||||
```
|
||||
entry_point (sub_824AB748)
|
||||
│
|
||||
├─ sub_824ACB38 CRT-driven fnptr-array iterator (audit-050 region)
|
||||
├─ ...
|
||||
└─ sub_8216EA68 Many local calls including:
|
||||
├─ ExCreateThread(entry=sub_8217F0F8 ...) ; sibling thread
|
||||
├─ sub_822F1AA8(controller=...) ; FIRST call (PC 0x8216ECCC)
|
||||
└─ sub_822F1AA8(controller=0xBCE24A40 canary / ; SECOND call (PC 0x8216EE10)
|
||||
0x40d09a40 ours) ↑ this is the loop
|
||||
```
|
||||
|
||||
The SECOND call is what runs the dispatcher loop. Its LR = 0x8216EE14.
|
||||
Confirmed in both engines.
|
||||
|
||||
## sub_822F1AA8 loop structure
|
||||
|
||||
```
|
||||
0x822F1AA8: entry, r30 = r3 (controller)
|
||||
0x822F1AEC-0x822F1B08: ExCreateThread(entry=sub_822F1EE0, ctx=r30) → r29 = handle
|
||||
0x822F1B30-0x822F1B34: bl 0x824AA8B0(r3=r29) ; ?
|
||||
0x822F1B38-0x822F1B4C: first bctrl → vtable[+0] of [0x828E1F08]
|
||||
0x822F1B50-0x822F1B74: setup, bl 0x824AA330 INFINITE wait on [r22+32]
|
||||
0x822F1B80-0x822F1BA8: post-wait setup; [r30+0] |= 0x2
|
||||
0x822F1BB0-0x822F1BBC: TOP-OF-LOOP CHECK: if [r30+0] & 0x10000000 → goto 0x822F1E10 (exit)
|
||||
0x822F1BCC..0x822F1DEC: loop body (includes the vtable[+8] bctrl → sub_821741C8 at PC 0x822F1D58)
|
||||
0x822F1DEC-0x822F1DFC: bl 0x824AA330 INFINITE wait on [r23+0]
|
||||
0x822F1E00-0x822F1E0C: END-OF-ITERATION CHECK: if [r30+0] & 0x10000000 == 0 → goto 0x822F1BCC (re-loop)
|
||||
0x822F1E10-0x822F1E18: EXIT: [r30+0] |= 0x02000000 (set MSB-6 = LSB-25)
|
||||
0x822F1E1C-0x822F1E24: release something via bl 0x824AA2F0
|
||||
0x822F1E28-0x822F1E30: bl 0x824AA330 INFINITE on [r30+28] = SPAWNED THREAD HANDLE (thread join!)
|
||||
0x822F1E40: bl 0x824AA3E0
|
||||
0x822F1E44-0x822F1E5C: final cleanup: vtable[+24] bctrl on [0x828E1F08]
|
||||
0x822F1E60-0x822F1E78: [r30+0] = 0, then [r30+0] |= 1; bl 0x824567E0
|
||||
0x822F1E7C-0x822F1E88: epilogue
|
||||
```
|
||||
|
||||
**Loop exit gate**: `[r30+0] & 0x10000000` (bit 28 LSB / bit 3 MSB). Set →
|
||||
exit. Both top-of-loop check (0x822F1BBC) and end-of-iteration check
|
||||
(0x822F1E0C) gate on the same bit.
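
The gate can be restated as a minimal Rust sketch. The constant and function names here are hypothetical and are not taken from xenia-rs or from the guest code; only the mask `0x10000000`, the `|= 0x2` step, and the entry value `0x21` come from the notes above.

```rust
/// Mask tested by both loop checks (top-of-loop and end-of-iteration).
const LOOP_EXIT_MASK: u32 = 0x1000_0000;
/// Bit OR-ed into the flag word after the first wait (0x822F1B80..0x822F1BA8).
const POST_WAIT_FLAG: u32 = 0x2;

/// Convert an LSB-0 bit index to PowerPC MSB-0 numbering for a 32-bit word.
fn lsb_to_msb_index(lsb_index: u32) -> u32 {
    31 - lsb_index
}

/// True when the dispatcher loop would take the exit branch.
fn loop_should_exit(flag_word: u32) -> bool {
    flag_word & LOOP_EXIT_MASK != 0
}

/// Flag word as it should look when the top-of-loop check first runs,
/// assuming nothing else writes the word in between.
fn flag_word_at_first_check(entry_value: u32) -> u32 {
    entry_value | POST_WAIT_FLAG
}
```

With the probed entry value `0x21`, `flag_word_at_first_check` yields `0x23` and `loop_should_exit` is false, so the loop body should run unless something else sets the exit bit first.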
|
||||
|
||||
## What's different between engines
|
||||
|
||||
| Engine | [r30+0] at entry | Loop iterations | Exits sub_822F1AA8? |
|
||||
|--------|------------------|------------------|----------------------|
|
||||
| canary | 0x21 (per probe) | ~1678+ in 30s | NO (stays in loop) |
|
||||
| ours | 0x21 (per probe) | 0 (probes show none of the loop-body PCs fire after entry) | YES (exits quickly) |
|
||||
|
||||
Both engines have `[r30+0]=0x21` at entry — bit 28 NOT set. After the `ori
|
||||
r11, r11, 0x2` at 0x822F1B90, both should have `[r30+0]=0x23`. Bit 28 still
|
||||
not set.
|
||||
|
||||
So **some code sets bit 28 on [r30+0] between sub_822F1AA8 entry and the
|
||||
loop check** in ours but not in canary.
|
||||
|
||||
Mem-watch on 0x40d09a40 (ours' controller VA) shows **zero guest writes** in
|
||||
my 50M-instruction parallel run. Possible reasons:
|
||||
- The setter writes from kernel/runtime code that mem-watch doesn't capture
|
||||
(kernel-host store, not guest JIT store)
|
||||
- The setter writes via a computed alias (different VA but same backing)
|
||||
- The bit IS set via a probe-quantum-elided JIT store
|
||||
|
||||
## Phase B classification
|
||||
|
||||
**Class 3a — state-divergence on the controller object**. The vtable
|
||||
identity is the same (round-37 confirmed `0x820A183C` in both). The
|
||||
controller object's bit 28 of `[+0]` evolves differently during the setup
|
||||
between sub_822F1AA8 entry and the loop check.
|
||||
|
||||
Class 4 (synthesis) is now LESS attractive: ours' main thread DOES reach
|
||||
sub_822F1AA8 with the right controller. We don't need to spawn the
|
||||
dispatcher — we need to PREVENT the main thread from exiting the loop.
|
||||
|
||||
## Pragmatic next step — JIT instrumentation to find bit-28 setter
|
||||
|
||||
Most direct diagnostic: add a JIT hook in xenia-cpu that, for guest stores
|
||||
issued from PCs in [0x822F1AA8, 0x822F1E10), captures the guest PC + the written
|
||||
value when the store would set bit 28 of the target word. This identifies the
|
||||
exact PC that sets the loop-exit bit.
|
||||
|
||||
Alternative: extend `--mem-watch` to also capture kernel-side stores by
|
||||
hooking the GuestMemory write path at the kernel-state level.
|
||||
|
||||
Even simpler: add a one-shot `--bit-watch=ADDR:MASK` cvar that fires when
|
||||
the value at ADDR has any bit in MASK transition from 0→1, regardless of
|
||||
who wrote it. This is the cleanest diagnostic for this exact pattern.
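
A minimal sketch of the 0→1 transition check such a watch would need, assuming a polling model in which the emulator feeds the watched word's current value to the watch at chosen points. `BitWatch`, `parse_spec`, and the `ADDR:MASK` hex syntax are hypothetical illustrations of the proposal, not existing xenia-rs code.

```rust
/// Hypothetical one-shot watch on a single 32-bit guest word.
struct BitWatch {
    addr: u32,
    mask: u32,
    last: u32,
    fired: bool,
}

/// Parse a hypothetical "ADDR:MASK" spec; both parts are hex, "0x" optional.
fn parse_spec(spec: &str) -> Option<(u32, u32)> {
    let (addr, mask) = spec.split_once(':')?;
    let hex = |s: &str| u32::from_str_radix(s.strip_prefix("0x").unwrap_or(s), 16).ok();
    Some((hex(addr)?, hex(mask)?))
}

impl BitWatch {
    fn new(addr: u32, mask: u32, initial: u32) -> Self {
        BitWatch { addr, mask, last: initial, fired: false }
    }

    /// Feed the current value of the watched word. Returns the bits inside
    /// `mask` that went 0 -> 1 since the previous observation, at most once.
    fn observe(&mut self, current: u32) -> Option<u32> {
        let rising = !self.last & current & self.mask;
        self.last = current;
        if rising != 0 && !self.fired {
            self.fired = true;
            Some(rising)
        } else {
            None
        }
    }
}
```

Because this version polls, a bit that is set and cleared again between two observations would go unnoticed; catching that case would require hooking the write path itself.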
|
||||
|
||||
## Fix shape (when bit-28 setter is identified)
|
||||
|
||||
If the bit-28 setter is inside the vtable[+0] dispatch chain at 0x822F1B4C
|
||||
(target sub_82173990), then the fix might be a state-init issue in the
|
||||
kernel/runtime.
|
||||
|
||||
If the bit-28 setter is inside the inner wait or one of the kernel calls
|
||||
(`bl 0x824AA8B0`, `bl 0x824AA330`), the fix might be a missing event signal
|
||||
or a wrong handle-state evolution.
|
||||
|
||||
If we can't identify the setter cleanly, the synthesis fallback is to
|
||||
**inject a kernel-side hook that clears bit 28 of [r30+0] on every entry to
|
||||
sub_822F1AA8's bit-check site (0x822F1BB0)**. Crude but should keep the
|
||||
main thread in the loop.
|
||||
|
||||
## Why this is a clearer wedge picture than rounds 22-33
|
||||
|
||||
Rounds 22-33 chased the audit-049 wedge from various angles. The diagnoses
|
||||
landed on different layers:
|
||||
- R22: "wrong cluster targeted" (cluster A vs B)
|
||||
- R26-30: "state-machine progression bug"
|
||||
- R32-33: "pool 3 starvation; bootstrap walk-back"
|
||||
|
||||
This round establishes the simplest possible framing:
|
||||
|
||||
> **Canary's main thread loops forever in a dispatcher; ours' main thread
|
||||
> exits the loop after one setup phase. The exit is gated by a single bit
|
||||
> on the controller's flag word.**
|
||||
|
||||
If bit 28 of `[controller+0]` could be permanently cleared, ours' main
|
||||
thread would stay in the loop, sub_821741C8 would dispatch, signals would
|
||||
flow, tid=13 would complete, draws would happen.
|
||||
@@ -0,0 +1,79 @@
|
||||
AUDIT-PC-PROBE pc=0x822f1aa8 tid=1 hw=0 cycle=6180796 lr=0x8216ee14 r3=0x40d09a40 r11=0x40111910 [r3+0]=0x00000021 [[r3+0]+24]=0x00000000 [r3+0x0C]=0x40541a40 [r3+0x30]=0x00000000
|
||||
AUDIT-PC-PROBE pc=0x822f1b38 tid=1 hw=0 cycle=6181181 lr=0x822f1b38 r3=0x00000001 r11=0x824b0000 [r3+0]=0x00000000 [[r3+0]+24]=0x00000000 [r3+0x0C]=0x00000000 [r3+0x30]=0x00000000
|
||||
|
||||
=== Final State ===
|
||||
PC: 0x824ac578
|
||||
LR: 0x824ac578
|
||||
CTR: 0x82153bf0
|
||||
CR: 0x24000028
|
||||
XER: CA=0 OV=0 SO=0
|
||||
r0 : 0x0000000082153bf0
|
||||
r1 : 0x00000000700ff6e0
|
||||
r2 : 0x0000000020000000
|
||||
r4 : 0x0000000000000001
|
||||
r7 : 0x0000000003a72328
|
||||
r8 : 0x0000000043b77284
|
||||
r9 : 0x0000000043b77328
|
||||
r10: 0x0000000000000001
|
||||
r11: 0x0000000000000103
|
||||
r12: 0x0000000082173c64
|
||||
r13: 0x000000007fff0000
|
||||
r18: 0x0000000040d09a7c
|
||||
r23: 0x00000000828f3844
|
||||
r26: 0x000000004024a4e0
|
||||
r27: 0x00000000820a17a8
|
||||
r31: 0x0000000000001070
|
||||
|
||||
=== Thread diagnostics ===
|
||||
hw=0 idx=0 tid=1 state=Blocked(WaitAny { handles: [4208], deadline: None }) pc=0x824ac578 lr=0x824ac578 sp=0x700ff6e0
|
||||
r0=0x82153bf0 r3=0x00000000 r4=0x00000001 r5=0x00000000 r6=0x00000000 r7=0x03a72328
|
||||
r8=0x43b77284 r9=0x43b77328 r10=0x00000001 r11=0x00000103 r12=0x82173c64 r13=0x7fff0000
|
||||
hw=0 idx=1 tid=11 state=Blocked(WaitAny { handles: [2190094916, 2190094880], deadline: None }) pc=0x824d2a94 lr=0x824d2a94 sp=0x71497d90
|
||||
r0=0x00000000 r3=0x00000000 r4=0x71497de0 r5=0x00000001 r6=0x00000003 r7=0x00000001
|
||||
r8=0x00000000 r9=0x00000000 r10=0x71497df0 r11=0x828a3244 r12=0xbcbcbcbc r13=0x4b9f1000
|
||||
hw=1 idx=0 tid=2 state=Blocked(WaitAny { handles: [2189887804], deadline: None }) pc=0x824a95f8 lr=0x824a95f8 sp=0x710ffd20
|
||||
r0=0x0000030c r3=0x00000000 r4=0x00000003 r5=0x00000001 r6=0x00000000 r7=0x00000000
|
||||
r8=0x00000001 r9=0x6f000000 r10=0x824a9178 r11=0x82870000 r12=0x824a94f0 r13=0x4acc3000
|
||||
hw=1 idx=1 tid=13 state=Blocked(WaitAny { handles: [4216], deadline: None }) pc=0x824ac578 lr=0x824ac578 sp=0x715a7a20
|
||||
r0=0x821511d0 r3=0x00000000 r4=0x00000001 r5=0x00000000 r6=0x00000000 r7=0x03a723d0
|
||||
r8=0x43b77334 r9=0x43b77334 r10=0x40541f80 r11=0x00000001 r12=0x821cb1e0 r13=0x4d1d4000
|
||||
hw=2 idx=0 tid=7 state=Blocked(WaitAny { handles: [1111821148], deadline: Some(42946672) }) pc=0x824cd4f4 lr=0x824cd4f4 sp=0x71187e60
|
||||
r0=0x00000000 r3=0x00000000 r4=0x00000003 r5=0x00000001 r6=0x00000000 r7=0x71187eb0
|
||||
r8=0x00000000 r9=0x00000000 r10=0x00000002 r11=0x00000002 r12=0xbcbcbcbc r13=0x4b1d6000
|
||||
hw=2 idx=1 tid=8 state=Blocked(WaitAny { handles: [4176, 4132], deadline: None }) pc=0x824ab214 lr=0x824ab214 sp=0x71287c90
|
||||
r0=0x00000000 r3=0x00000000 r4=0x71287cf0 r5=0x00000001 r6=0x00000001 r7=0x00000000
|
||||
r8=0x00000000 r9=0x00009030 r10=0x00000002 r11=0x00000020 r12=0x822f1ff0 r13=0x4b90a000
|
||||
hw=3 idx=0 tid=4 state=Blocked(WaitAny { handles: [4120], deadline: None }) pc=0x824ac578 lr=0x824ac578 sp=0x7112fb80
|
||||
r0=0x821511a0 r3=0x00000000 r4=0x00000001 r5=0x00000000 r6=0x00000000 r7=0x03a723d0
|
||||
r8=0x43b7732c r9=0x828f0000 r10=0x00000008 r11=0x00000000 r12=0x8245a660 r13=0x4adc6000
|
||||
hw=3 idx=1 tid=5 state=Blocked(WaitAny { handles: [4224], deadline: None }) pc=0x824ac578 lr=0x824ac578 sp=0x7116fbe0
|
||||
r0=0x821511a0 r3=0x00000000 r4=0x00000001 r5=0x00000000 r6=0x00000000 r7=0x03a723d0
|
||||
r8=0x43b7732c r9=0x828f0000 r10=0x00000001 r11=0x00000000 r12=0x82458b34 r13=0x4adc8000
|
||||
hw=4 idx=0 tid=9 state=Ready pc=0x824d1404 lr=0x824d22b4 sp=0x71387df0
|
||||
r0=0x00000000 r3=0x4250dedc r4=0x4250e040 r5=0x00000001 r6=0x00000000 r7=0x00000000
|
||||
r8=0x4b9ec000 r9=0x01010000 r10=0x01010000 r11=0x00000000 r12=0x824d22a8 r13=0x4b9ec000
|
||||
hw=5 idx=0 tid=3 state=Blocked(WaitAny { handles: [4112], deadline: None }) pc=0x824ac578 lr=0x824ac578 sp=0x7111fdf0
|
||||
r0=0x82153bf0 r3=0x00000000 r4=0x00000001 r5=0x00000000 r6=0x00000000 r7=0x00000a10
|
||||
r8=0x00000010 r9=0x00000000 r10=0x00009030 r11=0x00000000 r12=0x82181988 r13=0x4adc4000
|
||||
hw=5 idx=1 tid=6 state=Ready pc=0x824ab214 lr=0x824ab214 sp=0x7117fc60
|
||||
r0=0x821511a0 r3=0x00000001 r4=0x7117fcc0 r5=0x00000001 r6=0x00000001 r7=0x00000000
|
||||
r8=0x7117fcb0 r9=0x00009030 r10=0x00000002 r11=0x00000020 r12=0x82458d68 r13=0x4adca000
|
||||
hw=5 idx=2 tid=10 state=Ready pc=0x824d1404 lr=0x824d22b4 sp=0x71487e00
|
||||
r0=0x00000000 r3=0x4250dedc r4=0x4250e040 r5=0x00000001 r6=0x00000000 r7=0x00000000
|
||||
r8=0x4b9ee000 r9=0x01010000 r10=0x01010000 r11=0x00000000 r12=0x824d22a8 r13=0x4b9ee000
|
||||
hw=5 idx=3 tid=12 state=Ready pc=0x824aa6a4 lr=0x824aa6a4 sp=0x714a7da0
|
||||
r0=0x00000000 r3=0x000000ff r4=0x00000020 r5=0x714a7df4 r6=0x00000000 r7=0x00000000
|
||||
r8=0x00000000 r9=0x00000000 r10=0x00000000 r11=0x00000001 r12=0x8217898c r13=0x4d1d2000
|
||||
|
||||
-- Handle waiter lists --
|
||||
handle=0x00001018 Semaphore(0/2147483647) waiters(tid)=[4]
|
||||
handle=0x8287093c Event(sig=false, mr=false) waiters(tid)=[2]
|
||||
handle=0x00001070 Thread(id=13, exit=None) waiters(tid)=[1]
|
||||
handle=0x42450b5c Event(sig=false, mr=true) waiters(tid)=[7]
|
||||
handle=0x00001078 Event(sig=false, mr=false) waiters(tid)=[13]
|
||||
handle=0x00001080 Event(sig=false, mr=false) waiters(tid)=[5]
|
||||
handle=0x828a3244 Event(sig=false, mr=false) waiters(tid)=[11]
|
||||
handle=0x00001024 Semaphore(0/2147483647) waiters(tid)=[8]
|
||||
handle=0x828a3220 Event(sig=false, mr=true) waiters(tid)=[11]
|
||||
handle=0x00001010 Event(sig=false, mr=true) waiters(tid)=[3]
|
||||
handle=0x00001050 Event(sig=false, mr=true) waiters(tid)=[8]
|
||||
@@ -0,0 +1,127 @@
|
||||
# Phase C.1 — Validation refutes Phase A's bit-28 setter hypothesis
|
||||
|
||||
## TL;DR
|
||||
|
||||
Phase A claimed: "bit 28 of `[0x40d09a40]` (controller word) gets set in ours, causing sub_822F1AA8's dispatcher loop to exit early; candidate setter is `sub_821B55D8` at PC `0x821B5DA4`."
|
||||
|
||||
**Phase C.1 falsifies this in 4 sub-rounds:**
|
||||
|
||||
1. **`sub_821B55D8` is dead code** in both engines — its `XamInputSetState` wrapper `sub_824AA858` fires 0× in both.
|
||||
2. **`[0x40d09a40]` is never set to anything with bit 28** — `--dump-addr` at end of run shows `+0x00 = 0x00000021`, the entry value. Bit 28 is NEVER set.
|
||||
3. **The actual wedge is at the `bcctrl` at PC `0x822F1B4C`** (inside sub_822F1AA8 setup, BEFORE the dispatcher loop). tid=1 never reaches the loop top-check.
|
||||
4. **The bcctrl calls `sub_82173990`** (vtable[0] of the dispatcher singleton at `[0x828E1F08]`), which eventually waits for tid=13 to terminate. tid=13 wedges in the audit-049 silph::UImpl@GamePart_Title chain on handle `0x1078`.
|
||||
|
||||
The C.2 force-clear POC (the planned next step) would have **zero effect** because bit 28 is never set. Skipped per plan stopping criterion.
|
||||
|
||||
## Probe-fire counts (ours, 50M-instr parallel)
|
||||
|
||||
| PC | sub-round | fires | meaning |
|
||||
|---|---|---|---|
|
||||
| `0x821B55D8` (Phase A candidate fn entry) | 1 | **0** | function never reached → β/γ |
|
||||
| `0x821B5D98,DA0,DAC,D48` (loop BB heads) | 1 | **0** | function never reached |
|
||||
| `0x822F1AA8` (sub_822F1AA8 entry) | 2,3,4 | 2-3 | reached |
|
||||
| `0x822F1B38` (post-`bl 0x824AA8B0`) | 4 | 2 | reached |
|
||||
| `0x822F1B50` (post-`bcctrl`) | 4 | **0** | **bcctrl never returns** |
|
||||
| `0x822F1B60,B78,B80,BBC` (loop setup/top) | 3 | 0 | unreachable past bcctrl |
|
||||
| `0x822F1E10` (loop exit cleanup) | 2 | 0 | loop never entered, never exited |
|
||||
| `0x822F1E34` (post-thread-join) | 2 | 0 | never reached |
|
||||
| `0x82173990` (vtable[0] target) | 4 | 2 | called via bcctrl, r3=singleton (LR=0x822F1B50) |
|
||||
| `0x821748F0` (tid=13 entry) | 4 | 2 | tid=13 runs |
|
||||
| `0x821C4EB0` (silph::UImpl@GamePart_Title) | 4 | 2 | audit-009/049 reached on tid=13 |
|
||||
| `0x82457388,0x824574C0,0x82457408,0x82457490` (other oris candidates) | 2 | 0 | unreachable |
|
||||
|
||||
## Canary probe results
|
||||
|
||||
| PC | fires | meaning |
|
||||
|---|---|---|
|
||||
| `0x824AA858` (XamInputSetState wrapper) | **0** | sub_821B55D8 chain is dead code in CANARY too |
|
||||
| `0x822F1B50` (post-bcctrl, attempted) | **0** | canary's JitProlog only fires at function entries, so not directly testable; but per audit round-33 sub_821741C8 fires 471× in canary → bcctrl DOES return in canary |
|
||||
|
||||
## Critical evidence: `--dump-addr=0x40d09a40` at end of run
|
||||
|
||||
```
|
||||
addr=0x40d09a40
|
||||
+0x00: 00 00 00 21 00 00 00 01 42 44 df 00 40 54 1a 40
|
||||
^^^^^^^^^^^ ^^^^^^^^^^^
|
||||
+0x10: 40 54 1b 40 40 54 1b 80 40 54 1b c0 00 00 10 54
|
||||
+0x20: 00 00 00 00 40 24 a8 20 00 00 00 08 00 00 00 00
|
||||
```
|
||||
|
||||
- `[+0x00] = 0x00000021` ← bit 28 (mask 0x10000000) is NOT SET. Same value as at sub_822F1AA8 entry.
|
||||
- `[+0x1c] = 0x00001054` ← spawned init thread handle (= tid=8's thread handle, NOT 0x1070)
|
||||
- Thread state: tid=1 waits on handle `0x1070`, tid=13 waits on handle `0x1078`.
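
The field values above can be re-derived from the raw bytes. The sketch below copies the 48 dumped bytes verbatim and reads them as big-endian words (guest byte order); the helper name `read_be_u32` is illustrative and is not part of the tool that produced the dump.

```rust
/// First 0x30 bytes at 0x40d09a40, copied from the `--dump-addr` output.
const CONTROLLER_DUMP: [u8; 48] = [
    0x00, 0x00, 0x00, 0x21, 0x00, 0x00, 0x00, 0x01, 0x42, 0x44, 0xdf, 0x00, 0x40, 0x54, 0x1a, 0x40,
    0x40, 0x54, 0x1b, 0x40, 0x40, 0x54, 0x1b, 0x80, 0x40, 0x54, 0x1b, 0xc0, 0x00, 0x00, 0x10, 0x54,
    0x00, 0x00, 0x00, 0x00, 0x40, 0x24, 0xa8, 0x20, 0x00, 0x00, 0x00, 0x08, 0x00, 0x00, 0x00, 0x00,
];

/// Read a big-endian u32 at `offset`, or None if the read runs off the end.
fn read_be_u32(bytes: &[u8], offset: usize) -> Option<u32> {
    let word = bytes.get(offset..offset.checked_add(4)?)?;
    Some(u32::from_be_bytes([word[0], word[1], word[2], word[3]]))
}
```

Reading offset `0x00` gives `0x00000021` (bit 28 clear) and offset `0x1c` gives `0x00001054`, matching the two bullets above.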
|
||||
|
||||
Handle `0x1070` is **tid=13's thread handle** (per stderr: `ExCreateThread: tid=13 handle=0x1070 entry=0x821748f0 ctx=0x4024a840 suspended=true`). So tid=1's wait at the wedge point is a **thread-join on tid=13**, NOT a thread-join on the dispatcher init thread (tid=8, handle 0x1054).
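
The join relationship can be followed mechanically. This is a rough sketch under the assumption that each blocked thread waits on a single handle; `HandleKind` and `root_blocker` are hypothetical names and do not mirror the emulator's handle table.

```rust
use std::collections::HashMap;

/// Simplified handle classification for chain-following purposes.
#[derive(Debug, Clone, Copy, PartialEq)]
enum HandleKind {
    /// Waiting on this handle is a join on the given thread id.
    Thread(u32),
    /// Event or semaphore that some other code would have to signal.
    Event,
}

/// Follow thread-join edges from `start_tid` until a non-thread handle is
/// reached. Returns (blocked tid, handle) of the root blocker, or None if
/// the chain ends at a thread that is not waiting or at an unknown handle.
fn root_blocker(
    start_tid: u32,
    waits: &HashMap<u32, u32>,
    handles: &HashMap<u32, HandleKind>,
) -> Option<(u32, u32)> {
    let mut tid = start_tid;
    // Bound the walk so a cycle of joins cannot loop forever.
    for _ in 0..=waits.len() {
        let handle = *waits.get(&tid)?;
        match handles.get(&handle)? {
            HandleKind::Thread(next) => tid = *next,
            HandleKind::Event => return Some((tid, handle)),
        }
    }
    None
}
```

Fed with the final-state data (tid=1 → `0x1070` = thread 13, tid=13 → `0x1078` = event), the walk from tid=1 ends at `(13, 0x1078)`.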
|
||||
|
||||
## Wedge path (corrected)
|
||||
|
||||
```
|
||||
entry_point (sub_824AB748) [tid=1 main]
|
||||
└─ sub_8216EA68
|
||||
└─ sub_822F1AA8(controller=0x40d09a40) [LR=0x8216EE14]
|
||||
├─ ExCreateThread(entry=sub_822F1EE0, ctx=controller) [PC 0x822F1B08]
|
||||
│ ⇒ tid=8 spawn, handle=0x1054 (suspended)
|
||||
├─ bl 0x824AA8B0 (no-op probe) [PC 0x822F1B34]
|
||||
└─ bcctrl on vtable[+0] of [0x828E1F08] singleton [PC 0x822F1B4C]
|
||||
│
|
||||
└─ sub_82173990(r3=singleton) [r3=0x40ba9a80, vtable=0x40111910]
|
||||
└─ ... (768-byte function with ≥18 calls; calls sub_82448AA0, sub_824AA7A0,
|
||||
sub_82448BC8, sub_82448C50, sub_8216F218, sub_8217C850, sub_82178E50,
|
||||
sub_821835E0, ...)
|
||||
└─ ... → KeWaitForSingleObject INFINITE on handle 0x1070
|
||||
(= tid=13's thread handle, thread-join)
|
||||
⇒ WEDGE — tid=13 never exits
|
||||
|
||||
(Concurrently — spawned somewhere else, not from sub_822F1AA8:)
|
||||
[tid=13, spawn-handle=0x1070, ctx=0x4024a840]
|
||||
└─ sub_821748F0 (worker boilerplate, entry from ExCreateThread)
|
||||
├─ sub_82172798, sub_82172818
|
||||
└─ sub_821749C0
|
||||
└─ sub_821CF3F0
|
||||
└─ ... → sub_821C4EB0 (UImpl@GamePart_Title@silph) [audit-009/049!]
|
||||
└─ ... → sub_821CB030 (creates KEVENT at +0x128)
|
||||
⇒ KeWaitForSingleObject INFINITE on handle 0x1078
|
||||
⇒ WEDGE — handle 0x1078 is never signaled in ours
|
||||
```
|
||||
|
||||
## Why Phase A's hypothesis is wrong
|
||||
|
||||
Phase A:
|
||||
1. Disassembled sub_822F1AA8's body, observed the bit-28 loop-exit check at `0x822F1BB8` and end-of-iter check at `0x822F1E0C`.
|
||||
2. Mem-watch on `0x40d09a40` showed zero stores → inferred "the setter writes via some path mem-watch doesn't capture."
|
||||
3. DB-scanned `oris ?, ?, 0x1000` (49 sites), found `sub_821B55D8 + 0x821B5DA4` with pattern `bl sub_824AA858 ; if r3 == 0xAA: oris r11, 0x1000 ; stw`.
|
||||
4. Concluded `sub_821B55D8` was the setter.
|
||||
|
||||
What Phase A missed:
|
||||
- Mem-watch's 0-stores result was correct: **NO setter exists**. Bit 28 is never set in either engine. The mem-watch null-result was a hint that the bit-28 hypothesis itself was wrong, but Phase A interpreted it as "mem-watch misses something."
|
||||
- The disasm-based hypothesis was visually compelling (a loop iterating arrays and setting bit 28 when a kernel call returns 0xAA) but was never verified at runtime.
|
||||
- `sub_821B55D8` is itself dead code in both engines.
|
||||
|
||||
## Reading-error class #19: disasm-pattern-match without runtime verification
|
||||
|
||||
When scanning for a hypothesized signal source via DB pattern-match (`oris ?, ?, 0x1000`), the analyst must run a probe to verify that the suspected site is *reached* and that it *takes the suspected path* before declaring it the cause. Phase A bypassed both checks. Adding the single `--dump-addr=0x40d09a40` flag to the existing probe command in sub-round 2 revealed that the central assumption was wrong.
|
||||
|
||||
## Real divergence (handed to next session)
|
||||
|
||||
This is the **same wedge as audit-049/058/059**: tid=13 wedges in the silph::UImpl@GamePart_Title cluster on handle `0x1078`. tid=1 wedges on tid=13's thread-handle (`0x1070`) inside `sub_82173990`'s call chain.
|
||||
|
||||
`sub_82173990` is vtable[0] of the dispatcher singleton at `[0x828E1F08]`. It's a 768-byte function with ≥18 calls; the actual wait site is somewhere down its tree. To localize where in `sub_82173990` the wait happens, probe its BB heads + the `KeWaitForSingleObject` thunks (`sub_824AA330`, `sub_824AA708`).
|
||||
|
||||
The fix-shape is **NOT** "force-clear bit 28." The fix-shape is **"signal handle 0x1078 in the audit-049 cluster, or short-circuit tid=13's wait."** Round 22 (silph_synth.rs) attempted the cluster-A version of this. Cluster B (silph::UImpl) needs its own synthesis or a kernel-side signal of handle 0x1078.
|
||||
|
||||
## Phase C verdict
|
||||
|
||||
- C.1: 4 sub-rounds executed (within budget).
|
||||
- C.2: **NOT EXECUTED**. The POC would be a no-op because bit 28 is never set. Per the plan's stopping criterion, do not proceed to C.2 blind when C.1 refutes the diagnosis.
|
||||
- C.3: not applicable.
|
||||
- Branch state: no source changes. Audit artifacts only.
|
||||
|
||||
## Files in this directory
|
||||
|
||||
- `ours-c1-probe.log/stderr` — sub-round 1, probe at sub_821B55D8 BB heads (0 fires)
|
||||
- `ours-sr2-confirm-bit28.log/stderr` — sub-round 2, probe loop top/exit + dump-addr (bit 28 NEVER SET)
|
||||
- `ours-sr3-wait-trace.log/stderr` — sub-round 3, probe wait site + handle 0x1070 trace
|
||||
- `ours-sr4-bcctrl-trace.log/stderr` — sub-round 4, probe pre/post bcctrl + sub_82173990 entry + tid=13 entry (decisive)
|
||||
- canary side in `../round-C1-setter-validation-canary/`:
|
||||
- `canary-824AA858.log` — XamInputSetState wrapper fires 0× in canary too
|
||||
- `canary-822F1B50.log` — JitProlog can't probe at BB-internal PCs (function-entry-only)
|
||||
@@ -0,0 +1,144 @@
|
||||
# Phase D — Audit-049 Auto-Signal POC — FINDINGS
|
||||
|
||||
**Branch**: `iterate-2C/silph-ui-spawn-trace` (extends Phase C `481591f`)
|
||||
**Date**: 2026-06-11
|
||||
**Sub-rounds**: D2.SR1 → D2.SR4 (4/4 used)
|
||||
**Verdict**: **B — partial unwedge**
|
||||
|
||||
## Mission
|
||||
|
||||
Phase C diagnosed the audit-049 wedge as tid=13 (silph::UImpl@GamePart_Title) waiting INFINITE on a KEVENT created at `sub_821CB030+0x128` (`lr=0x821cb15c`, post-bl PC). The Phase D POC tests this diagnosis by hooking `NtCreateEvent` from that exact call site and auto-signaling the resulting handle after a configurable delay (`XENIA_SILPH_UI_AUTOSIGNAL_DELAY` instructions).
|
||||
|
||||
If tid=13 unblocks, the diagnosis is confirmed. If new wedges or new threads appear downstream, even better — that's actual game progression past the wedge.
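
The delayed-signal mechanism can be pictured as a small pending queue. This is only a sketch of the idea, assuming a monotonically increasing cycle counter; `AutoSignalQueue`, `schedule`, and `drain_due` are hypothetical names, not the POC's actual implementation.

```rust
/// One pending auto-signal (illustrative type).
#[derive(Debug, PartialEq)]
struct PendingSignal {
    handle: u32,
    due_cycle: u64,
}

/// Queue of handles to signal once the cycle counter reaches their due time.
#[derive(Default)]
struct AutoSignalQueue {
    pending: Vec<PendingSignal>,
}

impl AutoSignalQueue {
    /// Register `handle` to be signaled `delay` cycles after `now`.
    /// Returns the cycle at which it becomes due.
    fn schedule(&mut self, handle: u32, now: u64, delay: u64) -> u64 {
        let due_cycle = now.saturating_add(delay);
        self.pending.push(PendingSignal { handle, due_cycle });
        due_cycle
    }

    /// Remove and return every handle whose due cycle has been reached.
    /// Nothing fires unless the execution loop actually calls this.
    fn drain_due(&mut self, now: u64) -> Vec<u32> {
        let mut fired = Vec::new();
        self.pending.retain(|p| {
            if p.due_cycle <= now {
                fired.push(p.handle);
                false
            } else {
                true
            }
        });
        fired
    }
}
```

Scheduling handle `0x1078` at `now=0` with `delay=10000` makes it due at cycle 10000; the caller would then signal each handle returned by `drain_due`.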
|
||||
|
||||
## Result summary
|
||||
|
||||
| Symptom | SR2/SR3 baseline | SR4 (POC firing) |
|
||||
|---|---|---|
|
||||
| `silph autosignal: scheduled handle=0x1078 caller_lr=0x821cb15c` | yes (SR2/SR3) | yes |
|
||||
| `silph autosignal: firing handle=0x1078` | NO | **yes (cycle 16326209)** |
|
||||
| handle 0x1078 final | `signaled=false waiters=1 <NO_SIGNALS_DESPITE_WAITS>` | `signal_attempts=1 waiters=0` |
|
||||
| tid=13 final state | `Blocked(WaitAny[0x1078])` | **`Ready` pc=0x824a9108** |
|
||||
| tid=1 final state | `Blocked(WaitAny[0x1070])` thread-join | `Blocked(WaitAny[0x1070])` (tid=13 not yet exited) |
|
||||
| ExCreateThread total | 10 | **12 (+tid=14, +tid=15)** |
|
||||
| New downstream wedges | none past 0x1078 | **0x1084 (Event/Auto), 0x1088 (Event/Manual)** |
|
||||
| `cxx_throw` runtime_error decoded | none | **yes, stack depth 6, top L0=0x82612b50 → L4=sub_82450B60+0x1A8 → L6=sub_82450a50** |
|
||||
| VdSwap | 1 | 1 |
|
||||
| gpu.interrupt.delivered{source=0} | 6393 | 4539 (different trajectory, no draws) |
|
||||
|
||||
**Conclusion**: tid=13 unwedged cleanly from the audit-049 wait, spawned two follow-on threads (tid=14 entry=`silph` ctx=`0x40929c00`, tid=15 a worker), and progressed deep enough into the silph::UImpl state machine to throw a `runtime_error` from sub_82450a50 → sub_82450B60+0x1A8 (the dispatcher cluster from round 26). The auto-signal **is not** the proper signaler: it lets tid=13 proceed, but the state-machine invariants that the real (missing) signaler would have established are not in place, so the dispatcher trips on a "not-registered instance" lookup.
|
||||
|
||||
This is a **clean confirmation** of the Phase C diagnosis: the wedge handle, the wait site, and the LR filter are all correct. The fix shape is:
|
||||
- Either: synthesize the missing signaler properly (cluster-B silph_ui_synth.rs analogue from R33's deferred plan)
|
||||
- Or: track what the auto-signal needed to write into the work-item state (`[+8]` field per R26) BEFORE signaling, so the dispatcher's BST lookup succeeds
|
||||
|
||||
## Sub-round detail

### D2.SR1: initial run, hook never fires (wrong LR filter)

The filter checked `creator_lr ∈ [0x821CB15C, 0x821CB160]` against `ctx.lr` at `nt_create_event` entry. But `ctx.lr` is the **thunk wrapper return slot** (`0x824a9f6c`), not the guest caller's post-bl PC. Confirmed via the handle-audit `created stack` dump: frame 0 lr=`0x824a9f6c`, frame 1 lr=`0x821cb15c`. The guest caller's LR lives one frame up the PPC EABI back-chain.

Diagnosis classification: **D (filter mismatch)**. Reading-error class #20 (new).

### D2.SR2: frame-1-LR fix; hook schedules, never fires

Refactored `maybe_register_silph_autosignal` to take `(ctx, mem)`, walk the back-chain via the existing `walk_guest_back_chain` (1 step), and match the saved LR. The hook now schedules:

```
silph autosignal: scheduled handle=0x1078 caller_lr=0x821cb15c for cycle 10000 (now=0, delay=10000)
```

But no "firing" log appears, and tid=13 stays Blocked. Classification: **D (drain site never reached)**.

### D2.SR3: diagnostic added; confirms drain site never visited

Added a one-shot info-level "tick (first visit, none due)" log inside `fire_due_silph_autosignals` for the case where pending is non-empty but nothing is due. Re-ran. **The tick diagnostic never fired either**, proving the function isn't being called at all in `--parallel` mode.

Root cause: `--parallel` dispatches to `run_execution_parallel` (line 2928 of main.rs), which has its own outer loop at line 3186. My Phase D wiring only touched the lockstep path at line 2763. Classification: **D (wrong code path wired)**.

### D2.SR4: parallel-path wiring added; hook fires; tid=13 unblocks

Added the same `set_now_cycle_hint` + `fire_due_silph_autosignals` calls inside the parallel outer loop, right after `coord_pre_round` (and under the same `kernel_arc` guard, so no extra locking). Re-built, re-ran.

Now all three log lines appear:

```
silph autosignal: scheduled handle=0x1078 caller_lr=0x821cb15c for cycle 16326202 (now=16316202, delay=10000)
silph autosignal: tick (first visit, none due) now=16316213 pending=1 first_deadline=16326202
silph autosignal: firing handle=0x1078 prev_signaled=Some(false) at cycle 16326209
```

`now=16316202` at schedule time confirms `set_now_cycle_hint` is wired through correctly (the parallel path was simply never visited in SR2/SR3). Fire at cycle 16326209 = deadline 16326202 + 7-cycle scheduler granularity. Diagnostic classification: **B (partial unwedge: new waits and cxx_throw downstream)**.

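The schedule/tick/fire sequence in the SR4 log can be replayed with a small model. This is a sketch under stated assumptions, not the code in `state.rs`: the class `AutoSignalQueue` and its `schedule` and `fire_due` methods are hypothetical names, and only the arithmetic (deadline = now + delay, fire on the first visit where now >= deadline) is taken from the log lines above.

```python
from dataclasses import dataclass, field


@dataclass
class AutoSignalQueue:
    """Hypothetical model of the pending auto-signal list."""
    delay: int
    now: int = 0                                  # last cycle hint seen
    pending: list = field(default_factory=list)   # (deadline, handle)
    fired: list = field(default_factory=list)     # (fire cycle, handle)

    def set_now_cycle_hint(self, cycle: int) -> None:
        self.now = cycle

    def schedule(self, handle: int) -> int:
        # Deadline is relative to the last hint; with no hint, now stays 0.
        deadline = self.now + self.delay
        self.pending.append((deadline, handle))
        return deadline

    def fire_due(self) -> list:
        # Fire everything whose deadline has passed at this visit.
        due = [h for d, h in self.pending if d <= self.now]
        self.pending = [(d, h) for d, h in self.pending if d > self.now]
        self.fired.extend((self.now, h) for h in due)
        return due


def replay_sr4() -> AutoSignalQueue:
    q = AutoSignalQueue(delay=10_000)
    q.set_now_cycle_hint(16_316_202)
    q.schedule(0x1078)                # deadline 16326202
    q.set_now_cycle_hint(16_316_213)
    q.fire_due()                      # first visit, none due
    q.set_now_cycle_hint(16_326_209)
    q.fire_due()                      # first visit at or past the deadline
    return q
```

In this model the SR2 symptom falls out directly: without a cycle hint, `AutoSignalQueue(delay=10_000).schedule(0x1078)` returns 10000, matching the `for cycle 10000 (now=0, delay=10000)` log line, and nothing fires if `fire_due` is never visited.
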
## Code shape

POC is ~70 LOC across four files, all env-gated. Default off.

| File | Change | Lines |
|---|---|---|
| `crates/xenia-cpu/src/scheduler.rs` | `GuestThread.start_entry/start_context` fields; `spawn()` populates; `current_thread_entry_and_ctx()` helper | +18 |
| `crates/xenia-kernel/src/state.rs` | `AutoSignalPending` struct; `silph_autosignal_*` fields; `set_now_cycle_hint`, `maybe_register_silph_autosignal`, `fire_due_silph_autosignals` methods | +95 |
| `crates/xenia-kernel/src/exports.rs` | Hook in `nt_create_event` | +3 |
| `crates/xenia-app/src/main.rs` | Fire-site wiring in lockstep loop (line 2788) **and** parallel loop (line 3215) | +12 |

Tests stay green at **655/655**.

## Reading-error class #20 (new)

**`ctx.lr` at kernel export entry ≠ guest caller's post-bl PC.** When a guest `bl` calls an export thunk, the thunk wrapper has its own frame between the guest caller and the export body. At export-body entry, `ctx.lr` holds the *wrapper's* return slot, not the guest caller's post-bl PC.

To match a specific guest call site by LR, the export must walk one step up the back-chain (`walk_guest_back_chain(ctx.gpr[1], ctx.lr, mem, 2)`) and use `frames[1].lr`.

SR1 burned one full sub-round on this. Detect early in future POCs by comparing `ctx.lr` against the handle-audit's `created stack` frame dump for a known-good event (e.g. one created from a labelled site).

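The one-step back-chain walk can be sketched as follows. This is an illustration only: `walk_back_chain` is a hypothetical Python stand-in, not the repository's `walk_guest_back_chain`, and the LR save slot (`lr_save_offset`, defaulting to 8 bytes below the caller's stack pointer) is an assumption about the guest ABI. The memory contents are synthetic, seeded with the two frames reported by the handle-audit dump.

```python
LR_FILTER = (0x821CB15C, 0x821CB160)  # range used by the POC filter


def walk_back_chain(sp, lr, read_u32, max_frames, lr_save_offset=-8):
    """Return up to max_frames (fp, lr) pairs, starting with the live frame.

    ASSUMPTION: word 0 of each frame is the back-chain pointer, and the
    return address is saved at caller_sp + lr_save_offset.
    """
    frames = [(sp, lr)]
    while len(frames) < max_frames:
        caller_sp = read_u32(sp)
        if caller_sp == 0:  # end of chain
            break
        frames.append((caller_sp, read_u32(caller_sp + lr_save_offset)))
        sp = caller_sp
    return frames


def in_filter(lr):
    lo, hi = LR_FILTER
    return lo <= lr <= hi


# Synthetic guest memory holding just enough words for two frames.
mem = {
    0x715A7A10: 0x715A7AA0,       # frame 0 back-chain -> frame 1
    0x715A7AA0 - 8: 0x821CB15C,   # saved LR for the guest call site
}
frames = walk_back_chain(0x715A7A10, 0x824A9F6C, lambda a: mem.get(a, 0), 2)
```

With this data, `in_filter(frames[0][1])` is false (the thunk wrapper's return slot) and `in_filter(frames[1][1])` is true, which is the SR1 mismatch and the SR2 fix in miniature.
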
## Reading-error class #21 (new)

**`--parallel` and lockstep have separate outer loops in main.rs.** They share `coord_pre_round` (carved out exactly for this reason), but anything wired adjacent to that call site only takes effect on the path it's wired on. Lockstep is `run_execution` (line 2706, outer loop at 2763). Parallel is `run_execution_parallel` (line 2928, outer loop at 3186).

Per-round hooks must be wired into **both** paths, regardless of which run mode they were first written for. SR2/SR3 burned two sub-rounds on this.

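One way to catch this class of error early is a check that drives both outer loops against a recording stub and confirms the per-round hook was visited in each. The sketch below is hypothetical throughout: `FakeKernel`, `per_round_hooks`, `run_lockstep`, and `run_parallel` are illustrative names, not the functions in main.rs.

```python
class FakeKernel:
    """Hypothetical stub that only records per-round hook visits."""

    def __init__(self):
        self.hook_cycles = []

    def per_round_hooks(self, cycle):
        # In the POC this would be the cycle hint plus the auto-signal drain.
        self.hook_cycles.append(cycle)


def run_lockstep(kernel, cycles):
    for cycle in cycles:
        kernel.per_round_hooks(cycle)  # wired right after the shared pre-round step
        # ... step guest threads in lockstep ...


def run_parallel(kernel, cycles):
    for cycle in cycles:
        kernel.per_round_hooks(cycle)  # omitting this call reproduces SR2/SR3
        # ... dispatch guest threads to workers ...


def hooks_visited(run_loop, rounds=3):
    """Run one outer loop on a fresh stub and return the cycles it saw."""
    kernel = FakeKernel()
    run_loop(kernel, range(rounds))
    return kernel.hook_cycles
```

Routing both loops through a single hook entry point keeps the two call sites trivially comparable, so a hook added on one path is easy to spot as missing on the other.
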
## Files modified + LR mapping (for follow-up sessions)

**Wedge handle creation** (confirmed by handle-audit dump):

```
created cycle=0 tid=13 lr=0x824a9f6c [src=NtCreateEvent thunk return]
  created stack (6 frames):
    [ 0] fp=0x715a7a10 lr=0x824a9f6c   ← ctx.lr at nt_create_event
    [ 1] fp=0x715a7aa0 lr=0x821cb15c   ← guest caller's post-bl PC (filter on this)
    [ 2] fp=0x715a7bd0 lr=0x821cbae0   ← sub_821CBA08 frame
    [ 3] fp=0x715a7cd0 lr=0x821cc454   ← sub_821CC3F8 frame
    [ 4] fp=0x715a7d60 lr=0x821c4f18   ← sub_821C4EB0 frame (silph::UImpl@GamePart_Title)
    [ 5] fp=0x715a7e00 lr=0x82174a80   ← sub_821748F0 trampoline frame
```

**Downstream cxx_throw stack** (after auto-signal fires, tid=5 throws runtime_error):

```
  L0  lr=0x82612b50   std::exception throw path
  L1  lr=0x825f2444
  L2  lr=0x824547e8
  L3  lr=0x82451418
  L4  lr=0x82450d08   ← sub_82450B60+0x1A8 (dispatcher, audit-059 R26)
  L5  lr=0x82450b34
  L6  lr=0x82450a50   ← sub_82450a50 (worker dispatch)

  cxx_throw runtime_error decoded magic=0x19930520
  cxx_throw BST ceil search candidate_key=0x828e2b2c match_found=false
  cxx_throw lhs (not-registered instance) lhs=0x715a7af0
```

This confirms the dispatcher reached audit-049 territory (R26's `sub_82450B60+0x1A8` PC `0x82450D08`), looked up a runtime instance in its BST keyed by VA, and the instance was never registered. **The auto-signal bypassed an upstream registration step** the real signaler would have driven.

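The `BST ceil search ... match_found=false` line can be illustrated with a sorted-key lookup. This is a sketch under assumptions: a sorted Python list stands in for the guest's BST, "ceil" is taken to mean the smallest key greater than or equal to the candidate, and every registry key other than the logged `candidate_key` is made up for the example.

```python
import bisect


def ceil_lookup(sorted_keys, candidate):
    """Return (ceil_key, match_found) for candidate in a sorted key list.

    ceil_key is the smallest key >= candidate, or None if there is none.
    match_found is True only when ceil_key equals the candidate exactly.
    """
    i = bisect.bisect_left(sorted_keys, candidate)
    if i == len(sorted_keys):
        return None, False
    key = sorted_keys[i]
    return key, key == candidate


CANDIDATE = 0x828E2B2C                  # candidate_key from the log
registry = [0x828E2A00, 0x828E2C40]     # made-up neighbours, instance missing

before = ceil_lookup(registry, CANDIDATE)   # ceil exists, but no exact match
bisect.insort(registry, CANDIDATE)          # what a registration step would add
after = ceil_lookup(registry, CANDIDATE)
```

In this model the lookup only succeeds once the key has been inserted, which is the registration step the auto-signal skips.
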
## Recommendation

Ship the POC env-gated (default off; no behavior change unless opted in). The verdict-B success makes it a useful diagnostic flag for future audit-049 work: investigations can set `XENIA_SILPH_UI_AUTOSIGNAL_DELAY=10000` to skip the wedge and probe downstream behavior without first writing the proper signaler.

The long-term fix path remains the R33 silph_ui_synth.rs analogue: synthesize the missing signaler plus its precondition state (BST instance registration at `0x715a7af0`-equivalent, work-item state `[+8]` per R26). The auto-signal POC is **not** the final fix: it confirms the diagnosis but does not honor the dispatcher's BST registry invariant.

## Artifacts

- `poc-sr1.log`, `poc-sr1.stderr`: initial run, filter mismatch (D)
- `poc-sr2.log`, `poc-sr2.stderr`: frame-1-LR fix, no fire (D)
- `poc-sr3.log`, `poc-sr3.stderr`: diagnostic added, no fire (D, parallel path unwired)
- `poc-sr4.log`, `poc-sr4.stderr`: parallel-path wired, **fires + partial unwedge (B)**

All `.log`/`.stderr` files are `.gitignore`d; this `FINDINGS.md` is the only artifact-side commit.

@@ -0,0 +1,200 @@
|
||||
0x82450b60: lwz r18, 9792(r31)
|
||||
0x82450b64: lwz r16, 13880(r14)
|
||||
0x82450b68: mflr r12
|
||||
0x82450b6c: bl 0x825F0F74
|
||||
0x82450b70: subi r31, r1, 176
|
||||
0x82450b74: stwu r1, -176(r1)
|
||||
0x82450b78: mr r29, r4
|
||||
0x82450b7c: mr r27, r3
|
||||
0x82450b80: cmpwi cr6, r29, 5
|
||||
0x82450b84: bne cr6, 0x82450B94
|
||||
0x82450b88: addi r28, r27, 196
|
||||
0x82450b8c: addi r26, r27, 28
|
||||
0x82450b90: b 0x82450BAC
|
||||
0x82450b94: slwi r11, r29, 2
|
||||
0x82450b98: mr r26, r27
|
||||
0x82450b9c: add r11, r29, r11
|
||||
0x82450ba0: slwi r11, r11, 2
|
||||
0x82450ba4: add r11, r11, r27
|
||||
0x82450ba8: addi r28, r11, 96
|
||||
0x82450bac: addi r23, r27, 56
|
||||
0x82450bb0: mr r3, r23
|
||||
0x82450bb4: stw r23, 84(r31)
|
||||
0x82450bb8: bl 0x8284DCFC
|
||||
0x82450bbc: mr r3, r26
|
||||
0x82450bc0: bl 0x8284DCFC
|
||||
0x82450bc4: lwz r7, 16(r28)
|
||||
0x82450bc8: cntlzw r11, r7
|
||||
0x82450bcc: extrwi r11, r11, 1, 26
|
||||
0x82450bd0: cmplwi cr6, r11, 0x0
|
||||
0x82450bd4: beq cr6, 0x82450BEC
|
||||
0x82450bd8: mr r3, r26
|
||||
0x82450bdc: bl 0x8284DD0C
|
||||
0x82450be0: mr r3, r23
|
||||
0x82450be4: bl 0x8284DD0C
|
||||
0x82450be8: b 0x82450EE8
|
||||
0x82450bec: lwz r11, 12(r28)
|
||||
0x82450bf0: lwz r9, 8(r28)
|
||||
0x82450bf4: srwi r10, r11, 2
|
||||
0x82450bf8: clrlwi r8, r11, 30
|
||||
0x82450bfc: cmplw cr6, r9, r10
|
||||
0x82450c00: bgt cr6, 0x82450C08
|
||||
0x82450c04: sub r10, r10, r9
|
||||
0x82450c08: lwz r9, 4(r28)
|
||||
0x82450c0c: slwi r10, r10, 2
|
||||
0x82450c10: slwi r8, r8, 2
|
||||
0x82450c14: lwz r6, 8(r28)
|
||||
0x82450c18: addi r11, r11, 1
|
||||
0x82450c1c: slwi r6, r6, 2
|
||||
0x82450c20: li r24, 0
|
||||
0x82450c24: lwzx r10, r10, r9
|
||||
0x82450c28: cmplw cr6, r6, r11
|
||||
0x82450c2c: lwzx r30, r10, r8
|
||||
0x82450c30: stw r11, 12(r28)
|
||||
0x82450c34: stw r30, 80(r31)
|
||||
0x82450c38: bgt cr6, 0x82450C40
|
||||
0x82450c3c: stw r24, 12(r28)
|
||||
0x82450c40: subic. r11, r7, 1
|
||||
0x82450c44: stw r11, 16(r28)
|
||||
0x82450c48: bne 0x82450C50
|
||||
0x82450c4c: stw r24, 12(r28)
|
||||
0x82450c50: addi r25, r27, 28
|
||||
0x82450c54: mr r3, r25
|
||||
0x82450c58: bl 0x8284DCFC
|
||||
0x82450c5c: mr r3, r25
|
||||
0x82450c60: stw r30, 216(r27)
|
||||
0x82450c64: bl 0x8284DD0C
|
||||
0x82450c68: mr r3, r26
|
||||
0x82450c6c: bl 0x8284DD0C
|
||||
0x82450c70: lwz r11, 28(r30)
|
||||
0x82450c74: clrlwi r11, r11, 31
|
||||
0x82450c78: cmplwi cr6, r11, 0x0
|
||||
0x82450c7c: bne cr6, 0x82450D30
|
||||
0x82450c80: lwz r11, 8(r30)
|
||||
0x82450c84: cmplwi cr6, r11, 0x1
|
||||
0x82450c88: blt cr6, 0x82450CE4
|
||||
0x82450c8c: bne cr6, 0x82450D3C
|
||||
0x82450c90: lwz r11, 28(r30)
|
||||
0x82450c94: rlwinm r11, r11, 0, 29, 29
|
||||
0x82450c98: cmplwi cr6, r11, 0x0
|
||||
0x82450c9c: beq cr6, 0x82450CB0
|
||||
0x82450ca0: mr r4, r30
|
||||
0x82450ca4: mr r3, r27
|
||||
0x82450ca8: bl 0x824510E0
|
||||
0x82450cac: b 0x82450CBC
|
||||
0x82450cb0: mr r4, r30
|
||||
0x82450cb4: mr r3, r27
|
||||
0x82450cb8: bl 0x824517B0
|
||||
0x82450cbc: stw r29, 220(r27)
|
||||
0x82450cc0: bl 0x824AA830
|
||||
0x82450cc4: mr r11, r3
|
||||
0x82450cc8: lwz r3, 92(r27)
|
||||
0x82450ccc: li r5, 0
|
||||
0x82450cd0: addi r11, r11, 66
|
||||
0x82450cd4: li r4, 1
|
||||
0x82450cd8: stw r11, 224(r27)
|
||||
0x82450cdc: bl 0x824AB158
|
||||
0x82450ce0: b 0x82450D3C
|
||||
0x82450ce4: lwz r11, 28(r30)
|
||||
0x82450ce8: mr r4, r30
|
||||
0x82450cec: mr r3, r27
|
||||
0x82450cf0: rlwinm r11, r11, 0, 29, 29
|
||||
0x82450cf4: cmplwi cr6, r11, 0x0
|
||||
0x82450cf8: beq cr6, 0x82450D04
|
||||
0x82450cfc: bl 0x82450F68
|
||||
0x82450d00: b 0x82450D08
|
||||
0x82450d04: bl 0x82451238
|
||||
0x82450d08: stw r29, 220(r27)
|
||||
0x82450d0c: bl 0x824AA830
|
||||
0x82450d10: mr r11, r3
|
||||
0x82450d14: lwz r3, 92(r27)
|
||||
0x82450d18: li r5, 0
|
||||
0x82450d1c: addi r11, r11, 66
|
||||
0x82450d20: li r4, 1
|
||||
0x82450d24: stw r11, 224(r27)
|
||||
0x82450d28: bl 0x824AB158
|
||||
0x82450d2c: b 0x82450D3C
|
||||
0x82450d30: lwz r11, 28(r30)
|
||||
0x82450d34: ori r11, r11, 0x2
|
||||
0x82450d38: stw r11, 28(r30)
|
||||
0x82450d3c: lwz r11, 8(r30)
|
||||
0x82450d40: mr r29, r24
|
||||
0x82450d44: cmpwi cr6, r11, 2
|
||||
0x82450d48: blt cr6, 0x82450E08
|
||||
0x82450d4c: cmpwi cr6, r11, 3
|
||||
0x82450d50: ble cr6, 0x82450DA0
|
||||
0x82450d54: cmpwi cr6, r11, 4
|
||||
0x82450d58: bne cr6, 0x82450E08
|
||||
0x82450d5c: lwz r11, 28(r30)
|
||||
0x82450d60: rlwinm r11, r11, 0, 29, 29
|
||||
0x82450d64: cmplwi cr6, r11, 0x0
|
||||
0x82450d68: bne cr6, 0x82450D98
|
||||
0x82450d6c: lwz r29, 36(r30)
|
||||
0x82450d70: mr r3, r29
|
||||
0x82450d74: lwz r11, 0(r29)
|
||||
0x82450d78: lwz r11, 4(r11)
|
||||
0x82450d7c: mtctr r11
|
||||
0x82450d80: bctrl
|
||||
0x82450d84: clrlwi r11, r3, 24
|
||||
0x82450d88: cmplwi cr6, r11, 0x0
|
||||
0x82450d8c: beq cr6, 0x82450D98
|
||||
0x82450d90: mr r3, r29
|
||||
0x82450d94: bl 0x8244FB38
|
||||
0x82450d98: li r29, 1
|
||||
0x82450d9c: b 0x82450E28
|
||||
0x82450da0: addi r3, r30, 40
|
||||
0x82450da4: bl 0x82451DB8
|
||||
0x82450da8: lwz r11, 32(r30)
|
||||
0x82450dac: cmplwi cr6, r11, 0x0
|
||||
0x82450db0: beq cr6, 0x82450DCC
|
||||
0x82450db4: rlwinm r11, r11, 0, 0, 31
|
||||
0x82450db8: lwz r10, 4(r30)
|
||||
0x82450dbc: lwz r11, 4(r11)
|
||||
0x82450dc0: cmplw cr6, r10, r11
|
||||
0x82450dc4: li r11, 1
|
||||
0x82450dc8: beq cr6, 0x82450DD0
|
||||
0x82450dcc: mr r11, r24
|
||||
0x82450dd0: clrlwi r11, r11, 24
|
||||
0x82450dd4: cmplwi cr6, r11, 0x0
|
||||
0x82450dd8: beq cr6, 0x82450E00
|
||||
0x82450ddc: lwz r4, 8(r30)
|
||||
0x82450de0: lwz r5, 0(r30)
|
||||
0x82450de4: lwz r3, 32(r30)
|
||||
0x82450de8: cmpwi cr6, r4, 1
|
||||
0x82450dec: ble cr6, 0x82450DFC
|
||||
0x82450df0: bl 0x8245D9D8
|
||||
0x82450df4: li r29, 1
|
||||
0x82450df8: b 0x82450E28
|
||||
0x82450dfc: stw r4, 8(r3)
|
||||
0x82450e00: li r29, 1
|
||||
0x82450e04: b 0x82450E28
|
||||
0x82450e08: mr r3, r26
|
||||
0x82450e0c: stw r26, 88(r31)
|
||||
0x82450e10: bl 0x8284DCFC
|
||||
0x82450e14: addi r4, r31, 80
|
||||
0x82450e18: mr r3, r28
|
||||
0x82450e1c: bl 0x823232C0
|
||||
0x82450e20: mr r3, r26
|
||||
0x82450e24: bl 0x8284DD0C
|
||||
0x82450e28: clrlwi r11, r29, 24
|
||||
0x82450e2c: cmplwi cr6, r11, 0x0
|
||||
0x82450e30: beq cr6, 0x82450ECC
|
||||
0x82450e34: lwz r11, 28(r30)
|
||||
0x82450e38: rlwinm r11, r11, 0, 30, 30
|
||||
0x82450e3c: cmplwi cr6, r11, 0x0
|
||||
0x82450e40: beq cr6, 0x82450E68
|
||||
0x82450e44: mr r3, r26
|
||||
0x82450e48: stw r26, 88(r31)
|
||||
0x82450e4c: bl 0x8284DCFC
|
||||
0x82450e50: addi r4, r31, 80
|
||||
0x82450e54: mr r3, r28
|
||||
0x82450e58: bl 0x823232C0
|
||||
0x82450e5c: mr r3, r26
|
||||
0x82450e60: bl 0x8284DD0C
|
||||
0x82450e64: b 0x82450ECC
|
||||
0x82450e68: lwz r11, 40(r30)
|
||||
0x82450e6c: cmplwi cr6, r11, 0x0
|
||||
0x82450e70: beq cr6, 0x82450EA4
|
||||
0x82450e74: rlwinm r3, r11, 0, 0, 31
|
||||
0x82450e78: bl 0x82458A70
|
||||
0x82450e7c: lwz r29, 40(r30)
|
||||
@@ -0,0 +1,80 @@
|
||||
0x82451238: mflr r12
|
||||
0x8245123c: li r0, 0
|
||||
0x82451240: stw r0, 4(r1)
|
||||
0x82451244: bl 0x825F0F80
|
||||
0x82451248: subi r31, r1, 160
|
||||
0x8245124c: stwu r1, -160(r1)
|
||||
0x82451250: mr r30, r4
|
||||
0x82451254: li r9, 1
|
||||
0x82451258: lwz r10, 32(r30)
|
||||
0x8245125c: stw r30, 188(r31)
|
||||
0x82451260: stw r9, 8(r30)
|
||||
0x82451264: cmplwi cr6, r10, 0x0
|
||||
0x82451268: beq cr6, 0x82451288
|
||||
0x8245126c: lwz r11, 4(r30)
|
||||
0x82451270: lwz r8, 4(r10)
|
||||
0x82451274: cmplw cr6, r11, r8
|
||||
0x82451278: bne cr6, 0x82451288
|
||||
0x8245127c: mr r11, r9
|
||||
0x82451280: li r26, 0
|
||||
0x82451284: b 0x82451290
|
||||
0x82451288: li r26, 0
|
||||
0x8245128c: mr r11, r26
|
||||
0x82451290: clrlwi r11, r11, 24
|
||||
0x82451294: cmplwi cr6, r11, 0x0
|
||||
0x82451298: beq cr6, 0x824512A0
|
||||
0x8245129c: stw r9, 8(r10)
|
||||
0x824512a0: lwz r3, 36(r30)
|
||||
0x824512a4: lwz r11, 0(r3)
|
||||
0x824512a8: lwz r11, 32(r11)
|
||||
0x824512ac: mtctr r11
|
||||
0x824512b0: bctrl
|
||||
0x824512b4: mr r27, r3
|
||||
0x824512b8: stw r26, 84(r31)
|
||||
0x824512bc: stw r27, 96(r31)
|
||||
0x824512c0: bl 0x82454498
|
||||
0x824512c4: addi r4, r31, 84
|
||||
0x824512c8: bl 0x82454580
|
||||
0x824512cc: stw r26, 92(r31)
|
||||
0x824512d0: addi r11, r27, 2047
|
||||
0x824512d4: lis r10, 0x2
|
||||
0x824512d8: clrrwi r11, r11, 11
|
||||
0x824512dc: cmplw cr6, r11, r10
|
||||
0x824512e0: stw r11, 100(r31)
|
||||
0x824512e4: ble cr6, 0x824512F4
|
||||
0x824512e8: lis r11, 0x8207
|
||||
0x824512ec: addi r11, r11, 6724
|
||||
0x824512f0: b 0x824512F8
|
||||
0x824512f4: addi r11, r31, 100
|
||||
0x824512f8: addi r3, r31, 84
|
||||
0x824512fc: lwz r4, 0(r11)
|
||||
0x82451300: bl 0x82454B08
|
||||
0x82451304: mr r8, r8
|
||||
0x82451308: mr r28, r3
|
||||
0x8245130c: stw r28, 92(r31)
|
||||
0x82451310: b 0x82451324
|
||||
0x82451314: lwz r30, 188(r31)
|
||||
0x82451318: lwz r27, 96(r31)
|
||||
0x8245131c: li r26, 0
|
||||
0x82451320: lwz r28, 92(r31)
|
||||
0x82451324: addi r3, r31, 84
|
||||
0x82451328: bl 0x82454AA0
|
||||
0x8245132c: mr r29, r3
|
||||
0x82451330: cmplwi cr6, r28, 0x0
|
||||
0x82451334: beq cr6, 0x82451684
|
||||
0x82451338: lwz r3, 36(r30)
|
||||
0x8245133c: li r8, 0
|
||||
0x82451340: addi r7, r31, 88
|
||||
0x82451344: mr r6, r29
|
||||
0x82451348: mr r5, r29
|
||||
0x8245134c: mr r4, r28
|
||||
0x82451350: lwz r11, 0(r3)
|
||||
0x82451354: lwz r11, 28(r11)
|
||||
0x82451358: mtctr r11
|
||||
0x8245135c: bctrl
|
||||
0x82451360: clrlwi r11, r3, 24
|
||||
0x82451364: cmplwi cr6, r11, 0x0
|
||||
0x82451368: beq cr6, 0x82451684
|
||||
0x8245136c: lwz r11, 28(r30)
|
||||
0x82451370: rlwinm r11, r11, 0, 28, 28
|
||||
0x82451374: cmplwi cr6, r11, 0x0
|
||||
@@ -0,0 +1,52 @@
|
||||
=== Fire counts ===
|
||||
ours: 3
|
||||
canary: 7
|
||||
|
||||
=== Per-LR breakdown ===
|
||||
ours:
|
||||
lr=0x82458674: 3
|
||||
canary:
|
||||
lr=0x82457bd4: 2
|
||||
lr=0x82458674: 5
|
||||
|
||||
=== Side-by-side first 5 fires (entry registers) ===
|
||||
|
||||
--- fire #0 ---
|
||||
ours: tid=6 cycle=363 lr=0x82458674 r3=0x40ba9ac0
|
||||
dump: 419fecda 000007f6 00000000 41d7dd10 00001688 00000000 00000000 41f5dd80 82457958 823f53f0 00000000 00000000 00000001 00000000 00000000 4024a5c0
|
||||
canary: tid=11 cycle=<unk> lr=0x82458674 r3=0xbccc4ac0 r4=0x00000000 r5=0x00000001 r6=0x00000001 r7=0x00000000
|
||||
dump: bdb19cda 000007f6 00000000 bde98d10 00001688 00000000 00000000 be078d80 82457958 823f53f0 00000000 00000000 00000001 00000000 00000000 bc365760
|
||||
|
||||
--- fire #1 ---
|
||||
ours: tid=6 cycle=140548 lr=0x82458674 r3=0x40ba9b80
|
||||
dump: 42c0f09a 00018ff6 00000000 43777210 0004d055 00000000 00000000 41f60d80 82457958 823f53f0 00000000 00000000 00000001 00000000 00000000 4024a960
|
||||
canary: tid=11 cycle=<unk> lr=0x82458674 r3=0xbccc4b80 r4=0x00000000 r5=0x00000001 r6=0x00000001 r7=0x00000000
|
||||
dump: bed2a09a 00018ff6 00000000 bf892210 0004d055 00000000 00000000 be07bd80 82457958 823f53f0 00000000 00000000 00000001 00000000 00000000 bc365840
|
||||
|
||||
--- fire #2 ---
|
||||
ours: tid=6 cycle=5957876 lr=0x82458674 r3=0x40ba9b80
|
||||
dump: 419fecda 000007f6 00000000 414f5f70 000003b9 00000000 00000000 41f60d80 82457958 823f53f0 00000000 00000040 00000001 00000000 00000000 4024a980
|
||||
canary: tid=11 cycle=<unk> lr=0x82458674 r3=0xbccc4b80 r4=0x00000000 r5=0x00000001 r6=0x00000001 r7=0x00000000
|
||||
dump: bdb19cda 000007f6 00000000 bd610b90 000003b9 00000000 00000000 be07bd80 82457958 823f53f0 00000000 00000040 00000001 00000000 00000000 bc365860
|
||||
|
||||
--- fire #3 ---
|
||||
ours: <no fire>
|
||||
canary: tid=11 cycle=<unk> lr=0x82458674 r3=0xbccc5300 r4=0x00000000 r5=0x00000001 r6=0x00000001 r7=0x00000000
|
||||
dump: bdb1acda 000007f6 00000000 bce24ed0 00000167 00000000 00000000 be07bd80 82457958 823f53f0 00000000 00000000 00000001 00000000 00000000 bc365f40
|
||||
|
||||
--- fire #4 ---
|
||||
ours: <no fire>
|
||||
canary: tid=6 cycle=<unk> lr=0x82457bd4 r3=0x701cf3c0 r4=0x00000004 r5=0x00002530 r6=0x00008000 r7=0x00000001
|
||||
dump: be95af9a 0000c170 00000000 b2050010 000681e9 00000000 00000000 be07bd80 82457958 823f53f0 00000000 0000c17a 00000001 701cf4e0 00000000 be95af90
|
||||
|
||||
=== Equivalence check: u32 lanes at +0x04 and +0x10 (work-item magic + counter) ===
|
||||
Both fields are stable identifiers across engines (host VAs differ but data should match).
|
||||
|
||||
Index of fields:
|
||||
[+0x04] = work-item 'size?' (looks like a length field)
|
||||
[+0x10] = state counter (per round 30, this is [+128/4 ?]) — but in dump it's u32[4]
|
||||
|
||||
ours [+04,+10]: [(2038, 5768), (102390, 315477), (2038, 953)]
|
||||
canary [+04,+10]: [(2038, 5768), (102390, 315477), (2038, 953), (2038, 359), (49520, 426473), (232195, 999643), (6134, 13763)]
|
||||
|
||||
ours fires whose [+04,+10] match a canary fire: 3/3
|
||||
@@ -0,0 +1,175 @@
|
||||
#!/usr/bin/env python3
"""Round 35 lockstep diff: align sub_8280AD40 entry fires between
ours (--audit-pc-probe-hex AUDIT-PC-PROBE / AUDIT-R3-DUMP) and
canary (AUDIT-HLC JitProlog).

Outputs side-by-side rendering of:
  - per-fire entry register snapshot (r3..r10, lr)
  - 64-byte r3 dump (u32 lanes, big-endian)
Alignment is by invocation order only; tids differ between engines
(no input-equivalence required).
"""
import os
import re

THIS_DIR = os.path.dirname(os.path.abspath(__file__))
OURS_LOG = os.path.join(THIS_DIR, "ours.log")
CANARY_LOG = os.path.join(
    os.path.dirname(THIS_DIR), "round35-lockstep-inflate-canary", "canary.log"
)

PC_TARGET = 0x8280AD40


def parse_ours(path):
    """Pair AUDIT-PC-PROBE lines with their following AUDIT-R3-DUMP lines."""
    fires = []
    cur = None
    with open(path) as f:
        for line in f:
            line = line.strip()
            if line.startswith("AUDIT-PC-PROBE"):
                m = re.search(
                    r"pc=0x([0-9a-f]+) tid=(\d+) hw=\d+ cycle=(\d+) lr=0x([0-9a-f]+) r3=0x([0-9a-f]+) r11=0x([0-9a-f]+)",
                    line,
                )
                if not m:
                    continue
                pc = int(m.group(1), 16)
                if pc != PC_TARGET:
                    # Probe for a different PC: drop any pending pairing.
                    cur = None
                    continue
                cur = {
                    "tid": int(m.group(2)),
                    "cycle": int(m.group(3)),
                    "lr": int(m.group(4), 16),
                    "r3": int(m.group(5), 16),
                    "dump": [],
                }
                fires.append(cur)
            elif line.startswith("AUDIT-R3-DUMP") and cur is not None:
                # Lanes appear as "+0xNN=0xVVVVVVVV"; keep the values in order.
                lanes = re.findall(r"\+0x[0-9a-f]+=0x([0-9a-f]+)", line)
                cur["dump"] = [int(x, 16) for x in lanes]
                cur = None
    return fires


def parse_canary(path):
    """Pair AUDIT-HLC JitProlog header lines with following r3+NN dump lines."""
    fires = []
    cur = None
    hdr_re = re.compile(
        r"AUDIT-HLC JitProlog pc=8280AD40 tid=([0-9A-F]+) r3=([0-9A-F]+) r4=([0-9A-F]+) "
        r"r5=([0-9A-F]+) r6=([0-9A-F]+) r7=([0-9A-F]+) r8=([0-9A-F]+) r9=([0-9A-F]+) r10=([0-9A-F]+) lr=([0-9A-F]+)"
    )
    dump_re = re.compile(
        r"AUDIT-HLC JitProlog pc=8280AD40 r3\+([0-9A-F]+): ([0-9A-F]+) ([0-9A-F]+) ([0-9A-F]+) ([0-9A-F]+)"
    )
    with open(path) as f:
        for line in f:
            line = line.strip()
            m = hdr_re.search(line)
            if m:
                cur = {
                    "tid": int(m.group(1), 16),
                    "r3": int(m.group(2), 16),
                    "r4": int(m.group(3), 16),
                    "r5": int(m.group(4), 16),
                    "r6": int(m.group(5), 16),
                    "r7": int(m.group(6), 16),
                    "r8": int(m.group(7), 16),
                    "r9": int(m.group(8), 16),
                    "r10": int(m.group(9), 16),
                    "lr": int(m.group(10), 16),
                    "dump": [],
                }
                fires.append(cur)
                continue
            m = dump_re.search(line)
            if m and cur is not None:
                off = int(m.group(1), 16)
                for i in range(4):
                    word = int(m.group(2 + i), 16)
                    # Zero-extend the dump so the lane index is valid.
                    idx = off // 4 + i
                    while len(cur["dump"]) <= idx:
                        cur["dump"].append(0)
                    cur["dump"][idx] = word
    return fires


def fmt_dump(d):
    """Render the first 16 u32 lanes (64 bytes) as space-separated hex."""
    return " ".join(f"{w:08x}" for w in d[:16])


def lane_key(fire):
    """Return the (+0x04, +0x10) lanes, i.e. dump[1] and dump[4], or None if the dump is short."""
    d = fire["dump"]
    return (d[1], d[4]) if len(d) > 4 else None


def main():
    ours = parse_ours(OURS_LOG)
    canary = parse_canary(CANARY_LOG)

    print("=== Fire counts ===")
    print(f"  ours:   {len(ours)}")
    print(f"  canary: {len(canary)}")
    print()

    print("=== Per-LR breakdown ===")
    for label, fires in (("ours", ours), ("canary", canary)):
        lr_counts = {}
        for f in fires:
            lr_counts[f["lr"]] = lr_counts.get(f["lr"], 0) + 1
        print(f"  {label}:")
        for lr, n in sorted(lr_counts.items()):
            print(f"    lr=0x{lr:08x}: {n}")
    print()

    print("=== Side-by-side first 5 fires (entry registers) ===")
    n = min(max(len(ours), len(canary)), 5)
    for i in range(n):
        print(f"\n--- fire #{i} ---")
        if i < len(ours):
            f = ours[i]
            print(
                f"  ours:   tid={f['tid']:<3} cycle={f['cycle']:<10} lr=0x{f['lr']:08x} r3=0x{f['r3']:08x}"
            )
            print(f"          dump: {fmt_dump(f['dump'])}")
        else:
            print("  ours:   <no fire>")
        if i < len(canary):
            f = canary[i]
            print(
                f"  canary: tid={f['tid']:<3} cycle=<unk>      lr=0x{f['lr']:08x} r3=0x{f['r3']:08x} "
                f"r4=0x{f['r4']:08x} r5=0x{f['r5']:08x} r6=0x{f['r6']:08x} r7=0x{f['r7']:08x}"
            )
            print(f"          dump: {fmt_dump(f['dump'])}")
        else:
            print("  canary: <no fire>")

    print()
    print("=== Equivalence check: u32 lanes at +0x04 and +0x10 (work-item magic + counter) ===")
    print("    Both fields are stable identifiers across engines (host VAs differ but data should match).")
    print()
    print("    Index of fields:")
    print("      [+0x04] = work-item 'size?' (looks like a length field)")
    print("      [+0x10] = state counter (per round 30, this is [+128/4 ?]) — but in dump it's u32[4]")
    print()
    ours_keys = [lane_key(f) for f in ours]
    canary_keys = [lane_key(f) for f in canary]
    print(f"  ours   [+04,+10]: {ours_keys}")
    print(f"  canary [+04,+10]: {canary_keys}")
    print()
    # Cross-match: every ours key should appear in canary (canary is a superset).
    # Short dumps (None keys) never count as a match.
    canary_set = {k for k in canary_keys if k is not None}
    matched = []
    unmatched_ours = []
    for k in ours_keys:
        if k is not None and k in canary_set:
            matched.append(k)
        else:
            unmatched_ours.append(k)
    print(f"  ours fires whose [+04,+10] match a canary fire: {len(matched)}/{len(ours)}")
    if unmatched_ours:
        print(f"  ours fires with NO canary match: {unmatched_ours}")


if __name__ == "__main__":
    main()
@@ -0,0 +1,17 @@
|
||||
K> F8000008 AUDIT-HLC JitProlog pc=821741C8 tid=00000006 r3=BCCC4A80 r4=00000018 r5=828F3888 r6=701CF924 r7=82456F00 r8=00000000 r9=00000000 r10=00000018 lr=822F1D5C
|
||||
K> F8000008 AUDIT-HLC JitProlog pc=821741C8 r3+00: BC22C910 00010004 00000000 000003E8
|
||||
K> F8000008 AUDIT-HLC JitProlog pc=821741C8 r3+10: 0101FFFF 00000000 00000000 01010000
|
||||
K> F8000008 AUDIT-HLC JitProlog pc=821741C8 r3+20: FFFFFFFF 00000000 00000000 00000000
|
||||
K> F8000008 AUDIT-HLC JitProlog pc=821741C8 r3+30: 00000000 BC365BC0 00000000 00000000
|
||||
K> F8000008 AUDIT-HLC JitProlog pc=821741C8 r3+40: 00000000 00000000 00000000 BDE9A398
|
||||
K> F8000008 AUDIT-HLC JitProlog pc=821741C8 r3+50: BC365560 00000000 00000000 00000000
|
||||
K> F8000008 AUDIT-HLC JitProlog pc=821741C8 r3+60: 00000000 00000000 00000000 01010040
|
||||
K> F8000008 AUDIT-HLC JitProlog pc=821741C8 r3+70: 00000000 00000000 00000000 FFFFFFFF
|
||||
K> F8000008 AUDIT-HLC JitProlog pc=821741C8 r3+80: 00000000 00000000 00000000 BC22C930
|
||||
K> F8000008 AUDIT-HLC JitProlog pc=821741C8 r3+90: 00000000 00000001 00000800 00000000
|
||||
K> F8000008 AUDIT-HLC JitProlog pc=821741C8 r3+A0: F800004C 00000000 00000000 BC365220
|
||||
K> F8000008 AUDIT-HLC JitProlog pc=821741C8 r3+B0: BC3655C0 00000000 00000000 00000000
|
||||
K> F8000008 AUDIT-HLC JitProlog pc=821741C8 r3+C0: 00CC0048 00460020 00460072 00650071
|
||||
K> F8000008 AUDIT-HLC JitProlog pc=821741C8 r3+D0: 00750065 006E0063 00790000 01010000
|
||||
K> F8000008 AUDIT-HLC JitProlog pc=821741C8 r3+E0: 00000000 00000000 00000000 FFFFFFFF
|
||||
K> F8000008 AUDIT-HLC JitProlog pc=821741C8 r3+F0: 00000000 00000000 00000000 BD610B80
|
||||
File diff suppressed because it is too large
Load Diff
@@ -0,0 +1,89 @@
|
||||
warn: CreateDXGIFactory2: Ignoring flags
|
||||
info: Game: xenia_canary.exe
|
||||
info: DXVK: v2.7.1
|
||||
info: Build: x86_64 gcc 15.1.0
|
||||
info: Vulkan: Found vkGetInstanceProcAddr in winevulkan.dll @ 0x6ffffbd84000
|
||||
info: Extension providers:
|
||||
info: Platform WSI
|
||||
info: OpenVR
|
||||
info: OpenVR: could not open registry key, status 2
|
||||
info: OpenVR: Failed to locate module
|
||||
info: OpenXR
|
||||
info: Enabled instance extensions:
|
||||
info: VK_EXT_surface_maintenance1
|
||||
info: VK_KHR_get_surface_capabilities2
|
||||
info: VK_KHR_surface
|
||||
info: VK_KHR_win32_surface
|
||||
info: Found device: NVIDIA GeForce GTX 1070 Ti (NVIDIA 580.159.3)
|
||||
info: Found device: llvmpipe (LLVM 20.1.2, 256 bits) (llvmpipe 25.2.8)
|
||||
info: Skipping: Software driver
|
||||
info: DXGI: Hiding actual GPU, reporting:
|
||||
info: vendor ID: 0x1002
|
||||
info: device ID: 0x73df
|
||||
warn: DxgiAdapter::QueryInterface: Unknown interface query
|
||||
warn: f0db4c7f-fe5a-42a2-bd62-f2a6cf6fc83e
|
||||
564.236:00dc:013c:info:vkd3d-proton:vkd3d_instance_apply_application_workarounds: Program name: "xenia_canary.exe" (hash: c099ade372da5277)
|
||||
564.236:00dc:013c:info:vkd3d-proton:vkd3d_instance_deduce_config_flags_from_environment: shader_cache is used, global_pipeline_cache is enforced.
|
||||
564.236:00dc:013c:info:vkd3d-proton:vkd3d_config_flags_init_once: VKD3D_CONFIG=''.
|
||||
564.240:00dc:013c:info:vkd3d-proton:vkd3d_get_vk_version: vkd3d-proton - applicationVersion: 3.0.1.
|
||||
564.240:00dc:013c:info:vkd3d-proton:vkd3d_instance_init: vkd3d-proton - build: 3b10bd7a7ec6a73.
|
||||
564.399:00dc:013c:info:vkd3d-proton:vkd3d_init_device_caps: Not all relevant pipeline stages are supported by EXT_dgc. Skipping.
|
||||
564.825:00dc:013c:info:vkd3d-proton:vkd3d_memory_info_decide_hvv_usage: Topology: Device heaps are split. Assuming small BAR situation.
|
||||
564.825:00dc:013c:info:vkd3d-proton:vkd3d_memory_info_upload_hvv_memory_properties: Topology: HVV usage is not allowed, using HOST_COHERENT for UPLOAD.
|
||||
564.827:00dc:013c:info:vkd3d-proton:vkd3d_bindless_state_get_bindless_flags: Device does not support VK_EXT_mutable_descriptor_type (or VALVE).
|
||||
564.827:00dc:013c:info:vkd3d-proton:vkd3d_bindless_state_add_binding: Device supports VK_EXT_descriptor_buffer!
|
||||
564.827:00dc:013c:info:vkd3d-proton:vkd3d_bindless_state_add_binding: Device supports VK_EXT_descriptor_buffer!
|
||||
564.827:00dc:013c:info:vkd3d-proton:vkd3d_bindless_state_add_binding: Device supports VK_EXT_descriptor_buffer!
|
||||
564.827:00dc:013c:info:vkd3d-proton:vkd3d_bindless_state_add_binding: Device supports VK_EXT_descriptor_buffer!
|
||||
564.827:00dc:013c:info:vkd3d-proton:vkd3d_bindless_state_add_binding: Device supports VK_EXT_descriptor_buffer!
|
||||
564.827:00dc:013c:info:vkd3d-proton:vkd3d_bindless_state_add_binding: Device supports VK_EXT_descriptor_buffer!
|
||||
564.827:00dc:013c:info:vkd3d-proton:vkd3d_bindless_state_add_binding: Device supports VK_EXT_descriptor_buffer!
|
||||
564.839:00dc:013c:info:vkd3d-proton:d3d12_device_caps_init_shader_model: Enabling support for SM 6.6.
|
||||
564.839:00dc:013c:fixme:vkd3d-proton:d3d12_device_caps_init_feature_options1: TotalLaneCount = 2432, may be inaccurate.
|
||||
564.839:00dc:013c:info:vkd3d-proton:d3d12_device_determine_ray_tracing_tier: DXR support enabled.
|
||||
564.840:00dc:013c:info:vkd3d-proton:vkd3d_pipeline_library_init_disk_cache: Remapping VKD3D_SHADER_CACHE to: vkd3d-proton.cache.
|
||||
564.840:00dc:013c:info:vkd3d-proton:vkd3d_pipeline_library_init_disk_cache: Attempting to load disk cache from: vkd3d-proton.cache.
|
||||
564.843:00dc:0154:info:vkd3d-proton:vkd3d_pipeline_library_disk_thread_main: Performing async setup of stream archive ...
|
||||
564.844:00dc:0154:info:vkd3d-proton:vkd3d_pipeline_library_disk_cache_merge: Promoting write cache to read cache. No need to merge any disk caches.
|
||||
564.844:00dc:0154:info:vkd3d-proton:vkd3d_pipeline_library_disk_cache_initial_setup: Merging pipeline libraries took 1.012 ms.
|
||||
564.845:00dc:0154:info:vkd3d-proton:vkd3d_pipeline_library_disk_cache_initial_setup: Mapping read-only cache took 0.607 ms.
|
||||
564.845:00dc:0154:info:vkd3d-proton:vkd3d_pipeline_library_disk_cache_initial_setup: Parsing stream archive took 0.370 ms.
|
||||
564.845:00dc:0154:info:vkd3d-proton:vkd3d_pipeline_library_disk_thread_main: Done performing async setup of stream archive.
|
||||
564.903:00dc:013c:info:vkd3d-proton:vkd3d_get_vk_version: vkd3d-proton - applicationVersion: 3.0.1.
|
||||
564.903:00dc:013c:info:vkd3d-proton:vkd3d_instance_init: vkd3d-proton - build: 3b10bd7a7ec6a73.
|
||||
564.946:00dc:013c:info:vkd3d-proton:vkd3d_init_device_caps: Not all relevant pipeline stages are supported by EXT_dgc. Skipping.
|
||||
565.065:00dc:013c:info:vkd3d-proton:vkd3d_memory_info_decide_hvv_usage: Topology: Device heaps are split. Assuming small BAR situation.
|
||||
565.065:00dc:013c:info:vkd3d-proton:vkd3d_memory_info_upload_hvv_memory_properties: Topology: HVV usage is not allowed, using HOST_COHERENT for UPLOAD.
|
||||
565.066:00dc:013c:info:vkd3d-proton:vkd3d_bindless_state_get_bindless_flags: Device does not support VK_EXT_mutable_descriptor_type (or VALVE).
|
||||
565.066:00dc:013c:info:vkd3d-proton:vkd3d_bindless_state_add_binding: Device supports VK_EXT_descriptor_buffer!
|
||||
565.066:00dc:013c:info:vkd3d-proton:vkd3d_bindless_state_add_binding: Device supports VK_EXT_descriptor_buffer!
|
||||
565.066:00dc:013c:info:vkd3d-proton:vkd3d_bindless_state_add_binding: Device supports VK_EXT_descriptor_buffer!
|
||||
565.066:00dc:013c:info:vkd3d-proton:vkd3d_bindless_state_add_binding: Device supports VK_EXT_descriptor_buffer!
|
||||
565.066:00dc:013c:info:vkd3d-proton:vkd3d_bindless_state_add_binding: Device supports VK_EXT_descriptor_buffer!
|
||||
565.066:00dc:013c:info:vkd3d-proton:vkd3d_bindless_state_add_binding: Device supports VK_EXT_descriptor_buffer!
|
||||
565.066:00dc:013c:info:vkd3d-proton:vkd3d_bindless_state_add_binding: Device supports VK_EXT_descriptor_buffer!
|
||||
565.067:00dc:013c:info:vkd3d-proton:d3d12_device_caps_init_shader_model: Enabling support for SM 6.6.
|
||||
565.067:00dc:013c:fixme:vkd3d-proton:d3d12_device_caps_init_feature_options1: TotalLaneCount = 2432, may be inaccurate.
|
||||
565.067:00dc:013c:info:vkd3d-proton:d3d12_device_determine_ray_tracing_tier: DXR support enabled.
|
||||
565.067:00dc:013c:info:vkd3d-proton:vkd3d_pipeline_library_init_disk_cache: Remapping VKD3D_SHADER_CACHE to: vkd3d-proton.cache.
|
||||
565.067:00dc:013c:info:vkd3d-proton:vkd3d_pipeline_library_init_disk_cache: Attempting to load disk cache from: vkd3d-proton.cache.
|
||||
565.068:00dc:015c:info:vkd3d-proton:vkd3d_pipeline_library_disk_thread_main: Performing async setup of stream archive ...
|
||||
565.068:00dc:015c:info:vkd3d-proton:vkd3d_pipeline_library_disk_cache_merge: No write cache exists. No need to merge any disk caches.
|
||||
565.068:00dc:015c:info:vkd3d-proton:vkd3d_pipeline_library_disk_cache_initial_setup: Merging pipeline libraries took 0.136 ms.
|
||||
565.068:00dc:015c:info:vkd3d-proton:vkd3d_pipeline_library_disk_cache_initial_setup: Mapping read-only cache took 0.221 ms.
|
||||
565.069:00dc:015c:info:vkd3d-proton:vkd3d_pipeline_library_disk_cache_initial_setup: Parsing stream archive took 0.031 ms.
|
||||
565.069:00dc:015c:info:vkd3d-proton:vkd3d_pipeline_library_disk_thread_main: Done performing async setup of stream archive.
|
||||
565.075:00dc:013c:fixme:vkd3d-proton:d3d12_command_queue_init: Ignoring priority 0x64.
|
||||
warn: DXGIGetDebugInterface1: Stub
|
||||
info: DXGI: Hiding actual GPU, reporting:
|
||||
info: vendor ID: 0x1002
|
||||
info: device ID: 0x73df
|
||||
565.173:00dc:00e0:info:vkd3d-proton:dxgi_vk_swap_chain_init: Creating swapchain (1280 x 720), BufferCount = 3.
|
||||
565.194:00dc:00e0:info:vkd3d-proton:dxgi_vk_swap_chain_init_sync_objects: Ensure maximum latency of 3 frames with KHR_present_wait.
|
||||
565.195:00dc:00e0:info:vkd3d-proton:dxgi_vk_swap_chain_init_sleep_state: Timer interval is 1.0 ms.
|
||||
warn: DXGI: MakeWindowAssociation: Ignoring flags
|
||||
warn: DxgiOutput::WaitForVBlank: Inaccurate
|
||||
info: Setting timer interval to 1000 us
|
||||
565.773:00dc:0164:info:vkd3d-proton:dxgi_vk_swap_chain_recreate_swapchain_in_present_task: Got 3 swapchain images.
|
||||
566.349:00dc:016c:fixme:vkd3d-proton:vkd3d_texture_view_desc_fixup: Remapping 2D to 2D_ARRAY. Needs Vulkan spec tightening to match D3D12 properly.
|
||||
566.387:00dc:0164:info:vkd3d-proton:dxgi_vk_swap_chain_recreate_swapchain_in_present_task: Got 3 swapchain images.
|
||||
@@ -0,0 +1,89 @@
|
||||
warn: CreateDXGIFactory2: Ignoring flags
|
||||
info: Game: xenia_canary.exe
|
||||
info: DXVK: v2.7.1
|
||||
info: Build: x86_64 gcc 15.1.0
|
||||
info: Vulkan: Found vkGetInstanceProcAddr in winevulkan.dll @ 0x6ffffbfb4000
|
||||
info: Extension providers:
|
||||
info: Platform WSI
|
||||
info: OpenVR
|
||||
info: OpenVR: could not open registry key, status 2
|
||||
info: OpenVR: Failed to locate module
|
||||
info: OpenXR
|
||||
info: Enabled instance extensions:
|
||||
info: VK_EXT_surface_maintenance1
|
||||
info: VK_KHR_get_surface_capabilities2
|
||||
info: VK_KHR_surface
|
||||
info: VK_KHR_win32_surface
|
||||
info: Found device: NVIDIA GeForce GTX 1070 Ti (NVIDIA 580.159.3)
|
||||
info: Found device: llvmpipe (LLVM 20.1.2, 256 bits) (llvmpipe 25.2.8)
|
||||
info: Skipping: Software driver
|
||||
info: DXGI: Hiding actual GPU, reporting:
|
||||
info: vendor ID: 0x1002
|
||||
info: device ID: 0x73df
|
||||
warn: DxgiAdapter::QueryInterface: Unknown interface query
|
||||
warn: f0db4c7f-fe5a-42a2-bd62-f2a6cf6fc83e
|
||||
805.907:00d0:0124:info:vkd3d-proton:vkd3d_instance_apply_application_workarounds: Program name: "xenia_canary.exe" (hash: c099ade372da5277)
|
||||
805.907:00d0:0124:info:vkd3d-proton:vkd3d_instance_deduce_config_flags_from_environment: shader_cache is used, global_pipeline_cache is enforced.
|
||||
805.907:00d0:0124:info:vkd3d-proton:vkd3d_config_flags_init_once: VKD3D_CONFIG=''.
|
||||
805.910:00d0:0124:info:vkd3d-proton:vkd3d_get_vk_version: vkd3d-proton - applicationVersion: 3.0.1.
|
||||
805.910:00d0:0124:info:vkd3d-proton:vkd3d_instance_init: vkd3d-proton - build: 3b10bd7a7ec6a73.
|
||||
805.955:00d0:0124:info:vkd3d-proton:vkd3d_init_device_caps: Not all relevant pipeline stages are supported by EXT_dgc. Skipping.
|
||||
806.100:00d0:0124:info:vkd3d-proton:vkd3d_memory_info_decide_hvv_usage: Topology: Device heaps are split. Assuming small BAR situation.
|
||||
806.100:00d0:0124:info:vkd3d-proton:vkd3d_memory_info_upload_hvv_memory_properties: Topology: HVV usage is not allowed, using HOST_COHERENT for UPLOAD.
|
||||
806.101:00d0:0124:info:vkd3d-proton:vkd3d_bindless_state_get_bindless_flags: Device does not support VK_EXT_mutable_descriptor_type (or VALVE).
|
||||
806.101:00d0:0124:info:vkd3d-proton:vkd3d_bindless_state_add_binding: Device supports VK_EXT_descriptor_buffer!
|
||||
806.101:00d0:0124:info:vkd3d-proton:vkd3d_bindless_state_add_binding: Device supports VK_EXT_descriptor_buffer!
|
||||
806.101:00d0:0124:info:vkd3d-proton:vkd3d_bindless_state_add_binding: Device supports VK_EXT_descriptor_buffer!
|
||||
806.101:00d0:0124:info:vkd3d-proton:vkd3d_bindless_state_add_binding: Device supports VK_EXT_descriptor_buffer!
|
||||
806.101:00d0:0124:info:vkd3d-proton:vkd3d_bindless_state_add_binding: Device supports VK_EXT_descriptor_buffer!
|
||||
806.101:00d0:0124:info:vkd3d-proton:vkd3d_bindless_state_add_binding: Device supports VK_EXT_descriptor_buffer!
|
||||
806.101:00d0:0124:info:vkd3d-proton:vkd3d_bindless_state_add_binding: Device supports VK_EXT_descriptor_buffer!
|
||||
806.105:00d0:0124:info:vkd3d-proton:d3d12_device_caps_init_shader_model: Enabling support for SM 6.6.
|
||||
806.105:00d0:0124:fixme:vkd3d-proton:d3d12_device_caps_init_feature_options1: TotalLaneCount = 2432, may be inaccurate.
|
||||
806.105:00d0:0124:info:vkd3d-proton:d3d12_device_determine_ray_tracing_tier: DXR support enabled.
|
||||
806.105:00d0:0124:info:vkd3d-proton:vkd3d_pipeline_library_init_disk_cache: Remapping VKD3D_SHADER_CACHE to: vkd3d-proton.cache.
|
||||
806.105:00d0:0124:info:vkd3d-proton:vkd3d_pipeline_library_init_disk_cache: Attempting to load disk cache from: vkd3d-proton.cache.
|
||||
806.106:00d0:013c:info:vkd3d-proton:vkd3d_pipeline_library_disk_thread_main: Performing async setup of stream archive ...
|
||||
806.106:00d0:013c:info:vkd3d-proton:vkd3d_pipeline_library_disk_cache_merge: No write cache exists. No need to merge any disk caches.
|
||||
806.106:00d0:013c:info:vkd3d-proton:vkd3d_pipeline_library_disk_cache_initial_setup: Merging pipeline libraries took 0.161 ms.
|
||||
806.107:00d0:013c:info:vkd3d-proton:vkd3d_pipeline_library_disk_cache_initial_setup: Mapping read-only cache took 0.185 ms.
|
||||
806.107:00d0:013c:info:vkd3d-proton:vkd3d_pipeline_library_disk_cache_initial_setup: Parsing stream archive took 0.028 ms.
|
||||
806.107:00d0:013c:info:vkd3d-proton:vkd3d_pipeline_library_disk_thread_main: Done performing async setup of stream archive.
|
||||
806.154:00d0:0124:info:vkd3d-proton:vkd3d_get_vk_version: vkd3d-proton - applicationVersion: 3.0.1.
|
||||
806.154:00d0:0124:info:vkd3d-proton:vkd3d_instance_init: vkd3d-proton - build: 3b10bd7a7ec6a73.
|
||||
806.197:00d0:0124:info:vkd3d-proton:vkd3d_init_device_caps: Not all relevant pipeline stages are supported by EXT_dgc. Skipping.
|
||||
806.310:00d0:0124:info:vkd3d-proton:vkd3d_memory_info_decide_hvv_usage: Topology: Device heaps are split. Assuming small BAR situation.
|
||||
806.310:00d0:0124:info:vkd3d-proton:vkd3d_memory_info_upload_hvv_memory_properties: Topology: HVV usage is not allowed, using HOST_COHERENT for UPLOAD.
|
||||
806.310:00d0:0124:info:vkd3d-proton:vkd3d_bindless_state_get_bindless_flags: Device does not support VK_EXT_mutable_descriptor_type (or VALVE).
|
||||
806.310:00d0:0124:info:vkd3d-proton:vkd3d_bindless_state_add_binding: Device supports VK_EXT_descriptor_buffer!
|
||||
806.310:00d0:0124:info:vkd3d-proton:vkd3d_bindless_state_add_binding: Device supports VK_EXT_descriptor_buffer!
|
||||
806.310:00d0:0124:info:vkd3d-proton:vkd3d_bindless_state_add_binding: Device supports VK_EXT_descriptor_buffer!
|
||||
806.310:00d0:0124:info:vkd3d-proton:vkd3d_bindless_state_add_binding: Device supports VK_EXT_descriptor_buffer!
|
||||
806.310:00d0:0124:info:vkd3d-proton:vkd3d_bindless_state_add_binding: Device supports VK_EXT_descriptor_buffer!
|
||||
806.310:00d0:0124:info:vkd3d-proton:vkd3d_bindless_state_add_binding: Device supports VK_EXT_descriptor_buffer!
|
||||
806.310:00d0:0124:info:vkd3d-proton:vkd3d_bindless_state_add_binding: Device supports VK_EXT_descriptor_buffer!
|
||||
806.312:00d0:0124:info:vkd3d-proton:d3d12_device_caps_init_shader_model: Enabling support for SM 6.6.
|
||||
806.312:00d0:0124:fixme:vkd3d-proton:d3d12_device_caps_init_feature_options1: TotalLaneCount = 2432, may be inaccurate.
|
||||
806.312:00d0:0124:info:vkd3d-proton:d3d12_device_determine_ray_tracing_tier: DXR support enabled.
|
||||
806.312:00d0:0124:info:vkd3d-proton:vkd3d_pipeline_library_init_disk_cache: Remapping VKD3D_SHADER_CACHE to: vkd3d-proton.cache.
|
||||
806.312:00d0:0124:info:vkd3d-proton:vkd3d_pipeline_library_init_disk_cache: Attempting to load disk cache from: vkd3d-proton.cache.
|
||||
806.313:00d0:0144:info:vkd3d-proton:vkd3d_pipeline_library_disk_thread_main: Performing async setup of stream archive ...
|
||||
806.313:00d0:0144:info:vkd3d-proton:vkd3d_pipeline_library_disk_cache_merge: No write cache exists. No need to merge any disk caches.
|
||||
806.313:00d0:0144:info:vkd3d-proton:vkd3d_pipeline_library_disk_cache_initial_setup: Merging pipeline libraries took 0.156 ms.
|
||||
806.314:00d0:0144:info:vkd3d-proton:vkd3d_pipeline_library_disk_cache_initial_setup: Mapping read-only cache took 0.659 ms.
|
||||
806.314:00d0:0144:info:vkd3d-proton:vkd3d_pipeline_library_disk_cache_initial_setup: Parsing stream archive took 0.035 ms.
|
||||
806.314:00d0:0144:info:vkd3d-proton:vkd3d_pipeline_library_disk_thread_main: Done performing async setup of stream archive.
|
||||
806.319:00d0:0124:fixme:vkd3d-proton:d3d12_command_queue_init: Ignoring priority 0x64.
|
||||
warn: DXGIGetDebugInterface1: Stub
|
||||
info: DXGI: Hiding actual GPU, reporting:
|
||||
info: vendor ID: 0x1002
|
||||
info: device ID: 0x73df
|
||||
806.408:00d0:00d4:info:vkd3d-proton:dxgi_vk_swap_chain_init: Creating swapchain (1280 x 720), BufferCount = 3.
|
||||
806.422:00d0:00d4:info:vkd3d-proton:dxgi_vk_swap_chain_init_sync_objects: Ensure maximum latency of 3 frames with KHR_present_wait.
|
||||
806.423:00d0:00d4:info:vkd3d-proton:dxgi_vk_swap_chain_init_sleep_state: Timer interval is 1.0 ms.
|
||||
warn: DXGI: MakeWindowAssociation: Ignoring flags
|
||||
warn: DxgiOutput::WaitForVBlank: Inaccurate
|
||||
info: Setting timer interval to 1000 us
|
||||
806.948:00d0:014c:info:vkd3d-proton:dxgi_vk_swap_chain_recreate_swapchain_in_present_task: Got 3 swapchain images.
|
||||
807.499:00d0:0154:fixme:vkd3d-proton:vkd3d_texture_view_desc_fixup: Remapping 2D to 2D_ARRAY. Needs Vulkan spec tightening to match D3D12 properly.
|
||||
807.521:00d0:014c:info:vkd3d-proton:dxgi_vk_swap_chain_recreate_swapchain_in_present_task: Got 3 swapchain images.
|
||||
@@ -0,0 +1,89 @@
|
||||
warn: CreateDXGIFactory2: Ignoring flags
|
||||
info: Game: xenia_canary.exe
|
||||
info: DXVK: v2.7.1
|
||||
info: Build: x86_64 gcc 15.1.0
|
||||
info: Vulkan: Found vkGetInstanceProcAddr in winevulkan.dll @ 0x6ffffbfb4000
|
||||
info: Extension providers:
|
||||
info: Platform WSI
|
||||
info: OpenVR
|
||||
info: OpenVR: could not open registry key, status 2
|
||||
info: OpenVR: Failed to locate module
|
||||
info: OpenXR
|
||||
info: Enabled instance extensions:
|
||||
info: VK_EXT_surface_maintenance1
|
||||
info: VK_KHR_get_surface_capabilities2
|
||||
info: VK_KHR_surface
|
||||
info: VK_KHR_win32_surface
|
||||
info: Found device: NVIDIA GeForce GTX 1070 Ti (NVIDIA 580.159.3)
|
||||
info: Found device: llvmpipe (LLVM 20.1.2, 256 bits) (llvmpipe 25.2.8)
|
||||
info: Skipping: Software driver
|
||||
info: DXGI: Hiding actual GPU, reporting:
|
||||
info: vendor ID: 0x1002
|
||||
info: device ID: 0x73df
|
||||
warn: DxgiAdapter::QueryInterface: Unknown interface query
|
||||
warn: f0db4c7f-fe5a-42a2-bd62-f2a6cf6fc83e
|
||||
893.096:00d4:0128:info:vkd3d-proton:vkd3d_instance_apply_application_workarounds: Program name: "xenia_canary.exe" (hash: c099ade372da5277)
|
||||
893.096:00d4:0128:info:vkd3d-proton:vkd3d_instance_deduce_config_flags_from_environment: shader_cache is used, global_pipeline_cache is enforced.
|
||||
893.096:00d4:0128:info:vkd3d-proton:vkd3d_config_flags_init_once: VKD3D_CONFIG=''.
|
||||
893.099:00d4:0128:info:vkd3d-proton:vkd3d_get_vk_version: vkd3d-proton - applicationVersion: 3.0.1.
|
||||
893.099:00d4:0128:info:vkd3d-proton:vkd3d_instance_init: vkd3d-proton - build: 3b10bd7a7ec6a73.
|
||||
893.145:00d4:0128:info:vkd3d-proton:vkd3d_init_device_caps: Not all relevant pipeline stages are supported by EXT_dgc. Skipping.
|
||||
893.308:00d4:0128:info:vkd3d-proton:vkd3d_memory_info_decide_hvv_usage: Topology: Device heaps are split. Assuming small BAR situation.
|
||||
893.308:00d4:0128:info:vkd3d-proton:vkd3d_memory_info_upload_hvv_memory_properties: Topology: HVV usage is not allowed, using HOST_COHERENT for UPLOAD.
|
||||
893.308:00d4:0128:info:vkd3d-proton:vkd3d_bindless_state_get_bindless_flags: Device does not support VK_EXT_mutable_descriptor_type (or VALVE).
|
||||
893.308:00d4:0128:info:vkd3d-proton:vkd3d_bindless_state_add_binding: Device supports VK_EXT_descriptor_buffer!
|
||||
893.308:00d4:0128:info:vkd3d-proton:vkd3d_bindless_state_add_binding: Device supports VK_EXT_descriptor_buffer!
|
||||
893.308:00d4:0128:info:vkd3d-proton:vkd3d_bindless_state_add_binding: Device supports VK_EXT_descriptor_buffer!
|
||||
893.308:00d4:0128:info:vkd3d-proton:vkd3d_bindless_state_add_binding: Device supports VK_EXT_descriptor_buffer!
|
||||
893.308:00d4:0128:info:vkd3d-proton:vkd3d_bindless_state_add_binding: Device supports VK_EXT_descriptor_buffer!
|
||||
893.308:00d4:0128:info:vkd3d-proton:vkd3d_bindless_state_add_binding: Device supports VK_EXT_descriptor_buffer!
|
||||
893.308:00d4:0128:info:vkd3d-proton:vkd3d_bindless_state_add_binding: Device supports VK_EXT_descriptor_buffer!
|
||||
893.310:00d4:0128:info:vkd3d-proton:d3d12_device_caps_init_shader_model: Enabling support for SM 6.6.
|
||||
893.310:00d4:0128:fixme:vkd3d-proton:d3d12_device_caps_init_feature_options1: TotalLaneCount = 2432, may be inaccurate.
|
||||
893.310:00d4:0128:info:vkd3d-proton:d3d12_device_determine_ray_tracing_tier: DXR support enabled.
|
||||
893.310:00d4:0128:info:vkd3d-proton:vkd3d_pipeline_library_init_disk_cache: Remapping VKD3D_SHADER_CACHE to: vkd3d-proton.cache.
|
||||
893.310:00d4:0128:info:vkd3d-proton:vkd3d_pipeline_library_init_disk_cache: Attempting to load disk cache from: vkd3d-proton.cache.
|
||||
893.311:00d4:0140:info:vkd3d-proton:vkd3d_pipeline_library_disk_thread_main: Performing async setup of stream archive ...
|
||||
893.311:00d4:0140:info:vkd3d-proton:vkd3d_pipeline_library_disk_cache_merge: No write cache exists. No need to merge any disk caches.
|
||||
893.311:00d4:0140:info:vkd3d-proton:vkd3d_pipeline_library_disk_cache_initial_setup: Merging pipeline libraries took 0.187 ms.
|
||||
893.312:00d4:0140:info:vkd3d-proton:vkd3d_pipeline_library_disk_cache_initial_setup: Mapping read-only cache took 0.161 ms.
|
||||
893.312:00d4:0140:info:vkd3d-proton:vkd3d_pipeline_library_disk_cache_initial_setup: Parsing stream archive took 0.040 ms.
|
||||
893.312:00d4:0140:info:vkd3d-proton:vkd3d_pipeline_library_disk_thread_main: Done performing async setup of stream archive.
|
||||
893.360:00d4:0128:info:vkd3d-proton:vkd3d_get_vk_version: vkd3d-proton - applicationVersion: 3.0.1.
|
||||
893.360:00d4:0128:info:vkd3d-proton:vkd3d_instance_init: vkd3d-proton - build: 3b10bd7a7ec6a73.
|
||||
893.405:00d4:0128:info:vkd3d-proton:vkd3d_init_device_caps: Not all relevant pipeline stages are supported by EXT_dgc. Skipping.
|
||||
893.520:00d4:0128:info:vkd3d-proton:vkd3d_memory_info_decide_hvv_usage: Topology: Device heaps are split. Assuming small BAR situation.
|
||||
893.520:00d4:0128:info:vkd3d-proton:vkd3d_memory_info_upload_hvv_memory_properties: Topology: HVV usage is not allowed, using HOST_COHERENT for UPLOAD.
|
||||
893.520:00d4:0128:info:vkd3d-proton:vkd3d_bindless_state_get_bindless_flags: Device does not support VK_EXT_mutable_descriptor_type (or VALVE).
|
||||
893.520:00d4:0128:info:vkd3d-proton:vkd3d_bindless_state_add_binding: Device supports VK_EXT_descriptor_buffer!
|
||||
893.520:00d4:0128:info:vkd3d-proton:vkd3d_bindless_state_add_binding: Device supports VK_EXT_descriptor_buffer!
|
||||
893.520:00d4:0128:info:vkd3d-proton:vkd3d_bindless_state_add_binding: Device supports VK_EXT_descriptor_buffer!
|
||||
893.520:00d4:0128:info:vkd3d-proton:vkd3d_bindless_state_add_binding: Device supports VK_EXT_descriptor_buffer!
|
||||
893.520:00d4:0128:info:vkd3d-proton:vkd3d_bindless_state_add_binding: Device supports VK_EXT_descriptor_buffer!
|
||||
893.520:00d4:0128:info:vkd3d-proton:vkd3d_bindless_state_add_binding: Device supports VK_EXT_descriptor_buffer!
|
||||
893.520:00d4:0128:info:vkd3d-proton:vkd3d_bindless_state_add_binding: Device supports VK_EXT_descriptor_buffer!
|
||||
893.522:00d4:0128:info:vkd3d-proton:d3d12_device_caps_init_shader_model: Enabling support for SM 6.6.
|
||||
893.522:00d4:0128:fixme:vkd3d-proton:d3d12_device_caps_init_feature_options1: TotalLaneCount = 2432, may be inaccurate.
|
||||
893.522:00d4:0128:info:vkd3d-proton:d3d12_device_determine_ray_tracing_tier: DXR support enabled.
|
||||
893.522:00d4:0128:info:vkd3d-proton:vkd3d_pipeline_library_init_disk_cache: Remapping VKD3D_SHADER_CACHE to: vkd3d-proton.cache.
|
||||
893.522:00d4:0128:info:vkd3d-proton:vkd3d_pipeline_library_init_disk_cache: Attempting to load disk cache from: vkd3d-proton.cache.
|
||||
893.523:00d4:0148:info:vkd3d-proton:vkd3d_pipeline_library_disk_thread_main: Performing async setup of stream archive ...
|
||||
893.523:00d4:0148:info:vkd3d-proton:vkd3d_pipeline_library_disk_cache_merge: No write cache exists. No need to merge any disk caches.
|
||||
893.523:00d4:0148:info:vkd3d-proton:vkd3d_pipeline_library_disk_cache_initial_setup: Merging pipeline libraries took 0.153 ms.
|
||||
893.523:00d4:0148:info:vkd3d-proton:vkd3d_pipeline_library_disk_cache_initial_setup: Mapping read-only cache took 0.199 ms.
|
||||
893.523:00d4:0148:info:vkd3d-proton:vkd3d_pipeline_library_disk_cache_initial_setup: Parsing stream archive took 0.034 ms.
|
||||
893.523:00d4:0148:info:vkd3d-proton:vkd3d_pipeline_library_disk_thread_main: Done performing async setup of stream archive.
|
||||
893.529:00d4:0128:fixme:vkd3d-proton:d3d12_command_queue_init: Ignoring priority 0x64.
|
||||
warn: DXGIGetDebugInterface1: Stub
|
||||
info: DXGI: Hiding actual GPU, reporting:
|
||||
info: vendor ID: 0x1002
|
||||
info: device ID: 0x73df
|
||||
893.622:00d4:00d8:info:vkd3d-proton:dxgi_vk_swap_chain_init: Creating swapchain (1280 x 720), BufferCount = 3.
|
||||
893.631:00d4:00d8:info:vkd3d-proton:dxgi_vk_swap_chain_init_sync_objects: Ensure maximum latency of 3 frames with KHR_present_wait.
|
||||
893.632:00d4:00d8:info:vkd3d-proton:dxgi_vk_swap_chain_init_sleep_state: Timer interval is 1.0 ms.
|
||||
warn: DXGI: MakeWindowAssociation: Ignoring flags
|
||||
warn: DxgiOutput::WaitForVBlank: Inaccurate
|
||||
info: Setting timer interval to 1000 us
|
||||
894.203:00d4:0150:info:vkd3d-proton:dxgi_vk_swap_chain_recreate_swapchain_in_present_task: Got 3 swapchain images.
|
||||
894.705:00d4:0158:fixme:vkd3d-proton:vkd3d_texture_view_desc_fixup: Remapping 2D to 2D_ARRAY. Needs Vulkan spec tightening to match D3D12 properly.
|
||||
894.727:00d4:0150:info:vkd3d-proton:dxgi_vk_swap_chain_recreate_swapchain_in_present_task: Got 3 swapchain images.
|
||||
@@ -0,0 +1,89 @@
|
||||
warn: CreateDXGIFactory2: Ignoring flags
|
||||
info: Game: xenia_canary.exe
|
||||
info: DXVK: v2.7.1
|
||||
info: Build: x86_64 gcc 15.1.0
|
||||
info: Vulkan: Found vkGetInstanceProcAddr in winevulkan.dll @ 0x6ffffbfb4000
|
||||
info: Extension providers:
|
||||
info: Platform WSI
|
||||
info: OpenVR
|
||||
info: OpenVR: could not open registry key, status 2
|
||||
info: OpenVR: Failed to locate module
|
||||
info: OpenXR
|
||||
info: Enabled instance extensions:
|
||||
info: VK_EXT_surface_maintenance1
|
||||
info: VK_KHR_get_surface_capabilities2
|
||||
info: VK_KHR_surface
|
||||
info: VK_KHR_win32_surface
|
||||
info: Found device: NVIDIA GeForce GTX 1070 Ti (NVIDIA 580.159.3)
|
||||
info: Found device: llvmpipe (LLVM 20.1.2, 256 bits) (llvmpipe 25.2.8)
|
||||
info: Skipping: Software driver
|
||||
info: DXGI: Hiding actual GPU, reporting:
|
||||
info: vendor ID: 0x1002
|
||||
info: device ID: 0x73df
|
||||
warn: DxgiAdapter::QueryInterface: Unknown interface query
|
||||
warn: f0db4c7f-fe5a-42a2-bd62-f2a6cf6fc83e
|
||||
956.778:00d0:0124:info:vkd3d-proton:vkd3d_instance_apply_application_workarounds: Program name: "xenia_canary.exe" (hash: c099ade372da5277)
|
||||
956.778:00d0:0124:info:vkd3d-proton:vkd3d_instance_deduce_config_flags_from_environment: shader_cache is used, global_pipeline_cache is enforced.
|
||||
956.778:00d0:0124:info:vkd3d-proton:vkd3d_config_flags_init_once: VKD3D_CONFIG=''.
|
||||
956.781:00d0:0124:info:vkd3d-proton:vkd3d_get_vk_version: vkd3d-proton - applicationVersion: 3.0.1.
|
||||
956.781:00d0:0124:info:vkd3d-proton:vkd3d_instance_init: vkd3d-proton - build: 3b10bd7a7ec6a73.
|
||||
956.826:00d0:0124:info:vkd3d-proton:vkd3d_init_device_caps: Not all relevant pipeline stages are supported by EXT_dgc. Skipping.
|
||||
956.983:00d0:0124:info:vkd3d-proton:vkd3d_memory_info_decide_hvv_usage: Topology: Device heaps are split. Assuming small BAR situation.
|
||||
956.983:00d0:0124:info:vkd3d-proton:vkd3d_memory_info_upload_hvv_memory_properties: Topology: HVV usage is not allowed, using HOST_COHERENT for UPLOAD.
|
||||
956.983:00d0:0124:info:vkd3d-proton:vkd3d_bindless_state_get_bindless_flags: Device does not support VK_EXT_mutable_descriptor_type (or VALVE).
|
||||
956.983:00d0:0124:info:vkd3d-proton:vkd3d_bindless_state_add_binding: Device supports VK_EXT_descriptor_buffer!
|
||||
956.983:00d0:0124:info:vkd3d-proton:vkd3d_bindless_state_add_binding: Device supports VK_EXT_descriptor_buffer!
|
||||
956.983:00d0:0124:info:vkd3d-proton:vkd3d_bindless_state_add_binding: Device supports VK_EXT_descriptor_buffer!
|
||||
956.983:00d0:0124:info:vkd3d-proton:vkd3d_bindless_state_add_binding: Device supports VK_EXT_descriptor_buffer!
|
||||
956.983:00d0:0124:info:vkd3d-proton:vkd3d_bindless_state_add_binding: Device supports VK_EXT_descriptor_buffer!
|
||||
956.983:00d0:0124:info:vkd3d-proton:vkd3d_bindless_state_add_binding: Device supports VK_EXT_descriptor_buffer!
|
||||
956.983:00d0:0124:info:vkd3d-proton:vkd3d_bindless_state_add_binding: Device supports VK_EXT_descriptor_buffer!
|
||||
956.985:00d0:0124:info:vkd3d-proton:d3d12_device_caps_init_shader_model: Enabling support for SM 6.6.
|
||||
956.985:00d0:0124:fixme:vkd3d-proton:d3d12_device_caps_init_feature_options1: TotalLaneCount = 2432, may be inaccurate.
|
||||
956.985:00d0:0124:info:vkd3d-proton:d3d12_device_determine_ray_tracing_tier: DXR support enabled.
|
||||
956.985:00d0:0124:info:vkd3d-proton:vkd3d_pipeline_library_init_disk_cache: Remapping VKD3D_SHADER_CACHE to: vkd3d-proton.cache.
|
||||
956.985:00d0:0124:info:vkd3d-proton:vkd3d_pipeline_library_init_disk_cache: Attempting to load disk cache from: vkd3d-proton.cache.
|
||||
956.985:00d0:013c:info:vkd3d-proton:vkd3d_pipeline_library_disk_thread_main: Performing async setup of stream archive ...
|
||||
956.986:00d0:013c:info:vkd3d-proton:vkd3d_pipeline_library_disk_cache_merge: No write cache exists. No need to merge any disk caches.
|
||||
956.986:00d0:013c:info:vkd3d-proton:vkd3d_pipeline_library_disk_cache_initial_setup: Merging pipeline libraries took 0.171 ms.
|
||||
956.986:00d0:013c:info:vkd3d-proton:vkd3d_pipeline_library_disk_cache_initial_setup: Mapping read-only cache took 0.269 ms.
|
||||
956.986:00d0:013c:info:vkd3d-proton:vkd3d_pipeline_library_disk_cache_initial_setup: Parsing stream archive took 0.028 ms.
|
||||
956.986:00d0:013c:info:vkd3d-proton:vkd3d_pipeline_library_disk_thread_main: Done performing async setup of stream archive.
|
||||
957.031:00d0:0124:info:vkd3d-proton:vkd3d_get_vk_version: vkd3d-proton - applicationVersion: 3.0.1.
|
||||
957.031:00d0:0124:info:vkd3d-proton:vkd3d_instance_init: vkd3d-proton - build: 3b10bd7a7ec6a73.
|
||||
957.075:00d0:0124:info:vkd3d-proton:vkd3d_init_device_caps: Not all relevant pipeline stages are supported by EXT_dgc. Skipping.
|
||||
957.186:00d0:0124:info:vkd3d-proton:vkd3d_memory_info_decide_hvv_usage: Topology: Device heaps are split. Assuming small BAR situation.
|
||||
957.186:00d0:0124:info:vkd3d-proton:vkd3d_memory_info_upload_hvv_memory_properties: Topology: HVV usage is not allowed, using HOST_COHERENT for UPLOAD.
|
||||
957.186:00d0:0124:info:vkd3d-proton:vkd3d_bindless_state_get_bindless_flags: Device does not support VK_EXT_mutable_descriptor_type (or VALVE).
|
||||
957.186:00d0:0124:info:vkd3d-proton:vkd3d_bindless_state_add_binding: Device supports VK_EXT_descriptor_buffer!
|
||||
957.186:00d0:0124:info:vkd3d-proton:vkd3d_bindless_state_add_binding: Device supports VK_EXT_descriptor_buffer!
|
||||
957.186:00d0:0124:info:vkd3d-proton:vkd3d_bindless_state_add_binding: Device supports VK_EXT_descriptor_buffer!
|
||||
957.186:00d0:0124:info:vkd3d-proton:vkd3d_bindless_state_add_binding: Device supports VK_EXT_descriptor_buffer!
|
||||
957.186:00d0:0124:info:vkd3d-proton:vkd3d_bindless_state_add_binding: Device supports VK_EXT_descriptor_buffer!
|
||||
957.186:00d0:0124:info:vkd3d-proton:vkd3d_bindless_state_add_binding: Device supports VK_EXT_descriptor_buffer!
|
||||
957.186:00d0:0124:info:vkd3d-proton:vkd3d_bindless_state_add_binding: Device supports VK_EXT_descriptor_buffer!
|
||||
957.188:00d0:0124:info:vkd3d-proton:d3d12_device_caps_init_shader_model: Enabling support for SM 6.6.
|
||||
957.188:00d0:0124:fixme:vkd3d-proton:d3d12_device_caps_init_feature_options1: TotalLaneCount = 2432, may be inaccurate.
|
||||
957.188:00d0:0124:info:vkd3d-proton:d3d12_device_determine_ray_tracing_tier: DXR support enabled.
|
||||
957.188:00d0:0124:info:vkd3d-proton:vkd3d_pipeline_library_init_disk_cache: Remapping VKD3D_SHADER_CACHE to: vkd3d-proton.cache.
|
||||
957.188:00d0:0124:info:vkd3d-proton:vkd3d_pipeline_library_init_disk_cache: Attempting to load disk cache from: vkd3d-proton.cache.
|
||||
957.188:00d0:0144:info:vkd3d-proton:vkd3d_pipeline_library_disk_thread_main: Performing async setup of stream archive ...
|
||||
957.188:00d0:0144:info:vkd3d-proton:vkd3d_pipeline_library_disk_cache_merge: No write cache exists. No need to merge any disk caches.
|
||||
957.189:00d0:0144:info:vkd3d-proton:vkd3d_pipeline_library_disk_cache_initial_setup: Merging pipeline libraries took 0.172 ms.
|
||||
957.189:00d0:0144:info:vkd3d-proton:vkd3d_pipeline_library_disk_cache_initial_setup: Mapping read-only cache took 0.231 ms.
|
||||
957.189:00d0:0144:info:vkd3d-proton:vkd3d_pipeline_library_disk_cache_initial_setup: Parsing stream archive took 0.029 ms.
|
||||
957.189:00d0:0144:info:vkd3d-proton:vkd3d_pipeline_library_disk_thread_main: Done performing async setup of stream archive.
|
||||
957.195:00d0:0124:fixme:vkd3d-proton:d3d12_command_queue_init: Ignoring priority 0x64.
|
||||
warn: DXGIGetDebugInterface1: Stub
|
||||
info: DXGI: Hiding actual GPU, reporting:
|
||||
info: vendor ID: 0x1002
|
||||
info: device ID: 0x73df
|
||||
957.285:00d0:00d4:info:vkd3d-proton:dxgi_vk_swap_chain_init: Creating swapchain (1280 x 720), BufferCount = 3.
|
||||
957.295:00d0:00d4:info:vkd3d-proton:dxgi_vk_swap_chain_init_sync_objects: Ensure maximum latency of 3 frames with KHR_present_wait.
|
||||
957.295:00d0:00d4:info:vkd3d-proton:dxgi_vk_swap_chain_init_sleep_state: Timer interval is 1.0 ms.
|
||||
warn: DXGI: MakeWindowAssociation: Ignoring flags
|
||||
warn: DxgiOutput::WaitForVBlank: Inaccurate
|
||||
info: Setting timer interval to 1000 us
|
||||
957.806:00d0:014c:info:vkd3d-proton:dxgi_vk_swap_chain_recreate_swapchain_in_present_task: Got 3 swapchain images.
|
||||
958.343:00d0:0154:fixme:vkd3d-proton:vkd3d_texture_view_desc_fixup: Remapping 2D to 2D_ARRAY. Needs Vulkan spec tightening to match D3D12 properly.
|
||||
958.382:00d0:014c:info:vkd3d-proton:dxgi_vk_swap_chain_recreate_swapchain_in_present_task: Got 3 swapchain images.
|
||||
@@ -0,0 +1,89 @@
|
||||
warn: CreateDXGIFactory2: Ignoring flags
|
||||
info: Game: xenia_canary.exe
|
||||
info: DXVK: v2.7.1
|
||||
info: Build: x86_64 gcc 15.1.0
|
||||
info: Vulkan: Found vkGetInstanceProcAddr in winevulkan.dll @ 0x6ffffbfb4000
|
||||
info: Extension providers:
|
||||
info: Platform WSI
|
||||
info: OpenVR
|
||||
info: OpenVR: could not open registry key, status 2
|
||||
info: OpenVR: Failed to locate module
|
||||
info: OpenXR
|
||||
info: Enabled instance extensions:
|
||||
info: VK_EXT_surface_maintenance1
|
||||
info: VK_KHR_get_surface_capabilities2
|
||||
info: VK_KHR_surface
|
||||
info: VK_KHR_win32_surface
|
||||
info: Found device: NVIDIA GeForce GTX 1070 Ti (NVIDIA 580.159.3)
|
||||
info: Found device: llvmpipe (LLVM 20.1.2, 256 bits) (llvmpipe 25.2.8)
|
||||
info: Skipping: Software driver
|
||||
info: DXGI: Hiding actual GPU, reporting:
|
||||
info: vendor ID: 0x1002
|
||||
info: device ID: 0x73df
|
||||
warn: DxgiAdapter::QueryInterface: Unknown interface query
|
||||
warn: f0db4c7f-fe5a-42a2-bd62-f2a6cf6fc83e
|
||||
1217.108:00d4:0128:info:vkd3d-proton:vkd3d_instance_apply_application_workarounds: Program name: "xenia_canary.exe" (hash: c099ade372da5277)
|
||||
1217.108:00d4:0128:info:vkd3d-proton:vkd3d_instance_deduce_config_flags_from_environment: shader_cache is used, global_pipeline_cache is enforced.
|
||||
1217.108:00d4:0128:info:vkd3d-proton:vkd3d_config_flags_init_once: VKD3D_CONFIG=''.
|
||||
1217.111:00d4:0128:info:vkd3d-proton:vkd3d_get_vk_version: vkd3d-proton - applicationVersion: 3.0.1.
|
||||
1217.111:00d4:0128:info:vkd3d-proton:vkd3d_instance_init: vkd3d-proton - build: 3b10bd7a7ec6a73.
|
||||
1217.160:00d4:0128:info:vkd3d-proton:vkd3d_init_device_caps: Not all relevant pipeline stages are supported by EXT_dgc. Skipping.
|
||||
1217.307:00d4:0128:info:vkd3d-proton:vkd3d_memory_info_decide_hvv_usage: Topology: Device heaps are split. Assuming small BAR situation.
|
||||
1217.307:00d4:0128:info:vkd3d-proton:vkd3d_memory_info_upload_hvv_memory_properties: Topology: HVV usage is not allowed, using HOST_COHERENT for UPLOAD.
|
||||
1217.307:00d4:0128:info:vkd3d-proton:vkd3d_bindless_state_get_bindless_flags: Device does not support VK_EXT_mutable_descriptor_type (or VALVE).
|
||||
1217.307:00d4:0128:info:vkd3d-proton:vkd3d_bindless_state_add_binding: Device supports VK_EXT_descriptor_buffer!
|
||||
1217.307:00d4:0128:info:vkd3d-proton:vkd3d_bindless_state_add_binding: Device supports VK_EXT_descriptor_buffer!
|
||||
1217.307:00d4:0128:info:vkd3d-proton:vkd3d_bindless_state_add_binding: Device supports VK_EXT_descriptor_buffer!
|
||||
1217.307:00d4:0128:info:vkd3d-proton:vkd3d_bindless_state_add_binding: Device supports VK_EXT_descriptor_buffer!
|
||||
1217.307:00d4:0128:info:vkd3d-proton:vkd3d_bindless_state_add_binding: Device supports VK_EXT_descriptor_buffer!
|
||||
1217.307:00d4:0128:info:vkd3d-proton:vkd3d_bindless_state_add_binding: Device supports VK_EXT_descriptor_buffer!
|
||||
1217.307:00d4:0128:info:vkd3d-proton:vkd3d_bindless_state_add_binding: Device supports VK_EXT_descriptor_buffer!
|
||||
1217.309:00d4:0128:info:vkd3d-proton:d3d12_device_caps_init_shader_model: Enabling support for SM 6.6.
|
||||
1217.309:00d4:0128:fixme:vkd3d-proton:d3d12_device_caps_init_feature_options1: TotalLaneCount = 2432, may be inaccurate.
|
||||
1217.309:00d4:0128:info:vkd3d-proton:d3d12_device_determine_ray_tracing_tier: DXR support enabled.
|
||||
1217.309:00d4:0128:info:vkd3d-proton:vkd3d_pipeline_library_init_disk_cache: Remapping VKD3D_SHADER_CACHE to: vkd3d-proton.cache.
|
||||
1217.309:00d4:0128:info:vkd3d-proton:vkd3d_pipeline_library_init_disk_cache: Attempting to load disk cache from: vkd3d-proton.cache.
|
||||
1217.310:00d4:0140:info:vkd3d-proton:vkd3d_pipeline_library_disk_thread_main: Performing async setup of stream archive ...
|
||||
1217.310:00d4:0140:info:vkd3d-proton:vkd3d_pipeline_library_disk_cache_merge: No write cache exists. No need to merge any disk caches.
|
||||
1217.310:00d4:0140:info:vkd3d-proton:vkd3d_pipeline_library_disk_cache_initial_setup: Merging pipeline libraries took 0.166 ms.
|
||||
1217.310:00d4:0140:info:vkd3d-proton:vkd3d_pipeline_library_disk_cache_initial_setup: Mapping read-only cache took 0.173 ms.
|
||||
1217.310:00d4:0140:info:vkd3d-proton:vkd3d_pipeline_library_disk_cache_initial_setup: Parsing stream archive took 0.031 ms.
|
||||
1217.310:00d4:0140:info:vkd3d-proton:vkd3d_pipeline_library_disk_thread_main: Done performing async setup of stream archive.
|
||||
1217.360:00d4:0128:info:vkd3d-proton:vkd3d_get_vk_version: vkd3d-proton - applicationVersion: 3.0.1.
|
||||
1217.360:00d4:0128:info:vkd3d-proton:vkd3d_instance_init: vkd3d-proton - build: 3b10bd7a7ec6a73.
|
||||
1217.403:00d4:0128:info:vkd3d-proton:vkd3d_init_device_caps: Not all relevant pipeline stages are supported by EXT_dgc. Skipping.
|
||||
1217.515:00d4:0128:info:vkd3d-proton:vkd3d_memory_info_decide_hvv_usage: Topology: Device heaps are split. Assuming small BAR situation.
|
||||
1217.515:00d4:0128:info:vkd3d-proton:vkd3d_memory_info_upload_hvv_memory_properties: Topology: HVV usage is not allowed, using HOST_COHERENT for UPLOAD.
|
||||
1217.515:00d4:0128:info:vkd3d-proton:vkd3d_bindless_state_get_bindless_flags: Device does not support VK_EXT_mutable_descriptor_type (or VALVE).
|
||||
1217.515:00d4:0128:info:vkd3d-proton:vkd3d_bindless_state_add_binding: Device supports VK_EXT_descriptor_buffer!
|
||||
1217.515:00d4:0128:info:vkd3d-proton:vkd3d_bindless_state_add_binding: Device supports VK_EXT_descriptor_buffer!
|
||||
1217.515:00d4:0128:info:vkd3d-proton:vkd3d_bindless_state_add_binding: Device supports VK_EXT_descriptor_buffer!
|
||||
1217.515:00d4:0128:info:vkd3d-proton:vkd3d_bindless_state_add_binding: Device supports VK_EXT_descriptor_buffer!
|
||||
1217.515:00d4:0128:info:vkd3d-proton:vkd3d_bindless_state_add_binding: Device supports VK_EXT_descriptor_buffer!
|
||||
1217.515:00d4:0128:info:vkd3d-proton:vkd3d_bindless_state_add_binding: Device supports VK_EXT_descriptor_buffer!
|
||||
1217.515:00d4:0128:info:vkd3d-proton:vkd3d_bindless_state_add_binding: Device supports VK_EXT_descriptor_buffer!
|
||||
1217.516:00d4:0128:info:vkd3d-proton:d3d12_device_caps_init_shader_model: Enabling support for SM 6.6.
|
||||
1217.516:00d4:0128:fixme:vkd3d-proton:d3d12_device_caps_init_feature_options1: TotalLaneCount = 2432, may be inaccurate.
|
||||
1217.516:00d4:0128:info:vkd3d-proton:d3d12_device_determine_ray_tracing_tier: DXR support enabled.
|
||||
1217.516:00d4:0128:info:vkd3d-proton:vkd3d_pipeline_library_init_disk_cache: Remapping VKD3D_SHADER_CACHE to: vkd3d-proton.cache.
|
||||
1217.516:00d4:0128:info:vkd3d-proton:vkd3d_pipeline_library_init_disk_cache: Attempting to load disk cache from: vkd3d-proton.cache.
|
||||
1217.517:00d4:0148:info:vkd3d-proton:vkd3d_pipeline_library_disk_thread_main: Performing async setup of stream archive ...
|
||||
1217.517:00d4:0148:info:vkd3d-proton:vkd3d_pipeline_library_disk_cache_merge: No write cache exists. No need to merge any disk caches.
|
||||
1217.517:00d4:0148:info:vkd3d-proton:vkd3d_pipeline_library_disk_cache_initial_setup: Merging pipeline libraries took 0.157 ms.
|
||||
1217.517:00d4:0148:info:vkd3d-proton:vkd3d_pipeline_library_disk_cache_initial_setup: Mapping read-only cache took 0.208 ms.
|
||||
1217.518:00d4:0148:info:vkd3d-proton:vkd3d_pipeline_library_disk_cache_initial_setup: Parsing stream archive took 0.032 ms.
|
||||
1217.518:00d4:0148:info:vkd3d-proton:vkd3d_pipeline_library_disk_thread_main: Done performing async setup of stream archive.
|
||||
1217.524:00d4:0128:fixme:vkd3d-proton:d3d12_command_queue_init: Ignoring priority 0x64.
|
||||
warn: DXGIGetDebugInterface1: Stub
|
||||
info: DXGI: Hiding actual GPU, reporting:
|
||||
info: vendor ID: 0x1002
|
||||
info: device ID: 0x73df
|
||||
1217.612:00d4:00d8:info:vkd3d-proton:dxgi_vk_swap_chain_init: Creating swapchain (1280 x 720), BufferCount = 3.
|
||||
1217.622:00d4:00d8:info:vkd3d-proton:dxgi_vk_swap_chain_init_sync_objects: Ensure maximum latency of 3 frames with KHR_present_wait.
|
||||
1217.622:00d4:00d8:info:vkd3d-proton:dxgi_vk_swap_chain_init_sleep_state: Timer interval is 1.0 ms.
|
||||
warn: DXGI: MakeWindowAssociation: Ignoring flags
|
||||
warn: DxgiOutput::WaitForVBlank: Inaccurate
|
||||
info: Setting timer interval to 1000 us
|
||||
1218.136:00d4:0150:info:vkd3d-proton:dxgi_vk_swap_chain_recreate_swapchain_in_present_task: Got 3 swapchain images.
|
||||
1218.678:00d4:0158:fixme:vkd3d-proton:vkd3d_texture_view_desc_fixup: Remapping 2D to 2D_ARRAY. Needs Vulkan spec tightening to match D3D12 properly.
|
||||
1218.699:00d4:0150:info:vkd3d-proton:dxgi_vk_swap_chain_recreate_swapchain_in_present_task: Got 3 swapchain images.
|
||||
@@ -0,0 +1,89 @@
|
||||
warn: CreateDXGIFactory2: Ignoring flags
|
||||
info: Game: xenia_canary.exe
|
||||
info: DXVK: v2.7.1
|
||||
info: Build: x86_64 gcc 15.1.0
|
||||
info: Vulkan: Found vkGetInstanceProcAddr in winevulkan.dll @ 0x6ffffbfb4000
|
||||
info: Extension providers:
|
||||
info: Platform WSI
|
||||
info: OpenVR
|
||||
info: OpenVR: could not open registry key, status 2
|
||||
info: OpenVR: Failed to locate module
|
||||
info: OpenXR
|
||||
info: Enabled instance extensions:
|
||||
info: VK_EXT_surface_maintenance1
|
||||
info: VK_KHR_get_surface_capabilities2
|
||||
info: VK_KHR_surface
|
||||
info: VK_KHR_win32_surface
|
||||
info: Found device: NVIDIA GeForce GTX 1070 Ti (NVIDIA 580.159.3)
|
||||
info: Found device: llvmpipe (LLVM 20.1.2, 256 bits) (llvmpipe 25.2.8)
|
||||
info: Skipping: Software driver
|
||||
info: DXGI: Hiding actual GPU, reporting:
|
||||
info: vendor ID: 0x1002
|
||||
info: device ID: 0x73df
|
||||
warn: DxgiAdapter::QueryInterface: Unknown interface query
|
||||
warn: f0db4c7f-fe5a-42a2-bd62-f2a6cf6fc83e
|
||||
1413.916:00d0:0124:info:vkd3d-proton:vkd3d_instance_apply_application_workarounds: Program name: "xenia_canary.exe" (hash: c099ade372da5277)
|
||||
1413.916:00d0:0124:info:vkd3d-proton:vkd3d_instance_deduce_config_flags_from_environment: shader_cache is used, global_pipeline_cache is enforced.
|
||||
1413.916:00d0:0124:info:vkd3d-proton:vkd3d_config_flags_init_once: VKD3D_CONFIG=''.
|
||||
1413.919:00d0:0124:info:vkd3d-proton:vkd3d_get_vk_version: vkd3d-proton - applicationVersion: 3.0.1.
|
||||
1413.919:00d0:0124:info:vkd3d-proton:vkd3d_instance_init: vkd3d-proton - build: 3b10bd7a7ec6a73.
|
||||
1413.963:00d0:0124:info:vkd3d-proton:vkd3d_init_device_caps: Not all relevant pipeline stages are supported by EXT_dgc. Skipping.
|
||||
1414.109:00d0:0124:info:vkd3d-proton:vkd3d_memory_info_decide_hvv_usage: Topology: Device heaps are split. Assuming small BAR situation.
|
||||
1414.109:00d0:0124:info:vkd3d-proton:vkd3d_memory_info_upload_hvv_memory_properties: Topology: HVV usage is not allowed, using HOST_COHERENT for UPLOAD.
|
||||
1414.109:00d0:0124:info:vkd3d-proton:vkd3d_bindless_state_get_bindless_flags: Device does not support VK_EXT_mutable_descriptor_type (or VALVE).
|
||||
1414.109:00d0:0124:info:vkd3d-proton:vkd3d_bindless_state_add_binding: Device supports VK_EXT_descriptor_buffer!
|
||||
1414.109:00d0:0124:info:vkd3d-proton:vkd3d_bindless_state_add_binding: Device supports VK_EXT_descriptor_buffer!
|
||||
1414.109:00d0:0124:info:vkd3d-proton:vkd3d_bindless_state_add_binding: Device supports VK_EXT_descriptor_buffer!
|
||||
1414.109:00d0:0124:info:vkd3d-proton:vkd3d_bindless_state_add_binding: Device supports VK_EXT_descriptor_buffer!
|
||||
1414.109:00d0:0124:info:vkd3d-proton:vkd3d_bindless_state_add_binding: Device supports VK_EXT_descriptor_buffer!
|
||||
1414.109:00d0:0124:info:vkd3d-proton:vkd3d_bindless_state_add_binding: Device supports VK_EXT_descriptor_buffer!
|
||||
1414.109:00d0:0124:info:vkd3d-proton:vkd3d_bindless_state_add_binding: Device supports VK_EXT_descriptor_buffer!
|
||||
1414.111:00d0:0124:info:vkd3d-proton:d3d12_device_caps_init_shader_model: Enabling support for SM 6.6.
|
||||
1414.111:00d0:0124:fixme:vkd3d-proton:d3d12_device_caps_init_feature_options1: TotalLaneCount = 2432, may be inaccurate.
|
||||
1414.111:00d0:0124:info:vkd3d-proton:d3d12_device_determine_ray_tracing_tier: DXR support enabled.
|
||||
1414.111:00d0:0124:info:vkd3d-proton:vkd3d_pipeline_library_init_disk_cache: Remapping VKD3D_SHADER_CACHE to: vkd3d-proton.cache.
|
||||
1414.111:00d0:0124:info:vkd3d-proton:vkd3d_pipeline_library_init_disk_cache: Attempting to load disk cache from: vkd3d-proton.cache.
|
||||
1414.112:00d0:013c:info:vkd3d-proton:vkd3d_pipeline_library_disk_thread_main: Performing async setup of stream archive ...
|
||||
1414.112:00d0:013c:info:vkd3d-proton:vkd3d_pipeline_library_disk_cache_merge: No write cache exists. No need to merge any disk caches.
|
||||
1414.112:00d0:013c:info:vkd3d-proton:vkd3d_pipeline_library_disk_cache_initial_setup: Merging pipeline libraries took 0.173 ms.
|
||||
1414.113:00d0:013c:info:vkd3d-proton:vkd3d_pipeline_library_disk_cache_initial_setup: Mapping read-only cache took 0.276 ms.
|
||||
1414.113:00d0:013c:info:vkd3d-proton:vkd3d_pipeline_library_disk_cache_initial_setup: Parsing stream archive took 0.029 ms.
|
||||
1414.113:00d0:013c:info:vkd3d-proton:vkd3d_pipeline_library_disk_thread_main: Done performing async setup of stream archive.
|
||||
1414.157:00d0:0124:info:vkd3d-proton:vkd3d_get_vk_version: vkd3d-proton - applicationVersion: 3.0.1.
|
||||
1414.157:00d0:0124:info:vkd3d-proton:vkd3d_instance_init: vkd3d-proton - build: 3b10bd7a7ec6a73.
|
||||
1414.199:00d0:0124:info:vkd3d-proton:vkd3d_init_device_caps: Not all relevant pipeline stages are supported by EXT_dgc. Skipping.
|
||||
1414.310:00d0:0124:info:vkd3d-proton:vkd3d_memory_info_decide_hvv_usage: Topology: Device heaps are split. Assuming small BAR situation.
|
||||
1414.310:00d0:0124:info:vkd3d-proton:vkd3d_memory_info_upload_hvv_memory_properties: Topology: HVV usage is not allowed, using HOST_COHERENT for UPLOAD.
|
||||
1414.311:00d0:0124:info:vkd3d-proton:vkd3d_bindless_state_get_bindless_flags: Device does not support VK_EXT_mutable_descriptor_type (or VALVE).
|
||||
1414.311:00d0:0124:info:vkd3d-proton:vkd3d_bindless_state_add_binding: Device supports VK_EXT_descriptor_buffer!
|
||||
1414.311:00d0:0124:info:vkd3d-proton:vkd3d_bindless_state_add_binding: Device supports VK_EXT_descriptor_buffer!
|
||||
1414.311:00d0:0124:info:vkd3d-proton:vkd3d_bindless_state_add_binding: Device supports VK_EXT_descriptor_buffer!
|
||||
1414.311:00d0:0124:info:vkd3d-proton:vkd3d_bindless_state_add_binding: Device supports VK_EXT_descriptor_buffer!
|
||||
1414.311:00d0:0124:info:vkd3d-proton:vkd3d_bindless_state_add_binding: Device supports VK_EXT_descriptor_buffer!
|
||||
1414.311:00d0:0124:info:vkd3d-proton:vkd3d_bindless_state_add_binding: Device supports VK_EXT_descriptor_buffer!
|
||||
1414.311:00d0:0124:info:vkd3d-proton:vkd3d_bindless_state_add_binding: Device supports VK_EXT_descriptor_buffer!
|
||||
1414.312:00d0:0124:info:vkd3d-proton:d3d12_device_caps_init_shader_model: Enabling support for SM 6.6.
|
||||
1414.312:00d0:0124:fixme:vkd3d-proton:d3d12_device_caps_init_feature_options1: TotalLaneCount = 2432, may be inaccurate.
|
||||
1414.312:00d0:0124:info:vkd3d-proton:d3d12_device_determine_ray_tracing_tier: DXR support enabled.
|
||||
1414.312:00d0:0124:info:vkd3d-proton:vkd3d_pipeline_library_init_disk_cache: Remapping VKD3D_SHADER_CACHE to: vkd3d-proton.cache.
|
||||
1414.312:00d0:0124:info:vkd3d-proton:vkd3d_pipeline_library_init_disk_cache: Attempting to load disk cache from: vkd3d-proton.cache.
|
||||
1414.313:00d0:0144:info:vkd3d-proton:vkd3d_pipeline_library_disk_thread_main: Performing async setup of stream archive ...
|
||||
1414.313:00d0:0144:info:vkd3d-proton:vkd3d_pipeline_library_disk_cache_merge: No write cache exists. No need to merge any disk caches.
|
||||
1414.313:00d0:0144:info:vkd3d-proton:vkd3d_pipeline_library_disk_cache_initial_setup: Merging pipeline libraries took 0.158 ms.
|
||||
1414.313:00d0:0144:info:vkd3d-proton:vkd3d_pipeline_library_disk_cache_initial_setup: Mapping read-only cache took 0.256 ms.
|
||||
1414.313:00d0:0144:info:vkd3d-proton:vkd3d_pipeline_library_disk_cache_initial_setup: Parsing stream archive took 0.031 ms.
|
||||
1414.313:00d0:0144:info:vkd3d-proton:vkd3d_pipeline_library_disk_thread_main: Done performing async setup of stream archive.
|
||||
1414.319:00d0:0124:fixme:vkd3d-proton:d3d12_command_queue_init: Ignoring priority 0x64.
|
||||
warn: DXGIGetDebugInterface1: Stub
|
||||
info: DXGI: Hiding actual GPU, reporting:
|
||||
info: vendor ID: 0x1002
|
||||
info: device ID: 0x73df
|
||||
1414.406:00d0:00d4:info:vkd3d-proton:dxgi_vk_swap_chain_init: Creating swapchain (1280 x 720), BufferCount = 3.
|
||||
1414.416:00d0:00d4:info:vkd3d-proton:dxgi_vk_swap_chain_init_sync_objects: Ensure maximum latency of 3 frames with KHR_present_wait.
|
||||
1414.416:00d0:00d4:info:vkd3d-proton:dxgi_vk_swap_chain_init_sleep_state: Timer interval is 1.0 ms.
|
||||
warn: DXGI: MakeWindowAssociation: Ignoring flags
|
||||
warn: DxgiOutput::WaitForVBlank: Inaccurate
|
||||
info: Setting timer interval to 1000 us
|
||||
1414.927:00d0:014c:info:vkd3d-proton:dxgi_vk_swap_chain_recreate_swapchain_in_present_task: Got 3 swapchain images.
|
||||
1415.477:00d0:0154:fixme:vkd3d-proton:vkd3d_texture_view_desc_fixup: Remapping 2D to 2D_ARRAY. Needs Vulkan spec tightening to match D3D12 properly.
|
||||
1415.500:00d0:014c:info:vkd3d-proton:dxgi_vk_swap_chain_recreate_swapchain_in_present_task: Got 3 swapchain images.
|
||||
File diff suppressed because it is too large
47
audit-runs/iterate-2D-deferred-fixes/DEFERRED_FIXES.md
Normal file
@@ -0,0 +1,47 @@
# iterate-2D Deferred Structural Fixes — Outcome

Branch `iterate-2D/subsystem-fixes`. After verification + the user's go-ahead:

## Issue 1 — 32-bit word-form ALU truncation (PPCBUG-020) — ✅ FIXED & LANDED
Commit **341196a**. Confirmed load-bearing via runtime ours-vs-canary capture:
Sylpheed's ms→LARGE_INTEGER converter `sub_824ACA88` (`clrldi; mulli r11,r11,-10000; std`)
produced `0x00000000_FFFD8F00` in ours vs canary's correct `0xFFFFFFFF_FFFD8F00` for a 16 ms
wait — a positive (absolute) timeout → ~26000× over-wait that froze the main frame loop.
Fixed the 17 data-losing word-form ops (full 64-bit result, CA/OV/CR0 preserved byte-identical),
updated 7 bug-asserting tests, re-baselined `sylpheed_n50m` (imports 40454→1790936), `sylpheed_n2m`
unchanged. 660/660 + ignored oracle green; lockstep determinism preserved. Boot unwedged
(parallel NtWaitForMultipleObjectsEx 94→30428; frozen worker/critical-section loops now run).
VdSwap still 1 — rendering progression needs the out-of-scope acd1656 fixes (nt_create_event
polarity + 2.AF), not in this branch.
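
The Issue 1 arithmetic can be reproduced in isolation. The sketch below is not the emulator's interpreter code: `mulli_truncated`, `mulli_full`, and `overwait_ratio` are hypothetical names, and it assumes the pre-fix bug kept only the low 32 bits of the product (zero-extended), which is what the two register values quoted above imply.

```rust
/// Hypothetical model of the pre-fix behaviour: the 64-bit product is cut
/// down to its low 32 bits and zero-extended back into the register.
fn mulli_truncated(ra: u64, simm: i16) -> u64 {
    let full = (ra as i64).wrapping_mul(simm as i64) as u64;
    full & 0xFFFF_FFFF
}

/// Post-fix behaviour: keep the full 64-bit two's-complement product.
fn mulli_full(ra: u64, simm: i16) -> u64 {
    (ra as i64).wrapping_mul(simm as i64) as u64
}

/// Rough size of the over-wait, ASSUMING the corrupted value is read as a
/// wait length in 100 ns units. Returns None for a zero-length wait.
fn overwait_ratio(ms: u64) -> Option<u64> {
    let intended_100ns = ms * 10_000;
    mulli_truncated(ms, -10_000).checked_div(intended_100ns)
}
```

For a 16 ms wait, `mulli_full(16, -10_000)` gives `0xFFFFFFFF_FFFD8F00` and `mulli_truncated(16, -10_000)` gives `0x00000000_FFFD8F00`. Under the stated assumption `overwait_ratio(16)` is `Some(26842)`, in line with the ~26000× figure.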

## Issue 2 — Memory page-size per-region collapse — DEFERRED (verified NOT load-bearing)
Sylpheed requests `MmAllocatePhysicalMemoryEx` with flags=0, alignment(r8)=0 (default); ours returns
self-consistent 4K-aligned addresses and boots. ours has no 0xA0/0xC0/0xE0 physical-region model at
all, so a faithful fix is a region-model rewrite that shifts every physical guest VA (golden-breaking,
invalidates the audit-059 VA map) with no demonstrated boot benefit. A partial page-size-only change
would shift VAs for zero correctness gain — do NOT do it piecemeal. Pursue only if a render-path
struct is proven to depend on physical region/alignment.

## Issue 3 — Timing — LEFT (not load-bearing / determinism-coupled)
- 3d DPC/APC: INERT — the only timer (NtSetTimerEx) passes a NULL APC routine; no
  NtQueueApcThread/KeInsertQueueDpc imported.
- 3b timeout sign: was a SYMPTOM of Issue 1 (the "positive absolute" timeouts were mulli-corruption
  artifacts) — resolved by the Issue 1 fix.
- 3a/3c timebase/skew: timebase = instruction-count IS the deterministic lockstep clock; must not
  become wallclock. 2.AF deadline-drain already present. Not load-bearing for Sylpheed.
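
Item 3b rests on the NT timeout sign convention: a negative `LARGE_INTEGER` is a relative interval and a non-negative one is an absolute time, both in 100 ns units. A minimal sketch of that classification follows; the `Timeout` enum and `classify_timeout` are hypothetical names for illustration, not the project's wait-path types.

```rust
/// Illustrative decoding of an NT-style timeout value (100 ns units).
#[derive(Debug, PartialEq, Eq)]
enum Timeout {
    /// Negative raw value: wait this long from "now".
    Relative { ticks_100ns: u64 },
    /// Non-negative raw value: wait until this absolute time.
    Absolute { filetime_100ns: u64 },
}

fn classify_timeout(raw: i64) -> Timeout {
    if raw < 0 {
        // unsigned_abs avoids overflow on i64::MIN.
        Timeout::Relative { ticks_100ns: raw.unsigned_abs() }
    } else {
        Timeout::Absolute { filetime_100ns: raw as u64 }
    }
}
```

With the correct product, `classify_timeout(-160_000)` is a 16 ms relative wait. The truncated value `0x0000_0000_FFFD_8F00` is non-negative, so it is classified as absolute, which is why the sign problem disappeared once the multiply was fixed.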

## Issue 4 — VFS synthesized-success-on-miss — LEFT (risky / coupled to Issue 1 trajectory)
The synthesis fallback handles a MIX (writable-partition probes partition0/Cache0 + a genuine disc
miss dat/files.tbl, verified absent from the ISO). Canary doesn't fire XamShowDirtyDiscErrorUI during
boot (the one "DirtyDisc" log hit is the import-table declaration). Not cleanly separable without
heuristic disc-vs-partition routing. Re-verify on the corrected post-Issue-1 (and post-acd1656)
trajectory before changing.

## Issue 5 — Mutant object — SKIPPED (verified unused)
Sylpheed's XEX import table contains NO mutant symbols (NtCreateMutant/NtReleaseMutant/KeReleaseMutant/
KeInitializeMutant/NtQueryMutant) — the game cannot call them; unimplemented=0 across boot. A correct
implementation needs mutant hand-off semantics + an owner-type redesign (the existing
`Mutex { owner: Option<u8> }` tracks a HW slot, not a thread) in the determinism-critical wait path,
for code that never executes. Per the mandate's skip-if-unused criterion, left unimplemented. Can be
added on request as a pure canary-parity / future-title feature (determinism-safe since no Sylpheed
mutant ever exists at runtime).
|
||||
@@ -242,6 +242,44 @@ enum Commands {
|
||||
/// line). Stdout when omitted.
|
||||
#[arg(long)]
|
||||
lr_trace_out: Option<String>,
|
||||
/// AUDIT-2BF — comma-separated list of guest PCs (hex, no `0x`
|
||||
/// prefix required) to capture as one-line `AUDIT-PC-PROBE`
|
||||
/// records on every fire. Designed for the silph init chain
|
||||
/// virtual-dispatch site at `sub_82172BA0+0x1E8` (PC
|
||||
/// `0x82172D88`, a `bctrl` after a 3-deep vtable-slot-6 load).
|
||||
/// Each record carries (pc, tid, hw, cycle, lr, r3, r11) plus
|
||||
/// four guest-memory dereferences off r3: `[r3+0]` (vtable),
|
||||
/// `[[r3+0]+24]` (slot 6 method = bctrl target), `[r3+0x0C]`
|
||||
/// (auxiliary handle), `[r3+0x30]` (embedded sub-object vtable).
|
||||
/// Compares directly against canary's round-9 capture:
|
||||
/// r3=0xBCCC52C0, [r3+0]=0x820A3644, slot6=sub_821B55D8,
|
||||
/// [r3+0xC]=0xF80000D8, [r3+0x30]=0x820A1870. Read-only;
|
||||
/// lockstep digest unaffected. Settable via
|
||||
/// `XENIA_AUDIT_PC_PROBE`. Example:
|
||||
/// `--audit-pc-probe-hex=82172D88,82172D80`.
|
||||
#[arg(long)]
|
||||
audit_pc_probe_hex: Option<String>,
|
||||
/// AUDIT-2BF round 14 — guest VA (hex, optional `0x` prefix) to
|
||||
/// dereference 3 deep on every `--audit-pc-probe-hex` fire.
|
||||
/// Emits a paired `AUDIT-MEM-READ` line with the singleton value,
|
||||
/// vtable, vtable[0] (= first virtual method, the bctrl target
|
||||
/// at `0x822F1B4C`), and vtable[24] (= slot 6 = canary's silph
|
||||
/// chain target `sub_821B55D8`). Compare ours vs canary to
|
||||
/// determine whether the bctrl dispatches to the same function
|
||||
/// or a different one. Read-only; lockstep digest unaffected.
|
||||
/// Settable via `XENIA_AUDIT_MEM_READ`. Example:
|
||||
/// `--audit-mem-read-hex=828E1F08`.
|
||||
#[arg(long)]
|
||||
audit_mem_read_hex: Option<String>,
|
||||
/// AUDIT-052 — number of bytes (4-byte aligned, max 256) to
|
||||
/// dump from `r3` on every `--audit-pc-probe-hex` fire. Emits a
|
||||
/// paired `AUDIT-R3-DUMP` line with the u32 lanes. Designed for
|
||||
/// the 80-byte stack-local struct at `sub_82452DC0` (`r31+96`)
|
||||
/// when probing `sub_8245B000` entry — where `r3` IS the struct
|
||||
/// pointer. Read-only; lockstep digest unaffected. Settable via
|
||||
/// `XENIA_AUDIT_R3_DUMP_BYTES`. Example: `--audit-r3-dump-bytes=80`.
|
||||
#[arg(long)]
|
||||
audit_r3_dump_bytes: Option<u32>,
|
||||
},
|
||||
/// Browse XISO disc image contents
|
||||
Browse {
|
||||
@@ -405,6 +443,9 @@ fn main() -> Result<()> {
|
||||
probe_db,
|
||||
lr_trace,
|
||||
lr_trace_out,
|
||||
audit_pc_probe_hex,
|
||||
audit_mem_read_hex,
|
||||
audit_r3_dump_bytes,
|
||||
} => cmd_exec(
|
||||
&path,
|
||||
max_instructions,
|
||||
@@ -431,6 +472,9 @@ fn main() -> Result<()> {
|
||||
probe_db.as_deref(),
|
||||
lr_trace.as_deref(),
|
||||
lr_trace_out.as_deref(),
|
||||
audit_pc_probe_hex.as_deref(),
|
||||
audit_mem_read_hex.as_deref(),
|
||||
audit_r3_dump_bytes,
|
||||
),
|
||||
Commands::Browse { path } => cmd_browse(&path),
|
||||
Commands::Info { path } => cmd_info(&path),
|
||||
@@ -662,6 +706,9 @@ fn cmd_exec(
|
||||
probe_db: Option<&str>,
|
||||
lr_trace: Option<&str>,
|
||||
lr_trace_out: Option<&str>,
|
||||
audit_pc_probe_hex: Option<&str>,
|
||||
audit_mem_read_hex: Option<&str>,
|
||||
audit_r3_dump_bytes: Option<u32>,
|
||||
) -> Result<()> {
|
||||
cmd_exec_inner(
|
||||
path,
|
||||
@@ -689,6 +736,9 @@ fn cmd_exec(
|
||||
probe_db,
|
||||
lr_trace,
|
||||
lr_trace_out,
|
||||
audit_pc_probe_hex,
|
||||
audit_mem_read_hex,
|
||||
audit_r3_dump_bytes,
|
||||
None,
|
||||
None,
|
||||
false,
|
||||
@@ -735,6 +785,9 @@ fn cmd_check(
|
||||
None, // probe_db — same
|
||||
None, // lr_trace — same
|
||||
None, // lr_trace_out — same
|
||||
None, // audit_pc_probe_hex — diagnostic, never wanted on goldens
|
||||
None, // audit_mem_read_hex — same
|
||||
None, // audit_r3_dump_bytes — same
|
||||
out,
|
||||
expect,
|
||||
stable_digest,
|
||||
@@ -767,6 +820,9 @@ fn cmd_exec_inner(
|
||||
probe_db: Option<&str>,
|
||||
lr_trace: Option<&str>,
|
||||
lr_trace_out: Option<&str>,
|
||||
audit_pc_probe_hex: Option<&str>,
|
||||
audit_mem_read_hex: Option<&str>,
|
||||
audit_r3_dump_bytes: Option<u32>,
|
||||
digest_out: Option<&str>,
|
||||
digest_expect: Option<&str>,
|
||||
stable_digest: bool,
|
||||
@@ -1167,6 +1223,107 @@ fn cmd_exec_inner(
|
||||
}
|
||||
}
|
||||
|
||||
// AUDIT-2BF — `--audit-pc-probe-hex=82172D88,...`. Bare-hex tokens
|
||||
// (with or without `0x` prefix). Parses every comma-separated entry
|
||||
// as a u32 PC and inserts into `kernel.audit_pc_probe_pcs`. Empty
|
||||
// set is the hot-path no-op (single is_empty() check).
|
||||
let audit_pc_probe_combined: Option<String> = match (
|
||||
audit_pc_probe_hex, std::env::var("XENIA_AUDIT_PC_PROBE").ok(),
|
||||
) {
|
||||
(Some(s), _) => Some(s.to_string()),
|
||||
(None, Some(s)) if !s.is_empty() => Some(s),
|
||||
_ => None,
|
||||
};
|
||||
if let Some(list) = audit_pc_probe_combined {
|
||||
for token in list.split(',').map(str::trim).filter(|s| !s.is_empty()) {
|
||||
let hex = token.strip_prefix("0x").or_else(|| token.strip_prefix("0X")).unwrap_or(token);
|
||||
let pc = u32::from_str_radix(hex, 16)
|
||||
.map_err(|e| anyhow::anyhow!("--audit-pc-probe-hex {token:?}: {e}"))?;
|
||||
kernel.audit_pc_probe_pcs.insert(pc);
|
||||
}
|
||||
if !quiet && !kernel.audit_pc_probe_pcs.is_empty() {
|
||||
let mut pcs: Vec<u32> = kernel.audit_pc_probe_pcs.iter().copied().collect();
|
||||
pcs.sort_unstable();
|
||||
let strs: Vec<String> = pcs.iter().map(|p| format!("{p:#010x}")).collect();
|
||||
tracing::info!(
|
||||
"audit-pc-probe armed: {} ({})",
|
||||
kernel.audit_pc_probe_pcs.len(),
|
||||
strs.join(", "),
|
||||
);
|
||||
}
|
||||
}
|
||||
|
||||
// AUDIT-2BF round 14 — `--audit-mem-read-hex=828E1F08`. Single
|
||||
// hex VA (optional `0x` prefix). Stored on `kernel.audit_mem_read_addr`.
|
||||
// Paired with `audit_pc_probe_pcs`: on every probe fire, the kernel
|
||||
// emits a second `AUDIT-MEM-READ` line dereferencing 3 deep so we can
|
||||
// resolve vtable[0] / vtable[24] at the singleton.
|
||||
let audit_mem_read_combined: Option<String> = match (
|
||||
audit_mem_read_hex, std::env::var("XENIA_AUDIT_MEM_READ").ok(),
|
||||
) {
|
||||
(Some(s), _) => Some(s.to_string()),
|
||||
(None, Some(s)) if !s.is_empty() => Some(s),
|
||||
_ => None,
|
||||
};
|
||||
if let Some(tok) = audit_mem_read_combined {
|
||||
let tok = tok.trim();
|
||||
if !tok.is_empty() {
|
||||
let hex = tok.strip_prefix("0x").or_else(|| tok.strip_prefix("0X")).unwrap_or(tok);
|
||||
let addr = u32::from_str_radix(hex, 16)
|
||||
.map_err(|e| anyhow::anyhow!("--audit-mem-read-hex {tok:?}: {e}"))?;
|
||||
kernel.audit_mem_read_addr = Some(addr);
|
||||
if !quiet {
|
||||
tracing::info!("audit-mem-read armed: {:#010x}", addr);
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
// AUDIT-052 — `--audit-r3-dump-bytes=80`. When set, every
|
||||
// `--audit-pc-probe-hex` fire emits a paired `AUDIT-R3-DUMP` line
|
||||
// with N bytes from `r3` (4-byte aligned, capped at 256). Sized for
|
||||
// the 80-byte stack-local struct at `sub_82452DC0`'s `r31+96` —
|
||||
// probe `sub_8245B000` entry where `r3 == parent's r31+96`.
|
||||
let audit_r3_dump_combined: Option<u32> = match (
|
||||
audit_r3_dump_bytes, std::env::var("XENIA_AUDIT_R3_DUMP_BYTES").ok(),
|
||||
) {
|
||||
(Some(n), _) => Some(n),
|
||||
(None, Some(s)) if !s.is_empty() => Some(
|
||||
s.parse::<u32>().map_err(|e| anyhow::anyhow!("--audit-r3-dump-bytes {s:?}: {e}"))?,
|
||||
),
|
||||
_ => None,
|
||||
};
|
||||
if let Some(n) = audit_r3_dump_combined {
|
||||
if n > 0 {
|
||||
kernel.audit_r3_dump_bytes = Some(n);
|
||||
if !quiet {
|
||||
tracing::info!("audit-r3-dump armed: {} bytes", n);
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
// iterate-2E — pointer-chase probe. `XENIA_AUDIT_DEREF=<reg>:<off>`
|
||||
// (e.g. `4:36`). On each AUDIT-PC-PROBE fire, dumps gpr[reg] as a base
|
||||
// object, the sub-object at [base+off], and that sub-object's vtable.
|
||||
// Read-only; lockstep digest unaffected.
|
||||
if let Ok(spec) = std::env::var("XENIA_AUDIT_DEREF") {
|
||||
if !spec.is_empty() {
|
||||
let (rs, os) = spec
|
||||
.split_once(':')
|
||||
.ok_or_else(|| anyhow::anyhow!("XENIA_AUDIT_DEREF {spec:?}: expected <reg>:<off>"))?;
|
||||
let reg: u8 = rs.trim_start_matches('r').parse()
|
||||
.map_err(|e| anyhow::anyhow!("XENIA_AUDIT_DEREF reg {rs:?}: {e}"))?;
|
||||
let off: u32 = if let Some(h) = os.strip_prefix("0x") {
|
||||
u32::from_str_radix(h, 16)
|
||||
} else {
|
||||
os.parse::<u32>()
|
||||
}.map_err(|e| anyhow::anyhow!("XENIA_AUDIT_DEREF off {os:?}: {e}"))?;
|
||||
kernel.audit_deref = Some((reg, off));
|
||||
if !quiet {
|
||||
tracing::info!("audit-deref armed: r{} +0x{:x}", reg, off);
|
||||
}
|
||||
}
|
||||
}
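
The audit flags above share one pattern: the CLI value wins over the environment variable, and each token is bare hex with an optional `0x`/`0X` prefix. A self-contained sketch of that pattern, with hypothetical helper names (`resolve`, `parse_hex_u32`, `parse_pc_list`) and a `BTreeSet` standing in for whatever set type `kernel.audit_pc_probe_pcs` actually is:

```rust
use std::collections::BTreeSet;

/// CLI flag takes precedence; a non-empty env value is the fallback.
fn resolve<'a>(flag: Option<&'a str>, env: Option<&'a str>) -> Option<&'a str> {
    match (flag, env) {
        (Some(s), _) => Some(s),
        (None, Some(s)) if !s.is_empty() => Some(s),
        _ => None,
    }
}

/// Parse one hex token, accepting an optional `0x` / `0X` prefix.
fn parse_hex_u32(token: &str) -> Result<u32, String> {
    let t = token.trim();
    let hex = t
        .strip_prefix("0x")
        .or_else(|| t.strip_prefix("0X"))
        .unwrap_or(t);
    u32::from_str_radix(hex, 16).map_err(|e| format!("{:?}: {}", token, e))
}

/// Parse a comma-separated PC list, skipping empty entries and
/// stopping at the first malformed token.
fn parse_pc_list(list: &str) -> Result<BTreeSet<u32>, String> {
    list.split(',')
        .map(str::trim)
        .filter(|s| !s.is_empty())
        .map(parse_hex_u32)
        .collect()
}
```

Factoring the prefix handling into one helper would also let the `XENIA_AUDIT_DEREF` offset accept `0X`, which the inline version above does not.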
|
||||
|
||||
// Diagnostic. Parse `--dump-addr=0x828F3D08,...` (or
|
||||
// `XENIA_DUMP_ADDR=...`) into `kernel.dump_addrs`. The contents
|
||||
// are dumped at end-of-run by `dump_thread_diagnostic`. Pure
|
||||
@@ -1340,16 +1497,28 @@ fn cmd_exec_inner(
         mem.write_u32(addr, block);
     }
     ("xboxkrnl.exe", 0x00AD) => {
-        // KeTimeStampBundle — 0x18 block with FILETIME at +0 and
-        // interrupt-time u64 at +0x10. Mirrors the clock used by
-        // KeQuerySystemTime so fast-path readers see consistent values.
+        // KeTimeStampBundle — X_TIME_STAMP_BUNDLE (canary layout,
+        // kernel_state.h): +0x00 interrupt_time u64, +0x08
+        // system_time u64 (FILETIME 100ns), +0x10 tick_count u32
+        // (milliseconds since boot), +0x14 padding. The guest's
+        // worker-hub channel-dispatch loop (sub_82450A68 @
+        // 0x82450b10) polls [block+0x10] (tick_count) and gates
+        // dispatch on a `tick_count + 66` (ms) deadline. The block
+        // MUST be ticked over the run or that deadline never
+        // elapses (tid14 0x109c starvation gate). Initialize to a
+        // zero-uptime base; KernelState::update_timestamp_bundle
+        // ticks it every round from the deterministic global_clock.
         let block = alloc_zero(0x18, &mut mem, &mut kernel);
         if block != 0 {
-            let fake_time: u64 = 132_500_000_000_000_000; // ~2021 FILETIME
-            mem.write_u32(block, (fake_time >> 32) as u32);
-            mem.write_u32(block + 4, fake_time as u32);
-            mem.write_u32(block + 0x10, (fake_time >> 32) as u32);
-            mem.write_u32(block + 0x14, fake_time as u32);
+            // FILETIME base (~2021) so system_time is plausible.
+            let fake_time: u64 = 132_500_000_000_000_000;
+            mem.write_u32(block, 0); // interrupt_time hi
+            mem.write_u32(block + 4, 0); // interrupt_time lo
+            mem.write_u32(block + 0x08, (fake_time >> 32) as u32); // system_time hi
+            mem.write_u32(block + 0x0C, fake_time as u32); // system_time lo
+            mem.write_u32(block + 0x10, 0); // tick_count (ms) = 0 at boot
+            mem.write_u32(block + 0x14, 0); // padding
+            kernel.timestamp_bundle_addr = block;
         }
         mem.write_u32(addr, block);
     }
|
||||
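A minimal sketch of the `X_TIME_STAMP_BUNDLE` layout described in the hunk above, assuming big-endian guest memory modelled as a plain byte array. `TimestampBundle`, `deadline_elapsed`, and `INSTRS_PER_MS` are hypothetical names for illustration only; the real `KernelState::update_timestamp_bundle` is not shown in this diff, and the clock-to-milliseconds ratio used here is an assumption.

```rust
/// Hypothetical guest-clock rate (retired instructions per millisecond).
/// The conversion the real `update_timestamp_bundle` uses is not in the diff.
const INSTRS_PER_MS: u64 = 50_000;
/// FILETIME base from the diff (~2021), in 100 ns units.
const FILETIME_BASE: u64 = 132_500_000_000_000_000;

/// Field values for the 0x18-byte bundle.
#[derive(Debug, PartialEq, Eq)]
struct TimestampBundle {
    interrupt_time: u64, // +0x00, 100 ns units since boot
    system_time: u64,    // +0x08, FILETIME (100 ns units)
    tick_count: u32,     // +0x10, milliseconds since boot
}

impl TimestampBundle {
    /// Derive all three fields from one deterministic clock value.
    fn from_clock(global_clock: u64) -> Self {
        let ms = global_clock / INSTRS_PER_MS;
        let hundred_ns = ms * 10_000;
        Self {
            interrupt_time: hundred_ns,
            system_time: FILETIME_BASE + hundred_ns,
            tick_count: ms as u32,
        }
    }

    /// Serialize big-endian, as the guest reads it.
    fn to_bytes(&self) -> [u8; 0x18] {
        let mut b = [0u8; 0x18];
        b[0x00..0x08].copy_from_slice(&self.interrupt_time.to_be_bytes());
        b[0x08..0x10].copy_from_slice(&self.system_time.to_be_bytes());
        b[0x10..0x14].copy_from_slice(&self.tick_count.to_be_bytes());
        // +0x14..0x18 stays zero (padding).
        b
    }
}

/// Guest-side gate: dispatch once tick_count reaches the stored deadline.
fn deadline_elapsed(block: &[u8; 0x18], deadline_ms: u32) -> bool {
    let tick = u32::from_be_bytes([block[0x10], block[0x11], block[0x12], block[0x13]]);
    tick >= deadline_ms
}
```

With a block frozen at boot, `deadline_elapsed(&block, 66)` can never become true, which is the starvation the hunk describes; re-serializing the bundle from an advancing clock each round lets the gate open.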
@@ -1990,7 +2159,34 @@ fn coord_pre_round(
|
||||
}
|
||||
|
||||
kernel.fire_due_timers();
|
||||
try_inject_graphics_interrupt(kernel);
|
||||
// 2.AF: fire expired wait deadlines under load. Without this drain,
// `advance_to_next_wake_if_due` only runs in `coord_idle_advance` (the
// no-Ready-threads path), so a thread whose `KeWait*`/`KeDelay` deadline
// expires while other threads keep the scheduler busy sits Blocked
// forever (observed: tid=5's 42.95ms deadline unfired 29s+). Drain every
// entry whose deadline is `<=` the current guest timebase (the same `now`
// basis `fire_due_timers` uses, so the two stay in lock-step) and let
// `handle_timeout_wake` stamp `STATUS_TIMEOUT` and scrub the waiter from
// each handle. `advance_to_next_wake_if_due` pops at most one due wake
// per call and returns `None` once the earliest remaining deadline is in
// the future, so this loop terminates. Deterministic: `ctx(0).timebase`
// is the guest-cycle timebase, not host_ns. This runs in `coord_pre_round`,
// which both the lockstep and parallel outer loops call every round.
|
||||
loop {
|
||||
let now = kernel.now_basis_at(0);
|
||||
let Some((r, reason)) = kernel.scheduler.advance_to_next_wake_if_due(now)
|
||||
else {
|
||||
break;
|
||||
};
|
||||
kernel.handle_timeout_wake(r, reason);
|
||||
}
|
||||
// Graphics-interrupt delivery is no longer done here — see
|
||||
// `dispatch_graphics_interrupts`, called from the outer loop with
|
||||
// `mem` and `&mut stats` in scope. The audio path still uses the
|
||||
// asynchronous LR-sentinel inject because each XAudio client has a
|
||||
// dedicated worker thread (audit-048 Plan B) that the callback
|
||||
// runs on; we just queue the source and the worker_prologue's
|
||||
// halt-sentinel restore path closes the loop.
|
||||
if kernel.xaudio_tick_enabled {
|
||||
try_inject_audio_callback(kernel);
|
||||
}
|
||||
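The wait-deadline drain in this hunk relies on "pop at most one due entry per call, return `None` once the earliest deadline is in the future". A minimal sketch of that contract follows, assuming a min-heap keyed on deadline. `WaitQueue`, `pop_if_due`, and `drain_due` are hypothetical stand-ins; the real `Scheduler::advance_to_next_wake_if_due` is not shown in the diff and may be structured differently.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// Hypothetical stand-in for the scheduler's wait-deadline queue.
struct WaitQueue {
    /// Min-heap on (deadline, thread id).
    heap: BinaryHeap<Reverse<(u64, u32)>>,
}

impl WaitQueue {
    fn new() -> Self {
        Self { heap: BinaryHeap::new() }
    }

    fn arm(&mut self, deadline: u64, tid: u32) {
        self.heap.push(Reverse((deadline, tid)));
    }

    /// Pops at most one entry whose deadline is `<= now`.
    fn pop_if_due(&mut self, now: u64) -> Option<u32> {
        let due = matches!(self.heap.peek(), Some(Reverse((deadline, _))) if *deadline <= now);
        if due {
            self.heap.pop().map(|Reverse((_, tid))| tid)
        } else {
            None
        }
    }
}

/// Drain every due waiter; terminates because each call removes one
/// entry and the first future deadline yields `None`.
fn drain_due(queue: &mut WaitQueue, now: u64) -> Vec<u32> {
    let mut woken = Vec::new();
    while let Some(tid) = queue.pop_if_due(now) {
        woken.push(tid);
    }
    woken
}
```

Because `now` is fixed for the duration of the drain and every successful pop shrinks the heap, the loop is bounded by the number of armed waiters.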
@@ -2010,6 +2206,24 @@ fn coord_idle_advance(
|
||||
shutdown: &Option<std::sync::Arc<std::sync::atomic::AtomicBool>>,
|
||||
stats: &ExecStats,
|
||||
) -> RoundCtl {
|
||||
// Path β (iterate-2.BE follow-up): when the scheduler has no Ready
|
||||
// threads, `coord_pre_round`'s instruction-count vsync ticker stops
|
||||
// advancing (instruction_count is frozen). That starves the
|
||||
// host-driven graphics ISR dispatcher: queue stays empty, no
|
||||
// deliveries occur, and the very stall we're trying to break out of
|
||||
// gets worse. Tick vsync from wallclock here unconditionally — it's
|
||||
// a host-clock read, independent of instruction count, and the
|
||||
// dispatcher in the outer loop will drain whatever we queue on the
|
||||
// next pass. Mirrors the `--parallel` ticker choice in
|
||||
// `coord_pre_round` (`tick_vsync_wallclock` branch).
|
||||
if kernel.interrupts.tick_vsync_wallclock() {
|
||||
use std::sync::atomic::Ordering;
|
||||
let mmio = kernel.gpu.mmio();
|
||||
let prev = mmio.d1mode_vblank_vline_status.load(Ordering::Relaxed);
|
||||
mmio.d1mode_vblank_vline_status
|
||||
.store(prev | 0x1, Ordering::Relaxed);
|
||||
}
|
||||
|
||||
let next_timer = kernel.earliest_timer_deadline();
|
||||
let next_wait = kernel.scheduler.earliest_wait_deadline();
|
||||
let target = match (next_timer, next_wait) {
|
||||
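The wall-clock vsync tick added to `coord_idle_advance` can be pictured as a period check followed by latching a status bit. The sketch below is an assumption-laden illustration: `VsyncTicker`, `latch_vblank`, and the 60 Hz period constant are hypothetical, and the real `tick_vsync_wallclock` presumably reads the host clock itself rather than taking an elapsed time. It also uses `fetch_or` for the status update, which performs the read-modify-write in one atomic step, whereas the hunk uses a separate `load` and `store`.

```rust
use std::sync::atomic::{AtomicU32, Ordering};
use std::time::Duration;

/// Assumed 60 Hz refresh period (about 16.7 ms).
const VSYNC_PERIOD: Duration = Duration::from_nanos(16_666_667);
/// Bit 0 of the vblank/vline status register, as set in the hunk.
const VBLANK_BIT: u32 = 0x1;

/// Hypothetical wall-clock ticker, independent of instruction count.
struct VsyncTicker {
    next_due: Duration,
}

impl VsyncTicker {
    fn new() -> Self {
        Self { next_due: VSYNC_PERIOD }
    }

    /// `elapsed` is host time since start. Returns true at most once per
    /// period; missed periods are coalesced into a single tick.
    fn tick(&mut self, elapsed: Duration) -> bool {
        if elapsed < self.next_due {
            return false;
        }
        while self.next_due <= elapsed {
            self.next_due += VSYNC_PERIOD;
        }
        true
    }
}

/// Set the vblank bit atomically; returns the previous register value.
fn latch_vblank(status: &AtomicU32) -> u32 {
    status.fetch_or(VBLANK_BIT, Ordering::Relaxed)
}
```

Whether the single-step `fetch_or` matters depends on whether anything else writes the register concurrently, which the diff does not show.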
@@ -2218,6 +2432,7 @@ fn worker_prologue(
|
||||
// the helper, no overhead on the hot path.
|
||||
kernel.fire_ctor_probe_if_match(hw_id, mem);
|
||||
kernel.fire_branch_probe_if_match(hw_id);
|
||||
kernel.fire_audit_pc_probe_if_match(hw_id, mem);
|
||||
kernel.fire_lr_trace_if_match(hw_id);
|
||||
|
||||
if mem.has_mem_watch() {
|
||||
@@ -2416,6 +2631,10 @@ fn worker_prologue(
|
||||
|
||||
match result {
|
||||
StepResult::Continue => {}
|
||||
StepResult::Yield => {
|
||||
// db16cyc spin-wait hint (per-instruction path): yield the slot.
|
||||
kernel.scheduler.yield_current();
|
||||
}
|
||||
StepResult::SystemCall => {
|
||||
tracing::warn!("SYSCALL at {:#010x} (hw={})", pc, hw_id);
|
||||
}
|
||||
@@ -2495,6 +2714,11 @@ fn worker_epilogue(
|
||||
|
||||
match result {
|
||||
StepResult::Continue => {}
|
||||
StepResult::Yield => {
|
||||
// db16cyc spin-wait hint: hand the slot to a Ready peer so the
|
||||
// spinner doesn't starve the co-located thread it is waiting on.
|
||||
kernel.scheduler.yield_current();
|
||||
}
|
||||
StepResult::SystemCall => {
|
||||
let last_pc = block.instrs.last().map(|i| i.addr).unwrap_or(pc_before);
|
||||
tracing::warn!("SYSCALL at {:#010x} (hw={})", last_pc, hw_id);
|
||||
@@ -2595,12 +2819,21 @@ fn run_execution(
|
||||
let mut workers: [WorkerCtx; xenia_cpu::scheduler::HW_THREAD_COUNT] =
|
||||
std::array::from_fn(|i| WorkerCtx::new(i as u8, force_per_instr));
|
||||
|
||||
// Iterate-2.BE — decode cache used by the synchronous ISR
|
||||
// dispatcher. ISRs are short (~40 PPC instructions) but fire
|
||||
// every ~16.7 ms, so persisting the cache across calls avoids
|
||||
// re-decoding the same handful of pages 60×/s.
|
||||
let mut isr_decode_cache = xenia_cpu::decoder::DecodeCache::new();
|
||||
|
||||
'outer: loop {
|
||||
// Per-round prologue: budget / shutdown / heartbeat / vsync /
|
||||
// timers / graphics-interrupt injection. Carved into
|
||||
// timers / audio-interrupt injection. Carved into
|
||||
// `coord_pre_round` so the parallel scheduler (Step 03+) can
|
||||
// call the same coordination logic between phaser barriers
|
||||
// without duplicating it from the lockstep path.
|
||||
// without duplicating it from the lockstep path. The
|
||||
// graphics-interrupt dispatch is hoisted out — it runs
|
||||
// *synchronously* (host-driven, iterate-2.BE) and needs `mem`
|
||||
// + `&mut stats` which aren't in `coord_pre_round`'s scope.
|
||||
match coord_pre_round(
|
||||
kernel,
|
||||
&stats,
|
||||
@@ -2612,6 +2845,39 @@ fn run_execution(
|
||||
RoundCtl::BreakOuter => break,
|
||||
RoundCtl::Continue => {}
|
||||
}
|
||||
// ITERATE-2C Phase D — deposit the current instruction count so
|
||||
// `nt_create_event` can compute absolute auto-signal deadlines,
|
||||
// then drain any pending auto-signals whose deadline has passed.
|
||||
// Both calls are no-ops when `XENIA_SILPH_UI_AUTOSIGNAL_DELAY`
|
||||
// is unset (the pending queue stays empty).
|
||||
kernel.set_now_cycle_hint(stats.instruction_count);
|
||||
// Drive the coherent monotonic "now" the kernel deadline-arithmetic
// reads (`KernelState::now_basis_at` -> `Scheduler::global_clock`)
// from the deterministic retired-instruction count. Floored up (never
// backwards). This is the LOCKSTEP analogue of the parallel writeback's
// `advance_global_clock`: a parked/poll thread computing a relative
// timeout via `parse_timeout` now reads a real, non-zero, monotone
// basis instead of `idle_ctx`'s timebase-0, so its deadline lands in
// the future and `coord_idle_advance` stops re-arming the constant
// past deadline forever (the timebase-desync livelock / render-gate
// root). Pure function of guest instructions -> bit-reproducible.
|
||||
kernel
|
||||
.scheduler
|
||||
.advance_global_clock_to(stats.instruction_count);
|
||||
// ITERATE-2J — tick the KeTimeStampBundle (ordinal 0x00AD) from the
|
||||
// same deterministic clock so the guest's worker-hub tick_count
|
||||
// deadline gate (`[block+0x10] + 66` ms) actually elapses. Without
|
||||
// this the block is frozen at boot and the hub spins forever,
|
||||
// starving tid14 on event 0x109c.
|
||||
kernel.update_timestamp_bundle(mem, kernel.scheduler.global_clock());
|
||||
kernel.fire_due_silph_autosignals(stats.instruction_count);
|
||||
dispatch_graphics_interrupts(
|
||||
kernel,
|
||||
mem,
|
||||
&mut stats,
|
||||
&mut isr_decode_cache,
|
||||
thunk_map,
|
||||
);
|
||||
|
||||
// Snapshot round schedule. `round_schedule` also advances rng state
|
||||
// when seeded; mutation is intentional.
|
||||
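The two ways this diff advances the coherent clock (floor-up to an absolute count in lockstep, add retired instructions in parallel mode) can be sketched as below. `GlobalClock` and its methods are hypothetical; they only mirror the behaviour the comments attribute to `advance_global_clock_to` and `advance_global_clock`, whose bodies are not part of the diff.

```rust
/// Hypothetical stand-in for the scheduler's coherent "now".
#[derive(Default)]
struct GlobalClock {
    now: u64,
}

impl GlobalClock {
    /// Lockstep path: floor up to an absolute instruction count,
    /// never moving backwards.
    fn advance_to(&mut self, instruction_count: u64) {
        self.now = self.now.max(instruction_count);
    }

    /// Parallel path: add the instructions one block retired.
    fn advance_by(&mut self, executed: u64) {
        self.now = self.now.saturating_add(executed);
    }

    fn now(&self) -> u64 {
        self.now
    }

    /// Absolute deadline for a relative timeout, taken from the
    /// coherent basis rather than a per-thread timebase.
    fn deadline_after(&self, timeout: u64) -> u64 {
        self.now.saturating_add(timeout)
    }
}
```

With a non-zero monotone basis, a relative timeout always yields a deadline at or after the current clock, so a woken thread cannot keep re-arming the same already-expired deadline.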
@@ -2789,6 +3055,10 @@ fn run_execution_parallel(
|
||||
|
||||
let throttle_start = Instant::now();
|
||||
|
||||
// Iterate-2.BE — decode cache for the synchronous ISR dispatcher.
|
||||
// Lives on the coordinator (this) thread; workers never touch it.
|
||||
let mut isr_decode_cache = xenia_cpu::decoder::DecodeCache::new();
|
||||
|
||||
const COORD_ID: u8 = xenia_cpu::scheduler::HW_THREAD_COUNT as u8; // = 6
|
||||
const PARTY_COUNT: u32 = xenia_cpu::scheduler::HW_THREAD_COUNT as u32 + 1;
|
||||
|
||||
@@ -2939,6 +3209,16 @@ fn run_execution_parallel(
|
||||
.and_then(|t| guard.scheduler.find_by_tid(t))
|
||||
.unwrap_or(thread_ref);
|
||||
*guard.scheduler.ctx_mut_ref(target_ref) = ctx_taken;
|
||||
// Advance the parallel-mode coherent clock by
|
||||
// the instructions this block retired. This is
|
||||
// the single authoritative "now" the kernel
|
||||
// deadline-arithmetic reads in parallel mode
|
||||
// (per-thread `ctx.timebase` is incoherent here
|
||||
// because peers extract/zero their slots) —
|
||||
// keeping it monotonic breaks the timebase-
|
||||
// desync livelock where a woken thread re-armed
|
||||
// the same constant deadline forever.
|
||||
guard.scheduler.advance_global_clock(executed);
|
||||
// worker_epilogue's exit_current path
|
||||
// expects scheduler.current to be set
|
||||
// to the running thread.
|
||||
@@ -3025,6 +3305,41 @@ fn run_execution_parallel(
|
||||
}
|
||||
let mut guard = pre_outcome.1;
|
||||
|
||||
// ITERATE-2C Phase D — same auto-signal hook as the lockstep
|
||||
// path. Held under the same `kernel_arc` guard the rest of
|
||||
// this prologue runs under, so no extra locking.
|
||||
{
|
||||
let s = stats_mtx.lock().expect("stats mutex poisoned");
|
||||
guard.set_now_cycle_hint(s.instruction_count);
|
||||
guard.fire_due_silph_autosignals(s.instruction_count);
|
||||
}
|
||||
|
||||
// ITERATE-2J — tick the KeTimeStampBundle (ordinal 0x00AD) from
|
||||
// the parallel-mode coherent global_clock (summed per-block
|
||||
// retired instructions). Same fix as the lockstep loop: keeps the
|
||||
// guest's worker-hub tick_count deadline gate advancing so it
|
||||
// dispatches channel-3 and unblocks tid14 on event 0x109c.
|
||||
{
|
||||
let clock = guard.scheduler.global_clock();
|
||||
guard.update_timestamp_bundle(mem, clock);
|
||||
}
|
||||
|
||||
// Iterate-2.BE — host-driven synchronous ISR dispatch.
|
||||
// Runs under the kernel lock while workers are still parked
|
||||
// at the phaser B2 barrier (the coordinator hasn't published
|
||||
// the runnable mask or arrived at the phaser yet), so no
|
||||
// contention with worker steps.
|
||||
{
|
||||
let mut s = stats_mtx.lock().expect("stats mutex poisoned");
|
||||
dispatch_graphics_interrupts(
|
||||
&mut *guard,
|
||||
mem,
|
||||
&mut *s,
|
||||
&mut isr_decode_cache,
|
||||
thunk_map,
|
||||
);
|
||||
}
|
||||
|
||||
guard.scheduler.begin_round();
|
||||
let order = guard.scheduler.round_schedule();
|
||||
|
||||
@@ -3140,121 +3455,161 @@ fn run_execution_parallel(
|
||||
stats_mtx.into_inner().expect("stats mutex poisoned")
|
||||
}
|
||||
|
||||
/// First-Pixels M2 — inject a queued graphics interrupt into HW thread 0
|
||||
/// when it's safe to do so (callback registered, no interrupt already
|
||||
/// running). Called at the top of each scheduler round.
|
||||
/// Iterate-2.BE — host-driven synchronous dispatch of all queued
|
||||
/// graphics interrupts. Mirrors canary's
|
||||
/// [`EmulateCPInterruptDPC`](../../../../xenia-canary/src/xenia/kernel/kernel_state.cc#L1370)
|
||||
/// → [`Processor::Execute`](../../../../xenia-canary/src/xenia/cpu/processor.cc#L413)
|
||||
/// path: pick a guest thread, borrow its `PpcContext`, jam the ISR
|
||||
/// PC + args into it, and **run the interpreter inline on the host
|
||||
/// thread** until the ISR returns to `LR_HALT_SENTINEL`. Then restore
|
||||
/// the borrowed context and continue.
|
||||
///
|
||||
/// Unlike the earlier P6 version which only delivered when HW 0 was
|
||||
/// `Ready`, this one also delivers when HW 0 is `Blocked`: the injector
|
||||
/// stashes the block reason into the new `HwState::ServicingIrq(reason)`
|
||||
/// variant, flips the thread to that state so `round_schedule` runs it,
|
||||
/// and — on callback return to `LR_HALT_SENTINEL` — the restore path
|
||||
/// re-creates `Blocked(reason)`, unless a `wake()` during the callback
|
||||
/// (e.g. `KeSetEvent` → `wake_eligible_waiters`) flipped it to `Ready`,
|
||||
/// in which case the wait was resolved and we leave it.
|
||||
/// Drains the full pending FIFO each call — canary's frame-limiter
|
||||
/// runs at its own cadence and our queue can already hold up to
|
||||
/// `INTERRUPT_QUEUE_CAP` coalesced v-sync events.
|
||||
///
|
||||
/// This is the fix that unblocks games (like Sylpheed) which gate their
|
||||
/// main loop on a v-sync callback signaling an event the main thread
|
||||
/// waits on. The earlier "only-when-Ready" policy dropped 397 of 399
|
||||
/// observed v-syncs on a 1 B-instruction Sylpheed probe; now they
|
||||
/// actually get delivered.
|
||||
fn try_inject_graphics_interrupt(kernel: &mut xenia_kernel::KernelState) {
|
||||
/// Why this replaces the prior victim-mutate-then-wait scheme: with
|
||||
/// the old asynchronous injection, when every guest thread idled (post
|
||||
/// boot, when Sylpheed's main thread reaches its WAIT_FOREVER on the
|
||||
/// vsync-driven PKEVENT and all worker threads are likewise Blocked),
|
||||
/// the next scheduler round had no `Ready` victim and `Blocked` ones
|
||||
/// still required at least one round of execution to reach the
|
||||
/// callback. Audit-059 measured `gpu.interrupt.delivered = 54` over
|
||||
/// 3.9 s vs canary's 4712 — an 87× shortfall. Host-driven dispatch
|
||||
/// makes delivery rate a function of wall clock, not guest-thread
|
||||
/// readiness.
|
||||
///
|
||||
/// Victim selection still mirrors the canary precedent: prefer Ready
|
||||
/// (no state mangling), else any Blocked thread (we temporarily flip
|
||||
/// to `ServicingIrq(reason)` for the duration of the inline run so
|
||||
/// `call_export` etc. see a coherent thread state, and restore the
|
||||
/// `Blocked(reason)` on the way out unless the ISR itself signaled a
|
||||
/// wake). Idle / Exited / already-ServicingIrq slots are skipped — if
|
||||
/// nothing remains the source is dropped (still the right behavior;
|
||||
/// canary's `XThread::GetCurrentThread()` would assert).
|
||||
///
|
||||
/// All execution while in-flight runs against the borrowed thread's
|
||||
/// `ctx`. We set `scheduler.current = Some(target_ref)` so kernel
|
||||
/// imports (`KeSetEvent`, `KeReleaseSemaphore`, etc.) reach the right
|
||||
/// context, then restore the previous `current` on the way out. The
|
||||
/// dispatch is single-threaded — under `--parallel` it runs on the
|
||||
/// coordinator with workers parked at the phaser barrier, so there is
|
||||
/// no contention.
|
||||
fn dispatch_graphics_interrupts(
|
||||
kernel: &mut xenia_kernel::KernelState,
|
||||
mem: &xenia_memory::GuestMemory,
|
||||
stats: &mut ExecStats,
|
||||
decode_cache: &mut xenia_cpu::decoder::DecodeCache,
|
||||
thunk_map: &HashMap<u32, (ModuleId, u16, String)>,
|
||||
) {
|
||||
use xenia_cpu::interpreter::{step_cached, StepResult};
|
||||
use xenia_cpu::scheduler::HwState;
|
||||
const LR_HALT: u32 = xenia_cpu::context::LR_HALT_SENTINEL as u32;
|
||||
/// Defensive cap so a runaway ISR can't lock the coordinator on
|
||||
/// the per-tick dispatch. Real Sylpheed vsync ISR is ~40 PPC
|
||||
/// instructions; canary's `Processor::Execute` has no analogous
|
||||
/// cap because it runs on a dedicated host thread, but we run
|
||||
/// inline on the coordinator so a budget is prudent.
|
||||
const MAX_INSTRS_PER_ISR: u64 = 1_000_000;
|
||||
|
||||
if kernel.interrupts.is_in_callback() {
|
||||
return;
|
||||
}
|
||||
let Some(cb) = kernel.interrupts.callback else {
|
||||
// No callback registered; drain any pending entries (they
|
||||
// wouldn't have made it into the queue per `queue_interrupt`'s
|
||||
// own `callback.is_none()` guard, but be defensive).
|
||||
kernel.interrupts.pending.clear();
|
||||
return;
|
||||
};
|
||||
|
||||
let Some(source) = kernel.interrupts.peek_next() else {
|
||||
return;
|
||||
// Iterate-2.BF.γ: graphics dispatch is fully synchronous (host-driven,
|
||||
// iterate-2.BE) — it borrows a guest thread, runs the ISR to
|
||||
// LR_HALT_SENTINEL, and restores all in-call before returning. So it
|
||||
// CAN safely coexist with an audio callback mid-flight, *as long as we
|
||||
// pick a different victim thread* than the one audio borrowed. The old
|
||||
// blanket `is_in_callback()` gate caused 5.85M skipped dispatches in
|
||||
// lockstep boot (vs 55 with-pending dispatches) — audio is essentially
|
||||
// always mid-flight on its dedicated worker, which choked vsync
|
||||
// delivery at ~54. Exclude only audio's borrowed thread; the queue
|
||||
// drains synchronously and graphics ISR completion does not touch
|
||||
// `interrupts.saved` (used exclusively by the async audio path).
|
||||
let audio_borrowed = if kernel.interrupts.is_in_callback() {
|
||||
kernel.interrupts.injected_ref
|
||||
} else {
|
||||
None
|
||||
};
|
||||
|
||||
// Canary's `EmulateCPInterruptDPC` (kernel_state.cc:1373) dispatches on
|
||||
// whatever the current thread happens to be — real hardware fires the
|
||||
// interrupt on CPU 2 and the kernel impersonates a DPC on top of
|
||||
// whichever thread is active. Hard-anchoring to HW 0 breaks the moment
|
||||
// `main()` returns: Sylpheed's main thread exits right after init, the
|
||||
// render worker spins on a `PKEVENT` inside the interrupt callback's
|
||||
// user_data struct (`user_data + 0x5C`), and because HW 0 is now
|
||||
// `Exited(_)` our injector drops every subsequent vsync — the PKEVENT
|
||||
// is never signaled and the worker polls forever.
|
||||
//
|
||||
// Pick the first HW thread we can plausibly run the callback on:
|
||||
// 1. Prefer `Ready` (no state-mangling needed)
|
||||
// 2. Else take a `Blocked(reason)` thread and swap to
|
||||
// `ServicingIrq(reason)` so the round scheduler runs it; the
|
||||
// LR-sentinel restore path reinstates the block on callback return
|
||||
// 3. Skip `Idle`, `Exited`, or already-`ServicingIrq` slots
|
||||
//
|
||||
// The callback itself just signals a game-side event and returns — it
|
||||
// doesn't care which HW thread it ran on.
|
||||
// Pass 1: find any Ready thread across all slots.
|
||||
while let Some(source) = kernel.interrupts.peek_next() {
|
||||
// Victim selection: Ready first, then Blocked (canary's
|
||||
// `XThread::GetCurrentThread()` analog — any live thread will
|
||||
// do for borrowing context). Skip Idle/Exited/ServicingIrq.
|
||||
// Skip the audio-borrowed thread (if any) to avoid clobbering
|
||||
// its `SavedCallbackCtx` mid-flight.
|
||||
let excluded = audio_borrowed;
|
||||
let mut victim: Option<xenia_cpu::ThreadRef> = None;
|
||||
'outer_ready: for (hw_id, slot) in kernel.scheduler.slots.iter().enumerate() {
|
||||
for (idx, t) in slot.runqueue.iter().enumerate() {
|
||||
let r = xenia_cpu::ThreadRef::new(hw_id as u8, idx as u16);
|
||||
if excluded == Some(r) {
|
||||
continue;
|
||||
}
|
||||
if matches!(t.state, HwState::Ready) {
|
||||
victim = Some(xenia_cpu::ThreadRef::new(hw_id as u8, idx as u16));
|
||||
victim = Some(r);
|
||||
break 'outer_ready;
|
||||
}
|
||||
}
|
||||
}
|
||||
// Pass 2: any Blocked thread (we'll flip it to ServicingIrq).
|
||||
if victim.is_none() {
|
||||
'outer_blocked: for (hw_id, slot) in kernel.scheduler.slots.iter().enumerate() {
|
||||
for (idx, t) in slot.runqueue.iter().enumerate() {
|
||||
let r = xenia_cpu::ThreadRef::new(hw_id as u8, idx as u16);
|
||||
if excluded == Some(r) {
|
||||
continue;
|
||||
}
|
||||
if matches!(t.state, HwState::Blocked(_)) {
|
||||
victim = Some(xenia_cpu::ThreadRef::new(hw_id as u8, idx as u16));
|
||||
victim = Some(r);
|
||||
break 'outer_blocked;
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
let Some(target_ref) = victim else {
|
||||
// All threads Idle/Exited/already servicing — nothing to inject on.
|
||||
// No donor at all — drop and exit (no point looping if the
|
||||
// next source has the same problem).
|
||||
kernel.interrupts.take_next();
|
||||
kernel.interrupts.dropped += 1;
|
||||
return;
|
||||
};
|
||||
|
||||
let t = kernel.scheduler.thread_mut(target_ref);
|
||||
let prev_state = t.state.clone();
|
||||
match prev_state {
|
||||
HwState::Ready => {}
|
||||
HwState::Blocked(reason) => {
|
||||
t.state = HwState::ServicingIrq(reason);
|
||||
}
|
||||
_ => unreachable!("victim selection above filtered out other variants"),
|
||||
// Commit: pop the queue, flag temporary state.
|
||||
let _ = kernel.interrupts.take_next();
|
||||
let prev_state = kernel.scheduler.thread(target_ref).state.clone();
|
||||
let was_blocked = matches!(prev_state, HwState::Blocked(_));
|
||||
if let HwState::Blocked(reason) = prev_state.clone() {
|
||||
kernel.scheduler.thread_mut(target_ref).state =
|
||||
HwState::ServicingIrq(reason);
|
||||
}
|
||||
|
||||
let _ = kernel.interrupts.take_next();
|
||||
// Save the borrowed ctx fields the ISR will clobber. Matches
|
||||
// canary's processor.cc:387-394 (save prev lr, run, restore).
|
||||
let saved = {
|
||||
let t = kernel.scheduler.thread_mut(target_ref);
|
||||
let saved = xenia_kernel::SavedCallbackCtx::capture(&t.ctx, source);
|
||||
kernel.interrupts.injected_ref = Some(target_ref);
|
||||
t.ctx.pc = cb.callback_pc;
|
||||
t.ctx.lr = xenia_cpu::context::LR_HALT_SENTINEL;
|
||||
// Canary `Processor::Execute` decrements the guest SP by 176 before
|
||||
// running the callback and restores on return (see Canary
|
||||
// processor.cc:383). Without this pad the callback's
|
||||
// `__savegprlr_N` prologue stomps the interrupted function's
|
||||
// already-saved LR at [r1-8], so when the interrupted function
|
||||
// later returns via `__restgprlr_N -> bclr` it jumps to
|
||||
// `LR_HALT_SENTINEL` and the thread exits prematurely. Matching
|
||||
// restore lives in `SavedCallbackCtx::restore` (which now also
|
||||
// restores r1).
|
||||
// Canary processor.cc:383 — pad SP so the callback's
|
||||
// __savegprlr_N prologue doesn't stomp the interrupted
|
||||
// function's saved LR at [r1-8].
|
||||
t.ctx.gpr[1] = t
|
||||
.ctx
|
||||
.gpr[1]
|
||||
.wrapping_sub(xenia_kernel::interrupts::CALLBACK_STACK_PAD as u64);
|
||||
t.ctx.gpr[3] = source as u64;
|
||||
t.ctx.gpr[4] = cb.user_data as u64;
|
||||
kernel.interrupts.saved = Some(saved);
|
||||
saved
|
||||
};
|
||||
|
||||
// Stash the previous `scheduler.current` (call_export reaches
|
||||
// it; imports the ISR calls must dispatch on the borrowed
|
||||
// thread). Restore on the way out.
|
||||
let prev_current = kernel.scheduler.current;
|
||||
kernel.scheduler.current = Some(target_ref);
|
||||
|
||||
metrics::counter!("gpu.interrupt.delivered", "source" => format!("{source}"))
|
||||
.increment(1);
|
||||
tracing::debug!(
|
||||
@@ -3262,24 +3617,116 @@ fn try_inject_graphics_interrupt(kernel: &mut xenia_kernel::KernelState) {
|
||||
hw_id = target_ref.hw_id,
|
||||
idx = target_ref.idx,
|
||||
callback = format_args!("{:#010x}", cb.callback_pc),
|
||||
"graphics interrupt: injecting"
|
||||
"graphics interrupt: dispatching synchronously (iterate-2.BE)"
|
||||
);
|
||||
|
||||
// Inline interpreter loop on the borrowed context until the
|
||||
// ISR returns to LR_HALT_SENTINEL (its `blr` writes
|
||||
// `lr → pc`). Per-instruction step handles imports via
|
||||
// thunk_map (the ISR typically just calls `KeSetEvent`).
|
||||
let mut isr_instrs: u64 = 0;
|
||||
loop {
|
||||
let pc = kernel.scheduler.ctx_mut_ref(target_ref).pc;
|
||||
if pc == LR_HALT {
|
||||
break;
|
||||
}
|
||||
if isr_instrs >= MAX_INSTRS_PER_ISR {
|
||||
tracing::warn!(
|
||||
pc = format_args!("{:#010x}", pc),
|
||||
isr_instrs,
|
||||
"graphics ISR exceeded MAX_INSTRS_PER_ISR; aborting"
|
||||
);
|
||||
break;
|
||||
}
|
||||
|
||||
// Import-thunk intercept: same shape as worker_prologue's
|
||||
// step 2 (line ~2287).
|
||||
if let Some((module, ordinal, _name)) = thunk_map.get(&pc) {
|
||||
let module = *module;
|
||||
let ordinal_u32 = *ordinal as u32;
|
||||
kernel.call_export(module, ordinal_u32, mem);
|
||||
let post_ref = kernel.scheduler.current;
|
||||
let c = match post_ref {
|
||||
Some(r) => kernel.scheduler.ctx_mut_ref(r),
|
||||
None => kernel.scheduler.ctx_mut_ref(target_ref),
|
||||
};
|
||||
c.pc = c.lr as u32;
|
||||
c.cycle_count += 1;
|
||||
c.timebase += 1;
|
||||
stats.instruction_count += 1;
|
||||
stats.import_count += 1;
|
||||
isr_instrs += 1;
|
||||
continue;
|
||||
}
|
||||
|
||||
if !mem.is_mapped(pc) {
|
||||
tracing::error!(
|
||||
pc = format_args!("{:#010x}", pc),
|
||||
isr_instrs,
|
||||
"graphics ISR hit unmapped PC; aborting"
|
||||
);
|
||||
break;
|
||||
}
|
||||
|
||||
let ctx = kernel.scheduler.ctx_mut_ref(target_ref);
|
||||
let page_ver = mem.page_version(ctx.pc);
|
||||
let r = step_cached(ctx, mem, decode_cache, page_ver);
|
||||
stats.instruction_count += 1;
|
||||
isr_instrs += 1;
|
||||
match r {
|
||||
StepResult::Continue => {}
|
||||
// db16cyc inside the synchronous ISR has no slot to yield —
|
||||
// the ISR runs to completion on the borrowed context.
|
||||
StepResult::Yield => {}
|
||||
StepResult::SystemCall => {
|
||||
tracing::warn!("graphics ISR hit `sc` instruction; aborting");
|
||||
break;
|
||||
}
|
||||
StepResult::Trap => {
|
||||
tracing::warn!("graphics ISR hit trap; aborting");
|
||||
break;
|
||||
}
|
||||
StepResult::Halted => break,
|
||||
StepResult::Unimplemented(op) => {
|
||||
tracing::warn!(?op, "graphics ISR hit unimplemented opcode; aborting");
|
||||
break;
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
// Restore the borrowed context.
|
||||
saved.restore(kernel.scheduler.ctx_mut_ref(target_ref));
|
||||
kernel.scheduler.current = prev_current;
|
||||
kernel.interrupts.delivered += 1;
|
||||
|
||||
// Restore thread state. If the ISR signaled a wake on the
|
||||
// borrowed thread (e.g. canary `KeSetEvent` → scheduler wake)
|
||||
// the state may already be Ready; only re-block if still
|
||||
// ServicingIrq.
|
||||
if was_blocked {
|
||||
let t = kernel.scheduler.thread_mut(target_ref);
|
||||
if let HwState::ServicingIrq(reason) = t.state.clone() {
|
||||
t.state = HwState::Blocked(reason);
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
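The victim-selection and state-restore policy of `dispatch_graphics_interrupts` (Ready first, then Blocked, skip the audio-borrowed thread, re-block only if the ISR did not wake the thread) can be sketched in isolation. The types below are simplified, hypothetical stand-ins for `HwState` and `ThreadRef`, with a `u32` in place of the real block reason.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum State {
    Ready,
    Blocked(u32),
    ServicingIrq(u32),
    Idle,
    Exited,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct ThreadRef {
    hw_id: u8,
    idx: u16,
}

/// First thread (in slot order) that satisfies `wanted` and is not excluded.
fn find_thread(
    slots: &[Vec<State>],
    excluded: Option<ThreadRef>,
    wanted: impl Fn(&State) -> bool,
) -> Option<ThreadRef> {
    for (hw_id, queue) in slots.iter().enumerate() {
        for (idx, state) in queue.iter().enumerate() {
            let r = ThreadRef { hw_id: hw_id as u8, idx: idx as u16 };
            if excluded != Some(r) && wanted(state) {
                return Some(r);
            }
        }
    }
    None
}

/// Pass 1: any Ready thread. Pass 2: any Blocked thread.
fn pick_victim(slots: &[Vec<State>], excluded: Option<ThreadRef>) -> Option<ThreadRef> {
    find_thread(slots, excluded, |s| matches!(s, State::Ready))
        .or_else(|| find_thread(slots, excluded, |s| matches!(s, State::Blocked(_))))
}

/// Flip a Blocked victim to ServicingIrq; returns whether it was blocked.
fn begin_service(state: &mut State) -> bool {
    if let State::Blocked(reason) = *state {
        *state = State::ServicingIrq(reason);
        true
    } else {
        false
    }
}

/// Re-block only if the ISR did not wake the thread in the meantime.
fn end_service(state: &mut State, was_blocked: bool) {
    if was_blocked {
        if let State::ServicingIrq(reason) = *state {
            *state = State::Blocked(reason);
        }
    }
}
```

Keeping the reason inside `ServicingIrq` means the restore path needs no side table: if the state is still `ServicingIrq` afterwards, the original wait is reinstated unchanged.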
/// AUDIT-032 Plan B — inject a pending XAudio buffer-complete callback
|
||||
/// into the **dedicated audio worker** registered for the head-of-queue
|
||||
/// client. Mirrors
|
||||
/// [`try_inject_graphics_interrupt`] (same SP-pad, same saved-context
|
||||
/// restore-on-sentinel) but the target thread is fixed at registration
|
||||
/// time instead of selected via the random-victim policy. The pre-fix
|
||||
/// client. Uses the asynchronous LR-sentinel injection mechanism (same
|
||||
/// SP-pad, same `SavedCallbackCtx` restore-on-sentinel as the pre-iterate-2.BE
|
||||
/// graphics path) but the target thread is fixed at registration time
|
||||
/// instead of selected via the random-victim policy. The pre-fix
|
||||
/// random-victim path corrupted unrelated thread state
|
||||
/// (APUBUG-PRODUCER-001 "HW-thread hijack"); per-client workers eliminate
|
||||
/// that whole class of regression.
|
||||
///
|
||||
/// Mutual exclusion with the graphics path is via the shared
|
||||
/// `interrupts.saved` slot — if a graphics callback is already in flight,
|
||||
/// `is_in_callback()` returns true and we bail until it returns to the
|
||||
/// `LR_HALT_SENTINEL`.
|
||||
/// Mutual exclusion with the graphics path (which is now synchronous —
|
||||
/// see `dispatch_graphics_interrupts`) is via the shared
|
||||
/// `interrupts.saved` slot — if an audio callback is already in flight,
|
||||
/// `is_in_callback()` returns true and `dispatch_graphics_interrupts`
|
||||
/// defers until it returns to the `LR_HALT_SENTINEL`.
|
||||
fn try_inject_audio_callback(kernel: &mut xenia_kernel::KernelState) {
|
||||
use xenia_cpu::scheduler::HwState;
|
||||
|
||||
|
||||
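Both the audio path and the graphics path rely on the same borrow-a-context pattern: capture the registers the entry sequence overwrites, pad the stack pointer, point LR at a halt sentinel, and restore on return. A minimal sketch follows. `Ctx`, `SavedCtx`, and `enter_callback` are hypothetical; the sentinel value is invented for illustration; the 176-byte pad is the figure quoted in the diff's canary comment; the fields the real `SavedCallbackCtx` captures are not visible here.

```rust
/// Illustrative sentinel; the real `LR_HALT_SENTINEL` value is not in the diff.
const LR_HALT_SENTINEL: u64 = 0xFFFF_FFFF_DEAD_0000;
/// Stack pad quoted in the diff's comment on canary's `Processor::Execute`.
const CALLBACK_STACK_PAD: u64 = 176;

#[derive(Clone, Debug, PartialEq, Eq)]
struct Ctx {
    pc: u32,
    lr: u64,
    gpr: [u64; 32],
}

/// Only the registers that `enter_callback` overwrites are saved here.
struct SavedCtx {
    pc: u32,
    lr: u64,
    r1: u64,
    r3: u64,
    r4: u64,
}

impl SavedCtx {
    fn capture(ctx: &Ctx) -> Self {
        Self { pc: ctx.pc, lr: ctx.lr, r1: ctx.gpr[1], r3: ctx.gpr[3], r4: ctx.gpr[4] }
    }

    fn restore(&self, ctx: &mut Ctx) {
        ctx.pc = self.pc;
        ctx.lr = self.lr;
        ctx.gpr[1] = self.r1;
        ctx.gpr[3] = self.r3;
        ctx.gpr[4] = self.r4;
    }
}

/// Redirect a borrowed context into the callback.
fn enter_callback(ctx: &mut Ctx, callback_pc: u32, source: u32, user_data: u32) -> SavedCtx {
    let saved = SavedCtx::capture(ctx);
    ctx.pc = callback_pc;
    ctx.lr = LR_HALT_SENTINEL; // callback's `blr` lands on the sentinel
    // Pad SP so the callback's prologue cannot stomp the interrupted
    // function's saved LR just below the old stack pointer.
    ctx.gpr[1] = ctx.gpr[1].wrapping_sub(CALLBACK_STACK_PAD);
    ctx.gpr[3] = source as u64;
    ctx.gpr[4] = user_data as u64;
    saved
}
```

A real callback also clobbers other volatile registers, so a production capture would need to cover more state than this sketch does.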
@@ -1,9 +1,9 @@
|
||||
{
|
||||
"instructions": 50000001,
|
||||
"imports": 40454,
|
||||
"instructions": 50000000,
|
||||
"imports": 339766,
|
||||
"unimpl": 0,
|
||||
"draws": 0,
|
||||
"swaps": 1,
|
||||
"swaps": 2,
|
||||
"unique_render_targets": 0,
|
||||
"shader_blobs_live": 0,
|
||||
"texture_cache_entries": 0
|
||||
|
||||
@@ -28,6 +28,15 @@ pub enum StepResult {
|
||||
Trap,
|
||||
/// Execution halted (by debugger or error).
|
||||
Halted,
|
||||
/// Executed the `db16cyc` spin-wait hint (`or r31,r31,r31`, encoding
/// `0x7FFFFB78`). The PC has already advanced past the hint; this is a
/// cooperative-yield signal so the scheduler hands the slot to a Ready
/// peer. On real hardware all six HW threads run concurrently and the
/// spin resolves naturally; under our round-robin lockstep a spinning
/// barrier/spinlock participant would otherwise monopolize its slot and
/// starve the co-located thread it is waiting on. Matches canary's
/// `InstrEmit_orx` db16cyc → `DelayExecution()` handling.
|
||||
Yield,
|
||||
}
|
||||
|
||||
/// Execute a single PPC instruction.
|
||||
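The `0x7FFFFB78` encoding named in the `Yield` doc comment can be checked by decoding the X-form fields of `or rA,rS,rB` (primary opcode 31, extended opcode 444). The sketch below is illustrative only: `is_db16cyc`, `Step`, and this toy `step_block` are hypothetical and ignore everything except the hint, unlike the interpreter's real `step_block`.

```rust
/// Encoding of `or r31,r31,r31` (db16cyc), as given in the diff.
const DB16CYC: u32 = 0x7FFF_FB78;

#[derive(Debug, PartialEq, Eq)]
enum Step {
    Continue,
    Yield,
}

/// Decode the X-form fields and recognise the spin-wait hint.
fn is_db16cyc(word: u32) -> bool {
    let primary = word >> 26;
    let rs = (word >> 21) & 0x1F;
    let ra = (word >> 16) & 0x1F;
    let rb = (word >> 11) & 0x1F;
    let xo = (word >> 1) & 0x3FF;
    let rc = word & 1;
    primary == 31 && xo == 444 && rc == 0 && rs == 31 && ra == 31 && rb == 31
}

/// Walk a block; stop right after the hint so the scheduler can rotate
/// the slot. Returns the result and how many words were retired.
fn step_block(words: &[u32]) -> (Step, usize) {
    for (i, &w) in words.iter().enumerate() {
        if is_db16cyc(w) {
            return (Step::Yield, i + 1); // the hint itself retires
        }
    }
    (Step::Continue, words.len())
}
```

Returning the retired count including the hint matches the doc comment's statement that the PC has already advanced past it when `Yield` is reported.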
@@ -95,6 +104,9 @@ pub fn step_block(
|
||||
ctx.cycle_count += 1;
|
||||
ctx.timebase += 1;
|
||||
if !matches!(result, StepResult::Continue) {
|
||||
// `Yield` (db16cyc spin hint) terminates the block here so the
|
||||
// scheduler regains control and can rotate the slot; the PC has
|
||||
// already advanced past the hint inside `execute`.
|
||||
return result;
|
||||
}
|
||||
// PC discontinuity within a block. By construction only the
|
||||
@@ -117,65 +129,65 @@ fn execute(ctx: &mut PpcContext, mem: &dyn MemoryAccess, instr: &DecodedInstr) -
|
||||
ctx.pc += 4;
|
||||
}
|
||||
PpcOpcode::addis => {
|
||||
// Xbox 360 user mode is 32-bit ABI (MSR.SF=0), so addis must
|
||||
// produce a value whose upper 32 bits don't pollute downstream
|
||||
// 64-bit arithmetic. The PPC ISA in 64-bit mode sign-extends
|
||||
// simm16 before the shift, producing 0xFFFFFFFF_xxxx0000 for
|
||||
// negative simm16 (high bit set). When this value flows into
|
||||
// a 64-bit subfc against a zero-extended lwz value, the unsigned
|
||||
// 64-bit comparison yields wrong CA. Truncate to 32 bits to
|
||||
// simulate 32-bit ABI behavior.
|
||||
// PPCBUG-020 fix: Xenon is a 64-bit core; `addis` produces the full
|
||||
// 64-bit `RA + (EXTS(SI) << 16)`. Matches canary
|
||||
// (`Add(RA, Int64(EXTS(imm) << 16))`, stores full 64-bit).
|
||||
let ra_val = if instr.ra() == 0 { 0u64 } else { ctx.gpr[instr.ra()] };
|
||||
let result = ra_val.wrapping_add((instr.simm16() as i64 as u64) << 16);
|
||||
ctx.gpr[instr.rd()] = result as u32 as u64;
|
||||
ctx.gpr[instr.rd()] = result;
|
||||
ctx.pc += 4;
|
||||
}
|
||||
PpcOpcode::addic => {
|
||||
// PPCBUG-002: 32-bit ABI. CA must be from a 32-bit unsigned compare;
|
||||
// canary's `AddDidCarry` truncates both operands to int32 first.
|
||||
// PPCBUG-020 fix: full 64-bit `RA + EXTS(SI)` (canary `Add(RA,
|
||||
// Int64(EXTS(imm)))`). CA stays a 32-bit unsigned compare to match
|
||||
// canary's `AddDidCarry` (truncates operands to int32 first).
|
||||
let ra32 = ctx.gpr[instr.ra()] as u32;
|
||||
let imm32 = instr.simm16() as i32 as u32;
|
||||
let result32 = ra32.wrapping_add(imm32);
|
||||
ctx.xer_ca = if result32 < ra32 { 1 } else { 0 };
|
||||
ctx.gpr[instr.rd()] = result32 as u64;
|
||||
ctx.gpr[instr.rd()] = ctx.gpr[instr.ra()].wrapping_add(instr.simm16() as i64 as u64);
|
||||
ctx.pc += 4;
|
||||
}
|
||||
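The `addic` change above separates the writeback (now the full 64-bit `RA + EXTS(SI)`) from the carry (still taken from the low 32 bits). A minimal standalone sketch of that split follows. The function name `addic_model` and its tuple return are hypothetical illustrations, not part of the interpreter's API.

```rust
/// Hypothetical model of `addic`, for illustration only.
/// Returns (value written to RT, carry flag).
pub fn addic_model(ra: u64, simm: i16) -> (u64, bool) {
    // Carry: unsigned 32-bit add of the low words, as in the diff's
    // `result32 < ra32` test.
    let ra32 = ra as u32;
    let imm32 = simm as i32 as u32;
    let carry = ra32.wrapping_add(imm32) < ra32;

    // Writeback: full 64-bit sum with the immediate sign-extended to 64 bits.
    let result = ra.wrapping_add(simm as i64 as u64);
    (result, carry)
}
```

For example, `addic_model(0xFFFF_FFFF_0000_0001, -1)` mirrors the updated regression test further down in this diff: the upper word survives in the result while the carry is still decided by the low words.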
PpcOpcode::addicx => {
|
||||
// PPCBUG-003: same fix as addic plus CR0 i32 view.
|
||||
// PPCBUG-020 fix: full 64-bit result; CA 32-bit; CR0 32-bit i32 view
|
||||
// (= low 32 of the result; unchanged from the pre-fix behaviour).
|
||||
let ra32 = ctx.gpr[instr.ra()] as u32;
|
||||
let imm32 = instr.simm16() as i32 as u32;
|
||||
let result32 = ra32.wrapping_add(imm32);
|
||||
ctx.xer_ca = if result32 < ra32 { 1 } else { 0 };
|
||||
ctx.gpr[instr.rd()] = result32 as u64;
|
||||
ctx.gpr[instr.rd()] = ctx.gpr[instr.ra()].wrapping_add(instr.simm16() as i64 as u64);
|
||||
ctx.update_cr_signed(0, result32 as i32 as i64);
|
||||
ctx.pc += 4;
|
||||
}
|
||||
PpcOpcode::subficx => {
|
||||
// PPCBUG-005: 32-bit ABI. Sign-extended imm has bits 32-63 set for
|
||||
// negative SIMM, poisoning the writeback. Canary uses 32-bit form.
|
||||
// PPCBUG-020 fix: full 64-bit `EXTS(SI) - RA` (canary `Sub(Int64(
|
||||
// EXTS(imm)), RA)`). CA stays a 32-bit compare.
|
||||
let ra32 = ctx.gpr[instr.ra()] as u32;
|
||||
let imm32 = instr.simm16() as i32 as u32;
|
||||
let result32 = imm32.wrapping_sub(ra32);
|
||||
ctx.xer_ca = if imm32 >= ra32 { 1 } else { 0 };
|
||||
ctx.gpr[instr.rd()] = result32 as u64;
|
||||
ctx.gpr[instr.rd()] = (instr.simm16() as i64 as u64).wrapping_sub(ctx.gpr[instr.ra()]);
|
||||
ctx.pc += 4;
|
||||
}
|
||||
PpcOpcode::mulli => {
|
||||
// PPCBUG-004: 32-bit ABI. Read RA as i32 (low 32, sign-extended for
|
||||
// multiply), product fits in 32 bits per ISA (overflow wraps).
|
||||
let ra = ctx.gpr[instr.ra()] as i32 as i64;
|
||||
// PPCBUG-020 fix: full 64-bit low product of (full 64-bit RA) ×
|
||||
// EXTS(SI). Matches canary InstrEmit_mulli
|
||||
// (`StoreGPR(Mul(LoadGPR(RA), Int64(EXTS(imm))))`).
|
||||
let ra = ctx.gpr[instr.ra()] as i64;
|
||||
let imm = instr.simm16() as i64;
|
||||
ctx.gpr[instr.rd()] = (ra.wrapping_mul(imm) as u32) as u64;
|
||||
ctx.gpr[instr.rd()] = ra.wrapping_mul(imm) as u64;
|
||||
ctx.pc += 4;
|
||||
}
|
||||
|
||||
// ===== ALU: Register =====
|
||||
PpcOpcode::addx => {
|
||||
// PPCBUG-012+020: 32-bit ABI writeback truncation + CR0 i32 view.
|
||||
// PPCBUG-020 fix: full 64-bit `RA + RB` (canary `Add(RA, RB)`).
|
||||
// OV/CR0 keep their 32-bit computation (low 32 of the result is
|
||||
// unchanged), so only the previously-zeroed upper 32 bits change.
|
||||
let ra32 = ctx.gpr[instr.ra()] as u32;
|
||||
let rb32 = ctx.gpr[instr.rb()] as u32;
|
||||
let result32 = ra32.wrapping_add(rb32);
|
||||
ctx.gpr[instr.rd()] = result32 as u64;
|
||||
ctx.gpr[instr.rd()] = ctx.gpr[instr.ra()].wrapping_add(ctx.gpr[instr.rb()]);
|
||||
if instr.oe() {
|
||||
let true_sum = (ra32 as i32 as i128) + (rb32 as i32 as i128);
|
||||
overflow::apply(ctx, true_sum != (result32 as i32) as i128);
|
||||
@@ -186,12 +198,13 @@ fn execute(ctx: &mut PpcContext, mem: &dyn MemoryAccess, instr: &DecodedInstr) -
|
||||
ctx.pc += 4;
|
||||
}
|
||||
PpcOpcode::addcx => {
|
||||
// PPCBUG-013+020: 32-bit truncation; CA from u32 unsigned compare.
|
||||
// PPCBUG-020 fix: full 64-bit `RA + RB`; CA stays 32-bit (canary
|
||||
// `AddDidCarry` truncates to int32). Low 32 of result unchanged.
|
||||
let ra32 = ctx.gpr[instr.ra()] as u32;
|
||||
let rb32 = ctx.gpr[instr.rb()] as u32;
|
||||
let result32 = ra32.wrapping_add(rb32);
|
||||
ctx.xer_ca = if result32 < ra32 { 1 } else { 0 };
|
||||
ctx.gpr[instr.rd()] = result32 as u64;
|
||||
ctx.gpr[instr.rd()] = ctx.gpr[instr.ra()].wrapping_add(ctx.gpr[instr.rb()]);
|
||||
if instr.oe() {
|
||||
let true_sum = (ra32 as i32 as i128) + (rb32 as i32 as i128);
|
||||
overflow::apply(ctx, true_sum != (result32 as i32) as i128);
|
||||
@@ -202,13 +215,15 @@ fn execute(ctx: &mut PpcContext, mem: &dyn MemoryAccess, instr: &DecodedInstr) -
|
||||
ctx.pc += 4;
|
||||
}
|
||||
PpcOpcode::addex => {
|
||||
// PPCBUG-014+020: 32-bit truncation; CA from u32 unsigned compare.
|
||||
// PPCBUG-020 fix: full 64-bit `RA + RB + CA`; CA stays 32-bit.
|
||||
let ra32 = ctx.gpr[instr.ra()] as u32;
|
||||
let rb32 = ctx.gpr[instr.rb()] as u32;
|
||||
let ca = ctx.xer_ca as u32;
|
||||
let result32 = ra32.wrapping_add(rb32).wrapping_add(ca);
|
||||
ctx.xer_ca = if result32 < ra32 || (ca != 0 && result32 == ra32) { 1 } else { 0 };
|
||||
ctx.gpr[instr.rd()] = result32 as u64;
|
||||
ctx.gpr[instr.rd()] = ctx.gpr[instr.ra()]
|
||||
.wrapping_add(ctx.gpr[instr.rb()])
|
||||
.wrapping_add(ca as u64);
|
||||
if instr.oe() {
|
||||
let true_sum = (ra32 as i32 as i128) + (rb32 as i32 as i128) + (ca as i128);
|
||||
overflow::apply(ctx, true_sum != (result32 as i32) as i128);
|
||||
@@ -219,12 +234,12 @@ fn execute(ctx: &mut PpcContext, mem: &dyn MemoryAccess, instr: &DecodedInstr) -
|
||||
ctx.pc += 4;
|
||||
}
|
||||
PpcOpcode::addzex => {
|
||||
// PPCBUG-015+020: 32-bit truncation.
|
||||
// PPCBUG-020 fix: full 64-bit `RA + CA`; CA stays 32-bit.
|
||||
let ra32 = ctx.gpr[instr.ra()] as u32;
|
||||
let ca = ctx.xer_ca as u32;
|
||||
let result32 = ra32.wrapping_add(ca);
|
||||
ctx.xer_ca = if result32 < ra32 { 1 } else { 0 };
|
||||
ctx.gpr[instr.rd()] = result32 as u64;
|
||||
ctx.gpr[instr.rd()] = ctx.gpr[instr.ra()].wrapping_add(ca as u64);
|
||||
if instr.oe() {
|
||||
let true_sum = (ra32 as i32 as i128) + (ca as i128);
|
||||
overflow::apply(ctx, true_sum != (result32 as i32) as i128);
|
||||
@@ -235,12 +250,12 @@ fn execute(ctx: &mut PpcContext, mem: &dyn MemoryAccess, instr: &DecodedInstr) -
|
||||
ctx.pc += 4;
|
||||
}
|
||||
PpcOpcode::addmex => {
|
||||
// PPCBUG-016+020: 32-bit truncation. RT = RA + CA - 1.
|
||||
// PPCBUG-020 fix: full 64-bit `RA + CA - 1`; CA stays 32-bit.
|
||||
let ra32 = ctx.gpr[instr.ra()] as u32;
|
||||
let ca = ctx.xer_ca as u32;
|
||||
let result32 = ra32.wrapping_add(ca).wrapping_sub(1);
|
||||
ctx.xer_ca = if ra32 != 0 || ca != 0 { 1 } else { 0 };
|
||||
ctx.gpr[instr.rd()] = result32 as u64;
|
||||
ctx.gpr[instr.rd()] = ctx.gpr[instr.ra()].wrapping_add(ca as u64).wrapping_sub(1);
|
||||
if instr.oe() {
|
||||
let true_sum = (ra32 as i32 as i128) + (ca as i128) - 1;
|
||||
overflow::apply(ctx, true_sum != (result32 as i32) as i128);
|
||||
@@ -251,11 +266,12 @@ fn execute(ctx: &mut PpcContext, mem: &dyn MemoryAccess, instr: &DecodedInstr) -
|
||||
ctx.pc += 4;
|
||||
}
|
||||
PpcOpcode::subfx => {
|
||||
// PPCBUG-017+020: 32-bit truncation.
|
||||
// PPCBUG-020 fix: full 64-bit `RB - RA` (canary `Sub(RB, RA)`).
|
||||
// OV/CR0 keep their 32-bit view (low 32 of result unchanged).
|
||||
let ra32 = ctx.gpr[instr.ra()] as u32;
|
||||
let rb32 = ctx.gpr[instr.rb()] as u32;
|
||||
let result32 = rb32.wrapping_sub(ra32);
|
||||
ctx.gpr[instr.rd()] = result32 as u64;
|
||||
ctx.gpr[instr.rd()] = ctx.gpr[instr.rb()].wrapping_sub(ctx.gpr[instr.ra()]);
|
||||
if instr.oe() {
|
||||
let true_diff = (rb32 as i32 as i128) - (ra32 as i32 as i128);
|
||||
overflow::apply(ctx, true_diff != (result32 as i32) as i128);
|
||||
@@ -266,14 +282,13 @@ fn execute(ctx: &mut PpcContext, mem: &dyn MemoryAccess, instr: &DecodedInstr) -
|
||||
ctx.pc += 4;
|
||||
}
|
||||
PpcOpcode::subfcx => {
|
||||
// PPCBUG-007: 32-bit ABI. The `rb >= ra` u64 unsigned compare is
|
||||
// exactly the shape that broke addis. Defensive 32-bit truncation
|
||||
// is required for correct CA even after upstream cleanup.
|
||||
// PPCBUG-020 fix: full 64-bit `RB - RA`; CA stays a 32-bit `rb >= ra`
|
||||
// compare (canary `SubDidCarry` truncates to int32).
|
||||
let ra32 = ctx.gpr[instr.ra()] as u32;
|
||||
let rb32 = ctx.gpr[instr.rb()] as u32;
|
||||
let result32 = rb32.wrapping_sub(ra32);
|
||||
ctx.xer_ca = if rb32 >= ra32 { 1 } else { 0 };
|
||||
ctx.gpr[instr.rd()] = result32 as u64;
|
||||
ctx.gpr[instr.rd()] = ctx.gpr[instr.rb()].wrapping_sub(ctx.gpr[instr.ra()]);
|
||||
if instr.oe() {
|
||||
let true_diff = (rb32 as i32 as i128) - (ra32 as i32 as i128);
|
||||
overflow::apply(ctx, true_diff != (result32 as i32) as i128);
|
||||
@@ -284,14 +299,16 @@ fn execute(ctx: &mut PpcContext, mem: &dyn MemoryAccess, instr: &DecodedInstr) -
|
||||
ctx.pc += 4;
|
||||
}
|
||||
PpcOpcode::subfex => {
|
||||
// PPCBUG-008: 32-bit ABI. Compute in u32 space — `!ra` on u64 always
|
||||
// pollutes the upper 32 bits, making this an active poisoner.
|
||||
// PPCBUG-020 fix: full 64-bit `~RA + RB + CA` (canary semantics).
|
||||
// CA keeps its 32-bit compare. Low 32 of the result is unchanged.
|
||||
let ra32 = ctx.gpr[instr.ra()] as u32;
|
||||
let rb32 = ctx.gpr[instr.rb()] as u32;
|
||||
let ca = ctx.xer_ca as u32;
|
||||
let result32 = (!ra32).wrapping_add(rb32).wrapping_add(ca);
|
||||
ctx.xer_ca = if rb32 > ra32 || (rb32 == ra32 && ca != 0) { 1 } else { 0 };
|
||||
ctx.gpr[instr.rd()] = result32 as u64;
|
||||
ctx.gpr[instr.rd()] = (!ctx.gpr[instr.ra()])
|
||||
.wrapping_add(ctx.gpr[instr.rb()])
|
||||
.wrapping_add(ca as u64);
|
||||
if instr.oe() {
|
||||
// RT <- !RA + RB + CA == RB - RA - 1 + CA (32-bit semantics).
|
||||
let true_sum = (rb32 as i32 as i128) - (ra32 as i32 as i128) - 1 + (ca as i128);
|
||||
@@ -303,14 +320,13 @@ fn execute(ctx: &mut PpcContext, mem: &dyn MemoryAccess, instr: &DecodedInstr) -
|
||||
ctx.pc += 4;
|
||||
}
|
||||
PpcOpcode::subfzex => {
|
||||
// PPCBUG-018: same active-poisoning shape as subfex; operate in u32.
|
||||
// PPCBUG-020 fix: full 64-bit `~RA + CA` (canary semantics).
|
||||
let ra32 = ctx.gpr[instr.ra()] as u32;
|
||||
let ca = ctx.xer_ca as u32;
|
||||
let result32 = (!ra32).wrapping_add(ca);
|
||||
// RT <- !RA + CA (no -1 term). 32-bit carry-out only when
|
||||
// !ra32 = u32::MAX (i.e. ra32 = 0) AND ca = 1.
|
||||
// CA: 32-bit carry-out only when !ra32 = u32::MAX (ra32 = 0) AND ca = 1.
|
||||
ctx.xer_ca = if ra32 == 0 && ca != 0 { 1 } else { 0 };
|
||||
ctx.gpr[instr.rd()] = result32 as u64;
|
||||
ctx.gpr[instr.rd()] = (!ctx.gpr[instr.ra()]).wrapping_add(ca as u64);
|
||||
if instr.oe() {
|
||||
let true_sum = -(ra32 as i32 as i128) - 1 + (ca as i128);
|
||||
overflow::apply(ctx, true_sum != (result32 as i32) as i128);
|
||||
@@ -321,13 +337,13 @@ fn execute(ctx: &mut PpcContext, mem: &dyn MemoryAccess, instr: &DecodedInstr) -
|
||||
ctx.pc += 4;
|
||||
}
|
||||
PpcOpcode::subfmex => {
|
||||
// PPCBUG-019: also fixes the always-true CA edge — `!ra` on u64
|
||||
// is non-zero when ra32==0xFFFFFFFF and ca==0, so CA was stuck at 1.
|
||||
// PPCBUG-020 fix: full 64-bit `~RA + CA - 1` (canary semantics). CA
|
||||
// uses the 32-bit `!ra32` so it isn't stuck at 1 from u64 inversion.
|
||||
let ra32 = ctx.gpr[instr.ra()] as u32;
|
||||
let ca = ctx.xer_ca as u32;
|
||||
let result32 = (!ra32).wrapping_add(ca).wrapping_sub(1);
|
||||
ctx.xer_ca = if (!ra32) != 0 || ca != 0 { 1 } else { 0 };
|
||||
ctx.gpr[instr.rd()] = result32 as u64;
|
||||
ctx.gpr[instr.rd()] = (!ctx.gpr[instr.ra()]).wrapping_add(ca as u64).wrapping_sub(1);
|
||||
if instr.oe() {
|
||||
let true_sum = -(ra32 as i32 as i128) - 2 + (ca as i128);
|
||||
overflow::apply(ctx, true_sum != (result32 as i32) as i128);
|
||||
@@ -338,12 +354,11 @@ fn execute(ctx: &mut PpcContext, mem: &dyn MemoryAccess, instr: &DecodedInstr) -
|
||||
ctx.pc += 4;
|
||||
}
|
||||
PpcOpcode::negx => {
|
||||
// PPCBUG-006: 32-bit ABI. `(!ra).wrapping_add(1)` on u64 always
|
||||
// sets upper 32 bits — every neg poisoned the GPR. neg_ov also
|
||||
// checks at 64-bit INT_MIN; should be 32-bit INT_MIN.
|
||||
// PPCBUG-020 fix: full 64-bit `-RA` (canary `Sub(0, RA)`). OV keeps
|
||||
// the 32-bit INT_MIN check (low 32 of the result is unchanged).
|
||||
let ra32 = ctx.gpr[instr.ra()] as u32;
|
||||
let result32 = (!ra32).wrapping_add(1);
|
||||
ctx.gpr[instr.rd()] = result32 as u64;
|
||||
ctx.gpr[instr.rd()] = 0u64.wrapping_sub(ctx.gpr[instr.ra()]);
|
||||
if instr.oe() {
|
||||
overflow::apply(ctx, ra32 == 0x8000_0000);
|
||||
}
|
||||
@@ -353,12 +368,15 @@ fn execute(ctx: &mut PpcContext, mem: &dyn MemoryAccess, instr: &DecodedInstr) -
|
||||
ctx.pc += 4;
|
||||
}
|
||||
PpcOpcode::mullwx => {
|
||||
// PPCBUG-009: 32-bit ABI. Truncate product to u32 — overflow detection
|
||||
// (mullw_ov) still uses the full i64 product to catch the overflow.
|
||||
// PPCBUG-020 fix: full 64-bit low product of EXTS(RA[32:63]) ×
|
||||
// EXTS(RB[32:63]) (canary InstrEmit_mullwx stores the full i64
|
||||
// product). A 32×32 product can occupy the upper 32 bits (e.g.
|
||||
// 0x10000 × 0x10000 = 0x1_0000_0000); the old `as u32` dropped them.
|
||||
// OV uses the full product; CR0 keeps its 32-bit (low-word) view.
|
||||
let ra = ctx.gpr[instr.ra()] as i32 as i64;
|
||||
let rb = ctx.gpr[instr.rb()] as i32 as i64;
|
||||
let product = ra.wrapping_mul(rb);
|
||||
ctx.gpr[instr.rd()] = product as u32 as u64;
|
||||
ctx.gpr[instr.rd()] = product as u64;
|
||||
if instr.oe() {
|
||||
overflow::apply(ctx, overflow::mullw_ov(product));
|
||||
}
|
||||
@@ -542,6 +560,18 @@ fn execute(ctx: &mut PpcContext, mem: &dyn MemoryAccess, instr: &DecodedInstr) -
|
||||
ctx.gpr[instr.ra()] = ctx.gpr[instr.rs()] | ctx.gpr[instr.rb()];
|
||||
if instr.rc_bit() { ctx.update_cr_signed(0, ctx.gpr[instr.ra()] as u32 as i32 as i64); }
|
||||
ctx.pc += 4;
|
||||
// `or r31,r31,r31` with encoding 0x7FFFFB78 is the Xenon `db16cyc`
|
||||
// spin-wait hint (a no-op write of r31 onto itself). Canary's
|
||||
// `InstrEmit_orx` special-cases exactly this code → `DelayExecution()`.
|
||||
// Under our round-robin lockstep, a guest spinlock/barrier loop that
|
||||
// executes db16cyc would otherwise consume its whole block every round
|
||||
// and starve the co-located thread it is waiting on (the lock holder /
|
||||
// barrier peer). Surface it as a cooperative yield so the scheduler can
|
||||
// hand the slot to a Ready peer. The semantic result of the op is
|
||||
// already applied (r31 |= r31 is a no-op), so yielding is value-neutral.
|
||||
if instr.raw == 0x7FFF_FB78 {
|
||||
return StepResult::Yield;
|
||||
}
|
||||
}
|
||||
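The yield check above keys on one exact encoding. As a sanity check on the constant, the sketch below builds `or rN,rN,rN` from its fields (primary opcode 31, XO 444, Rc=0) and compares it with `0x7FFFFB78`. The helper names `encode_or_self` and `is_db16cyc` are hypothetical and exist only for this illustration.

```rust
/// Encoding the diff treats as the db16cyc spin hint.
pub const DB16CYC: u32 = 0x7FFF_FB78;

/// Hypothetical helper: encode `or rN,rN,rN` with Rc=0.
pub fn encode_or_self(r: u32) -> u32 {
    assert!(r < 32, "register index must fit in 5 bits");
    (31u32 << 26) | (r << 21) | (r << 16) | (r << 11) | (444u32 << 1)
}

/// Hypothetical helper: true only for the exact db16cyc encoding.
pub fn is_db16cyc(raw: u32) -> bool {
    raw == DB16CYC
}
```

Only `encode_or_self(31)` produces the hint; any other register index gives a different word, so a comparison on the raw instruction is enough to avoid over-yielding.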
PpcOpcode::orcx => {
|
||||
// PPCBUG-028: same shape as andcx — operate in u32.
|
||||
@@ -620,7 +650,12 @@ fn execute(ctx: &mut PpcContext, mem: &dyn MemoryAccess, instr: &DecodedInstr) -
|
||||
PpcOpcode::slwx => {
|
||||
// PPCBUG-044: 32-bit ABI CR0 view. A result with bit 31 set
|
||||
// (e.g. 0x80000000) is negative in i32 view but positive in i64.
|
||||
let sh = ctx.gpr[instr.rb()] as u32;
|
||||
// Shift amount is RB[58:63] (6 bits): if >=32 the result is zeroed,
|
||||
// else shift by the low bits. Matches canary InstrEmit_slwx, which
|
||||
// masks `rb & 0x3F` then tests bit 5 — NOT a full-u32 `< 32` test
|
||||
// (a count like 0x40 has low-6-bits 0 and must pass the value
|
||||
// through, not zero it).
|
||||
let sh = ctx.gpr[instr.rb()] as u32 & 0x3F;
|
||||
ctx.gpr[instr.ra()] = if sh < 32 {
|
||||
((ctx.gpr[instr.rs()] as u32) << sh) as u64
|
||||
} else { 0 };
|
||||
@@ -630,7 +665,9 @@ fn execute(ctx: &mut PpcContext, mem: &dyn MemoryAccess, instr: &DecodedInstr) -
|
||||
PpcOpcode::srwx => {
|
||||
// PPCBUG-044: 32-bit ABI CR0 view (zero-extended right shift can never
|
||||
// have bit 31 set, but use the canonical form for consistency).
|
||||
let sh = ctx.gpr[instr.rb()] as u32;
|
||||
// Shift amount masked to RB[58:63] (6 bits) to match canary
|
||||
// InstrEmit_srwx (`rb & 0x3F`, test bit 5).
|
||||
let sh = ctx.gpr[instr.rb()] as u32 & 0x3F;
|
||||
ctx.gpr[instr.ra()] = if sh < 32 {
|
||||
((ctx.gpr[instr.rs()] as u32) >> sh) as u64
|
||||
} else { 0 };
|
||||
@@ -638,37 +675,46 @@ fn execute(ctx: &mut PpcContext, mem: &dyn MemoryAccess, instr: &DecodedInstr) -
|
||||
ctx.pc += 4;
|
||||
}
|
||||
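The `slw`/`srw` fix above comes down to masking the count to six bits before comparing it with 32. A small sketch of that rule as free functions follows. `slw_model` and `srw_model` are hypothetical names; they model only the value written to RA, not CR0.

```rust
/// Hypothetical model of `slw`: shift the low word of `rs` left by `rb & 0x3F`.
pub fn slw_model(rs: u64, rb: u64) -> u64 {
    let sh = (rb as u32) & 0x3F; // only the low 6 bits of RB count
    if sh < 32 {
        ((rs as u32) << sh) as u64
    } else {
        0 // masked count 32..=63 clears the result
    }
}

/// Hypothetical model of `srw`: logical right shift with the same mask.
pub fn srw_model(rs: u64, rb: u64) -> u64 {
    let sh = (rb as u32) & 0x3F;
    if sh < 32 {
        ((rs as u32) >> sh) as u64
    } else {
        0
    }
}
```

A count of `0x40` masks to 0 and passes the value through, while `0x60` masks to 32 and zeroes it. These are the same cases the new tests in this diff pin down.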
PpcOpcode::srawx => {
|
||||
// PPCBUG-041+043 coupled: 32-bit ABI writeback truncation + CR0 i32.
|
||||
// CA logic is independently correct (uses u32 shifted-out test).
|
||||
// sraw: 32-bit arithmetic shift right. Per PowerISA the 32-bit result
|
||||
// is SIGN-extended into the full 64-bit RA (`RA <- r&m | (i64.s)&¬m`),
|
||||
// matching canary InstrEmit_srawx (`v = f.SignExtend(v, INT64_TYPE)`).
|
||||
// Earlier ours zero-extended (`result as u32 as u64`) — the PPCBUG-041
|
||||
// "writeback truncation" band-aid — which corrupts any negative shift
|
||||
// result consumed as a 64-bit value. CA logic is independently correct
|
||||
// (uses the u32 shifted-out test) and the CR0 view is unchanged (the
|
||||
// sign-extended i64 has the same i32 view).
|
||||
let rs = ctx.gpr[instr.rs()] as i32;
|
||||
let sh = ctx.gpr[instr.rb()] as u32 & 0x3F;
|
||||
if sh == 0 {
|
||||
ctx.gpr[instr.ra()] = rs as u32 as u64;
|
||||
let result: i32 = if sh == 0 {
|
||||
ctx.xer_ca = 0;
|
||||
rs
|
||||
} else if sh < 32 {
|
||||
let result = rs >> sh;
|
||||
ctx.xer_ca = if rs < 0 && (rs as u32) << (32 - sh) != 0 { 1 } else { 0 };
|
||||
ctx.gpr[instr.ra()] = result as u32 as u64;
|
||||
rs >> sh
|
||||
} else {
|
||||
ctx.gpr[instr.ra()] = if rs < 0 { 0xFFFF_FFFFu64 } else { 0 };
|
||||
// sh >= 32: result is all sign bits of rs.
|
||||
ctx.xer_ca = if rs < 0 { 1 } else { 0 };
|
||||
}
|
||||
if instr.rc_bit() { ctx.update_cr_signed(0, ctx.gpr[instr.ra()] as u32 as i32 as i64); }
|
||||
rs >> 31
|
||||
};
|
||||
ctx.gpr[instr.ra()] = result as i64 as u64;
|
||||
if instr.rc_bit() { ctx.update_cr_signed(0, result as i64); }
|
||||
ctx.pc += 4;
|
||||
}
|
||||
PpcOpcode::srawix => {
|
||||
// PPCBUG-042+043 coupled: same shape as srawx for the sh-immediate form.
|
||||
// srawi: same as srawx for the sh-immediate form (sh in 0..31).
|
||||
// Sign-extend the 32-bit result into the full 64-bit RA per PowerISA /
|
||||
// canary InstrEmit_srawix.
|
||||
let rs = ctx.gpr[instr.rs()] as i32;
|
||||
let sh = instr.sh();
|
||||
if sh == 0 {
|
||||
ctx.gpr[instr.ra()] = rs as u32 as u64;
|
||||
let result: i32 = if sh == 0 {
|
||||
ctx.xer_ca = 0;
|
||||
rs
|
||||
} else {
|
||||
let result = rs >> sh;
|
||||
ctx.xer_ca = if rs < 0 && (rs as u32) << (32 - sh) != 0 { 1 } else { 0 };
|
||||
ctx.gpr[instr.ra()] = result as u32 as u64;
|
||||
}
|
||||
if instr.rc_bit() { ctx.update_cr_signed(0, ctx.gpr[instr.ra()] as u32 as i32 as i64); }
|
||||
rs >> sh
|
||||
};
|
||||
ctx.gpr[instr.ra()] = result as i64 as u64;
|
||||
if instr.rc_bit() { ctx.update_cr_signed(0, result as i64); }
|
||||
ctx.pc += 4;
|
||||
}
|
||||
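The `sraw`/`srawi` rewrite above computes a 32-bit result first and sign-extends it once at writeback. The sketch below restates that flow in isolation, under the same CA rules the diff uses. `sraw_model` is a hypothetical name and the tuple return is an illustration choice, not the interpreter's interface.

```rust
/// Hypothetical model of `sraw`. Returns (value written to RA, carry flag).
pub fn sraw_model(rs: u64, rb: u64) -> (u64, bool) {
    let rs32 = rs as i32;
    let sh = (rb as u32) & 0x3F;

    let (result, carry) = if sh == 0 {
        (rs32, false)
    } else if sh < 32 {
        // Carry is set only for a negative source that loses 1-bits.
        let lost_bits = (rs32 as u32) << (32 - sh) != 0;
        (rs32 >> sh, rs32 < 0 && lost_bits)
    } else {
        // Count of 32 or more: the result is all copies of the sign bit.
        (rs32 >> 31, rs32 < 0)
    };

    // Sign-extend the 32-bit result into the 64-bit register.
    (result as i64 as u64, carry)
}
```

With `rs = 0x8000_0000` and a count of 1, this yields `0xFFFF_FFFF_C000_0000` with no carry, matching the updated `srawx` test expectation later in this diff.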
PpcOpcode::sldx => {
|
||||
@@ -1605,7 +1651,12 @@ fn execute(ctx: &mut PpcContext, mem: &dyn MemoryAccess, instr: &DecodedInstr) -
|
||||
match spr {
|
||||
crate::context::spr::XER => ctx.set_xer(val as u32),
|
||||
crate::context::spr::LR => ctx.lr = val,
|
||||
crate::context::spr::CTR => ctx.ctr = val as u32 as u64,
|
||||
// CTR is a 64-bit SPR — store the full GPR, matching canary
|
||||
// InstrEmit_mtspr (`f.StoreCTR(rt)`, no truncation). The PPCBUG-054
|
||||
// `val as u32 as u64` band-aid dropped the upper 32 bits, which a
|
||||
// later `mfspr rX, CTR` would read back wrong. (bdnz/bcctr only
|
||||
// ever consume CTR's low 32 bits, so branching is unaffected.)
|
||||
crate::context::spr::CTR => ctx.ctr = val,
|
||||
crate::context::spr::DEC => ctx.dec = val as u32,
|
||||
crate::context::spr::TBL_WRITE => {
|
||||
ctx.timebase = (ctx.timebase & 0xFFFF_FFFF_0000_0000) | (val & 0xFFFF_FFFF);
|
||||
@@ -5015,6 +5066,106 @@ mod tests {
|
||||
assert_eq!(ctx.pc, 4);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn test_db16cyc_yields() {
|
||||
// `or r31,r31,r31` encoding 0x7FFFFB78 is the Xenon db16cyc spin hint.
|
||||
// It must (a) be value-neutral (r31 unchanged), (b) advance PC, and
|
||||
// (c) report StepResult::Yield so the scheduler can hand off the slot.
|
||||
let mut ctx = PpcContext::new();
|
||||
let mut mem = TestMem::new();
|
||||
write_instr(&mut mem, 0, 0x7FFF_FB78);
|
||||
ctx.pc = 0;
|
||||
ctx.gpr[31] = 0x1234_5678_9ABC_DEF0;
|
||||
let r = step(&mut ctx, &mut mem);
|
||||
assert_eq!(ctx.gpr[31], 0x1234_5678_9ABC_DEF0, "db16cyc is value-neutral");
|
||||
assert_eq!(ctx.pc, 4, "PC advances past the hint");
|
||||
assert_eq!(r, StepResult::Yield, "db16cyc surfaces as a cooperative yield");
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn test_plain_or_self_is_not_yield() {
|
||||
// A regular `or rN,rN,rN` that is NOT the db16cyc encoding (e.g. r3)
|
||||
// is an ordinary no-op move and must keep executing (Continue), so we
|
||||
// only yield on the exact spin-hint code canary special-cases.
|
||||
let mut ctx = PpcContext::new();
|
||||
let mut mem = TestMem::new();
|
||||
// or r3, r3, r3 (RT=RA=RB=3, Rc=0): 31<<26 | 3<<21 | 3<<16 | 3<<11 | 444<<1
|
||||
let raw = (31u32 << 26) | (3 << 21) | (3 << 16) | (3 << 11) | (444 << 1);
|
||||
write_instr(&mut mem, 0, raw);
|
||||
ctx.pc = 0;
|
||||
ctx.gpr[3] = 0xCAFE;
|
||||
let r = step(&mut ctx, &mut mem);
|
||||
assert_eq!(ctx.gpr[3], 0xCAFE);
|
||||
assert_eq!(ctx.pc, 4);
|
||||
assert_eq!(r, StepResult::Continue, "non-db16cyc or-self stays Continue");
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn test_smt_priority_hints_are_nops_not_yields() {
|
||||
// iterate-2H spin/yield/sync hint-class audit. The PowerPC SMT
|
||||
// thread-priority hints `or 1,1,1` / `or 2,2,2` / `or 3,3,3` / `or 6,6,6`
|
||||
// (and the db8cyc family `or 26..30`) are reserved no-op encodings.
|
||||
// Canary's `InstrEmit_orx` emits `f.Nop()` for EVERY `or rX,rX,rX`
|
||||
// (RT==RB==RA && !Rc) form EXCEPT the exact db16cyc code 0x7FFFFB78,
|
||||
// which alone gets `f.DelayExecution()`. So ours must NOT yield on any
|
||||
// of these — over-yielding would diverge from canary and perturb the
|
||||
// deterministic schedule. (Audit evidence: none of 1/2/3/6/26..30 even
|
||||
// appear in Sylpheed's image; only `or 31,31,31` (db16cyc) is used as a
|
||||
// spin hint. This test locks the no-over-yield invariant regardless.)
|
||||
for r in [1u32, 2, 3, 6, 26, 27, 28, 29, 30] {
|
||||
let mut ctx = PpcContext::new();
|
||||
let mut mem = TestMem::new();
|
||||
// or rN,rN,rN, Rc=0: 31<<26 | r<<21 | r<<16 | r<<11 | 444<<1
|
||||
let raw = (31u32 << 26) | (r << 21) | (r << 16) | (r << 11) | (444 << 1);
|
||||
write_instr(&mut mem, 0, raw);
|
||||
ctx.pc = 0;
|
||||
ctx.gpr[r as usize] = 0xDEAD_BEEF_F00D_BA11;
|
||||
let res = step(&mut ctx, &mut mem);
|
||||
assert_eq!(
|
||||
ctx.gpr[r as usize], 0xDEAD_BEEF_F00D_BA11,
|
||||
"or {r},{r},{r} is value-neutral"
|
||||
);
|
||||
assert_eq!(ctx.pc, 4, "or {r},{r},{r} advances PC");
|
||||
assert_eq!(
|
||||
res,
|
||||
StepResult::Continue,
|
||||
"priority hint or {r},{r},{r} is a plain no-op (canary Nop), NOT a yield"
|
||||
);
|
||||
}
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn test_lwsync_ptesync_eieio_isync_decode_as_benign_noops() {
|
||||
// Memory/sync barrier class. Canary keys `sync` on XO=598 only, so
|
||||
// sync (L=0), lwsync (L=1), ptesync (L=2) all map to the same
|
||||
// `InstrEmit_sync` -> `MemoryBarrier`; `eieio` -> `MemoryBarrier`;
|
||||
// `isync` -> `Nop`. Under our single-host interpreter every one is a
|
||||
// value-neutral no-op that advances PC and must DECODE (never trap as
|
||||
// unknown). This guards the L-field disambiguation and the decode path.
|
||||
let cases: &[(u32, &str)] = &[
|
||||
(0x7C00_04AC, "sync"), // L=0
|
||||
(0x7C20_04AC, "lwsync"), // L=1
|
||||
(0x7C40_04AC, "ptesync"), // L=2
|
||||
(0x7C00_06AC, "eieio"),
|
||||
(0x4C00_012C, "isync"),
|
||||
];
|
||||
for &(raw, name) in cases {
|
||||
let mut ctx = PpcContext::new();
|
||||
let mut mem = TestMem::new();
|
||||
let pre_xer = ctx.xer();
|
||||
let pre_fpscr = ctx.fpscr;
|
||||
let pre_gpr = ctx.gpr;
|
||||
write_instr(&mut mem, 0x200, raw);
|
||||
ctx.pc = 0x200;
|
||||
let res = step(&mut ctx, &mut mem);
|
||||
assert_eq!(res, StepResult::Continue, "{name} continues");
|
||||
assert_eq!(ctx.pc, 0x204, "{name} advances PC (decoded, did not trap)");
|
||||
assert_eq!(ctx.xer(), pre_xer, "{name} leaves XER");
|
||||
assert_eq!(ctx.fpscr, pre_fpscr, "{name} leaves FPSCR");
|
||||
assert_eq!(ctx.gpr, pre_gpr, "{name} leaves GPRs");
|
||||
}
|
||||
}
|
||||
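The raw words in the barrier test above can be rebuilt from their fields, which may help when reading the L-field comments. This sketch assumes the standard encodings (`sync` is opcode 31 with XO 598 and L in bits 21..22 of the word, `eieio` is opcode 31 with XO 854, `isync` is opcode 19 with XO 150). The helper names are hypothetical.

```rust
/// Hypothetical helper: encode `sync` with the given L field (0, 1 or 2).
pub fn encode_sync(l: u32) -> u32 {
    assert!(l < 4, "L is a 2-bit field");
    (31u32 << 26) | (l << 21) | (598u32 << 1)
}

/// Hypothetical helper: encode `eieio`.
pub fn encode_eieio() -> u32 {
    (31u32 << 26) | (854u32 << 1)
}

/// Hypothetical helper: encode `isync`.
pub fn encode_isync() -> u32 {
    (19u32 << 26) | (150u32 << 1)
}
```

The three `sync` variants differ only in the L field, which is why a decoder keyed on XO=598 alone routes all of them to the same handler.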
|
||||
#[test]
|
||||
fn test_fadd() {
|
||||
let mut ctx = PpcContext::new();
|
||||
@@ -5332,15 +5483,17 @@ mod tests {
|
||||
write_instr(&mut mem, 0, raw);
|
||||
ctx.pc = 0;
|
||||
step(&mut ctx, &mut mem);
|
||||
assert_eq!(ctx.xer_ov, 1);
|
||||
// -INT_MIN wraps to INT_MIN (low 32 bits) with upper 32 bits zero.
|
||||
assert_eq!(ctx.gpr[5], 0x0000_0000_8000_0000);
|
||||
assert_eq!(ctx.xer_ov, 1, "32-bit INT_MIN check (preserved) sets OV");
|
||||
// PPCBUG-020 fix: neg is full 64-bit `0 - RA` (canary `Sub(0, RA)`).
|
||||
// RA = 0x0000_0000_8000_0000 → 0xFFFF_FFFF_8000_0000. (OV remains the
|
||||
// preserved 32-bit INT_MIN flag.)
|
||||
assert_eq!(ctx.gpr[5], 0xFFFF_FFFF_8000_0000);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn neg_clean_input_no_upper_bits() {
|
||||
// PPCBUG-006 regression: neg r3=5 must produce 0x00000000_FFFFFFFB,
|
||||
// not 0xFFFFFFFF_FFFFFFFB (the 64-bit !ra-then-add-1 result).
|
||||
// PPCBUG-020 fix: neg r3=5 = `0 - 5` = -5 = 0xFFFFFFFF_FFFFFFFB on a
|
||||
// 64-bit core (canary `Sub(0, RA)`), not the truncated 0x00000000_FFFFFFFB.
|
||||
let mut ctx = PpcContext::new();
|
||||
let mut mem = TestMem::new();
|
||||
ctx.gpr[3] = 5;
|
||||
@@ -5348,7 +5501,7 @@ mod tests {
|
||||
write_instr(&mut mem, 0, raw);
|
||||
ctx.pc = 0;
|
||||
step(&mut ctx, &mut mem);
|
||||
assert_eq!(ctx.gpr[5], 0x0000_0000_FFFF_FFFB);
|
||||
assert_eq!(ctx.gpr[5], 0xFFFF_FFFF_FFFF_FFFB);
|
||||
}
|
||||
|
||||
#[test]
|
||||
@@ -5502,9 +5655,10 @@ mod tests {
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn mullwx_overflow_truncates_to_32() {
|
||||
// PPCBUG-009: mullwo r5, r3, r4 with ra=0x10000, rb=0x10000 → product
|
||||
// 0x100000000 (overflow). Low 32 = 0; OE must fire.
|
||||
fn mullwx_overflow_keeps_full_64bit_product() {
|
||||
// PPCBUG-020 fix: mullwo r5, r3, r4 with ra=0x10000, rb=0x10000 → full
|
||||
// 64-bit product 0x1_0000_0000 (canary stores the full i64 product, not
|
||||
// the truncated low 32). OE still fires (the product overflows int32).
|
||||
let mut ctx = PpcContext::new();
|
||||
let mut mem = TestMem::new();
|
||||
ctx.gpr[3] = 0x10000;
|
||||
@@ -5514,7 +5668,7 @@ mod tests {
|
||||
write_instr(&mut mem, 0, raw);
|
||||
ctx.pc = 0;
|
||||
step(&mut ctx, &mut mem);
|
||||
assert_eq!(ctx.gpr[5], 0, "low 32 bits = 0");
|
||||
assert_eq!(ctx.gpr[5], 0x0000_0001_0000_0000, "full 64-bit product");
|
||||
assert_eq!(ctx.xer_ov, 1, "overflow detected");
|
||||
}
|
||||
|
||||
@@ -5536,9 +5690,74 @@ mod tests {
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn srawx_negative_value_zero_extends_upper() {
|
||||
// PPCBUG-041+043: srawx of negative i32 by 1 produces a negative i32;
|
||||
// writeback must zero-extend to u64 (not sign-extend).
|
||||
fn slwx_shift_count_masks_to_6_bits() {
|
||||
// slw masks the shift count to RB[58:63] (6 bits): a count of 0x40 has
|
||||
// low-6-bits 0, so the value passes through unchanged — it must NOT be
|
||||
// zeroed by a naive full-u32 `>= 32` test. Matches canary InstrEmit_slwx.
|
||||
let mut ctx = PpcContext::new();
|
||||
let mut mem = TestMem::new();
|
||||
ctx.gpr[3] = 0x0000_1234u64;
|
||||
ctx.gpr[4] = 0x40; // count & 0x3F == 0 → shift by 0
|
||||
// slwx r5, r3, r4 (XO=24)
|
||||
let raw = (31u32 << 26) | (3 << 21) | (5 << 16) | (4 << 11) | (24 << 1);
|
||||
write_instr(&mut mem, 0, raw);
|
||||
ctx.pc = 0;
|
||||
step(&mut ctx, &mut mem);
|
||||
assert_eq!(ctx.gpr[5], 0x0000_1234u64, "0x40 masks to 0 → passthrough");
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn slwx_count_32_to_63_zeroes() {
|
||||
// A masked count in [32,63] (bit 5 set) zeroes the result.
|
||||
let mut ctx = PpcContext::new();
|
||||
let mut mem = TestMem::new();
|
||||
ctx.gpr[3] = 0xFFFF_FFFFu64;
|
||||
ctx.gpr[4] = 0x60; // & 0x3F = 0x20 (32) → zero
|
||||
let raw = (31u32 << 26) | (3 << 21) | (5 << 16) | (4 << 11) | (24 << 1);
|
||||
write_instr(&mut mem, 0, raw);
|
||||
ctx.pc = 0;
|
||||
step(&mut ctx, &mut mem);
|
||||
assert_eq!(ctx.gpr[5], 0);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn srwx_shift_count_masks_to_6_bits() {
|
||||
// srw, same 6-bit mask. Count 0x48 → low-6-bits = 8 → logical >> 8.
|
||||
let mut ctx = PpcContext::new();
|
||||
let mut mem = TestMem::new();
|
||||
ctx.gpr[3] = 0x0000_FF00u64;
|
||||
ctx.gpr[4] = 0x48; // & 0x3F = 8
|
||||
// srwx r5, r3, r4 (XO=536)
|
||||
let raw = (31u32 << 26) | (3 << 21) | (5 << 16) | (4 << 11) | (536 << 1);
|
||||
write_instr(&mut mem, 0, raw);
|
||||
ctx.pc = 0;
|
||||
step(&mut ctx, &mut mem);
|
||||
assert_eq!(ctx.gpr[5], 0x0000_00FFu64, "0x48 masks to 8 → >>8");
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn rlwinm_mb_greater_than_me_wraparound_mask() {
|
||||
// rlwinm with MB > ME produces a wraparound mask covering bits
|
||||
// [0..ME] ∪ [MB..31] (a "split" mask). PowerISA MASK(mb,me) wraps when
|
||||
// mb > me. Here rotate by 0, MB=28, ME=3 → mask = 0xF000000F.
|
||||
let mut ctx = PpcContext::new();
|
||||
let mut mem = TestMem::new();
|
||||
ctx.gpr[3] = 0xFFFF_FFFFu64;
|
||||
// rlwinm r5, r3, SH=0, MB=28, ME=3 (opcode 21)
|
||||
let raw = (21u32 << 26) | (3 << 21) | (5 << 16) | (0 << 11) | (28 << 6) | (3 << 1);
|
||||
write_instr(&mut mem, 0, raw);
|
||||
ctx.pc = 0;
|
||||
step(&mut ctx, &mut mem);
|
||||
assert_eq!(ctx.gpr[5], 0x0000_0000_F000_000Fu64,
|
||||
"MB>ME wraparound mask = bits [0..3] | [28..31]");
|
||||
}
|
||||
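The wraparound case tested above follows from how a 32-bit `MASK(mb, me)` can be built from two one-sided masks. The sketch below is one way to write it, using big-endian bit numbering (bit 0 is the most significant). `mask32` is a hypothetical helper, not necessarily how the interpreter computes the mask.

```rust
/// Hypothetical helper: 32-bit rotate mask with big-endian bit numbering.
/// Ones from bit `mb` through bit `me`, wrapping around when `mb > me`.
pub fn mask32(mb: u32, me: u32) -> u32 {
    assert!(mb < 32 && me < 32, "mask bounds are 5-bit fields");
    let from_mb = u32::MAX >> mb; // bits mb..=31
    let up_to_me = u32::MAX << (31 - me); // bits 0..=me
    if mb <= me {
        from_mb & up_to_me // contiguous run
    } else {
        from_mb | up_to_me // split mask: [0..=me] and [mb..=31]
    }
}
```

With `mb = 28` and `me = 3` the two halves are `0x0000_000F` and `0xF000_0000`, giving the `0xF000_000F` mask the test expects.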
|
||||
#[test]
|
||||
fn srawx_negative_value_sign_extends_upper() {
|
||||
// sraw of negative i32 by 1 produces a negative i32 result that PowerISA
|
||||
// SIGN-extends into the full 64-bit RA (canary InstrEmit_srawx uses
|
||||
// `f.SignExtend`). 0x80000000 >> 1 = 0xC0000000 (i32) → 0xFFFFFFFF_C0000000.
|
||||
// (Was 0x00000000_C0000000 under the PPCBUG-041 zero-extend band-aid.)
|
||||
let mut ctx = PpcContext::new();
|
||||
let mut mem = TestMem::new();
|
||||
ctx.gpr[3] = 0x8000_0000u64; // i32::MIN
|
||||
@@ -5548,14 +5767,15 @@ mod tests {
|
||||
write_instr(&mut mem, 0, raw);
|
||||
ctx.pc = 0;
|
||||
step(&mut ctx, &mut mem);
|
||||
assert_eq!(ctx.gpr[5], 0x0000_0000_C000_0000u64);
|
||||
assert_eq!(ctx.gpr[5], 0xFFFF_FFFF_C000_0000u64);
|
||||
assert!(ctx.cr[0].lt);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn srawix_high_count_negative_input_yields_low32_all_ones() {
|
||||
// PPCBUG-042+043: srawi with count=31 on negative input → low 32 bits
|
||||
// all ones (0xFFFFFFFF), upper 32 zero (was u64::MAX before fix).
|
||||
fn srawix_high_count_negative_input_sign_extends_all_ones() {
|
||||
// srawi count=31 on negative input → result is -1 (0xFFFFFFFF as i32),
|
||||
// sign-extended to the full 64-bit RA: 0xFFFFFFFF_FFFFFFFF (canary
|
||||
// InstrEmit_srawix). Was 0x00000000_FFFFFFFF under the zero-extend band-aid.
|
||||
let mut ctx = PpcContext::new();
|
||||
let mut mem = TestMem::new();
|
||||
ctx.gpr[3] = 0x8000_0000u64;
|
||||
@@ -5564,7 +5784,7 @@ mod tests {
|
||||
write_instr(&mut mem, 0, raw);
|
||||
ctx.pc = 0;
|
||||
step(&mut ctx, &mut mem);
|
||||
assert_eq!(ctx.gpr[5], 0x0000_0000_FFFF_FFFFu64);
|
||||
assert_eq!(ctx.gpr[5], 0xFFFF_FFFF_FFFF_FFFFu64);
|
||||
}
|
||||
|
||||
#[test]
|
||||
@@ -5598,17 +5818,18 @@ mod tests {
|
||||
write_instr(&mut mem, 0, raw);
|
||||
ctx.pc = 0;
|
||||
step(&mut ctx, &mut mem);
|
||||
// Result low 32: 0x00000001 + 0xFFFFFFFF = 0x00000000 with carry.
|
||||
assert_eq!(ctx.gpr[4], 0);
|
||||
// PPCBUG-020 fix: full 64-bit `RA + EXTS(-1)` = 0xFFFFFFFF_00000001 +
|
||||
// 0xFFFFFFFF_FFFFFFFF = 0xFFFFFFFF_00000000 (canary). CA still comes
|
||||
// from the 32-bit compare (low 32: 0x00000001 + 0xFFFFFFFF = 0, carry).
|
||||
assert_eq!(ctx.gpr[4], 0xFFFFFFFF_00000000u64);
|
||||
assert_eq!(ctx.xer_ca, 1, "32-bit compare must see CA=1");
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn mulli_overflow_wraps_to_32() {
|
||||
// PPCBUG-004: mulli must truncate to 32 bits even when the upper 32 bits
|
||||
// of RA are polluted (e.g. by upstream bugs). Pre-fix: ra = u64::MAX as
|
||||
// i64 = -1, * 2 = -2, written to GPR as `0xFFFFFFFF_FFFFFFFE`. Post-fix:
|
||||
// truncated to `0xFFFFFFFE`. Discriminating regression test.
|
||||
fn mulli_full_64bit_product() {
|
||||
// PPCBUG-020 fix: mulli uses the full 64-bit RA (canary
|
||||
// `Mul(LoadGPR(RA), Int64(EXTS(imm)))`). RA = u64::MAX = -1, × 2 = -2
|
||||
// = 0xFFFFFFFF_FFFFFFFE (full 64-bit), not the truncated 0xFFFFFFFE.
|
||||
let mut ctx = PpcContext::new();
|
||||
let mut mem = TestMem::new();
|
||||
ctx.gpr[3] = u64::MAX;
|
||||
@@ -5617,13 +5838,14 @@ mod tests {
|
||||
write_instr(&mut mem, 0, raw);
|
||||
ctx.pc = 0;
|
||||
step(&mut ctx, &mut mem);
|
||||
assert_eq!(ctx.gpr[4], 0xFFFF_FFFEu64, "low 32 bits = -2 in i32; upper 32 zero");
|
||||
assert_eq!(ctx.gpr[4], 0xFFFF_FFFF_FFFF_FFFEu64, "full 64-bit -2");
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn subficx_neg_simm_zero_extends() {
|
||||
// PPCBUG-005: subfic r4, r3, -1 with r3=5: imm-ra = 0xFFFFFFFF - 5 = 0xFFFFFFFA.
|
||||
// Buggy form: imm sign-extended to u64 0xFFFFFFFFFFFFFFFF - 5 = poisoned.
|
||||
fn subficx_full_64bit_result() {
|
||||
// PPCBUG-020 fix: subfic r4, r3, -1 with r3=5 = `EXTS(-1) - RA` =
|
||||
// 0xFFFFFFFF_FFFFFFFF - 5 = 0xFFFFFFFF_FFFFFFFA (canary `Sub(Int64(
|
||||
// EXTS(imm)), RA)`). CA stays a 32-bit compare (0xFFFFFFFF >= 5 → 1).
|
||||
let mut ctx = PpcContext::new();
|
||||
let mut mem = TestMem::new();
|
||||
ctx.gpr[3] = 5;
|
||||
@@ -5632,7 +5854,7 @@ mod tests {
|
||||
write_instr(&mut mem, 0, raw);
|
||||
ctx.pc = 0;
|
||||
step(&mut ctx, &mut mem);
|
||||
assert_eq!(ctx.gpr[4], 0x0000_0000_FFFF_FFFAu64);
|
||||
assert_eq!(ctx.gpr[4], 0xFFFF_FFFF_FFFF_FFFAu64);
|
||||
assert_eq!(ctx.xer_ca, 1, "0xFFFFFFFF >= 5 → CA=1");
|
||||
}
|
||||
|
||||
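The carrying-immediate tests above combine a full 64-bit result with a carry taken from the low 32 bits only. A minimal sketch of that split, assuming the semantics stated in the test comments; `add_imm_carrying` and `subfic_sketch` are hypothetical names, not the interpreter's functions.

```rust
/// `RA + EXTS(imm)`: 64-bit sum, carry computed from the low words.
fn add_imm_carrying(ra: u64, simm: i16) -> (u64, bool) {
    let imm64 = simm as i64 as u64;                       // EXTS(imm)
    let result = ra.wrapping_add(imm64);                  // full 64-bit sum
    let ca = (ra as u32).overflowing_add(imm64 as u32).1; // 32-bit carry out
    (result, ca)
}

/// `EXTS(imm) - RA`: 64-bit difference, carry from a 32-bit compare.
fn subfic_sketch(ra: u64, simm: i16) -> (u64, bool) {
    let imm64 = simm as i64 as u64;                       // EXTS(imm)
    let result = imm64.wrapping_sub(ra);                  // full 64-bit difference
    let ca = (imm64 as u32) >= (ra as u32);               // no borrow in the low word
    (result, ca)
}
```

For `ra = 5, simm = -1` the subtract form gives `0xFFFF_FFFF_FFFF_FFFA` with CA set, and for `ra = 0xFFFF_FFFF_0000_0001, simm = -1` the add form gives `0xFFFF_FFFF_0000_0000` with CA set, as the tests expect.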
@@ -6538,12 +6760,13 @@ mod tests {
|
||||
assert_eq!(ctx.pc, 4);
|
||||
}
|
||||
|
||||
// PPCBUG-054: mtspr CTR must truncate the source GPR to 32 bits, matching
|
||||
// canary's `f.Truncate(ctr, INT32_TYPE)`. Prevents upstream 64-bit GPR
|
||||
// pollution from poisoning the 32-bit CTR counter independently of the
|
||||
// bcx zero-test fix.
|
||||
// CTR is a 64-bit SPR. mtspr CTR stores the full GPR (canary
|
||||
// InstrEmit_mtspr: `f.StoreCTR(rt)`, no truncation). The bdnz/bclr zero-TEST
|
||||
// still truncates to 32 bits (separate, canary-faithful — see the bcx tests
|
||||
// above); the earlier PPCBUG-054 store-side truncation was a band-aid that a
|
||||
// later `mfspr rX, CTR` would read back wrong.
|
||||
#[test]
|
||||
fn mtspr_ctr_truncates_to_32_bits() {
|
||||
fn mtspr_ctr_keeps_full_64_bits() {
|
||||
let mut ctx = PpcContext::new();
|
||||
let mut mem = TestMem::new();
|
||||
ctx.gpr[3] = 0xFFFF_FFFF_8000_0001;
|
||||
@@ -6553,7 +6776,26 @@ mod tests {
|
||||
write_instr(&mut mem, 0, raw);
|
||||
ctx.pc = 0;
|
||||
step(&mut ctx, &mut mem);
|
||||
assert_eq!(ctx.ctr, 0x8000_0001);
|
||||
assert_eq!(ctx.ctr, 0xFFFF_FFFF_8000_0001);
|
||||
}
|
||||
|
||||
// mfspr rX, CTR must read back the full 64-bit CTR (round-trips the value
|
||||
// mtspr stored). This is the observable consequence of the mtspr fix.
|
||||
#[test]
|
||||
fn mfspr_ctr_reads_full_64_bits() {
|
||||
let mut ctx = PpcContext::new();
|
||||
let mut mem = TestMem::new();
|
||||
ctx.gpr[3] = 0xFFFF_FFFF_8000_0001;
|
||||
// mtspr CTR, r3 then mfspr r5, CTR
|
||||
let spr_swapped = ((9u32 & 0x1F) << 5) | ((9u32 >> 5) & 0x1F);
|
||||
let mt = (31u32 << 26) | (3 << 21) | (spr_swapped << 11) | (467 << 1);
|
||||
let mf = (31u32 << 26) | (5 << 21) | (spr_swapped << 11) | (339 << 1);
|
||||
write_instr(&mut mem, 0, mt);
|
||||
write_instr(&mut mem, 4, mf);
|
||||
ctx.pc = 0;
|
||||
step(&mut ctx, &mut mem);
|
||||
step(&mut ctx, &mut mem);
|
||||
assert_eq!(ctx.gpr[5], 0xFFFF_FFFF_8000_0001);
|
||||
}
|
||||
|
||||
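The `spr_swapped` expression in the test above reflects how `mtspr`/`mfspr` store the 10-bit SPR number with its two 5-bit halves swapped. A small sketch of that encoding; the helper names are hypothetical, and the extended opcodes 467 and 339 are taken from the test itself.

```rust
/// Swap the two 5-bit halves of a 10-bit SPR number, as the instruction
/// encoding requires.
fn spr_field(spr: u32) -> u32 {
    ((spr & 0x1F) << 5) | ((spr >> 5) & 0x1F)
}

/// Encode `mtspr spr, rs` (primary opcode 31, extended opcode 467).
fn encode_mtspr(spr: u32, rs: u32) -> u32 {
    (31u32 << 26) | (rs << 21) | (spr_field(spr) << 11) | (467 << 1)
}

/// Encode `mfspr rt, spr` (primary opcode 31, extended opcode 339).
fn encode_mfspr(rt: u32, spr: u32) -> u32 {
    (31u32 << 26) | (rt << 21) | (spr_field(spr) << 11) | (339 << 1)
}
```

For CTR (SPR 9) these produce `0x7C6903A6` for `mtspr CTR, r3` and `0x7CA902A6` for `mfspr r5, CTR`, the same words the test assembles by hand.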
// ───────────────────────────────────────────────────────────────────────
|
||||
@@ -7640,8 +7882,8 @@ mod tests {
|
||||
ctx.xer_ca = 0;
|
||||
step(&mut ctx, &mem);
|
||||
assert_eq!(ctx.xer_ca, 0, "ra=0, ca=0 should produce CA=0");
|
||||
// PPCBUG-018: 32-bit ABI. !0u32 + 0 = u32::MAX, with upper 32 bits zero.
|
||||
assert_eq!(ctx.gpr[3], 0xFFFF_FFFFu64, "result = !0u32 + 0 = u32::MAX");
|
||||
// PPCBUG-020 fix: full 64-bit `!RA + CA` = !0u64 + 0 = u64::MAX.
|
||||
assert_eq!(ctx.gpr[3], 0xFFFF_FFFF_FFFF_FFFFu64, "result = !0u64 + 0");
|
||||
}
|
||||
// Case 3: ra=1, ca=0 → CA=0 (old buggy code reported CA=1)
|
||||
{
|
||||
@@ -7653,8 +7895,8 @@ mod tests {
|
||||
ctx.xer_ca = 0;
|
||||
step(&mut ctx, &mem);
|
||||
assert_eq!(ctx.xer_ca, 0, "ra=1, ca=0 should produce CA=0");
|
||||
// PPCBUG-018: 32-bit ABI. !1u32 + 0 = u32::MAX - 1, with upper 32 bits zero.
|
||||
assert_eq!(ctx.gpr[3], 0xFFFF_FFFEu64, "result = !1u32 + 0 = u32::MAX - 1");
|
||||
// PPCBUG-020 fix: full 64-bit `!1u64 + 0` = u64::MAX - 1.
|
||||
assert_eq!(ctx.gpr[3], 0xFFFF_FFFF_FFFF_FFFEu64, "result = !1u64 + 0");
|
||||
}
|
||||
// Case 4: ra=u32::MAX, ca=1 → CA=0; result = !RA + 1 = 0xFFFF_FFFF_0000_0001 (full 64-bit).
|
||||
{
|
||||
@@ -7666,7 +7908,9 @@ mod tests {
|
||||
ctx.xer_ca = 1;
|
||||
step(&mut ctx, &mem);
|
||||
assert_eq!(ctx.xer_ca, 0, "ra=u32::MAX, ca=1 should produce CA=0");
|
||||
assert_eq!(ctx.gpr[3], 1, "result = !u32::MAX + 1 = 1");
|
||||
// PPCBUG-020 fix: full 64-bit `!RA + CA`. RA = 0x0000_0000_FFFF_FFFF
|
||||
// → !RA = 0xFFFF_FFFF_0000_0000, + 1 = 0xFFFF_FFFF_0000_0001.
|
||||
assert_eq!(ctx.gpr[3], 0xFFFF_FFFF_0000_0001u64, "result = !RA + 1");
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
@@ -35,6 +35,20 @@ pub const INITIAL_GUEST_TID: u32 = 1;
|
||||
/// Axis 1 carries the field on every thread but doesn't decrement yet.
|
||||
pub const QUANTUM_DEFAULT: u32 = 50_000;
|
||||
|
||||
/// Anti-starvation floor. On a cooperative single-host slot, strict-priority
|
||||
/// `pick_runnable` lets a high-priority CPU-bound spinner (e.g. a pri-15
|
||||
/// time-critical poll loop pinned by affinity) win every round forever,
|
||||
/// permanently starving a co-located lower-priority peer that the spinner is
|
||||
/// actually *waiting on* — a deadlock that never occurs on real hardware,
|
||||
/// where SMT contexts run those threads concurrently.
|
||||
///
|
||||
/// Once a Ready thread has been passed over this many consecutive slot
|
||||
/// visits, `pick_runnable` grants it ONE pick (then its counter resets). The
|
||||
/// limit is large enough that the genuinely-higher-priority thread still wins
|
||||
/// the overwhelming majority of visits (here: 4096 of every 4097); the boost only
|
||||
/// guarantees *bounded* forward progress, it does not invert priority.
|
||||
pub const STARVE_LIMIT: u32 = 4096;
|
||||
|
||||
/// Above this depth, `spawn` prunes `Exited` entries from a slot's runqueue
|
||||
/// before pushing the new thread. Keeps peer `ThreadRef`s stable on the
|
||||
/// common (low-depth) path — a game that spawns a handful of long-lived
|
||||
@@ -117,6 +131,20 @@ pub struct GuestThread {
|
||||
/// Axis 3 instruction budget. Decremented per retired step on this
|
||||
/// thread; on zero, slot rotates within same-priority tier.
|
||||
pub quantum_remaining: u32,
|
||||
/// Anti-starvation counter. Incremented each slot visit this thread is
|
||||
/// Ready but NOT picked; reset to 0 when picked. When it reaches
|
||||
/// `STARVE_LIMIT`, `pick_runnable` grants this thread one boosted pick so
|
||||
/// a monopolizing higher-priority peer on the same slot cannot starve it
|
||||
/// indefinitely. Deterministic: a pure function of pick history.
|
||||
pub steps_starved: u32,
|
||||
/// SpawnParams.entry — the BL target the trampoline jumped to.
|
||||
/// Persisted so kernel exports can filter syscalls by spawning
|
||||
/// chain (e.g. the silph UI auto-signal POC). 0 for the initial
|
||||
/// thread (uses `install_initial_thread`, not `spawn`).
|
||||
pub start_entry: u32,
|
||||
/// SpawnParams.start_context — initial r3 at spawn. Persisted for
|
||||
/// the same filtering reason as `start_entry`.
|
||||
pub start_context: u32,
|
||||
}
|
||||
|
||||
impl GuestThread {
|
||||
@@ -136,6 +164,9 @@ impl GuestThread {
|
||||
affinity_mask: 0xFF,
|
||||
ideal_processor: None,
|
||||
quantum_remaining: QUANTUM_DEFAULT,
|
||||
steps_starved: 0,
|
||||
start_entry: 0,
|
||||
start_context: 0,
|
||||
}
|
||||
}
|
||||
}
|
||||
@@ -208,15 +239,35 @@ impl Default for HwSlot {
|
||||
impl HwSlot {
|
||||
/// Index of the highest-priority Ready/ServicingIrq thread in this
|
||||
/// slot's runqueue. Tiebreak: prefer lower index (deterministic).
|
||||
///
|
||||
/// Selection is by *effective* priority: a Ready thread that has been
|
||||
/// passed over for `STARVE_LIMIT` consecutive visits is boosted so it
|
||||
/// wins exactly one pick, then [`Scheduler::begin_slot_visit`] resets its
|
||||
/// counter. This restores the guest-visible invariant that every Ready
|
||||
/// thread makes forward progress, without inverting the intended priority
|
||||
/// order (a starved thread only beats its monopolizer once per
|
||||
/// `STARVE_LIMIT` visits). The boost is a pure function of the per-thread
|
||||
/// counters/priority/index, so picks stay deterministic.
|
||||
pub fn pick_runnable(&self) -> Option<usize> {
|
||||
self.runqueue
|
||||
.iter()
|
||||
.enumerate()
|
||||
.filter(|(_, t)| matches!(t.state, HwState::Ready | HwState::ServicingIrq(_)))
|
||||
.max_by_key(|(i, t)| (t.priority, -(*i as i64)))
|
||||
.max_by_key(|(i, t)| (Self::effective_priority(t), -(*i as i64)))
|
||||
.map(|(i, _)| i)
|
||||
}
|
||||
|
||||
/// Priority used for selection. A thread starved for `STARVE_LIMIT`
|
||||
/// visits is lifted to `i32::MAX` so it wins the next pick regardless of
|
||||
/// peer priority; otherwise its nominal priority is used unchanged.
|
||||
fn effective_priority(t: &GuestThread) -> i32 {
|
||||
if t.steps_starved >= STARVE_LIMIT {
|
||||
i32::MAX
|
||||
} else {
|
||||
t.priority
|
||||
}
|
||||
}
|
||||
|
||||
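The interaction between `pick_runnable`, `effective_priority` and the counter update in `begin_slot_visit` can be modelled in isolation. The following is a toy sketch under stated assumptions: `ToyThread`, `visit` and `count_picks` are hypothetical stand-ins for the scheduler's types, and readiness is reduced to a single `bool`.

```rust
const STARVE_LIMIT: u32 = 4096;

struct ToyThread {
    tid: u32,
    priority: i32,
    ready: bool,
    starved: u32, // visits this thread was Ready but passed over
}

/// Nominal priority, lifted to the maximum once the thread is starved.
fn effective_priority(t: &ToyThread) -> i32 {
    if t.starved >= STARVE_LIMIT { i32::MAX } else { t.priority }
}

/// Highest effective priority wins; ties go to the lower index.
fn pick(queue: &[ToyThread]) -> Option<usize> {
    queue
        .iter()
        .enumerate()
        .filter(|(_, t)| t.ready)
        .max_by_key(|(i, t)| (effective_priority(t), -(*i as i64)))
        .map(|(i, _)| i)
}

/// One slot visit: pick, reset the winner, age every passed-over peer.
fn visit(queue: &mut [ToyThread]) -> Option<u32> {
    let picked = pick(queue);
    for (i, t) in queue.iter_mut().enumerate() {
        if Some(i) == picked {
            t.starved = 0;
        } else if t.ready {
            t.starved = t.starved.saturating_add(1);
        }
    }
    picked.map(|i| queue[i].tid)
}

/// Run `visits` slot visits with a pri-15 spinner and a pri-0 worker and
/// return (spinner picks, worker picks).
fn count_picks(visits: u32) -> (u32, u32) {
    let mut q = vec![
        ToyThread { tid: 1, priority: 15, ready: true, starved: 0 },
        ToyThread { tid: 2, priority: 0, ready: true, starved: 0 },
    ];
    let (mut spinner, mut worker) = (0, 0);
    for _ in 0..visits {
        match visit(&mut q) {
            Some(1) => spinner += 1,
            Some(2) => worker += 1,
            _ => {}
        }
    }
    (spinner, worker)
}
```

In this model the worker is passed over `STARVE_LIMIT` times, wins the following visit, and the spinner takes the slot back on the visit after that, so the boost bounds starvation without reordering priorities in the steady state.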
/// How many non-Exited threads currently live on this slot (used by
|
||||
/// placement policies).
|
||||
pub fn live_depth(&self) -> usize {
|
||||
@@ -341,6 +392,28 @@ pub struct Scheduler {
|
||||
/// Sorted by deadline ascending. Scheduler wakes the first entry via
|
||||
/// `advance_to_next_wake` when a round finds nothing runnable.
|
||||
timed_waits: Vec<(u64, ThreadRef)>,
|
||||
/// Coherent monotonic "now" clock — the single authoritative basis the
|
||||
/// kernel deadline-arithmetic (`KernelState::now_basis_at`) reads in
|
||||
/// BOTH execution modes. Per-thread `ctx(hw_id).timebase` is NOT a
|
||||
/// coherent "now":
|
||||
/// * In `--parallel`, workers extract their `PpcContext` (leaving a
|
||||
/// zeroed timebase in the slot) and step unlocked.
|
||||
/// * In **lockstep**, a parked/poll thread has `running_idx == None`,
|
||||
/// so `ctx()` returns `idle_ctx` (timebase 0); a `parse_timeout`
|
||||
/// reading that basis registers `deadline = 0 + relative`, a value
|
||||
/// permanently in the past, and `coord_idle_advance` re-arms that
|
||||
/// same constant deadline forever (timebase-desync livelock — the
|
||||
/// render-gate root: the submitter's 16ms re-wait never fires).
|
||||
/// So a coordinator/parked thread reading per-thread timebase can see a
|
||||
/// stale/zero basis decoupled from the deadline it just advanced to.
|
||||
/// This field is that coherent basis instead. It is DETERMINISTIC: a
|
||||
/// pure function of retired guest instructions (never wall-clock).
|
||||
/// Advanced by `advance_global_clock` (per-block retired count on each
|
||||
/// parallel writeback), `advance_global_clock_to` (floored up to the
|
||||
/// deterministic per-round `stats.instruction_count` in lockstep), and
|
||||
/// floored up by `advance_all_timebases_to`. Two cold lockstep runs
|
||||
/// read identical values, so the lockstep trace stays bit-reproducible.
|
||||
global_clock: u64,
|
||||
/// Global count of TLS slots allocated — `spawn` pre-sizes new threads'
|
||||
/// `tls_values` to this.
|
||||
tls_slot_count: usize,
|
||||
@@ -379,6 +452,7 @@ impl Scheduler {
|
||||
order,
|
||||
rng_state,
|
||||
timed_waits: Vec::new(),
|
||||
global_clock: 0,
|
||||
tls_slot_count: 0,
|
||||
non_empty_runnable: 0,
|
||||
rotation_cursor: 0,
|
||||
@@ -500,6 +574,17 @@ impl Scheduler {
|
||||
self.current.expect("no current thread")
|
||||
}
|
||||
|
||||
/// `(start_entry, start_context)` of the currently-running thread.
|
||||
/// Returns None if there is no current thread or its ref is stale.
|
||||
/// Used by `KernelState::maybe_register_silph_autosignal` to filter
|
||||
/// `NtCreateEvent` calls by spawning chain.
|
||||
pub fn current_thread_entry_and_ctx(&self) -> Option<(u32, u32)> {
|
||||
let r = self.current?;
|
||||
let slot = self.slots.get(r.hw_id as usize)?;
|
||||
let t = slot.runqueue.get(r.idx as usize)?;
|
||||
Some((t.start_entry, t.start_context))
|
||||
}
|
||||
|
||||
// ----- Guest-thread lookup -----
|
||||
|
||||
/// Find the `ThreadRef` of the (non-Exited) thread with `tid`.
|
||||
@@ -614,6 +699,8 @@ impl Scheduler {
|
||||
t.priority = params.priority;
|
||||
t.affinity_mask = mask;
|
||||
t.ideal_processor = params.ideal_processor;
|
||||
t.start_entry = params.entry;
|
||||
t.start_context = params.start_context;
|
||||
// M3.7 — populate the inter-thread reservation handle + slot id
|
||||
// so the interpreter can route lwarx/stwcx through the table.
|
||||
t.ctx.hw_id = slot_id;
|
||||
@@ -744,10 +831,22 @@ impl Scheduler {
|
||||
/// stashes `self.current` so exports can reach it.
|
||||
pub fn begin_slot_visit(&mut self, hw_id: u8) {
|
||||
let slot = &mut self.slots[hw_id as usize];
|
||||
slot.running_idx = slot.pick_runnable();
|
||||
self.current = slot
|
||||
.running_idx
|
||||
.map(|idx| ThreadRef::new(hw_id, idx as u16));
|
||||
let picked = slot.pick_runnable();
|
||||
slot.running_idx = picked;
|
||||
// Anti-starvation bookkeeping: reset the picked thread's counter,
|
||||
// increment every other Ready peer that was passed over this visit.
|
||||
// Once a passed-over thread reaches STARVE_LIMIT it wins the next
|
||||
// pick_runnable (effective_priority -> i32::MAX), then lands here as
|
||||
// `picked` and resets — bounding any thread's starvation. Pure
|
||||
// function of pick history, so it stays deterministic.
|
||||
for (i, t) in slot.runqueue.iter_mut().enumerate() {
|
||||
if Some(i) == picked {
|
||||
t.steps_starved = 0;
|
||||
} else if matches!(t.state, HwState::Ready | HwState::ServicingIrq(_)) {
|
||||
t.steps_starved = t.steps_starved.saturating_add(1);
|
||||
}
|
||||
}
|
||||
self.current = picked.map(|idx| ThreadRef::new(hw_id, idx as u16));
|
||||
}
|
||||
|
||||
/// Clear `current` at the end of each per-slot visit.
|
||||
@@ -803,6 +902,41 @@ impl Scheduler {
|
||||
false
|
||||
}
|
||||
|
||||
/// Cooperative yield: the currently-running thread executed a `db16cyc`
|
||||
/// spin-wait hint (see `StepResult::Yield`). It is busy-spinning on a
|
||||
/// guest spinlock/barrier whose release depends on a *co-located* peer
|
||||
/// that cannot make progress while this thread keeps winning the slot.
|
||||
///
|
||||
/// Promote every Ready peer on this slot past `STARVE_LIMIT` so the next
|
||||
/// `begin_slot_visit` picks one of them (their `effective_priority` →
|
||||
/// `i32::MAX`), and reset the yielder's own counter. Each promoted peer
|
||||
/// runs once and resets to 0 in `begin_slot_visit`; once all peers have
|
||||
/// had their turn the spinner is picked again, spins, and re-yields —
|
||||
/// producing a fair round-robin between the spinner and the threads it is
|
||||
/// waiting on. This mirrors real hardware, where all six HW threads run
|
||||
/// concurrently and the spin resolves as soon as the peer releases.
|
||||
///
|
||||
/// Pure function of the slot's current state (no RNG, no wall-clock), so
|
||||
/// it preserves lockstep determinism. No-op if there is no Ready peer
|
||||
/// (the spinner is alone on its slot — nothing to hand off to).
|
||||
///
|
||||
/// Returns `true` if at least one peer was promoted.
|
||||
pub fn yield_current(&mut self) -> bool {
|
||||
let Some(r) = self.current else { return false; };
|
||||
let slot = &mut self.slots[r.hw_id as usize];
|
||||
let me = r.idx as usize;
|
||||
let mut promoted = false;
|
||||
for (i, t) in slot.runqueue.iter_mut().enumerate() {
|
||||
if i == me {
|
||||
t.steps_starved = 0;
|
||||
} else if matches!(t.state, HwState::Ready | HwState::ServicingIrq(_)) {
|
||||
t.steps_starved = STARVE_LIMIT;
|
||||
promoted = true;
|
||||
}
|
||||
}
|
||||
promoted
|
||||
}
|
||||
|
||||
// ----- Park / wake / exit -----
|
||||
|
||||
pub fn park_current(&mut self, reason: BlockReason) {
|
||||
@@ -1091,6 +1225,42 @@ impl Scheduler {
|
||||
}
|
||||
}
|
||||
}
|
||||
// Keep the coherent clock at least as far forward as any deadline we
|
||||
// fast-forward to (idle/timer/wake advances). `global_clock` is the
|
||||
// "now" basis in both execution modes (see its field doc); the `max`
|
||||
// floor-up keeps it monotone and leaves it deterministic.
|
||||
self.global_clock = self.global_clock.max(deadline);
|
||||
}
|
||||
|
||||
/// Coherent "now" (see [`Self::global_clock`] field doc).
|
||||
/// Read by the kernel deadline-arithmetic
|
||||
/// (`KernelState::now_basis_at`) in both execution modes; per-thread
|
||||
/// `ctx(hw_id).timebase` is not a coherent basis in either.
|
||||
#[inline]
|
||||
pub fn global_clock(&self) -> u64 {
|
||||
self.global_clock
|
||||
}
|
||||
|
||||
/// Advance the parallel-mode coherent clock by `n` retired instructions.
|
||||
/// Called from the parallel worker writeback with the block's executed
|
||||
/// count so "now" tracks aggregate guest progress.
|
||||
#[inline]
|
||||
pub fn advance_global_clock(&mut self, n: u64) {
|
||||
self.global_clock = self.global_clock.saturating_add(n);
|
||||
}
|
||||
|
||||
/// Floor the coherent clock up to `now` (monotonic; never goes
|
||||
/// backwards). Used by the **lockstep** outer loop once per round to
|
||||
/// track the deterministic retired-instruction count
|
||||
/// (`stats.instruction_count`) as the single coherent "now". A plain
|
||||
/// floor-up rather than `saturating_add` because the lockstep caller
|
||||
/// passes an absolute monotonic counter (not a per-block delta), and
|
||||
/// because `advance_all_timebases_to` may already have pushed
|
||||
/// `global_clock` past the instruction count when fast-forwarding to a
|
||||
/// future deadline — clamping with `max` keeps both sources monotone.
|
||||
#[inline]
|
||||
pub fn advance_global_clock_to(&mut self, now: u64) {
|
||||
self.global_clock = self.global_clock.max(now);
|
||||
}
|
||||
|
||||
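The difference between the delta form (`advance_global_clock`) and the floor-up form (`advance_global_clock_to`) is easiest to see on a stand-alone counter. A minimal sketch, assuming a hypothetical `ToyClock` type and made-up tick values:

```rust
#[derive(Default)]
struct ToyClock {
    now: u64,
}

impl ToyClock {
    /// Delta form: add a per-block retired-instruction count.
    fn advance_by(&mut self, n: u64) {
        self.now = self.now.saturating_add(n);
    }

    /// Floor-up form: accept an absolute value, never move backwards.
    fn advance_to(&mut self, t: u64) {
        self.now = self.now.max(t);
    }
}

/// Replay a short sequence and record the clock after each step.
fn replay() -> Vec<u64> {
    let mut c = ToyClock::default();
    let mut seen = Vec::new();
    c.advance_to(1_000); // round counter reaches 1000
    seen.push(c.now);
    c.advance_to(50_000); // fast-forward to a future deadline
    seen.push(c.now);
    c.advance_to(1_500); // counter is still behind the deadline: no-op
    seen.push(c.now);
    c.advance_by(200); // a per-block delta still moves the clock forward
    seen.push(c.now);
    seen
}
```

The third step is the case the doc comment describes: an absolute counter that lags a previously reached deadline must not pull the clock back, which is why the floor-up uses `max` rather than assignment.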
/// Fast-forward the timebase to the earliest pending timed wait and
|
||||
@@ -1161,6 +1331,28 @@ impl Scheduler {
|
||||
})
|
||||
}
|
||||
|
||||
/// True if any thread is currently `Blocked` on a `WaitAny`/`WaitAll`
|
||||
/// whose handle set contains `handle`. Used by the handle-slab recycler
|
||||
/// (AUDIT-059 R34) to avoid an ABA hazard: if a closed handle's slot is
|
||||
/// returned to the free list while a thread is still parked on it, a
|
||||
/// later `alloc_handle` could hand the same slot to a NEW object, and a
|
||||
/// signal on that new object would wake the stale waiter that was
|
||||
/// waiting on the OLD (closed) object. Canary sidesteps this by keeping
|
||||
/// the object alive via an object_ref while waiters hold references; we
|
||||
/// instead simply decline to recycle a still-waited slot (leaking it,
|
||||
/// matching the pre-R34 bump-only behaviour for that rare case).
|
||||
pub fn any_thread_waiting_on(&self, handle: u32) -> bool {
|
||||
self.slots.iter().any(|slot| {
|
||||
slot.runqueue.iter().any(|t| match &t.state {
|
||||
HwState::Blocked(BlockReason::WaitAny { handles, .. })
|
||||
| HwState::Blocked(BlockReason::WaitAll { handles, .. }) => {
|
||||
handles.contains(&handle)
|
||||
}
|
||||
_ => false,
|
||||
})
|
||||
})
|
||||
}
|
||||
|
||||
/// Snapshot thread states for diagnostic logging. One entry per live
|
||||
/// guest thread (Exited are included so post-mortem can see exit codes).
|
||||
pub fn diagnostic_snapshot(&self) -> Vec<(ThreadRef, Option<u32>, HwState)> {
|
||||
@@ -1858,6 +2050,118 @@ mod tests {
|
||||
assert_eq!(t.quantum_remaining, QUANTUM_DEFAULT, "quantum reloaded");
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn test_anti_starvation_bounded_progress() {
|
||||
// Reproduces the Sylpheed render-gate deadlock: a high-priority
|
||||
// CPU-bound spinner (the pri-15 poll loop) co-located on one slot
|
||||
// with a pri-0 worker (the submitter) the spinner is waiting on.
|
||||
// Strict priority would starve the worker forever; the anti-starve
|
||||
// floor must hand it a pick within STARVE_LIMIT+1 visits, then the
|
||||
// spinner reclaims the slot (priority is NOT inverted).
|
||||
let mut s = mk_empty_scheduler();
|
||||
let mut spinner = SpawnParams::default();
|
||||
spinner.guest_tid = 1;
|
||||
spinner.thread_handle = 0x1000;
|
||||
spinner.affinity_mask = 0b0001;
|
||||
spinner.pcr_base = 0x4000_0000;
|
||||
spinner.priority = 15;
|
||||
s.spawn(spinner, &mut NullPcr).unwrap();
|
||||
let mut worker = SpawnParams::default();
|
||||
worker.guest_tid = 2;
|
||||
worker.thread_handle = 0x1004;
|
||||
worker.affinity_mask = 0b0001;
|
||||
worker.pcr_base = 0x4000_1000;
|
||||
worker.priority = 0;
|
||||
s.spawn(worker, &mut NullPcr).unwrap();
|
||||
|
||||
let mut worker_picks = 0u32;
|
||||
let mut spinner_picks = 0u32;
|
||||
// Both stay Ready (the spinner never blocks — that's the bug shape).
|
||||
for _ in 0..(STARVE_LIMIT + 2) {
|
||||
s.begin_slot_visit(0);
|
||||
match s.thread(s.current.unwrap()).tid {
|
||||
1 => spinner_picks += 1,
|
||||
2 => worker_picks += 1,
|
||||
other => panic!("unexpected tid {other}"),
|
||||
}
|
||||
s.end_slot_visit();
|
||||
}
|
||||
assert_eq!(
|
||||
worker_picks, 1,
|
||||
"starved worker gets exactly one bounded pick within STARVE_LIMIT+2 visits"
|
||||
);
|
||||
assert_eq!(
|
||||
spinner_picks,
|
||||
STARVE_LIMIT + 1,
|
||||
"high-priority spinner still dominates — priority is not inverted"
|
||||
);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn test_db16cyc_yield_hands_slot_to_peer() {
|
||||
// Reproduces the Sylpheed title-screen gate: a guest spinlock/barrier
|
||||
// participant (tid=1) executes the `db16cyc` spin hint each round and
|
||||
// would otherwise win `pick_runnable` forever (equal priority, lower
|
||||
// index), starving the co-located peer (tid=2) it is waiting on.
|
||||
// `yield_current` must promote the Ready peer so the very next
|
||||
// `begin_slot_visit` picks it — without waiting STARVE_LIMIT rounds.
|
||||
let mut s = mk_empty_scheduler();
|
||||
for tid in [1u32, 2] {
|
||||
let mut p = SpawnParams::default();
|
||||
p.guest_tid = tid;
|
||||
p.thread_handle = 0x1000 + tid * 4;
|
||||
p.affinity_mask = 0b0001;
|
||||
p.pcr_base = 0x4000_0000 + tid * 0x1000;
|
||||
p.priority = 0; // equal priority — index would otherwise decide
|
||||
s.spawn(p, &mut NullPcr).unwrap();
|
||||
}
|
||||
|
||||
// Round 1: the spinner (lower index) wins.
|
||||
s.begin_slot_visit(0);
|
||||
let spinner = s.thread(s.current.unwrap()).tid;
|
||||
assert_eq!(spinner, 1, "lower-index equal-priority thread wins first pick");
|
||||
// It spins (db16cyc) → cooperative yield.
|
||||
assert!(s.yield_current(), "yield promotes the Ready peer");
|
||||
s.end_slot_visit();
|
||||
|
||||
// Round 2: the promoted peer must now be picked, not the spinner.
|
||||
s.begin_slot_visit(0);
|
||||
let after_yield = s.thread(s.current.unwrap()).tid;
|
||||
assert_eq!(
|
||||
after_yield, 2,
|
||||
"after db16cyc yield the co-located peer runs (no STARVE_LIMIT wait)"
|
||||
);
|
||||
s.end_slot_visit();
|
||||
|
||||
// Round 3: peer's boost was consumed (reset to 0 when picked), so the
|
||||
// spinner reclaims the slot — fair alternation, no priority inversion.
|
||||
s.begin_slot_visit(0);
|
||||
assert_eq!(
|
||||
s.thread(s.current.unwrap()).tid,
|
||||
1,
|
||||
"spinner reclaims the slot after the peer has had its turn"
|
||||
);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn test_yield_current_noop_when_alone() {
|
||||
// A spinner with no Ready peer on its slot has nothing to hand off to;
|
||||
// yield_current must be a no-op (returns false) and not panic.
|
||||
let mut s = mk_empty_scheduler();
|
||||
let mut p = SpawnParams::default();
|
||||
p.guest_tid = 1;
|
||||
p.thread_handle = 0x1004;
|
||||
p.affinity_mask = 0b0001;
|
||||
p.pcr_base = 0x4000_0000;
|
||||
s.spawn(p, &mut NullPcr).unwrap();
|
||||
s.begin_slot_visit(0);
|
||||
assert!(!s.yield_current(), "no peer to promote → no-op");
|
||||
// Still the same thread next round.
|
||||
s.end_slot_visit();
|
||||
s.begin_slot_visit(0);
|
||||
assert_eq!(s.thread(s.current.unwrap()).tid, 1);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn test_cooperative_yield_does_not_need_quantum() {
|
||||
let mut s = mk_empty_scheduler();
|
||||
|
||||
@@ -293,28 +293,23 @@ pub fn store_vector_right(mem: &dyn MemoryAccess, ea: u32, v: Vec128) {
|
||||
}
|
||||
}
|
||||
|
||||
// ─── 5-6-5 pixel pack (vpkpx / vupkhpx / vupklpx) ─────────────────────────
|
||||
// PPC vpkpx takes a 32-bit RGB lane and packs it into a 16-bit 1-5-5-5 pixel.
|
||||
// vupkhpx / vupklpx reverse the operation.
|
||||
//
|
||||
// Format: input 32-bit word holds
|
||||
// bits 0-6: unused (0)
|
||||
// bit 7: alpha-select (→ bit 15 of output)
|
||||
// bits 8-15: R (top 5 bits kept)
|
||||
// bits 16-23: G (top 5 bits kept)
|
||||
// bits 24-31: B (top 5 bits kept)
|
||||
// Output 16-bit word:
|
||||
// bit 15: A (from input bit 7)
|
||||
// bits 10-14: R
|
||||
// bits 5-9: G
|
||||
// bits 0-4: B
|
||||
// ─── pixel pack (vpkpx / vupkhpx / vupklpx) ───────────────────────────────
|
||||
// PPC vpkpx packs each 32-bit lane into a 16-bit 1-5-5-5 pixel.
|
||||
// Mapping transcribed EXACTLY from xenia-canary
|
||||
// `ppc_emit_altivec.cc::vkpkx_in_low` (lines 1795-1808):
|
||||
// tmp1 = (input >> 9) & 0xFC00 // out bits 15:10 = in bits 24:19
|
||||
// tmp2 = (input >> 6) & 0x3E0 // out bits 9:5 = in bits 15:11
|
||||
// tmp3 = (input >> 3) & 0x1F // out bits 4:0 = in bits 7:3
|
||||
// result = tmp1 | tmp2 | tmp3
|
||||
// This is a pure shift/mask: there is NO standalone alpha select. Output
|
||||
// bit 15 is simply input bit 24 (the top of the 6-bit field masked by
|
||||
// 0xFC00) — NOT input bit 7. The red field is 6 bits wide here.
|
||||
|
||||
#[inline] pub fn pack_pixel_555(input: u32) -> u16 {
|
||||
let a = (input >> 7) & 0x1;
|
||||
let r = (input >> 8) & 0xFF;
|
||||
let g = (input >> 16) & 0xFF;
|
||||
let b = (input >> 24) & 0xFF;
|
||||
((a << 15) | ((r & 0xF8) << 7) | ((g & 0xF8) << 2) | ((b & 0xF8) >> 3)) as u16
|
||||
let tmp1 = (input >> 9) & 0xFC00;
|
||||
let tmp2 = (input >> 6) & 0x3E0;
|
||||
let tmp3 = (input >> 3) & 0x1F;
|
||||
(tmp1 | tmp2 | tmp3) as u16
|
||||
}
|
||||
|
||||
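Which input bit feeds each output bit follows mechanically from the three shift/mask pairs, and can be checked by probing one input bit at a time. A small sketch; `source_bits` is a hypothetical helper, and `pack_pixel_555` is repeated here only so the block is self-contained.

```rust
fn pack_pixel_555(input: u32) -> u16 {
    let tmp1 = (input >> 9) & 0xFC00;
    let tmp2 = (input >> 6) & 0x3E0;
    let tmp3 = (input >> 3) & 0x1F;
    (tmp1 | tmp2 | tmp3) as u16
}

/// For one output bit, list every input bit that can set it.
fn source_bits(out_bit: u32) -> Vec<u32> {
    (0..32)
        .filter(|&b| pack_pixel_555(1u32 << b) & (1u16 << out_bit) != 0)
        .collect()
}
```

Probing the field edges gives 15←24, 10←19, 9←15, 5←11, 4←7 and 0←3; input bits 8 through 10 reach no output bit at all.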
#[inline] pub fn unpack_pixel_555(input: u16) -> u32 {
|
||||
@@ -801,9 +796,38 @@ mod tests {
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn pack_unpack_pixel_555() {
|
||||
let encoded = pack_pixel_555(0x80_F8_F8_F8);
|
||||
assert_eq!(encoded & 0x8000, 0x8000);
|
||||
fn pack_pixel_555_matches_canary() {
|
||||
// Mapping (canary ppc_emit_altivec.cc::vkpkx_in_low):
|
||||
// out[15:10] = in[24:19], out[9:5] = in[15:11], out[4:0] = in[7:3]
|
||||
// Pure shift/mask, NO standalone alpha bit.
|
||||
|
||||
// All three colour fields exercised. Expected (hand-computed):
|
||||
// (0x018844C0 >> 9)&0xFC00 = 0xC400
|
||||
// (0x018844C0 >> 6)&0x3E0 = 0x100
|
||||
// (0x018844C0 >> 3)&0x1F = 0x18
|
||||
// => 0xC518
|
||||
assert_eq!(pack_pixel_555(0x01_88_44_C0), 0xC518);
|
||||
|
||||
// Boundary the audit flagged: low byte 0xF8 has bit 7 set. Canary does
|
||||
// NOT turn that into output bit 15 (alpha). Output bit 15 = in bit 24,
|
||||
// which is 0 here => high bit clear. (Old impl wrongly produced 0x8000.)
|
||||
assert_eq!(pack_pixel_555(0x80_F8_F8_F8), 0x7FFF);
|
||||
assert_eq!(pack_pixel_555(0x80_F8_F8_F8) & 0x8000, 0);
|
||||
|
||||
// Lone source bit 7 (0x80) lands in the blue field, not in bit 15.
|
||||
assert_eq!(pack_pixel_555(0x00_00_00_80), 0x0010);
|
||||
|
||||
// Output bit 15 is sourced from input bit 24, not bit 7.
|
||||
assert_eq!(pack_pixel_555(0x01_00_00_00), 0x8000);
|
||||
|
||||
// Saturated input -> all field bits set.
|
||||
assert_eq!(pack_pixel_555(0xFF_FF_FF_FF), 0xFFFF);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn unpack_pixel_555_roundtrip() {
|
||||
// vupkhpx/vupklpx are NOTIMPLEMENTED in canary, so unpack_pixel_555 is
|
||||
// unchanged; just sanity-check the alpha-replicate path still holds.
|
||||
let w = unpack_pixel_555(0x8000 | (0x1F << 10) | (0x1F << 5) | 0x1F);
|
||||
assert_eq!(w & 0xFF000000, 0xFF000000);
|
||||
}
|
||||
|
||||
@@ -486,12 +486,20 @@ fn ke_query_performance_frequency(ctx: &mut PpcContext, _mem: &GuestMemory, _sta
|
||||
ctx.gpr[3] = 50_000_000; // 50 MHz
|
||||
}
|
||||
|
||||
fn ke_query_system_time(ctx: &mut PpcContext, mem: &GuestMemory, _state: &mut KernelState) {
|
||||
fn ke_query_system_time(ctx: &mut PpcContext, mem: &GuestMemory, state: &mut KernelState) {
|
||||
let time_ptr = ctx.gpr[3] as u32;
|
||||
if time_ptr != 0 {
|
||||
let fake_time: u64 = 132_500_000_000_000_000; // ~2021 FILETIME
|
||||
mem.write_u32(time_ptr, (fake_time >> 32) as u32);
|
||||
mem.write_u32(time_ptr + 4, fake_time as u32);
|
||||
// ITERATE-2J — advance with the same deterministic clock the
|
||||
// KeTimeStampBundle uses (1 global_clock unit ≈ 100 ns) so a guest
|
||||
// that polls KeQuerySystemTime for elapsed time also sees forward
|
||||
// progress instead of a frozen constant. FILETIME base (~2021) +
|
||||
// 100-ns-unit clock.
|
||||
const FILETIME_BASE: u64 = 132_500_000_000_000_000;
|
||||
let hw_id = state.scheduler.current_hw_id().unwrap_or(0);
|
||||
let now = state.now_basis_at(hw_id);
|
||||
let system_time = FILETIME_BASE.wrapping_add(now);
|
||||
mem.write_u32(time_ptr, (system_time >> 32) as u32);
|
||||
mem.write_u32(time_ptr + 4, system_time as u32);
|
||||
}
|
||||
}
|
||||
|
||||
@@ -696,9 +704,36 @@ fn mm_create_kernel_stack(ctx: &mut PpcContext, mem: &GuestMemory, state: &mut K
|
||||
}
|
||||
}
|
||||
|
||||
/// Region-aware guest-virtual → physical translation, matching canary's
|
||||
/// `Memory::GetPhysicalAddress` + `PhysicalHeap::GetPhysicalAddress`
|
||||
/// (`xenia-canary/src/xenia/memory.cc:528-545` and `:2317-2326`).
|
||||
///
|
||||
/// Canary `PhysicalHeap::GetPhysicalAddress`:
|
||||
/// ```c
|
||||
/// address -= heap_base_;
|
||||
/// if (heap_base_ >= 0xE0000000) { address += 0x1000; }
|
||||
/// return address;
|
||||
/// ```
|
||||
/// The three physical heap bases (0xA0000000 / 0xC0000000 / 0xE0000000) all
|
||||
/// alias the same 512 MB physical window, so `address - heap_base ==
|
||||
/// address & 0x1FFFFFFF` for each. The only region-specific delta is the
|
||||
/// `+0x1000` host-address-offset for the 0xE0000000+ 4 KB mirror — see
|
||||
/// `memory.h:368-372` (`host_address_offset` for `heap_base >= 0xE0000000`).
|
||||
/// For non-physical / sub-0x1FFFFFFF virtual addresses canary returns the
|
||||
/// address unchanged, which equals `address & 0x1FFFFFFF` there too.
|
||||
pub(crate) fn translate_physical_address(virt: u32) -> u32 {
|
||||
let phys = virt & 0x1FFF_FFFF;
|
||||
if virt >= 0xE000_0000 {
|
||||
phys + 0x1000
|
||||
} else {
|
||||
phys
|
||||
}
|
||||
}
|
||||
|
||||
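A worked example of the region-aware translation, assuming the same offset in each of the three physical heap windows. `translate_physical_address` is copied from the hunk above so the block stands alone; `translate_aliases` is a hypothetical helper added for illustration.

```rust
fn translate_physical_address(virt: u32) -> u32 {
    let phys = virt & 0x1FFF_FFFF;
    if virt >= 0xE000_0000 {
        phys + 0x1000
    } else {
        phys
    }
}

/// Translate the same offset through the 0xA0000000, 0xC0000000 and
/// 0xE0000000 windows, in that order.
fn translate_aliases(offset: u32) -> [u32; 3] {
    [
        translate_physical_address(0xA000_0000 + offset),
        translate_physical_address(0xC000_0000 + offset),
        translate_physical_address(0xE000_0000 + offset),
    ]
}
```

For an offset of `0x2000` the first two windows return `0x2000` and the third returns `0x3000`, which is the `+0x1000` delta the old flat mask dropped.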
fn mm_get_physical_address(ctx: &mut PpcContext, _mem: &GuestMemory, _state: &mut KernelState) {
|
||||
// r3 = virtual address -> return physical address
|
||||
ctx.gpr[3] &= 0x1FFF_FFFF; // Mask to 512MB physical
|
||||
// r3 = virtual address -> return physical address.
|
||||
// Region-aware, mirroring canary (see `translate_physical_address`).
|
||||
ctx.gpr[3] = translate_physical_address(ctx.gpr[3] as u32) as u64;
|
||||
}
|
||||
|
||||
fn mm_query_address_protect(ctx: &mut PpcContext, _mem: &GuestMemory, _state: &mut KernelState) {
|
||||
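The doc comment on `translate_physical_address` argues that subtracting the heap base is the same as masking with `0x1FFF_FFFF` for all three physical heaps. The sketch below restates both forms side by side so the equivalence can be checked; `heap_relative` is a hypothetical re-expression of the canary snippet quoted above, not canary's actual code, and `masked` simply mirrors the function in this hunk:

```rust
/// The three physical heap bases named in the doc comment.
const PHYSICAL_HEAP_BASES: [u32; 3] = [0xA000_0000, 0xC000_0000, 0xE000_0000];

/// Heap-relative form: subtract the owning heap's base, then add the 4 KB
/// offset for the 0xE0000000 heap. Returns `None` below the first heap.
fn heap_relative(addr: u32) -> Option<u32> {
    // Pick the highest heap base that is still <= addr.
    let base = PHYSICAL_HEAP_BASES.iter().rev().copied().find(|&b| addr >= b)?;
    let mut phys = addr - base;
    if base >= 0xE000_0000 {
        phys += 0x1000;
    }
    Some(phys)
}

/// Mask form, as implemented by `translate_physical_address` in this diff.
fn masked(addr: u32) -> u32 {
    let phys = addr & 0x1FFF_FFFF;
    if addr >= 0xE000_0000 { phys + 0x1000 } else { phys }
}
```

The two agree wherever `heap_relative` is defined because each heap base only sets the top three address bits, which are exactly the bits the mask clears.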
@@ -980,6 +1015,43 @@ fn open_vfs_file(
|
||||
// see a null handle later and trigger `XamShowDirtyDiscErrorUI`.
|
||||
let path = crate::path::object_attributes_to_vfs_path(mem, obj_attrs_ptr)
|
||||
.unwrap_or_default();
|
||||
// AUDIT-2.BF — synthetic silph::WorkerCtx spawn. AUDIT-058/059
|
||||
// identified that ours never activates the 6-level static caller
|
||||
// ladder that ends in `sub_825070F0`, so the four worker threads
|
||||
// it would normally spawn (entries 0x82506528/58/88/B8) never run.
|
||||
// Canary's chain originally fires right after `DiscImageDevice::
|
||||
// ResolvePath("\\dat\\movie")` (audit-058); ours never opens
|
||||
// `dat/movie` because tid=13 wedges before reaching it. We
|
||||
// therefore trigger on the first `dat/*` open — the earliest
|
||||
// such open in ours is `dat/files.tbl` (immediately preceding
|
||||
// tid=12/13 spawn at audit-059 round 1).
|
||||
//
|
||||
// **Round 18 finding** (this commit): when the workers are
|
||||
// spawned runnable, they fault almost immediately (`PC=0` at
|
||||
// cycle ~5.5M on the hw thread carrying worker_3), preempting
|
||||
// ours' boot before the normal guest threads even spawn. The
|
||||
// ctx layout from audit-059 round 5 is incomplete — at least
|
||||
// one of `[+0x28]`/`[+0x2C]`/`[+0x30]` (the three foreign-
|
||||
// arena pointers) must be populated for the worker bodies to
|
||||
// run. Synthesising those is a fresh investigation (round 19+).
|
||||
//
|
||||
// Until then the synth path is **opt-in**: set
|
||||
// `XENIA_SILPH_SYNTH=1` to enable the runnable spawn (will
|
||||
// crash boot), or `XENIA_SILPH_SYNTH=suspend` to spawn but keep
|
||||
// them in `Blocked(Suspended)` (lets boot complete with the
|
||||
// ctx materialised in memory for downstream probes). Default:
|
||||
// disabled — preserves the existing boot trajectory.
|
||||
if !state.silph_synth_done && path.starts_with("dat/") {
|
||||
match std::env::var("XENIA_SILPH_SYNTH").as_deref() {
|
||||
Ok("1") | Ok("run") | Ok("runnable") => {
|
||||
let _ = crate::silph_synth::spawn_silph_workers(state, mem, false);
|
||||
}
|
||||
Ok("suspend") | Ok("suspended") => {
|
||||
let _ = crate::silph_synth::spawn_silph_workers(state, mem, true);
|
||||
}
|
||||
_ => {}
|
||||
}
|
||||
}
|
||||
if path.is_empty() && obj_attrs_ptr == 0 {
|
||||
if handle_out != 0 {
|
||||
mem.write_u32(handle_out, 0);
|
||||
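The opt-in gate described in the AUDIT-2.BF comment maps a handful of `XENIA_SILPH_SYNTH` values onto two spawn modes and treats everything else as disabled. A small sketch of that mapping as a pure function, which keeps the parsing testable without touching the process environment; `SynthMode`, `parse_synth_mode` and `synth_mode_from_env` are hypothetical names introduced here for illustration:

```rust
/// Hypothetical enum naming the three behaviours described in the comment.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SynthMode {
    /// Default: no synthetic spawn, boot trajectory unchanged.
    Disabled,
    /// Workers spawned runnable (the comment notes this currently crashes boot).
    Runnable,
    /// Workers spawned but kept suspended.
    Suspended,
}

/// Map the raw variable value onto a mode, using the same accepted spellings
/// as the `match` in `open_vfs_file`.
fn parse_synth_mode(value: Option<&str>) -> SynthMode {
    match value {
        Some("1") | Some("run") | Some("runnable") => SynthMode::Runnable,
        Some("suspend") | Some("suspended") => SynthMode::Suspended,
        _ => SynthMode::Disabled,
    }
}

/// Read the mode from the environment; unset or non-UTF-8 values disable it.
fn synth_mode_from_env() -> SynthMode {
    parse_synth_mode(std::env::var("XENIA_SILPH_SYNTH").ok().as_deref())
}
```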
@@ -1443,20 +1515,35 @@ fn nt_query_information_file(ctx: &mut PpcContext, mem: &GuestMemory, state: &mu
|
||||
*size
|
||||
};
|
||||
|
||||
// Root-of-device opens (`game:\`, `cache:\`, `partition0`) strip to
|
||||
// an empty string post-prefix — see `open_vfs_file`'s synth path.
|
||||
// Games query these as directories (DirectoryObject probe), and
|
||||
// reporting `Directory=0` makes Sylpheed treat the open as "found a
|
||||
// non-directory where I expected a directory" and call
|
||||
// `XamShowDirtyDiscErrorUI`. Canary's `NtQueryInformationFile` pulls
|
||||
// the real file-system entry's kind; we key on path shape since we
|
||||
// don't model directory entries.
|
||||
let is_directory = path.is_empty()
|
||||
|| path.ends_with('/')
|
||||
|| path.ends_with(':');
|
||||
// Snapshot what we need from the handle, then drop the borrow so we can
|
||||
// re-resolve the path against the VFS for its real attribute byte.
|
||||
let path = path.clone();
|
||||
let size = live_size;
|
||||
let position = *position;
|
||||
|
||||
// Pull the REAL GDFX attribute byte (canary `disc_image_device.cc:154`)
|
||||
// for disc-backed handles by re-resolving the stored path. Root-of-device
|
||||
// opens (`game:\`, `cache:\`, `partition0`) strip to an empty string and
|
||||
// synth-stub opens have no VFS entry — for those we fall back to the
|
||||
// path-shape heuristic. Games query these as directories (DirectoryObject
|
||||
// probe), and reporting `Directory=0` makes Sylpheed treat the open as
|
||||
// "found a non-directory where I expected a directory" and call
|
||||
// `XamShowDirtyDiscErrorUI`.
|
||||
let vfs_attributes: Option<u32> = if path.is_empty() {
|
||||
None
|
||||
} else {
|
||||
state
|
||||
.vfs
|
||||
.as_ref()
|
||||
.and_then(|vfs| vfs.stat(&path).ok())
|
||||
.map(|e| e.attributes)
|
||||
.filter(|&a| a != 0)
|
||||
};
|
||||
let is_directory = match vfs_attributes {
|
||||
Some(a) => (a & 0x10) != 0,
|
||||
None => path.is_empty() || path.ends_with('/') || path.ends_with(':'),
|
||||
};
|
||||
|
||||
// `FILE_ATTRIBUTE_DIRECTORY` (NT / Xbox) — advertised in
|
||||
// `FileNetworkOpenInformation.FileAttributes`; Sylpheed's async-I/O
|
||||
// worker queries with class=34 and the calling code checks this bit
|
||||
@@ -1495,10 +1582,13 @@ fn nt_query_information_file(ctx: &mut PpcContext, mem: &GuestMemory, state: &mu
|
||||
}
|
||||
mem.write_u64(file_info + 32, size);
|
||||
mem.write_u64(file_info + 40, size);
|
||||
let attrs = if is_directory {
|
||||
FILE_ATTRIBUTE_DIRECTORY
|
||||
} else {
|
||||
FILE_ATTRIBUTE_NORMAL
|
||||
// Prefer the real GDFX attribute byte; fall back to the
|
||||
// DIRECTORY/NORMAL split for root-of-device and synth-stub
|
||||
// handles that have no VFS entry.
|
||||
let attrs = match vfs_attributes {
|
||||
Some(a) => a,
|
||||
None if is_directory => FILE_ATTRIBUTE_DIRECTORY,
|
||||
None => FILE_ATTRIBUTE_NORMAL,
|
||||
};
|
||||
mem.write_u32(file_info + 48, attrs);
|
||||
mem.write_u32(file_info + 52, 0); // pad
|
||||
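The attribute selection in this hunk has two layers: prefer the attribute value forwarded by the VFS, and only fall back to the path-shape heuristic when no non-zero value is available. A compact sketch of that decision, assuming the two constants carry the values used elsewhere in this diff (`0x10` and `0x80`); `looks_like_directory` and `effective_attributes` are hypothetical helper names:

```rust
const FILE_ATTRIBUTE_DIRECTORY: u32 = 0x10;
const FILE_ATTRIBUTE_NORMAL: u32 = 0x80;

/// Path-shape heuristic used for root-of-device and synth-stub handles.
fn looks_like_directory(path: &str) -> bool {
    path.is_empty() || path.ends_with('/') || path.ends_with(':')
}

/// Prefer a real, non-zero attribute value; otherwise fall back to the
/// DIRECTORY/NORMAL split derived from the path shape.
fn effective_attributes(vfs_attributes: Option<u32>, path: &str) -> u32 {
    match vfs_attributes.filter(|&a| a != 0) {
        Some(a) => a,
        None if looks_like_directory(path) => FILE_ATTRIBUTE_DIRECTORY,
        None => FILE_ATTRIBUTE_NORMAL,
    }
}
```

Note that the real attribute value is passed through verbatim, so extra bits such as READONLY survive, while the fallback can only ever produce one of the two constants.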
@@ -1701,7 +1791,18 @@ fn nt_query_full_attributes_file(ctx: &mut PpcContext, mem: &GuestMemory, state:
|
||||
mem.write_u32(out + 28, filetime as u32);
|
||||
mem.write_u64(out + 32, entry.size);
|
||||
mem.write_u64(out + 40, entry.size);
|
||||
let attrs: u32 = if entry.is_directory { 0x10 } else { 0x80 };
|
||||
// Use the REAL GDFX attribute byte forwarded by the VFS
|
||||
// (canary `disc_image_device.cc:154`) instead of a
|
||||
// path-shape guess. Disc rips never carry a 0-attribute
|
||||
// entry, but guard anyway so a synthesised/legacy entry
|
||||
// still advertises a sane DIRECTORY/NORMAL split.
|
||||
let attrs: u32 = if entry.attributes != 0 {
|
||||
entry.attributes
|
||||
} else if entry.is_directory {
|
||||
0x10
|
||||
} else {
|
||||
0x80
|
||||
};
|
||||
mem.write_u32(out + 48, attrs);
|
||||
mem.write_u32(out + 52, 0);
|
||||
}
|
||||
@@ -1822,6 +1923,7 @@ fn nt_query_directory_file(ctx: &mut PpcContext, mem: &GuestMemory, state: &mut
|
||||
is_directory: e.is_directory,
|
||||
size: e.size,
|
||||
offset: e.offset,
|
||||
attributes: e.attributes,
|
||||
})
|
||||
})
|
||||
.collect(),
|
||||
@@ -1872,7 +1974,12 @@ fn nt_query_directory_file(ctx: &mut PpcContext, mem: &GuestMemory, state: &mut
|
||||
mem.write_u64(base + 0x20, 0);
|
||||
mem.write_u64(base + 0x28, entry.size);
|
||||
mem.write_u64(base + 0x30, entry.size);
|
||||
let attrs = if entry.is_directory {
|
||||
// Real GDFX attribute byte (canary `disc_image_device.cc:154`);
|
||||
// fall back to the directory/normal split only for legacy entries
|
||||
// that carry no attribute bits.
|
||||
let attrs = if entry.attributes != 0 {
|
||||
entry.attributes
|
||||
} else if entry.is_directory {
|
||||
FILE_ATTRIBUTE_DIRECTORY
|
||||
} else {
|
||||
FILE_ATTRIBUTE_NORMAL
|
||||
@@ -1940,14 +2047,29 @@ fn nt_close(ctx: &mut PpcContext, _mem: &GuestMemory, state: &mut KernelState) {
|
||||
// so a later scheduler round doesn't try to signal a dead handle.
|
||||
// `disarm_timer` is a no-op for non-timer handles.
|
||||
state.disarm_timer(handle);
|
||||
// AUDIT-059 R34: return the slot to the recycle FIFO so a later
|
||||
// `alloc_handle` mints the same ID (matching canary's slab).
|
||||
state.release_handle_slot(handle);
|
||||
}
|
||||
ctx.gpr[3] = 0;
|
||||
}
|
||||
|
||||
fn nt_create_event(ctx: &mut PpcContext, mem: &GuestMemory, state: &mut KernelState) {
|
||||
// r3 = handle_ptr, r4 = obj_attrs, r5 = event_type, r6 = initial_state
|
||||
// r3 = handle_ptr, r4 = obj_attrs, r5 = event_type, r6 = initial_state.
|
||||
//
|
||||
// Xenon DISPATCHER_HEADER `Type` (NT convention):
|
||||
// 0 = NotificationEvent (manual-reset)
|
||||
// 1 = SynchronizationEvent (auto-reset)
|
||||
// Canary: `xboxkrnl_threading.cc:668` `ev->Initialize(!event_type, !!initial_state)`
|
||||
// with `XEvent::Initialize(bool manual_reset, ...)` (xevent.cc:25) and
|
||||
// `InitializeNative` (xevent.cc:41 `case 0x00: manual_reset_ = true`).
|
||||
// So `manual_reset = (event_type == 0)`. The Ke-path
|
||||
// (`ensure_dispatcher_object`) was already correct; the Nt-path here was
|
||||
// inverted, mis-classifying Sylpheed's per-frame VSync gate (type=1 auto +
|
||||
// initial=1) as manual-reset+signaled → it stayed signaled forever and
|
||||
// tid=1's main loop spun ~2800x canary's 60Hz.
|
||||
let handle_ptr = ctx.gpr[3] as u32;
|
||||
let manual_reset = ctx.gpr[5] != 0;
|
||||
let manual_reset = ctx.gpr[5] == 0;
|
||||
let signaled = ctx.gpr[6] != 0;
|
||||
let handle = state.alloc_handle_for(KernelObject::Event {
|
||||
manual_reset,
|
||||
@@ -1961,6 +2083,9 @@ fn nt_create_event(ctx: &mut PpcContext, mem: &GuestMemory, state: &mut KernelSt
|
||||
mem,
|
||||
"NtCreateEvent",
|
||||
);
|
||||
// ITERATE-2C Phase D — audit-049 auto-signal POC. Env-gated; no-op
|
||||
// when `XENIA_SILPH_UI_AUTOSIGNAL_DELAY` is unset.
|
||||
state.maybe_register_silph_autosignal(handle, ctx, mem);
|
||||
if handle_ptr != 0 {
|
||||
mem.write_u32(handle_ptr, handle);
|
||||
}
|
||||
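The inversion fixed in `nt_create_event` matters because an auto-reset event consumes its signal on a successful wait, while a manual-reset event stays signaled until it is explicitly reset. The toy model below is only a sketch of that difference under the type mapping stated in the comment (`0` = manual-reset, `1` = auto-reset); the `Event` struct and `try_wait` are hypothetical and much simpler than the kernel object in this crate:

```rust
/// Minimal stand-in for an event object (illustrative only).
struct Event {
    manual_reset: bool,
    signaled: bool,
}

impl Event {
    /// `event_type == 0` means NotificationEvent (manual-reset); any other
    /// value is treated as SynchronizationEvent (auto-reset).
    fn new(event_type: u32, initial_state: u32) -> Self {
        Event {
            manual_reset: event_type == 0,
            signaled: initial_state != 0,
        }
    }

    /// Returns true if a wait would be satisfied immediately. An auto-reset
    /// event clears its signal as a side effect; a manual-reset one does not.
    fn try_wait(&mut self) -> bool {
        if !self.signaled {
            return false;
        }
        if !self.manual_reset {
            self.signaled = false;
        }
        true
    }
}
```

With the old inverted mapping, a `type=1, initial=1` gate behaves like the manual-reset case: every wait succeeds immediately, which matches the runaway main loop described in the comment.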
@@ -2048,7 +2173,7 @@ fn nt_set_timer_ex(ctx: &mut PpcContext, mem: &GuestMemory, state: &mut KernelSt
|
||||
// timebase separately (immutable borrow) before any mutation of the
|
||||
// object to keep the borrow-checker happy.
|
||||
let hw_id = state.scheduler.current_hw_id().unwrap_or(0);
|
||||
let now = state.scheduler.ctx(hw_id).timebase;
|
||||
let now = state.now_basis_at(hw_id);
|
||||
|
||||
// Read signed i64 due_time (big-endian hi/lo — same pattern as
|
||||
// parse_timeout). Negative = relative-from-now, positive = absolute
|
||||
@@ -3472,7 +3597,7 @@ pub(crate) fn parse_timeout(state: &KernelState, timeout_ptr: u32, mem: &GuestMe
|
||||
return Some(Some(0)); // poll
|
||||
}
|
||||
let hw_id = state.scheduler.current_hw_id().unwrap_or(0);
|
||||
let now = state.scheduler.ctx(hw_id).timebase;
|
||||
let now = state.now_basis_at(hw_id);
|
||||
// Negative = relative, positive = absolute wall-clock. Our timebase is a
|
||||
// plain instruction counter, so we treat all timeouts as "time-units
|
||||
// after now" regardless of sign, using the magnitude.
|
||||
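The comment above says every timeout is treated as "time-units after now" using only its magnitude, regardless of sign. A one-function sketch of that rule, assuming `now` and the result share the same unit; `deadline_after` is a hypothetical name, and the zero-timeout poll case handled earlier in `parse_timeout` is deliberately left out:

```rust
/// Compute a deadline from the current time basis and a signed NT-style
/// due time. Sign is ignored: only the magnitude is added to `now`.
fn deadline_after(now: u64, due_time: i64) -> u64 {
    // `unsigned_abs` avoids the overflow that `abs()` hits on `i64::MIN`;
    // `saturating_add` keeps very large timeouts from wrapping around.
    now.saturating_add(due_time.unsigned_abs())
}
```

Relative (`-50`) and absolute-looking (`50`) inputs therefore land on the same deadline, which is the simplification the comment describes.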
@@ -4780,12 +4905,14 @@ mod tests {
|
||||
is_directory: false,
|
||||
size: 0x1000,
|
||||
offset: 0,
|
||||
attributes: 0x81, // NORMAL | READONLY
|
||||
},
|
||||
xenia_vfs::VfsEntry {
|
||||
name: "dat".into(),
|
||||
is_directory: true,
|
||||
size: 0,
|
||||
offset: 0,
|
||||
attributes: 0x11, // DIRECTORY | READONLY
|
||||
},
|
||||
// A grandchild — must NOT appear in root enumeration.
|
||||
xenia_vfs::VfsEntry {
|
||||
@@ -4793,6 +4920,7 @@ mod tests {
|
||||
is_directory: false,
|
||||
size: 0x2000,
|
||||
offset: 0,
|
||||
attributes: 0x81,
|
||||
},
|
||||
],
|
||||
}));
|
||||
@@ -4819,9 +4947,11 @@ mod tests {
|
||||
// NextEntryOffset.
|
||||
let mut cursor: u32 = 0;
|
||||
let mut names: Vec<String> = Vec::new();
|
||||
let mut attrs: Vec<u32> = Vec::new();
|
||||
loop {
|
||||
let entry_base = buf + cursor;
|
||||
let name_len = mem.read_u32(entry_base + 0x3C) as usize;
|
||||
attrs.push(mem.read_u32(entry_base + 0x38));
|
||||
let mut bytes = Vec::with_capacity(name_len);
|
||||
for i in 0..name_len as u32 {
|
||||
bytes.push(mem.read_u8(entry_base + 0x40 + i));
|
||||
@@ -4834,6 +4964,12 @@ mod tests {
|
||||
cursor += next;
|
||||
}
|
||||
assert_eq!(names, vec!["default.xex", "dat"]);
|
||||
// The real GDFX attribute byte must be forwarded verbatim: the file
|
||||
// reports NORMAL|READONLY (no DIRECTORY bit), the directory reports
|
||||
// DIRECTORY|READONLY.
|
||||
assert_eq!(attrs, vec![0x81, 0x11]);
|
||||
assert_eq!(attrs[0] & 0x10, 0, "file must not advertise DIRECTORY");
|
||||
assert_ne!(attrs[1] & 0x10, 0, "dir must advertise DIRECTORY");
|
||||
// A second call on the same handle must return NO_MORE_FILES —
|
||||
// the cursor has advanced past the end.
|
||||
ctx.gpr[3] = handle as u64;
|
||||
@@ -6353,4 +6489,23 @@ mod tests {
|
||||
assert!(resolved.ends_with("etc/foo"));
|
||||
std::fs::remove_dir_all(&dir).ok();
|
||||
}
|
||||
|
||||
/// `MmGetPhysicalAddress` must be region-aware, matching canary's
|
||||
/// `PhysicalHeap::GetPhysicalAddress`: the 0xE0000000+ 4 KB mirror gets a
|
||||
/// `+0x1000` host-address-offset; every other region is a flat
|
||||
/// `& 0x1FFFFFFF` mask.
|
||||
#[test]
|
||||
fn mm_get_physical_address_region_aware() {
|
||||
// 0xE0000000 mirror: canary `address - heap_base (==addr & 0x1FFFFFFF)`
|
||||
// then `+ 0x1000`.
|
||||
assert_eq!(translate_physical_address(0xE000_0000), 0x0000_1000);
|
||||
assert_eq!(translate_physical_address(0xE000_5000), 0x0000_6000);
|
||||
assert_eq!(translate_physical_address(0xFFFF_F000), 0x1FFF_F000 + 0x1000);
|
||||
// 0xA0000000 / 0xC0000000 physical heaps: flat mask, no offset.
|
||||
assert_eq!(translate_physical_address(0xA000_0000), 0x0000_0000);
|
||||
assert_eq!(translate_physical_address(0xC012_3000), 0x0012_3000);
|
||||
// Already-physical (< 0x20000000) is unchanged; other virtual addresses are masked into the 512 MB window.
|
||||
assert_eq!(translate_physical_address(0x0012_3000), 0x0012_3000);
|
||||
assert_eq!(translate_physical_address(0x4012_3000), 0x0012_3000);
|
||||
}
|
||||
}
|
||||
|
||||
@@ -8,13 +8,18 @@
|
||||
//! guest-issued command stream; source code 1 (`INTERRUPT_SOURCE_CP`).
|
||||
//!
|
||||
//! Canary's [xboxkrnl_video.cc:303-310](xenia-canary/src/xenia/kernel/xboxkrnl/xboxkrnl_video.cc#L303-L310)
|
||||
//! dispatches the callback on HW thread 0. We follow the same convention.
|
||||
//! dispatches the callback on HW thread 0. We follow the same convention
|
||||
//! for picking a *context donor*, but as of iterate-2.BE the dispatch
|
||||
//! itself is **synchronous and host-driven**: the main loop runs the ISR
|
||||
//! inline on the borrowed guest context, mirroring canary's
|
||||
//! `EmulateCPInterruptDPC → Processor::Execute` path
|
||||
//! ([kernel_state.cc:1370](../../../../xenia-canary/src/xenia/kernel/kernel_state.cc#L1370),
|
||||
//! [processor.cc:413](../../../../xenia-canary/src/xenia/cpu/processor.cc#L413)).
|
||||
//! This holds regardless of whether the donor guest thread was Ready or Blocked.
|
||||
//!
|
||||
//! The delivery model is cooperative: we inject the callback entry into HW
|
||||
//! thread 0 at the top of a scheduler round when it's safe (not mid-export,
|
||||
//! not already inside another interrupt). When the callback returns to
|
||||
//! [`LR_HALT_SENTINEL`] the main loop restores the saved [`PpcContext`]
|
||||
//! fields and the HW thread picks up where it left off.
|
||||
//! The audio callback path (audit-048) still uses asynchronous LR-sentinel
|
||||
//! injection on a dedicated per-client worker thread; the
|
||||
//! [`SavedCallbackCtx`] machinery below remains in use there.
|
||||
|
||||
use std::collections::VecDeque;
|
||||
use std::time::{Duration, Instant};
|
||||
|
||||
@@ -3,6 +3,7 @@ pub mod exports;
|
||||
pub mod interrupts;
|
||||
pub mod objects;
|
||||
pub mod path;
|
||||
pub mod silph_synth;
|
||||
pub mod state;
|
||||
pub mod thread;
|
||||
pub mod ui_bridge;
|
||||
|
||||
280
crates/xenia-kernel/src/silph_synth.rs
Normal file
@@ -0,0 +1,280 @@
|
||||
//! AUDIT-2.BF — synthetic spawn of the silph::WorkerCtx worker quartet.
|
||||
//!
|
||||
//! AUDIT-058/059 traced a 6-level static-caller ladder
|
||||
//! (`sub_824F7800 ← sub_824F7CD0 ← sub_824F8398 ← sub_821B55D8 ← sub_821B6DF4`,
|
||||
//! topped by virtual-dispatch from `sub_82172BA0+0x1E8`) that activates
|
||||
//! `sub_825070F0` in canary at ~1× / 30 s, kicking off four worker threads
|
||||
//! initialised against a single ~0x440-byte ctx. In ours none of those PCs
|
||||
//! fire (audit-059 round 9 confirmed sub_821B6DF4 = 0×, real chain entry =
|
||||
//! virtual-dispatch from sub_82172BA0+0x1E8 hits wrong-vtable slot).
|
||||
//!
|
||||
//! Rather than chase the wrong-vtable break, this module reproduces the end
|
||||
//! state directly: at the first observation of a load-bearing VFS path
|
||||
//! (the first `dat/*` open, see `open_vfs_file`), we synthesise the ctx structure in guest
|
||||
//! memory per audit-059 round 5's live hexdump and spawn the four worker entry points the
|
||||
//! same way AUDIT-048's audio host-pump spawns its dedicated client worker.
|
||||
//!
|
||||
//! The ctx is opaque to the workers — only fields they dereference matter.
|
||||
//! Per round 5 dump (`audit-runs/audit-059-handle-disambiguation/round5-ctx-
|
||||
//! dump/canary.log`):
|
||||
//!
|
||||
//! +0x00 vtable = 0x8200A1E8 (XEX .rdata, valid in both engines)
|
||||
//! +0x04 self = ctx
|
||||
//! +0x08 intrusive head= ctx
|
||||
//! +0x0C init flag = 1
|
||||
//! +0x10 packed byte = 0x01000000
|
||||
//! +0x18 float ~1.0 = 0x3F7FCCCC
|
||||
//! +0x1C float ~1.0 = 0x3F802D83
|
||||
//! +0x24 flag = 1
|
||||
//! +0x28..+0x30 = three foreign pointers, NULL initially
|
||||
//! +0x54..+0x84 = 4× X_KEVENT auto-reset, state=0
|
||||
//! +0x94..+0xC4 = 4× X_KEVENT manual-reset, state=1
|
||||
//! +0x210..+0x250 = 4-entry intrusive work-ring, empty
|
||||
//!
|
||||
//! Worker entries (each takes r3 = ctx_ptr):
|
||||
//! 0x82506528, 0x82506558, 0x82506588, 0x825065B8
|
||||
|
||||
use xenia_cpu::scheduler::{BlockReason, SpawnParams};
|
||||
use xenia_cpu::ThreadRef;
|
||||
use xenia_memory::{GuestMemory, MemoryAccess};
|
||||
|
||||
use crate::objects::KernelObject;
|
||||
use crate::state::{GuestMemoryPcr, KernelState};
|
||||
use crate::thread::allocate_thread_image;
|
||||
|
||||
/// XEX `.rdata` vtable for the silph::WorkerCtx singleton (audit-059 round 5).
|
||||
const SILPH_CTX_VTABLE: u32 = 0x8200_A1E8;
|
||||
|
||||
/// 4-element fixed entry table — guest text PCs for the four worker bodies.
|
||||
const SILPH_WORKER_ENTRIES: [u32; 4] = [
|
||||
0x8250_6528,
|
||||
0x8250_6558,
|
||||
0x8250_6588,
|
||||
0x8250_65B8,
|
||||
];
|
||||
|
||||
/// Round 0x440 up to a page-ish so the ctx alloc never straddles a page
|
||||
/// boundary in heap_alloc's bookkeeping. Round 20 grew the alloc from 0x500
|
||||
/// to 0x800 to make room for a synthesised sub-object at +0x300 and its
|
||||
/// 32-slot vtable at +0x500 (= ctx + 0x500..0x580). Round 21 retains the
|
||||
/// embedded sub-object but drops the synthesized vtable (we now point at
|
||||
/// canary's real XEX-resident sub-vtable directly), so the 0x500..0x580
|
||||
/// region is unused but harmless.
|
||||
const SILPH_CTX_SIZE: u32 = 0x800;
|
||||
|
||||
/// Offset within the ctx allocation of the synthetic sub-object referenced
|
||||
/// at `[ctx+0x2C]`. Canary's sub-object sits ~0x300 bytes above the ctx and
|
||||
/// varies per-instance; we keep it embedded in the same alloc so a single
|
||||
/// `heap_alloc` covers everything.
|
||||
const SILPH_SUBOBJ_OFFSET: u32 = 0x300;
|
||||
|
||||
/// XEX `.rdata` VA of canary's real sub-object vtable (audit-059 round 21).
|
||||
/// Discovered by:
|
||||
/// 1. Probing canary at `pc=0x82506B08` (= `sub_82506B08`, method 35 of
|
||||
/// the WorkerCtx vtable, the first sub-object method called by every
|
||||
/// `sub_82506528/58/88/B8` worker entry).
|
||||
/// 2. Capturing `[ctx+0x2C]` from the JIT-prolog dump (= sub-object VA
|
||||
/// in canary's heap).
|
||||
/// 3. Re-running with `--audit_jit_prolog_mem_dump=<sub-obj VA>` to read
|
||||
/// `[sub-object + 0]` = sub-vtable VA = **`0x8200A168`**.
|
||||
/// PE inspection confirms slot 15 (called via `[r11+0x3C]` at
|
||||
/// `sub_82506B08+0x44`) = `sub_824FCCC8` and slot 17 (`[r11+0x44]` at
|
||||
/// `sub_82506B08+0x70`) = `sub_824FCE38`. Both are real game methods in
|
||||
/// the same `.text` region as the rest of the worker dispatch surface.
|
||||
const SILPH_SUB_VTABLE_SOURCE_VA: u32 = 0x8200_A168;
|
||||
|
||||
/// Round-19 XEX-resident wrapper constant observed at `[ctx+0x30]` in every
|
||||
/// canary ctx (audit-059 round 7). Same value for all four ctxes — opaque
|
||||
/// pointer / handle the worker passes through without dereferencing.
|
||||
const SILPH_CTX_FIELD_30_CONST: u32 = 0xBE56_8F00;
|
||||
|
||||
/// 64 KiB worker stack (mirrors AUDIT-048 audio worker), half of canary's
|
||||
/// 128 KiB default.
|
||||
const SILPH_WORKER_STACK: u32 = 0x10_000;
|
||||
|
||||
/// Idempotently synthesise the silph::WorkerCtx and spawn the four worker
|
||||
/// threads it normally drives.
|
||||
///
|
||||
/// `suspended` controls whether the spawned threads enter the runqueue as
|
||||
/// `Ready` (false) or as `Blocked(Suspended)` (true). Use `true` for
|
||||
/// diagnostic baselines where you want the ctx materialised in guest memory
|
||||
/// for downstream probes but don't want the worker bodies executing (e.g.
|
||||
/// when round-5 ctx fields like the foreign-arena pointers at +0x28/+0x2C/
|
||||
/// +0x30 are still NULL and the workers would fault on first dereference).
|
||||
///
|
||||
/// Returns the ctx VA on the first call; on subsequent calls returns the
|
||||
/// cached VA without re-spawning. Failures inside spawn are logged but the
|
||||
/// `silph_synth_done` latch is still flipped so we don't retry-loop.
|
||||
///
|
||||
/// Mirrors the AUDIT-048 audio-worker spawn pattern in
|
||||
/// `xaudio_register_render_driver` (`exports.rs:3122`).
|
||||
pub fn spawn_silph_workers(
|
||||
state: &mut KernelState,
|
||||
mem: &GuestMemory,
|
||||
suspended: bool,
|
||||
) -> Option<u32> {
|
||||
if state.silph_synth_done {
|
||||
return Some(state.silph_synth_ctx);
|
||||
}
|
||||
state.silph_synth_done = true;
|
||||
|
||||
let Some(ctx) = state.heap_alloc(SILPH_CTX_SIZE, mem) else {
|
||||
tracing::warn!("silph_synth: heap_alloc({:#x}) failed for ctx", SILPH_CTX_SIZE);
|
||||
return None;
|
||||
};
|
||||
state.silph_synth_ctx = ctx;
|
||||
|
||||
// Zero the entire ctx page first — heap_alloc returns freshly mapped
|
||||
// memory but we want the audit-059-round-5 layout to be canonical
|
||||
// regardless of any future allocator behaviour change.
|
||||
for off in (0..SILPH_CTX_SIZE).step_by(4) {
|
||||
mem.write_u32(ctx + off, 0);
|
||||
}
|
||||
|
||||
// ---- Header scalars (per audit-059 round 5 hexdump) ----
|
||||
mem.write_u32(ctx + 0x00, SILPH_CTX_VTABLE);
|
||||
mem.write_u32(ctx + 0x04, ctx); // self
|
||||
mem.write_u32(ctx + 0x08, ctx); // intrusive list head pointing at self
|
||||
mem.write_u32(ctx + 0x0C, 0x0000_0001); // init flag / refcount
|
||||
mem.write_u32(ctx + 0x10, 0x0100_0000); // packed byte field
|
||||
mem.write_u32(ctx + 0x18, 0x3F7F_CCCC); // float ~1.0 (UI rate A)
|
||||
mem.write_u32(ctx + 0x1C, 0x3F80_2D83); // float ~1.0 (UI rate B)
|
||||
mem.write_u32(ctx + 0x24, 0x0000_0001);
|
||||
|
||||
// +0x28..+0x30 = three foreign pointers.
|
||||
// +0x28 — canary's first-fire snapshot has NULL here. Round-19 fault
|
||||
// analysis shows worker bodies don't dereference this on
|
||||
// first entry, so we leave it NULL too.
|
||||
// +0x2C — sub-object pointer. Worker bodies do
|
||||
// `lwz r3,44(rN); lwz r11,0(r3); lwz r11,60(r11); bctrl`,
|
||||
// i.e. virtual-dispatch through slot 15 of the sub-object's
|
||||
// vtable. Point this at our synthesised sub-object embedded
|
||||
// at ctx + SILPH_SUBOBJ_OFFSET.
|
||||
// +0x30 — XEX-resident wrapper constant 0xBE568F00 (round 7). Opaque
|
||||
// but identical across all four canary ctxes.
|
||||
let subobj_ptr = ctx + SILPH_SUBOBJ_OFFSET;
|
||||
mem.write_u32(ctx + 0x2C, subobj_ptr);
|
||||
mem.write_u32(ctx + 0x30, SILPH_CTX_FIELD_30_CONST);
|
||||
|
||||
// ---- Embedded sub-object at +0x300 ----
|
||||
// Round-21 pivot: instead of synthesising a stub vtable that returns
|
||||
// NULL from every slot, point `[sub_object + 0]` directly at canary's
|
||||
// real XEX-resident sub-vtable VA. The vtable bytes are part of the
|
||||
// same static image both engines map, so referring to it costs zero
|
||||
// guest memory and gives the workers a working virtual-method surface
|
||||
// (slot 15 = sub_824FCCC8, slot 17 = sub_824FCE38, plus 29 other real
|
||||
// methods). Round-19 disassembly shows worker bodies only touch the
|
||||
// sub-object's vtable; the rest of the sub-object is opaque so we
|
||||
// leave it zero-filled.
|
||||
mem.write_u32(subobj_ptr, SILPH_SUB_VTABLE_SOURCE_VA);
|
||||
|
||||
// ---- 4× X_KEVENT auto-reset at +0x54/+0x64/+0x74/+0x84, state = 0 ----
|
||||
// X_DISPATCH_HEADER layout (canary xobject.h:35):
|
||||
// +0x00 type (u8: 0=manual-event, 1=auto-event, 2=mutant, ...)
|
||||
// +0x01 abandoned (u8)
|
||||
// +0x02 size (u8 dwords)
|
||||
// +0x03 inserted (u8)
|
||||
// +0x04 signal_state (u32 BE)
|
||||
// +0x08..+0x0F list_head (two pointers — self-link = empty list)
|
||||
for i in 0..4u32 {
|
||||
let off = ctx + 0x54 + (i * 0x10);
|
||||
mem.write_u8(off, 1); // type = auto-reset Event
|
||||
mem.write_u32(off + 4, 0); // signal_state = 0
|
||||
// List head self-link denotes empty waiter list.
|
||||
mem.write_u32(off + 8, off + 8);
|
||||
mem.write_u32(off + 12, off + 8);
|
||||
}
|
||||
// ---- 4× X_KEVENT manual-reset at +0x94..+0xC4, state = 1 (pre-signaled) ----
|
||||
for i in 0..4u32 {
|
||||
let off = ctx + 0x94 + (i * 0x10);
|
||||
mem.write_u8(off, 0); // type = manual-reset Event
|
||||
mem.write_u32(off + 4, 1); // signal_state = 1 (pre-signaled)
|
||||
mem.write_u32(off + 8, off + 8);
|
||||
mem.write_u32(off + 12, off + 8);
|
||||
}
|
||||
|
||||
// ---- 4-entry intrusive work-ring at +0x210, initially empty ----
|
||||
// Each entry: [+0]=0x01000000 [+4]=0 [+8]=self_ptr [+0xC]=self_ptr.
|
||||
for i in 0..4u32 {
|
||||
let off = ctx + 0x210 + (i * 0x10);
|
||||
mem.write_u32(off, 0x0100_0000);
|
||||
mem.write_u32(off + 4, 0);
|
||||
mem.write_u32(off + 8, off + 8);
|
||||
mem.write_u32(off + 12, off + 8);
|
||||
}
|
||||
|
||||
// +0x250 "XEN"-tagged descriptors and +0x2E0 resource-index table left
|
||||
// zero — they may be populated lazily by the workers themselves.
|
||||
|
||||
// ---- Spawn the 4 worker guest threads ----
|
||||
use std::sync::atomic::Ordering;
|
||||
let mut spawned = 0usize;
|
||||
for (i, &entry) in SILPH_WORKER_ENTRIES.iter().enumerate() {
|
||||
let Some(image) = allocate_thread_image(state, mem, SILPH_WORKER_STACK, 0) else {
|
||||
tracing::warn!("silph_synth: allocate_thread_image failed for worker {}", i);
|
||||
continue;
|
||||
};
|
||||
let tid = state.next_thread_id.fetch_add(1, Ordering::Relaxed);
|
||||
let handle = state.alloc_handle_for(KernelObject::Thread {
|
||||
id: tid,
|
||||
hw_id: None,
|
||||
exit_code: None,
|
||||
waiters: Vec::new(),
|
||||
});
|
||||
let tls_slot_count = state.next_tls_index.load(Ordering::Relaxed);
|
||||
let params = SpawnParams {
|
||||
entry,
|
||||
start_context: ctx, // r3 = ctx_ptr
|
||||
stack_base: image.stack_base,
|
||||
stack_size: image.stack_size,
|
||||
pcr_base: image.pcr_base,
|
||||
tls_base: image.tls_base,
|
||||
thread_handle: handle,
|
||||
guest_tid: tid,
|
||||
create_suspended: suspended,
|
||||
is_initial: false,
|
||||
tls_slot_count,
|
||||
affinity_mask: 0,
|
||||
priority: 0,
|
||||
ideal_processor: None,
|
||||
};
|
||||
match state.scheduler.spawn(params, &mut GuestMemoryPcr(mem)) {
|
||||
Ok(hw_id) => {
|
||||
if let Some(KernelObject::Thread { hw_id: slot, .. }) =
|
||||
state.objects.get_mut(&handle)
|
||||
{
|
||||
*slot = Some(hw_id);
|
||||
}
|
||||
let tref = ThreadRef::new(
|
||||
hw_id,
|
||||
(state.scheduler.slots[hw_id as usize].runqueue.len() - 1) as u16,
|
||||
);
|
||||
state.silph_synth_handles[i] = Some(handle);
|
||||
state.silph_synth_refs[i] = Some(tref);
|
||||
spawned += 1;
|
||||
tracing::info!(
|
||||
"silph_synth: spawned worker {} tid={} handle={:#x} entry={:#010x} ctx={:#010x}",
|
||||
i, tid, handle, entry, ctx
|
||||
);
|
||||
}
|
||||
Err(_) => {
|
||||
tracing::warn!(
|
||||
"silph_synth: scheduler.spawn failed for worker {} entry={:#010x}",
|
||||
i, entry
|
||||
);
|
||||
}
|
||||
}
|
||||
// Avoid an unused-import warning: `BlockReason` is otherwise unreferenced here.
|
||||
let _ = BlockReason::WaitAny {
|
||||
handles: Vec::new(),
|
||||
deadline: None,
|
||||
};
|
||||
}
|
||||
|
||||
tracing::info!(
|
||||
"silph_synth: ctx={:#010x} workers_spawned={}/4",
|
||||
ctx, spawned
|
||||
);
|
||||
|
||||
Some(ctx)
|
||||
}
|
||||
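The two event loops in `spawn_silph_workers` write the same 16-byte dispatcher header eight times, differing only in the type byte and the initial signal state. The sketch below reproduces that layout against a plain byte buffer so the big-endian encoding and the self-linked list head are easy to inspect; `write_kevent` and `build_event_block` are hypothetical helpers, and the buffer stands in for guest memory:

```rust
/// Write one 16-byte event header at `off` inside `mem`. `ctx_va` is the
/// guest address that corresponds to `mem[0]`.
fn write_kevent(mem: &mut [u8], ctx_va: u32, off: usize, manual_reset: bool, signaled: bool) {
    // Guest address of the list head, which lives 8 bytes into the header.
    let list_head_va = ctx_va + off as u32 + 8;
    mem[off] = if manual_reset { 0 } else { 1 }; // type byte
    mem[off + 4..off + 8].copy_from_slice(&u32::from(signaled).to_be_bytes());
    // Both list pointers refer back to the list head itself: empty list.
    mem[off + 8..off + 12].copy_from_slice(&list_head_va.to_be_bytes());
    mem[off + 12..off + 16].copy_from_slice(&list_head_va.to_be_bytes());
}

/// Build the two groups of four events at +0x54 and +0x94.
fn build_event_block(ctx_va: u32) -> Vec<u8> {
    let mut mem = vec![0u8; 0x100];
    for i in 0..4 {
        write_kevent(&mut mem, ctx_va, 0x54 + i * 0x10, false, false);
    }
    for i in 0..4 {
        write_kevent(&mut mem, ctx_va, 0x94 + i * 0x10, true, true);
    }
    mem
}
```

A self-linked list head is the conventional encoding of an empty waiter list, so a guest walking the list sees no waiters on any of the eight events.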
@@ -56,6 +56,18 @@ pub struct KernelState {
|
||||
/// publish; observers (the kernel object table) are guarded by
|
||||
/// their own synchronization.
|
||||
next_handle: std::sync::atomic::AtomicU32,
|
||||
/// AUDIT-059 R34: FIFO free list of closed handle slots, mirroring
|
||||
/// canary's slab/free-list `ObjectTable`. Without this, ours' bump
|
||||
/// allocator monotonically grows so a recycled slot in canary
|
||||
/// (e.g. `F8000098` reused 130× per 30s) corresponds to a fresh,
|
||||
/// never-reused slot in ours — the kernel-object identity drifts.
|
||||
/// Recycling closes that gap and (per AUDIT-042 / R30) may
|
||||
/// side-effect-unwedge γ-cluster #2 by letting silph signals land
|
||||
/// on the same handle slot the wait registered for. Population is
|
||||
/// gated on `KernelState::release_handle_slot` (only IDs in
|
||||
/// `[HANDLE_BASE, 0xF000_0000)` are recycled — synthetic XAudio
|
||||
/// handles at `0xF000_0000+` are reserved and must never be reused).
|
||||
free_handles: std::collections::VecDeque<u32>,
|
||||
/// Scheduler managing all emulated HW threads + their per-slot
|
||||
/// runqueues. Starts empty — the app installs the initial guest thread
|
||||
/// on slot 0 via `KernelState::install_initial_thread` once it has the
|
||||
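The `free_handles` field turns the bump allocator into a recycling one: closed IDs go onto a FIFO and are handed out again before any new ID is minted. A minimal sketch of that policy follows. The `HandleTable` type, the `HANDLE_BASE` value of `0x1000` and the step of 4 between fresh IDs are assumptions made for illustration; only the `0xF000_0000` reserved boundary comes from the doc comment above:

```rust
use std::collections::VecDeque;

const HANDLE_BASE: u32 = 0x1000; // assumed value, not taken from the source
const RESERVED_BASE: u32 = 0xF000_0000; // synthetic handles start here

/// Hypothetical stand-in for the handle-allocation part of `KernelState`.
struct HandleTable {
    next: u32,
    free: VecDeque<u32>,
}

impl HandleTable {
    fn new() -> Self {
        HandleTable { next: HANDLE_BASE, free: VecDeque::new() }
    }

    /// Reuse the oldest released ID if there is one, else mint a fresh one.
    fn alloc(&mut self) -> u32 {
        if let Some(handle) = self.free.pop_front() {
            return handle;
        }
        let handle = self.next;
        self.next += 4; // assumed stride between fresh IDs
        handle
    }

    /// Return an ID to the FIFO. Reserved IDs and double releases are ignored.
    fn release(&mut self, handle: u32) {
        let recyclable = (HANDLE_BASE..RESERVED_BASE).contains(&handle);
        if recyclable && !self.free.contains(&handle) {
            self.free.push_back(handle);
        }
    }
}
```

FIFO order means the slot closed earliest is reused first; whether that matches canary's reuse order exactly is something the audit notes would need to confirm.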
@@ -244,6 +256,52 @@ pub struct KernelState {
|
||||
/// Distinct from `ctor_probe_pcs` because that helper emits 8
|
||||
/// frames of back-chain per hit — too noisy for branch tracing.
|
||||
pub branch_probe_pcs: std::collections::HashSet<u32>,
|
||||
/// AUDIT-2BF — diagnostic. PCs at which to emit a structured one-line
|
||||
/// `AUDIT-PC-PROBE` record on every fire, designed for the silph init
|
||||
/// chain virtual-dispatch site at `sub_82172BA0+0x1E8` (PC
|
||||
/// `0x82172D88`, a `bctrl` after a 3-deep load of vtable slot 6). The
|
||||
/// emitted line carries (pc, tid, hw, cycle, lr, r3, r11) plus four
|
||||
/// guest-memory dereferences off `r3`: `[r3+0]` (vtable), `[[r3+0]+24]`
|
||||
/// (slot 6 method pointer = the bctrl target), `[r3+0x0C]` (audit-059
|
||||
/// round-9 canary-known auxiliary handle `0xF80000D8`), and `[r3+0x30]`
|
||||
/// (canary-known embedded sub-object vtable `0x820A1870`). Distinct
|
||||
/// from `branch_probe_pcs` because that helper only logs registers (no
|
||||
/// memory) and from `lr_trace_pcs` because that emits JSON intended
|
||||
/// for canary diffing, not the four hard-coded indirect dereferences
|
||||
/// needed here. Read-only — no guest state mutation. Lockstep
|
||||
/// digest unaffected. Settable via `--audit-pc-probe-hex` /
|
||||
/// `XENIA_AUDIT_PC_PROBE`.
|
||||
pub audit_pc_probe_pcs: std::collections::HashSet<u32>,
|
||||
/// AUDIT-2BF round 14 — diagnostic. Optional guest VA. When set, each
|
||||
/// `AUDIT-PC-PROBE` fire emits a paired `AUDIT-MEM-READ` line with
|
||||
/// `addr`, `*addr` (singleton value), `**addr` (vtable), `***addr+0`
|
||||
/// (vtable[0] = first virtual method), and `***addr+24` (vtable[6]
|
||||
/// in 4-byte stride = slot 6 = silph chain bctrl target). Three-deep
|
||||
/// dereference to resolve the vtable[0] target at the bctrl site
|
||||
/// `0x822F1B4C` inside `sub_822F1AA8`. Read-only; lockstep digest
|
||||
/// unaffected. Settable via `--audit-mem-read-hex` /
|
||||
/// `XENIA_AUDIT_MEM_READ`.
|
||||
pub audit_mem_read_addr: Option<u32>,
|
||||
/// AUDIT-052 — diagnostic. When set, each `AUDIT-PC-PROBE` fire
|
||||
/// additionally emits an `AUDIT-R3-DUMP` line with N bytes of guest
|
||||
/// memory dumped from `r3` as `u32` lanes (4-byte aligned only).
|
||||
/// Sized for audit-051's 80-byte stack-local struct at `r31+96`
|
||||
/// inside `sub_82452DC0` (probe `sub_8245B000` entry where
|
||||
/// `r3 == parent's r31+96`). Read-only; lockstep digest unaffected.
|
||||
/// Settable via `--audit-r3-dump-bytes` /
|
||||
/// `XENIA_AUDIT_R3_DUMP_BYTES`.
|
||||
pub audit_r3_dump_bytes: Option<u32>,
|
||||
/// iterate-2E — diagnostic pointer-chase. `(reg, off)`: on every
|
||||
/// `AUDIT-PC-PROBE` fire, treat `gpr[reg]` as a base object pointer,
|
||||
/// dump its first 64 bytes, then follow `[base+off]` to a sub-object
|
||||
/// (e.g. a stream/file object held in a work item), dump ITS first 64
|
||||
/// bytes, then follow `[[base+off]+0]` to the sub-object's vtable and
|
||||
/// dump the first 48 u32 slots. Designed to capture the live work-item
|
||||
/// + stream object + vtable at `sub_824510E0` entry (r4 = work item,
|
||||
/// stream at +36, vtable[28] = the "is-read-done?" predicate) BEFORE
|
||||
/// the pool recycles the slot. Read-only; lockstep digest unaffected.
|
||||
/// Settable via `XENIA_AUDIT_DEREF=<reg>:<off>` (e.g. `4:36`).
|
||||
pub audit_deref: Option<(u8, u32)>,
|
||||
/// M12 — diagnostic. PCs at which to emit a structured JSONL record
|
||||
/// per fire, designed for diffing against xenia-canary's
|
||||
/// `--log_lr_on_pc` patch output. Each line carries
|
||||
@@ -264,6 +322,56 @@ pub struct KernelState {
|
||||
pub dump_addrs: Vec<u32>,
|
||||
/// `--dump-section=BASE:LEN:PATH` end-of-run snapshot, page-gated by `is_mapped`.
|
||||
pub dump_section: Option<(u32, u32, std::path::PathBuf)>,
|
||||
/// AUDIT-2.BF — synthetic silph::WorkerCtx spawn one-shot latch. Set on
|
||||
/// first call to [`crate::silph_synth::spawn_silph_workers`] (triggered
|
||||
/// by the first observation of a load-bearing VFS path such as
|
||||
/// `dat/movie`), then reused — subsequent triggers are no-ops.
|
||||
pub silph_synth_done: bool,
|
||||
/// AUDIT-2.BF — VA of the synthesised silph::WorkerCtx. Zero before the
|
||||
/// first spawn; set to the ctx base by `spawn_silph_workers`. Held on
|
||||
/// the kernel state so future export hooks can find it (no caller does
|
||||
/// yet — placeholder for round 19+ wiring).
|
||||
pub silph_synth_ctx: u32,
|
||||
/// AUDIT-2.BF — kernel handles for the 4 synthetic worker threads.
|
||||
pub silph_synth_handles: [Option<u32>; 4],
|
||||
/// AUDIT-2.BF — `ThreadRef` cache for the 4 synthetic workers.
|
||||
pub silph_synth_refs: [Option<xenia_cpu::ThreadRef>; 4],
|
||||
/// ITERATE-2C Phase D — auto-signal delay for silph::UImpl
|
||||
/// `NtCreateEvent` calls (see [`Self::maybe_register_silph_autosignal`]).
|
||||
/// `None` = feature disabled; populated once from
|
||||
/// `XENIA_SILPH_UI_AUTOSIGNAL_DELAY=<u64>` at construction.
|
||||
pub silph_autosignal_delay: Option<u64>,
|
||||
/// ITERATE-2C Phase D — pending auto-signal queue. Drained each
|
||||
/// outer round by [`Self::fire_due_silph_autosignals`].
|
||||
pub silph_autosignal_pending: Vec<AutoSignalPending>,
|
||||
/// ITERATE-2C Phase D — most recent `stats.instruction_count`
|
||||
/// deposited by the scheduler loop (see
|
||||
/// [`Self::set_now_cycle_hint`]). Used by
|
||||
/// [`Self::maybe_register_silph_autosignal`] to compute absolute
|
||||
/// deadlines, since `nt_create_event` doesn't see `ExecStats`.
|
||||
pub last_cycle_hint: u64,
|
||||
/// ITERATE-2C Phase D — one-shot diagnostic latch. Flipped by
|
||||
/// [`Self::fire_due_silph_autosignals`] on the first visit where
|
||||
/// the pending queue is non-empty but no entry is due yet.
|
||||
pub silph_autosignal_diag_logged: bool,
|
||||
/// ITERATE-2J — guest VA of the `KeTimeStampBundle` block (xboxkrnl
|
||||
/// data export ordinal 0x00AD). Set during the import-patch pass in
|
||||
/// `xenia-app`. Zero until then. The guest's worker-hub channel
|
||||
/// dispatch loop polls `[block+0x10]` (`tick_count`, milliseconds) and
|
||||
/// gates dispatch on a `tick_count + 66` deadline; if the block is
|
||||
/// never re-written that deadline never elapses and the hub spins
|
||||
/// forever (the tid14 0x109c starvation gate). The run loop ticks this
|
||||
/// block every round from the deterministic `global_clock` via
|
||||
/// [`Self::update_timestamp_bundle`].
|
||||
pub timestamp_bundle_addr: u32,
|
||||
}
|
||||
|
||||
/// ITERATE-2C Phase D — one queued auto-signal. `deadline_cycle` is
|
||||
/// absolute (cycle hint at register time + configured delay).
|
||||
#[derive(Debug, Clone, Copy)]
|
||||
pub struct AutoSignalPending {
|
||||
pub handle: u32,
|
||||
pub deadline_cycle: u64,
|
||||
}
|
||||
|
||||
impl KernelState {
|
||||
@@ -289,6 +397,7 @@ impl KernelState {
|
||||
let mut state = Self {
|
||||
exports: HashMap::new(),
|
||||
next_handle: AtomicU32::new(0x1000),
|
||||
free_handles: std::collections::VecDeque::new(),
|
||||
scheduler,
|
||||
next_tls_index: AtomicU32::new(0),
|
||||
cs_waiters: HashMap::new(),
|
||||
@@ -327,10 +436,25 @@ impl KernelState {
|
||||
ctor_probe_pcs: std::collections::HashSet::new(),
|
||||
pc_probe_consumers: HashMap::new(),
|
||||
branch_probe_pcs: std::collections::HashSet::new(),
|
||||
audit_pc_probe_pcs: std::collections::HashSet::new(),
|
||||
audit_mem_read_addr: None,
|
||||
audit_r3_dump_bytes: None,
|
||||
audit_deref: None,
|
||||
lr_trace_pcs: std::collections::HashSet::new(),
|
||||
lr_trace_writer: None,
|
||||
dump_addrs: Vec::new(),
|
||||
dump_section: None,
|
||||
silph_synth_done: false,
|
||||
silph_synth_ctx: 0,
|
||||
silph_synth_handles: [None; 4],
|
||||
silph_synth_refs: [None; 4],
|
||||
silph_autosignal_delay: std::env::var("XENIA_SILPH_UI_AUTOSIGNAL_DELAY")
|
||||
.ok()
|
||||
.and_then(|v| v.parse::<u64>().ok()),
|
||||
silph_autosignal_pending: Vec::new(),
|
||||
last_cycle_hint: 0,
|
||||
silph_autosignal_diag_logged: false,
|
||||
timestamp_bundle_addr: 0,
|
||||
};
|
||||
crate::exports::register_exports(&mut state);
|
||||
crate::xam::register_exports(&mut state);
|
||||
@@ -604,12 +728,39 @@ impl KernelState {
|
||||
}
|
||||
|
||||
pub fn alloc_handle(&mut self) -> u32 {
|
||||
// AUDIT-059 R34: prefer recycling a closed slot (FIFO, matching
|
||||
// canary's `ObjectTable` slab) before bumping. The Arc<Mutex<
|
||||
// KernelState>> already serializes us; no extra synchronization.
|
||||
if let Some(slot) = self.free_handles.pop_front() {
|
||||
return slot;
|
||||
}
|
||||
// M2.4: lock-free fetch_add. Relaxed is sufficient — IDs are
|
||||
// opaque tokens; no payload is sequenced against the counter.
|
||||
self.next_handle
|
||||
.fetch_add(4, std::sync::atomic::Ordering::Relaxed)
|
||||
}
|
||||
|
||||
/// AUDIT-059 R34. Return a freshly-closed handle slot to the FIFO
|
||||
/// recycle queue. No-op for the synthetic XAudio range (`>= 0xF000_0000`,
|
||||
/// AUDIT-048) and the reserved `< 0x1000` band. Call site: `nt_close`'s
|
||||
/// `objects.remove` branch when refcount reaches zero.
|
||||
///
|
||||
/// ABA guard (subsystem-audit 2026-06-12): never recycle a slot that a
|
||||
/// thread is still parked on. Without this, a closed slot could be
|
||||
/// re-minted for a new object and a signal on that new object would wake
|
||||
/// the stale waiter that was blocked on the OLD object at the same slot.
|
||||
/// Such a slot is simply leaked (it stays out of `free_handles`),
|
||||
/// reproducing the pre-R34 bump-only behaviour for that rare case.
|
||||
pub fn release_handle_slot(&mut self, handle: u32) {
|
||||
if handle < 0x1000 || handle >= 0xF000_0000 {
|
||||
return;
|
||||
}
|
||||
if self.scheduler.any_thread_waiting_on(handle) {
|
||||
return;
|
||||
}
|
||||
self.free_handles.push_back(handle);
|
||||
}
|
||||
|
||||
pub fn alloc_handle_for(&mut self, obj: KernelObject) -> u32 {
|
||||
let h = self.alloc_handle();
|
||||
self.objects.insert(h, obj);
|
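The recycle-then-bump policy and the ABA guard described in the hunk above can be sketched in isolation. This is a minimal illustration, not the project's code: `HandleTable` and its `parked_on` set are hypothetical stand-ins for `KernelState` and `scheduler.any_thread_waiting_on`, and a plain `u32` counter replaces the `AtomicU32`.

```rust
use std::collections::{HashSet, VecDeque};

/// Hypothetical stand-in for the handle-allocation part of `KernelState`.
struct HandleTable {
    next_handle: u32,
    free_handles: VecDeque<u32>,
    /// Assumption: models `scheduler.any_thread_waiting_on(handle)`.
    parked_on: HashSet<u32>,
}

impl HandleTable {
    fn new() -> Self {
        Self {
            next_handle: 0x1000,
            free_handles: VecDeque::new(),
            parked_on: HashSet::new(),
        }
    }

    /// Prefer the oldest recycled slot (FIFO); otherwise bump by 4.
    fn alloc_handle(&mut self) -> u32 {
        if let Some(slot) = self.free_handles.pop_front() {
            return slot;
        }
        let h = self.next_handle;
        self.next_handle += 4;
        h
    }

    /// Return a closed slot to the queue unless it is out of band or a
    /// thread is still parked on it (in which case the slot is leaked).
    fn release_handle_slot(&mut self, handle: u32) {
        if handle < 0x1000 || handle >= 0xF000_0000 {
            return;
        }
        if self.parked_on.contains(&handle) {
            return;
        }
        self.free_handles.push_back(handle);
    }
}
```

Leaking a parked-on slot trades a small amount of handle space for the guarantee that a signal on a newly minted object can never wake a waiter that blocked on the old object at the same ID.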
||||
@@ -714,6 +865,173 @@ impl KernelState {
|
||||
self.audit.record_wake(handle, entry);
|
||||
}
|
||||
|
||||
/// ITERATE-2C Phase D — deposit the latest scheduler instruction
|
||||
/// count so `nt_create_event` can compute absolute auto-signal
|
||||
/// deadlines. Called once per outer round from the app's
|
||||
/// `coord_pre_round` site. Always stores; read only when the feature env is set.
|
||||
pub fn set_now_cycle_hint(&mut self, now_cycle: u64) {
|
||||
self.last_cycle_hint = now_cycle;
|
||||
}
|
||||
|
||||
/// ITERATE-2J — tick the `KeTimeStampBundle` block (xboxkrnl ordinal
|
||||
/// 0x00AD) from the deterministic monotonic clock so the guest sees a
|
||||
/// clock that *advances*.
|
||||
///
|
||||
/// `clock` is the scheduler's `global_clock` — a pure function of
|
||||
/// retired guest instructions (see [`Self::now_basis_at`] /
|
||||
/// `Scheduler::global_clock`). Lockstep floors it up to
|
||||
/// `stats.instruction_count` each round; parallel sums per-block
|
||||
/// retired counts. Using it (rather than wall-clock) keeps every
|
||||
/// guest-visible time value a deterministic function of guest progress,
|
||||
/// so lockstep stays byte-reproducible.
|
||||
///
|
||||
/// ## Cadence
|
||||
/// The existing kernel time math (`parse_timeout` in `exports.rs`)
|
||||
/// already treats **1 `global_clock` unit ≈ 100 ns**: it converts a
|
||||
/// signed 100-ns `LARGE_INTEGER` timeout to a deadline by dividing the
|
||||
/// magnitude by 100 and adding it to `now` (= `global_clock`). To stay
|
||||
/// coherent with that, this method uses the same scale:
|
||||
///
|
||||
/// * `interrupt_time` / `system_time` (100-ns units): `clock` (with a
|
||||
/// FILETIME epoch base added to `system_time`).
|
||||
/// * `tick_count` (milliseconds): `clock / INSTRUCTIONS_PER_MS` where
|
||||
/// `INSTRUCTIONS_PER_MS = 10_000` (10_000 × 100 ns = 1 ms).
|
||||
///
|
||||
/// At 10_000 clock-units/ms, the guest's `tick_count + 66` ms hub
|
||||
/// deadline elapses by ~660_000 retired instructions — very early in a
|
||||
/// ~1 B-instruction boot — while a 16 ms `KeWait` timeout
|
||||
/// (`parse_timeout`: 160_000 units) still resolves to 16 ms of
|
||||
/// tick_count, so no timeout collapses to "instant". The two readers
|
||||
/// share one scale.
|
||||
pub fn update_timestamp_bundle(&self, mem: &GuestMemory, clock: u64) {
|
||||
let block = self.timestamp_bundle_addr;
|
||||
if block == 0 {
|
||||
return;
|
||||
}
|
||||
const INSTRUCTIONS_PER_MS: u64 = 10_000;
|
||||
// FILETIME epoch base (~2021) so `system_time` is a plausible
|
||||
// absolute wall-clock; matches the constant used by
|
||||
// `ke_query_system_time`. interrupt_time is "since boot" so it
|
||||
// starts at the clock origin (no epoch offset).
|
||||
const FILETIME_BASE: u64 = 132_500_000_000_000_000;
|
||||
let interrupt_time: u64 = clock;
|
||||
let system_time: u64 = FILETIME_BASE.wrapping_add(clock);
|
||||
let tick_count: u32 = (clock / INSTRUCTIONS_PER_MS) as u32;
|
||||
// BE writes (write_u64/write_u32 use to_be_bytes) — guest is BE.
|
||||
mem.write_u64(block, interrupt_time); // +0x00 interrupt_time
|
||||
mem.write_u64(block + 0x08, system_time); // +0x08 system_time
|
||||
mem.write_u32(block + 0x10, tick_count); // +0x10 tick_count (ms)
|
||||
mem.write_u32(block + 0x14, 0); // +0x14 padding
|
||||
}
|
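The cadence arithmetic in the doc comment above can be checked with a pure function. The sketch below assumes the layout and constants stated in the hunk (`INSTRUCTIONS_PER_MS`, `FILETIME_BASE`, big-endian fields at `+0x00`, `+0x08`, `+0x10`, `+0x14`); `encode_timestamp_bundle` and `tick_count_of` are hypothetical helpers that build and read a 24-byte image instead of writing to `GuestMemory`.

```rust
/// Scale taken from the doc comment: one clock unit is treated as ~100 ns.
const INSTRUCTIONS_PER_MS: u64 = 10_000;
/// Epoch base for `system_time`, as given in the hunk.
const FILETIME_BASE: u64 = 132_500_000_000_000_000;

/// Hypothetical helper: the big-endian byte image of the block for `clock`.
fn encode_timestamp_bundle(clock: u64) -> [u8; 24] {
    let interrupt_time = clock;
    let system_time = FILETIME_BASE.wrapping_add(clock);
    let tick_count = (clock / INSTRUCTIONS_PER_MS) as u32;

    let mut out = [0u8; 24];
    out[0x00..0x08].copy_from_slice(&interrupt_time.to_be_bytes());
    out[0x08..0x10].copy_from_slice(&system_time.to_be_bytes());
    out[0x10..0x14].copy_from_slice(&tick_count.to_be_bytes());
    // 0x14..0x18 is padding and stays zero.
    out
}

/// Hypothetical helper: what a guest poll of `[block+0x10]` would observe.
fn tick_count_of(block: &[u8; 24]) -> u32 {
    u32::from_be_bytes([block[0x10], block[0x11], block[0x12], block[0x13]])
}
```

Under these assumptions, a guest that samples `tick_count` at clock 0 and waits for `tick_count + 66` sees the deadline reached once the clock hits 660_000 units, which is the figure quoted in the comment.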
||||
|
||||
/// ITERATE-2C Phase D — register a freshly-allocated event for
|
||||
/// auto-signal after the configured delay, **iff** the creating
|
||||
/// thread matches the silph::UImpl tid=13 chain that wedges in
|
||||
/// audit-049. Filter:
|
||||
///
|
||||
/// * Env `XENIA_SILPH_UI_AUTOSIGNAL_DELAY` is set (`silph_autosignal_delay` is `Some`)
|
||||
/// * Frame-1 LR (the guest caller's post-bl PC, walked one step up
|
||||
/// from the live thunk-wrapper frame) is in
|
||||
/// `[0x821CB15C, 0x821CB160]` — this is the `NtCreateEvent` call
|
||||
/// site inside `sub_821CB030+0x128`. The live `ctx.lr` is the
|
||||
/// thunk wrapper's return slot (e.g. `0x824a9f6c`), so we walk
|
||||
/// one back-chain step to reach the actual guest caller.
|
||||
/// * Creating thread's `start_entry == 0x821748F0` (silph trampoline)
|
||||
/// * Creating thread's `start_context == 0x4024a840`
|
||||
///
|
||||
/// On match, the handle is queued with `deadline = last_cycle_hint +
|
||||
/// delay`. Drained by [`Self::fire_due_silph_autosignals`] from the
|
||||
/// outer scheduler loop.
|
||||
pub fn maybe_register_silph_autosignal(
|
||||
&mut self,
|
||||
handle: u32,
|
||||
ctx: &PpcContext,
|
||||
mem: &GuestMemory,
|
||||
) {
|
||||
let Some(delay) = self.silph_autosignal_delay else {
|
||||
return;
|
||||
};
|
||||
let Some((entry, start_ctx)) = self.scheduler.current_thread_entry_and_ctx() else {
|
||||
return;
|
||||
};
|
||||
if entry != 0x821748F0 || start_ctx != 0x4024_a840 {
|
||||
return;
|
||||
}
|
||||
let frames = walk_guest_back_chain(ctx.gpr[1] as u32, ctx.lr as u32, mem, 2);
|
||||
let caller_lr = match frames.get(1) {
|
||||
Some((_, lr)) => *lr,
|
||||
None => return,
|
||||
};
|
||||
if !(0x821CB15C..=0x821CB160).contains(&caller_lr) {
|
||||
return;
|
||||
}
|
||||
let deadline = self.last_cycle_hint.saturating_add(delay);
|
||||
self.silph_autosignal_pending
|
||||
.push(AutoSignalPending { handle, deadline_cycle: deadline });
|
||||
tracing::info!(
|
||||
"silph autosignal: scheduled handle={:#x} caller_lr={:#x} for cycle {} (now={}, delay={})",
|
||||
handle,
|
||||
caller_lr,
|
||||
deadline,
|
||||
self.last_cycle_hint,
|
||||
delay,
|
||||
);
|
||||
}
|
||||
|
||||
/// ITERATE-2C Phase D — drain pending entries whose deadline has
|
||||
/// passed. Each fires by setting `Event { signaled = true }` and
|
||||
/// invoking the existing `wake_eligible_waiters` to release blocked
|
||||
/// waiters. No-op when the queue is empty (the common case).
|
||||
pub fn fire_due_silph_autosignals(&mut self, now_cycle: u64) {
|
||||
if self.silph_autosignal_pending.is_empty() {
|
||||
return;
|
||||
}
|
||||
let any_due = self
|
||||
.silph_autosignal_pending
|
||||
.iter()
|
||||
.any(|p| p.deadline_cycle <= now_cycle);
|
||||
if !any_due {
|
||||
// Diagnostic for the Phase D POC: log the first time we visit
|
||||
// with a non-empty queue but nothing due yet.
|
||||
if !self.silph_autosignal_diag_logged {
|
||||
self.silph_autosignal_diag_logged = true;
|
||||
if let Some(first) = self.silph_autosignal_pending.first() {
|
||||
tracing::info!(
|
||||
"silph autosignal: tick (first visit, none due) now={} pending={} first_deadline={}",
|
||||
now_cycle,
|
||||
self.silph_autosignal_pending.len(),
|
||||
first.deadline_cycle,
|
||||
);
|
||||
}
|
||||
}
|
||||
}
|
||||
let mut i = 0;
|
||||
while i < self.silph_autosignal_pending.len() {
|
||||
if self.silph_autosignal_pending[i].deadline_cycle <= now_cycle {
|
||||
let p = self.silph_autosignal_pending.swap_remove(i);
|
||||
let prev = match self.objects.get_mut(&p.handle) {
|
||||
Some(KernelObject::Event { signaled, .. }) => {
|
||||
let was = *signaled;
|
||||
*signaled = true;
|
||||
Some(was)
|
||||
}
|
||||
_ => None,
|
||||
};
|
||||
tracing::info!(
|
||||
"silph autosignal: firing handle={:#x} prev_signaled={:?} at cycle {}",
|
||||
p.handle,
|
||||
prev,
|
||||
now_cycle,
|
||||
);
|
||||
self.audit_signal(p.handle, 0, "silph_autosignal", prev.unwrap_or(false) as u64);
|
||||
crate::exports::wake_eligible_waiters(self, p.handle);
|
||||
// do not advance i — swap_remove pulled a new entry into i
|
||||
} else {
|
||||
i += 1;
|
||||
}
|
||||
}
|
||||
}
|
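The drain loop above relies on a `swap_remove` idiom: when an entry is removed, the last element is moved into index `i`, so the index must not advance. A minimal standalone sketch follows; `Pending` and `drain_due` are hypothetical names, and the signalling and wake steps are left out.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
struct Pending {
    handle: u32,
    deadline_cycle: u64,
}

/// Remove and return every entry whose deadline is `<= now_cycle`.
/// Entries that are not yet due stay in `pending`.
fn drain_due(pending: &mut Vec<Pending>, now_cycle: u64) -> Vec<Pending> {
    let mut fired = Vec::new();
    let mut i = 0;
    while i < pending.len() {
        if pending[i].deadline_cycle <= now_cycle {
            // swap_remove moves the last element into slot `i`,
            // so re-examine the same index on the next iteration.
            fired.push(pending.swap_remove(i));
        } else {
            i += 1;
        }
    }
    fired
}
```

`swap_remove` is O(1) per removal but does not preserve order, so entries that become due in the same round may fire in a different order than they were registered.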
||||
|
||||
/// Diagnostic. If the live PC for HW slot `hw_id` is in
|
||||
/// `self.ctor_probe_pcs`, emit a single `CTOR-PROBE` line with
|
||||
/// the current cycle, tid, hw_id, sp, r3, lr, plus an 8-frame
|
||||
@@ -797,6 +1115,123 @@ impl KernelState {
|
||||
);
|
||||
}
|
||||
|
||||
/// AUDIT-2BF — diagnostic. If the live PC for HW slot `hw_id` is in
|
||||
/// `self.audit_pc_probe_pcs`, emit a single one-line
|
||||
/// `AUDIT-PC-PROBE` record with (pc, tid, hw, cycle, lr, r3, r11)
|
||||
/// plus four guest-memory dereferences off r3: `[r3+0]` (vtable),
|
||||
/// `[[r3+0]+24]` (slot 6 method = bctrl target), `[r3+0x0C]`
|
||||
/// (auxiliary handle field), `[r3+0x30]` (embedded sub-object
|
||||
/// vtable field). Tuned for the silph init chain virtual-dispatch
|
||||
/// site at `sub_82172BA0+0x1E8` (PC `0x82172D88`).
|
||||
///
|
||||
/// Read-only. No guest-state mutation; lockstep digest unaffected.
|
||||
/// Empty set is the common case → single `is_empty()` test on the
|
||||
/// hot path.
|
||||
pub fn fire_audit_pc_probe_if_match(&self, hw_id: u8, mem: &GuestMemory) {
|
||||
if self.audit_pc_probe_pcs.is_empty() {
|
||||
return;
|
||||
}
|
||||
let ctx = self.scheduler.ctx(hw_id);
|
||||
let pc = ctx.pc;
|
||||
if !self.audit_pc_probe_pcs.contains(&pc) {
|
||||
return;
|
||||
}
|
||||
let tid = self.scheduler.tid(hw_id).unwrap_or(0);
|
||||
let r3 = ctx.gpr[3] as u32;
|
||||
let r11 = ctx.gpr[11] as u32;
|
||||
let lr = ctx.lr as u32;
|
||||
let cycle = ctx.cycle_count;
|
||||
// Memory dereferences. Guest pointers may be unmapped/garbage;
|
||||
// `read_u32` returns 0 for unmapped pages (heap.rs:510 returns
|
||||
// a default), so an all-zero block in the output reliably
|
||||
// indicates an invalid `r3`.
|
||||
let vtable = mem.read_u32(r3);
|
||||
let slot6_method = if vtable != 0 {
|
||||
mem.read_u32(vtable.wrapping_add(24))
|
||||
} else {
|
||||
0
|
||||
};
|
||||
let aux_handle = mem.read_u32(r3.wrapping_add(0x0C));
|
||||
let sub_vt = mem.read_u32(r3.wrapping_add(0x30));
|
||||
println!(
|
||||
"AUDIT-PC-PROBE pc={:#010x} tid={} hw={} cycle={} lr={:#010x} r3={:#010x} r11={:#010x} \
|
||||
[r3+0]={:#010x} [[r3+0]+24]={:#010x} [r3+0x0C]={:#010x} [r3+0x30]={:#010x}",
|
||||
pc, tid, hw_id, cycle, lr, r3, r11,
|
||||
vtable, slot6_method, aux_handle, sub_vt,
|
||||
);
|
||||
// AUDIT-2BF round 14 — paired memory-read. When
|
||||
// `audit_mem_read_addr` is set, dereference 3 deep: singleton
|
||||
// pointer → vtable → vtable[0] / vtable+24 (slot 6). Defensively
|
||||
// null-checks each level. `read_u32` returns 0 for unmapped
|
||||
// pages so all-zero output is the unmapped/uninitialized
|
||||
// signature.
|
||||
if let Some(addr) = self.audit_mem_read_addr {
|
||||
let val = mem.read_u32(addr);
|
||||
let vt = if val != 0 { mem.read_u32(val) } else { 0 };
|
||||
let m0 = if vt != 0 { mem.read_u32(vt) } else { 0 };
|
||||
let m6 = if vt != 0 { mem.read_u32(vt.wrapping_add(24)) } else { 0 };
|
||||
println!(
|
||||
"AUDIT-MEM-READ addr={:#010x} val={:#010x} vtable={:#010x} \
|
||||
vtable[0]={:#010x} vtable[24]={:#010x} pc={:#010x} tid={} cycle={}",
|
||||
addr, val, vt, m0, m6, pc, tid, cycle,
|
||||
);
|
||||
}
|
||||
// AUDIT-052 — dump N bytes of guest memory from r3 as u32 lanes
|
||||
// when `audit_r3_dump_bytes` is set. Sized for the 80-byte
|
||||
// stack-local struct at sub_82452DC0's `r31+96` (probe is
|
||||
// sub_8245B000 entry where r3 IS the struct ptr). Output
|
||||
// format: `AUDIT-R3-DUMP pc=… r3=… +0x00=… +0x04=… …`.
|
||||
if let Some(n) = self.audit_r3_dump_bytes {
|
||||
let n = n.min(256) & !3u32; // cap 256B, 4-byte align
|
||||
let mut out = String::with_capacity(64 + (n as usize) * 16);
|
||||
use std::fmt::Write as _;
|
||||
let _ = write!(
|
||||
&mut out,
|
||||
"AUDIT-R3-DUMP pc={:#010x} tid={} cycle={} r3={:#010x}",
|
||||
pc, tid, cycle, r3,
|
||||
);
|
||||
let mut off: u32 = 0;
|
||||
while off < n {
|
||||
let v = mem.read_u32(r3.wrapping_add(off));
|
||||
let _ = write!(&mut out, " +0x{:02x}={:#010x}", off, v);
|
||||
off = off.wrapping_add(4);
|
||||
}
|
||||
println!("{}", out);
|
||||
}
|
||||
// iterate-2E — pointer-chase: dump base object (gpr[reg]), the
|
||||
// sub-object it holds at [base+off], and that sub-object's vtable
|
||||
// slots. Captures the live work-item + stream + vtable[28] at
|
||||
// sub_824510E0 before the pool recycles the slot. Read-only.
|
||||
if let Some((reg, deref_off)) = self.audit_deref {
|
||||
use std::fmt::Write as _;
|
||||
let base = ctx.gpr[reg as usize] as u32;
|
||||
let dump64 = |label: &str, p: u32| {
|
||||
let mut s = String::with_capacity(256);
|
||||
let _ = write!(&mut s, "AUDIT-DEREF {} ptr={:#010x}", label, p);
|
||||
let mut o: u32 = 0;
|
||||
while o < 64 {
|
||||
let _ = write!(&mut s, " +0x{:02x}={:#010x}", o, mem.read_u32(p.wrapping_add(o)));
|
||||
o += 4;
|
||||
}
|
||||
println!("{}", s);
|
||||
};
|
||||
println!("AUDIT-DEREF-HEAD pc={:#010x} tid={} cycle={} reg=r{} off=0x{:x}", pc, tid, cycle, reg, deref_off);
|
||||
dump64("item", base);
|
||||
let sub = mem.read_u32(base.wrapping_add(deref_off));
|
||||
dump64("sub", sub);
|
||||
let vt = mem.read_u32(sub); // [sub+0] = vtable
|
||||
// Dump 48 vtable slots so slot 28 (+0x70) and slot 36 (+0x90) show.
|
||||
let mut s = String::with_capacity(512);
|
||||
let _ = write!(&mut s, "AUDIT-DEREF vtable={:#010x}", vt);
|
||||
let mut slot: u32 = 0;
|
||||
while slot < 48 {
|
||||
let _ = write!(&mut s, " [{}]={:#010x}", slot, mem.read_u32(vt.wrapping_add(slot * 4)));
|
||||
slot += 1;
|
||||
}
|
||||
println!("{}", s);
|
||||
}
|
||||
}
|
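The null-guarded three-level dereference used for the `AUDIT-MEM-READ` line can be sketched against a toy memory. `SparseMem` is a hypothetical stand-in for `GuestMemory` that, like the behaviour described in the comments, reads unmapped addresses as 0; `chase` is an illustrative name, not a function in the diff.

```rust
use std::collections::HashMap;

/// Hypothetical guest memory: any address not present reads as 0.
struct SparseMem {
    words: HashMap<u32, u32>,
}

impl SparseMem {
    fn from_pairs(pairs: &[(u32, u32)]) -> Self {
        Self { words: pairs.iter().copied().collect() }
    }

    fn read_u32(&self, addr: u32) -> u32 {
        self.words.get(&addr).copied().unwrap_or(0)
    }
}

/// Follow addr -> object -> vtable and read vtable slot 0 and slot 6
/// (byte offset 24). Each level is skipped once a null is seen.
/// Returns (val, vtable, slot0, slot6).
fn chase(mem: &SparseMem, addr: u32) -> (u32, u32, u32, u32) {
    let val = mem.read_u32(addr);
    let vt = if val != 0 { mem.read_u32(val) } else { 0 };
    let m0 = if vt != 0 { mem.read_u32(vt) } else { 0 };
    let m6 = if vt != 0 { mem.read_u32(vt.wrapping_add(24)) } else { 0 };
    (val, vt, m0, m6)
}
```

Because every unmapped read yields 0, an all-zero tuple is the signature of an unset or invalid singleton pointer rather than a crash in the probe.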
||||
|
||||
/// M12 — diagnostic. If the live PC for HW slot `hw_id` is in
|
||||
/// `self.lr_trace_pcs`, emit one JSONL record. Format mirrors what
|
||||
/// xenia-canary's `--log_lr_on_pc` patch emits, plus the cycle
|
||||
@@ -922,6 +1357,30 @@ impl KernelState {
|
||||
self.pending_timer_fires.first().map(|&(d, _)| d)
|
||||
}
|
||||
|
||||
/// Coherent "now" basis for deadline arithmetic — the scheduler's
|
||||
/// single monotonic `global_clock`, in BOTH execution modes.
|
||||
///
|
||||
/// Per-thread `ctx(hw_id).timebase` is NOT a sound "now" for deadline
|
||||
/// arithmetic: in `--parallel` workers extract/zero their slots while
|
||||
/// stepping unlocked, and in **lockstep** a parked/poll thread has
|
||||
/// `running_idx == None` so `ctx()` returns `idle_ctx` (timebase 0).
|
||||
/// Either way a `parse_timeout` reading the per-thread basis can see 0
|
||||
/// (or a stale value) and register `deadline = 0 + relative`, a value
|
||||
/// permanently in the past, which `coord_idle_advance` then re-arms
|
||||
/// forever (the timebase-desync livelock; the render-gate root). The
|
||||
/// `global_clock` is a deterministic function of retired guest
|
||||
/// instructions (per-round `stats.instruction_count` floor-ups in
|
||||
/// lockstep, per-block retired counts in parallel), so it is coherent,
|
||||
/// monotonic, never zero after boot, and bit-reproducible across two
|
||||
/// cold lockstep runs.
|
||||
///
|
||||
/// The `hw_id` argument is retained for call-site clarity (which slot a
|
||||
/// caller would conceptually be "asking about") but is no longer read —
|
||||
/// the basis is global.
|
||||
pub fn now_basis_at(&self, _hw_id: u8) -> u64 {
|
||||
self.scheduler.global_clock()
|
||||
}
|
||||
|
||||
/// Fire every timer whose deadline is `<= now` (the scheduler's
|
||||
/// `global_clock`, read via `now_basis_at`, the same basis as `parse_timeout`).
|
||||
/// For each fire: mark the timer `signaled=true`, clear its
|
||||
@@ -930,7 +1389,7 @@ impl KernelState {
|
||||
/// fired — the caller uses this to decide whether the scheduler round
|
||||
/// needs a follow-up `advance_to_next_wake_if_due` step.
|
||||
pub fn fire_due_timers(&mut self) -> bool {
|
||||
let now = self.scheduler.ctx(0).timebase;
|
||||
let now = self.now_basis_at(0);
|
||||
let mut fired = false;
|
||||
loop {
|
||||
let Some(&(deadline, handle)) = self.pending_timer_fires.first() else {
|
||||
|
||||
@@ -31,6 +31,9 @@ impl VfsDevice for HostPathDevice {
|
||||
is_directory: metadata.is_dir(),
|
||||
size: metadata.len(),
|
||||
offset: 0,
|
||||
// Host FS carries no Xbox attribute byte; synthesise the
|
||||
// DIRECTORY/NORMAL split like canary's HostPathDevice.
|
||||
attributes: if metadata.is_dir() { 0x10 } else { 0x80 },
|
||||
});
|
||||
}
|
||||
Ok(entries)
|
||||
@@ -49,6 +52,7 @@ impl VfsDevice for HostPathDevice {
|
||||
is_directory: metadata.is_dir(),
|
||||
size: metadata.len(),
|
||||
offset: 0,
|
||||
attributes: if metadata.is_dir() { 0x10 } else { 0x80 },
|
||||
})
|
||||
}
|
||||
}
|
||||
|
||||
@@ -29,6 +29,11 @@ const GDFX_MAGIC: &[u8; 20] = b"MICROSOFT*XBOX*MEDIA";
|
||||
/// File attribute: directory
|
||||
const FILE_ATTRIBUTE_DIRECTORY: u8 = 0x10;
|
||||
|
||||
/// File attribute: read-only. Canary OR's this into every GDFX entry's
|
||||
/// attribute byte because a pressed disc is inherently read-only
|
||||
/// (`disc_image_device.cc:154`: `attributes | kFileAttributeReadOnly`).
|
||||
const FILE_ATTRIBUTE_READONLY: u8 = 0x01;
|
||||
|
||||
/// Known game partition offsets to try
|
||||
const LIKELY_OFFSETS: &[u64] = &[
|
||||
0x0000_0000,
|
||||
@@ -131,6 +136,11 @@ impl DiscImageDevice {
|
||||
|
||||
let name = String::from_utf8_lossy(&buffer[p + 14..p + 14 + name_length]).to_string();
|
||||
let is_directory = (attributes & FILE_ATTRIBUTE_DIRECTORY) != 0;
|
||||
// Match canary: the on-disc attribute byte (DIRECTORY/HIDDEN/SYSTEM/
|
||||
// ARCHIVE/NORMAL bits as authored) OR the implicit READONLY bit for
|
||||
// pressed media. We forward the FULL byte, not a path-shape guess, so
|
||||
// attribute queries report exactly what the disc records.
|
||||
let attributes = (attributes | FILE_ATTRIBUTE_READONLY) as u32;
|
||||
let file_offset = self.game_offset + sector * SECTOR_SIZE;
|
||||
let full_path = if prefix.is_empty() {
|
||||
name.clone()
|
||||
@@ -143,6 +153,7 @@ impl DiscImageDevice {
|
||||
is_directory,
|
||||
size: length,
|
||||
offset: file_offset,
|
||||
attributes,
|
||||
});
|
||||
|
||||
// Descend into subdirectories. Zero-length directory entries exist
|
||||
@@ -260,4 +271,73 @@ mod tests {
|
||||
.expect("read_file on nested path");
|
||||
assert!(!bytes.is_empty(), "nested read returned empty buffer");
|
||||
}
|
||||
|
||||
/// Build a one-node GDFX directory buffer in memory and parse it with
|
||||
/// `collect_entries`, asserting the real on-disc attribute byte is
|
||||
/// forwarded into `VfsEntry.attributes` (with READONLY OR'd in, matching
|
||||
/// canary `disc_image_device.cc:154`) rather than synthesised from the
|
||||
/// path shape.
|
||||
fn parse_single_entry(name: &str, on_disc_attr: u8) -> VfsEntry {
|
||||
// GDFX dirent: node_l(u16) node_r(u16) sector(u32) length(u32)
|
||||
// attributes(u8) name_length(u8) name(bytes). The directory bit
|
||||
// gates subdirectory descent; use length=0 so a "directory" entry
|
||||
// is treated as an empty leaf and we don't recurse off the buffer.
|
||||
let mut buf = Vec::new();
|
||||
buf.extend_from_slice(&0u16.to_le_bytes()); // node_l
|
||||
buf.extend_from_slice(&0u16.to_le_bytes()); // node_r
|
||||
buf.extend_from_slice(&0u32.to_le_bytes()); // sector
|
||||
buf.extend_from_slice(&0u32.to_le_bytes()); // length (0 => leaf)
|
||||
buf.push(on_disc_attr); // attributes
|
||||
buf.push(name.len() as u8); // name_length
|
||||
buf.extend_from_slice(name.as_bytes());
|
||||
|
||||
let mut dev = DiscImageDevice {
|
||||
name: "test".into(),
|
||||
path: std::path::PathBuf::new(),
|
||||
game_offset: 0,
|
||||
entries: Vec::new(),
|
||||
};
|
||||
// `file` is only touched when descending into a non-empty directory;
|
||||
// our length=0 entries never recurse, so a dummy handle is fine.
|
||||
let mut file = std::fs::File::open("/dev/null").expect("open /dev/null");
|
||||
dev.collect_entries(&mut file, &buf, 0, "").expect("parse");
|
||||
assert_eq!(dev.entries.len(), 1);
|
||||
dev.entries.into_iter().next().unwrap()
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn directory_entry_reports_directory_attribute() {
|
||||
// On-disc 0x10 (DIRECTORY) -> attributes carries 0x10 and READONLY.
|
||||
let e = parse_single_entry("dat", FILE_ATTRIBUTE_DIRECTORY);
|
||||
assert!(e.is_directory, "directory bit not decoded");
|
||||
assert_ne!(
|
||||
e.attributes & 0x10,
|
||||
0,
|
||||
"FILE_ATTRIBUTE_DIRECTORY must be set for a directory entry"
|
||||
);
|
||||
assert_ne!(e.attributes & 0x01, 0, "READONLY must be OR'd in (canary)");
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn file_entry_has_no_directory_attribute() {
|
||||
// On-disc 0x80 (NORMAL) -> not a directory; READONLY still OR'd in.
|
||||
let e = parse_single_entry("default.xex", 0x80);
|
||||
assert!(!e.is_directory, "non-directory misdecoded as directory");
|
||||
assert_eq!(
|
||||
e.attributes & 0x10,
|
||||
0,
|
||||
"FILE_ATTRIBUTE_DIRECTORY must be clear for a file entry"
|
||||
);
|
||||
assert_ne!(e.attributes & 0x80, 0, "NORMAL bit must be preserved");
|
||||
assert_ne!(e.attributes & 0x01, 0, "READONLY must be OR'd in (canary)");
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn archive_and_hidden_bits_are_preserved() {
|
||||
// ARCHIVE(0x20) | HIDDEN(0x02) authored on disc must survive intact.
|
||||
let e = parse_single_entry("save.dat", 0x20 | 0x02);
|
||||
assert_eq!(e.attributes & 0x20, 0x20, "ARCHIVE bit dropped");
|
||||
assert_eq!(e.attributes & 0x02, 0x02, "HIDDEN bit dropped");
|
||||
assert_eq!(e.attributes & 0x10, 0, "spurious DIRECTORY bit");
|
||||
}
|
||||
}
|
||||
|
||||
@@ -22,6 +22,16 @@ pub struct VfsEntry {
|
||||
pub is_directory: bool,
|
||||
pub size: u64,
|
||||
pub offset: u64,
|
||||
/// Xbox `FILE_ATTRIBUTE_*` bitmask for this entry, sourced from the
|
||||
/// backing device's real on-disc metadata rather than inferred from
|
||||
/// the path shape. For GDFX disc images this is the on-disc attribute
|
||||
/// byte at dirent offset +12 OR'd with `FILE_ATTRIBUTE_READONLY`
|
||||
/// (matches xenia-canary `disc_image_device.cc:154`:
|
||||
/// `entry->attributes_ = attributes | kFileAttributeReadOnly`).
|
||||
///
|
||||
/// Bit layout (canary `vfs/entry.h:66-76`): READONLY=0x01, HIDDEN=0x02,
|
||||
/// SYSTEM=0x04, DIRECTORY=0x10, ARCHIVE=0x20, NORMAL=0x80.
|
||||
pub attributes: u32,
|
||||
}
|
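The two attribute policies in this diff (forward the on-disc byte with READONLY OR'd in for GDFX, synthesise DIRECTORY/NORMAL for host paths) can be compared side by side. The bit values come from the doc comment above; `gdfx_attributes`, `host_attributes`, and `describe` are hypothetical helper names used only for illustration.

```rust
const FILE_ATTRIBUTE_READONLY: u32 = 0x01;
const FILE_ATTRIBUTE_HIDDEN: u32 = 0x02;
const FILE_ATTRIBUTE_SYSTEM: u32 = 0x04;
const FILE_ATTRIBUTE_DIRECTORY: u32 = 0x10;
const FILE_ATTRIBUTE_ARCHIVE: u32 = 0x20;
const FILE_ATTRIBUTE_NORMAL: u32 = 0x80;

const NAMES: [(u32, &str); 6] = [
    (FILE_ATTRIBUTE_READONLY, "READONLY"),
    (FILE_ATTRIBUTE_HIDDEN, "HIDDEN"),
    (FILE_ATTRIBUTE_SYSTEM, "SYSTEM"),
    (FILE_ATTRIBUTE_DIRECTORY, "DIRECTORY"),
    (FILE_ATTRIBUTE_ARCHIVE, "ARCHIVE"),
    (FILE_ATTRIBUTE_NORMAL, "NORMAL"),
];

/// Disc image policy: keep the authored byte, add READONLY.
fn gdfx_attributes(on_disc: u8) -> u32 {
    on_disc as u32 | FILE_ATTRIBUTE_READONLY
}

/// Host path policy: no attribute byte exists, so pick one of two values.
fn host_attributes(is_dir: bool) -> u32 {
    if is_dir { FILE_ATTRIBUTE_DIRECTORY } else { FILE_ATTRIBUTE_NORMAL }
}

/// List the names of the set bits, lowest bit first.
fn describe(attributes: u32) -> Vec<&'static str> {
    let mut names = Vec::new();
    for &(bit, name) in NAMES.iter() {
        if attributes & bit != 0 {
            names.push(name);
        }
    }
    names
}
```

For example, an on-disc directory byte of `0x10` is reported as `0x11` under the GDFX policy, while a host directory is reported as plain `0x10`.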
||||
|
||||
/// Trait for VFS device implementations (XISO, STFS, host path, etc.)
|
||||
|
||||