[iterate-2U] VdGlobalDevice: allocate a real device cell so the swap counter (clock B) can advance

Sylpheed's title loop re-runs its per-frame manager update sub_821741C8
only when "clock B" ([controller+88], the swap count) changes. Clock B's
sole source is the CP swap-complete callback sub_824CE2B8, which bumps
[gfx+15160] via the two-level deref [[VdGlobalDevice]+0]+15160, where
VdGlobalDevice is the kernel variable export 0x01BE, imported at guest
.data 0x82000750.

Our implementation patched that import slot with a literal 0 (the old
"passed through to Vd* shims, write 0" behaviour). Consequences, both
confirmed at runtime:
  * the guest's graphics init stores its D3D device object via
    `stw r31, 0([0x82000750])` (sub_824C6DC0 @0x824C6F18); with the slot
    holding 0, that store lands at address 0;
  * the swap callback reads [[0x82000750]] = [0] = 0 and increments
    [0+15160] (the null page) instead of the real device's swap counter.
So [gfx+15160] never moved, clock B stayed frozen at 0, sub_821741C8
fired exactly once, and the game submitted one render batch (the 78-draw
splash) and then stalled.

The fix mirrors xenia-canary's RegisterVideoExports
(xboxkrnl_video.cc:557-564) exactly: allocate a 4-byte cell, point the
import slot at it, and zero the cell. The guest then stores its device
into the cell, and the callback's two-level deref resolves correctly.
Verified: [0x82000750] now holds a real cell whose [+0] is the device
(gfx state), the swap callback bumps [gfx+15160] 0->1, clock B advances,
and the per-frame chain steps forward (sub_821741C8 fires 1->2x, GamePart
update sub_821C7CB8 0->1x).
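
The two-level deref can be illustrated with a toy model of guest memory. This is only a sketch under stated assumptions, not the emulator's code: `Mem`, `run`, the small stand-in addresses, and the modelling of the null page as "reads return 0, writes are dropped" are all hypothetical. Only the offset 15160 and the shape of the deref come from the description above.

```rust
// Toy big-endian guest memory. ASSUMPTION: the null page reads as zero and
// silently drops writes; the real emulator's behaviour may differ.
const NULL_PAGE_END: u32 = 0x1_0000;
const SLOT: u32 = 0x2_0000; // hypothetical stand-in for the import slot 0x82000750
const CELL: u32 = 0x2_0100; // hypothetical stand-in for the allocated 4-byte cell
const DEVICE: u32 = 0x3_0000; // hypothetical stand-in for the guest's device object
const SWAP_OFF: u32 = 15160; // swap-counter offset named in the commit message

struct Mem {
    bytes: Vec<u8>,
}

impl Mem {
    fn read_u32(&self, addr: u32) -> u32 {
        if addr < NULL_PAGE_END {
            return 0;
        }
        let a = addr as usize;
        u32::from_be_bytes([
            self.bytes[a],
            self.bytes[a + 1],
            self.bytes[a + 2],
            self.bytes[a + 3],
        ])
    }

    fn write_u32(&mut self, addr: u32, val: u32) {
        if addr < NULL_PAGE_END {
            return; // write into the null page is lost
        }
        let a = addr as usize;
        self.bytes[a..a + 4].copy_from_slice(&val.to_be_bytes());
    }
}

// Models the guest's graphics init: `stw r31, 0([slot])`.
fn guest_init(mem: &mut Mem) {
    let cell = mem.read_u32(SLOT);
    mem.write_u32(cell, DEVICE);
}

// Models the swap-complete callback: increment [[[slot]+0]+15160].
fn swap_complete(mem: &mut Mem) {
    let cell = mem.read_u32(SLOT);
    let device = mem.read_u32(cell);
    let counter = device + SWAP_OFF;
    let n = mem.read_u32(counter);
    mem.write_u32(counter, n + 1);
}

// Patches the slot with `slot_value`, runs init plus one swap, and returns
// the real device's swap counter afterwards.
fn run(slot_value: u32) -> u32 {
    let mut mem = Mem { bytes: vec![0u8; 0x4_0000] };
    mem.write_u32(SLOT, slot_value);
    guest_init(&mut mem);
    swap_complete(&mut mem);
    mem.read_u32(DEVICE + SWAP_OFF)
}
```

In this model `run(0)` (the old behaviour) returns 0 because both the store and the increment land in the null page, while `run(CELL)` (the slot pointing at a zeroed cell) returns 1.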

Determinism: the --gpu-inline digest is re-baselined and byte-identical
across runs. The fix shifts the early execution trajectory (clock B
unfreezing), so the n50m golden moves: imports 451500->178937 and
instructions 50000001->50000014; draws/swaps/RTs/shaders are unchanged
(78/4/2/3). The n2m golden is unchanged (early boot, before the fix takes
effect). 675 workspace tests green; sylpheed_n50m oracle green.

Note: this breaks only the FIRST hard blocker (clock B could never advance
at all). Sustaining the per-frame chain (draws past 78) needs a further
step: each GamePart update must submit a per-frame command buffer (with
PM4_INTERRUPT) during the asset-streaming phase to keep generating CP
interrupts. Our implementation currently produces only the single seed
interrupt from the initial batch, so the chain advances once and
re-stalls. Tracked for the next iterate.
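
The gating on clock B, and why a single seed interrupt is not enough, can be sketched as follows. This is a hypothetical model, not the guest's actual logic: `TitleLoop`, `poll`, and `count_updates` are invented names, and the assumption that the update fires on the first pass and then only when the observed swap count changes is inferred from the fire counts reported above.

```rust
// Hypothetical model of the title loop's gate on the swap count (clock B).
struct TitleLoop {
    last_seen: Option<u32>, // last swap count the loop acted on
    updates: u32,           // how many times the per-frame update fired
}

impl TitleLoop {
    fn new() -> Self {
        TitleLoop { last_seen: None, updates: 0 }
    }

    // Called once per loop iteration with the current swap count.
    // Fires the update only if the count differs from the last one seen.
    fn poll(&mut self, swap_count: u32) -> bool {
        if self.last_seen == Some(swap_count) {
            return false; // clock B did not move: skip the update
        }
        self.last_seen = Some(swap_count);
        self.updates += 1;
        true
    }
}

// Feeds a sequence of observed swap counts through the gate and returns
// the number of updates that fired.
fn count_updates(swap_counts: &[u32]) -> u32 {
    let mut title_loop = TitleLoop::new();
    for &count in swap_counts {
        title_loop.poll(count);
    }
    title_loop.updates
}
```

Under this model a frozen counter (`[0, 0, 0, 0]`) yields 1 update, a single bump (`[0, 1, 1, 1]`) yields 2, and only a counter that keeps advancing (`[0, 1, 2, 3]`) yields an update per iteration, which is what a per-frame interrupt would be needed for.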

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@@ -1540,8 +1540,19 @@ fn cmd_exec_inner(
             mem.write_u32(addr, block);
         }
         ("xboxkrnl.exe", 0x01BE) => {
-            // VdGlobalDevice — passed through to Vd* shims. Write 0.
-            mem.write_u32(addr, 0);
+            // VdGlobalDevice — a *pointer to* a global D3D-device cell.
+            // Mirror xenia-canary RegisterVideoExports (xboxkrnl_video.cc:
+            // 557-564): allocate a 4-byte cell, point the import slot at
+            // it, and zero the cell. The guest's graphics init then stores
+            // its device object INTO the cell (e.g. sub_824C6DC0 @
+            // 0x824C6F18 `stw r31, 0([0x82000750])`), and the swap-complete
+            // callback sub_824CE2B8 reads it back via the two-level
+            // `[[VdGlobalDevice]+0]+15160` to bump the swap counter (clock
+            // B). Writing 0 directly here (the old behaviour) made that
+            // store land at address 0 and the swap counter never advance —
+            // freezing the title-loop's per-frame manager update.
+            let cell = alloc_zero(0x4, &mut mem, &mut kernel);
+            mem.write_u32(addr, cell);
         }
         ("xboxkrnl.exe", 0x01C0) => {
             // VdGpuClockInMHz
@@ -1,6 +1,6 @@
 {
-  "instructions": 50000001,
-  "imports": 451500,
+  "instructions": 50000014,
+  "imports": 178937,
   "unimpl": 0,
   "draws": 78,
   "swaps": 4,