[iterate-2E] Extend coherent monotonic clock to lockstep (timebase-desync livelock fix)

Lockstep livelocked the scheduler the same way --parallel did before
0332d19: the kernel deadline-arithmetic (`now_basis_at`) read per-thread
`ctx(hw_id).timebase`, but a parked/poll thread has `running_idx == None`
so `Scheduler::ctx()` returns `idle_ctx` (timebase 0). A poll thread (tid=7,
a `KeWaitForSingleObject` loop with a 30ms relative timeout) computing its
deadline via `parse_timeout` therefore read `now = 0` and registered
`deadline = 0 + 3000 = 3000` — a constant ~7.78M units in the past.
`coord_idle_advance` then re-armed that same constant 3000 deadline forever,
pinning virtual time and starving every other thread's real future deadline.

Render-gate impact: the submitter (tid=6) re-enters a 16ms-timeout
WaitForMultiple after its first jobs. With vtime pinned at 3000, virtual
time never reached that wait's real future deadline, so the timeout never
fired.
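The mechanism above can be illustrated with a toy model. This is a sketch under stated assumptions, not the emulator's scheduler: `Basis`, `Outcome`, `simulate`, the starting clock value, and the submitter's deadline are all hypothetical. Only the 3000-unit relative timeout and the "idle context reads as 0" behavior come from the description above.

```rust
// Toy model of the timebase-desync livelock. Every name and constant here is
// hypothetical except the 3000-unit relative timeout taken from the message.

#[derive(Clone, Copy, PartialEq, Debug)]
pub enum Basis {
    /// A parked thread reads the idle context, whose timebase is 0.
    IdleCtx,
    /// Every caller reads the single monotonic global clock.
    GlobalClock,
}

#[derive(Debug)]
pub struct Outcome {
    /// Deadline picked by each idle-advance round.
    pub targets: Vec<u64>,
    /// Whether the submitter's timeout ever fired.
    pub submitter_fired: bool,
}

/// The "now" a waiting thread sees when it computes a relative deadline.
fn now_basis(basis: Basis, global_clock: u64) -> u64 {
    match basis {
        Basis::IdleCtx => 0,
        Basis::GlobalClock => global_clock,
    }
}

pub fn simulate(basis: Basis, rounds: usize) -> Outcome {
    const POLL_RELATIVE: u64 = 3_000;
    let mut global_clock: u64 = 7_780_000; // assumed "real" now
    let submitter_deadline = global_clock + 10_000; // assumed future deadline
    let mut submitter_pending = true;
    let mut poll_deadline = now_basis(basis, global_clock) + POLL_RELATIVE;
    let mut targets = Vec::with_capacity(rounds);

    for _ in 0..rounds {
        // Idle advance: jump to the earliest pending deadline.
        let target = if submitter_pending {
            poll_deadline.min(submitter_deadline)
        } else {
            poll_deadline
        };
        targets.push(target);
        global_clock = global_clock.max(target);

        if submitter_pending && target == submitter_deadline {
            submitter_pending = false;
        }
        if target == poll_deadline {
            // The poll loop wakes, times out, and immediately waits again.
            poll_deadline = now_basis(basis, global_clock) + POLL_RELATIVE;
        }
    }

    Outcome { targets, submitter_fired: !submitter_pending }
}
```

In this model, `simulate(Basis::IdleCtx, n)` picks 3000 on every round and never reaches the submitter's deadline, while `simulate(Basis::GlobalClock, n)` yields increasing targets and fires the submitter once enough rounds have run.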
Fix (Option A, mirroring the parallel fix): drive the existing deterministic
`Scheduler::global_clock` in lockstep too. It is floored up once per outer
round to `stats.instruction_count`, a pure function of retired guest
instructions with no wall-clock input, and `KernelState::now_basis_at` now
routes through `global_clock()` in BOTH modes. The new floor-up
`Scheduler::advance_global_clock_to(now)` keeps the clock monotone alongside
`advance_all_timebases_to`. Parallel behavior is unchanged (it already read
`global_clock()`).
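A minimal sketch of the floor-up, assuming the clock is a plain `u64` field. The body of `advance_global_clock_to` is not part of the hunk shown on this page, so the implementation below is an assumption. The `Scheduler` here is a hypothetical stand-in that models only the clock, and `end_of_round` is an illustrative helper, not a function from the codebase.

```rust
/// Hypothetical stand-in for the scheduler; only the clock is modeled.
#[derive(Default)]
pub struct Scheduler {
    global_clock: u64,
}

impl Scheduler {
    /// Read the single monotonic clock.
    pub fn global_clock(&self) -> u64 {
        self.global_clock
    }

    /// Floor-up: raise the clock to `now` if `now` is ahead, never lower it.
    /// (Assumed implementation; the real body is not shown in this diff.)
    pub fn advance_global_clock_to(&mut self, now: u64) {
        self.global_clock = self.global_clock.max(now);
    }
}

/// Illustrative helper: what one outer lockstep round might do with the
/// retired-instruction count before any deadline arithmetic reads the clock.
pub fn end_of_round(sched: &mut Scheduler, instruction_count: u64) -> u64 {
    sched.advance_global_clock_to(instruction_count);
    sched.global_clock()
}
```

Because the only input is the retired-instruction count and the update is a `max`, two runs that retire the same instructions in the same rounds see the same clock values, and a smaller argument can never move the clock backwards.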
Verified (lockstep, 50M):
- DETERMINISM: two cold `check -n 5M` and two cold `-n 50M` runs byte-identical.
- LIVELOCK GONE: "advanced to deadline" went from 592,679 fires / 2 unique
values / 562,084 pinned at 3000 -> 18,586 fires / 18,567 unique /
0 pinned, strictly increasing 5.4M -> 50M. Poll thread tid=7 now ends
Blocked with a real future deadline Some(60002824) instead of spin-Ready
on the past 3000.
- imports 1,790,936 -> 92,317 at 50M (the spin no longer burns import calls).
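The fires / unique / pinned figures above can be recomputed from a list of "advanced to deadline" values with something like the following. This is a hedged sketch: `FireStats` and `summarize` are hypothetical names, and it assumes the deadlines have already been parsed out of the trace into a slice, since the log format is not shown here.

```rust
use std::collections::HashSet;

/// Summary of a sequence of "advanced to deadline" values (hypothetical type).
#[derive(Debug, PartialEq)]
pub struct FireStats {
    /// Total number of idle-advance fires.
    pub fires: usize,
    /// Number of distinct deadline values seen.
    pub unique: usize,
    /// Number of fires whose deadline equals the suspected pinned value.
    pub pinned: usize,
    /// True if the sequence never moves backwards.
    pub non_decreasing: bool,
}

pub fn summarize(deadlines: &[u64], pinned_value: u64) -> FireStats {
    let unique = deadlines.iter().copied().collect::<HashSet<u64>>().len();
    let pinned = deadlines.iter().filter(|&&d| d == pinned_value).count();
    let non_decreasing = deadlines.windows(2).all(|w| w[0] <= w[1]);
    FireStats {
        fires: deadlines.len(),
        unique,
        pinned,
        non_decreasing,
    }
}
```

A livelocked trace shows up as a large `pinned` count with very few `unique` values; a healthy one has `pinned == 0` and `unique` close to `fires`.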
Cascade (lockstep, XENIA_CACHE_PERSIST=1, -n 200M): engine now runs to budget
instead of hard-deadlocking. Hub enqueue (sub_82458068) 4x; submitter dequeue
(sub_82458508) still 3x — the lost 4th-job HANDOFF (count/notify between
sub_82458068's tail and the submitter queue) is a SEPARATE downstream gate,
not the timebase. New gate: tid=5 (hub) Blocked INFINITE on event 0x1080
(job-4 completion); tid=6 (submitter) Ready, parked in WaitForMultiple
(sub_824AB214), loop-top stops at cycle 6.23M. draws still 0, VdSwap 1.

Golden re-baseline (same commit): sylpheed_n50m
instructions 50000004 -> 50000007, imports 1790936 -> 92317
(swaps/draws/RTs/shaders/textures unchanged). sylpheed_n2m unchanged
(the livelock sets in after 2M). Suite 665/665 + oracle green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@@ -1295,24 +1295,28 @@ impl KernelState {
         self.pending_timer_fires.first().map(|&(d, _)| d)
     }
 
-    /// Coherent "now" basis for deadline arithmetic, gated on execution mode.
+    /// Coherent "now" basis for deadline arithmetic — the scheduler's
+    /// single monotonic `global_clock`, in BOTH execution modes.
     ///
-    /// In **lockstep** (`parallel_active == false`) this returns exactly the
-    /// pre-existing per-thread `ctx(hw_id).timebase` each call site read
-    /// before, so the deterministic lockstep trace is byte-identical (no
-    /// golden re-baseline). In **parallel** (`parallel_active == true`) the
-    /// per-thread timebases are incoherent (workers extract/zero their slots
-    /// while stepping unlocked), so we return the scheduler's single
-    /// monotonic `global_clock` instead — the basis that breaks the
-    /// timebase-desync livelock. Callers pass the `hw_id` they would have
-    /// used for the lockstep `ctx()` read (slot 0 for coordinator-side
-    /// drains, the current thread's slot for in-guest waits).
-    pub fn now_basis_at(&self, hw_id: u8) -> u64 {
-        if self.parallel_active {
-            self.scheduler.global_clock()
-        } else {
-            self.scheduler.ctx(hw_id).timebase
-        }
+    /// Per-thread `ctx(hw_id).timebase` is NOT a sound "now" for deadline
+    /// arithmetic: in `--parallel` workers extract/zero their slots while
+    /// stepping unlocked, and in **lockstep** a parked/poll thread has
+    /// `running_idx == None` so `ctx()` returns `idle_ctx` (timebase 0).
+    /// Either way a `parse_timeout` reading the per-thread basis can see 0
+    /// (or a stale value) and register `deadline = 0 + relative`, a value
+    /// permanently in the past, which `coord_idle_advance` then re-arms
+    /// forever (the timebase-desync livelock; the render-gate root). The
+    /// `global_clock` is a deterministic function of retired guest
+    /// instructions (per-round `stats.instruction_count` floor-ups in
+    /// lockstep, per-block retired counts in parallel), so it is coherent,
+    /// monotonic, never zero after boot, and bit-reproducible across two
+    /// cold lockstep runs.
+    ///
+    /// The `hw_id` argument is retained for call-site clarity (which slot a
+    /// caller would conceptually be "asking about") but is no longer read —
+    /// the basis is global.
+    pub fn now_basis_at(&self, _hw_id: u8) -> u64 {
+        self.scheduler.global_clock()
     }
 
     /// Fire every timer whose deadline is `<= now` (derived from slot 0's