fix(kernel): KRNBUG-D08 — wall-clock v-sync under --parallel
The synthetic v-sync ticker used a per-instruction proxy (VSYNC_INSTR_PERIOD = 150 k) tuned for ~10 MIPS lockstep throughput → 60 Hz. Audit M11 observed this drifts under `--parallel`: with 6 worker threads sharing the kernel mutex, the dispatcher executes more PPC instructions per tick callback, so the accumulator never crosses 150 k. Result: ~629 v-syncs/100M lockstep → ~2 v-syncs/100M --parallel.

Hybrid solution preserves lockstep determinism (which the goldens depend on) while fixing --parallel:

* `tick_vsync_instr(instr_count)`: legacy instruction-count ticker, used by lockstep. Bit-stable across runs.
* `tick_vsync_wallclock()`: new Instant-based ticker. Fires `floor(elapsed / VSYNC_PERIOD)` v-syncs since the anchor and advances the anchor by that many full periods (no lazy backlog). Capped at INTERRUPT_QUEUE_CAP per call so a forward-jumping clock can't overflow the FIFO.
* `KernelState.parallel_active` flag set at startup from `--parallel` / `XENIA_PARALLEL=1`. Read by `coord_pre_round` in main.rs to choose between the two tickers.

Verification:

* cargo test --workspace --release: 561 passing (+3 new wall-clock tests vs prior 558 baseline).
* lockstep -n 100M --stable-digest: BIT-IDENTICAL to pre-Phase-3 baseline. interrupts_delivered preserved at ~630 (was ~629 pre-fix).
* --parallel --reservations-table -n 30M: interrupts_delivered rose from ~2 to 17. (FIFO INTERRUPT_QUEUE_CAP=4 still caps burst delivery; that's a separate bottleneck, addressed by raising cap when --parallel queue depth becomes the next blocker.)

Trade-off: --parallel runs are non-deterministic at the v-sync rate by design (per audit M05 PPCBUG-703 already). Lockstep stays bit-identical, so the `sylpheed_n*m.json` goldens are untouched.

Audit IDs: KRNBUG-D08 (closed).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
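The wall-clock arithmetic described in the message (fire `floor(elapsed / VSYNC_PERIOD)` v-syncs, advance the anchor by whole periods, cap per call) can be sketched as below. This is a minimal illustration, not the code from this commit: the `WallclockVsync` type, the `tick_at` method, and the `VSYNC_PERIOD` value are assumptions, and the sketch returns a count, whereas the call site in the diff uses the result of `tick_vsync_wallclock()` as a boolean. Passing `now` in explicitly is a choice made here so the behaviour can be checked without sleeping.

```rust
use std::time::{Duration, Instant};

// Assumed values: the commit message gives INTERRUPT_QUEUE_CAP=4 and a 60 Hz
// target, but the actual constant definitions are not part of this diff.
const VSYNC_PERIOD: Duration = Duration::from_nanos(16_666_667);
const INTERRUPT_QUEUE_CAP: u32 = 4;

/// Hypothetical stand-in for the wall-clock part of the interrupt state.
pub struct WallclockVsync {
    anchor: Instant,
}

impl WallclockVsync {
    pub fn new(now: Instant) -> Self {
        Self { anchor: now }
    }

    /// Returns the number of v-syncs to deliver at `now`, capped per call.
    pub fn tick_at(&mut self, now: Instant) -> u32 {
        // Yields zero instead of panicking if `now` is earlier than the anchor.
        let elapsed = now.saturating_duration_since(self.anchor);
        let period_ns = VSYNC_PERIOD.as_nanos();
        let periods = elapsed.as_nanos() / period_ns;
        if periods == 0 {
            return 0;
        }
        // Advance the anchor by exactly `periods` whole periods. Writing it as
        // `now - remainder` avoids multiplying a large period count, and the
        // fractional remainder still carries over to the next call.
        let remainder_ns = (elapsed.as_nanos() % period_ns) as u64;
        self.anchor = now - Duration::from_nanos(remainder_ns);
        // Periods beyond the cap are dropped rather than queued (no backlog).
        periods.min(u128::from(INTERRUPT_QUEUE_CAP)) as u32
    }
}
```

With these assumed values, a call made 2.5 periods after the anchor reports 2 v-syncs and keeps the half-period remainder, while a call made 100 periods late reports only 4 and discards the rest.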
@@ -786,6 +786,7 @@ fn cmd_exec_inner(
         v == "1" || v == "true" || v == "yes"
     });
     let parallel_active = parallel || parallel_via_env;
+    kernel.parallel_active = parallel_active;
     if reservations_table || reservations_via_env || parallel_active {
         kernel.reservations.enable();
         if !quiet {
@@ -1517,7 +1518,27 @@ fn coord_pre_round(
         );
     }
 
-    if kernel.interrupts.tick_vsync(stats.instruction_count) {
+    // KRNBUG-D08: backend-aware v-sync ticker.
+    //
+    // **Lockstep**: instruction-count ticker (deterministic; one tick per
+    // PPC block boundary, predictable cadence). The cadence drifts a bit
+    // from real 60 Hz but is bit-stable across runs, which matters for
+    // the `sylpheed_n*m.json` golden oracles.
+    //
+    // **--parallel**: wall-clock ticker. The instruction-count proxy
+    // dropped from 629 v-syncs/100M lockstep to ~2 under `--parallel`
+    // (audit M11) because the dispatcher executes more PPC instructions
+    // per tick callback when 6 worker threads share the kernel mutex,
+    // so the accumulator never crosses the 150k threshold. Wall-clock
+    // restores the ~60 Hz rate at the cost of bit-exact run reproducibility,
+    // which is acceptable under `--parallel` (M11 already documented
+    // `--parallel` as non-deterministic by design).
+    let fired = if kernel.parallel_active {
+        kernel.interrupts.tick_vsync_wallclock()
+    } else {
+        kernel.interrupts.tick_vsync_instr(stats.instruction_count)
+    };
+    if fired {
         use std::sync::atomic::Ordering;
         let mmio = kernel.gpu.mmio();
         let prev = mmio.d1mode_vblank_vline_status.load(Ordering::Relaxed);
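The body of `tick_vsync_instr` is not part of this diff. As a rough illustration of why an instruction-count ticker is bit-stable across runs, the sketch below fires whenever the cumulative instruction count has advanced by at least one period since the last fire. The `InstrVsync` type, its `last_fire` field, and the threshold comparison are assumptions made for the example; only the 150 k period comes from the commit message.

```rust
// Period from the commit message (VSYNC_INSTR_PERIOD = 150 k).
const VSYNC_INSTR_PERIOD: u64 = 150_000;

/// Hypothetical instruction-count ticker; the real implementation may differ.
pub struct InstrVsync {
    last_fire: u64,
}

impl InstrVsync {
    pub fn new() -> Self {
        Self { last_fire: 0 }
    }

    /// `instr_count` is the cumulative number of instructions executed so far.
    /// The result depends only on the sequence of counts passed in, so two
    /// runs that retire instructions in the same order see identical ticks.
    pub fn tick(&mut self, instr_count: u64) -> bool {
        if instr_count.saturating_sub(self.last_fire) >= VSYNC_INSTR_PERIOD {
            self.last_fire = instr_count;
            true
        } else {
            false
        }
    }
}
```

Under this scheme a 100M-instruction run can fire at most 100,000,000 / 150,000 = 666 times, the same order of magnitude as the ~629 v-syncs reported for lockstep.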