xenia-cpu: VMX128, FPSCR, decoder split, scheduler, decode/block caches

Split the monolithic interpreter into cohesive modules: dedicated
decoder (decoder.rs) producing 8-byte DecodedInstr; opcode tables
(opcode.rs); explicit traps (trap.rs); FPSCR helpers (fpscr.rs);
overflow/carry helpers (overflow.rs); a 4 KiB-page-versioned decode
cache and basic-block cache (block_cache.rs); and a full VMX/VMX128
implementation (vmx.rs) covering AltiVec + Xenon's 128-bit extensions.
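
As a rough illustration of what "page-versioned" could mean for the decode cache, here is a minimal sketch. It assumes each 4 KiB page carries a version counter that is bumped on guest writes, and that every cached entry remembers the version it was decoded under. All names (`DecodeCache`, `Decoded`, `note_write`, `get_or_decode`) are hypothetical and are not taken from block_cache.rs.

```rust
use std::collections::HashMap;

/// 4 KiB pages: the page index is the address shifted right by 12.
const PAGE_SHIFT: u32 = 12;

/// Hypothetical stand-in for the 8-byte decoded instruction.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Decoded(pub u64);

/// Sketch of a decode cache whose entries are validated against a
/// per-page version counter (an assumption, not the real layout).
#[derive(Default)]
pub struct DecodeCache {
    /// page index -> number of writes observed to that page
    page_versions: HashMap<u32, u64>,
    /// guest address -> (page version at decode time, decoded instruction)
    entries: HashMap<u32, (u64, Decoded)>,
}

impl DecodeCache {
    fn version_of(&self, addr: u32) -> u64 {
        self.page_versions
            .get(&(addr >> PAGE_SHIFT))
            .copied()
            .unwrap_or(0)
    }

    /// Record a guest write: every entry decoded from this page
    /// becomes stale without having to walk the entry map.
    pub fn note_write(&mut self, addr: u32) {
        *self.page_versions.entry(addr >> PAGE_SHIFT).or_insert(0) += 1;
    }

    /// Return the cached decode if its page version is still current,
    /// otherwise decode again and refresh the entry.
    pub fn get_or_decode(
        &mut self,
        addr: u32,
        decode: impl FnOnce(u32) -> Decoded,
    ) -> Decoded {
        let current = self.version_of(addr);
        if let Some(&(version, instr)) = self.entries.get(&addr) {
            if version == current {
                return instr;
            }
        }
        let instr = decode(addr);
        self.entries.insert(addr, (current, instr));
        instr
    }
}
```

In this sketch a write anywhere in a page invalidates all cached decodes for that page lazily, at the cost of one version comparison per lookup.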

Add the parallel-execution substrate behind --parallel: a 7-party
phaser (phaser.rs) for round-based barrier sync, ReservationTable
(reservation.rs) for guest LL/SC, and the per-HW-thread scheduler
core (scheduler.rs) that owns ThreadRefs, runqueues, and pending IRQs.
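
To make the guest LL/SC idea concrete, the following is a minimal single-threaded sketch of a reservation table, not the code in reservation.rs. It assumes one reservation slot per hardware thread (six on Xenon), a 128-byte reservation granule, and that any store to a granule clears other threads' reservations on it. The type name `ReservationSketch` and all method names are hypothetical; a real table shared between host threads would also need atomics or a lock.

```rust
/// Assumed number of guest hardware threads (3 cores x 2 threads).
const HW_THREADS: usize = 6;
/// Assumed reservation granule of 128 bytes.
const GRANULE_MASK: u32 = !127;

/// One optional reserved granule per hardware thread.
#[derive(Default)]
pub struct ReservationSketch {
    slots: [Option<u32>; HW_THREADS],
}

impl ReservationSketch {
    /// Load-and-reserve (the `lwarx` side): remember the granule.
    pub fn load_reserve(&mut self, thread: usize, addr: u32) {
        self.slots[thread] = Some(addr & GRANULE_MASK);
    }

    /// Store-conditional (the `stwcx.` side): succeeds only if this
    /// thread still holds a reservation on the same granule. The
    /// reservation is consumed whether or not the store succeeds.
    pub fn store_conditional(&mut self, thread: usize, addr: u32) -> bool {
        let granule = addr & GRANULE_MASK;
        let ok = self.slots[thread] == Some(granule);
        self.slots[thread] = None;
        if ok {
            self.invalidate_others(thread, granule);
        }
        ok
    }

    /// An ordinary store also kills other threads' reservations on
    /// the granule it touches.
    pub fn plain_store(&mut self, thread: usize, addr: u32) {
        self.invalidate_others(thread, addr & GRANULE_MASK);
    }

    fn invalidate_others(&mut self, thread: usize, granule: u32) {
        for (i, slot) in self.slots.iter_mut().enumerate() {
            if i != thread && *slot == Some(granule) {
                *slot = None;
            }
        }
    }
}
```

With two threads reserving the same granule, only the first store-conditional succeeds in this sketch; the second thread has to retry its load-and-reserve loop.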

Disassembler is now the single source of truth: disasm.rs gains the
full base + extended + VMX128 mnemonic set, with golden JSON fixtures
and a disasm_goldens test suite. Add a criterion-style interpreter
bench. context.rs grows the per-thread state the new modules need
(reservation slot, FPSCR, vector regs).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
MechaCat02
2026-05-01 16:27:43 +02:00
parent e9b2b57a44
commit c36cca14f9
20 changed files with 12284 additions and 458 deletions


@@ -145,6 +145,33 @@ impl PpcOpcode {
        matches!(self, Self::sc)
    }
    /// Returns true if this opcode unconditionally ends a basic block:
    /// any branch, system call, trap, or `Invalid` (the decoder couldn't
    /// recognize the instruction, so execution will hit the
    /// `Unimplemented` arm and we don't want to swallow that boundary
    /// inside a cached block).
    ///
    /// Notably *not* terminating: `mtmsr`/`mtmsrd`/`isync`/`sync`/`mfmsr`.
    /// On real hardware these have synchronization semantics (a
    /// context-synchronizing event for `isync`, an MSR rewrite for the
    /// `mt*`s), but our interpreter has no asynchronous-exception model
    /// and no out-of-order execution: they execute as plain ALU/move
    /// ops and never redirect control flow. Block-cache replay is
    /// therefore still bit-for-bit identical to per-instruction
    /// dispatch for those.
    ///
    /// Used by the basic-block cache (`block_cache.rs`) to know when to
    /// stop accumulating instructions during a forward decode walk.
    pub fn terminates_block(&self) -> bool {
        matches!(
            self,
            Self::bx | Self::bcx | Self::bclrx | Self::bcctrx
                | Self::sc
                | Self::td | Self::tdi | Self::tw | Self::twi
                | Self::Invalid
        )
    }
    /// Returns true if this is a load instruction.
    pub fn is_load(&self) -> bool {
        matches!(self,
@@ -194,3 +221,60 @@ impl std::fmt::Display for PpcOpcode {
        std::fmt::Debug::fmt(self, f)
    }
}
#[cfg(test)]
mod tests {
    use super::*;
    #[test]
    fn terminates_block_includes_all_branches() {
        assert!(PpcOpcode::bx.terminates_block());
        assert!(PpcOpcode::bcx.terminates_block());
        assert!(PpcOpcode::bclrx.terminates_block());
        assert!(PpcOpcode::bcctrx.terminates_block());
    }
    #[test]
    fn terminates_block_includes_sc_and_traps() {
        assert!(PpcOpcode::sc.terminates_block());
        assert!(PpcOpcode::td.terminates_block());
        assert!(PpcOpcode::tdi.terminates_block());
        assert!(PpcOpcode::tw.terminates_block());
        assert!(PpcOpcode::twi.terminates_block());
    }
    #[test]
    fn terminates_block_includes_invalid() {
        // Decoder failure must end the block; otherwise an unknown
        // opcode would be replayed inside a cached block without going
        // through the per-instruction Unimplemented path.
        assert!(PpcOpcode::Invalid.terminates_block());
    }
    #[test]
    fn terminates_block_excludes_straight_line_ops() {
        // Common ALU and load/store ops must NOT terminate a block.
        assert!(!PpcOpcode::addi.terminates_block());
        assert!(!PpcOpcode::addis.terminates_block());
        assert!(!PpcOpcode::addx.terminates_block());
        assert!(!PpcOpcode::cmpi.terminates_block());
        assert!(!PpcOpcode::cmp.terminates_block());
        assert!(!PpcOpcode::lwz.terminates_block());
        assert!(!PpcOpcode::stw.terminates_block());
        assert!(!PpcOpcode::lbzx.terminates_block());
        assert!(!PpcOpcode::ori.terminates_block());
        assert!(!PpcOpcode::oris.terminates_block());
        assert!(!PpcOpcode::rlwinmx.terminates_block());
    }
    #[test]
    fn terminates_block_excludes_msr_and_sync_ops() {
        // Documented decision: synchronizing ops execute as ALU within
        // a block since the interpreter has no async-exception model.
        assert!(!PpcOpcode::mtmsr.terminates_block());
        assert!(!PpcOpcode::mtmsrd.terminates_block());
        assert!(!PpcOpcode::isync.terminates_block());
        assert!(!PpcOpcode::sync.terminates_block());
        assert!(!PpcOpcode::mfmsr.terminates_block());
    }
}
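
The doc comment on `terminates_block` says the block cache uses it to stop a forward decode walk. A minimal sketch of such a walk follows. It uses a tiny hypothetical `Op` enum in place of `PpcOpcode`, and it assumes that the terminating instruction is kept as the last element of the block and that blocks are capped at a caller-supplied `max_len`; neither assumption is confirmed by this diff.

```rust
/// Hypothetical miniature opcode set standing in for `PpcOpcode`.
#[allow(non_camel_case_types, dead_code)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Op {
    addi,
    lwz,
    mtmsr,
    bx,
    sc,
    Invalid,
}

impl Op {
    /// Mirrors the shape of `PpcOpcode::terminates_block` for the
    /// subset above: branches, system calls, and undecodable words.
    pub fn terminates_block(self) -> bool {
        matches!(self, Op::bx | Op::sc | Op::Invalid)
    }
}

/// Walk forward from `start`, accumulating instructions until a
/// terminator has been consumed, `max_len` is reached, or the code
/// runs out. The terminator is included in the returned block.
pub fn build_block(code: &[Op], start: usize, max_len: usize) -> &[Op] {
    let start = start.min(code.len());
    let mut end = start;
    while end < code.len() && end - start < max_len {
        let op = code[end];
        end += 1;
        if op.terminates_block() {
            break;
        }
    }
    &code[start..end]
}
```

Note that `mtmsr` sits in the middle of a block here, matching the documented decision that synchronizing ops do not end a block.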