Xbox 360 CPU
The Xbox 360 CPU (codenamed “Xenon”) is a custom IBM PowerPC processor:
- Architecture: PowerPC 64-bit (running in 32-bit mode)
- Cores: 3 cores, each with 2 hardware threads (6 logical processors)
- Clock Speed: 3.2 GHz
- Instruction Set: PowerPC with AltiVec (VMX) and VMX128 extensions
- Registers: 64-bit general purpose registers and 128-bit vector registers
JIT Translation Pipeline
The JIT translates PowerPC code through three main phases:
Phase 1: Translation to HIR
PowerPC instructions are translated to Xenia’s High-level Intermediate Representation (HIR):
- Scanner (src/xenia/cpu/ppc/ppc_scanner.h) - Analyzes code to find basic blocks and functions
- HIR Builder (src/xenia/cpu/ppc/ppc_hir_builder.h) - Constructs HIR from PowerPC instructions
- Emitters (src/xenia/cpu/ppc/ppc_emit_*.cc) - Per-instruction translation logic
  - ppc_emit_control.cc - Branch, call, and control flow instructions
  - ppc_emit_alu.cc - Integer arithmetic and logical operations
  - ppc_emit_fpu.cc - Floating-point operations
  - ppc_emit_altivec.cc - Vector (AltiVec/VMX) instructions
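As a rough illustration of the scanner's job, the sketch below splits a run of PowerPC instruction words into basic blocks by looking for branch instructions. It is a toy, not Xenia's scanner: the names PrimaryOpcode, EndsBasicBlock, BlockRange, and ScanBlocks are hypothetical, the words are assumed to be already byte-swapped into host order, and branch targets (which also start blocks in a real scanner) are ignored.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// The primary opcode is the top 6 bits of every 32-bit PowerPC instruction.
inline uint32_t PrimaryOpcode(uint32_t insn) { return insn >> 26; }

// True for the branch forms that terminate a basic block in this toy model.
inline bool EndsBasicBlock(uint32_t insn) {
  switch (PrimaryOpcode(insn)) {
    case 16:  // bc (conditional branch)
    case 18:  // b  (unconditional branch)
      return true;
    case 19: {
      // Opcode 19 is shared; the 10-bit extended opcode selects the form.
      uint32_t extended = (insn >> 1) & 0x3FF;
      return extended == 16      // bclr  (includes blr)
             || extended == 528; // bcctr (includes bctr)
    }
    default:
      return false;
  }
}

// Inclusive range of instruction indices that form one block.
struct BlockRange {
  size_t first;
  size_t last;
};

std::vector<BlockRange> ScanBlocks(const std::vector<uint32_t>& code) {
  std::vector<BlockRange> blocks;
  size_t start = 0;
  for (size_t i = 0; i < code.size(); ++i) {
    if (EndsBasicBlock(code[i])) {
      blocks.push_back({start, i});
      start = i + 1;
    }
  }
  // Trailing instructions without a terminating branch form a final block.
  if (start < code.size()) {
    blocks.push_back({start, code.size() - 1});
  }
  // Non-empty input always yields at least one block.
  assert(blocks.empty() == code.empty());
  return blocks;
}
```

For instance, the sequence addi r3, r1, 16; blr; li r3, 0; blr (0x38610010, 0x4E800020, 0x38600000, 0x4E800020) scans into two blocks of two instructions each.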
Phase 2: HIR Optimization
The HIR then runs through a series of optimization passes.
Pass Order (from src/xenia/cpu/ppc/ppc_translator.cc):
- Control Flow Analysis - Builds control flow graph (CFG)
- Control Flow Simplification - Merges blocks and removes dead branches
- Context Promotion - Promotes frequently-used context values to registers
- Simplification + Constant Propagation (loop until no changes)
  - Simplifies expressions and eliminates redundant operations
  - Propagates constants through expressions
- Memory Sequence Combination - Combines adjacent loads/stores
- Dead Code Elimination - Removes unused instructions
- Value Reduction - Simplifies value representations
Key optimizations in more detail:
- Context Promotion - PowerPC registers are stored in a context structure. This pass promotes hot registers to x64 registers, avoiding memory loads/stores.
- Constant Propagation - Detects compile-time constants and folds them into instructions
- Dead Store Elimination - Removes writes to memory/registers that are never read
Pass implementations live in src/xenia/cpu/compiler/passes/ with descriptive names.
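The "loop until no changes" idea can be sketched on a toy IR. Everything here is hypothetical (Op, Instr, Block, PropagateConstants, EliminateDeadCode, Optimize) and far simpler than HIR; the sketch pairs constant propagation with dead code elimination only to show a fixed-point loop, which is not necessarily how Xenia groups its passes.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <unordered_map>
#include <unordered_set>
#include <vector>

enum class Op { kConst, kAdd, kStore };

struct Instr {
  Op op;
  int dest;     // value id defined by this instruction, or -1
  int a, b;     // operand value ids, or -1
  int64_t imm;  // payload for kConst
};

using Block = std::vector<Instr>;

// Folds adds whose operands are both known constants. Returns true on change.
bool PropagateConstants(Block& block) {
  std::unordered_map<int, int64_t> known;
  bool changed = false;
  for (Instr& i : block) {
    if (i.op == Op::kAdd && known.count(i.a) && known.count(i.b)) {
      i = Instr{Op::kConst, i.dest, -1, -1, known[i.a] + known[i.b]};
      changed = true;
    }
    if (i.op == Op::kConst) {
      assert(i.dest >= 0);  // a constant must define a value
      known[i.dest] = i.imm;
    }
  }
  return changed;
}

// Removes instructions whose result is never read. Stores have side effects
// and are always kept. Returns true on change.
bool EliminateDeadCode(Block& block) {
  std::unordered_set<int> used;
  for (const Instr& i : block) {
    if (i.a >= 0) used.insert(i.a);
    if (i.b >= 0) used.insert(i.b);
  }
  const size_t before = block.size();
  block.erase(std::remove_if(block.begin(), block.end(),
                             [&](const Instr& i) {
                               return i.op != Op::kStore && !used.count(i.dest);
                             }),
              block.end());
  return block.size() != before;
}

// Runs both passes repeatedly until neither reports a change.
void Optimize(Block& block) {
  bool changed;
  do {
    changed = PropagateConstants(block);
    changed |= EliminateDeadCode(block);
  } while (changed);
}
```

Given v0 = 2, v1 = 3, v2 = v0 + v1, v3 = v2 + v0, store v3, the loop folds both adds and drops the three values that are no longer read, leaving a single constant feeding the store.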
Phase 3: Backend Code Generation
The x64 backend consumes HIR and emits native machine code:
- x64 Backend (src/xenia/cpu/backend/x64/x64_backend.cc) - Main backend implementation
- x64 Sequences (src/xenia/cpu/backend/x64/x64_sequences.cc) - HIR to x64 instruction sequences
- x64 Emitter (src/xenia/cpu/backend/x64/x64_emitter.cc) - Generates actual x64 machine code
- Code Cache (src/xenia/cpu/backend/x64/x64_code_cache.cc) - Stores compiled code
x64 ABI and Register Mapping
Xenia guest functions cannot be called directly from host code. Calls transition through a thunk that sets up the guest execution environment.
Transition Thunks
Defined in src/xenia/cpu/backend/x64/x64_backend.cc:389:
- Host → Guest: Saves host registers, loads guest context, jumps to JIT code
- Guest → Host: Saves guest context, restores host registers, returns
The thunk stack frame is described by StackLayout::Thunk (src/xenia/cpu/backend/x64/x64_stack_layout.h:96).
Register Allocation
From src/xenia/cpu/backend/x64/x64_emitter.cc:57:
Integer Registers
| x64 Register | Usage |
|---|---|
| RAX | Scratch (temporary values) |
| RBX | JIT temporary |
| RCX | Scratch |
| RDX | Scratch |
| RSP | Stack Pointer |
| RBP | Unused |
| RSI | PowerPC Context Pointer |
| RDI | Virtual Memory Base |
| R8-R11 | Unused (available for parameters) |
| R12-R15 | JIT temporaries |
- RSI always points to the PowerPC context structure (guest registers)
- RDI always points to the base of guest virtual memory
Floating Point Registers
| x64 Register | Usage |
|---|---|
| XMM0-XMM5 | Scratch (temporary values) |
| XMM6-XMM15 | JIT temporaries |
Calling Convention
Guest function parameters and return values follow the PowerPC ABI:
- Parameters: r3-r10 (additional on stack)
- Return value: r3 (32-bit) or r3:r4 (64-bit)
- Floating point: f1-f13 for parameters
The SHIM_CALL convention shows this explicitly:
- SHIM_GET_ARG_32(n) - Reads argument n from register r(3+n)
- SHIM_SET_RETURN_32(v) - Writes v to r3
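A minimal sketch of how such accessors could look, assuming a simplified register file. ToyContext, GetArg32, SetReturn32, and ToyAddExport are hypothetical names for illustration and are not Xenia's actual macros or context type.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the guest register file.
struct ToyContext {
  uint64_t r[32] = {};  // general-purpose registers r0-r31
};

// Integer argument n lives in register r(3+n); only r3-r10 carry arguments.
inline uint32_t GetArg32(const ToyContext& ctx, int n) {
  assert(n >= 0 && n < 8);
  return static_cast<uint32_t>(ctx.r[3 + n]);
}

// 32-bit results are returned in r3.
inline void SetReturn32(ToyContext& ctx, uint32_t value) { ctx.r[3] = value; }

// A host-side export that adds its two guest arguments.
inline void ToyAddExport(ToyContext& ctx) {
  SetReturn32(ctx, GetArg32(ctx, 0) + GetArg32(ctx, 1));
}
```

With r3 = 40 and r4 = 2 on entry, ToyAddExport leaves 42 in r3. Reading an argument as 32-bit simply discards the upper half of the 64-bit register.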
Memory Access
Guest memory accesses are translated to host accesses.
Virtual Memory
PowerPC load/store instructions access guest virtual memory. Because RDI holds the virtual memory base, an x64 operand of the form [rdi+address] accesses guest memory directly.
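In C++ terms, the base-plus-address scheme might look like the sketch below. GuestLoad32 and GuestStore32 are hypothetical helpers, and membase plays the role of the pointer held in RDI. The byte reordering is an addition not covered above: it follows from the Xbox 360's PowerPC being big-endian while x64 is little-endian.

```cpp
#include <cassert>
#include <cstdint>

// Reads a 32-bit big-endian value at membase + guest_address.
inline uint32_t GuestLoad32(const uint8_t* membase, uint32_t guest_address) {
  assert(membase != nullptr);
  const uint8_t* p = membase + guest_address;
  return (static_cast<uint32_t>(p[0]) << 24) |
         (static_cast<uint32_t>(p[1]) << 16) |
         (static_cast<uint32_t>(p[2]) << 8) |
         static_cast<uint32_t>(p[3]);
}

// Writes a 32-bit value in big-endian byte order at membase + guest_address.
inline void GuestStore32(uint8_t* membase, uint32_t guest_address,
                         uint32_t value) {
  assert(membase != nullptr);
  uint8_t* p = membase + guest_address;
  p[0] = static_cast<uint8_t>(value >> 24);
  p[1] = static_cast<uint8_t>(value >> 16);
  p[2] = static_cast<uint8_t>(value >> 8);
  p[3] = static_cast<uint8_t>(value);
}
```

Assembling the value byte by byte keeps the sketch portable; a JIT would more likely emit a single load followed by a byte-swap instruction.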
Memory Barriers
PowerPC has explicit memory synchronization instructions:
- sync - Memory barrier
- isync - Instruction synchronization
- eieio - Enforce in-order execution of I/O
Code Cache
Compiled code is stored in the code cache (src/xenia/cpu/backend/x64/x64_code_cache.cc):
- Functions are compiled once and cached
- Cache is searched by guest address before recompiling
- Generated code is stored in executable memory pages
- On Windows and POSIX, platform-specific memory APIs are used to allocate the RWX pages
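The compile-once, look-up-by-guest-address behavior can be sketched with an ordinary map. ToyCodeCache and its members are hypothetical; a std::function stands in for a pointer into executable memory, and the real cache manages raw pages rather than heap objects.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <unordered_map>
#include <utility>

class ToyCodeCache {
 public:
  using HostFunction = std::function<void()>;
  using Compiler = std::function<HostFunction(uint32_t guest_address)>;

  explicit ToyCodeCache(Compiler compiler) : compiler_(std::move(compiler)) {
    assert(compiler_ != nullptr);
  }

  // Returns the cached translation, compiling it on first use only.
  const HostFunction& Resolve(uint32_t guest_address) {
    auto it = functions_.find(guest_address);
    if (it == functions_.end()) {
      ++compile_count_;
      it = functions_.emplace(guest_address, compiler_(guest_address)).first;
    }
    return it->second;
  }

  int compile_count() const { return compile_count_; }

 private:
  Compiler compiler_;
  std::unordered_map<uint32_t, HostFunction> functions_;
  int compile_count_ = 0;
};
```

Resolving the same guest address twice invokes the compiler callback once; only a new address triggers another compilation.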
System Call Handling
When guest code calls a kernel function:
- Loader replaces kernel import with sc (syscall) instruction
- JIT detects syscall and emits call to kernel export handler
- Execution transitions from guest to host
- Kernel export (native C++) executes
- Return value is placed in r3
- Execution returns to guest code
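The steps above amount to a table lookup followed by a host call. The sketch below assumes exports are keyed by a numeric ordinal; ToyContext, ExportHandler, ToyExportTable, and ToyReturnFortyTwo are hypothetical names, not Xenia's kernel export machinery.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

// Hypothetical stand-in for the guest register file.
struct ToyContext {
  uint64_t r[32] = {};
};

// A kernel export is native code that reads and writes guest registers.
using ExportHandler = void (*)(ToyContext&);

class ToyExportTable {
 public:
  void Register(uint32_t ordinal, ExportHandler handler) {
    assert(handler != nullptr);
    handlers_[ordinal] = handler;
  }

  // Called where the JIT encountered the syscall. Returns false if no
  // export is registered for the ordinal.
  bool Dispatch(uint32_t ordinal, ToyContext& ctx) const {
    auto it = handlers_.find(ordinal);
    if (it == handlers_.end()) return false;
    it->second(ctx);  // guest -> host transition; result lands in r3
    return true;
  }

 private:
  std::unordered_map<uint32_t, ExportHandler> handlers_;
};

// Example export: places its return value in r3.
inline void ToyReturnFortyTwo(ToyContext& ctx) { ctx.r[3] = 42; }
```

After a successful Dispatch, the guest continues with the export's result in r3, exactly as if a guest function had returned it.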
Multi-threading
The Xbox 360 has 3 cores with 2 hardware threads each (6 logical processors). Xenia emulates this:
- Each guest thread runs on a host thread
- Thread scheduling is handled by the host OS
- Synchronization primitives (mutexes, events) are implemented in kernel
- Lock-free atomic operations translate to x64 lock-prefixed instructions
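PowerPC expresses atomic updates as a load-reserve/store-conditional pair (lwarx/stwcx.) inside a retry loop, and the closest host equivalent is a compare-and-swap loop. The sketch below uses std::atomic, which x64 compilers typically lower to a lock-prefixed cmpxchg. AtomicIncrement is a hypothetical helper, and the sketch ignores guest byte order.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Atomically increments the cell and returns the new value.
inline uint32_t AtomicIncrement(std::atomic<uint32_t>& cell) {
  // A lock-prefixed instruction is only used when the type is lock-free.
  assert(cell.is_lock_free());
  uint32_t observed = cell.load(std::memory_order_relaxed);
  // On failure, compare_exchange_weak reloads `observed`, so the loop
  // retries with the latest value, much like a failed stwcx. would.
  while (!cell.compare_exchange_weak(observed, observed + 1,
                                     std::memory_order_seq_cst,
                                     std::memory_order_relaxed)) {
  }
  return observed + 1;
}
```

The weak form is allowed to fail spuriously, which is harmless inside a retry loop and mirrors how a store-conditional can fail without real contention.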
Performance Considerations
JIT Compilation Overhead
- First execution of a function incurs compilation cost
- Subsequent calls execute cached native code
- Hot functions compile quickly (< 1ms typically)
- Games with large code bases may have longer initial loads
Optimization Trade-offs
- More optimization passes improve code quality but increase compile time
- Context promotion is critical for performance (avoids memory traffic)
- Some passes can be disabled for faster compilation (e.g., --disable_context_promotion)
Accuracy vs Speed
- CPU timing is not cycle-accurate
- Branch prediction behavior differs from real hardware
- Most games don’t depend on exact timing
- Games with tight timing loops may have issues
Debugging and Analysis
HIR Dumping
Use --dump_translated_hir_functions=true to dump HIR for all translated functions. Useful for:
- Understanding translation issues
- Analyzing optimization effectiveness
- Debugging crashes in generated code
Disassembly
Generated x64 code can be inspected with debuggers:
- Set breakpoints in JIT code
- Single-step through generated instructions
- Compare with original PowerPC disassembly
