The JIT Compiler
The JIT (Just-In-Time) compiler is CPython’s tier 2 optimization system that compiles hot bytecode sequences into optimized machine code.
Architecture Overview
CPython has a two-tier execution system:
- Tier 1: Adaptive interpreter with specialized bytecode
- Tier 2: JIT compiler for hot code paths
The JIT was historically called “tier 2” in the codebase, so you’ll see references to tier2 in function and variable names.
When JIT Compilation Occurs
The JIT activates when a JUMP_BACKWARD instruction becomes “hot”:
- JUMP_BACKWARD has an inline cache counter
- The counter decrements on each execution
- When the counter reaches zero (threshold exceeded):
  - Call _PyOptimizer_Optimize() in Python/optimizer.c
  - Pass the current frame and instruction pointer
  - Create an optimized executor for the trace
Backoff Counter
The threshold is determined by backoff_counter_triggers() in Include/internal/pycore_backoff.h.
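
The countdown-and-retry behaviour can be sketched with a toy model. This is only an illustration under stated assumptions: `ToyBackoffCounter`, `run_loop`, the initial value of 16, and the doubling rule are hypothetical, and the real counter's layout and constants in pycore_backoff.h are not reproduced here.

```python
class ToyBackoffCounter:
    """Toy countdown counter with exponential backoff (not CPython's layout)."""

    def __init__(self, initial_value=16):
        self.initial_value = initial_value
        self.value = initial_value
        self.backoff = 0  # how many times the counter has been restarted

    def triggers(self):
        """Return True once the countdown has reached zero."""
        return self.value == 0

    def advance(self):
        """Count one execution of the backward jump."""
        if self.value > 0:
            self.value -= 1

    def restart(self):
        """After a failed optimization attempt, wait twice as long as before."""
        self.backoff += 1
        self.value = self.initial_value << self.backoff


def run_loop(iterations, counter, optimize):
    """Simulate a loop; return the iteration at which an executor was created."""
    for i in range(iterations):
        if counter.triggers():
            if optimize(i):      # stands in for a successful optimizer call
                return i
            counter.restart()    # optimization failed: back off and keep counting
        else:
            counter.advance()
    return None
```

With these assumed values, a loop whose optimizer call succeeds immediately gets its executor on iteration 16; if the first attempt fails, the next attempt happens only after a further 32 iterations.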
Executors
An executor is an optimized version of a bytecode trace, represented by _PyExecutorObject in Include/internal/pycore_optimizer.h.
Executor Storage
Executors are stored in the code object, in its co_executors array.
ENTER_EXECUTOR Instruction
Once an executor is created:
- JUMP_BACKWARD is replaced with ENTER_EXECUTOR
- oparg contains the index into the co_executors array
- Subsequent iterations use the executor directly
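
The replacement step can be pictured with a small model. This is a sketch, not CPython's data layout: `ToyInstruction`, `ToyCode`, `install_executor`, and `lookup_executor` are hypothetical names, and only the opcode names and `co_executors` come from the text above.

```python
from dataclasses import dataclass, field


@dataclass
class ToyInstruction:
    opname: str
    oparg: int


@dataclass
class ToyCode:
    instructions: list
    co_executors: list = field(default_factory=list)


def install_executor(code, index, executor):
    """Replace the JUMP_BACKWARD at `index` with ENTER_EXECUTOR."""
    instr = code.instructions[index]
    if instr.opname != "JUMP_BACKWARD":
        raise ValueError("this toy only attaches executors to JUMP_BACKWARD")
    code.co_executors.append(executor)
    slot = len(code.co_executors) - 1
    # The oparg of ENTER_EXECUTOR is the executor's index in co_executors.
    code.instructions[index] = ToyInstruction("ENTER_EXECUTOR", slot)


def lookup_executor(code, index):
    """Find the executor that an ENTER_EXECUTOR instruction refers to."""
    instr = code.instructions[index]
    if instr.opname != "ENTER_EXECUTOR":
        raise ValueError("no executor installed at this instruction")
    return code.co_executors[instr.oparg]
```

After `install_executor(code, 1, executor)`, every later pass over instruction 1 reaches the executor through `lookup_executor(code, 1)` without consulting the counter again.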
Executor Exits
Executors determine where to transfer control:
- Return to tier 1 interpreter
- Transfer to another executor
Each exit is described by a _PyExitData structure.
The Micro-op Optimizer
Defined in Python/optimizer.c as _PyOptimizer_Optimize.
Trace Translation
The optimizer:
- Identifies the trace: sequence of bytecode starting from the hot jump
- Expands to micro-ops: each bytecode → sequence of micro-ops
- Optimizes: applies optimization passes
- Creates the executor: instance of _PyUOpExecutor_Type
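
A minimal sketch of the expansion step follows. The table `TOY_EXPANSIONS` and the function `project_trace` are hypothetical; the micro-op names are illustrative only, and the real expansion tables are generated and vary between CPython versions. The sketch assumes the trace simply ends at the first bytecode that has no translation.

```python
# Illustrative table only: bytecode name -> sequence of micro-op names.
TOY_EXPANSIONS = {
    "LOAD_FAST": ["_LOAD_FAST"],
    "BINARY_OP_ADD_INT": ["_GUARD_BOTH_INT", "_BINARY_OP_ADD_INT"],
    "STORE_FAST": ["_STORE_FAST"],
    "JUMP_BACKWARD": ["_JUMP_TO_TOP"],
}


def project_trace(bytecode):
    """Expand bytecode names into micro-ops, ending at the first untranslatable one."""
    trace = []
    for opname in bytecode:
        uops = TOY_EXPANSIONS.get(opname)
        if uops is None:
            # No translation: leave the trace and hand control back to tier 1.
            trace.append("_EXIT_TRACE")
            return trace
        trace.extend(uops)
    return trace
```

Note how one specialized bytecode can become a guard followed by the operation itself; separating the two is what lets a later pass drop guards it can prove unnecessary.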
Micro-ops (uops)
Micro-operations are lower-level than bytecode.
Optimization Pass
_Py_uop_analyze_and_optimize() in Python/optimizer_analysis.c performs:
- Dead code elimination
- Redundant guard removal
- Constant propagation
- Type specialization
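
To make one of these passes concrete, here is a toy version of redundant guard removal. It is a sketch under assumptions: the micro-op names `_GUARD_INT`, `_USE`, and `_STORE` and the function `remove_redundant_guards` are hypothetical, and the real pass in Python/optimizer_analysis.c tracks far richer information.

```python
def remove_redundant_guards(trace):
    """Drop type guards on variables already proven to hold an int.

    `trace` is a list of (micro-op name, variable name) pairs.
    """
    known_int = set()
    optimized = []
    for name, var in trace:
        if name == "_GUARD_INT":
            if var in known_int:
                continue  # an earlier guard already proved this
            known_int.add(var)
        elif name == "_STORE":
            # The variable was overwritten, so its type is unknown again.
            known_int.discard(var)
        optimized.append((name, var))
    return optimized
```

A guard survives only where the trace has no earlier proof of the same fact, so the second check on an unchanged variable disappears while a check after a store is kept.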
JIT Interpreter
The JIT interpreter is the simpler of two executor implementations, useful for debugging.
Enabling
Configure with:
Execution
When ENTER_EXECUTOR runs:
- Jump to the tier2_dispatch: label in Python/ceval.c
- A loop executes micro-ops via a switch statement
- Switch cases are in Python/executor_cases.c.h
- Generated by Tools/cases_generator/tier2_generator.py
Exit Instructions
- _EXIT_TRACE: Planned exit, return to tier 1
- _DEOPT: Deoptimization due to guard failure
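
The dispatch loop and its two kinds of exit can be imitated in a few lines. This is a toy, not the code in Python/ceval.c: `run_trace` and the micro-ops `_LOAD_CONST`, `_GUARD_INT`, and `_ADD` are hypothetical, and a failed guard here returns directly instead of branching to a separate _DEOPT micro-op.

```python
def run_trace(trace, stack):
    """Execute toy micro-ops; return (exit_kind, stack).

    `trace` is a list of (name, operand) pairs and must end with _EXIT_TRACE.
    """
    pc = 0
    while True:
        name, operand = trace[pc]
        pc += 1
        # This if/elif chain plays the role of the switch statement.
        if name == "_LOAD_CONST":
            stack.append(operand)
        elif name == "_GUARD_INT":
            if not isinstance(stack[-1], int):
                return "deopt", stack   # guard failed: fall back to tier 1
        elif name == "_ADD":
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif name == "_EXIT_TRACE":
            return "exit", stack        # planned exit at the end of the trace
        else:
            raise ValueError(f"unknown micro-op {name}")
```

Both outcomes hand control back to the caller; the difference is that a planned exit was expected when the trace was built, while a deopt means an assumption made at that time no longer holds.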
Full JIT (Copy-and-Patch)
The full JIT compiles micro-ops to native machine code.
Enabling
Architecture
Uses copy-and-patch compilation:
- Pre-compiled stencils for each micro-op
- Runtime patching fills in specific values
- Efficient compilation without complex codegen
The copy-and-patch technique is described in Haoran Xu’s article and the paper “Copy-and-Patch Compilation”.
Stencil Generation
At build time, make regen-jit generates stencils:
- Read Python/executor_cases.c.h
- For each micro-op, create a .c file with the template from Tools/jit/template.c
- Compile with LLVM to produce object files
- Extract machine code into jit_stencils.h
JIT Compilation
_PyJIT_Compile() in Python/jit.c:
- Allocate executable memory
- For each micro-op:
  - Copy stencil code
  - Patch runtime values (constants, object pointers, etc.)
- Set executor->jit_code to point to the compiled function
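
The copy and patch steps can be illustrated without any real machine code. In this sketch, everything is an assumption made for illustration: `ToyStencil`, `TOY_STENCILS`, and `compile_trace` are hypothetical, the stencil bytes are made up, each stencil has exactly one 8-byte hole, and a plain bytes buffer stands in for executable memory.

```python
import struct
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyStencil:
    body: bytes       # pre-built template bytes (invented for this example)
    hole_offset: int  # where the 8-byte runtime value gets written


# Hypothetical stencils: one marker byte followed by an 8-byte hole.
TOY_STENCILS = {
    "_LOAD_CONST": ToyStencil(b"\x01" + bytes(8), 1),
    "_ADD": ToyStencil(b"\x02" + bytes(8), 1),
}


def compile_trace(trace):
    """Copy each micro-op's stencil into one buffer and patch its hole.

    `trace` is a list of (name, operand) pairs; operands are unsigned
    64-bit integers standing in for constants or object pointers.
    """
    code = bytearray()
    for name, operand in trace:
        stencil = TOY_STENCILS[name]
        start = len(code)
        code += stencil.body  # copy
        struct.pack_into("<Q", code, start + stencil.hole_offset, operand)  # patch
    return bytes(code)
```

The point of the design is visible even in the toy: compilation is a loop of memory copies and fixed-offset writes, with no instruction selection or register allocation happening at run time.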
JIT Function Signature
Defined in pycore_jit.h.
Execution
When ENTER_EXECUTOR encounters JIT code:
- Check if executor->jit_code is set
- Call the JIT function instead of the tier 2 interpreter
- The function returns the next instruction pointer
- Continue execution from the returned location
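
The choice between the two execution paths, as described in the list above, amounts to a single check. The names below (`ToyJitExecutor`, `interpret_trace`, `enter_executor`) are hypothetical, a Python callable stands in for native code, and an integer stands in for the next instruction pointer.

```python
class ToyJitExecutor:
    def __init__(self, interpreter_result, jit_code=None):
        # What the toy tier 2 interpreter would return for this trace.
        self.interpreter_result = interpreter_result
        # Callable standing in for compiled machine code, or None.
        self.jit_code = jit_code


def interpret_trace(executor, frame):
    """Stand-in for running the trace in the tier 2 interpreter."""
    return executor.interpreter_result


def enter_executor(executor, frame):
    """Return the position of the next tier 1 instruction to run."""
    if executor.jit_code is not None:
        return executor.jit_code(frame)       # compiled path
    return interpret_trace(executor, frame)   # interpreted path
```

Either way the caller receives a position to resume from, so the tier 1 loop does not need to know which path produced it.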
Executor Invalidation
Executors may become invalid when assumptions change.
Executor List
All executors are stored in interpreter state, in a list that starts at executor_list_head.
Invalidation Triggers
- Type modified (method added/removed)
- Global/builtin modified
- Module dict modified
- Code object modified
Invalidation Process
- Iterate executor_list_head
- Mark affected executors as invalid
- Next ENTER_EXECUTOR will recompile or deoptimize
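
The list walk can be sketched as follows. Only the name `executor_list_head` comes from the text; `ToyTrackedExecutor`, `ToyInterpreterState`, and the use of a plain set of dependency names are assumptions for illustration, and CPython's actual dependency tracking is not reproduced.

```python
class ToyTrackedExecutor:
    def __init__(self, name, dependencies):
        self.name = name
        self.dependencies = set(dependencies)  # things the trace assumed stay unchanged
        self.valid = True
        self.next = None                       # next executor in the linked list


class ToyInterpreterState:
    def __init__(self):
        self.executor_list_head = None

    def link(self, executor):
        """Push an executor onto the front of the list."""
        executor.next = self.executor_list_head
        self.executor_list_head = executor

    def invalidate_dependency(self, obj):
        """Mark every valid executor that depends on `obj`; return their names."""
        invalidated = []
        executor = self.executor_list_head
        while executor is not None:
            if executor.valid and obj in executor.dependencies:
                executor.valid = False
                invalidated.append(executor.name)
            executor = executor.next
        return invalidated
```

Marking rather than deleting keeps the walk cheap; the cost of dealing with a stale executor is deferred until something actually tries to enter it.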
Example: JIT in Action
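
As a stand-in illustration, here is the kind of function the JIT targets: a tight loop over integers whose backward jump runs many times. The function names are made up for this example, and the helper only inspects tier 1 bytecode with the standard dis module; it does not show whether a JIT is present or active.

```python
import dis


def sum_squares(n):
    total = 0
    for i in range(n):   # the jump back to the loop header is what gets "hot"
        total += i * i
    return total


def backward_jumps(func):
    """Names of the backward-jump instructions in func's bytecode."""
    return [
        ins.opname
        for ins in dis.get_instructions(func)
        if "BACKWARD" in ins.opname
    ]


if __name__ == "__main__":
    print(sum_squares(10))
    print(backward_jumps(sum_squares))
```

The exact instruction names reported depend on the Python version in use.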
Trace Example
Simplified micro-op trace for loop body:
Performance Benefits
The JIT provides:
Reduced Dispatch Overhead
- Tier 1: Decode + dispatch for every instruction
- JIT: Direct machine code execution
Better Register Allocation
- Tier 1: Stack-based with memory ops
- JIT: Can keep values in registers across instructions
Inlining Opportunities
- Micro-ops can inline small operations
- Eliminates call overhead
Typical Speedup
For hot numeric loops:
- 2-4x faster than tier 1 interpreter
- Still slower than compiled languages (C, Rust)
- Best for tight loops with predictable types
Configuration
Build Options
Runtime Control
In builds that include the JIT, the PYTHON_JIT environment variable enables (PYTHON_JIT=1) or disables (PYTHON_JIT=0) it at startup. Once enabled, it activates automatically for hot code.
Debugging JIT
JIT Stats
Compile with JIT stats:
Disabling JIT
For debugging, rebuild without JIT:
Implementation Status
Experimental Feature: The JIT is experimental in Python 3.13+. APIs and behavior may change in future versions.
Supported Platforms
- x86-64 (Linux, macOS, Windows)
- ARM64 (Linux, macOS)
Limitations
- Not all bytecode instructions have micro-op translations
- Some operations force deoptimization to tier 1
- Exception handling may deoptimize
Further Reading
Videos
- Brandt Bucher’s PyCon US 2023 talk - Inside CPython 3.11’s specializing adaptive interpreter
- PyCon 2024: Building a JIT compiler for CPython
Papers
- Copy-and-Patch Compilation - Fast compilation algorithm for high-level languages
Related Topics
- Bytecode Interpreter - Tier 1 execution and specialization
- Code Objects - Where executors are stored
- Compiler Design - Bytecode generation
