Overview
The IR Generator is the fourth phase of compilation. It transforms the Abstract Syntax Tree (AST) into Three Address Code (TAC), a low-level, platform-independent representation that bridges high-level language and machine code.Three Address Code limits each instruction to:
- At most one operator
- At most three addresses (operands)
Why Intermediate Representation?
Platform Independence
TAC is not tied to any CPU architecture.Same IR can target:
- x86 assembly
- ARM assembly
- LLVM IR
- JVM bytecode
Optimization
Simpler than AST, easier to optimize:
- Constant folding
- Dead code elimination
- Common subexpression elimination
- Register allocation
Explicit Order
Operations in sequential order - no ambiguity about evaluation.AST: Tree structure (implicit ordering)
TAC: Linear instructions (explicit ordering)
Simple Instructions
Each instruction has one operation - easy to translate to assembly.
TAC Instruction Format
All instructions follow these patterns:- Binary Operation
- Assignment
- Print
Temporary Variables
Complex expressions require intermediate storage:t0,t1,t2, … (sequential integers)- Unique within a program
- Never reused (simple approach - optimization could reuse)
Generation Algorithm
The generator recursively traverses the AST:Process Statements
For each statement in the program:
- Generate IR for the statement
- Append instructions to list
Statement Generation
- Variable Declaration
- Print Statement
Expression Generation
- Number Literal
- Identifier
- Binary Expression
- Grouped Expression
Generation Examples
Simple Expression
- Code
- AST
- Generation Trace
- Generated IR
Nested Expression
- Code
- AST
- Generation Trace
- Generated IR
Complete Program
- Code
- Generated IR
- Output
Optimization Opportunities
Constant Folding
Constant Folding
Before:After:Evaluate constant expressions at compile-time.
Copy Propagation
Copy Propagation
Before:After:Replace copy with original variable.
Dead Code Elimination
Dead Code Elimination
Before:After:Remove unused computations.
Temporary Reuse
Temporary Reuse
Before:After:Reduce number of temporaries.
Comparison: AST vs. IR
For expressiona + b * c:
- AST Representation
- IR Representation
- Tree structure (hierarchical)
- Implicit evaluation order (post-order traversal)
- High-level (close to source code)
- Good for semantic analysis
Use Cases for IR
Code Generation
Each TAC instruction maps directly to assembly:
Optimization
Simple structure makes optimization easier:
- Analyze dependencies
- Detect patterns
- Apply transformations
Interpretation
Can execute IR directly:(Though this compiler interprets AST, not IR)
Multi-Target
Generate different backends from same IR:
- x86 assembly
- ARM assembly
- LLVM IR
- C code
Implementation Details
- Class Structure
- Instruction Storage
Performance
Time Complexity
O(n) where n = AST node countEach node visited once in DFS traversal.
Space Complexity
O(n) for instruction listRoughly one instruction per operation in source code.
Source Code Reference
Implementation
File:
compfinal.pyLines: 1090-1149Key Class:GeneradorIR- IR generator
generar(programa)- Entry point, returns instruction listgenerar_sentencia(sentencia)- Process statementgenerar_expr(expr)- Process expression, return result operandnueva_temp()- Generate unique temporary name
codigo: List[str]- TAC instruction listtemp_counter: int- Temporary variable counter
Next Steps
Interpreter
See how the AST is directly executed
Code Generation
See how IR/AST is converted to x86 assembly
API Reference
Detailed API documentation for the GeneradorIR class