Overview
The GeneradorIR (IR Generator) class generates intermediate representation (IR) code in the form of three-address code. IR generation is the fourth phase of the compilation process, and the resulting IR is a platform-independent representation of the program.
Class Definition
class GeneradorIR:
    def __init__(self)
Constructor Parameters
No parameters required. The generator initializes with an empty code list.
Attributes
codigo (List[str]): List of generated IR instructions
temp_counter (int): Counter for generating unique temporary variable names
Public Methods
generar()
Generates intermediate representation code from the AST.
def generar(self, programa: Programa) -> List[str]
Parameters
programa (Programa): The AST produced by the Parser and validated by the Semantic Analyzer
Returns
List[str]: List of IR instructions in three-address code format
Example:
scanner = Scanner("let x = 5 + 3;")
tokens = scanner.escanear_tokens()
parser = Parser(tokens)
programa = parser.parsear()
generador = GeneradorIR()
codigo_ir = generador.generar(programa)
# Output:
# [FASE 4] Generación de Código Intermedio (IR)
# (Three Address Code)
# t0 = 5 + 3
# x = t0
nueva_temp()
Generates a new unique temporary variable name.
def nueva_temp(self) -> str
Returns
str: A unique temporary variable name (e.g., "t0", "t1", "t2")
Example:
generador = GeneradorIR()
primera = generador.nueva_temp()  # "t0"
segunda = generador.nueva_temp()  # "t1"
tercera = generador.nueva_temp()  # "t2"
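The counter-based naming described above can be sketched as follows. This is a minimal illustration, not the actual implementation: it assumes temp_counter starts at 0 and is incremented on every call, and GeneradorTemporales is a hypothetical stand-in class that only mirrors the documented attributes.

```python
class GeneradorTemporales:
    """Hypothetical stand-in that mirrors the documented attributes."""

    def __init__(self):
        self.codigo = []        # generated IR instructions
        self.temp_counter = 0   # assumed to start at 0, matching "t0" above

    def nueva_temp(self) -> str:
        # Build the name from the current counter, then advance the counter
        nombre = f"t{self.temp_counter}"
        self.temp_counter += 1
        return nombre


generador = GeneradorTemporales()
nombres = [generador.nueva_temp() for _ in range(3)]
print(nombres)  # ['t0', 't1', 't2']
```

Because the counter only ever increases, two calls on the same generator instance can never return the same name.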
The IR generator produces code in three-address code format, where each instruction refers to at most three addresses: typically two operands and one destination for the result.
Instruction Types
Assignment
Format: variable = value
# Source: let x = 5;
# IR: x = 5
Binary Operation
Format: temp = operand1 operator operand2
# Source: let x = 5 + 3;
# IR:
# t0 = 5 + 3
# x = t0
Print
Format: print value
# Source: print x;
# IR: print x
Code Generation Examples
Simple Assignment
# Source Code
let x = 10;
# Generated IR
x = 10
Binary Expression
# Source Code
let sum = 5 + 3;
# Generated IR
t0 = 5 + 3
sum = t0
Complex Expression
# Source Code
let result = 10 + 5 * 2;
# Generated IR
t0 = 5 * 2 # Multiplication first (higher precedence)
t1 = 10 + t0 # Then addition
result = t1
Nested Expression
# Source Code
let x = (10 + 5) * (3 - 1);
# Generated IR
t0 = 10 + 5 # First parenthesis
t1 = 3 - 1 # Second parenthesis
t2 = t0 * t1 # Multiply the results
x = t2
Multiple Statements
# Source Code
let a = 5;
let b = 10;
let c = a + b * 2;
print c;
# Generated IR
a = 5
b = 10
t0 = b * 2
t1 = a + t0
c = t1
print c
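To make the meaning of the three instruction types concrete, the sketch below interprets a list of IR lines. It is an illustration only and not part of the compiler: ejecutar_ir and valor_de are hypothetical helpers, tokens are assumed to be separated by single spaces exactly as in the examples above, and "/" is assumed to mean true division.

```python
import operator

# The four operators documented for this IR
OPERADORES = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,  # assumption: division is not truncating
}


def valor_de(operando, entorno):
    """Resolve an operand: a known variable/temporary or a numeric literal."""
    if operando in entorno:
        return entorno[operando]
    return float(operando) if "." in operando else int(operando)


def ejecutar_ir(codigo):
    """Run IR lines and return the list of printed values."""
    entorno, salida = {}, []
    for instruccion in codigo:
        partes = instruccion.split()
        if partes[0] == "print":          # print value
            salida.append(valor_de(partes[1], entorno))
        elif len(partes) == 3:            # variable = value
            entorno[partes[0]] = valor_de(partes[2], entorno)
        else:                             # temp = operand1 operator operand2
            destino, _, izq, op, der = partes
            entorno[destino] = OPERADORES[op](
                valor_de(izq, entorno), valor_de(der, entorno)
            )
    return salida


codigo = ["a = 5", "b = 10", "t0 = b * 2", "t1 = a + t0", "c = t1", "print c"]
print(ejecutar_ir(codigo))  # [25]
```

The input is the IR from the Multiple Statements example, so the printed value is the result of a + b * 2 with a = 5 and b = 10.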
Implementation Details
Expression Generation
The generar_expr() method recursively generates code for expressions:
def generar_expr(self, expr):
    # Leaves emit no code: a literal or a name is already a valid operand
    if isinstance(expr, NumeroLiteral):
        return expr.valor
    if isinstance(expr, Identificador):
        return expr.nombre
    if isinstance(expr, ExpresionBinaria):
        # Generate both operands first, then store the result in a fresh temporary
        izq = self.generar_expr(expr.izquierda)
        der = self.generar_expr(expr.derecha)
        temp = self.nueva_temp()
        self.codigo.append(f"{temp} = {izq} {expr.operador.lexema} {der}")
        return temp
    if isinstance(expr, ExpresionAgrupada):
        # Grouping only affects parsing, so generate the inner expression
        return self.generar_expr(expr.expresion)
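The recursion can be tried in isolation with stand-in AST nodes. In the sketch below, the dataclasses and the GeneradorExpr class are hypothetical: only the field names (valor, nombre, izquierda, operador, derecha, expresion, lexema) come from the method shown above, while the constructors and the final TypeError are assumptions added for illustration.

```python
from dataclasses import dataclass


# Hypothetical stand-ins for the real AST node classes
@dataclass
class Token:
    lexema: str


@dataclass
class NumeroLiteral:
    valor: int


@dataclass
class Identificador:
    nombre: str


@dataclass
class ExpresionBinaria:
    izquierda: object
    operador: Token
    derecha: object


@dataclass
class ExpresionAgrupada:
    expresion: object


class GeneradorExpr:
    def __init__(self):
        self.codigo = []
        self.temp_counter = 0

    def nueva_temp(self):
        nombre = f"t{self.temp_counter}"
        self.temp_counter += 1
        return nombre

    def generar_expr(self, expr):
        if isinstance(expr, NumeroLiteral):
            return expr.valor
        if isinstance(expr, Identificador):
            return expr.nombre
        if isinstance(expr, ExpresionBinaria):
            izq = self.generar_expr(expr.izquierda)  # left operand first
            der = self.generar_expr(expr.derecha)    # then right operand
            temp = self.nueva_temp()
            self.codigo.append(f"{temp} = {izq} {expr.operador.lexema} {der}")
            return temp
        if isinstance(expr, ExpresionAgrupada):
            return self.generar_expr(expr.expresion)
        # Assumption: unknown node types are reported instead of returning None
        raise TypeError(f"Unsupported expression node: {type(expr).__name__}")


# AST for: 10 + 5 * 2 (the parser has already applied precedence)
arbol = ExpresionBinaria(
    NumeroLiteral(10),
    Token("+"),
    ExpresionBinaria(NumeroLiteral(5), Token("*"), NumeroLiteral(2)),
)
gen = GeneradorExpr()
resultado = gen.generar_expr(arbol)
print(gen.codigo)  # ['t0 = 5 * 2', 't1 = 10 + t0']
print(resultado)   # t1
```

The multiplication is emitted first because it sits deeper in the tree, not because the generator knows about precedence: operand order in the IR follows the shape of the AST.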
Statement Generation
The generar_sentencia() method handles different statement types:
def generar_sentencia(self, sentencia):
    if isinstance(sentencia, DeclaracionVariable):
        # Generate the initializer, then assign its result to the variable
        resultado = self.generar_expr(sentencia.expresion)
        self.codigo.append(f"{sentencia.nombre.lexema} = {resultado}")
    elif isinstance(sentencia, SentenciaPrint):
        # Generate the argument, then emit a print of its result
        valor = self.generar_expr(sentencia.expresion)
        self.codigo.append(f"print {valor}")
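How the statement dispatch might be driven from generar() can be sketched as below. The body of generar() is not shown in this reference, so the loop, the Programa.sentencias attribute name, and all stand-in classes are assumptions. Expression handling is reduced to literals and identifiers to keep the focus on statements, and the console header that the real method prints is omitted.

```python
from dataclasses import dataclass
from typing import List


# Hypothetical stand-ins for the real AST node classes
@dataclass
class Token:
    lexema: str


@dataclass
class NumeroLiteral:
    valor: int


@dataclass
class Identificador:
    nombre: str


@dataclass
class DeclaracionVariable:
    nombre: Token
    expresion: object


@dataclass
class SentenciaPrint:
    expresion: object


@dataclass
class Programa:
    sentencias: list  # assumed attribute name


class GeneradorSentencias:
    def __init__(self):
        self.codigo: List[str] = []

    def generar_expr(self, expr):
        # Reduced to leaf nodes for this sketch
        if isinstance(expr, NumeroLiteral):
            return expr.valor
        return expr.nombre

    def generar_sentencia(self, sentencia):
        if isinstance(sentencia, DeclaracionVariable):
            resultado = self.generar_expr(sentencia.expresion)
            self.codigo.append(f"{sentencia.nombre.lexema} = {resultado}")
        elif isinstance(sentencia, SentenciaPrint):
            valor = self.generar_expr(sentencia.expresion)
            self.codigo.append(f"print {valor}")

    def generar(self, programa: Programa) -> List[str]:
        # Assumed driver: visit statements in source order
        for sentencia in programa.sentencias:
            self.generar_sentencia(sentencia)
        return self.codigo


programa = Programa([
    DeclaracionVariable(Token("x"), NumeroLiteral(5)),
    SentenciaPrint(Identificador("x")),
])
codigo_ir = GeneradorSentencias().generar(programa)
print(codigo_ir)  # ['x = 5', 'print x']
```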
Temporary Variables
Temporary variables are used to store intermediate results:
- Named as t0, t1, t2, etc.
- Each temporary is unique within the program
- Temporaries are never reused
Example:
# Source: let x = (a + b) * (c - d);
# IR:
t0 = a + b # First temporary
t1 = c - d # Second temporary
t2 = t0 * t1 # Third temporary
x = t2
Operator Representation
Operators are preserved from the source code:
+ → Addition
- → Subtraction
* → Multiplication
/ → Division
Usage Example
from compfinal import Scanner, Parser, AnalizadorSemantico, GeneradorIR
# Complete compilation to IR
code = """
let x = 5;
let y = 10;
let z = x + y * 2;
print z;
"""
# Phases 1-3: Lexical, Syntactic, Semantic Analysis
scanner = Scanner(code)
tokens = scanner.escanear_tokens()
parser = Parser(tokens)
programa = parser.parsear()
analizador = AnalizadorSemantico()
if not analizador.analizar(programa):
    print("Semantic errors found!")
    exit(1)
# Phase 4: IR Generation
generador = GeneradorIR()
codigo_ir = generador.generar(programa)
print("Generated IR Code:")
for instruccion in codigo_ir:
    print(f" {instruccion}")
# Output:
# x = 5
# y = 10
# t0 = y * 2
# t1 = x + t0
# z = t1
# print z
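In addition to returning the instruction list, the generar() method prints the IR code to the console, preceded by a phase header. For example, for the source let x = 5 + 3; print x; the console shows: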
[FASE 4] Generación de Código Intermedio (IR)
(Three Address Code)
t0 = 5 + 3
x = t0
print x
Advantages of IR
- Platform Independent: Not tied to any specific machine architecture
- Optimization Ready: Each instruction performs a single operation, which makes the code easy to analyze and optimize
- Multiple Backends: Can generate different target code from the same IR
- Simplified Structure: Complex expressions broken into simple operations
Limitations
No Optimization
The IR generator does not perform optimizations:
# Source: let x = 5 + 0;
# IR (not optimized):
t0 = 5 + 0
x = t0
# Could be optimized to:
x = 5
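A later pass over the IR could perform this rewrite. The sketch below is one possible approach, not something the generator provides: simplificar and optimizar are hypothetical helpers that remove identity operations (+ 0, - 0, * 1, / 1) and then propagate plain copies out of temporaries. It assumes user variables are never named like temporaries (t followed by digits) and that each temporary is used only after it is defined.

```python
import re

TEMP = re.compile(r"t\d+")


def simplificar(instruccion):
    """Rewrite 'd = a op b' to a plain copy when the operation is an identity."""
    partes = instruccion.split()
    if len(partes) != 5:
        return instruccion
    destino, _, izq, op, der = partes
    if (op in ("+", "-") and der == "0") or (op in ("*", "/") and der == "1"):
        return f"{destino} = {izq}"
    if (op == "+" and izq == "0") or (op == "*" and izq == "1"):
        return f"{destino} = {der}"
    return instruccion


def optimizar(codigo):
    alias = {}      # temporary -> the value it merely copies
    resultado = []
    for instruccion in codigo:
        # Replace known copies first, then try to simplify the result
        sustituida = " ".join(alias.get(p, p) for p in instruccion.split())
        partes = simplificar(sustituida).split()
        if len(partes) == 3 and TEMP.fullmatch(partes[0]):
            # A plain copy into a temporary is recorded instead of emitted
            alias[partes[0]] = partes[2]
            continue
        resultado.append(" ".join(partes))
    return resultado


print(optimizar(["t0 = 5 + 0", "x = t0"]))  # ['x = 5']
```

Operations that are not identities pass through unchanged, so the pass never alters what the program computes under the stated assumptions.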
No Temporary Variable Reuse
Temporary variables are never reused:
# Each expression gets a new temporary
let a = 1 + 2; # Uses t0
let b = 3 + 4; # Uses t1 (t0 not reused)
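Since every temporary in this IR is consumed within the statement that created it, names could in principle be recycled from one statement to the next. The sketch below illustrates that idea as a post-processing step. reutilizar_temporales is a hypothetical helper, and it assumes that a statement ends at the first instruction whose destination is not a temporary and that user variables are never named like temporaries.

```python
import re

TEMP = re.compile(r"\bt\d+\b")
DESTINO_TEMP = re.compile(r"t\d+ =")


def reutilizar_temporales(codigo):
    """Renumber temporaries so that each statement starts again at t0."""
    resultado, renombre = [], {}
    for instruccion in codigo:
        # Map each temporary to the next free name within this statement
        nueva = TEMP.sub(
            lambda m: renombre.setdefault(m.group(0), f"t{len(renombre)}"),
            instruccion,
        )
        resultado.append(nueva)
        if not DESTINO_TEMP.match(nueva):
            renombre = {}  # statement finished: its temporaries are dead
    return resultado


codigo = ["t0 = 1 + 2", "a = t0", "t1 = 3 + 4", "b = t1"]
print(reutilizar_temporales(codigo))
# ['t0 = 1 + 2', 'a = t0', 't0 = 3 + 4', 'b = t0']
```

The trade-off is that names are no longer unique across the program, which makes the IR slightly harder to read and analyze.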
See Also