
Overview

The Mini-Compilador Educativo performs six distinct phases of compilation, each producing specific output. Understanding this output helps you learn how compilers work and debug your programs effectively.

Compilation Phases

The compiler processes code through these phases:

Phase 1: Lexical Analysis

Converts source code into a sequence of tokens

Phase 2: Syntax Analysis

Builds Abstract Syntax Tree (AST) from tokens

Phase 3: Semantic Analysis

Validates program meaning and variable usage

Phase 4: Intermediate Representation

Generates three-address code (TAC)

Phase 5: Interpretation

Executes the program and displays results

Phase 6: Code Generation

Produces x86 assembly code

Phase 1: Lexical Analysis

Purpose

The lexer (scanner) reads source code character by character and groups the characters into tokens, the basic building blocks of the language.

Output Format

When “Mostrar tokens” is enabled, you see a table:
TOKENS (resultado del análisis léxico)
══════════════════════════════════════════════════════════════════════════════════════
TIPO                 LEXEMA               LINEA    COLUMNA  VALOR          
──────────────────────────────────────────────────────────────────────────────────────
LET                  let                  1        1                       
IDENTIFICADOR        x                    1        5                       
IGUAL                =                    1        7                       
NUMERO               10                   1        9        10             
PUNTO_COMA           ;                    1        11                      
FIN_ARCHIVO                               1        12                      
══════════════════════════════════════════════════════════════════════════════════════

Token Components

TIPO (token type)
The category of the token:
  • LET, PRINT - Keywords
  • IDENTIFICADOR - Variable names
  • NUMERO - Numeric literals
  • SUMA, RESTA, MULTIPLICACION, DIVISION - Operators
  • IGUAL - Assignment
  • PAREN_IZQ, PAREN_DER - Parentheses
  • PUNTO_COMA - Statement terminator
  • FIN_ARCHIVO - End marker
LEXEMA (original text)
The actual characters from the source code. Example: let → lexema “let”
LINEA (line number)
The line of code the token appears on (1-indexed).
COLUMNA (column position)
Character position within the line (1-indexed).
VALOR (numeric value)
For NUMERO tokens only: the integer value. Example: lexema “42” → valor 42

Example Analysis

let sum = 10 + 5;
print sum;
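
The grouping of characters into tokens can be sketched in a few lines of Python. This is a minimal illustration, not the Mini-Compilador's actual lexer: the token type names and the error text come from the output shown above, while the Token class, the tokenize function, and the rule that identifiers are ASCII letters, digits, and underscores are assumptions. Comments are not handled.

```python
from dataclasses import dataclass
from typing import List, Optional

# Type names follow the TIPO column above; all other names are hypothetical.
KEYWORDS = {"let": "LET", "print": "PRINT"}
SYMBOLS = {
    "+": "SUMA", "-": "RESTA", "*": "MULTIPLICACION", "/": "DIVISION",
    "=": "IGUAL", "(": "PAREN_IZQ", ")": "PAREN_DER", ";": "PUNTO_COMA",
}


@dataclass
class Token:
    tipo: str
    lexema: str
    linea: int
    columna: int
    valor: Optional[int] = None  # only set for NUMERO tokens


def is_word_char(ch: str) -> bool:
    # Assumption: identifiers are ASCII letters, digits, and underscores.
    return ch.isascii() and (ch.isalnum() or ch == "_")


def tokenize(source: str) -> List[Token]:
    tokens: List[Token] = []
    line, col, i = 1, 1, 0
    while i < len(source):
        ch = source[i]
        if ch == "\n":                      # new line: reset the column
            line, col, i = line + 1, 1, i + 1
        elif ch in " \t\r":                 # skip other whitespace
            col, i = col + 1, i + 1
        elif ch.isascii() and ch.isdigit():  # integer literal
            j = i
            while j < len(source) and source[j].isascii() and source[j].isdigit():
                j += 1
            text = source[i:j]
            tokens.append(Token("NUMERO", text, line, col, int(text)))
            col, i = col + (j - i), j
        elif is_word_char(ch):              # keyword or identifier
            j = i
            while j < len(source) and is_word_char(source[j]):
                j += 1
            text = source[i:j]
            tokens.append(Token(KEYWORDS.get(text, "IDENTIFICADOR"), text, line, col))
            col, i = col + (j - i), j
        elif ch in SYMBOLS:                 # single-character operator or punctuation
            tokens.append(Token(SYMBOLS[ch], ch, line, col))
            col, i = col + 1, i + 1
        else:
            raise SyntaxError(
                f"Error léxico en línea {line}, columna {col}: carácter inesperado {ch!r}"
            )
    tokens.append(Token("FIN_ARCHIVO", "", line, col))
    return tokens
```

Calling tokenize("let sum = 10 + 5;\nprint sum;") under these assumptions yields eleven tokens, ending with FIN_ARCHIVO at line 2, column 11.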

Common Lexical Errors

✗ Error léxico en línea 1, columna 9: carácter inesperado '@'
Cause: Character not recognized by the language (e.g., @, #, $)
Solution: Remove or replace with valid characters
✗ Error léxico en línea 3, columna 15: carácter inesperado '\u2019'
Cause: Non-ASCII Unicode character (smart quotes, special dashes)
Solution: Use ASCII equivalents (' instead of ’, - instead of –)

Phase 2: Syntax Analysis

Purpose

The parser verifies that tokens appear in valid grammatical order and builds an Abstract Syntax Tree (AST) representing the program’s structure.

Output Format

When “Mostrar AST” is enabled, you see a tree diagram:
ÁRBOL DE SINTAXIS ABSTRACTA (AST)
═══════════════════════════════════════════════════════════════
Programa
├── DeclaracionVariable
│   ├── nombre: 'sum'
│   └── valor:
│       └── ExpresionBinaria
│           ├── operador: '+'
│           ├── izquierda:
│           │   └── NumeroLiteral(10)
│           └── derecha:
│               └── NumeroLiteral(5)
└── SentenciaPrint
    └── expresion:
        └── Identificador('sum')
═══════════════════════════════════════════════════════════════

AST Node Types

DeclaracionVariable
DeclaracionVariable
├── nombre: 'x'
└── valor: [expression]
Represents: let x = expression;
SentenciaPrint
SentenciaPrint
└── expresion: [expression]
Represents: print expression;

Understanding Tree Structure

let result = (10 + 5) * 2;
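The tree structure reflects operator precedence and grouping. Deeper nodes are evaluated first: here the parenthesized addition 10 + 5 sits below the multiplication, so it is computed before the result is multiplied by 2.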
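
How precedence and parentheses shape the tree can be illustrated with a small recursive-descent parser. This is a sketch under stated assumptions, not the Mini-Compilador's implementation: the node names come from the AST output above, but the tuple encoding, the Parser class, the lex and parse helpers, the grammar (with * and / binding tighter than + and -), and the simplified error texts are all hypothetical.

```python
import re
from typing import List

KEYWORDS = ("let", "print")


def lex(source: str) -> List[str]:
    # Lexemes only, no positions. Unknown characters are silently skipped
    # here, whereas a real lexer would report them.
    return re.findall(r"\d+|[A-Za-z_]\w*|[-+*/=();]", source)


def is_identifier(tok: str) -> bool:
    return (tok[0].isalpha() or tok[0] == "_") and tok not in KEYWORDS


class Parser:
    def __init__(self, tokens: List[str]):
        self.toks = tokens + ["<EOF>"]
        self.i = 0

    def peek(self) -> str:
        return self.toks[self.i]

    def advance(self) -> str:
        tok = self.toks[self.i]
        self.i += 1
        return tok

    def expect(self, lexeme: str) -> None:
        if self.peek() != lexeme:
            raise SyntaxError(f"Se esperaba '{lexeme}'. Se encontró '{self.peek()}'")
        self.advance()

    def program(self):
        stmts = []
        while self.peek() != "<EOF>":
            stmts.append(self.statement())
        return ("Programa", stmts)

    def statement(self):
        if self.peek() == "let":
            self.advance()
            name = self.advance()
            if not is_identifier(name):
                raise SyntaxError(f"Se esperaba un nombre de variable. Se encontró '{name}'")
            self.expect("=")
            value = self.expression()
            self.expect(";")
            return ("DeclaracionVariable", name, value)
        if self.peek() == "print":
            self.advance()
            value = self.expression()
            self.expect(";")
            return ("SentenciaPrint", value)
        raise SyntaxError(f"Se esperaba 'let' o 'print'. Se encontró '{self.peek()}'")

    def expression(self):
        # Lowest precedence: + and -, left-associative.
        left = self.term()
        while self.peek() in ("+", "-"):
            op = self.advance()
            right = self.term()
            left = ("ExpresionBinaria", op, left, right)
        return left

    def term(self):
        # Higher precedence: * and /, so these end up deeper in the tree.
        left = self.factor()
        while self.peek() in ("*", "/"):
            op = self.advance()
            right = self.factor()
            left = ("ExpresionBinaria", op, left, right)
        return left

    def factor(self):
        tok = self.advance()
        if tok.isdigit():
            return ("NumeroLiteral", int(tok))
        if tok == "(":
            inner = self.expression()  # parentheses restart at lowest precedence
            self.expect(")")
            return inner
        if is_identifier(tok):
            return ("Identificador", tok)
        raise SyntaxError(f"Se esperaba una expresión. Se encontró '{tok}'")


def parse(source: str):
    return Parser(lex(source)).program()
```

With this sketch, parse("let result = (10 + 5) * 2;") places the + node inside the left operand of the * node, while parse("let r = 10 + 5 * 2;") places the * node inside the right operand of the + node.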

Common Syntax Errors

✗ Error de sintaxis en línea 1, columna 11: Se esperaba ';' al final de la declaración. Se encontró 'let'
Cause: Statement not terminated with semicolon
let x = 10  // Missing ;
let y = 20;
Solution: Add semicolon after every statement
✗ Error de sintaxis en línea 2, columna 7: Se esperaba '=' después del nombre de variable. Se encontró '10'
Cause: Assignment operator missing
let x 10;
Solution: Use = for assignment: let x = 10;
✗ Error de sintaxis en línea 1, columna 1: Se esperaba 'let' o 'print'. Se encontró 'x'
Cause: Statement doesn’t start with keyword
x = 10;  // Missing 'let'
Solution: Start with let or print
✗ Error de sintaxis en línea 1, columna 15: Se esperaba ')' después de la expresión. Se encontró ';'
Cause: Opening parenthesis without closing
let x = (10 + 5;
Solution: Balance parentheses: let x = (10 + 5);

Phase 3: Semantic Analysis

Purpose

The semantic analyzer verifies that the program is logically valid, even if syntactically correct.

Checks Performed

Variable Declaration

Ensures variables are declared before use. The analyzer tracks all let declarations and verifies that every identifier used in an expression has been declared.

Division by Zero

Detects division by a literal zero. Example: let x = 10 / 0; is caught.

Redeclaration

Warns if a variable is declared twice. Redeclaration is allowed, but a warning is issued.
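
The three checks can be sketched as a single pass over the AST. This is an illustrative sketch only: the AST is hand-built as nested tuples (a hypothetical encoding that reuses the node names from the AST output), the analyze and check_expr functions are assumptions, and the messages omit the line and column that the real compiler reports.

```python
from typing import List, Set, Tuple


def check_expr(node, declared: Set[str], errors: List[str]) -> None:
    kind = node[0]
    if kind == "Identificador":
        if node[1] not in declared:
            errors.append(f"la variable '{node[1]}' no ha sido declarada")
    elif kind == "ExpresionBinaria":
        _, op, left, right = node
        check_expr(left, declared, errors)
        check_expr(right, declared, errors)
        # Only a literal 0 on the right of '/' is detected.
        if op == "/" and right == ("NumeroLiteral", 0):
            errors.append("división entre cero detectada")


def analyze(program) -> Tuple[List[str], List[str]]:
    declared: Set[str] = set()
    errors: List[str] = []
    warnings: List[str] = []
    for stmt in program[1]:
        if stmt[0] == "DeclaracionVariable":
            _, name, value = stmt
            # Check the value BEFORE registering the name, so that
            # `let x = x + 1;` fails when x was never declared.
            check_expr(value, declared, errors)
            if name in declared:
                warnings.append(f"la variable '{name}' ya fue declarada anteriormente")
            declared.add(name)
        elif stmt[0] == "SentenciaPrint":
            check_expr(stmt[1], declared, errors)
    return errors, warnings


# Hand-built AST for:
#   let x = 10; let sum = x + y; let bad = 10 / 0; let x = 30;
example = ("Programa", [
    ("DeclaracionVariable", "x", ("NumeroLiteral", 10)),
    ("DeclaracionVariable", "sum",
     ("ExpresionBinaria", "+", ("Identificador", "x"), ("Identificador", "y"))),
    ("DeclaracionVariable", "bad",
     ("ExpresionBinaria", "/", ("NumeroLiteral", 10), ("NumeroLiteral", 0))),
    ("DeclaracionVariable", "x", ("NumeroLiteral", 30)),
])
errors, warnings = analyze(example)
```

Here errors holds the undeclared-variable and division-by-zero messages, and warnings holds the redeclaration message for x. Errors and warnings are kept in separate lists because a redeclaration does not stop compilation.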

Output Format

Success:
[FASE 3] Análisis Semántico...
         (Verificando que el código tenga sentido)
  ✓ Completado: 3 variables verificadas
With Errors:
[FASE 3] Análisis Semántico...
         (Verificando que el código tenga sentido)
  ✗ Error semántico en línea 2, columna 13: la variable 'x' no ha sido declarada
  ✗ Error semántico en línea 3, columna 15: división entre cero detectada

Common Semantic Errors

✗ Error semántico en línea 2, columna 9: la variable 'y' no ha sido declarada
Example:
let x = 10;
let sum = x + y;  // 'y' not declared
Solution: Declare before use:
let x = 10;
let y = 5;
let sum = x + y;
✗ Error semántico en línea 1, columna 13: división entre cero detectada
Example:
let invalid = 10 / 0;
Solution: Use non-zero divisor
Note: Only literal zero divisors are detected. A zero that reaches the divisor through a variable, as in let x = 0; let y = 10 / x;, is not caught at compile time.
Advertencia en línea 3, columna 5: la variable 'x' ya fue declarada anteriormente
Example:
let x = 10;
let y = 20;
let x = 30;  // Redeclaration
Impact: Not an error, but it may indicate a mistake. The second declaration overwrites the first.
✗ Error semántico en línea 1, columna 13: la variable 'x' no ha sido declarada
Example:
let x = x + 1;  // x doesn't exist yet
Solution: Declare with initial value first:
let x = 0;
let x = x + 1;  // Now valid (with redeclaration warning)

Phase 4: Intermediate Representation

Purpose

Converts the AST into Three-Address Code (TAC), a low-level representation in which each instruction refers to at most three addresses: typically two operands and one result.

Output Format

When “Mostrar IR” is enabled:
[FASE 4] Generación de Código Intermedio (IR)
         (Three Address Code)
    a = 5
    b = 10
    t0 = b * 2
    t1 = a + t0
    c = t1
    print c

Instruction Types

variable = value
Example:
x = 10
y = x
Direct assignment from constant or variable.

Example Transformation

let x = 5;
let y = 10;
let result = x + y * 2;
print result;
Temporary variables (t0, t1, …) represent intermediate computation results. They’re generated automatically and don’t appear in the source code.
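
The way temporaries arise can be sketched as a post-order walk of the AST: each binary node first emits code for its operands, then one instruction that stores its own result in a fresh temporary. This is a minimal sketch, not the compiler's actual generator. The instruction layout mirrors the IR output shown above, while the TacGenerator class and the tuple-based AST encoding are assumptions.

```python
from typing import List


class TacGenerator:
    def __init__(self) -> None:
        self.code: List[str] = []
        self.temp_count = 0

    def new_temp(self) -> str:
        name = f"t{self.temp_count}"
        self.temp_count += 1
        return name

    def expr(self, node) -> str:
        """Emit code for an expression; return the name or constant holding its value."""
        kind = node[0]
        if kind == "NumeroLiteral":
            return str(node[1])
        if kind == "Identificador":
            return node[1]
        _, op, left, right = node          # ExpresionBinaria
        left_addr = self.expr(left)        # operands first (post-order)
        right_addr = self.expr(right)
        temp = self.new_temp()
        self.code.append(f"{temp} = {left_addr} {op} {right_addr}")
        return temp

    def program(self, prog) -> List[str]:
        for stmt in prog[1]:
            if stmt[0] == "DeclaracionVariable":
                _, name, value = stmt
                addr = self.expr(value)
                self.code.append(f"{name} = {addr}")
            else:                          # SentenciaPrint
                addr = self.expr(stmt[1])
                self.code.append(f"print {addr}")
        return self.code


# Hand-built AST for the example program above.
example = ("Programa", [
    ("DeclaracionVariable", "x", ("NumeroLiteral", 5)),
    ("DeclaracionVariable", "y", ("NumeroLiteral", 10)),
    ("DeclaracionVariable", "result",
     ("ExpresionBinaria", "+", ("Identificador", "x"),
      ("ExpresionBinaria", "*", ("Identificador", "y"), ("NumeroLiteral", 2)))),
    ("SentenciaPrint", ("Identificador", "result")),
])
tac = TacGenerator().program(example)
```

Under these assumptions, tac contains x = 5, y = 10, t0 = y * 2, t1 = x + t0, result = t1, and print result, in that order. The multiplication receives t0 because it sits deeper in the tree and is therefore visited first.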

Understanding TAC

Three-address code makes operator precedence explicit:
Expression: (a + b) * (c - d)

TAC:
t0 = a + b
t1 = c - d
t2 = t0 * t1
Each operation becomes its own instruction, and the order of the instructions is the order of evaluation.

Phase 5: Program Execution

Purpose

The interpreter walks the AST and executes the program, displaying results.

Output Format

[FASE 5] Ejecución del programa...
              (Interpretando el AST)
     Resultado:
                → 25
  ✓ Ejecución finalizada

How It Works

1. Process Declarations

let statements evaluate the expression and store the result:
let x = 10 + 5;  // Evaluates to 15, stores in x

2. Process Print Statements

print statements evaluate the expression and display the result:
print x;  // Looks up x (15), displays "→ 15"

3. Complete Execution

After all statements have run, the interpreter shows the completion message.
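
The steps above can be sketched as a small tree-walking interpreter. This is an illustration under assumptions, not the compiler's own code: the AST is encoded as nested tuples using the node names from the AST output, the evaluate and run functions are hypothetical, and division is assumed to be integer division (the source only states that numeric values are integers). Instead of printing, run collects the printed values in a list.

```python
from typing import Dict, List


def evaluate(node, env: Dict[str, int]) -> int:
    kind = node[0]
    if kind == "NumeroLiteral":
        return node[1]
    if kind == "Identificador":
        return env[node[1]]                # look up the stored value
    _, op, left, right = node              # ExpresionBinaria
    a = evaluate(left, env)
    b = evaluate(right, env)
    if op == "+":
        return a + b
    if op == "-":
        return a - b
    if op == "*":
        return a * b
    if op == "/":
        return a // b                      # assumption: integer division
    raise ValueError(f"unknown operator {op!r}")


def run(program) -> List[int]:
    env: Dict[str, int] = {}
    outputs: List[int] = []
    for stmt in program[1]:
        if stmt[0] == "DeclaracionVariable":
            # Step 1: evaluate the expression and store the result.
            env[stmt[1]] = evaluate(stmt[2], env)
        elif stmt[0] == "SentenciaPrint":
            # Step 2: evaluate and "display" (collected here instead of printed).
            outputs.append(evaluate(stmt[1], env))
    return outputs


# Hand-built AST for: let a = 5; print a; let b = 10; print b; print a + b;
example = ("Programa", [
    ("DeclaracionVariable", "a", ("NumeroLiteral", 5)),
    ("SentenciaPrint", ("Identificador", "a")),
    ("DeclaracionVariable", "b", ("NumeroLiteral", 10)),
    ("SentenciaPrint", ("Identificador", "b")),
    ("SentenciaPrint",
     ("ExpresionBinaria", "+", ("Identificador", "a"), ("Identificador", "b"))),
])
results = run(example)
```

Each value in results corresponds to one Resultado line in the Phase 5 output.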

Multiple Print Outputs

let a = 5;
print a;
let b = 10;
print b;
print a + b;
Output:
[FASE 5] Ejecución del programa...
     Resultado:
                → 5
     Resultado:
                → 10
     Resultado:
                → 15
  ✓ Ejecución finalizada

Phase 6: Assembly Generation

Purpose

Produces x86 assembly code compatible with the EMU8086 assembler.
Assembly output is not displayed in the console/GUI output panel. Use Exportar ASM in the GUI to save it to a file.

Assembly Structure

.data
x dw 0
y dw 0
result dw 0
msg db 13,10,'$'
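Declares each variable as a 16-bit word (dw) initialized to 0, and defines msg as a newline string: carriage return (13), line feed (10), and the '$' terminator that DOS string output expects.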
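
Producing such a data section from the list of declared variables can be sketched as follows. The function name data_section is hypothetical, and the sketch only reproduces the layout shown above. It makes no claim about how the compiler generates the rest of the assembly.

```python
from typing import List


def data_section(variables: List[str]) -> str:
    # One 16-bit word per variable, initialized to 0, as in the layout above.
    lines = [".data"]
    lines += [f"{name} dw 0" for name in variables]
    # Newline string: CR, LF, '$' terminator.
    lines.append("msg db 13,10,'$'")
    return "\n".join(lines)


section = data_section(["x", "y", "result"])
```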

Example Assembly Output

let x = 10;
let y = 20;
let sum = x + y;
print sum;

Error Message Summary

Lexical Errors (Phase 1)
Error léxico en línea X, columna Y: carácter inesperado 'C'
Invalid character in source code.

Syntax Errors (Phase 2)
Error de sintaxis en línea X, columna Y: [expected]. Se encontró '[token]'
Tokens in wrong order or missing required token.

Semantic Errors (Phase 3)
Error semántico en línea X, columna Y: [description]
Logical problems like undefined variables or division by zero.

Success Indicators

Look for these messages to confirm success:
✓ Completado: N tokens generados
✓ Completado: N sentencias parseadas
✓ Completado: N variables verificadas
✓ Ejecución finalizada
✓ RESULTADO: ¡Compilación exitosa!

Next Steps

Writing Programs

Master the language syntax to avoid errors

Error Handling

Diagnose and fix common compilation issues

Compiler Architecture

Learn how each phase is implemented

Code Generation

Understand the generated assembly code
