Compilation Pipeline
The Expresiones compiler follows a traditional multi-stage compilation architecture using ANTLR 4.13.1 for lexical and syntactic analysis. The compilation process flows through four distinct phases:
Lexical Analysis
The ExpresionesLexer tokenizes source code into 27 token types including keywords, operators, identifiers, and literals.
Syntactic Analysis
The ExpresionesParser constructs an Abstract Syntax Tree (AST) following 7 grammar rules defined in Expresiones.g.
Semantic Analysis
The custom Visitor class traverses the AST, performing type checking, variable resolution, and symbol table management.
Architecture Diagram
Core Components
Grammar Definition
The language specification is defined in Expresiones.g using ANTLR syntax:
- Syntactic Rules: 7 parser rules (root, instrucciones, bloque, declaracion, asignacion, condicion, expr)
- Lexical Rules: 27 token types covering keywords, operators, identifiers, and literals
- Language Features: Variable declarations, assignments, arithmetic expressions, logical operations, and conditional statements
The grammar file Expresiones.g is the single source of truth. Changes to the grammar require regenerating the lexer and parser using ANTLR 4.13.1.
Generated Components
ExpresionesLexer.py
Auto-generated lexer that converts source text into tokens. Contains 27 token type constants and lexical rules.
ExpresionesParser.py
Auto-generated parser that builds the parse tree from the token stream. Implements 7 context classes for AST nodes.
ExpresionesVisitor.py
Base visitor class with empty implementations for all visit methods. Extended by custom implementations.
Visitor.py
Custom visitor implementation providing semantic analysis, symbol table management, and interpretation.
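As a rough illustration of how these four files could be wired together, here is a minimal driver sketch. The module and class names follow the file names listed above, `root` is the entry rule named in the grammar, and `InputStream` and `CommonTokenStream` come from the ANTLR Python runtime. The function name `run_pipeline`, the no-argument `Visitor()` constructor, and the idea that `visit` returns something useful are assumptions, not details confirmed by the project.

```python
def run_pipeline(source: str):
    """Run lexing, parsing, and the custom visitor over `source` (sketch)."""
    # Imports are local so this sketch can be loaded even when the ANTLR
    # runtime or the generated files are not on the import path.
    from antlr4 import CommonTokenStream, InputStream
    from ExpresionesLexer import ExpresionesLexer
    from ExpresionesParser import ExpresionesParser
    from Visitor import Visitor  # assumed: Visitor.py exposes a class named Visitor

    # Lexical analysis: characters -> tokens
    lexer = ExpresionesLexer(InputStream(source))
    tokens = CommonTokenStream(lexer)

    # Syntactic analysis: tokens -> parse tree, starting at the `root` rule
    parser = ExpresionesParser(tokens)
    tree = parser.root()

    # Semantic analysis and interpretation: walk the tree with the custom visitor
    return Visitor().visit(tree)  # assumed: constructor takes no arguments
```

Keeping the driver this small means the only hand-written logic lives in Visitor.py, while everything else can be regenerated from the grammar.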
Data Structures
Symbol Table
The compiler maintains a symbol table as a Python dictionary mapping variable names to Simbolo objects. The table is managed by the Visitor class.
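The original listing for this section is not available here, so the following is only a sketch of what a dictionary-backed symbol table could look like. The `Simbolo` field names (`nombre`, `tipo`, `valor`), the `TablaSimbolos` wrapper, and its method names are all hypothetical; the source only states that a dict maps names to `Simbolo` objects.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class Simbolo:
    # Field names are assumptions made for illustration.
    nombre: str
    tipo: str                     # "int", "float", or "bool"
    valor: Optional[Any] = None   # None until the variable is initialized


class TablaSimbolos:
    """Hypothetical helper around the name -> Simbolo dictionary."""

    _TIPOS = {"int": int, "float": float, "bool": bool}

    def __init__(self) -> None:
        self.simbolos: Dict[str, Simbolo] = {}

    def declarar(self, nombre: str, tipo: str, valor: Any = None) -> Simbolo:
        # Reject redeclaration of an existing name.
        if nombre in self.simbolos:
            raise NameError(f"variable '{nombre}' is already declared")
        self.simbolos[nombre] = Simbolo(nombre, tipo, valor)
        return self.simbolos[nombre]

    def resolver(self, nombre: str) -> Simbolo:
        # Variable resolution: look the name up or report it as undeclared.
        if nombre not in self.simbolos:
            raise NameError(f"variable '{nombre}' is not declared")
        return self.simbolos[nombre]

    def asignar(self, nombre: str, valor: Any) -> None:
        simbolo = self.resolver(nombre)
        esperado = self._TIPOS[simbolo.tipo]
        # bool is a subclass of int in Python, so it is checked explicitly.
        es_bool = isinstance(valor, bool)
        if es_bool != (esperado is bool) or not isinstance(valor, esperado):
            raise TypeError(f"cannot assign {valor!r} to {simbolo.tipo} '{nombre}'")
        simbolo.valor = valor
```

This sketch uses strict type matching, so assigning an integer to a `float` variable is rejected; whether the real compiler promotes `int` to `float` is not stated in the source.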
Language Features
Supported Constructs
Variable Declarations
int, float, and bool types with optional initialization.
Arithmetic Expressions
+, -, *, / with proper precedence (multiplication/division before addition/subtraction).
Conditional Statements
Comparison operators (>, <, ==, !=, >=, <=) and logical operators (&&, ||, !).
Token Types Reference
The lexer recognizes 27 token types defined in ExpresionesLexer.py:
| Token | Value | Description |
|---|---|---|
| PROGRAMA | 1 | Keyword program |
| SI | 2 | Keyword if |
| SINO | 3 | Keyword else |
| TIPO | 4 | Data types: int, float, bool |
| LLAVE_IZQ | 5 | Left brace { |
| LLAVE_DER | 6 | Right brace } |
| PAR_IZQ | 7 | Left parenthesis ( |
| PAR_DER | 8 | Right parenthesis ) |
| PUNTO_COMA | 9 | Semicolon ; |
| ASIGNACION | 10 | Assignment = |
| SUMA | 11 | Addition + |
| RESTA | 12 | Subtraction - |
| MULT | 13 | Multiplication * |
| DIV | 14 | Division / |
| MAYOR | 15 | Greater than > |
| MENOR | 16 | Less than < |
| IGUAL | 17 | Equality == |
| DIFERENTE | 18 | Not equal != or <> |
| MAYOR_IGUAL | 19 | Greater or equal >= |
| MENOR_IGUAL | 20 | Less or equal <= |
| Y_LOGICO | 21 | Logical AND && |
| O_LOGICO | 22 | Logical OR \|\| |
| NO_LOGICO | 23 | Logical NOT ! |
| ID | 24 | Identifier pattern |
| NUMERO | 25 | Number literal (int or float) |
| WS | 26 | Whitespace (skipped) |
| COMENTARIO | 27 | Single-line comment (skipped) |
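To make the table more concrete, the sketch below tokenizes a small subset of these token types with Python's `re` module. It is not the ANTLR-generated lexer: the token names come from the table, but every regular expression here is a guess at the corresponding lexical rule, and the `tokenize` function is hypothetical.

```python
import re

# Subset of the token names from the table; the patterns are assumptions.
# Longer operators are listed before their prefixes (e.g. ">=" before ">").
TOKEN_SPEC = [
    ("WS", r"\s+"),
    ("TIPO", r"\b(?:int|float|bool)\b"),
    ("NUMERO", r"\d+(?:\.\d+)?"),
    ("ID", r"[A-Za-z_]\w*"),
    ("MAYOR_IGUAL", r">="),
    ("MENOR_IGUAL", r"<="),
    ("IGUAL", r"=="),
    ("DIFERENTE", r"!=|<>"),
    ("ASIGNACION", r"="),
    ("MAYOR", r">"),
    ("MENOR", r"<"),
    ("SUMA", r"\+"),
    ("RESTA", r"-"),
    ("MULT", r"\*"),
    ("DIV", r"/"),
    ("PUNTO_COMA", r";"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))
SKIPPED = {"WS"}  # matched but not emitted, like the skipped tokens in the table


def tokenize(text):
    """Return a list of (token_name, lexeme) pairs for `text`."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at offset {pos}")
        if match.lastgroup not in SKIPPED:
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens
```

Calling `tokenize("int x = 3 + 4.5;")` yields one pair per lexeme, with the whitespace dropped. The ordering of `TOKEN_SPEC` matters because the first alternative that matches wins.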
Parser Rules
The parser implements 7 grammar rules defined in ExpresionesParser.py:
- RULE_root (0): Program entry point
- RULE_instrucciones (1): Instructions (declarations, assignments, conditionals)
- RULE_bloque (2): Code blocks
- RULE_declaracion (3): Variable declarations
- RULE_asignacion (4): Variable assignments
- RULE_condicion (5): Conditional expressions
- RULE_expr (6): Arithmetic expressions
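The expr rule is where operator precedence is decided. This page does not show how Expresiones.g encodes it, so the sketch below uses a hand-written recursive-descent evaluator instead of the generated parser, purely to illustrate "multiplication/division before addition/subtraction". All function names are hypothetical, and unary minus and error recovery are left out.

```python
import re


def evaluar(texto):
    """Evaluate +, -, *, / and parentheses with the usual precedence (sketch)."""
    tokens = re.findall(r"\d+(?:\.\d+)?|[-+*/()]", texto)
    pos = 0

    def actual():
        # Peek at the current token without consuming it.
        return tokens[pos] if pos < len(tokens) else None

    def avanzar():
        # Consume and return the current token.
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def factor():
        # Highest precedence: numbers and parenthesized sub-expressions.
        tok = avanzar()
        if tok == "(":
            valor = expr()
            avanzar()  # consume the closing ")"
            return valor
        return float(tok)

    def termino():
        # Middle level: * and /, left-associative.
        valor = factor()
        while actual() in ("*", "/"):
            if avanzar() == "*":
                valor *= factor()
            else:
                valor /= factor()
        return valor

    def expr():
        # Lowest precedence: + and -, left-associative.
        valor = termino()
        while actual() in ("+", "-"):
            if avanzar() == "+":
                valor += termino()
            else:
                valor -= termino()
        return valor

    return expr()
```

Each precedence level gets its own function, and a lower level only ever calls the next higher one, which is what makes `*` and `/` bind tighter than `+` and `-`. ANTLR 4 grammars commonly achieve the same effect through the order of alternatives in a single left-recursive rule.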
See individual component pages for detailed implementation information:
- Lexer - Tokenization process
- Parser - AST construction
- Visitor - Tree traversal and interpretation
- Symbol Table - Variable management