Pipeline Overview
Each stage is implemented as a query in the Rock-based build system, which allows lazy evaluation and memoization.
Stage 1: Lexing
Layout Rules
Elara uses significant whitespace (like Haskell). The lexer converts indentation into explicit braces and semicolons.
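The indentation-to-delimiter idea can be sketched as a small offside-rule pass. This is a simplified illustration, not Elara's actual lexer: the `layout` function and its `(indentation, text)` line representation are hypothetical, and the full Haskell-style layout algorithm has more cases (for example, blocks opened only after specific keywords).

```haskell
-- Simplified offside rule: indentation changes become explicit delimiters.
-- Hypothetical helper for illustration; not taken from the Elara sources.
-- Each input line is (indentation width, line text).
layout :: [(Int, String)] -> [String]
layout = go []
  where
    -- The stack holds the indentation of every open block, innermost first.
    go :: [Int] -> [(Int, String)] -> [String]
    go stack [] = map (const "}") stack -- close all blocks at end of input
    go stack ((n, txt) : rest) =
      case stack of
        -- The first line opens the top-level block.
        [] -> "{" : txt : go [n] rest
        (top : _)
          -- Deeper indentation opens a nested block.
          | n > top -> "{" : txt : go (n : stack) rest
          -- Same indentation starts a new item in the current block.
          | n == top -> ";" : txt : go stack rest
          -- Shallower indentation closes every deeper block first.
          | otherwise ->
              let (closed, remaining) = span (> n) stack
               in map (const "}") closed ++ (";" : txt : go remaining rest)
```

With placeholder line contents, `unwords (layout [(0, "outer1"), (2, "inner1"), (2, "inner2"), (0, "outer2")])` evaluates to `"{ outer1 { inner1 ; inner2 } ; outer2 }"`. The sketch assumes the first line has the smallest indentation and does not report misaligned dedents.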
Implementation: src/Elara/Lexer/Reader.hs
Query: LexedFile :: FilePath -> Query [Lexeme]
Errors: LexerError - Reports illegal characters, unterminated strings, etc.
Stage 2: Parsing
AST Construction
The token stream is parsed into the Frontend AST, which closely mirrors the source syntax:
Syntax Validation
The parser ensures:
- Balanced parentheses and brackets
- Valid declaration structure (def followed by let)
- Proper pattern match syntax
- Well-formed type signatures
Implementation: src/Elara/Parse/Module.hs, src/Elara/Parse/Expression.hs
Query: ParsedFile :: FilePath -> Query (Module Frontend)
Errors: WParseErrorBundle - Detailed parse errors with source locations
Stage 3: Desugaring
Implementation: src/Elara/Desugar.hs
Query: DesugaredModule :: ModuleName -> Query (Module Desugared)
Errors: DesugarError - Mismatched declarations, duplicate definitions
Stage 4: Renaming
Implementation: src/Elara/Rename.hs, src/Elara/Rename/Imports.hs
Query: RenamedModule :: ModuleName -> Query (Module Renamed)
Errors: RenameError - Undefined names, ambiguous imports, circular imports
Stage 5: Shunting
Implementation: src/Elara/Shunt.hs, src/Elara/Shunt/Operator.hs
Query: ModuleByName @Shunted :: ModuleName -> Query (Module Shunted)
Errors: ShuntError - Invalid operator precedence, undefined operators
Warnings: ShuntWarning - Precedence ambiguities
Stage 6: Type Checking
Constraint Generation
Type constraints are generated from the AST using Algorithm W. For an identity function id with parameter x, this generates:
- x : α
- id : α -> α
Constraint Solving
Constraints are unified using Robinson’s unification algorithm:
- Substitution-based unification
- Occurs check to prevent infinite types
- Type error reporting with source locations
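The substitution-based unification and occurs check listed above can be sketched as follows. This is a minimal textbook version of Robinson's algorithm, not the code in src/Elara/TypeInfer.hs: the `Type` constructors, the `Subst` representation, and all function names are assumptions made for illustration, and errors are plain strings rather than located diagnostics.

```haskell
import Data.Maybe (fromMaybe)

-- Hypothetical type representation: variables, constants, and functions.
data Type
  = TVar String
  | TCon String
  | TFun Type Type
  deriving (Eq, Show)

-- A substitution maps type variable names to types.
type Subst = [(String, Type)]

-- Replace every bound variable in a type.
apply :: Subst -> Type -> Type
apply s (TVar v) = fromMaybe (TVar v) (lookup v s)
apply _ (TCon c) = TCon c
apply s (TFun a b) = TFun (apply s a) (apply s b)

-- compose s2 s1 behaves like applying s1 first, then s2.
compose :: Subst -> Subst -> Subst
compose s2 s1 = [(v, apply s2 t) | (v, t) <- s1] ++ s2

-- Occurs check: does variable v appear inside the type?
occurs :: String -> Type -> Bool
occurs v (TVar w) = v == w
occurs _ (TCon _) = False
occurs v (TFun a b) = occurs v a || occurs v b

-- Bind a variable to a type, rejecting infinite types such as a ~ a -> Int.
bind :: String -> Type -> Either String Subst
bind v t
  | t == TVar v = Right []
  | occurs v t = Left ("occurs check failed: " ++ v ++ " in " ++ show t)
  | otherwise = Right [(v, t)]

-- Compute a most general unifier, or report why none exists.
unify :: Type -> Type -> Either String Subst
unify (TVar v) t = bind v t
unify t (TVar v) = bind v t
unify (TCon a) (TCon b)
  | a == b = Right []
unify (TFun a1 b1) (TFun a2 b2) = do
  s1 <- unify a1 a2
  -- Apply what was learned from the arguments before unifying the results.
  s2 <- unify (apply s1 b1) (apply s1 b2)
  Right (compose s2 s1)
unify t1 t2 = Left ("cannot unify " ++ show t1 ++ " with " ++ show t2)
```

For example, `unify (TFun (TVar "a") (TVar "a")) (TFun (TCon "Int") (TVar "b"))` yields `Right [("a",TCon "Int"),("b",TCon "Int")]`, while `unify (TVar "a") (TFun (TVar "a") (TCon "Int"))` returns a `Left` because the occurs check fails.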
Implementation: src/Elara/TypeInfer.hs, src/Elara/TypeInfer/ConstraintGeneration.hs
Query: TypeCheckedModule :: ModuleName -> Query (Module Typed)
Errors: Type mismatches, occurs check failures, infinite types, missing IO annotations
Type Inference Example
- f : a -> b
- ls : [a]
- x : a, xs : [a]
- f x : b
- map f xs : [b]
- Result: [b] ✓
Stage 7: ToCore
Core AST Construction
The Typed AST is converted to Core, a minimal typed lambda calculus with only 8 constructors:
- Var - Variables
- Lam - Lambda abstraction
- App - Function application
- Let - Let binding
- Case - Pattern matching
- Lit - Literals
- Type - Type abstraction
- Cast - Type coercion
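To make the shape of such a small IR concrete, here is a sketch of a data type with the same eight constructor names, plus a free-variable traversal of the kind later passes typically need. Only the constructor names come from the list above; every field (binder names as `String`, case alternatives as triples, types as `String`) is an assumption and will differ from the real definitions in the Elara sources.

```haskell
import Data.List (nub)

-- Hypothetical Core-like expression type; field shapes are assumed.
data Expr
  = Var String                           -- variable reference
  | Lam String Expr                      -- lambda: binder and body
  | App Expr Expr                        -- function application
  | Let String Expr Expr                 -- non-recursive let: binder, rhs, body
  | Case Expr [(String, [String], Expr)] -- scrutinee; (constructor, bound vars, body)
  | Lit Integer                          -- literal (integers only in this sketch)
  | Type String                          -- type-level node; payload is a guess
  | Cast Expr String                     -- coercion to a named type
  deriving (Eq, Show)

-- Collect the free variables of an expression, without duplicates.
freeVars :: Expr -> [String]
freeVars expr = nub (go expr)
  where
    go (Var x) = [x]
    go (Lam x body) = filter (/= x) (go body)
    go (App f a) = go f ++ go a
    go (Let x rhs body) = go rhs ++ filter (/= x) (go body)
    go (Case scrut alts) =
      go scrut ++ concat [filter (`notElem` vs) (go body) | (_, vs, body) <- alts]
    go (Lit _) = []
    go (Type _) = []
    go (Cast e _) = go e
```

With only eight cases, a traversal like `freeVars` stays short, which is the usual argument for keeping a core language minimal. For instance, `freeVars (Lam "x" (App (Var "f") (Var "x")))` evaluates to `["f"]`.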
Pattern Match Compilation
Pattern matches are compiled to efficient decision trees: each match becomes a tree of case expressions that avoids redundant tests.
Implementation: src/Elara/ToCore.hs, src/Elara/ToCore/Match.hs
Query: GetCoreModule :: ModuleName -> Query (CoreModule CoreBind)
Errors: Pattern match compilation errors, exhaustiveness checking failures
Stage 8: CoreToCore
Implementation: src/Elara/CoreToCore.hs, src/Elara/Core/ToANF.hs, src/Elara/Core/LiftClosures.hs
Queries:
- GetANFCoreModule :: ModuleName -> Query (CoreModule ANFBind)
- GetClosureLiftedModule :: ModuleName -> Query (CoreModule ANFBind)
- GetFinalisedCoreModule :: ModuleName -> Query (CoreModule CoreBind)
Errors: ClosureLiftError - Closure conversion failures
Stage 9: Emitting
JVM IR Generation
Core is lowered to JVM IR, an intermediate representation closer to JVM bytecode:
- Functions → methods
- Lambdas → synthetic classes implementing the Func interface
- Pattern matches → switch statements and instanceof checks
- Algebraic data types → Java classes with inheritance
Bytecode Emission
JVM IR is converted to actual JVM bytecode:
- Method bodies → bytecode instructions
- Type information → JVM signatures
- Constants → constant pool entries
Implementation: src/Elara/JVM/Lower.hs, src/Elara/JVM/Emit.hs
Queries:
- GetJVMIRModule :: ModuleName -> Query IR.Module
- GetJVMClassFiles :: ModuleName -> Query [ClassFile]
- GetJVMClassBytes :: ModuleName -> Query [(FilePath, ByteString)]
Errors: JVMLoweringError, CodeConverterError - JVM compilation failures
JVM Backend Details
Elara compiles to Java 8+ bytecode with the following conventions:
- Functions: Static methods in module classes
- Closures: Classes implementing the Elara.Func interface
- Data constructors: Subclasses with fields
- Pattern matching: Visitor pattern with instanceof checks
- IO: Java methods with side effects
Intermediate Output
Use --dump flags to inspect intermediate representations.
Performance Notes
Compilation Speed
- Lexing: Very fast (~1ms for 1000 lines)
- Parsing: Fast (~5ms for 1000 lines)
- Type checking: Moderate (~50ms for complex code)
- Code generation: Fast (~10ms per module)
Optimization Passes
The CoreToCore stage can be expensive for large modules. Future versions may include:
- Parallel query execution
- Incremental type checking
- Separate compilation
- Bytecode caching
Related Pages
- Compiler Architecture: High-level overview of compiler design
- CLI Reference: Command-line options for compilation