Skip to main content
The Elara compiler transforms source code through 9 distinct passes, each building on the previous stage to ultimately produce JVM bytecode or interpret the program directly.

Pipeline Overview

Each stage is implemented as a query in the Rock-based build system, allowing for lazy evaluation and memoization:

Stage 1: Lexing

1

Tokenization

The source code is converted into a stream of tokens:
def factorial : Int -> Int
let factorial n = if n == 0 then 1 else n * factorial (n - 1)
Becomes:
DEF, IDENTIFIER("factorial"), COLON, IDENTIFIER("Int"), ARROW, IDENTIFIER("Int")
LET, IDENTIFIER("factorial"), IDENTIFIER("n"), EQUALS, IF, ...
2

Layout Rules

Elara uses significant whitespace (like Haskell). The lexer converts indentation into explicit braces and semicolons:
let main =
    print "Hello"
    print "World"
Becomes:
let main = { print "Hello"; print "World" }
3

Comment Removal

Single-line (--) and multi-line ({- -}) comments are stripped out.
Implementation: src/Elara/Lexer/Reader.hsQuery: LexedFile :: FilePath -> Query [Lexeme]Errors: LexerError - Reports illegal characters, unterminated strings, etc.

Stage 2: Parsing

1

AST Construction

The token stream is parsed into the Frontend AST, which closely mirrors the source syntax:
data Module Frontend = Module
  { name :: ModuleName
  , declarations :: [Declaration Frontend]
  , imports :: [Import]
  }
2

Syntax Validation

The parser ensures:
  • Balanced parentheses and brackets
  • Valid declaration structure (def followed by let)
  • Proper pattern match syntax
  • Well-formed type signatures
3

Error Recovery

Modern error recovery allows parsing to continue after errors, reporting multiple issues at once.
Implementation: src/Elara/Parse/Module.hs, src/Elara/Parse/Expression.hsQuery: ParsedFile :: FilePath -> Query (Module Frontend)Errors: WParseErrorBundle - Detailed parse errors with source locations

Example: Frontend AST

def factorial : Int -> Int
let factorial n = if n == 0 then 1 else n * factorial (n - 1)
Produces:
ValueDeclaration
  { name = "factorial"
  , type' = Just (FunctionType Int Int)
  , value = Lambda "n" (
      If (BinaryOp "==" (Var "n") (Int 0))
         (Int 1)
         (BinaryOp "*" (Var "n") (App (Var "factorial") (BinaryOp "-" (Var "n") (Int 1))))
    )
  }

Stage 3: Desugaring

1

Lambda Currying

Multi-argument lambdas become nested single-argument lambdas:
\x y -> x + y
Becomes:
\x -> \y -> x + y
2

Let to Lambda

Let bindings with parameters are converted to lambda expressions:
let add x y = x + y
Becomes:
let add = \x -> \y -> x + y
3

Declaration Merging

Declarations with the same name (type signature + definition) are merged into a single declaration:
def factorial : Int -> Int
let factorial n = ...
Becomes a single ValueDeclaration with both type and implementation.
Implementation: src/Elara/Desugar.hsQuery: DesugaredModule :: ModuleName -> Query (Module Desugared)Errors: DesugarError - Mismatched declarations, duplicate definitions

Stage 4: Renaming

1

Name Resolution

All names are resolved to fully qualified names:
map f xs  -- Becomes: Elara.Prelude.map f xs
2

Unique Name Generation

Local variables get unique identifiers to avoid shadowing:
let f x = let x = 5 in x + x
Becomes:
let f x#1 = let x#2 = 5 in x#2 + x#2
3

Import Processing

Imports are resolved and the module’s namespace is populated.
Implementation: src/Elara/Rename.hs, src/Elara/Rename/Imports.hsQuery: RenamedModule :: ModuleName -> Query (Module Renamed)Errors: RenameError - Undefined names, ambiguous imports, circular imports

Stage 5: Shunting

1

Operator Precedence

Binary operators are reassociated according to their precedence:
1 + 2 * 3  -- Becomes: 1 + (2 * 3)
2

Operator Desugaring

All operators become prefix function calls:
x + y  -- Becomes: (+) x y
3

Custom Operators

User-defined operators with custom fixity are handled:
infixl 6 +$+
let (+$+) x y = ...
Implementation: src/Elara/Shunt.hs, src/Elara/Shunt/Operator.hsQuery: ModuleByName @Shunted :: ModuleName -> Query (Module Shunted)Errors: ShuntError - Invalid operator precedence, undefined operatorsWarnings: ShuntWarning - Precedence ambiguities

Stage 6: Type Checking

1

Constraint Generation

Type constraints are generated from the AST using Algorithm W:
let id x = x
Generates:
  • x : α
  • id : α -> α
2

Constraint Solving

Constraints are unified using Robinson’s unification algorithm:
  • Substitution-based unification
  • Occurs check to prevent infinite types
  • Type error reporting with source locations
3

Generalization

Polymorphic types are generalized:
let id x = x  -- Inferred type: ∀a. a -> a
4

Effect Tracking

IO effects are tracked in the type system:
print : String -> IO ()
Implementation: src/Elara/TypeInfer.hs, src/Elara/TypeInfer/ConstraintGeneration.hsQuery: TypeCheckedModule :: ModuleName -> Query (Module Typed)Errors: Type mismatches, occurs check failures, infinite types, missing IO annotations

Type Inference Example

def map : (a -> b) -> [a] -> [b]
let map f ls =
    match ls with
        [] -> []
        x::xs -> f x :: map f xs
Type checking verifies:
  1. f : a -> b
  2. ls : [a]
  3. x : a, xs : [a]
  4. f x : b
  5. map f xs : [b]
  6. Result: [b]

Stage 7: ToCore

1

Core AST Construction

The Typed AST is converted to Core, a minimal typed lambda calculus with only 8 constructors:
  1. Var - Variables
  2. Lam - Lambda abstraction
  3. App - Function application
  4. Let - Let binding
  5. Case - Pattern matching
  6. Lit - Literals
  7. Type - Type abstraction
  8. Cast - Type coercion
2

Pattern Match Compilation

Pattern matches are compiled to efficient decision trees:
match x with
  (a, 0) -> a
  (0, b) -> b
  (a, b) -> a + b
Becomes a tree of case expressions that avoid redundant tests.
3

Extensive Desugaring

All high-level constructs are desugared:
  • List syntax → cons cells
  • String literals → character lists
  • Do notation → bind operations
  • Multi-way if → nested if-then-else
Implementation: src/Elara/ToCore.hs, src/Elara/ToCore/Match.hsQuery: GetCoreModule :: ModuleName -> Query (CoreModule CoreBind)Errors: Pattern match compilation errors, exhaustiveness checking failures

Example: Core IR

let factorial n = if n == 0 then 1 else n * factorial (n - 1)
Becomes:
Let factorial (Lam n
  (Case (App (App (Var (==)) (Var n)) (Lit 0))
    [ (True, Lit 1)
    , (False, App (App (Var (*)) (Var n))
                  (App (Var factorial) (App (App (Var (-)) (Var n)) (Lit 1))))
    ]))

Stage 8: CoreToCore

1

A-Normal Form (ANF)

Core is converted to ANF where all intermediate values are named:
f (g x) (h y)
Becomes:
let a = g x in
let b = h y in
f a b
2

Closure Lifting

Nested functions are lifted to top-level with explicit environment passing:
let outer x =
  let inner y = x + y
  in inner
Becomes:
let inner env y = env.x + y
let outer x = inner {x}
3

Optimizations

Various Core-to-Core transformations:
  • Dead code elimination
  • Constant folding
  • Beta reduction
  • Eta reduction
  • Inlining small functions
Implementation: src/Elara/CoreToCore.hs, src/Elara/Core/ToANF.hs, src/Elara/Core/LiftClosures.hsQueries:
  • GetANFCoreModule :: ModuleName -> Query (CoreModule ANFBind)
  • GetClosureLiftedModule :: ModuleName -> Query (CoreModule ANFBind)
  • GetFinalisedCoreModule :: ModuleName -> Query (CoreModule CoreBind)
Errors: ClosureLiftError - Closure conversion failures

Stage 9: Emitting

1

JVM IR Generation

Core is lowered to JVM IR, an intermediate representation closer to JVM bytecode:
  • Functions → methods
  • Lambdas → synthetic classes implementing Func interface
  • Pattern matches → switch statements and instanceof checks
  • Algebraic data types → Java classes with inheritance
2

Bytecode Emission

JVM IR is converted to actual JVM bytecode:
  • Method bodies → bytecode instructions
  • Type information → JVM signatures
  • Constants → constant pool entries
3

Class File Writing

Class files are written to the build/ directory:
build/
└── Main.class
Implementation: src/Elara/JVM/Lower.hs, src/Elara/JVM/Emit.hsQueries:
  • GetJVMIRModule :: ModuleName -> Query IR.Module
  • GetJVMClassFiles :: ModuleName -> Query [ClassFile]
  • GetJVMClassBytes :: ModuleName -> Query [(FilePath, ByteString)]
Errors: JVMLoweringError, CodeConverterError - JVM compilation failures

JVM Backend Details

Elara compiles to Java 8+ bytecode with the following conventions:
  • Functions: Static methods in module classes
  • Closures: Classes implementing Elara.Func interface
  • Data constructors: Subclasses with fields
  • Pattern matching: Visitor pattern with instanceof checks
  • IO: Java methods with side effects

Intermediate Output

Use --dump flags to inspect intermediate representations:
elara build Main.elara --dump=parsed,core,jvm
Generates:
build/
├── Main.parsed.elr      # Frontend AST
├── Main.core.elr        # Core IR
├── Main.jvm.ir.elr      # JVM IR
└── Main.classfile.txt   # JVM bytecode disassembly
Dumping intermediate stages is invaluable for debugging compiler issues and understanding how your code is transformed.

Performance Notes

Compilation Speed

  • Lexing: Very fast (~1ms for 1000 lines)
  • Parsing: Fast (~5ms for 1000 lines)
  • Type checking: Moderate (~50ms for complex code)
  • Code generation: Fast (~10ms per module)

Optimization Passes

The CoreToCore stage can be expensive for large modules. Future versions may include:
  • Parallel query execution
  • Incremental type checking
  • Separate compilation
  • Bytecode caching

Compiler Architecture

High-level overview of compiler design

CLI Reference

Command-line options for compilation

Build docs developers (and LLMs) love