Three-phase bytecode compiler transforming AST to executable bytecode
The Arc compiler transforms a parsed AST into executable bytecode through a three-phase pipeline: Emit (AST → symbolic IR), Scope (resolve variables to local indices), and Resolve (convert labels to PC addresses).
Phase 1: Emit
Input: AST (typed expression/statement tree)
Output: List of EmitterOp (IR instructions + scope metadata)
File: src/arc/compiler/emit.gleam
Walks the AST recursively and emits symbolic IR.
Phase 2: Scope
Input: List of EmitterOp
Output: List of IrOp (scope markers consumed, locals assigned)
File: src/arc/compiler/scope.gleam
Resolves symbolic variable names to local slot indices:
IrScopeGetVar("x") → IrGetLocal(0) or IrGetGlobal("x")
IrScopePutVar("x") → IrPutLocal(0) or IrPutGlobal("x")
Identifies captured variables and boxes them for closure sharing
Consumes scope markers (no longer needed)
Example:
    // After scope resolution:
    [
      IrPushConst(2),  // JsUninitialized (const TDZ)
      IrPutLocal(0),   // x → local slot 0
      IrPushConst(0),  // 40
      IrPushConst(1),  // 2
      IrBinOp(Add),
      IrPutLocal(0)    // store to x
    ]
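The name-to-slot mapping described above can be sketched as a single fold over the op list. This is a minimal illustration, not the code in scope.gleam: the `Resolver` record, the `resolve_scope` and `step` functions, and the reduced op types are all hypothetical, and the sketch leaves out TDZ initialization, nested scopes, and boxing.

```gleam
import gleam/dict.{type Dict}
import gleam/list

// Hypothetical, reduced input ops: one scope marker plus symbolic accesses.
pub type EmitterOp {
  DeclareVar(name: String)
  ScopeGetVar(name: String)
  ScopePutVar(name: String)
}

// Hypothetical, reduced output ops with concrete slots or global names.
pub type IrOp {
  IrGetLocal(index: Int)
  IrPutLocal(index: Int)
  IrGetGlobal(name: String)
  IrPutGlobal(name: String)
}

// Resolver state: known slots, the next free slot, and output in reverse order.
type Resolver {
  Resolver(slots: Dict(String, Int), next_local: Int, out: List(IrOp))
}

fn emit(r: Resolver, op: IrOp) -> Resolver {
  Resolver(..r, out: [op, ..r.out])
}

fn step(r: Resolver, op: EmitterOp) -> Resolver {
  case op {
    // The marker is consumed: it assigns a slot and emits nothing here.
    DeclareVar(name) ->
      Resolver(
        ..r,
        slots: dict.insert(r.slots, name, r.next_local),
        next_local: r.next_local + 1,
      )
    // Declared names become local accesses, unknown names fall back to globals.
    ScopeGetVar(name) ->
      case dict.get(r.slots, name) {
        Ok(index) -> emit(r, IrGetLocal(index))
        Error(Nil) -> emit(r, IrGetGlobal(name))
      }
    ScopePutVar(name) ->
      case dict.get(r.slots, name) {
        Ok(index) -> emit(r, IrPutLocal(index))
        Error(Nil) -> emit(r, IrPutGlobal(name))
      }
  }
}

pub fn resolve_scope(ops: List(EmitterOp)) -> List(IrOp) {
  let start = Resolver(slots: dict.new(), next_local: 0, out: [])
  let done = list.fold(ops, start, step)
  list.reverse(done.out)
}
```

With this sketch, `resolve_scope([DeclareVar("x"), ScopePutVar("x"), ScopeGetVar("y")])` yields `[IrPutLocal(0), IrGetGlobal("y")]`: the declared name gets slot 0 and the undeclared name is treated as a global.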
Phase 3: Resolve
Input: List of IrOp (still has label IDs)
Output: FuncTemplate (bytecode array with absolute addresses)
File: src/arc/compiler/resolve.gleam
Two-pass label resolution:
Pass 1: Walk IR, build Dict(label_id → pc_address)
Pass 2: Replace IrJump(label) → Jump(pc), drop IrLabel
Produces final bytecode ready for VM execution.
Example:
    FuncTemplate(
      bytecode: [
        PushConst(2),  // PC 0
        PutLocal(0),   // PC 1
        PushConst(0),  // PC 2
        PushConst(1),  // PC 3
        BinOp(Add),    // PC 4
        PutLocal(0)    // PC 5
      ],
      constants: [JsNumber(40.0), JsNumber(2.0), JsUninitialized],
      local_count: 1
    )
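The example above contains no jumps, so the two passes are easier to see on IR that does. The following is a minimal sketch of two-pass label resolution under simplifying assumptions: the reduced `IrOp` and `Op` types and the `collect_labels`, `lookup`, and `resolve` functions are hypothetical stand-ins rather than the contents of resolve.gleam, and an undefined label is reported as `Error(label_id)`.

```gleam
import gleam/dict.{type Dict}
import gleam/list
import gleam/result

// Hypothetical, reduced IR: jumps still refer to label IDs.
pub type IrOp {
  IrLabel(id: Int)
  IrJump(label: Int)
  IrJumpIfFalse(label: Int)
  IrPushConst(index: Int)
}

// Hypothetical final bytecode: jumps carry absolute PC addresses.
pub type Op {
  Jump(pc: Int)
  JumpIfFalse(pc: Int)
  PushConst(index: Int)
}

// Pass 1: map each label ID to the PC of the instruction that follows it.
// Labels emit no bytecode, so they do not advance the PC counter.
fn collect_labels(ops: List(IrOp)) -> Dict(Int, Int) {
  let #(labels, _pc) =
    list.fold(ops, #(dict.new(), 0), fn(acc, op) {
      let #(labels, pc) = acc
      case op {
        IrLabel(id) -> #(dict.insert(labels, id, pc), pc)
        _ -> #(labels, pc + 1)
      }
    })
  labels
}

// Look up a label, reporting the missing label ID on failure.
fn lookup(labels: Dict(Int, Int), label: Int) -> Result(Int, Int) {
  dict.get(labels, label) |> result.replace_error(label)
}

// Pass 2: rewrite jumps to absolute addresses and drop the labels.
pub fn resolve(ops: List(IrOp)) -> Result(List(Op), Int) {
  let labels = collect_labels(ops)
  ops
  |> list.try_fold([], fn(acc, op) {
    case op {
      IrLabel(_) -> Ok(acc)
      IrPushConst(index) -> Ok([PushConst(index), ..acc])
      IrJump(label) ->
        result.map(lookup(labels, label), fn(pc) { [Jump(pc), ..acc] })
      IrJumpIfFalse(label) ->
        result.map(lookup(labels, label), fn(pc) { [JumpIfFalse(pc), ..acc] })
    }
  })
  |> result.map(list.reverse)
}
```

Two passes are needed because a forward jump (such as the jump to an else branch) refers to a label whose address is not known until the instructions before it have been counted.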
Statement emission

    fn emit_stmt(e: Emitter, stmt: ast.Statement) -> Result(Emitter, EmitError) {
      case stmt {
        ast.VariableDeclaration(kind, declarators) -> {
          // Emit scope markers + IR for each declarator
          list.try_fold(declarators, e, fn(e, decl) {
            let e = emit_op(e, DeclareVar(name, binding_kind))
            use e <- result.try(emit_expr(e, decl.init))
            Ok(emit_ir(e, IrScopePutVar(name)))
          })
        }
        ast.IfStatement(test, consequent, alternate) -> {
          use e <- result.try(emit_expr(e, test))
          let else_label = fresh_label(e)
          let end_label = fresh_label(e)
          let e = emit_ir(e, IrJumpIfFalse(else_label))
          use e <- result.try(emit_stmt(e, consequent))
          let e = emit_ir(e, IrJump(end_label))
          let e = emit_ir(e, IrLabel(else_label))
          // ... emit alternate ...
          Ok(emit_ir(e, IrLabel(end_label)))
        }
        // ... 30+ statement types
      }
    }

Expression emission

    fn emit_expr(e: Emitter, expr: ast.Expression) -> Result(Emitter, EmitError) {
      case expr {
        ast.NumberLiteral(n) -> {
          let #(e, idx) = add_constant(e, JsNumber(Finite(n)))
          Ok(emit_ir(e, IrPushConst(idx)))
        }
        ast.Identifier(name) -> {
          Ok(emit_ir(e, IrScopeGetVar(name)))
        }
        ast.BinaryExpression(op, left, right) -> {
          use e <- result.try(emit_expr(e, left))
          use e <- result.try(emit_expr(e, right))
          Ok(emit_ir(e, IrBinOp(op)))
        }
        ast.CallExpression(callee, args) -> {
          use e <- result.try(emit_expr(e, callee))
          use e <- result.try(emit_expr_list(e, args))
          Ok(emit_ir(e, IrCall(list.length(args))))
        }
        // ... 40+ expression types
      }
    }

Function compilation
Functions are compiled recursively, with each child function compiled while its parent is being emitted:

    fn emit_function_declaration(
      e: Emitter,
      name: String,
      params: List(ast.Pattern),
      body: List(ast.Statement),
    ) -> Result(Emitter, EmitError) {
      // Compile child function into a CompiledChild
      let #(e, child) = compile_child_function(e, params, body)
      // Add to functions list, get index
      let func_index = e.next_func
      let e = Emitter(..e, functions: [child, ..e.functions], next_func: func_index + 1)
      // Emit MakeClosure instruction
      let e = emit_ir(e, IrMakeClosure(func_index))
      // Bind to name
      Ok(emit_ir(e, IrScopePutVar(name)))
    }

File: src/arc/compiler/scope.gleam (300+ lines)
The scope resolver walks the EmitterOp list, tracks variable bindings, and resolves names to local indices.

    DeclareVar(name, kind) -> {
      let index = r.next_local
      let boxed = set.contains(r.captured_vars, name)
      let binding = Binding(index:, kind:, is_boxed: boxed)
      // Add to appropriate scope (var → function, let/const → current)
      let r = add_binding(r, name, binding)
      // Initialize slot:
      let r = emit(r, IrPushConst(uninit_idx))  // JsUninitialized
      let r = emit(r, IrPutLocal(index))
      // Box if captured:
      case boxed {
        True -> emit(r, IrBoxLocal(index))
        False -> r
      }
    }

3. Resolve references
Replace symbolic names with concrete operations:

    Ir(IrScopeGetVar(name)) -> {
      case lookup(r.scopes, name) {
        Some(Binding(index:, is_boxed: True, ..)) -> emit(r, IrGetBoxed(index))
        Some(Binding(index:, is_boxed: False, ..)) -> emit(r, IrGetLocal(index))
        None -> emit(r, IrGetGlobal(name))  // Not in local scope → global
      }
    }
    Ir(IrScopePutVar(name)) -> {
      case lookup(r.scopes, name) {
        Some(Binding(index:, is_boxed: True, ..)) -> emit(r, IrPutBoxed(index))
        Some(Binding(index:, is_boxed: False, ..)) -> emit(r, IrPutLocal(index))
        None -> emit(r, IrPutGlobal(name))
      }
    }

During scope resolution, emit boxing instructions:

    DeclareVar(name, kind) -> {
      let boxed = set.contains(r.captured_vars, name)
      // ... allocate local slot ...
      case boxed {
        True -> {
          // Wrap value in BoxSlot on heap:
          emit(r, IrBoxLocal(index))
          // Now locals[index] = JsObject(box_ref)
        }
        False -> r
      }
    }

3. Access boxed variables
Use GetBoxed/PutBoxed instead of GetLocal/PutLocal:

    case lookup(scopes, name) {
      Some(Binding(is_boxed: True, index:, ..)) -> {
        // Dereference the box:
        IrGetBoxed(index)  // Read locals[index] as box_ref, push slot.value
      }
    }

4. Share boxes with children
Child functions capture the box reference (not the value):
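Why capturing the reference matters can be shown with a small model. This is only an illustrative sketch, not Arc's heap or closure representation: the `Heap` alias and the `alloc_box`, `get_boxed`, `put_boxed`, and `demo` functions are hypothetical, and values are plain `Int`s instead of JS values.

```gleam
import gleam/dict.{type Dict}

// Hypothetical heap: box reference -> current value of the boxed variable.
pub type Heap =
  Dict(Int, Int)

// Allocate a box holding `value`; returns the updated heap and the new reference.
pub fn alloc_box(heap: Heap, value: Int) -> #(Heap, Int) {
  let box_ref = dict.size(heap)
  #(dict.insert(heap, box_ref, value), box_ref)
}

// Read through a box reference (the role played by GetBoxed).
pub fn get_boxed(heap: Heap, box_ref: Int) -> Result(Int, Nil) {
  dict.get(heap, box_ref)
}

// Write through a box reference (the role played by PutBoxed).
pub fn put_boxed(heap: Heap, box_ref: Int, value: Int) -> Heap {
  dict.insert(heap, box_ref, value)
}

// The parent boxes x = 1, the child captures the reference and writes 42,
// then the parent reads x again through its own slot.
pub fn demo() -> Result(Int, Nil) {
  let #(heap, parent_slot) = alloc_box(dict.new(), 1)
  // Capturing copies the reference held in the parent's slot, not the value 1.
  let child_capture = parent_slot
  // The child assigns to x through its captured reference.
  let heap = put_boxed(heap, child_capture, 42)
  // The parent observes the child's write because both refer to the same box.
  get_boxed(heap, parent_slot)
}
```

In this model `demo()` returns `Ok(42)`. Had the child captured the value 1 instead of the reference, its assignment would have been invisible to the parent.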