Skip to main content
Transform Abstract Syntax Trees into compact bytecode and execute them in a stack-based virtual machine. This is the core execution engine that brings JavaScript programs to life.

Overview

Bytecode is an intermediate representation between high-level source code and machine code. A virtual machine (VM) interprets bytecode instructions to execute programs.
Bytecode execution provides a balance between interpretation speed and implementation complexity, making it ideal for dynamic languages like JavaScript.

Bytecode design and instruction set

A well-designed instruction set is crucial for efficient VM execution. Each instruction should be simple, compact, and map directly to common operations.

Instruction format

Bytecode instructions typically use a fixed or variable-length format:
enum class Opcode : uint8_t {
    // Stack operations
    PUSH_CONSTANT,    // Push constant to stack
    POP,              // Pop value from stack
    
    // Arithmetic operations
    ADD,              // Pop two values, push sum
    SUB,              // Pop two values, push difference
    MUL,              // Pop two values, push product
    DIV,              // Pop two values, push quotient
    MOD,              // Pop two values, push remainder
    
    // Comparison operations
    EQ,               // Equal
    NEQ,              // Not equal
    LT,               // Less than
    GT,               // Greater than
    LTE,              // Less than or equal
    GTE,              // Greater than or equal
    
    // Logical operations
    AND,              // Logical AND
    OR,               // Logical OR
    NOT,              // Logical NOT
    
    // Variable operations
    LOAD_VAR,         // Load variable to stack
    STORE_VAR,        // Store stack top to variable
    LOAD_GLOBAL,      // Load global variable
    STORE_GLOBAL,     // Store to global variable
    
    // Control flow
    JUMP,             // Unconditional jump
    JUMP_IF_FALSE,    // Jump if stack top is false
    JUMP_IF_TRUE,     // Jump if stack top is true
    
    // Function operations
    CALL,             // Call function
    RETURN,           // Return from function
    
    // Object operations
    NEW_OBJECT,       // Create new object
    GET_PROPERTY,     // Get object property
    SET_PROPERTY,     // Set object property
    
    // Array operations
    NEW_ARRAY,        // Create new array
    GET_ELEMENT,      // Get array element
    SET_ELEMENT,      // Set array element
    
    // Special
    HALT              // Stop execution
};

struct Instruction {
    Opcode opcode;
    uint32_t operand;  // Optional operand
};

Instruction categories

Instructions for managing the operand stack:
  • PUSH_CONSTANT: Push a constant value onto the stack
  • POP: Remove the top value from the stack
  • DUP: Duplicate the top stack value
  • SWAP: Swap the top two stack values
Binary and unary operations:
  • ADD, SUB, MUL, DIV, MOD: Arithmetic operations
  • EQ, NEQ, LT, GT, LTE, GTE: Comparison operations
  • AND, OR, NOT: Logical operations
  • NEG: Unary negation
Loading and storing variables:
  • LOAD_VAR, STORE_VAR: Local variable access
  • LOAD_GLOBAL, STORE_GLOBAL: Global variable access
  • LOAD_UPVALUE, STORE_UPVALUE: Closure variable access
Branching and jumping:
  • JUMP: Unconditional jump to offset
  • JUMP_IF_FALSE: Conditional jump (for if statements, loops)
  • JUMP_IF_TRUE: Conditional jump (for logical OR)
  • LOOP: Jump backward (for loop constructs)
Function invocation and returns:
  • CALL: Call function with N arguments
  • RETURN: Return from function with value
  • CLOSURE: Create closure with upvalues

Compiling AST to bytecode

The compiler traverses the AST and emits bytecode instructions:
class Compiler : public ASTVisitor {
private:
    BytecodeBuffer& bytecode;
    SymbolTable& symbols;
    
public:
    Compiler(BytecodeBuffer& bc, SymbolTable& syms)
        : bytecode(bc), symbols(syms) {}
    
    void visit(BinaryExpression& node) override {
        // Compile left operand
        node.left->accept(*this);
        
        // Compile right operand
        node.right->accept(*this);
        
        // Emit operator instruction
        if (node.op == "+") {
            bytecode.emitOpcode(Opcode::ADD);
        } else if (node.op == "-") {
            bytecode.emitOpcode(Opcode::SUB);
        } else if (node.op == "*") {
            bytecode.emitOpcode(Opcode::MUL);
        } else if (node.op == "/") {
            bytecode.emitOpcode(Opcode::DIV);
        }
        // ... more operators
    }
    
    void visit(Literal& node) override {
        // Add constant to constant pool
        uint32_t index = bytecode.addConstant(node.value);
        
        // Emit PUSH_CONSTANT instruction
        bytecode.emitOpcode(Opcode::PUSH_CONSTANT);
        bytecode.emitOperand(index);
    }
    
    void visit(Identifier& node) override {
        // Look up variable in symbol table
        Symbol* sym = symbols.resolve(node.name);
        
        if (sym->isGlobal()) {
            bytecode.emitOpcode(Opcode::LOAD_GLOBAL);
        } else {
            bytecode.emitOpcode(Opcode::LOAD_VAR);
        }
        bytecode.emitOperand(sym->index);
    }
};
Jump offsets must be carefully calculated and patched after emitting forward jumps, as the target location isn’t known until the code is generated.

Stack-based VM execution

The virtual machine maintains an operand stack and executes instructions sequentially.

VM architecture

1

Instruction pointer

Tracks the current position in bytecode, advancing after each instruction.
2

Operand stack

Holds intermediate values during computation. Operations pop operands and push results.
3

Call frames

Maintains function call context including return address and local variables.
4

Global environment

Stores global variables and functions accessible from anywhere.

VM implementation

class VirtualMachine {
private:
    std::vector<Value> stack;
    std::vector<CallFrame> frames;
    std::unordered_map<std::string, Value> globals;
    
    const uint8_t* ip;  // Instruction pointer
    
    void push(const Value& value) {
        stack.push_back(value);
    }
    
    Value pop() {
        Value value = stack.back();
        stack.pop_back();
        return value;
    }
    
    Value peek(int distance = 0) {
        return stack[stack.size() - 1 - distance];
    }
    
    uint32_t readOperand() {
        uint32_t operand = (ip[0] << 24) | (ip[1] << 16) | 
                          (ip[2] << 8) | ip[3];
        ip += 4;
        return operand;
    }
    
public:
    void execute(const BytecodeBuffer& bytecode);
};
Stack-based VMs are simpler to implement than register-based VMs but may require more instructions for complex operations.

Optimization techniques

Instruction dispatch optimization

Different techniques can speed up instruction dispatch:
Traditional approach using a switch statement. Simple but may not be optimally compiled.
switch (opcode) {
    case Opcode::ADD: /* ... */ break;
    case Opcode::SUB: /* ... */ break;
    // ...
}

Value representation

Efficient value representation is critical for performance:
// Use IEEE 754 double NaN space to encode types
union Value {
    double number;
    uint64_t bits;
    
    // Use quiet NaN range for non-number types
    static constexpr uint64_t QNAN = 0x7ffc000000000000;
    static constexpr uint64_t SIGN_BIT = 0x8000000000000000;
    
    bool isNumber() const {
        return (bits & QNAN) != QNAN;
    }
    
    bool isObject() const {
        return (bits & (QNAN | SIGN_BIT)) == (QNAN | SIGN_BIT);
    }
    
    Object* asObject() const {
        return reinterpret_cast<Object*>(bits & ~(QNAN | SIGN_BIT));
    }
};

Performance considerations

Inline caching

Cache property lookups and method calls for faster repeated access

Constant folding

Evaluate constant expressions at compile time rather than runtime

Dead code elimination

Remove unreachable code to reduce bytecode size

Peephole optimization

Replace instruction sequences with more efficient equivalents
Premature optimization can lead to more complex code without significant benefits. Profile first, then optimize hot paths.

Next steps

Closures and prototypes

Implement closures with environment chains and prototype-based inheritance

Parsing and AST

Learn about lexical analysis, tokenization, and AST construction

Build docs developers (and LLMs) love