Semantic Analysis

Semantic analysis is the process of understanding the meaning and relationships within code beyond its syntax. While the parser produces an AST representing the code’s structure, the semantic analyzer builds additional data structures that capture scoping, symbol resolution, and control flow.

Overview

Oxc’s semantic analyzer performs comprehensive analysis of JavaScript and TypeScript programs, building:

Symbol Table: All declared identifiers (variables, functions, classes) with their properties
Scope Tree: Nested scopes following ECMAScript scoping rules
Reference Graph: Links between identifier uses and their declarations

Semantic analysis bridges the gap between syntax and meaning. It lets linters detect problems such as undefined variables, and lets transformers rename identifiers safely.

Analysis Pipeline

The semantic analyzer runs after parsing:

use oxc_allocator::Allocator;
use oxc_parser::Parser;
use oxc_semantic::SemanticBuilder;
use oxc_span::SourceType;

let allocator = Allocator::default();
let source_text = r#"
    const x = 42;
    function foo() {
        console.log(x);
        const y = x + 1;
        return y;
    }
"#;

let source_type = SourceType::default();

// Step 1: Parse
let ret = Parser::new(&allocator, source_text, source_type).parse();
let program = ret.program;

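// Step 2: Semantic analysis
// `build` returns a `SemanticBuilderReturn`: the analysis result is in its
// `semantic` field and any diagnostics are in its `errors` field.
let semantic = SemanticBuilder::new()
    .with_check_syntax_error(true)  // Enable additional syntax checks
    .with_cfg(true)                 // Build control flow graph (optional)
    .build(&program);

// The semantic data is now populated
let scoping = semantic.semantic.scoping();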

The analyzer produces:

Symbol table: What was declared and where
Scope tree: Lexical scope hierarchy
References: Links from uses to declarations
Control flow graph (optional): Program flow analysis

Scope Tree

The scope tree represents the nested lexical scopes in your code.

Scope Types

Different language constructs create different types of scopes:

// Program scope (top-level)
const global = 1;

// Function scope
function outer() {
    var functionScoped = 2;
    
    // Block scope
    if (true) {
        let blockScoped = 3;
        const alsoBlockScoped = 4;
    }
    
    // Another function scope
    function inner() {
        let nested = 5;
    }
}

// Class scope
class MyClass {
    method() {
        let methodScope = 6;
    }
}

Each scope knows:

Its parent scope
What bindings (variables) it declares
Its scope flags (function, block, strict mode, etc.)

Accessing Scope Information

use oxc_syntax::scope::{ScopeFlags, ScopeId};

let scoping = semantic.semantic.scoping();

// Iterate over all scopes
for scope_id in scoping.scope_ids() {
    let flags = scoping.get_flags(scope_id);
    
    println!("Scope {:?}:", scope_id);
    println!("  Is function scope: {}", flags.contains(ScopeFlags::Function));
    println!("  Is strict mode: {}", flags.contains(ScopeFlags::StrictMode));
    
    // Get parent scope
    if let Some(parent_id) = scoping.get_parent_id(scope_id) {
        println!("  Parent: {:?}", parent_id);
    }
    
    // Get bindings in this scope
    if let Some(bindings) = scoping.get_bindings(scope_id) {
        for (name, symbol_id) in bindings {
            println!("  Binding: {} -> {:?}", name, symbol_id);
        }
    }
}

Scope Hierarchy Example

For this code:

const x = 1;
function foo() {
    const y = 2;
    if (true) {
        const z = 3;
    }
}

The scope tree looks like:

Program Scope (id: 0)
  └─ bindings: { x -> Symbol(0), foo -> Symbol(1) }
  └─ child:
      Function Scope (id: 1) [for foo]
        └─ bindings: { y -> Symbol(2) }
        └─ child:
            Block Scope (id: 2) [for if]
              └─ bindings: { z -> Symbol(3) }
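
The hierarchy above can be mimicked with a toy model. The sketch below is not Oxc's implementation: `ToyScopes`, `add_scope`, and `ancestors` are hypothetical names, and the only assumption taken from the text is that each scope records its parent and its own bindings.

```rust
// Toy model of a scope tree stored as flat lists (hypothetical, not Oxc's types).
struct ToyScopes {
    parents: Vec<Option<usize>>,      // parent scope index, None for the program scope
    bindings: Vec<Vec<&'static str>>, // names declared directly in each scope
}

impl ToyScopes {
    fn new() -> Self {
        ToyScopes { parents: Vec::new(), bindings: Vec::new() }
    }

    // Adds a scope and returns its index, which serves as its ID.
    fn add_scope(&mut self, parent: Option<usize>, names: &[&'static str]) -> usize {
        self.parents.push(parent);
        self.bindings.push(names.to_vec());
        self.parents.len() - 1
    }

    // Walks from `scope` up to the program scope, innermost first.
    fn ancestors(&self, scope: usize) -> Vec<usize> {
        let mut chain = vec![scope];
        let mut current = scope;
        while let Some(parent) = self.parents[current] {
            chain.push(parent);
            current = parent;
        }
        chain
    }
}

// Builds the tree for the example: program { x, foo } > foo { y } > if-block { z }.
fn example_tree() -> ToyScopes {
    let mut scopes = ToyScopes::new();
    let program = scopes.add_scope(None, &["x", "foo"]);
    let function = scopes.add_scope(Some(program), &["y"]);
    scopes.add_scope(Some(function), &["z"]);
    scopes
}
```

Calling `example_tree().ancestors(2)` yields the chain block, function, program, which is the order a lookup starting inside the `if` block would visit.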

Symbol Table

The symbol table records every declared identifier in the program.

Symbol Properties

Each symbol has:

Name: The identifier name
Span: Source location of declaration
Flags: Type of declaration (var, let, const, function, class, etc.)
Scope: Which scope it belongs to
References: All places it’s used

use oxc_syntax::symbol::{SymbolFlags, SymbolId};

let scoping = semantic.semantic.scoping();

// Iterate over all symbols
for symbol_id in scoping.symbol_ids() {
    let name = scoping.symbol_name(symbol_id);
    let flags = scoping.symbol_flags(symbol_id);
    let span = scoping.symbol_span(symbol_id);
    let scope_id = scoping.symbol_scope(symbol_id);
    
    println!("Symbol: {}", name);
    println!("  ID: {:?}", symbol_id);
    println!("  Flags: {:?}", flags);
    println!("  Declared at: {:?}", span);
    println!("  Scope: {:?}", scope_id);
    
    // Check symbol type
    if flags.contains(SymbolFlags::BlockScopedVariable) {
        println!("  This is a let/const variable");
    }
    if flags.contains(SymbolFlags::Function) {
        println!("  This is a function");
    }
}

Symbol Flags

Common symbol flags:

SymbolFlags::BlockScopedVariable - let or const
SymbolFlags::FunctionScopedVariable - var
SymbolFlags::Function - Function declaration
SymbolFlags::Class - Class declaration
SymbolFlags::CatchVariable - Catch clause parameter
SymbolFlags::ConstVariable - const declaration
SymbolFlags::Import - Import binding
SymbolFlags::Export - Export declaration

These flags help tools understand how identifiers behave. For example, a linter checking for variable reassignment needs to know if something is declared with const.
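
As a rough illustration of the const check mentioned above, the sketch below models flags as plain bits. The constants, `ToySymbol`, and `const_reassignments` are hypothetical and do not match the values or API of Oxc's `SymbolFlags`; the write count stands in for what a real tool would derive from write references.

```rust
// Hypothetical flag bits; Oxc's real `SymbolFlags` has more variants and other values.
const BLOCK_SCOPED: u8 = 1 << 0;    // let or const
const CONST_VARIABLE: u8 = 1 << 1;  // const only
const FUNCTION_SCOPED: u8 = 1 << 2; // var

struct ToySymbol {
    name: &'static str,
    flags: u8,
    write_count: usize, // number of write references after the declaration
}

// Returns the names of const symbols that are written to after declaration.
fn const_reassignments(symbols: &[ToySymbol]) -> Vec<&'static str> {
    symbols
        .iter()
        .filter(|s| (s.flags & CONST_VARIABLE) != 0 && s.write_count > 0)
        .map(|s| s.name)
        .collect()
}

fn example_symbols() -> Vec<ToySymbol> {
    vec![
        ToySymbol { name: "limit", flags: BLOCK_SCOPED | CONST_VARIABLE, write_count: 1 },
        ToySymbol { name: "counter", flags: BLOCK_SCOPED, write_count: 3 },
        ToySymbol { name: "legacy", flags: FUNCTION_SCOPED, write_count: 0 },
    ]
}
```

Only `limit` is flagged here: `counter` is reassigned but was declared with let, and `legacy` is never written.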

Reference Resolution

The semantic analyzer links every identifier reference to its declaration.

Understanding References

const x = 42;        // Symbol: x (declaration)
function foo() {
    console.log(x);  // Reference: x -> Symbol x (read)
    const y = x + 1; // Reference: x -> Symbol x (read)
    return y;        // Reference: y -> Symbol y (read)
}
foo();              // Reference: foo -> Symbol foo (read)

Accessing Reference Information

use oxc_syntax::reference::{ReferenceFlags, ReferenceId};

let scoping = semantic.semantic.scoping();

// Get all references for a symbol
for symbol_id in scoping.symbol_ids() {
    let symbol_name = scoping.symbol_name(symbol_id);
    println!("Symbol '{}' is referenced at:", symbol_name);
    
    for reference_id in scoping.get_resolved_reference_ids(symbol_id) {
        let reference = scoping.get_reference(*reference_id);
        let flags = reference.flags();
        
        if flags.contains(ReferenceFlags::Read) {
            println!("  - Read reference");
        }
        if flags.contains(ReferenceFlags::Write) {
            println!("  - Write reference");
        }
    }
}

Reference Flags

ReferenceFlags::Read - Value is read (e.g., x in y = x)
ReferenceFlags::Write - Value is written (e.g., x in x = 5)
ReferenceFlags::Type - Used as a type (TypeScript)
ReferenceFlags::Value - Used as a value

A reference can be both read and write, like x++ or x += 1.
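
The combined case can be sketched with two bits. This is a toy model under stated assumptions: `READ`, `WRITE`, `Usage`, and `flags_for` are hypothetical names with made-up bit values, not Oxc's `ReferenceFlags`.

```rust
// Hypothetical bit values; only the idea of combinable flags comes from the text.
const READ: u8 = 1 << 0;
const WRITE: u8 = 1 << 1;

// A few ways an expression can touch a variable `x`.
enum Usage {
    Load,           // y = x
    Assign,         // x = 5
    CompoundAssign, // x += 1
    Update,         // x++
}

// Maps each usage to the flags a reference to `x` would carry in this toy model.
fn flags_for(usage: Usage) -> u8 {
    match usage {
        Usage::Load => READ,
        Usage::Assign => WRITE,
        // The old value is read and the new value is written back.
        Usage::CompoundAssign | Usage::Update => READ | WRITE,
    }
}

fn is_read(flags: u8) -> bool {
    (flags & READ) != 0
}

fn is_write(flags: u8) -> bool {
    (flags & WRITE) != 0
}
```

Because the flags are bits rather than an either/or enum, a check such as "is this symbol ever written?" also catches `x++` and `x += 1`.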

Practical Example: Finding Unused Variables

Here’s how to use semantic analysis to find unused variables:

use oxc_allocator::Allocator;
use oxc_parser::Parser;
use oxc_semantic::SemanticBuilder;
use oxc_span::SourceType;

let allocator = Allocator::default();
let source_text = r#"
    const used = 1;
    const unused = 2;
    
    function foo() {
        console.log(used);
    }
"#;

let ret = Parser::new(&allocator, source_text, SourceType::default()).parse();
let semantic = SemanticBuilder::new().build(&ret.program);

let scoping = semantic.semantic.scoping();

// Find symbols with no references
for symbol_id in scoping.symbol_ids() {
    let references = scoping.get_resolved_reference_ids(symbol_id);
    
    if references.is_empty() {
        let name = scoping.symbol_name(symbol_id);
        let span = scoping.symbol_span(symbol_id);
        println!("Unused variable '{}' at {:?}", name, span);
    }
}

// Reports `unused`, and also `foo`: the function is declared but never called,
// so it has no resolved references either.

Practical Example: Detecting Unresolved References

// Find references to undefined variables
let unresolved = scoping.root_unresolved_references();

for (name, reference_ids) in unresolved {
    println!("Unresolved reference: '{}'", name);
    for reference_id in reference_ids {
        let reference = scoping.get_reference(*reference_id);
        println!("  Used at node {:?}", reference.node_id());
    }
}

For code like:

function foo() {
    console.log(undefinedVar);  // undefinedVar is not declared
}

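This reports undefinedVar as unresolved. console is reported too: it is a runtime global rather than a declaration in the file, so a tool that only wants real mistakes has to filter out known globals itself.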
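
One way a tool might separate runtime globals from genuinely undefined names is sketched below. It works on plain strings rather than Oxc's data; `undefined_names` and `example_globals` are hypothetical helpers, and the globals list is an assumption for illustration.

```rust
use std::collections::HashSet;

// Returns unresolved names that are not known globals, preserving input order.
fn undefined_names<'a>(unresolved: &[&'a str], known_globals: &HashSet<&str>) -> Vec<&'a str> {
    unresolved
        .iter()
        .copied()
        .filter(|name| !known_globals.contains(*name))
        .collect()
}

// A deliberately tiny globals list; a real tool would take this from configuration.
fn example_globals() -> HashSet<&'static str> {
    ["console", "window", "globalThis"].iter().copied().collect()
}
```

With the unresolved names from the example, `console` is dropped and only `undefinedVar` remains.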

Scope Chain Resolution

The semantic analyzer implements ECMAScript scope chain resolution:

const x = 'global';

function outer() {
    const x = 'outer';
    
    function inner() {
        console.log(x);  // Resolves to 'outer', not 'global'
    }
}

When resolving x in inner():

1. Check inner function scope - not found
2. Check outer function scope - found! (resolves here)
3. (Would check global scope if not found)

This mirrors how JavaScript’s scope chain resolves identifiers at runtime. The semantic analyzer performs the same lookup statically, before the code runs.
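
The lookup described above can be sketched as a walk over parent links. This is a minimal model, not Oxc's resolver: `ToyScope`, `resolve`, and the symbol numbering are hypothetical.

```rust
use std::collections::HashMap;

// Toy scope: a parent link plus the names it declares, mapped to a symbol index.
struct ToyScope {
    parent: Option<usize>,
    bindings: HashMap<&'static str, usize>,
}

// Looks up `name` starting at `scope` and walking outward; returns the symbol index.
fn resolve(scopes: &[ToyScope], scope: usize, name: &str) -> Option<usize> {
    let mut current = Some(scope);
    while let Some(id) = current {
        if let Some(&symbol) = scopes[id].bindings.get(name) {
            return Some(symbol); // the innermost declaration wins
        }
        current = scopes[id].parent;
    }
    None // reached the top without a match: the reference is unresolved
}

// Scopes for the example. Symbol 0: global x, 1: outer, 2: outer's x, 3: inner.
fn example_scopes() -> Vec<ToyScope> {
    let global: HashMap<_, _> = [("x", 0), ("outer", 1)].iter().cloned().collect();
    let outer: HashMap<_, _> = [("x", 2), ("inner", 3)].iter().cloned().collect();
    vec![
        ToyScope { parent: None, bindings: global },
        ToyScope { parent: Some(0), bindings: outer },
        ToyScope { parent: Some(1), bindings: HashMap::new() }, // inner() declares nothing
    ]
}
```

Resolving `x` from scope 2 (the body of `inner`) returns symbol 2, the declaration in `outer`, because the walk stops at the first scope that binds the name.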

Integration with AST

Semantic analysis populates fields in the AST:

BindingIdentifier.symbol_id

// After semantic analysis, declarations have symbol_id populated
if let Statement::VariableDeclaration(decl) = stmt {
    for declarator in &decl.declarations {
        if let BindingPatternKind::BindingIdentifier(ident) = &declarator.id.kind {
            // This is now Some(SymbolId) after semantic analysis
            if let Some(symbol_id) = ident.symbol_id.get() {
                println!("Variable '{}' has symbol ID {:?}", 
                    ident.name, symbol_id);
            }
        }
    }
}

IdentifierReference.reference_id

// References have reference_id populated
if let Expression::Identifier(ident) = expr {
    if let Some(reference_id) = ident.reference_id.get() {
        let reference = scoping.get_reference(reference_id);
        println!("Reference '{}':", ident.name);
        println!("  Flags: {:?}", reference.flags());
        
        // Find what it refers to
        if let Some(symbol_id) = reference.symbol_id() {
            let symbol_name = scoping.symbol_name(symbol_id);
            println!("  Resolves to symbol '{}'", symbol_name);
        }
    }
}

Control Flow Graph (Optional)

When enabled, the semantic analyzer also builds a control flow graph (CFG):

use oxc_semantic::SemanticBuilder;

let semantic = SemanticBuilder::new()
    .with_cfg(true)  // Enable CFG construction
    .build(&program);

if let Some(cfg) = semantic.semantic.cfg() {
    // Analyze control flow; `iter_enumerated` yields (id, block) pairs
    for (basic_block_id, block) in cfg.basic_blocks.iter_enumerated() {
        println!("Basic block {:?}:", basic_block_id);
        println!("  Instructions: {:?}", block.instructions.len());
    }
}

The CFG is used for advanced analyses like dead code detection and data flow analysis. Most tools don’t need it.
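
To show the kind of analysis a CFG enables, here is a small reachability sketch for dead code detection. It uses a plain adjacency list rather than Oxc's CFG types; `unreachable_blocks` and `example_cfg` are hypothetical, and the block numbering is an assumption made for the example.

```rust
// Toy control flow graph: successors[i] lists the blocks that block i can jump to.
// Returns the indices of blocks that cannot be reached from `entry`.
fn unreachable_blocks(successors: &[Vec<usize>], entry: usize) -> Vec<usize> {
    let mut visited = vec![false; successors.len()];
    let mut stack = vec![entry];
    // Depth-first traversal over the edges, marking every block we can reach.
    while let Some(block) = stack.pop() {
        if visited[block] {
            continue;
        }
        visited[block] = true;
        stack.extend(successors[block].iter().copied());
    }
    (0..successors.len()).filter(|&i| !visited[i]).collect()
}

// Blocks for: function f() { return 1; console.log("never"); }
//   0: entry, ends with `return` and jumps to the exit block 2
//   1: the statement after `return`, which has no incoming edge
//   2: function exit
fn example_cfg() -> Vec<Vec<usize>> {
    vec![vec![2], vec![2], vec![]]
}
```

Block 1 is never visited from the entry block, so a linter could report the statement it contains as dead code.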

Data Structures

Scoping

The main data structure:

pub struct Scoping {
    // Symbol table (stored as struct-of-arrays for efficiency)
    symbol_table: SymbolTable,
    
    // All references in the program
    references: IndexVec<ReferenceId, Reference>,
    
    // Scope tree (stored as struct-of-arrays)
    scope_table: ScopeTable,
    
    // Inner data stored in arena allocator
    cell: ScopingCell,
}

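Struct-of-Arrays Design: Instead of a Vec<Symbol> where each Symbol is a struct with multiple fields, Oxc uses a separate vector for each field. A pass that needs only one field (for example, every symbol's flags) then reads a single dense array instead of stepping over whole structs, which improves cache locality and avoids storing per-struct padding.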

ScopeTable

Stores all scopes in flat arrays:

struct ScopeTable {
    parent_ids: Vec<Option<ScopeId>>,    // Parent scope for each scope
    node_ids: Vec<NodeId>,                // AST node that created the scope
    flags: Vec<ScopeFlags>,               // Scope type and properties
}

SymbolTable

Stores all symbols in flat arrays:

struct SymbolTable {
    symbol_spans: Vec<Span>,              // Where declared
    symbol_flags: Vec<SymbolFlags>,       // Type of declaration
    symbol_scope_ids: Vec<ScopeId>,       // Which scope it's in
    symbol_declarations: Vec<NodeId>,     // AST node of declaration
}

Every array is indexed by the same SymbolId, so all values of one field sit contiguously in memory. That is what makes this layout more cache-friendly than Vec<Symbol>.
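
The difference between the two layouts can be sketched with a small standalone model. `SymbolRecord` and `SymbolColumns` are hypothetical types with simplified fields, not Oxc's definitions.

```rust
// Array-of-structs layout: all fields of one symbol sit next to each other.
struct SymbolRecord {
    span_start: u32,
    span_end: u32,
    flags: u8,
    scope: u32,
}

// Struct-of-arrays layout: one vector per field, all indexed by symbol index.
struct SymbolColumns {
    span_starts: Vec<u32>,
    span_ends: Vec<u32>,
    flags: Vec<u8>,
    scopes: Vec<u32>,
}

impl SymbolColumns {
    // Splits records into per-field columns, keeping the same order.
    fn from_records(records: &[SymbolRecord]) -> Self {
        SymbolColumns {
            span_starts: records.iter().map(|r| r.span_start).collect(),
            span_ends: records.iter().map(|r| r.span_end).collect(),
            flags: records.iter().map(|r| r.flags).collect(),
            scopes: records.iter().map(|r| r.scope).collect(),
        }
    }

    // Touches only the `flags` column; spans and scopes are never read.
    fn count_with_flag(&self, flag: u8) -> usize {
        self.flags.iter().filter(|&&f| (f & flag) != 0).count()
    }
}

fn example_records() -> Vec<SymbolRecord> {
    vec![
        SymbolRecord { span_start: 6, span_end: 7, flags: 0b01, scope: 0 },
        SymbolRecord { span_start: 22, span_end: 25, flags: 0b10, scope: 0 },
        SymbolRecord { span_start: 40, span_end: 41, flags: 0b01, scope: 1 },
    ]
}
```

In the column layout, `count_with_flag` scans one byte per symbol. The same scan over `Vec<SymbolRecord>` would pull each full record, spans and scope included, through the cache.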

Performance Considerations

Single Pass: The semantic analyzer completes in a single AST traversal
Struct-of-Arrays: Improves cache locality and reduces memory overhead
Arena Allocated: Bindings and references stored in arena for fast allocation
Indexed Access: O(1) lookups using symbol/scope/reference IDs
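
Indexed access can be illustrated with a newtype ID over a vector. This is a sketch of the general technique, assuming IDs are plain positions; `ToySymbolId` and `ToySymbolNames` are hypothetical and are not Oxc's `SymbolId` or its tables.

```rust
// Hypothetical newtype ID: a plain index that cannot be confused with other ID kinds.
#[derive(Clone, Copy, Debug, PartialEq)]
struct ToySymbolId(u32);

struct ToySymbolNames {
    names: Vec<String>,
}

impl ToySymbolNames {
    fn new() -> Self {
        ToySymbolNames { names: Vec::new() }
    }

    // Appends a name and returns the ID that addresses it.
    fn add(&mut self, name: &str) -> ToySymbolId {
        let id = ToySymbolId(self.names.len() as u32);
        self.names.push(name.to_string());
        id
    }

    // Constant-time lookup: the ID is the position in the vector.
    fn name(&self, id: ToySymbolId) -> &str {
        &self.names[id.0 as usize]
    }
}
```

No hashing or searching is involved in `name`: the lookup is a single bounds-checked index, which is why ID-based tables stay cheap even for large programs.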

Use Cases

Semantic analysis enables many advanced tools:

Linting

Detect undefined variables
Find unused variables
Check for variable shadowing
Validate scope rules

Transformation

Safe variable renaming
Scope hoisting
Dead code elimination
Dependency analysis

IDE Features

Go to definition
Find all references
Rename refactoring
Symbol search

Next Steps

AST Structure: Learn about the AST that semantic analysis enhances
Visitor Pattern: Learn how to traverse the AST with semantic information
Linter API: See how the linter uses semantic analysis
Transformer API: See how the transformer uses semantic analysis
