Overview

The TypeScript compiler transforms TypeScript source code into JavaScript through a multi-phase pipeline. This document explores the internal implementation of each phase.
The compiler is designed to be fast and incremental, and to support rich IDE integration.

Scanner (Lexical Analysis)

Location: src/compiler/scanner.ts (4,101 lines)

The scanner converts raw text into a stream of tokens.

Scanner Interface

Scanner API
interface Scanner {
  getToken(): SyntaxKind;
  getTokenStart(): number;
  getTokenEnd(): number;
  getTokenText(): string;
  getTokenValue(): string;
  
  scan(): SyntaxKind;
  
  // Context-specific scanning
  scanJsxToken(): JsxTokenSyntaxKind;
  scanJsDocToken(): JSDocSyntaxKind;
  reScanGreaterToken(): SyntaxKind;
  reScanSlashToken(): SyntaxKind;
  reScanTemplateToken(isTaggedTemplate: boolean): SyntaxKind;
}

Key Responsibilities

Tokenization

Break source text into tokens (keywords, identifiers, operators, literals)

Token Classification

Identify token types using the SyntaxKind enum

Context Awareness

Handle JSX, JSDoc, and template strings with specialized scanners

Error Recovery

Report lexical errors (unterminated strings, invalid characters)

Implementation Details

// From src/compiler/scanner.ts
export function createScanner(
  languageVersion: ScriptTarget,
  skipTrivia: boolean,
  languageVariant?: LanguageVariant,
  textInitial?: string,
  onError?: ErrorCallback,
  start?: number,
  length?: number
): Scanner {
  // Scanner state
  let text = textInitial;
  let pos: number;
  let end: number;
  let token: SyntaxKind;
  let tokenValue: string;
  let tokenFlags: TokenFlags;
  
  // Main scanning loop
  function scan(): SyntaxKind {
    // Skip whitespace and comments
    // Identify next token
    // Return token type
  }
}
The scanner uses character codes (CharacterCodes enum) for efficient character comparison without string allocation.
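To make the character-code idea concrete, here is a minimal toy scanner. It is a sketch under simplifying assumptions, not the code in scanner.ts: the names `ToyTokenKind`, `ToyToken`, and `scanAll` are hypothetical, and it only recognizes ASCII identifiers, integer literals, and single-character punctuation.

```typescript
// Toy scanner: NOT the TypeScript compiler's implementation.
// All names here (ToyTokenKind, ToyToken, scanAll) are hypothetical.
type ToyTokenKind = "identifier" | "number" | "punctuation";

interface ToyToken {
  kind: ToyTokenKind;
  start: number; // offset of the first character
  end: number;   // offset just past the last character
  text: string;
}

// Characters are compared as numeric codes, so no one-character strings are created.
function isDigit(ch: number): boolean {
  return ch >= 48 && ch <= 57; // '0'..'9'
}
function isIdentifierStart(ch: number): boolean {
  return (ch >= 65 && ch <= 90) || (ch >= 97 && ch <= 122) || ch === 95; // A-Z, a-z, _
}
function isWhitespace(ch: number): boolean {
  return ch === 32 || ch === 9 || ch === 10 || ch === 13; // space, tab, LF, CR
}

function scanAll(text: string): ToyToken[] {
  const tokens: ToyToken[] = [];
  let pos = 0;
  while (pos < text.length) {
    const ch = text.charCodeAt(pos);
    if (isWhitespace(ch)) {
      pos++; // skip trivia
      continue;
    }
    const start = pos;
    let kind: ToyTokenKind;
    if (isIdentifierStart(ch)) {
      kind = "identifier";
      pos++;
      while (pos < text.length &&
             (isIdentifierStart(text.charCodeAt(pos)) || isDigit(text.charCodeAt(pos)))) {
        pos++;
      }
    } else if (isDigit(ch)) {
      kind = "number";
      pos++;
      while (pos < text.length && isDigit(text.charCodeAt(pos))) pos++;
    } else {
      kind = "punctuation";
      pos++;
    }
    tokens.push({ kind, start, end: pos, text: text.slice(start, pos) });
  }
  return tokens;
}
```

The only string allocation per token is the final `slice`; classification itself works entirely on numbers returned by `charCodeAt`.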

Parser (Syntax Analysis)

Location: src/compiler/parser.ts (10,823 lines)

The parser constructs an Abstract Syntax Tree (AST) from the token stream.

Parser Architecture

1. Token Consumption

Uses the Scanner to get the next token and advance the position

2. Grammar Rules

Implements the TypeScript grammar rules as a recursive descent parser

3. AST Construction

Creates immutable Node objects using factory functions

4. Error Recovery

Attempts to continue parsing after syntax errors
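The four steps above can be sketched with a toy recursive descent parser for `+` and `*` over integer literals. This is an illustration only: `ToyExpr` and `parseExpressionText` are hypothetical names, and the grammar is far smaller than anything in parser.ts. Each grammar rule is one function, nodes record `pos` and `end`, and a missing operand produces a diagnostic plus a zero-width placeholder node so that parsing can continue.

```typescript
// Toy recursive descent parser; all names are hypothetical.
type ToyExpr =
  | { kind: "NumericLiteral"; pos: number; end: number; value: number }
  | { kind: "BinaryExpression"; pos: number; end: number;
      operator: "+" | "*"; left: ToyExpr; right: ToyExpr };

function parseExpressionText(text: string): { expr: ToyExpr; errors: string[] } {
  let pos = 0;
  const errors: string[] = [];

  function skipSpaces(): void {
    while (text.charAt(pos) === " ") pos++;
  }

  // primary := digit+
  function parsePrimary(): ToyExpr {
    skipSpaces();
    const start = pos;
    while (text.charAt(pos) >= "0" && text.charAt(pos) <= "9") pos++;
    if (pos === start) {
      // Error recovery: report, then synthesize a zero-width node and keep going.
      errors.push(`Expression expected at ${start}`);
      return { kind: "NumericLiteral", pos: start, end: start, value: NaN };
    }
    return { kind: "NumericLiteral", pos: start, end: pos, value: Number(text.slice(start, pos)) };
  }

  // term := primary ('*' primary)*
  function parseTerm(): ToyExpr {
    let left = parsePrimary();
    skipSpaces();
    while (text.charAt(pos) === "*") {
      pos++;
      const right = parsePrimary();
      left = { kind: "BinaryExpression", pos: left.pos, end: right.end, operator: "*", left, right };
      skipSpaces();
    }
    return left;
  }

  // sum := term ('+' term)*
  function parseSum(): ToyExpr {
    let left = parseTerm();
    skipSpaces();
    while (text.charAt(pos) === "+") {
      pos++;
      const right = parseTerm();
      left = { kind: "BinaryExpression", pos: left.pos, end: right.end, operator: "+", left, right };
      skipSpaces();
    }
    return left;
  }

  return { expr: parseSum(), errors };
}
```

Operator precedence falls out of the call structure: `parseSum` calls `parseTerm`, so `*` binds tighter than `+` without any explicit precedence table.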

Key Functions

Parser Entry Points
// Main parser entry point
export function parseSourceFile(
  fileName: string,
  sourceText: string,
  languageVersion: ScriptTarget,
  syntaxCursor?: IncrementalParser.SyntaxCursor,
  setParentNodes?: boolean,
  scriptKind?: ScriptKind
): SourceFile

// Parse individual constructs
function parseClassDeclaration(): ClassDeclaration
function parseFunctionDeclaration(): FunctionDeclaration
function parseTypeAnnotation(): TypeNode
function parseExpression(): Expression

AST Node Structure

Every AST node extends the base Node interface:
interface Node {
  kind: SyntaxKind;
  pos: number;        // Start position in source (includes leading trivia)
  end: number;        // End position in source
  flags: NodeFlags;
  parent: Node;       // Parent node reference
  // ... additional properties
}
The parser creates a complete, position-accurate AST. The pos and end properties enable precise source mapping.
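As a small illustration of what position-accurate nodes make possible, the sketch below finds the innermost node that covers a given text offset, which is the kind of lookup an editor needs for hover or go-to-definition. `ToyNode` and `findInnermostNode` are hypothetical names, the tree is built by hand, and trivia is ignored for simplicity.

```typescript
// Toy AST with position information; names are hypothetical.
interface ToyNode {
  kind: string;
  pos: number;        // inclusive start offset
  end: number;        // exclusive end offset
  children: ToyNode[];
}

// Returns the deepest node whose [pos, end) range contains `position`.
function findInnermostNode(node: ToyNode, position: number): ToyNode | undefined {
  if (position < node.pos || position >= node.end) return undefined;
  for (const child of node.children) {
    const hit = findInnermostNode(child, position);
    if (hit !== undefined) return hit;
  }
  return node;
}

// Hand-built tree for the text below (offsets: x=0, '='=2, 1=4, '+'=6, 2=8).
const sourceText = "x = 1 + 2";
const tree: ToyNode = {
  kind: "Assignment", pos: 0, end: 9, children: [
    { kind: "Identifier", pos: 0, end: 1, children: [] },
    { kind: "BinaryExpression", pos: 4, end: 9, children: [
      { kind: "NumericLiteral", pos: 4, end: 5, children: [] },
      { kind: "NumericLiteral", pos: 8, end: 9, children: [] },
    ] },
  ],
};
```

Because every node carries its own range, the original text of any node can be recovered with `sourceText.slice(node.pos, node.end)`.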

Incremental Parsing

The parser supports incremental reparsing for editor scenarios:
// Reuse unchanged subtrees from previous parse
const syntaxCursor = createSyntaxCursor(oldSourceFile);
const newSourceFile = parseSourceFile(
  fileName,
  newText,
  languageVersion,
  syntaxCursor  // Reuses nodes where possible
);
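The reuse idea can be sketched independently of the real syntax cursor. In the toy below, a "statement" is just a semicolon-terminated chunk of text, and statements that end at or before the edit position are carried over by identity instead of being reparsed. `ToyStatement`, `parseStatements`, and `reparseAfterEdit` are hypothetical names; the actual incremental parser also reuses nodes after the edit and adjusts their positions, which this sketch does not attempt.

```typescript
// Toy incremental reparse; names and strategy are illustrative assumptions.
interface ToyStatement {
  pos: number;
  end: number;
  text: string;
}

// Hypothetical "parser": one statement per semicolon-terminated chunk.
function parseStatements(text: string, from: number = 0): ToyStatement[] {
  const result: ToyStatement[] = [];
  let pos = from;
  while (pos < text.length) {
    const semicolon = text.indexOf(";", pos);
    const end = semicolon === -1 ? text.length : semicolon + 1;
    result.push({ pos, end, text: text.slice(pos, end) });
    pos = end;
  }
  return result;
}

function reparseAfterEdit(
  oldStatements: ToyStatement[],
  newText: string,
  changeStart: number
): { statements: ToyStatement[]; reusedCount: number } {
  // Statements that end at or before the edit are untouched: reuse the same objects.
  const reused: ToyStatement[] = [];
  let resumeAt = 0;
  for (const statement of oldStatements) {
    if (statement.end > changeStart) break;
    reused.push(statement);
    resumeAt = statement.end;
  }
  // Everything from the first affected statement onward is parsed fresh.
  const fresh = parseStatements(newText, resumeAt);
  return { statements: reused.concat(fresh), reusedCount: reused.length };
}
```

For example, changing `c=3;` to `c=30;` in `a=1;b=2;c=3;` reuses the first two statement objects and reparses only the third.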

Binder (Symbol Creation)

Location: src/compiler/binder.ts (3,913 lines)

The binder creates symbols and establishes scope relationships.

Binding Process

Creates Symbol objects for declarations (variables, functions, types, etc.)
Builds symbol tables for each scope (global, module, function, block)
Constructs control flow graphs for type narrowing
Identifies function boundaries and other semantic containers

Symbol Table

Symbol Structure
interface Symbol {
  flags: SymbolFlags;           // Kind of symbol (variable, function, etc.)
  escapedName: __String;        // Symbol name
  declarations?: Declaration[]; // AST nodes that declare this symbol
  exports?: SymbolTable;        // Exported members (for modules, classes)
  members?: SymbolTable;        // Members (for classes, interfaces)
  // ... additional properties
}

type SymbolTable = Map<__String, Symbol>;
The binder runs in a single pass over the AST, visiting each node exactly once.
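A minimal sketch of that single pass, assuming a toy declaration tree rather than the real AST: each declaration gets or merges into a symbol in the current scope's table, and function declarations open a nested scope whose parent link is used for name lookup. `MiniSymbol`, `MiniScope`, `bindDeclarations`, and `resolveName` are hypothetical names.

```typescript
// Toy binder: every name below is hypothetical, not taken from binder.ts.
interface MiniSymbol {
  name: string;
  declarationKinds: string[]; // one entry per declaration contributing to the symbol
}

interface MiniScope {
  locals: Map<string, MiniSymbol>;
  parent?: MiniScope;
}

type MiniDeclaration =
  | { kind: "variable"; name: string }
  | { kind: "function"; name: string; body: MiniDeclaration[]; scope?: MiniScope };

// Visits each declaration once, filling symbol tables as it goes.
function bindDeclarations(declarations: MiniDeclaration[], scope: MiniScope): void {
  for (const declaration of declarations) {
    let symbol = scope.locals.get(declaration.name);
    if (symbol === undefined) {
      symbol = { name: declaration.name, declarationKinds: [] };
      scope.locals.set(declaration.name, symbol);
    }
    symbol.declarationKinds.push(declaration.kind);

    if (declaration.kind === "function") {
      // A function is a container: its body is bound in a new nested scope.
      const inner: MiniScope = { locals: new Map<string, MiniSymbol>(), parent: scope };
      declaration.scope = inner;
      bindDeclarations(declaration.body, inner);
    }
  }
}

// Looks a name up in the given scope, then in each enclosing scope.
function resolveName(name: string, scope: MiniScope | undefined): MiniSymbol | undefined {
  for (let current = scope; current !== undefined; current = current.parent) {
    const symbol = current.locals.get(name);
    if (symbol !== undefined) return symbol;
  }
  return undefined;
}
```

Note that the binder only records where names are declared. Deciding which declaration a particular use refers to, as `resolveName` does here, is left to a later phase in this sketch as well.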

Control Flow Analysis

The binder builds control flow graphs to support type narrowing:
interface FlowNode {
  flags: FlowFlags;
}

interface FlowAssignment extends FlowNode {
  node: Expression | VariableDeclaration;
  antecedent: FlowNode;
}

interface FlowCondition extends FlowNode {
  expression: Expression;
  antecedent: FlowNode;
}
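To show how a chain of flow nodes can drive narrowing, here is a toy model in which each flow node points back at its antecedent and the type of a variable at some point is computed by walking that chain. The node shapes and the function `typeAtFlowNode` are hypothetical simplifications: only `typeof x === "..."` conditions and plain assignments are modeled, and there are no branch joins or loops.

```typescript
// Toy flow graph walk; shapes and names are illustrative assumptions.
type ToyPrimitive = "string" | "number";

type ToyFlowNode =
  | { kind: "start" }
  | { kind: "assignment"; assignedType: ToyPrimitive; antecedent: ToyFlowNode }
  | { kind: "condition"; typeofEquals: ToyPrimitive; assumeTrue: boolean; antecedent: ToyFlowNode };

// Computes the possible types of one variable at `flow`, given its declared union.
function typeAtFlowNode(flow: ToyFlowNode, declared: ToyPrimitive[]): ToyPrimitive[] {
  switch (flow.kind) {
    case "start":
      // Nothing is known yet: fall back to the declared type.
      return declared;
    case "assignment":
      // An assignment replaces whatever was known before it.
      return [flow.assignedType];
    case "condition": {
      const { typeofEquals, assumeTrue } = flow;
      // Narrow whatever the antecedent allows by the tested condition.
      const incoming = typeAtFlowNode(flow.antecedent, declared);
      return incoming.filter(t => (t === typeofEquals) === assumeTrue);
    }
  }
}
```

In this model the true branch of `if (typeof x === "string")` is a condition node with `assumeTrue: true`, and the else branch is the same condition with `assumeTrue: false`.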

Binding Entry Point

// From src/compiler/binder.ts
export function bindSourceFile(
  file: SourceFile,
  options: CompilerOptions
): void {
  // Initialize binding state
  // Walk AST and bind nodes
  // Create symbols and symbol tables
  // Build control flow graph
}

Checker (Type Checking)

Location: src/compiler/checker.ts (54,434 lines)

The checker is the largest and most complex component, implementing TypeScript’s type system.
The checker is highly optimized but complex. Changes here require careful consideration of performance and correctness.

Checker Responsibilities

Type Inference

Infer types from context and initialization

Type Checking

Verify type compatibility and assignability

Symbol Resolution

Resolve references to their declarations

Diagnostics

Report type errors and semantic issues

Type Checker Interface

TypeChecker API
interface TypeChecker {
  getTypeAtLocation(node: Node): Type;
  getSymbolAtLocation(node: Node): Symbol | undefined;
  getTypeOfSymbolAtLocation(symbol: Symbol, node: Node): Type;
  
  // Type operations
  isTypeAssignableTo(source: Type, target: Type): boolean;
  getPropertiesOfType(type: Type): Symbol[];
  getSignaturesOfType(type: Type, kind: SignatureKind): Signature[];
  
  // Diagnostics
  getDiagnostics(sourceFile?: SourceFile): Diagnostic[];
}

Type System

The checker implements a rich type system:
interface Type {
  flags: TypeFlags;
  symbol?: Symbol;
  // ... type-specific properties
}

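// Primitive types are distinguished by their TypeFlags:
// - string     (TypeFlags.String)
// - number     (TypeFlags.Number)
// - boolean    (TypeFlags.Boolean)
// - void       (TypeFlags.Void)
// - undefined  (TypeFlags.Undefined)
// - null       (TypeFlags.Null)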

Type Checking Algorithm

1. Symbol Resolution

Resolve identifiers to their symbol declarations

2. Type Instantiation

Instantiate generic types with type arguments

3. Type Inference

Infer type arguments from usage context

4. Assignability Check

Check if source type is assignable to target type

5. Error Reporting

Generate diagnostic messages for type errors
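The last two steps, the assignability check and error reporting, can be illustrated with a toy structural comparison. This is a sketch only: `MiniType` and `checkAssignable` are hypothetical, and the real relation in checker.ts also covers unions, generics, optionality, variance, and much more. The rule modeled here is that every property required by the target must exist on the source with an assignable type, while extra source properties are allowed.

```typescript
// Toy structural assignability; names and rules are simplifying assumptions.
type MiniType =
  | { kind: "string" }
  | { kind: "number" }
  | { kind: "object"; properties: Map<string, MiniType> };

// Returns one diagnostic message per mismatch; an empty array means "assignable".
function checkAssignable(source: MiniType, target: MiniType, path: string = "value"): string[] {
  if (target.kind !== "object") {
    return source.kind === target.kind
      ? []
      : [`${path}: '${source.kind}' is not assignable to '${target.kind}'`];
  }
  if (source.kind !== "object") {
    return [`${path}: '${source.kind}' is not assignable to 'object'`];
  }
  const sourceProperties = source.properties;
  const errors: string[] = [];
  target.properties.forEach((targetProperty, propertyName) => {
    const sourceProperty = sourceProperties.get(propertyName);
    if (sourceProperty === undefined) {
      errors.push(`${path}.${propertyName}: property is missing`);
    } else {
      // Recurse so nested mismatches are reported with their full path.
      const nested = checkAssignable(sourceProperty, targetProperty, `${path}.${propertyName}`);
      nested.forEach(message => errors.push(message));
    }
  });
  return errors;
}
```

Returning messages instead of a boolean mirrors the idea that the check and the diagnostic are produced together, so the error can point at the exact property that failed.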

Emitter (Code Generation)

Location: src/compiler/emitter.ts (6,378 lines)

The emitter generates JavaScript code and declaration files from the AST.

Emission Pipeline

Emitter Features

JS Generation

Outputs JavaScript matching target ES version

Source Maps

Generates source maps for debugging

Declarations

Emits .d.ts type declaration files

Comments

Preserves and positions comments

Emitter Interface

// From src/compiler/emitter.ts
export function emitFiles(
  resolver: EmitResolver,
  host: EmitHost,
  targetSourceFile?: SourceFile,
  transformers?: EmitTransformers
): EmitResult {
  // Transform AST
  const transformed = transformNodes(...);
  
  // Emit each file
  for (const sourceFile of transformed.files) {
    printFile(sourceFile);
  }
  
  return { diagnostics, emittedFiles };
}

Printer

The printer converts AST nodes to text:
Printer Usage
const printer = createPrinter({
  newLine: NewLineKind.LineFeed,
  removeComments: false,
});

const result = printer.printFile(sourceFile);
The emitter uses a TextWriter for efficient string building without excessive allocations.
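The split between a printer that walks nodes and a writer that owns the output text can be sketched as follows. `ToyWriter`, `createToyWriter`, and `printStatement` are hypothetical names, and collecting chunks in an array is just one common way to build a string incrementally; it is not a claim about how the compiler's own writer is implemented.

```typescript
// Toy writer and printer; names and buffering strategy are assumptions.
interface ToyWriter {
  write(text: string): void;
  writeLine(): void;
  increaseIndent(): void;
  decreaseIndent(): void;
  getText(): string;
}

function createToyWriter(newLine: string = "\n"): ToyWriter {
  const chunks: string[] = [];
  let indent = 0;
  let atLineStart = true;
  return {
    write(text: string): void {
      if (atLineStart) {
        chunks.push("  ".repeat(indent)); // emit indentation lazily, once per line
        atLineStart = false;
      }
      chunks.push(text);
    },
    writeLine(): void {
      chunks.push(newLine);
      atLineStart = true;
    },
    increaseIndent(): void { indent++; },
    decreaseIndent(): void { indent--; },
    getText(): string { return chunks.join(""); },
  };
}

type ToyStatementNode =
  | { kind: "return"; value: string }
  | { kind: "function"; name: string; body: ToyStatementNode[] };

// The printer decides WHAT to write; the writer tracks indentation and line state.
function printStatement(node: ToyStatementNode, writer: ToyWriter): void {
  if (node.kind === "return") {
    writer.write(`return ${node.value};`);
    writer.writeLine();
    return;
  }
  writer.write(`function ${node.name}() {`);
  writer.writeLine();
  writer.increaseIndent();
  for (const statement of node.body) printStatement(statement, writer);
  writer.decreaseIndent();
  writer.write("}");
  writer.writeLine();
}
```

Keeping formatting state in the writer means the printing code for each node kind never has to compute indentation itself.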

Transformers

Location: src/compiler/transformers/

Transformers modify the AST before emission:

Transformation Categories

  • es2015.ts - Classes, arrow functions, destructuring
  • es2016.ts - Exponentiation operator
  • es2017.ts - Async/await
  • es2018.ts - Object spread, async iteration
  • es2019.ts - Optional catch binding
  • es2020.ts - Optional chaining, nullish coalescing
  • es2021.ts - Logical assignment
  • esnext.ts - Latest features

Transformer Pattern

Transformer Structure
function transformSourceFile(context: TransformationContext) {
  return (node: SourceFile): SourceFile => {
    function visitor(node: Node): Node {
      // Transform node based on kind
      switch (node.kind) {
        case SyntaxKind.ClassDeclaration:
          return transformClassDeclaration(node);
        // ... other cases
      }
      
      // Recursively visit children
      return visitEachChild(node, visitor, context);
    }
    
    return visitNode(node, visitor);
  };
}
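The same visitor shape can be tried out on a toy expression tree. The sketch below rewrites `a ** b` into a `Math.pow(a, b)` call and otherwise recurses into children, returning the original node object whenever nothing underneath it changed. `ToyExpression`, `visitEachChildToy`, and `transformExponentiation` are hypothetical names that only imitate the pattern shown above.

```typescript
// Toy transformer over a hand-rolled expression tree; all names are hypothetical.
type ToyExpression =
  | { kind: "Identifier"; name: string }
  | { kind: "Binary"; operator: "**" | "+"; left: ToyExpression; right: ToyExpression }
  | { kind: "Call"; callee: string; args: ToyExpression[] };

// Visits children and rebuilds the node only if at least one child changed.
function visitEachChildToy(
  node: ToyExpression,
  visitor: (child: ToyExpression) => ToyExpression
): ToyExpression {
  switch (node.kind) {
    case "Identifier":
      return node;
    case "Binary": {
      const left = visitor(node.left);
      const right = visitor(node.right);
      return left === node.left && right === node.right ? node : { ...node, left, right };
    }
    case "Call": {
      const oldArgs = node.args;
      const args = oldArgs.map(arg => visitor(arg));
      return args.every((arg, index) => arg === oldArgs[index]) ? node : { ...node, args };
    }
  }
}

function transformExponentiation(node: ToyExpression): ToyExpression {
  if (node.kind === "Binary" && node.operator === "**") {
    // Replace the node, transforming its operands first.
    return {
      kind: "Call",
      callee: "Math.pow",
      args: [transformExponentiation(node.left), transformExponentiation(node.right)],
    };
  }
  // Not a node we rewrite: recurse into its children.
  return visitEachChildToy(node, transformExponentiation);
}
```

Returning unchanged nodes by identity keeps untouched subtrees shared between the input and output trees, so a transform that affects one expression does not copy the whole file.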

Performance Optimizations

The parser reuses unchanged nodes during incremental parsing
Type checking happens on-demand, not for all files upfront
Symbol resolution results are cached to avoid repeated work
Identifier strings are interned to reduce memory usage
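Two of these ideas, on-demand work and cached results, combine naturally. The sketch below assumes a hypothetical `LazySymbol` whose type is computed the first time it is requested and stored on the symbol afterwards; it illustrates the general pattern, not the checker's actual caching structures.

```typescript
// Toy lazy, cached type resolution; names are hypothetical.
interface LazySymbol {
  symbolName: string;
  computeType: () => string; // stands in for potentially expensive checking work
  resolvedType?: string;     // filled in on first request
}

function getTypeOfLazySymbol(symbol: LazySymbol): string {
  if (symbol.resolvedType === undefined) {
    // First request: do the work once and remember the answer.
    symbol.resolvedType = symbol.computeType();
  }
  return symbol.resolvedType;
}

// Helper for experiments: a symbol that counts how often its type is computed.
function createCountingSymbol(symbolName: string, type: string): { symbol: LazySymbol; computations: () => number } {
  let count = 0;
  const symbol: LazySymbol = {
    symbolName,
    computeType: () => {
      count++;
      return type;
    },
  };
  return { symbol, computations: () => count };
}
```

A symbol that is never queried costs nothing beyond its creation, and a symbol queried many times pays for its computation only once.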

Next Steps

Language Service Internals

Learn how the compiler powers IDE features like autocompletion and navigation

Reference Files

Key implementation files in src/compiler/:
  • scanner.ts:42 - tokenIsIdentifierOrKeyword() function
  • scanner.ts:389 - isUnicodeIdentifierStart() function
  • scanner.ts:414 - tokenToString() function
  • parser.ts - Main parsing logic
  • binder.ts - Symbol creation and binding
  • checker.ts - Type system implementation
  • emitter.ts - Code generation
  • types.ts - Core type definitions
  • utilities.ts - Shared helper functions
