Overview
The TypeScript compiler transforms TypeScript source code into JavaScript through a multi-phase pipeline. This document explores the internal implementation of each phase.
The compiler is designed to be fast and incremental, and to support rich IDE integration.
Scanner (Lexical Analysis)
Location: src/compiler/scanner.ts (4,101 lines)
The scanner converts raw text into a stream of tokens.
Scanner Interface
interface Scanner {
    getToken(): SyntaxKind;
    getTokenStart(): number;
    getTokenEnd(): number;
    getTokenText(): string;
    getTokenValue(): string;
    scan(): SyntaxKind;
    // Context-specific scanning
    scanJsxToken(): JsxTokenSyntaxKind;
    scanJsDocToken(): JSDocSyntaxKind;
    reScanGreaterToken(): SyntaxKind;
    reScanSlashToken(): SyntaxKind;
    reScanTemplateToken(isTaggedTemplate: boolean): SyntaxKind;
}
Key Responsibilities
Tokenization: Break source text into tokens (keywords, identifiers, operators, literals).
Token Classification: Identify token types using the SyntaxKind enum.
Context Awareness: Handle JSX, JSDoc, and template strings with specialized scanning functions.
Error Recovery: Report lexical errors (unterminated strings, invalid characters).
Implementation Details
Scanner Creation
// From src/compiler/scanner.ts
export function createScanner(
    languageVersion: ScriptTarget,
    skipTrivia: boolean,
    languageVariant?: LanguageVariant,
    textInitial?: string,
    onError?: ErrorCallback,
    start?: number,
    length?: number
): Scanner {
    // Scanner state
    let text = textInitial;
    let pos: number;
    let end: number;
    let token: SyntaxKind;
    let tokenValue: string;
    let tokenFlags: TokenFlags;
    // Main scanning loop
    function scan(): SyntaxKind {
        // Skip whitespace and comments
        // Identify next token
        // Return token type
    }
}
The scanner uses character codes (CharacterCodes enum) for efficient character comparison without string allocation.
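The idea of scanning by character code can be illustrated with a small standalone sketch. It is not the compiler's scanner: `MiniTokenKind`, `MiniToken`, `tokenizeSketch`, and the helper predicates are hypothetical names invented here, and the token set is reduced to identifiers, integer literals, `+`, and `*`.

```typescript
// Hypothetical token kinds for this sketch; the real scanner uses the SyntaxKind enum.
enum MiniTokenKind { Identifier, NumericLiteral, Plus, Asterisk, Unknown, EndOfFile }

interface MiniToken { kind: MiniTokenKind; start: number; end: number; }

// Comparisons work on numeric char codes, so no one-character strings are allocated.
const isDigitCode = (ch: number): boolean => ch >= 0x30 && ch <= 0x39; // '0'..'9'
const isIdentifierStartCode = (ch: number): boolean =>
    (ch >= 0x41 && ch <= 0x5a) || (ch >= 0x61 && ch <= 0x7a) || ch === 0x5f || ch === 0x24; // A-Z a-z _ $
const isWhiteSpaceCode = (ch: number): boolean =>
    ch === 0x20 || ch === 0x09 || ch === 0x0a || ch === 0x0d;

function tokenizeSketch(text: string): MiniToken[] {
    const tokens: MiniToken[] = [];
    let pos = 0;
    while (true) {
        // Skip trivia (only whitespace in this sketch; comments are not handled).
        while (pos < text.length && isWhiteSpaceCode(text.charCodeAt(pos))) pos++;
        const start = pos;
        if (pos >= text.length) {
            tokens.push({ kind: MiniTokenKind.EndOfFile, start, end: pos });
            return tokens;
        }
        const ch = text.charCodeAt(pos);
        let kind: MiniTokenKind;
        if (isDigitCode(ch)) {
            while (pos < text.length && isDigitCode(text.charCodeAt(pos))) pos++;
            kind = MiniTokenKind.NumericLiteral;
        } else if (isIdentifierStartCode(ch)) {
            pos++;
            while (
                pos < text.length &&
                (isIdentifierStartCode(text.charCodeAt(pos)) || isDigitCode(text.charCodeAt(pos)))
            ) pos++;
            kind = MiniTokenKind.Identifier;
        } else {
            // Single-character tokens; anything unrecognized becomes Unknown.
            pos++;
            kind = ch === 0x2b ? MiniTokenKind.Plus
                : ch === 0x2a ? MiniTokenKind.Asterisk
                : MiniTokenKind.Unknown;
        }
        tokens.push({ kind, start, end: pos });
    }
}
```

Each token records only its kind and its start/end offsets; the text can be recovered on demand with `text.slice(start, end)`, which is the same trade-off the `getTokenStart()`/`getTokenEnd()` accessors suggest.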
Parser (Syntax Analysis)
Location: src/compiler/parser.ts (10,823 lines)
The parser constructs an Abstract Syntax Tree (AST) from the token stream.
Parser Architecture
Token Consumption: Uses the scanner to get the next token and advance the position.
Grammar Rules: Implements the TypeScript grammar as a recursive descent parser.
AST Construction: Creates immutable Node objects using factory functions.
Error Recovery: Attempts to continue parsing after syntax errors.
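The four steps above can be sketched on a toy grammar. This is a minimal illustration, not the compiler's parser: `ExprNode`, `ParseSketchResult`, and `parseExpressionSketch` are hypothetical names, the grammar covers only integers, `+`, `*`, and parentheses, and the sketch reads characters directly instead of going through a separate scanner.

```typescript
// Hypothetical node shapes for this sketch; each node records its source range.
type ExprNode =
    | { kind: "NumericLiteral"; pos: number; end: number; value: number }
    | { kind: "BinaryExpression"; pos: number; end: number; operator: "+" | "*"; left: ExprNode; right: ExprNode }
    | { kind: "Missing"; pos: number; end: number };

interface ParseSketchResult { root: ExprNode; diagnostics: string[]; }

function parseExpressionSketch(text: string): ParseSketchResult {
    let pos = 0;
    const diagnostics: string[] = [];

    const isDigitAt = (i: number): boolean => {
        const c = text.charCodeAt(i); // NaN past the end, so both comparisons fail
        return c >= 48 && c <= 57;
    };
    const skipSpaces = (): void => { while (text.charAt(pos) === " ") pos++; };
    // Consume `ch` if it is the next non-space character.
    const tryConsume = (ch: string): boolean => {
        skipSpaces();
        if (text.charAt(pos) === ch) { pos++; return true; }
        return false;
    };

    // primary := number | "(" sum ")"
    function parsePrimary(): ExprNode {
        skipSpaces();
        const start = pos;
        if (tryConsume("(")) {
            const inner = parseSum();
            if (!tryConsume(")")) diagnostics.push(`')' expected at ${pos}`);
            return inner;
        }
        while (isDigitAt(pos)) pos++;
        if (pos === start) {
            // Error recovery: report, then synthesize a zero-width node so parsing continues.
            diagnostics.push(`Expression expected at ${pos}`);
            return { kind: "Missing", pos: start, end: start };
        }
        return { kind: "NumericLiteral", pos: start, end: pos, value: Number(text.slice(start, pos)) };
    }

    // product := primary ("*" primary)*
    function parseProduct(): ExprNode {
        let left = parsePrimary();
        while (tryConsume("*")) {
            const right = parsePrimary();
            left = { kind: "BinaryExpression", pos: left.pos, end: right.end, operator: "*", left, right };
        }
        return left;
    }

    // sum := product ("+" product)*
    function parseSum(): ExprNode {
        let left = parseProduct();
        while (tryConsume("+")) {
            const right = parseProduct();
            left = { kind: "BinaryExpression", pos: left.pos, end: right.end, operator: "+", left, right };
        }
        return left;
    }

    const root = parseSum();
    skipSpaces();
    if (pos < text.length) diagnostics.push(`Unexpected character at ${pos}`);
    return { root, diagnostics };
}
```

Each grammar rule maps to one function, and operator precedence falls out of which function calls which. On incomplete input such as `1 +`, the sketch still returns a tree, with a `Missing` node in place of the absent operand and one diagnostic.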
Key Functions
// Main parser entry point
export function parseSourceFile(
    fileName: string,
    sourceText: string,
    languageVersion: ScriptTarget,
    syntaxCursor?: IncrementalParser.SyntaxCursor,
    setParentNodes?: boolean,
    scriptKind?: ScriptKind
): SourceFile;
// Parse individual constructs
function parseClassDeclaration(): ClassDeclaration;
function parseFunctionDeclaration(): FunctionDeclaration;
function parseTypeAnnotation(): TypeNode;
function parseExpression(): Expression;
AST Node Structure
Every AST node extends the base Node interface:
Node Interface
interface Node {
    kind: SyntaxKind;
    pos: number;   // Start position in source (includes leading trivia)
    end: number;   // End position in source
    flags: NodeFlags;
    parent: Node;  // Parent node reference
    // ... additional properties
}
The parser creates a complete, position-accurate AST. The pos and end properties enable precise source mapping.
Incremental Parsing
The parser supports incremental reparsing for editor scenarios:
// Reuse unchanged subtrees from previous parse
const syntaxCursor = createSyntaxCursor(oldSourceFile);
const newSourceFile = parseSourceFile(
    fileName,
    newText,
    languageVersion,
    syntaxCursor // Reuses nodes where possible
);
Binder (Symbol Creation)
Location: src/compiler/binder.ts (3,913 lines)
The binder creates symbols and establishes scope relationships.
Binding Process
Creates Symbol objects for declarations (variables, functions, types, etc.)
Builds symbol tables for each scope (global, module, function, block)
Constructs control flow graphs for type narrowing
Identifies function boundaries and other semantic containers
Symbol Table
interface Symbol {
    flags: SymbolFlags;            // Kind of symbol (variable, function, etc.)
    escapedName: __String;         // Symbol name
    declarations?: Declaration[];  // AST nodes that declare this symbol
    exports?: SymbolTable;         // Exported members (for modules, classes)
    members?: SymbolTable;         // Members (for classes, interfaces)
    // ... additional properties
}
type SymbolTable = Map<__String, Symbol>;
The binder runs in a single pass over the AST, visiting each node exactly once.
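A single binding pass over a toy AST might look like the sketch below. All names (`BindNode`, `SymbolSketch`, `ScopeSketch`, `declareSymbolSketch`, `bindSketch`, `resolveNameSketch`) are hypothetical and exist only for illustration; the real binder distinguishes many more container kinds and symbol flags.

```typescript
// Hypothetical toy AST: variables, functions (which open a scope), and blocks.
type BindNode =
    | { kind: "VariableDeclaration"; name: string }
    | { kind: "FunctionDeclaration"; name: string; body: BindNode[] }
    | { kind: "Block"; statements: BindNode[] };

interface SymbolSketch { escapedName: string; declarations: BindNode[]; }

// One symbol table per scope, linked to the enclosing scope.
interface ScopeSketch { locals: Map<string, SymbolSketch>; parentScope?: ScopeSketch; }

// Create the symbol on first sight; later declarations of the same name merge into it.
function declareSymbolSketch(scope: ScopeSketch, name: string, node: BindNode): SymbolSketch {
    let symbol = scope.locals.get(name);
    if (!symbol) {
        symbol = { escapedName: name, declarations: [] };
        scope.locals.set(name, symbol);
    }
    symbol.declarations.push(node);
    return symbol;
}

// Visit every node once, recording the scope opened by each container node.
function bindSketch(nodes: BindNode[], scope: ScopeSketch, scopes: Map<BindNode, ScopeSketch>): void {
    for (const node of nodes) {
        switch (node.kind) {
            case "VariableDeclaration":
                declareSymbolSketch(scope, node.name, node);
                break;
            case "FunctionDeclaration": {
                declareSymbolSketch(scope, node.name, node);
                const inner: ScopeSketch = { locals: new Map(), parentScope: scope };
                scopes.set(node, inner);
                bindSketch(node.body, inner, scopes);
                break;
            }
            case "Block": {
                const inner: ScopeSketch = { locals: new Map(), parentScope: scope };
                scopes.set(node, inner);
                bindSketch(node.statements, inner, scopes);
                break;
            }
        }
    }
}

// Name lookup walks outward through enclosing scopes until a match is found.
function resolveNameSketch(scope: ScopeSketch | undefined, name: string): SymbolSketch | undefined {
    for (let current = scope; current; current = current.parentScope) {
        const found = current.locals.get(name);
        if (found) return found;
    }
    return undefined;
}
```

Binding only records declarations; resolving a reference is a separate lookup that later phases can perform against the finished tables.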
Control Flow Analysis
The binder builds control flow graphs to support type narrowing:
Flow Nodes
interface FlowNode {
    flags: FlowFlags;
}
interface FlowAssignment extends FlowNode {
    node: Expression | VariableDeclaration;
    antecedent: FlowNode;
}
interface FlowCondition extends FlowNode {
    expression: Expression;
    antecedent: FlowNode;
}
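How a consumer could use such a graph for narrowing can be sketched by walking `antecedent` links backward from a reference. This is an assumption-laden toy: `FlowSketch` and `narrowedTypeSketch` are hypothetical, types are plain strings, conditions are limited to `typeof x === "..."` checks, and branch joins and loops are left out.

```typescript
// Hypothetical flow nodes; each non-start node points back at what ran before it.
type FlowSketch =
    | { kind: "Start" }
    | { kind: "Assignment"; variable: string; assignedType: string; antecedent: FlowSketch }
    | { kind: "Condition"; variable: string; typeofEquals: string; assumeTrue: boolean; antecedent: FlowSketch };

// Compute the type of `variable` at `flow`, given its declared union constituents.
function narrowedTypeSketch(flow: FlowSketch, variable: string, declaredType: string[]): string[] {
    switch (flow.kind) {
        case "Start":
            // Nothing narrowed the variable on this path.
            return declaredType;
        case "Assignment":
            // An assignment to the variable replaces whatever was known before it.
            return flow.variable === variable
                ? [flow.assignedType]
                : narrowedTypeSketch(flow.antecedent, variable, declaredType);
        case "Condition": {
            const { typeofEquals, assumeTrue } = flow;
            const incoming = narrowedTypeSketch(flow.antecedent, variable, declaredType);
            if (flow.variable !== variable) return incoming;
            // Keep matching constituents in the true branch, the rest in the false branch.
            return incoming.filter(t => (t === typeofEquals) === assumeTrue);
        }
    }
}
```

For a variable declared as `string | number`, a reference inside `if (typeof x === "string")` sees a `Condition` node with `assumeTrue: true`, and the walk yields only `string`; the `else` branch sees the same condition with `assumeTrue: false` and yields `number`.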
Binding Entry Point
// From src/compiler/binder.ts
export function bindSourceFile(
    file: SourceFile,
    options: CompilerOptions
): void {
    // Initialize binding state
    // Walk AST and bind nodes
    // Create symbols and symbol tables
    // Build control flow graph
}
Checker (Type Checking)
Location: src/compiler/checker.ts (54,434 lines)
The checker is the largest and most complex component, implementing TypeScript’s type system.
The checker is heavily optimized, so changes here require careful consideration of both performance and correctness.
Checker Responsibilities
Type Inference: Infer types from context and initialization.
Type Checking: Verify type compatibility and assignability.
Symbol Resolution: Resolve references to their declarations.
Diagnostics: Report type errors and semantic issues.
Type Checker Interface
interface TypeChecker {
    getTypeAtLocation(node: Node): Type;
    getSymbolAtLocation(node: Node): Symbol | undefined;
    getTypeOfSymbolAtLocation(symbol: Symbol, node: Node): Type;
    // Type operations
    isTypeAssignableTo(source: Type, target: Type): boolean;
    getPropertiesOfType(type: Type): Symbol[];
    getSignaturesOfType(type: Type, kind: SignatureKind): Signature[];
    // Diagnostics
    getDiagnostics(sourceFile?: SourceFile): Diagnostic[];
}
Type System
The checker implements a rich type system:
Base Types
interface Type {
    flags: TypeFlags;
    symbol?: Symbol;
    // ... type-specific properties
}
// Primitive types
// - StringType
// - NumberType
// - BooleanType
// - VoidType
// - UndefinedType
// - NullType
Complex Types
// Object types
interface ObjectType extends Type {
    objectFlags: ObjectFlags;
}
// Union and intersection types
interface UnionType extends Type {
    types: Type[];
}
interface IntersectionType extends Type {
    types: Type[];
}
Generic Types
interface TypeReference extends ObjectType {
    target: GenericType;
    typeArguments?: Type[];
}
interface TypeParameter extends Type {
    constraint?: Type;
    default?: Type;
}
Type Checking Algorithm
Symbol Resolution: Resolve identifiers to their symbol declarations.
Type Instantiation: Instantiate generic types with type arguments.
Type Inference: Infer type arguments from usage context.
Assignability Check: Check if the source type is assignable to the target type.
Error Reporting: Generate diagnostic messages for type errors.
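The assignability step can be illustrated with a structural check over a tiny type model. This is a sketch under heavy simplification, not the checker's relation logic: `TypeSketch` and `isAssignableSketch` are hypothetical, and only primitives, unions, and object types with required properties are modeled.

```typescript
// Hypothetical miniature type model.
type TypeSketch =
    | { kind: "primitive"; name: "string" | "number" | "boolean" }
    | { kind: "union"; types: TypeSketch[] }
    | { kind: "object"; properties: Map<string, TypeSketch> };

function isAssignableSketch(source: TypeSketch, target: TypeSketch): boolean {
    // A union source is assignable only if every constituent is assignable.
    if (source.kind === "union") {
        return source.types.every(t => isAssignableSketch(t, target));
    }
    // A union target accepts the source if at least one constituent accepts it.
    if (target.kind === "union") {
        return target.types.some(t => isAssignableSketch(source, t));
    }
    // Primitives are only assignable to the identical primitive in this sketch.
    if (source.kind === "primitive" || target.kind === "primitive") {
        return source.kind === "primitive" && target.kind === "primitive" && source.name === target.name;
    }
    // Structural check: every target property must exist on the source with an assignable type.
    for (const [propName, targetProp] of Array.from(target.properties)) {
        const sourceProp = source.properties.get(propName);
        if (!sourceProp || !isAssignableSketch(sourceProp, targetProp)) return false;
    }
    return true;
}
```

Handling the union cases first keeps the remaining branches simple: by the time the structural comparison runs, both sides are known to be non-union types. Extra properties on the source are allowed, which mirrors structural typing in general but ignores details such as optional properties.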
Emitter (Code Generation)
Location: src/compiler/emitter.ts (6,378 lines)
The emitter generates JavaScript code and declaration files from the AST.
Emission Pipeline
Emitter Features
JS Generation: Outputs JavaScript matching the target ES version.
Source Maps: Generates source maps for debugging.
Declarations: Emits .d.ts type declaration files.
Comments: Preserves and positions comments.
Emitter Interface
// From src/compiler/emitter.ts
export function emitFiles(
    resolver: EmitResolver,
    host: EmitHost,
    targetSourceFile?: SourceFile,
    transformers?: EmitTransformers
): EmitResult {
    // Transform AST
    const transformed = transformNodes(...);
    // Emit each file
    for (const sourceFile of transformed.files) {
        printFile(sourceFile);
    }
    return { diagnostics, emittedFiles };
}
Printer
The printer converts AST nodes to text:
const printer = createPrinter({
    newLine: NewLineKind.LineFeed,
    removeComments: false,
});
const result = printer.printFile(sourceFile);
The emitter uses a TextWriter for efficient string building without excessive allocations.
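What a text writer of this kind provides can be sketched as a small class that tracks indentation and line count while accumulating output. `TextWriterSketch` is a hypothetical stand-in written for this page; its buffering strategy (collecting parts and joining once) is an assumption and may differ from what the compiler does internally.

```typescript
// Hypothetical writer: accumulates output pieces and applies indentation lazily.
class TextWriterSketch {
    private parts: string[] = [];
    private indentLevel = 0;
    private atLineStart = true;
    private line = 0;

    write(text: string): void {
        if (text.length === 0) return;
        if (this.atLineStart) {
            // Indentation is emitted only when a line actually receives content.
            this.parts.push("    ".repeat(this.indentLevel));
            this.atLineStart = false;
        }
        this.parts.push(text);
    }

    writeLine(): void {
        this.parts.push("\n");
        this.line++;
        this.atLineStart = true;
    }

    increaseIndent(): void { this.indentLevel++; }
    decreaseIndent(): void { this.indentLevel = Math.max(0, this.indentLevel - 1); }

    // Zero-based index of the line currently being written, useful for source maps.
    getLine(): number { return this.line; }

    // The pieces are joined once, when the final text is requested.
    getText(): string { return this.parts.join(""); }
}

// Usage: emit a small function body with one level of indentation.
const writerDemo = new TextWriterSketch();
writerDemo.write("function f() {");
writerDemo.writeLine();
writerDemo.increaseIndent();
writerDemo.write("return 1;");
writerDemo.writeLine();
writerDemo.decreaseIndent();
writerDemo.write("}");
```

Because the writer owns indentation and line tracking, the code that walks the AST only has to decide what to write and when to break lines.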
Transformers
Location: src/compiler/transformers/
Transformers modify the AST before emission:
ES Downleveling
es2015.ts - Classes, arrow functions, destructuring
es2016.ts - Exponentiation operator
es2017.ts - Async/await
es2018.ts - Object spread, async iteration
es2019.ts - Optional catch binding
es2020.ts - Optional chaining, nullish coalescing
es2021.ts - Logical assignment
esnext.ts - Latest features
Feature Transforms
jsx.ts - JSX to JavaScript
generators.ts - Generator functions
esDecorators.ts - Stage 3 decorators
legacyDecorators.ts - Experimental decorators
typeSerializer.ts - Emit decorator metadata
Module Systems
module/module.ts - Module transformation (CommonJS conversion, ES module interop, AMD/UMD/System formats)
function transformSourceFile(context: TransformationContext) {
    return (node: SourceFile): SourceFile => {
        function visitor(node: Node): Node {
            // Transform node based on kind
            switch (node.kind) {
                case SyntaxKind.ClassDeclaration:
                    return transformClassDeclaration(node as ClassDeclaration);
                // ... other cases
            }
            // Recursively visit children
            return visitEachChild(node, visitor, context);
        }
        return visitNode(node, visitor);
    };
}
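The visitor pattern above can be tried on a self-contained toy AST. The sketch lowers the exponentiation operator to a `Math.pow` call, which is the kind of rewrite an ES2016 downlevel transform performs; `TNode`, `visitEachChildSketch`, `lowerExponentiationSketch`, and `printTNodeSketch` are hypothetical names, and the child-visiting helper is a simplified stand-in for the compiler's `visitEachChild`.

```typescript
// Hypothetical toy AST.
type TNode =
    | { kind: "NumericLiteral"; text: string }
    | { kind: "Identifier"; text: string }
    | { kind: "BinaryExpression"; operator: string; left: TNode; right: TNode }
    | { kind: "CallExpression"; callee: string; args: TNode[] };

type VisitorSketch = (node: TNode) => TNode;

// Visit children and return the original node when nothing changed,
// so untouched subtrees keep their identity.
function visitEachChildSketch(node: TNode, visitor: VisitorSketch): TNode {
    switch (node.kind) {
        case "BinaryExpression": {
            const left = visitor(node.left);
            const right = visitor(node.right);
            return left === node.left && right === node.right ? node : { ...node, left, right };
        }
        case "CallExpression": {
            const oldArgs = node.args;
            const newArgs = oldArgs.map(arg => visitor(arg));
            return newArgs.every((arg, i) => arg === oldArgs[i]) ? node : { ...node, args: newArgs };
        }
        default:
            return node; // leaf nodes have no children
    }
}

// Rewrite `a ** b` into `Math.pow(a, b)`, visiting children first.
function lowerExponentiationSketch(node: TNode): TNode {
    const visited = visitEachChildSketch(node, lowerExponentiationSketch);
    if (visited.kind === "BinaryExpression" && visited.operator === "**") {
        return { kind: "CallExpression", callee: "Math.pow", args: [visited.left, visited.right] };
    }
    return visited;
}

// Minimal printer, used here only to inspect the transformed tree.
function printTNodeSketch(node: TNode): string {
    switch (node.kind) {
        case "NumericLiteral":
        case "Identifier":
            return node.text;
        case "BinaryExpression":
            return `${printTNodeSketch(node.left)} ${node.operator} ${printTNodeSketch(node.right)}`;
        case "CallExpression":
            return `${node.callee}(${node.args.map(arg => printTNodeSketch(arg)).join(", ")})`;
    }
}
```

Returning the same node object when no child changed means a transform that touches one expression allocates new nodes only along the path from that expression to the root.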
Performance Optimizations
The parser reuses unchanged nodes during incremental parsing.
Type checking happens on demand, not for all files upfront.
Symbol resolution results are cached to avoid repeated work.
Identifier strings are interned to reduce memory usage.
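The on-demand and caching points can be combined in one small sketch: a type is computed only when first requested, then served from a cache. `LazySymbolSketch` and `createCheckerSketch` are hypothetical, and the `Map`-based cache is an assumption made for illustration rather than a description of the checker's own data structures.

```typescript
// Hypothetical symbol whose type is expensive to compute.
interface LazySymbolSketch { name: string; computeType: () => string; }

function createCheckerSketch() {
    const resolvedTypes = new Map<LazySymbolSketch, string>();
    let computations = 0;
    return {
        // Compute on first request, then reuse the cached result.
        getTypeOfSymbol(symbol: LazySymbolSketch): string {
            let type = resolvedTypes.get(symbol);
            if (type === undefined) {
                computations++;
                type = symbol.computeType();
                resolvedTypes.set(symbol, type);
            }
            return type;
        },
        // Exposed only so the caching behavior can be observed.
        getComputationCount: (): number => computations,
    };
}
```

Symbols that are never queried cost nothing, and symbols queried many times are resolved once.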
Next Steps
Language Service Internals: Learn how the compiler powers IDE features like autocompletion and navigation.
Reference Files
Key implementation files in src/compiler/:
scanner.ts:42 - tokenIsIdentifierOrKeyword() function
scanner.ts:389 - isUnicodeIdentifierStart() function
scanner.ts:414 - tokenToString() function
parser.ts - Main parsing logic
binder.ts - Symbol creation and binding
checker.ts - Type system implementation
emitter.ts - Code generation
types.ts - Core type definitions
utilities.ts - Shared helper functions