compiler/ directory. Each crate handles a specific phase of compilation or provides shared infrastructure.
Compiler Overview
The compiler follows a traditional multi-pass design with three distinct intermediate representations (AST, HIR, MIR) before final code generation.
Main Entry Points
rustc
Binary wrapper that invokes the compiler driver
rustc_driver
Thin wrapper around rustc_driver_impl
rustc_driver_impl
Main compiler driver coordinating all compilation phases
The rustc_driver_impl crate orchestrates the entire compilation pipeline and depends on 30+ other compiler crates.
Compilation Pipeline
The compilation process flows through distinct phases, each implemented by specialized crates.
Phase 1: Lexical Analysis and Parsing
Lexing
rustc_lexer converts source text into a stream of tokens.
- No dependencies on other rustc crates
- Pure function from &str to tokens
- Handles all Rust syntax including raw strings, byte literals, etc.
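To make "a pure function from &str to tokens" concrete, here is a minimal toy lexer. It is only a sketch: the `Token` enum and the `tokenize` function are hypothetical names invented for this example and do not mirror the actual rustc_lexer API, which handles far more syntax.

```rust
/// Hypothetical token kinds for this sketch (much simpler than a real lexer's).
#[derive(Debug, PartialEq, Clone)]
enum Token {
    Ident(String),
    Number(String),
    Punct(char),
}

/// Pure function: the result depends only on the input text.
fn tokenize(src: &str) -> Vec<Token> {
    let mut tokens = Vec::new();
    let mut chars = src.chars().peekable();
    while let Some(&c) = chars.peek() {
        if c.is_whitespace() {
            chars.next(); // skip whitespace
        } else if c.is_alphabetic() || c == '_' {
            // Collect an identifier or keyword.
            let mut ident = String::new();
            while let Some(&c) = chars.peek() {
                if c.is_alphanumeric() || c == '_' {
                    ident.push(c);
                    chars.next();
                } else {
                    break;
                }
            }
            tokens.push(Token::Ident(ident));
        } else if c.is_ascii_digit() {
            // Collect a run of decimal digits.
            let mut number = String::new();
            while let Some(&c) = chars.peek() {
                if c.is_ascii_digit() {
                    number.push(c);
                    chars.next();
                } else {
                    break;
                }
            }
            tokens.push(Token::Number(number));
        } else {
            // Everything else is a single-character punctuation token.
            tokens.push(Token::Punct(c));
            chars.next();
        }
    }
    tokens
}

fn main() {
    println!("{:?}", tokenize("let x = 42;"));
}
```

Because the function takes no state besides its argument, it can be tested and reused in isolation, which is the property the "no dependencies" bullet points at.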
Parsing
rustc_parse builds the Abstract Syntax Tree (AST).
Dependencies:
- rustc_ast: AST data structures
- rustc_lexer: Token stream
- rustc_errors: Diagnostic reporting
- rustc_span: Source location tracking
AST Representation
rustc_ast defines the AST data structures and provides utilities for working with them.
From compiler/rustc_ast/README.md:
The rustc_ast crate contains those things concerned purely with syntax
– that is, the AST (“abstract syntax tree”), along with some definitions
for tokens and token streams, data structures/traits for mutating ASTs,
and shared definitions for other AST-related parts of the compiler.
AST-Related Crates
Phase 2: Name Resolution and HIR
AST to HIR Lowering
rustc_ast_lowering converts the AST to High-Level Intermediate Representation (HIR).
HIR is a desugared, name-resolved version of the AST:
- Removes syntactic sugar
- Resolves identifiers to their definitions
- Simplifies pattern matching
- Normalizes control flow
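As an illustration of "removes syntactic sugar", the sketch below writes a `for` loop by hand in roughly the form that desugaring produces: a `loop` around a `match` on the iterator's `next()`. The function names are made up for this example, and the real HIR differs in detail; the point is only that both forms behave the same.

```rust
/// Surface syntax: a `for` loop.
fn sum_with_for(values: &[i32]) -> i32 {
    let mut total = 0;
    for v in values {
        total += *v;
    }
    total
}

/// Approximately what the `for` loop means after desugaring:
/// an explicit iterator, a `loop`, and a `match` on `next()`.
fn sum_desugared(values: &[i32]) -> i32 {
    let mut total = 0;
    let mut iter = IntoIterator::into_iter(values);
    loop {
        match iter.next() {
            Some(v) => total += *v,
            None => break,
        }
    }
    total
}

fn main() {
    let data = [1, 2, 3, 4];
    println!("for: {}", sum_with_for(&data));
    println!("desugared: {}", sum_desugared(&data));
}
```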
Name Resolution
rustc_resolve performs name resolution.
- Resolves all identifiers to definitions
- Handles imports and visibility
- Builds the name resolution tables
- Detects name conflicts and ambiguities
Phase 3: Type Checking and Analysis
rustc_hir_analysis performs type checking and trait resolution.
Key responsibilities:
- Type inference
- Trait solving
- Well-formedness checking
- Type parameter bounds checking
- Expression type checking
- Method resolution
- Coercion handling
Phase 4: MIR Generation and Optimization
MIR Construction
rustc_mir_build builds the Mid-Level Intermediate Representation (MIR).
MIR is a control-flow graph representation:
- Basic blocks with terminators
- Explicit control flow
- Explicit drops and unwinding
- Suitable for optimization and analysis
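The idea of "basic blocks with terminators" can be sketched with a toy control-flow graph and a tiny interpreter. All types here (`Statement`, `Terminator`, `BasicBlock`) are hypothetical, heavily simplified stand-ins written for this example; they are not the compiler's MIR data structures.

```rust
/// Straight-line operations inside a block (toy version).
#[derive(Debug)]
enum Statement {
    Assign { local: usize, value: i64 },
    AddAssign { local: usize, value: i64 },
}

/// Every block ends in exactly one terminator, which makes control flow explicit.
#[derive(Debug)]
enum Terminator {
    Goto(usize),
    /// Jump to `then_block` if the local is less than `limit`, else to `else_block`.
    BranchIfLess { local: usize, limit: i64, then_block: usize, else_block: usize },
    Return,
}

struct BasicBlock {
    statements: Vec<Statement>,
    terminator: Terminator,
}

/// Interpret the graph starting at block 0 and return the final locals.
fn run(blocks: &[BasicBlock], num_locals: usize) -> Vec<i64> {
    let mut locals = vec![0i64; num_locals];
    let mut current = 0;
    loop {
        let block = &blocks[current];
        for stmt in &block.statements {
            match *stmt {
                Statement::Assign { local, value } => locals[local] = value,
                Statement::AddAssign { local, value } => locals[local] += value,
            }
        }
        match block.terminator {
            Terminator::Goto(target) => current = target,
            Terminator::BranchIfLess { local, limit, then_block, else_block } => {
                current = if locals[local] < limit { then_block } else { else_block };
            }
            Terminator::Return => return locals,
        }
    }
}

/// CFG for: `i = 0; while i < 3 { i += 1 }`.
fn counting_loop() -> Vec<BasicBlock> {
    vec![
        BasicBlock {
            statements: vec![Statement::Assign { local: 0, value: 0 }],
            terminator: Terminator::Goto(1),
        },
        BasicBlock {
            statements: vec![],
            terminator: Terminator::BranchIfLess { local: 0, limit: 3, then_block: 2, else_block: 3 },
        },
        BasicBlock {
            statements: vec![Statement::AddAssign { local: 0, value: 1 }],
            terminator: Terminator::Goto(1),
        },
        BasicBlock { statements: vec![], terminator: Terminator::Return },
    ]
}

fn main() {
    println!("final locals: {:?}", run(&counting_loop(), 1));
}
```

The `while` loop from the source program no longer exists as a construct: it has become a back edge from block 2 to block 1, which is what makes this shape convenient for analyses.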
Borrow Checking
rustc_borrowck implements Rust’s ownership and borrowing rules.
This is where Rust’s memory safety guarantees are enforced:
- Lifetime checking
- Borrow tracking
- Move semantics
- Non-Lexical Lifetimes (NLL)
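A small example of what Non-Lexical Lifetimes permit: a shared borrow ends at its last use rather than at the end of its scope, so a later mutation is accepted. The function name is invented for this illustration, and the commented-out variant shows a case the borrow checker rejects.

```rust
/// Accepted: the shared borrow `first` ends at its last use,
/// so the later mutable borrow needed by `push` does not conflict.
fn first_then_push(mut scores: Vec<u32>) -> (u32, usize) {
    let first = &scores[0];
    let copy = *first; // last use of `first`
    scores.push(copy + 1);
    (copy, scores.len())
}

// Rejected (error E0502) if uncommented, because `first` is still
// live when `scores` is mutated:
//
// fn rejected(mut scores: Vec<u32>) -> u32 {
//     let first = &scores[0];
//     scores.push(1);
//     *first
// }

fn main() {
    println!("{:?}", first_then_push(vec![10, 20]));
}
```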
Constant Evaluation
rustc_const_eval evaluates constants at compile time.
- Const function evaluation
- Static initialization
- Compile-time computation
- Pattern matching exhaustiveness
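From the user's side, const function evaluation and static initialization look like the following sketch. The `factorial` function and the constant names are made up for this example; the values are computed during compilation, and the `const _` assertion fails the build if it does not hold.

```rust
/// A function that can be evaluated at compile time.
const fn factorial(n: u64) -> u64 {
    let mut result = 1;
    let mut i = 2;
    while i <= n {
        result *= i;
        i += 1;
    }
    result
}

// Evaluated by the compiler, not at run time.
const FACT_10: u64 = factorial(10);

// Static initializers are also evaluated at compile time.
static TABLE: [u64; 4] = [factorial(0), factorial(1), factorial(2), factorial(3)];

// Compile-time check: a false condition here is a build error.
const _: () = assert!(FACT_10 == 3_628_800);

fn main() {
    println!("10! = {}", FACT_10);
    println!("table = {:?}", TABLE);
}
```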
MIR is central to Rust’s compilation model. It’s the representation where borrow checking, optimization, and most analyses occur.
Phase 5: Code Generation
Monomorphization
rustc_monomorphize generates concrete versions of generic code.
- Instantiates generic functions
- Performs collection of items to codegen
- Handles specialization
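The effect of instantiating generic functions can be observed from ordinary code. In this sketch the hypothetical `describe` function is written once, but each call with a different type argument uses its own concrete instance, which is why `size_of::<T>()` differs per call.

```rust
use std::any::type_name;
use std::mem::size_of;

/// One generic definition; the compiler generates a concrete copy
/// for every type it is actually used with.
fn describe<T>(_value: T) -> (&'static str, usize) {
    (type_name::<T>(), size_of::<T>())
}

fn main() {
    // Each call below refers to a distinct instantiation of `describe`.
    println!("{:?}", describe(1u8));
    println!("{:?}", describe(1u64));
    println!("{:?}", describe([0u16; 4]));
}
```

Only instantiations that are reachable from the program need code generated for them, which is the "collection" step mentioned above.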
Code Generation
rustc_codegen_ssa provides the shared codegen abstraction.
Backend implementations:
- rustc_codegen_llvm: LLVM backend (default)
- rustc_codegen_cranelift: Cranelift backend
- rustc_codegen_gcc: GCC backend
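The general shape of a shared abstraction with interchangeable backends can be sketched with a trait and two implementations. Everything here (`ToyBackend`, `StackBackend`, `RegisterBackend`, `codegen_sum`, and the pseudo-assembly text) is hypothetical and exists only for illustration; the real traits in rustc_codegen_ssa are far richer.

```rust
/// Hypothetical backend interface for this sketch.
trait ToyBackend {
    fn name(&self) -> &'static str;
    /// Emit textual pseudo-assembly that adds two named values.
    fn emit_add(&self, lhs: &str, rhs: &str) -> String;
}

struct StackBackend;
struct RegisterBackend;

impl ToyBackend for StackBackend {
    fn name(&self) -> &'static str {
        "stack"
    }
    fn emit_add(&self, lhs: &str, rhs: &str) -> String {
        format!("push {lhs}\npush {rhs}\nadd")
    }
}

impl ToyBackend for RegisterBackend {
    fn name(&self) -> &'static str {
        "register"
    }
    fn emit_add(&self, lhs: &str, rhs: &str) -> String {
        format!("add r0, {lhs}, {rhs}")
    }
}

/// Backend-independent logic: it only talks to the trait.
fn codegen_sum(backend: &dyn ToyBackend, lhs: &str, rhs: &str) -> String {
    format!("; backend: {}\n{}", backend.name(), backend.emit_add(lhs, rhs))
}

fn main() {
    let backends: [&dyn ToyBackend; 2] = [&StackBackend, &RegisterBackend];
    for backend in backends {
        println!("{}\n", codegen_sum(backend, "a", "b"));
    }
}
```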
LLVM Backend
rustc_codegen_llvm is the default code generation backend:
- rustc_llvm: LLVM C++ bindings
Alternative Backends
The compiler supports alternative backends for different use cases:
- Cranelift: Faster compilation times, less optimization
- GCC: Platform support through GCC
Supporting Infrastructure
Data Structures and Utilities
Core Data Structures
rustc_data_structures provides specialized compiler data structures:
- Interner: String interning
- IndexVec: Indexed vector types
- FxHashMap/FxHashSet: Fast hashing collections
- Fingerprint: For incremental compilation
- Graph data structures
- Persistent data structures
Indexing
rustc_index defines newtype indices for type-safe indexing:
- DefId: Definition identifiers
- LocalDefId: Local definition identifiers
- HirId: HIR node identifiers
- Various other index types
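The benefit of newtype indices is that an index for one kind of entity cannot be used to look up another. The sketch below shows the pattern with hand-written types; `Idx`, `BlockId`, `LocalId`, and `TypedVec` are hypothetical names for this example and are much simpler than what rustc_index provides.

```rust
use std::marker::PhantomData;

/// Hypothetical index trait for this sketch.
trait Idx: Copy {
    fn new(index: usize) -> Self;
    fn index(self) -> usize;
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct BlockId(u32);

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct LocalId(u32);

impl Idx for BlockId {
    fn new(index: usize) -> Self {
        BlockId(index as u32)
    }
    fn index(self) -> usize {
        self.0 as usize
    }
}

impl Idx for LocalId {
    fn new(index: usize) -> Self {
        LocalId(index as u32)
    }
    fn index(self) -> usize {
        self.0 as usize
    }
}

/// A vector that can only be indexed by its own index type `I`.
struct TypedVec<I: Idx, T> {
    raw: Vec<T>,
    _index: PhantomData<I>,
}

impl<I: Idx, T> TypedVec<I, T> {
    fn new() -> Self {
        TypedVec { raw: Vec::new(), _index: PhantomData }
    }
    /// Append a value and return its typed index.
    fn push(&mut self, value: T) -> I {
        let id = I::new(self.raw.len());
        self.raw.push(value);
        id
    }
    fn get(&self, id: I) -> &T {
        &self.raw[id.index()]
    }
}

fn main() {
    let mut blocks: TypedVec<BlockId, &str> = TypedVec::new();
    let mut locals: TypedVec<LocalId, &str> = TypedVec::new();
    let entry = blocks.push("entry");
    let x = locals.push("x");
    println!("{:?} -> {}", entry, blocks.get(entry));
    println!("{:?} -> {}", x, locals.get(x));
    // blocks.get(x); // would not compile: expected `BlockId`, found `LocalId`
}
```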
Hashing
rustc_hashes provides specialized hash implementations:
- Fast, non-cryptographic hashing
- Stable hashing for incremental compilation
Arena Allocation
rustc_arena provides arena allocators for efficient memory management:
- Batch allocation/deallocation
- Reduced memory fragmentation
- Improved cache locality
Error Handling and Diagnostics
rustc_errors implements comprehensive error reporting:
- Multi-span diagnostics
- Structured suggestions
- Error formatting and styling
- JSON output for tools
Metadata and Linking
Metadata
rustc_metadata handles crate metadata:
- Reading compiled crate metadata
- Writing metadata to rlibs
- Dependency tracking
- Cross-crate information
Symbol Mangling
rustc_symbol_mangling generates mangled symbol names:
- Name mangling for linker
- Demangling support
- Symbol versioning
Session and Configuration
rustc_session manages compilation session state:
- Target triples
- ABI specifications
- Calling conventions
- Platform-specific details
Feature Management
rustc_feature manages language features:
- Stable features
- Unstable features
- Feature gates
- Edition-based features
Incremental Compilation
Query System
rustc_query_impl implements the query system for demand-driven compilation:
- On-demand computation
- Automatic dependency tracking
- Result caching
- Incremental recompilation
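On-demand computation with result caching can be sketched as a memoized function over a set of inputs. This toy `Queries` type is hypothetical and deliberately minimal: it shows only the "compute when asked, then reuse" part and does not attempt the dependency tracking or cross-session persistence that the real query system provides.

```rust
use std::collections::HashMap;

/// Toy query engine: inputs plus a cache of already computed results.
struct Queries {
    sources: HashMap<String, String>,
    line_count_cache: HashMap<String, usize>,
    /// Counts how often the underlying computation actually ran.
    computations: usize,
}

impl Queries {
    fn new(files: &[(&str, &str)]) -> Self {
        Queries {
            sources: files
                .iter()
                .map(|(name, text)| (name.to_string(), text.to_string()))
                .collect(),
            line_count_cache: HashMap::new(),
            computations: 0,
        }
    }

    /// Query: number of lines in a file. Computed on first request only.
    /// Panics if the file is unknown (acceptable for a sketch).
    fn line_count(&mut self, file: &str) -> usize {
        if let Some(&cached) = self.line_count_cache.get(file) {
            return cached;
        }
        self.computations += 1;
        let count = self.sources[file].lines().count();
        self.line_count_cache.insert(file.to_string(), count);
        count
    }

    /// A query built on top of another query.
    fn total_lines(&mut self, files: &[&str]) -> usize {
        files.iter().map(|&f| self.line_count(f)).sum()
    }
}

fn main() {
    let mut queries = Queries::new(&[("a.rs", "fn a() {}\nfn b() {}"), ("b.rs", "fn c() {}")]);
    println!("total = {}", queries.total_lines(&["a.rs", "b.rs"]));
    println!("a.rs again = {}", queries.line_count("a.rs"));
    println!("computations = {}", queries.computations);
}
```

Nothing is computed for a file until some query asks for it, and asking twice does the work once.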
Macros and Expansion
rustc_expand implements macro expansion:
- Declarative macros (macro_rules!)
- Procedural macros
- Built-in macros
- Derive macros
println!, format!, assert!, debug_assert!, include!, include_str!, cfg!, env!
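For readers less familiar with declarative macros, here is a small `macro_rules!` definition of the kind that macro expansion processes. The `max_of!` macro is a made-up example: each invocation is rewritten into ordinary expressions before later compilation phases see the code.

```rust
/// Recursive declarative macro: expands to nested comparisons.
macro_rules! max_of {
    // Base case: a single expression is its own maximum.
    ($x:expr) => {
        $x
    };
    // Recursive case: compare the head with the maximum of the rest.
    ($x:expr, $($rest:expr),+) => {{
        let a = $x;
        let b = max_of!($($rest),+);
        if a > b { a } else { b }
    }};
}

fn main() {
    println!("{}", max_of!(3, 9, 4));
    println!("{}", max_of!(2.5, 1.0));
}
```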
Platform-Specific Crates
Sanitizers
rustc_sanitizers integrates with LLVM sanitizers:
- AddressSanitizer
- ThreadSanitizer
- MemorySanitizer
- Leak Sanitizer
Windows Support
rustc_windows_rc handles Windows resource files.
Internationalization
rustc_baked_icu_data contains embedded ICU data for Unicode operations.
Utility Crates
Filesystem
rustc_fs_util: Filesystem utilities for the compiler
Logging
rustc_log: Logging infrastructure
Graphviz
rustc_graphviz: Graphviz output generation
Macros
rustc_macros: Internal macros for compiler development
Serialization
rustc_serialize: Custom serialization for compiler types
Threading
rustc_thread_pool: Thread pool for parallel compilation
Analysis and Checking
rustc_passes implements various compiler passes:
- Liveness analysis
- Stability checking
- Reachability analysis
- Entry point detection
- Built-in lints
- Lint attributes
- Lint levels
- Custom lint infrastructure
Pattern Matching
rustc_pattern_analysis implements pattern matching analysis:
- Exhaustiveness checking
- Usefulness checking
- Reachability analysis
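Exhaustiveness checking is visible in everyday code: a `match` must cover every variant. In the sketch below (the `Shape` enum and `area` function are invented for this example), deleting any arm produces error E0004, and adding a redundant arm after the existing ones triggers an unreachable-pattern warning.

```rust
enum Shape {
    Circle(f64),
    Rect { w: f64, h: f64 },
    Unit,
}

/// Every variant is handled, so the match is exhaustive.
fn area(shape: &Shape) -> f64 {
    match shape {
        Shape::Circle(r) => std::f64::consts::PI * r * r,
        Shape::Rect { w, h } => w * h,
        Shape::Unit => 1.0,
        // Removing an arm above: error E0004 (non-exhaustive patterns).
        // Adding `Shape::Unit => 0.0` here: warning, unreachable pattern.
    }
}

fn main() {
    let shapes = [Shape::Circle(1.0), Shape::Rect { w: 2.0, h: 3.0 }, Shape::Unit];
    for shape in &shapes {
        println!("{}", area(shape));
    }
}
```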
Type System Transmutation
rustc_transmute handles safe transmutation analysis for the transmute intrinsic.
Public API
rustc_public and rustc_public_bridge provide stable APIs for external tools. These crates enable stable MIR consumers to interact with the compiler without depending on unstable internals.
Compilation Flow Diagram
- Blue: AST/HIR/MIR construction
- Red: Type checking and borrow checking
- Green: Optimization
- Yellow: Code generation
Key Takeaways
Modular Design
74+ specialized crates, each with a single responsibility
Three IRs
AST → HIR → MIR pipeline enables different analyses at appropriate levels
Query-Based
Demand-driven compilation with automatic dependency tracking
Backend Agnostic
Abstraction layer supports multiple code generation backends
Further Reading
- rustc dev guide - Comprehensive compiler development documentation
- Architecture Overview - High-level system architecture
- Bootstrap System - How the compiler builds itself