The Rust compiler (rustc) is implemented as a collection of 74+ specialized crates in the compiler/ directory. Each crate handles a specific phase of compilation or provides shared infrastructure.

Compiler Overview

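The compiler lowers a program through three main intermediate representations (AST, HIR, and MIR) before final code generation. The phases are presented in pipeline order below, although the query system covered under Incremental Compilation computes many of them on demand rather than as strict passes.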

Main Entry Points

rustc

Binary wrapper that invokes the compiler driver

rustc_driver

Thin wrapper around rustc_driver_impl

rustc_driver_impl

Main compiler driver coordinating all compilation phases
The rustc_driver_impl crate orchestrates the entire compilation pipeline and depends on 30+ other compiler crates:
// Key dependencies from rustc_driver_impl/Cargo.toml
rustc_ast
rustc_ast_pretty
rustc_codegen_ssa
rustc_const_eval
rustc_data_structures
rustc_errors
rustc_expand
rustc_hir_analysis
rustc_interface
rustc_metadata
rustc_middle
rustc_mir_build
rustc_mir_transform
rustc_parse
rustc_resolve
rustc_session
// ... and more

Compilation Pipeline

The compilation process flows through distinct phases, each implemented by specialized crates:

Phase 1: Lexical Analysis and Parsing

1

Lexing

rustc_lexer converts source text into a stream of tokens.
  • No dependencies on other rustc crates
  • Pure function from &str to tokens
  • Handles all Rust syntax including raw strings, byte literals, etc.
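To make the "pure function from &str to tokens" idea concrete, here is a minimal sketch of a toy lexer. It is not the rustc_lexer API: the `TokenKind` variants, the `Token` shape (a kind plus a byte length), and the `tokenize` function are assumptions chosen for illustration, and the token set is far smaller than real Rust syntax.

```rust
/// Token kinds for a toy lexer (hypothetical; much smaller than a real Rust lexer's set).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TokenKind {
    Ident,
    Number,
    Punct,
    Whitespace,
}

/// A token stores only its kind and byte length; the caller recovers
/// the text by slicing the original input.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Token {
    pub kind: TokenKind,
    pub len: usize,
}

/// Pure function: the same input always yields the same tokens,
/// and no global state is read or written.
pub fn tokenize(input: &str) -> Vec<Token> {
    let mut tokens = Vec::new();
    let mut chars = input.char_indices().peekable();
    while let Some((start, c)) = chars.next() {
        let kind = if c.is_whitespace() {
            TokenKind::Whitespace
        } else if c.is_alphabetic() || c == '_' {
            TokenKind::Ident
        } else if c.is_ascii_digit() {
            TokenKind::Number
        } else {
            TokenKind::Punct
        };
        // Extend the token while the following characters belong to the same class.
        // Punctuation is always a single character in this sketch.
        let mut end = start + c.len_utf8();
        if kind != TokenKind::Punct {
            while let Some(&(i, next)) = chars.peek() {
                let same = match kind {
                    TokenKind::Whitespace => next.is_whitespace(),
                    TokenKind::Ident => next.is_alphanumeric() || next == '_',
                    TokenKind::Number => next.is_ascii_digit(),
                    TokenKind::Punct => false,
                };
                if !same {
                    break;
                }
                end = i + next.len_utf8();
                chars.next();
            }
        }
        tokens.push(Token { kind, len: end - start });
    }
    tokens
}
```

Because the tokens carry lengths rather than text, summing the lengths of `tokenize("let x = 42;")` gives back the input's byte length, which is a cheap sanity check that no input was dropped.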
2

Parsing

rustc_parse builds the Abstract Syntax Tree (AST). Dependencies:
  • rustc_ast: AST data structures
  • rustc_lexer: Token stream
  • rustc_errors: Diagnostic reporting
  • rustc_span: Source location tracking
3

AST Representation

rustc_ast defines the AST data structures and provides utilities for working with them. From compiler/rustc_ast/README.md:
The rustc_ast crate contains those things concerned purely with syntax – that is, the AST (“abstract syntax tree”), along with some definitions for tokens and token streams, data structures/traits for mutating ASTs, and shared definitions for other AST-related parts of the compiler.

Phase 2: Name Resolution and HIR

1

AST to HIR Lowering

rustc_ast_lowering converts the AST to the High-Level Intermediate Representation (HIR). HIR is a desugared, name-resolved version of the AST:
  • Removes syntactic sugar
  • Records the definitions that identifiers resolve to, using the results computed by rustc_resolve
  • Simplifies pattern matching
  • Normalizes control flow
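As an illustration of what "removes syntactic sugar" means, the sketch below writes the same loop twice: once with `for`, and once in roughly the form it takes after desugaring (an explicit iterator, a `loop`, and a `match` on `next()`). The function names are hypothetical, and the second version is an approximation written in surface Rust, not the compiler's actual HIR output.

```rust
/// Sum using the surface `for` syntax.
pub fn sum_with_for(values: &[i32]) -> i32 {
    let mut total = 0;
    for v in values {
        total += *v;
    }
    total
}

/// Roughly what the loop above means after desugaring: an explicit
/// iterator, a `loop`, and a `match` on `next()`.
pub fn sum_desugared(values: &[i32]) -> i32 {
    let mut total = 0;
    let mut iter = IntoIterator::into_iter(values);
    loop {
        match iter.next() {
            Some(v) => total += *v,
            None => break,
        }
    }
    total
}
```

Later phases only need to understand `loop` and `match`, which is one reason analyses are simpler on HIR than on the AST.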
2

Name Resolution

rustc_resolve performs name resolution on the AST; lowering to HIR then consumes its results.
  • Resolves all identifiers to definitions
  • Handles imports and visibility
  • Builds the name resolution tables
  • Detects name conflicts and ambiguities
3

HIR Representation

rustc_hir defines the HIR data structures. Supporting crates:
  • rustc_hir_id: HIR node identification
  • rustc_hir_pretty: Pretty-printing HIR

Phase 3: Type Checking and Analysis

rustc_hir_analysis performs type checking and trait resolution. Key responsibilities:
  • Type inference
  • Trait solving
  • Well-formedness checking
  • Type parameter bounds checking
rustc_hir_typeck handles detailed type checking:
  • Expression type checking
  • Method resolution
  • Coercion handling
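The three bullets above correspond to things a Rust programmer relies on daily. The following sketch shows one small instance of each; the function names are made up for illustration, and the comments describe the language-level behavior rather than the compiler's internal algorithms.

```rust
/// Type inference: the element type behind `Vec<_>` is never written out;
/// it is inferred from the range and the closure.
pub fn squares(n: u32) -> Vec<u32> {
    let squares: Vec<_> = (1..=n).map(|x| x * x).collect();
    squares
}

pub fn shout(s: &str) -> String {
    s.to_uppercase()
}

/// Coercion: a `&String` is accepted where `&str` is expected (deref coercion).
pub fn shout_owned(s: &String) -> String {
    shout(s)
}

/// Method resolution: `Box` has no `len` method, so the call
/// auto-dereferences through the reference and the box until it finds `Vec::len`.
pub fn boxed_len(b: &Box<Vec<u8>>) -> usize {
    b.len()
}
```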

Phase 4: MIR Generation and Optimization

1

MIR Construction

rustc_mir_build builds the Mid-Level Intermediate Representation (MIR). MIR is a control-flow graph representation:
  • Basic blocks with terminators
  • Explicit control flow
  • Explicit drops and unwinding
  • Suitable for optimization and analysis
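The "basic blocks with terminators" shape can be sketched with a toy control-flow graph and an interpreter for it. Every type here (`Statement`, `Terminator`, `BasicBlock`) is a hypothetical simplification; real MIR has many more statement and terminator kinds, typed locals, and explicit drops.

```rust
/// Straight-line operations inside a block; none of them can branch.
pub enum Statement {
    /// locals[dest] = value
    Assign { dest: usize, value: i64 },
    /// locals[dest] = locals[lhs] + locals[rhs]
    Add { dest: usize, lhs: usize, rhs: usize },
}

/// Every block ends in exactly one terminator, the only place
/// where control flow can change.
pub enum Terminator {
    Goto { target: usize },
    /// Branch on whether locals[cond] is nonzero.
    SwitchBool { cond: usize, if_true: usize, if_false: usize },
    Return,
}

pub struct BasicBlock {
    pub statements: Vec<Statement>,
    pub terminator: Terminator,
}

/// Interpret the graph starting at block 0; local 0 holds the return value.
pub fn run(blocks: &[BasicBlock], num_locals: usize) -> i64 {
    let mut locals = vec![0i64; num_locals];
    let mut current = 0;
    loop {
        let block = &blocks[current];
        for stmt in &block.statements {
            match *stmt {
                Statement::Assign { dest, value } => locals[dest] = value,
                Statement::Add { dest, lhs, rhs } => locals[dest] = locals[lhs] + locals[rhs],
            }
        }
        match block.terminator {
            Terminator::Goto { target } => current = target,
            Terminator::SwitchBool { cond, if_true, if_false } => {
                current = if locals[cond] != 0 { if_true } else { if_false };
            }
            Terminator::Return => return locals[0],
        }
    }
}

/// CFG for: `let mut i = -3; let mut n = 0; while i != 0 { i += 1; n += 1 } n`
/// Locals: 0 = n (return value), 1 = i, 2 = the constant 1.
pub fn count_to_three() -> Vec<BasicBlock> {
    vec![
        // bb0: initialize, then jump to the loop header.
        BasicBlock {
            statements: vec![
                Statement::Assign { dest: 1, value: -3 },
                Statement::Assign { dest: 0, value: 0 },
                Statement::Assign { dest: 2, value: 1 },
            ],
            terminator: Terminator::Goto { target: 1 },
        },
        // bb1: loop header, tests `i != 0`.
        BasicBlock {
            statements: vec![],
            terminator: Terminator::SwitchBool { cond: 1, if_true: 2, if_false: 3 },
        },
        // bb2: loop body, then back edge to the header.
        BasicBlock {
            statements: vec![
                Statement::Add { dest: 1, lhs: 1, rhs: 2 },
                Statement::Add { dest: 0, lhs: 0, rhs: 2 },
            ],
            terminator: Terminator::Goto { target: 1 },
        },
        // bb3: exit.
        BasicBlock { statements: vec![], terminator: Terminator::Return },
    ]
}
```

The `while` loop has disappeared entirely: it is expressed only through the header block's switch and the body's back edge, which is what "explicit control flow" refers to.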
2

Borrow Checking

rustc_borrowck implements Rust’s ownership and borrowing rules. This is where Rust’s memory safety guarantees are enforced:
  • Lifetime checking
  • Borrow tracking
  • Move semantics
  • Non-Lexical Lifetimes (NLL)
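A short example may help show what these checks accept and reject. The functions below are illustrative only; the first compiles because of non-lexical lifetimes, and the second marks, in a comment, a use that the borrow checker would reject.

```rust
/// Accepted under non-lexical lifetimes: the shared borrow `first`
/// is last used before the mutation, so the two never overlap.
pub fn push_after_read(mut v: Vec<i32>) -> Vec<i32> {
    let first = &v[0];
    let doubled = *first * 2; // last use of `first`
    v.push(doubled); // a mutable borrow is fine from here on
    v
}

fn consume(s: String) -> usize {
    s.len()
}

/// Move semantics: `s` is moved into `consume`, so any later use is an error.
pub fn move_then_len(s: String) -> usize {
    let n = consume(s);
    // println!("{s}"); // rejected with error[E0382]: `s` was moved above
    n
}
```

Under purely lexical lifetimes, `first` would have been considered live until the end of the function, and `v.push(doubled)` would have been rejected.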
3

Constant Evaluation

rustc_const_eval evaluates constants at compile time by interpreting MIR.
  • Const function evaluation
  • Static initialization
  • Compile-time computation
  • Evaluation of constant expressions such as array lengths and enum discriminants
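From the user's side, compile-time evaluation looks like the sketch below: a `const fn` whose result feeds a constant and an array length. The names `factorial`, `FACT_10`, and `TABLE` are made up for this example.

```rust
/// A `const fn` can be called both at compile time and at run time.
pub const fn factorial(n: u64) -> u64 {
    let mut acc = 1;
    let mut i = 2;
    while i <= n {
        acc *= i;
        i += 1;
    }
    acc
}

/// Evaluated during compilation, because a `const` initializer must be.
pub const FACT_10: u64 = factorial(10);

/// A constant expression can also determine an array length,
/// which is part of the array's type.
pub static TABLE: [u8; factorial(3) as usize] = [0; factorial(3) as usize];
```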
4

MIR Optimization

rustc_mir_transform performs MIR-level optimizations. Optimizations include:
  • Inlining
  • Dead code elimination
  • Constant propagation
  • Simplification passes
MIR is central to Rust’s compilation model. It’s the representation where borrow checking, optimization, and most analyses occur.
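To give a feel for what a simplification pass does, here is a minimal constant-folding sketch. It operates on a small expression tree rather than on MIR, and the `Expr` type and `fold` function are hypothetical; it shows the general technique, not the compiler's actual passes.

```rust
#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    Const(i64),
    Var(String),
    Add(Box<Expr>, Box<Expr>),
    Mul(Box<Expr>, Box<Expr>),
}

/// Fold bottom-up: simplify both operands first, then combine them
/// if both turned out to be constants.
pub fn fold(expr: Expr) -> Expr {
    match expr {
        Expr::Add(l, r) => match (fold(*l), fold(*r)) {
            (Expr::Const(a), Expr::Const(b)) => Expr::Const(a.wrapping_add(b)),
            (l, r) => Expr::Add(Box::new(l), Box::new(r)),
        },
        Expr::Mul(l, r) => match (fold(*l), fold(*r)) {
            (Expr::Const(a), Expr::Const(b)) => Expr::Const(a.wrapping_mul(b)),
            (l, r) => Expr::Mul(Box::new(l), Box::new(r)),
        },
        // Constants and variables are already as simple as they can be.
        other => other,
    }
}
```

For example, `2 + 3 * 4` folds to the constant `14`, while `x + (1 + 2)` folds only as far as `x + 3` because `x` is unknown.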

Phase 5: Code Generation

Monomorphization

rustc_monomorphize generates concrete versions of generic code.
  • Instantiates generic functions
  • Performs collection of items to codegen
  • Handles specialization
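Conceptually, instantiating a generic function means producing one concrete copy per set of type arguments that is actually used. The sketch below pairs a generic function with a hand-written version of what its `i32` instance amounts to; `largest` and `largest_i32` are illustrative names, and the hand-written copy is an approximation of the idea, not compiler output.

```rust
/// Generic source: one definition, usable with any ordered, copyable type.
pub fn largest<T: PartialOrd + Copy>(items: &[T]) -> Option<T> {
    let mut iter = items.iter().copied();
    let mut best = iter.next()?;
    for item in iter {
        if item > best {
            best = item;
        }
    }
    Some(best)
}

/// Roughly what the instance `largest::<i32>` amounts to after
/// monomorphization: the same body with `T` replaced by `i32`.
pub fn largest_i32(items: &[i32]) -> Option<i32> {
    let mut iter = items.iter().copied();
    let mut best = iter.next()?;
    for item in iter {
        if item > best {
            best = item;
        }
    }
    Some(best)
}
```

Calling `largest` with both `i32` and `f64` slices in one program requires two separate instances, which is why only the instantiations that are actually reachable are collected for codegen.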

Code Generation

rustc_codegen_ssa provides the shared codegen abstraction. Backend implementations:
  • rustc_codegen_llvm: LLVM backend (default)
  • rustc_codegen_cranelift: Cranelift backend
  • rustc_codegen_gcc: GCC backend

LLVM Backend

rustc_codegen_llvm is the default code generation backend:
# Optional dependency in rustc_interface
rustc_codegen_llvm = { path = "../rustc_codegen_llvm", optional = true }
Supporting LLVM integration:
  • rustc_llvm: LLVM C++ bindings

Alternative Backends

The compiler supports alternative backends for different use cases:
  • Cranelift: Faster compilation times, less optimization
  • GCC: Platform support through GCC
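The general pattern behind pluggable backends is a trait that backend-independent code programs against. The sketch below is a deliberately tiny stand-in: the `Backend` trait, both implementations, and the textual output formats are all invented for illustration and do not reflect the real interfaces in rustc_codegen_ssa.

```rust
/// Hypothetical, heavily simplified backend interface.
pub trait Backend {
    fn name(&self) -> &'static str;
    /// "Compile" a function that returns a constant by emitting text.
    fn emit_const_fn(&self, fn_name: &str, value: i64) -> String;
}

/// Emits an IR-like textual form (made-up syntax).
pub struct TextIrBackend;

/// Emits an assembly-like textual form (made-up syntax).
pub struct PseudoAsmBackend;

impl Backend for TextIrBackend {
    fn name(&self) -> &'static str {
        "text-ir"
    }
    fn emit_const_fn(&self, fn_name: &str, value: i64) -> String {
        format!("define i64 @{fn_name}() {{ ret i64 {value} }}")
    }
}

impl Backend for PseudoAsmBackend {
    fn name(&self) -> &'static str {
        "pseudo-asm"
    }
    fn emit_const_fn(&self, fn_name: &str, value: i64) -> String {
        format!("{fn_name}:\n    mov r0, {value}\n    ret")
    }
}

/// Backend-independent driver code: it only sees the trait.
pub fn compile_all(backend: &dyn Backend, items: &[(&str, i64)]) -> Vec<String> {
    items
        .iter()
        .map(|(name, value)| backend.emit_const_fn(name, *value))
        .collect()
}
```

Swapping `&TextIrBackend` for `&PseudoAsmBackend` at the call site changes the output without touching `compile_all`, which is the property the shared abstraction layer is meant to provide.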

Supporting Infrastructure

Data Structures and Utilities

rustc_data_structures provides specialized compiler data structures:
  • Interner: String interning
  • IndexVec: Indexed vector types
  • FxHashMap/FxHashSet: Fast hashing collections
  • Fingerprint: For incremental compilation
  • Graph data structures
  • Persistent data structures
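rustc_index provides the building blocks for type-safe indexing: an index trait, index-keyed collections, and support for declaring newtype indices. Identifier types defined in other compiler crates build on these indices, for example:
  • DefId: Definition identifiers
  • LocalDefId: Local definition identifiers
  • HirId: HIR node identifiers
  • Various other index types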
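The value of newtype indices is that an index for one kind of thing cannot be used to look up another. A minimal sketch of the idea follows; `Idx`, `BlockId`, `LocalId`, and `TypedVec` are names chosen for this example and should not be read as the exact rustc_index API.

```rust
use std::marker::PhantomData;

/// Anything that can be converted to and from a plain `usize` position.
pub trait Idx: Copy {
    fn new(i: usize) -> Self;
    fn index(self) -> usize;
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct BlockId(u32);

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct LocalId(u32);

impl Idx for BlockId {
    fn new(i: usize) -> Self {
        BlockId(i as u32)
    }
    fn index(self) -> usize {
        self.0 as usize
    }
}

impl Idx for LocalId {
    fn new(i: usize) -> Self {
        LocalId(i as u32)
    }
    fn index(self) -> usize {
        self.0 as usize
    }
}

/// A vector that can only be indexed by its own index type `I`.
pub struct TypedVec<I: Idx, T> {
    raw: Vec<T>,
    _marker: PhantomData<I>,
}

impl<I: Idx, T> TypedVec<I, T> {
    pub fn new() -> Self {
        TypedVec { raw: Vec::new(), _marker: PhantomData }
    }

    /// Append a value and return the typed index that refers to it.
    pub fn push(&mut self, value: T) -> I {
        let id = I::new(self.raw.len());
        self.raw.push(value);
        id
    }

    pub fn get(&self, id: I) -> &T {
        &self.raw[id.index()]
    }
}
```

With this in place, passing a `LocalId` to a `TypedVec<BlockId, _>` is a compile-time type error rather than a silent out-of-range or wrong-element bug.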
rustc_index_macros provides macros for index types.
rustc_hashes provides specialized hash implementations:
  • Fast, non-cryptographic hashing
  • Stable hashing for incremental compilation
rustc_arena provides arena allocators for efficient memory management:
  • Batch allocation/deallocation
  • Reduced memory fragmentation
  • Improved cache locality

Error Handling and Diagnostics

rustc_errors implements comprehensive error reporting:
  • Multi-span diagnostics
  • Structured suggestions
  • Error formatting and styling
  • JSON output for tools
rustc_error_codes defines all compiler error codes.
rustc_error_messages manages error message localization.

Metadata and Linking

Metadata

rustc_metadata handles crate metadata:
  • Reading compiled crate metadata
  • Writing metadata to rlibs
  • Dependency tracking
  • Cross-crate information

Symbol Mangling

rustc_symbol_mangling generates mangled symbol names:
  • Name mangling for linker
  • Demangling support
  • Symbol versioning

Session and Configuration

rustc_session manages compilation session state:
  • Compiler options (optimization level, target, etc.)
  • Feature gates and stability tracking
  • Source file mapping
  • Diagnostic emitter
  • Target information
rustc_target defines target platform information:
  • Target triples
  • ABI specifications
  • Calling conventions
  • Platform-specific details
rustc_abi handles ABI-related types and computations.

Feature Management

rustc_feature manages language features:
  • Stable features
  • Unstable features
  • Feature gates
  • Edition-based features
rustc_attr_parsing parses attributes and feature gates.

Incremental Compilation

1

Query System

rustc_query_impl implements the query system for demand-driven compilation:
  • On-demand computation
  • Automatic dependency tracking
  • Result caching
  • Incremental recompilation
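The first three bullets can be sketched in a few dozen lines. The toy engine below memoizes query results, computes them only when asked, and records which queries each query called while running. All names (`Query`, `QueryEngine`, `deps_of`) are hypothetical, and the sketch omits what makes the real system hard: invalidation when inputs change, cycle detection, and persistence.

```rust
use std::collections::HashMap;

/// Hypothetical query keys for a toy computation.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum Query {
    Input(&'static str),
    Square(&'static str),
    SumOfSquares(&'static str, &'static str),
}

#[derive(Default)]
pub struct QueryEngine {
    inputs: HashMap<&'static str, i64>,
    cache: HashMap<Query, i64>,
    deps: HashMap<Query, Vec<Query>>,
    /// Queries currently being executed, innermost last.
    stack: Vec<Query>,
    /// How many times a query body actually ran (cache misses).
    pub executions: usize,
}

impl QueryEngine {
    pub fn set_input(&mut self, name: &'static str, value: i64) {
        self.inputs.insert(name, value);
    }

    pub fn get(&mut self, query: Query) -> i64 {
        // Dependency tracking: the query currently executing depends on this one.
        if let Some(&parent) = self.stack.last() {
            let edges = self.deps.entry(parent).or_default();
            if !edges.contains(&query) {
                edges.push(query);
            }
        }
        // Result caching: each query body runs at most once.
        if let Some(&value) = self.cache.get(&query) {
            return value;
        }
        // On-demand computation: run the body, which may request other queries.
        self.stack.push(query);
        self.executions += 1;
        let value = match query {
            Query::Input(name) => self.inputs.get(&name).copied().unwrap_or(0),
            Query::Square(name) => {
                let x = self.get(Query::Input(name));
                x * x
            }
            Query::SumOfSquares(a, b) => self.get(Query::Square(a)) + self.get(Query::Square(b)),
        };
        self.stack.pop();
        self.cache.insert(query, value);
        value
    }

    /// Queries that `query` called directly while it was executing.
    pub fn deps_of(&self, query: Query) -> &[Query] {
        self.deps.get(&query).map(|v| v.as_slice()).unwrap_or(&[])
    }
}
```

Nothing is computed until `get` is called, and asking for the same query twice leaves `executions` unchanged. The recorded edges are the raw material that incremental recompilation would need in order to decide what to rerun after a change.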
2

Incremental State

rustc_incremental manages incremental compilation state:
  • Dependency graph persistence
  • Change detection
  • Work product caching
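Change detection is commonly built on fingerprints: hash each input, and redo work only when the hash differs from the one recorded last time. The sketch below applies that idea to a toy "compilation" (counting words). `IncrementalCache` and `word_count` are invented names, and `DefaultHasher` is used only for brevity; its output is not guaranteed to be stable across Rust releases, so a persistent cache would need a stable hash instead.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Hash the input text into a 64-bit fingerprint.
fn fingerprint(text: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    text.hash(&mut hasher);
    hasher.finish()
}

#[derive(Default)]
pub struct IncrementalCache {
    /// Per item: fingerprint of the input last seen, plus the cached result.
    entries: HashMap<String, (u64, usize)>,
    /// Names of items whose result had to be recomputed, in order.
    pub recomputed: Vec<String>,
}

impl IncrementalCache {
    /// "Compile" an item (here: count its words), reusing the previous
    /// result when the input's fingerprint is unchanged.
    pub fn word_count(&mut self, name: &str, source: &str) -> usize {
        let fp = fingerprint(source);
        if let Some(&(old_fp, result)) = self.entries.get(name) {
            if old_fp == fp {
                return result; // unchanged input: reuse the cached work product
            }
        }
        let result = source.split_whitespace().count();
        self.entries.insert(name.to_string(), (fp, result));
        self.recomputed.push(name.to_string());
        result
    }
}
```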

Macros and Expansion

rustc_expand implements macro expansion:
  • Declarative macros (macro_rules!)
  • Procedural macros
  • Built-in macros
  • Derive macros
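For reference, this is what two of these macro kinds look like from the user's side: a recursive declarative macro and a handful of derives. `max_of`, `Point`, and `demo` are names made up for this example.

```rust
/// A declarative macro: rules match token patterns, and the chosen
/// rule's body replaces the invocation during expansion.
macro_rules! max_of {
    // Base case: a single expression is its own maximum.
    ($x:expr) => { $x };
    // Recursive case: compare the head against the maximum of the tail.
    ($x:expr, $($rest:expr),+) => {{
        let a = $x;
        let b = max_of!($($rest),+);
        if a > b { a } else { b }
    }};
}

/// Derives are expanded into ordinary `impl` blocks before type checking.
#[derive(Debug, Clone, PartialEq, Default)]
pub struct Point {
    pub x: i32,
    pub y: i32,
}

pub fn demo() -> (i32, Point) {
    (max_of!(3, 9, 4), Point::default())
}
```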
rustc_builtin_macros implements built-in macros like:
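  • format_args! (the basis of println! and format!, which are library macros that expand to it)
  • assert! (debug_assert! is a library macro layered on top of it)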
  • include!, include_str!
  • cfg!, env!
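A few compiler-provided macros can be exercised directly, as in the sketch below. The function names are illustrative. `format!` and `println!` ultimately expand to `format_args!`, which is why it is called explicitly here; `option_env!` is used instead of `env!` so the example compiles whether or not the variable is set.

```rust
pub fn builtin_macro_demo() -> (String, &'static str, bool) {
    // `assert!` is expanded by the compiler; the check itself runs at run time.
    assert!(1 + 1 == 2, "arithmetic still works");

    // `format_args!` builds a `fmt::Arguments` value without allocating;
    // `std::fmt::format` then renders it into a `String`.
    let msg = std::fmt::format(format_args!("{}-{}", 1, "a"));

    // `concat!` joins literals into a single `&'static str` at compile time.
    let joined = concat!("rust", "c");

    // `cfg!` evaluates a configuration predicate to a boolean literal.
    let consistent = cfg!(target_pointer_width = "64") == (std::mem::size_of::<usize>() == 8);

    (msg, joined, consistent)
}

/// `option_env!` reads an environment variable when the crate is compiled,
/// yielding `None` if it was not set at that time.
pub fn home_at_build_time() -> Option<&'static str> {
    option_env!("HOME")
}
```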

Platform-Specific Crates

rustc_sanitizers integrates with LLVM sanitizers:
  • AddressSanitizer
  • ThreadSanitizer
  • MemorySanitizer
  • Leak Sanitizer
rustc_windows_rc handles Windows resource files.
rustc_baked_icu_data contains embedded ICU data for Unicode operations.

Utility Crates

Filesystem

rustc_fs_util: Filesystem utilities for the compiler

Logging

rustc_log: Logging infrastructure

Graphviz

rustc_graphviz: Graphviz output generation

Macros

rustc_macros: Internal macros for compiler development

Serialization

rustc_serialize: Custom serialization for compiler types

Threading

rustc_thread_pool: Thread pool for parallel compilation

Analysis and Checking

rustc_passes implements various compiler passes:
  • Liveness analysis
  • Stability checking
  • Reachability analysis
  • Entry point detection
rustc_privacy checks privacy rules and visibility.
rustc_lint implements the linting framework:
  • Built-in lints
  • Lint attributes
  • Lint levels
  • Custom lint infrastructure
rustc_lint_defs defines lint declarations.

Pattern Matching

rustc_pattern_analysis implements pattern matching analysis:
  • Exhaustiveness checking
  • Usefulness checking
  • Reachability analysis
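The effect of these checks is visible in everyday `match` expressions. In the sketch below (with made-up `Shape`, `area`, and `classify`), the comments note where the compiler would complain if the arms were changed.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Shape {
    Circle(f64),
    Rect { w: f64, h: f64 },
    Unit,
}

pub fn area(shape: Shape) -> f64 {
    match shape {
        Shape::Circle(r) => std::f64::consts::PI * r * r,
        Shape::Rect { w, h } => w * h,
        Shape::Unit => 1.0,
        // Exhaustiveness: deleting any arm above is rejected with
        // error[E0004] (non-exhaustive patterns).
    }
}

pub fn classify(n: u8) -> &'static str {
    match n {
        0 => "zero",
        1..=9 => "digit",
        10..=u8::MAX => "large",
        // Usefulness: a trailing `_ => ...` arm here would be flagged as
        // unreachable, because the ranges above already cover every `u8`.
    }
}
```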

Type System Transmutation

rustc_transmute implements the analysis behind safe transmutation: it decides whether values of one type can be soundly reinterpreted as values of another type.

Public API

rustc_public and rustc_public_bridge provide stable APIs for external tools. These crates enable stable MIR consumers to interact with the compiler without depending on unstable internals.

Compilation Flow Diagram

[Diagram not reproduced. Its color legend: blue marks AST/HIR/MIR construction, red marks type checking and borrow checking, green marks optimization, and yellow marks code generation.]

Key Takeaways

Modular Design

74+ specialized crates, each with a single responsibility

Three IRs

AST → HIR → MIR pipeline enables different analyses at appropriate levels

Query-Based

Demand-driven compilation with automatic dependency tracking

Backend Agnostic

Abstraction layer supports multiple code generation backends

Further Reading
