Compiler Entry Point
The compiler’s entry point is the rustc_driver_impl crate.
The rustc_driver_impl crate is effectively the “main” function of the Rust compiler. It orchestrates the compilation process and “knits together” the code from other crates, but doesn’t contain the core compiler logic itself.
Compilation Phases
The Rust compiler processes code through several distinct phases:
Parsing
Lexical analysis and syntax parsing transform source code into an Abstract Syntax Tree (AST). The rustc_parse and rustc_ast crates handle this phase.
Expansion
This phase covers macro expansion, name resolution, and HIR lowering. The compiler expands all macros, resolves names to their definitions, and lowers the resulting AST to HIR.
Analysis
Type checking and trait resolution occur on the High-level Intermediate Representation (HIR), handled largely by the rustc_hir_analysis crate. Borrow checking is performed by the rustc_borrowck crate, which operates on MIR once it has been built.
MIR Construction
The HIR is lowered to Mid-level Intermediate Representation (MIR), which is used for optimization and code generation.
Optimization
Various MIR passes optimize the code through techniques like inlining, constant propagation, and dead code elimination.
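To give a feel for what an optimization pass does, here is a minimal sketch of constant folding, a close relative of constant propagation. It works on a toy expression tree rather than real MIR (which is a control-flow graph), and the Expr type and fold function are hypothetical names invented for this illustration, not part of rustc.

```rust
// Toy constant folding on an expression tree.
// Assumption: `Expr` and `fold` are illustrative only; real MIR passes
// operate on a control-flow graph inside rustc_mir_transform.

#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    Const(i64),
    Var(String),
    Add(Box<Expr>, Box<Expr>),
    Mul(Box<Expr>, Box<Expr>),
}

/// Recursively replaces operations on two constants with their result.
/// Sub-expressions that involve variables are rebuilt unchanged.
pub fn fold(expr: Expr) -> Expr {
    match expr {
        Expr::Add(l, r) => match (fold(*l), fold(*r)) {
            (Expr::Const(a), Expr::Const(b)) => Expr::Const(a.wrapping_add(b)),
            (l, r) => Expr::Add(Box::new(l), Box::new(r)),
        },
        Expr::Mul(l, r) => match (fold(*l), fold(*r)) {
            (Expr::Const(a), Expr::Const(b)) => Expr::Const(a.wrapping_mul(b)),
            (l, r) => Expr::Mul(Box::new(l), Box::new(r)),
        },
        // Constants and variables are already in their simplest form.
        other => other,
    }
}
```

Folding x + (2 * 3) with this sketch yields x + 6: the constant subtree is evaluated ahead of time, while the part that depends on a variable is left alone.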
Key Compiler Crates
Frontend Crates
rustc_lexer
Low-level lexical analysis. Tokenizes source code into a stream of tokens.
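As an illustration of what tokenizing means, the sketch below splits a string into identifiers, integer literals, and punctuation. It is a deliberately tiny model: the Token enum and tokenize function are hypothetical and do not reflect the actual rustc_lexer API or its token kinds.

```rust
// Toy tokenizer. Assumption: `Token` and `tokenize` are invented for this
// example and are far simpler than what rustc_lexer provides.

#[derive(Debug, Clone, PartialEq)]
pub enum Token {
    Ident(String),
    Number(String),
    Punct(char),
}

/// Splits `src` into identifiers, integer literals, and single-character
/// punctuation, skipping whitespace.
pub fn tokenize(src: &str) -> Vec<Token> {
    let mut tokens = Vec::new();
    let mut chars = src.chars().peekable();

    while let Some(&c) = chars.peek() {
        if c.is_whitespace() {
            chars.next();
        } else if c.is_alphabetic() || c == '_' {
            // Consume the whole identifier (keywords are not distinguished).
            let mut ident = String::new();
            while let Some(&d) = chars.peek() {
                if d.is_alphanumeric() || d == '_' {
                    ident.push(d);
                    chars.next();
                } else {
                    break;
                }
            }
            tokens.push(Token::Ident(ident));
        } else if c.is_ascii_digit() {
            // Consume a run of decimal digits.
            let mut number = String::new();
            while let Some(&d) = chars.peek() {
                if d.is_ascii_digit() {
                    number.push(d);
                    chars.next();
                } else {
                    break;
                }
            }
            tokens.push(Token::Number(number));
        } else {
            // Anything else becomes a single-character punctuation token.
            tokens.push(Token::Punct(c));
            chars.next();
        }
    }
    tokens
}
```

Calling tokenize on the text let x = 42; produces five tokens: two identifiers, an equals sign, a number, and a semicolon. A later parsing stage would be responsible for giving them structure.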
rustc_parse
The main parser interface. Creates parsers from source files or strings and handles the parsing process.
rustc_ast
Contains the Abstract Syntax Tree definitions, token types, and AST manipulation utilities.
rustc_expand
Handles macro expansion, including both declarative and procedural macros.
Middle-End Crates
rustc_hir
Defines the High-level Intermediate Representation, a desugared and resolved version of the AST.
rustc_middle
Central hub containing type definitions, the type context (TyCtxt), and query infrastructure.
rustc_mir_build
Constructs MIR from HIR. Responsible for lowering high-level constructs to MIR.
rustc_mir_transform
Contains all MIR optimization passes and transformations.
Backend Crates
rustc_codegen_ssa
Backend-agnostic code generation infrastructure. Defines traits and common functionality for all codegen backends.
rustc_codegen_llvm
LLVM backend implementation. Translates MIR to LLVM IR and manages LLVM compilation.
Compiler Callbacks
The compiler provides hooks for external tools to observe and modify compilation.
These callbacks enable tools like Clippy and Miri, as well as custom compiler drivers, to hook into the compilation process at specific points.
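The general shape of such hooks can be modelled with a trait whose methods default to doing nothing. The sketch below is a simplified, self-contained model: the Callbacks trait, the Compilation enum, run_toy_compiler, and LintTool are all written for this example, and their names and signatures should not be read as the real rustc_driver API, which changes between compiler versions.

```rust
// Simplified model of driver callbacks.
// Assumption: every type and signature here is hypothetical and only
// mirrors the idea of hooks that run between compilation phases.

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Compilation {
    Continue,
    Stop,
}

/// Each hook runs after one phase and decides whether to keep going.
/// Default implementations let a tool override only what it needs.
pub trait Callbacks {
    fn after_parsing(&mut self, _source: &str) -> Compilation {
        Compilation::Continue
    }
    fn after_expansion(&mut self, _source: &str) -> Compilation {
        Compilation::Continue
    }
    fn after_analysis(&mut self, _source: &str) -> Compilation {
        Compilation::Continue
    }
}

/// Toy driver: records which phases ran and consults the callbacks
/// between them. No real compilation happens here.
pub fn run_toy_compiler(source: &str, callbacks: &mut dyn Callbacks) -> Vec<&'static str> {
    let mut completed = vec!["parsing"];
    if callbacks.after_parsing(source) == Compilation::Stop {
        return completed;
    }
    completed.push("expansion");
    if callbacks.after_expansion(source) == Compilation::Stop {
        return completed;
    }
    completed.push("analysis");
    if callbacks.after_analysis(source) == Compilation::Stop {
        return completed;
    }
    completed.push("codegen");
    completed
}

/// A lint-style tool: inspects the program after analysis and stops
/// before code generation, since it only needs to report warnings.
pub struct LintTool {
    pub warnings: Vec<String>,
}

impl Callbacks for LintTool {
    fn after_analysis(&mut self, source: &str) -> Compilation {
        if source.contains("unwrap()") {
            self.warnings.push("avoid calling unwrap()".to_string());
        }
        Compilation::Stop
    }
}
```

Passing a LintTool to run_toy_compiler runs parsing, expansion, and analysis, collects any warnings, and skips code generation. Returning a continue-or-stop value from each hook is one simple way to let a tool end compilation early.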
Query System
The compiler uses a demand-driven query system, which also underpins incremental compilation:
- Queries are memoized computations that form the backbone of the compiler’s architecture
- Results are cached and dependencies are tracked for incremental recompilation
- The query system is defined in rustc_middle and implemented in rustc_query_impl
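The memoize-and-invalidate idea behind the points above can be sketched with two toy queries: one that counts the lines of a single input, and one that sums those counts. The QueryCtxt type and its methods are hypothetical and much cruder than the real query system, which tracks dependencies automatically instead of hard-coding them.

```rust
use std::collections::HashMap;

// Toy demand-driven query engine.
// Assumption: `QueryCtxt` and its methods are invented for illustration;
// dependencies are hard-coded here rather than recorded automatically.

pub struct QueryCtxt {
    inputs: HashMap<String, String>,
    line_cache: HashMap<String, usize>,
    total_cache: Option<usize>,
    /// Counts how many times a query body actually executed.
    pub computations: usize,
}

impl QueryCtxt {
    pub fn new() -> Self {
        QueryCtxt {
            inputs: HashMap::new(),
            line_cache: HashMap::new(),
            total_cache: None,
            computations: 0,
        }
    }

    /// Changing an input invalidates only the results that depend on it.
    pub fn set_input(&mut self, name: &str, text: &str) {
        self.inputs.insert(name.to_string(), text.to_string());
        self.line_cache.remove(name);
        self.total_cache = None;
    }

    /// Query: number of lines in one input, memoized per input name.
    pub fn line_count(&mut self, name: &str) -> usize {
        if let Some(&cached) = self.line_cache.get(name) {
            return cached;
        }
        self.computations += 1;
        let count = self.inputs.get(name).map_or(0, |text| text.lines().count());
        self.line_cache.insert(name.to_string(), count);
        count
    }

    /// Query: total lines across all inputs, built on top of `line_count`.
    pub fn total_lines(&mut self) -> usize {
        if let Some(total) = self.total_cache {
            return total;
        }
        self.computations += 1;
        let names: Vec<String> = self.inputs.keys().cloned().collect();
        let mut total = 0;
        for name in &names {
            total += self.line_count(name);
        }
        self.total_cache = Some(total);
        total
    }
}
```

After one input changes, asking for total_lines again recomputes the total and the line count of the changed input only; the count for the untouched input is served from the cache. That reuse is the essence of incremental recompilation.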
Memory Management
The compiler uses arena allocation extensively for performance:
- The rustc_arena crate provides efficient memory allocation
- Most compiler data structures are allocated in arenas and have the same lifetime
- This eliminates the need for individual deallocations and improves cache locality
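A minimal way to see the arena idea in safe Rust is an index-based arena: values are pushed into one contiguous buffer and referred to by a small handle, and everything is freed together when the arena is dropped. This is a sketch under simplifying assumptions. The Arena, Id, and Ty types are hypothetical, and rustc_arena itself hands out references tied to the arena’s lifetime rather than indices.

```rust
// Toy index-based arena.
// Assumption: `Arena`, `Id`, and `Ty` are illustrative names; this is a
// swapped-in, safe-Rust variant of the technique, not rustc_arena's design.

/// Handle to a value stored in an `Arena`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Id(usize);

pub struct Arena<T> {
    items: Vec<T>,
}

impl<T> Arena<T> {
    pub fn new() -> Self {
        Arena { items: Vec::new() }
    }

    /// Stores `value` and returns a handle to it. Values are never freed
    /// individually; they all go away when the arena is dropped.
    pub fn alloc(&mut self, value: T) -> Id {
        let id = Id(self.items.len());
        self.items.push(value);
        id
    }

    pub fn get(&self, id: Id) -> &T {
        &self.items[id.0]
    }

    pub fn len(&self) -> usize {
        self.items.len()
    }
}

/// A tiny type representation whose nodes refer to each other by handle.
#[derive(Debug, PartialEq)]
pub enum Ty {
    Int,
    Bool,
    Ref(Id),
    Tuple(Vec<Id>),
}
```

Because nodes point at each other through handles into the same buffer, related values sit close together in memory and there is no per-node deallocation to manage, which is the trade-off the bullets above describe.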
Error Handling
The rustc_errors crate provides comprehensive error reporting:
- Rich diagnostic messages with source code snippets
- Structured error codes (e.g., E0308 for type mismatches)
- Multiple output formats (JSON, terminal, etc.)
- Suggestions for fixing errors
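The pieces of a diagnostic listed above (an error code, a message, a source location, and an optional suggestion) can be modelled with a small struct and a renderer. The Span and Diagnostic types below are hypothetical stand-ins written for this example; they are not the rustc_errors types, and the output format only loosely imitates the compiler’s terminal output.

```rust
// Toy structured diagnostic.
// Assumption: `Span`, `Diagnostic`, and `render` are invented for this
// sketch and do not match the rustc_errors API.

pub struct Span {
    /// 1-based line number.
    pub line: usize,
    /// 0-based start column (inclusive) and end column (exclusive).
    pub col_start: usize,
    pub col_end: usize,
}

pub struct Diagnostic {
    pub code: String,
    pub message: String,
    pub span: Span,
    pub suggestion: Option<String>,
}

impl Diagnostic {
    /// Renders the diagnostic as plain text with the offending source
    /// line and a caret underline beneath the span.
    pub fn render(&self, source: &str) -> String {
        let line_text = source
            .lines()
            .nth(self.span.line.saturating_sub(1))
            .unwrap_or("");
        let width = self.span.col_end.saturating_sub(self.span.col_start);
        let underline = format!("{}{}", " ".repeat(self.span.col_start), "^".repeat(width));

        let mut out = format!("error[{}]: {}\n", self.code, self.message);
        out.push_str(&format!("{:>3} | {}\n", self.span.line, line_text));
        out.push_str(&format!("    | {}\n", underline));
        if let Some(suggestion) = &self.suggestion {
            out.push_str(&format!("help: {}\n", suggestion));
        }
        out
    }
}
```

Keeping the diagnostic as structured data and rendering it separately is what makes multiple output formats possible: the same struct could be serialized to JSON for tools or printed as text for a terminal.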
Next Steps
Frontend
Learn about parsing and AST construction
MIR
Explore the Mid-level Intermediate Representation
Code Generation
Understand how MIR becomes machine code
Incremental Compilation
See how the compiler optimizes rebuild times