Crate structure
The project is split into several crates, each with a specific responsibility:
mcc
Core compiler library containing the full compilation pipeline
mcc-syntax
Tree-sitter integration and strongly-typed AST nodes
mcc-driver
Command-line interface and orchestration
xtask
Build-time tooling and development utilities
Core compiler (mcc)
The mcc crate implements all compilation logic as separate modules:
Syntax layer (mcc-syntax)
Provides strongly-typed AST nodes generated from the tree-sitter C grammar. This layer is independent of compilation logic and focuses purely on syntax representation.
Driver (mcc-driver)
Orchestrates the compilation pipeline and handles user interaction. The driver exposes a Callbacks trait fired after each stage:
- after_parse
- after_lower
- after_codegen
- after_render_assembly
- after_compile
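As a rough illustration of how such a hook trait can be shaped, here is a minimal sketch. Only the method names come from this page; the `&str` payloads, the `Recorder` type, and the `run_pipeline` function are assumptions made for the example and are not the driver's actual API.

```rust
/// Hypothetical hook trait. Every method has a no-op default, so an
/// implementor only overrides the stages it cares about.
/// The `&str` payloads are placeholders for the real stage outputs.
trait Callbacks {
    fn after_parse(&mut self, _ast: &str) {}
    fn after_lower(&mut self, _tacky: &str) {}
    fn after_codegen(&mut self, _asm: &str) {}
    fn after_render_assembly(&mut self, _text: &str) {}
    fn after_compile(&mut self, _binary: &str) {}
}

/// Example implementor that records which hooks it observed.
#[derive(Default)]
struct Recorder {
    events: Vec<&'static str>,
}

impl Callbacks for Recorder {
    fn after_parse(&mut self, _ast: &str) {
        self.events.push("parse");
    }

    fn after_codegen(&mut self, _asm: &str) {
        self.events.push("codegen");
    }
}

/// Stand-in for the driver: runs fake stages and fires each hook in order.
fn run_pipeline(callbacks: &mut dyn Callbacks) {
    callbacks.after_parse("<ast>");
    callbacks.after_lower("<tacky>");
    callbacks.after_codegen("<asm>");
    callbacks.after_render_assembly("<assembly text>");
    callbacks.after_compile("<binary>");
}
```

Passing a `Recorder` to `run_pipeline` leaves `events` holding only the two stages it overrode. Default methods keep the hooks opt-in, which is one common way to let tools and tests inspect intermediate results without touching the pipeline itself.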
Module boundaries
MCC enforces strict boundaries between layers to maintain clarity and enable incremental compilation.
Syntax/compilation boundary
The mcc-syntax crate provides the AST interface, while mcc contains all compilation logic. This ensures syntax changes don’t require recompiling the entire compiler.
The syntax layer contains no compilation logic. It only provides strongly-typed wrappers around tree-sitter nodes.
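To make "strongly-typed wrappers" concrete, the sketch below wraps an untyped node in a newtype that can only be built when the node kind matches. `RawNode` is a simplified stand-in for a tree-sitter node, and `ReturnStatement`, `NumberLiteral`, `cast`, and `value` are hypothetical names for illustration; the real wrappers in mcc-syntax are generated and may look different.

```rust
/// Simplified stand-in for an untyped syntax node (not the tree-sitter API).
#[derive(Debug)]
struct RawNode {
    kind: &'static str,
    text: &'static str,
    children: Vec<RawNode>,
}

/// Typed view of a node whose kind is "return_statement".
#[derive(Debug, Clone, Copy)]
struct ReturnStatement<'t>(&'t RawNode);

impl<'t> ReturnStatement<'t> {
    /// Succeeds only when the raw node has the expected kind.
    fn cast(node: &'t RawNode) -> Option<Self> {
        (node.kind == "return_statement").then_some(ReturnStatement(node))
    }

    /// Typed accessor: the first numeric literal child, if there is one.
    fn value(&self) -> Option<NumberLiteral<'t>> {
        self.0.children.iter().find_map(NumberLiteral::cast)
    }
}

/// Typed view of a node whose kind is "number_literal".
#[derive(Debug, Clone, Copy)]
struct NumberLiteral<'t>(&'t RawNode);

impl<'t> NumberLiteral<'t> {
    fn cast(node: &'t RawNode) -> Option<Self> {
        (node.kind == "number_literal").then_some(NumberLiteral(node))
    }

    /// Source text of the literal.
    fn text(&self) -> &'t str {
        self.0.text
    }
}
```

The wrappers carry no semantic knowledge: they only check node kinds and expose children, so anything like typechecking stays on the other side of the boundary.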
Pipeline stage boundaries
Each compilation stage is implemented as a separate module with clear input/output contracts:
External tool boundary
The compiler delegates preprocessing, assembly, and linking to external tools (typically the system C compiler):
Key types and abstractions
The compiler uses well-defined types to represent data at each stage:
SourceFile
Represents a source file with path and contents. Created as a Salsa input.
Ast
Wraps the tree-sitter parse tree with strongly-typed accessors from mcc-syntax.
hir::TranslationUnit
High-Level IR produced by typechecking. A simplified, semantically-checked AST.
tacky::Program
Three Address Code (TAC) intermediate representation.
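For readers unfamiliar with three-address code, the sketch below lowers a nested expression into a flat list of instructions, each applying at most one operator and writing to an explicit temporary. The `Expr`, `Val`, `Instr`, and `Lowering` types and the `tmp.N` naming scheme are assumptions for illustration and are not the definitions in the tacky module.

```rust
/// Tiny expression tree standing in for the checked input.
enum Expr {
    Const(i64),
    Neg(Box<Expr>),
    Add(Box<Expr>, Box<Expr>),
}

/// An operand is either a constant or a named temporary.
#[derive(Debug, Clone, PartialEq)]
enum Val {
    Const(i64),
    Tmp(String),
}

/// Each instruction applies at most one operator and writes one destination.
#[derive(Debug, PartialEq)]
enum Instr {
    Neg { src: Val, dst: Val },
    Add { lhs: Val, rhs: Val, dst: Val },
    Return(Val),
}

#[derive(Default)]
struct Lowering {
    instrs: Vec<Instr>,
    next_tmp: usize,
}

impl Lowering {
    /// Allocates a fresh temporary name.
    fn fresh(&mut self) -> Val {
        let name = format!("tmp.{}", self.next_tmp);
        self.next_tmp += 1;
        Val::Tmp(name)
    }

    /// Emits instructions for `expr` and returns the value holding its result.
    fn lower(&mut self, expr: &Expr) -> Val {
        match expr {
            Expr::Const(n) => Val::Const(*n),
            Expr::Neg(inner) => {
                let src = self.lower(inner);
                let dst = self.fresh();
                self.instrs.push(Instr::Neg { src, dst: dst.clone() });
                dst
            }
            Expr::Add(l, r) => {
                let lhs = self.lower(l);
                let rhs = self.lower(r);
                let dst = self.fresh();
                self.instrs.push(Instr::Add { lhs, rhs, dst: dst.clone() });
                dst
            }
        }
    }
}

/// Lowers `return <expr>;` into a flat instruction list.
fn lower_return(expr: &Expr) -> Vec<Instr> {
    let mut cx = Lowering::default();
    let result = cx.lower(expr);
    cx.instrs.push(Instr::Return(result));
    cx.instrs
}
```

Under these assumptions, `return -(1 + 2);` becomes three instructions: an add into `tmp.0`, a negation of `tmp.0` into `tmp.1`, and a return of `tmp.1`. Flattening the tree this way is what makes the later assembly stage a mostly mechanical translation.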
asm::Program
Assembly IR prior to textual rendering. Target-agnostic representation.
Database / Db trait
Salsa database for incremental compilation. All tracked functions take &dyn Db.
Diagnostics
Salsa accumulator for collecting codespan-reporting diagnostics. Stages push diagnostics instead of failing.
Text
Reference-counted string type for efficient memory sharing (wrapper around Arc<str>).
Architectural invariants
These rules are enforced throughout the codebase:
- No compilation stage depends on later stages in the pipeline
- The syntax layer contains no compilation logic
- All compilation stages are implemented as pure functions with Salsa tracking
- Error handling is non-fatal - compilation continues to collect all errors
- The driver crate contains no compilation logic, only orchestration
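Two of these invariants, pure stages and non-fatal error handling, can be sketched without Salsa. In the example below each stage is a plain function from its input to an output plus a list of diagnostics, and the orchestrating function runs every stage and merges what they report. The `Diagnostic` struct, the toy `parse` and `check` stages, and the `compile` function are all hypothetical; the real compiler uses Salsa tracked functions and a Salsa accumulator rather than returned vectors.

```rust
/// Minimal diagnostic record (a stand-in for a codespan-reporting diagnostic).
#[derive(Debug, Clone, PartialEq)]
struct Diagnostic {
    stage: &'static str,
    message: String,
}

/// Stage 1: "parses" whitespace-separated integers.
/// A bad token yields a diagnostic and is skipped; the stage never aborts.
fn parse(source: &str) -> (Vec<i64>, Vec<Diagnostic>) {
    let mut values = Vec::new();
    let mut diags = Vec::new();
    for tok in source.split_whitespace() {
        match tok.parse::<i64>() {
            Ok(n) => values.push(n),
            Err(_) => diags.push(Diagnostic {
                stage: "parse",
                message: format!("invalid integer `{tok}`"),
            }),
        }
    }
    (values, diags)
}

/// Stage 2: depends only on the output of the earlier stage.
/// Negative values are reported and dropped.
fn check(values: &[i64]) -> (Vec<i64>, Vec<Diagnostic>) {
    let mut kept = Vec::new();
    let mut diags = Vec::new();
    for &n in values {
        if n < 0 {
            diags.push(Diagnostic {
                stage: "check",
                message: format!("negative value {n}"),
            });
        } else {
            kept.push(n);
        }
    }
    (kept, diags)
}

/// Orchestration only: runs every stage and merges their diagnostics.
fn compile(source: &str) -> (Vec<i64>, Vec<Diagnostic>) {
    let (parsed, mut diags) = parse(source);
    let (checked, more) = check(&parsed);
    diags.extend(more);
    (checked, diags)
}
```

Calling `compile("1 x -2 3")` reports both the unparseable token and the negative value in a single run, while still producing a result for the valid input. Because each stage depends only on its arguments, the same input always gives the same output, which is the property that lets a framework like Salsa cache and reuse results.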
Error handling boundary
All compilation stages accumulate diagnostics rather than failing immediately:
Target support
The compiler targets x86_64 by default but uses target-lexicon for architecture abstraction:
Dependencies
Key external dependencies:
- Salsa (0.13.2) - Incremental computation framework
- tree-sitter / type-sitter - Parsing with error recovery
- codespan-reporting - Diagnostic rendering
- target-lexicon - Target platform abstraction
- im - Persistent data structures for scopes
See the Incremental compilation page to learn how Salsa enables fast rebuilds.