Overview
This guide explains how to extend the MCC compiler with new features while respecting architectural boundaries and maintaining the project’s principles of simplicity and maintainability.
Understanding the Architecture
Before extending the compiler, understand the core architectural invariants:
- No compilation stage depends on later stages in the pipeline
- The syntax layer (mcc-syntax) contains no compilation logic
- All compilation stages are implemented as pure functions with Salsa tracking
- Error handling is non-fatal: compilation continues after an error so that all errors are collected
- The driver crate contains no compilation logic, only orchestration
Adding New Pipeline Stages
Adding new pipeline stages changes the core architecture and should be done with extreme caution. Always discuss with maintainers first.
When to Add a Stage
Consider adding a new pipeline stage when:
- The transformation is conceptually distinct from existing stages
- The stage produces a new intermediate representation
- The stage benefits from independent incremental compilation
How to Add a Stage
- Define the IR: Create a new module in mcc with the intermediate representation
- Implement as Salsa tracked function: Make the stage incremental
- Update the pipeline: Modify the compilation pipeline to include the new stage
- Add callbacks: Update mcc-driver to expose callbacks for the new stage
- Write tests: Add comprehensive tests for the new stage
// In crates/mcc/src/my_new_stage.rs
#[salsa::tracked]
pub fn my_new_stage(
    db: &dyn crate::Db,
    input: InputType,
) -> OutputType {
    // Stage implementation goes here.
    // Use Diagnostics::push() for error reporting.
    todo!("transform the input into an OutputType")
}
Modifying Existing Stages
Adding Features to a Stage
When adding features to an existing compilation stage:
- Read the stage implementation: Understand current patterns and invariants
- Keep changes minimal: Add only what’s necessary
- Follow existing patterns: Match the style and structure of surrounding code
- Update tests: Add tests for the new feature
Example: Adding a New AST Node Type
- Update the tree-sitter grammar (don’t hand-edit generated files)
- Regenerate the AST: Run the appropriate xtask command
- Update typechecking: Handle the new node in HIR generation
- Update lowering: Transform the HIR to TAC
- Update codegen: Generate assembly for the new construct
- Add tests: Include valid and invalid test cases
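The lowering step in the list above can be sketched with simplified stand-in types. Everything here (HirExpr, TacValue, TacInstr, lower_expr) is hypothetical and far smaller than the real mcc IRs; the point is only that adding a variant to an enum makes every exhaustive match fail to compile until the new node is handled.

```rust
// Hypothetical, simplified stand-ins for the HIR and TAC; the real mcc types differ.
#[derive(Debug, Clone, PartialEq)]
pub enum HirExpr {
    Constant(i64),
    Negate(Box<HirExpr>),
    // The newly added node: bitwise complement (`~x`).
    Complement(Box<HirExpr>),
}

#[derive(Debug, Clone, PartialEq)]
pub enum TacValue {
    Constant(i64),
    Temp(usize),
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum UnaryOp {
    Negate,
    Complement,
}

#[derive(Debug, Clone, PartialEq)]
pub enum TacInstr {
    Unary { op: UnaryOp, src: TacValue, dest: TacValue },
}

/// Lowers an expression, appending instructions to `out` and returning
/// the value that holds the result.
pub fn lower_expr(expr: &HirExpr, out: &mut Vec<TacInstr>) -> TacValue {
    // No wildcard arm: a new HirExpr variant is a compile error here
    // until it is handled.
    match expr {
        HirExpr::Constant(c) => TacValue::Constant(*c),
        HirExpr::Negate(inner) => lower_unary(UnaryOp::Negate, inner, out),
        HirExpr::Complement(inner) => lower_unary(UnaryOp::Complement, inner, out),
    }
}

fn lower_unary(op: UnaryOp, inner: &HirExpr, out: &mut Vec<TacInstr>) -> TacValue {
    let src = lower_expr(inner, out);
    // Every instruction defines exactly one temporary, so the current
    // instruction count is a fresh index.
    let dest = TacValue::Temp(out.len());
    out.push(TacInstr::Unary { op, src, dest: dest.clone() });
    dest
}
```

Avoiding a catch-all arm in such matches is a design choice worth keeping: it turns "forgot to handle the new node in a later stage" into a build failure rather than a runtime surprise.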
Working with Salsa
Tracked Functions
All compilation stages are Salsa tracked functions. This enables incremental compilation:
#[salsa::tracked]
pub fn compile_stage(
    db: &dyn crate::Db,
    input: Input,
) -> Output {
    // Salsa automatically tracks dependencies
    // and caches results
    todo!("compute the Output from the input")
}
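To illustrate why stages need to be pure for caching to be safe, here is a toy memo table in plain Rust. It is not how Salsa is implemented (Salsa also records which inputs each result depended on and invalidates selectively); Memo and count_tokens are hypothetical names used only for this sketch.

```rust
/// Toy memo table: caches the output of a pure "stage" keyed by its input.
/// This is only an illustration of the caching idea, not Salsa's design.
pub struct Memo {
    cache: std::collections::HashMap<String, usize>,
    /// How many times the stage body actually ran.
    pub executions: usize,
}

impl Memo {
    pub fn new() -> Self {
        Memo {
            cache: std::collections::HashMap::new(),
            executions: 0,
        }
    }

    /// A stand-in stage: counts whitespace-separated tokens.
    /// Because the result depends only on `source`, reusing a cached
    /// value can never change what the caller observes.
    pub fn count_tokens(&mut self, source: &str) -> usize {
        if let Some(&cached) = self.cache.get(source) {
            return cached;
        }
        self.executions += 1;
        let result = source.split_whitespace().count();
        self.cache.insert(source.to_string(), result);
        result
    }
}
```

If the stage read hidden state (a global, the file system, the clock), the cached value could silently go stale, which is the kind of invariant the tracking rules protect.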
Accumulators for Diagnostics
Use the Diagnostics accumulator to report errors without stopping compilation:
use crate::diagnostics::Diagnostics;

#[salsa::tracked]
pub fn my_stage(db: &dyn crate::Db, input: Input) -> Output {
    // `some_error` is a placeholder for whatever condition the stage checks.
    if some_error {
        Diagnostics::push(
            db,
            diagnostic::error()
                .with_message("Error message")
                .with_labels(vec![/* ... */]),
        );
    }
    // Continue processing and still return an Output
    todo!()
}
Do not change Salsa tracking or break incremental compilation invariants unless absolutely necessary.
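The report-and-continue pattern can be shown without Salsa. The sketch below uses a plain vector as a stand-in for the accumulator; Diagnostic, Diagnostics, and parse_integers are hypothetical names for illustration, and the real Diagnostics type in mcc has a different API.

```rust
/// Hypothetical stand-in for a diagnostic.
#[derive(Debug, Clone, PartialEq)]
pub struct Diagnostic {
    pub message: String,
}

/// Hypothetical stand-in for the accumulator: a list that stages push into.
#[derive(Debug, Default)]
pub struct Diagnostics(Vec<Diagnostic>);

impl Diagnostics {
    pub fn push(&mut self, message: impl Into<String>) {
        self.0.push(Diagnostic { message: message.into() });
    }

    pub fn messages(&self) -> Vec<&str> {
        self.0.iter().map(|d| d.message.as_str()).collect()
    }
}

/// Parses every token as an integer. Invalid tokens are reported and
/// skipped, so one error does not hide the errors that follow it.
pub fn parse_integers(tokens: &[&str], diags: &mut Diagnostics) -> Vec<i64> {
    let mut values = Vec::new();
    for token in tokens {
        match token.parse::<i64>() {
            Ok(value) => values.push(value),
            Err(_) => diags.push(format!("invalid integer literal `{}`", token)),
        }
    }
    values
}
```

The stage always returns a usable (possibly partial) result, and the caller decides afterwards whether the collected diagnostics should fail the build.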
Extending the CLI
Adding Command-Line Options
Modify mcc-driver to add new CLI options:
// In crates/mcc-driver/src/main.rs or args.rs
#[derive(Debug, clap::Parser)]
pub struct Args {
    // Existing fields...

    /// Description of your new option
    #[arg(long)]
    pub my_new_option: bool,
}
Using Options in the Pipeline
Pass options through the compilation pipeline as needed:
- Store options in the Salsa database
- Access them in tracked functions
- Adjust behavior based on the options
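As a rough sketch of the three steps above, the following uses a plain struct in place of the Salsa database. Options, Database, and lower are hypothetical; in the real pipeline the options would presumably live in a Salsa input so that changing a flag invalidates only the stages that read it.

```rust
/// Hypothetical options, mirroring flags parsed by the driver.
#[derive(Debug, Clone, Copy, Default, PartialEq)]
pub struct Options {
    pub fold_constants: bool,
}

/// Hypothetical stand-in for the database that stages receive.
pub struct Database {
    pub options: Options,
}

/// A toy stage over a toy IR: a list of operands to be summed at run time.
/// With folding enabled, the sum is computed at compile time instead.
pub fn lower(db: &Database, operands: &[i64]) -> Vec<i64> {
    if db.options.fold_constants {
        vec![operands.iter().sum()]
    } else {
        operands.to_vec()
    }
}
```

Reading the option through the database argument, rather than a global, keeps the stage a pure function of its inputs.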
Adding Optimization Passes
Optimization passes typically operate on the TAC IR:
- Create a new module in the appropriate stage (often in lowering or between stages)
- Implement as a transformation: Take IR as input, return modified IR
- Make it optional: Use CLI flags to enable/disable
- Add tests: Include performance benchmarks if relevant
pub fn optimize_tac(program: tacky::Program) -> tacky::Program {
    // Transformation logic goes here.
    // Examples: constant folding, dead code elimination,
    // copy propagation
    program // placeholder: returns the program unchanged
}
Adding New Target Architectures
Current Architecture
The compiler currently targets x86_64. To add a new architecture:
- Create target-specific codegen: Implement code generation for the new target
- Create target-specific renderer: Handle OS-specific conventions
- Update target detection: Use target-lexicon for target selection
- Keep IRs target-agnostic: Don’t modify TAC or HIR
Architecture-Specific Code
- Codegen: crates/mcc/src/codegen/ - Assembly IR generation
- Render: crates/mcc/src/render/ - OS and architecture-specific assembly text
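As an example of the OS-specific conventions a renderer has to handle: macOS prefixes C symbol names with an underscore, while Linux does not, and Linux assemblers conventionally expect a .note.GNU-stack section to mark the stack as non-executable. The sketch below uses a hypothetical Os enum and helper functions; the real renderer would take its target from target-lexicon rather than a local enum.

```rust
/// Hypothetical stand-in for the operating system part of a target.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Os {
    Linux,
    MacOs,
}

/// Applies the platform's symbol naming convention.
pub fn symbol_name(os: Os, name: &str) -> String {
    match os {
        // Mach-O prefixes C symbols with an underscore.
        Os::MacOs => format!("_{}", name),
        Os::Linux => name.to_string(),
    }
}

/// Renders one function as AT&T-syntax assembly text.
/// `body` holds already-formatted instructions, one per entry.
pub fn render_function(os: Os, name: &str, body: &[&str]) -> String {
    let symbol = symbol_name(os, name);
    let mut out = String::new();
    out.push_str(&format!("\t.globl {}\n", symbol));
    out.push_str(&format!("{}:\n", symbol));
    for line in body {
        out.push_str(&format!("\t{}\n", line));
    }
    if os == Os::Linux {
        // Marks the stack as non-executable on Linux.
        out.push_str("\t.section .note.GNU-stack,\"\",@progbits\n");
    }
    out
}
```

Keeping these differences in the render layer means the assembly IR produced by codegen can stay identical across operating systems for the same architecture.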
Testing Your Extensions
Unit Tests
Add unit tests in the relevant module:
#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_my_feature() {
        // Test implementation
    }
}
Integration Tests
Add test cases to the writing-a-c-compiler-tests suite or create custom integration tests:
# Run integration tests
cargo test -p integration-tests --test integration
# Test specific chapters
cargo test -p integration-tests --test integration -- chapter_1
Snapshot Tests
Use insta for snapshot testing of IR representations:
#[test]
fn test_ir_generation() {
    let output = generate_ir(input);
    insta::assert_debug_snapshot!(output);
}
Best Practices for Extensions
Do
- Read existing code to understand patterns
- Keep changes small and focused
- Write tests for new features
- Document public APIs
- Use descriptive names
- Follow Rust idioms
Don’t
- Add unnecessary abstractions
- Duplicate code across files
- Break architectural boundaries without discussion
- Skip verification commands
- Hand-edit generated files
- Introduce new dependencies without justification
Common Extension Patterns
Adding a New Language Feature
- Update tree-sitter grammar
- Regenerate AST
- Handle in typechecking (HIR)
- Handle in lowering (TAC)
- Handle in codegen (assembly)
- Add tests (valid and invalid cases)
Adding a New Diagnostic
- Identify the compilation stage where the error occurs
- Use Diagnostics::push() to report the error
- Provide clear error messages and source spans
- Add tests for the diagnostic
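A small sketch of the last two steps, a diagnostic that carries a precise source span and is easy to assert on in tests. The Diagnostic struct and check_characters function are hypothetical and written in plain Rust; the real diagnostics in mcc go through Diagnostics::push() and use the project's own label types.

```rust
/// Hypothetical diagnostic carrying a byte range into the source text.
#[derive(Debug, Clone, PartialEq)]
pub struct Diagnostic {
    pub message: String,
    pub span: std::ops::Range<usize>,
}

/// Reports every character that cannot appear in this toy language.
/// Each diagnostic names the offending character and points at exactly
/// the bytes it occupies, so an editor or terminal can underline it.
pub fn check_characters(source: &str) -> Vec<Diagnostic> {
    source
        .char_indices()
        .filter(|&(_, c)| matches!(c, '@' | '$'))
        .map(|(start, c)| Diagnostic {
            message: format!("unexpected character `{}`", c),
            span: start..start + c.len_utf8(),
        })
        .collect()
}
```

A test for a diagnostic would typically cover both directions: invalid input produces the expected message and span, and valid input produces nothing. For instance, checking the source text "int x = 1 @ 2;" should yield one diagnostic whose span is 10..11, while "int x = 1;" should yield none.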
Adding a New Command-Line Flag
- Add to Args struct in mcc-driver
- Pass through to relevant compilation stages
- Document the flag’s behavior
- Add integration tests with the flag enabled/disabled
Getting Help
When in doubt:
- Read ARCHITECTURE.md for architectural guidance
- Search the codebase for similar patterns
- Check existing tests for examples
- Discuss with maintainers before major changes
Example: Adding Constant Folding
Here’s a sketch of adding a simple optimization. It assumes a map_instructions helper on Program and an evaluate function, neither of which is shown here:
// In crates/mcc/src/optimizations/constant_folding.rs
use crate::tacky::{Program, Instruction, Value};

pub fn fold_constants(program: Program) -> Program {
    // Walk the TAC and fold constant expressions
    // Example: Replace "x = 2 + 3" with "x = 5"
    program.map_instructions(|instr| match instr {
        // Matching the constants in the pattern avoids moving fields
        // out of `instr` and then trying to reuse it.
        Instruction::Binary {
            op,
            left: Value::Constant(l),
            right: Value::Constant(r),
            dest,
        } => {
            // Compute at compile time
            let result = evaluate(op, l, r);
            Instruction::Copy {
                src: Value::Constant(result),
                dest,
            }
        }
        // Anything else is left untouched
        other => other,
    })
}
Then integrate it into the pipeline and add tests.
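For experimenting outside the compiler, here is a self-contained variant of the same idea. The BinaryOp, Value, and Instruction types are simplified, hypothetical stand-ins for the real tacky types, and evaluate is one possible implementation: it declines to fold when the operation would overflow or divide by zero, leaving the original instruction in place so that behaviour is unchanged.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum BinaryOp {
    Add,
    Subtract,
    Multiply,
    Divide,
}

#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Constant(i32),
    Var(String),
}

#[derive(Debug, Clone, PartialEq)]
pub enum Instruction {
    Binary { op: BinaryOp, left: Value, right: Value, dest: Value },
    Copy { src: Value, dest: Value },
}

/// Evaluates `l op r`, returning None when the result is not
/// representable (overflow) or not defined (division by zero).
fn evaluate(op: BinaryOp, l: i32, r: i32) -> Option<i32> {
    match op {
        BinaryOp::Add => l.checked_add(r),
        BinaryOp::Subtract => l.checked_sub(r),
        BinaryOp::Multiply => l.checked_mul(r),
        BinaryOp::Divide => l.checked_div(r),
    }
}

/// Replaces binary operations on two constants with a copy of the result.
pub fn fold_constants(instructions: Vec<Instruction>) -> Vec<Instruction> {
    instructions
        .into_iter()
        .map(|instr| match instr {
            Instruction::Binary {
                op,
                left: Value::Constant(l),
                right: Value::Constant(r),
                dest,
            } => match evaluate(op, l, r) {
                Some(result) => Instruction::Copy { src: Value::Constant(result), dest },
                // Not safe to fold: rebuild the original instruction.
                None => Instruction::Binary {
                    op,
                    left: Value::Constant(l),
                    right: Value::Constant(r),
                    dest,
                },
            },
            other => other,
        })
        .collect()
}
```

Tests for a pass like this are most useful when they cover three cases: an instruction that folds, one that must not fold (such as division by zero), and one with a non-constant operand that passes through unchanged.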