Extending the Compiler

Overview

This guide explains how to extend the MCC compiler with new features while respecting architectural boundaries and maintaining the project’s principles of simplicity and maintainability.

Understanding the Architecture

Before extending the compiler, understand the core architectural invariants:

No compilation stage depends on later stages in the pipeline
The syntax layer (mcc-syntax) contains no compilation logic
All compilation stages are implemented as pure functions with Salsa tracking
Error handling is non-fatal: compilation continues after an error so that all diagnostics are collected
The driver crate contains no compilation logic, only orchestration

Adding New Pipeline Stages

Adding new pipeline stages changes the core architecture and should be done with extreme caution. Always discuss with maintainers first.

When to Add a Stage

Consider adding a new pipeline stage when:

The transformation is conceptually distinct from existing stages
The stage produces a new intermediate representation
The stage benefits from independent incremental compilation

How to Add a Stage

Define the IR: Create a new module in mcc with the intermediate representation
Implement as Salsa tracked function: Make the stage incremental
Update the pipeline: Modify the compilation pipeline to include the new stage
Add callbacks: Update mcc-driver to expose callbacks for the new stage
Write tests: Add comprehensive tests for the new stage

// In crates/mcc/src/my_new_stage.rs

#[salsa::tracked]
pub fn my_new_stage(
    db: &dyn crate::Db,
    input: InputType,
) -> OutputType {
    // Stage implementation goes here.
    // Report errors with Diagnostics::push() instead of aborting the stage.
    todo!("implement the stage")
}

Modifying Existing Stages

Adding Features to a Stage

When adding features to an existing compilation stage:

Read the stage implementation: Understand current patterns and invariants
Keep changes minimal: Add only what’s necessary
Follow existing patterns: Match the style and structure of surrounding code
Update tests: Add tests for the new feature

Example: Adding a New AST Node Type

Update the tree-sitter grammar (don’t hand-edit generated files)
Regenerate the AST: Run the appropriate xtask command
Update typechecking: Handle the new node in HIR generation
Update lowering: Transform the HIR to TAC
Update codegen: Generate assembly for the new construct
Add tests: Include valid and invalid test cases
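
The lowering step above can be sketched with toy types. Everything in this block is an assumption made for illustration: Expr, Value, Instruction and lower_expr are hypothetical stand-ins, not the project's real HIR or TAC definitions, and Negate plays the role of the newly added node.

```rust
// Toy stand-ins for the HIR and TAC types. All names here are hypothetical.
#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    Constant(i64),
    // The newly added node: arithmetic negation, e.g. `-x`.
    Negate(Box<Expr>),
}

#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Constant(i64),
    Temp(usize),
}

#[derive(Debug, Clone, PartialEq)]
pub enum Instruction {
    Negate { src: Value, dest: Value },
}

/// Lowers an expression, appending instructions to `out` and returning
/// the value that holds the result.
pub fn lower_expr(expr: &Expr, out: &mut Vec<Instruction>) -> Value {
    match expr {
        Expr::Constant(c) => Value::Constant(*c),
        Expr::Negate(inner) => {
            // Lower the operand first so its instructions come before ours.
            let src = lower_expr(inner, out);
            // Each instruction defines exactly one temporary in this toy IR,
            // so the instruction index doubles as a unique temporary id.
            let dest = Value::Temp(out.len());
            out.push(Instruction::Negate { src, dest: dest.clone() });
            dest
        }
    }
}
```

Because the match is exhaustive, adding a variant to Expr makes the compiler flag every later stage that has not yet handled it, which is a convenient checklist when working through the steps in order.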

Working with Salsa

Tracked Functions

All compilation stages are Salsa tracked functions. This enables incremental compilation:

#[salsa::tracked]
pub fn compile_stage(
    db: &dyn crate::Db,
    input: Input,
) -> Output {
    // Salsa records what this call reads and reuses the cached
    // result until one of those dependencies changes.
    todo!("stage body")
}
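
To make the caching behavior concrete, here is a toy model using only the standard library. It is not Salsa's API and does not track dependencies between functions; Memo, token_count and the executions counter are hypothetical names that only illustrate why a pure stage can be skipped when its input has not changed.

```rust
use std::collections::HashMap;

/// A toy memo table. This is NOT Salsa's API; it only models result caching.
pub struct Memo {
    cache: HashMap<String, usize>,
    /// Number of times the underlying computation actually ran.
    pub executions: usize,
}

impl Memo {
    pub fn new() -> Self {
        Memo { cache: HashMap::new(), executions: 0 }
    }

    /// A stand-in "stage": counts whitespace-separated tokens in the source.
    pub fn token_count(&mut self, source: &str) -> usize {
        if let Some(&cached) = self.cache.get(source) {
            // Input unchanged: reuse the earlier result.
            return cached;
        }
        self.executions += 1;
        let count = source.split_whitespace().count();
        self.cache.insert(source.to_string(), count);
        count
    }
}
```

The reuse is only sound because the stage is a pure function of its input, which is why the architecture requires stages to be pure.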

Accumulators for Diagnostics

Use the Diagnostics accumulator to report errors without stopping compilation:

use crate::diagnostics::Diagnostics;

#[salsa::tracked]
pub fn my_stage(db: &dyn crate::Db, input: Input) -> Output {
    if some_error {
        Diagnostics::push(
            db,
            diagnostic::error()
                .with_message("Error message")
                .with_labels(vec![/* ... */])
        );
    }
    // Continue processing
}

Do not change Salsa tracking or break incremental compilation invariants unless absolutely necessary.
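
The idea behind accumulating diagnostics can be sketched without Salsa. In this minimal example a plain Vec stands in for the accumulator, and Diagnostic and check_duplicates are hypothetical names; the point is only that the stage records each problem and keeps going.

```rust
/// Hypothetical diagnostic type; the real project uses its own.
#[derive(Debug, Clone, PartialEq)]
pub struct Diagnostic {
    pub message: String,
}

/// Checks every declared name and reports all duplicates instead of
/// stopping at the first one. Returns the names in first-seen order.
pub fn check_duplicates(names: &[&str], diagnostics: &mut Vec<Diagnostic>) -> Vec<String> {
    let mut seen: Vec<String> = Vec::new();
    for &name in names {
        if seen.iter().any(|s| s.as_str() == name) {
            // Record the problem and continue with the next name.
            diagnostics.push(Diagnostic {
                message: format!("duplicate definition of `{}`", name),
            });
            continue;
        }
        seen.push(name.to_string());
    }
    seen
}
```

A single run over x, y, x, y, z yields two diagnostics and still returns a usable result, so later stages can report their own errors in the same compilation.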

Extending the CLI

Adding Command-Line Options

Modify mcc-driver to add new CLI options:

// In crates/mcc-driver/src/main.rs or args.rs

#[derive(Debug, clap::Parser)]
pub struct Args {
    // Existing fields...
    
    /// Description of your new option
    #[arg(long)]
    pub my_new_option: bool,
}

Using Options in the Pipeline

Pass options through the compilation pipeline as needed:

Store options in the Salsa database
Access them in tracked functions
Adjust behavior based on the options
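
One way to picture the steps above is a small options value passed explicitly into a stage. This is a sketch under assumptions: CompileOptions, strip_nops, Instr and finalize are hypothetical, and the real pipeline would read the options from the Salsa database rather than from a function parameter.

```rust
/// Hypothetical options record; the field name is illustrative only.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub struct CompileOptions {
    pub strip_nops: bool,
}

/// Toy instruction set for the example.
#[derive(Debug, Clone, PartialEq)]
pub enum Instr {
    Nop,
    Ret(i64),
}

/// A stage whose behavior depends on the options it is given.
pub fn finalize(program: Vec<Instr>, options: &CompileOptions) -> Vec<Instr> {
    if options.strip_nops {
        program.into_iter().filter(|i| *i != Instr::Nop).collect()
    } else {
        program
    }
}
```

Treating options as an explicit input matters for caching: if a stage read a flag from global state, a cache keyed only on the source could return a result computed under different flags.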

Adding Optimization Passes

Optimization passes typically operate on the TAC IR:

Create a new module in the appropriate stage (often in lowering or between stages)
Implement as a transformation: Take IR as input, return modified IR
Make it optional: Use CLI flags to enable/disable
Add tests: Include performance benchmarks if relevant

pub fn optimize_tac(program: tacky::Program) -> tacky::Program {
    // Transformation logic goes here.
    // Examples: constant folding, dead code elimination,
    // copy propagation.
    program // placeholder: returns the input unchanged
}
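
As a concrete instance of "take IR as input, return modified IR", here is a minimal sketch of unreachable code elimination over a toy instruction list. The Instruction enum below is a hypothetical stand-in for the real TAC types, and the pass is deliberately conservative: it treats every label as a possible jump target.

```rust
/// Toy TAC instructions; a hypothetical stand-in for the real IR.
#[derive(Debug, Clone, PartialEq)]
pub enum Instruction {
    Copy { src: i64, dest: String },
    Jump(String),
    Label(String),
    Return(i64),
}

/// Removes instructions that can never execute: anything following an
/// unconditional `Jump` or `Return`, up to the next `Label`.
pub fn remove_unreachable(instructions: Vec<Instruction>) -> Vec<Instruction> {
    let mut reachable = true;
    let mut out = Vec::with_capacity(instructions.len());
    for instr in instructions {
        // A label may be a jump target, so control flow can re-enter here.
        if let Instruction::Label(_) = instr {
            reachable = true;
        }
        if !reachable {
            continue;
        }
        // After an unconditional transfer nothing runs until the next label.
        if matches!(instr, Instruction::Jump(_) | Instruction::Return(_)) {
            reachable = false;
        }
        out.push(instr);
    }
    out
}
```

Because the pass is a plain function from IR to IR, it can be enabled or disabled by a flag and tested in isolation with small hand-written programs.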

Adding New Target Architectures

Current Architecture

The compiler currently targets x86_64. To add a new architecture:

Create target-specific codegen: Implement code generation for the new target
Create target-specific renderer: Handle OS-specific conventions
Update target detection: Use target-lexicon for target selection
Keep IRs target-agnostic: Don’t modify TAC or HIR

Architecture-Specific Code

Codegen: crates/mcc/src/codegen/ - Assembly IR generation
Render: crates/mcc/src/render/ - OS and architecture-specific assembly text
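
To illustrate what "OS-specific conventions" means for the render step, here is a minimal sketch. The Os enum and both functions are hypothetical and do not reflect the project's actual renderer; the real code selects targets through target-lexicon. The two conventions shown are standard: Mach-O toolchains on macOS prefix C symbols with an underscore, and GNU toolchains on Linux expect a .note.GNU-stack section to mark the stack non-executable.

```rust
/// Hypothetical target enum used only for this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Os {
    Linux,
    MacOs,
}

/// Renders a global function label using the target's symbol convention.
pub fn render_function_header(os: Os, name: &str) -> String {
    // macOS (Mach-O) prefixes C symbols with an underscore; Linux (ELF) does not.
    let symbol = match os {
        Os::Linux => name.to_string(),
        Os::MacOs => format!("_{}", name),
    };
    format!("    .globl {}\n{}:\n", symbol, symbol)
}

/// Renders the trailer the target expects at the end of the assembly file.
pub fn render_footer(os: Os) -> &'static str {
    match os {
        // Marks the stack as non-executable for the GNU toolchain.
        Os::Linux => "    .section .note.GNU-stack,\"\",@progbits\n",
        Os::MacOs => "",
    }
}
```

Keeping these differences in the renderer means codegen can emit the same assembly IR for every OS on a given architecture.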

Testing Your Extensions

Unit Tests

Add unit tests in the relevant module:

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_my_feature() {
        // Test implementation
    }
}

Integration Tests

Add test cases to the writing-a-c-compiler-tests suite or create custom integration tests:

# Run integration tests
cargo test -p integration-tests --test integration

# Test specific chapters
cargo test -p integration-tests --test integration -- chapter_1

Snapshot Tests

Use insta for snapshot testing of IR representations:

#[test]
fn test_ir_generation() {
    let output = generate_ir(input);
    insta::assert_debug_snapshot!(output);
}

Best Practices for Extensions

Do

Read existing code to understand patterns
Keep changes small and focused
Write tests for new features
Document public APIs
Use descriptive names
Follow Rust idioms

Don’t

Add unnecessary abstractions
Duplicate code across files
Break architectural boundaries without discussion
Skip verification commands
Hand-edit generated files
Introduce new dependencies without justification

Common Extension Patterns

Adding a New Language Feature

Update tree-sitter grammar
Regenerate AST
Handle in typechecking (HIR)
Handle in lowering (TAC)
Handle in codegen (assembly)
Add tests (valid and invalid cases)

Adding a New Diagnostic

Identify the compilation stage where the error occurs
Use Diagnostics::push() to report the error
Provide clear error messages and source spans
Add tests for the diagnostic
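
A source span is usually stored as a byte range and turned into a line and column only when the message is displayed. The sketch below shows that conversion with hypothetical names (Span, line_col, render); the project's own diagnostic types and rendering are not shown in this guide and may look quite different.

```rust
/// Hypothetical span: a half-open byte range into the source text.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Span {
    pub start: usize,
    pub end: usize,
}

/// Converts a byte offset into a 1-based (line, column) pair.
/// Columns count bytes, and `offset` must lie on a character boundary
/// within `source`, otherwise the slice below panics.
pub fn line_col(source: &str, offset: usize) -> (usize, usize) {
    let before = &source[..offset];
    let line = before.matches('\n').count() + 1;
    let column = match before.rfind('\n') {
        // Distance from the last newline gives a 1-based column.
        Some(newline) => offset - newline,
        None => offset + 1,
    };
    (line, column)
}

/// Formats a one-line error that names the location and the offending text.
pub fn render(source: &str, span: Span, message: &str) -> String {
    let (line, column) = line_col(source, span.start);
    let snippet = &source[span.start..span.end];
    format!("error at {}:{}: {} (`{}`)", line, column, message, snippet)
}
```

A test for a new diagnostic can then assert on both the message and the reported location, which catches spans that point at the wrong token.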

Adding a New Command-Line Flag

Add to Args struct in mcc-driver
Pass through to relevant compilation stages
Document the flag’s behavior
Add integration tests with the flag enabled/disabled

Getting Help

When in doubt:

Read ARCHITECTURE.md for architectural guidance
Search the codebase for similar patterns
Check existing tests for examples
Discuss with maintainers before major changes

Example: Adding Constant Folding

Here’s a sketch of adding a simple optimization. It assumes a map_instructions method on Program and an evaluate helper that performs the arithmetic:

// In crates/mcc/src/optimizations/constant_folding.rs

use crate::tacky::{Program, Instruction, Value};

pub fn fold_constants(program: Program) -> Program {
    // Walk the TAC and fold constant expressions.
    // Example: replace "x = 2 + 3" with "x = 5".
    program.map_instructions(|instr| match instr {
        // Both operands are known at compile time. Matching the constants
        // in the pattern avoids using `instr` after it has been moved.
        // Note: `evaluate` must cope with cases such as division by zero.
        Instruction::Binary {
            op,
            left: Value::Constant(l),
            right: Value::Constant(r),
            dest,
        } => Instruction::Copy {
            src: Value::Constant(evaluate(op, l, r)),
            dest,
        },
        // Everything else passes through unchanged.
        other => other,
    })
}

Then integrate it into the pipeline and add tests.
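
The evaluate helper is not defined in this guide. One possible shape is sketched below, under stated assumptions: BinaryOp is a hypothetical operator enum, constants are taken to be 32-bit signed integers, and the function returns an Option so the caller can leave the instruction unfolded when the result is not well defined. That differs from the call in the example above, which uses the result directly.

```rust
/// Hypothetical operator enum; the real TAC operator type may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BinaryOp {
    Add,
    Subtract,
    Multiply,
    Divide,
    Remainder,
}

/// Evaluates `left op right` at compile time, or returns `None` when the
/// result is not well defined and the instruction should be left alone.
pub fn evaluate(op: BinaryOp, left: i32, right: i32) -> Option<i32> {
    match op {
        // The checked_* methods return `None` on signed overflow.
        BinaryOp::Add => left.checked_add(right),
        BinaryOp::Subtract => left.checked_sub(right),
        BinaryOp::Multiply => left.checked_mul(right),
        // Division and remainder also return `None` for a zero divisor.
        BinaryOp::Divide => left.checked_div(right),
        BinaryOp::Remainder => left.checked_rem(right),
    }
}
```

Rust's integer division truncates toward zero, which matches C's rule for signed division, so folded results agree with what the generated code would compute at run time.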
