The mcc crate provides a library API for embedding the C compiler in tools, tests, and other applications. It exposes a pipeline-based architecture with incremental recomputation powered by Salsa.

Quick Start

Parse, typecheck, lower, generate, and render a C program to assembly text:
use mcc::{Database, SourceFile, Text};

let db = Database::default();
let src = "int main(void) { return 0; }";
let file = SourceFile::new(&db, Text::from("main.c"), Text::from(src));

// Parse → typecheck → TAC → ASM IR → assembly text
let ast = mcc::parse(&db, file);
let _hir = mcc::typechecking::typecheck(&db, file);
let tacky = mcc::lowering::lower_program(&db, file);
let asm_ir = mcc::codegen::generate_assembly(&db, tacky);
let asm_text = mcc::render_program(&db, asm_ir, mcc::default_target()).unwrap();

assert!(asm_text.as_str().contains("main"));

Compilation Pipeline

The compiler follows a classic multi-stage pipeline:
1. Preprocessing: preprocess invokes the system C preprocessor (cc -E -P)
2. Parsing: parse produces an abstract syntax tree (Ast)
3. Typechecking: typechecking::typecheck validates types and builds HIR
4. Lowering: lowering::lower_program converts AST to three-address code (TAC)
5. Code Generation: codegen::generate_assembly produces assembly IR
6. Rendering: render_program converts assembly IR to text
7. Assembling & Linking: assemble_and_link invokes the system toolchain

Core Types

Database

pub struct Database {
    storage: salsa::Storage<Self>,
}
The incremental computation database. Create with Database::default() and pass by reference to all pipeline functions.
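
To make "incremental recomputation" concrete, here is a minimal, std-only sketch of the idea: a derived value is memoized against the revision of its input and recomputed only after the input changes. This is a toy illustration, not Salsa's algorithm or API; `IncrementalFile`, `set_contents`, `token_count`, and `recomputations` are all hypothetical names.

```rust
/// Toy input-plus-derived-value pair (hypothetical; not Salsa's data model).
pub struct IncrementalFile {
    contents: String,
    revision: u64,                // bumped on every change to `contents`
    cached: Option<(u64, usize)>, // (revision computed at, token count)
    pub recomputations: u32,      // how many times the query body actually ran
}

impl IncrementalFile {
    pub fn new(contents: &str) -> Self {
        Self {
            contents: contents.to_string(),
            revision: 0,
            cached: None,
            recomputations: 0,
        }
    }

    /// Changing the input invalidates results derived from older revisions.
    pub fn set_contents(&mut self, contents: &str) {
        self.contents = contents.to_string();
        self.revision += 1;
    }

    /// Derived query: whitespace-separated token count, memoized per revision.
    pub fn token_count(&mut self) -> usize {
        if let Some((rev, count)) = self.cached {
            if rev == self.revision {
                return count; // input unchanged: reuse the cached result
            }
        }
        let count = self.contents.split_whitespace().count();
        self.cached = Some((self.revision, count));
        self.recomputations += 1;
        count
    }
}
```

Calling `token_count` twice without an intervening `set_contents` runs the computation once. The pipeline functions on this page follow the same shape: they take the database by reference so that results can be reused across calls.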

SourceFile

#[salsa::input]
pub struct SourceFile {
    pub path: Text,
    pub contents: Text,
}
Represents a source file. Create with:
let file = SourceFile::new(&db, Text::from("main.c"), Text::from(src));

Ast

#[salsa::tracked]
pub struct Ast<'db> {
    pub tree: Tree,
}
The abstract syntax tree produced by parsing. Access the root node:
let ast = mcc::parse(&db, file);
let root = ast.root(&db); // ast::TranslationUnit<'_>

Text

pub struct Text(Arc<str>);
A reference-counted string type. Converts from &str, String, and Cow<'_, str>.
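
As a rough sketch of how such a type can be built with the standard library alone, the stand-in below wraps `Arc<str>` and implements the three conversions listed above. It mirrors the declaration shown here but is not the crate's actual source. Of the names used, only `as_str` appears on this page (in the Quick Start), and the derives are assumptions.

```rust
use std::borrow::Cow;
use std::sync::Arc;

/// Stand-in for `mcc::Text`: an immutable string that is cheap to clone
/// because clones share one heap allocation.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub struct Text(Arc<str>);

impl Text {
    /// Borrows the underlying string slice.
    pub fn as_str(&self) -> &str {
        &self.0
    }
}

impl From<&str> for Text {
    fn from(s: &str) -> Self {
        Text(Arc::from(s))
    }
}

impl From<String> for Text {
    fn from(s: String) -> Self {
        Text(Arc::from(s))
    }
}

impl From<Cow<'_, str>> for Text {
    fn from(s: Cow<'_, str>) -> Self {
        Text(Arc::from(s))
    }
}
```

Because `clone()` only bumps a reference count, a value like this can be stored in many places (inputs, diagnostics, rendered output) without copying the string data.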

Capturing Diagnostics

Each stage accumulates diagnostics rather than panicking. Retrieve them using the stage’s accumulated helper:
use mcc::{Database, SourceFile, Text, diagnostics::Diagnostics};

let db = Database::default();
let file = SourceFile::new(&db, "test.c".into(), "int main(void) {}".into());
let _ = mcc::parse(&db, file);
let diags: Vec<&Diagnostics> = mcc::parse::accumulated::<Diagnostics>(&db, file);

// Render with codespan-reporting using `mcc::Files`
use codespan_reporting::term;
use codespan_reporting::term::termcolor::{ColorChoice, StandardStream};

let mut files = mcc::Files::new();
files.add(&db, file);

let writer = StandardStream::stderr(ColorChoice::Auto);
let config = term::Config::default();

for diag in diags {
    term::emit(&mut writer.lock(), &config, &files, &**diag).unwrap();
}
See Diagnostics for details.
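
The "accumulate rather than panic" pattern can be sketched without any compiler machinery. In the toy stage below, problems are pushed into a caller-supplied list and the stage still returns a result, so later stages can keep going. `Diagnostic` and `check_braces` are hypothetical names for illustration and are not part of mcc.

```rust
/// Hypothetical diagnostic record: a byte offset plus a message.
#[derive(Debug, PartialEq)]
pub struct Diagnostic {
    pub offset: usize,
    pub message: String,
}

/// Toy stage: checks brace balance, recording problems instead of panicking.
/// Always returns the number of matched `{ }` pairs.
pub fn check_braces(src: &str, diags: &mut Vec<Diagnostic>) -> usize {
    let mut open: Vec<usize> = Vec::new(); // offsets of '{' not yet closed
    let mut pairs = 0;
    for (offset, ch) in src.char_indices() {
        match ch {
            '{' => open.push(offset),
            '}' => {
                if open.pop().is_some() {
                    pairs += 1;
                } else {
                    diags.push(Diagnostic {
                        offset,
                        message: "unmatched `}`".into(),
                    });
                }
            }
            _ => {}
        }
    }
    // Anything left on the stack was opened but never closed.
    for offset in open {
        diags.push(Diagnostic {
            offset,
            message: "unclosed `{`".into(),
        });
    }
    pairs
}
```

A caller decides what to do with the list afterwards: print everything, fail on the first error, or ignore warnings.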

Target Configuration

Rendering assembly requires a target triple:
use target_lexicon::Triple;

let target = mcc::default_target(); // x86_64 for current host OS
let asm_text = mcc::render_program(&db, asm_ir, target)?;

Platform-Specific Behavior

  • macOS: Symbol names are rendered with a leading underscore (e.g., _main)
  • Linux: A .note.GNU-stack section is emitted to mark the stack as non-executable
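
The two rules above could be expressed roughly as follows. This sketch uses a hypothetical `Os` enum in place of a `target_lexicon::Triple` so that it stays dependency-free. `symbol_name` and `file_trailer` are illustrative names, and the exact directive text mcc emits is an assumption (the line shown is the conventional GNU assembler form).

```rust
/// Hypothetical stand-in for the operating-system part of a target triple.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Os {
    MacOs,
    Linux,
}

/// Returns the symbol as it would be spelled in the assembly text.
pub fn symbol_name(name: &str, os: Os) -> String {
    match os {
        Os::MacOs => format!("_{name}"), // leading underscore on macOS
        Os::Linux => name.to_string(),
    }
}

/// Returns the directive appended at the end of the file, if the platform needs one.
pub fn file_trailer(os: Os) -> Option<&'static str> {
    match os {
        // Marks the object as not requiring an executable stack.
        Os::Linux => Some(".section .note.GNU-stack,\"\",@progbits"),
        Os::MacOs => None,
    }
}
```

For example, `symbol_name("main", Os::MacOs)` yields `_main`, while the Linux variant leaves the name unchanged and adds the trailer instead.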

Pipeline Functions

Preprocessing

pub fn preprocess(
    db: &dyn Db,
    cc: OsString,
    file: SourceFile,
) -> Result<String, CommandError>
Invokes the system C preprocessor. The cc argument is typically "cc" or "gcc".
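
For orientation, the sketch below builds the `cc -E -P` invocation mentioned in the pipeline overview using `std::process::Command`. It only constructs the command and does not run it. How mcc passes the file contents to the preprocessor is not shown on this page, so the path-based `preprocess_command` helper here is a hypothetical illustration rather than the crate's implementation.

```rust
use std::ffi::OsStr;
use std::path::Path;
use std::process::Command;

/// Builds (but does not run) a `<cc> -E -P <input>` command.
pub fn preprocess_command(cc: &OsStr, input: &Path) -> Command {
    let mut cmd = Command::new(cc);
    cmd.arg("-E") // stop after the preprocessing stage
        .arg("-P") // suppress linemarkers in the output
        .arg(input);
    cmd
}
```

Running the returned command with `.output()` would produce the preprocessed source on stdout. A failure to spawn or a non-zero exit status is the kind of condition the `CommandError` in the signature above presumably reports.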

Parsing

pub fn parse(db: &dyn Db, file: SourceFile) -> Ast<'_>
Parses source code into an abstract syntax tree using tree-sitter.

Typechecking

pub mod typechecking {
    pub fn typecheck(db: &dyn Db, file: SourceFile) -> Hir<'_>;
}
Validates types and builds a high-level intermediate representation.

Lowering

pub mod lowering {
    pub fn lower_program(db: &dyn Db, file: SourceFile) -> tacky::Program<'_>;
    pub fn lower(db: &dyn Db, ast: Ast<'_>) -> tacky::Program<'_>;
}
Converts the AST to three-address code (TAC).
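
To show what three-address code looks like, here is a small, self-contained lowering of arithmetic expressions. Each nested operation is flattened into one instruction that writes a fresh temporary. The `Expr`, `Val`, and `Instr` types are hypothetical and far simpler than mcc's real AST and `tacky` types.

```rust
/// Toy expression tree (hypothetical; not mcc's AST).
pub enum Expr {
    Const(i64),
    Add(Box<Expr>, Box<Expr>),
    Mul(Box<Expr>, Box<Expr>),
}

/// An operand: either a constant or a numbered temporary.
#[derive(Clone, Debug, PartialEq)]
pub enum Val {
    Const(i64),
    Tmp(usize),
}

/// A TAC instruction: at most one operator and three addresses.
#[derive(Debug, PartialEq)]
pub enum Instr {
    Binary { op: char, lhs: Val, rhs: Val, dst: Val },
    Return(Val),
}

/// Flattens `expr` into `out` and returns the value holding its result.
pub fn lower_expr(expr: &Expr, out: &mut Vec<Instr>, next_tmp: &mut usize) -> Val {
    match expr {
        Expr::Const(n) => Val::Const(*n),
        Expr::Add(l, r) => lower_binary('+', l, r, out, next_tmp),
        Expr::Mul(l, r) => lower_binary('*', l, r, out, next_tmp),
    }
}

fn lower_binary(
    op: char,
    l: &Expr,
    r: &Expr,
    out: &mut Vec<Instr>,
    next_tmp: &mut usize,
) -> Val {
    // Operands are lowered first, so their instructions precede this one.
    let lhs = lower_expr(l, out, next_tmp);
    let rhs = lower_expr(r, out, next_tmp);
    let dst = Val::Tmp(*next_tmp);
    *next_tmp += 1;
    out.push(Instr::Binary { op, lhs, rhs, dst: dst.clone() });
    dst
}

/// Lowers the statement `return <expr>;`.
pub fn lower_return(expr: &Expr) -> Vec<Instr> {
    let mut out = Vec::new();
    let mut next_tmp = 0;
    let val = lower_expr(expr, &mut out, &mut next_tmp);
    out.push(Instr::Return(val));
    out
}
```

Lowering `return 1 + 2 * 3;` with this sketch produces three instructions: `tmp0 = 2 * 3`, `tmp1 = 1 + tmp0`, and `return tmp1`. The flat form is what makes the later translation to assembly IR straightforward, since each instruction maps to a short, fixed sequence of machine operations.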

Codegen

pub mod codegen {
    pub fn generate_assembly(db: &dyn Db, tacky: tacky::Program<'_>) -> asm::Program<'_>;
}
Generates assembly IR from TAC.

Rendering

pub fn render_program(
    db: &dyn Db,
    program: asm::Program<'_>,
    target: Triple,
) -> Text
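Converts assembly IR to textual assembly code. The Quick Start and Target Configuration examples treat the return value as fallible (.unwrap() and ?), so the actual return type may be a Result wrapping Text rather than the bare Text shown in this signature.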

Assembling

pub fn assemble_and_link(
    db: &dyn Db,
    cc: OsString,
    assembly: PathBuf,
    output: PathBuf,
    target: Triple,
) -> Result<(), CommandError>
Invokes the system assembler and linker to produce an executable.

Next Steps

Database

Learn about the Salsa-powered incremental computation layer

Diagnostics

Handle errors and warnings with codespan-reporting

Callbacks

Hook into compilation stages with the driver API

Architecture

Understand the overall system design
