This guide covers profiling the Rust compiler to understand and improve its performance, including compilation speed, memory usage, and runtime efficiency.

Why Profile the Compiler?

Compilation Speed

Reduce time spent compiling Rust code

Memory Usage

Minimize compiler memory consumption

Query Optimization

Optimize the query system for better caching

Code Quality

Improve generated code performance

Profiling Setup

1. Build for profiling

Configure config.toml for profiling:
[build]
# Build the profiler runtime (needed for -C profile-generate and coverage)
profiler = true

[rust]
# Debug info for profiling
debuginfo-level = 1
debuginfo-level-rustc = 1

# Optimize for profiling
optimize = true

# Keep frame pointers for better stack traces
frame-pointers = true

2. Build the compiler

# Build optimized compiler with profiling support
./x.py build --stage 1

# Or build specific components
./x.py build compiler/rustc_driver
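
3. Install profiling tools

# perf
sudo apt-get install linux-tools-common linux-tools-generic

# flamegraph
cargo install flamegraph

# measureme tools: the measureme crate itself is a library, so install
# the summarize binary from the measureme repository
cargo install --git https://github.com/rust-lang/measureme --branch stable summarize

# samply
cargo install samply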

Self-Profiling with measureme

Rustc has built-in self-profiling using the measureme framework:

Basic Self-Profiling

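# Profile a compilation (-Z flags require a nightly or locally built compiler)
rustc -Z self-profile file.rs

# This creates file-<pid>.mm_profdata in the current directory,
# where <pid> is the process ID of the compiler

# Write the profile into a specific directory instead
rustc -Z self-profile=my_profile file.rs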
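
Because the profile file name embeds the compiler's process ID, scripts usually have to look the file up rather than hard-code it. The following is a minimal sketch, assuming profiles end in .mm_profdata as described above; `latest_profile` is a hypothetical helper name, not part of rustc or measureme.

```shell
# Hypothetical helper: print the most recently modified *.mm_profdata
# file in a directory (defaults to the current directory).
latest_profile() {
    dir="${1:-.}"
    # ls -t sorts by modification time, newest first.
    ls -t "$dir"/*.mm_profdata 2>/dev/null | head -n 1
}
```

After a profiling run, `latest_profile my_profile` prints the path to hand to the analysis tools. It prints nothing if the directory holds no profiles.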

Analyzing Self-Profile Data

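# Get summary statistics (the binary and its subcommand are both named summarize)
summarize summarize file-<pid>.mm_profdata

# Output shows:
# - Total time
# - Time per query
# - Invocation counts
# - Cache hits (when cache-hit events were recorded)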

Time-Passes Profiling

Track time spent in each compiler pass:
# Time all passes
rustc -Z time-passes file.rs

# Output shows time for each pass:
# - parsing
# - expansion  
# - type checking
# - borrowck
# - codegen
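
The report can be long, so it helps to rank passes by time. The sketch below is one way to do that, assuming each line has the layout "time:   0.012; rss: ...<TAB>pass_name"; that layout is an assumption and may differ between compiler versions. `slowest_passes` is a hypothetical helper name.

```shell
# Hypothetical helper: read `-Z time-passes` output on stdin and print the
# N slowest passes as "seconds<TAB>pass name", slowest first.
# Assumed line layout: "time:   0.012; rss: ...<TAB>pass_name".
slowest_passes() {
    n="${1:-10}"
    awk -F '\t' '
        /^ *time:/ {
            head = $1
            sub(/^ +/, "", head)            # drop indentation
            split(head, parts, /[ ;]+/)     # parts[2] is the time in seconds
            print parts[2] "\t" $NF
        }
    ' | sort -t "$(printf '\t')" -k1,1nr | head -n "$n"
}
```

A typical call would be `rustc -Z time-passes file.rs 2>&1 | slowest_passes 5`; the `2>&1` captures the report regardless of which stream it is written to.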

CPU Profiling

Using perf (Linux)

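1. Record profile

# Record a CPU profile at 99 Hz with frame-pointer call graphs
perf record -F 99 -g ./build/host/stage1/bin/rustc file.rs

# Record with DWARF-based call graphs (larger output, works without frame pointers)
perf record -F 99 --call-graph dwarf ./build/host/stage1/bin/rustc file.rs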

2. Analyze results

# Interactive report
perf report

# Plain-text report on stdout
perf report --stdio

# Filter by function
perf report --stdio | grep function_name
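
Rather than grepping for one function, it is often useful to list the hottest symbols first. This sketch assumes the plain-text report has rows of the form "12.34%  command  object  [.] symbol", as produced with `--no-children`; the exact column layout is an assumption, and `top_symbols` is a hypothetical helper name.

```shell
# Hypothetical helper: read `perf report --stdio` output on stdin and print
# the first N rows as "overhead<TAB>symbol". perf already sorts rows by
# overhead, so the first rows are the hottest symbols.
top_symbols() {
    n="${1:-10}"
    awk '
        # Keep only table rows: first column is a percentage, and the
        # symbol follows a marker such as "[.]" (user) or "[k]" (kernel).
        $1 ~ /^[0-9.]+%$/ && match($0, /\[[.kugH]\] /) {
            print $1 "\t" substr($0, RSTART + RLENGTH)
        }
    ' | head -n "$n"
}
```

For example, `perf report --stdio --no-children | top_symbols 20` prints the twenty most expensive symbols and skips header and call-graph lines.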

Using Instruments (macOS)

# Profile with Time Profiler (requires cargo-instruments: cargo install cargo-instruments)
cargo instruments --template time --bin rustc -- file.rs

# Opens Instruments.app with results

Using samply (Cross-Platform)

# Profile with samply
samply record ./build/host/stage1/bin/rustc file.rs

# Opens the Firefox Profiler UI in your browser with the results

Memory Profiling

Heap Profiling with Valgrind

# Profile heap usage over time
valgrind --tool=massif ./build/host/stage1/bin/rustc file.rs

# Analyze with ms_print (Massif writes massif.out.<pid>; 12345 is an example PID)
ms_print massif.out.12345

# Visualize with massif-visualizer
massif-visualizer massif.out.12345
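
For a quick before/after comparison, the peak heap size is often the only number needed. The sketch below reads it directly from the Massif output file, relying on the `mem_heap_B=<bytes>` line that each snapshot contains; `massif_peak_bytes` is a hypothetical helper name.

```shell
# Hypothetical helper: print the largest mem_heap_B value (in bytes)
# found across all snapshots of a Massif output file.
massif_peak_bytes() {
    awk -F '=' '
        $1 == "mem_heap_B" && $2 + 0 > max { max = $2 + 0 }
        END { printf "%d\n", max }
    ' "$1"
}
```

Calling `massif_peak_bytes massif.out.12345` prints a single number. It counts heap payload only, not allocator overhead or stack usage.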

Memory Statistics

# Show size of types in compilation
rustc -Z print-type-sizes file.rs

# Helps identify large types causing memory issues
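
The type-size report lists every type along with its fields, so a filter that keeps only the largest types makes it easier to scan. This sketch assumes header lines of the form "print-type-size type: `Name`: N bytes, alignment: M bytes"; `largest_types` is a hypothetical helper name.

```shell
# Hypothetical helper: read `-Z print-type-sizes` output on stdin and print
# the N largest types as "bytes type-name", largest first.
largest_types() {
    n="${1:-10}"
    # Keep only the per-type header lines and move the size to the front.
    sed -n 's/^print-type-size type: `\(.*\)`: \([0-9][0-9]*\) bytes, alignment: .*$/\2 \1/p' \
        | sort -k1,1nr \
        | head -n "$n"
}
```

A typical call would be `rustc -Z print-type-sizes file.rs | largest_types 5`.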

Query Performance Analysis

Query Profiling

The compiler’s query system is a major performance factor:
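# Profile query execution, including cache-hit events
# (-Z self-profile-events only selects events; -Z self-profile turns profiling on)
rustc -Z self-profile -Z self-profile-events=default,query-cache-hit file.rs

# Summarize per-query time, invocation counts, and cache hits
summarize summarize file-<pid>.mm_profdata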

Incremental Compilation

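# Log incremental compilation activity
# (-C incremental enables the on-disk cache; debug-level logs require
# a compiler built with debug-logging = true)
RUSTC_LOG=rustc_incremental=debug rustc -C incremental=./incr-cache file.rs

# Shows:
# - Cache hits/misses
# - Reused query results
# - Invalidated queries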

LLVM Performance

LLVM Pass Timing

# Time all LLVM passes
rustc -Z time-llvm-passes file.rs

# With detailed output
rustc -C llvm-args=-time-passes file.rs

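Profile-Guided Optimization (PGO)

# PGO uses LLVM instrumentation to optimize the compiled program (here, file.rs)
# Step 1: Build with instrumentation
rustc -C profile-generate=./pgo-data file.rs

# Step 2: Run the program to collect raw profiles in ./pgo-data
./file

# Step 3: Merge the raw profiles
# (llvm-profdata is available via: rustup component add llvm-tools-preview)
llvm-profdata merge -o merged.profdata ./pgo-data

# Step 4: Rebuild using the merged profile
rustc -C profile-use=merged.profdata file.rs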

Benchmarking

Compiler Benchmarks

1. Run rustc-perf

# Clone rustc-perf
git clone https://github.com/rust-lang/rustc-perf
cd rustc-perf

# Build collector
cargo build --release -p collector

# Run benchmarks against the locally built compiler
# (/path/to/rust is a placeholder for your rust checkout)
./target/release/collector bench_local /path/to/rust/build/host/stage1/bin/rustc
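
2. Analyze results

# bench_local stores its results in a SQLite database (results.db by default)
# Serve them with the rustc-perf site and compare runs in the browser
cargo run --release --bin site -- results.db

# Then open http://localhost:2346/compare.html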

Custom Benchmarks

# Install hyperfine
cargo install hyperfine

# Benchmark compilation
hyperfine './build/host/stage1/bin/rustc file.rs'

# Compare different versions
hyperfine \
  --warmup 3 \
  './old-rustc file.rs' \
  './new-rustc file.rs'

Performance Analysis Workflow

1. Establish baseline

# Profile before changes (the profile is written into the baseline/ directory)
rustc -Z self-profile=baseline file.rs
summarize summarize baseline/file-<pid>.mm_profdata > baseline.txt

2. Make changes

Implement your performance improvement, then rebuild the compiler.

3. Profile again

# Profile after changes (the profile is written into the improved/ directory)
rustc -Z self-profile=improved file.rs
summarize summarize improved/file-<pid>.mm_profdata > improved.txt

4. Compare results

# Compare profiles (baseline first, then the new profile; each run has its own <pid>)
summarize diff baseline/file-<pid>.mm_profdata improved/file-<pid>.mm_profdata

# Look for:
# - Reduced query execution time
# - Better cache hit rates
# - Fewer invocations

5. Validate improvement

# Run benchmarks (from the rustc-perf checkout; /path/to/rust is a placeholder)
./target/release/collector bench_local /path/to/rust/build/host/stage1/bin/rustc

# Check against perf suite
./x.py test src/tools/rustc-perf

Common Performance Issues

Symptoms: High time in specific queries
Diagnosis:
rustc -Z self-profile file.rs
summarize summarize file-<pid>.mm_profdata
Solutions:
  • Add caching
  • Optimize hot paths
  • Reduce query invocations

Symptoms: Low query cache hits
Diagnosis:
rustc -Z self-profile -Z self-profile-events=default,query-cache-hit file.rs
Solutions:
  • Improve query key design
  • Enable incremental compilation
  • Reduce query invalidation

Symptoms: Excessive memory consumption
Diagnosis:
valgrind --tool=massif ./build/host/stage1/bin/rustc file.rs
Solutions:
  • Reduce type sizes
  • Optimize data structures
  • Free unused memory

Symptoms: Long time in codegen phase
Diagnosis:
rustc -Z time-llvm-passes file.rs
Solutions:
  • Adjust the number of codegen units (more units increase LLVM parallelism; fewer reduce duplicated work)
  • Optimize MIR
  • Disable expensive LLVM passes

Optimization Tips

Query Optimization

  • Cache expensive computations
  • Minimize query parameters
  • Use shallow queries when possible
  • Avoid unnecessary query dependencies

Memory Optimization

  • Use arena allocation
  • Intern strings and types
  • Free temporary data
  • Use compact data structures

Compilation Speed

  • Enable incremental compilation
  • Use query caching effectively
  • Parallelize independent work
  • Reduce redundant computation

Code Generation

  • Optimize MIR before codegen
  • Use appropriate optimization levels
  • Enable LTO for release builds
  • Use profile-guided optimization

Performance Testing in CI

Setting Up Performance Tests

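name: Performance Test

on: [pull_request]

jobs:
  perf:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Build
        run: ./x.py build --stage 1

      - name: Install summarize
        run: cargo install --git https://github.com/rust-lang/measureme --branch stable summarize

      - name: Profile
        run: |
          # Use the freshly built compiler so that -Z flags are accepted
          ./build/host/stage1/bin/rustc -Z self-profile test.rs
          summarize summarize test-*.mm_profdata > results.txt

      - name: Check for regressions
        # check-perf-regression.sh is a script you provide yourself
        run: ./check-perf-regression.sh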
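
The workflow calls ./check-perf-regression.sh without showing what it contains. The sketch below is one possible core for such a script, not an existing tool: `is_regression` is a hypothetical function that compares a baseline time with a new time against a percentage threshold. How the two numbers are extracted from results.txt is left open, since that depends on the report layout.

```shell
# Hypothetical regression check: succeeds (exit status 0) when `new` is
# more than `limit` percent slower than `base`. All three arguments are
# plain numbers, for example seconds of total compile time.
is_regression() {
    awk -v base="$1" -v new="$2" -v limit="$3" \
        'BEGIN { exit !((new - base) / base * 100 > limit) }'
}
```

A CI script could call `is_regression "$baseline_secs" "$new_secs" 5` and fail the job when the call succeeds. The variable names and the 5 percent threshold are illustrative only.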

Resources

Profiling Guide

Rustc dev guide profiling chapter

measureme

Self-profiling framework

rustc-perf

Compiler performance benchmarking

Performance Book

Rust performance book

