Understanding rustc’s Mid-level Intermediate Representation used for optimization and code generation
MIR (Mid-level Intermediate Representation) is the heart of the Rust compiler’s optimization and code generation pipeline. It’s a control-flow graph representation that is simpler than HIR but still retains high-level information about the program.
MIR is designed to be easy to analyze and transform while being detailed enough to support advanced optimizations and precise code generation.
MIR represents each function body as a control-flow graph (CFG) of basic blocks:
    /// The core MIR data structure representing a function body
    pub struct Body<'tcx> {
        /// Basic blocks that make up the control flow graph
        pub basic_blocks: BasicBlocks<'tcx>,

        /// Local variables (including arguments and return place)
        pub local_decls: IndexVec<Local, LocalDecl<'tcx>>,

        /// User type annotations
        pub user_type_annotations: IndexVec<UserTypeAnnotationIndex, UserTypeAnnotation<'tcx>>,

        /// Current compilation phase
        pub phase: MirPhase,

        // ... and more fields
    }
A basic block is a sequence of statements followed by a terminator:
    pub struct BasicBlockData<'tcx> {
        /// Sequence of statements in this block
        pub statements: Vec<Statement<'tcx>>,

        /// Terminator instruction (how control flow exits this block)
        pub terminator: Option<Terminator<'tcx>>,

        /// True if this block lies on an unwind (cleanup) path
        pub is_cleanup: bool,
    }
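To make the shape of a body concrete, here is a minimal, self-contained sketch. It does not use rustc's types: `Body`, `BasicBlockData`, `Statement`, and `Terminator` below are toy stand-ins invented for illustration, and the printed text only loosely imitates rustc's MIR dump format.

```rust
// Toy model only: these types mirror the *shape* of MIR described above,
// but they are hypothetical and much simpler than rustc's definitions.

#[derive(Clone, Copy)]
struct BasicBlock(usize); // index of a block within a body

#[derive(Clone, Copy)]
struct Local(usize); // index of a local; `_0` is the return place

enum Statement {
    /// `_dest = Add(_lhs, _rhs)`
    AssignAdd { dest: Local, lhs: Local, rhs: Local },
}

enum Terminator {
    Goto { target: BasicBlock },
    Return,
}

struct BasicBlockData {
    statements: Vec<Statement>,
    terminator: Terminator,
}

struct Body {
    basic_blocks: Vec<BasicBlockData>,
    /// Type name of each local, indexed by local number.
    local_decls: Vec<&'static str>,
    /// Arguments occupy locals `_1..=_arg_count`.
    arg_count: usize,
}

/// Builds a body shaped like `fn add(a: i32, b: i32) -> i32 { a + b }`.
fn build_add_body() -> Body {
    Body {
        local_decls: vec!["i32", "i32", "i32"], // _0 (return place), _1, _2
        arg_count: 2,
        basic_blocks: vec![
            BasicBlockData {
                statements: vec![Statement::AssignAdd {
                    dest: Local(0),
                    lhs: Local(1),
                    rhs: Local(2),
                }],
                terminator: Terminator::Goto { target: BasicBlock(1) },
            },
            BasicBlockData { statements: vec![], terminator: Terminator::Return },
        ],
    }
}

/// Renders the toy body as text: locals first, then one section per block.
fn dump(body: &Body) -> String {
    let mut out = String::new();
    for (i, ty) in body.local_decls.iter().enumerate() {
        let role = if i == 0 {
            "return place"
        } else if i <= body.arg_count {
            "argument"
        } else {
            "temporary"
        };
        out.push_str(&format!("let _{i}: {ty}; // {role}\n"));
    }
    for (i, block) in body.basic_blocks.iter().enumerate() {
        out.push_str(&format!("bb{i}: {{\n"));
        for stmt in &block.statements {
            match stmt {
                Statement::AssignAdd { dest, lhs, rhs } => {
                    out.push_str(&format!("    _{} = Add(_{}, _{});\n", dest.0, lhs.0, rhs.0));
                }
            }
        }
        match &block.terminator {
            Terminator::Goto { target } => out.push_str(&format!("    goto -> bb{};\n", target.0)),
            Terminator::Return => out.push_str("    return;\n"),
        }
        out.push_str("}\n");
    }
    out
}

fn main() {
    print!("{}", dump(&build_add_body()));
}
```

Every block ends in exactly one terminator, so the statements inside a block always execute as a straight-line sequence.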
Statements perform operations without transferring control:
    pub enum StatementKind<'tcx> {
        /// Assign the rvalue to the lvalue
        Assign(Box<(Place<'tcx>, Rvalue<'tcx>)>),

        /// Mark a local as live at this point
        StorageLive(Local),

        /// Mark a local as dead at this point
        StorageDead(Local),

        /// Set discriminant for an enum
        SetDiscriminant { place: Box<Place<'tcx>>, variant_index: VariantIdx },

        /// No-op (used during optimization)
        Nop,

        // ... and more
    }
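The sketch below shows how a few of these statement kinds interact within one block. It is a toy model, not rustc code: the `Stmt` enum and both helper functions are hypothetical, and it assumes for simplicity that every local needs a StorageLive marker before it is written.

```rust
use std::collections::HashSet;

// Hypothetical, simplified statement type for illustration only.
#[derive(Debug, Clone, PartialEq)]
enum Stmt {
    /// `_local = const value`
    Assign { local: usize, value: i64 },
    StorageLive(usize),
    StorageDead(usize),
    Nop,
}

/// Returns the index of the first assignment that writes to a local whose
/// storage is not currently live, or `None` if every write is in range.
fn first_write_to_dead_storage(stmts: &[Stmt]) -> Option<usize> {
    let mut live = HashSet::new();
    for (i, stmt) in stmts.iter().enumerate() {
        match stmt {
            Stmt::StorageLive(l) => {
                live.insert(*l);
            }
            Stmt::StorageDead(l) => {
                live.remove(l);
            }
            Stmt::Assign { local, .. } if !live.contains(local) => return Some(i),
            Stmt::Assign { .. } | Stmt::Nop => {}
        }
    }
    None
}

/// Removes `Nop` statements, as a cleanup step might do after other
/// transformations have replaced statements with `Nop`.
fn strip_nops(stmts: &mut Vec<Stmt>) {
    stmts.retain(|s| *s != Stmt::Nop);
}

fn main() {
    let mut stmts = vec![
        Stmt::StorageLive(1),
        Stmt::Assign { local: 1, value: 7 },
        Stmt::Nop,
        Stmt::StorageDead(1),
        Stmt::Assign { local: 1, value: 8 }, // write after StorageDead
    ];
    println!("first bad write: {:?}", first_write_to_dead_storage(&stmts));
    strip_nops(&mut stmts);
    println!("{} statements after stripping Nop", stmts.len());
}
```

Replacing a statement with Nop instead of deleting it keeps the indices of the remaining statements stable while a transformation is in progress. That is one plausible reason for having a Nop variant.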
Terminators transfer control to other basic blocks:
    pub enum TerminatorKind<'tcx> {
        /// Goto a single successor block
        Goto { target: BasicBlock },

        /// Branch based on condition
        SwitchInt { discr: Operand<'tcx>, targets: SwitchTargets },

        /// Return from function
        Return,

        /// Marks a point that control flow can never reach
        /// (executing it is undefined behavior)
        Unreachable,

        /// Drop a value
        Drop { place: Place<'tcx>, target: BasicBlock, unwind: UnwindAction },

        /// Call a function
        Call {
            func: Operand<'tcx>,
            args: Vec<Operand<'tcx>>,
            destination: Place<'tcx>,
            target: Option<BasicBlock>,
            unwind: UnwindAction,
            // ... and more fields
        },

        /// Assert a condition
        Assert {
            cond: Operand<'tcx>,
            expected: bool,
            msg: Box<AssertMessage<'tcx>>,
            target: BasicBlock,
            unwind: UnwindAction,
        },

        // ... and more
    }
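Because terminators are the only edges in the graph, many CFG analyses start by asking each terminator for its successors. The following sketch does that for a reduced, hypothetical `Terminator` enum and then computes which blocks are reachable from `bb0`. It ignores unwind edges and assumes every target index is in bounds.

```rust
// Hypothetical, reduced terminator type; block indices are plain `usize`.
#[derive(Debug, Clone)]
enum Terminator {
    Goto { target: usize },
    /// Simplified: each entry pairs a value with a target block,
    /// and `otherwise` is the fallback target.
    SwitchInt { targets: Vec<(u128, usize)>, otherwise: usize },
    /// `target: None` models a call that never returns.
    Call { target: Option<usize> },
    Return,
    Unreachable,
}

/// Lists the blocks that control may flow to from this terminator.
fn successors(term: &Terminator) -> Vec<usize> {
    match term {
        Terminator::Goto { target } => vec![*target],
        Terminator::SwitchInt { targets, otherwise } => {
            let mut out: Vec<usize> = targets.iter().map(|&(_, bb)| bb).collect();
            out.push(*otherwise);
            out
        }
        Terminator::Call { target } => target.iter().copied().collect(),
        Terminator::Return | Terminator::Unreachable => vec![],
    }
}

/// Depth-first search from block 0; `blocks[i]` is the terminator of `bb{i}`.
fn reachable(blocks: &[Terminator]) -> Vec<bool> {
    if blocks.is_empty() {
        return vec![];
    }
    let mut seen = vec![false; blocks.len()];
    let mut stack = vec![0usize];
    while let Some(bb) = stack.pop() {
        if seen[bb] {
            continue;
        }
        seen[bb] = true;
        stack.extend(successors(&blocks[bb]));
    }
    seen
}

fn main() {
    let blocks = vec![
        Terminator::SwitchInt { targets: vec![(0, 1)], otherwise: 2 }, // bb0
        Terminator::Goto { target: 3 },                                // bb1
        Terminator::Call { target: None },                             // bb2
        Terminator::Return,                                            // bb3
        Terminator::Unreachable,                                       // bb4: no predecessor
    ];
    for (i, is_reachable) in reachable(&blocks).iter().enumerate() {
        println!("bb{i}: reachable = {is_reachable}");
    }
}
```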
Places identify memory locations: a base local followed by a sequence of projections such as dereferences and field accesses:

    pub struct Place<'tcx> {
        pub local: Local,
        pub projection: &'tcx [PlaceElem<'tcx>],
    }

    pub enum PlaceElem<'tcx> {
        /// Dereference a pointer
        Deref,

        /// Access a field
        Field(FieldIdx, Ty<'tcx>),

        /// Index into an array
        Index(Local),

        /// Downcast to a specific enum variant
        Downcast(Option<Symbol>, VariantIdx),

        // ... and more
    }
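A place can be read as "start at a local, then apply each projection in order". The sketch below renders a toy place as text to show that reading order. The `Place` and `PlaceElem` types here are hypothetical simplifications (no types, no downcasts), and the output format is only loosely modelled on MIR dumps.

```rust
// Hypothetical, simplified projection elements for illustration.
#[derive(Debug, Clone, PartialEq)]
enum PlaceElem {
    Deref,
    /// Field number within a struct or tuple.
    Field(usize),
    /// The local that holds the index value.
    Index(usize),
}

struct Place {
    local: usize,
    projection: Vec<PlaceElem>,
}

/// Applies the projections left to right, wrapping the text built so far.
fn render(place: &Place) -> String {
    let mut text = format!("_{}", place.local);
    for elem in &place.projection {
        text = match elem {
            PlaceElem::Deref => format!("(*{text})"),
            PlaceElem::Field(field) => format!("({text}.{field})"),
            PlaceElem::Index(index_local) => format!("{text}[_{index_local}]"),
        };
    }
    text
}

fn main() {
    // Roughly the place behind a source expression like `(*p).0[i]`,
    // with `p` stored in `_1` and `i` stored in `_2`.
    let place = Place {
        local: 1,
        projection: vec![PlaceElem::Deref, PlaceElem::Field(0), PlaceElem::Index(2)],
    };
    println!("{}", render(&place));
}
```

Note that the index projection refers to another local, not to an arbitrary expression: any index computation has already been stored in a local by an earlier statement.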
Rvalues represent computations that produce values:
    pub enum Rvalue<'tcx> {
        /// Use an operand as-is (a copy or move from a place, or a constant)
        Use(Operand<'tcx>),

        /// Perform a binary operation
        BinaryOp(BinOp, Box<(Operand<'tcx>, Operand<'tcx>)>),

        /// Perform a unary operation
        UnaryOp(UnOp, Operand<'tcx>),

        /// Cast a value
        Cast(CastKind, Operand<'tcx>, Ty<'tcx>),

        /// Create a reference
        Ref(Region<'tcx>, BorrowKind, Place<'tcx>),

        /// Create an aggregate (struct, tuple, array)
        Aggregate(Box<AggregateKind<'tcx>>, Vec<Operand<'tcx>>),

        // ... and more
    }
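Since an rvalue is a single, non-nested computation over operands, evaluating one is a small match. The sketch below evaluates a toy `Rvalue` over integer locals. All types here are hypothetical, and returning `None` on overflow is a choice made for this example, not a description of how rustc handles overflow.

```rust
// Hypothetical, integer-only model of operands and rvalues.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Operand {
    Const(i64),
    /// Read the current value of a local.
    Copy(usize),
}

#[derive(Debug, Clone, Copy)]
enum BinOp {
    Add,
    Sub,
    Mul,
}

enum UnOp {
    Neg,
}

enum Rvalue {
    Use(Operand),
    BinaryOp(BinOp, Operand, Operand),
    UnaryOp(UnOp, Operand),
}

/// Reads an operand; `locals[i]` holds the current value of `_i`.
fn read(op: Operand, locals: &[i64]) -> i64 {
    match op {
        Operand::Const(c) => c,
        Operand::Copy(local) => locals[local],
    }
}

/// Evaluates one rvalue, returning `None` if the arithmetic overflows.
fn eval(rvalue: &Rvalue, locals: &[i64]) -> Option<i64> {
    match rvalue {
        Rvalue::Use(op) => Some(read(*op, locals)),
        Rvalue::BinaryOp(op, a, b) => {
            let (a, b) = (read(*a, locals), read(*b, locals));
            match op {
                BinOp::Add => a.checked_add(b),
                BinOp::Sub => a.checked_sub(b),
                BinOp::Mul => a.checked_mul(b),
            }
        }
        Rvalue::UnaryOp(UnOp::Neg, a) => read(*a, locals).checked_neg(),
    }
}

fn main() {
    let locals: [i64; 3] = [0, 6, 7];
    // Models `_0 = Mul(copy _1, copy _2)`.
    let product = Rvalue::BinaryOp(BinOp::Mul, Operand::Copy(1), Operand::Copy(2));
    println!("{:?}", eval(&product, &locals));
    let difference = Rvalue::BinaryOp(BinOp::Sub, Operand::Copy(2), Operand::Const(1));
    println!("{:?}", eval(&difference, &locals));
}
```

Because operands are only places or constants, nested source expressions such as `a * b + c` are split across several assignments to temporaries before they reach this form.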
MIR construction happens through an intermediate representation called THIR (Typed High-level Intermediate Representation), which represents pattern matching and other complex control flow more explicitly.
    // From rustc_mir_build/src/lib.rs
    // The builder module constructs MIR from HIR
    pub fn provide(providers: &mut Providers) {
        providers.queries.check_match = thir::pattern::check_match;
        providers.queries.lit_to_const = thir::constant::lit_to_const;
        providers.hooks.build_mir_inner_impl = builder::build_mir_inner_impl;
    }
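One way to observe the result of MIR construction is to compile a small function and ask rustc to emit its MIR. The function below is an arbitrary example chosen for this purpose. Saving it as a file (the name `max_of.rs` is an assumption) and running `rustc --emit=mir max_of.rs` should write a textual MIR dump next to it, in which the `if` typically appears as a `switchInt` terminator with one target block per branch. The exact output varies between compiler versions.

```rust
/// Returns the larger of two integers. The `if` expression gives the
/// function more than one basic block once it is lowered to MIR.
fn max_of(a: i32, b: i32) -> i32 {
    if a > b {
        a
    } else {
        b
    }
}

fn main() {
    println!("{}", max_of(3, 9));
}
```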
Once built, MIR is transformed by a series of passes, which are declared with the declare_passes! macro:

    declare_passes! {
        mod abort_unwinding_calls : AbortUnwindingCalls;
        mod add_call_guards : AddCallGuards { AllCallEdges, CriticalCallEdges };
        mod cleanup_post_borrowck : CleanupPostBorrowck;
        mod copy_prop : CopyProp;
        mod dataflow_const_prop : DataflowConstProp;
        mod dead_store_elimination : DeadStoreElimination { Initial, Final };
        pub mod inline : Inline, ForceInline;
        mod gvn : GVN;
        mod jump_threading : JumpThreading;
        // ... many more passes
    }
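The general idea of a pass pipeline, where each pass rewrites the body in place and later passes build on earlier ones, can be sketched in a few dozen lines. Everything below is hypothetical: the `ToyPass` trait, the straight-line `Body`, and the three passes only borrow names from the list above and are far simpler than the real implementations.

```rust
use std::collections::{HashMap, HashSet};

#[derive(Debug, Clone, Copy, PartialEq)]
enum Src {
    Const(i64),
    Local(usize),
}

#[derive(Debug, Clone, PartialEq)]
enum Stmt {
    Assign { dest: usize, src: Src },
    Nop,
}

/// A single straight-line block; `_0` is the return place.
struct Body {
    statements: Vec<Stmt>,
}

/// Hypothetical pass interface: each pass mutates the body in place.
trait ToyPass {
    fn name(&self) -> &'static str;
    fn run_pass(&self, body: &mut Body);
}

/// Replaces reads of locals whose value is a known constant.
struct ConstProp;
impl ToyPass for ConstProp {
    fn name(&self) -> &'static str { "ConstProp" }
    fn run_pass(&self, body: &mut Body) {
        let mut known: HashMap<usize, i64> = HashMap::new();
        for stmt in &mut body.statements {
            if let Stmt::Assign { dest, src } = stmt {
                if let Src::Local(l) = *src {
                    if let Some(&c) = known.get(&l) {
                        *src = Src::Const(c);
                    }
                }
                match *src {
                    Src::Const(c) => { known.insert(*dest, c); }
                    Src::Local(_) => { known.remove(&*dest); }
                }
            }
        }
    }
}

/// Walks backwards and turns writes that nobody reads into `Nop`.
struct DeadStoreElimination;
impl ToyPass for DeadStoreElimination {
    fn name(&self) -> &'static str { "DeadStoreElimination" }
    fn run_pass(&self, body: &mut Body) {
        let mut live: HashSet<usize> = HashSet::from([0]); // the caller reads `_0`
        for stmt in body.statements.iter_mut().rev() {
            let keep = match &*stmt {
                Stmt::Assign { dest, src } => {
                    let needed = live.remove(dest);
                    if needed {
                        if let Src::Local(l) = src { live.insert(*l); }
                    }
                    needed
                }
                Stmt::Nop => true,
            };
            if !keep { *stmt = Stmt::Nop; }
        }
    }
}

/// Drops the `Nop` statements left behind by earlier passes.
struct StripNops;
impl ToyPass for StripNops {
    fn name(&self) -> &'static str { "StripNops" }
    fn run_pass(&self, body: &mut Body) {
        body.statements.retain(|s| *s != Stmt::Nop);
    }
}

/// Runs the passes in order and returns the names of those that ran.
fn run_passes(body: &mut Body, passes: &[&dyn ToyPass]) -> Vec<&'static str> {
    passes.iter().map(|pass| { pass.run_pass(body); pass.name() }).collect()
}

/// `_1 = 2; _2 = _1; _3 = 9; _0 = _2`, then optimized.
fn optimize_example() -> Body {
    let mut body = Body { statements: vec![
        Stmt::Assign { dest: 1, src: Src::Const(2) },
        Stmt::Assign { dest: 2, src: Src::Local(1) },
        Stmt::Assign { dest: 3, src: Src::Const(9) },
        Stmt::Assign { dest: 0, src: Src::Local(2) },
    ] };
    run_passes(&mut body, &[&ConstProp, &DeadStoreElimination, &StripNops]);
    body
}

fn main() {
    println!("{:?}", optimize_example().statements);
}
```

Ordering matters in this sketch: the dead-store pass can only remove the writes to `_1` and `_2` because constant propagation has already rewritten the statements that read them.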