Pipeline Overview
The AlphaFold 3 inference pipeline transforms processed features into predicted 3D structures through a multi-stage neural network. This page details the complete flow from feature tensors to final structure outputs.
Pipeline Stages
Entry Point
The main inference entry point is run_alphafold.py, which orchestrates the entire prediction workflow.
Both stages can be run independently. The data pipeline produces a *_data.json file that contains all features needed for inference, enabling GPU-free data processing and GPU-only inference on separate machines.
Stage 1: Feature Preparation
Input Format
Features are processed from the data pipeline output (or custom inputs) into a structured Batch object.
Token Features
Tokens are the fundamental unit in AlphaFold 3. Each token represents a single residue for proteins/nucleic acids, or a single heavy atom for ligands and modified residues. Per-token features include:
- Residue/ligand type encoding
- Chain ID information
- Token index (position in sequence)
- Chemical properties
src/alphafold3/model/features.py
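As a rough illustration of the per-token features listed above, they can be pictured as a record per token. The field names below are illustrative assumptions, not the actual Batch schema:

```python
from dataclasses import dataclass

# Hypothetical sketch of the per-token features described above;
# field names are illustrative, not the real features.py layout.
@dataclass
class TokenFeatures:
    restype: int      # residue/ligand type encoding
    chain_id: int     # which chain the token belongs to
    token_index: int  # position in the tokenized sequence
    is_ligand: bool   # ligand-atom token vs. polymer-residue token

tokens = [
    TokenFeatures(restype=0, chain_id=0, token_index=0, is_ligand=False),
    TokenFeatures(restype=20, chain_id=1, token_index=1, is_ligand=True),
]
```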
MSA Features
Multiple Sequence Alignments provide evolutionary information. MSA depth is limited to 1024 sequences (configurable); sequences are clustered and subsampled to maintain diversity while fitting memory constraints.
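The depth limiting can be sketched as follows. This is a simplified stand-in for the real clustering logic, assuming only that the query stays as row 0 and that duplicates are dropped before random subsampling:

```python
import numpy as np

def subsample_msa(msa: np.ndarray, max_depth: int = 1024, seed: int = 0) -> np.ndarray:
    """Sketch of MSA depth limiting: deduplicate rows (keeping original
    order, query first), then randomly subsample down to max_depth."""
    # Indices of the first occurrence of each unique row, in original order.
    _, first_idx = np.unique(msa, axis=0, return_index=True)
    msa = msa[np.sort(first_idx)]
    if msa.shape[0] <= max_depth:
        return msa
    rng = np.random.default_rng(seed)
    # Always keep the query (row 0); subsample the rest.
    rest = rng.choice(np.arange(1, msa.shape[0]), size=max_depth - 1, replace=False)
    return msa[np.concatenate([[0], np.sort(rest)])]
```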
Template Features
Structural templates from the PDB provide spatial priors:
- Template coordinates: aligned structures
- Template sequence alignment: mapping from query to template
- Template metadata: resolution, date, identity scores
src/alphafold3/data/templates.py
Stage 2: Network Forward Pass
2.1 Input Embedding
The network begins by creating initial embeddings:
- Single representation: [num_tokens, 384], per-token features
- Pair representation: [num_tokens, num_tokens, 128], pairwise relationships
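The two representations have the shapes listed above. A minimal sketch of the layout (zero-initialized for illustration; the real network projects input features into these tensors):

```python
import numpy as np

def init_representations(num_tokens: int, c_single: int = 384, c_pair: int = 128):
    """Allocate the two core representations with the shapes described
    above. Zero-init is a placeholder for the real input projections."""
    single = np.zeros((num_tokens, c_single), dtype=np.float32)
    pair = np.zeros((num_tokens, num_tokens, c_pair), dtype=np.float32)
    return single, pair
```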
2.2 Evoformer Trunk
The Evoformer processes MSA and template information.
MSA Processing
Processes multiple sequence alignments through attention and transformation layers:
- Row attention (per-sequence)
- Column attention (per-position)
- Transition layers (feed-forward networks)
- Outer product mean (MSA → pair updates)
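The outer product mean is the step that routes MSA information into the pair representation: for every token pair (i, j), it averages the outer product of the MSA embeddings over sequences. A shape-level sketch (the real module additionally applies learned projections to reach the pair channel dimension):

```python
import numpy as np

def outer_product_mean(msa: np.ndarray) -> np.ndarray:
    """Sketch of the MSA -> pair update.
    msa: [num_seq, num_tokens, c]  ->  [num_tokens, num_tokens, c, c]
    (the real module projects the result down to the pair channels)."""
    return np.einsum('sic,sjd->ijcd', msa, msa) / msa.shape[0]
```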
Template Integration
Incorporates structural template information:
- Template pair features computed from template coordinates
- Template point attention to integrate spatial information
- Template angle features for backbone geometry
src/alphafold3/model/network/template_modules.py
Relative Positional Encoding
Adds positional information to pair representations, encoding relative token positions and chain separations.
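A minimal sketch of such an encoding: one-hot of the clipped residue offset for same-chain pairs, with a separate bucket for cross-chain pairs. The bucket layout and clipping radius are illustrative assumptions, not the exact AF3 scheme:

```python
import numpy as np

def relative_position_encoding(token_index, chain_id, max_offset=32):
    """Sketch: one-hot clipped relative offsets plus a 'different chain'
    bucket. Output: [n, n, 2*max_offset + 2]."""
    offset = token_index[:, None] - token_index[None, :]
    clipped = np.clip(offset, -max_offset, max_offset) + max_offset  # 0..2*max_offset
    same_chain = chain_id[:, None] == chain_id[None, :]
    # Cross-chain pairs all share the last bucket.
    buckets = np.where(same_chain, clipped, 2 * max_offset + 1)
    return np.eye(2 * max_offset + 2, dtype=np.float32)[buckets]
```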
2.3 Pairformer Module
After the Evoformer, the Pairformer refines representations:
- Triangle multiplicative updates
- Triangle self-attention
- Single representation updates with attention
- Per-token transition blocks
The Pairformer is the deepest part of the network with 48 blocks. It performs sophisticated reasoning about pairwise token relationships and refines the single representation through cross-attention.
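The triangle multiplicative update can be sketched at the shape level: the new value of edge (i, j) aggregates, over all k, products of edges (i, k) and (j, k) ("outgoing" edges). Gating and the learned projections are omitted here for brevity:

```python
import numpy as np

def triangle_multiply_outgoing(pair: np.ndarray) -> np.ndarray:
    """Sketch of a triangle multiplicative update ('outgoing' edges).
    pair: [n, n, c] -> [n, n, c]; the real module applies separate
    linear projections and gating to the two operands."""
    a, b = pair, pair  # stand-ins for the two projected copies
    return np.einsum('ikc,jkc->ijc', a, b)
```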
2.4 Diffusion Module
The diffusion head generates 3D atomic coordinates.
Noise Schedule
The noise schedule runs from smax = 160.0 Å down to smin = 0.0004 Å.
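A schedule interpolating between these bounds can be sketched in the Karras/EDM style; the exponent rho = 7 and the interpolation in sigma^(1/rho) space are assumptions borrowed from that formulation, not confirmed by this page:

```python
import numpy as np

def noise_schedule(t, s_max=160.0, s_min=4e-4, rho=7.0):
    """Sketch: interpolate between s_max (t=0) and s_min (t=1) in
    sigma^(1/rho) space, Karras-style. t may be a scalar or array."""
    return (s_max ** (1 / rho) + t * (s_min ** (1 / rho) - s_max ** (1 / rho))) ** rho
```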
Diffusion Transformer
The diffusion transformer is conditioned on:
- Single representations (token-level context)
- Pair representations (pairwise relationships)
- Noise level embeddings (current diffusion timestep)
src/alphafold3/model/network/diffusion_transformer.py
2.5 Confidence Head
In parallel with structure generation, confidence metrics are computed:
- pLDDT: per-atom local distance confidence
- PAE: Predicted aligned error matrix
- Contact probabilities: Likelihood of token-token contacts
src/alphafold3/model/network/confidence_head.py
Stage 3: Structure Generation
Coordinate Conversion
Model outputs are in a token-atom layout that must be converted to the final flat output format:
- Unpacking ligand atoms from token representations
- Ordering atoms according to mmCIF conventions
- Handling missing atoms (coordinates set to (0, 0, 0))
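A simplified sketch of the layout conversion, assuming a dense [num_tokens, max_atoms_per_token, 3] coordinate tensor and an atom-presence mask (mmCIF atom reordering is omitted):

```python
import numpy as np

def flatten_coordinates(token_atom_coords: np.ndarray, atom_mask: np.ndarray):
    """Sketch of the token-atom -> flat conversion described above:
    flatten per-token atom slots and zero out missing atoms.
    token_atom_coords: [num_tokens, max_atoms_per_token, 3]
    atom_mask:         [num_tokens, max_atoms_per_token] (1 = present)"""
    coords = np.where(atom_mask[..., None] > 0, token_atom_coords, 0.0)
    return coords.reshape(-1, 3), atom_mask.reshape(-1).astype(bool)
```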
Multiple Samples
The diffusion process generates multiple samples per seed. Multiple samples provide diversity in predictions; the best sample is selected based on the ranking score, which combines confidence metrics with clash and disorder penalties.
Stage 4: Confidence Computation
After structure generation, comprehensive confidence metrics are computed:
- pLDDT (per-atom)
- PAE (Predicted Aligned Error)
Aggregate Metrics
Computed from per-token/per-atom confidences:
- pTM: Predicted TM-score for the full structure (0-1, higher is better)
- ipTM: Interface pTM for multi-chain interactions
- chain_pair_pae_min: Minimum PAE between chain pairs
- ranking_score: Combined metric for ranking predictions
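The selection step itself is simple once ranking_score is available. A sketch, assuming each prediction is represented as a dict that already carries its score (the clash/disorder penalties are folded into the score upstream):

```python
def select_best_sample(samples):
    """Sketch: rank all (seed, sample) predictions by ranking_score
    and return the best one. 'samples' is a list of dicts with a
    'ranking_score' key; structure of the dicts is illustrative."""
    return max(samples, key=lambda s: s['ranking_score'])
```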
Stage 5: Post-Processing
Structure Output
Predicted structures are written in mmCIF format:
- <job>_seed-<seed>_sample-<n>_model.cif: structure in mmCIF format
- <job>_seed-<seed>_sample-<n>_confidences.json: full confidence arrays
- <job>_seed-<seed>_sample-<n>_summary_confidences.json: scalar summary metrics
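A small helper mirroring the naming pattern above (a sketch for scripting against the outputs, not code from the repository):

```python
def output_paths(job_name: str, seed: int, sample: int) -> dict:
    """Build the per-sample output file names following the
    <job>_seed-<seed>_sample-<n>_* pattern described above."""
    stem = f"{job_name}_seed-{seed}_sample-{sample}"
    return {
        'structure': f"{stem}_model.cif",
        'confidences': f"{stem}_confidences.json",
        'summary': f"{stem}_summary_confidences.json",
    }
```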
Ranking and Selection
All samples across all seeds are ranked by ranking_score:
- Best prediction is copied to root output directory
- Ranking CSV file lists all predictions with scores
- Users can select different samples based on specific metrics (e.g., highest chain-specific confidence)
Optional Outputs
Embeddings
Token and pair embeddings can be saved with --save_embeddings=true. Useful for downstream machine learning tasks or analysis.
Distograms
Distance distributions can be saved with --save_distogram=true. Large files (~3 GB for 5000 tokens).
Performance Optimization
Memory Management
The inference pipeline employs several memory optimization strategies:
- Gradient checkpointing: recomputes activations rather than storing them
- Block remat: Rematerialization of Pairformer blocks
- bfloat16 precision: Reduces memory by 2× with minimal accuracy loss
Batching
AlphaFold 3 processes one structure at a time; multiple predictions run as separate jobs.
GPU Requirements
Minimum: 24 GB GPU (e.g., RTX 3090, A5000)
Recommended: 40 GB+ GPU (e.g., A100) for large complexes
Memory scales with:
- Number of tokens (~quadratic for pair representations)
- Number of atoms
- Number of diffusion steps
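The quadratic term is easy to check with back-of-envelope arithmetic on the pair representation alone, assuming bfloat16 (2 bytes per element) and the 128-channel pair width given earlier:

```python
def pair_memory_bytes(num_tokens: int, channels: int = 128, bytes_per_elem: int = 2) -> int:
    """Rough memory of one pair-representation copy in bfloat16.
    Illustrates the ~quadratic scaling in token count; the network
    holds several such buffers, so real usage is a multiple of this."""
    return num_tokens * num_tokens * channels * bytes_per_elem

# e.g. 5000 tokens -> 5000 * 5000 * 128 * 2 bytes = 6.4 GB per copy
```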
Error Handling
Common issues and diagnostics:
Out of Memory
Solutions:
- Reduce MSA depth: --num_msa=512
- Use smaller templates
- Split very large complexes
- Enable lower precision: --bfloat16=true
Missing Atoms
When atoms cannot be placed (e.g., unsupported ligand atoms), their coordinates are set to (0, 0, 0). Check logs for missing-atom warnings.
Low Confidence
If confidence metrics are low:
- Check MSA depth and quality
- Verify template relevance
- Consider multiple seeds for diversity
- Inspect PAE for specific interaction confidence
Next Steps
Data Pipeline
Learn how features are prepared before inference
Model Architecture
Deep dive into network components