Introduction
AlphaFold 3 represents a significant advancement in biomolecular structure prediction, capable of accurately predicting the structure and interactions of proteins, nucleic acids (DNA/RNA), ligands, ions, and modified residues. This implementation provides the inference pipeline for generating structure predictions.AlphaFold 3 builds upon the foundation of AlphaFold 2 and AlphaFold-Multimer, introducing a diffusion-based architecture that enables joint prediction of multiple biomolecular entity types.
Key Capabilities
AlphaFold 3 can predict structures for:- Proteins: Including post-translational modifications (PTMs)
- RNA and DNA: Including modified bases
- Ligands: Specified via CCD codes, SMILES, or custom CCD definitions
- Ions: Treated as ligands (e.g., MG, ZN, CA)
- Covalent Modifications: Bonds between entities (e.g., covalent ligands, glycans)
- Complexes: Multi-chain assemblies with heterogeneous entity types
Architecture Overview
The AlphaFold 3 architecture consists of three major components:Data Pipeline
Processes input sequences to generate:
- Multiple Sequence Alignments (MSAs) via genetic database searches
- Structural templates from the PDB
- Chemical component information for ligands
Neural Network
Transforms input features through:
- Input featurization and embedding
- Evoformer trunk for evolutionary and structural reasoning
- Pairformer for token-level representations
- Diffusion module for 3D coordinate generation
High-Level Workflow
Core Components
Input Processing
The system accepts JSON inputs defining:- Entity sequences (protein, RNA, DNA)
- Ligands (CCD codes or SMILES)
- Optional custom MSAs and templates
- Bonded atom pairs for covalent modifications
- Model seeds for reproducibility
src/alphafold3/common/folding_input.py
Token-Based Representation
AlphaFold 3 uses a token-based representation where:- Each token represents a residue (protein/nucleic acid) or an entire ligand
- Token features encode sequence, chemical, and structural information
- Pair representations capture relationships between tokens
Diffusion-Based Structure Generation
Unlike AlphaFold 2’s structure module, AlphaFold 3 uses a diffusion process:- Starts with noisy atomic coordinates
- Iteratively denoises through a learned reverse diffusion process
- Employs a diffusion transformer conditioned on evolutionary and structural information
- Generates multiple samples per seed for diversity
src/alphafold3/model/network/diffusion_head.py:30
The diffusion process uses SIGMA_DATA = 16.0 Å and applies random augmentation (rotation/translation) during training to improve generalization.
Model Configuration
The model architecture is configured through several key parameters:Execution Modes
The system supports flexible execution through two main flags:Data Pipeline Mode
Data Pipeline Mode
CPU-only operation that runs genetic searches and template finding. Can be executed separately on machines without GPUs. Controlled by
--run_data_pipeline=true.Key components:- Jackhmmer (protein MSA search against UniRef90, MGnify, Small BFD)
- Nhmmer (RNA/DNA MSA search)
- Hmmsearch (template search against PDB)
Inference Mode
Inference Mode
GPU-required operation that runs the neural network to generate structure predictions. Controlled by
--run_inference=true.The inference pipeline:- Loads model parameters from checkpoint
- Processes features through the network
- Runs diffusion sampling (default: 5 samples per seed)
- Computes confidence metrics
- Generates output CIF files and JSON confidences
Output Structure
For each prediction, AlphaFold 3 generates:- Structure files: mmCIF format containing predicted coordinates
- Confidence JSONs: Full arrays (pLDDT, PAE, contact probabilities)
- Summary JSONs: Scalar metrics (pTM, ipTM, ranking scores)
- Embeddings (optional): Token and pair embeddings for downstream tasks
- Distograms (optional): Distance distributions between tokens
Performance Characteristics
Memory Requirements: Structure prediction scales with system size. A 5,000-token input produces:
- ~6 GB embeddings (single + pair)
- ~3 GB distogram
- Activations require substantial GPU memory (24GB+ recommended)
- Data pipeline: CPU-intensive, hours for large proteins
- Inference: GPU-intensive, minutes per structure on modern GPUs
- Multiple seeds/samples increase runtime proportionally
Key Design Principles
- Unified Architecture: Single model handles all biomolecule types
- Token Abstraction: Flexible representation supporting diverse entities
- Diffusion-Based Sampling: Enables diverse, high-quality structure generation
- Confidence Estimation: Comprehensive metrics for prediction quality assessment
- Modular Pipeline: Separable data processing and inference stages
Use Cases
AlphaFold 3 excels at:- Protein-protein interactions
- Antibody-antigen complexes
- Protein-nucleic acid complexes
- Protein-ligand binding predictions
- Covalently modified proteins
- Custom ligand docking
Next Steps
Inference Pipeline
Learn how the inference pipeline processes inputs and generates predictions
Data Pipeline
Understand MSA generation and template search processes
Model Architecture
Deep dive into the neural network architecture