Overview
The run_alphafold.py script is the main entry point for running AlphaFold 3 structure predictions. It orchestrates the complete prediction pipeline, including data processing, MSA generation, template search, and model inference.
Main Functions
make_model_config
Creates a model configuration with customizable parameters.
Parameters:
- Flash attention implementation to use. Options: 'triton', 'cudnn', or 'xla'. Triton is fastest and requires Ampere GPUs or later.
- Number of diffusion samples to generate per seed.
- Number of recycling iterations during inference.
- Whether to return the final trunk single and pair embeddings. Embeddings are large float16 arrays with num_tokens * 384 + num_tokens * num_tokens * 128 elements in total.
- Whether to return the final distogram. The distogram is a large float16 array with num_tokens * num_tokens * 64 elements.
Returns:
- Configured model instance ready for inference.
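The size formulas above translate directly into memory estimates. The following is a minimal sketch, assuming 2 bytes per float16 element and no container overhead; the helper names are illustrative and are not part of run_alphafold.py.

```python
FLOAT16_BYTES = 2  # a float16 element occupies 2 bytes


def embeddings_num_bytes(num_tokens: int) -> int:
    """Approximate size of the single and pair embeddings, in bytes."""
    single_elements = num_tokens * 384
    pair_elements = num_tokens * num_tokens * 128
    return (single_elements + pair_elements) * FLOAT16_BYTES


def distogram_num_bytes(num_tokens: int) -> int:
    """Approximate size of the distogram, in bytes."""
    return num_tokens * num_tokens * 64 * FLOAT16_BYTES


for num_tokens in (256, 1024, 4096):
    emb_mib = embeddings_num_bytes(num_tokens) / 2**20
    dist_mib = distogram_num_bytes(num_tokens) / 2**20
    print(f"{num_tokens} tokens: embeddings ~{emb_mib:.1f} MiB, "
          f"distogram ~{dist_mib:.1f} MiB")
```

Both arrays grow quadratically with the token count, so it is worth estimating disk usage before enabling either output for large inputs.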
predict_structure
Runs the full inference pipeline to predict structures for each seed.
Parameters:
- The input containing chains, sequences, MSAs, and templates.
- The model runner instance for executing predictions.
- Token bucket sizes for compilation caching. If None, an appropriate bucket is calculated from the token count.
- Maximum date for using CCD model coordinates as fallback.
- Maximum iterations for RDKit conformer search.
- Whether to deduplicate unpaired MSA against paired MSA.
Returns:
- List of results for each seed, containing inference results and the full fold input.
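Bucketing pads each input to one of a small set of sizes so that a compiled model can be reused across inputs of similar length. The sketch below illustrates one plausible selection rule: pick the smallest bucket that fits the token count. The function name, the example bucket values, and the fallback for inputs larger than every bucket are assumptions for illustration, not the script's actual implementation.

```python
import bisect
from typing import Sequence


def pick_bucket(num_tokens: int, buckets: Sequence[int]) -> int:
    """Returns the smallest bucket that can hold num_tokens tokens."""
    sorted_buckets = sorted(buckets)
    # Index of the first bucket whose size is >= num_tokens.
    idx = bisect.bisect_left(sorted_buckets, num_tokens)
    if idx == len(sorted_buckets):
        # Assumed fallback: no bucket is large enough, so use the exact size.
        return num_tokens
    return sorted_buckets[idx]


example_buckets = (256, 512, 768, 1024)  # hypothetical bucket sizes
for num_tokens in (100, 512, 600, 2000):
    print(num_tokens, "->", pick_bucket(num_tokens, example_buckets))
```

Fewer, coarser buckets mean fewer recompilations at the cost of more padding per input.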
process_fold_input
Runs the data pipeline and/or inference on a single fold input.
Parameters:
- Fold input to process.
- Data pipeline config to use. If None, the data pipeline is skipped.
- Model runner to use. If None, inference is skipped.
- Output directory to write results to.
- If True, use the existing output directory even if it is non-empty. If False, create a timestamped directory.
- If True, compress large output files (mmCIF and confidences JSON) using zstandard.
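The output-directory behaviour described above can be sketched as follows. This is a minimal illustration, assuming that a timestamped directory is only needed when the requested directory already contains files; the function name and the timestamp suffix format are hypothetical.

```python
import datetime
import os
import tempfile


def resolve_output_dir(output_dir: str, force_output_dir: bool) -> str:
    """Returns the directory that results should be written to."""
    is_non_empty = os.path.isdir(output_dir) and bool(os.listdir(output_dir))
    if force_output_dir or not is_non_empty:
        return output_dir
    # Assumed naming scheme: append a timestamp so old results are kept.
    timestamp = datetime.datetime.now().strftime("%Y%m%d_%H%M%S")
    return f"{output_dir}_{timestamp}"


with tempfile.TemporaryDirectory() as tmp:
    # Empty directory: reused as-is.
    print(resolve_output_dir(tmp, force_output_dir=False))
    open(os.path.join(tmp, "old_result.txt"), "w").close()
    # Non-empty directory without forcing: a timestamped sibling is chosen.
    print(resolve_output_dir(tmp, force_output_dir=False))
    # Non-empty directory with forcing: reused, existing files may be overwritten.
    print(resolve_output_dir(tmp, force_output_dir=True))
```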
ModelRunner Class
Helper class to run structure prediction stages.
Constructor
- Model configuration.
- JAX device to run inference on (e.g., GPU).
- Path to directory containing model parameters.
Methods
run_inference
extract_inference_results
extract_embeddings
extract_distogram
ResultsForSeed
Dataclass storing inference results for a single seed.
Fields:
- The random seed used to generate the samples.
- The inference results, one per diffusion sample.
- The fold input including MSA and templates from data pipeline.
- The final trunk single and pair embeddings, if requested.
- The token distance histogram, if requested.
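To make the shape of this record concrete, here is a minimal sketch of a dataclass with the five fields described above. The class name, field names, and types are assumptions chosen for illustration; the real definition in run_alphafold.py may differ.

```python
import dataclasses
from typing import Any, Optional, Sequence


@dataclasses.dataclass(frozen=True)
class ResultsForSeedSketch:
    """Illustrative stand-in for the per-seed results record."""

    seed: int  # random seed used to generate the samples
    inference_results: Sequence[Any]  # one entry per diffusion sample
    full_fold_input: Any  # fold input with MSAs and templates filled in
    embeddings: Optional[Any] = None  # only populated if requested
    distogram: Optional[Any] = None  # only populated if requested


result = ResultsForSeedSketch(
    seed=42,
    inference_results=["sample_0", "sample_1"],  # placeholder values
    full_fold_input={"name": "example"},  # placeholder value
)
print(result.seed, len(result.inference_results), result.embeddings is None)
```

Making the optional outputs default to None lets callers check for their presence without special-casing runs where embeddings or distograms were not requested.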
Command Line Flags
Input/Output
- --json_path: Path to input JSON file
- --input_dir: Path to directory containing input JSON files
- --output_dir: Path to output directory (required)
- --model_dir: Path to model directory (default: ~/models)
Pipeline Control
- --run_data_pipeline: Whether to run data pipeline (default: True)
- --run_inference: Whether to run inference (default: True)
Database Paths
- --db_dir: Database directory path (can be specified multiple times)
- --small_bfd_database_path: Small BFD database path
- --mgnify_database_path: MGnify database path
- --uniref90_database_path: UniRef90 database path
- --uniprot_cluster_annot_database_path: UniProt database path
- --ntrna_database_path: NT-RNA database path
- --rfam_database_path: Rfam database path
- --rna_central_database_path: RNAcentral database path
- --pdb_database_path: PDB mmCIF files directory
- --seqres_database_path: PDB sequence database path
Performance Tuning
- --num_recycles: Number of recycles (default: 10)
- --num_diffusion_samples: Number of diffusion samples (default: 5)
- --num_seeds: Number of seeds to generate
- --gpu_device: GPU device index (default: 0)
- --flash_attention_implementation: Flash attention type: triton, cudnn, or xla (default: triton)
- --buckets: Token bucket sizes for compilation caching
Output Control
- --save_embeddings: Save final embeddings (default: False)
- --save_distogram: Save distogram (default: False)
- --compress_large_output_files: Compress output files (default: False)
- --force_output_dir: Use existing output directory (default: False)
Usage Example
Command Line Usage
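As a hedged illustration, the sketch below assembles a command line from flags documented above and prints it without running anything. The file paths are placeholders, the helper name is hypothetical, and the boolean syntax (--flag=true / --flag=false) assumes absl-style flag parsing.

```python
import shlex


def build_command(
    json_path: str,
    output_dir: str,
    model_dir: str,
    *,
    num_recycles: int = 10,
    num_diffusion_samples: int = 5,
    run_data_pipeline: bool = True,
    run_inference: bool = True,
) -> list[str]:
    """Builds an argument list for invoking run_alphafold.py."""
    return [
        "python",
        "run_alphafold.py",
        f"--json_path={json_path}",
        f"--output_dir={output_dir}",
        f"--model_dir={model_dir}",
        f"--num_recycles={num_recycles}",
        f"--num_diffusion_samples={num_diffusion_samples}",
        # Assumed boolean syntax; adjust if the flag parser differs.
        f"--run_data_pipeline={str(run_data_pipeline).lower()}",
        f"--run_inference={str(run_inference).lower()}",
    ]


command = build_command(
    json_path="/path/to/fold_input.json",  # placeholder path
    output_dir="/path/to/output",  # placeholder path
    model_dir="/path/to/models",  # placeholder path
    run_data_pipeline=False,  # e.g. MSAs and templates already computed
)
# shlex.join quotes arguments so the line can be pasted into a shell.
print(shlex.join(command))
```

Passing the list to subprocess.run(command, check=True) would launch the prediction; building the list first makes it easy to log or inspect the exact invocation.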
See Also
- Input Dataclass - Structure of fold inputs
- Model Class - Core model architecture
- DataPipeline Class - MSA and template processing