Specifying Covalent Bonds
AlphaFold 3 supports explicit specification of covalent bonds between atoms across different entities. This is essential for modeling covalent ligands, glycans, and other covalently linked molecules.
Overview
Covalent bonds are defined in the bondedAtomPairs field as pairs of atoms, where each atom is uniquely identified by:
- Chain ID (entity ID)
- Residue ID (1-based position within the chain)
- Atom Name (unique name within the residue)
All bonds specified in bondedAtomPairs are implicitly covalent bonds. Other bond types are not currently supported.
Bond Specification Format
Atom Identification
Each atom is specified as a three-element tuple of chain ID, residue ID, and atom name.
Chain ID (String)
The entity identifier from the sequences section:
- Must be an uppercase letter
- Example: "A", "B", "L"
Residue ID (Integer)
1-based position within the chain:
- For proteins/RNA/DNA: position in the sequence (1 = first residue)
- For single-component ligands: always 1
- For multi-component ligands: component position (1, 2, 3, …)
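The identification rules above can be sketched as a small format check. This is a minimal illustration, not AlphaFold 3's own validation: the helper name `check_atom` and its messages are hypothetical, and it only tests the shape of the tuple, not whether the chain, residue, or atom actually exists in the input.

```python
def check_atom(atom):
    """Return a list of format problems for one [chain_id, residue_id, atom_name] entry.

    Hypothetical helper for illustration; it is not part of AlphaFold 3.
    """
    if not isinstance(atom, (list, tuple)) or len(atom) != 3:
        return ["atom must have exactly three elements"]

    chain_id, residue_id, atom_name = atom
    problems = []

    # Chain ID: an uppercase (ASCII) letter string, as described above.
    if not (isinstance(chain_id, str) and chain_id.isascii()
            and chain_id.isalpha() and chain_id.isupper()):
        problems.append("chain ID must be an uppercase letter")

    # Residue ID: 1-based integer (bool is excluded because it subclasses int).
    if isinstance(residue_id, bool) or not isinstance(residue_id, int) or residue_id < 1:
        problems.append("residue ID must be an integer >= 1")

    # Atom name: non-empty string.
    if not (isinstance(atom_name, str) and atom_name):
        problems.append("atom name must be a non-empty string")

    return problems


protein_atom = ["A", 145, "SG"]   # polymer: position in the sequence
ligand_atom = ["L", 1, "C04"]     # single-component ligand: residue ID is always 1
glycan_atom = ["I", 2, "C1"]      # multi-component ligand: second component

for atom in (protein_atom, ligand_atom, glycan_atom):
    print(atom, check_atom(atom) or "ok")
```

An empty list means the tuple is well-formed; a residue ID of 0, for instance, is reported because positions are 1-based.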
Bond Pair Format
A bond is an array of two atoms, each given as a three-element tuple.
Use Cases
Use Case 1: Covalent Ligands
Binding a ligand to a protein residue, for example with a bond between:
- Chain A, residue 145 (cysteine), atom SG (sulfur)
- Chain L, residue 1 (heme), atom FE (iron)
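The bond described above could be written into an input file roughly as follows. This is a sketch under stated assumptions: the protein sequence is a placeholder built only so that residue 145 is a cysteine, the job name and seed are arbitrary, and the top-level fields other than `sequences` and `bondedAtomPairs` reflect my understanding of the AlphaFold 3 JSON input format rather than anything stated in this section. `HEM` is the CCD code for heme.

```python
import json

# Placeholder sequence (assumption): 144 glycines followed by a cysteine,
# so that the 1-based position 145 is a Cys carrying the SG atom.
protein_sequence = "G" * 144 + "C"
assert protein_sequence[145 - 1] == "C"

fold_input = {
    "name": "covalent_heme_example",      # arbitrary job name
    "modelSeeds": [1],                    # arbitrary seed
    "sequences": [
        {"protein": {"id": "A", "sequence": protein_sequence}},
        {"ligand": {"id": "L", "ccdCodes": ["HEM"]}},
    ],
    "bondedAtomPairs": [
        # Chain A, residue 145, atom SG  <->  Chain L, residue 1, atom FE
        [["A", 145, "SG"], ["L", 1, "FE"]],
    ],
    "dialect": "alphafold3",
    "version": 1,
}

print(json.dumps(fold_input, indent=2))
```

The ligand is declared with `ccdCodes` rather than SMILES, so its atoms carry the CCD atom names that the bond refers to.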
Use Case 2: Glycans
Defining multi-component glycans with internal bonds. A glycan can be linear or branched. For example, a glycan attached to a protein is described by a chain of bonds:
- Protein Asn8 → First NAG
- First NAG → Second NAG
- Second NAG → BMA
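The three bonds listed above can be sketched as a multi-component ligand plus its `bondedAtomPairs` entries. The chain IDs (`A` for the protein, `G` for the glycan) are illustrative, and the atom names are assumptions: `ND2` on asparagine bonded to `C1` of the first sugar, and `O4` to `C1` between sugars, are the linkages typical of an N-glycan core, but the correct names should be confirmed against the CCD entries for the residues actually used.

```python
# Glycan as one multi-component ligand: component positions are 1, 2, 3.
glycan_codes = ["NAG", "NAG", "BMA"]
sequences_fragment = [
    {"ligand": {"id": "G", "ccdCodes": glycan_codes}},
]

bonded_atom_pairs = [
    [["A", 8, "ND2"], ["G", 1, "C1"]],   # Protein Asn8 -> first NAG
    [["G", 1, "O4"], ["G", 2, "C1"]],    # First NAG   -> second NAG
    [["G", 2, "O4"], ["G", 3, "C1"]],    # Second NAG  -> BMA
]

# Sanity check: every residue ID on the glycan chain must point at an
# existing component (1-based).
for atom_a, atom_b in bonded_atom_pairs:
    for chain_id, residue_id, _atom_name in (atom_a, atom_b):
        if chain_id == "G":
            assert 1 <= residue_id <= len(glycan_codes)

print(len(bonded_atom_pairs), "bonds defined for a", len(glycan_codes), "component glycan")
```

A branched glycan would follow the same pattern, with one component appearing as the first atom's residue in more than one bond.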
Use Case 3: Disulfide Bonds
Defining disulfide bridges between cysteines.
Use Case 4: Cross-Chain Bonds
Bonds between different chains.
Restrictions
SMILES Ligands Cannot Be Bonded
Polymer-Polymer Bonds Not Supported
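The two restrictions named above could be checked before submitting an input, as in the following sketch. The function `restriction_errors` is hypothetical and is not AlphaFold 3's own validation; it assumes that each `sequences` entry is a single-key object (`protein`, `rna`, `dna`, or `ligand`) and that a SMILES-defined ligand carries a `smiles` field.

```python
POLYMER_TYPES = {"protein", "rna", "dna"}


def restriction_errors(sequences, bonded_atom_pairs):
    """Return messages for bonds that touch SMILES ligands or join two polymers.

    Hypothetical pre-check for illustration only.
    """
    kinds = {}             # chain ID -> entity type
    smiles_chains = set()  # chain IDs of ligands defined by SMILES
    for entry in sequences:
        (kind, body), = entry.items()  # each entry has exactly one key
        ids = body["id"] if isinstance(body["id"], list) else [body["id"]]
        for chain_id in ids:
            kinds[chain_id] = kind
            if kind == "ligand" and "smiles" in body:
                smiles_chains.add(chain_id)

    errors = []
    for (chain_a, _, _), (chain_b, _, _) in bonded_atom_pairs:
        for chain_id in sorted({chain_a, chain_b}):
            if chain_id in smiles_chains:
                errors.append(f"chain {chain_id}: SMILES ligands cannot be bonded")
        if kinds.get(chain_a) in POLYMER_TYPES and kinds.get(chain_b) in POLYMER_TYPES:
            errors.append(f"chains {chain_a}/{chain_b}: polymer-polymer bonds not supported")
    return errors


sequences = [
    {"protein": {"id": "A", "sequence": "MKCA"}},
    {"protein": {"id": "B", "sequence": "GCSA"}},
    {"ligand": {"id": "H", "ccdCodes": ["HEM"]}},
    {"ligand": {"id": "S", "smiles": "CCO"}},
]
bonds = [
    [["A", 3, "SG"], ["H", 1, "FE"]],   # protein to CCD ligand: allowed
    [["A", 3, "SG"], ["S", 1, "C1"]],   # touches a SMILES ligand
    [["A", 3, "SG"], ["B", 2, "SG"]],   # polymer to polymer
]
for message in restriction_errors(sequences, bonds):
    print(message)
```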
Finding Atom Names
For Standard Residues
Use the RCSB PDB Chemical Component Dictionary (CCD): navigate to the CCD entry for the residue and look up its atom names.
For Custom Ligands
Atom names are defined in your user-provided CCD. Use the atom_id values from it in your bonds.
Complete Example: N-Glycosylated Protein
Validation
Invalid bonds will cause the input to be rejected with a clear error message.
Visual Representation
- First bond: Chain A, residue 145, atom SG ↔ Chain L, residue 1, atom C04
  - Cross-chain bond (protein to ligand)
  - Covalent ligand attachment
- Second bond: Chain I, residue 1, atom O6 ↔ Chain I, residue 2, atom C1
  - Within-chain bond
  - Connects two components of a multi-component ligand
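Written out as data, the two bonds described above would look like the list below. The `describe` helper is hypothetical and only illustrates how the cross-chain versus within-chain distinction follows from comparing the two chain IDs.

```python
bonded_atom_pairs = [
    [["A", 145, "SG"], ["L", 1, "C04"]],  # first bond
    [["I", 1, "O6"], ["I", 2, "C1"]],     # second bond
]


def describe(bond):
    """Summarize one bond and say whether it stays within a single chain."""
    (chain_a, res_a, atom_a), (chain_b, res_b, atom_b) = bond
    scope = "within-chain" if chain_a == chain_b else "cross-chain"
    return f"{chain_a}:{res_a}:{atom_a} <-> {chain_b}:{res_b}:{atom_b} ({scope})"


for bond in bonded_atom_pairs:
    print(describe(bond))
```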
Code Reference
From folding_input.py:958-959:
folding_input.py:1189-1234: