
Providing Custom Structural Templates

Structural templates can be provided for protein chains to guide the structure prediction. Templates are specified as mmCIF files with explicit mappings between query and template residues.
Structural templates are only supported for protein chains. RNA and DNA chains do not support templates.

Overview

A structural template consists of:
  1. An mmCIF file containing a single protein chain
  2. A mapping from query residue indices to template residue indices
{
  "protein": {
    "id": "A",
    "sequence": "MQIFVKTLTGKTITLEVEPS",
    "templates": [
      {
        "mmcif": "data_template\n_entry.id template\n...",
        "queryIndices": [0, 1, 2, 4, 5, 6],
        "templateIndices": [0, 1, 2, 3, 4, 8]
      }
    ]
  }
}

Template Structure

Required Fields

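Each template entry requires the mmCIF data plus the queryIndices and templateIndices lists. The mmCIF can be given inline via the mmcif field or as a file path via the mmcifPath field. To provide the mmCIF content directly in the JSON: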
{
  "templates": [
    {
      "mmcif": "data_template\n_entry.id template\n...",
      "queryIndices": [0, 1, 2, 3],
      "templateIndices": [0, 1, 2, 3]
    }
  ]
}
The mmcif and mmcifPath fields are mutually exclusive. You must use one or the other, not both.
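The exclusivity rule can be checked before submitting an input file. The following is a minimal sketch, not part of AlphaFold 3 itself: validate_template_entry is a hypothetical helper, and it assumes a template entry is a plain dict loaded from the input JSON.

```python
def validate_template_entry(entry: dict) -> None:
    """Raise ValueError if a template entry breaks the rules described above.

    Hypothetical helper: checks only the field layout, not the mmCIF content.
    """
    has_inline = "mmcif" in entry
    has_path = "mmcifPath" in entry
    # Exactly one of the two mmCIF fields must be present.
    if has_inline == has_path:
        raise ValueError("Specify exactly one of 'mmcif' or 'mmcifPath'.")
    # Both index lists are required alongside the mmCIF.
    for field in ("queryIndices", "templateIndices"):
        if field not in entry:
            raise ValueError(f"Missing required field '{field}'.")


validate_template_entry({
    "mmcifPath": "templates/template1.cif",
    "queryIndices": [0, 1, 2, 3],
    "templateIndices": [0, 1, 2, 3],
})
```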

Query and Template Indices

The mapping between query and template residues is defined using two parallel lists:
  • queryIndices: 0-based indices in the query sequence
  • templateIndices: 0-based indices in the template sequence

Understanding the Mapping

Each position in queryIndices corresponds to the same position in templateIndices.
{
  "queryIndices":    [0, 1, 2, 3],
  "templateIndices": [0, 2, 5, 6]
}
This creates the mapping:
  • Query residue 0 → Template residue 0
  • Query residue 1 → Template residue 2
  • Query residue 2 → Template residue 5
  • Query residue 3 → Template residue 6
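The pairing above can be reproduced in a few lines of Python. This is an illustrative sketch only: build_query_to_template_map is a hypothetical name, and the duplicate check is an assumption about what makes a sensible mapping rather than a documented rule.

```python
def build_query_to_template_map(query_indices, template_indices):
    """Pair the two parallel lists into a {query_index: template_index} dict."""
    if len(query_indices) != len(template_indices):
        raise ValueError("queryIndices and templateIndices must have the same length.")
    # Assumption: each query residue is mapped at most once.
    if len(set(query_indices)) != len(query_indices):
        raise ValueError("queryIndices must not contain duplicates.")
    return dict(zip(query_indices, template_indices))


mapping = build_query_to_template_map([0, 1, 2, 3], [0, 2, 5, 6])
print(mapping)  # {0: 0, 1: 2, 2: 5, 3: 6}
```

A dict of this shape matches the query_to_template_map argument shown in the Code Reference section.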

Handling Unresolved Residues

mmCIF files can list residues in their residue tables without providing atom coordinates for them (unresolved residues). These residues must still be counted when specifying template indices. For example, PDB 8UXY has unresolved residues 1–20. To align query residues 0–3 with template residues 21–24 (1-based numbering, the first four resolved residues), use the 0-based template indices 20–23, because the unresolved residues 1–20 occupy indices 0–19:
{
  "queryIndices":    [0, 1, 2, 3],
  "templateIndices": [20, 21, 22, 23]
}
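The offset arithmetic can be sketched as follows. The helper name seq_ids_to_template_indices is hypothetical, and the sketch assumes the residue table is numbered contiguously starting at 1 with unresolved residues included; templates with gaps or non-contiguous numbering would need a lookup against the actual residue table instead.

```python
def seq_ids_to_template_indices(seq_ids, first_seq_id=1):
    """Convert 1-based residue numbers to 0-based template indices.

    Assumes contiguous numbering from first_seq_id, counting unresolved residues.
    """
    return [seq_id - first_seq_id for seq_id in seq_ids]


# Residues 21-24 follow the 20 unresolved residues 1-20.
template_indices = seq_ids_to_template_indices([21, 22, 23, 24])
print(template_indices)  # [20, 21, 22, 23]
```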

Multiple Templates

You can provide up to 20 templates per protein chain:
{
  "templates": [
    {
      "mmcifPath": "templates/template1.cif",
      "queryIndices": [0, 1, 2, 3],
      "templateIndices": [0, 1, 2, 3]
    },
    {
      "mmcifPath": "templates/template2.cif",
      "queryIndices": [5, 6, 7, 8],
      "templateIndices": [10, 11, 12, 13]
    }
  ]
}
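A pre-flight check for a list of templates might look like the sketch below. It assumes the 20-template limit stated above and that every query index must fall inside the query sequence; check_templates and MAX_TEMPLATES are hypothetical names, not AlphaFold 3 API.

```python
MAX_TEMPLATES = 20  # limit stated above


def check_templates(sequence: str, templates: list) -> None:
    """Raise ValueError if there are too many templates or a query index is out of range."""
    if len(templates) > MAX_TEMPLATES:
        raise ValueError(f"At most {MAX_TEMPLATES} templates are allowed per protein chain.")
    for number, template in enumerate(templates, start=1):
        for query_index in template["queryIndices"]:
            # Query indices are 0-based positions in the query sequence.
            if not 0 <= query_index < len(sequence):
                raise ValueError(
                    f"Template {number}: query index {query_index} is outside the sequence."
                )


check_templates(
    "MQIFVKTLTGKTITLEVEPS",
    [
        {"queryIndices": [0, 1, 2, 3], "templateIndices": [0, 1, 2, 3]},
        {"queryIndices": [5, 6, 7, 8], "templateIndices": [10, 11, 12, 13]},
    ],
)
```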

Template Modes

AlphaFold 3 supports several template usage modes: automatic template search, template-free prediction, and custom templates.
For automatic search, leave templates unset. AlphaFold 3 will then search for templates itself:
{
  "protein": {
    "id": "A",
    "sequence": "MQIFVKTLTGKTITLEVEPS"
  }
}
The data pipeline will run Hmmsearch to find structural templates.

mmCIF Requirements

The mmCIF file must contain exactly one protein chain. Multi-chain mmCIFs will cause an error.

Valid Single-Chain mmCIF

data_template
_entry.id template
#
loop_
_atom_site.group_PDB
_atom_site.id
_atom_site.label_asym_id
_atom_site.label_comp_id
_atom_site.label_seq_id
_atom_site.label_atom_id
_atom_site.Cartn_x
_atom_site.Cartn_y
_atom_site.Cartn_z
ATOM   1    A   MET   1    N     10.123  20.456  30.789
ATOM   2    A   MET   1    CA    11.234  21.567  31.890
...
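A rough way to spot multi-chain files before submission is to count the distinct label_asym_id values in the _atom_site loop. The sketch below is a deliberately naive parser written for illustration: it assumes whitespace-separated rows that start with ATOM or HETATM and contain no quoted values with spaces, and it counts every chain in the loop (including any ligand or water chains), not only protein chains. A real mmCIF library is the safer choice for production use.

```python
def atom_site_chain_ids(mmcif_text: str) -> set:
    """Collect the label_asym_id values found in the _atom_site loop (naive parser)."""
    columns = []
    chain_ids = set()
    for raw_line in mmcif_text.splitlines():
        line = raw_line.strip()
        if line.startswith("_atom_site."):
            # Column headers of the _atom_site loop, in order.
            columns.append(line.split()[0])
        elif columns and line.startswith(("ATOM", "HETATM")):
            fields = line.split()
            chain_ids.add(fields[columns.index("_atom_site.label_asym_id")])
    return chain_ids


sample = """data_template
_entry.id template
#
loop_
_atom_site.group_PDB
_atom_site.id
_atom_site.label_asym_id
_atom_site.label_comp_id
_atom_site.label_seq_id
_atom_site.label_atom_id
_atom_site.Cartn_x
_atom_site.Cartn_y
_atom_site.Cartn_z
ATOM   1    A   MET   1    N     10.123  20.456  30.789
ATOM   2    A   MET   1    CA    11.234  21.567  31.890
"""

chains = atom_site_chain_ids(sample)
print(chains)  # {'A'}
```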

Complete Examples

Example 1: Single Custom Template

{
  "name": "Protein with custom template",
  "modelSeeds": [42],
  "sequences": [
    {
      "protein": {
        "id": "A",
        "sequence": "MQIFVKTLTGKTITLEVEPS",
        "description": "Protein with one custom template",
        "templates": [
          {
            "mmcifPath": "templates/7bz5.cif",
            "queryIndices": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
            "templateIndices": [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
          }
        ]
      }
    }
  ],
  "dialect": "alphafold3",
  "version": 4
}

Example 2: Template-Free with Custom MSA

{
  "name": "Template-free prediction",
  "modelSeeds": [42],
  "sequences": [
    {
      "protein": {
        "id": "A",
        "sequence": "MQIFVKTLTGKTITLEVEPS",
        "description": "No templates, custom MSA",
        "unpairedMsa": ">query\nMQIFVKTLTGKTITLEVEPS\n>hit1\nMQIFVKTL-GKTITLEVEPS",
        "pairedMsa": "",
        "templates": []
      }
    }
  ],
  "dialect": "alphafold3",
  "version": 4
}

Example 3: Multiple Templates with Gaps

{
  "name": "Multiple templates",
  "modelSeeds": [42],
  "sequences": [
    {
      "protein": {
        "id": "A",
        "sequence": "MQIFVKTLTGKTITLEVEPSRDWHALE",
        "templates": [
          {
            "mmcifPath": "templates/template1.cif",
            "queryIndices": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
            "templateIndices": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
          },
          {
            "mmcifPath": "templates/template2.cif",
            "queryIndices": [15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26],
            "templateIndices": [5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16]
          }
        ]
      }
    }
  ],
  "dialect": "alphafold3",
  "version": 4
}

Advanced: Explicitly Setting Templates to Null

{
  "protein": {
    "id": "A",
    "sequence": "MQIFVKTLTGKTITLEVEPS",
    "templates": null
  }
}
Setting templates to null is equivalent to leaving the field unset: both trigger automatic template search.
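The three ways the templates field appears on this page (absent or null, an empty list, and a non-empty list) can be told apart with a small helper. This is a sketch under assumptions: template_mode and the returned labels are hypothetical names chosen here, and the empty-list case is read as template-free based on Example 2.

```python
def template_mode(protein: dict) -> str:
    """Classify how the templates field of a protein entry will be interpreted."""
    # A missing key and an explicit null (None in Python) are treated the same.
    templates = protein.get("templates")
    if templates is None:
        return "automatic search"
    if len(templates) == 0:
        return "template-free"
    return "custom templates"


print(template_mode({"id": "A", "sequence": "MQIFVKTLTGKTITLEVEPS"}))
print(template_mode({"id": "A", "sequence": "MQIFVKTLTGKTITLEVEPS", "templates": None}))
print(template_mode({"id": "A", "sequence": "MQIFVKTLTGKTITLEVEPS", "templates": []}))
```

The first two calls report automatic search and the third reports template-free.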

Code Reference

From folding_input.py:86-121:
class Template:
  """Structural template input."""

  def __init__(self, *, mmcif: str, query_to_template_map: Mapping[int, int]):
    """Initializes the template.

    Args:
      mmcif: The structural template in mmCIF format. The mmCIF should have
        only one protein chain.
      query_to_template_map: A mapping from query residue index to template
        residue index.
    """
    self._mmcif = mmcif
    self._query_to_template = tuple(query_to_template_map.items())
From folding_input.py:165-167:
templates: Sequence[Template] | None = None
# If None, this field is unset and must be filled in by the data
# pipeline before featurisation.
