The Bio.motifs module provides tools for analyzing and manipulating sequence motifs, including reading motif files, calculating position weight matrices (PWMs), and searching sequences for motif occurrences.

Main Functions

create()

Create a Motif object from sequence instances.
from Bio import motifs

motif = motifs.create(
    ["TACAA", "TACGC", "TACAC", "TACCC"],
    alphabet="ACGT"
)
print(motif.consensus)
Parameters:
  • instances: List of sequence strings or SeqRecord objects
  • alphabet (str, optional): Alphabet to use (default: "ACGT")
Returns: Motif object
Source: Bio/motifs/__init__.py:34
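To make the idea of "instances" concrete, here is a minimal pure-Python sketch of tallying per-position letter counts and picking a majority consensus. The helper names count_matrix and consensus are hypothetical. This is not Biopython's implementation, and the tie-breaking rule (first letter in the alphabet wins) is an assumption of the sketch.

```python
from collections import Counter


def count_matrix(instances, alphabet="ACGT"):
    """Return {letter: [count at each position]} for equal-length instances."""
    length = len(instances[0])
    if any(len(seq) != length for seq in instances):
        raise ValueError("all instances must have the same length")
    # One Counter per column of the stacked instances
    columns = [Counter(seq[i] for seq in instances) for i in range(length)]
    return {letter: [col[letter] for col in columns] for letter in alphabet}


def consensus(counts):
    """Pick the most frequent letter per column (ties: first letter in the alphabet)."""
    length = len(next(iter(counts.values())))
    return "".join(
        max(counts, key=lambda letter: counts[letter][i]) for i in range(length)
    )


counts = count_matrix(["TACAA", "TACGC", "TACAC", "TACCC"])
print(counts["A"])        # counts of "A" at each of the five positions
print(consensus(counts))  # majority letter at each position
```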

parse()

Parse an output file from a motif finding program.
from Bio import motifs

with open("motifs.txt") as handle:
    for motif in motifs.parse(handle, "meme"):
        print(motif.consensus)
Parameters:
  • handle: File handle to parse
  • fmt (str): Format name (case-insensitive)
  • strict (bool, optional): Enforce strict format compliance (default: True)
Returns: List of Motif objects
Supported Formats:
  • alignace: AlignAce output
  • clusterbuster: Cluster Buster position frequency matrix
  • jaspar: JASPAR multiple PFM format
  • meme: MEME output
  • minimal: MINIMAL MEME output
  • mast: MAST output
  • pfm: JASPAR-style position-frequency matrix
  • pfm-four-columns: Generic 4-column PFM format
  • pfm-four-rows: Generic 4-row PFM format
  • sites: JASPAR-style sites file
  • transfac: TRANSFAC database format
  • xms: XMS matrix format
Source: Bio/motifs/__init__.py:40

read()

Read a single motif from a file.
from Bio import motifs

with open("single_motif.pfm") as handle:
    motif = motifs.read(handle, "pfm")
    print(motif.consensus)
Parameters:
  • handle: File handle to read
  • fmt (str): Format name (case-insensitive)
  • strict (bool, optional): Enforce strict format compliance (default: True)
Returns: Single Motif object
Raises: ValueError if the file contains no motif or more than one motif
Source: Bio/motifs/__init__.py:129

write()

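Return a string representation of the motifs in the specified format.
from Bio import motifs

# Two small example motifs to serialize
motif1 = motifs.create(["TACAA", "TACGC"])
motif2 = motifs.create(["AACGC", "ACCGC"])
motifs_list = [motif1, motif2]
output = motifs.write(motifs_list, "transfac")
print(output)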
Parameters:
  • motifs: List of Motif objects
  • fmt (str): Output format ("clusterbuster", "pfm", "jaspar", "transfac")
  • **kwargs: Format-specific keyword arguments
Returns: String representation
Source: Bio/motifs/__init__.py:615

Motif Class

Motif

A class representing sequence motifs.
from Bio import motifs

motif = motifs.create(["AACGCCA", "ACCGCCC", "AACTCCG"])
print(motif.consensus)
print(motif.pwm)
Attributes:
  • name (str): Motif name
  • alphabet (str): Alphabet used
  • length (int): Length of motif
  • counts: FrequencyPositionMatrix object with nucleotide counts
  • alignment: Alignment object with sequences
  • pseudocounts (dict): Pseudocounts for each letter
  • background (dict): Background frequencies
  • mask: Mask for motif positions
Properties:
  • pwm: Position Weight Matrix (normalized frequencies)
  • pssm: Position-Specific Scoring Matrix (log-odds)
  • consensus (Seq): Consensus sequence
  • anticonsensus (Seq): Anticonsensus sequence
  • degenerate_consensus (Seq): Degenerate consensus using IUPAC codes
  • relative_entropy (array): Information content per position
Methods:
  • reverse_complement(): Return reverse complement of motif
  • weblogo(fname, fmt): Download and save weblogo image
  • format(format_spec): Format motif for output
Source: Bio/motifs/__init__.py:187
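The relative_entropy property reports information content per position. As a rough illustration of the quantity involved, the sketch below computes the sum over letters of p × log2(p / background) for each column of a frequency matrix. The function name and the toy matrix are hypothetical, and the code is an independent sketch rather than the library's own.

```python
import math


def relative_entropy(pwm, background):
    """Per-position relative entropy in bits: sum of p * log2(p / b) over letters."""
    length = len(next(iter(pwm.values())))
    result = []
    for i in range(length):
        total = 0.0
        for letter, probs in pwm.items():
            p = probs[i]
            if p > 0:  # a zero frequency contributes nothing to the sum
                total += p * math.log2(p / background[letter])
        result.append(total)
    return result


# Toy matrix: position 0 is always "T", position 1 is uniform
toy_pwm = {
    "A": [0.0, 0.25],
    "C": [0.0, 0.25],
    "G": [0.0, 0.25],
    "T": [1.0, 0.25],
}
uniform = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
entropy = relative_entropy(toy_pwm, uniform)
print(entropy)
```

A fully conserved DNA position carries 2 bits against a uniform background, while a position that matches the background carries none.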

Position Weight Matrix (PWM)

Normalized frequency matrix.
motif = motifs.create(["TACGC", "TACCC"])
pwm = motif.pwm
print(pwm)
Methods:
  • log_odds(background): Calculate the PSSM (log-odds matrix) from the PWM
Sequence scoring (search, calculate) is provided by the PSSM rather than the PWM.
Source: Bio/motifs/matrix.py
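As an illustration of what "normalized frequencies" means, the following sketch turns a count matrix into column-wise frequencies, optionally adding a uniform pseudocount to every cell. The normalize helper is hypothetical and assumes the same pseudocount for each letter. It is a sketch of the general technique, not the library's code.

```python
def normalize(counts, pseudocount=0.0):
    """Convert {letter: [counts]} into {letter: [frequencies]} column by column."""
    letters = list(counts)
    length = len(counts[letters[0]])
    pwm = {letter: [] for letter in letters}
    for i in range(length):
        # Column total including the pseudocount added to every letter
        total = sum(counts[letter][i] + pseudocount for letter in letters)
        for letter in letters:
            pwm[letter].append((counts[letter][i] + pseudocount) / total)
    return pwm


# Counts for the two instances "TACGC" and "TACCC"
counts = {
    "A": [0, 2, 0, 0, 0],
    "C": [0, 0, 2, 1, 2],
    "G": [0, 0, 0, 1, 0],
    "T": [2, 0, 0, 0, 0],
}
plain = normalize(counts)
smoothed = normalize(counts, pseudocount=0.5)
print(plain["C"][3], smoothed["T"][0])
```

With a pseudocount, no cell is exactly zero, which keeps the later log-odds step finite.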

Position-Specific Scoring Matrix (PSSM)

Log-odds scoring matrix.
motif = motifs.create(["TACGC", "TACCC"])
motif.pseudocounts = 0.5
motif.background = {'A': 0.3, 'C': 0.2, 'G': 0.2, 'T': 0.3}
pssm = motif.pssm
print(pssm)
Methods:
  • search(sequence, threshold): Search sequence for matches scoring above the threshold
  • calculate(sequence): Calculate PSSM score for sequence
  • max (property): Maximum possible score
  • min (property): Minimum possible score
  • mean(background): Expected value of the motif score
  • std(background): Standard deviation of the motif score
Source: Bio/motifs/matrix.py
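The log-odds transformation can be sketched in a few lines: each PSSM cell is log2 of the observed frequency divided by the background frequency. The log_odds function below is a standalone, hypothetical helper written for illustration. Returning negative infinity for zero frequencies is an assumption of the sketch.

```python
import math


def log_odds(pwm, background):
    """Return {letter: [log2(p / b)]} for a frequency matrix and a background."""
    pssm = {}
    for letter, probs in pwm.items():
        b = background[letter]
        pssm[letter] = [
            math.log2(p / b) if p > 0 else float("-inf") for p in probs
        ]
    return pssm


# Toy two-column frequency matrix (already smoothed, so no zeros)
toy_pwm = {
    "A": [0.125, 0.625],
    "C": [0.125, 0.125],
    "G": [0.125, 0.125],
    "T": [0.625, 0.125],
}
uniform = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
toy_pssm = log_odds(toy_pwm, uniform)
print(toy_pssm["T"][0], toy_pssm["A"][0])
```

Positive values mark letters that are enriched relative to the background, and negative values mark depleted ones.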

Searching Sequences

from Bio import motifs
from Bio.Seq import Seq

# Create motif
motif = motifs.create(["TACGC", "TACCC", "TACAC"])

# Set parameters
motif.pseudocounts = 0.5
motif.background = {'A': 0.25, 'C': 0.25, 'G': 0.25, 'T': 0.25}

# Search using PSSM
sequence = Seq("ATACGCTACCCTAGGGG")
for position, score in motif.pssm.search(sequence, threshold=5.0):
    print(f"Position {position}: score {score:.2f}")
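Conceptually, a PSSM search slides a window of the motif's length along the sequence and sums the matrix entries for the letters in each window. The sketch below shows that idea for the forward strand only. The scan_forward name, the toy matrix, and the "at or above the threshold" rule are assumptions of this sketch, not the library's implementation.

```python
def scan_forward(pssm, sequence, threshold):
    """Yield (start, score) for forward-strand windows scoring at or above threshold."""
    length = len(next(iter(pssm.values())))
    for start in range(len(sequence) - length + 1):
        window = sequence[start:start + length]
        # Sum the matrix entry for each letter at its position in the window
        score = sum(pssm[letter][i] for i, letter in enumerate(window))
        if score >= threshold:
            yield start, score


# Toy two-column matrix that favours the dinucleotide "TA"
toy_pssm = {
    "A": [-1.0, 2.0],
    "C": [-1.0, -1.0],
    "G": [-1.0, -1.0],
    "T": [2.0, -1.0],
}
hits = list(scan_forward(toy_pssm, "GTACTA", threshold=3.0))
print(hits)
```

The library's search can also report hits on the reverse strand, which appear as negative positions. This sketch does not cover that case.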

Setting Pseudocounts and Background

from Bio import motifs

motif = motifs.create(["TACGC", "TACCC"])

# Set pseudocounts (uniform)
motif.pseudocounts = 0.5

# Set pseudocounts (per-letter)
motif.pseudocounts = {'A': 0.6, 'C': 0.4, 'G': 0.4, 'T': 0.6}

# Set background (GC content for DNA)
motif.background = 0.5  # 50% GC

# Set background (per-letter)
motif.background = {'A': 0.3, 'C': 0.2, 'G': 0.2, 'T': 0.3}
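When the background is given as a single GC fraction, the per-letter frequencies follow by splitting that fraction evenly between C and G and the remainder evenly between A and T. The helper below is a hypothetical, standalone sketch of that arithmetic.

```python
def background_from_gc(gc_content):
    """Expand a GC fraction into per-letter DNA background frequencies."""
    if not 0.0 <= gc_content <= 1.0:
        raise ValueError("gc_content must be between 0 and 1")
    at = (1.0 - gc_content) / 2.0  # share for each of A and T
    gc = gc_content / 2.0          # share for each of C and G
    return {"A": at, "C": gc, "G": gc, "T": at}


uniform = background_from_gc(0.5)
at_rich = background_from_gc(0.4)
print(uniform)
print(at_rich)
```

A GC fraction of 0.4 corresponds to the per-letter background used above, with A and T near 0.3 and C and G near 0.2.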

Example

from Bio import motifs
from Bio.Seq import Seq

# Create motif from sequences
instances = [
    "TACGC",
    "TACCC",
    "TACAC",
    "TAGGC",
    "TACGA"
]

motif = motifs.create(instances, alphabet="ACGT")
print(f"Consensus: {motif.consensus}")
print(f"Length: {motif.length}")
print("\nCounts:")
print(motif.counts)

# Set parameters for scoring
motif.pseudocounts = 0.5
motif.background = {'A': 0.25, 'C': 0.25, 'G': 0.25, 'T': 0.25}

# Get PWM and PSSM
print("\nPWM:")
print(motif.pwm)
print("\nPSSM:")
print(motif.pssm)

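# Search for motif in sequence
sequence = Seq("ATACGCTACCCTAGGGGCGTATACGA")
print(f"\nSearching in: {sequence}")
for position, score in motif.pssm.search(sequence, threshold=3.0):
    # A negative position marks a reverse-strand hit; convert it to a forward-strand offset
    start = position if position >= 0 else position + len(sequence)
    # For reverse-strand hits, subseq is the forward-strand text under the match
    subseq = sequence[start:start + motif.length]
    print(f"Position {position}: {subseq} (score {score:.2f})")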

# Get reverse complement
rc_motif = motif.reverse_complement()
print(f"\nReverse complement consensus: {rc_motif.consensus}")
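For intuition about what taking the reverse complement of a motif involves, the sketch below applies it to a plain count matrix: the column order is reversed and each letter's row is swapped with that of its complement. The function name and the toy counts are hypothetical. This illustrates the idea and is not the library's method.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement_counts(counts):
    """Reverse the columns and swap each letter's row with its complement's row."""
    return {
        letter: list(reversed(counts[COMPLEMENT[letter]])) for letter in counts
    }


# Counts for the single instance "AC"; its reverse complement is "GT"
counts = {"A": [1, 0], "C": [0, 1], "G": [0, 0], "T": [0, 0]}
rc_counts = reverse_complement_counts(counts)
print(rc_counts)
```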
