Quick Installation

The simplest way to get started with TRIFID is to install it via pip directly from GitHub:
pip install git+https://github.com/fpozoc/trifid.git
This method is ideal if you want to:
  • Use TRIFID’s preprocessing modules (QSplice, Pfam effects)
  • Load and analyze pre-computed predictions
  • Integrate TRIFID into your analysis pipeline

Development Installation

For development work or to reproduce the complete TRIFID methodology from scratch, follow these steps:
1. Install Conda/Mamba

First, ensure you have Miniconda or Anaconda with mamba installed. If not, install Miniconda:
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh -b -p $HOME/miniconda3
Install mamba for faster dependency resolution:
conda install -c conda-forge mamba
2. Clone the Repository

Clone the TRIFID repository and navigate to the directory:
git clone git@github.com:fpozoc/trifid.git
cd trifid
3. Create Environment

Create the conda environment from the provided configuration:
mamba env create -f environment.yml
conda activate trifid
4. Install Pre-commit Hooks

Set up development tools and pre-commit hooks:
pre-commit install

Verify Installation

Test that everything is working correctly:
# Run pre-commit checks
pre-commit run --all-files

# Run tests
pytest -v

System Requirements

Minimum Requirements:
  • Python 3.7+
  • 8GB RAM (16GB recommended for genome-wide analysis)
  • Linux or macOS (Windows via WSL)
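
The requirements above can be checked from Python before installing. The following is a minimal sketch using only the standard library; the function names are illustrative and not part of TRIFID, and the memory probe relies on `os.sysconf`, which is not available on every platform (in that case it reports `None`).

```python
import os
import platform
import sys

MIN_PYTHON = (3, 7)   # minimum Python version listed above
MIN_RAM_GB = 8        # minimum RAM listed above


def total_ram_gb():
    """Return total physical memory in GiB, or None if it cannot be determined."""
    try:
        page_size = os.sysconf("SC_PAGE_SIZE")
        page_count = os.sysconf("SC_PHYS_PAGES")
    except (AttributeError, ValueError, OSError):
        # os.sysconf is missing (e.g. native Windows) or the name is unsupported.
        return None
    return page_size * page_count / 1024**3


def check_requirements():
    """Compare the running system against the minimum requirements."""
    ram = total_ram_gb()
    return {
        "python_ok": sys.version_info[:2] >= MIN_PYTHON,
        # WSL reports "Linux"; macOS reports "Darwin".
        "os_ok": platform.system() in ("Linux", "Darwin"),
        # The OS often reports slightly less than the nominal RAM size,
        # so treat a False here as a hint rather than a hard failure.
        "ram_ok": None if ram is None else ram >= MIN_RAM_GB,
    }


print(check_requirements())
```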

Core Dependencies

TRIFID requires the following Python packages:
biopython
numpy
pandas>=1.0
scikit-learn
joblib
cython
pyyaml
gtfparse
loguru
muscle
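
To see which of these dependencies are already present in the active environment, something like the following sketch may help. The mapping from package name to import name reflects the usual import names for these packages (for example, `biopython` is imported as `Bio`); treating `muscle` as a command-line executable rather than an importable module is an assumption, and the helper names are hypothetical.

```python
import importlib.util
import shutil

# Package name -> module name used in "import ...".
IMPORT_NAMES = {
    "biopython": "Bio",
    "numpy": "numpy",
    "pandas": "pandas",
    "scikit-learn": "sklearn",
    "joblib": "joblib",
    "cython": "Cython",
    "pyyaml": "yaml",
    "gtfparse": "gtfparse",
    "loguru": "loguru",
}


def missing_dependencies(import_names=IMPORT_NAMES):
    """Return the sorted package names whose module cannot be found."""
    return sorted(
        package
        for package, module in import_names.items()
        if importlib.util.find_spec(module) is None
    )


def muscle_on_path():
    """Assumption: 'muscle' refers to the MUSCLE aligner binary on PATH."""
    return shutil.which("muscle") is not None


print("Missing Python packages:", missing_dependencies())
print("muscle executable found:", muscle_on_path())
```

This only checks that each module can be located; it does not verify versions such as `pandas>=1.0`.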

Install Optional Dependencies

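Depending on your use case, you may want additional features. For generating plots and interpreting model predictions, run the following from the root of the cloned repository (the quotes keep shells such as zsh from expanding the brackets):
pip install -e ".[extra]"
This installs matplotlib, altair, shap, eli5, mlxtend, and rfpimp.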

Pre-computed Predictions

Instead of running TRIFID from scratch, you can download pre-computed predictions for multiple genome assemblies. This is the recommended approach for most users.

Available Genomes

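| Assembly | Database | Version | Release Date | Download |
|----------|----------|---------|--------------|----------|
| GRCh38 | GENCODE | v27 | Aug 2017 | Link |
| GRCh38 | GENCODE | v37 | Feb 2021 | Link |
| GRCh38 | GENCODE | v42 | Apr 2022 | Link |
| GRCh38 | RefSeq | 110 | Feb 2020 | Link |
| GRCh37 | GENCODE | v19 | Dec 2013 | Link |
| GRCh37 | RefSeq | 105 | Feb 2020 | Link |

| Species | Assembly | Database | Version | Download |
|---------|----------|----------|---------|----------|
| Mouse | GRCm39 | GENCODE | M31 | Link |
| Mouse | GRCm38 | GENCODE | M25 | Link |
| Rat | mRatBN7.2 | Ensembl | 105 | Link |
| Zebrafish | GRCz11 | Ensembl | 104 | Link |
| Fruitfly | BDGP6 | Ensembl | 107 | Link |
| Worm | WBcel235 | Ensembl | 108 | Link |

| Species | Assembly | Database | Version | Download |
|---------|----------|----------|---------|----------|
| Chicken | GRCg7b | Ensembl | 108 | Link |
| Chimpanzee | Pan_tro_3.0 | Ensembl | 104 | Link |
| Pig | Sscrofa11.1 | Ensembl | 108 | Link |
| Cow | ARS-UCD1.2 | Ensembl | 104 | Link |
| Macaque | Mmul_10 | Ensembl | 105 | Link |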

Download and Extract

Each prediction package contains:
  • trifid_predictions.tsv.gz - TRIFID scores for all isoforms
  • trifid_db.tsv.gz - Complete feature matrix
  • Feature description files
# Example: Download human GRCh38 GENCODE 27 predictions
wget https://drive.google.com/... -O trifid_predictions.tsv.gz

# Extract and view
gunzip trifid_predictions.tsv.gz
head trifid_predictions.tsv
Prediction files can be large (>1GB compressed). Ensure you have sufficient disk space.
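
Because the files are large, it can be convenient to check free disk space first and to peek at the compressed file without decompressing it to disk. The sketch below uses only the Python standard library; the helper names are hypothetical, and no column names are assumed since the header is read from the file itself.

```python
import csv
import gzip
import shutil
from itertools import islice


def free_space_gb(directory="."):
    """Return the free disk space, in GiB, on the volume holding `directory`."""
    return shutil.disk_usage(directory).free / 1024**3


def preview_tsv_gz(path, n_rows=5):
    """Return (header, rows) for the first n_rows of a gzipped TSV file.

    The file is streamed, so only the beginning is decompressed.
    """
    with gzip.open(path, mode="rt", newline="") as handle:
        reader = csv.reader(handle, delimiter="\t")
        header = next(reader, [])
        rows = list(islice(reader, n_rows))
    return header, rows
```

For example, `preview_tsv_gz("trifid_predictions.tsv.gz")` would return the column names and the first five rows of the file downloaded above, and `free_space_gb()` reports the space available in the current directory.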

Additional Resources

For advanced usage and model training, you may need additional data files:

Training Data

TRIFID training set (GENCODE 27)

Pre-trained Model

TRIFID model v1.0.4 (pickle format)
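
A pickled model can generally be loaded with Python's `pickle` module. The following is a minimal sketch rather than TRIFID's own loading code; the helper name is hypothetical, and the file path is whatever you saved the download as. If the model is a scikit-learn estimator (the dependency list suggests this, but this page does not state it), it should be loaded with a scikit-learn version compatible with the one used to create it.

```python
import pickle
from pathlib import Path


def load_pickled_model(path):
    """Unpickle an object from disk.

    Only use this on files from a trusted source: unpickling can
    execute arbitrary code embedded in the file.
    """
    with Path(path).open("rb") as handle:
        return pickle.load(handle)
```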

Tutorial Notebook

Complete tutorial with examples

Figures Notebook

Need Help?

If you encounter issues during installation:
  1. Check the GitHub Issues for known problems
  2. Ensure all system requirements are met
  3. Try creating a fresh conda environment
  4. Open a new issue with your error log
For species or genome versions not listed, you can request custom predictions by opening an issue on GitHub.

Next Steps

Quick Start Guide

Learn how to load predictions and analyze your first gene
