TRIFID provides multiple ways to interpret and explain its predictions. This guide covers score interpretation, feature importance analysis, and local explanations for individual transcripts.
Overview
Interpretation methods in TRIFID:

- Global interpretation: understanding overall feature importance
- Local explanation: why specific transcripts received their scores
- SHAP values: model-agnostic explanations
- Visualization: plots and waterfall charts
Understanding TRIFID Scores
Score Components
Each transcript receives two scores:
trifid_score (Raw Score)

- Probability that the transcript is functional
- Range: 0.0 to 1.0
- Independent across genes
- Reflects absolute confidence

norm_trifid_score (Normalized Score)

- Relative functionality within a gene
- Range: 0.0 to 1.0
- The highest-scoring isoform per gene gets 1.0
- Reflects relative importance
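The relationship between the two scores can be reproduced with a per-gene max-normalization. A minimal pandas sketch (assuming norm_trifid_score is simply trifid_score divided by the gene's maximum score, which matches the example values below to rounding):

```python
import pandas as pd

df = pd.DataFrame({
    'gene_name': ['TP53', 'TP53', 'TP53'],
    'transcript_id': ['ENST00000269305', 'ENST00000420246', 'ENST00000413465'],
    'trifid_score': [0.8912, 0.3421, 0.6234],
})

# Per-gene max-normalization: the top isoform of each gene gets 1.0
df['norm_trifid_score'] = (
    df['trifid_score'] / df.groupby('gene_name')['trifid_score'].transform('max')
).round(4)

print(df[['transcript_id', 'norm_trifid_score']])
```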
Example Interpretation
```
gene_name  transcript_id    trifid_score  norm_trifid_score
TP53       ENST00000269305  0.8912        1.0000
TP53       ENST00000420246  0.3421        0.3839
TP53       ENST00000413465  0.6234        0.6994
```
Analysis:

- ENST00000269305: highly functional (0.89) and the principal isoform (1.0)
- ENST00000420246: lower confidence (0.34), likely non-functional
- ENST00000413465: moderate score (0.62), context-dependent function
For identifying principal isoforms, use norm_trifid_score. For filtering functional transcripts genome-wide, use trifid_score.
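Both use cases are one-liners in pandas. The frame and the 0.5 cutoff here are illustrative:

```python
import pandas as pd

preds = pd.DataFrame({
    'gene_name': ['TP53', 'TP53', 'BRCA1'],
    'transcript_id': ['ENST00000269305', 'ENST00000420246', 'ENST00000357654'],
    'trifid_score': [0.8912, 0.3421, 0.9301],
    'norm_trifid_score': [1.0, 0.3839, 1.0],
})

# Principal isoform per gene: the one with norm_trifid_score == 1.0
principal = preds[preds['norm_trifid_score'] == 1.0]

# Genome-wide functional set: absolute confidence above a chosen cutoff
functional = preds[preds['trifid_score'] >= 0.5]

print(len(principal), len(functional))  # 2 principal isoforms, 2 functional
```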
Global Feature Importance
Understand which features drive predictions across the entire dataset.
Multiple Importance Methods
TRIFID’s TreeInterpretation class provides 8 different importance metrics:
```python
from trifid.models.interpret import TreeInterpretation

# Initialize with trained model
interpreter = TreeInterpretation(
    model=trained_model,
    df=df_training_set,
    features_col=feature_names,
    target_col='label',
    random_state=123
)

# Get all importance scores (exposed as a property)
df_importances = interpreter.merge_feature_importances
print(df_importances)
```
Importance Metrics Explained
Sklearn Feature Importances
Method: Mean decrease in impurity (Gini)
Code: trifid/models/interpret.py:88-99

```python
@property
def feature_importances(self):
    df = pd.DataFrame(
        self.model.feature_importances_,
        index=self.train_features.columns
    ).reset_index().rename(
        columns={'index': 'feature', 0: 'feature_importances_sklearn'}
    ).sort_values(by='feature_importances_sklearn', ascending=False)
    return df
```
Pros: Fast, built-in to Random Forest
Cons: Biased toward high-cardinality features
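This bias is easy to demonstrate on synthetic data (not TRIFID code): a pure-noise continuous feature still picks up impurity importance from overfit splits, while permutation importance on held-out data drives it toward zero.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 800
informative = rng.integers(0, 2, n)        # low-cardinality, truly predictive
noise = rng.normal(size=n)                 # high-cardinality, pure noise
X = np.column_stack([informative, noise])
y = (informative ^ (rng.random(n) < 0.1)).astype(int)  # label = feature with 10% flips

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
clf = RandomForestClassifier(
    n_estimators=100, max_features=None, random_state=0
).fit(X_tr, y_tr)

# Impurity importance credits the noise feature for overfit splits
print(clf.feature_importances_)

# Permutation importance on held-out data puts the noise feature near zero
perm = permutation_importance(clf, X_te, y_te, n_repeats=10, random_state=0)
print(perm.importances_mean)
```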
Permutation Importances

Method: Decrease in accuracy when the feature is randomly shuffled
Code: trifid/models/interpret.py:167-188

```python
@property
def permutation_importances(self):
    permutation_importance = PermutationImportance(
        self.model,
        random_state=self.random_state,
        scoring=make_scorer(matthews_corrcoef),
        n_iter=10,
        cv=StratifiedKFold(n_splits=10, shuffle=True,
                           random_state=self.random_state),
    ).fit(self.train_features.values, self.train_target.values)
    # ...
    return df
```
Pros: Unbiased, model-agnostic
Cons: Computationally expensive
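TRIFID scores each shuffle with the Matthews correlation coefficient; scikit-learn's permutation_importance accepts the same scorer, sketched here on synthetic data (only feature 0 carries signal):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.metrics import make_scorer, matthews_corrcoef
from sklearn.model_selection import train_test_split

# Synthetic stand-in for a training set: only feature 0 predicts the label
rng = np.random.default_rng(1)
X = rng.normal(size=(400, 3))
y = (X[:, 0] + 0.1 * rng.normal(size=400) > 0).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=1)
clf = RandomForestClassifier(n_estimators=100, random_state=1).fit(X_tr, y_tr)

# Score each shuffle with the Matthews correlation coefficient, as TRIFID does
result = permutation_importance(
    clf, X_te, y_te,
    scoring=make_scorer(matthews_corrcoef),
    n_repeats=10,
    random_state=1,
)
print(result.importances_mean.round(3))  # feature 0 dominates
```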
SHAP Values

Method: Shapley values from game theory
Code: trifid/models/interpret.py:190-206

```python
@property
def shap(self):
    explainer = shap.TreeExplainer(self.model)
    shap_values = explainer.shap_values(self.train_features)
    vals = np.abs(shap_values).mean(0)
    std_vals = np.abs(shap_values).std(0)
    # ...
    return df
```
Pros: Theoretically sound, local explanations
Cons: Slower for large datasets

SHAP is the recommended method for publication-quality interpretations.
Drop-Column Importances

Method: Decrease in out-of-bag score when the feature is dropped
Code: trifid/models/interpret.py:76-86

```python
@property
def dropcol_importances(self):
    df = oob_dropcol_importances(
        self.model,
        self.train_features,
        self.train_target
    ).reset_index().rename(
        columns={'Feature': 'feature',
                 'Importance': 'dropcol_importances'}
    )
    return df
```
Pros: Direct measure of feature necessity
Cons: Expensive, requires retraining
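The mechanics can be sketched from scratch with scikit-learn's out-of-bag score. This is a simplified stand-in for the oob_dropcol_importances helper TRIFID uses; the function name and toy data are illustrative:

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

def dropcol_importance_sketch(X, y, random_state=0):
    """Baseline OOB score minus the OOB score after dropping each column."""
    def oob(frame):
        rf = RandomForestClassifier(
            n_estimators=100, oob_score=True, random_state=random_state
        ).fit(frame, y)
        return rf.oob_score_

    baseline = oob(X)
    return pd.Series(
        {col: baseline - oob(X.drop(columns=col)) for col in X.columns},
        name='dropcol_importance',
    ).sort_values(ascending=False)

# Toy data: 'signal' predicts the label, 'noise' does not
rng = np.random.default_rng(0)
X = pd.DataFrame({'signal': rng.normal(size=300), 'noise': rng.normal(size=300)})
y = (X['signal'] > 0).astype(int)
print(dropcol_importance_sketch(X, y))
```

Note the cost: each feature requires a full retrain, which is why TRIFID treats this metric as expensive.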
Example: Feature Importance Analysis
```python
import pickle

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from trifid.models.interpret import TreeInterpretation

# Load trained model and data
model = pickle.load(open('models/selected_model.pkl', 'rb'))
df = pd.read_csv('data/model/training_set_final.g27.tsv.gz', sep='\t')

# Initialize interpreter
interpreter = TreeInterpretation(
    model=model,
    df=df,
    features_col=feature_names,
    target_col='label'
)

# Get SHAP importances
df_shap = interpreter.shap
print(df_shap.head(10))

# Visualize
plt.figure(figsize=(10, 6))
sns.barplot(data=df_shap.head(15), x='shap', y='feature')
plt.xlabel('Mean |SHAP value|')
plt.title('Top 15 Features by SHAP Importance')
plt.tight_layout()
plt.savefig('feature_importance.png', dpi=300)
plt.show()
```
Output:

```
   feature             shap
0  norm_spade          0.0823
1  pfam_score          0.0712
2  length_delta_score  0.0634
3  norm_RNA2sj_cds     0.0591
4  norm_ScorePerCodon  0.0456
```
Local Explanations
Explain why individual transcripts received their specific scores.
SHAP Waterfall Plots
Show how features contribute to a specific prediction:
```python
import pandas as pd
import shap
from trifid.models.interpret import TreeInterpretation

# Load full feature database
df_features = pd.read_csv(
    'data/genomes/GRCh38/g27/trifid_db.tsv.gz',
    sep='\t',
    compression='gzip'
)

# Initialize interpreter
interpreter = TreeInterpretation(
    model=model,
    df=df,
    features_col=feature_names,
    target_col='label'
)

# Explain specific transcript
transcript_id = 'ENST00000380152'
explanation = interpreter.local_explanation(
    df_features=df_features,
    sample=transcript_id,
    waterfall=True
)
```
Code: trifid/models/interpret.py:272-321

```python
def local_explanation(
    self,
    df_features,
    sample: str,
    waterfall: bool = False
) -> object:
    # Identify if sample is a transcript ID or a gene name
    if sample.startswith(get_id_patterns()):
        idx = "transcript_id"
    else:
        idx = "gene_name"

    # Extract sample features
    df_features = df_features[
        ["gene_name", "transcript_id"] + list(self.features_col)
    ]
    df_sample = df_features.set_index(["gene_name", "transcript_id"])
    df_sample = df_sample.iloc[
        df_sample.index.get_level_values(idx) == sample
    ]

    # Calculate SHAP values
    explainer = shap.TreeExplainer(self.model)
    shap_values = explainer.shap_values(df_sample)
    if waterfall:
        base_value = explainer.expected_value
        shap.plots._waterfall.waterfall_legacy(
            base_value[0],
            shap_values[0]
        )

    # Return feature contributions
    df = pd.DataFrame(
        list(zip(np.abs(shap_values).mean(0)[0], df_sample.values[0])),
        columns=["shap", "feature"],
        index=df_sample.columns,
    ).sort_values("shap", ascending=False)
    return df.round(3)
```
Interpreting Waterfall Plots
Waterfall plots show:
- Base value: average prediction across the dataset (typically ~0.5)
- Feature contributions: how each feature pushes the prediction up or down
- Final prediction: the TRIFID score

Reading the plot:

- Red bars: features increasing the functionality score
- Blue bars: features decreasing the functionality score
- Bar length: magnitude of the feature's contribution
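Numerically, the waterfall is just an additive decomposition from base value to final score. The values here are illustrative:

```python
base_value = 0.48                      # average model output across the dataset
contributions = {
    'norm_spade': +0.15,               # red bar, pushes the score up
    'pfam_score': +0.12,
    'norm_RNA2sj_cds': -0.09,          # blue bar, pushes the score down
}

# Final prediction = base value plus the sum of all feature contributions
prediction = base_value + sum(contributions.values())
print(round(prediction, 2))  # 0.66
```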
Gene-Level Explanations
Compare SHAP values across all isoforms of a gene:
```python
# Explain all isoforms of a gene
gene_name = 'TP53'
explanation = interpreter.local_explanation(
    df_features=df_features,
    sample=gene_name,
    waterfall=False
)
print(explanation)
```
Output:

```
                    ENST00000269305  ENST00000420246  ENST00000413465    std    sum
norm_spade                    0.142            0.023            0.089  0.051  0.254
pfam_score                    0.118            0.011            0.067  0.046  0.196
length_delta_score            0.091            0.187            0.045  0.063  0.323
norm_RNA2sj_cds               0.076            0.003            0.034  0.031  0.113
```
Analysis:

- ENST00000269305: strong positive contributions from all features
- ENST00000420246: particularly weak in pfam_score and RNA-seq support
- High std values indicate features that discriminate well between isoforms
Feature Attribution Methods
TRIFID implements multiple attribution approaches for robust interpretation.
Mutual Information

Measures dependency between features and labels:

```python
# From trifid/models/interpret.py:132-153
@property
def mutual_information(self):
    df = pd.DataFrame(
        mutual_info_classif(
            self.train_features,
            self.train_target,
            random_state=self.random_state
        ),
        index=self.train_features.columns,
    ).reset_index().rename(
        columns={'index': 'feature', 0: 'mutual_information'}
    ).sort_values(by='mutual_information', ascending=False)
    return df
```
Interpretation:

- Higher MI → stronger relationship with functionality
- MI = 0 → the feature provides no information
- Non-linear relationships are captured
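A toy run with scikit-learn's mutual_info_classif shows both ends of the scale (synthetic data, not TRIFID features):

```python
import numpy as np
from sklearn.feature_selection import mutual_info_classif

rng = np.random.default_rng(0)
y = rng.integers(0, 2, 1000)

X = np.column_stack([
    y + 0.3 * rng.normal(size=1000),   # strongly related to the label
    rng.normal(size=1000),             # unrelated noise
])

mi = mutual_info_classif(X, y, random_state=0)
print(mi)  # first value clearly positive, second near 0
```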
Target Permutation
Tests whether feature importances are real or spurious:
```python
# From trifid/models/interpret.py:208-255
@property
def target_permutation(self):
    # Shuffle target labels
    # Retrain model on permuted data
    # Compare importances to real data
    # High ratio = real importance
    # Low ratio = spurious correlation
    return df
```
Use case: Validate that important features aren’t just correlated by chance.
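A minimal sketch of the null-importance idea (not TRIFID's exact implementation): retrain on shuffled labels a few times and compare importances against the real model.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 2))
y = (X[:, 0] > 0).astype(int)          # only feature 0 carries signal

real = RandomForestClassifier(
    n_estimators=100, max_features=None, random_state=0
).fit(X, y)

# Null importances: retrain a few times on shuffled labels
null_imps = []
for seed in range(5):
    y_perm = rng.permutation(y)
    null = RandomForestClassifier(
        n_estimators=100, max_features=None, random_state=seed
    ).fit(X, y_perm)
    null_imps.append(null.feature_importances_)
null_mean = np.mean(null_imps, axis=0)

# Ratio >> 1 suggests genuine importance; ~1 or below suggests chance
ratio = real.feature_importances_ / null_mean
print(ratio.round(2))
```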
Comprehensive Interpretation Workflow
Step-by-Step Analysis
Train and evaluate model
```python
from sklearn.ensemble import RandomForestClassifier
from trifid.models.select import Classifier

model = Classifier(
    model=RandomForestClassifier(n_estimators=400, random_state=123),
    df=df_training_set,
    features_col=features,
    target_col='label',
    random_state=123
)

# Check performance
print(model.evaluate)
print(model.confusion_matrix)
```
Calculate feature importances
```python
from trifid.models.interpret import TreeInterpretation

interpreter = TreeInterpretation(
    model=model.model,
    df=df_training_set,
    features_col=features,
    target_col='label'
)

# Get all importance metrics
df_imp = interpreter.merge_feature_importances

# Save for later reference
df_imp.to_csv('feature_importances.tsv', sep='\t', index=False)
```
Visualize global importances
```python
import matplotlib.pyplot as plt
import seaborn as sns

# Get SHAP importances
df_shap = interpreter.shap

# Create barplot
fig, ax = plt.subplots(figsize=(10, 8))
sns.barplot(
    data=df_shap.head(20),
    x='shap',
    y='feature',
    palette='viridis',
    ax=ax
)
ax.set_xlabel('Mean |SHAP value|', fontsize=12)
ax.set_ylabel('Feature', fontsize=12)
ax.set_title('Feature Importance (SHAP)', fontsize=14)
plt.tight_layout()
plt.savefig('global_importance.png', dpi=300)
```
Explain individual predictions
```python
# Load full database
df_full = pd.read_csv(
    'data/genomes/GRCh38/g27/trifid_predictions.tsv.gz',
    sep='\t'
)

# Find interesting cases
high_score = df_full.nlargest(1, 'trifid_score')['transcript_id'].iloc[0]
low_score = df_full.nsmallest(1, 'trifid_score')['transcript_id'].iloc[0]

# Explain both
for tid in [high_score, low_score]:
    print(f"\nExplanation for {tid}:")
    exp = interpreter.local_explanation(
        df_features=df_full,
        sample=tid
    )
    print(exp.head(10))
```
Generate waterfall plots
```python
# Create waterfall for specific transcript
interpreter.local_explanation(
    df_features=df_full,
    sample='ENST00000380152',
    waterfall=True
)
```
Visualization Recipes
1. Feature Importance Comparison
Compare multiple importance metrics:
```python
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Get different importance scores
df_sklearn = interpreter.feature_importances
df_perm = interpreter.permutation_importances
df_shap = interpreter.shap

# Merge
df_compare = pd.merge(df_sklearn, df_perm, on='feature')
df_compare = pd.merge(df_compare, df_shap, on='feature')

# Normalize to 0-1 scale
for col in df_compare.columns[1:]:
    df_compare[f'{col}_norm'] = (
        df_compare[col] / df_compare[col].max()
    )

# Plot
fig, axes = plt.subplots(1, 3, figsize=(18, 6), sharey=True)
methods = ['feature_importances_sklearn_norm',
           'permutation_importance_norm',
           'shap_norm']
titles = ['Sklearn', 'Permutation', 'SHAP']

for ax, method, title in zip(axes, methods, titles):
    top_features = df_compare.nlargest(15, method)
    sns.barplot(
        data=top_features,
        x=method,
        y='feature',
        ax=ax
    )
    ax.set_title(title, fontsize=14)
    ax.set_xlabel('Normalized Importance')
    if ax != axes[0]:
        ax.set_ylabel('')

plt.tight_layout()
plt.savefig('importance_comparison.png', dpi=300)
```
2. Score Distribution by Feature
Show how TRIFID scores vary with feature values:
```python
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

df = pd.read_csv('data/genomes/GRCh38/g27/trifid_predictions.tsv.gz', sep='\t')

# Bin key feature
feature = 'norm_spade'
df['feature_bin'] = pd.cut(
    df[feature], bins=5,
    labels=['Very Low', 'Low', 'Medium', 'High', 'Very High']
)

# Plot
fig, ax = plt.subplots(figsize=(10, 6))
sns.boxplot(
    data=df,
    x='feature_bin',
    y='trifid_score',
    palette='RdYlGn',
    ax=ax
)
ax.set_xlabel(f'{feature} (binned)', fontsize=12)
ax.set_ylabel('TRIFID Score', fontsize=12)
ax.set_title(f'TRIFID Score Distribution by {feature}', fontsize=14)
plt.xticks(rotation=45)
plt.tight_layout()
plt.savefig(f'score_by_{feature}.png', dpi=300)
```
3. Isoform Comparison Heatmaps

Visualize feature values and SHAP contributions for gene isoforms:

```python
import matplotlib.pyplot as plt
import seaborn as sns

# Get all isoforms of a gene
gene = 'TP53'
df_gene = df_full[df_full['gene_name'] == gene]

# Get SHAP explanations
exp = interpreter.local_explanation(
    df_features=df_full,
    sample=gene
).T

# Create heatmaps
fig, axes = plt.subplots(1, 2, figsize=(16, 8))

# Feature values
sns.heatmap(
    df_gene[features].set_index(df_gene['transcript_id']).T,
    cmap='viridis',
    cbar_kws={'label': 'Feature Value'},
    ax=axes[0]
)
axes[0].set_title(f'{gene} - Feature Values', fontsize=14)
axes[0].set_ylabel('Feature')

# SHAP contributions
sns.heatmap(
    exp.drop(['std', 'sum']),
    cmap='RdBu_r',
    center=0,
    cbar_kws={'label': 'SHAP Value'},
    ax=axes[1]
)
axes[1].set_title(f'{gene} - SHAP Contributions', fontsize=14)
axes[1].set_ylabel('')

plt.tight_layout()
plt.savefig(f'{gene}_isoform_comparison.png', dpi=300)
```
Common Interpretation Patterns
Typical characteristics of highly functional isoforms:

```
Feature               Value  Contribution
--------------------  -----  ------------
norm_spade            1.00   +0.15  ✓ Intact domains
pfam_score            0.95   +0.12  ✓ Preserved Pfam
length_delta_score    0.88   +0.08  ✓ Near full-length
norm_RNA2sj_cds       0.92   +0.11  ✓ Strong RNA-seq support
CCDS                  1.00   +0.06  ✓ In consensus set
```
Common reasons for low functionality scores:

```
Feature               Value  Contribution
--------------------  -----  ------------
pfam_score            0.15   -0.18  ✗ Damaged domains
norm_RNA2sj_cds       0.03   -0.15  ✗ No RNA-seq support
length_delta_score    0.45   -0.09  ✗ Truncated
perc_Lost_State       0.40   -0.08  ✗ Lost domain states
StartEnd_NF           1.00   -0.12  ✗ Incomplete annotation
```
Ambiguous Cases
Transcripts with mixed signals:

```
Feature               Value  Contribution  Notes
--------------------  -----  ------------  -----
norm_spade            0.75   +0.08         Moderate domains ⚠️
norm_RNA2sj_cds       0.12   -0.09         Low expression ✗
length_delta_score    0.91   +0.09         Full-length ✓
pfam_score            0.82   +0.06         Mostly intact ✓
```

Final score: 0.54 → context-dependent functionality
Exporting Interpretations
Generate Interpretation Report
```python
import os

import matplotlib.pyplot as plt
import seaborn as sns


def generate_interpretation_report(interpreter, df_full, gene_name,
                                   output_dir='reports'):
    os.makedirs(output_dir, exist_ok=True)

    # 1. Global importances
    df_imp = interpreter.shap
    df_imp.to_csv(f'{output_dir}/global_importances.tsv', sep='\t', index=False)

    # 2. Gene-level explanation
    exp = interpreter.local_explanation(df_features=df_full, sample=gene_name)
    exp.to_csv(f'{output_dir}/{gene_name}_explanation.tsv', sep='\t')

    # 3. Generate plots
    fig, ax = plt.subplots(figsize=(10, 6))
    sns.barplot(data=df_imp.head(15), x='shap', y='feature', ax=ax)
    ax.set_title('Feature Importance (SHAP)')
    plt.tight_layout()
    plt.savefig(f'{output_dir}/importance.png', dpi=300)
    plt.close()

    print(f"Report generated in {output_dir}/")


# Use it
generate_interpretation_report(
    interpreter=interpreter,
    df_full=df_predictions,
    gene_name='TP53',
    output_dir='reports/TP53'
)
```
Best Practices
- Always validate globally before locally. Check global feature importances first; if a feature ranks low globally but high locally, investigate why.
- Use multiple importance metrics. Don't rely on a single method; combine SHAP, permutation, and drop-column importances for robust insights.
- Consider biological context. A low TRIFID score doesn't always mean non-functional; consider tissue-specific expression and regulatory context.
- Validate with experiments. Use interpretations to design experiments, not to replace them; test predictions with functional assays.
Troubleshooting
SHAP Values Don’t Sum to Prediction
Expected: SHAP values should sum to (prediction − base_value).

If not:

- Check for missing features in the explanation
- Verify the model hasn't changed since training
- Ensure the same feature order is used
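The property can be checked directly. With a real model the base comes from explainer.expected_value and the contributions from the row's shap_values; the helper name and numbers here are illustrative:

```python
import numpy as np

def check_shap_additivity(base_value, shap_row, prediction, tol=1e-6):
    """SHAP's local accuracy property: base + sum(contributions) == prediction."""
    return np.isclose(base_value + np.sum(shap_row), prediction, atol=tol)

# Illustrative values
base_value = 0.48
shap_row = np.array([0.15, 0.12, -0.09])
print(check_shap_additivity(base_value, shap_row, 0.66))   # True
print(check_shap_additivity(base_value, shap_row, 0.80))   # False: something is off
```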
Inconsistent Importance Rankings
Cause: Different methods measure different aspects of importance.

Solution:

- Sklearn: measures impurity decrease (fast but biased)
- Permutation: measures predictive power (slower but unbiased)
- SHAP: measures contribution to individual predictions (most comprehensive)
Focus on SHAP for publication.
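To quantify how much two rankings actually disagree, a Spearman rank correlation over the merged importance table is a quick check. The toy importance frame here is illustrative; column names follow the merge example above:

```python
import pandas as pd

# Illustrative importance table with two metrics for the same features
df_imp = pd.DataFrame({
    'feature': ['norm_spade', 'pfam_score', 'length_delta_score', 'norm_RNA2sj_cds'],
    'feature_importances_sklearn': [0.21, 0.18, 0.09, 0.05],
    'shap': [0.082, 0.071, 0.063, 0.059],
})

# Rank agreement between the two metrics (1.0 = identical ordering)
rho = df_imp['feature_importances_sklearn'].corr(df_imp['shap'], method='spearman')
print(round(rho, 2))  # 1.0 here: both methods rank the features identically
```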
Memory Errors with SHAP
Problem: SHAP calculations exhaust memory
Solutions:
```python
# Calculate SHAP in batches
import numpy as np
import shap

explainer = shap.TreeExplainer(model)
batch_size = 100

shap_values_list = []
for i in range(0, len(df), batch_size):
    batch = df.iloc[i:i + batch_size]
    shap_batch = explainer.shap_values(batch[features])
    shap_values_list.append(shap_batch)

shap_values = np.vstack(shap_values_list)
```
Next Steps
- Visualization Module: advanced plotting functions for TRIFID results
- Case Studies: real-world examples of TRIFID interpretation