FGFR1: Fibroblast Growth Factor Receptor 1

ENSG00000077782 (Ensembl) - P11362 (FGFR1_HUMAN) (UniProt)

Overview

Fibroblast Growth Factor Receptor 1 (FGFR1) is a receptor tyrosine kinase that plays crucial roles in cell proliferation, differentiation, and migration. This case study demonstrates how TRIFID evaluates the functional importance of FGFR1 splice isoforms.

Loading TRIFID Predictions

To analyze FGFR1 isoforms, load the TRIFID predictions for GENCODE 27:

import pandas as pd

# Load the TRIFID predictions (gzip-compressed, tab-separated)
predictions = pd.read_csv(
    'data/genomes/GRCh38/g27/trifid_predictions.tsv.gz',
    compression='gzip',
    sep='\t'
)

# Keep only the FGFR1 rows and the columns of interest
gene_name = 'FGFR1'
columns = ['transcript_id', 'gene_name', 'trifid_score', 'norm_trifid_score',
           'appris', 'length', 'sequence']
fgfr1_data = predictions.loc[predictions['gene_name'] == gene_name, columns]

print(fgfr1_data)

FGFR1 Isoform Analysis

Isoform Results

Gene name	Transcript ID	APPRIS Label	Length (aa)	TRIFID Score	TRIFID Score (normalized)
FGFR1	ENST00000447712	PRINCIPAL:3	822	0.87	0.99
FGFR1	ENST00000356207	MINOR	733	0.60	0.69
FGFR1	ENST00000397103	MINOR	733	0.01	0.08
FGFR1	ENST00000619564	MINOR	228	0.00	0.01

Key Findings

Principal Isoform (ENST00000447712)
- TRIFID score: 0.87 (predicted functional with high confidence)
- Normalized score: 0.99 (highest among the gene's isoforms)
- Length: 822 amino acids
- Agrees with the APPRIS PRINCIPAL annotation

Alternative Isoform (ENST00000356207)
- TRIFID score: 0.60 (moderate confidence of being functional)
- Normalized score: 0.69
- May represent a functional alternative with different regulatory properties

Low-scoring Isoforms
- ENST00000397103 (0.01) and ENST00000619564 (0.00) have TRIFID scores close to zero
- These are likely non-functional or degraded transcripts
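
The grouping above can be reproduced directly from the results table. The following is a minimal sketch, assuming that the 0.5 cutoff used by analyze_gene in Running Your Own Analysis is the intended threshold for calling an isoform functional. The rows are copied from the Isoform Results table, and classify is a hypothetical helper for illustration, not part of TRIFID.

```python
# Rows copied from the Isoform Results table:
# (transcript_id, appris_label, length_aa, trifid_score, norm_trifid_score)
ISOFORMS = [
    ("ENST00000447712", "PRINCIPAL:3", 822, 0.87, 0.99),
    ("ENST00000356207", "MINOR", 733, 0.60, 0.69),
    ("ENST00000397103", "MINOR", 733, 0.01, 0.08),
    ("ENST00000619564", "MINOR", 228, 0.00, 0.01),
]

# Assumption: 0.5 is the cutoff, taken from the analyze_gene example on this page.
THRESHOLD = 0.5


def classify(score, threshold=THRESHOLD):
    """Hypothetical helper: label an isoform by its raw TRIFID score."""
    return "functional" if score >= threshold else "low-scoring"


# Map each transcript to its label using the raw (not normalized) score.
labels = {tid: classify(score) for tid, _, _, score, _ in ISOFORMS}

for tid, label in labels.items():
    print(f"{tid}\t{label}")
```

Note that ENST00000356207 and ENST00000397103 have the same length (733 aa) but fall on opposite sides of the cutoff, so length alone does not determine the label.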

Interpretation with SHAP

TRIFID provides local interpretability using SHAP (SHapley Additive exPlanations) values to explain individual predictions.
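
To make the idea of a local explanation concrete, here is a minimal sketch that computes exact Shapley values for a toy two-feature model by averaging each feature's marginal contribution over all feature orderings. Everything in it is an assumption made for illustration: toy_model, its feature names a and b, and the single all-zero baseline are invented, and this is not the TRIFID model. The shap library uses a more efficient tree-specific algorithm rather than this brute-force enumeration.

```python
from itertools import permutations


def shapley_values(predict, sample, baseline):
    """Exact Shapley values via enumeration of all feature orderings.

    Features start at their baseline values and are switched to the sample's
    values one at a time; each switch's change in prediction is credited to
    the feature that was switched.
    """
    names = list(sample)
    totals = {name: 0.0 for name in names}
    orderings = list(permutations(names))
    for order in orderings:
        current = dict(baseline)
        previous = predict(current)
        for name in order:
            current[name] = sample[name]
            updated = predict(current)
            totals[name] += updated - previous
            previous = updated
    return {name: total / len(orderings) for name, total in totals.items()}


def toy_model(features):
    """Hypothetical scoring function with an interaction term (not TRIFID)."""
    a, b = features["a"], features["b"]
    return 0.2 + 0.5 * a + 0.1 * b + 0.2 * a * b


baseline = {"a": 0.0, "b": 0.0}
sample = {"a": 1.0, "b": 1.0}
phi = shapley_values(toy_model, sample, baseline)

# Additivity: the contributions sum to prediction(sample) - prediction(baseline).
gap = toy_model(sample) - toy_model(baseline)
print(phi, sum(phi.values()), gap)
```

The additivity check at the end is the property that makes per-feature contributions readable as pushes above or below a base value.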

Loading SHAP Predictions

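import pickle

import pandas as pd
from trifid.models.interpret import TreeInterpretation

# Load the trained model (the context manager closes the file handle)
with open('models/selected_model.pkl', 'rb') as handle:
    model = pickle.load(handle)

# Load training data
df_training = pd.read_csv(
    'data/model/training_set_final.g27.tsv.gz',
    sep='\t',
    compression='gzip'
)

# NOTE: training_features (the list of feature column names used for training)
# and df_predictions (the table of isoforms to explain) are not defined in this
# snippet and must be set before running it.

# Create interpretation object
interpretation = TreeInterpretation(
    model=model,
    df=df_training,
    features_col=training_features,
    target_col='label',
    random_state=123,
    test_size=0.25
)

# Explain a specific isoform
explanation = interpretation.local_explanation(
    df_predictions,
    sample='ENST00000356207'
)
print(explanation.head(10))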

Example SHAP Output

The SHAP waterfall plot shows which features contribute most to the prediction for ENST00000356207:

Positive contributors (pushing score higher):
- Length delta score
- PhyloCSF conservation score
- APPRIS structural features
- RNA-seq expression evidence

Negative contributors (pushing score lower):
- Transcript Support Level (TSL)
- Domain completeness
- Pfam domain integrity
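
A split like the one above can be derived from any table of per-feature SHAP values by separating them by sign and ranking them by magnitude. The sketch below assumes the values are available as a plain mapping from feature name to SHAP value. The helper split_contributions is hypothetical, and the numbers and feature names are placeholders chosen for illustration, not results for ENST00000356207.

```python
def split_contributions(shap_values):
    """Separate SHAP values by sign and rank each group by magnitude.

    Returns (positive, negative), each a list of (feature, value) pairs with
    the strongest contributor first. Features with a value of zero are dropped.
    """
    positive = sorted(
        ((name, value) for name, value in shap_values.items() if value > 0),
        key=lambda item: item[1],
        reverse=True,
    )
    negative = sorted(
        ((name, value) for name, value in shap_values.items() if value < 0),
        key=lambda item: item[1],
    )
    return positive, negative


# Placeholder values for illustration only (not real TRIFID output).
example = {
    "feature_a": 0.12,
    "feature_b": -0.05,
    "feature_c": 0.03,
    "feature_d": -0.20,
    "feature_e": 0.0,
}

positive, negative = split_contributions(example)
print("Pushing score higher:", [name for name, _ in positive])
print("Pushing score lower:", [name for name, _ in negative])
```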

Biological Context

FGFR1 Function

FGFR1 is involved in:

- Embryonic development
- Angiogenesis
- Wound healing
- Cell survival signaling

Clinical Relevance

FGFR1 alterations are associated with:

- Various cancers (e.g., breast, lung)
- Skeletal disorders
- Developmental syndromes

Understanding which isoforms are functional is critical for:

- Interpreting genetic variants
- Designing targeted therapies
- Understanding disease mechanisms

Visualization

The TRIFID paper includes a figure showing:

- Exon structure of each FGFR1 isoform
- TRIFID scores mapped to isoform structure
- Domain architecture differences
- Expression evidence across tissues

The principal isoform (ENST00000447712) retains the full receptor structure, including the tyrosine kinase domain, which is consistent with its high TRIFID score.

Running Your Own Analysis

To analyze FGFR1 or any gene of interest:

import pandas as pd


def analyze_gene(gene_name, predictions_file):
    """
    Analyze TRIFID predictions for a specific gene.

    Args:
        gene_name: Gene symbol (e.g., 'FGFR1')
        predictions_file: Path to TRIFID predictions

    Returns:
        DataFrame with isoform analysis
    """
    df = pd.read_csv(predictions_file, sep='\t', compression='gzip')

    # Select the gene's isoforms, highest TRIFID score first
    gene_data = df[
        df['gene_name'] == gene_name
    ].sort_values('trifid_score', ascending=False)

    print(f"\nAnalysis for {gene_name}:")
    print(f"Total isoforms: {len(gene_data)}")

    # Guard against unknown gene symbols: iloc[0] fails on an empty frame
    if gene_data.empty:
        print("No isoforms found for this gene symbol.")
        return gene_data

    print(f"Functional (score >= 0.5): {(gene_data['trifid_score'] >= 0.5).sum()}")
    print(f"\nTop isoform: {gene_data.iloc[0]['transcript_id']}")
    print(f"Score: {gene_data.iloc[0]['trifid_score']:.2f}")

    return gene_data

# Run analysis
fgfr1_analysis = analyze_gene('FGFR1', 'data/genomes/GRCh38/g27/trifid_predictions.tsv.gz')

Next Steps

- Explore the C1orf112 case study for QSplice module analysis
- See the NIPAL3 example for Pfam effects
- Review interpretation methods for SHAP analysis