
NIPAL3: NIPA-Like Domain Containing 3

ENSG00000001461 (Ensembl) - Q6P499 (NPAL3_HUMAN) (UniProt)

Overview

This case study demonstrates the Pfam effects module of TRIFID, which quantifies the impact of alternative splicing on protein domain integrity. NIPAL3 provides an excellent example of how domain disruption affects isoform functionality predictions.

What is Pfam Effects?

The Pfam effects module:
  • Quantifies the impact of alternative splicing on Pfam protein domains
  • Identifies domains that are damaged, lost, or intact
  • Calculates residue-level changes in domain coverage
  • Provides domain integrity scores as TRIFID features

Pfam Effects Methodology

Input Data

  1. APPRIS annotations: Principal isoform labels
  2. Protein sequences: FASTA format from APPRIS
  3. SPADE scores: Domain annotations from APPRIS
  4. Pfam database: Domain definitions
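
As a rough illustration of input 2, a gzipped FASTA file can be read with the Python standard library alone. This is a minimal sketch, not TRIFID's own loader: the function names `read_fasta` and `read_fasta_gz` are hypothetical, and the demo headers and sequences are made up rather than taken from APPRIS.

```python
import gzip
import io


def read_fasta(handle):
    """Parse FASTA text from an open text handle into {header: sequence}."""
    records = {}
    header = None
    chunks = []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts: store the previous one first.
            if header is not None:
                records[header] = "".join(chunks)
            header = line[1:]
            chunks = []
        else:
            chunks.append(line)
    if header is not None:
        records[header] = "".join(chunks)
    return records


def read_fasta_gz(path):
    """Open a gzip-compressed FASTA file (e.g. appris_data.transl.fa.gz) in text mode."""
    with gzip.open(path, "rt") as handle:
        return read_fasta(handle)


# Toy demo with made-up headers and sequences (not real APPRIS records).
demo = io.StringIO(">isoform_A\nMKTLL\nVAG\n>isoform_B\nMKT\n")
sequences = read_fasta(demo)
print({name: len(seq) for name, seq in sequences.items()})
```

The full header line is kept as the dictionary key because the exact APPRIS header layout is not described here; split it as needed for your files.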

Running Pfam Effects

python -m trifid.preprocessing.pfam_effects \
    --appris data/external/appris/GRCh38/g27/appris_data.appris.txt \
    --jobs 10 \
    --seqs data/external/appris/GRCh38/g27/appris_data.transl.fa.gz \
    --spade data/external/appris/GRCh38/g27/appris_method.spade.gtf.gz \
    --outdir data/external/pfam_effects/GRCh38/g27

Output Files

qpfam.tsv.gz: Transcript-level Pfam domain effects with scores:
  • pfam_score: Domain residue conservation (0-1; 1.00 means every domain residue is retained)
  • pfam_domains_impact_score: Proportion of domains intact (0-1)
  • perc_Damaged_State: Percentage of domains damaged
  • perc_Lost_State: Percentage of domains lost
  • Lost_residues_pfam: Count of lost domain residues
  • Gain_residues_pfam: Count of gained domain residues
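
A short sketch of how these columns might be inspected with pandas. The inline sample stands in for qpfam.tsv.gz and reuses a few values from the NIPAL3 example on this page; the `transcript_id` column name is assumed from the analysis code later on this page, and the real file may contain additional columns.

```python
import io

import pandas as pd

# Made-up stand-in for qpfam.tsv.gz; values follow the NIPAL3 example on this page.
sample_tsv = (
    "transcript_id\tpfam_score\tpfam_domains_impact_score\t"
    "perc_Damaged_State\tperc_Lost_State\tLost_residues_pfam\tGain_residues_pfam\n"
    "ENST00000374399\t1.00\t1.00\t0\t0\t0\t0\n"
    "ENST00000003912\t0.83\t0\t1.00\t0\t50\t0\n"
    "ENST00000432012\t0.35\t0\t1.00\t0\t255\t0\n"
)

# For the real file: pd.read_csv("qpfam.tsv.gz", sep="\t")
qpfam = pd.read_csv(io.StringIO(sample_tsv), sep="\t")

# Transcripts that lost at least one domain residue, most affected first.
affected = qpfam[qpfam["Lost_residues_pfam"] > 0].sort_values("pfam_score")
print(affected[["transcript_id", "pfam_score", "Lost_residues_pfam"]])
```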

NIPAL3 Domain Architecture

NIPAL3 contains one Pfam domain:
  • Mg_trans_NIPA (PF05653)
  • Function: Magnesium transporter
  • Location: Spans most of the protein core

NIPAL3 Pfam Effects Analysis

Isoform Domain Integrity Scores

Transcript ID    pfam_score  pfam_domains_impact_score  perc_Damaged_State  perc_Lost_State  Lost_residues_pfam  Gain_residues_pfam  pfam_effects_msa
ENST00000374399  1.00        1.00                       0                   0                0                   0                   Reference
ENST00000339255  1.00        1.00                       0                   0                0                   0                   Transcript
ENST00000003912  0.83        0                          1.00                0                50                  0                   Transcript
ENST00000358028  0.62        0                          1.00                0                112                 0                   Transcript
ENST00000432012  0.35        0                          1.00                0                255                 0                   Transcript

Interpretation

Reference Isoform (ENST00000374399)

  • pfam_score: 1.00 (perfect domain conservation)
  • Status: Complete Mg_trans_NIPA domain
  • Residues: Full domain intact
  • Annotation: APPRIS PRINCIPAL

Full-Length Alternative (ENST00000339255)

  • pfam_score: 1.00
  • Status: Identical domain structure to reference
  • Interpretation: Likely differs only outside the Pfam domain, for example in the UTRs

Damaged Isoforms

ENST00000003912:
  • pfam_score: 0.83 (17% domain loss)
  • Lost residues: 50 amino acids
  • perc_Damaged_State: 1.00 in the table above (the single domain is damaged)
  • Interpretation: Domain partially disrupted but core maintained
ENST00000358028:
  • pfam_score: 0.62 (38% domain loss)
  • Lost residues: 112 amino acids
  • Interpretation: Significant domain truncation
ENST00000432012:
  • pfam_score: 0.35 (65% domain loss)
  • Lost residues: 255 amino acids
  • Interpretation: Severely truncated domain, likely non-functional
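
The percentages in parentheses above can be reproduced from the scores. This sketch assumes that percent domain loss is simply (1 - pfam_score) × 100, which matches the numbers quoted here but is not confirmed as the module's exact formula.

```python
# pfam_score values quoted in the text for the three damaged NIPAL3 isoforms.
pfam_scores = {
    "ENST00000003912": 0.83,
    "ENST00000358028": 0.62,
    "ENST00000432012": 0.35,
}

# Assumption: percent domain loss = (1 - pfam_score) * 100.
percent_loss = {tid: round((1 - score) * 100) for tid, score in pfam_scores.items()}

for tid, loss in percent_loss.items():
    print(f"{tid}: {loss}% of the domain lost")
```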

Domain States

State Definitions

  1. Intact: Domain fully preserved (100% residues present)
  2. Damaged: Domain partially present (> 0% and < 100% residues)
  3. Lost: Domain completely absent (0% residues present)
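
The three definitions translate directly into a small classifier. The function name `classify_domain_state` is hypothetical and the input is assumed to be the fraction (0-1) of a domain's residues present in the isoform; this is a sketch of the rule as stated, not the module's code.

```python
def classify_domain_state(fraction_present):
    """Map the fraction of a domain's residues that are present to a state label."""
    if not 0.0 <= fraction_present <= 1.0:
        raise ValueError("fraction_present must be between 0 and 1")
    if fraction_present == 1.0:
        return "Intact"
    if fraction_present == 0.0:
        return "Lost"
    return "Damaged"


for fraction in (1.0, 0.83, 0.0):
    print(fraction, classify_domain_state(fraction))
```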

NIPAL3 Domain States

ENST00000374399 (Reference):  [==========Mg_trans_NIPA==========]  100% retained (Intact)
ENST00000339255:              [==========Mg_trans_NIPA==========]  100% retained (Intact)
ENST00000003912:              [========Mg_trans_NIP             ]   83% retained (Damaged)
ENST00000358028:              [======Mg_tra                     ]   62% retained (Damaged)
ENST00000432012:              [==Mg                             ]   35% retained (Damaged)

Multiple Sequence Alignment

The Pfam effects module uses a multiple sequence alignment (MSA) to visualize domain conservation.

[Figure: NIPAL3 MUSCLE alignment]

Legend:
  • Green regions: Mg_trans_NIPA domain (PF05653)
  • Gaps: Alternative splicing deletions
  • Alignments show progressive domain truncation
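
To make the idea of reading domain coverage off an alignment concrete, here is a minimal sketch. It counts how many reference residues inside a domain range are aligned to a residue (rather than a gap) in the isoform. The function `domain_coverage`, the toy sequences, and the domain coordinates are all illustrative assumptions; this is not the scoring used by the Pfam effects module.

```python
def domain_coverage(ref_aligned, iso_aligned, domain_start, domain_end):
    """Fraction of reference domain residues aligned to a residue in the isoform.

    ref_aligned / iso_aligned: rows of one alignment (same length, '-' = gap).
    domain_start / domain_end: 1-based, inclusive positions on the ungapped reference.
    """
    if len(ref_aligned) != len(iso_aligned):
        raise ValueError("aligned sequences must have equal length")
    ref_pos = 0      # position on the ungapped reference
    in_domain = 0    # reference residues inside the domain
    covered = 0      # of those, how many face a residue in the isoform
    for ref_char, iso_char in zip(ref_aligned, iso_aligned):
        if ref_char == "-":
            continue  # insertion in the isoform; no reference residue here
        ref_pos += 1
        if domain_start <= ref_pos <= domain_end:
            in_domain += 1
            if iso_char != "-":
                covered += 1
    if in_domain == 0:
        raise ValueError("domain range does not overlap the reference")
    return covered / in_domain


# Toy alignment (made-up sequences): the domain covers reference positions 3-10.
reference = "MKAAAAAAAAGT"
isoform = "MKAAAA----GT"
print(domain_coverage(reference, isoform, 3, 10))
```

Note that this counts aligned positions only; it does not check whether the aligned residues are identical.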

Pfam Features in TRIFID

The Pfam effects module contributes these features to TRIFID:

Primary Features

  1. pfam_score: Overall domain conservation (0-1)
  2. pfam_domains_impact_score: Proportion of intact domains (0-1)

Detailed Features

  1. perc_Damaged_State: % domains partially present
  2. perc_Lost_State: % domains completely absent
  3. Lost_residues_pfam: Absolute count of lost residues
  4. Gain_residues_pfam: Absolute count of gained residues (rare)
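
One sanity check that follows from these definitions: in the NIPAL3 table the intact, damaged, and lost shares of each transcript sum to 1. The sketch below tests that on values copied from the table. It assumes the three columns are fractions on a 0-1 scale and that they partition a transcript's domains, which holds for NIPAL3 but is an assumption for other genes.

```python
import pandas as pd

# State columns copied from the NIPAL3 table on this page.
states = pd.DataFrame(
    {
        "transcript_id": [
            "ENST00000374399", "ENST00000339255", "ENST00000003912",
            "ENST00000358028", "ENST00000432012",
        ],
        "pfam_domains_impact_score": [1.0, 1.0, 0.0, 0.0, 0.0],
        "perc_Damaged_State": [0.0, 0.0, 1.0, 1.0, 1.0],
        "perc_Lost_State": [0.0, 0.0, 0.0, 0.0, 0.0],
    }
)

# Assumption: intact + damaged + lost shares add up to 1 for every transcript.
state_cols = ["pfam_domains_impact_score", "perc_Damaged_State", "perc_Lost_State"]
total = states[state_cols].sum(axis=1)
inconsistent = states[(total - 1.0).abs() > 1e-6]
print(f"{len(inconsistent)} transcript(s) with state shares that do not sum to 1")
```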

Impact on TRIFID Predictions

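Pfam domain integrity is a strong predictor of isoform functionality. The script below merges TRIFID predictions with the Pfam effects output and reports the Pearson correlation between trifid_score and pfam_score:
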
import pandas as pd

# Load TRIFID predictions and Pfam scores
predictions = pd.read_csv('trifid_predictions.tsv.gz', sep='\t', compression='gzip')
pfam = pd.read_csv('qpfam.tsv.gz', sep='\t', compression='gzip')

# Merge data
data = pd.merge(predictions, pfam, on='transcript_id')

# Analyze correlation
correlation = data[['trifid_score', 'pfam_score']].corr()
print(f"Correlation between TRIFID and Pfam scores: {correlation.iloc[0,1]:.3f}")

# Typical output: ~0.6-0.7 (strong positive correlation)

Expected TRIFID Scores for NIPAL3

Transcript       pfam_score  Expected TRIFID Score   Actual TRIFID Score
ENST00000374399  1.00        > 0.7 (functional)      ~0.85
ENST00000339255  1.00        > 0.7 (functional)      ~0.82
ENST00000003912  0.83        0.4-0.6 (ambiguous)     ~0.45
ENST00000358028  0.62        0.2-0.4 (low)           ~0.25
ENST00000432012  0.35        < 0.2 (non-functional)  ~0.08

Domain integrity is necessary but not sufficient for functionality. TRIFID integrates domain scores with expression, conservation, and annotation evidence.
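
For a gene-level view, the same correlation can be computed on the five NIPAL3 rows alone. The values below are copied from the table above, where the TRIFID scores are approximate ("~"). Five points from a single gene are far too few to generalize, so treat this only as a check that the trend in the table is monotonic, not as evidence for the genome-wide figure.

```python
import pandas as pd

# pfam_score and approximate ("~") TRIFID scores copied from the table above.
nipal3_scores = pd.DataFrame(
    {
        "transcript_id": [
            "ENST00000374399", "ENST00000339255", "ENST00000003912",
            "ENST00000358028", "ENST00000432012",
        ],
        "pfam_score": [1.00, 1.00, 0.83, 0.62, 0.35],
        "trifid_score": [0.85, 0.82, 0.45, 0.25, 0.08],
    }
)

# Pearson correlation is the default method of Series.corr.
pearson = nipal3_scores["pfam_score"].corr(nipal3_scores["trifid_score"])
print(f"Pearson r for NIPAL3 isoforms: {pearson:.2f}")
```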

Running Pfam Effects on Your Data

Prerequisites

  1. Install Pfam scan tools
  2. Download Pfam database
  3. Prepare APPRIS annotations

Command Line

python -m trifid.preprocessing.pfam_effects \
    --appris your_appris_annotations.txt \
    --jobs 10 \
    --seqs your_protein_sequences.fa.gz \
    --spade your_spade_annotations.gtf.gz \
    --outdir output/pfam_effects
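
When the command is launched from a pipeline script, it can help to assemble the arguments in Python and check the inputs first. This sketch uses only the flags shown above; the helper names `build_pfam_effects_command` and `missing_inputs` are hypothetical, and the file names are the placeholders from the command.

```python
import sys
from pathlib import Path


def build_pfam_effects_command(appris, seqs, spade, outdir, jobs=10):
    """Return the argv list for the pfam_effects module invocation shown above."""
    return [
        sys.executable, "-m", "trifid.preprocessing.pfam_effects",
        "--appris", str(appris),
        "--jobs", str(jobs),
        "--seqs", str(seqs),
        "--spade", str(spade),
        "--outdir", str(outdir),
    ]


def missing_inputs(*paths):
    """Return the input paths that do not exist, so a run can fail early."""
    return [str(p) for p in paths if not Path(p).exists()]


inputs = (
    "your_appris_annotations.txt",
    "your_protein_sequences.fa.gz",
    "your_spade_annotations.gtf.gz",
)
command = build_pfam_effects_command(*inputs, outdir="output/pfam_effects")
print(" ".join(command))
print("missing inputs:", missing_inputs(*inputs))
# To launch the run once the inputs exist: subprocess.run(command, check=True)
```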

Python API

from trifid.preprocessing.pfam_effects import calculate_pfam_effects

results = calculate_pfam_effects(
    appris_file='appris_data.appris.txt',
    sequences_file='appris_data.transl.fa.gz',
    spade_file='appris_method.spade.gtf.gz',
    output_dir='output/pfam_effects',
    n_jobs=10
)

Pre-computed Pfam Effects Data

Pre-computed Pfam effects scores are available for:
  • GENCODE 27 (Human, GRCh38)
  • GENCODE 42 (Human, GRCh38)
  • GENCODE 25 (Mouse, GRCm38)
  • Multiple other species and genome versions
Download from the Data Availability page.

Biological Insights

Why Domain Integrity Matters

  1. Structural stability: Truncated domains often misfold
  2. Functional activity: Partial domains lose catalytic or binding activity
  3. Cellular quality control: Damaged proteins trigger degradation
  4. Evolutionary constraint: Functional domains show purifying selection

NIPAL3 Magnesium Transport

The Mg_trans_NIPA domain:
  • Forms transmembrane helices
  • Coordinates Mg²⁺ ions
  • Requires specific residues for transport activity
  • Truncation abolishes transport function
Isoforms with pfam_score < 0.8 likely cannot transport magnesium effectively.

Visualization Example

import pandas as pd
import matplotlib.pyplot as plt

# Load data and keep the NIPAL3 rows (copy so a column can be added safely)
pfam = pd.read_csv('qpfam.tsv.gz', sep='\t', compression='gzip')
nipal3 = pfam[pfam['gene_name'] == 'NIPAL3'].copy()

# Create visualization
fig, axes = plt.subplots(1, 2, figsize=(14, 5))

# Plot 1: Pfam scores
ax1 = axes[0]
ax1.barh(nipal3['transcript_id'], nipal3['pfam_score'], color='steelblue')
ax1.set_xlabel('Pfam Score')
ax1.set_ylabel('Transcript ID')
ax1.set_title('NIPAL3 Domain Conservation')
ax1.axvline(x=0.5, color='red', linestyle='--', alpha=0.5)

# Plot 2: Domain states
ax2 = axes[1]
state_cols = ['perc_Damaged_State', 'perc_Lost_State']
# The NIPAL3 table shows these columns on a 0-1 scale; switch to 100
# if the file stores percentages instead.
total = 100 if nipal3[state_cols].to_numpy().max() > 1 else 1
nipal3['perc_Intact_State'] = total - nipal3[state_cols].sum(axis=1)
# Column order must match the colors and legend labels below
plot_cols = ['perc_Intact_State', 'perc_Damaged_State', 'perc_Lost_State']
nipal3.plot(x='transcript_id', y=plot_cols, kind='barh', stacked=True, ax=ax2,
            color=['green', 'orange', 'red'])
ax2.set_xlabel('Share of domains')
ax2.set_ylabel('Transcript ID')
ax2.set_title('NIPAL3 Domain States')
ax2.legend(['Intact', 'Damaged', 'Lost'])

plt.tight_layout()
plt.savefig('nipal3_pfam_analysis.png', dpi=300)
plt.show()

