ComprehensivePipeline

Overview

The ComprehensivePipeline class provides an end-to-end analysis workflow that combines basic clustering, communication analysis, and multi-chamber analysis into a single pipeline. It is the recommended pipeline for complete cardiac tissue characterization.

Inheritance: BasePipeline

Source: heartmap.pipelines.ComprehensivePipeline (src/heartmap/pipelines/__init__.py:370)

Constructor

ComprehensivePipeline(config: Config)

config (Config, required): Configuration object containing all analysis parameters, including the resolution used for clustering.

Attributes

Inherited from BasePipeline:

config (Config): Configuration object
data_processor (DataProcessor): Data processing handler
visualizer (Visualizer): Visualization handler
exporter (ResultsExporter): Results export handler
results (Dict[str, Any]): Dictionary storing pipeline results

Methods

run()

Run the comprehensive HeartMAP analysis pipeline from raw data to final results.

def run(data_path: str, output_dir: Optional[str] = None) -> Dict[str, Any]

data_path (str, required): Path to the raw single-cell data file (10X format or similar).
output_dir (Optional[str], default None): Directory in which to save all results, visualizations, and the comprehensive report. If None, results are returned but not saved.

Returns:

Dict[str, Any]: Dictionary containing comprehensive pipeline results:

adata (AnnData): Fully annotated data object with all analysis results stored in .obs, .obsm, and .uns
results (Dict[str, Any]): Comprehensive analysis results organized by module
    annotation (Dict[str, Any]): Cell type annotation results
        cluster_labels (np.ndarray): Array of cluster assignments for each cell (from Leiden clustering)
    communication (Dict[str, Any]): Cell-cell communication analysis results
        hub_scores (pd.Series): Hub scores for each cell (index matches adata.obs.index)
    multi_chamber (Dict[str, Any]): Multi-chamber analysis results (chamber markers, correlations, etc.)
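
The nested layout above can be read defensively when some modules may not have produced output. The following is a minimal sketch that assumes only the key names listed above; `get_nested` is a hypothetical helper (not part of HeartMAP), and the sample dictionary is a stand-in for a real return value.

```python
from typing import Any, Dict, Sequence


def get_nested(results: Dict[str, Any], path: Sequence[str], default: Any = None) -> Any:
    """Follow `path` through nested dicts, returning `default` if a key is missing."""
    node: Any = results
    for key in path:
        if not isinstance(node, dict) or key not in node:
            return default
        node = node[key]
    return node


# Stand-in for the dictionary returned by run(). Real values would be an AnnData
# object, a NumPy array, and a pandas Series rather than these placeholders.
sample = {
    "adata": object(),
    "results": {
        "annotation": {"cluster_labels": ["0", "1", "0"]},
        "communication": {"hub_scores": {"cell_a": 0.2, "cell_b": 0.7}},
        "multi_chamber": {},
    },
}

labels = get_nested(sample, ["results", "annotation", "cluster_labels"])
# "trajectory" is a deliberately absent key, used to show the fallback.
missing = get_nested(sample, ["results", "trajectory", "pseudotime"], default="not computed")
```

Returning a default instead of raising KeyError keeps downstream reporting code short; use plain indexing instead if a missing module should be treated as an error.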

Raises:

ImportError: If required dependencies (scanpy, pandas, numpy, matplotlib) are not available
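
To surface a missing dependency before a long run starts, the packages named above can be checked up front. This sketch uses only the standard library; `missing_dependencies` is a hypothetical helper, not a HeartMAP function.

```python
import importlib.util
from typing import Iterable, List


def missing_dependencies(modules: Iterable[str]) -> List[str]:
    """Return the top-level module names in `modules` that cannot be found."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


# Packages named in the ImportError description above.
REQUIRED = ["scanpy", "pandas", "numpy", "matplotlib"]

absent = missing_dependencies(REQUIRED)
if absent:
    print("Install before running the pipeline: " + ", ".join(absent))
```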

Pipeline Steps:

Data Loading and Processing - Loads and preprocesses raw data using DataProcessor.process_from_raw()
Neighborhood Graph - Computes PCA (40 components) and k-nearest neighbors (k=15)
Dimensionality Reduction - Calculates UMAP for visualization
Cell Clustering - Performs Leiden clustering using the resolution from config
Communication Analysis - Calculates hub scores and communication patterns
Multi-Chamber Analysis - Identifies chamber-specific patterns
Comprehensive Visualization - Generates integrated dashboard with all visualizations
Report Generation - Creates comprehensive HTML/PDF report with all findings
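
The neighborhood graph, dimensionality reduction, and clustering steps above correspond to standard scanpy calls. The sketch below shows one plausible way to express them; it is not taken from the HeartMAP source. The function name `run_core_steps`, the `sc` parameter (which lets a stub replace scanpy), and the `"leiden"` column name are assumptions made for illustration.

```python
from typing import Any, Optional


def run_core_steps(adata: Any, resolution: float, n_pcs: int = 40,
                   n_neighbors: int = 15, sc: Optional[Any] = None) -> Any:
    """Apply PCA, a k-NN graph, UMAP, and Leiden clustering to `adata` in place."""
    if sc is None:
        # Imported lazily so the module loads without scanpy installed.
        # Leiden clustering additionally needs the leidenalg package.
        import scanpy as sc

    # Neighborhood Graph: PCA followed by k-nearest neighbors.
    sc.pp.pca(adata, n_comps=n_pcs)
    sc.pp.neighbors(adata, n_neighbors=n_neighbors, n_pcs=n_pcs)

    # Dimensionality Reduction: UMAP embedding stored in adata.obsm.
    sc.tl.umap(adata)

    # Cell Clustering: labels written to adata.obs["leiden"] (assumed column name).
    sc.tl.leiden(adata, resolution=resolution, key_added="leiden")
    return adata
```

With scanpy installed, `run_core_steps(adata, resolution=0.8)` would run the four calls on a preprocessed AnnData object. The defaults of 40 components and 15 neighbors mirror the values stated in the step list.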

save_results()

Inherited from BasePipeline. Save pipeline results to disk.

def save_results(output_dir: str) -> None

output_dir (str, required): Directory path where results will be saved.

Usage Example

from heartmap.config import Config
from heartmap.pipelines import ComprehensivePipeline

# Create configuration with all parameters
config = Config(
    data_path="data/cardiac_tissue.h5ad",
    analysis={
        "resolution": 0.8,
        "n_neighbors": 15,
        "n_pcs": 40
    }
)

# Initialize comprehensive pipeline
pipeline = ComprehensivePipeline(config)

# Run complete analysis from raw data
results = pipeline.run(
    data_path="data/raw/cardiac_10x",
    output_dir="results/comprehensive"
)

# Access all results
adata = results['adata']

# Cell type annotations
cluster_labels = results['results']['annotation']['cluster_labels']
print(f"Identified {len(set(cluster_labels))} cell clusters")

# Communication analysis
hub_scores = results['results']['communication']['hub_scores']
print(f"Mean hub score: {hub_scores.mean():.3f}")

# Multi-chamber results
multi_chamber = results['results']['multi_chamber']
print(f"Multi-chamber analysis: {list(multi_chamber.keys())}")

# The AnnData object contains all annotations
print(f"\nadata.obs columns: {list(adata.obs.columns)}")
print(f"adata.obsm keys: {list(adata.obsm.keys())}")
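
As a follow-up to the example above, cluster labels and hub scores can be summarized with the standard library alone. This is a sketch: `cluster_sizes` and `top_hubs` are hypothetical helpers, and the literal values are placeholders rather than real output.

```python
from collections import Counter
from typing import Any, Dict, Hashable, Iterable, List, Tuple


def cluster_sizes(cluster_labels: Iterable[Hashable]) -> Dict[Hashable, int]:
    """Count cells per cluster, largest cluster first."""
    return dict(Counter(cluster_labels).most_common())


def top_hubs(hub_scores: Any, n: int = 5) -> List[Tuple[Hashable, float]]:
    """Return the `n` highest-scoring (cell, score) pairs.

    Accepts anything exposing .items(), which includes dict and pandas.Series.
    """
    return sorted(hub_scores.items(), key=lambda kv: kv[1], reverse=True)[:n]


# Placeholder inputs. In practice pass
# results['results']['annotation']['cluster_labels'] and
# results['results']['communication']['hub_scores'].
sizes = cluster_sizes(["0", "1", "0", "2", "0", "1"])
hubs = top_hubs({"cell_a": 0.12, "cell_b": 0.91, "cell_c": 0.47}, n=2)
```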

Output Files

When output_dir is specified, the pipeline generates:

Data Files

heartmap_complete.h5ad - Fully annotated AnnData object with all results
Results exported via ResultsExporter

Visualizations

figures/comprehensive_dashboard.png - Integrated multi-panel dashboard
figures/umap_clusters.png - UMAP colored by cluster
figures/qc_*.png - Quality control metrics
figures/communication_*.png - Communication analysis plots
figures/chamber_*.png - Multi-chamber analysis plots

Reports

Comprehensive HTML/PDF report generated by ResultsExporter.generate_comprehensive_report()
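
After a run, the output directory can be checked against the file names and patterns listed above. The sketch below is a hypothetical helper, not part of HeartMAP. It covers only the data file and figures, because the report's file name is not stated here.

```python
from pathlib import Path
from typing import Dict, List

# Paths and glob patterns copied from the lists above, relative to output_dir.
EXPECTED_PATTERNS = [
    "heartmap_complete.h5ad",
    "figures/comprehensive_dashboard.png",
    "figures/umap_clusters.png",
    "figures/qc_*.png",
    "figures/communication_*.png",
    "figures/chamber_*.png",
]


def find_outputs(output_dir: str) -> Dict[str, List[str]]:
    """Map each expected pattern to the sorted relative paths that match it."""
    root = Path(output_dir)
    return {
        pattern: sorted(p.relative_to(root).as_posix() for p in root.glob(pattern))
        for pattern in EXPECTED_PATTERNS
    }


def missing_outputs(output_dir: str) -> List[str]:
    """Return the expected patterns that matched no file."""
    return [pattern for pattern, hits in find_outputs(output_dir).items() if not hits]
```

For example, `missing_outputs("results/comprehensive")` returns an empty list when every listed file or pattern has at least one match.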

Comparison with Other Pipelines

Feature	BasicPipeline	AdvancedCommunicationPipeline	MultiChamberPipeline	ComprehensivePipeline
Input	Raw data	Annotated data	H5AD file	Raw data
Clustering	✓	-	-	✓
Communication	-	✓	-	✓
Multi-chamber	-	-	✓	✓
Dashboard	-	-	-	✓
Report	-	-	-	✓

Best Practices

Use for complete analysis: The ComprehensivePipeline is ideal when you want all analyses in one workflow
Check output_dir size: Comprehensive analysis generates many files; ensure adequate disk space
Review configuration: A single Config object drives every stage, so check settings such as the clustering resolution before starting a long run
Inspect the dashboard: The comprehensive dashboard provides an overview of all results
Read the report: The generated report summarizes all findings with interpretations
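
The disk-space advice above can be automated with a short pre-flight check. This sketch uses only the standard library; `free_bytes` and `has_room` are hypothetical helpers, and the 5 GiB threshold is an arbitrary illustration, not a HeartMAP requirement.

```python
import shutil
from pathlib import Path


def free_bytes(path: str) -> int:
    """Free space on the filesystem holding `path` (or its nearest existing parent)."""
    probe = Path(path).resolve()
    while not probe.exists():
        probe = probe.parent
    return shutil.disk_usage(probe).free


def has_room(path: str, required_bytes: int) -> bool:
    """True if at least `required_bytes` are free where `path` will live."""
    return free_bytes(path) >= required_bytes


# Illustrative threshold only; size it to your own dataset.
if not has_room("results/comprehensive", 5 * 1024**3):
    print("Less than 5 GiB free for results/comprehensive")
```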

Related Documentation

BasicPipeline: Basic clustering component of comprehensive analysis
AdvancedCommunicationPipeline: Communication analysis component
MultiChamberPipeline: Multi-chamber analysis component
Config: Configuration object reference
Complete Analysis Guide: Detailed guide on running comprehensive analysis
