Overview

The BasicPipeline class provides a streamlined workflow for single-cell RNA-seq analysis of cardiac tissue. It performs standard preprocessing, clustering, and visualization using Scanpy.

Inheritance: BasePipeline
Source: heartmap.pipelines.BasicPipeline (src/heartmap/pipelines/__init__.py:48)

Constructor

BasicPipeline(config: Config)
  • config (Config, required): Configuration object containing analysis parameters, including the clustering resolution.

Attributes

Inherited from BasePipeline:
  • config (Config): Configuration object
  • data_processor (DataProcessor): Data processing handler
  • visualizer (Visualizer): Visualization handler
  • exporter (ResultsExporter): Results export handler
  • results (Dict[str, Any]): Dictionary storing pipeline results

Methods

run()

Run the basic analysis pipeline, including data loading, clustering, and visualization.
def run(data_path: str, output_dir: Optional[str] = None) -> Dict[str, Any]
Parameters:
  • data_path (str, required): Path to the raw single-cell data file (10X format or similar).
  • output_dir (Optional[str]): Directory in which to save results and visualizations. If None, results are returned but not saved.
Returns:
  • Dict[str, Any]: Dictionary containing pipeline results:
      • adata (AnnData): Annotated data object with clustering results
      • results (Dict[str, Any]): Analysis results
          • cluster_labels (np.ndarray): Array of cluster assignments for each cell (from Leiden clustering)
Raises:
  • ImportError: If required dependencies (scanpy, pandas, numpy, matplotlib) are not available
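
The ImportError case can be anticipated with a standard-library check before calling run(). This is a minimal sketch, not HeartMAP's own check: the helper name missing_dependencies is hypothetical, and it only reports which of the listed packages cannot be located.

```python
from importlib.util import find_spec

# Packages this page lists as required by run().
REQUIRED = ("scanpy", "pandas", "numpy", "matplotlib")


def missing_dependencies(packages=REQUIRED):
    """Return the names in `packages` that cannot be located for import."""
    return [name for name in packages if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_dependencies()
    if missing:
        print("Install before running the pipeline:", ", ".join(missing))
    else:
        print("All required dependencies are available.")
```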
Pipeline Steps:
  1. Data Loading and Processing - Loads raw data using DataProcessor.process_from_raw()
  2. Neighborhood Graph - Computes PCA (40 components) and the k-nearest-neighbor graph (k=15) if they are not already present
  3. Cell Clustering - Performs Leiden clustering using the resolution from config
  4. Visualization - Generates UMAP plots and QC metric plots (if output_dir is provided)
  5. Results Export - Saves the annotated data as annotated_data.h5ad and exports the results (if output_dir is provided)
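
Steps 2 and 3 above map onto standard Scanpy calls. The following is a minimal sketch under stated assumptions, not HeartMAP's source: both function names are hypothetical, the presence checks assume Scanpy's default storage keys (X_pca in adata.obsm, neighbors in adata.uns), and the cluster key "leiden" is Scanpy's default rather than something this page confirms.

```python
def missing_graph_steps(adata):
    """List which parts of step 2 still need to be computed.

    Assumes Scanpy's default keys: PCA coordinates in adata.obsm["X_pca"]
    and neighbor-graph metadata in adata.uns["neighbors"].
    """
    steps = []
    if "X_pca" not in adata.obsm:
        steps.append("pca")
    if "neighbors" not in adata.uns:
        steps.append("neighbors")
    return steps


def cluster_cells(adata, resolution):
    """Run steps 2 and 3 on an AnnData object and return per-cell labels."""
    import scanpy as sc  # imported lazily so the helper above works without it

    todo = missing_graph_steps(adata)
    if "pca" in todo:
        sc.pp.pca(adata, n_comps=40)            # 40 components, as in step 2
    if "neighbors" in todo:
        sc.pp.neighbors(adata, n_neighbors=15)  # k=15, as in step 2
    # Step 3: Leiden clustering at the configured resolution.
    sc.tl.leiden(adata, resolution=resolution, key_added="leiden")
    return adata.obs["leiden"].to_numpy()
```

Calling cluster_cells(adata, resolution=0.8) on a preprocessed AnnData object would leave the labels in adata.obs["leiden"] and return them as an array.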

save_results()

Inherited from BasePipeline. Save pipeline results to disk.
def save_results(output_dir: str) -> None
  • output_dir (str, required): Directory path where results will be saved.

Usage Example

from heartmap.config import Config
from heartmap.pipelines import BasicPipeline

# Create configuration
config = Config(
    data_path="data/cardiac_tissue.h5ad",
    analysis={
        "resolution": 0.8
    }
)

# Initialize pipeline
pipeline = BasicPipeline(config)

# Run analysis
results = pipeline.run(
    data_path="data/raw/cardiac_10x",
    output_dir="results/basic_analysis"
)

# Access results
adata = results['adata']
cluster_labels = results['results']['cluster_labels']

print(f"Identified {len(set(cluster_labels))} cell clusters")
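
A natural follow-up is to check how many cells fall into each cluster. The sketch below uses only the standard library; cluster_sizes is a hypothetical helper, not part of HeartMAP, and it assumes cluster_labels is an iterable with one label per cell.

```python
from collections import Counter


def cluster_sizes(cluster_labels):
    """Map each cluster label (as a string) to its cell count, largest first."""
    counts = Counter(str(label) for label in cluster_labels)
    # Sort by descending count; sorted() is stable, so ties keep first-seen order.
    return dict(sorted(counts.items(), key=lambda item: item[1], reverse=True))


if __name__ == "__main__":
    # Toy labels standing in for results['results']['cluster_labels'].
    for label, n_cells in cluster_sizes(["0", "1", "0", "2", "0", "1"]).items():
        print(f"cluster {label}: {n_cells} cells")
```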

Output Files

When output_dir is specified, the pipeline generates:
  • annotated_data.h5ad - Processed AnnData object with cluster annotations
  • figures/umap_clusters.png - UMAP visualization colored by cluster
  • figures/qc_*.png - Quality control metric plots
  • Results exported via ResultsExporter
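
To confirm that a run produced the files listed above, a small standard-library check along these lines may help. It is a sketch only: check_outputs is a hypothetical helper, the QC plots are matched by the qc_*.png pattern because their exact names are not given here, and files written by ResultsExporter are not checked.

```python
from pathlib import Path


def check_outputs(output_dir):
    """Report which of the documented output files exist under output_dir."""
    root = Path(output_dir)
    figures = root / "figures"
    return {
        "annotated_data.h5ad": (root / "annotated_data.h5ad").is_file(),
        "figures/umap_clusters.png": (figures / "umap_clusters.png").is_file(),
        # At least one QC plot matching the documented pattern.
        "figures/qc_*.png": any(figures.glob("qc_*.png")),
    }


if __name__ == "__main__":
    for name, found in check_outputs("results/basic_analysis").items():
        print(f"{name}: {'found' if found else 'missing'}")
```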

Quickstart Guide

Get started with HeartMAP analysis

Config

Configuration object reference

DataProcessor

Data processing functionality