
Quick Start Guide

This guide will get you analyzing cardiac single-cell data with HeartMAP in just a few minutes. We’ll cover both command-line and Python API usage.
Prerequisites: Make sure you have installed HeartMAP before starting this guide.

30-Second Analysis (CLI)

The fastest way to analyze your heart data:
heartmap your_heart_data.h5ad
That’s it! HeartMAP will automatically:
  • Load your data
  • Perform quality control
  • Identify cell types
  • Generate visualizations
  • Save results to the results/ directory

CLI Options

For more control over the analysis:
# Comprehensive analysis with custom output
heartmap data/heart_data.h5ad \
    --analysis-type comprehensive \
    --output-dir results/comprehensive \
    --config my_config.yaml

# Specific analysis types
heartmap data/heart_data.h5ad --analysis-type annotation
heartmap data/heart_data.h5ad --analysis-type communication
heartmap data/heart_data.h5ad --analysis-type multi-chamber

# Memory-optimized for large datasets
heartmap data/large_dataset.h5ad \
    --analysis-type comprehensive \
    --config config_large.yaml
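
To run several analysis types over the same file from a script, the commands above can be assembled programmatically. The following is a minimal sketch that uses only the flags shown above (`--analysis-type`, `--output-dir`, `--config`). The helper names `build_heartmap_command` and `run_analyses`, and the one-directory-per-type output layout, are illustrative assumptions and not part of HeartMAP.

```python
import subprocess
from typing import List, Optional


def build_heartmap_command(
    data_path: str,
    analysis_type: str,
    output_dir: str,
    config_path: Optional[str] = None,
) -> List[str]:
    """Build the argument list for one heartmap CLI call (hypothetical helper)."""
    command = [
        "heartmap",
        data_path,
        "--analysis-type", analysis_type,
        "--output-dir", output_dir,
    ]
    if config_path is not None:
        command += ["--config", config_path]
    return command


def run_analyses(
    data_path: str,
    analysis_types: List[str],
    dry_run: bool = True,
) -> List[List[str]]:
    """Build one command per analysis type; execute them only if dry_run is False."""
    commands = []
    for analysis_type in analysis_types:
        # Assumed layout: a separate output directory for each analysis type.
        command = build_heartmap_command(
            data_path, analysis_type, f"results/{analysis_type}"
        )
        commands.append(command)
        if not dry_run:
            # check=True raises CalledProcessError if the CLI exits non-zero.
            subprocess.run(command, check=True)
    return commands


# Dry run: print the commands without executing them.
for cmd in run_analyses(
    "data/heart_data.h5ad", ["annotation", "communication", "multi-chamber"]
):
    print(" ".join(cmd))
```

Passing the arguments as a list rather than a single shell string avoids quoting problems with paths that contain spaces.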

2-Minute Python Analysis

For more flexibility and integration into your workflows:
1. Import HeartMAP

from heartmap import Config
from heartmap.pipelines import ComprehensivePipeline
2. Configure analysis

# Start from the default configuration
config = Config.default()

# Optionally customize it for your system
config.data.max_cells_subset = 50000  # Adjust based on available memory
config.data.max_genes_subset = 5000
config.analysis.resolution = 0.5
3. Run the pipeline

# Create pipeline
pipeline = ComprehensivePipeline(config)

# Run analysis
results = pipeline.run('your_data.h5ad', 'results/')

print("✅ Analysis complete! Check 'results/' directory.")

Complete Example

Here’s a complete working example:
complete_analysis.py
from heartmap import Config
from heartmap.pipelines import ComprehensivePipeline

# Load and customize configuration
config = Config.default()
config.data.max_cells_subset = 50000  # Optimize for your memory
config.data.max_genes_subset = 5000
config.analysis.resolution = 0.5

# Create and run pipeline
pipeline = ComprehensivePipeline(config)
results = pipeline.run('data/raw/heart_data.h5ad', 'results/')

# Access results
adata = results['adata']
analysis_results = results['results']

print(f"Analyzed {adata.n_obs} cells with {adata.n_vars} genes")
print(f"Identified {len(adata.obs['leiden'].unique())} clusters")
print("✅ Analysis complete!")

Pipeline Options

HeartMAP provides specialized pipelines for different analysis needs:
from heartmap import Config
from heartmap.pipelines import BasicPipeline

# Quality control and cell type annotation
config = Config.default()
pipeline = BasicPipeline(config)
results = pipeline.run('data/raw/heart_data.h5ad', 'results/basic/')

# Access cluster labels
cluster_labels = results['results']['cluster_labels']

Configuration

Customize HeartMAP behavior using the Config class:
# ModelConfig and PathConfig are used below, so they must be imported too
from heartmap.config import Config, DataConfig, AnalysisConfig, ModelConfig, PathConfig

# Create configuration from scratch
config = Config(
    data=DataConfig(
        min_genes=200,
        min_cells=3,
        max_cells_subset=50000,
        max_genes_subset=5000,
        random_seed=42
    ),
    analysis=AnalysisConfig(
        n_neighbors=10,
        n_pcs=40,
        resolution=0.5,
        use_leiden=True
    ),
    model=ModelConfig(
        save_intermediate=True,
        use_gpu=False
    ),
    paths=PathConfig(
        results_dir="my_results",
        figures_dir="my_figures"
    )
)
Or load from a YAML file:
config = Config.from_yaml('my_config.yaml')
Example config.yaml:
data:
  min_genes: 200
  min_cells: 3
  max_cells_subset: 50000
  max_genes_subset: 5000
  target_sum: 10000.0
  n_top_genes: 2000
  random_seed: 42

analysis:
  n_components_pca: 50
  n_neighbors: 10
  n_pcs: 40
  resolution: 0.5
  n_marker_genes: 25
  use_leiden: true

model:
  model_type: "comprehensive"
  save_intermediate: true
  use_gpu: false

paths:
  results_dir: "results"
  figures_dir: "figures"
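
Before launching a long run, it can help to sanity-check the analysis values. The sketch below is not part of HeartMAP: `check_analysis_settings` is a hypothetical helper that works on a plain dictionary mirroring the `analysis:` section above. The rules it applies are general single-cell conventions, assumed here rather than documented HeartMAP constraints: the neighbor graph cannot use more principal components than PCA computed, the neighbor count should be at least 2, and the clustering resolution should be positive.

```python
from typing import Dict, List, Union

Number = Union[int, float]


def check_analysis_settings(analysis: Dict[str, Number]) -> List[str]:
    """Return a list of human-readable problems; an empty list means no issue found."""
    problems: List[str] = []

    n_components = analysis.get("n_components_pca")
    n_pcs = analysis.get("n_pcs")
    if n_components is not None and n_pcs is not None and n_pcs > n_components:
        problems.append(
            f"n_pcs ({n_pcs}) exceeds n_components_pca ({n_components})"
        )

    n_neighbors = analysis.get("n_neighbors")
    if n_neighbors is not None and n_neighbors < 2:
        problems.append(f"n_neighbors ({n_neighbors}) should be at least 2")

    resolution = analysis.get("resolution")
    if resolution is not None and resolution <= 0:
        problems.append(f"resolution ({resolution}) should be positive")

    return problems


# Values copied from the example config.yaml above.
example = {
    "n_components_pca": 50,
    "n_neighbors": 10,
    "n_pcs": 40,
    "resolution": 0.5,
}
print(check_analysis_settings(example))  # → []
```

Whether HeartMAP performs equivalent validation itself is not stated in this guide, so treat this as an optional pre-flight check.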

Understanding the Output

After running an analysis, HeartMAP generates several outputs:

Directory Structure

results/
├── figures/
│   ├── umap_clusters.png          # UMAP visualization
│   ├── qc_metrics.png             # Quality control plots
│   ├── communication_heatmap.png  # Cell-cell communication
│   └── hub_scores.png             # Communication hubs
├── annotated_data.h5ad            # Processed data with annotations
├── heartmap_complete.h5ad         # Complete analysis results
└── results_summary.json           # Analysis summary
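
After a run, a quick way to confirm that everything was written is to compare the output directory against the tree above. This is a minimal sketch using only the standard library; `missing_outputs` is a hypothetical helper, and the file list is copied from the tree above on the assumption that a comprehensive run produces all of these files (other analysis types may write fewer).

```python
from pathlib import Path
from typing import List

# Relative paths taken from the directory structure shown above.
EXPECTED_OUTPUTS = [
    "figures/umap_clusters.png",
    "figures/qc_metrics.png",
    "figures/communication_heatmap.png",
    "figures/hub_scores.png",
    "annotated_data.h5ad",
    "heartmap_complete.h5ad",
    "results_summary.json",
]


def missing_outputs(results_dir: str) -> List[str]:
    """Return the expected files that do not exist under results_dir."""
    root = Path(results_dir)
    return [rel for rel in EXPECTED_OUTPUTS if not (root / rel).is_file()]


missing = missing_outputs("results")
if missing:
    print("Missing outputs:", ", ".join(missing))
else:
    print("All expected outputs are present.")
```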

Accessing Results

# Access the processed single-cell data
adata = results['adata']

# Cell metadata
print(adata.obs.head())  # Cell annotations
print(adata.obs['leiden'].value_counts())  # Cluster sizes

# Gene expression
print(adata.X.shape)  # Expression matrix
print(adata.var_names[:10])  # Gene names

# Embeddings
print(adata.obsm['X_pca'].shape)  # PCA
print(adata.obsm['X_umap'].shape)  # UMAP

Expected Output

When you run the comprehensive pipeline, you should see output similar to this:
=== Running Comprehensive HeartMAP Pipeline ===
1. Loading and processing data...
   Reading data from: your_data.h5ad
   Original dataset: 50000 cells × 20000 genes
   After filtering: 48532 cells × 18432 genes
   
Computing neighborhood graph...
   PCA computed: 50 components
   Neighbors computed: 10 neighbors
   
2. Performing comprehensive analysis...
   Clustering with Leiden algorithm...
   Identified 15 clusters
   
Computing UMAP...
   UMAP computed: 2 dimensions
   
3. Generating comprehensive visualizations...
   Creating UMAP plots...
   Creating QC metrics plots...
   Creating communication networks...
   Creating comprehensive dashboard...
   
4. Saving results...
   Saved processed data to: results/heartmap_complete.h5ad
   Saved figures to: results/figures/
   Saved summary to: results/results_summary.json
   
Comprehensive HeartMAP pipeline completed!
✅ Analysis complete! Check 'results/' directory.

Real-World Example

Here’s a complete example analyzing a heart dataset:
analyze_heart_data.py
import scanpy as sc
from heartmap import Config
from heartmap.pipelines import ComprehensivePipeline

# Configure for your system (16GB RAM)
config = Config.default()
config.data.max_cells_subset = 30000
config.data.max_genes_subset = 4000
config.analysis.resolution = 0.5

# Run comprehensive analysis
print("Starting HeartMAP analysis...")
pipeline = ComprehensivePipeline(config)
results = pipeline.run(
    data_path='data/heart_atlas.h5ad',
    output_dir='results/heart_analysis'
)

# Examine results
adata = results['adata']
print("\nAnalysis Summary:")
print(f"  Total cells: {adata.n_obs:,}")
print(f"  Total genes: {adata.n_vars:,}")
print(f"  Clusters identified: {len(adata.obs['leiden'].unique())}")
print("\nCluster sizes:")
print(adata.obs['leiden'].value_counts())

# Visualize key results
sc.pl.umap(adata, color='leiden', title='Cell Type Clusters', save='_clusters.png')
sc.pl.umap(adata, color='hub_score', title='Communication Hub Scores', save='_hubs.png')

print("\n✅ Analysis complete! Results saved to: results/heart_analysis/")

Memory Optimization Tips

For large datasets or limited RAM:
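Subsetting Strategy: HeartMAP can subset your data to fit available memory while preserving biological signal. The max_cells_subset and max_genes_subset settings cap how many cells and genes are analyzed; the values below are starting points for common RAM sizes.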
# For 8GB systems
config.data.max_cells_subset = 10000
config.data.max_genes_subset = 2000

# For 16GB systems
config.data.max_cells_subset = 30000
config.data.max_genes_subset = 4000

# For 32GB+ systems
config.data.max_cells_subset = 50000
config.data.max_genes_subset = 5000
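
The tiers above can be turned into a small lookup so that scripts pick the limits from the amount of RAM available. This is a sketch under stated assumptions: `pick_subset_limits` is a hypothetical helper, RAM sizes between tiers are rounded down to the nearest listed tier, and machines with less than 8 GB fall back to the smallest tier (the guide gives no values for that case).

```python
from typing import Tuple

# (minimum RAM in GB, max_cells_subset, max_genes_subset), largest tier first.
# Values are copied from the tiers listed above.
MEMORY_TIERS = [
    (32, 50000, 5000),
    (16, 30000, 4000),
    (8, 10000, 2000),
]


def pick_subset_limits(ram_gb: float) -> Tuple[int, int]:
    """Return (max_cells_subset, max_genes_subset) for the given RAM size."""
    for min_ram, max_cells, max_genes in MEMORY_TIERS:
        if ram_gb >= min_ram:
            return max_cells, max_genes
    # Assumption: below 8 GB, reuse the smallest documented tier.
    _, max_cells, max_genes = MEMORY_TIERS[-1]
    return max_cells, max_genes


max_cells, max_genes = pick_subset_limits(16)
print(max_cells, max_genes)  # → 30000 4000
```

The returned values can then be assigned to `config.data.max_cells_subset` and `config.data.max_genes_subset` as shown above.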

Next Steps

Now that you’ve completed your first analysis:

  • Tutorials: learn advanced analysis techniques
  • Configuration Guide: master HeartMAP configuration
  • API Reference: explore the complete Python API
  • Examples: see real-world use cases

Troubleshooting

Common Issues

If HeartMAP cannot find your input file, make sure the path is correct and the file exists:
import os
print(os.path.exists('your_data.h5ad'))  # Should print True
If the analysis runs out of memory, reduce the dataset size in your configuration:
config.data.max_cells_subset = 10000  # Reduce this
config.data.max_genes_subset = 2000   # Reduce this
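If the communication analysis fails because of missing dependencies, install the communication extras (the quotes keep shells such as zsh from interpreting the brackets):
pip install "heartmap[communication]"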
For communication or multi-chamber pipelines, first run the basic pipeline:
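from heartmap import Config
from heartmap.pipelines import BasicPipeline, AdvancedCommunicationPipeline

config = Config.default()

# Step 1: Basic annotation
basic = BasicPipeline(config)
basic.run('raw_data.h5ad', 'results/')

# Step 2: Use the annotated data written by step 1
comm = AdvancedCommunicationPipeline(config)
comm.run('results/annotated_data.h5ad', 'results/')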

Getting Help

If you run into issues:
