Quick Start Guide

This guide will get you analyzing cardiac single-cell data with HeartMAP in just a few minutes. We’ll cover both command-line and Python API usage.

Prerequisites: Make sure you have installed HeartMAP before starting this guide.

30-Second Analysis (CLI)

The fastest way to analyze your heart data:

heartmap your_heart_data.h5ad

That’s it! HeartMAP will automatically:

Load your data
Perform quality control
Identify cell types
Generate visualizations
Save results to the results/ directory

CLI Options

For more control over the analysis:

# Comprehensive analysis with custom output
heartmap data/heart_data.h5ad \
    --analysis-type comprehensive \
    --output-dir results/comprehensive \
    --config my_config.yaml

# Specific analysis types
heartmap data/heart_data.h5ad --analysis-type annotation
heartmap data/heart_data.h5ad --analysis-type communication
heartmap data/heart_data.h5ad --analysis-type multi-chamber

# Memory-optimized for large datasets
heartmap data/large_dataset.h5ad \
    --analysis-type comprehensive \
    --config config_large.yaml
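
If you launch these commands from a script, it can help to assemble the argument list in one place. The helper below is a minimal sketch and not part of HeartMAP: `build_heartmap_command` and `run_heartmap` are hypothetical names, and the sketch uses only the flags shown above (`--analysis-type`, `--output-dir`, `--config`).

```python
import subprocess
from typing import List, Optional


def build_heartmap_command(
    data_path: str,
    analysis_type: Optional[str] = None,
    output_dir: Optional[str] = None,
    config_path: Optional[str] = None,
) -> List[str]:
    """Assemble a `heartmap` CLI call as an argument list (hypothetical helper)."""
    cmd = ["heartmap", data_path]
    # Only add a flag when the caller supplied a value for it.
    if analysis_type is not None:
        cmd += ["--analysis-type", analysis_type]
    if output_dir is not None:
        cmd += ["--output-dir", output_dir]
    if config_path is not None:
        cmd += ["--config", config_path]
    return cmd


def run_heartmap(cmd: List[str]) -> int:
    """Run the command and return its exit code. Requires HeartMAP to be installed."""
    return subprocess.run(cmd, check=False).returncode


cmd = build_heartmap_command(
    "data/heart_data.h5ad",
    analysis_type="comprehensive",
    output_dir="results/comprehensive",
)
print(" ".join(cmd))
```

Passing the command as a list avoids shell quoting problems with paths that contain spaces. Call `run_heartmap(cmd)` once the command looks right.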

2-Minute Python Analysis

For more flexibility and integration into your workflows:

Import HeartMAP

from heartmap import Config
from heartmap.pipelines import ComprehensivePipeline

Configure analysis

# Use default configuration
config = Config.default()

# Or customize for your system
config = Config.default()
config.data.max_cells_subset = 50000  # Adjust based on available memory
config.data.max_genes_subset = 5000
config.analysis.resolution = 0.5

Run the pipeline

# Create pipeline
pipeline = ComprehensivePipeline(config)

# Run analysis
results = pipeline.run('your_data.h5ad', 'results/')

print("✅ Analysis complete! Check 'results/' directory.")

Complete Example

Here’s a complete working example:

complete_analysis.py

from heartmap import Config
from heartmap.pipelines import ComprehensivePipeline

# Load and customize configuration
config = Config.default()
config.data.max_cells_subset = 50000  # Optimize for your memory
config.data.max_genes_subset = 5000
config.analysis.resolution = 0.5

# Create and run pipeline
pipeline = ComprehensivePipeline(config)
results = pipeline.run('data/raw/heart_data.h5ad', 'results/')

# Access results
adata = results['adata']
analysis_results = results['results']

print(f"Analyzed {adata.n_obs} cells with {adata.n_vars} genes")
print(f"Identified {len(adata.obs['leiden'].unique())} clusters")
print("✅ Analysis complete!")

Pipeline Options

HeartMAP provides specialized pipelines for different analysis needs. For example, BasicPipeline covers quality control and cell type annotation:

from heartmap import Config
from heartmap.pipelines import BasicPipeline

# Quality control and cell type annotation
config = Config.default()
pipeline = BasicPipeline(config)
results = pipeline.run('data/raw/heart_data.h5ad', 'results/basic/')

# Access cluster labels
cluster_labels = results['results']['cluster_labels']

Configuration

Customize HeartMAP behavior using the Config class:

from heartmap.config import Config, DataConfig, AnalysisConfig, ModelConfig, PathConfig

# Create configuration from scratch
config = Config(
    data=DataConfig(
        min_genes=200,
        min_cells=3,
        max_cells_subset=50000,
        max_genes_subset=5000,
        random_seed=42
    ),
    analysis=AnalysisConfig(
        n_neighbors=10,
        n_pcs=40,
        resolution=0.5,
        use_leiden=True
    ),
    model=ModelConfig(
        save_intermediate=True,
        use_gpu=False
    ),
    paths=PathConfig(
        results_dir="my_results",
        figures_dir="my_figures"
    )
)

Or load from a YAML file:

config = Config.from_yaml('my_config.yaml')

Example config.yaml:

data:
  min_genes: 200
  min_cells: 3
  max_cells_subset: 50000
  max_genes_subset: 5000
  target_sum: 10000.0
  n_top_genes: 2000
  random_seed: 42

analysis:
  n_components_pca: 50
  n_neighbors: 10
  n_pcs: 40
  resolution: 0.5
  n_marker_genes: 25
  use_leiden: true

model:
  model_type: "comprehensive"
  save_intermediate: true
  use_gpu: false

paths:
  results_dir: "results"
  figures_dir: "figures"
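
Before a long run, a quick sanity check of the configuration values can catch typos early. The following is a sketch under stated assumptions, not HeartMAP's own validation: `check_config` is a hypothetical helper, the dictionary mirrors part of the example YAML above, and the rule that `n_pcs` should not exceed `n_components_pca` assumes the neighbor graph is built from the computed PCA components.

```python
from typing import Any, Dict, List

# A subset of the values from the example config.yaml above.
config: Dict[str, Dict[str, Any]] = {
    "data": {
        "min_genes": 200,
        "min_cells": 3,
        "max_cells_subset": 50000,
        "max_genes_subset": 5000,
    },
    "analysis": {
        "n_components_pca": 50,
        "n_neighbors": 10,
        "n_pcs": 40,
        "resolution": 0.5,
    },
}

# Keys that are expected to hold positive integers (assumption).
POSITIVE_INTS = {
    "data": ["min_genes", "min_cells", "max_cells_subset", "max_genes_subset"],
    "analysis": ["n_components_pca", "n_neighbors", "n_pcs"],
}


def check_config(cfg: Dict[str, Dict[str, Any]]) -> List[str]:
    """Return a list of human-readable problems; an empty list means no issue found."""
    problems: List[str] = []
    for section, keys in POSITIVE_INTS.items():
        for key in keys:
            value = cfg.get(section, {}).get(key)
            if value is not None and (not isinstance(value, int) or value <= 0):
                problems.append(f"{section}.{key} must be a positive integer, got {value!r}")

    analysis = cfg.get("analysis", {})
    resolution = analysis.get("resolution")
    if resolution is not None and resolution <= 0:
        problems.append(f"analysis.resolution must be positive, got {resolution!r}")

    # Assumption: the neighbor graph cannot use more PCs than were computed.
    n_pcs = analysis.get("n_pcs")
    n_components = analysis.get("n_components_pca")
    if isinstance(n_pcs, int) and isinstance(n_components, int) and n_pcs > n_components:
        problems.append(f"analysis.n_pcs ({n_pcs}) exceeds analysis.n_components_pca ({n_components})")
    return problems


print(check_config(config))
```

If PyYAML is installed, the dictionary could instead come from `yaml.safe_load` applied to your config file.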

Understanding the Output

After running an analysis, HeartMAP generates several outputs:

Directory Structure

results/
├── figures/
│   ├── umap_clusters.png          # UMAP visualization
│   ├── qc_metrics.png             # Quality control plots
│   ├── communication_heatmap.png  # Cell-cell communication
│   └── hub_scores.png             # Communication hubs
├── annotated_data.h5ad            # Processed data with annotations
├── heartmap_complete.h5ad         # Complete analysis results
└── results_summary.json           # Analysis summary
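
To confirm that a run produced the files listed above, a small check along these lines may help. It is a sketch only: `missing_outputs` and `load_summary` are hypothetical helpers, the file names are copied from the tree above, and which files actually appear may depend on the analysis type you ran.

```python
import json
from pathlib import Path
from typing import Any, List

# File names taken from the directory tree above.
EXPECTED_OUTPUTS = [
    "figures/umap_clusters.png",
    "figures/qc_metrics.png",
    "figures/communication_heatmap.png",
    "figures/hub_scores.png",
    "annotated_data.h5ad",
    "heartmap_complete.h5ad",
    "results_summary.json",
]


def missing_outputs(results_dir: str) -> List[str]:
    """Return the expected output files that are not present under results_dir."""
    root = Path(results_dir)
    return [name for name in EXPECTED_OUTPUTS if not (root / name).is_file()]


def load_summary(results_dir: str) -> Any:
    """Parse results_summary.json; its schema is not documented in this guide."""
    with open(Path(results_dir) / "results_summary.json", encoding="utf-8") as fh:
        return json.load(fh)


missing = missing_outputs("results")
print(f"{len(missing)} expected file(s) missing from results/")
```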

Accessing Results

# Access the processed single-cell data
adata = results['adata']

# Cell metadata
print(adata.obs.head())  # Cell annotations
print(adata.obs['leiden'].value_counts())  # Cluster sizes

# Gene expression
print(adata.X.shape)  # Expression matrix
print(adata.var_names[:10])  # Gene names

# Embeddings
print(adata.obsm['X_pca'].shape)  # PCA
print(adata.obsm['X_umap'].shape)  # UMAP

Expected Output

When you run the comprehensive pipeline, you should see output similar to this:

=== Running Comprehensive HeartMAP Pipeline ===
1. Loading and processing data...
   Reading data from: your_data.h5ad
   Original dataset: 50000 cells × 20000 genes
   After filtering: 48532 cells × 18432 genes
   
Computing neighborhood graph...
   PCA computed: 50 components
   Neighbors computed: 10 neighbors
   
2. Performing comprehensive analysis...
   Clustering with Leiden algorithm...
   Identified 15 clusters
   
Computing UMAP...
   UMAP computed: 2 dimensions
   
3. Generating comprehensive visualizations...
   Creating UMAP plots...
   Creating QC metrics plots...
   Creating communication networks...
   Creating comprehensive dashboard...
   
4. Saving results...
   Saved processed data to: results/heartmap_complete.h5ad
   Saved figures to: results/figures/
   Saved summary to: results/results_summary.json
   
Comprehensive HeartMAP pipeline completed!
✅ Analysis complete! Check 'results/' directory.

Real-World Example

Here’s a complete example analyzing a heart dataset:

analyze_heart_data.py

import scanpy as sc
from heartmap import Config
from heartmap.pipelines import ComprehensivePipeline

# Configure for your system (16GB RAM)
config = Config.default()
config.data.max_cells_subset = 30000
config.data.max_genes_subset = 4000
config.analysis.resolution = 0.5

# Run comprehensive analysis
print("Starting HeartMAP analysis...")
pipeline = ComprehensivePipeline(config)
results = pipeline.run(
    data_path='data/heart_atlas.h5ad',
    output_dir='results/heart_analysis'
)

# Examine results
adata = results['adata']
print("\nAnalysis Summary:")
print(f"  Total cells: {adata.n_obs:,}")
print(f"  Total genes: {adata.n_vars:,}")
print(f"  Clusters identified: {adata.obs['leiden'].nunique()}")
print("\nCluster sizes:")
print(adata.obs['leiden'].value_counts())

# Visualize key results
sc.pl.umap(adata, color='leiden', title='Cell Type Clusters', save='_clusters.png')
sc.pl.umap(adata, color='hub_score', title='Communication Hub Scores', save='_hubs.png')

print("\n✅ Analysis complete! Results saved to: results/heart_analysis/")

Memory Optimization Tips

For large datasets or limited RAM:

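Subsetting Strategy: HeartMAP can subset your data to fit available memory while preserving biological signal. Set the caps with max_cells_subset and max_genes_subset; the values below are starting points for common RAM sizes.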

# For 8GB systems
config.data.max_cells_subset = 10000
config.data.max_genes_subset = 2000

# For 16GB systems
config.data.max_cells_subset = 30000
config.data.max_genes_subset = 4000

# For 32GB+ systems
config.data.max_cells_subset = 50000
config.data.max_genes_subset = 5000
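
The tiers above can be wrapped in a small lookup so that scripts pick limits from the machine's RAM. This is a sketch, not a HeartMAP feature: `suggest_subset_limits` is a hypothetical helper, it treats each tier as a lower bound (for example, 16 GB up to just under 32 GB maps to the 16GB values), and it falls back to the 8GB values for smaller machines, which this guide does not cover.

```python
from typing import Tuple

# (minimum RAM in GB, max_cells_subset, max_genes_subset), from the tiers above.
TIERS = [
    (32, 50000, 5000),
    (16, 30000, 4000),
    (8, 10000, 2000),
]


def suggest_subset_limits(ram_gb: float) -> Tuple[int, int]:
    """Return (max_cells_subset, max_genes_subset) for the given amount of RAM."""
    for min_ram, cells, genes in TIERS:
        if ram_gb >= min_ram:
            return cells, genes
    # Assumption: below 8 GB, reuse the smallest documented tier.
    return TIERS[-1][1], TIERS[-1][2]


cells, genes = suggest_subset_limits(16)
print(cells, genes)
# With a HeartMAP config object, the values would be applied as:
# config.data.max_cells_subset, config.data.max_genes_subset = cells, genes
```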

Next Steps

Now that you’ve completed your first analysis:

Tutorials: Learn advanced analysis techniques
Configuration Guide: Master HeartMAP configuration
API Reference: Explore the complete Python API
Examples: See real-world use cases

Troubleshooting

Common Issues

FileNotFoundError: Data file not found

Make sure your data file path is correct and the file exists:

import os
print(os.path.exists('your_data.h5ad'))  # Should print True

MemoryError: Out of memory

Reduce the dataset size in your configuration:

config.data.max_cells_subset = 10000  # Reduce this
config.data.max_genes_subset = 2000   # Reduce this
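
Rather than guessing smaller limits by hand, one option is to retry with halved limits until the run fits. The sketch below shows the idea with a generic callback; `run_with_backoff` and its floor values are hypothetical, and in practice the operating system may kill a process before Python raises MemoryError, so treat this as a convenience rather than a guarantee.

```python
from typing import Callable, Tuple, TypeVar

T = TypeVar("T")


def run_with_backoff(
    run: Callable[[int, int], T],
    max_cells: int,
    max_genes: int,
    min_cells: int = 1000,
    min_genes: int = 500,
) -> Tuple[T, int, int]:
    """Call run(cells, genes), halving both limits after each MemoryError.

    Returns the result together with the limits that succeeded. Re-raises
    MemoryError once both limits have reached their floors.
    """
    cells, genes = max_cells, max_genes
    while True:
        try:
            return run(cells, genes), cells, genes
        except MemoryError:
            if cells <= min_cells and genes <= min_genes:
                raise
            cells = max(min_cells, cells // 2)
            genes = max(min_genes, genes // 2)


# Example callback wired to the pipeline from this guide (not executed here):
# def run(cells, genes):
#     config.data.max_cells_subset = cells
#     config.data.max_genes_subset = genes
#     return ComprehensivePipeline(config).run('your_data.h5ad', 'results/')
#
# results, cells, genes = run_with_backoff(run, 50000, 5000)
```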

ImportError: No module named 'liana'

Install the communication extras:

pip install "heartmap[communication]"

ValueError: Input data must have cell type annotations

For communication or multi-chamber pipelines, first run the basic pipeline:

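from heartmap import Config
# Assumption: AdvancedCommunicationPipeline is importable from the same module as BasicPipeline
from heartmap.pipelines import BasicPipeline, AdvancedCommunicationPipeline

config = Config.default()

# Step 1: Basic annotation
basic = BasicPipeline(config)
basic.run('raw_data.h5ad', 'results/')

# Step 2: Use annotated data
comm = AdvancedCommunicationPipeline(config)
comm.run('results/annotated_data.h5ad', 'results/')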

Getting Help

If you run into issues:

Check the API Reference for detailed documentation
Browse Examples for more use cases
Visit GitHub Discussions for community support
Report bugs on GitHub Issues
