Quick Start Guide
This guide will get you analyzing cardiac single-cell data with HeartMAP in just a few minutes. We’ll cover both command-line and Python API usage.
30-Second Analysis (CLI)
The fastest way to analyze your heart data:
heartmap your_heart_data.h5ad
That’s it! HeartMAP will automatically:
Load your data
Perform quality control
Identify cell types
Generate visualizations
Save results to the results/ directory
CLI Options
For more control over the analysis:
# Comprehensive analysis with custom output
heartmap data/heart_data.h5ad \
    --analysis-type comprehensive \
    --output-dir results/comprehensive \
    --config my_config.yaml
# Specific analysis types
heartmap data/heart_data.h5ad --analysis-type annotation
heartmap data/heart_data.h5ad --analysis-type communication
heartmap data/heart_data.h5ad --analysis-type multi-chamber
# Memory-optimized for large datasets
heartmap data/large_dataset.h5ad \
    --analysis-type comprehensive \
    --config config_large.yaml
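When several datasets need the same CLI call, the command can be assembled from Python. The following is a minimal sketch that uses only the flags shown above (--analysis-type, --output-dir, --config); the helper build_heartmap_command and the dataset paths are hypothetical and not part of HeartMAP.

```python
import shlex
from pathlib import Path


def build_heartmap_command(data_path, analysis_type="comprehensive",
                           output_dir=None, config_path=None):
    """Build the argument list for one `heartmap` CLI call.

    Hypothetical helper: it only strings together the flags shown in this guide.
    """
    cmd = ["heartmap", str(data_path), "--analysis-type", analysis_type]
    if output_dir is not None:
        cmd += ["--output-dir", str(output_dir)]
    if config_path is not None:
        cmd += ["--config", str(config_path)]
    return cmd


# Hypothetical input files; replace with your own paths.
datasets = [Path("data/heart_data.h5ad"), Path("data/large_dataset.h5ad")]

for path in datasets:
    # One output directory per dataset, named after the input file.
    cmd = build_heartmap_command(path, output_dir=Path("results") / path.stem)
    print(shlex.join(cmd))
    # To actually run the analysis, add `import subprocess` and uncomment:
    # subprocess.run(cmd, check=True)
```

Printing the command first makes it easy to confirm the paths before starting a long-running analysis.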
2-Minute Python Analysis
For more flexibility and integration into your workflows:
Import HeartMAP
from heartmap import Config
from heartmap.pipelines import ComprehensivePipeline
Configure analysis
# Use default configuration
config = Config.default()
# Or customize for your system
config = Config.default()
config.data.max_cells_subset = 50000 # Adjust based on available memory
config.data.max_genes_subset = 5000
config.analysis.resolution = 0.5
Run the pipeline
# Create pipeline
pipeline = ComprehensivePipeline(config)
# Run analysis
results = pipeline.run('your_data.h5ad', 'results/')
print("✅ Analysis complete! Check 'results/' directory.")
Complete Example
Here’s a complete working example:
from heartmap import Config
from heartmap.pipelines import ComprehensivePipeline
# Load and customize configuration
config = Config.default()
config.data.max_cells_subset = 50000 # Optimize for your memory
config.data.max_genes_subset = 5000
config.analysis.resolution = 0.5
# Create and run pipeline
pipeline = ComprehensivePipeline(config)
results = pipeline.run('data/raw/heart_data.h5ad', 'results/')
# Access results
adata = results['adata']
analysis_results = results['results']
print(f"Analyzed {adata.n_obs} cells with {adata.n_vars} genes")
print(f"Identified {len(adata.obs['leiden'].unique())} clusters")
print("✅ Analysis complete!")
Pipeline Options
HeartMAP provides specialized pipelines for different analysis needs:
Basic Pipeline
Communication Pipeline
Multi-Chamber Pipeline
Comprehensive Pipeline
from heartmap import Config
from heartmap.pipelines import BasicPipeline
# Quality control and cell type annotation
config = Config.default()
pipeline = BasicPipeline(config)
results = pipeline.run('data/raw/heart_data.h5ad', 'results/basic/')
# Access cluster labels
cluster_labels = results['results']['cluster_labels']
Configuration
Customize HeartMAP behavior using the Config class:
from heartmap.config import Config, DataConfig, AnalysisConfig, ModelConfig, PathConfig
# Create configuration from scratch
config = Config(
    data=DataConfig(
        min_genes=200,
        min_cells=3,
        max_cells_subset=50000,
        max_genes_subset=5000,
        random_seed=42
    ),
    analysis=AnalysisConfig(
        n_neighbors=10,
        n_pcs=40,
        resolution=0.5,
        use_leiden=True
    ),
    model=ModelConfig(
        save_intermediate=True,
        use_gpu=False
    ),
    paths=PathConfig(
        results_dir="my_results",
        figures_dir="my_figures"
    )
)
Or load from a YAML file:
config = Config.from_yaml('my_config.yaml')
Example config.yaml:
data:
  min_genes: 200
  min_cells: 3
  max_cells_subset: 50000
  max_genes_subset: 5000
  target_sum: 10000.0
  n_top_genes: 2000
  random_seed: 42
analysis:
  n_components_pca: 50
  n_neighbors: 10
  n_pcs: 40
  resolution: 0.5
  n_marker_genes: 25
  use_leiden: true
model:
  model_type: "comprehensive"
  save_intermediate: true
  use_gpu: false
paths:
  results_dir: "results"
  figures_dir: "figures"
Understanding the Output
After running an analysis, HeartMAP generates several outputs:
Directory Structure
results/
├── figures/
│   ├── umap_clusters.png          # UMAP visualization
│   ├── qc_metrics.png             # Quality control plots
│   ├── communication_heatmap.png  # Cell-cell communication
│   └── hub_scores.png             # Communication hubs
├── annotated_data.h5ad            # Processed data with annotations
├── heartmap_complete.h5ad         # Complete analysis results
└── results_summary.json           # Analysis summary
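To confirm that a run produced the files listed above, the output directory can be inspected with the standard library alone. This is a sketch under the assumption that the layout matches the tree shown here; summarize_results is a hypothetical helper, and because the contents of results_summary.json are not documented in this guide, the code only loads the file and reports its top-level keys.

```python
import json
from pathlib import Path


def summarize_results(results_dir):
    """List figures and .h5ad files, and load results_summary.json if present.

    Hypothetical helper; it assumes the directory layout shown in this guide.
    """
    results_dir = Path(results_dir)
    figures = sorted(p.name for p in (results_dir / "figures").glob("*.png"))
    h5ad_files = sorted(p.name for p in results_dir.glob("*.h5ad"))

    summary = None
    summary_path = results_dir / "results_summary.json"
    if summary_path.exists():
        with summary_path.open() as fh:
            summary = json.load(fh)

    return {"figures": figures, "h5ad_files": h5ad_files, "summary": summary}


report = summarize_results("results")
print("Figures:", report["figures"])
print("Data files:", report["h5ad_files"])
if isinstance(report["summary"], dict):
    # The schema is not assumed; only the top-level keys are shown.
    print("Summary keys:", sorted(report["summary"]))
```

An empty figures list or a missing summary usually means the run stopped early or wrote to a different output directory.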
Accessing Results
AnnData Object
# Access the processed single-cell data
adata = results['adata']
# Cell metadata
print(adata.obs.head())  # Cell annotations
print(adata.obs['leiden'].value_counts())  # Cluster sizes
# Gene expression
print(adata.X.shape)  # Expression matrix
print(adata.var_names[:10])  # Gene names
# Embeddings
print(adata.obsm['X_pca'].shape)  # PCA
print(adata.obsm['X_umap'].shape)  # UMAP
Expected Output
When you run the comprehensive pipeline, you should see output similar to this:
=== Running Comprehensive HeartMAP Pipeline ===
1. Loading and processing data...
Reading data from: your_data.h5ad
Original dataset: 50000 cells × 20000 genes
After filtering: 48532 cells × 18432 genes
Computing neighborhood graph...
PCA computed: 50 components
Neighbors computed: 10 neighbors
2. Performing comprehensive analysis...
Clustering with Leiden algorithm...
Identified 15 clusters
Computing UMAP...
UMAP computed: 2 dimensions
3. Generating comprehensive visualizations...
Creating UMAP plots...
Creating QC metrics plots...
Creating communication networks...
Creating comprehensive dashboard...
4. Saving results...
Saved processed data to: results/heartmap_complete.h5ad
Saved figures to: results/figures/
Saved summary to: results/results_summary.json
Comprehensive HeartMAP pipeline completed!
✅ Analysis complete! Check 'results/' directory.
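The "Original dataset" and "After filtering" lines are a quick way to see how much data quality control removed. The sketch below parses those two lines from the sample log above and computes the retained fractions; the helper filtering_retention is hypothetical, and it assumes the log lines keep exactly the wording shown in this guide.

```python
import re

# Two lines copied from the sample output in this guide.
LOG_EXCERPT = """\
Original dataset: 50000 cells × 20000 genes
After filtering: 48532 cells × 18432 genes
"""

SHAPE_PATTERN = re.compile(
    r"(Original dataset|After filtering): (\d+) cells × (\d+) genes"
)


def filtering_retention(log_text):
    """Return (cell_fraction, gene_fraction) kept after filtering."""
    shapes = {
        match.group(1): (int(match.group(2)), int(match.group(3)))
        for match in SHAPE_PATTERN.finditer(log_text)
    }
    cells_before, genes_before = shapes["Original dataset"]
    cells_after, genes_after = shapes["After filtering"]
    return cells_after / cells_before, genes_after / genes_before


cell_fraction, gene_fraction = filtering_retention(LOG_EXCERPT)
print(f"Cells retained: {cell_fraction:.1%}")  # Cells retained: 97.1%
print(f"Genes retained: {gene_fraction:.1%}")  # Genes retained: 92.2%
```

For the sample log this gives 48532 / 50000 ≈ 97.1% of cells and 18432 / 20000 ≈ 92.2% of genes.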
Real-World Example
Here’s a complete example analyzing a heart dataset:
import scanpy as sc
from heartmap import Config
from heartmap.pipelines import ComprehensivePipeline
# Configure for your system (16GB RAM)
config = Config.default()
config.data.max_cells_subset = 30000
config.data.max_genes_subset = 4000
config.analysis.resolution = 0.5
# Run comprehensive analysis
print("Starting HeartMAP analysis...")
pipeline = ComprehensivePipeline(config)
results = pipeline.run(
    data_path='data/heart_atlas.h5ad',
    output_dir='results/heart_analysis'
)
# Examine results
adata = results['adata']
print("\nAnalysis Summary:")
print(f"  Total cells: {adata.n_obs:,}")
print(f"  Total genes: {adata.n_vars:,}")
print(f"  Clusters identified: {len(adata.obs['leiden'].unique())}")
print("\nCluster sizes:")
print(adata.obs['leiden'].value_counts())
# Visualize key results
sc.pl.umap(adata, color='leiden', title='Cell Type Clusters', save='_clusters.png')
sc.pl.umap(adata, color='hub_score', title='Communication Hub Scores', save='_hubs.png')
print("\n✅ Analysis complete! Results saved to: results/heart_analysis/")
Memory Optimization Tips
For large datasets or limited RAM:
Subsetting strategy: HeartMAP can subset your data to fit available memory while preserving biological signal. Suggested starting values by system memory:
# For 8GB systems
config.data.max_cells_subset = 10000
config.data.max_genes_subset = 2000
# For 16GB systems
config.data.max_cells_subset = 30000
config.data.max_genes_subset = 4000
# For 32GB+ systems
config.data.max_cells_subset = 50000
config.data.max_genes_subset = 5000
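The three tiers above can be wrapped in a small lookup so that scripts pick limits from the amount of RAM. This is only a sketch: suggest_subset_limits is a hypothetical helper, the thresholds are the ones listed in this guide, and treating machines with less than 16GB (including those below 8GB) as the smallest tier is an assumption.

```python
def suggest_subset_limits(ram_gb):
    """Return (max_cells_subset, max_genes_subset) for a given amount of RAM.

    Hypothetical helper using the tiers from this guide. Anything below
    16GB falls back to the 8GB tier, which is an assumption.
    """
    if ram_gb >= 32:
        return 50000, 5000
    if ram_gb >= 16:
        return 30000, 4000
    return 10000, 2000


max_cells, max_genes = suggest_subset_limits(16)
print(max_cells, max_genes)  # 30000 4000

# With a HeartMAP config object, the values would be applied as in the guide:
# config.data.max_cells_subset = max_cells
# config.data.max_genes_subset = max_genes
```

These values are starting points; if a run still exhausts memory, lowering both limits further is the first thing to try.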
Next Steps
Now that you’ve completed your first analysis:
Tutorials: Learn advanced analysis techniques
Configuration Guide: Master HeartMAP configuration
API Reference: Explore the complete Python API
Examples: See real-world use cases
Troubleshooting
Common Issues
FileNotFoundError: Data file not found
Make sure your data file path is correct and the file exists:
import os
print(os.path.exists('your_data.h5ad'))  # Should print True
MemoryError: Out of memory
Reduce the dataset size in your configuration:
config.data.max_cells_subset = 10000  # Reduce this
config.data.max_genes_subset = 2000  # Reduce this
ImportError: No module named 'liana'
Install the communication extras:
pip install heartmap[communication]
ValueError: Input data must have cell type annotations
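Two of the issues above, a missing data file and a missing liana module, can be caught before starting a long run. The following is a minimal pre-flight sketch using only the standard library; preflight_check is a hypothetical helper and not part of HeartMAP, and it does not cover the memory or annotation errors.

```python
import importlib.util
from pathlib import Path


def preflight_check(data_path, need_communication=False):
    """Return a list of human-readable problems found before running.

    Hypothetical helper: checks that the data file exists and, optionally,
    that the `liana` module can be imported.
    """
    problems = []

    if not Path(data_path).is_file():
        problems.append(f"Data file not found: {data_path}")

    # find_spec returns None when the module is not installed.
    if need_communication and importlib.util.find_spec("liana") is None:
        problems.append(
            "Module 'liana' is missing; run: pip install heartmap[communication]"
        )

    return problems


for problem in preflight_check("your_data.h5ad", need_communication=True):
    print("WARNING:", problem)
```

An empty list means both checks passed; any entries can be resolved with the fixes listed above.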
Getting Help
If you run into issues: