AnalysisConfig

The AnalysisConfig class defines configuration parameters for dimensionality reduction, clustering, and downstream analysis in HeartMAP.

Class Definition

from heartmap.config import AnalysisConfig

Constructor

AnalysisConfig(
    n_components_pca=50,
    n_neighbors=10,
    n_pcs=40,
    resolution=0.5,
    n_marker_genes=25,
    use_leiden=True,
    use_liana=True
)

Configuration Fields

n_components_pca

int

Default: 50

Number of principal components to compute during PCA. More components capture more variance but increase computation time. Used as input for neighborhood graph construction.

Example:

from heartmap.config import AnalysisConfig

config = AnalysisConfig(n_components_pca=100)
# Compute 100 principal components

n_neighbors

int

Default: 10

Number of neighbors to use when constructing the k-nearest neighbor graph. This parameter affects the connectivity of the neighborhood graph and influences clustering results. Lower values emphasize local structure; higher values emphasize global structure.

Example:

config = AnalysisConfig(n_neighbors=15)
# Use 15 neighbors for neighborhood graph
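
The effect of n_neighbors on graph connectivity can be illustrated without any HeartMAP code. The following is a minimal standard-library sketch, not HeartMAP's implementation: the toy one-dimensional points and the helper names knn_edges and count_components are made up for illustration. It builds a k-nearest-neighbor graph over two well-separated groups and counts connected components for a small and a larger k.

```python
def knn_edges(points, k):
    """Return undirected edges linking each point to its k nearest neighbors."""
    edges = set()
    for i, p in enumerate(points):
        # Sort the other points by distance to p and keep the k closest.
        others = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: abs(points[j] - p),
        )
        for j in others[:k]:
            edges.add((min(i, j), max(i, j)))
    return edges


def count_components(n_nodes, edges):
    """Count connected components with a simple union-find."""
    parent = list(range(n_nodes))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v in edges:
        parent[find(u)] = find(v)
    return len({find(x) for x in range(n_nodes)})


# Two toy groups of "cells" on a line, far apart from each other.
points = [0, 1, 2, 3, 10, 11, 12, 13]

for k in (2, 4):
    n = count_components(len(points), knn_edges(points, k))
    print(f"k={k}: {n} connected component(s)")
```

With k=2 every point finds its neighbors inside its own group, so the graph splits into two components. With k=4 each group of four runs out of local neighbors and links across the gap, which merges the graph into one component.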

n_pcs

int

Default: 40

Number of principal components to use for computing the neighborhood graph and UMAP. Should be less than or equal to n_components_pca. Using fewer PCs can reduce noise.

Example:

config = AnalysisConfig(
    n_components_pca=50,
    n_pcs=30  # Use only top 30 PCs for downstream analysis
)
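
The constraint that n_pcs should not exceed n_components_pca can be checked before a run starts. The source does not say whether AnalysisConfig validates this itself, so the sketch below uses a hypothetical stand-in dataclass (AnalysisConfigSketch, reduced to the two relevant fields) to show one way such a check could look.

```python
from dataclasses import dataclass


@dataclass
class AnalysisConfigSketch:
    """Hypothetical stand-in for AnalysisConfig; not the HeartMAP class."""

    n_components_pca: int = 50
    n_pcs: int = 40

    def __post_init__(self):
        # Enforce the documented constraint n_pcs <= n_components_pca.
        if self.n_pcs > self.n_components_pca:
            raise ValueError(
                f"n_pcs ({self.n_pcs}) must not exceed "
                f"n_components_pca ({self.n_components_pca})"
            )


# A valid combination is accepted.
ok = AnalysisConfigSketch(n_components_pca=50, n_pcs=30)

# Asking for more PCs than were computed is rejected.
try:
    AnalysisConfigSketch(n_components_pca=30, n_pcs=40)
except ValueError as err:
    message = str(err)
    print(message)
```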

resolution

float

Default: 0.5

Resolution parameter for Leiden/Louvain clustering. Controls the coarseness of the clustering. Higher values produce more clusters (finer granularity); lower values produce fewer clusters (coarser granularity). Typical range: 0.1 to 2.0.

Example:

config = AnalysisConfig(resolution=0.8)
# Higher resolution for more fine-grained clusters

config = AnalysisConfig(resolution=0.3)
# Lower resolution for broader clusters
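
Why a higher resolution yields more clusters can be seen on a toy graph. The sketch below assumes the clustering objective is modularity with a resolution parameter, Q = sum over clusters of (e_c / m - resolution * (d_c / 2m)^2), which is a common choice in Leiden and Louvain implementations; whether HeartMAP optimizes exactly this objective is an assumption. The graph, the function name, and the two candidate partitions are illustrative only.

```python
from itertools import combinations


def modularity(edges, clusters, resolution):
    """Resolution-weighted modularity of a partition of an unweighted graph."""
    m = len(edges)
    label = {node: i for i, cluster in enumerate(clusters) for node in cluster}
    internal = [0] * len(clusters)  # edges with both ends inside the cluster
    degree = [0] * len(clusters)    # summed node degrees per cluster
    for u, v in edges:
        degree[label[u]] += 1
        degree[label[v]] += 1
        if label[u] == label[v]:
            internal[label[u]] += 1
    return sum(
        e / m - resolution * (d / (2 * m)) ** 2
        for e, d in zip(internal, degree)
    )


# Two fully connected groups of four nodes, joined by a single bridge edge.
edges = (
    list(combinations(range(4), 2))
    + list(combinations(range(4, 8), 2))
    + [(3, 4)]
)
coarse = [set(range(8))]                   # everything in one cluster
fine = [set(range(4)), set(range(4, 8))]   # one cluster per group

for resolution in (0.1, 0.5):
    q_coarse = modularity(edges, coarse, resolution)
    q_fine = modularity(edges, fine, resolution)
    winner = "coarse" if q_coarse > q_fine else "fine"
    print(f"resolution={resolution}: coarse={q_coarse:.3f}, "
          f"fine={q_fine:.3f}, preferred={winner}")
```

At resolution 0.1 the single merged cluster scores higher, while at 0.5 the two-cluster split scores higher, which matches the rule that raising the resolution favors finer partitions.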

n_marker_genes

int

Default: 25

Number of marker genes to identify per cluster. These genes are most differentially expressed in each cluster and are useful for cell type annotation.

Example:

config = AnalysisConfig(n_marker_genes=50)
# Identify top 50 marker genes per cluster

use_leiden

bool

Default: True

Whether to use the Leiden clustering algorithm. If True, HeartMAP uses Leiden (recommended); if False, it falls back to Louvain. Leiden guarantees connected clusters, whereas Louvain can return badly connected ones.

Example:

config = AnalysisConfig(use_leiden=True)
# Use Leiden clustering (recommended)

config = AnalysisConfig(use_leiden=False)
# Use Louvain clustering instead

use_liana

bool

Default: True

Whether to perform cell-cell communication analysis using LIANA (LIgand-receptor ANalysis frAmework). If True, HeartMAP runs LIANA to identify ligand-receptor interactions between cell types.

Example:

config = AnalysisConfig(use_liana=True)
# Enable ligand-receptor analysis

config = AnalysisConfig(use_liana=False)
# Skip cell-cell communication analysis

Usage Examples

Default Configuration

from heartmap.config import AnalysisConfig

# Create with default values
config = AnalysisConfig()
print(config.n_components_pca)  # 50
print(config.resolution)  # 0.5
print(config.use_leiden)  # True

Custom Configuration

from heartmap.config import AnalysisConfig

# Create with custom values
config = AnalysisConfig(
    n_components_pca=100,
    n_pcs=50,
    n_neighbors=15,
    resolution=0.8,
    n_marker_genes=50
)

Fine-Grained Clustering

from heartmap.config import AnalysisConfig

# Configure for finer-grained clusters
config = AnalysisConfig(
    resolution=1.0,        # Higher resolution
    n_neighbors=15,        # More neighbors for smoother clusters
    n_marker_genes=50,     # More markers for annotation
    use_leiden=True
)

Coarse Clustering

from heartmap.config import AnalysisConfig

# Configure for broad cell type clusters
config = AnalysisConfig(
    resolution=0.3,        # Lower resolution
    n_neighbors=20,        # More neighbors for global structure
    n_marker_genes=25,
    use_leiden=True
)

Fast Analysis (Skip Communication)

from heartmap.config import AnalysisConfig

# Skip time-consuming LIANA analysis
config = AnalysisConfig(
    use_liana=False,       # Disable cell-cell communication
    n_components_pca=30,   # Fewer components
    n_marker_genes=20      # Fewer markers
)

High-Dimensional Analysis

from heartmap.config import AnalysisConfig

# Capture more variance with more PCs
config = AnalysisConfig(
    n_components_pca=100,  # More PCs
    n_pcs=80,              # Use more PCs downstream
    n_neighbors=10
)

Using with Main Config

from heartmap.config import Config, AnalysisConfig

# Create custom analysis config
analysis_config = AnalysisConfig(
    resolution=0.8,
    n_neighbors=15,
    use_liana=True
)

# Use with main config
config = Config.default()
config.analysis = analysis_config

# Or create from dictionary
config = Config.from_dict({
    'analysis': {
        'resolution': 0.8,
        'n_neighbors': 15,
        'use_liana': True
    }
})
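
The source does not show how Config.from_dict maps the nested 'analysis' dictionary onto AnalysisConfig. As a rough sketch of one plausible mapping, the stand-in classes below (AnalysisSketch and ConfigSketch, both hypothetical) unpack the dictionary as keyword arguments, so any field left out keeps its documented default.

```python
from dataclasses import dataclass, field


@dataclass
class AnalysisSketch:
    """Hypothetical stand-in mirroring the documented fields and defaults."""

    n_components_pca: int = 50
    n_neighbors: int = 10
    n_pcs: int = 40
    resolution: float = 0.5
    n_marker_genes: int = 25
    use_leiden: bool = True
    use_liana: bool = True


@dataclass
class ConfigSketch:
    """Hypothetical stand-in for the main Config object."""

    analysis: AnalysisSketch = field(default_factory=AnalysisSketch)

    @classmethod
    def from_dict(cls, data):
        # Unknown keys raise TypeError, which surfaces typos early.
        return cls(analysis=AnalysisSketch(**data.get("analysis", {})))


cfg = ConfigSketch.from_dict({
    "analysis": {"resolution": 0.8, "n_neighbors": 15, "use_liana": True}
})
print(cfg.analysis.resolution)  # overridden value
print(cfg.analysis.n_pcs)       # untouched field keeps its default
```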

Loading from YAML

# config.yaml
analysis:
  n_components_pca: 50
  n_neighbors: 15
  n_pcs: 40
  resolution: 0.8
  n_marker_genes: 50
  use_leiden: true
  use_liana: true

from heartmap.config import Config

config = Config.from_yaml('config.yaml')
print(config.analysis.resolution)  # 0.8

Best Practices

Dimensionality Reduction

n_components_pca: Use 30-100 depending on dataset complexity
- Small datasets (< 10k cells): 30-50 components
- Large datasets (> 50k cells): 50-100 components
n_pcs: Use 70-90% of n_components_pca
- Set to 30-50 for most datasets
- Check explained variance ratio to determine optimal number
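
Checking the explained variance ratio can be reduced to a small helper. The sketch below is illustrative: the function name, the 0.85 target, and the example ratios are assumptions, and in practice the ratios would come from whatever PCA step produced the components.

```python
from itertools import accumulate


def n_pcs_for_variance(variance_ratio, target=0.85):
    """Smallest number of leading PCs whose cumulative ratio reaches target."""
    for n, cumulative in enumerate(accumulate(variance_ratio), start=1):
        if cumulative >= target:
            return n
    # Target never reached: fall back to using every available component.
    return len(variance_ratio)


# Made-up per-component explained variance ratios, largest first.
ratios = [0.40, 0.25, 0.15, 0.10, 0.05, 0.05]
suggested = n_pcs_for_variance(ratios)
print(suggested)
```

The returned count is a candidate value for n_pcs; it should still be capped at n_components_pca.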

Neighborhood Graph

n_neighbors: Typical range 10-30
- 10-15: Emphasizes local structure, more clusters
- 20-30: Emphasizes global structure, fewer clusters
- Larger values for datasets > 50k cells

Clustering Resolution

resolution: Adjust based on expected cell type diversity
- 0.2-0.4: Major cell types (e.g., immune, epithelial, stromal)
- 0.5-0.8: Cell subtypes (e.g., T cell subtypes, macrophage states)
- 0.9-2.0: Fine-grained states (e.g., activation states, cell cycle)
Start with 0.5 and adjust based on biological knowledge
Use multiple resolutions to explore hierarchical structure
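
Exploring several resolutions amounts to creating one configuration per value while holding everything else fixed. The sketch below assumes AnalysisConfig behaves like a dataclass, which the source does not confirm, and therefore uses a hypothetical stand-in (AnalysisConfigSketch) together with dataclasses.replace. The resolution values are examples taken from the ranges listed above.

```python
from dataclasses import dataclass, replace


@dataclass
class AnalysisConfigSketch:
    """Hypothetical stand-in for AnalysisConfig; not the HeartMAP class."""

    n_neighbors: int = 10
    resolution: float = 0.5


# Shared settings for every run in the sweep.
base = AnalysisConfigSketch(n_neighbors=15)

# One copy per resolution; replace() leaves the base object unchanged.
sweep = [replace(base, resolution=r) for r in (0.3, 0.5, 0.8, 1.2)]

for cfg in sweep:
    print(cfg)
```

Comparing how clusters split or merge across the resulting runs gives a rough view of the hierarchy between major cell types and their subtypes.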

Marker Genes

n_marker_genes: 20-50 is typical; use up to 100 for detailed annotation
- 20-25: Quick overview of cluster identity
- 50-100: Detailed characterization for annotation

Algorithm Selection

use_leiden: Keep True unless you need to reproduce Louvain-based results (Leiden avoids the badly connected clusters that Louvain can produce)
use_liana: Set to False if:
- Only interested in cell type identification
- Limited computational resources
- Dataset has < 5 cell types (communication less interesting)

Common Configurations

Quick Exploratory Analysis

config = AnalysisConfig(
    n_components_pca=30,
    n_pcs=30,
    n_neighbors=10,
    resolution=0.5,
    n_marker_genes=20,
    use_leiden=True,
    use_liana=False  # Skip for speed
)

Comprehensive Analysis

config = AnalysisConfig(
    n_components_pca=100,
    n_pcs=50,
    n_neighbors=15,
    resolution=0.8,
    n_marker_genes=50,
    use_leiden=True,
    use_liana=True  # Full analysis
)

Large Dataset Analysis

config = AnalysisConfig(
    n_components_pca=100,
    n_pcs=80,
    n_neighbors=30,     # More neighbors for large datasets
    resolution=1.0,     # May need higher resolution
    n_marker_genes=25,
    use_leiden=True,
    use_liana=True
)
