
Overview

ClickHouse is an optional database backend for Umami, optimized for analytical workloads with billions of events. It provides exceptional query performance and compression for large-scale analytics.
ClickHouse is recommended only for high-volume deployments (1M+ events per day). For most users, PostgreSQL is sufficient and easier to manage.

When to Use ClickHouse

  • High Volume: processing millions of events daily
  • Long Retention: storing years of historical data
  • Fast Aggregations: complex analytical queries on large datasets
  • Storage Efficiency: superior compression ratios (10:1 typical)

PostgreSQL vs ClickHouse

Feature          | PostgreSQL             | ClickHouse
-----------------|------------------------|----------------------
Best for         | Small to medium sites  | High-volume analytics
Events/day       | < 1M                   | 1M+
Query speed      | Good                   | Excellent
Compression      | Standard               | Excellent (10:1)
Complexity       | Low                    | Medium
User management  | Native                 | Requires PostgreSQL
With ClickHouse enabled, Umami uses a dual-database architecture: PostgreSQL for user management and configuration, ClickHouse for analytics data.

Installation

docker-compose.yml
services:
  umami:
    image: ghcr.io/umami-software/umami:latest
    ports:
      - "3000:3000"
    environment:
      DATABASE_URL: postgresql://umami:umami@postgres:5432/umami
      CLICKHOUSE_URL: http://umami:umami@clickhouse:8123/umami
      APP_SECRET: replace-me-with-a-random-string
    depends_on:
      postgres:
        condition: service_healthy
      clickhouse:
        condition: service_healthy
    restart: always

  postgres:
    image: postgres:15-alpine
    environment:
      POSTGRES_DB: umami
      POSTGRES_USER: umami
      POSTGRES_PASSWORD: umami
    volumes:
      - postgres-data:/var/lib/postgresql/data
    restart: always
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U $${POSTGRES_USER} -d $${POSTGRES_DB}"]
      interval: 5s
      timeout: 5s
      retries: 5

  clickhouse:
    image: clickhouse/clickhouse-server:latest
    environment:
      CLICKHOUSE_DB: umami
      CLICKHOUSE_USER: umami
      CLICKHOUSE_PASSWORD: umami
    volumes:
      - clickhouse-data:/var/lib/clickhouse
    restart: always
    healthcheck:
      test: ["CMD", "clickhouse-client", "--query", "SELECT 1"]
      interval: 5s
      timeout: 5s
      retries: 5

volumes:
  postgres-data:
  clickhouse-data:
1. Save docker-compose.yml

   Save the configuration above to docker-compose.yml.
2. Generate APP_SECRET

   openssl rand -base64 32
Replace replace-me-with-a-random-string with the generated value.
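If openssl is not available, the same 32 random bytes can be generated with Python's standard library (an equivalent sketch, not the only option):

```python
import base64
import secrets

# 32 cryptographically secure random bytes, base64-encoded --
# equivalent to `openssl rand -base64 32`
app_secret = base64.b64encode(secrets.token_bytes(32)).decode()
print(app_secret)
```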
3. Start services

   docker compose up -d
4. Initialize ClickHouse schema

   The schema will be created automatically on first startup.

Standalone ClickHouse Server

# Add repository (note: apt-key is deprecated on newer distributions;
# see the official ClickHouse install docs for the keyring-based setup)
sudo apt-get install -y apt-transport-https ca-certificates dirmngr
sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 8919F6BD2B48D754
echo "deb https://packages.clickhouse.com/deb stable main" | sudo tee /etc/apt/sources.list.d/clickhouse.list

# Install ClickHouse
sudo apt-get update
sudo apt-get install -y clickhouse-server clickhouse-client

# Start service
sudo systemctl start clickhouse-server
sudo systemctl enable clickhouse-server

Database Schema

The ClickHouse schema is located in /db/clickhouse/schema.sql:

Core Tables

The website_event table stores all page views and custom events:
CREATE TABLE umami.website_event
(
    website_id UUID,
    session_id UUID,
    visit_id UUID,
    event_id UUID,
    -- Session attributes
    hostname LowCardinality(String),
    browser LowCardinality(String),
    os LowCardinality(String),
    device LowCardinality(String),
    screen LowCardinality(String),
    language LowCardinality(String),
    country LowCardinality(String),
    region LowCardinality(String),
    city String,
    -- Page view attributes
    url_path String,
    url_query String,
    utm_source String,
    utm_medium String,
    utm_campaign String,
    utm_content String,
    utm_term String,
    referrer_path String,
    referrer_query String,
    referrer_domain String,
    page_title String,
    -- Click IDs
    gclid String,
    fbclid String,
    msclkid String,
    ttclid String,
    li_fat_id String,
    twclid String,
    -- Event attributes
    event_type UInt32,
    event_name String,
    tag String,
    distinct_id String,
    created_at DateTime('UTC'),
    job_id Nullable(UUID)
)
ENGINE = MergeTree
    PARTITION BY toYYYYMM(created_at)
    ORDER BY (toStartOfHour(created_at), website_id, session_id, visit_id, created_at)
    PRIMARY KEY (toStartOfHour(created_at), website_id, session_id, visit_id)
    SETTINGS index_granularity = 8192;
Key features:
  • Partitioned by month for efficient data management
  • LowCardinality columns for memory efficiency
  • Optimized sorting for common query patterns
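The monthly partition key above, toYYYYMM(created_at), maps each row's timestamp to a YYYYMM integer. A small sketch of that mapping, useful for predicting which partition an event lands in (the real computation happens inside ClickHouse):

```python
from datetime import datetime, timezone

def to_yyyymm(ts: datetime) -> int:
    """Mimic ClickHouse's toYYYYMM(): 2023-01-15 -> 202301."""
    return ts.year * 100 + ts.month

# an event written at this time falls into partition 202301
event_time = datetime(2023, 1, 15, 12, 30, tzinfo=timezone.utc)
print(to_yyyymm(event_time))  # -> 202301
```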
The event_data table stores custom event properties:
CREATE TABLE umami.event_data
(
    website_id UUID,
    session_id UUID,
    event_id UUID,
    url_path String,
    event_name String,
    data_key String,
    string_value Nullable(String),
    number_value Nullable(Decimal(22, 4)),
    date_value Nullable(DateTime('UTC')),
    data_type UInt32,
    created_at DateTime('UTC'),
    job_id Nullable(UUID)
)
ENGINE = MergeTree
    ORDER BY (website_id, event_id, data_key, created_at)
    SETTINGS index_granularity = 8192;
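The data_key/string_value/number_value/date_value columns above form a typed key-value layout: each property is routed to exactly one value column, with data_type recording which one. A hypothetical encoder sketch (the type codes here are illustrative, not Umami's actual constants):

```python
from datetime import datetime
from typing import Any

# Illustrative type codes -- NOT Umami's actual constants
STRING, NUMBER, DATE = 1, 2, 3

def encode_property(key: str, value: Any) -> dict:
    """Route a property value into the matching typed column."""
    row = {"data_key": key, "string_value": None,
           "number_value": None, "date_value": None}
    if isinstance(value, (bool, str)):
        row["string_value"] = str(value)
        row["data_type"] = STRING
    elif isinstance(value, (int, float)):
        row["number_value"] = value
        row["data_type"] = NUMBER
    elif isinstance(value, datetime):
        row["date_value"] = value
        row["data_type"] = DATE
    else:
        raise TypeError(f"unsupported property type: {type(value)}")
    return row

print(encode_property("plan", "pro"))
print(encode_property("seats", 5))
```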
The session_data table stores custom session properties:
CREATE TABLE umami.session_data
(
    website_id UUID,
    session_id UUID,
    data_key String,
    string_value Nullable(String),
    number_value Nullable(Decimal(22, 4)),
    date_value Nullable(DateTime('UTC')),
    data_type UInt32,
    distinct_id String,
    created_at DateTime('UTC'),
    job_id Nullable(UUID)
)
ENGINE = ReplacingMergeTree
    ORDER BY (website_id, session_id, data_key)
    SETTINGS index_granularity = 8192;
Uses ReplacingMergeTree to handle updates to session data.

Materialized Views

ClickHouse uses materialized views for pre-aggregated statistics:
CREATE TABLE umami.website_event_stats_hourly
(
    website_id UUID,
    session_id UUID,
    visit_id UUID,
    -- ... aggregated fields ...
    views SimpleAggregateFunction(sum, UInt64),
    min_time SimpleAggregateFunction(min, DateTime('UTC')),
    max_time SimpleAggregateFunction(max, DateTime('UTC')),
    created_at DateTime('UTC')
)
ENGINE = AggregatingMergeTree
    PARTITION BY toYYYYMM(created_at)
    ORDER BY (website_id, event_type, toStartOfHour(created_at), cityHash64(visit_id), visit_id)
    SAMPLE BY cityHash64(visit_id);
This pre-aggregates data hourly for faster dashboard queries.
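Conceptually, the hourly view collapses raw events into one row per (hour, visit) with summed views and min/max timestamps. A toy in-memory sketch of that aggregation (ClickHouse does this incrementally at insert time):

```python
from collections import defaultdict
from datetime import datetime

# raw events: (visit_id, created_at)
events = [
    ("v1", datetime(2024, 5, 1, 10, 5)),
    ("v1", datetime(2024, 5, 1, 10, 40)),
    ("v2", datetime(2024, 5, 1, 11, 15)),
]

# one row per (hour, visit_id): summed views plus min/max timestamps
stats = defaultdict(lambda: {"views": 0, "min_time": None, "max_time": None})
for visit_id, ts in events:
    hour = ts.replace(minute=0, second=0, microsecond=0)
    row = stats[(hour, visit_id)]
    row["views"] += 1
    if row["min_time"] is None or ts < row["min_time"]:
        row["min_time"] = ts
    if row["max_time"] is None or ts > row["max_time"]:
        row["max_time"] = ts

for (hour, visit_id), row in sorted(stats.items()):
    print(hour, visit_id, row["views"])
```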

Configuration

Connection URL

Set the CLICKHOUSE_URL environment variable:
# HTTP protocol (default)
CLICKHOUSE_URL=http://localhost:8123/umami

# With authentication
CLICKHOUSE_URL=http://username:password@localhost:8123/umami

# HTTPS
CLICKHOUSE_URL=https://clickhouse.example.com:8443/umami
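A quick sanity check on the URL format using only the standard library (a sketch for inspection; Umami does its own parsing internally):

```python
from urllib.parse import urlparse

def check_clickhouse_url(url: str) -> dict:
    """Split a CLICKHOUSE_URL into its components for inspection."""
    parts = urlparse(url)
    if parts.scheme not in ("http", "https"):
        raise ValueError("CLICKHOUSE_URL must use http:// or https://")
    return {
        "host": parts.hostname,
        "port": parts.port or (8443 if parts.scheme == "https" else 8123),
        "database": parts.path.lstrip("/"),
        "user": parts.username,
    }

print(check_clickhouse_url("http://umami:secret@localhost:8123/umami"))
```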

ClickHouse Server Configuration

Optimize ClickHouse settings in /etc/clickhouse-server/config.xml:
<clickhouse>
    <!-- Memory settings -->
    <max_memory_usage>10000000000</max_memory_usage>  <!-- 10GB -->
    <max_bytes_before_external_group_by>5000000000</max_bytes_before_external_group_by>
    
    <!-- Compression -->
    <compression>
        <case>
            <min_part_size>10485760</min_part_size>
            <min_part_size_ratio>0.01</min_part_size_ratio>
            <method>lz4</method>
        </case>
    </compression>
    
    <!-- Performance -->
    <max_concurrent_queries>100</max_concurrent_queries>
    <background_pool_size>16</background_pool_size>
    <background_schedule_pool_size>16</background_schedule_pool_size>
</clickhouse>

Migrations

ClickHouse migrations are located in /db/clickhouse/migrations/:
  • 01_edit_keys.sql - Schema adjustments
  • 02_add_visit_id.sql - Visit tracking
  • 03_session_data.sql - Session data support
  • 04_add_tag.sql - Event tags
  • 05_add_utm_clid.sql - UTM parameters and click IDs
  • 06_update_subdivision.sql - Geographic subdivisions
  • 07_add_distinct_id.sql - User identification
  • 08_update_hostname_view.sql - Hostname materialized view

Manual Schema Initialization

If needed, manually initialize the schema:
# Using clickhouse-client
clickhouse-client --query "CREATE DATABASE IF NOT EXISTS umami"
clickhouse-client --multiquery < /path/to/db/clickhouse/schema.sql

# Using Docker
docker exec -i clickhouse clickhouse-client --multiquery < schema.sql

Performance Optimization

Partition Management

ClickHouse partitions data by month. Manage partitions:
-- List partitions
SELECT
    partition,
    name,
    rows,
    formatReadableSize(bytes_on_disk) AS size
FROM system.parts
WHERE table = 'website_event' AND active
ORDER BY partition DESC;

-- Drop old partition (careful!)
ALTER TABLE umami.website_event DROP PARTITION '202301';

Data Compression

Check compression ratios:
SELECT
    table,
    formatReadableSize(sum(data_compressed_bytes)) AS compressed,
    formatReadableSize(sum(data_uncompressed_bytes)) AS uncompressed,
    round(sum(data_uncompressed_bytes) / sum(data_compressed_bytes), 2) AS ratio
FROM system.columns
WHERE database = 'umami'
GROUP BY table;
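The ratio column is simply uncompressed bytes over compressed bytes. A helper mimicking formatReadableSize makes the arithmetic concrete (a sketch with made-up byte counts):

```python
def readable_size(n: float) -> str:
    """Rough analogue of ClickHouse's formatReadableSize()."""
    for unit in ("B", "KiB", "MiB", "GiB", "TiB"):
        if n < 1024:
            return f"{n:.2f} {unit}"
        n /= 1024
    return f"{n:.2f} PiB"

# hypothetical byte counts for one table
compressed, uncompressed = 1_200_000_000, 12_600_000_000
print(readable_size(compressed), readable_size(uncompressed),
      round(uncompressed / compressed, 2))  # ratio ~10.5:1
```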

Query Optimization

-- Analyze query performance
EXPLAIN SELECT * FROM umami.website_event WHERE website_id = 'xxx';

-- Check query log
SELECT
    query,
    query_duration_ms,
    read_rows,
    formatReadableSize(read_bytes) AS read_size
FROM system.query_log
WHERE type = 'QueryFinish'
ORDER BY query_duration_ms DESC
LIMIT 10;

Monitoring

System Metrics

-- Database size
SELECT
    database,
    formatReadableSize(sum(bytes)) AS size
FROM system.parts
WHERE active
GROUP BY database;

-- Table statistics
SELECT
    table,
    formatReadableSize(sum(bytes)) AS size,
    sum(rows) AS rows,
    max(modification_time) AS latest_modification
FROM system.parts
WHERE database = 'umami' AND active
GROUP BY table;

-- Active queries
SELECT
    query_id,
    user,
    elapsed,
    formatReadableSize(memory_usage) AS memory,
    query
FROM system.processes;

Performance Metrics

-- Merge statistics
SELECT
    table,
    count() AS merges,
    formatReadableSize(sum(bytes_read_uncompressed)) AS read,
    formatReadableSize(sum(bytes_written_uncompressed)) AS written
FROM system.merges
GROUP BY table;

-- Insert performance
SELECT
    arrayJoin(tables) AS table,
    count() AS inserts,
    sum(written_rows) AS rows,
    formatReadableSize(sum(written_bytes)) AS size
FROM system.query_log
WHERE type = 'QueryFinish' AND query_kind = 'Insert'
GROUP BY table;

Backup and Restore

Using clickhouse-backup

# Install clickhouse-backup (now maintained under the Altinity organization)
curl -sfL https://github.com/Altinity/clickhouse-backup/releases/latest/download/clickhouse-backup-linux-amd64.tar.gz | tar xzv

# Create backup
clickhouse-backup create

# List backups
clickhouse-backup list

# Restore backup
clickhouse-backup restore <backup-name>

Manual Backup

# Export table data
clickhouse-client --query "SELECT * FROM umami.website_event FORMAT Native" > website_event.native

# Import table data
cat website_event.native | clickhouse-client --query "INSERT INTO umami.website_event FORMAT Native"

Freeze/Unfreeze

-- Freeze table (create snapshot)
ALTER TABLE umami.website_event FREEZE;

-- Copy frozen data
-- Data is in /var/lib/clickhouse/shadow/

Data Retention

TTL (Time To Live)

Automatic data expiration:
-- Add TTL to table (1 year retention)
ALTER TABLE umami.website_event
MODIFY TTL created_at + INTERVAL 1 YEAR;

-- Column-level TTL (illustrative; substitute a column from your schema)
ALTER TABLE umami.website_event
MODIFY COLUMN ip_address String TTL created_at + INTERVAL 30 DAY;

Manual Deletion

-- Delete old data (asynchronous mutation)
ALTER TABLE umami.website_event
DELETE WHERE created_at < now() - INTERVAL 1 YEAR;

-- Optimize table to apply deletions
OPTIMIZE TABLE umami.website_event FINAL;
DELETE operations in ClickHouse are asynchronous and may take time to complete.

Troubleshooting

Check ClickHouse is running:
# Check service status
sudo systemctl status clickhouse-server

# Test connection
curl http://localhost:8123/
Verify network access:
telnet localhost 8123
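The same reachability check can be scripted against the server's /ping endpoint (a sketch; it returns False rather than raising when the server is unreachable):

```python
import http.client

def clickhouse_ping(host: str = "localhost", port: int = 8123,
                    timeout: float = 2.0) -> bool:
    """True if the ClickHouse HTTP endpoint answers /ping with 'Ok.'."""
    try:
        conn = http.client.HTTPConnection(host, port, timeout=timeout)
        conn.request("GET", "/ping")
        resp = conn.getresponse()
        return resp.status == 200 and resp.read().strip() == b"Ok."
    except OSError:  # connection refused, DNS failure, timeout, ...
        return False

print(clickhouse_ping())
```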
Manually create the database:
CREATE DATABASE IF NOT EXISTS umami;
Run schema creation:
clickhouse-client --multiquery < db/clickhouse/schema.sql
Check query execution plan:
EXPLAIN SELECT ... FROM umami.website_event WHERE ...;
Enable query profiling:
SET send_logs_level = 'trace';
SELECT ... FROM umami.website_event WHERE ...;
Optimize table:
OPTIMIZE TABLE umami.website_event FINAL;
Reduce memory usage:
SET max_memory_usage = 10000000000;  -- 10GB
SET max_bytes_before_external_group_by = 5000000000;
Or adjust server config in /etc/clickhouse-server/config.xml.

Migration from PostgreSQL to ClickHouse

Migrating existing data:
1. Install ClickHouse

   Set up ClickHouse alongside PostgreSQL.

2. Configure Umami

   Add the CLICKHOUSE_URL environment variable.

3. Initialize schema

   The schema will be created automatically on first start.

4. Data migration

   New events will be written to ClickHouse. Historical data remains in PostgreSQL.
Manual historical data migration is complex and typically not necessary. Most users start fresh with ClickHouse.

Best Practices

  • Partition strategy: use monthly partitions for easier data management and deletion.
  • Compression: use LZ4 for a balance between speed and size.
  • Materialized views: leverage pre-aggregated views for dashboard queries.
  • Regular optimization: run OPTIMIZE TABLE periodically to merge small parts.

Next Steps

  • PostgreSQL: configure the PostgreSQL backend
  • Environment Variables: configure Umami settings
  • Migrations: understand database migrations
  • Troubleshooting: solve common issues