Snuba’s storage layer is built on ClickHouse, a columnar database optimized for analytical queries on time-series data. This architecture provides the real-time performance and scalability required for Sentry’s query infrastructure.
Why ClickHouse?
ClickHouse was selected as Snuba’s backing storage for several key reasons:
Real-Time Performance: Fast query execution on large datasets with columnar storage
Distributed Architecture: Built-in sharding and replication for horizontal scaling
Flexible Storage Engines: Multiple table engines for different consistency/performance tradeoffs
SQL Interface: Familiar query language with powerful extensions
ClickHouse Cluster Architecture
Snuba manages connections to ClickHouse through cluster abstractions:
```python
# From snuba/clusters/cluster.py
class ClickhouseCluster:
    """
    Provides reader, writer and ClickHouse connections shared by all
    storages on the cluster.
    """

    def __init__(
        self,
        host: str,
        port: int,
        database: str,
        http_port: int,
        storage_sets: Set[str],
        single_node: bool,
        cluster_name: Optional[str] = None,
        distributed_cluster_name: Optional[str] = None,
    ):
        self.__query_node = ClickhouseNode(host, port)
        self.__database = database
        self.__single_node = single_node
        self.__cluster_name = cluster_name
```
Cluster Types
Single Node Cluster
One ClickHouse server instance
Simplified operations and migrations
No distributed tables required
Suitable for development and small deployments
Multi-Node Cluster
Multiple shards and replicas
Requires cluster_name for local tables
Requires distributed_cluster_name for distributed tables
Enables horizontal scaling
A single proxy address is used for all read/write operations, but DDL operations must run on each individual node.
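As a rough illustration of this routing (hypothetical names, not Snuba's actual API), a multi-node cluster can be modeled as a proxy query node plus the full node list: reads and writes go through the proxy, while DDL fans out to every node.

```python
# Simplified sketch of multi-node routing. `Node` and `Cluster` are
# illustrative types, not Snuba's real classes.
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Node:
    host: str
    port: int


class Cluster:
    def __init__(self, query_node: Node, all_nodes: List[Node]):
        self.query_node = query_node  # proxy used for SELECT/INSERT
        self.all_nodes = all_nodes    # every shard/replica in the cluster

    def execute_query(self, sql: str) -> str:
        # Reads and writes target only the proxy node.
        return f"{self.query_node.host}:{self.query_node.port} <- {sql}"

    def execute_ddl(self, sql: str) -> List[str]:
        # DDL (CREATE/ALTER/DROP) must run on each node individually.
        return [f"{n.host}:{n.port} <- {sql}" for n in self.all_nodes]
```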
Cluster Configuration
Clusters are defined in settings.py:
```python
CLUSTERS = [
    {
        "host": "clickhouse-server",
        "port": 9000,
        "http_port": 8123,
        "user": "default",
        "password": "",
        "database": "default",
        "storage_sets": {"events", "transactions"},
        "single_node": False,
        "cluster_name": "cluster_one_sh",
        "distributed_cluster_name": "cluster_one_sh_dist",
    }
]
```
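Given such a configuration, a storage resolves its cluster through the storage sets each cluster declares. A simplified sketch of that lookup (the `CLUSTERS` list and `get_cluster_for` helper here are illustrative, not Snuba's real implementation):

```python
# Hypothetical sketch: each cluster declares the storage sets it hosts,
# and lookup scans the list for a match.
CLUSTERS = [
    {"host": "clickhouse-server", "storage_sets": {"events", "transactions"}},
    {"host": "clickhouse-metrics", "storage_sets": {"metrics"}},
]


def get_cluster_for(storage_set: str) -> dict:
    for cluster in CLUSTERS:
        if storage_set in cluster["storage_sets"]:
            return cluster
    raise KeyError(f"no cluster hosts storage set {storage_set!r}")
```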
Table Structure
Local vs Distributed Tables
In multi-node deployments, Snuba creates two types of tables:
Local Tables
Suffix: _local
Physical storage on each shard
Data written directly here
Contains actual data files
```sql
-- Example: errors_local table
CREATE TABLE IF NOT EXISTS errors_local
(
    project_id UInt64,
    timestamp DateTime,
    event_id UUID,
    platform String,
    message String,
    -- ... more columns
)
ENGINE = ReplicatedReplacingMergeTree('/clickhouse/tables/{shard}/errors_local', '{replica}')
PARTITION BY (retention_days, toMonday(timestamp))
ORDER BY (project_id, toStartOfDay(timestamp), event_id)
SAMPLE BY cityHash64(event_id)
```
Distributed Tables
Suffix: _dist
Virtual table that routes queries to local tables
Queries distributed across all shards
No data stored locally
```sql
-- Example: errors_dist distributed table
CREATE TABLE IF NOT EXISTS errors_dist
AS errors_local
ENGINE = Distributed('cluster_one_sh', default, errors_local, cityHash64(event_id))
```
Single-node deployments create only local tables; a distributed table would add unnecessary overhead.
Table Engines
ClickHouse provides multiple table engines. Snuba primarily uses these:
ReplicatedReplacingMergeTree
Most common engine for Snuba tables:
```sql
ENGINE = ReplicatedReplacingMergeTree(
    '/clickhouse/tables/{shard}/errors_local',  -- ZooKeeper path
    '{replica}',                                -- Replica name
    deleted                                     -- Version column
)
```
Features:
Replication: Automatic data replication via ZooKeeper
Deduplication: Removes duplicate rows with the same sorting key
Replacement: Keeps the row with the highest version column value
Eventual consistency: Deduplication happens during background merges
Use case: Primary storage for events, transactions, errors
ReplacingMergeTree deduplication is not guaranteed until a merge runs. Use the FINAL modifier for guaranteed deduplication at query time (at a performance cost).
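The replacement semantics can be illustrated in pure Python (this simulates what a merge does; ClickHouse itself performs it during background merges):

```python
# Pure-Python illustration of ReplacingMergeTree merge semantics:
# rows sharing the same sorting key collapse to the row with the
# highest value in the version column.
from typing import Dict, List, Tuple


def merge_replacing(
    rows: List[dict], key_cols: Tuple[str, ...], version_col: str
) -> List[dict]:
    latest: Dict[tuple, dict] = {}
    for row in rows:
        key = tuple(row[c] for c in key_cols)
        # On ties, the later-inserted row wins (>=), mirroring replacement.
        if key not in latest or row[version_col] >= latest[key][version_col]:
            latest[key] = row
    return list(latest.values())
```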
ReplicatedMergeTree
Basic replicated storage without deduplication:
```sql
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/table', '{replica}')
```
Features:
Replication via ZooKeeper
No automatic deduplication
Faster inserts than ReplacingMergeTree
Use case: Data where duplicates are acceptable or handled upstream
ReplicatedAggregatingMergeTree
Pre-aggregates data during merges:
```sql
ENGINE = ReplicatedAggregatingMergeTree('/clickhouse/tables/{shard}/table', '{replica}')
```
Features:
Stores intermediate aggregation states
Combines states during merges
Requires AggregateFunction column types
Use case: Metrics pre-aggregations, materialized views
Materialized Views
Creates derived tables that auto-update:
```sql
CREATE MATERIALIZED VIEW outcomes_hourly_mv
TO outcomes_hourly_local
AS
SELECT
    toStartOfHour(timestamp) AS hour,
    project_id,
    outcome,
    count() AS count
FROM outcomes_raw_local
GROUP BY hour, project_id, outcome
```
Use case: Pre-aggregated views for faster queries
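The rollup performed by such a view can be sketched in pure Python (an illustrative analogue, not how ClickHouse executes it):

```python
# Pure-Python analogue of the hourly outcomes rollup: aggregate raw
# outcome rows into (hour, project_id, outcome) counts.
from collections import Counter
from datetime import datetime


def to_start_of_hour(ts: datetime) -> datetime:
    # Mirrors ClickHouse toStartOfHour(): truncate to the hour.
    return ts.replace(minute=0, second=0, microsecond=0)


def rollup_hourly(rows) -> Counter:
    counts: Counter = Counter()
    for row in rows:
        key = (to_start_of_hour(row["timestamp"]), row["project_id"], row["outcome"])
        counts[key] += 1
    return counts
```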
Storage Schema
Snuba storages are defined in YAML configuration:
```yaml
# From snuba/datasets/configuration/events/storages/errors.yaml
version: v1
kind: writable_storage
name: errors
storage:
  key: errors
  set_key: events  # Cluster mapping
schema:
  columns:
    - { name: project_id, type: UInt, args: { size: 64 } }
    - { name: timestamp, type: DateTime }
    - { name: event_id, type: UUID }
    - { name: tags, type: Nested, args: {
        subcolumns: [
          { name: key, type: String },
          { name: value, type: String }
        ]
      } }
  local_table_name: errors_local
  dist_table_name: errors_dist
  partition_format:
    - retention_days
    - date
```
Key Schema Elements
Partition Key
Defines how data is partitioned on disk:

```yaml
partition_format:
  - retention_days  # Data retention partition
  - date            # Time-based partition
```
Benefits:
Efficient data deletion (drop entire partitions)
Pruning irrelevant partitions during queries
Parallel processing across partitions
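For instance, the partition expression from the errors_local DDL, `(retention_days, toMonday(timestamp))`, can be computed per row as follows (a pure-Python sketch):

```python
# Sketch of the partition key used by errors_local:
# (retention_days, toMonday(timestamp)).
from datetime import date, timedelta


def to_monday(d: date) -> date:
    # Mirrors ClickHouse toMonday(): round a date down to the Monday
    # of its week.
    return d - timedelta(days=d.weekday())


def partition_key(retention_days: int, d: date) -> tuple:
    return (retention_days, to_monday(d))
```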
Order By / Primary Key
Defines sort order and primary key (from DDL):
```sql
ORDER BY (project_id, toStartOfDay(timestamp), event_id)
```
Impact:
Determines data locality on disk
Enables efficient range scans
Critical for query performance
Choose ORDER BY columns based on the most common query patterns; ClickHouse generally performs best when lower-cardinality columns come first in the sorting key.
Data Types
Snuba uses ClickHouse’s rich type system:
Primitive Types
```yaml
columns:
  - { name: project_id, type: UInt, args: { size: 64 } }      # UInt64
  - { name: timestamp, type: DateTime }                       # DateTime
  - { name: event_id, type: UUID }                            # UUID
  - { name: message, type: String }                           # String
  - { name: retention_days, type: UInt, args: { size: 16 } }  # UInt16
```
Complex Types
Nested
Stores arrays of structs:
```yaml
- name: tags
  type: Nested
  args:
    subcolumns:
      - { name: key, type: String }
      - { name: value, type: String }
```
Stored as parallel arrays: tags.key and tags.value
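A sketch of that layout in pure Python (the `flatten_tags` helper is hypothetical; sorting the keys is an assumption made here for determinism, not a documented Snuba guarantee):

```python
# How a Nested column is laid out: a mapping of tag key/value pairs
# becomes two parallel array columns, tags.key and tags.value.
def flatten_tags(tags: dict) -> dict:
    keys = sorted(tags)  # assumption: keys stored in sorted order
    return {
        "tags.key": keys,
        "tags.value": [tags[k] for k in keys],
    }
```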
Array
Homogeneous arrays:
```yaml
- name: sdk_integrations
  type: Array
  args:
    inner_type: { type: String }
```
Nullable
Allows NULL values (use sparingly):
```yaml
- name: user_id
  type: String
  args:
    schema_modifiers: [ nullable ]
```
Nullable columns have performance overhead. Use empty strings or zero values instead when possible.
Connection Management
Snuba maintains connection pools for efficiency:
```python
class ConnectionCache:
    """Caches ClickHouse connection pools per configuration"""

    def get_node_connection(
        self,
        client_settings: ClickhouseClientSettings,
        node: ClickhouseNode,
        user: str,
        password: str,
        database: str,
    ) -> ClickhousePool:
        # Reuse an existing connection pool or create a new one
        cache_key = (node, client_settings, user, password, database)
        if cache_key not in self.__cache:
            self.__cache[cache_key] = ClickhousePool(
                node.host_name,
                node.port,
                user,
                password,
                database,
                client_settings=client_settings,
            )
        return self.__cache[cache_key]
```
Client Settings
Different operations use different ClickHouse settings:
```python
class ClickhouseClientSettings(Enum):
    QUERY = ClickhouseClientSettingsType({}, None)
    INSERT = ClickhouseClientSettingsType({}, None)
    MIGRATE = ClickhouseClientSettingsType(
        {
            "replication_alter_partitions_sync": 2,
            "mutations_sync": 2,
            "alter_sync": 2,
        },
        300000,  # 5 minute timeout
    )
    DELETE = ClickhouseClientSettingsType({"mutations_sync": 1}, None)
    REPLACE = ClickhouseClientSettingsType(
        {
            "max_block_size": 8192,
            "max_memory_usage": 10_000_000_000,
        },
        None,
    )
```
Batch Writing
Snuba writes to ClickHouse in batches for efficiency:
```python
class HTTPBatchWriter(BatchWriter[JSONRow]):
    """Batches JSON rows and sends them via HTTP to ClickHouse"""

    def write(self, rows: Iterable[JSONRow]) -> None:
        # Format rows as JSON or native format,
        # send them to the ClickHouse HTTP endpoint,
        # and handle retries and errors.
        ...
```
Batch Configuration:
Chunk size: Number of rows per HTTP request
Buffer size: Memory buffer for accumulating rows
Max connections: Connection pool size
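The chunking step can be sketched with a small generator (`chunked` is a hypothetical helper for illustration, not Snuba's actual writer code):

```python
# Minimal sketch of batching: accumulate rows into fixed-size chunks so
# each HTTP request to ClickHouse carries many rows instead of one.
from itertools import islice
from typing import Iterable, Iterator, List


def chunked(rows: Iterable[bytes], chunk_size: int) -> Iterator[List[bytes]]:
    it = iter(rows)
    # Pull up to chunk_size rows at a time until the source is exhausted.
    while chunk := list(islice(it, chunk_size)):
        yield chunk
```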
Storage Sets
Storages are grouped into Storage Sets that share a cluster:
```python
# From snuba/clusters/storage_sets.py
class StorageSetKey(Enum):
    EVENTS = "events"
    TRANSACTIONS = "transactions"
    METRICS = "metrics"
    SESSIONS = "sessions"
    OUTCOMES = "outcomes"
```
Purpose:
Co-locate related storages on same cluster
Share connection pools and resources
Enable cross-storage optimizations
Query Optimization
PREWHERE clause: Filter before reading all columns
Index usage: Leverage ORDER BY for range scans
Partition pruning: Filter by partition key columns
Sampling: Use the SAMPLE clause for approximate queries
Write Optimization
Batch inserts: Write thousands of rows per request
Partition alignment: Respect partition boundaries
Deduplication: Rely on ReplacingMergeTree rather than application-level logic
Storage Optimization
Compression: ClickHouse compresses data automatically
Materialized views: Pre-aggregate common queries
TTL policies: Auto-delete old partitions
Column selection: Only include necessary columns in queries
Because storage is columnar, queries read only the columns they reference, so adding columns to a schema carries little cost for queries that do not select them.