What is Snuba?
Snuba is a service that provides a rich data model on top of ClickHouse, together with a fast ingestion consumer and a query optimizer. It was originally developed to replace a combination of Postgres and Redis for searching and aggregating Sentry error data. Since then, it has evolved to support most time-series-related features across several datasets.

Key Features
Database Access Layer
Provides a comprehensive database access layer to the ClickHouse distributed data store with support for both single-node and distributed environments.
Logical Data Model
Query a graph-based logical data model through SnQL (Snuba Query Language), which provides SQL-like functionality with rich type safety and validation.
Multiple Datasets
Support multiple separate datasets in a single installation, including events, transactions, metrics, profiles, replays, and more.
Query Optimizer
Rule-based query optimizer that transforms and optimizes queries before execution for better performance.
Core Concepts
Understanding Snuba’s architecture requires familiarity with several key concepts.

Datasets
A Dataset is a collection of entities that are related and can be queried together. Datasets provide a logical grouping of related data. For example:
- events - Error and issue data
- transactions - Performance monitoring data
- metrics - Time series metrics data
- events_analytics_platform - EAP items and spans
Entities
An Entity represents a logical table that can be queried. Entities define:
- Schema: The columns available for querying, with their types
- Readable Storage: The underlying storage layer for reading data
- Writable Storage: The storage layer for writing data (if applicable)
- Query Processors: Transformations applied to queries
- Validators: Validation logic for queries
Examples of entities include events, transactions, metrics_counters, profiles, and search_issues.
Storages
A Storage provides an abstraction over ClickHouse tables. There are two types:
- Readable Storage: Abstracts reading from a ClickHouse table or view
- Writable Storage: Abstracts writing to a ClickHouse table
Storages define:
- Table names (local and distributed)
- Column definitions
- Query processors for storage-level transformations
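As a rough illustration, a storage definition bundles these pieces together. The sketch below is illustrative only; the class and field names are assumptions, not Snuba’s actual internal API:

```python
from dataclasses import dataclass, field

# A minimal sketch of what a storage definition captures. The class and
# field names here are illustrative, not Snuba's real internals.
@dataclass
class StorageDefinition:
    local_table: str   # table name on a single ClickHouse node
    dist_table: str    # distributed table name for clustered deployments
    columns: dict[str, str] = field(default_factory=dict)  # name -> ClickHouse type
    writable: bool = False  # whether this storage accepts inserts

errors_storage = StorageDefinition(
    local_table="errors_local",
    dist_table="errors_dist",
    columns={"event_id": "UUID", "timestamp": "DateTime", "message": "String"},
    writable=True,
)
```

Keeping both a local and a distributed table name is what lets the same definition serve single-node and clustered environments.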
Query Languages
Snuba supports two primary query languages.

SnQL (Snuba Query Language)
SnQL is Snuba’s native query language, providing a SQL-like syntax optimized for time series data:
- Type-safe expression validation
- Entity-based data source specification
- Support for complex aggregations and functions
- Time range filtering with granularity control
- Tag and column subscript access
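To make this concrete, here is an illustrative SnQL query embedded in a Python snippet. The entity, columns, and time range are examples, and the JSON payload shape used to submit it over HTTP is an assumption, not a guaranteed contract:

```python
import json

# An illustrative SnQL query: count events per project over one day.
# The entity and column names are examples only.
snql = """
MATCH (events)
SELECT count() AS event_count
BY project_id
WHERE timestamp >= toDateTime('2024-01-01T00:00:00')
  AND timestamp < toDateTime('2024-01-02T00:00:00')
  AND project_id IN tuple(1)
"""

# Queries are submitted to Snuba's HTTP API as JSON; this payload shape
# is an assumption for illustration.
payload = json.dumps({"query": snql, "dataset": "events"})
```

Note the entity-based data source (`MATCH (events)`), the time range filter, and the `BY` clause for grouping, which correspond to the features listed above.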
MQL (Metrics Query Language)
MQL is designed specifically for querying metrics data with a formula-based approach.

Architecture Overview
Snuba’s architecture consists of several key components.

Ingestion Pipeline
Kafka Consumers
Snuba consumes data from Kafka topics using either Python or Rust consumers. The Rust consumers provide better performance and are now the default for most datasets.
Message Processing
Messages are parsed, validated, and transformed into rows suitable for ClickHouse insertion. Each dataset has a dedicated message processor.
Batching
Rows are batched together to optimize ClickHouse insert performance. Batch size is controlled by time and row count thresholds.
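The dual-threshold flush logic can be sketched as follows; the threshold values and class shape are illustrative, not Snuba’s actual consumer code:

```python
import time

# A sketch of threshold-based batching: flush when either the row count
# or the elapsed-time limit is reached. Defaults are illustrative.
class Batcher:
    def __init__(self, max_rows=10000, max_wait_s=1.0, clock=time.monotonic):
        self.max_rows = max_rows
        self.max_wait_s = max_wait_s
        self.clock = clock
        self.rows = []
        self.started = None

    def add(self, row):
        if self.started is None:
            self.started = self.clock()
        self.rows.append(row)
        if len(self.rows) >= self.max_rows or self.clock() - self.started >= self.max_wait_s:
            return self.flush()
        return None  # keep accumulating

    def flush(self):
        batch, self.rows, self.started = self.rows, [], None
        return batch  # a real implementation would insert this into ClickHouse

batcher = Batcher(max_rows=3)
first = batcher.add({"id": 0})
second = batcher.add({"id": 1})
third = batcher.add({"id": 2})  # row-count threshold reached: batch is flushed
```

The time threshold bounds latency for low-traffic datasets, while the row threshold bounds memory and keeps individual inserts from growing unboundedly.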
Query Pipeline
Request Parsing
HTTP requests are parsed and validated. SnQL/MQL queries are transformed into Snuba’s internal query representation.
Entity Processing
The query is processed at the entity level, applying entity-specific transformations and validations.
Storage Processing
The query is further processed for the specific storage backend, applying storage-level optimizations.
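The two phases can be sketched as a single pipeline that applies entity-level processors first, then storage-level ones. The processors below are hypothetical examples operating on a plain dict, not Snuba’s real query representation:

```python
# A sketch of the two-phase query pipeline: entity-level processors run
# first, then storage-level processors. Each processor takes and returns
# the query; the dict representation here is illustrative.
def run_pipeline(query: dict, entity_processors, storage_processors) -> dict:
    for processor in [*entity_processors, *storage_processors]:
        query = processor(query)
    return query

def validate_time_range(query):  # hypothetical entity-level validator
    if "from" not in query or "to" not in query:
        raise ValueError("query must specify a time range")
    return query

def add_final_clause(query):  # hypothetical storage-level transformation
    return {**query, "final": True}

result = run_pipeline(
    {"from": "2024-01-01", "to": "2024-01-02"},
    [validate_time_range],
    [add_final_clause],
)
```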
Components
Key Components:
- API Server: HTTP server that accepts SnQL/MQL queries and returns results
- Consumers: Kafka consumers that ingest data into ClickHouse
- Replacer: Handles data mutations and deletions
- Admin UI: Web interface for managing migrations and system configuration
- Subscriptions: Scheduled queries that run periodically
- Migrations: Schema management system for ClickHouse DDL changes
Use Cases
Snuba excels at the following use cases.

Time Series Analytics
Query and aggregate time series data with automatic time bucketing and granularity control.
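Time bucketing at a given granularity amounts to truncating each timestamp to the start of its bucket and aggregating per bucket. A minimal sketch, with illustrative timestamps:

```python
from collections import Counter

# A sketch of granularity-based time bucketing: truncate each timestamp
# to its bucket start, then aggregate per bucket.
def bucket(ts: int, granularity_s: int) -> int:
    return ts - (ts % granularity_s)

# Three illustrative event timestamps; the first two fall in the same hour.
events = [1700000005, 1700000042, 1700003601]
counts = Counter(bucket(ts, 3600) for ts in events)  # hourly granularity
```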
Error Tracking
Fast searching and aggregation of error events with support for grouping and filtering.
Performance Monitoring
Analyze transaction and span data for application performance insights.
Metrics Analysis
Query high-cardinality metrics data with efficient aggregation and grouping.
System Requirements
Snuba requires the following services to run:
- ClickHouse 25.3+ - Columnar database for data storage
- Kafka - Message queue for data ingestion
- Redis - Caching and state management
- Python 3.13+ - Runtime for Python components
- Rust (optional) - For building Rust consumers
Performance Characteristics
Batching Strategy
ClickHouse can only handle a limited rate of inserts (approximately 1 insert per second per shard). Snuba addresses this by:
- Batching rows before insertion
- Coordinating batch timing across consumer replicas
- Using streaming inserts to handle large batches without memory issues
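The streaming-insert idea can be sketched as a generator that encodes rows lazily, so a large batch never has to be fully materialized in memory. The `JSONEachRow` encoding shown is a format ClickHouse accepts; the transport is left abstract:

```python
import json
from typing import Iterable, Iterator

# A sketch of a streaming insert body: rows are encoded one at a time in
# ClickHouse's JSONEachRow format (one JSON object per line), so memory
# use stays constant regardless of batch size.
def encode_rows(rows: Iterable[dict]) -> Iterator[bytes]:
    for row in rows:
        yield json.dumps(row).encode() + b"\n"

# The generator can feed a chunked HTTP request body row by row; here we
# just collect the chunks to show the encoding.
chunks = list(encode_rows({"id": i} for i in range(3)))
```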
Query Optimization
Snuba applies multiple optimization techniques:
- Query Processors: Transform queries to use efficient ClickHouse functions
- Storage Selection: Route queries to the most appropriate storage backend
- Predicate Pushdown: Push filters down to ClickHouse for early filtering
- Column Pruning: Only select required columns to reduce data transfer
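Column pruning, for instance, reduces to intersecting the table’s columns with those the query actually references. A simplified sketch with an illustrative query shape:

```python
# A sketch of column pruning: keep only the columns the query references,
# so ClickHouse reads and transfers less data. The query structure here
# is illustrative, not Snuba's real query representation.
def prune_columns(all_columns: list[str], query: dict) -> list[str]:
    referenced = set(query.get("select", [])) | set(query.get("where", []))
    return [col for col in all_columns if col in referenced]

cols = prune_columns(
    ["event_id", "timestamp", "message", "user_id"],
    {"select": ["event_id"], "where": ["timestamp"]},
)
```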
Next Steps
Quickstart
Get Snuba running locally and execute your first query
Installation
Learn about different installation methods and deployment options