Introduction
Snuba provides a comprehensive migration system to manage ClickHouse database schemas and state. The migration system enables controlled schema evolution as the Snuba codebase changes, ensuring that database changes are applied consistently and safely across different deployment configurations.
What Are Migrations?
Migrations are the mechanism through which database changes are defined and applied to evolve ClickHouse schemas. Each migration represents a discrete set of changes that can be applied (forward) or reversed (backward) to maintain database consistency.
Key Concepts
Migration Groups
Migrations are organized into groups that typically correspond to features or related tables. Groups are executed in a strict order, with all migrations in one group completing before the next group begins.
Each group is represented by a folder in snuba/snuba_migrations/, such as:
- system - Core migration tracking (always runs first)
- events - Event data tables
- transactions - Transaction data tables
- discover - Discover dataset tables
- metrics - Metrics storage
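The group-by-group ordering can be sketched in a few lines of Python. This is a minimal illustration and not Snuba's actual loader: GROUP_ORDER, plan, and the migration IDs are hypothetical names, and only the group names and the rule that system runs first come from the list above.

```python
from typing import Dict, List, Tuple

# Hypothetical ordering; the docs only state that "system" always runs first.
GROUP_ORDER: List[str] = ["system", "events", "transactions", "discover", "metrics"]


def plan(migrations: Dict[str, List[str]]) -> List[Tuple[str, str]]:
    """Return (group, migration_id) pairs, finishing each group before the next."""
    ordered: List[Tuple[str, str]] = []
    for group in GROUP_ORDER:
        # Lexicographic sort matches numeric order for zero-padded IDs.
        for migration_id in sorted(migrations.get(group, [])):
            ordered.append((group, migration_id))
    return ordered


# Made-up migration IDs, deliberately listed out of order.
example = {
    "events": ["0002_second", "0001_first"],
    "system": ["0001_tracking"],
}
print(plan(example))
```

Every entry of one group appears in the plan before any entry of the next, regardless of the order in which the input lists them.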
Migration Sequence
Within each group, migrations are numbered sequentially (e.g., 0001_, 0002_, 0003_) and must be applied in order. This ensures that dependencies between migrations are properly satisfied.
Types of Migrations
Snuba supports three primary migration types:
1. ClickHouse Node Migrations
The most common type, these execute SQL statements on ClickHouse nodes to modify schema. They inherit from ClickhouseNodeMigration.
Example: snuba/snuba_migrations/discover/0008_discover_fix_add_local_table.py
Typical uses include:
- Creating or dropping tables
- Adding or removing columns
- Creating indexes
- Modifying table settings
- Changing TTL policies
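As a rough illustration of the shape such a migration takes, the sketch below pairs a forward SQL statement with the statement that reverts it. The ClickhouseNodeMigration class here is a simplified stand-in defined inside the snippet, not the real Snuba base class, and the method, table, and column names are invented for the example.

```python
from typing import List


class ClickhouseNodeMigration:
    """Simplified stand-in for the real base class (assumption)."""

    blocking = False

    def forwards_sql(self) -> List[str]:
        raise NotImplementedError

    def backwards_sql(self) -> List[str]:
        raise NotImplementedError


class AddEnvironmentColumn(ClickhouseNodeMigration):
    # Hypothetical table name, used only for this example.
    table = "example_local"

    def forwards_sql(self) -> List[str]:
        # IF NOT EXISTS keeps the statement safe to re-run.
        return [
            f"ALTER TABLE {self.table} "
            "ADD COLUMN IF NOT EXISTS environment Nullable(String)"
        ]

    def backwards_sql(self) -> List[str]:
        # Mirror image of the forward statement.
        return [f"ALTER TABLE {self.table} DROP COLUMN IF EXISTS environment"]


migration = AddEnvironmentColumn()
for statement in migration.forwards_sql():
    print(statement)
```

Writing both directions with IF NOT EXISTS and IF EXISTS guards keeps each statement safe to repeat, which matters if a run is interrupted and retried.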
2. Code Migrations
These execute Python functions for complex logic that cannot be expressed as pure SQL. They inherit from CodeMigration.
Example: snuba/snuba_migrations/functions/0001_functions.py
Typical uses include:
- Data migrations
- Conditional logic based on cluster configuration
- Complex multi-step operations
- Version-specific behavior
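The following sketch shows why some changes need Python rather than SQL: the forward step branches on cluster configuration. CodeMigration here is a simplified stand-in defined in the snippet, and forwards_steps, backwards_steps, BackfillDefaults, and the single_node key are all hypothetical names chosen for illustration.

```python
from typing import Callable, Dict, List

Step = Callable[[], None]


class CodeMigration:
    """Simplified stand-in for the real base class (assumption)."""

    def forwards_steps(self) -> List[Step]:
        raise NotImplementedError

    def backwards_steps(self) -> List[Step]:
        raise NotImplementedError


class BackfillDefaults(CodeMigration):
    def __init__(self, cluster: Dict[str, bool], log: List[str]) -> None:
        self.cluster = cluster
        self.log = log  # records actions instead of touching a database

    def _backfill(self) -> None:
        # Conditional logic that a single SQL statement cannot express.
        if self.cluster.get("single_node", True):
            self.log.append("backfill local table")
        else:
            self.log.append("backfill every shard")

    def _undo(self) -> None:
        self.log.append("remove backfilled rows")

    def forwards_steps(self) -> List[Step]:
        return [self._backfill]

    def backwards_steps(self) -> List[Step]:
        return [self._undo]


log: List[str] = []
for step in BackfillDefaults({"single_node": False}, log).forwards_steps():
    step()
print(log)
```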
3. Squashed Migrations
Placeholder migrations that used to exist but are now safe to skip. They’re kept in the sequence for historical consistency.
Migration Lifecycle
Status States
Each migration has one of three statuses:
- NOT_STARTED - Migration hasn’t been executed
- IN_PROGRESS - Migration is currently running
- COMPLETED - Migration has been successfully applied
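The three statuses can be modelled as a small state machine. The status names come from the list above; the string values and the set of allowed transitions are assumptions made for this sketch (a forward run moves NOT_STARTED to IN_PROGRESS to COMPLETED, and a reversal walks back the same way).

```python
from enum import Enum
from typing import Dict, Set


class Status(Enum):
    NOT_STARTED = "not_started"
    IN_PROGRESS = "in_progress"
    COMPLETED = "completed"


# Assumed transitions, not taken from Snuba's source.
ALLOWED: Dict[Status, Set[Status]] = {
    Status.NOT_STARTED: {Status.IN_PROGRESS},
    Status.IN_PROGRESS: {Status.COMPLETED, Status.NOT_STARTED},
    Status.COMPLETED: {Status.IN_PROGRESS},
}


def can_transition(current: Status, target: Status) -> bool:
    """Return True if moving from current to target is allowed in this model."""
    return target in ALLOWED[current]


print(can_transition(Status.NOT_STARTED, Status.IN_PROGRESS))
print(can_transition(Status.NOT_STARTED, Status.COMPLETED))
```

In this model a migration can never jump straight from NOT_STARTED to COMPLETED, so an interrupted run is always observable as IN_PROGRESS.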
Forward and Backward Operations
Every migration must define both:
- Forward operations - Apply the migration changes
- Backward operations - Revert the migration changes
Backward operations serve two purposes:
- Allow recovery if a migration fails partway through
- Enable rolling back completed migrations when necessary
Once a migration is completed and the system is running with that schema, backward operations should generally not be used to revert to a prior state, because they cannot always restore deleted data.
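A minimal sketch of recovery from a partial failure follows. It assumes the backward operations run immediately when a forward operation raises; whether reversal is automatic or operator-driven in Snuba is not specified here, and run_with_recovery and the toy operations are hypothetical.

```python
from typing import Callable, List, Set

Op = Callable[[], None]


def run_with_recovery(forward: List[Op], backward: List[Op]) -> str:
    """Apply forward ops in order; on failure run every backward op."""
    try:
        for op in forward:
            op()
    except Exception:
        # Some forward ops may not have run, so backward ops must be idempotent.
        for op in backward:
            op()
        return "not_started"
    return "completed"


# Toy schema state standing in for a database.
columns: Set[str] = set()


def add_a() -> None:
    columns.add("a")


def failing_op() -> None:
    raise RuntimeError("simulated failure")


def drop_all() -> None:
    columns.discard("a")  # discard() is a no-op if the item is absent
    columns.discard("b")


result = run_with_recovery([add_a, failing_op], [drop_all])
print(result, columns)
```

The backward operation uses discard rather than remove because it may be asked to undo a change that was never applied, which is the same reason SQL reversals are typically written with IF EXISTS.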
Blocking Migrations
Migrations that cannot complete immediately must be marked with blocking = True. These typically involve:
- Large data migrations
- Operations that rewrite significant amounts of data
- Changes requiring downtime
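One way a runner could use the blocking flag is to stop at the first blocking migration unless the operator opts in explicitly. The sketch below assumes that behavior; the Migration dataclass, the runnable function, and the force parameter are hypothetical and not taken from Snuba.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Migration:
    migration_id: str
    blocking: bool = False


def runnable(pending: List[Migration], force: bool = False) -> List[str]:
    """Return IDs to run in order, stopping at a blocking one unless forced."""
    selected: List[str] = []
    for migration in pending:
        if migration.blocking and not force:
            # Later migrations may depend on this one, so stop rather than skip.
            break
        selected.append(migration.migration_id)
    return selected


pending = [
    Migration("0001_add_column"),
    Migration("0002_rewrite_data", blocking=True),
    Migration("0003_add_index"),
]
print(runnable(pending))
print(runnable(pending, force=True))
```

Stopping instead of skipping preserves the sequential ordering within a group.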
Storage Sets and Clusters
The settings.CLUSTERS mapping defines the relationship between storage sets and ClickHouse clusters:
- Storage sets - Groups of tables that must be colocated
- Clusters - Physical ClickHouse deployment configurations
- Single node - If True, simplified migration paths are used
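To make the relationship concrete, the sketch below models a cluster configuration and looks up the cluster that owns a storage set. The structure, key names (host, port, storage_sets, single_node), and values are assumptions for illustration and may differ from the real settings.CLUSTERS.

```python
from typing import Any, Dict, List

# Illustrative configuration only; keys and values are assumptions.
CLUSTERS: List[Dict[str, Any]] = [
    {
        "host": "localhost",
        "port": 9000,
        "storage_sets": {"events", "discover"},
        "single_node": True,
    },
    {
        "host": "clickhouse-metrics",
        "port": 9000,
        "storage_sets": {"metrics"},
        "single_node": False,
    },
]


def cluster_for(storage_set: str) -> Dict[str, Any]:
    """Return the single cluster entry that hosts the given storage set."""
    matches = [c for c in CLUSTERS if storage_set in c["storage_sets"]]
    if len(matches) != 1:
        raise ValueError(
            f"expected exactly one cluster for {storage_set!r}, found {len(matches)}"
        )
    return matches[0]


print(cluster_for("metrics")["single_node"])
```

Requiring exactly one match reflects the colocation rule: a storage set lives on one cluster, so zero or several matches indicates a configuration error.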
Migration Tracking
Snuba tracks migration status in dedicated ClickHouse tables:
- migrations_local - Used in single-node deployments
- migrations_dist - Used in multi-node deployments
Each tracking record stores:
- Migration group
- Migration ID
- Status (NOT_STARTED, IN_PROGRESS, COMPLETED)
- Timestamp
- Version number
The tracking tables themselves are managed by the system group.
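The tracking data can be modelled as a simple record plus a rule for choosing the table. The field list and table names follow this section; MigrationRecord, tracking_table, the field types, and the sample values are assumptions made for this sketch.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class MigrationRecord:
    group: str
    migration_id: str
    status: str  # NOT_STARTED, IN_PROGRESS, or COMPLETED
    timestamp: datetime
    version: int


def tracking_table(single_node: bool) -> str:
    """Pick the tracking table name for the deployment type."""
    return "migrations_local" if single_node else "migrations_dist"


# Sample values are made up for the example.
record = MigrationRecord(
    group="events",
    migration_id="0001_example",
    status="COMPLETED",
    timestamp=datetime(2024, 1, 1, tzinfo=timezone.utc),
    version=1,
)
print(tracking_table(single_node=True), record.status)
```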
Next Steps
- Migration Modes - Learn about single-node vs distributed deployment configurations
- Creating Migrations - Step-by-step guide to writing your own migrations
- Distributed Strategies - Advanced topics for multi-node ClickHouse deployments