## Overview
Storage configurations specify:

- Schema: Column definitions and table names
- Storage keys: Unique identifiers and cluster assignments
- Readiness state: Deployment environment availability
- Query processors: Storage-level query optimizations
- Allocation policies: Resource limits and rate limiting
- Stream loader (writable only): Kafka consumer configuration
- Deletion settings: Configuration for data deletion operations
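Concretely, a storage is defined in a single YAML file whose top-level keys map to the items above. A minimal skeleton (key names follow the schema reference below; the name and values are placeholders):

```yaml
version: v1
kind: readable_storage            # or writable_storage
name: my_storage
storage:
  key: my_storage                 # unique storage identifier
  set_key: my_storage_set        # cluster/storage-set assignment
readiness_state: limited          # deployment availability
schema: {}                        # columns and table names (see below)
query_processors: []              # storage-level optimizations
allocation_policies: []           # resource limits and rate limiting
```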
## Storage Types

### Readable Storage

Read-only storage backed by ClickHouse tables or views. Used for immutable or replicated data.

### Writable Storage

Read-write storage with Kafka stream consumers. Supports real-time data ingestion and updates.
## Readable Storage Schema

- `version`: Schema version. Must be `v1`.
- `kind`: Component type. Must be `readable_storage`.
- `name`: Unique name for the storage.
- `storage`: Storage identification:
  - `key`: Unique storage identifier
  - `set_key`: Storage set/cluster identifier
- `readiness_state`: Deployment readiness: `limited`, `partial`, `complete`, `experimental`, or `deprecate`.
- `schema`: Table schema definition:
  - `columns`: Array of column definitions
  - `local_table_name`: Local table name in ClickHouse
  - `dist_table_name`: Distributed table name
  - `partition_format`: Optional partition format
  - `not_deleted_mandatory_condition`: Flag column used for soft deletion
- `query_processors`: Array of query processor configurations for storage-level optimizations.
- `allocation_policies`: Resource allocation and rate limiting policies.
- `mandatory_condition_checkers`: Security checks that enforce required query conditions.
- `deletion_settings`: Configuration for deletion operations.
- `required_time_column`: Name of the required time column for time-based queries.
## Writable Storage Schema

Writable storage includes all readable storage fields, plus:

- `kind`: Must be `writable_storage`.
- `stream_loader`: Kafka consumer configuration:
  - `processor`: Message processor class name
  - `default_topic`: Primary Kafka topic
  - `commit_log_topic`: Commit log topic
  - `subscription_scheduled_topic`: Subscription scheduling topic
  - `subscription_result_topic`: Subscription results topic
  - `subscription_scheduler_mode`: Scheduler mode (`partition` or `global`)
  - `replacement_topic`: Replacements/deletions topic
  - `dlq_topic`: Dead letter queue topic
  - `pre_filter`: Optional message filter
- `replacer_processor`: Configuration for handling replacements (updates/deletions).
- `writer_options`: ClickHouse writer-specific options.
## Readable Storage Example

`errors_ro.yaml`:
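The original example file is not reproduced here; the following is a condensed sketch of what an `errors_ro.yaml` readable storage can look like, assembled from the field reference above. The column list is heavily abbreviated and the processor/policy arguments are illustrative:

```yaml
version: v1
kind: readable_storage
name: errors_ro
storage:
  key: errors_ro
  set_key: events_ro
readiness_state: complete
schema:
  columns:
    [
      { name: project_id, type: UInt, args: { size: 64 } },
      { name: timestamp, type: DateTime },
      { name: event_id, type: UUID },
      # flag column backing soft deletion
      { name: deleted, type: UInt, args: { size: 8 } },
    ]
  local_table_name: errors_local
  dist_table_name: errors_dist
  not_deleted_mandatory_condition: deleted
query_processors:
  - processor: PrewhereProcessor
    args:
      prewhere_candidates: [event_id, project_id]
allocation_policies:
  - name: ConcurrentRateLimitAllocationPolicy
    args:
      required_tenant_types: [organization_id, referrer, project_id]
mandatory_condition_checkers:
  - condition: ProjectIdEnforcer
required_time_column: timestamp
```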
## Writable Storage Example

`errors.yaml`:
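Again a hedged sketch rather than the real file: a writable `errors.yaml` has the same shape as the readable storage plus the writable-only blocks. Topic names and processor classes are illustrative:

```yaml
version: v1
kind: writable_storage
name: errors
storage:
  key: errors
  set_key: events
readiness_state: complete
schema:
  # same shape as the readable example: columns plus table names
  local_table_name: errors_local
  dist_table_name: errors_dist
  not_deleted_mandatory_condition: deleted
stream_loader:
  processor: ErrorsProcessor         # message processor class
  default_topic: events              # primary ingestion topic
  commit_log_topic: snuba-commit-log
  replacement_topic: event-replacements
  dlq_topic: snuba-dead-letter-queue
  subscription_scheduler_mode: global
replacer_processor:
  processor: ErrorsReplacer          # handles updates/deletions
writer_options: {}                   # ClickHouse writer tuning, if any
```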
## Schema Configuration

### Column Definitions

Columns must specify a name, a type, and optional arguments:
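A sketch of the `columns` array with illustrative columns: scalar types can take `args` (for example, an integer size), and a `Nested` column declares its `subcolumns`:

```yaml
columns:
  [
    { name: project_id, type: UInt, args: { size: 64 } },
    { name: timestamp, type: DateTime },
    { name: event_id, type: UUID },
    { name: message, type: String },
    # nested key/value column, e.g. for tags
    {
      name: tags,
      type: Nested,
      args:
        {
          subcolumns:
            [{ name: key, type: String }, { name: value, type: String }],
        },
    },
  ]
```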
### Table Names

- `local_table_name`: Name of the local table on each ClickHouse node. For single-node deployments, this is the only table.
- `dist_table_name`: Name of the distributed table that queries should use. In distributed ClickHouse, this table routes queries to the local tables.
### Partition Format

Define how data is partitioned:
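`partition_format` is a list of partition components. A sketch assuming the common retention-days-plus-date pattern (the exact component names depend on the storage):

```yaml
schema:
  # ...
  # partition data by retention period, then by date
  partition_format: [retention_days, date]
```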
### Soft Deletion

Configure soft deletion with a flag column:
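A sketch assuming a `deleted` flag column, as used in the examples above; `not_deleted_mandatory_condition` names the column so that queries automatically exclude soft-deleted rows:

```yaml
schema:
  columns:
    [
      # ...
      # 0 = live row, 1 = soft-deleted row
      { name: deleted, type: UInt, args: { size: 8 } },
    ]
  # queries are constrained to rows where this flag indicates "not deleted"
  not_deleted_mandatory_condition: deleted
```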
## Readiness States

Control where storages are available:

- `limited`: Only available in CI and local development. Use for new storages under development.
- `partial`: Available in staging environments. Use for testing before production.
- `complete`: Fully available in all environments, including production.
- `experimental`: Available but marked as experimental. May have stability issues.
- `deprecate`: Marked for deprecation. Will be removed in a future release.
## Stream Loader Configuration

For writable storages, configure Kafka consumers:
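A sketch of a full `stream_loader` block for an errors-style storage. The processor class is from the list below; the topic names and the `pre_filter` type and arguments are illustrative placeholders:

```yaml
stream_loader:
  processor: ErrorsProcessor                    # message processor class
  default_topic: events                         # primary Kafka topic
  commit_log_topic: snuba-commit-log
  subscription_scheduled_topic: scheduled-subscriptions-events
  subscription_result_topic: events-subscription-results
  subscription_scheduler_mode: global           # or: partition
  replacement_topic: event-replacements         # updates/deletions
  dlq_topic: snuba-dead-letter-queue
  pre_filter:                                   # optional message filter
    type: KafkaHeaderSelectFilter               # illustrative filter type
    args:
      header_key: transaction_forwarder
      header_value: "0"
```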
### Processor Types

Common message processors:

- `ErrorsProcessor`: Processes error events
- `TransactionsProcessor`: Processes transaction events
- `OutcomesProcessor`: Processes outcomes data
- `MetricsProcessor`: Processes metrics data

### Subscription Scheduler Modes

- `partition`: Schedule per Kafka partition
- `global`: Global scheduling across all partitions

### Synchronization Timestamps

- `orig_message_ts`: Use the original message timestamp
- `received_p99`: Use the 99th percentile of received time
## Query Processors

Storage-level query processors optimize queries before they reach ClickHouse.

### Common Query Processors

- `UniqInSelectAndHavingProcessor`: Optimizes `uniq()` calls in SELECT and HAVING clauses.
- `MappingColumnPromoter`: Promotes frequently used mapping columns to first-class columns.
- `UUIDColumnProcessor`: Optimizes queries on UUID columns.
- `PrewhereProcessor`: Applies the ClickHouse PREWHERE optimization for faster queries.
- `MappingOptimizer`: Optimizes queries on nested/mapping columns using hash maps.
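A sketch of how these processors are wired into the `query_processors` array. The `args` shown (promoted tag names, PREWHERE candidates, hash-map column names) are illustrative, not a complete reference:

```yaml
query_processors:
  - processor: UniqInSelectAndHavingProcessor
  - processor: MappingColumnPromoter
    args:
      mapping_specs:
        tags:                         # promote these tag keys to columns
          environment: environment
          release: release
  - processor: UUIDColumnProcessor
    args:
      columns: [event_id]
  - processor: PrewhereProcessor
    args:
      prewhere_candidates: [event_id, project_id]
  - processor: MappingOptimizer
    args:
      column_name: tags
      hash_map_name: _tags_hash_map   # precomputed hash-map column
```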
## Allocation Policies

Control resource allocation and rate limiting:
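A sketch of an `allocation_policies` array combining the policy types listed below; the tenant types and arguments are illustrative:

```yaml
allocation_policies:
  - name: ReferrerGuardRailPolicy
    args:
      required_tenant_types: [referrer]
  - name: ConcurrentRateLimitAllocationPolicy
    args:
      # limit concurrent queries per organization/project/referrer
      required_tenant_types: [organization_id, referrer, project_id]
  - name: BytesScannedWindowAllocationPolicy
    args:
      # throttle based on bytes scanned within a time window
      required_tenant_types: [organization_id, referrer]
```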
### Policy Types

- `ReferrerGuardRailPolicy`: Enforces that queries include a referrer for tracking and debugging.
- `ConcurrentRateLimitAllocationPolicy`: Limits concurrent queries per tenant (organization, project, referrer).
- `BytesScannedRejectingPolicy`: Rejects queries that would scan too many bytes.
- `BytesScannedWindowAllocationPolicy`: Throttles queries based on bytes scanned within a time window.
- `CrossOrgQueryAllocationPolicy`: Applies special limits to cross-organization queries.
## Mandatory Condition Checkers

Enforce security requirements, such as requiring every query to filter on `project_id` to prevent data leakage between tenants:
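A minimal sketch, assuming a `ProjectIdEnforcer` checker (the checker name is an assumption; it rejects any query that lacks a `project_id` condition):

```yaml
mandatory_condition_checkers:
  - condition: ProjectIdEnforcer   # every query must filter on project_id
```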
## Deletion Settings

Configure deletion operations:
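A sketch of a `deletion_settings` block; the field names (`is_enabled`, `tables`, `allowed_columns`) and values are assumptions for illustration:

```yaml
deletion_settings:
  is_enabled: 1                 # turn deletion support on for this storage
  tables:
    - errors_local              # tables deletes may run against
  allowed_columns:
    - project_id                # columns deletes may filter on
    - group_id
```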
## Replacer Processor

Handle updates and deletions:
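A sketch of a `replacer_processor` block for an errors-style storage; the `ErrorsReplacer` class appears in the writable example above, but the argument names here are illustrative assumptions:

```yaml
replacer_processor:
  processor: ErrorsReplacer
  args:
    # columns a replacement message must carry (assumed names)
    required_columns: [event_id, project_id, group_id, timestamp]
    # tag keys promoted to first-class columns (assumed)
    promoted_tags:
      tags: [environment, release]
```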
## Best Practices

- **Security first**: Always configure mandatory condition checkers to enforce multi-tenancy.
- **Resource limits**: Set up allocation policies to prevent resource exhaustion.
- **Optimize queries**: Use PREWHERE and mapping optimizers for better performance.
- **Partition wisely**: Choose partition formats that align with query patterns.
## Related Configuration

- **Entities**: Connect storages to entities
- **Datasets**: Organize entities in datasets
- **Overview**: Configuration system overview