Entities represent logical data models in Snuba. They define schemas, connect to storage backends, and configure query processing, validation, and translation rules.
Overview
An entity configuration defines:
Schema: Column definitions and data types
Storage connections: Which physical storages back this entity
Storage selector: Logic to choose between multiple storages
Query processors: Query transformations and optimizations
Validators: Query validation and security rules
Translation mappers: Column and function name mappings
Subscription rules: Real-time subscription configuration
Schema Reference
version - Schema version. Must be v1.
kind - Component type. Must be entity.
name - Unique name for the entity.
schema - Array of column definitions. Each column specifies name, type, and optional args.
storages - Array of storage connections with optional translation mappers.
storage_selector - Configuration for the storage selector class.
query_processors - Array of query processor configurations.
validators - Array of validator configurations.
required_time_column - Name of the time column used for time-based queries.
translation_mappers - Translation rules for columns, functions, and subscriptables.
validate_data_model - Validation mode: do_nothing, warn, or error.
partition_key_column_name - Column name for partition-based query routing.
subscription_processors - Array of subscription processor configurations.
subscription_validators - Array of subscription validator configurations.
join_relationships - Definitions for entity joins in multi-entity queries.
Basic Example
A minimal entity configuration:
    version: v1
    kind: entity
    name: events
    schema:
      - name: project_id
        type: UInt
        args:
          size: 64
      - name: timestamp
        type: DateTime
      - name: event_id
        type: UUID
      - name: message
        type: String
    required_time_column: timestamp
    storages:
      - storage: errors
        is_writable: true
    storage_selector:
      selector: SimpleQueryStorageSelector
      args:
        storage: errors
    query_processors: []
    validators:
      - validator: EntityRequiredColumnValidator
        args:
          required_filter_columns:
            - project_id
Schema Definition
Column Types
Define columns with name, type, and optional arguments:
Simple Types
    schema:
      - name: project_id
        type: UInt
        args:
          size: 64
      - name: timestamp
        type: DateTime
      - name: event_id
        type: UUID
      - name: message
        type: String
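Nullable and nested columns use the same name, type, and args shape. The sketch below reuses the environment and tags columns from the complete example on this page: nullability is requested through schema_modifiers, and a Nested column lists its subcolumns under args. It is an illustration of the pattern, not an exhaustive list of supported types.

```yaml
schema:
  # Nullable string column: the modifier is passed under args.
  - name: environment
    type: String
    args:
      schema_modifiers: [nullable]
  # Nested column made of parallel key and value subcolumns.
  - name: tags
    type: Nested
    args:
      subcolumns:
        - name: key
          type: String
        - name: value
          type: String
```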
Schema Modifiers
Available schema modifiers:
nullable - Column can contain NULL values
readonly - Computed/derived column, not directly writable
lowcardinality - Optimized for columns with few distinct values
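As a rough sketch, the other modifiers are assumed to be passed the same way nullable is in the complete example on this page, through schema_modifiers under args. The column names search_message and level are hypothetical and serve only to illustrate placement.

```yaml
schema:
  # Hypothetical derived column, marked as not directly writable.
  - name: search_message
    type: String
    args:
      schema_modifiers: [readonly]
  # Hypothetical column with few distinct values.
  - name: level
    type: String
    args:
      schema_modifiers: [lowcardinality]
```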
Storage Connections
Entities can connect to multiple storages:
    storages:
      - storage: errors_ro
        is_writable: false
        translation_mappers:
          columns:
            - mapper: ColumnToColumn
              args:
                from_table_name: null
                from_col_name: username
                to_table_name: null
                to_col_name: user_name
          subscriptables:
            - mapper: SubscriptableMapper
              args:
                from_column_name: tags
                to_nested_col_name: tags
                value_subcolumn_name: value
                nullable: false
      - storage: errors
        is_writable: true
        translation_mappers:
          columns:
            - mapper: ColumnToColumn
              args:
                from_table_name: null
                from_col_name: username
                to_table_name: null
                to_col_name: user_name
Storage Selection
The storage selector determines which storage to query:
Simple Selector
    storage_selector:
      selector: SimpleQueryStorageSelector
      args:
        storage: errors
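When an entity connects to more than one storage, the selector can instead be a class that carries its own selection logic. A minimal sketch, taken from the complete example on this page, where ErrorsQueryStorageSelector is configured without args; how it chooses between storages is determined by the class itself and is not described here.

```yaml
storage_selector:
  # No args: which connected storage gets queried is left to the selector class.
  selector: ErrorsQueryStorageSelector
```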
Translation Mappers
Translation mappers convert between entity columns and storage columns:
Column Mappers
Column to Column
    translation_mappers:
      columns:
        - mapper: ColumnToColumn
          args:
            from_table_name: null
            from_col_name: transaction
            to_table_name: null
            to_col_name: transaction_name
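A column can also be mapped onto a single key inside a nested column. The sketch below mirrors the ColumnToMapping entry in the complete example on this page, where the entity column release is read from the sentry:release key of the tags column. Depending on the Snuba version, additional arguments such as table names (as in ColumnToColumn) may be required; treat this as a starting point rather than a full argument list.

```yaml
translation_mappers:
  columns:
    - mapper: ColumnToMapping
      args:
        from_col_name: release                 # column name exposed by the entity
        to_nested_col_name: tags               # nested column in the storage
        to_nested_mapping_key: sentry:release  # key looked up inside tags
        nullable: false
```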
Function Mappers
    translation_mappers:
      functions:
        - mapper: FunctionMapper
          args:
            from_name: count_unique
            to_name: uniq
Subscriptable Mappers
Handle subscript operations like tags[key]:
    translation_mappers:
      subscriptables:
        - mapper: SubscriptableMapper
          args:
            from_column_name: tags
            to_nested_col_name: tags
            value_subcolumn_name: value
            nullable: false
        - mapper: SubscriptableMapper
          args:
            from_column_name: contexts
            to_nested_col_name: contexts
            value_subcolumn_name: value
            nullable: false
Query Processors
Query processors transform queries before execution:
    query_processors:
      - processor: TimeSeriesProcessor
        args:
          time_group_columns:
            time: timestamp
            rtime: received
          time_parse_columns:
            - timestamp
            - received
      - processor: BasicFunctionsProcessor
      - processor: HandledFunctionsProcessor
        args:
          column: exception_stacks.mechanism_handled
Common Query Processors
TimeSeriesProcessor
Handles time-based queries, granularity, and time grouping.
    - processor: TimeSeriesProcessor
      args:
        time_group_columns:
          time: timestamp
          rtime: received
        time_parse_columns:
          - timestamp
          - received
BasicFunctionsProcessor
Translates basic function calls to ClickHouse equivalents.
    - processor: BasicFunctionsProcessor
HandledFunctionsProcessor
Processes function calls that distinguish handled from unhandled errors.
    - processor: HandledFunctionsProcessor
      args:
        column: exception_stacks.mechanism_handled
Validators
Validators enforce query constraints and security rules:
    validators:
      - validator: EntityRequiredColumnValidator
        args:
          required_filter_columns:
            - project_id
      - validator: TagConditionValidator
        args: {}
      - validator: DatetimeConditionValidator
        args: {}
Common Validators
EntityRequiredColumnValidator
Ensures the query filters on every required column in its WHERE clause.
    - validator: EntityRequiredColumnValidator
      args:
        required_filter_columns:
          - project_id
          - organization_id
TagConditionValidator
Validates tag filter conditions.
    - validator: TagConditionValidator
      args: {}
DatetimeConditionValidator
Ensures datetime ranges are reasonable.
    - validator: DatetimeConditionValidator
      args: {}
Subscription Configuration
Configure real-time query subscriptions:
    subscription_validators:
      - validator: AggregationValidator
        args:
          max_allowed_aggregations: 10
          disallowed_aggregations:
            - having
            - orderby
          required_time_column: timestamp
          allows_group_by_without_condition: true
See Subscription Configuration for details.
Join Relationships
Define relationships for multi-entity queries:
    join_relationships:
      grouped:
        rhs_entity: groupedmessage
        join_type: inner
        columns:
          - [project_id, project_id]
          - [group_id, id]
      assigned:
        rhs_entity: groupassignee
        join_type: inner
        columns:
          - [project_id, project_id]
          - [group_id, group_id]
      attributes:
        rhs_entity: group_attributes
        join_type: left
        columns:
          - [project_id, project_id]
          - [group_id, group_id]
Join Types
inner - Inner join (only matching rows)
left - Left outer join (all rows from left side)
Complete Example
A full entity configuration for events:
    version: v1
    kind: entity
    name: events
    schema:
      - name: project_id
        type: UInt
        args:
          size: 64
      - name: timestamp
        type: DateTime
      - name: event_id
        type: UUID
      - name: message
        type: String
      - name: platform
        type: String
      - name: environment
        type: String
        args:
          schema_modifiers: [nullable]
      - name: tags
        type: Nested
        args:
          subcolumns:
            - name: key
              type: String
            - name: value
              type: String
    required_time_column: timestamp
    storages:
      - storage: errors
        is_writable: true
        translation_mappers:
          columns:
            - mapper: ColumnToMapping
              args:
                from_col_name: release
                to_nested_col_name: tags
                to_nested_mapping_key: sentry:release
                nullable: false
          subscriptables:
            - mapper: SubscriptableMapper
              args:
                from_column_name: tags
                to_nested_col_name: tags
                value_subcolumn_name: value
                nullable: false
    storage_selector:
      selector: ErrorsQueryStorageSelector
    query_processors:
      - processor: TimeSeriesProcessor
        args:
          time_group_columns:
            time: timestamp
          time_parse_columns:
            - timestamp
      - processor: BasicFunctionsProcessor
    validators:
      - validator: EntityRequiredColumnValidator
        args:
          required_filter_columns:
            - project_id
      - validator: DatetimeConditionValidator
        args: {}
    validate_data_model: error
    subscription_validators:
      - validator: AggregationValidator
        args:
          max_allowed_aggregations: 10
          required_time_column: timestamp
Best Practices
Security First: Always include required column validators for multi-tenant data (project_id, organization_id).
Translation Consistency: Keep translation mappers consistent across all storages connected to an entity.
Optimize Queries: Use query processors to optimize common query patterns and improve performance.
Document Schema: Add YAML comments to explain complex schema structures and column purposes.
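As an illustration of the last point, YAML comments can sit directly above the columns they describe. The columns below come from the examples on this page; the comment wording is only a suggestion of what such notes might say.

```yaml
schema:
  # Tenant scoping column; listed in required_filter_columns below.
  - name: project_id
    type: UInt
    args:
      size: 64
  # Free-form key/value pairs, queried as tags[key] via the SubscriptableMapper.
  - name: tags
    type: Nested
    args:
      subcolumns:
        - name: key
          type: String
        - name: value
          type: String
validators:
  - validator: EntityRequiredColumnValidator
    args:
      required_filter_columns:
        - project_id
```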
Datasets: Group entities into datasets
Storages: Configure storage backends
Subscriptions: Set up real-time subscriptions