Engine configurations determine how data is stored, merged, and queried in ClickHouse. The SDK provides type-safe builders for all major MergeTree engine variants.
Basic Usage
import { engine } from '@tinybirdco/sdk' ;
engine . mergeTree ({
sortingKey: [ 'user_id' , 'timestamp' ],
partitionKey: 'toYYYYMM(timestamp)' ,
ttl: 'timestamp + INTERVAL 90 DAY' ,
});
Common Configuration Options
All MergeTree engines share these base configuration options:
sortingKey
string | string[]
required
Columns used for sorting data within parts. This is required for all MergeTree engines. sortingKey : [ 'user_id' , 'timestamp' ]
Expression for partitioning data (e.g., 'toYYYYMM(timestamp)'). Partitions enable efficient data management and pruning. partitionKey : 'toYYYYMM(timestamp)'
Primary key columns. Defaults to sortingKey if not specified.
TTL expression for automatic data expiration. ttl : 'timestamp + INTERVAL 90 DAY'
settings
Record<string, string | number | boolean>
Additional engine settings. settings : {
index_granularity : 8192 ,
min_bytes_for_wide_part : 0 ,
}
MergeTree
engine.mergeTree() The most universal and functional table engine for high-load tasks. Best for: General-purpose analytics, logs, eventsimport { engine } from '@tinybirdco/sdk' ;
engine . mergeTree ({
sortingKey: [ 'user_id' , 'timestamp' ],
partitionKey: 'toYYYYMM(timestamp)' ,
ttl: 'timestamp + INTERVAL 90 DAY' ,
});
Full Example
import { defineDatasource , t , engine } from '@tinybirdco/sdk' ;
export const events = defineDatasource ( 'events' , {
schema: {
timestamp: t . dateTime (),
user_id: t . string (),
event_name: t . string (),
properties: t . string (),
},
engine: engine . mergeTree ({
sortingKey: [ 'event_name' , 'timestamp' ],
partitionKey: 'toYYYYMM(timestamp)' ,
ttl: 'timestamp + INTERVAL 90 DAY' ,
}),
});
ReplacingMergeTree
engine.replacingMergeTree() Removes duplicate rows with the same sorting key during background merges. Best for: Maintaining latest state, upserts, slowly changing dimensionsengine . replacingMergeTree ({
sortingKey: [ 'id' ],
ver: 'updated_at' , // Optional version column
});
Optional version column. Rows with the highest version are kept during merges.
Optional flag column for clean mode (ClickHouse 23.2+). Rows with isDeleted = 1 are removed during merges.
Full Example
import { defineDatasource , t , engine } from '@tinybirdco/sdk' ;
export const users = defineDatasource ( 'users' , {
schema: {
id: t . string (),
name: t . string (),
email: t . string (),
updated_at: t . dateTime (),
},
engine: engine . replacingMergeTree ({
sortingKey: [ 'id' ],
ver: 'updated_at' ,
}),
});
SummingMergeTree
engine.summingMergeTree() Sums numeric columns for rows with the same sorting key during background merges. Best for: Counters, metrics aggregation, pre-aggregated dataengine . summingMergeTree ({
sortingKey: [ 'date' , 'metric_name' ],
columns: [ 'value' ], // Columns to sum
});
Columns to sum during merges. If not specified, all numeric columns are summed.
Full Example
import { defineDatasource , t , engine } from '@tinybirdco/sdk' ;
export const metrics = defineDatasource ( 'metrics' , {
schema: {
date: t . date (),
metric_name: t . string (),
value: t . int64 (),
count: t . int32 (),
},
engine: engine . summingMergeTree ({
sortingKey: [ 'date' , 'metric_name' ],
columns: [ 'value' , 'count' ],
}),
});
AggregatingMergeTree
engine.aggregatingMergeTree() For incremental data aggregation with AggregateFunction columns. Best for: Materialized views, incremental aggregation pipelinesengine . aggregatingMergeTree ({
sortingKey: [ 'date' ],
});
Full Example
import { defineDatasource , t , engine } from '@tinybirdco/sdk' ;
export const dailyStats = defineDatasource ( 'daily_stats' , {
schema: {
date: t . date (),
pathname: t . string (),
views: t . simpleAggregateFunction ( 'sum' , t . uint64 ()),
unique_sessions: t . aggregateFunction ( 'uniq' , t . string ()),
},
engine: engine . aggregatingMergeTree ({
sortingKey: [ 'date' , 'pathname' ],
}),
});
CollapsingMergeTree
engine.collapsingMergeTree() For collapsing rows that cancel each other out. Best for: Changelog-style updates, mutable data with deletesengine . collapsingMergeTree ({
sortingKey: [ 'id' , 'timestamp' ],
sign: 'sign' , // Column containing 1 or -1
});
Column containing the sign (1 for state, -1 for cancel).
Full Example
import { defineDatasource , t , engine } from '@tinybirdco/sdk' ;
export const userActions = defineDatasource ( 'user_actions' , {
schema: {
id: t . string (),
timestamp: t . dateTime (),
action: t . string (),
sign: t . int8 (), // 1 or -1
},
engine: engine . collapsingMergeTree ({
sortingKey: [ 'id' , 'timestamp' ],
sign: 'sign' ,
}),
});
VersionedCollapsingMergeTree
engine.versionedCollapsingMergeTree() For collapsing with versioning when events may arrive out of order. Best for: Changelog-style updates with potential out-of-order arrivalengine . versionedCollapsingMergeTree ({
sortingKey: [ 'id' ],
sign: 'sign' ,
version: 'version' ,
});
Column containing the sign (1 for state, -1 for cancel).
Column containing the version number for ordering.
Full Example
import { defineDatasource , t , engine } from '@tinybirdco/sdk' ;
export const changelog = defineDatasource ( 'changelog' , {
schema: {
id: t . string (),
timestamp: t . dateTime (),
data: t . string (),
sign: t . int8 (),
version: t . uint64 (),
},
engine: engine . versionedCollapsingMergeTree ({
sortingKey: [ 'id' ],
sign: 'sign' ,
version: 'version' ,
}),
});
Choosing the Right Engine
MergeTree Use for append-only data like logs, events, and analytics.
ReplacingMergeTree Use when you need to update rows by replacing old versions.
SummingMergeTree Use for counters and metrics that can be summed.
AggregatingMergeTree Use for complex aggregations in materialized views.
CollapsingMergeTree Use for mutable data with changelog semantics.
VersionedCollapsingMergeTree Use for mutable data with out-of-order updates.
Best Practices
Sorting Key Selection
Choose sorting keys based on your query patterns:
// Good: Most selective columns first
sortingKey : [ 'user_id' , 'timestamp' ]
// Good: Match your WHERE clause patterns
sortingKey : [ 'date' , 'country' , 'event_type' ]
Partition Key Selection
Partition by time periods for efficient data management:
// Monthly partitions
partitionKey : 'toYYYYMM(timestamp)'
// Daily partitions (for high-volume data)
partitionKey : 'toYYYYMMDD(timestamp)'
// Yearly partitions (for low-volume data)
partitionKey : 'toYear(timestamp)'
TTL for Data Retention
Automatically expire old data:
// Keep 90 days
ttl : 'timestamp + INTERVAL 90 DAY'
// Keep 1 year
ttl : 'timestamp + INTERVAL 1 YEAR'
Changing the engine type requires recreating the datasource. Plan your engine choice carefully based on your use case.