Engine Configurations

Engine configurations determine how data is stored, merged, and queried in ClickHouse. The SDK provides type-safe builders for all major MergeTree engine variants.

Basic Usage

import { engine } from '@tinybirdco/sdk';

engine.mergeTree({
  sortingKey: ['user_id', 'timestamp'],
  partitionKey: 'toYYYYMM(timestamp)',
  ttl: 'timestamp + INTERVAL 90 DAY',
});

Common Configuration Options

All MergeTree engines share these base configuration options:

sortingKey

string | string[]

required

Columns used for sorting data within parts. This is required for all MergeTree engines.

sortingKey: ['user_id', 'timestamp']

partitionKey

string

Expression for partitioning data (e.g., 'toYYYYMM(timestamp)'). Partitions enable efficient data management and pruning.

partitionKey: 'toYYYYMM(timestamp)'

primaryKey

string | string[]

Primary key columns. Defaults to sortingKey if not specified.

primaryKey: ['user_id']

ttl

string

TTL expression for automatic data expiration.

ttl: 'timestamp + INTERVAL 90 DAY'

settings

Record<string, string | number | boolean>

Additional engine settings.

settings: {
  index_granularity: 8192,
  min_bytes_for_wide_part: 0,
}

MergeTree

engine.mergeTree()

The most universal and functional table engine for high-load tasks.Best for: General-purpose analytics, logs, events

import { engine } from '@tinybirdco/sdk';

engine.mergeTree({
  sortingKey: ['user_id', 'timestamp'],
  partitionKey: 'toYYYYMM(timestamp)',
  ttl: 'timestamp + INTERVAL 90 DAY',
});

Full Example

import { defineDatasource, t, engine } from '@tinybirdco/sdk';

export const events = defineDatasource('events', {
  schema: {
    timestamp: t.dateTime(),
    user_id: t.string(),
    event_name: t.string(),
    properties: t.string(),
  },
  engine: engine.mergeTree({
    sortingKey: ['event_name', 'timestamp'],
    partitionKey: 'toYYYYMM(timestamp)',
    ttl: 'timestamp + INTERVAL 90 DAY',
  }),
});

ReplacingMergeTree

engine.replacingMergeTree()

Removes duplicate rows with the same sorting key during background merges.Best for: Maintaining latest state, upserts, slowly changing dimensions

engine.replacingMergeTree({
  sortingKey: ['id'],
  ver: 'updated_at', // Optional version column
});

ver

string

Optional version column. Rows with the highest version are kept during merges.

isDeleted

string

Optional flag column for clean mode (ClickHouse 23.2+). Rows with isDeleted = 1 are removed during merges.

Full Example

import { defineDatasource, t, engine } from '@tinybirdco/sdk';

export const users = defineDatasource('users', {
  schema: {
    id: t.string(),
    name: t.string(),
    email: t.string(),
    updated_at: t.dateTime(),
  },
  engine: engine.replacingMergeTree({
    sortingKey: ['id'],
    ver: 'updated_at',
  }),
});

SummingMergeTree

engine.summingMergeTree()

Sums numeric columns for rows with the same sorting key during background merges.Best for: Counters, metrics aggregation, pre-aggregated data

engine.summingMergeTree({
  sortingKey: ['date', 'metric_name'],
  columns: ['value'], // Columns to sum
});

columns

string[]

Columns to sum during merges. If not specified, all numeric columns are summed.

Full Example

import { defineDatasource, t, engine } from '@tinybirdco/sdk';

export const metrics = defineDatasource('metrics', {
  schema: {
    date: t.date(),
    metric_name: t.string(),
    value: t.int64(),
    count: t.int32(),
  },
  engine: engine.summingMergeTree({
    sortingKey: ['date', 'metric_name'],
    columns: ['value', 'count'],
  }),
});

AggregatingMergeTree

engine.aggregatingMergeTree()

For incremental data aggregation with AggregateFunction columns.Best for: Materialized views, incremental aggregation pipelines

engine.aggregatingMergeTree({
  sortingKey: ['date'],
});

Full Example

import { defineDatasource, t, engine } from '@tinybirdco/sdk';

export const dailyStats = defineDatasource('daily_stats', {
  schema: {
    date: t.date(),
    pathname: t.string(),
    views: t.simpleAggregateFunction('sum', t.uint64()),
    unique_sessions: t.aggregateFunction('uniq', t.string()),
  },
  engine: engine.aggregatingMergeTree({
    sortingKey: ['date', 'pathname'],
  }),
});

CollapsingMergeTree

engine.collapsingMergeTree()

For collapsing rows that cancel each other out.Best for: Changelog-style updates, mutable data with deletes

engine.collapsingMergeTree({
  sortingKey: ['id', 'timestamp'],
  sign: 'sign', // Column containing 1 or -1
});

sign

string

required

Column containing the sign (1 for state, -1 for cancel).

Full Example

import { defineDatasource, t, engine } from '@tinybirdco/sdk';

export const userActions = defineDatasource('user_actions', {
  schema: {
    id: t.string(),
    timestamp: t.dateTime(),
    action: t.string(),
    sign: t.int8(), // 1 or -1
  },
  engine: engine.collapsingMergeTree({
    sortingKey: ['id', 'timestamp'],
    sign: 'sign',
  }),
});

VersionedCollapsingMergeTree

engine.versionedCollapsingMergeTree()

For collapsing with versioning when events may arrive out of order.Best for: Changelog-style updates with potential out-of-order arrival

engine.versionedCollapsingMergeTree({
  sortingKey: ['id'],
  sign: 'sign',
  version: 'version',
});

sign

string

required

Column containing the sign (1 for state, -1 for cancel).

version

string

required

Column containing the version number for ordering.

Full Example

import { defineDatasource, t, engine } from '@tinybirdco/sdk';

export const changelog = defineDatasource('changelog', {
  schema: {
    id: t.string(),
    timestamp: t.dateTime(),
    data: t.string(),
    sign: t.int8(),
    version: t.uint64(),
  },
  engine: engine.versionedCollapsingMergeTree({
    sortingKey: ['id'],
    sign: 'sign',
    version: 'version',
  }),
});

Choosing the Right Engine

MergeTree

Use for append-only data like logs, events, and analytics.

ReplacingMergeTree

Use when you need to update rows by replacing old versions.

SummingMergeTree

Use for counters and metrics that can be summed.

AggregatingMergeTree

Use for complex aggregations in materialized views.

CollapsingMergeTree

Use for mutable data with changelog semantics.

VersionedCollapsingMergeTree

Use for mutable data with out-of-order updates.

Best Practices

Sorting Key Selection

Choose sorting keys based on your query patterns:

// Good: Most selective columns first
sortingKey: ['user_id', 'timestamp']

// Good: Match your WHERE clause patterns
sortingKey: ['date', 'country', 'event_type']

Partition Key Selection

Partition by time periods for efficient data management:

// Monthly partitions
partitionKey: 'toYYYYMM(timestamp)'

// Daily partitions (for high-volume data)
partitionKey: 'toYYYYMMDD(timestamp)'

// Yearly partitions (for low-volume data)
partitionKey: 'toYear(timestamp)'

TTL for Data Retention

Automatically expire old data:

// Keep 90 days
ttl: 'timestamp + INTERVAL 90 DAY'

// Keep 1 year
ttl: 'timestamp + INTERVAL 1 YEAR'

Changing the engine type requires recreating the datasource. Plan your engine choice carefully based on your use case.

Schema Definitions

Type Validators

Runtime Client

Public API

Type Inference

Basic Usage

Common Configuration Options

MergeTree

engine.mergeTree()

Full Example

ReplacingMergeTree

engine.replacingMergeTree()

Full Example

SummingMergeTree

engine.summingMergeTree()

Full Example

AggregatingMergeTree

engine.aggregatingMergeTree()

Full Example

CollapsingMergeTree

engine.collapsingMergeTree()

Full Example

VersionedCollapsingMergeTree

engine.versionedCollapsingMergeTree()

Full Example

Choosing the Right Engine

MergeTree

ReplacingMergeTree

SummingMergeTree

AggregatingMergeTree

CollapsingMergeTree

VersionedCollapsingMergeTree

Best Practices

Sorting Key Selection

Partition Key Selection

TTL for Data Retention

Build docs developers (and LLMs) love

Schema Definitions

Type Validators

Runtime Client

Public API

Type Inference

​Basic Usage

​Common Configuration Options

​MergeTree

engine.mergeTree()

​Full Example

​ReplacingMergeTree

engine.replacingMergeTree()

​Full Example

​SummingMergeTree

engine.summingMergeTree()

​Full Example

​AggregatingMergeTree

engine.aggregatingMergeTree()

​Full Example

​CollapsingMergeTree

engine.collapsingMergeTree()

​Full Example

​VersionedCollapsingMergeTree

engine.versionedCollapsingMergeTree()

​Full Example

​Choosing the Right Engine

MergeTree

ReplacingMergeTree

SummingMergeTree

AggregatingMergeTree

CollapsingMergeTree

VersionedCollapsingMergeTree

​Best Practices

​Sorting Key Selection

​Partition Key Selection

​TTL for Data Retention

Build docs developers (and LLMs) love

Basic Usage

Common Configuration Options

MergeTree

Full Example

ReplacingMergeTree

Full Example

SummingMergeTree

Full Example

AggregatingMergeTree

Full Example

CollapsingMergeTree

Full Example

VersionedCollapsingMergeTree

Full Example

Choosing the Right Engine

Best Practices

Sorting Key Selection

Partition Key Selection

TTL for Data Retention