
Duckling

A DuckDB server that replicates your MySQL data. Columnar storage makes analytical queries 73x to 13,539x faster in the 20M-row benchmark below, depending on query shape. Writes are ACID: no partial syncs, no duplicates.

What it does

Replicates MySQL tables into DuckDB with ACID transactions. Incremental sync only fetches what changed since the last watermark. If you need sub-second replication, there’s optional CDC through the MySQL binlog. Query it over REST, WebSocket, or the MySQL wire protocol on port 3307 (so any MySQL client just works). There’s a Nuxt 4 dashboard too. Runs on a $20/month droplet.

Quick Start

Get Duckling running in minutes with Docker

Installation

Installation options: Docker, manual setup, or development

API Reference

REST endpoints, WebSocket SDK, and query examples

Configuration

Environment variables and tuning options

Performance Benchmarks

Real numbers from the benchmark suite with 20M rows:
| Query | MySQL | DuckDB | Speedup |
| --- | --- | --- | --- |
| Full table count | 4,258 ms | 4 ms | 1,064x |
| Filtered count | 1,475 ms | 20 ms | 73x |
| Group by status | 433,259 ms | 32 ms | 13,539x |
| Region x status breakdown | 16,685 ms | 112 ms | 148x |
| Monthly revenue (2023) | 7,536 ms | 30 ms | 251x |
| Regional analytics | 446,038 ms | 394 ms | 1,132x |
| Total | 909,251 ms | 592 ms | 1,535x |

Run the benchmark yourself:
cd benchmark
./run.sh                        # 20M rows
BENCHMARK_SCALE=0.01 ./run.sh   # 200K rows (quick smoke test)
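
To make the Speedup column easy to check, here is a small sketch that recomputes it from the MySQL and DuckDB timings in the benchmark table. The variable and function names are illustrative, not part of Duckling, and the use of truncating division is an assumption: it is simply what reproduces the published figures.

```python
# Timings in milliseconds, copied from the benchmark table: (MySQL, DuckDB).
timings_ms = {
    "Full table count": (4258, 4),
    "Filtered count": (1475, 20),
    "Group by status": (433259, 32),
    "Region x status breakdown": (16685, 112),
    "Monthly revenue (2023)": (7536, 30),
    "Regional analytics": (446038, 394),
}


def speedup(mysql_ms: int, duckdb_ms: int) -> int:
    """Whole-number speedup; truncation matches the figures in the table."""
    return mysql_ms // duckdb_ms


speedups = {name: speedup(m, d) for name, (m, d) in timings_ms.items()}
total_mysql = sum(m for m, _ in timings_ms.values())
total_duckdb = sum(d for _, d in timings_ms.values())
total_speedup = speedup(total_mysql, total_duckdb)

for name, factor in speedups.items():
    print(f"{name}: {factor:,}x")
print(f"Total: {total_mysql:,} ms vs {total_duckdb:,} ms = {total_speedup:,}x")
```

The Total row is a ratio of summed times, not an average of the per-query speedups, so it is dominated by the two slowest MySQL queries (Group by status and Regional analytics).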

Key Features

| Feature | Details |
| --- | --- |
| Data integrity | ACID transactions, primary key constraints |
| Storage | Single DuckDB file per database, columnar, compressed |
| Sync modes | Full, incremental (watermark), CDC (binlog) |
| Query access | REST API, WebSocket SDK, native MySQL protocol (port 3307) |
| Multi-database | Multiple MySQL sources replicated independently |
| Schema changes | Additive column evolution, zero downtime |

How Sync Works

| Mechanism | Trigger | When to use |
| --- | --- | --- |
| Full sync | Manual or first run | Initial load, disaster recovery |
| Incremental | Every 15 min (automatic) | Routine catch-up |
| CDC | MySQL binlog stream | Sub-second replication (opt-in) |

Full and incremental sync batch-read rows from MySQL into DuckDB using INSERT OR REPLACE. Watermarks track the last processed ID/timestamp per table. Tables sync in parallel with per-table locking, so a slow table doesn’t block the rest.

CDC streams binlog events via ZongJi, processing inserts, updates, and deletes in order. Binlog position is checkpointed for resume after restarts. Enable with CDC_ENABLED=true. It can run alongside scheduled syncs.
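
The watermark-based incremental sync described above can be sketched roughly as follows. This is not Duckling's implementation: Python's built-in sqlite3 stands in for both MySQL (source) and DuckDB (target) so the example runs anywhere, and the `orders` table, the `sync_watermarks` table, and all function names are hypothetical. It shows the three ideas from the text: read in batches, upsert with INSERT OR REPLACE, and advance a per-table watermark in the same transaction as the rows.

```python
import sqlite3

BATCH_SIZE = 2  # tiny on purpose so the demo runs several batches

SCHEMA = "CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT, updated_at INTEGER)"


def read_watermark(target):
    """Return the last processed (updated_at, id) pair, or (0, 0) before the first sync."""
    row = target.execute(
        "SELECT last_updated_at, last_id FROM sync_watermarks WHERE table_name = 'orders'"
    ).fetchone()
    return row if row else (0, 0)


def incremental_sync(source, target):
    """Copy rows changed since the stored watermark; return how many rows were written."""
    written = 0
    while True:
        ts, last_id = read_watermark(target)
        # Timestamp plus id as a tiebreaker, so rows sharing a timestamp
        # are not skipped when they straddle a batch boundary.
        rows = source.execute(
            "SELECT id, status, updated_at FROM orders "
            "WHERE updated_at > ? OR (updated_at = ? AND id > ?) "
            "ORDER BY updated_at, id LIMIT ?",
            (ts, ts, last_id, BATCH_SIZE),
        ).fetchall()
        if not rows:
            return written
        new_id, _, new_ts = rows[-1]
        with target:  # rows and watermark commit together, or not at all
            target.executemany("INSERT OR REPLACE INTO orders VALUES (?, ?, ?)", rows)
            target.execute(
                "INSERT OR REPLACE INTO sync_watermarks VALUES ('orders', ?, ?)",
                (new_ts, new_id),
            )
        written += len(rows)


source = sqlite3.connect(":memory:")  # stand-in for MySQL
target = sqlite3.connect(":memory:")  # stand-in for DuckDB
source.execute(SCHEMA)
target.execute(SCHEMA)
target.execute(
    "CREATE TABLE sync_watermarks "
    "(table_name TEXT PRIMARY KEY, last_updated_at INTEGER, last_id INTEGER)"
)
source.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "new", 100), (2, "new", 100), (3, "new", 100)],
)
source.commit()

first_run = incremental_sync(source, target)  # initial catch-up copies every row

# Upstream, one row changes and one row is added.
source.execute("UPDATE orders SET status = 'paid', updated_at = 200 WHERE id = 2")
source.execute("INSERT INTO orders VALUES (4, 'new', 210)")
source.commit()

second_run = incremental_sync(source, target)  # fetches only rows 2 and 4
synced = target.execute("SELECT id, status FROM orders ORDER BY id").fetchall()
print(first_run, second_run, synced)
```

Because the upsert is keyed on the primary key, re-reading a row replaces it instead of duplicating it. A watermark query of this shape cannot see rows deleted upstream, which is one reason deletes are listed under CDC.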

MySQL Wire Protocol

Port 3307 runs a MySQL wire protocol server. Connect with any MySQL client:
mysql -h 127.0.0.1 -P 3307 -u duckling -p
If it speaks MySQL, it works. Queries hit DuckDB under the hood.

| Variable | Default | Description |
| --- | --- | --- |
| MYSQL_PROTOCOL_ENABLED | true | Turn the protocol server on/off |
| MYSQL_PROTOCOL_PORT | 3307 | TCP port |
| MYSQL_PROTOCOL_USER | duckling | Login username |
| MYSQL_PROTOCOL_PASSWORD | uses DUCKLING_API_KEY | Login password |
| MYSQL_PROTOCOL_MAX_CONNECTIONS | 50 | Max concurrent connections |
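
As an illustration of how the defaults in this table combine, here is a minimal sketch that resolves the five variables from an environment mapping. The `ProtocolConfig` class and `load_protocol_config` function are hypothetical names, and the parsing rules (for example, treating only the string "true" as enabled) are assumptions; Duckling's own loader may behave differently.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class ProtocolConfig:
    enabled: bool
    port: int
    user: str
    password: str
    max_connections: int


def load_protocol_config(env: Mapping[str, str] = os.environ) -> ProtocolConfig:
    """Resolve wire-protocol settings, applying the documented defaults."""
    return ProtocolConfig(
        # Assumption: only the literal string "true" (any case) enables the server.
        enabled=env.get("MYSQL_PROTOCOL_ENABLED", "true").lower() == "true",
        port=int(env.get("MYSQL_PROTOCOL_PORT", "3307")),
        user=env.get("MYSQL_PROTOCOL_USER", "duckling"),
        # Falls back to DUCKLING_API_KEY when no explicit password is set.
        password=env.get("MYSQL_PROTOCOL_PASSWORD") or env.get("DUCKLING_API_KEY", ""),
        max_connections=int(env.get("MYSQL_PROTOCOL_MAX_CONNECTIONS", "50")),
    )


# With only an API key set, every other value comes from the defaults.
defaults = load_protocol_config({"DUCKLING_API_KEY": "secret-key"})

# Explicit settings win over the defaults and over the API-key fallback.
overridden = load_protocol_config(
    {
        "MYSQL_PROTOCOL_PORT": "3308",
        "MYSQL_PROTOCOL_PASSWORD": "pw",
        "DUCKLING_API_KEY": "secret-key",
    }
)
print(defaults)
print(overridden)
```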

Why DuckDB and not X?

vs MariaDB ColumnStore: DuckDB is embedded (no separate server), runs on 4GB RAM instead of 128GB, handles empty strings correctly, and costs $20/month instead of $500+. If you have 100TB+ of data, ColumnStore might make more sense. We don’t.

vs ClickHouse: No ZooKeeper, no cluster management, no JOIN limits, actual ACID transactions. ClickHouse wins when you’re ingesting 100K+ rows/sec into 10B+ row tables and have ops people to babysit it.

Get Started

Quick Start Guide

Follow the quick start to get Duckling running with Docker in under 5 minutes
