
Sync Issues

Tables Not Syncing

Symptoms: Tables from MySQL are not appearing in DuckDB after sync.
Cause: Tables may be in the EXCLUDED_TABLES configuration.
# Check .env file
cat .env | grep EXCLUDED_TABLES

# Remove table from exclusion list
EXCLUDED_TABLES=old_table,temp_table  # Remove the table you want to sync

# Restart and trigger sync
docker-compose restart duckdb-server
curl -X POST http://localhost:3001/api/sync/full
Cause: Table may not exist or may be in a different database.
# List all MySQL tables
docker exec duckling-server node scripts/mysql.js "SHOW TABLES"

# Check specific table
docker exec duckling-server node scripts/mysql.js "DESCRIBE table_name"
Cause: Sync may be failing with errors.
# Check recent error logs
curl "http://localhost:3001/api/sync-logs?status=error&limit=10"

# View detailed logs
docker-compose logs -f duckdb-server | grep ERROR

Incremental Sync Missing Records

Symptoms: New records in MySQL are not appearing in DuckDB during incremental sync.
Watermark Column Missing: This is the most common cause. Tables without updated_at, created_at, or timestamp columns cannot use incremental sync.
# Check table structure for watermark columns
docker exec duckling-server node scripts/mysql.js "DESCRIBE table_name"

# Watermark columns (in priority order):
# 1. updated_at / updatedAt / modified_at / modifiedAt
# 2. created_at / createdAt
# 3. timestamp
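The priority order above can be sketched as a small helper (illustrative only; `pick_watermark_column` is a hypothetical name, not Duckling's actual API):

```python
# Candidate watermark columns, highest priority first (mirrors the list above).
WATERMARK_CANDIDATES = [
    "updated_at", "updatedAt", "modified_at", "modifiedAt",
    "created_at", "createdAt",
    "timestamp",
]

def pick_watermark_column(columns):
    """Return the highest-priority watermark column present, or None.

    `columns` is the list of column names from DESCRIBE table_name.
    A table that yields None cannot use incremental sync and must
    fall back to full sync.
    """
    lowered = {c.lower(): c for c in columns}
    for candidate in WATERMARK_CANDIDATES:
        if candidate.lower() in lowered:
            return lowered[candidate.lower()]
    return None

print(pick_watermark_column(["id", "name", "created_at"]))  # created_at
print(pick_watermark_column(["id", "name"]))                # None
```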
If no watermark column exists:
-- MySQL: Add updated_at column
ALTER TABLE table_name ADD COLUMN updated_at TIMESTAMP 
  DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP;

-- Update existing rows
UPDATE table_name SET updated_at = CURRENT_TIMESTAMP;
Then trigger a full sync to reset the watermark:
curl -X POST http://localhost:3001/api/sync/table/table_name
Cause: Records at exact watermark boundary may be missed.
Duckling uses the >= operator (not >) when filtering on the watermark, so records at the exact boundary are never skipped. The trade-off is that the last synced record is re-processed on every sync, which is safe because INSERT OR REPLACE handles the replay idempotently.
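The shape of the incremental fetch can be sketched as a query builder (illustrative sketch, not Duckling's code; `incremental_query` and its arguments are hypothetical names):

```python
def incremental_query(table, watermark_col, last_watermark):
    """Build the MySQL fetch query for one incremental sync pass.

    Uses >= (not >) so a record whose timestamp equals the stored
    watermark is fetched again rather than skipped; the replayed row
    is harmless because the write side upserts via INSERT OR REPLACE.
    """
    return (
        f"SELECT * FROM {table} "
        f"WHERE {watermark_col} >= '{last_watermark}' "
        f"ORDER BY {watermark_col}"
    )

print(incremental_query("orders", "updated_at", "2024-01-01 00:00:00"))
```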
# Verify watermark for table
curl -X POST http://localhost:3001/api/query \
  -H "Content-Type: application/json" \
  -d '{"sql": "SELECT * FROM sync_log WHERE table_name = '\''table_name'\'' ORDER BY created_at DESC LIMIT 1"}'
If watermark appears stuck, reset it:
# Reset watermark
curl -X POST http://localhost:3001/api/query \
  -d '{"sql": "DELETE FROM sync_log WHERE table_name = '\''table_name'\''"}'

# Trigger full sync
curl -X POST http://localhost:3001/api/sync/table/table_name
Cause: Incremental sync may have failed silently.
# Trigger full sync for specific table
curl -X POST http://localhost:3001/api/sync/table/table_name

# Or full sync for all tables
curl -X POST http://localhost:3001/api/sync/full

Duplicate Records

Symptoms: Same record appears multiple times in DuckDB.
Cause: Table may be missing a primary key in MySQL.
# Verify primary key exists
docker exec duckling-server node scripts/mysql.js "SHOW CREATE TABLE table_name"

# Check for PRIMARY KEY constraint
No Primary Key: Tables without primary keys can accumulate duplicates. The INSERT OR REPLACE upsert logic requires a primary key to identify duplicate records.
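The upsert behaviour can be illustrated in miniature (pure-Python sketch, not Duckling's code): with a primary key, replayed batches overwrite in place; without one, every replay appends.

```python
def upsert(rows, incoming, pk=None):
    """Apply a batch of incoming records to `rows`.

    With a primary key, existing records are replaced in place
    (INSERT OR REPLACE semantics). Without one there is no identity
    to match on, so repeated syncs simply append duplicates.
    """
    if pk is None:
        return rows + incoming
    merged = {r[pk]: r for r in rows}
    for r in incoming:
        merged[r[pk]] = r
    return list(merged.values())

batch = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
with_pk = upsert(upsert([], batch, pk="id"), batch, pk="id")
without_pk = upsert(upsert([], batch), batch)
print(len(with_pk), len(without_pk))  # 2 4
```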
Add a primary key to the MySQL table:
-- MySQL: Add primary key
ALTER TABLE table_name ADD PRIMARY KEY (id);
Then recreate the DuckDB table:
# Drop and recreate
curl -X POST http://localhost:3001/api/query \
  -d '{"sql": "DROP TABLE table_name"}'

curl -X POST http://localhost:3001/api/sync/table/table_name
For fact tables (append-only), Duckling uses view-level deduplication:
# Check for duplicates
curl -X POST http://localhost:3001/api/query \
  -d '{"sql": "SELECT COUNT(*) AS total, COUNT(DISTINCT id) AS unique_ids FROM table_name"}'

# If duplicates exist, query the deduplicated view
curl -X POST http://localhost:3001/api/query \
  -d '{"sql": "SELECT * FROM table_name_dedup"}'
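The view-level deduplication can be approximated like this (illustrative sketch of keep-latest-per-key semantics; the `synced_at` ordering column is an assumption):

```python
def dedup_latest(rows, key="id", order_col="synced_at"):
    """Collapse duplicates, keeping the most recent row per key.

    Mirrors what a table_name_dedup view exposes over an append-only
    fact table: the same id may exist many times underneath, but
    only the newest version of each survives the view.
    """
    latest = {}
    for row in rows:
        k = row[key]
        if k not in latest or row[order_col] > latest[k][order_col]:
            latest[k] = row
    return sorted(latest.values(), key=lambda r: r[key])

rows = [
    {"id": 1, "total": 10, "synced_at": 1},
    {"id": 1, "total": 12, "synced_at": 2},  # newer duplicate of id 1
    {"id": 2, "total": 7,  "synced_at": 1},
]
print(dedup_latest(rows))  # two rows; id 1 keeps total=12
```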
See Schema Evolution for more on deduplication strategy.

Type Issues

BIGINT UNSIGNED Overflow

Symptoms: Crash or error when syncing tables with very large unsigned BIGINT values.
Known Limitation: BIGINT UNSIGNED values above 9,223,372,036,854,775,807 (BIGINT_MAX) will overflow DuckDB’s signed BIGINT type and crash the appender.
Solution: Use a smaller unsigned type or change to signed BIGINT:
-- MySQL: Change to signed BIGINT
ALTER TABLE table_name MODIFY large_id BIGINT SIGNED;

-- Or use DECIMAL for extremely large numbers
ALTER TABLE table_name MODIFY large_id DECIMAL(20,0);
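Before altering the schema, a pre-sync scan can flag the offending values (hypothetical helper; 2**63 - 1 is the signed BIGINT maximum cited above):

```python
BIGINT_MAX = 2**63 - 1  # 9,223,372,036,854,775,807

def overflowing_values(values):
    """Return unsigned-BIGINT values that cannot fit a signed BIGINT."""
    return [v for v in values if v > BIGINT_MAX]

sample = [1, BIGINT_MAX, BIGINT_MAX + 1, 2**64 - 1]
print(overflowing_values(sample))  # the last two values overflow
```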
See Type Support for more details.

DECIMAL Precision Loss

Symptoms: High-precision decimal values appear truncated in DuckDB.
MySQL DECIMAL(p,s) maps to DuckDB’s bare DECIMAL type, so high-precision values may be truncated depending on DuckDB’s internal DECIMAL width.
Solution: Verify precision requirements:
# Compare values
docker exec duckling-server node scripts/mysql.js \
  "SELECT decimal_column FROM table_name LIMIT 5"

curl -X POST http://localhost:3001/api/query \
  -d '{"sql": "SELECT decimal_column FROM table_name LIMIT 5"}'
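One way to check whether a given target width is lossy is to round-trip sample values through it (pure-Python sketch using the decimal module; the 18-digit/scale-3 default is an assumption for illustration, not DuckDB's documented behaviour):

```python
from decimal import Decimal, ROUND_HALF_UP

def is_lossy(value, precision=18, scale=3):
    """True if `value` cannot be represented exactly at (precision, scale)."""
    d = Decimal(value)
    quantized = d.quantize(Decimal(1).scaleb(-scale), rounding=ROUND_HALF_UP)
    too_wide = len(quantized.as_tuple().digits) > precision
    return too_wide or quantized != d

print(is_lossy("12.345"))    # False: fits scale 3 exactly
print(is_lossy("12.34567"))  # True: scale 5 would be rounded to 3
```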
If precision loss is unacceptable, consider storing as VARCHAR:
-- MySQL: Change to VARCHAR
ALTER TABLE table_name MODIFY decimal_column VARCHAR(50);

Binary Types in CDC

Symptoms: Binary columns (BLOB, BINARY, VARBINARY) not updating during incremental sync or CDC.
Binary Types & CDC: Binary types break the INSERT OR REPLACE SQL path used for CDC updates, so they are supported only via the full-sync Appender path.
Solution: Use full sync for tables with binary columns:
# Disable incremental sync for specific table
# (requires code modification)

# Or trigger periodic full syncs
# Schedule via cron or external scheduler
curl -X POST http://localhost:3001/api/sync/table/table_with_blobs
Alternatively, keep binary columns out of incremental sync by moving them to a separate table.

JSON Stringification Issues

Symptoms: JSON values not readable or malformed in DuckDB.
Solution: Verify JSON validity:
# Check MySQL JSON
docker exec duckling-server node scripts/mysql.js \
  "SELECT json_column FROM table_name LIMIT 1"

# Check DuckDB JSON
curl -X POST http://localhost:3001/api/query \
  -d '{"sql": "SELECT json_column::JSON FROM table_name LIMIT 1"}'
Duckling uses JSON.stringify() for conversion. Invalid JSON in MySQL will cause issues.
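A quick validity filter over sampled values can be sketched as (hypothetical helper; the sample values stand in for rows fetched with the commands above):

```python
import json

def invalid_json_values(values):
    """Return the values that fail to parse as JSON.

    A value that was already malformed in MySQL stays malformed after
    stringification, so validating the source side first narrows the hunt.
    """
    bad = []
    for v in values:
        try:
            json.loads(v)
        except (TypeError, ValueError):
            bad.append(v)
    return bad

print(invalid_json_values(['{"a": 1}', '{broken', 'null']))  # ['{broken']
```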

Spatial Types

Symptoms: Spatial/geometry columns contain binary garbage in DuckDB.
Spatial Types Not Supported: MySQL spatial types (POINT, LINESTRING, POLYGON, GEOMETRY, etc.) fall through to VARCHAR containing the raw binary WKB representation. Tables with spatial columns will sync, but the spatial data is not usable for spatial queries in DuckDB.
Solution: Exclude spatial tables or extract coordinates:
-- MySQL: Extract coordinates to separate columns
ALTER TABLE locations 
  ADD COLUMN lat DECIMAL(10,8),
  ADD COLUMN lng DECIMAL(11,8);

UPDATE locations 
  SET lat = ST_Y(location),
      lng = ST_X(location);
Then sync the lat/lng columns instead of the spatial type.

Connection Issues

MySQL Connection Timeouts

Symptoms: Sync fails with “Connection timeout” or “Lost connection to MySQL server”.
Cause: The connection pool may be exhausted under concurrent syncs.
# .env
MYSQL_MAX_CONNECTIONS=10  # Default is 5

# Restart
docker-compose restart duckdb-server
Cause: The MySQL host may be unreachable or slow to respond.
# Test MySQL connection
docker exec duckling-server node scripts/mysql.js "SELECT 1"

# Check network latency
ping mysql-host
Large batch sizes can cause timeouts on slow connections:
# .env
BATCH_SIZE=5000  # Default is 10000

# Restart
docker-compose restart duckdb-server

DuckDB Lock Errors

Symptoms: “database is locked” errors during queries.
DuckDB uses file-based locking. Multiple processes or containers accessing the same .db file can cause lock contention.
# Ensure only one server instance is running
docker ps | grep duckling

# Stop duplicate instances
docker stop <container_id>
Enable read replicas to separate query workload from sync workload:
# .env
READ_REPLICA_ENABLED=true
REPLICA_REFRESH_INTERVAL=300  # 5 minutes

# Restart
docker-compose restart duckdb-server
# Check disk usage
df -h

# Check DuckDB file size
du -h data/*.db
Low disk space can cause lock errors during writes.

Performance Issues

Slow Queries

Symptoms: DuckDB queries taking longer than expected.
DuckDB is columnar, so select only the columns you need:
-- ❌ Slow: SELECT *
SELECT * FROM large_table;

-- ✅ Fast: SELECT specific columns
SELECT id, name, created_at FROM large_table;
-- Create index for frequently filtered columns
CREATE INDEX idx_users_email ON users(email);
CREATE INDEX idx_orders_created ON orders(created_at);
Adjust concurrent query limits:
# .env
MAX_CONCURRENT_QUERIES=20  # Default is 10
QUERY_TIMEOUT_MS=60000     # Default is 30000 (30s)

# Restart
docker-compose restart duckdb-server

Slow Sync

Symptoms: Full sync taking hours for large tables.
Increase the batch size:
# .env
BATCH_SIZE=20000  # Default is 10000

# For tables > 1M rows, try 20K-50K
# Restart
docker-compose restart duckdb-server
Increase worker-thread parallelism:
# .env
WORKER_THREADS=8  # Default is CPU count - 1

# Restart
docker-compose restart duckdb-server
Ensure incremental sync is enabled:
# .env
ENABLE_INCREMENTAL_SYNC=true  # Default is true

# Verify tables have watermark columns
docker exec duckling-server node scripts/mysql.js "DESCRIBE table_name"

Multi-Database Issues

Database Not Found

Symptoms: API returns “Database not found” error.
# List all configured databases
curl http://localhost:3001/api/databases

# Check database ID matches query parameter
curl 'http://localhost:3001/api/tables?db=correct_id'
# View database configuration
cat data/databases.json

# Ensure database entry exists with correct ID
# Test connection to specific database
curl -X POST http://localhost:3001/api/databases/{id}/test

Cross-Database Queries

Symptoms: Cannot join tables from different databases.
Isolation by Design: Each database is isolated with its own DuckDB file. Cross-database queries are not supported.
Workaround: Replicate shared tables to both databases or use application-level joins.
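An application-level join over results fetched from two isolated databases can be sketched as (illustrative; `orders` and `users` stand in for two separate /api/query responses):

```python
def hash_join(left_rows, right_rows, key):
    """Join two result sets in application code on a shared key column."""
    index = {}
    for r in right_rows:
        index.setdefault(r[key], []).append(r)
    joined = []
    for l in left_rows:
        for r in index.get(l[key], []):
            joined.append({**l, **r})
    return joined

# Results fetched separately, one per database:
orders = [{"user_id": 1, "total": 50}, {"user_id": 2, "total": 30}]
users = [{"user_id": 1, "name": "Ada"}]
print(hash_join(orders, users, "user_id"))
# [{'user_id': 1, 'total': 50, 'name': 'Ada'}]
```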

Automation Issues

Auto-Sync Not Running

Symptoms: Scheduled sync not executing automatically.
# .env
AUTO_START_SYNC=true  # Default is true

# Check automation status
curl http://localhost:3001/api/automation/status
# .env
SYNC_INTERVAL_MINUTES=15  # Default is 15

# Restart to apply changes
docker-compose restart duckdb-server
# Start automation service
curl -X POST http://localhost:3001/api/automation/start

# Verify status
curl http://localhost:3001/api/automation/status

Backup Failures

Symptoms: Automatic backups failing silently.
# Verify sufficient disk space for backups
df -h

# Check backup directory
du -sh data/backups/
Verify backup configuration:
# .env
AUTO_BACKUP=true
BACKUP_INTERVAL_HOURS=24
BACKUP_RETENTION_DAYS=7

# Restart
docker-compose restart duckdb-server
# Test backup manually
curl -X POST http://localhost:3001/api/automation/backup

# Check logs for errors
docker-compose logs -f duckdb-server | grep backup

Debugging Tips

Enable Debug Logging

# .env
LOG_LEVEL=debug  # Default is info

# Restart
docker-compose restart duckdb-server

# View logs
docker-compose logs -f duckdb-server

Check Health Endpoint

# System health check
curl http://localhost:3001/health

# Detailed status
curl http://localhost:3001/status

# Sync metrics
curl http://localhost:3001/metrics

Validate Data Integrity

# Compare MySQL vs DuckDB counts
curl http://localhost:3001/api/sync/validate

# Check specific table
curl -X POST http://localhost:3001/api/query \
  -d '{"sql": "SELECT COUNT(*) FROM table_name"}'

docker exec duckling-server node scripts/mysql.js \
  "SELECT COUNT(*) FROM table_name"
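The per-table comparison can be sketched as (illustrative helper; the counts stand in for the two query results above, and the tolerance parameter is an assumption):

```python
def validate_counts(mysql_count, duckdb_count, tolerance=0):
    """Compare row counts between the MySQL source and the DuckDB replica.

    A small positive drift can be normal mid-sync; anything beyond
    `tolerance` is worth a full re-sync of the table.
    """
    drift = mysql_count - duckdb_count
    return {"drift": drift, "ok": abs(drift) <= tolerance}

print(validate_counts(1000, 1000))  # {'drift': 0, 'ok': True}
print(validate_counts(1000, 997))   # {'drift': 3, 'ok': False}
```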

Export Logs for Support

# Export recent logs
docker-compose logs --tail=500 duckdb-server > duckling-logs.txt

# Export sync logs
curl "http://localhost:3001/api/sync-logs?limit=100" > sync-logs.json

# Export configuration (redact credentials before sharing)
cat .env > config.txt
cat data/databases.json >> config.txt
