Overview

The sync and endsync commands work together to manage data source synchronization: sync begins the process and endsync completes it. They are used when resynchronizing a Kafka data stream after a failure and when setting up a new data source.

When to Use Sync Commands

Use these commands when:
  • Re-streaming a complete snapshot after a Kafka failure
  • A source database becomes unsynchronized with Metadb
  • Setting up a new data source for the first time
  • Recovering from replication slot issues

During synchronization, the Metadb database remains available, but streaming updates are slower than usual. Periodic transforms and external SQL are paused until endsync completes.

metadb sync

Begins synchronization with a data source, preparing the database to accept a fresh snapshot of data.

Syntax

metadb sync --source <source_name> [options]

Options

--source (string, required)
    Name of the data source to synchronize. Must match a configured data source.
    Example: metadb sync --source sensor -D data

-D, --dir (string, required)
    Path to the Metadb data directory.
    Example: metadb sync --source sensor -D /home/metadb/data

--force (boolean)
    Do not prompt for confirmation before starting synchronization.
    Example: metadb sync --source sensor -D data --force

-v, --verbose (boolean)
    Enable verbose output during synchronization.
    Example: metadb sync --source sensor -D data --verbose

--trace (boolean)
    Enable extremely verbose output (requires METADB_DEV=on).

What It Does

  1. Puts the database into synchronizing mode
  2. Pauses periodic transforms and external SQL
  3. Prepares to accept a complete data snapshot
  4. Marks existing data for potential cleanup

Run sync while the server is stopped, before starting it again to stream the new data. The command may take some time to complete.

Example Workflow

# Step 1: Stop the server
metadb stop -D data

# Step 2: Run sync (before starting server)
metadb sync -D data --source sensor

# Step 3: Start the server to begin streaming
metadb start -D data -l metadb.log

metadb endsync

Completes synchronization by removing old data that was not refreshed during the sync process.

Syntax

metadb endsync --source <source_name> [options]

Options

--source (string, required)
    Name of the data source to finish synchronizing.
    Example: metadb endsync --source sensor -D data

-D, --dir (string, required)
    Path to the Metadb data directory.
    Example: metadb endsync --source sensor -D /home/metadb/data

--force (boolean)
    Do not prompt for confirmation before ending synchronization.
    Example: metadb endsync --source sensor -D data --force

-v, --verbose (boolean)
    Enable verbose output.
    Example: metadb endsync --source sensor -D data --verbose

--trace (boolean)
    Enable extremely verbose output (requires METADB_DEV=on).

What It Does

  1. Removes old data that wasn’t refreshed by the new stream
  2. Exits synchronizing mode
  3. Resumes periodic transforms and external SQL
  4. Returns database to normal operation

You must run endsync to complete the synchronization process. Its timing affects what users see:
  • Too early: records are removed before they have been re-streamed, so data is temporarily missing
  • Too late: records deleted at the source remain until endsync runs, so extra data is temporarily present

When to Run endsync

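Run endsync after the snapshot has finished streaming. To determine when that is:
  • Watch for “source snapshot complete (deadline exceeded)” in the log
  • Check the snapshot status with the list status command
  • When in doubt, run endsync later rather than earlier: delayed cleanup only leaves extra records in place temporarily, whereas running too early removes records that have not yet been re-streamed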
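
As a minimal sketch of the log check described above, the following shell function tests whether a log file already contains the completion message. The function name snapshot_complete and the log path are illustrative assumptions, not part of Metadb; only the message text comes from this page.

```shell
# Hypothetical helper (not part of Metadb): succeeds only if the given log
# file contains the snapshot-complete message quoted above.
snapshot_complete() {
    local logfile="$1"
    # A missing or unreadable log counts as "not complete".
    [ -r "$logfile" ] || return 1
    # -F: match the text literally; -q: report via exit status only.
    grep -Fq 'source snapshot complete' -- "$logfile"
}
```

For example, `snapshot_complete metadb.log && echo "ready for endsync"` prints its message only once the line has appeared. A match only shows that a snapshot was reported complete somewhere in that file, so with a long-lived log it may be worth checking only the lines written since the last sync.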

Example Workflow

# Step 1: Wait for snapshot to complete (check logs)
# Look for: "source snapshot complete (deadline exceeded)"

# Step 2: Stop the server
metadb stop -D data

# Step 3: Run endsync
metadb endsync -D data --source sensor

# Step 4: Restart the server
metadb start -D data -l metadb.log

Complete Resynchronization Procedure

Follow this procedure when resynchronizing a failed Kafka data stream:

1. Update Data Source Configuration

Update the data source with new Kafka topics and consumer group:
alter data source sensor options
    (set topics '^metadb_sensor_2\.', set consumer_group 'metadb_sensor_2_1');
Do not restart the server yet. Continue immediately to Step 2.
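
If topics and consumer groups follow a generation-numbered naming scheme like the one in this example, the statement can be generated rather than typed by hand. The sketch below assumes that scheme (metadb_<source>_<n> for topics and metadb_<source>_<n>_1 for the consumer group), which is inferred from the example and may not match your Kafka setup; resync_alter_sql is a hypothetical helper name.

```shell
# Hypothetical helper: print the alter data source statement for stream
# generation N of a source. The naming scheme is an assumption taken from
# the example above.
resync_alter_sql() {
    local src="$1" gen="$2"
    # In an unquoted here-document, \\ produces a single literal backslash.
    cat <<EOF
alter data source ${src} options
    (set topics '^metadb_${src}_${gen}\\.', set consumer_group 'metadb_${src}_${gen}_1');
EOF
}
```

Running `resync_alter_sql sensor 2` prints the same statement as the example, which can be reviewed before it is run against Metadb.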

2. Stop and Sync

# Stop the running server
metadb stop -D data

# Run sync before starting again
metadb sync -D data --source sensor

3. Start Streaming

# Start the server to begin streaming new data
metadb start -D data -l metadb.log

4. Monitor Progress

Watch the log file for snapshot completion:
tail -f metadb.log
Look for:
source snapshot complete (deadline exceeded)
Or check the snapshot status with the list status command:
list status;

5. Complete Synchronization

# Stop the server
metadb stop -D data

# End sync to remove old data
metadb endsync -D data --source sensor

# Restart the server
metadb start -D data -l metadb.log
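
Steps 2, 3, and 5 can be wrapped in two small shell functions so that the commands always run in the documented order. This is a sketch under stated assumptions: the variables METADB, DATA_DIR, SOURCE, LOG, and DRY_RUN and the function names are hypothetical, and only the metadb commands themselves come from this page. Setting DRY_RUN=1 prints each command instead of running it.

```shell
# Hypothetical wrapper around the documented commands (not part of Metadb).
METADB="${METADB:-metadb}"      # path to the metadb executable
DATA_DIR="${DATA_DIR:-data}"    # Metadb data directory
SOURCE="${SOURCE:-sensor}"      # data source name
LOG="${LOG:-metadb.log}"        # server log file

# Run a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Steps 2 and 3: stop the server, run sync, then start streaming.
begin_resync() {
    run "$METADB" stop -D "$DATA_DIR" &&
    run "$METADB" sync -D "$DATA_DIR" --source "$SOURCE" &&
    run "$METADB" start -D "$DATA_DIR" -l "$LOG"
}

# Step 5: stop the server, run endsync, then restart.
finish_resync() {
    run "$METADB" stop -D "$DATA_DIR" &&
    run "$METADB" endsync -D "$DATA_DIR" --source "$SOURCE" &&
    run "$METADB" start -D "$DATA_DIR" -l "$LOG"
}
```

Each function stops at the first failing command, so sync or endsync is never followed by a start if it did not succeed. If metadb start runs in the foreground in your setup, the final command in each function will not return until the server exits; in that case start the server in the background or through a service manager instead.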

Initial Synchronization for New Sources

When a new data source is created with the create data source command, Metadb automatically enters synchronizing mode, so only endsync needs to be run once the initial snapshot has streamed:
create data source sensor type kafka options (
    brokers 'kafka:29092',
    topics '^metadb_sensor_1\.',
    consumer_group 'metadb_sensor_1_1',
    add_schema_prefix 'sensor_',
    schema_stop_filter 'admin'
);
After the initial snapshot completes:
# Stop server when snapshot is complete
metadb stop -D data

# Complete the initial synchronization
metadb endsync -D data --source sensor

# Restart server
metadb start -D data -l metadb.log

Troubleshooting

Snapshot Taking Too Long

  • Check Kafka connectivity and the status of the consumer group
  • Verify that the source database is not timing out
  • Monitor replication slot lag

Missing Records After endsync

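If you ran endsync too early:
  • Re-run the complete resynchronization procedure to re-stream the missing records
  • Before running endsync again, confirm that the log shows “source snapshot complete (deadline exceeded)”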

Old Records Not Removed

If old deleted records persist:
  • Ensure endsync was run and completed successfully
  • Check logs for errors during endsync
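
One way to check the log for errors is a simple pattern search. The sketch below assumes that error lines contain the word ERROR or FATAL, which is an assumption about the log format rather than something documented here; log_errors is a hypothetical helper name.

```shell
# Hypothetical helper: print log lines that look like errors, prefixed with
# their line numbers. The ERROR|FATAL pattern is an assumed log format.
log_errors() {
    local logfile="$1"
    grep -nE 'ERROR|FATAL' -- "$logfile"
}
```

For example, `log_errors metadb.log | tail -n 20` shows the most recent matches. The function returns a non-zero status when nothing matches.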
Until a failed stream is re-streamed using this procedure, the Metadb database may remain unsynchronized with the source.
