metadb sync / endsync

Overview

The sync and endsync commands work together to resynchronize a data source: sync begins the process and endsync completes it. They are needed when resynchronizing a Kafka data stream after a failure or when setting up a new data source.

When to Use Sync Commands

Use these commands when:

Re-streaming a complete snapshot after a Kafka failure
Resynchronizing a source database that has become unsynchronized with Metadb
Setting up a new data source for the first time
Recovering from replication slot issues

During synchronization, the Metadb database remains available but streaming updates are slower. Periodic transforms and external SQL are paused during this process.

metadb sync

Begins synchronization with a data source, preparing the database to accept a fresh snapshot of data.

Syntax

metadb sync --source <source_name> [options]

Options

--source

string

required

Name of the data source to synchronize. Must match a configured data source.

metadb sync --source sensor -D data

-D, --dir

string

required

Path to the Metadb data directory.

metadb sync --source sensor -D /home/metadb/data

--force

boolean

Do not prompt for confirmation before starting synchronization.

metadb sync --source sensor -D data --force

-v, --verbose

boolean

Enable verbose output during synchronization.

metadb sync --source sensor -D data --verbose

--trace

boolean

Enable extremely verbose output (requires METADB_DEV=on).
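The other options have an example and --trace does not, so here is a minimal sketch. It assumes METADB_DEV is read from the environment and can be set for a single invocation with the shell's variable-prefix syntax. The wrapper name metadb_trace is hypothetical and not part of Metadb.

```shell
# Hypothetical wrapper: run any metadb subcommand with trace output enabled.
# METADB_DEV=on applies only to this one invocation (assumed to be an
# environment variable, as the option description suggests).
metadb_trace() {
    METADB_DEV=on metadb "$@" --trace
}
```

With this defined, `metadb_trace sync --source sensor -D data` would run the sync command with --trace appended.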

What It Does

Puts the database into synchronizing mode
Pauses periodic transforms and external SQL
Prepares to accept a complete data snapshot
Marks existing data for potential cleanup

Run sync while the server is stopped, before restarting it to stream new data. The sync command may take some time to complete.

Example Workflow

# Step 1: Stop the server
metadb stop -D data

# Step 2: Run sync (before starting server)
metadb sync -D data --source sensor

# Step 3: Start the server to begin streaming
metadb start -D data -l metadb.log
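The three steps above can be wrapped so that a failure in one step stops the sequence. This is a sketch only: begin_sync is a hypothetical helper name, and it assumes that metadb is on the PATH and that each command returns a nonzero exit status on failure.

```shell
# Hypothetical helper: stop -> sync -> start, aborting at the first failure.
# Assumes metadb exits nonzero when a step fails.
begin_sync() {
    data_dir=$1
    source_name=$2
    log_file=${3:-metadb.log}

    # Stop the server; sync runs before the server is started again.
    metadb stop -D "$data_dir" || return 1
    # Put the database into synchronizing mode for a fresh snapshot.
    metadb sync -D "$data_dir" --source "$source_name" || return 1
    # Start the server so the new snapshot begins streaming.
    metadb start -D "$data_dir" -l "$log_file"
}
```

Calling `begin_sync data sensor` would run the same commands as the workflow above. Because --force is not passed, sync still prompts for confirmation. If metadb start stays in the foreground in your setup, the function does not return until the server exits.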

metadb endsync

Completes synchronization by removing old data that was not refreshed during the sync process.

Syntax

metadb endsync --source <source_name> [options]

Options

--source

string

required

Name of the data source to finish synchronizing.

metadb endsync --source sensor -D data

-D, --dir

string

required

Path to the Metadb data directory.

metadb endsync --source sensor -D /home/metadb/data

--force

boolean

Do not prompt for confirmation before ending synchronization.

metadb endsync --source sensor -D data --force

-v, --verbose

boolean

Enable verbose output.

metadb endsync --source sensor -D data --verbose

--trace

boolean

Enable extremely verbose output (requires METADB_DEV=on).

What It Does

Removes old data that wasn’t refreshed by the new stream
Exits synchronizing mode
Resumes periodic transforms and external SQL
Returns database to normal operation

You must run endsync to complete the synchronization process. When you run it determines what users see in the database:

Too early: Records are removed before they have been re-streamed (missing data)
Too late: Records deleted in the source remain in Metadb until endsync runs (extra data)

When to Run endsync

Run endsync after the snapshot has finished streaming. Metadb helps you determine the right time:

Watch for “source snapshot complete (deadline exceeded)” in the log
Check snapshot status with the list status command
If in doubt, wait: running endsync too late is generally better than running it too early

Example Workflow

# Step 1: Wait for snapshot to complete (check logs)
# Look for: "source snapshot complete (deadline exceeded)"

# Step 2: Stop the server
metadb stop -D data

# Step 3: Run endsync
metadb endsync -D data --source sensor

# Step 4: Restart the server
metadb start -D data -l metadb.log
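To reduce the risk of running endsync too early, the workflow above can be guarded by a check of the log. The sketch below uses a hypothetical helper, finish_sync, and assumes that the log file covers only the current snapshot (an older completion message in the same file would satisfy the check) and that metadb returns a nonzero exit status on failure.

```shell
# Hypothetical helper: run stop -> endsync -> start only if the log
# already contains the snapshot completion message.
finish_sync() {
    data_dir=$1
    source_name=$2
    log_file=${3:-metadb.log}

    # Refuse to continue unless the log shows the snapshot has finished.
    if ! grep -qF 'source snapshot complete (deadline exceeded)' "$log_file"; then
        echo "snapshot not complete according to $log_file; not running endsync" >&2
        return 1
    fi

    metadb stop -D "$data_dir" || return 1
    # Remove old data that was not refreshed and leave synchronizing mode.
    metadb endsync -D "$data_dir" --source "$source_name" || return 1
    metadb start -D "$data_dir" -l "$log_file"
}
```

Calling `finish_sync data sensor` would then do nothing except print a message while the snapshot is still streaming.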

Complete Resynchronization Procedure

Follow this procedure when resynchronizing a failed Kafka data stream:

1. Update Data Source Configuration

Update the data source with new Kafka topics and consumer group:

alter data source sensor options
    (set topics '^metadb_sensor_2\.', set consumer_group 'metadb_sensor_2_1');

Do not restart the server yet. Continue immediately to Step 2.
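If resynchronization is repeated often, the statement in this step can be generated rather than typed by hand. The sketch below only prints the SQL. The helper name alter_source_sql is hypothetical, and the naming pattern (metadb_<source>_<snapshot> for topics, with a further numeric suffix for the consumer group) is inferred from the examples on this page and is an assumption, not a Metadb requirement.

```shell
# Hypothetical helper: print the "alter data source" statement for a source
# name, snapshot number, and consumer-group number.
# The naming pattern is an assumption inferred from this page's examples.
alter_source_sql() {
    source_name=$1
    snapshot_no=$2
    group_no=${3:-1}
    prefix="metadb_${source_name}_${snapshot_no}"

    printf "alter data source %s options\n" "$source_name"
    # '\.' is passed as an argument so printf does not treat it as an escape.
    printf "    (set topics '^%s%s', set consumer_group '%s_%s');\n" \
        "$prefix" '\.' "$prefix" "$group_no"
}
```

For example, `alter_source_sql sensor 2 1` prints the same statement shown in this step.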

2. Stop and Sync

# Stop the running server
metadb stop -D data

# Run sync before starting again
metadb sync -D data --source sensor

3. Start Streaming

# Start the server to begin streaming new data
metadb start -D data -l metadb.log

4. Monitor Progress

Watch the log file for snapshot completion:

tail -f metadb.log

Look for:

source snapshot complete (deadline exceeded)

Or check the snapshot status with the list status command:

list status;
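Watching the log can also be automated with a polling loop. The following is a sketch under stated assumptions: wait_for_snapshot is a hypothetical helper, the completion message appears in the log exactly as quoted above, and the optional starting line is used to skip completion messages left in the same file by earlier snapshots.

```shell
# Hypothetical helper: poll the log until the completion message appears
# at or after a given line, or give up after a timeout.
# Arguments: log file, first line to search (default 1),
#            timeout in seconds (default 3600), poll interval (default 30).
wait_for_snapshot() {
    log_file=$1
    from_line=${2:-1}
    timeout_s=${3:-3600}
    interval_s=${4:-30}
    waited=0

    until tail -n "+$from_line" "$log_file" 2>/dev/null |
        grep -qF 'source snapshot complete (deadline exceeded)'; do
        if [ "$waited" -ge "$timeout_s" ]; then
            return 1    # timed out without seeing the message
        fi
        sleep "$interval_s"
        waited=$((waited + interval_s))
    done
}
```

One way to use it is to record the log length before Step 3, for example `start_line=$(( $(wc -l < metadb.log) + 1 ))`, and then call `wait_for_snapshot metadb.log "$start_line"` after the server has started. A zero exit status means the message was found.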

5. Complete Synchronization

# Stop the server
metadb stop -D data

# End sync to remove old data
metadb endsync -D data --source sensor

# Restart the server
metadb start -D data -l metadb.log

Initial Synchronization for New Sources

When creating a new data source with create data source, Metadb automatically enters synchronizing mode:

create data source sensor type kafka options (
    brokers 'kafka:29092',
    topics '^metadb_sensor_1\.',
    consumer_group 'metadb_sensor_1_1',
    add_schema_prefix 'sensor_',
    schema_stop_filter 'admin'
);

After the initial snapshot completes:

# Stop server when snapshot is complete
metadb stop -D data

# Complete the initial synchronization
metadb endsync -D data --source sensor

# Restart server
metadb start -D data -l metadb.log

Troubleshooting

Snapshot Taking Too Long

Check Kafka connectivity and consumer group status
Verify no timeout issues in source database
Monitor replication slot lag

Missing Records After endsync

If you ran endsync too early:

Re-run the full sync procedure
On the next attempt, confirm that the snapshot has completed before running endsync

Old Records Not Removed

If records that were deleted in the source still appear in Metadb:

Ensure endsync was run and completed successfully
Check logs for errors during endsync
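To confirm whether endsync completed, its output and exit status can be captured when it runs. This is a sketch, not a documented procedure: run_endsync_logged is a hypothetical helper, and it assumes metadb returns a nonzero exit status on failure. It passes --force because the confirmation prompt would not be visible with output redirected to a file, so use it only when you have already decided to run endsync.

```shell
# Hypothetical helper: run endsync with verbose output saved to a file
# and report whether it succeeded.
# --force skips the confirmation prompt, which would otherwise be hidden
# by the redirection.
run_endsync_logged() {
    data_dir=$1
    source_name=$2
    out_file=${3:-endsync.out}

    if metadb endsync -D "$data_dir" --source "$source_name" \
        --force --verbose >"$out_file" 2>&1; then
        echo "endsync finished; output saved to $out_file"
    else
        rc=$?
        echo "endsync failed with exit status $rc; see $out_file" >&2
        return "$rc"
    fi
}
```

As with the manual workflow, the server would be stopped before calling `run_endsync_logged data sensor` and restarted afterwards.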

Until a failed stream is re-streamed using this procedure, the Metadb database may remain unsynchronized with the source.
