Checkpoint provides APIs for creating openable snapshots of a RocksDB database at a specific point in time. Checkpoints can be used for backups, replication, or creating read-only database copies.

Creating a Checkpoint Object

Create

static Status Create(DB* db, Checkpoint** checkpoint_ptr);
Creates a Checkpoint object for a database.
  • db (DB*): The database to create checkpoints from.
  • checkpoint_ptr (Checkpoint**): Output parameter for the created Checkpoint object.
  • Returns (Status): OK on success.

Creating a Checkpoint

CreateCheckpoint

virtual Status CreateCheckpoint(
  const std::string& checkpoint_dir,
  uint64_t log_size_for_flush = 0,
  uint64_t* sequence_number_ptr = nullptr
);
Builds an openable snapshot of the database.
  • checkpoint_dir (const std::string&): Absolute path where the checkpoint will be created. The directory must not already exist.
  • log_size_for_flush (uint64_t): If the total log file size is greater than or equal to this value, a flush is triggered for all column families. The default, 0, always triggers a flush. Archived logs are not included in the calculation.
  • sequence_number_ptr (uint64_t*): Optional output parameter, set to a sequence number guaranteed to be in the checkpoint.
  • Returns (Status): OK on success; NotSupported if db_paths or cf_paths use multiple directories.
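
The two constraints on checkpoint_dir (absolute path, directory must not exist) can be checked before calling CreateCheckpoint. The following is a minimal sketch using only std::filesystem. CheckTargetDir, TargetDirCheck, and AssertUsableTargetDir are hypothetical names, not part of RocksDB, and the check is only advisory: the directory could appear between the check and the call, so the returned Status remains the authority.

```cpp
#include <cassert>
#include <filesystem>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical result type; not part of the RocksDB API.
enum class TargetDirCheck { kOk, kNotAbsolute, kAlreadyExists };

// Advisory pre-flight check for a checkpoint (or export) target directory.
inline TargetDirCheck CheckTargetDir(const std::string& dir) {
  const fs::path p(dir);
  if (!p.is_absolute()) return TargetDirCheck::kNotAbsolute;
  std::error_code ec;
  // Probe errors (for example, permission denied) are not treated as a
  // failure here; the real API call is left to report them.
  if (fs::exists(p, ec)) return TargetDirCheck::kAlreadyExists;
  return TargetDirCheck::kOk;
}

// Debug-build guard: aborts early if the target directory is unusable.
inline void AssertUsableTargetDir(const std::string& dir) {
  assert(CheckTargetDir(dir) == TargetDirCheck::kOk);
  (void)dir;  // avoids an unused-parameter warning when NDEBUG is defined
}
```

A caller might run CheckTargetDir(path) first and only invoke CreateCheckpoint when it returns kOk, logging a clearer message for the other two cases.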

How Checkpoints Work

  1. SST and Blob Files: Hard linked if the checkpoint directory is on the same filesystem, copied otherwise
  2. Other Files: MANIFEST and other required files are always copied
  3. Consistency: Checkpoint represents a consistent point-in-time snapshot
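
The hard-link-or-copy rule in step 1 can be illustrated outside RocksDB. This is a minimal sketch with std::filesystem, assuming a simple "try to link, fall back to copy" strategy; it is not RocksDB's implementation, and LinkOrCopy, PlaceResult, DemoLinkOrCopy, and the file name 000001.sst are illustrative names only.

```cpp
#include <cassert>
#include <filesystem>
#include <fstream>
#include <system_error>

namespace fs = std::filesystem;

enum class PlaceResult { kHardLinked, kCopied, kFailed };

// Try to hard-link src to dst; if linking fails (for example, because dst is
// on a different filesystem), fall back to copying the bytes.
inline PlaceResult LinkOrCopy(const fs::path& src, const fs::path& dst) {
  std::error_code ec;
  fs::create_hard_link(src, dst, ec);
  if (!ec) return PlaceResult::kHardLinked;
  ec.clear();
  fs::copy_file(src, dst, fs::copy_options::none, ec);
  return ec ? PlaceResult::kFailed : PlaceResult::kCopied;
}

// Demo: place one file into a fresh directory and confirm the bytes arrived.
// scratch_dir should be absent or empty before the call.
inline PlaceResult DemoLinkOrCopy(const fs::path& scratch_dir) {
  fs::create_directories(scratch_dir / "db");
  fs::create_directories(scratch_dir / "checkpoint");
  const fs::path src = scratch_dir / "db" / "000001.sst";
  const fs::path dst = scratch_dir / "checkpoint" / "000001.sst";
  { std::ofstream(src) << "payload"; }  // 7 bytes, closed at end of statement
  const PlaceResult r = LinkOrCopy(src, dst);
  assert(r == PlaceResult::kFailed || fs::file_size(dst) == fs::file_size(src));
  return r;
}
```

On a single filesystem the demo normally returns kHardLinked, which is why same-filesystem checkpoints are fast; across filesystems the same call degrades to a full copy.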

Exporting Column Families

ExportColumnFamily

virtual Status ExportColumnFamily(
  ColumnFamilyHandle* handle,
  const std::string& export_dir,
  ExportImportFilesMetaData** metadata
);
Exports all live SST files of a specified column family.
  • handle (ColumnFamilyHandle*): Column family to export.
  • export_dir (const std::string&): Directory where SST files will be exported. It must not already exist.
  • metadata (ExportImportFilesMetaData**): Output parameter with information about the exported SST files.
  • Returns (Status): OK on success.

Export Behavior

  • SST files are hard linked when export_dir is on the same partition as the database
  • SST files are copied when on different partitions
  • Always triggers a flush
  • export_dir is created by the API

Example: Creating a Checkpoint

#include <cassert>
#include <cstdio>
#include <string>

#include "rocksdb/db.h"
#include "rocksdb/utilities/checkpoint.h"

using namespace ROCKSDB_NAMESPACE;

// Open database
DB* db;
Options options;
options.create_if_missing = true;
Status s = DB::Open(options, "/tmp/testdb", &db);
assert(s.ok());

// Write some data
s = db->Put(WriteOptions(), "key1", "value1");
assert(s.ok());

// Create checkpoint object
Checkpoint* checkpoint;
s = Checkpoint::Create(db, &checkpoint);
assert(s.ok());

// Create checkpoint
uint64_t sequence_number = 0;
s = checkpoint->CreateCheckpoint("/tmp/checkpoint1", 0, &sequence_number);
assert(s.ok());

// Cast so the format specifier is correct on every platform
printf("Checkpoint created at sequence number %llu\n",
       static_cast<unsigned long long>(sequence_number));

// The checkpoint is an openable database
DB* checkpoint_db;
s = DB::OpenForReadOnly(options, "/tmp/checkpoint1", &checkpoint_db);
assert(s.ok());

// Read from checkpoint
std::string value;
s = checkpoint_db->Get(ReadOptions(), "key1", &value);
assert(s.ok());
assert(value == "value1");

delete checkpoint_db;
delete checkpoint;
delete db;

Example: Conditional Flush

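#include <cassert>
#include <cstdint>
#include <cstdio>

#include "rocksdb/db.h"
#include "rocksdb/utilities/checkpoint.h"

using namespace ROCKSDB_NAMESPACE;

// Open an existing database; stop here if it cannot be opened
DB* db = nullptr;
Options options;
Status s = DB::Open(options, "/tmp/testdb", &db);
assert(s.ok());

Checkpoint* checkpoint = nullptr;
s = Checkpoint::Create(db, &checkpoint);
assert(s.ok());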

// Flush only if the total log file size is at least 100 MB
const uint64_t log_size_threshold = 100ull * 1024 * 1024;
s = checkpoint->CreateCheckpoint("/tmp/checkpoint2", log_size_threshold);

if (s.ok()) {
  printf("Checkpoint created\n");
} else {
  fprintf(stderr, "Checkpoint failed: %s\n", s.ToString().c_str());
}

delete checkpoint;
delete db;

Example: Exporting a Column Family

#include <cassert>
#include <cstdio>
#include <string>
#include <vector>

#include "rocksdb/db.h"
#include "rocksdb/utilities/checkpoint.h"

using namespace ROCKSDB_NAMESPACE;

// Open database with column families
DB* db;
std::vector<ColumnFamilyHandle*> handles;
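DBOptions db_options;
db_options.create_if_missing = true;
// Needed so that "new_cf" is created if it does not exist yet
db_options.create_missing_column_families = true;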

std::vector<ColumnFamilyDescriptor> column_families;
column_families.push_back(ColumnFamilyDescriptor(
    kDefaultColumnFamilyName, ColumnFamilyOptions()));
column_families.push_back(ColumnFamilyDescriptor(
    "new_cf", ColumnFamilyOptions()));

Status s = DB::Open(db_options, "/tmp/testdb", column_families,
                    &handles, &db);
assert(s.ok());

// Write to column family
s = db->Put(WriteOptions(), handles[1], "key1", "value1");
assert(s.ok());

// Create checkpoint object
Checkpoint* checkpoint;
s = Checkpoint::Create(db, &checkpoint);
assert(s.ok());

// Export column family
ExportImportFilesMetaData* metadata = nullptr;
s = checkpoint->ExportColumnFamily(handles[1], "/tmp/export", &metadata);
assert(s.ok());

printf("Exported %zu SST files\n", metadata->files.size());

// Clean up
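for (auto handle : handles) {
  // Release handles through the DB rather than deleting them directly
  s = db->DestroyColumnFamilyHandle(handle);
  assert(s.ok());
}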
delete metadata;
delete checkpoint;
delete db;

Use Cases

Database Backups

Checkpoints provide a fast way to create consistent backups:
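// current_timestamp() stands in for an application-defined helper
Checkpoint* checkpoint = nullptr;
Status s = Checkpoint::Create(db, &checkpoint);
if (s.ok()) s = checkpoint->CreateCheckpoint("/backups/" + current_timestamp());
delete checkpoint;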
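
The snippet above calls current_timestamp(), which is not a RocksDB function and is not defined on this page. One possible definition is sketched below, assuming a UTC YYYYMMDD-HHMMSS format; both FormatTimestampUtc and this current_timestamp are hypothetical helpers.

```cpp
#include <cassert>
#include <ctime>
#include <string>

// Formats a UTC time as YYYYMMDD-HHMMSS. Returns an empty string if the
// time cannot be converted.
inline std::string FormatTimestampUtc(std::time_t t) {
  const std::tm* utc = std::gmtime(&t);  // note: std::gmtime is not thread-safe
  if (utc == nullptr) return std::string();
  char buf[32];
  const std::size_t n = std::strftime(buf, sizeof(buf), "%Y%m%d-%H%M%S", utc);
  assert(n > 0);  // the buffer is large enough for this fixed-width format
  return std::string(buf, n);
}

// Hypothetical stand-in for the current_timestamp() used in the snippet.
inline std::string current_timestamp() {
  return FormatTimestampUtc(std::time(nullptr));
}
```

A fixed-width, most-significant-first format keeps backup directory names in chronological order when sorted lexicographically, which simplifies pruning old checkpoints.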

Read Replicas

Create read-only copies for load distribution:
// Create checkpoint
checkpoint->CreateCheckpoint("/replicas/replica1");

// Open as read-only
DB* replica;
DB::OpenForReadOnly(options, "/replicas/replica1", &replica);

Testing and Development

Quickly create database copies for testing:
// Create checkpoint from production data
checkpoint->CreateCheckpoint("/test/testdb");

// Run tests against the checkpoint
runTests("/test/testdb");

Performance Considerations

  • Hard links are nearly instantaneous
  • Copies can be slow for large databases
  • Ensure checkpoint directory is on the same filesystem for best performance

Flush Behavior

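  • Setting log_size_for_flush > 0 can reduce checkpoint time, because the flush is skipped while the total log size stays below the threshold
  • For the most up-to-date snapshot, use log_size_for_flush = 0
  • With two-phase commit (2PC), a flush is always triggered regardless of the setting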
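
The flush rule described above can be written as a single predicate. This is a sketch of the documented behavior only, not RocksDB's internal code; WouldFlush and its parameter names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Models the documented rule: flush when 2PC is in use, or when the total
// (non-archived) log size has reached log_size_for_flush. With the default
// of 0, the comparison is always true, so a flush always happens.
constexpr bool WouldFlush(std::uint64_t total_log_size,
                          std::uint64_t log_size_for_flush,
                          bool two_phase_commit) {
  return two_phase_commit || total_log_size >= log_size_for_flush;
}

// Compile-time checks of the cases listed above.
static_assert(WouldFlush(0, 0, false), "default of 0 always flushes");
static_assert(!WouldFlush(10, 100, false), "below threshold: no flush");
static_assert(WouldFlush(100, 100, false), "threshold is inclusive");
static_assert(WouldFlush(10, 100, true), "2PC always flushes");
```

The inclusive comparison matters at the boundary: a total log size exactly equal to the threshold still triggers a flush.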

Space Efficiency

  • Hard-linked checkpoints use minimal additional disk space
  • Deleting the original database doesn’t affect hard-linked checkpoints
  • Space is only freed when all hard links are deleted
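
These properties follow from how hard links work, and can be observed with std::filesystem alone. The sketch below is independent of RocksDB; SharesStorage, LinkCount, LinkSurvivesOriginalDelete, and the file names are illustrative, and the demo assumes the scratch directory lives on a filesystem that supports hard links.

```cpp
#include <cassert>
#include <cstdint>
#include <filesystem>
#include <fstream>
#include <system_error>

namespace fs = std::filesystem;

// True if both paths name the same underlying file (e.g. two hard links).
inline bool SharesStorage(const fs::path& a, const fs::path& b) {
  std::error_code ec;
  const bool same = fs::equivalent(a, b, ec);
  return !ec && same;
}

// Number of directory entries that point at the file's data; 0 on error.
inline std::uintmax_t LinkCount(const fs::path& p) {
  std::error_code ec;
  const std::uintmax_t n = fs::hard_link_count(p, ec);
  return ec ? 0 : n;
}

// Demo: link a file, delete the original name, and report whether the data
// is still intact through the remaining link.
// scratch_dir should be absent or empty before the call.
inline bool LinkSurvivesOriginalDelete(const fs::path& scratch_dir) {
  fs::create_directories(scratch_dir);
  const fs::path original = scratch_dir / "000001.sst";
  const fs::path linked = scratch_dir / "000001.link.sst";
  { std::ofstream(original) << "payload"; }  // 7 bytes
  std::error_code ec;
  fs::create_hard_link(original, linked, ec);
  if (ec) return false;  // hard links unavailable in this location
  assert(SharesStorage(original, linked));  // one copy of the data on disk
  assert(LinkCount(linked) == 2);
  fs::remove(original);  // drops one name; the data stays until the last link goes
  return fs::exists(linked) && fs::file_size(linked) == 7 &&
         LinkCount(linked) == 1;
}
```

A check like SharesStorage on a database SST file and its checkpoint counterpart is one way to confirm that a checkpoint was linked rather than copied.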

Limitations

Multiple Directories Not Supported

Checkpoints don’t support databases using db_paths or cf_paths with multiple directories (without WALs). The API will return NotSupported.

Sequence Number Guarantees

The value written through sequence_number_ptr is a sequence number guaranteed to be part of the checkpoint, but it is not necessarily the latest sequence number in the checkpoint.

Comparison with BackupEngine

Feature             | Checkpoint          | BackupEngine
--------------------|---------------------|----------------------
Speed               | Fast (hard links)   | Slower (copies files)
Space efficiency    | High (hard links)   | Lower (full copies)
Incremental backups | No                  | Yes
Verification        | No                  | Yes
Restore options     | Simple              | Advanced
Multiple backups    | Manual management   | Built-in
Metadata tracking   | Manual              | Automatic

Use Checkpoint for:
  • Fast point-in-time snapshots
  • Read replicas
  • Same-filesystem operations
Use BackupEngine for:
  • Incremental backups
  • Remote backups
  • Backup verification
  • Production backup workflows
