Point-in-time consistent reads with RocksDB snapshots
Snapshots in RocksDB provide a consistent, point-in-time view of the database. They allow you to read data as it existed at a specific moment, even as concurrent writes continue to modify the database.
    #include <rocksdb/db.h>
    #include <rocksdb/snapshot.h>

    {
      // ManagedSnapshot acquires a snapshot in its constructor
      rocksdb::ManagedSnapshot snapshot(db);

      // Read from the snapshot
      rocksdb::ReadOptions read_opts;
      read_opts.snapshot = snapshot.snapshot();
      std::string value;
      rocksdb::Status s = db->Get(read_opts, "key", &value);

      // The snapshot is released automatically when it goes out of scope
    }

    // Create snapshot
    const rocksdb::Snapshot* snapshot = db->GetSnapshot();
    rocksdb::ReadOptions read_opts;
    read_opts.snapshot = snapshot;

    // Read user and their balance
    std::string user_data, balance;
    db->Get(read_opts, "user:1001", &user_data);
    db->Get(read_opts, "balance:1001", &balance);

    // Values are consistent even if concurrent writes occur:
    // both reads see the same snapshot of the database
    db->ReleaseSnapshot(snapshot);
Without snapshots, concurrent writes between the two reads could lead to inconsistent data (e.g., reading old user data but new balance).
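To make the reasoning concrete, here is a minimal sketch of a multi-version store. It is a toy model, not RocksDB's implementation: the class `ToyVersionedStore` and its methods are hypothetical names introduced for illustration. It assumes every write is stamped with an increasing sequence number and that a snapshot is simply the sequence number that was current when it was taken.

```cpp
#include <cassert>
#include <cstdint>
#include <iterator>
#include <map>
#include <optional>
#include <string>

// Toy multi-version store (hypothetical, for illustration only).
class ToyVersionedStore {
 public:
  // Records a new version of `key` and returns its sequence number.
  uint64_t Put(const std::string& key, const std::string& value) {
    ++latest_seq_;
    versions_[key][latest_seq_] = value;
    return latest_seq_;
  }

  // A snapshot is the latest sequence number at the time of the call.
  uint64_t GetSnapshot() const { return latest_seq_; }

  // Returns the newest version of `key` whose sequence number is <= `seq`,
  // or nullopt if the key had not been written yet at that point.
  std::optional<std::string> GetAt(const std::string& key, uint64_t seq) const {
    assert(seq <= latest_seq_);
    auto key_it = versions_.find(key);
    if (key_it == versions_.end()) return std::nullopt;
    const auto& by_seq = key_it->second;
    auto it = by_seq.upper_bound(seq);  // first version newer than `seq`
    if (it == by_seq.begin()) return std::nullopt;
    return std::prev(it)->second;
  }

  // Reads the latest version, like a read without a snapshot.
  std::optional<std::string> Get(const std::string& key) const {
    return GetAt(key, latest_seq_);
  }

 private:
  uint64_t latest_seq_ = 0;
  std::map<std::string, std::map<uint64_t, std::string>> versions_;
};
```

In this model, two `GetAt` calls with the same snapshot value always observe the same point in the write history, no matter how many `Put` calls happen between them, whereas two plain `Get` calls can straddle a write.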
Scan ranges without interference from concurrent writes:
    const rocksdb::Snapshot* snapshot = db->GetSnapshot();
    rocksdb::ReadOptions read_opts;
    read_opts.snapshot = snapshot;
    rocksdb::Iterator* it = db->NewIterator(read_opts);

    // Consistent range scan
    for (it->Seek("user:1000"); it->Valid(); it->Next()) {
      if (it->key().ToString() > "user:2000") break;
      // Process user data consistently:
      // no new users appear mid-iteration,
      // and no users disappear mid-iteration
      ProcessUser(it->key(), it->value());
    }
    // Valid() also becomes false on error, so check status() after the loop
    if (!it->status().ok()) { /* handle the error */ }

    delete it;
    db->ReleaseSnapshot(snapshot);

    std::vector<rocksdb::ColumnFamilyHandle*> handles;
    // ... open database with column families ...

    // Single snapshot covers all column families
    const rocksdb::Snapshot* snapshot = db->GetSnapshot();
    rocksdb::ReadOptions read_opts;
    read_opts.snapshot = snapshot;

    // Read from different column families consistently
    std::string user, post;
    db->Get(read_opts, handles[0], "user:1001", &user);
    db->Get(read_opts, handles[1], "post:5001", &post);

    // Both reads see the same point-in-time across all CFs
    db->ReleaseSnapshot(snapshot);
A single snapshot provides consistency across ALL column families in the database.
    // Write data
    db->Put(rocksdb::WriteOptions(), "key1", "value1");

    // Create snapshot
    const rocksdb::Snapshot* snapshot = db->GetSnapshot();

    // Update key (creates new version)
    db->Put(rocksdb::WriteOptions(), "key1", "value2");

    // Compaction runs but CANNOT remove "value1"
    // because snapshot still references it

    // Release snapshot
    db->ReleaseSnapshot(snapshot);

    // Now "value1" can be garbage collected during compaction
Important: Long-lived snapshots increase space amplification. They prevent compaction from removing old versions of keys, causing disk space usage to grow.
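The retention rule behind this warning can be sketched for a single key. This is a simplified model, not RocksDB's actual compaction logic: it ignores deletions, merge operands, and other details, and the function name `VersionsToKeep` is hypothetical. It assumes a version must be kept if it is the newest version of the key, or the newest version visible to some live snapshot.

```cpp
#include <algorithm>
#include <cstdint>
#include <iterator>
#include <set>
#include <vector>

// Returns the sequence numbers of the versions of one key that a compaction
// would have to keep under the simplified rule described above.
// `version_seqs`: sequence number of every stored version of the key.
// `snapshot_seqs`: sequence numbers of all live snapshots.
std::vector<uint64_t> VersionsToKeep(std::vector<uint64_t> version_seqs,
                                     const std::vector<uint64_t>& snapshot_seqs) {
  if (version_seqs.empty()) return {};
  std::sort(version_seqs.begin(), version_seqs.end());

  std::set<uint64_t> keep;
  // The newest version is what reads without a snapshot will see.
  keep.insert(version_seqs.back());

  for (uint64_t snap : snapshot_seqs) {
    // First version written after the snapshot was taken.
    auto it = std::upper_bound(version_seqs.begin(), version_seqs.end(), snap);
    // The version just before it (if any) is the one this snapshot reads.
    if (it != version_seqs.begin()) keep.insert(*std::prev(it));
  }
  return std::vector<uint64_t>(keep.begin(), keep.end());
}
```

With versions at sequence numbers 5, 9, and 14 and no snapshots, only 14 survives. A snapshot taken at sequence 7 pins version 5 as well, and it stays pinned for as long as that snapshot is held, which is how a forgotten snapshot turns into extra disk usage.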
    rocksdb::ReadOptions read_opts;
    // No snapshot specified - iterator creates implicit snapshot
    rocksdb::Iterator* it = db->NewIterator(read_opts);

    // Iterator sees consistent snapshot from creation time
    for (it->SeekToFirst(); it->Valid(); it->Next()) {
      // Consistent iteration even if concurrent writes occur
    }

    delete it;
    // Implicit snapshot released when iterator destroyed

    const rocksdb::Snapshot* snapshot = db->GetSnapshot();
    rocksdb::ReadOptions read_opts;
    read_opts.snapshot = snapshot;

    // Multiple iterators share same snapshot
    rocksdb::Iterator* it1 = db->NewIterator(read_opts);
    rocksdb::Iterator* it2 = db->NewIterator(read_opts);

    // Both see identical data

    delete it1;
    delete it2;
    db->ReleaseSnapshot(snapshot);
An iterator created without an explicit snapshot takes its own implicit snapshot:
    rocksdb::ReadOptions read_opts;
    // No snapshot specified
    rocksdb::Iterator* it = db->NewIterator(read_opts);

    // Iterator uses implicit snapshot
    // Snapshot released when iterator destroyed
    delete it;
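    #include <iostream>
    #include <string>

    std::string value;
    // GetProperty returns false if the property is not available
    if (db->GetProperty("rocksdb.oldest-snapshot-sequence", &value)) {
      std::cout << "Oldest snapshot sequence: " << value << std::endl;
    }
    // If the oldest snapshot is very old, it may be preventing compaction
    // from dropping obsolete versions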
    #include <rocksdb/db.h>
    #include <rocksdb/snapshot.h>
    #include <string>
    #include <vector>

    // Application-defined helper (not shown here; signature assumed from its use below)
    std::vector<std::string> ParsePostIds(const std::string& post_list);

    // Query all posts by a user and their details
    std::vector<std::string> GetUserPosts(rocksdb::DB* db,
                                          const std::string& user_id) {
      std::vector<std::string> posts;

      // Create snapshot for consistency
      rocksdb::ManagedSnapshot snapshot(db);
      rocksdb::ReadOptions read_opts;
      read_opts.snapshot = snapshot.snapshot();

      // Get user's post list
      std::string post_list;
      rocksdb::Status s = db->Get(read_opts, "user_posts:" + user_id, &post_list);
      if (!s.ok()) return posts;

      // Parse post IDs
      std::vector<std::string> post_ids = ParsePostIds(post_list);

      // Fetch each post (all consistent with snapshot)
      for (const auto& post_id : post_ids) {
        std::string post_data;
        s = db->Get(read_opts, "post:" + post_id, &post_data);
        if (s.ok()) {
          posts.push_back(post_data);
        }
      }

      return posts;
      // Snapshot automatically released
    }