Overview
A Comparator defines the total order across keys in RocksDB. Custom comparators allow you to implement application-specific key ordering beyond simple lexicographic byte comparison.
Changing comparators after creating a database requires careful migration. A database created with one comparator cannot be safely accessed with a different comparator.
The Comparator Interface
The Comparator class from comparator.h:44 provides the core abstraction for key ordering:
class Comparator : public Customizable, public CompareInterface {
public:
// The name of the comparator (must be unique)
virtual const char* Name() const = 0;
// Three-way comparison: < 0 if a < b, == 0 if a == b, > 0 if a > b
virtual int Compare(const Slice& a, const Slice& b) const = 0;
// Optional: equality check (default uses Compare)
virtual bool Equal(const Slice& a, const Slice& b) const {
return Compare(a, b) == 0;
}
// Advanced: reduce index block size
virtual void FindShortestSeparator(std::string* start,
const Slice& limit) const = 0;
virtual void FindShortSuccessor(std::string* key) const = 0;
};
Key Methods
Compare() - The core comparison function. Must be:
- Thread-safe (may be called concurrently from multiple threads)
- Consistent (same inputs always produce same output)
- Transitive (if a < b and b < c, then a < c)
Name() - Returns a unique identifier. RocksDB uses this to detect comparator mismatches:
Names starting with “rocksdb.” are reserved. Use a custom prefix for your comparators.
FindShortestSeparator() - Optional optimization to reduce index block size. Can be left as no-op.
FindShortSuccessor() - Optional optimization for index blocks. Can be left as no-op.
Built-in Comparators
RocksDB provides several built-in comparators defined in comparator.h:187-208:
BytewiseComparator
const Comparator* BytewiseComparator();
Uses lexicographic ordering on unsigned bytes. Empty string comes before all other strings.
ReverseBytewiseComparator
const Comparator* ReverseBytewiseComparator();
Reverse ordering of BytewiseComparator.
User-Defined Timestamp Comparators
// With uint64_t timestamps
const Comparator* BytewiseComparatorWithU64Ts();
const Comparator* ReverseBytewiseComparatorWithU64Ts();
For user-defined timestamps (UDT), where newer timestamps (larger values) come first for the same user key.
Implementing a Custom Comparator
Example: Integer Key Comparator
class IntegerComparator : public Comparator {
public:
IntegerComparator() {}
const char* Name() const override {
return "myapp.IntegerComparator";
}
int Compare(const Slice& a, const Slice& b) const override {
assert(a.size() == sizeof(int64_t));
assert(b.size() == sizeof(int64_t));
int64_t ia, ib;
memcpy(&ia, a.data(), sizeof(int64_t));
memcpy(&ib, b.data(), sizeof(int64_t));
if (ia < ib) return -1;
if (ia > ib) return 1;
return 0;
}
// Simple no-op implementations for index optimization
void FindShortestSeparator(std::string* start,
const Slice& limit) const override {}
void FindShortSuccessor(std::string* key) const override {}
};
Using Your Custom Comparator
Options options;
options.comparator = new IntegerComparator();
options.create_if_missing = true;
DB* db;
Status s = DB::Open(options, "/tmp/mydb", &db);
The comparator object must outlive the database. Keep it in scope or use a singleton pattern.
User-Defined Timestamps
Starting from RocksDB 7.0, you can associate timestamps with keys. The Comparator constructor accepts a timestamp size:
Comparator(size_t ts_sz) // From comparator.h:48
Timestamp Handling
When timestamps are enabled (timestamp_size_ > 0):
virtual int CompareTimestamp(const Slice& ts1, const Slice& ts2) const;
virtual int CompareWithoutTimestamp(const Slice& a, bool a_has_ts,
const Slice& b, bool b_has_ts) const;
Timestamp Utilities
From comparator.h:216-234:
// Encode uint64_t timestamp to Slice
Slice EncodeU64Ts(uint64_t ts, std::string* ts_buf);
// Decode Slice to uint64_t timestamp
Status DecodeU64Ts(const Slice& ts, uint64_t* int_ts);
// Get min/max timestamp values
Slice MinU64Ts();
Slice MaxU64Ts();
Advanced Features
Key Equivalence
virtual bool CanKeysWithDifferentByteContentsBeEqual() const {
return true;
}
Return false if your comparator ensures different byte sequences are always different keys. This enables optimizations like DataBlockHashIndex.
Same-Length Successor Detection
virtual bool IsSameLengthImmediateSuccessor(
const Slice& s, const Slice& t) const {
return false;
}
Used for auto_prefix_mode optimization. Return true if t is the immediate successor of s with the same length.
Best Practices
Thread Safety: Comparators must be thread-safe. All methods may be called concurrently from multiple threads.
Exception Safety: Exceptions must NOT propagate out of comparator methods. This can cause undefined behavior including data loss.
- Optimize Compare() - This is called very frequently in hot paths
- Avoid memory allocations - Use stack variables when possible
- Keep it simple - Complex comparisons add overhead to all operations
Migration
Changing comparators requires careful planning:
- Export data with old comparator
- Create new database with new comparator
- Import data into new database
There is no automatic migration path.
Filter Integration
If your comparator ignores parts of keys (e.g., suffixes), you must provide a compatible FilterPolicy. The default Bloom filter (NewBloomFilterPolicy()) will not work correctly.
From filter_policy.h:159-165:
If the comparator ignores trailing spaces, it would be incorrect to use a FilterPolicy (like NewBloomFilterPolicy) that does not ignore trailing spaces in keys.
Common Patterns
Composite Keys
class CompositeKeyComparator : public Comparator {
int Compare(const Slice& a, const Slice& b) const override {
// Compare first field
int cmp = CompareField1(a, b);
if (cmp != 0) return cmp;
// If equal, compare second field
return CompareField2(a, b);
}
};
Reverse Timestamp Ordering
class ReverseTimestampComparator : public Comparator {
int Compare(const Slice& a, const Slice& b) const override {
// Extract timestamps (assuming last 8 bytes)
uint64_t ts_a = DecodeFixed64(a.data() + a.size() - 8);
uint64_t ts_b = DecodeFixed64(b.data() + b.size() - 8);
// Newer (larger) timestamps come first
if (ts_a > ts_b) return -1;
if (ts_a < ts_b) return 1;
// Compare keys if timestamps equal
return a.compare(b);
}
};
See Also