Column families in RocksDB allow you to logically partition data within a single database instance while maintaining independent configuration and operational control for each partition.

What are Column Families?

A column family is a logical partition within a RocksDB database. Each column family:
  • Has its own LSM tree structure
  • Can have independent configuration (compression, compaction, etc.)
  • Shares the same Write-Ahead Log (WAL) for durability
  • Supports atomic writes across multiple column families
Think of column families as separate key-value stores within one database, but with the ability to write to multiple families atomically.

Creating Column Families

Opening a Database with Column Families

Every RocksDB database has at least one column family called “default”:
#include <rocksdb/db.h>
#include <iostream>
#include <vector>

rocksdb::DB* db;
std::vector<rocksdb::ColumnFamilyHandle*> handles;

// Define column families
std::vector<rocksdb::ColumnFamilyDescriptor> column_families;

// Default column family (always required)
column_families.push_back(rocksdb::ColumnFamilyDescriptor(
    rocksdb::kDefaultColumnFamilyName,
    rocksdb::ColumnFamilyOptions()));

// Custom column families
column_families.push_back(rocksdb::ColumnFamilyDescriptor(
    "users",
    rocksdb::ColumnFamilyOptions()));
    
column_families.push_back(rocksdb::ColumnFamilyDescriptor(
    "posts",
    rocksdb::ColumnFamilyOptions()));

// Open database
rocksdb::Status s = rocksdb::DB::Open(
    rocksdb::DBOptions(),
    "/tmp/testdb",
    column_families,
    &handles,
    &db);

if (!s.ok()) {
  std::cerr << "Failed to open: " << s.ToString() << std::endl;
}
You must open ALL existing column families when opening a database. Use DB::ListColumnFamilies() to discover existing column families.

Listing Existing Column Families

#include <rocksdb/db.h>
#include <iostream>
#include <string>
#include <vector>

std::vector<std::string> column_families;
rocksdb::Status s = rocksdb::DB::ListColumnFamilies(
    rocksdb::DBOptions(),
    "/tmp/testdb",
    &column_families);

if (s.ok()) {
  for (const auto& cf_name : column_families) {
    std::cout << "Found column family: " << cf_name << std::endl;
  }
}

Creating New Column Families

rocksdb::ColumnFamilyHandle* cf_handle;
rocksdb::ColumnFamilyOptions cf_options;

// Configure options for this column family
cf_options.compression = rocksdb::kZSTD;
cf_options.write_buffer_size = 128 << 20; // 128MB

// Create column family
rocksdb::Status s = db->CreateColumnFamily(cf_options, "metrics", &cf_handle);

if (s.ok()) {
  std::cout << "Created column family: metrics" << std::endl;
}

Reading and Writing Data

Writing to Column Families

// Write to a specific column family. handles[] follows the order of the
// descriptors passed to DB::Open: handles[0] = "default",
// handles[1] = "users", handles[2] = "posts".
rocksdb::WriteOptions write_opts;

// Write to "users" column family
db->Put(write_opts, handles[1], "user:1001", "{\"name\":\"Alice\"}");

// Write to "posts" column family
db->Put(write_opts, handles[2], "post:5001", "{\"title\":\"Hello World\"}");

// Write to default column family
db->Put(write_opts, handles[0], "config:version", "1.0");

Atomic Writes Across Column Families

Use WriteBatch for atomic writes across multiple column families:
#include <rocksdb/write_batch.h>

rocksdb::WriteBatch batch;

// Add operations to batch
batch.Put(handles[1], "user:1001", "{\"name\":\"Alice\"}");
batch.Put(handles[2], "post:5001", "{\"author\":\"user:1001\"}");
batch.Put(handles[0], "counter:posts", "5001");

// Execute atomically
rocksdb::Status s = db->Write(rocksdb::WriteOptions(), &batch);
All operations in a WriteBatch either succeed together or fail together, even across different column families.

Reading from Column Families

std::string value;
rocksdb::ReadOptions read_opts;

// Read from "users" column family
rocksdb::Status s = db->Get(read_opts, handles[1], "user:1001", &value);

if (s.ok()) {
  std::cout << "User data: " << value << std::endl;
} else if (s.IsNotFound()) {
  std::cout << "User not found" << std::endl;
}

Iterating Over Column Families

// Create iterator for "posts" column family
rocksdb::Iterator* it = db->NewIterator(read_opts, handles[2]);

// Iterate over all posts
for (it->SeekToFirst(); it->Valid(); it->Next()) {
  std::cout << it->key().ToString() << ": " 
            << it->value().ToString() << std::endl;
}

// Valid() returning false can mean end-of-data or an error,
// so check status() before trusting the scan
rocksdb::Status iter_status = it->status();

delete it;

Independent Configuration

Each column family can have different settings optimized for its data:
Optimize for high write throughput:
rocksdb::ColumnFamilyOptions write_heavy_opts;

// Large MemTable for batching writes
write_heavy_opts.write_buffer_size = 256 << 20; // 256MB
write_heavy_opts.max_write_buffer_number = 4;

// Less aggressive compaction
write_heavy_opts.level0_file_num_compaction_trigger = 8;
write_heavy_opts.max_bytes_for_level_base = 1 << 30; // 1GB

// Fast compression
write_heavy_opts.compression = rocksdb::kLZ4Compression;

db->CreateColumnFamily(write_heavy_opts, "logs", &cf_handle);

Use Cases

Store each tenant’s data in a separate column family:
// Create column family per tenant
for (const auto& tenant_id : tenant_ids) {
  rocksdb::ColumnFamilyOptions tenant_opts;
  std::string cf_name = "tenant:" + tenant_id;
  
  db->CreateColumnFamily(tenant_opts, cf_name, &cf_handle);
  tenant_handles[tenant_id] = cf_handle;
}

// Tenant-specific operations
db->Put(write_opts, tenant_handles["acme"], "user:123", "...");
Benefits: independent compaction, easy tenant deletion, per-tenant statistics.

Separate hot and cold data:
// Hot data: recent, frequently accessed
rocksdb::ColumnFamilyOptions hot_opts;
hot_opts.compression = rocksdb::kLZ4Compression; // Fast
db->CreateColumnFamily(hot_opts, "hot_data", &hot_cf);

// Cold data: old, rarely accessed
rocksdb::ColumnFamilyOptions cold_opts;
cold_opts.compression = rocksdb::kZSTD; // Small
cold_opts.bottommost_compression = rocksdb::kZSTD;
cold_opts.bottommost_compression_opts.level = 19;
cold_opts.bottommost_compression_opts.enabled = true; // ignored unless enabled
db->CreateColumnFamily(cold_opts, "cold_data", &cold_cf);

// Move data from hot to cold over time
// ...
Optimize each data type separately:
// Time-series metrics (write-heavy, range scans)
rocksdb::ColumnFamilyOptions metrics_opts;
metrics_opts.write_buffer_size = 256 << 20;
metrics_opts.compaction_style = rocksdb::kCompactionStyleFIFO;
db->CreateColumnFamily(metrics_opts, "metrics", &metrics_cf);

// User profiles (read-heavy, point lookups)
rocksdb::ColumnFamilyOptions profile_opts;
rocksdb::BlockBasedTableOptions table_opts;
table_opts.filter_policy.reset(rocksdb::NewBloomFilterPolicy(10));
profile_opts.table_factory.reset(
    rocksdb::NewBlockBasedTableFactory(table_opts));
db->CreateColumnFamily(profile_opts, "profiles", &profile_cf);

Managing Column Families

Dropping Column Families

// Drop column family
rocksdb::Status s = db->DropColumnFamily(handles[2]);

if (s.ok()) {
  // The handle must still be released after the drop
  db->DestroyColumnFamilyHandle(handles[2]);
  handles.erase(handles.begin() + 2);
}
Dropping a column family marks it for deletion but doesn’t immediately free space. Files are deleted during compaction.

Getting Column Family Metadata

rocksdb::ColumnFamilyMetaData cf_metadata;
db->GetColumnFamilyMetaData(handles[1], &cf_metadata);

std::cout << "Column family: " << cf_metadata.name << std::endl;
std::cout << "Size: " << cf_metadata.size << " bytes" << std::endl;
std::cout << "File count: " << cf_metadata.file_count << std::endl;
std::cout << "Levels: " << cf_metadata.levels.size() << std::endl;

for (const auto& level : cf_metadata.levels) {
  std::cout << "  Level " << level.level << ": " 
            << level.files.size() << " files" << std::endl;
}

Dynamic Options

Change column family options at runtime:
#include <unordered_map>

std::unordered_map<std::string, std::string> new_options;
new_options["write_buffer_size"] = "134217728"; // 128MB
new_options["max_write_buffer_number"] = "4";

rocksdb::Status s = db->SetOptions(handles[1], new_options);

if (s.ok()) {
  std::cout << "Updated options for column family" << std::endl;
}

Shared Resources

While column families have independent LSM trees, they share some resources:
  • Write-Ahead Log (WAL): All column families share the same WAL
  • Block Cache: Can be shared across column families
  • Compaction Threads: Background threads are shared
  • Rate Limiter: I/O rate limiting applies globally
// Shared block cache across all column families
std::shared_ptr<rocksdb::Cache> shared_cache = 
    rocksdb::NewLRUCache(2ULL << 30); // 2GB shared cache (2 << 30 would overflow int)

rocksdb::BlockBasedTableOptions table_opts;
table_opts.block_cache = shared_cache;

rocksdb::ColumnFamilyOptions cf_opts;
cf_opts.table_factory.reset(
    rocksdb::NewBlockBasedTableFactory(table_opts));

// Use the same table options for all column families,
// keeping each returned handle for later use
for (const auto& cf_name : {"users", "posts", "comments"}) {
  db->CreateColumnFamily(cf_opts, cf_name, &cf_handle);
  handles.push_back(cf_handle);
}

Cleanup

Always properly destroy column family handles:
// Release column family handles; DestroyColumnFamilyHandle is
// preferred over deleting the handles directly
for (auto handle : handles) {
  db->DestroyColumnFamilyHandle(handle);
}

// Close database
delete db;
Column family handles must be destroyed before closing the database.

Best Practices

  1. Use column families for logical partitions with different access patterns
  2. Limit the number of column families (typically < 100) to avoid resource overhead
  3. Share block cache across column families to maximize cache efficiency
  4. Use atomic WriteBatch when consistency across families is required
  5. Monitor per-CF statistics to identify performance issues
  6. Always destroy handles before closing the database

Next Steps

LSM Tree Design

Understand the LSM structure behind column families

Compaction

Learn about per-CF compaction strategies

Write-Ahead Log

See how WAL works across column families

Performance Tuning

Optimize column family configurations
