RocksDB supports multiple table formats for storing SST files, each optimized for different use cases. The most common is the block-based table format.

Block-Based Table Options

BlockBasedTableOptions

struct BlockBasedTableOptions {
  bool cache_index_and_filter_blocks = false;
  bool cache_index_and_filter_blocks_with_high_priority = true;
  bool pin_l0_filter_and_index_blocks_in_cache = false;
  bool pin_top_level_index_and_filter = true;
  MetadataCacheOptions metadata_cache_options;
  IndexType index_type = kBinarySearch;
  DataBlockIndexType data_block_index_type = kDataBlockBinarySearch;
  ChecksumType checksum = kXXH3;
  bool no_block_cache = false;
  std::shared_ptr<Cache> block_cache = nullptr;
  uint64_t block_size = 4 * 1024;
  int block_restart_interval = 16;
  uint64_t metadata_block_size = 4096;
  std::shared_ptr<const FilterPolicy> filter_policy = nullptr;
  bool whole_key_filtering = true;
  uint32_t format_version = 7;
};

Index Types

enum IndexType : char {
  kBinarySearch = 0x00,
  kHashSearch = 0x01,
  kTwoLevelIndexSearch = 0x02,
  kBinarySearchWithFirstKey = 0x03
};
kBinarySearch
IndexType
Space-efficient index optimized for binary search.
kHashSearch
IndexType
Hash index used for prefix lookups when prefix_extractor is provided.
kTwoLevelIndexSearch
IndexType
Two-level index in which both levels use binary search. Second-level index blocks are stored in the block cache.
kBinarySearchWithFirstKey
IndexType
Binary search index that also contains the first key of each block, allowing iterators to defer reading blocks.
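
As an illustration of how a binary-search index narrows a lookup to a single data block, here is a minimal standalone sketch. It does not use RocksDB's internal classes: IndexEntry and FindBlock are hypothetical names, and the sketch assumes each index entry simply stores the largest key of its block.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical index entry: the largest key stored in a data block plus the
// block's position in the file. Real RocksDB index entries differ in detail.
struct IndexEntry {
  std::string last_key;
  std::size_t block_offset;
};

// Returns the position (within `index`) of the only block that could contain
// `key`, or index.size() if the key is larger than every key in the file.
std::size_t FindBlock(const std::vector<IndexEntry>& index,
                      const std::string& key) {
  // Entries must be sorted by last_key for binary search to be valid.
  assert(std::is_sorted(index.begin(), index.end(),
                        [](const IndexEntry& a, const IndexEntry& b) {
                          return a.last_key < b.last_key;
                        }));
  auto it = std::lower_bound(
      index.begin(), index.end(), key,
      [](const IndexEntry& e, const std::string& k) { return e.last_key < k; });
  return static_cast<std::size_t>(it - index.begin());
}
```

Only the block selected this way needs to be read and searched, which is why the index can stay small relative to the data it covers.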

Table Factory Creation

NewBlockBasedTableFactory

TableFactory* NewBlockBasedTableFactory(
  const BlockBasedTableOptions& table_options = BlockBasedTableOptions()
);
Creates a block-based table factory with the specified options.
table_options
const BlockBasedTableOptions&
Configuration for block-based tables.
Returns
TableFactory*
A pointer to the created table factory.

NewPlainTableFactory

TableFactory* NewPlainTableFactory(
  const PlainTableOptions& options = PlainTableOptions()
);
Creates a plain table factory. The plain table format is optimized for low query latency on pure-memory or very low-latency media.

NewCuckooTableFactory

TableFactory* NewCuckooTableFactory(
  const CuckooTableOptions& table_options = CuckooTableOptions()
);
Creates a cuckoo hash table factory for SST files.

Cache Configuration

Block Cache

std::shared_ptr<Cache> block_cache = nullptr;
block_cache
std::shared_ptr<Cache>
Cache for data blocks. If nullptr and no_block_cache is false, a 32MB internal cache is created.
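
To make the idea of a byte-budgeted block cache concrete, here is a minimal least-recently-used sketch. It is not RocksDB's implementation: TinyLruCache and its methods are hypothetical names, and a real block cache also handles sharding, priorities, and thread safety.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical byte-budgeted LRU cache keyed by a block identifier.
class TinyLruCache {
 public:
  explicit TinyLruCache(std::size_t capacity_bytes)
      : capacity_(capacity_bytes) {
    assert(capacity_bytes > 0);
  }

  // Inserts or replaces a block, then evicts least-recently-used blocks
  // until the total size fits the budget again.
  void Insert(const std::string& key, std::string block) {
    Erase(key);
    used_ += block.size();
    lru_.emplace_front(key, std::move(block));
    map_[key] = lru_.begin();
    while (used_ > capacity_ && !lru_.empty()) {
      used_ -= lru_.back().second.size();
      map_.erase(lru_.back().first);
      lru_.pop_back();
    }
  }

  // Returns the cached block or nullptr; a hit marks the block most recent.
  const std::string* Lookup(const std::string& key) {
    auto it = map_.find(key);
    if (it == map_.end()) return nullptr;
    lru_.splice(lru_.begin(), lru_, it->second);
    return &it->second->second;
  }

  std::size_t used_bytes() const { return used_; }

 private:
  // Removes an existing entry for `key`, if any, and releases its bytes.
  void Erase(const std::string& key) {
    auto it = map_.find(key);
    if (it == map_.end()) return;
    used_ -= it->second->second.size();
    lru_.erase(it->second);
    map_.erase(it);
  }

  using Entry = std::pair<std::string, std::string>;  // (key, block bytes)
  std::size_t capacity_;
  std::size_t used_ = 0;
  std::list<Entry> lru_;  // front = most recently used
  std::unordered_map<std::string, std::list<Entry>::iterator> map_;
};
```

The capacity plays the same role as the size passed when creating a cache: once the budget is exceeded, the blocks touched longest ago are dropped first.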

Caching Index and Filter Blocks

bool cache_index_and_filter_blocks = false;
cache_index_and_filter_blocks
bool
When false, index and filter blocks are preloaded when a table is opened and held in memory outside the block cache. When true, they are stored in the block cache and can be evicted like other blocks.

Checksum Types

enum ChecksumType : char {
  kNoChecksum = 0x0,
  kCRC32c = 0x1,
  kxxHash = 0x2,
  kxxHash64 = 0x3,
  kXXH3 = 0x4
};
kXXH3
ChecksumType
Default. Fast and high-quality checksum. Supported since RocksDB 6.27.
kCRC32c
ChecksumType
CRC32c checksum with hardware acceleration on x86.
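
For readers unfamiliar with how a block checksum is verified, the following sketch computes a plain CRC32C in software and compares it against a stored value. It is only an illustration: Crc32c and VerifyBlock are hypothetical helpers, the bitwise loop is far slower than hardware-accelerated code, and RocksDB's on-disk checksum handling has additional details that are not modeled here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Bitwise (table-free) CRC32C using the reflected Castagnoli polynomial.
// Slow but short; production code uses lookup tables or CPU instructions.
std::uint32_t Crc32c(const char* data, std::size_t size) {
  assert(data != nullptr || size == 0);
  std::uint32_t crc = 0xFFFFFFFFu;
  for (std::size_t i = 0; i < size; ++i) {
    crc ^= static_cast<unsigned char>(data[i]);
    for (int bit = 0; bit < 8; ++bit) {
      // Shift right; XOR in the polynomial when the dropped bit was 1.
      crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
    }
  }
  return crc ^ 0xFFFFFFFFu;
}

// Hypothetical helper: a block passes verification when the checksum
// recomputed from its bytes matches the one stored alongside it.
bool VerifyBlock(const std::string& block, std::uint32_t stored_checksum) {
  return Crc32c(block.data(), block.size()) == stored_checksum;
}
```

A single changed byte yields a different checksum, so a mismatch on read signals corruption before the block contents are used.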

Block Size and Compression

block_size

uint64_t block_size = 4 * 1024;
block_size
uint64_t
Approximate size of uncompressed data packed per block. Actual disk read size may be smaller if compression is enabled.
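
The effect of block_size on index size can be estimated with simple arithmetic. The helper below is a hypothetical back-of-the-envelope sketch that ignores per-block overhead and assumes one index entry per data block.

```cpp
#include <cassert>
#include <cstdint>

// Rough number of data blocks (and therefore index entries) for a file that
// holds `uncompressed_bytes` of data, ignoring per-block overhead.
std::uint64_t ApproxBlockCount(std::uint64_t uncompressed_bytes,
                               std::uint64_t block_size) {
  assert(block_size > 0);
  return (uncompressed_bytes + block_size - 1) / block_size;  // ceiling division
}
```

For 64MB of uncompressed data, 4KB blocks give about 16384 blocks while 16KB blocks give about 4096. Larger blocks shrink the index but make each point lookup read more data.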

block_restart_interval

int block_restart_interval = 16;
block_restart_interval
int
Number of keys between restart points for delta encoding. Minimum value is 1.
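
The sketch below shows the general idea of delta (shared-prefix) encoding with restart points. It is a simplified illustration rather than RocksDB's block builder: EncodedKey and DeltaEncode are hypothetical names, and values and the on-disk byte layout are omitted.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical encoded entry: how many leading bytes are shared with the
// previous key, the remaining suffix, and whether this is a restart point.
struct EncodedKey {
  std::size_t shared;
  std::string suffix;
  bool is_restart;
};

// Encodes sorted keys so that every `restart_interval`-th key is stored in
// full and the keys in between store only what differs from the previous key.
std::vector<EncodedKey> DeltaEncode(const std::vector<std::string>& sorted_keys,
                                    int restart_interval) {
  assert(restart_interval >= 1);
  std::vector<EncodedKey> out;
  for (std::size_t i = 0; i < sorted_keys.size(); ++i) {
    const std::string& key = sorted_keys[i];
    const bool restart = (i % static_cast<std::size_t>(restart_interval)) == 0;
    std::size_t shared = 0;
    if (!restart) {
      const std::string& prev = sorted_keys[i - 1];
      const std::size_t limit = std::min(prev.size(), key.size());
      while (shared < limit && prev[shared] == key[shared]) ++shared;
    }
    out.push_back({shared, key.substr(shared), restart});
  }
  return out;
}
```

A larger interval saves more space because fewer keys are stored in full, but a reader has to decode more entries after jumping to a restart point. With an interval of 1, every key is stored in full.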

Filter Configuration

filter_policy

std::shared_ptr<const FilterPolicy> filter_policy = nullptr;
filter_policy
std::shared_ptr<const FilterPolicy>
Filter policy to reduce disk reads. Use NewBloomFilterPolicy() for most applications.
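
The argument to NewBloomFilterPolicy() is the number of bits per key. As a rough guide to choosing it, the sketch below evaluates the textbook Bloom filter false-positive formula. EstimateBloomFalsePositiveRate is a hypothetical helper, and the result is only an approximation: RocksDB's actual filter implementations may behave somewhat differently.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Textbook false-positive estimate for a Bloom filter with `bits_per_key`
// bits per key and the (rounded) optimal number of hash probes.
double EstimateBloomFalsePositiveRate(double bits_per_key) {
  assert(bits_per_key > 0.0);
  const double ln2 = std::log(2.0);
  const double probes = std::max(1.0, std::round(bits_per_key * ln2));
  return std::pow(1.0 - std::exp(-probes / bits_per_key), probes);
}
```

Under this model, 10 bits per key gives a false-positive rate a little under 1%, and adding more bits per key lowers it further at the cost of larger filters.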

whole_key_filtering

bool whole_key_filtering = true;
whole_key_filtering
bool
If true, place whole keys in the filter (not just prefixes). Must be true for efficient point lookups.

partition_filters

bool partition_filters = false;
partition_filters
bool
Use partitioned filters. Requires kTwoLevelIndexSearch. Filter partition blocks use block cache even when cache_index_and_filter_blocks=false.

Format Version

uint32_t format_version = 7;
format_version
uint32_t
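Schema version used when writing new table files. The default of 7 enables the latest format features. Version 6 added checksum protection and version 5 added faster Bloom filters. Files written with a newer format version cannot be read by RocksDB releases that predate that version.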

Example

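#include "rocksdb/cache.h"
#include "rocksdb/db.h"
#include "rocksdb/filter_policy.h"
#include "rocksdb/options.h"
#include "rocksdb/table.h"

using namespace ROCKSDB_NAMESPACE;

int main() {
  // Configure the block-based table format.
  BlockBasedTableOptions table_options;
  table_options.block_cache = NewLRUCache(512 * 1024 * 1024);  // 512MB cache
  table_options.filter_policy.reset(NewBloomFilterPolicy(10));  // 10 bits per key
  table_options.block_size = 16 * 1024;  // 16KB blocks
  // Keep index and filter blocks in the block cache, with top-level ones pinned.
  table_options.cache_index_and_filter_blocks = true;
  table_options.pin_top_level_index_and_filter = true;

  // Create the table factory; options.table_factory takes ownership of it.
  Options options;
  options.create_if_missing = true;  // without this, opening a missing DB fails
  options.table_factory.reset(NewBlockBasedTableFactory(table_options));

  // Open the database with these options and check the result.
  DB* db = nullptr;
  Status s = DB::Open(options, "/tmp/testdb", &db);
  if (!s.ok()) {
    return 1;
  }

  delete db;  // closes the database
  return 0;
}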

Advanced Options

Metadata Cache Options

struct MetadataCacheOptions {
  PinningTier top_level_index_pinning = PinningTier::kFallback;
  PinningTier partition_pinning = PinningTier::kFallback;
  PinningTier unpartitioned_pinning = PinningTier::kFallback;
};
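Controls which metadata blocks are pinned in the block cache: the top-level index, index and filter partitions, and unpartitioned index and filter blocks. PinningTier::kFallback defers to the older pin_top_level_index_and_filter and pin_l0_filter_and_index_blocks_in_cache settings.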

Read Amplification Measurement

uint32_t read_amp_bytes_per_bit = 0;
read_amp_bytes_per_bit
uint32_t
Enable read amplification statistics. Creates a bitmap to track which parts of blocks are actually read. Must be a power of 2. Default 0 (disabled).
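
The memory cost of the tracking bitmap follows directly from the option's meaning: one bit covers read_amp_bytes_per_bit bytes of block data. The helper below is a hypothetical sketch of that arithmetic, not a RocksDB function.

```cpp
#include <cassert>
#include <cstdint>

// Fraction of loaded block memory used by the tracking bitmap when one bit
// covers `bytes_per_bit` bytes of block data.
double ReadAmpBitmapOverhead(std::uint32_t bytes_per_bit) {
  assert(bytes_per_bit > 0);
  return 1.0 / (8.0 * static_cast<double>(bytes_per_bit));
}
```

A value of 1 costs 12.5% extra memory, 2 costs 6.25%, and 16 costs about 0.78%, so larger values trade measurement precision for lower overhead.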
