
Overview

Bloom filters are probabilistic data structures that help RocksDB avoid disk reads for keys that don’t exist. They are stored in SST files and consulted before reading data blocks.
A properly configured Bloom filter can cut down disk seeks from a handful to a single seek per Get() call.

How Bloom Filters Work

From filter_policy.h:9-15:
A database can be configured with a custom FilterPolicy object. This object is responsible for creating a small filter from a set of keys. These filters are stored in rocksdb and are consulted automatically to decide whether or not to read some information from disk.
Bloom filters provide:
  • No false negatives: If a key exists, the filter will always return “might exist”
  • Possible false positives: Sometimes returns “might exist” for non-existent keys
  • Space efficiency: Uses only a few bits per key
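The guarantees above can be illustrated with a toy filter. This is a minimal sketch, not RocksDB's implementation: the class name `ToyBloomFilter`, the use of `std::hash`, and the double-hashing probe scheme are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Toy Bloom filter for illustration only; this is NOT RocksDB's implementation.
class ToyBloomFilter {
 public:
  ToyBloomFilter(std::size_t num_bits, int num_probes)
      : bits_(num_bits, false), num_probes_(num_probes) {
    assert(num_bits > 0 && num_probes > 0);
  }

  // Set every bit that the key's probes map to.
  void Add(const std::string& key) {
    for (int i = 0; i < num_probes_; ++i) {
      bits_[Probe(key, i)] = true;
    }
  }

  // false => the key was definitely never added (no false negatives).
  // true  => the key might have been added (false positives are possible).
  bool MayContain(const std::string& key) const {
    for (int i = 0; i < num_probes_; ++i) {
      if (!bits_[Probe(key, i)]) return false;
    }
    return true;
  }

 private:
  // Double hashing: probe i lands at h1 + i * h2 (an illustrative choice).
  std::size_t Probe(const std::string& key, int i) const {
    const std::size_t h1 = std::hash<std::string>{}(key);
    const std::size_t h2 = std::hash<std::string>{}(key + "#salt") | 1;
    return (h1 + static_cast<std::size_t>(i) * h2) % bits_.size();
  }

  std::vector<bool> bits_;
  int num_probes_;
};
```

Because `Add` sets every bit that `MayContain` later checks, a key that was added can never be reported as absent. A key that was never added is reported as "might exist" only when all of its probe bits happen to have been set by other keys.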

Creating Bloom Filters

NewBloomFilterPolicy

From filter_policy.h:166-167:
const FilterPolicy* NewBloomFilterPolicy(
    double bits_per_key,
    bool IGNORED_use_block_based_builder = false);
Parameters:
  • bits_per_key: Average bits allocated per key in the filter
    • Recommended: 9.9 → ~1% false positive rate
    • Lower values = more false positives, less memory
    • Higher values = fewer false positives, more memory
Values below 0.5 are rounded down to 0.0 (no filter). Values from 0.5 up to 1.0 are rounded to 1.0 (about a 62% false positive rate).
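To see where the 1% and 62% figures come from, the textbook estimate for an idealized Bloom filter with an optimal number of probes is roughly 0.6185 raised to the power of bits_per_key. The sketch below computes that estimate; the function name `EstimateBloomFpRate` is hypothetical, and the rates RocksDB actually achieves may differ somewhat from this idealized model.

```cpp
#include <cassert>
#include <cmath>

// Textbook estimate for an idealized Bloom filter with an optimal number of
// probes: fp ~= exp(-bits_per_key * ln(2)^2) ~= 0.6185^bits_per_key.
// Hypothetical helper, not part of RocksDB; real filters can deviate.
double EstimateBloomFpRate(double bits_per_key) {
  assert(bits_per_key >= 0.0);
  const double ln2 = std::log(2.0);
  return std::exp(-bits_per_key * ln2 * ln2);
}
```

Under this model, `EstimateBloomFpRate(1.0)` is about 0.62 and `EstimateBloomFpRate(9.9)` is a little under 0.01, which lines up with the figures quoted above.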

Example Configuration

#include "rocksdb/filter_policy.h"
#include "rocksdb/table.h"

BlockBasedTableOptions table_options;
table_options.filter_policy.reset(NewBloomFilterPolicy(9.9));

Options options;
options.table_factory.reset(NewBlockBasedTableFactory(table_options));

Ribbon Filters

From filter_policy.h:169-210, Ribbon filters are a newer alternative:
const FilterPolicy* NewRibbonFilterPolicy(
    double bloom_equivalent_bits_per_key,
    int bloom_before_level = 0);

Advantages

  • ~30% space savings compared to Bloom filters
  • Same false positive rate with fewer bits per key
  • Similar query times

Trade-offs

  • Roughly 3-4x the CPU time of Bloom during construction
  • About 3x the temporary memory during construction
  • Best suited to deeper LSM levels, whose files are larger and longer-lived

Hybrid Configuration

Use Bloom for L0, Ribbon for deeper levels:
// Bloom for L0 (fast), Ribbon for L1+ (space-efficient)
table_options.filter_policy.reset(
    NewRibbonFilterPolicy(
        10.0,              // bloom_equivalent_bits_per_key
        1                  // bloom_before_level (Bloom for L0 only)
    )
);
bloom_before_level values:
  • 0 (default): Bloom only for files written by flushes, Ribbon for compaction output
  • 1: Bloom for L0, Ribbon for L1+
  • -1: Always use Ribbon filters
  • INT_MAX: Always use Bloom filters
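One way to read the values listed above is as a threshold: a flush is treated as if it were "level -1", and Bloom is chosen whenever that level is below bloom_before_level. The sketch below encodes this reading. `ChooseFilterKind` and `FilterKind` are hypothetical names, not RocksDB APIs, and the real choice may also depend on the compaction style and other context.

```cpp
// Hypothetical model of the bloom_before_level threshold; not a RocksDB API.
enum class FilterKind { kBloom, kRibbon };

// level:    LSM level the new file is written to.
// is_flush: true when the file comes from a memtable flush, which this
//           sketch models as "level -1" (an assumption).
FilterKind ChooseFilterKind(int level, bool is_flush, int bloom_before_level) {
  const int effective_level = is_flush ? -1 : level;
  return effective_level < bloom_before_level ? FilterKind::kBloom
                                              : FilterKind::kRibbon;
}
```

With this model, `ChooseFilterKind(0, true, 0)` yields Bloom (flushes only), `ChooseFilterKind(1, false, 1)` yields Ribbon (L1+), a threshold of -1 always yields Ribbon, and a threshold of INT_MAX always yields Bloom.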

Filter Building Context

From filter_policy.h:48-82, filters are built with contextual information:
struct FilterBuildingContext {
  const BlockBasedTableOptions& table_options;
  CompactionStyle compaction_style;
  int num_levels;                  // LSM levels or -1
  Logger* info_log;
  std::string column_family_name;
  int level_at_creation;           // Target level or -1
  bool is_bottommost;              // Bottommost sorted run?
  TableFileCreationReason reason;  // Why creating this file?
};
Use GetBuilderWithContext() for advanced filter customization:
virtual FilterBitsBuilder* GetBuilderWithContext(
    const FilterBuildingContext& context) const = 0;

Statistics

From statistics.h:111-127, RocksDB tracks filter effectiveness:

Tickers

// Filter helped avoid file reads (true negatives)
BLOOM_FILTER_USEFUL

// Filter indicated "might exist" (true + false positives)
BLOOM_FILTER_FULL_POSITIVE

// Key actually existed (true positives)
BLOOM_FILTER_FULL_TRUE_POSITIVE

// Prefix filter stats for point lookups
BLOOM_FILTER_PREFIX_CHECKED
BLOOM_FILTER_PREFIX_USEFUL
BLOOM_FILTER_PREFIX_TRUE_POSITIVE

Monitoring Filter Efficiency

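#include <cstdio>
#include "rocksdb/statistics.h"

// Statistics must be enabled before opening the DB, e.g.
//   options.statistics = CreateDBStatistics();
auto stats = options.statistics;

// Lookups the filter rejected, avoiding a data block read
uint64_t useful = stats->getTickerCount(BLOOM_FILTER_USEFUL);
// All full-filter checks: those that passed plus those that were rejected
uint64_t checked = stats->getTickerCount(BLOOM_FILTER_FULL_POSITIVE) +
                   useful;

if (checked > 0) {
  // Share of filter checks that skipped a read
  double effectiveness = 100.0 * useful / checked;
  printf("Filter effectiveness: %.2f%%\n", effectiveness);
}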

Advanced Options

Partition Filters

For large SST files, partition filters into smaller blocks:
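BlockBasedTableOptions table_options;
// Partitioned filters are normally paired with a partitioned (two-level) index
table_options.index_type = BlockBasedTableOptions::kTwoLevelIndexSearch;
table_options.partition_filters = true;
table_options.metadata_block_size = 4096;  // Target partition size in bytes
table_options.filter_policy.reset(NewBloomFilterPolicy(9.9));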
Partitioned filters reduce memory usage when only part of a large SST is accessed.

Pin Filters in Cache

Keep filters in memory for frequently accessed files:
table_options.cache_index_and_filter_blocks = true;
table_options.pin_l0_filter_and_index_blocks_in_cache = true;

Last Level Optimization

Skip building filters for the last level. This suits workloads where most lookups find their key, because a hit has to read the last level's data anyway:
ColumnFamilyOptions cf_options;
cf_options.optimize_filters_for_hits = true;
From advanced_options.h:801-815:
For keys which are hits, the filters in the last level are not useful because we will search for the data anyway. This flag allows us to not store filters for the last level.
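To get a feel for how much filter space this can save, it helps to estimate what share of all keys lives in the last level. The sketch below assumes an idealized LSM tree in which every level is `fanout` times larger than the one above it; the function name is hypothetical and real trees are rarely this regular.

```cpp
#include <cassert>

// Fraction of all keys stored in the last level, assuming each level holds
// `fanout` times as many keys as the level above it (idealized shape).
// Hypothetical helper, not a RocksDB API.
double LastLevelKeyFraction(int num_levels, double fanout) {
  assert(num_levels >= 1 && fanout > 1.0);
  double level_size = 1.0;  // Relative size of the current level
  double total = 1.0;       // Running sum over all levels so far
  for (int i = 1; i < num_levels; ++i) {
    level_size *= fanout;
    total += level_size;
  }
  return level_size / total;
}
```

With four levels and a fanout of 10, about 90% of keys sit in the last level, so dropping its filters removes roughly that share of total filter space in this model.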

Memory Considerations

Filter Memory Usage

Estimate filter memory:
// Approximate filter size
size_t filter_bytes = num_keys * bits_per_key / 8;
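Wrapped in a function, the estimate above looks like the sketch below. `EstimateFilterBytes` is a hypothetical helper; it ignores per-filter metadata and any rounding RocksDB applies, so treat the result as a ballpark figure.

```cpp
#include <cstdint>

// Approximate total filter size in bytes for a given key count.
// Hypothetical helper: ignores metadata and implementation rounding.
std::uint64_t EstimateFilterBytes(std::uint64_t num_keys, double bits_per_key) {
  const double bytes = static_cast<double>(num_keys) * bits_per_key / 8.0;
  return static_cast<std::uint64_t>(bytes);
}
```

For instance, 100 million keys at 9.9 bits per key come to roughly 124 MB of filter data.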

Cache Integration

Filter blocks held in the block cache are tagged with their own roles, so their usage can be tracked separately from data blocks:
// Cache entries are tagged with role
enum class CacheEntryRole {
  kFilterBlock,        // Full or partitioned filter
  kFilterMetaBlock,    // Partitioned filter metadata
  // ...
};
From cache.h:55-88, track filter cache usage:
BlockCacheEntryStatsMapKeys::UsedBytes(CacheEntryRole::kFilterBlock);

Prefix Bloom Filters

Optimize for prefix scans:
// Define prefix extractor
options.prefix_extractor.reset(
    NewFixedPrefixTransform(4)  // First 4 bytes
);

// Enable prefix bloom
table_options.filter_policy.reset(NewBloomFilterPolicy(9.9));
Prefix extractors must be consistent with your key format. Changing the prefix extractor requires rebuilding all SST files.
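To make the role of the prefix extractor concrete, the sketch below mimics a fixed-length prefix transform using only the standard library. The names `InDomain` and `FixedPrefix` here are standalone illustrative helpers, not the RocksDB classes, and the treatment of short keys is an assumption.

```cpp
#include <cstddef>
#include <string>

// A key shorter than the prefix length has no usable prefix in this sketch.
bool InDomain(const std::string& key, std::size_t prefix_len) {
  return key.size() >= prefix_len;
}

// The first prefix_len bytes of the key; this is what a prefix filter
// would store instead of the whole key.
std::string FixedPrefix(const std::string& key, std::size_t prefix_len) {
  return key.substr(0, prefix_len);
}
```

With a 4-byte prefix, "user0001" and "user9999" both map to "user", so a single filter entry answers "might any key starting with user exist in this file?". This is also why the prefix length has to match the key format: a different length produces different filter entries.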

Custom Filter Policies

Implement FilterPolicy for custom filtering logic:
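#include <memory>
#include "rocksdb/filter_policy.h"

class CustomFilterPolicy : public FilterPolicy {
 public:
  CustomFilterPolicy()
      : bloom_(NewBloomFilterPolicy(12.0)),
        ribbon_(NewRibbonFilterPolicy(10.0)) {}

  const char* Name() const override { return "CustomFilter"; }

  // Report the built-in family name so that built-in readers
  // recognize the filters written by this policy.
  const char* CompatibilityName() const override {
    return bloom_->CompatibilityName();
  }

  FilterBitsBuilder* GetBuilderWithContext(
      const FilterBuildingContext& context) const override {
    // Bloom for files created at L0, Ribbon for everything else
    if (context.level_at_creation == 0) {
      return bloom_->GetBuilderWithContext(context);
    }
    return ribbon_->GetBuilderWithContext(context);
  }

  FilterBitsReader* GetFilterBitsReader(
      const Slice& contents) const override {
    // Built-in policies can read any built-in filter format
    return bloom_->GetFilterBitsReader(contents);
  }

 private:
  // Delegates are created once and owned here, rather than allocated
  // (and leaked) on every call.
  std::unique_ptr<const FilterPolicy> bloom_;
  std::unique_ptr<const FilterPolicy> ribbon_;
};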

Compatibility

From filter_policy.h:94-105:
The CompatibilityName is a shared family name for filters that can read each other’s filters. Bloom and Ribbon filters share compatibility.
Important: All built-in FilterPolicies can read other kinds of built-in filters. Ribbon filters require RocksDB >= 6.15.0. Earlier versions will ignore the filter (degraded performance).

Troubleshooting

High False Positive Rate

  1. Increase bits_per_key: Try 12-15 bits for ~0.1-0.5% FP rate
  2. Check statistics: Monitor BLOOM_FILTER_USEFUL vs BLOOM_FILTER_FULL_POSITIVE
  3. Verify filter size: Ensure filters aren’t truncated or corrupted

High Memory Usage

  1. Use Ribbon filters: 30% space savings
  2. Enable partitioned filters: Reduce peak memory
  3. Disable last-level filters: Use optimize_filters_for_hits

Slow Writes

  1. Use Bloom for L0: Ribbon’s CPU overhead affects flushes
  2. Reduce bits_per_key: Balance filter quality vs build time
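  3. Monitor construction time: Check flush and compaction CPU time (for example COMPACTION_CPU_TOTAL_TIME). FILTER_OPERATION_TOTAL_TIME measures compaction filter callbacks, not Bloom or Ribbon construction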

Best Practices

Start with defaults: Use NewBloomFilterPolicy(9.9) and tune based on metrics.
  1. Monitor effectiveness: Track BLOOM_FILTER_USEFUL ticker
  2. Profile workload: High miss rate benefits most from filters
  3. Consider Ribbon: Use for large databases with space constraints
  4. Pin critical filters: Keep L0/L1 filters in cache
  5. Optimize last level: Enable optimize_filters_for_hits if applicable
