
Overview

The SST Dump Tool (sst_dump) is a specialized utility for examining RocksDB’s SST (Sorted String Table) files. It provides detailed insights into SST file structure, contents, metadata, and compression efficiency.
Use sst_dump to debug data corruption, analyze storage efficiency, evaluate compression algorithms, and understand LSM tree structure.

Building sst_dump

Build the tool:
make sst_dump

Basic Usage

The general syntax:
sst_dump [options] --file=<sst_file_path>
Or simply provide the path as the first argument:
sst_dump <sst_file_path> [options]

Commands

The --command flag specifies the operation mode:

Check (Default)

Verify SST file integrity by iterating through all entries:
sst_dump --file=/path/to/000123.sst --command=check
1. Validation performed

  • Iterates over all key-value pairs
  • Compares the entry count with the table metadata
  • Reports corruption if a mismatch is detected

2. Silent on success

Prints output only if errors are encountered

Scan

Iterate through entries and print them to screen:
sst_dump --file=/path/to/000123.sst --command=scan
With hex output:
sst_dump --file=/path/to/000123.sst \
  --command=scan \
  --output_hex
Limit number of entries:
sst_dump --file=/path/to/000123.sst \
  --command=scan \
  --read_num=100

Raw

Dump complete table contents to a file:
sst_dump --file=/path/to/000123.sst --command=raw
This writes the dump next to the input as 000123_dump.txt (the .sst suffix is replaced with _dump.txt), containing:
  • All key-value pairs
  • Block structure details
  • Footer information
  • Index block contents

Verify

Verify checksums for all blocks without printing data:
sst_dump --file=/path/to/000123.sst --command=verify
Use verify to detect corruption without the overhead of decoding keys and values. It’s faster than check for large files.

Recompress

Analyze file size with different compression algorithms:
sst_dump --file=/path/to/000123.sst \
  --command=recompress \
  --compression_types=kSnappyCompression,kZSTD,kLZ4Compression
Output shows estimated file size with each compression type:
Original Size: 45.2 MB
kSnappyCompression: 38.7 MB (85.6%)
kZSTD: 32.1 MB (71.0%)
kLZ4Compression: 39.2 MB (86.7%)
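The percentages in the sample above are each recompressed size divided by the original size. A minimal sketch of that arithmetic follows; size_percent is a hypothetical helper, not part of sst_dump, and the inputs are the sample figures from the listing above.

```shell
# size_percent NEW ORIG: print NEW as a percentage of ORIG, one decimal place.
# Hypothetical helper for illustration; not an sst_dump feature.
size_percent() {
  awk -v new="$1" -v orig="$2" 'BEGIN { printf "%.1f%%\n", 100 * new / orig }'
}

size_percent 38.7 45.2   # Snappy sample figure
size_percent 32.1 45.2   # ZSTD sample figure
size_percent 39.2 45.2   # LZ4 sample figure
```

A lower percentage means a smaller file; CPU cost is not reflected in this number.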

Identify

Check if file is a valid SST file:
sst_dump --file=/path/to/000123.sst --command=identify
For a directory, lists all valid SST files:
sst_dump --file=/path/to/db_dir --command=identify

Filtering and Range Options

Key Range Scanning

1. From key

Start scanning from a specific key:
sst_dump --file=/path/to/000123.sst \
  --command=scan \
  --from=user_key_100
2. To key

Stop scanning at a specific key:
sst_dump --file=/path/to/000123.sst \
  --command=scan \
  --from=user_key_100 \
  --to=user_key_200
3. Prefix scan

Scan all keys with a specific prefix:
sst_dump --file=/path/to/000123.sst \
  --command=scan \
  --prefix=user_prefix
--prefix cannot be combined with --from.

Hex Key Input

When working with binary keys:
sst_dump --file=/path/to/000123.sst \
  --command=scan \
  --input_key_hex \
  --from=0x757365725f6b65795f313030 \
  --output_hex
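When a key is printable, its hex form can be generated rather than typed by hand. The sketch below assumes a POSIX shell with od and tr available; to_hex is a hypothetical helper name, not an sst_dump feature.

```shell
# to_hex KEY: print KEY as a 0x-prefixed lowercase hex string,
# the form expected by --from/--to when --input_key_hex is set.
# Hypothetical helper for illustration.
to_hex() {
  printf '0x'
  printf '%s' "$1" | od -An -v -tx1 | tr -d ' \n'
  printf '\n'
}

to_hex user_key_100   # prints 0x757365725f6b65795f313030

# Example use (requires sst_dump on PATH):
#   sst_dump --file=/path/to/000123.sst --command=scan \
#     --input_key_hex --from="$(to_hex user_key_100)" --output_hex
```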

Display Options

Show Properties

Display table properties after iteration:
sst_dump --file=/path/to/000123.sst \
  --command=scan \
  --show_properties
Outputs:
  • Data size
  • Index size
  • Filter size
  • Number of entries
  • Number of data blocks
  • Compression type
  • User-collected properties

Show Sequence Numbers

Display sequence numbers and value types (for raw dumps):
sst_dump --file=/path/to/000123.sst \
  --command=raw \
  --show_sequence_number_type

Decode Blob Index

For SST files containing blob references:
sst_dump --file=/path/to/000123.sst \
  --command=scan \
  --decode_blob_index
This decodes blob index entries and displays them in human-readable format.

List Meta Blocks

Show all metadata blocks in the file:
sst_dump --file=/path/to/000123.sst \
  --list_meta_blocks

Compression Analysis

Testing Multiple Compression Types

sst_dump --file=/path/to/000123.sst \
  --command=recompress \
  --compression_types=kSnappyCompression,kZSTD,kLZ4Compression,kNoCompression

Custom Block Size

Test recompression with different block sizes:
sst_dump --file=/path/to/000123.sst \
  --command=recompress \
  --compression_types=kZSTD \
  --set_block_size=32768

Compression Levels

Test a range of compression levels:
sst_dump --file=/path/to/000123.sst \
  --command=recompress \
  --compression_types=kZSTD \
  --compression_level_from=1 \
  --compression_level_to=9

Index Compression

Enable compression for index blocks:
sst_dump --file=/path/to/000123.sst \
  --command=recompress \
  --compression_types=kZSTD \
  --enable_index_compression=1

Dictionary Compression

Configure dictionary-based compression:
sst_dump --file=/path/to/000123.sst \
  --command=recompress \
  --compression_types=kZSTD \
  --compression_max_dict_bytes=32768 \
  --compression_zstd_max_train_bytes=1048576

Advanced Options

Verify Checksum

Enable checksum verification during scans:
sst_dump --file=/path/to/000123.sst \
  --command=scan \
  --verify_checksum

Readahead Size

Optimize I/O with readahead:
sst_dump --file=/path/to/000123.sst \
  --command=scan \
  --readahead_size=4194304  # 4 MiB

Environment and Filesystem URIs

Use custom environments or filesystems:
sst_dump --file=/path/to/000123.sst \
  --env_uri=hdfs://namenode:8020 \
  --command=scan
Or:
sst_dump --file=/path/to/000123.sst \
  --fs_uri=s3://bucket/prefix \
  --command=scan
--env_uri and --fs_uri are mutually exclusive.

Internal Key Parsing

Parse internal key format on command line:
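sst_dump --parse_internal_key=0x757365725f6b65790123010000000000
Output resembles the following (the exact format varies by version, and the user key may be shown in hex):
key=user_key @ 291: kTypeValue
This is useful for understanding RocksDB’s internal key encoding: the user key followed by an 8-byte little-endian trailer that packs (sequence number << 8) | type. In the example, the trailer bytes 01 23 01 00 00 00 00 00 encode sequence number 291 and type 1 (kTypeValue).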
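To experiment with this option, it helps to build an internal key by hand. The sketch below assumes the standard layout: user key bytes, then an 8-byte little-endian trailer holding (sequence number << 8) | type. encode_internal_key is a hypothetical helper, not part of sst_dump.

```shell
# encode_internal_key USER_KEY SEQ TYPE: print the 0x-prefixed hex of
# USER_KEY followed by the 8-byte little-endian trailer (SEQ << 8) | TYPE.
# Hypothetical helper for illustration.
encode_internal_key() {
  ik_packed=$(( ($2 << 8) | $3 ))
  ik_hex=$(printf '%s' "$1" | od -An -v -tx1 | tr -d ' \n')
  ik_i=0
  while [ "$ik_i" -lt 8 ]; do
    # Emit the trailer one byte at a time, least significant byte first.
    ik_hex="$ik_hex$(printf '%02x' $(( (ik_packed >> (8 * ik_i)) & 255 )))"
    ik_i=$((ik_i + 1))
  done
  printf '0x%s\n' "$ik_hex"
}

encode_internal_key user_key 291 1   # prints 0x757365725f6b65790123010000000000

# Example use (requires sst_dump on PATH):
#   sst_dump --parse_internal_key="$(encode_internal_key user_key 291 1)"
```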

Practical Examples

Diagnose Corruption

1. Identify corrupted file

sst_dump --file=/db/000123.sst --command=verify
2. Locate bad block

sst_dump --file=/db/000123.sst \
  --command=check \
  --verify_checksum
3. Extract good data

sst_dump --file=/db/000123.sst \
  --command=scan \
  --to=last_good_key > recovered_data.txt

Optimize Compression

1. Test all compression types

sst_dump --file=/db/000123.sst \
  --command=recompress \
  --compression_types=kSnappyCompression,kZSTD,kLZ4Compression
2. Fine-tune best algorithm

sst_dump --file=/db/000123.sst \
  --command=recompress \
  --compression_types=kZSTD \
  --compression_level_from=1 \
  --compression_level_to=9
3. Test with dictionary

sst_dump --file=/db/000123.sst \
  --command=recompress \
  --compression_types=kZSTD \
  --compression_level_from=3 \
  --compression_level_to=3 \
  --compression_max_dict_bytes=65536

Export Key Range

Extract specific key range from SST file:
sst_dump --file=/db/000123.sst \
  --command=scan \
  --from=start_key \
  --to=end_key \
  --output_hex > extracted_range.txt

Inspect File Properties

sst_dump --file=/db/000123.sst \
  --command=check \
  --show_properties

Analyze All Files in Directory

for file in /db/*.sst; do
  echo "Analyzing $file"
  sst_dump --file="$file" \
    --command=check \
    --show_properties
done

Understanding Output

Scan Output Format

'key1' @ 123 : 1 => 'value1'
'key2' @ 124 : 1 => 'value2'
Format: key @ sequence_number : type => value
Value types:
  • 1 = kTypeValue (regular put)
  • 0 = kTypeDeletion (delete)
  • 2 = kTypeMerge (merge)
  • 7 = kTypeSingleDeletion (single delete)
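To see how many puts, deletes, and merges a file holds, the scan output can be tallied by type. The sketch below assumes lines in the format shown above (key @ sequence_number : type => value); other sst_dump versions may print a different layout, in which case the pattern needs adjusting. count_entry_types is a hypothetical helper name.

```shell
# count_entry_types: read scan output on stdin and print "type count"
# pairs sorted by type. Assumes the "key @ seq : type => value" layout.
# A key that itself contains " @ N : N => " would be misparsed.
count_entry_types() {
  awk '
    match($0, / @ [0-9]+ : [0-9]+ => /) {
      # The matched text splits into: "@", seq, ":", type, "=>"
      split(substr($0, RSTART, RLENGTH), f, " ")
      counts[f[4]]++
    }
    END { for (t in counts) print t, counts[t] }
  ' | sort -n
}

# Sample input; with a real file, pipe in the output of
#   sst_dump --file=/path/to/000123.sst --command=scan
printf '%s\n' \
  "'key1' @ 123 : 1 => 'value1'" \
  "'key2' @ 124 : 1 => 'value2'" \
  "'key3' @ 125 : 0 => ''" | count_entry_types
```

For the sample input this prints one line for type 0 (count 1) and one for type 1 (count 2).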

Properties Output

Example output from --show_properties:
Table Properties:
  data size: 45234567
  index size: 123456
  filter size: 98765
  raw key size: 32145678
  raw value size: 98765432
  num entries: 1000000
  num data blocks: 4523
  compression: Snappy
  compression_options: window_bits=-14; level=32767
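A rough compression ratio can be derived from these properties by dividing the data size by the sum of the raw key and raw value sizes. The sketch below assumes the "name: value" layout of the example above; stored_ratio is a hypothetical helper, and the result ignores index, filter, and per-block overhead.

```shell
# stored_ratio: read properties text on stdin and print
# data size / (raw key size + raw value size) to three decimals.
# Assumes "name: value" lines as in the example output above.
stored_ratio() {
  awk -F': *' '
    { sub(/^ +/, "", $1) }                 # drop leading indentation
    $1 == "data size"      { data = $2 }
    $1 == "raw key size"   { raw += $2 }
    $1 == "raw value size" { raw += $2 }
    END { if (raw > 0) printf "%.3f\n", data / raw }
  '
}

# Figures taken from the example output above.
printf '%s\n' \
  '  data size: 45234567' \
  '  raw key size: 32145678' \
  '  raw value size: 98765432' | stored_ratio   # prints 0.346
```

A value near 1.0 suggests the data is barely compressible with the current settings.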

Common Options Reference

Option                          Description
--file=<path>                   SST file or directory path
--command=<cmd>                 Operation: check, scan, raw, verify, recompress, identify
--output_hex                    Display keys and values in hex format
--input_key_hex                 Interpret --from/--to as hex
--from=<key>                    Start scanning from this key
--to=<key>                      Stop scanning at this key
--prefix=<prefix>               Scan keys with this prefix
--read_num=<n>                  Maximum entries to read
--verify_checksum               Verify checksums during scan
--show_properties               Display table properties
--decode_blob_index             Decode blob index entries
--set_block_size=<size>         Block size for recompression
--compression_types=<list>      Comma-separated compression types
--compression_level_from=<n>    First compression level to test (use with --compression_level_to)
--compression_level_to=<n>      Last compression level to test (use with --compression_level_from)
--list_meta_blocks              List all metadata blocks

Performance Tips

For large files, use --read_num to limit output:
sst_dump --file=/db/large.sst --command=scan --read_num=1000
Use verify instead of check for faster corruption detection without decoding overhead.

Source Reference

SST Dump Tool implementation:
  • include/rocksdb/sst_dump_tool.h:11-14 - Public API (simple Run interface)
  • tools/sst_dump_tool.cc:167-400 - Command-line parsing and tool implementation
  • table/sst_file_dumper.h and table/sst_file_dumper.cc - Core SST file reading logic
Build command:
make sst_dump

Getting Help

Display help message with all options:
sst_dump --help
The help output includes supported compression types for your build.
