Overview
The SST Dump Tool (sst_dump) is a specialized utility for examining RocksDB’s SST (Sorted String Table) files. It provides detailed insights into SST file structure, contents, metadata, and compression efficiency.
Use sst_dump to debug data corruption, analyze storage efficiency, evaluate compression algorithms, and understand LSM tree structure.
Building sst_dump
Build the tool:
Basic Usage
The general syntax:
sst_dump [options] --file=<sst_file_path>
Or simply provide the path as the first argument:
sst_dump <sst_file_path> [options]
Commands
The --command flag specifies the operation mode:
Check (Default)
Verify SST file integrity by iterating through all entries:
sst_dump --file=/path/to/000123.sst --command=check
Validation performed
- Iterates over all key-value pairs
- Compares entry count with metadata
- Reports corruption if mismatch detected
Silent on success
Only prints output if errors are encountered
Scan
Iterate through entries and print them to screen:
sst_dump --file=/path/to/000123.sst --command=scan
With hex output:
sst_dump --file=/path/to/000123.sst \
--command=scan \
--output_hex
Limit number of entries:
sst_dump --file=/path/to/000123.sst \
--command=scan \
--read_num=100
Raw
Dump complete table contents to a file:
sst_dump --file=/path/to/000123.sst --command=raw
This creates 000123.sst_dump.txt with:
- All key-value pairs
- Block structure details
- Footer information
- Index block contents
Verify
Verify checksums for all blocks without printing data:
sst_dump --file=/path/to/000123.sst --command=verify
Use verify to detect corruption without the overhead of decoding keys and values. It’s faster than check for large files.
Recompress
Analyze file size with different compression algorithms:
sst_dump --file=/path/to/000123.sst \
--command=recompress \
--compression_types=kSnappyCompression,kZSTD,kLZ4Compression
Output shows estimated file size with each compression type:
Original Size: 45.2 MB
kSnappyCompression: 38.7 MB (85.6%)
kZSTD: 32.1 MB (71.0%)
kLZ4Compression: 39.2 MB (86.7%)
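The percentages in the sample above read as each recompressed size relative to the original size. A minimal sketch of that arithmetic, assuming sizes in the same unit; `size_pct` is a hypothetical helper, not part of sst_dump:

```shell
# size_pct <recompressed_size> <original_size>
# Prints the recompressed size as a percentage of the original (one decimal).
# Hypothetical helper for illustration; both sizes must use the same unit.
size_pct() {
  awk -v part="$1" -v whole="$2" 'BEGIN { printf "%.1f\n", 100 * part / whole }'
}

# Example with the figures from the sample output (sizes in MB).
size_pct 38.7 45.2
size_pct 32.1 45.2
size_pct 39.2 45.2
```

A lower percentage means a smaller file, so in the sample kZSTD gives the best ratio.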
Identify
Check if file is a valid SST file:
sst_dump --file=/path/to/000123.sst --command=identify
For a directory, lists all valid SST files:
sst_dump --file=/path/to/db_dir --command=identify
Filtering and Range Options
Key Range Scanning
From key
Start scanning from a specific key:
sst_dump --file=/path/to/000123.sst \
--command=scan \
--from=user_key_100
To key
Stop scanning at a specific key:
sst_dump --file=/path/to/000123.sst \
--command=scan \
--from=user_key_100 \
--to=user_key_200
Prefix scan
Scan all keys with a specific prefix:
sst_dump --file=/path/to/000123.sst \
--command=scan \
--prefix=user_prefix
The --prefix option cannot be combined with --from.
When working with binary keys:
sst_dump --file=/path/to/000123.sst \
--command=scan \
--input_key_hex \
--from=0x757365725f6b65795f313030 \
--output_hex
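The hex string passed to --from above is the byte-wise hex encoding of the ASCII key user_key_100. A small sketch for producing such strings from a printable key, using standard `od` and `tr`; `key_to_hex` is a hypothetical helper name:

```shell
# key_to_hex <key>
# Prints the key as a 0x-prefixed lowercase hex string, one byte per two digits.
# Hypothetical helper for illustration; not part of sst_dump.
key_to_hex() {
  printf '0x'
  # od prints space-separated hex bytes; tr strips the spaces and line breaks.
  printf '%s' "$1" | od -An -tx1 | tr -d ' \n'
  printf '\n'
}

key_to_hex user_key_100
```

The result can then be used as the value of --from or --to together with --input_key_hex.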
Display Options
Show Properties
Display table properties after iteration:
sst_dump --file=/path/to/000123.sst \
--command=scan \
--show_properties
Outputs:
- Data size
- Index size
- Filter size
- Number of entries
- Number of data blocks
- Compression type
- User-collected properties
Show Sequence Numbers
Display sequence numbers and value types (for raw dumps):
sst_dump --file=/path/to/000123.sst \
--command=raw \
--show_sequence_number_type
Decode Blob Index
For SST files containing blob references:
sst_dump --file=/path/to/000123.sst \
--command=scan \
--decode_blob_index
This decodes blob index entries and displays them in human-readable format.
Show all metadata blocks in the file:
sst_dump --file=/path/to/000123.sst \
--list_meta_blocks
Compression Analysis
Testing Multiple Compression Types
sst_dump --file=/path/to/000123.sst \
--command=recompress \
--compression_types=kSnappyCompression,kZSTD,kLZ4Compression,kNoCompression
Custom Block Size
Test recompression with different block sizes:
sst_dump --file=/path/to/000123.sst \
--command=recompress \
--compression_types=kZSTD \
--block_size=32768
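To compare several block sizes, the same command can be repeated once per size. The sketch below only prints the command lines so they can be reviewed before running; `block_size_sweep` is a hypothetical helper, the flags are the ones shown above, and it assumes the file path contains no spaces:

```shell
# block_size_sweep <sst_file> <block_size>...
# Prints one recompress command per block size (dry run; nothing is executed).
# Hypothetical helper for illustration; assumes a path without spaces.
block_size_sweep() (
  file=$1
  shift
  for bs in "$@"; do
    printf 'sst_dump --file=%s --command=recompress --compression_types=kZSTD --block_size=%s\n' \
      "$file" "$bs"
  done
)

block_size_sweep /path/to/000123.sst 4096 16384 32768
```

Piping the output to `sh` would run the commands one after another.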
Compression Levels
Test a range of compression levels:
sst_dump --file=/path/to/000123.sst \
--command=recompress \
--compression_types=kZSTD \
--compression_level_from=1 \
--compression_level_to=9
Index Compression
Enable compression for index blocks:
sst_dump --file=/path/to/000123.sst \
--command=recompress \
--compression_types=kZSTD \
--enable_index_compression=1
Dictionary Compression
Configure dictionary-based compression:
sst_dump --file=/path/to/000123.sst \
--command=recompress \
--compression_types=kZSTD \
--compression_max_dict_bytes=32768 \
--compression_zstd_max_train_bytes=1048576
Advanced Options
Verify Checksum
Enable checksum verification during scans:
sst_dump --file=/path/to/000123.sst \
--command=scan \
--verify_checksum
Readahead Size
Optimize I/O with readahead:
sst_dump --file=/path/to/000123.sst \
--command=scan \
--readahead_size=4194304 # 4MB
Environment and Filesystem URIs
Use custom environments or filesystems:
sst_dump --file=/path/to/000123.sst \
--env_uri=hdfs://namenode:8020 \
--command=scan
Or:
sst_dump --file=/path/to/000123.sst \
--fs_uri=s3://bucket/prefix \
--command=scan
--env_uri and --fs_uri are mutually exclusive.
Internal Key Parsing
Parse internal key format on command line:
sst_dump --parse_internal_key=0x757365725f6b65790123010000000000
Outputs:
key=user_key @ 291: kTypeValue
This is useful for understanding RocksDB’s internal key encoding (user key + sequence number + type).
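As a rough illustration of that layout, the sketch below assumes the key ends with an 8-byte trailer holding (sequence number << 8) | type as a little-endian 64-bit integer, which is how RocksDB packs internal keys. The helper names `pack_trailer` and `unpack_internal_key` are hypothetical and not part of sst_dump:

```shell
# pack_trailer <sequence_number> <type>
# Prints the 8-byte trailer as 16 hex digits, least significant byte first.
pack_trailer() (
  packed=$(( ($1 << 8) | $2 ))
  out=""
  for i in 0 1 2 3 4 5 6 7; do
    out="$out$(printf '%02x' $(( (packed >> (8 * i)) & 0xff )))"
  done
  printf '%s\n' "$out"
)

# unpack_internal_key <hex_internal_key>
# Prints "<user_key_hex> <sequence_number> <type>".
# Assumes a non-empty user key followed by the 8-byte trailer.
unpack_internal_key() (
  hex=${1#0x}
  len=${#hex}
  user=$(printf '%s' "$hex" | cut -c1-$((len - 16)))
  trailer=$(printf '%s' "$hex" | cut -c$((len - 15))-)
  # Reverse the byte order so the trailer reads as an ordinary hex number.
  be=""
  for i in 15 13 11 9 7 5 3 1; do
    be="$be$(printf '%s' "$trailer" | cut -c"$i"-$((i + 1)))"
  done
  packed=$(( 0x$be ))
  printf '%s %d %d\n' "$user" $(( packed >> 8 )) $(( packed & 0xff ))
)

pack_trailer 291 1
unpack_internal_key 0x757365725f6b65790123010000000000
```

Here 757365725f6b6579 is the hex form of user_key, 291 is the sequence number, and type 1 corresponds to kTypeValue.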
Practical Examples
Diagnose Corruption
Identify corrupted file
sst_dump --file=/db/000123.sst --command=verify
Locate bad block
sst_dump --file=/db/000123.sst \
--command=check \
--verify_checksum
Extract good data
sst_dump --file=/db/000123.sst \
--command=scan \
--to=last_good_key > recovered_data.txt
Optimize Compression
Test all compression types
sst_dump --file=/db/000123.sst \
--command=recompress \
--compression_types=kSnappyCompression,kZSTD,kLZ4Compression
Fine-tune best algorithm
sst_dump --file=/db/000123.sst \
--command=recompress \
--compression_types=kZSTD \
--compression_level_from=1 \
--compression_level_to=9
Test with dictionary
sst_dump --file=/db/000123.sst \
--command=recompress \
--compression_types=kZSTD \
--compression_level=3 \
--compression_max_dict_bytes=65536
Export Key Range
Extract specific key range from SST file:
sst_dump --file=/db/000123.sst \
--command=scan \
--from=start_key \
--to=end_key \
--output_hex > extracted_range.txt
Inspect File Properties
sst_dump --file=/db/000123.sst \
--command=check \
--show_properties
Analyze All Files in Directory
for file in /db/*.sst; do
echo "Analyzing $file"
sst_dump --file="$file" \
--command=check \
--show_properties
done
Understanding Output
'key1' @ 123 : 1 => 'value1'
'key2' @ 124 : 1 => 'value2'
Format: key @ sequence_number : type => value
Value types:
1 = kTypeValue (regular put)
0 = kTypeDeletion (delete)
2 = kTypeMerge (merge)
7 = kTypeSingleDeletion (single delete)
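Saved scan output can be summarized by value type with a short awk filter. This is a sketch that assumes lines follow the `key @ sequence_number : type => value` shape shown above; real output may differ between RocksDB versions, and `count_types` is a hypothetical helper that names only the three most common type codes:

```shell
# count_types
# Reads scan output on stdin and prints "<type_code> <type_name> <count>" per type.
# Hypothetical helper for illustration; codes other than 0, 1, 2 are shown as "other".
count_types() {
  awk '
    BEGIN { name[0] = "kTypeDeletion"; name[1] = "kTypeValue"; name[2] = "kTypeMerge" }
    # Match the " @ <seq> : <type> => " part of an entry line.
    match($0, / @ [0-9]+ : [0-9]+ => /) {
      split(substr($0, RSTART, RLENGTH), f, " ")
      count[f[4]]++
    }
    END { for (t in count) print t, (t in name ? name[t] : "other"), count[t] }
  ' | sort -n
}

printf '%s\n' "'key1' @ 123 : 1 => 'value1'" "'key2' @ 124 : 1 => 'value2'" | count_types
```

A high share of deletion entries in a file can be a hint that compaction has not yet dropped tombstones.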
Properties Output
Example output from --show_properties:
Table Properties:
data size: 45234567
index size: 123456
filter size: 98765
raw key size: 32145678
raw value size: 98765432
num entries: 1000000
num data blocks: 4523
compression: Snappy
compression_options: window_bits=-14; level=32767
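One quick use of these numbers is a rough on-disk ratio: data size divided by the sum of raw key size and raw value size. The sketch below assumes the `name: value` layout of the example above; `data_ratio` is a hypothetical helper, and the result also reflects block encoding overhead and key delta encoding, not compression alone:

```shell
# data_ratio
# Reads properties text on stdin and prints data size as a percentage
# of raw key size + raw value size (one decimal).
# Hypothetical helper for illustration; assumes "name: value" lines.
data_ratio() {
  awk -F': *' '
    $1 ~ /data size$/      { data = $2 }
    $1 ~ /raw key size$/   { rawk = $2 }
    $1 ~ /raw value size$/ { rawv = $2 }
    END {
      if (rawk + rawv > 0) printf "%.1f\n", 100 * data / (rawk + rawv)
    }
  '
}

printf '%s\n' 'data size: 45234567' 'raw key size: 32145678' 'raw value size: 98765432' | data_ratio
```

With the example values, the data blocks occupy roughly a third of the raw key and value bytes.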
Common Options Reference
| Option | Description |
|---|---|
| --file=<path> | SST file or directory path |
| --command=<cmd> | Operation: check, scan, raw, verify, recompress, identify |
| --output_hex | Display keys and values in hex format |
| --input_key_hex | Interpret --from/--to as hex |
| --from=<key> | Start scanning from this key |
| --to=<key> | Stop scanning at this key |
| --prefix=<prefix> | Scan keys with this prefix |
| --read_num=<n> | Maximum entries to read |
| --verify_checksum | Verify checksums during scan |
| --show_properties | Display table properties |
| --decode_blob_index | Decode blob index entries |
| --block_size=<size> | Block size for recompression |
| --compression_types=<list> | Comma-separated compression types |
| --compression_level=<n> | Compression level |
| --list_meta_blocks | List all metadata blocks |
For large files, use --read_num to limit output:
sst_dump --file=/db/large.sst --command=scan --read_num=1000
Use verify instead of check for faster corruption detection without decoding overhead.
Source Reference
SST Dump Tool implementation:
include/rocksdb/sst_dump_tool.h:11-14 - Public API (simple Run interface)
tools/sst_dump_tool.cc:167-400 - Command-line parsing and tool implementation
table/sst_file_dumper.h and table/sst_file_dumper.cc - Core SST file reading logic
Build command:
Getting Help
Display help message with all options:
The help output includes supported compression types for your build.