Benchmark Results
Desktop Performance (Core i7-9700K @ 4.9GHz)
Tested on Ubuntu 24.04 using the Silesia compression corpus:

| Compressor | Ratio | Compression | Decompression |
|---|---|---|---|
| zstd 1.5.7 -1 | 2.896 | 510 MB/s | 1550 MB/s |
| brotli 1.1.0 -1 | 2.883 | 290 MB/s | 425 MB/s |
| zlib 1.3.1 -1 | 2.743 | 105 MB/s | 390 MB/s |
| zstd 1.5.7 --fast=1 | 2.439 | 545 MB/s | 1850 MB/s |
| quicklz 1.5.0 -1 | 2.238 | 520 MB/s | 750 MB/s |
| zstd 1.5.7 --fast=4 | 2.146 | 665 MB/s | 2050 MB/s |
| lzo1x 2.10 -1 | 2.106 | 650 MB/s | 780 MB/s |
| lz4 1.10.0 | 2.101 | 675 MB/s | 3850 MB/s |
| snappy 1.2.1 | 2.089 | 520 MB/s | 1500 MB/s |
| lzf 3.6 -1 | 2.077 | 410 MB/s | 820 MB/s |
README.md:42-53
Zstandard at level 1 provides a better compression ratio than zlib at level 1, while compressing nearly 5x faster and decompressing about 4x faster.
Server Performance (Core i7-6700K @ 4.0GHz)
Compression Speed vs Ratio characteristics:
- Low levels (1-5): 100-500 MB/s compression
- Medium levels (6-12): 20-100 MB/s compression
- High levels (13-19): 5-20 MB/s compression
- Ultra levels (20-22): 1-5 MB/s compression
README.md:67-79
Decompression Speed Consistency
Key characteristic: Decompression speed is preserved and remains roughly the same at all settings, a property shared by most LZ compression algorithms.
README.md:64-65
Memory Usage
Compression Memory by Level
Memory requirements increase with compression level.
examples/streaming_memory_usage.c:126-132
Decompression Memory
Decompression memory remains relatively constant: it depends on the window size used during compression, not on the compression level.
examples/streaming_memory_usage.c:126
Buffer Size Recommendations
For optimal performance with the streaming API, use the buffer sizes the library recommends.
lib/zstd.h:842-843,950-951
Multithreading Performance
Enabling Multithreading
nbThreads:
- 0: Single-threaded mode (default)
- 1+: Specific number of threads
- 0 with env: Detected CPU core count
lib/zstd.h:471-479
Multithreading Trade-offs
Benefits:
- Improved compression speed on multi-core systems
- Scales well with available cores

Costs:
- Increased memory usage
- Slight reduction in compression ratio
Job Size Control
lib/zstd.h:480-484
Long Distance Matching
Long distance matching mode improves compression ratio for files with matches at large distances (up to 128 MB).
Enabling Long Distance Mode
programs/README.md:199
Performance Impact
Example with tar of multiple clang versions:

| Method | Ratio | Compression | Decompression |
|---|---|---|---|
| zstd -1 | 5.065 | 284.8 MB/s | 759.3 MB/s |
| zstd -5 | 5.826 | 124.9 MB/s | 674.0 MB/s |
| zstd -10 | 6.504 | 29.5 MB/s | 771.3 MB/s |
| zstd -1 --long | 17.426 | 220.6 MB/s | 1638.4 MB/s |
| zstd -5 --long | 19.661 | 165.5 MB/s | 1530.6 MB/s |
| zstd -10 --long | 21.949 | 75.6 MB/s | 1632.6 MB/s |
programs/README.md:303-310
Memory Requirements
Enabling long distance mode:
- Sets window size to 128 MB
- Increases memory usage for both compressor and decompressor
- Default window log is 27 (128 MB)
programs/README.md:286-287
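The relation between window log and window size is a power of two: a window log of 27 means 2^27 bytes, which is the 128 MB quoted above (strictly, 128 MiB). A small worked example, with hypothetical helper names:

```c
#include <stdint.h>

/* Window size in bytes for a given window log: 2^window_log.
   Valid for window_log values below 64. */
static uint64_t window_size_bytes(unsigned window_log)
{
    return (uint64_t)1 << window_log;
}

/* The same size in MiB (2^20 bytes). Rounds down to 0 for logs below 20. */
static uint64_t window_size_mib(unsigned window_log)
{
    return window_size_bytes(window_log) >> 20;
}
```

With these helpers, `window_size_bytes(27)` is 134217728 and `window_size_mib(27)` is 128.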
Dictionary Compression Performance
For small data (e.g., 1 KB records), dictionary training provides:
- Better compression ratios: Dramatically improved on small data
- Faster compression: Compared to no dictionary
- Faster decompression: Compared to no dictionary
README.md:100
Dictionary gains are most effective in the first few KB of data. After that, the algorithm uses previously decoded content.
Tuning for Performance
Compression Speed Priority
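For maximum compression speed, use the lowest levels or the --fast modes listed in the desktop benchmark table.
Compression Ratio Priority
For maximum compression ratio, use the high (13-19) or ultra (20-22) levels, at the cost of compression speed.
Balanced Performance
For most use cases, the default level (3) is a reasonable starting point.
Advanced Performance Parameters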
Window Log
Controls maximum back-reference distance:
- Larger values: More compression, more memory
- Range: 10 to 31 (64-bit) or 30 (32-bit)
lib/zstd.h:368-375
Hash Log
Controls probe table size:
- Larger values: Better compression for fast strategies, more memory
- Range: 6 to 30
lib/zstd.h:376-381
Strategy Selection
- ZSTD_fast: Fastest
- ZSTD_dfast: Fast with better ratio
- ZSTD_greedy: Balanced
- ZSTD_lazy, ZSTD_lazy2: Good ratio
- ZSTD_btlazy2: Better ratio
- ZSTD_btopt: Strong compression
- ZSTD_btultra, ZSTD_btultra2: Maximum compression
lib/zstd.h:336-347
Benchmarking
The CLI includes a built-in benchmark mode (option -b#, where # is the compression level).
programs/README.md:250-252
Benchmark measures:
- Compression ratio
- Compression speed (MB/s)
- Decompression speed (MB/s)
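For reference, the three figures can be derived from sizes and timings as sketched below. This assumes the usual conventions: ratio is original size divided by compressed size, speeds are measured against the uncompressed size, and 1 MB is taken as 1,000,000 bytes. The exact conventions of the CLI are not stated in this section, and the helper names are hypothetical.

```c
/* Compression ratio: original size divided by compressed size. */
static double compression_ratio(double original_bytes, double compressed_bytes)
{
    return original_bytes / compressed_bytes;
}

/* Throughput in MB/s over the uncompressed data,
   assuming 1 MB = 1,000,000 bytes. */
static double throughput_mb_per_s(double uncompressed_bytes, double seconds)
{
    return uncompressed_bytes / seconds / 1e6;
}
```

For example, 500,000,000 bytes processed in one second gives 500 MB/s, and 1000 bytes compressed to 250 bytes gives a ratio of 4.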
Platform-Specific Optimizations
BMI2 Instructions
Zstandard can leverage BMI2 CPU instructions for better performance:
- Automatic detection: Enabled by default on supported platforms
- Build-time: STATIC_BMI2=1 or DYNAMIC_BMI2=1
- Performance gain: Especially on decoder side
lib/README.md:167-183
SIMD Optimizations
The library includes SIMD optimizations for:
- Huffman decoding
- Sequence processing
- Memory operations