## Benchmark results
Real-world performance testing shows substantial speedups:

| Package | Homebrew | ZB (cold) | ZB (warm) | Cold Speedup | Warm Speedup |
|---|---|---|---|---|---|
| Overall (top 100) | 452s | 226s | 59s | 2.0x | 7.6x |
| ffmpeg | 3034ms | 3481ms | 688ms | 0.9x | 4.4x |
| libsodium | 2353ms | 392ms | 130ms | 6.0x | 18.1x |
| sqlite | 2876ms | 625ms | 159ms | 4.6x | 18.1x |
| tesseract | 18950ms | 5536ms | 643ms | 3.4x | 29.5x |
- Cold cache: first installation with no cached bottles
- Warm cache: reinstallation with bottles already downloaded
### Key observations
- Overall: 2x faster on cold cache, 7.6x faster on warm cache for top 100 packages
- Small packages (libsodium, sqlite): 6-18x faster due to parallelism and reduced overhead
- Large packages (tesseract): Up to 29.5x faster on warm cache thanks to APFS clonefile
- Very large packages (ffmpeg): Cold cache slightly slower due to chunked download overhead, but 4.4x faster warm
## Optimization strategies
zerobrew employs multiple performance optimizations working in concert.

### 1. Parallel downloads
**Racing connections** for download reliability:

- 3 simultaneous connection attempts per download
- 200ms stagger between attempts
- First successful connection wins, others cancelled
- Handles slow/unreliable mirrors gracefully
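
The racing pattern described above can be sketched with std threads and a channel: each attempt starts 200ms after the previous one, and the first to report success wins. (A simplified illustration, not zerobrew's actual code; the real version opens HTTP streams and cancels the losers rather than merely ignoring them.)

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Simulated connection attempt; a hypothetical stand-in where each
/// "mirror" just sleeps for a different latency before succeeding.
fn attempt_connection(id: usize, latency_ms: u64) -> String {
    thread::sleep(Duration::from_millis(latency_ms));
    format!("connection-{id}")
}

/// Race up to 3 attempts, staggered 200ms apart; first result wins.
fn race_connections(latencies: &[u64]) -> Option<String> {
    let (tx, rx) = mpsc::channel();
    for (i, &latency) in latencies.iter().take(3).enumerate() {
        let tx = tx.clone();
        thread::spawn(move || {
            // Stagger: attempt i waits i * 200ms before starting.
            thread::sleep(Duration::from_millis(i as u64 * 200));
            // Sending fails if a winner was already picked; ignore that.
            let _ = tx.send(attempt_connection(i, latency));
        });
    }
    drop(tx);
    // The first message to arrive is the winner; losers are ignored here.
    rx.recv().ok()
}

fn main() {
    // Attempt 0 is slow (600ms); attempt 1 starts 200ms later but
    // finishes at ~250ms, so it wins the race.
    let winner = race_connections(&[600, 50, 400]);
    println!("{winner:?}"); // Some("connection-1")
}
```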

**Chunked downloads** for large files:

- Files over 10MB split into chunks (5-20MB each)
- Up to 6 parallel chunks per file
- HTTP range requests for resumable downloads
- Automatic retry on chunk failure (up to 3 attempts)

**Connection limits:**

- 20 maximum concurrent connections across all downloads
- Prevents overwhelming servers or local network
- Semaphore-based coordination across download tasks
See `zb_io/src/network/download/mod.rs` for the implementation.
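
The chunk-planning step can be sketched as below. The constants match the documented behavior (10MB split threshold, 5-20MB chunks, up to 6 in flight); the function and type names are illustrative, not the contents of that file.

```rust
/// Byte range of one chunk: [start, end).
#[derive(Debug, PartialEq)]
struct Chunk {
    start: u64,
    end: u64,
}

const MIN_CHUNK: u64 = 5 * 1024 * 1024; // 5MB minimum chunk size
const MAX_CHUNK: u64 = 20 * 1024 * 1024; // 20MB maximum chunk size
const SPLIT_THRESHOLD: u64 = 10 * 1024 * 1024; // files over 10MB are split
const MAX_PARALLEL: u64 = 6; // up to 6 chunks downloaded at once

fn plan_chunks(file_size: u64) -> Vec<Chunk> {
    if file_size <= SPLIT_THRESHOLD {
        return vec![Chunk { start: 0, end: file_size }];
    }
    // Aim for MAX_PARALLEL chunks, clamped to the 5-20MB range; very
    // large files produce more chunks, downloaded 6 at a time.
    let chunk_size = (file_size / MAX_PARALLEL).clamp(MIN_CHUNK, MAX_CHUNK);
    let mut chunks = Vec::new();
    let mut start = 0;
    while start < file_size {
        let end = (start + chunk_size).min(file_size);
        chunks.push(Chunk { start, end });
        start = end;
    }
    chunks
}

fn main() {
    // A 60MB file yields 6 chunks of 10MB each; each chunk becomes an
    // HTTP request with a `Range: bytes=start-(end-1)` header.
    let chunks = plan_chunks(60 * 1024 * 1024);
    println!("{} chunks", chunks.len()); // 6 chunks
}
```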
### 2. Content-addressable caching
**Blob cache** for downloaded bottles:

- Bottles stored by SHA-256 hash in `cache/blobs/`
- Automatic deduplication across package versions
- Persistent across zerobrew runs
- Warm cache enables instant reinstallation

**Store layer** for extracted bottles:

- Extracted once and reused by multiple packages
- Reference counting tracks usage
- Garbage collection removes unreferenced entries

**API response cache:**

- Formula metadata cached in SQLite
- ETags and Last-Modified headers for conditional requests
- Reduces API calls to Homebrew servers
- Stale-while-revalidate pattern for instant responses
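
The reference-counting and garbage-collection relationship can be sketched with an in-memory model (keys are SHA-256 hex digests; the real entries live on disk under `cache/blobs/`, and these type and method names are illustrative, not zerobrew's actual API):

```rust
use std::collections::HashMap;

/// Minimal in-memory model of reference-counted cache entries.
#[derive(Default)]
struct BlobCache {
    refcounts: HashMap<String, u32>, // digest -> packages using the blob
}

impl BlobCache {
    /// A package starts using this blob.
    fn retain(&mut self, digest: &str) {
        *self.refcounts.entry(digest.to_string()).or_insert(0) += 1;
    }

    /// A package stops using this blob (e.g. on uninstall).
    fn release(&mut self, digest: &str) {
        if let Some(n) = self.refcounts.get_mut(digest) {
            *n = n.saturating_sub(1);
        }
    }

    /// Garbage collection: drop entries no package references anymore,
    /// returning the removed digests.
    fn gc(&mut self) -> Vec<String> {
        let dead: Vec<String> = self
            .refcounts
            .iter()
            .filter(|(_, &n)| n == 0)
            .map(|(k, _)| k.clone())
            .collect();
        for k in &dead {
            // The real impl would also delete cache/blobs/<digest> here.
            self.refcounts.remove(k);
        }
        dead
    }
}

fn main() {
    let mut cache = BlobCache::default();
    cache.retain("abc123"); // two packages share one bottle
    cache.retain("abc123");
    cache.retain("def456");
    cache.release("def456");
    let removed = cache.gc();
    println!("removed: {removed:?}"); // only the unreferenced blob goes
}
```

Because two installed versions that share a bottle resolve to the same digest, deduplication falls out of the keying scheme rather than requiring explicit comparison.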
### 3. APFS clonefile (macOS)
Zero-copy materialization on macOS with APFS:

- Instant copying: the `clonefile` syscall creates a copy-on-write clone in microseconds
- Space-efficient: the clone shares disk blocks with the original until modified
- Large packages: FFmpeg (200MB+) materializes instantly instead of 2-3 seconds

Fallback chain for materialization:

1. APFS clonefile (instant, zero-copy)
2. Hardlinks (fast, shared inodes)
3. Regular copy (slow, last resort)
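
The fallback chain can be sketched portably with the standard library. On macOS the real implementation tries the `clonefile` syscall first (via FFI, omitted here since it is platform-specific); this sketch starts at the hardlink step:

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Portable sketch of the materialization fallback chain (step 1,
/// `clonefile`, omitted: it requires macOS-specific unsafe FFI).
fn materialize(src: &Path, dst: &Path) -> io::Result<()> {
    // Step 2: hardlink — instant, but requires the same filesystem and
    // shares the inode, so it is unsuitable for files patched in place.
    if fs::hard_link(src, dst).is_ok() {
        return Ok(());
    }
    // Step 3: regular copy — always works, slowest.
    fs::copy(src, dst).map(|_| ())
}

fn main() -> io::Result<()> {
    let dir = std::env::temp_dir();
    let src = dir.join("zb_demo_src.txt");
    let dst = dir.join("zb_demo_dst.txt");
    let _ = fs::remove_file(&dst); // hard_link fails if dst exists
    fs::write(&src, b"bottle contents")?;
    materialize(&src, &dst)?;
    assert_eq!(fs::read(&dst)?, b"bottle contents");
    println!("materialized");
    Ok(())
}
```

The ordering matters: each step is strictly cheaper than the next, so the common case (same APFS volume) pays only the cost of one syscall.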
### 4. Concurrent extraction and materialization
Parallel operations across multiple packages:

**Unpack concurrency:** 4 parallel extraction operations

- Decompression (gzip, xz, zstd) is CPU-bound
- Multiple cores utilized simultaneously
- Tuned to avoid thrashing on typical hardware

**Materialization:**

- Copying from store to cellar
- Binary patching for Homebrew placeholders
- Code signing (macOS) or permission setting
See `zb_core/src/context.rs` for the implementation.
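
Capping concurrency with a counting semaphore can be sketched with std primitives (the real code uses an async runtime's semaphore; this std-only version shows the same acquire/release discipline around each extraction):

```rust
use std::sync::{Arc, Condvar, Mutex};
use std::thread;

/// Minimal counting semaphore built from a mutex and condvar.
struct Semaphore {
    permits: Mutex<usize>,
    cvar: Condvar,
}

impl Semaphore {
    fn new(permits: usize) -> Self {
        Semaphore { permits: Mutex::new(permits), cvar: Condvar::new() }
    }

    fn acquire(&self) {
        let mut permits = self.permits.lock().unwrap();
        // Wait in a loop: condvar wakeups can be spurious.
        while *permits == 0 {
            permits = self.cvar.wait(permits).unwrap();
        }
        *permits -= 1;
    }

    fn release(&self) {
        *self.permits.lock().unwrap() += 1;
        self.cvar.notify_one();
    }
}

fn main() {
    let sem = Arc::new(Semaphore::new(4)); // at most 4 concurrent unpacks
    let done = Arc::new(Mutex::new(0));
    let handles: Vec<_> = (0..10)
        .map(|_| {
            let (sem, done) = (sem.clone(), done.clone());
            thread::spawn(move || {
                sem.acquire();
                // ... extract one bottle here ...
                *done.lock().unwrap() += 1;
                sem.release();
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    println!("extracted {}", *done.lock().unwrap()); // extracted 10
}
```

Ten extraction tasks are queued, but at most four hold a permit at any moment, which is what keeps decompression from thrashing all cores.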
### 5. Optimized binary patching
Bottles contain Homebrew path placeholders that must be rewritten.

**macOS (Mach-O binaries):**

- Scan binaries for placeholder strings
- Rewrite load commands and string tables in-place
- Ad-hoc code sign after modification
- Parallel processing across files in keg

**Linux (ELF binaries):**

- Similar placeholder scanning and rewriting
- No code signing required
- Faster overall due to simpler binary format
Binary patching is parallelized using Rayon, processing multiple files simultaneously on multi-core systems.
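
The core in-place rewrite can be sketched as a byte-level scan: the replacement must be no longer than the placeholder, and the leftover bytes are NUL-padded so string offsets inside the binary stay valid. (A sketch of the string-rewriting step only; real Mach-O patching also updates load commands and re-signs, and `/opt/zb` is a hypothetical install prefix.)

```rust
/// Rewrite every occurrence of `placeholder` inside a binary image in
/// place, NUL-padding the tail so C-string offsets remain valid.
/// Returns the number of occurrences patched.
fn patch_placeholder(image: &mut [u8], placeholder: &[u8], replacement: &[u8]) -> usize {
    assert!(
        replacement.len() <= placeholder.len(),
        "in-place patching cannot grow the string"
    );
    let mut patched = 0;
    let mut i = 0;
    while i + placeholder.len() <= image.len() {
        if &image[i..i + placeholder.len()] == placeholder {
            image[i..i + replacement.len()].copy_from_slice(replacement);
            // NUL-pad the remainder so the C string terminates correctly.
            for b in &mut image[i + replacement.len()..i + placeholder.len()] {
                *b = 0;
            }
            patched += 1;
            i += placeholder.len();
        } else {
            i += 1;
        }
    }
    patched
}

fn main() {
    // Homebrew bottles embed placeholders like @@HOMEBREW_PREFIX@@.
    let mut image = b"head @@HOMEBREW_PREFIX@@/lib tail".to_vec();
    let n = patch_placeholder(&mut image, b"@@HOMEBREW_PREFIX@@", b"/opt/zb");
    println!("patched {n} occurrence(s)"); // patched 1 occurrence(s)
}
```

Because each file's patch is independent, a pool (Rayon, in zerobrew's case) can run this over every file in a keg simultaneously.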
### 6. Database optimizations
SQLite tuned for performance:

**Transactions:** all installation operations wrapped in a single transaction

- Reduces fsync overhead
- Ensures atomicity for complex operations
- Much faster than individual statements

**Indexes:**

- Fast lookups for installed packages
- Efficient reference counting queries
- Quick conflict detection for symlinks

`rusqlite` with bundled SQLite:
- Consistent behavior across platforms
- Latest performance improvements
- No system dependency
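
The single-transaction pattern looks like this in SQL (hypothetical schema and values for illustration; zerobrew's actual tables will differ). Every statement between `BEGIN` and `COMMIT` shares one fsync:

```sql
-- One transaction per installation instead of autocommitting each
-- statement: a single fsync at COMMIT, and the whole install is atomic.
BEGIN;
INSERT INTO packages (name, version) VALUES ('libsodium', '1.0.20');
INSERT INTO files (package, path) VALUES ('libsodium', 'lib/libsodium.dylib');
INSERT INTO files (package, path) VALUES ('libsodium', 'include/sodium.h');
COMMIT;
```

If any statement fails, a `ROLLBACK` leaves the database as if the installation never started.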
### 7. HTTP/2 and connection reuse
Network stack optimizations:

**HTTP/2:** enabled by default via rustls

- Multiplexed streams over a single connection
- Header compression reduces bandwidth
- Server push capability (when supported)

**Connection pooling:** the `reqwest` client is reused
- TCP connection reuse across requests
- TLS session resumption
- Reduced latency for subsequent downloads

**Compression:**

- Automatic decompression of API responses
- Reduces bandwidth for formula metadata
### 8. Memory efficiency
Streaming and bounded memory usage:

**Streaming downloads:** chunks written directly to disk

- No buffering entire files in memory
- Constant memory usage regardless of file size
- SHA-256 computed incrementally during download

**Streaming extraction:**

- Tar entries processed as they're decompressed
- Memory usage proportional to largest file, not archive size
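
The stream-while-hashing pattern can be sketched with a fixed buffer: bytes flow from reader to writer through 64KB at a time, and the digest is updated incrementally, so memory use is constant regardless of file size. (Here a simple FNV-1a hash stands in for SHA-256, which the real code would compute with a proper hashing crate.)

```rust
use std::io::{self, Read, Write};

/// Copy from any reader to any writer through a fixed 64KB buffer while
/// hashing incrementally. Returns (bytes copied, digest).
fn stream_and_hash<R: Read, W: Write>(mut src: R, mut dst: W) -> io::Result<(u64, u64)> {
    let mut buf = [0u8; 64 * 1024];
    let mut hash: u64 = 0xcbf29ce484222325; // FNV-1a offset basis
    let mut total = 0u64;
    loop {
        let n = src.read(&mut buf)?;
        if n == 0 {
            break;
        }
        // Digest is updated as bytes arrive — no need to re-read the file.
        for &b in &buf[..n] {
            hash ^= b as u64;
            hash = hash.wrapping_mul(0x100000001b3); // FNV-1a prime
        }
        dst.write_all(&buf[..n])?; // each chunk goes straight to disk
        total += n as u64;
    }
    Ok((total, hash))
}

fn main() -> io::Result<()> {
    // 10MB streamed through the 64KB buffer; peak memory stays at 64KB.
    let data = vec![0u8; 10 * 1024 * 1024];
    let (total, _digest) = stream_and_hash(&data[..], io::sink())?;
    println!("streamed {total} bytes"); // streamed 10485760 bytes
    Ok(())
}
```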
## Running benchmarks
To measure performance yourself, capture:

- Cold cache performance (first install)
- Warm cache performance (reinstall)
- Per-package timing
- Overall speedup factors
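
A rough way to time a cold vs. warm install yourself, assuming `zb install`/`zb uninstall` subcommands (illustrative; the actual interface may differ, so check `zb --help`):

```shell
# Time a first install, then a reinstall with the bottle already cached.
if command -v zb >/dev/null 2>&1; then
  time zb install libsodium     # cold if the bottle is not yet cached
  zb uninstall libsodium
  time zb install libsodium     # warm: bottle already in the blob cache
else
  echo "zb not installed; skipping timing demo"
fi
```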
## Performance tuning
While zerobrew’s defaults are tuned for most systems, you can adjust concurrency:

**Download concurrency** (environment variable):

- Automatically tuned to the number of CPU cores
- Override by modifying `ConcurrencyLimits` in code
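
A hypothetical shape for those tuning knobs, scaling with the machine within the limits documented above (the actual `ConcurrencyLimits` in zerobrew's source may have different fields and defaults):

```rust
use std::thread;

/// Illustrative tuning knobs; not zerobrew's actual struct.
#[derive(Debug)]
struct ConcurrencyLimits {
    downloads: usize, // max concurrent connections
    unpacks: usize,   // parallel extraction operations
}

impl Default for ConcurrencyLimits {
    fn default() -> Self {
        // Scale with CPU count, capped at the documented limits
        // (20 connections, 4 unpacks).
        let cores = thread::available_parallelism().map(|n| n.get()).unwrap_or(4);
        ConcurrencyLimits {
            downloads: (cores * 2).min(20),
            unpacks: cores.min(4),
        }
    }
}

fn main() {
    println!("{:?}", ConcurrencyLimits::default());
}
```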
## Why zerobrew is faster
Summarizing the key advantages:

- Aggressive parallelism: downloads, extractions, and materializations happen concurrently
- Smart caching: Multiple cache layers (blobs, store, API) eliminate redundant work
- Zero-copy operations: APFS clonefile makes materialization nearly free on macOS
- Modern networking: HTTP/2, connection reuse, parallel chunks, racing connections
- Optimized I/O: Streaming operations, minimal memory usage, efficient SQLite transactions
- Native performance: Rust compiled code vs Ruby interpreter overhead
## Trade-offs
Some performance trade-offs to be aware of:

**Disk space:** content-addressable storage uses more space short-term

- Store layer keeps extracted bottles
- Run `zb gc` periodically to reclaim space
- Overall comparable to Homebrew after garbage collection

**Cold cache overhead:**

- Chunked download coordination has overhead
- Amortized across warm cache usage
- FFmpeg example: 15% slower cold, 4.4x faster warm

**Memory:**

- Each concurrent download/extraction needs memory
- Still bounded and reasonable (~100-200MB typical)
- Much less than naive “load everything into RAM” approach