Performance Benchmarks
Measured on Apple Silicon macOS, release builds, 100k–1M realistic events. Full methodology: docs/perf-baseline.md
Self-Hosted (DuckDB)
| Metric | Value |
|---|---|
| Peak ingest throughput | ~26,000 req/s (single event) |
| Batch ingestion | ~74,800 events/s (batch of 10) |
| Ingestion p99 latency (800 req/s) | 1.14 ms |
| Memory (idle) | ~29 MB |
| Memory (under load) | ~64 MB |
| Storage per 1M events | ~278 MB |
| Binary size (linux-amd64 musl) | ~15 MB |
Scaling: 100k → 1M Events
| Dimension | DuckDB Performance |
|---|---|
| Query degradation | 3.5–5x slower per 10x data |
| Ingest degradation | Drops 59% (26k→11k req/s) |
| Memory (query peak) | 407 MB → 3.5 GB |
| Storage efficiency | 278 MB per 1M events |
For deployments expecting >10M events/month or >500 concurrent dashboard users, consider the cloud version with ClickHouse for 10–239x faster queries.
DuckDB Memory Configuration
The SPARKLYTICS_DUCKDB_MEMORY environment variable controls query memory allocation.
Default Configuration
Recommended by Server Size
| Server RAM | Recommended Setting | Use Case |
|---|---|---|
| 2 GB | 512MB | Small sites, <100k events/month |
| 4 GB | 1GB (default) | Medium sites, <1M events/month |
| 8 GB | 2GB | Growing sites, 1–5M events/month |
| 16 GB | 4GB | High traffic, 5–10M events/month |
| 32 GB+ | 8GB | Very high traffic, >10M events/month |
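As a rough illustration, the table above can be turned into a small lookup. This is only a sketch: the helper name `recommended_duckdb_memory` is hypothetical, and the rule of picking the largest tier that does not exceed the server's RAM is an assumption for sizes that fall between rows.

```python
# Tiers copied from the table above: (server RAM in GB, suggested setting).
MEMORY_TIERS = [(2, "512MB"), (4, "1GB"), (8, "2GB"), (16, "4GB"), (32, "8GB")]


def recommended_duckdb_memory(server_ram_gb: float) -> str:
    """Return the setting of the largest tier that fits in server_ram_gb.

    Hypothetical helper: choosing the tier below for in-between sizes is an
    assumption, not something the table states.
    """
    if server_ram_gb < MEMORY_TIERS[0][0]:
        raise ValueError("the table does not cover servers under 2 GB")
    setting = MEMORY_TIERS[0][1]
    for tier_ram_gb, tier_setting in MEMORY_TIERS:
        if server_ram_gb >= tier_ram_gb:
            setting = tier_setting
    return setting


if __name__ == "__main__":
    for ram_gb in (2, 6, 16, 64):
        value = recommended_duckdb_memory(ram_gb)
        print(f"{ram_gb} GB RAM -> SPARKLYTICS_DUCKDB_MEMORY={value}")
```

Rounding down to the tier below keeps headroom for the operating system and for ingestion on servers whose RAM falls between two rows.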
Configuration
Data Retention
Longer retention periods increase database size and slow down queries.
Configure Retention
| Traffic Level | Recommended Retention |
|---|---|
| Low (<100k events/month) | 365 days (default) |
| Medium (100k–1M/month) | 180 days |
| High (>1M/month) | 90 days |
| Very high (>10M/month) | 30–60 days |
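The retention table can be sketched as a function of monthly event volume. The function name is hypothetical, and the treatment of exact boundary values (for example exactly 1M events/month counting as medium) is an assumption, since the table does not say which side a boundary belongs to.

```python
def recommended_retention_days(events_per_month: int) -> tuple[int, int]:
    """Return the (min, max) recommended retention in days from the table above.

    Hypothetical helper; the handling of exact boundary values is an assumption.
    """
    if events_per_month < 100_000:
        return (365, 365)   # Low traffic: the default
    if events_per_month <= 1_000_000:
        return (180, 180)   # Medium traffic
    if events_per_month <= 10_000_000:
        return (90, 90)     # High traffic
    return (30, 60)         # Very high traffic: a range, not a single value


if __name__ == "__main__":
    for volume in (50_000, 500_000, 5_000_000, 50_000_000):
        low, high = recommended_retention_days(volume)
        days = str(low) if low == high else f"{low}-{high}"
        print(f"{volume:>10,} events/month -> SPARKLYTICS_RETENTION_DAYS={days}")
```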
Storage Optimization
Disk Space Requirements
- 278 MB per 1 million events (DuckDB format)
- Includes indexes and compressed storage
- Linear scaling up to ~10M events
| Events/Month | Storage/Month | Storage/Year (365 days) |
|---|---|---|
| 100k | 28 MB | 336 MB |
| 1M | 278 MB | 3.3 GB |
| 10M | 2.8 GB | 33 GB |
| 50M | 14 GB | 168 GB |
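The figures in the table follow from the 278 MB per 1M events measurement. A minimal sketch of the arithmetic is below; the function names are hypothetical, and applying the linear rate beyond ~10M events is an extrapolation rather than a measured result. The yearly values it produces can differ slightly from the table, which appears to round the monthly figure before multiplying by 12.

```python
MB_PER_MILLION_EVENTS = 278  # measured DuckDB storage figure from this page


def storage_mb(events_per_month: int, months: int = 1) -> float:
    """Estimate on-disk size in MB, assuming storage grows linearly with events."""
    return events_per_month / 1_000_000 * MB_PER_MILLION_EVENTS * months


def format_size(mb: float) -> str:
    """Render a size in MB, switching to GB (1 GB = 1000 MB) from 1000 MB up."""
    return f"{mb / 1000:.1f} GB" if mb >= 1000 else f"{mb:.0f} MB"


if __name__ == "__main__":
    for volume in (100_000, 1_000_000, 10_000_000, 50_000_000):
        monthly = format_size(storage_mb(volume))
        yearly = format_size(storage_mb(volume, months=12))
        print(f"{volume:>10,} events/month: {monthly} per month, {yearly} per year")
```

Passing the retention window as `months` (for example `months=3` for 90 days) gives a rough steady-state database size once old events start being dropped.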
Volume Performance
For Docker deployments, use appropriate volume drivers:
- Local SSD (Best)
- Named Volume (Default)
- NFS (Network)
CPU and Concurrency
Container Resource Limits
For predictable performance, set resource limits:
Recommended by Traffic Level
| Requests/Second | CPU Cores | Memory | Notes |
|---|---|---|---|
| <100 | 1 | 1 GB | Small sites |
| 100–500 | 2 | 2 GB | Medium traffic |
| 500–2000 | 4 | 4 GB | High traffic |
| >2000 | 8+ | 8 GB+ | Consider cloud version |
Rate Limiting
Sparklytics has built-in rate limiting on /api/collect: 60 requests/minute per IP.
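To make the 60 requests/minute per IP limit concrete, here is a minimal sliding-window limiter. It is an illustration only: this page does not say which algorithm Sparklytics uses internally, so the sliding window, the class name `PerIpRateLimiter`, and its parameters are all assumptions.

```python
import time
from collections import defaultdict, deque


class PerIpRateLimiter:
    """Sliding-window limiter: at most `limit` requests per `window_seconds` per IP.

    Illustrative sketch only, not Sparklytics' actual implementation.
    """

    def __init__(self, limit=60, window_seconds=60.0, clock=time.monotonic):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock               # injectable so the window can be tested
        self.hits = defaultdict(deque)   # ip -> timestamps of accepted requests

    def allow(self, ip: str) -> bool:
        """Return True if the request is accepted, False if it should get a 429."""
        now = self.clock()
        timestamps = self.hits[ip]
        # Drop timestamps that have aged out of the window.
        while timestamps and now - timestamps[0] >= self.window_seconds:
            timestamps.popleft()
        if len(timestamps) >= self.limit:
            return False
        timestamps.append(now)
        return True


if __name__ == "__main__":
    limiter = PerIpRateLimiter()
    accepted = sum(limiter.allow("203.0.113.7") for _ in range(100))
    print(f"accepted {accepted} of 100 back-to-back requests from one IP")
```

Because the limit is per IP, many visitors behind a single NAT or proxy address share one budget, which is worth keeping in mind when reading 429 counts.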
Reverse Proxy Rate Limiting
Add an additional layer at your reverse proxy:
- Nginx
- Caddy
CORS Configuration
Restrict API access to specific origins for security and performance:
Query Performance Tips
Dashboard Loading
- Use appropriate date ranges
  - Last 7 days loads faster than last 365 days
  - Avoid unnecessarily long ranges
- DuckDB query memory impacts large aggregations
  - Increase SPARKLYTICS_DUCKDB_MEMORY for better performance on large datasets
  - Monitor memory usage: docker stats sparklytics
- Indexing is automatic
  - DuckDB automatically optimizes queries
  - No manual index management needed
API Query Optimization
When querying the API programmatically:
Monitoring
Health Check Endpoint
Docker Stats
Monitor resource usage:
Log Monitoring
Sparklytics logs to stdout. View with:
Watch for:
- ERROR level logs (indicates issues)
- Request latencies (should be <10ms for most requests)
- Rate limit rejections (429 responses)
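The log signals listed above can be counted with a short script. This is a sketch under loud assumptions: the log line format is not documented on this page, so the code only checks whether a line contains the token ERROR or the number 429, and the function name `summarize_logs` is hypothetical.

```python
import re

# Assumption: error lines contain the word ERROR and rate-limit rejections
# contain the status code 429 as a standalone number. Adjust to the real format.
ERROR_PATTERN = re.compile(r"\bERROR\b")
RATE_LIMIT_PATTERN = re.compile(r"\b429\b")


def summarize_logs(lines):
    """Count total lines, ERROR lines, and lines mentioning a 429 status."""
    counts = {"total": 0, "errors": 0, "rate_limited": 0}
    for line in lines:
        counts["total"] += 1
        if ERROR_PATTERN.search(line):
            counts["errors"] += 1
        if RATE_LIMIT_PATTERN.search(line):
            counts["rate_limited"] += 1
    return counts
```

Any iterable of lines works as input, so a script that calls `summarize_logs(sys.stdin)` can read container logs through a pipe.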
Horizontal Scaling
Sparklytics (self-hosted DuckDB version) is single-instance only. DuckDB is an embedded database and cannot be shared across multiple processes.
When to Scale
If you hit any of these limits, consider the cloud version with ClickHouse:
- >2000 requests/second sustained
- >10M events/month
- >500 concurrent dashboard users
- Query latency >1 second consistently
The cloud version offers:
- 10–239x faster queries
- Horizontal scaling
- Distributed query execution
- 5.8x better storage efficiency
Backup and Recovery
Automated Backups
The DuckDB database is a single file. Back it up regularly:
Restore from Backup
OS-Level Optimizations
Linux Kernel Parameters
For high-traffic deployments, tune kernel parameters:
/etc/sysctl.conf:
File Descriptor Limits
Increase open file limits:
CDN for Tracking Script
The tracking script (/s.js) is small (~5 KB gzipped) but frequently requested.
Cloudflare Caching
Add a cache rule for /s.js:
Production Checklist
Before going live with high traffic:
- Set SPARKLYTICS_DUCKDB_MEMORY based on server size
- Configure SPARKLYTICS_RETENTION_DAYS for your traffic level
- Enable HTTPS with valid certificate
- Set SPARKLYTICS_CORS_ORIGINS explicitly
- Configure resource limits in docker-compose
- Set up automated backups
- Configure health check monitoring
- Add reverse proxy rate limiting
- Use SSD storage for data volume
- Review and optimize OS kernel parameters
- Enable log rotation for Docker logs
Troubleshooting Performance Issues
Slow dashboard queries
Symptoms: Stats pages take >5 seconds to load
Solutions:
- Increase SPARKLYTICS_DUCKDB_MEMORY
- Reduce retention period
- Use shorter date ranges
- Check disk I/O (use SSD)
High memory usage
Symptoms: Container uses >80% of allocated memory
Solutions:
- Large datasets require more memory
- Increase container memory limit
- Reduce SPARKLYTICS_RETENTION_DAYS
- Consider cloud version for >10M events
Event collection timeouts
Symptoms: 502/504 errors on /api/collect
Solutions:
- Check CPU usage: docker stats
- Increase CPU allocation
- Verify disk is not full
- Check network between proxy and container
Database file corruption
Symptoms: Errors on startup or queries
Solutions:
- Restore from latest backup
- Check disk health
- Ensure clean shutdowns (avoid kill -9)
- Use restart: unless-stopped in compose file
Next Steps
Docker Deployment
Complete Docker setup guide
Reverse Proxy
Caddy, Nginx, and Traefik configs
HTTPS Setup
SSL/TLS certificate management
API Reference
Query your analytics data programmatically