Sparklytics is designed to handle high traffic with minimal resources. This guide covers optimization strategies for production deployments.

Performance Benchmarks

Benchmarks were measured on Apple Silicon (macOS) with release builds and 100k–1M realistic events. Full methodology: docs/perf-baseline.md

Self-Hosted (DuckDB)

Metric                             Value
Peak ingest throughput             ~26,000 req/s (single event)
Batch ingestion                    ~74,800 events/s (batch of 10)
Ingestion p99 latency (800 req/s)  1.14 ms
Memory (idle)                      ~29 MB
Memory (under load)                ~64 MB
Storage per 1M events              ~278 MB
Binary size (linux-amd64 musl)     ~15 MB

Scaling: 100k → 1M Events

Dimension            DuckDB Performance
Query degradation    3.5–5x slower per 10x data
Ingest degradation   Drops 59% (26k → 11k req/s)
Memory (query peak)  407 MB → 3.5 GB
Storage efficiency   278 MB per 1M events
For deployments expecting >10M events/month or >500 concurrent dashboard users, consider the cloud version with ClickHouse for 10–239x faster queries.

DuckDB Memory Configuration

The SPARKLYTICS_DUCKDB_MEMORY environment variable controls query memory allocation.

Default Configuration

environment:
  - SPARKLYTICS_DUCKDB_MEMORY=1GB
This is safe for VPS instances with 2–4 GB total RAM.
Server RAM  Recommended Setting  Use Case
2 GB        512MB                Small sites, <100k events/month
4 GB        1GB (default)        Medium sites, <1M events/month
8 GB        2GB                  Growing sites, 1–5M events/month
16 GB       4GB                  High traffic, 5–10M events/month
32 GB+      8GB                  Very high traffic, >10M events/month
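
For scripted provisioning, the table above can be expressed as a small lookup helper. This is a sketch: the thresholds simply mirror the recommendations, and the `recommend_duckdb_memory` function name is ours, not part of Sparklytics.

```shell
#!/bin/sh
# Map total server RAM (in GB) to the recommended SPARKLYTICS_DUCKDB_MEMORY
# value from the table above. The thresholds mirror the table, nothing more.
recommend_duckdb_memory() {
  ram_gb=$1
  if   [ "$ram_gb" -lt 4 ];  then echo "512MB"
  elif [ "$ram_gb" -lt 8 ];  then echo "1GB"
  elif [ "$ram_gb" -lt 16 ]; then echo "2GB"
  elif [ "$ram_gb" -lt 32 ]; then echo "4GB"
  else                            echo "8GB"
  fi
}

recommend_duckdb_memory 8   # prints 2GB
```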

Configuration

services:
  sparklytics:
    environment:
      - SPARKLYTICS_DUCKDB_MEMORY=4GB
Do not set this higher than 50% of available system RAM. DuckDB also uses memory outside this limit for internal operations.
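
The 50% rule can be sanity-checked with a few lines of arithmetic. A minimal sketch; the `check_duckdb_memory` helper name and its MB-based interface are ours, for illustration only.

```shell
#!/bin/sh
# Warn when a proposed DuckDB memory limit exceeds 50% of system RAM.
# Usage: check_duckdb_memory <limit_in_MB> <total_ram_in_MB>
check_duckdb_memory() {
  limit_mb=$1
  total_mb=$2
  if [ "$limit_mb" -gt $((total_mb / 2)) ]; then
    echo "WARN: ${limit_mb} MB exceeds 50% of ${total_mb} MB system RAM"
    return 1
  fi
  echo "OK: ${limit_mb} MB is within 50% of ${total_mb} MB system RAM"
}

check_duckdb_memory 4096 16384   # 4GB limit on a 16GB server: OK
```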

Data Retention

Longer retention periods increase database size and slow down queries.

Configure Retention

environment:
  - SPARKLYTICS_RETENTION_DAYS=365  # Default: 1 year
Recommended settings:
Traffic Level             Recommended Retention
Low (<100k events/month)  365 days (default)
Medium (100k–1M/month)    180 days
High (>1M/month)          90 days
Very high (>10M/month)    30–60 days
Sparklytics automatically deletes events older than the retention period. No manual cleanup required.

Storage Optimization

Disk Space Requirements

  • 278 MB per 1 million events (DuckDB format)
  • Includes indexes and compressed storage
  • Linear scaling up to ~10M events
Example calculations:
Events/Month  Storage/Month  Storage/Year (365 days)
100k          28 MB          336 MB
1M            278 MB         3.3 GB
10M           2.8 GB         33 GB
50M           14 GB          168 GB
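
The same arithmetic can be scripted for other traffic levels. A rough sketch based only on the ~278 MB per 1M events figure above; the `estimate_storage_mb` helper is ours, and real usage will vary with event payloads.

```shell
#!/bin/sh
# Estimate on-disk size in MB from monthly event volume and retention days,
# using the ~278 MB per 1M events figure. Approximation only.
estimate_storage_mb() {
  events_per_month=$1
  retention_days=$2
  awk -v e="$events_per_month" -v d="$retention_days" \
    'BEGIN { printf "%.0f\n", (e / 1000000) * 278 * (d / 30) }'
}

estimate_storage_mb 1000000 365   # prints 3382 (MB), i.e. ~3.3 GB
```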

Volume Performance

For Docker deployments, bind the data volume to fast local storage (ideally SSD):
volumes:
  sparklytics-data:
    driver: local
    driver_opts:
      type: none
      o: bind
      device: /mnt/ssd/sparklytics

CPU and Concurrency

Container Resource Limits

For predictable performance, set resource limits:
services:
  sparklytics:
    deploy:
      resources:
        limits:
          cpus: '2'
          memory: 4G
        reservations:
          cpus: '1'
          memory: 1G
Requests/Second  CPU Cores  Memory  Notes
<100             1          1 GB    Small sites
100–500          2          2 GB    Medium traffic
500–2000         4          4 GB    High traffic
>2000            8+         8 GB+   Consider cloud version

Rate Limiting

Sparklytics has built-in rate limiting on /api/collect: 60 requests/minute per IP.

Reverse Proxy Rate Limiting

Add an additional layer at your reverse proxy:
http {
    limit_req_zone $binary_remote_addr zone=analytics:10m rate=60r/m;
}

server {
    location /api/collect {
        limit_req zone=analytics burst=20 nodelay;
        proxy_pass http://sparklytics:3000;
    }
}

CORS Configuration

Restrict API access to specific origins for security and performance:
environment:
  - SPARKLYTICS_CORS_ORIGINS=https://yoursite.com,https://www.yoursite.com
If not set, Sparklytics allows all origins by default. Always configure this for production.

Query Performance Tips

Dashboard Loading

  1. Use appropriate date ranges
    • Last 7 days loads faster than last 365 days
    • Avoid unnecessarily long ranges
  2. DuckDB query memory impacts large aggregations
    • Increase SPARKLYTICS_DUCKDB_MEMORY for better performance on large datasets
    • Monitor memory usage: docker stats sparklytics
  3. Indexing is automatic
    • DuckDB automatically optimizes queries
    • No manual index management needed

API Query Optimization

When querying the API programmatically:
// Good: Specific date range
fetch('/api/websites/site_id/stats?start_date=2026-03-01&end_date=2026-03-07')

// Bad: Open-ended or very long ranges
fetch('/api/websites/site_id/stats?start_date=2020-01-01&end_date=2026-12-31')
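
From the shell, a rolling seven-day window can be computed instead of hardcoding dates. A sketch assuming GNU date (standard on Linux; BSD/macOS date uses different flags):

```shell
#!/bin/sh
# Build a last-7-days query range; GNU date assumed.
start=$(date -u -d '7 days ago' +%Y-%m-%d)
end=$(date -u +%Y-%m-%d)
echo "/api/websites/site_id/stats?start_date=${start}&end_date=${end}"
```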

Monitoring

Health Check Endpoint

curl http://localhost:3000/health
Response:
{"status":"ok"}
Set up monitoring to alert if this returns non-200.
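
One way to script this is to also verify the response body, not just the status code. A hypothetical helper for monitoring scripts, matching the JSON shown above; in a real check the body would come from curl against /health.

```shell
#!/bin/sh
# Treat the service as healthy only when the /health body matches
# the expected JSON. Helper name is ours, for illustration.
is_healthy() {
  printf '%s' "$1" | grep -q '"status":"ok"'
}

# Real usage would be:
#   body=$(curl -fsS http://localhost:3000/health)
#   is_healthy "$body" || alert
is_healthy '{"status":"ok"}' && echo "healthy"
```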

Docker Stats

Monitor resource usage:
docker stats sparklytics
Example output:
NAME           CPU %   MEM USAGE / LIMIT   MEM %   NET I/O
sparklytics    5.2%    156MiB / 4GiB       3.8%    1.2MB / 890KB

Log Monitoring

Sparklytics logs to stdout. View with:
docker logs -f sparklytics
Look for:
  • ERROR level logs (indicates issues)
  • Request latencies (should be <10ms for most requests)
  • Rate limit rejections (429 responses)
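
A quick triage over captured logs can count these signals. A sketch only: the log lines below are illustrative, so adjust the patterns to your actual log format.

```shell
#!/bin/sh
# Summarize a log stream: count ERROR lines and 429 rejections.
count_problems() {
  awk '/ERROR/ { err++ } / 429 / { rl++ }
       END { printf "errors=%d rate_limited=%d\n", err, rl }'
}

# Sample input; real usage: docker logs sparklytics 2>&1 | count_problems
printf '%s\n' \
  'INFO  GET  /api/collect 202 1.1ms' \
  'ERROR failed to open database file' \
  'WARN  POST /api/collect 429 0.2ms' | count_problems
# prints errors=1 rate_limited=1
```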

Horizontal Scaling

Sparklytics (self-hosted DuckDB version) is single-instance only. DuckDB is an embedded database and cannot be shared across multiple processes.

When to Scale

If you hit these limits:
  • >2000 requests/second sustained
  • >10M events/month
  • >500 concurrent dashboard users
  • Query latency >1 second consistently
Solution: Migrate to Sparklytics Cloud with ClickHouse:
  • 10–239x faster queries
  • Horizontal scaling
  • Distributed query execution
  • 5.8x better storage efficiency

Backup and Recovery

Automated Backups

The DuckDB database is a single file. Back it up regularly:
# Find the data directory
docker volume inspect sparklytics-data

# Backup script
#!/bin/bash
DATE=$(date +%Y%m%d)
BACKUP_DIR=/backups/sparklytics
mkdir -p $BACKUP_DIR

docker run --rm \
  -v sparklytics-data:/data \
  -v $BACKUP_DIR:/backup \
  alpine \
  tar czf /backup/sparklytics-$DATE.tar.gz -C /data .
Schedule with cron:
0 2 * * * /usr/local/bin/backup-sparklytics.sh
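
It is worth verifying that each archive is actually readable before relying on it. A sketch; the `verify_backup` helper is ours and simply checks that tar can list the archive.

```shell
#!/bin/sh
# Verify an archive is readable before trusting it as a backup.
verify_backup() {
  if tar tzf "$1" > /dev/null 2>&1; then
    echo "backup OK: $1"
  else
    echo "backup CORRUPT: $1"
    return 1
  fi
}

# e.g. check the newest archive:
#   verify_backup "$(ls -t /backups/sparklytics/sparklytics-*.tar.gz | head -n1)"
```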

Restore from Backup

# Stop Sparklytics
docker compose down

# Restore data
docker run --rm \
  -v sparklytics-data:/data \
  -v /backups/sparklytics:/backup \
  alpine \
  sh -c "rm -rf /data/* && tar xzf /backup/sparklytics-20260303.tar.gz -C /data"

# Start Sparklytics
docker compose up -d

OS-Level Optimizations

Linux Kernel Parameters

For high-traffic deployments, tune kernel parameters:
sudo sysctl -w net.core.somaxconn=4096
sudo sysctl -w net.ipv4.tcp_max_syn_backlog=4096
sudo sysctl -w net.ipv4.ip_local_port_range="1024 65535"
Make permanent in /etc/sysctl.conf:
net.core.somaxconn=4096
net.ipv4.tcp_max_syn_backlog=4096
net.ipv4.ip_local_port_range=1024 65535

File Descriptor Limits

Increase open file limits:
# For Docker container
services:
  sparklytics:
    ulimits:
      nofile:
        soft: 65536
        hard: 65536

CDN for Tracking Script

The tracking script (/s.js) is small (~5 KB gzipped) but frequently requested.

Caddy Caching

Add a cache rule for /s.js in your Caddyfile:
analytics.example.com {
    @script {
        path /s.js
    }
    handle @script {
        header Cache-Control "public, max-age=3600"
        reverse_proxy sparklytics:3000
    }
    reverse_proxy sparklytics:3000
}
The tracking script rarely changes. Safe to cache for 1–24 hours.

Production Checklist

Before going live with high traffic:
  • Set SPARKLYTICS_DUCKDB_MEMORY based on server size
  • Configure SPARKLYTICS_RETENTION_DAYS for your traffic level
  • Enable HTTPS with valid certificate
  • Set SPARKLYTICS_CORS_ORIGINS explicitly
  • Configure resource limits in docker-compose
  • Set up automated backups
  • Configure health check monitoring
  • Add reverse proxy rate limiting
  • Use SSD storage for data volume
  • Review and optimize OS kernel parameters
  • Enable log rotation for Docker logs

Troubleshooting Performance Issues

Symptoms: Stats pages take >5 seconds to load
Solutions:
  • Increase SPARKLYTICS_DUCKDB_MEMORY
  • Reduce retention period
  • Use shorter date ranges
  • Check disk I/O (use SSD)

Symptoms: Container uses >80% of allocated memory
Solutions:
  • Large datasets require more memory
  • Increase container memory limit
  • Reduce SPARKLYTICS_RETENTION_DAYS
  • Consider cloud version for >10M events

Symptoms: 502/504 errors on /api/collect
Solutions:
  • Check CPU usage: docker stats
  • Increase CPU allocation
  • Verify disk is not full
  • Check network between proxy and container

Symptoms: Errors on startup or queries
Solutions:
  • Restore from latest backup
  • Check disk health
  • Ensure clean shutdowns (avoid kill -9)
  • Use restart: unless-stopped in compose file

Next Steps

Docker Deployment

Complete Docker setup guide

Reverse Proxy

Caddy, Nginx, and Traefik configs

HTTPS Setup

SSL/TLS certificate management

API Reference

Query your analytics data programmatically
