Fluxer is designed to scale from a single server to a distributed, multi-region deployment handling tens of thousands of concurrent users. This guide covers scaling strategies at each stage.
## Scaling Stages

### Stage 1: Single Server (0-1,000 users)

**Architecture:** Monolith mode with SQLite

- 1x Fluxer server (4 cores, 8 GB RAM)
- 1x Valkey (2 GB RAM)
- 1x Meilisearch (1 GB RAM)
- Optional: 1x LiveKit (for voice)

**Bottleneck:** CPU and disk I/O
### Stage 2: Vertical Scaling (1,000-5,000 users)

**Architecture:** Still a monolith, but on a larger instance

- Upgrade to 8-16 cores, 16-32 GB RAM
- Use faster NVMe SSD storage
- Add read replicas for SQLite (if supported)

**Bottleneck:** Single database writer
### Stage 3: Horizontal Scaling (5,000-10,000 users)

**Architecture:** Microservices mode with SQLite or Cassandra

- Multiple API server instances (load balanced)
- Separate media proxy instances
- Clustered Valkey/Redis
- Consider migrating to Cassandra

**Bottleneck:** Database write throughput
### Stage 4: Global Distribution (10,000+ users)

**Architecture:** Multi-region microservices with Cassandra

- Cassandra cluster (3+ nodes per DC)
- Regional LiveKit servers
- CDN for static assets
- Geographic load balancing

**Bottleneck:** Network latency and cross-DC replication
## Vertical Scaling (Single Server)
Before going distributed, maximize a single server’s capacity.
### Optimize Node.js Memory
```yaml
services:
  fluxer_server:
    environment:
      # Increase heap size (default: auto, ~1.4 GB on a 4 GB system)
      NODE_OPTIONS: "--max-old-space-size=8192"  # 8 GB
      # Enlarge the libuv threadpool used for fs/dns/crypto work
      UV_THREADPOOL_SIZE: "16"
```
### Optimize SQLite
Tune SQLite for write-heavy workloads:
```sql
-- Increase cache size (negative value = size in KiB)
PRAGMA cache_size = -128000;  -- ~128 MB

-- Use WAL mode with a larger checkpoint interval
PRAGMA journal_mode = WAL;
PRAGMA wal_autocheckpoint = 1000;

-- Reduce fsync calls (slight durability trade-off)
PRAGMA synchronous = NORMAL;

-- Use memory-mapped I/O for reads
PRAGMA mmap_size = 268435456;  -- 256 MB
```
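Note the sign convention on `cache_size`: a positive value is a page count, a negative value is a size in KiB. A tiny helper (hypothetical, not part of Fluxer) makes the conversion from a MiB budget explicit:

```javascript
// cache_size: positive = number of pages, negative = size in KiB.
// Convert a cache budget in MiB to the negative-KiB form.
function pragmaCacheSize(mib) {
  return `PRAGMA cache_size = -${mib * 1024};`;
}

console.log(pragmaCacheSize(128)); // PRAGMA cache_size = -131072;
```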
Add to your config:
```json
{
  "database": {
    "backend": "sqlite",
    "sqlite_path": "/mnt/fast-nvme/fluxer.db"
  }
}
```
Place the SQLite database on a fast NVMe SSD for best performance. Avoid network-mounted storage (NFS, EBS).
### Optimize Valkey/Redis
```conf
# Increase max memory
maxmemory 8gb
maxmemory-policy allkeys-lru

# Disable persistence for a pure cache (optional)
# save ""
# appendonly no

# Or use faster persistence
save 900 1
save 300 10
appendfsync everysec
```
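The `allkeys-lru` policy evicts whichever key was touched least recently once `maxmemory` is reached. A toy sketch of the semantics (illustration only; Valkey itself uses approximate LRU via random sampling, not an exact recency list):

```javascript
// Toy illustration of allkeys-lru: when over budget, evict the key
// that was used least recently.
class LruCache {
  constructor(maxEntries) {
    this.maxEntries = maxEntries;
    this.map = new Map(); // Map preserves insertion order: oldest first
  }
  get(key) {
    if (!this.map.has(key)) return undefined;
    const value = this.map.get(key);
    this.map.delete(key);
    this.map.set(key, value); // re-insert to mark as most recently used
    return value;
  }
  set(key, value) {
    this.map.delete(key);
    this.map.set(key, value);
    if (this.map.size > this.maxEntries) {
      this.map.delete(this.map.keys().next().value); // evict LRU entry
    }
  }
}

const cache = new LruCache(2);
cache.set('a', 1);
cache.set('b', 2);
cache.get('a');    // touch 'a' so 'b' becomes least recently used
cache.set('c', 3); // over budget: evicts 'b'
```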
### Use a CDN
Offload static assets (avatars, attachments, emojis) to a CDN:
```json
{
  "s3": {
    "presigned_url_base": "https://cdn.example.com"
  },
  "domain": {
    "static_cdn_domain": "cdn.example.com"
  }
}
```
Providers:

- Cloudflare R2 + CDN (free egress)
- AWS S3 + CloudFront
- BunnyCDN
- DigitalOcean Spaces + CDN
## Horizontal Scaling (Microservices)

### Enable Microservices Mode

Update `config.json`:
```json
{
  "instance": {
    "deployment_mode": "microservices"
  },
  "internal": {
    "queue": "http://queue-service:8088/queue",
    "media_proxy": "http://media-proxy:8080/media"
  },
  "services": {
    "nats": {
      "core_url": "nats://nats-core:4222",
      "jetstream_url": "nats://nats-jetstream:4222",
      "auth_token": "your-nats-token"
    }
  }
}
```
### Deploy Separate Services

The same service topology can be deployed with Docker Compose or Kubernetes.

#### Docker Compose

`compose.microservices.yaml`:
```yaml
services:
  # Load Balancer (Traefik, Nginx, HAProxy)
  traefik:
    image: traefik:v2.10
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
      - ./traefik.yml:/etc/traefik/traefik.yml

  # API Servers (scale with --scale)
  api:
    image: ghcr.io/fluxerapp/fluxer-api:latest
    deploy:
      replicas: 3
    environment:
      FLUXER_CONFIG: /config/config.json
    volumes:
      - ./config:/config:ro
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.api.rule=PathPrefix(`/api`)"

  # WebSocket Gateway (Erlang - stateful)
  gateway:
    image: ghcr.io/fluxerapp/fluxer-gateway:latest
    deploy:
      replicas: 2
    environment:
      FLUXER_CONFIG: /config/config.json
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.gateway.rule=PathPrefix(`/gateway`)"

  # Media Proxy (CPU-intensive)
  media_proxy:
    image: ghcr.io/fluxerapp/fluxer-media-proxy:latest
    deploy:
      replicas: 2
    volumes:
      - ./config:/config:ro

  # NATS (Message Queue)
  nats-core:
    image: nats:2-alpine
    command: ['-c', '/config/nats.conf']

  nats-jetstream:
    image: nats:2-alpine
    command: ['-c', '/config/jetstream.conf', '--jetstream']
    volumes:
      - nats_data:/data

  # Valkey Cluster (3 nodes minimum)
  valkey-1:
    image: valkey/valkey:8-alpine
    command: valkey-server --cluster-enabled yes
  valkey-2:
    image: valkey/valkey:8-alpine
    command: valkey-server --cluster-enabled yes
  valkey-3:
    image: valkey/valkey:8-alpine
    command: valkey-server --cluster-enabled yes

volumes:
  nats_data:
```
Scale API servers:

```bash
docker compose -f compose.microservices.yaml up --scale api=5 -d
```

#### Kubernetes
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: fluxer-api
spec:
  replicas: 3
  selector:
    matchLabels:
      app: fluxer-api
  template:
    metadata:
      labels:
        app: fluxer-api
    spec:
      containers:
        - name: api
          image: ghcr.io/fluxerapp/fluxer-api:latest
          env:
            - name: FLUXER_CONFIG
              value: /config/config.json
          volumeMounts:
            - name: config
              mountPath: /config
              readOnly: true
          resources:
            requests:
              memory: "2Gi"
              cpu: "1000m"
            limits:
              memory: "4Gi"
              cpu: "2000m"
      volumes:
        - name: config
          configMap:
            name: fluxer-config
---
apiVersion: v1
kind: Service
metadata:
  name: fluxer-api
spec:
  selector:
    app: fluxer-api
  ports:
    - port: 8080
      targetPort: 8080
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: fluxer-api-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: fluxer-api
  minReplicas: 3
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```
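The HPA's scaling decision follows the documented Kubernetes rule `desiredReplicas = ceil(currentReplicas × currentMetric / targetMetric)`, clamped to the min/max bounds above. For example, 3 pods averaging 90% CPU against the 70% target scale out to 4:

```javascript
// Kubernetes HPA scaling rule: desired = ceil(current * currentUtil / targetUtil).
// (The controller then clamps the result to minReplicas/maxReplicas.)
function desiredReplicas(current, currentUtil, targetUtil) {
  return Math.ceil(current * (currentUtil / targetUtil));
}

console.log(desiredReplicas(3, 90, 70)); // 4
console.log(desiredReplicas(3, 35, 70)); // 2 (clamped back up to minReplicas: 3)
```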
### Service Communication

In microservices mode, services communicate via:

- **NATS Core** - RPC between API and Gateway
- **NATS JetStream** - Async job queues
- **Valkey/Redis** - Shared cache and locks
- **HTTP** - Direct service-to-service calls
## Migrating to Cassandra

Cassandra is required for:

- Multi-region deployments
- Datasets > 100 GB
- Write throughput > 10,000 writes/sec
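Whether you will cross that write threshold is easy to estimate. A hypothetical back-of-envelope sketch (the write-amplification factor, covering indexes and fan-out tables, is an assumed illustrative number, not a measured Fluxer figure):

```javascript
// users × messages per user per minute × write amplification → DB writes/sec.
// writeAmplification is an assumed factor for secondary indexes and fan-out.
function estimatedWritesPerSec(users, msgsPerUserPerMin, writeAmplification) {
  return (users * msgsPerUserPerMin / 60) * writeAmplification;
}

// e.g. 50,000 active users sending 2 messages/min, amplified 8x:
console.log(Math.round(estimatedWritesPerSec(50000, 2, 8))); // 13333
```

At that load you are already past the 10,000 writes/sec mark, so planning the Cassandra migration before reaching it is prudent.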
### 1. Deploy Cassandra Cluster
```yaml
services:
  cassandra-1:
    image: cassandra:5.0
    hostname: cassandra-1
    environment:
      - CASSANDRA_CLUSTER_NAME=fluxer-cluster
      - CASSANDRA_DC=dc1
      - CASSANDRA_RACK=rack1
      - CASSANDRA_SEEDS=cassandra-1,cassandra-2,cassandra-3
      - MAX_HEAP_SIZE=16G
      - CASSANDRA_AUTHENTICATOR=PasswordAuthenticator
      - CASSANDRA_AUTHORIZER=CassandraAuthorizer
    volumes:
      - cassandra1_data:/var/lib/cassandra
    deploy:
      resources:
        limits:
          cpus: '8'
          memory: 32G
        reservations:
          cpus: '4'
          memory: 24G

  cassandra-2:
    image: cassandra:5.0
    # ... similar config ...

  cassandra-3:
    image: cassandra:5.0
    # ... similar config ...

volumes:
  cassandra1_data:
  cassandra2_data:
  cassandra3_data:
```
Start the cluster:

```bash
docker compose -f cassandra-compose.yaml up -d

# Wait for all nodes to join
docker exec cassandra-1 nodetool status
```
### 2. Create Keyspace and Schema
```sql
-- Create keyspace with replication
CREATE KEYSPACE fluxer
  WITH replication = {
    'class': 'NetworkTopologyStrategy',
    'dc1': 3  -- 3 replicas in dc1
  };

USE fluxer;

-- Fluxer will auto-create tables on first run,
-- or run migration scripts manually.
```
### 3. Migrate Data from SQLite
Data migration requires downtime or a complex dual-write setup. Plan carefully.
```bash
# Export from SQLite
docker exec fluxer_server node -e '
  const db = require("./packages/database");
  db.exportToJSON("/tmp/export.json");
'

# Import to Cassandra
docker exec fluxer_server node -e '
  const db = require("./packages/database");
  db.importFromJSON("/tmp/export.json");
'
```
Alternatively, use a custom migration script or third-party tools.
### 4. Update Fluxer Configuration
```json
{
  "database": {
    "backend": "cassandra",
    "cassandra": {
      "hosts": ["cassandra-1", "cassandra-2", "cassandra-3"],
      "keyspace": "fluxer",
      "local_dc": "dc1",
      "username": "cassandra",
      "password": "your-secure-password"
    }
  }
}
```
## Cassandra Tuning

**Heap size:**

```bash
# Set to 1/4 to 1/2 of total RAM (max 32 GB)
MAX_HEAP_SIZE=16G
```
**Compaction strategy:**

```sql
-- Use Leveled Compaction for read-heavy tables
ALTER TABLE messages
  WITH compaction = {
    'class': 'LeveledCompactionStrategy',
    'sstable_size_in_mb': 160
  };

-- Use TWCS for time-series data
ALTER TABLE message_edits
  WITH compaction = {
    'class': 'TimeWindowCompactionStrategy',
    'compaction_window_unit': 'DAYS',
    'compaction_window_size': 7
  };
```
**Read/write consistency:**

```js
// In application code
const cassandra = require('cassandra-driver');

// QUORUM for strong consistency
const result = await client.execute(query, params, {
  consistency: cassandra.types.consistencies.quorum
});

// LOCAL_QUORUM for multi-DC (avoids cross-DC round trips)
const localResult = await client.execute(query, params, {
  consistency: cassandra.types.consistencies.localQuorum
});
```
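`QUORUM` requires acknowledgement from a majority of replicas: `floor(RF / 2) + 1`. With the replication factor of 3 used in the keyspace above, that is 2 replicas per read or write:

```javascript
// Replicas required for a QUORUM read/write at a given replication factor.
const quorum = (rf) => Math.floor(rf / 2) + 1;

console.log(quorum(3)); // 2
console.log(quorum(5)); // 3
```

Because reads and writes each touch a majority, any read quorum overlaps any write quorum, which is what makes `QUORUM` reads see the latest `QUORUM` write.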
## Multi-Region Deployment
For global users, deploy Fluxer in multiple geographic regions.
### 1. Multi-Region Cassandra Architecture

Replicate the keyspace into every datacenter:
```sql
CREATE KEYSPACE fluxer
  WITH replication = {
    'class': 'NetworkTopologyStrategy',
    'us-east': 3,
    'eu-west': 3
  };
```
Each DC runs 3+ Cassandra nodes. Cross-DC replication is automatic.
### 2. Deploy Regional Fluxer Instances
Each region runs:

- API servers (pointing to the local Cassandra DC)
- Gateway servers
- LiveKit servers
- Shared Valkey/Redis cluster (or separate per region)
**US East config:**

```json
{
  "database": {
    "backend": "cassandra",
    "cassandra": {
      "hosts": ["cassandra-us-1", "cassandra-us-2", "cassandra-us-3"],
      "local_dc": "us-east"
    }
  }
}
```
**EU West config:**

```json
{
  "database": {
    "backend": "cassandra",
    "cassandra": {
      "hosts": ["cassandra-eu-1", "cassandra-eu-2", "cassandra-eu-3"],
      "local_dc": "eu-west"
    }
  }
}
```
### 3. GeoDNS Routing
Use GeoDNS to route users to the nearest region:
**Cloudflare:**

- Create a Load Balancer with geo-steering
- Add pools for each region
- Configure health checks
**AWS Route 53:**

```json
{
  "Type": "A",
  "Name": "chat.example.com",
  "GeoLocation": {
    "ContinentCode": "NA"
  },
  "SetIdentifier": "us-east",
  "AliasTarget": {
    "HostedZoneId": "Z1234567890ABC",
    "DNSName": "us-east-lb.example.com"
  }
}
```
### 4. Cross-Region LiveKit
See Voice Setup - Multi-Region for LiveKit distribution.
## Metrics to Track

**API/Gateway:**

- Request rate (requests/sec)
- Latency (p50, p95, p99)
- Error rate (5xx responses)
- Active WebSocket connections
- Message delivery latency

**SQLite:**

- Query latency
- Write queue depth
- WAL checkpoint frequency

**Cassandra:**

- Read/write latency (per DC)
- Pending compactions
- Dropped mutations
- Disk usage per node
- Repair status

**Valkey/Redis:**

- Hit rate (%)
- Memory usage
- Evictions/sec
- Network I/O

**System:**

- CPU utilization
- Memory usage
- Disk I/O (IOPS, throughput)
- Network bandwidth
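Latency percentiles can be spot-checked from raw samples with the nearest-rank method (a sketch for ad-hoc analysis; production pipelines typically use histograms or quantile sketches instead of sorting every sample):

```javascript
// Nearest-rank percentile: sort the samples, then take the
// ceil(p/100 * n)-th smallest value.
function percentile(samples, p) {
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(0, rank - 1)];
}

const latenciesMs = Array.from({ length: 100 }, (_, i) => i + 1); // 1..100 ms
console.log(percentile(latenciesMs, 50)); // 50
console.log(percentile(latenciesMs, 95)); // 95
console.log(percentile(latenciesMs, 99)); // 99
```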
## Monitoring Stack

### Prometheus + Grafana

```yaml
services:
  prometheus:
    image: prom/prometheus:latest
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
      - prometheus_data:/prometheus
    ports:
      - "9090:9090"

  grafana:
    image: grafana/grafana:latest
    environment:
      GF_SECURITY_ADMIN_PASSWORD: admin
    volumes:
      - grafana_data:/var/lib/grafana
    ports:
      - "3000:3000"

  node-exporter:
    image: prom/node-exporter:latest
    ports:
      - "9100:9100"

volumes:
  prometheus_data:
  grafana_data:
```
Enable OpenTelemetry in Fluxer:

```json
{
  "telemetry": {
    "enabled": true,
    "otlp_endpoint": "http://otel-collector:4318"
  }
}
```

### Datadog
```yaml
services:
  datadog-agent:
    image: datadog/agent:latest
    environment:
      DD_API_KEY: ${DD_API_KEY}
      DD_SITE: datadoghq.com
      DD_LOGS_ENABLED: "true"
      DD_APM_ENABLED: "true"
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
      - /proc/:/host/proc/:ro
      - /sys/fs/cgroup/:/host/sys/fs/cgroup:ro
```
## Cost Optimization

### Use Spot/Preemptible Instances
Stateless services (API, media proxy) can run on spot instances:

- AWS EC2 Spot (up to 90% discount)
- GCP Preemptible VMs (up to 80% discount)
- Azure Spot VMs
Do **not** use spot for:

- Database nodes
- Stateful gateway servers (reclamation will disconnect users)
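Spot reclamation usually arrives as a `SIGTERM` with a short warning window (around two minutes on AWS; it varies by provider). Stateless services should stop accepting new work and drain in-flight requests. A minimal sketch, not Fluxer's actual shutdown path:

```javascript
// Minimal drain handler for spot reclamation: stop accepting new work,
// let in-flight requests finish, then exit.
let accepting = true;

function drain() {
  accepting = false;
  // In a real API process you would also call server.close() here and
  // exit once in-flight requests have completed.
}

process.on('SIGTERM', drain);
```

The load balancer's health check should start failing as soon as `accepting` flips, so new traffic is routed to the remaining instances.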
### Object Storage Tiers
Move old attachments to cheaper storage:
```js
// Archive attachments older than 90 days to S3 Glacier by copying each
// object over itself with a new storage class (no re-upload of the body;
// putObject would require sending the body again).
const AWS = require('aws-sdk');
const s3 = new AWS.S3();

await s3.copyObject({
  Bucket: 'fluxer-uploads',
  CopySource: 'fluxer-uploads/attachments/old-file.jpg',
  Key: 'attachments/old-file.jpg',
  StorageClass: 'GLACIER'
}).promise();
```
### Compression

Enable compression for API responses:

```js
// Already enabled in Fluxer by default
app.use(compress({
  threshold: 1024,  // don't compress responses under 1 KB
  level: 6          // compression level (1-9)
}));
```
## Next Steps

- **Architecture** - Deep dive into Fluxer’s system design
- **Upgrading** - Learn about upgrade procedures and versioning