Overview
Fluxer uses Apache Cassandra 5.0 as its primary data store for messages, users, guilds, channels, and all persistent application data. Cassandra provides:
Horizontal scalability - Linear scale-out with no single point of failure
High availability - Multi-datacenter replication with tunable consistency
Performance - Optimized for write-heavy workloads with low latency
Resilience - Self-healing with automatic data distribution
This guide covers single-node deployment for development and small instances. Production deployments should use a multi-node cluster.
Docker Compose Setup
Fluxer’s Cassandra stack includes:
cassandra - Main database service
cassandra-backup - Automated backup service with encryption and B2 upload
Configuration
```yaml
services:
  cassandra:
    image: cassandra:5.0
    hostname: cassandra
    environment:
      - CASSANDRA_CLUSTER_NAME=fluxer-cluster
      - CASSANDRA_DC=dc1
      - CASSANDRA_RACK=rack1
      - CASSANDRA_ENDPOINT_SNITCH=GossipingPropertyFileSnitch
      - CASSANDRA_SEEDS=cassandra
      - MAX_HEAP_SIZE=32G
      - CASSANDRA_AUTHENTICATOR=PasswordAuthenticator
      - CASSANDRA_AUTHORIZER=CassandraAuthorizer
    volumes:
      - cassandra_data:/var/lib/cassandra
      - ./conf/cassandra.yaml:/etc/cassandra/cassandra.yaml
      - ./conf/jvm-server.options:/etc/cassandra/jvm-server.options
    ports:
      - target: 9042
        published: 9042
        protocol: tcp
        mode: host
    healthcheck:
      test: ['CMD-SHELL', 'nodetool status']
      interval: 30s
      timeout: 10s
      retries: 5
      start_period: 120s
```
Resource Limits
For the cassandra service:

```yaml
resources:
  limits:
    cpus: '6'
    memory: 64G
  reservations:
    cpus: '4'
    memory: 48G
ulimits:
  memlock:
    soft: -1
    hard: -1
  nofile:
    soft: 100000
    hard: 100000
```

For the cassandra-backup service:

```yaml
resources:
  limits:
    cpus: '2'
    memory: 8G
  reservations:
    cpus: '1'
    memory: 4G
ulimits:
  nofile:
    soft: 10000
    hard: 10000
```
Schema Management
Fluxer uses CQL migration files for schema versioning.
Migration Files
Migrations are stored in fluxer_devops/cassandra/migrations/ with timestamp-based naming:
```
20251224122758_instance_configuration.cql
20260207160233_user_connections.cql
20260214123000_bot_require_code_grant.cql
20260217140000_guild_discovery.cql
```
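A new migration file can be stamped with the same UTC timestamp convention; a small helper (the function name and description argument are illustrative):

```shell
# Print a migration filename using the YYYYMMDDHHMMSS convention above.
new_migration() {
  desc="$1"
  ts=$(date -u +%Y%m%d%H%M%S)
  echo "${ts}_${desc}.cql"
}

new_migration guild_discovery
```

Using UTC keeps filenames sortable even when contributors are in different time zones.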
Example Migration
20251224122758_instance_configuration.cql
```cql
CREATE TABLE IF NOT EXISTS fluxer.instance_configuration (
    key text PRIMARY KEY,
    value text,
    updated_at timestamp
);

INSERT INTO fluxer.instance_configuration (key, value, updated_at)
VALUES ('manual_review_enabled', 'true', toTimestamp(now()));

INSERT INTO fluxer.instance_configuration (key, value, updated_at)
VALUES ('manual_review_schedule_enabled', 'false', toTimestamp(now()));
```
Applying Migrations
Connect to Cassandra
```shell
docker exec -it cassandra cqlsh -u cassandra -p "$CASSANDRA_PASSWORD"
```
Create Keyspace (First Time)
```cql
CREATE KEYSPACE IF NOT EXISTS fluxer
WITH REPLICATION = {
  'class': 'SimpleStrategy',
  'replication_factor': 1
};
```
Apply Migration
```shell
# Copy migration to container
docker cp migrations/20260217140000_guild_discovery.cql cassandra:/tmp/

# Execute
docker exec cassandra cqlsh -u cassandra -p "$CASSANDRA_PASSWORD" \
  -f /tmp/20260217140000_guild_discovery.cql
```
For production, use a migration tracking table to record applied migrations and prevent duplicates.
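A minimal sketch of the duplicate-prevention idea, using a local file as the record of applied migrations for illustration (a production setup would instead query a CQL tracking table, e.g. a hypothetical fluxer.schema_migrations):

```shell
# List migrations that have not yet been applied.
# $1: directory containing *.cql migration files
# $2: file listing applied migration filenames, one per line
pending_migrations() {
  migrations_dir="$1"
  applied_list="$2"
  for f in "$migrations_dir"/*.cql; do
    name=$(basename "$f")
    # -x: whole-line match, -F: literal string
    grep -qxF "$name" "$applied_list" || echo "$name"
  done
}
```

After each migration is applied, its filename is appended to the record (or inserted into the tracking table) so reruns skip it.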
Backup Strategy
Fluxer includes an automated backup system using:
Snapshots - Cassandra’s built-in snapshot mechanism
age encryption - Public-key encryption for backup security
Backblaze B2 - Cloud object storage for off-site backups
Backup Process
The cassandra-backup service runs on a schedule and:
Creates a Cassandra snapshot via nodetool snapshot
Collects snapshot files and schema dump
Compresses and encrypts with age
Uploads to B2 object storage
Purges old backups (retains 168 hourly backups = 7 days)
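The purge step above can be sketched as a small pruning function; the local directory layout and the cassandra-backup-*.tar.age naming are assumptions based on the backup names shown later in this guide:

```shell
# Keep only the newest N backup archives in a directory.
# Timestamped filenames sort chronologically, so a lexical sort suffices.
prune_backups() {
  backup_dir="$1"
  retain="${2:-168}"   # 168 hourly backups = 7 days
  ls "$backup_dir" | grep '^cassandra-backup-.*\.tar\.age$' \
    | sort -r | tail -n "+$((retain + 1))" \
    | while read -r old; do
        rm -f "$backup_dir/$old"
      done
}
```

The real backup service prunes objects in B2 rather than local files, but the retention arithmetic is the same.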
Manual Backup

```shell
# Run backup manually
docker exec cassandra-backup /backup.sh
```

View Backup Logs

```shell
docker logs cassandra-backup --tail 100
```
Backup Encryption
Backups are encrypted using age:
Generate Key Pair
```shell
# Install age
apt install age   # Debian/Ubuntu
brew install age  # macOS

# Generate keypair
age-keygen -o age_private_key.txt
# Output:
# Public key: age1xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
```
Configure Public Key
Add the public key to .env:

```
AGE_PUBLIC_KEY=age1xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
```
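A quick format check can catch copy/paste mistakes before the first backup runs; this only validates the shape of the key (bech32 with the age1 prefix), not the key itself:

```shell
# Return success if $1 looks like an age X25519 public key:
# "age1" followed by 58 bech32 characters (no 1, b, i, or o).
is_age_public_key() {
  echo "$1" | grep -Eq '^age1[02-9ac-hj-np-z]{58}$'
}
```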
Store Private Key Securely
Save age_private_key.txt to:
Password manager (1Password, Bitwarden)
Hardware security key
Offline encrypted storage
Without the private key, backups cannot be decrypted! Store it securely in multiple locations.
Restore Procedures
Fluxer provides two restore workflows:
1. Fresh Instance from Local Backup
Restore a backup to a new Cassandra instance:
```shell
# 1. Create volume and start Cassandra
docker volume create cassandra_data
docker run -d --name cass \
  -v cassandra_data:/var/lib/cassandra \
  -p 9042:9042 \
  cassandra:5.0

echo "Waiting for Cassandra to start..."
sleep 30

# 2. Extract backup and apply schema
docker exec cass sh -c 'apt-get update -qq && apt-get install -y -qq age'
docker cp ~/Downloads/backup.tar.age cass:/tmp/
docker cp ~/Downloads/age_private_key.txt cass:/tmp/key.txt
docker exec cass sh -c 'age -d -i /tmp/key.txt /tmp/backup.tar.age | tar -C /tmp -xf -'
docker exec cass sh -c 'sed "/^WARNING:/d" /tmp/cassandra-backup-*/schema.cql | cqlsh'

# 3. Copy backup to volume and stop Cassandra
docker exec cass sh -c 'cp -r /tmp/cassandra-backup-* /var/lib/cassandra/'
docker stop cass

# 4. Restore SSTable files
docker run -d --name cass-util \
  -v cassandra_data:/var/lib/cassandra \
  --entrypoint sleep \
  cassandra:5.0 infinity

# Use bash -c: the [[ ... ]] test below is not POSIX sh
docker exec cass-util bash -c '
BACKUP_DIR=$(ls -d /var/lib/cassandra/cassandra-backup-* | head -1)
DATA_DIR=/var/lib/cassandra/data
for keyspace_dir in "$BACKUP_DIR"/*/; do
  keyspace=$(basename "$keyspace_dir")
  [[ "$keyspace" =~ ^system ]] && continue
  [ ! -d "$keyspace_dir" ] && continue
  for snapshot_dir in "$keyspace_dir"/*/snapshots/backup-*/; do
    [ ! -d "$snapshot_dir" ] && continue
    table_with_uuid=$(basename "$(dirname "$(dirname "$snapshot_dir")")")
    table_name=$(echo "$table_with_uuid" | cut -d- -f1)
    target_dir=$(ls -d "$DATA_DIR/$keyspace/${table_name}"-* 2>/dev/null | head -1)
    if [ -n "$target_dir" ]; then
      cp "$snapshot_dir"/* "$target_dir"/
    fi
  done
done
chown -R cassandra:cassandra "$DATA_DIR"
'

# 5. Restart and refresh
docker rm -f cass-util
docker start cass
sleep 30

# 6. Refresh tables
docker exec cass bash -c '
BACKUP_DIR=$(ls -d /var/lib/cassandra/cassandra-backup-* | head -1)
for keyspace_dir in "$BACKUP_DIR"/*/; do
  keyspace=$(basename "$keyspace_dir")
  [[ "$keyspace" =~ ^system ]] && continue
  for snapshot_dir in "$keyspace_dir"/*/snapshots/backup-*/; do
    [ ! -d "$snapshot_dir" ] && continue
    table_with_uuid=$(basename "$(dirname "$(dirname "$snapshot_dir")")")
    table_name=$(echo "$table_with_uuid" | cut -d- -f1)
    nodetool refresh -- "$keyspace" "$table_name" 2>&1 | grep -v deprecated
  done
done
'

# 7. Verify
docker exec cass cqlsh -e "SELECT COUNT(*) FROM fluxer.users;"
```
2. Production Restore from B2
Restore from cloud backup:
```shell
# Set variables
BACKUP_NAME="cassandra-backup-20260304-103753.tar.age"
CASSANDRA_CONTAINER="cassandra-prod"

# Download from B2
export AWS_ACCESS_KEY_ID="${B2_KEY_ID}"
export AWS_SECRET_ACCESS_KEY="${B2_APPLICATION_KEY}"
export AWS_DEFAULT_REGION="${B2_REGION}"
B2_ENDPOINT_URL="https://${B2_ENDPOINT}"

aws s3 cp "s3://${B2_BUCKET_NAME}/${BACKUP_NAME}" \
  "/tmp/${BACKUP_NAME}" \
  --endpoint-url="${B2_ENDPOINT_URL}"

# Copy to container
docker cp "/tmp/${BACKUP_NAME}" "${CASSANDRA_CONTAINER}:/tmp/"
docker cp /etc/cassandra/age_private_key.txt "${CASSANDRA_CONTAINER}:/tmp/key.txt"

# Install age while the container is running, then stop Cassandra
docker exec "${CASSANDRA_CONTAINER}" sh -c 'apt-get update -qq && apt-get install -y -qq age'
docker stop "${CASSANDRA_CONTAINER}"

# Extract in utility container
docker run -d --name cass-restore-util \
  --volumes-from "${CASSANDRA_CONTAINER}" \
  --entrypoint sleep \
  cassandra:5.0 infinity

# Double quotes here so ${BACKUP_NAME} expands on the host
docker exec cass-restore-util sh -c \
  "age -d -i /tmp/key.txt /tmp/${BACKUP_NAME} | tar -C /tmp -xf -"
docker exec cass-restore-util sh -c \
  'cp -r /tmp/cassandra-backup-* /var/lib/cassandra/'

# Restore SSTable files (same as above)
# ...

# Restart and verify
docker rm -f cass-restore-util
docker start "${CASSANDRA_CONTAINER}"
sleep 30
docker exec "${CASSANDRA_CONTAINER}" cqlsh -e "SELECT COUNT(*) FROM fluxer.users;"

# Cleanup
rm -f "/tmp/${BACKUP_NAME}"
```
Always test restores in a non-production environment first. Restores overwrite existing data!
Maintenance Operations
Compaction
Manually trigger compaction to reclaim disk space:
```shell
# Full compaction (I/O intensive)
docker exec cassandra nodetool compact fluxer

# Specific table
docker exec cassandra nodetool compact fluxer users
```
Repair
Run repair to ensure data consistency across replicas:
```shell
# Full repair
docker exec cassandra nodetool repair -pr fluxer

# Specific table
docker exec cassandra nodetool repair -pr fluxer users
```
In single-node deployments, repair is unnecessary. It’s critical for multi-node clusters.
Cleanup Snapshots
Remove old snapshots to free disk space:
```shell
# List snapshots
docker exec cassandra nodetool listsnapshots

# Clear all snapshots
docker exec cassandra nodetool clearsnapshot --all

# Clear specific snapshot
docker exec cassandra nodetool clearsnapshot -t backup-20260304
```
Monitor Disk Usage
```shell
# Check data directory size
docker exec cassandra du -sh /var/lib/cassandra/data

# Per-keyspace usage
docker exec cassandra du -sh /var/lib/cassandra/data/fluxer

# Table sizes (sh -c so the glob expands inside the container)
docker exec cassandra sh -c 'du -sh /var/lib/cassandra/data/fluxer/*'
```
JVM Heap Size
Set via environment variable:
```yaml
environment:
  - MAX_HEAP_SIZE=32G
  - HEAP_NEWSIZE=8G
```
Guidelines:
Use 25-50% of total system RAM
Minimum: 2G for development
Production: 8-32G depending on dataset
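These guidelines can be expressed as a small sizing helper; the 50% target and the 2G/32G clamps are this guide's numbers, not Cassandra defaults:

```shell
# Suggest MAX_HEAP_SIZE from total system RAM (in GB):
# half of RAM, clamped to a 2G floor and a 32G ceiling.
suggest_heap_gb() {
  total_gb="$1"
  heap=$((total_gb / 2))
  [ "$heap" -lt 2 ] && heap=2
  [ "$heap" -gt 32 ] && heap=32
  echo "${heap}G"
}

suggest_heap_gb 64   # → 32G
```

On Linux, total RAM in GB is available via free -g.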
Concurrent Reads/Writes
Edit cassandra.yaml:
```yaml
concurrent_reads: 32
concurrent_writes: 32
concurrent_counter_writes: 32
```
Commit Log Settings
```yaml
commitlog_sync: periodic
commitlog_sync_period_in_ms: 10000
commitlog_segment_size_in_mb: 32
```
Monitoring
nodetool status

```shell
docker exec cassandra nodetool status
```
Shows:
Node status (UN = Up/Normal)
Ownership percentage
Load (data size)
Tokens
nodetool tpstats

```shell
docker exec cassandra nodetool tpstats
```
Thread pool statistics:
Active/pending tasks
Blocked tasks
Completed operations
nodetool tablestats

```shell
# tablestats is the current name; cfstats remains as a deprecated alias
docker exec cassandra nodetool tablestats fluxer.users
```
Table statistics:
Read/write counts
Latency percentiles
SSTable count
JMX Metrics

Expose JMX on port 7199 for monitoring tools. Integrate with Prometheus, Grafana, or DataDog.
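A compose-file sketch for exposing JMX, assuming the official image's cassandra-env.sh conventions (JMX_PORT and LOCAL_JMX); remote JMX must be secured with authentication/TLS before being exposed:

```yaml
environment:
  - JMX_PORT=7199
  - LOCAL_JMX=no   # allow remote JMX connections; add auth/TLS first
ports:
  - "7199:7199"
```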
Troubleshooting
Startup timeout / healthcheck failing

Problem: Container restarts repeatedly during startup.

Solution:

Increase start_period in the healthcheck

Check logs:

```shell
docker logs cassandra --tail 100
```

Verify sufficient memory

Check data volume integrity
Disk full

Problem: Writes failing with "No space left on device".

Solution:

Clear old snapshots:

```shell
docker exec cassandra nodetool clearsnapshot --all
```

Run compaction:

```shell
docker exec cassandra nodetool compact
```

Increase volume size

Archive old data
Slow or timing-out queries

Problem: Queries timing out or slow.

Solution:

Check GC pressure:

```shell
docker exec cassandra nodetool gcstats
```

Review thread pool stats:

```shell
docker exec cassandra nodetool tpstats
```

Increase heap size

Add read/write capacity

Review query patterns for full table scans
See Also