
Overview

Fluxer uses Apache Cassandra 5.0 as its primary data store for messages, users, guilds, channels, and all persistent application data. Cassandra provides:
  • Horizontal scalability - Linear scale-out with no single point of failure
  • High availability - Multi-datacenter replication with tunable consistency
  • Performance - Optimized for write-heavy workloads with low latency
  • Resilience - Self-healing with automatic data distribution
This guide covers single-node deployment for development and small instances. Production deployments should use a multi-node cluster.

Docker Compose Setup

Fluxer’s Cassandra stack includes:
  1. cassandra - Main database service
  2. cassandra-backup - Automated backup service with encryption and B2 upload

Configuration

services:
  cassandra:
    image: cassandra:5.0
    hostname: cassandra
    environment:
      - CASSANDRA_CLUSTER_NAME=fluxer-cluster
      - CASSANDRA_DC=dc1
      - CASSANDRA_RACK=rack1
      - CASSANDRA_ENDPOINT_SNITCH=GossipingPropertyFileSnitch
      - CASSANDRA_SEEDS=cassandra
      - MAX_HEAP_SIZE=32G
      - CASSANDRA_AUTHENTICATOR=PasswordAuthenticator
      - CASSANDRA_AUTHORIZER=CassandraAuthorizer
    volumes:
      - cassandra_data:/var/lib/cassandra
      - ./conf/cassandra.yaml:/etc/cassandra/cassandra.yaml
      - ./conf/jvm-server.options:/etc/cassandra/jvm-server.options
    ports:
      - target: 9042
        published: 9042
        protocol: tcp
        mode: host
    healthcheck:
      test: ['CMD-SHELL', 'nodetool status']
      interval: 30s
      timeout: 10s
      retries: 5
      start_period: 120s

Resource Limits

resources:
  limits:
    cpus: '6'
    memory: 64G
  reservations:
    cpus: '4'
    memory: 48G
ulimits:
  memlock:
    soft: -1
    hard: -1
  nofile:
    soft: 100000
    hard: 100000

Schema Management

Fluxer uses CQL migration files for schema versioning.

Migration Files

Migrations are stored in fluxer_devops/cassandra/migrations/ with timestamp-based naming:
20251224122758_instance_configuration.cql
20260207160233_user_connections.cql
20260214123000_bot_require_code_grant.cql
20260217140000_guild_discovery.cql

Example Migration

20251224122758_instance_configuration.cql
CREATE TABLE IF NOT EXISTS fluxer.instance_configuration (
    key text PRIMARY KEY,
    value text,
    updated_at timestamp
);

INSERT INTO fluxer.instance_configuration (key, value, updated_at)
VALUES ('manual_review_enabled', 'true', toTimestamp(now()));

INSERT INTO fluxer.instance_configuration (key, value, updated_at)
VALUES ('manual_review_schedule_enabled', 'false', toTimestamp(now()));

Applying Migrations

1. Connect to Cassandra

docker exec -it cassandra cqlsh -u cassandra -p "$CASSANDRA_PASSWORD"

2. Create Keyspace (First Time)

CREATE KEYSPACE IF NOT EXISTS fluxer
WITH REPLICATION = {
  'class': 'SimpleStrategy',
  'replication_factor': 1
};

3. Apply Migration

# Copy migration to container
docker cp migrations/20260217140000_guild_discovery.cql cassandra:/tmp/

# Execute
docker exec cassandra cqlsh -u cassandra -p "$CASSANDRA_PASSWORD" \
  -f /tmp/20260217140000_guild_discovery.cql
For production, use a migration tracking table to record applied migrations and prevent duplicates.
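
One way to approach the tracking suggested above is to compare the migration directory against a list of filenames that have already been applied. The sketch below assumes those filenames have been exported one per line to a text file, for example from a hypothetical fluxer.schema_migrations table that is not part of the schema shown here. The helper name pending_migrations is illustrative only.

```bash
# Print migrations in DIR that are not listed in APPLIED_FILE, oldest first.
# Shell globs expand in sorted order, so the timestamp prefix in each
# filename yields chronological ordering.
pending_migrations() {
  local dir="$1" applied="$2" file name
  for file in "$dir"/*.cql; do
    [ -e "$file" ] || continue   # glob did not match: no .cql files
    name=$(basename "$file")
    # -x: match the whole line, -F: literal string, -q: no output
    if ! grep -qxF -- "$name" "$applied" 2>/dev/null; then
      printf '%s\n' "$name"
    fi
  done
}
```

Each printed filename could then be applied with the docker cp and cqlsh -f commands from step 3 and recorded once it succeeds.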

Backup Strategy

Fluxer includes an automated backup system using:
  • Snapshots - Cassandra’s built-in snapshot mechanism
  • age encryption - Public-key encryption for backup security
  • Backblaze B2 - Cloud object storage for off-site backups

Backup Process

The cassandra-backup service runs on a schedule and:
  1. Creates a Cassandra snapshot via nodetool snapshot
  2. Collects snapshot files and schema dump
  3. Compresses and encrypts with age
  4. Uploads to B2 object storage
  5. Purges old backups (retains 168 hourly backups = 7 days)
# Run backup manually
docker exec cassandra-backup /backup.sh
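
The retention rule in step 5 can be illustrated with a small filter. This is only a sketch of the idea, not the logic of the actual backup.sh, which is not shown here. It assumes backup names follow the cassandra-backup-YYYYMMDD-HHMMSS.tar.age pattern used later in this guide; backups_to_purge is a hypothetical helper name.

```bash
# Read backup names on stdin and print those that fall outside the newest KEEP.
# Zero-padded timestamps in the names make a plain sort chronological.
backups_to_purge() {
  local keep="$1"
  sort | awk -v keep="$keep" '
    { names[NR] = $0 }
    END { for (i = 1; i <= NR - keep; i++) print names[i] }
  '
}
```

Piping a listing of object names through `backups_to_purge 168` would print only the backups older than the most recent 168, which at one backup per hour corresponds to 7 days.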

Backup Encryption

Backups are encrypted using age:

1. Generate Key Pair

# Install age
apt install age  # Debian/Ubuntu
brew install age # macOS

# Generate keypair
age-keygen -o age_private_key.txt

# Output:
# Public key: age1xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

2. Configure Public Key

Add the public key to .env:
AGE_PUBLIC_KEY=age1xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

3. Store Private Key Securely

Save age_private_key.txt to:
  • Password manager (1Password, Bitwarden)
  • Hardware security key
  • Offline encrypted storage
Without the private key, backups cannot be decrypted! Store it securely in multiple locations.
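
If the public key printed during key generation was not saved, it can be recovered from the private key file, because age-keygen records it there as a comment line. The helper below is a minimal sketch with an illustrative name (age_public_key); running `age-keygen -y age_private_key.txt` is an alternative that derives the same value.

```bash
# Print the public key recorded in an age identity file.
# age-keygen writes it as a comment line: "# public key: age1..."
age_public_key() {
  sed -n 's/^# public key: //p' "$1"
}
```

For example, `echo "AGE_PUBLIC_KEY=$(age_public_key age_private_key.txt)"` prints a line in the format used by .env.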

Restore Procedures

Fluxer provides two restore workflows:

1. Fresh Instance from Local Backup

Restore a backup to a new Cassandra instance:
# 1. Create volume and start Cassandra
docker volume create cassandra_data
docker run -d --name cass \
  -v cassandra_data:/var/lib/cassandra \
  -p 9042:9042 \
  cassandra:5.0

echo "Waiting for Cassandra to start..."
sleep 30

# 2. Extract backup and apply schema
docker exec cass sh -c 'apt-get update -qq && apt-get install -y -qq age'
docker cp ~/Downloads/backup.tar.age cass:/tmp/
docker cp ~/Downloads/age_private_key.txt cass:/tmp/key.txt

docker exec cass sh -c 'age -d -i /tmp/key.txt /tmp/backup.tar.age | tar -C /tmp -xf -'
docker exec cass sh -c 'sed "/^WARNING:/d" /tmp/cassandra-backup-*/schema.cql | cqlsh'

# 3. Copy backup to volume and stop Cassandra
docker exec cass sh -c 'cp -r /tmp/cassandra-backup-* /var/lib/cassandra/'
docker stop cass

# 4. Restore SSTable files
docker run -d --name cass-util \
  -v cassandra_data:/var/lib/cassandra \
  --entrypoint sleep \
  cassandra:5.0 infinity

docker exec cass-util sh -c '
  BACKUP_DIR=$(ls -d /var/lib/cassandra/cassandra-backup-* | head -1)
  DATA_DIR=/var/lib/cassandra/data
  
  for keyspace_dir in "$BACKUP_DIR"/*/; do
    keyspace=$(basename "$keyspace_dir")
    # Skip system keyspaces (POSIX case, because sh may not support [[ ]])
    case "$keyspace" in system*) continue ;; esac
    [ ! -d "$keyspace_dir" ] && continue
    
    for snapshot_dir in "$keyspace_dir"/*/snapshots/backup-*/; do
      [ ! -d "$snapshot_dir" ] && continue
      table_with_uuid=$(basename $(dirname $(dirname "$snapshot_dir")))
      table_name=$(echo "$table_with_uuid" | cut -d- -f1)
      target_dir=$(ls -d "$DATA_DIR/$keyspace/${table_name}"-* 2>/dev/null | head -1)
      
      if [ -n "$target_dir" ]; then
        cp "$snapshot_dir"/* "$target_dir"/
      fi
    done
  done
  
  chown -R cassandra:cassandra "$DATA_DIR"
'

# 5. Restart and refresh
docker rm -f cass-util
docker start cass
sleep 30

# 6. Refresh tables
docker exec cass sh -c '
  BACKUP_DIR=$(ls -d /var/lib/cassandra/cassandra-backup-* | head -1)
  
  for keyspace_dir in "$BACKUP_DIR"/*/; do
    keyspace=$(basename "$keyspace_dir")
    # Skip system keyspaces (POSIX case, because sh may not support [[ ]])
    case "$keyspace" in system*) continue ;; esac
    
    for snapshot_dir in "$keyspace_dir"/*/snapshots/backup-*/; do
      [ ! -d "$snapshot_dir" ] && continue
      table_with_uuid=$(basename $(dirname $(dirname "$snapshot_dir")))
      table_name=$(echo "$table_with_uuid" | cut -d- -f1)
      nodetool refresh -- "$keyspace" "$table_name" 2>&1 | grep -v deprecated
    done
  done
'

# 7. Verify
docker exec cass cqlsh -e "SELECT COUNT(*) FROM fluxer.users;"
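
The fixed `sleep 30` calls in the procedure above assume Cassandra is ready within 30 seconds, which may not hold on slower hosts. A possible alternative is to poll until a trivial query succeeds. The retry_until helper below is a hypothetical sketch, not part of Fluxer's tooling.

```bash
# Run a command until it succeeds, at most ATTEMPTS times, DELAY seconds apart.
# Returns 0 on the first success and 1 if every attempt fails.
retry_until() {
  local attempts="$1" delay="$2" i=1
  shift 2
  while :; do
    "$@" >/dev/null 2>&1 && return 0
    [ "$i" -ge "$attempts" ] && return 1
    i=$((i + 1))
    sleep "$delay"
  done
}
```

For instance, `retry_until 30 5 docker exec cass cqlsh -e 'DESCRIBE KEYSPACES'` would wait up to roughly two and a half minutes and return as soon as cqlsh can connect.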

2. Production Restore from B2

Restore from cloud backup:
# Set variables
BACKUP_NAME="cassandra-backup-20260304-103753.tar.age"
CASSANDRA_CONTAINER="cassandra-prod"

# Download from B2
export AWS_ACCESS_KEY_ID="${B2_KEY_ID}"
export AWS_SECRET_ACCESS_KEY="${B2_APPLICATION_KEY}"
export AWS_DEFAULT_REGION="${B2_REGION}"
B2_ENDPOINT_URL="https://${B2_ENDPOINT}"

aws s3 cp "s3://${B2_BUCKET_NAME}/${BACKUP_NAME}" \
  "/tmp/${BACKUP_NAME}" \
  --endpoint-url="${B2_ENDPOINT_URL}"

# Stop Cassandra before touching its data directory
docker stop ${CASSANDRA_CONTAINER}

# Start a utility container that shares the Cassandra volumes
docker run -d --name cass-restore-util \
  --volumes-from ${CASSANDRA_CONTAINER} \
  --entrypoint sleep \
  cassandra:5.0 infinity

# Copy the backup and key into the utility container and install age there.
# /tmp is not a shared volume, so files copied into the Cassandra container
# would not be visible from the utility container.
docker cp "/tmp/${BACKUP_NAME}" cass-restore-util:/tmp/
docker cp /etc/cassandra/age_private_key.txt cass-restore-util:/tmp/key.txt
docker exec cass-restore-util sh -c 'apt-get update -qq && apt-get install -y -qq age'

# Decrypt and extract (double quotes so the host expands BACKUP_NAME)
docker exec cass-restore-util sh -c \
  "age -d -i /tmp/key.txt '/tmp/${BACKUP_NAME}' | tar -C /tmp -xf -"

docker exec cass-restore-util sh -c \
  'cp -r /tmp/cassandra-backup-* /var/lib/cassandra/'

# Restore SSTable files (same as above)
# ...

# Restart and verify
docker rm -f cass-restore-util
docker start ${CASSANDRA_CONTAINER}
sleep 30

docker exec ${CASSANDRA_CONTAINER} cqlsh -e "SELECT COUNT(*) FROM fluxer.users;"

# Cleanup
rm -f "/tmp/${BACKUP_NAME}"
Always test restores in a non-production environment first. Restores overwrite existing data!
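
The B2 download step depends on several environment variables, and an unset one tends to surface only as an opaque error from the aws command. A preflight check along the following lines could catch that earlier. The require_env helper is a hypothetical name introduced for illustration; the variable names are the ones used in the script above.

```bash
# Return 0 if every named variable is set and non-empty; otherwise print the
# missing names to stderr and return 1.
require_env() {
  local name value missing=0
  for name in "$@"; do
    eval "value=\${$name:-}"   # indirect lookup that also works in POSIX sh
    if [ -z "$value" ]; then
      echo "missing required variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

Calling `require_env B2_KEY_ID B2_APPLICATION_KEY B2_REGION B2_ENDPOINT B2_BUCKET_NAME || exit 1` before the download would stop the restore before any container is touched.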

Maintenance Operations

Compaction

Manually trigger compaction to reclaim disk space:
# Full compaction (I/O intensive)
docker exec cassandra nodetool compact fluxer

# Specific table
docker exec cassandra nodetool compact fluxer users

Repair

Run repair to ensure data consistency across replicas:
# Primary-range repair of the whole keyspace (run on each node in turn)
docker exec cassandra nodetool repair -pr fluxer

# Specific table
docker exec cassandra nodetool repair -pr fluxer users
In single-node deployments, repair is unnecessary. It’s critical for multi-node clusters.

Cleanup Snapshots

Remove old snapshots to free disk space:
# List snapshots
docker exec cassandra nodetool listsnapshots

# Clear all snapshots
docker exec cassandra nodetool clearsnapshot --all

# Clear specific snapshot
docker exec cassandra nodetool clearsnapshot -t backup-20260304

Monitor Disk Usage

# Check data directory size
docker exec cassandra du -sh /var/lib/cassandra/data

# Per-keyspace usage
docker exec cassandra du -sh /var/lib/cassandra/data/fluxer

# Table sizes
docker exec cassandra du -sh /var/lib/cassandra/data/fluxer/*
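
The du commands report how much data exists but not how full the volume is. For a quick threshold check, the percentage can be read from df instead. The sketch below parses POSIX-format `df -P` output; disk_used_percent is an illustrative helper name.

```bash
# Read `df -P` output on stdin and print the use percentage of the first
# listed filesystem as a bare integer.
disk_used_percent() {
  awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}
```

A possible use is `docker exec cassandra df -P /var/lib/cassandra | disk_used_percent`, comparing the result against a threshold of your choosing before running compaction or taking a snapshot.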

Performance Tuning

JVM Heap Size

Set the heap and young-generation sizes via environment variables:
environment:
  - MAX_HEAP_SIZE=32G
  - HEAP_NEWSIZE=8G
Guidelines:
  • Use 25-50% of total system RAM
  • Minimum: 2G for development
  • Production: 8-32G depending on dataset
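
As a worked reading of these guidelines, the helper below takes half of system RAM (the upper end of the 25-50% range) and clamps it to the 2G to 32G bounds listed above. Choosing the upper end is an assumption made for illustration, and suggest_heap_gb is a hypothetical name; the right value still depends on the dataset.

```bash
# Suggest a heap size in GB for a host with TOTAL_GB of RAM:
# half of RAM, clamped to the 2..32 range.
suggest_heap_gb() {
  local total="$1" heap
  heap=$((total / 2))
  if [ "$heap" -lt 2 ]; then heap=2; fi
  if [ "$heap" -gt 32 ]; then heap=32; fi
  echo "$heap"
}
```

With the 64G memory limit from the Resource Limits section, `suggest_heap_gb 64` yields 32, which matches the MAX_HEAP_SIZE=32G setting used in this guide.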

Concurrent Reads/Writes

Edit cassandra.yaml:
concurrent_reads: 32
concurrent_writes: 32
concurrent_counter_writes: 32

Commit Log Settings

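commitlog_sync: periodic
commitlog_sync_period: 10000ms
commitlog_segment_size: 32MiB
Cassandra 4.1 and later use these unit-suffixed names; the older commitlog_sync_period_in_ms and commitlog_segment_size_in_mb names are still accepted for backward compatibility.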

Monitoring

nodetool status

docker exec cassandra nodetool status
Shows:
  • Node status (UN = Up/Normal)
  • Ownership percentage
  • Load (data size)
  • Tokens
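
For scripted checks, the status column can be evaluated automatically. The following sketch reads nodetool status output and succeeds only when every listed node reports UN. The all_nodes_up name is illustrative, and the parsing assumes the standard layout in which each node line begins with a two-letter status code.

```bash
# Read `nodetool status` output on stdin. Succeeds only if at least one node
# is listed and every node is in the UN (Up/Normal) state.
all_nodes_up() {
  awk '
    /^[UD][NLJM][[:space:]]/ { nodes++; if ($1 != "UN") bad++ }
    END { if (nodes == 0 || bad > 0) exit 1 }
  '
}
```

A possible use is `docker exec cassandra nodetool status | all_nodes_up || echo "cluster degraded"`.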

nodetool tpstats

docker exec cassandra nodetool tpstats
Thread pool statistics:
  • Active/pending tasks
  • Blocked tasks
  • Completed operations

nodetool cfstats

docker exec cassandra nodetool cfstats fluxer.users
Table statistics:
  • Read/write counts
  • Latency percentiles
  • SSTable count

JMX Metrics

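Publish JMX port 7199 for monitoring tools. Cassandra accepts only local JMX connections by default (LOCAL_JMX=yes in cassandra-env.sh), so remote collectors also need remote JMX enabled, ideally with authentication: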
ports:
  - "7199:7199"
Integrate with Prometheus, Grafana, or DataDog.

Troubleshooting

Problem: Container restarts repeatedly during startup
Solution:
  1. Increase start_period in healthcheck
  2. Check logs:
    docker logs cassandra --tail 100
    
  3. Verify sufficient memory:
    docker stats cassandra
    
  4. Check data volume integrity

Problem: Writes failing with “No space left on device”
Solution:
  1. Clear old snapshots:
    docker exec cassandra nodetool clearsnapshot --all
    
  2. Run compaction (a major compaction needs temporary free space of its own, so clear snapshots first):
    docker exec cassandra nodetool compact
    
  3. Increase volume size
  4. Archive old data

Problem: Queries timing out or slow
Solution:
  1. Check GC pressure:
    docker exec cassandra nodetool gcstats
    
  2. Review thread pool stats:
    docker exec cassandra nodetool tpstats
    
  3. Increase heap size
  4. Add read/write capacity
  5. Review query patterns for full table scans
