
Running the Runtime

Development

# Start PostgreSQL and Redis
./dev.sh

# Run runtime (dev build)
./x run-dev

# Or with Cargo directly
cargo run -p flora

Production

# Build release binary via Buck2
./x build-release

# Run release binary
./x run-release

# Or run Buck2 target directly
buck2 run //apps/runtime:flora_bin_release
Source: AGENTS.md

Lifecycle Management

Startup

The runtime initializes in this order:
  1. Load configuration from config.toml and environment variables
  2. Initialize tracing with configured log level
  3. Connect to PostgreSQL and run migrations
  4. Connect to Redis with exponential backoff
  5. Initialize V8 once per process
  6. Create worker pool (default 4 workers, max 64)
  7. Initialize workers sequentially to avoid V8 race conditions
  8. Load SDK bundle into all workers
  9. Restore deployments from database
  10. Deploy guild scripts to appropriate workers
  11. Start Discord client (Serenity)
  12. Start HTTP API server (Axum on port 3000)
Source: apps/runtime/src/main.rs:40
Workers are initialized sequentially to prevent V8 initialization race conditions.
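Steps 6–7 above can be sketched as a loop that builds the pool one worker at a time. This is an illustrative sketch, not flora's actual API: the struct and function names here are invented for the example.

```rust
// Illustrative sketch of sequential worker-pool creation (steps 6-7).
// All names are hypothetical; flora's real worker type is internal.
struct Worker {
    id: usize,
    ready: bool,
}

fn init_worker(id: usize) -> Worker {
    // In the real runtime each worker creates its V8 isolate here;
    // doing this one worker at a time avoids V8 initialization races.
    Worker { id, ready: true }
}

fn init_pool(requested: usize) -> Vec<Worker> {
    let size = requested.clamp(1, 64); // default 4 workers, hard cap 64
    (0..size).map(init_worker).collect()
}

fn main() {
    let pool = init_pool(4);
    assert!(pool.iter().all(|w| w.ready));
    println!("initialized {} workers, last id {}", pool.len(), pool[pool.len() - 1].id);
}
```

The key property is ordering: `init_worker` for worker *n+1* never starts until worker *n* has finished, which is what rules out the race.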

Shutdown

Graceful shutdown sequence:
  1. Receive SIGTERM or SIGINT
  2. Migrate guilds between workers (load balancing)
  3. Send shutdown to each worker
  4. Join worker threads (wait for completion)
  5. Drop runtime states (cleanup V8 isolates)
Source: apps/runtime/src/runtime/mod.rs:325
Ungraceful shutdown (SIGKILL) may leave incomplete transactions. Always use SIGTERM for production shutdowns.
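Steps 3–4 of the shutdown sequence (signal each worker, then join its thread) can be sketched with plain std channels. flora's actual worker message protocol is internal, so treat the `Msg` type and function names as assumptions for illustration.

```rust
use std::sync::mpsc;
use std::thread;

// Hypothetical worker message; the real protocol is internal to flora.
enum Msg {
    Shutdown,
}

fn run_and_shutdown(workers: usize) -> usize {
    let mut senders = Vec::new();
    let mut handles = Vec::new();
    for _ in 0..workers {
        let (tx, rx) = mpsc::channel::<Msg>();
        senders.push(tx);
        handles.push(thread::spawn(move || {
            // Worker loop: block until a shutdown message arrives.
            while let Ok(Msg::Shutdown) = rx.recv() {
                break;
            }
        }));
    }
    // Step 3: send shutdown to each worker.
    for tx in senders {
        tx.send(Msg::Shutdown).unwrap();
    }
    // Step 4: join worker threads, waiting for each to finish.
    let mut joined = 0;
    for h in handles {
        h.join().unwrap();
        joined += 1;
    }
    // Step 5 (dropping runtime state) happens as values go out of scope.
    joined
}

fn main() {
    assert_eq!(run_and_shutdown(4), 4);
    println!("all workers stopped");
}
```

Because `join` blocks until each worker's loop has returned, no isolate is torn down while its thread is still executing.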

Guild Deployments

Deploying a Script

# Via CLI
flora deploy

# Via HTTP API
curl -X POST http://localhost:3000/deployments \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "guild_id": "123456789",
    "entry_point": "src/main.ts",
    "bundle": "...",
    "source_map": {...}
  }'
Deployment flow:
  1. Validate bundle size and file count
  2. Store in PostgreSQL with source map
  3. Invalidate Redis cache for guild
  4. Route to worker via guild hash
  5. Create V8 isolate for guild
  6. Execute bootstrap code
  7. Load SDK bundle
  8. Load guild script
  9. Extract dispatch function
  10. Register event handlers
Source: apps/runtime/src/runtime/worker.rs:474
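Step 4 ("route to worker via guild hash") implies a stable mapping from guild ID to worker index. A minimal sketch, assuming hash-modulo routing (the hash function below is illustrative; flora may use a different scheme):

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Sketch of guild-to-worker routing: hash the guild ID, take it modulo
// the pool size. DefaultHasher stands in for whatever flora actually uses.
fn route_guild(guild_id: &str, workers: usize) -> usize {
    let mut h = DefaultHasher::new();
    guild_id.hash(&mut h);
    (h.finish() % workers as u64) as usize
}

fn main() {
    let w = route_guild("123456789", 4);
    assert!(w < 4);
    // Deterministic: the same guild always lands on the same worker,
    // which is what lets deployments and events find its isolate.
    assert_eq!(w, route_guild("123456789", 4));
    println!("guild routed to worker {}", w);
}
```

Determinism is the point: every deployment and event for a guild resolves to the same worker without any shared routing table.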

Rollback

Deployments can be rolled back by redeploying a previous bundle.
# Get previous deployment
flora deployments list

# Redeploy old bundle
flora deploy --from-backup <backup-id>
flora does not maintain deployment history. Store deployment bundles externally for rollback capability.

Worker Management

Worker Pool Sizing

Workers are CPU-bound (V8 execution), so size the pool based on CPU cores.
[runtime]
max_workers = 8  # Recommended: CPU cores
Considerations:
  • More workers = more memory (each has V8 instance)
  • More workers = better parallelism
  • Max workers: 64 (hard limit)
Source: apps/runtime/src/runtime/constants.rs
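The sizing rules above (default 4, hard cap 64) can be captured in one small function. A sketch, with an invented name; the actual constants live in `constants.rs`:

```rust
// Sketch of effective pool sizing: fall back to the default of 4 workers
// when max_workers is unset, and clamp to the hard limit of 64.
// The function name is illustrative, not flora's API.
fn effective_workers(configured: Option<usize>) -> usize {
    configured.unwrap_or(4).clamp(1, 64)
}

fn main() {
    assert_eq!(effective_workers(None), 4);       // default pool size
    assert_eq!(effective_workers(Some(8)), 8);    // e.g. max_workers = 8
    assert_eq!(effective_workers(Some(128)), 64); // hard cap applies
    println!("sizing rules hold");
}
```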

Guild Migration

Guilds can be migrated between workers for load balancing.
runtime.migrate_guild_runtime("guild_id", target_worker).await?;
Migration steps:
  1. Begin migration (queue new events)
  2. Migrate out from source worker
    • Quiesce event loop (wait for pending ops)
    • Verify isolate is idle
    • Transfer ownership
  3. Migrate in to target worker
    • Accept ownership
    • Re-enter V8 context
  4. Finish migration (replay queued events)
Source: apps/runtime/src/runtime/mod.rs:112
Migration fails if the runtime is not idle (pending async ops). The runtime rolls back to source worker on failure.
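The idle check and rollback behavior can be modeled as a tiny decision function. This is a toy sketch of the sequence above, not the real implementation; all names are illustrative.

```rust
// Toy model of the migration flow: if the isolate is not idle after
// quiescing, the migration aborts and the guild stays on the source worker.
#[derive(Debug, PartialEq)]
enum MigrationResult {
    Completed { worker: usize },
    RolledBack { worker: usize },
}

fn migrate(guild_idle: bool, source: usize, target: usize) -> MigrationResult {
    // Step 1: begin migration (new events would be queued here).
    // Step 2: quiesce the event loop, then verify the isolate is idle.
    if !guild_idle {
        // Pending async ops: abort and roll back to the source worker.
        return MigrationResult::RolledBack { worker: source };
    }
    // Step 3: target worker accepts ownership and re-enters the V8 context.
    // Step 4: finish migration and replay the queued events.
    MigrationResult::Completed { worker: target }
}

fn main() {
    assert_eq!(migrate(true, 0, 2), MigrationResult::Completed { worker: 2 });
    assert_eq!(migrate(false, 0, 2), MigrationResult::RolledBack { worker: 0 });
    println!("migration model behaves as documented");
}
```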

Monitoring

Metrics

Metrics are exposed in Prometheus text format at /metrics. Available metrics:
  • flora_dispatch_success - Event dispatch successes
  • flora_dispatch_error - Event dispatch errors
  • flora_dispatch_duration - Event dispatch duration
  • flora_timeout_error - Timeout errors
  • flora_oom_error - Out-of-memory errors
  • flora_isolate_restarted - Isolate restart count
  • flora_migration_success - Successful migrations
  • flora_migration_timeout - Migration timeouts
  • flora_migration_quiesce_duration - Migration quiesce duration
Source: apps/runtime/src/metrics.rs

Logs

Structured logging is provided by the tracing crate. Configure the filter in config.toml:
log_level = "flora::runtime=debug,flora=info"
Log targets:
  • flora::runtime - Runtime lifecycle events
  • flora::discord - Discord events
  • flora::deployments - Deployment operations
  • flora::kv - KV operations
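The log_level string above is a comma-separated list of target=level directives, in the style of tracing's EnvFilter. A sketch of how such a string decomposes (the parser here is illustrative; the real filtering is done by the tracing stack):

```rust
// Decompose a directive string like "flora::runtime=debug,flora=info"
// into (target, level) pairs. Illustrative only; tracing's EnvFilter
// handles the real parsing and level matching.
fn parse_directives(s: &str) -> Vec<(String, String)> {
    s.split(',')
        .filter_map(|d| d.split_once('='))
        .map(|(target, level)| (target.to_string(), level.to_string()))
        .collect()
}

fn main() {
    let d = parse_directives("flora::runtime=debug,flora=info");
    // More specific targets (flora::runtime) override the broader default (flora).
    assert_eq!(d[0], ("flora::runtime".to_string(), "debug".to_string()));
    assert_eq!(d[1], ("flora".to_string(), "info".to_string()));
    println!("parsed {} directives", d.len());
}
```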
Log streaming:
# Via CLI
flora logs <guild_id> --follow

# Via HTTP API (SSE)
curl "http://localhost:3000/logs/<guild_id>?follow=true"
Source: apps/runtime/src/handlers/logs.rs

Health Checks

# Health endpoint
curl http://localhost:3000/health
Returns:
{
  "status": "ok",
  "timestamp": "2026-03-05T12:00:00Z"
}
Source: apps/runtime/src/handlers/health.rs

Troubleshooting

Runtime Won’t Start

Check required configuration:
# Verify required env vars are set
echo $DISCORD_TOKEN
echo $SECRETS_MASTER_KEY
echo $API_SECRET
echo $BUILD_SERVICE_SECRET
Check database connectivity:
# Test PostgreSQL connection
psql $DATABASE_URL -c "SELECT 1;"

# Test Redis connection
redis-cli -u $CACHE_URL PING

High Memory Usage

Causes:
  • Too many workers (each has V8 instance)
  • Too many Sled instances in cache
  • Memory leaks in guild scripts
Solutions:
[runtime]
max_workers = 4  # Reduce workers

// Reduce Sled cache size (requires a code change)
const MAX_DB_CACHE_SIZE: usize = 50;

Timeouts

Dispatch timeouts:
[runtime]
dispatch_timeout_secs = 5  # Increase timeout
Migration timeouts:
[runtime]
migration_timeout_ms = 1000  # Increase timeout
Frequent timeouts indicate slow guild scripts. Profile and optimize user code.

Isolate Crashes

Common causes:
  • Out-of-memory (OOM)
  • Infinite loops
  • Unhandled promise rejections
Check metrics:
curl http://localhost:3000/metrics | grep flora_oom_error
curl http://localhost:3000/metrics | grep flora_isolate_restarted
Solutions:
  • Increase V8 heap size (requires code change)
  • Add timeout protection to user code
  • Restart guild runtime

Backup and Recovery

Database Backups

# Backup PostgreSQL
pg_dump $DATABASE_URL > flora_backup.sql

# Restore PostgreSQL
psql $DATABASE_URL < flora_backup.sql

KV Backups

# Export all guilds via API
curl -X POST http://localhost:3000/kv/<guild_id>/export \
  -H "Authorization: Bearer $TOKEN"

# Returns backup ID
# Backups stored at: data/kv/<guild_id>/backups/<backup_id>/
Manual backup:
# Copy entire KV directory
tar -czf flora_kv_backup.tar.gz data/kv/
Source: apps/runtime/src/services/kv/service.rs:293

Secret Backups

Secrets are encrypted with SECRETS_MASTER_KEY. Store the master key securely; without it, secrets cannot be decrypted.
# Backup secrets table
pg_dump $DATABASE_URL -t secrets > secrets_backup.sql

# Backup master key (store securely)
echo $SECRETS_MASTER_KEY > master_key.txt
chmod 600 master_key.txt

Scaling

Vertical Scaling

Increase resources on a single instance.
[runtime]
max_workers = 16  # Scale to CPU cores

[database]
max_connections = 20  # Scale with workers

[cache]
pool_size = 30  # Scale with workers

Horizontal Scaling

The flora runtime does not currently support horizontal scaling (multiple instances), because guild state is tied to worker isolates.
Future scaling approaches:
  • Shard guilds across runtime instances
  • Shared state via distributed cache
  • Leader election for worker coordination

Maintenance

Updating Dependencies

# Update Rust dependencies
cargo update

# Sync Buck2 deps
./x sync-rust-deps
./x buckify-rust-deps

# Rebuild
./x build-release
Source: AGENTS.md

Database Migrations

Migrations are applied automatically at startup.
# Create new migration
sqlx migrate add <migration_name>

# Edit migration file
vim apps/runtime/migrations/<timestamp>_<migration_name>.sql

# Migrations run on next startup
./x run-dev

Updating V8

V8 is embedded via the deno_core crate. Update it by bumping the deno_core dependency:
[dependencies]
deno_core = "0.x.y"  # Update version
V8 updates may introduce breaking changes. Test thoroughly before deploying.

Production Checklist

  • Set all required environment variables
  • Use cookie_secure = true for HTTPS
  • Configure appropriate timeouts
  • Set up database backups
  • Set up KV backups
  • Monitor metrics endpoint
  • Configure log aggregation
  • Test disaster recovery procedure
  • Document secret rotation process
  • Set up health check monitoring
  • Configure firewall rules
  • Use TLS for PostgreSQL and Redis
  • Rotate API secrets regularly
  • Test deployment rollback
  • Monitor worker CPU usage
  • Set up alerting for OOM errors
