Production Deployment

Overview

This guide covers production deployment best practices for BR-ACC, including security hardening, performance tuning, and scaling considerations.

BR-ACC handles sensitive procurement and financial data. Follow all security recommendations carefully to protect your deployment.

Pre-Deployment Checklist

Security Hardening

Required Security Changes

These settings MUST be changed before deploying to production. Using default values is a critical security vulnerability.

1. Neo4j Password

Generate a strong random password:

# Generate 32 random bytes, base64-encoded (44 characters)
openssl rand -base64 32

Set in .env:

.env

NEO4J_PASSWORD=<generated-password>

2. JWT Secret Key

Generate a cryptographically secure secret:

# Generate a 64-character hex string
openssl rand -hex 32

Set in .env:

.env

JWT_SECRET_KEY=<generated-secret>

The JWT secret is used to sign authentication tokens. If compromised, attackers can forge valid sessions. Never reuse this secret across environments.
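Where openssl is not available, Python's standard secrets module can produce equivalent values. The following is a minimal sketch; the helper names generate_password and generate_jwt_secret are illustrative and not part of BR-ACC.

```python
import base64
import secrets


def generate_password(num_bytes: int = 32) -> str:
    """Return num_bytes of randomness, base64-encoded (like `openssl rand -base64 32`)."""
    return base64.b64encode(secrets.token_bytes(num_bytes)).decode("ascii")


def generate_jwt_secret(num_bytes: int = 32) -> str:
    """Return num_bytes of randomness as a hex string (like `openssl rand -hex 32`)."""
    return secrets.token_hex(num_bytes)
```

Call each helper once per environment and paste the results into .env as NEO4J_PASSWORD and JWT_SECRET_KEY.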

3. HTTPS and Secure Cookies

Enable secure cookies for HTTPS:

.env

AUTH_COOKIE_SECURE=true

Only set AUTH_COOKIE_SECURE=true when the site is served over HTTPS. Browsers do not send Secure cookies over plain HTTP, so authentication breaks without it.

4. CORS Configuration

Restrict CORS to your production domain:

.env

# Development (NEVER use in production)
CORS_ORIGINS=http://localhost:3000

# Production
CORS_ORIGINS=https://bracc.yourdomain.com

# Multiple domains
CORS_ORIGINS=https://bracc.yourdomain.com,https://www.yourdomain.com

Never use CORS_ORIGINS=* when allow_credentials is enabled. This allows any website to make authenticated requests to your API.
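The four required changes can be checked automatically before a release. The sketch below is one possible pre-flight check, not BR-ACC's own validation: the function names and the length thresholds (32 and 64 characters, chosen to match the openssl commands above) are assumptions.

```python
from urllib.parse import urlparse


def parse_cors_origins(raw: str) -> list[str]:
    """Split a comma-separated CORS_ORIGINS value into a clean list."""
    return [item.strip() for item in raw.split(",") if item.strip()]


def check_required_settings(env: dict[str, str]) -> list[str]:
    """Return a list of problems found; an empty list means every check passed."""
    problems = []
    # Assumed thresholds: anything shorter is unlikely to be a generated value.
    if len(env.get("NEO4J_PASSWORD", "")) < 32:
        problems.append("NEO4J_PASSWORD is missing or shorter than 32 characters")
    if len(env.get("JWT_SECRET_KEY", "")) < 64:
        problems.append("JWT_SECRET_KEY is missing or shorter than 64 characters")
    if env.get("AUTH_COOKIE_SECURE", "").lower() != "true":
        problems.append("AUTH_COOKIE_SECURE is not true")

    origins = parse_cors_origins(env.get("CORS_ORIGINS", ""))
    if not origins:
        problems.append("CORS_ORIGINS is empty")
    for origin in origins:
        if origin == "*":
            problems.append("CORS_ORIGINS contains the wildcard *")
            continue
        parsed = urlparse(origin)
        if parsed.scheme != "https" or not parsed.hostname:
            problems.append(f"CORS origin is not an https URL: {origin}")
    return problems
```

Passing a dictionary loaded from .env returns an empty list when all four settings look production-ready, and one message per problem otherwise.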

Optional Security Settings

Invite Codes

Restrict registration with an invite code:

.env

INVITE_CODE=your-secret-invite-code

Users must provide this code during registration.

Public Access Control

Disable public access to sensitive features:

.env

PUBLIC_MODE=false
PUBLIC_ALLOW_PERSON=false
PUBLIC_ALLOW_ENTITY_LOOKUP=false
PUBLIC_ALLOW_INVESTIGATIONS=false

Rate Limiting

Rate limits are configured in api/src/bracc/config.py:21-22:

Anonymous requests: 60/minute
Authenticated requests: 300/minute

Adjust these limits based on your usage patterns.
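To reason about what these limits mean for a single client, here is a small sliding-window limiter using the two values quoted above. It is an illustration only: how BR-ACC enforces its limits is not shown in this guide, and the class name, tier names, and client_id parameter are hypothetical.

```python
import time
from collections import defaultdict, deque

# Requests allowed per window, taken from the limits listed above.
LIMITS_PER_MINUTE = {"anonymous": 60, "authenticated": 300}


class SlidingWindowLimiter:
    """Allow at most N requests per client within any rolling window."""

    def __init__(self, window_seconds: float = 60.0, clock=time.monotonic):
        self.window = window_seconds
        self.clock = clock  # injectable so the limiter can be tested without waiting
        self.hits = defaultdict(deque)

    def allow(self, client_id: str, tier: str) -> bool:
        now = self.clock()
        recent = self.hits[client_id]
        # Drop timestamps that have fallen out of the window.
        while recent and now - recent[0] >= self.window:
            recent.popleft()
        if len(recent) >= LIMITS_PER_MINUTE[tier]:
            return False
        recent.append(now)
        return True
```

With the default values, the 61st anonymous request from one client inside a minute is rejected, and requests are accepted again once the oldest ones age out of the window.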

Performance Tuning

Neo4j Memory Configuration

Neo4j memory configuration is the most critical factor for performance. The page cache should be large enough to fit your entire graph in memory.

Development Settings (Default)

Suitable for small datasets (<1M nodes):

.env

NEO4J_HEAP_INITIAL=512m
NEO4J_HEAP_MAX=1G
NEO4J_PAGECACHE=512m

Production Settings (40M+ nodes)

Requires a server with at least 32GB of RAM (64GB recommended).

.env

NEO4J_HEAP_INITIAL=4G
NEO4J_HEAP_MAX=8G
NEO4J_PAGECACHE=12G

Memory Allocation Guidelines

Dataset Size    Min RAM   Heap Initial   Heap Max   Page Cache
<1M nodes       4GB       512m           1G         512m
1-10M nodes     16GB      2G             4G         4G
10-40M nodes    32GB      4G             8G         8G
40M+ nodes      64GB      4G             8G         12G

Rule of thumb:

Heap: 25-50% of RAM
Page Cache: 50-75% of remaining RAM after heap
Leave 1-2GB for OS and other processes
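The rule of thumb can be turned into a quick calculation. This sketch uses the lower end of each range by default and reserves 2GB for the OS; the function name and its parameters are illustrative, and the results are starting points rather than tuned values.

```python
def suggest_memory(
    ram_gb: float,
    heap_fraction: float = 0.25,
    pagecache_fraction: float = 0.5,
    os_reserve_gb: float = 2.0,
) -> dict[str, float]:
    """Apply the rule of thumb above and return suggested sizes in GB."""
    if not 0.25 <= heap_fraction <= 0.5:
        raise ValueError("heap_fraction should be between 0.25 and 0.5")
    if not 0.5 <= pagecache_fraction <= 0.75:
        raise ValueError("pagecache_fraction should be between 0.5 and 0.75")

    heap = ram_gb * heap_fraction
    # Page cache is a share of what remains after the heap is taken out.
    pagecache = (ram_gb - heap) * pagecache_fraction
    left_over = ram_gb - heap - pagecache
    if left_over < os_reserve_gb:
        raise ValueError("not enough RAM left for the OS and other processes")
    return {"heap_max_gb": heap, "pagecache_gb": pagecache, "left_for_os_gb": left_over}
```

For a 32GB server the defaults give an 8GB heap and a 12GB page cache, which matches the production settings shown earlier.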

Docker Memory Limits

Set memory limits in docker-compose.yml:

docker-compose.yml

services:
  neo4j:
    deploy:
      resources:
        limits:
          memory: 24G
        reservations:
          memory: 20G
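The container limit has to cover heap, page cache, and some headroom for the JVM itself, otherwise the container is killed when Neo4j fills its configured memory. The check below is a sketch; the helper names and the 1G overhead default are assumptions, not measured values.

```python
import re

_UNITS = {"k": 1024, "m": 1024**2, "g": 1024**3}


def parse_size(value: str) -> int:
    """Convert a size such as '512m' or '8G' into bytes."""
    match = re.fullmatch(r"(\d+)([kmg])b?", value.strip().lower())
    if not match:
        raise ValueError(f"unrecognized size: {value!r}")
    return int(match.group(1)) * _UNITS[match.group(2)]


def fits_container(heap_max: str, pagecache: str, limit: str, overhead: str = "1g") -> bool:
    """Return True if heap + page cache + overhead fit inside the container limit."""
    needed = parse_size(heap_max) + parse_size(pagecache) + parse_size(overhead)
    return needed <= parse_size(limit)
```

The production settings above (8G heap, 12G page cache) fit inside the 24G limit, while 16G plus 24G would not.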

Database Indexes

Ensure critical indexes exist:

// Create indexes for common queries
CREATE INDEX person_cpf IF NOT EXISTS FOR (p:Person) ON (p.cpf);
CREATE INDEX org_cnpj IF NOT EXISTS FOR (o:Organization) ON (o.cnpj);
CREATE INDEX contract_id IF NOT EXISTS FOR (c:Contract) ON (c.id);

Check existing indexes:

SHOW INDEXES;
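The output of SHOW INDEXES can be compared against the indexes this guide expects. The sketch below works on rows represented as dictionaries with the labelsOrTypes and properties columns that SHOW INDEXES returns; fetching those rows (for example through a driver) is left out, and the function name is hypothetical.

```python
# (label, property) pairs for the three indexes created above.
REQUIRED_INDEXES = {
    ("Person", "cpf"),
    ("Organization", "cnpj"),
    ("Contract", "id"),
}


def missing_indexes(rows: list[dict]) -> list[tuple[str, str]]:
    """Return the required (label, property) pairs with no matching single-property index."""
    present = set()
    for row in rows:
        labels = row.get("labelsOrTypes") or []  # null for token lookup indexes
        props = row.get("properties") or []
        if len(labels) == 1 and len(props) == 1:
            present.add((labels[0], props[0]))
    return sorted(REQUIRED_INDEXES - present)
```

An empty result means all three indexes exist; anything else lists what still needs a CREATE INDEX statement.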

High Availability

Database Backups

Automated Daily Backups

Create a backup script:

backup.sh

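#!/bin/bash
set -euo pipefail

# Run from the project directory so docker compose finds its files under cron
cd "$(dirname "$0")"

BACKUP_DIR="/backups/neo4j"
DATE=$(date +%Y%m%d-%H%M%S)
BACKUP_FILE="$BACKUP_DIR/bracc-$DATE.dump"

mkdir -p "$BACKUP_DIR"

# neo4j-admin database dump needs the database offline, so stop the server
# and make sure it comes back even if the dump fails
docker compose stop neo4j
trap 'docker compose start neo4j' EXIT

# Create backup in a one-off container that shares the service's /data volume
# (assumes /data is a volume and the image entrypoint passes neo4j-admin through)
docker compose run --rm -T neo4j neo4j-admin database dump neo4j \
  --to-path=/data --overwrite-destination=true

# Bring Neo4j back up before the slower copy and compress steps
docker compose start neo4j
trap - EXIT

# Copy from container
docker cp bracc-neo4j:/data/neo4j.dump "$BACKUP_FILE"

# Compress
gzip "$BACKUP_FILE"

# Delete backups older than 30 days
find "$BACKUP_DIR" -name "*.dump.gz" -mtime +30 -delete

echo "Backup completed: $BACKUP_FILE.gz"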

Add to cron:

0 2 * * * /opt/bracc/backup.sh
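Before trusting the retention step, it can help to preview which files it would remove. This sketch approximates the find command's 30-day rule in Python without deleting anything; the function name is illustrative, and find's day rounding may differ slightly at the boundary.

```python
import time
from pathlib import Path
from typing import Optional


def expired_backups(
    backup_dir: Path, max_age_days: int = 30, now: Optional[float] = None
) -> list[Path]:
    """List compressed dumps whose modification time is older than max_age_days."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    return sorted(
        path for path in backup_dir.glob("*.dump.gz") if path.stat().st_mtime < cutoff
    )
```

Calling expired_backups(Path("/backups/neo4j")) returns the candidates for deletion, which is a cheap way to confirm the retention window before the cron job runs unattended.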

Restore from Backup

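# Stop Neo4j (the database must be offline for a load)
docker compose stop neo4j

# Extract backup
gunzip backup.dump.gz

# Copy to container
docker cp backup.dump bracc-neo4j:/data/neo4j.dump

# Restore from a one-off container; exec cannot target the stopped service
# (assumes /data is a volume shared with the neo4j service)
docker compose run --rm neo4j neo4j-admin database load neo4j \
  --from-path=/data --overwrite-destination=true

# Start Neo4j
docker compose start neo4j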

Health Monitoring

Each service can be checked as follows:

API: http://localhost:8000/health
Neo4j: Port 7687 (Bolt protocol)
Frontend: Depends on API health

Set up monitoring with your preferred tool (Prometheus, Datadog, etc.).

Example Health Check Script

health-check.sh

#!/bin/bash

# Check API health
if ! curl -sf http://localhost:8000/health > /dev/null; then
  echo "API health check failed"
  exit 1
fi

# Check Neo4j
if ! docker compose exec -T neo4j cypher-shell -u neo4j -p "$NEO4J_PASSWORD" \
     "RETURN 1" > /dev/null 2>&1; then
  echo "Neo4j health check failed"
  exit 1
fi

echo "All services healthy"
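The same checks can be written in Python for use inside a monitoring job. This is a sketch using only the standard library; it probes the Bolt port with a plain TCP connection instead of running cypher-shell, and the function names are illustrative.

```python
import socket
import urllib.error
import urllib.request


def http_ok(url: str, timeout: float = 3.0) -> bool:
    """Return True if the URL answers with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        # Covers HTTP error statuses, refused connections, and timeouts.
        return False


def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For the endpoints listed above, http_ok("http://localhost:8000/health") covers the API and port_open("localhost", 7687) covers Neo4j. An open port only shows that Neo4j is listening, so the cypher-shell query remains the stronger check.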

Load Balancing

For high-traffic deployments, run multiple API instances behind a load balancer:

docker-compose.prod.yml

services:
  api:
    deploy:
      replicas: 3
    # ... other config

  nginx:
    image: nginx:alpine
    ports:
      - "443:443"
    volumes:
      # Mounted into conf.d because the file holds only upstream/server blocks
      - ./nginx.conf:/etc/nginx/conf.d/default.conf:ro
      - ./ssl:/etc/nginx/ssl:ro
    depends_on:
      - api

Deployment Architecture

Recommended Production Setup

┌─────────────────────────────────────────────────┐
│                   Internet                      │
└───────────────────┬─────────────────────────────┘
                    │
            ┌───────▼────────┐
            │  Load Balancer │
            │   (Nginx/Caddy)│
            └───────┬────────┘
                    │
        ┌───────────┼───────────┐
        │           │           │
    ┌───▼───┐   ┌──▼────┐  ┌───▼───┐
    │ API 1 │   │ API 2 │  │ API 3 │
    └───┬───┘   └──┬────┘  └───┬───┘
        │          │           │
        └──────────┼───────────┘
                   │
            ┌──────▼──────┐
            │   Neo4j     │
            │  (Primary)  │
            └──────┬──────┘
                   │
            ┌──────▼──────┐
            │  Backup     │
            │  Storage    │
            └─────────────┘

Reverse Proxy Configuration

Caddy (Recommended)

Caddyfile

bracc.yourdomain.com {
    # Automatic HTTPS
    
    # Frontend
    handle /* {
        reverse_proxy frontend:3000
    }
    
    # API
    handle /api/* {
        reverse_proxy api:8000
    }
    
    # Neo4j Browser (optional, restrict with basic auth)
    handle /neo4j/* {
        basicauth {
            admin $2a$14$...
        }
        reverse_proxy neo4j:7474
    }
}

Nginx

nginx.conf

upstream api {
    server api:8000;
}

upstream frontend {
    server frontend:3000;
}

server {
    listen 443 ssl http2;
    server_name bracc.yourdomain.com;

    ssl_certificate /etc/nginx/ssl/cert.pem;
    ssl_certificate_key /etc/nginx/ssl/key.pem;

    location / {
        proxy_pass http://frontend;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }

    location /api/ {
        proxy_pass http://api;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
}

Logging and Monitoring

Log Aggregation

Configure Docker logging:

docker-compose.prod.yml

services:
  api:
    logging:
      driver: "json-file"
      options:
        max-size: "10m"
        max-file: "3"
  
  neo4j:
    logging:
      driver: "json-file"
      options:
        max-size: "50m"
        max-file: "5"

Log Levels

Set appropriate log levels:

.env

# Development
LOG_LEVEL=debug

# Production
LOG_LEVEL=info

# Troubleshooting
LOG_LEVEL=debug
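How BR-ACC reads LOG_LEVEL is not shown here, but an application would typically map the value onto Python's logging levels along these lines. The function names are hypothetical, and falling back to INFO for unknown values is an assumption.

```python
import logging
import os


def resolve_log_level(value: str, default: int = logging.INFO) -> int:
    """Map a LOG_LEVEL string such as 'debug' to a logging level number."""
    level = logging.getLevelName(value.strip().upper())
    # getLevelName returns a string like 'Level FOO' for unknown names.
    return level if isinstance(level, int) else default


def configure_logging() -> None:
    """Configure the root logger from the LOG_LEVEL environment variable."""
    logging.basicConfig(level=resolve_log_level(os.environ.get("LOG_LEVEL", "info")))
```

Falling back to INFO means a typo in .env does not silently enable debug output in production.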

Metrics Collection

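Neo4j can expose metrics in two ways (the Prometheus endpoint is an Enterprise Edition feature):

JMX metrics on port 3637 (when remote JMX is enabled)
Prometheus endpoint on port 2004 (when server.metrics.prometheus.enabled=true; set server.metrics.prometheus.endpoint=0.0.0.0:2004 so other containers can reach it)

Configure Prometheus scraping:

prometheus.yml

scrape_configs:
  - job_name: 'neo4j'
    static_configs:
      - targets: ['neo4j:2004']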

Scaling Considerations

Vertical Scaling

For datasets up to 100M nodes:

Increase RAM (up to 128GB)
Increase page cache proportionally
Use faster NVMe SSD storage

Horizontal Scaling

Neo4j Community Edition does not support clustering. For multi-instance deployments, consider Neo4j Enterprise Edition or a sharding strategy.

For the API layer:

Run multiple API containers
Use a load balancer (Nginx, HAProxy, Caddy)
Share the same Neo4j instance

Disaster Recovery

Recovery Time Objective (RTO)

Target: 1 hour

1. Detect failure: automated monitoring alerts within 5 minutes
2. Spin up new instance: deploy new server with Docker Compose (15 minutes)
3. Restore from backup: download and restore latest backup (30 minutes)
4. Verify and redirect: health checks and DNS update (10 minutes)
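The step estimates add up to exactly the one-hour target, which is worth making explicit. A small sketch of that arithmetic, using the figures above (the names are illustrative):

```python
RTO_TARGET_MINUTES = 60

# Estimated duration of each recovery step, in minutes, as listed above.
RECOVERY_STEPS = {
    "Detect failure": 5,
    "Spin up new instance": 15,
    "Restore from backup": 30,
    "Verify and redirect": 10,
}


def rto_slack(steps: dict[str, int], target_minutes: int) -> int:
    """Return the minutes left over after all steps; negative means the target is missed."""
    return target_minutes - sum(steps.values())
```

With these estimates the slack is zero, so any step that runs long, such as restoring a large dump, pushes recovery past the target.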

Recovery Point Objective (RPO)

Target: 24 hours (daily backups)

For lower RPO:

Increase backup frequency
Use Neo4j transaction logs
Consider Neo4j Enterprise for online backup

Deployment Workflow

Prepare environment

# Clone repository
git clone https://github.com/your-org/bracc.git
cd bracc

# Create production .env
cp .env.example .env
nano .env  # Edit configuration

Build containers

docker compose build

Initialize database

# Start Neo4j only
docker compose up -d neo4j

# Wait for health check
docker compose ps neo4j

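# Run initialization script (-T disables the TTY so stdin can be redirected;
# NEO4J_PASSWORD must be set in your shell, e.g. exported from .env)
docker compose exec -T neo4j cypher-shell -u neo4j -p "$NEO4J_PASSWORD" \
  < infra/neo4j/init.cypher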
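docker compose ps reports the status once, so an automated deployment usually needs to poll until Neo4j is ready before running the initialization script. The helper below is a generic sketch of such a wait loop; the name wait_until and its parameters are hypothetical, and the check callable is whatever readiness probe you choose (for example a TCP connection to port 7687).

```python
import time
from typing import Callable


def wait_until(
    check: Callable[[], bool],
    timeout: float = 120.0,
    interval: float = 2.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call check() repeatedly until it returns True or the timeout elapses."""
    deadline = clock() + timeout
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

The clock and sleep parameters exist only so the loop can be tested without real delays; in normal use the defaults are enough.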

Start all services

docker compose up -d

Verify deployment

# Check service health
docker compose ps

# Test API
curl https://bracc.yourdomain.com/api/health

# Check logs
docker compose logs -f

Load data (optional)

# Run ETL pipeline
docker compose run --rm etl python -m etl.tasks.load_contracts

Troubleshooting

Common Production Issues

Out of Memory Errors

# Check Neo4j memory usage
docker stats bracc-neo4j

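# Increase heap and page cache in .env
# (raise the memory limit in docker-compose.yml to match, or the container will be killed)
NEO4J_HEAP_MAX=16G
NEO4J_PAGECACHE=24G

# Recreate the container so the new values apply; restart alone does not re-read .env
docker compose up -d neo4j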

Slow Query Performance

# Check query logs
docker compose logs neo4j | grep -i "slow"

# Analyze query plan
docker compose exec neo4j cypher-shell -u neo4j -p "$NEO4J_PASSWORD" \
  "EXPLAIN MATCH (p:Person) RETURN p LIMIT 10;"

# Create missing indexes
docker compose exec neo4j cypher-shell -u neo4j -p "$NEO4J_PASSWORD" \
  "CREATE INDEX IF NOT EXISTS FOR (p:Person) ON (p.cpf);"

Service Not Starting

# Check service logs
docker compose logs api

# Verify dependencies
docker compose ps

# Check health status
docker inspect bracc-neo4j | grep -A 10 Health

Next Steps

Configuration

Review all configuration options

Docker Setup

Docker Compose reference

ETL Pipeline

Load data into BR-ACC

API Reference

API documentation
