Back up your Infrahub deployment to protect against data loss and enable disaster recovery. This page covers backup strategies, procedures, and restoration processes.

What to back up

A complete Infrahub backup includes:
  1. Neo4j database - All graph data, schemas, branches, and version history
  2. Prefect PostgreSQL database - Task execution history and workflow state
  3. Artifact storage - Generated artifacts, files, and Git repositories
  4. Configuration - Environment variables and configuration files
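
As a rough illustration, a complete backup could bundle these four components into one archive. The layout below is an assumption for illustration only, not the documented format of infrahub-backup archives:

```shell
# Illustrative archive layout only -- the real infrahub-backup
# archive format is not specified here; all paths are assumptions.
set -e
workdir=$(mktemp -d)
cd "$workdir"
mkdir -p neo4j prefect artifacts config

echo "graph data"      > neo4j/neo4j.backup          # Neo4j database dump
echo "workflow state"  > prefect/prefect_backup.dump # Prefect PostgreSQL dump
echo "generated file"  > artifacts/example.txt       # artifact storage
echo "example-setting" > config/.env                 # configuration files

tar -czf infrahub_backup_example.tar.gz neo4j prefect artifacts config
tar -tzf infrahub_backup_example.tar.gz
```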

Backup strategies

Full backup

Capture all components at a consistent point in time:
  • Best for disaster recovery
  • Ensures data consistency across all services
  • Requires stopping or pausing services
  • Suitable for scheduled maintenance windows

Hot backup

Back up while services are running:
  • Minimizes downtime
  • May capture slightly inconsistent state across services
  • Suitable for continuous operations
  • Requires careful coordination

Incremental backup

Back up only data that changed since the last backup:
  • Reduces backup size and time
  • Requires baseline full backup
  • More complex restoration process
  • Best for large deployments
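
For file-based data such as artifact storage, the incremental pattern can be sketched with GNU tar's --listed-incremental snapshot files. This is a generic illustration, not a feature of infrahub-backup, and the databases still need their own dump tools:

```shell
set -e
data=$(mktemp -d); dest=$(mktemp -d)
echo "one" > "$data/a.txt"

# Level 0: full backup; the snapshot file records what was captured
tar --listed-incremental="$dest/snapshot" -czf "$dest/full.tar.gz" -C "$data" .

# Simulate a change, then capture only the delta against the snapshot
echo "two" > "$data/b.txt"
tar --listed-incremental="$dest/snapshot" -czf "$dest/incr1.tar.gz" -C "$data" .

# Restore: apply the full backup first, then each incremental in order
restore=$(mktemp -d)
tar --listed-incremental=/dev/null -xzf "$dest/full.tar.gz" -C "$restore"
tar --listed-incremental=/dev/null -xzf "$dest/incr1.tar.gz" -C "$restore"
```

Note that --listed-incremental is GNU tar specific; restoring requires replaying every incremental archive in order, which is the "more complex restoration process" mentioned above.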

Backup tools

The infrahub-backup tool provides automated backup and restore:
# Download the tool
curl https://infrahub.opsmill.io/ops/$(uname -s)/$(uname -m)/infrahub-backup \
  -o infrahub-backup
chmod +x infrahub-backup

# Create backup
./infrahub-backup create

# Restore from backup
./infrahub-backup restore infrahub_backup_20250302_120000.tar.gz
Features:
  • Automatic Neo4j and PostgreSQL backup
  • Integrity verification with SHA256 checksums
  • Coordinated service shutdown and restart
  • Metadata preservation
Limitations:
  • Artifact storage backup not yet included
  • Requires Docker Compose deployment
For detailed usage, see the backup guide.
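
Because the tool writes SHA256 checksums, a restore can be gated on a verification pass. The wrapper below is a sketch, assuming the checksum lives in a .sha256 sidecar file next to the archive:

```shell
# Verify an archive against its .sha256 sidecar and confirm it lists
# cleanly, before ever touching the running deployment.
# Sketch only: the sidecar naming convention is an assumption.
verify_backup() {
  archive="$1"
  [ -f "$archive" ] || { echo "missing archive: $archive" >&2; return 1; }
  [ -f "$archive.sha256" ] || { echo "missing checksum: $archive.sha256" >&2; return 1; }
  ( cd "$(dirname "$archive")" && \
    sha256sum -c "$(basename "$archive").sha256" >/dev/null ) || return 1
  tar -tzf "$archive" > /dev/null || return 1
  echo "OK: $archive"
}

# Usage (sketch):
# verify_backup infrahub_backup_20250302_120000.tar.gz && \
#   ./infrahub-backup restore infrahub_backup_20250302_120000.tar.gz
```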

Manual backup procedures

For custom backup workflows or Kubernetes deployments:

    Backup procedures

    Docker Compose backup

    Step 1: Stop task workers
    Prevent new tasks from starting:
    docker compose stop task-worker
    
    Step 2: Backup Neo4j
    docker exec -it -u neo4j infrahub-database-1 bash
    mkdir -p backups
    neo4j-admin database backup --to-path=backups/ neo4j
    exit
    
    # Copy to host
    docker cp infrahub-database-1:/var/lib/neo4j/backups/neo4j-2025-03-02T12-00-00.backup .
    
    Step 3: Backup PostgreSQL
    docker compose exec -T task-manager-db \
      pg_dump -Fc -U postgres -d prefect > prefect_backup.dump
    
    Step 4: Backup artifacts
    # For local storage
    docker compose cp infrahub-server:/opt/infrahub/storage /backup/artifacts/
    
    # For S3 storage
    aws s3 sync s3://your-infrahub-bucket /backup/artifacts/
    
    Step 5: Restart services
    docker compose start task-worker
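    
The five steps above can be chained into a single script. This is a sketch that mirrors the commands above, not a supported tool; it runs the Neo4j backup non-interactively instead of through an interactive shell, and setting DOCKER=echo turns it into a dry run that only prints the commands:

```shell
# Chain the manual Docker Compose backup steps (sketch only).
# Container/service names match the examples above; adjust to your deployment.
manual_backup() {
  run=${DOCKER:-docker}                      # set DOCKER=echo for a dry run

  # Step 1: stop task workers
  "$run" compose stop task-worker

  # Step 2: back up Neo4j inside the database container
  "$run" exec -u neo4j infrahub-database-1 \
    neo4j-admin database backup --to-path=/var/lib/neo4j/backups/ neo4j

  # Step 3: back up the Prefect PostgreSQL database
  "$run" compose exec -T task-manager-db \
    pg_dump -Fc -U postgres -d prefect > prefect_backup.dump

  # Step 4: back up artifacts (local storage case)
  "$run" compose cp infrahub-server:/opt/infrahub/storage ./artifacts/

  # Step 5: restart task workers
  "$run" compose start task-worker
}
```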
    

    Kubernetes backup

    For Kubernetes deployments, see the dedicated backup guide: Kubernetes Backup Guide

    Restore procedures

    Docker Compose restore

    Step 1: Stop services
    docker compose stop task-worker infrahub-server task-manager
    
    Step 2: Restore Neo4j
    # Copy backup to container
    docker cp neo4j-2025-03-02T12-00-00.backup infrahub-database-1:/var/lib/neo4j/
    
    # Connect to container
    docker exec -it -u neo4j infrahub-database-1 bash
    
    # Drop existing database
    cypher-shell -d system -u neo4j
    DROP DATABASE neo4j;
    :exit
    
    # Clean data directories
    rm -rf /data/databases/neo4j
    rm -rf /data/transactions/neo4j
    
    # Restore backup
    neo4j-admin database restore \
      --from-path=/var/lib/neo4j/neo4j-2025-03-02T12-00-00.backup neo4j \
      --overwrite-destination=true
    
    # Recreate database
    cypher-shell -d system -u neo4j
    CREATE DATABASE neo4j;
    :exit
    
    Step 3: Restore PostgreSQL
    docker compose exec -T task-manager-db \
      pg_restore -d postgres -U postgres --clean --create < prefect_backup.dump
    
    Step 4: Restore artifacts
    # For local storage
    docker compose cp /backup/artifacts/ infrahub-server:/opt/infrahub/storage
    
    # For S3 storage
    aws s3 sync /backup/artifacts/ s3://your-infrahub-bucket
    
    Step 5: Restart services
    docker compose start task-manager
    docker compose start infrahub-server
    docker compose start task-worker
    
    Step 6: Verify restoration
    # Check API health
    curl http://localhost:8000/api/schema/summary
    
    # Check database
    docker compose exec database cypher-shell -u neo4j "SHOW DATABASES;"
    

    Neo4j cluster backup and restore

    For Neo4j Enterprise clusters, follow these specialized procedures:

    Backup from cluster node

    # Connect to a follower node
    docker exec -it -u neo4j infrahub-database-core2-1 bash
    
    # Create backup
    mkdir -p backups
    neo4j-admin database backup --to-path=backups/ neo4j
    

    Restore to cluster node

    Step 1: Transfer backup
    docker cp neo4j-2025-03-02T12-00-00.backup infrahub-database-core3-1:/var/lib/neo4j/
    
    Step 2: Drop database cluster-wide
    cypher-shell -d system -u neo4j
    DROP DATABASE neo4j;
    SHOW SERVERS;
    
    Step 3: Clean target node
    docker exec -it -u neo4j infrahub-database-core3-1 bash
    rm -rf /data/databases/neo4j
    rm -rf /data/transactions/neo4j
    exit
    
    docker restart infrahub-database-core3-1
    
    Step 4: Restore backup
    docker exec -it -u neo4j infrahub-database-core3-1 bash
    neo4j-admin database restore \
      --from-path=/var/lib/neo4j/neo4j-2025-03-02T12-00-00.backup neo4j
    
    Step 5: Get seed instance ID
    cypher-shell -d system -u neo4j
    SHOW SERVERS;
    
    Note the serverId for the target node.

    Step 6: Recreate database from seed
    CREATE DATABASE neo4j
    TOPOLOGY 3 PRIMARIES
    OPTIONS {
      existingData: 'use',
      existingDataSeedInstance: 'd05fce79-e63e-485a-9ce7-1abbf9d18fce'
    };
    
    Step 7: Verify cluster sync
    SHOW DATABASES;
    SHOW SERVERS;
    
    For detailed cluster procedures, see the backup guide.

    Backup automation

    Scheduled backups with cron

    Create a backup script:
    backup.sh
    #!/bin/bash
    set -e
    
    BACKUP_DIR="/backup/infrahub"
    DATE=$(date +%Y%m%d_%H%M%S)
    
    # Create backup directory
    mkdir -p "$BACKUP_DIR"
    
    # Run infrahub-backup
    /usr/local/bin/infrahub-backup create
    
    # Move backup to storage
    mv infrahub_backup_*.tar.gz "$BACKUP_DIR/infrahub_backup_$DATE.tar.gz"
    
    # Clean old backups (keep last 7 days)
    find "$BACKUP_DIR" -name "infrahub_backup_*.tar.gz" -mtime +7 -delete
    
    # Upload to S3 (optional)
    aws s3 sync "$BACKUP_DIR" s3://your-backup-bucket/infrahub/
    
    Schedule with cron:
    # Edit crontab
    crontab -e
    
    # Run daily at 2 AM
    0 2 * * * /usr/local/bin/backup.sh >> /var/log/infrahub-backup.log 2>&1
    

    Kubernetes CronJob

    Create a Kubernetes CronJob:
    backup-cronjob.yaml
    apiVersion: batch/v1
    kind: CronJob
    metadata:
      name: infrahub-backup
      namespace: infrahub
    spec:
      schedule: "0 2 * * *"  # Daily at 2 AM
      jobTemplate:
        spec:
          template:
            spec:
              containers:
              - name: backup
                image: registry.opsmill.io/opsmill/infrahub-backup:latest
                env:
                - name: BACKUP_DESTINATION
                  value: s3://your-backup-bucket/infrahub/
                volumeMounts:
                - name: backup-storage
                  mountPath: /backup
              volumes:
              - name: backup-storage
                persistentVolumeClaim:
                  claimName: backup-pvc
              restartPolicy: OnFailure
    
    Apply:
    kubectl apply -f backup-cronjob.yaml
    

    Backup retention

    Retention policy example

    • Hourly backups: Keep for 24 hours
    • Daily backups: Keep for 7 days
    • Weekly backups: Keep for 4 weeks
    • Monthly backups: Keep for 12 months
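
One way to feed those tiers is to hard-link each new backup into the matching tier directories as it arrives, then let a retention script prune each tier separately. The function below is a sketch, assuming GNU date and the infrahub_backup_YYYYMMDD_HHMMSS.tar.gz naming used earlier on this page:

```shell
# Hard-link one timestamped backup into hourly/daily/weekly/monthly tiers.
# Sketch only: assumes GNU date and archive names like
# infrahub_backup_YYYYMMDD_HHMMSS.tar.gz.
tier_backup() {
  file="$1"; root="$2"
  name=$(basename "$file")
  day=$(echo "$name" | sed -n 's/^infrahub_backup_\([0-9]\{8\}\)_[0-9]\{6\}\.tar\.gz$/\1/p')
  [ -n "$day" ] || { echo "unrecognized name: $name" >&2; return 1; }
  mkdir -p "$root/hourly" "$root/daily" "$root/weekly" "$root/monthly"

  # every backup lands in the hourly tier
  ln -f "$file" "$root/hourly/$name"

  # the first backup seen for a calendar day becomes the daily copy
  ls "$root/daily/infrahub_backup_${day}_"* >/dev/null 2>&1 || \
    ln "$file" "$root/daily/$name"

  # Mondays promote to weekly, the 1st of the month to monthly
  [ "$(date -d "$day" +%u)" = 1 ] && ln -f "$file" "$root/weekly/$name"
  [ "${day#??????}" = 01 ] && ln -f "$file" "$root/monthly/$name"
  return 0
}
```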

    Implement retention with a script

    retention.sh
    #!/bin/bash
    set -e
    BACKUP_DIR="/backup/infrahub"
    
    # Keep hourly backups for 1 day (-type f so the tier directories
    # themselves are never matched for deletion)
    find "$BACKUP_DIR/hourly" -type f -mtime +1 -delete
    
    # Keep daily backups for 7 days
    find "$BACKUP_DIR/daily" -type f -mtime +7 -delete
    
    # Keep weekly backups for 28 days
    find "$BACKUP_DIR/weekly" -type f -mtime +28 -delete
    
    # Keep monthly backups for 365 days
    find "$BACKUP_DIR/monthly" -type f -mtime +365 -delete
    

    Testing backups

    Verify backup integrity

    # Verify checksum
    sha256sum -c infrahub_backup_20250302_120000.tar.gz.sha256
    
    # Test archive extraction
    tar -tzf infrahub_backup_20250302_120000.tar.gz > /dev/null
    

    Test restoration

    Periodically test restoration in a separate environment:
    1. Deploy fresh Infrahub instance
    2. Restore from backup
    3. Verify data integrity
    4. Test API functionality
    5. Document any issues
