This guide covers backing up and restoring CVAT data for self-hosted deployments.
Source: site/content/en/docs/administration/community/advanced/backup_guide.md
## CVAT Data Volumes
CVAT uses Docker volumes to persist data. Understanding these volumes is essential for backup and restore operations.
These volumes are declared in the top-level `volumes` section of `docker-compose.yml` (lines 421-428 at the time of writing; the exact location varies between releases).
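The declaration looks roughly like the following sketch — treat it as illustrative and check your own compose file, since the list can differ between CVAT versions:

```yaml
volumes:
  cvat_db:
  cvat_data:
  cvat_keys:
  cvat_logs:
  cvat_inmem_db:
  cvat_events_db:
  cvat_cache_db:
```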
### Volume Descriptions
| Volume | Container | Mount Path | Contents |
|---|---|---|---|
| `cvat_db` | `cvat_db` | `/var/lib/postgresql/data` | PostgreSQL database (users, tasks, projects, annotations) |
| `cvat_data` | `cvat_server` | `/home/django/data` | Uploaded media files and prepared data |
| `cvat_keys` | `cvat_server` | `/home/django/keys` | Django secret key |
| `cvat_logs` | `cvat_server` | `/home/django/logs` | Backend process logs |
| `cvat_inmem_db` | `cvat_redis_inmem` | `/data` | Redis in-memory data (sessions, cache) |
| `cvat_events_db` | `cvat_clickhouse` | `/var/lib/clickhouse` | ClickHouse analytics database |
| `cvat_cache_db` | `cvat_redis_ondisk` | `/var/lib/kvrocks` | Kvrocks on-disk cache |
## Full Backup Procedure
### Step 1: Stop All CVAT Containers
Stop all containers to ensure data consistency:
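For a default installation started from the repository root:

```shell
docker compose stop
```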
If you're using additional compose files (e.g., `docker-compose.override.yml`):

```shell
docker compose -f docker-compose.yml -f docker-compose.override.yml stop
```
### Step 2: Create Backup Directory

```shell
mkdir -p backup
cd backup
```
### Step 3: Backup Core Data

Back up the three essential volumes:
#### Backup PostgreSQL Database

```shell
docker run --rm --name cvat_backup \
  --volumes-from cvat_db \
  -v $(pwd):/backup \
  ubuntu tar -czvf /backup/cvat_db.tar.gz /var/lib/postgresql/data
```
#### Backup Data Volume

```shell
docker run --rm --name cvat_backup \
  --volumes-from cvat_server \
  -v $(pwd):/backup \
  ubuntu tar -czvf /backup/cvat_data.tar.gz /home/django/data
```
#### Backup ClickHouse Analytics Database

```shell
docker run --rm --name cvat_backup \
  --volumes-from cvat_clickhouse \
  -v $(pwd):/backup \
  ubuntu tar -czvf /backup/cvat_events_db.tar.gz /var/lib/clickhouse
```
### Step 4: Verify Backups
Confirm all backup archives were created:
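From inside the `backup` directory created earlier:

```shell
ls
```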
Expected output:

```
cvat_data.tar.gz
cvat_db.tar.gz
cvat_events_db.tar.gz
```
### Step 5: Restart CVAT
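Bring the containers back up:

```shell
docker compose up -d
```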
## Optional Backups

Depending on your needs, you may also want to back up:
### Backup Django Keys

```shell
docker run --rm --name cvat_backup \
  --volumes-from cvat_server \
  -v $(pwd):/backup \
  ubuntu tar -czvf /backup/cvat_keys.tar.gz /home/django/keys
```
### Backup Logs

```shell
docker run --rm --name cvat_backup \
  --volumes-from cvat_server \
  -v $(pwd):/backup \
  ubuntu tar -czvf /backup/cvat_logs.tar.gz /home/django/logs
```
### Backup Redis Cache

```shell
docker run --rm --name cvat_backup \
  --volumes-from cvat_redis_ondisk \
  -v $(pwd):/backup \
  ubuntu tar -czvf /backup/cvat_cache_db.tar.gz /var/lib/kvrocks
```
## Automated Backup Script

Create a backup script (`backup_cvat.sh`):
```bash
#!/bin/bash
set -e

BACKUP_DIR="/path/to/backups/$(date +%Y%m%d_%H%M%S)"
mkdir -p "$BACKUP_DIR"

echo "Stopping CVAT containers..."
docker compose stop

echo "Backing up database..."
docker run --rm --volumes-from cvat_db \
  -v "$BACKUP_DIR":/backup \
  ubuntu tar -czvf /backup/cvat_db.tar.gz /var/lib/postgresql/data

echo "Backing up data..."
docker run --rm --volumes-from cvat_server \
  -v "$BACKUP_DIR":/backup \
  ubuntu tar -czvf /backup/cvat_data.tar.gz /home/django/data

echo "Backing up analytics..."
docker run --rm --volumes-from cvat_clickhouse \
  -v "$BACKUP_DIR":/backup \
  ubuntu tar -czvf /backup/cvat_events_db.tar.gz /var/lib/clickhouse

echo "Restarting CVAT..."
docker compose up -d

echo "Backup completed: $BACKUP_DIR"
ls -lh "$BACKUP_DIR"
```
Make it executable:
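```shell
chmod +x backup_cvat.sh
```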
### Schedule Automatic Backups
Add to crontab for daily backups at 2 AM:
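Open your crontab for editing:

```shell
crontab -e
```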
Add this line:

```
0 2 * * * /path/to/backup_cvat.sh >> /var/log/cvat_backup.log 2>&1
```
## Restore Procedure
Use the exact same CVAT version when restoring. Database schemas change between versions. After restoring, you can upgrade CVAT, which will migrate the database automatically.
### Step 1: Verify CVAT Installation
Ensure CVAT is installed and containers exist:
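A quick check (the `cvat` name prefix assumes the default compose project name):

```shell
docker ps -a | grep cvat
```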
If containers don't exist, create them first without starting them:

```shell
docker compose up --no-start
```

(`--no-start` creates the containers and volumes but leaves everything stopped; it cannot be combined with `-d`.)
### Step 2: Stop All Containers
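As in the backup procedure, stop everything before touching the volumes:

```shell
docker compose stop
```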
### Step 3: Restore Data from Backups

Navigate to your backup directory:

```shell
cd /path/to/backup/folder
```
#### Restore PostgreSQL Database

```shell
docker run --rm --name cvat_restore \
  --volumes-from cvat_db \
  -v $(pwd):/backup \
  ubuntu bash -c "cd /var/lib/postgresql/data && tar -xvf /backup/cvat_db.tar.gz --strip 4"
```
#### Restore Data Volume

```shell
docker run --rm --name cvat_restore \
  --volumes-from cvat_server \
  -v $(pwd):/backup \
  ubuntu bash -c "cd /home/django/data && tar -xvf /backup/cvat_data.tar.gz --strip 3"
```
#### Restore ClickHouse Database

```shell
docker run --rm --name cvat_restore \
  --volumes-from cvat_clickhouse \
  -v $(pwd):/backup \
  ubuntu bash -c "cd /var/lib/clickhouse && tar -xvf /backup/cvat_events_db.tar.gz --strip 3"
```
### Step 4: Start CVAT
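Start all services again:

```shell
docker compose up -d
```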
### Step 5: Verify Restoration

Check logs to ensure services started correctly:

```shell
docker compose logs -f cvat_server
```
Access the web interface and verify:
- Users can log in
- Projects and tasks are visible
- Media files load correctly
- Analytics data is available (if applicable)
## Database-Only Backup (PostgreSQL)
For more efficient database backups using PostgreSQL tools:
### Using pg_dump
#### Create Backup

```shell
docker exec cvat_db pg_dump -U root cvat | gzip > cvat_db_$(date +%Y%m%d_%H%M%S).sql.gz
```
#### Restore from Backup

The target database must already exist and should be empty (drop and recreate it first if needed):

```shell
gunzip -c cvat_db_20240315_020000.sql.gz | docker exec -i cvat_db psql -U root cvat
```
### Using pg_dumpall (includes users and roles)
#### Create Backup

```shell
docker exec cvat_db pg_dumpall -U root | gzip > cvat_db_all_$(date +%Y%m%d_%H%M%S).sql.gz
```
#### Restore from Backup

```shell
gunzip -c cvat_db_all_20240315_020000.sql.gz | docker exec -i cvat_db psql -U root postgres
```
## Backup to Remote Storage

### Amazon S3
```bash
#!/bin/bash
set -e  # abort on any failure so partial backups are never uploaded

BACKUP_NAME="cvat_backup_$(date +%Y%m%d_%H%M%S)"
BACKUP_DIR="/tmp/$BACKUP_NAME"
mkdir -p "$BACKUP_DIR"

# Create backups
docker compose stop
docker run --rm --volumes-from cvat_db -v "$BACKUP_DIR":/backup ubuntu tar -czf /backup/db.tar.gz /var/lib/postgresql/data
docker run --rm --volumes-from cvat_server -v "$BACKUP_DIR":/backup ubuntu tar -czf /backup/data.tar.gz /home/django/data
docker compose up -d

# Upload to S3
aws s3 sync "$BACKUP_DIR" "s3://your-bucket/cvat-backups/$BACKUP_NAME/"

# Cleanup local backup
rm -rf "$BACKUP_DIR"
```
### Rsync to Remote Server

```bash
BACKUP_DIR="/path/to/backups/$(date +%Y%m%d_%H%M%S)"
mkdir -p "$BACKUP_DIR"

# Create backups (as shown above)
# ...

# Sync to remote server
rsync -avz "$BACKUP_DIR" user@backup-server:/backups/cvat/
```
## Backup Retention Policy
Example script to keep only the last 7 daily backups:
```bash
#!/bin/bash
BACKUP_ROOT="/path/to/backups"

# -mindepth 1 prevents find from ever matching (and deleting) BACKUP_ROOT itself
find "$BACKUP_ROOT" -mindepth 1 -maxdepth 1 -type d -mtime +7 -exec rm -rf {} \;
```
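The retention logic can be tried out safely in a scratch directory before pointing it at real backups (the timestamped directory names below are made up for the demonstration):

```shell
# Scratch directory standing in for the real backup root
BACKUP_ROOT=$(mktemp -d)
mkdir -p "$BACKUP_ROOT/20240101_020000" "$BACKUP_ROOT/20240314_020000"

# Make one directory look 10 days old (GNU touch)
touch -d "10 days ago" "$BACKUP_ROOT/20240101_020000"

# Same expression as the cleanup script: delete dirs older than 7 days
find "$BACKUP_ROOT" -mindepth 1 -maxdepth 1 -type d -mtime +7 -exec rm -rf {} \;

ls "$BACKUP_ROOT"
```

Only the recent directory should survive.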
Add to crontab to run daily:

```
0 3 * * * /path/to/cleanup_backups.sh
```
## Disaster Recovery Checklist

- **Document your setup**: Keep notes on CVAT version, customizations, and configuration
- **Test restores regularly**: Verify backups work before you need them
- **Store backups offsite**: Use remote storage for critical data
- **Backup configuration files**: Include `docker-compose.yml`, `.env`, and custom settings
- **Version compatibility**: Match the CVAT version during restore
- **SSL certificates**: Back up certificates if using HTTPS
- **Custom integrations**: Document any custom scripts or integrations
## Troubleshooting

### Backup Issues

**Issue**: Permission denied during backup

**Solution**: Ensure the backup directory exists and is writable:
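One common fix is to create the directory yourself before running the backup container (the path and the blunt `777` mode are illustrative; `chown`-ing it to the user running Docker is tidier):

```shell
# Create the backup target up front and make it writable
mkdir -p backup
chmod 777 backup
```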
**Issue**: Container not found

**Solution**: Check the container names:
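```shell
docker ps -a --format '{{.Names}}' | grep cvat
```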
### Restore Issues
**Issue**: Database restore fails with version mismatch

**Solution**: Use the same CVAT version as the backup, then upgrade:

```shell
export CVAT_VERSION=v2.x.x  # Match your backup version
docker compose up -d
```
**Issue**: Data directory not empty

**Solution**: Remove and recreate the volume before restoring. Note that Docker Compose prefixes volume names with the project name, so the actual volume may be called `cvat_cvat_data`; check with `docker volume ls`:

```shell
docker volume rm cvat_data
docker volume create cvat_data
```
## Additional Resources