Backup Strategies
Qdrant provides multiple approaches to backing up your data:
- Full snapshots - Complete backup of all collections and system state
- Collection snapshots - Individual collection backups
- Shard snapshots - Granular backups at the shard level
Full Snapshots
Full snapshots capture the entire Qdrant instance including all collections and collection aliases.
Creating a Full Snapshot
curl -X POST http://localhost:6333/snapshots
from qdrant_client import QdrantClient
client = QdrantClient("localhost", port=6333)
snapshot_info = client.create_full_snapshot()
print(snapshot_info.name)
Asynchronous Creation
For large datasets, create snapshots asynchronously:
curl -X POST "http://localhost:6333/snapshots?wait=false"
This returns immediately while the snapshot is created in the background.
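Because the call returns before the snapshot exists, a client usually needs some way to learn when it is ready. The following is a minimal polling sketch, not an official API: `wait_for_new_snapshot`, `list_names`, and the timeout values are hypothetical names chosen for illustration, and it assumes a snapshot shows up in the listing only once it is complete, which is worth verifying for your Qdrant version.

```python
import time
from typing import Callable, Iterable, Optional, Set


def wait_for_new_snapshot(
    list_names: Callable[[], Iterable[str]],
    known: Set[str],
    timeout_s: float = 600.0,
    poll_interval_s: float = 5.0,
) -> Optional[str]:
    """Poll until a snapshot name that is not in `known` appears.

    `list_names` is any callable returning the current snapshot names
    (hypothetical helper, e.g. a wrapper around the listing endpoint).
    Returns the new name, or None if the timeout expires first.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        new_names = sorted(set(list_names()) - known)
        if new_names:
            # If several appeared at once, return the lexicographically last one.
            return new_names[-1]
        if time.monotonic() >= deadline:
            return None
        time.sleep(poll_interval_s)
```

One possible usage is to record the set of snapshot names before issuing the `wait=false` request, then pass that set as `known`.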
Listing Full Snapshots
curl http://localhost:6333/snapshots
Response:
{
"result": [
{
"name": "full-snapshot-2024-03-04-12-00-00.snapshot",
"creation_time": "2024-03-04T12:00:00Z",
"size": 1048576000
}
],
"status": "ok",
"time": 0.001
}
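A backup job often needs only the most recent entry from this listing. As a rough sketch, the response body can be parsed as shown here; `newest_snapshot_name` is a hypothetical helper, and it assumes every entry carries a `creation_time` in the ISO format shown in the sample response.

```python
import json
from datetime import datetime
from typing import Optional


def newest_snapshot_name(response_text: str) -> Optional[str]:
    """Return the name of the most recently created snapshot, or None if there are none."""
    snapshots = json.loads(response_text).get("result") or []
    if not snapshots:
        return None

    def created(entry: dict) -> datetime:
        # Older Python versions reject a trailing "Z" in fromisoformat(),
        # so it is mapped to an explicit UTC offset first.
        return datetime.fromisoformat(entry["creation_time"].replace("Z", "+00:00"))

    return max(snapshots, key=created)["name"]
```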
Downloading Full Snapshots
curl http://localhost:6333/snapshots/{snapshot_name} \
--output backup.snapshot
Snapshot Storage Configuration
Configure snapshot storage location:
storage:
snapshots_path: ./snapshots
snapshots_config:
snapshots_storage: local # or "s3"
# For S3 storage:
# s3_config:
# bucket: "my-qdrant-backups"
# region: "us-east-1"
# access_key: "AKIAIOSFODNN7EXAMPLE"
# secret_key: "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"
Collection Snapshots
Create backups for individual collections.
Creating a Collection Snapshot
curl -X POST http://localhost:6333/collections/{collection_name}/snapshots
snapshot_info = client.create_snapshot(
collection_name="my_collection"
)
print(snapshot_info.name)
Listing Collection Snapshots
curl http://localhost:6333/collections/{collection_name}/snapshots
Downloading Collection Snapshots
curl http://localhost:6333/collections/{collection_name}/snapshots/{snapshot_name} \
--output collection-backup.snapshot
Deleting Collection Snapshots
curl -X DELETE \
http://localhost:6333/collections/{collection_name}/snapshots/{snapshot_name}
Restore Procedures
Restoring from Full Snapshot
Restore a complete Qdrant instance from a full snapshot.
Using Command Line
Restore during startup:
./qdrant --snapshot /path/to/full-snapshot.snapshot
With force option (overwrites existing collections):
./qdrant --snapshot /path/to/full-snapshot.snapshot --force-snapshot
Using --force-snapshot will overwrite existing collections and aliases. Use with caution.
Restore Process
The restore process:
- Unpacks the snapshot to a temporary directory
- Reads the snapshot configuration mapping
- Restores each collection from its snapshot file
- Recreates collection aliases
- Removes temporary files
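To make the sequence of steps concrete, here is a toy illustration in Python. It is not Qdrant's implementation and does not read Qdrant's real snapshot format: the archive layout, the `manifest.json` file, and the `restore_toy_snapshot` function are all hypothetical, chosen only to mirror the five steps listed above.

```python
import json
import shutil
import tarfile
import tempfile
from pathlib import Path
from typing import Dict


def restore_toy_snapshot(archive: Path, storage_dir: Path) -> Dict[str, str]:
    """Restore a toy archive and return its alias -> collection mapping.

    Hypothetical layout: a tar file holding "manifest.json" plus one file
    per collection. Only use extractall() on archives you trust.
    """
    storage_dir.mkdir(parents=True, exist_ok=True)
    # Step 1: unpack the snapshot to a temporary directory.
    with tempfile.TemporaryDirectory() as tmp:
        with tarfile.open(archive) as tar:
            tar.extractall(tmp)
        # Step 2: read the mapping of collections and aliases.
        manifest = json.loads((Path(tmp) / "manifest.json").read_text())
        # Step 3: restore each collection from its own file.
        for name, filename in manifest["collections"].items():
            target = storage_dir / "collections" / name
            target.mkdir(parents=True, exist_ok=True)
            shutil.copy(Path(tmp) / filename, target / filename)
        # Step 4: recreate the aliases.
        aliases = dict(manifest.get("aliases", {}))
        (storage_dir / "aliases.json").write_text(json.dumps(aliases))
    # Step 5: the temporary directory is removed when the "with" block exits.
    return aliases
```

The temporary directory is the reason a restore can need noticeably more free disk space than the final data occupies.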
Restoring from Collection Snapshot
Restore individual collections while Qdrant is running.
Upload and Recover
# Upload snapshot file
curl -X POST \
-F "snapshot=@/path/to/collection-backup.snapshot" \
http://localhost:6333/collections/{collection_name}/snapshots/upload
# Python client: recover a collection from a snapshot location (URL)
client.recover_snapshot(
collection_name="my_collection",
location="http://example.com/snapshot.snapshot"
)
Recover from URL
Restore from a remote snapshot:
curl -X PUT http://localhost:6333/collections/{collection_name}/snapshots/recover \
-H 'Content-Type: application/json' \
-d '{
"location": "http://example.com/snapshots/backup.snapshot"
}'
With S3:
curl -X PUT http://localhost:6333/collections/{collection_name}/snapshots/recover \
-H 'Content-Type: application/json' \
-d '{
"location": "s3://my-bucket/backup.snapshot",
"api_key": "your-s3-access-key"
}'
Checksum Verification
Verify snapshot integrity during recovery:
curl -X POST \
-F "snapshot=@/path/to/collection-backup.snapshot" \
"http://localhost:6333/collections/{collection_name}/snapshots/upload?checksum=sha256_hash_here"
The recovery will fail if the checksum doesn’t match, preventing corrupted data from being restored.
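The value passed as `checksum` has to be computed from the snapshot file beforehand. A minimal sketch using only the Python standard library follows; `sha256_of_file` is a hypothetical helper name, and the file is read in chunks so that large snapshots do not need to fit in memory.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Return the hex-encoded SHA-256 digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

The returned string can be substituted for `sha256_hash_here` in the upload URL.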
Recovery Priority
Control snapshot recovery priority:
{
"location": "http://example.com/snapshot.snapshot",
"priority": "snapshot"
}
Priority options:
- snapshot - Prefer using the snapshot for recovery
- replica - Prefer using existing replicas
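The recover request can also be assembled programmatically. The sketch below builds, but does not send, the `PUT` request with the standard library; `build_recover_request` and `VALID_PRIORITIES` are hypothetical names, and the set of accepted priorities is limited to the two options listed in this section.

```python
import json
import urllib.request
from typing import Optional
from urllib.parse import quote

# Only the two options described in this section (assumption for this sketch).
VALID_PRIORITIES = {"snapshot", "replica"}


def build_recover_request(
    base_url: str,
    collection: str,
    location: str,
    priority: Optional[str] = None,
) -> urllib.request.Request:
    """Build the PUT request for the collection snapshot recover endpoint."""
    if priority is not None and priority not in VALID_PRIORITIES:
        raise ValueError(f"unsupported priority: {priority!r}")
    body = {"location": location}
    if priority is not None:
        body["priority"] = priority
    url = f"{base_url.rstrip('/')}/collections/{quote(collection, safe='')}/snapshots/recover"
    return urllib.request.Request(
        url=url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it; validating the priority locally simply surfaces typos before the request reaches the server.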
Data Consistency
Write-Ahead Log (WAL)
Qdrant uses WAL to ensure data consistency:
storage:
wal:
wal_capacity_mb: 32
wal_segments_ahead: 0
- wal_capacity_mb: Size of each WAL segment
- wal_segments_ahead: Number of segments to pre-allocate
Snapshot Consistency
Snapshots are consistent at the time of creation:
- Collection snapshots: Point-in-time consistency per collection
- Full snapshots: Consistent across all collections at snapshot time
- Shard snapshots: Consistent at shard level
Replication for High Availability
Combine snapshots with replication:
storage:
collection:
replication_factor: 3
write_consistency_factor: 2
This ensures:
- Data is replicated across multiple nodes
- Writes are confirmed by multiple replicas
- Snapshots can be taken from any replica
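The relationship between the two settings can be worked through with a little arithmetic. This sketch assumes a write is acknowledged once `write_consistency_factor` replicas confirm it, so the number of replicas that can be unavailable while writes still succeed is the difference between the two values; `tolerated_write_failures` is a hypothetical helper.

```python
def tolerated_write_failures(replication_factor: int, write_consistency_factor: int) -> int:
    """Number of replicas that may be down while writes can still be confirmed."""
    if not 1 <= write_consistency_factor <= replication_factor:
        raise ValueError("write_consistency_factor must be between 1 and replication_factor")
    return replication_factor - write_consistency_factor


# With the configuration above: 3 replicas, 2 confirmations required.
print(tolerated_write_failures(3, 2))  # → 1
```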
Backup Best Practices
Regular Schedule
Automate snapshot creation on a regular schedule using cron or similar tools.
Off-Site Storage
Store snapshots in a different location or cloud storage for disaster recovery.
Test Restores
Regularly test your restore procedures to ensure backups are valid and complete.
Monitor Space
Ensure sufficient disk space in snapshot directories, especially for large collections.
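A pre-flight check before creating a snapshot might look like the following sketch. The `has_free_space` helper and its `safety_factor` default are assumptions made for illustration; the factor adds headroom in case temporary files and the finished snapshot coexist on the same volume.

```python
import shutil


def has_free_space(path: str, required_bytes: int, safety_factor: float = 2.0) -> bool:
    """Return True if the volume holding `path` has enough free space.

    `safety_factor` is a hypothetical headroom multiplier, not a Qdrant setting.
    """
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= required_bytes * safety_factor
```

A reasonable estimate for `required_bytes` is the `size` reported for a previous snapshot of the same data.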
Automated Backup Script
Example backup automation:
#!/bin/bash
# Configuration
QDRANT_HOST="http://localhost:6333"
BACKUP_DIR="/backup/qdrant"
DATE=$(date +%Y-%m-%d-%H-%M-%S)
RETENTION_DAYS=7
# Create snapshot
echo "Creating full snapshot..."
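# -f makes curl fail on HTTP errors instead of returning the error body
RESPONSE=$(curl -sf -X POST "${QDRANT_HOST}/snapshots")
# "// empty" turns a missing or null name into an empty string, so the check below catches it
SNAPSHOT_NAME=$(echo "$RESPONSE" | jq -r '.result.name // empty')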
if [ -z "$SNAPSHOT_NAME" ]; then
echo "Failed to create snapshot"
exit 1
fi
echo "Snapshot created: $SNAPSHOT_NAME"
# Download snapshot
echo "Downloading snapshot..."
# -f makes curl exit non-zero on HTTP errors, so the exit-status check below is meaningful
curl -sf "${QDRANT_HOST}/snapshots/${SNAPSHOT_NAME}" \
-o "${BACKUP_DIR}/${DATE}-${SNAPSHOT_NAME}"
if [ $? -eq 0 ]; then
echo "Backup completed: ${DATE}-${SNAPSHOT_NAME}"
# Upload to S3 (optional)
aws s3 cp "${BACKUP_DIR}/${DATE}-${SNAPSHOT_NAME}" \
"s3://my-qdrant-backups/${DATE}-${SNAPSHOT_NAME}"
else
echo "Backup failed"
exit 1
fi
# Cleanup old backups
find "${BACKUP_DIR}" -name "*.snapshot" -mtime +${RETENTION_DAYS} -delete
echo "Backup process completed"
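The retention step of the script can also be expressed in Python, which makes it easier to test before anything is deleted. This is a sketch under a few assumptions: `expired_snapshots` is a hypothetical helper, it looks only at the top level of the directory (unlike `find`, which recurses), and it uses an exact cutoff of `retention_days` times 24 hours, which differs slightly from how `find -mtime` rounds file ages.

```python
import time
from pathlib import Path
from typing import List, Optional


def expired_snapshots(
    backup_dir: Path,
    retention_days: int,
    now: Optional[float] = None,
) -> List[Path]:
    """Return *.snapshot files in `backup_dir` modified before the retention cutoff."""
    current = time.time() if now is None else now
    cutoff = current - retention_days * 86400  # seconds per day
    return sorted(
        path
        for path in backup_dir.glob("*.snapshot")
        if path.is_file() and path.stat().st_mtime < cutoff
    )
```

Returning the list instead of deleting immediately lets a job log the candidates first and call `unlink()` on each path only after review.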
Listener Nodes
Dedicated backup nodes can be configured as listeners:
storage:
node_type: "Listener"
Listener nodes:
- Receive all updates from the cluster
- Do not serve search or read queries
- Ideal for dedicated backup operations
- Reduce load on primary query-serving nodes
Storage Path Configuration
Configure storage paths:
storage:
storage_path: ./storage
snapshots_path: ./snapshots
temp_path: null # Uses storage/snapshots_temp/ if null
Ensure the temp directory has sufficient space for snapshot creation operations.
Troubleshooting
Snapshot Creation Fails
Check:
- Disk space in snapshots_path
- Permissions on snapshot directory
- Collection state (must be accessible)
Restore Fails
Verify:
- Snapshot file is not corrupted (use checksum)
- Sufficient disk space in storage_path
- Qdrant version compatibility
- No permission issues
Slow Snapshot Operations
Optimize by:
- Using wait=false for asynchronous operations
- Scheduling snapshots during low-traffic periods
- Pointing temp_path at faster storage (SSD)
- Enabling compression for remote transfers