Overview

S2 Lite stores all data in object storage through SlateDB. This guide covers backup strategies, disaster recovery, and data migration approaches.

Understanding S2 Lite Storage

Storage Architecture

S2 Lite uses SlateDB which stores data entirely in object storage:
s3://bucket/path/
├── manifest/          # Database manifest files
├── wal/              # Write-ahead logs
├── compacted/        # Compacted data files
└── sst/              # Sorted string tables
All data is organized under the --path prefix you specify when running S2 Lite.

Data Model

SlateDB organizes S2 data using these key prefixes:
  • Basin metadata: /basin/{name}/meta
  • Stream metadata: /stream/{basin}/{stream}/meta
  • Stream records: /stream/{basin}/{stream}/records/{seq}
  • Stream tail positions: /stream/{basin}/{stream}/tail
  • Fencing tokens: /stream/{basin}/{stream}/fence
See lite/src/backend/kv/mod.rs for the complete data model.
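These prefixes can be composed mechanically. A small illustrative sketch (the function names and plain string formatting are assumptions for illustration; the exact encoding lives in lite/src/backend/kv/mod.rs):

```shell
# Hypothetical helpers mirroring the key layout listed above; the real
# encoding is defined in lite/src/backend/kv/mod.rs.
stream_meta_key()   { printf '/stream/%s/%s/meta\n'       "$1" "$2"; }
stream_record_key() { printf '/stream/%s/%s/records/%s\n' "$1" "$2" "$3"; }

stream_record_key production events 42
# → /stream/production/events/records/42
```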

Backup Strategies

Strategy 1: Object Storage Native Backups

Recommended for most deployments. Leverage your object storage provider’s built-in backup features.

AWS S3 Versioning + Lifecycle

1. Enable S3 versioning

aws s3api put-bucket-versioning \
  --bucket my-s2-bucket \
  --versioning-configuration Status=Enabled
This preserves all versions of objects, protecting against accidental deletions.

2. Configure lifecycle policy

{
  "Rules": [
    {
      "Id": "ArchiveOldVersions",
      "Status": "Enabled",
      "NoncurrentVersionTransitions": [
        {
          "NoncurrentDays": 30,
          "StorageClass": "GLACIER_IR"
        },
        {
          "NoncurrentDays": 90,
          "StorageClass": "DEEP_ARCHIVE"
        }
      ],
      "NoncurrentVersionExpiration": {
        "NoncurrentDays": 365
      }
    }
  ]
}
aws s3api put-bucket-lifecycle-configuration \
  --bucket my-s2-bucket \
  --lifecycle-configuration file://lifecycle.json

3. Test recovery

List object versions:
aws s3api list-object-versions \
  --bucket my-s2-bucket \
  --prefix s2lite/
Restore a specific version:
aws s3api copy-object \
  --bucket my-s2-bucket \
  --copy-source my-s2-bucket/s2lite/manifest/MANIFEST-00001 \
  --key s2lite/manifest/MANIFEST-00001 \
  --version-id VERSION_ID

S3 Replication (Cross-Region)

1. Create destination bucket

aws s3 mb s3://my-s2-backup-bucket --region us-west-2
aws s3api put-bucket-versioning \
  --bucket my-s2-backup-bucket \
  --versioning-configuration Status=Enabled

2. Create replication IAM role

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetReplicationConfiguration",
        "s3:ListBucket"
      ],
      "Resource": "arn:aws:s3:::my-s2-bucket"
    },
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObjectVersionForReplication",
        "s3:GetObjectVersionAcl"
      ],
      "Resource": "arn:aws:s3:::my-s2-bucket/*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "s3:ReplicateObject",
        "s3:ReplicateDelete"
      ],
      "Resource": "arn:aws:s3:::my-s2-backup-bucket/*"
    }
  ]
}

3. Configure replication

{
  "Role": "arn:aws:iam::ACCOUNT:role/S2ReplicationRole",
  "Rules": [
    {
      "Status": "Enabled",
      "Priority": 1,
      "Filter": {
        "Prefix": "s2lite/"
      },
      "Destination": {
        "Bucket": "arn:aws:s3:::my-s2-backup-bucket",
        "ReplicationTime": {
          "Status": "Enabled",
          "Time": {
            "Minutes": 15
          }
        },
        "Metrics": {
          "Status": "Enabled"
        }
      },
      "DeleteMarkerReplication": {
        "Status": "Enabled"
      }
    }
  ]
}
aws s3api put-bucket-replication \
  --bucket my-s2-bucket \
  --replication-configuration file://replication.json

Cloudflare R2 Replication

Cloudflare R2 doesn’t support automatic cross-bucket replication yet, but you can schedule periodic copies with a tool like rclone:
# Create backup using rclone
rclone sync r2:my-s2-bucket/s2lite r2:my-s2-backup/s2lite

Tigris Automatic Backups

Tigris provides automatic multi-region replication and point-in-time recovery. No additional configuration needed.

Strategy 2: Snapshot-Based Backups

Create point-in-time snapshots by copying the entire S2 Lite path. A copy taken while writes are in flight may be internally inconsistent, so run it during a quiet period or pause writes first:

1. Create snapshot script

#!/bin/bash
# s2-backup.sh

BUCKET="my-s2-bucket"
SOURCE_PATH="s2lite"
BACKUP_PATH="backups/s2lite-$(date +%Y%m%d-%H%M%S)"

echo "Creating snapshot: s3://${BUCKET}/${BACKUP_PATH}"

aws s3 sync \
  s3://${BUCKET}/${SOURCE_PATH}/ \
  s3://${BUCKET}/${BACKUP_PATH}/ \
  --storage-class GLACIER_IR

echo "Snapshot complete"
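
The snapshot names above embed a sortable timestamp; a tiny helper (matching that naming scheme, with the function name assumed for illustration) extracts it so retention jobs can compare ages later:

```shell
# Extract the YYYYmmdd-HHMMSS portion from a snapshot name such as
# "s2lite-20260303-020000" (naming scheme from the script above).
snapshot_timestamp() { echo "${1#s2lite-}"; }

snapshot_timestamp "s2lite-20260303-020000"
# → 20260303-020000
```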

2. Schedule with CronJob

apiVersion: batch/v1
kind: CronJob
metadata:
  name: s2-lite-backup
  namespace: s2-system
spec:
  schedule: "0 2 * * *"  # 2 AM daily
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: s2-backup
          containers:
          - name: backup
            image: amazon/aws-cli:latest
            env:
            - name: BUCKET
              value: my-s2-bucket
            - name: SOURCE_PATH
              value: s2lite
            command:
            - /bin/bash
            - -c
            - |
              BACKUP_PATH="backups/s2lite-$(date +%Y%m%d-%H%M%S)"
              echo "Creating snapshot: s3://${BUCKET}/${BACKUP_PATH}"
              aws s3 sync \
                s3://${BUCKET}/${SOURCE_PATH}/ \
                s3://${BUCKET}/${BACKUP_PATH}/ \
                --storage-class GLACIER_IR
              echo "Snapshot complete"
          restartPolicy: OnFailure

3. Create IAM role for backup job

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:ListBucket",
        "s3:GetObject",
        "s3:PutObject"
      ],
      "Resource": [
        "arn:aws:s3:::my-s2-bucket",
        "arn:aws:s3:::my-s2-bucket/*"
      ]
    }
  ]
}

Strategy 3: Stream-Level Backups

Export individual streams to separate storage:

1. Export stream to file

# Export all records from a stream
s2 read s2://basin/stream --format jsonl > stream-backup.jsonl

# Compress for storage
gzip stream-backup.jsonl

# Upload to backup location
aws s3 cp stream-backup.jsonl.gz s3://backups/streams/
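
Before deleting a local export, it can be worth checking that the compressed copy round-trips byte-for-byte. A minimal sketch with synthetic records (file names are placeholders):

```shell
# Build a tiny synthetic export, compress it, and verify the gzip copy
# decompresses to identical bytes before trusting it as a backup.
printf '{"seq":1,"body":"a"}\n{"seq":2,"body":"b"}\n' > sample.jsonl
gzip -kf sample.jsonl                 # -k keeps the original for comparison
gzip -dc sample.jsonl.gz | cmp -s - sample.jsonl && echo "backup intact"
rm -f sample.jsonl sample.jsonl.gz
# → backup intact
```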

2. Automated export script

#!/bin/bash
# export-streams.sh

BASIN="production"
BACKUP_BUCKET="s3://backup-bucket/streams"
DATE=$(date +%Y%m%d)

# List all streams in basin
STREAMS=$(s2 list-streams ${BASIN} --format json | jq -r '.[]')

for stream in $STREAMS; do
  echo "Exporting ${BASIN}/${stream}..."
  
  s2 read s2://${BASIN}/${stream} --format jsonl | \
    gzip > ${stream}-${DATE}.jsonl.gz
  
  aws s3 cp ${stream}-${DATE}.jsonl.gz \
    ${BACKUP_BUCKET}/${BASIN}/${stream}/${DATE}/
  
  rm ${stream}-${DATE}.jsonl.gz
done
Stream-level backups require reading all data through S2 Lite, which may impact performance and incur costs.

Restore Procedures

Restore from Object Storage Backup

Full Restore

1. Stop S2 Lite

kubectl scale deployment my-s2-lite --replicas=0 -n s2-system

2. Restore from backup

# From S3 versioning
aws s3api list-object-versions \
  --bucket my-s2-bucket \
  --prefix s2lite/ \
  --query 'Versions[?IsLatest==`false`]'

# Restore specific version (if needed)
aws s3api copy-object \
  --copy-source my-s2-bucket/s2lite/manifest/MANIFEST \
  --bucket my-s2-bucket \
  --key s2lite/manifest/MANIFEST \
  --version-id VERSION_ID

# Or restore from snapshot
aws s3 sync \
  s3://my-s2-bucket/backups/s2lite-20260303-020000/ \
  s3://my-s2-bucket/s2lite/ \
  --delete

3. Start S2 Lite

kubectl scale deployment my-s2-lite --replicas=1 -n s2-system

4. Verify data

# List basins
s2 list-basins

# Check specific streams
s2 read s2://basin/stream --limit 10

Point-in-Time Recovery

Recover to a specific point in time using S3 versioning:
# List versions with timestamps
aws s3api list-object-versions \
  --bucket my-s2-bucket \
  --prefix s2lite/ \
  --query 'Versions[?LastModified<=`2026-03-03T10:00:00.000Z`]' \
  --output json > versions-to-restore.json

# Restore the newest version of each key at or before the cutoff
# (the listing holds several versions per key; group by key first)
jq -r 'group_by(.Key) | map(max_by(.LastModified)) | .[] | .VersionId + " " + .Key' \
  versions-to-restore.json | \
while read -r version_id key; do
  aws s3api copy-object \
    --copy-source "my-s2-bucket/${key}?versionId=${version_id}" \
    --bucket my-s2-bucket \
    --key "${key}"
done
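
When restoring, only the newest version of each key at or before the cutoff should win; a pure-shell sketch of that selection rule on synthetic "timestamp version-id key" lines:

```shell
# Sort by key, newest timestamp first, then keep the first line seen per
# key — i.e. the latest version of each object within the cutoff window.
printf '%s\n' \
  '2026-03-01T00:00:00Z v1 s2lite/manifest/MANIFEST' \
  '2026-03-02T00:00:00Z v2 s2lite/manifest/MANIFEST' \
  '2026-03-01T12:00:00Z v7 s2lite/wal/000001.log' |
  sort -k3,3 -k1,1r | awk '!seen[$3]++'
# → 2026-03-02T00:00:00Z v2 s2lite/manifest/MANIFEST
# → 2026-03-01T12:00:00Z v7 s2lite/wal/000001.log
```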

Restore Individual Streams

1. Import stream from backup

# Download backup
aws s3 cp s3://backup-bucket/streams/basin/stream/20260303/stream.jsonl.gz .
gunzip stream.jsonl.gz

2. Create stream

s2 create-stream basin stream

3. Append records

cat stream.jsonl | s2 append s2://basin/stream
This restores data but not the original sequence numbers. Use object storage backups for exact recovery.

Cross-Region Failover

Switch to a replica bucket in another region:

1. Update S2 Lite configuration

# Updated values.yaml
objectStorage:
  enabled: true
  bucket: my-s2-backup-bucket  # Replica bucket
  path: s2lite

env:
  - name: AWS_REGION
    value: us-west-2  # Backup region

2. Upgrade deployment

helm upgrade my-s2-lite s2/s2-lite-helm \
  -f values.yaml \
  -n s2-system

3. Verify failover

# Check health
kubectl get pods -n s2-system
curl http://s2-lite-endpoint/health

# Verify data
s2 list-basins

Data Migration

Migrate Between Object Stores

Move S2 Lite data from one object storage provider to another:

1. Stop S2 Lite

kubectl scale deployment my-s2-lite --replicas=0 -n s2-system

2. Copy data to new bucket

# S3 to S3 (different regions)
aws s3 sync \
  s3://old-bucket/s2lite/ \
  s3://new-bucket/s2lite/

# S3 to R2 using rclone
rclone copy \
  s3:old-bucket/s2lite \
  r2:new-bucket/s2lite

# S3 to Tigris
AWS_ENDPOINT_URL_S3=https://fly.storage.tigris.dev \
aws s3 sync \
  s3://old-bucket/s2lite/ \
  s3://tigris-bucket/s2lite/

3. Update S2 Lite configuration

objectStorage:
  enabled: true
  bucket: new-bucket
  path: s2lite
  endpoint: https://new-endpoint  # If applicable

4. Start S2 Lite with new bucket

helm upgrade my-s2-lite s2/s2-lite-helm -f values.yaml -n s2-system

5. Verify migration

s2 list-basins
s2 read s2://basin/stream --limit 10

Blue-Green Migration

Zero-downtime migration strategy:

1. Copy data to new bucket

While S2 Lite is running:
aws s3 sync s3://old-bucket/s2lite/ s3://new-bucket/s2lite/
Records written after this sync won’t be in the new bucket; schedule a final incremental sync during a brief write pause before switching traffic.

2. Deploy new S2 Lite (green)

helm install s2-lite-green s2/s2-lite-helm \
  --set objectStorage.bucket=new-bucket \
  --set service.port=8081 \
  -n s2-system

3. Switch traffic

Update DNS or load balancer to point to green deployment.

4. Cleanup old deployment

helm uninstall my-s2-lite -n s2-system

Disaster Recovery Plan

RTO and RPO Targets

Strategy                   RTO       RPO       Cost
S3 Versioning              Minutes   Seconds   Low
Cross-Region Replication   Minutes   15 mins   Medium
Snapshot Backups           Hours     1 day     Low
Stream Exports             Hours     1 day     High

DR Checklist

1. Document configuration

  • Record bucket names, regions, endpoints
  • Save Helm values files in version control
  • Document IAM roles and policies
  • List all basins and critical streams

2. Enable backups

  • Enable S3 versioning
  • Configure lifecycle policies
  • Set up cross-region replication (critical deployments)
  • Schedule snapshot backups

3. Test recovery procedures

  • Perform quarterly restore tests
  • Validate backup integrity
  • Measure actual RTO/RPO
  • Update runbooks based on results

4. Monitor backup status

  • Set up alerts for replication lag
  • Monitor backup job failures
  • Track backup storage costs

Emergency Recovery Runbook

  1. Assess the situation
    • Identify scope of data loss
    • Determine last known good state
    • Choose recovery strategy
  2. Stop S2 Lite
    kubectl scale deployment my-s2-lite --replicas=0 -n s2-system
    
  3. Restore data (choose one)
    • S3 versioning: Restore specific versions
    • Replication: Switch to replica bucket
    • Snapshot: Sync from backup path
  4. Restart S2 Lite
    kubectl scale deployment my-s2-lite --replicas=1 -n s2-system
    
  5. Verify recovery
    • Check health endpoint
    • List basins and streams
    • Validate critical data
    • Test write operations
  6. Document incident
    • Record timeline
    • Note data loss (if any)
    • Update procedures

Cost Optimization

Storage Class Strategies

{
  "Rules": [
    {
      "Id": "TransitionOldData",
      "Status": "Enabled",
      "Filter": {
        "Prefix": "s2lite/"
      },
      "Transitions": [
        {
          "Days": 30,
          "StorageClass": "STANDARD_IA"
        },
        {
          "Days": 90,
          "StorageClass": "GLACIER_IR"
        }
      ]
    }
  ]
}
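
Apply this policy with aws s3api put-bucket-lifecycle-configuration, as shown earlier. To gauge the payoff, a back-of-envelope sketch; the per-GiB prices below are placeholder assumptions, not quoted S3 rates — check your provider’s current pricing:

```shell
# Hypothetical monthly cost of 1024 GiB at assumed per-GiB prices
# ($0.023 STANDARD vs $0.004 GLACIER_IR; real rates vary by region).
awk 'BEGIN {
  gib = 1024
  printf "STANDARD:   $%.2f/mo\n", gib * 0.023
  printf "GLACIER_IR: $%.2f/mo\n", gib * 0.004
}'
# → STANDARD:   $23.55/mo
# → GLACIER_IR: $4.10/mo
```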

Backup Retention

# Delete snapshots older than 90 days (uses GNU date)
cutoff=$(date -d '90 days ago' +%Y%m%d)
aws s3 ls s3://my-s2-bucket/backups/ | awk '{print $2}' | \
while read -r backup_path; do
  [ -n "${backup_path}" ] || continue
  backup_path=${backup_path%/}   # strip trailing slash from "PRE" listings
  backup_date=$(echo "${backup_path}" | cut -d'-' -f2)
  if [[ ${backup_date} < ${cutoff} ]]; then
    aws s3 rm "s3://my-s2-bucket/backups/${backup_path}/" --recursive
  fi
done

Next Steps

Production Deployment

Review production deployment best practices

S3 Setup

Configure object storage providers
