
Overview

Open Wearables uses PostgreSQL as its primary database and Alembic for schema migrations. This guide covers database setup, migration management, backup strategies, and troubleshooting.

Database Requirements

PostgreSQL Version: 18+ (recommended), 15+ (minimum)

Extensions Required:
  • uuid-ossp - UUID generation functions (auto-enabled)
Recommended Configuration:
  • Minimum RAM: 2GB
  • Minimum Storage: 20GB (varies with data volume)
  • Timezone: UTC (strongly recommended)
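Once connected, you can confirm the server meets the minimum version. A small sketch (the function names are ours; it only parses the text returned by `SELECT version();`):

```python
import re

MIN_MAJOR = 15  # minimum supported PostgreSQL major version

def postgres_major(version_string: str) -> int:
    """Extract the major version from the output of SELECT version();"""
    match = re.search(r"PostgreSQL (\d+)", version_string)
    if not match:
        raise ValueError(f"unrecognized version string: {version_string!r}")
    return int(match.group(1))

def meets_minimum(version_string: str, minimum: int = MIN_MAJOR) -> bool:
    """True if the server's major version is at or above the minimum."""
    return postgres_major(version_string) >= minimum
```

Feed it the string returned by the connection-test query later in this guide.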

Quick Setup

1. Configure database connection

Set database credentials in backend/config/.env:
.env
DB_HOST=db
DB_PORT=5432
DB_NAME=open-wearables
DB_USER=open-wearables
DB_PASSWORD=open-wearables
See Environment Variables for production configuration.
2. Start database service

Using Docker Compose:
docker compose up -d db
The database will be available on localhost:5432 with health checks enabled.
3. Run migrations

Migrations run automatically when the backend starts; to run them manually:
# Using Makefile
make migrate

# Or directly with Docker
docker compose exec app alembic upgrade head
4. Verify setup

Check that tables were created:
docker compose exec db psql -U open-wearables -d open-wearables -c "\dt"
You should see tables for users, providers, workouts, sleep data, and more.

Database Schema

The Open Wearables database schema includes:

Core Tables

  • users - User accounts and authentication
  • api_keys - API keys for programmatic access
  • invitations - User invitation management
  • providers - Wearable provider definitions (Garmin, Polar, etc.)
  • user_connections - OAuth connections to providers

Data Tables

  • series_type_definitions - Standardized metric definitions
  • workouts - Exercise activities and sessions
  • workout_data_points - Time-series data within workouts
  • sleep_sessions - Sleep tracking sessions
  • sleep_data_points - Time-series sleep stage data
  • heart_rate_samples - Heart rate measurements
  • samples - Generic time-series data

Background Processing

  • background_tasks - Celery task tracking
  • sync_logs - Provider synchronization history
The schema is designed to normalize data from multiple wearable providers into a consistent format using series_type_definitions.
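To illustrate the normalization idea (the mappings and field names below are hypothetical, not the actual schema; real definitions live in series_type_definitions), a provider-specific payload might be mapped onto a shared series type like this:

```python
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass
class NormalizedSample:
    series_type: str      # key into series_type_definitions, e.g. "heart_rate_bpm"
    recorded_at: datetime
    value: float

# Hypothetical per-provider field mappings for the same underlying metric.
PROVIDER_MAPPINGS = {
    "garmin": {"field": "heartRate", "series_type": "heart_rate_bpm"},
    "polar": {"field": "hr", "series_type": "heart_rate_bpm"},
}

def normalize(provider: str, raw: dict) -> NormalizedSample:
    """Map one raw provider payload onto the shared series-type format."""
    mapping = PROVIDER_MAPPINGS[provider]
    return NormalizedSample(
        series_type=mapping["series_type"],
        recorded_at=datetime.fromtimestamp(raw["ts"], tz=timezone.utc),
        value=float(raw[mapping["field"]]),
    )
```

Downstream queries then filter on one series_type regardless of which provider produced the data.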

Migration Management

Running Migrations

# Apply all pending migrations
make migrate

# Or using docker compose directly
docker compose exec app alembic upgrade head
Migration behavior:
  • Migrations run automatically on application startup
  • Safe to run multiple times (idempotent)
  • Applied in order along the revision chain (each migration's down_revision link)

Creating Migrations

When you modify database models in app/models/, create a migration:
1. Make model changes

Edit files in backend/app/models/:
app/models/user.py
from uuid import UUID

from sqlalchemy.orm import Mapped, mapped_column

from app.database import BaseDbModel
from app.mappings import PrimaryKey, Unique, email, str_100

class User(BaseDbModel):
    id: Mapped[PrimaryKey[UUID]]
    email: Mapped[Unique[email]]
    # Add new field
    timezone: Mapped[str_100 | None] = mapped_column(default="UTC")
2. Generate migration

Create an autogenerated migration:
# Using Makefile
make create_migration m="Add timezone to users"

# Or directly
docker compose exec app alembic revision --autogenerate -m "Add timezone to users"
This creates a new file in backend/migrations/versions/.
3. Review migration

Always review the generated migration file:
migrations/versions/xxx_add_timezone_to_users.py
def upgrade() -> None:
    op.add_column('users', 
        sa.Column('timezone', sa.String(length=100), 
                 nullable=True, server_default='UTC'))

def downgrade() -> None:
    op.drop_column('users', 'timezone')
Alembic’s autogenerate is helpful but not perfect. Always review migrations before applying them, especially:
  • Index changes
  • Constraint modifications
  • Data migrations
  • Column type changes
4. Apply migration

make migrate

Migration Best Practices

Write migrations that can be rolled back:
def upgrade() -> None:
    # Add column as nullable first
    op.add_column('users', sa.Column('new_field', sa.String(100), nullable=True))
    
    # Populate data for existing rows
    op.execute("UPDATE users SET new_field = 'default' WHERE new_field IS NULL")
    
    # Make it non-nullable after data is populated
    op.alter_column('users', 'new_field', nullable=False)

def downgrade() -> None:
    op.drop_column('users', 'new_field')
For complex data transformations, separate schema and data changes:
def upgrade() -> None:
    # Schema change
    op.add_column('workouts', sa.Column('distance_meters', sa.Numeric(15, 5)))
    
    # Data migration
    connection = op.get_bind()
    # SQLAlchemy 2.x requires raw SQL to be wrapped in text()
    connection.execute(sa.text(
        "UPDATE workouts SET distance_meters = distance_km * 1000 "
        "WHERE distance_km IS NOT NULL"
    ))
    
    # Cleanup old column
    op.drop_column('workouts', 'distance_km')

def downgrade() -> None:
    op.add_column('workouts', sa.Column('distance_km', sa.Numeric(10, 2)))
    connection = op.get_bind()
    connection.execute(sa.text(
        "UPDATE workouts SET distance_km = distance_meters / 1000 "
        "WHERE distance_meters IS NOT NULL"
    ))
    op.drop_column('workouts', 'distance_meters')
For tables with millions of rows, use concurrent operations:
def upgrade() -> None:
    # Add index concurrently (doesn't lock table)
    op.create_index(
        'ix_workouts_user_id_date',
        'workouts',
        ['user_id', 'start_time'],
        postgresql_concurrently=True
    )

def downgrade() -> None:
    op.drop_index(
        'ix_workouts_user_id_date',
        'workouts',
        postgresql_concurrently=True
    )
Concurrent operations cannot run inside a transaction. In Alembic, wrap them in `with op.get_context().autocommit_block():`, or set connection.execution_options(isolation_level="AUTOCOMMIT") on the migration connection.

Rolling Back Migrations

# Rollback the last migration
docker compose exec app alembic downgrade -1
Rolling back migrations can result in data loss. Always backup the database before performing rollbacks in production.

Migration History

# View migration history
docker compose exec app alembic history

# View current migration version
docker compose exec app alembic current

# Show the head revision(s) defined in the migration scripts
docker compose exec app alembic heads

Database Connection

Connection String Format

Open Wearables uses SQLAlchemy with the psycopg driver:
postgresql+psycopg://{user}:{password}@{host}:{port}/{database}
Example:
postgresql+psycopg://open-wearables:mypassword@localhost:5432/open-wearables
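The application assembles this URI from the DB_* variables. A minimal sketch (the function name is ours; note that special characters in credentials must be percent-encoded):

```python
from urllib.parse import quote_plus

def build_db_uri(host: str, port: int, name: str, user: str, password: str) -> str:
    """Build a SQLAlchemy URI for the psycopg driver, escaping the credentials."""
    return (
        f"postgresql+psycopg://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{name}"
    )
```

For example, a password containing "@" is encoded as "%40" so it cannot be confused with the host separator.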

Connection Pooling

SQLAlchemy manages the connection pool automatically; Open Wearables uses these settings:
  • Pool size: 5 connections
  • Max overflow: 10 additional connections
  • Pool timeout: 30 seconds
  • Pool recycle: 3600 seconds (1 hour)
These values work well for most deployments. Adjust if needed:
backend/app/database.py
engine = create_engine(
    settings.db_uri,
    pool_size=10,
    max_overflow=20,
    pool_timeout=30,
    pool_recycle=3600,
)
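When sizing the pool, remember that each application process can hold up to pool_size + max_overflow connections, and the total across all workers must stay under PostgreSQL's max_connections. A quick check (the worker count here is illustrative):

```python
def max_app_connections(workers: int, pool_size: int, max_overflow: int) -> int:
    """Upper bound on connections the application can hold open at once."""
    return workers * (pool_size + max_overflow)

# e.g. 4 workers with the settings above: 4 * (10 + 20) = 120,
# which exceeds PostgreSQL's default max_connections of 100
```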

Testing Database Connection

# Test connection using psql
docker compose exec db psql -U open-wearables -d open-wearables -c "SELECT version();"

Seed Data

Initial Data

On first startup, the application automatically creates:
  1. Admin account - From ADMIN_EMAIL and ADMIN_PASSWORD environment variables
  2. Series type definitions - Standardized metric definitions for normalizing provider data
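Startup seeding of this kind typically reads the variables and fails fast when either is missing. A hedged sketch (the helper name is ours, not the application's):

```python
import os

def read_admin_credentials() -> tuple[str, str]:
    """Fetch ADMIN_EMAIL and ADMIN_PASSWORD, raising early if either is unset."""
    email = os.environ.get("ADMIN_EMAIL")
    password = os.environ.get("ADMIN_PASSWORD")
    if not email or not password:
        raise RuntimeError(
            "ADMIN_EMAIL and ADMIN_PASSWORD must be set before first startup"
        )
    return email, password
```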

Sample Data (Development)

Load sample test data for development:
make seed
This creates:
  • Sample users
  • Example workouts with data points
  • Sleep sessions with sleep stages
  • Heart rate samples
  • Provider connections
The seed command is for development only. Never run it in production, as it inserts test users and sample data.

Custom Seed Scripts

Create custom seed scripts in backend/scripts/:
backend/scripts/seed_custom.py
from app.database import get_db
from app.models import User
from app.schemas import UserCreateInternal
from app.services import user_service

def seed_users():
    db = next(get_db())
    
    users = [
        UserCreateInternal(email="alice@example.com", first_name="Alice"),
        UserCreateInternal(email="bob@example.com", first_name="Bob"),
    ]
    
    for user_data in users:
        user_service.create(db, user_data)
        print(f"Created user: {user_data.email}")

if __name__ == "__main__":
    seed_users()
Run your script:
docker compose exec app python scripts/seed_custom.py

Backup and Restore

Creating Backups

# Backup entire database
docker compose exec -T db pg_dump -U open-wearables -d open-wearables > backup_$(date +%Y%m%d_%H%M%S).sql

Restoring Backups

Restoring a backup will overwrite all existing data. Always backup current data before restoring.
1. Stop application services

docker compose stop app celery-worker celery-beat
2. Drop and recreate database

docker compose exec db psql -U open-wearables -d postgres -c "DROP DATABASE IF EXISTS \"open-wearables\";"
docker compose exec db psql -U open-wearables -d postgres -c "CREATE DATABASE \"open-wearables\";"
3. Restore backup

cat backup.sql | docker compose exec -T db psql -U open-wearables -d open-wearables
4. Restart services

docker compose start app celery-worker celery-beat

Automated Backup Strategy

For production, implement automated backups:
backup.sh
#!/bin/bash
set -euo pipefail

BACKUP_DIR="/backups/postgres"
RETENTION_DAYS=30

# Create backup directory
mkdir -p "$BACKUP_DIR"

# Create backup
BACKUP_FILE="$BACKUP_DIR/open_wearables_$(date +%Y%m%d_%H%M%S).sql.gz"
docker compose exec -T db pg_dump -U open-wearables -d open-wearables | gzip > "$BACKUP_FILE"

# Upload to S3 (optional)
aws s3 cp "$BACKUP_FILE" s3://my-backups/postgres/

# Delete old backups
find "$BACKUP_DIR" -name "*.sql.gz" -mtime +"$RETENTION_DAYS" -delete

echo "Backup completed: $BACKUP_FILE"
Add to crontab:
# Daily backup at 2 AM
0 2 * * * /path/to/backup.sh
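The find -mtime pruning step can equally be scripted in Python using only the standard library; a sketch (the function name and directory layout are assumptions):

```python
import time
from pathlib import Path

def prune_backups(backup_dir: str, retention_days: int = 30) -> list[str]:
    """Delete .sql.gz backups older than retention_days; return deleted names."""
    cutoff = time.time() - retention_days * 86400
    deleted = []
    for path in Path(backup_dir).glob("*.sql.gz"):
        if path.stat().st_mtime < cutoff:
            path.unlink()
            deleted.append(path.name)
    return deleted
```

Unlike the shell version, this returns the list of deleted files, which is convenient for logging or alerting.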
When using managed PostgreSQL services (AWS RDS, Azure Database, etc.):
  • Enable automated daily backups
  • Set retention period (7-35 days recommended)
  • Configure backup window during low-traffic hours
  • Enable point-in-time recovery (PITR)
  • Test restore procedures regularly
AWS RDS Example:
  • Automated backups: Enabled
  • Backup retention: 30 days
  • Backup window: 02:00-04:00 UTC
  • PITR: Enabled (allows restore to any second in retention period)

Performance Optimization

Indexes

The schema includes indexes on frequently queried columns:
-- User lookups
CREATE UNIQUE INDEX ix_users_email ON users(email);

-- Workout queries
CREATE INDEX ix_workouts_user_id ON workouts(user_id);
CREATE INDEX ix_workouts_start_time ON workouts(start_time);
CREATE INDEX ix_workouts_provider_id ON workouts(provider_id);

-- Time-series data
CREATE INDEX ix_workout_data_points_workout_id ON workout_data_points(workout_id);
CREATE INDEX ix_sleep_data_points_session_id ON sleep_data_points(sleep_session_id);
Indexes speed up queries but slow down inserts/updates. Monitor query performance and add indexes based on actual usage patterns.

Monitoring Query Performance

-- Enable slow query logging
ALTER SYSTEM SET log_min_duration_statement = 1000; -- Log queries > 1 second
SELECT pg_reload_conf();

Vacuuming and Maintenance

PostgreSQL requires periodic maintenance:
-- Manual vacuum (usually automatic)
VACUUM ANALYZE;

-- Check table bloat
SELECT schemaname, tablename, pg_size_pretty(pg_total_relation_size(schemaname||'.'||tablename))
FROM pg_tables
WHERE schemaname = 'public'
ORDER BY pg_total_relation_size(schemaname||'.'||tablename) DESC;
PostgreSQL’s autovacuum runs automatically but can be tuned:
-- Increase autovacuum frequency for high-traffic tables
ALTER TABLE workouts SET (autovacuum_vacuum_scale_factor = 0.05);
ALTER TABLE workout_data_points SET (autovacuum_vacuum_scale_factor = 0.05);

Production Database

Managed Database Services

For production, use managed PostgreSQL services:
AWS RDS

Recommended configuration:
  • Instance: db.t4g.medium or larger
  • Storage: General Purpose SSD (gp3) with auto-scaling
  • Multi-AZ: Enabled for high availability
  • Encryption: Enabled at rest and in transit
  • Automated backups: 30-day retention
  • Monitoring: Enhanced monitoring enabled
Connection:
.env
DB_HOST=open-wearables.abcd1234.us-east-1.rds.amazonaws.com
DB_PORT=5432
DB_NAME=open_wearables_prod
DB_USER=ow_admin
DB_PASSWORD=your-secure-password
Azure Database for PostgreSQL

Recommended configuration:
  • Tier: General Purpose
  • Compute: 2-4 vCores
  • Storage: 100GB with auto-growth
  • Backup retention: 30 days
  • High availability: Zone-redundant
  • SSL enforcement: Enabled
Connection:
.env
DB_HOST=open-wearables.postgres.database.azure.com
DB_PORT=5432
DB_NAME=open_wearables_prod
DB_USER=ow_admin@open-wearables
DB_PASSWORD=your-secure-password
Google Cloud SQL

Recommended configuration:
  • Machine type: db-custom-2-7680 (2 vCPU, 7.5GB RAM)
  • Storage: 100GB SSD with auto-increase
  • Availability: Regional (high availability)
  • Backups: Automated daily with 30-day retention
  • Encryption: Enabled
Connection:
.env
DB_HOST=10.1.2.3  # Private IP
DB_PORT=5432
DB_NAME=open_wearables_prod
DB_USER=ow_admin
DB_PASSWORD=your-secure-password

Security Best Practices

Production Database Security Checklist:
  1. Network isolation - Use VPC/private network, no public access
  2. Strong passwords - Use generated passwords with high entropy
  3. Encryption - Enable at-rest and in-transit encryption
  4. SSL/TLS - Require SSL connections
  5. Least privilege - Application user should not have superuser access
  6. Regular updates - Apply security patches promptly
  7. Audit logging - Enable and monitor database logs
  8. Backup encryption - Encrypt backup files
  9. Access control - Use IAM authentication when available
  10. Monitoring - Set up alerts for suspicious activity
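For item 2, Python's secrets module generates a high-entropy password in a few lines (the helper name and alphabet choice are ours):

```python
import secrets
import string

def generate_db_password(length: int = 32) -> str:
    """Generate a random password from letters and digits (~190 bits at length 32)."""
    alphabet = string.ascii_letters + string.digits
    return "".join(secrets.choice(alphabet) for _ in range(length))
```

Restricting the alphabet to alphanumerics avoids shell-quoting and URI-encoding surprises while keeping ample entropy.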

Troubleshooting

Symptoms: psycopg.OperationalError: connection refused

Solutions:
  1. Verify database container is running:
    docker compose ps db
    
  2. Check database logs:
    docker compose logs db
    
  3. Verify connection settings in .env:
    docker compose exec app python -c "from app.config import settings; print(settings.db_uri)"
    
  4. Test connection manually:
    docker compose exec db psql -U open-wearables -d open-wearables
    
Symptoms: alembic.util.exc.CommandError or migration failures

Solutions:
  1. Check current migration version:
    docker compose exec app alembic current
    
  2. View migration history:
    docker compose exec app alembic history
    
  3. If migrations are out of sync, stamp the database:
    # Stamp database to specific revision (use with caution)
    docker compose exec app alembic stamp head
    
  4. For conflicts, manually resolve or rollback:
    docker compose exec app alembic downgrade -1
    docker compose exec app alembic upgrade head
    
Symptoms: API responses are slow, database CPU is high

Diagnosis:
  1. Enable slow query logging:
    ALTER SYSTEM SET log_min_duration_statement = 1000;
    SELECT pg_reload_conf();
    
  2. View running queries:
    SELECT pid, now() - query_start as duration, query
    FROM pg_stat_activity
    WHERE state = 'active'
    ORDER BY duration DESC;
    
  3. Check for missing indexes:
    SELECT schemaname, tablename, attname
    FROM pg_stats
    WHERE schemaname = 'public'
    AND n_distinct > 100
    AND correlation < 0.1;
    
Solutions:
  • Add indexes on frequently filtered columns
  • Optimize queries to reduce data scanned
  • Use pagination for large result sets
  • Consider database scaling if consistently slow
Symptoms: No space left on device errors

Solutions:
  1. Check volume usage:
    docker system df -v
    
  2. Check database size:
    SELECT pg_database.datname, pg_size_pretty(pg_database_size(pg_database.datname))
    FROM pg_database
    ORDER BY pg_database_size(pg_database.datname) DESC;
    
  3. Clean up old data:
    -- Example: Delete old sync logs
    DELETE FROM sync_logs WHERE created_at < NOW() - INTERVAL '90 days';
    -- VACUUM FULL reclaims disk space but takes an exclusive lock; run during low traffic
    VACUUM FULL sync_logs;
    
  4. Increase volume size or enable auto-scaling in managed services

Next Steps

Docker Deployment

Complete deployment guide with Docker Compose

Environment Variables

Configure all environment settings
