Migrating to CockroachDB from other databases requires careful planning and execution. This guide covers migration strategies, tools, and best practices for moving data from PostgreSQL, MySQL, and other databases.

Migration Overview

  • PostgreSQL: Highly compatible with minimal application changes
  • MySQL: Requires SQL dialect translation and schema adjustments
  • Oracle: Moderate compatibility with some feature mapping
  • Other SQL Databases: Case-by-case assessment required

Migration Strategies

Offline Migration

Complete migration during a maintenance window:

1. Prepare CockroachDB Cluster: Set up and configure your target CockroachDB cluster with appropriate sizing and replication.
2. Export Source Data: Extract data from the source database using database-specific tools.
3. Transform Schema and Data: Convert schemas and data to CockroachDB-compatible format.
4. Import to CockroachDB: Load data into CockroachDB using IMPORT or RESTORE commands.
5. Validate and Cutover: Verify data integrity, test applications, and switch production traffic.

Online Migration

Migrate with minimal downtime using change data capture:
1. Initial Snapshot: Take a snapshot of the source database and load it into CockroachDB.
2. Set Up CDC: Configure change data capture to stream ongoing changes from the source to CockroachDB.
3. Sync Period: Allow both databases to stay in sync while applications continue running.
4. Cutover: Switch application traffic to CockroachDB during a brief maintenance window.
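Whether the sync period has converged enough to cut over can be gated programmatically rather than judged by eye. A minimal sketch of such a readiness check; the SyncStatus fields and thresholds are illustrative, not metrics exposed by any particular CDC tool:

```python
from dataclasses import dataclass

@dataclass
class SyncStatus:
    # Hypothetical metrics you would collect from your CDC pipeline.
    replication_lag_seconds: float
    pending_events: int

def ready_for_cutover(status: SyncStatus,
                      max_lag_seconds: float = 5.0,
                      max_pending: int = 100) -> bool:
    """Return True when the CDC stream is close enough to real time
    that a brief cutover window can fully drain it."""
    return (status.replication_lag_seconds <= max_lag_seconds
            and status.pending_events <= max_pending)
```

Run a check like this repeatedly during the sync period and only schedule the cutover window once it stays true.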

Migrating from PostgreSQL

CockroachDB is highly compatible with PostgreSQL, making migration straightforward:

Schema Export

Export PostgreSQL Schema
pg_dump \
  --host=postgres-host \
  --username=postgres \
  --schema-only \
  --no-owner \
  --no-acl \
  mydatabase > schema.sql

Schema Compatibility Review

Review and adjust PostgreSQL-specific features:
  • SERIAL vs SEQUENCE: CockroachDB uses unique_rowid() or gen_random_uuid() instead of SERIAL
  • Stored Procedures: Limited support; may need conversion to application logic
  • Extensions: Review required extensions (most are unsupported)
  • Triggers: Evaluate trigger logic for conversion
  • Custom Types: Map custom types to CockroachDB equivalents
  • Partitioning Syntax: Different partitioning implementation

Schema Conversion Example

-- PostgreSQL (before)
CREATE TABLE users (
  id SERIAL PRIMARY KEY,
  email VARCHAR(255) UNIQUE NOT NULL,
  created_at TIMESTAMP DEFAULT NOW()
);

-- CockroachDB (after): SERIAL becomes a UUID key
CREATE TABLE users (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  email STRING(255) UNIQUE NOT NULL,
  created_at TIMESTAMP DEFAULT now()
);

CREATE INDEX idx_users_email ON users(email);
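For many tables the SERIAL-to-UUID rewrite is mechanical enough to script. A heuristic sketch (a regex substitution, not a full SQL parser; review its output before applying it to a real schema dump):

```python
import re

def convert_serial_to_uuid(ddl: str) -> str:
    """Rewrite PostgreSQL 'SERIAL PRIMARY KEY' columns to UUID keys,
    one common CockroachDB-friendly substitution."""
    return re.sub(
        r"\bSERIAL\s+PRIMARY\s+KEY\b",
        "UUID PRIMARY KEY DEFAULT gen_random_uuid()",
        ddl,
        flags=re.IGNORECASE,
    )
```

Applied to the schema.sql produced by pg_dump above, this handles the most common pattern; composite keys and sequences used outside primary keys still need manual review.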

Data Export and Import

1. Export Data to CSV

psql -h postgres-host -U postgres -d mydatabase -c \
  "COPY users TO STDOUT CSV HEADER" > users.csv

2. Upload to Cloud Storage

aws s3 cp users.csv s3://my-migration-bucket/

3. Import into CockroachDB

IMPORT INTO users 
  CSV DATA (
    's3://my-migration-bucket/users.csv?AWS_ACCESS_KEY_ID=xxx&AWS_SECRET_ACCESS_KEY=xxx'
  )
  WITH skip = '1';  -- the export above included a header row

The IMPORT statement is much faster than INSERT for bulk data loading.
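When an export is split across many files, assembling the IMPORT statement by hand gets error-prone. A small helper that only does string assembly is one way to script it (the function and its parameters are illustrative; credentials are assumed to already be embedded in the URLs, as in the examples above):

```python
def build_import_statement(table: str, columns: list[str],
                           urls: list[str], skip_header: bool = True) -> str:
    """Assemble an IMPORT INTO ... CSV DATA statement from a list of
    cloud-storage URLs. Pure string assembly; it does not validate
    the URLs or talk to the cluster."""
    cols = ", ".join(columns)
    files = ",\n  ".join(f"'{u}'" for u in urls)
    stmt = f"IMPORT INTO {table} ({cols})\nCSV DATA (\n  {files}\n)"
    if skip_header:
        stmt += "\nWITH skip = '1'"  # each file starts with a header row
    return stmt + ";"
```

Feed it the list of exported part files and pass the result to your SQL client of choice.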

Migrating from MySQL

MySQL migrations require more extensive schema and SQL translation:

Schema Differences

CREATE TABLE orders (
  id INT AUTO_INCREMENT PRIMARY KEY,
  user_id INT NOT NULL,
  total DECIMAL(10,2) NOT NULL,
  status ENUM('pending', 'completed', 'cancelled'),
  created_at DATETIME DEFAULT CURRENT_TIMESTAMP,
  FOREIGN KEY (user_id) REFERENCES users(id),
  INDEX idx_status (status)
) ENGINE=InnoDB;

MySQL to CockroachDB Type Mapping

MySQL Type          | CockroachDB Type                  | Notes
--------------------|-----------------------------------|----------------------------------
INT AUTO_INCREMENT  | UUID or INT8 with unique_rowid()  | Use UUID for distributed systems
VARCHAR(n)          | STRING(n)                         | Direct mapping
DATETIME            | TIMESTAMP                         | Similar functionality
ENUM                | STRING with CHECK constraint      | Convert to check constraint
TINYINT(1)          | BOOL                              | Direct mapping for boolean
BLOB                | BYTES                             | Direct mapping
TEXT                | STRING                            | Direct mapping
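The mapping table above can be encoded as a small lookup function for use in schema-conversion scripts. A sketch covering only the rows listed (parameterized types are matched with patterns; anything unrecognized is passed through for manual review):

```python
import re

# Types that map one-to-one, taken from the table above.
SIMPLE_TYPES = {
    "DATETIME": "TIMESTAMP",
    "TINYINT(1)": "BOOL",
    "BLOB": "BYTES",
    "TEXT": "STRING",
}

def map_mysql_type(mysql_type: str) -> str:
    """Translate a MySQL column type to its CockroachDB equivalent."""
    t = mysql_type.strip().upper()
    if t == "INT AUTO_INCREMENT":
        return "UUID"  # preferred key type for distributed systems
    if t in SIMPLE_TYPES:
        return SIMPLE_TYPES[t]
    m = re.fullmatch(r"VARCHAR\((\d+)\)", t)
    if m:
        return f"STRING({m.group(1)})"
    if t.startswith("ENUM("):
        return "STRING"  # add a CHECK constraint listing the ENUM values
    return t  # pass through anything already compatible
```

Note that the ENUM branch loses the value list; a complete converter would also emit the corresponding CHECK constraint.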

Data Migration from MySQL

Export MySQL Data
mysqldump \
  --host=mysql-host \
  --user=root \
  --password \
  --tab=/tmp/export \
  --fields-terminated-by=',' \
  --fields-enclosed-by='"' \
  --lines-terminated-by='\n' \
  mydatabase orders
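mysqldump's --tab export writes NULL values as \N, which does not match the nullif = '' convention used in the IMPORT examples below. A small pre-processing pass can normalize the files before upload (a sketch; adjust the quoting options to match your mysqldump flags):

```python
import csv
import io

def normalize_mysql_export(csv_text: str) -> str:
    """Rewrite mysqldump CSV text so the \\N NULL marker becomes an
    empty field, matching IMPORT's nullif = '' option."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in csv.reader(io.StringIO(csv_text)):
        writer.writerow(["" if field == r"\N" else field for field in row])
    return out.getvalue()
```

Run each exported file through this (streaming it for large files) before the aws s3 cp step.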

Using IMPORT for Bulk Data

The IMPORT statement is the fastest way to load large datasets:

Import CSV Data

Import CSV
IMPORT INTO users (id, email, name, created_at)
CSV DATA (
  's3://bucket/users.csv?AWS_ACCESS_KEY_ID=xxx&AWS_SECRET_ACCESS_KEY=xxx'
)
WITH
  skip = '1',  -- Skip header row
  delimiter = ',',
  nullif = '';

Import from Multiple Files

Import Multiple CSV Files
IMPORT INTO orders
CSV DATA (
  's3://bucket/orders-part1.csv?AWS_ACCESS_KEY_ID=xxx&AWS_SECRET_ACCESS_KEY=xxx',
  's3://bucket/orders-part2.csv?AWS_ACCESS_KEY_ID=xxx&AWS_SECRET_ACCESS_KEY=xxx',
  's3://bucket/orders-part3.csv?AWS_ACCESS_KEY_ID=xxx&AWS_SECRET_ACCESS_KEY=xxx'
)
WITH skip = '1';

Import PostgreSQL Dump

Import PostgreSQL Dump
IMPORT PGDUMP 
's3://bucket/pgdump.sql?AWS_ACCESS_KEY_ID=xxx&AWS_SECRET_ACCESS_KEY=xxx'
WITH ignore_unsupported_statements;

Import MySQL Dump

Import MySQL Dump
IMPORT MYSQLDUMP 
's3://bucket/mysqldump.sql?AWS_ACCESS_KEY_ID=xxx&AWS_SECRET_ACCESS_KEY=xxx'
WITH ignore_unsupported_statements;
The ignore_unsupported_statements option skips incompatible SQL. Review warnings to ensure critical statements weren’t skipped.

Change Data Capture (CDC)

For online migrations, use CDC to keep databases synchronized:

Using External CDC Tools

  • Debezium: Open-source CDC platform supporting PostgreSQL and MySQL
  • Striim: Real-time data integration with CockroachDB support
  • AWS DMS: Database Migration Service for AWS environments
  • Custom Scripts: Application-level replication logic
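The "custom scripts" option amounts to consuming change events from the source and replaying them on the target. A toy sketch of the core apply step, using an in-memory dict in place of CockroachDB and an illustrative event shape ({'op', 'key', 'row'}); a real applier would issue UPSERT and DELETE statements over a database connection:

```python
from typing import Any

def apply_change(table: dict[Any, dict], event: dict) -> None:
    """Apply one change event to an in-memory stand-in for the
    target table. Inserts and updates share UPSERT semantics, so
    replaying an event twice is safe (idempotent)."""
    op = event["op"]
    if op in ("insert", "update"):
        table[event["key"]] = event["row"]  # UPSERT
    elif op == "delete":
        table.pop(event["key"], None)       # delete-if-exists
    else:
        raise ValueError(f"unknown op: {op}")
```

Idempotency matters here: CDC pipelines typically deliver at-least-once, so the applier must tolerate replayed events.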

Application Migration

Connection String Changes

# PostgreSQL
conn_string = "postgresql://user:pass@postgres-host:5432/mydb"

# CockroachDB (secure)
conn_string = "postgresql://user:pass@cockroach-host:26257/mydb?sslmode=verify-full&sslrootcert=certs/ca.crt"

ORM Configuration Updates

# settings.py
DATABASES = {
    'default': {
        'ENGINE': 'django_cockroachdb',
        'NAME': 'mydb',
        'USER': 'myuser',
        'PASSWORD': 'mypassword',
        'HOST': 'cockroach-host',
        'PORT': '26257',
        'OPTIONS': {
            'sslmode': 'verify-full',
            'sslrootcert': '/path/to/ca.crt',
        },
    },
}

Query Optimization

Some queries may need optimization for CockroachDB:
  • LIMIT without ORDER BY: Add ORDER BY for consistent results in distributed system
  • Transactions: Consider using SELECT FOR UPDATE for row-level locking
  • Auto-increment IDs: Replace with UUIDs or use unique_rowid()
  • Timestamp precision: CockroachDB uses microsecond precision
  • Implicit type conversions: May behave differently; make conversions explicit

Validation and Testing

1. Schema Validation

-- List all tables with their estimated row counts
SELECT table_name, estimated_row_count
FROM [SHOW TABLES]
ORDER BY table_name;
2. Data Validation

-- Compare row counts
SELECT COUNT(*) FROM users;

-- Verify data integrity
SELECT COUNT(DISTINCT id) = COUNT(*) AS no_duplicates FROM users;

-- Check for NULL values
SELECT COUNT(*) FROM users WHERE email IS NULL;

3. Performance Testing

Run representative queries and transactions to ensure acceptable performance.

4. Application Testing

Test all application features thoroughly in a staging environment.
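Row-count comparison across many tables is worth scripting rather than eyeballing. A minimal helper, assuming you have already collected per-table counts from each side (for example with the COUNT(*) queries above):

```python
def compare_row_counts(source: dict[str, int],
                       target: dict[str, int]) -> list[str]:
    """Return human-readable mismatches between per-table row counts
    collected from the source database and from CockroachDB. Tables
    missing on either side show up as count None."""
    problems = []
    for tbl in sorted(set(source) | set(target)):
        s, t = source.get(tbl), target.get(tbl)
        if s != t:
            problems.append(f"{tbl}: source={s} target={t}")
    return problems
```

An empty result means every table matched; anything else is a list of tables to investigate before cutover.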

Migration Best Practices

1. Start Small: Begin with non-critical databases or tables to gain experience with the migration process.
2. Test Thoroughly: Perform multiple test migrations in non-production environments before the actual migration.
3. Plan for Rollback: Have a tested rollback plan in case issues arise during cutover.
4. Monitor Closely: Set up comprehensive monitoring before migration and watch closely during and after cutover.
5. Migrate During Low Traffic: Schedule cutover during periods of low application usage to minimize impact.
6. Document Changes: Record all schema modifications, query changes, and configuration adjustments.
7. Keep Source Database: Maintain the source database for a period after migration as a safety net.

Common Migration Challenges

  • Sequence Gaps: Use UUIDs instead of auto-incrementing integers for distributed systems
  • Foreign Key Constraints: May need to be disabled during import and re-enabled after
  • Large Object Storage: Consider external blob storage for very large files
  • Timezone Handling: Ensure consistent timezone handling (CockroachDB stores timestamps in UTC)

Monitoring Migration Progress

Check Import Job Status
SELECT * FROM [SHOW JOBS]
WHERE job_type = 'IMPORT'
  AND status = 'running';
View Job Details
SELECT 
  job_id,
  description,
  status,
  fraction_completed,
  created,
  started
FROM crdb_internal.jobs
WHERE job_type = 'IMPORT'
ORDER BY created DESC
LIMIT 10;

Post-Migration Tasks

1. Update Statistics

ANALYZE users;
ANALYZE orders;

2. Rebuild Indexes

Ensure all indexes are properly created:
SHOW INDEXES FROM users;

3. Set Up Monitoring

Configure alerts for the new CockroachDB cluster.

4. Optimize Performance

Review query plans and add indexes as needed:
EXPLAIN ANALYZE SELECT * FROM users WHERE email = 'user@example.com';

Migration Checklist

  • Source database schema exported and reviewed
  • Schema converted to CockroachDB-compatible format
  • Test migration completed successfully
  • Data validation queries prepared
  • Application connection strings updated
  • ORM configurations updated
  • Performance baselines established
  • Monitoring and alerting configured
  • Rollback plan documented and tested
  • Stakeholders notified of migration schedule
  • Production data exported
  • Data imported into CockroachDB
  • Data validation completed
  • Indexes and constraints verified
  • Application testing passed
  • Performance testing passed
  • Cutover completed
  • Post-migration monitoring active
  • Source database archived

Next Steps

  • Scaling: Scale your new cluster
  • Backup & Restore: Set up backup strategies
  • Security: Secure your deployment
