Follow these best practices to build high-performance, reliable applications with CockroachDB.
Transaction Best Practices
Implement Transaction Retry Logic
CockroachDB may require clients to retry transactions due to contention. All production applications must implement retry logic.
Applications that do not implement transaction retry logic will fail under contention with SQLSTATE 40001 ("restart transaction") errors.
async function retryTxn(client, operation, maxRetries = 15) {
  for (let i = 0; i < maxRetries; i++) {
    try {
      await client.query('BEGIN')
      const result = await operation(client)
      await client.query('COMMIT')
      return result
    } catch (err) {
      await client.query('ROLLBACK')
      // Check for retry error code
      if (err.code !== '40001') {
        throw err
      }
      // Exponential backoff
      const sleepMs = Math.pow(2, i) * 100
      await new Promise(resolve => setTimeout(resolve, sleepMs))
    }
  }
  throw new Error('Transaction max retry limit reached')
}

// Usage
await retryTxn(client, async (txnClient) => {
  await txnClient.query(
    'UPDATE accounts SET balance = balance - $1 WHERE id = $2',
    [100, fromId]
  )
  await txnClient.query(
    'UPDATE accounts SET balance = balance + $1 WHERE id = $2',
    [100, toId]
  )
})
from sqlalchemy_cockroachdb import run_transaction

def transfer_funds(session, from_id, to_id, amount):
    from_account = session.query(Account).filter_by(id=from_id).one()
    to_account = session.query(Account).filter_by(id=to_id).one()
    from_account.balance -= amount
    to_account.balance += amount

# run_transaction handles retries automatically
run_transaction(
    sessionmaker,
    lambda session: transfer_funds(session, id1, id2, 100)
)
import crdbpgx "github.com/cockroachdb/cockroach-go/v2/crdb/crdbpgxv5"

err := crdbpgx.ExecuteTx(ctx, conn, pgx.TxOptions{}, func(tx pgx.Tx) error {
    // Deduct from sender
    _, err := tx.Exec(ctx,
        "UPDATE accounts SET balance = balance - $1 WHERE id = $2",
        amount, fromID)
    if err != nil {
        return err
    }
    // Add to recipient
    _, err = tx.Exec(ctx,
        "UPDATE accounts SET balance = balance + $1 WHERE id = $2",
        amount, toID)
    return err
})
public static void transferFunds(
    Connection conn, UUID fromId, UUID toId, BigDecimal amount
) throws SQLException, InterruptedException {
    int maxRetries = 15;
    for (int i = 0; i < maxRetries; i++) {
        try {
            conn.setAutoCommit(false);
            try (PreparedStatement ps = conn.prepareStatement(
                    "UPDATE accounts SET balance = balance - ? WHERE id = ?")) {
                ps.setBigDecimal(1, amount);
                ps.setObject(2, fromId);
                ps.executeUpdate();
            }
            try (PreparedStatement ps = conn.prepareStatement(
                    "UPDATE accounts SET balance = balance + ? WHERE id = ?")) {
                ps.setBigDecimal(1, amount);
                ps.setObject(2, toId);
                ps.executeUpdate();
            }
            conn.commit();
            return;
        } catch (SQLException e) {
            conn.rollback();
            // Only retry on serialization failures (SQLSTATE 40001)
            if (!"40001".equals(e.getSQLState())) {
                throw e;
            }
            // Exponential backoff
            Thread.sleep((long) Math.pow(2, i) * 100);
        }
    }
    throw new SQLException("Transaction max retry limit reached");
}
Keep Transactions Short
Long-running transactions can cause performance issues and increase retry errors.
Do

# Good: Short, focused transaction
def update_balance(session, account_id, amount):
    account = session.query(Account).get(account_id)
    account.balance += amount

run_transaction(sessionmaker,
    lambda s: update_balance(s, id, 100))
Don't

# Bad: Long transaction with external calls
def process_payment(session, account_id, amount):
    account = session.query(Account).get(account_id)
    # External API call - DON'T DO THIS
    response = requests.post('https://api.payment.com/charge')
    account.balance -= amount
DML Best Practices
Use Multi-Row Statements
Multi-row statements are significantly faster than multiple single-row statements.
Good: Multi-row INSERT

INSERT INTO accounts (id, balance) VALUES
  (gen_random_uuid(), 1000),
  (gen_random_uuid(), 2000),
  (gen_random_uuid(), 3000);

Bad: Multiple single-row INSERTs

INSERT INTO accounts (id, balance) VALUES (gen_random_uuid(), 1000);
INSERT INTO accounts (id, balance) VALUES (gen_random_uuid(), 2000);
INSERT INTO accounts (id, balance) VALUES (gen_random_uuid(), 3000);
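On the driver side, the multi-row form means sending one statement with many placeholder groups. A minimal sketch of building such a statement (`build_multirow_insert` is a hypothetical helper; psycopg-style `%s` placeholders are assumed):

```python
def build_multirow_insert(table, columns, n_rows):
    """Build a single INSERT with one placeholder group per row.

    Hypothetical helper for psycopg-style drivers that use %s placeholders;
    parameters are still passed separately to the driver for safe escaping.
    """
    group = "(" + ", ".join(["%s"] * len(columns)) + ")"
    placeholders = ", ".join([group] * n_rows)
    return f"INSERT INTO {table} ({', '.join(columns)}) VALUES {placeholders}"

sql = build_multirow_insert("accounts", ["id", "balance"], 3)
# → INSERT INTO accounts (id, balance) VALUES (%s, %s), (%s, %s), (%s, %s)
```

The flattened parameter list (one entry per placeholder) is then passed alongside this string in a single execute call.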
Use UPSERT for Insert-or-Update
On tables with no secondary indexes, UPSERT is faster than INSERT ON CONFLICT.
Preferred: UPSERT

UPSERT INTO accounts (id, balance)
VALUES ('123e4567-e89b-12d3-a456-426614174000', 1000);

Slower: INSERT ON CONFLICT

INSERT INTO accounts (id, balance)
VALUES ('123e4567-e89b-12d3-a456-426614174000', 1000)
ON CONFLICT (id) DO UPDATE SET balance = excluded.balance;
Batch Large Operations
For bulk inserts, use batches of 100-1000 rows. Experiment to find the optimal size.
BATCH_SIZE = 500

for i in range(0, len(records), BATCH_SIZE):
    batch = records[i:i + BATCH_SIZE]

    def insert_batch(session):
        session.bulk_insert_mappings(Account, batch)

    run_transaction(sessionmaker, insert_batch)
int BATCH_SIZE = 128; // Use powers of 2 for JDBC

connection.setAutoCommit(false);
try (PreparedStatement pstmt = connection.prepareStatement(
        "INSERT INTO accounts (id, balance) VALUES (?, ?)")) {
    for (int i = 0; i < records.size(); i++) {
        pstmt.setObject(1, UUID.randomUUID());
        pstmt.setInt(2, records.get(i).balance);
        pstmt.addBatch();
        if ((i + 1) % BATCH_SIZE == 0) {
            pstmt.executeBatch();
        }
    }
    pstmt.executeBatch(); // Execute remaining
    connection.commit();
}
For JDBC specifically, use a batch size of 128 (or other powers of 2) for optimal performance.
Primary Key Best Practices
Avoid Sequential Primary Keys
Sequential IDs (like SERIAL) create hotspots that hurt performance in distributed databases.
Recommended: UUID v4 via gen_random_uuid()

CREATE TABLE accounts (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    balance INT
);

Spreads data evenly across the cluster.

Avoid: Sequential SERIAL

CREATE TABLE accounts (
    id SERIAL PRIMARY KEY,
    balance INT
);

Creates hotspots on a single range.
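The hotspot effect can be illustrated without a cluster. In a toy model where the keyspace is split evenly into ranges by the first key byte (an illustration only, not CockroachDB's actual range-split logic), sequential IDs all land in one range while random UUIDs spread across all of them:

```python
import uuid

def range_for(key_bytes, num_ranges=8):
    # Toy model: split the keyspace evenly into ranges by the first key byte.
    return key_bytes[0] * num_ranges // 256

# Sequential 8-byte big-endian IDs, like an incrementing counter
sequential = [i.to_bytes(8, "big") for i in range(1, 1001)]
# Random UUID v4 keys
random_ids = [uuid.uuid4().bytes for _ in range(1000)]

seq_ranges = {range_for(k) for k in sequential}    # every write hits one range
rand_ranges = {range_for(k) for k in random_ids}   # writes spread over all ranges
```

Sequential keys share a common prefix, so every insert lands on the same range (and the same node), while random keys distribute the write load.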
Use Multi-Column Primary Keys
For best performance, design composite primary keys with:
A well-distributed prefix (e.g., user ID)
A monotonically increasing suffix (e.g., timestamp)
CREATE TABLE posts (
    username STRING,
    post_timestamp TIMESTAMP,
    post_id UUID,
    content TEXT,
    PRIMARY KEY (username, post_timestamp, post_id)
);

This allows efficient queries like:

SELECT * FROM posts
WHERE username = 'alice'
ORDER BY post_timestamp DESC
LIMIT 10;
Connection Management
Use Connection Pooling
Always use connection pooling in production applications.
Node.js (pg)
Python (SQLAlchemy)
Java (HikariCP)
const { Pool } = require('pg')

const pool = new Pool({
  connectionString: process.env.DATABASE_URL,
  max: 20,                       // Maximum pool size
  idleTimeoutMillis: 30000,      // Close idle clients after 30s
  connectionTimeoutMillis: 2000,
})

// Use pool instead of individual clients
const result = await pool.query('SELECT * FROM accounts')
from sqlalchemy import create_engine

engine = create_engine(
    db_uri,
    pool_size=10,        # Base pool size
    max_overflow=20,     # Allow up to 30 total connections
    pool_pre_ping=True,  # Verify connections before use
    pool_recycle=3600,   # Recycle connections after 1 hour
)
HikariConfig config = new HikariConfig();
config.setJdbcUrl(jdbcUrl);
config.setMaximumPoolSize(20);
config.setMinimumIdle(5);
config.setConnectionTimeout(2000);
config.setIdleTimeout(30000);
HikariDataSource dataSource = new HikariDataSource(config);
Set Appropriate Timeouts
SET statement_timeout = '30s';
SET idle_in_transaction_session_timeout = '60s';
Index Best Practices
Index WHERE Clause Columns
Create indexes on columns frequently used in WHERE clauses.
-- Query pattern
SELECT * FROM orders WHERE customer_id = $1 AND status = 'pending';

-- Appropriate index
CREATE INDEX ON orders (customer_id, status);
Use Covering Indexes
Include frequently selected columns in the index to avoid table lookups.
CREATE INDEX orders_by_customer ON orders (customer_id)
STORING (order_date, total_amount);
Now this query only reads the index:
SELECT order_date, total_amount
FROM orders
WHERE customer_id = $1;
Avoid Too Many Indexes
Each index slows down writes. Only create indexes you actually use.
Schema Design Best Practices
Use Column Families for Large Rows
Group frequently updated columns separately from large, seldom-updated columns.
CREATE TABLE users (
    id UUID PRIMARY KEY,
    email STRING,
    name STRING,
    -- Large, infrequently updated
    profile_image BYTES,
    bio TEXT,
    FAMILY "primary" (id, email, name),
    FAMILY large_data (profile_image, bio)
);
Choose Appropriate Data Types
For IDs: Use UUID instead of SERIAL, e.g. id UUID DEFAULT gen_random_uuid()
For Money: Use DECIMAL for exact precision
For Timestamps: Use TIMESTAMPTZ to preserve time zones, e.g. created_at TIMESTAMPTZ DEFAULT now()
For JSON: Use JSONB for flexible schemas
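The money guidance applies client-side as well: binary floats accumulate rounding error that Python's Decimal (matching the column's DECIMAL type) avoids:

```python
from decimal import Decimal

# Binary floating point cannot represent 0.1 or 0.2 exactly
float_total = 0.1 + 0.2            # → 0.30000000000000004

# Decimal arithmetic is exact for decimal fractions
exact_total = Decimal("0.1") + Decimal("0.2")  # → Decimal('0.3')
```

Pairing DECIMAL columns with Decimal values end to end keeps monetary amounts exact through both the database and the application.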
Application Architecture Best Practices
Retry Idempotently
Ensure retried operations are idempotent to avoid duplicate work.
# Use a unique constraint to prevent duplicates
def create_user_idempotent(session, user_id, email):
    try:
        user = User(id=user_id, email=email)
        session.add(user)
        session.flush()  # surface the IntegrityError now, not at commit
    except IntegrityError:
        # User already exists - this is okay for retries
        session.rollback()
        user = session.query(User).filter_by(id=user_id).one()
    return user
Use IMPORT INTO for Bulk Loads
For large data imports, use IMPORT INTO instead of INSERT statements.
IMPORT INTO accounts (id, balance)
CSV DATA ('s3://bucket/accounts.csv')
WITH delimiter = ',', skip = '1';
This bypasses the SQL layer and writes directly to storage, making it 10-100x faster.
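IMPORT INTO reads flat files from cloud storage, so the load pipeline usually ends with writing a CSV. A sketch (`accounts_csv` is a hypothetical helper) producing a file compatible with the skip = '1' header option above:

```python
import csv
import io
import uuid

def accounts_csv(balances):
    """Hypothetical helper: render account rows as CSV for IMPORT INTO."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(["id", "balance"])  # header row, skipped by WITH skip = '1'
    for balance in balances:
        writer.writerow([uuid.uuid4(), balance])
    return buf.getvalue()

data = accounts_csv([1000, 2000])
```

The resulting string would then be uploaded to the bucket referenced in the IMPORT statement.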
Monitor and Log Slow Queries
Enable slow query logging to identify performance issues.
SET CLUSTER SETTING sql.log.slow_query.latency_threshold = '100ms';
Language-Specific Best Practices
Java (JDBC)
Enable batch rewriting
String url = jdbcUrl + "?reWriteBatchedInserts=true" ;
This provides 2-3x performance improvement for batched inserts.
Use batch size of 128
PGJDBC performs best with power-of-2 batch sizes.
Use PKCS8 key format
For certificate authentication, convert keys to PKCS8 format:
openssl pkcs8 -topk8 -inform PEM -outform DER \
  -in client.key -out client.pk8 -nocrypt
Python (SQLAlchemy)
Always use run_transaction() from sqlalchemy-cockroachdb
Replace postgresql:// with cockroachdb:// in connection strings
Never call session.commit(), session.rollback(), or session.flush() inside run_transaction()
Node.js
Use the pg driver’s native connection pooling
Implement exponential backoff for transaction retries
Set statement_timeout for long-running queries
Testing Best Practices
Test Transaction Retries
Simulate contention to ensure your retry logic works:
-- Force retry errors for testing
SET CLUSTER SETTING sql.defaults.inject_retry_errors_enabled = true;
SET inject_retry_errors_enabled = true;
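Retry behavior can also be unit-tested without a cluster by faking a client that raises a 40001-style error a few times before succeeding (FakeRetryError and the simplified loop below are illustrative, not a real driver API):

```python
class FakeRetryError(Exception):
    """Stands in for a driver error carrying SQLSTATE 40001."""
    pgcode = "40001"

def retry_txn(operation, max_retries=15):
    # Simplified mirror of the retry loop shown earlier
    # (no BEGIN/COMMIT and no backoff, to keep the test fast).
    for _ in range(max_retries):
        try:
            return operation()
        except FakeRetryError:
            continue
    raise RuntimeError("Transaction max retry limit reached")

calls = {"count": 0}

def flaky_operation():
    # Fails twice with a retryable error, then succeeds
    calls["count"] += 1
    if calls["count"] < 3:
        raise FakeRetryError()
    return "committed"

result = retry_txn(flaky_operation)  # → "committed" after 3 attempts
```

A test like this verifies the loop retries only on the retryable error type and stops as soon as the operation succeeds.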
Use Separate Test Database
Create isolated test databases for integration tests:
CREATE DATABASE test_db;
USE test_db;
Summary
Always Implement
Transaction retry logic with exponential backoff
Connection pooling
UUID primary keys (not SERIAL)
Multi-row DML statements
Never Do
Long-running transactions
Sequential primary keys in high-write tables
Missing indexes on WHERE clauses
Ignoring retry errors
Next Steps
Client Drivers Install drivers for your programming language
ORM Support Learn about ORM frameworks
Performance Tuning Advanced performance optimization techniques
Example Applications View complete example apps in your language