Follow these best practices to build high-performance, reliable applications with CockroachDB.
Transaction Best Practices
Implement Transaction Retry Logic
CockroachDB may require clients to retry transactions due to contention. All production applications must implement retry logic.
Applications that do not implement transaction retry logic will fail under contention with SQLSTATE 40001 ("restart transaction") errors.
async function retryTxn(client, operation, maxRetries = 15) {
  for (let i = 0; i < maxRetries; i++) {
    try {
      await client.query('BEGIN')
      const result = await operation(client)
      await client.query('COMMIT')
      return result
    } catch (err) {
      await client.query('ROLLBACK')
      // Check for retry error code
      if (err.code !== '40001') {
        throw err
      }
      // Exponential backoff
      const sleepMs = Math.pow(2, i) * 100
      await new Promise(resolve => setTimeout(resolve, sleepMs))
    }
  }
  throw new Error('Transaction max retry limit reached')
}

// Usage
await retryTxn(client, async (txnClient) => {
  await txnClient.query(
    'UPDATE accounts SET balance = balance - $1 WHERE id = $2',
    [100, fromId]
  )
  await txnClient.query(
    'UPDATE accounts SET balance = balance + $1 WHERE id = $2',
    [100, toId]
  )
})
from sqlalchemy_cockroachdb import run_transaction

def transfer_funds(session, from_id, to_id, amount):
    from_account = session.query(Account).filter_by(id=from_id).one()
    to_account = session.query(Account).filter_by(id=to_id).one()
    from_account.balance -= amount
    to_account.balance += amount

# run_transaction handles retries automatically
run_transaction(
    sessionmaker,
    lambda session: transfer_funds(session, id1, id2, 100)
)
import crdbpgx "github.com/cockroachdb/cockroach-go/v2/crdb/crdbpgxv5"

err := crdbpgx.ExecuteTx(ctx, conn, pgx.TxOptions{}, func(tx pgx.Tx) error {
    // Deduct from sender
    _, err := tx.Exec(ctx,
        "UPDATE accounts SET balance = balance - $1 WHERE id = $2",
        amount, fromID)
    if err != nil {
        return err
    }
    // Add to recipient
    _, err = tx.Exec(ctx,
        "UPDATE accounts SET balance = balance + $1 WHERE id = $2",
        amount, toID)
    return err
})
public static void transferFunds(
    Connection conn, UUID fromId, UUID toId, BigDecimal amount
) throws SQLException, InterruptedException {
    int maxRetries = 15;
    for (int i = 0; i < maxRetries; i++) {
        try {
            conn.setAutoCommit(false);
            try (PreparedStatement ps = conn.prepareStatement(
                    "UPDATE accounts SET balance = balance - ? WHERE id = ?")) {
                ps.setBigDecimal(1, amount);
                ps.setObject(2, fromId);
                ps.executeUpdate();
            }
            try (PreparedStatement ps = conn.prepareStatement(
                    "UPDATE accounts SET balance = balance + ? WHERE id = ?")) {
                ps.setBigDecimal(1, amount);
                ps.setObject(2, toId);
                ps.executeUpdate();
            }
            conn.commit();
            return;
        } catch (SQLException e) {
            conn.rollback();
            // Only retry on serialization failures (SQLSTATE 40001)
            if (!"40001".equals(e.getSQLState())) {
                throw e;
            }
            // Exponential backoff
            Thread.sleep((long) Math.pow(2, i) * 100);
        }
    }
    throw new SQLException("Transaction max retry limit reached");
}
Keep Transactions Short
Long-running transactions can cause performance issues and increase retry errors.
Do

# Good: Short, focused transaction
def update_balance(session, account_id, amount):
    account = session.query(Account).get(account_id)
    account.balance += amount

run_transaction(sessionmaker,
    lambda s: update_balance(s, id, 100))
Don't

# Bad: Long transaction with external calls
def process_payment(session, account_id, amount):
    account = session.query(Account).get(account_id)
    # External API call - DON'T DO THIS
    response = requests.post('https://api.payment.com/charge')
    account.balance -= amount
DML Best Practices
Use Multi-Row Statements
Multi-row statements are significantly faster than multiple single-row statements.
Good: Multi-row INSERT

INSERT INTO accounts (id, balance) VALUES
  (gen_random_uuid(), 1000),
  (gen_random_uuid(), 2000),
  (gen_random_uuid(), 3000);

Bad: Multiple single-row INSERTs

INSERT INTO accounts (id, balance) VALUES (gen_random_uuid(), 1000);
INSERT INTO accounts (id, balance) VALUES (gen_random_uuid(), 2000);
INSERT INTO accounts (id, balance) VALUES (gen_random_uuid(), 3000);
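On the driver side, the multi-row form means sending one statement with many placeholder groups. A minimal sketch of building such a statement (`build_multirow_insert` is a hypothetical helper; psycopg-style `%s` placeholders are assumed):

```python
def build_multirow_insert(table, columns, n_rows):
    """Build a single INSERT with one placeholder group per row.

    Hypothetical helper for psycopg-style drivers that use %s placeholders;
    parameters are still passed separately to the driver for safe escaping.
    """
    group = "(" + ", ".join(["%s"] * len(columns)) + ")"
    placeholders = ", ".join([group] * n_rows)
    return f"INSERT INTO {table} ({', '.join(columns)}) VALUES {placeholders}"

sql = build_multirow_insert("accounts", ["id", "balance"], 3)
# → INSERT INTO accounts (id, balance) VALUES (%s, %s), (%s, %s), (%s, %s)
```

The flattened parameter list (one entry per placeholder) is then passed alongside this string in a single execute call.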
Use UPSERT for Insert-or-Update
On tables with no secondary indexes, UPSERT is faster than INSERT ON CONFLICT.
Preferred: UPSERT

UPSERT INTO accounts (id, balance)
VALUES ('123e4567-e89b-12d3-a456-426614174000', 1000);

Slower: INSERT ON CONFLICT

INSERT INTO accounts (id, balance)
VALUES ('123e4567-e89b-12d3-a456-426614174000', 1000)
ON CONFLICT (id) DO UPDATE SET balance = excluded.balance;
Batch Large Operations
For bulk inserts, use batches of 100-1000 rows. Experiment to find the optimal size.
BATCH_SIZE = 500

for i in range(0, len(records), BATCH_SIZE):
    batch = records[i:i + BATCH_SIZE]

    def insert_batch(session):
        session.bulk_insert_mappings(Account, batch)

    run_transaction(sessionmaker, insert_batch)
int BATCH_SIZE = 128; // Use powers of 2 for JDBC

connection.setAutoCommit(false);
try (PreparedStatement pstmt = connection.prepareStatement(
        "INSERT INTO accounts (id, balance) VALUES (?, ?)")) {
    for (int i = 0; i < records.size(); i++) {
        pstmt.setObject(1, UUID.randomUUID());
        pstmt.setInt(2, records.get(i).balance);
        pstmt.addBatch();
        if ((i + 1) % BATCH_SIZE == 0) {
            pstmt.executeBatch();
        }
    }
    pstmt.executeBatch(); // Execute remaining
    connection.commit();
}
For JDBC specifically, use a batch size of 128 (or other powers of 2) for optimal performance.
Primary Key Best Practices
Avoid Sequential Primary Keys
Sequential IDs (like SERIAL) create hotspots that hurt performance in distributed databases.
Recommended: UUID v4 via gen_random_uuid()

CREATE TABLE accounts (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    balance INT
);

Spreads data evenly across the cluster.

Avoid: Sequential SERIAL

CREATE TABLE accounts (
    id SERIAL PRIMARY KEY,
    balance INT
);

Creates hotspots on a single range.
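The hotspot effect can be illustrated without a cluster. In a toy model where the keyspace is split evenly into ranges by the first key byte (an illustration only, not CockroachDB's actual range-split logic), sequential IDs all land in one range while random UUIDs spread across all of them:

```python
import uuid

def range_for(key_bytes, num_ranges=8):
    # Toy model: split the keyspace evenly into ranges by the first key byte.
    return key_bytes[0] * num_ranges // 256

# Sequential 8-byte big-endian IDs, like an incrementing counter
sequential = [i.to_bytes(8, "big") for i in range(1, 1001)]
# Random UUID v4 keys
random_ids = [uuid.uuid4().bytes for _ in range(1000)]

seq_ranges = {range_for(k) for k in sequential}    # every write hits one range
rand_ranges = {range_for(k) for k in random_ids}   # writes spread over all ranges
```

Sequential keys share a common prefix, so every insert lands on the same range (and the same node), while random keys distribute the write load.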
Use Multi-Column Primary Keys
For best performance, design composite primary keys with:
A well-distributed prefix (e.g., user ID)
A monotonically increasing suffix (e.g., timestamp)
CREATE TABLE posts (
    username STRING,
    post_timestamp TIMESTAMP,
    post_id UUID,
    content TEXT,
    PRIMARY KEY (username, post_timestamp, post_id)
);

This allows efficient queries like:

SELECT * FROM posts
WHERE username = 'alice'
ORDER BY post_timestamp DESC
LIMIT 10;
Connection Management
Use Connection Pooling
Always use connection pooling in production applications.
Node.js (pg)
Python (SQLAlchemy)
Java (HikariCP)
const { Pool } = require('pg')

const pool = new Pool({
  connectionString: process.env.DATABASE_URL,
  max: 20,                       // Maximum pool size
  idleTimeoutMillis: 30000,      // Close idle clients after 30s
  connectionTimeoutMillis: 2000,
})

// Use pool instead of individual clients
const result = await pool.query('SELECT * FROM accounts')
from sqlalchemy import create_engine

engine = create_engine(
    db_uri,
    pool_size=10,        # Base pool size
    max_overflow=20,     # Allow up to 30 total connections
    pool_pre_ping=True,  # Verify connections before use
    pool_recycle=3600,   # Recycle connections after 1 hour
)
HikariConfig config = new HikariConfig();
config.setJdbcUrl(jdbcUrl);
config.setMaximumPoolSize(20);
config.setMinimumIdle(5);
config.setConnectionTimeout(2000);
config.setIdleTimeout(30000);
HikariDataSource dataSource = new HikariDataSource(config);
Set Appropriate Timeouts
SET statement_timeout = '30s';
SET idle_in_transaction_session_timeout = '60s';
Index Best Practices
Index WHERE Clause Columns
Create indexes on columns frequently used in WHERE clauses.
-- Query pattern
SELECT * FROM orders WHERE customer_id = $1 AND status = 'pending';

-- Appropriate index
CREATE INDEX ON orders (customer_id, status);
Use Covering Indexes
Include frequently selected columns in the index to avoid table lookups.
CREATE INDEX orders_by_customer ON orders (customer_id)
STORING (order_date, total_amount);
Now this query only reads the index:
SELECT order_date, total_amount
FROM orders
WHERE customer_id = $1;
Avoid Too Many Indexes
Each index slows down writes. Only create indexes you actually use.
Schema Design Best Practices
Use Column Families for Large Rows
Group frequently updated columns separately from large, seldom-updated columns.
CREATE TABLE users (
    id UUID PRIMARY KEY,
    email STRING,
    name STRING,
    -- Large, infrequently updated
    profile_image BYTES,
    bio TEXT,
    FAMILY "primary" (id, email, name),
    FAMILY large_data (profile_image, bio)
);
Choose Appropriate Data Types
For IDs: Use UUID instead of SERIAL, e.g. id UUID DEFAULT gen_random_uuid()
For Money: Use DECIMAL for exact precision
For Timestamps: Use TIMESTAMPTZ to preserve time zones, e.g. created_at TIMESTAMPTZ DEFAULT now()
For JSON: Use JSONB for flexible schemas
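The money guidance applies client-side as well: binary floats accumulate rounding error that Python's Decimal (matching the column's DECIMAL type) avoids:

```python
from decimal import Decimal

# Binary floating point cannot represent 0.1 or 0.2 exactly
float_total = 0.1 + 0.2            # → 0.30000000000000004

# Decimal arithmetic is exact for decimal fractions
exact_total = Decimal("0.1") + Decimal("0.2")  # → Decimal('0.3')
```

Pairing DECIMAL columns with Decimal values end to end keeps monetary amounts exact through both the database and the application.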
Application Architecture Best Practices
Retry Idempotently
Ensure retried operations are idempotent to avoid duplicate work.
# Use a unique constraint to prevent duplicates
def create_user_idempotent(session, user_id, email):
    try:
        user = User(id=user_id, email=email)
        session.add(user)
        session.flush()  # surface the IntegrityError now, not at commit
    except IntegrityError:
        # User already exists - this is okay for retries
        session.rollback()
        user = session.query(User).filter_by(id=user_id).one()
    return user
Use IMPORT INTO for Bulk Loads
For large data imports, use IMPORT INTO instead of INSERT statements.
IMPORT INTO accounts (id, balance)
CSV DATA ('s3://bucket/accounts.csv')
WITH delimiter = ',', skip = '1';
This bypasses the SQL layer and writes directly to storage, making it 10-100x faster.
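IMPORT INTO reads flat files from cloud storage, so the load pipeline usually ends with writing a CSV. A sketch (`accounts_csv` is a hypothetical helper) producing a file compatible with the skip = '1' header option above:

```python
import csv
import io
import uuid

def accounts_csv(balances):
    """Hypothetical helper: render account rows as CSV for IMPORT INTO."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(["id", "balance"])  # header row, skipped by WITH skip = '1'
    for balance in balances:
        writer.writerow([uuid.uuid4(), balance])
    return buf.getvalue()

data = accounts_csv([1000, 2000])
```

The resulting string would then be uploaded to the bucket referenced in the IMPORT statement.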
Monitor and Log Slow Queries
Enable slow query logging to identify performance issues.
SET CLUSTER SETTING sql.log.slow_query.latency_threshold = '100ms';
Language-Specific Best Practices
Java (JDBC)
Enable batch rewriting
String url = jdbcUrl + "?reWriteBatchedInserts=true" ;
This provides 2-3x performance improvement for batched inserts.
Use batch size of 128
PGJDBC performs best with power-of-2 batch sizes.
Use PKCS8 key format
For certificate authentication, convert keys to PKCS8 format:
openssl pkcs8 -topk8 -inform PEM -outform DER \
  -in client.key -out client.pk8 -nocrypt
Python (SQLAlchemy)
Always use run_transaction() from sqlalchemy-cockroachdb
Replace postgresql:// with cockroachdb:// in connection strings
Never call session.commit(), session.rollback(), or session.flush() inside run_transaction()
Node.js
Use the pg driver’s native connection pooling
Implement exponential backoff for transaction retries
Set statement_timeout for long-running queries
Testing Best Practices
Test Transaction Retries
Simulate contention to ensure your retry logic works:
-- Force retry errors for testing
SET CLUSTER SETTING sql.defaults.inject_retry_errors_enabled = true;
SET inject_retry_errors_enabled = true;
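Retry behavior can also be unit-tested without a cluster by faking a client that raises a 40001-style error a few times before succeeding (FakeRetryError and the simplified loop below are illustrative, not a real driver API):

```python
class FakeRetryError(Exception):
    """Stands in for a driver error carrying SQLSTATE 40001."""
    pgcode = "40001"

def retry_txn(operation, max_retries=15):
    # Simplified mirror of the retry loop shown earlier
    # (no BEGIN/COMMIT and no backoff, to keep the test fast).
    for _ in range(max_retries):
        try:
            return operation()
        except FakeRetryError:
            continue
    raise RuntimeError("Transaction max retry limit reached")

calls = {"count": 0}

def flaky_operation():
    # Fails twice with a retryable error, then succeeds
    calls["count"] += 1
    if calls["count"] < 3:
        raise FakeRetryError()
    return "committed"

result = retry_txn(flaky_operation)  # → "committed" after 3 attempts
```

A test like this verifies the loop retries only on the retryable error type and stops as soon as the operation succeeds.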
Use Separate Test Database
Create isolated test databases for integration tests:
CREATE DATABASE test_db;
USE test_db;
Summary
Always Implement
Transaction retry logic with exponential backoff
Connection pooling
UUID primary keys (not SERIAL)
Multi-row DML statements
Never Do
Long-running transactions
Sequential primary keys in high-write tables
Missing indexes on WHERE clauses
Ignoring retry errors
Next Steps
Client Drivers Install drivers for your programming language
ORM Support Learn about ORM frameworks
Performance Tuning Advanced performance optimization techniques
Example Applications View complete example apps in your language