SQL Performance Best Practices
DML Operations
Use multi-row statements
- Reduces network round trips
- Decreases transaction overhead
- Improves overall throughput
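For example, one multi-row INSERT replaces several single-row statements (table and column names here are illustrative):

```sql
-- One round trip and one transaction instead of three:
INSERT INTO orders (customer_id, amount)
VALUES (1001, 19.99), (1002, 4.50), (1003, 37.25);
```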
Use UPSERT for tables without secondary indexes
UPSERT is faster than INSERT ON CONFLICT. UPSERT uses a blind write without checking for existing data, which is faster when there are no secondary indexes to maintain.
Bulk Operations
Bulk insert best practices
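A common pattern is to split a large load into moderately sized multi-row batches rather than issuing millions of single-row inserts or one enormous statement. A minimal sketch, with illustrative names and batch size:

```sql
-- Insert in batches of a few hundred to a few thousand rows each,
-- repeating the statement until the load completes.
INSERT INTO events (id, payload)
VALUES (gen_random_uuid(), 'a'),
       (gen_random_uuid(), 'b');  -- ...more rows per batch
```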
Bulk delete strategies
TRUNCATE benefits:
- Drops and recreates the table (no row-by-row deletion)
- Significantly faster than DELETE for removing all rows
- Releases disk space immediately
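As a sketch of the two strategies (TRUNCATE for all rows; batched DELETE, a common complementary approach, for subsets), with illustrative names:

```sql
-- Remove all rows: TRUNCATE avoids row-by-row deletion.
TRUNCATE TABLE audit_log;

-- Remove a subset: delete in batches to keep transactions small,
-- repeating until no rows match.
DELETE FROM audit_log WHERE created_at < '2023-01-01' LIMIT 5000;
```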
Unique ID Generation
The approach to generating unique IDs has significant performance implications in distributed databases.
Multi-column primary keys (best performance)
- Distributes data across the cluster (avoids hotspots)
- Keeps any monotonically increasing column out of the first key position
- Natural clustering for query patterns
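A minimal sketch of a composite primary key, with illustrative names — the leading column distributes writes, and the timestamp is deliberately not first:

```sql
CREATE TABLE measurements (
  device_id   UUID,
  recorded_at TIMESTAMPTZ,
  value       FLOAT,
  PRIMARY KEY (device_id, recorded_at)
);
```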
UUID and hash-sharded indexes (good performance)
- gen_random_uuid(): Widely compatible, random distribution
- unique_rowid(): 64-bit integer, good distribution
- Hash-sharded indexes: Use when you need sequential IDs with distributed writes
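Both options can be sketched as follows (table names illustrative; `USING HASH` uses the default bucket count):

```sql
-- Random UUID primary key: writes spread across ranges.
CREATE TABLE users (
  id    UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  email STRING
);

-- Sequential values with distributed writes via a hash-sharded index.
CREATE TABLE logs (
  ts  TIMESTAMPTZ,
  msg STRING,
  PRIMARY KEY (ts) USING HASH
);
```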
Avoid sequential IDs (poor performance)
- All writes go to the same range (the one containing the highest IDs)
- Creates CPU and I/O hotspot on one node
- Limits write throughput to single-node capacity
Index Optimization
Index columns in WHERE clauses
- Put equality conditions first
- Put range conditions last
- Consider query patterns when ordering columns
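The ordering rule can be sketched as follows (names illustrative): the equality column comes first, the range column last, so the index serves both conditions.

```sql
-- Equality column (status) first, range column (created_at) last.
CREATE INDEX ON orders (status, created_at);

-- Served efficiently by the index above:
SELECT id FROM orders
WHERE status = 'pending' AND created_at > now() - INTERVAL '1 day';
```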
Avoid SELECT * for large tables
- Reduces network transfer
- Uses less memory
- May use covering indexes
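Selecting only the needed columns lets a covering index answer the query without touching the primary index. A sketch with illustrative names:

```sql
-- STORING makes the index cover the query below.
CREATE INDEX ON orders (customer_id) STORING (amount);

SELECT customer_id, amount FROM orders WHERE customer_id = 1001;
```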
Use column families for hot/cold data
- Reduces write amplification
- Smaller writes when updating hot columns
- Better compaction performance
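A minimal sketch, with illustrative names — the frequently updated counter lives in its own family, so updates rewrite only that family's data:

```sql
CREATE TABLE pages (
  id         UUID PRIMARY KEY,
  title      STRING,
  body       STRING,
  view_count INT,
  FAMILY cold (id, title, body),
  FAMILY hot (view_count)
);
```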
Query Optimization
Use EXPLAIN to analyze queries
Use EXPLAIN to understand query plans and watch for:
- Full table scans (scan without index)
- High row counts
- Expensive operations (sort, hash join)
- Missing statistics
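For example (table name illustrative):

```sql
EXPLAIN SELECT * FROM orders WHERE customer_id = 1001;

-- For actual execution statistics, not just the plan:
EXPLAIN ANALYZE SELECT * FROM orders WHERE customer_id = 1001;
```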
Use AS OF SYSTEM TIME for long-running queries
- No transaction conflicts with writes
- Doesn’t affect write latency
- Can read historical data
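For example, a long-running report can read a consistent snapshot from slightly in the past (table name illustrative):

```sql
-- Reads data as of ~10 seconds ago; never conflicts with concurrent writes.
SELECT count(*) FROM orders AS OF SYSTEM TIME '-10s';
```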
Cluster Performance Configuration
CPU and Concurrency
CPU planning
- Minimum: 4 vCPUs per node for stability
- Recommended: 8-16 vCPUs per node for production
- Maximum tested: 32 vCPUs per node
Connection pooling
Example sizing:
- Total vCPUs in cluster: 24
- Max connections per app instance: 96
- Min connections per app instance: 24
Memory Configuration
Memory allocation
- Minimum: 16 GB per node
- Recommended: 32-64 GB per node
- Cache + SQL memory should be ≤50% of total RAM
Prevent OOM issues
- Disable swap: sudo swapoff -a
- Set appropriate --max-sql-memory
- Optimize memory-intensive queries
- Monitor diagnostics.active_query_dumps.enabled for memory pressure dumps
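Assuming the name above refers to a cluster setting, it can be toggled from SQL:

```sql
-- Enable dumps of active queries under memory pressure.
SET CLUSTER SETTING diagnostics.active_query_dumps.enabled = true;
```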
Storage Performance
Disk I/O requirements
- Minimum IOPS: 500 IOPS per node
- Recommended: 2000+ IOPS per node
- Disk type: SSD or NVMe (not HDD)
- Latency: Under 10ms for sustained operations
- AWS: Use gp3 (baseline 3000 IOPS) or io2 volumes
- GCP: Use pd-ssd or pd-extreme disks
- Azure: Use Premium SSD or Ultra Disk
Monitor LSM health
- L0 sublevels: Under 20 healthy, 20-100 warning, over 100 critical
- Read amplification: Under 10 healthy, 10-20 warning, over 20 critical
If LSM health degrades:
- Reduce write rate temporarily
- Increase CPU allocation for compaction
- Check kv.snapshot_rebalance.max_rate isn't too high
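To inspect the rebalance rate setting named above:

```sql
SHOW CLUSTER SETTING kv.snapshot_rebalance.max_rate;
```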
Replication and Distribution
Replication factor
- 3 replicas: Standard, tolerates 1 node failure
- 5 replicas: High availability, tolerates 2 node failures, higher write latency
- 1 replica: Testing only, no fault tolerance
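The replication factor is set per zone; a sketch for a database (name illustrative):

```sql
-- Five replicas: tolerates two simultaneous node failures.
ALTER DATABASE critical_app CONFIGURE ZONE USING num_replicas = 5;
```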
Geo-partitioning for multi-region
- Lower latency for regional users
- Compliance with data domiciling requirements
- Reduced cross-region traffic
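A sketch using multi-region SQL (database, table, and region names illustrative; the regions must be defined for the cluster):

```sql
ALTER DATABASE app SET PRIMARY REGION "us-east1";
ALTER DATABASE app ADD REGION "eu-west1";

-- Each row is homed in the region recorded for it.
ALTER TABLE users SET LOCALITY REGIONAL BY ROW;
```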
Performance Monitoring
Key metrics to watch
- P99 SQL latency: Under 100ms for OLTP
- P99 commit latency: Under 50ms single region
- QPS: Based on workload, watch trends
- Queries per second per vCPU: ~1000-2000
- CPU: 30-80% normal, over 80% investigate
- Memory: Under 90% total, under 75% SQL memory
- Disk: Under 85% capacity, under 80% sustained IOPS
- Under-replicated ranges: 0
- Unavailable ranges: 0
- Rebalancing: Minimal during steady state
Troubleshooting Performance Issues
High latency
- Add indexes for slow queries
- Optimize query plans
- Reduce transaction contention
- Check CPU and disk I/O
Write hotspots
- Use UUID or composite primary keys (not sequential)
- Enable hash-sharded indexes
- Review table design for hotspot causes
Memory pressure
- Increase node memory
- Reduce --max-sql-memory
- Optimize memory-intensive queries
- Enable diagnostics.active_query_dumps.enabled