Tips and techniques for building high-performance YugabyteDB applications
Follow these best practices to build efficient, scalable, and resilient applications on YugabyteDB. These guidelines cover data modeling, connection management, query optimization, and distributed system considerations.
Primary key design impacts data distribution and query performance.
Use HASH for Sequential IDs
For auto-incrementing IDs, use HASH partitioning to distribute data evenly:
-- Without HASH: Data concentrates on one tabletCREATE TABLE orders ( id SERIAL PRIMARY KEY, user_id INTEGER, total DECIMAL(10, 2));-- With HASH: Data distributed across tabletsCREATE TABLE orders ( id SERIAL, user_id INTEGER, total DECIMAL(10, 2), PRIMARY KEY (id HASH)) SPLIT INTO 16 TABLETS;
Composite Keys for Time-Series Data
Use composite keys for efficient time-range queries:
CREATE TABLE sensor_data ( sensor_id INTEGER, timestamp TIMESTAMP, temperature DOUBLE PRECISION, humidity DOUBLE PRECISION, PRIMARY KEY ((sensor_id) HASH, timestamp ASC)) SPLIT INTO 16 TABLETS;-- Efficient range querySELECT * FROM sensor_data WHERE sensor_id = 101 AND timestamp BETWEEN '2024-01-01' AND '2024-01-31';
Natural Keys for Even Distribution
Use naturally distributed keys when available:
-- UUID provides natural distributionCREATE TABLE users ( user_id UUID PRIMARY KEY DEFAULT gen_random_uuid(), username TEXT, email TEXT);-- Email as primary key (if unique and evenly distributed)CREATE TABLE user_profiles ( email TEXT PRIMARY KEY, name TEXT, preferences JSONB);
Include frequently queried columns in the index to avoid table lookups:
-- Regular index requires table lookupCREATE INDEX idx_users_email ON users(email);-- Covering index includes needed columnsCREATE INDEX idx_users_email_covering ON users(email) INCLUDE (username, created_at);-- This query uses index-only scanSELECT email, username, created_at FROM users WHERE email = '[email protected]';
Index only relevant rows to save space and improve write performance:
-- Index only active usersCREATE INDEX idx_active_usersON users(username) WHERE status = 'active';-- Index only recent ordersCREATE INDEX idx_recent_ordersON orders(created_at) WHERE created_at > NOW() - INTERVAL '90 days';
Index computed values for complex queries:
-- Index for case-insensitive searchCREATE INDEX idx_users_lower_emailON users(LOWER(email));SELECT * FROM users WHERE LOWER(email) = '[email protected]';-- Index for JSON fieldCREATE INDEX idx_product_categoryON products((attributes->>'category'));
Combine insert/update with select to reduce round-trips:
-- Single round-trip instead of twoINSERT INTO users (username, email) VALUES ('john_doe', '[email protected]')RETURNING id, created_at;-- Update and return in one operationUPDATE accounts SET balance = balance - 100WHERE account_id = 123RETURNING balance;
-- Bad: Long-running transactionBEGIN; SELECT * FROM large_table; -- Expensive operation -- ... complex processing ... UPDATE accounts SET balance = balance - 100 WHERE id = 1;COMMIT;-- Good: Short transaction-- Do expensive operations outside transactionSELECT * FROM large_table;-- ... process data ...BEGIN; UPDATE accounts SET balance = balance - 100 WHERE id = 1;COMMIT;
-- Analyze query execution planEXPLAIN (ANALYZE, COSTS, VERBOSE) SELECT u.username, COUNT(o.id) FROM users uLEFT JOIN orders o ON u.id = o.user_idGROUP BY u.username;-- Check if index is being usedEXPLAIN SELECT * FROM users WHERE email = '[email protected]';
-- Enable follower reads for current sessionSET yb_read_from_followers = true;-- Set staleness tolerance (in milliseconds)SET yb_follower_read_staleness_ms = 10000; -- 10 seconds-- Query will use nearest replicaSELECT * FROM users WHERE user_id = 'uuid';
-- Enable row-level securityALTER TABLE documents ENABLE ROW LEVEL SECURITY;-- Create policyCREATE POLICY user_documents ON documents FOR ALL TO app_user USING (user_id = current_setting('app.user_id')::INTEGER);-- Set user contextSET app.user_id = 123;-- User only sees their documentsSELECT * FROM documents; -- Filtered by policy