Architecture
Parallel queries in YugabyteDB leverage PostgreSQL’s parallel execution framework.
Parallel Execution Flow
- Query Planning: Planner determines if parallelism is beneficial
- Worker Spawning: Leader process spawns background worker processes
- Work Distribution: Each worker scans a portion of the data
- Partial Aggregation: Workers perform local aggregations/sorts
- Result Gathering: Leader gathers and combines worker results
- Final Processing: Leader completes final aggregation/sorting
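The flow above is visible directly in a query plan: the Gather node sits between the leader's finalize step and the workers' partial work. A sketch using a hypothetical `orders` table (node shapes are typical, not exact output):

```sql
EXPLAIN (COSTS OFF) SELECT count(*) FROM orders;

-- A parallel plan typically has this shape:
--   Finalize Aggregate              (leader: final processing)
--     -> Gather                     (leader: result gathering)
--          Workers Planned: 2       (worker spawning)
--          -> Partial Aggregate     (workers: partial aggregation)
--               -> Parallel Seq Scan on orders   (work distribution)
```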
Supported Operations
Operations that can be parallelized:
- Parallel Sequential Scan: Table scans distributed across workers
- Parallel Index Scan: Index range scans parallelized
- Parallel Bitmap Heap Scan: Bitmap index scans with parallel heap access
- Parallel Aggregation: GROUP BY and aggregate functions
- Parallel Hash Joins: Hash join build/probe phases parallelized
- Parallel Append: For partitioned tables (UNION ALL)
Current Support
YugabyteDB v2025.2+: Parallel queries supported for colocated tables. Support for hash-sharded and range-sharded tables is planned.
Colocated Tables
Colocated tables store multiple tables in a single tablet, making them ideal for parallel processing.
Configuration
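Since parallel queries currently apply to colocated tables, the first configuration step is a colocated setup. A minimal sketch, assuming current YugabyteDB syntax and hypothetical names (`analytics`, `orders`):

```sql
-- Create a colocated database: tables in it share a single tablet by default
CREATE DATABASE analytics WITH COLOCATION = true;

-- After connecting to analytics, tables are colocated by default
CREATE TABLE orders (
    id          bigint PRIMARY KEY,
    customer_id bigint,
    total       numeric,
    created_at  timestamptz
);
```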
Enabling Parallel Queries
For YugabyteDB v2025.2+ universes:
- yugabyted deployments: Enabled by default
- YugabyteDB Anywhere: Enabled by default
- YugabyteDB Aeon: Enabled by default
- Parallel append automatically enabled
YugabyteDB-Specific Parameters
yb_parallel_range_rows
- Controls parallel query activation threshold
- Each worker processes approximately this many rows
- 0 = parallel queries disabled
- Recommended: 10000 for analytical workloads
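A session-level sketch of the recommended setting (table name hypothetical):

```sql
-- ~10000 rows per worker suits analytical scans; 0 disables parallel queries
SET yb_parallel_range_rows = 10000;

-- Verify a parallel plan is chosen (look for a Gather node)
EXPLAIN (COSTS OFF) SELECT count(*) FROM orders;
```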
PostgreSQL Parallel Parameters
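The standard PostgreSQL parallel-query settings also apply. A typical analytical-session sketch (values shown are PostgreSQL defaults except the worker counts):

```sql
SET max_parallel_workers_per_gather = 4;   -- workers per Gather node
SET max_parallel_workers = 8;              -- total parallel workers across the instance
SET min_parallel_table_scan_size = '8MB';  -- smallest table considered for a parallel scan
SET parallel_setup_cost = 1000;            -- planner cost of launching workers
SET parallel_tuple_cost = 0.1;             -- planner cost per tuple passed to the leader
```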
Query Examples
Parallel Sequential Scan
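A sketch against a hypothetical `orders` table; in the output, look for a Gather node with a Parallel Seq Scan beneath it:

```sql
EXPLAIN (ANALYZE, COSTS OFF)
SELECT count(*) FROM orders WHERE total > 100;
```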
Parallel Append (Partitioned Tables)
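A sketch assuming a hypothetical `measurements` table range-partitioned by month; the plan should contain a Parallel Append node under the Gather, letting different workers scan different partitions:

```sql
EXPLAIN (COSTS OFF)
SELECT count(*) FROM measurements
WHERE recorded_at >= date '2025-01-01';
```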
Parallel Hash Join
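A sketch with hypothetical `customers` and `orders` tables; look for Parallel Hash Join and Parallel Hash nodes under the Gather, indicating both the build and probe phases are shared across workers:

```sql
EXPLAIN (COSTS OFF)
SELECT c.region, sum(o.total)
FROM customers c
JOIN orders o ON o.customer_id = c.id
GROUP BY c.region;
```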
Parallel Aggregation
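A sketch on the same hypothetical `orders` table; workers compute Partial Aggregate results and the leader combines them in a Finalize Aggregate above the Gather:

```sql
EXPLAIN (ANALYZE, COSTS OFF)
SELECT customer_id, avg(total), count(*)
FROM orders
GROUP BY customer_id;
```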
Performance Tuning
Determining Optimal Worker Count
- Start with max_parallel_workers_per_gather = 2
- Increase for very large tables (millions of rows)
- Diminishing returns typically after 4-8 workers
- Avoid more workers than available CPU cores
Cost-Based Tuning
Adjust costs to influence planner decisions.
Forcing Parallel Execution
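One cost-based way to push the planner toward parallel plans (useful for testing, not recommended as a permanent setting) is to make parallelism effectively free for the session:

```sql
-- Zero out the planner's parallel overhead estimates
SET parallel_setup_cost = 0;
SET parallel_tuple_cost = 0;
SET min_parallel_table_scan_size = 0;

-- The planner should now prefer a parallel plan even for small tables
EXPLAIN (COSTS OFF) SELECT count(*) FROM orders;  -- table name hypothetical
```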
Disabling Parallel Execution
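To disable parallel plans, set the per-Gather worker limit to zero for the session:

```sql
-- With zero workers per Gather, the planner only considers serial plans
SET max_parallel_workers_per_gather = 0;
```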
Monitoring and Analysis
EXPLAIN Output Analysis
- Workers Planned: Number of parallel workers planned
- Workers Launched: Actual workers spawned (may be fewer than planned)
- Execution Time: Compare parallel vs. serial execution
- Buffers: I/O operations (shared hit, read)
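A typical invocation (table name hypothetical); the Gather node in the output reports the Workers Planned and Workers Launched counts, and BUFFERS adds the I/O detail:

```sql
EXPLAIN (ANALYZE, BUFFERS)
SELECT count(*) FROM orders;
```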
Performance Comparison
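A simple way to compare serial and parallel execution of the same query is to toggle the worker limit within one session and compare the reported Execution Time:

```sql
-- Serial baseline
SET max_parallel_workers_per_gather = 0;
EXPLAIN ANALYZE SELECT count(*) FROM orders;  -- table name hypothetical

-- Parallel run
SET max_parallel_workers_per_gather = 4;
EXPLAIN ANALYZE SELECT count(*) FROM orders;
```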
System Resource Monitoring
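Each parallel worker is a separate backend, so it shows up in pg_stat_activity while a parallel query runs:

```sql
-- Active parallel workers and what they are running
SELECT pid, backend_type, state, query
FROM pg_stat_activity
WHERE backend_type = 'parallel worker';
```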
Use Cases
Analytical Dashboards
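Dashboard queries typically aggregate over large time ranges, which is exactly the shape that benefits from parallel scan plus partial aggregation. A sketch with a hypothetical schema:

```sql
-- Daily rollup over the last 90 days
SELECT date_trunc('day', created_at) AS day,
       count(*)   AS orders,
       sum(total) AS revenue
FROM orders
WHERE created_at >= now() - interval '90 days'
GROUP BY 1
ORDER BY 1;
```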
Report Generation
Data Export
Best Practices
- Start Conservative:
  - Begin with max_parallel_workers_per_gather = 2
  - Increase gradually based on performance testing
  - Monitor CPU and memory usage
- Use for Appropriate Workloads:
  - ✅ Analytical queries on large datasets
  - ✅ Aggregations and reporting
  - ✅ Batch processing
  - ❌ OLTP queries (single-row lookups)
  - ❌ Short-running queries (overhead > benefit)
- Resource Awareness:
  - Don’t exceed available CPU cores
  - Account for concurrent queries
  - Monitor memory usage (each worker uses memory)
- Testing and Benchmarking:
  - Use EXPLAIN (ANALYZE) to verify parallelism
  - Compare execution times: parallel vs. serial
  - Test with production-like data volumes
- Configuration Management:
  - Set session-level parameters for specific queries
  - Use connection pooling to control parallelism
  - Document optimal settings for critical queries
Limitations
Current Limitations
- Table Types: Parallel queries currently supported only for colocated tables
- Hash/Range-Sharded Tables: Parallel execution support planned for future releases
PostgreSQL Limitations (Inherited)
- Transaction Isolation: Some operations not parallelizable in certain isolation levels
- Write Queries: DML operations (INSERT/UPDATE/DELETE) not parallelized
- Cursors: DECLARE CURSOR queries not parallelized
- CTEs: Recursive CTEs and data-modifying CTEs not parallelized
When Parallelism Is Not Used
The planner may avoid parallelism when:
- Table is too small (below min_parallel_table_scan_size)
- Setup cost exceeds estimated benefit
- Maximum workers already in use
- Query contains non-parallel-safe functions
- Transaction in serializable isolation level

