Overview
TeeBI includes parallel processing capabilities in the BI.Arrays.Parallel unit, enabling you to leverage multiple CPU cores for sorting and merging large arrays.
Parallel Array Sorting
The TParallelArray class provides parallel sorting using a hybrid QuickSort + InsertionSort algorithm with split & merge.
Basic Usage
Supported Array Types
TParallelArray.Sort supports multiple data types:
Thread Count Parameter
- 0: Auto-detect the CPU count using TThread.ProcessorCount (recommended)
- 1: Single-threaded sort (no parallelization)
- N > 1: Use N threads explicitly
- N < 0: Raises exception
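The rules above amount to a small validation step before any sorting starts. The following is a minimal Python sketch of that logic, not TeeBI code: resolve_thread_count is a hypothetical name, os.cpu_count stands in for Delphi's TThread.ProcessorCount, and ValueError is only a placeholder because this page does not say which exception TeeBI raises.

```python
import os


def resolve_thread_count(threads: int) -> int:
    """Map a requested thread count to the number of workers to use.

    Hypothetical helper illustrating the documented rules; it is not
    part of TeeBI.
    """
    if threads < 0:
        # Negative counts are rejected (placeholder exception type).
        raise ValueError("thread count must not be negative")
    if threads == 0:
        # Auto-detect. os.cpu_count() can return None, so fall back to 1.
        return os.cpu_count() or 1
    # 1 means single-threaded; any larger value is used as given.
    return threads
```

Passing 0 defers the decision to the machine the code runs on, which is why it is the recommended setting.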
Algorithm Details
Split & Merge Strategy
- Split: Divide array into N segments (one per thread)
- Parallel Sort: Each thread sorts its segment independently
- Merge: Sequentially merge sorted segments back together
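The three steps above can be sketched in a few lines. This is an illustrative Python outline under stated assumptions, not the TeeBI implementation: split_sort_merge and merge_sorted are hypothetical names, the segments are assumed to be contiguous and of near-equal size, and Python's built-in sorted stands in for the per-segment sort. In CPython the global interpreter lock keeps this CPU-bound work from running truly in parallel, so the sketch shows the structure of the approach rather than a speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def merge_sorted(a, b):
    """Merge two ascending lists into a new ascending list."""
    out = []
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            out.append(a[i])
            i += 1
        else:
            out.append(b[j])
            j += 1
    # At most one of these tails is non-empty.
    out.extend(a[i:])
    out.extend(b[j:])
    return out


def split_sort_merge(values, threads):
    """Return a sorted copy of values using split, sort, then merge."""
    n = len(values)
    if threads <= 1 or n < 2:
        return sorted(values)
    threads = min(threads, n)

    # Split: one contiguous segment per thread (assumed layout).
    bounds = [n * k // threads for k in range(threads + 1)]
    segments = [values[bounds[k]:bounds[k + 1]] for k in range(threads)]

    # Parallel sort: each worker sorts its own segment independently.
    with ThreadPoolExecutor(max_workers=threads) as pool:
        sorted_segments = list(pool.map(sorted, segments))

    # Merge: fold the sorted segments together one after another.
    result = sorted_segments[0]
    for segment in sorted_segments[1:]:
        result = merge_sorted(result, segment)
    return result
```

Because the merge step is sequential, its cost grows with the number of segments, which is one reason more threads do not always mean a faster sort.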
Hybrid Sorting
Each segment uses a hybrid approach:
- QuickSort for larger portions
- InsertionSort for smaller portions (more efficient for small arrays)
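A hybrid of this kind usually switches to InsertionSort once a partition drops below a size threshold. The Python sketch below illustrates the general technique, not TeeBI's code: the cutoff of 16, the middle-element pivot, and the names hybrid_sort and insertion_sort are all assumptions, and the sketch sorts a list in place for brevity.

```python
INSERTION_CUTOFF = 16  # assumed threshold; the real value is not given here


def insertion_sort(a, lo, hi):
    """Sort a[lo..hi] in place (inclusive bounds)."""
    for i in range(lo + 1, hi + 1):
        item = a[i]
        j = i - 1
        # Shift larger elements one slot right to make room for item.
        while j >= lo and a[j] > item:
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = item


def hybrid_sort(a, lo=0, hi=None):
    """QuickSort large ranges, then finish small ranges with InsertionSort."""
    if hi is None:
        hi = len(a) - 1
    while hi - lo + 1 > INSERTION_CUTOFF:
        pivot = a[(lo + hi) // 2]
        i, j = lo, hi
        # Hoare-style partition around the pivot value.
        while i <= j:
            while a[i] < pivot:
                i += 1
            while a[j] > pivot:
                j -= 1
            if i <= j:
                a[i], a[j] = a[j], a[i]
                i += 1
                j -= 1
        # Recurse into the smaller side and loop on the larger one,
        # which keeps the recursion depth logarithmic.
        if j - lo < hi - i:
            hybrid_sort(a, lo, j)
            lo = i
        else:
            hybrid_sort(a, i, hi)
            hi = j
    insertion_sort(a, lo, hi)
```

InsertionSort has very little per-element overhead, so it tends to beat QuickSort on short ranges even though its worst-case growth is quadratic.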
Implementation
From BI.Arrays.Parallel.pas:147-161:
Merging Sorted Arrays
The TSortedAscendingArray and TSortedDescendingArray classes provide methods to merge already-sorted arrays.
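Merging two already-sorted arrays takes a single linear pass, which is what makes it cheaper than re-sorting the combined data. The Python sketch below shows the general two-pointer technique for both directions. It is an illustration only: merge_two_sorted and its ascending flag are hypothetical and do not reflect the signatures of the TeeBI classes.

```python
def merge_two_sorted(a, b, ascending=True):
    """Merge two lists sorted in the same direction into a new list.

    Both inputs must already be sorted ascending (or both descending
    when ascending=False); the inputs are not modified.
    """
    out = []
    i = j = 0
    while i < len(a) and j < len(b):
        # Pick from a when its head should come first in the output.
        if ascending:
            take_a = a[i] <= b[j]
        else:
            take_a = a[i] >= b[j]
        if take_a:
            out.append(a[i])
            i += 1
        else:
            out.append(b[j])
            j += 1
    # Append whatever remains; at most one tail is non-empty.
    out.extend(a[i:])
    out.extend(b[j:])
    return out
```

The only difference between the ascending and descending cases is the direction of the comparison, which is consistent with the library exposing one class per direction.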
Ascending Merge
Descending Merge
Supported Types for Merge
- TInt32Array
- TInt64Array
- TSingleArray
- TDoubleArray
Performance Considerations
When to Use Parallel Sort
Use parallel sort when:
- Array has > 100,000 elements
- Multiple CPU cores available
- Data is randomly distributed
Avoid parallel sort when:
- Array is small (< 10,000 elements)
- Single-core CPU
- Data is nearly sorted already
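The size and core-count guidance above can be condensed into a small decision helper. This Python sketch is only one way to encode it: sort_strategy is a hypothetical name, the thresholds are the ones quoted in the lists, and sizes between 10,000 and 100,000 are reported as "measure" because the guidance does not cover that range. It does not try to detect nearly sorted data.

```python
def sort_strategy(element_count, cpu_cores):
    """Suggest 'single', 'parallel', or 'measure' for a given workload."""
    if cpu_cores <= 1 or element_count < 10_000:
        # Small input or a single core: threading overhead dominates.
        return "single"
    if element_count > 100_000:
        # Large input on a multi-core machine.
        return "parallel"
    # In between, the guidance is silent, so benchmark both options.
    return "measure"
```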
Benchmark Example
Thread Overhead
Smaller arrays may perform worse with parallelization due to thread creation overhead. Test with your specific data sizes.
Integration with TDataItem
Parallel sorting works directly with TDataItem columns:
Limitations
- No Generics: Due to Pascal limitations, each type has its own implementation
- Copy-Based: Returns new sorted array; original is not modified in-place
- Simple Types Only: Works only with numeric types (Int32, Int64, Single, Double)
- Comparison Operator: Requires the < and > operators (the reason generics can't be used)
Best Practices
- Auto-Detect Threads: Use Threads = 0 for automatic CPU detection
- Profile First: Measure performance before optimizing
- Consider Data Size: Parallel sort helps most with > 100K elements
- Reuse Results: Cache sorted arrays if used multiple times
- Memory Aware: Parallel sort creates temporary arrays; ensure adequate RAM
