Overview

TeeBI is big-data ready, designed to handle datasets containing billions of cells. Its array-based architecture and 64-bit support enable processing of datasets far beyond the row and memory limits of traditional row-based databases.

Architecture for Scale

64-Bit Support

TeeBI automatically adapts to platform architecture:
TInteger = NativeInt;  // Int64 on x64, Int32 on x86

{$IFDEF CPUX64}
TNativeIntArray = TInt64Array;
TNativeInteger = Int64;
{$ELSE}
TNativeIntArray = TInt32Array;
TNativeInteger = Integer;
{$ENDIF}
On 64-bit platforms, arrays can theoretically address up to 2^63 elements (though practical limits are lower due to available RAM).
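A quick way to confirm which integer width a given build is using, as a minimal sketch in standard Delphi (no TeeBI types involved):

```pascal
program CheckWidth;
{$APPTYPE CONSOLE}
begin
  // NativeInt is 8 bytes on a 64-bit target and 4 bytes on a 32-bit one,
  // so TInteger (= NativeInt) widens automatically with the platform.
  WriteLn(SizeOf(NativeInt));
end.
```

Run under a 64-bit build this prints 8; under 32-bit it prints 4, which is why large datasets require a 64-bit compilation target.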

Memory-Efficient Storage

  • Columnar Format: Only load the columns you need, not entire rows.
  • Type Optimization: Each column uses the most appropriate data type.
  • Delayed Loading: Use data providers to load data on-demand.

Billion-Cell Example

See the OneBillion demo for a working example of handling massive datasets.

Creating Large Datasets

uses BI.DataItem;

var Data: TDataItem;
    RowCount: Int64;
begin
  Data := TDataItem.Create(True);  // True = table mode (columns under one item)
  try
    // Add columns
    Data.Items.Add('ID', dkInt64);
    Data.Items.Add('Value', dkDouble);
    Data.Items.Add('Category', dkInt32);

    // Allocate for 1 billion rows
    RowCount := 1000000000;
    Data.Resize(RowCount);

    // Populate data (consider parallel processing)
    // See Parallel Processing documentation
  finally
    Data.Free;  // release the dataset when no longer needed
  end;
end;

Memory Considerations

Calculate Memory Requirements

Estimate memory needs before loading:
// Example: 100 million rows
Rows := 100000000;

// Columns:
// - Int64 (8 bytes)
// - Double (8 bytes)  
// - Int32 (4 bytes)
// - String (varies, assume avg 20 bytes)

EstimatedBytes := Rows * (8 + 8 + 4 + 20);
// = 100M * 40 bytes = 4 GB

Memory Management Tips

  1. Free Unused Data: Release TDataItem instances when done
  2. Use Filters: Create filtered views instead of copying data
  3. Stream Processing: Process data in chunks when possible
  4. Compression: Use compressed storage for disk persistence
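Tips 1 and 3 can be combined: derive a much smaller result with a query (using the TBISQL syntax shown later on this page) and release the large source once it is no longer needed. A minimal sketch; LoadBigDataset is a hypothetical helper standing in for however your application obtains the large TDataItem:

```pascal
uses BI.DataItem, BI.SQL;

var Big, Slice: TDataItem;
begin
  Big := LoadBigDataset;  // hypothetical: returns a large TDataItem
  try
    // A query result is a new, compact TDataItem;
    // the billion-row source is not duplicated.
    Slice := TBISQL.From(Big, 'top 100000 *');
    try
      // ... work with the small slice ...
    finally
      Slice.Free;
    end;
  finally
    Big.Free;  // tip 1: free the large dataset when done
  end;
end;
```

Keeping the source alive until the derived data has been consumed avoids dangling references between the query result and the original columns.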

Delayed Loading

For datasets too large to fit in memory, use delayed loading providers:
var Data: TDataItem;
    Provider: TDataDelayProvider;
begin
  // Create data with metadata only
  Data := TDataItem.Create(True);

  // Set up delayed loading provider
  // (TMyCustomProvider is your own TDataDelayProvider descendant)
  Provider := TMyCustomProvider.Create;
  Data.Provider := Provider;

  // Data loads on-demand when accessed;
  // only the loaded portions stay in memory
end;

Remote Web Server for Big Data

Use the TeeBI Web Server to serve large datasets remotely:
  • Server-Side Processing: Perform queries and aggregations on the server
  • Compressed Transmission: Only transfer needed data, compressed
  • Pagination: Load data in pages/chunks
See Web Server for details.

Query Optimization for Large Data

Use Indexes

Create indexes on frequently queried columns:
Data['CustomerID'].CreateIndex;

Limit Results

Use TOP and OFFSET in queries:
uses BI.SQL;

// Get first 1000 rows, skip first 5000
var Result: TDataItem;
Result := TBISQL.From(Data, 'top 1000 offset 5000 *');

Group and Aggregate

Reduce data volume through aggregation:
// Summarize billions of transactions
var Summary: TDataItem;
Summary := TBISQL.From(Data, 
  'sum(Amount), count(*) group by Year, Country');

Parallel Processing

Leverage multiple CPU cores for big data operations:
uses BI.Arrays.Parallel;

var Sorted: TInt64Array;
Sorted := TParallelArray.Sort(Data['ID'].Int64Data, True, 0);
// 0 = auto-detect CPU count
See Parallel Processing for more details.

Best Practices

  1. Test with Representative Data: Profile performance with realistic data volumes
  2. Monitor Memory: Use task manager or profiling tools to watch memory usage
  3. Use 64-Bit: Always compile as 64-bit for large datasets
  4. Progressive Loading: Load data in stages if possible
  5. Compression: Use compression for stored data to reduce disk I/O

Platform Limits

Theoretical Limits (64-bit)

  • Array Elements: Up to 2^63-1 (limited by available RAM)
  • Memory: Up to available physical + virtual memory

Practical Limits

  • RAM: Most systems have 8-128 GB RAM
  • OS Limits: Windows/Linux/macOS impose process memory limits
  • Performance: Beyond a few billion rows, consider distributed systems