ComputePartitionStats

The ComputePartitionStats action computes and writes partition statistics for an Iceberg table. This action helps optimize query planning by providing metadata about partition-level data characteristics.

Interface

public interface ComputePartitionStats extends Action<ComputePartitionStats, ComputePartitionStats.Result>

Overview

Partition statistics provide valuable metadata that query engines can use to optimize execution plans. The action:

Computes statistics for partitions in a table snapshot
Writes statistics to a dedicated statistics file
Uses the current snapshot by default
Can target a specific snapshot if needed

Methods

snapshot

Selects the table snapshot for which partition statistics are computed.

ComputePartitionStats snapshot(long snapshotId)

Parameters:

snapshotId - The ID of the snapshot for which stats need to be computed

Returns: this for method chaining

Example:

action.snapshot(1234567890L);

If not specified, the action uses the current snapshot of the table.
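The selection rule described here (an explicit snapshot wins, otherwise the current snapshot is used) and the chaining contract of snapshot can be illustrated with a small standalone model. This is a minimal sketch, not Iceberg code: the class SnapshotSelectionSketch and its targetSnapshotId method are hypothetical stand-ins introduced only for illustration.

```java
import java.util.OptionalLong;

// Hypothetical stand-in that models snapshot selection; NOT the Iceberg implementation.
final class SnapshotSelectionSketch {
  // Stand-in for the ID of the table's current snapshot.
  private final long currentSnapshotId;
  // Empty until snapshot(long) is called.
  private OptionalLong requested = OptionalLong.empty();

  SnapshotSelectionSketch(long currentSnapshotId) {
    this.currentSnapshotId = currentSnapshotId;
  }

  // Mirrors the shape of snapshot(long): records the choice and returns this for chaining.
  SnapshotSelectionSketch snapshot(long snapshotId) {
    this.requested = OptionalLong.of(snapshotId);
    return this;
  }

  // The snapshot that would be targeted: the explicit choice if present, else the current one.
  long targetSnapshotId() {
    return requested.orElse(currentSnapshotId);
  }

  public static void main(String[] args) {
    // No explicit snapshot: falls back to the current snapshot.
    System.out.println(new SnapshotSelectionSketch(42L).targetSnapshotId()); // prints 42
    // Explicit snapshot: the chained call overrides the default.
    System.out.println(new SnapshotSelectionSketch(42L).snapshot(7L).targetSnapshotId()); // prints 7
  }
}
```

Returning this from the setter is what lets configuration and execution be written as one chained expression.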

Result

The Result interface provides information about the computed statistics.

Methods

interface Result {
  PartitionStatisticsFile statisticsFile();
}

statisticsFile

Returns the statistics file that was written, or null if no statistics were collected.

Returns: PartitionStatisticsFile or null
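Because statisticsFile may return null, callers need to handle the missing case explicitly. One possible approach, sketched below, wraps the value in an Optional. The StatsFile class is a hypothetical stand-in for PartitionStatisticsFile that carries only a path and a snapshot ID, and describe is an illustrative helper, not part of the API.

```java
import java.util.Optional;

// Hypothetical sketch of null-safe result handling; the types here are stand-ins.
final class NullableStatsSketch {
  // Minimal stand-in for PartitionStatisticsFile (assumption: only these two fields matter here).
  static final class StatsFile {
    final String path;
    final long snapshotId;

    StatsFile(String path, long snapshotId) {
      this.path = path;
      this.snapshotId = snapshotId;
    }
  }

  // Produces a one-line summary, treating null as "nothing was collected".
  static String describe(StatsFile file) {
    return Optional.ofNullable(file)
        .map(f -> "snapshot " + f.snapshotId + ": " + f.path)
        .orElse("no statistics were collected");
  }

  public static void main(String[] args) {
    System.out.println(describe(null));
    System.out.println(describe(new StatsFile("stats/partition-stats-7.parquet", 7L)));
  }
}
```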

Usage Examples

Compute Stats for Current Snapshot

// Compute partition statistics for the current snapshot
ComputePartitionStats.Result result = actions
  .computePartitionStats(table)
  .execute();

PartitionStatisticsFile statsFile = result.statisticsFile();
if (statsFile != null) {
  System.out.println("Statistics file: " + statsFile.path());
  System.out.println("Snapshot ID: " + statsFile.snapshotId());
}

Compute Stats for Specific Snapshot

// Compute partition statistics for a specific snapshot, here the parent of the current one.
// parentId() returns a nullable Long, so check it before passing it to snapshot(long).
Long parentId = table.currentSnapshot().parentId();

if (parentId != null) {
  ComputePartitionStats.Result result = actions
    .computePartitionStats(table)
    .snapshot(parentId)
    .execute();

  if (result.statisticsFile() != null) {
    System.out.println("Computed stats for snapshot: " + parentId);
  }
}

Check Statistics File Details

// Compute stats and examine the results
ComputePartitionStats.Result result = actions
  .computePartitionStats(table)
  .execute();

PartitionStatisticsFile statsFile = result.statisticsFile();
if (statsFile != null) {
  System.out.println("Statistics Details:");
  System.out.println("  Path: " + statsFile.path());
  System.out.println("  Snapshot: " + statsFile.snapshotId());
  System.out.println("  Size: " + statsFile.fileSizeInBytes() + " bytes");
} else {
  System.out.println("No statistics were collected");
}

Best Practices

Compute after significant data changes: Run this action after major writes or rewrites to keep statistics current
Use with query optimization: Ensure your query engine is configured to use partition statistics
Monitor statistics freshness: Outdated statistics may lead to suboptimal query plans
Consider snapshot selection: For historical analysis, compute stats on specific snapshots

Partition statistics are most beneficial for partitioned tables with selective query patterns.
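The freshness advice above can be reduced to a simple check: statistics are stale when no statistics file exists for the current snapshot. The sketch below models that check with plain values. StatsFreshnessSketch and needsRecompute are hypothetical names, and the snapshot IDs are passed in as longs; in a real job they would come from the table's metadata.

```java
import java.util.List;

// Hypothetical freshness check; models the idea only and does not call Iceberg.
final class StatsFreshnessSketch {
  // Returns true when none of the existing statistics files covers the current snapshot.
  static boolean needsRecompute(long currentSnapshotId, List<Long> snapshotIdsWithStats) {
    return !snapshotIdsWithStats.contains(currentSnapshotId);
  }

  public static void main(String[] args) {
    // Stats exist only for older snapshots 3 and 4, so snapshot 5 needs a recompute.
    System.out.println(needsRecompute(5L, List.of(3L, 4L))); // prints true
    // Stats already exist for snapshot 5.
    System.out.println(needsRecompute(5L, List.of(4L, 5L))); // prints false
  }
}
```

A scheduled maintenance job could run such a check before invoking the action, so that statistics are recomputed only when the current snapshot has changed.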

Related

ComputeTableStats - Compute table-level statistics
ExpireSnapshots - Manage old snapshots
RewriteDataFiles - Optimize data file layout
