SortOrder Class

The SortOrder class defines how data and delete files should be ordered in an Apache Iceberg table.
Package: org.apache.iceberg

Overview

The SortOrder class provides:
  • Multi-column sorting specifications
  • Custom sort directions (ascending/descending)
  • Null ordering control (nulls first/last)
  • Sort order evolution with order IDs
  • Transform-based sorting
Sort order ID 0 is reserved for unsorted tables. All custom sort orders must use IDs greater than 0.

Basic Methods

schema()

public Schema schema()
Returns the schema for this sort order.
Returns: The Schema this sort order is bound to

orderId()

public int orderId()
Returns the ID of this sort order.
Returns: The order ID (0 for unsorted)

fields()

public List<SortField> fields()
Returns the list of sort fields for this sort order.
Returns: List of SortField objects
Example:
SortOrder sortOrder = table.sortOrder();
for (SortField field : sortOrder.fields()) {
    System.out.println(field.sourceId() + " " + field.direction());
}

isSorted()

public boolean isSorted()
Returns true if the sort order has at least one sort field.
Returns: True if sorted, false otherwise

isUnsorted()

public boolean isUnsorted()
Returns true if the sort order has no sort fields.
Returns: True if unsorted, false otherwise

Sort Order Comparison

satisfies(SortOrder anotherSortOrder)

public boolean satisfies(SortOrder anotherSortOrder)
Checks whether this order satisfies another order. This order satisfies the other when either of the following holds:
  • The other order is unsorted, or
  • This order has at least as many fields as the other, and each field in the other order is satisfied by the field at the same position in this order
Parameters:
  • anotherSortOrder (SortOrder, required): The sort order to check against
Returns: True if this order satisfies the given order
Example:
SortOrder order1 = SortOrder.builderFor(schema)
    .asc("timestamp")
    .asc("id")
    .build();

SortOrder order2 = SortOrder.builderFor(schema)
    .asc("timestamp")
    .build();

// order1 satisfies order2 (has timestamp ascending)
assert order1.satisfies(order2);

// order2 does NOT satisfy order1 (missing id field)
assert !order2.satisfies(order1);
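
To make the positional rule concrete, the following stand-alone sketch re-implements the rules listed above on plain Java values. It is a simplified model, not Iceberg's implementation: the `SatisfiesSketch` class and its `Field` record are hypothetical, and field-level satisfaction is assumed to be plain equality, whereas the library compares SortField objects that also carry a transform.

```java
import java.util.List;

final class SatisfiesSketch {
    // Hypothetical stand-in for a sort field: column name, direction, null order.
    record Field(String column, String direction, String nullOrder) {}

    // Returns true if `self` satisfies `other` under the rules listed above.
    static boolean satisfies(List<Field> self, List<Field> other) {
        if (other.isEmpty()) {
            return true; // any order satisfies an unsorted order
        }
        if (other.size() > self.size()) {
            return false; // cannot satisfy an order that has more fields
        }
        for (int i = 0; i < other.size(); i++) {
            // Simplification: field-level satisfaction is modeled as equality.
            if (!self.get(i).equals(other.get(i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Field ts = new Field("timestamp", "ASC", "NULLS_FIRST");
        Field id = new Field("id", "ASC", "NULLS_FIRST");
        System.out.println(satisfies(List.of(ts, id), List.of(ts))); // true
        System.out.println(satisfies(List.of(ts), List.of(ts, id))); // false
        System.out.println(satisfies(List.of(id, ts), List.of(ts))); // false
    }
}
```

The last call returns false because the fields are compared position by position: an order that sorts by id before timestamp does not satisfy an order that sorts by timestamp alone.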

sameOrder(SortOrder anotherSortOrder)

public boolean sameOrder(SortOrder anotherSortOrder)
Checks whether this order is equivalent to another order while ignoring the order ID.
Parameters:
  • anotherSortOrder (SortOrder, required): The sort order to compare
Returns: True if the orders are equivalent
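
The difference between comparing with and without the order ID can be sketched with a small stand-alone model. The `SameOrderSketch` class and its `Order` record are hypothetical and only mirror the behavior described above; fields are reduced to descriptive strings for brevity.

```java
import java.util.List;

final class SameOrderSketch {
    // Hypothetical stand-in for a sort order: an ID plus an ordered list of field descriptions.
    record Order(int orderId, List<String> fields) {
        // Equivalent if the fields match in order; the ID is deliberately ignored.
        boolean sameOrder(Order other) {
            return fields.equals(other.fields);
        }
    }

    public static void main(String[] args) {
        Order a = new Order(1, List.of("timestamp ASC NULLS_FIRST"));
        Order b = new Order(2, List.of("timestamp ASC NULLS_FIRST"));
        System.out.println(a.sameOrder(b)); // true: same fields, different IDs
        System.out.println(a.equals(b));    // false: record equality includes the ID
    }
}
```

A check of this kind is useful when deciding whether a newly requested order duplicates one that already exists under a different ID.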

Conversion

toUnbound()

public UnboundSortOrder toUnbound()
Converts this sort order to an unbound sort order.
Returns: An UnboundSortOrder representation

Static Factory Methods

unsorted()

public static SortOrder unsorted()
Returns a sort order for unsorted tables.
Returns: An unsorted order with order ID 0
Example:
SortOrder order = SortOrder.unsorted();
assert order.isUnsorted();
assert order.orderId() == 0;

builderFor(Schema schema)

public static Builder builderFor(Schema schema)
Creates a new sort order builder for the given schema.
Parameters:
  • schema (Schema, required): The table schema
Returns: A Builder instance
Example:
SortOrder order = SortOrder.builderFor(schema)
    .asc("timestamp")
    .desc("priority")
    .build();

Builder Class

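The Builder class (SortOrder.Builder) collects sort fields and produces a sort order that is validated against the schema when build() is called.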

Builder Methods

asc(Term term)

public Builder asc(Term term)
Adds an expression term to the sort in ascending order with nulls first.
Parameters:
  • term (Term, required): An expression term (e.g., Expressions.ref("column"))
Returns: This builder for method chaining
Example:
import org.apache.iceberg.SortOrder;
import org.apache.iceberg.expressions.Expressions;

SortOrder.Builder builder = SortOrder.builderFor(schema)
    .asc(Expressions.ref("timestamp"));

asc(Term term, NullOrder nullOrder)

public Builder asc(Term term, NullOrder nullOrder)
Adds an expression term to the sort in ascending order with custom null ordering.
Parameters:
  • term (Term, required): An expression term
  • nullOrder (NullOrder, required): Null order (NULLS_FIRST or NULLS_LAST)
Returns: This builder for method chaining
Example:
SortOrder.Builder builder = SortOrder.builderFor(schema)
    .asc(Expressions.ref("timestamp"), NullOrder.NULLS_LAST);

desc(Term term)

public Builder desc(Term term)
Adds an expression term to the sort in descending order with nulls last.
Parameters:
  • term (Term, required): An expression term
Returns: This builder for method chaining
Example:
SortOrder.Builder builder = SortOrder.builderFor(schema)
    .desc(Expressions.ref("priority"));

desc(Term term, NullOrder nullOrder)

public Builder desc(Term term, NullOrder nullOrder)
Adds an expression term to the sort in descending order with custom null ordering.
Parameters:
  • term (Term, required): An expression term
  • nullOrder (NullOrder, required): Null order (NULLS_FIRST or NULLS_LAST)
Returns: This builder for method chaining

sortBy(String name, SortDirection direction, NullOrder nullOrder)

public Builder sortBy(String name, SortDirection direction, NullOrder nullOrder)
Adds a column to the sort by name.
Parameters:
  • name (String, required): The column name
  • direction (SortDirection, required): Sort direction (ASC or DESC)
  • nullOrder (NullOrder, required): Null order (NULLS_FIRST or NULLS_LAST)
Returns: This builder for method chaining
Example:
SortOrder.Builder builder = SortOrder.builderFor(schema)
    .sortBy("timestamp", SortDirection.ASC, NullOrder.NULLS_LAST);

sortBy(Term term, SortDirection direction, NullOrder nullOrder)

public Builder sortBy(Term term, SortDirection direction, NullOrder nullOrder)
Adds an expression term to the sort.
Parameters:
  • term (Term, required): An expression term
  • direction (SortDirection, required): Sort direction (ASC or DESC)
  • nullOrder (NullOrder, required): Null order (NULLS_FIRST or NULLS_LAST)
Returns: This builder for method chaining

withOrderId(int newOrderId)

public Builder withOrderId(int newOrderId)
Sets the order ID for this sort order.
Parameters:
  • newOrderId (int, required): The order ID (must be > 0 for sorted orders)
Returns: This builder for method chaining
Note: Order ID 0 is reserved for unsorted tables and will cause an error if used with sort fields.

caseSensitive(boolean sortCaseSensitive)

public Builder caseSensitive(boolean sortCaseSensitive)
Sets whether column name matching should be case-sensitive.
Parameters:
  • sortCaseSensitive (boolean, required): True for case-sensitive matching
Returns: This builder for method chaining

build()

public SortOrder build()
Builds the sort order and validates its compatibility with the schema.
Returns: A new SortOrder
Throws: ValidationException if the sort order is invalid

Static Validation Methods

checkCompatibility(SortOrder sortOrder, Schema schema)

public static void checkCompatibility(SortOrder sortOrder, Schema schema)
Checks the compatibility of a sort order with a schema. Validates that:
  • All source fields exist in the schema
  • All source fields are primitive types
  • All transforms can be applied to their source types
Parameters:
  • sortOrder (SortOrder, required): The sort order to validate
  • schema (Schema, required): The schema to validate against
Throws: ValidationException if incompatible
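
The first two checks listed above can be sketched as a stand-alone model. This is not the library's code: the `CompatibilitySketch` class, the string type names, and the `PRIMITIVES` set are all hypothetical, and the third check (whether a transform applies to its source type) is left out. Unlike the real method, which throws ValidationException, this model collects problems into a list so the outcome is easy to inspect.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class CompatibilitySketch {
    // Hypothetical set of type names treated as primitive in this model.
    static final Set<String> PRIMITIVES = Set.of("int", "long", "string", "timestamp");

    // `schema` maps column name -> type name; `sortColumns` are the source columns
    // of the sort order. An empty result means the order is compatible.
    static List<String> check(List<String> sortColumns, Map<String, String> schema) {
        List<String> problems = new ArrayList<>();
        for (String column : sortColumns) {
            String type = schema.get(column);
            if (type == null) {
                problems.add("Cannot find source column: " + column);
            } else if (!PRIMITIVES.contains(type)) {
                problems.add("Cannot sort by non-primitive column: " + column);
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        Map<String, String> schema = Map.of("ts", "timestamp", "tags", "list");
        System.out.println(check(List.of("ts"), schema));                    // []
        System.out.println(check(List.of("ts", "tags", "missing"), schema)); // two problems
    }
}
```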

Usage Examples

Creating an Unsorted Order

SortOrder order = SortOrder.unsorted();
assert order.isUnsorted();

Creating a Simple Sort Order

import org.apache.iceberg.expressions.Expressions;

SortOrder order = SortOrder.builderFor(schema)
    .asc(Expressions.ref("timestamp"))
    .build();

Creating a Multi-Column Sort Order

SortOrder order = SortOrder.builderFor(schema)
    .asc(Expressions.ref("category"))
    .desc(Expressions.ref("priority"))
    .asc(Expressions.ref("timestamp"))
    .build();

Custom Null Ordering

import org.apache.iceberg.NullOrder;

SortOrder order = SortOrder.builderFor(schema)
    .asc(Expressions.ref("timestamp"), NullOrder.NULLS_LAST)
    .desc(Expressions.ref("score"), NullOrder.NULLS_FIRST)
    .build();

Sort with Transforms

import org.apache.iceberg.expressions.Expressions;

SortOrder order = SortOrder.builderFor(schema)
    .asc(Expressions.year("timestamp"))  // Sort by year
    .asc(Expressions.bucket("user_id", 16))  // Then by user bucket
    .build();
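
What sorting by a transformed value means for row order can be sketched in plain Java. The `TransformSortSketch` class and its `Event` record are hypothetical, and the model makes two simplifying assumptions: the year transform is represented by the UTC calendar year, and the second sort key is the raw user ID rather than a bucket, because Iceberg's bucket transform depends on a specific hash function that is not reproduced here.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class TransformSortSketch {
    // Hypothetical row type with a timestamp and a user ID.
    record Event(Instant timestamp, long userId) {}

    // Models a year transform: the sort key is the UTC calendar year, not the full timestamp.
    static int year(Instant ts) {
        return ts.atZone(ZoneOffset.UTC).getYear();
    }

    // Sorts by year(timestamp) ascending, then by userId ascending.
    static List<Event> sort(List<Event> events) {
        List<Event> copy = new ArrayList<>(events);
        copy.sort(Comparator.comparingInt((Event e) -> year(e.timestamp()))
                .thenComparingLong(Event::userId));
        return copy;
    }

    public static void main(String[] args) {
        List<Event> sorted = sort(List.of(
                new Event(Instant.parse("2024-06-01T00:00:00Z"), 2L),
                new Event(Instant.parse("2023-12-31T23:59:59Z"), 9L),
                new Event(Instant.parse("2024-01-01T00:00:00Z"), 5L)));
        for (Event e : sorted) {
            System.out.println(year(e.timestamp()) + " " + e.userId());
        }
        // Prints: 2023 9, then 2024 2, then 2024 5
    }
}
```

Within 2024 the June event precedes the January event because both map to the same year and the tie is broken by user ID. Sorting by a transform orders rows only at the granularity of the transformed value.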

Using sortBy Method

import org.apache.iceberg.SortDirection;
import org.apache.iceberg.NullOrder;

SortOrder order = SortOrder.builderFor(schema)
    .sortBy("timestamp", SortDirection.ASC, NullOrder.NULLS_LAST)
    .sortBy("id", SortDirection.DESC, NullOrder.NULLS_FIRST)
    .build();

Updating Table Sort Order

import org.apache.iceberg.expressions.Expressions;

table.replaceSortOrder()
    .asc(Expressions.ref("timestamp"))
    .desc(Expressions.ref("priority"))
    .commit();

Checking Sort Order Satisfaction

SortOrder order1 = SortOrder.builderFor(schema)
    .asc(Expressions.ref("timestamp"))
    .asc(Expressions.ref("id"))
    .build();

SortOrder order2 = SortOrder.builderFor(schema)
    .asc(Expressions.ref("timestamp"))
    .build();

if (order1.satisfies(order2)) {
    System.out.println("Data sorted by order1 also satisfies order2");
}

Inspecting Sort Fields

SortOrder order = table.sortOrder();

if (order.isSorted()) {
    System.out.println("Order ID: " + order.orderId());
    for (SortField field : order.fields()) {
        System.out.println("Source field ID: " + field.sourceId());
        System.out.println("  Direction: " + field.direction());
        System.out.println("  Null order: " + field.nullOrder());
        System.out.println("  Transform: " + field.transform());
    }
} else {
    System.out.println("Table is unsorted");
}

Sort Order Evolution

// Original sort order
SortOrder order1 = SortOrder.builderFor(schema)
    .withOrderId(1)
    .asc(Expressions.ref("timestamp"))
    .build();

// Evolved sort order with additional sort field
SortOrder order2 = SortOrder.builderFor(schema)
    .withOrderId(2)
    .asc(Expressions.ref("timestamp"))
    .asc(Expressions.ref("category"))
    .build();

// order2 satisfies order1
assert order2.satisfies(order1);

Case-Insensitive Builder

SortOrder order = SortOrder.builderFor(schema)
    .caseSensitive(false)
    .asc(Expressions.ref("TIMESTAMP"))  // Will match "timestamp"
    .build();

Enums

SortDirection

public enum SortDirection {
    ASC,   // Ascending order
    DESC   // Descending order
}

NullOrder

public enum NullOrder {
    NULLS_FIRST,  // Null values appear first
    NULLS_LAST    // Null values appear last
}
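
How the two enums combine can be sketched with java.util.Comparator. The block below is a stand-alone model, not Iceberg code: the `NullOrderSketch` class and its local `Direction` and `Nulls` enums are hypothetical stand-ins for SortDirection and NullOrder, and it assumes that null placement is independent of direction, so that nulls-first puts nulls at the front for both ascending and descending sorts.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class NullOrderSketch {
    enum Direction { ASC, DESC }
    enum Nulls { FIRST, LAST }

    // Builds a comparator: direction orders the non-null values, null order places the nulls.
    static <T extends Comparable<T>> Comparator<T> comparator(Direction direction, Nulls nulls) {
        Comparator<T> base = direction == Direction.ASC
                ? Comparator.<T>naturalOrder()
                : Comparator.<T>reverseOrder();
        return nulls == Nulls.FIRST ? Comparator.nullsFirst(base) : Comparator.nullsLast(base);
    }

    // Returns a sorted copy of the input, leaving the original list untouched.
    static List<Integer> sorted(List<Integer> values, Direction direction, Nulls nulls) {
        List<Integer> copy = new ArrayList<>(values);
        copy.sort(NullOrderSketch.<Integer>comparator(direction, nulls));
        return copy;
    }

    public static void main(String[] args) {
        List<Integer> values = Arrays.asList(3, null, 1);
        System.out.println(sorted(values, Direction.ASC, Nulls.FIRST));  // [null, 1, 3]
        System.out.println(sorted(values, Direction.DESC, Nulls.LAST));  // [3, 1, null]
        System.out.println(sorted(values, Direction.DESC, Nulls.FIRST)); // [null, 3, 1]
    }
}
```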

Best Practices

Sort Order Strategy:
  • Sort by commonly filtered columns to improve query performance
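  • Choose the field order deliberately: later sort fields only break ties among equal values of earlier fields, so a near-unique leading column leaves the remaining fields with little effect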
  • Use transforms (like year, month, bucket) for efficient data organization
  • Consider data clustering when designing sort orders
Performance Considerations:
  • Sort orders affect file organization and query performance
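  • Changing a table's sort order does not reorder existing data files; they keep their previous order until they are rewritten (for example, during compaction)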
  • More sort fields can improve specific queries but increase complexity

Source Code Reference

Source: org/apache/iceberg/SortOrder.java:41
