
Apache Iceberg Java API

The Apache Iceberg Java API provides a comprehensive set of interfaces for working with Iceberg tables. This reference documentation covers the core APIs used for table operations, schema management, partitioning, sorting, and snapshots.

Core APIs

Iceberg’s Java API is built around a few key interfaces and classes:

Table API

The Table interface is the primary entry point for interacting with Iceberg tables. It provides methods for:
  • Reading and writing data
  • Managing table metadata
  • Creating scans and queries
  • Updating schemas, partition specs, and sort orders
  • Managing snapshots and history

Schema API

The Schema class defines the structure of table data. It provides:
  • Column definitions and types
  • Field identification and lookup
  • Schema evolution capabilities
  • Identifier field management
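Field identification matters for schema evolution because Iceberg tracks each column by a unique ID rather than by its name. The following is only a conceptual sketch of that idea using plain Java collections; `FieldIdSketch` and its methods are hypothetical and are not part of the Iceberg API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical model, not an Iceberg class: columns are tracked by ID, names are labels. */
final class FieldIdSketch {
    private final Map<Integer, String> namesById = new LinkedHashMap<>();
    private int lastAssignedId = 0;

    /** Adds a column and returns its newly assigned ID; IDs are never handed out twice. */
    int addColumn(String name) {
        if (namesById.containsValue(name)) {
            throw new IllegalArgumentException("Column already exists: " + name);
        }
        namesById.put(++lastAssignedId, name);
        return lastAssignedId;
    }

    /** Renames a column in place; its ID does not change. */
    void renameColumn(String from, String to) {
        if (namesById.containsValue(to)) {
            throw new IllegalArgumentException("Column already exists: " + to);
        }
        namesById.put(idOf(from), to);
    }

    /** Drops a column; its ID is retired rather than reused. */
    void dropColumn(String name) {
        namesById.remove(idOf(name));
    }

    /** Looks up a column ID by its current name. */
    int idOf(String name) {
        for (Map.Entry<Integer, String> entry : namesById.entrySet()) {
            if (entry.getValue().equals(name)) {
                return entry.getKey();
            }
        }
        throw new IllegalArgumentException("No such column: " + name);
    }

    /** Returns the current name for an ID, or null if the column was dropped. */
    String nameOf(int id) {
        return namesById.get(id);
    }
}
```

Because data written earlier is resolved through the ID, a rename changes only the label, and a column added after a drop cannot be confused with the dropped one.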

PartitionSpec API

The PartitionSpec class defines how data is partitioned in a table. It supports:
  • Multiple partition transforms (identity, bucket, truncate, year, month, day, hour)
  • Partition evolution
  • Efficient data organization
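To make the transforms more concrete, here is a rough sketch of what truncate and the time-based transforms compute, following the definitions in the Iceberg table specification as far as I understand them. `TransformSketch` is a hypothetical helper written with the standard library only, not an Iceberg class. The bucket transform is hash-based and is omitted here, and identity simply returns the value unchanged.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneOffset;

/** Hypothetical helper, not an Iceberg class: shows what some transforms compute. */
final class TransformSketch {
    private TransformSketch() {}

    /** truncate[width] on an integer: round down to a multiple of width, also for negatives. */
    static int truncate(int width, int value) {
        return value - Math.floorMod(value, width);
    }

    /** truncate[length] on a string: keep at most the first `length` code points. */
    static String truncate(int length, String value) {
        if (value.codePointCount(0, value.length()) <= length) {
            return value;
        }
        return value.substring(0, value.offsetByCodePoints(0, length));
    }

    /** year: whole years since 1970 (UTC). */
    static int year(Instant ts) {
        return ts.atOffset(ZoneOffset.UTC).getYear() - 1970;
    }

    /** month: whole months since 1970-01 (UTC). */
    static int month(Instant ts) {
        LocalDate date = ts.atOffset(ZoneOffset.UTC).toLocalDate();
        return (date.getYear() - 1970) * 12 + (date.getMonthValue() - 1);
    }

    /** day: whole days since 1970-01-01 (UTC). */
    static int day(Instant ts) {
        return (int) ts.atOffset(ZoneOffset.UTC).toLocalDate().toEpochDay();
    }

    /** hour: whole hours since 1970-01-01T00:00 (UTC). */
    static int hour(Instant ts) {
        return (int) Math.floorDiv(ts.getEpochSecond(), 3600L);
    }
}
```

For example, `TransformSketch.truncate(10, 27)` yields 20 and `TransformSketch.truncate(3, "iceberg")` yields "ice", so rows with nearby values land in the same partition.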

SortOrder API

The SortOrder class defines how data files should be sorted. It provides:
  • Multi-column sorting
  • Custom sort directions and null ordering
  • Sort order evolution
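A multi-column sort with explicit directions and null ordering can be illustrated with a plain `java.util.Comparator`. This is only an analogy for what a sort order expresses; `SortOrderSketch` and its `Row` type are hypothetical and do not use the Iceberg API.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical illustration, not an Iceberg class. */
final class SortOrderSketch {
    private SortOrderSketch() {}

    /** Minimal row type with two nullable columns, for illustration only. */
    static final class Row {
        private final String category;
        private final Integer score;

        Row(String category, Integer score) {
            this.category = category;
            this.score = score;
        }

        String category() { return category; }
        Integer score() { return score; }
    }

    /** category ASC NULLS FIRST, then score DESC NULLS LAST. */
    static final Comparator<Row> ORDER =
        Comparator.comparing(Row::category, Comparator.nullsFirst(Comparator.<String>naturalOrder()))
            .thenComparing(Row::score, Comparator.nullsLast(Comparator.<Integer>reverseOrder()));

    /** Returns a sorted copy, leaving the input list untouched. */
    static List<Row> sorted(List<Row> rows) {
        List<Row> copy = new ArrayList<>(rows);
        copy.sort(ORDER);
        return copy;
    }
}
```

The first column decides the order; the second is consulted only for rows whose first column compares equal, and each column carries its own direction and null placement.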

Snapshot API

The Snapshot interface represents a table state at a specific point in time. It enables:
  • Time travel queries
  • Snapshot metadata access
  • Manifest and data file tracking
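Time travel comes down to choosing the snapshot that was current at a given moment. The sketch below shows that selection over a hypothetical `Entry` type (a snapshot ID plus its commit timestamp) using only the standard library; it is an assumption-laden model, not Iceberg code. In Iceberg itself, a scan can be pinned to a snapshot with `TableScan.useSnapshot(long)` or `TableScan.asOfTime(long)`.

```java
import java.util.List;
import java.util.OptionalLong;

/** Hypothetical illustration, not an Iceberg class. */
final class TimeTravelSketch {
    private TimeTravelSketch() {}

    /** Stand-in for one snapshot log entry: an ID and its commit time in epoch millis. */
    static final class Entry {
        final long snapshotId;
        final long timestampMillis;

        Entry(long snapshotId, long timestampMillis) {
            this.snapshotId = snapshotId;
            this.timestampMillis = timestampMillis;
        }
    }

    /** Returns the ID of the latest snapshot committed at or before asOfMillis, if any. */
    static OptionalLong snapshotAsOf(List<Entry> log, long asOfMillis) {
        Entry best = null;
        for (Entry entry : log) {
            boolean eligible = entry.timestampMillis <= asOfMillis;
            if (eligible && (best == null || entry.timestampMillis >= best.timestampMillis)) {
                best = entry;
            }
        }
        return best == null ? OptionalLong.empty() : OptionalLong.of(best.snapshotId);
    }
}
```

A request for a time earlier than the first commit has no matching snapshot, which is why the sketch returns an empty `OptionalLong` in that case.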

Package Structure

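Supporting types used in the examples below live in subpackages: org.apache.iceberg.catalog (Catalog, TableIdentifier), org.apache.iceberg.expressions (Expressions), and org.apache.iceberg.types (Types). The core types described above are in the org.apache.iceberg package: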
import org.apache.iceberg.Table;
import org.apache.iceberg.Schema;
import org.apache.iceberg.PartitionSpec;
import org.apache.iceberg.SortOrder;
import org.apache.iceberg.Snapshot;

Common Patterns

Loading a Table

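import org.apache.iceberg.catalog.Catalog;
import org.apache.iceberg.catalog.TableIdentifier;

// Using a catalog (which implementation to use depends on your environment)
Catalog catalog = ...
Table table = catalog.loadTable(TableIdentifier.of("db", "table"));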

Reading Table Metadata

Table table = ...

// Get current schema
Schema schema = table.schema();

// Get partition spec
PartitionSpec spec = table.spec();

// Get sort order
SortOrder sortOrder = table.sortOrder();

// Get current snapshot
Snapshot snapshot = table.currentSnapshot();

Scanning Data

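import org.apache.iceberg.CombinedScanTask;
import org.apache.iceberg.TableScan;
import org.apache.iceberg.expressions.Expressions;
import org.apache.iceberg.io.CloseableIterable;

Table table = ...

// Create a table scan with a row filter and a column projection
TableScan scan = table.newScan()
    .filter(Expressions.equal("status", "active"))
    .select("id", "name", "timestamp");

// Plan the scan. planTasks() returns a CloseableIterable, so close it when done
// (close() can throw IOException, which the enclosing method must handle).
try (CloseableIterable<CombinedScanTask> tasks = scan.planTasks()) {
    for (CombinedScanTask task : tasks) {
        // Process tasks
    }
}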

Updating Data

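import org.apache.iceberg.DataFile;
import org.apache.iceberg.types.Types;

Table table = ...
DataFile dataFile = ...  // metadata describing a data file that has already been written

// Append new data files; commit() publishes the change as a new snapshot
table.newAppend()
    .appendFile(dataFile)
    .commit();

// Update schema by adding an optional string column
table.updateSchema()
    .addColumn("new_field", Types.StringType.get())
    .commit();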

Thread Safety

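Iceberg table operations support concurrent modifications through optimistic concurrency control: each operation, such as an append or a schema update, is prepared against the current table metadata and committed atomically. If another writer commits first, the operation is retried against the new metadata, so no locks are held while changes are prepared.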
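The optimistic pattern can be sketched in a few lines with an `AtomicReference`: read the current state, compute the new state, and swap only if nobody else committed in between. This is a conceptual model under simplifying assumptions (in-memory state, a fixed attempt limit); `OptimisticCommitSketch` is hypothetical and is not how Iceberg is implemented.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.UnaryOperator;

/** Hypothetical illustration of optimistic commits, not an Iceberg class. */
final class OptimisticCommitSketch<M> {
    // Stands in for the pointer to the current table metadata.
    private final AtomicReference<M> current;

    OptimisticCommitSketch(M initial) {
        this.current = new AtomicReference<>(initial);
    }

    M current() {
        return current.get();
    }

    /**
     * Applies `change` to the latest state and swaps it in atomically.
     * If another writer committed first, the change is re-applied to the
     * new state. Returns the number of attempts that were needed.
     */
    int commit(UnaryOperator<M> change, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            M base = current.get();
            M updated = change.apply(base);
            // Succeeds only if `base` is still the current state.
            if (current.compareAndSet(base, updated)) {
                return attempt;
            }
        }
        throw new IllegalStateException("Commit failed after " + maxAttempts + " attempts");
    }
}
```

The design choice is that writers never block each other: a conflicting writer loses the swap, recomputes its change on top of the winner's state, and tries again.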

Next Steps

Table API

Explore the Table interface and its methods

Schema API

Learn about schema management and evolution

PartitionSpec API

Understand partition specifications and transforms

SortOrder API

Configure how data files are sorted
