Actions API
The Actions API provides a high-level interface for performing table maintenance operations in Apache Iceberg. Actions are designed to work with query engines like Apache Spark to distribute parts of the work.
What are Actions?
Actions are operations on Iceberg tables that go beyond basic reads and writes. They are typically maintenance tasks that optimize table performance, clean up old data, or reorganize table structure. All actions implement the base Action interface.
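The shape of such a base interface can be sketched as follows. This is a simplified, self-contained stand-in and not a copy of the Iceberg source: the type parameters `ThisT` and `R` and the toy `CountingAction` are assumptions made for illustration, while `option()`, `options()`, and an execute step follow the terms used on this page.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified stand-in for a base action interface; not the Iceberg source.
class ActionInterfaceSketch {

  // ThisT is the concrete action type (so configuration calls can be chained);
  // R is the type of the result returned by execute().
  interface Action<ThisT, R> {
    ThisT option(String name, String value);

    ThisT options(Map<String, String> options);

    R execute();
  }

  // Hypothetical toy action: its "result" is the number of options it received.
  static final class CountingAction implements Action<CountingAction, Integer> {
    private final Map<String, String> options = new HashMap<>();

    @Override
    public CountingAction option(String name, String value) {
      options.put(name, value);
      return this;
    }

    @Override
    public CountingAction options(Map<String, String> newOptions) {
      options.putAll(newOptions);
      return this;
    }

    @Override
    public Integer execute() {
      return options.size();
    }
  }

  public static void main(String[] args) {
    int count = new CountingAction()
        .option("a", "1")
        .options(Map.of("b", "2", "c", "3"))
        .execute();
    System.out.println("options seen: " + count);
  }
}
```

Returning `ThisT` from the configuration methods is what lets callers keep chaining on the concrete action type instead of the base interface.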
Available Actions
Iceberg provides several built-in actions for common maintenance tasks:
Data Management
- ExpireSnapshots - Remove old snapshots and their associated files
- RewriteDataFiles - Optimize file layout and size
- DeleteOrphanFiles - Remove orphaned files not referenced by any snapshot
Metadata Management
- RewriteManifests - Optimize manifest file organization
- ComputeTableStats - Collect and store table statistics
Using Actions
Actions are obtained through an ActionsProvider implementation. The typical workflow is:
- Get an action instance from the provider
- Configure the action with options
- Execute the action
- Process the result
Example
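The four workflow steps can be sketched end to end with a small in-memory model. Everything here is a hypothetical stand-in for illustration: `Table` holds only snapshot timestamps, `ExpireResult` is an invented result type, and this `ActionsProvider` and `ExpireSnapshots` are toy classes that borrow the names used on this page, not the Iceberg classes themselves. The sketch only counts which snapshots would be expired and does not modify the table.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Toy model of the get / configure / execute / process workflow.
class ActionWorkflowSketch {

  // Hypothetical in-memory "table": just a list of snapshot timestamps (millis).
  record Table(List<Long> snapshotTimestamps) {}

  // Hypothetical result object summarizing what the action decided.
  record ExpireResult(int expiredSnapshots, int retainedSnapshots) {}

  // Stand-in action with chained configuration methods.
  static final class ExpireSnapshots {
    private final Table table;
    private long olderThanMillis = Long.MIN_VALUE; // default: nothing is old enough
    private int retainLast = 1;                    // default: always keep the newest

    ExpireSnapshots(Table table) {
      this.table = table;
    }

    ExpireSnapshots expireOlderThan(long timestampMillis) {
      this.olderThanMillis = timestampMillis;
      return this;
    }

    ExpireSnapshots retainLast(int numSnapshots) {
      this.retainLast = numSnapshots;
      return this;
    }

    // Counts snapshots that are older than the cutoff and not protected by retainLast.
    ExpireResult execute() {
      List<Long> newestFirst = new ArrayList<>(table.snapshotTimestamps());
      newestFirst.sort(Comparator.reverseOrder());
      int expired = 0;
      for (int i = 0; i < newestFirst.size(); i++) {
        boolean protectedByRetain = i < retainLast;
        if (!protectedByRetain && newestFirst.get(i) < olderThanMillis) {
          expired++;
        }
      }
      return new ExpireResult(expired, newestFirst.size() - expired);
    }
  }

  // Stand-in provider that hands out action instances for a table.
  static final class ActionsProvider {
    ExpireSnapshots expireSnapshots(Table table) {
      return new ExpireSnapshots(table);
    }
  }

  public static void main(String[] args) {
    Table table = new Table(List.of(100L, 200L, 300L, 400L));
    ActionsProvider provider = new ActionsProvider();

    ExpireResult result = provider.expireSnapshots(table) // 1. get an action instance
        .expireOlderThan(350L)                            // 2. configure it
        .retainLast(2)
        .execute();                                       // 3. execute it

    // 4. process the result
    System.out.println("expired=" + result.expiredSnapshots()
        + " retained=" + result.retainedSnapshots());
  }
}
```

In a real deployment the provider would come from the engine integration, and the action would run against actual table metadata rather than an in-memory list.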
Common Patterns
Configuring Actions
All actions support configuration through:
- Method chaining with specific configuration methods
- Generic option() and options() methods for advanced settings
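The two configuration styles can be combined on the same action, as in the following minimal sketch. `RewriteAction`, `Plan`, the `sort()` method, the option keys, and the default values are all assumptions chosen for illustration. The point is only that typed methods set well-known settings directly, while generic options travel as string key/value pairs and are parsed when the action runs.

```java
import java.util.HashMap;
import java.util.Map;

// Toy action showing typed configuration methods next to generic string options.
class ActionConfigSketch {

  // Hypothetical summary of the settings the action would run with.
  record Plan(String strategy, long targetFileSizeBytes, int minInputFiles) {}

  static final class RewriteAction {
    private final Map<String, String> options = new HashMap<>();
    private String strategy = "binpack"; // illustrative default

    // Specific configuration method: type-checked and discoverable.
    RewriteAction sort() {
      this.strategy = "sort";
      return this;
    }

    // Generic configuration: one string-keyed setting.
    RewriteAction option(String name, String value) {
      options.put(name, value);
      return this;
    }

    // Generic configuration: several string-keyed settings at once.
    RewriteAction options(Map<String, String> more) {
      options.putAll(more);
      return this;
    }

    // Parses the generic options, falling back to arbitrary sketch defaults.
    Plan execute() {
      long target = Long.parseLong(
          options.getOrDefault("target-file-size-bytes", "536870912"));
      int minFiles = Integer.parseInt(
          options.getOrDefault("min-input-files", "5"));
      return new Plan(strategy, target, minFiles);
    }
  }

  public static void main(String[] args) {
    Plan plan = new RewriteAction()
        .sort()
        .option("target-file-size-bytes", String.valueOf(128L * 1024 * 1024))
        .options(Map.of("min-input-files", "10"))
        .execute();
    System.out.println(plan);
  }
}
```

Because generic options are untyped strings, a misspelled key or a non-numeric value is only caught at execution time, which is one reason to prefer the specific methods when they exist.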
Results
Each action returns a result object containing execution summary information, such as the number of files removed or rewritten.
Implementation Notes
- Actions may use query engines to distribute work across a cluster
- Some actions modify table metadata and create new snapshots
- Actions are designed to be safe and atomic where possible
- Long-running actions may support partial progress and retry mechanisms
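To illustrate the idea of partial progress with retries, the sketch below processes independent units of work one at a time, retries each a bounded number of times, and keeps whatever succeeded even if a later unit fails. The `run` helper, the `Outcome` type, and the group names are hypothetical; this shows the general pattern, not how any particular Iceberg action is implemented.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

// Toy model of partial progress: each group succeeds or fails independently.
class PartialProgressSketch {

  // Hypothetical summary of which groups were committed and which gave up.
  record Outcome(List<String> committed, List<String> failed) {}

  // Attempts each group up to maxAttempts times; earlier successes are kept
  // even when a later group exhausts its attempts.
  static Outcome run(List<String> groups, Predicate<String> rewrite, int maxAttempts) {
    List<String> committed = new ArrayList<>();
    List<String> failed = new ArrayList<>();
    for (String group : groups) {
      boolean ok = false;
      for (int attempt = 1; attempt <= maxAttempts && !ok; attempt++) {
        ok = rewrite.test(group);
      }
      (ok ? committed : failed).add(group);
    }
    return new Outcome(committed, failed);
  }

  public static void main(String[] args) {
    Set<String> triedOnce = new HashSet<>();
    // Hypothetical rewrite: "g2" fails on its first attempt only, "g3" always fails.
    Predicate<String> rewrite = group -> {
      if (group.equals("g3")) {
        return false;
      }
      if (group.equals("g2")) {
        return !triedOnce.add(group); // add() is true only on the first attempt
      }
      return true;
    };

    Outcome outcome = run(List.of("g1", "g2", "g3"), rewrite, 2);
    System.out.println("committed=" + outcome.committed() + " failed=" + outcome.failed());
  }
}
```

The trade-off is atomicity: committing per unit means a failure no longer discards all completed work, but the table passes through intermediate states rather than changing in a single commit.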