The OverwriteFiles interface provides an API for replacing existing files in an Iceberg table with new files.
Overview
OverwriteFiles accumulates file additions and deletions, producing a new snapshot that replaces deleted files with added files. This is used for:
- Idempotent writes that replace partitions
- Update/delete operations that eagerly overwrite files
- Data compaction and optimization
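The replacement semantics described above can be pictured with a small in-memory model. This is only a sketch under stated assumptions: `SnapshotModel` and `apply` are hypothetical names, file paths stand in for data files, and nothing here calls Iceberg itself.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Toy model of an overwrite commit: a snapshot is just a set of file paths.
// Hypothetical helper for illustration, not an Iceberg class.
final class SnapshotModel {
    private SnapshotModel() {}

    // New snapshot = (current files - deleted files) + added files.
    // The input sets are left unmodified.
    static Set<String> apply(Set<String> current, Set<String> deleted, Set<String> added) {
        // Assumption of this toy: deleting a file that is not in the table is an error.
        if (!current.containsAll(deleted)) {
            throw new IllegalArgumentException("cannot delete files that are not in the table");
        }
        Set<String> next = new LinkedHashSet<>(current);
        next.removeAll(deleted);
        next.addAll(added);
        return next;
    }
}
```

In this model a compaction is simply a call where `deleted` holds the small files and `added` holds the merged file.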
Interface
Core Methods
overwriteByRowFilter()
Deletes files that match an expression on data rows.
Parameters:
- expr - An expression on rows in the table
Throws:
- ValidationException if a file can contain both rows that match and rows that do not
Description:
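A file is selected as a candidate for deletion if it could contain any rows that match the expression (using an inclusive projection). A candidate file is deleted only if all rows in the file must match the expression (using a strict projection). A candidate that may contain both matching and non-matching rows causes a ValidationException.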
Example:
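The decision rule can be sketched with a toy model. The code below is an illustration, not Iceberg's implementation: it summarises each file by the minimum and maximum of a single integer column and treats the filter as an inclusive range, which is a simplification of how projections are evaluated. `RowFilterDecision` and its methods are hypothetical names, and `IllegalStateException` stands in for `ValidationException`.

```java
// Toy model of the delete decision for a row filter on one integer column.
// A file is summarised by the [fileMin, fileMax] range of that column and the
// filter is the inclusive range [lo, hi]. Hypothetical names, not Iceberg classes.
final class RowFilterDecision {
    enum Outcome { KEEP, DELETE }

    private RowFilterDecision() {}

    // Inclusive check: the file could contain at least one matching row.
    static boolean mayMatch(int fileMin, int fileMax, int lo, int hi) {
        return fileMin <= hi && fileMax >= lo;
    }

    // Strict check: every row in the file must match.
    static boolean allMatch(int fileMin, int fileMax, int lo, int hi) {
        return fileMin >= lo && fileMax <= hi;
    }

    static Outcome decide(int fileMin, int fileMax, int lo, int hi) {
        if (!mayMatch(fileMin, fileMax, lo, hi)) {
            return Outcome.KEEP;   // not a candidate, left in place
        }
        if (allMatch(fileMin, fileMax, lo, hi)) {
            return Outcome.DELETE; // the filter covers the whole file
        }
        // Candidate that is only partly covered: the overwrite must fail.
        throw new IllegalStateException("file may contain both matching and non-matching rows");
    }
}
```

With a filter of `[10, 20]`, a file spanning `[12, 18]` is deleted, a file spanning `[1, 5]` is kept, and a file spanning `[5, 15]` makes the operation fail.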
addFile()
Adds a data file to the table.
Parameters:
- file - A data file to add
deleteFile()
Deletes a data file from the table.
Parameters:
- file - A data file to delete
deleteFiles()
Deletes a set of data files along with their delete files.
Parameters:
- dataFilesToDelete - The data files to be deleted
- deleteFilesToDelete - The delete files corresponding to the data files
Validation Methods
validateAddedFilesMatchOverwriteFilter()
Validates that each added file matches the overwrite expression.
validateFromSnapshot()
Sets the snapshot ID used in any reads for this operation.
Parameters:
- snapshotId - A snapshot ID
conflictDetectionFilter()
Sets a conflict detection filter for validating concurrent changes.
Parameters:
- conflictDetectionFilter - An expression on rows in the table
validateNoConflictingData()
Enables validation that concurrently added data does not conflict.
validateNoConflictingDeletes()
Enables validation that concurrent deletes do not conflict.
caseSensitive()
Enables or disables case sensitive expression binding.
Parameters:
- caseSensitive - Whether expression binding should be case sensitive
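How a starting snapshot and a conflict detection filter might work together can be sketched with a toy model. It assumes that validation inspects only commits made after the starting snapshot and that a conflict is any concurrently added file selected by the filter. `ConflictCheck`, `Commit`, and `hasConflictingData` are hypothetical names, and partition strings stand in for files.

```java
import java.util.List;
import java.util.function.Predicate;

// Toy model of concurrent-data validation. Hypothetical names, not Iceberg classes.
final class ConflictCheck {
    // One committed snapshot: its ID and the partitions of the files it added.
    static final class Commit {
        final long snapshotId;
        final List<String> addedPartitions;

        Commit(long snapshotId, List<String> addedPartitions) {
            this.snapshotId = snapshotId;
            this.addedPartitions = addedPartitions;
        }
    }

    private ConflictCheck() {}

    // history is ordered oldest to newest. Returns true if any commit made after
    // fromSnapshotId added a file in a partition selected by conflictFilter.
    static boolean hasConflictingData(List<Commit> history, long fromSnapshotId,
                                      Predicate<String> conflictFilter) {
        int start = -1;
        for (int i = 0; i < history.size(); i++) {
            if (history.get(i).snapshotId == fromSnapshotId) {
                start = i;
                break;
            }
        }
        if (start < 0) {
            throw new IllegalArgumentException("unknown snapshot " + fromSnapshotId);
        }
        // Only commits after the starting snapshot count as concurrent changes.
        for (Commit commit : history.subList(start + 1, history.size())) {
            for (String partition : commit.addedPartitions) {
                if (conflictFilter.test(partition)) {
                    return true;
                }
            }
        }
        return false;
    }
}
```

A narrower filter reports fewer conflicts, which is why an operation would usually restrict the filter to the rows it actually read.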
Examples
Partition Overwrite (Idempotent)
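Why a partition overwrite is idempotent can be shown with an in-memory stand-in. This is a sketch rather than Iceberg API usage: `PartitionOverwrite` is a hypothetical class, the table is a map from file path to partition value, and the check on added files plays the role that validateAddedFilesMatchOverwriteFilter() plays in the real interface.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model of an idempotent partition overwrite.
// Hypothetical names, not Iceberg classes.
final class PartitionOverwrite {
    private PartitionOverwrite() {}

    // table and added both map file path -> partition value.
    static Map<String, String> overwrite(Map<String, String> table, String partition,
                                         Map<String, String> added) {
        // Reject any added file that a rerun of the same overwrite would not remove.
        for (Map.Entry<String, String> entry : added.entrySet()) {
            if (!partition.equals(entry.getValue())) {
                throw new IllegalArgumentException(
                    entry.getKey() + " is outside partition " + partition);
            }
        }
        Map<String, String> next = new LinkedHashMap<>(table);
        next.values().removeIf(partition::equals); // drop every file in the partition
        next.putAll(added);                        // then add the replacement files
        return next;
    }
}
```

Running the same overwrite a second time on its own result leaves the table unchanged, because every added file lies inside the partition that the rerun replaces.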
File-Level Overwrite
Compaction with Overwrite
Dynamic Overwrite
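Assuming "dynamic overwrite" here means replacing exactly the partitions that the incoming files touch, the idea can be sketched with a toy model. `DynamicOverwrite` is a hypothetical class that works on a map from file path to partition value and does not call Iceberg.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Toy model of a dynamic overwrite: the partitions to replace are derived
// from the new files instead of being named up front.
// Hypothetical names, not Iceberg classes.
final class DynamicOverwrite {
    private DynamicOverwrite() {}

    // table and added both map file path -> partition value.
    static Map<String, String> overwrite(Map<String, String> table, Map<String, String> added) {
        // Partitions touched by the incoming files.
        Set<String> touched = new HashSet<>(added.values());
        Map<String, String> next = new LinkedHashMap<>(table);
        next.values().removeIf(touched::contains); // drop existing files in those partitions
        next.putAll(added);                        // partitions not touched stay as they were
        return next;
    }
}
```

Compared with an overwrite by an explicit filter, the caller does not name the partitions, so the result depends entirely on which files are added.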
Non-Idempotent Update with Validation
Multi-Partition Overwrite
Overwrite with Delete Files
Validation Modes
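Idempotent Overwrite
For operations that can safely be retried, call validateAddedFilesMatchOverwriteFilter() so that every added file is covered by the overwrite expression and a rerun would replace the same files.
Non-Idempotent Overwrite
For operations that must check for conflicts, set the starting snapshot with validateFromSnapshot(), restrict the check with conflictDetectionFilter(), and enable validateNoConflictingData() and validateNoConflictingDeletes().
See Also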
- AppendFiles - Append files to a table
- DeleteFiles - Delete files from a table
- RowDelta - Row-level changes
- Expressions - Filter expressions