The DeleteFiles interface provides an API for removing data files from an Iceberg table.
Overview
DeleteFiles accumulates file deletions, produces a new snapshot of the table, and commits that snapshot as the current one. It is used to remove files that are no longer needed or to delete data matching specific criteria.
Interface
Core Methods
deleteFile() with Path
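Deletes a file by its path.
path - A fully-qualified file path to remove from the table
The path must exactly equal a path recorded in the table's metadata; paths that are different but equivalent are not removed. For example, file:/path/file.avro is equivalent to file:///path/file.avro, but would not remove the latter.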
Example:
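The exact-match rule can be illustrated without Iceberg on the classpath. The sketch below is a toy in-memory model, not the library's implementation: `PathMatchSketch` and `isTracked` are hypothetical names, and only the `deleteFile(path)` / `commit()` call shape is borrowed from the interface described here. It assumes paths are compared as plain strings.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical in-memory stand-in for a table's file list; not an Iceberg class.
final class PathMatchSketch {
    private final Set<String> trackedPaths = new LinkedHashSet<>();
    private final Set<String> pendingDeletes = new LinkedHashSet<>();

    PathMatchSketch(String... paths) {
        for (String p : paths) {
            trackedPaths.add(p);
        }
    }

    // Mirrors the shape of deleteFile(path): records the request, changes nothing yet.
    PathMatchSketch deleteFile(CharSequence path) {
        pendingDeletes.add(path.toString());
        return this;
    }

    // Applies all pending deletes in one step, comparing paths as exact strings.
    // Returns the number of files actually removed.
    int commit() {
        int before = trackedPaths.size();
        trackedPaths.removeAll(pendingDeletes);
        pendingDeletes.clear();
        return before - trackedPaths.size();
    }

    boolean isTracked(String path) {
        return trackedPaths.contains(path);
    }
}
```

With a table tracking file:///path/file.avro, calling `deleteFile("file:/path/file.avro").commit()` in this model removes nothing, while passing the path exactly as tracked removes one file.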
deleteFile() with DataFile
Deletes a file tracked by a DataFile.
file - A DataFile to remove from the table
deleteFromRowFilter()
Deletes files that match an expression on data rows.
expr - An expression on rows in the table
Throws ValidationException if a file can contain both rows that match and rows that do not
Description:
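A file is selected as a candidate for deletion if it could contain any rows that match the expression (using an inclusive projection). A candidate file is deleted only if all rows in the file must match the expression (using a strict projection). A candidate that may hold both matching and non-matching rows is not deleted; it causes a ValidationException.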
Example:
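The two-step selection can be sketched with a toy model. This is not Iceberg code: `RowFilterSketch`, `FileBounds`, `Decision`, and `decide` are hypothetical names, the filter is fixed to the form `column < cutoff`, and per-file min/max bounds stand in for the partition data that the real projections evaluate. `IllegalStateException` stands in for ValidationException.

```java
// Toy model of the inclusive/strict decision for a filter "column < cutoff".
// All names here are hypothetical stand-ins, not Iceberg classes.
final class RowFilterSketch {
    enum Decision { KEEP, DELETE }

    // Smallest and largest value of the filtered column within one file.
    static final class FileBounds {
        final String path;
        final long min;
        final long max;

        FileBounds(String path, long min, long max) {
            this.path = path;
            this.min = min;
            this.max = max;
        }
    }

    static Decision decide(FileBounds file, long cutoff) {
        // Inclusive test: could at least one row satisfy column < cutoff?
        boolean mightMatch = file.min < cutoff;
        // Strict test: must every row satisfy column < cutoff?
        boolean allMatch = file.max < cutoff;

        if (!mightMatch) {
            return Decision.KEEP; // not a candidate at all
        }
        if (allMatch) {
            return Decision.DELETE; // safe to drop the whole file
        }
        // Candidate with mixed rows: stand-in for ValidationException.
        throw new IllegalStateException(
            "Cannot delete " + file.path + ": only some rows match");
    }
}
```

With a cutoff of 10, a file with bounds [1, 5] is deleted, a file with bounds [20, 30] is kept, and a file with bounds [5, 15] is rejected because dropping it would also remove rows that do not match.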
caseSensitive()
Enables or disables case-sensitive expression binding.
caseSensitive - Whether expression binding should be case sensitive
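What the flag changes can be sketched as a name lookup against the schema's column names. This is a simplified illustration under the assumption that binding resolves each column reference in the expression by name; `BindingSketch` and `resolve` are hypothetical and not part of Iceberg.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical helper: resolves a column reference against schema column names.
final class BindingSketch {
    static Optional<String> resolve(List<String> columns, String name, boolean caseSensitive) {
        for (String column : columns) {
            boolean same = caseSensitive
                ? column.equals(name)
                : column.equalsIgnoreCase(name);
            if (same) {
                return Optional.of(column); // return the schema's own spelling
            }
        }
        return Optional.empty(); // reference cannot be bound
    }
}
```

In this model, a reference to `EVENTDATE` resolves to a schema column named `eventDate` only when `caseSensitive` is false; with case-sensitive binding the reference stays unresolved.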
validateFilesExist()
Enables validation that deleted files still exist when committing.
Examples
Delete Single File
Delete Multiple Files
Delete Files from Scan
Delete Partition
Delete Multiple Partitions
Delete with Validation
Conditional Delete
Delete Small Files
Delete with Time-Based Filter
Incremental Delete Pattern
Case-Insensitive Delete
Safe Delete with Error Handling
Important Notes
Path Matching
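Paths must match exactly. A path that is equivalent to, but not identical to, a path tracked by the table does not remove that file.
Atomic Operations
All deletions are atomic. The deletions accumulated in one operation are committed together as a single new snapshot.
Expression-Based Deletion
Files are only deleted if all rows match the expression. A file that can contain both matching and non-matching rows causes a ValidationException.
See Also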
- AppendFiles - Append files to a table
- OverwriteFiles - Overwrite files in a table
- RowDelta - Row-level changes
- Expressions - Filter expressions