Feature Support
Hive supports the following features with Hive version 4.0.0 and above:
Table Operations
- Creating an Iceberg table
- Creating an Iceberg identity-partitioned table
- Creating an Iceberg table with any partition spec, including various transforms
- Creating a table from an existing table (CTAS)
- Dropping a table
- Altering a table while keeping Iceberg and Hive schemas in sync
- Altering the partition schema (updating columns)
- Altering the partition schema by specifying partition transforms
- Truncating a table/partition, dropping a partition
- Migrating tables in Avro, Parquet, or ORC (Non-ACID) format to Iceberg
Query Operations
- Reading an Iceberg table
- Reading the schema of a table
- Querying Iceberg metadata tables
- Time travel applications
Write Operations
- Inserting into a table/partition (INSERT INTO)
- Inserting data overwriting existing data (INSERT OVERWRITE)
- Copy-on-write support for DELETE, UPDATE and MERGE queries
- CRUD support for Iceberg V1 tables
Advanced Features
- Expiring snapshots
- Creating tables like existing tables (CTLT)
- Supporting Parquet compression types
- Altering table metadata location
- Supporting table rollback
- Honoring sort orders on existing tables when writing
- Creating, writing to and dropping Iceberg branches/tags
- Setting current snapshot using snapshot ID
- Table renaming
- Converting tables to Iceberg format
- Fast forwarding and cherry-picking commits to branches
- Creating a branch from a tag
- Deleting orphan files
- Full table compaction
- Showing partition information (SHOW PARTITIONS)
Version Support
Hive 4.1.x
Hive 4.1.x comes with Iceberg 1.9.1 included.
Hive 4.0.x
Hive 4.0.x comes with Iceberg 1.4.3 included.
Starting from Iceberg 1.8.0, Iceberg no longer releases a Hive runtime connector. For Hive 2.x and 3.x integration, use the Hive runtime connector from Iceberg 1.6.1, or use Hive 4.0.0 or later.
Enabling Iceberg Support
If the Iceberg storage handler is not in Hive’s classpath, Hive cannot load or update the metadata for an Iceberg table. To avoid broken tables in Hive, Iceberg will not add the storage handler unless Hive support is enabled.
Hadoop Configuration
To enable Hive support globally for an application, set `iceberg.engine.hive.enabled=true` in its Hadoop configuration:
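For example, the property can be added to the site configuration; a sketch of the corresponding `hive-site.xml` entry:

```xml
<!-- Enables the Iceberg storage handler for tables created through this application -->
<property>
  <name>iceberg.engine.hive.enabled</name>
  <value>true</value>
</property>
```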
For example, a `hive-site.xml` with this setting loaded by Spark will enable the storage handler for all tables created by Spark.
Table Property Configuration
Alternatively, set the property `engine.hive.enabled=true` when creating the Iceberg table:
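A sketch of a per-table opt-in using the storage handler class (the table name and schema are illustrative):

```sql
-- Enable Hive support for this table only, via a table property
CREATE TABLE x (i int)
STORED BY 'org.apache.iceberg.mr.hive.HiveIcebergStorageHandler'
TBLPROPERTIES ('engine.hive.enabled'='true');
```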
The table level configuration overrides the global Hadoop configuration.
Catalog Management
Global Hive Catalog
From the Hive engine’s perspective, there is only one global data catalog defined in the Hadoop configuration. In contrast, Iceberg supports multiple catalog types such as Hive, Hadoop, AWS Glue, or custom implementations. A table in the Hive metastore can represent three different ways of loading an Iceberg table, depending on the table’s `iceberg.catalog` property:
- HiveCatalog - no `iceberg.catalog` is set
- Custom catalog - `iceberg.catalog` is set to a catalog name
- Location-based table - `iceberg.catalog` is set to `location_based_table`
Custom Iceberg Catalogs
To globally register different catalogs, set the following Hadoop configurations:

| Config Key | Description |
|---|---|
| iceberg.catalog.<catalog_name>.type | Type of catalog: hive, hadoop, or left unset if using a custom catalog |
| iceberg.catalog.<catalog_name>.catalog-impl | Catalog implementation; must not be null if type is empty |
| iceberg.catalog.<catalog_name>.<key> | Any config key and value pairs for the catalog |
Examples
Register a HiveCatalog called `another_hive`:
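A sketch using Hive `SET` commands (the URI, client count, and warehouse path are placeholders):

```sql
-- Register a Hive-type catalog pointing at another metastore
SET iceberg.catalog.another_hive.type=hive;
SET iceberg.catalog.another_hive.uri=thrift://example.com:9083;
SET iceberg.catalog.another_hive.clients=10;
SET iceberg.catalog.another_hive.warehouse=hdfs://example.com:8020/warehouse;
```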
Register a HadoopCatalog called `hadoop`:
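A sketch (the warehouse path is a placeholder):

```sql
-- Register a Hadoop-type catalog rooted at a warehouse directory
SET iceberg.catalog.hadoop.type=hadoop;
SET iceberg.catalog.hadoop.warehouse=hdfs://example.com:8020/warehouse;
```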
Register an AWS GlueCatalog called `glue`:
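A sketch using a custom catalog implementation (the S3 warehouse prefix is a placeholder):

```sql
-- Custom catalogs leave the type unset and supply catalog-impl instead
SET iceberg.catalog.glue.catalog-impl=org.apache.iceberg.aws.glue.GlueCatalog;
SET iceberg.catalog.glue.warehouse=s3://my-bucket/my/key/prefix;
```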
Type Compatibility
Hive and Iceberg support different sets of types. Iceberg can perform type conversion automatically for some combinations. Enable auto-conversion through Hadoop configuration:
Hive to Iceberg Type Mapping
| Hive | Iceberg | Notes |
|---|---|---|
| boolean | boolean | |
| short | integer | auto-conversion |
| byte | integer | auto-conversion |
| integer | integer | |
| long | long | |
| float | float | |
| double | double | |
| date | date | |
| timestamp | timestamp without timezone | |
| timestamplocaltz | timestamp with timezone | Hive 3 only |
| char | string | auto-conversion |
| varchar | string | auto-conversion |
| string | string | |
| binary | binary | |
| decimal | decimal | |
| struct | struct | |
| list | list | |
| map | map | |
| interval_year_month | not supported | |
| interval_day_time | not supported | |
| union | not supported | |
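The rows marked "auto-conversion" above are gated behind a configuration flag; a sketch of enabling it for a session, assuming the `iceberg.mr.schema.auto.conversion` key from the Hive runtime:

```sql
-- Allow Hive short/byte/char/varchar to be converted to Iceberg types on write
SET iceberg.mr.schema.auto.conversion=true;
```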
DDL Commands
CREATE TABLE
Non-partitioned Tables
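A minimal sketch, assuming Hive 4's `STORED BY ICEBERG` shorthand (the table name and columns are illustrative):

```sql
-- Creates an unpartitioned Iceberg table registered in HMS
CREATE TABLE x (i int, s string, ts timestamp) STORED BY ICEBERG;
```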
Partitioned Tables
Create Iceberg partitioned tables using familiar syntax:
The resulting table does not create partitions in HMS, but instead converts partition data into Iceberg identity partitions.
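A sketch of the familiar `PARTITIONED BY` form (names are illustrative):

```sql
-- The dept column becomes an Iceberg identity partition, not an HMS partition
CREATE TABLE orders (id bigint, amount double)
PARTITIONED BY (dept string)
STORED BY ICEBERG;
```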
- `years(ts)`: partition by year
- `months(ts)`: partition by month
- `days(ts)` or `date(ts)`: equivalent to dateint partitioning
- `hours(ts)` or `date_hour(ts)`: equivalent to dateint and hour partitioning
- `bucket(N, col)`: partition by hashed value mod N buckets
- `truncate(L, col)`: partition by value truncated to L
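The transforms above can be combined in a `PARTITIONED BY SPEC` clause; a sketch (table and column names are illustrative):

```sql
-- Advanced partition spec: daily partitions on ts plus 16 hash buckets on id
CREATE TABLE events (id bigint, ts timestamp, name string)
PARTITIONED BY SPEC (days(ts), bucket(16, id))
STORED BY ICEBERG;
```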
CREATE TABLE AS SELECT
The Iceberg table and corresponding Hive table are created at the beginning of query execution. Data is inserted when the query finishes.
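A CTAS sketch (the source and target names are illustrative):

```sql
-- Table is created up front; rows land when the SELECT completes
CREATE TABLE target STORED BY ICEBERG AS SELECT * FROM source;
```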
CREATE TABLE LIKE TABLE
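A CTLT sketch, copying the schema of an existing table into a new empty Iceberg table (names are illustrative):

```sql
CREATE TABLE target LIKE source STORED BY ICEBERG;
```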
CREATE EXTERNAL TABLE
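A sketch of the overlay pattern, using the `location_based_table` mode described in the catalog section (the table name and location are placeholders):

```sql
-- Points Hive at Iceberg metadata that already exists at this location
CREATE EXTERNAL TABLE table_a
STORED BY 'org.apache.iceberg.mr.hive.HiveIcebergStorageHandler'
LOCATION 'hdfs://some_bucket/some_path/table_a'
TBLPROPERTIES ('iceberg.catalog'='location_based_table');
```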
A Hive table can be overlaid on top of an existing Iceberg table.
DML Commands
SELECT
Select statements work the same on Iceberg tables, with added benefits:
- No file system listings - especially important on blob stores like S3
- No partition listing from the Metastore
- Advanced partition filtering - partition keys can be calculated
- Handles a higher number of partitions than normal Hive tables
- Predicate pushdown to Iceberg TableScan and readers
- Column projection to reduce columns read
- Tez query execution engine support (Hive 4.x)
INSERT INTO
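A sketch of both insert forms (table names and values are illustrative):

```sql
INSERT INTO target VALUES (1, 'a');
INSERT INTO target SELECT id, name FROM source;
```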
INSERT OVERWRITE
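A sketch (names are illustrative):

```sql
-- Replaces existing data in target with the query result
INSERT OVERWRITE TABLE target SELECT * FROM source;
```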
DELETE FROM
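A sketch (table and predicate are illustrative):

```sql
DELETE FROM target WHERE id > 100;
```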
UPDATE
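A sketch (table, column, and predicate are illustrative):

```sql
UPDATE target SET name = 'updated' WHERE id = 1;
```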
MERGE INTO
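A sketch of a merge, assuming `target` and `source` share an `(id, name)` schema:

```sql
MERGE INTO target t
USING source s
ON t.id = s.id
WHEN MATCHED THEN UPDATE SET name = s.name
WHEN NOT MATCHED THEN INSERT VALUES (s.id, s.name);
```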
Metadata Tables
Query Iceberg metadata tables using the full table name:
- all_data_files
- all_delete_files
- all_entries
- all_files
- all_manifests
- data_files
- delete_files
- entries
- files
- history
- manifests
- metadata_log_entries
- partitions
- refs
- snapshots
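A sketch of querying two of the metadata tables above (the database and table names are illustrative):

```sql
SELECT * FROM default.table_a.files;
SELECT * FROM default.table_a.snapshots;
```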
Time Travel
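A sketch of time-travel reads (the timestamp and snapshot ID are illustrative):

```sql
-- Read the table as of a point in time
SELECT * FROM t FOR SYSTEM_TIME AS OF '2021-08-09 10:35:57';
-- Read the table as of a specific snapshot ID
SELECT * FROM t FOR SYSTEM_VERSION AS OF 1234567;
```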
Query historical table snapshots by timestamp or by snapshot ID.
Maintenance Operations
Expire Snapshots
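A sketch of expiring snapshots older than a timestamp (the timestamp is illustrative):

```sql
ALTER TABLE t EXECUTE EXPIRE_SNAPSHOTS('2021-12-09 05:39:18.689000000');
```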
Delete Orphan Files
Table Rollback
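A sketch of both rollback forms (the timestamp and snapshot ID are illustrative):

```sql
-- Roll back to the last snapshot before the given time
ALTER TABLE t EXECUTE ROLLBACK('2022-05-12 00:00:00');
-- Roll back to a specific snapshot ID
ALTER TABLE t EXECUTE ROLLBACK(1111);
```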
Rollback restores the table to the last snapshot before a specific timestamp, or to a specific snapshot ID.
Compaction
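A sketch of triggering a full rewrite of the table's data files, assuming Hive 4's `OPTIMIZE TABLE` syntax:

```sql
OPTIMIZE TABLE t REWRITE DATA;
```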
Next Steps
Configuration
Configure Iceberg table properties
Maintenance
Maintain Iceberg tables