Dependency
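A sketch of the Maven coordinates, assuming the `flink-orc` artifact; the version placeholder is an assumption and should match your Flink distribution:

```xml
<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>flink-orc</artifactId>
  <version><!-- your Flink version --></version>
</dependency>
```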
Usage with Table API / SQL
Filesystem source and sink
Streaming insert
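A minimal sketch of a filesystem table using the ORC format, followed by a streaming insert. The table names, columns, and path are assumptions for illustration:

```sql
CREATE TABLE orders_sink (
  order_id   BIGINT,
  price      DECIMAL(10, 2),
  order_time TIMESTAMP(3)
) WITH (
  'connector' = 'filesystem',
  'path' = 'file:///tmp/orders',  -- hypothetical path
  'format' = 'orc'
);

-- Continuously write rows from another (hypothetical) table:
INSERT INTO orders_sink
SELECT order_id, price, order_time FROM orders_source;
```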
Format options
| Option | Required | Default | Description |
|---|---|---|---|
| format | Yes | — | Must be `'orc'`. |
In addition, ORC writer properties can be specified in the table options and are forwarded to the ORC writer. Common ones:

| ORC property | Common values | Description |
|---|---|---|
| orc.compress | NONE, ZLIB, SNAPPY, LZO, LZ4, ZSTD | Compression codec. Default is ZLIB. |
| orc.compress.size | (integer) | Compression chunk size in bytes. |
| orc.stripe.size | (integer) | Size of ORC stripes in bytes. |
| orc.row.index.stride | (integer) | Number of rows between row index entries. |
| orc.bloom.filter.columns | (comma-separated column names) | Columns for which to create Bloom filter indexes. |
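For example, a sketch of a table definition that overrides the compression codec (table name, columns, and path are assumptions):

```sql
CREATE TABLE compressed_sink (
  order_id BIGINT,
  payload  STRING
) WITH (
  'connector' = 'filesystem',
  'path' = 'file:///tmp/compressed',  -- hypothetical path
  'format' = 'orc',
  'orc.compress' = 'SNAPPY'           -- use SNAPPY instead of the default ZLIB
);
```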
Data type mapping
ORC format type mapping is compatible with Apache Hive.

| Flink Data Type | ORC physical type | ORC logical type |
|---|---|---|
| CHAR | bytes | CHAR |
| VARCHAR | bytes | VARCHAR |
| STRING | bytes | STRING |
| BOOLEAN | long | BOOLEAN |
| BYTES / BINARY / VARBINARY | bytes | BINARY |
| DECIMAL | decimal | DECIMAL |
| TINYINT | long | BYTE |
| SMALLINT | long | SHORT |
| INT | long | INT |
| BIGINT | long | LONG |
| FLOAT | double | FLOAT |
| DOUBLE | double | DOUBLE |
| DATE | long | DATE |
| TIMESTAMP | timestamp | TIMESTAMP |
| ARRAY | — | LIST |
| MAP | — | MAP |
| ROW | — | STRUCT |
Usage with DataStream API
To write ORC files from a DataStream job, implement a `Vectorizer` that converts records of your type into ORC's `VectorizedRowBatch`, then wrap it in an `OrcBulkWriterFactory` and use that factory with a `FileSink`.
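A minimal sketch, assuming a hypothetical `Person` type with `name` and `age` fields; the schema string and output path are assumptions:

```java
import org.apache.flink.connector.file.sink.FileSink;
import org.apache.flink.core.fs.Path;
import org.apache.flink.orc.vector.Vectorizer;
import org.apache.flink.orc.writer.OrcBulkWriterFactory;
import org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector;
import org.apache.hadoop.hive.ql.exec.vector.LongColumnVector;
import org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch;

import java.io.IOException;
import java.nio.charset.StandardCharsets;

// Hypothetical record type used throughout this example.
public class Person {
    public final String name;
    public final int age;
    public Person(String name, int age) { this.name = name; this.age = age; }
}

// Converts Person records into the columns of a VectorizedRowBatch.
public class PersonVectorizer extends Vectorizer<Person> {

    public PersonVectorizer(String schema) {
        super(schema);
    }

    @Override
    public void vectorize(Person element, VectorizedRowBatch batch) throws IOException {
        BytesColumnVector nameCol = (BytesColumnVector) batch.cols[0];
        LongColumnVector ageCol = (LongColumnVector) batch.cols[1];
        int row = batch.size++;  // claim the next row in the batch
        nameCol.setVal(row, element.name.getBytes(StandardCharsets.UTF_8));
        ageCol.vector[row] = element.age;
    }
}

// Wiring the vectorizer into a FileSink:
String schema = "struct<_col0:string,_col1:int>";
OrcBulkWriterFactory<Person> factory =
        new OrcBulkWriterFactory<>(new PersonVectorizer(schema));
FileSink<Person> sink = FileSink
        .forBulkFormat(new Path("/tmp/orc-output"), factory)  // hypothetical path
        .build();
stream.sinkTo(sink);
```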
Adding user metadata
You can attach key-value user metadata to the ORC files by calling `addUserMetadata` from within the `vectorize` method.
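A sketch, assuming a `Vectorizer<Person>` subclass for a hypothetical `Person` type; `addUserMetadata(String, ByteBuffer)` is inherited from `Vectorizer`, and the metadata key and value below are assumptions:

```java
import org.apache.flink.orc.vector.Vectorizer;
import org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch;

import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class PersonVectorizer extends Vectorizer<Person> {

    public PersonVectorizer(String schema) {
        super(schema);
    }

    @Override
    public void vectorize(Person element, VectorizedRowBatch batch) throws IOException {
        // ... fill the column vectors for this element ...

        // Attach user metadata; it is stored in the ORC file footer.
        this.addUserMetadata(
                "writer.origin",                                               // hypothetical key
                ByteBuffer.wrap("flink-job".getBytes(StandardCharsets.UTF_8))  // hypothetical value
        );
    }
}
```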

