TableEnvironment.
Flink supports the following UDF kinds:
| Kind | Base class | Input | Output |
|---|---|---|---|
| Scalar function | ScalarFunction | One or more values per row | One value per row |
| Async scalar function | AsyncScalarFunction | One or more values per row | One value per row (async) |
| Table function | TableFunction<T> | One or more values per row | Zero or more rows |
| Async table function | AsyncTableFunction<T> | One or more values per row | Zero or more rows (async) |
| Aggregate function | AggregateFunction<T, ACC> | Values from multiple rows | One value per group |
| Table aggregate function | TableAggregateFunction<T, ACC> | Values from multiple rows | Zero or more rows per group |
Registering functions
createTemporarySystemFunction is available everywhere in the session. A function registered with createFunction is stored in the current catalog and database and survives session restarts (when using a persistent catalog).
Scalar functions
A scalar function maps zero, one, or multiple input values to a single output value. Implement one or more publiceval(...) methods. The method signature determines the input and output types.
Basic example
Overloaded eval methods
Flink dispatches to the correct overload based on argument types:Parameterized functions
Pass constructor parameters when registering by instance:Python scalar function
Async scalar functions
UseAsyncScalarFunction when the function needs to make network calls (database lookups, REST requests). Async execution overlaps waiting time across multiple concurrent requests, dramatically increasing throughput.
Table functions
A table function (UDTF) maps one input row to zero, one, or many output rows. Implementeval(...) and call collect(...) to emit each output row.
Python table function
Aggregate functions
Aggregate functions (UDAGGs) map values from multiple rows to a single scalar result. The function maintains an accumulator that is updated row by row:createAccumulator()— creates the initial (empty) accumulator.accumulate(acc, value...)— called for each input row.getValue(acc)— returns the final result after all rows are processed.retract(acc, value...)— optional; required for streaming queries that retract rows.merge(acc, Iterable<ACC>)— optional; required for session windows and boundedOVERaggregates.
Type inference with annotations
Flink infers input and output types from the Java method signature using reflection. Use annotations when the default inference is insufficient.@DataTypeHint
@FunctionHint
Named parameters with @ArgumentHint
Annotate parameters with@ArgumentHint to support named-parameter call syntax in SQL:
Runtime context and open/close lifecycle
Access job parameters, metrics, and distributed cache files viaFunctionContext:
Function resolution order
When multiple functions share the same name, Flink resolves them in this order:- Temporary system functions
- System (built-in) functions
- Temporary catalog functions in the current catalog and database
- Catalog functions in the current catalog and database
catalog.database.function) for precise reference.
