What the Table API and SQL are
The Table API and SQL are high-level abstractions built on top of Flink’s DataStream API. They represent data as dynamic tables: tables that change over time as new records arrive. A SQL `SELECT` on a dynamic table produces a continuously updated result set, not a one-time snapshot.
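For example, an ordinary grouped aggregate becomes a continuously updating query when run on a dynamic table (the `orders` table and its columns here are hypothetical):

```sql
-- On a dynamic table, each arriving order revises the per-user count;
-- the result is an updating stream, not a one-time snapshot.
SELECT user_id, COUNT(*) AS order_cnt
FROM orders
GROUP BY user_id;
```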
Flink SQL is based on Apache Calcite, which implements the SQL standard. Most ANSI SQL constructs are supported, including SELECT, INSERT INTO, CREATE TABLE, GROUP BY, window aggregations, JOIN, subqueries, and common table expressions (CTEs). Flink extends standard SQL with streaming-specific constructs such as time attributes, watermarks, and the MATCH_RECOGNIZE pattern matching clause.
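As a sketch of the ANSI SQL surface, the following combines a CTE, a filter, and a grouped aggregation; the `orders` table is hypothetical:

```sql
-- Common table expression feeding a grouped aggregate,
-- written in plain ANSI SQL that Flink SQL accepts.
WITH big_orders AS (
  SELECT user_id, amount
  FROM orders
  WHERE amount > 100
)
SELECT user_id, COUNT(*) AS cnt, SUM(amount) AS total
FROM big_orders
GROUP BY user_id;
```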
The Table API is a fluent, type-safe Java/Scala/Python DSL that composes the same relational operations programmatically, e.g. `orders.groupBy($("user")).select($("user"), $("amount").sum())` in Java.
Relational vs. DataStream API
The DataStream API gives you full control over state, timers, and per-record processing logic. The Table API and SQL trade that control for conciseness and automatic optimization:

| | Table API & SQL | DataStream API |
|---|---|---|
| Abstraction level | Relational / declarative | Imperative / procedural |
| Optimization | Automatic (Calcite planner) | Manual |
| Type safety | Schema-based | Strongly typed streams |
| Best for | ETL, analytics, aggregations | Complex event processing, custom state |
The two APIs interoperate: you can convert a Table to a DataStream, or wrap a DataStream as a Table, to use the most appropriate abstraction for each part of your pipeline.
When to use each interface
Use the Table API when you want type-safe, programmatic composition of relational queries in Java, Scala, or Python, and you need to mix SQL statements with DataStream operations in the same program. Use Flink SQL when you prefer pure SQL strings: for interactive exploration in the SQL Client, for integration with BI tools via the SQL Gateway, or for embedding SQL statements in your application using `executeSql()`.
Use the DataStream API directly when you need fine-grained control over state backends, custom timers, side outputs, or processing logic that is difficult to express relationally.
Flink SQL vs. ANSI SQL
Flink SQL follows the SQL standard closely but has a few differences worth knowing:

- Streaming semantics: A `SELECT` on an unbounded table produces a continuous result stream, not a finite result set. Window functions (`TUMBLE`, `HOP`, `CUMULATE`) are required to bound aggregations on streams.
- Time attributes: Columns of type `TIMESTAMP_LTZ` or `TIMESTAMP` can be declared as event-time or processing-time attributes using the `WATERMARK` clause in `CREATE TABLE`.
- Dynamic tables: Tables are not static; they are updated continuously as data arrives.
- DDL extensions: `CREATE TABLE` accepts a `WITH` clause for connector and format properties.
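The streaming extensions above can be sketched in one pair of statements; the `orders` table and its columns are hypothetical, and the `datagen` connector is used only as a stand-in source:

```sql
-- WATERMARK declares order_time as an event-time attribute;
-- the WITH clause carries connector properties.
CREATE TABLE orders (
  order_id   STRING,
  amount     DOUBLE,
  order_time TIMESTAMP(3),
  WATERMARK FOR order_time AS order_time - INTERVAL '5' SECOND
) WITH (
  'connector' = 'datagen'
);

-- A TUMBLE windowing TVF bounds the aggregation on the unbounded stream.
SELECT window_start, window_end, SUM(amount) AS total
FROM TABLE(
  TUMBLE(TABLE orders, DESCRIPTOR(order_time), INTERVAL '1' MINUTES))
GROUP BY window_start, window_end;
```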
Flink SQL does not support all SQL features. Notably, correlated subqueries have limited support in streaming mode, and `ORDER BY` on unbounded streams requires a time attribute.

Section overview
Table Environment
Create and configure a `TableEnvironment`. Register tables, execute SQL statements, and manage catalogs.

Table API
Programmatic relational operators: `select`, `filter`, `groupBy`, `join`, `union`, and window aggregations.

Catalogs
Manage metadata for databases, tables, views, and functions. Use `GenericInMemoryCatalog`, `HiveCatalog`, or build your own.

User-Defined Functions
Extend Flink SQL with scalar functions, table functions, and aggregate functions written in Java or Python.

Data Types
Reference for all Flink SQL data types: numeric, string, date/time, and complex types (`ROW`, `ARRAY`, `MAP`).

SQL Client
Interactive CLI for running SQL queries without writing any Java or Scala code.

SQL Gateway
REST service for submitting SQL from remote clients, BI tools, and application servers.

