DataStream instances into new ones. You chain them together to build your processing pipeline. Flink evaluates the entire chain lazily when you call env.execute().
Basic transformations
map
Transforms each element one-to-one and returns a new element, potentially of a different type.
flatMap
Transforms each element into zero, one, or more output elements. Use a Collector to emit them.
filter
Keeps only elements for which the function returns true.
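The three basic transformations can be chained as in the minimal sketch below. It assumes Flink 1.19 or later (for `fromData`); the class name, the sample input, and the `isLongWord` predicate are hypothetical and exist only for illustration.

```java
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.util.Collector;

class BasicTransformationsSketch {

    // Hypothetical predicate, kept as a plain method so it can be checked without a cluster.
    static boolean isLongWord(String word) {
        return word.length() > 3;
    }

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        DataStream<String> lines = env.fromData("to be or not", "that is the question");

        DataStream<Integer> lengths = lines
            // flatMap: one line in, zero or more words out through the Collector
            .flatMap((String line, Collector<String> out) -> {
                for (String word : line.split(" ")) {
                    out.collect(word);
                }
            })
            // Lambdas that use a Collector lose their generic type, so declare it explicitly.
            .returns(Types.STRING)
            // filter: keep only the words for which the predicate returns true
            .filter(word -> isLongWord(word))
            // map: one word in, one Integer out (the element type changes)
            .map(word -> word.length())
            .returns(Types.INT);

        lengths.print();
        // Nothing runs until execute() is called.
        env.execute("basic-transformations-sketch");
    }
}
```

Named function classes (`MapFunction`, `FlatMapFunction`, `FilterFunction`) work equally well and avoid the explicit `returns(...)` hint.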
Keyed streams
keyBy
Partitions a stream by key. All elements with the same key are routed to the same parallel task. keyBy returns a KeyedStream, which is required for keyed state and keyed operators like reduce and window.
reduce
Applies a rolling reduce on a KeyedStream. Combines the current element with the previously reduced value for the same key and emits the result.
In BATCH mode, reduce emits only the final accumulated value per key. In STREAMING mode, it emits an updated value for each incoming element.
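A minimal sketch of keyBy followed by reduce, assuming Flink 1.19 or later. The class name, the `(product, quantity)` tuples, and the `sum` helper are hypothetical.

```java
import org.apache.flink.api.common.RuntimeExecutionMode;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.datastream.KeyedStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

class KeyedReduceSketch {

    // Combines the previously reduced value (a) with the current element (b) of the same key.
    static Tuple2<String, Integer> sum(Tuple2<String, Integer> a, Tuple2<String, Integer> b) {
        return Tuple2.of(a.f0, a.f1 + b.f1);
    }

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // Switch to RuntimeExecutionMode.BATCH to get only the final value per key.
        env.setRuntimeMode(RuntimeExecutionMode.STREAMING);

        // (product, quantity)
        DataStream<Tuple2<String, Integer>> sales = env.fromData(
            Tuple2.of("apples", 3),
            Tuple2.of("pears", 1),
            Tuple2.of("apples", 4));

        // All elements with the same product go to the same parallel task.
        KeyedStream<Tuple2<String, Integer>, String> byProduct = sales.keyBy(sale -> sale.f0);

        byProduct.reduce((a, b) -> sum(a, b)).print();
        env.execute("keyed-reduce-sketch");
    }
}
```

In STREAMING mode the key `apples` should produce two results, a quantity of 3 and then 7; in BATCH mode only the 7 would be expected.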
Windowing
window
Groups elements of a KeyedStream into windows. Windows collect elements and trigger computation when a condition is met (time elapsed, element count reached, etc.).
| Assigner | Description |
|---|---|
| TumblingEventTimeWindows.of(size) | Non-overlapping fixed-size event-time windows |
| TumblingProcessingTimeWindows.of(size) | Non-overlapping fixed-size processing-time windows |
| SlidingEventTimeWindows.of(size, slide) | Overlapping sliding windows |
| EventTimeSessionWindows.withGap(gap) | Session windows, closed after a period of inactivity |
| GlobalWindows.create() | A single window containing all elements (requires custom trigger) |
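As a sketch of how an assigner from the table is used, the example below sums readings per key in 5-second tumbling event-time windows. It assumes Flink 1.19 or later, where assigners accept `java.time.Duration` (earlier releases take Flink's `Time` class instead). The tuple layout and the `windowStart` helper are hypothetical; the helper only illustrates which window a timestamp falls into.

```java
import java.time.Duration;
import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.api.java.tuple.Tuple3;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.windowing.assigners.TumblingEventTimeWindows;

class TumblingWindowSketch {

    // Illustrative only: start of the tumbling window for a non-negative timestamp, zero offset.
    static long windowStart(long timestampMillis, long sizeMillis) {
        return timestampMillis - (timestampMillis % sizeMillis);
    }

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // (sensorId, eventTimeMillis, reading)
        DataStream<Tuple3<String, Long, Integer>> readings = env.fromData(
            Tuple3.of("s1", 1_000L, 2),   // falls into window [0 s, 5 s)
            Tuple3.of("s1", 4_000L, 3),   // falls into window [0 s, 5 s)
            Tuple3.of("s1", 6_000L, 5));  // falls into window [5 s, 10 s)

        readings
            // Event-time windows need timestamps and watermarks.
            .assignTimestampsAndWatermarks(
                WatermarkStrategy
                    .<Tuple3<String, Long, Integer>>forBoundedOutOfOrderness(Duration.ofSeconds(1))
                    .withTimestampAssigner((reading, previous) -> reading.f1))
            .keyBy(reading -> reading.f0)
            .window(TumblingEventTimeWindows.of(Duration.ofSeconds(5)))
            // Window function: sum the readings, keep the latest timestamp seen.
            .reduce((a, b) -> Tuple3.of(a.f0, Math.max(a.f1, b.f1), a.f2 + b.f2))
            .print();

        env.execute("tumbling-window-sketch");
    }
}
```

Swapping in another assigner, for example `EventTimeSessionWindows.withGap(...)`, changes only the `window(...)` line.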
windowAll
Opens windows on a non-keyed DataStream. All records are collected into a single task, so this operation is non-parallel. Use it only for small-volume aggregations.
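A minimal windowAll sketch, assuming Flink 1.19 or later. The `(eventTimeMillis, amount)` tuples and the `merge` helper are hypothetical; the point is that no keyBy precedes the window, so a single task sees every record.

```java
import java.time.Duration;
import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.windowing.assigners.TumblingEventTimeWindows;

class WindowAllSketch {

    // (eventTimeMillis, amount): keep the latest timestamp and add the amounts.
    static Tuple2<Long, Integer> merge(Tuple2<Long, Integer> a, Tuple2<Long, Integer> b) {
        return Tuple2.of(Math.max(a.f0, b.f0), a.f1 + b.f1);
    }

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        DataStream<Tuple2<Long, Integer>> payments = env.fromData(
            Tuple2.of(1_000L, 10),
            Tuple2.of(2_000L, 5),
            Tuple2.of(7_000L, 1));

        payments
            .assignTimestampsAndWatermarks(
                WatermarkStrategy
                    .<Tuple2<Long, Integer>>forMonotonousTimestamps()
                    .withTimestampAssigner((payment, previous) -> payment.f0))
            // No keyBy: every record is routed to one task, regardless of job parallelism.
            .windowAll(TumblingEventTimeWindows.of(Duration.ofSeconds(5)))
            .reduce((a, b) -> merge(a, b))
            .print();

        env.execute("window-all-sketch");
    }
}
```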
Window functions
After defining a window, you apply a function to its contents, such as reduce, aggregate, or process.
Stream merging and joining
union
Merges two or more streams of the same type into one. Elements from all streams interleave.
connect
Connects two streams of potentially different types. The two inputs can share state, but each is processed by its own method: map1/map2 (or flatMap1/flatMap2).
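The difference between union and connect is easiest to see side by side. The sketch below assumes Flink 1.19 or later; the class names and sample data are hypothetical.

```java
import org.apache.flink.streaming.api.datastream.ConnectedStreams;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.co.CoMapFunction;

class MergeStreamsSketch {

    /** Handles each input of a connected stream with its own method. */
    static class Describe implements CoMapFunction<Integer, String, String> {
        @Override
        public String map1(Integer number) {
            return "number: " + number;   // called for elements of the first input
        }

        @Override
        public String map2(String word) {
            return "word: " + word;       // called for elements of the second input
        }
    }

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        DataStream<Integer> first = env.fromData(1, 2, 3);
        DataStream<Integer> second = env.fromData(10, 20);
        DataStream<String> words = env.fromData("a", "b");

        // union: every input has the same element type; the result is one merged stream.
        DataStream<Integer> allNumbers = first.union(second);

        // connect: element types may differ; both inputs stay distinguishable.
        ConnectedStreams<Integer, String> connected = allNumbers.connect(words);
        connected.map(new Describe()).print();

        env.execute("merge-streams-sketch");
    }
}
```

The relative order of elements coming from different inputs is not guaranteed in either case.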
Window join
Joins two streams within a time window on a common key.
Interval join
Joins elements of two keyed streams that share a key and whose timestamps fall within a relative time range of each other.
ProcessFunction
ProcessFunction gives you access to low-level runtime features: element timestamps, timers (event-time and processing-time), and side outputs. It is the most powerful but most verbose operator.
Apply it to a keyed stream to access keyed state and register timers:
ProcessFunctionExample.java
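The following is a sketch rather than the original listing. It uses `KeyedProcessFunction`, the keyed variant of ProcessFunction, and assumes Flink 1.19 or later (for `open(OpenContext)`) and that event-time timestamps were assigned upstream. The `(userId, action)` input, the 60-second timeout, and all names are hypothetical.

```java
import org.apache.flink.api.common.functions.OpenContext;
import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
import org.apache.flink.util.Collector;

class ProcessFunctionExample {

    static final long TIMEOUT_MS = 60_000L;

    // Event time at which a key counts as idle after its last event.
    static long timeoutAt(long eventTimestamp) {
        return eventTimestamp + TIMEOUT_MS;
    }

    /** Counts events per key and emits (key, count) once the key has been idle for TIMEOUT_MS. */
    static class CountWithTimeout
            extends KeyedProcessFunction<String, Tuple2<String, String>, Tuple2<String, Long>> {

        private transient ValueState<Long> count;
        private transient ValueState<Long> lastSeen;

        @Override
        public void open(OpenContext openContext) throws Exception {
            // Keyed state: Flink scopes these values to the key of the current element.
            count = getRuntimeContext().getState(new ValueStateDescriptor<>("count", Long.class));
            lastSeen = getRuntimeContext().getState(new ValueStateDescriptor<>("lastSeen", Long.class));
        }

        @Override
        public void processElement(Tuple2<String, String> event, Context ctx,
                                   Collector<Tuple2<String, Long>> out) throws Exception {
            Long current = count.value();
            count.update(current == null ? 1L : current + 1);

            // Assumes a timestamp is present; ctx.timestamp() is null if none was assigned.
            long timestamp = ctx.timestamp();
            lastSeen.update(timestamp);
            ctx.timerService().registerEventTimeTimer(timeoutAt(timestamp));
        }

        @Override
        public void onTimer(long timestamp, OnTimerContext ctx,
                            Collector<Tuple2<String, Long>> out) throws Exception {
            Long last = lastSeen.value();
            // Act only on the timer that belongs to the most recent event of this key.
            if (last != null && timestamp == timeoutAt(last)) {
                out.collect(Tuple2.of(ctx.getCurrentKey(), count.value()));
                count.clear();
                lastSeen.clear();
            }
        }
    }

    // Usage: key the stream first, then apply the function.
    static DataStream<Tuple2<String, Long>> countPerUser(DataStream<Tuple2<String, String>> events) {
        return events.keyBy(event -> event.f0).process(new CountWithTimeout());
    }
}
```

Older timers are not deleted here; they still fire but are ignored by the check in `onTimer`, which keeps the sketch short at the cost of some extra timer state.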
Physical partitioning
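A minimal sketch of the partitioning calls available on DataStream, assuming Flink 1.19 or later. The `FirstCharPartitioner` class and the sample data are hypothetical. Note that every call returns a stream of the same element type as its input.

```java
import org.apache.flink.api.common.functions.Partitioner;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

class PartitioningSketch {

    /** Hypothetical custom partitioner: routes by the first character of the key. */
    static class FirstCharPartitioner implements Partitioner<String> {
        @Override
        public int partition(String key, int numPartitions) {
            return key.isEmpty() ? 0 : key.charAt(0) % numPartitions;
        }
    }

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        DataStream<String> words = env.fromData("apple", "banana", "cherry");

        DataStream<String> evenly = words.rebalance();      // round-robin over all downstream tasks
        DataStream<String> random = words.shuffle();        // uniformly random downstream task
        DataStream<String> nearby = words.rescale();        // round-robin over a subset of downstream tasks
        DataStream<String> everywhere = words.broadcast();  // every element to every downstream task
        DataStream<String> custom =
            words.partitionCustom(new FirstCharPartitioner(), word -> word);

        // Only streams with a sink become part of the job; the others are shown for reference.
        custom.print();
        env.execute("partitioning-sketch");
    }
}
```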
These operators control how data is distributed across parallel tasks without changing the stream’s type.
Operator chaining
Flink automatically chains operators that have a 1-to-1 connection pattern (e.g., consecutive map calls). Chained operators run in the same thread and pass records directly without network serialization.
To control chaining manually:
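The sketch below shows the chaining controls exposed on the environment and on individual operators, assuming Flink 1.19 or later. The pipeline itself and the `normalize` helper are hypothetical.

```java
import java.util.Locale;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

class ChainingControlSketch {

    // Hypothetical helper used by the first map operator.
    static String normalize(String raw) {
        return raw.trim().toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Job-wide switch: uncomment to disable chaining for every operator (useful when debugging).
        // env.disableOperatorChaining();

        env.fromData(" Alpha", "Beta ", "")
            .map(raw -> normalize(raw))
            .filter(word -> !word.isEmpty())
            // The filter starts a new chain: it is not chained to the map before it,
            // but later operators may still chain to it.
            .startNewChain()
            .map(word -> word + "!")
            // This map is never chained to its neighbours in either direction.
            .disableChaining()
            .print();

        env.execute("chaining-control-sketch");
    }
}
```

Breaking a chain adds a thread hand-over and record serialization between the operators, so it is usually reserved for isolating heavy operators or for debugging.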

