Apache Flink is an open-source, distributed stream processing framework built for high-throughput, low-latency data pipelines. It provides unified stream and batch processing on a single runtime, with exactly-once state consistency guarantees and native support for event-time semantics.

Local Installation

Download and run Flink locally in minutes

DataStream Quickstart

Build your first streaming pipeline with the DataStream API

Table API & SQL Quickstart

Query streams and tables with SQL and the Table API

Core Concepts

Understand Flink’s architecture and programming model

Unified Processing

Single runtime for both streaming and batch workloads — no separate systems to manage.

Exactly-Once Guarantees

Built-in fault tolerance with checkpointing ensures exactly-once state consistency even after failures.
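As an illustration, checkpointing can be enabled cluster-wide in the configuration file. This is a minimal sketch: the interval, backend, and bucket path are illustrative values, not recommendations.

```yaml
# flink-conf.yaml — illustrative values only
execution.checkpointing.interval: 10s
execution.checkpointing.mode: EXACTLY_ONCE
state.backend: rocksdb
# Hypothetical bucket; point this at your own durable storage
state.checkpoints.dir: s3://my-bucket/flink-checkpoints
```

The same options can also be set programmatically per job via `StreamExecutionEnvironment#enableCheckpointing`.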

Event-Time Processing

Native support for event-time semantics, using watermarks to handle out-of-order and late-arriving data.

Stateful Computations

Rich state primitives (ValueState, ListState, MapState) backed by pluggable state backends including RocksDB.

High Throughput & Low Latency

Millions of events per second with millisecond latency — designed for demanding production workloads.

SQL & Table API

Declarative SQL and Table API for streaming and batch queries, with standard SQL support built on Apache Calcite.
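As a sketch of what this looks like, the table and query below are hypothetical (the `orders` topic, its fields, and the connector options are assumptions): a watermarked Kafka stream aggregated per minute of event time with a tumbling-window table-valued function.

```sql
-- Hypothetical source table; connector options are illustrative
CREATE TABLE orders (
    order_id   STRING,
    amount     DECIMAL(10, 2),
    order_time TIMESTAMP(3),
    WATERMARK FOR order_time AS order_time - INTERVAL '5' SECOND
) WITH (
    'connector' = 'kafka',
    'topic'     = 'orders',
    'properties.bootstrap.servers' = 'localhost:9092',
    'format'    = 'json'
);

-- Per-minute revenue computed over event time
SELECT
    window_start,
    SUM(amount) AS revenue
FROM TABLE(
    TUMBLE(TABLE orders, DESCRIPTOR(order_time), INTERVAL '1' MINUTES))
GROUP BY window_start, window_end;
```

The `WATERMARK` clause in the DDL is what ties the query to event time: results are emitted once the watermark passes the end of each window, regardless of arrival order.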

Choose your API

Flink provides multiple levels of abstraction to suit different use cases:
The DataStream API is Flink’s core API for building complex streaming and batch data pipelines in Java or Scala. It gives you full control over state, time, and fault tolerance.
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.util.Collector;

StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

// Read lines from a local socket (e.g. started with `nc -lk 9999`)
DataStream<String> text = env.socketTextStream("localhost", 9999);

DataStream<Tuple2<String, Integer>> wordCounts = text
    .flatMap((String line, Collector<Tuple2<String, Integer>> out) -> {
        for (String word : line.split("\\s+")) {
            if (!word.isEmpty()) {
                out.collect(Tuple2.of(word, 1));
            }
        }
    })
    // Java lambdas erase generic types, so declare the result type explicitly
    .returns(Types.TUPLE(Types.STRING, Types.INT))
    .keyBy(t -> t.f0)
    .sum(1);

wordCounts.print();
env.execute("Word Count");

DataStream API Overview

Get started with the DataStream API

Deployment options

Flink runs on a variety of cluster environments:

Standalone

Deploy on any cluster without a resource manager

Kubernetes

Native Kubernetes integration, or managed deployments via the Flink Kubernetes Operator

YARN

Run Flink jobs on Apache Hadoop YARN clusters

Key resources

Configuration reference

All configuration options for Flink clusters and jobs

Checkpoints & savepoints

Fault tolerance and operational state management

Metrics & monitoring

Monitor cluster health and job performance

Connectors

Connect Flink to Kafka, filesystems, JDBC, and more
