Skip to main content
Triggers determine when Beam emits results for a window. They control the balance between latency (how quickly you get results) and completeness (how much data is included).

What are Triggers?

Triggers answer the question: “When should we emit results for this window?” From the Apache Beam examples:
/**
 * This example illustrates the basic concepts behind triggering. It shows how to use 
 * different trigger definitions to produce partial (speculative) results before all 
 * the data is processed and to control when updated results are produced for late data.
 *
 * Concepts:
 * 1. The default triggering behavior
 * 2. Late data with the default trigger
 * 3. How to get speculative estimates
 * 4. Combining late data and speculative estimates
 */

Why Triggers?

Triggers enable you to:
  • Get early results before all data arrives (speculative results)
  • Handle late data that arrives after the window closes
  • Balance latency and accuracy based on your requirements
  • Refine results as more data becomes available
Without triggers, windows only emit results once when the watermark passes the end of the window. Triggers give you fine-grained control over this timing.

Default Trigger Behavior

By default, Beam uses the following trigger:
AfterWatermark.pastEndOfWindow()
This means:
  • Results are emitted once when the watermark passes the end of the window
  • Late data is discarded (unless you configure allowed lateness)
  • No early (speculative) results
// Default trigger (implicit)
PCollection<String> windowed = input.apply(
    Window.<String>into(FixedWindows.of(Duration.standardMinutes(5))));
// Emits results once per window when watermark passes

// Equivalent explicit form
PCollection<String> windowed = input.apply(
    Window.<String>into(FixedWindows.of(Duration.standardMinutes(5)))
        .triggering(AfterWatermark.pastEndOfWindow())
        .discardingFiredPanes());

Built-in Trigger Types

Event Time Triggers

Fire based on the watermark (event time progress):

AfterWatermark

Fire when the watermark passes the end of the window:
import org.apache.beam.sdk.transforms.windowing.AfterWatermark;
import org.apache.beam.sdk.transforms.windowing.Window;

PCollection<String> results = input.apply(
    Window.<String>into(FixedWindows.of(Duration.standardMinutes(5)))
        .triggering(AfterWatermark.pastEndOfWindow())
        .discardingFiredPanes());

Processing Time Triggers

Fire based on processing time (wall clock time):

AfterProcessingTime

Fire after a certain amount of processing time has elapsed:
import org.apache.beam.sdk.transforms.windowing.AfterProcessingTime;
import org.apache.beam.sdk.transforms.windowing.Repeatedly;

// Fire every minute of processing time
PCollection<String> results = input.apply(
    Window.<String>into(FixedWindows.of(Duration.standardMinutes(5)))
        .triggering(
            Repeatedly.forever(
                AfterProcessingTime.pastFirstElementInPane()
                    .plusDelayOf(Duration.standardMinutes(1))))
        .discardingFiredPanes());

Count-based Triggers

Fire after a certain number of elements:

AfterCount

import org.apache.beam.sdk.transforms.windowing.AfterPane;

// Fire after every 100 elements
PCollection<String> results = input.apply(
    Window.<String>into(FixedWindows.of(Duration.standardMinutes(5)))
        .triggering(
            Repeatedly.forever(
                AfterPane.elementCountAtLeast(100)))
        .discardingFiredPanes());

Composite Triggers

Combine multiple triggers with logical operations:

AfterEach (Sequence)

Fire triggers in sequence:
import org.apache.beam.sdk.transforms.windowing.AfterEach;

// Fire after 100 elements, then after watermark
PCollection<String> results = input.apply(
    Window.<String>into(FixedWindows.of(Duration.standardMinutes(5)))
        .triggering(
            AfterEach.inOrder(
                AfterPane.elementCountAtLeast(100),
                AfterWatermark.pastEndOfWindow()))
        .discardingFiredPanes());

AfterAll

Fire when all sub-triggers have fired:
import org.apache.beam.sdk.transforms.windowing.AfterAll;

// Fire when BOTH conditions are met
PCollection<String> results = input.apply(
    Window.<String>into(FixedWindows.of(Duration.standardMinutes(5)))
        .triggering(
            AfterAll.of(
                AfterPane.elementCountAtLeast(100),
                AfterProcessingTime.pastFirstElementInPane()
                    .plusDelayOf(Duration.standardMinutes(1))))
        .discardingFiredPanes());

AfterAny

Fire when any sub-trigger fires:
import org.apache.beam.sdk.transforms.windowing.AfterFirst;

// Fire when EITHER condition is met
PCollection<String> results = input.apply(
    Window.<String>into(FixedWindows.of(Duration.standardMinutes(5)))
        .triggering(
            AfterFirst.of(
                AfterPane.elementCountAtLeast(1000),
                AfterProcessingTime.pastFirstElementInPane()
                    .plusDelayOf(Duration.standardMinutes(1))))
        .discardingFiredPanes());

Early and Late Firings

The most common pattern combines early (speculative) results with on-time and late results:
import org.apache.beam.sdk.transforms.windowing.*;

// Early results every minute, final result at watermark, late data every 5 minutes
PCollection<KV<String, Integer>> results = input.apply(
    Window.<KV<String, Integer>>into(
            FixedWindows.of(Duration.standardMinutes(30)))
        .triggering(
            AfterWatermark.pastEndOfWindow()
                // Speculative early results
                .withEarlyFirings(
                    AfterProcessingTime.pastFirstElementInPane()
                        .plusDelayOf(Duration.standardMinutes(1)))
                // Handle late data
                .withLateFirings(
                    AfterProcessingTime.pastFirstElementInPane()
                        .plusDelayOf(Duration.standardMinutes(5))))
        .withAllowedLateness(Duration.standardDays(1))
        .accumulatingFiredPanes());
This trigger:
  1. Early: Fires every minute with partial results
  2. On-time: Fires when watermark passes end of window
  3. Late: Fires every 5 minutes for late-arriving data

Accumulation Modes

Control how results accumulate across trigger firings:

Discarding Mode

Each pane contains only new data since the last firing:
.discardingFiredPanes()
Window [0:00-0:05]
  Firing 1 (early):    elements [A, B, C]      → output: [A, B, C]
  Firing 2 (on-time):  elements [D, E]         → output: [D, E]
  Firing 3 (late):     elements [F]            → output: [F]
Use discarding mode when you want to avoid duplicate processing and can handle incremental updates.

Accumulating Mode

Each pane contains all data seen so far:
.accumulatingFiredPanes()
Window [0:00-0:05]
  Firing 1 (early):    elements [A, B, C]      → output: [A, B, C]
  Firing 2 (on-time):  elements [D, E]         → output: [A, B, C, D, E]
  Firing 3 (late):     elements [F]            → output: [A, B, C, D, E, F]
Use accumulating mode when downstream systems expect complete results and can handle retractions/updates.

Complete Example: Real-time Dashboard

From the Apache Beam examples:
public class RealtimeDashboard {
    
    public static void main(String[] args) {
        Pipeline p = Pipeline.create(options);
        
        PCollection<KV<String, Integer>> scores = p
            .apply("ReadScores", 
                PubsubIO.readStrings().fromTopic("game-scores"))
            .apply("ParseScores", 
                ParDo.of(new ParseEventFn()));
        
        // Get early updates every 30 seconds, final at watermark
        PCollection<KV<String, Integer>> teamScores = scores.apply(
            Window.<KV<String, Integer>>into(
                    FixedWindows.of(Duration.standardMinutes(5)))
                .triggering(
                    AfterWatermark.pastEndOfWindow()
                        .withEarlyFirings(
                            AfterProcessingTime.pastFirstElementInPane()
                                .plusDelayOf(Duration.standardSeconds(30))))
                .withAllowedLateness(Duration.standardMinutes(10))
                .accumulatingFiredPanes())
            .apply("SumScores", Sum.integersPerKey());
        
        // Write to dashboard
        teamScores
            .apply("FormatResults", ParDo.of(new FormatFn()))
            .apply("WriteToDashboard", 
                PubsubIO.writeStrings().to("dashboard-updates"));
        
        p.run();
    }
}
This example:
  • Early firings: Updates dashboard every 30 seconds with partial results
  • On-time firing: Emits final result when window closes
  • Late firings: Updates if late data arrives (within 10-minute lateness)
  • Accumulating: Each update contains complete results so far

Pane Information

Access metadata about trigger firings:
class ProcessPaneFn extends DoFn<KV<String, Integer>, String> {
    @ProcessElement
    public void processElement(
            @Element KV<String, Integer> element,
            BoundedWindow window,
            PaneInfo pane,
            OutputReceiver<String> out) {
        
        String timing = pane.getTiming().toString(); // EARLY, ON_TIME, LATE
        boolean isFirst = pane.isFirst();
        boolean isLast = pane.isLast();
        long index = pane.getIndex();
        
        String result = String.format(
            "Key: %s, Value: %d, Timing: %s, IsFirst: %s, IsLast: %s",
            element.getKey(), element.getValue(), timing, isFirst, isLast);
        
        out.output(result);
    }
}

Best Practices

Start simple: Begin with the default trigger. Add complexity only when you need early results or late data handling.
Cost vs. Latency: More frequent firings mean lower latency but higher processing costs. Balance based on your requirements.
Monitor pane counts: High pane counts may indicate issues with your trigger configuration or data arrival patterns.

Common Patterns

  1. Real-time dashboard: Early firings every 30-60 seconds
  2. Hourly reports: Watermark trigger with 1-hour allowed lateness
  3. Session analysis: Default trigger with session windows
  4. High-throughput counting: Process every N elements or M seconds
  5. Critical updates: Watermark trigger with minimal latency tolerance

When to Use Which Trigger

Use CaseRecommended Trigger
Batch processingDefault (AfterWatermark)
Real-time dashboardAfterWatermark with early firings
Low-latency alertsAfterProcessingTime
Exactly-once processingDefault with no early/late firings
Approximate results OKFrequent AfterProcessingTime
Late data importantAfterWatermark with late firings

Debugging Triggers

Log pane information to understand trigger behavior:
class LogPaneInfoFn<T> extends DoFn<T, T> {
    private static final Logger LOG = LoggerFactory.getLogger(LogPaneInfoFn.class);
    
    @ProcessElement
    public void processElement(
            @Element T element,
            BoundedWindow window,
            PaneInfo pane,
            OutputReceiver<T> out) {
        
        LOG.info("Element: {}, Window: {}, Pane: {}, Timing: {}",
            element, window, pane.getIndex(), pane.getTiming());
        
        out.output(element);
    }
}

Next Steps

Windowing

Learn about window functions

Transforms

Apply transformations to your data

Build docs developers (and LLMs) love