Pulsar Functions Overview

Pulsar Functions is a lightweight compute framework built into Apache Pulsar that enables serverless stream processing. Functions allow you to process messages as they flow through Pulsar topics without managing separate processing clusters.

What are Pulsar Functions?

Pulsar Functions are simple, lightweight compute processes that:

Consume messages from one or more Pulsar topics
Apply user-defined processing logic to each message
Optionally publish results to another topic

Functions abstract away the complexities of message consumption, processing guarantees, and state management, allowing you to focus on business logic.

Key Features

Serverless Execution

Deploy functions without managing infrastructure. Pulsar handles scaling, placement, and resource management automatically.

Multi-Language Support

Write functions in your preferred language:

Java
Python
Go

Built-in State Management

Functions provide simple APIs for stateful processing:

// Increment a counter
context.incrCounter("word-count", 1);

// Get counter value
long count = context.getCounter("word-count");

// Store arbitrary state
context.putState("key", ByteBuffer.wrap(value));

Processing Guarantees

Configure delivery semantics based on your requirements:

At-most-once: Message processed zero or one time (fastest)
At-least-once: Message processed one or more times (default)
Effectively-once: Message effects applied exactly once (requires state)

Use Cases

Stream Filtering

Filter messages based on content or metadata before routing to downstream consumers.

Data Transformation

Transform message formats, enrich data, or normalize schemas.

Routing and Aggregation

Route messages to different topics based on content, or aggregate data from multiple sources.

Event-Driven Workflows

Trigger actions based on specific message patterns or thresholds.

Simple Function Example

Here’s a basic function that appends an exclamation mark to incoming strings:

package org.apache.pulsar.functions.api.examples;

import org.apache.pulsar.functions.api.Context;
import org.apache.pulsar.functions.api.Function;

public class ExclamationFunction implements Function<String, String> {
    @Override
    public String process(String input, Context context) {
        return String.format("%s!", input);
    }
}

Architecture

Pulsar Functions run within the Pulsar cluster and integrate directly with the broker and BookKeeper storage layer:

Function Worker: Coordinates function deployment and lifecycle
Function Runtime: Executes function code (process, thread, or container)
State Storage: Persistent state backed by BookKeeper
Metrics: Automatic collection of processing metrics

Function Lifecycle

Functions support initialization and cleanup hooks:

public class MyFunction implements Function<String, String> {
    private SomeResource resource;
    
    @Override
    public void initialize(Context context) throws Exception {
        // Called once when function instance starts
        resource = new SomeResource();
    }
    
    @Override
    public String process(String input, Context context) {
        // Called for each message
        return resource.transform(input);
    }
    
    @Override
    public void close() throws Exception {
        // Called once when function instance stops
        resource.cleanup();
    }
}

Next Steps

Developing Functions

Learn how to write and test Pulsar Functions

Deploying Functions

Deploy and manage functions in production

Runtime Configuration

Configure function runtime and resources

CLI Reference

Manage functions with pulsar-admin

Getting Started

Core Concepts

Client Libraries

Pulsar Functions

Pulsar IO

Deployment

Operations

Security

Pulsar Functions Overview

What are Pulsar Functions?

Key Features

Serverless Execution

Multi-Language Support

Built-in State Management

Processing Guarantees

Use Cases

Stream Filtering

Data Transformation

Routing and Aggregation

Event-Driven Workflows

Simple Function Example

Architecture

Function Lifecycle

Next Steps

Developing Functions

Deploying Functions

Runtime Configuration

CLI Reference

Build docs developers (and LLMs) love

Getting Started

Core Concepts

Client Libraries

Pulsar Functions

Pulsar IO

Deployment

Operations

Security

​What are Pulsar Functions?

​Key Features

​Serverless Execution

​Multi-Language Support

​Built-in State Management

​Processing Guarantees

​Use Cases

​Stream Filtering

​Data Transformation

​Routing and Aggregation

​Event-Driven Workflows

​Simple Function Example

​Architecture

​Function Lifecycle

​Next Steps

Developing Functions

Deploying Functions

Runtime Configuration

CLI Reference

Build docs developers (and LLMs) love

What are Pulsar Functions?

Key Features

Serverless Execution

Multi-Language Support

Built-in State Management

Processing Guarantees

Use Cases

Stream Filtering

Data Transformation

Routing and Aggregation

Event-Driven Workflows

Simple Function Example

Architecture

Function Lifecycle

Next Steps