What are Pulsar Functions?
Pulsar Functions are simple, lightweight compute processes that:
- Consume messages from one or more Pulsar topics
- Apply user-defined processing logic to each message
- Optionally publish results to another topic
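The three steps above can be sketched with the language-native Python style, where a module-level `process` function is the whole function. This is a minimal sketch rather than a complete deployment: it borrows the exclamation-mark behavior described under Simple Function Example, and it assumes the input and output topics are supplied when the function is deployed, not in the code.

```python
def process(input):
    # Called once for each message consumed from the input topic(s).
    # The return value is what gets published to the output topic,
    # assuming one was configured at deployment time.
    return f"{input}!"
```

Because the function is plain Python with no Pulsar imports, it can be unit-tested by calling `process("hello")` directly.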
Key Features
Serverless Execution
Deploy functions without managing infrastructure. Pulsar handles scaling, placement, and resource management automatically.
Multi-Language Support
Write functions in your preferred language:
- Java
- Python
- Go
Built-in State Management
Functions provide simple APIs for stateful processing:
Processing Guarantees
Configure delivery semantics based on your requirements:
- At-most-once: Message processed zero or one time (fastest)
- At-least-once: Message processed one or more times (default)
- Effectively-once: Message effects applied exactly once (requires state)
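The difference between the three guarantees comes down to when a message is acknowledged relative to when it is processed. The toy model below illustrates that ordering with a single simulated crash. It is not Pulsar code and does not show how Pulsar implements these guarantees; `run`, `crash_on`, and the in-memory `applied` set are hypothetical names introduced only for this illustration.

```python
from collections import Counter


def run(messages, mode, crash_on=None):
    """Return how many times each message's effect was applied.

    crash_on names a message whose first delivery is interrupted
    between the acknowledge step and the process step.
    """
    effects = Counter()     # number of times each message took effect
    applied = set()         # ids recorded in state (effectively-once only)
    queue = list(messages)  # unacknowledged messages, redelivered until acked
    crashed = False
    while queue:
        msg = queue[0]
        crash_now = msg == crash_on and not crashed
        if mode == "at-most-once":
            queue.pop(0)            # acknowledge first
            if crash_now:
                crashed = True
                continue            # crash before processing: message is lost
            effects[msg] += 1
        else:
            duplicate = mode == "effectively-once" and msg in applied
            if not duplicate:
                effects[msg] += 1   # process first
                applied.add(msg)    # remember the id alongside the effect
            if crash_now:
                crashed = True
                continue            # crash before ack: message is redelivered
            queue.pop(0)            # acknowledge last
    return effects
```

With `crash_on="b"`, message `b` takes effect zero times under at-most-once, twice under at-least-once, and once under effectively-once, which is why the last mode depends on state.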
Use Cases
Stream Filtering
Filter messages based on content or metadata before routing to downstream consumers.
Data Transformation
Transform message formats, enrich data, or normalize schemas.
Routing and Aggregation
Route messages to different topics based on content, or aggregate data from multiple sources.
Event-Driven Workflows
Trigger actions based on specific message patterns or thresholds.
Simple Function Example
Here’s a basic function that appends an exclamation mark to incoming strings:
Architecture
Pulsar Functions run within the Pulsar cluster and integrate directly with the broker and BookKeeper storage layer:
- Function Worker: Coordinates function deployment and lifecycle
- Function Runtime: Executes function code (process, thread, or container)
- State Storage: Persistent state backed by BookKeeper
- Metrics: Automatic collection of processing metrics
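To make the State Storage component more concrete, here is a sketch of a stateful word-count function written against the Python SDK style, where a class extends `pulsar.Function` and receives a context object. It assumes the context exposes `incr_counter` and `get_counter`. `FakeContext` is a hypothetical in-memory stand-in written for this example so the logic can be exercised locally, and the fallback `Function` class exists only so the sketch runs where the Pulsar client library is not installed.

```python
try:
    from pulsar import Function  # provided by the Pulsar Python client package
except ImportError:
    class Function:  # stand-in base class so the sketch runs without Pulsar
        pass


class WordCount(Function):
    """Keeps a running count of every word seen across all messages."""

    def process(self, input, context):
        for word in input.split():
            # In a deployed function the runtime persists this counter.
            context.incr_counter(word, 1)
        return None  # this function only updates state


class FakeContext:
    """Hypothetical in-memory substitute for the runtime context (local tests only)."""

    def __init__(self):
        self.counters = {}

    def incr_counter(self, key, amount):
        self.counters[key] = self.counters.get(key, 0) + amount

    def get_counter(self, key):
        return self.counters.get(key, 0)
```

Locally, `WordCount().process("to be or not to be", FakeContext())` updates the fake counters. In a deployed function the runtime would supply the real context, with state backed by BookKeeper as listed above.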
Function Lifecycle
Functions support initialization and cleanup hooks:
Next Steps
Developing Functions
Learn how to write and test Pulsar Functions
Deploying Functions
Deploy and manage functions in production
Runtime Configuration
Configure function runtime and resources
CLI Reference
Manage functions with pulsar-admin