Pulsar IO Overview

What is Pulsar IO?

Pulsar IO is a framework for building connectors that move data between Apache Pulsar and external systems. It provides a simple interface for integrating Pulsar with databases, messaging systems, and other data sources and sinks.

Connector Types

Pulsar IO supports two types of connectors:

Source Connectors

Source connectors read data from external systems and write it to Pulsar topics. They implement the Source interface and continuously pull or receive data from external systems. Key characteristics:

Implement org.apache.pulsar.io.core.Source<T> interface
Read data from external systems
Push messages into Pulsar topics
Support configurable polling and batching

Sink Connectors

Sink connectors read data from Pulsar topics and write it to external systems. They implement the Sink interface and process messages as they arrive. Key characteristics:

Implement org.apache.pulsar.io.core.Sink<T> interface
Read messages from Pulsar topics
Write data to external systems
Support delivery guarantees and error handling

Core Concepts

Connector Lifecycle

Open: Initialize the connector with configuration
Process: Read or write data (depending on connector type)
Close: Clean up resources when shutting down

Configuration

Connectors are configured using YAML files or configuration maps. Each connector has a specific configuration class that defines required and optional parameters.

tenant: public
namespace: default
name: my-connector
inputs: [input-topic]
archive: connectors/my-connector.nar
configs:
  # connector-specific configuration

Runtime Modes

Pulsar IO connectors can run in different modes:

Standalone: Run as part of the Pulsar Functions runtime
Cluster: Deploy across multiple Pulsar brokers for high availability
Kubernetes: Run as containerized workloads

Architecture

Source Connector Flow

External System → Source Connector → Pulsar Topic → Consumer Applications

Sink Connector Flow

Producer Applications → Pulsar Topic → Sink Connector → External System

Key Features

Built-in Connectors

Pulsar includes a rich set of built-in connectors for popular systems. See the Connectors page for a complete list.

Processing Guarantees

At-most-once: Messages may be lost but never redelivered
At-least-once: Messages are never lost but may be redelivered
Effectively-once: Each message is processed exactly once

Schema Support

Connectors support Pulsar’s schema registry, allowing automatic schema evolution and type safety.

Error Handling

Built-in support for:

Dead letter queues for failed messages
Retry policies with exponential backoff
Custom error handlers

Monitoring

Connectors expose metrics for:

Message throughput
Processing latency
Error rates
Connection health

Managing Connectors

Using pulsar-admin CLI

# Create a source connector
pulsar-admin sources create \
  --source-config-file kafka-source-config.yaml

# Create a sink connector
pulsar-admin sinks create \
  --sink-config-file cassandra-sink-config.yaml

# List running connectors
pulsar-admin sources list
pulsar-admin sinks list

# Get connector status
pulsar-admin sources status --name my-source

# Stop a connector
pulsar-admin sources stop --name my-source

# Delete a connector
pulsar-admin sources delete --name my-source

When to Use Pulsar IO

Pulsar IO is ideal when you need to:

Ingest data from external systems into Pulsar
Export Pulsar data to external systems
Bridge between Pulsar and legacy systems
Implement change data capture (CDC) pipelines
Build data integration workflows

Next Steps

Explore available connectors
Learn about developing custom connectors
Review connector-specific configuration guides

Getting Started

Core Concepts

Client Libraries

Pulsar Functions

Pulsar IO

Deployment

Operations

Security

What is Pulsar IO?

Connector Types

Source Connectors

Sink Connectors

Core Concepts

Connector Lifecycle

Configuration

Runtime Modes

Architecture

Source Connector Flow

Sink Connector Flow

Key Features

Built-in Connectors

Processing Guarantees

Schema Support

Error Handling

Monitoring

Managing Connectors

Using pulsar-admin CLI

When to Use Pulsar IO

Next Steps

Build docs developers (and LLMs) love

Getting Started

Core Concepts

Client Libraries

Pulsar Functions

Pulsar IO

Deployment

Operations

Security

​What is Pulsar IO?

​Connector Types

​Source Connectors

​Sink Connectors

​Core Concepts

​Connector Lifecycle

​Configuration

​Runtime Modes

​Architecture

​Source Connector Flow

​Sink Connector Flow

​Key Features

​Built-in Connectors

​Processing Guarantees

​Schema Support

​Error Handling

​Monitoring

​Managing Connectors

​Using pulsar-admin CLI

​When to Use Pulsar IO

​Next Steps

Build docs developers (and LLMs) love

What is Pulsar IO?

Connector Types

Source Connectors

Sink Connectors

Core Concepts

Connector Lifecycle

Configuration

Runtime Modes

Architecture

Source Connector Flow

Sink Connector Flow

Key Features

Built-in Connectors

Processing Guarantees

Schema Support

Error Handling

Monitoring

Managing Connectors

Using pulsar-admin CLI

When to Use Pulsar IO

Next Steps