Deploy Apache Pulsar using Docker containers for development, testing, and lightweight production environments. This guide covers both standalone containers and multi-container deployments with Docker Compose.

Docker Images

Apache Pulsar provides two official Docker images:
  • apachepulsar/pulsar - Core Pulsar components (broker, bookie, ZooKeeper)
  • apachepulsar/pulsar-all - Includes connectors and offloaders (larger image)

Image Details

  • Base: Alpine Linux 3.23
  • JDK: Amazon Corretto 21
  • User: Non-root user (uid 10000)
  • Security: Runs as non-root by default (since 2.10.0)

Quick Start (Standalone)

1. Pull the Pulsar image

docker pull apachepulsar/pulsar:3.3.0
2. Start Pulsar standalone

docker run -it \
  -p 6650:6650 \
  -p 8080:8080 \
  --name pulsar-standalone \
  apachepulsar/pulsar:3.3.0 \
  bin/pulsar standalone
This starts Pulsar with:
  • Binary protocol on port 6650
  • HTTP admin API on port 8080
3. Verify the installation

In a new terminal:
# Check broker health
curl http://localhost:8080/admin/v2/brokers/health

# Or use docker exec
docker exec -it pulsar-standalone \
  bin/pulsar-admin brokers healthcheck
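Standalone startup is not instantaneous, so a health check issued right after docker run may fail even though nothing is wrong. The following is a minimal sketch of a polling helper, assuming curl is installed on the host. The function name wait_for_pulsar and its defaults (30 attempts, 2 seconds apart) are illustrative and not part of Pulsar:

```shell
# Poll the broker health endpoint until it answers or the attempts run out.
# Hypothetical helper: wait_for_pulsar [base_url] [max_attempts] [delay_seconds]
wait_for_pulsar() {
  url="${1:-http://localhost:8080}/admin/v2/brokers/health"
  attempts="${2:-30}"
  delay="${3:-2}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -f makes curl fail on HTTP error codes; -s hides the progress meter
    if curl -fs "$url" >/dev/null 2>&1; then
      echo "broker healthy after $i attempt(s)"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "broker not healthy after $attempts attempts" >&2
  return 1
}
```

Calling wait_for_pulsar with no arguments polls http://localhost:8080, which matches the port mapping used above.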
4. Test with messages

# Produce messages
docker exec -it pulsar-standalone \
  bin/pulsar-client produce my-topic \
  --messages "Hello Docker Pulsar"

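# Consume messages from the start of the topic
# (a new subscription starts at the latest position by default and
# would miss the message produced above; -n 0 keeps consuming until Ctrl+C)
docker exec -it pulsar-standalone \
  bin/pulsar-client consume my-topic \
  --subscription-name my-sub \
  --subscription-position Earliest -n 0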

Persistent Storage

Data written inside the container is lost when the container is removed or recreated. To keep it, mount a volume:
docker run -it \
  -p 6650:6650 \
  -p 8080:8080 \
  -v $PWD/data:/pulsar/data \
  --name pulsar-standalone \
  apachepulsar/pulsar:3.3.0 \
  bin/pulsar standalone
Data directories:
  • /pulsar/data - Message data and metadata
  • /pulsar/conf - Configuration files
  • /pulsar/logs - Log files

Docker Compose Deployment

Standalone with Docker Compose

Create docker-compose.yml:
version: '3.8'

services:
  pulsar:
    image: apachepulsar/pulsar:3.3.0
    container_name: pulsar-standalone
    hostname: pulsar
    ports:
      - "6650:6650"
      - "8080:8080"
    volumes:
      - pulsar-data:/pulsar/data
      - pulsar-conf:/pulsar/conf
    environment:
      - PULSAR_MEM=-Xms512m -Xmx512m -XX:MaxDirectMemorySize=512m
    command: bin/pulsar standalone
    healthcheck:
      test: ["CMD", "bin/pulsar-admin", "brokers", "healthcheck"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 40s

volumes:
  pulsar-data:
  pulsar-conf:
Start (with Compose V2; use docker-compose on older installations):
docker compose up -d

Multi-Container Cluster

For a production-like cluster with separate ZooKeeper, BookKeeper, and broker containers:
version: '3.8'

services:
  zookeeper:
    image: apachepulsar/pulsar:3.3.0
    container_name: zookeeper
    hostname: zookeeper
    ports:
      - "2181:2181"
    volumes:
      - zk-data:/pulsar/data/zookeeper
    environment:
      - PULSAR_MEM=-Xms256m -Xmx256m
    command: >
      bash -c "bin/apply-config-from-env.py conf/zookeeper.conf &&
               bin/generate-zookeeper-config.sh conf/zookeeper.conf &&
               bin/pulsar zookeeper"
    healthcheck:
      test: ["CMD", "bin/pulsar-zookeeper-ruok.sh"]
      interval: 10s
      timeout: 5s
      retries: 3

  bookie:
    image: apachepulsar/pulsar:3.3.0
    container_name: bookie
    hostname: bookie
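    depends_on:
      zookeeper:
        condition: service_healthy
      # Wait until cluster metadata has been initialized before starting
      init:
        condition: service_completed_successfully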
    ports:
      - "3181:3181"
      - "8000:8000"
    volumes:
      - bk-data:/pulsar/data/bookkeeper
    environment:
      - PULSAR_MEM=-Xms512m -Xmx512m -XX:MaxDirectMemorySize=512m
      - zkServers=zookeeper:2181
      - statsProviderClass=org.apache.bookkeeper.stats.prometheus.PrometheusMetricsProvider
    command: >
      bash -c "bin/apply-config-from-env.py conf/bookkeeper.conf &&
               bin/pulsar bookie"

  broker:
    image: apachepulsar/pulsar:3.3.0
    container_name: broker
    hostname: broker
    depends_on:
      zookeeper:
        condition: service_healthy
      bookie:
        condition: service_started
    ports:
      - "6650:6650"
      - "8080:8080"
    volumes:
      - broker-data:/pulsar/data
    environment:
      - PULSAR_MEM=-Xms512m -Xmx512m -XX:MaxDirectMemorySize=512m
      - metadataStoreUrl=zk:zookeeper:2181
      - configurationMetadataStoreUrl=zk:zookeeper:2181
      - clusterName=pulsar-cluster
      - managedLedgerDefaultEnsembleSize=1
      - managedLedgerDefaultWriteQuorum=1
      - managedLedgerDefaultAckQuorum=1
      - advertisedAddress=broker
      - advertisedListeners=external:pulsar://localhost:6650
    command: >
      bash -c "bin/apply-config-from-env.py conf/broker.conf &&
               bin/pulsar broker"
    healthcheck:
      test: ["CMD", "bin/pulsar-admin", "brokers", "healthcheck"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 40s

  init:
    image: apachepulsar/pulsar:3.3.0
    container_name: pulsar-init
    hostname: pulsar-init
    depends_on:
      zookeeper:
        condition: service_healthy
    command: >
      bin/pulsar initialize-cluster-metadata
      --cluster pulsar-cluster
      --metadata-store zk:zookeeper:2181
      --configuration-metadata-store zk:zookeeper:2181
      --web-service-url http://broker:8080
      --broker-service-url pulsar://broker:6650
    restart: on-failure

volumes:
  zk-data:
  bk-data:
  broker-data:

networks:
  default:
    name: pulsar-network
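Start the cluster:
docker compose up -d
Check service status. zookeeper and broker should report healthy; bookie defines no health check, and init exits once the metadata is written:
docker compose ps -a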
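Scripts that depend on the cluster usually need to block until the health checks pass rather than inspect the status by eye. A minimal sketch, assuming the container names from the Compose file above; the helper names container_health and wait_healthy are hypothetical:

```shell
# Print the Docker health status of a container: starting, healthy, or unhealthy.
container_health() {
  docker inspect --format '{{.State.Health.Status}}' "$1"
}

# Hypothetical helper: wait_healthy <container> [max_checks] [delay_seconds]
wait_healthy() {
  name="$1"
  attempts="${2:-30}"
  delay="${3:-5}"
  n=0
  while [ "$n" -lt "$attempts" ]; do
    status="$(container_health "$name" 2>/dev/null)"
    if [ "$status" = "healthy" ]; then
      return 0
    fi
    sleep "$delay"
    n=$((n + 1))
  done
  echo "$name is '$status' after $attempts checks" >&2
  return 1
}
```

For example, wait_healthy zookeeper && wait_healthy broker. This only works for services that define a healthcheck, so it does not apply to bookie or init as configured above.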

Environment Variables

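Configure Pulsar using environment variables. PULSAR_MEM and PULSAR_GC are picked up by the Pulsar launch scripts. Variables named after configuration keys take effect only when bin/apply-config-from-env.py writes them into the config file before startup:
docker run -it \
  -e PULSAR_MEM="-Xms1g -Xmx1g -XX:MaxDirectMemorySize=1g" \
  -e PULSAR_GC="-XX:+UseG1GC -XX:MaxGCPauseMillis=10" \
  -e clusterName=my-cluster \
  -e advertisedAddress=pulsar.example.com \
  -p 6650:6650 \
  -p 8080:8080 \
  apachepulsar/pulsar:3.3.0 \
  bash -c "bin/apply-config-from-env.py conf/standalone.conf && bin/pulsar standalone"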
Common environment variables:
  • PULSAR_MEM - JVM heap and direct memory
  • PULSAR_GC - Garbage collection settings
  • clusterName - Cluster name
  • advertisedAddress - Advertised hostname
  • metadataStoreUrl - Metadata store connection URL (for example, zk:zookeeper:2181)
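Variables named after configuration keys end up in a file under conf/ (the Compose examples above use bin/apply-config-from-env.py for this), so one way to confirm that a setting was applied is to read it back from the running container. A minimal sketch, assuming a container named pulsar-standalone; the helper name pulsar_conf_value is hypothetical:

```shell
# Hypothetical helper: print the value of one key from a config file
# inside a running container.
#   $1 = container name
#   $2 = config file path inside the container
#   $3 = configuration key
pulsar_conf_value() {
  # Match only an uncommented "key=value" line, then keep what follows "="
  docker exec "$1" grep -E "^$3=" "$2" | cut -d= -f2-
}
```

For example, pulsar_conf_value pulsar-standalone conf/standalone.conf clusterName prints the cluster name the container is actually running with.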

Building Custom Images

Create custom images with specific connectors or configurations:
ARG VERSION=3.3.0

# Load pulsar-all as builder
FROM apachepulsar/pulsar-all:${VERSION} AS pulsar-all

# Start from base image
FROM apachepulsar/pulsar:${VERSION}

# Add Cassandra connector
COPY --from=pulsar-all /pulsar/connectors/pulsar-io-cassandra-*.nar /pulsar/connectors/

# Add JCloud offloader
COPY --from=pulsar-all /pulsar/offloaders/tiered-storage-jcloud-*.nar /pulsar/offloaders/

# Add custom configuration
COPY custom-broker.conf /pulsar/conf/broker.conf
Build:
docker build --build-arg VERSION=3.3.0 -t pulsar-custom:3.3.0 .

Adding Debugging Tools

For troubleshooting, create an image with additional tools:
FROM apachepulsar/pulsar:3.3.0

# Switch to root to install tools
USER 0

# Install debugging utilities (the image is Alpine-based, so use apk;
# bind-tools provides dig and nslookup)
RUN apk add --no-cache vim net-tools unzip curl bind-tools

# Switch back to non-root user
USER 10000

Security Considerations

Non-Root User

Pulsar Docker images run as user 10000 (non-root) by default since version 2.10.0:
# Verify user
docker run --rm apachepulsar/pulsar:3.3.0 id
# Output: uid=10000 gid=0(root) groups=0(root)

Running on OpenShift

For OpenShift compatibility:
docker run --rm \
  --user 10000:10001 \
  -p 6650:6650 \
  -p 8080:8080 \
  apachepulsar/pulsar:3.3.0 \
  bin/pulsar standalone

Debugging as Root

If you need root access for debugging:
# Open a root shell in the running container
docker exec -it --user 0 pulsar-standalone bash
Or in Docker Compose:
services:
  pulsar:
    user: "0:0"

Networking

Host Network Mode

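Host network mode bypasses Docker's port mapping, which can improve performance. It is Linux only, and the container binds ports 6650 and 8080 directly on the host, so no -p flags are needed: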
docker run -it \
  --network host \
  apachepulsar/pulsar:3.3.0 \
  bin/pulsar standalone

Custom Network

# Create network
docker network create pulsar-net

# Run container
docker run -it \
  --network pulsar-net \
  --name pulsar \
  -p 6650:6650 \
  -p 8080:8080 \
  apachepulsar/pulsar:3.3.0 \
  bin/pulsar standalone

Resource Limits

Set CPU and memory limits:
docker run -it \
  --cpus="2" \
  --memory="4g" \
  --memory-swap="4g" \
  -p 6650:6650 \
  -p 8080:8080 \
  apachepulsar/pulsar:3.3.0 \
  bin/pulsar standalone
In Docker Compose:
services:
  pulsar:
    image: apachepulsar/pulsar:3.3.0
    deploy:
      resources:
        limits:
          cpus: '2'
          memory: 4G
        reservations:
          cpus: '1'
          memory: 2G
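The JVM heap and direct memory configured in PULSAR_MEM both count against the container memory limit, so their sum has to stay below it or the kernel kills the container. A minimal sketch that derives PULSAR_MEM from a limit given in MiB. The helper name and the 25% heap / 25% direct memory split are assumptions for illustration, not a Pulsar recommendation:

```shell
# Hypothetical helper: derive PULSAR_MEM from a container memory limit in MiB.
# Assumed split: 25% heap, 25% direct memory, the rest left for JVM overhead
# and the OS page cache. Tune the ratio for your workload.
pulsar_mem_for_limit() {
  limit_mb="$1"
  heap_mb=$((limit_mb / 4))
  direct_mb=$((limit_mb / 4))
  echo "-Xms${heap_mb}m -Xmx${heap_mb}m -XX:MaxDirectMemorySize=${direct_mb}m"
}

# Usage with a 4 GiB limit:
#   docker run --memory=4g -e PULSAR_MEM="$(pulsar_mem_for_limit 4096)" ...
```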

Monitoring

Prometheus Metrics

Expose Prometheus metrics:
services:
  pulsar:
    image: apachepulsar/pulsar:3.3.0
    ports:
      - "8080:8080"  # Broker metrics at /metrics
    command: bin/pulsar standalone
Scrape metrics:
curl http://localhost:8080/metrics
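The endpoint returns Prometheus text format, one sample per line, which is easy to filter without a full Prometheus setup. A minimal sketch that sums every sample of one metric across its label sets, assuming awk is available on the host. The helper name metric_sum is hypothetical, and pulsar_topics_count in the usage comment is an assumed metric name to verify against your broker's output:

```shell
# Sum all samples of one metric from Prometheus text format on stdin.
# Sample lines look like: metric_name{label="x"} 12.0 [timestamp]
metric_sum() {
  awk -v name="$1" '
    /^#/ { next }                      # skip HELP and TYPE comment lines
    {
      line = $0
      sub(/[{].*[}]/, "", line)        # drop the label set, if any
      n = split(line, field, " ")      # field[1] = name, field[2] = value
      if (n >= 2 && field[1] == name) { total += field[2]; found = 1 }
    }
    END { if (found) print total; else exit 1 }
  '
}

# Usage:
#   curl -s http://localhost:8080/metrics | metric_sum pulsar_topics_count
```

The function exits with a non-zero status when the metric is absent, which lets a script tell "zero" apart from "not exported".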

Log Collection

Mount log directory:
docker run -it \
  -v $PWD/logs:/pulsar/logs \
  apachepulsar/pulsar:3.3.0 \
  bin/pulsar standalone

Health Checks

Docker health check configuration:
services:
  broker:
    image: apachepulsar/pulsar:3.3.0
    healthcheck:
      test: ["CMD", "bin/pulsar-admin", "brokers", "healthcheck"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 40s

Backup and Recovery

Export Data

# Backup data directory
docker run --rm \
  -v pulsar-data:/pulsar/data \
  -v $PWD/backup:/backup \
  alpine tar czf /backup/pulsar-backup.tar.gz /pulsar/data
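Archiving the volume while Pulsar is still writing to it can capture files in an inconsistent state. A minimal sketch that stops the container first, writes a timestamped archive, and starts the container again. The helper name is hypothetical, and the destination must be an absolute host path because Docker treats relative names as volume names:

```shell
# Hypothetical helper: stop the container, archive its data volume, restart it.
#   $1 = container name (e.g. pulsar-standalone)
#   $2 = volume name (e.g. pulsar-data)
#   $3 = absolute host directory that receives the archive
backup_pulsar_volume() {
  container="$1"
  volume="$2"
  dest_dir="$3"
  archive="pulsar-backup-$(date +%Y%m%d-%H%M%S).tar.gz"
  mkdir -p "$dest_dir" || return 1
  docker stop "$container" >/dev/null || return 1
  # Mount the volume read-only so the backup cannot modify the data
  docker run --rm \
    -v "$volume":/pulsar/data:ro \
    -v "$dest_dir":/backup \
    alpine tar czf "/backup/$archive" /pulsar/data
  status=$?
  # Restart the container even if the archive step failed
  docker start "$container" >/dev/null
  if [ "$status" -eq 0 ]; then echo "$dest_dir/$archive"; fi
  return "$status"
}
```

For example, backup_pulsar_volume pulsar-standalone pulsar-data "$PWD/backup" prints the path of the archive it wrote.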

Restore Data

# Restore from backup
docker run --rm \
  -v pulsar-data:/pulsar/data \
  -v $PWD/backup:/backup \
  alpine tar xzf /backup/pulsar-backup.tar.gz -C /

Troubleshooting

Container Not Starting

Check logs:
docker logs pulsar-standalone

Permission Denied

If you see permission errors, ensure volumes have correct ownership:
# Fix ownership
sudo chown -R 10000:0 ./data

Out of Memory

Adjust JVM settings:
docker run -it \
  -e PULSAR_MEM="-Xms2g -Xmx2g -XX:MaxDirectMemorySize=2g" \
  apachepulsar/pulsar:3.3.0 \
  bin/pulsar standalone

Connection Refused

Verify ports are exposed:
docker port pulsar-standalone

Image Sizes

Comparison of official images (version 2.9.1):
Image                     Size      Use Case
apachepulsar/pulsar       1.59 GB   Standard deployment
apachepulsar/pulsar-all   3.44 GB   With all connectors/offloaders
Custom image              ~1.6 GB   Selective connectors

Production Considerations

For production deployments:
  • Use Docker Compose for single-host setups, or Kubernetes for orchestration and better scalability
  • Configure persistent volumes for data durability
  • Set resource limits and health checks
  • Enable monitoring and logging
  • Use specific image tags (not latest)
