This guide shows you how to set up Lemline with Apache Kafka for messaging and PostgreSQL for persistent storage. This combination is ideal for high-throughput production environments.

Why Kafka + PostgreSQL?

  • Kafka: High-throughput, distributed message broker with excellent scalability
  • PostgreSQL: Robust relational database with strong consistency guarantees
  • Use case: Production deployments requiring high message throughput and reliable persistence

Prerequisites

  • Docker and Docker Compose installed
  • Lemline built locally (see Installation)
  • Basic understanding of Kafka and PostgreSQL

Quick Start

1. Start the Infrastructure

From the examples directory, start Kafka and PostgreSQL:

    cd examples
    docker compose --profile kafka-pg up -d

This starts:

  • PostgreSQL (port 5432)
  • Kafka (port 9092)
  • Zookeeper (port 2181, for Kafka coordination)
  • Kafka UI (port 8080, optional management interface)
2. Wait for Services

Check that the services are healthy:

    docker compose ps

All services should show status "Up" and health "healthy".
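If you script the startup, a small polling loop avoids racing the health checks. This is a generic sketch, not part of the Lemline tooling; swap the placeholder probe for a real check such as `pg_isready` or `docker compose ps` filtered for "healthy":

```shell
# Generic wait-until-ready helper: re-runs the given probe command until it
# succeeds, giving up after max_tries attempts with a one-second delay.
wait_ready() {
  max_tries=$1; shift
  i=0
  until "$@"; do
    i=$((i + 1))
    if [ "$i" -ge "$max_tries" ]; then
      echo "gave up after $max_tries attempts" >&2
      return 1
    fi
    sleep 1
  done
}

# Placeholder probe (`true`) -- replace with a real health check:
wait_ready 30 true && echo "services ready"
```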
3. Configure Lemline

The example includes a pre-configured file:
    # Lemline Configuration: Kafka + PostgreSQL
    
    lemline:
      database:
        postgresql:
          host: localhost
          port: 5432
          database: lemline
          username: postgres
          password: postgres
    
      messaging:
        kafka:
          brokers: localhost:9092
    
4. Start Lemline

From the project root:
    15
    LEMLINE_CONFIG=./examples/lemline-kafka-pg.yaml \
      java -jar lemline-runner/build/quarkus-app/quarkus-run.jar listen
    
Lemline will:

  • Connect to PostgreSQL and run migrations
  • Connect to Kafka brokers
  • Create the necessary topics (commands-in, commands-out, events-out)
  • Start listening for workflow commands
5. Deploy a Workflow

Install the hello world workflow:
    20
    lemline definition post -f examples/workflows/hello.yaml
    
6. Run a Workflow Instance

    lemline instance start -n tutorial.hello-workflow -v 0.1.0

Watch the Lemline logs to see execution.

    Configuration Details

    PostgreSQL Settings

    lemline:
      database:
        postgresql:
          host: localhost
          port: 5432
          database: lemline
          username: postgres
          password: postgres
          # Optional advanced settings:
          # maxPoolSize: 20
          # connectionTimeout: 30000
    
    Environment variable overrides:
    LEMLINE_DATABASE_POSTGRESQL_HOST=db.example.com
    LEMLINE_DATABASE_POSTGRESQL_PORT=5432
    LEMLINE_DATABASE_POSTGRESQL_DATABASE=production
    LEMLINE_DATABASE_POSTGRESQL_USERNAME=lemline_user
    LEMLINE_DATABASE_POSTGRESQL_PASSWORD=secret123
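The overrides follow the usual convention of uppercasing the configuration path and joining it with underscores. A quick way to confirm what the runner's JVM will inherit (hostname and username below are placeholders):

```shell
# Placeholder values -- substitute your real connection details.
export LEMLINE_DATABASE_POSTGRESQL_HOST=db.example.com
export LEMLINE_DATABASE_POSTGRESQL_USERNAME=lemline_user

# Confirm the overrides are visible to child processes such as the JVM:
env | grep '^LEMLINE_DATABASE_POSTGRESQL_' | sort
```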
    

    Kafka Settings

    lemline:
      messaging:
        kafka:
          brokers: localhost:9092
          # Optional topic configuration:
          # commands:
          #   topic: custom-commands-topic
          #   partitions: 4
          # events:
          #   topic: custom-events-topic
          #   partitions: 4
    
    Environment variable overrides:
    LEMLINE_MESSAGING_KAFKA_BROKERS=kafka1:9092,kafka2:9092,kafka3:9092
    

    Database Schema

    Lemline automatically creates the necessary tables:
    • workflow_definitions: Workflow DSL definitions
    • workflow_instances: Active and completed workflow instances
    • workflow_waits: Scheduled tasks and timeouts
    • workflow_listeners: Event listeners and subscriptions
    • workflow_retries: Retry state for failed tasks
    • workflow_failures: Terminal failures and errors
    • lifecycle_analytics: Execution metrics and history
    Migrations are managed by Flyway and run automatically on startup.

    Kafka Topics

    Lemline uses three Kafka topics by default:
    1. commands-in: Incoming workflow commands (start, resume, cancel)
    2. commands-out: Outgoing workflow commands for task execution
    3. events-out: CloudEvents produced by workflows

    Topic Configuration

    Kafka creates topics automatically with default settings (4 partitions, replication factor 1). For production, pre-create topics with appropriate settings:
    # Create commands-in topic with 8 partitions
    docker exec lemline-kafka kafka-topics \
      --create --topic lemline-commands-in \
      --bootstrap-server localhost:9092 \
      --partitions 8 \
      --replication-factor 3
    
    # Create commands-out topic
    docker exec lemline-kafka kafka-topics \
      --create --topic lemline-commands-out \
      --bootstrap-server localhost:9092 \
      --partitions 8 \
      --replication-factor 3
    
    # Create events-out topic
    docker exec lemline-kafka kafka-topics \
      --create --topic lemline-events-out \
      --bootstrap-server localhost:9092 \
      --partitions 8 \
      --replication-factor 3
    

    Horizontal Scaling

    Run multiple Lemline instances to scale horizontally:
    # Terminal 1
    LEMLINE_CONFIG=./examples/lemline-kafka-pg.yaml \
      java -jar lemline-runner.jar listen
    
    # Terminal 2
    LEMLINE_CONFIG=./examples/lemline-kafka-pg.yaml \
      java -jar lemline-runner.jar listen
    
    # Terminal 3
    LEMLINE_CONFIG=./examples/lemline-kafka-pg.yaml \
      java -jar lemline-runner.jar listen
    
    Kafka consumer groups ensure that each message is processed by exactly one instance.
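Why partition count matters here: a consumer group never assigns the same partition to two members, so the number of partitions caps useful parallelism. A rough sketch of how a range-style assignor spreads partitions across instances (numbers are illustrative, mirroring the 8-partition topics above):

```shell
partitions=8   # per-topic partition count (see Topic Configuration)
consumers=3    # running Lemline instances
base=$((partitions / consumers))   # minimum partitions per consumer
extra=$((partitions % consumers))  # first `extra` consumers get one more

c=0
while [ "$c" -lt "$consumers" ]; do
  count=$base
  if [ "$c" -lt "$extra" ]; then count=$((count + 1)); fi
  echo "instance-$c -> $count partitions"
  c=$((c + 1))
done
```

With 8 partitions and 3 instances, two instances handle 3 partitions each and one handles 2; a fourth instance beyond 8 would sit idle.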

    Monitoring

    Kafka UI

    Access the Kafka UI at http://localhost:8080 to:
    • View topics and partitions
    • Inspect messages
    • Monitor consumer lag
    • View broker metrics

    PostgreSQL Monitoring

    Connect to PostgreSQL to query workflow state:
    docker exec -it lemline-postgres psql -U postgres -d lemline
    
    Useful queries:
    -- Active workflow instances
    SELECT namespace, name, version, status, created_at
    FROM workflow_instances
    WHERE status = 'running';
    
    -- Recent failures
    SELECT wi.namespace, wi.name, wf.error_type, wf.error_message, wf.failed_at
    FROM workflow_failures wf
    JOIN workflow_instances wi ON wf.instance_id = wi.id
    ORDER BY wf.failed_at DESC
    LIMIT 10;
    
    -- Listener subscriptions
    SELECT event_type, COUNT(*) as listener_count
    FROM workflow_listeners
    GROUP BY event_type;
    

    Production Considerations

    Configure appropriate pool sizes based on your workload:
    lemline:
      database:
        postgresql:
          maxPoolSize: 50
          minPoolSize: 10
          connectionTimeout: 30000
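One budget worth sanity-checking: each instance can open up to maxPoolSize connections, so the total across all instances must fit under PostgreSQL's max_connections, with some headroom for admin sessions. A back-of-the-envelope check (all numbers illustrative):

```shell
instances=3            # Lemline instances you plan to run
max_pool_size=50       # maxPoolSize from the config above
pg_max_connections=200 # PostgreSQL server limit
headroom=10            # keep a few connections free for admin sessions

total=$((instances * max_pool_size))
if [ "$total" -le $((pg_max_connections - headroom)) ]; then
  echo "OK: $total pooled connections fit under max_connections=$pg_max_connections"
else
  echo "WARNING: need $total connections; raise max_connections or lower maxPoolSize"
fi
```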
    
    For production, use a multi-broker Kafka cluster with:
    • At least 3 brokers for fault tolerance
    • Appropriate replication factor (typically 3)
    • Sufficient partitions for parallelism (8-16 per topic)
    Optimize PostgreSQL for your workload:
    • Increase max_connections for concurrent workflows
    • Tune shared_buffers and effective_cache_size
    • Enable connection pooling with PgBouncer
    • Configure appropriate backup and replication

    Secure the deployment:
    • Use TLS for Kafka connections
    • Enable PostgreSQL SSL
    • Use strong passwords and rotate credentials
    • Configure Kafka SASL authentication
    • Restrict network access with firewalls
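The Lemline-specific keys for these settings are not covered in this guide; at the Kafka client level, SASL over TLS is conventionally expressed with standard client properties along these lines (mechanism, file paths, and credentials are placeholders):

```properties
security.protocol=SASL_SSL
sasl.mechanism=SCRAM-SHA-512
sasl.jaas.config=org.apache.kafka.common.security.scram.ScramLoginModule required \
  username="lemline" \
  password="change-me";
ssl.truststore.location=/etc/lemline/kafka.truststore.jks
ssl.truststore.password=change-me
```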

    Troubleshooting

    Kafka Connection Refused

    Error: Connection to node -1 (localhost/127.0.0.1:9092) could not be established

    Solution: Ensure Kafka is running and accessible:
    docker compose ps kafka
    docker compose logs kafka
    

    PostgreSQL Connection Errors

    Error: Connection refused or password authentication failed

    Solution: Check PostgreSQL status and credentials:
    docker compose ps postgres
    docker compose logs postgres
    

    Topic Not Created

    Error: Topic 'lemline-commands-in' does not exist

    Solution: Kafka auto-creates topics by default. If auto-creation is disabled, manually create the topics (see Kafka Topics).
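Auto-creation is controlled broker-side; in Kafka's server.properties the relevant setting is:

```properties
# server.properties -- brokers create unknown topics on first use when true
auto.create.topics.enable=true
```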

    Stopping the Infrastructure

    # Stop services but keep data
    docker compose --profile kafka-pg down
    
    # Stop services and remove volumes (clean slate)
    docker compose --profile kafka-pg down -v
    

    Next Steps

    RabbitMQ + MySQL

    Try a different messaging/database combination

    PGMQ Setup

    Use PostgreSQL for both messaging and storage

    Production Deployment

    Deploy Lemline to production

    Monitoring

    Set up metrics and monitoring
