Task Queues

Task queues are the routing mechanism that connects the Temporal Server to your workers. They act as a bridge between workflow executions in the History Service and worker processes that execute workflow and activity code.

Task Queue Basics

A task queue is a logical queue that workers poll for tasks. When the History Service needs workflow or activity code to execute, it dispatches tasks to the appropriate task queue.

Two Types of Tasks

Workflow Tasks

Prompt the worker to advance a workflow execution by replaying history and generating new commands.

Activity Tasks

Request that the worker execute a specific activity function with the given input.

These “task queue tasks” are user-visible concepts, distinct from internal History Service tasks (Transfer Tasks, Timer Tasks), which are implementation details.

Task Queue Architecture

The Matching Service manages all task queues.

Task Queue Partitioning

To handle high throughput, each task queue is split into multiple partitions:
// From service/matching/task_queue_partition_manager.go
type taskQueuePartitionManagerImpl struct {
    engine    *matchingEngineImpl
    partition tqid.Partition  // Which partition this manages
    ns        *namespace.Namespace
    config    *taskQueueConfig
    
    // Versioned queues for worker versioning
    versionedQueues     map[PhysicalTaskQueueVersion]physicalTaskQueueManager
    versionedQueuesLock sync.RWMutex
    
    userDataManager     userDataManager
    rateLimitManager    *rateLimitManager
    
    // Default queue future for initialization
    defaultQueueFuture *future.FutureImpl[physicalTaskQueueManager]
}
  • Scalability: A single queue partition is limited to one Matching Service instance; partitions spread load across instances.
  • Throughput: Each partition handles a subset of tasks, so more partitions mean higher total throughput.
  • Default: 4 partitions per task queue, configurable per namespace.
  • Trade-offs: More partitions increase dispatch overhead; the optimal partition count depends on the task rate.

Task Dispatching

When the History Service schedules a task, it goes through this flow:
  1. History creates a Transfer Task: The workflow/activity task decision creates a Transfer Task in the History shard’s queue.
  2. Queue processor picks up the task: A background queue processor reads the Transfer Task from persistence.
  3. Call the Matching Service: History makes an AddWorkflowTask or AddActivityTask RPC to the Matching Service.
  4. Matching routes to a partition: The Matching Service hashes the task queue name to select a partition.
  5. Attempt sync match: If a worker is polling, the task is delivered immediately (sync match).
  6. Or write to database: If no poller is available, the task is written to persistence (async match).
// From service/history/transfer_queue_active_task_executor.go
func (t *transferQueueActiveTaskExecutor) processWorkflowTask(
    ctx context.Context,
    task *tasks.WorkflowTask,
) error {
    // Load workflow state
    // Build task info
    // Call Matching Service
    return t.matchingClient.AddWorkflowTask(ctx, &matchingservice.AddWorkflowTaskRequest{
        NamespaceId:       task.NamespaceID,
        Execution:         task.WorkflowExecution,
        TaskQueue:         &taskqueuepb.TaskQueue{Name: taskQueueName},
        ScheduledEventId:  task.ScheduledEventID,
        ScheduleToStartTimeout: timeout,
    })
}

Sync Match vs Async Match

The Matching Service optimizes for low latency through sync matching:

Sync Match (Fast Path)

  1. Worker is already polling: The worker has an open long-poll connection waiting for a task.
  2. Task arrives at Matching: The History Service dispatches the task to the Matching Service.
  3. Instant match: The task is delivered directly to the waiting poller without a database write.
  4. Low latency: Total time from History to worker is typically 1-5 ms.

Async Match (Slow Path)

  1. No poller available: The task arrives, but no worker is currently polling.
  2. Write to database: The task is persisted to the Matching Service database.
  3. Worker polls later: When a worker polls, the task is read from the database and returned.
  4. Higher latency: The database round-trip adds latency but ensures the task is not lost.
// From service/matching/matcher.go
func (tm *TaskMatcher) Offer(ctx context.Context, task *internalTask) (bool, error) {
    // Check if we should prefer database due to backlog
    if !tm.isBacklogNegligible() {
        return false, nil  // Force async match for ordering
    }
    
    // Try sync match with waiting poller
    select {
    case tm.taskC <- task:  // Poller waiting!
        if task.responseC != nil {
            err := <-task.responseC  // Wait for poller to accept
            return true, err.startErr
        }
        return false, nil
    default:
        // No poller, try forwarding to parent partition
        return tm.fwdr.ForwardTask(ctx, task)
    }
}

Task Queue Forwarding

To improve sync match rate, Matching uses hierarchical forwarding:

Forwarding Mechanism

  • Forward task to parent: A child partition has a task but no poller, so it forwards the task to the parent/root.
  • Forward poller to parent: A child partition has a poller but no task, so it forwards the poller to the parent/root.
  • Match at root: The root partition has the best chance of matching tasks with pollers from all children.
  • Benefits: Higher sync match rate, better load distribution, faster task delivery.
  • Cost: Additional RPC hops and slightly higher latency for forwarded tasks.
// From service/matching/forwarder.go
type Forwarder struct {
    cfg       *forwarderConfig
    partition tqid.Partition
    client    matchingservice.MatchingServiceClient
    
    // Token bucket for rate limiting forwarding
    addReqTokenBucket atomic.Pointer[dynamicconfig.TaskqueueQuotaBucket]
    pollReqTokenBucket atomic.Pointer[dynamicconfig.TaskqueueQuotaBucket]
}

func (fwdr *Forwarder) ForwardTask(ctx context.Context, task *internalTask) error {
    // Get parent partition
    parent := fwdr.partition.Parent()
    
    // Forward to parent's Matching instance
    return fwdr.client.AddWorkflowTask(ctx, &matchingservice.AddWorkflowTaskRequest{
        // ... task details
        ForwardInfo: &matchingservice.ForwardInfo{
            SourcePartition: fwdr.partition.PartitionId(),
        },
    })
}

Task Queue Lifecycle

Loading

Task queue partitions are loaded on-demand:
  1. First poll or task for a task queue triggers loading
  2. Partition manager created for that partition
  3. State loaded from database (backlog, ack levels, etc.)
  4. Matcher started to match tasks with pollers

Unloading

Partitions unload when idle:
  • No pollers for configured duration (e.g., 5 minutes)
  • No tasks dispatched
  • Memory pressure on Matching Service
const (
    unloadCauseIdle = "idle"           // No activity
    unloadCauseConfigChange = "config"  // Dynamic config changed
    unloadCauseError = "error"         // Persistent errors
)
Unloading is an optimization: when the next task or poll arrives, the partition reloads automatically, and users see no interruption.

Task Queue Backlog

When the task arrival rate exceeds the worker processing rate, a backlog accumulates:

Backlog Management

Tasks that can’t sync match are written to persistence:
  • Cassandra: A separate table per task queue partition
  • SQL: Rows in a tasks table with task queue + partition columns
The backlog is FIFO, ordered by task ID (monotonically increasing).
// From service/matching/task_reader.go
type taskReader struct {
    tlMgr           taskQueuePartitionManager
    db              *taskQueueDB
    matcher         *TaskMatcher
    
    // Read batches of tasks from database
    taskBatchSize   int
}

// Continuously read backlog tasks and offer to matcher
func (tr *taskReader) dispatchBufferedTasks(ctx context.Context) {
    for !tr.stopped() {
        tasks := tr.db.ReadTasks(tr.taskBatchSize)  // Read a batch from persistence
        for _, task := range tasks {
            tr.matcher.Offer(ctx, task)  // Try to dispatch to a waiting poller
        }
    }
}
Key metrics:
  • Task Queue Backlog Age: How old is the oldest task?
  • Task Queue Backlog Size: How many tasks waiting?
  • Task Dispatch Latency: Time from scheduling to worker pickup

Task Queue Versioning

For worker versioning, each task queue has multiple physical queues:
// From service/matching/task_queue_partition_manager.go
type taskQueuePartitionManagerImpl struct {
    // One versioned queue per Build ID
    versionedQueues map[PhysicalTaskQueueVersion]physicalTaskQueueManager
    
    // User data manager tracks versioning rules
    userDataManager userDataManager
}

Default Queue

Unversioned tasks and workers without a Build ID use the default queue.

Versioned Queues

Each Build ID gets a dedicated queue. Tasks are routed by the versioning rules.

Task Queue Configuration

Dynamic Configuration

type taskQueueConfig struct {
    // How many partitions for this task queue
    NumReadPartitions int
    NumWritePartitions int
    
    // Task lifecycle
    MaxTaskQueueIdleTime time.Duration  // When to unload
    MaxTaskBatchSize int                // Backlog read batch size
    
    // Rate limiting
    TaskDispatchRPS float64
    
    // Forwarding
    ForwarderMaxChildrenPerNode int
    ForwarderMaxOutstandingPolls int
    ForwarderMaxOutstandingTasks int
}

Rate Limiting

Task queues support rate limiting task dispatch:
  • Global RPS limit per task queue
  • Smoothed delivery prevents spikes
  • Shared across partitions
  • Configurable per namespace
Rate limiting applies to task dispatch from Matching to workers, not task arrival from History Service. Tasks buffer in Matching if rate limited.

Sticky Task Queues

Workers can request “sticky” execution for performance:
  • Normal task queue: my-task-queue
  • Sticky task queue: __sticky__:${workerID}
Sticky queues:
  • Route workflow tasks back to same worker
  • Enable workflow cache reuse (skip replay)
  • Timeout quickly and fallback to normal queue
  • Per-worker, not shared across workers

Task Queue Observability

Key signals to monitor:

Metrics

  • task_queue_backlog_age_seconds: Age of oldest task
  • task_queue_backlog_size: Number of pending tasks
  • poll_success_rate: Poller success rate
  • sync_match_rate: Percentage of sync vs async matches
  • task_dispatch_latency: Time from schedule to worker

Troubleshooting

High Backlog Age

Cause: Not enough workers, or workers are too slow.
Solution: Scale workers, optimize activity code, increase concurrency.

Low Sync Match Rate

Cause: Tasks arrive when no pollers are waiting.
Solution: Increase the poller count, reduce the partition count, enable forwarding.

High Task Latency

Cause: Network issues, Matching overload, or database slowness.
Solution: Check Matching Service health, database performance, and network latency.

No Recent Poller

Cause: All workers are down or misconfigured.
Solution: Verify workers are running, check that the task queue name matches, review logs.
