
Workers

Workers are the compute processes that execute your workflow and activity code. They run in your infrastructure, continuously polling Temporal Server for tasks and executing them using the Temporal SDK.

Worker Architecture

A worker process is a long-running application that:
  1. Registers workflow and activity implementations with the SDK
  2. Polls the Temporal Server for workflow and activity tasks
  3. Executes tasks by running your code
  4. Reports results back to the server
  5. Repeats continuously until shut down
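That loop can be sketched with illustrative (not real-SDK) interfaces; `Task`, `Server`, `runWorker`, and `fakeServer` below are all hypothetical names for the sketch:

```go
package main

import "fmt"

// Task is a simplified stand-in for a workflow or activity task.
type Task struct {
	Token string
	Input string
}

// Server abstracts the endpoints the worker talks to. Method names are
// illustrative; the real API is the Temporal gRPC service via the SDK.
type Server interface {
	PollTask() (*Task, bool) // long-poll; false means shut down (sketch only)
	RespondCompleted(token, result string)
}

// runWorker is the worker's core loop: poll, execute, respond, repeat.
// It returns the number of tasks processed once PollTask signals shutdown.
func runWorker(s Server, handler func(input string) string) int {
	n := 0
	for {
		task, ok := s.PollTask() // blocks until a task arrives (long-poll)
		if !ok {
			return n // real workers re-poll forever; the sketch stops here
		}
		s.RespondCompleted(task.Token, handler(task.Input)) // execute and report
		n++
	}
}

// fakeServer feeds a fixed task list and records completions.
type fakeServer struct {
	tasks   []*Task
	results map[string]string
}

func (f *fakeServer) PollTask() (*Task, bool) {
	if len(f.tasks) == 0 {
		return nil, false
	}
	t := f.tasks[0]
	f.tasks = f.tasks[1:]
	return t, true
}

func (f *fakeServer) RespondCompleted(token, result string) {
	f.results[token] = result
}

func main() {
	fs := &fakeServer{
		tasks:   []*Task{{Token: "t1", Input: "hello"}, {Token: "t2", Input: "world"}},
		results: map[string]string{},
	}
	n := runWorker(fs, func(in string) string { return in + "!" })
	fmt.Println(n, fs.results["t1"], fs.results["t2"]) // 2 hello! world!
}
```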

Polling Mechanism

Workers use long-polling to receive tasks from the server with minimal latency:

Long-Poll Request

  1. Worker sends poll request: the worker makes a PollWorkflowTaskQueue or PollActivityTaskQueue RPC to the Frontend Service
  2. Frontend routes to Matching Service: the request is routed to the Matching Service instance responsible for that task queue
  3. Matching waits for a task: if no task is immediately available, the Matching Service holds the connection open (long-poll) until one arrives
  4. Task arrives: when the History Service dispatches a task, it is matched with the waiting poller
  5. Task returned to worker: the Matching Service returns the task details and closes the poll request
// From service/matching/matcher.go
type TaskMatcher struct {
    // Channels for matching tasks with pollers
    taskC      chan *internalTask  // Regular tasks
    queryTaskC chan *internalTask  // Query tasks
    
    // Forwards tasks to the parent partition when no local poller is waiting
    fwdr *Forwarder
    
    // Track when pollers were last seen
    lastPoller atomic.Int64  // unix nanos of most recent poll
    
    // Rate limiting for task dispatch
    rateLimiter quotas.RateLimiter
}

// Offer matches a task with a waiting poller
func (tm *TaskMatcher) Offer(ctx context.Context, task *internalTask) (bool, error) {
    select {
    case tm.taskC <- task:  // Poller waiting, immediate match
        return true, nil
    default:
        // No poller available, try forwarding to parent partition
        return tm.fwdr.ForwardTask(ctx, task)
    }
}

Sync Match vs Async Match

Sync Match

Instant delivery: Task dispatched directly to waiting poller without touching database. Lowest latency path (~1ms).

Async Match

Queued delivery: No poller available, task written to database and retrieved later when poller arrives. Higher latency but ensures delivery.
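A minimal sketch of the two paths, where a buffered channel stands in for one waiting poller and a slice stands in for the database-backed backlog (`matchOrQueue` and both parameter names are illustrative):

```go
package main

import "fmt"

type internalTask struct{ ID string }

// matchOrQueue models the Matching Service's decision: hand the task to a
// waiting poller (sync match) or persist it for later (async match).
func matchOrQueue(taskC chan *internalTask, backlog *[]*internalTask, t *internalTask) string {
	select {
	case taskC <- t:
		return "sync" // a poller slot was free: delivered without touching the database
	default:
		*backlog = append(*backlog, t) // no poller: write to the backlog for later pickup
		return "async"
	}
}

func main() {
	taskC := make(chan *internalTask, 1) // one "waiting poller"
	var backlog []*internalTask
	fmt.Println(matchOrQueue(taskC, &backlog, &internalTask{ID: "a"})) // sync
	fmt.Println(matchOrQueue(taskC, &backlog, &internalTask{ID: "b"})) // async
	fmt.Println(len(backlog))                                          // 1
}
```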

Workflow Task Execution

When a worker receives a workflow task:

Execution Flow

  1. Worker receives task: the task contains the workflow execution history (all events since start)
  2. SDK replays history: the SDK feeds history events through the workflow code to reconstruct current state; the code must be deterministic
  3. Workflow code executes: the code runs until it blocks (waiting for an activity, timer, or signal) or completes
  4. Generate commands: the SDK collects commands such as ScheduleActivityTask, StartTimer, and CompleteWorkflowExecution
  5. Send response: the worker sends RespondWorkflowTaskCompleted with the command list to the History Service
type RespondWorkflowTaskCompletedRequest struct {
    TaskToken []byte      // Identifies which task this is completing
    Commands  []*Command  // List of commands to execute
    
    // Possible commands:
    // - ScheduleActivityTask
    // - StartTimer 
    // - CompleteWorkflowExecution
    // - FailWorkflowExecution
    // - CancelTimer
    // - RequestCancelActivity
    // - StartChildWorkflowExecution
    // - SignalExternalWorkflowExecution
    // - ContinueAsNewWorkflowExecution
    // - and more...
}

Workflow Task Failures

Workflow tasks can fail for several reasons:

Non-Determinism

Workflow code produced different commands during replay. SDK detects this and fails the task.

Worker Crash

Worker dies during execution. Task timeout fires and task is retried on another worker.

Transient Error

Temporary failure like network issue. Task automatically retried with backoff.

Bad Code

Unhandled exception in workflow code. Can be configured to fail workflow or retry task.
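The non-determinism case can be made concrete: during replay, the commands the workflow code produces must match the commands already recorded in history, in order. A simplified sketch of that check (`checkDeterminism` is a hypothetical name, not the real SDK logic):

```go
package main

import "fmt"

// checkDeterminism compares recorded history commands against the commands
// produced on replay. Any mismatch means the code is non-deterministic (or
// was changed incompatibly), and the workflow task is failed.
func checkDeterminism(recorded, replayed []string) error {
	if len(recorded) != len(replayed) {
		return fmt.Errorf("non-determinism: %d recorded vs %d replayed commands",
			len(recorded), len(replayed))
	}
	for i := range recorded {
		if recorded[i] != replayed[i] {
			return fmt.Errorf("non-determinism at command %d: recorded %q, replayed %q",
				i, recorded[i], replayed[i])
		}
	}
	return nil
}

func main() {
	history := []string{"ScheduleActivityTask", "StartTimer"}
	fmt.Println(checkDeterminism(history, []string{"ScheduleActivityTask", "StartTimer"}))
	fmt.Println(checkDeterminism(history, []string{"StartTimer", "ScheduleActivityTask"}))
}
```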

Activity Task Execution

Activity execution is simpler than workflow execution:
  1. Worker receives activity task: the task contains the activity input, attempt number, and heartbeat details (if resuming)
  2. Execute activity function: the activity code runs with full access to I/O, databases, APIs, etc.; no determinism is required
  3. Send heartbeats (optional): long-running activities periodically heartbeat to prove they're alive
  4. Return result or error: the worker sends RespondActivityTaskCompleted or RespondActivityTaskFailed
// Activity execution is straightforward
func (w *Worker) executeActivity(task *ActivityTask) {
    // Get registered activity function
    activityFn := w.registry.GetActivity(task.ActivityType)
    
    // Execute with timeout and context
    ctx, cancel := context.WithTimeout(context.Background(), task.StartToCloseTimeout)
    defer cancel()
    
    result, err := activityFn(ctx, task.Input)
    
    // Report the result using a fresh context: the activity's own context
    // may already be expired or canceled
    reportCtx := context.Background()
    if err != nil {
        w.client.RespondActivityTaskFailed(reportCtx, task.TaskToken, err)
    } else {
        w.client.RespondActivityTaskCompleted(reportCtx, task.TaskToken, result)
    }
}
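Heartbeating (step 3) is also how progress survives retries: each heartbeat can carry a payload, and a retried attempt receives the last recorded payload so it can resume instead of starting over. A sketch with a hypothetical `processItems` helper:

```go
package main

import "fmt"

// processItems models a heartbeating activity: each heartbeat carries the
// index of the last completed item. On retry, the caller passes the last
// recorded heartbeat as resumeFrom so completed work is skipped.
func processItems(items []string, resumeFrom int, heartbeat func(progress int)) int {
	for i := resumeFrom; i < len(items); i++ {
		// ... process items[i] ...
		heartbeat(i + 1) // the server keeps only the latest heartbeat payload
	}
	return len(items)
}

func main() {
	var lastHeartbeat int
	record := func(p int) { lastHeartbeat = p }

	processItems([]string{"a", "b", "c"}, 0, record)
	fmt.Println(lastHeartbeat) // 3: all items done

	// A retried attempt reads the last heartbeat and resumes from there.
	processItems([]string{"a", "b", "c"}, lastHeartbeat, record)
	fmt.Println(lastHeartbeat) // still 3: nothing reprocessed
}
```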

Worker Configuration

Workers have several important configuration parameters:

Concurrency Settings

  • Workflow task concurrency: controls how many workflow tasks can execute simultaneously; each task runs in its own goroutine/thread. Default: 100. Considerations: workflow tasks are typically CPU-bound (replay logic), so scale based on CPU cores.
  • Activity task concurrency: controls how many activity tasks can execute simultaneously. Default: 100. Considerations: activities often wait on I/O, so this can typically be higher than workflow task concurrency.
  • Local activity concurrency: controls local activity execution parallelism. Default: 100. Considerations: local activities execute inline with the workflow task, sharing its resources.
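In the Go SDK, these three limits roughly correspond to fields on `worker.Options` (from `go.temporal.io/sdk/worker`); verify the exact names against your SDK version. The client variable and task queue name below are placeholders:

```go
// Sketch of setting the three concurrency limits with the Go SDK.
w := worker.New(temporalClient, "orders-task-queue", worker.Options{
	MaxConcurrentWorkflowTaskExecutionSize:  100, // workflow tasks: CPU-bound, scale with cores
	MaxConcurrentActivityExecutionSize:      500, // activities: often I/O-bound, can go higher
	MaxConcurrentLocalActivityExecutionSize: 100, // local activities run inline with workflow tasks
})
```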

Polling Configuration

type WorkerOptions struct {
    MaxConcurrentWorkflowTaskPollers int  // Number of parallel pollers
    MaxConcurrentActivityTaskPollers int  // Number of parallel pollers
    
    WorkflowPollersCount int  // Deprecated, use above
    ActivityPollersCount int  // Deprecated, use above
    
    // How long to wait for task before considering poll timed out
    WorkflowTaskTimeout time.Duration
    
    // Other options...
}
More pollers increase throughput but also increase load on the Matching Service. Typical values: 2-10 pollers per worker process.

Worker Identity and Tracking

Each worker has an identity used for observability:
  • Worker Identity: Unique identifier (hostname + process ID + UUID)
  • Binary Checksum: Hash of worker binary for detecting bad deployments
  • Build ID: Worker version for routing workflows to compatible workers
// Server tracks which worker executed each task
type ActivityInfo struct {
    StartedIdentity         string  // Which worker started this attempt
    RetryLastWorkerIdentity string  // Which worker tried last attempt
    LastWorkerDeploymentVersion *deploymentpb.WorkerDeploymentVersion
    // ...
}

Sticky Execution

For performance, Temporal uses “sticky execution” to route workflow tasks back to the same worker:

How Sticky Execution Works

  1. Worker caches workflow state: after completing a workflow task, the worker keeps the workflow's state in memory
  2. Worker specifies sticky task queue: the response includes a sticky task queue name unique to this worker
  3. Next task routed to sticky queue: the History Service dispatches the next workflow task to the sticky queue first
  4. Fast path if worker available: the same worker picks up the task, skips replay, and resumes from cached state
  5. Fallback to normal queue: if the worker is unavailable after a timeout, the task is dispatched to the normal queue
  • Performance: skips replaying history (can save 100ms+ for large histories)
  • Reduced Server Load: no need to fetch and transmit the full history
  • Lower Latency: workflow tasks complete faster, so workflows make progress quicker
  • Tradeoff: workflow execution is pinned to a specific worker; if the worker dies, a new worker must do a full replay
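The cache decision itself is simple; a sketch with a hypothetical `eventsToApply` helper (a simplification of the real SDK cache):

```go
package main

import "fmt"

// eventsToApply models the sticky fast path: if the worker's in-memory
// cache has already applied a prefix of this workflow's history, only the
// new events need to be applied; otherwise the full history is replayed.
func eventsToApply(appliedCount int, history []string) []string {
	if appliedCount > 0 && appliedCount <= len(history) {
		return history[appliedCount:] // cache hit: apply only new events
	}
	return history // cache miss (e.g. worker restarted): full replay
}

func main() {
	history := []string{"WorkflowStarted", "ActivityScheduled", "ActivityCompleted"}
	fmt.Println(len(eventsToApply(2, history))) // sticky hit: 1 new event
	fmt.Println(len(eventsToApply(0, history))) // cache miss: 3 events replayed
}
```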

Worker Failure Modes

Worker Process Crash

Impact: In-flight tasks time out and retry on other workers; the sticky cache is lost, requiring history replay. Mitigation: Workflow tasks are automatically redistributed; minimal impact if the worker pool is healthy.

Worker Hangs

Impact: Tasks never complete, timeouts fire, and tasks are retried elsewhere. Mitigation: Set an appropriate workflow task timeout and monitor task latency.

Network Partition

Impact: The worker can't reach the server, so it can't poll for tasks or report results. Mitigation: Workers retry with backoff; tasks time out and retry on healthy workers.

Bad Deployment

Impact: All workers deploy broken code and all tasks fail. Mitigation: Binary checksum tracking, gradual rollouts, and automated rollback.

Worker Versioning

Temporal supports routing tasks based on worker version:

Build ID-Based Versioning

  • Workers register with a Build ID (e.g., git commit, version number)
  • Workflows can be pinned to specific Build ID sets
  • New workflows use latest Build ID
  • Running workflows stay on their Build ID or migrate per rules
Use worker versioning for safe, gradual deployments of workflow code changes without disrupting running workflows.
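The core routing rule above can be sketched with a hypothetical `pickBuildID` helper; real versioning rules are richer (compatible sets, ramping), so treat this as an illustration only:

```go
package main

import "fmt"

// pickBuildID: a running workflow stays pinned to the Build ID that started
// it; a new workflow (no pinned ID yet) gets the latest Build ID.
func pickBuildID(pinned, latest string) string {
	if pinned != "" {
		return pinned
	}
	return latest
}

func main() {
	fmt.Println(pickBuildID("", "v2.1"))     // new workflow: latest build
	fmt.Println(pickBuildID("v2.0", "v2.1")) // running workflow: stays pinned
}
```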

Monitoring Workers

Key metrics to monitor:
  • Poll Success Rate: Are workers successfully getting tasks?
  • Task Execution Latency: How long do tasks take?
  • Task Failure Rate: How many tasks are failing?
  • Worker Count: How many workers are active?
  • Sticky Cache Hit Rate: How often is sticky execution working?
  • Queue Backlog: Are tasks piling up?
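Two of these are ratios derived from raw counters; a small sketch (counter and function names are illustrative, not actual Temporal metric names):

```go
package main

import "fmt"

// pollSuccessRate: fraction of poll requests that returned a task.
func pollSuccessRate(successfulPolls, totalPolls int) float64 {
	if totalPolls == 0 {
		return 0
	}
	return float64(successfulPolls) / float64(totalPolls)
}

// stickyHitRate: fraction of workflow tasks served from the sticky cache.
func stickyHitRate(stickyStarts, workflowTaskStarts int) float64 {
	if workflowTaskStarts == 0 {
		return 0
	}
	return float64(stickyStarts) / float64(workflowTaskStarts)
}

func main() {
	fmt.Printf("poll success: %.2f\n", pollSuccessRate(950, 1000))
	fmt.Printf("sticky hits:  %.2f\n", stickyHitRate(80, 100))
}
```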
