
Overview

Temporal automatically retries failed activities and child workflows according to configurable retry policies:
  • Exponential backoff - Increasing delays between retry attempts
  • Maximum attempts - Limit total retry count
  • Timeout controls - Cap retry duration
  • Non-retriable errors - Fail immediately for specific error types
  • Backoff coefficient - Control retry delay growth rate
Retry policies are core to Temporal’s reliability guarantees. Configure them carefully to balance fault tolerance with resource usage.

Retry Policy Configuration

initialInterval (duration, required)
Delay before the first retry attempt. Default: 1 second.
InitialInterval: time.Second

backoffCoefficient (float, required)
Multiplier applied to the retry delay on each attempt. Default: 2.0.
BackoffCoefficient: 2.0 // Doubles delay each retry

maximumInterval (duration)
Maximum delay between retries. Default: 100× the initial interval.
MaximumInterval: time.Minute * 5

maximumAttempts (int)
Maximum number of attempts, counting the first execution (0 = unlimited). Default: 0 (unlimited).
MaximumAttempts: 10

nonRetryableErrorTypes ([]string)
Error types that should not be retried. The Go SDK field is spelled NonRetryableErrorTypes.
NonRetryableErrorTypes: []string{"ValidationError", "PermissionDenied"}

Exponential Backoff Formula

Retry delay calculation, where attemptNumber starts at 1 for the first retry:
retryDelay = min(
    initialInterval * (backoffCoefficient ^ (attemptNumber - 1)),
    maximumInterval
)

Example: Default Policy

With default settings (initialInterval=1s, backoffCoefficient=2.0, maximumInterval=100s):
Attempt   Delay Calculation    Actual Delay
1         1s × 2^0 = 1s        1s
2         1s × 2^1 = 2s        2s
3         1s × 2^2 = 4s        4s
4         1s × 2^3 = 8s        8s
5         1s × 2^4 = 16s       16s
6         1s × 2^5 = 32s       32s
7         1s × 2^6 = 64s       64s
8         1s × 2^7 = 128s      100s (capped)
9         1s × 2^8 = 256s      100s (capped)

Activity Retry Policies

Configure retries when scheduling activities:
ao := workflow.ActivityOptions{
    StartToCloseTimeout: time.Minute * 10,
    RetryPolicy: &temporal.RetryPolicy{
        InitialInterval:    time.Second,
        BackoffCoefficient: 2.0,
        MaximumInterval:    time.Minute,
        MaximumAttempts:    5,
        NonRetryableErrorTypes: []string{ // Go SDK field name
            "ValidationError",
        },
    },
}
ctx = workflow.WithActivityOptions(ctx, ao)

err := workflow.ExecuteActivity(ctx, MyActivity, input).Get(ctx, &result)

Workflow Retry Policies

Configure retries for workflows and child workflows:
// Start workflow with retry policy
workflowOptions := client.StartWorkflowOptions{
    ID:        "my-workflow-id",
    TaskQueue: "my-task-queue",
    RetryPolicy: &temporal.RetryPolicy{
        InitialInterval:    time.Second * 5,
        BackoffCoefficient: 2.0,
        MaximumInterval:    time.Hour,
        MaximumAttempts:    3,
    },
}

we, err := c.ExecuteWorkflow(context.Background(), workflowOptions, MyWorkflow)
Workflow retries create a new workflow execution run. The workflow will restart from the beginning on each retry.

Server-Side Configuration

Default retry policies can be configured in dynamic config:
# Dynamic config for default activity retry
history.defaultActivityRetryPolicy:
  - value:
      InitialIntervalInSeconds: 1
      BackoffCoefficient: 2.0
      MaximumIntervalCoefficient: 100.0
      MaximumAttempts: 0

# Dynamic config for default workflow retry
history.defaultWorkflowRetryPolicy:
  - value:
      InitialIntervalInSeconds: 1
      BackoffCoefficient: 2.0
      MaximumIntervalCoefficient: 100.0
      MaximumAttempts: 0

Non-Retriable Errors

Specify error types that should fail immediately without retry:
type ValidationError struct {
    message string
}

func (e *ValidationError) Error() string {
    return e.message
}

// Configure in retry policy
RetryPolicy: &temporal.RetryPolicy{
    NonRetryableErrorTypes: []string{
        "ValidationError",      // Match by type name
        "*errors.errorString",  // Type of errors created with errors.New
    },
}

Error Matching

Error type matching rules:
  1. Exact type name match - "ValidationError" matches ValidationError type
  2. Package path match - "myapp/errors.ValidationError" matches fully qualified type
  3. Wildcard match - "*ValidationError" matches any package
Non-retriable error checking is implemented in service/history/workflow/activity.go:
func isRetriable(err error, policy *RetryPolicy) bool {
    errorType := reflect.TypeOf(err).String()
    for _, nonRetriable := range policy.NonRetriableErrorTypes {
        if strings.HasSuffix(errorType, nonRetriable) {
            return false
        }
    }
    return true
}

Retry Timeouts

Retries interact with activity and workflow timeouts:

Activity Timeouts

ScheduleToStart

Time from schedule to worker pickup. Retries reset this timeout.

StartToClose

Maximum time for a single attempt. Each retry gets the full StartToClose timeout.

ScheduleToClose

Total time including all retries. Stops retry loop when exceeded.

Heartbeat

Maximum time allowed between heartbeats. A missed heartbeat fails the current attempt, which is then retried according to the retry policy.

Retry Until ScheduleToClose

ao := workflow.ActivityOptions{
    StartToCloseTimeout:    time.Minute,      // 1 minute per attempt
    ScheduleToCloseTimeout: time.Minute * 10, // 10 minutes total
    RetryPolicy: &temporal.RetryPolicy{
        InitialInterval: time.Second,
        MaximumAttempts: 0, // Unlimited
    },
}
Activity will retry until:
  • ScheduleToCloseTimeout (10 minutes) is exceeded
  • OR activity succeeds
  • OR non-retriable error occurs

Best Practices

Match backoff to downstream service recovery time:
  • Fast recovery (seconds): initialInterval=1s, backoffCoefficient=1.5
  • Slow recovery (minutes): initialInterval=30s, backoffCoefficient=2.0
  • Rate limited APIs: initialInterval=5s, maximumInterval=5m
Set maximumAttempts for costly activities:
RetryPolicy: &temporal.RetryPolicy{
    InitialInterval: time.Second * 5,
    MaximumAttempts: 3, // Only retry twice
}
Don’t waste resources retrying permanent failures:
NonRetryableErrorTypes: []string{ // Go SDK field name
    "ValidationError",
    "AuthenticationError",
    "AuthorizationError",
    "NotFoundError",
}
Prevent infinite retry loops:
ao := workflow.ActivityOptions{
    ScheduleToCloseTimeout: time.Hour, // Max 1 hour of retries
    RetryPolicy: &temporal.RetryPolicy{
        MaximumAttempts: 0, // Unlimited attempts within 1 hour
    },
}

Monitoring Retry Behavior

Key metrics for retry monitoring:
  • activity_task_schedule_to_start_latency - Time waiting for worker (increases with retries)
  • activity_execution_failed - Failed activity attempts (includes retries)
  • activity_execution_failed_total - Activities failed after all retries
  • activity_retry_count - Number of retry attempts per activity

Implementation Details

Retry logic is implemented in:
  • Activity retries: service/history/workflow/activity.go
  • Workflow retries: service/history/workflow/mutable_state_impl.go
  • Backoff calculation: common/backoff/retry.go
  • Timer scheduling: History Service timer queue
Retry state is persisted in workflow mutable state:
type ActivityInfo struct {
    Attempt                 int32
    LastFailure             *failurepb.Failure
    ScheduledTime           *timestamppb.Timestamp
    RetryLastWorkerIdentity string
    // ...
}

See Also

Activities

Learn about activity execution and timeouts

Workflows

Understand workflow execution model

History Service

Deep dive into retry implementation
