Overview

Workflows and tasks can be configured to time out after a defined amount of time. Timeouts are essential for preventing workflows from hanging indefinitely and ensuring that resources are not consumed by stalled operations.
When a timeout occurs, runtimes must abruptly interrupt the execution of the workflow/task, and must raise an error that, if uncaught, forces the workflow/task to transition to the faulted status phase.

Timeout Errors

When a timeout occurs, the runtime raises a standardized error:
  • Type: https://serverlessworkflow.io/spec/1.0.0/errors/timeout
  • Status: 408 (Request Timeout)
timeoutError:
  type: https://serverlessworkflow.io/spec/1.0.0/errors/timeout
  status: 408
  title: Timeout
  detail: The operation exceeded the configured timeout duration
  instance: /do/longRunningTask
Timeout errors can be caught and handled like any other error using try-catch blocks.
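Code that receives this error outside the workflow, for instance from a runtime's API, can recognise it by its `type` field. The Python sketch below assumes the error arrives as a plain dictionary with the fields listed above; `is_timeout_error` is a hypothetical helper, not part of any Serverless Workflow SDK.

```python
# Standard timeout error type, as listed above.
TIMEOUT_ERROR_TYPE = "https://serverlessworkflow.io/spec/1.0.0/errors/timeout"

def is_timeout_error(error: dict) -> bool:
    """Hypothetical helper: True when an error object carries the timeout type."""
    return error.get("type") == TIMEOUT_ERROR_TYPE

# Error object mirroring the example above.
error = {
    "type": TIMEOUT_ERROR_TYPE,
    "status": 408,
    "title": "Timeout",
    "detail": "The operation exceeded the configured timeout duration",
    "instance": "/do/longRunningTask",
}

print(is_timeout_error(error))            # True
print(is_timeout_error({"status": 408}))  # False: a 408 status alone is not matched
```

Matching on `type` rather than `status` keeps the check specific to timeouts, since other errors could in principle reuse the same status code.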

Workflow Timeouts

Workflow-level timeouts apply to the entire workflow execution:

Basic Workflow Timeout

document:
  dsl: '1.0.3'
  namespace: default
  name: timed-workflow
  version: '1.0.0'

timeout:
  after:
    minutes: 30

do:
  - step1:
      call: function1
  - step2:
      call: function2
  - step3:
      call: function3
timeout.after (object, required): Duration object specifying when the workflow should time out.

Workflow Timeout with Error Handling

You cannot directly catch a workflow timeout within the workflow itself, but you can design workflows to be resilient:
document:
  dsl: '1.0.3'
  namespace: default
  name: resilient-workflow
  version: '1.0.0'

timeout:
  after:
    hours: 2

do:
  - processWithTimeout:
      # Each task has its own timeout to fail faster
      timeout:
        after:
          minutes: 15
      call: longOperation
  
  - handleSuccess:
      call: successHandler

Task Timeouts

Individual tasks can have their own timeout configuration:

Basic Task Timeout

do:
  - fetchData:
      call: http
      with:
        method: get
        endpoint:
          uri: https://api.example.com/data
      timeout:
        after:
          seconds: 30
timeout.after (object, required): Duration object specifying when the task should time out.

Task Timeout with Error Handling

do:
  - fetchDataWithTimeout:
      try:
        # `try` takes a list of named tasks
        - fetchLargeDataset:
            call: http
            with:
              method: get
              endpoint:
                uri: https://api.example.com/large-dataset
            timeout:
              after:
                minutes: 5
      catch:
        errors:
          with:
            type: https://serverlessworkflow.io/spec/1.0.0/errors/timeout
        do:
          - logTimeout:
              call: logger
              with:
                message: Data fetch timed out after 5 minutes
          - useCachedData:
              call: cache
              with:
                key: last-known-dataset

Multiple Tasks with Different Timeouts

do:
  - quickOperation:
      call: fastService
      timeout:
        after:
          seconds: 5
  
  - mediumOperation:
      call: standardService
      timeout:
        after:
          seconds: 30
  
  - longOperation:
      call: slowService
      timeout:
        after:
          minutes: 10

Duration Formats

Serverless Workflow supports multiple duration units for specifying timeouts:

Duration Units

Unit          Property      Example
Milliseconds  milliseconds  milliseconds: 500
Seconds       seconds       seconds: 30
Minutes       minutes       minutes: 5
Hours         hours         hours: 2
Days          days          days: 1

Duration Examples

Milliseconds

timeout:
  after:
    milliseconds: 500

Seconds

timeout:
  after:
    seconds: 30

Minutes

timeout:
  after:
    minutes: 5

Hours

timeout:
  after:
    hours: 2

Days

timeout:
  after:
    days: 7

Combined Duration Units

You can combine multiple duration units:
timeout:
  after:
    hours: 2
    minutes: 30
    seconds: 45
This creates a timeout of 2 hours, 30 minutes, and 45 seconds (9,045 seconds total).
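The total above can be checked with a short Python sketch. It assumes the duration object's fields map one-to-one onto the keyword arguments of Python's `datetime.timedelta`; the helper name `duration_to_seconds` is hypothetical and only serves this illustration.

```python
from datetime import timedelta

# Units a duration object may carry, per the table above.
ALLOWED_UNITS = {"days", "hours", "minutes", "seconds", "milliseconds"}

def duration_to_seconds(duration: dict) -> float:
    """Hypothetical helper: total seconds represented by a duration object."""
    unknown = set(duration) - ALLOWED_UNITS
    if unknown:
        raise ValueError(f"unsupported duration units: {sorted(unknown)}")
    return timedelta(**duration).total_seconds()

print(duration_to_seconds({"hours": 2, "minutes": 30, "seconds": 45}))  # 9045.0
print(duration_to_seconds({"milliseconds": 500}))                       # 0.5
```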

Complex Duration Example

processLargeDataset:
  call: dataProcessor
  with:
    dataset: ${ .largeDataset }
  timeout:
    after:
      days: 0
      hours: 1
      minutes: 30
      seconds: 0
      milliseconds: 0

Timeout Patterns

Pattern: Fast-Fail with Fallback

do:
  - tryPrimaryWithTimeout:
      try:
        - callPrimary:
            call: http
            with:
              method: get
              endpoint:
                uri: https://primary.example.com/api
            timeout:
              after:
                seconds: 5
      catch:
        errors:
          with:
            type: https://serverlessworkflow.io/spec/1.0.0/errors/timeout
        # The fallback runs inside the catch block, only when the primary call timed out
        do:
          - useFallback:
              call: http
              with:
                method: get
                endpoint:
                  uri: https://fallback.example.com/api
              timeout:
                after:
                  seconds: 10
This pattern attempts a fast operation first, then falls back to an alternative with a longer timeout if the first times out.

Pattern: Progressive Timeout

do:
  - attemptQuick:
      try:
        - runQuick:
            call: processor
            with:
              mode: quick
            timeout:
              after:
                seconds: 10
      catch:
        errors:
          with:
            type: https://serverlessworkflow.io/spec/1.0.0/errors/timeout
        # Each slower mode runs in the catch block of the faster one
        do:
          - attemptStandard:
              try:
                - runStandard:
                    call: processor
                    with:
                      mode: standard
                    timeout:
                      after:
                        seconds: 60
              catch:
                errors:
                  with:
                    type: https://serverlessworkflow.io/spec/1.0.0/errors/timeout
                do:
                  - attemptDeep:
                      call: processor
                      with:
                        mode: deep
                      timeout:
                        after:
                          minutes: 10
This pattern tries progressively slower processing modes with increasing timeouts.
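One consequence worth budgeting for is that the stage timeouts add up when every earlier stage runs to its limit. The Python sketch below sums the values from the example above; the function name `worst_case_before` is hypothetical.

```python
# Stage timeouts from the pattern above, in seconds, in the order they are tried.
stage_timeouts = {"quick": 10, "standard": 60, "deep": 10 * 60}

def worst_case_before(stage: str) -> int:
    """Seconds spent on earlier stages if each one runs until it times out."""
    total = 0
    for name, limit in stage_timeouts.items():
        if name == stage:
            return total
        total += limit
    raise KeyError(stage)

print(worst_case_before("deep"))     # 70
print(sum(stage_timeouts.values()))  # 670
```

Any enclosing workflow timeout needs to be longer than the full sum, or the deepest stage can never run to completion.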

Pattern: Timeout with Retry

reliableOperation:
  try:
    - postData:
        call: http
        with:
          method: post
          endpoint:
            uri: https://api.example.com/process
          body: ${ .data }
        timeout:
          after:
            seconds: 30
  catch:
    errors:
      with:
        type: https://serverlessworkflow.io/spec/1.0.0/errors/timeout
    retry:
      delay:
        seconds: 5
      backoff:
        exponential:
          factor: 2
      limit:
        attempt:
          count: 3
Combining timeouts with retry policies provides robust handling for operations that may occasionally be slow.
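To see how long such a task can run in the worst case, the following Python sketch adds up attempt timeouts and retry delays. It rests on two assumptions that the text above does not state: that `count` bounds the retries made after the initial attempt, and that the wait before retry n is `delay × factor^(n-1)`. Check both against your runtime; the function name is hypothetical.

```python
def worst_case_seconds(attempt_timeout: float, base_delay: float,
                       factor: float, retries: int) -> float:
    """Upper bound on elapsed time when every attempt times out.

    Assumes one initial attempt plus `retries` retries, and that the wait
    before retry n (1-based) is base_delay * factor ** (n - 1).
    """
    delays = [base_delay * factor ** n for n in range(retries)]
    attempts = retries + 1
    return attempts * attempt_timeout + sum(delays)

# Values from the pattern above: 30 s timeout, 5 s delay, factor 2, count 3.
print(worst_case_seconds(30, 5, 2, 3))  # 155
```

Under these assumptions the delays are 5, 10, and 20 seconds, and four attempts of up to 30 seconds each bring the bound to 155 seconds.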

Pattern: Partial Results on Timeout

do:
  - fetchWithTimeout:
      try:
        - fetchCompleteData:
            call: http
            with:
              method: get
              endpoint:
                uri: https://api.example.com/complete-data
            timeout:
              after:
                seconds: 15
      catch:
        errors:
          with:
            type: https://serverlessworkflow.io/spec/1.0.0/errors/timeout
        do:
          - fetchPartialData:
              call: http
              with:
                method: get
                endpoint:
                  uri: https://api.example.com/partial-data
                query:
                  limit: 100
              timeout:
                after:
                  seconds: 5

Pattern: Parallel Operations with Individual Timeouts

do:
  - fetchFromMultipleSources:
      fork:
        compete: true
        branches:
          - source1:
              call: http
              with:
                method: get
                endpoint:
                  uri: https://source1.example.com/api
              timeout:
                after:
                  seconds: 10
          
          - source2:
              call: http
              with:
                method: get
                endpoint:
                  uri: https://source2.example.com/api
              timeout:
                after:
                  seconds: 15
          
          - source3:
              call: http
              with:
                method: get
                endpoint:
                  uri: https://source3.example.com/api
              timeout:
                after:
                  seconds: 20
In compete mode with different timeouts, the fastest source that completes within its timeout wins.
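The rule in that sentence can be modelled in a few lines of Python. This is a simplified sketch with hypothetical response times: it only encodes "fastest branch that finishes within its own timeout" and does not model how a particular runtime reacts when a branch times out before another one completes.

```python
from typing import Optional

def compete_winner(latencies: dict[str, float],
                   timeouts: dict[str, float]) -> Optional[str]:
    """Return the branch that finishes first without exceeding its own timeout.

    `latencies` holds hypothetical response times in seconds; a branch whose
    latency exceeds its timeout is treated as timed out and cannot win.
    """
    finished = {name: t for name, t in latencies.items() if t <= timeouts[name]}
    return min(finished, key=finished.get) if finished else None

# Timeouts from the fork above, in seconds.
timeouts = {"source1": 10, "source2": 15, "source3": 20}

print(compete_winner({"source1": 12, "source2": 14, "source3": 9}, timeouts))   # source3
print(compete_winner({"source1": 12, "source2": 14, "source3": 25}, timeouts))  # source2
print(compete_winner({"source1": 11, "source2": 16, "source3": 21}, timeouts))  # None
```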

Timeout Best Practices

1. Set realistic timeouts
Base timeout values on actual performance measurements, not arbitrary numbers. Monitor your services to understand typical response times.

2. Use task timeouts over workflow timeouts
Prefer setting timeouts on individual tasks rather than the entire workflow. This gives more granular control, and the resulting error points at the task that stalled.

3. Always handle timeout errors
Wrap operations with timeouts in try-catch blocks to handle timeout errors gracefully and provide fallback behavior.

4. Consider different timeout strategies
Use shorter timeouts for fast operations and longer timeouts for heavy processing. Adjust based on the criticality of the operation.

5. Combine with retry policies
When using retries, make sure the worst-case retry time stays below your workflow timeout. That time is the sum of every attempt's duration (bounded by the task timeout) plus every retry delay; with exponential backoff, each delay grows by the backoff factor.

6. Log timeout occurrences
Always log when timeouts occur to help identify performance issues and adjust timeout values accordingly.

7. Test timeout scenarios
Include timeout scenarios in your workflow tests to ensure your error handling works correctly.
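
One way to act on the first practice is to derive a timeout from measured latencies instead of picking a round number. The Python sketch below takes a nearest-rank percentile and multiplies it by a safety margin. The sample data, the choice of percentile, and the 1.5× headroom are illustrative assumptions, and `suggest_timeout` is a hypothetical helper.

```python
import math

def suggest_timeout(samples_s: list[float], percentile: float = 99.0,
                    headroom: float = 1.5) -> float:
    """Nearest-rank percentile of observed latencies, scaled by a safety margin."""
    if not samples_s:
        raise ValueError("need at least one latency sample")
    ordered = sorted(samples_s)
    rank = math.ceil(percentile * len(ordered) / 100)  # 1-based nearest rank
    return ordered[max(rank, 1) - 1] * headroom

# Hypothetical latency measurements, in seconds.
observed = [0.8, 1.1, 0.9, 1.4, 1.0, 2.5, 1.2, 0.7, 1.3, 6.0]

print(suggest_timeout(observed, percentile=90))  # 3.75
print(suggest_timeout(observed))                 # 9.0
```

With only ten samples the 99th percentile is simply the slowest observation, so a realistic measurement window should contain far more data points.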

Timeout Calculation Examples

Example 1: Simple Timeout

timeout:
  after:
    seconds: 30
Total timeout: 30 seconds

Example 2: Combined Units

timeout:
  after:
    minutes: 5
    seconds: 30
Total timeout: (5 × 60) + 30 = 330 seconds (5 minutes 30 seconds)

Example 3: Complex Duration

timeout:
  after:
    hours: 2
    minutes: 15
    seconds: 45
    milliseconds: 500
Total timeout:
  • Hours: 2 × 3600 = 7,200 seconds
  • Minutes: 15 × 60 = 900 seconds
  • Seconds: 45 seconds
  • Milliseconds: 500 milliseconds = 0.5 seconds
  • Total: 8,145.5 seconds (2 hours 15 minutes 45.5 seconds)

Example 4: Days-Based Timeout

timeout:
  after:
    days: 1
    hours: 6
Total timeout: (1 × 86400) + (6 × 3600) = 108,000 seconds (30 hours)

Common Timeout Scenarios

HTTP API Calls

apiCall:
  call: http
  with:
    method: get
    endpoint:
      uri: https://api.example.com/data
  timeout:
    after:
      seconds: 30  # Standard API timeout

Database Queries

databaseQuery:
  call: database
  with:
    query: SELECT * FROM large_table WHERE condition = true
  timeout:
    after:
      minutes: 5  # Longer timeout for complex queries

File Processing

processFile:
  call: fileProcessor
  with:
    file: ${ .largeFile }
  timeout:
    after:
      hours: 1  # Extended timeout for large file processing

Real-time Operations

realtimeCheck:
  call: realtimeService
  with:
    data: ${ .streamData }
  timeout:
    after:
      milliseconds: 500  # Very short timeout for real-time requirements

Batch Processing

batchProcess:
  call: batchProcessor
  with:
    items: ${ .batchItems }
  timeout:
    after:
      days: 1  # Long-running batch job

Monitoring and Debugging Timeouts

Logging Timeout Information

monitoredOperation:
  try:
    - callSlowService:
        call: slowService
        with:
          data: ${ .inputData }
        timeout:
          after:
            minutes: 5
  catch:
    errors:
      with:
        type: https://serverlessworkflow.io/spec/1.0.0/errors/timeout
    as: timeoutError
    do:
      - logTimeoutDetails:
          call: http
          with:
            method: post
            endpoint:
              uri: https://logging.example.com/timeouts
            body:
              workflowId: ${ $workflow.id }
              taskName: ${ $task.name }
              taskReference: ${ $task.reference }
              startedAt: ${ $task.startedAt.iso8601 }
              timeoutDuration: 300  # seconds
              error: ${ $timeoutError }  # the caught error is exposed as a variable named by `as`

Metrics Collection

metricTrackedOperation:
  try:
    - callTrackedService:
        call: trackedService
        timeout:
          after:
            seconds: 30
  catch:
    errors:
      with:
        type: https://serverlessworkflow.io/spec/1.0.0/errors/timeout
    do:
      - incrementTimeoutMetric:
          call: http
          with:
            method: post
            endpoint:
              uri: https://metrics.example.com/increment
            body:
              metric: task_timeouts
              tags:
                service: trackedService
                workflow: ${ $workflow.definition.document.name }

Common Pitfalls

Pitfall 1: Too Short Timeouts

# Bad: Unrealistic timeout
processComplexData:
  call: complexProcessor
  with:
    data: ${ .largeDataset }
  timeout:
    after:
      seconds: 5  # Too short for complex processing

# Good: Realistic timeout
processComplexData:
  call: complexProcessor
  with:
    data: ${ .largeDataset }
  timeout:
    after:
      minutes: 10  # Appropriate for the operation

Pitfall 2: No Timeout Error Handling

# Bad: Timeout will fault the workflow
criticalOperation:
  call: importantService
  timeout:
    after:
      seconds: 30

# Good: Timeout is handled gracefully
criticalOperation:
  try:
    - callImportantService:
        call: importantService
        timeout:
          after:
            seconds: 30
  catch:
    errors:
      with:
        type: https://serverlessworkflow.io/spec/1.0.0/errors/timeout
    do:
      - handleTimeout:
          call: fallbackService

Pitfall 3: Retry Timeout Exceeds Workflow Timeout

# Bad: Total retry time can exceed workflow timeout
timeout:
  after:
    minutes: 5

do:
  - retryingOperation:
      try:
        - callService:
            call: service
      catch:
        errors: {}
        retry:
          delay:
            minutes: 2
          limit:
            attempt:
              count: 5  # 5 retries × 2-minute delay = 10 minutes of waiting alone > 5-minute workflow timeout

# Good: Retry times are within workflow timeout
timeout:
  after:
    minutes: 10

do:
  - retryingOperation:
      try:
        - callService:
            call: service
            timeout:
              after:
                seconds: 30  # Individual attempt timeout
      catch:
        errors: {}
        retry:
          delay:
            seconds: 10
          limit:
            attempt:
              count: 5  # 5 retries × (30 s attempt + 10 s delay) = 200 s worst case < 10-minute workflow timeout
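
The comparison in this pitfall can be automated before deployment. The Python sketch below assumes a constant delay, one initial attempt, and `count` retries after it; both function names are hypothetical, and a runtime that counts attempts differently would shift the result slightly.

```python
def worst_case_retry_seconds(retries: int, delay_s: float,
                             attempt_timeout_s: float = 0.0) -> float:
    """Upper bound for one initial attempt plus `retries` retries at a constant delay.

    Leave attempt_timeout_s at 0 to count only the waiting time between attempts.
    """
    return attempt_timeout_s + retries * (delay_s + attempt_timeout_s)

def fits_workflow_timeout(workflow_timeout_s: float, retries: int,
                          delay_s: float, attempt_timeout_s: float = 0.0) -> bool:
    """True when the worst-case retry time stays below the workflow timeout."""
    return worst_case_retry_seconds(retries, delay_s, attempt_timeout_s) < workflow_timeout_s

# "Bad" example: 5-minute workflow timeout, 5 retries, 2-minute delay.
print(fits_workflow_timeout(5 * 60, retries=5, delay_s=2 * 60))  # False

# "Good" example: 10-minute workflow timeout, 5 retries, 10 s delay, 30 s per attempt.
print(fits_workflow_timeout(10 * 60, retries=5, delay_s=10, attempt_timeout_s=30))  # True
```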
