Overview

The History Service is an internal Cadence service responsible for maintaining workflow execution state and event history. It manages the complete lifecycle of workflow executions, handles state transitions, and coordinates with the Matching Service for task dispatch.
Service Location: service/history/handler/interface.go
The History Service API is internal to Cadence. Applications should use the Frontend Service API instead.

Health & Lifecycle

Health

Check service health status.
Health(context.Context) (*types.HealthStatus, error)

Start

Start the history service.
Start()

Stop

Stop the history service gracefully.
Stop()

PrepareToStop

Prepare the service for shutdown.
PrepareToStop(time.Duration) time.Duration

Workflow Execution APIs

StartWorkflowExecution

Internal API to start a new workflow execution.
StartWorkflowExecution(context.Context, *types.HistoryStartWorkflowExecutionRequest) (*types.StartWorkflowExecutionResponse, error)
Request fields:
  • DomainUUID (string, required): UUID of the domain
  • StartRequest (StartWorkflowExecutionRequest, required): The workflow start request from frontend
  • PartitionConfig (map[string]string): Task list partition configuration
Response fields:
  • RunId (string): The unique run ID for this workflow execution
Validation:
  • Checks for duplicate workflow IDs based on WorkflowIdReusePolicy
  • Validates workflow and activity timeouts
  • Ensures domain is active
  • Validates retry policies and cron schedules
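
The duplicate-ID check can be pictured with a minimal sketch. The `ReusePolicy` constants, the `priorRun` struct, and `checkReuse` are hypothetical names chosen for illustration, and the logic is a simplification of how WorkflowIdReusePolicy might be evaluated, not the History Service's actual code.

```go
package main

import "errors"

// ReusePolicy is a hypothetical stand-in for WorkflowIdReusePolicy.
type ReusePolicy int

const (
	AllowDuplicate           ReusePolicy = iota // any closed run may be followed by a new one
	AllowDuplicateFailedOnly                    // only an unsuccessfully closed run may be followed
	RejectDuplicate                             // the workflow ID may never be reused
)

// priorRun describes the most recent run with the same workflow ID (assumed shape).
type priorRun struct {
	Running bool
	Failed  bool // closed without completing successfully
}

var errAlreadyStarted = errors.New("workflow execution already started")

// checkReuse reports whether a new run may start given the previous run, if any.
func checkReuse(policy ReusePolicy, prev *priorRun) error {
	if prev == nil {
		return nil // no earlier run: nothing to conflict with
	}
	if prev.Running {
		return errAlreadyStarted // in this sketch a live run always blocks a duplicate
	}
	switch policy {
	case AllowDuplicate:
		return nil
	case AllowDuplicateFailedOnly:
		if prev.Failed {
			return nil
		}
	}
	return errAlreadyStarted
}
```

For example, `checkReuse(AllowDuplicateFailedOnly, &priorRun{Failed: true})` returns nil, while the same call with `RejectDuplicate` returns the already-started error.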

SignalWorkflowExecution

Send a signal to a running workflow.
SignalWorkflowExecution(context.Context, *types.HistorySignalWorkflowExecutionRequest) error
Request fields:
  • DomainUUID (string, required): UUID of the domain
  • SignalRequest (SignalWorkflowExecutionRequest, required): Signal parameters from frontend

SignalWithStartWorkflowExecution

Signal a workflow, starting it if it doesn’t exist.
SignalWithStartWorkflowExecution(context.Context, *types.HistorySignalWithStartWorkflowExecutionRequest) (*types.StartWorkflowExecutionResponse, error)
Combines the semantics of StartWorkflowExecution and SignalWorkflowExecution in a single atomic operation.

TerminateWorkflowExecution

Terminate a running workflow execution.
TerminateWorkflowExecution(context.Context, *types.HistoryTerminateWorkflowExecutionRequest) error
Request fields:
  • DomainUUID (string, required): UUID of the domain
  • TerminateRequest (TerminateWorkflowExecutionRequest, required): Termination parameters
Effects:
  • Immediately closes the workflow execution
  • Cancels all pending activities and child workflows
  • Records WorkflowExecutionTerminated event
  • No further decisions will be scheduled

RequestCancelWorkflowExecution

Request cancellation of a workflow execution.
RequestCancelWorkflowExecution(context.Context, *types.HistoryRequestCancelWorkflowExecutionRequest) error
Request fields:
  • DomainUUID (string, required): UUID of the domain
  • CancelRequest (RequestCancelWorkflowExecutionRequest, required): Cancellation request parameters
Effects:
  • Records WorkflowExecutionCancelRequested event
  • Schedules a new decision task
  • Workflow can handle cancellation gracefully

ResetWorkflowExecution

Reset a workflow execution to a previous decision task.
ResetWorkflowExecution(context.Context, *types.HistoryResetWorkflowExecutionRequest) (*types.ResetWorkflowExecutionResponse, error)
Request fields:
  • DomainUUID (string, required): UUID of the domain
  • ResetRequest (ResetWorkflowExecutionRequest, required): Reset parameters
Response fields:
  • RunId (string): New run ID after reset
Reset Process:
  1. Terminates current workflow execution
  2. Creates a new execution with history up to reset point
  3. Optionally reapplies signals that occurred after reset point
  4. Schedules a new decision task
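
Steps 2 to 4 of the reset process can be sketched on a toy event list. The `Event` struct and `resetHistory` function are hypothetical, events are assumed to be ordered with IDs starting at 1, and terminating the current run (step 1) is not modeled.

```go
package main

// Event is a simplified history event (assumed shape).
type Event struct {
	ID   int64
	Type string
}

// resetHistory builds the event list for the new run: events up to and
// including resetEventID, optionally the signals recorded after that point,
// and finally a freshly scheduled decision task.
func resetHistory(events []Event, resetEventID int64, reapplySignals bool) []Event {
	var out []Event
	next := int64(1)
	add := func(typ string) {
		out = append(out, Event{ID: next, Type: typ})
		next++
	}
	for _, e := range events {
		if e.ID <= resetEventID {
			add(e.Type) // history before the reset point is kept as-is
		} else if reapplySignals && e.Type == "WorkflowExecutionSignaled" {
			add(e.Type) // later signals are replayed onto the new run
		}
	}
	add("DecisionTaskScheduled")
	return out
}
```

Events after the reset point other than signals are dropped, which is why the workflow needs a new decision task to make progress again.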

Mutable State APIs

GetMutableState

Retrieve the current mutable state of a workflow execution.
GetMutableState(context.Context, *types.GetMutableStateRequest) (*types.GetMutableStateResponse, error)
Request fields:
  • DomainUUID (string, required): UUID of the domain
  • Execution (WorkflowExecution, required): Workflow execution identifier
  • ExpectedNextEventID (int64): Expected next event ID for validation
  • CurrentBranchToken ([]byte): Current history branch token
  • VersionHistoryItem (VersionHistoryItem): Version history for conflict detection
Response fields:
  • Execution (WorkflowExecution): Workflow execution information
  • WorkflowType (WorkflowType): The workflow type
  • NextEventID (int64): Next event ID in the history
  • PreviousStartedEventID (int64): Event ID of previous decision task started
  • TaskList (TaskList): Current task list
  • StickyTaskList (TaskList): Sticky task list if enabled
  • IsWorkflowRunning (bool): Whether the workflow is currently running
  • WorkflowState (int32): Internal workflow state
  • WorkflowCloseState (int32): Close state if workflow is closed
  • VersionHistories (VersionHistories): Version histories for multi-cluster replication

PollMutableState

Long poll for mutable state changes.
PollMutableState(context.Context, *types.PollMutableStateRequest) (*types.PollMutableStateResponse, error)
Used for efficient replication: the call blocks until the mutable state changes or the timeout expires.
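
The blocking behavior of a long poll can be sketched with a channel that is closed on every state change. `stateNotifier` and its methods are hypothetical names, and tracking only the next event ID is an assumption made to keep the example small; the real handler is likely more involved.

```go
package main

import (
	"context"
	"sync"
	"time"
)

// stateNotifier is a hypothetical holder for one workflow's next event ID.
type stateNotifier struct {
	mu          sync.Mutex
	nextEventID int64
	changed     chan struct{} // closed and replaced on every update
}

func newStateNotifier(nextEventID int64) *stateNotifier {
	return &stateNotifier{nextEventID: nextEventID, changed: make(chan struct{})}
}

// Update records a new next event ID and wakes every waiting poller.
func (n *stateNotifier) Update(nextEventID int64) {
	n.mu.Lock()
	defer n.mu.Unlock()
	n.nextEventID = nextEventID
	close(n.changed)
	n.changed = make(chan struct{})
}

// Poll blocks until the next event ID moves past expected or ctx is done.
// The boolean reports whether a change was observed.
func (n *stateNotifier) Poll(ctx context.Context, expected int64) (int64, bool) {
	for {
		n.mu.Lock()
		id, ch := n.nextEventID, n.changed
		n.mu.Unlock()
		if id > expected {
			return id, true
		}
		select {
		case <-ch: // state changed; loop and re-check
		case <-ctx.Done():
			return id, false
		}
	}
}

// PollWithTimeout bounds the wait, as a long-poll handler typically would.
func (n *stateNotifier) PollWithTimeout(expected int64, timeout time.Duration) (int64, bool) {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()
	return n.Poll(ctx, expected)
}
```

A caller that already knows events up to ID 5 would call `n.PollWithTimeout(5, 30*time.Second)` and either receive the newer ID as soon as `Update` runs or get `false` when the timeout expires.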

DescribeMutableState

Get detailed mutable state information for debugging.
DescribeMutableState(context.Context, *types.DescribeMutableStateRequest) (*types.DescribeMutableStateResponse, error)
Response fields:
  • MutableStateInCache (string): JSON representation of in-memory mutable state
  • MutableStateInDatabase (string): JSON representation of persisted mutable state

DescribeWorkflowExecution

Get workflow execution details.
DescribeWorkflowExecution(context.Context, *types.HistoryDescribeWorkflowExecutionRequest) (*types.DescribeWorkflowExecutionResponse, error)

Decision Task APIs

RecordDecisionTaskStarted

Record that a decision task has started.
RecordDecisionTaskStarted(context.Context, *types.RecordDecisionTaskStartedRequest) (*types.RecordDecisionTaskStartedResponse, error)
Request fields:
  • DomainUUID (string, required): UUID of the domain
  • WorkflowExecution (WorkflowExecution, required): Workflow execution identifier
  • ScheduleID (int64, required): Schedule event ID of the decision task
  • TaskID (int64, required): Task ID from the task queue
  • RequestId (string, required): Unique request identifier for idempotency
  • PollRequest (PollForDecisionTaskRequest, required): The original poll request
Response fields:
  • WorkflowType (WorkflowType): The workflow type
  • PreviousStartedEventId (int64): Previous decision task started event ID
  • ScheduledEventId (int64): Decision task scheduled event ID
  • StartedEventId (int64): Decision task started event ID
  • NextEventId (int64): Next event ID to be assigned
  • Attempt (int64): Attempt number for this decision task
  • StickyExecutionEnabled (bool): Whether sticky execution is enabled
  • History (History): Workflow execution history
  • WorkflowExecutionTaskList (TaskList): Normal task list for the workflow
  • EventStoreVersion (int32): Event store version
  • BranchToken ([]byte): Branch token for history events
  • Queries (map[string]WorkflowQuery): Pending queries to be answered

RespondDecisionTaskCompleted

Complete a decision task with decisions.
RespondDecisionTaskCompleted(context.Context, *types.HistoryRespondDecisionTaskCompletedRequest) (*types.HistoryRespondDecisionTaskCompletedResponse, error)
Request fields:
  • DomainUUID (string, required): UUID of the domain
  • CompleteRequest (RespondDecisionTaskCompletedRequest, required): Completion request from worker
Decision Processing:
  1. Validates all decisions
  2. Applies decisions to mutable state
  3. Schedules activity tasks, timers, child workflows
  4. Records DecisionTaskCompleted event
  5. Determines if new decision task is needed
Possible Errors:
  • DecisionTaskFailedCauseUnhandledDecision: Invalid decision type
  • DecisionTaskFailedCauseBadScheduleActivityAttributes: Invalid activity parameters
  • DecisionTaskFailedCauseBadBinary: Workflow worker binary is marked as bad
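
The first two decision-processing steps (validate everything, then apply) can be sketched as follows. `Decision`, `pendingState`, `validate`, and `applyDecisions` are hypothetical and much simpler than the real types; the point is only that validation runs over the whole batch before any state is touched.

```go
package main

import "fmt"

// Decision is a simplified decision (assumed shape; real decisions carry typed attributes).
type Decision struct {
	Type string // e.g. "ScheduleActivityTask" or "StartTimer"
	ID   string // activity ID or timer ID
}

// pendingState is a toy stand-in for mutable state.
type pendingState struct {
	Activities []string
	Timers     []string
}

// validate rejects unknown decision types and decisions without an ID.
func validate(d Decision) error {
	switch d.Type {
	case "ScheduleActivityTask", "StartTimer":
		if d.ID == "" {
			return fmt.Errorf("bad attributes for %s: empty ID", d.Type)
		}
		return nil
	default:
		return fmt.Errorf("unhandled decision type %q", d.Type)
	}
}

// applyDecisions validates every decision before mutating state, so one bad
// decision leaves the state untouched.
func applyDecisions(s *pendingState, decisions []Decision) error {
	for _, d := range decisions {
		if err := validate(d); err != nil {
			return err
		}
	}
	for _, d := range decisions {
		switch d.Type {
		case "ScheduleActivityTask":
			s.Activities = append(s.Activities, d.ID)
		case "StartTimer":
			s.Timers = append(s.Timers, d.ID)
		}
	}
	return nil
}
```

Validating the full batch first is a design choice in this sketch: it avoids having to roll back partially applied decisions when a later one turns out to be invalid.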

RespondDecisionTaskFailed

Report a decision task failure.
RespondDecisionTaskFailed(context.Context, *types.HistoryRespondDecisionTaskFailedRequest) error
Request fields:
  • DomainUUID (string, required): UUID of the domain
  • FailedRequest (RespondDecisionTaskFailedRequest, required): Failure information

ScheduleDecisionTask

Schedule a new decision task for a workflow.
ScheduleDecisionTask(context.Context, *types.ScheduleDecisionTaskRequest) error
Used internally to schedule decision tasks when workflow state changes.

Activity Task APIs

RecordActivityTaskStarted

Record that an activity task has started.
RecordActivityTaskStarted(context.Context, *types.RecordActivityTaskStartedRequest) (*types.RecordActivityTaskStartedResponse, error)
Request fields:
  • DomainUUID (string, required): UUID of the domain
  • WorkflowExecution (WorkflowExecution, required): Workflow execution identifier
  • ScheduleID (int64, required): Schedule event ID of the activity
  • TaskID (int64, required): Task ID from the task queue
  • RequestId (string, required): Unique request identifier
  • PollRequest (PollForActivityTaskRequest, required): The original poll request
Response fields:
  • ScheduledEvent (HistoryEvent): Activity scheduled event
  • StartedTimestamp (int64): Timestamp when activity started
  • Attempt (int64): Retry attempt number
  • ScheduledTimestampOfThisAttempt (int64): When this attempt was scheduled
  • HeartbeatDetails ([]byte): Last recorded heartbeat details
  • WorkflowType (WorkflowType): The workflow type
  • WorkflowDomain (string): Domain name of the workflow

RespondActivityTaskCompleted

Complete an activity task successfully.
RespondActivityTaskCompleted(context.Context, *types.HistoryRespondActivityTaskCompletedRequest) error
Request fields:
  • DomainUUID (string, required): UUID of the domain
  • CompleteRequest (RespondActivityTaskCompletedRequest, required): Completion request from worker
Effects:
  • Records ActivityTaskCompleted event
  • Schedules a new decision task
  • Passes result to workflow for next decision

RespondActivityTaskFailed

Report an activity task failure.
RespondActivityTaskFailed(context.Context, *types.HistoryRespondActivityTaskFailedRequest) error
Request fields:
  • DomainUUID (string, required): UUID of the domain
  • FailedRequest (RespondActivityTaskFailedRequest, required): Failure information
Effects:
  • Records ActivityTaskFailed event
  • Applies retry policy if configured
  • Schedules new decision task

RespondActivityTaskCanceled

Report an activity task cancellation.
RespondActivityTaskCanceled(context.Context, *types.HistoryRespondActivityTaskCanceledRequest) error

RecordActivityTaskHeartbeat

Record a heartbeat for a long-running activity.
RecordActivityTaskHeartbeat(context.Context, *types.HistoryRecordActivityTaskHeartbeatRequest) (*types.RecordActivityTaskHeartbeatResponse, error)
Request fields:
  • DomainUUID (string, required): UUID of the domain
  • HeartbeatRequest (RecordActivityTaskHeartbeatRequest, required): Heartbeat details
Response fields:
  • CancelRequested (bool): Whether the activity has been requested to cancel
Heartbeat Processing:
  • Updates last heartbeat timestamp
  • Stores heartbeat details for retry
  • Checks if cancel was requested
  • Extends activity timeout
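
The four heartbeat steps map onto a small update function. The `activityInfo` struct and `recordHeartbeat` method are hypothetical, and timestamps are plain nanosecond integers here purely to keep the sketch free of dependencies.

```go
package main

// activityInfo is a hypothetical slice of the per-activity mutable state.
// Timestamps and durations are in nanoseconds.
type activityInfo struct {
	LastHeartbeat    int64
	HeartbeatTimeout int64
	Details          []byte
	CancelRequested  bool
}

// recordHeartbeat updates the activity and returns whether cancellation was
// requested, plus the new heartbeat deadline.
func (a *activityInfo) recordHeartbeat(now int64, details []byte) (cancelRequested bool, deadline int64) {
	a.LastHeartbeat = now
	a.Details = append([]byte(nil), details...) // copy so the caller may reuse its buffer
	return a.CancelRequested, now + a.HeartbeatTimeout
}
```

Returning the cancel flag with every heartbeat is what lets a long-running activity notice a cancellation request without a separate call.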

Child Workflow APIs

RecordChildExecutionCompleted

Record the completion of a child workflow execution.
RecordChildExecutionCompleted(context.Context, *types.RecordChildExecutionCompletedRequest) error
Request fields:
  • DomainUUID (string, required): UUID of the parent workflow’s domain
  • WorkflowExecution (WorkflowExecution, required): Parent workflow execution
  • InitiatedID (int64, required): Event ID that initiated the child workflow
  • CompletedExecution (WorkflowExecution, required): Child workflow execution
  • CompletionEvent (HistoryEvent, required): Child workflow completion event

Query APIs

QueryWorkflow

Query a workflow execution.
QueryWorkflow(context.Context, *types.HistoryQueryWorkflowRequest) (*types.HistoryQueryWorkflowResponse, error)
Request fields:
  • DomainUUID (string, required): UUID of the domain
  • Request (QueryWorkflowRequest, required): Query request from frontend
Query Execution:
  1. If workflow has pending decision task, schedules query with that task
  2. Otherwise, creates a new transient decision task for the query
  3. Sends query to matching service
  4. Returns query result or timeout

Replication APIs

ReplicateEventsV2

Replicate workflow events from another cluster.
ReplicateEventsV2(context.Context, *types.ReplicateEventsV2Request) error
Request fields:
  • DomainUUID (string, required): UUID of the domain
  • WorkflowExecution (WorkflowExecution, required): Workflow execution being replicated
  • VersionHistoryItems ([]VersionHistoryItem, required): Version history items
  • Events (DataBlob, required): Serialized history events
  • NewRunEvents (DataBlob): Events for new run (if workflow continued as new)

GetReplicationMessages

Get replication messages for cross-cluster replication.
GetReplicationMessages(context.Context, *types.GetReplicationMessagesRequest) (*types.GetReplicationMessagesResponse, error)

GetDLQReplicationMessages

Get replication messages from the DLQ.
GetDLQReplicationMessages(context.Context, *types.GetDLQReplicationMessagesRequest) (*types.GetDLQReplicationMessagesResponse, error)

NotifyFailoverMarkers

Notify about failover markers for graceful failover.
NotifyFailoverMarkers(context.Context, *types.NotifyFailoverMarkersRequest) error

Cross-Cluster APIs

GetCrossClusterTasks

Get cross-cluster tasks for processing.
GetCrossClusterTasks(context.Context, *types.GetCrossClusterTasksRequest) (*types.GetCrossClusterTasksResponse, error)

RespondCrossClusterTasksCompleted

Acknowledge completion of cross-cluster tasks.
RespondCrossClusterTasksCompleted(context.Context, *types.RespondCrossClusterTasksCompletedRequest) (*types.RespondCrossClusterTasksCompletedResponse, error)

DLQ Management APIs

ReadDLQMessages

Read messages from the dead letter queue.
ReadDLQMessages(context.Context, *types.ReadDLQMessagesRequest) (*types.ReadDLQMessagesResponse, error)

CountDLQMessages

Count messages in the DLQ.
CountDLQMessages(context.Context, *types.CountDLQMessagesRequest) (*types.HistoryCountDLQMessagesResponse, error)

PurgeDLQMessages

Purge messages from the DLQ.
PurgeDLQMessages(context.Context, *types.PurgeDLQMessagesRequest) error

MergeDLQMessages

Merge DLQ messages back into the main queue.
MergeDLQMessages(context.Context, *types.MergeDLQMessagesRequest) (*types.MergeDLQMessagesResponse, error)

Administrative APIs

DescribeHistoryHost

Get information about a history service host.
DescribeHistoryHost(context.Context, *types.DescribeHistoryHostRequest) (*types.DescribeHistoryHostResponse, error)
Request fields:
  • HostAddress (string): Address of the host to describe
  • ShardIdForHost (int32): Shard ID to get info for
  • ExecutionForHost (WorkflowExecution): Get host info for this execution

CloseShard

Close a shard on a history host.
CloseShard(context.Context, *types.CloseShardRequest) error

RemoveTask

Remove a task from a shard.
RemoveTask(context.Context, *types.RemoveTaskRequest) error

ResetStickyTaskList

Reset sticky task list for a workflow.
ResetStickyTaskList(context.Context, *types.HistoryResetStickyTaskListRequest) (*types.HistoryResetStickyTaskListResponse, error)

DescribeQueue

Get information about a task queue.
DescribeQueue(context.Context, *types.DescribeQueueRequest) (*types.DescribeQueueResponse, error)

ResetQueue

Reset a task queue.
ResetQueue(context.Context, *types.ResetQueueRequest) error

RefreshWorkflowTasks

Refresh workflow tasks.
RefreshWorkflowTasks(context.Context, *types.HistoryRefreshWorkflowTasksRequest) error

RemoveSignalMutableState

Remove a signal from mutable state.
RemoveSignalMutableState(context.Context, *types.RemoveSignalMutableStateRequest) error

ReapplyEvents

Reapply events to a workflow execution.
ReapplyEvents(context.Context, *types.HistoryReapplyEventsRequest) error

SyncActivity

Sync activity state across clusters.
SyncActivity(context.Context, *types.SyncActivityRequest) error

SyncShardStatus

Sync shard status across hosts.
SyncShardStatus(context.Context, *types.SyncShardStatusRequest) error

GetFailoverInfo

Get failover information for a domain.
GetFailoverInfo(context.Context, *types.GetFailoverInfoRequest) (*types.GetFailoverInfoResponse, error)

RatelimitUpdate

Update rate limit settings.
RatelimitUpdate(context.Context, *types.RatelimitUpdateRequest) (*types.RatelimitUpdateResponse, error)

State Management

Mutable State

The History Service maintains workflow mutable state in memory and persists it to the database. Mutable state includes:
  • Execution Info: Workflow ID, Run ID, state, timeouts
  • Pending Activities: Scheduled/started activities
  • Pending Decisions: Scheduled/started decision tasks
  • Pending Timers: Active timers
  • Pending Child Workflows: Child workflow executions
  • Pending Signals: Buffered signals
  • Activity Info: Heartbeat details, retry state
  • Timer Info: Timer details and fire times

Event History

Workflow event history is stored in a versioned tree structure:
  • Branch: A sequence of history events
  • Version History: Tracks history across multiple clusters
  • Event Store Version: Schema version for events

Sharding

Workflows are distributed across shards:
  • Shard Count: Typically 1024-16384 shards
  • Shard Assignment: Based on workflow ID hash
  • Shard Ownership: Each shard owned by one history host
  • Shard Movement: Supports dynamic rebalancing
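
Hash-based shard assignment can be sketched in a few lines. The function name `shardForWorkflow` is hypothetical, and FNV-1a from the Go standard library is used here only as a stand-in; the hash function Cadence actually uses may differ.

```go
package main

import "hash/fnv"

// shardForWorkflow maps a workflow ID to a shard in [0, numShards).
// numShards must be positive.
func shardForWorkflow(workflowID string, numShards int) int {
	h := fnv.New32a()
	h.Write([]byte(workflowID)) // Write on a hash.Hash never returns an error
	return int(h.Sum32() % uint32(numShards))
}
```

Because the mapping depends only on the workflow ID and the shard count, every run of the same workflow lands on the same shard, and changing the shard count would remap existing workflows.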

Error Handling

ShardOwnershipLostError

Shard ownership moved to another host.
type ShardOwnershipLostError struct {
    ShardID int32
    Owner   string
}
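
A caller might react to this error by retrying against the reported owner. In the sketch below the struct is repeated so the block is self-contained; the `Error` method and the `newOwner` helper are assumptions added for illustration and are not taken from the Cadence source.

```go
package main

import (
	"errors"
	"fmt"
)

// ShardOwnershipLostError mirrors the struct above; the Error method is an
// assumption added so the type satisfies the error interface.
type ShardOwnershipLostError struct {
	ShardID int32
	Owner   string
}

func (e *ShardOwnershipLostError) Error() string {
	return fmt.Sprintf("shard %d is owned by %s", e.ShardID, e.Owner)
}

// newOwner returns the host to retry against when err reports lost ownership.
func newOwner(err error) (string, bool) {
	var lost *ShardOwnershipLostError
	if errors.As(err, &lost) { // also matches errors wrapped with %w
		return lost.Owner, true
	}
	return "", false
}
```

Using `errors.As` rather than a direct type assertion keeps the check working when intermediate layers wrap the error.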

EventAlreadyStartedError

Attempt to start an already started task.
type EventAlreadyStartedError struct {
    Message string
}

RetryTaskV2Error

Task should be retried with exponential backoff.
type RetryTaskV2Error struct {
    Message      string
    DomainID     string
    WorkflowID   string
    RunID        string
    StartEventID int64
    EndEventID   int64
}
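
The exponential backoff mentioned for this error can be sketched as a delay schedule. `backoffDelay` and its parameters are hypothetical, and the doubling factor and cap are assumptions; the service's actual retry policy is configured elsewhere.

```go
package main

import "time"

// backoffDelay returns base doubled attempt times, capped at limit.
// attempt starts at 0, so the first retry waits exactly base.
func backoffDelay(attempt int, base, limit time.Duration) time.Duration {
	d := base
	for i := 0; i < attempt; i++ {
		if d > limit/2 {
			return limit // doubling again would exceed the cap (and could overflow)
		}
		d *= 2
	}
	if d > limit {
		return limit
	}
	return d
}
```

With a base of 100ms and a limit of 1s, successive attempts wait 100ms, 200ms, 400ms, 800ms, and then 1s for every later attempt.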

Best Practices

Idempotency

All mutating operations use idempotency tokens:
  • RequestId for API calls
  • TaskID for task processing
  • Event IDs for history deduplication
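
Deduplication by request ID can be sketched as a keyed result cache. `requestCache` and its `Do` method are hypothetical; a real implementation would persist the token with the workflow state rather than keep it in an in-memory map.

```go
package main

import "sync"

// requestCache remembers the result produced for each request ID.
type requestCache struct {
	mu      sync.Mutex
	results map[string]string
}

func newRequestCache() *requestCache {
	return &requestCache{results: make(map[string]string)}
}

// Do runs fn at most once per requestID and replays the stored result on retries.
func (c *requestCache) Do(requestID string, fn func() string) string {
	c.mu.Lock()
	defer c.mu.Unlock()
	if r, ok := c.results[requestID]; ok {
		return r // duplicate request: return the original outcome
	}
	r := fn()
	c.results[requestID] = r
	return r
}
```

A client that times out and resends the same RequestId therefore gets the original result back instead of triggering the operation a second time.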

Conflict Resolution

When concurrent operations occur:
  • Use version histories for conflict detection
  • Retry with updated version information
  • Handle CurrentBranchChangedError
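
Conflict detection with version histories comes down to finding where two histories last agree. The sketch below assumes each `VersionHistoryItem` means "events up to EventID were written at Version" and that items are ordered by increasing version; `lowestCommonAncestor` is a hypothetical helper, not the function used in the codebase.

```go
package main

// VersionHistoryItem marks the last event ID written at a given failover version
// (assumed semantics for this sketch).
type VersionHistoryItem struct {
	EventID int64
	Version int64
}

// lowestCommonAncestor returns the last point shared by both histories.
// It walks both lists from the newest item backwards until versions match.
func lowestCommonAncestor(local, remote []VersionHistoryItem) (VersionHistoryItem, bool) {
	i, j := len(local)-1, len(remote)-1
	for i >= 0 && j >= 0 {
		l, r := local[i], remote[j]
		switch {
		case l.Version == r.Version:
			// Same version on both sides: the shorter one bounds the shared prefix.
			if l.EventID > r.EventID {
				return r, true
			}
			return l, true
		case l.Version > r.Version:
			i--
		default:
			j--
		}
	}
	return VersionHistoryItem{}, false
}
```

If the returned item equals the last item of the local history, the remote history simply extends it; otherwise the two have diverged and one branch has to win.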

Performance Optimization

  • Sticky Execution: Keeps mutable state in memory
  • Batch Operations: Group history events
  • Lazy Loading: Load history on demand
  • Event Pagination: Stream large histories
