Interruption is inevitable. Servers restart, deployments roll out, processes crash. Traditional code loses all in-flight state when any of this happens. Durable execution treats interruption as a first-class concern: every meaningful result is persisted before the workflow continues, so when execution resumes it picks up exactly where it left off.
Workflow starts → Executes steps → Each step result is persisted
        ↓
Server crashes → Workflow resumes → Cached results are replayed
        ↓
Execution continues from last incomplete step
Two primitives create durability boundaries: Workflow.step() and Workflow.sleep(). Code inside a step runs non-durably — no step or sleep calls belong inside a step’s execute effect.
Every Workflow.step() call persists its result to Durable Object storage before returning. On any subsequent execution of the same workflow instance, completed steps check the cache first:
Cache miss — the effect runs, the result is stored, and the value is returned.
Cache hit — the effect is skipped entirely and the stored value is returned.
This is what makes replay safe. If a workflow has completed steps A and B before crashing, a resumed execution returns cached results for both without re-running any side effects.
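The cache-first lookup can be sketched in isolation. This is an illustrative in-memory model of the behavior described above, not the library's internals: the `runStep` helper, the `StepCache` type, and the `fetchOrder` stub are all invented for the example.

```typescript
// Minimal sketch of cache-first step execution (hypothetical names).
type StepCache = Map<string, unknown>;

let sideEffectRuns = 0;

// Stand-in side effect: "fetch" an order from an upstream service.
const fetchOrder = (orderId: string) => {
  sideEffectRuns++;
  return { id: orderId, status: "pending" };
};

// Cache hit: return the stored value and skip the effect.
// Cache miss: run the effect, persist the result, then return it.
const runStep = <T>(cache: StepCache, name: string, execute: () => T): T => {
  if (cache.has(name)) return cache.get(name) as T; // hit: effect skipped
  const value = execute(); // miss: effect runs
  cache.set(name, value); // persisted before returning
  return value;
};

// First execution: the effect runs and the result is cached.
const cache: StepCache = new Map();
const first = runStep(cache, "Fetch order", () => fetchOrder("order-1"));

// Simulated replay after a crash: same cache, the effect is skipped.
const replayed = runStep(cache, "Fetch order", () => fetchOrder("order-1"));
```

Running the "workflow" twice against the same cache leaves the side-effect counter at one, which is exactly the replay-safety property the text describes.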
const order = yield* Workflow.step({
  name: "Fetch order",
  execute: fetchOrder(orderId), // only runs if not already cached
});
Cached step results include metadata alongside the value:
interface CachedStepResult<T> {
  value: T
  meta: {
    completedAt: number
    attempt: number
    durationMs: number
  }
}
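As a quick illustration of what this metadata enables, here is a sketch that constructs a cached entry and derives simple observability signals from it. The field names come from the interface above; the concrete values and the two derived checks are invented for the example.

```typescript
// Shape from the interface above; values are invented.
interface CachedStepResult<T> {
  value: T;
  meta: { completedAt: number; attempt: number; durationMs: number };
}

const cached: CachedStepResult<{ id: string }> = {
  value: { id: "order-1" },
  meta: { completedAt: 1_700_000_000_000, attempt: 2, durationMs: 840 },
};

// The metadata is handy for observability, e.g. spotting retried or slow steps.
const wasRetried = cached.meta.attempt > 1;
const wasSlow = cached.meta.durationMs > 500; // hypothetical 500 ms budget
```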
Step results must be JSON-serializable. If your effect returns a class instance, ORM result, or other non-serializable value, map it to a plain object or use Effect.asVoid to discard the result before the step completes.
// Map to serializable shape
yield* Workflow.step({
  name: "Create order",
  execute: createOrder(data).pipe(
    Effect.map((order) => ({ id: order.id, status: order.status }))
  ),
});

// Or discard the result entirely
yield* Workflow.step({
  name: "Update database",
  execute: updateRecord(id).pipe(Effect.asVoid),
});
Workflow.sleep() records the pause and schedules a Durable Object alarm for the wake time; the Durable Object then goes idle. When the alarm fires, the orchestrator resumes the workflow in resume mode, replaying all cached steps and continuing past the sleep point.
// Short delays for rate limiting
yield* Workflow.sleep("30 seconds");

// Wait a full day — durable across restarts
yield* Workflow.sleep("24 hours");

// Subscription renewal
yield* Workflow.sleep("30 days");
Retry delays work the same way: a failed step with retry config throws a PauseSignal with reason: "retry", schedules an alarm, and resumes when the delay expires.
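To make the retry-as-pause mechanic concrete, here is a hypothetical sketch of how a failed attempt could map to an alarm time. The actual delay comes from whatever the step's retry config specifies; the exponential backoff formula here (1 s base, doubling, 60 s cap) and both function names are invented.

```typescript
// Hypothetical backoff policy: 1s base, doubling per attempt, 60s cap.
const retryDelayMs = (attempt: number): number =>
  Math.min(1_000 * 2 ** (attempt - 1), 60_000);

// When a step fails with retries remaining, the engine pauses with
// reason "retry" and schedules an alarm for now + delay (sketch only).
const scheduleRetryAlarm = (nowMs: number, attempt: number) => ({
  reason: "retry" as const,
  wakeAt: nowMs + retryDelayMs(attempt),
});

const alarm = scheduleRetryAlarm(0, 3); // third attempt
```

The point is that a retry delay is not an in-process timer: it is a persisted pause plus an alarm, so it survives restarts exactly like a sleep.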
When the executor runs a workflow definition, it operates in one of three modes:
| Mode | When used | Behavior |
| --- | --- | --- |
| fresh | First execution of an instance | No cached data; execute everything |
| resume | After a scheduled pause (sleep or retry delay) | Replay cached steps; continue from the pause point |
| recover | After an infrastructure failure | Same as resume, but triggered by the recovery system |
In resume and recover modes the workflow function runs from the top, but every completed step returns its cached result immediately without executing the underlying effect.
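A sketch of how an executor might pick the mode from stored instance state follows. The `Status` values and the `pickMode` function are illustrative assumptions, not the library's actual internals; the decision logic mirrors the table above.

```typescript
// Illustrative mode selection (hypothetical names and state shape).
type Status = "NotStarted" | "Running" | "Paused" | "Completed" | "Failed";
type Mode = "fresh" | "resume" | "recover";

const pickMode = (status: Status, wokenByRecovery: boolean): Mode => {
  if (status === "NotStarted") return "fresh"; // no cached data yet
  if (wokenByRecovery) return "recover"; // triggered by the recovery system
  return "resume"; // woken by a scheduled pause (sleep / retry delay)
};
```

Note that the distinction between resume and recover is only about what triggered the wake-up; the replay behavior inside the workflow function is identical.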
When a Durable Object restarts (process crash, deployment, eviction), any workflow that was Running never got the chance to record a Completed or Paused transition. The recovery system detects this on startup:
1. The engine’s constructor runs RecoveryManager.checkAndScheduleRecovery().
2. The recovery manager reads the current status. If it is Running and lastUpdated is older than the stale threshold (default: 30 s), the workflow is considered stale.
3. A short-delay alarm is scheduled. When it fires, the orchestrator re-executes the workflow in recover mode.
Because all completed steps are cached in DO storage, replay is safe and the workflow continues from the last incomplete step.
Recovery attempt count is bounded by maxRecoveryAttempts (configurable in createDurableWorkflows). If the limit is exceeded the workflow transitions to Failed.
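The staleness check and the recovery bound can be sketched together. The 30 s default threshold and the maxRecoveryAttempts limit come from the text above; the `StoredState` shape and both function names are assumptions made for the example.

```typescript
// Assumed shape of persisted instance state (illustrative only).
interface StoredState {
  status: "Running" | "Paused" | "Completed" | "Failed";
  lastUpdated: number; // epoch ms of the last recorded transition
  recoveryAttempts: number;
}

const STALE_THRESHOLD_MS = 30_000; // default stale threshold from the text

// A workflow is stale if it is still marked Running but has not
// recorded a transition within the threshold.
const isStale = (state: StoredState, nowMs: number): boolean =>
  state.status === "Running" && nowMs - state.lastUpdated > STALE_THRESHOLD_MS;

// Recovery is bounded: past the limit the workflow is marked Failed.
const nextAction = (
  state: StoredState,
  nowMs: number,
  maxRecoveryAttempts: number
): "none" | "recover" | "fail" => {
  if (!isStale(state, nowMs)) return "none";
  if (state.recoveryAttempts >= maxRecoveryAttempts) return "fail";
  return "recover";
};
```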
Only Workflow.step() and Workflow.sleep() create durability checkpoints. Code written directly in the workflow body between steps runs non-durably on each execution:
const orderWorkflow = Workflow.make((orderId: string) =>
  Effect.gen(function* () {
    // ✅ Durable — result is cached
    const order = yield* Workflow.step({
      name: "Fetch order",
      execute: fetchOrder(orderId),
    });

    // ✅ Durable — pause is recorded, alarm is scheduled
    yield* Workflow.sleep("24 hours");

    // ✅ Durable — result is cached
    yield* Workflow.step({
      name: "Charge card",
      execute: chargeCard(order),
    });
  })
);
Do not call Workflow.step() or Workflow.sleep() inside the execute effect of another step. The library enforces this with both a compile-time guard (WorkflowLevel context) and a runtime check (StepScope). Violating this constraint will cause an error.
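The runtime side of that check can be modeled with a simple flag. The real library uses a StepScope context; this stand-in `step` function and its flag are invented purely to show the shape of the guard.

```typescript
// Illustrative stand-in for the runtime nesting check (not StepScope itself).
let insideStep = false;

const step = <T>(name: string, execute: () => T): T => {
  if (insideStep) {
    throw new Error(`step("${name}") called inside another step`);
  }
  insideStep = true;
  try {
    return execute(); // the step body must not open another step
  } finally {
    insideStep = false; // always release the scope, even on failure
  }
};

// Top-level step: allowed.
const ok = step("outer", () => 1);

// Nested step: rejected at runtime.
let nestedError: Error | undefined;
try {
  step("outer", () => step("inner", () => 2));
} catch (e) {
  nestedError = e as Error;
}
```

The try/finally matters: releasing the flag on failure is what lets a later retry of the step open a fresh scope.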