Node Statuses
Each node in the DAG tracks its execution state.
WorkerDone is a critical intermediate state. When a worker finishes, the node transitions to WorkerDone (not Done). This frees the tier slot for other workers while the merge runs asynchronously. Once the merge lands, the node advances to Done and fires Merged edges.
Node State Transitions
Here’s the full state machine:
Cancel cascades: Cancelling a node also cancels all transitive dependents that are Pending, Ready, Blocked, or Paused. Done nodes are immune.
Pause does not cascade: Pausing a node only affects that node. Downstream nodes stay in their current state.
Edge Conditions: When Dependencies Fire
Dependencies between nodes can specify when they’re satisfied. This enables overlapping execution.
Example: Overlapping Test Execution
Suppose you have an implementation step and a test step. You want tests to start as soon as implementation begins writing code, not after the merge lands.
- implement transitions to Running → test becomes Ready immediately
- Both workers run in parallel
- When implement finishes and merges, test may still be running
Completed vs Merged
Completed fires when the worker finishes (node is WorkerDone or Done). This is useful for steps that need the worker’s output but don’t care about the merge. Merged fires only after the merge lands (node is Done).
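As a minimal sketch of how these conditions differ, the snippet below models the statuses named on this page and checks whether an edge fires for a given upstream status. The `NodeStatus` and `EdgeCondition` enums, the `Started` variant (an assumed name for the "upstream is Running" condition from the overlapping-test example), and `is_satisfied` are all hypothetical; only the status names and the Completed/Merged semantics come from the text.

```rust
/// Node statuses named in this document (hypothetical enum; the real type may differ).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum NodeStatus {
    Pending,
    Ready,
    Running,
    WorkerDone,
    Done,
    Failed,
    Blocked,
    Paused,
}

/// When a dependency edge counts as satisfied.
/// `Started` is an assumed name, not confirmed by the source.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum EdgeCondition {
    Started,
    Completed,
    Merged,
}

/// Returns true if an edge with `cond` fires given the upstream node's status.
pub fn is_satisfied(cond: EdgeCondition, upstream: NodeStatus) -> bool {
    use NodeStatus::*;
    match cond {
        // Assumption: once the upstream has started, the edge stays satisfied.
        EdgeCondition::Started => matches!(upstream, Running | WorkerDone | Done),
        // The worker has finished; the merge may still be in flight.
        EdgeCondition::Completed => matches!(upstream, WorkerDone | Done),
        // The merge has landed.
        EdgeCondition::Merged => upstream == Done,
    }
}
```

With this model, an upstream node in WorkerDone satisfies a Completed edge but not a Merged edge, which is the gap that lets downstream work begin while the merge is still running.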
Tier-Based Concurrency
The scheduler enforces concurrency limits per model tier. A node’s tier determines:
- Which model handles it
- How many can run concurrently
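One way to picture per-tier limits is a small slot counter keyed by tier name. This is a sketch under stated assumptions, not the scheduler's actual bookkeeping: `TierSlots`, `try_acquire`, `release`, and the tier names and limits used with it are hypothetical.

```rust
use std::collections::HashMap;

/// Per-tier slot accounting (hypothetical type; tier names and limits are illustrative).
pub struct TierSlots {
    limits: HashMap<String, usize>,
    in_use: HashMap<String, usize>,
}

impl TierSlots {
    /// Builds the counter from (tier name, max concurrent workers) pairs.
    pub fn new(limits: &[(&str, usize)]) -> Self {
        TierSlots {
            limits: limits.iter().map(|(t, n)| (t.to_string(), *n)).collect(),
            in_use: HashMap::new(),
        }
    }

    /// Claims a slot if the tier is below its limit. Unknown tiers get no slots.
    pub fn try_acquire(&mut self, tier: &str) -> bool {
        let limit = self.limits.get(tier).copied().unwrap_or(0);
        let used = self.in_use.entry(tier.to_string()).or_insert(0);
        if *used < limit {
            *used += 1;
            true
        } else {
            false
        }
    }

    /// Frees a slot, for example when a node reaches WorkerDone.
    pub fn release(&mut self, tier: &str) {
        if let Some(used) = self.in_use.get_mut(tier) {
            *used = used.saturating_sub(1);
        }
    }
}
```

Releasing the slot at WorkerDone rather than Done mirrors the behavior described under Node Statuses: the merge runs asynchronously and does not hold a worker slot.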
Scheduler Tick Behavior
The scheduler’s tick() method runs periodically (every 3 seconds by default). On each tick:
- Evaluate ready nodes across all active executions
- Check tier limits for each ready node’s tier
- Mark nodes as Running if slots are available
- Return SpawnWorker actions for the coordinator to execute
- Emit TaskBlocked for newly blocked nodes (dependency failed)
- Check for completion: emit ExecutionComplete or ExecutionFailed
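The steps above can be sketched as a single function over one execution. This is a simplified illustration, not the real `tick()`: it assumes nodes already carry a Ready status (dependency evaluation and TaskBlocked emission are left out), and the `Node` struct, the `Action` fields, and the `limits` map are hypothetical.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Status { Pending, Ready, Running, Done, Failed, Blocked }

/// Actions handed back to the coordinator (field shapes are assumptions).
#[derive(Debug, PartialEq, Eq)]
pub enum Action {
    SpawnWorker { step_id: String },
    ExecutionComplete,
    ExecutionFailed,
}

pub struct Node {
    pub step_id: String,
    pub tier: String,
    pub status: Status,
}

/// One simplified tick over a single execution.
pub fn tick(nodes: &mut [Node], limits: &HashMap<String, usize>) -> Vec<Action> {
    // Count the slots already held by running nodes, per tier.
    let mut in_use: HashMap<String, usize> = HashMap::new();
    for n in nodes.iter().filter(|n| n.status == Status::Running) {
        *in_use.entry(n.tier.clone()).or_insert(0) += 1;
    }

    // Promote ready nodes to Running while their tier has free slots.
    let mut actions = Vec::new();
    for n in nodes.iter_mut().filter(|n| n.status == Status::Ready) {
        let limit = limits.get(&n.tier).copied().unwrap_or(0);
        let used = in_use.entry(n.tier.clone()).or_insert(0);
        if *used < limit {
            *used += 1;
            n.status = Status::Running;
            actions.push(Action::SpawnWorker { step_id: n.step_id.clone() });
        }
    }

    // Completion check: everything Done, or nothing left that can make progress.
    if nodes.iter().all(|n| n.status == Status::Done) {
        actions.push(Action::ExecutionComplete);
    } else if !nodes
        .iter()
        .any(|n| matches!(n.status, Status::Pending | Status::Ready | Status::Running))
    {
        actions.push(Action::ExecutionFailed);
    }
    actions
}
```

With a limit of one slot on a tier and two Ready nodes on it, a single tick spawns only the first node; the second stays Ready until a later tick finds a free slot.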
Declaring Dependencies
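When creating an execution, you define steps with their dependencies. Consider a DAG with three steps: design has no dependencies, implement waits for design to merge, and test waits only for implement to start running.
What happens with this DAG?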
- design spawns immediately (no dependencies)
- When the design worker finishes and its merge lands → design is Done
- implement becomes Ready and spawns (if tier slot available)
- As soon as implement transitions to Running, test becomes Ready and spawns
- Both implement and test run in parallel
- When both finish and merge → execution is complete
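The walkthrough can be reproduced with a small readiness check. This is a sketch, assuming a three-step DAG inferred from the walkthrough; the `Step` struct, the `When` enum (including the `Started` name), and `ready_steps` are hypothetical and do not reflect the real declaration syntax.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Status { Pending, Running, WorkerDone, Done }

/// Edge conditions; `Started` is an assumed name for "upstream is Running".
#[derive(Debug, Clone, Copy)]
pub enum When { Started, Completed, Merged }

pub struct Step {
    pub id: &'static str,
    pub depends_on: Vec<(&'static str, When)>,
}

fn fires(when: When, upstream: Status) -> bool {
    match when {
        When::Started => matches!(upstream, Status::Running | Status::WorkerDone | Status::Done),
        When::Completed => matches!(upstream, Status::WorkerDone | Status::Done),
        When::Merged => upstream == Status::Done,
    }
}

/// Pending steps whose dependencies are all satisfied, in declaration order.
pub fn ready_steps(steps: &[Step], status: &HashMap<&str, Status>) -> Vec<&'static str> {
    steps
        .iter()
        .filter(|s| status.get(s.id) == Some(&Status::Pending))
        .filter(|s| {
            s.depends_on
                .iter()
                .all(|(dep, when)| status.get(*dep).map_or(false, |st| fires(*when, *st)))
        })
        .map(|s| s.id)
        .collect()
}

/// The three-step DAG assumed for the walkthrough.
pub fn example_dag() -> Vec<Step> {
    vec![
        Step { id: "design", depends_on: vec![] },
        Step { id: "implement", depends_on: vec![("design", When::Merged)] },
        Step { id: "test", depends_on: vec![("implement", When::Started)] },
    ]
}
```

Starting with every step Pending, only design is ready. Setting design to Done makes implement ready, and setting implement to Running makes test ready, matching the order of the walkthrough.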
Pause, Resume, Cancel
Pause
- Execution-level: Pause(Target::Execution(exec_id)). The scheduler skips this execution on every tick until it is resumed. Running workers continue, but no new workers spawn.
- Node-level: Pause(Target::Node { execution_id, step_id }). Pauses just that node. If the node is Running, the coordinator kills the worker. If it is Ready or Pending, it stays paused.
Resume
- Execution-level: Resume(Target::Execution(exec_id)). The execution re-enters the scheduler. The next tick evaluates ready nodes.
- Node-level: Resume(Target::Node { execution_id, step_id }). Re-evaluates the node’s dependencies. If satisfied → Ready, else → Pending.
Cancel
- Execution-level: Cancel(Target::Execution(exec_id)). Cancels all nodes and removes the execution from the scheduler. Returns (task_id, session_id) pairs for running workers the coordinator must kill.
- Node-level: Cancel(Target::Node { execution_id, step_id }). Cancels the node and all transitive dependents.
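The node-level cancel cascade can be sketched as a breadth-first walk over dependents. This is an illustration under assumptions: the `Cancelled` status name, the `dependents` adjacency map, and `cancel_cascade` are hypothetical, and the sketch assumes the targeted node itself is cancelled even when Running while dependents are cancelled only if they are Pending, Ready, Blocked, or Paused.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Status { Pending, Ready, Running, Done, Blocked, Paused, Cancelled }

/// Cancels `root` and every transitive dependent that has not started.
/// `dependents` maps a step to the steps that depend on it directly.
/// Returns the ids that were cancelled, in breadth-first order.
pub fn cancel_cascade(
    root: &str,
    dependents: &HashMap<String, Vec<String>>,
    status: &mut HashMap<String, Status>,
) -> Vec<String> {
    let mut cancelled = Vec::new();
    let mut seen: HashSet<String> = HashSet::new();
    let mut queue: VecDeque<String> = VecDeque::new();
    queue.push_back(root.to_string());
    seen.insert(root.to_string());

    while let Some(id) = queue.pop_front() {
        let is_root = id == root;
        if let Some(st) = status.get_mut(&id) {
            let cancellable = matches!(
                *st,
                Status::Pending | Status::Ready | Status::Blocked | Status::Paused
            );
            // Assumption: the targeted node is cancelled unless it is Done.
            if cancellable || (is_root && *st != Status::Done) {
                *st = Status::Cancelled;
                cancelled.push(id.clone());
            }
        }
        // Keep walking so dependents further downstream are still visited.
        for next in dependents.get(&id).into_iter().flatten() {
            if seen.insert(next.clone()) {
                queue.push_back(next.clone());
            }
        }
    }
    cancelled
}
```

A node-level pause, by contrast, would change only the single targeted entry, since pausing does not cascade.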
Upstream Outputs
When a node spawns, the scheduler collects outputs from all completed upstream dependencies and passes them to the worker.
Failure Propagation
When a node fails (mark_failed):
- The node’s status becomes Failed
- All transitive dependents transition to Blocked
- The scheduler emits TaskBlocked actions for newly blocked nodes
- If no nodes are Running, Ready, or Pending → execution is Failed
RetryTask { task_id } resets a failed node and all its blocked dependents back to Pending, then re-evaluates them so the node can be re-queued.
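The failure and retry rules can be sketched with a status map and a dependents map. This is a minimal illustration, not the scheduler's implementation: `mark_failed` is named in the text, but its signature here, along with `retry`, `execution_failed`, and `transitive_dependents`, is hypothetical. The retry sketch stops at Pending and leaves the re-evaluation step out.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Status { Pending, Ready, Running, Done, Failed, Blocked }

/// Collects every transitive dependent of `root` (depth-first, deduplicated).
fn transitive_dependents(root: &str, dependents: &HashMap<String, Vec<String>>) -> Vec<String> {
    let mut out: Vec<String> = Vec::new();
    let mut stack: Vec<String> = dependents.get(root).cloned().unwrap_or_default();
    while let Some(id) = stack.pop() {
        if !out.contains(&id) {
            stack.extend(dependents.get(&id).cloned().unwrap_or_default());
            out.push(id);
        }
    }
    out
}

/// Marks `id` Failed and blocks its dependents.
/// Returns the newly blocked ids, one per TaskBlocked action.
pub fn mark_failed(
    id: &str,
    dependents: &HashMap<String, Vec<String>>,
    status: &mut HashMap<String, Status>,
) -> Vec<String> {
    status.insert(id.to_string(), Status::Failed);
    let mut newly_blocked = Vec::new();
    for dep in transitive_dependents(id, dependents) {
        // Report only nodes that were not already Blocked.
        if status.insert(dep.clone(), Status::Blocked) != Some(Status::Blocked) {
            newly_blocked.push(dep);
        }
    }
    newly_blocked
}

/// After a failure, the execution is Failed once nothing is Running, Ready, or Pending.
pub fn execution_failed(status: &HashMap<String, Status>) -> bool {
    let any_failed = status.values().any(|s| *s == Status::Failed);
    let any_live = status
        .values()
        .any(|s| matches!(s, Status::Running | Status::Ready | Status::Pending));
    any_failed && !any_live
}

/// RetryTask: resets the failed node and its blocked dependents to Pending.
pub fn retry(
    id: &str,
    dependents: &HashMap<String, Vec<String>>,
    status: &mut HashMap<String, Status>,
) {
    if status.get(id) != Some(&Status::Failed) {
        return;
    }
    status.insert(id.to_string(), Status::Pending);
    for dep in transitive_dependents(id, dependents) {
        if status.get(&dep) == Some(&Status::Blocked) {
            status.insert(dep, Status::Pending);
        }
    }
}
```

For a chain build → test → deploy, failing build blocks test and deploy and leaves nothing runnable, so the execution counts as failed; retrying build returns all three nodes to Pending.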
Real-World Example
Here’s a realistic multi-step execution with mixed edge conditions:
Next Steps
- Worker Isolation: Learn how workers get isolated filesystem copies
- Merge Queue: Understand the refinery and conflict resolution
