The Loom agent uses an explicit, event-driven state machine to manage conversation flow and tool execution. This design provides predictable behavior, clear ownership of context, graceful error recovery, and clean separation between state logic and I/O operations.

Design Principles

Predictable Behavior

All state transitions are explicit and testable with exhaustive pattern matching.

Clear Ownership

Each state carries its required context (conversation, retries, etc.).

Graceful Recovery

Built-in retry mechanisms with bounded attempts and backoff.

Clean Separation

State machine logic is synchronous and pure; caller manages async I/O.

State Machine Overview

The state machine receives AgentEvents and returns AgentActions that the caller must execute. This inversion of control allows the caller to manage async operations (LLM calls, tool execution) while the state machine remains synchronous and pure.
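As a rough illustration of this inversion of control, the sketch below reduces the machine to a pure function from (state, event) to (state, action). The `State`, `Event`, `Action`, and `step` names are simplified stand-ins invented for this example, not Loom's real types, and `ProcessingLlmResponse` is collapsed into a single step.

```rust
// Simplified stand-ins for AgentState / AgentEvent / AgentAction.
// Payloads are reduced to plain strings for illustration only.
#[derive(Debug, Clone, PartialEq)]
enum State {
    WaitingForUserInput,
    CallingLlm,
    ShuttingDown,
}

#[derive(Debug, Clone, PartialEq)]
enum Event {
    UserInput(String),
    LlmCompleted,
    ShutdownRequested,
}

#[derive(Debug, Clone, PartialEq)]
enum Action {
    SendLlmRequest(String),
    WaitForInput,
    Shutdown,
}

/// Pure transition function: no I/O and no async, only a decision
/// about the next state and what the caller should do.
fn step(state: State, event: Event) -> (State, Action) {
    match (state, event) {
        // Shutdown is accepted from any state.
        (_, Event::ShutdownRequested) => (State::ShuttingDown, Action::Shutdown),
        (State::WaitingForUserInput, Event::UserInput(text)) => {
            (State::CallingLlm, Action::SendLlmRequest(text))
        }
        // Text-only completion: go back to waiting for the user.
        (State::CallingLlm, Event::LlmCompleted) => {
            (State::WaitingForUserInput, Action::WaitForInput)
        }
        // Any other combination leaves the state unchanged.
        (state, _) => (state, Action::WaitForInput),
    }
}
```

Because `step` performs no I/O, it can be called from a test, a replay tool, or an async runtime without modification.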

States

AgentState Enum

pub enum AgentState {
    WaitingForUserInput {
        conversation: ConversationContext,
    },
    CallingLlm {
        conversation: ConversationContext,
        retries: u32,
    },
    ProcessingLlmResponse {
        conversation: ConversationContext,
        response: LlmResponse,
    },
    ExecutingTools {
        conversation: ConversationContext,
        executions: Vec<ToolExecutionStatus>,
    },
    PostToolsHook {
        conversation: ConversationContext,
        pending_llm_request: LlmRequest,
        completed_tools: Vec<CompletedToolInfo>,
    },
    Error {
        conversation: ConversationContext,
        error: AgentError,
        retries: u32,
        origin: ErrorOrigin,
    },
    ShuttingDown,
}

State Details

WaitingForUserInput

Initial and terminal state for user turns. The agent is idle and awaits user input. The conversation context preserves all prior messages.

Transitions:
  • UserInput → CallingLlm
  • ShutdownRequested → ShuttingDown

CallingLlm

Active LLM request in flight. The retries counter tracks how many retry attempts have been made for the current request.

Transitions:
  • TextDelta → CallingLlm (streaming text)
  • ToolCallDelta → CallingLlm (streaming tool call)
  • Completed → ProcessingLlmResponse
  • Error (retries < max) → Error
  • Error (retries >= max) → WaitingForUserInput
  • ShutdownRequested → ShuttingDown

ProcessingLlmResponse

Transient state for examining an LLM response. Immediately transitions to either ExecutingTools (if tool calls are present) or WaitingForUserInput (if the response is text-only).

Transitions:
  • Has tool calls → ExecutingTools
  • No tool calls → WaitingForUserInput
  • ShutdownRequested → ShuttingDown

ExecutingTools

Tracks multiple concurrent tool executions. Each execution progresses through Pending → Running → Completed. The state tracks all executions via Vec<ToolExecutionStatus>.

Transitions:
  • ToolCompleted (some pending) → ExecutingTools
  • ToolCompleted (all done, mutating tools) → PostToolsHook
  • ToolCompleted (all done, no mutation) → CallingLlm
  • ShutdownRequested → ShuttingDown

PostToolsHook

Runs post-tool hooks after tool execution. This state enables features like auto-commit that need to run after file-modifying tools (e.g., edit_file, bash).

Fields:
  • pending_llm_request - The next LLM request to send after hooks complete
  • completed_tools - Information about which tools completed (for hook decision-making)

Transitions:
  • PostToolsHookCompleted → CallingLlm
  • ShutdownRequested → ShuttingDown

Error

Holds failed state with retry information. The origin field (Llm, Tool, or Io) determines the retry strategy.

Transitions:
  • RetryTimeoutFired → CallingLlm
  • ShutdownRequested → ShuttingDown

ShuttingDown

Terminal state. No transitions out. The agent should be dropped after reaching this state.
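The Error state waits for a RetryTimeoutFired event, which implies the caller schedules a timer. The document mentions bounded attempts and backoff but not the schedule, so the sketch below assumes exponential backoff with a cap. The constants and the `retry_delay` helper are hypothetical and chosen only for illustration.

```rust
use std::time::Duration;

// Hypothetical values; the real limits are not given in this document.
const MAX_RETRIES: u32 = 3;
const BASE_DELAY_MS: u64 = 500;
const MAX_DELAY_MS: u64 = 8_000;

/// Returns how long to wait before retry number `retries` (0-based),
/// or `None` once the retry budget is exhausted.
fn retry_delay(retries: u32) -> Option<Duration> {
    if retries >= MAX_RETRIES {
        return None;
    }
    // Double the delay on each attempt; saturate rather than overflow.
    let factor = 1u64.checked_shl(retries).unwrap_or(u64::MAX);
    let ms = BASE_DELAY_MS.saturating_mul(factor).min(MAX_DELAY_MS);
    Some(Duration::from_millis(ms))
}
```

A caller could sleep for the returned duration and then feed RetryTimeoutFired back into the state machine. When the helper returns `None`, the error would be surfaced to the user instead.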

Events

AgentEvent Enum

pub enum AgentEvent {
    UserInput(Message),
    LlmEvent(LlmEvent),
    ToolProgress(ToolProgressEvent),
    ToolCompleted {
        call_id: String,
        outcome: ToolExecutionOutcome,
    },
    PostToolsHookCompleted {
        action_taken: bool,
    },
    RetryTimeoutFired,
    ShutdownRequested,
}

LlmEvent Sub-variants

pub enum LlmEvent {
    TextDelta {
        content: String,
    },
    ToolCallDelta {
        call_id: String,
        tool_name: String,
        arguments_fragment: String,
    },
    Completed(LlmResponse),
    Error(LlmError),
}

ToolExecutionOutcome

pub enum ToolExecutionOutcome {
    Success {
        call_id: String,
        output: serde_json::Value,
    },
    Error {
        call_id: String,
        error: ToolError,
    },
}
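To show how a ToolCompleted event might update the execution list held by ExecutingTools, here is a minimal sketch. `ToolStatus`, `Next`, and `on_tool_completed` are invented for this example. The real ToolExecutionStatus type is not shown in this document, and the `mutating` flag is an assumption about how mutating tools are detected.

```rust
// Simplified stand-in for ToolExecutionStatus.
#[derive(Debug, Clone, PartialEq)]
enum ToolStatus {
    Pending { call_id: String },
    Running { call_id: String },
    Completed { call_id: String, mutating: bool },
}

impl ToolStatus {
    fn call_id(&self) -> &str {
        match self {
            ToolStatus::Pending { call_id }
            | ToolStatus::Running { call_id }
            | ToolStatus::Completed { call_id, .. } => call_id.as_str(),
        }
    }
}

/// Where the state machine should go after a tool finishes.
#[derive(Debug, PartialEq)]
enum Next {
    StillExecuting,
    PostToolsHook,
    CallingLlm,
}

/// Marks `call_id` as completed, then decides the next state:
/// keep waiting, run hooks (some tool mutated), or call the LLM again.
fn on_tool_completed(executions: &mut [ToolStatus], call_id: &str, mutating: bool) -> Next {
    if let Some(slot) = executions.iter_mut().find(|e| e.call_id() == call_id) {
        *slot = ToolStatus::Completed { call_id: call_id.to_string(), mutating };
    }
    let all_done = executions
        .iter()
        .all(|e| matches!(e, ToolStatus::Completed { .. }));
    if !all_done {
        Next::StillExecuting
    } else if executions
        .iter()
        .any(|e| matches!(e, ToolStatus::Completed { mutating: true, .. }))
    {
        Next::PostToolsHook
    } else {
        Next::CallingLlm
    }
}
```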

Actions

AgentAction Enum

Actions are returned to the caller indicating what I/O operation to perform:
pub enum AgentAction {
    SendLlmRequest(LlmRequest),
    ExecuteTools(Vec<ToolCall>),
    RunPostToolsHook {
        completed_tools: Vec<CompletedToolInfo>,
    },
    WaitForInput,
    DisplayMessage(String),
    DisplayError(String),
    Shutdown,
}
The caller is responsible for executing actions and feeding events back into the state machine via agent.handle_event(event).
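The feedback loop between caller and state machine can be sketched as follows. This is a synchronous toy, not Loom's driver: the `Agent`, `State`, `Event`, and `Action` types are cut-down stand-ins, and the "LLM" is a hypothetical echo function standing in for real async I/O.

```rust
use std::collections::VecDeque;

#[derive(Debug, Clone, Copy, PartialEq)]
enum State { Waiting, CallingLlm, ShuttingDown }

#[derive(Debug, PartialEq)]
enum Event { UserInput(String), LlmCompleted(String), ShutdownRequested }

#[derive(Debug, PartialEq)]
enum Action { SendLlmRequest(String), WaitForInput, Shutdown }

struct Agent { state: State }

impl Agent {
    /// Synchronous: updates the state and tells the caller what to do next.
    fn handle_event(&mut self, event: Event) -> Action {
        match (self.state, event) {
            (_, Event::ShutdownRequested) => {
                self.state = State::ShuttingDown;
                Action::Shutdown
            }
            (State::Waiting, Event::UserInput(text)) => {
                self.state = State::CallingLlm;
                Action::SendLlmRequest(text)
            }
            (State::CallingLlm, Event::LlmCompleted(_)) => {
                self.state = State::Waiting;
                Action::WaitForInput
            }
            _ => Action::WaitForInput,
        }
    }
}

/// Drives the agent: executes each action and feeds the resulting
/// event back in. Returns the replies produced by the fake LLM.
fn run(agent: &mut Agent, inputs: Vec<String>) -> Vec<String> {
    let mut replies = Vec::new();
    let mut queue: VecDeque<Event> = inputs.into_iter().map(Event::UserInput).collect();
    queue.push_back(Event::ShutdownRequested);
    while let Some(event) = queue.pop_front() {
        match agent.handle_event(event) {
            Action::SendLlmRequest(prompt) => {
                // A real caller would await the LLM client here.
                let reply = format!("echo: {}", prompt);
                replies.push(reply.clone());
                queue.push_front(Event::LlmCompleted(reply));
            }
            Action::WaitForInput => {}
            Action::Shutdown => break,
        }
    }
    replies
}
```

All I/O decisions live in `run`; swapping the echo for a real client would not require touching `handle_event`.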

State Transitions

Transition Table

| Current State | Event | New State | Action |
| --- | --- | --- | --- |
| WaitingForUserInput | UserInput(msg) | CallingLlm | SendLlmRequest |
| CallingLlm | LlmEvent::TextDelta | CallingLlm | DisplayMessage |
| CallingLlm | LlmEvent::ToolCallDelta | CallingLlm | WaitForInput |
| CallingLlm | LlmEvent::Completed | ProcessingLlmResponse | (internal) |
| CallingLlm | LlmEvent::Error (retries < max) | Error | WaitForInput |
| CallingLlm | LlmEvent::Error (retries >= max) | WaitingForUserInput | DisplayError |
| ProcessingLlmResponse | (has tool calls) | ExecutingTools | ExecuteTools |
| ProcessingLlmResponse | (no tool calls) | WaitingForUserInput | WaitForInput |
| ExecutingTools | ToolCompleted (some pending) | ExecutingTools | WaitForInput |
| ExecutingTools | ToolCompleted (all done, mutating) | PostToolsHook | RunPostToolsHook |
| ExecutingTools | ToolCompleted (all done, no mutation) | CallingLlm | SendLlmRequest |
| PostToolsHook | PostToolsHookCompleted | CallingLlm | SendLlmRequest |
| Error (origin=Llm) | RetryTimeoutFired | CallingLlm | SendLlmRequest |
| any state | ShutdownRequested | ShuttingDown | Shutdown |
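The retry rows of the table can be expressed with match guards. The sketch below covers only the CallingLlm / Error loop, using simplified types. `MAX_RETRIES` is a hypothetical value, and incrementing the counter when the retry timer fires is an assumption, since the document does not say where the increment happens.

```rust
// Hypothetical retry budget; the real value is not given in this document.
const MAX_RETRIES: u32 = 2;

#[derive(Debug, Clone, Copy, PartialEq)]
enum State {
    WaitingForUserInput,
    CallingLlm { retries: u32 },
    Error { retries: u32 },
}

#[derive(Debug, PartialEq)]
enum Event {
    LlmError(String),
    RetryTimeoutFired,
}

#[derive(Debug, PartialEq)]
enum Action {
    SendLlmRequest,
    WaitForInput,
    DisplayError(String),
}

fn step(state: State, event: Event) -> (State, Action) {
    match (state, event) {
        // Budget left: park in Error and wait for the retry timer.
        (State::CallingLlm { retries }, Event::LlmError(_)) if retries < MAX_RETRIES => {
            (State::Error { retries }, Action::WaitForInput)
        }
        // Budget exhausted: give up and report to the user.
        (State::CallingLlm { .. }, Event::LlmError(msg)) => {
            (State::WaitingForUserInput, Action::DisplayError(msg))
        }
        // Timer fired: re-issue the request with an incremented counter.
        (State::Error { retries }, Event::RetryTimeoutFired) => {
            (State::CallingLlm { retries: retries + 1 }, Action::SendLlmRequest)
        }
        (state, _) => (state, Action::WaitForInput),
    }
}
```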

Implementation

Core Method

impl Agent {
    pub fn handle_event(
        &mut self,
        event: AgentEvent
    ) -> AgentResult<AgentAction> {
        // Pattern match on (current_state, event)
        // Update self.state
        // Return action for caller to execute
    }
}
The handle_event method is synchronous and returns immediately. No async operations are performed inside the state machine.

Example Usage

let mut agent = Agent::new(config);

// User sends a message
let action = agent.handle_event(AgentEvent::UserInput(message))?;
match action {
    AgentAction::SendLlmRequest(request) => {
        // Execute async LLM call
        let mut stream = llm_client.complete_streaming(request).await?;

        // Feed stream events back to state machine
        while let Some(event) = stream.next().await {
            let action = agent.handle_event(AgentEvent::LlmEvent(event))?;
            // Handle action...
        }
    }
    AgentAction::ExecuteTools(tool_calls) => {
        // Tools run one at a time here for brevity; a caller may run them concurrently
        for tool_call in tool_calls {
            // Copy the id first: execute_tool takes ownership of tool_call
            let call_id = tool_call.id.clone();
            let outcome = execute_tool(tool_call).await?;
            let action = agent.handle_event(AgentEvent::ToolCompleted {
                call_id,
                outcome,
            })?;
            // Handle action...
        }
    }
    _ => {}
}

Design Decisions

Why Explicit State Machine vs Implicit

  1. Testability - Every state and transition can be unit tested in isolation. Property-based tests verify invariants like “shutdown always succeeds from any state”.
  2. Debuggability - State transitions are logged with tracing::info!, making it easy to trace agent behavior in production.
  3. No Hidden State - All context is carried explicitly in state variants. There are no ambient flags or mutable fields that could get out of sync.
  4. Exhaustive Matching - Rust’s match ensures all state/event combinations are handled. New events or states trigger compiler errors until addressed.

Why Events Are Processed Synchronously

The handle_event method is synchronous and returns immediately:
pub fn handle_event(&mut self, event: AgentEvent) -> AgentResult<AgentAction>
Rationale:
  1. Separation of Concerns - The state machine decides what to do; the caller decides how to do it (async, parallel, etc.)
  2. Backpressure - The caller controls the pace of event delivery. No internal queues or background tasks
  3. Determinism - Given the same sequence of events, the state machine produces the same sequence of actions (essential for testing and replay)
  4. Flexibility - The caller can implement different execution strategies (single-threaded, tokio, async-std) without changing the state machine
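The determinism point lends itself to a replay check: run a recorded event log through the transition function twice and compare the actions. The generic `replay` helper and the toy `step` function below are assumptions made for this sketch, not part of Loom.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum State { Waiting, CallingLlm, ShuttingDown }

#[derive(Debug, Clone, PartialEq)]
enum Event { UserInput(String), LlmCompleted, ShutdownRequested }

#[derive(Debug, Clone, PartialEq)]
enum Action { SendLlmRequest(String), WaitForInput, Shutdown }

/// Toy pure transition function standing in for handle_event.
fn step(state: State, event: Event) -> (State, Action) {
    match (state, event) {
        (_, Event::ShutdownRequested) => (State::ShuttingDown, Action::Shutdown),
        (State::Waiting, Event::UserInput(text)) => {
            (State::CallingLlm, Action::SendLlmRequest(text))
        }
        (State::CallingLlm, Event::LlmCompleted) => (State::Waiting, Action::WaitForInput),
        (state, _) => (state, Action::WaitForInput),
    }
}

/// Replays a recorded event log through any pure transition function
/// and collects the actions it produces, in order.
fn replay<S, E: Clone, A>(
    initial: S,
    events: &[E],
    step: impl Fn(S, E) -> (S, A),
) -> Vec<A> {
    let mut state = initial;
    let mut actions = Vec::with_capacity(events.len());
    for event in events {
        let (next, action) = step(state, event.clone());
        state = next;
        actions.push(action);
    }
    actions
}
```

Two replays of the same log should produce equal action vectors. A mismatch would point to hidden state or non-determinism in the transition logic.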

How Conversation Context Is Threaded

Each state variant carries its own ConversationContext:
pub enum AgentState {
    WaitingForUserInput {
        conversation: ConversationContext,
    },
    CallingLlm {
        conversation: ConversationContext,
        retries: u32,
    },
    // ...
}
During transitions, the context is cloned and updated:
  • UserInput → message appended to conversation
  • LlmEvent::Completed → assistant message appended
  • ToolCompleted (all done) → tool result messages appended
This ensures the conversation history is always consistent with the current state.
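A minimal sketch of this clone-and-update threading, under simplified assumptions: the `Role`, `Message`, and `with_message` names are invented for the example, and only the UserInput and text-only completion transitions are shown.

```rust
#[derive(Debug, Clone, PartialEq)]
enum Role { User, Assistant }

#[derive(Debug, Clone, PartialEq)]
struct Message { role: Role, content: String }

#[derive(Debug, Clone, Default, PartialEq)]
struct ConversationContext { messages: Vec<Message> }

impl ConversationContext {
    /// Returns an updated copy and leaves the original untouched.
    fn with_message(&self, role: Role, content: &str) -> Self {
        let mut next = self.clone();
        next.messages.push(Message { role, content: content.to_string() });
        next
    }
}

#[derive(Debug, PartialEq)]
enum AgentState {
    WaitingForUserInput { conversation: ConversationContext },
    CallingLlm { conversation: ConversationContext, retries: u32 },
}

/// UserInput: append the user message and move to CallingLlm.
fn on_user_input(state: &AgentState, text: &str) -> Option<AgentState> {
    match state {
        AgentState::WaitingForUserInput { conversation } => Some(AgentState::CallingLlm {
            conversation: conversation.with_message(Role::User, text),
            retries: 0,
        }),
        _ => None,
    }
}

/// Text-only completion: append the assistant message and go back to waiting.
fn on_llm_text(state: &AgentState, reply: &str) -> Option<AgentState> {
    match state {
        AgentState::CallingLlm { conversation, .. } => Some(AgentState::WaitingForUserInput {
            conversation: conversation.with_message(Role::Assistant, reply),
        }),
        _ => None,
    }
}
```

Each new state owns a complete conversation, so no state depends on a shared, separately mutated history.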

Testing Guidelines

Unit Tests

Verify specific state transitions in isolation:
#[test]
fn transitions_to_calling_llm_on_user_input() {
    let mut agent = Agent::new(config);
    let action = agent.handle_event(
        AgentEvent::UserInput(Message::user("hello"))
    ).unwrap();
    
    assert!(matches!(action, AgentAction::SendLlmRequest(_)));
    assert!(matches!(agent.state(), AgentState::CallingLlm { .. }));
}

Property Tests

Verify invariants hold across all configurations:
proptest! {
    #[test]
    fn agent_always_starts_in_waiting(config in any::<AgentConfig>()) {
        let agent = Agent::new(config);
        assert!(matches!(agent.state(), AgentState::WaitingForUserInput { .. }));
    }
    
    #[test]
    fn shutdown_succeeds_from_any_state(state in any::<AgentState>()) {
        let mut agent = Agent::with_state(state);
        let action = agent.handle_event(AgentEvent::ShutdownRequested).unwrap();
        assert!(matches!(action, AgentAction::Shutdown));
        assert!(matches!(agent.state(), AgentState::ShuttingDown));
    }
}

Integration Tests

Verify end-to-end flows through multiple transitions:
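#[tokio::test]
async fn completes_full_conversation_cycle() {
    // `config` and `response` are fixtures defined elsewhere;
    // `response` must be a text-only LlmResponse (no tool calls).
    let mut agent = Agent::new(config);
    // Unused in this flow: events are fed to the agent by hand.
    let _llm_client = MockLlmClient::new();

    // User input
    let action = agent.handle_event(
        AgentEvent::UserInput(Message::user("hello"))
    ).unwrap();
    assert!(matches!(action, AgentAction::SendLlmRequest(_)));

    // LLM completion
    let _action = agent.handle_event(
        AgentEvent::LlmEvent(LlmEvent::Completed(response))
    ).unwrap();

    // Back to waiting
    assert!(matches!(agent.state(), AgentState::WaitingForUserInput { .. }));
}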

Extension Guide

Adding a New State

1. Add variant to AgentState
pub enum AgentState {
    // existing variants...
    NewState {
        conversation: ConversationContext,
        custom_field: CustomType,
    },
}

2. Update name() method
Self::NewState { .. } => "NewState",

3. Update conversation() accessors
Handle the new variant in agent.rs.

4. Add transition handlers
(AgentState::NewState { conversation, .. }, AgentEvent::SomeEvent) => {
    // transition logic
}

5. Add tests
Verify transitions to/from the new state.

Adding a New Event

1. Add variant to AgentEvent
pub enum AgentEvent {
    // existing variants...
    NewEvent { payload: PayloadType },
}

2. Handle the event
Add an arm for the event in each relevant state within handle_event().

3. Update catch-all pattern
Invalid transitions log a warning and return WaitForInput. A catch-all arm also absorbs a new event silently, so confirm that every state that should react to it has an explicit arm.

4. Add property tests
Verify the event is handled correctly from all reachable states.
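The steps above can be sketched on a reduced machine. In this example `NewEvent` gets an explicit arm in the one state that cares about it, and every other combination falls through to the catch-all. The types are simplified stand-ins, and `eprintln!` is used in place of whatever logging macro the real code uses.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum State { Waiting, CallingLlm }

#[derive(Debug, PartialEq)]
enum Event {
    UserInput(String),
    RetryTimeoutFired,
    // The newly added event, with a hypothetical payload type.
    NewEvent { payload: u32 },
}

#[derive(Debug, PartialEq)]
enum Action {
    SendLlmRequest(String),
    DisplayMessage(String),
    WaitForInput,
}

fn handle(state: State, event: Event) -> (State, Action) {
    match (state, event) {
        (State::Waiting, Event::UserInput(text)) => {
            (State::CallingLlm, Action::SendLlmRequest(text))
        }
        // Explicit arm for the new event in the state that handles it.
        (State::CallingLlm, Event::NewEvent { payload }) => (
            State::CallingLlm,
            Action::DisplayMessage(format!("payload {}", payload)),
        ),
        // Catch-all: warn, keep the current state, and wait.
        (state, event) => {
            eprintln!("warning: ignoring {:?} in state {:?}", event, state);
            (state, Action::WaitForInput)
        }
    }
}
```

Without the explicit arm the code would still compile, because the catch-all matches `NewEvent`. That is why the tests in the last step matter.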

Source Files

  • state.rs - State and event type definitions
  • agent.rs - State machine implementation

Related pages:
  • State Machine Spec - Detailed state machine specification
  • Architecture Overview - High-level system architecture
