
Overview

The Agent class is the backbone of the Swarms framework, connecting LLMs with tools, long-term memory, and advanced autonomous capabilities. It provides a production-ready interface for building intelligent agents that can reason, use tools, handle multimodal inputs, and execute complex tasks.

Import

from swarms import Agent

Key Features

  • Tool Integration: Native support for function calling and tool execution
  • Long-term Memory: RAG-based memory system for context retention
  • Autonomous Loops: Dynamic execution with configurable stopping conditions
  • Multi-modal Support: Process text, images, and other media
  • MCP Support: Integration with Model Context Protocol servers
  • Agent Handoffs: Delegate tasks to specialized agents
  • Streaming: Real-time token streaming with callbacks
  • Fallback Models: Automatic failover to backup models
  • State Management: Autosave and state persistence

Initialization

  • id (str, default "agent-{uuid}"): Unique identifier for the agent instance
  • agent_name (str, default "swarm-worker-01"): The name of the agent, used for identification and logging
  • agent_description (str): A description of the agent's purpose and capabilities
  • system_prompt (str): The system prompt that defines the agent's behavior and personality
  • llm (Any): The language model instance to use; if None, a LiteLLM instance is created
  • model_name (str): The name of the model to use (e.g., "gpt-4o", "claude-3-opus")
  • max_loops (Union[int, str], default 1): Maximum number of reasoning loops; use "auto" for autonomous mode with dynamic planning
  • tools (List[Callable]): List of callable functions the agent can use as tools
  • temperature (float, default 0.5): Temperature for LLM sampling (0.0 to 1.0)
  • max_tokens (int, default 4096): Maximum number of tokens in the LLM response
  • context_length (int): Context window size; set automatically from model_name if not specified
  • streaming_on (bool, default False): Enable basic streaming with formatted panels
  • stream (bool, default False): Enable detailed token-by-token streaming with metadata (citations, tokens used, etc.)
  • streaming_callback (Callable[[str], None]): Callback function that receives streaming tokens in real time
  • interactive (bool, default False): Enable interactive mode for user input between loops
  • verbose (bool, default False): Enable verbose logging for debugging
  • return_history (bool, default False): Return the full conversation history instead of just the final response
  • output_type (OutputType, default "str-all-except-first"): Output format: "str", "string", "list", "json", "dict", "yaml", or "xml"
  • autosave (bool, default False): Automatically save agent state during execution
  • dashboard (bool, default False): Display the agent dashboard on initialization
  • long_term_memory (Union[Callable, Any]): Long-term memory system (e.g., a vector database) for RAG
  • rag_every_loop (bool, default False): Query RAG memory on every loop iteration
  • fallback_models (List[str]): Model names to try in order; the first is primary, the rest are fallbacks
  • retry_attempts (int, default 3): Number of retry attempts for LLM calls
  • retry_interval (int, default 1): Interval in seconds between retry attempts
  • stopping_token (str): Token that signals the agent to stop execution
  • stopping_condition (Callable[[str], bool]): Function that returns True when the agent should stop
  • stopping_func (Callable): Alternative stopping function
  • dynamic_temperature_enabled (bool, default False): Enable dynamic temperature adjustment during execution
  • dynamic_loops (bool, default False): Enable dynamic loop count adjustment (sets max_loops="auto")
  • user_name (str, default "Human"): Name of the user in the conversation history
  • saved_state_path (str): Path for saving agent state
  • sop (str): Standard operating procedure for the agent
  • sop_list (List[str]): List of standard operating procedures
  • rules (str): Rules that govern agent behavior
  • planning_prompt (str): Prompt for the planning phase
  • plan_enabled (bool, default False): Enable a planning phase before execution
  • multi_modal (bool): Enable multi-modal processing (images, etc.)
  • timeout (int): Timeout for agent execution in seconds
  • artifacts_on (bool, default False): Enable artifact generation and storage
  • artifacts_output_path (str): Path for saving artifacts
  • artifacts_file_extension (str): File extension for artifacts (.pdf, .md, .txt)
  • mcp_url (Union[str, MCPConnection]): URL or connection object for a single MCP server
  • mcp_urls (List[str]): List of MCP server URLs for multiple connections
  • mcp_config (MCPConnection): MCP connection configuration object
  • mcp_configs (MultipleMCPConnections): Configuration for multiple MCP connections
  • handoffs (Union[Sequence[Callable], Any]): List of agents to enable task handoffs/delegation
  • capabilities (List[str]): List of agent capabilities, for documentation
  • mode (Literal['interactive', 'fast', 'standard'], default "standard"): Execution mode: interactive (with user input), fast (minimal logging), or standard
  • marketplace_prompt_id (str): UUID of a prompt from the Swarms marketplace to use as the system prompt
  • skills_dir (str): Path to a directory containing Agent Skills in SKILL.md format (Anthropic framework)
  • selected_tools (Union[str, List[str]], default "all"): Tools enabled for the autonomous looper when max_loops="auto"; either "all" or a list of tool names
  • reasoning_enabled (bool, default False): Enable reasoning mode for supported models (e.g., o1, o3)
  • reasoning_effort (str): Effort level for reasoning models ("low", "medium", "high")
  • thinking_tokens (int): Maximum thinking tokens for reasoning models

Methods

run

Execute the agent's main loop for a given task.
def run(
    task: Optional[Union[str, Any]] = None,
    img: Optional[str] = None,
    *args,
    **kwargs
) -> Any
  • task (Union[str, Any]): The task or prompt for the agent to process
  • img (str, optional): Image path or data for vision-enabled models
  • Returns (Any): Agent output formatted according to the output_type configuration

call

Alternative syntax for running the agent (calls run internally).
def __call__(
    task: str,
    *args,
    **kwargs
) -> str

run_concurrent

Run multiple tasks concurrently.
def run_concurrent(
    tasks: List[str],
    *args,
    **kwargs
) -> List[str]

bulk_run

Execute multiple tasks in batch.
def bulk_run(
    tasks: List[str],
    *args,
    **kwargs
) -> List[str]
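
A hedged usage sketch for the two batch APIs (the agent construction is omitted, and the calls are shown as comments; it is assumed that each returns one output per task, in task order):

```python
tasks = [
    "Summarize Q1 earnings",
    "Summarize Q2 earnings",
    "Summarize Q3 earnings",
]

# Hypothetical calls, assuming an initialized `agent`:
# results = agent.run_concurrent(tasks)  # tasks run concurrently
# results = agent.bulk_run(tasks)        # tasks run as a batch
#
# Expected shape in either case:
# assert len(results) == len(tasks)

print(len(tasks))  # → 3
```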

save

Save the agent's current state to disk.
def save(
    file_path: Optional[str] = None
) -> None

load

Load agent state from a saved file.
def load(
    file_path: str
) -> Agent
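
Conceptually, save/load persist the agent's configuration and state (compare to_dict/to_json below). As an illustration only, not the library's actual implementation, a persistence round trip resembles serializing the config dict to JSON and back:

```python
import json

# Illustrative config dict; the real state includes more fields.
config = {"agent_name": "Financial-Analyst", "model_name": "gpt-4o", "max_loops": 1}

serialized = json.dumps(config, indent=4)  # analogous to save()
restored = json.loads(serialized)          # analogous to load()

assert restored == config
```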

to_dict

Convert agent configuration to dictionary.
def to_dict() -> Dict[str, Any]

to_json

Convert agent configuration to JSON string.
def to_json(
    indent: int = 4
) -> str

to_yaml

Convert agent configuration to YAML string.
def to_yaml(
    indent: int = 4
) -> str

add_tool

Dynamically add a tool to the agent.
def add_tool(
    tool: Callable
) -> None
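
Tools added this way should have clear signatures and docstrings (see Best Practices). A minimal sketch, with the actual add_tool call shown as a comment since it requires an initialized agent:

```python
def get_weather(city: str) -> str:
    """Return a mock weather report for a city."""
    return f"Weather in {city}: sunny"

# Hypothetical wiring, assuming an initialized `agent`:
# agent.add_tool(get_weather)

print(get_weather("Tokyo"))  # → Weather in Tokyo: sunny
```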

reset

Reset the agent's memory and state.
def reset() -> None

print_dashboard

Display the agent's configuration dashboard.
def print_dashboard() -> None

Examples

Basic Usage

from swarms import Agent

# Create a simple agent
agent = Agent(
    agent_name="Financial-Analyst",
    model_name="gpt-4o",
    max_loops=1,
    system_prompt="You are a financial analyst. Provide detailed, data-driven insights."
)

# Run a task
response = agent.run("Analyze the Q4 2024 revenue trends")
print(response)

Agent with Tools

from swarms import Agent

def search_web(query: str) -> str:
    """Search the web for information."""
    # Implementation
    return f"Search results for: {query}"

def calculate(expression: str) -> float:
    """Evaluate a mathematical expression."""
    # Note: eval on untrusted input is unsafe; restricting builtins is a
    # minimal guard, but prefer a proper math parser in production.
    return eval(expression, {"__builtins__": {}}, {})

agent = Agent(
    agent_name="Research-Agent",
    model_name="gpt-4o",
    max_loops=5,
    tools=[search_web, calculate],
    system_prompt="You are a research assistant with web search and calculation abilities."
)

result = agent.run("Search for the population of Tokyo and calculate its growth rate")

Multi-modal Agent

agent = Agent(
    agent_name="Vision-Agent",
    model_name="gpt-4o",
    multi_modal=True,
    max_loops=1
)

response = agent.run(
    task="Describe what you see in this image and identify any objects",
    img="path/to/image.jpg"
)

Autonomous Agent with Auto Loops

agent = Agent(
    agent_name="Autonomous-Developer",
    model_name="gpt-4o",
    max_loops="auto",  # Enables autonomous planning and execution
    system_prompt="You are an autonomous software developer."
)

result = agent.run("Build a REST API for a todo application with authentication")
# Agent will:
# 1. Create a plan with subtasks
# 2. Execute each subtask using available tools
# 3. Generate a comprehensive summary

Agent with Streaming

def on_token(token: str):
    print(token, end="", flush=True)

agent = Agent(
    agent_name="Streaming-Agent",
    model_name="gpt-4o",
    stream=True,
    streaming_callback=on_token,
    max_loops=1
)

response = agent.run("Write a creative story about space exploration")

Agent with Fallback Models

agent = Agent(
    agent_name="Reliable-Agent",
    fallback_models=["gpt-4o", "gpt-4o-mini", "gpt-3.5-turbo"],
    max_loops=1
)

# Will try gpt-4o first, then gpt-4o-mini, then gpt-3.5-turbo if each fails
response = agent.run("Generate a market analysis report")

Agent with MCP Integration

agent = Agent(
    agent_name="MCP-Agent",
    model_name="gpt-4o",
    mcp_url="npx -y @modelcontextprotocol/server-filesystem /path/to/directory",
    max_loops=3
)

result = agent.run("Read the contents of config.json and summarize the settings")

Agent Handoffs

# Create specialized agents
researcher = Agent(
    agent_name="Researcher",
    model_name="gpt-4o",
    system_prompt="You are a research specialist."
)

writer = Agent(
    agent_name="Writer",
    model_name="gpt-4o",
    system_prompt="You are a technical writer."
)

# Create coordinator with handoffs
coordinator = Agent(
    agent_name="Coordinator",
    model_name="gpt-4o",
    max_loops=5,
    handoffs=[researcher, writer],
    system_prompt="You coordinate tasks between research and writing teams."
)

result = coordinator.run("Create a comprehensive report on quantum computing")
# Coordinator can delegate research to researcher and writing to writer

Marketplace Prompt Loading

# Load a prompt from Swarms marketplace in one line
agent = Agent(
    model_name="gpt-4o",
    marketplace_prompt_id="550e8400-e29b-41d4-a716-446655440000",
    max_loops=1
)

response = agent.run("Execute the marketplace prompt task")
# Agent automatically loads system prompt from marketplace

Output Types

The agent supports multiple output formats via the output_type parameter:
  • "str" or "string": Returns the last response as a string
  • "str-all-except-first": Returns all responses except system prompt as string (default)
  • "list": Returns conversation as a list of messages
  • "json": Returns conversation as JSON string
  • "dict": Returns conversation as dictionary
  • "yaml": Returns conversation as YAML string
  • "xml": Returns conversation as XML string
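
To illustrate the rough shapes these formats produce, here is a simplified sketch over a hand-built conversation list (this is not the library's actual serialization; exact keys and message layout may differ):

```python
import json

conversation = [
    {"role": "System", "content": "You are a financial analyst."},
    {"role": "Human", "content": "Analyze revenue trends"},
    {"role": "Financial-Analyst", "content": "Revenue grew 12% QoQ."},
]

as_str = conversation[-1]["content"]          # roughly what "str" returns
as_list = conversation                        # "list": the message list
as_dict = {"history": conversation}           # "dict": a dict-shaped history (keys assumed)
as_json = json.dumps(conversation, indent=2)  # "json": a serialized string

print(as_str)  # → Revenue grew 12% QoQ.
```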

Error Handling

The Agent class includes comprehensive error handling:
  • AgentInitializationError: Raised when agent fails to initialize
  • AgentRunError: Raised when execution fails
  • AgentLLMError: Raised when LLM encounters issues
  • AgentToolError: Raised when tool execution fails
  • AgentMemoryError: Raised for memory-related issues
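
These exceptions can be caught around run() to implement custom recovery, much like the built-in retry_attempts behavior. A self-contained sketch using a stand-in exception class (the real import path for these exceptions may vary by swarms version, so check your installation):

```python
class AgentLLMError(Exception):
    """Stand-in for the real swarms exception of the same name."""

def run_with_retry(run_fn, task: str, attempts: int = 3):
    """Retry a flaky agent call, mirroring retry_attempts behavior."""
    for attempt in range(attempts):
        try:
            return run_fn(task)
        except AgentLLMError:
            if attempt == attempts - 1:
                raise  # exhausted retries: re-raise the last error

# Simulated flaky agent call: fails once, then succeeds.
calls = {"n": 0}

def flaky(task: str) -> str:
    calls["n"] += 1
    if calls["n"] < 2:
        raise AgentLLMError("transient failure")
    return f"ok: {task}"

result = run_with_retry(flaky, "report")
print(result)  # → ok: report
```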

Best Practices

  1. Set appropriate max_loops: Use 1 for simple tasks, higher numbers for complex reasoning, or "auto" for autonomous planning
  2. Use tools wisely: Provide well-documented tools with clear function signatures and docstrings
  3. Enable autosave for long-running tasks: Prevents data loss on interruption
  4. Configure fallback models: Ensures reliability in production
  5. Use streaming for real-time feedback: Better user experience for long-running tasks
  6. Set context_length appropriately: Prevents token limit errors
  7. Enable verbose mode during development: Helps debug issues
  8. Use agent handoffs for complex workflows: Delegate subtasks to specialized agents

Related

  • BaseSwarm - Base class for multi-agent systems
  • BaseStructure - Foundation for swarm structures
  • Tools - Creating and using agent tools
  • Memory - Long-term memory systems
