Magentic-One is a generalist multi-agent system for solving open-ended web and file-based tasks across a variety of domains. It represents a significant step forward for multi-agent systems, achieving competitive performance on numerous agentic benchmarks.
Magentic-One is now fully integrated into autogen-agentchat, providing a modular and easy-to-use interface. The original implementation based on autogen-core is deprecated but remains available in the AutoGen repository.
Using Magentic-One involves interacting with a digital world designed for humans, which carries inherent risks. See the Safety Precautions section for important security guidelines.
Overview
Magentic-One uses a multi-agent architecture where a lead Orchestrator agent manages high-level planning, directs other agents, and tracks task progress. The system autonomously adapts to dynamic web and file-system environments to solve complex tasks.
- **Multi-Agent Architecture**: A lead Orchestrator coordinates specialized agents for different capabilities
- **Web & File Tasks**: Handles open-ended tasks involving web browsing and file manipulation
- **Autonomous Adaptation**: Dynamically adjusts plans based on task progress and obstacles
- **Competitive Performance**: Achieves strong results on benchmarks such as GAIA and HumanEval
Installation
Install the required packages:

```shell
pip install "autogen-agentchat" "autogen-ext[magentic-one,openai]"
```

If you plan to use the web browsing agent (MultimodalWebSurfer), install Playwright and its Chromium dependencies:

```shell
playwright install --with-deps chromium
```
Quick Start
Get started with Magentic-One in just a few lines of code:
```python
import asyncio

from autogen_agentchat.ui import Console
from autogen_ext.models.openai import OpenAIChatCompletionClient
from autogen_ext.teams.magentic_one import MagenticOne


async def main() -> None:
    client = OpenAIChatCompletionClient(model="gpt-4o")
    m1 = MagenticOne(client=client)
    task = "What is the UV index in Melbourne today?"
    result = await Console(m1.run_stream(task=task))
    print(result)


if __name__ == "__main__":
    asyncio.run(main())
```
Architecture
Magentic-One consists of five specialized agents working together:
Orchestrator
The lead agent responsible for:
Task decomposition and planning
Directing other agents in executing subtasks
Tracking overall progress
Taking corrective actions when needed
The Orchestrator maintains two ledgers:
Task Ledger: High-level plan, facts, and educated guesses
Progress Ledger: Self-reflection on task progress at each step
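The ledgers are internal to the Orchestrator, not part of the public API. As a rough mental model (the field names below are illustrative assumptions, not the library's actual data structures), they can be sketched as plain records:

```python
from dataclasses import dataclass, field


@dataclass
class TaskLedger:
    """Outer-loop state: the plan and what the Orchestrator believes about the task.

    Field names are illustrative; the real ledger is internal to Magentic-One.
    """
    facts: list[str] = field(default_factory=list)    # verified facts gathered so far
    guesses: list[str] = field(default_factory=list)  # educated guesses to verify later
    plan: list[str] = field(default_factory=list)     # high-level plan steps


@dataclass
class ProgressLedger:
    """Inner-loop state: per-step self-reflection on progress."""
    task_complete: bool = False    # is the request fully satisfied?
    making_progress: bool = False  # did the last step move us forward?
    next_agent: str = ""           # which agent should act next
    next_instruction: str = ""     # the subtask handed to that agent


ledger = TaskLedger(plan=["search for the UV index", "report the value"])
print(len(ledger.plan))  # → 2
```

When progress stalls, the Orchestrator revises the Task Ledger (outer loop); at every step it refreshes the Progress Ledger (inner loop).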
WebSurfer
An LLM-based agent proficient in commanding a Chromium-based web browser:
Navigation: Visit URLs, perform web searches
Web Actions: Click elements, type text, fill forms
Reading: Summarize content, answer questions about pages
It uses the browser's accessibility tree and set-of-marks prompting for precise interactions.
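Set-of-marks prompting assigns a short numeric mark to each interactive element found in the accessibility tree, so the model can refer to elements by mark ("click 2") instead of pixel coordinates. A simplified, library-independent sketch of the idea (this illustrates the technique only and is not the MultimodalWebSurfer implementation):

```python
# Simplified sketch of set-of-marks prompting: label interactive elements with
# numeric marks so an LLM can reference them unambiguously.

def build_set_of_marks(accessibility_nodes: list[dict]) -> tuple[dict[int, dict], str]:
    """Assign a mark id to each interactive node and render a prompt listing them."""
    interactive_roles = {"button", "link", "textbox", "combobox"}
    marks: dict[int, dict] = {}
    lines: list[str] = []
    next_id = 1
    for node in accessibility_nodes:
        if node.get("role") in interactive_roles:
            marks[next_id] = node
            lines.append(f"[{next_id}] {node['role']}: {node.get('name', '')}")
            next_id += 1
    return marks, "\n".join(lines)


nodes = [
    {"role": "heading", "name": "Weather"},   # not interactive, gets no mark
    {"role": "textbox", "name": "Search"},
    {"role": "button", "name": "Go"},
]
marks, prompt = build_set_of_marks(nodes)
print(prompt)
# The model can then reply with an action such as: {"action": "click", "mark": 2}
```

The mark table maps each id back to a concrete element, so a model response like "click 2" resolves to an exact target on the page.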
FileSurfer
An LLM-based agent for file system operations:
Read local files of most types (via markdown preview)
List directory contents
Navigate folder structures
Extract information from documents
Coder
An LLM-based agent specialized through its system prompt for:
Writing code to solve problems
Analyzing information from other agents
Creating new artifacts and tools
Implementing complex algorithms
ComputerTerminal
Provides access to a console shell:
Execute code written by the Coder
Install programming libraries
Run system commands
Interact with the file system
Usage Examples
Basic Usage with MagenticOne Helper
The simplest way to use Magentic-One with all agents:
```python
import asyncio

from autogen_agentchat.ui import Console
from autogen_ext.models.openai import OpenAIChatCompletionClient
from autogen_ext.teams.magentic_one import MagenticOne


async def example_usage() -> None:
    client = OpenAIChatCompletionClient(model="gpt-4o")
    m1 = MagenticOne(client=client)
    task = "Write a Python script to fetch data from an API."
    result = await Console(m1.run_stream(task=task))
    print(result)


if __name__ == "__main__":
    asyncio.run(example_usage())
```
Human-in-the-Loop Mode
Add human oversight for safety-critical tasks:
```python
import asyncio

from autogen_agentchat.agents import ApprovalRequest, ApprovalResponse
from autogen_agentchat.ui import Console
from autogen_ext.code_executors.docker import DockerCommandLineCodeExecutor
from autogen_ext.models.openai import OpenAIChatCompletionClient
from autogen_ext.teams.magentic_one import MagenticOne


def user_input_func(prompt: str) -> str:
    """Custom input function for user interaction."""
    return input(prompt)


def approval_func(request: ApprovalRequest) -> ApprovalResponse:
    """Request user approval before executing code."""
    print(f"Code to execute:\n{request.code}")
    user_input = input("Do you approve this code execution? (y/n): ").strip().lower()
    if user_input == "y":
        return ApprovalResponse(approved=True, reason="User approved")
    return ApprovalResponse(approved=False, reason="User denied")


async def example_usage_hil() -> None:
    client = OpenAIChatCompletionClient(model="gpt-4o")
    # Use a Docker executor for better security.
    async with DockerCommandLineCodeExecutor() as code_executor:
        m1 = MagenticOne(
            client=client,
            hil_mode=True,
            input_func=user_input_func,
            code_executor=code_executor,
            approval_func=approval_func,
        )
        task = "Write a Python script to fetch data from an API."
        result = await Console(m1.run_stream(task=task))
        print(result)


if __name__ == "__main__":
    asyncio.run(example_usage_hil())
```
Code Approval Without Full HIL Mode
Approve only code execution while keeping the system autonomous:
```python
import asyncio

from autogen_agentchat.agents import ApprovalRequest, ApprovalResponse
from autogen_agentchat.ui import Console
from autogen_ext.code_executors.docker import DockerCommandLineCodeExecutor
from autogen_ext.models.openai import OpenAIChatCompletionClient
from autogen_ext.teams.magentic_one import MagenticOne


def approval_func(request: ApprovalRequest) -> ApprovalResponse:
    """Request user approval before executing code."""
    print(f"Code to execute:\n{request.code}")
    user_input = input("Approve? (y/n): ").strip().lower()
    if user_input == "y":
        return ApprovalResponse(approved=True, reason="User approved")
    return ApprovalResponse(approved=False, reason="User denied")


async def example_usage_with_approval() -> None:
    client = OpenAIChatCompletionClient(model="gpt-4o")
    async with DockerCommandLineCodeExecutor() as code_executor:
        m1 = MagenticOne(
            client=client,
            hil_mode=False,  # No human intervention in the conversation...
            code_executor=code_executor,
            approval_func=approval_func,  # ...but code execution still needs approval.
        )
        task = "Write a Python script to fetch data from an API."
        result = await Console(m1.run_stream(task=task))
        print(result)


if __name__ == "__main__":
    asyncio.run(example_usage_with_approval())
```
Using MagenticOneGroupChat
For more control, use MagenticOneGroupChat directly:
```python
import asyncio

from autogen_agentchat.agents import AssistantAgent
from autogen_agentchat.teams import MagenticOneGroupChat
from autogen_agentchat.ui import Console
from autogen_ext.models.openai import OpenAIChatCompletionClient


async def main() -> None:
    model_client = OpenAIChatCompletionClient(model="gpt-4o")
    assistant = AssistantAgent(
        "Assistant",
        model_client=model_client,
    )
    team = MagenticOneGroupChat([assistant], model_client=model_client)
    await Console(team.run_stream(task="Provide a proof for Fermat's Last Theorem"))
    await model_client.close()


asyncio.run(main())
```
Using Individual Magentic-One Agents
Combine specific agents in a custom team:
```python
import asyncio

from autogen_agentchat.agents import CodeExecutorAgent
from autogen_agentchat.teams import MagenticOneGroupChat
from autogen_agentchat.ui import Console
from autogen_ext.agents.file_surfer import FileSurfer
from autogen_ext.agents.magentic_one import MagenticOneCoderAgent
from autogen_ext.agents.web_surfer import MultimodalWebSurfer
from autogen_ext.code_executors.local import LocalCommandLineCodeExecutor
from autogen_ext.models.openai import OpenAIChatCompletionClient


async def main() -> None:
    model_client = OpenAIChatCompletionClient(model="gpt-4o")

    surfer = MultimodalWebSurfer("WebSurfer", model_client=model_client)
    file_surfer = FileSurfer("FileSurfer", model_client=model_client)
    coder = MagenticOneCoderAgent("Coder", model_client=model_client)
    terminal = CodeExecutorAgent(
        "ComputerTerminal",
        code_executor=LocalCommandLineCodeExecutor(),
    )

    team = MagenticOneGroupChat(
        [surfer, file_surfer, coder, terminal],
        model_client=model_client,
    )
    await Console(team.run_stream(task="What is the UV index in Melbourne today?"))


asyncio.run(main())
```
Safety Precautions
Magentic-One interacts with real web pages, executes code, and accesses files. Always follow these safety guidelines:
- **Containers**: Run all tasks in Docker containers to isolate the agents and prevent direct system attacks.

  ```python
  from autogen_ext.code_executors.docker import DockerCommandLineCodeExecutor

  async with DockerCommandLineCodeExecutor() as code_executor:
      m1 = MagenticOne(client=client, code_executor=code_executor)
  ```

- **Virtual environment**: Use a virtual environment to prevent agents from accessing sensitive data or system files.
- **Monitoring**: Closely monitor logs during and after execution to detect and mitigate risky behavior.
- **Human in the loop**: Run examples with a human in the loop to supervise agents and prevent unintended consequences.

  ```python
  m1 = MagenticOne(
      client=client,
      hil_mode=True,
      approval_func=approval_func,
  )
  ```

- **Access restrictions**: Restrict agents' access to the internet and other resources to prevent unauthorized actions.
- **Sensitive data**: Ensure agents do not have access to sensitive data or resources, and never share sensitive information with them.
Be aware that agents may occasionally attempt risky actions, such as:
Recruiting humans for help
Accepting cookie agreements without human involvement
Following instructions from compromised web pages (prompt injection)
Always ensure agents are monitored and operate within a controlled environment.
Model Recommendations
Magentic-One is model-agnostic and can work with various LLMs:
- **GPT-4o (recommended)**: The default multimodal LLM for all agents, with strong reasoning and vision capabilities.
- **GPT-4o for the Orchestrator**: Use a strong reasoning model for the Orchestrator agent.
- **OpenAI o1-preview**: For advanced reasoning in the Orchestrator's outer loop and the Coder agent.
- **Heterogeneous models**: Mix different models across agents to balance cost and capability.
Azure OpenAI Example
```python
from autogen_ext.models.openai import AzureOpenAIChatCompletionClient
from autogen_ext.teams.magentic_one import MagenticOne

client = AzureOpenAIChatCompletionClient(
    azure_endpoint="https://your-endpoint.openai.azure.com/",
    api_version="2024-02-15-preview",
    model="gpt-4o",
    api_key="your-api-key",
)
m1 = MagenticOne(client=client)
```
Benchmark Performance
Magentic-One achieves competitive results on multiple benchmarks:
GAIA: Strong performance on general AI assistant tasks
HumanEval: Effective code generation capabilities
AssistantBench: Competitive across diverse assistant scenarios
See the technical report for detailed benchmark results.
Orchestrator Workflow
The Orchestrator uses a two-loop architecture:
Outer Loop (Task Ledger)
Create initial plan for the task
Gather facts and educated guesses
Update plan if progress stalls
Inner Loop (Progress Ledger)
Self-reflect on current progress
Check if task is completed
Assign subtask to appropriate agent
Update progress after agent completes subtask
Repeat until task is complete or replanning is needed
This architecture allows Magentic-One to:
Dynamically adapt to obstacles
Recover from failures
Optimize agent selection based on subtask requirements
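The interaction between the two loops can be sketched in plain Python. This is a library-independent illustration of the control flow described above; the orchestrator methods, agent interface, and stall threshold are all illustrative assumptions, not Magentic-One internals:

```python
# Illustrative control-flow sketch of the Orchestrator's two-loop workflow.
# `orchestrator` and `agents` are hypothetical objects standing in for the
# LLM-backed components; names and thresholds are assumptions.

MAX_STALLS = 3  # hypothetical threshold: consecutive unproductive steps before replanning


def run_task(task: str, agents: dict, orchestrator) -> str:
    # Outer loop: build and maintain the Task Ledger; replan when progress stalls.
    task_ledger = orchestrator.create_plan(task)  # plan, facts, educated guesses
    while True:
        stalls = 0
        # Inner loop: maintain the Progress Ledger and delegate subtasks.
        while stalls < MAX_STALLS:
            progress = orchestrator.reflect(task, task_ledger)  # self-reflection step
            if progress.task_complete:
                return progress.final_answer
            stalls = 0 if progress.making_progress else stalls + 1
            # Delegate the next subtask to the chosen agent and record its reply.
            reply = agents[progress.next_agent].act(progress.next_instruction)
            orchestrator.record(reply)
        # Progress stalled: revise facts/guesses and the plan, then re-enter the inner loop.
        task_ledger = orchestrator.replan(task, task_ledger)
```

The key design point is that failure handling lives in the outer loop: the inner loop never gets stuck retrying, because repeated lack of progress escalates to a full replan rather than another attempt with the same plan.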
API Reference
MagenticOne
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| `client` | `ChatCompletionClient` | required | The client used for model interactions (e.g., `OpenAIChatCompletionClient`) |
| `hil_mode` | `bool` | `False` | If `True`, adds a `UserProxyAgent` to enable human-in-the-loop interactions |
| `input_func` | `InputFuncType` | `None` | Function to use for user input in human-in-the-loop mode |
| `code_executor` | `CodeExecutor` | `None` | Code executor to use. If `None`, uses Docker if available, otherwise a local executor |
| `approval_func` | `ApprovalFuncType` | `None` | Function to approve code execution before running. If `None`, code executes without approval |
Resources
Blog Post Read the official Magentic-One announcement
Technical Report Full academic paper with detailed methodology
GitHub Repository View source code and contribute
API Reference Complete API documentation
Citation
If you use Magentic-One in your research, please cite:
```bibtex
@misc{fourney2024magenticonegeneralistmultiagentsolving,
      title={Magentic-One: A Generalist Multi-Agent System for Solving Complex Tasks},
      author={Adam Fourney and Gagan Bansal and Hussein Mozannar and Cheng Tan and Eduardo Salinas and Erkang Zhu and Friederike Niedtner and Grace Proebsting and Griffin Bassman and Jack Gerrits and Jacob Alber and Peter Chang and Ricky Loynd and Robert West and Victor Dibia and Ahmed Awadallah and Ece Kamar and Rafah Hosn and Saleema Amershi},
      year={2024},
      eprint={2411.04468},
      archivePrefix={arXiv},
      primaryClass={cs.AI},
      url={https://arxiv.org/abs/2411.04468}
}
```