Magentic-One is a generalist multi-agent system for solving open-ended web and file-based tasks across a variety of domains. It represents a significant step forward for multi-agent systems, achieving competitive performance on numerous agentic benchmarks.
Magentic-One is now fully integrated into autogen-agentchat, providing a modular and easy-to-use interface. The original implementation based on autogen-core is deprecated but remains available in the AutoGen repository.
Using Magentic-One involves interacting with a digital world designed for humans, which carries inherent risks. See the Safety Precautions section for important security guidelines.

Overview

Magentic-One uses a multi-agent architecture where a lead Orchestrator agent manages high-level planning, directs other agents, and tracks task progress. The system autonomously adapts to dynamic web and file-system environments to solve complex tasks.

Multi-Agent Architecture

Orchestrator coordinates specialized agents for different capabilities

Web & File Tasks

Handles open-ended tasks involving web browsing and file manipulation

Autonomous Adaptation

Dynamically adjusts plans based on task progress and obstacles

Competitive Performance

Achieves strong results on benchmarks like GAIA and HumanEval

Installation

1. Install the required packages:

pip install "autogen-agentchat" "autogen-ext[magentic-one,openai]"

2. If you plan to use the web browsing agent (MultimodalWebSurfer), install Playwright:

playwright install --with-deps chromium

Quick Start

Get started with Magentic-One in just a few lines of code:
import asyncio
from autogen_ext.models.openai import OpenAIChatCompletionClient
from autogen_ext.teams.magentic_one import MagenticOne
from autogen_agentchat.ui import Console

async def main():
    client = OpenAIChatCompletionClient(model="gpt-4o")
    m1 = MagenticOne(client=client)
    task = "What is the UV index in Melbourne today?"
    result = await Console(m1.run_stream(task=task))
    print(result)

if __name__ == "__main__":
    asyncio.run(main())

Architecture

Magentic-One consists of five specialized agents working together:

Orchestrator

The lead agent responsible for:
  • Task decomposition and planning
  • Directing other agents in executing subtasks
  • Tracking overall progress
  • Taking corrective actions when needed
The Orchestrator maintains two ledgers:
  • Task Ledger: High-level plan, facts, and educated guesses
  • Progress Ledger: Self-reflection on task progress at each step
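Conceptually, the two ledgers can be pictured as simple records (a hypothetical sketch for intuition only; the real ledgers live inside the Orchestrator's prompts and are not part of the public API):

```python
from dataclasses import dataclass, field

@dataclass
class TaskLedger:
    # Outer-loop state: the high-level plan plus what the
    # Orchestrator knows (facts) or suspects (educated guesses).
    plan: list[str] = field(default_factory=list)
    facts: list[str] = field(default_factory=list)
    guesses: list[str] = field(default_factory=list)

@dataclass
class ProgressLedger:
    # Inner-loop state: per-step self-reflection used to decide
    # whether the task is done and which agent should act next.
    task_complete: bool = False
    progress_being_made: bool = True
    next_agent: str = ""
    next_instruction: str = ""
```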

WebSurfer

An LLM-based agent proficient in commanding a Chromium-based web browser:
  • Navigation: Visit URLs, perform web searches
  • Web Actions: Click elements, type text, fill forms
  • Reading: Summarize content, answer questions about pages
Uses accessibility tree and set-of-marks prompting for precise interactions.
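The set-of-marks idea can be illustrated in a few lines: interactive elements from the accessibility tree are tagged with numeric marks, and the model is asked to act by mark ID rather than by raw pixel coordinates (a deliberately simplified, hypothetical sketch, not the library's actual implementation):

```python
# Hypothetical elements extracted from a page's accessibility tree.
elements = [
    {"role": "textbox", "name": "Search"},
    {"role": "button", "name": "Submit"},
    {"role": "link", "name": "Docs"},
]

# Assign each interactive element a numeric mark.
marks = {i: el for i, el in enumerate(elements, start=1)}

# The prompt lists the marks; the model replies with an action like "click 2".
prompt = "\n".join(f"[{i}] {el['role']}: {el['name']}" for i, el in marks.items())

def apply_action(action: str) -> dict:
    """Resolve a model action such as 'click 2' back to a concrete element."""
    verb, mark = action.split()
    return marks[int(mark)]

print(apply_action("click 2")["name"])  # → Submit
```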

FileSurfer

An LLM-based agent for file system operations:
  • Read local files of most types (via markdown preview)
  • List directory contents
  • Navigate folder structures
  • Extract information from documents

Coder

Specialized through its system prompt for:
  • Writing code to solve problems
  • Analyzing information from other agents
  • Creating new artifacts and tools
  • Implementing complex algorithms

ComputerTerminal

Provides access to a console shell:
  • Execute code written by the Coder
  • Install programming libraries
  • Run system commands
  • Interact with the file system
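Conceptually, the ComputerTerminal runs each code block the Coder produces in a separate process and returns its output, roughly like this stdlib sketch (illustrative only; the actual executor classes such as LocalCommandLineCodeExecutor and DockerCommandLineCodeExecutor add working directories, sandboxing, and richer result handling):

```python
import os
import subprocess
import sys
import tempfile

def execute_python(code: str) -> str:
    """Run a Python code block in a subprocess and capture its output."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code)
        path = f.name
    try:
        proc = subprocess.run(
            [sys.executable, path],
            capture_output=True, text=True, timeout=30,
        )
        return proc.stdout + proc.stderr
    finally:
        os.remove(path)

print(execute_python("print(2 + 3)"))  # → 5
```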

Usage Examples

Basic Usage with MagenticOne Helper

The simplest way to use Magentic-One with all agents:
import asyncio
from autogen_ext.models.openai import OpenAIChatCompletionClient
from autogen_ext.teams.magentic_one import MagenticOne
from autogen_agentchat.ui import Console

async def example_usage():
    client = OpenAIChatCompletionClient(model="gpt-4o")
    m1 = MagenticOne(client=client)
    task = "Write a Python script to fetch data from an API."
    result = await Console(m1.run_stream(task=task))
    print(result)

if __name__ == "__main__":
    asyncio.run(example_usage())

Human-in-the-Loop Mode

Add human oversight for safety-critical tasks:
import asyncio
from autogen_ext.models.openai import OpenAIChatCompletionClient
from autogen_ext.teams.magentic_one import MagenticOne
from autogen_ext.code_executors.docker import DockerCommandLineCodeExecutor
from autogen_agentchat.ui import Console
from autogen_agentchat.agents import ApprovalRequest, ApprovalResponse

def user_input_func(prompt: str) -> str:
    """Custom input function for user interaction."""
    return input(prompt)

def approval_func(request: ApprovalRequest) -> ApprovalResponse:
    """Request user approval before executing code."""
    print(f"Code to execute:\n{request.code}")
    user_input = input("Do you approve this code execution? (y/n): ").strip().lower()
    if user_input == 'y':
        return ApprovalResponse(approved=True, reason="User approved")
    else:
        return ApprovalResponse(approved=False, reason="User denied")

async def example_usage_hil():
    client = OpenAIChatCompletionClient(model="gpt-4o")
    
    # Use Docker executor for better security
    async with DockerCommandLineCodeExecutor() as code_executor:
        m1 = MagenticOne(
            client=client,
            hil_mode=True,
            input_func=user_input_func,
            code_executor=code_executor,
            approval_func=approval_func
        )
        task = "Write a Python script to fetch data from an API."
        result = await Console(m1.run_stream(task=task))
        print(result)

if __name__ == "__main__":
    asyncio.run(example_usage_hil())

Code Approval Without Full HIL Mode

Approve only code execution while keeping the system autonomous:
import asyncio
from autogen_ext.models.openai import OpenAIChatCompletionClient
from autogen_ext.teams.magentic_one import MagenticOne
from autogen_ext.code_executors.docker import DockerCommandLineCodeExecutor
from autogen_agentchat.ui import Console
from autogen_agentchat.agents import ApprovalRequest, ApprovalResponse

def approval_func(request: ApprovalRequest) -> ApprovalResponse:
    """Request user approval before executing code."""
    print(f"Code to execute:\n{request.code}")
    user_input = input("Approve? (y/n): ").strip().lower()
    if user_input == 'y':
        return ApprovalResponse(approved=True, reason="User approved")
    return ApprovalResponse(approved=False, reason="User denied")

async def example_usage_with_approval():
    client = OpenAIChatCompletionClient(model="gpt-4o")
    
    async with DockerCommandLineCodeExecutor() as code_executor:
        m1 = MagenticOne(
            client=client,
            hil_mode=False,  # No human intervention in conversation
            code_executor=code_executor,
            approval_func=approval_func  # But approve code execution
        )
        task = "Write a Python script to fetch data from an API."
        result = await Console(m1.run_stream(task=task))
        print(result)

if __name__ == "__main__":
    asyncio.run(example_usage_with_approval())

Using MagenticOneGroupChat

For more control, use MagenticOneGroupChat directly:
import asyncio
from autogen_ext.models.openai import OpenAIChatCompletionClient
from autogen_agentchat.agents import AssistantAgent
from autogen_agentchat.teams import MagenticOneGroupChat
from autogen_agentchat.ui import Console

async def main() -> None:
    model_client = OpenAIChatCompletionClient(model="gpt-4o")

    assistant = AssistantAgent(
        "Assistant",
        model_client=model_client,
    )
    team = MagenticOneGroupChat([assistant], model_client=model_client)
    await Console(team.run_stream(task="Provide a proof for Fermat's Last Theorem"))
    await model_client.close()

asyncio.run(main())

Using Individual Magentic-One Agents

Combine specific agents in a custom team:
import asyncio
from autogen_ext.models.openai import OpenAIChatCompletionClient
from autogen_agentchat.teams import MagenticOneGroupChat
from autogen_agentchat.ui import Console
from autogen_ext.agents.web_surfer import MultimodalWebSurfer
from autogen_ext.agents.file_surfer import FileSurfer
from autogen_ext.agents.magentic_one import MagenticOneCoderAgent
from autogen_agentchat.agents import CodeExecutorAgent
from autogen_ext.code_executors.local import LocalCommandLineCodeExecutor

async def main() -> None:
    model_client = OpenAIChatCompletionClient(model="gpt-4o")

    surfer = MultimodalWebSurfer("WebSurfer", model_client=model_client)
    file_surfer = FileSurfer("FileSurfer", model_client=model_client)
    coder = MagenticOneCoderAgent("Coder", model_client=model_client)
    terminal = CodeExecutorAgent(
        "ComputerTerminal",
        code_executor=LocalCommandLineCodeExecutor()
    )

    team = MagenticOneGroupChat(
        [surfer, file_surfer, coder, terminal],
        model_client=model_client
    )
    
    await Console(team.run_stream(task="What is the UV index in Melbourne today?"))

asyncio.run(main())

Safety Precautions

Magentic-One interacts with real web pages, executes code, and accesses files. Always follow these safety guidelines:
  • Run all tasks in Docker containers to isolate the agents and prevent direct system attacks:
from autogen_ext.code_executors.docker import DockerCommandLineCodeExecutor

async with DockerCommandLineCodeExecutor() as code_executor:
    m1 = MagenticOne(client=client, code_executor=code_executor)
  • Use a virtual environment to prevent agents from accessing sensitive data or system files.
  • Closely monitor logs during and after execution to detect and mitigate risky behavior.
  • Run examples with a human in the loop to supervise the agents and prevent unintended consequences:
m1 = MagenticOne(
    client=client,
    hil_mode=True,
    approval_func=approval_func
)
  • Restrict the agents' access to the internet and other resources to prevent unauthorized actions.
  • Never give the agents access to sensitive data or resources, and never share sensitive information with them.
Be aware that agents may occasionally attempt risky actions, such as:
  • Recruiting humans for help
  • Accepting cookie agreements without human involvement
  • Following instructions from compromised web pages (prompt injection)
Always ensure agents are monitored and operate within a controlled environment.

Model Recommendations

Magentic-One is model-agnostic and can work with various LLMs:

GPT-4o (Recommended)

Default multimodal LLM for all agents. Strong reasoning and vision capabilities.

GPT-4o for Orchestrator

Use a strong reasoning model for the Orchestrator agent.

OpenAI o1-preview

For advanced reasoning in the Orchestrator's outer loop and the Coder agent.

Heterogeneous Models

Mix different models for different agents to balance cost and capabilities.
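One way to mix models is to give each agent its own model client (a configuration sketch; the model split shown here is an assumption for illustration, not a recommendation from the report):

```python
from autogen_agentchat.teams import MagenticOneGroupChat
from autogen_ext.agents.magentic_one import MagenticOneCoderAgent
from autogen_ext.agents.web_surfer import MultimodalWebSurfer
from autogen_ext.models.openai import OpenAIChatCompletionClient

# Hypothetical split: a stronger model for orchestration and coding,
# a cheaper multimodal model for web browsing.
strong_client = OpenAIChatCompletionClient(model="gpt-4o")
cheap_client = OpenAIChatCompletionClient(model="gpt-4o-mini")

surfer = MultimodalWebSurfer("WebSurfer", model_client=cheap_client)
coder = MagenticOneCoderAgent("Coder", model_client=strong_client)

# The team's model_client drives the Orchestrator's planning loops.
team = MagenticOneGroupChat([surfer, coder], model_client=strong_client)
```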

Azure OpenAI Example

from autogen_ext.models.openai import AzureOpenAIChatCompletionClient

client = AzureOpenAIChatCompletionClient(
    azure_deployment="your-deployment-name",
    azure_endpoint="https://your-endpoint.openai.azure.com/",
    api_version="2024-02-15-preview",
    model="gpt-4o",
    api_key="your-api-key"
)

m1 = MagenticOne(client=client)

Performance

Magentic-One achieves competitive results on multiple benchmarks:
  • GAIA: Strong performance on general AI assistant tasks
  • HumanEval: Effective code generation capabilities
  • AssistantBench: Competitive across diverse assistant scenarios
See the technical report for detailed benchmark results.

Orchestrator Workflow

The Orchestrator uses a two-loop architecture:

Outer Loop (Task Ledger)

  1. Create initial plan for the task
  2. Gather facts and educated guesses
  3. Update plan if progress stalls

Inner Loop (Progress Ledger)

  1. Self-reflect on current progress
  2. Check if task is completed
  3. Assign subtask to appropriate agent
  4. Update progress after agent completes subtask
  5. Repeat until task is complete or replanning is needed
This architecture allows Magentic-One to:
  • Dynamically adapt to obstacles
  • Recover from failures
  • Optimize agent selection based on subtask requirements
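The two loops above can be sketched in plain Python (an illustrative control-flow sketch with hypothetical reflect/replan stubs; the real Orchestrator implements these steps via LLM prompts):

```python
# Illustrative sketch of the Orchestrator's two-loop control flow.
# reflect() and replan() are hypothetical stand-ins for LLM calls.

def reflect(task, ledger, step):
    """Inner-loop self-reflection: is the task done? Who acts next?"""
    done = len(ledger["facts"]) >= 2  # toy completion criterion
    return {
        "task_complete": done,
        "progress_being_made": True,
        "next_agent": "WebSurfer" if step % 2 == 0 else "Coder",
        "next_instruction": f"work on: {task}",
        "final_answer": ledger["facts"][-1] if done else None,
    }

def replan(task, ledger):
    """Outer-loop replanning invoked when progress stalls."""
    return ["revise approach", "retry"]

def run_orchestrator(task, agents, max_stalls=3, max_rounds=10):
    # Task Ledger: plan, facts, and educated guesses (outer loop).
    ledger = {"plan": ["gather facts", "act", "verify"], "facts": [], "guesses": []}
    stalls = 0
    for step in range(max_rounds):          # inner loop (Progress Ledger)
        progress = reflect(task, ledger, step)
        if progress["task_complete"]:
            return progress["final_answer"]
        if not progress["progress_being_made"]:
            stalls += 1
            if stalls >= max_stalls:        # fall back to the outer loop
                ledger["plan"] = replan(task, ledger)
                stalls = 0
                continue
        agent = agents[progress["next_agent"]]
        ledger["facts"].append(agent(progress["next_instruction"]))
    return None

agents = {"WebSurfer": lambda i: f"web result for {i}",
          "Coder": lambda i: f"code result for {i}"}
print(run_orchestrator("demo task", agents))
```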

API Reference

MagenticOne

  • client (ChatCompletionClient, required): The client used for model interactions (e.g., OpenAIChatCompletionClient).
  • hil_mode (bool, default False): If True, adds a UserProxyAgent to enable human-in-the-loop interactions.
  • input_func (InputFuncType, default None): Function to use for user input in human-in-the-loop mode.
  • code_executor (CodeExecutor, default None): Code executor to use. If None, uses Docker if available, otherwise a local executor.
  • approval_func (ApprovalFuncType, default None): Function to approve code execution before running. If None, code executes without approval.

Resources

Blog Post

Read the official Magentic-One announcement

Technical Report

Full academic paper with detailed methodology

GitHub Repository

View source code and contribute

API Reference

Complete API documentation

Citation

If you use Magentic-One in your research, please cite:
@misc{fourney2024magenticonegeneralistmultiagentsolving,
    title={Magentic-One: A Generalist Multi-Agent System for Solving Complex Tasks},
    author={Adam Fourney and Gagan Bansal and Hussein Mozannar and Cheng Tan and Eduardo Salinas and Erkang Zhu and Friederike Niedtner and Grace Proebsting and Griffin Bassman and Jack Gerrits and Jacob Alber and Peter Chang and Ricky Loynd and Robert West and Victor Dibia and Ahmed Awadallah and Ece Kamar and Rafah Hosn and Saleema Amershi},
    year={2024},
    eprint={2411.04468},
    archivePrefix={arXiv},
    primaryClass={cs.AI},
    url={https://arxiv.org/abs/2411.04468}
}
