
What is LangGraph Server?

DeerFlow uses LangGraph as its core agent runtime. The LangGraph Server provides HTTP/SSE endpoints for creating threads, sending messages, and streaming agent responses.
Default Port: In development, the LangGraph Server runs on port 2024. When using make dev from the project root, all API requests are proxied through Nginx on port 2026 under the /api/langgraph prefix.

Base URL

The LangGraph API is accessible at:
  • Production/Docker: http://localhost:2026/api/langgraph
  • Direct access (dev): http://localhost:2024

Architecture

DeerFlow’s LangGraph Server is configured via backend/langgraph.json:
{
  "$schema": "https://langgra.ph/schema.json",
  "dependencies": ["."],
  "env": ".env",
  "graphs": {
    "lead_agent": "src.agents:make_lead_agent"
  }
}
The lead_agent graph factory (make_lead_agent) creates the main agent with:
  • Dynamic model selection based on runtime configuration
  • Middleware chain for thread isolation, memory, uploads, and more
  • Tools from sandbox, built-ins, MCP servers, and community integrations
  • System prompt with skills, memory context, and date injection

Using the LangGraph SDK

The official LangGraph SDK provides a Python client for interacting with LangGraph Server.

Installation

pip install langgraph-sdk

Basic Usage

from langgraph_sdk import get_client

# Connect to LangGraph Server
client = get_client(url="http://localhost:2024")

# Create a new thread
thread = await client.threads.create()
print(f"Thread ID: {thread['thread_id']}")

# Send a message and get response
response = await client.runs.create(
    thread_id=thread["thread_id"],
    assistant_id="lead_agent",
    input={"messages": [{"role": "user", "content": "Hello!"}]}
)
Use assistant_id="lead_agent" when creating runs. This references the graph defined in langgraph.json.
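Under the hood, the SDK call above issues a POST to the server's runs endpoint. A stdlib-only sketch of the equivalent raw request (payload shape follows the standard LangGraph API; the request is built but not sent here):

```python
import json
import urllib.request


def build_run_request(base_url: str, thread_id: str, assistant_id: str,
                      messages: list) -> urllib.request.Request:
    """Build the POST /threads/{thread_id}/runs request the SDK sends."""
    payload = {
        "assistant_id": assistant_id,
        "input": {"messages": messages},
    }
    return urllib.request.Request(
        url=f"{base_url}/threads/{thread_id}/runs",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_run_request(
    "http://localhost:2024", "some-thread-id", "lead_agent",
    [{"role": "user", "content": "Hello!"}],
)
# urllib.request.urlopen(req) would submit the run (requires a running server)
```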

Runtime Configuration

You can customize agent behavior by passing config.configurable parameters:
response = await client.runs.create(
    thread_id=thread["thread_id"],
    assistant_id="lead_agent",
    input={"messages": [{"role": "user", "content": "Analyze this code"}]},
    config={
        "configurable": {
            "thinking_enabled": True,
            "model_name": "gpt-4",
            "is_plan_mode": True,
            "subagent_enabled": True,
            "max_concurrent_subagents": 3
        }
    }
)

Available Configuration Options

| Parameter | Type | Default | Description |
|---|---|---|---|
| thinking_enabled | bool | True | Enable extended thinking mode for supported models |
| reasoning_effort | str | None | Reasoning effort level (e.g., "low", "medium", "high") |
| model_name | str | Config default | Override the model for this conversation |
| is_plan_mode | bool | False | Enable TodoList middleware for task tracking |
| subagent_enabled | bool | False | Enable the task tool for delegating to sub-agents |
| max_concurrent_subagents | int | 3 | Maximum number of parallel sub-agent tasks |
| agent_name | str | None | Use a custom agent (requires custom agent setup) |
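A small helper can apply these defaults before sending a run. A sketch, with defaults taken from the table above (the helper name and validation are illustrative, not part of the SDK):

```python
# Merge user overrides onto the documented configurable defaults.
CONFIGURABLE_DEFAULTS = {
    "thinking_enabled": True,
    "reasoning_effort": None,
    "model_name": None,      # None falls back to the server's config default
    "is_plan_mode": False,
    "subagent_enabled": False,
    "max_concurrent_subagents": 3,
    "agent_name": None,
}


def build_config(**overrides) -> dict:
    """Return a config dict for client.runs.create(..., config=...)."""
    unknown = set(overrides) - set(CONFIGURABLE_DEFAULTS)
    if unknown:
        raise ValueError(f"Unknown configurable keys: {sorted(unknown)}")
    return {"configurable": {**CONFIGURABLE_DEFAULTS, **overrides}}
```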

Thread State Schema

DeerFlow extends LangGraph’s AgentState with additional fields in ThreadState:
class ThreadState(AgentState):
    # Standard LangGraph fields
    messages: list[Message]  # Conversation history
    
    # DeerFlow extensions
    sandbox: SandboxState | None  # Sandbox connection info
    thread_data: ThreadDataState | None  # Per-thread file paths
    title: str | None  # Auto-generated thread title
    artifacts: list[str]  # Presented output files
    todos: list | None  # Task list (when plan_mode enabled)
    uploaded_files: list[dict] | None  # User-uploaded files
    viewed_images: dict[str, ViewedImageData]  # Image cache for vision models

Thread Isolation

Each thread gets isolated directories created by ThreadDataMiddleware:
  • Workspace: backend/.deer-flow/threads/{thread_id}/user-data/workspace/
  • Uploads: backend/.deer-flow/threads/{thread_id}/user-data/uploads/
  • Outputs: backend/.deer-flow/threads/{thread_id}/user-data/outputs/
Inside the agent’s sandbox, these paths are mapped to /mnt/user-data/workspace, /mnt/user-data/uploads, and /mnt/user-data/outputs.
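This host-to-sandbox mapping amounts to a simple path rewrite. A sketch assuming the directory layout above (the function is illustrative; DeerFlow performs the mapping internally):

```python
from pathlib import PurePosixPath


def host_to_sandbox_path(thread_id: str, host_path: str) -> str:
    """Map a per-thread host path under user-data/ to its sandbox mount."""
    root = f"backend/.deer-flow/threads/{thread_id}/user-data"
    rel = PurePosixPath(host_path).relative_to(root)
    return str(PurePosixPath("/mnt/user-data") / rel)
```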

Agent Graph Structure

The lead_agent graph is created using LangGraph’s create_agent() API with:
  1. Model: Selected via create_chat_model(name, thinking_enabled)
  2. Tools: Combined from multiple sources via get_available_tools()
  3. Middleware: 11 middleware components processing requests/responses
  4. System Prompt: Generated by apply_prompt_template() with context injection
  5. State Schema: ThreadState with custom reducers

Middleware Chain

Middleware components execute in a strict order, defined in src/agents/lead_agent/agent.py:207:
  1. ThreadDataMiddleware - Create per-thread directories
  2. UploadsMiddleware - Inject uploaded files into context
  3. SandboxMiddleware - Acquire and manage sandbox lifecycle
  4. DanglingToolCallMiddleware - Patch missing tool responses
  5. SummarizationMiddleware - Context reduction (optional)
  6. TodoListMiddleware - Task tracking (optional, plan_mode)
  7. TitleMiddleware - Auto-generate thread title
  8. MemoryMiddleware - Queue conversations for memory updates
  9. ViewImageMiddleware - Inject images for vision models
  10. SubagentLimitMiddleware - Enforce parallel task limits (optional)
  11. ClarificationMiddleware - Intercept clarification requests (always last)
Middleware order is critical for proper operation. See backend/src/agents/lead_agent/agent.py:198 for detailed comments.
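Order sensitivity can be illustrated with a minimal pipeline where each middleware wraps the next. This is a toy sketch whose behavior is invented for illustration; only the names mirror the list above:

```python
# Toy middleware pipeline: each middleware records when it runs, showing
# that the first registered middleware becomes the outermost wrapper.
from typing import Callable

Handler = Callable[[dict], dict]


def make_middleware(name: str, trace: list) -> Callable[[Handler], Handler]:
    def wrap(next_handler: Handler) -> Handler:
        def handler(state: dict) -> dict:
            trace.append(f"{name}:before")
            result = next_handler(state)
            trace.append(f"{name}:after")
            return result
        return handler
    return wrap


def build_chain(names: list, trace: list, core: Handler) -> Handler:
    handler = core
    # Wrap in reverse so the first registered name ends up outermost.
    for name in reversed(names):
        handler = make_middleware(name, trace)(handler)
    return handler


trace: list = []
chain = build_chain(
    ["ThreadData", "Uploads", "Clarification"], trace,
    core=lambda state: {**state, "done": True},
)
chain({})
# trace: ThreadData:before, Uploads:before, Clarification:before,
#        Clarification:after, Uploads:after, ThreadData:after
```

This is why ThreadDataMiddleware runs first (directories must exist before anything else touches them) and ClarificationMiddleware sits last, closest to the model.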

API Endpoints

The LangGraph Server exposes the standard LangGraph API endpoints, including:
  • POST /threads - Create a new thread
  • GET /threads/{thread_id}/state - Fetch the current thread state
  • POST /threads/{thread_id}/runs - Create a run on a thread
  • POST /threads/{thread_id}/runs/stream - Create a run and stream its output
  • POST /threads/{thread_id}/history - Retrieve thread state history

Alternative: Embedded Python Client

For Python applications running in the same process, use DeerFlowClient instead of HTTP:
from src.client import DeerFlowClient

client = DeerFlowClient()

# Streaming response
for event in client.stream("Hello!", thread_id="my-thread"):
    if event.type == "messages-tuple":
        print(event.data)

# Synchronous chat
response = client.chat("Analyze this code", thread_id="my-thread")
print(response)
DeerFlowClient shares the same configuration files and data directories as LangGraph Server, but doesn’t require any HTTP services.
See Python Client API for full documentation.

Next Steps

  • Thread Management - Learn how to create and manage conversation threads
  • Streaming - Stream agent responses with Server-Sent Events
  • Python Client - Use the embedded Python client for in-process access
  • Agent Configuration - Configure models, tools, and runtime behavior
