
Environment variables

Agentic Patterns reads its configuration from a .env file loaded via python-dotenv. Copy .env.example to .env and set the variables relevant to your setup.
Variable | Description | Example
LOCAL_MODEL | Name of the Ollama model to use with ChatOllama | llama3
OLLAMA_HOST | Base URL of the Ollama server | http://localhost:11434
OLLAMA_MODEL | Alternate Ollama host URL (mirrors OLLAMA_HOST in the example) | http://localhost:11434
SUMMARY_HOST | Base URL for the cloud OpenAI-compatible API | https://api.routeway.ai/v1
SUMMARY_MODEL | Model identifier for the cloud API | nemotron-nano-9b-v2:free
SUMMARY_AGENT_API_KEY | API key for the cloud LLM endpoint | sk-...
REPORT_HOST | Base URL for a secondary cloud API used for report generation | https://api.llmapi.ai/v1
REPORT_MODEL | Model identifier for the report API | llama-3-8b-instruct
REPORT_AGENT_API_KEY | API key for the report LLM endpoint | sk-...
Never commit .env to version control. It is already listed in .gitignore.
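A minimal .env based on the variables above (values are illustrative placeholders, not working credentials):

```ini
LOCAL_MODEL=llama3
OLLAMA_HOST=http://localhost:11434
SUMMARY_HOST=https://api.routeway.ai/v1
SUMMARY_MODEL=nemotron-nano-9b-v2:free
SUMMARY_AGENT_API_KEY=sk-...
```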

LLM configuration

Local Ollama

Use ChatOllama from langchain-ollama to run models locally:
import os
from dotenv import load_dotenv
from langchain_ollama import ChatOllama

load_dotenv()
llm = ChatOllama(model=os.getenv("LOCAL_MODEL"))
Ollama must be running at the address specified by OLLAMA_HOST (default: http://localhost:11434). Install Ollama from ollama.com and pull a model with ollama pull llama3 before running.

Cloud OpenAI-compatible

Use ChatOpenAI with a custom base_url to connect to any OpenAI-compatible API:
import os
from dotenv import load_dotenv
from langchain_openai import ChatOpenAI

load_dotenv()
llm_cloud = ChatOpenAI(
    model=os.getenv("SUMMARY_MODEL"),
    temperature=0.3,
    api_key=os.getenv("SUMMARY_AGENT_API_KEY"),
    base_url=os.getenv("SUMMARY_HOST"),
)
You can mix backends per agent. For example, pass a local ChatOllama to PlanningAgent and ExecutionAgent, and a cloud ChatOpenAI to MonitoringAgent for higher-quality evaluation.

Agent configuration

All agents inherit from BaseAgent. The constructor parameters available on every agent are:
llm
BaseChatModel
required
The LangChain chat model instance to use. Any BaseChatModel subclass is accepted — ChatOllama, ChatOpenAI, etc.
system_prompt
str
The system prompt injected at the start of every invocation. Each concrete agent class sets a sensible default, but you can override it. Default values:
  • PlanningAgent: instructs the model to return a JSON {"plan": [...]} object.
  • ExecutionAgent: instructs the model to complete the given action step with detailed output.
  • MonitoringAgent: instructs the model to return a JSON {"success": bool, "feedback": str} object.
  • LocalAgent: instructs the model to aggressively condense text to save tokens.
agent_name
str
A label used in log messages. Defaults to the class name (e.g., "PlanningAgent").
Example — overriding the system prompt on PlanningAgent:
from langchain_ollama import ChatOllama
from agents import PlanningAgent

llm = ChatOllama(model="llama3")
planner = PlanningAgent(
    llm=llm,
    system_prompt=(
        "You are a concise planning agent. Return ONLY a valid JSON object "
        "with a 'plan' key containing at most 3 steps. "
        'Example: {"plan": ["Step 1", "Step 2"]}'
    ),
)
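Assuming the planner returns the {"plan": [...]} shape described above, a small helper (hypothetical, not part of the library) can validate the response and enforce the step cap before handing the plan to an executor:

```python
import json

def parse_plan(raw: str, max_steps: int = 3) -> list:
    """Parse a planner response and enforce the step limit.

    Raises ValueError if the JSON does not match {"plan": [...]}.
    """
    data = json.loads(raw)
    plan = data.get("plan")
    if not isinstance(plan, list) or not all(isinstance(s, str) for s in plan):
        raise ValueError("expected a JSON object with a 'plan' list of strings")
    return plan[:max_steps]

steps = parse_plan('{"plan": ["Outline topics", "Draft summary", "Review", "Publish"]}')
print(steps)  # → ['Outline topics', 'Draft summary', 'Review']
```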

ExecutionAgent tools

ExecutionAgent accepts an optional list of LangChain-compatible tools. When tools are provided, it wraps a LangGraph ReAct agent internally:
tools
List[BaseTool]
A list of BaseTool instances bound to the executor. When non-empty, the agent uses create_react_agent from LangGraph. Pass [] or omit to use a plain LLM chain.
from agents import ExecutionAgent
from tools.curl_search_tool import CurlSearchTool

executor = ExecutionAgent(llm=llm, tools=[CurlSearchTool()])

Workflow configuration

SequentialWorkflow

agents
Dict[str, Any]
required
A dictionary with keys "planner", "executor", and "monitor" mapping to their respective agent instances.
tools
List[Any]
Optional list of tools passed to the workflow. The first tool in the list is used as the context compressor. Supports both invoke() and _run() interfaces.
SequentialWorkflow.run() accepts:
task
str
required
The high-level task description passed to the planner.
max_retries
int
Maximum number of retry attempts per step before aborting the workflow. Defaults to 2.
from workflows import SequentialWorkflow

workflow = SequentialWorkflow(agents=agent_dict, tools=[compressor, curl_tool])
result = workflow.run(task="Summarize recent AI research", max_retries=3)
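The max_retries semantics can be sketched as a plain loop. This is an illustration of the retry pattern, not the library's actual implementation; the fake executor and monitor callables, and the assumption of one initial attempt plus max_retries retries, are ours:

```python
def run_step(step, execute, evaluate, max_retries=2):
    """Attempt a step, re-executing on monitor failure up to max_retries extra times."""
    for attempt in range(max_retries + 1):
        output = execute(step)
        verdict = evaluate(step, output)  # e.g. {"success": bool, "feedback": str}
        if verdict["success"]:
            return output
    raise RuntimeError(f"step failed after {max_retries} retries: {step}")

# Demo: the monitor rejects the first attempt, accepts the second.
attempts = {"count": 0}

def fake_execute(step):
    return f"output for {step}"

def fake_evaluate(step, output):
    attempts["count"] += 1
    return {"success": attempts["count"] >= 2, "feedback": "try again"}

print(run_step("Summarize topic", fake_execute, fake_evaluate, max_retries=3))
# → output for Summarize topic
```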

ParallelWorkflow

ParallelWorkflow.run() accepts:
tasks
List[str]
required
A list of independent task descriptions to execute concurrently.
max_workers
int
Maximum number of threads in the pool. Defaults to 5.
from workflows import ParallelWorkflow

workflow = ParallelWorkflow(agents={"executor": executor})
result = workflow.run(
    tasks=["Summarize topic A", "Summarize topic B", "Summarize topic C"],
    max_workers=3,
)
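The fan-out maps naturally onto a thread pool. A minimal sketch of the pattern (stdlib only; not the library's actual code) shows how max_workers bounds concurrency while results keep the input order:

```python
from concurrent.futures import ThreadPoolExecutor

def run_parallel(tasks, execute, max_workers=5):
    """Run independent tasks concurrently; pool.map preserves input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(execute, tasks))

results = run_parallel(
    ["topic A", "topic B", "topic C"],
    lambda t: f"summary of {t}",
    max_workers=3,
)
print(results)  # → ['summary of topic A', 'summary of topic B', 'summary of topic C']
```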

LangGraphOrchestrator configuration

planner
PlanningAgent
required
The planning agent instance.
executor
ExecutionAgent
required
The execution agent instance.
monitor
MonitoringAgent
required
The monitoring agent instance.
compressor
Any
An optional compressor used to shrink accumulated context before each execution step. Accepts any object with an invoke() or _run() method — typically a CompressContextTool or a LocalAgent instance.
max_retries
int
Maximum retry attempts per step before the orchestrator transitions to "failed" status. Defaults to 2.
from orchestators import LangGraphOrchestrator
from tools.compress_context_tool import CompressContextTool

orchestrator = LangGraphOrchestrator(
    planner=planner,
    executor=executor,
    monitor=monitor,
    compressor=CompressContextTool(max_length=10000),
    max_retries=2,
)
result = orchestrator.run(task="Research and summarize the history of Andorra")
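Because the compressor only needs an invoke() or _run() method, either interface can be called uniformly. A small adapter (hypothetical, mirroring the duck typing described above) illustrates the dispatch:

```python
def compress(compressor, text: str) -> str:
    """Call a compressor via invoke() when available, falling back to _run()."""
    if hasattr(compressor, "invoke"):
        return compressor.invoke(text)
    return compressor._run(text)

class TruncatingCompressor:
    """Toy compressor exposing only the _run() interface."""
    def _run(self, text: str) -> str:
        return text[:20]

print(compress(TruncatingCompressor(), "x" * 100))  # → 20 x's
```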

Memory configuration

SQLiteShortTermMemory stores conversation history and serves as a caching layer.
db_path
str
Path to the SQLite database file. Defaults to 'short_term_memory.db' in the working directory. Use ':memory:' for ephemeral RAM-only storage that is discarded when the process exits.
from memory.short_term_memory import SQLiteShortTermMemory

# Persistent storage
memory = SQLiteShortTermMemory(db_path="my_project_memory.db")

# Ephemeral (in-memory only)
memory = SQLiteShortTermMemory(db_path=":memory:")
add_memory() accepts:
session_id
str
required
Unique identifier grouping related messages. Use a consistent value (e.g., "global_cache_session") across runs to enable cache hits.
role
str
required
The speaker role: "user", "assistant", "system", or "tool".
content
str
required
The text content to persist.
metadata
Dict[str, Any]
Optional dictionary of additional context (e.g., tool names, token counts).
get_context() accepts:
session_id
str
required
The session to retrieve messages for.
limit
int
Maximum number of recent messages to return, in chronological order. Defaults to 10.
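The storage pattern behind these two methods can be sketched with stdlib sqlite3. The table name, schema, and class below are illustrative assumptions, not the library's actual schema:

```python
import json
import sqlite3

class MiniMemory:
    """Illustrative short-term memory backed by SQLite."""

    def __init__(self, db_path=":memory:"):
        self.conn = sqlite3.connect(db_path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS messages ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, "
            "session_id TEXT, role TEXT, content TEXT, metadata TEXT)"
        )

    def add_memory(self, session_id, role, content, metadata=None):
        self.conn.execute(
            "INSERT INTO messages (session_id, role, content, metadata) "
            "VALUES (?, ?, ?, ?)",
            (session_id, role, content, json.dumps(metadata or {})),
        )
        self.conn.commit()

    def get_context(self, session_id, limit=10):
        # Fetch the most recent rows, then flip back to chronological order.
        rows = self.conn.execute(
            "SELECT role, content FROM messages WHERE session_id = ? "
            "ORDER BY id DESC LIMIT ?",
            (session_id, limit),
        ).fetchall()
        return list(reversed(rows))

mem = MiniMemory()
mem.add_memory("s1", "user", "hello")
mem.add_memory("s1", "assistant", "hi there")
print(mem.get_context("s1"))  # → [('user', 'hello'), ('assistant', 'hi there')]
```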

CompressContextTool configuration

CompressContextTool reduces prompt size locally by stripping whitespace, removing common filler words, and truncating to a length limit.
max_length
int
Maximum character length of the compressed output. Text beyond this limit is replaced with ... [TRUNCATED]. Defaults to 4000.
from tools.compress_context_tool import CompressContextTool

# Default — truncate at 4000 characters
compressor = CompressContextTool()

# Extended limit for larger context windows
compressor = CompressContextTool(max_length=10000)

# Call directly
compressed = compressor._run(long_text)
For LLM-quality compression (summarization rather than truncation), pass a LocalAgent as the compressor parameter to LangGraphOrchestrator or SequentialWorkflow. LocalAgent is used automatically when LOCAL_MODEL is set and Ollama is reachable.
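A rough sketch of the three compression steps described above. The filler-word list and helper name are our assumptions, not the tool's actual implementation; only the "... [TRUNCATED]" marker and the default limit come from the description:

```python
import re

FILLERS = {"basically", "actually", "very", "really"}  # illustrative list

def compress_text(text: str, max_length: int = 4000) -> str:
    """Collapse whitespace, drop filler words, then truncate with a marker."""
    words = re.sub(r"\s+", " ", text).strip().split(" ")
    kept = " ".join(w for w in words if w.lower() not in FILLERS)
    if len(kept) > max_length:
        kept = kept[:max_length] + "... [TRUNCATED]"
    return kept

print(compress_text("This is   basically a very    long sentence.", max_length=30))
# → This is a long sentence.
```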
