
Environment variables

Agentic Patterns reads its configuration from a .env file loaded via python-dotenv. Copy .env.example to .env and set the variables relevant to your setup.
Variable | Description | Example
LOCAL_MODEL | Name of the Ollama model to use with ChatOllama | llama3
OLLAMA_HOST | Base URL of the Ollama server | http://localhost:11434
OLLAMA_MODEL | Alternate Ollama host URL (mirrors OLLAMA_HOST in the example) | http://localhost:11434
SUMMARY_HOST | Base URL for the cloud OpenAI-compatible API | https://api.routeway.ai/v1
SUMMARY_MODEL | Model identifier for the cloud API | nemotron-nano-9b-v2:free
SUMMARY_AGENT_API_KEY | API key for the cloud LLM endpoint | sk-...
REPORT_HOST | Base URL for a secondary cloud API used for report generation | https://api.llmapi.ai/v1
REPORT_MODEL | Model identifier for the report API | llama-3-8b-instruct
REPORT_AGENT_API_KEY | API key for the report LLM endpoint | sk-...
Never commit .env to version control. It is already listed in .gitignore.
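A minimal .env based on the variables above (values are illustrative placeholders, not working credentials):

```ini
LOCAL_MODEL=llama3
OLLAMA_HOST=http://localhost:11434
SUMMARY_HOST=https://api.routeway.ai/v1
SUMMARY_MODEL=nemotron-nano-9b-v2:free
SUMMARY_AGENT_API_KEY=sk-...
```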

LLM configuration

Local Ollama

Use ChatOllama from langchain-ollama to run models locally:
import os
from dotenv import load_dotenv
from langchain_ollama import ChatOllama

load_dotenv()
llm = ChatOllama(model=os.getenv("LOCAL_MODEL"))
Ollama must be running at the address specified by OLLAMA_HOST (default: http://localhost:11434). Install Ollama from ollama.com and pull a model with ollama pull llama3 before running.

Cloud OpenAI-compatible

Use ChatOpenAI with a custom base_url to connect to any OpenAI-compatible API:
import os
from dotenv import load_dotenv
from langchain_openai import ChatOpenAI

load_dotenv()
llm_cloud = ChatOpenAI(
    model=os.getenv("SUMMARY_MODEL"),
    temperature=0.3,
    api_key=os.getenv("SUMMARY_AGENT_API_KEY"),
    base_url=os.getenv("SUMMARY_HOST"),
)
You can mix backends per agent. For example, pass a local ChatOllama to PlanningAgent and ExecutionAgent, and a cloud ChatOpenAI to MonitoringAgent for higher-quality evaluation.

Agent configuration

All agents inherit from BaseAgent. The constructor parameters available on every agent are:
llm
BaseChatModel
required
The LangChain chat model instance to use. Any BaseChatModel subclass is accepted — ChatOllama, ChatOpenAI, etc.
system_prompt
str
The system prompt injected at the start of every invocation. Each concrete agent class sets a sensible default, but you can override it. Default values:
  • PlanningAgent: instructs the model to return a JSON {"plan": [...]} object.
  • ExecutionAgent: instructs the model to complete the given action step with detailed output.
  • MonitoringAgent: instructs the model to return a JSON {"success": bool, "feedback": str} object.
  • LocalAgent: instructs the model to aggressively condense text to save tokens.
agent_name
str
A label used in log messages. Defaults to the class name (e.g., "PlanningAgent").
Example — overriding the system prompt on PlanningAgent:
from langchain_ollama import ChatOllama
from agents import PlanningAgent

llm = ChatOllama(model="llama3")
planner = PlanningAgent(
    llm=llm,
    system_prompt=(
        "You are a concise planning agent. Return ONLY a valid JSON object "
        "with a 'plan' key containing at most 3 steps. "
        'Example: {"plan": ["Step 1", "Step 2"]}'
    ),
)
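Assuming the planner returns the {"plan": [...]} shape described above, a small helper (hypothetical, not part of the library) can validate the response and enforce the step cap before handing the plan to an executor:

```python
import json

def parse_plan(raw: str, max_steps: int = 3) -> list:
    """Parse a planner response and enforce the step limit.

    Raises ValueError if the JSON does not match {"plan": [...]}.
    """
    data = json.loads(raw)
    plan = data.get("plan")
    if not isinstance(plan, list) or not all(isinstance(s, str) for s in plan):
        raise ValueError("expected a JSON object with a 'plan' list of strings")
    return plan[:max_steps]

steps = parse_plan('{"plan": ["Outline topics", "Draft summary", "Review", "Publish"]}')
print(steps)  # → ['Outline topics', 'Draft summary', 'Review']
```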

ExecutionAgent tools

ExecutionAgent accepts an optional list of LangChain-compatible tools. When tools are provided, it wraps a LangGraph ReAct agent internally:
tools
List[BaseTool]
A list of BaseTool instances bound to the executor. When non-empty, the agent uses create_react_agent from LangGraph. Pass [] or omit to use a plain LLM chain.
from agents import ExecutionAgent
from tools.curl_search_tool import CurlSearchTool

executor = ExecutionAgent(llm=llm, tools=[CurlSearchTool()])

Workflow configuration

SequentialWorkflow

agents
Dict[str, Any]
required
A dictionary with keys "planner", "executor", and "monitor" mapping to their respective agent instances.
tools
List[Any]
Optional list of tools passed to the workflow. The first tool in the list is used as the context compressor. Supports both invoke() and _run() interfaces.
SequentialWorkflow.run() accepts:
task
str
required
The high-level task description passed to the planner.
max_retries
int
Maximum number of retry attempts per step before aborting the workflow. Defaults to 2.
from workflows import SequentialWorkflow

workflow = SequentialWorkflow(agents=agent_dict, tools=[compressor, curl_tool])
result = workflow.run(task="Summarize recent AI research", max_retries=3)
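The max_retries semantics can be sketched as a plain loop. This is an illustration of the retry pattern, not the library's actual implementation; the fake executor and monitor callables, and the assumption of one initial attempt plus max_retries retries, are ours:

```python
def run_step(step, execute, evaluate, max_retries=2):
    """Attempt a step, re-executing on monitor failure up to max_retries extra times."""
    for attempt in range(max_retries + 1):
        output = execute(step)
        verdict = evaluate(step, output)  # e.g. {"success": bool, "feedback": str}
        if verdict["success"]:
            return output
    raise RuntimeError(f"step failed after {max_retries} retries: {step}")

# Demo: the monitor rejects the first attempt, accepts the second.
attempts = {"count": 0}

def fake_execute(step):
    return f"output for {step}"

def fake_evaluate(step, output):
    attempts["count"] += 1
    return {"success": attempts["count"] >= 2, "feedback": "try again"}

print(run_step("Summarize topic", fake_execute, fake_evaluate, max_retries=3))
# → output for Summarize topic
```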

ParallelWorkflow

ParallelWorkflow.run() accepts:
tasks
List[str]
required
A list of independent task descriptions to execute concurrently.
max_workers
int
Maximum number of threads in the pool. Defaults to 5.
from workflows import ParallelWorkflow

workflow = ParallelWorkflow(agents={"executor": executor})
result = workflow.run(
    tasks=["Summarize topic A", "Summarize topic B", "Summarize topic C"],
    max_workers=3,
)
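The fan-out maps naturally onto a thread pool. A minimal sketch of the pattern (stdlib only; not the library's actual code) shows how max_workers bounds concurrency while results keep the input order:

```python
from concurrent.futures import ThreadPoolExecutor

def run_parallel(tasks, execute, max_workers=5):
    """Run independent tasks concurrently; pool.map preserves input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(execute, tasks))

results = run_parallel(
    ["topic A", "topic B", "topic C"],
    lambda t: f"summary of {t}",
    max_workers=3,
)
print(results)  # → ['summary of topic A', 'summary of topic B', 'summary of topic C']
```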

LangGraphOrchestrator configuration

planner
PlanningAgent
required
The planning agent instance.
executor
ExecutionAgent
required
The execution agent instance.
monitor
MonitoringAgent
required
The monitoring agent instance.
compressor
Any
An optional compressor used to shrink accumulated context before each execution step. Accepts any object with an invoke() or _run() method — typically a CompressContextTool or a LocalAgent instance.
max_retries
int
Maximum retry attempts per step before the orchestrator transitions to "failed" status. Defaults to 2.
from orchestators import LangGraphOrchestrator
from tools.compress_context_tool import CompressContextTool

orchestrator = LangGraphOrchestrator(
    planner=planner,
    executor=executor,
    monitor=monitor,
    compressor=CompressContextTool(max_length=10000),
    max_retries=2,
)
result = orchestrator.run(task="Research and summarize the history of Andorra")
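Because the compressor only needs an invoke() or _run() method, either interface can be called uniformly. A small adapter (hypothetical, mirroring the duck typing described above) illustrates the dispatch:

```python
def compress(compressor, text: str) -> str:
    """Call a compressor via invoke() when available, falling back to _run()."""
    if hasattr(compressor, "invoke"):
        return compressor.invoke(text)
    return compressor._run(text)

class TruncatingCompressor:
    """Toy compressor exposing only the _run() interface."""
    def _run(self, text: str) -> str:
        return text[:20]

print(compress(TruncatingCompressor(), "x" * 100))  # → 20 x's
```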

Memory configuration

SQLiteShortTermMemory stores conversation history and serves as a caching layer.
db_path
str
Path to the SQLite database file. Defaults to 'short_term_memory.db' in the working directory. Use ':memory:' for ephemeral RAM-only storage that is discarded when the process exits.
from memory.short_term_memory import SQLiteShortTermMemory

# Persistent storage
memory = SQLiteShortTermMemory(db_path="my_project_memory.db")

# Ephemeral (in-memory only)
memory = SQLiteShortTermMemory(db_path=":memory:")
add_memory() accepts:
session_id
str
required
Unique identifier grouping related messages. Use a consistent value (e.g., "global_cache_session") across runs to enable cache hits.
role
str
required
The speaker role: "user", "assistant", "system", or "tool".
content
str
required
The text content to persist.
metadata
Dict[str, Any]
Optional dictionary of additional context (e.g., tool names, token counts).
get_context() accepts:
session_id
str
required
The session to retrieve messages for.
limit
int
Maximum number of recent messages to return, in chronological order. Defaults to 10.
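The storage pattern behind these two methods can be sketched with stdlib sqlite3. The table name, schema, and class below are illustrative assumptions, not the library's actual schema:

```python
import json
import sqlite3

class MiniMemory:
    """Illustrative short-term memory backed by SQLite."""

    def __init__(self, db_path=":memory:"):
        self.conn = sqlite3.connect(db_path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS messages ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, "
            "session_id TEXT, role TEXT, content TEXT, metadata TEXT)"
        )

    def add_memory(self, session_id, role, content, metadata=None):
        self.conn.execute(
            "INSERT INTO messages (session_id, role, content, metadata) "
            "VALUES (?, ?, ?, ?)",
            (session_id, role, content, json.dumps(metadata or {})),
        )
        self.conn.commit()

    def get_context(self, session_id, limit=10):
        # Fetch the most recent rows, then flip back to chronological order.
        rows = self.conn.execute(
            "SELECT role, content FROM messages WHERE session_id = ? "
            "ORDER BY id DESC LIMIT ?",
            (session_id, limit),
        ).fetchall()
        return list(reversed(rows))

mem = MiniMemory()
mem.add_memory("s1", "user", "hello")
mem.add_memory("s1", "assistant", "hi there")
print(mem.get_context("s1"))  # → [('user', 'hello'), ('assistant', 'hi there')]
```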

CompressContextTool configuration

CompressContextTool reduces prompt size locally by stripping whitespace, removing common filler words, and truncating to a length limit.
max_length
int
Maximum character length of the compressed output. Text beyond this limit is replaced with ... [TRUNCATED]. Defaults to 4000.
from tools.compress_context_tool import CompressContextTool

# Default — truncate at 4000 characters
compressor = CompressContextTool()

# Extended limit for larger context windows
compressor = CompressContextTool(max_length=10000)

# Call directly
compressed = compressor._run(long_text)
For LLM-quality compression (summarization rather than truncation), pass a LocalAgent as the compressor parameter to LangGraphOrchestrator or SequentialWorkflow. LocalAgent is used automatically when LOCAL_MODEL is set and Ollama is reachable.
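A rough sketch of the three compression steps described above. The filler-word list and helper name are our assumptions, not the tool's actual implementation; only the "... [TRUNCATED]" marker and the default limit come from the description:

```python
import re

FILLERS = {"basically", "actually", "very", "really"}  # illustrative list

def compress_text(text: str, max_length: int = 4000) -> str:
    """Collapse whitespace, drop filler words, then truncate with a marker."""
    words = re.sub(r"\s+", " ", text).strip().split(" ")
    kept = " ".join(w for w in words if w.lower() not in FILLERS)
    if len(kept) > max_length:
        kept = kept[:max_length] + "... [TRUNCATED]"
    return kept

print(compress_text("This is   basically a very    long sentence.", max_length=30))
# → This is a long sentence.
```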
