## Environment variables
Agentic Patterns reads its configuration from a `.env` file loaded via python-dotenv. Copy `.env.example` to `.env` and set the variables relevant to your setup.
| Variable | Description | Example |
|---|---|---|
| `LOCAL_MODEL` | Name of the Ollama model to use with `ChatOllama` | `llama3` |
| `OLLAMA_HOST` | Base URL of the Ollama server | `http://localhost:11434` |
| `OLLAMA_MODEL` | Alternate Ollama host URL (mirrors `OLLAMA_HOST` in the example) | `http://localhost:11434` |
| `SUMMARY_HOST` | Base URL for the cloud OpenAI-compatible API | `https://api.routeway.ai/v1` |
| `SUMMARY_MODEL` | Model identifier for the cloud API | `nemotron-nano-9b-v2:free` |
| `SUMMARY_AGENT_API_KEY` | API key for the cloud LLM endpoint | `sk-...` |
| `REPORT_HOST` | Base URL for a secondary cloud API used for report generation | `https://api.llmapi.ai/v1` |
| `REPORT_MODEL` | Model identifier for the report API | `llama-3-8b-instruct` |
| `REPORT_AGENT_API_KEY` | API key for the report LLM endpoint | `sk-...` |
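Putting the table together, a `.env` might look like the following. The values are illustrative placeholders taken from the table's example column, not working credentials:

```bash
# Local Ollama
LOCAL_MODEL=llama3
OLLAMA_HOST=http://localhost:11434
OLLAMA_MODEL=http://localhost:11434

# Cloud summarization endpoint
SUMMARY_HOST=https://api.routeway.ai/v1
SUMMARY_MODEL=nemotron-nano-9b-v2:free
SUMMARY_AGENT_API_KEY=sk-...

# Report generation endpoint
REPORT_HOST=https://api.llmapi.ai/v1
REPORT_MODEL=llama-3-8b-instruct
REPORT_AGENT_API_KEY=sk-...
```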
## LLM configuration

### Local Ollama
Use `ChatOllama` from `langchain-ollama` to run models locally. The client connects to `OLLAMA_HOST` (default: `http://localhost:11434`). Install Ollama from ollama.com and pull a model with `ollama pull llama3` before running.
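A minimal sketch of wiring the local model from the environment. The helper function name is ours; the commented construction assumes `langchain-ollama`'s `ChatOllama`, which accepts `model` and `base_url` keyword arguments:

```python
import os


def local_llm_kwargs() -> dict:
    """Collect ChatOllama settings from the environment, with the documented defaults."""
    return {
        "model": os.getenv("LOCAL_MODEL", "llama3"),
        "base_url": os.getenv("OLLAMA_HOST", "http://localhost:11434"),
    }


# With langchain-ollama installed:
# from langchain_ollama import ChatOllama
# llm = ChatOllama(**local_llm_kwargs())
```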
### Cloud OpenAI-compatible

Use `ChatOpenAI` with a custom `base_url` to connect to any OpenAI-compatible API.
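As a sketch, the summary endpoint from the table above can be configured like this. The helper name is ours; the commented construction assumes `langchain-openai`'s `ChatOpenAI`, which accepts `model`, `base_url`, and `api_key` keyword arguments:

```python
import os


def summary_llm_kwargs() -> dict:
    """Collect settings for the OpenAI-compatible summary endpoint."""
    return {
        "model": os.getenv("SUMMARY_MODEL", "nemotron-nano-9b-v2:free"),
        "base_url": os.getenv("SUMMARY_HOST", "https://api.routeway.ai/v1"),
        "api_key": os.getenv("SUMMARY_AGENT_API_KEY", ""),
    }


# With langchain-openai installed:
# from langchain_openai import ChatOpenAI
# llm = ChatOpenAI(**summary_llm_kwargs())
```

The same pattern applies to the `REPORT_*` variables for the report-generation endpoint.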
## Agent configuration

All agents inherit from `BaseAgent`. The constructor parameters available on every agent are:

- **Model**: The LangChain chat model instance to use. Any `BaseChatModel` subclass is accepted (`ChatOllama`, `ChatOpenAI`, etc.).
- **System prompt**: The prompt injected at the start of every invocation. Each concrete agent class sets a sensible default, but you can override it. Default values:
  - `PlanningAgent`: instructs the model to return a JSON `{"plan": [...]}` object.
  - `ExecutionAgent`: instructs the model to complete the given action step with detailed output.
  - `MonitoringAgent`: instructs the model to return a JSON `{"success": bool, "feedback": str}` object.
  - `LocalAgent`: instructs the model to aggressively condense text to save tokens.
- **Name**: A label used in log messages. Defaults to the class name (e.g., `"PlanningAgent"`).
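The defaulting behavior described above can be sketched in plain Python. The class and parameter names here are illustrative, not the library's actual signature:

```python
class SketchAgent:
    """Illustrative stand-in for BaseAgent's constructor behavior."""

    DEFAULT_PROMPT = "You are a helpful agent."

    def __init__(self, llm=None, system_prompt=None, name=None):
        self.llm = llm                                         # any BaseChatModel-like object
        self.system_prompt = system_prompt or self.DEFAULT_PROMPT
        self.name = name or type(self).__name__                # label used in log messages


class SketchPlanningAgent(SketchAgent):
    # Concrete agents override the default prompt, as PlanningAgent does.
    DEFAULT_PROMPT = 'Return a JSON {"plan": [...]} object.'


agent = SketchPlanningAgent()
# agent.name falls back to the class name; the prompt is the planning default.
```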
### ExecutionAgent tools

`ExecutionAgent` accepts an optional list of LangChain-compatible tools. When tools are provided, it wraps a LangGraph ReAct agent internally:

- **Tools**: A list of `BaseTool` instances bound to the executor. When non-empty, the agent uses `create_react_agent` from LangGraph. Pass `[]` or omit to use a plain LLM chain.

## Workflow configuration
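Both workflows below run a plan, execute, monitor loop with bounded retries per step. A minimal sketch of that control flow, using the monitor's documented `(success, feedback)` result shape; the function and parameter names are our assumptions, not the project's actual code:

```python
def run_step(execute, monitor, step, max_retries=2):
    """Run one plan step, retrying on monitor feedback up to max_retries times."""
    feedback = ""
    for _ in range(max_retries):
        output = execute(step, feedback)
        ok, feedback = monitor(step, output)
        if ok:
            return output
    raise RuntimeError(f"Step {step!r} failed after {max_retries} attempts")


# Toy run: the monitor rejects the first attempt and accepts the second.
attempts = []
result = run_step(
    execute=lambda step, fb: attempts.append(step) or f"output {len(attempts)}",
    monitor=lambda step, out: (out.endswith("2"), "try again"),
    step="draft summary",
)
# result == "output 2" after two attempts
```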
### SequentialWorkflow

The constructor accepts:

- **Agents**: A dictionary with keys `"planner"`, `"executor"`, and `"monitor"` mapping to their respective agent instances.
- **Tools**: An optional list of tools passed to the workflow. The first tool in the list is used as the context compressor. Supports both `invoke()` and `_run()` interfaces.

`SequentialWorkflow.run()` accepts:

- **Task**: The high-level task description passed to the planner.
- **Max retries**: Maximum number of retry attempts per step before aborting the workflow. Defaults to `2`.

### ParallelWorkflow
`ParallelWorkflow.run()` accepts:

- **Tasks**: A list of independent task descriptions to execute concurrently.
- **Max threads**: Maximum number of threads in the pool. Defaults to `5`.

## LangGraphOrchestrator configuration
The constructor accepts:

- **Planner**: The planning agent instance.
- **Executor**: The execution agent instance.
- **Monitor**: The monitoring agent instance.
- **Compressor**: An optional compressor used to shrink accumulated context before each execution step. Accepts any object with an `invoke()` or `_run()` method, typically a `CompressContextTool` or a `LocalAgent` instance.
- **Max retries**: Maximum retry attempts per step before the orchestrator transitions to `"failed"` status. Defaults to `2`.

## Memory configuration
`SQLiteShortTermMemory` stores conversation history and serves as a caching layer. Its constructor accepts:

- **Database path**: Path to the SQLite database file. Defaults to `'short_term_memory.db'` in the working directory. Use `':memory:'` for ephemeral RAM-only storage that is discarded when the process exits.

`add_memory()` accepts:

- **Session ID**: A unique identifier grouping related messages. Use a consistent value (e.g., `"global_cache_session"`) across runs to enable cache hits.
- **Role**: The speaker role: `"user"`, `"assistant"`, `"system"`, or `"tool"`.
- **Content**: The text content to persist.
- **Metadata**: An optional dictionary of additional context (e.g., tool names, token counts).

`get_context()` accepts:

- **Session ID**: The session to retrieve messages for.
- **Limit**: Maximum number of recent messages to return, in chronological order. Defaults to `10`.

## CompressContextTool configuration
`CompressContextTool` reduces prompt size locally by stripping whitespace, removing common filler words, and truncating to a length limit.

- **Max length**: Maximum character length of the compressed output. Text beyond this limit is replaced with `... [TRUNCATED]`. Defaults to `4000`.

For LLM-quality compression (summarization rather than truncation), pass a `LocalAgent` as the compressor parameter to `LangGraphOrchestrator` or `SequentialWorkflow`. `LocalAgent` is used automatically when `LOCAL_MODEL` is set and Ollama is reachable.
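The character-level strategy described above can be approximated in a few lines. The filler-word list and function name here are assumptions for illustration; the real tool's behavior may differ in detail:

```python
FILLERS = {"basically", "actually", "really", "very", "just"}  # assumed filler list


def compress_context(text: str, max_length: int = 4000) -> str:
    """Collapse whitespace, drop filler words, then truncate to max_length."""
    words = [w for w in text.split() if w.lower() not in FILLERS]
    compressed = " ".join(words)
    if len(compressed) > max_length:
        compressed = compressed[:max_length] + "... [TRUNCATED]"
    return compressed
```

Because this is pure string manipulation, it is fast and free of API calls, at the cost of occasionally clipping mid-sentence, which is exactly why the docs suggest a `LocalAgent` when summarization quality matters.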