Prerequisites

  • Python 3.12.12 (the project pins this version via .python-version)
  • uv or pip for package management
  • Ollama running locally (for local models), or an OpenAI-compatible API key (for cloud models)
Ollama must be running and have your chosen model pulled before the workflow can execute. Run ollama pull llama3 to download a model.

Setup

1. Clone the repository

git clone https://github.com/giffy/agentic-patterns.git
cd agentic-patterns
2. Install dependencies

Install all dependencies from pyproject.toml using uv:
uv pip install -r pyproject.toml
Or with plain pip:
pip install langchain==1.2.11 langchain-ollama==1.0.1 langchain-openai==1.1.11 "langgraph>=1.1.0" "openai>=2.26.0" python-dotenv
3. Configure environment variables

Copy the example environment file and fill in your values:
cp .env.example .env
Open .env and set the variables for your chosen LLM backend:
# Local models via Ollama
OLLAMA_HOST=http://localhost:11434
OLLAMA_MODEL=llama3

# Cloud LLM for summarization / monitoring
SUMMARY_HOST=https://api.routeway.ai/v1
SUMMARY_MODEL=nemotron-nano-9b-v2:free
SUMMARY_AGENT_API_KEY='your-api-key-here'

# Model name used by LocalAgent and ChatOllama
LOCAL_MODEL=llama3
To run fully locally, only LOCAL_MODEL and OLLAMA_HOST are required. The SUMMARY_* variables are only needed when using a cloud LLM.
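Before running anything, it can help to sanity-check which backend your environment actually selects. The sketch below is not part of the repository: `select_backend` is a hypothetical helper, and only the variable names are taken from the .env example above.

```python
import os

def select_backend(env=os.environ):
    """Return 'cloud' when the SUMMARY_* variables are set, else 'local'.

    Hypothetical helper; variable names follow the .env example above.
    """
    if env.get("SUMMARY_AGENT_API_KEY") and env.get("SUMMARY_HOST"):
        return "cloud"
    return "local"

if __name__ == "__main__":
    print(f"Selected backend: {select_backend()}")
    print(f"Local model: {os.environ.get('LOCAL_MODEL', 'llama3')}")
    print(f"Ollama host: {os.environ.get('OLLAMA_HOST', 'http://localhost:11434')}")
```

Running this before python main.py makes a misconfigured or missing variable obvious without waiting on an LLM call.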
4. Run your first workflow

Execute the bundled main.py to verify your setup:
python main.py
You should see output similar to:
=== Initializing Agentic Patterns Components ===
[*] Using LocalAgent with model 'llama3' for context compression.
[CACHE MISS] No previous answer found for: 'check again What is the capital of Andorra?'
Proceeding to execute through workflows...

--- Running Sequential Workflow ---
Workflow Status: success

Step: Step 1: ...
Output: ...

--- Running LangGraph Orchestrator ---
Orchestrator Final Status: success
Plan steps generated: 3

Write your own workflow

Here are complete, working examples based on the patterns in main.py:

Using SequentialWorkflow

import os
from dotenv import load_dotenv
from langchain_ollama import ChatOllama

from agents import PlanningAgent, ExecutionAgent, MonitoringAgent
from workflows import SequentialWorkflow
from tools.curl_search_tool import CurlSearchTool

load_dotenv()

# 1. Initialize the LLM
llm = ChatOllama(model=os.getenv("LOCAL_MODEL"))

# 2. Create a search tool for the executor
curl_tool = CurlSearchTool()

# 3. Instantiate agents
planner = PlanningAgent(llm=llm)
executor = ExecutionAgent(llm=llm, tools=[curl_tool])
monitor = MonitoringAgent(llm=llm)

agent_dict = {
    "planner": planner,
    "executor": executor,
    "monitor": monitor,
}

# 4. Run the workflow
sequential_workflow = SequentialWorkflow(agents=agent_dict, tools=[curl_tool])
result = sequential_workflow.run(task="What is the capital of Andorra?")

print(f"Status: {result['status']}")
for step_result in result.get("completed_results", []):
    print(f"\nStep: {step_result['step']}")
    print(f"Output: {step_result['result']}")

Using LangGraphOrchestrator

For a more robust pipeline with a LangGraph state machine and automatic retries:
import os
from dotenv import load_dotenv
from langchain_ollama import ChatOllama

from agents import PlanningAgent, ExecutionAgent, MonitoringAgent
from orchestators import LangGraphOrchestrator
from tools.compress_context_tool import CompressContextTool
from tools.curl_search_tool import CurlSearchTool

load_dotenv()

llm = ChatOllama(model=os.getenv("LOCAL_MODEL"))

planner = PlanningAgent(llm=llm)
executor = ExecutionAgent(llm=llm, tools=[CurlSearchTool()])
monitor = MonitoringAgent(llm=llm)
compressor = CompressContextTool(max_length=10000)

orchestrator = LangGraphOrchestrator(
    planner=planner,
    executor=executor,
    monitor=monitor,
    compressor=compressor,
    max_retries=2,
)

result = orchestrator.run(task="What is the capital of Andorra?")

print(f"Status: {result['status']}")
print(f"Plan steps generated: {len(result['plan'])}")

for step_result in result.get("results", []):
    if step_result.get("status") == "validated":
        print(f"\nStep: {step_result['step']}")
        print(f"Output: {step_result['result']}")

Using a cloud LLM

Swap ChatOllama for ChatOpenAI with a custom base_url to use any OpenAI-compatible endpoint:
import os
from dotenv import load_dotenv
from langchain_openai import ChatOpenAI

load_dotenv()

llm_cloud = ChatOpenAI(
    model=os.getenv("SUMMARY_MODEL"),
    temperature=0.3,
    api_key=os.getenv("SUMMARY_AGENT_API_KEY"),
    base_url=os.getenv("SUMMARY_HOST"),
)
Pass llm_cloud to any agent constructor in place of ChatOllama.

Using short-term memory with caching

from memory.short_term_memory import SQLiteShortTermMemory

memory = SQLiteShortTermMemory()  # defaults to 'short_term_memory.db'
task = "What is the capital of Andorra?"

# Check cache before calling any LLM
cached = memory.get_exact_match_answer(task)
if cached:
    print(f"[CACHE HIT] {cached}")
else:
    # ... run workflow ...
    # Save result to cache
    memory.add_memory(session_id="session_1", role="user", content=task)
    memory.add_memory(session_id="session_1", role="assistant", content="Andorra la Vella")
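If you are curious how an exact-match cache like this could work under the hood, here is a minimal, self-contained sketch using Python's built-in sqlite3 module. The schema and method names mirror the SQLiteShortTermMemory calls above, but they are assumptions; consult memory/short_term_memory.py for the real implementation.

```python
import sqlite3

class ExactMatchCache:
    """Minimal sketch of an exact-match answer cache (assumed schema)."""

    def __init__(self, db_path=":memory:"):
        self.conn = sqlite3.connect(db_path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS memories "
            "(session_id TEXT, role TEXT, content TEXT)"
        )

    def add_memory(self, session_id, role, content):
        self.conn.execute(
            "INSERT INTO memories VALUES (?, ?, ?)",
            (session_id, role, content),
        )
        self.conn.commit()

    def get_exact_match_answer(self, task):
        # Find a user turn whose content matches the task exactly, then
        # return the assistant reply recorded in the same session.
        row = self.conn.execute(
            "SELECT m2.content FROM memories m1 JOIN memories m2 "
            "ON m1.session_id = m2.session_id "
            "WHERE m1.role = 'user' AND m1.content = ? "
            "AND m2.role = 'assistant' LIMIT 1",
            (task,),
        ).fetchone()
        return row[0] if row else None

cache = ExactMatchCache()
cache.add_memory("session_1", "user", "What is the capital of Andorra?")
cache.add_memory("session_1", "assistant", "Andorra la Vella")
print(cache.get_exact_match_answer("What is the capital of Andorra?"))
# prints: Andorra la Vella
```

Exact matching is deliberately strict: a reworded question ("check again What is the capital of Andorra?") misses the cache, which is exactly the [CACHE MISS] behavior shown in the sample output earlier.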

Expected output structure

SequentialWorkflow.run() returns:
{
    "status": "success",           # or "failed"
    "completed_results": [
        {"step": "Step 1: ...", "result": "..."},
        {"step": "Step 2: ...", "result": "..."},
    ]
}
LangGraphOrchestrator.run() returns the full OrchestratorState:
{
    "status": "success",           # or "failed"
    "plan": ["Step 1: ...", "Step 2: ..."],
    "results": [
        {"step": "Step 1: ...", "result": "...", "status": "validated"},
    ],
    "task": "What is the capital of Andorra?",
    "context": "[Step 1 Result]: ...",
    # ... other state fields
}
The orchestrator writes results to result.txt in the working directory by default (see main.py). Remove or redirect this in your own code.
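Because the two entry points return differently shaped dictionaries, a small normalizing helper is handy when you want the final step output regardless of which runner produced it. This helper is illustrative only; the key names are taken from the structures shown above.

```python
def last_output(result):
    """Return the final step output from either result shape shown above.

    Illustrative helper; key names follow the documented structures.
    """
    # SequentialWorkflow uses "completed_results"; the orchestrator uses "results".
    steps = result.get("completed_results") or result.get("results") or []
    # Orchestrator entries carry a per-step status; keep only validated ones.
    validated = [s for s in steps if s.get("status", "validated") == "validated"]
    return validated[-1]["result"] if validated else None

seq_result = {
    "status": "success",
    "completed_results": [{"step": "Step 1: ...", "result": "Andorra la Vella"}],
}
orch_result = {
    "status": "success",
    "results": [
        {"step": "Step 1: ...", "result": "Andorra la Vella", "status": "validated"},
    ],
}
print(last_output(seq_result))   # prints: Andorra la Vella
print(last_output(orch_result))  # prints: Andorra la Vella
```

Steps without a status key are treated as validated, which matches the SequentialWorkflow shape where no per-step status is recorded.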
Next, see Configuration for all available options.