LocalAgent is a self-contained agent designed for use with local Ollama models. Its sole responsibility is context compression: it takes verbose text and returns a condensed version that retains all critical facts while dramatically reducing token count. Unlike the other agents, it does not extend BaseAgent and is not exported from the agents package, so it must be imported directly.

Import

from agents.local_agent import LocalAgent
LocalAgent is intentionally excluded from agents/__init__.py. It is a specialised utility; import it from agents.local_agent to make the dependency explicit.

Constructor

LocalAgent(
    llm: BaseChatModel,
    system_prompt: str = "You are a specialized summarization agent. ...",
    agent_name: str = "LocalAgent"
)
llm : BaseChatModel (required)
    The LangChain chat model to use for summarization. Optimised for local models via ChatOllama, but any BaseChatModel implementation works.

system_prompt : str (default: "You are a specialized summarization agent. ...")
    System instruction controlling summarization behaviour. The default prompt instructs the model to be aggressive about compression while preserving every important detail. Override this for domain-specific compression styles.

agent_name : str (default: "LocalAgent")
    Display name used in log messages.
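
Internally, system_prompt becomes the system message of the prompt template, so an override changes what the model sees on every call. A minimal sketch of that composition, using plain tuples as stand-ins for LangChain message objects (the legal-domain prompt text below is an illustrative assumption, not part of the library):

```python
# Stand-in for the ChatPromptTemplate built from the constructor arguments:
# plain (role, content) tuples instead of LangChain message classes.
def build_messages(system_prompt: str, user_input: str) -> list[tuple[str, str]]:
    return [("system", system_prompt), ("human", user_input)]

# Default behaviour vs. a hypothetical domain-specific override.
default_msgs = build_messages(
    "You are a specialized summarization agent. ...",
    "Q3 revenue grew 12% year over year ...",
)
legal_msgs = build_messages(
    "Compress legal text; never drop party names, dates, or amounts.",
    "Q3 revenue grew 12% year over year ...",
)
print(default_msgs[0])
print(legal_msgs[0])
```

Only the system message differs between the two; the user text is passed through unchanged.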

Methods

invoke

invoke(user_input: str, **kwargs: Any) -> Any
Sends user_input through a ChatPromptTemplate | llm chain and returns the compressed text. The interface is intentionally identical to BaseAgent.invoke() so that LocalAgent can be used as a drop-in compressor wherever that signature is expected.
user_input : str (required)
    The text to be summarized or compressed.

**kwargs : Any
    Additional key-value pairs merged into the prompt template's format call.

Returns: str. The response.content string containing the compressed text.
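
The **kwargs merge can be pictured with a plain str.format stand-in. The template text and the style variable below are illustrative assumptions; the real chain is ChatPromptTemplate | llm:

```python
# Stand-in for invoke()'s prompt formatting: extra keyword arguments are
# merged into the same dictionary as user_input before the template is filled.
def format_prompt(system_prompt: str, user_input: str, **kwargs) -> str:
    variables = {"user_input": user_input, **kwargs}
    template = "{system}\n\n{user_input}\n\nStyle: {style}"  # hypothetical template
    return template.format(system=system_prompt, **variables)

msg = format_prompt(
    "You are a specialized summarization agent. ...",
    "The history of computing spans several decades ...",
    style="three bullet points",  # extra variable supplied via **kwargs
)
print(msg)
```

Any keyword accepted by the underlying template can be supplied this way; unknown keys would raise a KeyError-style formatting failure, so the template and the call must agree.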

Usage example

Standalone summarization

from langchain_ollama import ChatOllama
from agents.local_agent import LocalAgent

llm = ChatOllama(model="llama3")
compressor = LocalAgent(llm=llm)

long_text = """
The history of computing spans several decades and includes many pivotal
inventions: the transistor, the integrated circuit, the personal computer,
the internet, and now artificial intelligence. Each era built on the last,
accelerating the pace of change until today's models can reason, plan, and
generate human-quality text at scale.
"""

summary = compressor.invoke(long_text)
print(summary)
# → "Computing evolved from transistors → ICs → PCs → internet → AI,
#    each era accelerating innovation."

As a compressor inside an orchestrator

from langchain_ollama import ChatOllama
from agents import PlanningAgent, ExecutionAgent, MonitoringAgent
from agents.local_agent import LocalAgent
from orchestators import LangGraphOrchestrator

llm = ChatOllama(model="llama3")
compressor = LocalAgent(llm=llm)

# Pass LocalAgent as the context compressor instead of CompressContextTool
orchestrator = LangGraphOrchestrator(
    planner=PlanningAgent(llm=llm),
    executor=ExecutionAgent(llm=llm),
    monitor=MonitoringAgent(llm=llm),
    compressor=compressor,
    max_retries=2,
)

orchestrator.run("Summarise the Q3 financial report and highlight key risks.")
LocalAgent is particularly effective in long-running multi-step pipelines where intermediate results accumulate. Compress each step’s output before passing it as context to the next ExecutionAgent.execute_step() call to stay within the model’s context window.
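
The compress-between-steps pattern can be sketched with stdlib stand-ins. Here compress() is a crude truncation standing in for LocalAgent.invoke(), and execute_step() stands in for ExecutionAgent.execute_step(); the step names are illustrative:

```python
def compress(text: str, limit: int = 120) -> str:
    # Stand-in for LocalAgent.invoke(); the real agent compresses via an LLM.
    return text if len(text) <= limit else text[:limit] + " ..."

def execute_step(step: str, context: str) -> str:
    # Stand-in for ExecutionAgent.execute_step().
    return f"[{step}] completed with {len(context)} chars of context"

context = ""
for step in ("gather data", "analyse findings", "draft report"):
    result = execute_step(step, context)
    # Compress the accumulated context before the next step sees it,
    # keeping the running context bounded instead of growing per step.
    context = compress((context + "\n" + result).strip())

print(context)
```

Without the compress() call, context grows with every iteration; with it, each step receives a bounded summary, which is exactly the role LocalAgent plays inside the orchestrator.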
