An agent that evaluates whether an execution step’s output meets its original objective.
MonitoringAgent extends BaseAgent and acts as a judge in a plan-execute-monitor loop. It compares the original objective of a step against the actual output produced by an ExecutionAgent and returns a structured verdict.
Override the default system instruction. The built-in prompt reads:
You are a strict monitoring and evaluation agent. Your task is to compare the Original Objective of a step with the Actual Output produced by an execution agent. Determine if the objective was successfully met.
Return ONLY a valid JSON object with two keys:
1. 'success' (boolean: true or false)
2. 'feedback' (string: explanation of why it succeeded or failed, and how to fix if failed).
Example: {"success": true, "feedback": "Objective met completely."}
Builds a structured evaluation prompt from objective and result, calls invoke(), parses the JSON response, and returns a normalised verdict dictionary.
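The structured evaluation prompt might be assembled along these lines. The exact internal template is not documented here, so the wording and the helper name `build_evaluation_prompt` are illustrative assumptions:

```python
def build_evaluation_prompt(objective: str, result: str) -> str:
    # Illustrative layout only; MonitoringAgent's real template may differ.
    # It pairs the step's original objective with the execution output and
    # asks for the JSON verdict described in the system instruction.
    return (
        "Original Objective:\n"
        f"{objective}\n\n"
        "Actual Output:\n"
        f"{result}\n\n"
        "Did the output meet the objective? Return ONLY the JSON verdict."
    )
```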
Human-readable explanation of the verdict. On success this describes what was done correctly; on failure it explains what is missing and how to fix it.
Expected LLM response format:
{"success": true, "feedback": "Objective met completely."}
If the model returns malformed JSON, the failure is logged at WARNING level and the method returns {"success": False, "feedback": "Failed to parse monitoring response: <raw_response>"}. This ensures callers always receive a well-typed dictionary rather than an exception.
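The never-raise contract can be sketched as follows. `parse_verdict` is a hypothetical helper name, and the real method also emits the WARNING log entry mentioned above:

```python
import json

def parse_verdict(raw: str) -> dict:
    """Parse an LLM monitoring response into a well-typed verdict dict.

    Sketch of the documented fallback: malformed or incomplete JSON is
    converted into a failure verdict instead of raising.
    """
    try:
        data = json.loads(raw)
        return {"success": bool(data["success"]),
                "feedback": str(data["feedback"])}
    except (json.JSONDecodeError, KeyError, TypeError):
        # Missing keys, non-object JSON, or invalid syntax all fall here.
        return {"success": False,
                "feedback": f"Failed to parse monitoring response: {raw}"}
```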
from langchain_ollama import ChatOllama
from agents import PlanningAgent, ExecutionAgent, MonitoringAgent

llm = ChatOllama(model="llama3")
planner = PlanningAgent(llm=llm)
executor = ExecutionAgent(llm=llm)
monitor = MonitoringAgent(llm=llm)

# Generate a plan
steps = planner.generate_plan("Write a haiku about the ocean")

# Execute and monitor each step
for step in steps:
    output = executor.execute_step(step_description=step)
    verdict = monitor.evaluate(objective=step, result=output)

    print(f"Step    : {step}")
    print(f"Output  : {output}")
    print(f"Success : {verdict['success']}")
    print(f"Feedback: {verdict['feedback']}")
    print()

    if not verdict["success"]:
        # Optionally retry the step or log for human review
        print("[WARN] Step did not meet objective — consider retrying.")
Use verdict["success"] as a gate: only advance to the next step (or mark the overall task complete) when the monitor returns True.
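One common way to implement that gate is a bounded retry loop that feeds the monitor's feedback back into the next attempt. This is a sketch, not part of the documented API: the function name, the retry budget, and the feedback-in-prompt refinement are all assumptions; `executor` and `monitor` follow the ExecutionAgent / MonitoringAgent interfaces used above.

```python
MAX_RETRIES = 2  # hypothetical retry budget

def run_step_with_monitoring(executor, monitor, step: str) -> dict:
    """Only accept a step once the monitor returns success, retrying
    a bounded number of times with the previous feedback appended."""
    feedback = ""
    for attempt in range(MAX_RETRIES + 1):
        # On retries, surface the monitor's feedback to the executor.
        prompt = step if not feedback else (
            f"{step}\n\nPrevious attempt failed: {feedback}"
        )
        output = executor.execute_step(step_description=prompt)
        verdict = monitor.evaluate(objective=step, result=output)
        if verdict["success"]:
            return {"output": output, "verdict": verdict,
                    "attempts": attempt + 1}
        feedback = verdict["feedback"]
    # Retry budget exhausted; return the last (failed) verdict.
    return {"output": output, "verdict": verdict,
            "attempts": MAX_RETRIES + 1}
```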