Understand how Hive models agent workflows as directed graphs with nodes, edges, and shared memory
Real business processes aren’t linear. A sales outreach might go: research a prospect, draft a message, realize the research is thin, go back and dig deeper, draft again, get human approval, send. There are loops, branches, fallbacks, and decision points.

Hive models this as a directed graph. Nodes do work, edges connect them, and shared memory lets them pass data. The framework walks this structure — running nodes, following edges, managing retries — until the agent reaches its goal or exhausts its step budget.
Edges can loop back, creating feedback cycles where an agent retries a step or takes a different path. That’s intentional. A graph that only moves forward can’t self-correct.
```
intake → research → draft → [human review] → send → done
           ↑                                   │
           └──────────── on failure ───────────┘
```
This structure lets agents adapt based on results, not just execute predefined steps.
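To make the walk concrete, here is a minimal, framework-independent sketch of the traversal loop: node functions that read and write shared memory, edges with conditions, and a step budget. All names here are illustrative stand-ins, not Hive's actual API.

```python
# Illustrative executor sketch -- not Hive's real API.
def run_graph(nodes, edges, start, memory, max_steps=20):
    """Walk the graph from `start`, following matching edges until no
    edge applies or the step budget runs out."""
    current, steps = start, 0
    while current is not None and steps < max_steps:
        steps += 1
        ok = nodes[current](memory)  # node reads/writes shared memory
        # Follow the first edge whose condition matches the outcome
        current = next(
            (e["target"] for e in edges
             if e["source"] == current
             and (e["condition"] == "always"
                  or (e["condition"] == "on_success" and ok)
                  or (e["condition"] == "on_failure" and not ok))),
            None,
        )
    return memory

# A two-node chain with a failure edge looping back to "research"
def research(m):
    m["attempts"] += 1
    m["findings"] = "thin" if m["attempts"] == 1 else "solid"
    return True

def draft(m):
    return m["findings"] == "solid"  # fails on thin research

nodes = {"research": research, "draft": draft}
edges = [
    {"source": "research", "target": "draft", "condition": "on_success"},
    {"source": "draft", "target": "research", "condition": "on_failure"},
]
result = run_graph(nodes, edges, "research", {"attempts": 0})
print(result["attempts"], result["findings"])  # → 2 solid
```

The failure edge is what makes the second research pass happen: the draft node rejects the thin findings, control loops back, and the retry succeeds.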
The only node type in Hive is event_loop. It’s a multi-turn LLM loop where the model reasons about the current state, calls tools, observes results, and keeps going until it has produced the required outputs.
```python
from framework.graph import NodeSpec

research_node = NodeSpec(
    id="research",
    name="Research",
    description="Search the web, fetch source content, and compile findings",
    node_type="event_loop",
    input_keys=["research_brief", "feedback"],
    output_keys=["findings", "sources", "gaps"],
    nullable_output_keys=["feedback"],
    success_criteria=(
        "Findings reference at least 3 distinct sources with URLs. "
        "Key claims are substantiated by fetched content, not generated."
    ),
    system_prompt="""You are a research agent. Given a research brief, find and analyze sources.

Work in phases:
1. **Search**: Use web_search with 3-5 diverse queries.
2. **Fetch**: Use web_scrape on the most promising URLs.
3. **Analyze**: Review what you've collected and identify key findings.

When done, use set_output:
- set_output("findings", "Structured summary with source URLs")
- set_output("sources", [{"url": "...", "title": "..."}])
- set_output("gaps", "What aspects are not well-covered yet")""",
    tools=["web_search", "web_scrape", "save_data"],
)
```
All agent behavior happens in these nodes. They handle long-running tasks, manage their own context window, and can recover from crashes mid-conversation.
The most important behavior in an event_loop node is the ability to self-correct. After each iteration, the node evaluates its own output: did it produce what was needed? If yes, it’s done. If not, it tries again — but this time it sees what went wrong and adjusts.

This is the reflexion pattern: try, evaluate, learn from the result, try again. It’s cheaper and more effective than starting over.
core/framework/graph/event_loop_node.py
```python
class JudgeVerdict:
    """Result of judge evaluation for the event loop."""

    action: Literal["ACCEPT", "RETRY", "ESCALATE"]
    feedback: str = ""
```
Within a single node, the outcomes are:
Accept — Output meets the bar. Move on.
Retry — Not good enough, but recoverable. Try again with feedback.
Escalate — Something is fundamentally broken. Hand off to error handling.
This is self-correction within a session — the agent adapting in real time. It’s different from evolution, which improves the agent across sessions by rewriting its code between generations.
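The retry loop itself is simple to picture. Below is a self-contained sketch of the try/evaluate/retry cycle, mirroring the `JudgeVerdict` shape above as a dataclass; `run_event_loop`, `produce`, and `judge` are invented names for illustration, not Hive's API.

```python
from dataclasses import dataclass
from typing import Literal

@dataclass
class JudgeVerdict:
    action: Literal["ACCEPT", "RETRY", "ESCALATE"]
    feedback: str = ""

def run_event_loop(produce, judge, max_iterations=5):
    """Try, evaluate, feed the critique back in, try again."""
    feedback = ""
    for _ in range(max_iterations):
        output = produce(feedback)      # attempt, seeded with prior critique
        verdict = judge(output)         # self-evaluation of the attempt
        if verdict.action == "ACCEPT":
            return output
        if verdict.action == "ESCALATE":
            raise RuntimeError(f"escalated: {verdict.feedback}")
        feedback = verdict.feedback     # RETRY: carry the lesson forward
    raise RuntimeError("iteration budget exhausted")

# Toy producer that only cites sources once the critique demands it
produce = lambda fb: "findings with URLs" if "URLs" in fb else "findings"
judge = lambda out: (JudgeVerdict("ACCEPT") if "URLs" in out
                     else JudgeVerdict("RETRY", "cite source URLs"))
print(run_event_loop(produce, judge))  # → findings with URLs
```

The key detail is that feedback from the judge flows into the next attempt, so the second try is informed rather than blind.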
Edges control flow between nodes. Each edge has a condition:
core/framework/graph/edge.py
```python
class EdgeCondition(StrEnum):
    """When an edge should be traversed."""

    ALWAYS = "always"            # Always after source completes
    ON_SUCCESS = "on_success"    # Only if source succeeds
    ON_FAILURE = "on_failure"    # Only if source fails
    CONDITIONAL = "conditional"  # Based on expression
    LLM_DECIDE = "llm_decide"    # Let LLM decide based on goal
```
When a node has multiple outgoing edges, the framework can run those branches in parallel and reconverge when they’re all done:
```python
edges = [
    EdgeSpec(id="search-to-linkedin", source="search", target="linkedin_research"),
    EdgeSpec(id="search-to-twitter", source="search", target="twitter_research"),
    EdgeSpec(id="search-to-news", source="search", target="news_research"),
    # All three branches run in parallel, results merge back to shared memory
]
```
This is useful for tasks like researching a prospect from multiple sources simultaneously.
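A rough sketch of how such a fan-out can merge results back into shared memory, using `asyncio.gather` as a stand-in for the framework's scheduler (the branch functions are invented for illustration):

```python
import asyncio

# Illustrative fan-out; branch names mirror the edges above.
async def linkedin_research(memory):
    memory["linkedin"] = "profile summary"

async def twitter_research(memory):
    memory["twitter"] = "recent posts"

async def news_research(memory):
    memory["news"] = "press mentions"

async def fan_out(memory):
    # All three branches run concurrently over the same shared memory,
    # each writing only its own declared output key.
    await asyncio.gather(
        linkedin_research(memory),
        twitter_research(memory),
        news_research(memory),
    )
    return memory

memory = asyncio.run(fan_out({}))
print(sorted(memory))  # → ['linkedin', 'news', 'twitter']
```

Because each branch writes a disjoint key, the merge is trivial; the declared output keys are what make that safe.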
Shared memory is how nodes communicate. It’s a key-value store scoped to a single session.
core/framework/graph/node.py
```python
class SharedMemory:
    """Shared memory for passing data between nodes."""

    def get(self, key: str) -> Any: ...
    def set(self, key: str, value: Any) -> None: ...
    def has(self, key: str) -> bool: ...
```
Every node declares which keys it reads and which it writes, and the framework enforces those boundaries — a node can’t quietly access data it hasn’t declared.
```python
NodeSpec(
    id="research",
    input_keys=["research_brief", "feedback"],   # What it reads
    output_keys=["findings", "sources"],         # What it writes
    nullable_output_keys=["feedback"],           # Optional outputs
)
```
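One way such enforcement can work is to hand each node a scoped view over the underlying store that rejects undeclared keys. A minimal sketch, assuming a plain dict as the backing store; `ScopedMemory` is an illustrative name, not Hive's actual class:

```python
# Sketch: a memory view that enforces a node's declared keys.
class ScopedMemory:
    def __init__(self, store, input_keys, output_keys):
        self._store = store
        self._inputs = set(input_keys)
        self._outputs = set(output_keys)

    def get(self, key):
        if key not in self._inputs:
            raise KeyError(f"'{key}' is not a declared input")
        return self._store.get(key)

    def set(self, key, value):
        if key not in self._outputs:
            raise KeyError(f"'{key}' is not a declared output")
        self._store[key] = value

store = {"research_brief": "Acme Corp"}
view = ScopedMemory(store, input_keys=["research_brief"],
                    output_keys=["findings"])
view.set("findings", "summary of findings")
print(view.get("research_brief"))

try:
    view.get("findings")   # reads an undeclared input: rejected
except KeyError as e:
    print(e)
```

The node sees only what it declared; everything else is a hard error rather than a silent dependency.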
Data flows through the graph in a natural way:
Input arrives at the start
Each node reads what it needs and writes what it produces
Edges map outputs to inputs as data moves between nodes
At the end, the full memory state is the execution result
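The third step, edges mapping outputs to inputs, can be pictured as a small key-renaming pass over shared memory. This is a sketch under the assumption that an edge may carry such a mapping; `apply_edge_mapping` is an invented helper, not Hive's API.

```python
# Sketch: an edge carrying an output-to-input key mapping.
def apply_edge_mapping(memory, mapping):
    """Copy the source node's outputs under the target node's input keys."""
    for src_key, dst_key in mapping.items():
        memory[dst_key] = memory[src_key]
    return memory

memory = {"findings": "summary", "gaps": "pricing unclear"}
# e.g. the draft node expects the researcher's gaps as `feedback`
apply_edge_mapping(memory, {"gaps": "feedback"})
print(memory["feedback"])  # → pricing unclear
```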
A typical agent graph follows the shape of the pipeline sketched earlier. Key patterns:
Entry node where work begins
Chain of nodes that do the real work
HITL nodes at approval gates
Failure edges that loop back for another attempt
Terminal nodes where execution ends
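The pattern list above can be written down as edge declarations. This sketch uses a stand-in `EdgeSpec` dataclass (the real one lives in the framework) and condition strings matching the `EdgeCondition` values shown earlier:

```python
from dataclasses import dataclass

@dataclass
class EdgeSpec:  # stand-in for the framework's EdgeSpec
    id: str
    source: str
    target: str
    condition: str = "always"

edges = [
    EdgeSpec(id="intake-to-research", source="intake", target="research"),
    EdgeSpec(id="research-to-draft", source="research", target="draft",
             condition="on_success"),
    EdgeSpec(id="draft-to-review", source="draft", target="human_review"),
    EdgeSpec(id="review-to-send", source="human_review", target="send",
             condition="on_success"),
    EdgeSpec(id="send-to-done", source="send", target="done"),
    # Failure edge looping back for another attempt
    EdgeSpec(id="send-retry", source="send", target="research",
             condition="on_failure"),
]

# Entry nodes are never a target; terminal nodes are never a source.
entries = {e.source for e in edges} - {e.target for e in edges}
terminals = {e.target for e in edges} - {e.source for e in edges}
print(entries, terminals)  # → {'intake'} {'done'}
```

Note how the entry and terminal nodes fall out of the edge list itself: no extra declaration is needed to identify where work begins and ends.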
The framework tracks which nodes ran, how many retries each needed, how much the LLM calls cost, and how long each step took. This data is critical for both monitoring and evolution.
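A per-step telemetry record might look something like the following. The `StepRecord` shape and the numbers are invented for illustration, not Hive's actual schema:

```python
from dataclasses import dataclass

@dataclass
class StepRecord:  # hypothetical telemetry shape, not Hive's schema
    node_id: str
    retries: int
    cost_usd: float
    duration_s: float

trace = [
    StepRecord("research", retries=1, cost_usd=0.04, duration_s=12.5),
    StepRecord("draft", retries=0, cost_usd=0.02, duration_s=6.1),
    StepRecord("send", retries=0, cost_usd=0.01, duration_s=1.2),
]

total_cost = sum(s.cost_usd for s in trace)
total_retries = sum(s.retries for s in trace)
print(f"${total_cost:.2f} across {len(trace)} steps, {total_retries} retry")
```

Aggregates like these feed monitoring dashboards in the short term and, across sessions, give evolution a signal about which nodes are expensive or flaky.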