What is a harness?
A harness is an async generator function that yields events during LLM invocations. It’s the core abstraction in LLM Gateway: everything from a single API call to a multi-agent orchestration is built by composing harnesses.
- invoke() takes messages, tools, and configuration, and returns an async iterable of events
- supportedModels() returns the list of model IDs the harness can handle
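To make the two-method shape concrete, here is a minimal sketch of what a harness could look like in TypeScript. The type names (HarnessEvent, InvokeParams, Harness) and the event fields are assumptions made for illustration, not the library's actual definitions; only invoke() and supportedModels() come from the description above.

```typescript
// Hypothetical event and harness shapes; the real LLM Gateway types may differ.
type HarnessEvent =
  | { type: "text"; content: string }
  | { type: "usage"; inputTokens: number; outputTokens: number };

interface Message {
  role: "system" | "user" | "assistant" | "tool";
  content: string;
}

interface InvokeParams {
  model: string;
  messages: Message[];
}

interface Harness {
  invoke(params: InvokeParams): AsyncIterable<HarnessEvent>;
  supportedModels(): string[];
}

// A toy harness that echoes the last message word by word, then reports usage.
const echoHarness: Harness = {
  async *invoke({ messages }: InvokeParams): AsyncGenerator<HarnessEvent> {
    const last = messages[messages.length - 1];
    const words = last ? last.content.split(" ") : [];
    for (const word of words) {
      yield { type: "text", content: word };
    }
    yield { type: "usage", inputTokens: words.length, outputTokens: words.length };
  },
  supportedModels(): string[] {
    return ["echo-1"];
  },
};
```

A consumer would read events with a for await loop over echoHarness.invoke(...), handling each event as it arrives.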
Two types of harnesses
Provider harnesses
Provider harnesses make single LLM API calls and stream the results. They handle one request-response cycle and yield events like text, reasoning, tool_call, and usage.
Zen
OpenAI-compatible API with support for reasoning content. Default provider.
Anthropic
Claude models via the Anthropic Messages API.
OpenAI
GPT models via the OpenAI Chat Completions API.
OpenRouter
Access 100+ models through the OpenRouter aggregator.
Agent harness
The agent harness wraps a provider harness to add agentic behavior: tool execution, permission checking, and an iterative loop that continues until the model has no more tool calls.
- Agentic loop: Continues calling the LLM until no tool calls remain or maxIterations is reached
- Permission handling: Checks allowlists, yields relay events, waits for approval
- Tool execution: Executes approved tools with proper context and error handling
- Message history: Builds up the conversation with assistant responses and tool results
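The loop described in the list above can be sketched roughly as follows. This is a simplified illustration, not the agent harness itself: the types, the agentLoop function, and the tool_result event are all hypothetical, and permission handling is left out to keep the sketch short.

```typescript
// All names below are illustrative assumptions, not the LLM Gateway API.
type Message = { role: "system" | "user" | "assistant" | "tool"; content: string };
type ProviderEvent =
  | { type: "text"; content: string }
  | { type: "tool_call"; name: string; input: unknown };
type Provider = (messages: Message[]) => AsyncIterable<ProviderEvent>;
type Tool = { name: string; execute: (input: unknown) => Promise<string> };
type AgentEvent = ProviderEvent | { type: "tool_result"; name: string; output: string };

async function* agentLoop(
  provider: Provider,
  tools: Tool[],
  messages: Message[],
  maxIterations = 10,
): AsyncGenerator<AgentEvent> {
  const history = [...messages];
  for (let i = 0; i < maxIterations; i++) {
    let text = "";
    const calls: { name: string; input: unknown }[] = [];
    // One provider call per iteration; forward its events as they arrive.
    for await (const event of provider(history)) {
      yield event;
      if (event.type === "text") text += event.content;
      else calls.push({ name: event.name, input: event.input });
    }
    history.push({ role: "assistant", content: text });
    if (calls.length === 0) return; // no tool calls left: the run is done
    for (const call of calls) {
      const tool = tools.find((t) => t.name === call.name);
      let output: string;
      try {
        if (!tool) throw new Error(`Unknown tool: ${call.name}`);
        output = await tool.execute(call.input);
      } catch (err) {
        // Report failures back to the model instead of aborting the run.
        output = `Error: ${err instanceof Error ? err.message : String(err)}`;
      }
      history.push({ role: "tool", content: output });
      yield { type: "tool_result", name: call.name, output };
    }
  }
}
```

Because the loop is itself an async generator, callers see provider events and tool results interleaved in the order they happen, and the maxIterations bound keeps a model that never stops calling tools from looping forever.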
Event flow
A typical agentic conversation produces a sequence of events bracketed by lifecycle events.
Lifecycle events
Agent harnesses emit lifecycle events to mark the boundaries of their execution:
- Start event: Marks the beginning of an agent run. Contains runId and optional depth and maxIterations for nested invocations.
- End event: Marks the end of an agent run. Contains runId, optional reason (“final” or “max_iterations”), and usage totals for the RLM harness.
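One way to use these events is to track which runs are currently active. The sketch below assumes placeholder event type names (run_start and run_end) because the actual names are not given here; the runId, depth, maxIterations, and reason fields follow the description above, and RunTracker is a hypothetical helper.

```typescript
// Event names "run_start" / "run_end" are placeholders chosen for this sketch.
type LifecycleEvent =
  | { type: "run_start"; runId: string; depth?: number; maxIterations?: number }
  | { type: "run_end"; runId: string; reason?: "final" | "max_iterations" };

class RunTracker {
  private open = new Map<string, number>(); // runId -> nesting depth
  readonly finished: { runId: string; reason: string | undefined }[] = [];

  handle(event: LifecycleEvent): void {
    if (event.type === "run_start") {
      this.open.set(event.runId, event.depth ?? 0);
    } else {
      this.open.delete(event.runId);
      this.finished.push({ runId: event.runId, reason: event.reason });
    }
  }

  // IDs of runs that have started but not yet ended.
  activeRuns(): string[] {
    return Array.from(this.open.keys());
  }
}
```

Feeding every lifecycle event from a stream into handle() gives a live view of nested runs, which is useful when subagents start and finish inside a parent run.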
Harness parameters
The invoke() method accepts these parameters:
- Model ID to use for LLM calls (e.g., “glm-4.7”, “claude-3.5-sonnet”).
- Conversation history. Each message has role (“system”, “user”, “assistant”, “tool”) and content.
- Tools available to the agent. Each tool has a name, description, Zod schema, and optional execute() function.
- Permission rules. Contains allowlist, allowOnce, and deny arrays for controlling tool access.
- Environment context:
  - parentId: Links this run to a parent (for subagents)
  - spawn: Function to spawn subagents
  - fileTime: File timestamp tracking utility
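As an illustration of how the permission rules might be evaluated, the sketch below treats the three arrays as lists of tool names. That shape, the precedence order (deny first, then allowlist, then allowOnce), and the checkPermission helper are assumptions for this example; the real rule format and evaluation order may differ.

```typescript
// Assumed shape: arrays of tool names. The real rule format may be richer.
interface Permissions {
  allowlist: string[];
  allowOnce: string[];
  deny: string[];
}

type Decision = "allow" | "deny" | "ask";

function checkPermission(toolName: string, perms: Permissions): Decision {
  if (perms.deny.includes(toolName)) return "deny"; // deny takes precedence in this sketch
  if (perms.allowlist.includes(toolName)) return "allow";
  const onceIndex = perms.allowOnce.indexOf(toolName);
  if (onceIndex !== -1) {
    perms.allowOnce.splice(onceIndex, 1); // a one-time grant is used up
    return "allow";
  }
  return "ask"; // no rule matched: the caller would request approval
}
```

An "ask" result corresponds to the point where an agent would pause and wait for approval before executing the tool.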
Creating custom harnesses
You can create custom harnesses for specialized behavior. See Composition to learn how to layer harness behavior, and Custom Harnesses for detailed implementation patterns.
Key characteristics
Composable
Harnesses wrap other harnesses. The agent harness wraps a provider harness. You can add retries, logging, rate limiting, or caching by wrapping harnesses around each other.
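Wrapping can be sketched as a function that takes a harness and returns a new one with the same shape. The example below adds retries; the Harness and GatewayEvent types and the withRetry name are hypothetical, and the retry policy (retry only when the failure happens before any event was emitted) is one possible design choice rather than the library's behavior.

```typescript
// Hypothetical minimal types for this sketch.
type GatewayEvent = { type: string; [key: string]: unknown };

interface Harness<P> {
  invoke(params: P): AsyncIterable<GatewayEvent>;
  supportedModels(): string[];
}

function withRetry<P>(inner: Harness<P>, maxAttempts = 3): Harness<P> {
  return {
    async *invoke(params: P): AsyncGenerator<GatewayEvent> {
      for (let attempt = 1; ; attempt++) {
        let yielded = false;
        try {
          for await (const event of inner.invoke(params)) {
            yielded = true;
            yield event;
          }
          return; // inner harness finished normally
        } catch (err) {
          // Retry only if nothing reached the consumer yet, so output is never duplicated.
          if (yielded || attempt >= maxAttempts) throw err;
        }
      }
    },
    supportedModels: () => inner.supportedModels(),
  };
}
```

Since the wrapper returns the same interface it accepts, wrappers stack: a logging wrapper around a retry wrapper around a provider is still just a harness.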
Streaming by default
Events arrive as they’re produced. Text streams token-by-token. Tool calls stream as the model emits them. No buffering unless you choose to collect events.
Type-safe
Events are a discriminated union type. Tools use Zod schemas for runtime validation. TypeScript provides autocomplete for event fields.
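A discriminated union lets the compiler narrow each event by its type field. In the sketch below, the four event type names come from the provider harness description earlier on this page, while the field names (content, name, input, inputTokens, outputTokens) and the formatEvent helper are assumptions for illustration.

```typescript
// Field names are assumptions; only the event type names come from the text above.
type StreamEvent =
  | { type: "text"; content: string }
  | { type: "reasoning"; content: string }
  | { type: "tool_call"; name: string; input: unknown }
  | { type: "usage"; inputTokens: number; outputTokens: number };

function formatEvent(event: StreamEvent): string {
  switch (event.type) {
    case "text":
      return `text: ${event.content}`;
    case "reasoning":
      return `reasoning: ${event.content}`;
    case "tool_call":
      return `tool_call: ${event.name}(${JSON.stringify(event.input)})`;
    case "usage":
      return `usage: ${event.inputTokens} in, ${event.outputTokens} out`;
    default: {
      // If a new event type is added to the union, this assignment stops compiling.
      const unreachable: never = event;
      return unreachable;
    }
  }
}
```

Inside each case, TypeScript knows exactly which fields exist, which is where the autocomplete for event fields comes from.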
Provider-agnostic
The same agent code works with any provider. Swap Zen for Anthropic or OpenAI by changing one line. Model-specific features (like reasoning) surface as events when available.
Next steps
Events
Learn about the event types that flow through harnesses
Composition
Understand how to layer harness behavior
Agent API
Full API reference for the agent harness
Custom Provider
Build a custom provider harness
