## Overview

The LLM Gateway server is highly configurable through the `createApp` factory function. This guide covers advanced configuration options for harnesses, tools, models, and skills.
## Application Configuration

The `createApp` function accepts an optional `AppConfig` object:

```ts
interface AppConfig {
  harness?: GeneratorHarnessModule;
  providerHarness?: GeneratorHarnessModule;
  tools?: ToolDefinition[];
  defaultModel?: string;
  skillDirs?: string[];
}
```
## Harness Configuration

### Provider Harnesses

Provider harnesses handle communication with AI providers. The gateway includes built-in support for multiple providers:
#### Zen (Default)

```ts
import { createAgentHarness } from "./packages/ai/harness/agent";
import { createGeneratorHarness } from "./packages/ai/harness/providers/zen";

const app = await createApp({
  harness: createAgentHarness({
    harness: createGeneratorHarness(),
  }),
});
```

Requires `ZEN_API_KEY` in the environment.

#### Anthropic

```ts
import { createAgentHarness } from "./packages/ai/harness/agent";
import { createAnthropicHarness } from "./packages/ai/harness/providers/anthropic";

const app = await createApp({
  harness: createAgentHarness({
    harness: createAnthropicHarness(),
  }),
});
```

Requires `ANTHROPIC_API_KEY` in the environment.

#### OpenAI

```ts
import { createAgentHarness } from "./packages/ai/harness/agent";
import { createOpenAIHarness } from "./packages/ai/harness/providers/openai";

const app = await createApp({
  harness: createAgentHarness({
    harness: createOpenAIHarness(),
  }),
});
```

Requires `OPENAI_API_KEY` in the environment.

#### OpenRouter

```ts
import { createAgentHarness } from "./packages/ai/harness/agent";
import { createOpenRouterHarness } from "./packages/ai/harness/providers/openrouter";

const app = await createApp({
  harness: createAgentHarness({
    harness: createOpenRouterHarness(),
  }),
});
```

Requires `OPENROUTER_API_KEY` in the environment.
### Agent Harness

The agent harness wraps a provider harness with tool-calling capabilities:

```ts
import { createAgentHarness } from "./packages/ai/harness/agent";
import { createGeneratorHarness } from "./packages/ai/harness/providers/zen";

const agent = createAgentHarness({
  harness: createGeneratorHarness(),
  // Optional: additional agent configuration
});

const app = await createApp({ harness: agent });
```
The agent harness handles tool execution, permissions, and the agentic loop automatically.
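Conceptually, the agentic loop alternates between generation and tool execution until the model stops requesting tools. A minimal sketch, where `generate` and `runTool` are hypothetical stand-ins for the provider harness and the tool executor (not actual gateway APIs):

```typescript
// Sketch of an agentic loop. `generate` and `runTool` are hypothetical
// stand-ins for the provider harness and tool executor internals.
type ToolCall = { tool: string; args: Record<string, unknown> };
type Turn = { text: string; toolCalls: ToolCall[] };

async function agentLoop(
  prompt: string,
  generate: (history: string[]) => Promise<Turn>,
  runTool: (call: ToolCall) => Promise<string>,
  maxTurns = 8,
): Promise<string> {
  const history = [prompt];
  for (let i = 0; i < maxTurns; i++) {
    const turn = await generate(history);
    // No tool calls means the model has produced its final answer
    if (turn.toolCalls.length === 0) return turn.text;
    for (const call of turn.toolCalls) {
      // The real harness checks permissions before executing each call
      const output = await runTool(call);
      history.push(`tool ${call.tool}: ${output}`);
    }
  }
  return "max turns reached";
}
```

The real implementation also threads permission checks and streaming through this loop; the sketch only shows the control flow.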
### Recursive Language Model (RLM)

For processing long inputs, use the RLM harness:

```ts
import { createRlmHarness } from "./packages/ai/rlm/harness";
import { createGeneratorHarness } from "./packages/ai/harness/providers/zen";

const rlm = createRlmHarness({
  rootHarness: createGeneratorHarness(),
  subHarness: createGeneratorHarness(), // Can use a cheaper model
  config: {
    maxIterations: 10,
    maxStdoutLength: 4000,
    metadataPrefixLength: 200,
    maxDepth: 2,
  },
});

const app = await createApp({ harness: rlm });
```
Clients must specify `mode: "rlm"` in chat requests to use RLM mode.
## Tools

The server includes four default tools:

- `bash`: Execute shell commands with permission controls
- `agent`: Spawn subagents for delegated tasks
- `read`: Read file contents from the filesystem
- `patch`: Apply code patches to files
Define custom tools by implementing the `ToolDefinition` interface:

```ts
import type { ToolDefinition } from "./packages/ai/types";

const myCustomTool: ToolDefinition = {
  name: "my_tool",
  description: "Does something useful",
  schema: {
    type: "object",
    properties: {
      input: { type: "string", description: "Input parameter" },
    },
    required: ["input"],
  },
  execute: async (params) => {
    // Tool implementation
    return { success: true, output: "result" };
  },
  derivePermission: (params) => {
    // Return a permission pattern for "always allow"
    return { tool: "my_tool", args: { input: params.input } };
  },
};

const app = await createApp({
  tools: [myCustomTool, bashTool, agentTool, readTool, patchTool],
});
```
Always implement `derivePermission` to enable "always allow" functionality for your tool.
Clients can control tool execution through permission configurations:

```ts
// Client request example
await fetch("/chat", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    model: "glm-4.7",
    messages: [/* ... */],
    permissions: {
      allowlist: [
        { tool: "bash", args: { command: "ls *" } },
        { tool: "read" },
      ],
      deny: [
        { tool: "bash", args: { command: "rm *" } },
      ],
    },
  }),
});
```
Permission patterns use glob matching (via picomatch):

- `allowlist`: Auto-approve matching tool calls
- `allowOnce`: One-time approval, consumed on use
- `deny`: Immediately reject matching tool calls
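To make the matching semantics concrete, the sketch below hand-rolls a tiny glob matcher: a pattern matches a tool call when the tool names agree and every `args` entry matches its glob. The real server uses picomatch, which supports full glob syntax; the `globToRegExp` helper here is purely illustrative.

```typescript
// Illustrative permission matching. The actual server uses picomatch;
// this minimal glob-to-RegExp conversion only demonstrates the idea.
type Pattern = { tool: string; args?: Record<string, string> };
type Call = { tool: string; args: Record<string, string> };

function globToRegExp(glob: string): RegExp {
  // Escape regex metacharacters except "*", which becomes ".*"
  const escaped = glob.replace(/[.+?^${}()|[\]\\]/g, "\\$&");
  return new RegExp("^" + escaped.replace(/\*/g, ".*") + "$");
}

function matches(pattern: Pattern, call: Call): boolean {
  if (pattern.tool !== call.tool) return false;
  // A pattern without args (e.g. { tool: "read" }) matches any invocation
  for (const [key, glob] of Object.entries(pattern.args ?? {})) {
    if (!globToRegExp(glob).test(call.args[key] ?? "")) return false;
  }
  return true;
}
```

With the earlier example, `{ tool: "bash", args: { command: "ls *" } }` approves `ls -la` while the `rm *` deny pattern does not match it.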
## Model Configuration

### Default Model

Specify a default model for requests that don't include a model:

```ts
const app = await createApp({
  defaultModel: process.env.DEFAULT_MODEL || "glm-4.7",
});
```
The `/models` endpoint returns the default model if it's supported by the configured provider:

```json
{
  "models": ["glm-4.7", "kimi-k2.5", "..."],
  "defaultModel": "glm-4.7"
}
```
### Model Validation

The server validates models against the provider's supported models:

```ts
const models = await harness.supportedModels();
const validDefault =
  defaultModel && models.includes(defaultModel) ? defaultModel : undefined;
```
## Skills Configuration

Skills extend agent capabilities with specialized instructions and workflows:

```ts
const app = await createApp({
  skillDirs: [
    "./skills",
    "/etc/llm-gateway/skills",
    process.env.CUSTOM_SKILLS_DIR,
  ].filter(Boolean),
});
```
### Skills Discovery

The server automatically discovers skills in configured directories:

1. Searches each `skillDir` for valid skill definitions
2. Formats the discovered skills into a system prompt
3. Prepends the skills prompt to agent messages
Skills are discovered at server startup. Restart the server to load new skills.
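The exact skill file format is not covered here, so the sketch below assumes a hypothetical skill shape (`name` plus `instructions`) and illustrates only the formatting step, where discovered skills are folded into one system-prompt section:

```typescript
// Hypothetical skill shape; the real loader reads these from skillDirs.
type Skill = { name: string; instructions: string };

// Formats discovered skills into a single system-prompt section that can
// be prepended to agent messages (layout is illustrative, not the
// server's actual prompt format).
function formatSkillsPrompt(skills: Skill[]): string {
  if (skills.length === 0) return "";
  const sections = skills.map(
    (s) => `## Skill: ${s.name}\n${s.instructions.trim()}`,
  );
  return ["You have the following skills available:", ...sections].join("\n\n");
}
```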
## Request Configuration

Clients can configure individual chat requests:

### Standard Mode

```ts
interface ChatRequest {
  model: string;             // Required: model identifier
  messages: Message[];       // Required: conversation history
  context?: string;          // Optional: additional context
  permissions?: Permissions; // Optional: tool permissions
  mode?: "agent";            // Optional: defaults to agent
}
```
### RLM Mode

```ts
interface ChatRequest {
  model: string;
  messages: Message[];
  context?: string;       // Long-form content to process
  mode: "rlm";            // Required for RLM
  maxIterations?: number; // Default: 10
  maxDepth?: number;      // Default: 2
}
```
An example standard-mode request:

```json
{
  "model": "glm-4.7",
  "messages": [
    { "role": "user", "content": "List files in current directory" }
  ],
  "permissions": {
    "allowlist": [{ "tool": "bash" }]
  }
}
```

And an example RLM request:

```json
{
  "model": "kimi-k2.5",
  "messages": [
    { "role": "user", "content": "Summarize this document" }
  ],
  "context": "<very long document content>",
  "mode": "rlm",
  "maxIterations": 15,
  "maxDepth": 3
}
```
## Server Options

Bun server options can be configured in the default export:

```ts
export default {
  port: Number(process.env.PORT) || 4000,
  fetch: app.fetch,
  idleTimeout: 255, // Seconds before idle connections close
  // Additional Bun server options:
  // maxRequestBodySize: 1024 * 1024 * 10, // 10 MB
  // development: process.env.NODE_ENV !== "production",
};
```
## Multi-Provider Setup

Support multiple providers by creating separate harness configurations:

```ts
import { createAgentHarness } from "./packages/ai/harness/agent";
import { createGeneratorHarness as createZen } from "./packages/ai/harness/providers/zen";
import { createAnthropicHarness } from "./packages/ai/harness/providers/anthropic";

// Route requests based on model prefix or header
const getHarness = (model: string) => {
  if (model.startsWith("claude-")) {
    return createAgentHarness({ harness: createAnthropicHarness() });
  }
  return createAgentHarness({ harness: createZen() });
};

// Note: this requires modifying the server to support dynamic harness selection
```
The default server uses a single harness. Multi-provider support requires custom server modifications.
## Connection Limits

Manage concurrent connections based on your infrastructure:

```ts
// Connection-limiting middleware
const connectionLimit = 100;
let activeConnections = 0;

app.use(async (c, next) => {
  if (activeConnections >= connectionLimit) {
    return c.json({ error: "Server at capacity" }, 503);
  }
  activeConnections++;
  try {
    await next();
  } finally {
    activeConnections--;
  }
});
```
## Memory Management

Orchestrators are automatically cleaned up, but monitor memory for long-running sessions:

```sh
# Monitor memory usage
bun --inspect server/index.ts

# Set Node.js memory limits if needed
NODE_OPTIONS="--max-old-space-size=4096" bun run server/index.ts
```
## Configuration Best Practices

- **Environment-based configuration**: Use environment variables for deployment-specific settings
- **Sensible defaults**: Provide reasonable defaults for optional configuration
- **Validation**: Validate configuration at startup to fail fast
- **Documentation**: Document custom tools and skills for your team
## Next Steps

- **Environment Variables**: Complete reference for all environment variables
- **HTTP API Reference**: Detailed endpoint documentation