
Overview

ZeroClaw supports custom API endpoints for both OpenAI-compatible and Anthropic-compatible providers. This enables integration with:
  • Local LLM servers (llama.cpp, SGLang, vLLM, Ollama)
  • Corporate AI gateways
  • Third-party OpenAI/Anthropic proxies
  • Self-hosted models

Provider Types

OpenAI-Compatible (custom:)

For services implementing the OpenAI API format.
Provider prefix: custom:https://your-api.com
Endpoints:
  • POST /chat/completions - Chat completion
  • GET /models - Model discovery (optional)

Anthropic-Compatible (anthropic-custom:)

For services implementing the Anthropic API format.
Provider prefix: anthropic-custom:https://your-api.com
Endpoints:
  • POST /v1/messages - Chat completion
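The two prefixes differ only in which request path is appended to the base URL. As an illustrative sketch (the helper name `chat_endpoint` is ours, not part of ZeroClaw), the mapping might look like:

```python
def chat_endpoint(provider: str) -> str:
    """Resolve a custom provider string to its chat-completion URL."""
    if provider.startswith("anthropic-custom:"):
        base = provider[len("anthropic-custom:"):].rstrip("/")
        return base + "/v1/messages"       # Anthropic-compatible path
    if provider.startswith("custom:"):
        base = provider[len("custom:"):].rstrip("/")
        return base + "/chat/completions"  # OpenAI-compatible path
    raise ValueError(f"not a custom provider: {provider}")

print(chat_endpoint("custom:https://your-api.com"))
# → https://your-api.com/chat/completions
print(chat_endpoint("anthropic-custom:https://your-api.com"))
# → https://your-api.com/v1/messages
```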

Configuration Methods

Config File

Edit ~/.zeroclaw/config.toml:
# OpenAI-compatible
default_provider = "custom:https://api.example.com"
api_key = "your-api-key"
default_model = "model-name"
default_temperature = 0.7
# Anthropic-compatible
default_provider = "anthropic-custom:https://api.example.com"
api_key = "your-api-key"
default_model = "claude-sonnet-4-6"
default_temperature = 0.7

Environment Variables

For custom providers, use the generic API key variables:
export API_KEY="your-api-key"
# or
export ZEROCLAW_API_KEY="your-api-key"

zeroclaw agent --provider "custom:https://api.example.com" \
  --model model-name \
  -m "Hello!"

API Mode (OpenAI-compatible only)

Control which endpoint is called first:
default_provider = "custom:https://api.example.com"
provider_api = "openai-chat-completions"  # Default
# or
provider_api = "openai-responses"  # Responses-first mode
Note: provider_api is only valid when using custom:<url>.

First-Class Local Providers

ZeroClaw includes dedicated providers for common local servers with optimized defaults.

llama.cpp Server

Provider ID: llamacpp (alias: llama.cpp)
Default endpoint: http://localhost:8080/v1
Setup:
# Start llama-server
llama-server -hf ggml-org/gpt-oss-20b-GGUF \
  --jinja -c 133000 \
  --host 127.0.0.1 --port 8033

# Configure ZeroClaw
cat >> ~/.zeroclaw/config.toml <<EOF
default_provider = "llamacpp"
api_url = "http://127.0.0.1:8033/v1"
default_model = "ggml-org/gpt-oss-20b-GGUF"
default_temperature = 0.7
EOF

# Validate
zeroclaw models refresh --provider llamacpp
zeroclaw agent -m "hello"
Authentication: API key is optional. Set LLAMACPP_API_KEY only if the server was started with --api-key.

SGLang Server

Provider ID: sglang
Default endpoint: http://localhost:30000/v1
Setup:
# Start SGLang server
python -m sglang.launch_server \
  --model meta-llama/Llama-3.1-8B-Instruct \
  --port 30000 \
  --tool-call-parser hermes  # Required for tool calling

# Configure ZeroClaw
cat >> ~/.zeroclaw/config.toml <<EOF
default_provider = "sglang"
default_model = "meta-llama/Llama-3.1-8B-Instruct"
default_temperature = 0.7
EOF

# Validate
zeroclaw models refresh --provider sglang
zeroclaw agent -m "hello"
Tool calling: requires the --tool-call-parser flag when launching SGLang.

vLLM Server

Provider ID: vllm
Default endpoint: http://localhost:8000/v1
Setup:
# Start vLLM server
vllm serve meta-llama/Llama-3.1-8B-Instruct

# Configure ZeroClaw
cat >> ~/.zeroclaw/config.toml <<EOF
default_provider = "vllm"
default_model = "meta-llama/Llama-3.1-8B-Instruct"
default_temperature = 0.7
EOF

# Validate
zeroclaw models refresh --provider vllm
zeroclaw agent -m "hello"

Hunyuan (Tencent)

Provider ID: hunyuan (alias: tencent)
Base URL: https://api.hunyuan.cloud.tencent.com/v1
Setup:
export HUNYUAN_API_KEY="your-api-key"

cat >> ~/.zeroclaw/config.toml <<EOF
default_provider = "hunyuan"
default_model = "hunyuan-t1-latest"
default_temperature = 0.7
EOF

zeroclaw agent -m "hello"
Models: hunyuan-t1-latest, hunyuan-turbo-latest, hunyuan-pro

OpenAI Responses API (WebSocket)

For OpenAI-compatible endpoints:
Auto mode: When a custom: endpoint resolves to api.openai.com, ZeroClaw tries WebSocket first (wss://.../responses) and falls back to HTTP.
Manual override:
# Force WebSocket mode
export ZEROCLAW_RESPONSES_WEBSOCKET=1

# Disable WebSocket (HTTP only)
export ZEROCLAW_RESPONSES_WEBSOCKET=0
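The selection logic described above can be sketched as a small function. This is an illustration of the documented behavior, not ZeroClaw's actual code; the env-var semantics match the docs (1 forces WebSocket, 0 forces HTTP, unset means auto):

```python
from urllib.parse import urlparse

def use_websocket_first(endpoint: str, env: dict) -> bool:
    """Decide whether to attempt WebSocket before HTTP (sketch)."""
    override = env.get("ZEROCLAW_RESPONSES_WEBSOCKET")
    if override == "1":
        return True   # forced WebSocket
    if override == "0":
        return False  # forced HTTP
    # Auto mode: prefer WebSocket only when talking to api.openai.com
    return urlparse(endpoint).hostname == "api.openai.com"

use_websocket_first("https://api.openai.com/v1", {})            # True (auto)
use_websocket_first("https://gateway.corp/v1", {})              # False (auto)
use_websocket_first("https://gateway.corp/v1",
                    {"ZEROCLAW_RESPONSES_WEBSOCKET": "1"})      # True (forced)
```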

Credential Resolution

For custom: and anthropic-custom: providers:
  1. Explicit api_key from config
  2. ZEROCLAW_API_KEY environment variable
  3. API_KEY environment variable
Note: Provider-specific env vars (e.g., OPENAI_API_KEY) are not used for custom endpoints.
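The precedence above can be expressed as a short function. The name and signature here are illustrative, not ZeroClaw's API:

```python
import os

def resolve_api_key(config_key, env=None):
    """Documented key precedence for custom providers (sketch)."""
    env = os.environ if env is None else env
    if config_key:                          # 1. explicit api_key from config
        return config_key
    # 2. ZEROCLAW_API_KEY, then 3. generic API_KEY
    return env.get("ZEROCLAW_API_KEY") or env.get("API_KEY")

resolve_api_key("from-config", {"ZEROCLAW_API_KEY": "ignored"})  # 'from-config'
resolve_api_key(None, {"ZEROCLAW_API_KEY": "zc", "API_KEY": "generic"})  # 'zc'
resolve_api_key(None, {"API_KEY": "generic"})  # 'generic'
```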

Testing Configuration

Verify Endpoint

# Test connection
zeroclaw agent --provider "custom:https://api.example.com" \
  --model model-name \
  -m "test message"

Model Discovery

zeroclaw models refresh --provider "custom:https://api.example.com"

Health Check

zeroclaw status

Troubleshooting

Authentication Errors

Symptom: 401 Unauthorized
Solution:
  1. Verify API key is correct
  2. Check endpoint URL format (must include http:// or https://)
  3. Ensure endpoint is accessible from your network
curl -I https://api.example.com

Model Not Found

Symptom: 404 Model not found
Solution:
  1. Verify model name matches provider’s available models
  2. List available models:
curl -sS https://api.example.com/models \
  -H "Authorization: Bearer $API_KEY"
  3. For gateways that don’t implement /models, send a test request and check the error message:
curl -sS https://api.example.com/chat/completions \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "test-model",
    "messages": [{"role": "user", "content": "test"}]
  }'

Connection Issues

Symptom: Connection timeout or refused
Solution:
  1. Test endpoint accessibility:
curl -I https://api.example.com
  2. Check firewall/proxy settings
  3. Verify the provider status page
  4. Try with verbose logging:
RUST_LOG=debug zeroclaw agent --provider "custom:..." -m "test"

Examples

Local LLM Server

default_provider = "custom:http://localhost:8080/v1"
api_key = "optional-if-auth-enabled"
default_model = "local-model"

Corporate Proxy

default_provider = "anthropic-custom:https://llm-proxy.corp.example.com"
api_key = "internal-token"
default_model = "claude-sonnet-4-6"

Cloud Provider Gateway

default_provider = "custom:https://gateway.cloud-provider.com/v1"
api_key = "gateway-api-key"
default_model = "gpt-4o"

Multi-Provider Fallback

Combine custom endpoints with fallback:
default_provider = "custom:https://primary.example.com"

[reliability]
fallback_providers = [
  "custom:https://backup.example.com",
  "openai"
]
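Conceptually, the fallback chain tries the default provider first, then each entry in fallback_providers in order. A minimal sketch of that behavior (send_fn stands in for the actual request logic; this is not ZeroClaw's implementation):

```python
def complete_with_fallback(providers, send_fn):
    """Try each provider in order; return the first success."""
    errors = []
    for provider in providers:
        try:
            return send_fn(provider)
        except Exception as exc:  # a real client would narrow this
            errors.append((provider, exc))
    raise RuntimeError(f"all providers failed: {errors}")

chain = ["custom:https://primary.example.com",
         "custom:https://backup.example.com",
         "openai"]

def flaky(provider):
    # Simulate the primary endpoint being down
    if "primary" in provider:
        raise ConnectionError("primary down")
    return f"ok via {provider}"

print(complete_with_fallback(chain, flaky))
# → ok via custom:https://backup.example.com
```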

API Format Requirements

OpenAI-Compatible

Request:
{
  "model": "model-name",
  "messages": [
    {"role": "system", "content": "System prompt"},
    {"role": "user", "content": "User message"}
  ],
  "temperature": 0.7,
  "max_tokens": 4096
}
Response:
{
  "choices": [{
    "message": {
      "role": "assistant",
      "content": "Response text"
    },
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 10,
    "completion_tokens": 5
  }
}
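Working with these shapes needs no SDK; plain dicts suffice. A sketch of building the request above and extracting the assistant text from the response (helper names are ours):

```python
def build_openai_request(model, system, user, temperature=0.7, max_tokens=4096):
    """Assemble an OpenAI-compatible chat-completion payload."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": system},
            {"role": "user", "content": user},
        ],
        "temperature": temperature,
        "max_tokens": max_tokens,
    }

def openai_text(response):
    # First choice's message content; real code should also check finish_reason
    return response["choices"][0]["message"]["content"]

resp = {"choices": [{"message": {"role": "assistant", "content": "Response text"},
                     "finish_reason": "stop"}],
        "usage": {"prompt_tokens": 10, "completion_tokens": 5}}
print(openai_text(resp))  # → Response text
```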

Anthropic-Compatible

Request:
{
  "model": "model-name",
  "max_tokens": 4096,
  "system": "System prompt",
  "messages": [
    {"role": "user", "content": "User message"}
  ],
  "temperature": 0.7
}
Response:
{
  "content": [{
    "type": "text",
    "text": "Response text"
  }],
  "stop_reason": "end_turn",
  "usage": {
    "input_tokens": 10,
    "output_tokens": 5
  }
}
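Note that the Anthropic-compatible response carries a list of content blocks rather than a single string; assistant text is the concatenation of the "text" blocks. A sketch (helper name is ours):

```python
def anthropic_text(response):
    """Join all text blocks from an Anthropic-compatible response."""
    return "".join(block["text"]
                   for block in response["content"]
                   if block.get("type") == "text")

resp = {"content": [{"type": "text", "text": "Response text"}],
        "stop_reason": "end_turn",
        "usage": {"input_tokens": 10, "output_tokens": 5}}
print(anthropic_text(resp))  # → Response text
```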

Limitations

  • Custom endpoints must match OpenAI or Anthropic API formats
  • Provider-specific features may not be supported
  • Model discovery depends on endpoint implementing /models
  • Tool calling support depends on endpoint compatibility
  • Vision support depends on endpoint capabilities
