
Overview

IronClaw defaults to NEAR AI for model access, but supports any OpenAI-compatible endpoint as well as direct Anthropic, OpenAI, and Ollama integrations.

Supported Providers

| Provider | Backend Value | API Key Required | Notes |
|---|---|---|---|
| NEAR AI | nearai | OAuth (browser) or API key | Default; multi-model access |
| Anthropic | anthropic | ANTHROPIC_API_KEY | Claude models |
| OpenAI | openai | OPENAI_API_KEY | GPT models |
| Ollama | ollama | No | Local inference |
| OpenRouter | openai_compatible | LLM_API_KEY | 300+ models via one API |
| Together AI | openai_compatible | LLM_API_KEY | Fast open-source inference |
| Fireworks AI | openai_compatible | LLM_API_KEY | Fast inference with compound AI |
| vLLM / LiteLLM | openai_compatible | Optional | Self-hosted |
| LM Studio | openai_compatible | No | Local GUI |

NEAR AI (Default)

No additional configuration is required. On first run, ironclaw onboard opens a browser for OAuth authentication.

Session Token Auth (Default)

Best for: Local development, personal use
ironclaw onboard  # Opens browser for GitHub/Google login
Credentials are saved to ~/.ironclaw/session.json. Environment variables:
NEARAI_MODEL=zai-org/GLM-latest
NEARAI_BASE_URL=https://private.near.ai  # default
NEARAI_AUTH_URL=https://private.near.ai  # default

API Key Auth

Best for: CI/CD, hosting providers, VPS without browser access
LLM_BACKEND=nearai
NEARAI_API_KEY=your-api-key-from-cloud.near.ai
NEARAI_MODEL=zai-org/GLM-latest
Get your API key from cloud.near.ai.

Automatic mode selection: when NEARAI_API_KEY is set, IronClaw automatically uses the Chat Completions API at cloud-api.near.ai instead of session-based auth.
| Model | ID |
|---|---|
| GLM Latest (default) | zai-org/GLM-latest |
| Claude Sonnet 4 | anthropic::claude-sonnet-4-20250514 |
| GPT-5.3 Codex | openai::gpt-5.3-codex |
| GPT-5.2 | openai::gpt-5.2 |
| GPT-4o | openai::gpt-4o |
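The mode-selection rule above can be sketched as a small decision function. This is a hypothetical helper illustrating the documented behavior, not IronClaw's actual code; the function name and return shape are assumptions.

```python
def select_nearai_mode(env):
    """Sketch of the documented rule: if NEARAI_API_KEY is set, use the
    Chat Completions API at cloud-api.near.ai; otherwise fall back to
    session-based auth against NEARAI_BASE_URL."""
    if env.get("NEARAI_API_KEY"):
        return ("api_key", "https://cloud-api.near.ai")
    return ("session", env.get("NEARAI_BASE_URL", "https://private.near.ai"))

mode_with_key = select_nearai_mode({"NEARAI_API_KEY": "my-key"})
mode_without_key = select_nearai_mode({})
```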

Anthropic (Claude)

Direct access to Claude models via Anthropic API.
LLM_BACKEND=anthropic
ANTHROPIC_API_KEY=sk-ant-...
ANTHROPIC_MODEL=claude-sonnet-4-20250514  # optional, see below
Get your API key: console.anthropic.com/settings/keys
| Model | ID |
|---|---|
| Claude Sonnet 4 | claude-sonnet-4-20250514 |
| Claude 3.5 Sonnet | claude-3-5-sonnet-20241022 |
| Claude 3.5 Haiku | claude-3-5-haiku-20241022 |

Optional Base URL Override

ANTHROPIC_BASE_URL=https://api.anthropic.com  # custom proxy

OpenAI (GPT)

Direct access to OpenAI models.
LLM_BACKEND=openai
OPENAI_API_KEY=sk-...
OPENAI_MODEL=gpt-4o  # optional, see below
Get your API key: platform.openai.com/api-keys
| Model | ID |
|---|---|
| GPT-4o | gpt-4o |
| GPT-4o Mini | gpt-4o-mini |
| o3-mini | o3-mini |

Optional Base URL Override

OPENAI_BASE_URL=https://api.openai.com/v1  # custom proxy

Ollama (Local)

Run models locally with Ollama.

Installation

  1. Install Ollama from ollama.com
  2. Pull a model:
    ollama pull llama3.2
    
  3. Configure IronClaw:
    LLM_BACKEND=ollama
    OLLAMA_MODEL=llama3.2
    OLLAMA_BASE_URL=http://localhost:11434  # default
    
| Model | Command | Notes |
|---|---|---|
| Llama 3.2 | ollama pull llama3.2 | 3B, fast |
| Mistral | ollama pull mistral | 7B, good quality |
| Qwen 2.5 | ollama pull qwen2.5 | Multilingual |
See all models at ollama.com/library.

OpenAI-Compatible Providers

All providers below use LLM_BACKEND=openai_compatible. Set LLM_BASE_URL to the provider’s endpoint and LLM_API_KEY if required.
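All OpenAI-compatible providers accept the same request shape: a POST to {base URL}/chat/completions with a bearer token when the provider requires one. As a sketch of that convention (not IronClaw's internal client; the function name is an assumption), building such a request looks like:

```python
def build_chat_request(base_url, api_key, model, prompt):
    """Assemble the URL, headers, and JSON body for an OpenAI-compatible
    /chat/completions call. No network I/O happens here."""
    url = base_url.rstrip("/") + "/chat/completions"
    headers = {"Content-Type": "application/json"}
    if api_key:  # optional for local servers like LM Studio
        headers["Authorization"] = f"Bearer {api_key}"
    body = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return url, headers, body

url, headers, body = build_chat_request(
    "https://openrouter.ai/api/v1", "sk-or-example", "openai/gpt-4o", "hi")
```

Swapping providers is just a matter of changing the base URL, key, and model ID, which is why one backend value covers them all.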

OpenRouter

OpenRouter routes to 300+ models from a single API key.
LLM_BACKEND=openai_compatible
LLM_BASE_URL=https://openrouter.ai/api/v1
LLM_API_KEY=sk-or-...
LLM_MODEL=anthropic/claude-sonnet-4
Get your API key: openrouter.ai/settings/keys
| Model | ID |
|---|---|
| Claude Sonnet 4 | anthropic/claude-sonnet-4 |
| GPT-4o | openai/gpt-4o |
| Llama 4 Maverick | meta-llama/llama-4-maverick |
| Gemini 2.0 Flash | google/gemini-2.0-flash-001 |
| Mistral Small | mistralai/mistral-small-3.1-24b-instruct |
Browse all models at openrouter.ai/models.

Optional HTTP Headers

OpenRouter supports custom headers for attribution:
LLM_EXTRA_HEADERS=HTTP-Referer:https://myapp.com,X-Title:MyApp

Together AI

Together AI provides fast inference for open-source models.
LLM_BACKEND=openai_compatible
LLM_BASE_URL=https://api.together.xyz/v1
LLM_API_KEY=your-together-api-key
LLM_MODEL=meta-llama/Llama-3.3-70B-Instruct-Turbo
Get your API key: api.together.xyz/settings/api-keys
| Model | ID |
|---|---|
| Llama 3.3 70B | meta-llama/Llama-3.3-70B-Instruct-Turbo |
| DeepSeek R1 | deepseek-ai/DeepSeek-R1 |
| Qwen 2.5 72B | Qwen/Qwen2.5-72B-Instruct-Turbo |

Fireworks AI

Fireworks AI offers fast inference with compound AI system support.
LLM_BACKEND=openai_compatible
LLM_BASE_URL=https://api.fireworks.ai/inference/v1
LLM_API_KEY=fw_...
LLM_MODEL=accounts/fireworks/models/llama4-maverick-instruct-basic
Get your API key: fireworks.ai/account/api-keys

vLLM / LiteLLM (Self-Hosted)

For self-hosted inference servers:
LLM_BACKEND=openai_compatible
LLM_BASE_URL=http://localhost:8000/v1
LLM_API_KEY=token-abc123  # set to any string if auth is disabled
LLM_MODEL=meta-llama/Llama-3.1-8B-Instruct
LiteLLM proxy (forwards to any backend, including Bedrock, Vertex, Azure):
LLM_BACKEND=openai_compatible
LLM_BASE_URL=http://localhost:4000/v1
LLM_API_KEY=sk-...
LLM_MODEL=gpt-4o  # as configured in litellm config.yaml

LM Studio (Local GUI)

Start LM Studio’s local server, then:
LLM_BACKEND=openai_compatible
LLM_BASE_URL=http://localhost:1234/v1
LLM_MODEL=llama-3.2-3b-instruct-q4_K_M
# LLM_API_KEY not required for LM Studio

Extra HTTP Headers

For OpenAI-compatible providers that require custom headers:
LLM_EXTRA_HEADERS=HTTP-Referer:https://github.com/nearai/ironclaw,X-Title:ironclaw
Format: Comma-separated Key:Value pairs. Values can contain colons (e.g., URLs).
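Because values may themselves contain colons, a parser for this format must split each pair on the first colon only. IronClaw's actual parser isn't shown here; a minimal sketch of the documented format:

```python
def parse_extra_headers(raw):
    """Parse comma-separated Key:Value pairs into a header dict.
    Each pair is split on the FIRST colon only, so values may
    contain colons themselves (e.g. https:// URLs)."""
    headers = {}
    for pair in raw.split(","):
        if not pair.strip():
            continue
        key, _, value = pair.partition(":")  # splits at first colon
        headers[key.strip()] = value.strip()
    return headers

parsed = parse_extra_headers("HTTP-Referer:https://myapp.com,X-Title:MyApp")
```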

Using the Setup Wizard

Instead of editing .env manually, run:
ironclaw onboard
Select “OpenAI-compatible” for OpenRouter, Together AI, Fireworks, vLLM, LiteLLM, or LM Studio. The wizard will prompt for the base URL and API key.

Advanced Configuration

Fallback Models

For NEAR AI, configure automatic failover:
NEARAI_FALLBACK_MODEL=openai::gpt-4o-mini
If the primary model fails, requests automatically fall through to the fallback.
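The failover behavior amounts to try-primary-then-fallback. As a sketch under the assumption that a model call raises on failure (function names here are illustrative, not IronClaw's API):

```python
def complete_with_fallback(call_model, primary, fallback=None):
    """Send the request to the primary model; if the call raises and a
    fallback is configured, retry once with the fallback model."""
    try:
        return call_model(primary)
    except Exception:
        if fallback is None:
            raise
        return call_model(fallback)

# Hypothetical caller where the primary model is unavailable.
def fake_call(model):
    if model == "zai-org/GLM-latest":
        raise RuntimeError("primary unavailable")
    return f"response from {model}"

result = complete_with_fallback(fake_call, "zai-org/GLM-latest", "openai::gpt-4o-mini")
```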

Circuit Breaker

Prevent cascading failures:
CIRCUIT_BREAKER_THRESHOLD=5  # Open after 5 consecutive failures
CIRCUIT_BREAKER_RECOVERY_SECS=30  # Try again after 30 seconds
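A circuit breaker with these two knobs tracks consecutive failures, opens (rejects calls) once the threshold is hit, and allows a retry after the recovery window. This is a generic sketch of the pattern, not IronClaw's implementation:

```python
import time

class CircuitBreaker:
    """Open after `threshold` consecutive failures; permit a retry
    once `recovery_secs` have elapsed, closing again on success."""
    def __init__(self, threshold=5, recovery_secs=30):
        self.threshold = threshold
        self.recovery_secs = recovery_secs
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def allow(self):
        if self.opened_at is None:
            return True  # closed: requests flow normally
        # Open: only allow once the recovery window has elapsed.
        return time.monotonic() - self.opened_at >= self.recovery_secs

    def record_success(self):
        self.failures = 0
        self.opened_at = None

    def record_failure(self):
        self.failures += 1
        if self.failures >= self.threshold:
            self.opened_at = time.monotonic()

cb = CircuitBreaker(threshold=2, recovery_secs=30)
cb.record_failure()
cb.record_failure()          # threshold hit: circuit opens
blocked = cb.allow()         # rejected while the window is running

reopened = CircuitBreaker(threshold=1, recovery_secs=0)
reopened.record_failure()
can_retry = reopened.allow() # zero-second window: retry allowed at once
```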

Response Caching

Cache LLM responses in memory (saves tokens on repeated prompts):
RESPONSE_CACHE_ENABLED=true
RESPONSE_CACHE_TTL_SECS=3600  # 1 hour
RESPONSE_CACHE_MAX_ENTRIES=1000
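Conceptually this is a prompt-keyed in-memory map with per-entry TTL and a size cap. A minimal sketch of those semantics (eviction policy here is oldest-first and is an assumption, not IronClaw's documented policy):

```python
import time

class ResponseCache:
    """In-memory prompt -> response cache with a TTL and an entry cap."""
    def __init__(self, ttl_secs=3600, max_entries=1000):
        self.ttl_secs = ttl_secs
        self.max_entries = max_entries
        self._store = {}  # prompt -> (response, inserted_at)

    def get(self, prompt):
        hit = self._store.get(prompt)
        if hit is None:
            return None
        response, inserted_at = hit
        if time.monotonic() - inserted_at > self.ttl_secs:
            del self._store[prompt]  # expired: drop and miss
            return None
        return response

    def put(self, prompt, response):
        if len(self._store) >= self.max_entries:
            # Dicts preserve insertion order, so this evicts the oldest entry.
            del self._store[next(iter(self._store))]
        self._store[prompt] = (response, time.monotonic())

cache = ResponseCache(ttl_secs=3600, max_entries=2)
cache.put("What is 2+2?", "4")
hit = cache.get("What is 2+2?")    # served from cache, no tokens spent
miss = cache.get("unseen prompt")  # not cached
cache.put("b", "B")
cache.put("c", "C")                # cap of 2 reached: oldest entry evicted
evicted = cache.get("What is 2+2?")
```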

Retries

NEARAI_MAX_RETRIES=3  # 1 initial + 3 retries = 4 total attempts
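The attempt arithmetic (1 initial + N retries = N + 1 total) can be sketched as a loop; backoff between attempts is omitted here, and the function name is illustrative:

```python
def call_with_retries(call, max_retries=3):
    """NEARAI_MAX_RETRIES semantics: one initial attempt plus
    max_retries retries, i.e. max_retries + 1 total attempts."""
    last_err = None
    for _ in range(max_retries + 1):
        try:
            return call()
        except Exception as err:
            last_err = err
    raise last_err

attempts = 0
def always_fails():
    global attempts
    attempts += 1
    raise RuntimeError("LLM unavailable")

try:
    call_with_retries(always_fails, max_retries=3)
except RuntimeError:
    pass  # all 4 attempts exhausted
```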

Embeddings Providers

For semantic search in workspace memory:
# Use NEAR AI for embeddings (default if NEAR AI is configured)
EMBEDDINGS_PROVIDER=nearai
EMBEDDINGS_MODEL=text-embedding-3-small

# Or use OpenAI
EMBEDDINGS_PROVIDER=openai
OPENAI_API_KEY=sk-...
EMBEDDINGS_MODEL=text-embedding-3-small
Both NEAR AI and OpenAI use the same model: text-embedding-3-small.
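Semantic search over workspace memory typically ranks stored entries by cosine similarity between embedding vectors; a score near 1 means the texts point the same way in embedding space. A generic sketch of that scoring (not IronClaw's search code):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors; higher means
    the embedded texts are semantically closer."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

score_same = cosine_similarity([1.0, 0.0], [1.0, 0.0])        # same direction
score_orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])  # unrelated
```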
