Overview
OneClaw supports six LLM providers across four different API formats:

| Provider | Format | Default Endpoint | Auth Header |
|---|---|---|---|
| Anthropic | Anthropic Messages | https://api.anthropic.com | x-api-key |
| OpenAI | OpenAI Chat Completions | https://api.openai.com | Bearer |
| DeepSeek | OpenAI Chat Completions | https://api.deepseek.com | Bearer |
| Groq | OpenAI Chat Completions | https://api.groq.com/openai | Bearer |
| Gemini | Gemini GenerateContent | https://generativelanguage.googleapis.com | Query param ?key= |
| Ollama | Ollama Chat | http://localhost:11434 | None (local) |
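The table above can be captured in a small lookup; a sketch where the keys and field names are illustrative, not OneClaw's actual config schema:

```python
# Illustrative mapping of the provider table; names are hypothetical,
# not OneClaw's real internal structures.
PROVIDERS = {
    "anthropic": {"format": "anthropic_messages",
                  "endpoint": "https://api.anthropic.com", "auth": "x-api-key"},
    "openai":    {"format": "openai_chat",
                  "endpoint": "https://api.openai.com", "auth": "bearer"},
    "deepseek":  {"format": "openai_chat",
                  "endpoint": "https://api.deepseek.com", "auth": "bearer"},
    "groq":      {"format": "openai_chat",
                  "endpoint": "https://api.groq.com/openai", "auth": "bearer"},
    "gemini":    {"format": "gemini_generate_content",
                  "endpoint": "https://generativelanguage.googleapis.com",
                  "auth": "query_key"},          # ?key= query parameter
    "ollama":    {"format": "ollama_chat",
                  "endpoint": "http://localhost:11434", "auth": None},  # local, no auth
}
```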
AnthropicProvider
API Format: POST /v1/messages with x-api-key header
Supported Models
- claude-sonnet-4-20250514 (default) - Best balance of quality/speed/cost
- claude-haiku-4-5-20251001 - Fast, cheap, good for classification
- claude-opus-4-5-20250918 - Maximum quality, expensive
Configuration
API Key Resolution
Priority order:
1. config.api_key (explicit in code/TOML)
2. ONECLAW_API_KEY environment variable
3. ANTHROPIC_API_KEY environment variable
4. Error if none found
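The resolution order above can be sketched as a small helper (the function name is illustrative, not OneClaw's actual API):

```python
import os

def resolve_api_key(config_api_key=None):
    """Resolve the Anthropic API key in the documented priority order
    (a sketch of the behavior, not OneClaw's real implementation)."""
    # 1. An explicit key from code/TOML config wins.
    if config_api_key:
        return config_api_key
    # 2./3. Fall back to environment variables, generic first.
    for var in ("ONECLAW_API_KEY", "ANTHROPIC_API_KEY"):
        value = os.environ.get(var)
        if value:
            return value
    # 4. Error if none found.
    raise RuntimeError(
        "No API key: set config.api_key, ONECLAW_API_KEY, or ANTHROPIC_API_KEY"
    )
```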
API Format Differences
System Prompt: Separate top-level system parameter (not in the messages array)
Request:
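A sketch of the request shape (the prompt, helper name, and header values are illustrative):

```python
import json

# Sketch of an Anthropic Messages request. Note the system prompt is a
# separate top-level field, not a message in the messages array.
def build_anthropic_request(api_key, prompt, system=None):
    headers = {
        "x-api-key": api_key,               # auth header, not Bearer
        "anthropic-version": "2023-06-01",  # required API version header
        "content-type": "application/json",
    }
    body = {
        "model": "claude-sonnet-4-20250514",
        "max_tokens": 1024,                 # required by the Messages API
        "messages": [{"role": "user", "content": prompt}],
    }
    if system:
        body["system"] = system             # separate top-level parameter
    return headers, json.dumps(body)
```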
OpenAICompatibleProvider
API Format: POST /v1/chat/completions with Bearer token
This single implementation serves 3 providers:
OpenAI (GPT-4o family)
Default Model: gpt-4o
Other Models:
- gpt-4o-mini - Smaller, faster, cheaper
- o1 - Reasoning model
config.api_key → ONECLAW_API_KEY → OPENAI_API_KEY
DeepSeek
Default Model: deepseek-chat
Other Models:
- deepseek-reasoner - Extended thinking/reasoning model
config.api_key → ONECLAW_API_KEY → DEEPSEEK_API_KEY
Groq (Fast Inference)
Default Model: llama-3.3-70b-versatile
Other Models:
- mixtral-8x7b-32768 - Mixtral model
- Various Llama variants
config.api_key → ONECLAW_API_KEY → GROQ_API_KEY
API Format (All OpenAI-Compatible)
System Prompt: Included as a message with role: "system" in the messages array
Request:
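A sketch of the shared request shape used by OpenAI, DeepSeek, and Groq (helper name and values are illustrative):

```python
import json

# Sketch of an OpenAI-style chat completions request. The system prompt
# is just another message in the messages array.
def build_openai_request(api_key, prompt, system=None, model="gpt-4o"):
    headers = {
        "Authorization": f"Bearer {api_key}",  # Bearer token auth
        "Content-Type": "application/json",
    }
    messages = []
    if system:
        messages.append({"role": "system", "content": system})
    messages.append({"role": "user", "content": prompt})
    body = {"model": model, "messages": messages, "max_tokens": 1024}
    return headers, json.dumps(body)
```

Only the base URL and model name change between the three providers; the request body is identical.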
GeminiProvider
API Format: POST /v1beta/models/{model}:generateContent?key={api_key}
Supported Models
- gemini-2.0-flash (default) - Fast, cheap, good quality
- gemini-2.0-flash-lite - Fastest, cheapest
- gemini-2.5-pro - Best quality, expensive
- gemini-2.5-flash - Balanced with extended thinking
Configuration
API Key Resolution
Priority order:
1. config.api_key
2. ONECLAW_API_KEY
3. GOOGLE_API_KEY
4. GEMINI_API_KEY
API Format Differences
Key Differences from Other Providers:
- Model name in URL path, not request body
- API key in query parameter ?key=, not header
- System prompt as separate systemInstruction field
- Role "model" instead of "assistant"
- Content uses parts array, not flat content string
- camelCase JSON fields (not snake_case)
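These differences can be seen in a request sketch (helper name and values are illustrative):

```python
import json

BASE = "https://generativelanguage.googleapis.com/v1beta"

# Sketch of a Gemini generateContent request. Note camelCase fields,
# the "parts" array, and the separate systemInstruction field.
def build_gemini_request(api_key, prompt, system=None, model="gemini-2.0-flash"):
    # Model in the URL path, API key in a query parameter.
    url = f"{BASE}/models/{model}:generateContent?key={api_key}"
    body = {
        "contents": [
            # Roles are "user" and "model" (not "assistant").
            {"role": "user", "parts": [{"text": prompt}]},
        ],
        "generationConfig": {"maxOutputTokens": 1024},  # camelCase, not snake_case
    }
    if system:
        body["systemInstruction"] = {"parts": [{"text": system}]}
    return url, json.dumps(body)
```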
OllamaProvider
API Format: POST /api/chat (no authentication)
Supported Models
Edge devices (RPi 4, 4GB RAM):
- llama3.2:3b (default) - Good balance
- phi3:mini - Smaller, faster
- qwen2.5:3b - Multilingual, Vietnamese OK
Larger hardware:
- llama3.2:7b - Higher quality
- mistral:7b - Alternative
- deepseek-r1:7b - Reasoning model
Configuration
Key Differences
No Authentication: No API key required (local service)

Health Check: is_available() performs an actual health check via GET /api/tags with a 5s timeout
Longer Timeout: 120s default (vs 60s for cloud providers) for slow edge hardware
Different Endpoint: /api/chat (not /v1/chat/completions)
Parameter Names: Uses num_predict (not max_tokens)
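The differences above can be sketched together (helper names and the prompt are illustrative; assumes a local Ollama at the default port):

```python
import json

OLLAMA = "http://localhost:11434"

# Sketch of an Ollama chat request: no auth header, /api/chat endpoint,
# num_predict instead of max_tokens, and a long timeout for edge hardware.
def build_ollama_chat(prompt, model="llama3.2:3b"):
    url = f"{OLLAMA}/api/chat"            # not /v1/chat/completions
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
        "options": {"num_predict": 512},  # Ollama's name for max tokens
    }
    return url, json.dumps(body), 120     # 120s timeout for slow devices

def health_check_url():
    # is_available() probes GET /api/tags with a 5s timeout.
    return f"{OLLAMA}/api/tags", 5
```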
API Format
Request:

Model Management
Check Available Models:

Choosing a Provider
For Production (Cloud)
Primary: Anthropic Claude (claude-sonnet-4-20250514)
- Best overall quality
- Good context handling
- Reliable API
Alternative: OpenAI (gpt-4o)
- Fast inference
- Wide model selection
- Excellent for reasoning
Budget: DeepSeek (deepseek-chat)
- Very cost-effective
- Good quality
- Fast reasoning model available
For Edge/IoT
Ollama (llama3.2:3b on RPi, llama3.2:7b on desktop)
- Fully offline
- No API costs
- Privacy-preserving
- Runs on edge hardware
For Speed
Groq (llama-3.3-70b-versatile)
- Fastest inference
- Good quality
- OpenAI-compatible
For Multimodal
Gemini (gemini-2.0-flash)
- Native multimodal support
- Fast and cheap
- Good multilingual support