Working with LLM Providers
OneClaw supports multiple LLM providers through a unified Provider trait. All providers expose the same interface, which makes switching between models straightforward.
Supported Providers
OneClaw supports 6 LLM providers across three tiers:

Tier 1 (Cloud - Primary)
- Anthropic Claude - Primary provider, best balance of quality and speed
- OpenAI GPT - GPT-4o family, industry standard
- Google Gemini - Multimodal, fast and cost-effective
- DeepSeek - Reasoning-focused models
- Groq - Ultra-fast inference (Llama, Mixtral)
- Ollama - Self-hosted, offline capable, perfect for edge deployment
Provider Configuration
Configure providers in oneclaw.toml:
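The original example was lost in extraction; the sketch below shows what a minimal config might look like. The [provider.keys] table and provider.api_key follow the resolution rules in the next section, while the "default" key name is an assumption.

```toml
[provider]
default = "anthropic"        # assumed key: which provider to use by default
api_key = "sk-..."           # global fallback key

[provider.keys]
anthropic = "sk-ant-..."     # per-provider key (takes priority over the global key)
openai = "sk-..."
```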
API Key Resolution
API keys are resolved in this order:
1. Per-provider key in [provider.keys]
2. Global key in provider.api_key
3. Environment variable ONECLAW_API_KEY
4. Provider-specific env var (ANTHROPIC_API_KEY, OPENAI_API_KEY, etc.)
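The lookup order above can be sketched as a simple chain of fallbacks. This is an illustrative sketch, not OneClaw's actual internals; the function and parameter names are assumptions.

```rust
use std::collections::HashMap;
use std::env;

/// Resolve an API key using the documented priority order.
/// Names here are illustrative, not OneClaw's real code.
fn resolve_api_key(
    provider: &str,
    per_provider_keys: &HashMap<String, String>,
    global_key: Option<&str>,
) -> Option<String> {
    // 1. Per-provider key in [provider.keys]
    if let Some(k) = per_provider_keys.get(provider) {
        return Some(k.clone());
    }
    // 2. Global key in provider.api_key
    if let Some(k) = global_key {
        return Some(k.to_string());
    }
    // 3. Environment variable ONECLAW_API_KEY
    if let Ok(k) = env::var("ONECLAW_API_KEY") {
        return Some(k);
    }
    // 4. Provider-specific env var, e.g. ANTHROPIC_API_KEY
    env::var(format!("{}_API_KEY", provider.to_uppercase())).ok()
}

fn main() {
    let mut keys = HashMap::new();
    keys.insert("anthropic".to_string(), "sk-per-provider".to_string());
    // The per-provider key wins over the global key.
    println!("{:?}", resolve_api_key("anthropic", &keys, Some("sk-global")));
}
```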
Configuring Each Provider
Anthropic Claude
- claude-sonnet-4-20250514 - Best balance (default)
- claude-haiku-4-5-20251001 - Fast, cheap, good for classification
- claude-opus-4-5-20250918 - Max quality, expensive
OpenAI GPT
- gpt-4o - Flagship model (default)
- gpt-4o-mini - Faster, cheaper
Google Gemini
- gemini-2.0-flash - Fast, cheap, good quality (default)
- gemini-2.0-flash-lite - Fastest, cheapest
- gemini-2.5-pro - Best quality, expensive
Ollama (Local)
- llama3.2:3b - Best balance for edge (Raspberry Pi 4)
- phi3:mini - Smaller, faster
- qwen2.5:3b - Multilingual, Vietnamese support
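For local use, the Ollama provider typically needs an endpoint and a model. This fragment is a sketch; the key names are assumptions, though http://localhost:11434 is Ollama's standard default port and the 120s timeout matches the recommendation under Best Practices.

```toml
[provider.ollama]
base_url = "http://localhost:11434"  # standard Ollama endpoint
model = "llama3.2:3b"                # best balance for edge hardware
timeout_secs = 120                   # edge hardware is slow
```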
Fallback Chains with ReliableProvider
Build resilient systems with automatic failover.

FallbackChain

Tries providers in order until one succeeds.

ReliableProvider

Wraps a provider with retry logic.

Provider Selection Logic
The FallbackChain uses this logic:
- Skips unavailable providers (e.g., Ollama not running)
- Logs each attempt for debugging
- Returns first successful response
- Fails only if all providers fail
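The behavior described above can be sketched in a few dozen lines. The trait and struct shapes below are assumptions made for illustration (synchronous, string-based); OneClaw's real API will differ, but the control flow matches the list above: try each provider in order, return the first success, fail only when all fail, and optionally retry a single provider.

```rust
// Illustrative sketch; trait and type names are assumptions.
trait Provider {
    fn name(&self) -> &str;
    fn complete(&self, prompt: &str) -> Result<String, String>;
}

struct FallbackChain {
    providers: Vec<Box<dyn Provider>>,
}

impl FallbackChain {
    fn complete(&self, prompt: &str) -> Result<String, String> {
        let mut errors = Vec::new();
        for p in &self.providers {
            // Record each failed attempt, return the first success.
            match p.complete(prompt) {
                Ok(resp) => return Ok(resp),
                Err(e) => errors.push(format!("{}: {}", p.name(), e)),
            }
        }
        // Fails only if every provider in the chain failed.
        Err(format!("all providers failed: {}", errors.join("; ")))
    }
}

/// Retry wrapper in the spirit of ReliableProvider.
struct ReliableProvider {
    inner: Box<dyn Provider>,
    max_retries: u32,
}

impl ReliableProvider {
    fn complete(&self, prompt: &str) -> Result<String, String> {
        let mut last_err = String::new();
        for _ in 0..=self.max_retries {
            match self.inner.complete(prompt) {
                Ok(r) => return Ok(r),
                Err(e) => last_err = e,
            }
        }
        Err(last_err)
    }
}

// A toy provider for demonstration: succeeds or fails on demand.
struct Flaky { name: String, ok: bool }

impl Provider for Flaky {
    fn name(&self) -> &str { &self.name }
    fn complete(&self, _prompt: &str) -> Result<String, String> {
        if self.ok {
            Ok(format!("{} says hi", self.name))
        } else {
            Err("unavailable".to_string())
        }
    }
}

fn main() {
    let chain = FallbackChain {
        providers: vec![
            Box::new(Flaky { name: "anthropic".into(), ok: false }),
            Box::new(Flaky { name: "openai".into(), ok: true }),
        ],
    };
    // The chain skips the failing provider and falls through to the next.
    println!("{:?}", chain.complete("hello"));
}
```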
Error Handling
Provider-Specific Errors
Each provider handles errors differently.

Handling Errors in Your Code
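Since each provider reports errors in its own format, code typically matches on a normalized error type. The enum and variant names below are an illustrative assumption, not OneClaw's actual error type; the point is the pattern of mapping each error class to a distinct recovery strategy.

```rust
// Hedged sketch of a unified provider error; variant names are assumptions.
#[derive(Debug)]
enum ProviderError {
    RateLimited { retry_after_secs: u64 },
    AuthFailed(String),
    Unavailable(String),
}

fn handle(err: &ProviderError) -> &'static str {
    match err {
        // Rate limits are transient: back off and retry.
        ProviderError::RateLimited { .. } => "retry later",
        // A bad or missing API key will not fix itself: surface it.
        ProviderError::AuthFailed(_) => "check api key",
        // Provider down (e.g. Ollama not running): try the next in the chain.
        ProviderError::Unavailable(_) => "fall back",
    }
}

fn main() {
    let err = ProviderError::RateLimited { retry_after_secs: 30 };
    println!("{}", handle(&err));
}
```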
Provider Trait Interface
All providers implement a common trait.

Response Format
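The original trait and response listings were lost in extraction. As a hedged sketch, the shared trait and a response carrying content plus token usage might look roughly like this; every name here (Completion, Usage, field names) is an assumption, though the usage field mirrors the response.usage mentioned under Best Practices.

```rust
// Illustrative shapes only; not OneClaw's actual API.
struct Usage {
    input_tokens: u32,
    output_tokens: u32,
}

struct Completion {
    content: String,   // the model's text output
    model: String,     // which model actually answered
    usage: Usage,      // token counts, useful for cost tracking
}

trait Provider {
    fn name(&self) -> &str;
    fn complete(&self, prompt: &str) -> Result<Completion, String>;
}

// A trivial implementation to show the trait in use.
struct Echo;

impl Provider for Echo {
    fn name(&self) -> &str { "echo" }
    fn complete(&self, prompt: &str) -> Result<Completion, String> {
        Ok(Completion {
            content: prompt.to_string(),
            model: "echo-1".to_string(),
            usage: Usage {
                input_tokens: prompt.len() as u32,
                output_tokens: prompt.len() as u32,
            },
        })
    }
}

fn main() {
    let c = Echo.complete("hi").unwrap();
    println!("{} ({} in / {} out)", c.content, c.usage.input_tokens, c.usage.output_tokens);
}
```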
Best Practices
- Use fallback chains for production: anthropic → openai → ollama
- Configure retries per-provider: max_retries = 2 in config
- Monitor token usage: Check response.usage to track costs
- Test locally first: Use Ollama for development before cloud deployment
- Set reasonable timeouts: Cloud = 60s, Ollama = 120s (edge hardware is slow)
- Handle provider-specific errors: Different providers return different error formats
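Several of these practices map directly onto config. The fragment below is a sketch: max_retries and the timeout values come from the list above, but the fallback key and the timeout_secs key name are assumptions.

```toml
[provider]
fallback = ["anthropic", "openai", "ollama"]  # assumed key: production failover order

[provider.anthropic]
max_retries = 2
timeout_secs = 60      # cloud timeout

[provider.ollama]
max_retries = 2
timeout_secs = 120     # edge hardware is slow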
See Also
- Configuration Reference - Full config options
- Architecture - Provider layer design
- API Reference - Detailed trait documentation