Overview
The Hive framework uses LiteLLM to provide unified access to multiple LLM providers through a single interface. This allows you to switch between providers seamlessly without changing your agent code.Supported Providers
Anthropic
Claude Opus, Sonnet, Haiku models with extended context
OpenAI
GPT-4o, GPT-4 Turbo, GPT-3.5, o1 reasoning models
Gemini Pro, Gemini Flash with multimodal support
DeepSeek
DeepSeek Chat, Coder, Reasoner models
Groq
Ultra-fast inference with Llama, Mixtral models
Cerebras
Fast inference with GLM and Qwen models
Quick Setup via Quickstart
The interactive quickstart script guides you through provider configuration:Subscription Modes (No API Key Purchase)
Claude Code Subscription
Use your Claude Max/Pro plan for API access.Setup: Run
claude CLI to authenticate, then select option 1 in quickstart.Models: claude-opus-4-6, claude-sonnet-4-5-20250929ZAI Code Subscription
Use your ZAI Code plan for API access.Setup: Provide ZAI API key when prompted.Models: glm-5 (32K context)
API Key Providers
Anthropic (Recommended)
Get API key: https://console.anthropic.com/settings/keysModels: claude-opus-4-6, claude-sonnet-4-5, claude-haiku-4-5
OpenAI
Get API key: https://platform.openai.com/api-keysModels: gpt-5.2, gpt-5-mini, gpt-4o, gpt-4-turbo
Google Gemini (Free Tier)
Get API key: https://aistudio.google.com/apikeyModels: gemini-3-flash-preview, gemini-3.1-pro-preview
Groq (Fast, Free Tier)
Get API key: https://console.groq.com/keysModels: moonshotai/kimi-k2-instruct-0905, openai/gpt-oss-120b
Cerebras (Fast, Free Tier)
Get API key: https://cloud.cerebras.ai/Models: zai-glm-4.7, qwen3-235b-a22b-instruct-2507
Manual Configuration
Set Environment Variables
Add your API key to your shell configuration:~/.bashrc or ~/.zshrc for persistence:
Create Configuration File
Create~/.hive/configuration.json:
Provider-Specific Setup
Anthropic (Claude)
- API Key
- Claude Code Subscription
claude-opus-4-6- Most capable (recommended)claude-sonnet-4-5-20250929- Best balanceclaude-sonnet-4-20250514- Fast + capableclaude-haiku-4-5-20251001- Fast + cheap
OpenAI
- API Key
- Codex Subscription
gpt-5.2- Most capable (recommended)gpt-5-mini- Fast + cheapgpt-4o- Multimodal flagshipgpt-4-turbo- Fast GPT-4o1- Reasoning model
Google Gemini
gemini-3-flash-preview- Fast (recommended)gemini-3.1-pro-preview- Best qualitygemini-1.5-pro- Extended context (2M tokens)
DeepSeek
deepseek-chat- General purposedeepseek-coder- Code generationdeepseek-reasoner- Chain-of-thought reasoning
Groq
moonshotai/kimi-k2-instruct-0905- Best quality (recommended)openai/gpt-oss-120b- Fast reasoningllama3-70b- Llama 3 70Bmixtral-8x7b- Mixtral MoE
Cerebras
zai-glm-4.7- Best quality (recommended)qwen3-235b-a22b-instruct-2507- Frontier reasoning
ZAI Code
Using in Code
Basic Usage
With Custom API Key
With Custom API Base
Async Completion
Streaming
With Tools
Model Selection Guide
By Use Case
Complex Reasoning
Complex Reasoning
Best:
claude-opus-4-6(Anthropic)gpt-5.2(OpenAI)o1(OpenAI - specialized reasoning)
Fast Iteration
Fast Iteration
Best:
claude-haiku-4-5(Anthropic)gpt-5-mini(OpenAI)gemini-3-flash(Google)llama3-70bon Groq (ultra-fast)
Code Generation
Code Generation
Best:
deepseek-coder(DeepSeek)claude-sonnet-4-5(Anthropic)gpt-4o(OpenAI)
Cost Optimization
Cost Optimization
Best:
gemini-3-flash(Free tier)llama3-70bon Groq (Free tier)gpt-5-mini(Cheap)
Long Context
Long Context
Best:
claude-opus-4-6(200K tokens)gemini-1.5-pro(2M tokens)gpt-4-turbo(128K tokens)
By Budget
| Budget | Model | Provider | Notes |
|---|---|---|---|
| Free | gemini-3-flash | Free tier available | |
| Free | llama3-70b | Groq | Fast, free tier |
| Low | gpt-5-mini | OpenAI | $0.10/1M tokens |
| Low | claude-haiku-4-5 | Anthropic | $0.25/1M tokens |
| Medium | claude-sonnet-4-5 | Anthropic | $3/1M tokens |
| Medium | gpt-4o | OpenAI | $5/1M tokens |
| High | claude-opus-4-6 | Anthropic | $15/1M tokens |
| High | gpt-5.2 | OpenAI | $20/1M tokens |
Advanced Features
Rate Limit Handling
Automatic retry with exponential backoff:Token Estimation
Failed Request Debugging
Failed requests are automatically dumped to:- Full request payload
- Error type and attempt number
- Token count estimate
- Timestamp
Troubleshooting
API Key Not Found
API Key Not Found
Error:
AuthenticationError: API key not foundSolution:Rate Limit Exceeded
Rate Limit Exceeded
Error:
RateLimitError: 429 Rate limit exceededSolution:- Framework retries automatically with backoff
- Check server-provided
retry-afterheader - Reduce concurrency
- Upgrade to higher tier plan
Empty Response
Empty Response
Error: Empty content returnedCauses:
- Rate limit (stealth 200 instead of 429)
- Context window exceeded
- finish_reason=length (max_tokens too low)
- Check
~/.hive/failed_requests/for dumps - Increase
max_tokens - Reduce context length
Context Window Exceeded
Context Window Exceeded
Error:
BadRequestError: maximum context length exceededSolution:- Use model with larger context (e.g., claude-opus-4-6)
- Implement message compaction
- Summarize earlier conversation turns
Next Steps
Credential Management
Securely manage API keys
Self-Hosting
Deploy your own Hive instance