## Model Tiers

Models are grouped into five tiers:

- Frontier
- Smart
- Balanced
- Fast
- Local

### Frontier

Most capable models for complex reasoning, research, and advanced tasks.
| Model | Provider | Context | Output price (per 1M tokens) | Features |
|---|---|---|---|---|
| claude-opus-4-6 | Anthropic | 200K | $75 | Tools, Vision, 32K output |
| o3 | OpenAI | 200K | $40 | Reasoning, 100K output |
| gemini-2.5-pro | Google | 1M | $10 | Vision, Code exec, 65K output |
| grok-3 | xAI | 131K | $15 | Tools, Vision |
| samba-llama-3.1-405b | SambaNova | 4K | $10 | 405B parameters |
## Model Aliases

AgentOS provides convenient aliases for quick model selection.

### Using Aliases
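The alias table itself is not preserved on this page, so the snippet below is an illustrative sketch only: the tier-style alias names (`frontier`, `fast`, `local`) and the mapping are assumptions, not the actual AgentOS alias list.

```python
# Hypothetical alias map -- the real AgentOS alias names are not shown on
# this page; tier-style names are assumed here for illustration only.
ALIASES = {
    "frontier": "claude-opus-4-6",
    "fast": "gpt-4o-mini",
    "local": "hf-mistral-7b",
}

def resolve(name: str) -> str:
    """Return the concrete model id for an alias; pass concrete ids through."""
    return ALIASES.get(name, name)

print(resolve("frontier"))  # claude-opus-4-6
print(resolve("o3"))        # o3 (concrete ids are passed through unchanged)
```

Pass-through for concrete model ids means callers can accept either form in the same parameter.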
## Model Capabilities

### Tool Use (Function Calling)

Models that support structured tool invocation (43 models in total):
- All Claude models (Opus, Sonnet, Haiku)
- All OpenAI models (GPT-4o, o3, o4-mini, GPT-4.1)
- All Gemini models (Pro, Flash)
- DeepSeek (Chat, Reasoner)
- Llama 3.3 70B (all providers)
- Cohere Command series
- xAI Grok series
- Mistral Large
- AI21 Jamba series
- All Chinese models (Qwen, GLM, Moonshot, ERNIE)
- AWS Bedrock models
### Vision Support

Models that can process images (14 models in total):
- Claude Opus 4.6, Sonnet 4.6, Haiku 4.5
- GPT-4o, GPT-4.1, o3, o4-mini, GPT-4o mini
- Gemini 2.5 Pro, Gemini 2.5 Flash
- Grok-2, Grok-3
- AWS Bedrock Claude, Nova Pro
- GitHub Copilot GPT-4o
### Long Context (>100K tokens)

Models with extended context windows (30+ models with 100K+ context):
- 1M tokens: Gemini 2.5 (Flash, Pro), GPT-4.1, Qwen Turbo
- 300K tokens: AWS Bedrock Nova Pro
- 256K tokens: Cohere Command series, AI21 Jamba, MiniMax ABAB 7
- 200K tokens: All Claude models, OpenAI o3/o4-mini, Perplexity Sonar, Bedrock Claude
- 128K+: Most other modern models
## Pricing Comparison

Prices are shown as input / output per 1 million tokens (USD).

### Best Value Models
| Model | Input | Output | Use Case |
|---|---|---|---|
| qwen-turbo | $0.05 | $0.15 | High-volume, 1M context |
| deepseek-chat | $0.14 | $0.28 | General purpose |
| gpt-4o-mini | $0.15 | $0.60 | Fast OpenAI option |
| gemini-2.5-flash | $0.15 | $0.60 | Google ecosystem |
| jamba-1.5-mini | $0.20 | $0.40 | 256K context |
| hf-llama-3.3-70b | $0.36 | $0.36 | Open source |
| hf-mistral-7b | Free | Free | Development/testing |
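Per-request cost follows directly from these figures: tokens divided by one million, multiplied by the per-1M price, summed over input and output. A minimal sketch using prices from the table above (the helper function is ours for illustration, not an AgentOS API):

```python
# Per-1M-token (input, output) prices in USD, from the Best Value table above.
PRICES = {
    "qwen-turbo": (0.05, 0.15),
    "deepseek-chat": (0.14, 0.28),
    "gpt-4o-mini": (0.15, 0.60),
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD for one request: tokens / 1M * per-1M price, input + output."""
    in_price, out_price = PRICES[model]
    return (input_tokens / 1_000_000) * in_price \
         + (output_tokens / 1_000_000) * out_price

# A 10K-token prompt with a 2K-token reply on deepseek-chat:
print(f"${request_cost('deepseek-chat', 10_000, 2_000):.4f}")  # $0.0020
```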
### Most Expensive (Frontier)

| Model | Input | Output | Justification |
|---|---|---|---|
| claude-opus-4-6 | $15 | $75 | Most capable reasoning |
| o3 | $10 | $40 | Advanced reasoning |
| gemini-2.5-pro | $1.25 | $10 | 1M context + code exec |
| samba-llama-3.1-405b | $5 | $10 | 405B parameters |
## Model Selection Guide

### By Use Case

### By Budget
- Under $0.50/1M tokens: Qwen Turbo, DeepSeek Chat, HF models, local (free)
- $0.50–$2/1M: Most Balanced tier models
- $2–$5/1M: Smart tier models
- Over $5/1M: Frontier models (use sparingly)
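Budget-driven selection can be applied mechanically: pick the cheapest model whose context window covers the request. A sketch using only models whose windows and output prices appear on this page (the selection helper is illustrative, not an AgentOS API):

```python
# (output price per 1M tokens in USD, context window in tokens),
# taken from the pricing and context tables on this page.
MODELS = {
    "qwen-turbo": (0.15, 1_000_000),
    "gemini-2.5-flash": (0.60, 1_000_000),
    "jamba-1.5-mini": (0.40, 256_000),
    "claude-opus-4-6": (75.0, 200_000),
}

def cheapest_for(context_needed: int) -> str:
    """Cheapest model (by output price) whose window fits the request."""
    candidates = [(price, name) for name, (price, ctx) in MODELS.items()
                  if ctx >= context_needed]
    if not candidates:
        raise ValueError("no model fits this context size")
    return min(candidates)[1]

print(cheapest_for(500_000))  # qwen-turbo
```

Real routing would also weigh capabilities (tools, vision) and quality, not just price; see Routing Logic below.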
## CLI Usage

## Programmatic Access

## HTTP API
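This page does not preserve the actual HTTP API reference, so the host, endpoint path, and header below are assumptions for illustration. The sketch builds the request object without sending it; with an AgentOS server running, `urllib.request.urlopen(req)` would perform the call:

```python
import urllib.request

# Hypothetical endpoint -- the real AgentOS host and path are assumptions.
req = urllib.request.Request(
    "http://localhost:8000/v1/models",     # assumed local AgentOS server
    headers={"Accept": "application/json"},
    method="GET",
)
print(req.full_url, req.get_method())
# urllib.request.urlopen(req) would then return the model list as JSON.
```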
## Model Metadata

Each model entry includes the model identifier, provider, context window, per-1M-token pricing, and supported feature flags.

## Next Steps
- **Provider Details**: Learn about each provider's setup.
- **Routing Logic**: Understand automatic model selection.