Overview
Fast Agent supports multiple LLM providers: native integrations for Anthropic, OpenAI, and Google, plus compatibility with dozens of others through OpenAI-compatible APIs. Each provider has specific configuration options for API keys, endpoints, and model-specific features.
Model Selection
Models are specified using a consistent format:
<provider>.<model_name>.<reasoning_effort>
Or with query parameter syntax:
<provider>.<model_name>?reasoning=<value>
Examples
```
openai.o3-mini.low
anthropic.claude-3-5-sonnet-20241022
google.gemini-2.0-flash-exp
gpt-5-mini.high
sonnet            # built-in alias
```
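For illustration, the `<provider>.<model_name>.<reasoning_effort>` format (with the optional `?reasoning=` query syntax) could be parsed along these lines. This is a hypothetical helper, not Fast Agent's actual parser; note that model names may themselves contain dots (e.g. `gemini-2.0-flash-exp`), so only a trailing segment that names a known effort level is treated as reasoning effort:

```python
import re

# Effort levels recognized as a trailing ".<effort>" segment.
KNOWN_EFFORTS = {"minimal", "low", "medium", "high"}

def parse_model_spec(spec: str):
    """Split a model spec into (provider, model, reasoning).

    Hypothetical sketch: provider is None for bare aliases like "sonnet".
    """
    reasoning = None
    if "?" in spec:  # query-parameter syntax: <spec>?reasoning=<value>
        spec, query = spec.split("?", 1)
        match = re.search(r"reasoning=([^&]+)", query)
        if match:
            reasoning = match.group(1)
    parts = spec.split(".")
    # A trailing effort segment wins over the query parameter.
    if len(parts) > 1 and parts[-1] in KNOWN_EFFORTS:
        reasoning = parts.pop()
    if len(parts) > 1:
        provider, model = parts[0], ".".join(parts[1:])
    else:
        provider, model = None, parts[0]
    return provider, model, reasoning
```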
Model Aliases
Fast Agent provides built-in aliases for common models:
Anthropic: haiku, haiku3, sonnet, sonnet35, opus, opus3
OpenAI: gpt-4.1, gpt-4.1-mini, o1, o1-mini, o3-mini, gpt-5-mini
Provider Configuration
Anthropic
```yaml
anthropic:
  api_key: "${ANTHROPIC_API_KEY}"
  base_url: "https://api.anthropic.com"   # optional override
  default_model: "claude-3-5-sonnet-20241022"
  cache_mode: "auto"                      # off, prompt, auto
  cache_ttl: "5m"                         # 5m or 1h
  reasoning: "medium"                     # or a token budget (int)
  structured_output_mode: "auto"          # auto, json, tool_use
```
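The `${ANTHROPIC_API_KEY}` placeholder is resolved from the environment. Purely as an illustration of how such `${VAR}` interpolation behaves (this is not Fast Agent's actual code), a minimal sketch:

```python
import os
import re

def expand_env(value: str) -> str:
    """Replace ${VAR} placeholders with environment values.

    Illustrative sketch: unset variables are left as-is rather than
    raising, which is an assumption about the desired behavior.
    """
    def repl(match: re.Match) -> str:
        name = match.group(1)
        return os.environ.get(name, match.group(0))
    return re.sub(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}", repl, value)
```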
anthropic.api_key
string
Anthropic API key. Can also be set via the ANTHROPIC_API_KEY environment variable.

anthropic.base_url
string
Override the API endpoint for custom deployments.

anthropic.default_model
string
Default model when the Anthropic provider is selected without an explicit model.

anthropic.headers
map
Custom headers to pass with every request.

anthropic.cache_mode
string
Caching mode:
- off: disabled
- prompt: cache tools and system prompt
- auto: same as prompt (default)

anthropic.cache_ttl
string
Cache TTL:
- 5m: standard (5 minutes)
- 1h: extended (1 hour, additional cost)

anthropic.reasoning
string | integer | boolean
Reasoning setting. Supports:
- Effort strings: minimal, low, medium, high
- Budget tokens: integer value
- Toggle: true/false (0 to disable)

anthropic.structured_output_mode
string
Structured output mode: auto, json, or tool_use.
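The three accepted shapes of the reasoning setting can be normalized into a single internal form. The helper below is a hypothetical sketch, not Fast Agent's implementation, and its choice of medium as the effort implied by a bare `true` is an assumption:

```python
EFFORT_LEVELS = {"minimal", "low", "medium", "high"}

def normalize_reasoning(value):
    """Return ("effort", str), ("budget", int), or ("off", None).

    Hypothetical normalizer for the string | integer | boolean setting.
    """
    if value is False or value == 0:
        return ("off", None)          # toggle off (false or 0)
    if value is True:
        return ("effort", "medium")   # assumed default when enabled
    if isinstance(value, int):
        return ("budget", value)      # token budget
    if isinstance(value, str) and value.lower() in EFFORT_LEVELS:
        return ("effort", value.lower())
    raise ValueError(f"unrecognized reasoning setting: {value!r}")
```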
Anthropic Web Search
```yaml
anthropic:
  web_search:
    enabled: true
    max_uses: 10
    allowed_domains: ["wikipedia.org", "*.gov"]
    user_location:
      type: "approximate"
      city: "San Francisco"
      country: "US"
      timezone: "America/Los_Angeles"
```
anthropic.web_search.enabled
Enable Anthropic built-in web_search tool.
anthropic.web_search.max_uses
Maximum number of web searches per session.
anthropic.web_search.allowed_domains
Allowlist of permitted domains. Mutually exclusive with blocked_domains.
anthropic.web_search.blocked_domains
Blocklist of excluded domains. Mutually exclusive with allowed_domains.
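Domain filtering is enforced by Anthropic server-side, so the following is purely illustrative of how patterns such as `*.gov` behave. A sketch using Python's fnmatch, with the mutual-exclusion rule made explicit:

```python
from fnmatch import fnmatch

def domain_permitted(host, allowed=None, blocked=None):
    """Illustrative check of allowed_domains / blocked_domains patterns.

    Exactly one of allowed/blocked may be given, mirroring the config rule.
    """
    if allowed is not None and blocked is not None:
        raise ValueError("allowed_domains and blocked_domains are mutually exclusive")
    if allowed is not None:
        return any(fnmatch(host, pat) for pat in allowed)
    if blocked is not None:
        return not any(fnmatch(host, pat) for pat in blocked)
    return True  # no filtering configured
```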
Anthropic Web Fetch
```yaml
anthropic:
  web_fetch:
    enabled: true
    max_uses: 20
    allowed_domains: ["*.com", "*.org"]
    citations_enabled: true
    max_content_tokens: 10000
```
anthropic.web_fetch.enabled
Enable Anthropic built-in web_fetch tool.
anthropic.web_fetch.max_uses
Maximum number of page fetches per session.
anthropic.web_fetch.citations_enabled
Enable citation tracking for fetched content.
anthropic.web_fetch.max_content_tokens
Maximum tokens per fetched page.
OpenAI
```yaml
openai:
  api_key: "${OPENAI_API_KEY}"
  base_url: "https://api.openai.com/v1"
  default_model: "gpt-5-mini"
  reasoning: "medium"
  reasoning_effort: "medium"   # minimal, low, medium, high
  text_verbosity: "medium"     # low, medium, high
  transport: "auto"            # sse, websocket, auto
  service_tier: "fast"         # fast, flex
```
openai.api_key
string
OpenAI API key. Can also be set via the OPENAI_API_KEY environment variable.

openai.base_url
string
Override the API endpoint for custom deployments.

openai.default_model
string
Default model when the OpenAI provider is selected without an explicit model.

openai.reasoning
string | integer | boolean
Unified reasoning setting (effort level or token budget).

openai.reasoning_effort
string
Default reasoning effort: minimal, low, medium, high.

openai.text_verbosity
string
Text verbosity level: low, medium, high.

openai.transport
string
Responses transport mode:
- sse: Server-Sent Events
- websocket: WebSocket connection
- auto: automatic with fallback

openai.service_tier
string
Service tier: fast (priority) or flex.
OpenAI Web Search
```yaml
openai:
  web_search:
    enabled: true
    tool_type: "web_search"          # web_search, web_search_preview
    search_context_size: "medium"    # low, medium, high
    allowed_domains: ["wikipedia.org"]
    external_web_access: true
```
Google (Gemini)
```yaml
google:
  api_key: "${GOOGLE_API_KEY}"
  base_url: "https://generativelanguage.googleapis.com"
  default_model: "gemini-2.0-flash-exp"
```
google.api_key
string
Google API key. Can also be set via the GOOGLE_API_KEY environment variable.
DeepSeek
```yaml
deepseek:
  api_key: "${DEEPSEEK_API_KEY}"
  base_url: "https://api.deepseek.com"
  default_model: "deepseek-chat"
```
xAI (Grok)
```yaml
xai:
  api_key: "${XAI_API_KEY}"
  base_url: "https://api.x.ai/v1"
  default_model: "grok-2-latest"
```
Azure OpenAI
Azure OpenAI supports three authentication modes:
Option 1: Resource Name + API Key
```yaml
default_model: "azure.my-deployment"

azure:
  api_key: "${AZURE_OPENAI_API_KEY}"
  resource_name: "your-resource-name"
  azure_deployment: "my-deployment"
  api_version: "2023-05-15"
```
Option 2: Base URL + API Key
```yaml
azure:
  api_key: "${AZURE_OPENAI_API_KEY}"
  base_url: "https://your-resource-name.openai.azure.com/"
  azure_deployment: "my-deployment"
  api_version: "2023-05-15"
```
Option 3: DefaultAzureCredential (Managed Identity)
```yaml
azure:
  use_default_azure_credential: true
  base_url: "https://your-resource-name.openai.azure.com/"
  azure_deployment: "my-deployment"
  api_version: "2023-05-15"
```
Do not use both resource_name and base_url together. Choose one authentication mode.
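The mutual-exclusion rule above can be expressed as a small validation sketch. The field names mirror the config keys shown in the three options; the helper itself is hypothetical, not part of Fast Agent:

```python
def validate_azure_config(cfg: dict) -> None:
    """Reject config dicts that mix the three Azure auth modes.

    Illustrative sketch based on the rules stated in the docs.
    """
    if cfg.get("resource_name") and cfg.get("base_url"):
        raise ValueError("use either resource_name or base_url, not both")
    if not (cfg.get("resource_name") or cfg.get("base_url")):
        raise ValueError("one of resource_name or base_url is required")
    if not cfg.get("use_default_azure_credential") and not cfg.get("api_key"):
        raise ValueError("api_key is required unless use_default_azure_credential is set")
```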
AWS Bedrock
```yaml
bedrock:
  region: "us-east-1"
  profile: "default"
  default_model: "anthropic.claude-3-5-sonnet-20241022-v2:0"
  reasoning: "medium"
  reasoning_effort: "minimal"   # minimal, low, medium, high
```
bedrock.region
string
AWS region for Bedrock (e.g., us-east-1).

bedrock.profile
string
AWS profile for authentication.
Groq
```yaml
groq:
  api_key: "${GROQ_API_KEY}"
  base_url: "https://api.groq.com/openai/v1"
  default_model: "llama-3.3-70b-versatile"
```
HuggingFace
```yaml
huggingface:
  api_key: "${HF_TOKEN}"
  base_url: "https://router.huggingface.co/v1"
  default_model: "meta-llama/Llama-3.3-70B-Instruct"
  default_provider: "groq"   # groq, fireworks-ai, cerebras
```
OpenRouter
```yaml
openrouter:
  api_key: "${OPENROUTER_API_KEY}"
  base_url: "https://openrouter.ai/api/v1"
  default_model: "anthropic/claude-3.5-sonnet"
```
TensorZero
```yaml
tensorzero:
  base_url: "http://localhost:3000"
  api_key: "${TENSORZERO_API_KEY}"   # if required
  default_model: "my-function-name"
```
Generic (Ollama, etc.)
For Ollama or other OpenAI-compatible APIs:
```yaml
generic:
  api_key: "ollama"   # or an actual key for other providers
  base_url: "http://localhost:11434/v1"
  default_model: "qwen2.5"
```
Usage:
```bash
fast-agent go --model=generic.qwen2.5
```
Codex Responses
```yaml
codexresponses:
  api_key: "${CODEX_API_KEY}"
  base_url: "https://api.codex.com"
  default_model: "codex-5-mini"
  service_tier: "fast"        # fast or unset
  text_verbosity: "medium"
  transport: "auto"
```
Open Responses
```yaml
openresponses:
  api_key: "${OPEN_RESPONSES_API_KEY}"
  base_url: "https://api.openresponses.com"
  default_model: "open-5-mini"
  reasoning: "medium"
  reasoning_effort: "medium"
  transport: "sse"            # sse, websocket, auto
  service_tier: "fast"
```
Model Override Priority
Model selection follows this priority order (highest first):

1. Agent definition: model specified in @fast.agent(model=...)
2. CLI flag: the --model command-line argument
3. Environment variable: FAST_AGENT_MODEL
4. Config file: default_model in fastagent.config.yaml
5. Built-in default: gpt-5-mini.low
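The chain above amounts to "first non-empty source wins". A minimal sketch (the function name and parameters are illustrative, not Fast Agent's API):

```python
import os

def resolve_model(agent_model=None, cli_model=None, config_default=None):
    """Return the effective model per the documented priority order."""
    return (
        agent_model                          # 1. agent definition
        or cli_model                         # 2. --model CLI flag
        or os.environ.get("FAST_AGENT_MODEL")  # 3. environment variable
        or config_default                    # 4. config file default_model
        or "gpt-5-mini.low"                  # 5. built-in default
    )
```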
Using Models in Agents
```python
# Use the default model
@fast.agent(
    instruction="You are a helpful assistant"
)

# Specify a model on the agent
@fast.agent(
    instruction="Complex reasoning task",
    model="openai.o3-mini.high"
)

# Use a built-in model alias
@fast.agent(
    instruction="Fast responses",
    model="haiku"
)

# Use a custom alias
@fast.agent(
    instruction="Planning agent",
    model="$system.plan"
)
```
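A `$group.name` reference such as `$system.plan` presumably resolves against the model_aliases mapping in fastagent.config.yaml. A hedged sketch of that lookup (the resolution rules here are an assumption, not Fast Agent's documented behavior):

```python
def resolve_alias(spec: str, model_aliases: dict) -> str:
    """Resolve "$group.name" alias references; pass other specs through."""
    if not spec.startswith("$"):
        return spec  # plain model spec or built-in alias
    group, _, name = spec[1:].partition(".")
    try:
        return model_aliases[group][name]
    except KeyError:
        raise ValueError(f"unknown model alias: {spec}") from None
```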
Complete Multi-Provider Example
```yaml
default_model: gpt-5-mini.low

model_aliases:
  system:
    default: "responses.gpt-5-mini.low"
    fast: "anthropic.claude-3-5-haiku-20241022"
    reasoning: "openai.o3-mini.high"
    local: "generic.qwen2.5"

anthropic:
  api_key: "${ANTHROPIC_API_KEY}"
  default_model: "claude-3-5-sonnet-20241022"
  cache_mode: "auto"
  web_search:
    enabled: true
    max_uses: 10

openai:
  api_key: "${OPENAI_API_KEY}"
  default_model: "gpt-5-mini"
  reasoning_effort: "medium"
  service_tier: "fast"
  web_search:
    enabled: true

google:
  api_key: "${GOOGLE_API_KEY}"
  default_model: "gemini-2.0-flash-exp"

generic:
  api_key: "ollama"
  base_url: "http://localhost:11434/v1"
  default_model: "qwen2.5"
```
See Also