
Overview

Fast Agent supports multiple LLM providers with native integrations for Anthropic, OpenAI, Google, and compatibility with dozens of others through OpenAI-compatible APIs. Each provider has specific configuration options for API keys, endpoints, and model-specific features.

Model Selection

Models are specified using a consistent format, in which both the provider prefix and the reasoning-effort suffix are optional:
<provider>.<model_name>.<reasoning_effort>
Or with query-parameter syntax:
<provider>.<model_name>?reasoning=<value>

Examples

  • openai.o3-mini.low
  • anthropic.claude-3-5-sonnet-20241022
  • google.gemini-2.0-flash-exp
  • gpt-5-mini.high
  • sonnet (using built-in alias)
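
The way these strings decompose can be sketched with a small parser. This is illustrative only, not fast-agent's actual implementation; the `PROVIDERS` and `EFFORTS` sets below are abbreviated assumptions:

```python
# Illustrative decomposition of fast-agent model strings; NOT the
# library's real parser. PROVIDERS and EFFORTS are abbreviated guesses.
EFFORTS = {"minimal", "low", "medium", "high"}
PROVIDERS = {"anthropic", "openai", "google", "azure", "groq", "generic"}

def parse_model(spec):
    """Return (provider, model, reasoning) from a model string."""
    reasoning = None
    if "?" in spec:  # query-parameter syntax: <model>?reasoning=<value>
        spec, query = spec.split("?", 1)
        key, _, value = query.partition("=")
        if key == "reasoning":
            reasoning = value
    elif spec.rsplit(".", 1)[-1] in EFFORTS:  # suffix syntax: <model>.<effort>
        spec, reasoning = spec.rsplit(".", 1)
    provider = None
    head, _, rest = spec.partition(".")  # optional leading provider segment
    if head in PROVIDERS and rest:
        provider, spec = head, rest
    return provider, spec, reasoning

print(parse_model("openai.o3-mini.low"))  # ('openai', 'o3-mini', 'low')
```

Note that model names themselves may contain dots (e.g. gpt-4.1-mini), which is why only a recognized effort suffix or provider prefix is split off.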

Model Aliases

Fast Agent provides built-in aliases for common models:
  • Anthropic: haiku, haiku3, sonnet, sonnet35, opus, opus3
  • OpenAI: gpt-4.1, gpt-4.1-mini, o1, o1-mini, o3-mini, gpt-5-mini
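
An alias can stand in anywhere a full model string is accepted, for example as the config-file default (a minimal fragment; assumes the built-in sonnet alias resolves as described above):

```yaml
# fastagent.config.yaml
default_model: sonnet  # built-in alias for the current Claude Sonnet model
```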

Provider Configuration

Anthropic

anthropic:
  api_key: "${ANTHROPIC_API_KEY}"
  base_url: "https://api.anthropic.com"  # optional override
  default_model: "claude-3-5-sonnet-20241022"
  cache_mode: "auto"  # off, prompt, auto
  cache_ttl: "5m"  # 5m or 1h
  reasoning: "medium"  # or token budget (int)
  structured_output_mode: "auto"  # auto, json, tool_use
anthropic.api_key
string
Anthropic API key. Can also be set via ANTHROPIC_API_KEY environment variable.
anthropic.base_url
string
Override API endpoint for custom deployments.
anthropic.default_model
string
Default model when Anthropic provider is selected without an explicit model.
anthropic.default_headers
object
Custom headers to pass with every request.
anthropic.cache_mode
string
default:"auto"
Caching mode:
  • off: Disabled
  • prompt: Cache tools and system prompt
  • auto: Same as prompt (default)
anthropic.cache_ttl
string
default:"5m"
Cache TTL:
  • 5m: Standard (5 minutes)
  • 1h: Extended (1 hour, additional cost)
anthropic.reasoning
string | integer | boolean
Reasoning setting. Supports:
  • Effort strings: minimal, low, medium, high
  • Budget tokens: integer value
  • Toggle: true/false (false or 0 disables reasoning)
anthropic.structured_output_mode
string
default:"auto"
Structured output mode: auto, json, or tool_use
anthropic:
  web_search:
    enabled: true
    max_uses: 10
    allowed_domains: ["wikipedia.org", "*.gov"]
    user_location:
      type: "approximate"
      city: "San Francisco"
      country: "US"
      timezone: "America/Los_Angeles"
anthropic.web_search.enabled
boolean
default:"false"
Enable Anthropic built-in web_search tool.
anthropic.web_search.max_uses
integer
Maximum number of web searches per session.
anthropic.web_search.allowed_domains
array
List of allowed domains. Mutually exclusive with blocked_domains.
anthropic.web_search.blocked_domains
array
List of blocked domains. Mutually exclusive with allowed_domains.

Anthropic Web Fetch

anthropic:
  web_fetch:
    enabled: true
    max_uses: 20
    allowed_domains: ["*.com", "*.org"]
    citations_enabled: true
    max_content_tokens: 10000
anthropic.web_fetch.enabled
boolean
default:"false"
Enable Anthropic built-in web_fetch tool.
anthropic.web_fetch.citations_enabled
boolean
default:"false"
Enable citation tracking for fetched content.
anthropic.web_fetch.max_content_tokens
integer
Maximum tokens per fetched page.

OpenAI

openai:
  api_key: "${OPENAI_API_KEY}"
  base_url: "https://api.openai.com/v1"
  default_model: "gpt-5-mini"
  reasoning: "medium"
  reasoning_effort: "medium"  # minimal, low, medium, high
  text_verbosity: "medium"  # low, medium, high
  transport: "auto"  # sse, websocket, auto
  service_tier: "fast"  # fast, flex
openai.api_key
string
OpenAI API key. Can also be set via OPENAI_API_KEY environment variable.
openai.default_model
string
Default model when OpenAI provider is selected without an explicit model.
openai.reasoning
string | integer | boolean
Unified reasoning setting (effort level or budget).
openai.reasoning_effort
string
default:"medium"
Default reasoning effort: minimal, low, medium, high
openai.text_verbosity
string
default:"medium"
Text verbosity level: low, medium, high
openai.transport
string
Responses transport mode:
  • sse: Server-Sent Events
  • websocket: WebSocket connection
  • auto: Automatic with fallback
openai.service_tier
string
Service tier: fast (priority) or flex
openai:
  web_search:
    enabled: true
    tool_type: "web_search"  # web_search, web_search_preview
    search_context_size: "medium"  # low, medium, high
    allowed_domains: ["wikipedia.org"]
    external_web_access: true

Google (Gemini)

google:
  api_key: "${GOOGLE_API_KEY}"
  base_url: "https://generativelanguage.googleapis.com"
  default_model: "gemini-2.0-flash-exp"
google.api_key
string
Google API key. Can also be set via GOOGLE_API_KEY environment variable.

DeepSeek

deepseek:
  api_key: "${DEEPSEEK_API_KEY}"
  base_url: "https://api.deepseek.com"
  default_model: "deepseek-chat"

xAI (Grok)

xai:
  api_key: "${XAI_API_KEY}"
  base_url: "https://api.x.ai/v1"
  default_model: "grok-2-latest"

Azure OpenAI

Azure OpenAI supports three authentication modes:

Option 1: Resource Name + API Key

default_model: "azure.my-deployment"

azure:
  api_key: "${AZURE_OPENAI_API_KEY}"
  resource_name: "your-resource-name"
  azure_deployment: "my-deployment"
  api_version: "2023-05-15"

Option 2: Base URL + API Key

azure:
  api_key: "${AZURE_OPENAI_API_KEY}"
  base_url: "https://your-resource-name.openai.azure.com/"
  azure_deployment: "my-deployment"
  api_version: "2023-05-15"

Option 3: DefaultAzureCredential (Managed Identity)

azure:
  use_default_azure_credential: true
  base_url: "https://your-resource-name.openai.azure.com/"
  azure_deployment: "my-deployment"
  api_version: "2023-05-15"
Do not use both resource_name and base_url together. Choose one authentication mode.
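
The constraint above can be expressed as a small validation routine. This is an illustrative sketch, not fast-agent's actual code, and the mode names it returns are made up for the example:

```python
def validate_azure(cfg):
    """Check that exactly one of the three Azure auth modes is configured."""
    if cfg.get("resource_name") and cfg.get("base_url"):
        raise ValueError("set either resource_name or base_url, not both")
    if cfg.get("use_default_azure_credential"):
        return "default_azure_credential"  # Option 3
    if not cfg.get("api_key"):
        raise ValueError("api_key is required without DefaultAzureCredential")
    return "resource_name" if cfg.get("resource_name") else "base_url"
```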

AWS Bedrock

bedrock:
  region: "us-east-1"
  profile: "default"
  default_model: "anthropic.claude-3-5-sonnet-20241022-v2:0"
  reasoning: "medium"
  reasoning_effort: "minimal"  # minimal, low, medium, high
bedrock.region
string
AWS region for Bedrock (e.g., us-east-1).
bedrock.profile
string
default:"default"
AWS profile for authentication.

Groq

groq:
  api_key: "${GROQ_API_KEY}"
  base_url: "https://api.groq.com/openai/v1"
  default_model: "llama-3.3-70b-versatile"

HuggingFace

huggingface:
  api_key: "${HF_TOKEN}"
  base_url: "https://router.huggingface.co/v1"
  default_model: "meta-llama/Llama-3.3-70B-Instruct"
  default_provider: "groq"  # groq, fireworks-ai, cerebras

OpenRouter

openrouter:
  api_key: "${OPENROUTER_API_KEY}"
  base_url: "https://openrouter.ai/api/v1"
  default_model: "anthropic/claude-3.5-sonnet"

TensorZero

tensorzero:
  base_url: "http://localhost:3000"
  api_key: "${TENSORZERO_API_KEY}"  # if required
  default_model: "my-function-name"

Generic (Ollama, etc.)

For Ollama or other OpenAI-compatible APIs:
generic:
  api_key: "ollama"  # or actual key for other providers
  base_url: "http://localhost:11434/v1"
  default_model: "qwen2.5"
Usage:
fast-agent go --model=generic.qwen2.5

Codex Responses

codexresponses:
  api_key: "${CODEX_API_KEY}"
  base_url: "https://api.codex.com"
  default_model: "codex-5-mini"
  service_tier: "fast"  # fast or unset
  text_verbosity: "medium"
  transport: "auto"

Open Responses

openresponses:
  api_key: "${OPEN_RESPONSES_API_KEY}"
  base_url: "https://api.openresponses.com"
  default_model: "open-5-mini"
  reasoning: "medium"
  reasoning_effort: "medium"
  transport: "sse"  # sse, websocket, auto
  service_tier: "fast"

Model Override Priority

Model selection follows this priority order:
  1. Agent definition: Model specified in @fast.agent(model=...)
  2. CLI flag: --model command-line argument
  3. Environment variable: FAST_AGENT_MODEL
  4. Config file: default_model in fastagent.config.yaml
  5. Default: gpt-5-mini.low
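
The priority chain above amounts to a first-match resolver, sketched here for illustration (fast-agent's internals may differ):

```python
import os

FALLBACK = "gpt-5-mini.low"

def resolve_model(agent_model=None, cli_model=None, config_default=None):
    """First match wins: agent > CLI > env var > config file > built-in."""
    return (agent_model
            or cli_model
            or os.environ.get("FAST_AGENT_MODEL")
            or config_default
            or FALLBACK)
```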

Using Models in Agents

# Use default model
@fast.agent(
    instruction="You are a helpful assistant"
)

# Specify model in agent
@fast.agent(
    instruction="Complex reasoning task",
    model="openai.o3-mini.high"
)

# Use model alias
@fast.agent(
    instruction="Fast responses",
    model="haiku"
)

# Use custom alias
@fast.agent(
    instruction="Planning agent",
    model="$system.plan"
)

Complete Multi-Provider Example

default_model: gpt-5-mini.low

model_aliases:
  system:
    default: "responses.gpt-5-mini.low"
    fast: "anthropic.claude-3-5-haiku-20241022"
    reasoning: "openai.o3-mini.high"
    local: "generic.qwen2.5"

anthropic:
  api_key: "${ANTHROPIC_API_KEY}"
  default_model: "claude-3-5-sonnet-20241022"
  cache_mode: "auto"
  web_search:
    enabled: true
    max_uses: 10

openai:
  api_key: "${OPENAI_API_KEY}"
  default_model: "gpt-5-mini"
  reasoning_effort: "medium"
  service_tier: "fast"
  web_search:
    enabled: true

google:
  api_key: "${GOOGLE_API_KEY}"
  default_model: "gemini-2.0-flash-exp"

generic:
  api_key: "ollama"
  base_url: "http://localhost:11434/v1"
  default_model: "qwen2.5"
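
The ${VAR} placeholders used throughout these examples are standard environment-variable substitution. A hypothetical sketch of that expansion (fast-agent's loader may handle missing variables or defaults differently):

```python
import os
import re

def expand_env(value):
    """Replace ${NAME} with the value of environment variable NAME."""
    return re.sub(r"\$\{(\w+)\}",
                  lambda m: os.environ.get(m.group(1), ""), value)

os.environ["ANTHROPIC_API_KEY"] = "sk-test"
print(expand_env("${ANTHROPIC_API_KEY}"))  # sk-test
```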
