Clients configure how BAML calls LLM providers, including authentication, model selection, retry policies, and provider-specific options.

Syntax

client<llm> ClientName {
  provider "provider-name"
  retry_policy PolicyName  // Optional
  options {
    // Provider-specific configuration
  }
}

Basic Examples

OpenAI

client<llm> GPT4 {
  provider openai
  options {
    model gpt-4o
    api_key env.OPENAI_API_KEY
  }
}

Anthropic

client<llm> Claude {
  provider anthropic
  options {
    model claude-sonnet-4-5-20250929
    api_key env.ANTHROPIC_API_KEY
    max_tokens 2048
  }
}

Google AI

client<llm> Gemini {
  provider google-ai
  options {
    model gemini-2.5-flash
    api_key env.GOOGLE_API_KEY
  }
}

Client Components

client<llm> (required)
Declares an LLM client configuration.

ClientName (identifier, required)
Unique name for this client. Use PascalCase convention.

client<llm> GPT4Turbo { }
client<llm> ClaudeHaiku { }

provider (string, required)
The LLM provider name. See Providers for supported values.

provider openai
provider anthropic
provider "vertex-ai"

retry_policy (identifier, optional)
Reference to a retry policy configuration.

retry_policy ExponentialBackoff

options (block, required)
Provider-specific configuration block. Contents vary by provider.

Providers

BAML supports multiple LLM providers:
Provider         Value               Documentation
OpenAI           openai              OpenAI →
Anthropic        anthropic           Anthropic →
Google AI        google-ai           Google AI →
Vertex AI        vertex-ai           Vertex AI →
AWS Bedrock      aws-bedrock         AWS Bedrock →
Azure OpenAI     azure-openai        Azure OpenAI →
OpenRouter       openrouter          OpenRouter →
Ollama           ollama              Ollama →
OpenAI Generic   openai-generic      For OpenAI-compatible APIs
Fallback         baml-fallback       Fallback Strategy →
Round Robin      baml-round-robin    Round Robin →

Common Options

While options vary by provider, these are commonly supported:
model (string, required)
The model identifier.

model gpt-4o
model "claude-3-opus-20240229"

api_key (string)
API authentication key. Typically references an environment variable.

api_key env.OPENAI_API_KEY
api_key env.ANTHROPIC_API_KEY

base_url (string)
Override the default API endpoint.

base_url "https://api.openai.com/v1"
base_url env.CUSTOM_ENDPOINT

max_tokens (int)
Maximum tokens in the response.

max_tokens 2048
max_tokens 4096

Note: Some reasoning models (such as O1 and O3) don't support max_tokens. Use max_completion_tokens instead, or set max_tokens to null.

temperature (float)
Sampling temperature (0.0 to 2.0).

temperature 0.7
temperature 0.0  // Deterministic

headers (map)
Custom HTTP headers.

headers {
  "X-Custom-Header" "value"
  "Authorization" env.CUSTOM_AUTH
}

Shorthand Syntax

For simple cases, use the inline provider/model syntax:
function MyFunction(input: string) -> string {
  client "openai/gpt-4o"
  prompt #"..."#
}
This is equivalent to:
client<llm> AutoGeneratedClient {
  provider openai
  options {
    model gpt-4o
    api_key env.OPENAI_API_KEY  // Uses default
  }
}

function MyFunction(input: string) -> string {
  client AutoGeneratedClient
  prompt #"..."#
}

Retry Policies

Define retry behavior for failed requests:
retry_policy ExponentialBackoff {
  max_retries 3
  strategy {
    type exponential_backoff
  }
}

retry_policy ConstantDelay {
  max_retries 5
  strategy {
    type constant_delay
    delay_ms 1000
  }
}

client<llm> ResilientClient {
  provider openai
  retry_policy ExponentialBackoff
  options {
    model gpt-4o
    api_key env.OPENAI_API_KEY
  }
}
See Retry Policies for details.
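The exponential_backoff strategy also accepts tuning parameters for the initial delay, growth factor, and delay cap. A sketch (parameter names follow the retry-policy documentation; verify against your BAML version):

retry_policy TunedBackoff {
  max_retries 4
  strategy {
    type exponential_backoff
    delay_ms 200        // Initial delay before the first retry
    multiplier 1.5      // Delay grows by this factor each attempt
    max_delay_ms 10000  // Upper bound on the delay between retries
  }
}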

Advanced Examples

Azure OpenAI

client<llm> AzureGPT {
  provider azure-openai
  options {
    resource_name "my-resource"
    deployment_id "gpt-4o-deployment"
    api_version "2024-02-01"
    api_key env.AZURE_OPENAI_API_KEY
  }
}

AWS Bedrock

client<llm> BedrockClaude {
  provider aws-bedrock
  options {
    model "anthropic.claude-3-5-sonnet-20240620-v1:0"
    region "us-east-1"
    inference_configuration {
      max_tokens 2048
    }
  }
}

Vertex AI

client<llm> VertexGemini {
  provider vertex-ai
  options {
    model gemini-2.5-flash
    location us-central1
    credentials env.GOOGLE_APPLICATION_CREDENTIALS_CONTENT
  }
}

OpenAI with Custom Settings

client<llm> CustomGPT {
  provider openai
  options {
    model gpt-4o
    api_key env.OPENAI_API_KEY
    temperature 0.7
    max_tokens 2048
    top_p 0.9
    frequency_penalty 0.5
    presence_penalty 0.5
  }
}

Anthropic with Prompt Caching

client<llm> ClaudeWithCaching {
  provider anthropic
  options {
    model claude-3-haiku-20240307
    api_key env.ANTHROPIC_API_KEY
    max_tokens 1000
    allowed_role_metadata ["cache_control"]
    headers {
      "anthropic-beta" "prompt-caching-2024-07-31"
    }
  }
}
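With allowed_role_metadata configured, a function's prompt can attach cache_control metadata to a message role so Anthropic caches that prefix. A sketch (the function name and prompt content are illustrative):

function AnalyzeDoc(doc: string) -> string {
  client ClaudeWithCaching
  prompt #"
    {{ _.role("system", cache_control={"type": "ephemeral"}) }}
    You are a document analyst. Reference material:
    {{ doc }}
    {{ _.role("user") }}
    Summarize the key points.
    {{ ctx.output_format }}
  "#
}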

OpenRouter

client<llm> OpenRouterClient {
  provider openrouter
  options {
    model "anthropic/claude-3-haiku"
    api_key env.OPENROUTER_API_KEY
    headers {
      "X-Title" "My App"
      "HTTP-Referer" "https://myapp.com"
    }
  }
}

Ollama (Local)

client<llm> LocalLlama {
  provider ollama
  options {
    model llama3.1
    base_url "http://localhost:11434"
  }
}
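For other OpenAI-compatible servers (vLLM, LM Studio, and similar), use the openai-generic provider from the table above. A sketch assuming a vLLM-style endpoint on localhost:8000 (the model name is illustrative):

client<llm> LocalVLLM {
  provider openai-generic
  options {
    base_url "http://localhost:8000/v1"
    model "meta-llama/Llama-3.1-8B-Instruct"
  }
}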

Strategy Clients

Fallback Strategy

Try clients in sequence until one succeeds:
client<llm> ResilientClient {
  provider baml-fallback
  options {
    strategy [
      GPT4Turbo
      GPT35
      Claude
    ]
  }
}

Round Robin Strategy

Distribute requests across clients:
client<llm> LoadBalanced {
  provider baml-round-robin
  options {
    start 0
    strategy [
      GPT4
      Claude
      Gemini
    ]
  }
}
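Strategy clients reference other clients by name, including other strategy clients, so they can be composed. A sketch assuming the clients above are defined, which load-balances first and falls back to a local model:

client<llm> BalancedWithFallback {
  provider baml-fallback
  options {
    strategy [
      LoadBalanced   // Round-robin across the primary clients
      LocalLlama     // Last-resort local model
    ]
  }
}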

Media Handling

Configure how media (images, audio, etc.) are sent to models:
client<llm> CustomMediaHandling {
  provider openai
  options {
    model gpt-4o
    api_key env.OPENAI_API_KEY
    media_url_handler {
      image "send_base64"       // Convert URLs to base64
      audio "send_url"          // Send URL directly
      pdf "send_base64"         // Convert to base64
      video "send_url"          // Send URL directly
    }
  }
}
Options:
  • send_url: Pass URL directly to model
  • send_base64: Download and convert to base64
  • send_url_add_mime_type: Send URL with MIME type header

Reasoning Models

OpenAI O-series

// O1/O3/O4 models don't support max_tokens
client<llm> OpenAIO1 {
  provider openai
  options {
    model "o4-mini"
    api_key env.OPENAI_API_KEY
    max_tokens null  // Must be null or omitted
    max_completion_tokens 2048  // Use this instead
  }
}

Anthropic with Extended Thinking

client<llm> ClaudeThinking {
  provider anthropic
  options {
    model "claude-3-7-sonnet-20250219"
    api_key env.ANTHROPIC_API_KEY
    max_tokens 2048
    thinking {
      type "enabled"
      budget_tokens 1024
    }
  }
}

Gemini with Thinking

client<llm> GeminiThinking {
  provider google-ai
  options {
    model "gemini-2.5-pro"
    api_key env.GOOGLE_API_KEY
    generationConfig {
      thinkingConfig {
        thinkingBudget 1024
        includeThoughts true
      }
    }
  }
}

Runtime Client Selection

Override the client at runtime:
from baml_client import b

result = await b.MyFunction(
    input="data",
    baml_options={"client": "GPT4Turbo"}
)
Useful for:
  • A/B testing
  • Gradual rollouts
  • User-specific routing
  • Cost optimization
See Client Registry for details.

Environment Variables

Reference environment variables with the env. prefix:
options {
  api_key env.OPENAI_API_KEY
  base_url env.CUSTOM_BASE_URL
  organization env.OPENAI_ORG_ID
}
Default environment variable names by provider:
Provider        Default Variable
OpenAI          OPENAI_API_KEY
Anthropic       ANTHROPIC_API_KEY
Google AI       GOOGLE_API_KEY
AWS Bedrock     AWS credentials chain
Azure OpenAI    AZURE_OPENAI_API_KEY
OpenRouter      OPENROUTER_API_KEY

Best Practices

  1. Naming: Use descriptive PascalCase names (e.g., GPT4Turbo, not client1)
  2. Credentials: Always use environment variables for API keys
  3. Retry Policies: Add retry policies for production clients
  4. Fallbacks: Use fallback strategies for critical operations
  5. Testing: Create separate clients for development/testing
  6. Documentation: Add comments explaining client purpose
  7. Defaults: Leverage provider defaults when appropriate
  8. Monitoring: Use different clients to track usage by model/provider
