Clients configure how BAML calls LLM providers, including authentication, model selection, retry policies, and provider-specific options.
## Syntax

```baml
client<llm> ClientName {
  provider "provider-name"
  retry_policy PolicyName // Optional
  options {
    // Provider-specific configuration
  }
}
```
## Basic Examples

### OpenAI

```baml
client<llm> GPT4 {
  provider openai
  options {
    model gpt-4o
    api_key env.OPENAI_API_KEY
  }
}
```

### Anthropic

```baml
client<llm> Claude {
  provider anthropic
  options {
    model claude-sonnet-4-5-20250929
    api_key env.ANTHROPIC_API_KEY
    max_tokens 2048
  }
}
```

### Google AI

```baml
client<llm> Gemini {
  provider google-ai
  options {
    model gemini-2.5-flash
    api_key env.GOOGLE_API_KEY
  }
}
```
## Client Components

`client<llm>` — declares an LLM client configuration.

`ClientName` — unique name for this client. Use PascalCase convention.

```baml
client<llm> GPT4Turbo { }
client<llm> ClaudeHaiku { }
```

`provider` — the LLM provider name. See Providers for supported values.

```baml
provider openai
provider anthropic
provider "vertex-ai"
```

`retry_policy` — optional reference to a retry policy configuration.

```baml
retry_policy ExponentialBackoff
```

`options` — provider-specific configuration block. Contents vary by provider.
## Providers

BAML supports multiple LLM providers:

| Provider | Value | Documentation |
|---|---|---|
| OpenAI | `openai` | OpenAI → |
| Anthropic | `anthropic` | Anthropic → |
| Google AI | `google-ai` | Google AI → |
| Vertex AI | `vertex-ai` | Vertex AI → |
| AWS Bedrock | `aws-bedrock` | AWS Bedrock → |
| Azure OpenAI | `azure-openai` | Azure OpenAI → |
| OpenRouter | `openrouter` | OpenRouter → |
| Ollama | `ollama` | Ollama → |
| OpenAI Generic | `openai-generic` | For OpenAI-compatible APIs |
| Fallback | `baml-fallback` | Fallback Strategy → |
| Round Robin | `baml-round-robin` | Round Robin → |
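The `openai-generic` provider targets any endpoint that speaks the OpenAI chat-completions format (self-hosted gateways, vLLM, and similar). A minimal sketch; the URL, model name, and environment variable below are placeholders, not real values:

```baml
// Hypothetical OpenAI-compatible gateway; substitute your own endpoint and model
client<llm> MyCompatibleClient {
  provider openai-generic
  options {
    base_url "https://llm-gateway.example.com/v1"
    model "my-hosted-model"
    api_key env.MY_GATEWAY_API_KEY
  }
}
```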
## Common Options

While options vary by provider, these are commonly supported:

`model` — the model identifier.

```baml
model gpt-4o
model "claude-3-opus-20240229"
```

`api_key` — API authentication key. Typically references an environment variable.

```baml
api_key env.OPENAI_API_KEY
api_key env.ANTHROPIC_API_KEY
```

`base_url` — override the default API endpoint.

```baml
base_url "https://api.openai.com/v1"
base_url env.CUSTOM_ENDPOINT
```

`max_tokens` — maximum tokens in the response.

```baml
max_tokens 2048
max_tokens 4096
```

Some models (like o1 and o3) don't support `max_tokens`. Use `max_completion_tokens` instead, or set `max_tokens` to `null`.

`temperature` — sampling temperature (0.0 to 2.0).

```baml
temperature 0.7
temperature 0.0 // Deterministic
```

`headers` — custom HTTP headers.

```baml
headers {
  "X-Custom-Header" "value"
  "Authorization" env.CUSTOM_AUTH
}
```
## Shorthand Syntax

For simple cases, use the inline provider/model syntax:

```baml
function MyFunction(input: string) -> string {
  client "openai/gpt-4o"
  prompt #"..."#
}
```

This is equivalent to:

```baml
client<llm> AutoGeneratedClient {
  provider openai
  options {
    model gpt-4o
    api_key env.OPENAI_API_KEY // Uses default
  }
}

function MyFunction(input: string) -> string {
  client AutoGeneratedClient
  prompt #"..."#
}
```
## Retry Policies

Define retry behavior for failed requests:

```baml
retry_policy ExponentialBackoff {
  max_retries 3
  strategy {
    type exponential_backoff
  }
}

retry_policy ConstantDelay {
  max_retries 5
  strategy {
    type constant_delay
    delay_ms 1000
  }
}

client<llm> ResilientClient {
  provider openai
  retry_policy ExponentialBackoff
  options {
    model gpt-4o
    api_key env.OPENAI_API_KEY
  }
}
```

See Retry Policies for details.
## Advanced Examples

### Azure OpenAI

```baml
client<llm> AzureGPT {
  provider azure-openai
  options {
    resource_name "my-resource"
    deployment_id "gpt-4o-deployment"
    api_version "2024-02-01"
    api_key env.AZURE_OPENAI_API_KEY
  }
}
```

### AWS Bedrock

```baml
client<llm> BedrockClaude {
  provider aws-bedrock
  options {
    model "anthropic.claude-3-5-sonnet-20240620-v1:0"
    region "us-east-1"
    inference_configuration {
      max_tokens 2048
    }
  }
}
```

### Vertex AI

```baml
client<llm> VertexGemini {
  provider vertex-ai
  options {
    model gemini-2.5-flash
    location us-central1
    credentials env.GOOGLE_APPLICATION_CREDENTIALS_CONTENT
  }
}
```

### OpenAI with Custom Settings

```baml
client<llm> CustomGPT {
  provider openai
  options {
    model gpt-4o
    api_key env.OPENAI_API_KEY
    temperature 0.7
    max_tokens 2048
    top_p 0.9
    frequency_penalty 0.5
    presence_penalty 0.5
  }
}
```

### Anthropic with Prompt Caching

```baml
client<llm> ClaudeWithCaching {
  provider anthropic
  options {
    model claude-3-haiku-20240307
    api_key env.ANTHROPIC_API_KEY
    max_tokens 1000
    allowed_role_metadata ["cache_control"]
    headers {
      "anthropic-beta" "prompt-caching-2024-07-31"
    }
  }
}
```

### OpenRouter

```baml
client<llm> OpenRouterClient {
  provider openrouter
  options {
    model "anthropic/claude-3-haiku"
    api_key env.OPENROUTER_API_KEY
    headers {
      "X-Title" "My App"
      "HTTP-Referer" "https://myapp.com"
    }
  }
}
```

### Ollama (Local)

```baml
client<llm> LocalLlama {
  provider ollama
  options {
    model llama3.1
    base_url "http://localhost:11434"
  }
}
```
## Strategy Clients

### Fallback Strategy

Try clients in sequence until one succeeds:

```baml
client<llm> FallbackClient {
  provider baml-fallback
  options {
    strategy [
      GPT4Turbo
      GPT35
      Claude
    ]
  }
}
```

### Round Robin Strategy

Distribute requests across clients:

```baml
client<llm> LoadBalanced {
  provider baml-round-robin
  options {
    start 0
    strategy [
      GPT4
      Claude
      Gemini
    ]
  }
}
```
## Media Handling

Configure how media (images, audio, etc.) are sent to models:

```baml
client<llm> CustomMediaHandling {
  provider openai
  options {
    model gpt-4o
    api_key env.OPENAI_API_KEY
    media_url_handler {
      image "send_base64" // Convert URLs to base64
      audio "send_url"    // Send URL directly
      pdf "send_base64"   // Convert to base64
      video "send_url"    // Send URL directly
    }
  }
}
```

Options:

- `send_url`: Pass the URL directly to the model
- `send_base64`: Download the media and convert it to base64
- `send_url_add_mime_type`: Send the URL with a MIME type header
## Reasoning Models

### OpenAI o-series

```baml
// o1/o3/o4 models don't support max_tokens
client<llm> OpenAIO1 {
  provider openai
  options {
    model "o4-mini"
    api_key env.OPENAI_API_KEY
    max_tokens null // Must be null or omitted
    max_completion_tokens 2048 // Use this instead
  }
}
```

### Anthropic with Extended Thinking

```baml
client<llm> ClaudeThinking {
  provider anthropic
  options {
    model "claude-3-7-sonnet-20250219"
    api_key env.ANTHROPIC_API_KEY
    max_tokens 2048
    thinking {
      type "enabled"
      budget_tokens 1024
    }
  }
}
```

### Gemini with Thinking

```baml
client<llm> GeminiThinking {
  provider google-ai
  options {
    model "gemini-2.5-pro"
    api_key env.GOOGLE_API_KEY
    generationConfig {
      thinkingConfig {
        thinkingBudget 1024
        includeThoughts true
      }
    }
  }
}
```
## Runtime Client Selection

Override the client at runtime:

```python
from baml_client import b

result = await b.MyFunction(
    input="data",
    baml_options={"client": "GPT4Turbo"}
)
```

Useful for:

- A/B testing
- Gradual rollouts
- User-specific routing
- Cost optimization

See Client Registry for details.
## Environment Variables

Reference environment variables with the `env.` prefix:

```baml
options {
  api_key env.OPENAI_API_KEY
  base_url env.CUSTOM_BASE_URL
  organization env.OPENAI_ORG_ID
}
```
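BAML resolves `env.*` references from the process environment at call time, so the variables must be set before your application runs. For local development they are typically exported in the shell; the values below are placeholders only:

```shell
# Placeholder keys for local development;
# production keys should come from a secrets manager, not hard-coded strings
export OPENAI_API_KEY="sk-placeholder"
export ANTHROPIC_API_KEY="sk-ant-placeholder"

# Verify the variable is visible to child processes
printenv OPENAI_API_KEY
```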
Default environment variable names by provider:

| Provider | Default Variable |
|---|---|
| OpenAI | OPENAI_API_KEY |
| Anthropic | ANTHROPIC_API_KEY |
| Google AI | GOOGLE_API_KEY |
| AWS Bedrock | AWS credentials chain |
| Azure OpenAI | AZURE_OPENAI_API_KEY |
| OpenRouter | OPENROUTER_API_KEY |
## Best Practices

- Naming: Use descriptive PascalCase names (e.g., GPT4Turbo, not client1)
- Credentials: Always use environment variables for API keys
- Retry Policies: Add retry policies for production clients
- Fallbacks: Use fallback strategies for critical operations
- Testing: Create separate clients for development and testing
- Documentation: Add comments explaining each client's purpose
- Defaults: Leverage provider defaults when appropriate
- Monitoring: Use different clients to track usage by model and provider
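Several of these practices combined in one sketch; the retry policy name `ProductionRetry` and the client's purpose are illustrative assumptions, not fixed conventions:

```baml
// Production extraction client: deterministic gpt-4o with retries.
// ProductionRetry is assumed to be a retry_policy defined elsewhere.
client<llm> ProductionExtractor {
  provider openai
  retry_policy ProductionRetry
  options {
    model gpt-4o
    api_key env.OPENAI_API_KEY // Credentials stay out of source control
    temperature 0.0            // Deterministic output for extraction tasks
  }
}
```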