LLM Clients
Clients in BAML define which LLM provider and model to use for your functions. BAML supports all major LLM providers and can work with any OpenAI-compatible API.
Quick Start: Shorthand Syntax
The fastest way to use a client is with the shorthand syntax `"<provider>/<model>"`.
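For example, a function can reference a client inline (model name is illustrative):

```baml
function Summarize(text: string) -> string {
  client "openai/gpt-4o"  // shorthand: <provider>/<model>
  prompt #"
    Summarize in one sentence: {{ text }}
  "#
}
```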
This assumes you have the appropriate API key in your environment:
- `OPENAI_API_KEY` for OpenAI
- `ANTHROPIC_API_KEY` for Anthropic
- `GOOGLE_API_KEY` for Google AI
- etc.
Common Shorthand Examples
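A few common shorthand strings, written as they would appear inside a function body (model names are examples and may lag behind current releases):

```baml
client "openai/gpt-4o"
client "anthropic/claude-3-5-sonnet-latest"
client "google-ai/gemini-1.5-flash"
```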
Named Client Configuration
For more control, define named clients.
Client Anatomy
- Declaration: `client<llm> ClientName`
- Provider: Which API provider to use
- Options: Model, credentials, and parameters
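Putting those three parts together, a named client might look like this (client and model names are illustrative):

```baml
client<llm> MyGPT4o {
  provider openai              // which API provider to use
  options {                    // model, credentials, and parameters
    model "gpt-4o"
    api_key env.OPENAI_API_KEY
  }
}
```

Functions then reference it by name: `client MyGPT4o`.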
Supported Providers
BAML supports all major LLM providers:
- OpenAI
- Anthropic
- Google AI (Gemini)
- AWS Bedrock
- Azure OpenAI
- OpenAI-Compatible (Ollama, OpenRouter, etc.)
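For OpenAI-compatible servers such as Ollama, a client can point at a local endpoint via the generic provider (the URL and model name below are assumptions for a default local Ollama install):

```baml
client<llm> LocalLlama {
  provider openai-generic
  options {
    base_url "http://localhost:11434/v1"  // Ollama's OpenAI-compatible endpoint
    model "llama3"
  }
}
```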
Common Options
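A sketch of a client with common tuning options set (exact option names are forwarded to the provider's API, so availability varies by provider):

```baml
client<llm> TunedClient {
  provider openai
  options {
    model "gpt-4o"
    api_key env.OPENAI_API_KEY
    temperature 0.1   // lower = more deterministic output
    max_tokens 1024   // cap on response length
  }
}
```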
These options work across most providers.
Environment Variables
Access environment variables with `env.VARIABLE_NAME`.
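For example (the `OPENAI_BASE_URL` variable is an assumption; any variable name in your environment works):

```baml
client<llm> EnvDrivenClient {
  provider openai
  options {
    model "gpt-4o"
    api_key env.OPENAI_API_KEY     // read from the environment, never hardcoded
    base_url env.OPENAI_BASE_URL
  }
}
```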
Custom Headers
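A minimal sketch of a client that attaches a custom header (the header name and value here are hypothetical placeholders):

```baml
client<llm> HeaderedClient {
  provider openai
  options {
    model "gpt-4o"
    api_key env.OPENAI_API_KEY
    headers {
      "X-Request-Source" "my-baml-app"  // illustrative custom header
    }
  }
}
```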
Headers are sent with every request to the provider, enabling beta features or extra authentication.
Retry Policies
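A retry policy is declared separately and attached to a client by name (the policy name and timing values below are illustrative):

```baml
retry_policy SafeRetry {
  max_retries 3
  strategy {
    type exponential_backoff
    delay_ms 200       // initial delay
    multiplier 1.5     // backoff growth factor
    max_delay_ms 10000 // ceiling on delay between attempts
  }
}

client<llm> ResilientGPT {
  provider openai
  retry_policy SafeRetry
  options {
    model "gpt-4o"
    api_key env.OPENAI_API_KEY
  }
}
```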
Retry policies handle transient failures such as rate limits and timeouts automatically.
Fallback Clients
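A fallback is itself a client whose strategy lists other named clients to try in order (`PrimaryGPT` and `BackupClaude` are assumed to be defined elsewhere):

```baml
client<llm> ResilientClient {
  provider fallback
  options {
    // Tried in order until one succeeds.
    strategy [PrimaryGPT, BackupClaude]
  }
}
```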
A fallback client automatically switches to the next model when the primary fails.
Round Robin
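Round robin uses the same strategy-list shape with a different provider (`ClientA` and `ClientB` are assumed to be defined elsewhere):

```baml
client<llm> BalancedClient {
  provider round-robin
  options {
    // Requests rotate across these clients.
    strategy [ClientA, ClientB]
  }
}
```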
Round robin distributes requests across multiple models, which is useful for:
- Load distribution
- Cost optimization
- A/B testing different models
Runtime Client Selection
Choose the client dynamically at runtime using the Client Registry:
- Feature flags (send 10% to a new model)
- User-based routing (premium users get better models)
- Dynamic cost optimization
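The selection logic behind these patterns can be sketched in plain Python. This is an illustration of the routing decision only, not the actual BAML Client Registry API; the client names and tiers are hypothetical:

```python
import random

def pick_client(user_tier: str, canary_pct: float = 0.10, rng=random.random) -> str:
    """Choose a named client by user tier, diverting a fraction to a canary model.

    `rng` is injectable so the feature-flag branch is testable deterministically.
    """
    if user_tier == "premium":
        return "PremiumGPT4o"      # user-based routing: premium users get the stronger model
    if rng() < canary_pct:
        return "CanaryNewModel"    # feature flag: ~10% of traffic tries the new model
    return "DefaultMini"           # everyone else gets the cheap default
```

The chosen name would then be passed to BAML's runtime client selection mechanism.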
Switching Models
Switching models is as simple as changing one line. BAML handles the differences in:
- API formats
- Authentication
- Response parsing
- Structured output support
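For example, moving a function from one provider to another is a one-line change (model names illustrative):

```baml
// Before:
client "openai/gpt-4o"
// After — the only line that changes:
client "anthropic/claude-3-5-sonnet-latest"
```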
Schema-Aligned Parsing (SAP)
BAML’s SAP algorithm works with any model, even those without native structured output support:
- Works on day one of new model releases
- Handles models without tool calling (like O1, DeepSeek R1)
- Parses markdown-wrapped JSON
- Accepts chain-of-thought before JSON
- Tolerates minor formatting issues
This makes SAP especially useful for:
- Brand new models before official API support
- Open-source models
- Fine-tuned models
- Models without structured output APIs
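To give a feel for the kind of leniency involved, here is a toy Python sketch that pulls JSON out of a chatty, markdown-wrapped reply. This is an illustration only, not BAML's actual SAP implementation:

```python
import json
import re

_FENCE = "`" * 3  # literal triple backtick, built programmatically to keep this block well-formed

def lenient_json(reply: str) -> dict:
    """Toy sketch: extract JSON from a reply that may include markdown fences
    and chain-of-thought text before the object."""
    # Prefer a fenced ```json block if one exists.
    fenced = re.search(_FENCE + r"(?:json)?\s*(\{.*?\})\s*" + _FENCE, reply, re.DOTALL)
    if fenced:
        return json.loads(fenced.group(1))
    # Otherwise take the first {...} span, tolerating a reasoning prefix.
    brace = re.search(r"\{.*\}", reply, re.DOTALL)
    return json.loads(brace.group(0)) if brace else {}
```

Real SAP goes much further (aligning output to your schema, repairing minor formatting issues), but the principle is the same: accept what models actually emit.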
Provider-Specific Features
Some providers have unique capabilities:
Anthropic Prompt Caching
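Prompt caching can be enabled via a beta header on an Anthropic client (the header value reflects a specific beta release and may change; model name is illustrative):

```baml
client<llm> CachingClaude {
  provider anthropic
  options {
    model "claude-3-5-sonnet-latest"
    api_key env.ANTHROPIC_API_KEY
    headers {
      "anthropic-beta" "prompt-caching-2024-07-31"  // opt in to the caching beta
    }
  }
}
```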
OpenAI Response Format
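Assuming options pass through to the request body, OpenAI's JSON mode could be requested like this (a sketch; BAML's SAP usually makes this unnecessary):

```baml
client<llm> JsonModeGPT {
  provider openai
  options {
    model "gpt-4o"
    api_key env.OPENAI_API_KEY
    response_format { type "json_object" }  // forwarded to the OpenAI API
  }
}
```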
Testing with Different Clients
Test the same function with multiple models to compare:
- Accuracy
- Latency
- Cost
- Output quality
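One pattern is to define two function variants that differ only in their client, then exercise both from a single test (function names, models, and args are illustrative):

```baml
function SummarizeWithGPT(text: string) -> string {
  client "openai/gpt-4o"
  prompt #"Summarize in one sentence: {{ text }}"#
}

function SummarizeWithClaude(text: string) -> string {
  client "anthropic/claude-3-5-sonnet-latest"
  prompt #"Summarize in one sentence: {{ text }}"#
}

test CompareModels {
  functions [SummarizeWithGPT, SummarizeWithClaude]
  args {
    text "BAML is a language for building LLM functions."
  }
}
```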
Best Practices
- Use named clients for configuration: Easier to maintain than inline options
- Store API keys in environment variables: Never hardcode credentials
- Add retry policies: Handle transient failures gracefully
- Use fallbacks for critical paths: Ensure high availability
- Test with multiple models: Find the best model for your use case
- Monitor costs: Different models have different pricing
- Use round robin for load balancing: Distribute load across providers
Example: Production-Ready Configuration
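A sketch of a production configuration combining a retry policy with a fallback (names, models, and timing values are illustrative):

```baml
retry_policy ProdRetry {
  max_retries 3
  strategy {
    type exponential_backoff
    delay_ms 200
    multiplier 1.5
    max_delay_ms 10000
  }
}

client<llm> ProdPrimary {
  provider openai
  retry_policy ProdRetry
  options {
    model "gpt-4o"
    api_key env.OPENAI_API_KEY
  }
}

client<llm> ProdBackup {
  provider anthropic
  retry_policy ProdRetry
  options {
    model "claude-3-5-sonnet-latest"
    api_key env.ANTHROPIC_API_KEY
  }
}

client<llm> Production {
  provider fallback
  options {
    strategy [ProdPrimary, ProdBackup]
  }
}
```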
A production-ready setup combines retries on each client with a fallback across providers for high availability.
Next Steps
- Functions: Use clients in BAML functions
- Testing: Test with different clients
- Provider Reference: Complete provider documentation
- Client Registry: Runtime client selection