The openai-generic provider supports any API that uses OpenAI’s request and response formats, including Ollama, OpenRouter, Groq, Hugging Face, Together AI, and many others.

Quick Start

client<llm> MyClient {
  provider "openai-generic"
  options {
    base_url "https://api.provider.com"
    model "<provider-specific-model-name>"
  }
}

Supported Providers

A non-exhaustive list of providers compatible with openai-generic:
| Provider | Base URL | Documentation |
| --- | --- | --- |
| Ollama | http://localhost:11434/v1 | Ollama |
| OpenRouter | https://openrouter.ai/api/v1 | OpenRouter |
| Groq | https://api.groq.com/openai/v1 | Groq |
| Together AI | https://api.together.xyz/v1 | Together |
| Cerebras | https://api.cerebras.ai/v1 | Cerebras |
| Hugging Face | https://api-inference.huggingface.co/models/<model> | HuggingFace |
| LM Studio | http://localhost:1234/v1 | LM Studio |
| vLLM | Custom | vLLM |

Configuration Options

BAML-Specific Options

base_url (string, default: "https://api.openai.com/v1")
The base URL for the API endpoint.

api_key (string, default: none)
Used to build the Authorization header: Authorization: Bearer $api_key. If not set, or set to an empty string, the Authorization header will not be sent. This is useful for local providers like Ollama that don’t require authentication.
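For hosted providers, the key is typically read from an environment variable. A minimal sketch (the base URL and MY_API_KEY variable name are placeholders):

```
client<llm> AuthenticatedClient {
  provider "openai-generic"
  options {
    base_url "https://api.provider.com"
    api_key env.MY_API_KEY
    model "model-name"
  }
}
```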
headers (object)
Additional headers to send with requests.
client<llm> MyClient {
  provider "openai-generic"
  options {
    base_url "https://api.provider.com"
    model "model-name"
    headers {
      "X-My-Header" "my-value"
    }
  }
}
model (string)
The model name in the provider’s expected format. The exact syntax depends on your provider’s documentation. Examples:
  • OpenAI: "gpt-4o"
  • Ollama: "llama3"
  • OpenRouter: "openai/gpt-4o-mini"

Model Parameters

These parameters are passed directly to the provider API.
For OpenAI reasoning models (such as o1 or o4-mini), use max_completion_tokens instead of max_tokens, and set max_tokens to null:
client<llm> ReasoningModel {
  provider "openai-generic"
  options {
    base_url "https://api.provider.com"
    model "o4-mini"
    max_tokens null
    max_completion_tokens 4096
  }
}
Common parameters (support varies by provider):
  • temperature - Controls randomness
  • max_tokens - Maximum tokens to generate
  • top_p - Nucleus sampling parameter
  • frequency_penalty - Reduces repetition
  • presence_penalty - Encourages new topics
Consult your specific provider’s documentation for supported parameters.
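As an illustrative sketch of how these parameters are passed (the base URL, model name, and values are placeholders, and not every provider supports every parameter):

```
client<llm> TunedClient {
  provider "openai-generic"
  options {
    base_url "https://api.provider.com"
    model "model-name"
    temperature 0.2
    max_tokens 1024
    top_p 0.9
  }
}
```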

Provider Examples

Ollama

Ollama provides local LLM inference with OpenAI-compatible endpoints.
Use Ollama’s OpenAI-compatible /v1 endpoint. See Ollama’s OpenAI compatibility documentation.
client<llm> OllamaClient {
  provider "openai-generic"
  options {
    base_url "http://localhost:11434/v1"
    model "llama3"
  }
}
Popular Ollama Models:

| Model | Description |
| --- | --- |
| llama4 | Latest Meta Llama with enhanced reasoning |
| llama3.3 | Enhanced Llama 3 with improved performance |
| qwen2 | Alibaba’s large language model series |
| phi3 | Microsoft’s lightweight 3B/14B models |
| mistral | Mistral AI’s 7B model |
| gemma | Google DeepMind’s lightweight models |

See the Ollama Model Library for all available models.

CORS for Web Testing: to allow browser requests during local development, start Ollama with permissive origins:
OLLAMA_ORIGINS='*' ollama serve

OpenRouter

OpenRouter provides unified access to 300+ models from multiple providers.
export OPENROUTER_API_KEY="your-api-key-here"
client<llm> OpenRouterClient {
  provider "openai-generic"
  options {
    base_url "https://openrouter.ai/api/v1"
    api_key env.OPENROUTER_API_KEY
    model "openai/gpt-4o-mini"
  }
}
Model Naming Convention: OpenRouter uses provider/model-name format:
  • openai/gpt-4o-mini
  • anthropic/claude-3.5-sonnet
  • google/gemini-2.0-flash-001
  • meta-llama/llama-3.1-70b-instruct
Model Variants: OpenRouter supports routing preferences (e.g., :nitro for high-throughput):
client<llm> NitroClient {
  provider "openai-generic"
  options {
    base_url "https://openrouter.ai/api/v1"
    api_key env.OPENROUTER_API_KEY
    model "meta-llama/llama-3.1-70b-instruct:nitro"
  }
}
App Attribution: OpenRouter supports optional headers for app attribution:
client<llm> OpenRouterWithAttribution {
  provider "openai-generic"
  options {
    base_url "https://openrouter.ai/api/v1"
    api_key env.OPENROUTER_API_KEY
    model "anthropic/claude-3-haiku"
    headers {
      "X-Title" "My App"
      "HTTP-Referer" "https://myapp.com"
    }
  }
}
See OpenRouter Models for the complete list.

Groq

Groq provides fast inference for open-source models.
export GROQ_API_KEY="your-api-key-here"
client<llm> GroqClient {
  provider "openai-generic"
  options {
    base_url "https://api.groq.com/openai/v1"
    api_key env.GROQ_API_KEY
    model "llama-3.1-70b-versatile"
  }
}

Together AI

Together AI provides access to open-source models with fast inference.
export TOGETHER_API_KEY="your-api-key-here"
client<llm> TogetherClient {
  provider "openai-generic"
  options {
    base_url "https://api.together.xyz/v1"
    api_key env.TOGETHER_API_KEY
    model "meta-llama/Llama-3-70b-chat-hf"
  }
}

LM Studio

LM Studio provides local model inference with a desktop application.
client<llm> LMStudioClient {
  provider "openai-generic"
  options {
    base_url "http://localhost:1234/v1"
    model "local-model"
  }
}
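
vLLM

vLLM serves local models behind an OpenAI-compatible API; its base URL depends on how you launch the server. A minimal sketch, assuming vLLM’s default port 8000 and an example model name (substitute the model you are actually serving):

```
client<llm> VLLMClient {
  provider "openai-generic"
  options {
    base_url "http://localhost:8000/v1"
    model "meta-llama/Llama-3.1-8B-Instruct"
  }
}
```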

Features

  • Streaming: Supported (depends on provider)
  • Multimodal: Support varies by provider and model
  • Local Inference: Works with Ollama, LM Studio, and vLLM
  • Cloud Providers: Works with OpenRouter, Groq, Together AI, and more

Do Not Set

messages
DO NOT USE
BAML automatically constructs this from your prompt.
stream
DO NOT USE
BAML automatically sets this based on how you call the client in your code.

Additional Resources

For detailed configuration of specific providers, see:
