The google-ai provider supports Google’s generateContent and streamGenerateContent endpoints for Gemini models.
BAML uses the v1beta endpoint to access both existing v1 models and additional v1beta-exclusive models, following Google’s SDK conventions.

Quick Start

client<llm> MyClient {
  provider "google-ai"
  options {
    model "gemini-2.5-flash"
  }
}
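Once a client is defined, any BAML function can reference it by name. A minimal sketch, assuming a hypothetical `ExtractSentiment` function (the function name and prompt are illustrative, not part of the provider API):

```baml
// Hypothetical function using the client defined above.
function ExtractSentiment(text: string) -> string {
  client MyClient
  prompt #"
    Classify the sentiment of the following text as positive, negative, or neutral:
    {{ text }}
  "#
}
```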

Authentication

Set your Google API key as an environment variable:
export GOOGLE_API_KEY="your-api-key-here"
Or specify it explicitly in your BAML configuration:
client<llm> MyClient {
  provider "google-ai"
  options {
    api_key env.MY_GOOGLE_KEY
    model "gemini-2.5-flash"
  }
}

Configuration Options

BAML-Specific Options

These options modify the API request sent to Google AI.
api_key
string
default:"env.GOOGLE_API_KEY"
Passed as the x-goog-api-key header.
base_url
string
The base URL for the Google AI API.
headers
object
Additional headers to send with requests.
client<llm> MyClient {
  provider "google-ai"
  options {
    model "gemini-2.5-flash"
    headers {
      "X-My-Header" "my-value"
    }
  }
}

Supported Models

model
string
default:"gemini-2.5-flash"
The Gemini model to use.
Model | Use Case | Context | Key Features
--- | --- | --- | ---
gemini-2.5-pro | Complex tasks, coding, STEM | 1M | Adaptive thinking, multimodal
gemini-2.5-flash | Production apps, balanced performance | 1M | Best price/performance
gemini-2.5-flash-lite | High-volume, cost-sensitive | 1M | Lowest cost, fastest
You can specify any model name; BAML won't validate whether it exists. See the Google Model Documentation for the latest models.

Model Parameters

Parameters like temperature are specified in the generationConfig object. See the Google AI documentation for details.
client<llm> MyClient {
  provider "google-ai"
  options {
    model "gemini-2.5-flash"
    generationConfig {
      temperature 0.5
      maxOutputTokens 1024
      topP 0.8
      topK 40
    }
  }
}
Common generationConfig parameters:
  • temperature - Controls randomness (0-2)
  • maxOutputTokens - Maximum tokens to generate
  • topP - Nucleus sampling parameter
  • topK - Top-k sampling parameter
  • stopSequences - Array of sequences that stop generation
For all options, see the Google Gemini API documentation.

Media Handling

Google AI uses send_base64_unless_google_url by default for images, which preserves Google Cloud Storage URLs (gs://) while converting other URLs to base64.
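Because `gs://` URLs are passed through unchanged, you can reference Cloud Storage objects directly in a multimodal prompt. A minimal sketch using BAML's built-in `image` type (the function name and prompt are illustrative):

```baml
// Hypothetical multimodal function; `image` is BAML's built-in media type.
function DescribeImage(img: image) -> string {
  client MyClient
  prompt #"
    {{ img }}
    Describe this image in one sentence.
  "#
}
```

At call time you can pass either a `gs://` URL or any other image URL; per the default above, only non-Google URLs are converted to base64 before being sent.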

Features

  • Streaming: Automatically uses streamGenerateContent when you call the streaming interface
  • Multimodal: Supports text, image, audio, and video inputs
  • Large Context: Up to 1M tokens context window
  • Function Calling: Native support for tool use

Do Not Set

contents
DO NOT USE
BAML automatically constructs this from your prompt.
