
Overview

The Databricks provider connects to models hosted on Databricks AI Gateway, including Claude models and Meta Llama models. It supports both token-based and OAuth authentication. Source: crates/goose/src/providers/databricks.rs

Configuration

Environment Variables

DATABRICKS_HOST (string, required)
  Your Databricks workspace URL (e.g., https://your-workspace.cloud.databricks.com)

DATABRICKS_TOKEN (string, optional)
  Personal access token; optional if using OAuth

DATABRICKS_MAX_RETRIES (number, default: 3)
  Maximum number of retry attempts

DATABRICKS_INITIAL_RETRY_INTERVAL_MS (number, default: 1000)
  Initial retry interval in milliseconds

DATABRICKS_BACKOFF_MULTIPLIER (number, default: 2.0)
  Multiplier for exponential backoff

DATABRICKS_MAX_RETRY_INTERVAL_MS (number, default: 60000)
  Maximum retry interval in milliseconds

Setup

# Configure using the CLI
goose configure

# Or set environment variables
export DATABRICKS_HOST="https://your-workspace.cloud.databricks.com"
export DATABRICKS_TOKEN="dapi..."

Authentication

The provider supports two authentication methods:

1. Token Authentication

Use a personal access token:
export DATABRICKS_TOKEN="dapi1234567890abcdef"

2. OAuth Authentication

If no token is provided, the provider automatically uses OAuth device code flow:
# Only set the host
export DATABRICKS_HOST="https://your-workspace.cloud.databricks.com"

# When you run goose, you'll be prompted to authenticate via browser
goose session start
The OAuth flow:
  1. Displays a device code and URL
  2. Opens your browser to authenticate
  3. Caches the OAuth token for future use
  4. Automatically refreshes expired tokens
Default OAuth configuration:
  • Client ID: databricks-cli
  • Redirect URL: http://localhost
  • Scopes: all-apis, offline_access
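The defaults above can be pictured by constructing the OAuth variant of the provider's auth enum (the `DatabricksAuth` type shown under Implementation Details is redeclared locally here so the sketch is self-contained; `default_oauth` is an illustrative helper, not part of the goose API):

```rust
// Sketch: building the OAuth auth variant with the default client ID,
// redirect URL, and scopes listed above.
#[derive(Debug)]
enum DatabricksAuth {
    Token(String),
    OAuth {
        host: String,
        client_id: String,
        redirect_url: String,
        scopes: Vec<String>,
    },
}

// Hypothetical helper mirroring the documented defaults.
fn default_oauth(host: &str) -> DatabricksAuth {
    DatabricksAuth::OAuth {
        host: host.to_string(),
        client_id: "databricks-cli".to_string(),
        redirect_url: "http://localhost".to_string(),
        scopes: vec!["all-apis".to_string(), "offline_access".to_string()],
    }
}

fn main() {
    let auth = default_oauth("https://your-workspace.cloud.databricks.com");
    println!("{:?}", auth);
}
```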

Supported Models

Claude Models

  • databricks-claude-sonnet-4 (default) - Claude Sonnet on Databricks
  • databricks-claude-sonnet-4-5 - Latest Sonnet
  • databricks-claude-haiku-4-5 (fast model) - Fast Claude model

Meta Llama Models

  • databricks-meta-llama-3-3-70b-instruct - Llama 3.3 70B
  • databricks-meta-llama-3-1-405b-instruct - Llama 3.1 405B
Documentation: https://docs.databricks.com/en/generative-ai/external-models/

Usage

Basic Usage

use goose::providers::create;
use goose::model::ModelConfig;
use goose::message::Message;

// Create with default model
let model_config = ModelConfig::new("databricks-claude-sonnet-4")?;
let provider = create("databricks", model_config, vec![]).await?;

// Stream a response
let messages = vec![Message::user().with_text("Hello!")];
let stream = provider.stream(
    &provider.get_model_config(),
    "session-123",
    "You are a helpful assistant.",
    &messages,
    &[],
).await?;

Custom Configuration

let model_config = ModelConfig::new("databricks-meta-llama-3-3-70b-instruct")?
    .with_temperature(0.7)
    .with_max_tokens(2048);

let provider = create("databricks", model_config, vec![]).await?;

Using Fast Models

// Automatically tries databricks-claude-haiku-4-5 with 0 retries
let (response, usage) = provider.complete_fast(
    "session-123",
    "You are a helpful assistant.",
    &messages,
    &[],
).await?;

Advanced Features

Embeddings

The Databricks provider supports text embeddings:
if provider.supports_embeddings() {
    let texts = vec![
        "Hello world".to_string(),
        "Databricks embeddings".to_string(),
    ];
    
    let embeddings = provider.create_embeddings(
        "session-123",
        texts,
    ).await?;
    
    println!("Generated {} embeddings", embeddings.len());
}
Embedding endpoint: serving-endpoints/text-embedding-3-small/invocations

Retry Configuration

Configure retry behavior:
export DATABRICKS_MAX_RETRIES="5"
export DATABRICKS_INITIAL_RETRY_INTERVAL_MS="2000"
export DATABRICKS_BACKOFF_MULTIPLIER="2.5"
export DATABRICKS_MAX_RETRY_INTERVAL_MS="120000"
Fast models use a different retry strategy:
  • Max retries: 0 (fail fast)
  • No exponential backoff
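Assuming the environment variables above map directly onto the retry fields, the resulting backoff schedule can be sketched as follows (`interval_ms` is an illustrative helper, not a goose API):

```rust
// Sketch of the exponential-backoff schedule implied by the settings above.
// Field names mirror the RetryConfig shown under Implementation Details.
struct RetryConfig {
    max_retries: usize,
    initial_interval_ms: u64,
    backoff_multiplier: f64,
    max_interval_ms: u64,
}

impl RetryConfig {
    /// Interval to wait before retry attempt `attempt` (0-based), capped at the max.
    fn interval_ms(&self, attempt: usize) -> u64 {
        let raw = self.initial_interval_ms as f64 * self.backoff_multiplier.powi(attempt as i32);
        (raw as u64).min(self.max_interval_ms)
    }
}

fn main() {
    // Defaults: initial 1000 ms, multiplier 2.0, cap 60000 ms, 3 retries.
    let cfg = RetryConfig {
        max_retries: 3,
        initial_interval_ms: 1000,
        backoff_multiplier: 2.0,
        max_interval_ms: 60_000,
    };
    let schedule: Vec<u64> = (0..cfg.max_retries).map(|n| cfg.interval_ms(n)).collect();
    println!("{:?}", schedule); // prints [1000, 2000, 4000]
}
```

With the defaults this waits 1, 2, then 4 seconds; the cap only matters for high retry counts, where the computed interval would exceed 60 seconds.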

Model Endpoints

The provider automatically routes to the correct endpoint:
// Regular models
POST /serving-endpoints/{model_name}/invocations

// Codex models (if the model name contains "codex")
POST /serving-endpoints/responses

// Embeddings
POST /serving-endpoints/text-embedding-3-small/invocations
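The routing rules above can be sketched as a small helper (`endpoint_path` is illustrative; the real logic lives in crates/goose/src/providers/databricks.rs and may differ):

```rust
// Sketch: map a model name to its serving-endpoint path per the rules above.
fn endpoint_path(model_name: &str) -> String {
    if model_name.contains("codex") {
        // Codex-style models go through the shared responses endpoint.
        "serving-endpoints/responses".to_string()
    } else {
        // Everything else invokes the endpoint named after the model.
        format!("serving-endpoints/{}/invocations", model_name)
    }
}

fn main() {
    println!("{}", endpoint_path("databricks-claude-sonnet-4"));
    // prints serving-endpoints/databricks-claude-sonnet-4/invocations
    println!("{}", endpoint_path("my-codex-model"));
    // prints serving-endpoints/responses
}
```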

Implementation Details

Provider Metadata

impl ProviderDef for DatabricksProvider {
    fn metadata() -> ProviderMetadata {
        ProviderMetadata::new(
            "databricks",
            "Databricks",
            "Models on Databricks AI Gateway",
            "databricks-claude-sonnet-4",
            DATABRICKS_KNOWN_MODELS.to_vec(),
            "https://docs.databricks.com/en/generative-ai/external-models/",
            vec![
                ConfigKey::new("DATABRICKS_HOST", true, false, None, true),
                ConfigKey::new("DATABRICKS_TOKEN", false, true, None, true),
            ],
        )
    }
}

Authentication Flow

pub enum DatabricksAuth {
    Token(String),
    OAuth {
        host: String,
        client_id: String,
        redirect_url: String,
        scopes: Vec<String>,
    },
}
The provider dynamically gets the auth token:
impl AuthProvider for DatabricksAuthProvider {
    async fn get_auth_header(&self) -> Result<(String, String)> {
        let token = match &self.auth {
            DatabricksAuth::Token(token) => token.clone(),
            DatabricksAuth::OAuth { host, client_id, redirect_url, scopes } => {
                oauth::get_oauth_token_async(host, client_id, redirect_url, scopes).await?
            }
        };
        Ok(("Authorization".to_string(), format!("Bearer {}", token)))
    }
}

API Format

Requests use an OpenAI-compatible format, but omit the model field:
{
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant."
    },
    {
      "role": "user",
      "content": "Hello!"
    }
  ],
  "temperature": 0.7,
  "max_tokens": 2048,
  "stream": true
}
Note: The model is specified in the endpoint URL, not the request body.

Retry Strategy

pub struct RetryConfig {
    pub max_retries: usize,
    pub initial_interval_ms: u64,
    pub backoff_multiplier: f64,
    pub max_interval_ms: u64,
}

// Default config
RetryConfig {
    max_retries: 3,
    initial_interval_ms: 1000,
    backoff_multiplier: 2.0,
    max_interval_ms: 60000,
}

// Fast model config (fail fast)
RetryConfig {
    max_retries: 0,
    initial_interval_ms: 0,
    backoff_multiplier: 1.0,
    max_interval_ms: 0,
}

Fetching Available Models

// Get all serving endpoints
let models = provider.fetch_supported_models().await?;

// Queries: GET /api/2.0/serving-endpoints
// Returns endpoint names that can be used as model names
Example response:
{
  "endpoints": [
    {
      "name": "databricks-claude-sonnet-4-5",
      "creator": "user@example.com",
      "creation_timestamp": 1234567890,
      "config": { ... }
    }
  ]
}

Error Handling

match provider.stream(...).await {
    Ok(stream) => { /* handle stream */ },
    Err(ProviderError::Authentication(msg)) => {
        eprintln!("Auth failed: {}", msg);
        eprintln!("Try running: goose configure");
    },
    Err(ProviderError::RateLimited { retry_after }) => {
        eprintln!("Rate limited; retry after {:?}", retry_after);
    },
    Err(e) => eprintln!("Error: {}", e),
}

Programmatic Configuration

use goose::providers::databricks::DatabricksProvider;

let provider = DatabricksProvider::from_params(
    "https://your-workspace.cloud.databricks.com".to_string(),
    "dapi1234567890abcdef".to_string(),
    model_config,
)?;

OAuth Token Management

Tokens are cached in the system keyring:
  • Service: databricks_oauth
  • Account: {host}_access_token and {host}_refresh_token
Tokens are automatically refreshed when expired.
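The keyring entry names described above can be sketched as follows (the helpers are illustrative; only the service name and the `{host}_access_token` / `{host}_refresh_token` pattern come from the provider):

```rust
// Sketch: how the keyring service and per-host account names are derived.
const KEYRING_SERVICE: &str = "databricks_oauth";

fn access_token_account(host: &str) -> String {
    format!("{}_access_token", host)
}

fn refresh_token_account(host: &str) -> String {
    format!("{}_refresh_token", host)
}

fn main() {
    let host = "https://your-workspace.cloud.databricks.com";
    println!("{} / {}", KEYRING_SERVICE, access_token_account(host));
    println!("{} / {}", KEYRING_SERVICE, refresh_token_account(host));
}
```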
