Overview
The Databricks provider connects to models hosted on Databricks AI Gateway, including Claude models and Meta Llama models. It supports both token-based and OAuth authentication.
Source: crates/goose/src/providers/databricks.rs
Configuration
Environment Variables
DATABRICKS_HOST
Your Databricks workspace URL (e.g., https://your-workspace.cloud.databricks.com)
DATABRICKS_TOKEN
Personal access token (optional if using OAuth)
DATABRICKS_MAX_RETRIES
Maximum number of retry attempts
DATABRICKS_INITIAL_RETRY_INTERVAL_MS
Initial retry interval in milliseconds
DATABRICKS_BACKOFF_MULTIPLIER
Multiplier for exponential backoff
DATABRICKS_MAX_RETRY_INTERVAL_MS
Maximum retry interval in milliseconds
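The retry-related variables can be read with plain `std::env` lookups. This is a minimal sketch, not the provider's actual code: the helper name `retry_settings` is illustrative, and the fallback values match the default `RetryConfig` shown in the Retry Strategy section (3 / 1000 / 2.0 / 60000).

```rust
use std::env;

/// Read the retry-related environment variables, falling back to the
/// documented defaults. The helper name is illustrative, not the
/// provider's actual API.
fn retry_settings() -> (usize, u64, f64, u64) {
    let get = |key: &str| env::var(key).ok();
    let max_retries = get("DATABRICKS_MAX_RETRIES")
        .and_then(|v| v.parse().ok())
        .unwrap_or(3usize);
    let initial_ms = get("DATABRICKS_INITIAL_RETRY_INTERVAL_MS")
        .and_then(|v| v.parse().ok())
        .unwrap_or(1000u64);
    let multiplier = get("DATABRICKS_BACKOFF_MULTIPLIER")
        .and_then(|v| v.parse().ok())
        .unwrap_or(2.0f64);
    let max_ms = get("DATABRICKS_MAX_RETRY_INTERVAL_MS")
        .and_then(|v| v.parse().ok())
        .unwrap_or(60_000u64);
    (max_retries, initial_ms, multiplier, max_ms)
}

fn main() {
    env::set_var("DATABRICKS_MAX_RETRIES", "5");
    let (retries, initial, mult, max) = retry_settings();
    println!("{retries} {initial} {mult} {max}");
}
```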
Setup
# Configure using the CLI
goose configure
# Or set environment variables
export DATABRICKS_HOST="https://your-workspace.cloud.databricks.com"
export DATABRICKS_TOKEN="dapi..."
Authentication
The provider supports two authentication methods:
1. Token Authentication
Use a personal access token:
export DATABRICKS_TOKEN="dapi1234567890abcdef"
2. OAuth Authentication
If no token is provided, the provider automatically uses OAuth device code flow:
# Only set the host
export DATABRICKS_HOST="https://your-workspace.cloud.databricks.com"
# When you run goose, you'll be prompted to authenticate via browser
goose session start
The OAuth flow:
- Displays a device code and URL
- Opens your browser to authenticate
- Caches the OAuth token for future use
- Automatically refreshes expired tokens
Default OAuth configuration:
- Client ID:
databricks-cli
- Redirect URL:
http://localhost
- Scopes:
all-apis, offline_access
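These defaults can be captured as a value of the auth enum shown later under Implementation Details. The sketch below carries a standalone copy of that enum; the `default_oauth` constructor is illustrative, not the provider's actual API.

```rust
/// Standalone copy of the DatabricksAuth enum from the Implementation
/// Details section, reproduced here so the example is self-contained.
#[derive(Debug)]
enum DatabricksAuth {
    Token(String),
    OAuth {
        host: String,
        client_id: String,
        redirect_url: String,
        scopes: Vec<String>,
    },
}

/// Build the default OAuth configuration listed above.
/// The constructor name is illustrative.
fn default_oauth(host: &str) -> DatabricksAuth {
    DatabricksAuth::OAuth {
        host: host.to_string(),
        client_id: "databricks-cli".to_string(),
        redirect_url: "http://localhost".to_string(),
        scopes: vec!["all-apis".to_string(), "offline_access".to_string()],
    }
}

fn main() {
    let auth = default_oauth("https://your-workspace.cloud.databricks.com");
    println!("{auth:?}");
}
```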
Supported Models
Claude Models
databricks-claude-sonnet-4 (default) - Claude Sonnet on Databricks
databricks-claude-sonnet-4-5 - Latest Sonnet
databricks-claude-haiku-4-5 (fast model) - Fast Claude model
Meta Llama Models
databricks-meta-llama-3-3-70b-instruct - Llama 3.3 70B
databricks-meta-llama-3-1-405b-instruct - Llama 3.1 405B
Documentation: https://docs.databricks.com/en/generative-ai/external-models/
Usage
Basic Usage
use goose::providers::create;
use goose::message::Message;
use goose::model::ModelConfig;
// Create with default model
let model_config = ModelConfig::new("databricks-claude-sonnet-4")?;
let provider = create("databricks", model_config, vec![]).await?;
// Stream a response
let messages = vec![Message::user().with_text("Hello!")];
let stream = provider.stream(
&provider.get_model_config(),
"session-123",
"You are a helpful assistant.",
&messages,
&[],
).await?;
Custom Configuration
let model_config = ModelConfig::new("databricks-meta-llama-3-3-70b-instruct")?
.with_temperature(0.7)
.with_max_tokens(2048);
let provider = create("databricks", model_config, vec![]).await?;
Using Fast Models
// Automatically tries databricks-claude-haiku-4-5 with 0 retries
let (response, usage) = provider.complete_fast(
"session-123",
"You are a helpful assistant.",
&messages,
&[],
).await?;
Advanced Features
Embeddings
The Databricks provider supports text embeddings:
if provider.supports_embeddings() {
let texts = vec![
"Hello world".to_string(),
"Databricks embeddings".to_string(),
];
let embeddings = provider.create_embeddings(
"session-123",
texts,
).await?;
println!("Generated {} embeddings", embeddings.len());
}
Embedding endpoint: serving-endpoints/text-embedding-3-small/invocations
Retry Configuration
Configure retry behavior:
export DATABRICKS_MAX_RETRIES="5"
export DATABRICKS_INITIAL_RETRY_INTERVAL_MS="2000"
export DATABRICKS_BACKOFF_MULTIPLIER="2.5"
export DATABRICKS_MAX_RETRY_INTERVAL_MS="120000"
Fast models use a different retry strategy:
- Max retries: 0 (fail fast)
- No exponential backoff
Model Endpoints
The provider automatically routes to the correct endpoint:
// Regular models
POST /serving-endpoints/{model_name}/invocations
// Codex models (if model name contains "codex")
POST /serving-endpoints/responses
// Embeddings
POST /serving-endpoints/text-embedding-3-small/invocations
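The routing rule above can be sketched as a single function: if the model name contains "codex", use the responses endpoint; otherwise interpolate the model name into the invocations path. The function name and the codex model name in the example are illustrative.

```rust
/// Pick the serving endpoint path for a model, following the routing
/// rules above. The function name is illustrative.
fn endpoint_path(model_name: &str) -> String {
    if model_name.contains("codex") {
        // Codex-style models share a single responses endpoint.
        "serving-endpoints/responses".to_string()
    } else {
        // Everything else routes by model name.
        format!("serving-endpoints/{model_name}/invocations")
    }
}

fn main() {
    println!("{}", endpoint_path("databricks-claude-sonnet-4"));
    // Hypothetical codex model name, for illustration only.
    println!("{}", endpoint_path("my-codex-model"));
}
```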
Implementation Details
impl ProviderDef for DatabricksProvider {
fn metadata() -> ProviderMetadata {
ProviderMetadata::new(
"databricks",
"Databricks",
"Models on Databricks AI Gateway",
"databricks-claude-sonnet-4",
DATABRICKS_KNOWN_MODELS.to_vec(),
"https://docs.databricks.com/en/generative-ai/external-models/",
vec![
ConfigKey::new("DATABRICKS_HOST", true, false, None, true),
ConfigKey::new("DATABRICKS_TOKEN", false, true, None, true),
],
)
}
}
Authentication Flow
pub enum DatabricksAuth {
Token(String),
OAuth {
host: String,
client_id: String,
redirect_url: String,
scopes: Vec<String>,
},
}
The provider dynamically gets the auth token:
impl AuthProvider for DatabricksAuthProvider {
async fn get_auth_header(&self) -> Result<(String, String)> {
let token = match &self.auth {
DatabricksAuth::Token(token) => token.clone(),
DatabricksAuth::OAuth { host, client_id, redirect_url, scopes } => {
oauth::get_oauth_token_async(host, client_id, redirect_url, scopes).await?
}
};
Ok(("Authorization".to_string(), format!("Bearer {}", token)))
}
}
Request Format
Requests use the OpenAI-compatible chat format, but omit the model field:
{
"messages": [
{
"role": "system",
"content": "You are a helpful assistant."
},
{
"role": "user",
"content": "Hello!"
}
],
"temperature": 0.7,
"max_tokens": 2048,
"stream": true
}
Note: The model is specified in the endpoint URL, not the request body.
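A minimal sketch of building such a body without a `model` field. A real implementation would use a JSON library; this one uses plain string formatting so the example stays dependency-free, and the function name is illustrative.

```rust
/// Build a request body in the shape shown above, with no `model` field.
/// Note: real code should JSON-escape the message strings.
fn build_body(system: &str, user: &str, temperature: f64, max_tokens: u32) -> String {
    format!(
        concat!(
            "{{\"messages\":[",
            "{{\"role\":\"system\",\"content\":\"{}\"}},",
            "{{\"role\":\"user\",\"content\":\"{}\"}}],",
            "\"temperature\":{},\"max_tokens\":{},\"stream\":true}}"
        ),
        system, user, temperature, max_tokens
    )
}

fn main() {
    let body = build_body("You are a helpful assistant.", "Hello!", 0.7, 2048);
    // The model never appears in the body; it lives in the endpoint URL.
    assert!(!body.contains("\"model\""));
    println!("{body}");
}
```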
Retry Strategy
pub struct RetryConfig {
pub max_retries: usize,
pub initial_interval_ms: u64,
pub backoff_multiplier: f64,
pub max_interval_ms: u64,
}
// Default config
RetryConfig {
max_retries: 3,
initial_interval_ms: 1000,
backoff_multiplier: 2.0,
max_interval_ms: 60000,
}
// Fast model config (fail fast)
RetryConfig {
max_retries: 0,
initial_interval_ms: 0,
backoff_multiplier: 1.0,
max_interval_ms: 0,
}
Fetching Available Models
// Get all serving endpoints
let models = provider.fetch_supported_models().await?;
// Queries: GET /api/2.0/serving-endpoints
// Returns endpoint names that can be used as model names
Example response:
{
"endpoints": [
{
"name": "databricks-claude-sonnet-4-5",
"creator": "[email protected]",
"creation_timestamp": 1234567890,
"config": { ... }
}
]
}
Error Handling
match provider.stream(...).await {
Ok(stream) => { /* handle stream */ },
Err(ProviderError::Authentication(msg)) => {
eprintln!("Auth failed: {}", msg);
eprintln!("Try running: goose configure");
},
Err(ProviderError::RateLimited { retry_after }) => {
eprintln!("Rate limited; retry after {:?}", retry_after);
},
Err(e) => eprintln!("Error: {}", e),
}
Programmatic Configuration
use goose::providers::databricks::DatabricksProvider;
let provider = DatabricksProvider::from_params(
"https://your-workspace.cloud.databricks.com".to_string(),
"dapi1234567890abcdef".to_string(),
model_config,
)?;
OAuth Token Management
Tokens are cached in the system keyring:
- Service:
databricks_oauth
- Account:
{host}_access_token and {host}_refresh_token
Tokens are automatically refreshed when expired.
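The cache key scheme above can be sketched as a small helper; the function name is illustrative and the real provider talks to the system keyring rather than returning strings.

```rust
/// Sketch of the keyring service/account names described above.
/// The helper name is illustrative.
fn keyring_keys(host: &str) -> (String, String, String) {
    let service = "databricks_oauth".to_string();
    let access = format!("{host}_access_token");
    let refresh = format!("{host}_refresh_token");
    (service, access, refresh)
}

fn main() {
    let (service, access, refresh) =
        keyring_keys("https://your-workspace.cloud.databricks.com");
    println!("{service} / {access} / {refresh}");
}
```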
See Also