Vertex AI

Crush can run AI models through Google Cloud’s Vertex AI platform, which provides access to foundation models such as Anthropic Claude.

Prerequisites

Before using Vertex AI with Crush, you need:

A Google Cloud account
A Google Cloud project with billing enabled
The Vertex AI API enabled in that project
The gcloud CLI installed and configured

Required Environment Variables

Crush requires two environment variables to use Vertex AI:

VERTEXAI_PROJECT

Your Google Cloud project ID:

export VERTEXAI_PROJECT="my-project-id"

To find your project ID:

gcloud config get-value project

VERTEXAI_LOCATION

The Google Cloud region where you want to run models:

export VERTEXAI_LOCATION="us-central1"

Common locations:

us-central1 (United States - Iowa)
us-east4 (United States - N. Virginia)
europe-west1 (Belgium)
europe-west4 (Netherlands)
asia-southeast1 (Singapore)

Model availability varies by region. Check the Vertex AI documentation for model availability in your preferred location.

Setting Both Variables

export VERTEXAI_PROJECT="my-project-id"
export VERTEXAI_LOCATION="us-central1"
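Since both variables are required, it can help to fail fast when one is missing. The following is a minimal sketch, assuming a POSIX-compatible shell; `require_vertex_env` is a hypothetical helper name and not part of Crush.

```shell
# Hypothetical helper: succeeds only when both variables are set and non-empty.
require_vertex_env() {
  missing=0
  for name in VERTEXAI_PROJECT VERTEXAI_LOCATION; do
    # Read the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

One way to use it is `require_vertex_env && crush`, so that Crush only starts when the configuration is complete.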

Authentication

Crush uses Google Cloud’s Application Default Credentials (ADC) for authentication. You need to authenticate using the gcloud CLI:

gcloud auth application-default login

This command will:

Open your web browser
Ask you to sign in to your Google account
Request permission to access Google Cloud resources
Store credentials locally for use by Crush

You only need to run gcloud auth application-default login once. The credentials persist until you revoke them (gcloud auth application-default revoke) or they expire.
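To see whether credentials are already stored, you can look for the ADC file directly. This is a sketch that assumes the gcloud CLI's default location on Linux and macOS (`~/.config/gcloud/application_default_credentials.json`) and the standard `GOOGLE_APPLICATION_CREDENTIALS` override; `adc_path` and `has_adc` are hypothetical helper names.

```shell
# Print the path where Application Default Credentials are expected.
# Assumption: Linux/macOS default location; Windows uses a different directory.
adc_path() {
  if [ -n "${GOOGLE_APPLICATION_CREDENTIALS:-}" ]; then
    # An explicit credentials file takes precedence in ADC lookup.
    echo "$GOOGLE_APPLICATION_CREDENTIALS"
  else
    echo "$HOME/.config/gcloud/application_default_credentials.json"
  fi
}

# Succeeds if the credentials file exists (it does not check expiry).
has_adc() {
  [ -f "$(adc_path)" ]
}
```

A present file does not guarantee the credentials are still valid, so this check complements the login command rather than replacing it.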

Enabling Vertex AI in Crush

Once you have set the required environment variables and authenticated, Crush will automatically detect your Vertex AI configuration and show it as an available provider. To set up and verify:

Set VERTEXAI_PROJECT and VERTEXAI_LOCATION
Run gcloud auth application-default login
Start Crush: crush
Check the list of available models for Vertex AI

Model Configuration

While Crush will automatically detect Vertex AI, you can customize the available models in your crush.json configuration:

{
  "$schema": "https://charm.land/crush.json",
  "providers": {
    "vertexai": {
      "models": [
        {
          "id": "claude-sonnet-4@20250514",
          "name": "VertexAI Sonnet 4",
          "cost_per_1m_in": 3,
          "cost_per_1m_out": 15,
          "cost_per_1m_in_cached": 3.75,
          "cost_per_1m_out_cached": 0.3,
          "context_window": 200000,
          "default_max_tokens": 50000,
          "can_reason": true,
          "supports_attachments": true
        }
      ]
    }
  }
}

Model Configuration Fields

id: The Vertex AI model identifier (e.g., claude-sonnet-4@20250514)
name: Display name for the model in Crush
cost_per_1m_in: Cost per 1 million input tokens (USD)
cost_per_1m_out: Cost per 1 million output tokens (USD)
cost_per_1m_in_cached: Cost per 1 million tokens written to the prompt cache (USD)
cost_per_1m_out_cached: Cost per 1 million tokens read from the prompt cache (USD)
context_window: Maximum number of tokens (input + output)
default_max_tokens: Default maximum tokens for responses
can_reason: Whether the model supports extended thinking
supports_attachments: Whether the model can process file attachments

Update the pricing information in your configuration to match Google Cloud’s current rates, as they may change over time.
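To make the per-million rates concrete, here is a small worked example. It is a simplified sketch that ignores cached tokens and is not necessarily how Crush computes its own estimates; `estimate_cost` is a hypothetical helper, and the token counts are made up for illustration.

```shell
# Hypothetical helper: estimate_cost INPUT_TOKENS OUTPUT_TOKENS RATE_IN RATE_OUT
# Rates are USD per 1 million tokens, like cost_per_1m_in and cost_per_1m_out.
estimate_cost() {
  awk -v in_tok="$1" -v out_tok="$2" -v rate_in="$3" -v rate_out="$4" \
    'BEGIN { printf "%.4f\n", (in_tok * rate_in + out_tok * rate_out) / 1000000 }'
}

# 120,000 input tokens and 8,000 output tokens at the sample rates (3 and 15):
# (120000 * 3 + 8000 * 15) / 1000000 = 0.48
estimate_cost 120000 8000 3 15   # prints 0.4800
```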

Available Models

Through Vertex AI, you can access various models including:

Anthropic Claude models (for example Claude Sonnet 4, Claude 3.5, and the Claude 3 family)
Google’s Gemini models
Other supported foundation models

Model availability depends on your Google Cloud region and project configuration.

Pricing and Billing

Usage through Vertex AI is billed by Google Cloud, and rates can differ from those of a model provider’s own API:

Pricing Structure

Per-token pricing: Charged based on input and output tokens
Caching discount: Lower rates for cached prompt tokens (when supported)
Region-specific: Prices vary by Google Cloud region
No minimum charges: Pay only for what you use

Cost Tracking

Crush tracks your token usage and estimated costs. To view your usage:

crush stats

Google Cloud billing is separate from Crush. Check your Google Cloud Console for detailed billing information and set up budget alerts.

Troubleshooting

Vertex AI Not Appearing

If Vertex AI doesn’t show up as a provider:

Verify environment variables are set:

echo $VERTEXAI_PROJECT
echo $VERTEXAI_LOCATION

Check authentication status:

gcloud auth application-default print-access-token

Ensure the Vertex AI API is enabled in your project (see API Not Enabled below)

Authentication Errors

If you see authentication errors:

Re-authenticate:
```
gcloud auth application-default login
```
Verify your account has necessary permissions
Check that your credentials haven’t expired

API Not Enabled

If you see “API not enabled” errors:

Enable the Vertex AI API:

gcloud services enable aiplatform.googleapis.com

Wait a few minutes for the API to be fully enabled
Restart Crush
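Before restarting, you may want to confirm that the API actually shows up as enabled. The sketch below wraps `gcloud services list --enabled` in a hypothetical helper named `vertex_api_enabled`; it assumes the gcloud CLI is installed and that your account is allowed to list services in the project.

```shell
# Hypothetical helper: vertex_api_enabled PROJECT_ID
# Succeeds if aiplatform.googleapis.com is among the project's enabled services.
vertex_api_enabled() {
  gcloud services list --enabled --project="$1" --format="value(config.name)" \
    | grep -qx "aiplatform.googleapis.com"
}
```

For example, `vertex_api_enabled "$VERTEXAI_PROJECT" && echo "Vertex AI API is enabled"`.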

Permission Denied

If you see permission errors:

Verify you have a required IAM role: Vertex AI User or Vertex AI Administrator
Check project-level permissions in Google Cloud Console
Ensure billing is enabled on your project

Required IAM Roles

Your Google Cloud account needs one of the following IAM roles:

roles/aiplatform.user: To use Vertex AI services

Or the more permissive:

roles/aiplatform.admin: For full Vertex AI access

To grant the role:

gcloud projects add-iam-policy-binding PROJECT_ID \
  --member="user:USER_EMAIL" \
  --role="roles/aiplatform.user"

Replace PROJECT_ID with your actual project ID and USER_EMAIL with your Google account email.
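To check whether a role is already bound to your account, you can query the project's IAM policy. This is a sketch with a hypothetical helper named `has_vertex_role`; it only sees bindings made directly to the user at the project level, so roles inherited through groups, folders, or the organization will not appear.

```shell
# Hypothetical helper: has_vertex_role PROJECT_ID ACCOUNT_EMAIL
# Succeeds if the user is directly bound to the Vertex AI user or admin role.
has_vertex_role() {
  gcloud projects get-iam-policy "$1" \
    --flatten="bindings[].members" \
    --filter="bindings.members:user:$2" \
    --format="value(bindings.role)" \
    | grep -Eqx "roles/aiplatform\.(user|admin)"
}
```

Reading the policy itself requires permission to view the project's IAM policy, so a failure here can also mean the policy could not be read.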

Best Practices

Region Selection

Choose a region based on:

Latency: Select a region close to your location
Model availability: Not all models are available in all regions
Pricing: Prices may vary slightly by region
Data residency: Choose regions that comply with your data regulations

Cost Optimization

Monitor usage: Use crush stats to track token consumption
Set budget alerts: Configure alerts in Google Cloud Console
Use appropriate models: Smaller models cost less but may be sufficient for some tasks
Leverage caching: Use models that support prompt caching to reduce costs
