Overview
Google Vertex AI provides access to Gemini models, PaLM, and other Google AI models through Google Cloud Platform with enterprise features and SLAs.Quick Start
Supported Models
- Gemini 2.0
- Gemini 1.5
- Gemini 1.0
- Other Models
Latest Gemini models with multimodal capabilities:
Authentication
- Service Account (Recommended)
- Application Default Credentials
- Direct Parameters
Available Locations
Vertex AI is available in multiple regions:| Location | Code | Description |
|---|---|---|
| US Multi-Region | us-central1 | US multi-region (recommended) |
| Europe | europe-west1 | Belgium |
| Europe | europe-west4 | Netherlands |
| Asia | asia-southeast1 | Singapore |
| Asia | asia-northeast1 | Tokyo |
Multimodal (Vision)
Gemini models support images, videos, and audio:Function Calling
Gemini supports function calling:Streaming
Context Caching
Cache large contexts to reduce costs:JSON Mode
Force JSON output:Grounding (Search)
Ground responses in Google Search or Vertex AI Search:Safety Settings
Configure content safety filters:Embeddings
Generate embeddings:Advanced Parameters
Temperature and Sampling
System Instructions
Stop Sequences
Batch Prediction
Process large batches asynchronously:Error Handling
Cost Tracking
Model Garden
Use models from Vertex AI Model Garden:Best Practices
Use Service Accounts
Use service accounts with minimal required permissions for production.
Enable Caching
Use context caching for large prompts to reduce costs.
Choose Right Model
Use Flash for speed, Pro for quality, Flash-8B for high throughput.
Set Safety Filters
Configure appropriate safety settings for your use case.
Related Documentation
Vision
Work with images, videos, and PDFs
Function Calling
Implement tool use with Gemini
Embeddings
Generate embeddings on Vertex AI
Streaming
Stream responses in real-time