Available Models
Gemini 3 Series (Latest)
- gemini-3.1-pro-preview - Most capable with advanced reasoning (1M context)
- gemini-3-pro-preview - Advanced reasoning and thinking
- gemini-3-flash-preview - Fast with thinking support
Gemini 2 Series
- gemini-2.5-pro - Most capable Gemini 2.5 (1M context)
- gemini-2.5-flash - Fast and efficient (1M context)
- gemini-2.0-flash - Fast and versatile (1M context)
Gemini 1.5 Series
- gemini-1.5-pro - Capable and reliable (1M context)
- gemini-1.5-flash - Fast and efficient (1M context)
- gemini-1.5-flash-8b - Compact and efficient (1M context)
Key Features
- Massive 1M token context windows
- Multimodal input (text + images)
- Tool calling and parallel execution
- Thinking/reasoning (Gemini 3 series)
Prerequisites
Before configuring Vertex AI in Forge, make sure you have:
- Google Cloud Account: Active GCP account with billing enabled
- GCP Project: Project with Vertex AI API enabled
- Authentication: Google Cloud CLI installed and configured
Setup Steps
Install Google Cloud CLI
Configure Application Default Credentials
For Forge to access Vertex AI, set up ADC by running gcloud auth application-default login. This creates credentials that Forge can use automatically.
Configure Forge
Run the interactive login command, then select Vertex AI and provide:
- Project ID: Your GCP project ID
- Location: GCP region (e.g., us-central1 or global)
- Auth Method: Choose “Google ADC” (recommended)
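A quick sanity check on the two values before entering them can save a failed login. The sketch below is an illustration only: validate_vertex_config is a hypothetical helper, not part of Forge. The project ID pattern follows the published GCP naming rules (6 to 30 characters, lowercase letters, digits and hyphens, starting with a letter and not ending with a hyphen); the region pattern is a shape heuristic and does not confirm that a region exists or hosts a given model.

```python
import re

# GCP project IDs: 6-30 chars, lowercase letters, digits, hyphens;
# must start with a letter and must not end with a hyphen.
PROJECT_ID_RE = re.compile(r"^[a-z][a-z0-9-]{4,28}[a-z0-9]$")

# Heuristic shape of a region name such as us-central1 (assumption:
# this only checks the shape, not whether the region actually exists).
REGION_RE = re.compile(r"^[a-z]+-[a-z]+[0-9]+$")


def validate_vertex_config(project_id: str, location: str) -> list:
    """Return a list of problems; an empty list means the values look plausible."""
    problems = []
    if not PROJECT_ID_RE.fullmatch(project_id):
        problems.append(f"project ID {project_id!r} does not match GCP naming rules")
    if location != "global" and not REGION_RE.fullmatch(location):
        problems.append(f"location {location!r} is neither 'global' nor region-shaped")
    return problems
```

Calling validate_vertex_config("my-project-123", "us-central1") returns an empty list, while a value such as "My_Project" is reported as a problem.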
Configuration
Required Parameters
- PROJECT_ID: Your Google Cloud project ID
- LOCATION: GCP region (e.g., us-central1, europe-west1, or global)
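To make the role of these two parameters concrete, here is a minimal sketch of how they typically combine into a Vertex AI REST URL. It follows the public Vertex AI URL pattern (region-prefixed host for regional locations, un-prefixed host for global). The function names are hypothetical, and the exact paths and API version that Forge calls are not stated in this guide and may differ.

```python
def vertex_base_url(project_id: str, location: str) -> str:
    """Build the base resource URL for a project and location."""
    # The global location uses the un-prefixed host; regional locations
    # prefix the host with the region name.
    if location == "global":
        host = "aiplatform.googleapis.com"
    else:
        host = f"{location}-aiplatform.googleapis.com"
    return f"https://{host}/v1/projects/{project_id}/locations/{location}"


def generate_content_url(project_id: str, location: str, model: str) -> str:
    """Build the generateContent URL for a Google-published model."""
    base = vertex_base_url(project_id, location)
    return f"{base}/publishers/google/models/{model}:generateContent"
```

For example, generate_content_url("my-project", "us-central1", "gemini-2.5-flash") yields a URL on the us-central1-aiplatform.googleapis.com host, while passing "global" yields one on aiplatform.googleapis.com.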
API Endpoints
The endpoint format varies by location: the global location uses a different endpoint format from regional locations.
Authentication Methods
Google Application Default Credentials (Recommended)
Forge automatically uses ADC when configured with the “Google ADC” method:
- Tokens are refreshed automatically
- No manual token management needed
- Works seamlessly with GCP services
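If it is unclear whether ADC is set up on a machine, the file lookup that Google client libraries perform can be approximated with the standard library. This is a sketch under stated assumptions: find_adc_file is a hypothetical helper, it only covers the GOOGLE_APPLICATION_CREDENTIALS variable and the well-known gcloud file location, and it ignores other ADC sources such as the metadata server or a custom CLOUDSDK_CONFIG directory. How Forge itself locates credentials is not described here.

```python
import os
from pathlib import Path
from typing import Mapping, Optional

ADC_FILENAME = "application_default_credentials.json"


def find_adc_file(env: Mapping[str, str], home: Path) -> Optional[Path]:
    """Return the ADC credentials file a client library would look for, if present."""
    # 1. An explicit path in GOOGLE_APPLICATION_CREDENTIALS takes priority.
    explicit = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if explicit:
        path = Path(explicit)
        return path if path.is_file() else None

    # 2. Otherwise fall back to the file written by
    #    `gcloud auth application-default login`.
    if os.name == "nt":
        base = Path(env.get("APPDATA", str(home))) / "gcloud"
    else:
        base = home / ".config" / "gcloud"
    candidate = base / ADC_FILENAME
    return candidate if candidate.is_file() else None
```

A typical call is find_adc_file(os.environ, Path.home()); a result of None suggests running gcloud auth application-default login again.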
Manual API Token
You can also provide a token manually.
Model Selection
For Maximum Context
All Gemini models support 1M context:
- gemini-3.1-pro-preview - Best overall
- gemini-2.5-pro - Excellent capability
- gemini-1.5-pro - Reliable choice
For Speed
- gemini-3-flash-preview - Fast with thinking
- gemini-2.5-flash - Fast and capable
- gemini-1.5-flash - Quick responses
- gemini-1.5-flash-8b - Ultra-fast
For Reasoning
Gemini 3 models support extended thinking:
- gemini-3.1-pro-preview - Advanced reasoning
- gemini-3-pro-preview - Strong reasoning
- gemini-3-flash-preview - Fast reasoning
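The recommendations above can be captured as a small lookup, which is handy when a script needs to choose a model by priority. This is only a sketch: pick_model and MODELS_BY_PRIORITY are hypothetical names, the ordering simply mirrors the lists in this section, and the allow_preview switch is an added convenience for callers who want to avoid models whose names end in -preview.

```python
# Model names and their ordering are copied from the lists in this section.
MODELS_BY_PRIORITY = {
    "context": ["gemini-3.1-pro-preview", "gemini-2.5-pro", "gemini-1.5-pro"],
    "speed": [
        "gemini-3-flash-preview",
        "gemini-2.5-flash",
        "gemini-1.5-flash",
        "gemini-1.5-flash-8b",
    ],
    "reasoning": [
        "gemini-3.1-pro-preview",
        "gemini-3-pro-preview",
        "gemini-3-flash-preview",
    ],
}


def pick_model(priority: str, allow_preview: bool = True) -> str:
    """Return the first listed model for a priority, optionally skipping previews."""
    for model in MODELS_BY_PRIORITY[priority]:
        if allow_preview or not model.endswith("-preview"):
            return model
    raise ValueError(f"no non-preview model listed for priority {priority!r}")
```

Because every model listed for reasoning is a preview release, pick_model("reasoning", allow_preview=False) raises ValueError rather than silently falling back to a different category.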
Switching Models
You can change models during a session.
Regions and Availability
Recommended Regions
- us-central1 - US Central (Iowa)
- us-east4 - US East (Virginia)
- europe-west1 - Europe (Belgium)
- asia-northeast1 - Asia (Tokyo)
- global - Global endpoint (auto-routing)
Choosing a Region
Use global if:
- You want automatic routing
- Latency is not critical
- You don’t need regional data residency
Use a specific region if:
- You need low latency
- Compliance requires data residency
- You’re using other regional GCP services
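The decision rule above is small enough to express directly. The following sketch encodes it as a function; choose_location and its parameter names are hypothetical, and the us-central1 default is just one of the recommended regions listed earlier.

```python
def choose_location(
    low_latency: bool = False,
    data_residency: bool = False,
    regional_services: bool = False,
    region: str = "us-central1",
) -> str:
    """Return a specific region if any regional requirement applies, else 'global'."""
    if low_latency or data_residency or regional_services:
        return region
    # No regional constraint: let the global endpoint route the request.
    return "global"
```

For instance, choose_location(data_residency=True, region="europe-west1") returns "europe-west1", and choose_location() with no requirements returns "global".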
Features
Massive Context Windows
Gemini models support 1M tokens, which lets you:
- Process entire codebases
- Analyze large documents
- Long conversation history
- Complex multi-file operations
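Before sending a whole codebase, it can help to estimate whether it fits in the window. The sketch below uses a rough heuristic of about 4 characters per token, which is an assumption and not the Gemini tokenizer; real counts vary with language and content. The function names and the 8192-token output reserve are likewise illustrative.

```python
from typing import Iterable

CONTEXT_WINDOW_TOKENS = 1_000_000  # 1M-token window stated in this guide
CHARS_PER_TOKEN = 4                # rough heuristic (assumption), not a real tokenizer


def estimate_tokens(text: str) -> int:
    """Estimate the token count by ceiling-dividing the character count."""
    return (len(text) + CHARS_PER_TOKEN - 1) // CHARS_PER_TOKEN


def fits_in_context(texts: Iterable[str], reserve_for_output: int = 8192) -> bool:
    """Check whether the texts plus an output reserve fit within the window."""
    total = sum(estimate_tokens(t) for t in texts)
    return total + reserve_for_output <= CONTEXT_WINDOW_TOKENS
```

In practice the texts would be file contents read from disk; if the estimate is close to the limit, treat the result as inconclusive and trim the input.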
Multimodal Capabilities
- Image understanding
- Diagram analysis
- Screenshot interpretation
- Combined text and visual reasoning
Thinking Mode
Gemini 3 models show their reasoning:
- Explicit thought process
- Step-by-step logic
- Problem decomposition
- Self-verification
Enterprise Features
- Audit Logging: Full request/response logging
- VPC Service Controls: Network security
- Customer-Managed Keys: Data encryption
- SLA: 99.9% uptime guarantee
Best Practices
Authentication
For Development:
- Use gcloud auth application-default login
- Let Forge automatically refresh tokens
For Production:
- Use service accounts with minimal permissions
- Rotate credentials regularly
- Enable audit logging
Cost Management
Model Selection:
- Use Flash models for simple tasks (lower cost)
- Use Pro models for complex reasoning (higher cost)
- Monitor usage in GCP Console
- Use smaller context when possible
- Cache common prompts
- Batch similar requests
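As one way to read "cache common prompts", the sketch below keeps an in-memory, client-side cache keyed by model and prompt so that identical requests are sent only once. PromptCache and the call_model callable are hypothetical names for illustration; this is not a Forge feature and is distinct from any server-side caching that Vertex AI may offer.

```python
import hashlib
from typing import Callable, Dict


class PromptCache:
    """Client-side cache that avoids re-sending identical (model, prompt) pairs."""

    def __init__(self, call_model: Callable[[str, str], str]):
        # call_model(model, prompt) is whatever function actually performs the request.
        self._call_model = call_model
        self._store: Dict[str, str] = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def _key(model: str, prompt: str) -> str:
        # A NUL separator keeps ("ab", "c") and ("a", "bc") from colliding.
        return hashlib.sha256(f"{model}\x00{prompt}".encode("utf-8")).hexdigest()

    def complete(self, model: str, prompt: str) -> str:
        key = self._key(model, prompt)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        result = self._call_model(model, prompt)
        self._store[key] = result
        return result
```

A cache like this only pays off for deterministic, repeated prompts; responses that should vary between calls should bypass it.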
Rate Limits
Vertex AI enforces quotas:
- Requests per minute: Varies by model and region
- Tokens per minute: Varies by model
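When a quota is exceeded, a common client-side response is to retry with exponential backoff and jitter. The sketch below shows the pattern in isolation. QuotaExceeded is a hypothetical exception standing in for whatever error your client raises on a quota response (typically HTTP 429), and the delay values are illustrative defaults rather than recommendations from Vertex AI or Forge.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class QuotaExceeded(Exception):
    """Hypothetical error representing a quota or rate-limit response."""


def call_with_backoff(
    fn: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 32.0,
    sleep: Callable[[float], None] = time.sleep,
    rng: Callable[[], float] = random.random,
) -> T:
    """Call fn, retrying on QuotaExceeded with exponential backoff and jitter."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except QuotaExceeded:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Jitter: wait between 50% and 100% of the computed delay.
            sleep(delay * (0.5 + rng() / 2))
    raise RuntimeError("max_attempts must be at least 1")
```

The sleep and rng parameters exist so the function can be exercised in tests without real waiting.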
Troubleshooting
Authentication Errors
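If authentication fails, refresh your Application Default Credentials, for example by re-running gcloud auth application-default login.
API Not Enabled
If you see “API not enabled”, enable the Vertex AI API (aiplatform.googleapis.com) for your project.
Permission Denied
If you lack permissions:
- Check IAM roles in GCP Console
- Ensure you have the Vertex AI User role
- Contact your GCP admin for access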
Region Not Available
If a model isn’t available in your region:
- Try the global location
- Check model availability
- Switch to an available region
Token Expiration
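If you are using manual tokens and they expire, generate a new token and provide it again, or switch to Google ADC so that tokens are refreshed automatically.
Deprecated: Environment Variable Setup
For backward compatibility, configuration through environment variables is still supported.
Next Steps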
- Learn about Gemini capabilities
- Explore prompt design
- Set up billing alerts to monitor costs
- Configure VPC Service Controls for security