Overview
Google Gemini is Google’s most capable AI model family, offering multimodal capabilities including text, vision, audio, and code. Access Gemini through Portkey for advanced reasoning, long-context understanding, and function calling. Base URL: https://generativelanguage.googleapis.com
Supported Features
- ✅ Chat Completions (including streaming)
- ✅ Embeddings
- ✅ Function Calling
- ✅ Vision (Image and Video inputs)
- ✅ Audio Understanding
- ✅ Long Context (up to 2M tokens)
- ✅ JSON Mode
- ✅ System Instructions
- ❌ Image Generation (use Vertex AI)
- ❌ Fine-tuning (use Vertex AI)
Quick Start
Chat Completions
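A request can go through Portkey's OpenAI-compatible interface or straight to the Google endpoint. Below is a minimal stdlib-only sketch against the `generateContent` REST method on the `v1beta` surface; the `GEMINI_API_KEY` environment variable and the `chat`/`build_chat_request` helpers are illustrative assumptions, not part of any SDK:

```python
import json
import os
import urllib.request

BASE_URL = "https://generativelanguage.googleapis.com/v1beta"

def build_chat_request(user_text):
    # generateContent expects a list of role-tagged "contents",
    # each made of one or more "parts".
    return {"contents": [{"role": "user", "parts": [{"text": user_text}]}]}

def chat(model, user_text):
    body = json.dumps(build_chat_request(user_text)).encode()
    url = (f"{BASE_URL}/models/{model}:generateContent"
           f"?key={os.environ['GEMINI_API_KEY']}")
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        data = json.load(resp)
    # The reply text lives at candidates[0].content.parts[0].text.
    return data["candidates"][0]["content"]["parts"][0]["text"]

# Usage (requires a valid key):
#   print(chat("gemini-1.5-flash", "Explain attention in one sentence."))
```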
Streaming
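Streaming uses the `streamGenerateContent` method, which with `alt=sse` returns server-sent events. A sketch of the SSE handling (the `GEMINI_API_KEY` variable and helper names are assumptions; chunks without a text part are simply skipped):

```python
import json
import os
import urllib.request

BASE_URL = "https://generativelanguage.googleapis.com/v1beta"

def parse_sse_line(line):
    """Pull the text delta out of one SSE `data:` line; None otherwise."""
    if not line.startswith("data: "):
        return None
    chunk = json.loads(line[len("data: "):])
    parts = chunk["candidates"][0]["content"].get("parts", [])
    return parts[0]["text"] if parts else None

def stream_chat(model, user_text):
    """Yield text chunks as the model produces them."""
    body = json.dumps(
        {"contents": [{"role": "user", "parts": [{"text": user_text}]}]}
    ).encode()
    url = (f"{BASE_URL}/models/{model}:streamGenerateContent"
           f"?alt=sse&key={os.environ['GEMINI_API_KEY']}")
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        for raw in resp:
            text = parse_sse_line(raw.decode().strip())
            if text is not None:
                yield text

# Usage (requires a valid key):
#   for piece in stream_chat("gemini-1.5-flash", "Write a haiku"):
#       print(piece, end="", flush=True)
```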
Available Models
Gemini 2.0 (Latest)
| Model | Context Window | Description | Best For |
|---|---|---|---|
| gemini-2.0-flash-exp | 1M tokens | Latest experimental Gemini 2.0 | General purpose, fast |
| gemini-2.0-flash-thinking-exp | 32K tokens | Reasoning model (experimental) | Complex problem solving |
Gemini 1.5
| Model | Context Window | Description | Best For |
|---|---|---|---|
| gemini-1.5-pro | 2M tokens | Most capable Gemini 1.5 | Complex tasks, long context |
| gemini-1.5-flash | 1M tokens | Fast, efficient model | High-throughput applications |
| gemini-1.5-flash-8b | 1M tokens | Smallest, fastest | Cost-effective tasks |
Embeddings
| Model | Dimensions | Description |
|---|---|---|
| text-embedding-004 | 768 | Latest embedding model |
| text-multilingual-embedding-002 | 768 | Multilingual support |
Gemini models excel at:
- Long context understanding (up to 2M tokens)
- Multimodal reasoning (text, images, video, audio)
- Code generation and analysis
- Multilingual capabilities
Configuration Options
Getting Your API Key
- Go to Google AI Studio
- Click Get API Key
- Create or select a project
- Copy your API key
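With the key in hand, requests authenticate either via a `key` query parameter or an `x-goog-api-key` header. A small sketch, assuming the key is stored in a `GEMINI_API_KEY` environment variable (the `endpoint` helper is illustrative):

```python
import os
from urllib.parse import urlencode

BASE_URL = "https://generativelanguage.googleapis.com/v1beta"

def endpoint(model, method="generateContent"):
    """Build an authenticated endpoint URL from the env-stored key."""
    key = os.environ.get("GEMINI_API_KEY", "")
    return f"{BASE_URL}/models/{model}:{method}?" + urlencode({"key": key})
```

Keeping the key in an environment variable (or behind a Portkey virtual key) avoids committing it to source control.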
Advanced Features
Vision (Image Understanding)
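Images travel as base64-encoded `inlineData` parts alongside the text prompt. A sketch of the request body (the file path and helper name are illustrative):

```python
import base64

def build_vision_request(image_path, prompt, mime_type="image/jpeg"):
    """One user turn mixing an inline image part with a text part."""
    with open(image_path, "rb") as f:
        image_b64 = base64.b64encode(f.read()).decode()
    return {
        "contents": [{
            "role": "user",
            "parts": [
                {"inlineData": {"mimeType": mime_type, "data": image_b64}},
                {"text": prompt},
            ],
        }]
    }
```

Large files and video are better uploaded once through the Files API and referenced by URI rather than inlined in every request.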
Function Calling
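Tools are declared as `functionDeclarations` with JSON-Schema parameters; when the model wants one invoked, it answers with a `functionCall` part instead of text. A sketch with a hypothetical `get_weather` tool (helper names are illustrative):

```python
def build_tools_request(prompt):
    """Declare a (hypothetical) get_weather tool the model may call."""
    weather_tool = {
        "name": "get_weather",
        "description": "Get current weather for a city.",
        "parameters": {  # OpenAPI-style JSON Schema
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    }
    return {
        "contents": [{"role": "user", "parts": [{"text": prompt}]}],
        "tools": [{"functionDeclarations": [weather_tool]}],
    }

def extract_function_call(candidate):
    """Return (name, args) if the model replied with a functionCall part."""
    for part in candidate["content"]["parts"]:
        if "functionCall" in part:
            fc = part["functionCall"]
            return fc["name"], fc.get("args", {})
    return None
```

After executing the function locally, its result is sent back as a `functionResponse` part in the next turn so the model can compose the final answer.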
System Instructions
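A system instruction rides at the top level of the request, outside the `contents` turn list. A minimal sketch (the helper name is illustrative):

```python
def build_request_with_system(system_text, user_text):
    """systemInstruction sits beside, not inside, the "contents" list."""
    return {
        "systemInstruction": {"parts": [{"text": system_text}]},
        "contents": [{"role": "user", "parts": [{"text": user_text}]}],
    }
```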
Long Context Processing
Gemini excels at processing very long documents in a single request.
JSON Mode
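JSON mode is switched on through `generationConfig.responseMimeType`; an optional `responseSchema` can further constrain the output shape. A sketch (the helper name is illustrative):

```python
def build_json_mode_request(prompt, schema=None):
    """Force JSON output via generationConfig.responseMimeType."""
    config = {"responseMimeType": "application/json"}
    if schema is not None:
        # Optionally pin the output to a schema.
        config["responseSchema"] = schema
    return {
        "contents": [{"role": "user", "parts": [{"text": prompt}]}],
        "generationConfig": config,
    }
```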
Embeddings
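Embeddings use the separate `embedContent` method; the request wraps the text in a single `content` object and the vector comes back under `embedding.values`. A sketch of the two payload shapes (helper names are illustrative):

```python
def build_embed_request(text):
    """Body for models/text-embedding-004:embedContent."""
    return {"content": {"parts": [{"text": text}]}}

def extract_vector(response):
    """The 768-dimensional vector sits under embedding.values."""
    return response["embedding"]["values"]
```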
Fallback Configuration
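To fall back from Gemini to GPT-4, a Portkey gateway config lists ordered targets. A sketch, assuming Portkey's config schema (`strategy` plus `targets`); the virtual-key names are placeholders:

```python
# Gateway config: try Gemini first; on error, retry the request on GPT-4.
fallback_config = {
    "strategy": {"mode": "fallback"},
    "targets": [
        {"virtual_key": "gemini-key",
         "override_params": {"model": "gemini-1.5-pro"}},
        {"virtual_key": "openai-key",
         "override_params": {"model": "gpt-4"}},
    ],
}
```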
When Gemini returns an error, the gateway transparently retries the request against GPT-4.
Load Balancing
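Traffic can likewise be split across Gemini models by weight. A sketch, again assuming Portkey's `strategy`/`targets` config schema with a placeholder virtual key:

```python
# Send ~70% of traffic to Flash (fast, cheap) and ~30% to Pro.
loadbalance_config = {
    "strategy": {"mode": "loadbalance"},
    "targets": [
        {"virtual_key": "gemini-key",
         "override_params": {"model": "gemini-1.5-flash"}, "weight": 0.7},
        {"virtual_key": "gemini-key",
         "override_params": {"model": "gemini-1.5-pro"}, "weight": 0.3},
    ],
}
```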
Assign weights to split traffic between different Gemini models.
Error Handling
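Beyond transport errors, Gemini responses can be blocked by safety filters, so parsing should check `promptFeedback.blockReason` and the candidate's `finishReason` before reading text. A defensive sketch returning `(text, error)` (the helper name is illustrative):

```python
def read_reply(response):
    """Return (text, None) on success, or (None, reason) when blocked."""
    # The prompt itself can be rejected before any candidate is produced.
    feedback = response.get("promptFeedback", {})
    if "blockReason" in feedback:
        return None, f"prompt blocked: {feedback['blockReason']}"
    candidate = response["candidates"][0]
    # A candidate stopped for safety carries no usable text parts.
    if candidate.get("finishReason") == "SAFETY":
        return None, "response blocked by safety filters"
    return candidate["content"]["parts"][0]["text"], None
```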
Key Features
Context Windows
| Model | Context Window | Notes |
|---|---|---|
| gemini-1.5-pro | 2,097,152 tokens | Largest available |
| gemini-1.5-flash | 1,048,576 tokens | Fast processing |
| gemini-2.0-flash-exp | 1,048,576 tokens | Latest generation |
| gemini-2.0-flash-thinking-exp | 32,768 tokens | Reasoning focused |
Safety Settings
Gemini includes built-in safety filters. Responses may be blocked if content violates safety thresholds.
Rate Limits
- Free tier: 15 requests per minute
- Pay-as-you-go: Higher limits based on usage
Best Practices
- Use Flash for speed - Gemini Flash is significantly faster and cheaper than Pro
- Leverage long context - Process entire documents in one request
- Multimodal inputs - Combine text, images, and more
- System instructions - Guide behavior with clear instructions
- Handle safety blocks - Implement fallbacks for blocked responses
- Use embeddings - text-embedding-004 for semantic search
- Stream responses - Better UX for long generations
Gemini vs Vertex AI
| Feature | Google AI (Gemini) | Vertex AI |
|---|---|---|
| Access | Google AI Studio API key | GCP Service Account |
| Pricing | Pay-per-request | Enterprise pricing |
| Features | Core features | Additional enterprise features |
| Authentication | API key | OAuth 2.0, Service Accounts |
| Use Case | Development, small apps | Production, enterprise |
For enterprise deployments, consider using Google Vertex AI, which offers additional features like fine-tuning, private endpoints, and SLAs.
Pricing
Gemini offers competitive pricing with a free tier:
Gemini Pricing
View detailed pricing for all Gemini models
Related Resources
Google Vertex AI
Enterprise Gemini through GCP
Function Calling
Advanced function calling
Vision Guide
Working with images
Fallbacks
Fallback configurations