Gemini Models Overview
Gemini is a family of multimodal generative AI models developed by Google DeepMind, designed for state-of-the-art performance across text, code, image, audio, and video understanding.

Model Family
The Gemini family includes several model variants optimized for different use cases.

Gemini 3.1 Pro
Gemini 3.1 Pro is Google’s latest flagship model with enhanced capabilities:
- Enhanced stability & grounding: Improved factuality and reduced repetitive response patterns
- Advanced coding & agentic capabilities: Significant improvements in software engineering and agentic performance
- Efficiency & reasoning modes: Improved token efficiency, with a Medium thinking level that balances reasoning depth against latency
- Core quality improvements: Advanced reasoning, instruction following, and creative writing
Gemini 3 Flash
Optimized for speed and efficiency while maintaining high-quality output:
- Fast response times for real-time applications
- Cost-effective for high-volume workloads
- Supports multimodal inputs (text, images, video, audio)
- Excellent for chat, code generation, and content analysis
Gemini 2.5 Models
Previous generation models with strong performance:
- Gemini 2.5 Pro: Balanced performance for complex reasoning
- Gemini 2.5 Flash: Speed-optimized variant
- Gemini 2.5 Flash Lite: Ultra-lightweight for edge deployments
Key Capabilities
Multimodal Understanding
Process and understand text, images, video, audio, and PDFs in a single unified model
Advanced Reasoning
Dynamic thinking with configurable reasoning depth for complex problem-solving
Code Generation
Generate, execute, and debug Python code with built-in code execution
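As a minimal sketch, enabling built-in code execution amounts to adding a code-execution entry to the request's tool list. The field names below are assumptions modeled on the public Gemini REST API (the prompt text is illustrative); consult the API reference for the exact casing expected by your endpoint.

```python
# Sketch of a generateContent request body that opts into the built-in
# code-execution tool. An empty object enables the tool with defaults,
# letting the model write and run Python to answer the prompt.
request = {
    "contents": [
        {
            "role": "user",
            "parts": [{"text": "Compute the 20th Fibonacci number with Python."}],
        }
    ],
    # Assumed field name for the built-in code-execution tool.
    "tools": [{"code_execution": {}}],
}
```

The model's reply then interleaves generated code, its execution output, and a natural-language summary.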
Function Calling
Integrate with external APIs and tools through structured function declarations
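A function declaration is an OpenAPI-style schema the model uses to decide when and how to call your tool. The sketch below assumes the declaration shape of the public Gemini function-calling API; the weather function itself is hypothetical.

```python
# Hypothetical tool: a weather lookup described as a structured declaration.
get_weather = {
    "name": "get_current_weather",
    "description": "Returns the current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {
            "city": {"type": "string", "description": "City name, e.g. 'Zurich'"},
            "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
        },
        "required": ["city"],
    },
}

# Declarations are passed alongside the conversation; the model responds with
# a structured function call rather than executing anything itself.
request = {
    "contents": [
        {"role": "user", "parts": [{"text": "What's the weather in Zurich?"}]}
    ],
    "tools": [{"function_declarations": [get_weather]}],
}
```

Your application executes the returned call, then sends the function's result back in a follow-up turn so the model can compose the final answer.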
Grounding
Connect to real-time data via Google Search or custom data sources
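Grounding in Google Search is likewise enabled through the tool list. The tool name below is an assumption based on the public Gemini API; an empty object requests the tool with default behavior.

```python
# Sketch: ground a request in Google Search so the model can cite fresh,
# real-world results instead of relying only on training data.
request = {
    "contents": [
        {"role": "user", "parts": [{"text": "Who won the most recent Ballon d'Or?"}]}
    ],
    # Assumed tool name for Search grounding.
    "tools": [{"google_search": {}}],
}
```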
Long Context
Process contexts of up to 2 million tokens, with context caching for cost optimization
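With context caching, a large shared prefix (say, a long document) is tokenized and stored once, then referenced by later requests at a reduced rate. The field names and resource-name format below are assumptions modeled on the public Gemini caching API.

```python
# Sketch: create a cache entry holding a large document so it is not
# resent and re-billed on every request.
cache_request = {
    "model": "models/gemini-2.5-flash",
    "contents": [
        {"role": "user", "parts": [{"text": "<large document text>"}]}
    ],
    "ttl": "3600s",  # keep the cached content alive for one hour
}

# The service returns a resource name for the cache; subsequent requests
# reference it instead of repeating the document. The ID is hypothetical.
generate_request = {
    "contents": [
        {"role": "user", "parts": [{"text": "Summarize section 3."}]}
    ],
    "cached_content": "cachedContents/example-id",
}
```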
Available Models
| Model ID | Context Window | Strengths | Best For |
|---|---|---|---|
| gemini-3.1-pro-preview | 2M tokens | Advanced reasoning, coding | Complex tasks, agents |
| gemini-3-flash-preview | 1M tokens | Speed, efficiency | Real-time apps, high volume |
| gemini-2.5-pro | 2M tokens | Balanced performance | General purpose |
| gemini-2.5-flash | 1M tokens | Fast responses | Production workloads |
| gemini-2.5-flash-lite | 1M tokens | Lightweight | Edge deployment |
Getting Started
Model Configuration
Control model behavior with generation parameters. For Gemini 3 models, we recommend keeping temperature at the default value of 1.0, as the reasoning capabilities are optimized for this setting.

Thinking Levels
Gemini 3.1 Pro introduces granular control over reasoning depth:
- Low (1-1K tokens): Minimizes latency for simple tasks
- Medium (1K-16K tokens): Balances reasoning and latency
- High (16K-32K tokens): Maximizes reasoning depth (default)
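Putting the two recommendations together, a request can pin temperature at the default of 1.0 and select a thinking level. The `thinking_config`/`thinking_level` field names are assumptions inferred from the levels described above; check the API reference for the exact names your endpoint expects.

```python
# Sketch: generation config with the recommended default temperature for
# Gemini 3 models plus a medium thinking level for balanced depth/latency.
request = {
    "contents": [
        {"role": "user", "parts": [{"text": "Plan a refactor of a large codebase."}]}
    ],
    "generation_config": {
        "temperature": 1.0,        # recommended default for Gemini 3 models
        "max_output_tokens": 2048,
        # Assumed field names for selecting a thinking level.
        "thinking_config": {"thinking_level": "medium"},
    },
}
```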
Media Resolution Control
Gemini 3 allows fine-grained control over image and video processing.

Safety Settings
Configure content filtering for responsible AI.

Next Steps
Getting Started
Learn basic text generation and model configuration
Multimodal Inputs
Process images, video, audio, and documents
Function Calling
Connect Gemini to external tools and APIs
Grounding
Ground responses in Google Search or custom data