This quickstart guide will get you up and running with Gemini on Vertex AI. You’ll learn to generate text and build a multimodal application that processes both text and images.
Before starting, make sure you’ve completed the Environment Setup guide.

Your First Gemini Request

Let’s start with the simplest possible example: generating text from a prompt.
Step 1: Import and Initialize

Create a new Python file or notebook and set up the client:
from google import genai
from google.genai import types

# Initialize the client
PROJECT_ID = "your-project-id"  # Replace with your project ID
LOCATION = "us-central1"

client = genai.Client(
    vertexai=True,
    project=PROJECT_ID,
    location=LOCATION
)
If you’re running on Google Colab, add authentication first:
from google.colab import auth
auth.authenticate_user()
Step 2: Generate Your First Response

Send a prompt to Gemini and get a response:
response = client.models.generate_content(
    model="gemini-2.0-flash-exp",
    contents="Explain how AI works in simple terms."
)

print(response.text)
That’s it! You should see a clear, friendly explanation of AI concepts.

Text Generation Examples

Here are practical examples of what you can do with text generation:

Simple Q&A

response = client.models.generate_content(
    model="gemini-2.0-flash-exp",
    contents="What are the key differences between Python and JavaScript?"
)

print(response.text)

Creative Writing

prompt = """
Write a short story about a robot learning to paint.
Make it heartwarming and under 200 words.
"""

response = client.models.generate_content(
    model="gemini-2.0-flash-exp",
    contents=prompt
)

print(response.text)

Streaming Responses

For longer outputs, stream the response to see results as they’re generated:
prompt = "Explain quantum computing in detail."

for chunk in client.models.generate_content_stream(
    model="gemini-2.0-flash-exp",
    contents=prompt
):
    print(chunk.text, end="")
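If you also need the complete text once the stream finishes, accumulate the chunks as you print them. Here is a minimal sketch: `collect_stream` is a hypothetical helper, assuming only that each chunk exposes a `.text` attribute (which may be `None` on some chunks):

```python
def collect_stream(chunks, on_text=lambda t: print(t, end="", flush=True)):
    """Print each streamed chunk as it arrives and return the full text.

    Works with any iterable of objects exposing a `.text` attribute,
    such as the chunks yielded by generate_content_stream.
    """
    parts = []
    for chunk in chunks:
        text = chunk.text or ""  # guard against chunks that carry no text
        on_text(text)
        parts.append(text)
    return "".join(parts)

# Example wiring (requires an initialized client):
# full_text = collect_stream(
#     client.models.generate_content_stream(
#         model="gemini-2.0-flash-exp",
#         contents="Explain quantum computing in detail."
#     )
# )
```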

Multimodal Example: Vision

Gemini’s real power comes from its multimodal capabilities. Let’s analyze an image:
response = client.models.generate_content(
    model="gemini-2.0-flash-exp",
    contents=[
        types.Part.from_uri(
            file_uri="gs://cloud-samples-data/generative-ai/image/meal.png",
            mime_type="image/png"
        ),
        "Describe this image and suggest a recipe."
    ]
)

print(response.text)

Example: Product Photo Analysis

Here’s a complete example that analyzes a product photo:
from google import genai
from google.genai import types

# Initialize client
client = genai.Client(
    vertexai=True,
    project="your-project-id",
    location="us-central1"
)

# Analyze a product image
response = client.models.generate_content(
    model="gemini-2.0-flash-exp",
    contents=[
        types.Part.from_uri(
            file_uri="gs://cloud-samples-data/generative-ai/image/a-man-and-a-dog.png",
            mime_type="image/png"
        ),
        """
        Analyze this image and provide:
        1. A detailed description
        2. The mood or atmosphere
        3. Potential use cases for marketing
        """
    ]
)

print(response.text)

Multi-Turn Conversations

Build conversational applications with chat history:
# Start a chat session
chat = client.chats.create(
    model="gemini-2.0-flash-exp"
)

# First message
response = chat.send_message(
    "I'm building a web app. Should I use React or Vue?"
)
print("Gemini:", response.text)

# Follow-up question
response = chat.send_message(
    "What about performance differences?"
)
print("Gemini:", response.text)

# The model remembers the context
response = chat.send_message(
    "Can you show me a simple component example?"
)
print("Gemini:", response.text)

Advanced Configuration

Control Output Quality

Customize the model’s behavior with configuration parameters:
response = client.models.generate_content(
    model="gemini-2.0-flash-exp",
    contents="Write a product description for a smart watch.",
    config=types.GenerateContentConfig(
        temperature=0.9,          # Higher = more creative (0.0-2.0)
        top_p=0.95,              # Nucleus sampling
        max_output_tokens=1024,   # Maximum length
    )
)

print(response.text)
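Because temperature, top_p, and output length interact, one lightweight pattern is to keep a few named parameter presets and unpack them into `types.GenerateContentConfig(**preset(...))`. The preset names and values below are illustrative choices, not official defaults:

```python
# Illustrative sampling presets; tune the values for your own use case.
PRESETS = {
    "precise": {"temperature": 0.1, "top_p": 0.8, "max_output_tokens": 1024},
    "balanced": {"temperature": 0.7, "top_p": 0.95, "max_output_tokens": 1024},
    "creative": {"temperature": 1.2, "top_p": 0.99, "max_output_tokens": 2048},
}

def preset(name, **overrides):
    """Return a parameter dict for the named preset, with optional overrides."""
    params = dict(PRESETS[name])
    params.update(overrides)
    return params

# Example wiring (requires an initialized client):
# config = types.GenerateContentConfig(**preset("creative", max_output_tokens=512))
```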

System Instructions

Set consistent behavior across all prompts:
system_instruction = """
You are a helpful coding assistant specializing in Python.
Always provide working code examples with comments.
Use best practices and explain your reasoning.
"""

response = client.models.generate_content(
    model="gemini-2.0-flash-exp",
    contents="How do I read a CSV file?",
    config=types.GenerateContentConfig(
        system_instruction=system_instruction
    )
)

print(response.text)

Safety Settings

Control content filtering for your application:
safety_settings = [
    types.SafetySetting(
        category=types.HarmCategory.HARM_CATEGORY_HATE_SPEECH,
        threshold=types.HarmBlockThreshold.BLOCK_MEDIUM_AND_ABOVE
    ),
    types.SafetySetting(
        category=types.HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT,
        threshold=types.HarmBlockThreshold.BLOCK_MEDIUM_AND_ABOVE
    )
]

response = client.models.generate_content(
    model="gemini-2.0-flash-exp",
    contents="Your prompt here",
    config=types.GenerateContentConfig(
        safety_settings=safety_settings
    )
)
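When a safety filter triggers, `response.text` can be empty and the candidate’s `finish_reason` indicates why. The defensive sketch below assumes the `GenerateContentResponse` shape (a `candidates` list whose entries carry a `finish_reason`); check the SDK reference for the exact enum values:

```python
def safe_text(response):
    """Return the response text, or a notice if generation was blocked.

    Duck-typed sketch: only assumes `response.candidates`, each with a
    `finish_reason`, plus `response.text`.
    """
    if not response.candidates:
        return "[No candidates returned; the prompt may have been blocked]"
    reason = str(response.candidates[0].finish_reason or "")
    if "SAFETY" in reason:
        return "[Response blocked by safety filters]"
    return response.text or ""

# print(safe_text(response))
```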

Complete Multimodal Example

Here’s a production-ready example combining multiple features:
from google import genai
from google.genai import types
import os

class GeminiAnalyzer:
    def __init__(self, project_id: str, location: str = "us-central1"):
        self.client = genai.Client(
            vertexai=True,
            project=project_id,
            location=location
        )
        self.model = "gemini-2.0-flash-exp"
    
    def analyze_image(self, image_uri: str, question: str) -> str:
        """Analyze an image with a custom question."""
        response = self.client.models.generate_content(
            model=self.model,
            contents=[
                types.Part.from_uri(
                    file_uri=image_uri,
                    mime_type="image/jpeg"
                ),
                question
            ],
            config=types.GenerateContentConfig(
                temperature=0.4,
                max_output_tokens=2048
            )
        )
        return response.text
    
    def chat(self, messages: list[str]) -> list[str]:
        """Have a multi-turn conversation."""
        chat = self.client.chats.create(model=self.model)
        responses = []
        
        for message in messages:
            response = chat.send_message(message)
            responses.append(response.text)
        
        return responses

# Usage
analyzer = GeminiAnalyzer(project_id="your-project-id")

# Analyze an image
result = analyzer.analyze_image(
    image_uri="gs://cloud-samples-data/generative-ai/image/meal.png",
    question="What ingredients would I need to recreate this dish?"
)
print(result)

# Have a conversation
responses = analyzer.chat([
    "What's the best way to learn Python?",
    "How long would that take?",
    "What resources do you recommend?"
])
for i, response in enumerate(responses, 1):
    print(f"Response {i}: {response}\n")

Error Handling

Always implement proper error handling in production:
from google.genai import errors

try:
    response = client.models.generate_content(
        model="gemini-2.0-flash-exp",
        contents="Your prompt here"
    )
    print(response.text)

except errors.APIError as e:
    if e.code == 429:
        print("Quota exceeded. Please try again later.")
    elif e.code in (401, 403):
        print("Permission denied. Check your authentication.")
    else:
        print(f"API error {e.code}: {e.message}")

except Exception as e:
    print(f"Unexpected error: {e}")
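Transient failures such as quota errors (HTTP 429) or server errors (5xx) often succeed on retry. Here is a generic sketch using only the standard library; pass whichever exception types you treat as transient via `retriable`:

```python
import random
import time

def with_retries(call, max_attempts=4, base_delay=1.0, retriable=(Exception,)):
    """Run call() with exponential backoff and jitter.

    Retries only the exception types listed in `retriable`; anything else
    (or the final failed attempt) propagates to the caller.
    """
    for attempt in range(max_attempts):
        try:
            return call()
        except retriable:
            if attempt == max_attempts - 1:
                raise
            # 2**attempt growth plus a random fraction to avoid thundering herds
            time.sleep(base_delay * (2 ** attempt + random.random()))

# Example wiring (requires an initialized client):
# text = with_retries(
#     lambda: client.models.generate_content(
#         model="gemini-2.0-flash-exp",
#         contents="Your prompt here"
#     ).text
# )
```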

What You’ve Learned

You now know how to:
  • Generate text with Gemini models
  • Stream responses for better UX
  • Process images and build multimodal applications
  • Maintain conversation context
  • Configure model parameters for quality and safety
  • Handle errors gracefully

Next Steps

Function Calling

Connect Gemini to external tools and APIs

Multimodal Guide

Work with video, audio, and PDFs

Prompt Engineering

Write better prompts for better results

Sample Applications

Explore production-ready examples

Additional Resources

This repository contains sample code for demonstrative purposes. For production applications, refer to the official Vertex AI documentation.