This quickstart guide will get you up and running with Gemini on Vertex AI. You’ll learn to generate text and build a multimodal application that processes both text and images.
Before starting, make sure you’ve completed the Environment Setup guide.

Your First Gemini Request

Let’s start with the simplest possible example: generating text from a prompt.
Step 1: Import and Initialize

Create a new Python file or notebook and set up the client:
from google import genai
from google.genai import types

# Initialize the client
PROJECT_ID = "your-project-id"  # Replace with your project ID
LOCATION = "us-central1"

client = genai.Client(
    vertexai=True,
    project=PROJECT_ID,
    location=LOCATION
)
If you’re running on Google Colab, add authentication first:
from google.colab import auth
auth.authenticate_user()
Step 2: Generate Your First Response

Send a prompt to Gemini and get a response:
response = client.models.generate_content(
    model="gemini-2.0-flash-exp",
    contents="Explain how AI works in simple terms."
)

print(response.text)
That’s it! You should see a clear, friendly explanation of AI concepts.

Text Generation Examples

Here are practical examples of what you can do with text generation:

Simple Q&A

response = client.models.generate_content(
    model="gemini-2.0-flash-exp",
    contents="What are the key differences between Python and JavaScript?"
)

print(response.text)

Creative Writing

prompt = """
Write a short story about a robot learning to paint.
Make it heartwarming and under 200 words.
"""

response = client.models.generate_content(
    model="gemini-2.0-flash-exp",
    contents=prompt
)

print(response.text)

Streaming Responses

For longer outputs, stream the response to see results as they’re generated:
prompt = "Explain quantum computing in detail."

for chunk in client.models.generate_content_stream(
    model="gemini-2.0-flash-exp",
    contents=prompt
):
    print(chunk.text, end="")
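If you also need the complete text once the stream finishes, accumulate the chunks as you print them. Here is a minimal sketch: `collect_stream` is a hypothetical helper, assuming only that each chunk exposes a `.text` attribute (which may be `None` on some chunks):

```python
def collect_stream(chunks, on_text=lambda t: print(t, end="", flush=True)):
    """Print each streamed chunk as it arrives and return the full text.

    Works with any iterable of objects exposing a `.text` attribute,
    such as the chunks yielded by generate_content_stream.
    """
    parts = []
    for chunk in chunks:
        text = chunk.text or ""  # guard against chunks that carry no text
        on_text(text)
        parts.append(text)
    return "".join(parts)

# Example wiring (requires an initialized client):
# full_text = collect_stream(
#     client.models.generate_content_stream(
#         model="gemini-2.0-flash-exp",
#         contents="Explain quantum computing in detail."
#     )
# )
```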

Multimodal Example: Vision

Gemini’s real power comes from its multimodal capabilities. Let’s analyze an image:
response = client.models.generate_content(
    model="gemini-2.0-flash-exp",
    contents=[
        types.Part.from_uri(
            file_uri="gs://cloud-samples-data/generative-ai/image/meal.png",
            mime_type="image/png"
        ),
        "Describe this image and suggest a recipe."
    ]
)

print(response.text)

Example: Product Photo Analysis

Here’s a complete example that analyzes a product photo:
from google import genai
from google.genai import types

# Initialize client
client = genai.Client(
    vertexai=True,
    project="your-project-id",
    location="us-central1"
)

# Analyze a product image
response = client.models.generate_content(
    model="gemini-2.0-flash-exp",
    contents=[
        types.Part.from_uri(
            file_uri="gs://cloud-samples-data/generative-ai/image/a-man-and-a-dog.png",
            mime_type="image/png"
        ),
        """
        Analyze this image and provide:
        1. A detailed description
        2. The mood or atmosphere
        3. Potential use cases for marketing
        """
    ]
)

print(response.text)

Multi-Turn Conversations

Build conversational applications with chat history:
# Start a chat session
chat = client.chats.create(
    model="gemini-2.0-flash-exp"
)

# First message
response = chat.send_message(
    "I'm building a web app. Should I use React or Vue?"
)
print("Gemini:", response.text)

# Follow-up question
response = chat.send_message(
    "What about performance differences?"
)
print("Gemini:", response.text)

# The model remembers the context
response = chat.send_message(
    "Can you show me a simple component example?"
)
print("Gemini:", response.text)

Advanced Configuration

Control Output Quality

Customize the model’s behavior with configuration parameters:
response = client.models.generate_content(
    model="gemini-2.0-flash-exp",
    contents="Write a product description for a smart watch.",
    config=types.GenerateContentConfig(
        temperature=0.9,          # Higher = more creative (0.0-2.0)
        top_p=0.95,              # Nucleus sampling
        max_output_tokens=1024,   # Maximum length
    )
)

print(response.text)
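Because temperature, top_p, and output length interact, one lightweight pattern is to keep a few named parameter presets and unpack them into `types.GenerateContentConfig(**preset(...))`. The preset names and values below are illustrative choices, not official defaults:

```python
# Illustrative sampling presets; tune the values for your own use case.
PRESETS = {
    "precise": {"temperature": 0.1, "top_p": 0.8, "max_output_tokens": 1024},
    "balanced": {"temperature": 0.7, "top_p": 0.95, "max_output_tokens": 1024},
    "creative": {"temperature": 1.2, "top_p": 0.99, "max_output_tokens": 2048},
}

def preset(name, **overrides):
    """Return a parameter dict for the named preset, with optional overrides."""
    params = dict(PRESETS[name])
    params.update(overrides)
    return params

# Example wiring (requires an initialized client):
# config = types.GenerateContentConfig(**preset("creative", max_output_tokens=512))
```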

System Instructions

Set consistent behavior across all prompts:
system_instruction = """
You are a helpful coding assistant specializing in Python.
Always provide working code examples with comments.
Use best practices and explain your reasoning.
"""

response = client.models.generate_content(
    model="gemini-2.0-flash-exp",
    contents="How do I read a CSV file?",
    config=types.GenerateContentConfig(
        system_instruction=system_instruction
    )
)

print(response.text)

Safety Settings

Control content filtering for your application:
safety_settings = [
    types.SafetySetting(
        category=types.HarmCategory.HARM_CATEGORY_HATE_SPEECH,
        threshold=types.HarmBlockThreshold.BLOCK_MEDIUM_AND_ABOVE
    ),
    types.SafetySetting(
        category=types.HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT,
        threshold=types.HarmBlockThreshold.BLOCK_MEDIUM_AND_ABOVE
    )
]

response = client.models.generate_content(
    model="gemini-2.0-flash-exp",
    contents="Your prompt here",
    config=types.GenerateContentConfig(
        safety_settings=safety_settings
    )
)
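When a safety filter triggers, `response.text` can be empty and the candidate’s `finish_reason` indicates why. The defensive sketch below assumes the `GenerateContentResponse` shape (a `candidates` list whose entries carry a `finish_reason`); check the SDK reference for the exact enum values:

```python
def safe_text(response):
    """Return the response text, or a notice if generation was blocked.

    Duck-typed sketch: only assumes `response.candidates`, each with a
    `finish_reason`, plus `response.text`.
    """
    if not response.candidates:
        return "[No candidates returned; the prompt may have been blocked]"
    reason = str(response.candidates[0].finish_reason or "")
    if "SAFETY" in reason:
        return "[Response blocked by safety filters]"
    return response.text or ""

# print(safe_text(response))
```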

Complete Multimodal Example

Here’s a production-ready example combining multiple features:
from google import genai
from google.genai import types
import os

class GeminiAnalyzer:
    def __init__(self, project_id: str, location: str = "us-central1"):
        self.client = genai.Client(
            vertexai=True,
            project=project_id,
            location=location
        )
        self.model = "gemini-2.0-flash-exp"
    
    def analyze_image(self, image_uri: str, question: str) -> str:
        """Analyze an image with a custom question."""
        response = self.client.models.generate_content(
            model=self.model,
            contents=[
                types.Part.from_uri(
                    file_uri=image_uri,
                    mime_type="image/jpeg"
                ),
                question
            ],
            config=types.GenerateContentConfig(
                temperature=0.4,
                max_output_tokens=2048
            )
        )
        return response.text
    
    def chat(self, messages: list[str]) -> list[str]:
        """Have a multi-turn conversation."""
        chat = self.client.chats.create(model=self.model)
        responses = []
        
        for message in messages:
            response = chat.send_message(message)
            responses.append(response.text)
        
        return responses

# Usage
analyzer = GeminiAnalyzer(project_id="your-project-id")

# Analyze an image
result = analyzer.analyze_image(
    image_uri="gs://cloud-samples-data/generative-ai/image/meal.png",
    question="What ingredients would I need to recreate this dish?"
)
print(result)

# Have a conversation
responses = analyzer.chat([
    "What's the best way to learn Python?",
    "How long would that take?",
    "What resources do you recommend?"
])
for i, response in enumerate(responses, 1):
    print(f"Response {i}: {response}\n")

Error Handling

Always implement proper error handling in production:
from google.genai import errors

try:
    response = client.models.generate_content(
        model="gemini-2.0-flash-exp",
        contents="Your prompt here"
    )
    print(response.text)

except errors.APIError as e:
    if e.code == 429:
        print("Quota exceeded. Please try again later.")
    elif e.code in (401, 403):
        print("Permission denied. Check your authentication.")
    else:
        print(f"API error {e.code}: {e.message}")

except Exception as e:
    print(f"Unexpected error: {e}")
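Transient failures such as quota errors (HTTP 429) or server errors (5xx) often succeed on retry. Here is a generic sketch using only the standard library; pass whichever exception types you treat as transient via `retriable`:

```python
import random
import time

def with_retries(call, max_attempts=4, base_delay=1.0, retriable=(Exception,)):
    """Run call() with exponential backoff and jitter.

    Retries only the exception types listed in `retriable`; anything else
    (or the final failed attempt) propagates to the caller.
    """
    for attempt in range(max_attempts):
        try:
            return call()
        except retriable:
            if attempt == max_attempts - 1:
                raise
            # 2**attempt growth plus a random fraction to avoid thundering herds
            time.sleep(base_delay * (2 ** attempt + random.random()))

# Example wiring (requires an initialized client):
# text = with_retries(
#     lambda: client.models.generate_content(
#         model="gemini-2.0-flash-exp",
#         contents="Your prompt here"
#     ).text
# )
```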

What You’ve Learned

You now know how to:
  • Generate text with Gemini models
  • Stream responses for better UX
  • Process images and build multimodal applications
  • Maintain conversation context
  • Configure model parameters for quality and safety
  • Handle errors gracefully

Next Steps

Function Calling

Connect Gemini to external tools and APIs

Multimodal Guide

Work with video, audio, and PDFs

Prompt Engineering

Write better prompts for better results

Sample Applications

Explore production-ready examples

Additional Resources

This repository contains sample code for demonstrative purposes. For production applications, refer to the official Vertex AI documentation.