Google Gemini Integration

The Google Gemini integration brings Google’s most advanced AI models to n8n, offering powerful multimodal capabilities that can understand and generate text, analyze images, process audio, and even work with video content.

Available Nodes

Google Gemini Node

Direct access to Gemini for text, image, audio, video, and document operations

Google Gemini Chat Model

Use Gemini with the AI Agent for advanced workflows with tools and memory

Prerequisites

Before you begin, you’ll need:
  • A Google Cloud account or Google AI Studio account
  • A Google Gemini API key
  • (Optional) Google Cloud project with Vertex AI enabled for production use

Setup

Step 1: Get Your API Key

Option 1: Google AI Studio (Recommended for getting started)
  1. Go to Google AI Studio
  2. Click Get API Key
  3. Create a new key or use an existing one
  4. Copy the API key
Option 2: Google Cloud Console (For production)
  1. Go to Google Cloud Console
  2. Enable the Vertex AI API
  3. Create credentials for the API
  4. Copy the API key
Step 2: Configure in n8n

  1. Add a Google Gemini node to your workflow
  2. Click the Credential to connect with dropdown
  3. Select Create New Credential
  4. Enter your API key
  5. (Optional) Set custom host URL for Vertex AI
  6. Click Save
Step 3: Test the Connection

Send a simple text message to verify your credentials are working correctly.
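One way to verify the key outside n8n is to call the Generative Language REST API directly. The sketch below builds a minimal generateContent request using only Python's standard library; the placeholder API_KEY must be replaced with a real key before the request is actually sent.

```python
import json
import urllib.request

API_KEY = "YOUR_API_KEY"  # replace with your Google AI Studio key
MODEL = "gemini-2.0-flash"
ENDPOINT = (
    "https://generativelanguage.googleapis.com/v1beta/models/"
    f"{MODEL}:generateContent?key={API_KEY}"
)

# Minimal generateContent payload: one user turn with a single text part.
payload = {"contents": [{"role": "user", "parts": [{"text": "Say hello"}]}]}

request = urllib.request.Request(
    ENDPOINT,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

if API_KEY != "YOUR_API_KEY":  # only send once a real key is set
    with urllib.request.urlopen(request) as response:
        print(json.load(response))
```

A 200 response with a candidates array confirms the key works; a 400 or 403 usually means the key is wrong or the API is not enabled for your project.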

Google Gemini Node

The Google Gemini node provides comprehensive access to Gemini’s multimodal capabilities across multiple resources.

Available Resources

Text

Send messages to Gemini and receive intelligent responses.
Operations:
  • Message: Send prompts and get responses from Gemini
Features:
  • Multi-turn conversations
  • System instructions support
  • Tool/function calling
  • JSON mode for structured output
  • Safety settings configuration
Example Configuration:
{
  "resource": "text",
  "operation": "message",
  "modelId": "gemini-2.0-flash",
  "messages": [
    {
      "content": "Explain how neural networks work in simple terms",
      "role": "user"
    }
  ]
}
Available Roles:
  • User: Send messages as the user
  • Model: Supply Gemini’s prior responses when building multi-turn conversations
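In multi-turn use, the messages array alternates between the two roles. A small helper (illustrative, not part of the node) can enforce that ordering while assembling the conversation history:

```python
def add_turn(messages, role, text):
    """Append a turn; Gemini expects 'user' and 'model' roles to alternate."""
    if messages and messages[-1]["role"] == role:
        raise ValueError("consecutive turns must alternate roles")
    messages.append({"role": role, "content": text})
    return messages

history = []
add_turn(history, "user", "What is a neural network?")
add_turn(history, "model", "A layered function that learns patterns from data.")
add_turn(history, "user", "Summarize that in one line.")
```

Each new user prompt goes on the end of the list, so Gemini sees the full exchange as context.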

Gemini Models

Google offers several Gemini models with different capabilities:
Model            | Best For             | Context Window | Key Features
gemini-2.0-flash | Latest, fastest      | 1M tokens      | Multimodal, fast responses, cost-effective
gemini-1.5-pro   | Advanced reasoning   | 2M tokens      | Best quality, longest context, video understanding
gemini-1.5-flash | Balanced performance | 1M tokens      | Fast, multimodal, good quality
gemini-1.0-pro   | Legacy tasks         | 32K tokens     | Text-only, baseline model
Gemini 1.5 Pro offers one of the longest context windows available at 2 million tokens, enough to process entire codebases, long videos, and large document collections.

Advanced Features

Tool Use (Function Calling)

Connect tools to Gemini for dynamic interactions:
  1. Connect tool nodes to the Tools input
  2. Gemini will automatically decide when to use tools
  3. Tools execute and return results
  4. Gemini incorporates results in its response
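Conceptually, the loop above can be sketched as follows; the get_weather tool and the tool_call dict are hypothetical stand-ins, not the node's actual wire format:

```python
# Hypothetical tool registry mapping tool names to implementations.
def get_weather(city):
    return {"city": city, "forecast": "sunny"}  # stand-in for a real lookup

TOOLS = {"get_weather": get_weather}

def run_tool_call(tool_call):
    """Execute the tool the model asked for and return its result."""
    fn = TOOLS[tool_call["name"]]
    return fn(**tool_call["args"])

# Simulated model decision: Gemini chose the get_weather tool.
tool_call = {"name": "get_weather", "args": {"city": "Berlin"}}
result = run_tool_call(tool_call)
# `result` would be sent back to Gemini to incorporate in its final answer.
```

In n8n this dispatch happens inside the node; the sketch only shows why clear tool names and argument schemas matter, since they are all the model sees when deciding what to call.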

Built-in Tools

Gemini supports built-in tools for specific capabilities.

Code Execution: Allow Gemini to write and run Python code:
{
  "builtInTools": {
    "codeExecution": true
  }
}
Google Search: Enable Gemini to search the web for current information:
{
  "builtInTools": {
    "googleSearch": true
  }
}

JSON Mode

Request structured JSON output:
{
  "responseFormat": "json",
  "prompt": "Extract the name, email, and phone from this text as JSON"
}
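Downstream nodes still need to parse the returned text into an object. A defensive parser helps, since models occasionally wrap JSON output in a markdown code fence (a sketch, not part of the node):

```python
import json

FENCE = "`" * 3  # markdown code fence models sometimes wrap JSON in

def parse_structured(response_text):
    """Parse a JSON-mode response, tolerating an optional markdown fence."""
    text = response_text.strip()
    if text.startswith(FENCE):
        text = text.split("\n", 1)[1]    # drop the opening fence line
        text = text.rsplit(FENCE, 1)[0]  # drop the closing fence
    return json.loads(text)

raw = FENCE + 'json\n{"name": "Ada", "email": "ada@example.com"}\n' + FENCE
record = parse_structured(raw)
```

If parsing still fails, tighten the prompt by describing the exact fields and types you expect.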

Safety Settings

Control content safety thresholds:
{
  "safetySettings": [
    {
      "category": "HARM_CATEGORY_HARASSMENT",
      "threshold": "BLOCK_MEDIUM_AND_ABOVE"
    }
  ]
}

System Instructions

Set Gemini’s behavior and context:
{
  "systemInstruction": "You are a helpful data analyst specializing in Python and SQL. Always provide code examples with explanations."
}

Google Gemini Chat Model

The Google Gemini Chat Model node is designed for use with LangChain components, particularly the AI Agent.

Setup with AI Agent

Step 1: Add Chat Model

Add the Google Gemini Chat Model node to your workflow.
Step 2: Select Model

Choose the appropriate Gemini model:
  • gemini-2.0-flash: Latest, fastest, great for most tasks
  • gemini-1.5-pro: Maximum capability, longest context
  • gemini-1.5-flash: Balanced speed and quality
Step 3: Configure Parameters

Set temperature, max tokens, and other options:
{
  "temperature": 0.7,
  "maxTokens": 8192,
  "topP": 0.95,
  "topK": 40
}
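Temperature is the parameter with the most visible effect: it rescales the model's logits before sampling, so low values sharpen the output distribution and high values flatten it. A minimal illustration of that mechanism (a conceptual sketch, not Gemini's internals):

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax: low T sharpens, high T flattens."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]
sharp = softmax(logits, temperature=0.2)  # nearly deterministic
flat = softmax(logits, temperature=2.0)   # closer to uniform
```

topP and topK then trim the tail of that distribution before sampling, which is why low temperature plus low topP gives the most repeatable outputs.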
Step 4: Connect to AI Agent

Wire the chat model's output to the AI Agent's Chat Model input.

Model Parameters

{
  "model": "gemini-2.0-flash",
  "temperature": 0.7,
  "maxTokens": 8192
}

Common Use Cases

1. Video Content Analysis

Analyze video content automatically.

2. Multimodal Customer Support

Handle text, image, and document queries.

3. Document Processing Pipeline

Extract and process document data.

4. Audio Transcription Workflow

Transcribe and analyze audio.

5. Retrieval-Augmented Generation

Build a retrieval-augmented generation system.

Best Practices

1. Choose the Right Model

  • gemini-2.0-flash: Fast responses, most tasks
  • gemini-1.5-pro: Complex reasoning, long context
  • gemini-1.5-flash: Balanced performance
2. Leverage Multimodal Capabilities

  • Combine text, images, audio, and video in single prompts
  • Use video understanding for long-form content
  • Process documents with visual elements effectively
3. Optimize Context Usage

  • Gemini supports massive context (up to 2M tokens)
  • Use for long documents and entire codebases
  • Consider chunking only for processing speed
4. Use Built-in Tools

  • Enable code execution for math and data analysis
  • Use Google Search for current information
  • Combine with custom tools for powerful agents
5. Configure Safety Settings

  • Set appropriate thresholds for your use case
  • Monitor filtered responses
  • Adjust as needed for your application

Troubleshooting

Rate Limits

If you encounter rate limits:
  1. Implement exponential backoff
  2. Reduce request frequency
  3. Upgrade to higher quota tier
  4. Use batch processing
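Step 1 can be sketched like this; RuntimeError stands in for whatever rate-limit (HTTP 429) error your HTTP client actually raises:

```python
import random
import time

def with_backoff(call, max_retries=5, base_delay=1.0):
    """Retry `call` with exponential backoff plus a little jitter."""
    for attempt in range(max_retries):
        try:
            return call()
        except RuntimeError:  # stand-in for a 429 rate-limit error
            if attempt == max_retries - 1:
                raise  # out of retries; surface the error
            # Delay doubles each attempt: base, 2x, 4x, ... plus jitter.
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.1)
            time.sleep(delay)
```

The jitter spreads retries out so that many workflows hitting the same limit don't all retry at the same instant.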

Context Length Errors

If inputs are too long:
  1. Check total token count
  2. Use Gemini 1.5 Pro for longer context (2M tokens)
  3. Chunk inputs if necessary
  4. Remove redundant information
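Steps 1 and 3 can be combined in a rough chunking helper, assuming ~4 characters per token (an approximation, not the actual Gemini tokenizer):

```python
def estimate_tokens(text):
    # Rough heuristic: ~4 characters per token for English text.
    # An approximation only -- the real tokenizer will differ.
    return len(text) // 4

def chunk_text(text, max_tokens=1_000_000, overlap_chars=200):
    """Split text into chunks that fit the model's context window,
    with a small character overlap so content isn't cut cleanly off."""
    max_chars = max_tokens * 4
    chunks = []
    start = 0
    while start < len(text):
        end = min(start + max_chars, len(text))
        chunks.append(text[start:end])
        if end == len(text):
            break
        start = end - overlap_chars  # back up so chunks share context
    return chunks

chunks = chunk_text("a" * 1000, max_tokens=100, overlap_chars=50)
```

Leave generous headroom below the advertised limit, since the response and any system instructions count against the same window.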

Media Processing Errors

If media files fail to process:
  1. Verify file format is supported
  2. Check file size limits
  3. Upload large files using the File API first
  4. Ensure proper encoding
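Steps 1-3 amount to a pre-flight check before sending media. In this sketch the extension list and the 20 MB inline limit are assumptions; check Google's current documentation for the formats and size limits that apply to your media type:

```python
import os

# Illustrative values only -- verify against Google's current limits.
SUPPORTED = {".png", ".jpg", ".jpeg", ".webp", ".mp3", ".wav", ".mp4", ".pdf"}
INLINE_LIMIT_BYTES = 20 * 1024 * 1024  # larger files go through the File API

def preflight(path, size_bytes):
    """Return a disposition for a media file before sending it to Gemini."""
    ext = os.path.splitext(path)[1].lower()
    if ext not in SUPPORTED:
        return "unsupported format"
    if size_bytes > INLINE_LIMIT_BYTES:
        return "upload via File API"
    return "ok"
```

Running this check in a Code node before the Gemini node turns opaque media errors into actionable messages.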

Tool Calling Issues

If tools aren’t working:
  1. Verify tool connections
  2. Check tool descriptions are clear
  3. Test tools independently
  4. Review tool output format

Safety Filter Blocks

If responses are filtered:
  1. Review safety settings
  2. Adjust thresholds if appropriate
  3. Rephrase prompts
  4. Check content guidelines