Genkit provides a unified API for working with AI models from different providers. Whether you’re using Gemini, Claude, GPT, Llama, or any other model, the interface is the same.

Model Abstraction

Genkit abstracts away provider-specific APIs into a single, consistent interface:
// Same API works for any model
const response = await ai.generate({
  model: 'googleai/gemini-2.0-flash',  // or anthropic/claude-3-5-sonnet
  prompt: 'Explain quantum computing',
});
This abstraction means:
  • Switch providers easily: Change one line to try different models
  • Multi-model workflows: Use different models for different tasks
  • Consistent error handling: Same error types across providers
  • Unified tracing: All model calls appear the same in traces

Model References

Models are referenced by a namespace/name format:
[plugin-namespace]/[model-name]
Examples:
  • googleai/gemini-2.0-flash
  • anthropic/claude-3-5-sonnet
  • ollama/llama2
  • vertexai/gemini-1.5-pro
Register provider plugins, then reference their models with typed helpers:
import { genkit } from 'genkit';
import { googleAI } from '@genkit-ai/google-genai';
import { anthropic } from '@genkit-ai/anthropic';

const ai = genkit({
  plugins: [googleAI(), anthropic()],
});

// Use Gemini
const geminiResponse = await ai.generate({
  model: googleAI.model('gemini-2.0-flash'),
  prompt: 'Write a haiku',
});

// Use Claude
const claudeResponse = await ai.generate({
  model: anthropic.model('claude-3-5-sonnet'),
  prompt: 'Write a haiku',
});
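The namespace/name convention above can be split apart with a small helper. This is purely illustrative (Genkit resolves references internally; `parseModelRef` is a hypothetical name):

```typescript
// Split a model reference like 'googleai/gemini-2.0-flash' into its parts.
// Hypothetical helper for illustration; not part of the Genkit API.
function parseModelRef(ref: string): { namespace: string; name: string } {
  const slash = ref.indexOf('/');
  if (slash === -1) {
    throw new Error(`Invalid model reference: ${ref}`);
  }
  return {
    namespace: ref.slice(0, slash),   // the plugin, e.g. 'googleai'
    name: ref.slice(slash + 1),       // the model, e.g. 'gemini-2.0-flash'
  };
}

console.log(parseModelRef('googleai/gemini-2.0-flash'));
// { namespace: 'googleai', name: 'gemini-2.0-flash' }
```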

Generating Content

Basic Text Generation

const { text } = await ai.generate({
  model: googleAI.model('gemini-2.0-flash'),
  prompt: 'Explain REST APIs in simple terms',
});

console.log(text);

Structured Output

Request JSON output that matches a schema:
import { z } from 'genkit';

const RecipeSchema = z.object({
  name: z.string(),
  ingredients: z.array(z.string()),
  steps: z.array(z.string()),
  prepTime: z.string(),
});

const { output } = await ai.generate({
  model: googleAI.model('gemini-2.0-flash'),
  prompt: 'Create a recipe for chocolate chip cookies',
  output: { schema: RecipeSchema },
});

console.log(output.name);        // Typed!
console.log(output.ingredients); // Typed!
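What the schema buys you is a guarantee about the shape of the parsed JSON. The check below is a minimal sketch of that guarantee written by hand, without zod, so you can see exactly what is being validated (`isRecipe` is a hypothetical name, not Genkit code):

```typescript
// Hand-rolled shape check equivalent to the RecipeSchema above.
// Sketch only; Genkit performs this validation for you via the schema.
type Recipe = {
  name: string;
  ingredients: string[];
  steps: string[];
  prepTime: string;
};

function isRecipe(value: unknown): value is Recipe {
  if (typeof value !== 'object' || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.name === 'string' &&
    Array.isArray(v.ingredients) && v.ingredients.every((i) => typeof i === 'string') &&
    Array.isArray(v.steps) && v.steps.every((s) => typeof s === 'string') &&
    typeof v.prepTime === 'string'
  );
}

const raw = JSON.parse(
  '{"name":"Cookies","ingredients":["flour"],"steps":["mix"],"prepTime":"15 min"}'
);
console.log(isRecipe(raw)); // true
```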

Multimodal Input

Send images, audio, and video to multimodal models:
const { text } = await ai.generate({
  model: googleAI.model('gemini-2.0-flash'),
  prompt: [
    { text: 'What is in this image?' },
    { media: { url: 'https://example.com/image.jpg' } },
  ],
});

Model Configuration

Configure model behavior with parameters:
const { text } = await ai.generate({
  model: googleAI.model('gemini-2.0-flash'),
  prompt: 'Write a creative story',
  config: {
    temperature: 1.2,      // Higher = more creative
    topK: 40,              // Consider top 40 tokens
    topP: 0.95,            // Nucleus sampling threshold
    maxOutputTokens: 1000, // Limit response length
  },
});
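To make the topK/topP comments concrete, here is a sketch of how these two parameters narrow the pool of candidate tokens on a toy distribution. This is an assumption about the common interpretation (top-k cut first, then nucleus cut), not Genkit or provider code:

```typescript
// Illustrative only: how topK and topP shrink the sampling pool.
// Providers apply these inside the model server, not in your app.
function samplingPool(
  probs: Record<string, number>,
  topK: number,
  topP: number,
): string[] {
  // Keep the K most probable tokens.
  const sorted = Object.entries(probs)
    .sort((a, b) => b[1] - a[1])
    .slice(0, topK);
  // Then keep the smallest prefix whose cumulative probability reaches topP.
  const pool: string[] = [];
  let cumulative = 0;
  for (const [token, p] of sorted) {
    pool.push(token);
    cumulative += p;
    if (cumulative >= topP) break;
  }
  return pool;
}

console.log(samplingPool({ the: 0.5, a: 0.3, an: 0.15, xyzzy: 0.05 }, 3, 0.9));
// ['the', 'a', 'an'] — 'xyzzy' is cut by topK; cumulative 0.95 ≥ 0.9 stops there
```

A higher temperature then flattens the probabilities within that pool, which is why it reads as "more creative".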

Default Configuration

Set defaults at the Genkit level:
const ai = genkit({
  plugins: [googleAI()],
  model: googleAI.model('gemini-2.0-flash', {
    temperature: 0.7,
    topK: 40,
  }),
});

// Uses default config
const response = await ai.generate({
  prompt: 'Hello!',
});

Tool Calling

Models can call functions (tools) to extend their capabilities:
const getWeatherTool = ai.defineTool(
  {
    name: 'getWeather',
    description: 'Get current weather for a city',
    inputSchema: z.object({ city: z.string() }),
    outputSchema: z.string(),
  },
  async ({ city }) => {
    // Call weather API...
    return `Weather in ${city}: Sunny, 72°F`;
  }
);

const { text } = await ai.generate({
  model: googleAI.model('gemini-2.0-flash'),
  prompt: 'What is the weather in Paris?',
  tools: [getWeatherTool],
});

// Model decides to call getWeather, gets result, and responds:
// "The weather in Paris is currently sunny with a temperature of 72°F."
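Under the hood, the framework handles a loop like the one sketched below: the model emits a structured tool request, the matching handler runs, and the result is fed back to the model. This is a simplified, hypothetical sketch, not Genkit's actual internals:

```typescript
// Simplified sketch of tool dispatch (hypothetical; Genkit does this for you).
const toolRegistry: Record<string, (input: Record<string, unknown>) => string> = {
  getWeather: ({ city }) => `Weather in ${city}: Sunny, 72°F`,
};

// Pretend the model responded with this structured tool request.
const modelToolRequest = { name: 'getWeather', input: { city: 'Paris' } };

function dispatch(req: { name: string; input: Record<string, unknown> }): string {
  const handler = toolRegistry[req.name];
  if (!handler) throw new Error(`Unknown tool: ${req.name}`);
  return handler(req.input); // result goes back to the model for the final answer
}

console.log(dispatch(modelToolRequest)); // Weather in Paris: Sunny, 72°F
```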

Streaming Responses

Stream responses as they’re generated:
const { stream, response } = ai.generateStream({
  model: googleAI.model('gemini-2.0-flash'),
  prompt: 'Write a long story about space exploration',
});

// Stream chunks as they arrive
for await (const chunk of stream) {
  console.log(chunk.text);
}

// Or wait for the complete response
const final = await response;
console.log(final.text);
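The streaming contract above (an async iterable of partial chunks, plus a promise for the full result) can be simulated without Genkit. A minimal sketch, with a fake chunk source standing in for the model:

```typescript
// Simulated chunk stream standing in for Genkit's `stream`; sketch only.
async function* fakeStream() {
  for (const piece of ['Once ', 'upon ', 'a time...']) {
    yield { text: piece };
  }
}

async function collect(): Promise<string> {
  let full = '';
  for await (const chunk of fakeStream()) {
    full += chunk.text; // render each chunk as it arrives
  }
  return full;
}

console.log(await collect()); // 'Once upon a time...'
```

The same `for await` loop works on the real `stream`, which makes it easy to pipe chunks to a UI as they arrive.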

Available Model Providers

Official Providers

  • Google AI: @genkit-ai/google-genai (JS), genkit.plugins.google_genai (Python). Models: Gemini 2.0 Flash, Gemini 1.5 Pro, Imagen, Veo
  • Anthropic: @genkit-ai/anthropic (JS), genkit.plugins.anthropic (Python). Models: Claude 3.5 Sonnet, Claude 3 Opus
  • Vertex AI: @genkit-ai/vertexai (JS), genkit.plugins.vertex_ai (Python). Models: Model Garden (1000+ models)
  • Ollama: @genkit-ai/ollama (JS), genkit.plugins.ollama (Python). Models: Llama, Mistral, CodeLlama (local)
  • OpenAI-compatible: @genkit-ai/compat-oai (JS), genkit.plugins.compat_oai (Python). Models: any OpenAI-compatible API

Community Providers

  • Amazon Bedrock: Claude, Llama, Titan models
  • Mistral AI: Mistral, Mixtral models
  • Cohere: Command models + reranking
  • DeepSeek: DeepSeek models
  • xAI: Grok models
  • HuggingFace: Inference API models
  • Cloudflare Workers AI: Edge AI models
  • Azure AI Foundry: 11,000+ models

Model Middleware

Add behavior to model calls with middleware:
import { retry } from 'genkit/model/middleware';

const { text } = await ai.generate({
  model: googleAI.model('gemini-2.0-flash'),
  prompt: 'Hello!',
  use: [
    retry({
      maxRetries: 3,
      initialDelayMs: 1000,
      backoffFactor: 2,
    }),
  ],
});
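Assuming the conventional reading of those parameters (delay = initialDelayMs × backoffFactor^attempt), the retry schedule works out as sketched below. This is illustrative math, not the middleware's implementation:

```typescript
// Retry delays implied by exponential backoff (illustrative sketch).
function backoffDelays(
  maxRetries: number,
  initialDelayMs: number,
  backoffFactor: number,
): number[] {
  const delays: number[] = [];
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    delays.push(initialDelayMs * backoffFactor ** attempt);
  }
  return delays;
}

console.log(backoffDelays(3, 1000, 2)); // [1000, 2000, 4000]
```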
Common middleware:
  • Retry: Automatic retry with exponential backoff
  • Caching: Cache responses for identical requests
  • Safety: Filter harmful content
  • Logging: Log all requests/responses
  • Custom: Build your own

Response Metadata

All responses include metadata:
const response = await ai.generate({
  model: googleAI.model('gemini-2.0-flash'),
  prompt: 'Hello!',
});

console.log(response.text);          // Generated text
console.log(response.usage);         // Token usage stats
console.log(response.finishReason);  // Why generation stopped
console.log(response.latencyMs);     // Request duration
console.log(response.custom);        // Provider-specific metadata

Next Steps

  • Learn about Prompts - managing prompt templates
  • Explore Tools - extending models with functions
  • See Flows - building multi-step AI workflows
