Genkit provides a unified API for working with AI models from different providers. Whether you’re using Gemini, Claude, GPT, Llama, or any other model, the interface is the same.

Model Abstraction

Genkit abstracts away provider-specific APIs into a single, consistent interface:
// Same API works for any model
const response = await ai.generate({
  model: 'googleai/gemini-2.0-flash',  // or anthropic/claude-3-5-sonnet
  prompt: 'Explain quantum computing',
});
This abstraction means:
  • Switch providers easily: Change one line to try different models
  • Multi-model workflows: Use different models for different tasks
  • Consistent error handling: Same error types across providers
  • Unified tracing: All model calls appear the same in traces

Model References

Models are referenced by a namespace/name format:
[plugin-namespace]/[model-name]
Examples:
  • googleai/gemini-2.0-flash
  • anthropic/claude-3-5-sonnet
  • ollama/llama2
  • vertexai/gemini-1.5-pro
Register provider plugins, then reference their models with typed helpers:
import { genkit } from 'genkit';
import { googleAI } from '@genkit-ai/google-genai';
import { anthropic } from '@genkit-ai/anthropic';

const ai = genkit({
  plugins: [googleAI(), anthropic()],
});

// Use Gemini
const geminiResponse = await ai.generate({
  model: googleAI.model('gemini-2.0-flash'),
  prompt: 'Write a haiku',
});

// Use Claude
const claudeResponse = await ai.generate({
  model: anthropic.model('claude-3-5-sonnet'),
  prompt: 'Write a haiku',
});
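The namespace/name convention above can be split apart with a small helper. This is purely illustrative (Genkit resolves references internally; `parseModelRef` is a hypothetical name):

```typescript
// Split a model reference like 'googleai/gemini-2.0-flash' into its parts.
// Hypothetical helper for illustration; not part of the Genkit API.
function parseModelRef(ref: string): { namespace: string; name: string } {
  const slash = ref.indexOf('/');
  if (slash === -1) {
    throw new Error(`Invalid model reference: ${ref}`);
  }
  return {
    namespace: ref.slice(0, slash),   // the plugin, e.g. 'googleai'
    name: ref.slice(slash + 1),       // the model, e.g. 'gemini-2.0-flash'
  };
}

console.log(parseModelRef('googleai/gemini-2.0-flash'));
// { namespace: 'googleai', name: 'gemini-2.0-flash' }
```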

Generating Content

Basic Text Generation

const { text } = await ai.generate({
  model: googleAI.model('gemini-2.0-flash'),
  prompt: 'Explain REST APIs in simple terms',
});

console.log(text);

Structured Output

Request JSON output that matches a schema:
import { z } from 'genkit';

const RecipeSchema = z.object({
  name: z.string(),
  ingredients: z.array(z.string()),
  steps: z.array(z.string()),
  prepTime: z.string(),
});

const { output } = await ai.generate({
  model: googleAI.model('gemini-2.0-flash'),
  prompt: 'Create a recipe for chocolate chip cookies',
  output: { schema: RecipeSchema },
});

console.log(output.name);        // Typed!
console.log(output.ingredients); // Typed!
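What the schema buys you is a guarantee about the shape of the parsed JSON. The check below is a minimal sketch of that guarantee written by hand, without zod, so you can see exactly what is being validated (`isRecipe` is a hypothetical name, not Genkit code):

```typescript
// Hand-rolled shape check equivalent to the RecipeSchema above.
// Sketch only; Genkit performs this validation for you via the schema.
type Recipe = {
  name: string;
  ingredients: string[];
  steps: string[];
  prepTime: string;
};

function isRecipe(value: unknown): value is Recipe {
  if (typeof value !== 'object' || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.name === 'string' &&
    Array.isArray(v.ingredients) && v.ingredients.every((i) => typeof i === 'string') &&
    Array.isArray(v.steps) && v.steps.every((s) => typeof s === 'string') &&
    typeof v.prepTime === 'string'
  );
}

const raw = JSON.parse(
  '{"name":"Cookies","ingredients":["flour"],"steps":["mix"],"prepTime":"15 min"}'
);
console.log(isRecipe(raw)); // true
```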

Multimodal Input

Send images, audio, and video to multimodal models:
const { text } = await ai.generate({
  model: googleAI.model('gemini-2.0-flash'),
  prompt: [
    { text: 'What is in this image?' },
    { media: { url: 'https://example.com/image.jpg' } },
  ],
});

Model Configuration

Configure model behavior with parameters:
const { text } = await ai.generate({
  model: googleAI.model('gemini-2.0-flash'),
  prompt: 'Write a creative story',
  config: {
    temperature: 1.2,      // Higher = more creative
    topK: 40,              // Consider top 40 tokens
    topP: 0.95,            // Nucleus sampling threshold
    maxOutputTokens: 1000, // Limit response length
  },
});
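To make the topK/topP comments concrete, here is a sketch of how these two parameters narrow the pool of candidate tokens on a toy distribution. This is an assumption about the common interpretation (top-k cut first, then nucleus cut), not Genkit or provider code:

```typescript
// Illustrative only: how topK and topP shrink the sampling pool.
// Providers apply these inside the model server, not in your app.
function samplingPool(
  probs: Record<string, number>,
  topK: number,
  topP: number,
): string[] {
  // Keep the K most probable tokens.
  const sorted = Object.entries(probs)
    .sort((a, b) => b[1] - a[1])
    .slice(0, topK);
  // Then keep the smallest prefix whose cumulative probability reaches topP.
  const pool: string[] = [];
  let cumulative = 0;
  for (const [token, p] of sorted) {
    pool.push(token);
    cumulative += p;
    if (cumulative >= topP) break;
  }
  return pool;
}

console.log(samplingPool({ the: 0.5, a: 0.3, an: 0.15, xyzzy: 0.05 }, 3, 0.9));
// ['the', 'a', 'an'] — 'xyzzy' is cut by topK; cumulative 0.95 ≥ 0.9 stops there
```

A higher temperature then flattens the probabilities within that pool, which is why it reads as "more creative".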

Default Configuration

Set defaults at the Genkit level:
const ai = genkit({
  plugins: [googleAI()],
  model: googleAI.model('gemini-2.0-flash', {
    temperature: 0.7,
    topK: 40,
  }),
});

// Uses default config
const response = await ai.generate({
  prompt: 'Hello!',
});

Tool Calling

Models can call functions (tools) to extend their capabilities:
const getWeatherTool = ai.defineTool(
  {
    name: 'getWeather',
    description: 'Get current weather for a city',
    inputSchema: z.object({ city: z.string() }),
    outputSchema: z.string(),
  },
  async ({ city }) => {
    // Call weather API...
    return `Weather in ${city}: Sunny, 72°F`;
  }
);

const { text } = await ai.generate({
  model: googleAI.model('gemini-2.0-flash'),
  prompt: 'What is the weather in Paris?',
  tools: [getWeatherTool],
});

// Model decides to call getWeather, gets result, and responds:
// "The weather in Paris is currently sunny with a temperature of 72°F."
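Under the hood, the framework handles a loop like the one sketched below: the model emits a structured tool request, the matching handler runs, and the result is fed back to the model. This is a simplified, hypothetical sketch, not Genkit's actual internals:

```typescript
// Simplified sketch of tool dispatch (hypothetical; Genkit does this for you).
const toolRegistry: Record<string, (input: Record<string, unknown>) => string> = {
  getWeather: ({ city }) => `Weather in ${city}: Sunny, 72°F`,
};

// Pretend the model responded with this structured tool request.
const modelToolRequest = { name: 'getWeather', input: { city: 'Paris' } };

function dispatch(req: { name: string; input: Record<string, unknown> }): string {
  const handler = toolRegistry[req.name];
  if (!handler) throw new Error(`Unknown tool: ${req.name}`);
  return handler(req.input); // result goes back to the model for the final answer
}

console.log(dispatch(modelToolRequest)); // Weather in Paris: Sunny, 72°F
```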

Streaming Responses

Stream responses as they’re generated:
const { stream, response } = ai.generateStream({
  model: googleAI.model('gemini-2.0-flash'),
  prompt: 'Write a long story about space exploration',
});

// Stream chunks as they arrive
for await (const chunk of stream) {
  console.log(chunk.text);
}

// Or wait for the complete response
const final = await response;
console.log(final.text);
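The streaming contract above (an async iterable of partial chunks, plus a promise for the full result) can be simulated without Genkit. A minimal sketch, with a fake chunk source standing in for the model:

```typescript
// Simulated chunk stream standing in for Genkit's `stream`; sketch only.
async function* fakeStream() {
  for (const piece of ['Once ', 'upon ', 'a time...']) {
    yield { text: piece };
  }
}

async function collect(): Promise<string> {
  let full = '';
  for await (const chunk of fakeStream()) {
    full += chunk.text; // render each chunk as it arrives
  }
  return full;
}

console.log(await collect()); // 'Once upon a time...'
```

The same `for await` loop works on the real `stream`, which makes it easy to pipe chunks to a UI as they arrive.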

Available Model Providers

Official Providers

  • Google AI: @genkit-ai/google-genai (JS), genkit.plugins.google_genai (Python). Models: Gemini 2.0 Flash, Gemini 1.5 Pro, Imagen, Veo
  • Anthropic: @genkit-ai/anthropic (JS), genkit.plugins.anthropic (Python). Models: Claude 3.5 Sonnet, Claude 3 Opus
  • Vertex AI: @genkit-ai/vertexai (JS), genkit.plugins.vertex_ai (Python). Models: Model Garden (1000+ models)
  • Ollama: @genkit-ai/ollama (JS), genkit.plugins.ollama (Python). Models: Llama, Mistral, CodeLlama (local)
  • OpenAI-compatible: @genkit-ai/compat-oai (JS), genkit.plugins.compat_oai (Python). Models: any OpenAI-compatible API

Community Providers

  • Amazon Bedrock: Claude, Llama, Titan models
  • Mistral AI: Mistral, Mixtral models
  • Cohere: Command models + reranking
  • DeepSeek: DeepSeek models
  • xAI: Grok models
  • HuggingFace: Inference API models
  • Cloudflare Workers AI: Edge AI models
  • Azure AI Foundry: 11,000+ models

Model Middleware

Add behavior to model calls with middleware:
import { retry } from 'genkit/model/middleware';

const { text } = await ai.generate({
  model: googleAI.model('gemini-2.0-flash'),
  prompt: 'Hello!',
  use: [
    retry({
      maxRetries: 3,
      initialDelayMs: 1000,
      backoffFactor: 2,
    }),
  ],
});
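Assuming the conventional reading of those parameters (delay = initialDelayMs × backoffFactor^attempt), the retry schedule works out as sketched below. This is illustrative math, not the middleware's implementation:

```typescript
// Retry delays implied by exponential backoff (illustrative sketch).
function backoffDelays(
  maxRetries: number,
  initialDelayMs: number,
  backoffFactor: number,
): number[] {
  const delays: number[] = [];
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    delays.push(initialDelayMs * backoffFactor ** attempt);
  }
  return delays;
}

console.log(backoffDelays(3, 1000, 2)); // [1000, 2000, 4000]
```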
Common middleware:
  • Retry: Automatic retry with exponential backoff
  • Caching: Cache responses for identical requests
  • Safety: Filter harmful content
  • Logging: Log all requests/responses
  • Custom: Build your own

Response Metadata

All responses include metadata:
const response = await ai.generate({
  model: googleAI.model('gemini-2.0-flash'),
  prompt: 'Hello!',
});

console.log(response.text);          // Generated text
console.log(response.usage);         // Token usage stats
console.log(response.finishReason);  // Why generation stopped
console.log(response.latencyMs);     // Request duration
console.log(response.custom);        // Provider-specific metadata

Next Steps

  • Learn about Prompts - managing prompt templates
  • Explore Tools - extending models with functions
  • See Flows - building multi-step AI workflows
