Chroma Plugin

The genkitx-chroma plugin provides integration with ChromaDB, an open-source vector database for building AI applications with embeddings. Use it for retrieval-augmented generation (RAG) and semantic search.

Installation

npm install genkitx-chroma chromadb

Prerequisites

Install and run ChromaDB:

# Using Docker (recommended)
docker pull chromadb/chroma
docker run -p 8000:8000 chromadb/chroma

# Or install with pip
pip install chromadb
chroma run --path /chroma-data

Default server: http://localhost:8000

Basic Setup

import { genkit } from 'genkit';
import { chroma } from 'genkitx-chroma';
import { googleAI } from '@genkit-ai/google-genai';

const ai = genkit({
  plugins: [
    googleAI(),
    chroma([
      {
        collectionName: 'my-collection',
        embedder: googleAI.embedder('gemini-embedding-001'),
        createCollectionIfMissing: true,
      },
    ]),
  ],
});

Configuration

Plugin Configuration

import { chroma } from 'genkitx-chroma';
import { googleAI } from '@genkit-ai/google-genai';

chroma([
  {
    collectionName: 'documents',
    embedder: googleAI.embedder('gemini-embedding-001'),
    embedderOptions: {                           // Optional embedder config
      taskType: 'RETRIEVAL_DOCUMENT',
    },
    createCollectionIfMissing: true,             // Auto-create collection
    clientParams: {                              // Optional Chroma client config
      path: 'http://localhost:8000',
    },
  },
  {
    collectionName: 'code',                      // Multiple collections
    embedder: googleAI.embedder('text-embedding-005'),
    createCollectionIfMissing: true,
  },
])

Custom Client Configuration

import type { ChromaClientParams } from 'chromadb';

// Static configuration
const clientParams: ChromaClientParams = {
  path: 'http://chroma-server:8000',
  auth: {
    provider: 'token',
    credentials: process.env.CHROMA_TOKEN,
  },
};

chroma([{
  collectionName: 'my-docs',
  embedder: googleAI.embedder('gemini-embedding-001'),
  clientParams: clientParams,
}])

// Dynamic configuration (async)
chroma([{
  collectionName: 'my-docs',
  embedder: googleAI.embedder('gemini-embedding-001'),
  clientParams: async () => {
    const token = await getAuthToken();
    return {
      path: 'http://chroma-server:8000',
      auth: { provider: 'token', credentials: token },
    };
  },
}])
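
The static and dynamic forms above could share one source of truth for the connection settings. The following is a minimal sketch, not part of the plugin: the helper name `clientParamsFromEnv`, the `CHROMA_URL` variable, and the local `ClientParamsSketch` type are assumptions made for illustration. Only the `path` and `auth` fields shown earlier are used.

```typescript
// Local stand-in for the subset of ChromaClientParams used in this guide,
// so the sketch compiles without the chromadb package installed.
interface ClientParamsSketch {
  path: string;
  auth?: { provider: 'token'; credentials: string };
}

// Hypothetical helper: CHROMA_URL is an assumed variable name, not something
// the plugin reads on its own. CHROMA_TOKEN matches the example above.
function clientParamsFromEnv(
  env: Record<string, string | undefined>,
): ClientParamsSketch {
  const path = env.CHROMA_URL?.trim() || 'http://localhost:8000';
  const token = env.CHROMA_TOKEN?.trim();
  // Attach auth only when a token is present, so a local
  // unauthenticated server keeps working with the same code path.
  return token
    ? { path, auth: { provider: 'token', credentials: token } }
    : { path };
}
```

The result could then be passed as `clientParams: clientParamsFromEnv(process.env)`, or returned from the async form when the token has to be fetched first.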

Usage

Indexing Documents

import { chromaIndexerRef } from 'genkitx-chroma';
import { Document } from 'genkit';

// Define indexer
const myIndexer = chromaIndexerRef({
  collectionName: 'my-collection',
});

// Create documents
const documents = [
  Document.fromText('Genkit is a framework for building AI apps.', {
    source: 'docs',
  }),
  Document.fromText('ChromaDB is a vector database.', {
    source: 'docs',
  }),
  Document.fromText('RAG combines retrieval with generation.', {
    source: 'docs',
  }),
];

// Index documents
await ai.index({
  indexer: myIndexer,
  documents: documents,
});
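
For larger document sets it may help to index in batches rather than in one call, so that a single failed request does not discard all of the work. The sketch below shows only the batching step; `toBatches` is a hypothetical helper, and any batch size you choose is an assumption to tune against your embedder's limits.

```typescript
// Split an array into consecutive batches of at most `size` items.
// Hypothetical helper for illustration; not part of genkitx-chroma.
function toBatches<T>(items: readonly T[], size: number): T[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError('size must be a positive integer');
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}
```

With the indexer defined above, each batch could be indexed in turn, for example `for (const batch of toBatches(documents, 100)) await ai.index({ indexer: myIndexer, documents: batch });`.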

Retrieving Documents

import { chromaRetrieverRef } from 'genkitx-chroma';

// Define retriever
const myRetriever = chromaRetrieverRef({
  collectionName: 'my-collection',
});

// Retrieve relevant documents
const results = await ai.retrieve({
  retriever: myRetriever,
  query: 'What is Genkit?',
  options: {
    k: 5,  // Return top 5 results
  },
});

// ai.retrieve resolves to an array of Document objects
console.log(results);

RAG with Retrieved Context

import { chromaRetrieverRef } from 'genkitx-chroma';
import { googleAI } from '@genkit-ai/google-genai';

const retriever = chromaRetrieverRef({
  collectionName: 'knowledge-base',
});

// Retrieve relevant documents
const docs = await ai.retrieve({
  retriever: retriever,
  query: 'How does RAG work?',
  options: { k: 3 },
});

// Use context in generation
const context = docs
  .map(d => d.text)
  .join('\n\n');

const response = await ai.generate({
  model: googleAI.model('gemini-2.5-flash'),
  prompt: `Answer based on this context:\n\n${context}\n\nQuestion: How does RAG work?`,
});

console.log(response.text);
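
The prompt assembly in the example above can be pulled into a small pure function, which makes it easier to test and to reuse across flows. This is a sketch under assumptions: `buildRagPrompt` is a hypothetical name, and numbering the passages is a design choice added here, not something the plugin requires.

```typescript
// Build a RAG prompt from a question and the text of retrieved passages.
// Hypothetical helper; the wording mirrors the prompt used in this guide.
function buildRagPrompt(question: string, passages: readonly string[]): string {
  const context = passages
    .map((p) => p.trim())
    .filter((p) => p.length > 0)        // skip empty passages
    .map((p, i) => `[${i + 1}] ${p}`)   // number passages so answers can cite them
    .join('\n\n');
  if (context.length === 0) {
    // Nothing was retrieved: fall back to the bare question.
    return `Question: ${question}`;
  }
  return `Answer based on this context:\n\n${context}\n\nQuestion: ${question}`;
}
```

The passages would typically come from mapping each retrieved document to its `text`, and the returned string would be passed as `prompt` to `ai.generate`.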

Advanced Usage

Filtering with Metadata

import { Document } from 'genkit';
import { chromaRetrieverRef, IncludeEnum } from 'genkitx-chroma';

// Index with metadata
const docs = [
  Document.fromText('Python tutorial', { 
    language: 'python',
    level: 'beginner',
  }),
  Document.fromText('Advanced TypeScript', { 
    language: 'typescript',
    level: 'advanced',
  }),
];

await ai.index({ indexer: myIndexer, documents: docs });

// Retrieve with filters
const results = await ai.retrieve({
  retriever: myRetriever,
  query: 'programming tutorial',
  options: {
    k: 10,
    where: { language: 'python' },              // Metadata filter
    whereDocument: { $contains: 'tutorial' },   // Content filter
    include: [                                  // What to include in results
      'documents',
      'metadatas',
      'distances',
      'embeddings',
    ] as IncludeEnum[],
  },
});

Creating Collections Manually

import { createChromaCollection } from 'genkitx-chroma';

// Create collection with custom settings
await createChromaCollection(ai, {
  name: 'my-collection',
  embedder: googleAI.embedder('gemini-embedding-001'),
  metadata: {
    description: 'My document collection',
    'hnsw:space': 'cosine',  // Similarity metric: cosine, l2, ip
  },
  clientParams: {
    path: 'http://localhost:8000',
  },
});

Deleting Collections

import { deleteChromaCollection } from 'genkitx-chroma';

await deleteChromaCollection({
  name: 'old-collection',
  clientParams: {
    path: 'http://localhost:8000',
  },
});

Complete RAG Example

import { Document, genkit, z } from 'genkit';
import { chroma, chromaRetrieverRef, chromaIndexerRef } from 'genkitx-chroma';
import { googleAI } from '@genkit-ai/google-genai';

const ai = genkit({
  plugins: [
    googleAI(),
    chroma([{
      collectionName: 'knowledge-base',
      embedder: googleAI.embedder('gemini-embedding-001'),
      createCollectionIfMissing: true,
    }]),
  ],
});

const indexer = chromaIndexerRef({ collectionName: 'knowledge-base' });
const retriever = chromaRetrieverRef({ collectionName: 'knowledge-base' });

// Index documents
const knowledgeDocs = [
  Document.fromText('Genkit is a framework for building AI applications.'),
  Document.fromText('ChromaDB is an open-source vector database.'),
  Document.fromText('RAG improves LLM responses with relevant context.'),
];

await ai.index({ indexer, documents: knowledgeDocs });

// RAG flow
const ragFlow = ai.defineFlow(
  {
    name: 'ragFlow',
    inputSchema: z.string(),
    outputSchema: z.string(),
  },
  async (query) => {
    // Retrieve relevant documents
    const docs = await ai.retrieve({
      retriever: retriever,
      query: query,
      options: { k: 3 },
    });

    // Build context from retrieved documents
    const context = docs
      .map(d => d.text)
      .join('\n');

    // Generate with context
    const response = await ai.generate({
      model: googleAI.model('gemini-2.5-flash'),
      prompt: `Context:\n${context}\n\nQuestion: ${query}\n\nAnswer:`,
    });

    return response.text;
  }
);

// Use the flow
const answer = await ragFlow('What is Genkit?');
console.log(answer);

Best Practices

Choose the Right Embedder

// For general text (Google AI)
embedder: googleAI.embedder('gemini-embedding-001')

// For high-quality embeddings (Vertex AI)
embedder: vertexAI.embedder('text-embedding-005')

// For local embeddings (Ollama)
embedder: ollama.embedder('nomic-embed-text')

Chunk Large Documents

import { Document } from 'genkit';

function chunkDocument(text: string, chunkSize: number = 500): Document[] {
  const chunks: Document[] = [];
  for (let i = 0; i < text.length; i += chunkSize) {
    chunks.push(
      Document.fromText(text.slice(i, i + chunkSize), {
        chunkIndex: i / chunkSize,
        originalLength: text.length,
      })
    );
  }
  return chunks;
}

const longText = /* ... */;
const chunks = chunkDocument(longText);
await ai.index({ indexer: myIndexer, documents: chunks });
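
Fixed-size chunks like the ones above can split a sentence across a boundary, which may hurt retrieval for queries that match the split text. A common variation, named here as a swapped-in technique rather than something the plugin provides, is to let consecutive chunks overlap. The sketch below is an illustration only: `chunkWithOverlap`, the `TextChunk` shape, and the default overlap of 50 characters are all assumptions.

```typescript
interface TextChunk {
  text: string;
  chunkIndex: number;
  start: number; // character offset of the chunk in the original text
}

// Sliding-window chunking: each chunk starts `chunkSize - overlap`
// characters after the previous one. Hypothetical helper for illustration.
function chunkWithOverlap(text: string, chunkSize = 500, overlap = 50): TextChunk[] {
  if (chunkSize < 1 || overlap < 0 || overlap >= chunkSize) {
    throw new RangeError('require chunkSize >= 1 and 0 <= overlap < chunkSize');
  }
  const step = chunkSize - overlap;
  const chunks: TextChunk[] = [];
  for (let start = 0, index = 0; start < text.length; start += step, index++) {
    chunks.push({ text: text.slice(start, start + chunkSize), chunkIndex: index, start });
    // Stop once a window has reached the end, to avoid a redundant tail chunk.
    if (start + chunkSize >= text.length) break;
  }
  return chunks;
}
```

Each chunk could then be wrapped with `Document.fromText(chunk.text, { chunkIndex: chunk.chunkIndex })` before indexing.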

Error Handling

try {
  const results = await ai.retrieve({
    retriever: myRetriever,
    query: 'test query',
  });
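} catch (error) {
  // Under strict TypeScript the caught value is `unknown`, so narrow it first
  const message = error instanceof Error ? error.message : String(error);
  if (message.includes('Collection not found')) {
    console.error('Collection does not exist. Create it first.');
  } else if (message.includes('ECONNREFUSED')) {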
    console.error('ChromaDB server is not running.');
  } else {
    console.error('Retrieval error:', error);
  }
}
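
String matching on error messages is fragile, so it can help to keep it in one place. The sketch below is a hypothetical classifier, not plugin API. It assumes the two failure messages mentioned in this guide; the exact wording may differ between Chroma versions, and some HTTP clients report the connection failure on a nested `cause` rather than on the top-level message, which the helper also inspects.

```typescript
type RetrievalFailure = 'collection-missing' | 'server-unreachable' | 'unknown';

// Flatten an error and any nested `cause` chain into one searchable string.
function errorText(error: unknown): string {
  if (!(error instanceof Error)) return String(error);
  const cause = (error as { cause?: unknown }).cause;
  return cause === undefined ? error.message : `${error.message} ${errorText(cause)}`;
}

// Hypothetical helper: maps a caught value to a coarse failure category.
function classifyRetrievalError(error: unknown): RetrievalFailure {
  const text = errorText(error);
  // Matches both "Collection not found" and "Collection 'name' not found".
  if (/collection\b.*\bnot found/i.test(text)) return 'collection-missing';
  if (text.includes('ECONNREFUSED')) return 'server-unreachable';
  return 'unknown';
}
```

A `catch` block can then switch on the returned category instead of repeating the string checks at every call site.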

Collection Naming

// Use descriptive names
chroma([{
  collectionName: 'product-documentation',  // Good
  // collectionName: 'docs',                 // Too generic
  embedder,
}])

// Organize by domain
chroma([
  { collectionName: 'legal-documents', embedder },
  { collectionName: 'technical-docs', embedder },
  { collectionName: 'customer-support', embedder },
])
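
When the list of domains grows, the per-domain entries above can be generated rather than written by hand. The following is a sketch under assumptions: `collectionsForDomains` and the local config type are hypothetical, and the slug rule (lowercase, hyphen-separated) is a convention chosen here to match the names in this guide, not a rule enforced by the plugin.

```typescript
// Local stand-in for the plugin's per-collection config, generic over the
// embedder so the sketch compiles without genkit installed.
interface CollectionConfigSketch<E> {
  collectionName: string;
  embedder: E;
  createCollectionIfMissing: boolean;
}

// Turn a human-readable domain label into a lowercase, hyphenated name.
function toCollectionName(domain: string): string {
  return domain
    .trim()
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '-') // collapse spaces, underscores, punctuation
    .replace(/^-+|-+$/g, '');    // strip leading and trailing hyphens
}

// Hypothetical helper: one config entry per domain, sharing one embedder.
function collectionsForDomains<E>(
  domains: readonly string[],
  embedder: E,
): CollectionConfigSketch<E>[] {
  return domains.map((domain) => ({
    collectionName: toCollectionName(domain),
    embedder,
    createCollectionIfMissing: true,
  }));
}
```

The resulting array could be passed directly to `chroma(...)` in the plugin list.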

Configuration Options

Retriever Options

await ai.retrieve({
  retriever: myRetriever,
  query: 'search query',
  options: {
    k: 5,                                    // Number of results (default: 10)
    where: { language: 'en' },               // Metadata filter
    whereDocument: { $contains: 'keyword' }, // Content filter
    include: ['documents', 'distances'],     // What to include
  },
});

Where Filters

// Exact match
where: { language: 'python' }

// Multiple conditions
where: { 
  $and: [
    { language: 'python' },
    { level: 'beginner' },
  ]
}

// Or conditions
where: {
  $or: [
    { language: 'python' },
    { language: 'javascript' },
  ]
}

// Not equal
where: { language: { $ne: 'java' } }
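
Nested filters are easier to read when composed from small helpers. The sketch below builds the same object shapes shown above using only the `$and`, `$or`, and `$ne` operators from this section. The helper names and the local `Where` type are assumptions for illustration. Unwrapping a single clause is a defensive choice, on the assumption that some Chroma versions reject `$and` or `$or` lists with fewer than two entries.

```typescript
type Scalar = string | number | boolean;
type Condition = Record<string, Scalar | { $ne: Scalar }>;
type Where = Condition | { $and: Where[] } | { $or: Where[] };

// Exact match and not-equal conditions on a single metadata field.
const eq = (field: string, value: Scalar): Where => ({ [field]: value });
const ne = (field: string, value: Scalar): Where => ({ [field]: { $ne: value } });

// Combine clauses under an operator, unwrapping the single-clause case.
function combine(op: '$and' | '$or', clauses: Where[]): Where {
  const [first] = clauses;
  if (first === undefined) throw new RangeError('at least one clause is required');
  if (clauses.length === 1) return first;
  return op === '$and' ? { $and: clauses } : { $or: clauses };
}

const allOf = (...clauses: Where[]): Where => combine('$and', clauses);
const anyOf = (...clauses: Where[]): Where => combine('$or', clauses);
```

For example, `allOf(eq('language', 'python'), anyOf(eq('level', 'beginner'), ne('level', 'expert')))` produces a nested object that could be passed as the `where` option.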

Troubleshooting

ChromaDB Not Running

Error: ECONNREFUSED

Solution: Start the ChromaDB server:

docker run -p 8000:8000 chromadb/chroma

Collection Not Found

Error: Collection 'name' not found

Solution: Set createCollectionIfMissing: true in the plugin configuration, or create the collection manually with createChromaCollection.

Slow Retrieval

Solutions:

Reduce k so that fewer results are returned
Add more specific metadata filters to narrow the search
Keep collections small by splitting them by domain
Use an embedder that produces lower-dimensional vectors
