Chroma

Overview

Chroma is an open-source embedding database designed for AI applications. It works well for local development and can be deployed to Chroma Cloud or a self-hosted server.

Setup

Local (In-Memory)

No setup required! Chroma runs in-memory by default.

// Simply leave Chroma URL empty
Collection Name: my-collection
Embeddings: OpenAI Embeddings

Local (Persistent)

Run Chroma server locally:

# Using Docker
docker run -p 8000:8000 chromadb/chroma

# Or install with pip
pip install chromadb
chroma run --path ./chroma_data

Then in Flowise:

Chroma URL: http://localhost:8000
Collection Name: my-collection

Cloud (Chroma Cloud)

// Add credential in Flowise
Chroma API Key: your-cloud-api-key
Chroma Tenant: your-tenant
Chroma Database: your-database

// In node configuration
Chroma URL: https://api.trychroma.com
Collection Name: my-collection

Configuration

Required Parameters

collectionName (string, required)
Name of the collection to store/retrieve embeddings

embeddings (Embeddings, required)
Embedding model to use (e.g., OpenAI Embeddings)

Optional Parameters

document (Document[])
Documents to upsert into the collection

chromaURL (string)
URL of the Chroma server. Leave empty for in-memory mode:

Empty = In-memory
http://localhost:8000 = Local server
https://api.trychroma.com = Chroma Cloud
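
The mapping above can be sketched as a small helper. This is a hypothetical illustration, not Flowise code: the function name `describe_chroma_url` and the labels it returns are assumptions made for the example, and any host other than the ones listed is assumed to be a self-hosted server.

```python
from urllib.parse import urlparse


def describe_chroma_url(chroma_url: str) -> str:
    """Classify a Chroma URL value the way the list above describes it."""
    url = (chroma_url or "").strip()
    if not url:
        return "in-memory"
    host = urlparse(url).hostname or ""
    if host in ("localhost", "127.0.0.1"):
        return "local server"
    if host == "api.trychroma.com":
        return "chroma cloud"
    # Assumption: any other host is a self-hosted Chroma server.
    return "self-hosted server"


for value in ("", "http://localhost:8000", "https://api.trychroma.com"):
    print(repr(value), "->", describe_chroma_url(value))
```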

credential
Chroma API credential (only needed for cloud-hosted instances)

recordManager (RecordManager)
Track indexed documents to prevent duplication

chromaMetadataFilter (json)
Filter search results by metadata. Combine multiple conditions with $and:

{
  "$and": [
    { "source": "docs" },
    { "category": "tutorial" }
  ]
}

topK (number, default: 4)
Number of results to return
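
To make `topK` concrete, here is a minimal sketch of what "return the K most similar results" means, using a brute-force cosine similarity scan over toy vectors. It is a conceptual illustration only, not how Chroma searches internally; the `top_k` function and the sample vectors are made up for the example.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query: list[float], vectors: dict[str, list[float]], k: int = 4) -> list[str]:
    """Return the ids of the k stored vectors most similar to the query."""
    ranked = sorted(
        vectors,
        key=lambda doc_id: cosine_similarity(query, vectors[doc_id]),
        reverse=True,
    )
    return ranked[:k]


# Toy two-dimensional "embeddings" keyed by document id.
docs = {
    "a": [1.0, 0.0],
    "b": [0.9, 0.1],
    "c": [0.0, 1.0],
    "d": [-1.0, 0.0],
    "e": [0.5, 0.5],
}
print(top_k([1.0, 0.0], docs, k=2))
```

A larger K gives downstream chains more context to work with at the cost of longer prompts.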

Usage Examples

In-Memory (Development)

// Fastest setup - no persistence
Collection Name: test-collection
Embeddings: OpenAI Embeddings
Chroma URL: [leave empty]
Top K: 4

// Data stored in memory, lost on restart

Local Persistent

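# Start Chroma server with a host directory mounted for persistence
# (an absolute host path also works on Docker versions that reject relative paths in -v)
docker run -p 8000:8000 -v "$(pwd)/chroma_data:/chroma/chroma" chromadb/chroma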

// Connect to local server
Collection Name: my-docs
Chroma URL: http://localhost:8000
Embeddings: OpenAI Embeddings

Chroma Cloud

// Cloud configuration
Chroma URL: https://api.trychroma.com
Collection Name: production-docs
Credential: Chroma API (with key, tenant, database)
Embeddings: OpenAI Embeddings

With Metadata Filtering

// Search only specific documents
{
  "chromaMetadataFilter": {
    "$and": [
      { "type": "api-docs" },
      { "version": "v2" }
    ]
  }
}
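
When a filter is assembled programmatically, a small helper can wrap several equality conditions in `$and`, since the Chroma Python client expects a filter with exactly one top-level key. This is a sketch only; `build_where` is a hypothetical helper, not part of Flowise or the Chroma client.

```python
import json


def build_where(conditions: dict) -> dict:
    """Turn {"field": value, ...} into a Chroma-style metadata filter."""
    clauses = [{field: value} for field, value in conditions.items()]
    if not clauses:
        return {}
    if len(clauses) == 1:
        # A single condition needs no logical operator.
        return clauses[0]
    return {"$and": clauses}


where = build_where({"type": "api-docs", "version": "v2"})
print(json.dumps(where, indent=2))
```

The printed JSON can be pasted into the metadata filter field as is.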

With Record Manager

// Prevent duplicate indexing
Document: Text Loader
Collection Name: knowledge-base
Record Manager: Postgres Record Manager
Embeddings: OpenAI Embeddings

// Only new/changed docs are processed
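
The idea behind a record manager can be sketched as hashing each document and skipping any hash that was already recorded. This is a conceptual illustration, not Flowise's implementation: a real record manager such as the Postgres one above persists this state, while the `select_docs_to_index` function and the in-memory `seen` dictionary here are assumptions made for the example.

```python
import hashlib


def content_hash(text: str) -> str:
    """Stable fingerprint of a document's content."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def select_docs_to_index(docs: dict[str, str], seen: dict[str, str]) -> list[str]:
    """Return ids of documents that are new or whose content changed.

    docs maps document id -> text. seen maps document id -> hash of the
    last indexed version and is updated in place.
    """
    to_index = []
    for doc_id, text in docs.items():
        digest = content_hash(text)
        if seen.get(doc_id) != digest:
            to_index.append(doc_id)
            seen[doc_id] = digest
    return to_index


seen: dict[str, str] = {}
# First run: nothing has been indexed yet, so both documents are selected.
print(select_docs_to_index({"intro": "Hello", "faq": "Q&A"}, seen))
# Second run: only the document whose text changed is selected.
print(select_docs_to_index({"intro": "Hello", "faq": "Q&A v2"}, seen))
```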

Metadata Filter Syntax

Chroma supports WHERE clause filtering:

// Simple equality
{ "category": "tutorial" }

// Operators: $eq, $ne, $gt, $gte, $lt, $lte
// (multiple conditions are combined with $and)
{
  "$and": [
    { "year": { "$gte": 2023 } },
    { "rating": { "$gt": 4.5 } }
  ]
}

// $in operator
{
  "status": { "$in": ["published", "reviewed"] }
}

// Logical operators: $and, $or
{
  "$and": [
    { "category": "docs" },
    { "language": "en" }
  ]
}

{
  "$or": [
    { "priority": "high" },
    { "urgent": true }
  ]
}
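
To make the operator semantics concrete, here is a small pure-Python evaluator for the filter shapes shown above. It is an illustrative sketch, not Chroma's implementation: the `matches` function is hypothetical, it covers only the operators listed in this section, and it treats a missing metadata field as a non-match.

```python
import operator

# Comparison operators from this section mapped to Python functions.
COMPARATORS = {
    "$eq": operator.eq,
    "$ne": operator.ne,
    "$gt": operator.gt,
    "$gte": operator.ge,
    "$lt": operator.lt,
    "$lte": operator.le,
    "$in": lambda value, options: value in options,
}


def matches(metadata: dict, where: dict) -> bool:
    """Return True if a document's metadata satisfies the filter."""
    for key, condition in where.items():
        if key == "$and":
            if not all(matches(metadata, clause) for clause in condition):
                return False
        elif key == "$or":
            if not any(matches(metadata, clause) for clause in condition):
                return False
        elif key not in metadata:
            return False  # assumption of this sketch: missing field never matches
        elif isinstance(condition, dict):
            value = metadata[key]
            if not all(COMPARATORS[op](value, operand) for op, operand in condition.items()):
                return False
        elif metadata[key] != condition:  # a bare value means equality
            return False
    return True


doc = {"category": "docs", "language": "en", "year": 2024, "status": "published"}
print(matches(doc, {"$and": [{"category": "docs"}, {"language": "en"}]}))
print(matches(doc, {"year": {"$gte": 2025}}))
```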

Best Practices

Development

Use in-memory for quick testing
Use local server for development
Small datasets work great in-memory
Easy to reset and iterate

Production

Use Chroma Cloud or self-hosted server
Enable authentication
Set up backups
Monitor collection sizes

Performance

Create indexes on frequently queried metadata
Use appropriate collection sizes
Consider sharding for very large datasets
Batch upserts when possible
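
Batching can be as simple as slicing the input into fixed-size chunks before each upsert call. This is a minimal sketch: the `batched` helper and the batch size of 100 are arbitrary choices for illustration, and the upsert itself is left as a comment because it depends on the client in use.

```python
from typing import Iterator, Sequence


def batched(items: Sequence, batch_size: int) -> Iterator[Sequence]:
    """Yield consecutive slices of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


documents = [f"doc-{i}" for i in range(250)]
for batch in batched(documents, 100):
    # A real pipeline would upsert `batch` into the collection here.
    print(len(batch))
```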

Data Management

Use descriptive collection names
Tag documents with metadata
Use record manager to avoid duplicates
Implement collection lifecycle management

Collection Management

Creating Collections

Collections are created automatically when you first upsert documents.

Deleting Collections

# Via Chroma client (if needed)
import chromadb
client = chromadb.HttpClient(host="localhost", port=8000)
client.delete_collection(name="old-collection")

Listing Collections

# Check existing collections
import chromadb

client = chromadb.HttpClient(host="localhost", port=8000)
collections = client.list_collections()
for collection in collections:
    print(f"{collection.name}: {collection.count()} documents")

Deployment Options

Docker Compose

version: '3'
services:
  chroma:
    image: chromadb/chroma
    ports:
      - "8000:8000"
    volumes:
      - ./chroma_data:/chroma/chroma
    environment:
      - CHROMA_SERVER_AUTH_PROVIDER=token
      - CHROMA_SERVER_AUTH_CREDENTIALS=your-secret-token

Kubernetes

apiVersion: apps/v1
kind: Deployment
metadata:
  name: chroma
spec:
  replicas: 1
  selector:
    matchLabels:
      app: chroma
  template:
    metadata:
      labels:
        app: chroma
    spec:
      containers:
      - name: chroma
        image: chromadb/chroma
        ports:
        - containerPort: 8000
        volumeMounts:
        - name: chroma-data
          mountPath: /chroma/chroma
      volumes:
      - name: chroma-data
        persistentVolumeClaim:
          claimName: chroma-pvc

Common Issues

Connection Refused

Can’t connect to Chroma server

Solution:

Verify Chroma server is running
Check URL format: http://localhost:8000
Ensure port 8000 is not blocked
For Docker: Check container is running
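
A quick way to run through these checks is a plain TCP connection test. This sketch uses only the Python standard library; `is_port_open` is a hypothetical helper, and a successful connection only shows that something is listening on the port, not that it is Chroma.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False


if is_port_open("localhost", 8000):
    print("Something is listening on localhost:8000")
else:
    print("Nothing is listening on localhost:8000; is the Chroma server running?")
```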

Collection Not Found

Error accessing collection

Solution:

Collections are auto-created on first upsert
Check collection name spelling
Ensure documents were successfully indexed
Verify you’re connecting to correct server

In-Memory Data Lost

Data disappears after restart

Solution:

In-memory mode doesn’t persist
Use Chroma server for persistence
Configure persistent volume
Consider Chroma Cloud for managed hosting

Authentication Failed

Chroma Cloud connection issues

Solution:

Verify API key is correct
Check tenant and database names
Ensure credential is properly configured
Test with Chroma Cloud console

Chroma vs Other Vector DBs

Feature	Chroma	Pinecone	Qdrant
Open Source	Yes	No	Yes
In-Memory	Yes	No	Yes
Managed Cloud	Yes	Yes	Yes
Self-Hosted	Yes	No	Yes
Best For	Development	Production	Production
Ease of Use	Excellent	Very Good	Good

Outputs

retriever (VectorStoreRetriever)
Retriever interface for use in chains and agents

vectorStore (ChromaVectorStore)
Direct vector store access for custom operations
