This guide walks you through creating a complete semantic search pipeline using VectorDB with Chroma as the vector database.

What you’ll build

By the end of this quickstart, you’ll have:
  • A working semantic search pipeline using Chroma
  • Documents indexed with dense embeddings
  • The ability to search using natural language queries
  • Optional RAG answer generation
This example uses Chroma for local development. You can swap to any other supported vector database (Pinecone, Weaviate, Qdrant, Milvus) by changing the configuration.

Before you begin

Make sure you’ve completed the installation steps and have:
  • Python 3.11+ installed
  • VectorDB dependencies installed via uv sync
  • Virtual environment activated
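To double-check the Python requirement before going further, a quick stdlib-only check works (nothing VectorDB-specific assumed):

```python
import sys

# The pipelines in this guide assume Python 3.11 or newer.
meets_requirement = sys.version_info >= (3, 11)
print("Python", sys.version.split()[0], "ok" if meets_requirement else "too old")
```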

Step 1: Create a configuration file

Create a configuration file config.yaml that defines your search pipeline:
config.yaml
dataloader:
  type: "arc"
  split: "test"
  limit: 100
  use_text_splitter: false

embeddings:
  model: "sentence-transformers/all-MiniLM-L6-v2"
  device: "cpu"
  batch_size: 32

chroma:
  collection_name: "quickstart-semantic-search"
  path: "./chroma_data"
  recreate: true

search:
  top_k: 5

rag:
  enabled: false

logging:
  name: "vectordb_quickstart"
  level: "INFO"
This configuration uses the ARC dataset (science reasoning questions) and loads 100 test documents. The embedding model runs on CPU for compatibility.

Step 2: Index your documents

Create a Python script index.py to index documents into Chroma:
index.py
from vectordb.langchain.semantic_search import ChromaSemanticIndexingPipeline

# Initialize the indexing pipeline
pipeline = ChromaSemanticIndexingPipeline("config.yaml")

# Run the indexing process
result = pipeline.run()

print(f"Successfully indexed {result['documents_indexed']} documents")
print(f"Collection: {result['collection_name']}")
Run the indexing script:
python index.py
You should see output like:
Successfully indexed 100 documents
Collection: quickstart-semantic-search
Indexing creates dense vector embeddings for each document using the specified model. These embeddings capture semantic meaning for similarity search.
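Documents are typically encoded in fixed-size batches; the batch_size: 32 setting above means roughly 32 documents per model call. A minimal sketch of that batching in plain Python (illustrative only, not VectorDB internals):

```python
def batched(items, batch_size):
    """Yield successive fixed-size chunks; the last chunk may be smaller."""
    for i in range(0, len(items), batch_size):
        yield items[i:i + batch_size]

# 100 documents with batch_size 32 -> 4 embedding calls.
docs = [f"document {i}" for i in range(100)]
batches = list(batched(docs, 32))
print(len(batches))               # → 4
print([len(b) for b in batches])  # → [32, 32, 32, 4]
```

A larger batch_size generally improves throughput at the cost of memory, which matters most when running on GPU.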

Step 3: Search your documents

Create a search script search.py to query your indexed documents:
search.py
from vectordb.langchain.semantic_search import ChromaSemanticSearchPipeline

# Initialize the search pipeline
pipeline = ChromaSemanticSearchPipeline("config.yaml")

# Run a semantic search query
results = pipeline.search(
    query="What is photosynthesis?",
    top_k=5
)

print(f"Query: {results['query']}")
print(f"\nTop {len(results['documents'])} results:\n")

for i, doc in enumerate(results['documents'], 1):
    score = doc.metadata.get('score', 'N/A')
    content = doc.page_content[:200]  # First 200 characters
    print(f"{i}. [Score: {score}]")
    print(f"   {content}...\n")
Run the search script:
python search.py
You’ll see the top 5 most semantically similar documents to your query.

Step 4: Add RAG answer generation (optional)

To generate answers from retrieved documents, enable RAG in your configuration:
config.yaml
rag:
  enabled: true
  model: "llama-3.3-70b-versatile"
  api_key: "${GROQ_API_KEY}"
  temperature: 0.7
  max_tokens: 2048
Make sure you have set your GROQ_API_KEY environment variable before enabling RAG.
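The ${GROQ_API_KEY} placeholder is read from the environment at load time. A rough sketch of how such a placeholder can be expanded (illustrative only; VectorDB's actual substitution logic may differ):

```python
import os
import re

def expand_env(value: str) -> str:
    """Replace ${VAR} placeholders with values from the environment."""
    return re.sub(r"\$\{(\w+)\}", lambda m: os.environ.get(m.group(1), ""), value)

os.environ["GROQ_API_KEY"] = "demo-key"  # set here only for the demo
print(expand_env("${GROQ_API_KEY}"))  # → demo-key
```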
Update your search script to display the generated answer:
search.py
from vectordb.langchain.semantic_search import ChromaSemanticSearchPipeline

pipeline = ChromaSemanticSearchPipeline("config.yaml")

results = pipeline.search(
    query="What is photosynthesis?",
    top_k=5
)

print(f"Query: {results['query']}\n")

# Display RAG-generated answer
if 'answer' in results:
    print(f"Answer: {results['answer']}\n")

print(f"Retrieved {len(results['documents'])} documents")
Now when you run the search, you’ll get an AI-generated answer based on the retrieved documents.

Understanding the pipeline

Here’s what happens under the hood:
  1. Query embedding: your query text is converted to a dense vector using the same embedding model used during indexing.
  2. Similarity search: the vector database finds the documents whose embeddings are closest to your query embedding, using cosine similarity.
  3. Result ranking: results are ranked by similarity score, with the most relevant documents returned first.
  4. Optional RAG generation: if enabled, the LLM generates an answer using the retrieved documents as context.
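The embedding, similarity, and ranking steps can be sketched with toy vectors (hand-made 3-dimensional embeddings for illustration; real models such as all-MiniLM-L6-v2 produce 384 dimensions):

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

# Toy corpus of (doc_id, embedding) pairs.
corpus = {
    "doc_a": [1.0, 0.0, 0.2],
    "doc_b": [0.1, 1.0, 0.0],
    "doc_c": [0.9, 0.1, 0.3],
}
query = [1.0, 0.1, 0.2]  # the embedded query vector

# Rank documents by similarity to the query, then keep the top_k.
top_k = 2
ranked = sorted(corpus, key=lambda d: cosine(query, corpus[d]), reverse=True)
print(ranked[:top_k])  # → ['doc_a', 'doc_c']
```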

Try different vector databases

VectorDB supports multiple vector databases with the same interface. To switch from Chroma to Pinecone, swap in the corresponding pipeline class (assuming the per-database classes follow the same naming pattern as the Chroma ones):
from vectordb.langchain.semantic_search import PineconeSemanticSearchPipeline

pipeline = PineconeSemanticSearchPipeline("config.yaml")
results = pipeline.search("What is machine learning?", top_k=5)
Just update your config file with the appropriate database settings and API keys.
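For example, a Pinecone setup would replace the chroma block with a pinecone block along these lines (the keys shown are illustrative; consult the configuration reference for the exact schema):

```yaml
pinecone:
  index_name: "quickstart-semantic-search"
  api_key: "${PINECONE_API_KEY}"
```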

Using Haystack instead of LangChain

VectorDB provides identical functionality for both frameworks. Swap the import from the vectordb.langchain package to vectordb.haystack and the rest of the code is unchanged:
from vectordb.haystack.semantic_search import ChromaSemanticSearchPipeline

pipeline = ChromaSemanticSearchPipeline("config.yaml")
results = pipeline.search("What is photosynthesis?", top_k=5)

for doc in results["documents"]:
    print(doc.page_content)

Next steps

Now that you have a working semantic search pipeline, explore advanced features:

Hybrid search

Combine dense and sparse retrieval for better results

Reranking

Use cross-encoders for higher precision

Query enhancement

Improve recall with multi-query and HyDE

Metadata filtering

Filter results by document attributes

Build docs developers (and LLMs) love