This guide walks you through creating a complete semantic search pipeline using VectorDB with Chroma as the vector database.
## What you’ll build

By the end of this quickstart, you’ll have:

- A working semantic search pipeline using Chroma
- Documents indexed with dense embeddings
- The ability to search using natural language queries
- Optional RAG answer generation
This example uses Chroma for local development. You can swap to any other supported vector database (Pinecone, Weaviate, Qdrant, Milvus) by changing the configuration.
## Before you begin

Make sure you’ve completed the installation steps and have:

- Python 3.11+ installed
- VectorDB dependencies installed via `uv sync`
- Virtual environment activated
## Step 1: Create a configuration file

Create a configuration file `config.yaml` that defines your search pipeline:
```yaml
dataloader:
  type: "arc"
  split: "test"
  limit: 100
  use_text_splitter: false

embeddings:
  model: "sentence-transformers/all-MiniLM-L6-v2"
  device: "cpu"
  batch_size: 32

chroma:
  collection_name: "quickstart-semantic-search"
  path: "./chroma_data"
  recreate: true

search:
  top_k: 5

rag:
  enabled: false

logging:
  name: "vectordb_quickstart"
  level: "INFO"
```
This configuration uses the ARC dataset (science reasoning questions) and loads 100 test documents. The embedding model runs on CPU for compatibility.
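If you want to sanity-check these settings before running the pipeline, a minimal sketch follows. The nested dict mirrors the YAML above; the `validate` helper is illustrative and not part of VectorDB:

```python
# The quickstart settings, expressed as the nested dict the YAML parses to
config = {
    "dataloader": {"type": "arc", "split": "test", "limit": 100, "use_text_splitter": False},
    "embeddings": {"model": "sentence-transformers/all-MiniLM-L6-v2", "device": "cpu", "batch_size": 32},
    "chroma": {"collection_name": "quickstart-semantic-search", "path": "./chroma_data", "recreate": True},
    "search": {"top_k": 5},
    "rag": {"enabled": False},
}

def validate(cfg):
    # Catch obviously invalid values before any documents are loaded or embedded
    assert cfg["dataloader"]["limit"] > 0, "limit must be positive"
    assert cfg["embeddings"]["device"] in ("cpu", "cuda"), "unsupported device"
    assert cfg["search"]["top_k"] >= 1, "top_k must be at least 1"
    return True

print(validate(config))  # True
```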
## Step 2: Index your documents

Create a Python script `index.py` to index documents into Chroma:
```python
from vectordb.langchain.semantic_search import ChromaSemanticIndexingPipeline

# Initialize the indexing pipeline
pipeline = ChromaSemanticIndexingPipeline("config.yaml")

# Run the indexing process
result = pipeline.run()

print(f"Successfully indexed {result['documents_indexed']} documents")
print(f"Collection: {result['collection_name']}")
```
Run the indexing script with `python index.py`. You should see output like:

```
Successfully indexed 100 documents
Collection: quickstart-semantic-search
```
Indexing creates dense vector embeddings for each document using the specified model. These embeddings capture semantic meaning for similarity search.
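The similarity behind this search is typically cosine similarity between embedding vectors. A toy sketch with made-up 4-dimensional vectors (real models such as all-MiniLM-L6-v2 produce 384-dimensional embeddings):

```python
import math

def cosine_similarity(a, b):
    # Dot product of a and b, normalized by their magnitudes
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy "embeddings": nearby vectors stand in for semantically similar texts
plant_doc = [0.9, 0.1, 0.3, 0.0]
sports_doc = [0.0, 0.8, 0.1, 0.6]
query = [0.8, 0.2, 0.4, 0.1]

print(cosine_similarity(query, plant_doc))   # close to 1.0: semantically similar
print(cosine_similarity(query, sports_doc))  # much lower: less related
```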
## Step 3: Search your documents

Create a search script `search.py` to query your indexed documents:
```python
from vectordb.langchain.semantic_search import ChromaSemanticSearchPipeline

# Initialize the search pipeline
pipeline = ChromaSemanticSearchPipeline("config.yaml")

# Run a semantic search query
results = pipeline.search(
    query="What is photosynthesis?",
    top_k=5
)

print(f"Query: {results['query']}")
print(f"\nTop {len(results['documents'])} results:\n")

for i, doc in enumerate(results['documents'], 1):
    score = doc.metadata.get('score', 'N/A')
    content = doc.page_content[:200]  # First 200 characters
    print(f"{i}. [Score: {score}]")
    print(f"   {content}...\n")
```
Run the search script with `python search.py`. You’ll see the top 5 most semantically similar documents to your query.
## Step 4: Add RAG answer generation (optional)

To generate answers from retrieved documents, enable RAG in your configuration:
```yaml
rag:
  enabled: true
  model: "llama-3.3-70b-versatile"
  api_key: "${GROQ_API_KEY}"
  temperature: 0.7
  max_tokens: 2048
```
Make sure you have set your `GROQ_API_KEY` environment variable before enabling RAG.
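The `${GROQ_API_KEY}` placeholder uses shell-style variable syntax; a sketch of how such a value can be resolved from the environment (illustrative only; VectorDB may handle this expansion for you):

```python
import os
import string

# Normally exported in your shell, e.g. `export GROQ_API_KEY=...`;
# the fallback value here is only for demonstration
os.environ.setdefault("GROQ_API_KEY", "demo-key")

raw = "${GROQ_API_KEY}"
expanded = string.Template(raw).substitute(os.environ)
print(expanded)  # the key from your environment, with the placeholder resolved
```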
Update your search script to display the generated answer:
```python
from vectordb.langchain.semantic_search import ChromaSemanticSearchPipeline

pipeline = ChromaSemanticSearchPipeline("config.yaml")

results = pipeline.search(
    query="What is photosynthesis?",
    top_k=5
)

print(f"Query: {results['query']}\n")

# Display the RAG-generated answer
if 'answer' in results:
    print(f"Answer: {results['answer']}\n")

print(f"Retrieved {len(results['documents'])} documents")
```
Now when you run the search, you’ll get an AI-generated answer based on the retrieved documents.
## Understanding the pipeline

Here’s what happens under the hood:

1. **Query embedding**: Your query text is converted to a dense vector using the same embedding model used during indexing.
2. **Similarity search**: The vector database finds the documents whose embeddings are closest to your query embedding, using cosine similarity.
3. **Result ranking**: Results are ranked by similarity score, with the most relevant documents returned first.
4. **Optional RAG generation**: If enabled, the LLM generates an answer using the retrieved documents as context.
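The ranking step above fits in a few lines. This `rank_results` helper is illustrative, not part of VectorDB:

```python
def rank_results(scored_docs, top_k=5):
    # Sort by similarity score, highest first, and keep only the top_k documents
    return sorted(scored_docs, key=lambda d: d["score"], reverse=True)[:top_k]

scored = [
    {"id": "doc-1", "score": 0.42},
    {"id": "doc-2", "score": 0.91},
    {"id": "doc-3", "score": 0.67},
]

print(rank_results(scored, top_k=2))  # doc-2 first, then doc-3
```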
## Try different vector databases

VectorDB supports multiple vector databases behind the same interface: Chroma (local), Pinecone (cloud), Weaviate, Qdrant, and Milvus. For example, searching with the Chroma pipeline:

```python
from vectordb.langchain.semantic_search import ChromaSemanticSearchPipeline

pipeline = ChromaSemanticSearchPipeline("config.yaml")
results = pipeline.search("What is machine learning?", top_k=5)
```

To switch databases, update your config file with the appropriate database settings and API keys.
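For example, a Pinecone section might look like the following. The exact keys are hypothetical; check the Pinecone integration documentation for the real schema:

```yaml
pinecone:
  index_name: "quickstart-semantic-search"
  api_key: "${PINECONE_API_KEY}"
```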
## Using Haystack instead of LangChain

VectorDB provides identical functionality for both frameworks; with Haystack, swap the `langchain` namespace for `haystack`:

```python
from vectordb.haystack.semantic_search import ChromaSemanticSearchPipeline

pipeline = ChromaSemanticSearchPipeline("config.yaml")
results = pipeline.search("What is photosynthesis?", top_k=5)

for doc in results["documents"]:
    print(doc.page_content)
```
## Next steps

Now that you have a working semantic search pipeline, explore advanced features:

- **Hybrid search**: Combine dense and sparse retrieval for better results
- **Reranking**: Use cross-encoders for higher precision
- **Query enhancement**: Improve recall with multi-query and HyDE
- **Metadata filtering**: Filter results by document attributes
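As a taste of hybrid search, one common way to combine dense and sparse rankings is reciprocal rank fusion. A minimal sketch, not VectorDB’s implementation:

```python
def reciprocal_rank_fusion(rankings, k=60):
    # rankings: ranked doc-id lists, e.g. one from dense and one from sparse retrieval.
    # Each document scores 1 / (k + rank) per list; scores are summed across lists.
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

dense = ["d1", "d2", "d3"]
sparse = ["d2", "d4", "d1"]
print(reciprocal_rank_fusion([dense, sparse]))  # d2 ranks first: near the top of both lists
```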