What is a Vector Store?
A vector store is a specialized database that stores text as high-dimensional numerical vectors (embeddings) and enables fast similarity search. Instead of keyword matching, vector stores find semantically similar content based on meaning.

Vector embeddings capture the semantic meaning of text: similar concepts have similar vectors, even if they use different words.
Why Vector Embeddings?
Traditional keyword search fails when questions and answers use different terminology:

Keyword Search
Query: “revenue growth” → Only finds exact phrase “revenue growth”
Vector Search
Query: “revenue growth” → Finds “sales increase”, “income expansion”, etc.
ChromaDB in RAG Chat
RAG Chat uses ChromaDB as its vector store. ChromaDB is lightweight, fast, and perfect for local deployments.

Loading an Existing Vector Store

When the app starts, it checks for previously stored documents (app.py).
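A minimal sketch of this loading step, assuming the LangChain Chroma wrapper and an OPENAI_API_KEY in the environment (import paths vary across LangChain versions; the function name is illustrative, not the actual app.py code):

```python
import os

def load_vector_store(persist_directory: str = "db"):
    """Reload a previously persisted Chroma store, or return None if absent."""
    if not os.path.isdir(persist_directory):
        return None  # no documents have been uploaded yet

    # Imports kept inside the function so the sketch stays self-contained;
    # exact module paths depend on your LangChain version.
    from langchain_community.vectorstores import Chroma
    from langchain_openai import OpenAIEmbeddings

    embeddings = OpenAIEmbeddings(model="text-embedding-ada-002")
    # Pointing Chroma at the same persist_directory reloads the stored vectors.
    return Chroma(persist_directory=persist_directory,
                  embedding_function=embeddings)
```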
The db directory stores all your document embeddings persistently. This means your documents remain available even after restarting the application.

Creating a New Vector Store
When you upload your first document, ChromaDB creates a new vector store (app.py):
- New store: Creates a fresh ChromaDB instance with from_documents()
- Existing store: Adds new documents to the existing collection with add_documents()
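A hedged sketch of that branch, assuming LangChain's Chroma wrapper (the function name and signature are illustrative; import paths vary by LangChain version):

```python
def store_documents(chunks, vectordb=None, persist_directory="db"):
    """Add document chunks to ChromaDB, creating the store on first upload.

    `chunks` is a list of LangChain Document objects; `vectordb` is an
    existing Chroma instance, or None when no store exists yet.
    """
    if vectordb is None:
        # First upload: build a fresh store and embed every chunk.
        from langchain_community.vectorstores import Chroma
        from langchain_openai import OpenAIEmbeddings

        vectordb = Chroma.from_documents(
            documents=chunks,
            embedding=OpenAIEmbeddings(model="text-embedding-ada-002"),
            persist_directory=persist_directory,
        )
    else:
        # Later uploads: embed and append to the existing collection.
        vectordb.add_documents(chunks)
    return vectordb
```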
The Embedding Process
How Text Becomes Vectors
- Text chunk: “RAG combines retrieval with generation”
- OpenAI Embedding API: Converts text to a 1536-dimensional vector
- Vector: [0.023, -0.145, 0.891, ..., 0.234] (1536 numbers)
- Storage: ChromaDB stores the vector along with the original text
- Retrieval: Query vectors are compared to stored vectors using cosine similarity
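The steps above can be illustrated with a toy in-memory store. The 4-dimensional mock embedder below is a stand-in for the real 1536-dimensional OpenAI model (it just hashes characters, so it has no semantic understanding):

```python
import math

def mock_embed(text: str) -> list[float]:
    """Stand-in for the OpenAI embedding API (which returns 1536 dims)."""
    raw = [float(sum(ord(c) for c in text[i::4])) for i in range(4)]
    norm = math.sqrt(sum(x * x for x in raw)) or 1.0
    return [x / norm for x in raw]  # unit length, so dot product = cosine

store = []  # ChromaDB keeps (vector, original text) pairs like this

def add(text: str) -> None:
    store.append((mock_embed(text), text))

def query(text: str, k: int = 1) -> list[str]:
    qv = mock_embed(text)
    # Rank stored chunks by cosine similarity to the query vector.
    scored = sorted(store,
                    key=lambda pair: -sum(a * b for a, b in zip(pair[0], qv)))
    return [original for _, original in scored[:k]]

add("RAG combines retrieval with generation")
add("the weather is sunny today")
```

Querying with the stored sentence itself returns that sentence first, because identical text produces an identical vector (cosine similarity 1.0).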
OpenAI Embeddings
RAG Chat uses OpenAI’s text-embedding-ada-002 model through OpenAIEmbeddings() (app.py).
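A sketch of constructing that embedder, assuming the langchain_openai package (import path varies by LangChain version; requires OPENAI_API_KEY):

```python
def build_embedder():
    """Construct the embedding model used for BOTH storing and querying."""
    from langchain_openai import OpenAIEmbeddings

    return OpenAIEmbeddings(model="text-embedding-ada-002")

# Usage (makes real API calls):
#   embedder = build_embedder()
#   doc_vectors = embedder.embed_documents(["chunk one", "chunk two"])
#   query_vector = embedder.embed_query("my question")  # 1536 floats
```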
The same embedding model must be used for both storing and querying to ensure vectors are in the same semantic space.
Similarity Search
When you ask a question, ChromaDB performs similarity search (app.py):
- Embeds your question using OpenAI’s embedding model
- Computes similarity between the question vector and all stored chunk vectors
- Returns the top-k most similar chunks (default k=4)
- These chunks become the context for the LLM
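A sketch of the retrieval call, assuming a loaded Chroma instance `vectordb` (the helper name is illustrative):

```python
def retrieve_context(vectordb, question: str, k: int = 4):
    """Return the k chunk texts most similar to the question.

    `vectordb` is any object with a Chroma-style similarity_search method.
    """
    # Chroma embeds the question, compares it to every stored vector,
    # and returns the top-k Documents (chunk text + metadata).
    docs = vectordb.similarity_search(question, k=k)
    return [doc.page_content for doc in docs]
```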
Cosine Similarity
ChromaDB uses cosine similarity to measure how close two vectors are:

- 1.0: Identical meaning
- 0.8-0.9: Very similar
- 0.5-0.7: Somewhat related
- < 0.5: Not very similar
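Cosine similarity itself is simple to compute; a self-contained version:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """cos(theta) = (a . b) / (|a| * |b|); 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

print(cosine_similarity([1.0, 2.0], [2.0, 4.0]))  # parallel vectors: ~1.0
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # orthogonal vectors: ~0.0
```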
Persistence
The db directory contains all your vector store data.
Your uploaded documents persist across sessions. Delete the db directory to reset the vector store.

Advantages of Persistence
Fast Startup
No need to re-process documents every time
Incremental Updates
Add new documents without losing existing ones
Cost Savings
Avoid redundant embedding API calls
Reliability
Data survives application restarts
Code Flow Example
Documents flow from the loader, through the text splitter and the embedding model, into the persisted ChromaDB store, where they become available for retrieval.

Performance Considerations
Embedding Costs
OpenAI charges per token for embeddings:

- Model: text-embedding-ada-002
- Cost: ~$0.0001 per 1K tokens
- Example: A 100-page document might cost ~$0.15 to embed
Embeddings are only computed once per document chunk. Queries use the same model but only embed the question (very cheap).
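A quick back-of-the-envelope check using the ~$0.0001/1K-token rate quoted above (the helper name is illustrative):

```python
RATE_PER_1K_TOKENS = 0.0001  # text-embedding-ada-002 rate quoted above

def embedding_cost(tokens: int) -> float:
    """Estimated one-time embedding cost in dollars for a token count."""
    return tokens / 1000 * RATE_PER_1K_TOKENS

print(embedding_cost(1_000_000))  # 1M tokens -> ~$0.10
```

A query embeds only the question, typically well under 100 tokens, which is why per-question cost is negligible.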
Retrieval Speed
ChromaDB is optimized for fast retrieval:

- Small collections (< 10K chunks): Near-instant retrieval
- Medium collections (10K-100K chunks): Milliseconds
- Large collections (> 100K chunks): Consider approximate nearest neighbor (ANN) indices
Next Steps
Document Processing
Learn how documents are split into chunks before embedding
RAG Overview
Understand the complete RAG pipeline