ClinicalPilot uses LanceDB as its embedded vector database for Retrieval-Augmented Generation (RAG). This enables agents to cite medical literature and ground their reasoning in evidence.

Why LanceDB?

  • Serverless: No separate database server required
  • Embedded: Runs in-process with your application
  • Fast: Native vector search with disk-based persistence
  • Zero cost: No cloud infrastructure or API charges

Architecture

Query
  ↓
Embeddings (all-MiniLM-L6-v2)
  ↓
LanceDB Vector Search
  ↓
Top-K Similar Documents
  ↓
Literature Agent (GPT-4o-mini)

Installation

LanceDB is included in requirements.txt, so no additional installation is needed. To install it manually:
pip install lancedb sentence-transformers

Initialization

1. Auto-Initialization

LanceDB auto-creates the vector store on first run:
python -m uvicorn backend.main:app --reload
Database path: data/lancedb/ (configurable via LANCEDB_PATH in .env)
2. Manual Initialization

To explicitly initialize the database:
python -m backend.rag.lancedb_store --init
Output:
Initializing LanceDB...
✓ LanceDB initialized at /path/to/clinicalpilot/data/lancedb
3. Verify Setup

Check that the database was created:
ls -la data/lancedb/
You should see:
medical_knowledge.lance/

Embedding Model

ClinicalPilot uses all-MiniLM-L6-v2 from sentence-transformers:
  • Dimension: 384
  • Model size: ~80MB
  • Speed: ~1000 sentences/second on CPU
  • Quality: Optimized for semantic similarity
The embedding model downloads automatically on first use (~80MB). Subsequent runs use the cached model.
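Retrieval ranks chunks by the similarity between their embeddings and the query embedding. A minimal, standalone sketch of cosine similarity (pure Python on toy vectors; the real store compares 384-dimensional vectors, but the math is identical):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity: dot(a, b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Identical vectors score 1.0; orthogonal vectors score 0.0.
print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # 1.0
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # 0.0
```

Higher scores mean closer semantic matches; `search()` returns the top-k chunks by this ranking.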

Ingesting Documents

From Directory

Ingest all .txt, .md, and .pdf files from a directory:
python -m backend.rag.lancedb_store --ingest /path/to/documents
Example:
# Create a directory for medical literature
mkdir -p data/rag_documents

# Add your medical PDFs, guidelines, etc.
cp ~/Downloads/clinical_guidelines.pdf data/rag_documents/

# Ingest
python -m backend.rag.lancedb_store --ingest data/rag_documents/
Output:
✓ Ingested 127 chunks from data/rag_documents/

Programmatic Ingestion

Add documents from Python code:
from backend.rag.lancedb_store import add_documents

texts = [
    "Aspirin 81mg daily reduces cardiovascular risk in patients with CAD.",
    "Beta-blockers are contraindicated in severe asthma.",
]

sources = ["AHA Guidelines 2024", "GINA Guidelines 2024"]
categories = ["cardiology", "pulmonology"]

count = add_documents(texts, sources, categories)
print(f"Added {count} documents")

Document Chunking

Documents are automatically split into 500-character chunks with 50-character overlap to preserve context.
Chunking ensures that embedding quality remains high and search results are granular. A 10-page PDF might generate 50-100 chunks.

Chunking Parameters

Edit backend/rag/lancedb_store.py to customize:
def _chunk_text(text: str, chunk_size: int = 500, overlap: int = 50):
    # Adjust chunk_size and overlap as needed
    ...
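As a rough illustration of the sliding-window chunking described above (a standalone sketch, not the project's actual implementation), each chunk starts chunk_size − overlap characters after the previous one:

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into overlapping windows of chunk_size characters,
    stepping forward by chunk_size - overlap each time."""
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break
    return chunks

# A 25,000-character document (roughly 10 pages) yields 56 chunks
# with the default 500/50 parameters.
print(len(chunk_text("x" * 25_000)))  # 56
```

This matches the 50-100 chunks per 10-page PDF estimate above: with a 450-character effective stride, chunk count is about len(text) / 450.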

Searching the Vector Store

From Python

from backend.rag.lancedb_store import search

results = search(
    query="What are the contraindications for beta-blockers?",
    top_k=5
)

for result in results:
    print(f"Score: {result['score']:.3f}")
    print(f"Source: {result['source']}")
    print(f"Text: {result['text'][:100]}...\n")

From API

The Literature Agent automatically searches LanceDB when processing cases. No manual API calls needed.

Configuration

Environment Variables

# .env
LANCEDB_PATH=data/lancedb

Absolute Path

You can use an absolute path:
LANCEDB_PATH=/var/lib/clinicalpilot/lancedb
Relative paths are resolved from the project root.
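A sketch of how this resolution rule could be implemented (the LANCEDB_PATH variable comes from the project; the helper function below is illustrative, not the project's actual code):

```python
import os
from pathlib import Path

def resolve_lancedb_path(project_root: Path) -> Path:
    """Resolve LANCEDB_PATH: absolute paths are used as-is,
    relative paths are resolved from the project root."""
    raw = os.environ.get("LANCEDB_PATH", "data/lancedb")
    path = Path(raw)
    return path if path.is_absolute() else project_root / path

root = Path("/opt/clinicalpilot")
# e.g. /opt/clinicalpilot/data/lancedb when LANCEDB_PATH is unset
print(resolve_lancedb_path(root))
```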

Schema

LanceDB table schema:
Field    | Type        | Description
---------|-------------|--------------------------------------------
text     | str         | Document chunk text
source   | str         | Source file or reference
category | str         | Category (e.g., "cardiology", "pharmacology")
vector   | list[float] | 384-dimensional embedding
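A minimal sketch of a record matching this schema, with a dimension check before insertion (illustrative only; the actual table is managed by backend/rag/lancedb_store.py):

```python
EMBEDDING_DIM = 384  # all-MiniLM-L6-v2 output width

def make_row(text: str, source: str, category: str, vector: list[float]) -> dict:
    """Build a record matching the table schema, validating the vector width."""
    if len(vector) != EMBEDDING_DIM:
        raise ValueError(f"expected {EMBEDDING_DIM}-dim vector, got {len(vector)}")
    return {"text": text, "source": source, "category": category, "vector": vector}

row = make_row("Aspirin 81mg daily reduces cardiovascular risk.",
               "AHA Guidelines 2024", "cardiology", [0.0] * 384)
print(sorted(row.keys()))  # ['category', 'source', 'text', 'vector']
```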

Performance

Search Latency

  • 10K documents: ~10-20ms
  • 100K documents: ~50-100ms
  • 1M documents: ~200-500ms
LanceDB uses disk-based indices, so performance scales well even with large datasets. RAM usage remains low (~100-200MB).

Embedding Latency

  • Single query: ~5-10ms (CPU)
  • Batch (100 docs): ~500ms (CPU)
  • Batch (1000 docs): ~5s (CPU)
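The batch figures above work out to roughly 200 documents per second on CPU. A quick back-of-the-envelope estimate for ingestion time (using only the approximate timings quoted above):

```python
def estimate_embed_seconds(num_chunks: int, docs_per_second: float = 200.0) -> float:
    """Rough batch-embedding time, from the ~5 s per 1000 docs CPU figure."""
    return num_chunks / docs_per_second

# Embedding a 100K-chunk corpus on CPU takes on the order of 500 s (~8 minutes).
print(estimate_embed_seconds(100_000))  # 500.0
```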

Integration with Agents

The Literature Agent (backend/agents/literature.py) automatically queries LanceDB:
from backend.rag.lancedb_store import search

# Inside Literature Agent
rag_results = search(patient_context.chief_complaint, top_k=5)

# Agent uses RAG results + PubMed citations
prompt = f"""
Evidence from local knowledge base:
{rag_results}

Evidence from PubMed:
{pubmed_results}

Provide clinical recommendations...
"""

Best Practices

Do not commit the data/lancedb/ directory to Git; it can grow to several GB, so add it to .gitignore.
High-value content to ingest:
  1. Clinical Practice Guidelines (AHA, ACC, ACCP, etc.)
  2. Pharmacology References (DrugBank extracts, FDA labels)
  3. Differential Diagnosis Tables
  4. Your Organization’s Protocols (internal SOPs, pathways)

Avoiding Hallucinations

  • Cite sources: Always include source metadata so agents can reference guidelines
  • Keep chunks focused: Don’t mix unrelated topics in the same document
  • Update regularly: Medical knowledge changes — refresh your RAG store quarterly

Backup and Restore

Backup

tar -czf lancedb_backup.tar.gz data/lancedb/

Restore

tar -xzf lancedb_backup.tar.gz

Troubleshooting

"Table not found" Error

# Re-initialize the database
python -m backend.rag.lancedb_store --init

Search Returns No Results

Check that documents were ingested:
from backend.rag.lancedb_store import get_table

table = get_table()
print(f"Total documents: {table.count_rows()}")

Embedding Model Download Fails

# Pre-download the model
python -c "from sentence_transformers import SentenceTransformer; SentenceTransformer('all-MiniLM-L6-v2')"

"lancedb not installed" Warning

pip install lancedb --upgrade

Advanced: Custom Embeddings

To use OpenAI embeddings instead of sentence-transformers:
# backend/rag/embeddings.py
import openai

def embed_texts(texts: list[str]) -> list[list[float]]:
    response = openai.embeddings.create(
        model="text-embedding-3-small",
        input=texts
    )
    return [e.embedding for e in response.data]
OpenAI embeddings have 1536 dimensions (vs 384 for all-MiniLM-L6-v2). You’ll need to re-create the LanceDB table and re-ingest all documents.
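Because the table's vector column has a fixed width, mixing embedders silently breaks search. One defensive pattern (illustrative; the model names and dimensions are the ones stated above) is to refuse queries whose embedder doesn't match the table's:

```python
# Dimensions stated in this guide: MiniLM = 384, text-embedding-3-small = 1536.
KNOWN_DIMS = {"all-MiniLM-L6-v2": 384, "text-embedding-3-small": 1536}

def check_compatible(table_model: str, query_model: str) -> None:
    """Raise if query embeddings won't match the table's vector width."""
    if KNOWN_DIMS[table_model] != KNOWN_DIMS[query_model]:
        raise ValueError(
            f"{query_model} produces {KNOWN_DIMS[query_model]}-dim vectors, "
            f"but the table stores {KNOWN_DIMS[table_model]}-dim vectors; "
            "re-create the table and re-ingest."
        )

check_compatible("all-MiniLM-L6-v2", "all-MiniLM-L6-v2")  # OK, same model
```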

Next Steps

Observability

Monitor RAG retrieval quality with LangSmith tracing

Testing

Write tests to validate RAG search results
