Semantic search is optional and disabled by default. It requires additional dependencies and resources.
How It Works
Configuration
Enable semantic search in your config:

Embedding Providers
FastEmbed (Default)
FastEmbed provides local embedding models via the ONNX runtime.

Advantages:

- Runs locally (no API calls)
- Fast inference
- Low memory footprint
- No API costs
| Model | Dimensions | Description |
|---|---|---|
| BAAI/bge-small-en-v1.5 | 384 | Default; balanced speed and quality |
| BAAI/bge-base-en-v1.5 | 768 | Higher quality, slower |
| sentence-transformers/all-MiniLM-L6-v2 | 384 | Fast, good for short texts |
OpenAI
Use OpenAI’s embedding API for higher-quality embeddings.

Advantages:

- State-of-the-art quality
- No local compute required
- Latest models

Disadvantages:

- Requires an API key and internet access
- Per-token costs
- Data is sent to OpenAI
| Model | Dimensions | Cost (per 1M tokens) |
|---|---|---|
| text-embedding-3-small | 1536 | $0.02 |
| text-embedding-3-large | 3072 | $0.13 |
| text-embedding-ada-002 | 1536 | $0.10 (legacy) |
Database Setup
SQLite (sqlite-vec)
SQLite uses the sqlite-vec extension for vector storage:
PostgreSQL (pgvector)
PostgreSQL uses the pgvector extension:
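With pgvector installed, setup is a one-time `CREATE EXTENSION` plus a vector column. The DDL below is an illustrative sketch (table and column names are not the tool's actual schema):

```sql
-- One-time, per database (requires the pgvector package on the server)
CREATE EXTENSION IF NOT EXISTS vector;

-- Illustrative table: the dimension must match the embedding model
CREATE TABLE notes_embeddings (
    id        bigserial PRIMARY KEY,
    note_id   bigint NOT NULL,
    embedding vector(384)
);
```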
Usage
MCP Tool
CLI
Hybrid Search
Combine full-text and semantic search:

1. Run full-text search (keyword matching)
2. Run semantic search (meaning-based)
3. Merge and re-rank results using Reciprocal Rank Fusion (RRF)
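The merge-and-re-rank step can be sketched in plain Python (`rrf_merge` and the doc ids are illustrative; `k=60` is the constant from the original RRF paper):

```python
def rrf_merge(rankings, k=60):
    """Merge ranked id lists with Reciprocal Rank Fusion.

    A document's fused score is the sum over lists of 1 / (k + rank),
    so items ranked well in several lists rise to the top.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

fulltext = ["a", "b", "c"]   # keyword-match ranking
semantic = ["b", "d", "a"]   # embedding-similarity ranking
merged = rrf_merge([fulltext, semantic])  # "b" wins: strong in both lists
```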
Performance Considerations
Embedding Generation
FastEmbed (local):

- Speed: ~100-500 docs/sec (depending on hardware)
- Memory: ~200-500 MB for the model
- Batch processing: enabled by default (batch_size=32)

Vector Search Performance
Vector search is slower than full-text search. Use hybrid search to get the best of both.
| Database | Backend | Search Time (10k docs) |
|---|---|---|
| SQLite | sqlite-vec | ~20-50ms |
| PostgreSQL | pgvector (IVFFlat) | ~10-30ms |
| PostgreSQL | pgvector (HNSW) | ~5-15ms |
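For intuition about what these numbers measure: vector search is, at baseline, an exact nearest-neighbor scan over every stored embedding, which indexes like IVFFlat and HNSW approximate so they can skip most comparisons. A toy, pure-Python version of that baseline (illustrative names, 2-dim vectors):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def brute_force_search(query_vec, docs, top_k=3):
    """Exact scan: O(n_docs * dims) per query. Index structures trade
    a little recall for skipping most of these comparisons."""
    scored = [(doc_id, cosine(query_vec, vec)) for doc_id, vec in docs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_k]

docs = {"a": [1.0, 0.0], "b": [0.7, 0.7], "c": [0.0, 1.0]}
results = brute_force_search([1.0, 0.1], docs)  # "a" points closest to the query
```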
Optimization Tips
Use smaller embedding models
Smaller dimensions = faster search:
- 384 dimensions: Faster, good for most use cases
- 768 dimensions: Balanced
- 1536+ dimensions: Higher quality, slower
Batch embedding generation
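Grouping documents before calling the embedding model amortizes per-call overhead. A generic chunking helper (illustrative, mirroring the default batch_size=32):

```python
def batched(items, batch_size=32):
    """Yield successive slices of at most batch_size items."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]

texts = [f"note {i}" for i in range(70)]
batch_sizes = [len(batch) for batch in batched(texts)]  # 32 + 32 + 6 items
```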
Use PostgreSQL for large datasets
PostgreSQL with pgvector is much faster for collections of more than 10k documents.
Enable HNSW index (PostgreSQL)
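An HNSW index on the embedding column (pgvector 0.5.0 and later) speeds up approximate nearest-neighbor queries; the table and column names below are illustrative:

```sql
-- Operator class must match the distance used at query time
-- (vector_cosine_ops for cosine distance)
CREATE INDEX ON notes_embeddings USING hnsw (embedding vector_cosine_ops);
```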
Reindexing
Regenerate embeddings after changing models:

Search Quality
When to Use Semantic Search
✅ Good for:

- Finding conceptually similar notes
- Queries with synonyms or paraphrasing
- Discovering related topics
- Cross-lingual search (with multilingual models)

❌ Less suited for (use full-text search instead):

- Exact keyword matching
- Searching for specific names or IDs
- Boolean logic (AND, OR, NOT)
- Very short queries (< 3 words)
Example Comparisons
Full-Text Search

Query: `python logging`

Finds: documents containing “python” AND “logging”

Misses: documents about “debugging in Python” or “error handling”

Multilingual Search
Use multilingual models for cross-language search:

Privacy Considerations

With the OpenAI provider, note content is sent to OpenAI’s API to be embedded. For maximum privacy, use FastEmbed with local models.

Troubleshooting
ModuleNotFoundError: No module named 'fastembed'
Solution: Install semantic dependencies:
PostgreSQL: extension 'vector' not found
Solution: Install pgvector:
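Install the pgvector package for your platform and PostgreSQL major version (e.g. a `postgresql-<version>-pgvector` package on Debian/Ubuntu, or build from source), then enable it in the target database:

```sql
-- Run as a superuser in the database that stores embeddings
CREATE EXTENSION vector;
```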
Semantic search returns no results
Causes:
- Embeddings not generated yet
- Model mismatch (changed model without reindexing)
OpenAI API key not found
Solution: Set environment variable:
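The OpenAI client reads its key from `OPENAI_API_KEY` (assuming this tool uses the SDK's standard variable; the `sk-...` value is a placeholder):

```shell
export OPENAI_API_KEY="sk-..."
```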
Next Steps
- Search Guide: learn advanced search techniques
- Database Backends: configure SQLite or PostgreSQL