Semantic search is optional and disabled by default. It requires additional dependencies and resources.
How It Works
Configuration
Enable semantic search in your config:

Embedding Providers
FastEmbed (Default)
FastEmbed provides local embedding models via the ONNX runtime.

Advantages:

- Runs locally (no API calls)
- Fast inference
- Low memory footprint
- No API costs
| Model | Dimensions | Description |
|---|---|---|
| BAAI/bge-small-en-v1.5 | 384 | Default; balanced speed and quality |
| BAAI/bge-base-en-v1.5 | 768 | Higher quality, slower |
| sentence-transformers/all-MiniLM-L6-v2 | 384 | Fast, good for short texts |
OpenAI
Use OpenAI’s embedding API for higher-quality embeddings.

Advantages:

- State-of-the-art quality
- No local compute required
- Latest models

Disadvantages:

- Requires an API key and internet access
- Per-token costs
- Data is sent to OpenAI
| Model | Dimensions | Cost (per 1M tokens) |
|---|---|---|
| text-embedding-3-small | 1536 | $0.02 |
| text-embedding-3-large | 3072 | $0.13 |
| text-embedding-ada-002 | 1536 | $0.10 (legacy) |
Database Setup
SQLite (sqlite-vec)
SQLite uses the sqlite-vec extension for vector storage:
PostgreSQL (pgvector)
PostgreSQL uses the pgvector extension:
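With pgvector installed, setup is a one-time `CREATE EXTENSION` plus a vector column. The DDL below is an illustrative sketch (table and column names are not the tool's actual schema):

```sql
-- One-time, per database (requires the pgvector package on the server)
CREATE EXTENSION IF NOT EXISTS vector;

-- Illustrative table: the dimension must match the embedding model
CREATE TABLE notes_embeddings (
    id        bigserial PRIMARY KEY,
    note_id   bigint NOT NULL,
    embedding vector(384)
);
```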
Usage
MCP Tool
CLI
Hybrid Search
Combine full-text and semantic search:

1. Run full-text search (keyword matching)
2. Run semantic search (meaning-based)
3. Merge and re-rank results using Reciprocal Rank Fusion (RRF)
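The merge-and-re-rank step can be sketched in plain Python (`rrf_merge` and the doc ids are illustrative; `k=60` is the constant from the original RRF paper):

```python
def rrf_merge(rankings, k=60):
    """Merge ranked id lists with Reciprocal Rank Fusion.

    A document's fused score is the sum over lists of 1 / (k + rank),
    so items ranked well in several lists rise to the top.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

fulltext = ["a", "b", "c"]   # keyword-match ranking
semantic = ["b", "d", "a"]   # embedding-similarity ranking
merged = rrf_merge([fulltext, semantic])  # "b" wins: strong in both lists
```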
Performance Considerations
Embedding Generation
FastEmbed (local):

- Speed: ~100-500 docs/sec (depending on hardware)
- Memory: ~200-500 MB for the model
- Batch processing: enabled by default (batch_size=32)

Vector Search Performance
Vector search is slower than full-text search. Use hybrid search to get the best of both.
| Database | Backend | Search Time (10k docs) |
|---|---|---|
| SQLite | sqlite-vec | ~20-50ms |
| PostgreSQL | pgvector (IVFFlat) | ~10-30ms |
| PostgreSQL | pgvector (HNSW) | ~5-15ms |
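For intuition about what these numbers measure: vector search is, at baseline, an exact nearest-neighbor scan over every stored embedding, which indexes like IVFFlat and HNSW approximate so they can skip most comparisons. A toy, pure-Python version of that baseline (illustrative names, 2-dim vectors):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def brute_force_search(query_vec, docs, top_k=3):
    """Exact scan: O(n_docs * dims) per query. Index structures trade
    a little recall for skipping most of these comparisons."""
    scored = [(doc_id, cosine(query_vec, vec)) for doc_id, vec in docs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_k]

docs = {"a": [1.0, 0.0], "b": [0.7, 0.7], "c": [0.0, 1.0]}
results = brute_force_search([1.0, 0.1], docs)  # "a" points closest to the query
```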
Optimization Tips
Use smaller embedding models
Smaller dimensions = faster search:
- 384 dimensions: Faster, good for most use cases
- 768 dimensions: Balanced
- 1536+ dimensions: Higher quality, slower
Batch embedding generation
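Grouping documents before calling the embedding model amortizes per-call overhead. A generic chunking helper (illustrative, mirroring the default batch_size=32):

```python
def batched(items, batch_size=32):
    """Yield successive slices of at most batch_size items."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]

texts = [f"note {i}" for i in range(70)]
batch_sizes = [len(batch) for batch in batched(texts)]  # 32 + 32 + 6 items
```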
Use PostgreSQL for large datasets
PostgreSQL with pgvector is much faster for collections of more than 10k documents.
Enable HNSW index (PostgreSQL)
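An HNSW index on the embedding column (pgvector 0.5.0 and later) speeds up approximate nearest-neighbor queries; the table and column names below are illustrative:

```sql
-- Operator class must match the distance used at query time
-- (vector_cosine_ops for cosine distance)
CREATE INDEX ON notes_embeddings USING hnsw (embedding vector_cosine_ops);
```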
Reindexing
Regenerate embeddings after changing models:

Search Quality
When to Use Semantic Search
✅ Good for:

- Finding conceptually similar notes
- Queries with synonyms or paraphrasing
- Discovering related topics
- Cross-lingual search (with multilingual models)

❌ Less suited for (use full-text search instead):

- Exact keyword matching
- Searching for specific names or IDs
- Boolean logic (AND, OR, NOT)
- Very short queries (< 3 words)
Example Comparisons
Full-Text Search

Query: `python logging`

Finds: documents containing “python” AND “logging”

Misses: documents about “debugging in Python” or “error handling”

Multilingual Search
Use multilingual models for cross-language search:

Privacy Considerations

With the OpenAI provider, note content is sent to OpenAI’s API to be embedded. For maximum privacy, use FastEmbed with local models.

Troubleshooting
ModuleNotFoundError: No module named 'fastembed'
Solution: Install semantic dependencies:
PostgreSQL: extension 'vector' not found
Solution: Install pgvector:
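Install the pgvector package for your platform and PostgreSQL major version (e.g. a `postgresql-<version>-pgvector` package on Debian/Ubuntu, or build from source), then enable it in the target database:

```sql
-- Run as a superuser in the database that stores embeddings
CREATE EXTENSION vector;
```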
Semantic search returns no results
Causes:
- Embeddings not generated yet
- Model mismatch (changed model without reindexing)
OpenAI API key not found
Solution: Set environment variable:
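The OpenAI client reads its key from `OPENAI_API_KEY` (assuming this tool uses the SDK's standard variable; the `sk-...` value is a placeholder):

```shell
export OPENAI_API_KEY="sk-..."
```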
Next Steps
- Search Guide: learn advanced search techniques
- Database Backends: configure SQLite or PostgreSQL