The EmbeddingManager class generates vector embeddings for text using pre-trained Sentence Transformer models. It provides a simple interface for converting text into dense vector representations suitable for semantic search.
Class definition
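A sketch of the class and its constructor. The parameter name model_name is an assumption; the docs below describe only its role and default value.

```python
class EmbeddingManager:
    """Generates dense vector embeddings using a Sentence Transformer model."""

    def __init__(self, model_name: str = "all-MiniLM-L6-v2"):
        # `model_name` is an assumed parameter name; the default matches
        # the model described in the constructor parameters below.
        self.model_name = model_name
        self.model = None  # populated lazily by _load_model() on first use
```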
Constructor parameters
Name of the Sentence Transformer model to use for generating embeddings. The default, all-MiniLM-L6-v2, is a lightweight, efficient model that produces 384-dimensional embeddings and offers a good balance of speed and quality for code embeddings. The model is automatically downloaded from Hugging Face on first use and cached locally for subsequent runs.
Methods
generate_embeddings()
Generates vector embeddings for a list of text strings.
Takes a list of text strings to generate embeddings for; each string is encoded independently.
Returns a NumPy array of embeddings with shape (len(texts), embedding_dimension); for the default model, the embedding dimension is 384. A progress bar is displayed during embedding generation for large batches.
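A minimal sketch of the method. For brevity, a loaded SentenceTransformer instance is injected into the constructor here; the real class loads it from a model name.

```python
import numpy as np


class EmbeddingManager:
    def __init__(self, model):
        # A loaded SentenceTransformer instance, injected for brevity;
        # the documented class loads it lazily from a model name instead.
        self.model = model

    def generate_embeddings(self, texts: list) -> np.ndarray:
        """Encode texts into a (len(texts), embedding_dimension) array."""
        # show_progress_bar surfaces the progress bar the docs describe.
        return np.asarray(self.model.encode(texts, show_progress_bar=True))
```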
_load_model()
Internal method that loads the Sentence Transformer model.
- Downloads the model if not cached
- Loads the model into memory
- Prints the embedding dimension for verification
- Raises an exception if the model fails to load
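The steps above might be implemented along these lines. This is a sketch: the lazy import, the early return for an already-loaded model, and the exact error message are all assumptions.

```python
class EmbeddingManager:
    def __init__(self, model_name: str = "all-MiniLM-L6-v2"):
        self.model_name = model_name
        self.model = None

    def _load_model(self):
        """Download (if not cached) and load the model, printing its dimension."""
        if self.model is not None:
            return  # already loaded; nothing to do
        try:
            # Imported lazily so merely importing this module stays cheap.
            from sentence_transformers import SentenceTransformer
            self.model = SentenceTransformer(self.model_name)
            # Print the embedding dimension for verification.
            print(f"Embedding dimension: "
                  f"{self.model.get_sentence_embedding_dimension()}")
        except Exception as exc:
            raise RuntimeError(
                f"Failed to load embedding model {self.model_name!r}"
            ) from exc
```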
Usage example
Alternative models
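A few commonly used alternatives from the sentence-transformers model hub. This particular selection is illustrative, not the project's; dimensions are from the public model cards.

```python
# Model name -> rough trade-off relative to the default all-MiniLM-L6-v2.
ALTERNATIVE_MODELS = {
    "all-mpnet-base-v2": "higher quality, 768-dimensional, slower",
    "all-MiniLM-L12-v2": "slightly better than L6 at modest extra cost",
    "paraphrase-multilingual-MiniLM-L12-v2": "multilingual, 384-dimensional",
}
```

Any of these names can be passed to the constructor in place of the default.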
Integration example
From main.py, showing embedding generation in the RAG pipeline:
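The original excerpt is not reproduced here. An illustrative stand-in for that wiring might look like the following; every name is an assumption, and a stub encoder replaces the real model so the sketch runs without a download.

```python
import numpy as np


# Illustrative stand-in for the pipeline wiring; all names are assumed,
# and the stub encoder substitutes for a real Sentence Transformer model.
class EmbeddingManager:
    def __init__(self, encode=None):
        self._encode = encode or (lambda ts: np.zeros((len(ts), 384)))

    def generate_embeddings(self, texts):
        return np.asarray(self._encode(texts))


def build_index(chunks):
    """Embed document chunks and pair each vector with its source text."""
    manager = EmbeddingManager()
    vectors = manager.generate_embeddings(chunks)
    return list(zip(chunks, vectors))


index = build_index(["first chunk", "second chunk"])
print(len(index))  # 2
```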
Batch processing
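SentenceTransformer.encode already batches internally (it accepts a batch_size argument), but explicit chunking can help when embedding very large corpora incrementally. A minimal sketch of a hypothetical helper, where encode is any callable mapping a list of strings to a 2-D array:

```python
import numpy as np


def embed_in_batches(encode, texts, batch_size=64):
    """Embed a non-empty list of texts in fixed-size chunks, then stack."""
    parts = []
    for start in range(0, len(texts), batch_size):
        batch = texts[start:start + batch_size]
        parts.append(np.asarray(encode(batch)))  # one 2-D block per batch
    return np.vstack(parts)
```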
Model information
- Dimensions: 384
- Max sequence length: 256 word pieces (~200 words)
- Performance: ~14,000 sentences/second on CPU
- Size: ~80 MB
- Training: Trained on 1 billion sentence pairs
Error handling
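The docs note that _load_model raises if the model fails to load. A caller-side guard might look like this sketch (function and parameter names are assumptions):

```python
def embed_or_raise(manager, texts):
    """Wrap embedding generation with explicit, caller-friendly errors."""
    if not texts:
        return []  # nothing to embed; avoid invoking the model on empty input
    try:
        return manager.generate_embeddings(texts)
    except Exception as exc:
        # Surface model-loading or encoding failures with added context.
        raise RuntimeError(f"Embedding generation failed: {exc}") from exc
```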
Implementation notes
- Uses the sentence-transformers library for embedding generation
- Models are automatically cached in ~/.cache/torch/sentence_transformers/
- Embeddings are generated on CPU by default (GPU acceleration available if PyTorch with CUDA is installed)
- Progress bars are displayed for long-running embedding generation
- Thread-safe for concurrent embedding generation