
Overview

The RAG Recruitment Assistant uses HuggingFaceEmbeddings from LangChain to convert resume text into vector representations for semantic search and retrieval.

Installation

pip install langchain-huggingface

Basic Setup

from langchain_huggingface import HuggingFaceEmbeddings

# Initialize embeddings model
embeddings = HuggingFaceEmbeddings(
    model_name="sentence-transformers/all-MiniLM-L6-v2"
)

Configuration Parameters

model_name
string, default: "sentence-transformers/all-MiniLM-L6-v2"
The HuggingFace model identifier used to generate embeddings.

Common models:
  • sentence-transformers/all-MiniLM-L6-v2 - Fast, lightweight (~90MB)
  • sentence-transformers/all-mpnet-base-v2 - Higher quality, larger (~420MB)
  • sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2 - Multilingual support

model_kwargs
dict, default: {}
Additional keyword arguments passed to the underlying model.

embeddings = HuggingFaceEmbeddings(
    model_kwargs={'device': 'cuda'}  # Use GPU
)

encode_kwargs
dict, default: {}
Keyword arguments passed to the model's encode method.

embeddings = HuggingFaceEmbeddings(
    encode_kwargs={'normalize_embeddings': True}  # Unit-length vectors
)

cache_folder
string, optional
Directory in which downloaded models are cached.

embeddings = HuggingFaceEmbeddings(
    cache_folder="./model_cache"
)
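Setting encode_kwargs={'normalize_embeddings': True} scales each vector to unit length, which makes cosine similarity reduce to a plain dot product. A minimal pure-Python sketch of that relationship (no model download involved):

```python
import math

def normalize(v):
    """Scale a vector to unit length, as normalize_embeddings=True does."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

a, b = [3.0, 4.0], [6.0, 8.0]
na, nb = normalize(a), normalize(b)

# For unit-length vectors, the dot product equals cosine similarity.
assert abs(dot(na, nb) - cosine(a, b)) < 1e-9
print(round(dot(na, nb), 4))  # 1.0 (parallel vectors)
```

Normalized vectors are useful with vector stores that rank by inner product.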

Methods

embed_documents()

Generate embeddings for multiple documents.

Parameters:
  texts (list[str], required) - List of text documents to embed.

Returns:
  list[list[float]] - A list of embedding vectors, one per input document.
documents = [
    "Estudiante de 9no ciclo con Python y React",  # "9th-cycle student with Python and React"
    "Data Analyst con experiencia en PowerBI"      # "Data Analyst with PowerBI experience"
]

vectors = embeddings.embed_documents(documents)
print(f"Generated {len(vectors)} embeddings")
print(f"Vector dimension: {len(vectors[0])}")
# Output: Generated 2 embeddings
# Output: Vector dimension: 384

embed_query()

Generate an embedding for a single search query.

Parameters:
  text (str, required) - Query text to embed.

Returns:
  list[float] - A single embedding vector for the query.
query = "Busco desarrollador Python con FastAPI"  # "Looking for a Python developer with FastAPI"
query_vector = embeddings.embed_query(query)
print(f"Query vector dimension: {len(query_vector)}")
# Output: Query vector dimension: 384
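Retrieval then compares the query vector against each document vector and keeps the closest matches. With toy 3-dimensional vectors standing in for real 384-dimensional model output (the numbers below are hypothetical, for illustration only), ranking by cosine similarity looks like:

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

# Toy vectors standing in for embed_documents() output (hypothetical values).
doc_vectors = {
    "Python/React student": [0.9, 0.1, 0.2],
    "PowerBI analyst":      [0.1, 0.9, 0.3],
}
query_vector = [0.8, 0.2, 0.1]  # stands in for embed_query() output

# Rank documents by similarity to the query, best match first.
ranked = sorted(doc_vectors, key=lambda name: cosine(query_vector, doc_vectors[name]), reverse=True)
print(ranked[0])  # Python/React student
```

This brute-force scoring is what a vector store does efficiently at scale.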

Real-World Usage

From the Talent Scout System

This example, adapted from reference/notebook/Talent_Scout_3000x.ipynb, shows how embeddings are used in the actual recruitment assistant:
from langchain_community.document_loaders import PyPDFLoader
from langchain_community.vectorstores import FAISS
from langchain_huggingface import HuggingFaceEmbeddings

# Initialize embeddings
embeddings = HuggingFaceEmbeddings()

# Load resume PDF
loader = PyPDFLoader("CV_Estudiante_4_Fernanda_Paredes.pdf")
docs = loader.load()

# Create vector store with embeddings
vectorstore = FAISS.from_documents(docs, embeddings)

Batch Processing Multiple Resumes

import glob
from langchain_community.document_loaders import PyPDFLoader
from langchain_huggingface import HuggingFaceEmbeddings

# Initialize embeddings (or reuse the instance created earlier)
embeddings = HuggingFaceEmbeddings()

# Process multiple student resumes
archivos = glob.glob("cvs_estudiantes_final/*.pdf")

for pdf_path in archivos:
    loader = PyPDFLoader(pdf_path)
    pages = loader.load()
    
    # Generate embeddings for all pages
    texts = [p.page_content for p in pages]
    vectors = embeddings.embed_documents(texts)
    
    print(f"Processed {pdf_path}: {len(vectors)} vectors")

Performance Considerations

Model Download

On first run, the model (~90MB) is downloaded from HuggingFace Hub. Subsequent runs use the cached version.

GPU Acceleration

# Enable GPU for faster embedding generation
embeddings = HuggingFaceEmbeddings(
    model_kwargs={'device': 'cuda'}
)

Batch Size

For large documents, process in batches to avoid memory issues:
def batch_embed(texts, batch_size=32):
    """Embed texts in fixed-size batches to keep peak memory bounded."""
    all_embeddings = []
    for i in range(0, len(texts), batch_size):
        batch = texts[i:i + batch_size]
        embeddings_batch = embeddings.embed_documents(batch)
        all_embeddings.extend(embeddings_batch)
    return all_embeddings
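The batching logic itself can be sanity-checked without downloading a model, using a stand-in embedder (the StubEmbeddings class and the explicit embeddings parameter below are illustrative, not part of LangChain):

```python
# Stand-in embedder: maps each text to a tiny fixed-size vector. Used here
# only to verify the batching pattern, not to produce real embeddings.
class StubEmbeddings:
    def embed_documents(self, texts):
        return [[float(len(t)), 0.0, 1.0] for t in texts]

def batch_embed(embeddings, texts, batch_size=32):
    """Embed texts in fixed-size batches; takes the embedder explicitly."""
    all_embeddings = []
    for i in range(0, len(texts), batch_size):
        all_embeddings.extend(embeddings.embed_documents(texts[i:i + batch_size]))
    return all_embeddings

stub = StubEmbeddings()
texts = [f"resume {n}" for n in range(75)]
vectors = batch_embed(stub, texts, batch_size=32)

# Batching must not change the result, only the peak memory profile.
assert vectors == stub.embed_documents(texts)
print(len(vectors))  # 75
```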

Error Handling

from langchain_huggingface import HuggingFaceEmbeddings

try:
    embeddings = HuggingFaceEmbeddings(
        model_name="sentence-transformers/all-MiniLM-L6-v2"
    )
    
    # Test embedding
    test_vector = embeddings.embed_query("Test query")
    print(f"✓ Embeddings loaded successfully ({len(test_vector)} dimensions)")
    
except Exception as e:
    print(f"Error loading embeddings: {e}")
    # Fallback to default model
    embeddings = HuggingFaceEmbeddings()

Vector Dimensions

Model                                   Dimension   Size    Speed
all-MiniLM-L6-v2                        384         90MB    Fast
all-mpnet-base-v2                       768         420MB   Medium
paraphrase-multilingual-MiniLM-L12-v2   384         420MB   Medium
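Query and document vectors must come from the same model: dimensions differ between models, and even same-dimension models produce incompatible embedding spaces. A small guard (the dimension figures come from the table above; the helper itself is a sketch, not a LangChain API):

```python
# Embedding dimensions per model, per the table above.
MODEL_DIMENSIONS = {
    "sentence-transformers/all-MiniLM-L6-v2": 384,
    "sentence-transformers/all-mpnet-base-v2": 768,
    "sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2": 384,
}

def check_dimension(model_name, vector):
    """Raise if a vector's length does not match the model's dimension."""
    expected = MODEL_DIMENSIONS[model_name]
    if len(vector) != expected:
        raise ValueError(f"expected {expected} dims for {model_name}, got {len(vector)}")

check_dimension("sentence-transformers/all-MiniLM-L6-v2", [0.0] * 384)  # OK
```

Switching models therefore means re-embedding every stored document, not just new queries.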

Next Steps

Vector Store

Learn how to store and search embeddings with FAISS

LLM Integration

Connect embeddings to the language model
