Overview
The system uses Sentence Transformers for text embeddings and FAISS (Facebook AI Similarity Search) for efficient vector similarity search. This enables semantic retrieval, concept matching, and duplicate detection.

Source Files:
- backend/resume_processor.py
- scripts/mistral_faiss.py
- backend/rag.py
Embedding Model
all-MiniLM-L6-v2
A lightweight, fast sentence embedding model optimized for semantic similarity tasks.
- Dimensions: 384
- Max Sequence Length: 256 tokens
- Performance: 14,200 sentences/sec on V100 GPU
- Size: 80 MB
- Training: Trained on 1B+ sentence pairs
interview_analyzer.py:23, resume_processor.py:38, rag.py:79
Normalized Embeddings
All embeddings are L2-normalized for efficient cosine similarity via inner product.
- Cosine similarity = inner product when vectors are normalized
- Faster computation (no division needed)
- FAISS IndexFlatIP is optimized for inner product search
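To see why the inner product of L2-normalized vectors equals cosine similarity (and why no division is needed at query time), here is a minimal pure-Python sketch with toy 3-dimensional vectors standing in for real embeddings:

```python
import math

def l2_normalize(v):
    # Divide each component by the vector's Euclidean norm
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

a, b = [1.0, 2.0, 3.0], [4.0, 5.0, 6.0]

# Cosine similarity computed the usual way (with division)
cosine = dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

# Inner product of pre-normalized vectors: same value, no division per query
ip = dot(l2_normalize(a), l2_normalize(b))

assert abs(cosine - ip) < 1e-12
```

Normalizing once at encode time means every subsequent search is a plain dot product, which is exactly the operation IndexFlatIP accelerates.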
FAISS Index Structure
IndexFlatIP
Inner Product index for normalized vectors (equivalent to cosine similarity).

mistral_faiss.py:43-55, resume_processor.py:59-66
Index Types Comparison
| Index Type | Description | Use Case |
|---|---|---|
| IndexFlatIP | Exact inner product search | Normalized vectors, high accuracy |
| IndexFlatL2 | Exact L2 distance search | Non-normalized vectors |
| IndexIVFFlat | Inverted file index | Large datasets, approximate search |
| IndexHNSWFlat | Hierarchical NSW graph | Very large datasets, fast retrieval |

The system uses IndexFlatIP (exact search, no approximation).
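The exact search IndexFlatIP performs is a brute-force inner product against every stored vector. A numpy stand-in makes the computation explicit (with FAISS itself this would be `faiss.IndexFlatIP(dim)`, `index.add(xb)`, `index.search(xq, k)`):

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for stored embeddings: 100 L2-normalized 384-dim vectors
xb = rng.standard_normal((100, 384)).astype("float32")
xb /= np.linalg.norm(xb, axis=1, keepdims=True)

# One normalized query vector
xq = rng.standard_normal((1, 384)).astype("float32")
xq /= np.linalg.norm(xq, axis=1, keepdims=True)

# Exact inner-product search: score every stored vector, keep top-k.
# This is the brute-force computation IndexFlatIP performs (no approximation).
k = 5
scores_all = xq @ xb.T                    # shape (1, 100)
indices = np.argsort(-scores_all[0])[:k]  # positions of the best k vectors
scores = scores_all[0][indices]

assert len(indices) == k
assert all(scores[i] >= scores[i + 1] for i in range(k - 1))
```

IVF and HNSW indexes avoid scoring every vector and thus scale better, at the cost of approximate results; for the index sizes here, exact search is fast enough.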
Knowledge Base Index
Index Building Process
Builds FAISS index from cleaned knowledge base.

mistral_faiss.py:43-66
Chunk Creation
Creates searchable chunks from Q&A pairs.

mistral_faiss.py:24-40
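A minimal sketch of turning Q&A pairs into searchable chunks with position-aligned metadata. The pair format and field names here are hypothetical; the actual structure lives in mistral_faiss.py:24-40:

```python
# Hypothetical Q&A pair format for illustration only
qa_pairs = [
    {"question": "What is FAISS?", "answer": "A library for vector similarity search."},
    {"question": "What model is used?", "answer": "all-MiniLM-L6-v2."},
]

def make_chunks(pairs):
    """Turn each Q&A pair into one text chunk plus metadata at the same position."""
    chunks, metas = [], []
    for i, pair in enumerate(pairs):
        chunks.append(f"Q: {pair['question']}\nA: {pair['answer']}")
        metas.append({"id": i, "question": pair["question"]})
    return chunks, metas

chunks, metas = make_chunks(qa_pairs)
assert len(chunks) == len(metas) == 2
```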
Metadata Storage (metas.json)
Metadata is stored separately for efficient retrieval.
- FAISS stores only vectors, not metadata
- Metadata is indexed by position (0-based)
- Fast lookup: meta = metas[idx]
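Because FAISS holds only vectors, the metadata list is persisted alongside the index and must stay in the order vectors were added. A sketch of the round trip through metas.json (file name from the section above; the write location here is a temp directory for illustration):

```python
import json
import os
import tempfile

metas = [
    {"question": "What is FAISS?", "topic": "search"},
    {"question": "What model is used?", "topic": "embeddings"},
]

path = os.path.join(tempfile.mkdtemp(), "metas.json")

# Write metadata next to the FAISS index file; list order must match
# the order in which vectors were added to the index.
with open(path, "w") as f:
    json.dump(metas, f)

with open(path) as f:
    loaded = json.load(f)

# A FAISS hit at position idx maps straight to its metadata
idx = 1
assert loaded[idx]["topic"] == "embeddings"
```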
Resume Index
Per-user FAISS index for resume content.

Resume Processing
resume_processor.py:29-75
Chunking Strategy
RecursiveCharacterTextSplitter Parameters:
- chunk_size=500: Maximum chunk length (characters)
- chunk_overlap=50: Overlap between chunks to preserve context
- separators=["\n\n", "\n", " ", ""]: Split priority (paragraphs > lines > words > chars)

Benefits:
- Semantic coherence within chunks
- Context preservation via overlap
- Handles varied resume formats
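The real splitting is done by LangChain's RecursiveCharacterTextSplitter; a simplified sliding-window version illustrates what chunk_size and chunk_overlap mean (it ignores the separator hierarchy for brevity):

```python
def split_with_overlap(text, chunk_size=500, chunk_overlap=50):
    """Simplified character splitter: fixed-size windows that overlap,
    so context at chunk boundaries is not lost. Unlike the real
    RecursiveCharacterTextSplitter, this does not respect separators."""
    chunks, start = [], 0
    step = chunk_size - chunk_overlap
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += step
    return chunks

text = "x" * 1200
chunks = split_with_overlap(text)
assert len(chunks) == 3
# The last 50 chars of one chunk reappear at the start of the next
assert chunks[0][-50:] == chunks[1][:50]
```

The separator hierarchy in the real splitter tries to cut at paragraph breaks first, falling back to lines, words, and finally raw characters, which keeps chunks semantically coherent.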
Search Operations
Basic Search
- scores: Similarity scores (higher = more similar)
- indices: Positions in the index, used to look up metadata
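Since indices are positions into the index, retrieving results is a positional lookup into the metadata list. A toy sketch with hand-written search output shaped like FAISS's `index.search(query, k)` return value (chunk names are illustrative):

```python
metas = [{"text": "chunk A"}, {"text": "chunk B"}, {"text": "chunk C"}]

# Toy output shaped like a FAISS search with k=2:
# scores descend, indices point into the index (and hence into metas)
scores = [0.93, 0.81]
indices = [2, 0]

results = [(score, metas[idx]["text"]) for score, idx in zip(scores, indices)]
assert results == [(0.93, "chunk C"), (0.81, "chunk A")]
```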
Resume Search
resume_processor.py:77-109
Topic-Filtered Search
rag.py:167-193
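Topic filtering follows the over-fetch-and-filter pattern described under Best Practices: fetch k*3 candidates, keep only those whose metadata matches the topic. A sketch with a fake search function; the real logic is in rag.py:167-193 and the field names are assumptions:

```python
def topic_filtered_search(search_fn, metas, topic, k=3):
    """Over-fetch k*3 candidates, then keep only hits whose metadata
    matches the requested topic, stopping once k results are found."""
    scores, indices = search_fn(k * 3)
    results = []
    for score, idx in zip(scores, indices):
        if metas[idx].get("topic") == topic:
            results.append((score, idx))
        if len(results) == k:
            break
    return results

# Fake search returning descending scores over six indexed chunks
metas = [{"topic": t} for t in ["a", "b", "a", "a", "b", "a"]]
fake_search = lambda n: ([0.9, 0.8, 0.7, 0.6, 0.5, 0.4][:n], [0, 1, 2, 3, 4, 5][:n])

hits = topic_filtered_search(fake_search, metas, topic="a", k=3)
assert [idx for _, idx in hits] == [0, 2, 3]
```

Over-fetching compensates for candidates that get discarded by the filter, so the caller still receives k results in the common case.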
Vector Dimensions
Embedding Space
Distance Metrics
Inner Product (Normalized Vectors)

Similarity Thresholds
| Use Case | Threshold | Interpretation |
|---|---|---|
| Semantic Deduplication | 0.75 | Very similar questions |
| Concept Matching | 0.65 | Concept present in answer |
| Topic Detection | 0.50 | Weak topic signal |
| Retrieval | 0.30 | Potentially relevant |
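Applying these thresholds is a simple comparison against the similarity score; a sketch using the values from the table above (the lookup-table structure is illustrative, not taken from the codebase):

```python
# Thresholds from the table above
THRESHOLDS = {
    "dedup": 0.75,      # Semantic deduplication: very similar questions
    "concept": 0.65,    # Concept matching: concept present in answer
    "topic": 0.50,      # Topic detection: weak topic signal
    "retrieval": 0.30,  # Retrieval: potentially relevant
}

def passes(use_case: str, similarity: float) -> bool:
    return similarity >= THRESHOLDS[use_case]

assert passes("dedup", 0.80)      # near-duplicate question
assert not passes("dedup", 0.70)  # similar, but not a duplicate
assert passes("retrieval", 0.35)  # worth retrieving
```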
Job Description Embeddings
Store JD embeddings for interview personalization.

Storage
resume_processor.py:118-142
Retrieval
resume_processor.py:145-160
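A sketch of the store/retrieve pair as JSON on disk. The file layout, function names mirroring store_jd_embedding()/get_jd_embedding(), and field names are assumptions; the actual implementation is in resume_processor.py:118-160:

```python
import json
import os
import tempfile

# Hypothetical storage directory (temp dir for illustration)
store_dir = tempfile.mkdtemp()

def store_jd_embedding(user_id, embedding):
    path = os.path.join(store_dir, f"jd_{user_id}.json")
    with open(path, "w") as f:
        json.dump({"embedding": embedding}, f)

def get_jd_embedding(user_id):
    path = os.path.join(store_dir, f"jd_{user_id}.json")
    if not os.path.exists(path):
        return None  # no JD stored for this user
    with open(path) as f:
        return json.load(f)["embedding"]

store_jd_embedding("u1", [0.1, 0.2, 0.3])
assert get_jd_embedding("u1") == [0.1, 0.2, 0.3]
assert get_jd_embedding("missing") is None
```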
Performance Optimizations
1. Batch Encoding
2. GPU Acceleration
3. Index Caching
rag.py:66-117
4. Float32 Precision
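FAISS indexes store float32, so embeddings are converted before being added; batching the encode call also amortizes model overhead. A numpy sketch standing in for the encoder output (with sentence-transformers, the batch would come from `embedder.encode(texts, normalize_embeddings=True)`):

```python
import numpy as np

# Stand-in for a batch of 32 embeddings returned by the encoder;
# encoding all texts in one call is faster than per-text calls
batch = np.random.default_rng(1).standard_normal((32, 384))

# Convert to contiguous float32 before index.add()
embeddings = np.ascontiguousarray(batch, dtype="float32")

assert embeddings.dtype == np.float32
assert embeddings.shape == (32, 384)
```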
Index Statistics
Knowledge Base Index
Resume Index
File Structure
Key Functions Summary
| Function | Purpose | Location |
|---|---|---|
| build_faiss_index() | Build KB index from Q&A pairs | mistral_faiss.py:43 |
| process_resume_for_faiss() | Create user resume index | resume_processor.py:29 |
| search_resume_faiss() | Search user resume | resume_processor.py:77 |
| store_jd_embedding() | Save JD embedding | resume_processor.py:118 |
| get_jd_embedding() | Load JD embedding | resume_processor.py:145 |
| load_index_and_metas() | Load cached KB index | rag.py:98 |
| get_embedder() | Get cached embedder | rag.py:74 |
Best Practices
- Always Normalize: Use normalize_embeddings=True for consistent similarity scores
- Cache Models: Load the embedder once and reuse it across requests
- Batch Operations: Encode multiple texts together for speed
- Float32: Convert embeddings to float32 before adding to FAISS
- Metadata Sync: Keep the metadata array aligned with FAISS index positions
- Over-fetch & Filter: Search k*3 candidates, then filter down to k for topic-specific retrieval