Overview
The RAG (Retrieval-Augmented Generation) system combines semantic vector search with large language model generation to provide accurate, context-aware responses. The pipeline uses FAISS for similarity search and the Mistral API for text generation.
Source file: backend/rag.py
Core Components
1. Topic Detection & Domain Restriction
The system enforces strict domain boundaries to ensure responses stay within allowed topics.
Allowed Topics
Topic Aliases
Normalizes variations of topic names to canonical forms: rag.py:35-56
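The alias step can be illustrated with a minimal sketch. The alias entries and the `normalize_topic` name below are hypothetical and chosen only for illustration; the real table is the one at rag.py:35-56.

```python
# Hypothetical alias table; the actual entries are defined in rag.py:35-56.
TOPIC_ALIASES = {
    "js": "javascript",
    "py": "python",
    "ml": "machine learning",
}


def normalize_topic(raw: str) -> str:
    """Map a user-supplied topic name to its canonical form."""
    # Lowercase and collapse whitespace so "  JS " and "js" look the same.
    key = " ".join(raw.strip().lower().split())
    # Unknown names fall through unchanged (an assumption of this sketch).
    return TOPIC_ALIASES.get(key, key)
```

Normalizing before the lookup keeps the alias table small, since it only needs one entry per spelling rather than one per casing or spacing variant.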
Keyword-Based Topic Detection
rag.py:133-143
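One way keyword-based detection could work is to count keyword hits per topic and pick the topic with the most hits. This is a sketch under that assumption; the keyword lists and the `detect_topic_by_keywords` name are illustrative and not taken from rag.py.

```python
import re
from typing import Optional

# Hypothetical keyword lists; the real ones live in rag.py.
TOPIC_KEYWORDS = {
    "python": ["python", "decorator", "generator"],
    "sql": ["sql", "join", "index"],
}


def detect_topic_by_keywords(text: str) -> Optional[str]:
    """Return the topic with the most keyword hits, or None if nothing matches."""
    # Tokenize on simple word characters so "SQL?" still matches "sql".
    tokens = set(re.findall(r"[a-z0-9+#]+", text.lower()))
    best_topic, best_hits = None, 0
    for topic, keywords in TOPIC_KEYWORDS.items():
        hits = sum(1 for kw in keywords if kw in tokens)
        if hits > best_hits:
            best_topic, best_hits = topic, hits
    return best_topic
```

Returning `None` when no keyword matches lets the caller fall back to another detection method or reject the query as out of domain.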
RAG Voting for Topic Detection
Uses majority voting from top-k retrieved chunks: rag.py:446-490
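The voting idea can be sketched as follows, assuming each retrieved chunk carries a topic label. The `vote_topic` name and the `min_share` threshold are assumptions of this sketch, not values confirmed by rag.py.

```python
from collections import Counter
from typing import Optional, Sequence


def vote_topic(chunk_topics: Sequence[str], min_share: float = 0.5) -> Optional[str]:
    """Pick the topic held by a strict majority of the top-k chunks."""
    if not chunk_topics:
        return None
    topic, count = Counter(chunk_topics).most_common(1)[0]
    # Require more than min_share of the votes; otherwise report no winner.
    return topic if count / len(chunk_topics) > min_share else None
```

A strict threshold avoids committing to a topic when the retrieved chunks are evenly split, which is usually a sign that the query is ambiguous or off-domain.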
2. Caching Architecture
The system implements global caching to avoid repeated model loading and data reads.
Cache Variables
rag.py:66-70
Cached Embedder
rag.py:74-80
Cached Index & Metadata
rag.py:98-117
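The load-once pattern behind the cached embedder and index can be sketched with a module-level dictionary. The names `_CACHE`, `get_or_load`, and `fake_load_embedder` are hypothetical; rag.py uses its own cache variables (rag.py:66-70), and the stand-in loader below replaces the real model load.

```python
from typing import Any, Callable, Dict, List

# Module-level cache: lives for the lifetime of the process.
_CACHE: Dict[str, Any] = {}


def get_or_load(key: str, loader: Callable[[], Any]) -> Any:
    """Return the cached object for key, calling loader only on first use."""
    if key not in _CACHE:
        _CACHE[key] = loader()
    return _CACHE[key]


# Stand-in for an expensive load (embedding model, FAISS index, metadata file).
load_calls: List[str] = []


def fake_load_embedder() -> object:
    load_calls.append("embedder")
    return object()
```

Calling `get_or_load("embedder", fake_load_embedder)` twice returns the same object and runs the loader once. This sketch is not thread-safe; a lock around the check would be needed if several threads can trigger the first load.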
3. FAISS Retrieval
Strict Topic Filtering
rag.py:167-193
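Strict topic filtering after an over-fetched search can be sketched as below. Here `ranked_ids` stands in for one row of labels returned by a FAISS `index.search` call made with a larger k (the Performance Optimizations section mentions k*8), and the `"topic"` metadata key is an assumption of this sketch.

```python
from typing import Dict, List, Sequence


def filter_hits_by_topic(
    ranked_ids: Sequence[int],
    metas: Sequence[Dict[str, str]],
    topic: str,
    k: int,
) -> List[int]:
    """Keep the first k hits whose metadata topic matches, preserving rank order."""
    kept: List[int] = []
    for idx in ranked_ids:
        if idx < 0:
            # FAISS pads the result with -1 when fewer neighbours exist.
            continue
        if metas[idx].get("topic") == topic:
            kept.append(idx)
            if len(kept) == k:
                break
    return kept
```

Over-fetching matters because the filter runs after the search: with a plain top-k query, chunks from other topics could crowd out every on-topic result.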
Similar Q&A Retrieval
Retrieve few-shot examples for question generation: rag.py:309-376
4. Context Building
The system builds rich context from retrieved chunks before sending to the LLM.
rag.py:617-627
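A minimal sketch of context assembly, assuming each chunk is a dict with `id` and `text` keys and that the context is capped by a character budget. The `build_context` name, the chunk shape, and `max_chars` are hypothetical; the actual logic is at rag.py:617-627.

```python
from typing import Dict, List, Set


def build_context(chunks: List[Dict], max_chars: int = 4000) -> str:
    """Join unique chunks into a numbered context string within a size budget."""
    seen: Set = set()
    parts: List[str] = []
    total = 0
    for chunk in chunks:
        if chunk["id"] in seen:
            # Skip chunks already added, so the prompt carries no duplicates.
            continue
        seen.add(chunk["id"])
        text = chunk["text"].strip()
        if total + len(text) > max_chars:
            break
        parts.append(f"[{len(parts) + 1}] {text}")
        total += len(text)
    return "\n\n".join(parts)
```

Numbering the chunks gives the model stable handles to refer to, and the budget keeps the prompt within the model's context window.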
5. Mistral API Integration
Technical Explanation Generation
rag.py:197-255
Interview Question Generation
rag.py:280-305
Expected Answer Generation (Adaptive Interviews)
rag.py:642-750
Pipeline Flow
Technical Interview Query
Main Entry Point: rag.py:499-638
Configuration
Paths
Mistral Client
rag.py:16-20
Performance Optimizations
- Global Caching: Models and indexes loaded once per process
- Normalized Embeddings: Enables faster cosine similarity via inner product
- Over-fetching with Filtering: Search k*8, filter down to k for topic relevance
- Deduplication: Track seen IDs to prevent duplicate chunks in context
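The normalized-embeddings point rests on a simple identity: for unit-length vectors, the inner product equals cosine similarity, so an inner-product index can serve cosine queries without a per-query division. The pure-Python sketch below checks that identity; the helper names are illustrative and not taken from rag.py.

```python
import math
from typing import List, Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def l2_normalize(v: Sequence[float]) -> List[float]:
    """Scale v to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v] if norm else list(v)


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


a, b = [3.0, 4.0], [4.0, 3.0]
# Inner product of the normalized vectors matches cosine of the originals.
same = math.isclose(dot(l2_normalize(a), l2_normalize(b)), cosine(a, b))
```

Normalizing once at indexing time moves the cost of the norm computation out of the query path.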
Key Functions Summary
| Function | Purpose | Location |
|---|---|---|
| technical_interview_query() | Main chatbot entry point | rag.py:499 |
| detect_topic_via_rag() | RAG voting for topic detection | rag.py:446 |
| get_embedder() | Get cached embedder | rag.py:74 |
| load_index_and_metas() | Load cached FAISS index | rag.py:98 |
| retrieve_similar_qas() | Get few-shot examples | rag.py:309 |
| retrieve_relevant_chunks() | Get context chunks | rag.py:379 |
| agentic_expected_answer() | Generate expected answers | rag.py:642 |
| generate_technical_explanation() | Generate educational responses | rag.py:197 |
| generate_interview_question() | Generate interview questions | rag.py:280 |