The RAG system enables the bot to provide accurate, context-aware responses by retrieving relevant information from your indexed documents. It combines vector search with OpenAI’s language models to deliver precise answers based on your knowledge base.
When a user sends a message, it’s converted into a 1536-dimension vector using OpenAI’s text-embedding-3-small model. The system checks the embedding cache first to improve performance.
src/Services/RAGService.php
```php
private function getCachedOrCreateEmbedding($userMessage)
{
    // Cache key: case-insensitive, whitespace-trimmed query text.
    $normalized = trim(mb_strtolower($userMessage));
    $queryHash = md5($normalized);

    // Check the cache first (24-hour TTL).
    $cached = $this->db->fetchOne(
        'SELECT embedding FROM query_embedding_cache
         WHERE query_hash = :hash
           AND created_at > DATE_SUB(NOW(), INTERVAL 24 HOUR)',
        [':hash' => $queryHash]
    );

    if ($cached && !empty($cached['embedding'])) {
        // Record the hit so usage stats stay accurate.
        $this->db->query(
            'UPDATE query_embedding_cache
             SET hit_count = hit_count + 1, last_used_at = NOW()
             WHERE query_hash = :hash',
            [':hash' => $queryHash]
        );
        return VectorMath::unserializeVector($cached['embedding']);
    }

    // Cache miss: create a new embedding via the OpenAI API.
    $embedding = $this->openai->createEmbedding($userMessage);

    // Store it for subsequent queries.
    $this->db->query(
        'INSERT INTO query_embedding_cache
             (query_hash, embedding, created_at, last_used_at, hit_count)
         VALUES (:hash, :embedding, NOW(), NOW(), 0)',
        [':hash' => $queryHash, ':embedding' => VectorMath::serializeVector($embedding)]
    );

    return $embedding;
}
```
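The snippet above relies on `VectorMath::serializeVector()` and `VectorMath::unserializeVector()`, which are not shown. A minimal sketch of what these helpers might look like, assuming embeddings are stored as packed 32-bit floats (the actual storage format in the codebase may differ):

```php
<?php

final class VectorMath
{
    // Pack a float vector into a compact binary string for DB storage.
    public static function serializeVector(array $vector): string
    {
        return pack('f*', ...$vector);
    }

    // Restore the float vector from its binary representation.
    public static function unserializeVector(string $blob): array
    {
        // unpack() returns a 1-indexed map; re-index from 0.
        return array_values(unpack('f*', $blob));
    }
}
```

Packing to 32-bit floats keeps a 1536-dimension embedding at roughly 6 KB per row; JSON-encoding the same vector would be several times larger.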
2. Similarity Search
The system searches for the most similar document chunks using cosine similarity, keeping the top K results (default: 3) that exceed the confidence threshold (default: 0.7). Note that these are the RAG-level defaults passed in by the caller; the search method's own signature defaults to a top K of 5 with no threshold, as shown below.
src/Services/VectorSearchService.php
```php
public function searchSimilar(array $queryEmbedding, $topK = 5, $threshold = 0.0, $maxCandidates = 200)
{
    // Cast to int to keep the interpolated LIMIT safe.
    $maxCandidates = (int) $maxCandidates;

    // Fetch a random sample of candidate vectors from active documents.
    $vectors = $this->db->fetchAll(
        "SELECT v.id, v.document_id, v.chunk_text, v.chunk_index, v.embedding,
                d.filename, d.original_name
         FROM vectors v
         INNER JOIN documents d ON v.document_id = d.id
         WHERE d.is_active = 1
         ORDER BY RAND()
         LIMIT {$maxCandidates}"
    );

    $results = [];
    foreach ($vectors as $vector) {
        $storedEmbedding = VectorMath::unserializeVector($vector['embedding']);

        // Score each candidate against the query by cosine similarity.
        $score = VectorMath::cosineSimilarity($queryEmbedding, $storedEmbedding);

        if ($score >= $threshold) {
            $results[] = [
                'id' => $vector['id'],
                'document_id' => $vector['document_id'],
                'chunk_text' => $vector['chunk_text'],
                'score' => $score,
                'original_name' => $vector['original_name'],
            ];
        }
    }

    // Sort by descending score and return the top K.
    usort($results, function ($a, $b) {
        return $b['score'] <=> $a['score'];
    });

    return array_slice($results, 0, $topK);
}
```
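The scoring step depends on `VectorMath::cosineSimilarity()`, which is not shown above. A straightforward sketch, assuming both vectors have the same dimensionality (1536 in this system):

```php
<?php

final class VectorMath
{
    // Cosine similarity: dot(a, b) / (|a| * |b|), in [-1, 1].
    public static function cosineSimilarity(array $a, array $b): float
    {
        $dot = 0.0;
        $normA = 0.0;
        $normB = 0.0;

        foreach ($a as $i => $value) {
            $dot   += $value * $b[$i];
            $normA += $value * $value;
            $normB += $b[$i] * $b[$i];
        }

        // Guard against a zero vector, which has no defined direction.
        if ($normA == 0.0 || $normB == 0.0) {
            return 0.0;
        }

        return $dot / (sqrt($normA) * sqrt($normB));
    }
}
```

Since OpenAI embeddings are normalized to unit length, the cosine score here is effectively a plain dot product; the norm terms are kept so the function also behaves correctly for unnormalized vectors.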
3. Context Assembly
Retrieved chunks are combined into a context string, along with their source documents and confidence scores.
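A minimal sketch of this assembly step, operating on the result arrays returned by `searchSimilar()` above. `buildContext()` is a hypothetical helper for illustration, not a function from the codebase, and the exact formatting of sources and scores may differ:

```php
<?php

// Hypothetical helper: join retrieved chunks into one context block,
// labeling each with its source document and confidence score.
function buildContext(array $results): string
{
    $parts = [];
    foreach ($results as $i => $result) {
        $parts[] = sprintf(
            "[Source %d: %s (confidence: %.2f)]\n%s",
            $i + 1,
            $result['original_name'],
            $result['score'],
            $result['chunk_text']
        );
    }

    // Separate chunks so the model can tell them apart.
    return implode("\n\n---\n\n", $parts);
}
```

The resulting string is what gets handed to the language model alongside the user's question, so each chunk is labeled with its source to support attribution in the answer.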