RAG System Architecture
Local GPT uses a Retrieval-Augmented Generation (RAG) system to enhance AI responses with context from your Obsidian vault. This page explains the technical implementation details.
Overview
The RAG system processes linked documents, extracts relevant content, and provides context to the AI model. This is implemented primarily in src/rag.ts with support for both Markdown and PDF files.
Enhanced Actions automatically extract context from documents linked in your selection to provide richer AI responses.
Document Processing Pipeline
1. Link Discovery
The system identifies linked files through the getLinkedFiles() function:
- Markdown files (.md)
- PDF documents (.pdf)
The system sanitizes content by removing code blocks, HTML comments, and inline code before extracting links.
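The sanitization-then-extraction step can be sketched as follows. This is an illustrative sketch, not the plugin's actual implementation: the function name and regexes are assumptions, and the real getLinkedFiles() resolves targets through Obsidian's MetadataCache rather than returning raw strings.

```typescript
// Hypothetical sketch of link discovery: strip fenced code blocks, HTML
// comments, and inline code first, then collect [[wiki-links]] and
// [markdown](links) that point at .md/.pdf files.
function extractLinkTargets(content: string): string[] {
  // Sanitize so links mentioned inside code samples are not treated as real links.
  const sanitized = content
    .replace(/```[\s\S]*?```/g, "")
    .replace(/<!--[\s\S]*?-->/g, "")
    .replace(/`[^`]*`/g, "");

  const targets: string[] = [];
  // [[Note]] or [[Note|alias]] or [[Note#heading]]
  for (const m of sanitized.matchAll(/\[\[([^\]|#]+)(?:[|#][^\]]*)?\]\]/g)) {
    targets.push(m[1].trim());
  }
  // [label](path.md) / [label](path.pdf)
  for (const m of sanitized.matchAll(/\[[^\]]*\]\(([^)]+\.(?:md|pdf))\)/g)) {
    targets.push(m[1].trim());
  }
  return targets;
}
```

In the real plugin the extracted targets would still need to be resolved to TFile objects against the vault before processing.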
2. Graph Traversal
The RAG system traverses your vault’s knowledge graph with a depth limit:
Start from active file
The system begins with the currently active file and extracts all linked documents.
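The traversal can be sketched as a depth-limited walk with cycle protection. This is a minimal illustration: the `links` map stands in for Obsidian's resolvedLinks graph, and the function name and depth semantics are assumptions rather than the plugin's actual API.

```typescript
// Illustrative depth-limited graph traversal with a visited set to
// guard against cycles (notes that link back to each other).
function collectLinkedFiles(
  start: string,
  links: Map<string, string[]>, // stand-in for Obsidian's resolvedLinks
  maxDepth: number,
): Set<string> {
  const visited = new Set<string>([start]);
  const walk = (file: string, depth: number): void => {
    if (depth >= maxDepth) return; // depth limit stops runaway recursion
    for (const target of links.get(file) ?? []) {
      if (!visited.has(target)) {
        visited.add(target);
        walk(target, depth + 1);
      }
    }
  };
  walk(start, 0);
  return visited;
}
```

The visited set is what makes traversal safe in a densely interlinked vault; without it, a two-note cycle would recurse until the depth limit on every path.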
3. Content Extraction
Markdown Files
Markdown content is read using Obsidian’s cached read API for optimal performance.
PDF Files
PDF processing uses pdfjs-dist with intelligent caching. The processor (src/processors/pdf.ts):
- Uses a Web Worker for background processing
- Extracts text with line break detection
- Preserves page structure with double newlines between pages
- Caches extracted text in IndexedDB for performance
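The line-break detection step can be illustrated as a pure function over pdf.js-style text items. This is a sketch under assumptions: items are shaped like the output of pdfjs-dist's getTextContent() (a `str` plus a transform matrix whose sixth entry is the y position), and the jump threshold is illustrative, not the plugin's actual heuristic.

```typescript
// Sketch: detect line breaks by watching the y coordinate of each text item.
interface TextItem {
  str: string;
  transform: number[]; // [a, b, c, d, x, y] as in pdfjs-dist
}

function itemsToText(items: TextItem[]): string {
  let text = "";
  let lastY: number | null = null;
  for (const item of items) {
    const y = item.transform[5];
    if (lastY !== null && Math.abs(y - lastY) > 1) {
      text += "\n"; // y moved beyond the threshold: start a new line
    } else if (text.length > 0) {
      text += " "; // same line: join with a space (illustrative choice)
    }
    text += item.str;
    lastY = y;
  }
  return text;
}
```

Per the list above, pages would then be joined with double newlines to preserve page structure.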
4. Document Structure
Each processed document is stored with metadata.
Vector Search & Embedding
Embedding Process
The system uses your configured embedding provider to create vector representations.
Chunking Strategy
Documents are automatically split into chunks by the AI Providers SDK:
- Chunks are sized based on the embedding model’s context window
- Progress tracking reports processedChunks and totalChunks
- Each chunk is embedded independently for granular matching
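The behavior above can be sketched with a simple character-budget chunker. This is only an illustration: the real splitting is done by the AI Providers SDK, and the function name, chunk sizing, and callback shape here are assumptions.

```typescript
// Illustrative chunker that reports progress the way the plugin's
// processedChunks / totalChunks counters do.
function chunkDocument(
  text: string,
  chunkSize: number,
  onProgress: (processedChunks: number, totalChunks: number) => void,
): string[] {
  const totalChunks = Math.max(1, Math.ceil(text.length / chunkSize));
  const chunks: string[] = [];
  for (let i = 0; i < totalChunks; i++) {
    chunks.push(text.slice(i * chunkSize, (i + 1) * chunkSize));
    onProgress(i + 1, totalChunks); // each chunk advances the progress bar
  }
  return chunks;
}
```

In practice each returned chunk would then be embedded independently, which is what allows matching at sub-document granularity.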
Similarity Search
The searchDocuments() function:
- Embeds your query (selected text)
- Embeds all document chunks
- Computes similarity scores (cosine similarity)
- Returns ranked results
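The scoring and ranking steps can be sketched as follows. The cosine similarity formula is standard; the function names and the chunk shape are illustrative, not the plugin's actual signatures.

```typescript
// Cosine similarity: dot product of the vectors divided by the
// product of their magnitudes. Ranges from -1 to 1 for real vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every embedded chunk against the query embedding and sort
// by similarity, highest first.
function rankChunks(
  query: number[],
  chunks: { text: string; embedding: number[] }[],
): { text: string; score: number }[] {
  return chunks
    .map((c) => ({ text: c.text, score: cosineSimilarity(query, c.embedding) }))
    .sort((x, y) => y.score - x.score);
}
```

Because cosine similarity normalizes by magnitude, it compares the direction of the embeddings rather than their length, which is why it works across chunks of different sizes.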
Context Formatting
Result Ranking
Results are organized intelligently:
Group by File
Results from the same file are grouped together under [[filename]].
Sort by Creation Time
File groups are sorted by creation time (newest first).
Rank by Relevance
Within each group, chunks are sorted by similarity score.
Respect Limits
Context is truncated at the configured limit (see Context Limits).
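The four rules above can be sketched together. This is an illustrative sketch under assumptions: the `Hit` shape, function name, and character-level truncation are simplifications, not the plugin's actual formatting code.

```typescript
// Sketch of the formatting rules: group hits by file, order file groups by
// creation time (newest first), order chunks within a group by score, then
// cut the combined context at the configured limit.
interface Hit { file: string; ctime: number; score: number; text: string }

function formatContext(hits: Hit[], contextLimit: number): string {
  const groups = new Map<string, Hit[]>();
  for (const hit of hits) {
    const group = groups.get(hit.file) ?? [];
    group.push(hit);
    groups.set(hit.file, group);
  }
  const ordered = [...groups.entries()].sort(
    (a, b) => b[1][0].ctime - a[1][0].ctime, // newest file group first
  );
  let out = "";
  for (const [file, group] of ordered) {
    out += `[[${file}]]\n`; // each group headed by its [[filename]]
    for (const hit of group.sort((a, b) => b.score - a.score)) {
      out += hit.text + "\n";
    }
  }
  return out.slice(0, contextLimit); // respect the configured context limit
}
```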
Output Format
The formatted context follows the grouping structure described above. The total character count is logged in development mode as "Total length of context".
Performance Optimizations
Caching Strategies
PDF Content Cache (src/indexedDB.ts):
- Stores extracted text by file path
- Includes modification time (mtime) for invalidation
- Prevents re-extraction on every request
Link Resolution:
- Uses Obsidian’s MetadataCache for link resolution
- Leverages resolvedLinks for graph traversal
- Avoids manual parsing of markdown files
Parallel Processing
Progress Tracking
The system provides real-time progress updates.
Technical Reference
Key Functions
getLinkedFiles(content, vault, metadataCache, currentFilePath)
Extracts all wiki-links and markdown links from content.
Parameters:
- content: Text to parse for links
- vault: Obsidian vault instance
- metadataCache: Metadata cache for link resolution
- currentFilePath: Current file path for relative resolution
Returns: TFile objects (md and pdf only)
processDocumentForRAG(file, context, processedDocs, depth, isBacklink)
Recursively processes a document and its linked files.
Parameters:
- file: File to process
- context: Processing context (vault, metadataCache, activeFile)
- processedDocs: Map to store processed documents
- depth: Current traversal depth
- isBacklink: Whether this is a backlink reference
Returns: processedDocs map
searchDocuments(query, documents, aiProviders, embeddingProvider, abortController, ...)
Performs vector similarity search across documents.
Parameters:
- query: Search query (usually selected text)
- documents: Array of processed documents
- aiProviders: AI providers service instance
- embeddingProvider: Configured embedding provider
- abortController: For cancellation support
- updateCompletedSteps: Progress callback
- addTotalProgressSteps: Dynamic total adjustment callback
- contextLimit: Maximum context length in characters
Source Code References
Core RAG Logic
src/rag.ts - Main RAG implementation
PDF Processing
src/processors/pdf.ts - PDF text extraction
File Caching
src/indexedDB.ts - IndexedDB cache layer
Integration
src/main.ts:688-747 - enhanceWithContext() method
Next Steps
Context Limits
Learn how to configure context limits for different models
Troubleshooting
Debug common RAG and embedding issues