
RAG System Architecture

Local GPT uses a sophisticated Retrieval-Augmented Generation (RAG) system to enhance AI responses with context from your Obsidian vault. This page explains the technical implementation details.

Overview

The RAG system processes linked documents, extracts relevant content, and provides context to the AI model. This is implemented primarily in src/rag.ts with support for both Markdown and PDF files.
Enhanced Actions automatically extract context from documents linked in your selection to provide richer AI responses.

Document Processing Pipeline

1. Link Detection

The system identifies linked files through the getLinkedFiles() function:
// Supports both wiki-links and markdown links
const WIKI_LINK_REGEX = /\[\[([^\]|#]+)(?:#[^\]|]+)?(?:\|[^\]]+)?\]\]/g;
const MARKDOWN_LINK_REGEX = /\[[^\]]+\]\(([^)]+)\)/g;
Supported file types:
  • Markdown files (.md)
  • PDF documents (.pdf)
The system sanitizes content by removing code blocks, HTML comments, and inline code before extracting links.
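The extraction step can be sketched in isolation. The regexes below are the ones shown above; the helper names (sanitize, extractLinkTargets) and the exact sanitization patterns are illustrative, not the plugin's implementation:

```typescript
// Matches [[Target]], [[Target#Heading]], and [[Target|Alias]]
const WIKI_LINK_REGEX = /\[\[([^\]|#]+)(?:#[^\]|]+)?(?:\|[^\]]+)?\]\]/g;
// Matches [label](target)
const MARKDOWN_LINK_REGEX = /\[[^\]]+\]\(([^)]+)\)/g;

// Strip regions that may contain link-like text that should be ignored
function sanitize(content: string): string {
  return content
    .replace(/```[\s\S]*?```/g, "") // fenced code blocks
    .replace(/<!--[\s\S]*?-->/g, "") // HTML comments
    .replace(/`[^`]*`/g, ""); // inline code
}

function extractLinkTargets(content: string): string[] {
  const text = sanitize(content);
  const targets: string[] = [];
  for (const match of text.matchAll(WIKI_LINK_REGEX)) {
    targets.push(match[1].trim());
  }
  for (const match of text.matchAll(MARKDOWN_LINK_REGEX)) {
    targets.push(match[1].trim());
  }
  return targets;
}
```

In the real plugin, each extracted target is then resolved to a TFile via Obsidian's metadata cache rather than used as a raw string.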

2. Graph Traversal

The RAG system traverses your vault’s knowledge graph with a depth limit:
const MAX_DEPTH = 10;
Processing flow:
  1. Start from active file: the system begins with the currently active file and extracts all linked documents.
  2. Process forward links: recursively processes documents linked FROM each file, up to MAX_DEPTH levels.
  3. Process backlinks: includes documents that link TO each file (reverse references).
  4. Prevent duplicates: each file is processed only once, even if linked multiple times.
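The traversal above can be sketched as follows. The graph shape and function name are illustrative; the real implementation walks Obsidian's resolvedLinks and TFile objects rather than a plain Map:

```typescript
const MAX_DEPTH = 10;

// Simplified link graph: file path -> paths it links to (or is linked from)
type LinkGraph = Map<string, string[]>;

function collectLinkedDocs(
  start: string,
  forward: LinkGraph,
  backward: LinkGraph,
  visited = new Set<string>(),
  depth = 0,
): Set<string> {
  // Depth limit and duplicate prevention in one guard
  if (depth > MAX_DEPTH || visited.has(start)) return visited;
  visited.add(start);

  // Forward links: documents linked FROM this file
  for (const next of forward.get(start) ?? []) {
    collectLinkedDocs(next, forward, backward, visited, depth + 1);
  }
  // Backlinks: documents that link TO this file
  for (const prev of backward.get(start) ?? []) {
    collectLinkedDocs(prev, forward, backward, visited, depth + 1);
  }
  return visited;
}
```

The shared visited set is what guarantees each file is processed once even when the graph contains cycles.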

3. Content Extraction

Markdown Files

Markdown content is read using Obsidian’s cached read API for optimal performance:
return vault.cachedRead(file);

PDF Files

PDF processing uses pdfjs-dist with intelligent caching:
// Check cache first
const cachedContent = await fileCache.getContent(file.path);
if (cachedContent?.mtime === file.stat.mtime) {
  return cachedContent.content;
}

// Extract if not cached or outdated
const arrayBuffer = await vault.readBinary(file);
const pdfContent = await extractTextFromPDF(arrayBuffer);
PDF extraction maintains layout by detecting line breaks based on text position (transform[5] values). Large PDFs may take longer to process on first use.
PDF processing details (from src/processors/pdf.ts):
  • Uses a Web Worker for background processing
  • Extracts text with line break detection
  • Preserves page structure with double newlines between pages
  • Caches extracted text in IndexedDB for performance
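The line-break heuristic can be shown on its own. In pdfjs-dist, each text item carries a transform matrix whose sixth entry (transform[5]) is the item's vertical position on the page; when that value changes between consecutive items, a newline is emitted. The item shape below is a simplification of the pdfjs TextItem:

```typescript
interface TextItem {
  str: string;
  transform: number[]; // transform[5] is the y position on the page
}

function joinWithLineBreaks(items: TextItem[]): string {
  let out = "";
  let lastY: number | null = null;
  for (const item of items) {
    const y = item.transform[5];
    // A change in vertical position means a new line on the page
    if (lastY !== null && y !== lastY) out += "\n";
    out += item.str;
    lastY = y;
  }
  return out;
}
```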

4. Document Structure

Each processed document is stored with metadata:
interface IAIDocument {
  content: string;
  meta: {
    source: string;      // File path
    basename: string;    // File name without extension
    stat: FileStat;      // Creation/modification times
    depth: number;       // Graph traversal depth
    isBacklink: boolean; // Whether this is a reverse reference
  };
}

Vector Search & Embedding

Embedding Process

The system uses your configured embedding provider to create vector representations:
const results = await aiProviders.retrieve({
  query,
  documents,
  embeddingProvider,
  onProgress: (progress) => {
    // Updates progress bar with chunk processing status
  },
  abortController,
});
Configure your embedding provider in Settings → AI Providers → Embedding AI Provider. Popular choices include nomic-embed-text for Ollama or OpenAI’s text-embedding-3-small.

Chunking Strategy

Documents are automatically split into chunks by the AI Providers SDK:
  • Chunks are sized based on the embedding model’s context window
  • Progress tracking reports processedChunks and totalChunks
  • Each chunk is embedded independently for granular matching
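Chunk sizing lives inside the AI Providers SDK, so the function below is purely for intuition: a naive paragraph-packing chunker under a fixed character budget, not the SDK's actual algorithm:

```typescript
// Split on blank lines, then pack paragraphs into chunks under the budget.
// A single paragraph longer than the budget becomes its own chunk.
function chunkByBudget(text: string, budget: number): string[] {
  const chunks: string[] = [];
  let current = "";
  for (const para of text.split(/\n\n+/)) {
    if (current && current.length + para.length + 2 > budget) {
      chunks.push(current);
      current = para;
    } else {
      current = current ? current + "\n\n" + para : para;
    }
  }
  if (current) chunks.push(current);
  return chunks;
}
```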
The searchDocuments() function:
  1. Embeds your query (selected text)
  2. Embeds all document chunks
  3. Computes similarity scores (cosine similarity)
  4. Returns ranked results
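The scoring step (cosine similarity) is just a ratio of the dot product to the vector magnitudes; embedding and chunking are handled by the AI Providers SDK, so this standalone function covers only the math:

```typescript
// Cosine similarity: 1 for identical directions, 0 for orthogonal vectors
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```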

Context Formatting

Result Ranking

Results are organized in four passes:
  1. Group by file: results from the same file are grouped together under [[filename]].
  2. Sort by creation time: file groups are sorted by creation time (newest first).
  3. Rank by relevance: within each group, chunks are sorted by similarity score.
  4. Respect limits: context is truncated at the configured limit (see Context Limits).
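A sketch of this ranking pass, assuming a simplified chunk shape (the real code works on the SDK's retrieval results):

```typescript
interface ScoredChunk {
  file: string; // basename of the source file
  ctime: number; // file creation time
  score: number; // similarity score
  text: string;
}

function formatContext(chunks: ScoredChunk[], contextLimit: number): string {
  // 1. Group chunks by file
  const groups = new Map<string, ScoredChunk[]>();
  for (const c of chunks) {
    if (!groups.has(c.file)) groups.set(c.file, []);
    groups.get(c.file)!.push(c);
  }
  // 2. Sort file groups by creation time, newest first
  const sorted = [...groups.values()].sort((a, b) => b[0].ctime - a[0].ctime);
  let out = "";
  for (const group of sorted) {
    // 3. Within each group, rank chunks by similarity score
    group.sort((a, b) => b.score - a.score);
    const block =
      `[[${group[0].file}]]\n` + group.map((c) => c.text).join("\n\n") + "\n\n";
    // 4. Stop once the configured context limit would be exceeded
    if (out.length + block.length > contextLimit) break;
    out += block;
  }
  return out.trimEnd();
}
```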

Output Format

The formatted context follows this structure:
[[Document 1]]
Relevant chunk from Document 1 with highest score...

Another relevant chunk from Document 1...

[[Document 2]]
Relevant chunk from Document 2...
In development mode, the total character count of the assembled context is logged with the message "Total length of context".

Performance Optimizations

Caching Strategies

PDF Content Cache (src/indexedDB.ts):
  • Stores extracted text by file path
  • Includes modification time (mtime) for invalidation
  • Prevents re-extraction on every request
Metadata Cache:
  • Uses Obsidian’s MetadataCache for link resolution
  • Leverages resolvedLinks for graph traversal
  • Avoids manual parsing of markdown files

Parallel Processing

await Promise.all(
  linkedFiles.map(async (file) => {
    await processDocumentForRAG(file, context, processedDocs, 0, false);
    updateCompletedSteps?.(1);
  }),
);
All linked files at the same depth level are processed concurrently for maximum speed.

Progress Tracking

The system provides real-time progress updates:
  1. Initialize: totalProgressSteps starts at the number of linked files.
  2. Dynamic updates: as chunking begins, addTotalProgressSteps() increases the total.
  3. Completion tracking: updateCompletedSteps() increments as each chunk is processed.
  4. Status bar: displays the percentage in the Obsidian status bar: "✨ Enhancing 45%"
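The status-bar label follows directly from those two counters. The names mirror the callbacks above, but the function itself is illustrative:

```typescript
function progressLabel(completedSteps: number, totalProgressSteps: number): string {
  // Guard against division by zero before any steps are registered
  const percent = totalProgressSteps > 0
    ? Math.round((completedSteps / totalProgressSteps) * 100)
    : 0;
  return `✨ Enhancing ${percent}%`;
}
```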

Technical Reference

Key Functions

getLinkedFiles()

Extracts all wiki-links and markdown links from content.

Parameters:
  • content: Text to parse for links
  • vault: Obsidian vault instance
  • metadataCache: Metadata cache for link resolution
  • currentFilePath: Current file path for relative resolution
Returns: Array of TFile objects (.md and .pdf only)

searchDocuments()

Performs vector similarity search across documents.

Parameters:
  • query: Search query (usually selected text)
  • documents: Array of processed documents
  • aiProviders: AI providers service instance
  • embeddingProvider: Configured embedding provider
  • abortController: For cancellation support
  • updateCompletedSteps: Progress callback
  • addTotalProgressSteps: Dynamic total adjustment callback
  • contextLimit: Maximum context length in characters
Returns: Formatted context string

Source Code References

Core RAG Logic

src/rag.ts - Main RAG implementation

PDF Processing

src/processors/pdf.ts - PDF text extraction

File Caching

src/indexedDB.ts - IndexedDB cache layer

Integration

src/main.ts:688-747 - enhanceWithContext() method

Next Steps

Context Limits

Learn how to configure context limits for different models

Troubleshooting

Debug common RAG and embedding issues
