Overview
RAG (Retrieval-Augmented Generation) Search provides AI-powered document search that runs completely offline in Vault workspaces. Unlike cloud-based AI features, RAG Search processes your documents locally, ensuring privacy and security while delivering intelligent search results with AI-generated overviews.

RAG Search is exclusive to Vault workspaces and was introduced in v0.9.5. It combines local AI with vector embeddings for private, offline document intelligence.
What Is RAG?
Retrieval-Augmented Generation (RAG) combines two AI capabilities:
- Retrieval - Finding relevant documents using semantic search
- Generation - Creating AI summaries based on retrieved content
How It Works
Document Indexing
Your documents are processed locally and converted into vector embeddings (numerical representations of meaning).
Semantic Search
When you search, the query is converted to a vector and compared against document vectors to find semantically similar content.
All processing happens on your device. No data is sent to the cloud, ensuring complete privacy.
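The retrieval step above can be sketched with a toy example: queries and documents become vectors, and relevance is measured by the angle between them (cosine similarity). The three-dimensional vectors below are illustrative stand-ins only; real embedding models emit hundreds of dimensions.

```python
import math

def cosine_similarity(a, b):
    # Cosine of the angle between two embedding vectors: 1.0 means same direction.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 3-dimensional "embeddings" (made up for illustration).
query = [0.9, 0.1, 0.3]        # e.g. "AI training"
doc_ml = [0.8, 0.2, 0.4]       # a document about machine learning
doc_recipes = [0.1, 0.9, 0.2]  # an unrelated document

print(cosine_similarity(query, doc_ml))       # high score: semantically close
print(cosine_similarity(query, doc_recipes))  # low score: different topic
```

Because similarity compares meaning-bearing vectors rather than literal words, a search for "AI training" can match a document about "machine learning" even with zero keyword overlap.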
Setting Up RAG Search
Prerequisites
Create a Vault Workspace
RAG Search only works in Vault workspaces. Create a new Vault workspace from Settings → Workspaces.
Install Ollama
Download and install Ollama for local AI processing.
Supported File Types
RAG Search can process:
- PDF files - Extracts text and indexes content
- Markdown files (.md) - Indexes structured content
- Text files (.txt) - Indexes plain text
- AppFlowy pages - Native document support
Using RAG Search
Ask a Question
Type a natural language question or search query:
- “What are the key findings in the research papers?”
- “Summarize the meeting notes from this week”
- “Find information about deployment procedures”
Review AI Overview
RAG Search returns:
- AI-generated summary - Comprehensive answer to your question
- Key highlights - Important points extracted from documents
- Source citations - Links to specific files and pages
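Conceptually, the AI overview is produced by handing the retrieved chunks to the local generation model together with your question. The sketch below shows that prompt-assembly step; the function name and prompt format are illustrative assumptions, not AppFlowy's internal implementation.

```python
def build_rag_prompt(question, retrieved):
    # `retrieved` holds (source_file, chunk_text) pairs from the vector search;
    # keeping the file name beside each chunk is what enables source citations.
    context = "\n\n".join(f"[{source}] {text}" for source, text in retrieved)
    return (
        "Answer the question using only the context below, "
        "and cite sources in [brackets].\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )

prompt = build_rag_prompt(
    "What budget was allocated for marketing in Q1?",
    [("finance/q1-budget.pdf", "Marketing was allocated $150k for Q1."),
     ("notes/planning.md", "Q1 budget review confirmed the marketing allocation.")],
)
print(prompt)
```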
RAG Search Features
Semantic Search
RAG Search understands meaning, not just keywords:
- Conceptual matching - Finds documents about “machine learning” when you search for “AI training”
- Synonym recognition - Matches “purchase” with “buy”, “acquire”, “procurement”
- Context awareness - Understands multi-word concepts and relationships
Chat with Files
Ask questions about uploaded documents.
Multi-Document Synthesis
Combine information from multiple files:
- Cross-document analysis - Compare findings across multiple PDFs
- Timeline building - Order events from different markdown notes
- Theme identification - Find common themes across text files
- Comprehensive answers - Synthesize information from multiple sources
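Cross-document synthesis works because retrieval ranks chunks from all indexed files in a single pool and keeps the best overall matches, regardless of which file they came from. A small sketch, using made-up similarity scores:

```python
import heapq

# Chunks from several files, each paired with a similarity score from vector
# search (the scores here are hypothetical, for illustration only).
scored_chunks = [
    ("report-a.pdf", "Latency dropped 30% after the rollout.", 0.91),
    ("report-b.pdf", "Latency fell 25% in the second trial.",  0.88),
    ("recipes.md",   "Preheat the oven to 200C.",              0.12),
    ("ops.txt",      "Deployment uses a blue/green strategy.", 0.47),
]

# Keep the top-k chunks overall, regardless of source file.
top_k = heapq.nlargest(2, scored_chunks, key=lambda chunk: chunk[2])
print([source for source, _, _ in top_k])  # ['report-a.pdf', 'report-b.pdf']
```

Because both top chunks come from different PDFs, the generation step can compare their findings side by side.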
PDF Search
Extract and search content from PDF documents with page-level citations
Markdown Search
Search through structured markdown notes with section references
Text Search
Index and search plain text files with line number citations
Workspace Search
Search all AppFlowy pages in your Vault workspace
Embedding Models
Choose the right embedding model for your needs.
Available Models
| Model | Size | Speed | Quality | Best For |
|---|---|---|---|---|
| nomic-embed-text | ~274MB | Fast | High | General purpose, multilingual |
| all-minilm | ~45MB | Very Fast | Good | Quick indexing, limited resources |
| mxbai-embed-large | ~669MB | Slower | Highest | Maximum quality, powerful hardware |
Switching Embedding Models
You can switch embedding models at any time. Embeddings produced by different models are not comparable, so documents must be re-indexed after a switch before search results reflect the new model.
RAG Search Best Practices
Ask Complete Questions
Use full sentences like “What were the Q1 revenue figures?” rather than just “revenue”
Be Specific
Include context: “What did the design review document say about mobile UX?”
Verify Sources
Always check source documents for critical information and exact details
Organize Files
Use clear file names and organize content in folders for better context
Optimizing Document Indexing
For better RAG Search results:
- Use descriptive file names that indicate content
- Structure markdown files with clear headings
- Keep related documents in the same folders
- Add metadata (dates, authors, topics) to file names
- Break large documents into logical sections
Query Optimization
Write queries that get better results.
✅ Good queries:
- “Summarize the key features discussed in the product spec”
- “What budget was allocated for marketing in Q1?”
- “Compare the two proposal documents and highlight differences”
❌ Vague queries:
- “features”
- “budget”
- “proposals”
Privacy and Security
What Stays Local
RAG Search in Vault workspaces is 100% private. Everything runs on your device:
- Document content - Never leaves your device
- Embeddings - Stored locally in your Vault database
- AI processing - Runs through local Ollama instance
- Search queries - Processed entirely offline
- Generated summaries - Created by your local AI model
Data Storage
RAG Search stores:
- Vector embeddings - In local SQLite database
- Document metadata - File names, locations, timestamps
- No external services - Zero cloud dependencies for search
Vault workspaces are designed for sensitive data. RAG Search maintains this privacy guarantee by keeping all AI operations local.
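As an illustration of the local-only storage model, embeddings can live in an ordinary SQLite table on disk. The schema below is a hypothetical sketch; AppFlowy's actual internal schema may differ.

```python
import json
import sqlite3

# Hypothetical local embedding store (in-memory here for demonstration).
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE embeddings (
        doc_id   TEXT,
        chunk_id INTEGER,
        vector   TEXT  -- JSON-encoded floats; production stores use binary blobs
    )
""")
conn.execute(
    "INSERT INTO embeddings VALUES (?, ?, ?)",
    ("notes.md", 0, json.dumps([0.12, -0.43, 0.88])),
)
row = conn.execute(
    "SELECT vector FROM embeddings WHERE doc_id = ?", ("notes.md",)
).fetchone()
print(json.loads(row[0]))  # [0.12, -0.43, 0.88]
```

The point of the sketch: everything needed for search lives in a single local database file, so no network call is ever required.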
Performance Considerations
Hardware Requirements
RAG Search performance depends on your hardware.
Minimum:
- 8GB RAM
- 2GB free disk space
- Multi-core processor
Recommended:
- 16GB+ RAM
- 10GB+ free disk space (for larger models)
- Apple Silicon M1/M2 or modern x86 processor with AVX2
Indexing Speed
Document indexing time varies by:
- File size - Larger PDFs take longer
- File count - More files = longer initial indexing
- Model size - Larger embedding models are slower
- Hardware - Faster CPU/GPU speeds up processing
Search Speed
Once indexed, RAG Search is fast:
- Query matching - Near-instant with vector search
- AI generation - 2-10 seconds depending on model and context
- No network latency - Offline operation is consistently fast
Troubleshooting
RAG Search Not Available
Problem: RAG Search option doesn’t appear in search.
Solutions:
- Verify you’re in a Vault workspace (not a regular workspace)
- Check that Ollama is installed and running: `ollama list`
- Ensure the embedding model is downloaded: `ollama pull nomic-embed-text`
- Restart AppFlowy after installing Ollama
Slow Indexing
Problem: Document indexing takes too long.
Solutions:
- Use a smaller, faster embedding model (all-minilm)
- Index fewer documents at once
- Close other applications to free up resources
- Check available disk space
- Consider upgrading hardware for better performance
Poor Search Results
Problem: RAG Search returns irrelevant results.
Solutions:
- Rewrite queries with more specific details
- Check that documents are properly indexed (Settings → AI)
- Try a different embedding model for better quality
- Verify uploaded files contain searchable text (PDFs may have image-only content)
- Re-index documents if search quality has degraded
AI Generation Errors
Problem: Search finds documents but AI fails to generate summary.
Solutions:
- Verify Ollama is running: `ollama serve`
- Check generation model is available: `ollama list`
- Download recommended model: `ollama pull llama2`
- Try a different generation model
- Check logs for specific error messages
RAG Search vs. Cloud AI Search
| Feature | RAG Search (Vault) | Cloud AI Search |
|---|---|---|
| Privacy | 100% local | Cloud-processed |
| Internet | Works offline | Requires connection |
| Speed | Fast (local) | Depends on network |
| Models | Ollama models | GPT, Claude, etc. |
| File Types | PDF, MD, TXT | AppFlowy pages only |
| Setup | Requires Ollama | No setup |
| Cost | Free (local resources) | May require subscription |
Advanced RAG Configuration
Embedding Model Parameters
Adjust embedding behavior in Settings → AI → Advanced:
- Chunk size - Size of document sections for indexing (default: 512 tokens)
- Chunk overlap - Overlap between chunks for context (default: 128 tokens)
- Top K results - Number of document chunks to retrieve (default: 5)
- Score threshold - Minimum similarity score for results (default: 0.5)
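The chunk-size and overlap settings above can be visualized with a small sketch. Integers stand in for tokens here; real tokenization depends on the embedding model.

```python
def chunk_tokens(tokens, chunk_size=512, overlap=128):
    # Slide a window of `chunk_size` tokens, stepping by chunk_size - overlap,
    # so adjacent chunks share `overlap` tokens of context.
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break
    return chunks

tokens = list(range(1000))  # stand-in for a 1000-token document
chunks = chunk_tokens(tokens)
print(len(chunks))          # 3
print(chunks[1][0])         # 384: second chunk starts at 512 - 128
```

The overlap means a sentence that straddles a chunk boundary still appears intact in at least one chunk, which is why raising overlap can improve answer quality at the cost of more chunks to index.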
Generation Model Selection
Choose the best generation model for your needs:
| Model | Size | Speed | Quality | Use Case |
|---|---|---|---|---|
| llama2 | 3.8GB | Medium | Good | General purpose |
| mistral | 4.1GB | Medium | High | Better reasoning |
| mixtral | 26GB | Slow | Highest | Maximum quality |
| codellama | 3.8GB | Medium | Good | Code-focused tasks |
Custom RAG Workflows
Combine RAG Search with other features:
- Research workflow - Upload PDFs → RAG Search → Save summaries to pages
- Meeting notes - Index markdown notes → Search across meetings → Generate weekly summaries
- Documentation - Index text files → Ask implementation questions → Generate guides
- Code analysis - Index code files → Search for patterns → Get explanations
Related Features
Vault Workspaces
Learn about private, offline Vault workspaces
AI Overviews
Cloud-based AI summaries for regular workspaces
AI Search
Natural language search in cloud workspaces
AI Chat
Interactive AI conversations with document sources