The Knowledge Base feature allows you to upload your own documents and have intelligent conversations about their content. Using RAG (Retrieval Augmented Generation), Page Assist creates searchable embeddings of your documents and retrieves relevant information to answer your questions.
Knowledge bases are stored entirely in your browser’s local storage. Large files may impact browser performance. Consider file size and quantity when building your knowledge base.

Supported File Types

Page Assist supports the following document formats:
Format           Extension   Use Case
PDF              .pdf        Research papers, books, reports, manuals
Word Documents   .docx       Essays, documentation, meeting notes
Plain Text       .txt        Code files, logs, simple notes
CSV              .csv        Data tables, spreadsheets, lists
Markdown         .md         Technical docs, notes, README files
File processing time depends on document size and complexity. PDFs with images or complex formatting may take longer to process.

Prerequisites

Before using the Knowledge Base feature, you need to configure an embedding model:
Step 1: Choose an Embedding Model

Embedding models convert text into vector representations for semantic search.

Recommended Models:
  • nomic-embed-text (Ollama) - Best all-around choice
  • mxbai-embed-large (Ollama) - High quality
  • all-minilm (Ollama) - Fast and lightweight
  • text-embedding-3-small (OpenAI) - Cloud-based option
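To see what these models make possible: each piece of text becomes a vector, and retrieval ranks stored chunks by how similar their vectors are to the query's. A minimal sketch with made-up 3-dimensional vectors (real models such as nomic-embed-text output 768 dimensions):

```python
import math

def cosine_similarity(a, b):
    # Similarity of two embedding vectors: near 1.0 = same meaning/direction,
    # near 0.0 = unrelated
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 3-dimensional "embeddings" for illustration only
query = [0.9, 0.1, 0.0]
chunk_about_same_topic = [0.8, 0.2, 0.1]
unrelated_chunk = [0.0, 0.2, 0.9]

print(cosine_similarity(query, chunk_about_same_topic))  # high (~0.98)
print(cosine_similarity(query, unrelated_chunk))         # low  (~0.02)
```

The chunks with the highest similarity scores are the ones retrieved and handed to the chat model as context.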
Step 2: Install Embedding Model (Ollama)

If using local Ollama:
ollama pull nomic-embed-text
Step 3: Configure in Settings

  1. Go to Settings → RAG Settings
  2. Select your embedding model from the dropdown
  3. Configure chunk size (default: 1000)
  4. Configure overlap (default: 200)
  5. Set number of retrieved documents (default: 4)
  6. Click “Save Settings”
Important: Use models specifically designed for text embedding, NOT chat models. Using chat models for embeddings will not work properly.

Creating a Knowledge Base

Upload and process your documents:
Step 1: Access Knowledge Management

  1. Open Page Assist Web UI
  2. Go to Settings
  3. Navigate to “Manage Knowledge”
Step 2: Add New Knowledge

  1. Click “Add New Knowledge” button
  2. Enter a descriptive title for your knowledge base
  3. Optionally add custom system and follow-up prompts
  4. Click “Create”
Step 3: Upload Files

  1. Click “Upload Files” or drag and drop
  2. Select one or more supported files
  3. Wait for processing - you’ll see progress indicators
  4. Files are chunked and embedded automatically
Step 4: Verify Processing

  • Green checkmark indicates successful processing
  • View number of chunks created per file
  • Check total embedding count

Using Knowledge Bases

Once created, use your knowledge bases in conversations:

In the Web UI

  1. Start a new chat or open existing conversation
  2. Look for the database/block icon in the input area
  3. Click the icon to open knowledge base selector
  4. Select one or more knowledge bases to use
  5. Selected knowledge bases are highlighted with a checkmark
  6. Type your question and send

In the Sidebar

  1. Open the sidebar (Ctrl+Shift+Y)
  2. Click the knowledge base icon in the input area
  3. Select your knowledge base
  4. Ask questions about your documents
You can select multiple knowledge bases simultaneously. The system will search across all selected bases.
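Conceptually, searching across multiple selected bases works by pooling every chunk, ranking the pool by similarity to the query embedding, and keeping the top matches. A toy sketch (hypothetical data and function names, 2-dimensional vectors for brevity):

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

def retrieve(query_vec, knowledge_bases, k=4):
    # Pool chunks from every selected knowledge base, then rank by similarity
    pool = [
        (cosine(query_vec, vec), kb_name, text)
        for kb_name, chunks in knowledge_bases.items()
        for text, vec in chunks
    ]
    pool.sort(reverse=True)  # highest similarity first
    return pool[:k]

# Two hypothetical knowledge bases with toy embeddings
bases = {
    "research-papers": [("Transformers use attention.", [1.0, 0.0])],
    "meeting-notes": [("Lunch is at noon.", [0.0, 1.0]),
                      ("Attention budget discussed.", [0.9, 0.3])],
}
top = retrieve([1.0, 0.1], bases, k=2)
```

Note that the best matches can come from different bases: ranking happens over the combined pool, not per base.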

Knowledge Base Settings

RAG Configuration

Optimize how documents are processed and retrieved:
Chunk Size

Default: 1000 characters. Controls how documents are split:
  • Smaller (500-800): Better for precise answers, more chunks
  • Larger (1200-1500): More context per chunk, fewer chunks
Adjust in Settings → RAG Settings → Chunk Size

Chunk Overlap

Default: 200 characters. Overlap between consecutive chunks:
  • Prevents information loss at chunk boundaries
  • Higher overlap = more redundancy but better recall
  • Typical range: 100-300 characters
Adjust in Settings → RAG Settings → Chunk Overlap

Number of Retrieved Documents

Default: 4 documents. Number of relevant chunks retrieved per question:
  • Fewer (2-3): Faster, less context
  • More (6-8): Better coverage, more context
Adjust in Settings → RAG Settings → Number of Retrieved Documents

Splitting Strategy

Default: RecursiveCharacterTextSplitter. How text is divided into chunks:
  • RecursiveCharacterTextSplitter: Smart splitting by paragraphs, then sentences
  • CharacterTextSplitter: Simple character-based splitting with a custom separator
Adjust in Settings → RAG Settings → Splitting Strategy

Total Files per Knowledge Base

Default: 10 files. Maximum number of files per knowledge base:
  • Helps manage browser storage
  • Adjust based on file sizes
  • A larger limit may impact performance
Adjust in Settings → RAG Settings → Total File Per KB
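The RecursiveCharacterTextSplitter prefers paragraph and sentence boundaries, but the interaction between chunk size and overlap is easiest to see in a naive fixed-width sketch (hypothetical helper, not Page Assist code):

```python
def chunk_text(text, chunk_size=1000, overlap=200):
    # Sliding-window splitter: each chunk starts (chunk_size - overlap)
    # characters after the previous one, so neighbours share `overlap` characters.
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break
    return chunks

doc = "x" * 2500
chunks = chunk_text(doc)   # 1000-char chunks with 200-char overlap
print(len(chunks))         # 3 chunks cover a 2,500-character document
```

This is why smaller chunks or higher overlap increase the total number of embeddings: the effective step between chunks shrinks, so more chunks are needed to cover the same document.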

Custom Prompts

Create knowledge base-specific prompts:
  1. When creating or editing a knowledge base, open the prompt fields
  2. Add a custom System Prompt to set context
  3. Add a custom Follow-up Prompt for question processing
These prompts override the global RAG prompts for that knowledge base.
Example System Prompt:
You are an expert assistant analyzing technical documentation. 
Provide accurate, detailed answers citing specific sections.
Context: {context}
Example Follow-up Prompt:
Based on the documentation, {question}
Provide specific examples and page references.
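At query time, {context} is replaced with the retrieved chunks and {question} with the user's message. A sketch of that substitution using the example prompts above (the retrieved chunks here are hypothetical):

```python
# Example prompts from above, with their placeholders
system_prompt = (
    "You are an expert assistant analyzing technical documentation.\n"
    "Provide accurate, detailed answers citing specific sections.\n"
    "Context: {context}"
)
followup_prompt = (
    "Based on the documentation, {question}\n"
    "Provide specific examples and page references."
)

# Chunks returned by retrieval (hypothetical)
retrieved_chunks = [
    "Section 2.1: The API accepts JSON payloads.",
    "Section 3.4: Errors are reported via HTTP status codes.",
]

filled_system = system_prompt.format(context="\n".join(retrieved_chunks))
filled_followup = followup_prompt.format(question="how are errors reported?")
```

The filled prompts are what the chat model actually receives, which is why including {context} in a custom System Prompt matters: without it, the retrieved chunks have nowhere to go.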

Managing Knowledge Bases

View and Edit

  1. Go to Settings → Manage Knowledge
  2. Click on any knowledge base to view details
  3. See all uploaded files and their status
  4. View total chunks and embeddings
  5. Edit title and prompts

Add More Files

  1. Open an existing knowledge base
  2. Click “Upload Files”
  3. Select additional files
  4. New files are processed and added to existing embeddings

Remove Files

  1. Open knowledge base
  2. Find the file to remove
  3. Click the delete/remove icon
  4. Confirm deletion
  5. Associated embeddings are automatically removed

Delete Knowledge Base

Deleting a knowledge base permanently removes all files and embeddings. This cannot be undone.
  1. Go to Settings → Manage Knowledge
  2. Find the knowledge base to delete
  3. Click the delete icon
  4. Confirm deletion
  5. All data is removed from browser storage

Advanced Usage

Optimization Tips

Organization: Create separate knowledge bases for different topics or projects rather than one massive base.
File Preparation: For best results, use well-formatted documents with clear structure and headings.
Embedding Model: Use nomic-embed-text for the best balance of speed and quality with local processing.
Chunk Size: For technical docs, use smaller chunks (800). For books/articles, use larger chunks (1200).

Use Cases

Research

  • Analyze research papers
  • Cross-reference studies
  • Extract citations
  • Summarize findings

Documentation

  • Query technical docs
  • Find API references
  • Understand codebases
  • Generate documentation

Learning

  • Study textbooks
  • Review course materials
  • Create study guides
  • Generate practice questions

Business

  • Analyze reports
  • Review contracts
  • Extract insights from meeting notes
  • Summarize proposals

Troubleshooting

Files fail to upload or process

Possible causes:
  • File too large
  • Unsupported format
  • Browser storage full
Solutions:
  • Check file size (keep under 10MB)
  • Verify file extension is supported
  • Clear browser storage or delete old knowledge bases
  • Try splitting large files
Processing is slow

Possible causes:
  • Large file size
  • Complex PDF formatting
  • Slow embedding generation
Solutions:
  • Use local Ollama for faster embedding generation
  • Reduce chunk size to create fewer embeddings
  • Process files in smaller batches
  • Simplify PDF formatting if possible
Embedding model not configured

Solutions:
  1. Go to Settings → RAG Settings
  2. Ensure embedding model is selected
  3. For Ollama: Run ollama pull nomic-embed-text
  4. For OpenAI: Add API key in OpenAI settings
  5. Refresh and try again
Answers are inaccurate or incomplete

Possible causes:
  • Poor chunk retrieval
  • Wrong embedding model
  • Insufficient retrieved documents
Solutions:
  • Increase number of retrieved documents
  • Adjust chunk size and overlap
  • Use better embedding model (nomic-embed-text)
  • Rephrase your question more specifically
  • Try reducing chunk size for more precise matching
Browser storage running out

Causes:
  • Too many files
  • Large embeddings
  • Browser storage limit reached
Solutions:
  • Delete unused knowledge bases
  • Remove individual large files
  • Reduce files per knowledge base
  • Clear browser cache
  • Use smaller chunk sizes

Performance Considerations

Browser Storage: All embeddings are stored in browser local storage. Each document creates hundreds or thousands of embedding vectors.

Storage Impact

  • Small doc (10 pages): ~1-2 MB of embeddings
  • Medium doc (50 pages): ~5-10 MB of embeddings
  • Large doc (200 pages): ~20-40 MB of embeddings
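A rough back-of-envelope for the vector portion of that storage: chunks × dimensions × bytes per float. The figures above are higher because chunk text and metadata are stored alongside the vectors; every parameter in this sketch is an assumption (2,000 characters per page, 768-dimension float32 vectors):

```python
def estimate_vector_mb(pages, chars_per_page=2000, chunk_size=1000,
                       overlap=200, dims=768, bytes_per_float=4):
    # Raw embedding bytes only; ignores stored chunk text and JSON overhead
    total_chars = pages * chars_per_page
    step = chunk_size - overlap
    chunks = max(1, -(-total_chars // step))  # ceiling division
    return chunks * dims * bytes_per_float / 1_000_000

print(round(estimate_vector_mb(10), 2))   # ~0.08 MB of raw vectors for 10 pages
```

Note how the estimate scales linearly with page count and inversely with the step between chunks, which is why smaller chunk sizes and higher overlap inflate storage.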

Best Practices

  1. Limit file sizes: Keep individual files under 10MB
  2. Organize by topic: Create multiple small knowledge bases vs one large one
  3. Regular cleanup: Delete unused knowledge bases
  4. Monitor storage: Check browser storage usage regularly
  5. Use appropriate chunk sizes: Smaller chunks = more embeddings = more storage

Privacy and Security

Local Processing: All file processing and embedding generation happens locally or via your configured provider. Page Assist has no servers and never stores your data.
  • Documents are processed in your browser
  • Embeddings stored in browser local storage
  • Text is only sent to your configured AI provider
  • No third-party storage or processing
  • Delete knowledge base = permanent removal from your browser
When using cloud embedding providers (OpenAI), your document chunks are sent to their servers for embedding generation.

Next Steps

  • Configuration Settings - Configure embedding models and chunk settings
  • Prompts - Create custom prompts for knowledge bases
  • Internet Search - Combine knowledge base with web search
  • Chat with Webpage - Cross-reference documents with web content
