Knowledge Base

The Knowledge Base feature allows you to upload your own documents and have intelligent conversations about their content. Using RAG (Retrieval Augmented Generation), Page Assist creates searchable embeddings of your documents and retrieves relevant information to answer your questions.

Knowledge bases are stored entirely in your browser’s local storage. Large files may impact browser performance. Consider file size and quantity when building your knowledge base.

Supported File Types

Page Assist supports the following document formats:

Format	Extension	Use Case
PDF	`.pdf`	Research papers, books, reports, manuals
Word Documents	`.docx`	Essays, documentation, meeting notes
Plain Text	`.txt`	Code files, logs, simple notes
CSV	`.csv`	Data tables, spreadsheets, lists
Markdown	`.md`	Technical docs, notes, README files

File processing time depends on document size and complexity. PDFs with images or complex formatting may take longer to process.

Prerequisites

Before using the Knowledge Base feature, you need to configure an embedding model:

Choose an Embedding Model

Embedding models convert text into vector representations for semantic search.

Recommended Models:

nomic-embed-text (Ollama) - Best all-around choice
mxbai-embed-large (Ollama) - High quality
all-minilm (Ollama) - Fast and lightweight
text-embedding-3-small (OpenAI) - Cloud-based option

Install Embedding Model (Ollama)

If using local Ollama:

ollama pull nomic-embed-text
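To check that the pulled model actually returns vectors, you can query the local Ollama server directly. The sketch below is only an illustration: it assumes Ollama is listening on its default address (`http://localhost:11434`) and uses the `/api/embeddings` route with `model` and `prompt` fields. The helper names `build_embedding_request` and `embed` are hypothetical and are not part of Page Assist.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/embeddings"


def build_embedding_request(text: str, model: str = "nomic-embed-text") -> urllib.request.Request:
    """Build a POST request asking Ollama to embed one piece of text."""
    payload = json.dumps({"model": model, "prompt": text}).encode("utf-8")
    return urllib.request.Request(
        OLLAMA_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def embed(text: str, model: str = "nomic-embed-text") -> list[float]:
    """Send the request and return the embedding vector from the JSON response."""
    with urllib.request.urlopen(build_embedding_request(text, model)) as response:
        return json.load(response)["embedding"]
```

With the server running, `len(embed("hello world"))` gives the vector dimension of the selected model. A connection error usually means Ollama is not running or is listening on a different address.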

Configure in Settings

Go to Settings → RAG Settings
Select your embedding model from the dropdown
Configure chunk size (default: 1000)
Configure overlap (default: 200)
Set number of retrieved documents (default: 4)
Click “Save Settings”

Important: Use models specifically designed for text embedding, NOT chat models. Using chat models for embeddings will not work properly.

Creating a Knowledge Base

Upload and process your documents:

Access Knowledge Management

Open Page Assist Web UI
Go to Settings
Navigate to “Manage Knowledge”

Add New Knowledge

Click “Add New Knowledge” button
Enter a descriptive title for your knowledge base
Optionally add custom system and follow-up prompts
Click “Create”

Upload Files

Click “Upload Files” or drag and drop
Select one or more supported files
Wait for processing - you’ll see progress indicators
Files are chunked and embedded automatically

Verify Processing

Green checkmark indicates successful processing
View number of chunks created per file
Check total embedding count

Using Knowledge Bases

Once created, use your knowledge bases in conversations:

In the Web UI

Start a new chat or open an existing conversation
Look for the database/block icon in the input area
Click the icon to open the knowledge base selector
Select one or more knowledge bases to use
Selected knowledge bases are marked with a checkmark
Type your question and send

In the Sidebar

Open the sidebar (Ctrl+Shift+Y)
Click the knowledge base icon in the input area
Select your knowledge base
Ask questions about your documents

You can select multiple knowledge bases simultaneously. The system will search across all selected bases.

Knowledge Base Settings

RAG Configuration

Optimize how documents are processed and retrieved:

Chunk Size

Default: 1000 characters

Controls how documents are split:

Smaller (500-800): Better for precise answers, more chunks
Larger (1200-1500): More context per chunk, fewer chunks

Adjust in Settings → RAG Settings → Chunk Size

Chunk Overlap

Default: 200 characters

Overlap between consecutive chunks:

Prevents information loss at boundaries
Higher overlap = more redundancy but better recall
Typical range: 100-300 characters

Adjust in Settings → RAG Settings → Chunk Overlap
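How chunk size and overlap interact is easiest to see in a plain sliding-window splitter. This is a minimal sketch, not Page Assist's actual splitting code: the function name `chunk_text` is hypothetical, and it cuts purely by character count, ignoring paragraph and sentence boundaries.

```python
def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Split text into fixed-size character windows that share `overlap` characters."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks
```

With the defaults, each window starts 800 characters after the previous one, so the last 200 characters of one chunk are repeated at the start of the next. A sentence that straddles a boundary therefore appears whole in at least one chunk, provided it is shorter than the overlap.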

Retrieved Documents

Default: 4 documents

Number of relevant chunks to retrieve:

Fewer (2-3): Faster, less context
More (6-8): Better coverage, more context

Adjust in Settings → RAG Settings → Number of Retrieved Documents
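Retrieval can be pictured as ranking every stored chunk by its similarity to the question and keeping the top few. The sketch below assumes cosine similarity as the ranking metric, which this page does not confirm for Page Assist. The names `cosine_similarity` and `top_k` are hypothetical.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec: list[float], chunks: list[tuple[str, list[float]]], k: int = 4) -> list[str]:
    """Return the texts of the k chunks whose vectors are most similar to the query."""
    ranked = sorted(
        chunks,
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]
```

Raising `k` passes more chunks to the chat model, which improves coverage but lengthens the prompt. Lowering it does the opposite.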

Splitting Strategy

Default: RecursiveCharacterTextSplitter

How text is divided into chunks:

RecursiveCharacterTextSplitter: Smart splitting by paragraphs, sentences
CharacterTextSplitter: Simple character-based splitting with custom separator

Adjust in Settings → RAG Settings → Splitting Strategy
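The idea behind recursive splitting can be sketched in a few lines: try the coarsest separator first, and fall back to finer ones only for pieces that are still too large. This is a simplified illustration, not the code of the splitter named above. The function name `recursive_split` and the separator list are assumptions, and chunk overlap is left out for brevity.

```python
def recursive_split(text: str, chunk_size: int = 1000,
                    separators: tuple[str, ...] = ("\n\n", "\n", ". ", " ")) -> list[str]:
    """Split on the coarsest separator first, recursing to finer ones for oversized pieces."""
    if len(text) <= chunk_size:
        return [text] if text.strip() else []
    if not separators:
        # No separators left: fall back to hard cuts at chunk_size characters.
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

    sep, finer = separators[0], separators[1:]
    chunks, current = [], ""
    for piece in text.split(sep):
        candidate = piece if not current else current + sep + piece
        if len(candidate) <= chunk_size:
            current = candidate  # keep merging small pieces into one chunk
            continue
        if current:
            chunks.append(current)
        if len(piece) <= chunk_size:
            current = piece
        else:
            # The piece alone is too big: split it with the next finer separator.
            chunks.extend(recursive_split(piece, chunk_size, finer))
            current = ""
    if current:
        chunks.append(current)
    return chunks
```

Compared with fixed character windows, this keeps paragraphs and sentences intact whenever they fit within the chunk size, which tends to give the embedding model more coherent text to work with.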

Files Per Knowledge Base

Default: 10 files

Maximum number of files per knowledge base:

Helps manage browser storage
Adjust based on file sizes
Larger limit may impact performance

Adjust in Settings → RAG Settings → Total File Per KB

Custom Prompts

Create knowledge base-specific prompts:

When creating/editing a knowledge base
Add custom System Prompt to set context
Add custom Follow-up Prompt for question processing
These override global RAG prompts for this knowledge base

Example System Prompt:

You are an expert assistant analyzing technical documentation. 
Provide accurate, detailed answers citing specific sections.
Context: {context}

Example Follow-up Prompt:

Based on the documentation, {question}
Provide specific examples and page references.
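The `{context}` and `{question}` placeholders in the examples above are filled in before the prompt reaches the model. The following sketch shows one way such substitution could work. It is an assumption about the mechanism, not Page Assist's actual code, and the `fill` helper is hypothetical.

```python
SYSTEM_PROMPT = (
    "You are an expert assistant analyzing technical documentation.\n"
    "Provide accurate, detailed answers citing specific sections.\n"
    "Context: {context}"
)
FOLLOW_UP_PROMPT = (
    "Based on the documentation, {question}\n"
    "Provide specific examples and page references."
)


def fill(template: str, **values: str) -> str:
    """Replace each {name} placeholder with its value; other braces are left untouched."""
    for name, value in values.items():
        template = template.replace("{" + name + "}", value)
    return template
```

For example, `fill(SYSTEM_PROMPT, context=retrieved_text)` would place the retrieved chunks after `Context:`. A custom prompt that omits its placeholder gives the substitution nothing to replace, so it is worth keeping `{context}` and `{question}` in place when editing.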

Managing Knowledge Bases

View and Edit

Go to Settings → Manage Knowledge
Click on any knowledge base to view details
See all uploaded files and their status
View total chunks and embeddings
Edit title and prompts

Add More Files

Open an existing knowledge base
Click “Upload Files”
Select additional files
New files are processed and added to existing embeddings

Remove Files

Open knowledge base
Find the file to remove
Click the delete/remove icon
Confirm deletion
Associated embeddings are automatically removed

Delete Knowledge Base

Deleting a knowledge base permanently removes all files and embeddings. This cannot be undone.

Go to Settings → Manage Knowledge
Find the knowledge base to delete
Click the delete icon
Confirm deletion
All data is removed from browser storage

Advanced Usage

Combining Features

Knowledge bases can be combined with internet search, with the current webpage, or with other knowledge bases.

Knowledge Base + Internet Search

Get the best of both worlds:

Enable knowledge base
Enable internet search (globe icon)
Ask questions that reference both your docs and current info
Example: “How does my research compare to recent developments?”

Optimization Tips

Organization: Create separate knowledge bases for different topics or projects rather than one massive base.

File Preparation: For best results, use well-formatted documents with clear structure and headings.

Embedding Model: Use nomic-embed-text for the best balance of speed and quality with local processing.

Chunk Size: For technical docs, use smaller chunks (800). For books/articles, use larger chunks (1200).

Use Cases

Research

Analyze research papers
Cross-reference studies
Extract citations
Summarize findings

Documentation

Query technical docs
Find API references
Understand codebases
Generate documentation

Learning

Study textbooks
Review course materials
Create study guides
Generate practice questions

Business

Analyze reports
Review contracts
Extract insights from meeting notes
Summarize proposals

Troubleshooting

File upload fails

Possible causes:

File too large
Unsupported format
Browser storage full

Solutions:

Check file size (keep under 10MB)
Verify file extension is supported
Clear browser storage or delete old knowledge bases
Try splitting large files
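The first two checks can be run on a file before uploading it. This is a small sketch using the extensions listed under Supported File Types and the 10MB guideline from this page. The function name `upload_problems` is hypothetical, and Page Assist's own validation may differ.

```python
import os

SUPPORTED_EXTENSIONS = {".pdf", ".docx", ".txt", ".csv", ".md"}
MAX_BYTES = 10 * 1024 * 1024  # the 10MB guideline from this page


def upload_problems(filename: str, size_bytes: int) -> list[str]:
    """Return a list of reasons the file is likely to fail; empty means it looks fine."""
    problems = []
    ext = os.path.splitext(filename)[1].lower()
    if ext not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported extension: {ext or '(none)'}")
    if size_bytes > MAX_BYTES:
        problems.append("file is larger than 10MB")
    return problems
```

For a file on disk, `upload_problems(path, os.path.getsize(path))` supplies the size.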

Processing takes too long

Possible causes:

Large file size
Complex PDF formatting
Slow embedding generation

Solutions:

Use local Ollama for faster embedding generation
Increase chunk size to create fewer embeddings
Process files in smaller batches
Simplify PDF formatting if possible

No embedding model available

Solutions:

Go to Settings → RAG Settings
Ensure embedding model is selected
For Ollama: Run `ollama pull nomic-embed-text`
For OpenAI: Add API key in OpenAI settings
Refresh and try again

Irrelevant results

Possible causes:

Poor chunk retrieval
Wrong embedding model
Insufficient retrieved documents

Solutions:

Increase number of retrieved documents
Adjust chunk size and overlap
Use better embedding model (nomic-embed-text)
Rephrase your question more specifically
Try reducing chunk size for more precise matching

Storage quota exceeded

Causes:

Too many files
Large embeddings
Browser storage limit reached

Solutions:

Delete unused knowledge bases
Remove individual large files
Reduce files per knowledge base
Clear browser cache
Use larger chunk sizes (fewer chunks means fewer embeddings to store)

Performance Considerations

Browser Storage: All embeddings are stored in browser local storage. Each document creates hundreds or thousands of embedding vectors.

Storage Impact

Small doc (10 pages): ~1-2 MB of embeddings
Medium doc (50 pages): ~5-10 MB of embeddings
Large doc (200 pages): ~20-40 MB of embeddings
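A back-of-the-envelope calculation shows how chunk settings drive storage. The sketch below counts only the raw vectors, assuming 768-dimensional embeddings stored as 4-byte floats. Both values are assumptions, as is the function name `estimate_embedding_bytes`. Actual usage will be higher because chunk text, metadata, and serialization overhead are not counted, so treat the result as a lower bound and not as a replacement for the figures above.

```python
import math


def estimate_embedding_bytes(num_chars: int, chunk_size: int = 1000, overlap: int = 200,
                             dimensions: int = 768, bytes_per_value: int = 4) -> int:
    """Rough size in bytes of the raw embedding vectors for one document."""
    if num_chars <= 0:
        return 0
    step = chunk_size - overlap  # characters of new text covered by each extra chunk
    num_chunks = max(1, math.ceil((num_chars - overlap) / step))
    return num_chunks * dimensions * bytes_per_value
```

Halving the chunk size roughly doubles the number of chunks, and with it the number of vectors stored.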

Best Practices

Limit file sizes: Keep individual files under 10MB
Organize by topic: Create several small knowledge bases rather than one large one
Regular cleanup: Delete unused knowledge bases
Monitor storage: Check browser storage usage regularly
Use appropriate chunk sizes: Smaller chunks = more embeddings = more storage

Privacy and Security

Local Processing: All file processing and embedding generation happens locally or via your configured provider. Page Assist has no servers and never stores your data.

Documents are processed in your browser
Embeddings stored in browser local storage
Text is only sent to your configured AI provider
No third-party storage or processing beyond your configured provider
Delete knowledge base = permanent removal from your browser

When using cloud embedding providers (OpenAI), your document chunks are sent to their servers for embedding generation.

Next Steps

Configuration Settings

Configure embedding models and chunk settings

Prompts

Create custom prompts for knowledge bases

Internet Search

Combine knowledge base with web search

Chat with Webpage

Cross-reference documents with web content
