Supported File Types
Page Assist supports the following document formats:
| Format | Extension | Use Case |
|---|---|---|
| PDF | .pdf | Research papers, books, reports, manuals |
| Word Documents | .docx | Essays, documentation, meeting notes |
| Plain Text | .txt | Code files, logs, simple notes |
| CSV | .csv | Data tables, spreadsheets, lists |
| Markdown | .md | Technical docs, notes, README files |
File processing time depends on document size and complexity. PDFs with images or complex formatting may take longer to process.
Prerequisites
Before using the Knowledge Base feature, you need to configure an embedding model:
Choose an Embedding Model
Embedding models convert text into vector representations for semantic search.
Recommended Models:
- nomic-embed-text (Ollama) - Best all-around choice
- mxbai-embed-large (Ollama) - High quality
- all-minilm (Ollama) - Fast and lightweight
- text-embedding-3-small (OpenAI) - Cloud-based option
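To make "semantic search" concrete: once text is turned into vectors, relevance is typically measured by how closely two vectors point in the same direction. The sketch below uses cosine similarity, a common choice; this guide does not state which measure Page Assist uses, so treat that as an assumption. The vectors are toy values, not output from a real embedding model.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # Treat an all-zero vector as unrelated to everything.
    return dot / (norm_a * norm_b)


# Toy 3-dimensional "embeddings" (assumed values for illustration only).
query = [0.9, 0.1, 0.0]
chunk_same_topic = [0.8, 0.2, 0.1]
chunk_other_topic = [0.0, 0.1, 0.9]

# The chunk on the same topic scores higher than the unrelated one.
print(cosine_similarity(query, chunk_same_topic) >
      cosine_similarity(query, chunk_other_topic))
```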
Creating a Knowledge Base
Upload and process your documents:
Add New Knowledge
- Click the “Add New Knowledge” button
- Enter a descriptive title for your knowledge base
- Optionally add custom system and follow-up prompts
- Click “Create”
Upload Files
- Click “Upload Files” or drag and drop
- Select one or more supported files
- Wait for processing - you’ll see progress indicators
- Files are chunked and embedded automatically
Using Knowledge Bases
Once created, use your knowledge bases in conversations:
In the Web UI
- Start a new chat or open an existing conversation
- Look for the database/block icon in the input area
- Click the icon to open the knowledge base selector
- Select one or more knowledge bases to use
- Selected knowledge bases are highlighted with a checkmark
- Type your question and send
In the Sidebar
- Open the sidebar (Ctrl+Shift+Y)
- Click the knowledge base icon in the input area
- Select your knowledge base
- Ask questions about your documents
You can select multiple knowledge bases simultaneously. The system will search across all selected bases.
Knowledge Base Settings
RAG Configuration
Optimize how documents are processed and retrieved:
Chunk Size
Default: 1000 characters
Controls how documents are split:
- Smaller (500-800): Better for precise answers, more chunks
- Larger (1200-1500): More context per chunk, fewer chunks
Chunk Overlap
Default: 200 characters
Overlap between consecutive chunks:
- Prevents information loss at boundaries
- Higher overlap = more redundancy but better recall
- Typical range: 100-300 characters
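How chunk size and chunk overlap interact can be sketched with a plain sliding window. This is a simplified illustration, not the splitter Page Assist actually runs; the function name chunk_text is hypothetical, and the defaults mirror the values listed in this section.

```python
def chunk_text(text: str, chunk_size: int = 1000, chunk_overlap: int = 200) -> list[str]:
    """Split text into fixed-size windows that share chunk_overlap characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be non-negative and smaller than chunk_size")

    step = chunk_size - chunk_overlap  # How far each new window advances.
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # This window already reached the end of the text.
    return chunks


sample = "".join(str(i % 10) for i in range(2600))
chunks = chunk_text(sample)
# The last 200 characters of one chunk are the first 200 of the next.
print(len(chunks), chunks[0][800:] == chunks[1][:200])
```

With the defaults, each window advances 800 characters, so raising the overlap increases the number of chunks for the same document.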
Retrieved Documents
Default: 4 documents
Number of relevant chunks to retrieve:
- Fewer (2-3): Faster, less context
- More (6-8): Better coverage, more context
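The Retrieved Documents setting corresponds to the "k" in a top-k lookup: score every chunk against the question, then keep the k best. A minimal sketch, assuming the similarity scores have already been computed; the function name and the sample scores are made up for illustration.

```python
import heapq


def top_k_chunks(scored_chunks: list[tuple[float, str]], k: int = 4) -> list[str]:
    """Return the text of the k highest-scoring chunks, best first."""
    best = heapq.nlargest(k, scored_chunks, key=lambda pair: pair[0])
    return [text for _, text in best]


# (similarity score, chunk text) pairs; the values are illustrative only.
scored = [(0.91, "a"), (0.12, "b"), (0.77, "c"), (0.45, "d"), (0.83, "e")]
print(top_k_chunks(scored, k=2))
```

A larger k passes more chunks to the model, which improves coverage but also lengthens the prompt.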
Splitting Strategy
Default: RecursiveCharacterTextSplitter
How text is divided into chunks:
- RecursiveCharacterTextSplitter: Smart splitting by paragraphs, sentences
- CharacterTextSplitter: Simple character-based splitting with custom separator
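The idea behind recursive splitting can be sketched as follows: try the coarsest separator first (paragraph breaks), and only fall back to finer ones (lines, words, raw characters) for pieces that are still too long. This is a simplified illustration written from scratch, not the RecursiveCharacterTextSplitter implementation; it ignores overlap, and the function name and separator list are assumptions.

```python
def recursive_split(text: str, chunk_size: int,
                    separators: tuple[str, ...] = ("\n\n", "\n", " ", "")) -> list[str]:
    """Split text into chunks of at most chunk_size, preferring coarse separators."""
    if len(text) <= chunk_size:
        return [text] if text else []
    if not separators or separators[0] == "":
        # Last resort: cut at fixed character positions.
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

    sep, finer = separators[0], separators[1:]
    chunks, current = [], ""
    for piece in text.split(sep):
        candidate = piece if not current else current + sep + piece
        if len(candidate) <= chunk_size:
            current = candidate  # Keep merging small pieces together.
            continue
        if current:
            chunks.append(current)
        if len(piece) <= chunk_size:
            current = piece
        else:
            # The piece alone is too long: retry it with finer separators.
            chunks.extend(recursive_split(piece, chunk_size, finer))
            current = ""
    if current:
        chunks.append(current)
    return chunks


doc = "First paragraph.\n\nSecond paragraph is a bit longer.\n\nThird."
print(recursive_split(doc, chunk_size=35))
```

In this example the text breaks cleanly at paragraph boundaries, so no sentence is cut in the middle; a plain character-based split with a single separator offers less of that guarantee.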
Files Per Knowledge Base
Default: 10 files
Maximum number of files per knowledge base:
- Helps manage browser storage
- Adjust based on file sizes
- Larger limit may impact performance
Custom Prompts
Create knowledge base-specific prompts when creating or editing a knowledge base:
- Add a custom System Prompt to set context
- Add a custom Follow-up Prompt for question processing
- These override the global RAG prompts for this knowledge base
Managing Knowledge Bases
View and Edit
- Go to Settings → Manage Knowledge
- Click on any knowledge base to view details
- See all uploaded files and their status
- View total chunks and embeddings
- Edit title and prompts
Add More Files
- Open an existing knowledge base
- Click “Upload Files”
- Select additional files
- New files are processed and added to existing embeddings
Remove Files
- Open knowledge base
- Find the file to remove
- Click the delete/remove icon
- Confirm deletion
- Associated embeddings are automatically removed
Delete Knowledge Base
- Go to Settings → Manage Knowledge
- Find the knowledge base to delete
- Click the delete icon
- Confirm deletion
- All data is removed from browser storage
Advanced Usage
Combining Features
Knowledge bases can be combined with other features:
- Knowledge Base + Internet Search
- Knowledge Base + Webpage
- Multiple Knowledge Bases
Knowledge Base + Internet Search
Get the best of both worlds:
- Enable knowledge base
- Enable internet search (globe icon)
- Ask questions that reference both your docs and current info
- Example: “How does my research compare to recent developments?”
Optimization Tips
Use Cases
Research
- Analyze research papers
- Cross-reference studies
- Extract citations
- Summarize findings
Documentation
- Query technical docs
- Find API references
- Understand codebases
- Generate documentation
Learning
- Study textbooks
- Review course materials
- Create study guides
- Generate practice questions
Business
- Analyze reports
- Review contracts
- Extract insights from meeting notes
- Summarize proposals
Troubleshooting
File upload fails
Possible causes:
- File too large
- Unsupported format
- Browser storage full
Solutions:
- Check file size (keep under 10MB)
- Verify file extension is supported
- Clear browser storage or delete old knowledge bases
- Try splitting large files
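The first two checks can be done before uploading. The sketch below tests a file name and size against the formats and the 10MB guideline given in this guide; the function name is hypothetical, and treating 10MB as 10 × 1024 × 1024 bytes is an assumption.

```python
from pathlib import Path

# Extensions from the Supported File Types table in this guide.
SUPPORTED_EXTENSIONS = {".pdf", ".docx", ".txt", ".csv", ".md"}
MAX_FILE_BYTES = 10 * 1024 * 1024  # Assumed byte value for the 10MB guideline.


def upload_problems(filename: str, size_bytes: int) -> list[str]:
    """Return a list of reasons the file is likely to fail; empty if none found."""
    problems = []
    extension = Path(filename).suffix.lower()
    if extension not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported format: {extension or '(none)'}")
    if size_bytes > MAX_FILE_BYTES:
        problems.append("file larger than 10MB")
    return problems


print(upload_problems("paper.PDF", 2_000_000))
print(upload_problems("slides.pptx", 2_000_000))
```

This does not cover the third cause, a full browser storage quota, which can only be checked inside the browser.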
Processing takes too long
Possible causes:
- Large file size
- Complex PDF formatting
- Slow embedding generation
Solutions:
- Use local Ollama for faster embedding generation
- Increase chunk size to create fewer embeddings
- Process files in smaller batches
- Simplify PDF formatting if possible
No embedding model available
Solutions:
- Go to Settings → RAG Settings
- Ensure an embedding model is selected
- For Ollama: Run ollama pull nomic-embed-text
- For OpenAI: Add API key in OpenAI settings
- Refresh and try again
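To confirm that the pull worked, one option is to ask the local Ollama server which models it has installed. The sketch below assumes Ollama is running at its default address, http://localhost:11434, and uses its /api/tags endpoint; the helper names are hypothetical. The network call is wrapped in a function and is not executed on import.

```python
import json
import urllib.request


def model_installed(tags_payload: dict, model: str) -> bool:
    """True if model appears in the body of an Ollama /api/tags response."""
    names = [entry.get("name", "") for entry in tags_payload.get("models", [])]
    # Ollama reports names with a tag suffix, e.g. "nomic-embed-text:latest".
    return any(name == model or name.split(":")[0] == model for name in names)


def fetch_tags(base_url: str = "http://localhost:11434") -> dict:
    """Fetch the list of locally installed models from a running Ollama server."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=5) as response:
        return json.load(response)


# Usage, with Ollama running locally:
#   print(model_installed(fetch_tags(), "nomic-embed-text"))
```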
Irrelevant results
Possible causes:
- Poor chunk retrieval
- Wrong embedding model
- Insufficient retrieved documents
Solutions:
- Increase the number of retrieved documents
- Adjust chunk size and overlap
- Use a better embedding model (for example, nomic-embed-text)
- Rephrase your question more specifically
- Try reducing chunk size for more precise matching
Storage quota exceeded
Causes:
- Too many files
- Large embeddings
- Browser storage limit reached
Solutions:
- Delete unused knowledge bases
- Remove individual large files
- Reduce files per knowledge base
- Clear browser cache
- Use larger chunk sizes (fewer chunks means fewer embeddings to store)
Performance Considerations
Storage Impact
- Small doc (10 pages): ~1-2 MB of embeddings
- Medium doc (50 pages): ~5-10 MB of embeddings
- Large doc (200 pages): ~20-40 MB of embeddings
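The figures above are approximate. One rough way to reason about them is that embedding storage grows with the number of chunks multiplied by the vector length. The sketch below is back-of-the-envelope arithmetic only: the function names are hypothetical, the bytes-per-value figure depends on how vectors are serialized (which this guide does not specify), and the result excludes the stored chunk text, so it is not expected to reproduce the numbers above.

```python
import math


def estimate_chunks(total_chars: int, chunk_size: int = 1000, chunk_overlap: int = 200) -> int:
    """Approximate chunk count for a sliding window with the given overlap."""
    if total_chars <= 0:
        return 0
    if total_chars <= chunk_size:
        return 1
    step = chunk_size - chunk_overlap
    return 1 + math.ceil((total_chars - chunk_size) / step)


def estimate_embedding_bytes(total_chars: int, dimensions: int, bytes_per_value: int,
                             chunk_size: int = 1000, chunk_overlap: int = 200) -> int:
    """Chunks x vector length x bytes per stored number (vectors only)."""
    return estimate_chunks(total_chars, chunk_size, chunk_overlap) * dimensions * bytes_per_value


# Example: 30,000 characters, 768-dimensional vectors, 4 bytes per value (assumed).
print(estimate_embedding_bytes(30_000, dimensions=768, bytes_per_value=4))
```

Halving the chunk size roughly doubles the chunk count, and with it the space the vectors take.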
Best Practices
- Limit file sizes: Keep individual files under 10MB
- Organize by topic: Create several small knowledge bases rather than one large one
- Regular cleanup: Delete unused knowledge bases
- Monitor storage: Check browser storage usage regularly
- Use appropriate chunk sizes: Smaller chunks = more embeddings = more storage
Privacy and Security
Local Processing: All file processing and embedding generation happens locally or via your configured provider. Page Assist has no servers and never stores your data.
- Documents are processed in your browser
- Embeddings are stored in browser local storage
- Text is only sent to your configured AI provider
- No third-party storage or processing
- Delete knowledge base = permanent removal from your browser
When using cloud embedding providers (OpenAI), your document chunks are sent to their servers for embedding generation.
Next Steps
- Configuration Settings: Configure embedding models and chunk settings
- Prompts: Create custom prompts for knowledge bases
- Internet Search: Combine knowledge base with web search
- Chat with Webpage: Cross-reference documents with web content