How It Works
Chat with Webpage extracts text content from the current page and provides it to your AI model as context. You can choose between two processing modes:

- RAG Mode (Recommended)
- Normal Mode
RAG Mode uses Retrieval-Augmented Generation (RAG) with vector embeddings:
- Page content is extracted and split into chunks
- Chunks are converted to vector embeddings using your configured embedding model
- When you ask a question, relevant chunks are retrieved
- Only relevant context is sent to the AI, enabling longer documents
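The pipeline above can be sketched in a few lines of Python. The bag-of-words "embedding" and the word-overlap similarity here are toy stand-ins for your configured embedding model, used only to make the retrieval idea concrete:

```python
from collections import Counter
import math

def split_into_chunks(text, size=1000, overlap=200):
    """Split text into overlapping character chunks (defaults match the docs)."""
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break
        start += size - overlap
    return chunks

def embed(text):
    """Toy 'embedding': a word-count vector (stand-in for a real embedding model)."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(question, chunks, k=4):
    """Return the k chunks most similar to the question."""
    q = embed(question)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

page = ("Ollama runs models locally. " * 30 +
        "Embeddings turn text into vectors for similarity search. " * 30)
chunks = split_into_chunks(page, size=200, overlap=50)
top = retrieve("What are embeddings used for?", chunks, k=2)
```

Only the top-scoring chunks, not the whole page, are then sent to the model as context, which is why RAG handles documents far larger than the model's context window.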
Getting Started
Enable Chat with Webpage
In the Sidebar:
- Open the sidebar on any webpage (`Ctrl+Shift+Y`)
- Click the webpage icon in the input area to enable
- The icon highlights when active

In the Web UI:
- Open the Web UI (`Ctrl+Shift+L`)
- Navigate to the webpage you want to analyze
- Enable the webpage chat mode
Configure RAG (Optional)
For better performance with long pages:
- Go to Settings → RAG Settings
- Select an embedding model (recommended: `nomic-embed-text`)
- Configure chunk size (default: 1000)
- Configure chunk overlap (default: 200)
- Save settings
Sidebar Configuration
Customize how the sidebar handles webpage content.

Using RAG Mode
By default, the sidebar uses RAG with vector embeddings:

- Open the sidebar and click the settings icon
- Find “Copilot Chat With Website Settings”
- Ensure “Chat with website using vector embeddings” is enabled
- Configure your embedding model in Settings → RAG Settings
Using Normal Mode
For simpler, faster processing without embeddings, disable RAG:
- Open sidebar settings
- Find “Copilot Chat With Website Settings”
- Disable “Chat with website using vector embeddings”
Normal mode is limited by your model’s context window. For GPT-3.5, keep content under 4000 tokens. For GPT-4 or Claude, you can use much larger values.
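A common rough heuristic for English text is about 4 characters per token (an assumption here, not the extension's exact accounting). Trimming page content to a token budget before sending it might look like:

```python
def trim_to_token_budget(text, max_tokens=4000, chars_per_token=4):
    """Roughly trim text to a token budget, assuming ~4 chars per token."""
    max_chars = max_tokens * chars_per_token
    if len(text) <= max_chars:
        return text
    # Prefer cutting at the last sentence boundary before the limit.
    cut = text.rfind(". ", 0, max_chars)
    return text[:cut + 1] if cut != -1 else text[:max_chars]

page = "Some sentence. " * 5000   # far over a 4000-token budget
trimmed = trim_to_token_budget(page, max_tokens=4000)
```

Real tokenizers vary by model, so treat the 4-chars-per-token figure as a safety margin rather than an exact limit.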
Enable by Default
Automatically enable chat with webpage when opening the sidebar:

- Open sidebar settings
- Find “Enable Chat with Website by default (Copilot)”
- Toggle on
- The sidebar will now always start in webpage mode
RAG Configuration
Optimize RAG settings for webpage analysis.

Embedding Model Selection
Choose the right embedding model.

For Ollama (Local):
- `nomic-embed-text`: Best all-around, fast and accurate
- `mxbai-embed-large`: High-quality embeddings
- `all-minilm`: Lightweight and fast

Install a local model with `ollama pull nomic-embed-text`.

For OpenAI:
- `text-embedding-3-small`: Cost-effective
- `text-embedding-3-large`: Highest quality
- `text-embedding-ada-002`: Legacy but reliable
Chunk Settings
Optimize how content is split:

| Setting | Recommended Value | Description |
|---|---|---|
| Chunk Size | 1000 | Characters per chunk |
| Chunk Overlap | 200 | Overlap between chunks |
| Retrieved Docs | 4-6 | Number of relevant chunks to use |
| Splitting Strategy | RecursiveCharacterTextSplitter | Best for web content |
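A recursive character splitter tries larger natural separators first (paragraph breaks, then line breaks, then sentences) and only falls back to hard character cuts. A simplified sketch of the idea, without the overlap handling of the real implementation:

```python
def recursive_split(text, size=1000, seps=("\n\n", "\n", ". ", " ")):
    """Simplified recursive splitter: prefer natural boundaries over hard cuts."""
    if len(text) <= size:
        return [text]
    for sep in seps:
        cut = text.rfind(sep, 0, size)
        if cut > 0:
            head = text[:cut + len(sep)]
            return [head] + recursive_split(text[cut + len(sep):], size, seps)
    # No separator found within the limit: hard character split.
    return [text[:size]] + recursive_split(text[size:], size, seps)
```

Because cuts land on paragraph or sentence boundaries whenever possible, each chunk tends to stay self-contained, which improves retrieval quality on web content.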
Understanding Chunk Size
Chunk Size determines how page content is divided:
- Smaller chunks (500-800): More precise retrieval, better for specific questions
- Larger chunks (1000-1500): More context per chunk, better for summaries
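As a worked example of how size and overlap interact, the number of chunks an overlapping splitter produces can be computed directly:

```python
import math

def chunk_count(length, size=1000, overlap=200):
    """Number of overlapping chunks a page of `length` characters produces."""
    if length <= size:
        return 1
    stride = size - overlap          # each new chunk advances by size - overlap
    return math.ceil((length - overlap) / stride)

chunk_count(10_000)                  # defaults: stride 800, ceil(9800/800) = 13
chunk_count(10_000, size=500)        # smaller chunks: 33 finer-grained pieces
```

Smaller chunks mean more pieces to embed and search, so retrieval is more precise but slower; larger chunks do the opposite.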
Custom RAG Prompts
Customize the system prompt for webpage analysis:

- Go to Settings → RAG Settings
- Scroll to “Configure RAG Prompt”
- Select the RAG tab
- Edit the system and question prompts
- Available variables:
- `{context}`: Retrieved webpage chunks (don’t remove)
- `{question}`: User’s question
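At query time, the variables are substituted with the retrieved chunks and your question. A hypothetical template (the wording is illustrative, not the extension's built-in prompt) shows how the substitution works:

```python
# Hypothetical prompt template using the documented {context} and {question} variables.
SYSTEM_TEMPLATE = (
    "Answer using only the webpage excerpts below. "
    "If the answer is not in them, say so.\n\n"
    "Context:\n{context}\n\nQuestion: {question}"
)

chunks = ["Chunk one about pricing.", "Chunk two about features."]
prompt = SYSTEM_TEMPLATE.format(
    context="\n---\n".join(chunks),
    question="What does the page say about pricing?",
)
```

This is why removing `{context}` breaks the feature: without it, the retrieved chunks never reach the model.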
Use Cases
Research
- Summarize research papers
- Extract key findings
- Compare multiple sources
- Generate citations
Learning
- Understand complex documentation
- Get explanations in simple terms
- Generate study notes
- Create quiz questions
Shopping
- Compare product features
- Extract specifications
- Summarize reviews
- Find best deals
News
- Summarize articles
- Extract key points
- Fact-check claims
- Get different perspectives
Advanced Techniques
Combining with Internet Search
Use both webpage chat and internet search together:

- Enable chat with webpage
- Enable internet search (globe icon)
- Ask questions that require both page context and external info
- Example: “How do this article’s claims compare to recent research?”
Using with Knowledge Base
Combine webpage content with your documents:

- Enable chat with webpage
- Select knowledge base (database icon)
- Ask questions that cross-reference both sources
- Example: “How does this webpage’s approach compare to my notes?”
Multi-Page Analysis
Analyze multiple pages in one conversation:

- Enable chat with webpage on the first page
- Ask questions and get responses
- Navigate to another page (keep sidebar open)
- New page context automatically replaces old context
- Continue asking questions about the new page
Each page replaces the previous context. To compare pages, copy relevant information into your messages.
Performance Optimization
For Large Pages
Use RAG Mode:
- Enable vector embeddings
- Increase chunk size to 1500
- Increase retrieved docs to 6-8
- Use a capable embedding model like `nomic-embed-text`
For Speed
Use Normal Mode:
- Disable RAG
- Set content size to 8000-10000
- Use faster models (GPT-3.5, local Ollama models)
- If you keep RAG enabled, limit retrieved docs to 3-4
For Accuracy
Optimize RAG:
- Use high-quality embedding models
- Smaller chunk size (800)
- Higher overlap (300)
- More retrieved docs (6-8)
- Use advanced models (GPT-4, Claude)
Troubleshooting
No content extracted
Causes:
- Page uses JavaScript rendering
- Content behind authentication
- Page blocks content extraction
Solutions:
- Wait for the page to fully load
- Disable RAG and try normal mode
- Try refreshing the page
- Use vision mode instead
Responses not relevant
Causes:
- Poor chunk retrieval
- Embedding model not configured
- Too few retrieved docs

Solutions:
- Check that an embedding model is set
- Increase retrieved docs count
- Adjust chunk size and overlap
- Try normal mode instead
Processing too slow
Causes:
- Large page content
- Slow embedding generation
- Network latency
Solutions:
- Use local Ollama embedding models
- Reduce chunk size
- Limit retrieved docs
- Try normal mode for simpler pages
Context window errors
Causes:
- Too much content for model’s limit
- Large chunks with long conversation
Solutions:
- Reduce chunk size
- Reduce retrieved docs
- Use a model with larger context (GPT-4, Claude)
- Start a new chat
Privacy and Security
Data Processing: All webpage content is processed locally in your browser before being sent to your AI provider.
- Webpage content is extracted client-side
- Embeddings are generated locally or via your provider
- Only relevant chunks are sent to AI (in RAG mode)
- No data is stored on Page Assist servers (we don’t have any)
Next Steps
Vision
Analyze webpage screenshots and images
Knowledge Base
Upload documents for persistent context
Internet Search
Combine with real-time web search
Configuration Settings
Configure embedding and retrieval settings