Chat with Webpage allows you to have intelligent conversations about the current page you’re viewing. The AI reads and understands the page content, enabling you to ask questions, get summaries, extract information, or analyze the content.

How It Works

Chat with Webpage extracts text content from the current page and provides it to your AI model as context. You can choose between two processing modes: RAG mode, which splits the page into chunks and retrieves only the most relevant ones using vector embeddings, and normal mode, which sends the raw page text directly (up to a configurable size). Both are covered below.
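Conceptually, the extraction step pulls the visible text out of the page's HTML while skipping non-content elements. This is a minimal sketch using Python's standard-library parser, not the extension's actual extractor:

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Very simplified model of pulling visible text from a page's HTML."""
    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0  # inside <script>/<style>, ignore text

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.parts.append(data.strip())

def extract_text(html: str) -> str:
    parser = TextExtractor()
    parser.feed(html)
    return " ".join(parser.parts)

print(extract_text("<p>Hello</p><script>var x=1;</script><p>World</p>"))
# → Hello World
```

The extracted text then becomes the context for whichever processing mode you choose.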

Getting Started

Step 1: Enable Chat with Webpage

In the Sidebar:
  1. Open the sidebar on any webpage (Ctrl+Shift+Y)
  2. Click the webpage icon in the input area to enable
  3. The icon highlights when active
In the Web UI:
  1. Open Web UI (Ctrl+Shift+L)
  2. Navigate to the webpage you want to analyze
  3. Enable the webpage chat mode
Step 2: Configure RAG (Optional)

For better performance with long pages:
  1. Go to Settings → RAG Settings
  2. Select an embedding model (recommended: nomic-embed-text)
  3. Configure chunk size (default: 1000)
  4. Configure chunk overlap (default: 200)
  5. Save settings
Use embedding models designed for text embedding, not chat models.
Step 3: Start Chatting

With webpage mode enabled, start asking questions:
  • “Summarize this article”
  • “What are the main points?”
  • “Extract all email addresses”
  • “Explain this concept in simple terms”
Customize how the sidebar handles webpage content:

Using RAG Mode

By default, the sidebar uses RAG with vector embeddings:
  1. Open sidebar and click the settings icon
  2. Find “Copilot Chat With Website Settings”
  3. Ensure “Chat with website using vector embeddings” is enabled
  4. Configure your embedding model in Settings → RAG Settings
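In RAG mode, the page chunks and your question are both embedded, and only the chunks most similar to the question are sent to the model. A toy sketch of that retrieval step (illustrative only; the toy 3-dimensional vectors stand in for a real embedding model's output):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def top_k_chunks(question_vec, chunk_vecs, chunks, k=4):
    """Return the k chunks whose embeddings are closest to the question's."""
    scored = sorted(zip(chunks, chunk_vecs),
                    key=lambda pair: cosine(question_vec, pair[1]),
                    reverse=True)
    return [chunk for chunk, _ in scored[:k]]

# Hypothetical chunks with made-up embeddings
chunks = ["pricing info", "contact details", "product overview"]
vecs = [[0.9, 0.1, 0.0], [0.0, 0.2, 0.9], [0.7, 0.6, 0.1]]
question_vec = [1.0, 0.0, 0.0]  # a question about pricing

print(top_k_chunks(question_vec, vecs, chunks, k=2))
# → ['pricing info', 'product overview']
```

Only the retrieved chunks reach the model, which is why RAG mode scales to long pages.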

Using Normal Mode

For simpler, faster processing without embeddings:
Step 1: Disable RAG

  1. Open sidebar settings
  2. Find “Copilot Chat With Website Settings”
  3. Disable “Chat with website using vector embeddings”
Step 2: Adjust Content Size

Increase how much content is sent:
  1. Find “Normal mode website content size”
  2. Increase the value (default: 10000 characters)
  3. Higher values include more content but may hit token limits
Normal mode is limited by your model’s context window. For GPT-3.5, keep content under 4000 tokens. For GPT-4 or Claude, you can use much larger values.
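In effect, normal mode just cuts the page text off at the configured character count. A minimal model of that behavior, with a rough characters-to-tokens estimate (the ~4 characters per token figure is a common rule of thumb for English, not an exact conversion):

```python
def prepare_normal_mode_context(page_text: str, max_chars: int = 10000) -> str:
    """Simplified model of 'Normal mode website content size':
    page text beyond the limit is simply dropped."""
    return page_text[:max_chars]

def estimate_tokens(text: str) -> int:
    """Rough rule of thumb: ~4 characters per English token."""
    return len(text) // 4

long_page = "x" * 25000
context = prepare_normal_mode_context(long_page, max_chars=10000)
print(len(context), estimate_tokens(context))
# 10000 characters ≈ 2500 tokens, comfortably inside most context windows
```

This is why the default of 10000 characters stays under the 4000-token guideline mentioned above.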

Enable by Default

Automatically enable chat with webpage when opening sidebar:
  1. Open sidebar settings
  2. Find “Enable Chat with Website by default (Copilot)”
  3. Toggle on
  4. The sidebar will now always start in webpage mode

RAG Configuration

Optimize RAG settings for webpage analysis:

Embedding Model Selection

Choose a model designed for text embedding rather than chat; nomic-embed-text is the recommended default, and you configure it under Settings → RAG Settings.

Chunk Settings

Optimize how content is split:
| Setting | Recommended Value | Description |
| --- | --- | --- |
| Chunk Size | 1000 | Characters per chunk |
| Chunk Overlap | 200 | Overlap between chunks |
| Retrieved Docs | 4-6 | Number of relevant chunks to use |
| Splitting Strategy | RecursiveCharacterTextSplitter | Best for web content |
Chunk Size determines how page content is divided:
  • Smaller chunks (500-800): More precise retrieval, better for specific questions
  • Larger chunks (1000-1500): More context per chunk, better for summaries
Overlap ensures important information at chunk boundaries isn’t lost.
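The size/overlap mechanics can be sketched with a plain character-based splitter (real splitters such as RecursiveCharacterTextSplitter also prefer natural boundaries like paragraphs and sentences; this sketch ignores that):

```python
def split_into_chunks(text: str, chunk_size: int = 1000, overlap: int = 200):
    """Split text into chunks of chunk_size characters, each sharing
    `overlap` characters with the previous chunk."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    step = chunk_size - overlap  # how far the window advances each time
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break
    return chunks

page = "".join(str(i % 10) for i in range(2500))
chunks = split_into_chunks(page, chunk_size=1000, overlap=200)
print(len(chunks))  # → 3
# The last 200 chars of one chunk repeat at the start of the next,
# so a sentence straddling a boundary survives intact in one of them.
```

With the defaults (1000/200), each chunk advances 800 characters, so a 2500-character page yields 3 chunks.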

Custom RAG Prompts

Customize the system prompt for webpage analysis:
  1. Go to Settings → RAG Settings
  2. Scroll to “Configure RAG Prompt”
  3. Select the RAG tab
  4. Edit the system and question prompts
  5. Available variables:
    • {context} - Retrieved webpage chunks (don’t remove)
    • {question} - User’s question
Do not remove the {context} variable - it’s required for RAG to work properly.
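The variables are filled in by simple template substitution, which is why {context} must survive any edits. A sketch (the prompt wording here is illustrative, not the extension's default):

```python
# Hypothetical RAG prompt template using the two documented variables
rag_prompt = (
    "Answer using only the following webpage excerpts:\n"
    "{context}\n\n"
    "Question: {question}"
)

def build_prompt(template: str, context: str, question: str) -> str:
    """Fill the RAG template; without {context}, the retrieved
    chunks would never be injected into the prompt."""
    if "{context}" not in template:
        raise ValueError("RAG prompt template must contain {context}")
    return template.format(context=context, question=question)

prompt = build_prompt(rag_prompt,
                      context="chunk A\nchunk B",
                      question="What does chunk A say?")
print(prompt)
```

If {context} is deleted, the model receives your question with no page content at all.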

Use Cases

Research

  • Summarize research papers
  • Extract key findings
  • Compare multiple sources
  • Generate citations

Learning

  • Understand complex documentation
  • Get explanations in simple terms
  • Generate study notes
  • Create quiz questions

Shopping

  • Compare product features
  • Extract specifications
  • Summarize reviews
  • Find best deals

News

  • Summarize articles
  • Extract key points
  • Fact-check claims
  • Get different perspectives

Advanced Techniques

Combining with Internet Search

Use webpage chat and internet search together:
  1. Enable chat with webpage
  2. Enable internet search (globe icon)
  3. Ask questions that require both page context and external info
  4. Example: “How do this article’s claims compare to recent research?”

Using with Knowledge Base

Combine webpage content with your documents:
  1. Enable chat with webpage
  2. Select knowledge base (database icon)
  3. Ask questions that cross-reference both sources
  4. Example: “How does this webpage’s approach compare to my notes?”

Multi-Page Analysis

Analyze multiple pages in one conversation:
  1. Enable chat with webpage on first page
  2. Ask questions and get responses
  3. Navigate to another page (keep sidebar open)
  4. New page context automatically replaces old context
  5. Continue asking questions about the new page
Each page replaces the previous context. To compare pages, copy relevant information into your messages.

Performance Optimization

For long pages, use RAG mode:
  • Enable vector embeddings
  • Increase chunk size to 1500
  • Increase retrieved docs to 6-8
  • Use a capable embedding model like nomic-embed-text
For faster responses, use normal mode:
  • Disable RAG
  • Set content size to 8000-10000
  • Use faster models (GPT-3.5, local Ollama models)
  • If you keep RAG enabled, limit retrieved docs to 3-4
For better answer quality, tune RAG:
  • Use high-quality embedding models
  • Smaller chunk size (800)
  • Higher overlap (300)
  • More retrieved docs (6-8)
  • Use advanced models (GPT-4, Claude)

Troubleshooting

Page content not detected

Causes:
  • Page uses JavaScript rendering
  • Content behind authentication
  • Page blocks content extraction
Solutions:
  • Wait for page to fully load
  • Disable RAG and try normal mode
  • Try refreshing the page
  • Use vision mode instead
Poor or irrelevant answers

Causes:
  • Poor chunk retrieval
  • Embedding model not configured
  • Retrieved docs too few
Solutions:
  • Check embedding model is set
  • Increase retrieved docs count
  • Adjust chunk size and overlap
  • Try normal mode instead
Slow responses

Causes:
  • Large page content
  • Slow embedding generation
  • Network latency
Solutions:
  • Use local Ollama embedding models
  • Reduce chunk size
  • Limit retrieved docs
  • Try normal mode for simpler pages
Context length errors

Causes:
  • Too much content for model’s limit
  • Large chunks with long conversation
Solutions:
  • Reduce chunk size
  • Reduce retrieved docs
  • Use a model with larger context (GPT-4, Claude)
  • Start a new chat

Privacy and Security

Data Processing: All webpage content is processed locally in your browser before being sent to your AI provider.
Sensitive Information: Be cautious when analyzing pages with sensitive information. The content will be sent to your configured AI provider.
  • Webpage content is extracted client-side
  • Embeddings are generated locally or via your provider
  • Only relevant chunks are sent to AI (in RAG mode)
  • No data is stored on Page Assist servers (we don’t have any)

Next Steps

Vision

Analyze webpage screenshots and images

Knowledge Base

Upload documents for persistent context

Internet Search

Combine with real-time web search

Configuration Settings

Configure embedding and retrieval settings
