Khoj’s online search capability brings real-time information into your conversations. When current events, fresh data, or web research is needed, Khoj automatically searches the internet and cites sources.

How It Works

Khoj intelligently decides when to search online:
1. Intent Detection

Khoj analyzes your query to determine if it requires current information:
  • News and current events
  • Recent product releases or updates
  • Real-time data (weather, stock prices, scores)
  • URLs or specific web content
  • Questions beyond its training data cutoff
2. Web Search

Khoj searches the internet using configured search providers, gathering:
  • Search results from multiple sources
  • Webpage content and summaries
  • Answer boxes and knowledge graphs
  • Social media posts (if configured)
3. Content Retrieval

Relevant webpages are fetched and processed:
  • Full text extraction
  • Cleaned and formatted for readability
  • Images and media identified
4. Response Generation

Information is synthesized into a coherent answer with:
  • Clear citations to source URLs
  • Quotes from original content
  • Multiple perspectives when relevant
Khoj automatically searches online for queries like:
Current Events:
What's the latest news about the Mars rover?
Recent Information:
What did Apple announce at their last event?
Real-time Data:
What's the weather forecast for New York this weekend?
Web Content:
Summarize this article: https://example.com/article
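
To make the idea of intent detection concrete, here is a minimal sketch of a keyword-and-URL heuristic that flags queries like the ones above. It is an illustration only: the cue list and the `needs_online_search` function are hypothetical and are not how Khoj itself makes this decision.

```python
import re

# Hypothetical freshness cues chosen for illustration; not Khoj's actual logic.
FRESHNESS_CUES = ("latest", "recent", "today", "this weekend", "news", "forecast", "announce")
URL_PATTERN = re.compile(r"https?://\S+")


def needs_online_search(query: str) -> bool:
    """Return True if the query contains a URL or one of the freshness cues."""
    if URL_PATTERN.search(query):
        return True
    lowered = query.lower()
    return any(cue in lowered for cue in FRESHNESS_CUES)


for q in [
    "What's the latest news about the Mars rover?",
    "Summarize this article: https://example.com/article",
    "Explain how binary search works",
]:
    print(f"{needs_online_search(q)!s:5}  {q}")
```

A fixed keyword list like this misses many phrasings, which is one reason a real system would likely rely on a language model rather than string matching.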

Example Queries

What's happening with AI regulation in the EU?
Latest developments in quantum computing
What did the Fed announce about interest rates?
Khoj searches recent news and provides multiple sources for balanced coverage.

Compare the latest iPhone vs Samsung flagship
Reviews of the Sony WH-1000XM5 headphones
What's the best budget laptop for programming?

Best pizza places in Brooklyn
Hiking trails near Denver
Events happening in Austin this weekend

How to fix a leaky faucet
Best practices for React hooks in 2024
/online Tutorial for setting up Docker on Ubuntu

Summarize this article: https://en.wikipedia.org/wiki/Haitian_Revolution
What are the key points in this blog post? [paste URL]
Khoj reads the full webpage and provides structured summaries.

/online Compare MongoDB vs PostgreSQL for my use case
Pros and cons of remote work according to recent studies
What are experts saying about electric vs hybrid cars in 2024?
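
Some of the queries above start with the /online command. As a rough sketch of how a chat client might separate a leading slash command from the question itself, consider the following. The `split_command` helper and the `KNOWN_COMMANDS` set are hypothetical names for illustration, not part of Khoj's API.

```python
from typing import Optional

# Commands mentioned in this guide; the parsing logic is a hypothetical sketch.
KNOWN_COMMANDS = {"/online", "/research"}


def split_command(message: str) -> tuple[Optional[str], str]:
    """Split a leading slash command from the rest of the message."""
    head, _, rest = message.strip().partition(" ")
    if head in KNOWN_COMMANDS:
        return head, rest.strip()
    # No recognized command: return the whole message unchanged.
    return None, message.strip()


print(split_command("/online Tutorial for setting up Docker on Ubuntu"))
print(split_command("How to fix a leaky faucet"))
```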

Citations and Sources

Khoj always shows where information came from:
What you’ll see:
  • Inline citations: [1], [2], [3] markers in the response
  • Source list: Full URLs and titles at the end
  • Clickable links: Navigate to original sources
  • Multiple sources: Balanced perspective from different publishers
Always verify critical information by clicking through to the original sources.
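
The citation layout described above (numbered inline markers plus a source list at the end) can be sketched as follows. The `Source` class, the `render_with_citations` function, and the example.com URLs are placeholders for illustration; the exact format Khoj renders may differ.

```python
from dataclasses import dataclass


@dataclass
class Source:
    title: str
    url: str


def render_with_citations(sentences: list[tuple[str, int]], sources: list[Source]) -> str:
    """Append a [n] marker to each sentence and list the numbered sources at the end.

    Each sentence is paired with the zero-based index of the source that supports it.
    """
    body = " ".join(f"{text} [{idx + 1}]" for text, idx in sentences)
    refs = "\n".join(f"[{n}] {s.title} - {s.url}" for n, s in enumerate(sources, start=1))
    return f"{body}\n\nSources:\n{refs}"


# Placeholder data for illustration only.
sources = [
    Source("Example article A", "https://example.com/a"),
    Source("Example article B", "https://example.com/b"),
]
answer = render_with_citations(
    [("First claim.", 0), ("Second claim.", 1)],
    sources,
)
print(answer)
```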

Combined Search: Web + Your Documents

Khoj can blend online research with your personal knowledge:
Based on my notes about machine learning and the latest online research, what should I focus on learning next?
Compare my investment strategy notes with current financial expert recommendations online
1. Khoj searches your documents
2. Simultaneously searches the web
3. Synthesizes both sources into a unified answer
4. Cites both your notes and web sources
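
One way to picture the steps above is two searches running concurrently and their results being merged. The sketch below uses `asyncio.gather` with stand-in functions; `search_notes`, `search_web`, and `combined_search` are hypothetical names that return canned data, not Khoj internals.

```python
import asyncio


async def search_notes(query: str) -> list[str]:
    """Stand-in for a personal document search; returns a canned result."""
    await asyncio.sleep(0.01)  # simulate I/O latency
    return [f"note match for '{query}'"]


async def search_web(query: str) -> list[str]:
    """Stand-in for a web search; returns a canned result."""
    await asyncio.sleep(0.01)  # simulate I/O latency
    return [f"web match for '{query}'"]


async def combined_search(query: str) -> dict[str, list[str]]:
    # Run both searches concurrently and keep the results labeled by origin
    # so that each can be cited separately later.
    notes, web = await asyncio.gather(search_notes(query), search_web(query))
    return {"notes": notes, "web": web}


result = asyncio.run(combined_search("machine learning"))
print(result)
```

Keeping the two result sets labeled by origin is what makes it possible to cite notes and web sources separately in the final answer.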

Self-Hosting Configuration

Set up online search for your self-hosted Khoj instance:

Search Providers

Included in docker-compose.yml by default
No configuration needed! SearXNG runs automatically when you use:
docker-compose up
Benefits:
  • Completely self-hosted and private
  • Aggregates results from multiple search engines
  • No API keys required
  • Free and open source
SearXNG is the default option and works out of the box with Docker
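
If you want to check that your SearXNG instance responds, a small script along these lines may help. It is a sketch under two assumptions: that SearXNG is reachable at http://localhost:8080 (adjust to your deployment), and that the JSON output format is enabled in the SearXNG settings. The `build_search_url` and `search` helpers are illustrative names, not part of Khoj.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed address of a local SearXNG container; adjust to your deployment.
SEARXNG_URL = "http://localhost:8080"


def build_search_url(query: str, base_url: str = SEARXNG_URL) -> str:
    """Build a SearXNG search URL that requests JSON output."""
    return f"{base_url}/search?{urlencode({'q': query, 'format': 'json'})}"


def search(query: str, base_url: str = SEARXNG_URL) -> list[dict]:
    """Query SearXNG and return its result entries (needs a running instance)."""
    with urlopen(build_search_url(query, base_url), timeout=10) as response:
        return json.load(response).get("results", [])


# Only builds the URL, so this line runs without a SearXNG instance.
print(build_search_url("hiking trails near Denver"))
```

Calling `search(...)` requires the instance to be up; if the JSON format is not enabled, SearXNG is likely to reject the request.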

Webpage Reading

Configure how Khoj reads and extracts webpage content:
Built-in, no setup required
Khoj uses Python’s requests library to fetch webpages.
  • Works immediately
  • No API keys needed
  • Basic content extraction
  • May struggle with JavaScript-heavy sites
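
To illustrate what basic content extraction from fetched HTML can look like, here is a minimal sketch using only Python's standard library. It is not Khoj's extraction code; the `TextExtractor` class and `extract_text` function are hypothetical. It also hints at why JavaScript-heavy sites are hard: script contents are skipped rather than executed, so anything rendered by scripts never appears.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping the contents of <script> and <style>."""

    SKIP = {"script", "style"}

    def __init__(self) -> None:
        super().__init__()
        self.parts: list[str] = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside skipped elements.
        if not self._skip_depth and data.strip():
            self.parts.append(data.strip())


def extract_text(html: str) -> str:
    """Return the visible text of an HTML document as one space-joined string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


sample = (
    "<html><head><style>p{}</style></head><body><h1>Title</h1>"
    "<script>var x=1;</script><p>Hello <b>world</b></p></body></html>"
)
print(extract_text(sample))  # prints "Title Hello world"
```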
You can use different providers for search and webpage reading. For example:
  • Search with Serper.dev
  • Read webpages with Firecrawl

Best Practices

Be Specific

“Latest iPhone reviews” is better than “phone reviews”

Include Time Context

“2024 tax deadlines” instead of just “tax deadlines”

Ask for Structure

“List pros and cons” or “summarize in bullet points”

Verify Sources

Always click through to cited sources for important information

Limitations

Be aware of these constraints:
Khoj cannot access content behind paywalls or on sites that require a login. Workaround: manually copy the content and paste it into the chat.
Search results may be a few minutes old, not instantaneous. Not suitable for: live sports scores, stock tickers, breaking news.
Results may favor certain regions depending on the search provider. Tip: include a location in the query, such as “news in Japan” rather than just “news”.
API-based search providers have usage limits. Self-hosting: monitor your API usage and costs.

Privacy Considerations

When using online search:
  • Your queries are sent to the search provider
  • Visited URLs may be logged by the search service
  • Use SearXNG (self-hosted) for maximum privacy
  • Khoj does not store or log your search queries

Troubleshooting

Cloud users:
  • Online search should work automatically; report the issue if it does not
Self-hosted:
  • Check Docker logs: docker-compose logs searxng
  • Verify environment variables are set correctly
  • Ensure API keys are valid
  • Restart Khoj after configuration changes

Try:
  • Make the query more specific
  • Include time context (“2024”, “recent”, “latest”)
  • Use /online to force web search
  • Try a different search provider

Possible causes:
  • Paywall or login required
  • JavaScript-heavy site
  • Bot protection
Solutions:
  • Try Olostep for complex sites
  • Manually copy content and paste in chat
  • Use reader mode in browser first

Reasons:
  • Fetching multiple webpages
  • Slow search API
  • Complex webpage rendering
Solutions:
  • Use Serper.dev for faster search
  • Be more specific to reduce scope
  • Consider upgrading self-hosted hardware

Advanced: Research Mode

For comprehensive research, use the /research command:
/research What are the environmental impacts of cryptocurrency mining?
Research mode:
  • Performs multiple searches
  • Reads more sources
  • Provides deeper analysis
  • Takes longer but is more thorough
  • Better citations and cross-referencing
Use /research for important questions where accuracy and depth matter most.
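
As a rough mental model of "multiple searches, more sources", the sketch below runs one search per sub-query and collects unique source URLs. It is a simplification under stated assumptions, not Khoj's research implementation: the `research` function, the `fake_search` stand-in, and the example.com URLs are all hypothetical.

```python
from typing import Callable


def research(
    sub_queries: list[str],
    search: Callable[[str], list[str]],
    max_sources: int = 5,
) -> list[str]:
    """Run one search per sub-query and return unique URLs in first-seen order."""
    seen: list[str] = []
    for query in sub_queries:
        for url in search(query):
            if url not in seen:
                seen.append(url)
            if len(seen) >= max_sources:
                return seen
    return seen


def fake_search(query: str) -> list[str]:
    """Stand-in search that returns placeholder URLs derived from the query."""
    slug = query.split()[0].lower()
    return [f"https://example.com/{slug}", "https://example.com/overview"]


print(research(["energy use of mining", "e-waste from mining hardware"], fake_search))
```

Deduplicating by URL keeps a source that appears in several searches from being read or cited twice.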

Next Steps

Automate Research

Schedule recurring online research tasks

Chat Features

Learn other slash commands and capabilities

Code Execution

Combine web data with code analysis
