Search Tool

The search tool enables AI assistants to query WebHelp documentation sites efficiently. It automatically chooses between semantic search (when available) and index-based search to deliver the most relevant results.

How It Works

When you invoke the search tool, the server:

Attempts semantic search first (for single-site queries)
Falls back to index-based search if semantic search is unavailable
Returns up to 10 results sorted by relevance score
Provides document IDs that can be used with the fetch tool
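
The flow above can be sketched as a small fallback wrapper. This is a minimal illustration under stated assumptions, not the server's actual code: `searchWithFallback`, `SearchFn`, and the `Hit` shape are hypothetical names, and both strategies are passed in as plain functions so the example runs without network access.

```typescript
// Hypothetical result shape; the real server's types may differ.
interface Hit {
  title: string;
  id: string;
  url: string;
  score: number;
}

type SearchFn = (query: string) => Promise<Hit[]>;

// Try semantic search first; if it rejects, fall back to index-based search.
// Either way, return at most `limit` hits, highest score first.
async function searchWithFallback(
  query: string,
  semantic: SearchFn,
  indexBased: SearchFn,
  limit: number = 10
): Promise<Hit[]> {
  let hits: Hit[];
  try {
    hits = await semantic(query);
  } catch {
    // Assumption: an unavailable semantic service surfaces as a rejection.
    hits = await indexBased(query);
  }
  // Copy before sorting so the caller's array is not mutated.
  return [...hits].sort((a, b) => b.score - a.score).slice(0, limit);
}
```

Injecting the two strategies keeps the fallback logic testable on its own, independent of how either search is implemented.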

For federated search across multiple sites, see Federated Search.

Search Strategies

The WebHelp MCP Server uses two complementary search approaches:

Semantic Search

For single-site queries, the server first attempts semantic search using Oxygen Feedback’s AI-powered search service. This provides natural language understanding and ranks results by semantic relevance.

// From webhelp-search-client.ts:95-99
async semanticSearch(
  query: string,
  baseUrl: string,
  pageSize: number = 10
): Promise<SearchResult>

Semantic search:

Extracts the deployment token from the WebHelp site
Queries the Oxygen Feedback API at feedback.oxygenxml.com
Returns results with relevance scores
Falls back gracefully if unavailable

Index-Based Search

When semantic search is unavailable, or when a query spans multiple sites, the server queries the WebHelp search index directly:

// From webhelp-search-client.ts:56-83
for (const url of urls) {
  await this.loadIndex(url);
  this.indexLoader.performSearch(query, function (r: any) {
    result = r;
  });
  const formatted = this.formatSearchResult(result, url, idx);
  mergedResults.push(...formatted.results);
}
mergedResults.sort((a, b) => b.score - a.score);

Index-based search:

Downloads the WebHelp search index files (index-1.js, index-2.js, etc.)
Loads stopwords and file metadata
Executes the WebHelp search engine (nwSearchFnt.js)
Supports boolean operators (AND, OR)
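
The merge step in the excerpt above can be illustrated in isolation. The sketch below assumes each site's hits carry a relative path and a score, and that the composite id is formed as index:path (the format listed under Result Fields). `SiteHits`, `RankedHit`, and `mergeAndRank` are hypothetical names chosen for this example, not the server's.

```typescript
// Hypothetical per-site input: raw hits with a path relative to the site root.
interface SiteHits {
  baseUrl: string;
  hits: { title: string; path: string; score: number }[];
}

interface RankedHit {
  title: string;
  id: string;
  url: string;
  score: number;
}

// Merge hits from every site into one list, tag each with its site index,
// and sort the combined list by descending score.
function mergeAndRank(sites: SiteHits[]): RankedHit[] {
  const merged: RankedHit[] = [];
  sites.forEach((site, siteIndex) => {
    const base = site.baseUrl.replace(/\/$/, "");
    for (const hit of site.hits) {
      merged.push({
        title: hit.title,
        id: `${siteIndex}:${hit.path}`, // assumed "index:path" composition
        url: `${base}/${hit.path}`,
        score: hit.score,
      });
    }
  });
  return merged.sort((a, b) => b.score - a.score);
}
```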

Usage Example

Here’s how the search tool is defined in the MCP server:

// From app/[...site]/route.ts:37-84
server.tool(
  "search",
  searchDescription,
  {
    query: z
      .string()
      .describe("Search query string (supports boolean operators like AND, OR)"),
  },
  async ({ query }) => {
    const result = await searchClient.search(query);
    const maxResultsToUse = 10;

    if (result.error) {
      return {
        content: [{
          type: "text",
          text: `Search error: ${result.error}`
        }],
        isError: true
      };
    }

    const topResults = result.results.slice(0, maxResultsToUse);
    const results = topResults.map((doc: any) => ({
      title: doc.title,
      id: doc.id,
      url: doc.url
    }));
    return {
      content: [{
        type: "text",
        text: JSON.stringify(results)
      }]
    };
  }
);

Parameters

query (string, required): Search query string. Supports boolean operators like AND and OR for index-based search.
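
MCP clients invoke tools through a JSON-RPC `tools/call` request, so a call to this tool carries the query under `params.arguments`. The sketch below builds such a request. `buildSearchRequest` is a hypothetical helper, and the empty-query check is a client-side choice made for this example, not something the server's schema is known to enforce.

```typescript
// Shape of an MCP "tools/call" request for the search tool.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: { query: string } };
}

// Hypothetical helper: wraps a query string in a tools/call request.
function buildSearchRequest(id: number, query: string): ToolCallRequest {
  const trimmed = query.trim();
  if (trimmed.length === 0) {
    // Client-side guard (assumption): avoid sending an empty query.
    throw new Error("query is required and must not be empty");
  }
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: "search", arguments: { query: trimmed } },
  };
}
```

Most MCP client libraries build this envelope for you; the sketch only shows where the query parameter ends up.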

Return Value

The search tool returns a JSON array of results, serialized as the text content of the tool response:

[
  {
    "title": "Getting Started with WebHelp",
    "id": "0:topics/getting-started.html",
    "url": "https://example.com/docs/topics/getting-started.html"
  },
  {
    "title": "Advanced Configuration",
    "id": "0:topics/configuration.html",
    "url": "https://example.com/docs/topics/configuration.html"
  }
]
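
Because the array arrives as a JSON string inside a text content item, a caller has to unpack it before use. The following is a minimal sketch of that step. `ToolResponse`, `SearchHit`, and `parseSearchResponse` are hypothetical names that mirror the shapes shown on this page.

```typescript
interface SearchHit {
  title: string;
  id: string;
  url: string;
}

// Mirrors the object returned by the tool handler shown in Usage Example.
interface ToolResponse {
  content: { type: string; text: string }[];
  isError?: boolean;
}

// Unpack the first text item and parse it as an array of search hits.
function parseSearchResponse(response: ToolResponse): SearchHit[] {
  const first = response.content[0];
  if (!first || first.type !== "text") {
    throw new Error("Expected a text content item");
  }
  if (response.isError) {
    // On failure the text holds the error message, not JSON.
    throw new Error(first.text);
  }
  const parsed: unknown = JSON.parse(first.text);
  if (!Array.isArray(parsed)) {
    throw new Error("Expected a JSON array of results");
  }
  return parsed.map((item) => ({
    title: String(item.title),
    id: String(item.id),
    url: String(item.url),
  }));
}
```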

Result Fields

title — The document title extracted from the search index
id — Composite identifier in format index:path (used for fetching)
url — Full URL to the document

The id field is what links search to retrieval: pass it unchanged to the fetch tool to retrieve the full document content.
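
If a client needs to inspect an id rather than pass it through, the index:path format can be split at the first colon. This sketch assumes the part before the colon is a non-negative integer index. `parseDocumentId` and `DocumentId` are hypothetical names.

```typescript
interface DocumentId {
  index: number;
  path: string;
}

// Split a composite "index:path" id at the first colon only,
// so any later colon stays part of the path.
function parseDocumentId(id: string): DocumentId {
  const separator = id.indexOf(":");
  if (separator <= 0) {
    throw new Error(`Malformed document id: ${id}`);
  }
  const index = Number(id.slice(0, separator));
  const path = id.slice(separator + 1);
  if (!Number.isInteger(index) || index < 0 || path.length === 0) {
    throw new Error(`Malformed document id: ${id}`);
  }
  return { index, path };
}
```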

Real-World Examples

Searching DITA Documentation

# Claude Desktop example
Search for "publishing output" in DITA OT docs

MCP server configuration:

{
  "mcpServers": {
    "dita-ot-docs": {
      "url": "https://webhelp-mcp.vercel.app/www.dita-ot.org/dev"
    }
  }
}

Searching Oxygen XML Documentation

# Search for "transformation scenarios"

MCP server configuration:

{
  "mcpServers": {
    "oxygen-docs": {
      "url": "https://webhelp-mcp.vercel.app/www.oxygenxml.com/doc/versions/26.1/ug-editor"
    }
  }
}
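
Both configurations above follow the same visible pattern: the WebHelp site's host and path, without the protocol, appended to the MCP server's base URL. The sketch below generates a configuration entry on that assumption. `buildMcpEndpoint` and `buildMcpConfig` are hypothetical helpers, and the pattern is inferred from the two examples rather than from a documented rule.

```typescript
const MCP_HOST = "https://webhelp-mcp.vercel.app";

// Derive the MCP endpoint by dropping the protocol and any trailing slashes
// from the WebHelp site URL (pattern inferred from the examples above).
function buildMcpEndpoint(siteUrl: string): string {
  const sitePath = siteUrl
    .trim()
    .replace(/^https?:\/\//, "")
    .replace(/\/+$/, "");
  if (sitePath.length === 0) {
    throw new Error("siteUrl must not be empty");
  }
  return `${MCP_HOST}/${sitePath}`;
}

// Build an "mcpServers" configuration object with a single named entry.
function buildMcpConfig(
  name: string,
  siteUrl: string
): { mcpServers: Record<string, { url: string }> } {
  return { mcpServers: { [name]: { url: buildMcpEndpoint(siteUrl) } } };
}
```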

Query Tips

Use Specific Terms

“DITA map validation” works better than “checking maps”

Boolean Operators

“publishing AND PDF” to require both terms (index search only)

Natural Language

“How do I publish output?” works well with semantic search

Short Queries

2-5 word queries typically yield better results

Error Handling

The search tool handles various error scenarios:

Index Load Failure

{
  "error": "Failed to load index: HTTP 404: Not Found",
  "results": []
}

This typically means the WebHelp site doesn’t exist or the search index files aren’t accessible.

Search Engine Error

{
  "error": "Search error: Cannot read property 'w' of undefined",
  "results": []
}

Search errors usually indicate a malformed or incomplete search index. Try fetching the index files directly to diagnose the problem.
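
A client that wants to react differently to the two cases could branch on the error text. This is a sketch only: `classifyOutcome` and the `Outcome` type are hypothetical, and matching on the "Failed to load index" prefix assumes the message wording shown above stays stable.

```typescript
// Result shape as shown in the error examples above.
interface SearchResult {
  error?: string;
  results: { title: string; id: string; url: string }[];
}

type Outcome =
  | { kind: "ok"; count: number }
  | { kind: "index-unavailable"; detail: string }
  | { kind: "search-failed"; detail: string };

// Map a raw search result to a coarse outcome a caller can switch on.
function classifyOutcome(result: SearchResult): Outcome {
  if (!result.error) {
    return { kind: "ok", count: result.results.length };
  }
  if (result.error.startsWith("Failed to load index")) {
    // Assumption: this prefix marks a missing or unreachable index.
    return { kind: "index-unavailable", detail: result.error };
  }
  return { kind: "search-failed", detail: result.error };
}
```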

Performance Considerations

Index Loading

The first search request for a site loads the entire search index:

// From webhelp-index-loader.ts:328-354
async loadIndex(baseUrl: string): Promise<void> {
  const searchUrl = `${baseUrl.replace(/\/$/, '')}/oxygen-webhelp/app/search`;
  
  // Download all files
  const nwSearchFntJs = await this.downloadSearchEngine(searchUrl);
  const indexParts = await this.downloadIndexParts(searchUrl);
  const metadataFiles = await this.downloadMetadataFiles(searchUrl);
  
  // Process and initialize
  this.processStopwords(metadataFiles.stopwords);
  this.processFileInfoList(metadataFiles.htmlFileInfoList);
  this.processIndexParts(indexParts);
  this.initializeSearchEngine(nwSearchFntJs);
}

Index files loaded:

nwSearchFnt.js — Search engine code
index-1.js through index-N.js — Word indexes
stopwords.js — Stop words list
htmlFileInfoList.js — File metadata

The loaded index is cached per deployment, so subsequent searches against the same site are much faster.
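
One common way to get this behavior is to memoize the load per base URL. The sketch below is an assumption about how such a cache could look, not a description of the server's implementation. `IndexCache` is a hypothetical class, and the loader is injected so the example runs without downloading anything.

```typescript
// Caches one in-flight or completed load per normalized base URL.
class IndexCache<T> {
  private readonly entries = new Map<string, Promise<T>>();
  private readonly loader: (baseUrl: string) => Promise<T>;

  constructor(loader: (baseUrl: string) => Promise<T>) {
    this.loader = loader;
  }

  get(baseUrl: string): Promise<T> {
    // Normalize so "…/docs" and "…/docs/" share one cache entry.
    const key = baseUrl.replace(/\/$/, "");
    let entry = this.entries.get(key);
    if (entry === undefined) {
      const loading = this.loader(key);
      // Drop failed loads so a later call can retry.
      loading.catch(() => {
        this.entries.delete(key);
      });
      this.entries.set(key, loading);
      entry = loading;
    }
    return entry;
  }
}
```

Storing the promise rather than the resolved value means concurrent first requests for the same site share a single download.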

Result Limits

The server returns a maximum of 10 results to keep responses fast and manageable:

const maxResultsToUse = 10;
const topResults = result.results.slice(0, maxResultsToUse);

Next Steps

Fetch Documents

Retrieve full content after searching

Federated Search

Search multiple sites simultaneously

Semantic Search

Deep dive into AI-powered search

Integration Guide

Connect to AI tools
