Federated Search
Federated search enables querying multiple WebHelp documentation sites in a single request. Results from all sites are merged, scored, and ranked together, providing comprehensive answers across your entire documentation ecosystem.
How It Works
Instead of pointing to a single documentation site, federated search uses a special URL format that encodes multiple site URLs:
https://webhelp-mcp.vercel.app/federated/{encoded-urls}
The server:
Decodes the URL list using the decodeUrls function
Searches each site independently
Merges all results into a single array
Sorts by relevance score
Returns the top 10 results across all sites
Federated search always uses index-based search rather than semantic search, because semantic search scores from different sites are not comparable.
URL Encoding
Federated search uses a compressed encoding scheme to pack multiple URLs into a single path segment:
// From url-pack.ts:11-34
export function encodeUrls(urls: string[]): string {
  if (!urls || urls.length === 0) return '';
  const sorted = [...urls].sort();
  const diffs: string[] = [];
  let last = '';
  for (const url of sorted) {
    if (!last) {
      diffs.push(url);
    } else {
      let i = 0;
      const minLen = Math.min(url.length, last.length);
      while (i < minLen && url[i] === last[i]) i++;
      const suffix = url.slice(i);
      diffs.push(`${i}|${suffix}`);
    }
    last = url;
  }
  const joined = diffs.join('\n');
  const compressed = deflateSync(joined);
  return base64urlEncode(compressed);
}
Encoding steps:
Sort URLs alphabetically to maximize prefix similarity
Compute prefix differences between consecutive URLs
Join differences with newlines
Compress with zlib deflate
Encode as base64url (URL-safe)
The encoding is compact: because sorted URLs tend to share long prefixes, 10 similar URLs might compress to roughly 50-100 characters.
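To make the prefix-difference step concrete, the sketch below runs only the sort and diff stages (no compression) on the three Oxygen URLs used in the usage examples on this page. The helper name prefixDiffs is hypothetical and is not part of url-pack.ts; it simply repeats the loop from encodeUrls so the intermediate form can be inspected.

```typescript
// Hypothetical helper: reproduces only the sort + prefix-diff stages
// of encodeUrls, without deflate or base64url.
function prefixDiffs(urls: string[]): string[] {
  const sorted = [...urls].sort();
  const diffs: string[] = [];
  let last = '';
  for (const url of sorted) {
    if (!last) {
      diffs.push(url); // the first URL is stored in full
    } else {
      let i = 0;
      const minLen = Math.min(url.length, last.length);
      while (i < minLen && url[i] === last[i]) i++;
      diffs.push(`${i}|${url.slice(i)}`); // shared prefix length + remainder
    }
    last = url;
  }
  return diffs;
}

const diffs = prefixDiffs([
  'https://www.oxygenxml.com/doc/versions/26.1/ug-editor/',
  'https://www.oxygenxml.com/doc/versions/26.1/ug-author/',
  'https://www.oxygenxml.com/doc/versions/26.1/ug-developer/',
]);
console.log(diffs);
```

After sorting, the ug-author URL comes first and is kept whole. The other two share a 47-character prefix with their predecessor, so they shrink to "47|developer/" and "47|editor/" before compression is applied.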
URL Decoding
The server decodes the compressed URL list on each request:
// From url-pack.ts:36-61
export function decodeUrls(encoded: string): string[] {
  if (!encoded) return [];
  const decoded = base64urlDecode(encoded);
  const joined = inflateSync(decoded).toString();
  const diffs = joined.split('\n');
  const urls: string[] = [];
  let last = '';
  for (const diff of diffs) {
    const sepIndex = diff.indexOf('|');
    if (sepIndex > -1) {
      const prefixLen = parseInt(diff.slice(0, sepIndex), 10);
      if (!isNaN(prefixLen)) {
        const prefix = last.slice(0, prefixLen);
        const url = prefix + diff.slice(sepIndex + 1);
        urls.push(url);
        last = url;
        continue;
      }
    }
    urls.push(diff);
    last = diff;
  }
  return urls;
}
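The base64urlEncode and base64urlDecode helpers are not shown in these excerpts. As a rough sketch of the compression layer only, the example below substitutes Node's built-in base64url Buffer encoding, which is an assumption and not necessarily what url-pack.ts does, and checks that a joined diff string survives a round trip. The function names pack and unpack are hypothetical.

```typescript
import { deflateSync, inflateSync } from 'node:zlib';

// Hypothetical stand-in for the compress + encode half of encodeUrls.
function pack(joined: string): string {
  return deflateSync(Buffer.from(joined, 'utf8')).toString('base64url');
}

// Hypothetical stand-in for the decode + decompress half of decodeUrls.
function unpack(encoded: string): string {
  return inflateSync(Buffer.from(encoded, 'base64url')).toString('utf8');
}

// A sample newline-joined diff list: one full URL, then prefix diffs.
const joined = [
  'https://www.oxygenxml.com/doc/versions/26.1/ug-author/',
  '47|developer/',
  '47|editor/',
].join('\n');

const packed = pack(joined);
// The packed form should contain only URL-safe characters.
console.log(packed.length, /^[A-Za-z0-9_-]+$/.test(packed));
// The round trip should return the original string.
console.log(unpack(packed) === joined);
```

The base64url alphabet avoids "+", "/", and "=", which is what allows the result to sit in a single path segment.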
Route Handler
The Next.js route handler detects federated mode and decodes URLs:
// From app/[...site]/route.ts:8-14
function resolveBaseUrls(site: Array<string>): string[] {
  if (site[0] === 'federated' && site[1]) {
    return decodeUrls(site[1]);
  }
  const endpoint = site.join('/');
  return [`https://${endpoint}/`];
}
URL structure:
Single site: /www.example.com/docs
Federated: /federated/{encoded-string}
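To see the two URL shapes side by side, the sketch below mirrors the branching in resolveBaseUrls but takes the decoder as a parameter so that it runs standalone. The names resolveWith and stubDecode are hypothetical, and the stub uses a comma-separated format purely for illustration, not the real compressed encoding.

```typescript
type Decoder = (encoded: string) => string[];

// Mirrors the branching in resolveBaseUrls; the decoder is injected
// so the example does not depend on url-pack.ts.
function resolveWith(site: string[], decode: Decoder): string[] {
  if (site[0] === 'federated' && site[1]) {
    return decode(site[1]);
  }
  return [`https://${site.join('/')}/`];
}

// Stub decoder for illustration only: treats the segment as a
// comma-separated host list instead of the real compressed format.
const stubDecode: Decoder = (s) => s.split(',').map((host) => `https://${host}/`);

const single = resolveWith(['www.example.com', 'docs'], stubDecode);
const federated = resolveWith(['federated', 'a.example.com,b.example.com'], stubDecode);
console.log(single);
console.log(federated);
```

One consequence of this branching is that a path of just /federated, with no encoded segment after it, falls through to the single-site branch and is treated as a host name.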
Search Implementation
Federated search queries all sites and merges results:
// From webhelp-search-client.ts:40-88
async search(query: string): Promise<SearchResult> {
  const urls = this.baseUrls;
  // Semantic search only for single sites
  if (urls.length === 1) {
    try {
      const semantic = await this.semanticSearch(query, urls[0]);
      if (!semantic.error && semantic.results.length > 0) {
        return semantic;
      }
    } catch (e) {
      // Fall back to index search
    }
  }
  // Index-based search for federated or fallback
  const mergedResults: SearchResult['results'] = [];
  for (const url of urls) {
    await this.loadIndex(url);
    let result: any = null;
    this.indexLoader.performSearch(query, function (r: any) {
      result = r;
    });
    const idx = urls.indexOf(url);
    const formatted = this.formatSearchResult(result, url, idx);
    mergedResults.push(...formatted.results);
  }
  // Sort all results by score
  mergedResults.sort((a, b) => b.score - a.score);
  return { results: mergedResults };
}
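The semantic path runs only when exactly one base URL is configured, so a federated search over two or more sites uses index-based search even if every site supports semantic search. This avoids mixing semantic scores that are not comparable across sites.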
Federated search results include the site index in the document ID:
[
  {
    "title": "Getting Started",
    "id": "0:topics/getting-started.html",
    "url": "https://site1.example.com/docs/topics/getting-started.html",
    "score": 8.5
  },
  {
    "title": "API Reference",
    "id": "1:reference/api.html",
    "url": "https://site2.example.com/docs/reference/api.html",
    "score": 7.2
  },
  {
    "title": "Configuration",
    "id": "0:topics/configuration.html",
    "url": "https://site1.example.com/docs/topics/configuration.html",
    "score": 6.8
  }
]
ID format: {index}:{path}
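0: First URL in the decoded list
1: Second URL in the decoded list
And so on. Because encodeUrls sorts the URLs alphabetically before encoding, the index reflects sorted order, not the order in which the URLs were originally supplied.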
The fetch tool uses the index to resolve the correct base URL when retrieving documents.
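How the fetch tool performs this lookup is not shown here. A minimal sketch of the idea, assuming each base URL ends with a trailing slash and using a hypothetical helper named resolveDocumentUrl, might look like this:

```typescript
// Hypothetical helper: splits a federated result ID of the form
// "{index}:{path}" and resolves it against the decoded URL list.
// Assumes every base URL ends with "/".
function resolveDocumentUrl(id: string, baseUrls: string[]): string {
  const sep = id.indexOf(':');
  if (sep === -1) throw new Error(`Missing site index in id: ${id}`);
  const index = Number.parseInt(id.slice(0, sep), 10);
  if (Number.isNaN(index) || index < 0 || index >= baseUrls.length) {
    throw new Error(`Site index out of range in id: ${id}`);
  }
  // Split on the first colon only, so a path containing ":" stays intact.
  const path = id.slice(sep + 1);
  return baseUrls[index] + path;
}

const baseUrls = [
  'https://site1.example.com/docs/',
  'https://site2.example.com/docs/',
];
console.log(resolveDocumentUrl('1:reference/api.html', baseUrls));
```

The base URL list passed in must be the decoded (sorted) list, since that is the order the indexes refer to.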
Usage Examples
Searching Multiple Oxygen Products
Search across Oxygen XML Editor, Author, and Developer documentation:
{
  "mcpServers": {
    "oxygen-all": {
      "url": "https://webhelp-mcp.vercel.app/federated/{encoded}"
    }
  }
}
Where {encoded} is the encodeUrls output for these URLs:
https://www.oxygenxml.com/doc/versions/26.1/ug-editor/
https://www.oxygenxml.com/doc/versions/26.1/ug-author/
https://www.oxygenxml.com/doc/versions/26.1/ug-developer/
Searching DITA-OT and Oxygen Together
Combine DITA-OT and Oxygen documentation for comprehensive DITA authoring help:
{
  "mcpServers": {
    "dita-ecosystem": {
      "url": "https://webhelp-mcp.vercel.app/federated/{encoded}"
    }
  }
}
Where {encoded} is the encodeUrls output for these URLs:
https://www.dita-ot.org/dev/
https://www.oxygenxml.com/doc/versions/26.1/ug-editor/
Creating Federated URLs
You can create federated URLs programmatically:
import { encodeUrls } from './lib/url-pack';

const urls = [
  'https://www.dita-ot.org/dev/',
  'https://www.oxygenxml.com/doc/versions/26.1/ug-editor/',
  'https://docs.oasis-open.org/dita/v1.3/'
];
const encoded = encodeUrls(urls);
const federatedUrl = `https://webhelp-mcp.vercel.app/federated/${encoded}`;
console.log(federatedUrl);
Index Loading
Each site’s search index must be loaded independently:
for (const url of urls) {
  await this.loadIndex(url);
  // ... search ...
}
Loading time:
1 site: ~500ms
3 sites: ~1.5s
5 sites: ~2.5s
10 sites: ~5s
Loading many sites sequentially can cause timeouts. Consider limiting to 3-5 sites per federated search.
Search Execution
Searching happens sequentially, not in parallel:
for (const url of urls) {
  await this.loadIndex(url);
  this.indexLoader.performSearch(query, callback);
  // ...
}
This is a current limitation that could be improved with parallel execution.
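One way parallel execution could look is sketched below. It is not the current implementation: it assumes a hypothetical per-site function (searchOne) that loads and searches one site independently, whereas the real client shares a single indexLoader across sites and would need one loader per site before this pattern applies. The Hit type and fakeSearch are illustrative only.

```typescript
interface Hit { title: string; id: string; url: string; score: number; }

// Hypothetical per-site search: loads one site's index and searches it.
type SearchOne = (url: string, query: string, siteIndex: number) => Promise<Hit[]>;

async function searchAllParallel(
  urls: string[],
  query: string,
  searchOne: SearchOne,
): Promise<Hit[]> {
  // Start every site's load + search at once, then wait for all of them.
  const perSite = await Promise.all(urls.map((url, i) => searchOne(url, query, i)));
  const merged = ([] as Hit[]).concat(...perSite);
  return merged.sort((a, b) => b.score - a.score);
}

// Fake site search that takes about 50ms, standing in for index loading.
const fakeSearch: SearchOne = async (url, _query, i) => {
  await new Promise((resolve) => setTimeout(resolve, 50));
  return [{ title: `Hit from site ${i}`, id: `${i}:index.html`, url: `${url}index.html`, score: 10 - i }];
};

const started = Date.now();
searchAllParallel(
  ['https://a.example.com/', 'https://b.example.com/', 'https://c.example.com/'],
  'install',
  fakeSearch,
).then((hits) => console.log(hits.map((h) => h.id), `${Date.now() - started}ms`));
```

With this shape, total latency is bounded by the slowest site rather than the sum of all sites, at the cost of holding every index in memory at the same time.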
Result Merging
Merging and sorting results is fast even with many results:
mergedResults.sort((a, b) => b.score - a.score);
100 results: < 1ms
1000 results: < 10ms
Score Normalization
Different sites may use different scoring scales. The server does not normalize scores, which can affect ranking:
Site A: scores 0-10
Site B: scores 0-100
Results from Site B will dominate the merged list.
This is a known limitation. Future versions may add score normalization.
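As an illustration of what per-site normalization could look like, the sketch below rescales each site's scores by that site's top score before merging. This is one possible approach under stated assumptions, not something the server does today; normalizeSite and the sample data are hypothetical.

```typescript
interface Scored { id: string; score: number; }

// Rescales one site's scores into the 0..1 range by dividing by
// that site's highest score.
function normalizeSite<T extends Scored>(results: T[]): T[] {
  const max = results.reduce((m, r) => Math.max(m, r.score), 0);
  if (max <= 0) return results.map((r) => ({ ...r, score: 0 }));
  return results.map((r) => ({ ...r, score: r.score / max }));
}

// Site A scores on a 0-10 scale, Site B on a 0-100 scale.
const siteA = [{ id: '0:a.html', score: 8 }, { id: '0:b.html', score: 4 }];
const siteB = [{ id: '1:c.html', score: 90 }, { id: '1:d.html', score: 30 }];

const merged = [...normalizeSite(siteA), ...normalizeSite(siteB)]
  .sort((x, y) => y.score - x.score);
console.log(merged.map((r) => `${r.id}=${r.score.toFixed(2)}`));
```

Without normalization both Site B results would outrank everything from Site A. The trade-off of max-scaling is that every site's best hit becomes 1.0, so a weak best match on one site ties with a strong best match on another.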
Best Practices
Limit Sites: Keep federated searches to 3-5 sites for acceptable performance.
Group Related Docs: Federate related documentation sets, not random sites.
Test Scoring: Verify that results from all sites appear in merged output.
Cache Configs: Save encoded URLs in MCP configs rather than generating them each time.
Limitations
No Semantic Search
Federated search always uses index-based search, even if all sites support semantic search.
Reason: Semantic search scores from different sites aren’t comparable.
Sequential Loading
Sites are loaded and searched sequentially, not in parallel.
Impact: Response time scales linearly with the number of sites.
No Score Normalization
Scores from different sites are merged without normalization.
Impact: Results from high-scoring sites may dominate.
No Site Labels
Search results don’t indicate which site each result came from (except via the index in the ID).
Workaround: Parse the URL or ID to determine the source site.
Error Handling
Partial Failures
If one site fails to load, the entire federated search fails:
for (const url of urls) {
  try {
    await this.loadIndex(url);
  } catch (error: any) {
    return {
      error: `Failed to load index: ${error.message}`,
      results: []
    };
  }
}
A single unavailable site breaks the entire federated search. Consider adding fallback logic for production use.
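A minimal sketch of such fallback logic follows. It keeps the sequential loop but records a failing site and moves on instead of returning early. The names searchTolerant, SearchSite, and flakySearch are hypothetical, and the per-site function stands in for the loadIndex plus performSearch pair.

```typescript
interface Hit { id: string; score: number; }

// Hypothetical per-site search; rejects if the site's index cannot be loaded.
type SearchSite = (url: string, siteIndex: number) => Promise<Hit[]>;

async function searchTolerant(
  urls: string[],
  searchSite: SearchSite,
): Promise<{ results: Hit[]; failed: string[] }> {
  const results: Hit[] = [];
  const failed: string[] = [];
  for (let i = 0; i < urls.length; i++) {
    try {
      results.push(...(await searchSite(urls[i], i)));
    } catch (error) {
      failed.push(urls[i]); // remember the failure, keep searching other sites
    }
  }
  results.sort((a, b) => b.score - a.score);
  return { results, failed };
}

// Simulated sites: any URL containing "down" fails to load.
const flakySearch: SearchSite = async (url, i) => {
  if (url.indexOf('down') !== -1) throw new Error('index unavailable');
  return [{ id: `${i}:index.html`, score: 5 + i }];
};

searchTolerant(
  ['https://a.example.com/', 'https://down.example.com/', 'https://c.example.com/'],
  flakySearch,
).then((out) => console.log(out.results.map((r) => r.id), out.failed));
```

Returning the list of failed sites alongside the results lets the caller decide whether a partial answer is acceptable.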
Invalid Encoded URLs
If the encoded URL parameter is malformed:
export function decodeUrls(encoded: string): string[] {
  if (!encoded) return [];
  const decoded = base64urlDecode(encoded);
  const joined = inflateSync(decoded).toString();
  // ...
}
Malformed encoding will throw an error during decompression.
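A caller that wants a structured error rather than an uncaught exception could wrap the decoder, roughly as sketched below. The wrapper safeDecode and the DecodeResult type are hypothetical. The rawDecode function is a simplified stand-in for decodeUrls that uses Node's Buffer base64url support (an assumption) and skips prefix expansion.

```typescript
import { deflateSync, inflateSync } from 'node:zlib';

type DecodeResult =
  | { ok: true; urls: string[] }
  | { ok: false; error: string };

// Wraps any decoder (such as decodeUrls) so that malformed input yields
// a structured error instead of an uncaught exception.
function safeDecode(encoded: string, decode: (s: string) => string[]): DecodeResult {
  try {
    const urls = decode(encoded);
    const invalid = urls.filter((u) => !/^https?:\/\//.test(u));
    if (urls.length === 0 || invalid.length > 0) {
      return { ok: false, error: 'Encoded segment did not decode to a list of http(s) URLs' };
    }
    return { ok: true, urls };
  } catch (e) {
    return { ok: false, error: e instanceof Error ? e.message : String(e) };
  }
}

// Simplified stand-in for decodeUrls: inflate and split, no prefix expansion.
const rawDecode = (s: string): string[] =>
  inflateSync(Buffer.from(s, 'base64url')).toString('utf8').split('\n');

const good = deflateSync(Buffer.from('https://a.example.com/')).toString('base64url');
console.log(safeDecode(good, rawDecode));
console.log(safeDecode('not-valid', rawDecode));
```

The extra scheme check catches input that happens to decompress cleanly but does not contain URLs, a case that decompression alone would not flag.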
Future Improvements
Potential enhancements to federated search:
Parallel loading — Load and search sites concurrently
Score normalization — Normalize scores to a 0-1 range per site
Partial success — Return results even if some sites fail
Site labels — Include site name or index in results
Semantic federation — Support semantic search across multiple sites
Result diversity — Ensure results from all sites appear in top results
Next Steps
Search Tool: Learn about single-site search
Fetch Tool: Retrieve documents from federated results
Integration Guide: Set up federated search in Claude Desktop
URL Encoding: Deep dive into the encoding scheme