The Grok Search MCP Server is optimized for both speed and accuracy. Understanding performance characteristics helps you configure optimal search parameters.
Response Times
Basic Mode
Average: 3-8 seconds
Range: 2-15 seconds
Factors affecting speed:
- Query complexity
- Number of results requested
- Network latency
- API load
Comprehensive Mode
Average: 20-40 seconds
Range: 15-60 seconds
Factors affecting speed:
- All basic mode factors
- Analysis depth requirements
- Timeline generation
- Quote extraction
- Multiple perspective analysis
Use basic mode for quick results and comprehensive mode only when detailed analysis is needed.
Caching Strategy
The server implements intelligent caching for expensive operations.
Cache Configuration
class SearchCache {
  constructor(maxSize = 100, ttlMinutes = 30) {
    this.cache = new Map();
    this.maxSize = maxSize;
    this.ttl = ttlMinutes * 60 * 1000;
  }

  get(key) {
    const item = this.cache.get(key);
    if (!item) return null;
    if (Date.now() - item.timestamp > this.ttl) {
      this.cache.delete(key);
      return null;
    }
    // Re-insert to mark the entry as most recently used
    // (Map iterates in insertion order, so the first key is the LRU entry)
    this.cache.delete(key);
    this.cache.set(key, item);
    return item.data;
  }

  set(key, data) {
    if (this.cache.size >= this.maxSize) {
      // Evict the least recently used entry
      const firstKey = this.cache.keys().next().value;
      this.cache.delete(firstKey);
    }
    this.cache.set(key, {
      data,
      timestamp: Date.now()
    });
  }
}
Cache Settings:
- Size: 100 items maximum
- TTL: 30 minutes
- Eviction: Least Recently Used (LRU)
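For a runnable demonstration, here is a compact copy of the class (with get() re-inserting entries so eviction is least recently used, matching the documented LRU behavior) plus a small exercise:

```javascript
// Compact copy of the SearchCache pattern for a runnable demo.
class SearchCache {
  constructor(maxSize = 100, ttlMinutes = 30) {
    this.cache = new Map();
    this.maxSize = maxSize;
    this.ttl = ttlMinutes * 60 * 1000;
  }
  get(key) {
    const item = this.cache.get(key);
    if (!item) return null;
    if (Date.now() - item.timestamp > this.ttl) {
      this.cache.delete(key);
      return null;
    }
    this.cache.delete(key);   // refresh recency
    this.cache.set(key, item);
    return item.data;
  }
  set(key, data) {
    if (this.cache.size >= this.maxSize) {
      const firstKey = this.cache.keys().next().value;  // LRU entry
      this.cache.delete(firstKey);
    }
    this.cache.set(key, { data, timestamp: Date.now() });
  }
}

const cache = new SearchCache(2, 30);  // capacity 2, 30-minute TTL
cache.set("a", 1);
cache.set("b", 2);
cache.get("a");       // "a" is now the most recently used entry
cache.set("c", 3);    // evicts "b", the least recently used
console.log(cache.get("a"), cache.get("b"), cache.get("c"));  // 1 null 3
```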
What Gets Cached
Cached:
- Comprehensive analysis results only
- Complete query + parameters as cache key
Not Cached:
- Basic search results (always fresh)
- Health check responses
- Error responses
Cache Key Composition
const cacheKey = `${sanitizedQuery}:${searchType}:${maxResults}:${JSON.stringify(handles)}:${fromDate}:${toDate}:comprehensive`;
Cache key includes:
- Sanitized query text
- Search type (web/news/twitter)
- Max results parameter
- Twitter handles (if applicable)
- Date range (from_date and to_date)
- Analysis mode (only “comprehensive”)
Changing any parameter creates a new cache entry. Queries with different dates are cached separately.
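To see how parameter changes produce distinct entries, the key template above can be sketched as a standalone helper (buildCacheKey is a hypothetical name; the server builds the key inline):

```javascript
// Hypothetical helper mirroring the cache-key template above.
function buildCacheKey(sanitizedQuery, searchType, maxResults, handles, fromDate, toDate) {
  return `${sanitizedQuery}:${searchType}:${maxResults}:${JSON.stringify(handles)}:${fromDate}:${toDate}:comprehensive`;
}

const keyA = buildCacheKey("ai news", "web", 10, null, "2025-06-01", "2025-06-07");
const keyB = buildCacheKey("ai news", "web", 10, null, "2025-06-01", "2025-06-08");

console.log(keyA === keyB);  // false: a different to_date means a separate cache entry
```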
Cache Hit:
const cached = this.cache.get(cacheKey);
if (cached) {
  Logger.debug("Cache hit for comprehensive analysis", { query: sanitizedQuery });
  return cached;
}
Performance gain: several hundred times faster (under 100 ms vs 20-40 seconds)
Monitor cache:
{
  "tool": "health_check",
  "parameters": {}
}
Response includes:
{
  "api_details": {
    "cacheSize": 12  // Number of cached items
  }
}
Request Optimization
Query Optimization
Optimal Query Length
Recommended: 2-10 words
Maximum: 1000 characters
if (sanitizedQuery.length > 1000) {
  throw new Error("Search query too long (max 1000 characters)");
}
Good queries:
- “AI developments 2025”
- “climate change policy updates”
- “SpaceX Starship launch”
Poor queries:
- “a” (too short, vague)
- “tell me everything about…” (unnecessary words)
- 500-word detailed description (too long)
Query Sanitization
The server automatically sanitizes queries:
const sanitizedQuery = query.trim().replace(/[\x00-\x08\x0B\x0C\x0E-\x1F\x7F]/g, '');
Removes:
- Leading/trailing whitespace
- Control characters
- Null bytes
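The sanitization step above can be demonstrated directly:

```javascript
// Demonstration of the sanitization rule above: trim whitespace, strip
// control characters and null bytes.
const raw = "  AI developments\u0000 2025\u001F  ";
const sanitized = raw.trim().replace(/[\x00-\x08\x0B\x0C\x0E-\x1F\x7F]/g, '');
console.log(sanitized);  // "AI developments 2025"
```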
Parameter Optimization
max_results
Default: 10
Range: 1-20
Performance impact:
- 5 results: Fastest
- 10 results: Balanced (recommended)
- 20 results: Slower, more comprehensive
// Fast, focused search
{
  "query": "specific topic",
  "max_results": 5
}

// Comprehensive coverage
{
  "query": "broad topic",
  "max_results": 15
}
For most use cases, 5-10 results provide optimal balance between speed and coverage.
analysis_mode
Basic: Fast, simple results
Comprehensive: Slow, detailed analysis
Token allocation:
max_tokens: analysisMode === "comprehensive" ? 4000 : 2000
Performance comparison:
- Basic: 2000 tokens, 3-8 seconds
- Comprehensive: 4000 tokens, 20-40 seconds
Choose basic when:
- You need quick results
- Simple information is sufficient
- Making multiple searches
- Testing or exploration
Choose comprehensive when:
- You need deep analysis
- Timeline is important
- Multiple perspectives matter
- Direct quotes needed
- Historical context required
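The two checklists above can be condensed into a small client-side helper (pickAnalysisMode is a hypothetical name, not part of the server API):

```javascript
// Hypothetical helper applying the checklists above: any need for depth
// (timeline, quotes, perspectives, history) selects comprehensive mode.
function pickAnalysisMode({ needsTimeline = false, needsQuotes = false,
                            needsPerspectives = false, needsHistory = false } = {}) {
  const needsDepth = needsTimeline || needsQuotes || needsPerspectives || needsHistory;
  return needsDepth ? "comprehensive" : "basic";
}

console.log(pickAnalysisMode());                         // "basic"
console.log(pickAnalysisMode({ needsTimeline: true }));  // "comprehensive"
```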
Date Filtering
Performance impact: Minimal to moderate
Faster:
{
  "query": "recent news",
  "from_date": "2025-06-01",  // Narrow range
  "to_date": "2025-06-07"
}
Slower:
{
  "query": "historical data",
  "from_date": "2020-01-01",  // Broad range
  "to_date": "2025-06-04"
}
Narrower date ranges typically yield faster results since less content needs to be searched and analyzed.
Timeout Configuration
Configure timeouts based on your usage patterns:
this.requestTimeout = parseInt(process.env.GROK_TIMEOUT || '30000');
const controller = new AbortController();
const timeoutId = setTimeout(() => controller.abort(), this.requestTimeout);
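Put together, the timeout wiring might look like the sketch below (fetchWithTimeout is a hypothetical wrapper name; it assumes Node 18+ where fetch is global):

```javascript
// Sketch: tying the configured timeout to a request via AbortController.
// fetchWithTimeout is a hypothetical name; assumes Node 18+ global fetch.
async function fetchWithTimeout(url, options = {}, timeoutMs = 30000) {
  const controller = new AbortController();
  const timeoutId = setTimeout(() => controller.abort(), timeoutMs);
  try {
    return await fetch(url, { ...options, signal: controller.signal });
  } finally {
    clearTimeout(timeoutId);  // always clear so the timer cannot fire later
  }
}
```

The finally block matters: without clearTimeout, a fast response would leave a live timer that aborts an already-consumed controller.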
Recommended Timeouts
Basic mode only:
{"GROK_TIMEOUT": "30000"} // 30 seconds
Comprehensive mode only:
{"GROK_TIMEOUT": "60000"} // 60 seconds
Mixed usage (recommended):
{"GROK_TIMEOUT": "45000"} // 45 seconds
Development/testing:
{"GROK_TIMEOUT": "20000"} // 20 seconds (fail fast)
Setting timeout too low causes legitimate comprehensive requests to fail. Setting too high delays error detection.
The server implements exponential backoff for retries:
const backoffDelay = Math.min(1000 * Math.pow(2, retryCount), 10000);
Retry delays:
- Retry 1: 1 second (2^0 * 1000)
- Retry 2: 2 seconds (2^1 * 1000)
- Retry 3: 4 seconds (2^2 * 1000)
- Retry 4: 8 seconds (2^3 * 1000)
- Retry 5+: 10 seconds (capped)
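The delay schedule above follows directly from the backoff formula:

```javascript
// Computing the retry delays listed above from the backoff formula.
const delays = [0, 1, 2, 3, 4].map(
  retryCount => Math.min(1000 * Math.pow(2, retryCount), 10000)
);
console.log(delays);  // [1000, 2000, 4000, 8000, 10000]
```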
Total time with 3 retries (30-second timeout):
- Original request: 0-30 seconds
- Retry 1: +1 second + 0-30 seconds
- Retry 2: +2 seconds + 0-30 seconds
- Retry 3: +4 seconds + 0-30 seconds
- Maximum: ~127 seconds (four 30-second request windows plus 7 seconds of backoff)
Retry Configuration
Fast fail (development):
{
  "GROK_MAX_RETRIES": "1",
  "GROK_TIMEOUT": "20000"
}
Balanced (recommended):
{
  "GROK_MAX_RETRIES": "3",
  "GROK_TIMEOUT": "45000"
}
Aggressive (production, high reliability):
{
  "GROK_MAX_RETRIES": "5",
  "GROK_TIMEOUT": "60000"
}
Request Monitoring
The server logs request duration:
const startTime = Date.now();
// ... make request ...
const duration = Date.now() - startTime;
Logger.debug(`API request successful`, { duration, endpoint });
Parallel Requests
The server handles one request at a time per instance. To make parallel searches, use multiple MCP server instances in your configuration.
Multiple server configuration:
{
  "mcpServers": {
    "grok-search-1": {
      "command": "npx",
      "args": ["grok-search-mcp"],
      "env": { "XAI_API_KEY": "your-key" }
    },
    "grok-search-2": {
      "command": "npx",
      "args": ["grok-search-mcp"],
      "env": { "XAI_API_KEY": "your-key" }
    }
  }
}
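On the client side, queries routed to the two instances can then be dispatched concurrently. The sketch below assumes each instance is exposed as an async search function (searchServer1/searchServer2 are hypothetical names for whatever your MCP client provides):

```javascript
// Hypothetical sketch: fan two independent queries out to two server instances.
// searchServer1/searchServer2 stand in for client-provided async search functions.
async function parallelSearch(searchServer1, searchServer2, queryA, queryB) {
  // Promise.all starts both requests at once and resolves when both finish
  const [resultsA, resultsB] = await Promise.all([
    searchServer1({ query: queryA }),
    searchServer2({ query: queryB })
  ]);
  return { resultsA, resultsB };
}
```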
Memory Usage
Cache Memory
Per cached item: ~5-50 KB (depending on comprehensiveness)
Maximum cache: 100 items
Total cache memory: ~0.5-5 MB
Server Memory
Baseline: ~50-100 MB
With cache: ~55-105 MB
Per request: +10-20 MB (temporary)
Memory usage is minimal and suitable for long-running server instances.
Best Practices
1. Choose Appropriate Analysis Mode
// Fast exploration
const basicMode = {
  analysis_mode: "basic",
  max_results: 5
};

// Deep research
const comprehensiveMode = {
  analysis_mode: "comprehensive",
  max_results: 10
};
2. Use Cache Effectively
Leverage cache for repeated queries:
// First request: ~30 seconds
{
  "query": "AI developments",
  "analysis_mode": "comprehensive"
}

// Second request (within 30 min): instant
{
  "query": "AI developments",
  "analysis_mode": "comprehensive"
}
Cache invalidation: entries expire after the 30-minute TTL; to force a fresh result before then, change any parameter
3. Optimize Query Construction
Good:
{
"query": "SpaceX Starship test flight",
"search_type": "news",
"from_date": "2025-06-01"
}
Bad:
{
"query": "Can you tell me everything about all SpaceX activities including Starship, Falcon 9, Dragon, and any other programs they might have been working on recently or in the past?",
"search_type": "general"
}
4. Implement Progressive Enhancement
Start with fast basic search, then request comprehensive analysis if needed:
// Step 1: Fast initial search
const basicResults = await search({
  query: "topic",
  analysis_mode: "basic",
  max_results: 5
});

// Step 2: User wants more detail
if (userRequestsDetail) {
  const comprehensiveResults = await search({
    query: "topic",
    analysis_mode: "comprehensive",
    max_results: 10
  });
}
5. Use Appropriate Search Types
Performance ranking (fastest to slowest):
1. web - Single source type
2. news - Two source types (news + web)
3. twitter - Single source, but may need filtering
4. general - All sources (web + news + twitter)
getSearchSources(searchType, handles = null) {
  switch (searchType) {
    case "web":
      return [{ "type": "web" }];
    case "news":
      return [{ "type": "news" }, { "type": "web" }];
    case "twitter":
    case "x": {
      // Braces scope the const declaration to this case
      const xSource = { "type": "x" };
      if (handles && handles.length > 0) {
        xSource.x_handles = handles;
      }
      return [xSource];
    }
    case "general":
    default:
      return [{ "type": "web" }, { "type": "news" }, { "type": "x" }];
  }
}
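The same mapping can also be expressed as a lookup table, which makes the per-type fan-out easy to see at a glance (a sketch, not the server's actual implementation):

```javascript
// Sketch: the source fan-out per search type as a lookup table
// (hypothetical restructuring, not the server's actual code).
const SOURCE_MAP = {
  web: [{ type: "web" }],
  news: [{ type: "news" }, { type: "web" }],
  twitter: [{ type: "x" }],
  x: [{ type: "x" }],
  general: [{ type: "web" }, { type: "news" }, { type: "x" }]
};

function sourcesFor(searchType, handles = null) {
  const sources = (SOURCE_MAP[searchType] || SOURCE_MAP.general)
    .map(source => ({ ...source }));  // copy so callers cannot mutate the table
  if ((searchType === "twitter" || searchType === "x") && handles && handles.length > 0) {
    sources[0].x_handles = handles;
  }
  return sources;
}

console.log(sourcesFor("news").length);   // 2
console.log(sourcesFor("x", ["nasa"]));   // [ { type: 'x', x_handles: [ 'nasa' ] } ]
```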
6. Monitor Health Metrics
Regularly check health metrics:
{
  "tool": "health_check",
  "parameters": {}
}
Key metrics:
- success_rate: Should be >90%
- error_count: Monitor for spikes
- cacheSize: Indicates cache utilization
- uptime_ms: Server stability
7. Handle Timeouts Gracefully
try {
  const results = await search(params);
  return results;
} catch (error) {
  if (error.message && error.message.includes('timeout')) {
    // Fall back to a faster basic-mode search
    return await search({
      ...params,
      analysis_mode: 'basic',
      max_results: 5
    });
  }
  throw error;
}
Performance Benchmarks
Basic Web Search:
- Min: 2 seconds
- Typical: 5 seconds
- Max: 15 seconds
- Timeout: 30 seconds
Comprehensive Analysis:
- Min: 15 seconds
- Typical: 30 seconds
- Max: 50 seconds
- Timeout: 60 seconds (recommended)
Cache Hit:
- Response: Under 100ms (instant)
The health_check tool computes these metrics as follows:
handleHealthCheck() {
  const apiHealth = this.grokAPI.checkHealth();
  const uptime = Date.now() - this.startTime;
  const successRate = this.requestCount > 0
    ? (((this.requestCount - this.errorCount) / this.requestCount) * 100).toFixed(2) + "%"
    : "N/A";
  return {
    server_healthy: true,
    api_healthy: apiHealth.healthy,
    uptime_ms: uptime,
    total_requests: this.requestCount,
    error_count: this.errorCount,
    success_rate: successRate,
    api_details: {
      hasApiKey: apiHealth.hasApiKey,
      cacheSize: apiHealth.cacheSize,
      lastError: apiHealth.lastError
    }
  };
}
Interpret metrics:
- Success rate 95-100%: Excellent
- Success rate 85-95%: Good (some retries working)
- Success rate 70-85%: Fair (check configuration)
- Success rate under 70%: Poor (investigate errors)
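The bands above can be encoded as a small helper (rateSuccessRate is a hypothetical name, useful for alerting scripts):

```javascript
// Hypothetical helper mapping a success_rate percentage to the bands above.
function rateSuccessRate(percent) {
  if (percent >= 95) return "excellent";
  if (percent >= 85) return "good";
  if (percent >= 70) return "fair";
  return "poor";
}

console.log(rateSuccessRate(97.5));  // "excellent"
console.log(rateSuccessRate(75));    // "fair"
```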
Slow Responses
Diagnosis:
- Check analysis mode (comprehensive is slower)
- Review max_results (higher is slower)
- Check network latency
- Review date range (broader is slower)
Solution:
- Use basic mode when possible
- Reduce max_results to 5-10
- Use narrower date ranges
- Check network connectivity
High Error Rate
Diagnosis:
{
  "error_count": 25,
  "total_requests": 100,
  "success_rate": "75.00%"
}
Solution:
- Increase timeout: GROK_TIMEOUT: 60000
- Increase retries: GROK_MAX_RETRIES: 5
- Check API key validity
- Review error patterns in logs
Cache Not Helping
Diagnosis:
- cacheSize: 0 or very low
- Frequently changing parameters
Solution:
- Reuse same queries for comprehensive analysis
- Avoid randomizing parameters
- Use consistent date ranges
Note: caching applies only to comprehensive mode, so basic searches never populate the cache.
Advanced Optimization
JSON Parsing Strategies
The server uses multiple parsing strategies for resilience:
const jsonParsingStrategies = [
  // Strategy 1: Find complete JSON object
  () => {
    const jsonMatch = content.match(/\{[\s\S]*\}/);
    return jsonMatch ? JSON.parse(jsonMatch[0]) : null;
  },
  // Strategy 2: Find JSON between code blocks
  () => {
    const codeBlockMatch = content.match(/```json\s*([\s\S]*?)\s*```/);
    return codeBlockMatch ? JSON.parse(codeBlockMatch[1]) : null;
  },
  // Strategy 3: Find JSON after specific markers
  () => {
    const markerMatch = content.match(/(?:json|JSON|response):\s*(\{[\s\S]*\})/);
    return markerMatch ? JSON.parse(markerMatch[1]) : null;
  },
  // Strategy 4: Clean and try parsing the entire content
  () => {
    const cleaned = content.replace(/^[^{]*/, '').replace(/[^}]*$/, '');
    return cleaned.startsWith('{') ? JSON.parse(cleaned) : null;
  }
];

let parsedResults = null;
for (const strategy of jsonParsingStrategies) {
  try {
    parsedResults = strategy();
    if (parsedResults) break;
  } catch (error) {
    continue;  // Try the next strategy on parse failure
  }
}
Performance impact: Negligible (under 10ms total)
Reliability gain: Handles malformed responses gracefully
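Wrapped as a standalone function, the strategy chain behaves like this (parseModelJson is a hypothetical name; the fence string is built with repeat() only to keep this example self-contained):

```javascript
// Hypothetical standalone wrapper around the strategy chain above.
const FENCE = '`'.repeat(3);  // "```", written this way to keep the example fence-safe

function parseModelJson(content) {
  const strategies = [
    // 1: first complete JSON object anywhere in the text
    () => { const m = content.match(/\{[\s\S]*\}/); return m ? JSON.parse(m[0]) : null; },
    // 2: JSON inside a fenced code block
    () => {
      const m = content.match(new RegExp(FENCE + 'json\\s*([\\s\\S]*?)\\s*' + FENCE));
      return m ? JSON.parse(m[1]) : null;
    },
    // 3: JSON after a "json:"/"response:" marker
    () => {
      const m = content.match(/(?:json|JSON|response):\s*(\{[\s\S]*\})/);
      return m ? JSON.parse(m[1]) : null;
    },
    // 4: strip leading/trailing junk and parse what remains
    () => {
      const cleaned = content.replace(/^[^{]*/, '').replace(/[^}]*$/, '');
      return cleaned.startsWith('{') ? JSON.parse(cleaned) : null;
    }
  ];
  for (const strategy of strategies) {
    try {
      const parsed = strategy();
      if (parsed) return parsed;
    } catch {
      // fall through to the next strategy
    }
  }
  return null;
}

console.log(parseModelJson('Sure! Here is the data: {"results": []} Hope that helps.'));
// { results: [] }
console.log(parseModelJson('no json at all'));  // null
```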
Query sanitization is similarly cheap:
const sanitizedQuery = query.trim().replace(/[\x00-\x08\x0B\x0C\x0E-\x1F\x7F]/g, '');
Performance impact: Under 1ms
Benefit: Prevents API errors from invalid characters
Summary
For optimal performance:
- Use basic mode for most searches
- Set max_results to 5-10
- Use narrow date ranges when possible
- Configure timeouts appropriately: 45-60 seconds
- Enable retries: 3-5 attempts
- Leverage caching for repeated comprehensive queries
- Monitor health metrics regularly
- Use specific search types instead of “general”
- Optimize queries: 2-10 words, clear and specific
- Implement progressive enhancement: basic → comprehensive