Overview
Grok Search MCP Server implements an intelligent in-memory caching system designed specifically for comprehensive analysis queries. The cache improves performance, reduces API costs, and returns faster responses for repeated complex analyses.

Cache Implementation
The caching system is implemented in the SearchCache class:
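The original code listing is not reproduced here. The following is a minimal Python sketch of what SearchCache might look like, assuming an OrderedDict-based LRU with the documented defaults (100 items, 30-minute TTL); everything except the class name and the defaults is an assumption.

```python
import time
from collections import OrderedDict

class SearchCache:
    """Hypothetical sketch of an in-memory LRU cache with TTL."""

    def __init__(self, max_size=100, ttl_seconds=30 * 60):
        self.max_size = max_size          # documented default: 100 entries
        self.ttl_seconds = ttl_seconds    # documented default: 30 minutes
        self._entries = OrderedDict()     # key -> (timestamp, value)

    def get(self, key):
        """Return a cached value, or None if missing or expired."""
        entry = self._entries.get(key)
        if entry is None:
            return None
        timestamp, value = entry
        if time.time() - timestamp > self.ttl_seconds:
            del self._entries[key]        # expired: drop and report a miss
            return None
        self._entries.move_to_end(key)    # mark as most recently used
        return value

    def set(self, key, value):
        """Store a value, evicting the least recently used entry if full."""
        if key in self._entries:
            self._entries.move_to_end(key)
        self._entries[key] = (time.time(), value)
        if len(self._entries) > self.max_size:
            self._entries.popitem(last=False)  # evict the LRU entry
```

The OrderedDict keeps insertion order, so the first entry is always the least recently used one once `move_to_end` is applied on every read.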
Cache Configuration
- Default Max Size: 100 items (the maximum number of cached comprehensive analyses)
- Default TTL: 30 minutes (time-to-live before cache entries expire)
How Caching Works
1. Cache Key Generation
Cache keys include all parameters to ensure accuracy:

- Sanitized query text
- Search type (web/news/twitter/general)
- Max results parameter
- Handle array (for Twitter searches)
- From date (if specified)
- To date (if specified)
- Analysis mode marker (“comprehensive”)
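The exact key format is not shown in the source; the helper below is a hypothetical Python sketch that combines all of the listed parameters into a single string key, so two requests share a key only when every parameter matches.

```python
def make_cache_key(query, search_type, max_results,
                   handles=None, from_date=None, to_date=None):
    """Hypothetical key builder joining every result-affecting parameter."""
    sanitized = "".join(ch for ch in query if ch.isprintable())
    parts = [
        sanitized,
        search_type,                  # web / news / twitter / general
        str(max_results),
        ",".join(handles or []),      # Twitter handles, if any
        from_date or "",
        to_date or "",
        "comprehensive",              # analysis-mode marker
    ]
    return "|".join(parts)
```

Any change to any parameter produces a different key, which is what prevents a cached result from being served for a subtly different request.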
2. Cache Check (Read)
Before making an API request, the system checks the cache for a fresh entry.

3. Cache Storage (Write)
After a successful comprehensive analysis, the results are cached for reuse.

4. LRU Eviction
When the cache reaches capacity, the least recently used (LRU) entry is removed to make room.

What Gets Cached
Comprehensive Analyses Only
The cache only stores comprehensive analysis results.

✅ Cached Requests:

- Comprehensive analysis text
- Key findings array
- Timeline events
- Direct quotes
- Multiple perspectives
- Implications
- Verification status
- Raw results
- Citations and metadata
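The exact structure of a cached entry is not shown in the source; the dictionary below is a hypothetical illustration whose field names simply mirror the list above.

```python
# Hypothetical shape of a cached comprehensive-analysis entry,
# mirroring the fields listed above.
cached_entry = {
    "comprehensive_analysis": "full analysis text",
    "key_findings": [],
    "timeline": [],
    "quotes": [],
    "perspectives": [],
    "implications": [],
    "verification_status": "unverified",
    "raw_results": [],
    "citations": [],
}
```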
Why Only Comprehensive?
Comprehensive analyses are expensive operations (4,000 tokens versus 2,000 for basic searches), and their results remain relevant longer than basic search results, which prioritize real-time data.
Cache Lifecycle
Monitoring Cache Performance
The health check tool provides cache metrics.

Cache Key Examples
Example 1: Simple Query
Example 2: With Date Range
Example 3: Twitter with Handles
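The original example keys are not preserved here. The strings below are hypothetical illustrations, assuming a pipe-delimited key of the form query|type|max_results|handles|from_date|to_date|mode; the queries and handles are invented for illustration.

```python
# Example 1: simple query
key_simple = "climate change|web|10||||comprehensive"

# Example 2: with date range
key_dated = "climate change|news|10||2024-01-01|2024-06-30|comprehensive"

# Example 3: Twitter with handles
key_twitter = "ai policy|twitter|20|@handle_a,@handle_b|||comprehensive"
```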
Performance Impact
- Cache Hit: < 10 ms (instant response from memory)
- Cache Miss: 10-20 seconds (full API request required)
- Token Savings: 4,000 tokens saved per cache hit
Best Practices
For Developers
Use Consistent Parameters
Keep parameters consistent for the same query to maximize cache hits.

✅ Good: Same query, same parameters
❌ Bad: Varying parameters prevent cache hits
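Because every parameter is part of the cache key, even a small change defeats reuse. The simplified, hypothetical key builder below illustrates the effect:

```python
def key_for(query, search_type, max_results):
    """Hypothetical simplified key builder for illustration only."""
    return f"{query}|{search_type}|{max_results}|comprehensive"

# Good: identical parameters produce the same key -> cache hit
hit = key_for("quantum computing", "web", 10) == key_for("quantum computing", "web", 10)

# Bad: changing max_results produces a different key -> cache miss
miss = key_for("quantum computing", "web", 10) != key_for("quantum computing", "web", 15)
```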
Understand TTL Limitations
Cache entries expire after 30 minutes:
- For rapidly changing topics, consider basic mode instead
- For research queries, cache provides excellent reusability
- Breaking news may be stale after 30 minutes
Monitor Cache Size
Maximum 100 entries with LRU eviction:
- Popular queries remain cached longer
- Unique queries may be evicted quickly
- Use health_check to monitor cache size
Query Normalization
Be aware that query text must match exactly. The server sanitizes control characters but preserves case and spacing.
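A minimal sketch of that sanitization behavior, assuming it simply drops non-printable characters (the function name and exact rule are assumptions):

```python
def sanitize(query):
    """Hypothetical sketch: drop control characters, keep case and spacing."""
    return "".join(ch for ch in query if ch.isprintable())
```

Because case and spacing are preserved, "AI Safety" and "ai safety" map to different cache keys.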
Cache Limitations
Memory Only: Cache is stored in-memory and does not persist across server restarts. Restarting the MCP server clears all cached entries.
Advanced: Manual Cache Management
While not exposed through the MCP interface, the cache can be cleared programmatically. This is useful for:

- Development and testing
- Forcing fresh data retrieval
- Memory management in constrained environments
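The clearing API is not shown in the source; a `clear()` method on the cache instance is a plausible sketch (the method name and internals are assumptions):

```python
class SearchCache:
    """Hypothetical fragment showing only the clearing path."""

    def __init__(self):
        self._entries = {}   # key -> (timestamp, value)

    def clear(self):
        """Drop every cached analysis, forcing fresh API retrieval."""
        self._entries.clear()
```

For example, a test suite might call `cache.clear()` between runs to guarantee each case exercises a real API request.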
Summary
The intelligent caching system:

- Targets expensive operations - Only caches comprehensive analyses
- Uses precise keys - Includes all parameters for accuracy
- Implements TTL - 30-minute expiration ensures reasonable freshness
- Employs LRU eviction - Maintains 100-item capacity efficiently
- Transparent operation - Works automatically without configuration
- Performance boost - Reduces latency from 10-20s to under 10ms on cache hits
- Cost optimization - Saves 4,000 tokens per cached comprehensive analysis