Overview
Grok Search MCP Server implements an intelligent in-memory caching system designed specifically for comprehensive analysis queries. The cache improves performance, reduces API costs, and returns faster responses for repeated complex analyses.

Cache Implementation
The caching system is implemented in the SearchCache class:
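The original code listing is not reproduced here. The following is a minimal Python sketch of what SearchCache might look like, assuming an OrderedDict-based LRU with the documented defaults (100 items, 30-minute TTL); everything except the class name and the defaults is an assumption.

```python
import time
from collections import OrderedDict

class SearchCache:
    """Hypothetical sketch of an in-memory LRU cache with TTL."""

    def __init__(self, max_size=100, ttl_seconds=30 * 60):
        self.max_size = max_size          # documented default: 100 entries
        self.ttl_seconds = ttl_seconds    # documented default: 30 minutes
        self._entries = OrderedDict()     # key -> (timestamp, value)

    def get(self, key):
        """Return a cached value, or None if missing or expired."""
        entry = self._entries.get(key)
        if entry is None:
            return None
        timestamp, value = entry
        if time.time() - timestamp > self.ttl_seconds:
            del self._entries[key]        # expired: drop and report a miss
            return None
        self._entries.move_to_end(key)    # mark as most recently used
        return value

    def set(self, key, value):
        """Store a value, evicting the least recently used entry if full."""
        if key in self._entries:
            self._entries.move_to_end(key)
        self._entries[key] = (time.time(), value)
        if len(self._entries) > self.max_size:
            self._entries.popitem(last=False)  # evict the LRU entry
```

The OrderedDict keeps insertion order, so the first entry is always the least recently used one once `move_to_end` is applied on every read.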
Cache Configuration
- Default Max Size: 100 items (the maximum number of cached comprehensive analyses)
- Default TTL: 30 minutes (time-to-live before cache entries expire)
How Caching Works
1. Cache Key Generation
Cache keys include all parameters to ensure accuracy:

- Sanitized query text
- Search type (web/news/twitter/general)
- Max results parameter
- Handle array (for Twitter searches)
- From date (if specified)
- To date (if specified)
- Analysis mode marker (“comprehensive”)
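The exact key format is not shown in the source; the helper below is a hypothetical Python sketch that combines all of the listed parameters into a single string key, so two requests share a key only when every parameter matches.

```python
def make_cache_key(query, search_type, max_results,
                   handles=None, from_date=None, to_date=None):
    """Hypothetical key builder joining every result-affecting parameter."""
    sanitized = "".join(ch for ch in query if ch.isprintable())
    parts = [
        sanitized,
        search_type,                  # web / news / twitter / general
        str(max_results),
        ",".join(handles or []),      # Twitter handles, if any
        from_date or "",
        to_date or "",
        "comprehensive",              # analysis-mode marker
    ]
    return "|".join(parts)
```

Any change to any parameter produces a different key, which is what prevents a cached result from being served for a subtly different request.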
2. Cache Check (Read)
Before making an API request, the system checks the cache for a fresh entry.

3. Cache Storage (Write)
After a successful comprehensive analysis, the results are cached for reuse.

4. LRU Eviction
When the cache reaches capacity, the least recently used (LRU) entry is removed to make room.

What Gets Cached
Comprehensive Analyses Only
The cache only stores comprehensive analysis results.

✅ Cached Requests:

- Comprehensive analysis text
- Key findings array
- Timeline events
- Direct quotes
- Multiple perspectives
- Implications
- Verification status
- Raw results
- Citations and metadata
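The exact structure of a cached entry is not shown in the source; the dictionary below is a hypothetical illustration whose field names simply mirror the list above.

```python
# Hypothetical shape of a cached comprehensive-analysis entry,
# mirroring the fields listed above.
cached_entry = {
    "comprehensive_analysis": "full analysis text",
    "key_findings": [],
    "timeline": [],
    "quotes": [],
    "perspectives": [],
    "implications": [],
    "verification_status": "unverified",
    "raw_results": [],
    "citations": [],
}
```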
Why Only Comprehensive?
Comprehensive analyses are expensive operations (4,000 tokens versus 2,000 for basic searches), and their results remain relevant longer than basic search results, which prioritize real-time data.
Cache Lifecycle
Monitoring Cache Performance
The health check tool provides cache metrics.

Cache Key Examples
Example 1: Simple Query
Example 2: With Date Range
Example 3: Twitter with Handles
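The original example keys are not preserved here. The strings below are hypothetical illustrations, assuming a pipe-delimited key of the form query|type|max_results|handles|from_date|to_date|mode; the queries and handles are invented for illustration.

```python
# Example 1: simple query
key_simple = "climate change|web|10||||comprehensive"

# Example 2: with date range
key_dated = "climate change|news|10||2024-01-01|2024-06-30|comprehensive"

# Example 3: Twitter with handles
key_twitter = "ai policy|twitter|20|@handle_a,@handle_b|||comprehensive"
```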
Performance Impact
- Cache Hit: < 10 ms (instant response from memory)
- Cache Miss: 10-20 seconds (full API request required)
- Token Savings: 4,000 tokens saved per cache hit
Best Practices
For Developers
Use Consistent Parameters
Keep parameters consistent for the same query to maximize cache hits.

✅ Good: Same query, same parameters
❌ Bad: Varying parameters prevent cache hits
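Because every parameter is part of the cache key, even a small change defeats reuse. The simplified, hypothetical key builder below illustrates the effect:

```python
def key_for(query, search_type, max_results):
    """Hypothetical simplified key builder for illustration only."""
    return f"{query}|{search_type}|{max_results}|comprehensive"

# Good: identical parameters produce the same key -> cache hit
hit = key_for("quantum computing", "web", 10) == key_for("quantum computing", "web", 10)

# Bad: changing max_results produces a different key -> cache miss
miss = key_for("quantum computing", "web", 10) != key_for("quantum computing", "web", 15)
```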
Understand TTL Limitations
Cache entries expire after 30 minutes:
- For rapidly changing topics, consider basic mode instead
- For research queries, cache provides excellent reusability
- Breaking news may be stale after 30 minutes
Monitor Cache Size
Maximum 100 entries with LRU eviction:
- Popular queries remain cached longer
- Unique queries may be evicted quickly
- Use health_check to monitor cache size
Query Normalization
Be aware that query text must match exactly. The server sanitizes control characters but preserves case and spacing.
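A minimal sketch of that sanitization behavior, assuming it simply drops non-printable characters (the function name and exact rule are assumptions):

```python
def sanitize(query):
    """Hypothetical sketch: drop control characters, keep case and spacing."""
    return "".join(ch for ch in query if ch.isprintable())
```

Because case and spacing are preserved, "AI Safety" and "ai safety" map to different cache keys.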
Cache Limitations
Memory Only: Cache is stored in-memory and does not persist across server restarts. Restarting the MCP server clears all cached entries.
Advanced: Manual Cache Management
While not exposed through the MCP interface, the cache can be cleared programmatically. This is useful for:

- Development and testing
- Forcing fresh data retrieval
- Memory management in constrained environments
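The clearing API is not shown in the source; a `clear()` method on the cache instance is a plausible sketch (the method name and internals are assumptions):

```python
class SearchCache:
    """Hypothetical fragment showing only the clearing path."""

    def __init__(self):
        self._entries = {}   # key -> (timestamp, value)

    def clear(self):
        """Drop every cached analysis, forcing fresh API retrieval."""
        self._entries.clear()
```

For example, a test suite might call `cache.clear()` between runs to guarantee each case exercises a real API request.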
Summary
The intelligent caching system:

- Targets expensive operations - Only caches comprehensive analyses
- Uses precise keys - Includes all parameters for accuracy
- Implements TTL - 30-minute expiration ensures reasonable freshness
- Employs LRU eviction - Maintains 100-item capacity efficiently
- Transparent operation - Works automatically without configuration
- Performance boost - Reduces latency from 10-20s to under 10ms on cache hits
- Cost optimization - Saves 4,000 tokens per cached comprehensive analysis