
watercooler_smart_query

Execute an intelligent multi-tier memory query. The tool searches across three memory tiers and escalates automatically when a lower tier doesn’t return sufficient results.
Safety: Read-only tool - does not modify any state

Overview

Smart Query orchestrates three memory tiers:
  • T1 (Baseline): JSONL graph with keyword/semantic search (cheapest, no LLM)
  • T2 (Graphiti): FalkorDB temporal graph with hybrid search (medium cost)
  • T3 (LeanRAG): Hierarchical clustering with multi-hop reasoning (expensive)
The orchestrator follows: “Always choose the cheapest tier that can satisfy the query intent.” Escalation happens automatically when results are insufficient.

Memory Tiers

T1: Baseline Graph

  • Storage: JSONL files (.graph/nodes.jsonl)
  • Search: Keyword regex + optional embeddings
  • Cost: Free (no LLM)
  • Latency: Less than 100ms
  • Best for: File lookups, keyword search, entity search

T2: Graphiti Temporal Graph

  • Storage: FalkorDB graph database
  • Search: Hybrid keyword + semantic + temporal
  • Cost: LLM calls for entity extraction (during indexing)
  • Latency: ~1-2s
  • Best for: Temporal queries, relationship queries, “who did what when?”

T3: LeanRAG Clustering

  • Storage: FalkorDB + hierarchical clusters
  • Search: Multi-hop reasoning, cluster navigation
  • Cost: LLM calls for summarization + query
  • Latency: ~5-10s
  • Best for: Complex multi-hop queries, “why did we decide X given Y?”

Parameters

  • query (string, required): Search query (e.g., “What authentication method was implemented?”)
  • code_path (string, default ""): Path to code repository (for T2/T3 database resolution)
  • threads_dir (string, default ""): Path to threads directory (for T1 baseline graph). If empty, attempts to resolve from code_path.
  • max_tiers (integer, default 2): Maximum number of tiers to query (1-3). The default of 2 queries T1 + T2; T3 is skipped unless explicitly enabled.
  • force_tier (string, default null): Force the query to a specific tier (“T1”, “T2”, or “T3”). Disables escalation when set.
  • group_ids (list[string], default null): Optional list of project group_ids to filter results

Return Value

Returns JSON with:
  • query (string): Original query text
  • result_count (integer): Total number of evidence items found
  • tiers_queried (array): Tiers that were queried (e.g., [“T1”, “T2”])
  • primary_tier (string): The tier that provided the best results
  • escalation_reason (string): Why escalation occurred (if applicable)
  • sufficient (boolean): Whether results met the sufficiency criteria
  • message (string): Human-readable summary (e.g., “Found 5 results from T2”)
  • evidence (array): Evidence items from all tiers, each containing:
      • tier: Source tier (T1/T2/T3)
      • id: Evidence UUID
      • content: Evidence text
      • score: Relevance score (0-1)
      • name: Evidence title
      • provenance: Source metadata (topic, entry_id, timestamp)
      • metadata: Additional properties

Usage Examples

Basic Query (Auto-Escalation)

await use_mcp_tool(
    "watercooler_smart_query",
    query="What error handling patterns did we use?",
    code_path="."
)

Force Specific Tier

# Force T2 (Graphiti)
await use_mcp_tool(
    "watercooler_smart_query",
    query="What changed after the auth refactor?",
    force_tier="T2",
    code_path="."
)

Enable T3 (LeanRAG)

# Allow escalation to T3
await use_mcp_tool(
    "watercooler_smart_query",
    query="Why did we choose RS256 given our deployment model?",
    max_tiers=3,
    code_path="."
)

Limit to T1 Only

await use_mcp_tool(
    "watercooler_smart_query",
    query="src/auth/jwt.py",
    max_tiers=1,
    code_path="."
)

Example Output

{
  "query": "What error handling patterns did we use?",
  "result_count": 5,
  "tiers_queried": ["T1", "T2"],
  "primary_tier": "T2",
  "escalation_reason": "Only 2 results from T1 (need 3)",
  "sufficient": true,
  "evidence": [
    {
      "tier": "T1",
      "id": "01ABC...",
      "content": "Implemented try-catch patterns in src/auth/jwt.py",
      "score": 0.85,
      "name": "Error Handling Discussion",
      "provenance": {
        "topic": "feature-auth",
        "entry_id": "01ABC123...",
        "timestamp": "2025-01-15T10:30:00Z"
      },
      "metadata": {"agent": "Claude", "type": "Note"}
    },
    {
      "tier": "T2",
      "id": "01DEF...",
      "content": "Added custom error classes for auth failures",
      "score": 0.92,
      "name": "Custom Error Classes",
      "provenance": {
        "topic": "feature-auth",
        "entry_id": "01DEF456...",
        "timestamp": "2025-01-15T11:00:00Z"
      },
      "metadata": {"agent": "Cursor", "type": "Note"}
    }
  ],
  "message": "Found 5 results from T2"
}
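
Clients typically post-process the evidence array, for example keeping only confident hits and ranking them by relevance. A minimal sketch (the response dict below is abridged from the example above; the 0.6 cutoff mirrors the orchestrator’s documented default score threshold):

```python
# Abridged smart-query response, shaped like the example output above.
response = {
    "query": "What error handling patterns did we use?",
    "evidence": [
        {"tier": "T1", "score": 0.85, "name": "Error Handling Discussion"},
        {"tier": "T2", "score": 0.92, "name": "Custom Error Classes"},
        {"tier": "T1", "score": 0.41, "name": "Low-relevance Note"},
    ],
}

def top_evidence(response, min_score=0.6):
    """Return evidence items at or above min_score, best first."""
    hits = [e for e in response["evidence"] if e["score"] >= min_score]
    return sorted(hits, key=lambda e: e["score"], reverse=True)

ranked = top_evidence(response)
```

Here `ranked` keeps the two items scoring above 0.6, with the T2 hit (0.92) first.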

Escalation Logic

The orchestrator escalates when:
  1. Too few results: Fewer than min_results (default: 3)
  2. Low confidence: All scores below threshold (default: 0.6)
  3. Query complexity: Natural language questions escalate to semantic tiers
Escalation stops when:
  • Sufficient results found
  • max_tiers reached
  • Higher tier unavailable
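
The rules above can be sketched as a loop. This is an illustrative model, not the actual orchestrator code: the per-tier search backend is stubbed out, and the constants follow the documented defaults (min_results = 3, score threshold 0.6).

```python
MIN_RESULTS = 3        # documented default for sufficiency
SCORE_THRESHOLD = 0.6  # documented default confidence cutoff

def sufficient(results):
    """Enough results, and not all of them low-confidence."""
    if len(results) < MIN_RESULTS:
        return False
    return any(r["score"] >= SCORE_THRESHOLD for r in results)

def smart_query(query, query_tier, max_tiers=2):
    """Walk tiers T1..T{max_tiers}, stopping at the first sufficient set."""
    evidence, tiers_queried = [], []
    for n in range(1, max_tiers + 1):
        tier = f"T{n}"
        tiers_queried.append(tier)
        evidence.extend(query_tier(tier, query))
        if sufficient(evidence):
            break
    return {
        "evidence": evidence,
        "tiers_queried": tiers_queried,
        "sufficient": sufficient(evidence),
    }

# Demo with a stub backend: T1 returns 2 weak hits, so the loop
# escalates to T2, which adds 3 strong ones.
def stub_backend(tier, query):
    if tier == "T1":
        return [{"score": 0.5}, {"score": 0.4}]
    return [{"score": 0.9}, {"score": 0.7}, {"score": 0.8}]

result = smart_query("error handling", stub_backend, max_tiers=2)
```

With the stub backend, T1 alone fails the min-results check, so the loop queries T2 and stops once the combined evidence is sufficient.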

Configuration

Environment variables control tier availability:
# Enable/disable tiers (default: T1=1, T2=0, T3=0)
WATERCOOLER_TIER_T1_ENABLED=1
WATERCOOLER_TIER_T2_ENABLED=1
WATERCOOLER_TIER_T3_ENABLED=0  # Opt-in (expensive)

# Max tiers to query (default: 2)
WATERCOOLER_TIER_MAX_TIERS=2

# Min results for sufficiency (default: 3)
WATERCOOLER_TIER_MIN_RESULTS=3
Or via config.toml:
[memory.tiers]
t1_enabled = true
t2_enabled = true
t3_enabled = false
max_tiers = 2
min_results = 3
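
A hedged sketch of how the environment variables above could be merged with the documented defaults. The resolution order here (env var overrides the built-in default) is an assumption; consult the actual config loader for how config.toml settings interact with the environment.

```python
import os

# Documented defaults: T1 on, T2/T3 off, max_tiers=2, min_results=3.
DEFAULTS = {
    "WATERCOOLER_TIER_T1_ENABLED": "1",
    "WATERCOOLER_TIER_T2_ENABLED": "0",
    "WATERCOOLER_TIER_T3_ENABLED": "0",
    "WATERCOOLER_TIER_MAX_TIERS": "2",
    "WATERCOOLER_TIER_MIN_RESULTS": "3",
}

def tier_config(env=os.environ):
    """Merge environment overrides onto the documented defaults."""
    raw = {k: env.get(k, v) for k, v in DEFAULTS.items()}
    return {
        "t1_enabled": raw["WATERCOOLER_TIER_T1_ENABLED"] == "1",
        "t2_enabled": raw["WATERCOOLER_TIER_T2_ENABLED"] == "1",
        "t3_enabled": raw["WATERCOOLER_TIER_T3_ENABLED"] == "1",
        "max_tiers": int(raw["WATERCOOLER_TIER_MAX_TIERS"]),
        "min_results": int(raw["WATERCOOLER_TIER_MIN_RESULTS"]),
    }
```

For example, with no variables set, `tier_config(env={})` yields T1 enabled, T2/T3 disabled, `max_tiers=2`, and `min_results=3`.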

When to Use Each Tier

Use T1 (Baseline)

# Simple keyword lookups
await use_mcp_tool(
    "watercooler_smart_query",
    query="src/auth/jwt.py",
    max_tiers=1
)

Use T2 (Graphiti)

# Temporal/relationship queries
await use_mcp_tool(
    "watercooler_smart_query",
    query="Who worked on authentication after Jan 15?",
    force_tier="T2"
)

Use T3 (LeanRAG)

# Multi-hop reasoning
await use_mcp_tool(
    "watercooler_smart_query",
    query="Why did we choose RS256 instead of HS256?",
    force_tier="T3"
)

Prerequisites

T1 (Always Available)

No setup required - uses .graph/nodes.jsonl in threads directory.

T2 (Graphiti)

  1. Enable: WATERCOOLER_GRAPHITI_ENABLED=1
  2. Configure API keys: LLM_API_KEY, EMBEDDING_API_KEY
  3. Start FalkorDB: docker run -d -p 6379:6379 falkordb/falkordb
  4. Index threads: watercooler memory index

T3 (LeanRAG)

  1. Enable: WATERCOOLER_LEANRAG_ENABLED=1
  2. Set path: LEANRAG_PATH=/path/to/leanrag
  3. Same FalkorDB/API keys as T2
  4. Run pipeline: watercooler_leanrag_run_pipeline
