
watercooler_smart_query

Execute an intelligent multi-tier memory query. The tool searches across three memory tiers and escalates automatically when a lower tier doesn’t return sufficient results.
Safety: Read-only tool - does not modify any state

Overview

Smart Query orchestrates three memory tiers:
  • T1 (Baseline): JSONL graph with keyword/semantic search (cheapest, no LLM)
  • T2 (Graphiti): FalkorDB temporal graph with hybrid search (medium cost)
  • T3 (LeanRAG): Hierarchical clustering with multi-hop reasoning (expensive)
The orchestrator follows: “Always choose the cheapest tier that can satisfy the query intent.” Escalation happens automatically when results are insufficient.

Memory Tiers

T1: Baseline Graph

  • Storage: JSONL files (.graph/nodes.jsonl)
  • Search: Keyword regex + optional embeddings
  • Cost: Free (no LLM)
  • Latency: Less than 100ms
  • Best for: File lookups, keyword search, entity search

T2: Graphiti Temporal Graph

  • Storage: FalkorDB graph database
  • Search: Hybrid keyword + semantic + temporal
  • Cost: LLM calls for entity extraction (during indexing)
  • Latency: ~1-2s
  • Best for: Temporal queries, relationship queries, “who did what when?”

T3: LeanRAG Clustering

  • Storage: FalkorDB + hierarchical clusters
  • Search: Multi-hop reasoning, cluster navigation
  • Cost: LLM calls for summarization + query
  • Latency: ~5-10s
  • Best for: Complex multi-hop queries, “why did we decide X given Y?”

Parameters

  • query (string, required): Search query (e.g., “What authentication method was implemented?”)
  • code_path (string, default ""): Path to code repository (for T2/T3 database resolution)
  • threads_dir (string, default ""): Path to threads directory (for T1 baseline graph). If empty, attempts to resolve from code_path.
  • max_tiers (integer, default 2): Maximum number of tiers to query (1-3). The default of 2 queries T1 + T2; T3 is skipped unless explicitly enabled.
  • force_tier (string, default null): Force the query to a specific tier (“T1”, “T2”, or “T3”). Disables escalation when set.
  • group_ids (list[string], default null): Optional list of project group_ids to filter results

Return Value

Returns JSON with:
  • query (string): Original query text
  • result_count (integer): Total number of evidence items found
  • tiers_queried (array): Tiers that were queried (e.g., [“T1”, “T2”])
  • primary_tier (string): The tier that provided the best results
  • escalation_reason (string): Why escalation occurred (if applicable)
  • sufficient (boolean): Whether results met the sufficiency criteria
  • message (string): Human-readable summary (e.g., “Found 5 results from T2”)
  • evidence (array): Evidence items from all tiers, each containing:
      • tier: Source tier (T1/T2/T3)
      • id: Evidence UUID
      • content: Evidence text
      • score: Relevance score (0-1)
      • name: Evidence title
      • provenance: Source metadata (topic, entry_id, timestamp)
      • metadata: Additional properties

Usage Examples

Basic Query (Auto-Escalation)

await use_mcp_tool(
    "watercooler_smart_query",
    query="What error handling patterns did we use?",
    code_path="."
)

Force Specific Tier

# Force T2 (Graphiti)
await use_mcp_tool(
    "watercooler_smart_query",
    query="What changed after the auth refactor?",
    force_tier="T2",
    code_path="."
)

Enable T3 (LeanRAG)

# Allow escalation to T3
await use_mcp_tool(
    "watercooler_smart_query",
    query="Why did we choose RS256 given our deployment model?",
    max_tiers=3,
    code_path="."
)

Limit to T1 Only

await use_mcp_tool(
    "watercooler_smart_query",
    query="src/auth/jwt.py",
    max_tiers=1,
    code_path="."
)

Example Output

{
  "query": "What error handling patterns did we use?",
  "result_count": 5,
  "tiers_queried": ["T1", "T2"],
  "primary_tier": "T2",
  "escalation_reason": "Only 2 results from T1 (need 3)",
  "sufficient": true,
  "evidence": [
    {
      "tier": "T1",
      "id": "01ABC...",
      "content": "Implemented try-catch patterns in src/auth/jwt.py",
      "score": 0.85,
      "name": "Error Handling Discussion",
      "provenance": {
        "topic": "feature-auth",
        "entry_id": "01ABC123...",
        "timestamp": "2025-01-15T10:30:00Z"
      },
      "metadata": {"agent": "Claude", "type": "Note"}
    },
    {
      "tier": "T2",
      "id": "01DEF...",
      "content": "Added custom error classes for auth failures",
      "score": 0.92,
      "name": "Custom Error Classes",
      "provenance": {
        "topic": "feature-auth",
        "entry_id": "01DEF456...",
        "timestamp": "2025-01-15T11:00:00Z"
      },
      "metadata": {"agent": "Cursor", "type": "Note"}
    }
  ],
  "message": "Found 5 results from T2"
}
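
Clients typically post-process the evidence array, for example keeping only confident hits and ranking them by relevance. A minimal sketch (the response dict below is abridged from the example above; the 0.6 cutoff mirrors the orchestrator’s documented default score threshold):

```python
# Abridged smart-query response, shaped like the example output above.
response = {
    "query": "What error handling patterns did we use?",
    "evidence": [
        {"tier": "T1", "score": 0.85, "name": "Error Handling Discussion"},
        {"tier": "T2", "score": 0.92, "name": "Custom Error Classes"},
        {"tier": "T1", "score": 0.41, "name": "Low-relevance Note"},
    ],
}

def top_evidence(response, min_score=0.6):
    """Return evidence items at or above min_score, best first."""
    hits = [e for e in response["evidence"] if e["score"] >= min_score]
    return sorted(hits, key=lambda e: e["score"], reverse=True)

ranked = top_evidence(response)
```

Here `ranked` keeps the two items scoring above 0.6, with the T2 hit (0.92) first.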

Escalation Logic

The orchestrator escalates when:
  1. Too few results: Fewer than min_results (default: 3)
  2. Low confidence: All scores below threshold (default: 0.6)
  3. Query complexity: Natural language questions escalate to semantic tiers
Escalation stops when:
  • Sufficient results found
  • max_tiers reached
  • Higher tier unavailable
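
The rules above can be sketched as a loop. This is an illustrative model, not the actual orchestrator code: the per-tier search backend is stubbed out, and the constants follow the documented defaults (min_results = 3, score threshold 0.6).

```python
MIN_RESULTS = 3        # documented default for sufficiency
SCORE_THRESHOLD = 0.6  # documented default confidence cutoff

def sufficient(results):
    """Enough results, and not all of them low-confidence."""
    if len(results) < MIN_RESULTS:
        return False
    return any(r["score"] >= SCORE_THRESHOLD for r in results)

def smart_query(query, query_tier, max_tiers=2):
    """Walk tiers T1..T{max_tiers}, stopping at the first sufficient set."""
    evidence, tiers_queried = [], []
    for n in range(1, max_tiers + 1):
        tier = f"T{n}"
        tiers_queried.append(tier)
        evidence.extend(query_tier(tier, query))
        if sufficient(evidence):
            break
    return {
        "evidence": evidence,
        "tiers_queried": tiers_queried,
        "sufficient": sufficient(evidence),
    }

# Demo with a stub backend: T1 returns 2 weak hits, so the loop
# escalates to T2, which adds 3 strong ones.
def stub_backend(tier, query):
    if tier == "T1":
        return [{"score": 0.5}, {"score": 0.4}]
    return [{"score": 0.9}, {"score": 0.7}, {"score": 0.8}]

result = smart_query("error handling", stub_backend, max_tiers=2)
```

With the stub backend, T1 alone fails the min-results check, so the loop queries T2 and stops once the combined evidence is sufficient.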

Configuration

Environment variables control tier availability:
# Enable/disable tiers (default: T1=1, T2=0, T3=0)
WATERCOOLER_TIER_T1_ENABLED=1
WATERCOOLER_TIER_T2_ENABLED=1
WATERCOOLER_TIER_T3_ENABLED=0  # Opt-in (expensive)

# Max tiers to query (default: 2)
WATERCOOLER_TIER_MAX_TIERS=2

# Min results for sufficiency (default: 3)
WATERCOOLER_TIER_MIN_RESULTS=3
Or via config.toml:
[memory.tiers]
t1_enabled = true
t2_enabled = true
t3_enabled = false
max_tiers = 2
min_results = 3
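
A hedged sketch of how the environment variables above could be merged with the documented defaults. The resolution order here (env var overrides the built-in default) is an assumption; consult the actual config loader for how config.toml settings interact with the environment.

```python
import os

# Documented defaults: T1 on, T2/T3 off, max_tiers=2, min_results=3.
DEFAULTS = {
    "WATERCOOLER_TIER_T1_ENABLED": "1",
    "WATERCOOLER_TIER_T2_ENABLED": "0",
    "WATERCOOLER_TIER_T3_ENABLED": "0",
    "WATERCOOLER_TIER_MAX_TIERS": "2",
    "WATERCOOLER_TIER_MIN_RESULTS": "3",
}

def tier_config(env=os.environ):
    """Merge environment overrides onto the documented defaults."""
    raw = {k: env.get(k, v) for k, v in DEFAULTS.items()}
    return {
        "t1_enabled": raw["WATERCOOLER_TIER_T1_ENABLED"] == "1",
        "t2_enabled": raw["WATERCOOLER_TIER_T2_ENABLED"] == "1",
        "t3_enabled": raw["WATERCOOLER_TIER_T3_ENABLED"] == "1",
        "max_tiers": int(raw["WATERCOOLER_TIER_MAX_TIERS"]),
        "min_results": int(raw["WATERCOOLER_TIER_MIN_RESULTS"]),
    }
```

For example, with no variables set, `tier_config(env={})` yields T1 enabled, T2/T3 disabled, `max_tiers=2`, and `min_results=3`.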

When to Use Each Tier

Use T1 (Baseline)

# Simple keyword lookups
await use_mcp_tool(
    "watercooler_smart_query",
    query="src/auth/jwt.py",
    max_tiers=1
)

Use T2 (Graphiti)

# Temporal/relationship queries
await use_mcp_tool(
    "watercooler_smart_query",
    query="Who worked on authentication after Jan 15?",
    force_tier="T2"
)

Use T3 (LeanRAG)

# Multi-hop reasoning
await use_mcp_tool(
    "watercooler_smart_query",
    query="Why did we choose RS256 instead of HS256?",
    force_tier="T3"
)

Prerequisites

T1 (Always Available)

No setup required - uses .graph/nodes.jsonl in threads directory.

T2 (Graphiti)

  1. Enable: WATERCOOLER_GRAPHITI_ENABLED=1
  2. Configure API keys: LLM_API_KEY, EMBEDDING_API_KEY
  3. Start FalkorDB: docker run -d -p 6379:6379 falkordb/falkordb
  4. Index threads: watercooler memory index

T3 (LeanRAG)

  1. Enable: WATERCOOLER_LEANRAG_ENABLED=1
  2. Set path: LEANRAG_PATH=/path/to/leanrag
  3. Same FalkorDB/API keys as T2
  4. Run pipeline: watercooler_leanrag_run_pipeline
