
watercooler_memory_sync

Sync threads to memory backends (Graphiti T2, LeanRAG T3) for semantic search and knowledge graph features.
Mutating tool: Triggers background indexing to memory backends

Overview

Memory sync indexes thread content into:
  • T2 (Graphiti): Temporal knowledge graph with entities and relationships
  • T3 (LeanRAG): Hierarchical clustering with multi-hop reasoning
This enables:
  • Semantic search across all threads
  • Entity and relationship queries
  • Temporal “what changed when?” queries
  • Multi-hop “why did we X given Y?” reasoning

Automatic vs Manual Sync

Automatic Sync

By default, threads are synced automatically:
  • After every say/ack/handoff: Entry is queued for indexing
  • Fire-and-forget: Indexing happens in background, doesn’t block tools
  • Retry logic: Failed indexing is retried automatically

Manual Sync

Use this tool to:
  • Bulk index: Index many threads at once
  • Force refresh: Re-index threads with updated config
  • Recovery: Index threads that failed automatic sync

Parameters

topics
list[string]
default: null
Specific thread topics to sync. If null, syncs all threads.
backend
string
default: "all"
Which backend to sync to. Options:
  • "all" - Sync to all enabled backends
  • "graphiti" - T2 only
  • "leanrag" - T3 only
force
boolean
default: false
Force re-indexing even if already indexed.
code_path
string
required
Path to the code repository directory. Resolves:
  • Threads directory location
  • Database name for backends

Return Value

Returns JSON with sync status:
success
boolean
Whether sync was queued successfully
queued
boolean
Whether tasks were queued (true) or executed directly (false)
task_ids
array
List of queued task IDs (if queued)
indexed_count
integer
Number of threads/entries indexed (if direct execution)
message
string
Status message
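A caller can branch on the `queued` field to decide whether to poll for completion or read results immediately. A minimal sketch of handling the return value (the `result` dict mirrors the fields above; the helper name is our own):

```python
def summarize_sync_result(result: dict) -> str:
    """Summarize a watercooler_memory_sync result dict (fields as documented above)."""
    if not result.get("success"):
        return f"sync failed: {result.get('message', 'unknown error')}"
    if result.get("queued"):
        # Background path: task IDs must be polled via watercooler_memory_task_status.
        ids = result.get("task_ids", [])
        return f"queued {len(ids)} task(s); poll for completion"
    # Direct path: indexing already finished synchronously.
    return f"indexed {result.get('indexed_count', 0)} entries"

print(summarize_sync_result(
    {"success": True, "queued": True, "task_ids": ["a", "b"]}
))  # → queued 2 task(s); poll for completion
```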

Usage Examples

Sync All Threads

await use_mcp_tool(
    "watercooler_memory_sync",
    code_path="."
)

Sync Specific Threads

await use_mcp_tool(
    "watercooler_memory_sync",
    topics=["feature-auth", "bug-fix-123"],
    code_path="."
)

Force Re-Index

await use_mcp_tool(
    "watercooler_memory_sync",
    topics=["feature-auth"],
    force=True,
    code_path="."
)

Sync to Specific Backend

# Graphiti only (T2)
await use_mcp_tool(
    "watercooler_memory_sync",
    backend="graphiti",
    code_path="."
)

# LeanRAG only (T3)
await use_mcp_tool(
    "watercooler_memory_sync",
    backend="leanrag",
    code_path="."
)

Example Output

Queued Execution

{
  "success": true,
  "queued": true,
  "task_ids": [
    "01ARZ3NDEKTSV4RRFFQ69G5FAV",
    "01ARZ3NDEKTSV4RRFFQ69G5FAW"
  ],
  "message": "Queued 2 threads for async indexing. Poll with watercooler_memory_task_status."
}

Direct Execution

{
  "success": true,
  "queued": false,
  "indexed_count": 15,
  "execution_time_ms": 8450,
  "message": "Indexed 15 entries across 2 threads"
}

Sync Process

T2 (Graphiti) Indexing

  1. Entry chunking: Split entries into semantic chunks
  2. Entity extraction: LLM extracts entities and relationships
  3. Embedding generation: Chunks embedded with configured model
  4. Graph storage: Nodes and edges written to FalkorDB
  5. Provenance tracking: Entry-to-episode mapping stored
Time: ~30-60s per thread (LLM calls)
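Step 1's chunking can be pictured as packing whole paragraphs into roughly fixed-size pieces. This is only an illustration: the backend's actual chunk size and the LLM-driven steps 2-5 are Graphiti internals, and `chunk_entry` is a name of our own invention.

```python
def chunk_entry(text: str, max_chars: int = 400) -> list[str]:
    """Greedy paragraph-boundary chunking: pack whole paragraphs until max_chars."""
    chunks, current = [], ""
    for para in text.split("\n\n"):
        para = para.strip()
        if not para:
            continue  # skip blank paragraphs
        if current and len(current) + len(para) + 2 > max_chars:
            # Adding this paragraph would overflow the chunk; start a new one.
            chunks.append(current)
            current = para
        else:
            current = f"{current}\n\n{para}" if current else para
    if current:
        chunks.append(current)
    return chunks
```

Chunking on paragraph boundaries keeps each chunk semantically coherent, which matters for the entity extraction and embedding steps that follow.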

T3 (LeanRAG) Indexing

  1. Episode extraction: Load episodes from Graphiti (T2)
  2. Clustering: Hierarchical clustering by semantic similarity
  3. Summarization: LLM generates cluster summaries
  4. Graph storage: Cluster hierarchy written to FalkorDB
  5. Incremental update: Only new episodes processed (if enabled)
Time: ~5-10min for bulk indexing (many LLM calls)
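Step 2's clustering can be illustrated with a toy greedy grouping over embedding vectors using cosine similarity against a threshold; LeanRAG's actual hierarchical algorithm, similarity measure, and thresholds are backend internals, so treat this purely as a sketch.

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def greedy_cluster(vectors: list[list[float]], threshold: float = 0.9) -> list[list[int]]:
    """Assign each vector to the first cluster whose representative is similar enough.

    The first member of each cluster acts as its fixed representative.
    """
    clusters = []  # list of (representative vector, member indices)
    for i, v in enumerate(vectors):
        for rep, members in clusters:
            if cosine(v, rep) >= threshold:
                members.append(i)
                break
        else:
            clusters.append((v, [i]))
    return [members for _, members in clusters]
```

Each resulting cluster would then be handed to the LLM for summarization (step 3).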

Queue vs Direct Execution

Queued Execution (Default)

When memory task queue is available:
  • Tasks are queued for background processing
  • Tool returns immediately
  • Progress tracked via watercooler_memory_task_status
  • Retry logic handles failures

Direct Execution (Fallback)

When queue is unavailable:
  • Indexing runs synchronously
  • Tool blocks until complete
  • Useful for debugging
  • May timeout on large threads (>50 entries)
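The queue-or-direct behavior above can be sketched as a dispatcher that enqueues when a task queue is available and runs synchronously otherwise. The queue interface and function names here are hypothetical, not the server's actual internals:

```python
def sync_threads(topics, queue=None, index_fn=None):
    """Queue indexing tasks if a queue is available, else run index_fn directly."""
    if queue is not None:
        # Queued path: return immediately with task IDs for later polling.
        task_ids = [queue.enqueue(t) for t in topics]
        return {"success": True, "queued": True, "task_ids": task_ids}
    # Fallback path: synchronous indexing; the tool blocks until this returns.
    count = sum(index_fn(t) for t in topics)
    return {"success": True, "queued": False, "indexed_count": count}

class FakeQueue:
    """Stand-in for a memory task queue."""
    def __init__(self):
        self.n = 0
    def enqueue(self, topic):
        self.n += 1
        return f"task-{self.n}"

print(sync_threads(["a", "b"], queue=FakeQueue())["task_ids"])  # → ['task-1', 'task-2']
```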

Monitoring Progress

Check indexing status:
# Get task status
await use_mcp_tool(
    "watercooler_memory_task_status",
    task_id="01ARZ3NDEKTSV4RRFFQ69G5FAV"
)

# Get queue health
await use_mcp_tool(
    "watercooler_memory_task_status",
    action="health"
)
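A queued sync is typically followed by a poll loop until the task reaches a terminal state. A minimal sketch, where `get_status` stands in for a `watercooler_memory_task_status` call and is assumed to return a dict with a `"status"` field:

```python
import time

def wait_for_task(get_status, task_id, timeout_s=300, interval_s=2.0):
    """Poll get_status(task_id) until a terminal status or timeout.

    Terminal statuses ("completed"/"failed") are assumptions for this sketch.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        status = get_status(task_id)
        if status.get("status") in ("completed", "failed"):
            return status
        time.sleep(interval_s)  # back off between polls
    raise TimeoutError(f"task {task_id} did not finish within {timeout_s}s")
```

A fixed interval is fine for indexing tasks that take tens of seconds; for multi-minute T3 bulk runs, a longer interval avoids needless polling.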

Prerequisites

T2 (Graphiti)

# Enable Graphiti
export WATERCOOLER_GRAPHITI_ENABLED=1

# Configure API keys
export LLM_API_KEY="sk-..."
export EMBEDDING_API_KEY="sk-..."

# Start FalkorDB
docker run -d -p 6379:6379 -p 3000:3000 \
  --name falkordb \
  -v falkordb_data:/var/lib/falkordb/data \
  falkordb/falkordb:latest

T3 (LeanRAG)

# Enable LeanRAG
export WATERCOOLER_LEANRAG_ENABLED=1
export LEANRAG_PATH="/path/to/leanrag"

# Same FalkorDB and API keys as T2

Best Practices

Initial Setup

  1. Enable backends: Set environment variables
  2. Start services: Launch FalkorDB
  3. Bulk index: Sync all existing threads
# Index everything
await use_mcp_tool(
    "watercooler_memory_sync",
    code_path="."
)

Ongoing Use

Automatic sync handles new entries - no manual intervention needed.

Recovery

If automatic sync fails:
# Check failed tasks
await use_mcp_tool(
    "watercooler_memory_task_status",
    action="recover"
)

# Re-sync specific threads
await use_mcp_tool(
    "watercooler_memory_sync",
    topics=["failed-thread-1", "failed-thread-2"],
    force=True,
    code_path="."
)

Config Changes

After changing embedding model or LLM:
# Force re-index with new config
await use_mcp_tool(
    "watercooler_memory_sync",
    force=True,
    code_path="."
)

Troubleshooting

Indexing Fails

Cause: API key invalid or quota exceeded
Solution:
# Verify API keys
echo $LLM_API_KEY
echo $EMBEDDING_API_KEY

# Check diagnostics
await use_mcp_tool("watercooler_diagnose_memory")

FalkorDB Connection Error

Cause: FalkorDB not running
Solution:
docker ps | grep falkordb
# If not running:
docker start falkordb

Timeout on Large Threads

Cause: Direct execution timeout (60s tool limit)
Solution:
  • Ensure queue is enabled (automatic)
  • Split large threads into smaller chunks
  • Use CLI: watercooler memory index
