Architecture Overview
Graph-First Design
All thread operations follow a graph-first approach:
- Write to Graph: Data is written to nodes.jsonl and edges.jsonl first
- Project to Markdown: The .md file is generated from the graph
- Enrich: Summaries and embeddings are added asynchronously
- Query: Search across threads using keywords, semantics, or filters
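The write-then-project order above can be illustrated with a minimal Python sketch. This is not Watercooler's actual implementation: the record fields (`id`, `type`, `title`, `body`, `source`, `target`) and the helper names are assumptions chosen for illustration. Only the file names nodes.jsonl and edges.jsonl come from the text above.

```python
import json
import tempfile
from pathlib import Path


def append_jsonl(path: Path, record: dict) -> None:
    """Append one JSON object as a single line."""
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")


def read_jsonl(path: Path) -> list:
    """Read every non-empty line of a JSONL file."""
    with path.open(encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]


def project_markdown(thread_dir: Path, topic: str) -> Path:
    """Render the thread's .md file purely from nodes.jsonl."""
    lines = [f"# {topic}", ""]
    for node in read_jsonl(thread_dir / "nodes.jsonl"):
        lines += [f"## {node['title']}", "", node["body"], ""]
    md_path = thread_dir / f"{topic}.md"
    md_path.write_text("\n".join(lines), encoding="utf-8")
    return md_path


def say(thread_dir: Path, topic: str, entry_id: str, title: str, body: str) -> None:
    """Graph-first write: record the node and edge, then regenerate the markdown."""
    thread_dir.mkdir(parents=True, exist_ok=True)
    # Step 1: write to the graph files first (hypothetical record layout).
    append_jsonl(thread_dir / "nodes.jsonl",
                 {"id": entry_id, "type": "entry", "title": title, "body": body})
    append_jsonl(thread_dir / "edges.jsonl",
                 {"type": "contains", "source": topic, "target": entry_id})
    # Step 2: project the markdown from the graph, never the other way round.
    project_markdown(thread_dir, topic)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        thread = Path(tmp) / "api-design"
        say(thread, "api-design", "e1", "REST vs GraphQL analysis",
            "Evaluated both approaches.")
        print((thread / "api-design.md").read_text(encoding="utf-8"))
```

Because the markdown file is always regenerated from the graph files, it can be deleted and rebuilt without losing data, which is the property the graph-first design relies on.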
Storage Format
Threads are stored in per-thread JSONL format:
Building Memory Graphs
watercooler init-thread api-design
watercooler say api-design \
--title "REST vs GraphQL analysis" \
--body "Evaluated both approaches. GraphQL provides better flexibility for mobile clients."
watercooler say api-design \
--title "Decision: GraphQL" \
--type Decision \
--body "Moving forward with GraphQL. Will use Apollo Server with TypeScript."
The graph is built automatically when you use graph-canonical commands (say, ack, handoff). To manually build/rebuild:
Graph Pipeline Options
Configuration
The pipeline supports various modes:
- Full (Default)
- Incremental
- Extractive Only
- No Embeddings
Full mode (the default):
- Uses local LLM for summaries (llama-server)
- Generates embeddings
- Processes all threads
Test Mode
Process only a subset of threads:
Fresh Build
Clear cache and rebuild everything:
Querying the Graph
Keyword Search
Search entry bodies, titles, and summaries:
Semantic Search
Find conceptually similar content using embeddings:
- Uses cosine similarity on embeddings
- Finds conceptually related content
- Works even with different wording
- Requires embeddings (run build without --skip-embeddings)
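To make the cosine-similarity ranking concrete, here is a small self-contained sketch. It assumes each entry is a dict with an optional `embedding` list; that shape and the function names are hypothetical and do not describe Watercooler's internal API. Entries without an embedding are skipped, which mirrors the requirement that embeddings be built before semantic search can return results.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector has no direction, so treat it as unrelated
    return dot / (norm_a * norm_b)


def semantic_search(query_vec, entries, limit=10):
    """Return (score, entry) pairs, best match first."""
    scored = [
        (cosine_similarity(query_vec, entry["embedding"]), entry)
        for entry in entries
        if entry.get("embedding")  # entries without embeddings cannot be ranked
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:limit]


if __name__ == "__main__":
    # Toy 2-dimensional vectors; real embeddings have far more dimensions.
    entries = [
        {"id": "a", "embedding": [1.0, 0.0]},
        {"id": "b", "embedding": [0.0, 1.0]},
        {"id": "c", "embedding": [1.0, 1.0]},
        {"id": "d"},  # no embedding yet
    ]
    for score, entry in semantic_search([1.0, 0.0], entries, limit=2):
        print(entry["id"], round(score, 3))
```

Cosine similarity depends only on vector direction, not magnitude, which is why entries with different wording but similar meaning can still score highly.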
Filters
Combine search with filters:
- --thread-topic: Filter by specific thread
- --thread-status: OPEN, CLOSED, BLOCKED, ABANDONED
- --role: planner, implementer, critic, tester, pm, scribe
- --entry-type: Note, Plan, Decision, PR, Closure
- --agent: Filter by agent name
- --start-time: ISO timestamp (entries after)
- --end-time: ISO timestamp (entries before)
- --limit: Max results (default: 10)
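Conceptually, the filters narrow a result set before the limit is applied. The sketch below shows one way that could work in Python. The entry field names (`thread_topic`, `role`, `entry_type`, `agent`, `timestamp`) are assumptions that mirror the flag names, the time bounds are treated as inclusive (the text does not say which), and `--thread-status` is left out because it describes the thread rather than an individual entry.

```python
from datetime import datetime


def filter_entries(entries, *, thread_topic=None, role=None, entry_type=None,
                   agent=None, start_time=None, end_time=None, limit=10):
    """Keep entries that match every supplied filter, up to `limit` results."""
    exact = {"thread_topic": thread_topic, "role": role,
             "entry_type": entry_type, "agent": agent}
    start = datetime.fromisoformat(start_time) if start_time else None
    end = datetime.fromisoformat(end_time) if end_time else None

    results = []
    for entry in entries:
        # Exact-match filters: skip the entry if any supplied value differs.
        if any(want is not None and entry.get(field) != want
               for field, want in exact.items()):
            continue
        # Time window (assumed inclusive on both ends).
        ts = datetime.fromisoformat(entry["timestamp"])
        if start is not None and ts < start:
            continue
        if end is not None and ts > end:
            continue
        results.append(entry)
        if len(results) >= limit:
            break
    return results


if __name__ == "__main__":
    sample = [
        {"id": "e1", "thread_topic": "api-design", "role": "planner",
         "entry_type": "Note", "agent": "agent-a",
         "timestamp": "2025-01-10T09:00:00+00:00"},
        {"id": "e2", "thread_topic": "api-design", "role": "planner",
         "entry_type": "Decision", "agent": "agent-a",
         "timestamp": "2025-01-12T09:00:00+00:00"},
    ]
    print(filter_entries(sample, entry_type="Decision"))
```

All timestamps in one call should either carry a UTC offset or omit it consistently, since Python refuses to compare offset-aware and naive datetimes.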
Similar Entries
Find entries similar to a specific entry:
Graph Structure
Node Types
Thread Node (meta.json)
Entry Node (entries.jsonl)
Edge Types
Contains Edge
Followed By Edge
References Edge (Cross-Thread)
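As a rough illustration of how the three edge types relate, the sketch below derives them from an ordered list of entries. The edge record layout (`type`, `source`, `target`) and the rule that a cross-thread reference is detected by finding another thread's topic name in an entry body are assumptions made for this example, not the documented schema.

```python
def build_edges(thread_topic, entries, known_topics):
    """Derive contains, followed_by, and references edges for one thread.

    `entries` must already be in chronological order.
    """
    edges = []
    for index, entry in enumerate(entries):
        # Contains: the thread node owns each of its entries.
        edges.append({"type": "contains",
                      "source": thread_topic, "target": entry["id"]})
        # Followed By: each entry points at its immediate successor.
        if index + 1 < len(entries):
            edges.append({"type": "followed_by",
                          "source": entry["id"],
                          "target": entries[index + 1]["id"]})
        # References: the entry body mentions another known thread topic.
        for other in sorted(known_topics):
            if other != thread_topic and other in entry["body"]:
                edges.append({"type": "references",
                              "source": entry["id"], "target": other})
    return edges


if __name__ == "__main__":
    entries = [
        {"id": "e1", "body": "Evaluated both approaches."},
        {"id": "e2", "body": "Moving forward with GraphQL. See auth-flow for tokens."},
    ]
    for edge in build_edges("api-design", entries, {"api-design", "auth-flow"}):
        print(edge)
```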
Enrichment
Summaries
LLM-generated summaries provide:
- Concise entry overviews (1-2 sentences)
- Thread-level summaries (key decisions and outcomes)
- Extractive fallback (no LLM required)
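An extractive fallback can be as simple as keeping the first sentence or two of an entry body. The sketch below shows that idea only. It is an assumption about what "extractive" means here, and the function name is hypothetical; the actual fallback may select sentences differently.

```python
import re


def extractive_summary(body: str, max_sentences: int = 2) -> str:
    """Return the leading sentences of `body` without calling an LLM."""
    # Split after sentence-ending punctuation that is followed by whitespace.
    parts = re.split(r"(?<=[.!?])\s+", body.strip())
    sentences = [part.strip() for part in parts if part.strip()]
    return " ".join(sentences[:max_sentences])


if __name__ == "__main__":
    text = ("Evaluated both approaches. GraphQL provides better flexibility "
            "for mobile clients. REST remains an option for public endpoints.")
    print(extractive_summary(text))
```

This naive splitter treats abbreviations such as "e.g." as sentence boundaries, which is acceptable for a quick development build but worth knowing about.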
Embeddings
Vector embeddings enable:
- Semantic search
- Similar entry discovery
- Conceptual clustering
- Cross-thread relationship detection
text-embedding-3-small)
Dimension: 1536 (default)
Configuration
Configure LLM and embedding services via environment variables:
Or via the config file (~/.watercooler/config.toml):
Graph Operations
List Threads
List Entries
Read Entry
Export Thread
Project a thread back to markdown from graph:
Reconcile Graph
Fix inconsistencies between markdown and graph:
Performance
Incremental Builds
Incremental mode dramatically speeds up subsequent builds:
- Track thread modification times
- Cache summaries and embeddings
- Only reprocess changed threads
- Store state in graph/baseline/state.json
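The change-tracking idea behind incremental builds can be sketched in a few lines of Python. The state layout used here (a JSON object mapping file path to modification time in nanoseconds) is an assumption for illustration; the real contents of state.json are not described in this document.

```python
import json
import os
import tempfile
from pathlib import Path


def load_state(state_path: Path) -> dict:
    """Load the saved path -> mtime map, or an empty map on the first build."""
    if state_path.exists():
        return json.loads(state_path.read_text(encoding="utf-8"))
    return {}


def changed_threads(thread_files, state):
    """Return the files whose modification time differs from the saved state."""
    return [p for p in thread_files
            if state.get(str(p)) != p.stat().st_mtime_ns]


def save_state(state_path: Path, thread_files) -> None:
    """Record current modification times so the next build can skip them."""
    state_path.parent.mkdir(parents=True, exist_ok=True)
    state = {str(p): p.stat().st_mtime_ns for p in thread_files}
    state_path.write_text(json.dumps(state, indent=2), encoding="utf-8")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        files = [root / "api-design.jsonl", root / "auth-flow.jsonl"]
        for f in files:
            f.write_text("{}\n", encoding="utf-8")
        state_path = root / "graph" / "baseline" / "state.json"

        print("first build:", len(changed_threads(files, load_state(state_path))))
        save_state(state_path, files)

        # Simulate an edit by moving one file's mtime forward by a minute.
        st = files[0].stat()
        os.utime(files[0], ns=(st.st_atime_ns, st.st_mtime_ns + 60 * 10**9))
        print("second build:",
              [p.name for p in changed_threads(files, load_state(state_path))])
```

Only the files returned by `changed_threads` would need their summaries and embeddings recomputed; everything else can be served from the cache.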
Parallelization
The pipeline uses parallel workers for LLM calls:
Optimization Tips
- Use incremental mode for frequent builds
- Skip embeddings if you don’t need semantic search
- Use extractive summaries for faster builds during development
- Increase workers if you have available CPU/memory
- Use test-limit for development/testing
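To see why the worker count matters, consider this minimal sketch of fanning summary requests out over a thread pool. The `summarize_stub` function stands in for a real LLM call and is purely hypothetical, as is the default of four workers. Threads suit this workload because most of the time is spent waiting on the LLM server rather than computing.

```python
from concurrent.futures import ThreadPoolExecutor


def summarize_stub(body: str) -> str:
    """Placeholder for an LLM request: returns the first sentence."""
    return body.split(".")[0].strip() + "."


def summarize_all(bodies, workers=4, summarize=summarize_stub):
    """Summarize every body concurrently, preserving input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map yields results in the same order as the inputs.
        return list(pool.map(summarize, bodies))


if __name__ == "__main__":
    bodies = [
        "Evaluated both approaches. GraphQL is more flexible.",
        "Moving forward with GraphQL. Will use Apollo Server with TypeScript.",
    ]
    for summary in summarize_all(bodies, workers=2):
        print(summary)
```

Raising the worker count helps only until the LLM server itself becomes the bottleneck, so it is worth increasing gradually.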
Integration with FalkorDB
For advanced graph queries, export to FalkorDB. This enables:
- Cypher graph queries
- Path traversal
- Complex relationship queries
- Graph visualizations
Best Practices
1. Build Graphs Regularly
Run incremental builds after batch updates:
2. Use Semantic Search for Exploration
Use keyword search for precise matches and semantic search for discovery:
3. Archive Closed Threads
Exclude closed threads from active builds:
4. Monitor Graph Health
Check for issues:
5. Cross-Reference Related Work
Mention related threads in entries:
Troubleshooting
Graph Not Available
If commands fail with “graph not yet built”:
Stale Graph
If graph is out of sync with markdown:
Missing Embeddings
If semantic search returns no results:
LLM Connection Issues
Verify LLM server is running:
Next Steps
Multi-Agent Workflows
Design collaborative workflows that populate memory graphs
Branch Pairing
Sync graphs with Git branches