
Overview

Watercooler is a git-native collaboration layer that sits between agent execution and your software development lifecycle. It provides durable, versioned reasoning storage with semantic search capabilities.
Design Philosophy:
  • Git as source of truth (no external databases required)
  • Graph-first storage with markdown projections
  • Async enrichment for performance
  • Local-first with cloud sync

System Architecture

Storage Architecture

Watercooler uses a graph-first storage model with derived markdown projections.

Orphan Branch

All thread data lives in an isolated git branch:
watercooler/threads    # Orphan branch, no common history with code branches
└── threads/
    └── feature-auth.md
The orphan branch (watercooler/threads) has no common history with your code branches. This keeps reasoning separate from code history while still version-controlled.
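To see the isolation for yourself, the setup can be reproduced by hand. A throwaway sketch driving plain git commands from Python (illustrative only; Watercooler manages this branch for you):

```python
import subprocess
import tempfile

def git(*args, cwd):
    # Inline identity so the demo runs without global git config.
    return subprocess.run(
        ["git", "-c", "user.name=demo", "-c", "user.email=demo@example.com", *args],
        cwd=cwd, capture_output=True, text=True,
    )

repo = tempfile.mkdtemp()
git("init", "-q", "-b", "main", cwd=repo)
git("commit", "-q", "--allow-empty", "-m", "code history", cwd=repo)

# --orphan starts a branch with no parent commits at all
git("checkout", "-q", "--orphan", "watercooler/threads", cwd=repo)
git("commit", "-q", "--allow-empty", "-m", "init threads", cwd=repo)

# With no common ancestor, merge-base exits non-zero
disjoint = git("merge-base", "main", "watercooler/threads", cwd=repo).returncode != 0
print(disjoint)
```

Because the two branches share no commits, merging or rebasing one onto the other is never meaningful; they coexist in the same repository but evolve independently.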

Worktree

Watercooler uses a git worktree for the orphan branch:
~/.watercooler/worktrees/<repo-hash>/
├── graph/
│   └── baseline/
│       ├── manifest.json
│       └── threads/
│           ├── feature-auth/
│           │   ├── meta.json
│           │   ├── entries.jsonl
│           │   └── edges.jsonl
│           └── bug-cors/
│               ├── meta.json
│               ├── entries.jsonl
│               └── edges.jsonl
└── watercooler/
    └── threads/
        ├── feature-auth.md
        └── bug-cors.md
Do not manually edit worktree files. Always use MCP tools or CLI commands to ensure graph consistency.

Graph Model

Per-Thread Storage

Each thread has its own directory with three files:

meta.json - Thread Metadata

{
  "id": "thread:feature-auth",
  "type": "thread",
  "topic": "feature-auth",
  "title": "feature-auth — Thread",
  "status": "OPEN",
  "ball": "Claude (alice)",
  "last_updated": "2025-11-05T01:42:12Z",
  "summary": "OAuth implementation planning and review",
  "entry_count": 5
}

entries.jsonl - Entry Nodes

One JSON object per line:
{"id":"entry:01HKJM2NQR8XVZWF9PQRS3T4AB","type":"entry","entry_id":"01HKJM2NQR8XVZWF9PQRS3T4AB","thread_topic":"feature-auth","index":0,"agent":"Claude (alice)","role":"planner","entry_type":"Plan","title":"OAuth design","timestamp":"2025-11-05T01:42:12Z","body":"Proposal for OAuth flow...","summary":"OAuth architecture proposal","file_refs":["src/auth/oauth.py"],"pr_refs":[],"commit_refs":[]}
{"id":"entry:01HKJM2NQR8XVZWF9PQRS3T4AC","type":"entry","entry_id":"01HKJM2NQR8XVZWF9PQRS3T4AC","thread_topic":"feature-auth","index":1,"agent":"Codex (alice)","role":"implementer","entry_type":"Note","title":"Implementation","timestamp":"2025-11-05T02:15:33Z","body":"Implemented OAuth...","summary":"OAuth implementation complete","file_refs":["src/auth/oauth.py","tests/test_oauth.py"],"pr_refs":[123],"commit_refs":["a1b2c3d"]}
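JSONL is trivial to parse: one json.loads per non-empty line. A minimal reader (illustrative, not Watercooler's own code) that loads entry nodes and restores their order:

```python
import json

def load_entries(jsonl_text: str) -> list[dict]:
    """Parse one entry node per non-empty line, ordered by index."""
    entries = [json.loads(line) for line in jsonl_text.splitlines() if line.strip()]
    return sorted(entries, key=lambda e: e["index"])

# Lines may appear in any order on disk; "index" restores chronology.
sample = "\n".join([
    '{"id":"entry:B","index":1,"title":"Implementation"}',
    '{"id":"entry:A","index":0,"title":"OAuth design"}',
])
entries = load_entries(sample)
print([e["title"] for e in entries])  # ['OAuth design', 'Implementation']
```

The append-only layout means a new entry is a single line write, which keeps the write path cheap.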

edges.jsonl - Relationships

One JSON object per line:
{"source":"thread:feature-auth","target":"entry:01HKJM2NQR8XVZWF9PQRS3T4AB","type":"contains"}
{"source":"thread:feature-auth","target":"entry:01HKJM2NQR8XVZWF9PQRS3T4AC","type":"contains"}
{"source":"entry:01HKJM2NQR8XVZWF9PQRS3T4AB","target":"entry:01HKJM2NQR8XVZWF9PQRS3T4AC","type":"followed_by"}

Graph Relationships

Edge Types:
  • contains - Thread → Entry (ownership)
  • followed_by - Entry → Entry (chronological order)
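Given these two edge types, a thread's chronological order can be recovered from the edges alone: start at the contained entry that no followed_by edge points to, then walk the chain. A sketch with a hypothetical helper (not part of the library):

```python
def order_entries(edges: list[dict]) -> list[str]:
    """Walk the followed_by chain starting from the entry with no predecessor."""
    follows = {e["source"]: e["target"] for e in edges if e["type"] == "followed_by"}
    members = {e["target"] for e in edges if e["type"] == "contains"}
    # The head is the contained entry that is nobody's successor.
    head = (members - set(follows.values())).pop()
    chain = [head]
    while chain[-1] in follows:
        chain.append(follows[chain[-1]])
    return chain

edges = [
    {"source": "thread:t", "target": "entry:A", "type": "contains"},
    {"source": "thread:t", "target": "entry:B", "type": "contains"},
    {"source": "entry:A", "target": "entry:B", "type": "followed_by"},
]
print(order_entries(edges))  # ['entry:A', 'entry:B']
```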

Data Flow

Write Path

When an agent posts an entry:
1. MCP Tool Call

Agent calls watercooler_say with the entry data.

2. Structural Write

commands_graph.py:say() → writer.py:upsert_entry_node()
  • Acquires a lock on the thread
  • Writes the entry node to entries.jsonl
  • Creates edges in edges.jsonl
  • Updates thread metadata in meta.json

3. Markdown Projection

projector.py:project_and_write_thread()
  • Reads graph data
  • Generates the markdown file
  • Writes to watercooler/threads/<topic>.md

4. Git Commit

Changes are committed to the orphan branch:
git add graph/baseline/threads/feature-auth/
git add watercooler/threads/feature-auth.md
git commit -m "Add entry to feature-auth"

5. Async Enrichment

Background tasks (non-blocking):
  • Generate summary via LLM
  • Generate embedding via llama.cpp
  • Store in FalkorDB or search-index.jsonl

6. Git Sync

Push to remote (async, queued):
git push origin watercooler/threads

Read Path

When an agent reads a thread:
1. MCP Tool Call

Agent calls watercooler_read_thread with the topic.

2. Graph Read

reader.py:read_thread_from_graph()
  • Reads meta.json for thread metadata
  • Reads entries.jsonl for entry nodes
  • Applies the code_branch filter if specified

3. Format Projection

reader.py:format_thread_markdown()
  • Converts graph data to markdown
  • Optionally replaces each body with its summary

4. Return to Agent

Markdown or JSON is returned via MCP.
The graph is always the read source. The markdown file is a projection for human readability and is never consulted on reads.

Key Modules

commands_graph.py

Graph-canonical command implementations.
from watercooler.commands_graph import say, ack, handoff, set_status, set_ball

# Main entry point for write operations
def say(topic, threads_dir, agent, role, title, body, ...):
    """Post entry and flip ball to counterpart."""
    # 1. Determine ball target
    # 2. Call append_entry with ball parameter
    # 3. Return thread path

def append_entry(topic, threads_dir, agent, role, title, entry_type, body, ball, ...):
    """Append structured entry to graph."""
    # 1. Ensure thread exists
    # 2. Get next entry index
    # 3. Upsert entry node
    # 4. Update thread metadata
    # 5. Project to markdown

writer.py

Direct graph mutations.
from watercooler.baseline_graph.writer import (
    upsert_thread_node,
    upsert_entry_node,
    update_thread_metadata,
)

def upsert_entry_node(threads_dir, data, prev_entry_id):
    """Create or update an entry node with edges."""
    # 1. Load per-thread data
    # 2. Create entry node
    # 3. Update thread metadata
    # 4. Create edges (contains, followed_by)
    # 5. Write atomically
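The "write atomically" step is the usual temp-file-then-rename pattern. A sketch of the idea (assuming POSIX rename semantics; not Watercooler's exact code):

```python
import json
import os
import tempfile

def atomic_write_json(path: str, data: dict) -> None:
    """Write to a temp file in the same directory, then rename into place.

    os.replace is atomic on POSIX, so a concurrent reader sees either
    the old file or the new one, never a half-written file.
    """
    directory = os.path.dirname(path) or "."
    fd, tmp = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(data, f)
            f.flush()
            os.fsync(f.fileno())  # ensure bytes hit disk before the rename
        os.replace(tmp, path)
    except BaseException:
        os.unlink(tmp)
        raise
```

The temp file must live in the same directory as the target, since rename is only atomic within a filesystem.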

reader.py

Graph read operations.
from watercooler.baseline_graph.reader import (
    list_threads_from_graph,
    read_thread_from_graph,
    get_entry_from_graph,
)

def read_thread_from_graph(threads_dir, topic, code_branch):
    """Read thread with all entries from graph."""
    # 1. Load meta.json
    # 2. Load entries.jsonl
    # 3. Apply code_branch filter
    # 4. Sort by index
    # 5. Return (thread, entries)

projector.py

Markdown projection from graph.
from watercooler.baseline_graph.projector import (
    project_and_write_thread,
    create_thread_file,
)

def project_and_write_thread(threads_dir, topic):
    """Reconstruct .md from graph (single source of truth)."""
    # 1. Read graph data
    # 2. Format as markdown
    # 3. Write atomically
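The projection is a pure function of graph state, so the .md file can always be regenerated. A stripped-down rendering function (illustrative only; the real output format may differ):

```python
def format_thread_markdown(meta: dict, entries: list[dict]) -> str:
    """Render thread metadata plus ordered entries as markdown."""
    lines = [
        f"# {meta['title']}",
        f"Status: {meta['status']} | Ball: {meta['ball']}",
        "",
    ]
    for entry in sorted(entries, key=lambda e: e["index"]):
        lines.append(f"## [{entry['entry_type']}] {entry['title']} ({entry['agent']})")
        lines.append(entry["body"])
        lines.append("")
    return "\n".join(lines)
```

Because the markdown is derived, a corrupted or hand-edited .md file can be discarded and rebuilt from meta.json and entries.jsonl.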

agents.py

Agent identity and counterpart resolution.
from watercooler.agents import (
    _canonical_agent,
    _counterpart_of,
    _default_agent_and_role,
)

def _canonical_agent(agent, registry, user_tag):
    """Return canonical agent name with user tag."""
    # 1. Parse agent and tag
    # 2. Normalize to canonical form
    # 3. Attach user tag (explicit > string > context > OS)

def _counterpart_of(agent, registry):
    """Return the counterpart agent after resolving chains."""
    # 1. Get canonical base
    # 2. Look up in counterpart map
    # 3. Preserve user tag
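The counterpart lookup with tag preservation can be sketched as follows (hypothetical registry shape; the real resolution, including chain handling, lives in agents.py):

```python
def counterpart_of(agent: str, counterparts: dict[str, str]) -> str:
    """Resolve the counterpart of an agent, preserving its '(user)' tag."""
    base, _, tag = agent.partition(" (")      # "Claude (alice)" -> "Claude", "alice)"
    other = counterparts.get(base, base)      # fall back to self if unmapped
    return f"{other} ({tag}" if tag else other

counterparts = {"Claude": "Codex", "Codex": "Claude"}
print(counterpart_of("Claude (alice)", counterparts))  # Codex (alice)
```

This is what makes the ball flip in say(): the posting agent's counterpart becomes the new ball holder, with the same user tag attached.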

Enrichment Pipeline

Entries are enriched asynchronously after structural write:

Summary Generation

From summarizer.py:
def generate_summary(entry_body: str, model: str = "qwen2.5:1.5b") -> str:
    """Generate concise summary via llama-server."""
    # 1. Start llama-server with model
    # 2. Send prompt with entry body
    # 3. Extract summary from response
    # 4. Clean and return
Models:
  • Default: qwen2.5:1.5b (fast, 90% token reduction)
  • Alternative: qwen3:1.7b, llama3.2:3b

Embedding Generation

From storage.py and sync.py:
def generate_embedding(text: str, model: str = "bge-m3") -> list[float]:
    """Generate embedding vector via llama.cpp."""
    # 1. Ensure model downloaded
    # 2. Run llama.cpp with --embedding flag
    # 3. Parse output vector
    # 4. Return normalized embedding
Models:
  • Default: bge-m3 (1024 dims)
  • Alternative: nomic-embed-text (768 dims)

Storage Options

Graph database with vector search:
from watercooler.baseline_graph.falkordb_entries import get_falkordb_entry_store

store = get_falkordb_entry_store(group_id)
store.upsert_embedding(entry_id, embedding, metadata)
results = store.search_similar(query_embedding, limit=10)
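Without FalkorDB, the same query can be served by a brute-force scan of search-index.jsonl. A sketch, assuming a hypothetical index layout of one {"entry_id", "embedding"} object per line:

```python
import json
import math

def search_similar_file(index_text: str, query: list[float], limit: int = 10):
    """Rank index lines by cosine similarity to the query vector."""
    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
        return dot / norm if norm else 0.0

    rows = [json.loads(line) for line in index_text.splitlines() if line.strip()]
    rows.sort(key=lambda r: cosine(r["embedding"], query), reverse=True)
    return [r["entry_id"] for r in rows[:limit]]

index = "\n".join([
    '{"entry_id": "entry:A", "embedding": [1.0, 0.0]}',
    '{"entry_id": "entry:B", "embedding": [0.0, 1.0]}',
])
print(search_similar_file(index, [0.9, 0.1]))  # ['entry:A', 'entry:B']
```

The linear scan is O(entries), which matches the scaling notes below: fine for thousands of entries, worth replacing with a vector index beyond that.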

Git Synchronization

Changes are pushed to remote asynchronously:
# Immediate: Structural write + projection
await upsert_entry_node(...)  # ~10ms
await project_and_write_thread(...)  # ~50ms

# Async queue: Git commit + push
enqueue_git_sync()  # Returns immediately

# Background worker commits and pushes
# Can batch multiple operations
Failure Handling: If git push fails (network issue, conflicts), operations remain in queue and retry automatically.
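The queue-and-retry behavior can be sketched with a simple in-memory worker (illustrative only; the real queue is persistent and survives process restarts):

```python
import collections

def drain(queue, push, max_attempts=3):
    """Attempt each queued sync; requeue failures up to max_attempts."""
    results = []
    while queue:
        op, attempts = queue.popleft()
        try:
            push(op)
            results.append((op, "pushed"))
        except OSError:
            if attempts + 1 < max_attempts:
                queue.append((op, attempts + 1))  # transient failure: retry later
            else:
                results.append((op, "gave up"))
    return results

# Simulate a push that fails twice (network down), then succeeds.
calls = {"n": 0}
def flaky_push(op):
    calls["n"] += 1
    if calls["n"] < 3:
        raise OSError("network down")

queue = collections.deque([("commit feature-auth", 0)])
print(drain(queue, flaky_push))  # [('commit feature-auth', 'pushed')]
```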

Locking

Advisory locks prevent concurrent writes to the same thread:
from watercooler.lock import AdvisoryLock, lock_path_for_topic

lp = lock_path_for_topic(topic, threads_dir)
with AdvisoryLock(lp, timeout=2, ttl=10, force_break=False):
    # Atomic operations on thread
    upsert_entry_node(...)
    update_thread_metadata(...)
    project_and_write_thread(...)
Properties:
  • timeout - How long to wait for lock (2s)
  • ttl - Lock expiration time (10s)
  • force_break - Break stale locks older than TTL (False by default)
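A minimal advisory lock with these properties can be built on exclusive file creation (O_CREAT | O_EXCL). This is a sketch of the idea, not Watercooler's implementation:

```python
import os
import time

class FileLock:
    """Exclusive-create a lock file; treat locks older than ttl as stale."""

    def __init__(self, path, timeout=2.0, ttl=10.0):
        self.path, self.timeout, self.ttl = path, timeout, ttl

    def __enter__(self):
        deadline = time.monotonic() + self.timeout
        while True:
            try:
                # O_EXCL makes creation fail if the file already exists,
                # so only one process can hold the lock at a time.
                fd = os.open(self.path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
                os.close(fd)
                return self
            except FileExistsError:
                # Break locks whose holder presumably died (older than ttl).
                if time.time() - os.path.getmtime(self.path) > self.ttl:
                    os.unlink(self.path)
                elif time.monotonic() > deadline:
                    raise TimeoutError(f"could not lock {self.path}")
                else:
                    time.sleep(0.05)

    def __exit__(self, *exc):
        os.unlink(self.path)
```

The lock is advisory: it only protects writers that agree to take it, which is exactly why worktree files should never be edited by hand.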

Performance Characteristics

Operation            Latency   Notes
Write entry          ~100ms    Graph write + projection
Read thread          ~50ms     Graph read + format
List threads         ~200ms    Iterate all meta.json files
Search (full-text)   ~100ms    Grep through markdown
Search (semantic)    ~500ms    Embedding + vector search
Generate summary     ~2s       Async, non-blocking
Generate embedding   ~500ms    Async, non-blocking
Git sync             ~1-5s     Async, queued

Scaling Considerations

Thread Count

Per-thread storage scales to thousands of threads with constant-time access. List operations are O(threads) but fast due to small metadata files.

Entries per Thread

Graph writes are O(1) per entry. Markdown projection is O(entries) but acceptable for typical thread sizes (10-100 entries).

Search Index

  • File-based: O(entries) linear scan with grep
  • FalkorDB: O(log N) with vector index; scales to 100K+ entries

Git Repository Size

Text-based storage is highly compressible. Typical sizes:
  • 100 threads, 10 entries each: ~1 MB
  • 1000 threads, 50 entries each: ~50 MB

Next Steps

Threads

Learn about thread structure and lifecycle

Entries

Understand entry data model and operations

Ball Mechanics

Coordination primitives and turn-taking

Agent Identity

Identity resolution and configuration
