What is CEMS?
CEMS (Continuous Evolving Memory System) gives AI coding assistants persistent memory across sessions. Instead of starting from scratch every time, your AI assistant remembers your preferences, project conventions, past decisions, and learned patterns.

Quick Start
Get from zero to working CEMS in 5 minutes
Installation
All installation methods and configuration options
Server Deployment
Deploy CEMS server for team usage
API Reference
Complete API and CLI documentation
Key Features
Semantic Search
Find memories using natural language, not just keywords. Powered by pgvector and embeddings.
Project-Scoped
Memories automatically boost relevance for the project they were created in.
Multi-IDE Support
Works with Claude Code, Cursor, Codex, Goose, and any MCP-compatible agent.
Auto-Learning
Session-end hooks extract learnings automatically. An observer daemon watches transcripts.
Scheduled Maintenance
Nightly consolidation, weekly summarization, monthly re-indexing — all automatic.
Team Memory
Share conventions and decisions with your team using shared memory scope.
How It Works
CEMS integrates into your IDE workflow through hooks and MCP tools.

Memory Lifecycle
- Memory Injection — On every prompt, relevant memories are searched and injected as context
- Session Learning — On session end, learnings are extracted and stored
- Observational Memory — The observer daemon watches session transcripts and extracts high-level observations
- Scheduled Maintenance — Nightly/weekly/monthly jobs deduplicate, compress, and prune memories automatically
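The injection step above can be sketched in a few lines. This is an illustrative sketch only, not CEMS internals: the function name `inject_memories`, the `search` callback, and the rough 4-characters-per-token estimate are all assumptions made for the example.

```python
def inject_memories(prompt, search, token_budget=2000):
    """Sketch of prompt-time memory injection: search for relevant
    memories, then prepend as many as fit within a token budget.
    Token cost is approximated as len(text) / 4 (an assumption)."""
    selected, used = [], 0
    for memory in search(prompt):
        cost = len(memory) // 4 + 1
        if used + cost > token_budget:
            break  # greedy selection: stop once the budget is exhausted
        selected.append(memory)
        used += cost
    if not selected:
        return prompt
    context = "\n".join(f"- {m}" for m in selected)
    return f"<memories>\n{context}\n</memories>\n\n{prompt}"

# Example usage with a stubbed search function
memories = ["User prefers pytest over unittest", "Project uses PostgreSQL"]
result = inject_memories("Add a test for the auth module", lambda q: memories)
```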
Architecture Highlights
Storage: PostgreSQL + pgvector
Everything lives in PostgreSQL with pgvector extension:
- memory_documents — Documents with user/team scoping, categories, tags
- memory_chunks — Chunked content with 1536-dim vector embeddings (HNSW index)
- users/teams — Authentication via bcrypt-hashed API keys
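A minimal DDL sketch of what such a schema could look like with the pgvector extension. Column names and types here are assumptions for illustration, not the actual CEMS schema:

```sql
-- Illustrative sketch only; actual CEMS tables may differ.
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE memory_documents (
    id       BIGSERIAL PRIMARY KEY,
    user_id  BIGINT,
    team_id  BIGINT,          -- NULL for user-scoped memories
    category TEXT,
    tags     TEXT[]
);

CREATE TABLE memory_chunks (
    id          BIGSERIAL PRIMARY KEY,
    document_id BIGINT REFERENCES memory_documents(id),
    content     TEXT NOT NULL,
    embedding   VECTOR(1536)  -- matches text-embedding-3-small
);

-- HNSW index for fast approximate nearest-neighbor search
CREATE INDEX ON memory_chunks USING hnsw (embedding vector_cosine_ops);
```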
Search Pipeline: Multi-Stage Retrieval
CEMS uses a sophisticated retrieval pipeline:
- Query Understanding — LLM routes to vector or hybrid strategy
- Query Synthesis — Expands query into 2-5 search terms
- HyDE — Generates hypothetical ideal answer
- Candidate Retrieval — pgvector HNSW + tsvector BM25
- RRF Fusion — Reciprocal Rank Fusion combines results
- Relevance Filtering — Removes low-confidence results
- Scoring — Time decay, priority boost, project-scoped boost
- Assembly — Greedy selection within token budget
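The RRF fusion step above can be sketched directly from the standard Reciprocal Rank Fusion formula, score(d) = Σ 1/(k + rank(d)). This is a generic illustration, not CEMS's implementation; the constant k = 60 is the commonly used default, assumed here:

```python
def rrf_fuse(rankings, k=60):
    """Combine multiple ranked result lists with Reciprocal Rank Fusion.

    rankings: list of ranked lists of document IDs (best first).
    Returns IDs sorted by fused score, descending.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Each list contributes 1 / (k + rank) for every doc it ranks
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Fuse vector-search and BM25 candidate lists
vector_hits = ["doc_a", "doc_b", "doc_c"]
bm25_hits = ["doc_b", "doc_a", "doc_d"]
fused = rrf_fuse([vector_hits, bm25_hits])
```

Documents ranked highly by both retrievers rise to the top even when their raw scores are on incompatible scales, which is why RRF is a common choice for fusing vector and keyword results.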
Search modes: vector (fast, 0 LLM calls), hybrid (thorough, 3-4 LLM calls), auto (smart routing).

Embeddings: text-embedding-3-small via OpenRouter
- 1536 dimensions
- Batch support for bulk operations
- Configurable backend (OpenRouter by default)
Observer Daemon: Workflow Learning
The cems-observer background process:
- Polls ~/.claude/projects/*/ JSONL transcript files every 30 seconds
- Sends 50KB chunks to the server for observation extraction
- Server uses Gemini 2.5 Flash to extract high-level patterns
- Examples: “User deploys via Coolify”, “Project uses PostgreSQL”
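The chunking step described above might look roughly like the following. This is a hedged sketch: the function name `chunk_transcript` and the exact boundary handling are assumptions, not the daemon's actual code.

```python
CHUNK_BYTES = 50_000  # ~50KB per chunk, per the description above

def chunk_transcript(text, limit=CHUNK_BYTES):
    """Split a transcript into byte-bounded chunks for upload.

    Splits on byte offsets and ignores any multibyte character cut at
    a boundary (acceptable for a sketch; real code would split on
    line boundaries since transcripts are JSONL).
    """
    data = text.encode("utf-8")
    return [
        data[i:i + limit].decode("utf-8", errors="ignore")
        for i in range(0, len(data), limit)
    ]

# A 120KB transcript yields three chunks: 50KB + 50KB + 20KB
chunks = chunk_transcript("x" * 120_000)
```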
Performance
Recall@5: 98%
98% of relevant memories appear in top 5 results
Search Speed
Vector search: <50ms | Hybrid search: <2s
Compatibility
- Claude Code
- Cursor
- Codex
- Goose
- Any MCP Agent
Full integration with 6 hooks, 6 skills, 2 commands:
- Session start: profile + context injection
- User prompt: memory search + observations
- Tool use: pre/post hooks for learning and gate rules
- Session end: learning extraction + observer start
- Pre-compaction: context preservation
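For Claude Code, hooks of this kind are registered in settings.json under the hooks key. The fragment below is an assumption-laden sketch: the `cems hook …` commands are hypothetical placeholders, not documented CEMS CLI invocations.

```json
{
  "hooks": {
    "SessionStart": [
      { "hooks": [{ "type": "command", "command": "cems hook session-start" }] }
    ],
    "UserPromptSubmit": [
      { "hooks": [{ "type": "command", "command": "cems hook prompt" }] }
    ],
    "SessionEnd": [
      { "hooks": [{ "type": "command", "command": "cems hook session-end" }] }
    ]
  }
}
```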
Next Steps
Get Started
Install CEMS client and create your first memory
Deploy Server
Set up CEMS server for your team