What is CEMS?
CEMS (Continuous Evolving Memory System) gives AI coding assistants persistent memory across sessions. Instead of starting from scratch every time, your AI assistant remembers your preferences, project conventions, past decisions, and learned patterns.

Quick Start
Get from zero to working CEMS in 5 minutes
Installation
All installation methods and configuration options
Server Deployment
Deploy CEMS server for team usage
API Reference
Complete API and CLI documentation
Key Features
Semantic Search
Find memories using natural language, not just keywords. Powered by pgvector and embeddings.
Project-Scoped
Memories automatically boost relevance for the project they were created in.
Multi-IDE Support
Works with Claude Code, Cursor, Codex, Goose, and any MCP-compatible agent.
Auto-Learning
Session-end hooks extract learnings automatically. An observer daemon watches transcripts.
Scheduled Maintenance
Nightly consolidation, weekly summarization, monthly re-indexing — all automatic.
Team Memory
Share conventions and decisions with your team using shared memory scope.
How It Works
CEMS integrates into your IDE workflow through hooks and MCP tools.

Memory Lifecycle
- Memory Injection — On every prompt, relevant memories are searched and injected as context
- Session Learning — On session end, learnings are extracted and stored
- Observational Memory — The observer daemon watches session transcripts and extracts high-level observations
- Scheduled Maintenance — Nightly/weekly/monthly jobs deduplicate, compress, and prune memories automatically
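The injection step above can be sketched in a few lines. This is an illustrative sketch only, not CEMS internals: the function name `inject_memories`, the `search` callback, and the rough 4-characters-per-token estimate are all assumptions made for the example.

```python
def inject_memories(prompt, search, token_budget=2000):
    """Sketch of prompt-time memory injection: search for relevant
    memories, then prepend as many as fit within a token budget.
    Token cost is approximated as len(text) / 4 (an assumption)."""
    selected, used = [], 0
    for memory in search(prompt):
        cost = len(memory) // 4 + 1
        if used + cost > token_budget:
            break  # greedy selection: stop once the budget is exhausted
        selected.append(memory)
        used += cost
    if not selected:
        return prompt
    context = "\n".join(f"- {m}" for m in selected)
    return f"<memories>\n{context}\n</memories>\n\n{prompt}"

# Example usage with a stubbed search function
memories = ["User prefers pytest over unittest", "Project uses PostgreSQL"]
result = inject_memories("Add a test for the auth module", lambda q: memories)
```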
Architecture Highlights
Storage: PostgreSQL + pgvector
Everything lives in PostgreSQL with pgvector extension:
- memory_documents — Documents with user/team scoping, categories, tags
- memory_chunks — Chunked content with 1536-dim vector embeddings (HNSW index)
- users/teams — Authentication via bcrypt-hashed API keys
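A minimal DDL sketch of what such a schema could look like with the pgvector extension. Column names and types here are assumptions for illustration, not the actual CEMS schema:

```sql
-- Illustrative sketch only; actual CEMS tables may differ.
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE memory_documents (
    id       BIGSERIAL PRIMARY KEY,
    user_id  BIGINT,
    team_id  BIGINT,          -- NULL for user-scoped memories
    category TEXT,
    tags     TEXT[]
);

CREATE TABLE memory_chunks (
    id          BIGSERIAL PRIMARY KEY,
    document_id BIGINT REFERENCES memory_documents(id),
    content     TEXT NOT NULL,
    embedding   VECTOR(1536)  -- matches text-embedding-3-small
);

-- HNSW index for fast approximate nearest-neighbor search
CREATE INDEX ON memory_chunks USING hnsw (embedding vector_cosine_ops);
```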
Search Pipeline: Multi-Stage Retrieval
CEMS uses a sophisticated retrieval pipeline:
- Query Understanding — LLM routes to vector or hybrid strategy
- Query Synthesis — Expands query into 2-5 search terms
- HyDE — Generates hypothetical ideal answer
- Candidate Retrieval — pgvector HNSW + tsvector BM25
- RRF Fusion — Reciprocal Rank Fusion combines results
- Relevance Filtering — Removes low-confidence results
- Scoring — Time decay, priority boost, project-scoped boost
- Assembly — Greedy selection within token budget
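The RRF fusion step above can be sketched directly from the standard Reciprocal Rank Fusion formula, score(d) = Σ 1/(k + rank(d)). This is a generic illustration, not CEMS's implementation; the constant k = 60 is the commonly used default, assumed here:

```python
def rrf_fuse(rankings, k=60):
    """Combine multiple ranked result lists with Reciprocal Rank Fusion.

    rankings: list of ranked lists of document IDs (best first).
    Returns IDs sorted by fused score, descending.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Each list contributes 1 / (k + rank) for every doc it ranks
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Fuse vector-search and BM25 candidate lists
vector_hits = ["doc_a", "doc_b", "doc_c"]
bm25_hits = ["doc_b", "doc_a", "doc_d"]
fused = rrf_fuse([vector_hits, bm25_hits])
```

Documents ranked highly by both retrievers rise to the top even when their raw scores are on incompatible scales, which is why RRF is a common choice for fusing vector and keyword results.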
Search modes: vector (fast, 0 LLM calls), hybrid (thorough, 3-4 LLM calls), auto (smart routing).

Embeddings: text-embedding-3-small via OpenRouter
- 1536 dimensions
- Batch support for bulk operations
- Configurable backend (OpenRouter by default)
Observer Daemon: Workflow Learning
The cems-observer background process:
- Polls ~/.claude/projects/*/ JSONL transcript files every 30 seconds
- Sends 50KB chunks to the server for observation extraction
- Server uses Gemini 2.5 Flash to extract high-level patterns
- Examples: “User deploys via Coolify”, “Project uses PostgreSQL”
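The chunking step described above might look roughly like the following. This is a hedged sketch: the function name `chunk_transcript` and the exact boundary handling are assumptions, not the daemon's actual code.

```python
CHUNK_BYTES = 50_000  # ~50KB per chunk, per the description above

def chunk_transcript(text, limit=CHUNK_BYTES):
    """Split a transcript into byte-bounded chunks for upload.

    Splits on byte offsets and ignores any multibyte character cut at
    a boundary (acceptable for a sketch; real code would split on
    line boundaries since transcripts are JSONL).
    """
    data = text.encode("utf-8")
    return [
        data[i:i + limit].decode("utf-8", errors="ignore")
        for i in range(0, len(data), limit)
    ]

# A 120KB transcript yields three chunks: 50KB + 50KB + 20KB
chunks = chunk_transcript("x" * 120_000)
```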
Performance
Recall@5: 98%
98% of relevant memories appear in top 5 results
Search Speed
Vector search: <50ms | Hybrid search: <2s
Compatibility
- Claude Code
- Cursor
- Codex
- Goose
- Any MCP Agent
Full integration with 6 hooks, 6 skills, 2 commands:
- Session start: profile + context injection
- User prompt: memory search + observations
- Tool use: pre/post hooks for learning and gate rules
- Session end: learning extraction + observer start
- Pre-compaction: context preservation
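For Claude Code, hooks of this kind are registered in settings.json under the hooks key. The fragment below is an assumption-laden sketch: the `cems hook …` commands are hypothetical placeholders, not documented CEMS CLI invocations.

```json
{
  "hooks": {
    "SessionStart": [
      { "hooks": [{ "type": "command", "command": "cems hook session-start" }] }
    ],
    "UserPromptSubmit": [
      { "hooks": [{ "type": "command", "command": "cems hook prompt" }] }
    ],
    "SessionEnd": [
      { "hooks": [{ "type": "command", "command": "cems hook session-end" }] }
    ]
  }
}
```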
Next Steps
Get Started
Install CEMS client and create your first memory
Deploy Server
Set up CEMS server for your team