Search memories

Search memories using CEMS’s multi-stage retrieval pipeline with query understanding, HyDE, RRF fusion, and LLM re-ranking.

Endpoint

POST /api/memory/search

Authentication: Required (Bearer token)

Request body

query

string

required

The search query. Maximum 2000 characters; longer queries are automatically truncated.

limit

integer

default:10

Maximum number of results to return

scope

string

default:"both"

Search scope:

personal - Only your personal memories
shared - Only team-shared memories
both - Search both scopes

max_tokens

integer

default:4000

Token budget for assembled results. Results are selected greedily within this budget.

mode

string

default:"vector"

Retrieval mode:

auto - Smart routing based on query analysis
vector - Fast vector search (0 LLM calls)
hybrid - Full pipeline with HyDE + RRF + re-ranking (3-4 LLM calls)

enable_query_synthesis

boolean

default:false

Enable LLM-powered query expansion into 2-5 search terms

enable_hyde

boolean

default:false

Enable HyDE (Hypothetical Document Embeddings) for better semantic matching

enable_rerank

boolean

default:true

Enable LLM re-ranking of results for improved relevance

enable_graph

boolean

default:true

Enable graph traversal for related memories

project

string

Project identifier for scoped boost. Format: org/repo. Memories with a matching source_ref get a 1.3x score boost.

raw

boolean

default:false

Debug mode: bypass filtering and return all candidates
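
The parameters above can be collected into a request body on the client side. The following is a minimal sketch, not part of CEMS: `build_search_payload` is a hypothetical helper whose keyword defaults simply mirror the documented ones, and the client-side truncation and validation are assumptions about what a caller might want to check before sending.

```python
from typing import Any, Optional

MAX_QUERY_CHARS = 2000  # documented maximum; the server truncates longer queries
VALID_SCOPES = {"personal", "shared", "both"}
VALID_MODES = {"auto", "vector", "hybrid"}


def build_search_payload(
    query: str,
    *,
    limit: int = 10,
    scope: str = "both",
    max_tokens: int = 4000,
    mode: str = "vector",
    enable_query_synthesis: bool = False,
    enable_hyde: bool = False,
    enable_rerank: bool = True,
    enable_graph: bool = True,
    project: Optional[str] = None,
    raw: bool = False,
) -> dict[str, Any]:
    """Hypothetical helper: build a request body using the documented defaults."""
    if not query:
        raise ValueError("query is required")
    if scope not in VALID_SCOPES:
        raise ValueError(f"scope must be one of {sorted(VALID_SCOPES)}")
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {sorted(VALID_MODES)}")

    payload: dict[str, Any] = {
        # Client-side mirror of the documented truncation (an assumption, not required).
        "query": query[:MAX_QUERY_CHARS],
        "limit": limit,
        "scope": scope,
        "max_tokens": max_tokens,
        "mode": mode,
        "enable_query_synthesis": enable_query_synthesis,
        "enable_hyde": enable_hyde,
        "enable_rerank": enable_rerank,
        "enable_graph": enable_graph,
        "raw": raw,
    }
    # `project` has no documented default, so it is only sent when provided.
    if project is not None:
        payload["project"] = project
    return payload


payload = build_search_payload("How do we handle authentication?", mode="hybrid")
```

Validating `scope` and `mode` locally is a design choice: it surfaces typos before a round trip instead of relying on a 400 response.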

Response

success

boolean

Whether the search succeeded

results

object[]

Array of search results

memory_id

string

Unique identifier for this memory

content

string

Memory content

Examples

Basic search

curl -X POST https://your-cems-server.com/api/memory/search \
  -H "Authorization: Bearer $CEMS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What are my TypeScript preferences?",
    "limit": 5
  }'

Response:

{
  "success": true,
  "results": [
    {
      "memory_id": "550e8400-e29b-41d4-a716-446655440000",
      "content": "I prefer using TypeScript for all new backend services",
      "category": "preferences",
      "score": 0.89,
      "tags": ["typescript", "backend"],
      "timestamp": "2024-02-28T10:30:00Z"
    }
  ],
  "count": 1,
  "mode": "vector",
  "tokens_used": 15,
  "queries_used": ["What are my TypeScript preferences?"],
  "total_candidates": 10,
  "filtered_count": 9
}
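
A response shaped like the one above could be read into typed objects as follows. This is a sketch under assumptions: `SearchResult` and `parse_search_response` are hypothetical names, and the optional fields (`score`, `category`, `tags`) are taken from the example response rather than from a formal schema, so they are read defensively.

```python
import json
from dataclasses import dataclass, field


@dataclass
class SearchResult:
    memory_id: str
    content: str
    # The fields below appear in the example response; defaults cover their absence.
    score: float = 0.0
    category: str = ""
    tags: list[str] = field(default_factory=list)


def parse_search_response(body: str) -> list[SearchResult]:
    """Hypothetical helper: turn a JSON response body into SearchResult objects."""
    data = json.loads(body)
    if not data.get("success"):
        raise RuntimeError("search did not succeed")
    return [
        SearchResult(
            memory_id=item["memory_id"],
            content=item["content"],
            score=item.get("score", 0.0),
            category=item.get("category", ""),
            tags=list(item.get("tags", [])),
        )
        for item in data.get("results", [])
    ]


# Trimmed copy of the example response, used only to exercise the parser.
SAMPLE_BODY = json.dumps({
    "success": True,
    "results": [{
        "memory_id": "550e8400-e29b-41d4-a716-446655440000",
        "content": "I prefer using TypeScript for all new backend services",
        "category": "preferences",
        "score": 0.89,
        "tags": ["typescript", "backend"],
    }],
    "count": 1,
})

results = parse_search_response(SAMPLE_BODY)
```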

Hybrid search with HyDE

curl -X POST https://your-cems-server.com/api/memory/search \
  -H "Authorization: Bearer $CEMS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "How do we handle authentication?",
    "mode": "hybrid",
    "enable_hyde": true,
    "enable_query_synthesis": true,
    "project": "acme/api-server"
  }'

This uses the full pipeline:

Query understanding and intent detection
Query synthesis (expands to 2-5 queries)
HyDE (generates hypothetical answer)
Multi-query retrieval
RRF fusion
LLM re-ranking
Project-scoped boost for acme/api-server
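
The same hybrid request can be prepared from Python with only the standard library. This is a sketch, not an official client: `make_search_request` is a hypothetical helper, the host is the placeholder from the curl examples, and the request is built but not sent so the snippet runs without a server.

```python
import json
import os
import urllib.request

BASE_URL = "https://your-cems-server.com"  # placeholder host from the curl examples


def make_search_request(
    payload: dict, api_key: str, base_url: str = BASE_URL
) -> urllib.request.Request:
    """Hypothetical helper: build (but do not send) a POST to /api/memory/search."""
    return urllib.request.Request(
        url=f"{base_url}/api/memory/search",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = make_search_request(
    {
        "query": "How do we handle authentication?",
        "mode": "hybrid",
        "enable_hyde": True,
        "enable_query_synthesis": True,
        "project": "acme/api-server",
    },
    api_key=os.environ.get("CEMS_API_KEY", ""),
)

# To actually send it against a reachable server:
#     with urllib.request.urlopen(request) as response:
#         body = json.load(response)
```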

Retrieval pipeline stages

The enhanced retrieval pipeline (retrieve_for_inference) implements 9 stages:

Query understanding

Analyzes intent, domains, and entities using an LLM

Query synthesis

LLM expands query into 2-5 search terms for better coverage

HyDE

Generates hypothetical ideal answer for better semantic matching

Candidate retrieval

Vector search + graph traversal across multiple queries

RRF fusion

Reciprocal Rank Fusion combines multi-query results

LLM re-ranking

Smarter relevance scoring using LLM judgment

Relevance filtering

Removes results below threshold

Unified scoring

Applies time decay, priority boost, and project-scoped scoring

Token-budgeted assembly

Greedy selection within token budget
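
To make the RRF fusion stage concrete, here is a minimal sketch of standard Reciprocal Rank Fusion over several ranked lists of memory IDs. It is not taken from the CEMS source: `rrf_fuse` is a hypothetical name, and the constant `k = 60` is the value commonly used with RRF, assumed here rather than confirmed for CEMS.

```python
from collections import defaultdict


def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse ranked ID lists: each list contributes 1 / (k + rank) per item."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, memory_id in enumerate(ranking, start=1):
            scores[memory_id] += 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# One ranked list per query variant (for example original, synthesized, HyDE).
fused = rrf_fuse([
    ["a", "b", "c"],
    ["b", "a", "d"],
    ["b", "c"],
])
fused_ids = [memory_id for memory_id, _ in fused]
```

Because RRF uses only ranks, it can combine lists whose raw scores are not comparable, which is why it suits multi-query retrieval.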

See Search Pipeline for detailed explanations.
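
The unified scoring and token-budgeted assembly stages can be illustrated with a small sketch. Only the 1.3x project boost and the greedy, budget-limited selection come from this page. Everything else is an assumption made for illustration: the names `Candidate`, `unified_score`, `estimate_tokens`, and `assemble`, the exponential decay with a 30-day half-life, priority as a plain multiplier, exact-match comparison of `source_ref`, the whitespace token estimate, and skipping (rather than stopping at) a result that does not fit.

```python
from dataclasses import dataclass
from typing import Optional

PROJECT_BOOST = 1.3     # documented multiplier for a matching source_ref
HALF_LIFE_DAYS = 30.0   # assumption: decay half-life, not from the CEMS source


@dataclass
class Candidate:
    memory_id: str
    content: str
    relevance: float      # score after re-ranking and filtering
    age_days: float
    source_ref: str = ""
    priority: float = 1.0  # assumption: priority acts as a plain multiplier


def unified_score(c: Candidate, project: Optional[str] = None) -> float:
    """Combine relevance with time decay, priority, and the project boost."""
    decay = 0.5 ** (c.age_days / HALF_LIFE_DAYS)
    # Assumption: "matching" means exact equality with the project identifier.
    boost = PROJECT_BOOST if project and c.source_ref == project else 1.0
    return c.relevance * decay * c.priority * boost


def estimate_tokens(text: str) -> int:
    """Rough stand-in for a tokenizer: count whitespace-separated words."""
    return len(text.split())


def assemble(
    candidates: list[Candidate], max_tokens: int = 4000, project: Optional[str] = None
) -> tuple[list[Candidate], int]:
    """Greedily take the best-scoring candidates that still fit the budget."""
    ranked = sorted(candidates, key=lambda c: unified_score(c, project), reverse=True)
    selected: list[Candidate] = []
    used = 0
    for c in ranked:
        cost = estimate_tokens(c.content)
        if used + cost > max_tokens:
            continue  # does not fit; a shorter, lower-ranked result still might
        selected.append(c)
        used += cost
    return selected, used


pool = [
    Candidate("A", "we use typescript for services", 0.9, 0, "other/repo"),
    Candidate("B", "auth uses bearer tokens", 0.8, 0, "acme/api-server"),
    Candidate("C", "rotate keys monthly", 0.9, 30, "other/repo"),
]
selected, tokens_used = assemble(pool, max_tokens=8, project="acme/api-server")
```

In this example the boosted candidate B outranks A despite lower relevance, A is skipped because it would exceed the 8-token budget, and the older, shorter C still fits.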

Search modes comparison

Mode	LLM Calls	Speed	Best for
`vector`	0	Fastest	Simple queries, known terms
`auto`	0-4	Smart	General use
`hybrid`	3-4	Thorough	Complex queries, preferences, temporal

Error responses

Status codes:

400 - Bad request (missing query)
401 - Unauthorized (invalid or missing API key)
500 - Internal server error
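
A client might map these status codes to a single exception type. The sketch below is illustrative only: `CemsSearchError`, `check_status`, and `is_retryable` are hypothetical names, and treating only 5xx responses as retryable is an assumption rather than documented server behavior.

```python
class CemsSearchError(Exception):
    """Hypothetical exception carrying the HTTP status of a failed search."""

    def __init__(self, status: int, message: str) -> None:
        super().__init__(f"{status}: {message}")
        self.status = status
        self.message = message


# Messages follow the documented status codes.
STATUS_MESSAGES = {
    400: "Bad request (missing query)",
    401: "Unauthorized (invalid or missing API key)",
    500: "Internal server error",
}


def check_status(status: int) -> None:
    """Return silently for 2xx responses, raise CemsSearchError otherwise."""
    if 200 <= status < 300:
        return
    raise CemsSearchError(status, STATUS_MESSAGES.get(status, "Unexpected status"))


def is_retryable(status: int) -> bool:
    """Assumption: only server-side (5xx) failures are worth retrying."""
    return 500 <= status < 600
```

A 400 or 401 indicates a problem with the request itself, so retrying without changing the query or the API key would not be expected to help.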

Implementation reference

See src/cems/api/handlers/memory.py:272-386 for the complete implementation.
