Supermemory provides persistent, searchable memory for the JARVIS pipeline. When a person is researched once, their dossier is stored in Supermemory, allowing future encounters to skip expensive web agent research and retrieve cached intelligence instantly.

Why Supermemory?

Web agent research is expensive and slow:
  • Browser Use agents take 30-60 seconds per person
  • Exa API costs $0.01-0.05 per search
  • PimEyes reverse image search has strict rate limits
Supermemory acts as a smart cache:
  • Store complete dossiers after first enrichment
  • Hybrid search (semantic + keyword) finds matches even with name variations
  • Metadata filtering by source and timestamp
  • Automatic relevance scoring to avoid false positives

Configuration

Get your API key from supermemory.ai/settings:
.env
SUPERMEMORY_API_KEY=sm_live_abc123...
The client automatically initializes:
backend/memory/supermemory_client.py
import json
import os

import httpx
from loguru import logger

class SuperMemoryClient:
    def __init__(self, api_key: str | None = None):
        self._api_key = api_key or os.environ.get("SUPERMEMORY_API_KEY", "")
        if not self._api_key:
            logger.warning("SUPERMEMORY_API_KEY not set")

        self._client = httpx.AsyncClient(
            timeout=30,
            headers={
                "Authorization": f"Bearer {self._api_key}",
                "Content-Type": "application/json",
            },
        )

    async def close(self) -> None:
        """Close the underlying HTTP client."""
        await self._client.aclose()

    async def __aenter__(self) -> "SuperMemoryClient":
        return self

    async def __aexit__(self, *exc_info) -> None:
        await self.close()

Core Operations

Store Dossier

Persist a complete person dossier to Supermemory:
backend/memory/supermemory_client.py
async def store_dossier(
    self,
    person_name: str,
    dossier_data: dict,
) -> str | None:
    """Persist a dossier to Supermemory.
    
    Returns the document ID on success, or None on failure.
    """
    content = json.dumps(
        {"person_name": person_name, "dossier": dossier_data},
        default=str,
    )
    
    payload = {
        "content": content,
        "containerTags": ["specter-dossiers"],
        "customId": f"specter-{person_name.lower().replace(' ', '-')}",
        "metadata": {
            "person_name": person_name,
            "source": "specter-pipeline",
        },
    }
    
    try:
        resp = await self._client.post(
            "https://api.supermemory.ai/v3/documents",
            json=payload
        )
        resp.raise_for_status()
        data = resp.json()
        doc_id = data.get("id")
        logger.info(f"SuperMemory store OK | person={person_name} id={doc_id}")
        return doc_id
    except Exception as exc:
        logger.error(f"SuperMemory store failed | person={person_name} err={exc}")
        return None
Key features:
  • Custom ID: Deterministic ID (specter-{name}) ensures re-storing overwrites old data
  • Container Tags: Namespace all JARVIS dossiers as specter-dossiers
  • Metadata: Store searchable metadata for filtering

Search Person

Look up a cached dossier by person name:
backend/memory/supermemory_client.py
async def search_person(self, name: str) -> dict | None:
    """Look up a cached dossier by person name.
    
    Returns the parsed dossier dict if a high-confidence match is found,
    otherwise None.
    """
    payload = {
        "q": name,
        "containerTag": "specter-dossiers",
        "searchMode": "hybrid",  # Semantic + keyword
        "limit": 3,
        "threshold": 0.6,  # Minimum similarity score
        "filters": {
            "AND": [
                {"key": "source", "value": "specter-pipeline"},
            ],
        },
    }
    
    try:
        resp = await self._client.post(
            "https://api.supermemory.ai/v4/search",
            json=payload
        )
        resp.raise_for_status()
        data = resp.json()
        results = data.get("results", [])
        
        if not results:
            logger.debug(f"SuperMemory search miss | name={name}")
            return None
        
        top = results[0]
        raw = top.get("memory") or top.get("chunk") or ""
        dossier = self._parse_dossier(raw, name)
        
        if dossier:
            logger.info(
                f"SuperMemory cache hit | name={name} "
                f"similarity={top.get('similarity', 0):.2f}"
            )
        
        return dossier
    except Exception as exc:
        logger.error(f"SuperMemory search failed | name={name} err={exc}")
        return None
Search features:
  • Hybrid search: Combines semantic embedding similarity with keyword matching
  • Similarity threshold: Only returns results above 0.6 score to avoid false positives
  • Fuzzy matching: Handles name variations (“John Smith” vs “J. Smith”)
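The hybrid scoring itself happens server-side, but a rough local intuition for why name variations still clear the threshold: even plain character-level string similarity rates related names well above unrelated ones. This sketch uses Python's difflib and is purely illustrative, not Supermemory's actual scoring:

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Plain character-level similarity in [0, 1] (illustration only)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


print(f"John Smith vs J. Smith: {similarity('John Smith', 'J. Smith'):.2f}")
print(f"John Smith vs Jane Doe: {similarity('John Smith', 'Jane Doe'):.2f}")
```

Semantic embeddings go further than this, matching "J. Smith, OpenAI researcher" against a dossier that never uses that exact phrasing.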

Parse Dossier

Extract the dossier from Supermemory’s response format:
backend/memory/supermemory_client.py
def _parse_dossier(self, raw: str, name: str) -> dict | None:
    """Extract a dossier dict from a SuperMemory memory/chunk string."""
    try:
        obj = json.loads(raw)
        if isinstance(obj, dict) and "dossier" in obj:
            return obj["dossier"]
        if isinstance(obj, dict):
            return obj
    except (json.JSONDecodeError, TypeError):
        pass
    
    # SuperMemory may return summarized text instead of raw JSON
    if raw and name.lower() in raw.lower():
        return {"raw_memory": raw}
    
    return None

Pipeline Integration

Supermemory is checked before running expensive web agents:
backend/orchestration/pipeline.py
from backend.memory.supermemory_client import SuperMemoryClient
from backend.agents.orchestrator import ResearchOrchestrator

async def enrich_person(
    person_name: str,
    photo_url: str,
    memory: SuperMemoryClient,
    orchestrator: ResearchOrchestrator,
) -> dict:
    # 1. Check Supermemory cache
    cached = await memory.search_person(person_name)
    if cached:
        logger.info(f"Cache hit for {person_name}, skipping web research")
        return {
            "person_name": person_name,
            "photo_url": photo_url,
            "dossier": cached,
            "source": "supermemory_cache",
        }
    
    # 2. Cache miss — run full research pipeline
    logger.info(f"Cache miss for {person_name}, starting web research")
    research_result = await orchestrator.research_person(
        person_name=person_name,
        photo_url=photo_url,
    )
    
    # 3. Store result in Supermemory for future use
    if research_result.get("dossier"):
        await memory.store_dossier(
            person_name=person_name,
            dossier_data=research_result["dossier"],
        )
    
    return research_result

Example: Complete Flow

import asyncio
from backend.memory.supermemory_client import SuperMemoryClient

async def main():
    async with SuperMemoryClient() as memory:
        # First encounter: cache miss
        print("First lookup...")
        result1 = await memory.search_person("Alice Smith")
        print(f"Result: {result1}")  # None
        
        # Store dossier
        print("Storing dossier...")
        dossier = {
            "summary": "AI researcher at OpenAI. Stanford PhD.",
            "title": "Research Scientist",
            "company": "OpenAI",
            "work_history": [
                {
                    "role": "Research Scientist",
                    "company": "OpenAI",
                    "period": "2022-present"
                }
            ],
            "social_profiles": {
                "linkedin": "https://linkedin.com/in/alicesmith",
                "github": "https://github.com/alicesmith",
            },
        }
        doc_id = await memory.store_dossier("Alice Smith", dossier)
        print(f"Stored with ID: {doc_id}")
        
        # Second encounter: cache hit
        print("Second lookup...")
        result2 = await memory.search_person("Alice Smith")
        print(f"Result: {result2['summary']}")  # Cache hit!
        
        # Fuzzy match: slight name variation
        print("Fuzzy match...")
        result3 = await memory.search_person("A. Smith")
        print(f"Result: {result3['summary'] if result3 else 'No match'}")  # May still match!

if __name__ == "__main__":
    asyncio.run(main())
Output:
First lookup...
Result: None
Storing dossier...
Stored with ID: sm_doc_xyz123
Second lookup...
Result: AI researcher at OpenAI. Stanford PhD.
Fuzzy match...
Result: AI researcher at OpenAI. Stanford PhD.

Performance Benefits

Without Supermemory

# Every person requires full research
await exa.search(person_name)           # ~2s
await browser_agents.research(urls)     # ~45s
await synthesize_dossier(fragments)     # ~5s
# Total: ~52 seconds per person

With Supermemory

# First encounter
cached = await memory.search_person(name)  # ~0.3s, miss
# ... run full research ~52s ...
await memory.store_dossier(name, dossier)  # ~0.2s

# Future encounters
cached = await memory.search_person(name)  # ~0.3s, HIT!
# Total: 0.3 seconds (99.4% faster)
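The break-even math behind those numbers is a one-liner: expected per-encounter latency is a hit-rate-weighted average of the hit and miss paths. Timings here are the illustrative estimates above, not measurements:

```python
# Illustrative timings from the sections above
RESEARCH_S = 52.0  # full web-agent research on a cache miss
LOOKUP_S = 0.3     # Supermemory search
STORE_S = 0.2      # writing the dossier back


def expected_latency(hit_rate: float) -> float:
    """Average per-encounter latency at a given cache hit rate."""
    miss_path = LOOKUP_S + RESEARCH_S + STORE_S  # lookup, research, store
    hit_path = LOOKUP_S
    return hit_rate * hit_path + (1 - hit_rate) * miss_path


for rate in (0.0, 0.5, 0.9):
    print(f"hit rate {rate:.0%}: {expected_latency(rate):.1f}s avg")
```

Even a modest hit rate cuts average latency dramatically, because the miss path is two orders of magnitude slower than a lookup.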

Container Management

Supermemory uses container tags to namespace data:
# All JARVIS dossiers use the same tag
_CONTAINER_TAG = "specter-dossiers"

# Store with tag
payload = {
    "content": dossier_json,
    "containerTags": [_CONTAINER_TAG],
    ...
}

# Search within tag
search_payload = {
    "q": person_name,
    "containerTag": _CONTAINER_TAG,  # Only search JARVIS data
    ...
}
This prevents cross-contamination if you use Supermemory for other projects.

Error Handling

Supermemory failures are non-blocking:
async def safe_supermemory_lookup(memory: SuperMemoryClient, name: str) -> dict | None:
    try:
        return await memory.search_person(name)
    except httpx.TimeoutException:
        logger.warning(f"Supermemory timeout for {name}, proceeding without cache")
        return None
    except httpx.HTTPStatusError as exc:
        logger.error(f"Supermemory HTTP {exc.response.status_code} for {name}")
        return None
    except Exception as exc:
        logger.error(f"Supermemory unexpected error for {name}: {exc}")
        return None
If Supermemory is down, JARVIS falls back to full research without crashing.
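If transient failures are common, a small retry layer fits the same non-blocking pattern. A sketch with arbitrary attempt and backoff values (not part of the shipped client):

```python
import asyncio


async def lookup_with_retry(lookup, name: str, attempts: int = 3,
                            backoff_s: float = 0.5):
    """Retry an async cache lookup a few times, then give up with None.

    `lookup` is any async callable such as memory.search_person; the
    attempt count and backoff are illustrative defaults.
    """
    for attempt in range(attempts):
        try:
            return await lookup(name)
        except Exception:
            if attempt + 1 == attempts:
                return None  # exhausted: caller falls back to full research
            await asyncio.sleep(backoff_s * (attempt + 1))


# usage: cached = await lookup_with_retry(memory.search_person, "Alice Smith")
```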

Best Practices

Use deterministic custom IDs to enable idempotent updates:
def _custom_id(person_name: str) -> str:
    """Deterministic document ID so re-storing overwrites."""
    return f"specter-{person_name.strip().lower().replace(' ', '-')}"
This ensures re-enriching a person updates their dossier instead of creating duplicates.
Adjust the similarity threshold based on your accuracy needs:
  • 0.8+: High precision, fewer false positives (may miss slight name variations)
  • 0.6-0.7: Balanced (recommended)
  • 0.4-0.5: High recall, more fuzzy matches (risk of wrong person)
# Strict matching (assumes search_person is extended to accept a
# threshold parameter; the version shown above hardcodes 0.6)
results = await memory.search_person(name, threshold=0.85)

# Fuzzy matching
results = await memory.search_person(name, threshold=0.5)
Use metadata filters to segment data:
# Filter by source (the enriched_date filter assumes you also store an
# enriched_date field in metadata at write time)
payload = {
    "q": name,
    "filters": {
        "AND": [
            {"key": "source", "value": "specter-pipeline"},
            {"key": "enriched_date", "operator": ">", "value": "2024-01-01"},
        ]
    }
}
This lets you invalidate old dossiers or separate dev/prod data.
Use the async context manager for proper cleanup:
async with SuperMemoryClient() as memory:
    result = await memory.search_person("Alice Smith")
    # Client automatically closes on exit
This ensures HTTP connections are properly closed even if exceptions occur.

Monitoring

Track Supermemory cache hit rates:
from collections import Counter

cache_stats = Counter()

async def enrich_with_stats(person_name: str, memory: SuperMemoryClient):
    cached = await memory.search_person(person_name)
    
    if cached:
        cache_stats["hit"] += 1
        return cached
    else:
        cache_stats["miss"] += 1
        # ... run full research ...

# Log stats periodically (guard against division by zero before any lookups)
total = sum(cache_stats.values())
if total:
    logger.info(
        f"Supermemory stats: {cache_stats['hit']} hits, "
        f"{cache_stats['miss']} misses "
        f"({cache_stats['hit'] / total * 100:.1f}% hit rate)"
    )

API Reference

SuperMemoryClient

store_dossier
async method
Persist a dossier to Supermemory.
Parameters:
  • person_name (str): Full name of the person
  • dossier_data (dict): Complete dossier dictionary
Returns: Document ID (str) or None on failure

search_person
async method
Look up a cached dossier by person name.
Parameters:
  • name (str): Person name to search for
Returns: Dossier dict or None if not found

close
async method
Close the HTTP client connection.
Returns: None

Next: Backend Architecture

Learn about the FastAPI backend orchestrating the pipeline
