Mastra’s memory system enables agents to maintain context across conversations through multiple memory types: conversation history, semantic recall, working memory, and observational memory.

Core Concepts

Memory in Mastra operates on three key abstractions:
  • Threads: Individual conversation sessions with unique IDs
  • Resources: Users or entities that own multiple threads
  • Storage: Pluggable backends for persisting memory data
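The relationship between these abstractions can be sketched with plain types. This is an illustration of the data model only, not Mastra's internals: a resource owns many threads, and storage persists them.

```typescript
// Illustrative data model: a resource owns many threads,
// and each thread holds an ordered list of messages.
type Message = { role: 'user' | 'assistant'; content: string };

type Thread = {
  id: string;          // unique thread ID
  resourceId: string;  // the user/entity that owns this thread
  messages: Message[];
};

// A minimal in-memory stand-in for a storage backend, keyed by thread ID.
const store = new Map<string, Thread>();

function createThread(id: string, resourceId: string): Thread {
  const thread: Thread = { id, resourceId, messages: [] };
  store.set(id, thread);
  return thread;
}

function threadsForResource(resourceId: string): Thread[] {
  return [...store.values()].filter((t) => t.resourceId === resourceId);
}

createThread('thread-1', 'user-123');
createThread('thread-2', 'user-123');
console.log(threadsForResource('user-123').length); // 2
```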

Memory Types

Conversation History

Stores recent messages from the current thread, providing short-term conversational continuity.
const memory = new Memory({
  storage,
  options: {
    lastMessages: 10 // Include last 10 messages
  }
});

Semantic Recall

Uses vector embeddings to retrieve relevant past messages based on semantic similarity.
const memory = new Memory({
  storage,
  vector: new PgVector({ connectionString: process.env.DATABASE_URL }),
  embedder: "openai/text-embedding-3-small",
  options: {
    semanticRecall: {
      topK: 5,
      messageRange: 2,
      scope: 'resource' // Search across all user threads
    }
  }
});
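Conceptually, semantic recall embeds each stored message as a vector and returns the `topK` most similar to the current query. A toy sketch of the ranking step — illustrative only; Mastra delegates this work to the configured vector store and embedder:

```typescript
// Toy ranking: score stored vectors against a query vector by
// cosine similarity and keep the topK best matches.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

function topK(query: number[], stored: { id: string; vec: number[] }[], k: number) {
  return stored
    .map((m) => ({ id: m.id, score: cosine(query, m.vec) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

const hits = topK([1, 0], [
  { id: 'msg-a', vec: [0.9, 0.1] },
  { id: 'msg-b', vec: [0, 1] },
  { id: 'msg-c', vec: [0.7, 0.7] },
], 2);
console.log(hits.map((h) => h.id)); // [ 'msg-a', 'msg-c' ]
```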

Working Memory

Maintains a structured record of user information and preferences that agents update over time.
const memory = new Memory({
  storage,
  options: {
    workingMemory: {
      enabled: true,
      scope: 'resource',
      template: `
# User Profile
- **Name**:
- **Preferences**:
- **Goals**:
      `
    }
  }
});

Observational Memory

Uses Observer and Reflector agents to extract and compress long-term observations from conversations.
const memory = new Memory({
  storage,
  options: {
    observationalMemory: {
      scope: 'resource',
      observation: {
        messageTokens: 20_000
      },
      reflection: {
        observationTokens: 90_000
      }
    }
  }
});
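The two token limits act as triggers: once un-observed conversation tokens pass `messageTokens`, observation runs; once accumulated observation tokens pass `observationTokens`, reflection compresses them. A simplified sketch of that control flow — the function and token counts are placeholders for illustration, not Mastra's implementation:

```typescript
// Simplified trigger logic for a two-stage observe/reflect pipeline.
type Action = 'none' | 'observe' | 'reflect';

function nextAction(
  messageTokens: number,      // tokens of un-observed conversation
  observationTokens: number,  // tokens of accumulated observations
  limits = { messageTokens: 20_000, observationTokens: 90_000 },
): Action {
  // Reflection takes priority: compress observations before adding more.
  if (observationTokens >= limits.observationTokens) return 'reflect';
  if (messageTokens >= limits.messageTokens) return 'observe';
  return 'none';
}

console.log(nextAction(5_000, 10_000));  // 'none'
console.log(nextAction(25_000, 10_000)); // 'observe'
console.log(nextAction(1_000, 95_000));  // 'reflect'
```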

Configuration Structure

type MemoryConfig = {
  // Prevent memory from saving new messages
  readOnly?: boolean;
  
  // Number of recent messages to include
  lastMessages?: number | false;
  
  // Semantic recall configuration
  semanticRecall?: boolean | SemanticRecall;
  
  // Working memory configuration
  workingMemory?: WorkingMemory;
  
  // Observational memory configuration
  observationalMemory?: boolean | ObservationalMemoryOptions;
  
  // Auto-generate thread titles
  generateTitle?: boolean | {
    model: MastraModelConfig;
    instructions?: string;
  };
};
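Putting the type to work, a config might combine several of these fields. Values here are illustrative, and the `generateTitle.model` string follows the provider/model form used elsewhere on this page:

```typescript
const config: MemoryConfig = {
  readOnly: false,
  lastMessages: 20,
  semanticRecall: { topK: 3, messageRange: 2, scope: 'thread' },
  workingMemory: { enabled: true, scope: 'resource' },
  generateTitle: {
    model: 'openai/gpt-4o-mini',
    instructions: 'Summarize the thread in five words or fewer.'
  }
};
```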

Storage Requirements

Memory requires a storage adapter to function:
import { LibSQLStore } from '@mastra/store-libsql';

const storage = new LibSQLStore({
  id: 'agent-memory',
  url: 'file:./agent-memory.db'
});

const memory = new Memory({
  storage,
  options: {
    lastMessages: 10
  }
});

Thread Management

Create and manage conversation threads:
// Create a new thread
const thread = await memory.createThread({
  resourceId: 'user-123',
  title: 'Support conversation',
  metadata: {
    category: 'support',
    priority: 'high'
  }
});

// List threads with filters
const { threads } = await memory.listThreads({
  filter: {
    resourceId: 'user-123',
    metadata: { category: 'support' }
  },
  page: 0,
  perPage: 20
});

// Get a specific thread
const existingThread = await memory.getThreadById({ 
  threadId: 'thread-xyz' 
});

// Delete a thread
await memory.deleteThread('thread-xyz');

Using Memory with Agents

Memory integrates seamlessly with agents through input and output processors:
import { Agent, Memory } from '@mastra/core';

const memory = new Memory({
  storage,
  options: {
    lastMessages: 10,
    workingMemory: {
      enabled: true,
      scope: 'resource'
    }
  }
});

const agent = new Agent({
  name: 'Support Agent',
  model: 'openai/gpt-4o',
  memory // Memory automatically registers processors
});

// Generate a response with memory context
const result = await agent.generate('How can I help you?', {
  threadId: 'thread-123',
  resourceId: 'user-123'
});

Memory Scopes

Memory features can be scoped to either threads or resources:
  • Thread scope: memory is isolated to a single conversation thread
  • Resource scope: memory is shared across all threads for a user/resource
const memory = new Memory({
  storage,
  options: {
    // Resource-scoped: shared across all user threads
    workingMemory: {
      enabled: true,
      scope: 'resource'
    },
    // Thread-scoped: only within current thread
    semanticRecall: {
      topK: 5,
      scope: 'thread'
    }
  }
});

Read-Only Memory

Prevent internal agents from modifying memory:
const routingAgent = new Agent({
  name: 'Router',
  model: 'openai/gpt-4o-mini',
  memory,
  memoryConfig: {
    readOnly: true // Can read but not write memory
  }
});

Best Practices

Choose Appropriate Scopes

Use resource scope for user preferences and thread scope for session-specific data.

Combine Memory Types

Use conversation history + semantic recall + working memory together for rich context
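For example, the three can be layered in a single configuration. Settings are drawn from the examples above; tune the values to your model's context window:

```typescript
const memory = new Memory({
  storage,
  vector: new PgVector({ connectionString: process.env.DATABASE_URL }),
  embedder: "openai/text-embedding-3-small",
  options: {
    lastMessages: 10,                                   // conversation history
    semanticRecall: { topK: 5, messageRange: 2 },       // relevant older messages
    workingMemory: { enabled: true, scope: 'resource' } // stable user facts
  }
});
```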

Configure Token Limits

Balance context window usage with memory richness for optimal performance

Use Read-Only for Routing

Prevent routing/orchestration agents from polluting memory with intermediate decisions

Next Steps

Conversation History

Learn about thread-based message persistence

Semantic Recall

Implement RAG-based memory retrieval

Working Memory

Maintain structured user information

RAG Overview

Explore RAG capabilities in Mastra
