Mastra’s memory system enables agents to maintain context across conversations through multiple memory types: conversation history, semantic recall, working memory, and observational memory.

Core Concepts

Memory in Mastra operates on three key abstractions:
  • Threads: Individual conversation sessions with unique IDs
  • Resources: Users or entities that own multiple threads
  • Storage: Pluggable backends for persisting memory data
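The relationship between these abstractions can be sketched with plain types. This is an illustration of the data model only, not Mastra's internals: a resource owns many threads, and storage persists them.

```typescript
// Illustrative data model: a resource owns many threads,
// and each thread holds an ordered list of messages.
type Message = { role: 'user' | 'assistant'; content: string };

type Thread = {
  id: string;          // unique thread ID
  resourceId: string;  // the user/entity that owns this thread
  messages: Message[];
};

// A minimal in-memory stand-in for a storage backend, keyed by thread ID.
const store = new Map<string, Thread>();

function createThread(id: string, resourceId: string): Thread {
  const thread: Thread = { id, resourceId, messages: [] };
  store.set(id, thread);
  return thread;
}

function threadsForResource(resourceId: string): Thread[] {
  return [...store.values()].filter((t) => t.resourceId === resourceId);
}

createThread('thread-1', 'user-123');
createThread('thread-2', 'user-123');
console.log(threadsForResource('user-123').length); // 2
```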

Memory Types

Conversation History

Stores recent messages from the current thread, providing short-term conversational continuity.
const memory = new Memory({
  storage,
  options: {
    lastMessages: 10 // Include last 10 messages
  }
});

Semantic Recall

Uses vector embeddings to retrieve relevant past messages based on semantic similarity.
const memory = new Memory({
  storage,
  vector: new PgVector({ connectionString: process.env.DATABASE_URL }),
  embedder: "openai/text-embedding-3-small",
  options: {
    semanticRecall: {
      topK: 5,
      messageRange: 2,
      scope: 'resource' // Search across all user threads
    }
  }
});
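Conceptually, semantic recall embeds each stored message as a vector and returns the `topK` most similar to the current query. A toy sketch of the ranking step — illustrative only; Mastra delegates this work to the configured vector store and embedder:

```typescript
// Toy ranking: score stored vectors against a query vector by
// cosine similarity and keep the topK best matches.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

function topK(query: number[], stored: { id: string; vec: number[] }[], k: number) {
  return stored
    .map((m) => ({ id: m.id, score: cosine(query, m.vec) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

const hits = topK([1, 0], [
  { id: 'msg-a', vec: [0.9, 0.1] },
  { id: 'msg-b', vec: [0, 1] },
  { id: 'msg-c', vec: [0.7, 0.7] },
], 2);
console.log(hits.map((h) => h.id)); // [ 'msg-a', 'msg-c' ]
```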

Working Memory

Maintains a structured record of user information and preferences that agents update over time.
const memory = new Memory({
  storage,
  options: {
    workingMemory: {
      enabled: true,
      scope: 'resource',
      template: `
# User Profile
- **Name**:
- **Preferences**:
- **Goals**:
      `
    }
  }
});

Observational Memory

Uses Observer and Reflector agents to extract and compress long-term observations from conversations.
const memory = new Memory({
  storage,
  options: {
    observationalMemory: {
      scope: 'resource',
      observation: {
        messageTokens: 20_000
      },
      reflection: {
        observationTokens: 90_000
      }
    }
  }
});
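The two token limits act as triggers: once un-observed conversation tokens pass `messageTokens`, observation runs; once accumulated observation tokens pass `observationTokens`, reflection compresses them. A simplified sketch of that control flow — the function and token counts are placeholders for illustration, not Mastra's implementation:

```typescript
// Simplified trigger logic for a two-stage observe/reflect pipeline.
type Action = 'none' | 'observe' | 'reflect';

function nextAction(
  messageTokens: number,      // tokens of un-observed conversation
  observationTokens: number,  // tokens of accumulated observations
  limits = { messageTokens: 20_000, observationTokens: 90_000 },
): Action {
  // Reflection takes priority: compress observations before adding more.
  if (observationTokens >= limits.observationTokens) return 'reflect';
  if (messageTokens >= limits.messageTokens) return 'observe';
  return 'none';
}

console.log(nextAction(5_000, 10_000));  // 'none'
console.log(nextAction(25_000, 10_000)); // 'observe'
console.log(nextAction(1_000, 95_000));  // 'reflect'
```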

Configuration Structure

type MemoryConfig = {
  // Prevent memory from saving new messages
  readOnly?: boolean;
  
  // Number of recent messages to include
  lastMessages?: number | false;
  
  // Semantic recall configuration
  semanticRecall?: boolean | SemanticRecall;
  
  // Working memory configuration
  workingMemory?: WorkingMemory;
  
  // Observational memory configuration
  observationalMemory?: boolean | ObservationalMemoryOptions;
  
  // Auto-generate thread titles
  generateTitle?: boolean | {
    model: MastraModelConfig;
    instructions?: string;
  };
};
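Putting the type to work, a config might combine several of these fields. Values here are illustrative, and the `generateTitle.model` string follows the provider/model form used elsewhere on this page:

```typescript
const config: MemoryConfig = {
  readOnly: false,
  lastMessages: 20,
  semanticRecall: { topK: 3, messageRange: 2, scope: 'thread' },
  workingMemory: { enabled: true, scope: 'resource' },
  generateTitle: {
    model: 'openai/gpt-4o-mini',
    instructions: 'Summarize the thread in five words or fewer.'
  }
};
```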

Storage Requirements

Memory requires a storage adapter to function:
import { LibSQLStore } from '@mastra/store-libsql';

const storage = new LibSQLStore({
  id: 'agent-memory',
  url: 'file:./agent-memory.db'
});

const memory = new Memory({
  storage,
  options: {
    lastMessages: 10
  }
});

Thread Management

Create and manage conversation threads:
// Create a new thread
const thread = await memory.createThread({
  resourceId: 'user-123',
  title: 'Support conversation',
  metadata: {
    category: 'support',
    priority: 'high'
  }
});

// List threads with filters
const { threads } = await memory.listThreads({
  filter: {
    resourceId: 'user-123',
    metadata: { category: 'support' }
  },
  page: 0,
  perPage: 20
});

// Get a specific thread
const existingThread = await memory.getThreadById({ 
  threadId: 'thread-xyz' 
});

// Delete a thread
await memory.deleteThread('thread-xyz');

Using Memory with Agents

Memory integrates seamlessly with agents through input and output processors:
import { Agent, Memory } from '@mastra/core';

const memory = new Memory({
  storage,
  options: {
    lastMessages: 10,
    workingMemory: {
      enabled: true,
      scope: 'resource'
    }
  }
});

const agent = new Agent({
  name: 'Support Agent',
  model: 'openai/gpt-4o',
  memory // Memory automatically registers processors
});

// Generate a response with memory context
const result = await agent.generate('How can I help you?', {
  threadId: 'thread-123',
  resourceId: 'user-123'
});

Memory Scopes

Memory features can be scoped to either threads or resources:
  • Thread scope: memory is isolated to a single conversation thread
  • Resource scope: memory is shared across all threads for a user/resource
const memory = new Memory({
  storage,
  options: {
    // Resource-scoped: shared across all user threads
    workingMemory: {
      enabled: true,
      scope: 'resource'
    },
    // Thread-scoped: only within current thread
    semanticRecall: {
      topK: 5,
      scope: 'thread'
    }
  }
});

Read-Only Memory

Prevent internal agents from modifying memory:
const routingAgent = new Agent({
  name: 'Router',
  model: 'openai/gpt-4o-mini',
  memory,
  memoryConfig: {
    readOnly: true // Can read but not write memory
  }
});

Best Practices

Choose Appropriate Scopes

Use resource scope for user preferences and thread scope for session-specific data.

Combine Memory Types

Use conversation history + semantic recall + working memory together for rich context
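For example, the three can be layered in a single configuration. Settings are drawn from the examples above; tune the values to your model's context window:

```typescript
const memory = new Memory({
  storage,
  vector: new PgVector({ connectionString: process.env.DATABASE_URL }),
  embedder: "openai/text-embedding-3-small",
  options: {
    lastMessages: 10,                                   // conversation history
    semanticRecall: { topK: 5, messageRange: 2 },       // relevant older messages
    workingMemory: { enabled: true, scope: 'resource' } // stable user facts
  }
});
```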

Configure Token Limits

Balance context window usage with memory richness for optimal performance

Use Read-Only for Routing

Prevent routing/orchestration agents from polluting memory with intermediate decisions

Next Steps

Conversation History

Learn about thread-based message persistence

Semantic Recall

Implement RAG-based memory retrieval

Working Memory

Maintain structured user information

RAG Overview

Explore RAG capabilities in Mastra
