Memory in Mastra provides thread-based conversation persistence with semantic recall, working memory, and observational memory. It enables agents to maintain context across conversations and recall relevant information.

Core Concept

Memory in Mastra:
  • Thread-based: Organize conversations into threads per user/resource
  • Semantic Recall: Vector-based similarity search for relevant history
  • Working Memory: Structured facts that persist across conversations
  • Observational Memory: Learn user patterns and behaviors over time
  • Processor-based: Integrates via input/output processors

Basic Memory

Create a simple memory instance:
import { Agent } from '@mastra/core/agent';
import { Memory } from '@mastra/memory';
import { PostgresStore } from '@mastra/postgres';

const memory = new Memory({
  name: 'conversation',
  storage: new PostgresStore({
    connectionString: process.env.DATABASE_URL
  }),
  options: {
    lastMessages: 20 // Include last 20 messages in context
  }
});

const agent = new Agent({
  id: 'assistant',
  instructions: 'You are a helpful assistant',
  model: 'openai/gpt-5',
  memory
});

// Conversation persists across calls
const result1 = await agent.generate(
  'My name is Alice',
  { threadId: 'user-123' }
);

const result2 = await agent.generate(
  'What is my name?',
  { threadId: 'user-123' }
);
console.log(result2.text); // "Your name is Alice"

Memory Configuration

interface MemoryConfig {
  // Message history
  lastMessages?: number;              // Number of recent messages to include
  
  // Semantic recall
  semanticRecall?: boolean | {
    topK?: number;                    // Number of similar messages to recall
    threshold?: number;               // Similarity threshold (0-1)
  };
  
  // Working memory
  workingMemory?: {
    enabled: boolean;
    template?: string;                // Markdown template
    schema?: ZodObject | JSONSchema7; // Structured schema
    scope?: 'thread' | 'resource';    // Storage scope
  };
  
  // Observational memory
  observationalMemory?: {
    enabled: boolean;
    updateInterval?: number;          // Update frequency in messages
  };
  
  // Thread management
  generateTitle?: boolean;            // Auto-generate thread titles
}
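Putting the interface together, a fully-populated configuration might look like the following sketch. The field values are illustrative defaults, not recommendations:

```typescript
import { z } from 'zod';
import { Memory } from '@mastra/memory';

// Illustrative: every MemoryConfig field set at once.
const memory = new Memory({
  name: 'conversation',
  options: {
    lastMessages: 20,                              // recent-history window
    semanticRecall: { topK: 5, threshold: 0.7 },   // vector recall settings
    workingMemory: {
      enabled: true,
      schema: z.object({ name: z.string().optional() }),
      scope: 'resource'                            // shared across a user's threads
    },
    observationalMemory: { enabled: true, updateInterval: 5 },
    generateTitle: true                            // auto-title new threads
  }
});
```

In practice you would enable only the features your agent needs; each option adds storage and token overhead.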

Threads and Resources

Memory organizes conversations using threads and resources:
  • Thread: A single conversation (e.g., a chat session)
  • Resource: An entity that owns threads (e.g., a user)
const result = await agent.generate('Hello', {
  threadId: 'thread-456',    // Specific conversation
  resourceId: 'user-123'     // User who owns the thread
});

// List threads for a user
const threads = await memory.listThreads({
  resourceId: 'user-123'
});

// Get messages from a thread
const messages = await memory.getMessages({
  threadId: 'thread-456'
});

Semantic Recall

Retrieve relevant past messages using vector similarity:
import { PineconeVector } from '@mastra/pinecone';

const memory = new Memory({
  name: 'conversation',
  storage: new PostgresStore({ connectionString: process.env.DATABASE_URL }),
  vector: new PineconeVector({
    apiKey: process.env.PINECONE_API_KEY,
    indexName: 'conversations'
  }),
  embedder: 'openai/text-embedding-3-small',
  options: {
    semanticRecall: {
      topK: 5,        // Retrieve 5 most similar messages
      threshold: 0.7  // Minimum similarity score
    }
  }
});

const agent = new Agent({
  id: 'assistant',
  model: 'openai/gpt-5',
  memory
});

// Agent automatically recalls relevant past messages
const result = await agent.generate(
  'What did I say about my vacation plans?',
  { threadId: 'user-123' }
);
// Memory retrieves similar messages from history

Working Memory

Maintain structured facts across conversations:

Template-based Working Memory

const memory = new Memory({
  name: 'conversation',
  options: {
    workingMemory: {
      enabled: true,
      template: `
# User Profile
- **Name**: 
- **Location**: 
- **Occupation**: 
- **Interests**: 
- **Goals**: 
- **Recent Events**: 
      `
    }
  }
});

const agent = new Agent({
  id: 'assistant',
  memory
});

// Agent updates working memory automatically
const result = await agent.generate(
  'My name is Alice, I live in Paris, and I\'m a software engineer',
  { threadId: 'user-123', resourceId: 'user-123' }
);

// Retrieve working memory
const workingMemory = await memory.getWorkingMemory({
  threadId: 'user-123',
  resourceId: 'user-123'
});
console.log(workingMemory.content);
// "# User Profile\n- **Name**: Alice\n- **Location**: Paris\n- **Occupation**: Software engineer"

Schema-based Working Memory

import { z } from 'zod';

const memory = new Memory({
  name: 'conversation',
  options: {
    workingMemory: {
      enabled: true,
      schema: z.object({
        name: z.string().optional(),
        email: z.string().email().optional(),
        preferences: z.array(z.string()).optional(),
        lastContact: z.string().optional()
      })
    }
  }
});

// Agent updates structured working memory
const result = await agent.generate(
  'I prefer emails over phone calls',
  { threadId: 'user-123', resourceId: 'user-123' }
);

// Retrieve as JSON
const workingMemory = await memory.getWorkingMemory({
  threadId: 'user-123',
  resourceId: 'user-123'
});
console.log(JSON.parse(workingMemory.content));
// { preferences: ['emails'], ... }

Working Memory Scope

Control whether working memory persists per thread or resource:
const memory = new Memory({
  options: {
    workingMemory: {
      enabled: true,
      scope: 'resource', // Default: shared across all threads
      template: '...'
    }
  }
});

// OR

const memory2 = new Memory({
  options: {
    workingMemory: {
      enabled: true,
      scope: 'thread', // Isolated per conversation
      template: '...'
    }
  }
});

Observational Memory

Learn user patterns and behaviors:
const memory = new Memory({
  name: 'conversation',
  options: {
    observationalMemory: {
      enabled: true,
      updateInterval: 5 // Update after every 5 messages
    }
  }
});

// Memory tracks patterns like:
// - Communication style
// - Common topics
// - Interaction patterns
// - User preferences

const agent = new Agent({ memory });

// Agent adapts based on learned patterns
const result = await agent.generate(
  'How should I plan my day?',
  { resourceId: 'user-123' }
);
// Response considers learned preferences and patterns

Message Management

List Messages

const messages = await memory.getMessages({
  threadId: 'thread-456',
  limit: 50,
  offset: 0
});

Delete Messages

await memory.deleteMessages({
  threadId: 'thread-456',
  messageIds: ['msg-1', 'msg-2']
});

Update Message

await memory.updateMessage({
  threadId: 'thread-456',
  messageId: 'msg-1',
  content: 'Updated content'
});

Thread Management

Create Thread

const thread = await memory.createThread({
  resourceId: 'user-123',
  title: 'Planning meeting',
  metadata: { department: 'engineering' }
});

List Threads

const threads = await memory.listThreads({
  resourceId: 'user-123',
  limit: 20,
  offset: 0
});

Delete Thread

await memory.deleteThread({
  threadId: 'thread-456'
});

Clone Thread

const newThread = await memory.cloneThread({
  threadId: 'thread-456',
  resourceId: 'user-123',
  title: 'Continued discussion'
});

Processors

Memory integrates via input/output processors:
import { MessageHistory, SemanticRecall, WorkingMemory } from '@mastra/memory/processors';

// Manually add memory processors
const agent = new Agent({
  id: 'assistant',
  inputProcessors: [
    new MessageHistory({ memory, lastMessages: 20 }),
    new SemanticRecall({ memory, topK: 5 }),
    new WorkingMemory({ memory })
  ],
  outputProcessors: [
    new MessageHistory({ memory }) // Persist messages
  ]
});
When you pass a memory instance to an agent, these processors are automatically configured.

Storage Integration

Memory requires storage for persistence:
import { PostgresStore } from '@mastra/postgres';

const storage = new PostgresStore({
  connectionString: process.env.DATABASE_URL
});

const memory = new Memory({
  name: 'conversation',
  storage // Explicitly provide storage
});

// OR let Mastra inject storage
const mastra = new Mastra({
  storage,
  agents: {
    assistant: new Agent({
      memory: new Memory({ name: 'conversation' })
      // Memory inherits Mastra's storage
    })
  }
});

Auto-Title Generation

Automatically generate thread titles:
const memory = new Memory({
  name: 'conversation',
  options: {
    generateTitle: true
  }
});

const agent = new Agent({ memory });

const result = await agent.generate(
  'I need help planning a trip to Japan',
  { threadId: 'thread-456', resourceId: 'user-123' }
);

// Thread title generated: "Trip Planning to Japan"
const threads = await memory.listThreads({ resourceId: 'user-123' });
console.log(threads[0].title); // "Trip Planning to Japan"

Request Context

Memory uses request context for dynamic behavior:
const memory = new Memory({
  name: 'conversation',
  options: ({ requestContext }) => {
    const userTier = requestContext.get('userTier');
    return {
      lastMessages: userTier === 'premium' ? 50 : 20,
      semanticRecall: userTier === 'premium' ? { topK: 10 } : false
    };
  }
});

const ctx = new RequestContext();
ctx.set('userTier', 'premium');

const result = await agent.generate('Hello', {
  requestContext: ctx,
  threadId: 'user-123'
});

Best Practices

Let Mastra inject storage into memory:
const mastra = new Mastra({
  storage: new PostgresStore({ ... }),
  agents: {
    assistant: new Agent({
      memory: new Memory({ name: 'conversation' })
    })
  }
});
Always provide threadId and resourceId:
const result = await agent.generate(prompt, {
  threadId: 'thread-456',
  resourceId: 'user-123'
});
Use vector search for retrieving relevant context:
const memory = new Memory({
  vector: pineconeVector,
  embedder: 'openai/text-embedding-3-small',
  options: { semanticRecall: { topK: 5 } }
});
Store facts that need to persist:
const memory = new Memory({
  options: {
    workingMemory: {
      enabled: true,
      schema: z.object({ ... })
    }
  }
});
