Mastra’s memory system enables agents to maintain context across conversations through multiple memory types: conversation history, semantic recall, working memory, and observational memory.
## Core Concepts
Memory in Mastra operates on three key abstractions:
- **Threads**: Individual conversation sessions with unique IDs
- **Resources**: Users or entities that own multiple threads
- **Storage**: Pluggable backends for persisting memory data
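To make the relationship between these abstractions concrete, here is a minimal, purely illustrative sketch in plain TypeScript. This is not Mastra's implementation; the `InMemoryStore` class and its methods are invented for illustration only:

```typescript
// Illustrative only: a toy model of threads owned by resources.
type Thread = { id: string; resourceId: string; messages: string[] };

class InMemoryStore {
  private threads = new Map<string, Thread>();

  createThread(id: string, resourceId: string): Thread {
    const thread: Thread = { id, resourceId, messages: [] };
    this.threads.set(id, thread);
    return thread;
  }

  // A resource (user) owns every thread tagged with its resourceId
  threadsForResource(resourceId: string): Thread[] {
    return [...this.threads.values()].filter(t => t.resourceId === resourceId);
  }
}

const store = new InMemoryStore();
store.createThread("thread-1", "user-123");
store.createThread("thread-2", "user-123");
console.log(store.threadsForResource("user-123").length); // 2
```

In Mastra, a real storage adapter (such as `LibSQLStore`) plays this persistence role, and threads are keyed by `threadId` and `resourceId` in the same way.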
## Memory Types

### Conversation History
Stores recent messages from the current thread, providing short-term conversational continuity.
```typescript
const memory = new Memory({
  storage,
  options: {
    lastMessages: 10 // Include last 10 messages
  }
});
```
### Semantic Recall
Uses vector embeddings to retrieve relevant past messages based on semantic similarity.
```typescript
import { PgVector } from '@mastra/pg';

const memory = new Memory({
  storage,
  vector: new PgVector({ connectionString: process.env.DATABASE_URL }),
  embedder: "openai/text-embedding-3-small",
  options: {
    semanticRecall: {
      topK: 5,
      messageRange: 2,
      scope: 'resource' // Search across all user threads
    }
  }
});
```
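Under the hood, semantic recall is a nearest-neighbor search over message embeddings: the query is embedded, then ranked against stored message embeddings. The following is an illustrative sketch of the top-K ranking step only; Mastra delegates this work to the configured vector store and embedder, and the `cosine`/`topK` helpers here are invented for the example:

```typescript
// Illustrative top-K retrieval by cosine similarity (not Mastra's implementation).
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Rank stored message embeddings against the query and keep the K most similar
function topK(
  query: number[],
  messages: { text: string; embedding: number[] }[],
  k: number
): string[] {
  return [...messages]
    .sort((x, y) => cosine(query, y.embedding) - cosine(query, x.embedding))
    .slice(0, k)
    .map(m => m.text);
}
```

The `topK` option in the configuration above controls exactly this `k`, while `messageRange` additionally pulls in neighboring messages around each match for context.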
### Working Memory
Maintains a structured record of user information and preferences that agents update over time.
```typescript
const memory = new Memory({
  storage,
  options: {
    workingMemory: {
      enabled: true,
      scope: 'resource',
      template: `
# User Profile
- **Name**:
- **Preferences**:
- **Goals**:
`
    }
  }
});
```
### Observational Memory
A long-term memory system in which Observer and Reflector agents extract and compress observations from conversations.
```typescript
const memory = new Memory({
  storage,
  options: {
    observationalMemory: {
      scope: 'resource',
      observation: {
        messageTokens: 20_000
      },
      reflection: {
        observationTokens: 90_000
      }
    }
  }
});
```
## Configuration Structure
### Shared Memory Configuration
```typescript
type MemoryConfig = {
  // Prevent memory from saving new messages
  readOnly?: boolean;
  // Number of recent messages to include
  lastMessages?: number | false;
  // Semantic recall configuration
  semanticRecall?: boolean | SemanticRecall;
  // Working memory configuration
  workingMemory?: WorkingMemory;
  // Observational memory configuration
  observationalMemory?: boolean | ObservationalMemoryOptions;
  // Auto-generate thread titles
  generateTitle?: boolean | {
    model: MastraModelConfig;
    instructions?: string;
  };
};
```
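Several of these options can be combined in one configuration. As a hedged sketch, the following shows `lastMessages`, `semanticRecall`, `workingMemory`, and `generateTitle` together; the model string and the `instructions` text are illustrative choices, not defaults:

```typescript
const config: MemoryConfig = {
  lastMessages: 20,
  semanticRecall: { topK: 3, messageRange: 2, scope: 'resource' },
  workingMemory: { enabled: true, scope: 'resource' },
  generateTitle: {
    model: 'openai/gpt-4o-mini',
    instructions: 'Summarize the conversation topic in five words or fewer.'
  }
};
```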
## Storage Requirements
Memory requires a storage adapter to function:
```typescript
import { LibSQLStore } from '@mastra/store-libsql';

const storage = new LibSQLStore({
  id: 'agent-memory',
  url: 'file:./agent-memory.db'
});

const memory = new Memory({
  storage,
  options: {
    lastMessages: 10
  }
});
```
## Thread Management
Create and manage conversation threads:
```typescript
// Create a new thread
const thread = await memory.createThread({
  resourceId: 'user-123',
  title: 'Support conversation',
  metadata: {
    category: 'support',
    priority: 'high'
  }
});

// List threads with filters
const { threads } = await memory.listThreads({
  filter: {
    resourceId: 'user-123',
    metadata: { category: 'support' }
  },
  page: 0,
  perPage: 20
});

// Get a specific thread
const existingThread = await memory.getThreadById({
  threadId: 'thread-xyz'
});

// Delete a thread
await memory.deleteThread('thread-xyz');
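Since `listThreads` is paginated, collecting every thread for a resource means looping over pages. The sketch below is illustrative only: `fetchPage` stands in for a call like `memory.listThreads({ page, perPage })`, and the `{ threads, hasMore }` return shape is an assumption made for the example, not Mastra's documented return type:

```typescript
// Illustrative sketch: draining a paginated thread listing.
type Page<T> = { threads: T[]; hasMore: boolean };

async function collectAllThreads<T>(
  fetchPage: (page: number) => Promise<Page<T>>
): Promise<T[]> {
  const all: T[] = [];
  for (let page = 0; ; page++) {
    const { threads, hasMore } = await fetchPage(page);
    all.push(...threads);
    if (!hasMore) break; // stop once the backend reports no further pages
  }
  return all;
}
```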
## Using Memory with Agents
Memory integrates seamlessly with agents through input and output processors:
```typescript
import { Agent, Memory } from '@mastra/core';

const memory = new Memory({
  storage,
  options: {
    lastMessages: 10,
    workingMemory: {
      enabled: true,
      scope: 'resource'
    }
  }
});

const agent = new Agent({
  name: 'Support Agent',
  model: 'openai/gpt-4o',
  memory // Memory automatically registers processors
});

// Generate a response with memory context
const result = await agent.generate('How can I help you?', {
  threadId: 'thread-123',
  resourceId: 'user-123'
});
```
## Memory Scopes
Memory features can be scoped to either threads or resources:
- **Thread Scope**: Memory is isolated to a single conversation thread
- **Resource Scope**: Memory is shared across all threads for a user/resource
```typescript
const memory = new Memory({
  storage,
  options: {
    // Resource-scoped: shared across all user threads
    workingMemory: {
      enabled: true,
      scope: 'resource'
    },
    // Thread-scoped: only within current thread
    semanticRecall: {
      topK: 5,
      scope: 'thread'
    }
  }
});
```
## Read-Only Memory
Prevent internal agents from modifying memory:
```typescript
const routingAgent = new Agent({
  name: 'Router',
  model: 'openai/gpt-4o-mini',
  memory,
  memoryConfig: {
    readOnly: true // Can read but not write memory
  }
});
```
## Best Practices
- **Choose Appropriate Scopes**: Use resource scope for user preferences, thread scope for session-specific data
- **Combine Memory Types**: Use conversation history, semantic recall, and working memory together for rich context
- **Configure Token Limits**: Balance context window usage with memory richness for optimal performance
- **Use Read-Only for Routing**: Prevent routing/orchestration agents from polluting memory with intermediate decisions
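Putting these practices together, one configuration for a support agent might look like the following sketch; the specific option values are illustrative, not recommendations:

```typescript
const memory = new Memory({
  storage,
  options: {
    // Short-term continuity from recent turns in the current thread
    lastMessages: 10,
    // Long-term retrieval shared across the user's threads
    semanticRecall: { topK: 5, messageRange: 2, scope: 'resource' },
    // Durable, structured user profile shared across threads
    workingMemory: { enabled: true, scope: 'resource' }
  }
});
```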
## Next Steps
- **Conversation History**: Learn about thread-based message persistence
- **Semantic Recall**: Implement RAG-based memory retrieval
- **Working Memory**: Maintain structured user information
- **RAG Overview**: Explore RAG capabilities in Mastra