Overview
The Memory system in Qwen-Agent manages document storage, retrieval, and context management for agents. It provides RAG (Retrieval-Augmented Generation) capabilities, allowing agents to work with large document collections efficiently.
Memory Class
The Memory class is a specialized agent that handles file management and document retrieval:
```python
from qwen_agent.memory import Memory
from qwen_agent.llm.schema import Message

memory = Memory(
    llm={'model': 'qwen-plus'},
    files=['https://example.com/doc1.pdf', '/path/to/doc2.txt'],
    rag_cfg={
        'max_ref_token': 4000,
        'parser_page_size': 500,
        'rag_searchers': ['keyword_search', 'front_page_search'],
        'rag_keygen_strategy': 'SplitQueryThenGenKeyword'
    }
)

# Retrieve relevant content
responses = memory.run_nonstream([
    Message(role='user', content='What is the main topic of the documents?')
])
print(responses[0].content)  # Retrieved document snippets
```
Source Reference: qwen_agent/memory/memory.py:32-144
Configuration
RAG Configuration Parameters
max_ref_token (int)
Maximum number of tokens to retrieve from documents. Controls how much reference context is provided to the LLM.

parser_page_size (int)
Number of characters per document chunk. Smaller values create more granular chunks for precise retrieval.

rag_searchers (List[str], default: ['keyword_search', 'front_page_search'])
List of search strategies to use:
'keyword_search' - BM25-based keyword matching
'vector_search' - Semantic similarity using embeddings
'hybrid_search' - Combines keyword and vector search
'front_page_search' - Searches document front matter/summaries

rag_keygen_strategy (str, default: 'SplitQueryThenGenKeyword')
Strategy for generating search keywords from user queries:
'SplitQueryThenGenKeyword' - Splits the query and generates keywords
'none' - Uses the raw query without keyword generation
Requires an LLM to be configured; automatically set to 'none' if no LLM is provided.
Full Configuration Example
```python
memory = Memory(
    llm={
        'model': 'qwen-plus',
        'generate_cfg': {
            'max_input_tokens': 8000
        }
    },
    files=[
        'https://example.com/research_paper.pdf',
        '/local/path/documentation.docx'
    ],
    rag_cfg={
        'max_ref_token': 8000,
        'parser_page_size': 1000,
        'rag_searchers': ['hybrid_search'],
        'rag_keygen_strategy': 'SplitQueryThenGenKeyword'
    }
)
```
Source Reference: qwen_agent/memory/memory.py:38-76
Retrieval Workflow
Memory retrieves relevant content in four steps:
1. File Processing: Files from messages and system configuration are collected and parsed into structured chunks.
2. Keyword Generation: If configured, the LLM generates search keywords from the user's query to improve retrieval accuracy.
3. Search Execution: The configured search strategies (keyword, vector, hybrid) are used to find relevant document chunks.
4. Result Formatting: Retrieved content is formatted with source information and returned to the agent.
Source Reference: qwen_agent/memory/memory.py:81-144
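The four steps above can be sketched in plain Python. The helpers below are hypothetical and exist only to illustrate the data flow; they are not the actual qwen_agent implementation:

```python
# Minimal sketch of the retrieval workflow (illustrative helpers only).

def parse_files(files):
    # Step 1: File Processing - split each file into chunks.
    # Here parsing is faked with one chunk per "file".
    return {f: [f'chunk from {f}'] for f in files}

def gen_keywords(query):
    # Step 2: Keyword Generation - normally done by the LLM.
    return [w.lower() for w in query.split() if len(w) > 3]

def search(chunks_by_file, keywords):
    # Step 3: Search Execution - naive keyword match over chunks.
    hits = {}
    for url, chunks in chunks_by_file.items():
        matched = [c for c in chunks if any(k in c.lower() for k in keywords)]
        if matched:
            hits[url] = matched
    return hits

def format_results(hits):
    # Step 4: Result Formatting - attach source information.
    return [{'url': url, 'text': chunks} for url, chunks in hits.items()]

chunks = parse_files(['doc1.pdf'])
keywords = gen_keywords('chunk retrieval demo')
results = format_results(search(chunks, keywords))
print(results)  # [{'url': 'doc1.pdf', 'text': ['chunk from doc1.pdf']}]
```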
Using Memory in Agents
Automatic Memory Integration
Many agents automatically include Memory for document handling:
```python
from qwen_agent.agents import Assistant

# Assistant automatically includes Memory
agent = Assistant(
    function_list=['code_interpreter'],
    llm={'model': 'qwen-plus'},
    files=['research_papers.pdf'],
    rag_cfg={
        'max_ref_token': 6000,
        'rag_searchers': ['hybrid_search']
    }
)

# Memory is accessible via agent.mem
print(agent.mem.system_files)  # ['research_papers.pdf']
```
Manual Memory Access
```python
from qwen_agent import Agent
from qwen_agent.memory import Memory

class MyAgent(Agent):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.mem = Memory(
            llm=self.llm,
            files=kwargs.get('files', []),
            rag_cfg=kwargs.get('rag_cfg', {})
        )

    def _run(self, messages, **kwargs):
        # Retrieve relevant content
        mem_responses = self.mem.run_nonstream(messages)
        knowledge = mem_responses[0].content
        # Use knowledge in your workflow (add_knowledge is your own helper)
        enhanced_messages = self.add_knowledge(messages, knowledge)
        # Continue with LLM
        return self._call_llm(enhanced_messages)
```
Source Reference: qwen_agent/agents/fncall_agent.py:56-71
Document Management
Supported File Types
Memory supports various document formats:
PDF - .pdf
Word - .docx
PowerPoint - .pptx
Text - .txt
Spreadsheets - .csv, .tsv, .xlsx, .xls
HTML - .html, .htm
Source Reference: qwen_agent/memory/memory.py:28
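If you assemble file lists programmatically, a simple extension check against the formats above can catch unsupported files early. This is only a sketch; qwen_agent performs its own validation:

```python
# Hypothetical pre-filter for the supported formats listed above.
import os

SUPPORTED_EXTS = {'.pdf', '.docx', '.pptx', '.txt', '.csv', '.tsv',
                  '.xlsx', '.xls', '.html', '.htm'}

def filter_supported(files):
    # Keep only files whose extension Memory can parse.
    return [f for f in files if os.path.splitext(f)[1].lower() in SUPPORTED_EXTS]

print(filter_supported(['a.pdf', 'b.exe', 'c.HTML']))  # ['a.pdf', 'c.HTML']
```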
Adding Files
System Files
```python
# Files provided at initialization
memory = Memory(
    files=[
        'https://example.com/doc.pdf',
        '/local/file.txt'
    ]
)

# These are always available
print(memory.system_files)
```
File Retrieval
```python
from qwen_agent.llm.schema import Message

# Get all files available for retrieval
messages = [
    Message(role='user', content='query', file='doc.pdf')
]
rag_files = memory.get_rag_files(messages)
# Returns: system_files + files from messages
```
Source Reference: qwen_agent/memory/memory.py:146-154
Search Strategies
Keyword Search
BM25-based keyword matching for exact term retrieval:
```python
memory = Memory(
    files=['documents.pdf'],
    rag_cfg={
        'rag_searchers': ['keyword_search']
    }
)
# Good for: specific terms, names, technical vocabulary
```
Vector Search
Semantic similarity search using embeddings:
```python
memory = Memory(
    files=['documents.pdf'],
    rag_cfg={
        'rag_searchers': ['vector_search']
    }
)
# Good for: conceptual queries, paraphrased questions
```
Hybrid Search
Combines keyword and vector search for best results:
```python
memory = Memory(
    files=['documents.pdf'],
    rag_cfg={
        'rag_searchers': ['hybrid_search']
    }
)
# Good for: general queries, balanced precision and recall
```
Front Page Search
Searches document front matter and summaries:
```python
memory = Memory(
    files=['documents.pdf'],
    rag_cfg={
        'rag_searchers': ['front_page_search']
    }
)
# Good for: document overview, metadata, titles
```
Combining Strategies
```python
memory = Memory(
    files=['documents.pdf'],
    rag_cfg={
        'rag_searchers': [
            'front_page_search',  # Check summaries first
            'keyword_search',     # Then specific terms
            'vector_search'       # Finally semantic search
        ]
    }
)
# Results are combined and ranked
```
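One simple way to picture the combination step: chunks returned by more searchers rank higher. This toy merge is an illustration only, not qwen_agent's actual ranking logic:

```python
# Toy merge of results from several searchers.
from collections import Counter

def merge_results(*result_lists):
    counts = Counter()
    for results in result_lists:
        counts.update(results)
    # Chunks found by more searchers rank higher.
    return [chunk for chunk, _ in counts.most_common()]

front = ['summary chunk']
keyword = ['summary chunk', 'term chunk']
vector = ['semantic chunk', 'term chunk']
print(merge_results(front, keyword, vector))
```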
Keyword Generation
Keyword generation enhances queries for better retrieval:
With Keyword Generation
```python
memory = Memory(
    llm={'model': 'qwen-plus'},
    files=['documents.pdf'],
    rag_cfg={
        'rag_keygen_strategy': 'SplitQueryThenGenKeyword'
    }
)

# User query: "How do I configure the system?"
# Enhanced query: {
#     "text": "How do I configure the system?",
#     "keywords": ["configuration", "setup", "settings", "system"]
# }
```
Source Reference: qwen_agent/memory/memory.py:106-132
Message Management
Message Schema
Memory works with the standard Message format:
```python
from qwen_agent.llm.schema import Message, SYSTEM, USER, ASSISTANT, FUNCTION

# System message (optional)
Message(role=SYSTEM, content='You are a helpful assistant.')

# User message
Message(role=USER, content='What is machine learning?')

# Assistant response
Message(role=ASSISTANT, content='Machine learning is...')

# Function call result
Message(role=FUNCTION, name='tool_name', content='Tool output')
```
Source Reference: qwen_agent/llm/schema.py:132-164
Context Management
Memory automatically manages context length:
```python
memory = Memory(
    llm={
        'model': 'qwen-plus',
        'generate_cfg': {
            'max_input_tokens': 6000  # Truncates if exceeded
        }
    },
    rag_cfg={
        'max_ref_token': 4000  # Limits retrieved content
    }
)

# If retrieved content + messages > max_input_tokens,
# older messages are automatically truncated
```
Source Reference: qwen_agent/llm/base.py:602-804
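The truncation idea can be sketched with a rough word-count budget. The real logic in qwen_agent.llm.base is token-based and considerably more involved; this only shows the drop-oldest-first principle:

```python
# Sketch: keep the newest messages that fit after reserving space
# for the retrieved knowledge. "Tokens" here are just word counts.

def truncate_history(messages, retrieved, max_input_tokens):
    count = lambda text: len(text.split())
    budget = max_input_tokens - count(retrieved)
    kept = []
    # Walk from the newest message backwards, dropping older ones
    # once the budget is exhausted.
    for role, content in reversed(messages):
        if count(content) <= budget:
            kept.append((role, content))
            budget -= count(content)
        else:
            break
    return list(reversed(kept))

history = [('user', 'a b c d e'), ('assistant', 'f g'), ('user', 'h i')]
print(truncate_history(history, 'x y z', 8))
```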
Retrieval Results
Memory returns structured retrieval results:
```python
# JSON format with source attribution
result = [
    {
        "url": "document.pdf",
        "text": [
            "First relevant chunk from the document...",
            "Second relevant chunk...",
            "Third relevant chunk..."
        ]
    },
    {
        "url": "another_doc.txt",
        "text": [
            "Relevant content from another document..."
        ]
    }
]

# Can be easily formatted for display
import json
print(json.dumps(result, indent=2, ensure_ascii=False))
```
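For prompting, the structured result is often flattened into a plain knowledge string. The helper below is hypothetical; its output format is an illustration, not qwen_agent's own:

```python
# Hypothetical: flatten retrieval results into a knowledge string.

def format_knowledge(result):
    lines = []
    for doc in result:
        lines.append(f"[Source: {doc['url']}]")
        lines.extend(doc['text'])
    return '\n'.join(lines)

result = [{'url': 'document.pdf', 'text': ['First chunk.', 'Second chunk.']}]
print(format_knowledge(result))
```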
Advanced Usage
Custom Memory Subclass
```python
from qwen_agent.memory import Memory
from qwen_agent.llm.schema import Message, ASSISTANT

class CustomMemory(Memory):
    def _run(self, messages, **kwargs):
        # Get standard retrieval results (keep the last yielded batch)
        responses = []
        for responses in super()._run(messages, **kwargs):
            pass

        # Post-process results
        content = responses[0].content
        enhanced_content = self.enhance_retrieval(content)
        yield [Message(role=ASSISTANT, content=enhanced_content, name='memory')]

    def enhance_retrieval(self, content):
        # Your custom enhancement logic,
        # e.g. reranking, filtering, summarization
        return content
```
Memory without LLM
```python
from qwen_agent.memory import Memory
from qwen_agent.llm.schema import Message

# Memory can work without an LLM (no keyword generation)
memory = Memory(
    llm=None,  # No LLM
    files=['documents.pdf'],
    rag_cfg={
        'rag_keygen_strategy': 'none',  # Set automatically when llm is None
        'rag_searchers': ['keyword_search']
    }
)

# Direct retrieval without keyword enhancement
responses = memory.run_nonstream([
    Message(role='user', content='search query')
])
```
Source Reference: qwen_agent/memory/memory.py:61-63
Memory automatically includes retrieval tools:
RAG-based document retrieval:
```python
# Automatically added to Memory
tool = memory.function_map['retrieval']

# Direct usage
result = tool.call({
    'query': 'What is AI?',
    'files': ['ai_book.pdf', 'ml_paper.pdf']
})
```
Document parsing without retrieval:
```python
# Also included in Memory
tool = memory.function_map['doc_parser']

# Parse entire document
result = tool.call({
    'url': 'document.pdf'
})
```
Source Reference: qwen_agent/memory/memory.py:66-77
Best Practices
Search Strategy Selection
Use keyword_search for technical documents with specific terms
Use vector_search for conceptual/semantic queries
Use hybrid_search for general-purpose retrieval
Use front_page_search to quickly get document overviews
Chunk Size Tuning
Smaller parser_page_size (300-500) for precise retrieval
Larger parser_page_size (800-1000) for context preservation
Balance between granularity and coherence
Test with your specific document types
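A toy chunker makes the granularity trade-off concrete. This is an illustration only, not qwen_agent's parser:

```python
# parser_page_size is characters per chunk: smaller values yield more,
# finer-grained chunks; larger values preserve more context per chunk.

def chunk(text, page_size):
    return [text[i:i + page_size] for i in range(0, len(text), page_size)]

doc = 'x' * 1200
print(len(chunk(doc, 500)))   # 3 chunks: more granular
print(len(chunk(doc, 1000)))  # 2 chunks: more context per chunk
```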
Token Management
Set max_ref_token based on your LLM’s context window
Leave room for conversation history
Typical: 4000-8000 tokens for retrieval
Monitor context usage in long conversations
File Organization
Use system_files for persistent knowledge base
Pass session-specific files in messages
Group related documents together
Consider document size and parsing time
A configuration that balances these practices:

```python
memory = Memory(
    llm={
        'model': 'qwen-turbo',  # Faster model for keyword generation
        'generate_cfg': {
            'max_input_tokens': 30000  # Larger context
        }
    },
    files=['documents.pdf'],
    rag_cfg={
        'max_ref_token': 6000,
        'parser_page_size': 800,  # Balanced chunk size
        'rag_searchers': ['hybrid_search'],  # Single efficient searcher
        'rag_keygen_strategy': 'SplitQueryThenGenKeyword'
    }
)
```
Agents: Learn how agents integrate with Memory
Tools: Understand the Retrieval and DocParser tools
LLM Configuration: Configure LLMs for keyword generation