Overview

The Memory system in Qwen-Agent manages document storage, retrieval, and context management for agents. It provides RAG (Retrieval-Augmented Generation) capabilities, allowing agents to work with large document collections efficiently.

Memory Class

The Memory class is a specialized agent that handles file management and document retrieval:
from qwen_agent.memory import Memory

memory = Memory(
    llm={'model': 'qwen-plus'},
    files=['https://example.com/doc1.pdf', '/path/to/doc2.txt'],
    rag_cfg={
        'max_ref_token': 4000,
        'parser_page_size': 500,
        'rag_searchers': ['keyword_search', 'front_page_search'],
        'rag_keygen_strategy': 'SplitQueryThenGenKeyword'
    }
)

# Retrieve relevant content
from qwen_agent.llm.schema import Message

responses = memory.run_nonstream([
    Message(role='user', content='What is the main topic of the documents?')
])

print(responses[0].content)  # Retrieved document snippets
Source Reference: qwen_agent/memory/memory.py:32-144

Configuration

RAG Configuration Parameters

max_ref_token (int, default: 4000)
Maximum number of tokens of document content to retrieve. Controls how much reference context is provided to the LLM.

parser_page_size (int, default: 500)
Number of characters per document chunk. Smaller values create more granular chunks for more precise retrieval.

rag_searchers (List[str], default: ['keyword_search', 'front_page_search'])
List of search strategies to use:
  • 'keyword_search' - BM25-based keyword matching
  • 'vector_search' - Semantic similarity using embeddings
  • 'hybrid_search' - Combines keyword and vector search
  • 'front_page_search' - Searches document front matter/summaries

rag_keygen_strategy (str, default: 'SplitQueryThenGenKeyword')
Strategy for generating search keywords from user queries:
  • 'SplitQueryThenGenKeyword' - Splits the query and generates keywords for each part
  • 'none' - Uses the raw query without keyword generation
Requires an LLM to be configured; automatically set to 'none' if no LLM is provided.
A fully configured example:
memory = Memory(
    llm={
        'model': 'qwen-plus',
        'generate_cfg': {
            'max_input_tokens': 8000
        }
    },
    files=[
        'https://example.com/research_paper.pdf',
        '/local/path/documentation.docx'
    ],
    rag_cfg={
        'max_ref_token': 8000,
        'parser_page_size': 1000,
        'rag_searchers': ['hybrid_search'],
        'rag_keygen_strategy': 'SplitQueryThenGenKeyword'
    }
)
Source Reference: qwen_agent/memory/memory.py:38-76

Retrieval Workflow

Memory retrieves relevant content through several steps:

Retrieval Process

1. File Processing: Files from messages and system configuration are collected and parsed into structured chunks.
2. Keyword Generation: If configured, the LLM generates search keywords from the user’s query to improve retrieval accuracy.
3. Search Execution: The configured search strategies (keyword, vector, hybrid) are used to find relevant document chunks.
4. Result Formatting: Retrieved content is formatted with source information and returned to the agent.
Source Reference: qwen_agent/memory/memory.py:81-144
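The four steps above can be sketched as plain Python. This is an illustrative toy, not the actual qwen_agent internals; parse_into_chunks, keyword_search, and FAKE_CORPUS are stand-ins invented for the example:

```python
def parse_into_chunks(path, page_size=500):
    # Stub parser: pretend each file yields fixed-size text chunks.
    text = FAKE_CORPUS.get(path, "")
    return [{"url": path, "text": text[i:i + page_size]}
            for i in range(0, len(text), page_size)]

def keyword_search(keywords, chunks):
    # Toy searcher: a chunk matches if it contains any keyword.
    return [c for c in chunks
            if any(k.lower() in c["text"].lower() for k in keywords)]

def retrieve(query, files, searchers, keygen=None):
    # 1. File processing: parse every file into chunks.
    chunks = [c for f in files for c in parse_into_chunks(f)]
    # 2. Keyword generation: optionally expand the query.
    keywords = keygen(query) if keygen else [query]
    # 3. Search execution: pool hits from every configured searcher.
    hits = []
    for search in searchers:
        hits.extend(h for h in search(keywords, chunks) if h not in hits)
    # 4. Result formatting: group retrieved text by source file.
    by_url = {}
    for h in hits:
        by_url.setdefault(h["url"], []).append(h["text"])
    return [{"url": u, "text": t} for u, t in by_url.items()]

FAKE_CORPUS = {"doc.pdf": "Qwen-Agent retrieval pools chunks from documents."}
result = retrieve("retrieval", ["doc.pdf"], [keyword_search])
```

The real implementation adds token budgeting (max_ref_token) and per-searcher ranking on top of this shape.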

Using Memory in Agents

Automatic Memory Integration

Many agents automatically include Memory for document handling:
from qwen_agent.agents import Assistant

# Assistant automatically includes Memory
agent = Assistant(
    function_list=['code_interpreter'],
    llm={'model': 'qwen-plus'},
    files=['research_papers.pdf'],
    rag_cfg={
        'max_ref_token': 6000,
        'rag_searchers': ['hybrid_search']
    }
)

# Memory is accessible via agent.mem
print(agent.mem.system_files)  # ['research_papers.pdf']

Manual Memory Access

from qwen_agent import Agent
from qwen_agent.memory import Memory

class MyAgent(Agent):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.mem = Memory(
            llm=self.llm,
            files=kwargs.get('files', []),
            rag_cfg=kwargs.get('rag_cfg', {})
        )
    
    def _run(self, messages, **kwargs):
        # Retrieve relevant content
        mem_responses = self.mem.run_nonstream(messages)
        knowledge = mem_responses[0].content
        
        # Use knowledge in your workflow
        enhanced_messages = self.add_knowledge(messages, knowledge)
        
        # Continue with LLM
        return self._call_llm(enhanced_messages)
Source Reference: qwen_agent/agents/fncall_agent.py:56-71

Document Management

Supported File Types

Memory supports various document formats:
  • PDF - .pdf
  • Word - .docx
  • PowerPoint - .pptx
  • Text - .txt
  • Spreadsheets - .csv, .tsv, .xlsx, .xls
  • HTML - .html, .htm
Source Reference: qwen_agent/memory/memory.py:28

Adding Files

# Files provided at initialization
memory = Memory(
    files=[
        'https://example.com/doc.pdf',
        '/local/file.txt'
    ]
)

# These are always available
print(memory.system_files)

File Retrieval

# Get all files available for retrieval
from qwen_agent.llm.schema import ContentItem, Message

messages = [
    Message(role='user', content=[
        ContentItem(text='query'),
        ContentItem(file='doc.pdf'),
    ])
]

rag_files = memory.get_rag_files(messages)
# Returns: system_files + files from messages
Source Reference: qwen_agent/memory/memory.py:146-154

Search Strategies

keyword_search

BM25-based keyword matching for exact term retrieval:
memory = Memory(
    files=['documents.pdf'],
    rag_cfg={
        'rag_searchers': ['keyword_search']
    }
)

# Good for: specific terms, names, technical vocabulary
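For intuition, here is a from-scratch toy BM25 scorer of the kind keyword_search relies on (not the library's implementation; documents are pre-tokenized lists of words):

```python
import math
from collections import Counter

def bm25_scores(query_terms, docs, k1=1.5, b=0.75):
    # Score each doc (a list of tokens) against the query terms with BM25.
    N = len(docs)
    avgdl = sum(len(d) for d in docs) / N
    # Document frequency of each query term across the corpus.
    df = {t: sum(1 for d in docs if t in d) for t in query_terms}
    scores = []
    for d in docs:
        tf = Counter(d)
        s = 0.0
        for t in query_terms:
            if df[t] == 0:
                continue
            idf = math.log(1 + (N - df[t] + 0.5) / (df[t] + 0.5))
            # Term-frequency saturation and length normalization.
            s += idf * tf[t] * (k1 + 1) / (
                tf[t] + k1 * (1 - b + b * len(d) / avgdl))
        scores.append(s)
    return scores

docs = [
    "qwen agent memory retrieval".split(),
    "unrelated cooking recipe text".split(),
]
scores = bm25_scores(["memory", "retrieval"], docs)
```

Documents containing the query terms score higher; documents containing none score zero.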
vector_search

Semantic similarity search using embeddings:
memory = Memory(
    files=['documents.pdf'],
    rag_cfg={
        'rag_searchers': ['vector_search']
    }
)

# Good for: conceptual queries, paraphrased questions
hybrid_search

Combines keyword and vector search for best results:
memory = Memory(
    files=['documents.pdf'],
    rag_cfg={
        'rag_searchers': ['hybrid_search']
    }
)

# Good for: general queries, balanced precision and recall
front_page_search

Searches document front matter and summaries:
memory = Memory(
    files=['documents.pdf'],
    rag_cfg={
        'rag_searchers': ['front_page_search']
    }
)

# Good for: document overview, metadata, titles

Combining Strategies

memory = Memory(
    files=['documents.pdf'],
    rag_cfg={
        'rag_searchers': [
            'front_page_search',  # Check summaries first
            'keyword_search',      # Then specific terms
            'vector_search'        # Finally semantic search
        ]
    }
)

# Results are combined and ranked
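The exact merging logic lives in qwen_agent's search modules; reciprocal rank fusion, sketched below, is one standard way such ranked lists can be combined (the chunk names are made up for illustration):

```python
def reciprocal_rank_fusion(ranked_lists, k=60):
    # Merge several best-first ranked lists into one.
    # Items that appear near the top of many lists accumulate more score.
    scores = {}
    for results in ranked_lists:
        for rank, item in enumerate(results):
            scores[item] = scores.get(item, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

front_page = ["chunk_A", "chunk_C"]
keyword    = ["chunk_B", "chunk_A"]
vector     = ["chunk_A", "chunk_B", "chunk_D"]
merged = reciprocal_rank_fusion([front_page, keyword, vector])
```

chunk_A ranks first because all three searchers returned it near the top.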

Keyword Generation

Keyword generation enhances queries for better retrieval:
memory = Memory(
    llm={'model': 'qwen-plus'},
    files=['documents.pdf'],
    rag_cfg={
        'rag_keygen_strategy': 'SplitQueryThenGenKeyword'
    }
)

# User query: "How do I configure the system?"
# Enhanced query: {
#   "text": "How do I configure the system?",
#   "keywords": ["configuration", "setup", "settings", "system"]
# }
Source Reference: qwen_agent/memory/memory.py:106-132
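The real strategy prompts the LLM to generate keywords; the stopword-stripping toy below only illustrates the shape of the enhanced query it produces:

```python
STOPWORDS = {"how", "do", "i", "the", "a", "an", "is", "what", "to"}

def naive_keygen(query):
    # Toy stand-in for LLM keyword generation: split the query and
    # keep the content words. The real strategy prompts the LLM instead.
    words = [w.strip("?.,!").lower() for w in query.split()]
    keywords = [w for w in words if w and w not in STOPWORDS]
    return {"text": query, "keywords": keywords}

enhanced = naive_keygen("How do I configure the system?")
```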

Message Management

Message Schema

Memory works with the standard Message format:
from qwen_agent.llm.schema import Message, SYSTEM, USER, ASSISTANT, FUNCTION

# System message (optional)
Message(role=SYSTEM, content='You are a helpful assistant.')

# User message
Message(role=USER, content='What is machine learning?')

# Assistant response
Message(role=ASSISTANT, content='Machine learning is...')

# Function call result
Message(role=FUNCTION, name='tool_name', content='Tool output')
Source Reference: qwen_agent/llm/schema.py:132-164

Context Management

Memory automatically manages context length:
memory = Memory(
    llm={
        'model': 'qwen-plus',
        'generate_cfg': {
            'max_input_tokens': 6000  # Truncates if exceeded
        }
    },
    rag_cfg={
        'max_ref_token': 4000  # Limits retrieved content
    }
)

# If retrieved content + messages > max_input_tokens,
# older messages are automatically truncated
Source Reference: qwen_agent/llm/base.py:602-804
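A rough model of that truncation policy, with token counting stubbed as word counting (the real accounting in qwen_agent/llm/base.py is more involved and tokenizer-based):

```python
def truncate_history(messages, max_input_tokens,
                     count_tokens=lambda m: len(m["content"].split())):
    # Drop the oldest non-system messages until the total fits the budget.
    system = [m for m in messages if m["role"] == "system"]
    history = [m for m in messages if m["role"] != "system"]
    while history and sum(map(count_tokens, system + history)) > max_input_tokens:
        history.pop(0)  # oldest first; the system prompt is preserved
    return system + history

msgs = [
    {"role": "system", "content": "You are a helpful assistant"},
    {"role": "user", "content": "first question about the documents"},
    {"role": "user", "content": "latest question"},
]
kept = truncate_history(msgs, max_input_tokens=9)
```

The oldest user message is dropped while the system prompt and the latest turn survive.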

Retrieval Output Format

Memory returns structured retrieval results:
# JSON format with source attribution
result = [
    {
        "url": "document.pdf",
        "text": [
            "First relevant chunk from the document...",
            "Second relevant chunk...",
            "Third relevant chunk..."
        ]
    },
    {
        "url": "another_doc.txt",
        "text": [
            "Relevant content from another document..."
        ]
    }
]

# Can be easily formatted for display
import json
print(json.dumps(result, indent=2, ensure_ascii=False))

Advanced Usage

Custom Memory Subclass

from qwen_agent.memory import Memory
from qwen_agent.llm.schema import Message, ASSISTANT

class CustomMemory(Memory):
    def _run(self, messages, **kwargs):
        # The parent _run is a generator; drain it and keep the last batch
        responses = []
        for responses in super()._run(messages, **kwargs):
            pass

        # Post-process results
        content = responses[0].content
        enhanced_content = self.enhance_retrieval(content)
        
        yield [Message(role=ASSISTANT, content=enhanced_content, name='memory')]
    
    def enhance_retrieval(self, content):
        # Your custom enhancement logic
        # E.g., reranking, filtering, summarization
        return content

Memory without LLM

# Memory can work without LLM (no keyword generation)
memory = Memory(
    llm=None,  # No LLM
    files=['documents.pdf'],
    rag_cfg={
        'rag_keygen_strategy': 'none',  # Automatically set
        'rag_searchers': ['keyword_search']
    }
)

# Direct retrieval without keyword enhancement
responses = memory.run_nonstream([
    Message(role='user', content='search query')
])
Source Reference: qwen_agent/memory/memory.py:61-63

Built-in Tools

Memory automatically includes retrieval tools:

Retrieval Tool

RAG-based document retrieval:
# Automatically added to Memory
tool = memory.function_map['retrieval']

# Direct usage
result = tool.call({
    'query': 'What is AI?',
    'files': ['ai_book.pdf', 'ml_paper.pdf']
})

DocParser Tool

Document parsing without retrieval:
# Also included in Memory
tool = memory.function_map['doc_parser']

# Parse entire document
result = tool.call({
    'url': 'document.pdf'
})
Source Reference: qwen_agent/memory/memory.py:66-77

Best Practices

Search Strategy Selection

  • Use keyword_search for technical documents with specific terms
  • Use vector_search for conceptual/semantic queries
  • Use hybrid_search for general-purpose retrieval
  • Use front_page_search to quickly get document overviews

Chunk Size Tuning

  • Smaller parser_page_size (300-500) for precise retrieval
  • Larger parser_page_size (800-1000) for context preservation
  • Balance between granularity and coherence
  • Test with your specific document types
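The effect of parser_page_size can be seen with a bare fixed-size character chunker (the real parser also respects page and document structure):

```python
def chunk_text(text, page_size):
    # Split text into fixed-size character chunks, as a rough model of
    # what parser_page_size controls.
    return [text[i:i + page_size] for i in range(0, len(text), page_size)]

doc = "x" * 2000
small = chunk_text(doc, 500)   # more, finer chunks: precise retrieval
large = chunk_text(doc, 1000)  # fewer, coarser chunks: more context each
```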

Token Management

  • Set max_ref_token based on your LLM’s context window
  • Leave room for conversation history
  • Typical: 4000-8000 tokens for retrieval
  • Monitor context usage in long conversations

File Organization

  • Use system_files for persistent knowledge base
  • Pass session-specific files in messages
  • Group related documents together
  • Consider document size and parsing time

Performance Optimization

memory = Memory(
    llm={
        'model': 'qwen-turbo',  # Faster model for keygen
        'generate_cfg': {
            'max_input_tokens': 30000  # Larger context
        }
    },
    files=['documents.pdf'],
    rag_cfg={
        'max_ref_token': 6000,
        'parser_page_size': 800,  # Balanced chunk size
        'rag_searchers': ['hybrid_search'],  # Single efficient searcher
        'rag_keygen_strategy': 'SplitQueryThenGenKeyword'
    }
)

Related pages:
  • Agents - Learn how agents integrate with Memory
  • Tools - Understand the Retrieval and DocParser tools
  • LLM Configuration - Configure LLMs for keyword generation
