Overview
Retrieval-Augmented Generation (RAG) enables agents to access and reason over external documents. Qwen-Agent provides built-in RAG capabilities through the Assistant class, supporting multiple document formats including PDF, Word, PowerPoint, TXT, and HTML.
Quick Start
Here’s a simple RAG example (examples/assistant_rag.py:19):
```python
from qwen_agent.agents import Assistant

bot = Assistant(
    llm={'model': 'qwen-plus-latest'}
)

messages = [{
    'role': 'user',
    'content': [
        {'text': 'What does this document say about transformers?'},
        {'file': 'https://arxiv.org/pdf/1706.03762.pdf'}
    ]
}]

for response in bot.run(messages):
    print(response)
```
How RAG Works in Qwen-Agent
The RAG workflow in the Assistant class (qwen_agent/agents/assistant.py:100):
```
1. User Query + Documents
        ↓
2. Document Parsing & Chunking
        ↓
3. Retrieval (find relevant chunks)
        ↓
4. Knowledge Formatting
        ↓
5. LLM Generation with Context
        ↓
6. Response
```
Basic Usage
Single Document Q&A
```python
from qwen_agent.agents import Assistant

bot = Assistant(llm={'model': 'qwen-max'})

# Query with a PDF attached
messages = [{
    'role': 'user',
    'content': [
        {'text': 'Summarize the key points'},
        {'file': 'document.pdf'}
    ]
}]

for response in bot.run(messages):
    print(response)
```
Multiple Document Q&A
```python
messages = [{
    'role': 'user',
    'content': [
        {'text': 'Compare the findings in these two papers'},
        {'file': 'paper1.pdf'},
        {'file': 'paper2.pdf'}
    ]
}]

for response in bot.run(messages):
    print(response)
```
Pre-loaded Knowledge Base
Load documents when initializing the agent:
```python
bot = Assistant(
    llm={'model': 'qwen-max'},
    files=[
        'docs/manual.pdf',
        'docs/faq.html',
        'docs/guide.txt'
    ]
)

# Documents are available for all queries
messages = [{'role': 'user', 'content': 'What does the manual say about installation?'}]

for response in bot.run(messages):
    print(response)
```
Supported Document Formats
The Assistant automatically handles multiple formats:
| Format | Extension | Notes |
|--------|-----------|-------|
| PDF | .pdf | Text extraction with layout preservation |
| Word | .docx, .doc | Full text and table extraction |
| PowerPoint | .pptx, .ppt | Slide text extraction |
| HTML | .html, .htm | Web page content |
| Text | .txt, .md | Plain text documents |
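As an illustration of how format handling might be organized, here is a hypothetical extension-to-parser dispatch. The parser names and the `pick_parser` helper are assumptions for the sketch; Qwen-Agent selects parsers internally and does not expose this mapping.

```python
from pathlib import Path

# Hypothetical mapping from file extension to a parser name.
PARSERS = {
    '.pdf': 'pdf_parser',
    '.docx': 'word_parser', '.doc': 'word_parser',
    '.pptx': 'ppt_parser', '.ppt': 'ppt_parser',
    '.html': 'html_parser', '.htm': 'html_parser',
    '.txt': 'text_parser', '.md': 'text_parser',
}

def pick_parser(filename: str) -> str:
    """Return the parser name for a file, matching extensions case-insensitively."""
    ext = Path(filename).suffix.lower()
    try:
        return PARSERS[ext]
    except KeyError:
        raise ValueError(f'Unsupported format: {ext or filename}')
```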
Remote Files
Remote documents can be referenced by URL:

```python
messages = [{
    'role': 'user',
    'content': [
        {'text': 'Summarize this paper'},
        {'file': 'https://arxiv.org/pdf/2307.09288.pdf'}  # Remote URL
    ]
}]
```
Knowledge Formatting
The Assistant formats retrieved knowledge into structured prompts (qwen_agent/agents/assistant.py:27):
```
# Knowledge Base

## The content from [document.pdf]:

[Retrieved content chunk 1]

## The content from [paper.pdf]:

[Retrieved content chunk 2]
```

When the conversation is in Chinese, the same template is rendered in Chinese:

```
# 知识库

## 来自 [document.pdf] 的内容:

[检索到的内容片段 1]
```

The formatted knowledge is prepended to the system message before LLM generation; the template language is detected automatically from the conversation.
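A minimal sketch of this formatting step, following the template shown above (the `format_knowledge` function and its `chunks` argument shape are illustrative, not Qwen-Agent's internal code):

```python
def format_knowledge(chunks: dict[str, list[str]]) -> str:
    """Render retrieved chunks into a knowledge-base prompt.

    `chunks` maps a source filename to its retrieved text snippets.
    """
    lines = ['# Knowledge Base']
    for source, snippets in chunks.items():
        lines.append(f'## The content from [{source}]:')
        lines.extend(snippets)
    return '\n'.join(lines)
```

The resulting string would then be prepended to the system message before the LLM call.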
Advanced Configuration
RAG Configuration Options
Customize RAG behavior with the `rag_cfg` parameter:
```python
bot = Assistant(
    llm={'model': 'qwen-max'},
    files=['knowledge_base.pdf'],
    rag_cfg={
        'chunk_size': 500,            # Characters per chunk
        'chunk_overlap': 50,          # Overlap between chunks
        'top_k': 5,                   # Number of chunks to retrieve
        'similarity_threshold': 0.7   # Minimum similarity score
    }
)
```
Custom Knowledge Injection
Manually provide knowledge instead of automatic retrieval (qwen_agent/agents/assistant.py:103):
```python
bot = Assistant(llm={'model': 'qwen-max'})

# Provide pre-retrieved knowledge
external_knowledge = """
Key findings from research:
1. Model architecture uses attention mechanisms
2. Training requires large-scale datasets
3. Performance scales with model size
"""

messages = [{'role': 'user', 'content': 'Explain the key findings'}]

for response in bot.run(messages, knowledge=external_knowledge):
    print(response)
```
Document Parsing Details
The document parser extracts text and maintains structure:
Parser Features
- Layout Preservation: Maintains document structure
- Table Extraction: Converts tables to text
- Image Handling: Can reference image locations
- Metadata: Extracts page numbers and sections
Chunking Strategy
Documents are split into overlapping chunks:
```
Document: "This is a long document about AI..."

Chunk 1: "This is a long document about AI..."
Chunk 2: "...about AI and machine learning..."
Chunk 3: "...machine learning applications..."
```

The overlap between consecutive chunks preserves context across chunk boundaries.
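A minimal sketch of overlapping character-based chunking, using the same `chunk_size`/`overlap` parameters as the `rag_cfg` example above (the function itself is illustrative, not Qwen-Agent's implementation):

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into chunks of chunk_size characters, where each chunk
    shares its first `overlap` characters with the end of the previous one."""
    if overlap >= chunk_size:
        raise ValueError('overlap must be smaller than chunk_size')
    step = chunk_size - overlap
    return [text[i:i + chunk_size]
            for i in range(0, max(len(text) - overlap, 1), step)]
```

For example, with `chunk_size=5` and `overlap=2`, the string `'abcdefghij'` splits into `'abcde'`, `'defgh'`, `'ghij'`, with each boundary repeated across two chunks.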
Retrieval Strategies
Default: Semantic Search
The Assistant uses embedding-based semantic search:
```python
# Automatic semantic retrieval
bot = Assistant(
    llm={'model': 'qwen-max'},
    files=['database.pdf']
)

messages = [{
    'role': 'user',
    'content': 'What are the best practices?'  # Finds semantically similar content
}]
```
Parallel Document QA
For processing multiple documents in parallel:
```python
from qwen_agent.agents import ParallelDocQA

bot = ParallelDocQA(
    llm={'model': 'qwen-max'}
)

messages = [{
    'role': 'user',
    'content': [
        {'text': 'Compare architectures'},
        {'file': 'paper1.pdf'},
        {'file': 'paper2.pdf'},
        {'file': 'paper3.pdf'}
    ]
}]

# Processes each document in parallel for faster results
for response in bot.run(messages):
    print(response)
```
Complete Example: Document Assistant
Here’s a complete RAG application with GUI (examples/assistant_rag.py):
```python
from qwen_agent.agents import Assistant
from qwen_agent.gui import WebUI


def app_gui():
    # Configure the assistant
    bot = Assistant(
        llm={'model': 'qwen-plus-latest'},
        name='Assistant',
        description='Use RAG to answer questions. Supports: PDF/Word/PPT/TXT/HTML.'
    )
    # Configure the UI with prompt suggestions
    chatbot_config = {
        'prompt.suggestions': [
            {'text': 'Introduce figure one'},
            {'text': 'What does the second chapter say?'},
        ]
    }
    WebUI(bot, chatbot_config=chatbot_config).run()


if __name__ == '__main__':
    app_gui()
```
Combining RAG with Tools
Combine RAG with code execution and other tools:
```python
from qwen_agent.agents import Assistant

bot = Assistant(
    llm={'model': 'qwen-max'},
    function_list=['code_interpreter'],  # Add code execution
    files=['data_analysis_guide.pdf']    # Knowledge base
)

messages = [{
    'role': 'user',
    'content': [
        {'text': 'Based on the guide, analyze this CSV file'},
        {'file': 'data.csv'}
    ]
}]

# The agent can:
# 1. Retrieve relevant info from the guide
# 2. Execute code to analyze the CSV
# 3. Combine knowledge and analysis in its response
for response in bot.run(messages):
    print(response)
```
Multi-Turn Conversations
RAG works across conversation turns:
```python
bot = Assistant(
    llm={'model': 'qwen-max'},
    files=['technical_manual.pdf']
)

messages = []

# Turn 1
messages.append({'role': 'user', 'content': 'What is the installation process?'})
for response in bot.run(messages):
    print(response)
messages.extend(response)

# Turn 2 - continues with the same knowledge base
messages.append({'role': 'user', 'content': 'What about troubleshooting?'})
for response in bot.run(messages):
    print(response)
messages.extend(response)

# Turn 3
messages.append({'role': 'user', 'content': 'Can you show me an example?'})
for response in bot.run(messages):
    print(response)
```
Best Practices
RAG Optimization Tips
- Use specific queries for better retrieval accuracy
- Pre-load frequently accessed documents via the `files` parameter
- Adjust `chunk_size` to the document structure (larger for technical docs)
- Use `top_k` to balance context length against relevance
- Combine with tools (code execution, search, etc.) for enhanced capabilities
Common Pitfalls
- Very large documents may require chunking optimization
- Complex layouts (multi-column PDFs) may affect extraction quality
- Ensure document URLs are accessible if using remote files
- Retrieved context counts toward LLM token limits
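The last pitfall can be estimated with simple arithmetic. The sketch below assumes roughly 4 characters per token, a common rule of thumb rather than a property of any particular tokenizer, and the `context_token_estimate` helper is hypothetical:

```python
import math

def context_token_estimate(chunk_size: int, top_k: int,
                           chars_per_token: float = 4.0) -> int:
    """Rough upper bound on tokens consumed by retrieved context,
    assuming ~4 characters per token (a rule of thumb, not a tokenizer)."""
    return math.ceil(chunk_size * top_k / chars_per_token)
```

With the earlier defaults (`chunk_size=500`, `top_k=5`), retrieved context alone costs roughly 625 tokens before counting the query, history, or system prompt.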
Troubleshooting
Document Parsing Errors
```python
# Handle parsing errors gracefully
try:
    messages = [{'role': 'user', 'content': [{'file': 'corrupted.pdf'}]}]
    for response in bot.run(messages):
        print(response)
except Exception as e:
    print(f"Document parsing failed: {e}")
```
No Relevant Content Retrieved
```python
# Adjust retrieval parameters
bot = Assistant(
    llm={'model': 'qwen-max'},
    files=['docs.pdf'],
    rag_cfg={
        'top_k': 10,                  # Retrieve more chunks
        'similarity_threshold': 0.5   # Lower threshold
    }
)
```

For large document collections, use smaller chunks:

```python
bot = Assistant(
    llm={'model': 'qwen-max'},
    files=['large_doc.pdf'],
    rag_cfg={
        'chunk_size': 300,  # Smaller chunks
        'top_k': 3          # Fewer retrieved chunks
    }
)
```
Next Steps