Overview

Retrieval-Augmented Generation (RAG) enables agents to access and reason over external documents. Qwen-Agent provides built-in RAG capabilities through the Assistant class, supporting multiple document formats including PDF, Word, PowerPoint, TXT, and HTML.

Quick Start

Here’s a simple RAG example (examples/assistant_rag.py:19):
from qwen_agent.agents import Assistant

bot = Assistant(
    llm={'model': 'qwen-plus-latest'}
)

messages = [{
    'role': 'user',
    'content': [
        {'text': 'What does this document say about transformers?'},
        {'file': 'https://arxiv.org/pdf/1706.03762.pdf'}
    ]
}]

for response in bot.run(messages):
    print(response)

How RAG Works in Qwen-Agent

The RAG workflow in the Assistant class (qwen_agent/agents/assistant.py:100):
1. User Query + Documents

2. Document Parsing & Chunking

3. Retrieval (find relevant chunks)

4. Knowledge Formatting

5. LLM Generation with Context

6. Response
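The steps above can be sketched end-to-end as a toy pipeline. All helper names here are illustrative stand-ins, not Qwen-Agent's actual internals: parsing is reduced to a pass-through, retrieval to word overlap, and the final LLM call is omitted.

```python
def parse_document(doc: str) -> str:
    # Stand-in for real PDF/HTML parsing: treat input as plain text
    return doc

def split_into_chunks(text: str, size: int = 40, overlap: int = 10) -> list:
    # Sliding window with overlap between consecutive chunks
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

def retrieve(query: str, chunks: list, top_k: int = 2) -> list:
    # Toy lexical scoring: rank chunks by words shared with the query
    q = set(query.lower().split())
    return sorted(chunks, key=lambda c: -len(q & set(c.lower().split())))[:top_k]

def format_knowledge(chunks: list) -> str:
    return "# Knowledge Base\n\n" + "\n\n".join(chunks)

def rag_pipeline(query: str, documents: list) -> str:
    chunks = []
    for doc in documents:                     # steps 1-2: parse and chunk
        chunks.extend(split_into_chunks(parse_document(doc)))
    relevant = retrieve(query, chunks)        # step 3: retrieval
    prompt = format_knowledge(relevant) + "\n\nUser: " + query  # step 4
    return prompt  # steps 5-6: this prompt would go to the LLM
```

The key structural point is that retrieval narrows the document set down to a handful of chunks before generation, so the LLM only ever sees a bounded amount of context.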

Basic Usage

Single Document Q&A

from qwen_agent.agents import Assistant

bot = Assistant(llm={'model': 'qwen-max'})

# Query with PDF
messages = [{
    'role': 'user',
    'content': [
        {'text': 'Summarize the key points'},
        {'file': 'document.pdf'}
    ]
}]

for response in bot.run(messages):
    print(response)

Multiple Document Q&A

messages = [{
    'role': 'user',
    'content': [
        {'text': 'Compare the findings in these two papers'},
        {'file': 'paper1.pdf'},
        {'file': 'paper2.pdf'}
    ]
}]

for response in bot.run(messages):
    print(response)

Pre-loaded Knowledge Base

Load documents when initializing the agent:
bot = Assistant(
    llm={'model': 'qwen-max'},
    files=[
        'docs/manual.pdf',
        'docs/faq.html',
        'docs/guide.txt'
    ]
)

# Documents are available for all queries
messages = [{'role': 'user', 'content': 'What does the manual say about installation?'}]
for response in bot.run(messages):
    print(response)

Supported File Formats

The Assistant automatically handles multiple formats:
| Format     | Extension    | Notes                                    |
|------------|--------------|------------------------------------------|
| PDF        | .pdf         | Text extraction with layout preservation |
| Word       | .docx, .doc  | Full text and table extraction           |
| PowerPoint | .pptx, .ppt  | Slide text extraction                    |
| HTML       | .html, .htm  | Web page content                         |
| Text       | .txt, .md    | Plain text documents                     |

Remote Files

Support for URLs:
messages = [{
    'role': 'user',
    'content': [
        {'text': 'Summarize this paper'},
        {'file': 'https://arxiv.org/pdf/2307.09288.pdf'}  # Remote URL
    ]
}]

Knowledge Formatting

The Assistant formats retrieved knowledge into structured prompts (qwen_agent/agents/assistant.py:27):

English Format

# Knowledge Base

## The content from [document.pdf]:

[Retrieved content chunk 1]

## The content from [paper.pdf]:

[Retrieved content chunk 2]

Chinese Format

# 知识库

## 来自 [document.pdf] 的内容:

[检索到的内容片段 1]

The knowledge is automatically formatted and prepended to the system message before LLM generation. The template language (English or Chinese) is auto-detected from the conversation.
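A sketch of how such a knowledge prompt could be assembled. The template strings mirror the formats shown above, but the function itself is illustrative, not the library's internal code:

```python
# Section templates mirroring the English and Chinese formats above
KNOWLEDGE_TEMPLATES = {
    'en': ('# Knowledge Base', '## The content from [{source}]:'),
    'zh': ('# 知识库', '## 来自 [{source}] 的内容:'),
}

def format_knowledge(chunks, lang='en'):
    """chunks: list of (source_filename, retrieved_text) pairs."""
    header, section = KNOWLEDGE_TEMPLATES[lang]
    parts = [header]
    for source, text in chunks:
        parts.append(section.format(source=source) + '\n\n' + text)
    return '\n\n'.join(parts)

print(format_knowledge([('document.pdf', 'Attention is all you need.')]))
```

Each retrieved chunk keeps its source filename in the section header, which lets the model attribute answers to specific documents.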

Advanced Configuration

RAG Configuration Options

Customize RAG behavior with rag_cfg parameter:
bot = Assistant(
    llm={'model': 'qwen-max'},
    files=['knowledge_base.pdf'],
    rag_cfg={
        'chunk_size': 500,         # Characters per chunk
        'chunk_overlap': 50,       # Overlap between chunks
        'top_k': 5,                # Number of chunks to retrieve
        'similarity_threshold': 0.7 # Minimum similarity score
    }
)

Custom Knowledge Injection

Manually provide knowledge instead of automatic retrieval (qwen_agent/agents/assistant.py:103):
bot = Assistant(llm={'model': 'qwen-max'})

# Provide pre-retrieved knowledge
external_knowledge = """
Key findings from research:
1. Model architecture uses attention mechanisms
2. Training requires large-scale datasets
3. Performance scales with model size
"""

messages = [{'role': 'user', 'content': 'Explain the key findings'}]

for response in bot.run(messages, knowledge=external_knowledge):
    print(response)

Document Parsing Details

The document parser extracts text and maintains structure:

Parser Features

  • Layout Preservation: Maintains document structure
  • Table Extraction: Converts tables to text
  • Image Handling: Can reference image locations
  • Metadata: Extracts page numbers and sections

Chunking Strategy

Documents are split into overlapping chunks:
# Example chunking (simplified), with overlap between consecutive chunks
Document: "This is a long document about AI and machine learning applications..."

Chunk 1: "This is a long document about AI and machine..."
Chunk 2: "...about AI and machine learning applications..."

# Overlap ensures context continuity across chunk boundaries
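A minimal character-level chunker in the spirit of the `chunk_size`/`chunk_overlap` parameters shown earlier. This is a simplified sketch, not the parser's actual splitting logic:

```python
def chunk_text(text: str, chunk_size: int = 500, chunk_overlap: int = 50) -> list:
    """Split text into overlapping character windows."""
    if chunk_overlap >= chunk_size:
        raise ValueError('chunk_overlap must be smaller than chunk_size')
    step = chunk_size - chunk_overlap
    chunks = [text[i:i + chunk_size] for i in range(0, len(text), step)]
    # Drop a trailing fragment that is fully contained in the previous chunk
    if len(chunks) > 1 and len(chunks[-1]) <= chunk_overlap:
        chunks.pop()
    return chunks

doc = 'x' * 1200
print([len(c) for c in chunk_text(doc)])  # [500, 500, 300]
```

Real splitters usually also prefer to break at sentence or paragraph boundaries rather than at fixed character offsets, which preserves more semantic coherence per chunk.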

Retrieval Strategies

The Assistant uses embedding-based semantic search:
# Automatic semantic retrieval
bot = Assistant(
    llm={'model': 'qwen-max'},
    files=['database.pdf']
)

messages = [{
    'role': 'user',
    'content': 'What are the best practices?'  # Finds semantically similar content
}]
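Conceptually, embedding-based retrieval embeds the query and every chunk into vectors, then ranks chunks by cosine similarity. The toy bag-of-letters "embedding" below only illustrates the ranking mechanics; a real system uses a learned embedding model, and none of these names are Qwen-Agent internals:

```python
import math

def cosine(a, b):
    # Cosine similarity between two equal-length vectors
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def embed(text):
    # Toy letter-frequency vector; a real system uses a learned model
    vec = [0.0] * 26
    for ch in text.lower():
        if 'a' <= ch <= 'z':
            vec[ord(ch) - ord('a')] += 1.0
    return vec

def retrieve(query, chunks, top_k=2, threshold=0.0):
    scored = [(cosine(embed(query), embed(c)), c) for c in chunks]
    scored = [s for s in scored if s[0] >= threshold]
    return [c for _, c in sorted(scored, key=lambda s: -s[0])[:top_k]]
```

The `top_k` and `threshold` parameters play the same role as the `top_k` and `similarity_threshold` keys in `rag_cfg`: one caps how many chunks reach the prompt, the other filters out weak matches entirely.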

Parallel Document QA

For processing multiple documents in parallel:
from qwen_agent.agents import ParallelDocQA

bot = ParallelDocQA(
    llm={'model': 'qwen-max'}
)

messages = [{
    'role': 'user',
    'content': [
        {'text': 'Compare architectures'},
        {'file': 'paper1.pdf'},
        {'file': 'paper2.pdf'},
        {'file': 'paper3.pdf'}
    ]
}]

# Processes each document in parallel for faster results
for response in bot.run(messages):
    print(response)

Complete Example: Document Assistant

Here’s a complete RAG application with GUI (examples/assistant_rag.py):
from qwen_agent.agents import Assistant
from qwen_agent.gui import WebUI

def app_gui():
    # Configure the assistant
    bot = Assistant(
        llm={'model': 'qwen-plus-latest'},
        name='Assistant',
        description='Use RAG to answer questions. Supports: PDF/Word/PPT/TXT/HTML.'
    )
    
    # Configure UI with suggestions
    chatbot_config = {
        'prompt.suggestions': [
            {'text': 'Introduce figure one'},
            {'text': 'What does the second chapter say?'},
        ]
    }
    
    WebUI(bot, chatbot_config=chatbot_config).run()

if __name__ == '__main__':
    app_gui()

RAG with Tools

Combine RAG with code execution and other tools:
from qwen_agent.agents import Assistant

bot = Assistant(
    llm={'model': 'qwen-max'},
    function_list=['code_interpreter'],  # Add code execution
    files=['data_analysis_guide.pdf']    # Knowledge base
)

messages = [{
    'role': 'user',
    'content': [
        {'text': 'Based on the guide, analyze this CSV file'},
        {'file': 'data.csv'}
    ]
}]

# Agent can:
# 1. Retrieve relevant info from the guide
# 2. Execute code to analyze the CSV
# 3. Combine knowledge and analysis in response
for response in bot.run(messages):
    print(response)

Multi-Turn Conversations

RAG works across conversation turns:
bot = Assistant(
    llm={'model': 'qwen-max'},
    files=['technical_manual.pdf']
)

messages = []

# Turn 1
messages.append({'role': 'user', 'content': 'What is the installation process?'})
for response in bot.run(messages):
    print(response)
messages.extend(response)  # after the loop, response holds the final full reply

# Turn 2 - continues with same knowledge
messages.append({'role': 'user', 'content': 'What about troubleshooting?'})
for response in bot.run(messages):
    print(response)
messages.extend(response)  # append the reply so turn 3 keeps the context

# Turn 3
messages.append({'role': 'user', 'content': 'Can you show me an example?'})
for response in bot.run(messages):
    print(response)

Best Practices

RAG Optimization Tips
  • Use specific queries for better retrieval accuracy
  • Pre-load frequently accessed documents via files parameter
  • Adjust chunk_size based on document structure (larger for technical docs)
  • Use top_k to control context length vs. relevance trade-off
  • Combine with tools for enhanced capabilities (code, search, etc.)
Common Pitfalls
  • Very large documents may require chunking optimization
  • Complex layouts (multi-column PDFs) may affect extraction quality
  • Ensure document URLs are accessible if using remote files
  • Retrieved context counts toward LLM token limits

Troubleshooting

Document Parsing Errors

# Handle parsing errors gracefully
try:
    messages = [{'role': 'user', 'content': [{'file': 'corrupted.pdf'}]}]
    for response in bot.run(messages):
        print(response)
except Exception as e:
    print(f"Document parsing failed: {e}")

No Relevant Content Retrieved

# Adjust retrieval parameters
bot = Assistant(
    llm={'model': 'qwen-max'},
    files=['docs.pdf'],
    rag_cfg={
        'top_k': 10,  # Retrieve more chunks
        'similarity_threshold': 0.5  # Lower threshold
    }
)

Memory and Performance

# For large document collections, use smaller chunks
bot = Assistant(
    llm={'model': 'qwen-max'},
    files=['large_doc.pdf'],
    rag_cfg={
        'chunk_size': 300,  # Smaller chunks
        'top_k': 3          # Fewer retrieved chunks
    }
)
