Qwen-Agent provides specialized agents for handling long documents like research papers, books, and technical documentation. These agents use parallel processing and retrieval-augmented generation (RAG) for efficient analysis.
ParallelDocQA Agent
The ParallelDocQA agent processes documents in parallel for faster question answering.
Basic Usage
```python
from qwen_agent.agents.doc_qa import ParallelDocQA

def test():
    bot = ParallelDocQA(
        llm={
            'model': 'qwen2.5-72b-instruct',
            'generate_cfg': {
                'max_retries': 10
            }
        }
    )
    messages = [
        {
            'role': 'user',
            'content': [
                {'text': 'Describe the experimental methods'},
                {'file': 'https://arxiv.org/pdf/2310.08560.pdf'}
            ]
        },
    ]
    for rsp in bot.run(messages):
        print('bot response:', rsp)

if __name__ == '__main__':
    test()
```
How It Works
1. Document Chunking: the document is split into manageable chunks (pages or sections).
2. Parallel Processing: multiple chunks are processed simultaneously using parallel queries.
3. RAG Retrieval: relevant chunks are retrieved based on the question.
4. Answer Synthesis: retrieved content is combined to generate a comprehensive answer.
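The four steps above can be sketched in plain Python. This is a hypothetical illustration, not Qwen-Agent's actual implementation: a naive keyword count stands in for real RAG retrieval, and "synthesis" is plain concatenation rather than an LLM call.

```python
from concurrent.futures import ThreadPoolExecutor

def split_into_chunks(text, chunk_size=800):
    """Step 1: split the document into fixed-size chunks."""
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

def score_chunk(chunk, question):
    """Toy relevance score: count question words present in the chunk."""
    return sum(w in chunk.lower() for w in question.lower().split())

def answer(question, document, chunk_size=800, top_k=2, max_workers=4):
    chunks = split_into_chunks(document, chunk_size)
    # Step 2: score chunks in parallel.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        scores = list(pool.map(lambda c: score_chunk(c, question), chunks))
    # Step 3: retrieve the top-k most relevant chunks.
    ranked = sorted(zip(scores, chunks), key=lambda p: p[0], reverse=True)
    retrieved = [c for _, c in ranked[:top_k]]
    # Step 4: synthesize (here, just concatenate the retrieved context).
    return ' '.join(retrieved)
```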
Speed: 2-3x faster than sequential processing for long documents
Accuracy: high accuracy through comprehensive chunk analysis
Memory: efficient memory usage with streaming
Scalability: handles documents of 100+ pages
Web UI Application
Create a user-friendly interface for document QA:
```python
from qwen_agent.agents.doc_qa import ParallelDocQA
from qwen_agent.gui import WebUI

def app_gui():
    # Define the agent
    bot = ParallelDocQA(
        llm={
            'model': 'qwen2.5-72b-instruct',
            'generate_cfg': {
                'max_retries': 10
            }
        },
        description='Parallel QA that answers with RAG-retrieved content. '
                    'Supported file types: PDF/Word/PPT/TXT/HTML. '
                    'Asking in the same language as the material works best.',
    )
    chatbot_config = {
        'prompt.suggestions': [
            {'text': 'Describe the experimental methods'},
            {'text': 'What are the main contributions of this paper?'},
            {'text': 'Summarize Chapter 3'},
        ]
    }
    WebUI(bot, chatbot_config=chatbot_config).run()

if __name__ == '__main__':
    app_gui()
```
Features
Drag-and-drop file upload
Multi-format support: PDF, Word, PowerPoint, TXT, HTML
Streaming responses: see answers as they're generated
Multi-turn conversations: build on previous questions
Suggested prompts: guide users with example questions
PDF: Full Support
Text extraction from all pages
Handles multi-column layouts
Preserves document structure
Supports both scanned and native PDFs (with OCR)

```python
messages = [{
    'role': 'user',
    'content': [
        {'text': 'What is the main conclusion?'},
        {'file': 'paper.pdf'}
    ]
}]
```

Word Documents: Full Support
.doc and .docx formats
Preserves formatting and structure
Handles tables and lists

```python
messages = [{
    'role': 'user',
    'content': [
        {'text': 'Summarize the key points'},
        {'file': 'report.docx'}
    ]
}]
```

PowerPoint: Full Support
.ppt and .pptx formats
Extracts text from slides
Preserves slide order

```python
messages = [{
    'role': 'user',
    'content': [
        {'text': 'What is covered in the presentation?'},
        {'file': 'slides.pptx'}
    ]
}]
```

Text & HTML: Full Support
Plain text files (.txt)
HTML documents (.html, .htm)
Markdown files (.md)

```python
messages = [{
    'role': 'user',
    'content': [
        {'text': 'Extract the main ideas'},
        {'file': 'document.txt'}
    ]
}]
```
Remote Files
Load documents from URLs:
```python
messages = [{
    'role': 'user',
    'content': [
        {'text': 'Explain the transformer architecture'},
        {'file': 'https://arxiv.org/pdf/1706.03762.pdf'}
    ]
}]
for response in bot.run(messages):
    print(response)
```
The agent automatically downloads and processes remote files. Ensure URLs are publicly accessible.
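As a sanity check before sending a message, you might distinguish remote URLs from local paths. This is a small hypothetical helper, not part of Qwen-Agent's API:

```python
from urllib.parse import urlparse

def looks_like_remote_file(ref):
    """Treat a reference as a remote file only when it is an http(s) URL
    with a hostname; anything else is assumed to be a local path."""
    parsed = urlparse(ref)
    return parsed.scheme in ('http', 'https') and bool(parsed.netloc)
```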
Configuration Options
LLM Configuration
```python
import os

bot = ParallelDocQA(
    llm={
        'model': 'qwen2.5-72b-instruct',
        'model_server': 'dashscope',  # or custom URL
        'api_key': os.getenv('DASHSCOPE_API_KEY'),
        'generate_cfg': {
            'max_retries': 10,
            'temperature': 0.1,  # Lower for more factual responses
            'top_p': 0.9,
        }
    }
)
```
Agent Configuration
```python
bot = ParallelDocQA(
    llm=llm_cfg,
    name='Document Assistant',
    description='Expert at analyzing long documents',
    # RAG configuration
    rag_cfg={
        'chunk_size': 800,     # Characters per chunk
        'chunk_overlap': 100,  # Overlap between chunks
        'top_k': 5,            # Number of chunks to retrieve
    },
    # Parallel processing
    max_workers=4,  # Number of parallel workers
)
```
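To make the `chunk_size`/`chunk_overlap` settings concrete, here is a hypothetical sketch of overlapping character-window chunking (the real chunker may split on pages or sections instead of raw characters):

```python
def chunk_with_overlap(text, chunk_size=800, chunk_overlap=100):
    """Split text into windows of chunk_size characters; each window
    starts chunk_overlap characters before the previous one ended, so
    content near a boundary is never lost to a hard cut."""
    if chunk_overlap >= chunk_size:
        raise ValueError('chunk_overlap must be smaller than chunk_size')
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]
```

With the defaults above, each 800-character chunk repeats the last 100 characters of its predecessor.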
Advanced Usage Patterns
Multi-Document Analysis
Analyze multiple documents simultaneously:
```python
messages = [{
    'role': 'user',
    'content': [
        {'text': 'Compare the methodologies in these papers'},
        {'file': 'paper1.pdf'},
        {'file': 'paper2.pdf'},
        {'file': 'paper3.pdf'},
    ]
}]
for response in bot.run(messages):
    print(response)
```
Multi-Turn Conversations
Build on previous questions:
```python
messages = []

# First question
messages.append({
    'role': 'user',
    'content': [
        {'text': 'What is the main contribution?'},
        {'file': 'paper.pdf'}
    ]
})
for response in bot.run(messages):
    print(response)
messages.extend(response)  # keep the final answer in the history

# Follow-up question
messages.append({
    'role': 'user',
    'content': 'How does this compare to prior work?'
})
for response in bot.run(messages):
    print(response)
messages.extend(response)

# Another follow-up
messages.append({
    'role': 'user',
    'content': 'What are the limitations?'
})
for response in bot.run(messages):
    print(response)
```
Specific Section Queries
Ask about specific parts:
```python
queries = [
    'Summarize the introduction',
    'What methodology is used?',
    'What are the key results?',
    'What are the conclusions?',
]
for query in queries:
    messages = [{
        'role': 'user',
        'content': [
            {'text': query},
            {'file': 'paper.pdf'}
        ]
    }]
    print(f"\n### {query}")
    for response in bot.run(messages):
        print(response[-1]['content'])
```
Virtual Memory QA
For extremely large documents, use VirtualMemoryQA:
```python
from qwen_agent.agents import VirtualMemoryQA

bot = VirtualMemoryQA(
    llm={'model': 'qwen2.5-72b-instruct'},
    memory_limit=100000,  # Tokens to keep in memory
)
messages = [{
    'role': 'user',
    'content': [
        {'text': 'Summarize this entire book'},
        {'file': 'large_book.pdf'}  # 500+ pages
    ]
}]
for response in bot.run(messages):
    print(response)
```
Virtual Memory Features
Unlimited Size: handle documents of any size
Smart Caching: keep the most relevant content in memory
Context Management: automatically manage token limits
Incremental Loading: load content as needed
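The caching and incremental-loading ideas above can be sketched as an LRU cache keyed by chunk id with a token budget. This is a toy illustration under assumed semantics, not VirtualMemoryQA's actual internals:

```python
from collections import OrderedDict

class ChunkCache:
    """Keep recently used chunks in memory up to a token budget,
    evicting the least recently used chunks first."""

    def __init__(self, memory_limit):
        self.memory_limit = memory_limit  # token budget
        self.used = 0
        self.chunks = OrderedDict()  # chunk_id -> (tokens, text)

    def get(self, chunk_id, loader):
        if chunk_id in self.chunks:
            self.chunks.move_to_end(chunk_id)  # mark as recently used
            return self.chunks[chunk_id][1]
        text = loader(chunk_id)  # incremental loading on a cache miss
        tokens = len(text.split())
        # Evict LRU entries until the new chunk fits the budget.
        while self.chunks and self.used + tokens > self.memory_limit:
            _, (evicted_tokens, _) = self.chunks.popitem(last=False)
            self.used -= evicted_tokens
        self.chunks[chunk_id] = (tokens, text)
        self.used += tokens
        return text
```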
Best Practices
Query Optimization
Instead of “Tell me about this paper”, ask “What methodology does this paper use to evaluate performance?”
Ask questions in the same language as the document for best results.
Split complex questions into multiple simpler queries for better accuracy.
Mention specific sections or chapters when relevant: “What does Chapter 3 say about…?”
```python
# For faster responses on shorter documents
bot = ParallelDocQA(
    llm={
        'model': 'qwen-turbo',  # Faster model
        'generate_cfg': {
            'temperature': 0.1,
        }
    },
    rag_cfg={
        'chunk_size': 500,  # Smaller chunks
        'top_k': 3,         # Fewer chunks
    },
    max_workers=8,  # More parallelism
)
```
```python
# For better accuracy on complex documents
bot = ParallelDocQA(
    llm={
        'model': 'qwen-max',  # More capable model
        'generate_cfg': {
            'temperature': 0.0,  # More deterministic
        }
    },
    rag_cfg={
        'chunk_size': 1000,  # Larger context
        'top_k': 7,          # More chunks
    },
)
```
Common Use Cases
Academic Research
```python
# Analyze research papers
bot = ParallelDocQA(llm={'model': 'qwen-max'})

questions = [
    'What is the research question?',
    'What methodology is used?',
    'What are the main findings?',
    'What are the limitations?',
    'What future work is suggested?',
]
for question in questions:
    messages = [{'role': 'user', 'content': [
        {'text': question},
        {'file': 'research_paper.pdf'}
    ]}]
    for response in bot.run(messages):
        print(f"{question}: {response[-1]['content']}")
```
Legal Document Analysis
```python
# Review contracts and legal documents
bot = ParallelDocQA(
    llm={'model': 'qwen-max'},
    rag_cfg={'chunk_size': 1200}  # Larger chunks for legal text
)
messages = [{
    'role': 'user',
    'content': [
        {'text': 'Identify key obligations and deadlines'},
        {'file': 'contract.pdf'}
    ]
}]
```
Technical Documentation
```python
# Navigate technical manuals
bot = ParallelDocQA(llm={'model': 'qwen-max'})
messages = [{
    'role': 'user',
    'content': [
        {'text': 'How do I configure SSL certificates?'},
        {'file': 'technical_manual.pdf'}
    ]
}]
```
Troubleshooting
Slow processing:
Reduce max_workers if API rate limits are hit
Use a faster model like qwen-turbo
Reduce chunk_size and top_k
Ensure a good network connection for remote files

Inaccurate or incomplete answers:
Increase top_k to retrieve more chunks
Use a more capable model like qwen-max
Lower temperature to 0.0 for more factual responses
Try rephrasing your question more specifically

File loading failures:
Verify the file format is supported
Check that the file is not corrupted
Ensure remote URLs are accessible
Try converting to a different format

Memory issues:
Use VirtualMemoryQA for very large documents
Reduce chunk_size
Process documents in sections
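For the rate-limit case, a generic exponential-backoff wrapper can help when driving the agent in a loop. This is a sketch (note that `max_retries` in `generate_cfg` already retries inside Qwen-Agent); `RuntimeError` stands in for whatever exception your API client raises on throttling:

```python
import time

def call_with_backoff(fn, max_retries=5, base_delay=1.0, sleep=time.sleep):
    """Retry fn() with exponential backoff: wait base_delay, 2*base_delay,
    4*base_delay, ... between attempts, re-raising after the last one."""
    for attempt in range(max_retries):
        try:
            return fn()
        except RuntimeError:
            if attempt == max_retries - 1:
                raise
            sleep(base_delay * (2 ** attempt))
```

The injectable `sleep` parameter keeps the helper testable without real delays.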
Limitations
Be aware of these limitations:
Scanned PDFs require OCR which may have errors
Very large files (>100MB) may be slow to process
Complex formatting may not be fully preserved
Images and figures are not analyzed (use vision models)
Subject to LLM context window limits
Next Steps
Assistant Demos: combine document QA with other capabilities
Multi-Agent Chat: use multiple agents for collaborative analysis