Qwen-Agent provides specialized agents for handling long documents like research papers, books, and technical documentation. These agents use parallel processing and retrieval-augmented generation (RAG) for efficient analysis.

ParallelDocQA Agent

The ParallelDocQA agent processes documents in parallel for faster question answering.

Basic Usage

parallel_doc_qa.py
from qwen_agent.agents.doc_qa import ParallelDocQA
from qwen_agent.gui import WebUI

def test():
    bot = ParallelDocQA(
        llm={
            'model': 'qwen2.5-72b-instruct',
            'generate_cfg': {
                'max_retries': 10
            }
        }
    )
    
    messages = [
        {
            'role': 'user',
            'content': [
                {'text': 'Describe the experimental methods'},
                {'file': 'https://arxiv.org/pdf/2310.08560.pdf'}
            ]
        },
    ]
    
    for rsp in bot.run(messages):
        print('bot response:', rsp)

if __name__ == '__main__':
    test()

How It Works

1. Document Chunking: the document is split into manageable chunks (pages or sections).
2. Parallel Processing: multiple chunks are processed simultaneously using parallel queries.
3. RAG Retrieval: the chunks relevant to the question are retrieved.
4. Answer Synthesis: the retrieved content is combined into a comprehensive answer.
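
The steps above can be sketched as a small chunk-map-reduce pipeline. This is purely illustrative and not the ParallelDocQA implementation: chunk_text, score_chunk, and answer_chunk are hypothetical helpers, with keyword overlap standing in for real embedding retrieval and a placeholder string standing in for the LLM call.

```python
from concurrent.futures import ThreadPoolExecutor

def chunk_text(text, chunk_size=800, overlap=100):
    """Step 1: split the document into overlapping chunks."""
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks

def score_chunk(chunk, question):
    """Step 3 (toy version): relevance = words shared with the question."""
    return len(set(chunk.lower().split()) & set(question.lower().split()))

def answer_chunk(chunk, question):
    """Stand-in for one LLM call; step 2 runs many of these at once."""
    return f'[summary of {len(chunk)} chars relevant to: {question}]'

def parallel_doc_qa(text, question, top_k=5, max_workers=4):
    chunks = chunk_text(text)                                  # 1. chunking
    ranked = sorted(chunks, key=lambda c: score_chunk(c, question),
                    reverse=True)
    relevant = ranked[:top_k]                                  # 3. retrieval
    with ThreadPoolExecutor(max_workers=max_workers) as pool:  # 2. parallel map
        partials = pool.map(lambda c: answer_chunk(c, question), relevant)
    return '\n'.join(partials)                                 # 4. synthesis
```

In the real agent the retrieval step uses proper RAG scoring and the per-chunk answers come from concurrent model queries, but the overall shape is the same.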

Performance Characteristics

  • Speed: 2-3x faster than sequential processing for long documents
  • Accuracy: high accuracy through comprehensive chunk analysis
  • Memory: efficient memory usage via streaming
  • Scalability: handles documents of 100+ pages
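
The quoted speed-up comes from overlapping I/O-bound LLM calls instead of running them one after another. The toy benchmark below simulates 50 ms calls with time.sleep (no real model involved) to show the effect:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_llm_call(chunk):
    """Simulate one network-bound LLM query (~50 ms of latency)."""
    time.sleep(0.05)
    return f'answer for {chunk}'

chunks = [f'chunk-{i}' for i in range(8)]

# Sequential: total latency is the sum of all calls.
t0 = time.perf_counter()
sequential = [fake_llm_call(c) for c in chunks]
t_seq = time.perf_counter() - t0

# Parallel: calls overlap, so total latency shrinks with worker count.
t0 = time.perf_counter()
with ThreadPoolExecutor(max_workers=4) as pool:
    parallel = list(pool.map(fake_llm_call, chunks))
t_par = time.perf_counter() - t0

print(f'sequential {t_seq:.2f}s vs parallel {t_par:.2f}s')
```

With real API calls the gain depends on rate limits and network latency, which is why the exact factor varies by document and provider.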

Web UI Application

Create a user-friendly interface for document QA:
from qwen_agent.agents.doc_qa import ParallelDocQA
from qwen_agent.gui import WebUI

def app_gui():
    # Define the agent
    bot = ParallelDocQA(
        llm={
            'model': 'qwen2.5-72b-instruct',
            'generate_cfg': {
                'max_retries': 10
            }
        },
        description='Runs QA over chunks in parallel, then retrieves the relevant content with RAG to answer. Supported file types: PDF/Word/PPT/TXT/HTML. Asking in the same language as the material works best.',
    )

    chatbot_config = {
        'prompt.suggestions': [
            {'text': 'Describe the experimental methods'},
            {'text': 'What are the main contributions of this paper?'},
            {'text': 'Summarize Chapter 3'},
        ]
    }

    WebUI(bot, chatbot_config=chatbot_config).run()

if __name__ == '__main__':
    app_gui()

Features

  • Drag-and-drop file upload
  • Multi-format support: PDF, Word, PowerPoint, TXT, HTML
  • Streaming responses: See answers as they’re generated
  • Multi-turn conversations: Build on previous questions
  • Suggested prompts: Guide users with example questions

Supported Document Formats

PDF (Full Support)
  • Text extraction from all pages
  • Handles multi-column layouts
  • Preserves document structure
  • Supports both scanned and native PDFs (with OCR)

Local Files

Load documents from local paths:
messages = [{
    'role': 'user',
    'content': [
        {'text': 'What is the main conclusion?'},
        {'file': 'paper.pdf'}
    ]
}]

Remote Files

Load documents from URLs:
messages = [{
    'role': 'user',
    'content': [
        {'text': 'Explain the transformer architecture'},
        {'file': 'https://arxiv.org/pdf/1706.03762.pdf'}
    ]
}]

for response in bot.run(messages):
    print(response)
The agent automatically downloads and processes remote files. Ensure URLs are publicly accessible.
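
A quick reachability check before calling the agent can save a long failed run. url_is_accessible below is an assumed helper, not part of qwen_agent:

```python
import urllib.request
from urllib.error import URLError

def url_is_accessible(url, timeout=10):
    """Return True if the document URL can actually be opened."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            status = getattr(resp, 'status', None)
            # Non-HTTP schemes (e.g. file://) report no status code.
            return status is None or status < 400
    except (URLError, ValueError):
        return False
```

For example, call url_is_accessible('https://arxiv.org/pdf/1706.03762.pdf') and raise or fall back to a local copy before building the messages list.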

Configuration Options

LLM Configuration

bot = ParallelDocQA(
    llm={
        'model': 'qwen2.5-72b-instruct',
        'model_server': 'dashscope',  # or custom URL
        'api_key': os.getenv('DASHSCOPE_API_KEY'),
        'generate_cfg': {
            'max_retries': 10,
            'temperature': 0.1,  # Lower for more factual responses
            'top_p': 0.9,
        }
    }
)

Agent Configuration

bot = ParallelDocQA(
    llm=llm_cfg,
    name='Document Assistant',
    description='Expert at analyzing long documents',
    
    # RAG Configuration
    rag_cfg={
        'chunk_size': 800,  # Characters per chunk
        'chunk_overlap': 100,  # Overlap between chunks
        'top_k': 5,  # Number of chunks to retrieve
    },
    
    # Parallel processing
    max_workers=4,  # Number of parallel workers
)
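
With the rag_cfg above, each new chunk advances by chunk_size - chunk_overlap characters, so the chunk count for a document is easy to estimate. This assumes a plain sliding-window splitter; the library's actual chunking strategy may differ:

```python
import math

def num_chunks(doc_len, chunk_size=800, chunk_overlap=100):
    """Estimated chunk count for a doc_len-character document,
    assuming each chunk advances by chunk_size - chunk_overlap."""
    if doc_len <= chunk_size:
        return 1
    stride = chunk_size - chunk_overlap
    return math.ceil((doc_len - chunk_size) / stride) + 1

# A 40,000-character paper with the defaults above yields 57 chunks.
print(num_chunks(40_000))  # 57
```

This is useful when tuning top_k: retrieving 5 of 57 chunks covers far less of the document than 5 of 12, so shorter documents can often use a smaller top_k.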

Advanced Usage Patterns

Multi-Document Analysis

Analyze multiple documents simultaneously:
messages = [{
    'role': 'user',
    'content': [
        {'text': 'Compare the methodologies in these papers'},
        {'file': 'paper1.pdf'},
        {'file': 'paper2.pdf'},
        {'file': 'paper3.pdf'},
    ]
}]

for response in bot.run(messages):
    print(response)

Multi-Turn Conversations

Build on previous questions:
messages = []

# First question
messages.append({
    'role': 'user',
    'content': [
        {'text': 'What is the main contribution?'},
        {'file': 'paper.pdf'}
    ]
})

for response in bot.run(messages):
    print(response)
messages.extend(response)  # append the agent's reply to the conversation history

# Follow-up question
messages.append({
    'role': 'user',
    'content': 'How does this compare to prior work?'
})

for response in bot.run(messages):
    print(response)
messages.extend(response)  # append the agent's reply to the conversation history

# Another follow-up
messages.append({
    'role': 'user',
    'content': 'What are the limitations?'
})

for response in bot.run(messages):
    print(response)

Specific Section Queries

Ask about specific parts:
queries = [
    'Summarize the introduction',
    'What methodology is used?',
    'What are the key results?',
    'What are the conclusions?',
]

for query in queries:
    messages = [{
        'role': 'user',
        'content': [
            {'text': query},
            {'file': 'paper.pdf'}
        ]
    }]
    
    print(f"\n### {query}")
    for response in bot.run(messages):
        print(response[-1]['content'])

Virtual Memory QA

For extremely large documents, use VirtualMemoryQA:
from qwen_agent.agents import VirtualMemoryQA

bot = VirtualMemoryQA(
    llm={'model': 'qwen2.5-72b-instruct'},
    memory_limit=100000,  # Tokens to keep in memory
)

messages = [{
    'role': 'user',
    'content': [
        {'text': 'Summarize this entire book'},
        {'file': 'large_book.pdf'}  # 500+ pages
    ]
}]

for response in bot.run(messages):
    print(response)

Virtual Memory Features

  • Unlimited Size: handles documents far larger than the model's context window
  • Smart Caching: keeps the most relevant content in memory
  • Context Management: automatically manages token limits
  • Incremental Loading: loads content as needed
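
The "smart caching" behavior can be pictured as a token-budget LRU cache: recently used chunks stay resident, and the least recently used ones are evicted once the budget is exceeded. ChunkCache below is a toy illustration, not the VirtualMemoryQA implementation:

```python
from collections import OrderedDict

class ChunkCache:
    """Token-budget LRU cache for document chunks (illustrative only)."""

    def __init__(self, memory_limit=100000):
        self.memory_limit = memory_limit
        self._chunks = OrderedDict()   # chunk_id -> (text, token_count)
        self._used = 0

    def add(self, chunk_id, text, token_count):
        if chunk_id in self._chunks:
            self._used -= self._chunks.pop(chunk_id)[1]
        self._chunks[chunk_id] = (text, token_count)
        self._used += token_count
        # Evict least recently used chunks until we fit the budget,
        # always keeping at least the newest chunk.
        while self._used > self.memory_limit and len(self._chunks) > 1:
            _, (_, evicted_tokens) = self._chunks.popitem(last=False)
            self._used -= evicted_tokens

    def get(self, chunk_id):
        if chunk_id not in self._chunks:
            return None                       # evicted: reload from disk
        self._chunks.move_to_end(chunk_id)    # mark as recently used
        return self._chunks[chunk_id][0]
```

Evicted chunks are not lost: in a virtual-memory design they are reloaded from the source document on demand, which is the "incremental loading" half of the picture.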

Best Practices

Query Optimization

  • Ask specific questions: instead of “Tell me about this paper”, ask “What methodology does this paper use to evaluate performance?”
  • Ask questions in the same language as the document for best results.
  • Split complex questions into multiple simpler queries for better accuracy.
  • Mention specific sections or chapters when relevant: “What does Chapter 3 say about…?”

Performance Optimization

# For faster responses on shorter documents
bot = ParallelDocQA(
    llm={
        'model': 'qwen-turbo',  # Faster model
        'generate_cfg': {
            'temperature': 0.1,
        }
    },
    rag_cfg={
        'chunk_size': 500,  # Smaller chunks
        'top_k': 3,  # Fewer chunks
    },
    max_workers=8,  # More parallelism
)

# For better accuracy on complex documents
bot = ParallelDocQA(
    llm={
        'model': 'qwen-max',  # More capable model
        'generate_cfg': {
            'temperature': 0.0,  # More deterministic
        }
    },
    rag_cfg={
        'chunk_size': 1000,  # Larger context
        'top_k': 7,  # More chunks
    },
)

Common Use Cases

Academic Research

# Analyze research papers
bot = ParallelDocQA(llm={'model': 'qwen-max'})

questions = [
    'What is the research question?',
    'What methodology is used?',
    'What are the main findings?',
    'What are the limitations?',
    'What future work is suggested?',
]

for question in questions:
    messages = [{'role': 'user', 'content': [
        {'text': question},
        {'file': 'research_paper.pdf'}
    ]}]
    for response in bot.run(messages):
        print(f"{question}: {response[-1]['content']}")

Legal Document Review

# Review contracts and legal documents
bot = ParallelDocQA(
    llm={'model': 'qwen-max'},
    rag_cfg={'chunk_size': 1200}  # Larger chunks for legal text
)

messages = [{
    'role': 'user',
    'content': [
        {'text': 'Identify key obligations and deadlines'},
        {'file': 'contract.pdf'}
    ]
}]

Technical Documentation

# Navigate technical manuals
bot = ParallelDocQA(llm={'model': 'qwen-max'})

messages = [{
    'role': 'user',
    'content': [
        {'text': 'How do I configure SSL certificates?'},
        {'file': 'technical_manual.pdf'}
    ]
}]

Troubleshooting

Slow responses or rate-limit errors
  • Reduce max_workers if API rate limits are hit
  • Use a faster model like qwen-turbo
  • Reduce chunk_size and top_k
  • Ensure a good network connection for remote files

Inaccurate or incomplete answers
  • Increase top_k to retrieve more chunks
  • Use a more capable model like qwen-max
  • Lower temperature to 0.0 for more factual responses
  • Try rephrasing your question more specifically

File loading failures
  • Verify the file format is supported
  • Check that the file is not corrupted
  • Ensure remote URLs are accessible
  • Try converting to a different format

Memory issues
  • Use VirtualMemoryQA for very large documents
  • Reduce chunk_size
  • Process documents in sections

Limitations

Be aware of these limitations:
  • Scanned PDFs require OCR which may have errors
  • Very large files (>100MB) may be slow to process
  • Complex formatting may not be fully preserved
  • Images and figures are not analyzed (use vision models)
  • Subject to LLM context window limits

Next Steps

  • Assistant Demos: combine document QA with other capabilities
  • Multi-Agent Chat: use multiple agents for collaborative analysis
