Overview
BasicDocQA is a specialized agent for document question-answering. It retrieves relevant information from documents and provides clear, focused answers based on the retrieved content.
Class Signature
from qwen_agent.agents import BasicDocQA

class BasicDocQA(Assistant):
    def __init__(
        self,
        function_list: Optional[List[Union[str, Dict, BaseTool]]] = None,
        llm: Optional[Union[Dict, BaseChatModel]] = None,
        system_message: Optional[str] = DEFAULT_SYSTEM_MESSAGE,
        name: Optional[str] = DEFAULT_NAME,
        description: Optional[str] = DEFAULT_DESC,
        files: Optional[List[str]] = None,
        rag_cfg: Optional[Dict] = None
    )
Constructor Parameters
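- function_list (List[Union[str, Dict, BaseTool]], optional): Additional tools for the agent.
- llm (Union[Dict, BaseChatModel], optional): LLM configuration, for example {'model': 'qwen-max', 'api_key': 'your-api-key'}.
- system_message (str, optional): Custom system message. Defaults to the document QA prompt.
- name (str, optional): Agent name. Defaults to DEFAULT_NAME.
- description (str, optional): Agent description. The default is written in Chinese.
- files (List[str], optional): Document files (PDFs, Word documents, text files, web pages).
- rag_cfg (Dict, optional): RAG configuration for retrieval.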
Methods
run
def run(
    self,
    messages: List[Message],
    lang: str = 'en',
    **kwargs
) -> Iterator[List[Message]]
Runs document Q&A with retrieval and streams a response based on the retrieved documents.
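Because run returns an iterator of message lists, a caller that prints text as it arrives has to decide how to treat each chunk. The sketch below assumes each yielded list holds the response generated so far (which is how the examples on this page consume it) and emits only the newly added characters. fake_run is a hypothetical stand-in for agent.run(messages), and plain dicts stand in for Message objects so the snippet runs without an API key.

```python
from typing import Dict, Iterator, List


def fake_run() -> Iterator[List[Dict[str, str]]]:
    """Hypothetical stand-in for agent.run(messages): yields the response so far."""
    for text in ['Open', 'Open Settings', 'Open Settings and choose Reset.']:
        yield [{'role': 'assistant', 'content': text}]


def stream_deltas(stream: Iterator[List[Dict[str, str]]]) -> Iterator[str]:
    """Yield only the text added since the previous chunk."""
    seen = ''
    for response in stream:
        if not response:
            continue
        current = response[-1]['content']
        # Assumption: chunks are cumulative; otherwise fall back to the whole text.
        delta = current[len(seen):] if current.startswith(seen) else current
        seen = current
        if delta:
            yield delta


pieces = list(stream_deltas(fake_run()))
print(''.join(pieces))
```

With real Message objects the lookup would be response[-1].content instead of response[-1]['content'].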
Default Prompt
BasicDocQA uses a specialized prompt:
English:
Please fully understand the content of the following reference materials
and organize a clear response that meets the user's questions.
# Reference materials:
{retrieved_content}
Chinese:
请充分理解以下参考资料内容,组织出满足用户提问的条理清晰的回复。
#参考资料:
{retrieved_content}
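To illustrate how a per-language template like the ones above can be filled in, here is a minimal sketch. PROMPT_TEMPLATES and build_prompt are hypothetical names used only for this illustration, and the {retrieved_content} placeholder is taken from this page rather than from the library source.

```python
# Hypothetical template table; the text is copied from the prompts shown above.
PROMPT_TEMPLATES = {
    'en': (
        "Please fully understand the content of the following reference materials\n"
        "and organize a clear response that meets the user's questions.\n"
        "# Reference materials:\n"
        "{retrieved_content}"
    ),
    'zh': (
        "请充分理解以下参考资料内容,组织出满足用户提问的条理清晰的回复。\n"
        "#参考资料:\n"
        "{retrieved_content}"
    ),
}


def build_prompt(retrieved_content: str, lang: str = 'en') -> str:
    """Insert the retrieved text into the template for the given language."""
    if lang not in PROMPT_TEMPLATES:
        raise ValueError(f'unsupported lang: {lang!r}')
    return PROMPT_TEMPLATES[lang].format(retrieved_content=retrieved_content)


prompt = build_prompt('Passwords can be reset from the Settings page.', lang='en')
print(prompt)
```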
Usage Examples
Basic Usage
from qwen_agent.agents import BasicDocQA
from qwen_agent.llm.schema import Message

# Create agent with documents
agent = BasicDocQA(
    llm={'model': 'qwen-max', 'api_key': 'your-api-key'},
    files=['user_manual.pdf', 'faq.txt']
)

# Ask question
messages = [Message(role='user', content='How do I reset my password?')]
for response in agent.run(messages):
    print(response[-1].content)
Multiple Documents
from qwen_agent.agents import BasicDocQA
from qwen_agent.llm.schema import Message

# Local files and URLs can be mixed in the same list
agent = BasicDocQA(
    llm={'model': 'qwen-max'},
    files=[
        'product_specifications.pdf',
        'installation_guide.pdf',
        'https://example.com/docs/troubleshooting.html'
    ]
)

messages = [
    Message(
        role='user',
        content='What are the system requirements for installation?'
    )
]
for response in agent.run(messages):
    print(response[-1].content)
Chinese Documents
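from qwen_agent.agents import BasicDocQA
from qwen_agent.llm.schema import Message

agent = BasicDocQA(
    llm={'model': 'qwen-max'},
    files=['产品说明.pdf', '用户指南.docx']
)

messages = [Message(role='user', content='如何使用这个功能?')]
# lang='zh' selects the Chinese prompt
for response in agent.run(messages, lang='zh'):
    print(response[-1].content)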
With Custom Prompt
from qwen_agent.agents import BasicDocQA
from qwen_agent.llm.schema import Message

custom_prompt = """
You are a technical support assistant. Based on the reference materials below,
provide a detailed, step-by-step answer to the user's question.
Reference materials:
"""

agent = BasicDocQA(
    llm={'model': 'qwen-max'},
    files=['technical_docs.pdf'],
    system_message=custom_prompt
)

messages = [Message(role='user', content='How do I configure the database?')]
for response in agent.run(messages):
    print(response[-1].content)
RAG Configuration
from qwen_agent.agents import BasicDocQA
from qwen_agent.llm.schema import Message

agent = BasicDocQA(
    llm={'model': 'qwen-max'},
    files=['large_document.pdf'],
    rag_cfg={
        'max_ref_token': 6000,               # More context
        'parser_page_size': 400,             # Chunk size
        'rag_searchers': ['keyword_search']  # Search method
    }
)

messages = [Message(role='user', content='Summarize chapter 3')]
for response in agent.run(messages):
    print(response[-1].content)
Multi-turn Conversation
from qwen_agent.agents import BasicDocQA
from qwen_agent.llm.schema import Message

agent = BasicDocQA(
    llm={'model': 'qwen-max'},
    files=['company_handbook.pdf']
)
messages = []

# First question
messages.append(Message(role='user', content='What are the vacation policies?'))
response = []  # stays empty if the stream yields nothing
for response in agent.run(messages):
    pass  # drain the stream; the last chunk is the complete response
messages.extend(response)
print(f"A: {messages[-1].content}")

# Follow-up question
messages.append(Message(role='user', content='How many days do I get per year?'))
response = []
for response in agent.run(messages):
    pass
messages.extend(response)
print(f"A: {messages[-1].content}")
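The append, drain, extend pattern above repeats for every turn, so it can be wrapped in a small helper. The following is a sketch only: ask and EchoAgent are hypothetical names, EchoAgent is a stub that mimics the shape of run() so the snippet executes without a model, and dicts stand in for Message objects.

```python
from typing import Dict, Iterator, List


class EchoAgent:
    """Hypothetical stub with the same run() shape as the agent on this page."""

    def run(self, messages: List[Dict[str, str]]) -> Iterator[List[Dict[str, str]]]:
        question = messages[-1]['content']
        answer = f'(answer to: {question})'
        # Stream the answer in two cumulative chunks.
        yield [{'role': 'assistant', 'content': answer[:5]}]
        yield [{'role': 'assistant', 'content': answer}]


def ask(agent, history: List[Dict[str, str]], question: str) -> str:
    """Append a user turn, drain the stream, and record the final response."""
    history.append({'role': 'user', 'content': question})
    response: List[Dict[str, str]] = []
    for response in agent.run(history):
        pass  # keep only the last (complete) chunk
    history.extend(response)
    return response[-1]['content'] if response else ''


history: List[Dict[str, str]] = []
agent = EchoAgent()
print(ask(agent, history, 'What are the vacation policies?'))
print(ask(agent, history, 'How many days do I get per year?'))
print(len(history))
```

Initializing response to an empty list before the loop keeps the helper from raising NameError if the stream yields nothing.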
Non-streaming Mode
from qwen_agent.agents import BasicDocQA
from qwen_agent.llm.schema import Message

agent = BasicDocQA(
    llm={'model': 'qwen-max'},
    files=['documentation.pdf']
)

messages = [Message(role='user', content='What is the API endpoint?')]

# Get complete answer at once
response = agent.run_nonstream(messages)
print(response[-1].content)
Specific Section Queries
from qwen_agent.agents import BasicDocQA
from qwen_agent.llm.schema import Message

agent = BasicDocQA(
    llm={'model': 'qwen-max'},
    files=['report.pdf']
)

# Good for queries about specific sections
messages = [
    Message(
        role='user',
        content='Explain Table 1 from the report'
    )
]
for response in agent.run(messages):
    print(response[-1].content)
Comparison with Assistant
BasicDocQA
- Focus: Pure document Q&A
- Prompt: Specialized for reference-based answers
- Use case: Specific questions about document content
- Best for: FAQ, documentation lookup, fact-finding
Assistant
- Focus: General-purpose with RAG
- Prompt: More flexible
- Use case: Complex tasks with documents and tools
- Best for: Multi-step tasks, tool usage + documents
Supported File Types
BasicDocQA supports:
- PDF (.pdf)
- Word documents (.docx)
- Text files (.txt)
- HTML pages (URLs)
- PowerPoint (.pptx)
- Excel (.xlsx)
from qwen_agent.agents import BasicDocQA

agent = BasicDocQA(
    llm={'model': 'qwen-max'},
    files=[
        'report.pdf',
        'data.xlsx',
        'presentation.pptx',
        'https://docs.example.com'
    ]
)
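Before passing a mixed list to files, it can help to separate out entries whose type is not in the list above. This is a sketch under assumptions: the extension set is copied from this page rather than read from the library, and partition_sources and is_url are hypothetical helpers.

```python
from pathlib import PurePath
from typing import List, Tuple
from urllib.parse import urlparse

# Extensions taken from the list on this page; not checked against the library.
SUPPORTED_EXTENSIONS = {'.pdf', '.docx', '.txt', '.pptx', '.xlsx'}


def is_url(source: str) -> bool:
    """Treat http(s) sources as web pages."""
    return urlparse(source).scheme in ('http', 'https')


def partition_sources(sources: List[str]) -> Tuple[List[str], List[str]]:
    """Split sources into (accepted, rejected) by URL scheme or file extension."""
    accepted, rejected = [], []
    for source in sources:
        suffix = PurePath(source).suffix.lower()
        if is_url(source) or suffix in SUPPORTED_EXTENSIONS:
            accepted.append(source)
        else:
            rejected.append(source)
    return accepted, rejected


accepted, rejected = partition_sources(
    ['report.pdf', 'data.xlsx', 'notes.md', 'https://docs.example.com'])
print(accepted)
print(rejected)
```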
When to Use BasicDocQA
Use BasicDocQA when:
- You need simple document question-answering
- Questions are about specific details in documents
- You want focused, reference-based answers
- No tool calling is needed
Use Assistant or ParallelDocQA when:
- You need tool calling capabilities
- Documents are very large (use ParallelDocQA)
- Task involves multiple steps beyond Q&A
Error Handling
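from qwen_agent.agents import BasicDocQA
from qwen_agent.llm.schema import Message

try:
    agent = BasicDocQA(
        llm={'model': 'qwen-max'},
        files=['nonexistent.pdf']
    )
    messages = [Message(role='user', content='Question?')]
    response = agent.run_nonstream(messages)
except Exception as e:
    # Catches failures from both construction and the run itself
    print(f"Error: {e}")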
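The broad except above does not say which input caused the failure, and this page does not state at which point a missing file is reported. One option is to check local paths before building the agent. The sketch below uses only the standard library; missing_local_files is a hypothetical helper, and URLs are skipped because they cannot be checked on disk.

```python
import os
import tempfile
from typing import List


def missing_local_files(files: List[str]) -> List[str]:
    """Return the local paths in `files` that do not exist; URLs are skipped."""
    missing = []
    for source in files:
        if source.startswith(('http://', 'https://')):
            continue  # remote sources cannot be checked on disk
        if not os.path.isfile(source):
            missing.append(source)
    return missing


# Demonstrate with one real file and one absent file in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    existing = os.path.join(tmp, 'faq.txt')
    with open(existing, 'w', encoding='utf-8') as handle:
        handle.write('Q: How do I reset my password?\n')
    absent = os.path.join(tmp, 'nonexistent.pdf')
    missing = missing_local_files([existing, absent, 'https://example.com/docs'])

print(missing)
```

If the returned list is non-empty, the caller can report the missing paths directly instead of relying on the exception message.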
See Also