
Overview

BasicDocQA is a specialized agent for document question-answering. It retrieves relevant information from documents and provides clear, focused answers based on the retrieved content.

Class Signature

from qwen_agent.agents import BasicDocQA

class BasicDocQA(Assistant):
    def __init__(
        self,
        function_list: Optional[List[Union[str, Dict, BaseTool]]] = None,
        llm: Optional[Union[Dict, BaseChatModel]] = None,
        system_message: Optional[str] = DEFAULT_SYSTEM_MESSAGE,
        name: Optional[str] = DEFAULT_NAME,
        description: Optional[str] = DEFAULT_DESC,
        files: Optional[List[str]] = None,
        rag_cfg: Optional[Dict] = None
    )

Constructor Parameters

function_list
List[Union[str, Dict, BaseTool]]
Additional tools for the agent (optional)
llm
Union[Dict, BaseChatModel]
LLM configuration:
{'model': 'qwen-max', 'api_key': 'your-api-key'}
system_message
str
Custom system message. Defaults to the built-in document QA prompt.
name
str
default:"Basic DocQA"
Agent name
description
str
Agent description (Chinese by default)
files
List[str]
Document files (PDFs, Word docs, text files, web pages)
rag_cfg
Dict
RAG configuration for retrieval

Methods

run

def run(
    self,
    messages: List[Message],
    lang: str = 'en',
    **kwargs
) -> Iterator[List[Message]]
Runs document Q&A with retrieval.
messages
List[Message]
required
Conversation messages
lang
str
default:"en"
Language: 'en' or 'zh'
return
Iterator[List[Message]]
Streaming response based on retrieved documents
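
The sketch below shows one way to consume a streaming iterator like the one `run` returns. It assumes that each yielded list holds the full reply generated so far, so only the new suffix needs printing. `fake_run` is a hypothetical stand-in for `agent.run()` and uses plain dicts instead of `Message` objects so the example runs offline; with real `Message` objects you would read `.content` instead.

```python
from typing import Dict, Iterator, List


def fake_run(text: str) -> Iterator[List[Dict[str, str]]]:
    """Hypothetical stand-in for agent.run(): yields the reply so far, one word longer each step."""
    words = text.split()
    for i in range(1, len(words) + 1):
        yield [{'role': 'assistant', 'content': ' '.join(words[:i])}]


def stream_deltas(stream: Iterator[List[Dict[str, str]]]) -> Iterator[str]:
    """Yield only the newly generated text from a cumulative stream."""
    seen = ''
    for response in stream:
        current = response[-1]['content']
        if current.startswith(seen):
            # Assumption: each snapshot extends the previous one.
            yield current[len(seen):]
        else:
            # The snapshot was rewritten, so emit it whole.
            yield current
        seen = current


chunks = list(stream_deltas(fake_run('Open Settings then Security')))
final_answer = ''.join(chunks)
```

Joining the deltas reproduces the final snapshot, which is why the usage examples on this page can simply keep the last yielded `response`.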

Default Prompt

BasicDocQA uses a specialized prompt:
English:
Please fully understand the content of the following reference materials 
and organize a clear response that meets the user's questions.
# Reference materials:
{retrieved_content}
Chinese:
请充分理解以下参考资料内容,组织出满足用户提问的条理清晰的回复。
#参考资料:
{retrieved_content}
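
As a rough illustration of how such a template is filled, the sketch below substitutes retrieved text into the two prompts shown above. `PROMPT_TEMPLATES` and `build_prompt` are hypothetical names for this example and not the library's internals; only the prompt wording and the `{retrieved_content}` placeholder come from this page.

```python
PROMPT_TEMPLATES = {
    'en': (
        "Please fully understand the content of the following reference materials "
        "and organize a clear response that meets the user's questions.\n"
        "# Reference materials:\n"
        "{retrieved_content}"
    ),
    'zh': (
        "请充分理解以下参考资料内容,组织出满足用户提问的条理清晰的回复。\n"
        "#参考资料:\n"
        "{retrieved_content}"
    ),
}


def build_prompt(retrieved_content: str, lang: str = 'en') -> str:
    """Fill the template for the given language (hypothetical helper)."""
    if lang not in PROMPT_TEMPLATES:
        raise ValueError(f"unsupported lang: {lang!r}")
    # Only the template is parsed by format(); braces inside the
    # retrieved text are inserted literally.
    return PROMPT_TEMPLATES[lang].format(retrieved_content=retrieved_content)


prompt_en = build_prompt('Passwords are reset under Settings > Security.')
```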

Usage Examples

Basic Usage

from qwen_agent.agents import BasicDocQA
from qwen_agent.llm.schema import Message

# Create agent with documents
agent = BasicDocQA(
    llm={'model': 'qwen-max', 'api_key': 'your-api-key'},
    files=['user_manual.pdf', 'faq.txt']
)

# Ask question
messages = [Message(role='user', content='How do I reset my password?')]

for response in agent.run(messages):
    print(response[-1].content)

Multiple Documents

agent = BasicDocQA(
    llm={'model': 'qwen-max'},
    files=[
        'product_specifications.pdf',
        'installation_guide.pdf',
        'https://example.com/docs/troubleshooting.html'
    ]
)

messages = [
    Message(
        role='user',
        content='What are the system requirements for installation?'
    )
]

for response in agent.run(messages):
    print(response[-1].content)

Chinese Documents

agent = BasicDocQA(
    llm={'model': 'qwen-max'},
    files=['产品说明.pdf', '用户指南.docx']
)

messages = [Message(role='user', content='如何使用这个功能?')]

for response in agent.run(messages, lang='zh'):
    print(response[-1].content)

With Custom Prompt

custom_prompt = """
You are a technical support assistant. Based on the reference materials below,
provide a detailed, step-by-step answer to the user's question.
Reference materials:
"""

agent = BasicDocQA(
    llm={'model': 'qwen-max'},
    files=['technical_docs.pdf'],
    system_message=custom_prompt
)

messages = [Message(role='user', content='How do I configure the database?')]

for response in agent.run(messages):
    print(response[-1].content)

RAG Configuration

agent = BasicDocQA(
    llm={'model': 'qwen-max'},
    files=['large_document.pdf'],
    rag_cfg={
        'max_ref_token': 6000,        # Token budget for retrieved reference text
        'parser_page_size': 400,       # Chunk size used when splitting documents
        'rag_searchers': ['keyword_search']  # Retrieval method(s) to use
    }
)

messages = [Message(role='user', content='Summarize chapter 3')]

for response in agent.run(messages):
    print(response[-1].content)
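
To get a feel for how these two numbers interact, here is a back-of-the-envelope sketch. It assumes that `parser_page_size` is the chunk size in tokens and that `max_ref_token` caps the total retrieved text. Real retrieval may pick shorter or partial chunks, so treat the result as an upper bound on full-size chunks rather than a guarantee. `max_full_chunks` is a hypothetical helper.

```python
def max_full_chunks(max_ref_token: int, parser_page_size: int) -> int:
    """Upper bound on how many full-size chunks fit in the reference budget."""
    if max_ref_token < 0 or parser_page_size <= 0:
        raise ValueError("max_ref_token must be >= 0 and parser_page_size must be > 0")
    return max_ref_token // parser_page_size


# With the settings from the example above: 6000 // 400
chunk_budget = max_full_chunks(6000, 400)
```

Under these assumptions the configuration above leaves room for 15 full chunks. Raising `max_ref_token` or lowering `parser_page_size` increases that count.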

Multi-turn Conversation

agent = BasicDocQA(
    llm={'model': 'qwen-max'},
    files=['company_handbook.pdf']
)

messages = []

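# First question
messages.append(Message(role='user', content='What are the vacation policies?'))
response = []
for response in agent.run(messages):
    pass  # Drain the stream; the last value is the complete reply
messages.extend(response)  # Keep the reply in history for the follow-up
print(f"A: {messages[-1].content}")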

# Follow-up question
messages.append(Message(role='user', content='How many days do I get per year?'))
response = []
for response in agent.run(messages):
    pass  # Drain the stream; the last value is the complete reply
messages.extend(response)
print(f"A: {messages[-1].content}")
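
The append, drain, extend pattern above repeats for every turn, so it can be wrapped in a small helper. The sketch below is one way to do that. `ask` and `StubAgent` are hypothetical names: `StubAgent` only imitates a streaming `run` method so the example runs without an API key, and messages are plain dicts rather than `Message` objects.

```python
from typing import Dict, Iterator, List

Msg = Dict[str, str]


class StubAgent:
    """Hypothetical stand-in for BasicDocQA so the sketch runs offline."""

    def run(self, messages: List[Msg]) -> Iterator[List[Msg]]:
        question = messages[-1]['content']
        answer = f"(answer to: {question})"
        # Yield two growing snapshots, the way a streaming agent might.
        for end in (len(answer) // 2, len(answer)):
            yield [{'role': 'assistant', 'content': answer[:end]}]


def ask(agent, history: List[Msg], question: str) -> str:
    """Append the question, drain the stream, and store the reply in history."""
    history.append({'role': 'user', 'content': question})
    response: List[Msg] = []
    for response in agent.run(history):
        pass  # Keep only the final snapshot
    history.extend(response)
    return response[-1]['content'] if response else ''


history: List[Msg] = []
bot = StubAgent()
first = ask(bot, history, 'What are the vacation policies?')
second = ask(bot, history, 'How many days do I get per year?')
```

Initializing `response` before the loop avoids a `NameError` if the agent yields nothing.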

Non-streaming Mode

agent = BasicDocQA(
    llm={'model': 'qwen-max'},
    files=['documentation.pdf']
)

messages = [Message(role='user', content='What is the API endpoint?')]

# Get complete answer at once
response = agent.run_nonstream(messages)
print(response[-1].content)

Specific Section Queries

agent = BasicDocQA(
    llm={'model': 'qwen-max'},
    files=['report.pdf']
)

# Good for queries about specific sections
messages = [
    Message(
        role='user',
        content='Explain Table 1 from the report'
    )
]

for response in agent.run(messages):
    print(response[-1].content)

Comparison with Assistant

BasicDocQA

  • Focus: Pure document Q&A
  • Prompt: Specialized for reference-based answers
  • Use case: Specific questions about document content
  • Best for: FAQ, documentation lookup, fact-finding

Assistant

  • Focus: General-purpose with RAG
  • Prompt: More flexible
  • Use case: Complex tasks with documents and tools
  • Best for: Multi-step tasks, tool usage + documents

Supported File Types

BasicDocQA supports:
  • PDF (.pdf)
  • Word documents (.docx)
  • Text files (.txt)
  • HTML pages (URLs)
  • PowerPoint (.pptx)
  • Excel (.xlsx)
agent = BasicDocQA(
    llm={'model': 'qwen-max'},
    files=[
        'report.pdf',
        'data.xlsx',
        'presentation.pptx',
        'https://docs.example.com'
    ]
)
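
Before building the `files` list, it can help to screen out inputs that do not match the types listed above. The following is a minimal sketch: `SUPPORTED_EXTENSIONS` mirrors only the list on this page, `is_supported` is a hypothetical helper, and treating every http(s) URL as an HTML page is an assumption of this example.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Extensions taken from the list on this page; the library may accept more.
SUPPORTED_EXTENSIONS = {'.pdf', '.docx', '.txt', '.pptx', '.xlsx'}


def is_supported(source: str) -> bool:
    """Return True for http(s) URLs and for paths with a listed extension."""
    if urlparse(source).scheme in ('http', 'https'):
        return True  # Assumption: URLs are handled as HTML pages
    return PurePosixPath(source).suffix.lower() in SUPPORTED_EXTENSIONS


candidates = ['report.pdf', 'data.xlsx', 'notes.md', 'https://docs.example.com']
unsupported = [f for f in candidates if not is_supported(f)]
```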

When to Use BasicDocQA

Use BasicDocQA when:
  • You need simple document question-answering
  • Questions are about specific details in documents
  • You want focused, reference-based answers
  • No tool calling is needed
Use Assistant or ParallelDocQA when:
  • You need tool calling capabilities
  • Documents are very large (use ParallelDocQA)
  • Task involves multiple steps beyond Q&A
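
The guidance above can be condensed into a small decision function. This is only a sketch of the rules as listed here. `choose_agent` is a hypothetical helper, and letting tool use or multi-step work take precedence over document size is an assumption, since the page does not say which agent to pick when both apply.

```python
def choose_agent(needs_tools: bool = False,
                 multi_step: bool = False,
                 very_large_docs: bool = False) -> str:
    """Return the name of the agent class suggested by the rules above."""
    if needs_tools or multi_step:
        return 'Assistant'  # Assumption: tools and multi-step work win ties
    if very_large_docs:
        return 'ParallelDocQA'
    return 'BasicDocQA'


suggestion = choose_agent(very_large_docs=True)
```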

Error Handling

try:
    agent = BasicDocQA(
        llm={'model': 'qwen-max'},
        files=['nonexistent.pdf']
    )
    messages = [Message(role='user', content='Question?')]
    response = agent.run_nonstream(messages)
except Exception as e:
    print(f"Error: {e}")
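
Catching a broad `Exception` reports the failure only after the agent has been built. A cheap complement is to check local paths up front. The sketch below assumes nothing about how the library itself reacts to a missing file. `missing_local_files` is a hypothetical helper that skips http(s) URLs and reports paths that are not existing files.

```python
import os
from typing import List


def missing_local_files(files: List[str]) -> List[str]:
    """Return the local paths in `files` that do not point to an existing file."""
    return [
        f for f in files
        if not f.startswith(('http://', 'https://'))  # URLs are not checked here
        and not os.path.isfile(f)
    ]


missing = missing_local_files(['nonexistent_file_for_docs_example.pdf',
                               'https://example.com/docs/troubleshooting.html'])
if missing:
    print(f"Missing files: {missing}")
```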
