Overview
Qwen-Agent supports multiple LLM providers and model types through a unified configuration interface. The framework automatically selects the appropriate model client based on your configuration.

Basic Configuration
Configuration Parameters
Core Parameters
model
Model identifier. Examples: 'qwen-plus', 'qwen-max', 'gpt-4'
model_server
Model service endpoint:
- 'dashscope' - use Alibaba Cloud DashScope
- 'http://127.0.0.1:7905/v1' - a custom OpenAI-compatible endpoint
- 'https://api.openai.com/v1' - the OpenAI API
api_key
API key for authentication. Can also be set via environment variables:
- DASHSCOPE_API_KEY for DashScope
- OPENAI_API_KEY for OpenAI
model_type
Explicitly specifies the model type. Auto-detected if not provided. Available types:
- 'qwen_dashscope' - Qwen models via DashScope
- 'qwenvl_dashscope' - Qwen-VL vision models
- 'qwenaudio_dashscope' - Qwen-Audio models
- 'oai' - OpenAI-compatible API
- 'qwenvl_oai' - vision models via an OpenAI-compatible API
- 'azure' - Azure OpenAI
- 'transformers' - local Hugging Face models
- 'openvino' - OpenVINO-optimized models
generate_cfg
Generation hyperparameters (see below)
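Putting the core parameters together, a minimal configuration might look like the sketch below (the model name and sampling value are illustrative):

```python
# Minimal LLM configuration. When `model_type` is omitted, the framework
# infers the client from `model` and `model_server`.
llm_cfg = {
    'model': 'qwen-plus',
    'model_server': 'dashscope',
    'api_key': '',  # empty -> falls back to the DASHSCOPE_API_KEY env var
    'generate_cfg': {
        'top_p': 0.8,
    },
}

# With qwen-agent installed, build a client from the config:
#   from qwen_agent.llm import get_chat_model
#   llm = get_chat_model(llm_cfg)
```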
Generation Configuration
The generate_cfg dictionary controls how the LLM generates responses.
Common Parameters
top_p
Nucleus sampling parameter. Controls diversity by sampling from the top probability mass. Range: 0.0 to 1.0.
temperature
Sampling temperature. Higher values increase randomness. Range: 0.0 to 2.0.
max_tokens
Maximum number of tokens to generate in the response.
max_input_tokens
Maximum input context length. Messages are automatically truncated if it is exceeded. Set to -1 to disable truncation.
max_retries
Number of retry attempts on service errors, with exponential backoff.
seed
Random seed for reproducible generation. Auto-generated if not provided.
stop
Stop sequences that halt generation when encountered.
Directory for caching LLM responses. Requires the diskcache package.

Function Calling Parameters
parallel_function_calls
Enable parallel execution of multiple function calls in a single response.
function_choice
Control function-calling behavior:
- 'auto' - the model decides whether to call functions
- 'none' - disable function calling
- 'function_name' - force a call to the named function
Include reasoning thoughts in the content field alongside function calls.
fncall_prompt_type
Function-calling prompt style:
- 'nous' - Nous Research format
- 'qwen' - Qwen-specific format
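A generate_cfg combining the sampling and function-calling parameters above might look like this (values are illustrative; confirm the key names against your qwen-agent version):

```python
generate_cfg = {
    # Sampling
    'top_p': 0.8,
    'temperature': 0.7,
    'max_input_tokens': 6000,   # -1 disables input truncation
    'max_retries': 3,
    'seed': 1234,
    'stop': ['Observation:'],
    # Function calling
    'parallel_function_calls': True,
    'function_choice': 'auto',
}
```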
Model Types
DashScope Models (Alibaba Cloud)
Source Reference: qwen_agent/llm/__init__.py:31-100
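For instance, a text model and a vision model served through DashScope (model names are illustrative; model_type could also be left out and auto-detected):

```python
# Text model via Alibaba Cloud DashScope.
text_cfg = {
    'model': 'qwen-max',
    'model_type': 'qwen_dashscope',
    'model_server': 'dashscope',
}

# Vision-language model via DashScope.
vision_cfg = {
    'model': 'qwen-vl-max',
    'model_type': 'qwenvl_dashscope',
    'model_server': 'dashscope',
}
```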
OpenAI-Compatible Models
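A sketch of two OpenAI-compatible setups: the hosted OpenAI API and a self-hosted endpoint (the local model name, port, and placeholder API key are assumptions):

```python
# Hosted OpenAI API.
openai_cfg = {
    'model': 'gpt-4',
    'model_type': 'oai',
    'model_server': 'https://api.openai.com/v1',
    'api_key': '',  # or set the OPENAI_API_KEY env var
}

# Self-hosted OpenAI-compatible server (e.g. vLLM).
local_server_cfg = {
    'model': 'qwen2.5-7b-instruct',
    'model_type': 'oai',
    'model_server': 'http://127.0.0.1:7905/v1',
    'api_key': 'EMPTY',
}
```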
Local Models
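Local backends load weights in-process instead of calling a remote server; a sketch (the Hugging Face model id and OpenVINO path are illustrative):

```python
# Runs the model locally via Hugging Face Transformers.
hf_cfg = {
    'model': 'Qwen/Qwen2.5-7B-Instruct',
    'model_type': 'transformers',
}

# Runs an OpenVINO-converted model for optimized CPU inference.
ov_cfg = {
    'model': './Qwen2.5-7B-Instruct-ov',
    'model_type': 'openvino',
}
```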
Using LLM Directly
Chat Interface
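A typical streaming chat call, sketched below. The message list is plain Python; the commented lines show the call shape with a configured client (with the default delta_stream=False, each streamed chunk is the cumulative response so far):

```python
messages = [
    {'role': 'system', 'content': 'You are a helpful assistant.'},
    {'role': 'user', 'content': 'Tell me a joke.'},
]

# With qwen-agent installed and an API key configured:
#   from qwen_agent.llm import get_chat_model
#   llm = get_chat_model({'model': 'qwen-plus', 'model_server': 'dashscope'})
#   for responses in llm.chat(messages=messages, stream=True):
#       pass                         # each chunk replaces the previous one
#   print(responses[-1]['content'])  # final assistant reply
```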
Function Calling
Source Reference: qwen_agent/llm/base.py:118-290
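Function calling uses OpenAI-style function schemas. The schema below is runnable plain Python; the commented lines sketch the call (the weather function is a made-up example):

```python
functions = [{
    'name': 'get_current_weather',
    'description': 'Get the current weather for a city',
    'parameters': {
        'type': 'object',
        'properties': {
            'location': {'type': 'string', 'description': 'City name'},
        },
        'required': ['location'],
    },
}]

messages = [{'role': 'user', 'content': "What's the weather in Paris?"}]

# With a configured client:
#   for responses in llm.chat(messages=messages, functions=functions, stream=True):
#       pass
#   # When the model decides to call a function, the last message carries a
#   # `function_call` with the function `name` and JSON-encoded `arguments`.
```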
Advanced Configuration
Response Caching
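Caching stores responses on disk so repeated identical queries skip the API. A minimal sketch, assuming the cache directory is configured through a cache_dir key (the key name is an assumption; check your qwen-agent version) and the diskcache package is installed:

```python
cached_cfg = {
    'model': 'qwen-plus',
    'model_server': 'dashscope',
    'cache_dir': '/tmp/qwen_agent_llm_cache',  # assumed key name; requires `diskcache`
}
```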
Raw API Mode
Bypass Qwen-Agent preprocessing for direct model access (e.g. a direct llm.chat(..., stream=True, delta_stream=False) call).
Source Reference: qwen_agent/llm/base.py:89-223
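The stream/delta_stream pair controls the shape of direct responses; a hedged sketch (flag semantics as described here, not an exhaustive API reference):

```python
messages = [{'role': 'user', 'content': 'Hi'}]

# With a configured client:
#   # Cumulative streaming: each yielded list is the full response so far.
#   for responses in llm.chat(messages=messages, stream=True, delta_stream=False):
#       pass
#   # Incremental streaming: each yielded chunk contains only the new tokens.
#   for chunk in llm.chat(messages=messages, stream=True, delta_stream=True):
#       ...
```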
Error Handling
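A hedged sketch of graceful degradation on service failures. ModelServiceError is assumed from qwen_agent.llm.base; a local stand-in is defined here so the snippet runs standalone:

```python
class ModelServiceError(Exception):
    """Stand-in for qwen_agent.llm.base.ModelServiceError (assumed name);
    with qwen-agent installed, import the real class instead."""

def chat_with_fallback(call_llm, fallback='The model service is unavailable; please retry.'):
    """Run an LLM call and return a fallback string on service errors.
    Note that max_retries in generate_cfg already retries transient
    failures before this exception reaches the caller."""
    try:
        return call_llm()
    except ModelServiceError as exc:
        return f'{fallback} ({exc})'
```

For example, chat_with_fallback(lambda: llm.chat(...)) returns the fallback message instead of raising when the backend is down.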
Message Schema
Message Format
Source Reference: qwen_agent/llm/schema.py:132-164
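In dict form, a message is a role plus content; function results use role 'function' with a name field (qwen_agent.llm.schema.Message is the typed equivalent; the calculator example is illustrative):

```python
system_msg = {'role': 'system', 'content': 'You are a helpful assistant.'}
user_msg = {'role': 'user', 'content': 'What is 2 + 2?'}
assistant_msg = {'role': 'assistant', 'content': 'It is 4.'}

# Result of a tool/function execution fed back to the model:
function_msg = {
    'role': 'function',
    'name': 'calculator',        # which function produced this result
    'content': '{"result": 4}',
}
```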
Content Types
Messages can contain multiple content types:
- text - plain text content
- image - image URL or base64-encoded image
- file - file URL or path
- audio - audio URL or audio configuration
- video - video URL or frame list
Source Reference: qwen_agent/llm/schema.py:80-129
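A multimodal user message carries a list of typed content items instead of a plain string (the image URL is a placeholder):

```python
multimodal_msg = {
    'role': 'user',
    'content': [
        {'text': 'Describe this image.'},
        {'image': 'https://example.com/photo.jpg'},
    ],
}
```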
Best Practices
Model Selection
- Use qwen-max for complex reasoning tasks
- Use qwen-plus for balanced performance
- Use qwen-turbo for speed-critical applications
- Use vision models only when processing images
Context Management
- Set max_input_tokens to prevent context overflow
- The framework auto-truncates old messages when needed
- Keep system messages concise
- Consider RAG for large document contexts
Performance
- Enable caching for repeated queries
- Use streaming for better UX
- Configure max_retries for production reliability
- Use qwen-turbo for latency-sensitive apps
Function Calling
- Use parallel_function_calls for independent operations
- Set function_choice='none' to disable functions temporarily
- Always validate function arguments
- Handle tool errors gracefully
Environment Variables
DASHSCOPE_API_KEY
API key for DashScope services
OPENAI_API_KEY
API key for OpenAI services
Enable raw API mode globally. Set to 'true' to enable.

Related Resources
Agents
Learn how to use LLMs within agents
Function Calling
Deep dive into function calling