
Overview

Qwen Vision-Language (Qwen-VL) models accept image, video, and (on select models) audio inputs alongside text.

Initialization

from qwen_agent.llm import get_chat_model

llm = get_chat_model({
    'model': 'qwen-vl-max',
    'model_type': 'qwenvl_dashscope',
    'api_key': 'your-api-key'
})

Configuration

  • `model` (str, required): one of `'qwen-vl-max'`, `'qwen-vl-plus'`, or `'qwen-vl-ocr'`
  • `model_type` (str, required): must be `'qwenvl_dashscope'`
  • `api_key` (str): your DashScope API key, as passed in the examples on this page
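
Rather than hard-coding the key, you can read it from the environment. This is a minimal sketch; `DASHSCOPE_API_KEY` is the variable name the DashScope SDK conventionally reads, so confirm it matches your setup.

```python
import os

# Build the same config dict as above, but pull the API key from the
# environment instead of embedding it in source.
cfg = {
    'model': 'qwen-vl-max',
    'model_type': 'qwenvl_dashscope',
    'api_key': os.getenv('DASHSCOPE_API_KEY', ''),
}
```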

Usage Example

from qwen_agent.llm import get_chat_model
from qwen_agent.llm.schema import Message, ContentItem

llm = get_chat_model({
    'model': 'qwen-vl-max',
    'model_type': 'qwenvl_dashscope',
    'api_key': 'your-api-key'
})

# Image input
messages = [
    Message(role='user', content=[
        ContentItem(text='What is in this image?'),
        ContentItem(image='https://example.com/image.jpg')
    ])
]

for response in llm.chat(messages=messages):
    print(response[-1].content)
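
The same message can be expressed as plain dicts mirroring the `Message`/`ContentItem` shape, which is convenient when building payloads programmatically. Treat the dict form as an assumption to verify against your installed qwen-agent version.

```python
# Sketch: the image question above as plain dicts rather than
# Message/ContentItem objects.
messages = [{
    'role': 'user',
    'content': [
        {'text': 'What is in this image?'},
        {'image': 'https://example.com/image.jpg'},
    ],
}]
```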

Supported Inputs

  • Images: URLs or local file paths
  • Videos: Video file URLs
  • Audio: Audio files (select models)
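
The inputs above can be sketched as content items in the dict message form. The `video` content key and the `file://` convention for local paths are assumptions here; the example URLs and paths are placeholders, so verify both against your installed qwen-agent version.

```python
# Video input via a file URL (placeholder URL).
video_msg = [{
    'role': 'user',
    'content': [
        {'text': 'Summarize this clip.'},
        {'video': 'https://example.com/clip.mp4'},
    ],
}]

# Local image via a file path (placeholder path; plain absolute paths
# are often accepted as well).
local_image_msg = [{
    'role': 'user',
    'content': [
        {'text': 'Describe this picture.'},
        {'image': 'file:///path/to/image.jpg'},
    ],
}]
```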
