Welcome to Qwen

Qwen (通义千问) is a series of open-source large language models developed by Alibaba Cloud. The models range from 1.8B to 72B parameters and include both pretrained base models and chat-aligned variants optimized for conversational AI.

Quickstart

Get started with Qwen in minutes

Model Selection

Choose the right model for your use case

Fine-tuning

Customize models for your domain

API Reference

Explore the complete API

Key Features

Multiple Model Sizes

Choose from 1.8B, 7B, 14B, and 72B parameter models to balance performance and efficiency

Quantization Support

Run models efficiently with GPTQ Int4/Int8 and KV cache quantization
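
To see why Int4 quantization matters, a back-of-envelope calculation of weight-only memory is useful. The sketch below is rough arithmetic (it ignores activations, the KV cache, and quantization overhead such as scales and zero-points), and the 7.2B parameter count is an approximation for the 7B model:

```python
def weight_memory_gb(n_params: float, bits_per_weight: int) -> float:
    """Rough weight-only memory footprint in GB (ignores activations and KV cache)."""
    return n_params * bits_per_weight / 8 / 1024**3

# Back-of-envelope comparison for a ~7.2e9-parameter model.
fp16 = weight_memory_gb(7.2e9, 16)  # ~13.4 GB
int4 = weight_memory_gb(7.2e9, 4)   # ~3.4 GB
print(f"FP16: {fp16:.1f} GB, Int4: {int4:.1f} GB")
```

Real deployments need headroom beyond the weights, but the 4x reduction is the reason a 7B Int4 model fits on consumer GPUs.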

Long Context

Process up to 32K tokens in a single context window

Function Calling

Enable tool use and agent capabilities with built-in function calling
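
Function calling works by having the model emit a structured tool invocation in its text output, which your code parses and executes. The sketch below parses output in the common ReAct convention (`Action:` / `Action Input:`); the exact field names and prompt format used by your serving stack may differ, so treat this as an illustration rather than the repository's implementation:

```python
import json
import re

def parse_react_tool_call(text: str):
    """Extract a tool call from ReAct-style model output.

    Expects the common 'Action: <name>' / 'Action Input: <json>' convention;
    returns (tool_name, arguments) or None if no call is present.
    """
    m = re.search(r"Action:\s*(\S+)\s*Action Input:\s*(\{.*?\})", text, re.S)
    if not m:
        return None
    return m.group(1), json.loads(m.group(2))

# Example model output containing a tool call (hypothetical tool name).
output = (
    "Thought: I should check the weather.\n"
    "Action: get_weather\n"
    'Action Input: {"city": "Beijing"}'
)
print(parse_react_tool_call(output))  # ('get_weather', {'city': 'Beijing'})
```

Your agent loop would execute the named tool with the parsed arguments, append the result to the conversation, and let the model continue.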

Multi-language

Strong performance on both Chinese and English tasks

OpenAI Compatible

Deploy with OpenAI-compatible API endpoints
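
Because the server speaks the OpenAI wire format, any OpenAI-compatible client can talk to it by pointing at your deployment's base URL. The sketch below only assembles the request payload (the URL and port are deployment-specific assumptions, and no network call is made):

```python
import json

# Hypothetical local deployment; the server exposes the OpenAI-style
# /v1/chat/completions route, so host and port depend on how you launch it.
BASE_URL = "http://localhost:8000/v1"

def build_chat_request(messages, model="Qwen-7B-Chat", **params):
    """Assemble an OpenAI-style chat completion payload."""
    return {"model": model, "messages": messages, **params}

payload = build_chat_request(
    [{"role": "user", "content": "Hello!"}],
    temperature=0.7,
)
print(json.dumps(payload, indent=2))
# POST this JSON to f"{BASE_URL}/chat/completions" with any HTTP client,
# or point an OpenAI SDK's base_url at BASE_URL.
```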

Model Variants

Qwen offers two main types of models:

Base models (Qwen-1.8B, Qwen-7B, Qwen-14B, Qwen-72B): pretrained language models trained on 2.2 to 3.0 trillion tokens. Ideal for:
  • Further fine-tuning on domain-specific data
  • Research and experimentation
  • Building custom chat applications

Chat models (Qwen-1.8B-Chat, Qwen-7B-Chat, Qwen-14B-Chat, Qwen-72B-Chat): chat-aligned variants of the base models, ready for conversational use out of the box.

Performance Highlights

Qwen-72B achieves state-of-the-art performance among open-source models:
  • MMLU: 77.4% (outperforms LLaMA2-70B and GPT-3.5)
  • C-Eval: 83.3% (leading performance on Chinese benchmarks)
  • GSM8K: 78.9% (strong mathematical reasoning)
  • HumanEval: 35.4% (competitive coding capabilities)
For detailed benchmark results and methodology, see the Benchmarks page and Technical Report.

Getting Started

1. Install Dependencies

Install the required packages:
pip install "transformers>=4.32.0" "torch>=2.0.0"
2. Load a Model

Load and use a Qwen model in just a few lines:
from transformers import AutoModelForCausalLM, AutoTokenizer

# First-generation Qwen checkpoints ship custom modeling code,
# so trust_remote_code=True is required.
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen-7B-Chat", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen-7B-Chat", device_map="auto", trust_remote_code=True
).eval()

# model.chat returns the reply plus the updated conversation history;
# pass the history back in to continue a multi-turn conversation.
response, history = model.chat(tokenizer, "Hello!", history=None)
print(response)
3. Explore Advanced Features

Discover quantization, fine-tuning, and deployment options in the documentation

Next Steps

Installation Guide

Set up your environment and install Qwen

Model Overview

Compare model sizes and capabilities

Quantization

Optimize memory and speed with quantization

Deployment

Deploy Qwen in production environments

Community and Support

GitHub Repository

View source code and contribute

FAQ

Find answers to common questions
Note: This repository (QwenLM/Qwen) focuses on the first generation of Qwen models. For Qwen2, please visit QwenLM/Qwen2.