Overview
The web demo (web_demo.py) creates an interactive chat interface with:
- Modern web-based UI with chat bubbles
- Markdown and code syntax highlighting
- Message regeneration capability
- History management
- Shareable public links
- Auto-launch in browser
- Customizable server settings
Installation
Basic Usage
Quick Start
Launch the web demo with default settings. The interface is served at http://127.0.0.1:8000 by default.
Command-Line Options
Model checkpoint name or path from HuggingFace/ModelScope
Run the demo with CPU only (no GPU required)
Create a publicly shareable Gradio link (tunnels through Gradio servers)
Automatically open the interface in your default browser
Port number for the web server
Server hostname or IP address (use “0.0.0.0” to allow external access)
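The options described above can be sketched as an argparse parser. This is an illustration, not the script's actual code: only `--server-name`, the default host 127.0.0.1, and the default port 8000 appear on this page. The other flag spellings (`-c/--checkpoint-path`, `--cpu-only`, `--share`, `--inbrowser`, `--server-port`) and the `DEFAULT_CKPT_PATH` value are assumptions inferred from the descriptions, so check them against web_demo.py before relying on them.

```python
from argparse import ArgumentParser

# Placeholder default (assumption); the real default is defined in web_demo.py.
DEFAULT_CKPT_PATH = "Qwen/Qwen-7B-Chat"


def build_parser() -> ArgumentParser:
    """Build a parser with one option per entry in the list above."""
    parser = ArgumentParser(description="Illustrative option parser for the web demo")
    parser.add_argument("-c", "--checkpoint-path", type=str, default=DEFAULT_CKPT_PATH,
                        help="Model checkpoint name or path")
    parser.add_argument("--cpu-only", action="store_true",
                        help="Run the demo with CPU only")
    parser.add_argument("--share", action="store_true",
                        help="Create a publicly shareable Gradio link")
    parser.add_argument("--inbrowser", action="store_true",
                        help="Open the interface in the default browser")
    parser.add_argument("--server-port", type=int, default=8000,
                        help="Port number for the web server")
    parser.add_argument("--server-name", type=str, default="127.0.0.1",
                        help="Server hostname or IP address")
    return parser


# Parse an explicit argument list instead of sys.argv so the example is reproducible.
args = build_parser().parse_args(["--server-name", "0.0.0.0", "--server-port", "8080"])
print(args.server_name, args.server_port)
```

Boolean options use `store_true`, so they default to off and need no value on the command line.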
Usage Examples
Interface Features
Chat Interface
The web demo provides a clean, modern chat interface with:
- Chat Display: Shows conversation history with proper formatting
- Input Box: Multi-line text input for your messages
- Action Buttons:
  - 🚀 Submit: Send your message
  - 🧹 Clear History: Reset the conversation
  - 🤔 Regenerate: Re-generate the last response
Message Formatting
The demo supports rich text formatting:
- Markdown
- Code Blocks
- LaTeX
Messages are rendered with full Markdown support:
- Bold and italic text
- Lists and bullet points
- Links and quotes
- Tables
Key Functions
Message Submission
When you submit a message, the interface:
- Displays your message in the chat
- Shows a streaming response from the model
- Updates the conversation history
See web_demo.py:119.
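The three steps above can be sketched as a generator that yields the chat display after every partial response. This is a minimal sketch, not the code in web_demo.py: `fake_stream` is a hypothetical stand-in for the model's streaming call, and the `(query, response)` tuple format for the history is an assumption.

```python
from typing import Iterator, List, Tuple

History = List[Tuple[str, str]]


def fake_stream(query: str) -> Iterator[str]:
    """Stand-in for the model: yields the response so far, one word longer each time."""
    words = f"Echo: {query}".split()
    for i in range(1, len(words) + 1):
        yield " ".join(words[:i])


def predict(query: str, chatbot: History, history: History) -> Iterator[History]:
    """Yield the updated chat display after each partial response."""
    chatbot.append((query, ""))           # 1. show the user's message right away
    partial = ""
    for partial in fake_stream(query):    # 2. stream the growing response
        chatbot[-1] = (query, partial)
        yield chatbot
    history.append((query, partial))      # 3. record the finished turn


chatbot, history = [], []
for frame in predict("hello world", chatbot, history):
    print(frame[-1][1])
```

Because the function is a generator, a UI framework that accepts generator callbacks can redraw the chat on every `yield`, which is what makes the response appear to stream.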
Regenerate Response
Click the “Regenerate” button to get a different response to your last message (see web_demo.py:134).
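One way to picture regeneration: drop the last turn, then answer the same query again. The sketch below assumes the history is a list of `(query, response)` tuples and takes a hypothetical `respond` callable in place of the model. It is not the implementation in web_demo.py.

```python
from typing import Callable, List, Tuple

History = List[Tuple[str, str]]


def regenerate(chatbot: History, history: History,
               respond: Callable[[str, History], str]) -> History:
    """Replace the last answer with a fresh one for the same query."""
    if not history:
        return chatbot                    # nothing to regenerate yet
    query, _old_answer = history.pop()    # forget the previous answer
    if chatbot:
        chatbot.pop()                     # remove it from the display too
    answer = respond(query, history)      # ask again with the earlier context only
    chatbot.append((query, answer))
    history.append((query, answer))
    return chatbot


history = [("hi", "first answer")]
chatbot = list(history)
regenerate(chatbot, history, lambda query, past: "second answer")
print(chatbot)
```

Removing the old turn before calling the model keeps the discarded answer out of the context for the new one.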
Clear History
Reset the conversation and free up memory (see web_demo.py:145).
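A minimal sketch of what "reset and free memory" can look like, assuming the conversation is held in plain Python lists. The PyTorch cache call is guarded so the snippet also runs where PyTorch is not installed. The function name `reset_state` is illustrative.

```python
import gc


def reset_state(chatbot: list, history: list) -> list:
    """Clear the conversation in place and release memory that is no longer referenced."""
    history.clear()
    chatbot.clear()
    gc.collect()                          # collect unreachable Python objects
    try:
        import torch                      # optional: only relevant when a GPU is in use
        if torch.cuda.is_available():
            torch.cuda.empty_cache()      # return cached GPU blocks to the driver
    except ImportError:
        pass
    return chatbot


chatbot = [("hi", "hello")]
history = [("hi", "hello")]
reset_state(chatbot, history)
print(chatbot, history)
```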
Deployment Options
Local Network Access
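Allow other devices on your network to access the demo by binding the server to all interfaces with --server-name 0.0.0.0.
Public Sharing
Create a publicly shareable link through Gradio's tunneling servers.
Production Deployment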
For production environments, consider:
- Docker
- Systemd Service
- Nginx Reverse Proxy
With Docker, create a Dockerfile, then build and run the image.
Customization
UI Customization
The demo interface is defined using Gradio Blocks (web_demo.py:151). You can customize:
- Logo and branding
- Colors and styling (via CSS)
- Button labels and icons
- Layout and spacing
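As a rough illustration of those customization points, the sketch below builds a small Blocks layout with custom CSS, a title, and the three action buttons. It is written against the Gradio 3.x/4.x-style API (`gr.Blocks(css=...)`); the CSS class name, the title text, and the `build_demo` function are assumptions rather than the demo's actual layout. Event wiring is omitted.

```python
# Custom CSS (illustrative): fix the height of the chat area.
CUSTOM_CSS = """
.chat-window { height: 500px; }
"""


def build_demo():
    """Build a Blocks layout; Gradio is imported lazily so this module loads without it."""
    import gradio as gr

    with gr.Blocks(css=CUSTOM_CSS) as demo:
        gr.Markdown("## My Qwen Demo")                       # logo and branding
        gr.Chatbot(label="Qwen-Chat", elem_classes="chat-window")
        gr.Textbox(lines=2, label="Input")                   # multi-line input box
        with gr.Row():                                       # layout and spacing
            gr.Button("🧹 Clear History")                    # button labels and icons
            gr.Button("🚀 Submit")
            gr.Button("🤔 Regenerate")
    return demo
```

Calling `build_demo().launch()` would serve this layout. To connect the buttons to behavior, keep references to the components and attach callbacks with `.click(...)`.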
Text Processing
The demo includes custom text processing for better display (web_demo.py:78). This processing:
- Formats code blocks properly
- Handles special characters in code
- Preserves whitespace and indentation
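The sketch below shows one way to get those three behaviors: fenced code becomes a `<pre><code>` element, special characters inside it are HTML-escaped, and indentation is kept as-is. It is a simplified stand-in, not the function at web_demo.py:78, and the name `format_for_display` is hypothetical.

```python
import html

FENCE = "`" * 3  # the Markdown code-fence marker


def format_for_display(text: str) -> str:
    """Convert fenced code to <pre><code> and escape special characters inside it."""
    out = []
    in_code = False
    for line in text.split("\n"):
        if line.strip().startswith(FENCE):
            if not in_code:
                lang = line.strip()[3:].strip()           # language tag after the fence
                out.append(f'<pre><code class="language-{html.escape(lang)}">')
            else:
                out.append("</code></pre>")
            in_code = not in_code
        elif in_code:
            out.append(html.escape(line) + "\n")          # keep indentation, escape < > &
        else:
            out.append(line + "<br>")                     # plain text: explicit line break
    return "".join(out)


sample = f"{FENCE}python\nif a < b:\n    pass\n{FENCE}"
print(format_for_display(sample))
```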
Performance Optimization
Memory Management
See web_demo.py:110.
Queueing
Gradio’s queue system is enabled for better concurrency (web_demo.py:192). This allows:
- Multiple users to interact simultaneously
- Requests to be processed in order
- Better handling of long-running generations
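A hedged sketch of enabling the queue at launch time. `queue()` and the `launch()` parameters used here (`share`, `inbrowser`, `server_port`, `server_name`, `auth`) are real Gradio names. The helper functions are illustrative and assume `demo` is a Gradio Blocks object.

```python
from typing import Any, Dict, Optional, Tuple


def launch_kwargs(share: bool = False, inbrowser: bool = False,
                  server_port: int = 8000, server_name: str = "127.0.0.1",
                  auth: Optional[Tuple[str, str]] = None) -> Dict[str, Any]:
    """Collect the keyword arguments to pass to launch()."""
    kwargs: Dict[str, Any] = {
        "share": share,
        "inbrowser": inbrowser,
        "server_port": server_port,
        "server_name": server_name,
    }
    if auth is not None:
        kwargs["auth"] = auth             # (username, password) enables a login prompt
    return kwargs


def launch_with_queue(demo: Any, **options: Any) -> None:
    """Enable the request queue, then start the server."""
    demo.queue().launch(**launch_kwargs(**options))
```

For example, `launch_with_queue(demo, server_name="0.0.0.0", auth=("user", "secret"))` would queue requests, listen on all interfaces, and require a login.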
Response Streaming
The demo uses streaming for real-time responses (web_demo.py:124). Benefits:
- Users see responses as they’re generated
- Better perceived performance
- Can stop generation early if needed
Troubleshooting
Port already in use
If port 8000 is occupied:
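Before choosing another port, it can help to check which ones are actually free. The standard-library sketch below tries to bind each candidate port. The function names and the search range are illustrative. A port that is free at check time can still be taken before the demo starts, so treat the result as a hint.

```python
import socket


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if a TCP socket can currently bind to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except (OSError, OverflowError):  # in use, not permitted, or out of range
            return False
        return True


def first_free_port(start: int = 8000, attempts: int = 20, host: str = "127.0.0.1") -> int:
    """Return the first free port in [start, start + attempts)."""
    for port in range(start, start + attempts):
        if port_is_free(port, host):
            return port
    raise RuntimeError(f"no free port found in {start}..{start + attempts - 1}")


print(first_free_port())
```

The returned number can then be passed as the port when launching the demo.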
Cannot access from other machines
Make sure to:
- Use --server-name 0.0.0.0 to bind to all interfaces
- Check that firewall settings allow the port
- Use the correct IP address (not 127.0.0.1)
Gradio import errors
If you see import errors:
Slow response times
To improve performance:
- Use GPU instead of CPU mode
- Use quantized models (Int4/Int8)
- Reduce max token length in generation config
- Enable Flash Attention if available
- Clear history regularly
Share link not working
Advanced Features
Custom CSS Styling
Add custom CSS to the Gradio interface.
Adding Authentication
Protect your demo with a password.
Multiple Concurrent Users
Gradio’s queue handles multiple users automatically, but for heavy loads:
- Use multiple model replicas
- Implement request batching
- Add rate limiting
- Deploy with a load balancer
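Of these options, rate limiting is the easiest to prototype in-process. The sketch below is a sliding-window limiter keyed by a user identifier. The class, its parameters, and the idea of calling `allow()` at the top of the submit handler are assumptions for illustration and are not part of web_demo.py.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict


class RateLimiter:
    """Allow at most max_requests per user within a sliding time window."""

    def __init__(self, max_requests: int = 5, window_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_requests = max_requests
        self.window = window_seconds
        self.clock = clock                        # injectable for testing
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, user_id: str) -> bool:
        now = self.clock()
        hits = self._hits[user_id]
        while hits and now - hits[0] >= self.window:
            hits.popleft()                        # drop requests outside the window
        if len(hits) >= self.max_requests:
            return False                          # over the limit: reject
        hits.append(now)
        return True


limiter = RateLimiter(max_requests=2, window_seconds=10.0)
print([limiter.allow("alice") for _ in range(3)])
```

An in-process limiter only protects a single server process. Behind a load balancer with several replicas, the counters would need to live in shared storage.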
Source Code Reference
The web demo implementation can be found at web_demo.py:1 in the Qwen repository.
Key components:
- Argument parsing: web_demo.py:21
- Model loading: web_demo.py:40
- Text processing: web_demo.py:78
- Interface definition: web_demo.py:151
- Launch configuration: web_demo.py:192
Next Steps
- CLI Demo: Try the command-line interface
- API Deployment: Deploy Qwen as an API service