Platform Requirements: Mini-SGLang supports Linux only (x86_64 and aarch64). For Windows users, use WSL2. macOS is not supported due to dependencies on Linux-specific CUDA kernels.
## Installation and First Run
### Launch the server
Start an OpenAI-compatible API server with a single command. The server listens on `http://localhost:1919` by default and prints a message once it is ready to accept requests.
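The launch command itself was lost in extraction. A minimal sketch, assuming a `mini_sglang` Python package with a `launch_server` module entrypoint — the module path, flag names, and model name here are illustrative assumptions, not the project's confirmed CLI:

```shell
# Hypothetical launch command -- module path, flags, and model name are assumptions.
python -m mini_sglang.launch_server \
  --model Qwen/Qwen2.5-0.5B-Instruct \
  --port 1919
```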
### Alternative: Interactive Shell

For quick testing and exploration, launch the interactive shell mode. Inside the shell, type `/reset` to clear the chat history or `/exit` to quit.
## Quick Examples
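The example snippets for this section were lost in extraction. As a stand-in, here is a hedged sketch that builds and sends a chat-completion request to the server; the `/v1/chat/completions` path and the placeholder model name follow standard OpenAI API conventions and are assumptions, while the port comes from the default above:

```python
import json
from urllib import request

BASE_URL = "http://localhost:1919"  # default port from the quickstart above


def build_chat_request(prompt: str, model: str = "default"):
    """Build a (url, payload) pair for an OpenAI-style chat completion.

    The /v1/chat/completions path follows the OpenAI API convention;
    the model name "default" is a placeholder, not a confirmed value.
    """
    url = f"{BASE_URL}/v1/chat/completions"
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 64,
    }
    return url, payload


def send_chat_request(prompt: str) -> str:
    """POST the request and return the first choice's text (requires a running server)."""
    url, payload = build_chat_request(prompt)
    req = request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]


if __name__ == "__main__":
    url, payload = build_chat_request("Hello, Mini-SGLang!")
    print(url)
    print(json.dumps(payload, indent=2))
```

Because the endpoint is OpenAI-compatible, any standard OpenAI client pointed at `BASE_URL` should also work.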
## Next Steps
Now that you have Mini-SGLang running, explore more capabilities:

- Installation Guide - Detailed installation options, including Docker and WSL2
- Server Configuration - Configure advanced options like Tensor Parallelism and attention backends
- API Reference - Complete OpenAI-compatible API documentation
- Core Concepts - Learn about Radix Cache, Chunked Prefill, and other optimizations
If you encounter network issues downloading models from HuggingFace, use `--model-source modelscope` to download from ModelScope instead.
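For example — only the `--model-source modelscope` flag comes from the text above; the entrypoint and remaining flags are the same illustrative assumptions as earlier:

```shell
# Hypothetical command; only --model-source modelscope is confirmed by the docs above.
python -m mini_sglang.launch_server \
  --model Qwen/Qwen2.5-0.5B-Instruct \
  --model-source modelscope
```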