Overview
SGLang supports deployment across major cloud platforms, leveraging managed services for Kubernetes, GPUs, and TPUs. This guide covers platform-specific configurations and best practices.Amazon Web Services (AWS)
AWS SageMaker
AWS SageMaker provides managed inference with built-in SGLang container support.Prerequisites
- AWS account with SageMaker access
- IAM role with SageMaker permissions
- AWS CLI configured
- SGLang container on Amazon ECR
Build and Push Container
Deploy Model Endpoint
Use the SageMaker Python SDK:SageMaker Environment Variables
The SageMaker container uses environment variables with theSM_SGLANG_ prefix:
Query SageMaker Endpoint
AWS Deep Learning Containers
AWS maintains official SGLang containers with security patches:Amazon EKS
Deploy SGLang on Elastic Kubernetes Service:Create EKS Cluster
Install NVIDIA Device Plugin
Deploy SGLang
Follow the Kubernetes deployment guide with EKS-specific configurations:AWS EC2
Direct deployment on EC2 GPU instances:Launch GPU Instance
Install and Run SGLang
Google Cloud Platform (GCP)
Google Kubernetes Engine (GKE)
Create GKE Cluster with GPUs
Deploy SGLang on GKE
Google Cloud TPU
SGLang supports TPU inference through the JAX backend:Prerequisites
- TPU v5e, v6e, or v7 instance
- SGLang-JAX installation
Using SkyPilot
Direct TPU VM Setup
Google Compute Engine
Microsoft Azure
Azure Kubernetes Service (AKS)
Create AKS Cluster
Deploy SGLang
Use standard Kubernetes manifests from the Kubernetes guide.Azure VM
Azure Container Instances
Other Cloud Providers
Oracle Cloud Infrastructure (OCI)
Alibaba Cloud
Lambda Labs
Lambda Labs provides cost-effective GPU cloud:Cloud Storage Integration
AWS S3 for Models
Google Cloud Storage
Azure Blob Storage
Cost Optimization
Use Spot/Preemptible Instances
AWS Spot Instances:Auto-Scaling
Implement cluster autoscaling to scale down during low usage:Security Best Practices
Network Security
- Use private subnets for compute instances
- Implement VPC peering for multi-region deployments
- Configure security groups to restrict access:
Secrets Management
AWS Secrets Manager:Monitoring and Logging
AWS CloudWatch
GCP Cloud Logging
Azure Monitor
Next Steps
- Kubernetes Deployment - Detailed K8s configurations
- Multi-Node Setup - Distributed deployments
- Docker Deployment - Container configurations
