
Overview

In this practice exercise, you’ll compare managed ML platforms by implementing multi-model endpoints on both AWS SageMaker and GCP Vertex AI. Deploying to each platform side by side will give you a concrete feel for their trade-offs along with practical deployment experience.

Learning Objectives

By completing this exercise, you will:
  1. Deploy multiple models on AWS SageMaker multi-model endpoints
  2. Deploy multiple models on GCP Vertex AI
  3. Compare the developer experience, features, and costs of each platform
  4. Evaluate buy vs build decisions for your specific context
  5. Document platform recommendations with pros and cons

Tasks

Task 1: AWS SageMaker Multi-Model Deployment

Step 1: Set Up the AWS Environment

Configure your AWS credentials and create required IAM roles:
# Configure AWS CLI
aws configure

# Create SageMaker execution role (if not exists)
aws iam create-role \
    --role-name sagemaker-execution-role \
    --assume-role-policy-document file://trust-policy.json

# Attach required policies
aws iam attach-role-policy \
    --role-name sagemaker-execution-role \
    --policy-arn arn:aws:iam::aws:policy/AmazonSageMakerFullAccess
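The trust-policy.json referenced above must let the SageMaker service assume the role. A minimal sketch of writing it, using the standard service-principal form (adjust if your organization adds conditions):

```python
import json

# Trust policy letting the SageMaker service assume the execution role.
trust_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "sagemaker.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }
    ],
}

# Write it where the create-role command above expects it.
with open("trust-policy.json", "w") as f:
    json.dump(trust_policy, f, indent=2)
```

`aws iam create-role` will then pick the file up via `file://trust-policy.json`.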
Step 2: Prepare Models

Create at least 2 different models for deployment:
  • One image classification model (e.g., ResNet, MobileNet)
  • One text/tabular model (e.g., simple classifier)
Package them in Triton-compatible format:
model_registry/
├── image_classifier_v1/
│   ├── config.pbtxt
│   └── 1/
│       └── model.pt
└── text_classifier_v1/
    ├── config.pbtxt
    └── 1/
        └── model.pt
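Each model directory also needs a config.pbtxt describing its inputs and outputs. A minimal sketch for the TorchScript backend, assuming a 3×224×224 input and 1000 output classes — the model name and tensor shapes are placeholders to adjust for your models:

```python
from pathlib import Path

# Minimal Triton config for a TorchScript image classifier.
# The pytorch_libtorch backend uses positional tensor names INPUT__0 / OUTPUT__0;
# the dims below assume a 3x224x224 input and 1000 classes -- adjust as needed.
CONFIG = """\
name: "image_classifier_v1"
platform: "pytorch_libtorch"
max_batch_size: 8
input [
  {
    name: "INPUT__0"
    data_type: TYPE_FP32
    dims: [ 3, 224, 224 ]
  }
]
output [
  {
    name: "OUTPUT__0"
    data_type: TYPE_FP32
    dims: [ 1000 ]
  }
]
"""

model_dir = Path("model_registry/image_classifier_v1")
(model_dir / "1").mkdir(parents=True, exist_ok=True)  # numbered version subdirectory
(model_dir / "config.pbtxt").write_text(CONFIG)
```

Your exported model.pt then goes into the `1/` version subdirectory, matching the tree above.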
Step 3: Deploy the Multi-Model Endpoint

Use the provided CLI tool to create and deploy:
# Create endpoint
python cli.py create-endpoint

# Add models
python cli.py add-model ./model_registry/image_classifier_v1/ image_v1.tar.gz
python cli.py add-model ./model_registry/text_classifier_v1/ text_v1.tar.gz

# Verify upload
aws s3 ls s3://your-bucket/models/
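If you want to see what add-model does behind the scenes: a SageMaker multi-model endpoint expects each model as a .tar.gz under a common S3 prefix, with config.pbtxt at the archive root. A hypothetical sketch of that packaging step (the function names, bucket, and key are placeholders — the provided cli.py may differ in details):

```python
import tarfile
from pathlib import Path

def package_model(src_dir: str, out_path: str) -> str:
    """Tar a Triton model directory so config.pbtxt sits at the archive root."""
    src = Path(src_dir)
    with tarfile.open(out_path, "w:gz") as tar:
        for item in sorted(src.rglob("*")):
            tar.add(item, arcname=item.relative_to(src))
    return out_path

def upload_model(tar_path: str, bucket: str, key: str) -> None:
    """Upload the archive under the endpoint's model prefix (placeholder bucket)."""
    import boto3  # deferred so packaging works without AWS credentials
    boto3.client("s3").upload_file(tar_path, bucket, key)

# Demo on a throwaway directory so the sketch runs end to end:
demo = Path("demo_model/1")
demo.mkdir(parents=True, exist_ok=True)
(demo.parent / "config.pbtxt").write_text('name: "demo"\n')
(demo / "model.pt").write_bytes(b"")
package_model("demo_model", "demo.tar.gz")
```

The archive name (e.g. image_v1.tar.gz) is what you later pass as the target model when invoking the endpoint.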
Step 4: Test Inference

Invoke both models and measure performance:
# Test image model
python cli.py call-model-image image_v1.tar.gz

# Test text model
python cli.py call-model-vector text_v1.tar.gz
Measure:
  • Cold start latency (first request)
  • Warm latency (subsequent requests)
  • Throughput (requests per second)
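One way to capture these numbers is to time the first request separately from a warm batch. A sketch, assuming you wrap the actual call (the cli.py invocation, or boto3's sagemaker-runtime invoke_endpoint with TargetModel set) in a zero-argument callable — the sleep lambda below is only a stand-in:

```python
import statistics
import time
from typing import Callable

def measure_latency(invoke: Callable[[], object], warm_requests: int = 20) -> dict:
    """Time one cold call, then a batch of warm calls, and summarize."""
    start = time.perf_counter()
    invoke()
    cold_ms = (time.perf_counter() - start) * 1000

    warm_ms = []
    for _ in range(warm_requests):
        start = time.perf_counter()
        invoke()
        warm_ms.append((time.perf_counter() - start) * 1000)

    return {
        "cold_ms": cold_ms,
        "warm_p50_ms": statistics.median(warm_ms),
        "warm_p95_ms": statistics.quantiles(warm_ms, n=20)[-1],  # ~95th percentile
        "throughput_rps": 1000 / statistics.mean(warm_ms),       # single-threaded
    }

# Stand-in for the real endpoint call:
stats = measure_latency(lambda: time.sleep(0.001))
print(stats)
```

For real throughput under load you would issue the warm requests concurrently; this sketch gives single-threaded numbers, which are enough for the cold/warm comparison.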
Step 5: Monitor and Optimize

Check CloudWatch metrics and optimize:
# View model-loading metrics (a time window, period, and statistic are required)
aws cloudwatch get-metric-statistics \
    --namespace AWS/SageMaker \
    --metric-name ModelLoadingTime \
    --dimensions Name=EndpointName,Value=sagemaker-poc \
    --start-time "$(date -u -d '1 hour ago' +%Y-%m-%dT%H:%M:%SZ)" \
    --end-time "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
    --period 300 \
    --statistics Average
Success Criteria:
  • Multi-model endpoint successfully created
  • At least 2 models deployed and callable
  • Performance metrics documented
  • Code committed to repository

Task 2: GCP Vertex AI Multi-Model Deployment

Step 1: Set Up the GCP Environment

Configure GCP credentials and enable required APIs:
# Authenticate
gcloud auth login
gcloud config set project YOUR_PROJECT_ID

# Enable APIs
gcloud services enable aiplatform.googleapis.com
gcloud services enable storage.googleapis.com
Step 2: Deploy Models to Vertex AI

Upload and deploy models:
from google.cloud import aiplatform

aiplatform.init(project="YOUR_PROJECT_ID", location="us-central1")

# Upload model
model = aiplatform.Model.upload(
    display_name="image-classifier",
    artifact_uri="gs://your-bucket/models/image_v1",
    serving_container_image_uri="us-docker.pkg.dev/vertex-ai/prediction/pytorch-gpu.1-13:latest",
)

# Deploy to endpoint
endpoint = model.deploy(
    machine_type="n1-standard-4",
    accelerator_type="NVIDIA_TESLA_T4",
    accelerator_count=1,
)
Step 3: Test Inference

Invoke models and compare with SageMaker:
# Make a prediction (input_data is the preprocessed NumPy array for your model)
prediction = endpoint.predict(instances=[{
    "input": input_data.tolist()
}])

print(prediction.predictions)
Step 4: Compare Approaches

Document differences in:
  • Deployment process complexity
  • API ergonomics
  • Monitoring capabilities
  • Cost structure
Vertex AI Note: Vertex AI doesn’t have direct multi-model endpoint support like SageMaker. You’ll need to:
  • Deploy models to separate endpoints, or
  • Use a custom prediction container that routes to multiple models, or
  • Use Vertex AI Prediction with NVIDIA Triton
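The custom-container option comes down to a name-to-model lookup in front of a single predict interface. A framework-free sketch of that routing layer (the model names and lambda stand-ins are hypothetical; in a real container this dispatch would sit behind the HTTP predict route):

```python
from typing import Any, Callable, Dict

class ModelRouter:
    """Dispatch requests to one of several in-process models by name."""

    def __init__(self) -> None:
        self._models: Dict[str, Callable[[Any], Any]] = {}

    def register(self, name: str, predict_fn: Callable[[Any], Any]) -> None:
        self._models[name] = predict_fn

    def predict(self, name: str, instance: Any) -> Any:
        if name not in self._models:
            raise KeyError(f"unknown model: {name}")
        return self._models[name](instance)

# Stand-in models; in practice these would be loaded Triton/TorchScript models.
router = ModelRouter()
router.register("image_v1", lambda x: {"label": "cat"})
router.register("text_v1", lambda x: {"label": "spam"})
print(router.predict("text_v1", "free money!!!"))
```

The model name would typically arrive as a request field or path segment, mirroring SageMaker's TargetModel parameter.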

Task 3: Platform Comparison Document

Create a comprehensive comparison document covering:
| Feature               | AWS SageMaker              | GCP Vertex AI           | Winner |
| --------------------- | -------------------------- | ----------------------- | ------ |
| Multi-model endpoints | Native support             | Custom container needed | AWS    |
| Deployment API        | Boto3 (verbose)            | Cloud SDK (cleaner)     | ?      |
| Monitoring            | CloudWatch + Model Monitor | Cloud Monitoring        | ?      |
| Custom containers     | Full support               | Full support            | Tie    |
| AutoML                | Built-in                   | Strong AutoML           | ?      |
| GPU support           | Wide range                 | Good selection          | ?      |
| Async inference       | Native support             | Need Cloud Tasks        | AWS    |
| Framework support     | All major frameworks       | All major frameworks    | Tie    |
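For the cost-structure row, a simple starting point is instance-hours times on-demand rate for an always-on endpoint. The hourly rates below are placeholder assumptions, not quoted prices — substitute current figures from each provider's pricing page for your region:

```python
# Rough monthly-cost comparison for always-on endpoints.
# Hourly rates are PLACEHOLDER assumptions -- replace with current on-demand
# prices for your region before putting numbers in the comparison doc.
HOURS_PER_MONTH = 730

scenarios = {
    "SageMaker ml.g4dn.xlarge (assumed $/hr)": 0.74,
    "Vertex n1-standard-4 + T4 (assumed $/hr)": 0.55,
}

for name, hourly in scenarios.items():
    monthly = hourly * HOURS_PER_MONTH
    print(f"{name}: ~${monthly:,.0f}/month")
```

Extend this with per-request costs if you test autoscaling or serverless/async options, where billing is not purely instance-hours.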

Deliverables

PR1: AWS SageMaker

Code for multi-model deployment on AWS SageMaker with:
  • CLI tool for endpoint management
  • At least 2 different model types
  • Testing scripts
  • README with instructions

PR2: GCP Vertex AI

Code for multi-model deployment on GCP Vertex AI with:
  • Deployment scripts
  • At least 2 different model types
  • Testing scripts
  • README with instructions

Platform Comparison

Google Doc or Markdown document with:
  • Technical feature comparison
  • Cost analysis for your workload
  • Developer experience notes
  • Recommendation with rationale

MLOps Stack Templates

Two alternative MLOps stack designs:
  • AWS SageMaker-based stack
  • GCP Vertex AI-based stack
  • Comparison with current implementation

Acceptance Criteria

  • ✅ Code follows project style guide (ruff format, ruff check)
  • ✅ All tests pass (pytest)
  • ✅ Clear README with setup instructions
  • ✅ No hardcoded credentials or account IDs
  • ✅ Proper error handling and logging
  • ✅ Multi-model endpoints successfully deployed
  • ✅ At least 2 models per platform
  • ✅ Inference working for all models
  • ✅ Performance metrics collected
  • ✅ Cleanup scripts provided
  • ✅ Platform comparison document complete
  • ✅ Cost analysis with specific numbers
  • ✅ Clear recommendation with rationale
  • ✅ MLOps stack templates documented
  • ✅ Pros and cons for each approach

Reading List

Work through these resources to build understanding:

MLOps Platforms:
  • AWS SageMaker
  • GCP Vertex AI
  • Azure ML

Tips for Success

Cost Management:
  • Use the smallest instance types for testing
  • Delete endpoints when not in use
  • Set up billing alerts
  • Use free tier where possible
  • Clean up resources after testing
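On SageMaker, "delete endpoints" has an ordering constraint: the endpoint goes first, then its endpoint config, then the model objects. A sketch that separates the plan from the boto3 calls so the ordering is testable without credentials (the assumption that the endpoint config shares the endpoint's name is a placeholder — adjust to your naming):

```python
from typing import Dict, List, Tuple

Step = Tuple[str, Dict[str, str]]

def cleanup_plan(endpoint_name: str, model_names: List[str]) -> List[Step]:
    """SageMaker teardown steps in dependency order: endpoint, config, models."""
    plan: List[Step] = [
        ("delete_endpoint", {"EndpointName": endpoint_name}),
        # Placeholder assumption: the config was created with the endpoint's name.
        ("delete_endpoint_config", {"EndpointConfigName": endpoint_name}),
    ]
    plan += [("delete_model", {"ModelName": m}) for m in model_names]
    return plan

def run_cleanup(plan: List[Step]) -> None:
    import boto3  # deferred so planning stays testable without AWS credentials
    sm = boto3.client("sagemaker")
    for method, kwargs in plan:
        getattr(sm, method)(**kwargs)

plan = cleanup_plan("sagemaker-poc", ["image_v1", "text_v1"])
print([method for method, _ in plan])
```

Remember the S3 bucket holding the model archives as well; it is billed separately from the endpoint.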
Development Workflow:
  1. Start with SageMaker (better multi-model support)
  2. Use the provided CLI tool as a starting point
  3. Test with simple models first (faster iteration)
  4. Document everything as you go
  5. Take screenshots of monitoring dashboards
  6. Track all costs for comparison
Common Issues:
  • IAM permissions: Ensure your execution role has both S3 and SageMaker access
  • Model format: Follow Triton model repository structure
  • Cold starts: First request will be slow (expected)
  • Region consistency: Keep all resources in same region

Keep Iterating!

After completing this module, continue exploring:
  • Azure Machine Learning for third platform perspective
  • Kubernetes-based alternatives (KServe, Seldon)
  • Model serving optimizations (TensorRT, ONNX)
  • A/B testing and canary deployments
  • Cost optimization techniques
  • Multi-cloud strategies

Continue Learning

Explore other modules in the ML in Production course
