Deploy LangShazam to a Kubernetes cluster for production-grade scalability, automatic failover, and horizontal pod autoscaling.
Overview
The Kubernetes deployment includes:
2 replicas by default
Horizontal Pod Autoscaler (2-10 pods)
LoadBalancer service
Health probes (liveness and readiness)
Resource limits and requests
ConfigMap and Secret management
Optional Ingress with SSL
Prerequisites
Kubernetes Cluster - v1.24+ (EKS, GKE, AKS, or self-managed)
kubectl CLI - Configured to access your cluster
Docker Registry - Access to ECR, Docker Hub, or similar
OpenAI API Key - Valid API key with Whisper access
Deployment Files
All Kubernetes manifests are in backend/deployment/kubernetes/manifests/:
namespace.yaml - Dedicated namespace
secrets.yaml - OpenAI API key (base64)
configmap.yaml - Application configuration
deployment.yaml - Pod deployment spec
service.yaml - LoadBalancer service
ingress.yaml - Ingress configuration (optional)
hpa.yaml - Horizontal Pod Autoscaler
Quick Start
Build and Push Docker Image
cd backend
# Build the image
docker build -t your-registry/langshazam:latest -f deployment/docker/Dockerfile .
# Push to registry
docker push your-registry/langshazam:latest
Update Image URL
Edit kubernetes/manifests/deployment.yaml and replace ${DOCKER_IMAGE_URL} at line 19:
containers:
  - name: language-detector
    image: your-registry/langshazam:latest # Update this
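The placeholder can also be substituted by a script instead of by hand. The following is a minimal sketch, assuming the manifest still contains the literal ${DOCKER_IMAGE_URL} placeholder; render_image is a hypothetical helper name and is not part of the repository.

```shell
# render_image (hypothetical helper): read a manifest on stdin and replace the
# literal ${DOCKER_IMAGE_URL} placeholder with the image reference given as $1.
# Assumes the image reference contains no '|' or '&' characters, which sed
# would treat specially in this expression.
render_image() {
  sed 's|\${DOCKER_IMAGE_URL}|'"$1"'|g'
}

# Example: render a one-line manifest fragment.
printf 'image: ${DOCKER_IMAGE_URL}\n' | render_image your-registry/langshazam:latest
# prints: image: your-registry/langshazam:latest
```

In practice the rendered output could be written to a file or piped to kubectl apply -f -, which keeps the committed manifest free of environment-specific values.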
Encode API Key
echo -n "your-openai-api-key" | base64
Update kubernetes/manifests/secrets.yaml at line 9:
data:
  openai-api-key: eW91ci1iYXNlNjQtZW5jb2RlZC1rZXk= # placeholder (decodes to "your-base64-encoded-key"); paste your own encoded value
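The encoding step has two common pitfalls: a trailing newline becomes part of the secret, and GNU base64 wraps long output at 76 characters, which breaks the YAML value. A minimal sketch that avoids both; encode_key is a hypothetical helper and abc stands in for a real key.

```shell
# encode_key (hypothetical helper): base64-encode a value with no trailing
# newline in the input and no line wrapping in the output.
# printf is used instead of echo -n because echo's handling of -n varies
# between shells; tr strips the line breaks GNU base64 inserts every 76 chars.
encode_key() {
  printf '%s' "$1" | base64 | tr -d '\n'
}

encoded=$(encode_key 'abc')
echo "$encoded"                        # YWJj

# Round-trip check: decoding should return the original value exactly.
printf '%s' "$encoded" | base64 -d     # abc
echo

# For comparison, an input with a trailing newline encodes differently, and
# the application would receive a key ending in a newline:
printf '%s\n' 'abc' | base64           # YWJjCg==
```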
Apply Kubernetes Resources
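# Run from the repository root (if you are still in backend/ from the build step, use: cd deployment)
cd backend/deployment
kubectl apply -f kubernetes/manifests/namespace.yaml
# The manifests shown in this guide do not set metadata.namespace, so pass -n explicitly
kubectl apply -n language-detector -f kubernetes/manifests/secrets.yaml
kubectl apply -n language-detector -f kubernetes/manifests/configmap.yaml
kubectl apply -n language-detector -f kubernetes/manifests/deployment.yaml
kubectl apply -n language-detector -f kubernetes/manifests/service.yaml
kubectl apply -n language-detector -f kubernetes/manifests/hpa.yaml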
Verify Deployment
# Check pods
kubectl get pods -n language-detector
# Check service
kubectl get svc -n language-detector
# Get external IP (may take a few minutes)
kubectl get svc language-detector -n language-detector
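Because the external IP can take a few minutes to appear, polling is often more convenient than rerunning the command by hand. The following is a sketch only; wait_for_output is a hypothetical helper, and the attempt count and delay in the usage comment are arbitrary.

```shell
# wait_for_output (hypothetical helper): run a command repeatedly until it
# prints a non-empty value, then print that value.
#   $1 = maximum attempts, $2 = seconds between attempts, rest = command
# Returns 1 if every attempt produced empty output.
wait_for_output() {
  attempts=$1
  delay=$2
  shift 2
  i=0
  while [ "$i" -lt "$attempts" ]; do
    out=$("$@" 2>/dev/null) || true
    if [ -n "$out" ]; then
      printf '%s\n' "$out"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Usage against the service from this guide (up to 30 attempts, 10s apart):
#   wait_for_output 30 10 kubectl get svc language-detector -n language-detector \
#     -o jsonpath='{.status.loadBalancer.ingress[0].ip}'
```

Taking the command as arguments keeps the helper independent of kubectl, so the same loop can wait on any command that eventually prints a value.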
Manifest Details
Namespace
From kubernetes/manifests/namespace.yaml:
apiVersion: v1
kind: Namespace
metadata:
  name: language-detector
  labels:
    name: language-detector
Secrets
From kubernetes/manifests/secrets.yaml:
apiVersion: v1
kind: Secret
metadata:
  name: language-detector-secrets
type: Opaque
data:
  # The value below is a placeholder; it should be replaced with a base64-encoded API key
  # To encode: echo -n "your-api-key" | base64
  openai-api-key: ${BASE64_ENCODED_OPENAI_API_KEY}
Never commit your actual API key to version control. Use a secret management tool like AWS Secrets Manager, HashiCorp Vault, or sealed-secrets.
ConfigMap
From kubernetes/manifests/configmap.yaml:
apiVersion: v1
kind: ConfigMap
metadata:
  name: language-detector-config
data:
  # Application configuration
  # These values can be adjusted based on environment
  LOGGING_LEVEL: "INFO"
  MAX_CONNECTIONS: "100"
  MAX_AUDIO_SIZE_MB: "5"
Deployment
From kubernetes/manifests/deployment.yaml:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: language-detector
  labels:
    app: language-detector
spec:
  replicas: 2
  selector:
    matchLabels:
      app: language-detector
  template:
    metadata:
      labels:
        app: language-detector
    spec:
      containers:
        - name: language-detector
          image: ${DOCKER_IMAGE_URL}
          imagePullPolicy: Always
          ports:
            - containerPort: 10000
              name: http
          env:
            - name: OPENAI_API_KEY
              valueFrom:
                secretKeyRef:
                  name: language-detector-secrets
                  key: openai-api-key
          resources:
            requests:
              cpu: "100m"
              memory: "256Mi"
            limits:
              cpu: "500m"
              memory: "512Mi"
          livenessProbe:
            httpGet:
              path: /
              port: 10000
            initialDelaySeconds: 30
            periodSeconds: 10
            timeoutSeconds: 5
          readinessProbe:
            httpGet:
              path: /
              port: 10000
            initialDelaySeconds: 5
            periodSeconds: 5
            timeoutSeconds: 3
Key Features:
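2 replicas for high availability
Resource requests: 100m CPU, 256Mi RAM (the minimum guaranteed; used by the scheduler to place pods)
Resource limits: 500m CPU, 512Mi RAM (the maximum allowed; CPU is throttled at the limit, and a container exceeding the memory limit is OOM-killed)
Liveness probe: Begins 30s after the container starts, then checks / every 10s; the container is restarted after repeated failures (3 in a row by default)
Readiness probe: Begins 5s after the container starts, then checks / every 5s; the pod receives traffic only while the check passes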
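The per-pod requests and limits translate directly into cluster capacity needs across the replica range. A back-of-the-envelope sketch using the values from deployment.yaml and the 2 and 10 replica bounds used in this guide; capacity_for is a hypothetical helper.

```shell
# Per-pod values taken from deployment.yaml
REQ_CPU_M=100     # CPU request, millicores
REQ_MEM_MI=256    # memory request, Mi
LIM_CPU_M=500     # CPU limit, millicores
LIM_MEM_MI=512    # memory limit, Mi

# capacity_for (hypothetical helper): print the aggregate requests and limits
# for a given replica count.
capacity_for() {
  replicas=$1
  echo "replicas=$replicas requests=$((replicas * REQ_CPU_M))m/$((replicas * REQ_MEM_MI))Mi limits=$((replicas * LIM_CPU_M))m/$((replicas * LIM_MEM_MI))Mi"
}

capacity_for 2    # replicas=2 requests=200m/512Mi limits=1000m/1024Mi
capacity_for 10   # replicas=10 requests=1000m/2560Mi limits=5000m/5120Mi
```

Since pods are scheduled against their requests, the cluster needs roughly 1000m of allocatable CPU and 2560Mi of memory free for this workload before all 10 replicas can be placed.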
Service
From kubernetes/manifests/service.yaml:
apiVersion: v1
kind: Service
metadata:
  name: language-detector
  labels:
    app: language-detector
spec:
  type: LoadBalancer
  ports:
    - port: 80
      targetPort: 10000
      protocol: TCP
      name: http
  selector:
    app: language-detector
This creates a LoadBalancer that routes traffic from port 80 to container port 10000.
Horizontal Pod Autoscaler
From kubernetes/manifests/hpa.yaml:
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: language-detector-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: language-detector
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 70
Auto-scaling behavior:
Starts with 2 pods (minReplicas)
Scales up to 10 pods maximum (maxReplicas)
Scales out when average CPU or memory utilization, measured as a percentage of the pods' resource requests, exceeds 70%
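The replica count the autoscaler aims for follows the formula in the Kubernetes documentation: desired = ceil(current replicas x current utilization / target utilization), clamped to the configured bounds. The sketch below approximates that calculation for a single metric with the values from hpa.yaml; desired_replicas is a hypothetical helper, and it deliberately ignores the autoscaler's tolerance band and stabilization windows. When several metrics are configured, the autoscaler uses the largest result.

```shell
# desired_replicas (hypothetical helper): approximate the HPA calculation
#   desired = ceil(current * utilization / target)
# clamped to minReplicas=2 and maxReplicas=10, with a 70% target.
#   $1 = current replica count, $2 = observed average utilization in percent
desired_replicas() {
  current=$1
  utilization=$2
  target=70
  min=2
  max=10
  # Integer ceiling division: (a + b - 1) / b
  desired=$(( (current * utilization + target - 1) / target ))
  if [ "$desired" -lt "$min" ]; then desired=$min; fi
  if [ "$desired" -gt "$max" ]; then desired=$max; fi
  echo "$desired"
}

desired_replicas 2 140   # 4  (load is double the target, so replicas double)
desired_replicas 4 35    # 2  (half the target, scale in to the minimum)
desired_replicas 8 100   # 10 (formula gives 12, capped at maxReplicas)
```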
Ingress Setup (Optional)
For a custom domain and SSL, configure the Ingress.
Install Nginx Ingress Controller
kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/controller-v1.8.1/deploy/static/provider/cloud/deploy.yaml
Update Ingress Manifest
From kubernetes/manifests/ingress.yaml:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: language-detector-ingress
  annotations:
    kubernetes.io/ingress.class: nginx
    # Add these annotations for SSL if needed
    # cert-manager.io/cluster-issuer: letsencrypt-prod
    # nginx.ingress.kubernetes.io/ssl-redirect: "true"
spec:
  rules:
    - host: ${YOUR_DOMAIN} # e.g., api.language-detector.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: language-detector
                port:
                  number: 80
  # Add this section for SSL if needed
  # tls:
  #   - hosts:
  #       - ${YOUR_DOMAIN}
  #     secretName: language-detector-tls
Replace ${YOUR_DOMAIN} with your domain, then apply:
kubectl apply -n language-detector -f kubernetes/manifests/ingress.yaml
SSL with cert-manager
Install cert-manager
kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.13.0/cert-manager.yaml
Create ClusterIssuer
Create letsencrypt-issuer.yaml:
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: ${YOUR_EMAIL} # contact address for Let's Encrypt notices
    privateKeySecretRef:
      name: letsencrypt-prod
    solvers:
      - http01:
          ingress:
            class: nginx
kubectl apply -f letsencrypt-issuer.yaml
Update Ingress
Uncomment the SSL annotations and TLS section in ingress.yaml, then reapply.
Monitoring
Check Pod Status
# List all pods
kubectl get pods -n language-detector
# Describe a specific pod
kubectl describe pod <pod-name> -n language-detector
# View logs
kubectl logs <pod-name> -n language-detector
# Follow logs
kubectl logs -f <pod-name> -n language-detector
Check HPA Status
kubectl get hpa -n language-detector
# Watch HPA in real-time
kubectl get hpa -n language-detector -w
Check Service and Endpoints
# Get external IP
kubectl get svc language-detector -n language-detector
# View endpoints
kubectl get endpoints language-detector -n language-detector
Access Metrics
# Get external IP (some providers, e.g. AWS, report .hostname instead of .ip)
EXTERNAL_IP=$(kubectl get svc language-detector -n language-detector -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
# Test endpoint
curl "http://$EXTERNAL_IP/"
# Get metrics
curl "http://$EXTERNAL_IP/metrics"
Scaling
Manual Scaling
# Scale to 5 replicas
kubectl scale deployment language-detector --replicas=5 -n language-detector
# Verify
kubectl get pods -n language-detector
Adjust HPA Thresholds
Edit hpa.yaml to change scaling behavior:
metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50 # Changed from 70
Reapply:
kubectl apply -n language-detector -f kubernetes/manifests/hpa.yaml
Updates and Rollbacks
Rolling Update
# Build and push new image with version tag
docker build -t your-registry/langshazam:v2 -f deployment/docker/Dockerfile .
docker push your-registry/langshazam:v2
# Update deployment
kubectl set image deployment/language-detector \
language-detector=your-registry/langshazam:v2 \
-n language-detector
# Monitor rollout
kubectl rollout status deployment/language-detector -n language-detector
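The update and the rollback described in this section can be combined so that a rollout which does not become healthy is reverted automatically. This is a sketch under stated assumptions: deploy_or_rollback and the KUBECTL override variable are hypothetical names, and the 120s timeout is an arbitrary choice.

```shell
# KUBECTL can be overridden, for example to add --context or to substitute a
# stub when testing; it defaults to the plain kubectl binary.
KUBECTL=${KUBECTL:-kubectl}

# deploy_or_rollback (hypothetical helper): set a new image on the deployment,
# wait for the rollout to finish, and undo it if it fails or times out.
#   $1 = full image reference, e.g. your-registry/langshazam:v2
deploy_or_rollback() {
  image=$1
  $KUBECTL set image deployment/language-detector \
    "language-detector=$image" -n language-detector || return 1
  if ! $KUBECTL rollout status deployment/language-detector \
      -n language-detector --timeout=120s; then
    echo "rollout failed; reverting to the previous revision" >&2
    $KUBECTL rollout undo deployment/language-detector -n language-detector
    return 1
  fi
}

# Usage:
#   deploy_or_rollback your-registry/langshazam:v2
```

The function returns non-zero whenever the new image did not roll out, which lets a CI job fail the pipeline while the cluster keeps serving the previous version.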
Rollback
# Rollback to previous version
kubectl rollout undo deployment/language-detector -n language-detector
# Rollback to specific revision
kubectl rollout undo deployment/language-detector --to-revision=2 -n language-detector
# View rollout history
kubectl rollout history deployment/language-detector -n language-detector
Troubleshooting
Pods Not Starting
# Check pod events
kubectl describe pod <pod-name> -n language-detector
# Common issues:
# - Image pull errors (check registry credentials)
# - Insufficient resources (check node capacity)
# - Missing secrets (verify secrets.yaml applied)
Service Not Accessible
# Check service
kubectl get svc language-detector -n language-detector
# Check endpoints
kubectl get endpoints language-detector -n language-detector
# If endpoints are empty, pods aren't ready
kubectl get pods -n language-detector
HPA Not Scaling
# Check metrics server is installed
kubectl top nodes
kubectl top pods -n language-detector
# If metrics-server not found:
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
Cleanup
# Delete all resources
kubectl delete namespace language-detector
# Or delete individually
kubectl delete -n language-detector -f kubernetes/manifests/hpa.yaml
kubectl delete -n language-detector -f kubernetes/manifests/ingress.yaml
kubectl delete -n language-detector -f kubernetes/manifests/service.yaml
kubectl delete -n language-detector -f kubernetes/manifests/deployment.yaml
kubectl delete -n language-detector -f kubernetes/manifests/configmap.yaml
kubectl delete -n language-detector -f kubernetes/manifests/secrets.yaml
kubectl delete -f kubernetes/manifests/namespace.yaml
Next Steps
Environment Variables - Customize application configuration
CORS Setup - Configure allowed origins