Deploy LangShazam to a Kubernetes cluster for production-grade scalability, automatic failover, and horizontal pod autoscaling.

Overview

The Kubernetes deployment includes:
  • 2 replicas by default
  • Horizontal Pod Autoscaler (2-10 pods)
  • LoadBalancer service
  • Health probes (liveness and readiness)
  • Resource limits and requests
  • ConfigMap and Secret management
  • Optional Ingress with SSL

Prerequisites

  • Kubernetes Cluster - v1.24+ (EKS, GKE, AKS, or self-managed)
  • kubectl CLI - Configured to access your cluster
  • Docker Registry - Access to ECR, Docker Hub, or similar
  • OpenAI API Key - Valid API key with Whisper access

Deployment Files

All Kubernetes manifests are in backend/deployment/kubernetes/manifests/:
  • namespace.yaml - Dedicated namespace
  • secrets.yaml - OpenAI API key (base64)
  • configmap.yaml - Application configuration
  • deployment.yaml - Pod deployment spec
  • service.yaml - LoadBalancer service
  • ingress.yaml - Ingress configuration (optional)
  • hpa.yaml - Horizontal Pod Autoscaler

Quick Start

1. Build and Push Docker Image

cd backend

# Build the image
docker build -t your-registry/langshazam:latest -f deployment/docker/Dockerfile .

# Push to registry
docker push your-registry/langshazam:latest
2. Update Image URL

Edit kubernetes/manifests/deployment.yaml and replace ${DOCKER_IMAGE_URL} at line 19:
containers:
- name: language-detector
  image: your-registry/langshazam:latest  # Update this
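If you would rather not edit the manifest by hand, the placeholder can be substituted at apply time. The following is a minimal sketch, not part of the repository: `render_image` is a hypothetical helper name, and it assumes the manifest contains the literal text `${DOCKER_IMAGE_URL}`.

```shell
# Hypothetical helper: print a manifest with ${DOCKER_IMAGE_URL} replaced
# by a real image reference. The file on disk is left unchanged.
render_image() {
  manifest="$1"
  image="$2"
  # '|' is the sed delimiter because image references contain '/'.
  # [$] matches a literal dollar sign.
  sed 's|[$]{DOCKER_IMAGE_URL}|'"$image"'|g' "$manifest"
}

# Example use (requires kubectl and cluster access):
#   render_image kubernetes/manifests/deployment.yaml your-registry/langshazam:latest \
#     | kubectl apply -n language-detector -f -
```

Piping the rendered output to `kubectl apply -f -` keeps the checked-in manifest generic, so the same file can serve several registries or tags.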
3. Encode API Key

echo -n "your-openai-api-key" | base64
Update kubernetes/manifests/secrets.yaml at line 9:
data:
  openai-api-key: eW91ci1iYXNlNjQtZW5jb2RlZC1rZXk=
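Two things commonly corrupt the encoded value: a trailing newline (from `echo` without `-n`) and line wrapping that some `base64` implementations add to long output. The sketch below encodes a placeholder key, decodes it again, and checks that the result matches byte for byte. The variable names are illustrative only.

```shell
api_key="your-openai-api-key"   # placeholder, not a real key

# printf '%s' emits the key without a trailing newline (same effect as echo -n);
# tr strips any line wrapping added to long base64 output.
encoded=$(printf '%s' "$api_key" | base64 | tr -d '\n')

# Decode again and count bytes to confirm nothing was added or lost.
decoded=$(printf '%s' "$encoded" | base64 --decode)
decoded_len=$(( $(printf '%s' "$encoded" | base64 --decode | wc -c) ))

if [ "$decoded" = "$api_key" ] && [ "$decoded_len" -eq "${#api_key}" ]; then
  echo "round trip OK"
else
  echo "encoded value does not decode back to the key" >&2
fi
```

The value of `$encoded` is what goes into the `openai-api-key` field.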
4. Apply Kubernetes Resources

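# From the backend directory (where the image was built)
cd deployment

kubectl apply -f kubernetes/manifests/namespace.yaml
# The remaining manifests do not set metadata.namespace, so pass it explicitly
kubectl apply -n language-detector -f kubernetes/manifests/secrets.yaml
kubectl apply -n language-detector -f kubernetes/manifests/configmap.yaml
kubectl apply -n language-detector -f kubernetes/manifests/deployment.yaml
kubectl apply -n language-detector -f kubernetes/manifests/service.yaml
kubectl apply -n language-detector -f kubernetes/manifests/hpa.yaml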
5. Verify Deployment

# Check pods
kubectl get pods -n language-detector

# Check service
kubectl get svc -n language-detector

# Get external IP (may take a few minutes)
kubectl get svc language-detector -n language-detector

Manifest Details

Namespace

From kubernetes/manifests/namespace.yaml:
apiVersion: v1
kind: Namespace
metadata:
  name: language-detector
  labels:
    name: language-detector

Secrets

From kubernetes/manifests/secrets.yaml:
apiVersion: v1
kind: Secret
metadata:
  name: language-detector-secrets
type: Opaque
data:
  # The value below is a placeholder; it should be replaced with a base64-encoded API key
  # To encode: echo -n "your-api-key" | base64
  openai-api-key: ${BASE64_ENCODED_OPENAI_API_KEY}
Never commit your actual API key to version control. Use a secret management tool like AWS Secrets Manager, HashiCorp Vault, or sealed-secrets.
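One way to keep the key out of the repository is to generate the Secret manifest at deploy time from an environment variable. This is a sketch under assumptions: `render_secret` is a hypothetical helper, and `OPENAI_API_KEY` is assumed to be set in the shell that runs it. The manifest fields mirror secrets.yaml above.

```shell
# Hypothetical helper: print the Secret manifest for the key given as $1.
render_secret() {
  # Encode without a trailing newline and without line wrapping
  encoded=$(printf '%s' "$1" | base64 | tr -d '\n')
  cat <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: language-detector-secrets
type: Opaque
data:
  openai-api-key: ${encoded}
EOF
}

# Example use (requires kubectl and cluster access):
#   render_secret "$OPENAI_API_KEY" | kubectl apply -n language-detector -f -
```

Because the manifest is streamed to `kubectl apply -f -`, the encoded key is never written to a file that could be committed.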

ConfigMap

From kubernetes/manifests/configmap.yaml:
apiVersion: v1
kind: ConfigMap
metadata:
  name: language-detector-config
data:
  # Application configuration
  # These values can be adjusted based on environment
  LOGGING_LEVEL: "INFO"
  MAX_CONNECTIONS: "100"
  MAX_AUDIO_SIZE_MB: "5"
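
Note that the Deployment manifest shown on this page injects only OPENAI_API_KEY and does not reference this ConfigMap. If the application reads these settings from environment variables (an assumption, not confirmed by the manifests), one way to expose them is an envFrom entry on the container. The fragment below is a sketch of that wiring, not the repository's actual configuration:

```yaml
# Fragment of deployment.yaml under spec.template.spec; not a complete manifest
containers:
- name: language-detector
  image: ${DOCKER_IMAGE_URL}
  envFrom:
  - configMapRef:
      # Exposes LOGGING_LEVEL, MAX_CONNECTIONS, and MAX_AUDIO_SIZE_MB as env vars
      name: language-detector-config
```

Pods read ConfigMap-backed environment variables only at startup, so a changed value takes effect after the pods are restarted.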

Deployment

From kubernetes/manifests/deployment.yaml:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: language-detector
  labels:
    app: language-detector
spec:
  replicas: 2
  selector:
    matchLabels:
      app: language-detector
  template:
    metadata:
      labels:
        app: language-detector
    spec:
      containers:
      - name: language-detector
        image: ${DOCKER_IMAGE_URL}
        imagePullPolicy: Always
        ports:
        - containerPort: 10000
          name: http
        env:
        - name: OPENAI_API_KEY
          valueFrom:
            secretKeyRef:
              name: language-detector-secrets
              key: openai-api-key
        resources:
          requests:
            cpu: "100m"
            memory: "256Mi"
          limits:
            cpu: "500m"
            memory: "512Mi"
        livenessProbe:
          httpGet:
            path: /
            port: 10000
          initialDelaySeconds: 30
          periodSeconds: 10
          timeoutSeconds: 5
        readinessProbe:
          httpGet:
            path: /
            port: 10000
          initialDelaySeconds: 5
          periodSeconds: 5
          timeoutSeconds: 3
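Key Features:
  • 2 replicas for high availability
  • Resource requests: 100m CPU, 256Mi RAM (reserved for the container when the pod is scheduled)
  • Resource limits: 500m CPU, 512Mi RAM (CPU is throttled above the limit; a container that exceeds the memory limit is OOM-killed)
  • Liveness probe: first check 30s after the container starts, then every 10s; the container is restarted after repeated failures (3 consecutive failures by default)
  • Readiness probe: first check 5s after the container starts, then every 5s; the pod receives Service traffic only while the probe passes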
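
The probe blocks above rely on Kubernetes defaults for the fields they omit. As an illustration, here is the liveness probe with those defaults written out; the values for failureThreshold and successThreshold are the Kubernetes defaults, not settings taken from the repository:

```yaml
livenessProbe:
  httpGet:
    path: /
    port: 10000
  initialDelaySeconds: 30   # wait before the first check
  periodSeconds: 10         # interval between checks
  timeoutSeconds: 5         # a check with no response within 5s counts as failed
  failureThreshold: 3       # Kubernetes default: restart after 3 consecutive failures
  successThreshold: 1       # Kubernetes default (must be 1 for liveness probes)
```

With these numbers, a container that never responds is restarted roughly 50 seconds after it starts (checks at about 30s, 40s, and 50s), not at the 30-second mark.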

Service

From kubernetes/manifests/service.yaml:
apiVersion: v1
kind: Service
metadata:
  name: language-detector
  labels:
    app: language-detector
spec:
  type: LoadBalancer
  ports:
  - port: 80
    targetPort: 10000
    protocol: TCP
    name: http
  selector:
    app: language-detector
This creates a LoadBalancer Service that forwards external traffic on port 80 to container port 10000 on pods labeled app: language-detector.

Horizontal Pod Autoscaler

From kubernetes/manifests/hpa.yaml:
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: language-detector-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: language-detector
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 70
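Auto-scaling behavior:
  • Starts with 2 pods and never scales below that minimum
  • Scales up to 10 pods maximum
  • Adds pods when average CPU or memory utilization across pods exceeds 70% of the requested amount (100m CPU and 256Mi memory in the Deployment above)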
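
To make the scaling rule concrete, the sketch below applies the formula from the Kubernetes HPA documentation, desired = ceil(current × currentUtilization / targetUtilization), and clamps the result to the configured minimum and maximum. `desired_replicas` is a hypothetical helper for illustration; it ignores the controller's tolerance band and stabilization windows.

```shell
# Hypothetical helper: estimate the replica count the HPA would ask for.
# Arguments: current replicas, observed utilization %, target utilization %,
#            minReplicas, maxReplicas
desired_replicas() {
  current="$1"; utilization="$2"; target="$3"; min="$4"; max="$5"
  # Integer ceiling of current * utilization / target
  desired=$(( (current * utilization + target - 1) / target ))
  if [ "$desired" -lt "$min" ]; then desired="$min"; fi
  if [ "$desired" -gt "$max" ]; then desired="$max"; fi
  echo "$desired"
}

desired_replicas 2 140 70 2 10   # 2 pods at 140% of request -> 4
desired_replicas 8 200 70 2 10   # would be 23, capped at maxReplicas -> 10
```

When several metrics are configured, as with CPU and memory here, the HPA computes a count for each metric and uses the largest.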

Ingress Setup (Optional)

For custom domain and SSL, configure the Ingress.

Install Nginx Ingress Controller

kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/controller-v1.8.1/deploy/static/provider/cloud/deploy.yaml

Update Ingress Manifest

From kubernetes/manifests/ingress.yaml:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: language-detector-ingress
  annotations:
    kubernetes.io/ingress.class: nginx
    # Add these annotations for SSL if needed
    # cert-manager.io/cluster-issuer: letsencrypt-prod
    # nginx.ingress.kubernetes.io/ssl-redirect: "true"
spec:
  rules:
  - host: ${YOUR_DOMAIN}  # e.g., api.language-detector.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: language-detector
            port:
              number: 80
  # Add this section for SSL if needed
  # tls:
  # - hosts:
  #   - ${YOUR_DOMAIN}
  #   secretName: language-detector-tls
Replace ${YOUR_DOMAIN} with your domain, then apply the Ingress in the same namespace as the Service:
kubectl apply -n language-detector -f kubernetes/manifests/ingress.yaml

SSL with cert-manager

1. Install cert-manager

kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.13.0/cert-manager.yaml
2. Create ClusterIssuer

Create letsencrypt-issuer.yaml:
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: ${YOUR_EMAIL}  # replace with your contact email address
    privateKeySecretRef:
      name: letsencrypt-prod
    solvers:
    - http01:
        ingress:
          class: nginx
kubectl apply -f letsencrypt-issuer.yaml
3. Update Ingress

Uncomment the SSL annotations and TLS section in ingress.yaml, then reapply.
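
For reference, this is how the Ingress manifest from this page looks with the SSL annotations and TLS section uncommented. Nothing is added beyond what the commented lines already contain; ${YOUR_DOMAIN} remains a placeholder to replace.

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: language-detector-ingress
  annotations:
    kubernetes.io/ingress.class: nginx
    cert-manager.io/cluster-issuer: letsencrypt-prod   # must match the ClusterIssuer name
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
spec:
  rules:
  - host: ${YOUR_DOMAIN}
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: language-detector
            port:
              number: 80
  tls:
  - hosts:
    - ${YOUR_DOMAIN}
    secretName: language-detector-tls   # cert-manager stores the issued certificate here
```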

Monitoring

Check Pod Status

# List all pods
kubectl get pods -n language-detector

# Describe a specific pod
kubectl describe pod <pod-name> -n language-detector

# View logs
kubectl logs <pod-name> -n language-detector

# Follow logs
kubectl logs -f <pod-name> -n language-detector

Check HPA Status

kubectl get hpa -n language-detector

# Watch HPA in real-time
kubectl get hpa -n language-detector -w

Check Service and Endpoints

# Get external IP
kubectl get svc language-detector -n language-detector

# View endpoints
kubectl get endpoints language-detector -n language-detector

Access Metrics

# Get external IP (some providers, such as AWS, report .hostname instead of .ip)
EXTERNAL_IP=$(kubectl get svc language-detector -n language-detector -o jsonpath='{.status.loadBalancer.ingress[0].ip}')

# Test endpoint
curl http://$EXTERNAL_IP/

# Get metrics
curl http://$EXTERNAL_IP/metrics

Scaling

Manual Scaling

# Scale to 5 replicas (while the HPA is active, it may change this count again)
kubectl scale deployment language-detector --replicas=5 -n language-detector

# Verify
kubectl get pods -n language-detector

Adjust HPA Thresholds

Edit hpa.yaml to change scaling behavior:
metrics:
- type: Resource
  resource:
    name: cpu
    target:
      type: Utilization
      averageUtilization: 50  # Changed from 70
Reapply:
kubectl apply -n language-detector -f kubernetes/manifests/hpa.yaml
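
Besides the utilization target, the autoscaling/v2 API accepts an optional behavior section that controls how quickly replicas are added or removed. The fragment below is a sketch of a conservative scale-down policy; the policy values are illustrative assumptions, not settings from the repository's hpa.yaml.

```yaml
# Fragment of hpa.yaml; merges into the existing spec
spec:
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300   # Kubernetes default: act on the highest recommendation of the last 5 minutes
      policies:
      - type: Pods
        value: 1            # remove at most 1 pod ...
        periodSeconds: 60   # ... per 60-second period
```

A slower scale-down like this can reduce pod churn when load fluctuates around the target.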

Updates and Rollbacks

Rolling Update

# Build and push new image with version tag
docker build -t your-registry/langshazam:v2 -f deployment/docker/Dockerfile .
docker push your-registry/langshazam:v2

# Update deployment
kubectl set image deployment/language-detector \
  language-detector=your-registry/langshazam:v2 \
  -n language-detector

# Monitor rollout
kubectl rollout status deployment/language-detector -n language-detector
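
How many pods are replaced at a time during a rollout is governed by the Deployment's update strategy, which the manifest on this page leaves at the Kubernetes defaults (25% for both maxSurge and maxUnavailable). As a sketch, an explicit strategy that never drops below the desired replica count could look like this; the values are an illustrative choice, not the repository's configuration:

```yaml
# Fragment of deployment.yaml; merges into the existing spec
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1         # start 1 extra pod with the new image
      maxUnavailable: 0   # keep every old pod until its replacement is ready
```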

Rollback

# Rollback to previous version
kubectl rollout undo deployment/language-detector -n language-detector

# Rollback to specific revision
kubectl rollout undo deployment/language-detector --to-revision=2 -n language-detector

# View rollout history
kubectl rollout history deployment/language-detector -n language-detector

Troubleshooting

Pods Not Starting

# Check pod events
kubectl describe pod <pod-name> -n language-detector

# Common issues:
# - Image pull errors (check registry credentials)
# - Insufficient resources (check node capacity)
# - Missing secrets (verify secrets.yaml applied)
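
For image pull errors against a private registry, the pod needs registry credentials. Assuming a pull secret has already been created in the namespace (for example with kubectl create secret docker-registry), the Deployment can reference it as sketched below. The secret name regcred is a hypothetical placeholder.

```yaml
# Fragment of deployment.yaml under spec.template.spec; not a complete manifest
imagePullSecrets:
- name: regcred   # hypothetical name; must match the pull secret in the language-detector namespace
containers:
- name: language-detector
  image: ${DOCKER_IMAGE_URL}
```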

Service Not Accessible

# Check service
kubectl get svc language-detector -n language-detector

# Check endpoints
kubectl get endpoints language-detector -n language-detector

# If endpoints are empty, pods aren't ready
kubectl get pods -n language-detector

HPA Not Scaling

# Check that metrics-server is installed (both commands fail without it)
kubectl top nodes
kubectl top pods -n language-detector

# If metrics-server is not installed:
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml

Cleanup

# Delete the namespace and every resource in it
kubectl delete namespace language-detector

# Or delete resources individually
kubectl delete -n language-detector -f kubernetes/manifests/hpa.yaml
kubectl delete -n language-detector -f kubernetes/manifests/ingress.yaml
kubectl delete -n language-detector -f kubernetes/manifests/service.yaml
kubectl delete -n language-detector -f kubernetes/manifests/deployment.yaml
kubectl delete -n language-detector -f kubernetes/manifests/configmap.yaml
kubectl delete -n language-detector -f kubernetes/manifests/secrets.yaml
kubectl delete -f kubernetes/manifests/namespace.yaml

Next Steps

  • Environment Variables - Customize application configuration
  • CORS Setup - Configure allowed origins
