
Overview

Deploy LLM Gateway Enterprise on Kubernetes for production-grade scalability, high availability, and automated operations.

Prerequisites

  • Kubernetes 1.25 or later
  • kubectl configured
  • Helm 3.x (optional but recommended)
  • At least 4GB of RAM per node
  • Persistent volume provisioner
  • Ingress controller (nginx, traefik, etc.)

Architecture

┌─────────────────────┐
│  Ingress Controller │
└──────────┬──────────┘
           │
     ┌─────┴─────┐
     │           │
┌────┴───┐  ┌────┴───┐
│   UI   │  │  API   │
│  Pods  │  │  Pods  │
└────┬───┘  └────┬───┘
     │           │
┌────┴───────────┴────┐
│     Gateway Pods    │
└──────────┬──────────┘
           │
     ┌─────┴─────┐
     │           │
┌────┴─────┐ ┌───┴─────┐
│ Postgres │ │  Redis  │
│  (PVC)   │ │  (PVC)  │
└──────────┘ └─────────┘

Namespace

Create a dedicated namespace:
kubectl create namespace llmgateway
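Equivalently, the namespace can be managed declaratively alongside the other manifests:

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: llmgateway
```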

Configuration

ConfigMap

Create configmap.yaml:
apiVersion: v1
kind: ConfigMap
metadata:
  name: llmgateway-config
  namespace: llmgateway
data:
  NODE_ENV: "production"
  POSTGRES_USER: "postgres"
  POSTGRES_DB: "llmgateway"
  REDIS_PORT: "6379"
  UI_PORT: "3002"
  API_PORT: "4002"
  GATEWAY_PORT: "4001"
  PLAYGROUND_PORT: "3003"
  ADMIN_PORT: "3006"
  UI_URL: "https://llmgateway.yourdomain.com"
  API_URL: "https://api.llmgateway.yourdomain.com"
  GATEWAY_URL: "https://gateway.llmgateway.yourdomain.com"
  PLAYGROUND_URL: "https://playground.llmgateway.yourdomain.com"
  ADMIN_URL: "https://admin.llmgateway.yourdomain.com"
  COOKIE_DOMAIN: "yourdomain.com"
  PASSKEY_RP_ID: "yourdomain.com"
  PASSKEY_RP_NAME: "LLMGateway"
Apply:
kubectl apply -f configmap.yaml

Secrets

Create secrets.yaml:
apiVersion: v1
kind: Secret
metadata:
  name: llmgateway-secrets
  namespace: llmgateway
type: Opaque
stringData:
  POSTGRES_PASSWORD: "your-postgres-password"
  REDIS_PASSWORD: "your-redis-password"
  AUTH_SECRET: "your-auth-secret-32-bytes"
  GITHUB_CLIENT_ID: "your-github-client-id"
  GITHUB_CLIENT_SECRET: "your-github-client-secret"
  STRIPE_SECRET_KEY: "sk_live_your-stripe-key"
  STRIPE_WEBHOOK_SECRET: "whsec_your-webhook-secret"
  LLM_OPENAI_API_KEY: "sk-your-openai-key"
  LLM_ANTHROPIC_API_KEY: "sk-ant-your-anthropic-key"
Apply:
kubectl apply -f secrets.yaml
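The placeholder values above must be replaced with strong random strings before applying. One way to generate them, assuming openssl is available:

```shell
# 32 random bytes, base64-encoded — suitable for AUTH_SECRET
openssl rand -base64 32

# 32 hex characters — suitable for POSTGRES_PASSWORD / REDIS_PASSWORD
openssl rand -hex 16
```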
Avoid committing secrets.yaml to version control. For production, use a secrets management solution such as:
  • HashiCorp Vault
  • AWS Secrets Manager
  • Google Secret Manager
  • Azure Key Vault
  • Sealed Secrets
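For example, with the External Secrets Operator installed, the Secret can be synced from an external store instead of applied by hand. A sketch — the SecretStore name and remote key path below are assumptions:

```yaml
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: llmgateway-secrets
  namespace: llmgateway
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: vault-backend          # assumed SecretStore pointing at your backend
    kind: SecretStore
  target:
    name: llmgateway-secrets     # Secret created and kept in sync by the operator
  data:
    - secretKey: POSTGRES_PASSWORD
      remoteRef:
        key: llmgateway/postgres # assumed path in the external store
        property: password
```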

PostgreSQL

Deploy PostgreSQL with persistent storage.

StatefulSet

Create postgres.yaml:
apiVersion: v1
kind: Service
metadata:
  name: postgres
  namespace: llmgateway
spec:
  ports:
    - port: 5432
      name: postgres
  clusterIP: None
  selector:
    app: postgres
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres
  namespace: llmgateway
spec:
  serviceName: postgres
  replicas: 1
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
      - name: postgres
        image: postgres:17-alpine
        ports:
        - containerPort: 5432
          name: postgres
        env:
        - name: POSTGRES_USER
          valueFrom:
            configMapKeyRef:
              name: llmgateway-config
              key: POSTGRES_USER
        - name: POSTGRES_PASSWORD
          valueFrom:
            secretKeyRef:
              name: llmgateway-secrets
              key: POSTGRES_PASSWORD
        - name: POSTGRES_DB
          valueFrom:
            configMapKeyRef:
              name: llmgateway-config
              key: POSTGRES_DB
        - name: PGDATA
          value: /var/lib/postgresql/data/pgdata
        volumeMounts:
        - name: postgres-storage
          mountPath: /var/lib/postgresql/data
        resources:
          requests:
            memory: "256Mi"
            cpu: "250m"
          limits:
            memory: "1Gi"
            cpu: "1000m"
        livenessProbe:
          exec:
            command:
            - pg_isready
            - -U
            - postgres
          initialDelaySeconds: 30
          periodSeconds: 10
        readinessProbe:
          exec:
            command:
            - pg_isready
            - -U
            - postgres
          initialDelaySeconds: 5
          periodSeconds: 5
  volumeClaimTemplates:
  - metadata:
      name: postgres-storage
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 10Gi
Apply:
kubectl apply -f postgres.yaml

Redis

Deploy Redis for caching and queues.

Deployment

Create redis.yaml:
apiVersion: v1
kind: Service
metadata:
  name: redis
  namespace: llmgateway
spec:
  ports:
    - port: 6379
      name: redis
  selector:
    app: redis
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: redis
  namespace: llmgateway
spec:
  replicas: 1
  selector:
    matchLabels:
      app: redis
  template:
    metadata:
      labels:
        app: redis
    spec:
      containers:
      - name: redis
        image: redis:8-alpine
        args:
        - redis-server
        - --appendonly
        - "yes"
        - --requirepass
        - $(REDIS_PASSWORD)
        ports:
        - containerPort: 6379
          name: redis
        env:
        - name: REDIS_PASSWORD
          valueFrom:
            secretKeyRef:
              name: llmgateway-secrets
              key: REDIS_PASSWORD
        volumeMounts:
        - name: redis-storage
          mountPath: /data
        resources:
          requests:
            memory: "128Mi"
            cpu: "100m"
          limits:
            memory: "512Mi"
            cpu: "500m"
        livenessProbe:
          exec:
            command:
            - sh
            - -c
            - redis-cli -a "$REDIS_PASSWORD" ping | grep -q PONG
          initialDelaySeconds: 30
          periodSeconds: 10
        readinessProbe:
          exec:
            command:
            - sh
            - -c
            - redis-cli -a "$REDIS_PASSWORD" ping | grep -q PONG
          initialDelaySeconds: 5
          periodSeconds: 5
      volumes:
      - name: redis-storage
        persistentVolumeClaim:
          claimName: redis-pvc
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: redis-pvc
  namespace: llmgateway
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
Apply:
kubectl apply -f redis.yaml

Application Services

Gateway

Create gateway.yaml:
apiVersion: v1
kind: Service
metadata:
  name: gateway
  namespace: llmgateway
  labels:
    app: gateway
spec:
  selector:
    app: gateway
  ports:
    - port: 80
      targetPort: 4001
      name: http
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: gateway
  namespace: llmgateway
spec:
  replicas: 3
  selector:
    matchLabels:
      app: gateway
  template:
    metadata:
      labels:
        app: gateway
    spec:
      containers:
      - name: gateway
        image: ghcr.io/theopenco/llmgateway-gateway:latest
        ports:
        - containerPort: 4001
          name: http
        env:
        - name: NODE_ENV
          value: "production"
        - name: PORT
          value: "4001"
        - name: POSTGRES_PASSWORD
          valueFrom:
            secretKeyRef:
              name: llmgateway-secrets
              key: POSTGRES_PASSWORD
        - name: DATABASE_URL
          value: "postgres://$(POSTGRES_USER):$(POSTGRES_PASSWORD)@postgres:5432/$(POSTGRES_DB)"
        - name: REDIS_HOST
          value: "redis"
        - name: REDIS_PORT
          value: "6379"
        - name: REDIS_PASSWORD
          valueFrom:
            secretKeyRef:
              name: llmgateway-secrets
              key: REDIS_PASSWORD
        - name: LLM_OPENAI_API_KEY
          valueFrom:
            secretKeyRef:
              name: llmgateway-secrets
              key: LLM_OPENAI_API_KEY
        - name: LLM_ANTHROPIC_API_KEY
          valueFrom:
            secretKeyRef:
              name: llmgateway-secrets
              key: LLM_ANTHROPIC_API_KEY
        envFrom:
        - configMapRef:
            name: llmgateway-config
        resources:
          requests:
            memory: "256Mi"
            cpu: "200m"
          limits:
            memory: "1Gi"
            cpu: "1000m"
        livenessProbe:
          httpGet:
            path: /
            port: 4001
          initialDelaySeconds: 30
          periodSeconds: 10
        readinessProbe:
          httpGet:
            path: /
            port: 4001
          initialDelaySeconds: 5
          periodSeconds: 5

API

Create api.yaml:
apiVersion: v1
kind: Service
metadata:
  name: api
  namespace: llmgateway
spec:
  selector:
    app: api
  ports:
    - port: 80
      targetPort: 4002
      name: http
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api
  namespace: llmgateway
spec:
  replicas: 2
  selector:
    matchLabels:
      app: api
  template:
    metadata:
      labels:
        app: api
    spec:
      initContainers:
      - name: migrations
        image: ghcr.io/theopenco/llmgateway-api:latest
        command: ["pnpm", "run", "migrate"]
        env:
        - name: DATABASE_URL
          value: "postgres://$(POSTGRES_USER):$(POSTGRES_PASSWORD)@postgres:5432/$(POSTGRES_DB)"
        envFrom:
        - configMapRef:
            name: llmgateway-config
        - secretRef:
            name: llmgateway-secrets
      containers:
      - name: api
        image: ghcr.io/theopenco/llmgateway-api:latest
        ports:
        - containerPort: 4002
          name: http
        env:
        - name: NODE_ENV
          value: "production"
        - name: PORT
          value: "4002"
        - name: DATABASE_URL
          value: "postgres://$(POSTGRES_USER):$(POSTGRES_PASSWORD)@postgres:5432/$(POSTGRES_DB)"
        envFrom:
        - configMapRef:
            name: llmgateway-config
        - secretRef:
            name: llmgateway-secrets
        resources:
          requests:
            memory: "256Mi"
            cpu: "200m"
          limits:
            memory: "1Gi"
            cpu: "1000m"
        livenessProbe:
          httpGet:
            path: /
            port: 4002
          initialDelaySeconds: 30
          periodSeconds: 10
        readinessProbe:
          httpGet:
            path: /
            port: 4002
          initialDelaySeconds: 5
          periodSeconds: 5

UI

Create ui.yaml:
apiVersion: v1
kind: Service
metadata:
  name: ui
  namespace: llmgateway
spec:
  selector:
    app: ui
  ports:
    - port: 80
      targetPort: 3002
      name: http
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ui
  namespace: llmgateway
spec:
  replicas: 2
  selector:
    matchLabels:
      app: ui
  template:
    metadata:
      labels:
        app: ui
    spec:
      containers:
      - name: ui
        image: ghcr.io/theopenco/llmgateway-ui:latest
        ports:
        - containerPort: 3002
          name: http
        envFrom:
        - configMapRef:
            name: llmgateway-config
        resources:
          requests:
            memory: "128Mi"
            cpu: "100m"
          limits:
            memory: "512Mi"
            cpu: "500m"
        livenessProbe:
          httpGet:
            path: /
            port: 3002
          initialDelaySeconds: 30
          periodSeconds: 10
        readinessProbe:
          httpGet:
            path: /
            port: 3002
          initialDelaySeconds: 5
          periodSeconds: 5
Apply all:
kubectl apply -f gateway.yaml
kubectl apply -f api.yaml
kubectl apply -f ui.yaml

Ingress

Configure ingress for external access.

Nginx Ingress

Create ingress.yaml:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: llmgateway-ingress
  namespace: llmgateway
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
spec:
  ingressClassName: nginx
  tls:
  - hosts:
    - llmgateway.yourdomain.com
    - api.llmgateway.yourdomain.com
    - gateway.llmgateway.yourdomain.com
    secretName: llmgateway-tls
  rules:
  - host: llmgateway.yourdomain.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: ui
            port:
              number: 80
  - host: api.llmgateway.yourdomain.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: api
            port:
              number: 80
  - host: gateway.llmgateway.yourdomain.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: gateway
            port:
              number: 80
Apply:
kubectl apply -f ingress.yaml
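The ingress annotation above references a cert-manager ClusterIssuer named letsencrypt-prod. If it does not exist yet, a typical ACME issuer looks like this (assumes cert-manager is installed; replace the email address):

```yaml
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: admin@yourdomain.com   # replace with a real contact address
    privateKeySecretRef:
      name: letsencrypt-prod      # Secret storing the ACME account key
    solvers:
      - http01:
          ingress:
            ingressClassName: nginx
```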

Horizontal Pod Autoscaler

Automatically scale pods based on CPU and memory usage. Resource-based autoscaling requires the Kubernetes metrics-server to be installed in the cluster.

Gateway HPA

Create hpa.yaml:
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: gateway-hpa
  namespace: llmgateway
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: gateway
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80
Apply:
kubectl apply -f hpa.yaml
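The HPA keeps at least two gateway replicas; a PodDisruptionBudget additionally prevents voluntary disruptions (node drains, cluster upgrades) from evicting all gateway pods at once. A minimal sketch:

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: gateway-pdb
  namespace: llmgateway
spec:
  minAvailable: 1        # always keep at least one gateway pod running
  selector:
    matchLabels:
      app: gateway
```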

Monitoring

Prometheus

If the Prometheus Operator is installed, add a ServiceMonitor (it selects Services labeled app: gateway):
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: llmgateway-metrics
  namespace: llmgateway
spec:
  selector:
    matchLabels:
      app: gateway
  endpoints:
  - port: http
    path: /metrics
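With the Prometheus Operator, alerting rules can also be defined declaratively as a PrometheusRule. A minimal example that fires when no gateway target is up (the alert name and 5m threshold are illustrative):

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: llmgateway-alerts
  namespace: llmgateway
spec:
  groups:
    - name: llmgateway
      rules:
        - alert: GatewayDown
          expr: up{namespace="llmgateway"} == 0
          for: 5m
          labels:
            severity: critical
```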

Logs

View logs:
# All gateway pods
kubectl logs -n llmgateway -l app=gateway -f

# Specific pod
kubectl logs -n llmgateway gateway-abc123 -f

# Previous container
kubectl logs -n llmgateway gateway-abc123 --previous

Backup and Restore

PostgreSQL Backup

# Create backup
kubectl exec -n llmgateway postgres-0 -- pg_dump -U postgres llmgateway > backup.sql

# Restore backup
cat backup.sql | kubectl exec -i -n llmgateway postgres-0 -- psql -U postgres llmgateway
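Manual dumps can be automated with a CronJob that runs pg_dump on a schedule. A sketch — backup-pvc is an assumed pre-existing PersistentVolumeClaim:

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: postgres-backup
  namespace: llmgateway
spec:
  schedule: "0 2 * * *"            # daily at 02:00
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: backup
              image: postgres:17-alpine
              command: ["/bin/sh", "-c"]
              args:
                - pg_dump -h postgres -U postgres llmgateway > /backup/backup-$(date +%F).sql
              env:
                - name: PGPASSWORD
                  valueFrom:
                    secretKeyRef:
                      name: llmgateway-secrets
                      key: POSTGRES_PASSWORD
              volumeMounts:
                - name: backup
                  mountPath: /backup
          volumes:
            - name: backup
              persistentVolumeClaim:
                claimName: backup-pvc   # assumed PVC for backup storage
```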

Velero

Use Velero for cluster backups:
# Backup namespace
velero backup create llmgateway-backup --include-namespaces llmgateway

# Restore
velero restore create --from-backup llmgateway-backup

Production Checklist

  • Configure persistent volumes
  • Set resource limits and requests
  • Enable horizontal pod autoscaling
  • Configure ingress with SSL/TLS
  • Set up monitoring and alerting
  • Configure log aggregation
  • Enable network policies
  • Set up backup strategy
  • Configure secrets management
  • Review security contexts
  • Set up CI/CD pipeline
  • Document runbooks
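For the network-policies item above, a sketch that restricts PostgreSQL to traffic from the api and gateway pods only (assumes your CNI plugin enforces NetworkPolicy):

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: postgres-access
  namespace: llmgateway
spec:
  podSelector:
    matchLabels:
      app: postgres
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchExpressions:
              - key: app
                operator: In
                values: ["api", "gateway"]
      ports:
        - protocol: TCP
          port: 5432
```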
