Overview
The Secure MCP Gateway supports multiple deployment patterns to meet different scalability, availability, and security requirements. This guide covers common architectures from single-instance local deployments to distributed cloud deployments.
Architecture Patterns
1. Local Development
Use Case: Development, testing, personal use
┌─────────────────┐
│ MCP Client │ (Claude Desktop, Cursor)
│ (stdio) │
└────────┬────────┘
│
▼
┌─────────────────┐
│ Gateway │ (localhost)
│ (pip install) │
└────────┬────────┘
│
▼
┌─────────────────┐
│ MCP Servers │ (local processes)
│ (stdio/http) │
└─────────────────┘
Characteristics:
Single machine deployment
Stdio transport for client communication
Local in-memory cache
Ideal for: Development, testing, single-user scenarios
Setup:
# Install with pip
pip install secure-mcp-gateway
secure-mcp-gateway generate-config
secure-mcp-gateway install --client claude-desktop
This is the simplest deployment pattern. Configuration is stored locally at ~/.enkrypt/enkrypt_mcp_config.json.
2. Docker Single Instance
Use Case: Production single-server, consistent environment
┌─────────────────┐
│ MCP Client │
│ (stdio) │
└────────┬────────┘
│
▼
┌─────────────────────────────────┐
│ Docker Container │
│ ┌─────────────────────────┐ │
│ │ Gateway │ │
│ └─────────────────────────┘ │
│ ┌─────────────────────────┐ │
│ │ Config Volume │ │
│ │ ~/.enkrypt/docker │ │
│ └─────────────────────────┘ │
└────────┬────────────────────────┘
│
▼
┌─────────────────┐
│ MCP Servers │
└─────────────────┘
Characteristics:
Containerized gateway with volume-mounted config
Includes Python 3.12 and Node.js 22.x runtimes
Supports both Python and JavaScript MCP servers
Local cache within container
Setup:
# Build image
docker build -t secure-mcp-gateway .
# Generate config
docker run --rm \
-v ~/.enkrypt/docker:/app/.enkrypt/docker \
--entrypoint python \
secure-mcp-gateway \
-m secure_mcp_gateway.cli generate-config
# Run gateway
docker run -d \
--name secure-mcp-gateway \
-p 8000:8000 \
-v ~/.enkrypt/docker:/app/.enkrypt/docker \
-e ENKRYPT_GATEWAY_KEY="..." \
secure-mcp-gateway
3. Docker Compose with Observability
Use Case: Production with monitoring, debugging, team deployments
┌─────────────────┐
│ MCP Client │
└────────┬────────┘
│
▼
┌─────────────────────────────────────────────────┐
│ Docker Compose Stack │
│ │
│ ┌──────────┐ ┌──────────────────────┐ │
│ │ Gateway │───▶│ OpenTelemetry │ │
│ │ (Port │ │ Collector │ │
│ │ 8000) │ │ (OTLP 4317, 4318) │ │
│ └──────────┘ └───┬──────────┬───────┘ │
│ │ │ │
│ ┌──────────────┴─┐ ┌───┴──────────┐ │
│ │ Jaeger │ │ Loki │ │
│ │ (Traces) │ │ (Logs) │ │
│ │ Port 16686 │ │ Port 3100 │ │
│ └────────┬───────┘ └───┬──────────┘ │
│ │ │ │
│ │ ┌───────────┴───┐ │
│ │ │ Prometheus │ │
│ │ │ (Metrics) │ │
│ │ │ Port 9090 │ │
│ │ └───────┬───────┘ │
│ │ │ │
│ └────────────┼─────────────┐ │
│ ▼ │ │
│ ┌──────────────┐ │ │
│ │ Grafana │ │ │
│ │ (Dashboards)│ │ │
│ │ Port 3000 │ │ │
│ └──────────────┘ │ │
└─────────────────────────────────────────────────┘
Characteristics:
Complete observability stack
Distributed tracing with Jaeger
Log aggregation with Loki
Metrics with Prometheus
Unified dashboards in Grafana
Inter-service communication via Docker network
Setup:
# Navigate to infrastructure directory
cd infra
# Start all services
docker compose up -d
# Access dashboards
# Grafana: http://localhost:3000
# Jaeger: http://localhost:16686
# Prometheus: http://localhost:9090
Observability Features:
Distributed Tracing: Track requests across the gateway, clients, and MCP servers. Visualize latency bottlenecks.
Log Aggregation: Centralized logs from all services. Query with LogQL, correlate with traces.
Metrics Monitoring: Request rates, error rates, cache hit ratios, timeout metrics, OAuth metrics.
Grafana Dashboards: Pre-configured dashboards for gateway health, performance, and security insights.
4. Distributed with External Cache
Use Case: Multi-instance deployments, high availability, load balancing
┌──────────┐ ┌──────────┐ ┌──────────┐
│ Client │ │ Client │ │ Client │
│ A │ │ B │ │ C │
└────┬─────┘ └────┬─────┘ └────┬─────┘
│ │ │
└─────────────┼─────────────┘
│
▼
┌────────────────┐
│ Load Balancer │
└───────┬────────┘
│
┌───────────┼───────────┐
│ │ │
▼ ▼ ▼
┌──────────┐┌──────────┐┌──────────┐
│ Gateway ││ Gateway ││ Gateway │
│ Instance ││ Instance ││ Instance │
│ 1 ││ 2 ││ 3 │
└─────┬────┘└─────┬────┘└─────┬────┘
│ │ │
└───────────┼───────────┘
▼
┌─────────────────┐
│ Redis/KeyDB │
│ External Cache │
└─────────────────┘
│
┌───────────┼───────────┐
▼ ▼ ▼
┌──────────┐┌──────────┐┌──────────┐
│ MCP ││ MCP ││ MCP │
│ Server A ││ Server B ││ Server C │
└──────────┘└──────────┘└──────────┘
Characteristics:
Multiple gateway instances behind load balancer
Shared Redis/KeyDB for distributed caching
Session affinity not required (stateless gateway)
Horizontal scaling for high throughput
Fault tolerance via redundancy
Configuration:
{
  "common_mcp_gateway_config": {
    "enkrypt_mcp_use_external_cache": true,
    "enkrypt_cache_host": "redis.internal.example.com",
    "enkrypt_cache_port": 6379,
    "enkrypt_cache_db": 0,
    "enkrypt_cache_password": "your_redis_password",
    "enkrypt_tool_cache_expiration": 4,
    "enkrypt_gateway_cache_expiration": 24
  }
}
Cache Keys (MD5 hashed for security):
Tool cache: mcp_tools:{config_id}:{server_name}
Gateway config: mcp_gateway_config:{config_id}
API key mapping: mcp_key_to_id:{gateway_key}
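The listed key formats can be sketched in a few lines of Python. The `cache_key` helper below is illustrative (the gateway's actual hashing code may differ), but it shows how MD5 hashing keeps raw config IDs and server names out of Redis:

```python
import hashlib

def cache_key(prefix: str, *parts: str) -> str:
    """Build a key like mcp_tools:{config_id}:{server_name}, MD5-hashing
    the variable parts so raw identifiers never appear in the cache.
    Illustrative sketch only; not the gateway's actual implementation."""
    digest = hashlib.md5(":".join(parts).encode("utf-8")).hexdigest()
    return f"{prefix}:{digest}"

key = cache_key("mcp_tools", "config-123", "github-server")
```

The same derivation applies to the gateway-config and API-key-mapping prefixes above.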
Scaling Considerations:
Cache Sizing
Estimate cache size based on:
Number of MCP servers: N
Average tools per server: T
Cache entry size: ~1 KB per tool, plus ~5 KB of config per server
Total: N * (T * 1 KB + 5 KB config)
Example: 100 servers, 20 tools each = 100 * (20 + 5) KB ≈ 2.5 MB cache
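That formula is easy to turn into a quick estimator; `estimate_cache_kb` below is a hypothetical helper for capacity planning, not part of the gateway:

```python
def estimate_cache_kb(num_servers: int, tools_per_server: int,
                      kb_per_tool: float = 1.0, kb_config: float = 5.0) -> float:
    """Apply the sizing formula above: N * (T * 1 KB + 5 KB config).
    Returns the estimated cache footprint in KB."""
    return num_servers * (tools_per_server * kb_per_tool + kb_config)

# 100 servers with 20 tools each -> 2500 KB (~2.5 MB)
size_kb = estimate_cache_kb(100, 20)
```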
Redis Configuration
Recommended Redis/KeyDB settings:
maxmemory 1gb
maxmemory-policy allkeys-lru
save ""
appendonly yes
tcp-keepalive 60
Load Balancer Setup
Use HTTP health check: GET /health (port 8000)
No session affinity needed (stateless)
Timeout: 30s (or default_timeout from config)
Round-robin or least-connections algorithm
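The round-robin-with-health-checks behavior above can be sketched as a toy Python class. Real load balancers (Nginx, ALB) implement this for you; `RoundRobinPool` is purely illustrative:

```python
from itertools import cycle

class RoundRobinPool:
    """Toy round-robin selector over gateway instances that skips any
    instance marked unhealthy by a /health probe. Illustrative only."""

    def __init__(self, instances):
        self.healthy = set(instances)
        self._ring = cycle(instances)

    def mark_down(self, instance):
        """Called when the health check fails for an instance."""
        self.healthy.discard(instance)

    def mark_up(self, instance):
        """Called when the instance passes health checks again."""
        self.healthy.add(instance)

    def pick(self):
        """Return the next healthy instance in round-robin order."""
        if not self.healthy:
            raise RuntimeError("no healthy gateway instances")
        while True:
            candidate = next(self._ring)
            if candidate in self.healthy:
                return candidate
```

Because the gateway is stateless, any instance can serve any request, which is exactly why no session affinity is needed.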
5. Remote Gateway Deployment
Use Case: Centralized gateway, remote MCP clients, cloud deployment
┌────────────────────────────────────────┐
│ Local Environment │
│ ┌──────────┐ ┌──────────┐ │
│ │ Claude │ │ Cursor │ │
│ │ Desktop │ │ │ │
│ └────┬─────┘ └────┬─────┘ │
│ │ │ │
└───────┼─────────────┼──────────────────┘
│ │
│ HTTPS │
│ (TLS) │
▼ ▼
┌─────────────────────────────────────────┐
│ Cloud Environment (AWS/GCP/Azure) │
│ │
│ ┌───────────────────────────────────┐ │
│ │ Load Balancer (TLS Termination) │ │
│ └──────────────┬────────────────────┘ │
│ │ │
│ ┌────────┼────────┐ │
│ ▼ ▼ ▼ │
│ ┌────────┬────────┬────────┐ │
│ │Gateway │Gateway │Gateway │ │
│ │ 1 │ 2 │ 3 │ │
│ └───┬────┴───┬────┴───┬────┘ │
│ │ │ │ │
│ └────────┼────────┘ │
│ ▼ │
│ ┌──────────────┐ │
│ │ Redis Cache │ │
│ └──────────────┘ │
│ │ │
│ ┌────────┼────────┐ │
│ ▼ ▼ ▼ │
│ ┌────────┬────────┬────────┐ │
│ │ MCP │ MCP │ MCP │ │
│ │Server A│Server B│Server C│ │
│ └────────┴────────┴────────┘ │
│ │
│ ┌────────────────────────────────┐ │
│ │ Observability Stack │ │
│ │ (OTEL, Prometheus, Grafana) │ │
│ └────────────────────────────────┘ │
└─────────────────────────────────────────┘
Characteristics:
Gateway hosted in cloud (AWS, GCP, Azure)
Clients connect via HTTPS (streamable HTTP transport)
TLS termination at load balancer
Distributed caching with managed Redis
Centralized monitoring and logging
API key authentication over HTTPS
Gateway Configuration:
# Enable streamable HTTP transport
if __name__ == "__main__":
    mcp.run(
        transport="streamable-http",
        host="0.0.0.0",
        port=8000,
        path="/mcp/"
    )
Client Configuration:
claude_desktop_config.json
{
  "mcpServers": {
    "Remote Secure Gateway": {
      "url": "https://gateway.example.com/mcp/",
      "headers": {
        "Authorization": "Bearer YOUR_GATEWAY_API_KEY"
      }
    }
  }
}
Security Considerations:
Always use HTTPS for remote deployments
Store API keys in environment variables or secrets manager
Enable rate limiting at load balancer
Use VPC/private subnets for internal services
Rotate API keys regularly with CLI: secure-mcp-gateway apikey rotate
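The environment-variable recommendation can be as simple as failing fast when the key is absent. A minimal sketch (the `load_gateway_key` helper is illustrative, though `ENKRYPT_GATEWAY_KEY` is the variable used earlier in this guide):

```python
import os

def load_gateway_key() -> str:
    """Read the gateway API key from the environment instead of
    hardcoding it in config files or source; fail fast with a clear
    error if it is missing."""
    key = os.environ.get("ENKRYPT_GATEWAY_KEY")
    if not key:
        raise RuntimeError(
            "ENKRYPT_GATEWAY_KEY is not set; export it or inject it "
            "from your secrets manager")
    return key
```

In cloud deployments the variable would typically be injected by the secrets manager (e.g. as an ECS task secret) rather than set by hand.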
Cloud Deployment Examples:
AWS Architecture:
ALB: Application Load Balancer with TLS
ECS Fargate: Containerized gateway instances
ElastiCache Redis: Managed cache cluster
CloudWatch: Logs and metrics
X-Ray: Distributed tracing
Terraform Example:
resource "aws_ecs_service" "gateway" {
  name            = "secure-mcp-gateway"
  cluster         = aws_ecs_cluster.main.id
  task_definition = aws_ecs_task_definition.gateway.arn
  desired_count   = 3
  load_balancer {
    target_group_arn = aws_lb_target_group.gateway.arn
    container_name   = "gateway"
    container_port   = 8000
  }
  network_configuration {
    subnets         = aws_subnet.private[*].id
    security_groups = [aws_security_group.gateway.id]
  }
}
resource "aws_elasticache_cluster" "cache" {
  cluster_id           = "mcp-gateway-cache"
  engine               = "redis"
  node_type            = "cache.t3.small"
  num_cache_nodes      = 1
  parameter_group_name = "default.redis7"
  port                 = 6379
}
GCP Architecture:
Cloud Load Balancing: HTTPS load balancer
Cloud Run: Serverless containers
Memorystore: Managed Redis
Cloud Logging: Centralized logs
Cloud Trace: Distributed tracing
gcloud Deploy:
# Build and push image
gcloud builds submit --tag gcr.io/PROJECT_ID/secure-mcp-gateway
# Deploy to Cloud Run
gcloud run deploy secure-mcp-gateway \
  --image gcr.io/PROJECT_ID/secure-mcp-gateway \
  --platform managed \
  --region us-central1 \
  --allow-unauthenticated \
  --set-env-vars "ENKRYPT_CACHE_HOST=REDIS_IP"
Azure Architecture:
Application Gateway: HTTPS load balancer
Container Instances: Managed containers
Azure Cache for Redis: Managed Redis
Application Insights: Monitoring
Log Analytics: Log aggregation
Azure CLI Deploy:
# Create container instance
az container create \
  --resource-group mcp-gateway-rg \
  --name secure-mcp-gateway \
  --image secure-mcp-gateway:latest \
  --cpu 2 --memory 4 \
  --ports 8000 \
  --environment-variables \
    ENKRYPT_CACHE_HOST=$REDIS_HOST \
    ENKRYPT_CACHE_PASSWORD=$REDIS_PASSWORD
6. Hybrid Deployment
Use Case: Gateway in cloud, MCP servers on-premise or distributed
┌─────────────────────────────────────────┐
│ Cloud (Gateway + Observability) │
│ ┌──────────┐ ┌──────────────┐ │
│ │ Gateway │──│ Redis Cache │ │
│ └────┬─────┘ └──────────────┘ │
└───────┼──────────────────────────────────┘
│
│ VPN / Private Link
│
┌───────┼──────────────────────────────────┐
│ On-Premise / Edge │
│ │ │
│ ┌────┴─────────┬─────────────┐ │
│ ▼ ▼ ▼ │
│ ┌──────────┬──────────┬──────────┐ │
│ │Internal │Database │File │ │
│ │API MCP │MCP │System MCP│ │
│ └──────────┴──────────┴──────────┘ │
└─────────────────────────────────────────┘
Characteristics:
Gateway in cloud for scalability
MCP servers on-premise for data security
Secure connectivity via VPN or private link
Sensitive data never leaves the private network
Gateway acts as a secure proxy
Use Cases:
Accessing internal databases without exposing them
File system operations on private networks
Compliance requirements (HIPAA, SOC 2)
Legacy system integration
Scaling Strategies
Vertical Scaling (Scale Up)
When to use: Initial growth, simpler operations
# Increase Docker container resources
docker run \
  --memory="4g" \
  --cpus="2.0" \
  secure-mcp-gateway
# Or in docker-compose.yml
services:
  gateway:
    deploy:
      resources:
        limits:
          cpus: '2.0'
          memory: 4G
        reservations:
          cpus: '1.0'
          memory: 2G
Limits: Single machine CPU/memory capacity (~16 cores, 64 GB RAM typical)
Horizontal Scaling (Scale Out)
When to use: High throughput, high availability, distributed clients
Requirements:
External cache (Redis/KeyDB)
Load balancer
Stateless gateway design ✓ (already implemented)
Scaling Steps:
Enable External Cache
Update enkrypt_mcp_config.json:
{
  "common_mcp_gateway_config": {
    "enkrypt_mcp_use_external_cache": true,
    "enkrypt_cache_host": "redis.internal",
    "enkrypt_cache_port": 6379
  }
}
Deploy Redis Cluster
For high availability:
# Redis Sentinel (3 nodes, each with its own sentinel.conf)
docker run -d -v "$(pwd)/sentinel.conf:/etc/sentinel.conf" redis:7 redis-sentinel /etc/sentinel.conf
# Or KeyDB (Redis-compatible, multi-threaded)
docker run -d eqalpha/keydb:latest
Launch Gateway Instances
# Instance 1
docker run -d --name gateway-1 -p 8001:8000 ...
# Instance 2
docker run -d --name gateway-2 -p 8002:8000 ...
# Instance 3
docker run -d --name gateway-3 -p 8003:8000 ...
Configure Load Balancer
Nginx Example:
upstream gateway_backend {
    least_conn;
    server gateway-1:8000 max_fails=3 fail_timeout=30s;
    server gateway-2:8000 max_fails=3 fail_timeout=30s;
    server gateway-3:8000 max_fails=3 fail_timeout=30s;
}
server {
    listen 443 ssl http2;
    server_name gateway.example.com;
    ssl_certificate /etc/nginx/ssl/cert.pem;
    ssl_certificate_key /etc/nginx/ssl/key.pem;
    location /mcp/ {
        proxy_pass http://gateway_backend;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
    }
    location /health {
        proxy_pass http://gateway_backend/health;
    }
}
Auto-Scaling
Kubernetes HPA (Horizontal Pod Autoscaler):
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: gateway-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: secure-mcp-gateway
  minReplicas: 3
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80
Metrics to Monitor:
CPU Utilization: Target 60-70%
Memory Usage: Target < 80%
Request Rate: Requests/second
Response Time: p50, p95, p99 latency
Error Rate: % failed requests
Cache Hit Ratio: > 80% ideal
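Two of these metrics are simple to compute from raw samples; the sketch below uses nearest-rank percentiles (illustrative helpers, not gateway APIs):

```python
def percentile(samples, p):
    """Nearest-rank percentile: e.g. percentile(latencies, 95) gives the
    p95 latency from a list of raw measurements."""
    ordered = sorted(samples)
    rank = max(0, min(len(ordered) - 1, round(p / 100 * len(ordered)) - 1))
    return ordered[rank]

def cache_hit_ratio(hits: int, misses: int) -> float:
    """Fraction of cache lookups served from cache; target > 0.8."""
    total = hits + misses
    return hits / total if total else 0.0
```

In practice Prometheus computes these for you (e.g. via histogram quantiles), but the definitions are the same.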
Cache Configuration
{
  "common_mcp_gateway_config": {
    "enkrypt_tool_cache_expiration": 4,    // 4 hours (tool discovery)
    "enkrypt_gateway_cache_expiration": 24 // 24 hours (gateway config)
  }
}
Tuning Guidelines:
Frequent tool changes: Reduce to 1-2 hours
Stable environments: Increase to 12-24 hours
High cache hit ratio: Longer expiration acceptable
Low memory: Shorter expiration to free space
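The expiration behavior being tuned here is ordinary TTL caching. A minimal sketch with an injectable clock (seconds instead of hours for brevity; this is not the gateway's cache implementation):

```python
import time

class TTLCache:
    """Minimal expiring key-value cache mirroring the tool/gateway-config
    TTL behavior above. Entries past their deadline read as missing."""

    def __init__(self, ttl_seconds: float, clock=time.monotonic):
        self.ttl = ttl_seconds
        self._clock = clock       # injectable for testing
        self._store = {}

    def set(self, key, value):
        self._store[key] = (value, self._clock() + self.ttl)

    def get(self, key, default=None):
        item = self._store.get(key)
        if item is None:
            return default
        value, expires_at = item
        if self._clock() >= expires_at:
            del self._store[key]  # lazily evict on read
            return default
        return value
```

Redis does the same thing server-side via per-key TTLs, which is what the expiration settings above configure.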
Timeout Configuration
{
  "timeout_settings": {
    "default_timeout": 30,
    "guardrail_timeout": 15,
    "auth_timeout": 10,
    "tool_execution_timeout": 60,
    "discovery_timeout": 20,
    "cache_timeout": 5,
    "connectivity_timeout": 2
  }
}
Optimization:
Increase tool_execution_timeout for slow MCP servers
Decrease guardrail_timeout if using fast local guardrails
Set connectivity_timeout low to fail fast on network issues
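Each of these budgets amounts to wrapping an operation in a deadline. A sketch with asyncio (`call_tool_with_timeout` and `slow_tool` are hypothetical names, assuming an async gateway core):

```python
import asyncio

async def call_tool_with_timeout(coro, tool_execution_timeout: float = 60):
    """Enforce a per-call budget like tool_execution_timeout above:
    cancel the call and raise instead of letting one slow MCP server
    hang the whole request."""
    try:
        return await asyncio.wait_for(coro, timeout=tool_execution_timeout)
    except asyncio.TimeoutError:
        raise TimeoutError(
            f"tool call exceeded {tool_execution_timeout}s budget")

async def slow_tool():
    """Stand-in for a real MCP tool call."""
    await asyncio.sleep(0.2)
    return "done"
```

Failing fast this way is also what makes a low connectivity_timeout useful: the error surfaces in seconds instead of tying up a worker.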
Async Guardrails
{
  "enkrypt_async_input_guardrails_enabled": true,
  "enkrypt_async_output_guardrails_enabled": true
}
Benefits:
Reduced latency (guardrails run in parallel)
Improved throughput
Better resource utilization
Trade-offs:
Slightly higher complexity
Potential for race conditions in logging
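The parallel-guardrail idea is plain asyncio fan-out; a sketch with toy checks (`no_secrets` and `max_length` are stand-ins, not real guardrails):

```python
import asyncio

async def no_secrets(text: str) -> bool:
    """Toy guardrail standing in for a real secret-detection check."""
    await asyncio.sleep(0.05)  # simulated guardrail latency
    return "password" not in text

async def max_length(text: str) -> bool:
    """Toy guardrail standing in for a real length/policy check."""
    await asyncio.sleep(0.05)
    return len(text) < 1000

async def run_guardrails_parallel(text: str, guardrails) -> bool:
    """Run all input guardrails concurrently; total latency is roughly
    the slowest single check rather than the sum of all checks."""
    results = await asyncio.gather(*(g(text) for g in guardrails))
    return all(results)
```

The concurrency is also the source of the logging race conditions noted above: checks finish in nondeterministic order, so correlated log lines may interleave.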
High Availability
Components for HA
Multiple Gateway Instances
Minimum 3 instances for quorum
Deploy across availability zones
Use health checks for automatic failover
Redis Sentinel
3-node Sentinel cluster
Automatic failover (< 30s)
Monitors master health
Load Balancer
Active health checks every 10s
Remove unhealthy instances
SSL/TLS termination
Observability
Real-time alerts on failures
Distributed tracing for debugging
Historical metrics for capacity planning
Disaster Recovery
Backup Strategy:
# Backup configuration
secure-mcp-gateway system backup --output backup_$(date +%Y%m%d).json
# Automated daily backups (crontab entry)
0 2 * * * /usr/bin/secure-mcp-gateway system backup --output /backups/backup_$(date +\%Y\%m\%d).json
Restore Procedure:
# Restore from backup
secure-mcp-gateway system restore --file backup_20260304.json
# Verify configuration
secure-mcp-gateway config list
Multi-Region Setup:
Deploy gateway in multiple regions
Use global load balancer (AWS Route 53, Cloudflare)
Replicate Redis across regions (async replication)
Geo-routing for low latency
Cost Optimization
Resource Sizing Guidelines
Deployment    Instances  CPU/Instance  Memory/Instance  Cache            Est. Cost/Month
Development   1          1 core        2 GB             Local            $20-40
Small Prod    2          2 cores       4 GB             Redis (1 node)   $150-250
Medium Prod   3-5        2 cores       8 GB             Redis (3 nodes)  $400-700
Large Prod    5-10       4 cores       16 GB            Redis Cluster    $1000-2000
Cost Reduction Tips:
Use spot/preemptible instances for non-critical environments
Enable auto-scaling to scale down during low traffic
Use reserved instances for predictable workloads (40-60% savings)
Optimize cache expiration to reduce Redis memory
Consider managed services (ECS Fargate, Cloud Run) for operational simplicity
Security Best Practices
Production Security Checklist:
✓ Use HTTPS for all remote connections
✓ Store API keys in secrets manager (AWS Secrets Manager, HashiCorp Vault)
✓ Enable Redis authentication (requirepass)
✓ Use VPC/private subnets for internal services
✓ Enable TLS for OTLP endpoints
✓ Rotate API keys every 90 days
✓ Enable guardrails on production servers
✓ Set up alerting for security events
✓ Regular security audits and penetration testing
✓ Keep dependencies updated
Troubleshooting Production Issues
High Latency
Check cache hit ratio:
# Via gateway tool
enkrypt_get_cache_status()
# Target: > 80% hit ratio
Review timeout metrics:
enkrypt_get_timeout_metrics()
Analyze traces in Jaeger:
Identify slow MCP servers
Look for guardrail bottlenecks
Check network latency
High Error Rate
Check Grafana dashboards:
Error rate by server
Error types (auth, timeout, execution)
Review logs in Loki:
{job="gateway"} |= "ERROR"
Common causes:
MCP server crashes
Network connectivity issues
Invalid API keys
Guardrail blocks
Cache Issues
Verify Redis connectivity:
redis-cli -h redis.internal ping
# Expected: PONG
Check Redis memory:
redis-cli -h redis.internal info memory
Clear cache if stale:
secure-mcp-gateway cache clear --all
Next Steps
External Cache Setup: Configure Redis/KeyDB for distributed caching
Load Balancing: Set up Nginx or cloud load balancers
Monitoring & Alerting: Configure alerts for production incidents
Security Hardening: Implement production security controls