Overview
The Secure MCP Gateway supports multiple deployment patterns to meet different scalability, availability, and security requirements. This guide covers common architectures from single-instance local deployments to distributed cloud deployments.
Architecture Patterns
1. Local Development
Use Case: Development, testing, personal use
┌─────────────────┐
│ MCP Client │ (Claude Desktop, Cursor)
│ (stdio) │
└────────┬────────┘
│
▼
┌─────────────────┐
│ Gateway │ (localhost)
│ (pip install) │
└────────┬────────┘
│
▼
┌─────────────────┐
│ MCP Servers │ (local processes)
│ (stdio/http) │
└─────────────────┘
Characteristics:
Single machine deployment
Stdio transport for client communication
Local in-memory cache
Ideal for: Development, testing, single-user scenarios
Setup:
# Install with pip
pip install secure-mcp-gateway
secure-mcp-gateway generate-config
secure-mcp-gateway install --client claude-desktop
This is the simplest deployment pattern. Configuration is stored locally at ~/.enkrypt/enkrypt_mcp_config.json.
2. Docker Single Instance
Use Case: Production single-server, consistent environment
┌─────────────────┐
│ MCP Client │
│ (stdio) │
└────────┬────────┘
│
▼
┌─────────────────────────────────┐
│ Docker Container │
│ ┌─────────────────────────┐ │
│ │ Gateway │ │
│ └─────────────────────────┘ │
│ ┌─────────────────────────┐ │
│ │ Config Volume │ │
│ │ ~/.enkrypt/docker │ │
│ └─────────────────────────┘ │
└────────┬────────────────────────┘
│
▼
┌─────────────────┐
│ MCP Servers │
└─────────────────┘
Characteristics:
Containerized gateway with volume-mounted config
Includes Python 3.12 and Node.js 22.x runtimes
Supports both Python and JavaScript MCP servers
Local cache within container
Setup:
# Build image
docker build -t secure-mcp-gateway .
# Generate config
docker run --rm \
-v ~/.enkrypt/docker:/app/.enkrypt/docker \
--entrypoint python \
secure-mcp-gateway \
-m secure_mcp_gateway.cli generate-config
# Run gateway
docker run -d \
--name secure-mcp-gateway \
-p 8000:8000 \
-v ~/.enkrypt/docker:/app/.enkrypt/docker \
-e ENKRYPT_GATEWAY_KEY="..." \
secure-mcp-gateway
3. Docker Compose with Observability
Use Case: Production with monitoring, debugging, team deployments
┌─────────────────┐
│ MCP Client │
└────────┬────────┘
│
▼
┌─────────────────────────────────────────────────┐
│ Docker Compose Stack │
│ │
│ ┌──────────┐ ┌──────────────────────┐ │
│ │ Gateway │───▶│ OpenTelemetry │ │
│ │ (Port │ │ Collector │ │
│ │ 8000) │ │ (OTLP 4317, 4318) │ │
│ └──────────┘ └───┬──────────┬───────┘ │
│ │ │ │
│ ┌──────────────┴─┐ ┌───┴──────────┐ │
│ │ Jaeger │ │ Loki │ │
│ │ (Traces) │ │ (Logs) │ │
│ │ Port 16686 │ │ Port 3100 │ │
│ └────────┬───────┘ └───┬──────────┘ │
│ │ │ │
│ │ ┌───────────┴───┐ │
│ │ │ Prometheus │ │
│ │ │ (Metrics) │ │
│ │ │ Port 9090 │ │
│ │ └───────┬───────┘ │
│ │ │ │
│ └────────────┼─────────────┐ │
│ ▼ │ │
│ ┌──────────────┐ │ │
│ │ Grafana │ │ │
│ │ (Dashboards)│ │ │
│ │ Port 3000 │ │ │
│ └──────────────┘ │ │
└─────────────────────────────────────────────────┘
Characteristics:
Complete observability stack
Distributed tracing with Jaeger
Log aggregation with Loki
Metrics with Prometheus
Unified dashboards in Grafana
Inter-service communication via Docker network
Setup:
# Navigate to infrastructure directory
cd infra
# Start all services
docker compose up -d
# Access dashboards
# Grafana: http://localhost:3000
# Jaeger: http://localhost:16686
# Prometheus: http://localhost:9090
Observability Features:
Distributed Tracing: Track requests across the gateway, clients, and MCP servers. Visualize latency bottlenecks.
Log Aggregation: Centralized logs from all services. Query with LogQL, correlate with traces.
Metrics Monitoring: Request rates, error rates, cache hit ratios, timeout metrics, OAuth metrics.
Grafana Dashboards: Pre-configured dashboards for gateway health, performance, and security insights.
4. Distributed with External Cache
Use Case: Multi-instance deployments, high availability, load balancing
┌──────────┐ ┌──────────┐ ┌──────────┐
│ Client │ │ Client │ │ Client │
│ A │ │ B │ │ C │
└────┬─────┘ └────┬─────┘ └────┬─────┘
│ │ │
└─────────────┼─────────────┘
│
▼
┌────────────────┐
│ Load Balancer │
└───────┬────────┘
│
┌───────────┼───────────┐
│ │ │
▼ ▼ ▼
┌──────────┐┌──────────┐┌──────────┐
│ Gateway ││ Gateway ││ Gateway │
│ Instance ││ Instance ││ Instance │
│ 1 ││ 2 ││ 3 │
└─────┬────┘└─────┬────┘└─────┬────┘
│ │ │
└───────────┼───────────┘
▼
┌─────────────────┐
│ Redis/KeyDB │
│ External Cache │
└─────────────────┘
│
┌───────────┼───────────┐
▼ ▼ ▼
┌──────────┐┌──────────┐┌──────────┐
│ MCP ││ MCP ││ MCP │
│ Server A ││ Server B ││ Server C │
└──────────┘└──────────┘└──────────┘
Characteristics:
Multiple gateway instances behind load balancer
Shared Redis/KeyDB for distributed caching
Session affinity not required (stateless gateway)
Horizontal scaling for high throughput
Fault tolerance via redundancy
Configuration:
{
  "common_mcp_gateway_config": {
    "enkrypt_mcp_use_external_cache": true,
    "enkrypt_cache_host": "redis.internal.example.com",
    "enkrypt_cache_port": 6379,
    "enkrypt_cache_db": 0,
    "enkrypt_cache_password": "your_redis_password",
    "enkrypt_tool_cache_expiration": 4,
    "enkrypt_gateway_cache_expiration": 24
  }
}
Cache Keys (MD5 hashed for security):
Tool cache: mcp_tools:{config_id}:{server_name}
Gateway config: mcp_gateway_config:{config_id}
API key mapping: mcp_key_to_id:{gateway_key}
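The listed key formats can be sketched in a few lines of Python. The `cache_key` helper below is illustrative (the gateway's actual hashing code may differ), but it shows how MD5 hashing keeps raw config IDs and server names out of Redis:

```python
import hashlib

def cache_key(prefix: str, *parts: str) -> str:
    """Build a key like mcp_tools:{config_id}:{server_name}, MD5-hashing
    the variable parts so raw identifiers never appear in the cache.
    Illustrative sketch only; not the gateway's actual implementation."""
    digest = hashlib.md5(":".join(parts).encode("utf-8")).hexdigest()
    return f"{prefix}:{digest}"

key = cache_key("mcp_tools", "config-123", "github-server")
```

The same derivation applies to the gateway-config and API-key-mapping prefixes above.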
Scaling Considerations:
Cache Sizing
Estimate cache size based on:
Number of MCP servers: N
Average tools per server: T
Cache entry size: ~1 KB per tool, plus ~5 KB of config per server
Total: N * (T * 1 KB + 5 KB config)
Example: 100 servers, 20 tools each = 100 * (20 + 5) KB ≈ 2.5 MB cache
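That formula is easy to turn into a quick estimator; `estimate_cache_kb` below is a hypothetical helper for capacity planning, not part of the gateway:

```python
def estimate_cache_kb(num_servers: int, tools_per_server: int,
                      kb_per_tool: float = 1.0, kb_config: float = 5.0) -> float:
    """Apply the sizing formula above: N * (T * 1 KB + 5 KB config).
    Returns the estimated cache footprint in KB."""
    return num_servers * (tools_per_server * kb_per_tool + kb_config)

# 100 servers with 20 tools each -> 2500 KB (~2.5 MB)
size_kb = estimate_cache_kb(100, 20)
```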
Redis Configuration
Recommended Redis/KeyDB settings:
maxmemory 1gb
maxmemory-policy allkeys-lru
save ""
appendonly yes
tcp-keepalive 60
Load Balancer Setup
Use HTTP health check: GET /health (port 8000)
No session affinity needed (stateless)
Timeout: 30s (or default_timeout from config)
Round-robin or least-connections algorithm
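The round-robin-with-health-checks behavior above can be sketched as a toy Python class. Real load balancers (Nginx, ALB) implement this for you; `RoundRobinPool` is purely illustrative:

```python
from itertools import cycle

class RoundRobinPool:
    """Toy round-robin selector over gateway instances that skips any
    instance marked unhealthy by a /health probe. Illustrative only."""

    def __init__(self, instances):
        self.healthy = set(instances)
        self._ring = cycle(instances)

    def mark_down(self, instance):
        """Called when the health check fails for an instance."""
        self.healthy.discard(instance)

    def mark_up(self, instance):
        """Called when the instance passes health checks again."""
        self.healthy.add(instance)

    def pick(self):
        """Return the next healthy instance in round-robin order."""
        if not self.healthy:
            raise RuntimeError("no healthy gateway instances")
        while True:
            candidate = next(self._ring)
            if candidate in self.healthy:
                return candidate
```

Because the gateway is stateless, any instance can serve any request, which is exactly why no session affinity is needed.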
5. Remote Gateway Deployment
Use Case: Centralized gateway, remote MCP clients, cloud deployment
┌────────────────────────────────────────┐
│ Local Environment │
│ ┌──────────┐ ┌──────────┐ │
│ │ Claude │ │ Cursor │ │
│ │ Desktop │ │ │ │
│ └────┬─────┘ └────┬─────┘ │
│ │ │ │
└───────┼─────────────┼──────────────────┘
│ │
│ HTTPS │
│ (TLS) │
▼ ▼
┌─────────────────────────────────────────┐
│ Cloud Environment (AWS/GCP/Azure) │
│ │
│ ┌───────────────────────────────────┐ │
│ │ Load Balancer (TLS Termination) │ │
│ └──────────────┬────────────────────┘ │
│ │ │
│ ┌────────┼────────┐ │
│ ▼ ▼ ▼ │
│ ┌────────┬────────┬────────┐ │
│ │Gateway │Gateway │Gateway │ │
│ │ 1 │ 2 │ 3 │ │
│ └───┬────┴───┬────┴───┬────┘ │
│ │ │ │ │
│ └────────┼────────┘ │
│ ▼ │
│ ┌──────────────┐ │
│ │ Redis Cache │ │
│ └──────────────┘ │
│ │ │
│ ┌────────┼────────┐ │
│ ▼ ▼ ▼ │
│ ┌────────┬────────┬────────┐ │
│ │ MCP │ MCP │ MCP │ │
│ │Server A│Server B│Server C│ │
│ └────────┴────────┴────────┘ │
│ │
│ ┌────────────────────────────────┐ │
│ │ Observability Stack │ │
│ │ (OTEL, Prometheus, Grafana) │ │
│ └────────────────────────────────┘ │
└─────────────────────────────────────────┘
Characteristics:
Gateway hosted in cloud (AWS, GCP, Azure)
Clients connect via HTTPS (streamable HTTP transport)
TLS termination at load balancer
Distributed caching with managed Redis
Centralized monitoring and logging
API key authentication over HTTPS
Gateway Configuration:
# Enable streamable HTTP transport
if __name__ == "__main__":
    mcp.run(
        transport="streamable-http",
        host="0.0.0.0",
        port=8000,
        path="/mcp/"
    )
Client Configuration:
claude_desktop_config.json
{
  "mcpServers": {
    "Remote Secure Gateway": {
      "url": "https://gateway.example.com/mcp/",
      "headers": {
        "Authorization": "Bearer YOUR_GATEWAY_API_KEY"
      }
    }
  }
}
Security Considerations:
Always use HTTPS for remote deployments
Store API keys in environment variables or secrets manager
Enable rate limiting at load balancer
Use VPC/private subnets for internal services
Rotate API keys regularly with CLI: secure-mcp-gateway apikey rotate
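The environment-variable recommendation can be as simple as failing fast when the key is absent. A minimal sketch (the `load_gateway_key` helper is illustrative, though `ENKRYPT_GATEWAY_KEY` is the variable used earlier in this guide):

```python
import os

def load_gateway_key() -> str:
    """Read the gateway API key from the environment instead of
    hardcoding it in config files or source; fail fast with a clear
    error if it is missing."""
    key = os.environ.get("ENKRYPT_GATEWAY_KEY")
    if not key:
        raise RuntimeError(
            "ENKRYPT_GATEWAY_KEY is not set; export it or inject it "
            "from your secrets manager")
    return key
```

In cloud deployments the variable would typically be injected by the secrets manager (e.g. as an ECS task secret) rather than set by hand.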
Cloud Deployment Examples:
AWS Architecture:
ALB: Application Load Balancer with TLS
ECS Fargate: Containerized gateway instances
ElastiCache Redis: Managed cache cluster
CloudWatch: Logs and metrics
X-Ray: Distributed tracing
Terraform Example:
resource "aws_ecs_service" "gateway" {
  name            = "secure-mcp-gateway"
  cluster         = aws_ecs_cluster.main.id
  task_definition = aws_ecs_task_definition.gateway.arn
  desired_count   = 3
  load_balancer {
    target_group_arn = aws_lb_target_group.gateway.arn
    container_name   = "gateway"
    container_port   = 8000
  }
  network_configuration {
    subnets         = aws_subnet.private[*].id
    security_groups = [aws_security_group.gateway.id]
  }
}
resource "aws_elasticache_cluster" "cache" {
  cluster_id           = "mcp-gateway-cache"
  engine               = "redis"
  node_type            = "cache.t3.small"
  num_cache_nodes      = 1
  parameter_group_name = "default.redis7"
  port                 = 6379
}
GCP Architecture:
Cloud Load Balancing: HTTPS load balancer
Cloud Run: Serverless containers
Memorystore: Managed Redis
Cloud Logging: Centralized logs
Cloud Trace: Distributed tracing
gcloud Deploy:
# Build and push image
gcloud builds submit --tag gcr.io/PROJECT_ID/secure-mcp-gateway
# Deploy to Cloud Run
gcloud run deploy secure-mcp-gateway \
  --image gcr.io/PROJECT_ID/secure-mcp-gateway \
  --platform managed \
  --region us-central1 \
  --allow-unauthenticated \
  --set-env-vars "ENKRYPT_CACHE_HOST=REDIS_IP"
Azure Architecture:
Application Gateway: HTTPS load balancer
Container Instances: Managed containers
Azure Cache for Redis: Managed Redis
Application Insights: Monitoring
Log Analytics: Log aggregation
Azure CLI Deploy:
# Create container instance
az container create \
  --resource-group mcp-gateway-rg \
  --name secure-mcp-gateway \
  --image secure-mcp-gateway:latest \
  --cpu 2 --memory 4 \
  --ports 8000 \
  --environment-variables \
    ENKRYPT_CACHE_HOST=$REDIS_HOST \
    ENKRYPT_CACHE_PASSWORD=$REDIS_PASSWORD
6. Hybrid Deployment
Use Case: Gateway in cloud, MCP servers on-premise or distributed
┌─────────────────────────────────────────┐
│ Cloud (Gateway + Observability) │
│ ┌──────────┐ ┌──────────────┐ │
│ │ Gateway │──│ Redis Cache │ │
│ └────┬─────┘ └──────────────┘ │
└───────┼──────────────────────────────────┘
│
│ VPN / Private Link
│
┌───────┼──────────────────────────────────┐
│ On-Premise / Edge │
│ │ │
│ ┌────┴─────────┬─────────────┐ │
│ ▼ ▼ ▼ │
│ ┌──────────┬──────────┬──────────┐ │
│ │Internal │Database │File │ │
│ │API MCP │MCP │System MCP│ │
│ └──────────┴──────────┴──────────┘ │
└─────────────────────────────────────────┘
Characteristics:
Gateway in cloud for scalability
MCP servers on-premise for data security
Secure connectivity via VPN or private link
Sensitive data never leaves the private network
Gateway acts as a secure proxy
Use Cases:
Accessing internal databases without exposing them
File system operations on private networks
Compliance requirements (HIPAA, SOC 2)
Legacy system integration
Scaling Strategies
Vertical Scaling (Scale Up)
When to use: Initial growth, simpler operations
# Increase Docker container resources
docker run \
  --memory="4g" \
  --cpus="2.0" \
  secure-mcp-gateway
# Or in docker-compose.yml
services:
  gateway:
    deploy:
      resources:
        limits:
          cpus: '2.0'
          memory: 4G
        reservations:
          cpus: '1.0'
          memory: 2G
Limits: Single machine CPU/memory capacity (~16 cores, 64 GB RAM typical)
Horizontal Scaling (Scale Out)
When to use: High throughput, high availability, distributed clients
Requirements:
External cache (Redis/KeyDB)
Load balancer
Stateless gateway design ✓ (already implemented)
Scaling Steps:
Enable External Cache
Update enkrypt_mcp_config.json:
{
  "common_mcp_gateway_config": {
    "enkrypt_mcp_use_external_cache": true,
    "enkrypt_cache_host": "redis.internal",
    "enkrypt_cache_port": 6379
  }
}
Deploy Redis Cluster
For high availability:
# Redis Sentinel (3 nodes, each with its own sentinel.conf)
docker run -d -v "$(pwd)/sentinel.conf:/etc/sentinel.conf" redis:7 redis-sentinel /etc/sentinel.conf
# Or KeyDB (Redis-compatible, multi-threaded)
docker run -d eqalpha/keydb:latest
Launch Gateway Instances
# Instance 1
docker run -d --name gateway-1 -p 8001:8000 ...
# Instance 2
docker run -d --name gateway-2 -p 8002:8000 ...
# Instance 3
docker run -d --name gateway-3 -p 8003:8000 ...
Configure Load Balancer
Nginx Example:
upstream gateway_backend {
    least_conn;
    server gateway-1:8000 max_fails=3 fail_timeout=30s;
    server gateway-2:8000 max_fails=3 fail_timeout=30s;
    server gateway-3:8000 max_fails=3 fail_timeout=30s;
}
server {
    listen 443 ssl http2;
    server_name gateway.example.com;
    ssl_certificate /etc/nginx/ssl/cert.pem;
    ssl_certificate_key /etc/nginx/ssl/key.pem;
    location /mcp/ {
        proxy_pass http://gateway_backend;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
    }
    location /health {
        proxy_pass http://gateway_backend/health;
    }
}
Auto-Scaling
Kubernetes HPA (Horizontal Pod Autoscaler):
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: gateway-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: secure-mcp-gateway
  minReplicas: 3
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80
Metrics to Monitor:
CPU Utilization: Target 60-70%
Memory Usage: Target < 80%
Request Rate: Requests/second
Response Time: p50, p95, p99 latency
Error Rate: % failed requests
Cache Hit Ratio: > 80% ideal
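Two of these metrics are simple to compute from raw samples; the sketch below uses nearest-rank percentiles (illustrative helpers, not gateway APIs):

```python
def percentile(samples, p):
    """Nearest-rank percentile: e.g. percentile(latencies, 95) gives the
    p95 latency from a list of raw measurements."""
    ordered = sorted(samples)
    rank = max(0, min(len(ordered) - 1, round(p / 100 * len(ordered)) - 1))
    return ordered[rank]

def cache_hit_ratio(hits: int, misses: int) -> float:
    """Fraction of cache lookups served from cache; target > 0.8."""
    total = hits + misses
    return hits / total if total else 0.0
```

In practice Prometheus computes these for you (e.g. via histogram quantiles), but the definitions are the same.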
Cache Configuration
{
  "common_mcp_gateway_config": {
    "enkrypt_tool_cache_expiration": 4,    // 4 hours (tool discovery)
    "enkrypt_gateway_cache_expiration": 24 // 24 hours (gateway config)
  }
}
Tuning Guidelines:
Frequent tool changes: Reduce to 1-2 hours
Stable environments: Increase to 12-24 hours
High cache hit ratio: Longer expiration acceptable
Low memory: Shorter expiration to free space
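The expiration behavior being tuned here is ordinary TTL caching. A minimal sketch with an injectable clock (seconds instead of hours for brevity; this is not the gateway's cache implementation):

```python
import time

class TTLCache:
    """Minimal expiring key-value cache mirroring the tool/gateway-config
    TTL behavior above. Entries past their deadline read as missing."""

    def __init__(self, ttl_seconds: float, clock=time.monotonic):
        self.ttl = ttl_seconds
        self._clock = clock       # injectable for testing
        self._store = {}

    def set(self, key, value):
        self._store[key] = (value, self._clock() + self.ttl)

    def get(self, key, default=None):
        item = self._store.get(key)
        if item is None:
            return default
        value, expires_at = item
        if self._clock() >= expires_at:
            del self._store[key]  # lazily evict on read
            return default
        return value
```

Redis does the same thing server-side via per-key TTLs, which is what the expiration settings above configure.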
Timeout Configuration
{
  "timeout_settings": {
    "default_timeout": 30,
    "guardrail_timeout": 15,
    "auth_timeout": 10,
    "tool_execution_timeout": 60,
    "discovery_timeout": 20,
    "cache_timeout": 5,
    "connectivity_timeout": 2
  }
}
Optimization:
Increase tool_execution_timeout for slow MCP servers
Decrease guardrail_timeout if using fast local guardrails
Set connectivity_timeout low to fail fast on network issues
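Each of these budgets amounts to wrapping an operation in a deadline. A sketch with asyncio (`call_tool_with_timeout` and `slow_tool` are hypothetical names, assuming an async gateway core):

```python
import asyncio

async def call_tool_with_timeout(coro, tool_execution_timeout: float = 60):
    """Enforce a per-call budget like tool_execution_timeout above:
    cancel the call and raise instead of letting one slow MCP server
    hang the whole request."""
    try:
        return await asyncio.wait_for(coro, timeout=tool_execution_timeout)
    except asyncio.TimeoutError:
        raise TimeoutError(
            f"tool call exceeded {tool_execution_timeout}s budget")

async def slow_tool():
    """Stand-in for a real MCP tool call."""
    await asyncio.sleep(0.2)
    return "done"
```

Failing fast this way is also what makes a low connectivity_timeout useful: the error surfaces in seconds instead of tying up a worker.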
Async Guardrails
{
  "enkrypt_async_input_guardrails_enabled": true,
  "enkrypt_async_output_guardrails_enabled": true
}
Benefits:
Reduced latency (guardrails run in parallel)
Improved throughput
Better resource utilization
Trade-offs:
Slightly higher complexity
Potential for race conditions in logging
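The parallel-guardrail idea is plain asyncio fan-out; a sketch with toy checks (`no_secrets` and `max_length` are stand-ins, not real guardrails):

```python
import asyncio

async def no_secrets(text: str) -> bool:
    """Toy guardrail standing in for a real secret-detection check."""
    await asyncio.sleep(0.05)  # simulated guardrail latency
    return "password" not in text

async def max_length(text: str) -> bool:
    """Toy guardrail standing in for a real length/policy check."""
    await asyncio.sleep(0.05)
    return len(text) < 1000

async def run_guardrails_parallel(text: str, guardrails) -> bool:
    """Run all input guardrails concurrently; total latency is roughly
    the slowest single check rather than the sum of all checks."""
    results = await asyncio.gather(*(g(text) for g in guardrails))
    return all(results)
```

The concurrency is also the source of the logging race conditions noted above: checks finish in nondeterministic order, so correlated log lines may interleave.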
High Availability
Components for HA
Multiple Gateway Instances
Minimum 3 instances for quorum
Deploy across availability zones
Use health checks for automatic failover
Redis Sentinel
3-node Sentinel cluster
Automatic failover (< 30s)
Monitors master health
Load Balancer
Active health checks every 10s
Remove unhealthy instances
SSL/TLS termination
Observability
Real-time alerts on failures
Distributed tracing for debugging
Historical metrics for capacity planning
Disaster Recovery
Backup Strategy:
# Backup configuration
secure-mcp-gateway system backup --output backup_$(date +%Y%m%d).json
# Automated daily backups (crontab entry)
0 2 * * * /usr/bin/secure-mcp-gateway system backup --output /backups/backup_$(date +\%Y\%m\%d).json
Restore Procedure:
# Restore from backup
secure-mcp-gateway system restore --file backup_20260304.json
# Verify configuration
secure-mcp-gateway config list
Multi-Region Setup:
Deploy gateway in multiple regions
Use global load balancer (AWS Route 53, Cloudflare)
Replicate Redis across regions (async replication)
Geo-routing for low latency
Cost Optimization
Resource Sizing Guidelines
Deployment    Instances  CPU/Instance  Memory/Instance  Cache            Est. Cost/Month
Development   1          1 core        2 GB             Local            $20-40
Small Prod    2          2 cores       4 GB             Redis (1 node)   $150-250
Medium Prod   3-5        2 cores       8 GB             Redis (3 nodes)  $400-700
Large Prod    5-10       4 cores       16 GB            Redis Cluster    $1000-2000
Cost Reduction Tips:
Use spot/preemptible instances for non-critical environments
Enable auto-scaling to scale down during low traffic
Use reserved instances for predictable workloads (40-60% savings)
Optimize cache expiration to reduce Redis memory
Consider managed services (ECS Fargate, Cloud Run) for operational simplicity
Security Best Practices
Production Security Checklist:
✓ Use HTTPS for all remote connections
✓ Store API keys in secrets manager (AWS Secrets Manager, HashiCorp Vault)
✓ Enable Redis authentication (requirepass)
✓ Use VPC/private subnets for internal services
✓ Enable TLS for OTLP endpoints
✓ Rotate API keys every 90 days
✓ Enable guardrails on production servers
✓ Set up alerting for security events
✓ Regular security audits and penetration testing
✓ Keep dependencies updated
Troubleshooting Production Issues
High Latency
Check cache hit ratio:
# Via gateway tool
enkrypt_get_cache_status()
# Target: > 80% hit ratio
Review timeout metrics:
enkrypt_get_timeout_metrics()
Analyze traces in Jaeger:
Identify slow MCP servers
Look for guardrail bottlenecks
Check network latency
High Error Rate
Check Grafana dashboards:
Error rate by server
Error types (auth, timeout, execution)
Review logs in Loki:
{job="gateway"} |= "ERROR"
Common causes:
MCP server crashes
Network connectivity issues
Invalid API keys
Guardrail blocks
Cache Issues
Verify Redis connectivity:
redis-cli -h redis.internal ping
# Expected: PONG
Check Redis memory:
redis-cli -h redis.internal info memory
Clear cache if stale:
secure-mcp-gateway cache clear --all
Next Steps
External Cache Setup: Configure Redis/KeyDB for distributed caching
Load Balancing: Set up Nginx or cloud load balancers
Monitoring & Alerting: Configure alerts for production incidents
Security Hardening: Implement production security controls