Overview
The Secure MCP Gateway uses OpenTelemetry as its primary observability framework, providing:
- Structured Logging via OTLP log export to Loki
- Distributed Tracing with context propagation to Jaeger
- Metrics Collection with Prometheus export
- Unified Telemetry through the OpenTelemetry Collector
OpenTelemetry Provider
The OpenTelemetryProvider implements full OpenTelemetry support:
Location: src/secure_mcp_gateway/plugins/telemetry/opentelemetry_provider.py
Key Features
- OTLP Export: gRPC and HTTP protocols supported
- Resource Attributes: Service name, version, environment metadata
- Batch Processing: Efficient batching of telemetry data
- Connectivity Check: Automatic endpoint validation on startup
- Graceful Degradation: Falls back to no-op if collector unavailable
Provider Implementation
class OpenTelemetryProvider(TelemetryProvider):
    def __init__(self, config: dict[str, Any] | None = None):
        self._initialized = False
        self._logger = None
        self._tracer = None
        self._meter = None
        self._resource = None
        if config:
            self.initialize(config)

    def initialize(self, config: dict[str, Any]) -> TelemetryResult:
        # Extract configuration
        enabled = self._check_telemetry_enabled(config)
        endpoint = config.get("url", "http://localhost:4317")
        insecure = config.get("insecure", True)
        service_name = config.get("service_name", "secure-mcp-gateway")
        job_name = config.get("job_name", "enkryptai")

        if enabled:
            self._setup_enabled_telemetry(
                endpoint, insecure, service_name, job_name, config
            )
        else:
            self._setup_disabled_telemetry()

        self._initialized = True
        return TelemetryResult(success=True, provider_name=self.name)
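The graceful-degradation path (`_setup_disabled_telemetry`) swaps in no-op objects so call sites never need to branch on whether telemetry is enabled. A stdlib-only sketch of that pattern (the `NoOpTracer`/`NoOpSpan` names here are illustrative, not the gateway's actual classes):

```python
from contextlib import contextmanager


class NoOpSpan:
    """Accepts the same calls as a real span but records nothing."""

    def set_attribute(self, key, value):
        pass


class NoOpTracer:
    """Stand-in tracer used when telemetry is disabled or unreachable."""

    @contextmanager
    def start_as_current_span(self, name):
        yield NoOpSpan()


def get_tracer(telemetry_enabled: bool):
    # Hypothetical selector: a real OpenTelemetry tracer when enabled,
    # the no-op stand-in otherwise.
    if telemetry_enabled:
        raise NotImplementedError("wire up the real OpenTelemetry tracer here")
    return NoOpTracer()


tracer = get_tracer(telemetry_enabled=False)
with tracer.start_as_current_span("tool_execution") as span:
    span.set_attribute("server_name", "github_server")  # silently ignored
```

Because the no-op objects mirror the real API, instrumentation code stays identical whether the collector is reachable or not.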
Configuration
Basic Configuration
Add telemetry configuration to enkrypt_mcp_config.json:
{
  "plugins": {
    "telemetry": {
      "provider": "opentelemetry",
      "config": {
        "enabled": true,
        "url": "http://localhost:4317",
        "insecure": true,
        "service_name": "secure-mcp-gateway",
        "job_name": "enkryptai"
      }
    }
  },
  "common_mcp_gateway_config": {
    "enkrypt_log_level": "INFO"
  }
}
Configuration Options
enabled
boolean
default: false
Enable OpenTelemetry telemetry. When false, uses no-op implementations.
url
string
default: "http://localhost:4317"
OTLP endpoint URL. Supports:
- gRPC: http://localhost:4317 (default)
- HTTP: http://localhost:4318
insecure
boolean
default: true
Use an insecure connection (no TLS). Set to false for production with TLS.
service_name
string
default: "secure-mcp-gateway"
Service name in resource attributes. Used for filtering in Grafana/Jaeger.
job_name
string
default: "enkryptai"
Job name for Prometheus metrics and resource attributes.
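Since the provider resolves each option with a plain `dict.get` lookup (see `initialize` above), a partial config simply overlays the documented defaults. A small sketch of that resolution (illustrative, not gateway code):

```python
# Defaults mirroring the documented option values above
DEFAULTS = {
    "enabled": False,
    "url": "http://localhost:4317",
    "insecure": True,
    "service_name": "secure-mcp-gateway",
    "job_name": "enkryptai",
}


def resolve_config(user_config: dict) -> dict:
    """Overlay user-supplied options on the documented defaults."""
    return {**DEFAULTS, **user_config}


resolved = resolve_config({"enabled": True, "url": "http://localhost:4318"})
# Only the supplied keys are overridden; e.g. resolved["insecure"] stays True
```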
Production Configuration
For production deployments with TLS:
{
  "plugins": {
    "telemetry": {
      "provider": "opentelemetry",
      "config": {
        "enabled": true,
        "url": "https://otel-collector.example.com:4318",
        "insecure": false,
        "service_name": "secure-mcp-gateway-prod",
        "job_name": "production"
      }
    }
  },
  "common_mcp_gateway_config": {
    "enkrypt_log_level": "WARNING"
  }
}
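A sanity check worth running on production configs: `insecure: false` only makes sense against an `https://` endpoint. A small validation sketch (the `validate_tls_config` helper is illustrative, not part of the gateway):

```python
from urllib.parse import urlparse


def validate_tls_config(config: dict) -> None:
    """Raise if the TLS setting and the URL scheme contradict each other."""
    scheme = urlparse(config.get("url", "http://localhost:4317")).scheme
    insecure = config.get("insecure", True)
    if not insecure and scheme != "https":
        raise ValueError("insecure=false requires an https:// endpoint")
    if insecure and scheme == "https":
        # Not fatal, but usually a mistake worth flagging.
        print("warning: https endpoint configured with insecure=true")


# Passes: TLS endpoint with insecure disabled
validate_tls_config({"url": "https://otel-collector.example.com:4318", "insecure": False})
```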
OpenTelemetry Collector Setup
Using Docker Compose
The gateway includes a complete observability stack:
cd infra/
docker-compose up -d
This starts:
- OpenTelemetry Collector (ports 4317, 4318, 8889)
- Jaeger (port 16686)
- Loki (port 3100)
- Prometheus (port 9090)
- Grafana (port 3000)
Collector Configuration
Location: infra/otel_collector/otel-collector-config.yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318

processors:
  batch:
    timeout: 1s
    send_batch_size: 1024

exporters:
  # Traces to Jaeger
  otlp:
    endpoint: jaeger:4317
    tls:
      insecure: true

  # Logs to Loki
  otlphttp/loki:
    endpoint: "http://loki:3100/otlp"
    tls:
      insecure: true

  # Metrics to Prometheus
  prometheus:
    endpoint: "0.0.0.0:8889"
    namespace: "otel"
    const_labels:
      service_name: "secure-mcp-gateway"

  # Console output for the debug exporter referenced in the pipelines below
  debug: {}

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlp, debug]
    metrics:
      receivers: [otlp]
      processors: [batch]
      exporters: [prometheus, debug]
    logs:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlphttp/loki, debug]
Manual Installation
If not using Docker Compose:
# Download collector
wget https://github.com/open-telemetry/opentelemetry-collector-releases/releases/download/v0.134.1/otelcol-contrib_0.134.1_linux_amd64.tar.gz
tar -xzf otelcol-contrib_0.134.1_linux_amd64.tar.gz
# Create config file (use example above)
vim otel-collector-config.yaml
# Run collector
./otelcol-contrib --config=otel-collector-config.yaml
Distributed Tracing
Trace Context Propagation
The gateway automatically propagates trace context across operations:
from secure_mcp_gateway.plugins.telemetry import get_telemetry_config_manager

telemetry_manager = get_telemetry_config_manager()
tracer = telemetry_manager.get_tracer()

# Create a span
with tracer.start_as_current_span("tool_execution") as span:
    span.set_attribute("server_name", "github_server")
    span.set_attribute("tool_name", "create_issue")
    span.set_attribute("user_id", user_id)

    # Execute operation
    result = execute_tool(server_name, tool_name, args)

    # Add result attributes
    span.set_attribute("success", result.success)
    span.set_attribute("duration_ms", result.duration)
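How attributes like `duration_ms` end up on a span can be illustrated with a stdlib-only context manager (a sketch of the pattern, not the gateway's tracer; `timed_span` is a hypothetical helper):

```python
import time
from contextlib import contextmanager


@contextmanager
def timed_span(name: str, **attrs):
    """Yield a dict of span attributes and stamp duration_ms on exit."""
    record = {"name": name, **attrs}
    start = time.monotonic()
    try:
        yield record
    finally:
        # Stamped even if the body raises, like a real span's end time
        record["duration_ms"] = int((time.monotonic() - start) * 1000)


with timed_span("tool_execution", server_name="github_server") as span:
    span["tool_name"] = "create_issue"
# span now carries name, server_name, tool_name, and duration_ms
```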
Trace Attributes
Common attributes used in gateway traces:
| Attribute | Type | Description |
| --- | --- | --- |
| server_name | string | MCP server name |
| tool_name | string | Tool being executed |
| user_id | string | User identifier |
| project_id | string | Project identifier |
| custom_id | string | Request correlation ID |
| duration_ms | int | Operation duration |
| success | boolean | Operation success status |
| error_type | string | Error type if failed |
Viewing Traces in Jaeger
1. Open the Jaeger UI: http://localhost:16686
2. Select service: secure-mcp-gateway
3. Search by:
   - Operation name (e.g., tool_execution)
   - Tags (e.g., server_name=github_server)
   - Duration (e.g., slow traces > 1s)
Trace Examples
Tool Execution Trace:
tool_execution [250ms]
├── authenticate [10ms]
│ └── cache_lookup [2ms]
├── input_guardrails [30ms]
│ ├── pii_detection [15ms]
│ └── toxicity_check [15ms]
├── forward_to_server [180ms]
│ ├── discover_tools [50ms]
│ └── call_tool [130ms]
└── output_guardrails [30ms]
├── relevancy_check [15ms]
└── adherence_check [15ms]
OTLP Export
gRPC Export (Default)
Default configuration uses gRPC:
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
from opentelemetry.exporter.otlp.proto.grpc.metric_exporter import OTLPMetricExporter
from opentelemetry.exporter.otlp.proto.grpc._log_exporter import OTLPLogExporter

# Traces
otlp_exporter = OTLPSpanExporter(
    endpoint="localhost:4317",
    insecure=True
)

# Metrics
metric_exporter = OTLPMetricExporter(
    endpoint="localhost:4317",
    insecure=True
)

# Logs
log_exporter = OTLPLogExporter(
    endpoint="localhost:4317",
    insecure=True
)
Location: src/secure_mcp_gateway/plugins/telemetry/opentelemetry_provider.py:349
HTTP Export
To use HTTP instead of gRPC, configure endpoint with port 4318:
{
  "plugins": {
    "telemetry": {
      "config": {
        "url": "http://localhost:4318"
      }
    }
  }
}
The provider automatically selects the appropriate exporter based on the port.
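That port-based selection can be sketched like this (the `select_otlp_protocol` helper is illustrative; the provider's actual logic lives in opentelemetry_provider.py):

```python
from urllib.parse import urlparse


def select_otlp_protocol(url: str) -> str:
    """Pick an OTLP transport from the endpoint port: 4318 => HTTP, else gRPC."""
    port = urlparse(url).port
    return "http/protobuf" if port == 4318 else "grpc"


select_otlp_protocol("http://localhost:4318")  # -> "http/protobuf"
select_otlp_protocol("http://localhost:4317")  # -> "grpc"
```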
Resource Attributes
Resource attributes identify the telemetry source:
from opentelemetry.sdk.resources import Resource

self._resource = Resource(
    attributes={
        "service.name": "secure-mcp-gateway",
        "job": "enkryptai",
        "service.version": "2.1.2",
        "deployment.environment": "production"
    }
)
These attributes appear in all logs, traces, and metrics.
Location: src/secure_mcp_gateway/plugins/telemetry/opentelemetry_provider.py:321
Connectivity Check
Before enabling telemetry, the provider validates endpoint reachability:
def _check_telemetry_enabled(self, config: dict[str, Any]) -> bool:
    """Check if telemetry is enabled and endpoint is reachable."""
    if not config.get("enabled", False):
        return False

    endpoint = config.get("url", "http://localhost:4317")
    parsed_url = urlparse(endpoint)
    hostname = parsed_url.hostname
    port = parsed_url.port

    try:
        # Get timeout from TimeoutManager
        from secure_mcp_gateway.services.timeout import get_timeout_manager
        timeout_manager = get_timeout_manager()
        timeout_value = timeout_manager.get_timeout("connectivity")

        # Test connection
        with socket.create_connection((hostname, port), timeout=timeout_value):
            logger.debug(f"OTLP endpoint {endpoint} is reachable")
            return True
    except (OSError, AttributeError, TypeError, ValueError) as e:
        logger.error(
            f"Telemetry enabled but endpoint {endpoint} unreachable. "
            f"Disabling telemetry. Error: {e}"
        )
        return False
Location: src/secure_mcp_gateway/plugins/telemetry/opentelemetry_provider.py:152
This ensures the gateway starts successfully even if the collector is unavailable.
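The same probe is easy to run standalone when debugging collector connectivity. A self-contained sketch of the check above (without the gateway's logger and timeout manager):

```python
import socket
from urllib.parse import urlparse


def endpoint_reachable(url: str, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to the OTLP endpoint succeeds."""
    parsed = urlparse(url)
    try:
        with socket.create_connection((parsed.hostname, parsed.port), timeout=timeout):
            return True
    except (OSError, TypeError):
        # TypeError covers a URL with no port, mirroring the provider's guard
        return False


endpoint_reachable("http://localhost:4317")
```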
Metrics Export
Metrics are exported via Prometheus exporter:
from opentelemetry import metrics
from opentelemetry.exporter.otlp.proto.grpc.metric_exporter import OTLPMetricExporter
from opentelemetry.sdk.metrics import MeterProvider
from opentelemetry.sdk.metrics.export import PeriodicExportingMetricReader

# Create exporter
otlp_exporter = OTLPMetricExporter(
    endpoint="localhost:4317",
    insecure=True
)

# Create reader with 5-second export interval
reader = PeriodicExportingMetricReader(
    otlp_exporter,
    export_interval_millis=5000
)

# Create meter provider
provider = MeterProvider(
    resource=self._resource,
    metric_readers=[reader]
)
metrics.set_meter_provider(provider)

# Get meter
self._meter = metrics.get_meter("enkrypt.meter")
Location: src/secure_mcp_gateway/plugins/telemetry/opentelemetry_provider.py:358
The collector receives OTLP metrics and exports them to Prometheus on port 8889.
Integration with Services
Grafana Integration
Datasource Configuration:
Location: infra/grafana/provisioning/datasources/datasources.yaml
apiVersion: 1

datasources:
  - name: Loki
    type: loki
    access: proxy
    url: http://loki:3100
    jsonData:
      maxLines: 1000

  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://prometheus:9090
    isDefault: true
    jsonData:
      exemplarTraceIdDestinations:
        - name: trace_id
          datasourceUid: jaeger

  - name: Jaeger
    type: jaeger
    uid: jaeger
    access: proxy
    url: http://jaeger:16686/jaeger
    jsonData:
      nodeGraph:
        enabled: true
Jaeger Integration
Jaeger receives traces via OTLP:
jaeger:
  image: jaegertracing/all-in-one:1.73.0
  ports:
    - "16686:16686"   # Web UI
    - "14250:14250"   # gRPC for collector
  environment:
    - COLLECTOR_OTLP_ENABLED=true
Location: infra/docker-compose.yml:37
Loki Integration
Loki receives logs via OTLP HTTP:
loki:
  image: grafana/loki:main-cadc824
  ports:
    - "3100:3100"
  volumes:
    - ./loki/loki-config.yaml:/etc/loki/local-config.yaml
Location: infra/docker-compose.yml:50
Troubleshooting
Telemetry Not Exporting
Check collector is running:
docker ps | grep otel-collector
curl http://localhost:4317 # Should return HTTP error (expected)
Check gateway logs:
# Look for telemetry initialization messages
grep -i "telemetry" gateway.log
Verify configuration:
cat ~/.enkrypt/enkrypt_mcp_config.json | jq '.plugins.telemetry'
Connection Refused
Error: Telemetry enabled but endpoint localhost:4317 unreachable
Solutions:
- Start the collector: docker-compose up -d otel-collector
- Check firewall rules
- Verify the endpoint in the config matches the collector address
No Traces in Jaeger
Check collector exports to Jaeger:
docker logs otel-collector | grep jaeger
Verify Jaeger OTLP is enabled:
docker logs jaeger | grep OTLP
Check trace sampling (if configured)
Metrics Not in Prometheus
Check Prometheus scrape targets:
http://localhost:9090/targets
Verify collector Prometheus endpoint:
curl http://localhost:8889/metrics
Check Prometheus config:
cat infra/prometheus/prometheus.yml
Batch Processing
Adjust batch settings in collector config:
processors:
  batch:
    timeout: 1s             # Export every 1 second
    send_batch_size: 1024   # Or when 1024 items collected
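The two triggers interact as "whichever fires first": a batch is flushed when it reaches send_batch_size, or when the timeout elapses, even if partially full. A stdlib sketch of those semantics (not the collector's implementation):

```python
import time


class Batch:
    """Flush when send_batch_size is reached or the timeout elapses."""

    def __init__(self, send_batch_size: int = 1024, timeout_s: float = 1.0):
        self.size = send_batch_size
        self.timeout = timeout_s
        self.items: list = []
        self.started = time.monotonic()

    def add(self, item):
        """Add an item; return the flushed batch if a trigger fired, else None."""
        self.items.append(item)
        elapsed = time.monotonic() - self.started
        if len(self.items) >= self.size or elapsed >= self.timeout:
            flushed, self.items = self.items, []
            self.started = time.monotonic()
            return flushed
        return None


batch = Batch(send_batch_size=3, timeout_s=60.0)
assert batch.add("span-1") is None
assert batch.add("span-2") is None
assert batch.add("span-3") == ["span-1", "span-2", "span-3"]  # size trigger
```

Larger batches cut export overhead; shorter timeouts reduce how stale data can get before it reaches the backends.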
Export Interval
Adjust metric export interval:
reader = PeriodicExportingMetricReader(
    otlp_exporter,
    export_interval_millis=10000  # 10 seconds instead of 5
)
Sampling
For high-volume deployments, configure trace sampling:
processors:
  probabilistic_sampler:
    sampling_percentage: 10   # Sample 10% of traces

service:
  pipelines:
    traces:
      receivers: [otlp]
      # Sample first, then batch what survives
      processors: [probabilistic_sampler, batch]
      exporters: [otlp]
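Probabilistic sampling makes a deterministic keep/drop decision from the trace ID, so every span of a given trace gets the same verdict. A simplified sketch of the idea (the collector hashes the ID rather than taking it modulo directly):

```python
def keep_trace(trace_id: int, sampling_percentage: float) -> bool:
    """Deterministically keep roughly sampling_percentage% of traces."""
    # Map the trace ID into [0, 10000) and compare against the threshold
    bucket = trace_id % 10_000
    return bucket < sampling_percentage * 100


kept = sum(keep_trace(tid, 10) for tid in range(100_000))
# About 10% of trace IDs are kept, and re-checking an ID gives the same answer
```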
Advanced Topics
Custom Exporters
Add custom exporters to the collector:
exporters:
  otlphttp/custom:
    endpoint: "https://custom-backend.example.com/v1/traces"
    headers:
      Authorization: "Bearer ${CUSTOM_TOKEN}"

service:
  pipelines:
    traces:
      exporters: [otlp, otlphttp/custom]
TLS Configuration
For production with TLS:
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
        tls:
          cert_file: /path/to/cert.pem
          key_file: /path/to/key.pem
Authentication
Add authentication to OTLP export:
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter

exporter = OTLPSpanExporter(
    endpoint="collector.example.com:4317",
    headers=(("authorization", "Bearer YOUR_TOKEN"),)
)
Next Steps
Metrics Explore available metrics and Grafana dashboards
Logging Configure structured logging and log aggregation
Overview Return to observability overview
Deployment Deploy the full observability stack