Overview
The Secure MCP Gateway uses OpenTelemetry as its primary observability framework, providing:
- Structured Logging via OTLP log export to Loki
- Distributed Tracing with context propagation to Jaeger
- Metrics Collection with Prometheus export
- Unified Telemetry through the OpenTelemetry Collector
OpenTelemetry Provider
The OpenTelemetryProvider implements full OpenTelemetry support:
Location: src/secure_mcp_gateway/plugins/telemetry/opentelemetry_provider.py
Key Features
- OTLP Export: gRPC and HTTP protocols supported
- Resource Attributes: Service name, version, environment metadata
- Batch Processing: Efficient batching of telemetry data
- Connectivity Check: Automatic endpoint validation on startup
- Graceful Degradation: Falls back to no-op if collector unavailable
Provider Implementation
class OpenTelemetryProvider(TelemetryProvider):
    def __init__(self, config: dict[str, Any] | None = None):
        self._initialized = False
        self._logger = None
        self._tracer = None
        self._meter = None
        self._resource = None
        if config:
            self.initialize(config)

    def initialize(self, config: dict[str, Any]) -> TelemetryResult:
        # Extract configuration
        enabled = self._check_telemetry_enabled(config)
        endpoint = config.get("url", "http://localhost:4317")
        insecure = config.get("insecure", True)
        service_name = config.get("service_name", "secure-mcp-gateway")
        job_name = config.get("job_name", "enkryptai")

        if enabled:
            self._setup_enabled_telemetry(
                endpoint, insecure, service_name, job_name, config
            )
        else:
            self._setup_disabled_telemetry()

        self._initialized = True
        return TelemetryResult(success=True, provider_name=self.name)
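The graceful-degradation path (`_setup_disabled_telemetry`) swaps in no-op objects so call sites never need to branch on whether telemetry is enabled. A stdlib-only sketch of that pattern (the `NoOpTracer`/`NoOpSpan` names here are illustrative, not the gateway's actual classes):

```python
from contextlib import contextmanager


class NoOpSpan:
    """Accepts the same calls as a real span but records nothing."""

    def set_attribute(self, key, value):
        pass


class NoOpTracer:
    """Stand-in tracer used when telemetry is disabled or unreachable."""

    @contextmanager
    def start_as_current_span(self, name):
        yield NoOpSpan()


def get_tracer(telemetry_enabled: bool):
    # Hypothetical selector: a real OpenTelemetry tracer when enabled,
    # the no-op stand-in otherwise.
    if telemetry_enabled:
        raise NotImplementedError("wire up the real OpenTelemetry tracer here")
    return NoOpTracer()


tracer = get_tracer(telemetry_enabled=False)
with tracer.start_as_current_span("tool_execution") as span:
    span.set_attribute("server_name", "github_server")  # silently ignored
```

Because the no-op objects mirror the real API, instrumentation code stays identical whether the collector is reachable or not.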
Configuration
Basic Configuration
Add telemetry configuration to enkrypt_mcp_config.json:
{
  "plugins": {
    "telemetry": {
      "provider": "opentelemetry",
      "config": {
        "enabled": true,
        "url": "http://localhost:4317",
        "insecure": true,
        "service_name": "secure-mcp-gateway",
        "job_name": "enkryptai"
      }
    }
  },
  "common_mcp_gateway_config": {
    "enkrypt_log_level": "INFO"
  }
}
Configuration Options
enabled
boolean
default: false
Enable OpenTelemetry telemetry. When false, uses no-op implementations.
url
string
default: "http://localhost:4317"
OTLP endpoint URL. Supports:
- gRPC: http://localhost:4317 (default)
- HTTP: http://localhost:4318
insecure
boolean
default: true
Use an insecure connection (no TLS). Set to false for production with TLS.
service_name
string
default: "secure-mcp-gateway"
Service name in resource attributes. Used for filtering in Grafana/Jaeger.
job_name
string
default: "enkryptai"
Job name for Prometheus metrics and resource attributes.
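Since the provider resolves each option with a plain `dict.get` lookup (see `initialize` above), a partial config simply overlays the documented defaults. A small sketch of that resolution (illustrative, not gateway code):

```python
# Defaults mirroring the documented option values above
DEFAULTS = {
    "enabled": False,
    "url": "http://localhost:4317",
    "insecure": True,
    "service_name": "secure-mcp-gateway",
    "job_name": "enkryptai",
}


def resolve_config(user_config: dict) -> dict:
    """Overlay user-supplied options on the documented defaults."""
    return {**DEFAULTS, **user_config}


resolved = resolve_config({"enabled": True, "url": "http://localhost:4318"})
# Only the supplied keys are overridden; e.g. resolved["insecure"] stays True
```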
Production Configuration
For production deployments with TLS:
{
  "plugins": {
    "telemetry": {
      "provider": "opentelemetry",
      "config": {
        "enabled": true,
        "url": "https://otel-collector.example.com:4318",
        "insecure": false,
        "service_name": "secure-mcp-gateway-prod",
        "job_name": "production"
      }
    }
  },
  "common_mcp_gateway_config": {
    "enkrypt_log_level": "WARNING"
  }
}
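A sanity check worth running on production configs: `insecure: false` only makes sense against an `https://` endpoint. A small validation sketch (the `validate_tls_config` helper is illustrative, not part of the gateway):

```python
from urllib.parse import urlparse


def validate_tls_config(config: dict) -> None:
    """Raise if the TLS setting and the URL scheme contradict each other."""
    scheme = urlparse(config.get("url", "http://localhost:4317")).scheme
    insecure = config.get("insecure", True)
    if not insecure and scheme != "https":
        raise ValueError("insecure=false requires an https:// endpoint")
    if insecure and scheme == "https":
        # Not fatal, but usually a mistake worth flagging.
        print("warning: https endpoint configured with insecure=true")


# Passes: TLS endpoint with insecure disabled
validate_tls_config({"url": "https://otel-collector.example.com:4318", "insecure": False})
```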
OpenTelemetry Collector Setup
Using Docker Compose
The gateway includes a complete observability stack:
cd infra/
docker-compose up -d
This starts:
- OpenTelemetry Collector (ports 4317, 4318, 8889)
- Jaeger (port 16686)
- Loki (port 3100)
- Prometheus (port 9090)
- Grafana (port 3000)
Collector Configuration
Location: infra/otel_collector/otel-collector-config.yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318

processors:
  batch:
    timeout: 1s
    send_batch_size: 1024

exporters:
  # Traces to Jaeger
  otlp:
    endpoint: jaeger:4317
    tls:
      insecure: true

  # Logs to Loki
  otlphttp/loki:
    endpoint: "http://loki:3100/otlp"
    tls:
      insecure: true

  # Metrics to Prometheus
  prometheus:
    endpoint: "0.0.0.0:8889"
    namespace: "otel"
    const_labels:
      service_name: "secure-mcp-gateway"

  # Console output for the debug exporter referenced in the pipelines below
  debug: {}

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlp, debug]
    metrics:
      receivers: [otlp]
      processors: [batch]
      exporters: [prometheus, debug]
    logs:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlphttp/loki, debug]
Manual Installation
If not using Docker Compose:
# Download collector
wget https://github.com/open-telemetry/opentelemetry-collector-releases/releases/download/v0.134.1/otelcol-contrib_0.134.1_linux_amd64.tar.gz
tar -xzf otelcol-contrib_0.134.1_linux_amd64.tar.gz
# Create config file (use example above)
vim otel-collector-config.yaml
# Run collector
./otelcol-contrib --config=otel-collector-config.yaml
Distributed Tracing
Trace Context Propagation
The gateway automatically propagates trace context across operations:
from secure_mcp_gateway.plugins.telemetry import get_telemetry_config_manager

telemetry_manager = get_telemetry_config_manager()
tracer = telemetry_manager.get_tracer()

# Create a span
with tracer.start_as_current_span("tool_execution") as span:
    span.set_attribute("server_name", "github_server")
    span.set_attribute("tool_name", "create_issue")
    span.set_attribute("user_id", user_id)

    # Execute operation
    result = execute_tool(server_name, tool_name, args)

    # Add result attributes
    span.set_attribute("success", result.success)
    span.set_attribute("duration_ms", result.duration)
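How attributes like `duration_ms` end up on a span can be illustrated with a stdlib-only context manager (a sketch of the pattern, not the gateway's tracer; `timed_span` is a hypothetical helper):

```python
import time
from contextlib import contextmanager


@contextmanager
def timed_span(name: str, **attrs):
    """Yield a dict of span attributes and stamp duration_ms on exit."""
    record = {"name": name, **attrs}
    start = time.monotonic()
    try:
        yield record
    finally:
        # Stamped even if the body raises, like a real span's end time
        record["duration_ms"] = int((time.monotonic() - start) * 1000)


with timed_span("tool_execution", server_name="github_server") as span:
    span["tool_name"] = "create_issue"
# span now carries name, server_name, tool_name, and duration_ms
```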
Trace Attributes
Common attributes used in gateway traces:
| Attribute | Type | Description |
| --- | --- | --- |
| server_name | string | MCP server name |
| tool_name | string | Tool being executed |
| user_id | string | User identifier |
| project_id | string | Project identifier |
| custom_id | string | Request correlation ID |
| duration_ms | int | Operation duration |
| success | boolean | Operation success status |
| error_type | string | Error type if failed |
Viewing Traces in Jaeger
1. Open the Jaeger UI: http://localhost:16686
2. Select service: secure-mcp-gateway
3. Search by:
   - Operation name (e.g., tool_execution)
   - Tags (e.g., server_name=github_server)
   - Duration (e.g., slow traces > 1s)
Trace Examples
Tool Execution Trace:
tool_execution [250ms]
├── authenticate [10ms]
│ └── cache_lookup [2ms]
├── input_guardrails [30ms]
│ ├── pii_detection [15ms]
│ └── toxicity_check [15ms]
├── forward_to_server [180ms]
│ ├── discover_tools [50ms]
│ └── call_tool [130ms]
└── output_guardrails [30ms]
├── relevancy_check [15ms]
└── adherence_check [15ms]
OTLP Export
gRPC Export (Default)
Default configuration uses gRPC:
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
from opentelemetry.exporter.otlp.proto.grpc.metric_exporter import OTLPMetricExporter
from opentelemetry.exporter.otlp.proto.grpc._log_exporter import OTLPLogExporter

# Traces
otlp_exporter = OTLPSpanExporter(
    endpoint="localhost:4317",
    insecure=True
)

# Metrics
metric_exporter = OTLPMetricExporter(
    endpoint="localhost:4317",
    insecure=True
)

# Logs
log_exporter = OTLPLogExporter(
    endpoint="localhost:4317",
    insecure=True
)
Location: src/secure_mcp_gateway/plugins/telemetry/opentelemetry_provider.py:349
HTTP Export
To use HTTP instead of gRPC, configure endpoint with port 4318:
{
  "plugins": {
    "telemetry": {
      "config": {
        "url": "http://localhost:4318"
      }
    }
  }
}
The provider automatically selects the appropriate exporter based on the port.
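That port-based selection can be sketched like this (the `select_otlp_protocol` helper is illustrative; the provider's actual logic lives in opentelemetry_provider.py):

```python
from urllib.parse import urlparse


def select_otlp_protocol(url: str) -> str:
    """Pick an OTLP transport from the endpoint port: 4318 => HTTP, else gRPC."""
    port = urlparse(url).port
    return "http/protobuf" if port == 4318 else "grpc"


select_otlp_protocol("http://localhost:4318")  # -> "http/protobuf"
select_otlp_protocol("http://localhost:4317")  # -> "grpc"
```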
Resource Attributes
Resource attributes identify the telemetry source:
from opentelemetry.sdk.resources import Resource

self._resource = Resource(
    attributes={
        "service.name": "secure-mcp-gateway",
        "job": "enkryptai",
        "service.version": "2.1.2",
        "deployment.environment": "production"
    }
)
These attributes appear in all logs, traces, and metrics.
Location: src/secure_mcp_gateway/plugins/telemetry/opentelemetry_provider.py:321
Connectivity Check
Before enabling telemetry, the provider validates endpoint reachability:
def _check_telemetry_enabled(self, config: dict[str, Any]) -> bool:
    """Check if telemetry is enabled and endpoint is reachable."""
    if not config.get("enabled", False):
        return False

    endpoint = config.get("url", "http://localhost:4317")
    parsed_url = urlparse(endpoint)
    hostname = parsed_url.hostname
    port = parsed_url.port

    try:
        # Get timeout from TimeoutManager
        from secure_mcp_gateway.services.timeout import get_timeout_manager
        timeout_manager = get_timeout_manager()
        timeout_value = timeout_manager.get_timeout("connectivity")

        # Test connection
        with socket.create_connection((hostname, port), timeout=timeout_value):
            logger.debug(f"OTLP endpoint {endpoint} is reachable")
            return True
    except (OSError, AttributeError, TypeError, ValueError) as e:
        logger.error(
            f"Telemetry enabled but endpoint {endpoint} unreachable. "
            f"Disabling telemetry. Error: {e}"
        )
        return False
Location: src/secure_mcp_gateway/plugins/telemetry/opentelemetry_provider.py:152
This ensures the gateway starts successfully even if the collector is unavailable.
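The same probe is easy to run standalone when debugging collector connectivity. A self-contained sketch of the check above (without the gateway's logger and timeout manager):

```python
import socket
from urllib.parse import urlparse


def endpoint_reachable(url: str, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to the OTLP endpoint succeeds."""
    parsed = urlparse(url)
    try:
        with socket.create_connection((parsed.hostname, parsed.port), timeout=timeout):
            return True
    except (OSError, TypeError):
        # TypeError covers a URL with no port, mirroring the provider's guard
        return False


endpoint_reachable("http://localhost:4317")
```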
Metrics Export
Metrics are exported via Prometheus exporter:
from opentelemetry import metrics
from opentelemetry.exporter.otlp.proto.grpc.metric_exporter import OTLPMetricExporter
from opentelemetry.sdk.metrics import MeterProvider
from opentelemetry.sdk.metrics.export import PeriodicExportingMetricReader

# Create exporter
otlp_exporter = OTLPMetricExporter(
    endpoint="localhost:4317",
    insecure=True
)

# Create reader with 5-second export interval
reader = PeriodicExportingMetricReader(
    otlp_exporter,
    export_interval_millis=5000
)

# Create meter provider
provider = MeterProvider(
    resource=self._resource,
    metric_readers=[reader]
)
metrics.set_meter_provider(provider)

# Get meter
self._meter = metrics.get_meter("enkrypt.meter")
Location: src/secure_mcp_gateway/plugins/telemetry/opentelemetry_provider.py:358
The collector receives OTLP metrics and exports them to Prometheus on port 8889.
Integration with Services
Grafana Integration
Datasource Configuration:
Location: infra/grafana/provisioning/datasources/datasources.yaml
apiVersion: 1

datasources:
  - name: Loki
    type: loki
    access: proxy
    url: http://loki:3100
    jsonData:
      maxLines: 1000

  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://prometheus:9090
    isDefault: true
    jsonData:
      exemplarTraceIdDestinations:
        - name: trace_id
          datasourceUid: jaeger

  - name: Jaeger
    type: jaeger
    uid: jaeger
    access: proxy
    url: http://jaeger:16686/jaeger
    jsonData:
      nodeGraph:
        enabled: true
Jaeger Integration
Jaeger receives traces via OTLP:
jaeger:
  image: jaegertracing/all-in-one:1.73.0
  ports:
    - "16686:16686"   # Web UI
    - "14250:14250"   # gRPC for collector
  environment:
    - COLLECTOR_OTLP_ENABLED=true
Location: infra/docker-compose.yml:37
Loki Integration
Loki receives logs via OTLP HTTP:
loki:
  image: grafana/loki:main-cadc824
  ports:
    - "3100:3100"
  volumes:
    - ./loki/loki-config.yaml:/etc/loki/local-config.yaml
Location: infra/docker-compose.yml:50
Troubleshooting
Telemetry Not Exporting
Check collector is running:
docker ps | grep otel-collector
curl http://localhost:4317 # Should return HTTP error (expected)
Check gateway logs:
# Look for telemetry initialization messages
grep -i "telemetry" gateway.log
Verify configuration:
cat ~/.enkrypt/enkrypt_mcp_config.json | jq '.plugins.telemetry'
Connection Refused
Error: Telemetry enabled but endpoint localhost:4317 unreachable
Solutions:
- Start the collector: docker-compose up -d otel-collector
- Check firewall rules
- Verify the endpoint in the config matches the collector address
No Traces in Jaeger
Check collector exports to Jaeger:
docker logs otel-collector | grep jaeger
Verify Jaeger OTLP is enabled:
docker logs jaeger | grep OTLP
Check trace sampling (if configured)
Metrics Not in Prometheus
Check Prometheus scrape targets:
http://localhost:9090/targets
Verify collector Prometheus endpoint:
curl http://localhost:8889/metrics
Check Prometheus config:
cat infra/prometheus/prometheus.yml
Batch Processing
Adjust batch settings in collector config:
processors:
  batch:
    timeout: 1s             # Export every 1 second
    send_batch_size: 1024   # Or when 1024 items collected
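The two triggers interact as "whichever fires first": a batch is flushed when it reaches send_batch_size, or when the timeout elapses, even if partially full. A stdlib sketch of those semantics (not the collector's implementation):

```python
import time


class Batch:
    """Flush when send_batch_size is reached or the timeout elapses."""

    def __init__(self, send_batch_size: int = 1024, timeout_s: float = 1.0):
        self.size = send_batch_size
        self.timeout = timeout_s
        self.items: list = []
        self.started = time.monotonic()

    def add(self, item):
        """Add an item; return the flushed batch if a trigger fired, else None."""
        self.items.append(item)
        elapsed = time.monotonic() - self.started
        if len(self.items) >= self.size or elapsed >= self.timeout:
            flushed, self.items = self.items, []
            self.started = time.monotonic()
            return flushed
        return None


batch = Batch(send_batch_size=3, timeout_s=60.0)
assert batch.add("span-1") is None
assert batch.add("span-2") is None
assert batch.add("span-3") == ["span-1", "span-2", "span-3"]  # size trigger
```

Larger batches cut export overhead; shorter timeouts reduce how stale data can get before it reaches the backends.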
Export Interval
Adjust metric export interval:
reader = PeriodicExportingMetricReader(
    otlp_exporter,
    export_interval_millis=10000  # 10 seconds instead of 5
)
Sampling
For high-volume deployments, configure trace sampling:
processors:
  probabilistic_sampler:
    sampling_percentage: 10   # Sample 10% of traces

service:
  pipelines:
    traces:
      receivers: [otlp]
      # Sample first, then batch what survives
      processors: [probabilistic_sampler, batch]
      exporters: [otlp]
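Probabilistic sampling makes a deterministic keep/drop decision from the trace ID, so every span of a given trace gets the same verdict. A simplified sketch of the idea (the collector hashes the ID rather than taking it modulo directly):

```python
def keep_trace(trace_id: int, sampling_percentage: float) -> bool:
    """Deterministically keep roughly sampling_percentage% of traces."""
    # Map the trace ID into [0, 10000) and compare against the threshold
    bucket = trace_id % 10_000
    return bucket < sampling_percentage * 100


kept = sum(keep_trace(tid, 10) for tid in range(100_000))
# About 10% of trace IDs are kept, and re-checking an ID gives the same answer
```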
Advanced Topics
Custom Exporters
Add custom exporters to the collector:
exporters:
  otlphttp/custom:
    endpoint: "https://custom-backend.example.com/v1/traces"
    headers:
      Authorization: "Bearer ${CUSTOM_TOKEN}"

service:
  pipelines:
    traces:
      exporters: [otlp, otlphttp/custom]
TLS Configuration
For production with TLS:
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
        tls:
          cert_file: /path/to/cert.pem
          key_file: /path/to/key.pem
Authentication
Add authentication to OTLP export:
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter

exporter = OTLPSpanExporter(
    endpoint="collector.example.com:4317",
    headers=(("authorization", "Bearer YOUR_TOKEN"),)
)
Next Steps
Metrics Explore available metrics and Grafana dashboards
Logging Configure structured logging and log aggregation
Overview Return to observability overview
Deployment Deploy the full observability stack