Overview

The Secure MCP Gateway provides automatic detection and redaction of Personally Identifiable Information (PII) to protect sensitive user data. The PII handling system operates transparently:
  1. Input: PII is detected and redacted before sending to MCP servers
  2. Processing: MCP servers see only redacted/anonymized data
  3. Output: PII is automatically restored in responses (de-anonymization)
Zero Trust for PII: Even trusted MCP servers never see original PII values, reducing data exposure risk.
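The three stages above can be sketched as a single round trip. This is a minimal illustration, not the gateway's implementation: `call_through_gateway` and the `toy_*` helpers are hypothetical names, and the toy redactor only knows one hard-coded email address.

```python
from typing import Callable, Dict, List, Tuple

Mapping = Dict[str, str]


def call_through_gateway(
    text: str,
    server: Callable[[str], str],
    redact: Callable[[str], Tuple[str, Mapping]],
    restore: Callable[[str, Mapping], str],
) -> str:
    """Run one request through the three stages described above."""
    redacted, mapping = redact(text)   # 1. Input: redact before sending
    reply = server(redacted)           # 2. Processing: server sees tokens only
    return restore(reply, mapping)     # 3. Output: restore PII in the reply


# Toy stand-ins (hypothetical) that know a single email address.
def toy_redact(text: str) -> Tuple[str, Mapping]:
    value = "jane@example.com"
    if value in text:
        return text.replace(value, "[EMAIL_1]"), {"EMAIL_1": value}
    return text, {}


def toy_restore(text: str, mapping: Mapping) -> str:
    for token, value in mapping.items():
        text = text.replace(f"[{token}]", value)
    return text


seen_by_server: List[str] = []


def toy_server(request: str) -> str:
    seen_by_server.append(request)     # record what the server actually saw
    return f"Processed: {request}"


result = call_through_gateway(
    "Email jane@example.com", toy_server, toy_redact, toy_restore
)
print(result)
print(seen_by_server)
```

The caller gets the original address back in the reply, while the recorded server input contains only the `[EMAIL_1]` token.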

How PII Redaction Works

Complete Flow

Step-by-Step Process

1. PII Detection

Input Analysis: Scan the request for PII entities. The PII handler analyzes the input text using pattern matching and NLP models:
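from typing import List

# GuardrailViolation, PII_VIOLATION, and call_pii_api are defined elsewhere (not shown).
async def detect_pii(content: str) -> List[GuardrailViolation]: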
    # Call Enkrypt PII API with mode="request"
    payload = {
        "text": content,
        "mode": "request",
        "key": "null"  # No existing mapping
    }
    
    result = await call_pii_api(payload)
    
    # If text changed, PII was detected
    if result["text"] != content:
        return [PII_VIOLATION]
    return []
Example:
Input: "Contact John Smith at [email protected] or 555-123-4567"

Detected:
- NAME: "John Smith"
- EMAIL: "[email protected]"  
- PHONE: "555-123-4567"
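The pattern-matching half of detection can be illustrated locally. The sketch below is an assumption about how such matching might look, not the Enkrypt API's logic: the `PATTERNS` table and `find_entities` are hypothetical, the regexes are deliberately simple, and names are not covered because they typically need an NLP model rather than a pattern.

```python
import re
from typing import List, Tuple

# Hypothetical, intentionally simple patterns for two entity types.
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "PHONE": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),
}


def find_entities(text: str) -> List[Tuple[str, str]]:
    """Return (entity_type, value) pairs in the order they appear in the text."""
    found = []
    for entity_type, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            found.append((match.start(), entity_type, match.group(0)))
    found.sort()  # order by position in the input
    return [(entity_type, value) for _, entity_type, value in found]


entities = find_entities("Contact Jane Roe at jane@example.com or 555-123-4567")
print(entities)
```

Real phone and email formats vary far more than these two expressions allow, which is one reason a production detector combines patterns with model-based recognition.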

2. PII Redaction

Token Replacement: Replace PII with anonymized tokens. Each PII entity is replaced with a unique token:
from typing import Any, Dict

# call_pii_api is defined elsewhere (not shown).
async def redact_pii(content: str) -> tuple[str, Dict[str, Any]]:
    payload = {
        "text": content,
        "mode": "request",
        "key": "null"
    }
    
    result = await call_pii_api(payload)
    
    redacted_text = result["text"]
    pii_key = result["key"]  # Unique key for this session
    
    return redacted_text, {"key": pii_key}
Example:
Original: "Contact John Smith at [email protected] or 555-123-4567"

Redacted: "Contact [NAME_1] at [EMAIL_1] or [PHONE_1]"

Mapping (stored with key "abc123xyz"):
{
  "NAME_1": "John Smith",
  "EMAIL_1": "[email protected]",
  "PHONE_1": "555-123-4567"
}
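To make the token numbering concrete, here is a minimal local sketch that turns already-detected entities into `TYPE_N` tokens and a mapping like the one above. In the gateway the mapping is produced server-side by the API; `tokenize` is a hypothetical helper, and the rule that a repeated value reuses its token is an assumption.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def tokenize(
    text: str, entities: List[Tuple[str, str]]
) -> Tuple[str, Dict[str, str]]:
    """Replace each detected value with a numbered token and build the mapping."""
    counters: Dict[str, int] = defaultdict(int)
    assigned: Dict[str, str] = {}   # original value -> token name
    mapping: Dict[str, str] = {}    # token name -> original value

    for entity_type, value in entities:
        if value in assigned:       # assumed: repeated values share one token
            continue
        counters[entity_type] += 1
        token = f"{entity_type}_{counters[entity_type]}"
        assigned[value] = token
        mapping[token] = value

    # Replace longer values first so a short value that is a substring of a
    # longer one cannot split it.
    for value in sorted(assigned, key=len, reverse=True):
        text = text.replace(value, f"[{assigned[value]}]")
    return text, mapping


redacted, mapping = tokenize(
    "Mail a@x.io and b@x.io, call 555-123-4567",
    [("EMAIL", "a@x.io"), ("EMAIL", "b@x.io"), ("PHONE", "555-123-4567")],
)
print(redacted)
print(mapping)
```

Counters are kept per entity type, so the second email becomes `EMAIL_2` while the first phone number is still `PHONE_1`.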

3. Protected Processing

Server Communication: Send the redacted text to the MCP server. The MCP server receives only anonymized data:
MCP Server receives: "Contact [NAME_1] at [EMAIL_1] or [PHONE_1]"

Server processes request without seeing actual PII

MCP Server responds: "Contact request sent to [EMAIL_1]. [NAME_1] will be notified at [PHONE_1]."

4. PII Restoration (De-anonymization)

Token Replacement (Reverse): Restore the original PII in the response. Using the stored mapping, tokens are replaced with original values:
from typing import Any, Dict

# call_pii_api is defined elsewhere (not shown).
async def restore_pii(content: str, pii_mapping: Dict[str, Any]) -> str:
    pii_key = pii_mapping.get("key", "")
    if not pii_key:
        return content  # No PII to restore
    
    payload = {
        "text": content,
        "mode": "response",
        "key": pii_key  # Use same key from redaction
    }
    
    result = await call_pii_api(payload)
    return result["text"]
Example:
Server Response: "Contact request sent to [EMAIL_1]. [NAME_1] will be notified at [PHONE_1]."

Restored: "Contact request sent to [email protected]. John Smith will be notified at 555-123-4567."
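The reverse substitution can be sketched locally as a single regex pass over the response. This is an illustration only: the real restoration happens in the PII API using the stored key, and `TOKEN_RE` and `restore_tokens` are hypothetical names. Leaving unknown tokens untouched is an assumed behavior.

```python
import re
from typing import Dict

# Matches tokens such as [EMAIL_1] or [CREDIT_CARD_2].
TOKEN_RE = re.compile(r"\[([A-Z_]+_\d+)\]")


def restore_tokens(text: str, mapping: Dict[str, str]) -> str:
    """Swap known tokens back to their original values in one pass."""
    return TOKEN_RE.sub(
        lambda match: mapping.get(match.group(1), match.group(0)),
        text,
    )


restored = restore_tokens(
    "Sent to [EMAIL_1]. [NAME_1] notified; [PHONE_9] unknown.",
    {"EMAIL_1": "jane@example.com", "NAME_1": "Jane Roe"},
)
print(restored)
```

Doing the replacement in one pass means a restored value that happens to look like a token is not substituted a second time.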

Supported PII Types

Personal Information

Names:
  • Person names (first, last, full)
  • Organization names
  • Nicknames and aliases
Identifiers:
  • Social Security Numbers (SSN)
  • Tax IDs (EIN, ITIN)
  • National ID numbers
  • Passport numbers
  • Driver’s license numbers
Example:
Input: "John Q. Public, SSN 123-45-6789"
Redacted: "[NAME_1], SSN [SSN_1]"
Email Addresses:
Phone Numbers:
  • US format: (555) 123-4567
  • International: +1-555-123-4567
  • Extensions: 555-1234 x567
Physical Addresses:
  • Street addresses
  • Cities, states, ZIP codes
  • Country information
  • PO boxes
Example:
Input: "Email: [email protected], Phone: +1-555-0100, Address: 123 Main St, New York, NY 10001"
Redacted: "Email: [EMAIL_1], Phone: [PHONE_1], Address: [ADDRESS_1]"
Payment Cards:
  • Credit card numbers (Visa, MasterCard, Amex, Discover)
  • Debit card numbers
  • CVV codes
  • Expiration dates
Bank Details:
  • Account numbers
  • Routing numbers
  • IBAN codes
  • SWIFT codes
Example:
Input: "Card: 4532-1234-5678-9010, CVV: 123, Exp: 12/25"
Redacted: "Card: [CARD_1], CVV: [CVV_1], Exp: [DATE_1]"
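Card-number detectors often pair a digit pattern with the Luhn checksum to cut down on false positives. Whether the Enkrypt API applies such a check is not stated here, so treat the sketch below as background; `luhn_valid` is a hypothetical helper.

```python
def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    if len(digits) < 12:            # too short to be a payment card number
        return False
    total = 0
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:          # double every second digit from the right
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


print(luhn_valid("4111 1111 1111 1111"))   # a widely used test card number
print(luhn_valid("4532-1234-5678-9010"))   # the sample number from the example
```

The sample number in the example above does not pass the checksum, so a detector that must also redact made-up card numbers cannot rely on Luhn validation alone.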
IP Addresses:
  • IPv4 (192.168.1.1)
  • IPv6 (2001:0db8:85a3::8a2e:0370:7334)
  • Private IPs (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16)
MAC Addresses:
  • Standard format (00:1B:44:11:3A:B7)
  • Cisco format (001B.4411.3AB7)
  • Windows format (00-1B-44-11-3A-B7)
Example:
Input: "Server IP: 192.168.1.100, MAC: 00:1B:44:11:3A:B7"
Redacted: "Server IP: [IP_1], MAC: [MAC_1]"
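For IP addresses, Python's standard `ipaddress` module can validate a candidate string and tell IPv4, IPv6, and private ranges apart. The helper name `classify_ip` is hypothetical, and this is a sketch of local validation rather than a description of the API's detector.

```python
import ipaddress
from typing import Dict, Optional, Union


def classify_ip(value: str) -> Optional[Dict[str, Union[int, bool]]]:
    """Return the IP version and private flag, or None if `value` is not an IP."""
    try:
        address = ipaddress.ip_address(value)
    except ValueError:
        return None
    return {"version": address.version, "private": address.is_private}


print(classify_ip("192.168.1.100"))
print(classify_ip("2001:0db8:85a3::8a2e:0370:7334"))
print(classify_ip("999.1.1.1"))
```

Validating candidates this way rejects strings that merely look like addresses, such as version numbers with an out-of-range octet.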
Dates:
  • Birth dates
  • Event dates
  • Timestamps
Ages:
  • Exact ages
  • Age ranges (if specific)
Example:
Input: "DOB: 01/15/1990, Age: 35"
Redacted: "DOB: [DATE_1], Age: [AGE_1]"

Configuration

Enable PII Redaction

Per-Server Configuration:
{
  "server_name": "customer_service_server",
  "input_guardrails_policy": {
    "enabled": true,
    "additional_config": {
      "pii_redaction": true  // Enable automatic redaction
    },
    "block": []  // Don't block on PII, just redact
  },
  "output_guardrails_policy": {
    "enabled": true,
    "additional_config": {
      "pii_redaction": false  // De-anonymization happens automatically
    }
  }
}

Block on PII Detection (Optional)

You can also block requests that contain PII instead of redacting:
{
  "input_guardrails_policy": {
    "enabled": true,
    "additional_config": {
      "pii_redaction": false  // Don't redact, just detect
    },
    "block": ["pii"]  // Block if PII is detected
  }
}
Use Case: Prevent users from accidentally including PII in public-facing tools.
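The two configurations above imply a small decision: pass the request through, redact it, or block it. The sketch below shows one way that decision could be derived from a policy dictionary. `pii_action` is a hypothetical helper, and letting `block` take precedence over `pii_redaction` when both are set is an assumption.

```python
from typing import Any, Dict


def pii_action(policy: Dict[str, Any]) -> str:
    """Return 'block', 'redact', or 'pass' for an input guardrails policy."""
    if not policy.get("enabled", False):
        return "pass"
    if "pii" in policy.get("block", []):
        return "block"               # assumed: blocking wins over redaction
    if policy.get("additional_config", {}).get("pii_redaction", False):
        return "redact"
    return "pass"


redact_policy = {
    "enabled": True,
    "additional_config": {"pii_redaction": True},
    "block": [],
}
block_policy = {
    "enabled": True,
    "additional_config": {"pii_redaction": False},
    "block": ["pii"],
}
print(pii_action(redact_policy), pii_action(block_policy))
```

Note that the `//` comments in the configuration examples are annotations for the reader; strict JSON parsers such as Python's `json` module reject them.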

Custom PII Entities

Configure which PII types to detect:
{
  "input_guardrails_policy": {
    "enabled": true,
    "additional_config": {
      "pii_redaction": true,
      "pii_entities": [
        "EMAIL",
        "PHONE",
        "SSN",
        "CREDIT_CARD"
      ]
    }
  }
}
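Conceptually, `pii_entities` acts as a filter on what gets detected. A minimal sketch of that filtering follows, assuming detections arrive as `(entity_type, value)` pairs and that omitting `pii_entities` means all types are kept. Both the helper `filter_entities` and that default are assumptions.

```python
from typing import Any, Dict, List, Tuple


def filter_entities(
    detections: List[Tuple[str, str]], policy: Dict[str, Any]
) -> List[Tuple[str, str]]:
    """Keep only detections whose type is listed in pii_entities."""
    wanted = policy.get("additional_config", {}).get("pii_entities")
    if wanted is None:               # assumed default: no list means all types
        return list(detections)
    allowed = set(wanted)
    return [(kind, value) for kind, value in detections if kind in allowed]


policy = {
    "enabled": True,
    "additional_config": {
        "pii_redaction": True,
        "pii_entities": ["EMAIL", "PHONE", "SSN", "CREDIT_CARD"],
    },
}
detections = [("EMAIL", "jane@example.com"), ("NAME", "Jane Roe")]
print(filter_entities(detections, policy))
```

With this policy the name is left in place because `NAME` is not in the configured list.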

PII Mapping Security

Secure Storage

Ephemeral Mappings: PII mappings are stored only for the duration of the request/response cycle.
The PII mapping is:
  1. Generated server-side by the Enkrypt API
  2. Associated with a unique session key
  3. Never written to logs or persisted beyond the request/response cycle
  4. Automatically expired after use
  5. Encrypted in transit (HTTPS)

Mapping Key Structure

# PII Mapping Example
{
  "key": "a1b2c3d4e5f6g7h8i9j0",  # Unique session key
  "mappings": {
    "NAME_1": "<encrypted_value>",
    "EMAIL_1": "<encrypted_value>",
    "PHONE_1": "<encrypted_value>"
  }
}
Security Properties:
  • Keys are cryptographically random (160+ bits of entropy)
  • Mappings are never exposed in logs
  • Keys cannot be reused across sessions
  • Server-side storage is encrypted at rest
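The key properties listed above (random, single-use, short-lived) can be illustrated with a small in-memory store. This is a sketch under stated assumptions, not the Enkrypt service's storage: `EphemeralMappingStore` and its 30-second default TTL are hypothetical, and encryption at rest is not modeled.

```python
import secrets
import time
from typing import Dict, Optional, Tuple


class EphemeralMappingStore:
    """In-memory store whose keys are random, expire, and work only once."""

    def __init__(self, ttl_seconds: float = 30.0) -> None:
        self._ttl = ttl_seconds
        self._items: Dict[str, Tuple[float, Dict[str, str]]] = {}

    def put(self, mapping: Dict[str, str]) -> str:
        key = secrets.token_hex(20)   # 20 random bytes = 160 bits
        self._items[key] = (time.monotonic() + self._ttl, dict(mapping))
        return key

    def pop(self, key: str) -> Optional[Dict[str, str]]:
        """Return the mapping once, then forget it; None if unknown or expired."""
        item = self._items.pop(key, None)
        if item is None:
            return None
        expires_at, mapping = item
        return mapping if time.monotonic() <= expires_at else None


store = EphemeralMappingStore()
key = store.put({"EMAIL_1": "jane@example.com"})
first = store.pop(key)
second = store.pop(key)
print(len(key), first, second)
```

Removing the entry on first read is what makes a key unusable for a second restoration, even before the TTL elapses.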

Advanced Use Cases

Scenario 1: Customer Support

Goal: Protect customer PII when using AI tools
{
  "server_name": "customer_support_ai",
  "description": "AI assistant for customer support",
  "input_guardrails_policy": {
    "enabled": true,
    "additional_config": {
      "pii_redaction": true
    }
  }
}
Flow:
Agent: "Customer John Doe called about order #12345, email [email protected]"
  → AI sees: "Customer [NAME_1] called about order #12345, email [EMAIL_1]"
  → AI responds: "I've looked up [EMAIL_1]'s order #12345..."
  → Agent sees: "I've looked up [email protected]'s order #12345..."

Scenario 2: Data Analysis

Goal: Analyze customer data without exposing PII
{
  "server_name": "analytics_server",
  "input_guardrails_policy": {
    "enabled": true,
    "additional_config": {
      "pii_redaction": true
    }
  }
}
Flow:
Analyst: "Analyze feedback from [email protected], [email protected]"
  → Analytics sees: "Analyze feedback from [EMAIL_1], [EMAIL_2]"
  → Analytics returns: "[EMAIL_1] satisfaction: 85%, [EMAIL_2] satisfaction: 92%"
  → Analyst sees: "[email protected] satisfaction: 85%, [email protected] satisfaction: 92%"

Scenario 3: Compliance (GDPR/CCPA)

Goal: Minimize PII exposure for compliance
{
  "server_name": "gdpr_compliant_server",
  "input_guardrails_policy": {
    "enabled": true,
    "policy_name": "GDPR Compliance Policy",
    "additional_config": {
      "pii_redaction": true,
      "pii_entities": ["EMAIL", "PHONE", "SSN", "NAME", "ADDRESS"]
    },
    "block": ["pii"]  // Optional: block instead of redact for strict compliance
  }
}

Limitations & Best Practices

Not 100% Accurate: PII detection uses ML models with ~95-98% accuracy.
False Positives:
  • Generic names (“John Smith” in documentation)
  • Example emails ([email protected])
  • Sample phone numbers (555-0100)
False Negatives:
  • Obfuscated PII ([email protected] as “j dot doe at example”)
  • Non-standard formats
  • Context-dependent PII
Best Practice:
  • Test with sample data before production
  • Review redaction results periodically
  • Use additional guardrails (keyword detection) for critical PII
Latency: PII redaction adds 50-150ms per request.
Optimization:
  • Enable only for servers handling user data
  • Use selective entity types (don’t detect all PII if unnecessary)
  • Cache redaction results for repeated inputs
Monitoring:
# Check PII redaction metrics
secure-mcp-gateway metrics --filter pii

# Output
pii_redactions_total: 1234
pii_redaction_latency_avg: 87ms
pii_restoration_latency_avg: 45ms
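Latency figures depend on network and payload size, so it can help to measure them in your own deployment. Below is a minimal timing wrapper for any async redaction call. `timed` and `fake_redact` are hypothetical, and the fake call simply sleeps to stand in for an API round trip.

```python
import asyncio
import time
from typing import Any, Awaitable, Callable, Tuple


async def timed(
    fn: Callable[..., Awaitable[Any]], *args: Any
) -> Tuple[Any, float]:
    """Await fn(*args) and return (result, elapsed milliseconds)."""
    start = time.perf_counter()
    result = await fn(*args)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return result, elapsed_ms


async def fake_redact(text: str) -> str:
    await asyncio.sleep(0.01)        # stand-in for the network round trip
    return text.replace("jane@example.com", "[EMAIL_1]")


async def main() -> Tuple[str, float]:
    return await timed(fake_redact, "Email jane@example.com")


redacted, elapsed_ms = asyncio.run(main())
print(redacted, f"{elapsed_ms:.1f} ms")
```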
Challenge: Redaction may break context for AI understanding.
Example:
Original: "Send meeting invite to [email protected] and [email protected]"
Redacted: "Send meeting invite to [EMAIL_1] and [EMAIL_2]"

AI loses context that both emails are from same company
Mitigation:
  • Use partial redaction for non-sensitive patterns
  • Provide domain whitelist (e.g., allow @company.com)
  • Include metadata hints (e.g., “[EMAIL_1 from company.com]”)
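The metadata-hint mitigation can be sketched as a redactor that appends the domain to the token when that domain is on an allow list. This is an illustration of the idea only. `redact_emails_with_hints` is hypothetical, and the document does not say whether the gateway supports hinted tokens.

```python
import re
from typing import Dict, Set, Tuple

EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def redact_emails_with_hints(
    text: str, allowed_domains: Set[str]
) -> Tuple[str, Dict[str, str]]:
    """Redact emails, keeping the domain visible only for allowed domains."""
    tokens: Dict[str, str] = {}      # email -> token name
    mapping: Dict[str, str] = {}     # token name -> email

    def substitute(match: "re.Match[str]") -> str:
        value = match.group(0)
        if value not in tokens:
            tokens[value] = f"EMAIL_{len(tokens) + 1}"
            mapping[tokens[value]] = value
        name = tokens[value]
        domain = value.rsplit("@", 1)[1].lower()
        if domain in allowed_domains:
            return f"[{name} from {domain}]"
        return f"[{name}]"

    return EMAIL_RE.sub(substitute, text), mapping


hinted, hint_mapping = redact_emails_with_hints(
    "Invite ann@company.com and bob@company.com, cc eve@other.org",
    {"company.com"},
)
print(hinted)
```

The hint tells the model that two recipients share a domain, at the cost of revealing that domain to the server. Restoration would also need to recognize the hinted token form.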
Problem: Tokens don’t persist across sessions.
Example:
Session 1:
  Input: "Email [email protected]"
  Redacted: "Email [EMAIL_1]"
  
Session 2:
  Input: "Email [email protected]"  
  Redacted: "Email [EMAIL_2]"  // Different token!
Why: Each session gets a unique PII key for security.
Best Practice: If consistency is needed, use custom identifiers instead of PII.
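One way to obtain identifiers that stay stable across sessions is to derive them from the value with a keyed hash before the text reaches the gateway. The sketch below uses HMAC-SHA256, which is a technique chosen for illustration and not a gateway feature. `stable_identifier`, the `USER` prefix, and the 12-character truncation are all assumptions.

```python
import hashlib
import hmac


def stable_identifier(value: str, secret: bytes, prefix: str = "USER") -> str:
    """Derive a repeatable, non-reversible identifier from a PII value."""
    normalized = value.strip().lower().encode("utf-8")
    digest = hmac.new(secret, normalized, hashlib.sha256).hexdigest()
    return f"{prefix}_{digest[:12]}"


secret = b"replace-with-a-real-secret"   # placeholder; keep real keys out of code
first_id = stable_identifier("jane@example.com", secret)
second_id = stable_identifier("Jane@Example.com ", secret)
other_id = stable_identifier("bob@example.com", secret)
print(first_id, second_id == first_id, other_id != first_id)
```

The same input always yields the same identifier, so records can be correlated across sessions without the address itself ever appearing in a request.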

Testing PII Redaction

Manual Testing

# Test with sample PII
echo '{"text": "Contact John Doe at [email protected] or 555-123-4567"}' | \
  curl -X POST https://api.enkryptai.com/guardrails/pii \
    -H "apikey: YOUR_API_KEY" \
    -H "Content-Type: application/json" \
    -d @-

# Response
{
  "text": "Contact [NAME_1] at [EMAIL_1] or [PHONE_1]",
  "key": "abc123xyz789"
}

Integration Testing

# Test PII redaction in gateway
import asyncio
from secure_mcp_gateway.plugins.guardrails.enkrypt_provider import EnkryptPIIHandler

async def test_pii():
    handler = EnkryptPIIHandler(
        api_key="YOUR_API_KEY",
        base_url="https://api.enkryptai.com"
    )
    
    # Test redaction
    text = "Email me at [email protected]"
    redacted, mapping = await handler.redact_pii(text)
    
    print(f"Original: {text}")
    print(f"Redacted: {redacted}")
    print(f"Mapping Key: {mapping['key']}")
    
    # Test restoration
    response = "Message sent to [EMAIL_1]"
    restored = await handler.restore_pii(response, mapping)
    
    print(f"Response: {response}")
    print(f"Restored: {restored}")

asyncio.run(test_pii())

Automated Testing

Use the included test suite:
# Run PII tests
pytest tests/test_pii_handling.py -v

# Test with sample data
pytest tests/test_pii_handling.py::test_redact_email
pytest tests/test_pii_handling.py::test_redact_phone
pytest tests/test_pii_handling.py::test_restore_pii

Monitoring & Metrics

PII Metrics

Available Metrics:
  • pii_redactions_total - Total PII redaction operations
  • pii_detections_by_type - PII detections by entity type (EMAIL, PHONE, etc.)
  • pii_redaction_latency - Time to redact PII
  • pii_restoration_latency - Time to restore PII
  • pii_failures_total - Failed PII operations
Grafana Dashboard: The gateway includes a PII monitoring dashboard showing:
  • Redaction rate over time
  • PII types detected
  • Latency percentiles (p50, p95, p99)
  • Error rates
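If you collect raw latency samples yourself, the same percentiles can be computed with the standard library. This is a sketch of the arithmetic only and not how the dashboard is implemented; `latency_percentiles` is a hypothetical helper and needs at least two samples.

```python
import statistics
from typing import Dict, List


def latency_percentiles(samples_ms: List[float]) -> Dict[str, float]:
    """Return p50, p95, and p99 from a list of latency samples."""
    # n=100 yields 99 cut points; cut point i (1-based) is the i-th percentile.
    cuts = statistics.quantiles(samples_ms, n=100, method="inclusive")
    return {"p50": cuts[49], "p95": cuts[94], "p99": cuts[98]}


samples = [float(ms) for ms in range(1, 102)]   # 1 ms .. 101 ms
percentiles = latency_percentiles(samples)
print(percentiles)
```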

Logging

PII events are logged (with PII values masked):
{
  "event": "pii_redacted",
  "timestamp": "2025-01-15T10:30:00Z",
  "server_name": "customer_support",
  "entities_detected": ["EMAIL", "PHONE"],
  "entity_count": 2,
  "original_length": 150,
  "redacted_length": 135,
  "pii_key": "****xyz",  // Last 3 chars only
  "latency_ms": 87
}
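A log event of this shape can be built without ever placing PII values or the full key in the record. The sketch below mirrors the field names shown above. `mask_key` and `pii_log_event` are hypothetical helpers, and the gateway's actual logging code may differ.

```python
import json
from datetime import datetime, timezone
from typing import Any, Dict, List


def mask_key(key: str, visible: int = 3) -> str:
    """Keep only the last `visible` characters of the key."""
    if len(key) <= visible:
        return "*" * len(key)
    return "****" + key[-visible:]


def pii_log_event(
    server_name: str,
    entities: List[str],
    original: str,
    redacted: str,
    pii_key: str,
    latency_ms: int,
) -> Dict[str, Any]:
    """Build a log record that carries counts and lengths, never PII values."""
    return {
        "event": "pii_redacted",
        "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "server_name": server_name,
        "entities_detected": sorted(set(entities)),
        "entity_count": len(entities),
        "original_length": len(original),
        "redacted_length": len(redacted),
        "pii_key": mask_key(pii_key),
        "latency_ms": latency_ms,
    }


event = pii_log_event(
    "customer_support",
    ["EMAIL", "PHONE"],
    "Mail jane@example.com or 555-123-4567",
    "Mail [EMAIL_1] or [PHONE_1]",
    "abc123xyz",
    87,
)
print(json.dumps(event, indent=2))
```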

Next Steps

  • Security Testing: Test PII redaction with attack scenarios
  • Guardrail Types: Learn about other guardrail types
  • Configuration: Configure PII redaction for your servers
  • Compliance: GDPR/CCPA compliance guide
