Prompt Guard allows you to intercept LLM requests and responses at the gateway layer, before they reach your AI provider or return to the client. This enables consistent content policy enforcement without modifying application code. Guards can be applied to requests (what the user sends to the LLM) and to responses (what the LLM sends back).

Overview

Regex

Match patterns in prompt text and block or mask matching content. Supports custom patterns and built-in rules for common PII types.

Webhook

Forward request or response content to an external moderation service. The webhook can reject or rewrite the content.

AWS Bedrock Guardrails

Delegate content safety to AWS Bedrock Guardrails using a guardrail identifier, version, and region.

Google Model Armor

Delegate content safety to Google Cloud Model Armor using a template, project, and location.

Configuration placement

Prompt Guard is configured under policies.ai.promptGuard on a route, applied to an AI backend:
binds:
- port: 3000
  listeners:
  - routes:
    - backends:
      - ai:
          name: openai
          provider:
            openAI:
              model: gpt-4o-mini
      policies:
        ai:
          promptGuard:
            request:
              ...
            response:
              ...

Regex-based filtering

Use the regex guard to match patterns in prompts and responses. The request and response fields each accept a list of guard entries. Specify action: reject to block matching content.
policies:
  ai:
    promptGuard:
      request:
      - regex:
          action: reject
          rules:
          - pattern: SSN
          - pattern: Social Security
        rejection:
          status: 400
          body: "Request rejected: Contains sensitive information"
      response:
      - regex:
          action: reject
          rules:
          - builtin: email
        rejection:
          status: 400
          body: "Response blocked: Contains email address"
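The overview notes that matching content can be blocked or masked. If your agentgateway version supports masking, matched content is redacted in place instead of the whole message being rejected. A sketch (the `mask` action name is an assumption; verify it against your version's config schema):

```yaml
request:
- regex:
    action: mask   # assumption: replaces matched text rather than rejecting; check your schema
    rules:
    - builtin: email
```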

Built-in patterns

Agentgateway ships with built-in patterns for common PII types. Reference them with the builtin key:
Built-in name   Matches
email           Email addresses
ssn             US Social Security Numbers

Custom patterns

Custom patterns accept any regular expression via the pattern field:
rules:
- pattern: "\\b\\d{3}-\\d{2}-\\d{4}\\b"
- pattern: "CONFIDENTIAL"
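The `\b` and `\d` constructs used here behave the same in Python's `re` as in agentgateway's Rust regex engine, so the SSN pattern above can be sanity-checked locally before deploying it:

```python
import re

# The custom SSN pattern from the rules above.
ssn = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

assert ssn.search("Is 123-45-6789 a valid SSN")          # matches: guard would trigger
assert not ssn.search("Call 123-456-7890 for details")   # phone number: no match
```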

Custom rejection responses

Rejection responses are fully configurable. Return JSON errors compatible with OpenAI-style clients:
policies:
  ai:
    promptGuard:
      request:
      - regex:
          action: reject
          rules:
          - pattern: SSN
          - pattern: Social Security
        rejection:
          status: 400
          headers:
            set:
              content-type: "application/json"
          body: |
            {
              "error": {
                "message": "Request rejected: Content contains sensitive information",
                "type": "invalid_request_error",
                "code": "content_policy_violation"
              }
            }
      - regex:
          action: reject
          rules:
          - builtin: email
        rejection:
          status: 400
          headers:
            set:
              content-type: "application/json"
          body: |
            {
              "error": {
                "message": "Response blocked: Contains email",
                "type": "invalid_request_error",
                "code": "pii_detected"
              }
            }
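Because the rejection body is valid JSON in the OpenAI error shape, clients can branch on it programmatically. A Python sketch using the body configured above:

```python
import json

# The rejection body configured above, as an OpenAI-style client receives it.
body = """
{
  "error": {
    "message": "Request rejected: Content contains sensitive information",
    "type": "invalid_request_error",
    "code": "content_policy_violation"
  }
}
"""

err = json.loads(body)["error"]
if err["code"] == "content_policy_violation":
    print("blocked by gateway policy:", err["message"])
```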

Header operations

Rejection responses support three header operations:
rejection:
  status: 403
  headers:
    set:
      content-type: "application/json"
      x-moderation-version: "v1"
    add:
      x-blocked-category: "violence"
    remove:
      - server
  body: '{"error": "Forbidden"}'
Operation   Behavior
set         Replace or create a header (overwrites any existing value)
add         Append a header value (allows multiple values for the same header)
remove      Remove a header from the response

Testing regex guards

Send a request containing a blocked pattern:
curl http://localhost:3000 \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Is 123-45-6789 a valid SSN"}
    ]
  }'
# Request rejected: Contains sensitive information
Test that the response guard blocks LLM output containing emails:
curl http://localhost:3000 \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Return a fake email address"}
    ]
  }'
# Response blocked: Contains email address

Webhook-based validation

Forward request or response content to an external moderation service. The webhook can reject content or allow it through:
policies:
  ai:
    promptGuard:
      request:
        webhook:
          target: 127.0.0.1:8000
          # Forward specific request headers to the webhook
          forwardHeaderMatches:
          - name: h1
            value:
              regex: v1
          - name: h2
            value:
              regex: v2.*
      response:
        webhook:
          target: 127.0.0.1:8000
By default, request headers are not forwarded to the webhook. Use forwardHeaderMatches to specify which headers to include — each entry is a header name with an optional regex value matcher.
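The exact webhook request/response contract (how content is delivered and how a rejection or rewrite is signaled) is defined by agentgateway's configuration schema, so the sketch below only illustrates a stand-in service at the configured target that receives forwarded content and echoes it back unchanged. The `Moderation` class and `serve` helper are illustrative names, not part of agentgateway:

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

class Moderation(BaseHTTPRequestHandler):
    """Stand-in moderation endpoint for 127.0.0.1:8000."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        # Inspect `body` and forwarded headers here (h1/h2 above, when matched).
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)  # echo through unchanged

    def log_message(self, fmt, *args):
        pass  # keep the example quiet

def serve(host="127.0.0.1", port=8000):
    """Build the server; call .serve_forever() to run it."""
    return HTTPServer((host, port), Moderation)
```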

Provider integrations

Delegate content moderation to AWS Bedrock Guardrails. Configure with a guardrail identifier, version, and the AWS region:
# yaml-language-server: $schema=https://agentgateway.dev/schema/config
binds:
- port: 3000
  listeners:
  - routes:
    - backends:
      - ai:
          name: openai
          provider:
            openAI:
              model: gpt-4o-mini
      policies:
        ai:
          promptGuard:
            request:
            - bedrockGuardrails:
                guardrailIdentifier: bedrock-guardrail-identifier
                guardrailVersion: DRAFT
                region: us-west-2
            response:
            - bedrockGuardrails:
                guardrailIdentifier: bedrock-guardrail-identifier
                guardrailVersion: DRAFT
                region: us-west-2
Field                 Description
guardrailIdentifier   The Bedrock guardrail ID or ARN
guardrailVersion      Version string, e.g. DRAFT or 1
region                AWS region where the guardrail is deployed
AWS credentials must be available in the environment (e.g. via IAM role, environment variables, or the default credential chain).
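For example, the standard AWS environment variables (placeholders shown; any method supported by the default credential chain works equally well):

```shell
export AWS_ACCESS_KEY_ID=<your-access-key-id>
export AWS_SECRET_ACCESS_KEY=<your-secret-access-key>
export AWS_REGION=us-west-2   # should match the guardrail's region
```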

Running the prompt guard example

cargo run -- -f examples/ai-prompt-guard/config.yaml
The example configures request guards for SSN and Social Security patterns, plus a response guard for the email built-in pattern, with JSON rejection responses compatible with OpenAI clients.
