Prompt Guard allows you to intercept LLM requests and responses at the gateway layer, before they reach your AI provider or return to the client. This enables consistent content policy enforcement without modifying application code. Guards can be applied to requests (what the user sends to the LLM) and to responses (what the LLM sends back).

Overview

Regex

Match patterns in prompt text and block or mask matching content. Supports custom patterns and built-in rules for common PII types.

Webhook

Forward request or response content to an external moderation service. The webhook can reject or rewrite the content.

AWS Bedrock Guardrails

Delegate content safety to AWS Bedrock Guardrails using a guardrail identifier, version, and region.

Google Model Armor

Delegate content safety to Google Cloud Model Armor using a template, project, and location.

Configuration placement

Prompt Guard is configured under policies.ai.promptGuard on a route, applied to an AI backend:
binds:
- port: 3000
  listeners:
  - routes:
    - backends:
      - ai:
          name: openai
          provider:
            openAI:
              model: gpt-4o-mini
      policies:
        ai:
          promptGuard:
            request:
              ...
            response:
              ...

Regex-based filtering

Use the regex guard to match patterns in prompts and responses. The request and response fields each accept a list of guard entries. Specify action: reject to block matching content.
policies:
  ai:
    promptGuard:
      request:
      - regex:
          action: reject
          rules:
          - pattern: SSN
          - pattern: Social Security
        rejection:
          status: 400
          body: "Request rejected: Contains sensitive information"
      response:
      - regex:
          action: reject
          rules:
          - builtin: email
        rejection:
          status: 400
          body: "Response blocked: Contains email address"
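The overview notes that matching content can be blocked or masked. If your agentgateway version supports masking, matched content is redacted in place instead of the whole message being rejected. A sketch (the `mask` action name is an assumption; verify it against your version's config schema):

```yaml
request:
- regex:
    action: mask   # assumption: replaces matched text rather than rejecting; check your schema
    rules:
    - builtin: email
```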

Built-in patterns

Agentgateway ships with built-in patterns for common PII types. Reference them with the builtin key:
Built-in name   Matches
email           Email addresses
ssn             US Social Security Numbers

Custom patterns

Custom patterns accept any regular expression via the pattern field:
rules:
- pattern: "\\b\\d{3}-\\d{2}-\\d{4}\\b"
- pattern: "CONFIDENTIAL"
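The `\b` and `\d` constructs used here behave the same in Python's `re` as in agentgateway's Rust regex engine, so the SSN pattern above can be sanity-checked locally before deploying it:

```python
import re

# The custom SSN pattern from the rules above.
ssn = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

assert ssn.search("Is 123-45-6789 a valid SSN")          # matches: guard would trigger
assert not ssn.search("Call 123-456-7890 for details")   # phone number: no match
```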

Custom rejection responses

Rejection responses are fully configurable. Return JSON errors compatible with OpenAI-style clients:
policies:
  ai:
    promptGuard:
      request:
      - regex:
          action: reject
          rules:
          - pattern: SSN
          - pattern: Social Security
        rejection:
          status: 400
          headers:
            set:
              content-type: "application/json"
          body: |
            {
              "error": {
                "message": "Request rejected: Content contains sensitive information",
                "type": "invalid_request_error",
                "code": "content_policy_violation"
              }
            }
      - regex:
          action: reject
          rules:
          - builtin: email
        rejection:
          status: 400
          headers:
            set:
              content-type: "application/json"
          body: |
            {
              "error": {
                "message": "Response blocked: Contains email",
                "type": "invalid_request_error",
                "code": "pii_detected"
              }
            }
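Because the rejection body is valid JSON in the OpenAI error shape, clients can branch on it programmatically. A Python sketch using the body configured above:

```python
import json

# The rejection body configured above, as an OpenAI-style client receives it.
body = """
{
  "error": {
    "message": "Request rejected: Content contains sensitive information",
    "type": "invalid_request_error",
    "code": "content_policy_violation"
  }
}
"""

err = json.loads(body)["error"]
if err["code"] == "content_policy_violation":
    print("blocked by gateway policy:", err["message"])
```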

Header operations

Rejection responses support three header operations:
rejection:
  status: 403
  headers:
    set:
      content-type: "application/json"
      x-moderation-version: "v1"
    add:
      x-blocked-category: "violence"
    remove:
      - server
  body: '{"error": "Forbidden"}'
Operation   Behavior
set         Replace or create a header (overwrites any existing value)
add         Append a header value (allows multiple values for the same header)
remove      Remove a header from the response

Testing regex guards

Send a request containing a blocked pattern:
curl http://localhost:3000 \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Is 123-45-6789 a valid SSN"}
    ]
  }'
# Request rejected: Contains sensitive information
Test that the response guard blocks LLM output containing emails:
curl http://localhost:3000 \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Return a fake email address"}
    ]
  }'
# Response blocked: Contains email address

Webhook-based validation

Forward request or response content to an external moderation service. The webhook can reject content or allow it through:
policies:
  ai:
    promptGuard:
      request:
        webhook:
          target: 127.0.0.1:8000
          # Forward specific request headers to the webhook
          forwardHeaderMatches:
          - name: h1
            value:
              regex: v1
          - name: h2
            value:
              regex: v2.*
      response:
        webhook:
          target: 127.0.0.1:8000
By default, request headers are not forwarded to the webhook. Use forwardHeaderMatches to specify which headers to include — each entry is a header name with an optional regex value matcher.
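The exact webhook request/response contract (how content is delivered and how a rejection or rewrite is signaled) is defined by agentgateway's configuration schema, so the sketch below only illustrates a stand-in service at the configured target that receives forwarded content and echoes it back unchanged. The `Moderation` class and `serve` helper are illustrative names, not part of agentgateway:

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

class Moderation(BaseHTTPRequestHandler):
    """Stand-in moderation endpoint for 127.0.0.1:8000."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        # Inspect `body` and forwarded headers here (h1/h2 above, when matched).
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)  # echo through unchanged

    def log_message(self, fmt, *args):
        pass  # keep the example quiet

def serve(host="127.0.0.1", port=8000):
    """Build the server; call .serve_forever() to run it."""
    return HTTPServer((host, port), Moderation)
```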

Provider integrations

Delegate content moderation to AWS Bedrock Guardrails. Configure with a guardrail identifier, version, and the AWS region:
# yaml-language-server: $schema=https://agentgateway.dev/schema/config
binds:
- port: 3000
  listeners:
  - routes:
    - backends:
      - ai:
          name: openai
          provider:
            openAI:
              model: gpt-4o-mini
      policies:
        ai:
          promptGuard:
            request:
            - bedrockGuardrails:
                guardrailIdentifier: bedrock-guardrail-identifier
                guardrailVersion: DRAFT
                region: us-west-2
            response:
            - bedrockGuardrails:
                guardrailIdentifier: bedrock-guardrail-identifier
                guardrailVersion: DRAFT
                region: us-west-2
Field                 Description
guardrailIdentifier   The Bedrock guardrail ID or ARN
guardrailVersion      Version string, e.g. DRAFT or 1
region                AWS region where the guardrail is deployed
AWS credentials must be available in the environment (e.g. via IAM role, environment variables, or the default credential chain).
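For example, the standard AWS environment variables (placeholders shown; any method supported by the default credential chain works equally well):

```shell
export AWS_ACCESS_KEY_ID=<your-access-key-id>
export AWS_SECRET_ACCESS_KEY=<your-secret-access-key>
export AWS_REGION=us-west-2   # should match the guardrail's region
```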

Running the prompt guard example

cargo run -- -f examples/ai-prompt-guard/config.yaml
The example configures request guards for SSN and Social Security patterns, plus a response guard for the email built-in pattern, with JSON rejection responses compatible with OpenAI clients.
