Guardrail Types
NeMo Guardrails applies guardrails at multiple stages of the LLM interaction pipeline. Each rail type serves a specific purpose and operates at a distinct point in the conversation flow.
Overview of Rail Types
The five main types of guardrails are:

| Stage | Rail Type | Common Use Cases |
|---|---|---|
| Before LLM | Input rails | Content safety, jailbreak detection, topic control, PII masking |
| RAG pipeline | Retrieval rails | Document filtering, chunk validation |
| Conversation | Dialog rails | Flow control, guided conversations |
| Tool calls | Execution rails | Action input/output validation |
| After LLM | Output rails | Response filtering, fact checking, sensitive data removal |
Input and Output rails are the most commonly used types. Start with these before implementing more advanced rail types.
1. Input Rails
Input rails are applied to user input before the LLM is invoked. They can validate, reject, or transform user messages.
Use Cases
- Jailbreak Detection: Detect and block attempts to bypass safety measures or manipulate the LLM into harmful behavior
- Content Safety: Check user inputs for harmful, offensive, or inappropriate content before processing
- PII Masking: Detect and mask personally identifiable information such as emails, phone numbers, or SSNs
- Topic Control: Ensure user requests stay within allowed topic boundaries
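To make the validate, reject, and transform behaviors concrete, here is a minimal plain-Python sketch. It is not the NeMo Guardrails API: `check_input`, `InputRailResult`, the regular expressions, and the phrase list are all hypothetical stand-ins for the real detectors a production rail would call.

```python
import re
from dataclasses import dataclass

# Hypothetical patterns for illustration; real PII detection needs far broader coverage.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

# Hypothetical phrases standing in for a real jailbreak classifier.
JAILBREAK_PHRASES = ("ignore previous instructions", "ignore all previous instructions")


@dataclass
class InputRailResult:
    allowed: bool     # False means the message is rejected before any LLM call
    text: str         # the (possibly transformed) message to forward to the LLM
    reason: str = ""  # why the message was rejected, if it was


def check_input(message: str) -> InputRailResult:
    """Reject suspected jailbreaks, otherwise mask PII and allow the message."""
    lowered = message.lower()
    if any(phrase in lowered for phrase in JAILBREAK_PHRASES):
        return InputRailResult(False, message, "possible jailbreak attempt")
    masked = EMAIL_RE.sub("<EMAIL>", message)
    masked = SSN_RE.sub("<SSN>", masked)
    return InputRailResult(True, masked)
```

A rejected message never reaches the LLM, while an allowed one is forwarded in its masked form.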
Configuration Example
Colang Example
2. Retrieval Rails
Retrieval rails operate in RAG (Retrieval Augmented Generation) scenarios, filtering and validating retrieved chunks before they’re used to prompt the LLM.
Use Cases
- Chunk Relevance: Ensure retrieved documents are actually relevant to the query
- PII Detection: Mask or remove sensitive data from retrieved content
- Source Validation: Verify chunks come from trusted sources
- Content Filtering: Remove inappropriate or outdated information
Configuration Example
Custom Retrieval Action
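The following is a plain-Python sketch of the kind of logic a custom retrieval action might contain: it drops chunks from untrusted sources and masks email addresses in the rest. The chunk schema (dicts with `source` and `text` keys), the `TRUSTED_SOURCES` allowlist, and the function name `filter_chunks` are assumptions made for illustration; registering the function with NeMo Guardrails and wiring it into a retrieval flow is intentionally left out.

```python
import re
from typing import Dict, List

# Hypothetical allowlist of sources considered trustworthy.
TRUSTED_SOURCES = {"kb.internal", "docs.internal"}

# Simplified email pattern, for illustration only.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def filter_chunks(chunks: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Keep only chunks from trusted sources and mask emails in their text."""
    kept = []
    for chunk in chunks:
        if chunk.get("source") not in TRUSTED_SOURCES:
            continue  # source validation: skip untrusted chunks
        masked_text = EMAIL_RE.sub("<EMAIL>", chunk["text"])
        kept.append({**chunk, "text": masked_text})
    return kept
```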
3. Dialog Rails
Dialog rails influence how the conversation flows by operating on canonical form messages. They determine if an action should execute, if the LLM should generate the next step, or if a predefined response should be used.
Use Cases
- Conversational Flows: Define specific paths the conversation should follow
- Topic Restrictions
- Authentication Flows
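As a rough illustration of how a dialog rail chooses among its three outcomes (execute an action, let the LLM generate the next step, or return a predefined response), the sketch below maps a canonical form to a decision. This is a plain-Python analogy rather than Colang or the NeMo Guardrails runtime; the canonical forms, the `fetch_balance` action name, and `decide_next_step` are hypothetical.

```python
from enum import Enum
from typing import Dict, Tuple


class NextStep(Enum):
    PREDEFINED_RESPONSE = "predefined_response"
    EXECUTE_ACTION = "execute_action"
    LLM_GENERATE = "llm_generate"


# Hypothetical flows keyed by the canonical form of the user message.
# The second tuple element is the response text or the action name.
FLOWS: Dict[str, Tuple[NextStep, str]] = {
    "ask about politics": (NextStep.PREDEFINED_RESPONSE, "I can't discuss political topics."),
    "ask account balance": (NextStep.EXECUTE_ACTION, "fetch_balance"),
}


def decide_next_step(canonical_form: str) -> Tuple[NextStep, str]:
    """Return the matching flow's decision, or defer to the LLM if none matches."""
    return FLOWS.get(canonical_form, (NextStep.LLM_GENERATE, ""))
```

Messages that match no predefined flow fall through to LLM generation, which mirrors the fallback described above.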
Configuration Example
4. Execution Rails
Execution rails are applied to tool/action calls, validating both the input parameters and the output results before they’re used in the conversation.
Use Cases
- Input Validation: Ensure action parameters are safe and well-formed
- Output Sanitization: Filter sensitive data from action results
- Authorization: Verify the user has permission to execute the action
- Rate Limiting: Control how often certain actions can be called
Configuration Example
Example: Database Query Validation
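A minimal sketch of both halves of an execution rail for a database tool, written in plain Python: `validate_query` checks the input (a single read-only `SELECT` statement) and `sanitize_rows` filters the output. The function names, the keyword list, and the `SENSITIVE_COLUMNS` set are assumptions for illustration and are not part of NeMo Guardrails. A keyword filter like this is only a first line of defense and does not replace parameterized queries or database-level permissions.

```python
import re
from typing import Any, Dict, List

# Statements that modify data or schema; matched as whole words, case-insensitively.
FORBIDDEN = re.compile(
    r"\b(insert|update|delete|drop|alter|truncate|grant)\b", re.IGNORECASE
)

# Hypothetical column names that must never reach the conversation.
SENSITIVE_COLUMNS = {"password", "ssn"}


def validate_query(sql: str) -> bool:
    """Input validation: allow only a single read-only SELECT statement."""
    stripped = sql.strip().rstrip(";")
    if ";" in stripped:
        return False  # more than one statement
    if not stripped.lower().startswith("select"):
        return False
    return FORBIDDEN.search(stripped) is None


def sanitize_rows(rows: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Output sanitization: drop sensitive columns from every result row."""
    return [
        {key: value for key, value in row.items() if key.lower() not in SENSITIVE_COLUMNS}
        for row in rows
    ]
```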
5. Output Rails
Output rails are applied to the LLM-generated response before it’s returned to the user. They can reject, modify, or enhance the output.
Use Cases
- Fact Checking: Verify factual claims in the response against trusted sources
- Hallucination Detection: Detect when the LLM generates information not grounded in context
- Content Moderation: Check for harmful, biased, or inappropriate content in responses
- PII Removal: Strip any accidentally generated personal information
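To illustrate what "not grounded in context" can mean in practice, the sketch below flags response sentences that share few words with the supplied context. This lexical-overlap heuristic is a deliberately crude stand-in, not how NeMo Guardrails implements hallucination detection; the function name `ungrounded_sentences` and the `min_overlap` threshold are hypothetical.

```python
import re
from typing import List, Set


def _tokens(text: str) -> Set[str]:
    """Lowercased alphanumeric tokens of a text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def ungrounded_sentences(response: str, context: str, min_overlap: float = 0.5) -> List[str]:
    """Return response sentences whose token overlap with the context is below min_overlap."""
    context_tokens = _tokens(context)
    flagged = []
    # Split on whitespace that follows sentence-ending punctuation.
    for sentence in re.split(r"(?<=[.!?])\s+", response.strip()):
        words = _tokens(sentence)
        if not words:
            continue
        overlap = len(words & context_tokens) / len(words)
        if overlap < min_overlap:
            flagged.append(sentence)
    return flagged
```

An output rail built on a check like this could reject the response, remove the flagged sentences, or append a warning, matching the reject, modify, or enhance options above.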
Configuration Example
Colang Example
Use Case Matrix
Different use cases benefit from different combinations of rail types:

| Use Case | Input | Retrieval | Dialog | Execution | Output |
|---|---|---|---|---|---|
| Content Safety | ✅ | | | | ✅ |
| Jailbreak Protection | ✅ | | | | |
| Topic Control | ✅ | | ✅ | | |
| PII Detection | ✅ | ✅ | | | ✅ |
| Knowledge Base / RAG | | ✅ | | | ✅ |
| Agentic Security | | | | ✅ | |
| Custom Rails | ✅ | ✅ | ✅ | ✅ | ✅ |
Combining Multiple Rails
Rails of different types work together to provide comprehensive protection.
Rail Execution Order
Rails are executed in this order for each conversation turn:
- Input rails → Process user message
- Dialog rails → Determine conversation flow
- Retrieval rails → Filter RAG chunks (if applicable)
- Execution rails → Validate tool calls (if applicable)
- Output rails → Validate bot response
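The ordering above can be sketched as a simple pipeline in which any stage may halt the rest of the turn. This is a plain-Python analogy, not the NeMo Guardrails runtime: `StopProcessing`, `run_turn`, and the example rails are hypothetical, and the LLM call itself is omitted.

```python
from typing import Callable, List, Tuple


class StopProcessing(Exception):
    """Raised by a rail to halt the current turn immediately."""


Rail = Callable[[dict], None]


def passthrough(state: dict) -> None:
    """A rail that accepts everything."""


def block_off_topic(state: dict) -> None:
    """A dialog rail that halts the turn for off-topic requests."""
    if state.get("off_topic"):
        raise StopProcessing("off-topic request")


# Stages in the order listed above.
STAGES: List[Tuple[str, Rail]] = [
    ("input", passthrough),
    ("dialog", block_off_topic),
    ("retrieval", passthrough),
    ("execution", passthrough),
    ("output", passthrough),
]


def run_turn(state: dict, stages: List[Tuple[str, Rail]]) -> List[str]:
    """Run each stage in order and return the names of the stages that ran."""
    executed = []
    for name, rail in stages:
        executed.append(name)
        try:
            rail(state)
        except StopProcessing:
            break  # later stages are skipped for this turn
    return executed
```

With an off-topic request, only the input and dialog stages run; otherwise all five stages run in order.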
Any rail can call stop to halt processing immediately. This is useful for rejecting inappropriate requests or responses.
Next Steps
- Colang DSL: Learn how to write rail definitions using Colang
- Guardrails Library: Explore pre-built guardrails you can use immediately