
Core Concepts Overview

NeMo Guardrails is an open-source toolkit for easily adding programmable guardrails to LLM-based conversational applications. Guardrails (or “rails” for short) are specific ways of controlling the output of a large language model, such as not talking about politics, responding in a particular way to specific user requests, following a predefined dialog path, using a particular language style, extracting structured data, and more.

What Are Programmable Guardrails?

Programmable guardrails sit between your application code and the LLM, providing a flexible layer of control over how the LLM behaves. Rather than relying solely on prompts or post-processing, guardrails enable you to define explicit rules and flows that govern the conversation.
[Figure: Programmable guardrails architecture]

Key Benefits

Programmable guardrails provide several critical advantages:

Build Trustworthy Applications

Define rails to guide and safeguard conversations. You can define how your LLM-based application behaves on specific topics and prevent it from engaging in discussions on unwanted topics.

Connect Services Securely

Connect an LLM to other services (tools) seamlessly and securely. Validate tool inputs and outputs with execution rails.

Controllable Dialog

Steer the LLM to follow pre-defined conversational paths, allowing you to design the interaction following conversation design best practices and enforce standard operating procedures.

Multi-Stage Protection

Apply different types of guardrails at five distinct stages: input, retrieval, dialog, execution, and output.
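In `config.yml`, three of these stages (input, retrieval, and output) are configured directly as lists of flows; dialog rails are defined by your Colang flows, and execution rails wrap the action calls themselves. A minimal sketch (flow names other than `self check input` and `self check output` are illustrative):

```yaml
rails:
  input:                 # applied to what the user says
    flows:
      - self check input
  retrieval:             # applied to retrieved chunks (RAG)
    flows:
      - check relevance  # illustrative name
  output:                # applied to what the bot says
    flows:
      - self check output
```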

Core Framework Components

The NeMo Guardrails framework consists of several key components that work together:

RailsConfig

The RailsConfig class defines the complete configuration for your guardrails, including:
  • LLM Models: Specify which language models to use (main, embeddings, etc.)
  • Rails: Configure which guardrails are active and how they operate
  • Colang Definitions: Load dialog flows and message definitions from .co files
  • Custom Actions: Register Python functions as callable actions
  • Instructions: Provide context and guidelines to the LLM
```python
from nemoguardrails import RailsConfig

# Load configuration from a directory
config = RailsConfig.from_path("./config")
```
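Configurations can also be built inline with `RailsConfig.from_content`, which is convenient for tests and small experiments. A minimal sketch (the model and flow definitions are illustrative):

```python
from nemoguardrails import RailsConfig

yaml_content = """
models:
  - type: main
    engine: openai
    model: gpt-4o-mini
"""

colang_content = """
define user express greeting
  "hello"

define flow greeting
  user express greeting
  bot express greeting
"""

config = RailsConfig.from_content(
    yaml_content=yaml_content,
    colang_content=colang_content,
)
```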

LLMRails

The LLMRails class is the main entry point for using guardrails. It wraps your LLM with the configured guardrails:
```python
from nemoguardrails import LLMRails, RailsConfig

config = RailsConfig.from_path("./config")
rails = LLMRails(config)

response = rails.generate(
    messages=[{"role": "user", "content": "Hello!"}]
)
```
The generate method uses the same message format as the OpenAI Chat Completions API, making it easy to integrate with existing applications.

Event-Driven Runtime

NeMo Guardrails uses an event-driven runtime to process conversations. Every interaction generates events that flow through the system:
  1. User utterance → UtteranceUserActionFinished event
  2. Canonical form generation → UserIntent event
  3. Next step decision → BotIntent or action events
  4. Bot response generation → StartUtteranceBotAction event
This event-driven design allows guardrails to intercept and modify the conversation at any stage.
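The pipeline above can be illustrated with a toy, stdlib-only sketch (this is not the actual NeMo Guardrails runtime; the handlers and their logic are illustrative, only the event names mirror the stages):

```python
# Toy event-driven pipeline: each handler consumes one event type
# and emits the next, until a bot utterance is produced.

def on_user_utterance(event):
    # Stage 2: map the raw utterance to a canonical form (UserIntent).
    return {"type": "UserIntent", "intent": "express greeting"}

def on_user_intent(event):
    # Stage 3: decide the next step based on the matched intent.
    return {"type": "BotIntent", "intent": "express greeting"}

def on_bot_intent(event):
    # Stage 4: generate the bot response for the chosen intent.
    return {"type": "StartUtteranceBotAction", "script": "Hello there!"}

HANDLERS = {
    "UtteranceUserActionFinished": on_user_utterance,
    "UserIntent": on_user_intent,
    "BotIntent": on_bot_intent,
}

def run(event):
    """Dispatch events until no handler matches (i.e., a bot utterance)."""
    trace = [event["type"]]
    while event["type"] in HANDLERS:
        event = HANDLERS[event["type"]](event)
        trace.append(event["type"])
    return event, trace

final, trace = run(
    {"type": "UtteranceUserActionFinished", "final_transcript": "Hello!"}
)
# trace: ["UtteranceUserActionFinished", "UserIntent",
#         "BotIntent", "StartUtteranceBotAction"]
```

Because every handler only sees events, a guardrail can be inserted at any stage simply by registering a handler for the corresponding event type.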

Async-First Architecture

NeMo Guardrails is built with an async-first design. The core mechanics are implemented using Python’s async model, providing several advantages:
Multiple users can be served concurrently without blocking. When one request waits for an LLM response, others can continue processing.
Both synchronous and asynchronous versions of methods are available:
  • Sync: rails.generate(messages)
  • Async: await rails.generate_async(messages)
Actions and LLM calls run asynchronously, making better use of system resources during I/O operations.
```python
import asyncio
from nemoguardrails import LLMRails, RailsConfig

config = RailsConfig.from_path("./config")
rails = LLMRails(config)

# Async usage
async def chat():
    response = await rails.generate_async(
        messages=[{"role": "user", "content": "Hello!"}]
    )
    return response

response = asyncio.run(chat())
```
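The concurrency benefit can be demonstrated with a stdlib-only sketch that stands in for the LLM call (the `fake_llm_call` helper is hypothetical, for illustration only):

```python
import asyncio
import time

async def fake_llm_call(user_id: str, delay: float = 0.1) -> str:
    # Stand-in for an awaited LLM response; sleeping yields control
    # so other requests can make progress in the meantime.
    await asyncio.sleep(delay)
    return f"response for {user_id}"

async def serve_all():
    # Three "users" served concurrently: total wall time is close to
    # one delay, not three, because the waits overlap.
    start = time.perf_counter()
    results = await asyncio.gather(
        *(fake_llm_call(u) for u in ("a", "b", "c"))
    )
    elapsed = time.perf_counter() - start
    return results, elapsed

results, elapsed = asyncio.run(serve_all())
```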

Configuration Structure

A typical guardrails configuration follows this structure:
```
config/
├── config.yml          # Main configuration file
├── config.py           # Custom initialization code (optional)
├── actions.py          # Custom Python actions (optional)
├── rails.co            # Colang flow definitions
└── kb/                 # Knowledge base documents (optional)
    └── *.md
```
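A custom action in `actions.py` is registered with the toolkit's `@action` decorator. A minimal sketch (the blocked-terms check, the term list, and reading `bot_message` from the action context follow the pattern in the toolkit's examples; treat the specifics as illustrative):

```python
from nemoguardrails.actions import action

@action(name="check_blocked_terms")
async def check_blocked_terms(context: dict = None):
    # The current bot message is available in the action context.
    bot_message = (context or {}).get("bot_message", "")

    # Illustrative blocked-term list; replace with your own policy.
    blocked = ["proprietary", "confidential"]
    return any(term in bot_message.lower() for term in blocked)
```

An action that returns `True`/`False` like this can then be referenced from a Colang flow to decide whether to stop or rewrite the response.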

Sample config.yml

```yaml
models:
  - type: main
    engine: openai
    model: gpt-4o-mini

rails:
  input:
    flows:
      - check jailbreak
      - mask sensitive data on input

  output:
    flows:
      - self check facts
      - activefence moderation
```

Use Cases

You can use programmable guardrails in different types of applications:
For example, in a retrieval-augmented generation (RAG) application, you can enforce relevance checks on retrieved chunks and fact-checking and hallucination detection on the output:
```yaml
rails:
  retrieval:
    flows:
      - check relevance
  output:
    flows:
      - self check facts
      - check hallucination
```
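In a RAG setup, retrieved chunks are passed alongside the user message using a message with the `context` role. A sketch, assuming `rails` is the `LLMRails` instance from earlier and using the toolkit's `relevant_chunks` convention (the chunk text here is illustrative):

```python
response = rails.generate(messages=[
    {
        "role": "context",
        "content": {"relevant_chunks": "Excerpt from a retrieved document."},
    },
    {"role": "user", "content": "What does the document say?"},
])
```

Retrieval and output rails then operate on these chunks and on the generated answer, respectively.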

Next Steps

Guardrail Types

Learn about the five types of rails and when to use them

Colang DSL

Understand the Colang language for defining flows and rails

Architecture

Deep dive into the runtime and processing pipeline

Get Started

Start building your first guardrails configuration
