NeMo Guardrails provides a straightforward Python API for adding programmable guardrails to your LLM-based applications. The API is async-first and integrates seamlessly with your existing code.
Core Classes
LLMRails
The LLMRails class is the main entry point for using guardrails programmatically.
from nemoguardrails import LLMRails, RailsConfig
Initialization
From Path
# Load configuration from a directory
config = RailsConfig.from_path("path/to/config")
rails = LLMRails(config)
Constructor Parameters:

config (RailsConfig): A rails configuration loaded from a directory or created programmatically.
llm (BaseLLM | BaseChatModel, default None): An optional LLM engine to use. If provided, it is used as the main LLM and takes precedence over any main LLM specified in the config.
verbose (bool, default False): Whether logging should be verbose.
RailsConfig
The RailsConfig class represents a guardrails configuration.
Loading Configuration
from nemoguardrails import RailsConfig
# Load from a directory
config = RailsConfig.from_path("path/to/config")
Loads a RailsConfig from the specified path. The path should contain:
config.yml or config.yaml - Main configuration file
*.co - Colang files defining rails and flows
config.py - Optional initialization code
actions.py - Optional custom actions
Generation Methods
generate_async()
The primary async method for generating responses with guardrails applied.
response = await rails.generate_async(
    messages=[{"role": "user", "content": "Hello!"}]
)
Parameters:

prompt (str, default None): The prompt to be used for completion. Cannot be used with messages.
messages (List[dict], default None): The history of messages to generate the next message. Cannot be used with prompt.
options (GenerationOptions | dict, default None): Options specific to the generation (e.g., output variables, logging).
state (State | dict, default None): The state object that should be used as the starting point.
streaming_handler (StreamingHandler, default None): If specified, and the config supports streaming, the provided handler is used for streaming.
Returns:
When using prompt: Returns a string with the completion
When using messages: Returns a dict with the assistant’s message
When using options: Returns a GenerationResponse object with additional metadata
Messages follow the OpenAI Chat Completions API format:
messages = [
    {"role": "context", "content": {"user_name": "John"}},
    {"role": "user", "content": "Hello! How are you?"},
    {"role": "assistant", "content": "I am fine, thank you!"},
    {"role": "event", "event": {"type": "UserSilent"}}
]
Supported roles:
user - User messages
assistant - Assistant/bot messages
context - Context variables (must be a dict)
event - Custom events
system - System messages
tool - Tool/function call results
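The roles above can be mixed in a single message history; a short sketch with illustrative values:

```python
# A message history combining several supported roles.
messages = [
    # "context" sets variables available to rails and actions; content must be a dict.
    {"role": "context", "content": {"user_name": "John"}},
    # "system" provides system-level instructions.
    {"role": "system", "content": "You are a helpful assistant."},
    # Regular conversation turns.
    {"role": "user", "content": "Hello! How are you?"},
    {"role": "assistant", "content": "I am fine, thank you!"},
    # "event" injects a custom event instead of text content.
    {"role": "event", "event": {"type": "UserSilent"}},
]

roles = [m["role"] for m in messages]
print(roles)
# → ['context', 'system', 'user', 'assistant', 'event']
```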
generate()
Synchronous wrapper around generate_async().
response = rails.generate(
    messages=[{"role": "user", "content": "Hello!"}]
)
The synchronous method is provided for convenience but internally uses the async API. For best performance, use generate_async() in async contexts.
Streaming
stream_async()
Streams the response token-by-token with guardrails applied.
async for chunk in rails.stream_async(
    messages=[{"role": "user", "content": "Tell me a story"}]
):
    print(chunk, end="", flush=True)
Parameters:

prompt (str, default None): The prompt to be used for completion.
messages (List[dict], default None): The history of messages to generate the next message.
options (GenerationOptions | dict, default None): Generation options.
state (State | dict, default None): The state object to use as the starting point.
include_generation_metadata (bool, default False): If True, yields dicts with text and metadata keys. If False, yields strings.
Streaming with output rails requires enabling streaming support in your configuration:

rails:
  output:
    streaming:
      enabled: true
Complete Examples
Basic Usage
from nemoguardrails import LLMRails, RailsConfig

# Load configuration
config = RailsConfig.from_path("./config")
rails = LLMRails(config)

# Generate a response
response = rails.generate(
    messages=[{"role": "user", "content": "Hello!"}]
)
print(response["content"])
# Output: "Hi! How can I help you?"
Using Async API
import asyncio
from nemoguardrails import LLMRails, RailsConfig

async def main():
    config = RailsConfig.from_path("./config")
    rails = LLMRails(config)

    # Async generation
    response = await rails.generate_async(
        messages=[
            {"role": "user", "content": "What is the weather like?"}
        ]
    )
    print(response["content"])

asyncio.run(main())
With Generation Options
from nemoguardrails import LLMRails, RailsConfig
from nemoguardrails.rails.llm.options import GenerationOptions

config = RailsConfig.from_path("./config")
rails = LLMRails(config)

# Create generation options
options = GenerationOptions(
    output_vars=True,  # Return context variables
    log={
        "activated_rails": True,
        "llm_calls": True,
        "internal_events": True
    }
)

response = rails.generate(
    messages=[{"role": "user", "content": "Hello"}],
    options=options
)

# Access the response
print(response.response[0]["content"])

# Access output variables
print(response.output_data)

# Access logs
if response.log:
    print(f"Activated rails: {response.log.activated_rails}")
    print(f"LLM calls: {len(response.log.llm_calls)}")
Streaming Example
import asyncio
from nemoguardrails import LLMRails, RailsConfig

async def stream_example():
    config = RailsConfig.from_path("./config")
    rails = LLMRails(config)

    print("Bot: ", end="", flush=True)
    async for chunk in rails.stream_async(
        messages=[{"role": "user", "content": "Tell me a joke"}]
    ):
        print(chunk, end="", flush=True)
    print()  # New line after streaming

asyncio.run(stream_example())
With Context Variables
from nemoguardrails import LLMRails, RailsConfig

config = RailsConfig.from_path("./config")
rails = LLMRails(config)

# Include context in messages
messages = [
    {
        "role": "context",
        "content": {
            "user_name": "Alice",
            "user_age": 25,
            "preferences": {"language": "English"}
        }
    },
    {"role": "user", "content": "What's my name?"}
]

response = rails.generate(messages=messages)
print(response["content"])
# Output: "Your name is Alice."
With Custom LLM
from nemoguardrails import LLMRails, RailsConfig
from langchain_openai import ChatOpenAI

config = RailsConfig.from_path("./config")

# Use GPT-4 instead of the configured model
llm = ChatOpenAI(model="gpt-4", temperature=0.7)
rails = LLMRails(config, llm=llm)

response = rails.generate(
    messages=[{"role": "user", "content": "Explain quantum computing"}]
)
print(response["content"])
Conversation History
from nemoguardrails import LLMRails, RailsConfig

config = RailsConfig.from_path("./config")
rails = LLMRails(config)

# Maintain conversation history
history = []

while True:
    user_input = input("You: ")
    if user_input.lower() == "quit":
        break

    # Add user message to history
    history.append({"role": "user", "content": user_input})

    # Generate response with full history
    response = rails.generate(messages=history)

    # Add bot response to history
    history.append(response)
    print(f"Bot: {response['content']}")
Advanced Features
Updating the LLM
You can update the LLM used by the rails instance:
from langchain_openai import ChatOpenAI
from nemoguardrails import LLMRails, RailsConfig

config = RailsConfig.from_path("./config")
rails = LLMRails(config)

# Update to use a different model
new_llm = ChatOpenAI(model="gpt-4-turbo")
rails.update_llm(new_llm)
Registering Custom Actions
Register custom Python functions as actions:
from nemoguardrails import LLMRails, RailsConfig

config = RailsConfig.from_path("./config")
rails = LLMRails(config)

def custom_action(context: dict):
    """A custom action that can be called from Colang."""
    user_name = context.get("user_name", "User")
    return f"Hello, {user_name}!"

# Register the action
rails.runtime.register_action(custom_action, "custom_action")

response = rails.generate(
    messages=[{"role": "user", "content": "Greet me"}]
)
Error Handling
from nemoguardrails import LLMRails, RailsConfig
from nemoguardrails.exceptions import (
    InvalidRailsConfigurationError,
    StreamingNotSupportedError
)

try:
    config = RailsConfig.from_path("./config")
    rails = LLMRails(config)
    response = rails.generate(
        messages=[{"role": "user", "content": "Hello"}]
    )
except InvalidRailsConfigurationError as e:
    print(f"Configuration error: {e}")
except StreamingNotSupportedError as e:
    print(f"Streaming error: {e}")
except Exception as e:
    print(f"Unexpected error: {e}")
Next Steps
Server API: Deploy guardrails as a REST API server.
CLI Tools: Use command-line tools for testing and development.
LangChain Integration: Integrate with LangChain chains and agents.
Configuration Guide: Learn how to configure guardrails.