NeMo Guardrails provides a straightforward Python API for adding programmable guardrails to your LLM-based applications. The API is async-first and integrates seamlessly with your existing code.
Core Classes
LLMRails
The LLMRails class is the main entry point for using guardrails programmatically.
from nemoguardrails import LLMRails, RailsConfig
Initialization
From Path
# Load configuration from a directory
config = RailsConfig.from_path("path/to/config")
rails = LLMRails(config)
Constructor Parameters:

config (RailsConfig): A rails configuration loaded from a directory or created programmatically.
llm (BaseLLM | BaseChatModel, default None): An optional LLM engine to use. If provided, it is used as the main LLM and takes precedence over any main LLM specified in the config.
verbose (bool, default False): Whether logging should be verbose.
RailsConfig
The RailsConfig class represents a guardrails configuration.
Loading Configuration
from nemoguardrails import RailsConfig
# Load from a directory
config = RailsConfig.from_path("path/to/config")
Loads a RailsConfig from the specified path. The path should contain:
config.yml or config.yaml - Main configuration file
*.co - Colang files defining rails and flows
config.py - Optional initialization code
actions.py - Optional custom actions
Generation Methods
generate_async()
The primary async method for generating responses with guardrails applied.
response = await rails.generate_async(
    messages=[{"role": "user", "content": "Hello!"}]
)
Parameters:

prompt (str, default None): The prompt to be used for completion. Cannot be used with messages.
messages (List[dict], default None): The history of messages to generate the next message. Cannot be used with prompt.
options (GenerationOptions | dict, default None): Options specific to the generation (e.g., output variables, logging).
state (State | dict, default None): The state object that should be used as the starting point.
streaming_handler (StreamingHandler, default None): If specified, and the config supports streaming, the provided handler is used for streaming.
Returns:
When using prompt: Returns a string with the completion
When using messages: Returns a dict with the assistant’s message
When using options: Returns a GenerationResponse object with additional metadata
Messages follow the OpenAI Chat Completions API format:
messages = [
    {"role": "context", "content": {"user_name": "John"}},
    {"role": "user", "content": "Hello! How are you?"},
    {"role": "assistant", "content": "I am fine, thank you!"},
    {"role": "event", "event": {"type": "UserSilent"}}
]
Supported roles:
user - User messages
assistant - Assistant/bot messages
context - Context variables (must be a dict)
event - Custom events
system - System messages
tool - Tool/function call results
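The roles above can be mixed in a single message history; a short sketch with illustrative values:

```python
# A message history combining several supported roles.
messages = [
    # "context" sets variables available to rails and actions; content must be a dict.
    {"role": "context", "content": {"user_name": "John"}},
    # "system" provides system-level instructions.
    {"role": "system", "content": "You are a helpful assistant."},
    # Regular conversation turns.
    {"role": "user", "content": "Hello! How are you?"},
    {"role": "assistant", "content": "I am fine, thank you!"},
    # "event" injects a custom event instead of text content.
    {"role": "event", "event": {"type": "UserSilent"}},
]

roles = [m["role"] for m in messages]
print(roles)
# → ['context', 'system', 'user', 'assistant', 'event']
```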
generate()
Synchronous wrapper around generate_async().
response = rails.generate(
    messages=[{"role": "user", "content": "Hello!"}]
)
The synchronous method is provided for convenience but internally uses the async API. For best performance, use generate_async() in async contexts.
Streaming
stream_async()
Streams the response token-by-token with guardrails applied.
async for chunk in rails.stream_async(
    messages=[{"role": "user", "content": "Tell me a story"}]
):
    print(chunk, end="", flush=True)
Parameters:

prompt (str, default None): The prompt to be used for completion.
messages (List[dict], default None): The history of messages to generate the next message.
options (GenerationOptions | dict, default None): Generation options.
state (State | dict, default None): The state object to use as the starting point.
include_generation_metadata (bool, default False): If True, yields dicts with text and metadata keys. If False, yields strings.
Streaming with output rails requires enabling streaming support in your configuration:

rails:
  output:
    streaming:
      enabled: true
Complete Examples
Basic Usage
from nemoguardrails import LLMRails, RailsConfig

# Load configuration
config = RailsConfig.from_path("./config")
rails = LLMRails(config)

# Generate a response
response = rails.generate(
    messages=[{"role": "user", "content": "Hello!"}]
)
print(response["content"])
# Output: "Hi! How can I help you?"
Using Async API
import asyncio
from nemoguardrails import LLMRails, RailsConfig

async def main():
    config = RailsConfig.from_path("./config")
    rails = LLMRails(config)

    # Async generation
    response = await rails.generate_async(
        messages=[
            {"role": "user", "content": "What is the weather like?"}
        ]
    )
    print(response["content"])

asyncio.run(main())
With Generation Options
from nemoguardrails import LLMRails, RailsConfig
from nemoguardrails.rails.llm.options import GenerationOptions

config = RailsConfig.from_path("./config")
rails = LLMRails(config)

# Create generation options
options = GenerationOptions(
    output_vars=True,  # Return context variables
    log={
        "activated_rails": True,
        "llm_calls": True,
        "internal_events": True
    }
)

response = rails.generate(
    messages=[{"role": "user", "content": "Hello"}],
    options=options
)

# Access the response
print(response.response[0]["content"])

# Access output variables
print(response.output_data)

# Access logs
if response.log:
    print(f"Activated rails: {response.log.activated_rails}")
    print(f"LLM calls: {len(response.log.llm_calls)}")
Streaming Example
import asyncio
from nemoguardrails import LLMRails, RailsConfig

async def stream_example():
    config = RailsConfig.from_path("./config")
    rails = LLMRails(config)

    print("Bot: ", end="", flush=True)
    async for chunk in rails.stream_async(
        messages=[{"role": "user", "content": "Tell me a joke"}]
    ):
        print(chunk, end="", flush=True)
    print()  # New line after streaming

asyncio.run(stream_example())
With Context Variables
from nemoguardrails import LLMRails, RailsConfig

config = RailsConfig.from_path("./config")
rails = LLMRails(config)

# Include context in messages
messages = [
    {
        "role": "context",
        "content": {
            "user_name": "Alice",
            "user_age": 25,
            "preferences": {"language": "English"}
        }
    },
    {"role": "user", "content": "What's my name?"}
]

response = rails.generate(messages=messages)
print(response["content"])
# Output: "Your name is Alice."
With Custom LLM
from nemoguardrails import LLMRails, RailsConfig
from langchain_openai import ChatOpenAI

config = RailsConfig.from_path("./config")

# Use GPT-4 instead of the configured model
llm = ChatOpenAI(model="gpt-4", temperature=0.7)
rails = LLMRails(config, llm=llm)

response = rails.generate(
    messages=[{"role": "user", "content": "Explain quantum computing"}]
)
print(response["content"])
Conversation History
from nemoguardrails import LLMRails, RailsConfig

config = RailsConfig.from_path("./config")
rails = LLMRails(config)

# Maintain conversation history
history = []

while True:
    user_input = input("You: ")
    if user_input.lower() == "quit":
        break

    # Add user message to history
    history.append({"role": "user", "content": user_input})

    # Generate response with full history
    response = rails.generate(messages=history)

    # Add bot response to history
    history.append(response)
    print(f"Bot: {response['content']}")
Advanced Features
Updating the LLM
You can update the LLM used by the rails instance:
from langchain_openai import ChatOpenAI
from nemoguardrails import LLMRails, RailsConfig

config = RailsConfig.from_path("./config")
rails = LLMRails(config)

# Update to use a different model
new_llm = ChatOpenAI(model="gpt-4-turbo")
rails.update_llm(new_llm)
Registering Custom Actions
Register custom Python functions as actions:
from nemoguardrails import LLMRails, RailsConfig

config = RailsConfig.from_path("./config")
rails = LLMRails(config)

def custom_action(context: dict):
    """A custom action that can be called from Colang."""
    user_name = context.get("user_name", "User")
    return f"Hello, {user_name}!"

# Register the action
rails.runtime.register_action(custom_action, "custom_action")

response = rails.generate(
    messages=[{"role": "user", "content": "Greet me"}]
)
Error Handling
from nemoguardrails import LLMRails, RailsConfig
from nemoguardrails.exceptions import (
    InvalidRailsConfigurationError,
    StreamingNotSupportedError
)

try:
    config = RailsConfig.from_path("./config")
    rails = LLMRails(config)
    response = rails.generate(
        messages=[{"role": "user", "content": "Hello"}]
    )
except InvalidRailsConfigurationError as e:
    print(f"Configuration error: {e}")
except StreamingNotSupportedError as e:
    print(f"Streaming error: {e}")
except Exception as e:
    print(f"Unexpected error: {e}")
Next Steps
Server API: Deploy guardrails as a REST API server.
CLI Tools: Use command-line tools for testing and development.
LangChain Integration: Integrate with LangChain chains and agents.
Configuration Guide: Learn how to configure guardrails.