Server Modes

The NeMo Guardrails server supports two modes:

Multi-Config Mode

In multi-config mode, the server can serve multiple guardrails configurations:
nemoguardrails server --config=/path/to/configs
Directory structure:
configs/
├── config1/
│   ├── config.yml
│   └── rails.co
├── config2/
│   ├── config.yml
│   └── rails.co
└── config3/
    ├── config.yml
    └── rails.co
Clients specify which config to use:
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[...],
    extra_body={
        "guardrails": {
            "config_id": "config1"
        }
    }
)

Single-Config Mode

In single-config mode, the server serves a single configuration:
nemoguardrails server --config=/path/to/my-config
Directory structure:
my-config/
├── config.yml
├── rails.co
└── actions.py
Clients don’t need to specify a config_id.
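Because only one configuration is loaded, the request body can omit the guardrails object entirely. A minimal sketch of the wire-level payload (the endpoint and model name mirror the examples above; the surrounding HTTP call is left out):

```python
import json

# In single-config mode the request body needs no "guardrails.config_id":
# the server applies its single loaded configuration to every request.
payload = {
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Hello"}],
}

# POST this body to http://localhost:8000/v1/chat/completions
body = json.dumps(payload)
```

The same payload sent in multi-config mode would fall back to the server's default config, if one was set with `--default-config-id`.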

Server Options

Command-Line Options

nemoguardrails server [OPTIONS]
--port (integer, default: 8000)
  The port that the server should listen on.
--config (path)
  Path to a directory containing configuration sub-folders (multi-config mode) or a single configuration directory (single-config mode).
--default-config-id (string)
  The default configuration to use when no config is specified in requests.
--verbose (boolean, default: false)
  Enable verbose logging, including prompts and LLM calls.
--disable-chat-ui (boolean, default: false)
  Disable the built-in chat UI.
--auto-reload (boolean, default: false)
  Enable automatic reloading when configuration files change.
--prefix (string, default: "")
  A prefix that should be added to all server paths (must start with '/').
nemoguardrails server --config=./configs --port=8000

Environment Variables

CORS Configuration

NEMO_GUARDRAILS_SERVER_ENABLE_CORS (string, default: "false")
  Enable Cross-Origin Resource Sharing (CORS).
NEMO_GUARDRAILS_SERVER_ALLOWED_ORIGINS (string, default: "*")
  Comma-separated list of allowed origins. Use "*" to allow all origins.
export NEMO_GUARDRAILS_SERVER_ENABLE_CORS=true
export NEMO_GUARDRAILS_SERVER_ALLOWED_ORIGINS="http://localhost:3000,https://myapp.com"
nemoguardrails server --config=./configs

Model Configuration

MAIN_MODEL_ENGINE (string, default: "openai")
  The default LLM provider to use when a model is specified in the request.
MAIN_MODEL_BASE_URL (string)
  Base URL for the LLM provider API.
DEFAULT_CONFIG_ID (string)
  Default configuration ID to use when requests do not specify one.
export MAIN_MODEL_ENGINE=nvidia_ai_endpoints
export MAIN_MODEL_BASE_URL=https://integrate.api.nvidia.com/v1
nemoguardrails server --config=./configs

Server Configuration File

You can create a config.py file in your configs directory to customize server behavior:
config.py
from fastapi import FastAPI
from nemoguardrails.server.datastore import MemoryStore, register_datastore

def init(app: FastAPI):
    """Initialize server with custom configuration."""
    
    # Register a custom datastore for threads
    datastore = MemoryStore()
    register_datastore(datastore)
    
    # Register custom loggers
    def custom_logger(data: dict):
        print(f"Request logged: {data['endpoint']}")
    
    from nemoguardrails.server.api import register_logger
    register_logger(custom_logger)
    
    # Set default config
    from nemoguardrails.server.api import set_default_config_id
    set_default_config_id("my-default-config")

API Endpoints

GET /v1/rails/configs

List available guardrails configurations.
curl http://localhost:8000/v1/rails/configs

GET /v1/models

List available LLM models from the configured provider.
curl http://localhost:8000/v1/models

POST /v1/chat/completions

See Chat Completions API for details.
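As a quick sketch of the raw HTTP request (field names taken from the client examples on this page; see the linked reference for the full schema), note that the OpenAI client's `extra_body` fields are merged into the top-level JSON object:

```python
import json
import urllib.request

# Request body as sent over the wire: "guardrails" sits at the top level
# of the JSON object, alongside "model" and "messages".
body = {
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Hello"}],
    "guardrails": {"config_id": "config1"},
}

req = urllib.request.Request(
    "http://localhost:8000/v1/chat/completions",
    data=json.dumps(body).encode(),
    headers={"Content-Type": "application/json"},
)
# response = urllib.request.urlopen(req)  # requires a running server
```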

Thread Management

The server supports conversation threads for maintaining state across requests.

Using Threads

import uuid
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",
    api_key="not-needed"
)

# Generate a unique thread ID (minimum 16 characters)
thread_id = str(uuid.uuid4())

# First message in thread
response1 = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "My name is Alice"}],
    extra_body={
        "guardrails": {
            "config_id": "my-config",
            "thread_id": thread_id
        }
    }
)

# Continue thread
response2 = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "What is my name?"}],
    extra_body={
        "guardrails": {
            "config_id": "my-config",
            "thread_id": thread_id
        }
    }
)

print(response2.choices[0].message.content)  # "Your name is Alice"
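The 16-character minimum noted in the example above is comfortably met by a uuid4 string, which is 36 characters long. A small illustrative sanity check before sending:

```python
import uuid

thread_id = str(uuid.uuid4())

# The server expects thread IDs of at least 16 characters;
# a canonical UUID4 string ("xxxxxxxx-xxxx-...") is 36 characters.
assert len(thread_id) >= 16
```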

Custom Datastore

By default, threads are stored in memory. You can configure a custom datastore:
config.py
from nemoguardrails.server.datastore import RedisStore, register_datastore

def init(app):
    # Use Redis for persistent thread storage
    datastore = RedisStore(
        host="localhost",
        port=6379,
        db=0
    )
    register_datastore(datastore)

Auto-Reload

Enable auto-reload to automatically reload configurations when files change:
nemoguardrails server --config=./configs --auto-reload
Requires the watchdog package:
pip install watchdog
When enabled, the server monitors configuration files and reloads them automatically when changes are detected.

Chat UI

The server includes a built-in chat UI accessible at http://localhost:8000. To disable the chat UI:
nemoguardrails server --config=./configs --disable-chat-ui

Production Deployment

Using Gunicorn

pip install gunicorn
gunicorn nemoguardrails.server.api:app \
  --workers 4 \
  --worker-class uvicorn.workers.UvicornWorker \
  --bind 0.0.0.0:8000

Using Docker

Dockerfile
FROM python:3.10-slim

WORKDIR /app

COPY requirements.txt .
RUN pip install -r requirements.txt

COPY configs/ /app/configs/

EXPOSE 8000

CMD ["nemoguardrails", "server", "--config=/app/configs", "--port=8000"]
docker build -t guardrails-server .
docker run -p 8000:8000 guardrails-server

Environment Variables

export OPENAI_API_KEY=sk-...
export MAIN_MODEL_ENGINE=openai
export NEMO_GUARDRAILS_SERVER_ENABLE_CORS=true
nemoguardrails server --config=./configs