POST /v1/chat/completions

Create a chat completion with guardrails applied. This endpoint is compatible with the OpenAI Chat Completions API and adds guardrails-specific extensions.

Request

model
string
required
The LLM model to use for chat completion (e.g., "gpt-4o", "llama-3.1-8b").
messages
array
The list of messages in the current conversation.
[
  {"role": "user", "content": "Hello!"},
  {"role": "assistant", "content": "Hi there!"}
]
stream
boolean
default: false
If set, partial message deltas will be sent as server-sent events.
max_tokens
integer
The maximum number of tokens to generate.
temperature
number
Sampling temperature to use (0.0 to 2.0).
top_p
number
Top-p sampling parameter (0.0 to 1.0).
stop
string | array
Stop sequences where the API will stop generating further tokens.
presence_penalty
number
Presence penalty parameter (-2.0 to 2.0).
frequency_penalty
number
Frequency penalty parameter (-2.0 to 2.0).
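Taken together, the standard parameters above form a request body like the following (all values are illustrative):

```json
{
  "model": "gpt-4o",
  "messages": [
    {"role": "user", "content": "Summarize this article."}
  ],
  "max_tokens": 256,
  "temperature": 0.7,
  "top_p": 0.9,
  "stop": ["\n\n"],
  "presence_penalty": 0.0,
  "frequency_penalty": 0.0,
  "stream": false
}
```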

Guardrails Extensions

guardrails
object
Guardrails-specific options:
guardrails.config_id
string
The guardrails configuration ID to use.
guardrails.config_ids
array
List of configuration IDs to combine. Cannot be used with config_id.
guardrails.thread_id
string
The ID of an existing thread to continue (minimum 16 characters).
guardrails.context
object
Additional context data for the conversation.
{
  "user_name": "Alice",
  "user_id": "12345"
}
guardrails.options
GenerationOptions
Additional generation options:
  • rails: Which rails to enable ({"input": true, "output": true})
  • log: Logging options ({"activated_rails": true, "llm_calls": true})
  • output_vars: Variables to extract from context
guardrails.state
object
State object to continue the interaction. Must contain an "events" or "state" key.
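A request combining these extensions might look like the following sketch. The config ID and thread ID are illustrative (note that thread_id must be at least 16 characters):

```json
{
  "model": "gpt-4o",
  "messages": [
    {"role": "user", "content": "Hello!"}
  ],
  "guardrails": {
    "config_id": "my-config",
    "thread_id": "thread-0123456789abcdef",
    "context": {
      "user_name": "Alice"
    },
    "options": {
      "rails": {"input": true, "output": true},
      "log": {"activated_rails": true, "llm_calls": true}
    }
  }
}
```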

Response

id
string
Unique identifier for the chat completion.
object
string
Always "chat.completion".
created
integer
Unix timestamp of when the completion was created.
model
string
The model used for the completion.
choices
array
Array of completion choices.
choices[].index
integer
The index of this choice.
choices[].message
object
The generated message.
choices[].message.role
string
Always "assistant".
choices[].message.content
string
The content of the message.
choices[].message.tool_calls
array
Tool calls generated by the model (if any).
choices[].finish_reason
string
The reason the generation stopped: "stop", "length", or "content_filter".
guardrails
object
Guardrails-specific output data:
guardrails.config_id
string
The configuration ID that was used.
guardrails.state
object
Updated state object for continuing the conversation.
guardrails.log
object
Generation log data (if requested):
  • activated_rails: List of rails that were activated
  • llm_calls: Details of LLM calls made
  • stats: Performance statistics
Request Example

curl -X POST http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {"role": "user", "content": "Hello!"}
    ],
    "guardrails": {
      "config_id": "my-config"
    }
  }'
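For streaming, the same request body applies with stream enabled; the response then arrives as server-sent events rather than a single JSON object (illustrative):

```json
{
  "model": "gpt-4o",
  "messages": [
    {"role": "user", "content": "Hello!"}
  ],
  "stream": true,
  "guardrails": {
    "config_id": "my-config"
  }
}
```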

Response Examples

{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "created": 1677652288,
  "model": "gpt-4o",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you today?"
      },
      "finish_reason": "stop",
      "logprobs": null
    }
  ],
  "guardrails": {
    "config_id": "my-config",
    "state": {
      "events": [...]
    },
    "log": {
      "stats": {
        "total_llm_calls": 2,
        "total_time": 1.5
      }
    }
  }
}
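The state object returned in the response can be passed back on the next request to continue the interaction. A sketch of such a follow-up request (the contents of state are opaque to the client and should be sent back unmodified):

```json
{
  "model": "gpt-4o",
  "messages": [
    {"role": "user", "content": "Tell me more."}
  ],
  "guardrails": {
    "config_id": "my-config",
    "state": {
      "events": [...]
    }
  }
}
```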

Error Responses

error
object
The error details:
error.message
string
Human-readable error message.
error.type
string
Error type: "invalid_request_error", "authentication_error", "server_error", etc.
error.code
string
Error code.
{
  "error": {
    "message": "No guardrails config_id provided and server has no default configuration",
    "type": "invalid_request_error",
    "code": "missing_config"
  }
}