POST /v1/chat/completions
Chat Completions

Overview

The /v1/chat/completions endpoint provides full OpenAI Chat Completions API compatibility. It accepts chat-formatted messages and maps them internally to the Responses API format while preserving streaming behavior and tool calling capabilities.

Authentication

Authorization
string
required
Bearer token for API authentication. Format: Bearer YOUR_API_KEY

Request Body

model
string
required
ID of the model to use. Must be a valid model slug from the /v1/models endpoint. Examples: "gpt-4.1", "gpt-5.2"
messages
array
required
Array of message objects representing the conversation history. Must contain at least one message. Each message object has:
  • role (string, required): One of "system", "developer", "user", "assistant", or "tool"
  • content (string | array): Message content. For system/developer roles, must be text-only.
  • tool_calls (array, optional): For assistant messages, array of tool call objects
  • tool_call_id (string, required for tool role): ID of the tool call this message responds to
tools
array
Array of tool definitions available to the model. Each tool object:
  • type (string): "function" or "web_search"
  • function (object): For function tools, contains name, description, and parameters
Supported tool types:
  • function: Custom function calls
  • web_search or web_search_preview: Web search capability
Unsupported types (will return error):
  • file_search, code_interpreter, computer_use, computer_use_preview, image_generation
tool_choice
string | object
Controls which tool the model should use. Options:
  • "none": Model will not call tools
  • "auto": Model decides whether to call tools
  • "required": Model must call at least one tool
  • Object with {"type": "function", "function": {"name": "tool_name"}}: Force specific tool
parallel_tool_calls
boolean
Whether to enable parallel tool calling. When true, the model can call multiple tools simultaneously.
stream
boolean
default:false
Whether to stream the response as server-sent events.
  • true: Returns text/event-stream with chat.completion.chunk objects
  • false: Returns a single chat.completion object
stream_options
object
Options for streaming responses. Properties:
  • include_usage (boolean): Include token usage in final chunk
  • include_obfuscation (boolean): Include obfuscation data in stream
temperature
number
Sampling temperature between 0 and 2. Higher values make output more random.
top_p
number
Nucleus sampling parameter between 0 and 1; the model considers only the tokens in the top top_p probability mass. Generally used as an alternative to temperature, not alongside it.
max_tokens
integer
Maximum number of tokens to generate. Alias for max_completion_tokens.
max_completion_tokens
integer
Maximum number of tokens in the completion.
response_format
object
Format for the model’s output. Options:
  • {"type": "text"}: Plain text (default)
  • {"type": "json_object"}: Valid JSON object
  • {"type": "json_schema", "json_schema": {...}}: JSON matching provided schema
For json_schema type:
  • json_schema.name (string): Schema name, 1-64 chars, alphanumeric/underscore/hyphen
  • json_schema.schema (object): JSON Schema definition
  • json_schema.strict (boolean): Enable strict schema adherence
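The json_schema.name constraint can be checked client-side before sending a request; a minimal sketch in Python (the regex simply mirrors the rule stated above and is not an official validator):

```python
import re

# Mirrors the documented constraint: 1-64 chars, alphanumeric, underscore, or hyphen.
SCHEMA_NAME_RE = re.compile(r"^[A-Za-z0-9_-]{1,64}$")

def is_valid_schema_name(name: str) -> bool:
    """Return True if `name` satisfies the json_schema.name rule."""
    return bool(SCHEMA_NAME_RE.match(name))
```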
stop
string | array
Stop sequence(s). Generation stops when these tokens are encountered.
presence_penalty
number
Penalty for token presence. Range: -2.0 to 2.0.
frequency_penalty
number
Penalty for token frequency. Range: -2.0 to 2.0.
logprobs
boolean
Whether to return log probabilities of output tokens.
top_logprobs
integer
Number of most likely tokens to return at each position (requires logprobs: true).
seed
integer
Random seed for deterministic sampling.
n
integer
default:1
Number of completions to generate. Must be 1 (only value supported).

Response (Non-Streaming)

When stream is false or omitted, returns a chat.completion object:
id
string
Unique identifier for the completion.
object
string
Always "chat.completion".
created
integer
Unix timestamp of creation.
model
string
Model used for completion.
choices
array
Array of completion choices (always contains one choice). Each choice object:
  • index (integer): Choice index (always 0)
  • message (object): The assistant’s message
    • role (string): Always "assistant"
    • content (string | null): Text content of the message
    • refusal (string | null): Refusal message if model declined
    • tool_calls (array | null): Tool calls made by the model
  • finish_reason (string): Why generation stopped
    • "stop": Natural completion
    • "length": Max tokens reached
    • "tool_calls": Model called tools
    • "content_filter": Content filtered
usage
object
Token usage information. Properties:
  • prompt_tokens (integer): Tokens in the prompt
  • completion_tokens (integer): Tokens in the completion
  • total_tokens (integer): Total tokens used
  • prompt_tokens_details (object | null):
    • cached_tokens (integer): Cached prompt tokens
  • completion_tokens_details (object | null):
    • reasoning_tokens (integer): Tokens used for reasoning

Response (Streaming)

When stream is true, returns text/event-stream with chat.completion.chunk objects:
id
string
Unique identifier for the chunk stream.
object
string
Always "chat.completion.chunk".
created
integer
Unix timestamp of creation.
model
string
Model being used.
choices
array
Array of delta choices. Each choice contains:
  • index (integer): Always 0
  • delta (object): Incremental content
    • role (string | null): Role (only in first chunk)
    • content (string | null): Content delta
    • refusal (string | null): Refusal delta
    • tool_calls (array | null): Tool call deltas
  • finish_reason (string | null): Reason when complete
usage
object | null
Token usage (only in final chunk when stream_options.include_usage is true).

Examples

Basic Chat Completion

curl https://api.example.com/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4.1",
    "messages": [
      {"role": "user", "content": "What is the capital of France?"}
    ]
  }'

Streaming Response

curl https://api.example.com/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4.1",
    "messages": [
      {"role": "user", "content": "Write a haiku about coding"}
    ],
    "stream": true
  }'
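On the client, the event stream can be reassembled by reading `data:` lines until the `[DONE]` sentinel and concatenating the content deltas. A minimal sketch in Python (the sample lines below are illustrative of the chunk shape documented above, not captured output):

```python
import json

def accumulate_stream(sse_lines):
    """Reassemble assistant text from chat.completion.chunk SSE lines."""
    text = []
    for line in sse_lines:
        if not line.startswith("data: "):
            continue  # skip blank keep-alive lines between events
        payload = line[len("data: "):]
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        delta = chunk["choices"][0]["delta"]
        if delta.get("content"):
            text.append(delta["content"])
    return "".join(text)

# Illustrative chunks in the documented shape (ids and values are placeholders)
sample = [
    'data: {"object": "chat.completion.chunk", "choices": [{"index": 0, "delta": {"role": "assistant", "content": ""}}]}',
    'data: {"object": "chat.completion.chunk", "choices": [{"index": 0, "delta": {"content": "Hello"}}]}',
    'data: {"object": "chat.completion.chunk", "choices": [{"index": 0, "delta": {"content": " world"}}]}',
    'data: [DONE]',
]
```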

Streaming with Usage

curl https://api.example.com/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.2",
    "messages": [
      {"role": "user", "content": "Explain quantum computing"}
    ],
    "stream": true,
    "stream_options": {
      "include_usage": true
    }
  }'

Tool Calling

curl https://api.example.com/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4.1",
    "messages": [
      {"role": "user", "content": "What is the weather in San Francisco?"}
    ],
    "tools": [
      {
        "type": "function",
        "function": {
          "name": "get_weather",
          "description": "Get current weather for a location",
          "parameters": {
            "type": "object",
            "properties": {
              "location": {
                "type": "string",
                "description": "City name"
              }
            },
            "required": ["location"]
          }
        }
      }
    ],
    "tool_choice": "auto"
  }'

Web Search Tool

curl https://api.example.com/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4.1",
    "messages": [
      {"role": "user", "content": "What are the latest news about AI?"}
    ],
    "tools": [
      {"type": "web_search"}
    ]
  }'

JSON Schema Response Format

curl https://api.example.com/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4.1",
    "messages": [
      {"role": "user", "content": "Generate a person profile"}
    ],
    "response_format": {
      "type": "json_schema",
      "json_schema": {
        "name": "person_profile",
        "schema": {
          "type": "object",
          "properties": {
            "name": {"type": "string"},
            "age": {"type": "number"},
            "city": {"type": "string"}
          },
          "required": ["name", "age"]
        },
        "strict": true
      }
    }
  }'

Multi-turn Conversation

curl https://api.example.com/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4.1",
    "messages": [
      {"role": "system", "content": "You are a helpful math tutor."},
      {"role": "user", "content": "What is 25 * 4?"}
    ]
  }'

Content Type Restrictions

System and Developer Messages

  • Must contain text-only content
  • Cannot include images, files, or other media types
  • Violations return 400 with invalid_request_error

User Messages

Supported content types:
  • Text: String or {"type": "text", "text": "..."}
  • Images: {"type": "image_url", "image_url": {"url": "..."}}
    • Data URLs and HTTP(S) URLs supported
    • Images over 8MB are automatically dropped
  • Files: {"type": "file", "file": {...}}
    • file_id is not supported and will return error
Unsupported:
  • Audio input: input_audio type returns 400 error
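A user message mixing the text and image content types above looks like this (the URL is a placeholder):

```python
# Multimodal user message: a text part plus an image_url part.
# HTTP(S) URLs and data URLs are both accepted for image_url.
user_message = {
    "role": "user",
    "content": [
        {"type": "text", "text": "What is in this image?"},
        {"type": "image_url", "image_url": {"url": "https://example.com/photo.png"}},
    ],
}
```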

Assistant Messages

  • Can include content (text) and/or tool_calls
  • Tool calls must have valid id and function with name

Tool Messages

  • Must include tool_call_id matching a previous assistant tool call
  • Content becomes the tool output
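Putting the assistant and tool message rules together, a tool round-trip appends the assistant turn containing tool_calls, then a tool turn whose tool_call_id echoes the call's id. A sketch (the id and tool output are illustrative):

```python
import json

messages = [
    {"role": "user", "content": "What is the weather in San Francisco?"},
    # Assistant turn returned by the API, containing the tool call.
    # Each tool call must have a valid id and a function with a name.
    {
        "role": "assistant",
        "content": None,
        "tool_calls": [{
            "id": "call_abc123",  # illustrative id
            "type": "function",
            "function": {
                "name": "get_weather",
                "arguments": "{\"location\": \"San Francisco\"}",
            },
        }],
    },
    # Tool turn: tool_call_id must match the id above; content is the tool output.
    {
        "role": "tool",
        "tool_call_id": "call_abc123",
        "content": json.dumps({"temp_c": 18, "conditions": "Foggy"}),
    },
]
```

Sending this messages array back to /v1/chat/completions lets the model produce a final answer that incorporates the tool output.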

Error Handling

All errors return OpenAI-compatible error envelopes:
{
  "error": {
    "message": "Error description",
    "type": "invalid_request_error",
    "code": "error_code",
    "param": "field_name"
  }
}
Common error codes:
  • invalid_request_error: Invalid request parameters
  • model_not_allowed: API key lacks access to requested model
  • no_accounts: No upstream accounts available
  • upstream_error: Upstream service error
For streaming requests, errors are emitted as error chunks followed by data: [DONE].
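A client can distinguish the error envelope from a normal completion by checking for the top-level error key; a minimal sketch:

```python
def raise_for_api_error(body: dict) -> dict:
    """Raise if `body` is an error envelope; otherwise return it unchanged."""
    if "error" in body:
        err = body["error"]
        raise RuntimeError(f"{err.get('code', 'unknown')}: {err.get('message', '')}")
    return body
```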

Model Restrictions

If your API key has allowed_models configured, only those models can be used. Requests for other models return:
{
  "error": {
    "message": "This API key does not have access to model 'gpt-5.2'",
    "type": "invalid_request_error",
    "code": "model_not_allowed"
  }
}
Check available models at /v1/models.
