POST /v1/responses
curl --request POST \
  --url https://api.example.com/v1/responses \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "model": "<string>",
  "input": {},
  "instructions": "<string>",
  "messages": [
    {}
  ],
  "tools": [
    {}
  ],
  "tool_choice": {},
  "parallel_tool_calls": true,
  "reasoning": {},
  "text": {},
  "stream": true,
  "include": [
    {}
  ],
  "conversation": "<string>",
  "previous_response_id": "<string>",
  "store": true,
  "truncation": "<string>",
  "prompt_cache_key": "<string>"
}
'
{
  "id": "<string>",
  "object": "response",
  "status": "<string>",
  "output": [
    {}
  ],
  "usage": {},
  "error": {}
}

Overview

The /v1/responses endpoint provides OpenAI Responses API compatibility. It accepts structured input with instructions and forwards requests upstream with validation, sanitization, and error handling. The endpoint supports both streaming and non-streaming modes, handles conversation context, and provides full tool-calling capabilities.

Authentication

Authorization
string
required
Bearer token for API authentication. Format: Bearer YOUR_API_KEY

Request Body

model
string
required
ID of the model to use. Must be a valid model slug from the /v1/models endpoint. Examples: "gpt-4.1", "gpt-5.2"
input
string | array
required
User input to the model. Can be:
  • String: Plain text input (normalized to a single input_text item)
  • Array: Structured input items with role-based messages
When providing an array, each item can have:
  • role (string): "user", "assistant", or "tool"
  • content (string | array): Message content
  • type (string): Item type (e.g., "input_text", "input_image", "function_call_output")
Note: input_file.file_id is not supported and will return an error.
instructions
string
System-level instructions for the model. Equivalent to system/developer messages in Chat Completions.
messages
array
Alternative to input. Array of chat-formatted messages. Cannot be used together with input: provide either input or messages, not both. Messages are coerced into instructions (for system/developer roles) and input items (for user/assistant/tool roles).
tools
array
Array of tool definitions available to the model. Each tool object:
  • type (string): Tool type
  • name (string): Tool name (for function tools)
  • description (string): Tool description
  • parameters (object): JSON Schema for parameters
Supported tool types:
  • function: Custom function calls
  • web_search or web_search_preview: Web search capability (web_search_preview is normalized to web_search)
Unsupported types (will return an error):
  • file_search, code_interpreter, computer_use, computer_use_preview, image_generation
tool_choice
string | object
Controls which tool the model should use. Options:
  • "none": Model will not call tools
  • "auto": Model decides whether to call tools
  • "required": Model must call at least one tool
  • Object: {"type": "...", "name": "..."}
parallel_tool_calls
boolean
Whether to enable parallel tool calling.
reasoning
object
Reasoning controls for the model. Properties:
  • effort (string): Reasoning effort level (e.g., "low", "medium", "high")
  • summary (string): Reasoning summary mode
text
object
Text output controls. Properties:
  • verbosity (string): Output verbosity level
  • format (object): Output format specification
    • type (string): "text", "json_object", or "json_schema"
    • schema (object): JSON Schema (for json_schema type)
    • name (string): Schema name
    • strict (boolean): Strict schema adherence
stream
boolean
default:false
Whether to stream the response as server-sent events.
  • true: Returns text/event-stream with Responses events
  • false: Returns a single response object
include
array
Additional data to include in the response. Allowed values:
  • "code_interpreter_call.outputs"
  • "computer_call_output.output.image_url"
  • "file_search_call.results"
  • "message.input_image.image_url"
  • "message.output_text.logprobs"
  • "reasoning.encrypted_content"
  • "web_search_call.action.sources"
Unknown values return a 400 error.
conversation
string
Conversation ID for multi-turn context. Cannot be used with previous_response_id.
previous_response_id
string
Not supported; returns a 400 error. Use conversation instead for multi-turn context.
store
boolean
default:false
Must be false or omitted. Setting it to true returns a 400 error.
truncation
string
Not supported. Returns a 400 error if provided.
prompt_cache_key
string
Cache key for prompt caching optimization.
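
The messages-to-input coercion described above can be sketched client-side. This is purely illustrative: the helper name is invented, and the exact upstream joining behavior for multiple system messages is an assumption.

```python
def coerce_messages(messages):
    """Coerce chat-formatted messages into (instructions, input_items),
    mirroring the documented behavior: system/developer messages become
    instructions; user/assistant/tool messages become input items."""
    instructions = []
    input_items = []
    for msg in messages:
        role = msg["role"]
        if role in ("system", "developer"):
            instructions.append(msg["content"])
        else:
            input_items.append({"role": role, "content": msg["content"]})
    return "\n".join(instructions), input_items

instructions, items = coerce_messages([
    {"role": "system", "content": "You are a math tutor."},
    {"role": "user", "content": "What is 15 * 23?"},
])
# instructions == "You are a math tutor."
# items == [{"role": "user", "content": "What is 15 * 23?"}]
```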

Response (Non-Streaming)

When stream is false or omitted, returns a response object:
id
string
Unique identifier for the response.
object
string
Always "response".
status
string
Response status:
  • "completed": Successfully completed
  • "incomplete": Incomplete (e.g., max tokens reached)
  • "failed": Failed with error
output
array
Array of output items generated by the model. Each output item has:
  • type (string): Output type (e.g., "message", "function_call", "web_search_call")
  • Additional fields based on type
usage
object
Token usage information. Properties:
  • input_tokens (integer): Tokens in the input
  • output_tokens (integer): Tokens in the output
  • total_tokens (integer): Total tokens used
  • input_tokens_details (object | null):
    • cached_tokens (integer): Cached input tokens
  • output_tokens_details (object | null):
    • reasoning_tokens (integer): Tokens used for reasoning
error
object
Error information (only present when status is "failed"). Properties:
  • message (string): Error message
  • type (string): Error type
  • code (string): Error code
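
For reference, a minimal sketch of pulling the generated text out of a non-streaming response object. It assumes message output items carry a content array of output_text parts, following the Responses convention; field names beyond those documented above are assumptions.

```python
def extract_output_text(response):
    """Concatenate all output_text parts from message output items."""
    parts = []
    for item in response.get("output", []):
        if item.get("type") == "message":
            for part in item.get("content", []):
                if part.get("type") == "output_text":
                    parts.append(part.get("text", ""))
    return "".join(parts)

resp = {
    "id": "resp_123",
    "object": "response",
    "status": "completed",
    "output": [
        {"type": "message", "content": [{"type": "output_text", "text": "Hello!"}]}
    ],
    "usage": {"input_tokens": 5, "output_tokens": 2, "total_tokens": 7},
}
print(extract_output_text(resp))  # Hello!
```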

Response (Streaming)

When stream is true, returns text/event-stream with event objects:

Event Types

response.created
event
Emitted when the response is created. Contains a response object with id and initial metadata.
response.in_progress
event
Emitted during response generation. May include partial response data.
response.output_text.delta
event
Emitted for text output deltas. Properties:
  • delta (string): Text fragment
response.refusal.delta
event
Emitted for refusal text deltas. Properties:
  • delta (string): Refusal fragment
response.function_call.delta
event
Emitted for tool call deltas. Properties:
  • call_id (string): Tool call ID
  • name (string): Tool name
  • arguments (string): Arguments fragment
response.completed
event
Emitted when the response completes successfully. Contains the full response object with output and usage.
response.incomplete
event
Emitted when the response is incomplete. Contains a response with incomplete_details:
  • reason (string): Why incomplete (e.g., "max_output_tokens", "content_filter")
response.failed
event
Emitted when the response fails. Contains a response with an error object.
error
event
Emitted for immediate errors. Contains an error object with error details.
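
As an illustration, a minimal parser for the event stream. This sketch assumes the standard SSE framing of `event:` and `data:` lines separated by a blank line, and does not handle multi-line data fields.

```python
import json

def parse_sse(stream_text):
    """Yield (event_name, payload) tuples from a server-sent-events body."""
    event, data = None, None
    for line in stream_text.splitlines():
        if line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data = line[len("data:"):].strip()
        elif line == "" and event is not None:
            yield event, json.loads(data) if data else None
            event, data = None, None

sample = (
    "event: response.output_text.delta\n"
    'data: {"delta": "Hel"}\n\n'
    "event: response.output_text.delta\n"
    'data: {"delta": "lo"}\n\n'
    "event: response.completed\n"
    'data: {"status": "completed"}\n\n'
)
text = "".join(p["delta"] for e, p in parse_sse(sample)
               if e == "response.output_text.delta")
print(text)  # Hello
```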

Examples

Basic Text Response

curl https://api.example.com/v1/responses \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4.1",
    "input": "Explain quantum entanglement in simple terms"
  }'

Streaming Response

curl https://api.example.com/v1/responses \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4.1",
    "input": "Write a short story about a robot",
    "stream": true
  }'

With Instructions

curl https://api.example.com/v1/responses \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4.1",
    "instructions": "You are a helpful coding assistant. Provide concise answers with code examples.",
    "input": "How do I sort a list in Python?"
  }'

Structured Input with Messages

curl https://api.example.com/v1/responses \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4.1",
    "messages": [
      {"role": "system", "content": "You are a math tutor."},
      {"role": "user", "content": "What is 15 * 23?"}
    ]
  }'

Tool Calling

curl https://api.example.com/v1/responses \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4.1",
    "input": "What is the current time in Tokyo?",
    "tools": [
      {
        "type": "function",
        "name": "get_time",
        "description": "Get current time for a timezone",
        "parameters": {
          "type": "object",
          "properties": {
            "timezone": {
              "type": "string",
              "description": "IANA timezone name"
            }
          },
          "required": ["timezone"]
        }
      }
    ],
    "tool_choice": "auto"
  }'

Web Search

curl https://api.example.com/v1/responses \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4.1",
    "input": "What are the latest developments in fusion energy?",
    "tools": [
      {"type": "web_search"}
    ]
  }'

Reasoning with Summary

curl https://api.example.com/v1/responses \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.2",
    "input": "Solve this complex math problem: ...",
    "reasoning": {
      "effort": "high",
      "summary": "concise"
    }
  }'

JSON Output Format

curl https://api.example.com/v1/responses \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4.1",
    "input": "List 3 programming languages with their use cases",
    "text": {
      "format": {
        "type": "json_schema",
        "name": "languages",
        "schema": {
          "type": "object",
          "properties": {
            "languages": {
              "type": "array",
              "items": {
                "type": "object",
                "properties": {
                  "name": {"type": "string"},
                  "use_case": {"type": "string"}
                },
                "required": ["name", "use_case"]
              }
            }
          }
        },
        "strict": true
      }
    }
  }'
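
With a json_schema format, the response's output text should parse as JSON matching the supplied schema. A hedged sketch of checking the required keys from the example schema above (a real client might use a full JSON Schema validator such as the jsonschema package instead):

```python
import json

def check_languages_payload(output_text):
    """Parse the model's JSON output and check the required keys from the
    'languages' schema shown above."""
    data = json.loads(output_text)
    for entry in data.get("languages", []):
        missing = {"name", "use_case"} - entry.keys()
        if missing:
            raise ValueError(f"missing keys: {missing}")
    return data

sample = '{"languages": [{"name": "Python", "use_case": "scripting"}]}'
data = check_languages_payload(sample)
print(data["languages"][0]["name"])  # Python
```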

Conversation Context

curl https://api.example.com/v1/responses \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4.1",
    "input": "Tell me more about that",
    "conversation": "conv_abc123"
  }'

Input Sanitization

The service automatically sanitizes input before forwarding it upstream:

Interleaved Reasoning Removal

Unsupported interleaved reasoning fields are stripped from input:
  • reasoning_content
  • reasoning_details
  • tool_calls (in input context)
  • function_call (in input context)
Top-level reasoning controls are preserved.

Content Type Normalization

  • Assistant text content is rewritten to use output_text type
  • Tool messages are converted to function_call_output format with call_id

Unsupported Field Removal

Before upstream forwarding, these fields are stripped:
  • safety_identifier
  • prompt_cache_retention
  • temperature
  • max_output_tokens
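
The stripping described above can be sketched as follows. This is purely illustrative: the service performs the sanitization server-side, and the field lists are taken directly from this page.

```python
UNSUPPORTED_TOP_LEVEL = {
    "safety_identifier", "prompt_cache_retention",
    "temperature", "max_output_tokens",
}
INTERLEAVED_REASONING = {
    "reasoning_content", "reasoning_details", "tool_calls", "function_call",
}

def sanitize(request):
    """Drop unsupported top-level fields and interleaved reasoning fields
    on input items, preserving top-level reasoning controls."""
    clean = {k: v for k, v in request.items() if k not in UNSUPPORTED_TOP_LEVEL}
    items = clean.get("input")
    if isinstance(items, list):
        clean["input"] = [
            {k: v for k, v in item.items() if k not in INTERLEAVED_REASONING}
            for item in items
        ]
    return clean

req = {
    "model": "gpt-4.1",
    "temperature": 0.7,               # stripped before forwarding
    "reasoning": {"effort": "high"},  # preserved (top-level control)
    "input": [{"role": "assistant", "content": "hi", "reasoning_content": "..."}],
}
print(sanitize(req))
```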

Error Handling

All errors return OpenAI-compatible error envelopes:
{
  "error": {
    "message": "Error description",
    "type": "invalid_request_error",
    "code": "error_code",
    "param": "field_name"
  }
}
Common error codes:
  • invalid_request_error: Invalid request parameters
  • model_not_allowed: API key lacks access to requested model
  • no_accounts: No upstream accounts available (503 status)
  • upstream_error: Upstream service error (502 status)
  • not_implemented: Feature not implemented (501 status)
For streaming requests, errors are emitted as response.failed or error events.
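
A minimal sketch of acting on the error envelope in a client. The retry decisions below are an assumption based on the status codes listed, not a guarantee from this API.

```python
def classify_error(envelope):
    """Return a coarse retry decision from an OpenAI-style error envelope."""
    code = envelope.get("error", {}).get("code")
    if code in ("no_accounts", "upstream_error"):
        return "retry"   # transient upstream conditions (503/502) -- assumed retryable
    if code in ("invalid_request_error", "model_not_allowed", "not_implemented"):
        return "fail"    # the caller must change the request
    return "unknown"

print(classify_error({"error": {"code": "upstream_error"}}))  # retry
```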

Validation Rules

Input Validation

  • Either input or messages required (not both)
  • input must be string or array
  • input_file.file_id is rejected

Tool Validation

  • web_search_preview normalized to web_search
  • Unsupported tool types rejected: file_search, code_interpreter, computer_use, image_generation

Conversation Validation

  • Cannot provide both conversation and previous_response_id
  • previous_response_id is not supported

Store Validation

  • store must be false or omitted
  • Setting to true returns error

Include Validation

  • Only allowlisted include values accepted
  • Unknown values return error

Truncation Validation

  • truncation is not supported
  • Any value returns error
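
The rules above can be pre-checked client-side before a request is sent. An illustrative sketch; the error messages are invented, not the service's exact wording, and the include allowlist is copied from this page.

```python
ALLOWED_INCLUDE = {
    "code_interpreter_call.outputs",
    "computer_call_output.output.image_url",
    "file_search_call.results",
    "message.input_image.image_url",
    "message.output_text.logprobs",
    "reasoning.encrypted_content",
    "web_search_call.action.sources",
}

def prevalidate(req):
    """Raise ValueError for requests this endpoint documents as rejected."""
    if ("input" in req) == ("messages" in req):
        raise ValueError("provide exactly one of input or messages")
    if req.get("previous_response_id") is not None:
        raise ValueError("previous_response_id is not supported; use conversation")
    if req.get("store") is True:
        raise ValueError("store must be false or omitted")
    if "truncation" in req:
        raise ValueError("truncation is not supported")
    for value in req.get("include", []):
        if value not in ALLOWED_INCLUDE:
            raise ValueError(f"unknown include value: {value}")

prevalidate({"model": "gpt-4.1", "input": "hi"})  # passes silently
```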

Model Restrictions

If your API key has allowed_models configured, only those models can be used. Requests for other models return:
{
  "error": {
    "message": "This API key does not have access to model 'gpt-5.2'",
    "type": "invalid_request_error",
    "code": "model_not_allowed"
  }
}
Check available models at /v1/models.
