Overview
The Responses API is the primary interface for generating model outputs. It supports text and image inputs, structured outputs, function calling, built-in tools (web search, file search, code interpreter), and both streaming and non-streaming responses.
Create response
Creates a model response from text, image, or file inputs.
response = client.responses.create(
  model: "gpt-4o",
  input: "What is the capital of France?"
)

puts response.output.first.content.first.text
Parameters
model
Model ID, such as gpt-4o, gpt-4o-mini, o3-mini, or o1. See the model guide for available options.

input
Input to the model. Can be:
A simple string
An array of message objects with role and content
An array of input items (messages, tool calls, reasoning, etc.)

instructions
System message prepended to the model’s context (max 256,000 characters)

temperature
Sampling temperature between 0 and 2. Higher = more random (default: 1)

max_output_tokens
Maximum number of tokens to generate in the response

tools
Tools the model can use:
{ type: "file_search" } - Search uploaded files
{ type: "code_interpreter" } - Run Python code
{ type: "web_search" } - Search the web
{
  type: "function",
  function: {
    name: "get_weather",
    description: "Get current weather",
    parameters: { ... }
  }
}

tool_choice
How to select tools:
auto (default) - Model decides
none - No tools
required - Must use a tool
{ type: "function", function: { name: "..." } } - Force a specific function
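Combining the last option with a function tool looks like the sketch below. Only the parameter hash is executed here; the API call is commented out, and the tool_choice parameter name follows the options above:

```ruby
# Sketch: request parameters that force the model to call get_weather.
params = {
  model: "gpt-4o",
  input: "What's the weather in Tokyo?",
  tools: [{
    type: "function",
    function: {
      name: "get_weather",
      description: "Get current weather",
      parameters: {
        type: "object",
        properties: { location: { type: "string" } },
        required: ["location"]
      }
    }
  }],
  # Force this specific function rather than letting the model decide.
  tool_choice: { type: "function", function: { name: "get_weather" } }
}

# response = client.responses.create(**params)
```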
text
Text output configuration:
text: {
  format: {
    type: "json_schema",
    strict: true,
    name: "Recipe",
    schema: {
      type: "object",
      properties: {
        name: { type: "string" },
        ingredients: { type: "array", items: { type: "string" } }
      },
      required: ["name", "ingredients"]
    }
  }
}
conversation
Conversation ID to continue, or an object with an id to create or reference a conversation

previous_response_id
ID of a previous response to continue from

store
Whether to store the response for later retrieval (default: false)

metadata
Optional metadata (up to 16 key-value pairs)

reasoning
Reasoning configuration for o-series and gpt-5 models:
Reasoning effort: low, medium, or high
Summary mode: auto, concise, or detailed
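A sketch of a request using these settings follows; the model name and prompt are illustrative, and the call itself is commented out:

```ruby
# Sketch: reasoning settings for an o-series model (values are examples).
params = {
  model: "o3-mini",
  input: "How many primes are there below 100?",
  reasoning: { effort: "high", summary: "auto" }
}

# response = client.responses.create(**params)
```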
top_p
Nucleus sampling parameter (0-1, an alternative to temperature)

parallel_tool_calls
Whether to allow parallel tool execution (default: true)
Response
id
Unique response identifier

created_at
Unix timestamp of creation

model
Model used for generation

output
Array of output items (messages, tool calls, reasoning); each message item carries an array of content parts (text, images, etc.)

status
Response status: completed, incomplete, failed, etc.
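Because output can mix item types (reasoning items, messages, tool calls), extracting the text usually means filtering for message items first. A sketch with stand-in objects, since the real SDK's accessors may differ:

```ruby
require "ostruct"

# Stand-in for a response's output array: a reasoning item followed by a message.
output = [
  OpenStruct.new(type: "reasoning"),
  OpenStruct.new(type: "message",
                 content: [OpenStruct.new(type: "output_text", text: "Paris")])
]

# Collect text from message items only, skipping reasoning and tool calls.
text = output.select { |item| item.type == "message" }
             .flat_map(&:content)
             .select { |part| part.type == "output_text" }
             .map(&:text)
             .join

puts text
```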
Stream response
Creates a streaming response for real-time output.
stream = client.responses.stream(
  model: "gpt-4o",
  input: "Write a haiku about coding"
)

stream.on_text_delta do |delta|
  print delta
end

stream.run
Stream helpers
# Handle different event types
stream.on_text_delta { |delta| print delta }
stream.on_reasoning_delta { |delta| puts "[thinking: #{delta}]" }
stream.on_function_call { |name, args| handle_function(name, args) }
stream.on_completed { |response| puts "\nDone: #{response.id}" }
stream.on_error { |error| puts "Error: #{error}" }
Retrieve response
Retrieves a stored response by ID.
response = client.responses.retrieve("resp_abc123")
Parameters
ID of the response to retrieve
Additional fields to include: reasoning, audio
Delete response
Deletes a stored response.
client.responses.delete("resp_abc123")
Parameters
ID of the response to delete
Cancel response
Cancels a background response.
client.responses.cancel("resp_abc123")
Parameters
ID of the response to cancel
Compact conversation
Compacts a long conversation to fit within context limits.
compacted = client.responses.compact(
  model: "gpt-4o",
  previous_response_id: "resp_abc123"
)
Examples
Basic text generation
response = client.responses.create(
  model: "gpt-4o",
  input: "Explain quantum computing in simple terms",
  max_output_tokens: 500
)

puts response.output.first.content.first.text
Web search with streaming

stream = client.responses.stream(
  model: "gpt-4o",
  input: "What's the weather in San Francisco?",
  tools: [{ type: "web_search" }]
)

stream.on_web_search_call do |query|
  puts "Searching: #{query}"
end

stream.on_text_delta { |delta| print delta }
stream.run
Structured output
response = client.responses.create(
  model: "gpt-4o",
  input: "Generate a recipe for chocolate chip cookies",
  text: {
    format: {
      type: "json_schema",
      strict: true,
      name: "Recipe",
      schema: {
        type: "object",
        properties: {
          name: { type: "string" },
          ingredients: {
            type: "array",
            items: { type: "string" }
          },
          steps: {
            type: "array",
            items: { type: "string" }
          }
        },
        required: ["name", "ingredients", "steps"]
      }
    }
  }
)

recipe = JSON.parse(response.output.first.content.first.text)
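With strict: true, the parsed hash is guaranteed to contain every key listed in required. Given a sample payload shaped like the Recipe schema above (the payload itself is made up):

```ruby
require "json"

# Made-up payload matching the Recipe schema above.
sample = '{"name":"Chocolate Chip Cookies",' \
         '"ingredients":["flour","sugar","chocolate chips"],' \
         '"steps":["Mix the dough","Bake at 180C"]}'

recipe = JSON.parse(sample)
puts recipe["name"]                # guaranteed present by the schema
puts recipe["ingredients"].length
```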
Multi-turn conversation
# First turn
response1 = client.responses.create(
  model: "gpt-4o",
  input: "What is the capital of France?",
  store: true
)

# Second turn
response2 = client.responses.create(
  model: "gpt-4o",
  input: "What is its population?",
  previous_response_id: response1.id
)
Function calling
response = client.responses.create(
  model: "gpt-4o",
  input: "What's the weather in Tokyo?",
  tools: [
    {
      type: "function",
      function: {
        name: "get_weather",
        description: "Get current weather for a location",
        parameters: {
          type: "object",
          properties: {
            location: { type: "string" }
          },
          required: ["location"]
        }
      }
    }
  ]
)

# Check for function calls
response.output.each do |item|
  if item.type == "function_call"
    puts "Function: #{item.name}"
    puts "Arguments: #{item.arguments}"
  end
end
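After running the function locally, its result typically goes back to the model in a follow-up turn. A sketch, assuming the result is returned as a function_call_output input item with a call_id field (that item shape is an assumption, as is the get_weather stub):

```ruby
require "json"

# Stand-in for a real weather lookup.
def get_weather(location)
  { location: location, temp_c: 21 }
end

# item.arguments from the loop above arrives as a JSON string.
args = JSON.parse('{"location":"Tokyo"}')
result = get_weather(args["location"])

# Assumed shape for returning the result; call_id would come from the item.
follow_up_input = [{
  type: "function_call_output",
  call_id: "call_abc123",
  output: JSON.generate(result)
}]

# response2 = client.responses.create(
#   model: "gpt-4o",
#   previous_response_id: response.id,
#   input: follow_up_input
# )
```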