The Chat Completions API lets you create model responses for chat conversations. It supports text generation, vision, audio, function calling, and structured outputs.

Create a chat completion

Creates a model response for the given chat conversation.
client.chat.completions.create(params)
messages
Array
required
A list of messages comprising the conversation so far. Each message can be a system, user, assistant, tool, or function message.
model
String
required
Model ID used to generate the response, like gpt-4o or o3. See the models documentation for available options.
temperature
Float
default:"1.0"
Sampling temperature between 0 and 2. Higher values like 0.8 make output more random, while lower values like 0.2 make it more focused and deterministic.
max_completion_tokens
Integer
An upper bound for the number of tokens that can be generated for a completion, including both visible output tokens and reasoning tokens.
max_tokens
Integer
The maximum number of tokens that can be generated in the chat completion. This value is deprecated in favor of max_completion_tokens.
stream
Boolean
default:"false"
If set to true, partial message deltas are sent via Server-Sent Events. For streaming, use the stream or stream_raw methods instead of passing this flag to create.
tools
Array
A list of tools the model may call. Can be functions or custom tools defined using OpenAI::BaseModel.
tool_choice
String | Object
Controls which (if any) tool is called by the model. Can be auto, none, required, or a specific tool name.
response_format
Object
An object specifying the format that the model must output. Supports text, json_object, or json_schema for structured outputs.
frequency_penalty
Float
default:"0"
Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text.
presence_penalty
Float
default:"0"
Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far.
seed
Integer
If specified, the system will make a best effort to sample deterministically for improved reproducibility.
stop
String | Array<String>
Up to 4 sequences where the API will stop generating further tokens. Not supported with latest reasoning models.
n
Integer
default:"1"
How many chat completion choices to generate for each input message.
logprobs
Boolean
default:"false"
Whether to return log probabilities of the output tokens.
top_logprobs
Integer
An integer between 0 and 20 specifying the number of most likely tokens to return at each token position. Requires logprobs to be set to true.
reasoning_effort
String
Constrains effort on reasoning for reasoning models. Options: low, medium, high.
store
Boolean
Whether to store the output of this chat completion request for later use in distillation and evals.

Response

Returns a ChatCompletion object.
id
String
Unique identifier for the chat completion.
object
String
The object type, always chat.completion.
created
Integer
Unix timestamp of when the completion was created.
model
String
The model used for the completion.
choices
Array
A list of chat completion choices.
usage
Object
Token usage information.

Examples

Basic chat completion

require "openai"

client = OpenAI::Client.new

completion = client.chat.completions.create(
  model: "gpt-4",
  messages: [
    {
      role: "user",
      content: "Say this is a test"
    }
  ]
)

puts completion.choices.first&.message&.content

Streaming chat completion

require "openai"

client = OpenAI::Client.new

stream = client.chat.completions.stream(
  model: "gpt-4o-mini",
  messages: [
    {role: :user, content: "List three fun facts about dolphins."}
  ]
)

stream.text.each do |text|
  print(text)
end
puts

Function calling with tools

require "openai"

class GetWeather < OpenAI::BaseModel
  required :location, String
end

client = OpenAI::Client.new

stream = client.chat.completions.stream(
  model: "gpt-4o-mini",
  tools: [GetWeather],
  messages: [
    {role: :user, content: "What's the weather in San Francisco?"}
  ]
)

stream.each do |event|
  case event
  when OpenAI::Streaming::ChatFunctionToolCallArgumentsDoneEvent
    puts "Tool: #{event.name}"
    puts "Args: #{event.arguments}"
    pp event.parsed
  end
end

Structured outputs

require "openai"

class Location < OpenAI::BaseModel
  required :city, String
  required :country, String
end

class CalendarEvent < OpenAI::BaseModel
  required :name, String
  required :date, String
  required :location, Location
end

client = OpenAI::Client.new

completion = client.chat.completions.create(
  model: "gpt-4o-2024-08-06",
  messages: [
    {role: :system, content: "Extract the event information."},
    {role: :user, content: "Science fair on Friday in San Diego"}
  ],
  response_format: CalendarEvent
)

# Access the parsed structured output
event = completion.choices.first.message.parsed
puts "Event: #{event.name} on #{event.date}"

Additional methods

Stream raw events

For low-level streaming access, use stream_raw:
stream = client.chat.completions.stream_raw(
  model: "gpt-4",
  messages: [{role: "user", content: "Hello"}]
)

stream.each do |chunk|
  # The delta content can be nil (e.g. in the final chunk); print skips nil
  # and, unlike puts, does not insert a newline after every fragment.
  print chunk.choices.first&.delta&.content
end

Retrieve a stored completion

completion = client.chat.completions.retrieve(completion_id)

List stored completions

completions = client.chat.completions.list(limit: 10)

Update completion metadata

client.chat.completions.update(
  completion_id,
  metadata: {key: "value"}
)

Delete a stored completion

client.chat.completions.delete(completion_id)
