The Chat Completions API lets you create model responses for chat conversations. It supports text generation, vision, audio, function calling, and structured outputs.

Create a chat completion

Creates a model response for the given chat conversation.
client.chat.completions.create(params)
messages
Array
required
A list of messages comprising the conversation so far. Each message can be a system, user, assistant, tool, or function message.
model
String
required
Model ID used to generate the response, like gpt-4o or o3. See the models documentation for available options.
temperature
Float
default:"1.0"
Sampling temperature between 0 and 2. Higher values like 0.8 make output more random, while lower values like 0.2 make it more focused and deterministic.
max_completion_tokens
Integer
An upper bound for the number of tokens that can be generated for a completion, including both visible output tokens and reasoning tokens.
max_tokens
Integer
The maximum number of tokens that can be generated in the chat completion. This value is deprecated in favor of max_completion_tokens.
stream
Boolean
default:"false"
If set to true, partial message deltas are sent via Server-Sent Events. For streaming, use the stream or stream_raw methods instead of passing this flag to create.
tools
Array
A list of tools the model may call. Can be functions or custom tools defined using OpenAI::BaseModel.
tool_choice
String | Object
Controls which (if any) tool is called by the model. Can be auto, none, required, or a specific tool name.
response_format
Object
An object specifying the format that the model must output. Supports text, json_object, or json_schema for structured outputs.
frequency_penalty
Float
default:"0"
Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text.
presence_penalty
Float
default:"0"
Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far.
seed
Integer
If specified, the system will make a best effort to sample deterministically for improved reproducibility.
stop
String | Array<String>
Up to 4 sequences where the API will stop generating further tokens. Not supported with latest reasoning models.
n
Integer
default:"1"
How many chat completion choices to generate for each input message.
logprobs
Boolean
default:"false"
Whether to return log probabilities of the output tokens.
top_logprobs
Integer
An integer between 0 and 20 specifying the number of most likely tokens to return at each token position. Requires logprobs to be set to true.
reasoning_effort
String
Constrains effort on reasoning for reasoning models. Options: low, medium, high.
store
Boolean
Whether to store the output of this chat completion request for later use in distillation and evals.

Response

Returns a ChatCompletion object.
id
String
Unique identifier for the chat completion.
object
String
The object type, always chat.completion.
created
Integer
Unix timestamp of when the completion was created.
model
String
The model used for the completion.
choices
Array
A list of chat completion choices.
usage
Object
Token usage information.

Examples

Basic chat completion

require "openai"

client = OpenAI::Client.new

completion = client.chat.completions.create(
  model: "gpt-4",
  messages: [
    {
      role: "user",
      content: "Say this is a test"
    }
  ]
)

puts completion.choices.first&.message&.content

Streaming chat completion

require "openai"

client = OpenAI::Client.new

stream = client.chat.completions.stream(
  model: "gpt-4o-mini",
  messages: [
    {role: :user, content: "List three fun facts about dolphins."}
  ]
)

stream.text.each do |text|
  print(text)
end
puts

Function calling with tools

require "openai"

class GetWeather < OpenAI::BaseModel
  required :location, String
end

client = OpenAI::Client.new

stream = client.chat.completions.stream(
  model: "gpt-4o-mini",
  tools: [GetWeather],
  messages: [
    {role: :user, content: "What's the weather in San Francisco?"}
  ]
)

stream.each do |event|
  case event
  when OpenAI::Streaming::ChatFunctionToolCallArgumentsDoneEvent
    puts "Tool: #{event.name}"
    puts "Args: #{event.arguments}"
    pp event.parsed
  end
end

Structured outputs

require "openai"

class Location < OpenAI::BaseModel
  required :city, String
  required :country, String
end

class CalendarEvent < OpenAI::BaseModel
  required :name, String
  required :date, String
  required :location, Location
end

client = OpenAI::Client.new

completion = client.chat.completions.create(
  model: "gpt-4o-2024-08-06",
  messages: [
    {role: :system, content: "Extract the event information."},
    {role: :user, content: "Science fair on Friday in San Diego"}
  ],
  response_format: CalendarEvent
)

# Access the parsed structured output
event = completion.choices.first.message.parsed
puts "Event: #{event.name} on #{event.date}"

Additional methods

Stream raw events

For low-level streaming access, use stream_raw:
stream = client.chat.completions.stream_raw(
  model: "gpt-4",
  messages: [{role: "user", content: "Hello"}]
)

stream.each do |chunk|
  # The delta content can be nil (e.g. in the final chunk); print skips nil
  # and, unlike puts, does not insert a newline after every fragment.
  print chunk.choices.first&.delta&.content
end

Retrieve a stored completion

completion = client.chat.completions.retrieve(completion_id)

List stored completions

completions = client.chat.completions.list(limit: 10)

Update completion metadata

client.chat.completions.update(
  completion_id,
  metadata: {key: "value"}
)

Delete a stored completion

client.chat.completions.delete(completion_id)
