Create a chat completion

Creates a model response for the given chat conversation.

Request parameters

messages
A list of messages comprising the conversation so far. Each message can be a system, user, assistant, tool, or function message.

model
Model ID used to generate the response, like gpt-4o or o3. See the models documentation for available options.

temperature
Sampling temperature between 0 and 2. Higher values like 0.8 make output more random, while lower values like 0.2 make it more focused and deterministic.

max_completion_tokens
An upper bound on the number of tokens that can be generated for a completion, including both visible output tokens and reasoning tokens.

max_tokens
The maximum number of tokens that can be generated in the chat completion. Deprecated in favor of max_completion_tokens.

stream
If set to true, partial message deltas are sent via Server-Sent Events. For streaming, use the stream_raw method instead.

tools
A list of tools the model may call. Can be functions or custom tools defined using OpenAI::BaseModel.

tool_choice
Controls which (if any) tool is called by the model. Can be auto, none, required, or a specific tool name.

response_format
An object specifying the format that the model must output. Supports text, json_object, or json_schema for structured outputs.

frequency_penalty
Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text.

presence_penalty
Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far.

seed
If specified, the system will make a best effort to sample deterministically for improved reproducibility.

stop
Up to 4 sequences where the API will stop generating further tokens. Not supported with the latest reasoning models.

n
How many chat completion choices to generate for each input message.

logprobs
Whether to return log probabilities of the output tokens.

top_logprobs
An integer between 0 and 20 specifying the number of most likely tokens to return at each token position.

reasoning_effort
Constrains effort on reasoning for reasoning models. Options: low, medium, high.

store
Whether or not to store the output of this chat completion request for use in distillation and evals.
Response

Returns a ChatCompletion object.

id
Unique identifier for the chat completion.

object
The object type, always chat.completion.

created
Unix timestamp of when the completion was created.

model
The model used for the completion.

choices
A list of chat completion choices.

usage
Token usage information.
Examples
Basic chat completion
Streaming chat completion
Function calling with tools
Structured outputs
Additional methods
Stream raw events
For low-level streaming access, use stream_raw: