The Assistants API is deprecated in favor of the Responses API.
Create Run
POST https://api.openai.com/v1/threads/{thread_id}/runs
Creates a run that executes an assistant on a thread.
Path Parameters
The ID of the thread to run.
Request Body
The ID of the assistant to use to execute this run.
The ID of the model to use for this run. If provided, it overrides the model associated with the assistant; otherwise, the assistant's model is used.
Overrides the instructions of the assistant. This is useful for modifying the behavior on a per-run basis.
Appends additional instructions at the end of the instructions for the run. This is useful for modifying the behavior on a per-run basis without overriding other instructions.
Adds additional messages to the thread before creating the run.
Overrides the tools the assistant can use for this run. This is useful for modifying the behavior on a per-run basis.
Set of up to 16 key-value pairs that can be attached to an object. Keys are strings with a maximum length of 64 characters. Values are strings with a maximum length of 512 characters.
What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.
The maximum number of prompt tokens that may be used over the course of the run. The run will make a best effort to use only the number of prompt tokens specified, across multiple turns of the run. If the run exceeds the number of prompt tokens specified, the run will end with status incomplete.
The maximum number of completion tokens that may be used over the course of the run. The run will make a best effort to use only the number of completion tokens specified, across multiple turns of the run. If the run exceeds the number of completion tokens specified, the run will end with status incomplete.
If true, returns the events that occur during the run as a stream of server-sent events. The stream terminates with a data: [DONE] message when the run enters a terminal state.
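As a rough illustration of the streaming behavior described above, the sketch below reads server-sent events from an iterable of text lines and stops at the data: [DONE] message. The helper name parse_sse is hypothetical, the event names and payloads in the sample are illustrative placeholders rather than values confirmed by this page, and the parser assumes each data payload is a single-line JSON object.

```python
import json
from typing import Iterable, Iterator, Optional, Tuple

DONE_SENTINEL = "[DONE]"  # terminator named in the stream parameter description


def parse_sse(lines: Iterable[str]) -> Iterator[Tuple[Optional[str], dict]]:
    """Yield (event_name, payload) pairs until the [DONE] message arrives.

    Hypothetical helper: handles only the `event:` and `data:` fields and
    assumes every non-sentinel data payload is a single-line JSON object.
    """
    event_name: Optional[str] = None
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line:
            # A blank line separates events; reset the pending event name.
            event_name = None
            continue
        if line.startswith("event:"):
            event_name = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data = line[len("data:"):].strip()
            if data == DONE_SENTINEL:
                return  # end of stream
            yield event_name, json.loads(data)


# Illustrative stream; the event names here are assumptions, not taken from this page.
sample_stream = [
    "event: thread.run.created",
    'data: {"id": "run_abc123", "status": "queued"}',
    "",
    "event: thread.run.completed",
    'data: {"id": "run_abc123", "status": "completed"}',
    "",
    "event: done",
    "data: [DONE]",
    "",
    'data: {"id": "never_reached"}',
]
events = list(parse_sse(sample_stream))
```

Anything after the [DONE] line is ignored, which matches the description of [DONE] as the terminating message.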
Response
Returns a Run object.
The object type, always thread.run.
The Unix timestamp (in seconds) for when the run was created.
The ID of the thread that this run executed on.
The ID of the assistant used for execution of this run.
The status of the run. One of:
queued
in_progress
requires_action
cancelling
cancelled
failed
completed
incomplete
expired
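A common way to use these statuses is to poll until the run stops changing. The sketch below is one possible approach, not an official pattern: wait_for_status is a hypothetical helper, the set of terminal statuses is an assumption inferred from the names, and requires_action is treated as a stopping point on the assumption that the run then waits for the caller.

```python
import time
from typing import Callable

# Assumption: these statuses are final. "incomplete" is the status named in
# the token limit descriptions of the request body.
TERMINAL_STATUSES = frozenset(
    {"cancelled", "failed", "completed", "expired", "incomplete"}
)


def wait_for_status(
    get_status: Callable[[], str],
    interval: float = 1.0,
    max_polls: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_status() until the run stops or max_polls is reached."""
    for attempt in range(max_polls):
        status = get_status()
        # Assumption: requires_action needs caller input, so polling further
        # would not make progress.
        if status in TERMINAL_STATUSES or status == "requires_action":
            return status
        if attempt < max_polls - 1:
            sleep(interval)
    raise TimeoutError(f"run still active after {max_polls} polls")


# Simulated status sequence so the example runs without network access.
fake_statuses = iter(["queued", "in_progress", "in_progress", "completed"])
final_status = wait_for_status(
    lambda: next(fake_statuses), sleep=lambda _seconds: None
)
```

With a real client, get_status could be a lambda that calls client.beta.threads.runs.retrieve(thread_id="thread_abc123", run_id="run_abc123") and returns its status attribute.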
The model that the assistant used for this run.
The instructions that the assistant used for this run.
Retrieve Run
GET https://api.openai.com/v1/threads/{thread_id}/runs/{run_id}
Retrieves a run.
List Runs
GET https://api.openai.com/v1/threads/{thread_id}/runs
Returns a list of runs belonging to a thread.
Cancel Run
POST https://api.openai.com/v1/threads/{thread_id}/runs/{run_id}/cancel
Cancels a run that is in_progress.
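Because cancellation is described as applying to runs that are in_progress, a caller may want to check the status first. The following is a minimal sketch of that check: cancel_if_in_progress and the stub classes are hypothetical, and the stub stands in for the real client so the example runs offline. The status the stub returns from cancel is a placeholder, not a documented value.

```python
from types import SimpleNamespace
from typing import Any


def cancel_if_in_progress(client: Any, thread_id: str, run_id: str) -> bool:
    """Send a cancel request only when the run is currently in_progress.

    Returns True if a cancel request was sent, False otherwise.
    """
    run = client.beta.threads.runs.retrieve(thread_id=thread_id, run_id=run_id)
    if run.status != "in_progress":
        return False
    client.beta.threads.runs.cancel(thread_id=thread_id, run_id=run_id)
    return True


class _StubRuns:
    """Offline stand-in for client.beta.threads.runs (illustration only)."""

    def __init__(self, status: str) -> None:
        self.status = status
        self.cancel_calls = 0

    def retrieve(self, thread_id: str, run_id: str) -> SimpleNamespace:
        return SimpleNamespace(id=run_id, status=self.status)

    def cancel(self, thread_id: str, run_id: str) -> SimpleNamespace:
        self.cancel_calls += 1
        return SimpleNamespace(id=run_id, status="cancelling")  # placeholder


def make_stub_client(status: str) -> SimpleNamespace:
    runs = _StubRuns(status)
    return SimpleNamespace(beta=SimpleNamespace(threads=SimpleNamespace(runs=runs)))


active_client = make_stub_client("in_progress")
sent = cancel_if_in_progress(active_client, "thread_abc123", "run_abc123")

finished_client = make_stub_client("completed")
skipped = cancel_if_in_progress(finished_client, "thread_abc123", "run_abc123")
```

The status can change between the retrieve and cancel calls, so a real caller should still be prepared for the cancel request to be rejected.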
from openai import OpenAI

client = OpenAI()

# Create and run
run = client.beta.threads.runs.create(
    thread_id="thread_abc123",
    assistant_id="asst_abc123",
)
print(run.id, run.status)

# Retrieve run status
run = client.beta.threads.runs.retrieve(
    thread_id="thread_abc123",
    run_id=run.id,
)
print(run.status)
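The optional request body settings can be checked locally before a run is created. The sketch below validates the limits stated in the request body section (temperature between 0 and 2, at most 16 metadata pairs, 64-character keys, 512-character values). build_run_kwargs is a hypothetical helper, the keyword names are assumed to match the field names in the example Run object, and the positive-integer check on token budgets is an assumption rather than a documented rule.

```python
from typing import Any, Dict, Optional


def build_run_kwargs(
    thread_id: str,
    assistant_id: str,
    *,
    temperature: Optional[float] = None,
    max_prompt_tokens: Optional[int] = None,
    max_completion_tokens: Optional[int] = None,
    metadata: Optional[Dict[str, str]] = None,
) -> Dict[str, Any]:
    """Validate optional run settings and return keyword arguments."""
    kwargs: Dict[str, Any] = {"thread_id": thread_id, "assistant_id": assistant_id}

    if temperature is not None:
        if not 0 <= temperature <= 2:
            raise ValueError("temperature must be between 0 and 2")
        kwargs["temperature"] = temperature

    for name, value in (
        ("max_prompt_tokens", max_prompt_tokens),
        ("max_completion_tokens", max_completion_tokens),
    ):
        if value is not None:
            # Assumption: a token budget must be a positive integer.
            if not isinstance(value, int) or value <= 0:
                raise ValueError(f"{name} must be a positive integer")
            kwargs[name] = value

    if metadata is not None:
        if len(metadata) > 16:
            raise ValueError("metadata allows at most 16 key-value pairs")
        for key, value in metadata.items():
            if len(key) > 64:
                raise ValueError(f"metadata key too long: {key[:20]!r}")
            if len(value) > 512:
                raise ValueError(f"metadata value too long for key {key!r}")
        kwargs["metadata"] = dict(metadata)

    return kwargs


run_kwargs = build_run_kwargs(
    "thread_abc123",
    "asst_abc123",
    temperature=0.2,
    max_prompt_tokens=1000,
    metadata={"ticket": "42"},  # illustrative metadata
)
# With a configured client: run = client.beta.threads.runs.create(**run_kwargs)
```

Settings left as None are omitted from the result, so the assistant's own defaults apply to them.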
{
  "id": "run_abc123",
  "object": "thread.run",
  "created_at": 1699063290,
  "thread_id": "thread_abc123",
  "assistant_id": "asst_abc123",
  "status": "queued",
  "started_at": 1699063290,
  "model": "gpt-4o",
  "instructions": "You are a personal math tutor.",
  "tools": [
    {
      "type": "code_interpreter"
    }
  ],
  "metadata": {},
  "usage": null,
  "temperature": 1.0,
  "top_p": 1.0,
  "max_prompt_tokens": 1000,
  "max_completion_tokens": 1000
}
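For callers working with the raw JSON instead of the SDK object, a response like the one above can be loaded into a small typed structure. This is a minimal sketch: RunSummary and parse_run are hypothetical names, only a subset of the fields is kept, and the created_at value is interpreted as a Unix timestamp in seconds, as the field description states.

```python
import json
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class RunSummary:
    """Subset of Run object fields (hypothetical container)."""

    id: str
    status: str
    model: str
    created_at: datetime
    max_prompt_tokens: Optional[int]
    max_completion_tokens: Optional[int]


def parse_run(payload: str) -> RunSummary:
    """Parse a Run object from JSON text, checking the object type first."""
    data = json.loads(payload)
    if data.get("object") != "thread.run":
        raise ValueError(f"unexpected object type: {data.get('object')!r}")
    return RunSummary(
        id=data["id"],
        status=data["status"],
        model=data["model"],
        # created_at is a Unix timestamp in seconds; convert to aware UTC.
        created_at=datetime.fromtimestamp(data["created_at"], tz=timezone.utc),
        max_prompt_tokens=data.get("max_prompt_tokens"),
        max_completion_tokens=data.get("max_completion_tokens"),
    )


example_payload = """
{
  "id": "run_abc123",
  "object": "thread.run",
  "created_at": 1699063290,
  "status": "queued",
  "model": "gpt-4o",
  "max_prompt_tokens": 1000,
  "max_completion_tokens": 1000
}
"""
summary = parse_run(example_payload)
```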