Streaming Responses
The OpenAI Python SDK supports streaming responses using Server-Sent Events (SSE). This allows you to receive model output in real time as it is generated, rather than waiting for the entire response to complete.
Basic Streaming
To enable streaming, set the `stream` parameter to `True` when calling `client.responses.create()`:
Stream Events
When streaming is enabled, the API returns a `Stream[ResponseStreamEvent]` object. Each event in the stream represents a different type of update.
Event Types
The stream emits various event types to communicate the progress of the response:
- `response.created` - Initial event when the response starts
- `response.in_progress` - Response is being generated
- `response.output_item_added` - A new output item was added
- `response.content_part_added` - A new content part was added
- `response.text_delta` - Text content delta (incremental update)
- `response.text_done` - Text content is complete
- `response.output_item_done` - An output item is complete
- `response.completed` - Response generation is complete
- `response.failed` - Response generation failed
- `response.error` - An error occurred
- `response.incomplete` - Response is incomplete
Tool Call Events
When using tools, additional events are emitted:
- `response.function_call_arguments_delta` - Function call arguments delta
- `response.function_call_arguments_done` - Function call arguments complete
- `response.web_search_call_searching` - Web search in progress
- `response.web_search_call_completed` - Web search completed
- `response.file_search_call_searching` - File search in progress
- `response.file_search_call_completed` - File search completed
- `response.code_interpreter_call_interpreting` - Code interpreter running
- `response.code_interpreter_call_code_delta` - Code delta
- `response.code_interpreter_call_code_done` - Code complete
- `response.code_interpreter_call_completed` - Code interpreter completed
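Function-call arguments arrive as a series of deltas followed by a done event. A sketch of accumulating them, using the event names above (the attribute names `item_id`, `delta`, and `arguments` are how the argument events expose their payloads; the event objects in the usage below are simplified stand-ins):

```python
def accumulate_tool_args(events) -> dict[str, str]:
    """Accumulate streamed function-call arguments, keyed by item id.

    `events` is any iterable of stream events; only the argument
    delta/done events are consumed, everything else is ignored.
    """
    args: dict[str, str] = {}
    for event in events:
        if event.type == "response.function_call_arguments_delta":
            # Append each incremental fragment of the JSON arguments.
            args[event.item_id] = args.get(event.item_id, "") + event.delta
        elif event.type == "response.function_call_arguments_done":
            # The done event carries the complete arguments string.
            args[event.item_id] = event.arguments
    return args
```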
Reasoning Events
For reasoning models (o-series and gpt-5):
- `response.reasoning_text_delta` - Reasoning text delta
- `response.reasoning_text_done` - Reasoning text complete
- `response.reasoning_summary_part_added` - Reasoning summary part added
- `response.reasoning_summary_text_delta` - Reasoning summary text delta
- `response.reasoning_summary_text_done` - Reasoning summary text complete
- `response.reasoning_summary_part_done` - Reasoning summary part complete
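One common pattern is separating the reasoning summary from the final answer as both stream in. A sketch using the event names above (the event objects in the usage below are simplified stand-ins):

```python
def split_reasoning_stream(events) -> tuple[str, str]:
    """Separate reasoning-summary text from the final answer text."""
    summary, answer = [], []
    for event in events:
        if event.type == "response.reasoning_summary_text_delta":
            summary.append(event.delta)   # incremental reasoning summary
        elif event.type == "response.text_delta":
            answer.append(event.delta)    # incremental answer text
    return "".join(summary), "".join(answer)
```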
Async Streaming
The async client uses the same interface as the synchronous client:
Processing Text Deltas
To process only the text content as it streams, filter for `response.text_delta` events and print or accumulate each `delta`, ignoring everything else.
Handling Different Event Types
Stream Options
You can configure streaming behavior using the `stream_options` parameter:
Streaming with Tools
Error Handling
Best Practices
- Use `flush=True` - When printing text deltas, use `flush=True` to ensure immediate output.
- Handle all event types - Make sure to handle different event types gracefully, especially error events.
- Close streams properly - When using context managers or manual stream handling, ensure streams are properly closed to free up resources.
- Monitor token usage - Use `stream_options` to include usage information in the final event to track costs.
- Implement timeouts - Set appropriate timeouts to prevent indefinite waiting.
Return Type
When `stream=True`, the method returns:
- Sync: `Stream[ResponseStreamEvent]`
- Async: `AsyncStream[ResponseStreamEvent]`

When `stream=False` or omitted, the method returns:
- `Response` - The complete response object
See Also
- Create Response - Learn about the main response creation method
- Server-Sent Events (MDN)
- Streaming Guide