Overview
The Realtime API enables low-latency, multi-turn conversations with AI models over WebSocket connections. This is ideal for voice assistants, interactive applications, and real-time chat experiences.The Realtime API is currently supported on Cloudflare Workers runtime. For Node.js, use the dedicated realtime handler.
Connection
WebSocket Endpoint
Authentication
Pass authentication as query parameters:Query Parameters
The provider to use (e.g.,
openai)Your provider API key
The model to use (default:
gpt-4o-realtime-preview)Event Types
Client Events
Events sent from your application to the model:Update session configuration
Add audio data to the input buffer
Commit the audio buffer for processing
Add a message to the conversation
Trigger a model response
Cancel an in-progress response
Server Events
Events sent from the model to your application:Session was successfully created
Session configuration was updated
A new conversation item was created
Audio response chunk
Audio response completed
Text response chunk
Text response completed
Response generation completed
An error occurred
Example
Basic Text Conversation
Audio Streaming
Best Practices
Audio Format
Audio Format
- Use PCM16 format at 24kHz sample rate for best compatibility
- Keep audio chunks around 100ms (2400 samples) for optimal latency
- Buffer audio on the client side to handle network jitter
Connection Management
Connection Management
- Implement reconnection logic with exponential backoff
- Monitor connection health with ping/pong frames
- Close connections gracefully when done
Error Handling
Error Handling
- Always handle
errorevents from the server - Implement timeout logic for responses
- Provide fallback behavior for connection failures
Performance
Performance
- Use audio compression where appropriate
- Implement voice activity detection to reduce unnecessary data
- Cache session configuration to avoid repeated updates
Supported Providers
Realtime API support:
- OpenAI: Full support with
gpt-4o-realtime-preview - Azure OpenAI: Supported on compatible deployments
Related Resources
Chat Completions
Standard chat API
Audio Speech
Text-to-speech API
Streaming
HTTP streaming guide