
Overview

The Realtime API enables low-latency, bidirectional voice and audio conversations with OpenAI models. It supports WebRTC for browser-based apps and SIP for telephony integrations.

Client Secrets

Client secrets are short-lived tokens that grant access to the Realtime API without exposing your main API key. Generate them on your server and hand them to client-side applications, such as a browser using WebRTC.

Create client secret

Creates a client secret with session configuration.
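# Setup shared by every snippet on this page.
# Assumption: the official `openai` gem, with the key read from the environment.
require "openai"
client = OpenAI::Client.new(api_key: ENV["OPENAI_API_KEY"])

secret = client.realtime.client_secrets.create(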
  session: {
    type: "realtime",
    model: "gpt-4o-realtime-preview",
    instructions: "You are a helpful voice assistant",
    audio: {
      voice: "alloy",
      format: "pcm16"
    }
  },
  expires_after: {
    ttl: 3600  # 1 hour
  }
)

puts secret.client_secret  # => "ek_abc123..."

Parameters

session (object): Session configuration for the client secret
expires_after (object): Expiration configuration

Response

client_secret (string): The client secret token (starts with ek_)
session (object): The effective session configuration
expires_at (integer): Unix timestamp when the secret expires
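
Because expires_at is a Unix timestamp, a server can check how much lifetime a secret has left before handing it to a client. The following is a minimal sketch under that assumption; `seconds_remaining`, `secret_usable?`, and the 30-second margin are hypothetical and not part of the SDK.

```ruby
# Hypothetical helpers (not part of the SDK) for inspecting a client secret response.

# Seconds until the secret expires; never negative.
def seconds_remaining(expires_at, now: Time.now)
  [expires_at - now.to_i, 0].max
end

# True when the token carries the documented "ek_" prefix and still has
# at least `margin` seconds of lifetime left.
def secret_usable?(token, expires_at, now: Time.now, margin: 30)
  token.start_with?("ek_") && seconds_remaining(expires_at, now: now) >= margin
end

now = Time.at(1_700_000_000)
puts seconds_remaining(1_700_003_600, now: now)            # 3600
puts secret_usable?("ek_abc123", 1_700_003_600, now: now)  # true
puts secret_usable?("ek_abc123", 1_700_000_010, now: now)  # false (under 30 s left)
```

Passing `now:` explicitly keeps the check deterministic in tests; in production code the default `Time.now` would normally be used.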

Calls

Manage real-time calls, including SIP integration.

Accept call

Accepts an incoming SIP call.
client.realtime.calls.accept(
  "call_abc123",
  model: "gpt-4o-realtime-preview",
  instructions: "You are a customer service assistant",
  audio: {
    voice: "shimmer",
    format: "g711_ulaw"
  }
)

Parameters

call_id (string, required): Call identifier from the incoming call webhook
model (string): Realtime model to use
instructions (string): System instructions
audio (object): Audio configuration (voice, format, sample rate)
tools (array): Available tools
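
The examples on this page pair pcm16 at 24000 Hz with WebRTC sessions and g711_ulaw at 8000 Hz with SIP calls. A small helper can keep that pairing in one place. This is only a sketch: `audio_config`, `AUDIO_DEFAULTS`, and the `:webrtc` / `:sip` symbols are hypothetical names, and the values are taken from this page's examples rather than from a definitive list of supported formats.

```ruby
# Hypothetical helper (not part of the SDK): build the `audio` hash for a transport.
# Values mirror the examples on this page.
AUDIO_DEFAULTS = {
  webrtc: { format: "pcm16", sample_rate: 24_000 },
  sip:    { format: "g711_ulaw", sample_rate: 8_000 }
}.freeze

def audio_config(transport, voice:)
  defaults = AUDIO_DEFAULTS.fetch(transport) do
    raise ArgumentError, "unknown transport: #{transport.inspect}"
  end
  { voice: voice }.merge(defaults)
end

p audio_config(:sip, voice: "shimmer")
```

The returned hash could then be passed as the `audio:` argument to `calls.accept` or placed inside a `session` hash.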

Hangup call

Ends an active call.
client.realtime.calls.hangup("call_abc123")

Parameters

call_id (string, required): ID of the call to end

Refer call

Transfers a SIP call to another destination.
client.realtime.calls.refer(
  "call_abc123",
  target_uri: "sip:[email protected]"
)

Parameters

call_id (string, required): ID of the call to transfer
target_uri (string, required): SIP URI to transfer to (e.g., sip:[email protected] or tel:+15551234567)
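
Since target_uri takes either a sip: or a tel: URI, it can help to reject obviously malformed targets before calling `refer`. The sketch below is a loose shape check only, not a full SIP or tel URI parser; `valid_refer_target?` and the sample address `sip:agent@example.com` are hypothetical.

```ruby
# Hypothetical pre-check (not part of the SDK): accept only sip: and tel: targets.
# Loose shape check: sip: requires a user@host form, tel: requires digits
# with an optional leading "+".
SIP_URI = /\Asip:[^@\s]+@[^@\s]+\z/
TEL_URI = /\Atel:\+?\d[\d\-.]*\z/

def valid_refer_target?(uri)
  uri.is_a?(String) && (SIP_URI.match?(uri) || TEL_URI.match?(uri))
end

puts valid_refer_target?("tel:+15551234567")       # true
puts valid_refer_target?("sip:agent@example.com")  # true
puts valid_refer_target?("https://example.com")    # false
```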

Reject call

Declines an incoming SIP call.
client.realtime.calls.reject(
  "call_abc123",
  status_code: 486  # Busy Here
)

Parameters

call_id (string, required): ID of the call to reject
status_code (integer): SIP response code (default: 603 - Decline). Common codes:
  • 486 - Busy Here
  • 603 - Decline
  • 480 - Temporarily Unavailable
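
One way to keep these codes out of handler logic is to map application-level reasons to them. A minimal sketch, assuming only the three codes listed above; `REJECT_CODES`, `reject_code_for`, and the reason symbols are hypothetical names.

```ruby
# Hypothetical mapping (not part of the SDK) from an application-level
# reason to the SIP response codes listed above.
REJECT_CODES = {
  busy: 486,        # Busy Here
  declined: 603,    # Decline
  unavailable: 480  # Temporarily Unavailable
}.freeze

# Falls back to 603, the documented default, for unrecognized reasons.
def reject_code_for(reason)
  REJECT_CODES.fetch(reason, 603)
end

puts reject_code_for(:busy)     # 486
puts reject_code_for(:unknown)  # 603
```

The result can be passed as `status_code:` to `calls.reject`.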

Examples

WebRTC client authentication

# Generate client secret on your server
secret = client.realtime.client_secrets.create(
  session: {
    type: "realtime",
    model: "gpt-4o-realtime-preview",
    instructions: "You are a friendly voice assistant",
    audio: {
      voice: "alloy",
      format: "pcm16",
      sample_rate: 24000
    }
  },
  expires_after: { ttl: 3600 }
)

# Send to client
render json: { client_secret: secret.client_secret }

SIP call handling

# In your webhook handler for incoming calls (Sinatra-style routes)
post '/webhooks/sip/inbound' do
  call_id = params[:call_id]
  from_number = params[:from]  # caller's number; unused below, available for logging or lookups
  
  # Accept the call with a configured session
  client.realtime.calls.accept(
    call_id,
    model: "gpt-4o-realtime-preview",
    instructions: "You are a customer support agent for Acme Corp. Be helpful and professional.",
    audio: {
      voice: "shimmer",
      format: "g711_ulaw",
      sample_rate: 8000
    },
    tools: [
      {
        type: "function",
        function: {
          name: "lookup_order",
          description: "Look up customer order status",
          parameters: {
            type: "object",
            properties: {
              order_id: { type: "string" }
            }
          }
        }
      }
    ]
  )
  
  status 200
end

# Transfer call to human agent
post '/transfer/:call_id' do
  client.realtime.calls.refer(
    params[:call_id],
    target_uri: "sip:[email protected]"
  )
  
  status 200
end

Voice assistant with custom voice

secret = client.realtime.client_secrets.create(
  session: {
    type: "realtime",
    model: "gpt-4o-realtime-preview",
    instructions: "You are a meditation guide. Speak slowly and calmly.",
    audio: {
      voice: "ballad",  # Calm, soothing voice
      format: "pcm16",
      sample_rate: 24000
    },
    output_modalities: ["audio", "text"],
    temperature: 0.7
  },
  expires_after: { ttl: 7200 }  # 2 hours
)
