Server Events - Unmute

Overview

Server events are messages sent from the Unmute backend to the client. These events stream audio responses, transcriptions, and status updates.
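
Since every server event carries a `type` field, one way to consume the stream is to route each message through a table of handlers. The following is a minimal sketch rather than part of the Unmute API: `createDispatcher` and the handler map are hypothetical names, and the function takes the raw JSON string that a WebSocket message would carry.

```javascript
// Hypothetical helper: routes a raw server message to a handler keyed by "type".
function createDispatcher(handlers, onUnknown = () => {}) {
  // A Map avoids accidental matches on inherited object properties.
  const table = new Map(Object.entries(handlers));

  return function dispatch(rawMessage) {
    const event = JSON.parse(rawMessage);
    const handler = table.get(event.type);
    if (handler) {
      handler(event);
      return true; // the event was handled
    }
    onUnknown(event);
    return false; // no handler registered for this type
  };
}

// Usage: collect text deltas and ignore everything else.
const seen = [];
const dispatch = createDispatcher({
  "response.text.delta": (event) => seen.push(event.delta),
});
dispatch('{"type":"response.text.delta","event_id":"event_1","delta":"Hi"}');
```

In a browser client, `dispatch` could be called with `event.data` from inside `ws.onmessage`.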

session.updated

Confirms that the session configuration was successfully updated.

Response Fields

type

string

required

Always "session.updated"

event_id

string

required

Unique event identifier

session

object

required

The updated session configuration (mirrors the session.update request)

Example

{
  "type": "session.updated",
  "event_id": "event_DEF456uvw",
  "session": {
    "instructions": {
      "character": "You are a helpful AI assistant.",
      "scenario": "Casual conversation"
    },
    "voice": "default",
    "allow_recording": false
  }
}

response.created

Indicates that the assistant has started generating a response.

Response Fields

type

string

required

Always "response.created"

event_id

string

required

Unique event identifier

response

object

required

Response metadata

response.object

string

required

Always "realtime.response"

response.status

string

required

Response status. One of: "in_progress", "completed", "cancelled", "failed", "incomplete"

response.voice

string

required

Voice identifier being used for this response

response.chat_history

array

default:"[]"

Conversation history (array of message objects)

Example

{
  "type": "response.created",
  "event_id": "event_GHI789rst",
  "response": {
    "object": "realtime.response",
    "status": "in_progress",
    "voice": "default",
    "chat_history": []
  }
}

response.audio.delta

Streams generated speech audio to the client.

Response Fields

type

string

required

Always "response.audio.delta"

event_id

string

required

Unique event identifier

delta

string

required

Base64-encoded Opus audio chunk

Audio Specifications:

Codec: Opus
Sample Rate: 24 kHz
Channels: Mono
Encoding: Base64 string

Example

{
  "type": "response.audio.delta",
  "event_id": "event_JKL012mno",
  "delta": "T2dnUwACAAAAAAAAAAAljMlAAAAABCmR7kBE09w..."
}

Implementation Notes

Audio chunks are sent as soon as they become available from the text-to-speech system.
Because the Opus encoder buffers its input, not every PCM chunk results in an output chunk.
Chunks should be decoded and played in the order they are received.

JavaScript Example

// Assumes `ws` is an open WebSocket connected to the Unmute backend and that
// `playOpusAudio` is an application-supplied function that decodes and plays Opus data.
ws.onmessage = (event) => {
  const message = JSON.parse(event.data);

  if (message.type === 'response.audio.delta') {
    // Decode the base64 payload into raw bytes
    const audioData = atob(message.delta);
    const audioBytes = Uint8Array.from(audioData, (ch) => ch.charCodeAt(0));

    // Play audio using Web Audio API or audio element
    playOpusAudio(audioBytes);
  }
};

response.audio.done

Indicates that audio streaming for the current response has completed.

Response Fields

type

string

required

Always "response.audio.done"

event_id

string

required

Unique event identifier

Example

{
  "type": "response.audio.done",
  "event_id": "event_MNO345pqr"
}

response.text.delta

Streams the text being generated (for display or debugging).

Response Fields

type

string

required

Always "response.text.delta"

event_id

string

required

Unique event identifier

delta

string

required

Text chunk being generated

Example

{
  "type": "response.text.delta",
  "event_id": "event_PQR678stu",
  "delta": "Hello! How can I "
}

response.text.done

Indicates that text generation is complete and provides the full text.

Response Fields

type

string

required

Always "response.text.done"

event_id

string

required

Unique event identifier

text

string

required

Complete generated text

Example

{
  "type": "response.text.done",
  "event_id": "event_STU901vwx",
  "text": "Hello! How can I help you today?"
}
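
The relationship between `response.text.delta` and `response.text.done` can be sketched as a small accumulator. This is an illustrative helper, not part of the Unmute API: `TextAccumulator` is a hypothetical name, and it assumes that deltas concatenate directly and that a `response.created` event marks the start of a new response.

```javascript
// Hypothetical helper: rebuilds response text from response.text.delta events
// and prefers the complete text from response.text.done once it arrives.
class TextAccumulator {
  constructor() {
    this.partial = "";
    this.finalText = null;
  }

  handle(event) {
    switch (event.type) {
      case "response.created":
        // Assumption: a new response starts, so earlier text is discarded.
        this.partial = "";
        this.finalText = null;
        break;
      case "response.text.delta":
        // Assumption: deltas concatenate with no separator added.
        this.partial += event.delta;
        break;
      case "response.text.done":
        this.finalText = event.text;
        break;
    }
  }

  // Text to display: the final text if known, otherwise what has streamed so far.
  get text() {
    return this.finalText ?? this.partial;
  }
}

const acc = new TextAccumulator();
acc.handle({ type: "response.text.delta", event_id: "event_1", delta: "Hello! How can I " });
acc.handle({ type: "response.text.delta", event_id: "event_2", delta: "help you today?" });
console.log(acc.text); // prints "Hello! How can I help you today?"
```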

conversation.item.input_audio_transcription.delta

Streams real-time transcription of user speech.

Response Fields

type

string

required

Always "conversation.item.input_audio_transcription.delta"

event_id

string

required

Unique event identifier

delta

string

required

Transcription text chunk

start_time

number

required

Timestamp when speech started (Unmute extension)

Example

{
  "type": "conversation.item.input_audio_transcription.delta",
  "event_id": "event_VWX234yza",
  "delta": "Hello, can you",
  "start_time": 1234567890.123
}
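
A client that displays a live transcript might collect these deltas together with their `start_time`. The sketch below is hypothetical (`createTranscript` is not an Unmute API), and it assumes chunks should be joined with a single space; the source does not specify how whitespace is handled between deltas.

```javascript
// Hypothetical helper: stores transcription chunks with their start times.
function createTranscript() {
  const chunks = [];
  return {
    chunks,
    add(event) {
      if (event.type !== "conversation.item.input_audio_transcription.delta") {
        return; // ignore unrelated events
      }
      chunks.push({ text: event.delta, startTime: event.start_time });
    },
    // Assumption: chunks are separated by one space when displayed.
    text() {
      return chunks
        .map((chunk) => chunk.text.trim())
        .filter((text) => text.length > 0)
        .join(" ");
    },
  };
}

const transcript = createTranscript();
transcript.add({
  type: "conversation.item.input_audio_transcription.delta",
  event_id: "event_1",
  delta: "Hello, can you",
  start_time: 1234567890.123,
});
transcript.add({
  type: "conversation.item.input_audio_transcription.delta",
  event_id: "event_2",
  delta: " help me",
  start_time: 1234567891.5,
});
console.log(transcript.text());
```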

input_audio_buffer.speech_started

Indicates that speech was detected in the user’s audio input.

Note: This event is based on speech-to-text detection, not voice activity detection (VAD), so it is only sent when actual speech is transcribed.

Response Fields

type

string

required

Always "input_audio_buffer.speech_started"

event_id

string

required

Unique event identifier

Example

{
  "type": "input_audio_buffer.speech_started",
  "event_id": "event_YZA567bcd"
}

input_audio_buffer.speech_stopped

Indicates that a pause was detected in the user’s audio input.

Note: This event is based on voice activity detection (VAD).

Response Fields

type

string

required

Always "input_audio_buffer.speech_stopped"

event_id

string

required

Unique event identifier

Example

{
  "type": "input_audio_buffer.speech_stopped",
  "event_id": "event_BCD890efg"
}
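
The two `input_audio_buffer` events can drive a simple "user is speaking" indicator. The sketch below is illustrative only: `createSpeechTracker` is a hypothetical name, and because the two events come from different detectors (speech-to-text and VAD), it does not assume they always arrive in strict start/stop pairs.

```javascript
// Hypothetical helper: tracks whether the user is currently speaking.
function createSpeechTracker() {
  let userSpeaking = false;
  return {
    // Updates the state from a server event and returns the current state.
    handle(event) {
      if (event.type === "input_audio_buffer.speech_started") {
        userSpeaking = true;
      } else if (event.type === "input_audio_buffer.speech_stopped") {
        userSpeaking = false;
      }
      return userSpeaking;
    },
    get isUserSpeaking() {
      return userSpeaking;
    },
  };
}

const tracker = createSpeechTracker();
tracker.handle({ type: "input_audio_buffer.speech_started", event_id: "event_1" });
console.log(tracker.isUserSpeaking); // prints true
```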

unmute.interrupted_by_vad

Indicates that the voice activity detector interrupted the assistant’s response generation because the user started speaking.

Unmute Extension: This event is specific to Unmute.

Response Fields

type

string

required

Always "unmute.interrupted_by_vad"

event_id

string

required

Unique event identifier

Example

{
  "type": "unmute.interrupted_by_vad",
  "event_id": "event_EFG123hij"
}
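
The source does not prescribe how a client should react to this event. One plausible reaction, shown here purely as an assumption, is to drop any audio that is queued but not yet played. `PlaybackQueue` is a hypothetical class that stores the base64 `delta` strings from `response.audio.delta` events.

```javascript
// Hypothetical helper: queues audio deltas and discards them on interruption.
class PlaybackQueue {
  constructor() {
    this.pending = [];
  }

  handle(event) {
    if (event.type === "response.audio.delta") {
      this.pending.push(event.delta);
    } else if (event.type === "unmute.interrupted_by_vad") {
      // Assumption: audio from the interrupted response should not be played.
      this.pending.length = 0;
    }
  }

  // Returns the next queued chunk in arrival order, or undefined if empty.
  next() {
    return this.pending.shift();
  }
}

const queue = new PlaybackQueue();
queue.handle({ type: "response.audio.delta", event_id: "event_1", delta: "AAAA" });
queue.handle({ type: "response.audio.delta", event_id: "event_2", delta: "BBBB" });
queue.handle({ type: "unmute.interrupted_by_vad", event_id: "event_3" });
console.log(queue.pending.length); // prints 0
```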

unmute.response.text.delta.ready

Indicates that a text delta is ready for processing.

Unmute Extension: This event is specific to Unmute.

Response Fields

type

string

required

Always "unmute.response.text.delta.ready"

event_id

string

required

Unique event identifier

delta

string

required

Text chunk that is ready

Example

{
  "type": "unmute.response.text.delta.ready",
  "event_id": "event_HIJ456klm",
  "delta": "help you today?"
}

unmute.response.audio.delta.ready

Indicates that an audio delta is ready with sample count information.

Unmute Extension: This event is specific to Unmute.

Response Fields

type

string

required

Always "unmute.response.audio.delta.ready"

event_id

string

required

Unique event identifier

number_of_samples

integer

required

Number of audio samples in this chunk

Example

{
  "type": "unmute.response.audio.delta.ready",
  "event_id": "event_KLM789nop",
  "number_of_samples": 480
}
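
A sample count can be converted into a playback duration if the sample rate is known. The sketch below assumes that `number_of_samples` is counted at the 24 kHz rate listed under `response.audio.delta`, which the source does not state explicitly. `samplesToMilliseconds` is a hypothetical helper name.

```javascript
// Assumption: samples are counted at the 24 kHz rate from the audio specifications.
const SAMPLE_RATE_HZ = 24000;

// Converts a sample count to a duration in milliseconds.
function samplesToMilliseconds(numberOfSamples) {
  // Multiply before dividing to keep the arithmetic exact for integer inputs.
  return (numberOfSamples * 1000) / SAMPLE_RATE_HZ;
}

console.log(samplesToMilliseconds(480)); // prints 20
```

Under that assumption, the 480 samples in the example above correspond to 20 ms of audio.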

unmute.additional_outputs

Provides additional debug or metadata outputs from the system.

Unmute Extension: This event is specific to Unmute and used for debugging.

Response Fields

type

string

required

Always "unmute.additional_outputs"

event_id

string

required

Unique event identifier

args

any

required

Additional output data (structure varies)

Example

{
  "type": "unmute.additional_outputs",
  "event_id": "event_NOP012qrs",
  "args": {
    "debug_info": "Processing latency: 45ms"
  }
}

error

Reports errors during the WebSocket session.

Response Fields

type

string

required

Always "error"

event_id

string

required

Unique event identifier

error

object

required

Error details

error.type

string

required

Error type (e.g., "invalid_request_error", "fatal")

error.code

string

Error code (optional)

error.message

string

required

Human-readable error message

error.param

string

Parameter that caused the error (optional)

error.details

any

Additional error details (Unmute extension, optional)

Example: Invalid JSON

{
  "type": "error",
  "event_id": "event_QRS345tuv",
  "error": {
    "type": "invalid_request_error",
    "message": "Invalid JSON: Expecting value: line 1 column 1 (char 0)"
  }
}

Example: Fatal Error

{
  "type": "error",
  "event_id": "event_TUV678wxy",
  "error": {
    "type": "fatal",
    "message": "Too many people are connected to service 'tts'. Please try again later."
  }
}

Note: Fatal errors typically result in the WebSocket connection being closed by the server.
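
A client can use `error.type` to decide whether to keep the session going. The following is a minimal sketch, not an official client: `describeError` is a hypothetical helper, and treating `"fatal"` as a signal to stop is an assumption based on the note above.

```javascript
// Hypothetical helper: summarizes an error event and flags fatal errors.
function describeError(event) {
  if (event.type !== "error") {
    return null; // not an error event
  }
  const { type, code, message } = event.error;
  // error.code is optional, so include it only when present.
  const label = code ? `${type}/${code}` : type;
  return {
    fatal: type === "fatal",
    summary: `[${label}] ${message}`,
  };
}

const result = describeError({
  type: "error",
  event_id: "event_TUV678wxy",
  error: {
    type: "fatal",
    message: "Too many people are connected to service 'tts'. Please try again later.",
  },
});
console.log(result.fatal); // prints true
```

When `fatal` is true, a client might stop sending audio and show `summary` to the user, since the server is likely to close the connection.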

Next Steps

Client Events

Session Management
