Overview
Session management controls the behavior of the Unmute conversation system, including voice selection, conversation instructions, and recording preferences.
Session Configuration
Sessions are configured using the session.update client event. The backend requires this configuration before it begins processing audio.
Configuration Object
The session configuration is defined by the SessionConfig model:
instructions: Conversation instructions (an Unmute extension to the OpenAI Realtime API), containing:
instructions.character: Character personality and behavior description
instructions.scenario: Conversation scenario or context
voice: Voice identifier for text-to-speech synthesis. Must match a voice ID from the /v1/voices endpoint.
allow_recording: Whether to allow recording of the conversation session. Set to false to disable recording.
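The fields above can be checked client-side before sending a session.update. A minimal sketch; the validateSessionConfig helper is hypothetical, not part of the Unmute API:

```javascript
// Hypothetical helper: checks a session object against the fields
// described above before it is sent in a session.update event.
function validateSessionConfig(session) {
  const errors = [];
  if (typeof session.voice !== 'string' || session.voice.length === 0) {
    errors.push('voice must be a non-empty string');
  }
  if (typeof session.allow_recording !== 'boolean') {
    errors.push('allow_recording must be a boolean');
  }
  if (session.instructions !== undefined) {
    const { character, scenario } = session.instructions;
    if (character !== undefined && typeof character !== 'string') {
      errors.push('instructions.character must be a string');
    }
    if (scenario !== undefined && typeof scenario !== 'string') {
      errors.push('instructions.scenario must be a string');
    }
  }
  return errors;
}
```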
Getting Available Voices
Before configuring a session, retrieve the list of available voices:
HTTP Endpoint
GET /v1/voices
Response
[
  {
    "name": "default",
    "language": "en",
    "gender": "neutral"
  },
  {
    "name": "voice_001",
    "language": "en",
    "gender": "female"
  },
  {
    "name": "voice_002",
    "language": "en",
    "gender": "male"
  }
]
Note: Only voices with good: true are returned. The comment field is excluded from the response.
JavaScript Example
const response = await fetch('http://localhost:8000/v1/voices');
const voices = await response.json();
console.log('Available voices:', voices);
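Since the response is a plain array, you can select a voice by attribute before configuring the session. A small sketch; pickVoice is a hypothetical helper, not part of the API:

```javascript
// Hypothetical helper: picks a voice from the /v1/voices response,
// preferring a language/gender match and falling back to the first entry.
function pickVoice(voices, { language, gender } = {}) {
  const match = voices.find(v =>
    (language === undefined || v.language === language) &&
    (gender === undefined || v.gender === gender));
  return (match || voices[0] || {}).name;
}
```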
Configuring a Session
Send a session.update event immediately after establishing the WebSocket connection.
Example: Basic Configuration
{
  "type": "session.update",
  "session": {
    "voice": "default",
    "allow_recording": false
  }
}
Example: With Instructions
{
  "type": "session.update",
  "session": {
    "instructions": {
      "character": "You are a friendly and helpful AI assistant with a warm personality.",
      "scenario": "You are helping a user learn about voice AI technology."
    },
    "voice": "voice_001",
    "allow_recording": false
  }
}
Server Confirmation
The server responds with a session.updated event:
{
  "type": "session.updated",
  "event_id": "event_ABC123xyz",
  "session": {
    "instructions": {
      "character": "You are a friendly and helpful AI assistant with a warm personality.",
      "scenario": "You are helping a user learn about voice AI technology."
    },
    "voice": "voice_001",
    "allow_recording": false
  }
}
Instructions Object
The instructions field is an Unmute extension to the OpenAI Realtime API. It provides structured guidance to the language model.
Character
Defines the AI assistant’s personality, tone, and behavioral characteristics.
Examples:
"You are a professional medical assistant."
"You are an enthusiastic teacher who loves explaining complex topics."
"You are a concise technical support agent."
Scenario
Provides context about the conversation setting or purpose.
Examples:
"You are helping a user troubleshoot their software issue."
"The user is practicing English conversation."
"You are conducting a job interview simulation."
Updating Session Mid-Conversation
You can send additional session.update events at any time to change the configuration:
// Change voice mid-conversation
// Change voice mid-conversation
ws.send(JSON.stringify({
  type: 'session.update',
  session: {
    voice: 'voice_002',
    allow_recording: false
  }
}));
The server will apply the new configuration and respond with session.updated.
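Because the confirmation arrives asynchronously on the same connection, it can help to dispatch it explicitly. A sketch assuming confirmations arrive as shown above; onSessionResult is a hypothetical helper:

```javascript
// Hypothetical helper: returns a message handler that invokes callbacks
// when the server confirms (session.updated) or rejects (error) a
// session.update sent earlier on the same connection.
function onSessionResult(onUpdated, onError) {
  return (event) => {
    const message = JSON.parse(event.data);
    if (message.type === 'session.updated') {
      onUpdated(message.session);
    } else if (message.type === 'error') {
      onError(message.error);
    }
  };
}
```

For example: `ws.onmessage = onSessionResult(s => console.log('applied', s.voice), e => console.error(e.message));`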
Voice Cloning
Unmute supports custom voice cloning. Upload audio samples to create new voices.
Upload Voice Sample
POST /v1/voices
Content-Type: multipart/form-data
Parameters:
file: Audio file (maximum size: configurable via MAX_VOICE_FILE_SIZE_MB)
Response:
{
  "name": "cloned_voice_ABC123"
}
Using Cloned Voice
Once uploaded, use the returned voice name in your session configuration:
{
  "type": "session.update",
  "session": {
    "voice": "cloned_voice_ABC123",
    "allow_recording": false
  }
}
JavaScript Example
// Upload audio file for voice cloning
const formData = new FormData();
formData.append('file', audioFile);

const response = await fetch('http://localhost:8000/v1/voices', {
  method: 'POST',
  body: formData
});
const { name } = await response.json();

// Use the cloned voice
ws.send(JSON.stringify({
  type: 'session.update',
  session: {
    voice: name,
    allow_recording: false
  }
}));
Recording Management
The allow_recording parameter controls whether conversations are recorded for later analysis.
Disabling Recording
{
  "type": "session.update",
  "session": {
    "voice": "default",
    "allow_recording": false
  }
}
Enabling Recording
{
  "type": "session.update",
  "session": {
    "voice": "default",
    "allow_recording": true
  }
}
Note: When recording is enabled, the server stores:
Client events (with audio anonymized as sample counts)
Server events (responses, transcriptions, etc.)
Conversation metadata
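To illustrate what "anonymized as sample counts" could mean, the sketch below replaces a base64 audio payload with a count of samples. The event shape and the 16-bit PCM assumption are ours, not taken from the Unmute source:

```javascript
// Illustrative sketch (not the server's actual code): strip the audio
// payload from a client event and record only how many samples it held.
// Assumes the payload is base64-encoded 16-bit PCM (2 bytes per sample).
function anonymizeAudioEvent(event) {
  if (!event.audio) return event;
  const byteLength = Buffer.from(event.audio, 'base64').length;
  const { audio, ...rest } = event;
  return { ...rest, sample_count: Math.floor(byteLength / 2) };
}
```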
Complete Session Setup Example
// 1. Check server health
const health = await fetch('http://localhost:8000/v1/health').then(r => r.json());
if (!health.ok) {
  throw new Error('Server is not healthy');
}

// 2. Get available voices
const voices = await fetch('http://localhost:8000/v1/voices').then(r => r.json());

// 3. Connect to WebSocket
const ws = new WebSocket('ws://localhost:8000/v1/realtime', 'realtime');

ws.onopen = () => {
  // 4. Configure session
  ws.send(JSON.stringify({
    type: 'session.update',
    session: {
      instructions: {
        character: 'You are a helpful AI assistant.',
        scenario: 'General conversation'
      },
      voice: voices[0].name,
      allow_recording: false
    }
  }));
};

ws.onmessage = (event) => {
  const message = JSON.parse(event.data);
  if (message.type === 'session.updated') {
    console.log('Session configured successfully');
    // 5. Start sending audio
  }
};
Error Handling
If session configuration fails, the server sends an error event:
{
  "type": "error",
  "error": {
    "type": "invalid_request_error",
    "message": "Invalid message",
    "details": [
      {
        "type": "missing",
        "loc": ["session"],
        "msg": "Field required"
      }
    ]
  }
}
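The details array follows a pydantic-style validation format (type, loc, msg). A small hypothetical helper for turning it into readable log lines:

```javascript
// Hypothetical helper: flattens the "details" array of an error event
// (pydantic-style validation errors) into human-readable strings.
function formatErrorDetails(error) {
  return (error.details || []).map(d => `${(d.loc || []).join('.')}: ${d.msg}`);
}
```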
Next Steps
Client Events: Send audio and commands to the server
Server Events: Receive responses and updates from the server
WebSocket Overview: Learn about the WebSocket protocol