The LiveKit Swift SDK includes a complete Agent API for building AI-powered conversational experiences. The Agent framework handles connection lifecycle, message routing, and state management for AI agents in your LiveKit rooms.

Overview

The Agent API consists of three main components:
  • Session: Main entry point for connecting to a LiveKit room with an agent
  • Agent: State container tracking agent connection and conversational state
  • Message System: Handles communication between user and agent
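The three components appear together in even the smallest integration. The sketch below combines APIs from the sections that follow; `"my-agent"` and `YourTokenSource` are placeholders for your own agent name and token source:

```swift
import LiveKit

// Minimal sketch tying the three components together.
let session = Session.withAgent("my-agent", tokenSource: YourTokenSource())

await session.start()                 // Session: connect to the room
let state = session.agent.agentState  // Agent: observe conversational state
await session.send(text: "Hi!")       // Message system: talk to the agent
await session.end()
```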

Creating a Session

A Session represents a connection to a LiveKit Room that contains an AI agent.
import LiveKit

// Create a token source for authentication
let tokenSource = YourTokenSource()

// Create session with agent name
let session = Session.withAgent(
    "my-agent",
    tokenSource: tokenSource,
    options: SessionOptions(
        preConnectAudio: true,
        agentConnectTimeout: 20
    )
)
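The SDK does not prescribe where tokens come from; a common pattern is fetching them from your own backend. The sketch below shows only the fetch side: the `/token` endpoint and `TokenResponse` shape are illustrative assumptions, and the exact protocol requirements of a token source depend on your SDK version.

```swift
import Foundation

// Illustrative sketch: fetch a LiveKit access token from your own server.
// The endpoint URL and response shape are assumptions, not part of the SDK.
struct TokenResponse: Decodable {
    let serverUrl: String
    let token: String
}

struct YourTokenSource {
    let endpoint = URL(string: "https://example.com/token")!

    func fetchToken() async throws -> TokenResponse {
        let (data, _) = try await URLSession.shared.data(from: endpoint)
        return try JSONDecoder().decode(TokenResponse.self, from: data)
    }
}
```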

Session Options

Configure session behavior with SessionOptions:
let options = SessionOptions(
    room: Room(),                    // Underlying Room object
    preConnectAudio: true,           // Enable audio before connection
    agentConnectTimeout: 20          // Timeout in seconds
)
When preConnectAudio is enabled, the microphone will be activated before connecting to the room. Use LocalMedia or AudioManager.setRecordingAlwaysPreparedMode() to request microphone permissions early in your app lifecycle.
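On iOS, one way to request the microphone permission ahead of time is through `AVAudioSession` (on iOS 17+, `AVAudioApplication.requestRecordPermission` is the preferred replacement). This is a general platform sketch, separate from the SDK helpers mentioned above:

```swift
import AVFoundation

// Request microphone permission at app launch so that enabling
// preConnectAudio later does not stall on the permission prompt.
AVAudioSession.sharedInstance().requestRecordPermission { granted in
    if !granted {
        print("Microphone permission denied; pre-connect audio cannot capture input")
    }
}
```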

Starting and Ending Sessions

// Start the session
await session.start()

// Check connection state
if session.isConnected {
    print("Connected to agent")
}

// End the session when done
await session.end()
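Because Session conforms to ObservableObject (see the SwiftUI section below), connection changes can also be observed outside SwiftUI with Combine. ObservableObject only guarantees the `objectWillChange` publisher, so this sketch re-reads the properties it needs on each notification:

```swift
import Combine

// Observe session changes outside SwiftUI. objectWillChange fires just
// before a change; receiving on the main queue defers the read so the
// updated values are visible.
var cancellables = Set<AnyCancellable>()

session.objectWillChange
    .receive(on: DispatchQueue.main)
    .sink { [weak session] _ in
        guard let session else { return }
        print("Connected: \(session.isConnected)")
    }
    .store(in: &cancellables)
```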

Agent State

The Agent struct provides real-time information about the agent’s state:
// Access agent state
let agent = session.agent

// Check connection status
if agent.isConnected {
    print("Agent is connected")
}

// Monitor conversational state
switch agent.agentState {
case .listening:
    print("Agent is listening")
case .thinking:
    print("Agent is processing")
case .speaking:
    print("Agent is speaking")
case .initializing:
    print("Agent is initializing")
case .idle:
    print("Agent is idle")
default:
    break
}

// Access media tracks
if let audioTrack = agent.audioTrack {
    // Play agent audio
}

if let videoTrack = agent.avatarVideoTrack {
    // Display agent avatar
}

Agent State Properties

  • isConnected: Agent is actively connected and conversing
  • canListen: Agent can receive user input (includes pre-connect buffering)
  • isPending: Agent is connecting or initializing
  • isFinished: Session has ended
  • agentState: Current conversational state (listening, thinking, speaking, etc.)
  • audioTrack: Agent’s audio track
  • avatarVideoTrack: Agent’s video track (if available)
  • error: Last error that occurred
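These properties can be folded into a single display string for a status label. The helper below is a sketch that uses only the names listed above; the ordering (finished, error, pending, then conversational state) is a suggested priority, not an SDK convention:

```swift
// Sketch: derive a one-line status label from the Agent properties above.
func statusText(for agent: Agent) -> String {
    if agent.isFinished { return "Session ended" }
    if let error = agent.error { return "Error: \(error)" }
    if agent.isPending { return "Connecting…" }
    switch agent.agentState {
    case .listening: return "Listening"
    case .thinking:  return "Thinking…"
    case .speaking:  return "Speaking"
    default:         return agent.isConnected ? "Connected" : "Idle"
    }
}
```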

Sending Messages

Send text messages to the agent:
// Send a message
if let sentMessage = await session.send(text: "Hello, agent!") {
    print("Message sent: \(sentMessage.id)")
}

Message History

Access and manage conversation history:
// Get message history
let messages = session.getMessageHistory()

// Restore previous conversation
let savedMessages: [ReceivedMessage] = loadFromStorage()
session.restoreMessageHistory(savedMessages)
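One way to implement the load/save side is to persist the history as JSON on disk. This sketch assumes ReceivedMessage conforms to Codable, which the SDK does not guarantee; if it does not, map messages to your own Codable type first.

```swift
import Foundation

// Sketch: persist and restore message history as a JSON file.
// Assumes ReceivedMessage: Codable (an assumption, not an SDK guarantee).
let historyURL = FileManager.default
    .urls(for: .documentDirectory, in: .userDomainMask)[0]
    .appendingPathComponent("history.json")

func saveHistory(_ messages: [ReceivedMessage]) throws {
    try JSONEncoder().encode(messages).write(to: historyURL)
}

func loadFromStorage() -> [ReceivedMessage] {
    guard let data = try? Data(contentsOf: historyURL) else { return [] }
    return (try? JSONDecoder().decode([ReceivedMessage].self, from: data)) ?? []
}
```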

SwiftUI Integration

The Session class conforms to ObservableObject and is marked @MainActor, making it ideal for SwiftUI:
import SwiftUI
import LiveKit

struct AgentView: View {
    @StateObject private var session: Session
    
    init(tokenSource: any TokenSourceConfigurable) {
        _session = StateObject(wrappedValue: Session.withAgent(
            "my-agent",
            tokenSource: tokenSource
        ))
    }
    
    var body: some View {
        VStack {
            // Display agent state
            Text("State: \(session.agent.agentState?.rawValue ?? "Unknown")")
            
            // Show messages
            ForEach(session.messages) { message in
                MessageRow(message: message)
            }
            
            // Input field
            Button("Send Message") {
                Task {
                    await session.send(text: "Hello")
                }
            }
        }
        .task {
            await session.start()
        }
    }
}

Custom Message Senders and Receivers

Extend the messaging system with custom implementations:
// Create custom message sender
class CustomMessageSender: MessageSender {
    func send(_ message: SentMessage) async throws {
        // Custom sending logic
    }
}

// Create custom message receiver
class CustomMessageReceiver: MessageReceiver {
    func messages() async throws -> AsyncStream<ReceivedMessage> {
        AsyncStream { continuation in
            // Custom receiving logic: yield each incoming message with
            // continuation.yield(_:), then call continuation.finish()
            // when the source ends.
        }
    }
}

// Use custom senders/receivers
let session = Session(
    tokenSource: tokenSource,
    senders: [CustomMessageSender()],
    receivers: [CustomMessageReceiver()]
)

Error Handling

// Monitor for errors
if let error = session.error {
    switch error {
    case .connection(let err):
        print("Connection failed: \(err)")
    case .sender(let err):
        print("Message sender failed: \(err)")
    case .receiver(let err):
        print("Message receiver failed: \(err)")
    }
    
    // Dismiss error after handling
    session.dismissError()
}

// Check agent errors
if let agentError = session.agent.error {
    switch agentError {
    case .timeout:
        print("Agent did not connect to the room")
    case .left:
        print("Agent left the room unexpectedly")
    }
}

Pre-Connect Audio Buffer

When preConnectAudio is enabled, the SDK buffers audio input before the agent connects, ensuring no user speech is lost:
let session = Session.withAgent(
    "my-agent",
    tokenSource: tokenSource,
    options: SessionOptions(preConnectAudio: true)
)

// Audio is being captured and buffered
await session.start()

// Buffered audio is sent when agent connects
if session.agent.canListen {
    print("Agent can hear buffered audio")
}
