The LiveKit Swift SDK includes a complete Agent API for building AI-powered conversational experiences. The Agent framework handles connection lifecycle, message routing, and state management for AI agents in your LiveKit rooms.
## Overview
The Agent API consists of three main components:
- `Session`: Main entry point for connecting to a LiveKit room with an agent
- `Agent`: State container tracking agent connection and conversational state
- Message System: Handles communication between user and agent
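Putting the components together, a minimal end-to-end flow (using only the APIs shown later in this guide, with `YourTokenSource` standing in for your own token provider) looks roughly like this sketch:

```swift
import LiveKit

// Minimal conversation flow: connect, exchange one message, disconnect.
func runAgentConversation(tokenSource: YourTokenSource) async {
    let session = Session.withAgent("my-agent", tokenSource: tokenSource)

    // Connect the session (and the agent) to the room
    await session.start()

    // Send a message once the agent is connected
    if session.agent.isConnected {
        _ = await session.send(text: "Hello!")
    }

    // Tear down the session when the conversation is over
    await session.end()
}
```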
## Creating a Session

A `Session` represents a connection to a LiveKit `Room` that contains an AI agent.

```swift
import LiveKit

// Create a token source for authentication
let tokenSource = YourTokenSource()

// Create session with agent name
let session = Session.withAgent(
    "my-agent",
    tokenSource: tokenSource,
    options: SessionOptions(
        preConnectAudio: true,
        agentConnectTimeout: 20
    )
)
```
### Session Options

Configure session behavior with `SessionOptions`:

```swift
let options = SessionOptions(
    room: Room(),            // Underlying Room object
    preConnectAudio: true,   // Enable audio before connection
    agentConnectTimeout: 20  // Timeout in seconds
)
```
When `preConnectAudio` is enabled, the microphone is activated before connecting to the room. Use `LocalMedia` or `AudioManager.setRecordingAlwaysPreparedMode()` to request microphone permissions early in your app lifecycle.
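One way to trigger the permission prompt early, using AVFoundation directly rather than the LiveKit helpers mentioned above (a sketch; `requestMicrophonePermissionEarly` is our own name, and newer OS versions offer `AVAudioApplication` as a replacement for this API):

```swift
import AVFoundation

// Request microphone permission at app launch so capture can begin
// immediately when a session with preConnectAudio starts.
func requestMicrophonePermissionEarly() {
    AVAudioSession.sharedInstance().requestRecordPermission { granted in
        if !granted {
            print("Microphone permission denied; pre-connect audio will not capture")
        }
    }
}
```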
## Starting and Ending Sessions

```swift
// Start the session
await session.start()

// Check connection state
if session.isConnected {
    print("Connected to agent")
}

// End the session when done
await session.end()
```
## Agent State

The `Agent` struct provides real-time information about the agent's state:

```swift
// Access agent state
let agent = session.agent

// Check connection status
if agent.isConnected {
    print("Agent is connected")
}

// Monitor conversational state
switch agent.agentState {
case .listening:
    print("Agent is listening")
case .thinking:
    print("Agent is processing")
case .speaking:
    print("Agent is speaking")
case .initializing:
    print("Agent is initializing")
case .idle:
    print("Agent is idle")
default:
    break
}

// Access media tracks
if let audioTrack = agent.audioTrack {
    // Play agent audio
}

if let videoTrack = agent.avatarVideoTrack {
    // Display agent avatar
}
```
### Agent State Properties

- `isConnected`: Agent is actively connected and conversing
- `canListen`: Agent can receive user input (includes pre-connect buffering)
- `isPending`: Agent is connecting or initializing
- `isFinished`: Session has ended
- `agentState`: Current conversational state (listening, thinking, speaking, etc.)
- `audioTrack`: Agent's audio track
- `avatarVideoTrack`: Agent's video track (if available)
- `error`: Last error that occurred
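These properties combine naturally into a single display string for a status bar. A sketch (the `statusText` helper is our own, not part of the SDK; comparisons with `==` are used so the code works whether or not `agentState` is optional):

```swift
import LiveKit

extension Agent {
    // Map agent flags and conversational state to a short UI label.
    var statusText: String {
        if isFinished { return "Session ended" }
        if isPending { return "Connecting" }
        if agentState == .listening { return "Listening" }
        if agentState == .thinking { return "Thinking" }
        if agentState == .speaking { return "Speaking" }
        return isConnected ? "Connected" : "Idle"
    }
}
```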
## Sending Messages

Send text messages to the agent:

```swift
// Send a message
if let sentMessage = await session.send(text: "Hello, agent!") {
    print("Message sent: \(sentMessage.id)")
}
```
## Message History

Access and manage conversation history:

```swift
// Get message history
let messages = session.getMessageHistory()

// Restore a previous conversation
let savedMessages: [ReceivedMessage] = loadFromStorage()
session.restoreMessageHistory(savedMessages)
```
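A `loadFromStorage()` implementation could persist history as JSON between launches. A sketch, assuming `ReceivedMessage` conforms to `Codable` in your SDK version (if it does not, map it to your own `Codable` type first):

```swift
import Foundation
import LiveKit

// Persist conversation history to disk between app launches.
func saveHistory(_ messages: [ReceivedMessage], to url: URL) throws {
    let data = try JSONEncoder().encode(messages)
    try data.write(to: url, options: .atomic)
}

// Load previously saved history; return an empty conversation on failure.
func loadHistory(from url: URL) -> [ReceivedMessage] {
    guard let data = try? Data(contentsOf: url),
          let messages = try? JSONDecoder().decode([ReceivedMessage].self, from: data)
    else { return [] }
    return messages
}
```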
## SwiftUI Integration

The `Session` class conforms to `ObservableObject` and is marked `@MainActor`, making it ideal for SwiftUI:

```swift
import SwiftUI
import LiveKit

struct AgentView: View {
    @StateObject private var session: Session

    init(tokenSource: any TokenSourceConfigurable) {
        _session = StateObject(wrappedValue: Session.withAgent(
            "my-agent",
            tokenSource: tokenSource
        ))
    }

    var body: some View {
        VStack {
            // Display agent state
            Text("State: \(session.agent.agentState?.rawValue ?? "Unknown")")

            // Show messages
            ForEach(session.messages) { message in
                MessageRow(message: message)
            }

            // Input field
            Button("Send Message") {
                Task {
                    await session.send(text: "Hello")
                }
            }
        }
        .task {
            await session.start()
        }
    }
}
```
## Custom Message Senders and Receivers

Extend the messaging system with custom implementations:

```swift
// Create a custom message sender
class CustomMessageSender: MessageSender {
    func send(_ message: SentMessage) async throws {
        // Custom sending logic
    }
}

// Create a custom message receiver
class CustomMessageReceiver: MessageReceiver {
    func messages() async throws -> AsyncStream<ReceivedMessage> {
        AsyncStream { continuation in
            // Custom receiving logic: call continuation.yield(_:) for each
            // incoming message and continuation.finish() when done
        }
    }
}

// Use custom senders/receivers
let session = Session(
    tokenSource: tokenSource,
    senders: [CustomMessageSender()],
    receivers: [CustomMessageReceiver()]
)
```
## Error Handling

```swift
// Monitor for errors
if let error = session.error {
    switch error {
    case .connection(let err):
        print("Connection failed: \(err)")
    case .sender(let err):
        print("Message sender failed: \(err)")
    case .receiver(let err):
        print("Message receiver failed: \(err)")
    }

    // Dismiss the error after handling
    session.dismissError()
}

// Check agent errors
if let agentError = session.agent.error {
    switch agentError {
    case .timeout:
        print("Agent did not connect to the room")
    case .left:
        print("Agent left the room unexpectedly")
    }
}
```
## Pre-Connect Audio Buffer

When `preConnectAudio` is enabled, the SDK buffers audio input before the agent connects, ensuring no user speech is lost:

```swift
let session = Session.withAgent(
    "my-agent",
    tokenSource: tokenSource,
    options: SessionOptions(preConnectAudio: true)
)

// Audio is captured and buffered during connection
await session.start()

// Buffered audio is delivered once the agent connects
if session.agent.canListen {
    print("Agent can hear buffered audio")
}
```
## Resources