VoiceAgent and VideoAgent extend Node.js EventEmitter and emit events throughout the lifecycle of conversation processing. This page documents all available events, their payloads, and when they're triggered.
Event Categories
Text Events
User input and LLM text streaming
Speech Events
TTS generation and audio output
Tool Events
Tool invocations and results
Connection Events
WebSocket lifecycle
Video Events
Frame capture and processing (VideoAgent only)
Error Events
Errors and warnings
Text Events
Events related to text input and LLM streaming output.
text
Emitted when user input is received or when the full assistant response is ready.
Payload:
- role: "user" - After the user sends text input or audio is transcribed
- role: "assistant" - After the LLM completes the full response
chunk:text_delta
Emitted for each streaming text token from the LLM.
Payload:
chunk:reasoning_delta
Emitted for each reasoning token (for models that support reasoning).
Payload:
Speech Events
Events related to text-to-speech generation and audio output.
speech_start
Emitted when TTS generation begins.
Payload:
speech_complete
Emitted when all TTS chunks have been sent.
Payload:
speech_interrupted
Emitted when speech generation is cancelled.
Payload:
- "interrupted" - Manual interruption via interruptSpeech()
- "user_speaking" - User started speaking (barge-in)
- "client_request" - Client sent interrupt message
- "disconnected" - WebSocket disconnected
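A minimal sketch of branching on the interruption reason. The four reason strings come from the list above; the `reason` field name is an assumption, and a plain EventEmitter stands in for a real agent instance.

```typescript
import { EventEmitter } from "node:events";

// Reason strings are taken from this page; the `reason` field name is hypothetical.
type InterruptReason = "interrupted" | "user_speaking" | "client_request" | "disconnected";

// Stand-in for a VoiceAgent instance (assumption; use a real agent in practice).
const agent = new EventEmitter();

const seen: string[] = [];

agent.on("speech_interrupted", (payload: { reason: InterruptReason }) => {
  switch (payload.reason) {
    case "user_speaking":
      seen.push("barge-in");
      break;
    case "disconnected":
      seen.push("socket closed");
      break;
    default:
      // "interrupted" and "client_request" are both explicit stop requests.
      seen.push("stopped on request");
  }
});

// Simulate two interruptions to exercise the handler.
agent.emit("speech_interrupted", { reason: "user_speaking" });
agent.emit("speech_interrupted", { reason: "client_request" });
```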
speech_chunk_queued
Emitted when a text chunk enters the TTS queue.
Payload:
audio_chunk
Emitted when a single TTS chunk is ready and sent.
Payload:
audio
Emitted for full non-streaming TTS audio.
Payload:
When: generateAndSendSpeechFull() is used instead of streaming.
Example:
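A minimal sketch of tracking the speech lifecycle with speech_start, audio_chunk, and speech_complete. Payloads are ignored because their fields are not assumed here, and a plain EventEmitter stands in for a real agent.

```typescript
import { EventEmitter } from "node:events";

// Stand-in for a VoiceAgent instance (assumption; use a real agent in practice).
const agent = new EventEmitter();

// Local bookkeeping only; nothing here is read from the event payloads.
const speech = { active: false, chunks: 0 };

agent.on("speech_start", () => {
  speech.active = true;
  speech.chunks = 0;
});
agent.on("audio_chunk", () => {
  speech.chunks += 1;
});
agent.on("speech_complete", () => {
  speech.active = false;
});

// Simulated lifecycle: start, two chunks, complete.
agent.emit("speech_start");
agent.emit("audio_chunk");
agent.emit("audio_chunk");
agent.emit("speech_complete");
```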
Tool Events
Events related to AI SDK tool invocations.
chunk:tool_call
Emitted when a tool invocation is detected during streaming.
Payload:
tool_result
Emitted when a tool execution completes.
Payload:
When: After the tool's execute function finishes.
Example:
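A minimal sketch that pairs each chunk:tool_call with its tool_result. The payload fields `toolCallId`, `toolName`, and `result`, as well as the `getWeather` tool, are hypothetical; a plain EventEmitter stands in for a real agent.

```typescript
import { EventEmitter } from "node:events";

// Hypothetical payload shapes; the real fields are not confirmed by this page.
interface ToolCallPayload { toolCallId: string; toolName: string; }
interface ToolResultPayload { toolCallId: string; result: unknown; }

// Stand-in for a VoiceAgent instance (assumption).
const agent = new EventEmitter();

const pending = new Map<string, string>(); // toolCallId -> toolName
const finished: string[] = [];

agent.on("chunk:tool_call", (p: ToolCallPayload) => {
  pending.set(p.toolCallId, p.toolName);
});

agent.on("tool_result", (p: ToolResultPayload) => {
  const name = pending.get(p.toolCallId);
  if (name !== undefined) {
    finished.push(name);
    pending.delete(p.toolCallId);
  }
});

// Simulate one tool round trip.
agent.emit("chunk:tool_call", { toolCallId: "call-1", toolName: "getWeather" });
agent.emit("tool_result", { toolCallId: "call-1", result: { tempC: 21 } });
```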
Transcription Events
Events related to audio transcription.
transcription
Emitted when audio is successfully transcribed to text.
Payload:
When: After transcribeAudio() or an audio WebSocket message is processed.
Example:
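A minimal sketch of collecting transcripts for display as captions. The `text` field name is an assumption, and a plain EventEmitter stands in for a real agent.

```typescript
import { EventEmitter } from "node:events";

// Stand-in for a VoiceAgent instance (assumption).
const agent = new EventEmitter();

const captions: string[] = [];

// `text` is a hypothetical payload field.
agent.on("transcription", (p: { text: string }) => {
  const trimmed = p.text.trim();
  // Skip transcripts that contain only whitespace.
  if (trimmed.length > 0) captions.push(trimmed);
});

// Simulated events.
agent.emit("transcription", { text: "  what's the weather?  " });
agent.emit("transcription", { text: "   " });
```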
audio_received
Emitted when raw audio input is received before transcription.
Payload:
History Events
Events related to conversation memory management.
history_cleared
Emitted when conversation history is manually cleared.
Payload: None
When: After clearHistory() is called.
Example:
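A minimal sketch of keeping a local message view in sync with the agent's history. The `uiMessages` array is a hypothetical client-side cache, and a plain EventEmitter stands in for a real agent.

```typescript
import { EventEmitter } from "node:events";

// Stand-in for a VoiceAgent instance (assumption).
const agent = new EventEmitter();

// Hypothetical local cache mirroring the conversation.
const uiMessages: string[] = ["hello", "hi there"];

// The event carries no payload, so the handler takes no arguments.
agent.on("history_cleared", () => {
  uiMessages.length = 0;
});

// Simulates what the agent would emit after clearHistory().
agent.emit("history_cleared");
```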
history_trimmed
Emitted when old messages are automatically removed from history.
Payload:
When: History exceeds the maxMessages or maxTotalChars limits.
Example:
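A minimal sketch of monitoring how often trimming happens, which can hint that maxMessages or maxTotalChars is set too low. The `removedCount` field is hypothetical, and a plain EventEmitter stands in for a real agent.

```typescript
import { EventEmitter } from "node:events";

// Stand-in for a VoiceAgent instance (assumption).
const agent = new EventEmitter();

const trimStats = { events: 0, removed: 0 };

// `removedCount` is a hypothetical payload field.
agent.on("history_trimmed", (p: { removedCount: number }) => {
  trimStats.events += 1;
  trimStats.removed += p.removedCount;
});

// Simulated trims.
agent.emit("history_trimmed", { removedCount: 2 });
agent.emit("history_trimmed", { removedCount: 3 });
```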
Connection Events
WebSocket lifecycle events.
connected
Emitted when the WebSocket connection is established.
Payload: None
When: After connect() succeeds or handleSocket() is called.
Example:
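A minimal sketch that runs setup once on the first connection, using EventEmitter's `once`. A plain EventEmitter stands in for a real agent; whether an agent can emit connected more than once is not stated on this page.

```typescript
import { EventEmitter } from "node:events";

// Stand-in for a VoiceAgent instance (assumption).
const agent = new EventEmitter();

const log: string[] = [];

// `once` removes the listener after its first invocation.
agent.once("connected", () => {
  log.push("ready");
});

// Emitting twice shows that the handler only runs the first time.
agent.emit("connected");
agent.emit("connected");
```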
disconnected
Emitted when the WebSocket connection closes.
Payload: None
When: The socket closes (client disconnect, network error, or disconnect() called).
Example:
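A minimal sketch of releasing per-session resources on disconnect. The `session` object and the heartbeat timer are hypothetical application state, and a plain EventEmitter stands in for a real agent.

```typescript
import { EventEmitter } from "node:events";

// Stand-in for a VoiceAgent instance (assumption).
const agent = new EventEmitter();

// Hypothetical application state tied to the connection.
const session = { active: true };
const heartbeat = setInterval(() => {
  // e.g. refresh a "still connected" indicator
}, 1000);

agent.on("disconnected", () => {
  session.active = false;
  clearInterval(heartbeat); // stop the timer so the process can exit
});

// Simulated disconnect.
agent.emit("disconnected");
```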
Video Events (VideoAgent Only)
Events specific to VideoAgent for video frame processing.
frame_received
Emitted when a video frame is received and processed.
Payload:
frame_requested
Emitted when the agent requests the client to capture a frame.
Payload:
When: requestFrameCapture() is called.
Example:
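A minimal sketch of tracking outstanding frame requests. It assumes each requested frame eventually arrives as a frame_received event, which this page does not state; a plain EventEmitter stands in for a real VideoAgent.

```typescript
import { EventEmitter } from "node:events";

// Stand-in for a VideoAgent instance (assumption).
const videoAgent = new EventEmitter();

const frames = { requested: 0, received: 0 };

videoAgent.on("frame_requested", () => {
  frames.requested += 1;
});
videoAgent.on("frame_received", () => {
  frames.received += 1;
});

// Simulated traffic: two requests, one frame delivered so far.
videoAgent.emit("frame_requested");
videoAgent.emit("frame_requested");
videoAgent.emit("frame_received");

const outstanding = frames.requested - frames.received;
```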
client_ready
Emitted when the client connects and reports its capabilities.
Payload:
When: The client sends a client_ready WebSocket message.
Example:
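A minimal sketch of storing the reported capabilities for later decisions. The `capabilities` field and its `video` and `audio` flags are hypothetical, and a plain EventEmitter stands in for a real VideoAgent.

```typescript
import { EventEmitter } from "node:events";

// Hypothetical capability shape; the real payload is not confirmed by this page.
interface ClientCapabilities { video: boolean; audio: boolean; }

// Stand-in for a VideoAgent instance (assumption).
const videoAgent = new EventEmitter();

const client: { capabilities: ClientCapabilities | null } = { capabilities: null };

videoAgent.on("client_ready", (p: { capabilities: ClientCapabilities }) => {
  client.capabilities = p.capabilities;
});

// Simulated handshake.
videoAgent.emit("client_ready", { capabilities: { video: true, audio: false } });
```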
config_changed
Emitted when the video agent configuration is updated.
Payload:
When: updateConfig() is called.
Example:
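A minimal sketch of mirroring configuration changes into a local copy. It assumes the payload carries the updated fields; the `frameRate` and `maxFrames` keys are hypothetical, and a plain EventEmitter stands in for a real VideoAgent.

```typescript
import { EventEmitter } from "node:events";

// Stand-in for a VideoAgent instance (assumption).
const videoAgent = new EventEmitter();

// Hypothetical local mirror of the agent configuration.
const localConfig: Record<string, unknown> = { frameRate: 1 };

videoAgent.on("config_changed", (changes: Record<string, unknown>) => {
  // Merge the reported changes over the local copy.
  Object.assign(localConfig, changes);
});

// Simulates what the agent would emit after updateConfig().
videoAgent.emit("config_changed", { frameRate: 2, maxFrames: 10 });
```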
Error Events
Error and warning events.
error
Emitted when an error occurs in any subsystem.
Payload:
Common sources:
- LLM stream errors
- TTS generation failures
- Transcription errors
- WebSocket errors
- Invalid input (oversized audio/frames)
warning
Emitted for non-fatal issues that don't stop processing.
Payload:
Common sources:
- Empty transcript message
- Invalid audio message
- Empty video frame
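Because the agents extend Node.js EventEmitter, an error event with no registered listener is thrown as an exception, so an error handler is worth attaching early. The sketch below assumes the error payload is an `Error` and the warning payload can be converted to a string; both are assumptions, and a plain EventEmitter stands in for a real agent.

```typescript
import { EventEmitter } from "node:events";

// Stand-in for a VoiceAgent instance (assumption).
const agent = new EventEmitter();

const problems: string[] = [];

// Payload types are assumptions: an Error for "error", any value for "warning".
agent.on("error", (err: Error) => {
  problems.push(`error: ${err.message}`);
});
agent.on("warning", (msg: unknown) => {
  problems.push(`warning: ${String(msg)}`);
});

// Simulated issues; without the "error" listener above, the second emit would throw.
agent.emit("warning", "Empty transcript message");
agent.emit("error", new Error("TTS generation failed"));
```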
Listening to Events
Basic Event Handling
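A minimal sketch of subscribing with `on` and unsubscribing with `off`, both standard EventEmitter methods. The `role` values come from this page; the `text` field on both payloads is an assumption, and a plain EventEmitter stands in for a real agent.

```typescript
import { EventEmitter } from "node:events";

// Stand-in for a VoiceAgent instance (assumption).
const agent = new EventEmitter();

const transcript: string[] = [];
const streamed: string[] = [];

// Keep a reference to the handler so it can be removed later.
const onDelta = (p: { text: string }) => {
  streamed.push(p.text);
};
agent.on("chunk:text_delta", onDelta);

agent.on("text", (p: { role: "user" | "assistant"; text: string }) => {
  transcript.push(`${p.role}: ${p.text}`);
});

// Simulated turn: user input, two streamed tokens, full assistant response.
agent.emit("text", { role: "user", text: "Hi" });
agent.emit("chunk:text_delta", { text: "Hel" });
agent.emit("chunk:text_delta", { text: "lo" });
agent.emit("text", { role: "assistant", text: "Hello" });

// Stop listening to deltas; later tokens are ignored.
agent.off("chunk:text_delta", onDelta);
agent.emit("chunk:text_delta", { text: "ignored" });
```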
WebSocket Integration
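A minimal sketch of forwarding selected agent events to a socket as JSON. The `forwardEvents` helper, the `SocketLike` interface, and the `{ type, payload }` envelope are all hypothetical and are not the library's WebSocket protocol; a plain EventEmitter and an in-memory socket stand in for real objects.

```typescript
import { EventEmitter } from "node:events";

// Anything with a send(string) method, such as a WebSocket (hypothetical interface).
interface SocketLike {
  send(data: string): void;
}

// Hypothetical helper: relays the named events and returns an unsubscribe function.
function forwardEvents(agent: EventEmitter, socket: SocketLike, names: string[]): () => void {
  const handlers = names.map((name) => {
    const handler = (payload: unknown) => {
      socket.send(JSON.stringify({ type: name, payload }));
    };
    agent.on(name, handler);
    return { name, handler };
  });
  return () => {
    for (const { name, handler } of handlers) agent.off(name, handler);
  };
}

// Stand-ins for a real agent and socket (assumptions).
const agent = new EventEmitter();
const sent: string[] = [];
const fakeSocket: SocketLike = { send: (data) => { sent.push(data); } };

const stop = forwardEvents(agent, fakeSocket, ["speech_start", "speech_complete"]);
agent.emit("speech_start", { id: 1 });
stop(); // e.g. call this when the socket closes
agent.emit("speech_complete", { id: 1 });
```

Returning an unsubscribe function keeps listeners from accumulating when many sockets attach to the same long-lived agent.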
VideoAgent Events
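A minimal sketch of attaching one logger to all four video events documented on this page. Payloads are ignored, and a plain EventEmitter stands in for a real VideoAgent.

```typescript
import { EventEmitter } from "node:events";

// Stand-in for a VideoAgent instance (assumption).
const videoAgent = new EventEmitter();

// Event names as documented on this page.
const videoEvents = ["frame_received", "frame_requested", "client_ready", "config_changed"] as const;

// Records event names in arrival order.
const timeline: string[] = [];

for (const name of videoEvents) {
  videoAgent.on(name, () => {
    timeline.push(name);
  });
}

// Simulated activity.
videoAgent.emit("frame_requested");
videoAgent.emit("frame_received");
```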
Event Timing Diagram
Typical event flow for a user query:
Related
Types & Interfaces
Type definitions for event payloads
VoiceAgent
Voice agent class reference
VideoAgent
Video agent class reference
WebSocket Protocol
Complete WebSocket message protocol