WebSocket Connection Issues
Connection refused or fails to establish
Symptoms:
- `Error: connect ECONNREFUSED`
- WebSocket never emits the `connected` event
Solutions:
- Verify that your WebSocket server is running.
- Check that the WebSocket URL is correct.
- Ensure no firewall is blocking the port.
- Check the server logs for startup errors.
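A malformed URL is one of the easier causes to rule out before looking at the server. The sketch below is one possible pre-flight check; `checkWebSocketUrl` is a hypothetical helper written for this guide, not part of the SDK, and it only relies on the standard `URL` class.

```typescript
// Hypothetical helper (not an SDK function): checks that a string is a
// well-formed WebSocket URL before you try to connect to it.
function checkWebSocketUrl(raw: string): { ok: boolean; reason?: string } {
  let url: URL;
  try {
    url = new URL(raw);
  } catch {
    return { ok: false, reason: "not a valid URL" };
  }
  if (url.protocol !== "ws:" && url.protocol !== "wss:") {
    return { ok: false, reason: `unexpected protocol "${url.protocol}" (use ws: or wss:)` };
  }
  return { ok: true };
}

console.log(checkWebSocketUrl("ws://localhost:8080"));
console.log(checkWebSocketUrl("http://localhost:8080"));
```

A passing check only means the URL is syntactically valid. An `ECONNREFUSED` error with a valid URL usually points at the server process or the port.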
Socket closes immediately after connecting
Symptoms:
- Connection establishes but the `disconnected` event fires immediately
- `socket.readyState` shows a closed state
Solutions:
- Check server-side error handling.
- Verify that the WebSocket server accepts connections.
- Check for authentication/authorization issues, if implemented.
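When a socket closes right after opening, the close code on the close event often narrows down the cause. The lookup below is a small sketch based on the codes defined in RFC 6455; `describeCloseCode` is a hypothetical helper name, and whether your server uses 1008 for rejected authentication is an assumption to verify against its code.

```typescript
// Meanings of common WebSocket close codes, per RFC 6455 section 7.4.1.
const CLOSE_CODES: Record<number, string> = {
  1000: "normal closure",
  1001: "endpoint going away (server shutdown or page navigation)",
  1002: "protocol error",
  1003: "unsupported data type",
  1006: "abnormal closure (no close frame received)",
  1008: "policy violation (some servers use this for rejected auth)",
  1009: "message too big",
  1011: "internal server error",
};

// Hypothetical helper: turns a numeric close code into a readable hint.
function describeCloseCode(code: number): string {
  return CLOSE_CODES[code] ?? `unrecognized close code ${code}`;
}

console.log(describeCloseCode(1006));
```

Codes in the 4000 to 4999 range are reserved for applications, so an unrecognized code in that range is likely defined by your own server.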
Messages not being received
Symptoms:
- WebSocket connected but messages don’t trigger events
- Silent failures
Solutions:
- Verify that the message format is valid JSON.
- Check for parsing errors in the server logs.
- Ensure event listeners are attached before connecting.
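Silent failures often come from a frame that fails to parse and is then dropped. The sketch below shows one way to surface those failures explicitly. The `type` field is an assumed message shape used for illustration, and `parseMessage` is a hypothetical helper, so adjust both to your actual protocol.

```typescript
type ParsedMessage =
  | { ok: true; value: { type: string } & Record<string, unknown> }
  | { ok: false; error: string };

// Parses an incoming frame and checks for a string `type` field.
// The `type` field is an assumption about the message shape.
function parseMessage(raw: string): ParsedMessage {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch (err) {
    return { ok: false, error: `invalid JSON: ${(err as Error).message}` };
  }
  if (typeof data !== "object" || data === null || Array.isArray(data)) {
    return { ok: false, error: "message must be a JSON object" };
  }
  const obj = data as Record<string, unknown>;
  if (typeof obj.type !== "string") {
    return { ok: false, error: 'message is missing a string "type" field' };
  }
  return { ok: true, value: obj as { type: string } & Record<string, unknown> };
}

console.log(parseMessage('{"type":"transcript","text":"hi"}'));
console.log(parseMessage("{oops"));
```

Logging the `error` string on the failure branch turns a silent drop into something visible in the logs.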
Socket state errors when sending
Symptoms:
- `Cannot send message, socket state: 0` (CONNECTING)
- `Cannot send message, socket state: 2` (CLOSING)
- `Cannot send message, socket state: 3` (CLOSED)
Solutions:
- Wait for the `connected` event before sending.
- Check `agent.connected` before operations.
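If you send through a raw socket yourself, a small guard can hold messages until the socket is open. This is a minimal sketch: `SocketLike` and `GuardedSender` are hypothetical names, and the only fact borrowed from the WebSocket API is that `readyState === 1` means OPEN.

```typescript
// Minimal structural type so the sketch works with any WebSocket-like object.
interface SocketLike {
  readyState: number;
  send(data: string): void;
}

const OPEN = 1; // WebSocket.OPEN in the standard WebSocket API

// Buffers messages until the socket is open, then sends them in order.
class GuardedSender {
  private pending: string[] = [];
  private socket: SocketLike;

  constructor(socket: SocketLike) {
    this.socket = socket;
  }

  send(message: string): void {
    if (this.socket.readyState === OPEN) {
      this.socket.send(message);
    } else {
      this.pending.push(message);
    }
  }

  // Call this from your open/connected handler.
  flush(): void {
    while (this.pending.length > 0 && this.socket.readyState === OPEN) {
      this.socket.send(this.pending.shift() as string);
    }
  }

  get queued(): number {
    return this.pending.length;
  }
}
```

The sketch also buffers while the socket is CLOSING or CLOSED. In a real client you would probably drop or persist those messages and reconnect instead.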
Audio & Transcription Issues
Transcription returns empty text
Symptoms:
- `transcription_error: Whisper returned empty text`
- Warning: `Transcription returned empty text`
Solutions:
- Verify that the audio format is supported.
- Check audio quality:
  - Audio should contain clear speech
  - Minimum duration of about 0.5 seconds
  - Adequate volume level
- Verify that the base64 encoding is correct.
- Test with a known-good audio file.
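A quick structural check of the base64 payload can catch truncated strings or a data-URL prefix left in place. The helpers below are hypothetical and assume standard, padded base64; unpadded or URL-safe variants would be rejected by this sketch.

```typescript
// Strips an optional data-URL prefix such as "data:audio/wav;base64,".
function stripDataUrlPrefix(input: string): string {
  const comma = input.indexOf(",");
  return input.startsWith("data:") && comma !== -1 ? input.slice(comma + 1) : input;
}

// Returns true if the string is non-empty, standard-alphabet base64
// with correct padding. It does not check that the bytes are valid audio.
function isValidBase64(input: string): boolean {
  const b64 = stripDataUrlPrefix(input).replace(/\s+/g, "");
  if (b64.length === 0 || b64.length % 4 !== 0) return false;
  return /^[A-Za-z0-9+/]+={0,2}$/.test(b64);
}

console.log(isValidBase64("aGVsbG8="));
console.log(isValidBase64("aGVsbG8"));
```

A string can pass this check and still decode to silence or to an unsupported container, so it complements the audio-quality checks rather than replacing them.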
Audio input too large error
Symptoms:
- `Audio input too large (X MB). Maximum allowed: Y MB`
Solutions:
- Increase the limit if needed.
- Or compress the audio before sending:
  - Use lower-bitrate encoding
  - Reduce the sample rate (e.g., 16 kHz for speech)
  - Use a more efficient codec (e.g., Opus)
- Split long audio into chunks if possible.
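To check a payload against the limit before sending, you can compute its decoded size from the base64 length and, if needed, split the raw bytes. Both helpers below are hypothetical sketches. Splitting at arbitrary byte offsets only makes sense for raw PCM (ideally on a sample-frame boundary) or when the receiver reassembles the chunks; compressed containers generally cannot be cut this way.

```typescript
// Decoded byte length of a padded base64 string, computed without decoding it.
function base64DecodedBytes(b64: string): number {
  const padding = b64.endsWith("==") ? 2 : b64.endsWith("=") ? 1 : 0;
  return (b64.length / 4) * 3 - padding;
}

// Splits raw audio bytes into chunks of at most `maxBytes` each.
function splitIntoChunks(bytes: Uint8Array, maxBytes: number): Uint8Array[] {
  if (maxBytes <= 0) throw new RangeError("maxBytes must be positive");
  const chunks: Uint8Array[] = [];
  for (let offset = 0; offset < bytes.length; offset += maxBytes) {
    // subarray creates a view, so no audio data is copied here
    chunks.push(bytes.subarray(offset, offset + maxBytes));
  }
  return chunks;
}

console.log(base64DecodedBytes("aGVsbG8="));
console.log(splitIntoChunks(new Uint8Array(10), 4).map((c) => c.length));
```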
Audio playback is choppy or delayed
Symptoms:
- Audio chunks arrive out of order
- Gaps between chunks
- High latency
Solutions:
- Adjust the streaming speech configuration.
- Ensure the client plays chunks in order.
- Check network latency and bandwidth.
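On the client, in-order playback can be enforced with a small reorder buffer. The sketch below assumes each chunk carries a zero-based sequence number; that field and the `OrderedChunkBuffer` class are illustrative assumptions, so map them onto whatever ordering information your audio messages actually include.

```typescript
// Releases chunks strictly in sequence order, buffering any that arrive early.
// Assumes each chunk carries a zero-based sequence number (hypothetical field).
class OrderedChunkBuffer<T> {
  private next = 0;
  private waiting = new Map<number, T>();
  private onReady: (chunk: T) => void;

  constructor(onReady: (chunk: T) => void) {
    this.onReady = onReady;
  }

  push(seq: number, chunk: T): void {
    if (seq < this.next) return; // duplicate or stale chunk, already played
    this.waiting.set(seq, chunk);
    // Drain every chunk that is now contiguous with what has been played.
    while (this.waiting.has(this.next)) {
      this.onReady(this.waiting.get(this.next) as T);
      this.waiting.delete(this.next);
      this.next++;
    }
  }
}

const played: string[] = [];
const buffer = new OrderedChunkBuffer<string>((c) => played.push(c));
buffer.push(1, "b");
buffer.push(0, "a");
buffer.push(2, "c");
console.log(played);
```

A production player would also need a timeout for chunks that never arrive, otherwise one lost chunk stalls playback.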
Transcription model not configured
Symptoms:
- `Error: Transcription model not configured`
- Audio input fails silently
TTS Generation Issues
No speech output generated
Symptoms:
- Text responses work but no audio
- `speech_start` event never fires
Solutions:
- Verify that a speech model is configured.
- Check that you’re listening for the right events.
Speech generation is slow
Symptoms:
- Long delay before first audio chunk
- Slow overall response time
Solutions:
- Enable parallel generation.
- Reduce the chunk size for a faster time to first audio.
- Use a faster TTS model.
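To illustrate why smaller chunks reduce time to first audio, the sketch below splits text on sentence boundaries so the first chunk can be synthesized while later ones are still pending. `splitForSpeech` is a hypothetical helper and not the SDK's own chunking logic, which may work differently.

```typescript
// Splits text into sentence-based chunks of at most `maxChars` characters.
// A single sentence longer than `maxChars` is kept whole.
function splitForSpeech(text: string, maxChars: number): string[] {
  const sentences = text.match(/[^.!?]+[.!?]*\s*/g) ?? [];
  const chunks: string[] = [];
  let current = "";
  for (const sentence of sentences) {
    if (current.length > 0 && current.length + sentence.length > maxChars) {
      chunks.push(current.trim());
      current = "";
    }
    current += sentence;
  }
  if (current.trim().length > 0) chunks.push(current.trim());
  return chunks;
}

console.log(splitForSpeech("Hello there. How are you? Fine.", 15));
```

Smaller chunks start playing sooner but mean more TTS requests, so the right size depends on your provider's latency and rate limits.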
Speech interrupted unexpectedly
Symptoms:
- `speech_interrupted` event fires without user action
- Audio stops mid-sentence
Possible causes:
- Barge-in triggered by new input
- WebSocket disconnection:
  - Check for the `disconnected` event
  - Implement reconnection logic
- Error in speech generation:
  - Listen for the `error` event
  - Check API quota/rate limits
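Reconnection logic is commonly built on exponential backoff so that a struggling server is not hammered with retries. The sketch below is one generic way to do it; `backoffDelayMs` and `connectWithRetry` are hypothetical helpers, and `connect` stands in for whatever function opens your connection.

```typescript
// Delay before retry number `attempt` (0-based): base * 2^attempt, capped at maxMs.
function backoffDelayMs(attempt: number, baseMs = 500, maxMs = 30_000): number {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

// Retries `connect` until it resolves or `maxAttempts` is reached.
// Resolves with the number of attempts that were needed.
async function connectWithRetry(
  connect: () => Promise<void>,
  maxAttempts = 5,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<number> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      await connect();
      return attempt + 1;
    } catch {
      if (attempt === maxAttempts - 1) break; // no point sleeping after the last try
      await sleep(backoffDelayMs(attempt));
    }
  }
  throw new Error(`failed to connect after ${maxAttempts} attempts`);
}
```

Adding random jitter to each delay is a common refinement when many clients may reconnect at the same moment.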
Memory & Performance
Memory usage grows over time
Symptoms:
- Increasing memory footprint in long sessions
- Slow response times
Solutions:
- Configure conversation history limits.
- Monitor `history_trimmed` events.
- Clear history periodically if needed.
- Destroy agent instances when done.
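To show what a history limit does conceptually, the sketch below keeps system messages and the most recent N other messages. The `Message` shape and `trimHistory` are assumptions for illustration; the SDK's built-in trimming may use different rules.

```typescript
// Assumed message shape for illustration only.
interface Message {
  role: "system" | "user" | "assistant";
  content: string;
}

// Keeps any system messages plus the most recent `maxMessages` other messages.
function trimHistory(history: Message[], maxMessages: number): Message[] {
  const system = history.filter((m) => m.role === "system");
  const rest = history.filter((m) => m.role !== "system");
  return [...system, ...rest.slice(Math.max(0, rest.length - maxMessages))];
}

const history: Message[] = [
  { role: "system", content: "You are helpful." },
  { role: "user", content: "one" },
  { role: "assistant", content: "two" },
  { role: "user", content: "three" },
];
console.log(trimHistory(history, 2).map((m) => m.content));
```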
High CPU usage
Symptoms:
- CPU spikes during operation
- Server becomes unresponsive
Solutions:
- v0.1.0+: the speech queue uses promises instead of polling (fixed).
- Limit concurrent parallel TTS requests.
- Monitor active agent instances (one per user).
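One way to keep track of active instances is a small registry keyed by user ID. The sketch below is hypothetical: `AgentRegistry` is not part of the SDK, and the only thing it assumes about an agent is that it has a `destroy()` method.

```typescript
// Anything with a destroy() method; stands in for an agent instance.
interface Destroyable {
  destroy(): void;
}

// Tracks one instance per user so leaked instances are easy to spot.
class AgentRegistry<T extends Destroyable> {
  private agents = new Map<string, T>();

  // Returns the existing instance for the user, or creates one.
  getOrCreate(userId: string, create: () => T): T {
    let agent = this.agents.get(userId);
    if (!agent) {
      agent = create();
      this.agents.set(userId, agent);
    }
    return agent;
  }

  // Destroys and forgets the user's instance, e.g. when their socket closes.
  release(userId: string): boolean {
    const agent = this.agents.get(userId);
    if (!agent) return false;
    agent.destroy();
    return this.agents.delete(userId);
  }

  get activeCount(): number {
    return this.agents.size;
  }
}
```

Logging `activeCount` periodically makes it obvious when instances are created faster than they are released.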
Race conditions or corrupted history
Symptoms:
- Interleaved messages
- Duplicate responses
- History contains unexpected messages
The queue ensures:
- `sendText()` calls are processed one at a time
- WebSocket `transcript` messages are serialized
- No concurrent modifications to `conversationHistory`
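The guarantees above are what a serial promise queue provides. The sketch below shows the general technique in isolation; `SerialQueue` is an illustrative class and not the SDK's internal implementation.

```typescript
// Runs async tasks strictly one at a time, in the order they were enqueued.
class SerialQueue {
  private tail: Promise<void> = Promise.resolve();

  enqueue<T>(task: () => Promise<T>): Promise<T> {
    // Start the task only after everything queued before it has settled.
    const result = this.tail.then(task);
    // Keep the chain alive even if a task rejects.
    this.tail = result.then(() => undefined, () => undefined);
    return result;
  }
}

const order: string[] = [];
const queue = new SerialQueue();
const slow = queue.enqueue(async () => {
  await new Promise<void>((resolve) => setTimeout(resolve, 20));
  order.push("slow");
});
const fast = queue.enqueue(async () => {
  order.push("fast");
});
```

Even though the second task is faster, it does not start until the first has finished, so shared state such as a history array is never modified by two tasks at once.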
Error Handling Patterns
Handling errors gracefully
Best practices:
Recovering from API failures
OpenAI API errors:
Network errors:
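For both kinds of failure, a useful first step is deciding whether a retry can help at all. The sketch below uses general HTTP and Node.js conventions; the helper names are hypothetical, and the exact shape of the error object thrown by your API client is an assumption to check.

```typescript
// HTTP statuses that are usually transient: request timeout, rate limit, server errors.
function isRetryableStatus(status: number): boolean {
  return status === 408 || status === 429 || (status >= 500 && status <= 599);
}

// Node.js error codes that typically indicate a temporary network problem.
const TRANSIENT_NETWORK_CODES = new Set(["ECONNRESET", "ETIMEDOUT", "ECONNREFUSED", "EAI_AGAIN"]);

// Checks for a string `code` property, as set on Node.js system errors.
function isTransientNetworkError(err: unknown): boolean {
  if (typeof err !== "object" || err === null) return false;
  const code = (err as { code?: unknown }).code;
  return typeof code === "string" && TRANSIENT_NETWORK_CODES.has(code);
}

console.log(isRetryableStatus(429), isRetryableStatus(401));
```

Errors such as 401 (invalid API key) or 400 (bad request) will fail the same way on every attempt, so they are better surfaced to the user than retried.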
Preventing destroyed agent usage
Symptoms:
- `Error: VoiceAgent has been destroyed and cannot be used`
Solutions:
- Check the destroyed state before operations.
Environment & Configuration
Environment variables not loading
Symptoms:
- `OPENAI_API_KEY` undefined
- Connection to wrong endpoint
Solutions:
- Ensure a `.env` file exists in the project root.
- Load dotenv at the top of your entry file.
- Verify that `.env` is not gitignored when needed.
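Failing fast at startup makes a missing variable much easier to diagnose than an authentication error later on. The helper below is a hypothetical sketch that takes the environment as a plain object so it can be tested without touching the real process environment.

```typescript
// Returns the names of required variables that are missing or blank.
function missingEnvVars(
  env: Record<string, string | undefined>,
  required: string[],
): string[] {
  return required.filter((name) => (env[name] ?? "").trim() === "");
}

console.log(missingEnvVars({ OPENAI_API_KEY: "sk-test" }, ["OPENAI_API_KEY"]));
console.log(missingEnvVars({ OPENAI_API_KEY: "  " }, ["OPENAI_API_KEY"]));
```

In a Node.js entry file you could call it as `missingEnvVars(process.env, ["OPENAI_API_KEY"])` right after dotenv has loaded, and exit with a clear message if the result is not empty.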
TypeScript errors
Common issues:
- Missing types
- AI SDK version mismatch (install a matching version)
- Module resolution
Getting Help
If you’re still experiencing issues:
- Check the changelog for recent fixes and breaking changes
- Review example code in the repository:
  - `example/demo.ts`: text-only usage
  - `example/ws-server.ts`: WebSocket server
  - `example/voice-client.html`: browser client
- Enable debug logging to see what’s happening
- Report issues on GitHub with:
  - Voice Agent AI SDK version
  - Node.js version
  - Minimal reproduction code
  - Error messages and logs