
Overview

LibreChat supports bidirectional voice interactions through Text-to-Speech (TTS) for reading AI responses aloud and Speech-to-Text (STT) for voice input. This enables hands-free conversations and accessibility features.

Text-to-Speech (TTS)

Have AI responses read aloud with natural-sounding voices.

Supported TTS Providers

The browser provider uses the browser’s built-in speech synthesis:
  • No configuration required
  • Works offline
  • Voice quality depends on browser/OS
  • No API costs
Automatically available in all conversations.
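
As a rough illustration of what browser-based TTS involves, the sketch below calls the standard Web Speech API (`speechSynthesis` and `SpeechSynthesisUtterance`). The helper name `speakWithBrowserTTS` and the `SpeechEnv` type are hypothetical and not taken from LibreChat; the environment is passed in as a parameter only so the sketch can be exercised outside a browser.

```typescript
// Minimal shapes of the Web Speech API pieces used here, declared locally so
// the sketch also type-checks outside a browser. (Hypothetical type names.)
interface UtteranceLike {
  rate: number;
}
interface SpeechEnv {
  speechSynthesis?: { speak(u: UtteranceLike): void; cancel(): void };
  SpeechSynthesisUtterance?: new (text: string) => UtteranceLike;
}

// Hypothetical helper: reads `text` aloud and returns true, or returns false
// when the environment has no speech synthesis support.
function speakWithBrowserTTS(
  text: string,
  rate = 1.0,
  env: SpeechEnv = globalThis as unknown as SpeechEnv,
): boolean {
  const synth = env.speechSynthesis;
  const Utterance = env.SpeechSynthesisUtterance;
  if (!synth || !Utterance) return false;

  synth.cancel(); // stop anything that is already being read
  const utterance = new Utterance(text);
  utterance.rate = rate;
  synth.speak(utterance);
  return true;
}
```

In a browser, `speakWithBrowserTTS("Hello")` would use the default voice supplied by the OS; a `false` return is a signal to hide or disable the speaker icon.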

Using TTS

1. Enable TTS: Look for the speaker icon in the message header or message footer.
2. Click Read Aloud: Click the speaker icon to have the message read aloud.
3. Control Playback:
  • Pause/Resume: Click the icon again
  • Stop: Click the mute icon
  • Adjust speed: Use the playback rate control in settings

Playback Speed

Control audio playback rate:
// Recoil state for playback speed
playbackRate: 1.0  // Range: 0.5 to 2.0
Adjust in user settings:
  • 0.5x: Slower (better comprehension)
  • 1.0x: Normal speed
  • 1.5x: Faster
  • 2.0x: Maximum speed
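
The 0.5 to 2.0 range above can be enforced before a value reaches the audio element. This is a minimal sketch, not LibreChat's own code: `clampPlaybackRate` and `applyPlaybackRate` are hypothetical names, and the only real API relied on is the standard `playbackRate` property of HTML media elements.

```typescript
const MIN_RATE = 0.5;
const MAX_RATE = 2.0;
const DEFAULT_RATE = 1.0;

// Keep a user-supplied rate inside the documented 0.5 to 2.0 range.
// Non-numeric input (NaN, Infinity) falls back to normal speed.
function clampPlaybackRate(rate: number): number {
  if (!Number.isFinite(rate)) return DEFAULT_RATE;
  return Math.min(MAX_RATE, Math.max(MIN_RATE, rate));
}

// Works with any object exposing `playbackRate`, such as an <audio> element.
function applyPlaybackRate(audio: { playbackRate: number }, rate: number): void {
  audio.playbackRate = clampPlaybackRate(rate);
}
```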

Auto-Play

Configure automatic TTS for new messages:
# Feature currently user-controlled via settings
# Auto-play last message when enabled
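
One way to reason about auto-play is as a small decision function: play the newest message only if the setting is on, the message came from the AI, and it has not been played already. The sketch below assumes a hypothetical `ChatMessage` shape and helper name; it does not reflect LibreChat's internal state handling.

```typescript
// Hypothetical message shape for illustration only.
interface ChatMessage {
  id: string;
  isCreatedByUser: boolean;
}

// Returns the message that should be read aloud automatically, if any.
function pickAutoPlayMessage(
  messages: ChatMessage[],
  autoPlayEnabled: boolean,
  alreadyPlayed: ReadonlySet<string>,
): ChatMessage | undefined {
  if (!autoPlayEnabled) return undefined;
  const last = messages[messages.length - 1];
  // Skip empty conversations, the user's own messages, and replays.
  if (!last || last.isCreatedByUser || alreadyPlayed.has(last.id)) return undefined;
  return last;
}
```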

Speech-to-Text (STT)

Use your voice as input instead of typing.

Supported STT Providers

The browser provider uses the browser’s built-in speech recognition:
  • No configuration required
  • Works with Chrome, Edge, Safari
  • Limited browser support
  • Requires microphone permission
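
Because browser support is limited, a feature check is worth running before showing a microphone button. The sketch below looks for the Web Speech API's `SpeechRecognition` constructor and its prefixed form `webkitSpeechRecognition`. The helper names and the `RecognitionEnv` type are hypothetical, and the environment is injectable purely so the check can be tested outside a browser.

```typescript
// A constructor that takes no arguments; enough for feature detection.
type RecognitionCtor = new () => object;

interface RecognitionEnv {
  SpeechRecognition?: RecognitionCtor;
  webkitSpeechRecognition?: RecognitionCtor;
}

// Prefer the standard name and fall back to the prefixed one.
function getSpeechRecognitionCtor(
  env: RecognitionEnv = globalThis as unknown as RecognitionEnv,
): RecognitionCtor | undefined {
  return env.SpeechRecognition ?? env.webkitSpeechRecognition;
}

function browserSupportsSTT(env?: RecognitionEnv): boolean {
  return getSpeechRecognitionCtor(env) !== undefined;
}
```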

Using STT

1. Enable Microphone: Grant microphone permission when prompted by your browser.
2. Click Microphone Icon: Find the microphone button in the message input area.
3. Speak Your Message: Speak clearly into your microphone. The transcription appears in real-time.
4. Send or Edit:
  • Click Send to submit the transcription
  • Edit the text before sending if needed

For best results with STT:
  • Use a quality microphone
  • Minimize background noise
  • Speak at a normal pace
  • Enunciate clearly
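
The real-time transcription in step 3 can be pictured as splitting recognition results into confirmed and still-changing text. In the Web Speech API, each result carries an `isFinal` flag and its best alternative's `transcript`; the sketch below mirrors that shape with locally declared types. The function name `buildTranscript` is hypothetical, and this is not necessarily how LibreChat assembles its transcript.

```typescript
// Local stand-ins for the relevant parts of SpeechRecognitionResult.
interface RecognitionAlternative {
  transcript: string;
}
interface RecognitionResult {
  isFinal: boolean;
  0: RecognitionAlternative; // best alternative
}

// Split results into confirmed text and interim text that may still change.
function buildTranscript(
  results: ArrayLike<RecognitionResult>,
): { finalText: string; interimText: string } {
  let finalText = "";
  let interimText = "";
  for (const result of Array.from(results)) {
    const text = result[0].transcript;
    if (result.isFinal) {
      finalText += text;
    } else {
      interimText += text;
    }
  }
  return { finalText: finalText.trim(), interimText: interimText.trim() };
}
```

A UI could render `finalText` normally and `interimText` in a lighter style until the recognizer confirms it.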

Configuration

Environment Variables

# .env
STT_API_KEY=your-openai-key-for-stt
TTS_API_KEY=your-openai-key-for-tts

Complete Speech Configuration

# librechat.yaml
speech:
  # Text-to-Speech
  tts:
    openai:
      url: 'https://api.openai.com/v1'  # Optional
      apiKey: '${TTS_API_KEY}'
      model: 'tts-1-hd'  # or 'tts-1' for faster/cheaper
      voices:
        - 'alloy'
        - 'echo'
        - 'fable'
        - 'onyx'
        - 'nova'
        - 'shimmer'
  
  # Speech-to-Text
  stt:
    openai:
      url: 'https://api.openai.com/v1'  # Optional
      apiKey: '${STT_API_KEY}'
      model: 'whisper-1'
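
The `${TTS_API_KEY}` and `${STT_API_KEY}` values above are placeholders for environment variables. How LibreChat resolves them is not shown here; the sketch below only illustrates the general idea of substituting `${NAME}` from an environment map and failing loudly when a variable is missing. The function name `resolveEnvPlaceholders` is hypothetical.

```typescript
// Replace every ${NAME} in `value` with env[NAME].
// Throws when a referenced variable is unset or empty, so a misconfigured
// key surfaces at startup rather than as a failed API call later.
function resolveEnvPlaceholders(
  value: string,
  env: Record<string, string | undefined>,
): string {
  return value.replace(/\$\{(\w+)\}/g, (_match, name: string) => {
    const resolved = env[name];
    if (resolved === undefined || resolved === "") {
      throw new Error(`Missing environment variable: ${name}`);
    }
    return resolved;
  });
}
```

In Node.js this could be called as `resolveEnvPlaceholders(config.apiKey, process.env)`, where `config` is the parsed YAML.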

Audio Features

Audio Element

TTS uses HTML5 audio elements:
// Audio playback with controls
<audio
  id={`audio-${messageId}`}
  ref={audioRef}
  hidden
  preload="none"
>
  <source src={audioUrl} type="audio/mpeg" />
</audio>

Voice Selection

Choose from available TTS voices:
// Voice selector component
<Voices
  voices={['alloy', 'echo', 'fable', 'onyx', 'nova', 'shimmer']}
  selectedVoice={userPreference}
  onChange={handleVoiceChange}
/>
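
A stored voice preference can become stale if the configured voice list changes, so it helps to validate it before use. The sketch below checks a preference against the six voices listed above and falls back to a default. `resolveVoice` and the choice of `alloy` as the fallback are assumptions for illustration.

```typescript
const AVAILABLE_VOICES = ["alloy", "echo", "fable", "onyx", "nova", "shimmer"] as const;
type Voice = (typeof AVAILABLE_VOICES)[number];

// Return the preferred voice if it is configured, otherwise the fallback.
function resolveVoice(
  preferred: string | null | undefined,
  fallback: Voice = "alloy",
): Voice {
  const known = (AVAILABLE_VOICES as readonly string[]).includes(preferred ?? "");
  return known ? (preferred as Voice) : fallback;
}
```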

Rate Limiting

Control speech API usage:
# .env
TTS_VIOLATION_SCORE=0  # Score added per TTS limit violation; 0 adds no penalty
STT_VIOLATION_SCORE=0  # Score added per STT limit violation; adjust as needed
TTS and STT API calls can add up quickly. Monitor usage and set appropriate rate limits.

Accessibility

Speech features enhance accessibility:
  • Screen reader friendly: ARIA labels on all controls
  • Keyboard navigation: Full keyboard support
  • Visual feedback: Clear indication of recording/playback state
  • Captions: Transcriptions appear as text
// Accessibility attributes
aria-label="Read aloud"
aria-haspopup="false"
title="Click to read this message"

Browser Compatibility

  • Chrome: Full support (built-in + API)
  • Firefox: Built-in only
  • Safari: Built-in only
  • Edge: Full support
  • Mobile: Limited (iOS Safari, Chrome Android)

Performance Optimization

Audio Caching

TTS audio can be cached to reduce API calls:
// Audio source caching
const audioCache = new Map<string, string>();
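
A slightly fuller sketch of the caching idea follows. It differs from the one-line snippet above in one deliberate way: the map stores promises rather than resolved URLs, so two quick clicks on the same message share a single request. The key format, the `getCachedAudioUrl` name, and the `synthesize` callback are all hypothetical.

```typescript
// Cache of in-flight or completed audio URLs, keyed by message and voice.
const audioUrlCache = new Map<string, Promise<string>>();

function cacheKey(messageId: string, voice: string): string {
  return `${messageId}:${voice}`;
}

// `synthesize` stands in for whatever produces the audio URL (for example,
// a TTS API call followed by URL.createObjectURL on the returned blob).
function getCachedAudioUrl(
  messageId: string,
  voice: string,
  synthesize: () => Promise<string>,
): Promise<string> {
  const key = cacheKey(messageId, voice);
  let pending = audioUrlCache.get(key);
  if (!pending) {
    pending = synthesize();
    audioUrlCache.set(key, pending);
    // Drop failed entries so a later click can retry.
    pending.catch(() => audioUrlCache.delete(key));
  }
  return pending;
}
```

The cache is unbounded in this sketch; a real implementation would likely evict old entries and revoke object URLs it no longer needs.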

Lazy Loading

Audio elements load only when needed:
<audio preload="none" />

Throttling

Prevent spam by throttling requests:
// Rate limiting for TTS/STT
limit: 40,
window: 60000  // 1 minute
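
To make the `limit` and `window` values concrete, here is a minimal fixed-window rate limiter: each user gets up to `limit` requests per window, and the counter resets once the window has elapsed. This is an illustrative sketch of the technique only; the names are hypothetical, and LibreChat's actual limiter may be implemented differently.

```typescript
interface RateLimitOptions {
  limit: number;    // max requests per window, e.g. 40
  windowMs: number; // window length in milliseconds, e.g. 60000
}

// Returns a function that reports whether a user's request is allowed.
// `now` is injectable so the limiter can be tested without real delays.
function createRateLimiter(
  { limit, windowMs }: RateLimitOptions,
  now: () => number = Date.now,
): (userId: string) => boolean {
  const windows = new Map<string, { start: number; count: number }>();

  return function isAllowed(userId: string): boolean {
    const t = now();
    const current = windows.get(userId);
    // Start a fresh window on first use or after the previous one expires.
    if (!current || t - current.start >= windowMs) {
      windows.set(userId, { start: t, count: 1 });
      return true;
    }
    if (current.count >= limit) return false;
    current.count += 1;
    return true;
  };
}
```

With `createRateLimiter({ limit: 40, windowMs: 60000 })`, a user's 41st request inside one minute would be rejected.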

Use Cases

Accessibility:
  • Screen reader users
  • Visual impairments
  • Reading difficulties
  • Language learning

Hands-free use:
  • Driving
  • Cooking
  • Multitasking
  • Mobile usage

Content consumption:
  • Long-form content
  • Educational material
  • News summaries
  • Podcast-style listening

Voice input:
  • Faster than typing
  • Mobile convenience
  • Accessibility
  • Multilingual input

Troubleshooting

TTS not working:
  • Check API key configuration
  • Verify browser supports audio playback
  • Check volume/mute settings
  • Look for errors in browser console

No audio output:
  • Check device volume
  • Verify audio output device
  • Test browser audio (e.g., YouTube)
  • Check for browser audio permission

STT not working or inaccurate:
  • Grant microphone permission
  • Check microphone is working (test in another app)
  • Reduce background noise
  • Speak clearly and at moderate speed
  • Try refreshing the page

Poor audio quality:
  • Use tts-1-hd model for better quality
  • Check network connection
  • Try different voice options
  • For browser TTS, quality depends on OS

High API costs:
  • Use browser TTS instead of API
  • Limit TTS to important messages
  • Set rate limits
  • Monitor usage in OpenAI dashboard

Best Practices

  • Default to browser TTS: Lower costs, works offline
  • Use API TTS for quality: When professional voice matters
  • Enable selectively: Don’t auto-play all messages
  • Optimize voice choice: Match voice to use case
  • Monitor costs: TTS/STT can be expensive at scale
  • Provide text fallback: Always show text alongside audio

Configuration Reference

# librechat.yaml
speech:
  tts:
    openai:
      url: '${TTS_BASE_URL}'
      apiKey: '${TTS_API_KEY}'
      model: 'tts-1-hd'
      voices: ['alloy', 'echo', 'fable', 'onyx', 'nova', 'shimmer']
  
  stt:
    openai:
      url: '${STT_BASE_URL}'
      apiKey: '${STT_API_KEY}'
      model: 'whisper-1'

# .env
TTS_API_KEY=your-tts-key
STT_API_KEY=your-stt-key

# Rate limiting
TTS_VIOLATION_SCORE=0
STT_VIOLATION_SCORE=0
