The SpeechRecognition library provides a unified interface for multiple speech recognition engines and APIs. This guide covers installation for different use cases and platforms.

Basic Installation

1. Install with pip

The easiest way to install SpeechRecognition is using pip:
pip install SpeechRecognition
This installs the core library, which works with online APIs like Google Speech Recognition without any additional dependencies.

2. Verify installation

Test your installation by running the built-in demo:
python -m speech_recognition
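
The check can also be done from Python without touching a microphone. The following is a minimal sketch that uses only the standard library; `installed_version` is a hypothetical helper written for this guide, not part of SpeechRecognition.

```python
import importlib.util
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str, module_name: str) -> Optional[str]:
    """Return the installed version, or None if the module cannot be imported."""
    if importlib.util.find_spec(module_name) is None:
        return None
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        # Importable, but no distribution metadata was found.
        return "unknown"


version = installed_version("SpeechRecognition", "speech_recognition")
print("SpeechRecognition:", version or "not installed")
```

If this prints "not installed", pip probably installed the package into a different Python environment than the one running the script.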

System Requirements

  • Python 3.9+ is required
  • FLAC encoder is bundled for Windows (x86/x86-64), macOS (Intel), and Linux (x86/x86-64)
  • For other platforms, install FLAC manually (see Platform-Specific Setup)
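
As a rough sketch, both requirements can be checked with the standard library. `system_report` is a hypothetical helper; it only looks for a system-wide `flac` executable on the PATH and does not detect the bundled encoder.

```python
import shutil
import sys
from typing import Dict, Tuple


def system_report(min_python: Tuple[int, int] = (3, 9)) -> Dict[str, object]:
    """Summarize the requirements above for the current interpreter."""
    return {
        "python_ok": sys.version_info >= min_python,
        # Path to a system-wide flac binary, or None if there is none on PATH.
        "system_flac": shutil.which("flac"),
    }


report = system_report()
print("Python 3.9+:", report["python_ok"])
print("System FLAC:", report["system_flac"] or "not on PATH (the bundled encoder may still apply)")
```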

Optional Dependencies

The library supports multiple recognition engines, each with its own dependencies. Install only what you need:

Microphone Support

Required for capturing audio from your microphone:
pip install SpeechRecognition[audio]
PyAudio 0.2.11 or later is required. Earlier versions have known memory management issues when recording from microphones.
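
To confirm that the installed PyAudio meets this minimum, a small version check along these lines may help. It is a sketch under two assumptions: that the distribution is registered under the name `PyAudio`, and that its version string starts with dot-separated integers. Both helper functions are hypothetical.

```python
from importlib import metadata
from typing import Optional, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '0.2.14' into (0, 2, 14); stop at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def pyaudio_is_recent_enough(installed: Optional[str], minimum: str = "0.2.11") -> bool:
    """True if an installed version string is at least the minimum."""
    if installed is None:
        return False
    return parse_version(installed) >= parse_version(minimum)


try:
    installed = metadata.version("PyAudio")
except metadata.PackageNotFoundError:
    installed = None

print("PyAudio:", installed or "not installed")
print("Meets 0.2.11 minimum:", pyaudio_is_recent_enough(installed))
```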

Offline Recognition Engines

PocketSphinx (CMU Sphinx)

For offline speech recognition:
pip install SpeechRecognition[pocketsphinx]
PocketSphinx versions in operating system package repositories are often outdated. Installing with pip is recommended for compatibility with the bundled language data.
Additional language packs are available for languages like International French and Mandarin Chinese. See the PocketSphinx reference for details.

Vosk API

For offline recognition with Vosk:
pip install SpeechRecognition[vosk]
1. Download language models

Vosk requires language models. Download one from alphacephei.com/vosk/models and place it in your project’s model directory.

2. Or use the CLI helper

sprc download vosk
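
Before running recognition, it can be worth confirming that a model actually landed where you expect. The sketch below assumes the model directory is a folder named `model` relative to the current working directory; that default, and the helper `vosk_model_present`, are illustrative assumptions, so pass the path your project really uses.

```python
from pathlib import Path


def vosk_model_present(model_dir: str = "model") -> bool:
    """True if model_dir exists and contains at least one entry."""
    path = Path(model_dir)
    return path.is_dir() and any(path.iterdir())


if vosk_model_present():
    print("Found a non-empty model directory")
else:
    print("No model found; download one before using Vosk")
```

This only checks that the directory is non-empty. It does not validate that the contents are a usable Vosk model.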

Whisper (Local)

For offline recognition with OpenAI’s Whisper:
pip install SpeechRecognition[whisper-local]
This installs openai-whisper and soundfile for processing audio locally.

Faster Whisper

For optimized Whisper performance:
pip install SpeechRecognition[faster-whisper]

Cloud APIs

Google Cloud Speech-to-Text

pip install SpeechRecognition[google-cloud]
Prerequisites: Set up local authentication credentials for your Google account. Follow the Google Cloud Speech-to-Text setup guide.
Currently supports V1 API only (V2 is not yet supported).

OpenAI Whisper API

pip install SpeechRecognition[openai]
Set your API key as an environment variable:
export OPENAI_API_KEY="your-api-key-here"
This also works with OpenAI-compatible self-hosted endpoints (vLLM, Ollama, etc.). Set OPENAI_BASE_URL to your custom endpoint with a dummy API key.
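
A quick way to see which configuration a script will pick up is to inspect the two environment variables named above. This is a minimal sketch; `describe_openai_config` is a hypothetical helper, and it only reads the variables without contacting any endpoint.

```python
import os
from typing import Mapping, Optional


def describe_openai_config(env: Optional[Mapping[str, str]] = None) -> str:
    """Describe the endpoint implied by OPENAI_API_KEY and OPENAI_BASE_URL."""
    env = os.environ if env is None else env
    if not env.get("OPENAI_API_KEY"):
        return "OPENAI_API_KEY is not set"
    base_url = env.get("OPENAI_BASE_URL")
    if base_url:
        return f"custom endpoint: {base_url}"
    return "default OpenAI endpoint"


print(describe_openai_config())
```

Passing a plain dictionary instead of relying on `os.environ` makes the helper easy to exercise, for example `describe_openai_config({"OPENAI_API_KEY": "dummy", "OPENAI_BASE_URL": "http://localhost:8000/v1"})`, where the URL is only a placeholder.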

Groq Whisper API

pip install SpeechRecognition[groq]
Set your API key:
export GROQ_API_KEY="your-api-key-here"

Platform-Specific Setup

macOS

The FLAC encoder is bundled for Intel Macs running OS X 10.6+. On Apple Silicon, or if you need a custom build, install it with Homebrew:
brew install flac

Linux

FLAC is bundled for x86 and x86-64 architectures. For other architectures, install it with your package manager (Debian/Ubuntu shown):
sudo apt-get install flac

Raspberry Pi

The Raspberry Pi doesn’t have built-in audio input. You’ll need:
  1. A USB sound card or USB microphone
  2. To specify the device index when creating a Microphone instance
See Troubleshooting for how to find your device index.

Installation from Source

To install from the source distribution:
1. Download source

Download the source distribution from PyPI.

2. Extract and install

tar -xzf SpeechRecognition-*.tar.gz
cd SpeechRecognition-*
python -m pip install .
For development:
git clone https://github.com/Uberi/speech_recognition.git
cd speech_recognition
python -m pip install -e .[dev]

Troubleshooting

If PyAudio cannot be found, install it to enable microphone support:
pip install SpeechRecognition[audio]
If that fails on Linux, install system dependencies first:
sudo apt-get install portaudio19-dev python3-all-dev
pip install SpeechRecognition[audio]
If no default microphone can be found, your system doesn’t have one configured. Either:
  1. Set a default microphone in your OS settings, or
  2. Specify a device index explicitly: Microphone(device_index=0)
See Selecting a Specific Microphone below.
List available microphones with this code:
import speech_recognition as sr

for index, name in enumerate(sr.Microphone.list_microphone_names()):
    print(f"Microphone {index}: {name}")
Then use the index with Microphone:
# Use the microphone at index 3 and record one phrase from it
recognizer = sr.Recognizer()
with sr.Microphone(device_index=3) as source:
    audio = recognizer.listen(source)
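
Device indexes can change when hardware is plugged in or removed, so looking a device up by name is sometimes more robust than hard-coding a number. The following sketch shows one way to do that; `find_microphone_index` is a hypothetical helper, and the sample names are made up for illustration. On a real system, pass the result of `sr.Microphone.list_microphone_names()` instead.

```python
from typing import List, Optional


def find_microphone_index(names: List[Optional[str]], keyword: str) -> Optional[int]:
    """Return the index of the first device whose name contains keyword (case-insensitive)."""
    needle = keyword.lower()
    for index, name in enumerate(names):
        if name and needle in name.lower():
            return index
    return None


# Sample names for illustration only.
sample_names = ["HDA Intel PCH: ALC3232 Analog", "USB PnP Sound Device: Audio", "default"]
print(find_microphone_index(sample_names, "usb"))  # prints 1
```

The returned value can be passed as `device_index` to `Microphone`; a result of `None` means no device name matched.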
On macOS, installing FLAC directly from source code doesn’t add the executables to the search path correctly. Use Homebrew instead:
brew install flac
Warnings like “bt_audio_service_open: Connection refused” or “Unknown PCM” are common on Linux and can usually be safely ignored. They occur when:
  • A Bluetooth audio device is configured but not connected (safe to ignore if you are not using Bluetooth)
  • ALSA tries to connect to a JACK server that is not running (safe to ignore)
To suppress specific PCM warnings, comment out the corresponding lines in /usr/share/alsa/alsa.conf.
SpeechRecognition is supported out of the box in PyInstaller 3.0+. If you encounter issues when packaging your application, upgrade PyInstaller:
pip install --upgrade pyinstaller

Next Steps

Quickstart Guide

Get started with your first speech recognition application

API Reference

Explore the full API documentation