Let’s start with the simplest possible example: recognizing speech from your microphone using Google’s free speech recognition service.
1. Create a new Python file
Create a file called speech_demo.py and import the library:
speech_demo.py

    import speech_recognition as sr

    # Create a recognizer instance
    r = sr.Recognizer()

2. Capture audio from microphone
Use the Microphone context manager to capture audio:
speech_demo.py

    # Use the default microphone as audio source
    with sr.Microphone() as source:
        print("Say something!")
        audio = r.listen(source)

3. Recognize the speech
Send the audio to Google Speech Recognition:
speech_demo.py

    # Recognize speech using Google Speech Recognition
    try:
        text = r.recognize_google(audio)
        print(f"You said: {text}")
    except sr.UnknownValueError:
        print("Could not understand audio")
    except sr.RequestError as e:
        print(f"Error: {e}")

4. Run your program
Execute your script:

    python speech_demo.py

Speak into your microphone when prompted, and you should see your speech transcribed!
The complete speech_demo.py script:

    import speech_recognition as sr

    # Create recognizer instance
    r = sr.Recognizer()

    # Use microphone as audio source
    with sr.Microphone() as source:
        print("Say something!")
        audio = r.listen(source)

    # Recognize speech using Google Speech Recognition
    try:
        text = r.recognize_google(audio)
        print(f"You said: {text}")
    except sr.UnknownValueError:
        print("Could not understand audio")
    except sr.RequestError as e:
        print(f"Error: {e}")

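If the script fails at import time or when opening the microphone, a missing dependency is a likely cause. The check below is a minimal sketch. It assumes the library imports as speech_recognition and that sr.Microphone relies on PyAudio (imported as pyaudio). The helper name missing_modules is hypothetical and not part of the library.

```python
import importlib.util

# Assumed module names: the library imports as `speech_recognition`,
# and sr.Microphone depends on PyAudio, which imports as `pyaudio`.
REQUIRED_MODULES = ("speech_recognition", "pyaudio")


def missing_modules(names=REQUIRED_MODULES):
    """Return the names from `names` that cannot be found on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]
```

Calling missing_modules() with no arguments returns an empty list when both modules are installed. Any name it returns needs to be installed before speech_demo.py can run.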
For better accuracy, calibrate the recognizer to ambient noise levels before listening:

    import speech_recognition as sr

    r = sr.Recognizer()

    with sr.Microphone() as source:
        # Adjust for ambient noise - listens for 1 second
        print("Adjusting for ambient noise... Please wait")
        r.adjust_for_ambient_noise(source, duration=1)

        print("Say something!")
        audio = r.listen(source)

    try:
        text = r.recognize_google(audio)
        print(f"You said: {text}")
    except sr.UnknownValueError:
        print("Could not understand audio")
    except sr.RequestError as e:
        print(f"Error: {e}")

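Always call adjust_for_ambient_noise() before the user starts speaking, while only background noise is present. The call samples that noise and sets the recognizer's energy threshold to match the current noise level, so speech captured during calibration would skew the result.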
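To make the idea of an energy threshold concrete, here is a simplified, library-independent sketch. It is not the library's actual algorithm. The function names and the ratio of 1.5 are assumptions chosen for illustration. It measures the energy of ambient audio chunks and treats anything clearly louder as speech.

```python
import math


def rms(samples):
    """Root-mean-square energy of a sequence of PCM sample values."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def calibrate_threshold(ambient_chunks, ratio=1.5):
    """Pick a threshold `ratio` times above the loudest ambient chunk.

    `ratio` is an illustrative assumption, not a value taken from the library.
    """
    return ratio * max(rms(chunk) for chunk in ambient_chunks)


def is_speech(chunk, threshold):
    """Treat a chunk as speech only when its energy exceeds the threshold."""
    return rms(chunk) > threshold
```

If someone talks during calibration, the measured ambient energy rises and so does the threshold, which makes quieter speech easier to miss afterwards. In the library itself, the calibrated value is available as r.energy_threshold.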
The library supports multiple recognition engines. Here are examples using different services:

    # No API key required!
    text = r.recognize_google(audio)

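Google Speech Recognition is free and works without supplying your own API key: when no key is given, recognize_google() falls back to a default key bundled with the library. That makes it a good fit for getting started and for personal or test use. For production applications, consider using other engines with proper API keys.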
For continuous speech recognition, use listen_in_background():

    import time

    import speech_recognition as sr

    r = sr.Recognizer()
    m = sr.Microphone()

    def callback(recognizer, audio):
        """This is called from a background thread"""
        try:
            text = recognizer.recognize_google(audio)
            print(f"You said: {text}")
        except sr.UnknownValueError:
            print("Could not understand audio")
        except sr.RequestError as e:
            print(f"Error: {e}")

    # Start listening in the background
    stop_listening = r.listen_in_background(m, callback)

    # Keep the program running
    print("Listening... Press Ctrl+C to stop")
    try:
        while True:
            time.sleep(0.1)  # sleep instead of busy-waiting so the loop does not peg a CPU core
    except KeyboardInterrupt:
        stop_listening(wait_for_stop=False)
        print("Stopped listening")

The callback function runs in a background thread. Make sure your callback is thread-safe if it modifies shared data.
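One common way to keep the callback thread-safe is to have it do nothing but enqueue the captured audio, and to run recognition and any state changes on the main thread. The sketch below uses the standard library's queue.Queue for that hand-off. The names audio_queue, drain, transcribe, and handle_text are hypothetical, and transcribe stands in for whichever recognition call you use.

```python
import queue

# Thread-safe hand-off: the background thread only enqueues,
# the main thread dequeues and does the real work.
audio_queue = queue.Queue()


def callback(recognizer, audio):
    """Runs on the background thread; touches no shared state except the queue."""
    audio_queue.put((recognizer, audio))


def drain(transcribe, handle_text, timeout=0.5):
    """Process queued clips on the calling thread; return how many were handled.

    `transcribe(recognizer, audio)` is a hypothetical hook, for example
    lambda rec, clip: rec.recognize_google(clip).
    Returns once the queue has stayed empty for `timeout` seconds.
    """
    handled = 0
    while True:
        try:
            recognizer, audio = audio_queue.get(timeout=timeout)
        except queue.Empty:
            return handled
        handle_text(transcribe(recognizer, audio))
        handled += 1
```

With this shape, callback is what you would pass to listen_in_background(), and the main loop calls drain() repeatedly instead of sleeping. Error handling for UnknownValueError and RequestError would go inside the transcribe hook.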
Always handle both exception types when performing recognition:

    import speech_recognition as sr

    r = sr.Recognizer()

    with sr.Microphone() as source:
        audio = r.listen(source)

    try:
        text = r.recognize_google(audio)
        print(f"Result: {text}")
    except sr.UnknownValueError:
        # Speech was unintelligible
        print("Sorry, I couldn't understand that")
    except sr.RequestError as e:
        # API request failed
        print(f"Could not connect to the service: {e}")

UnknownValueError: Raised when the recognizer can’t understand the speech
RequestError: Raised when there’s a network error or API issue
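The two exceptions call for different reactions: a network or API failure may succeed on a second attempt, while an unintelligible clip will not improve by resending it. The helper below is a generic sketch of retrying with exponential backoff. The name call_with_retry and its parameters are hypothetical, and it knows nothing about the library. You pass in the exception types you consider retryable.

```python
import time


def call_with_retry(call, retryable, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call `call()` and retry on `retryable` exceptions with exponential backoff.

    Any other exception propagates immediately, so an unintelligible clip
    (UnknownValueError) is not retried when only RequestError is listed.
    """
    for attempt in range(attempts):
        try:
            return call()
        except retryable:
            if attempt == attempts - 1:
                raise  # out of attempts: let the caller see the last failure
            sleep(base_delay * 2 ** attempt)  # waits 0.5s, 1s, 2s, ...
```

A possible use with the recognizer from the examples above would be call_with_retry(lambda: r.recognize_google(audio), (sr.RequestError,)), still wrapped in the usual try/except so that the final RequestError and any UnknownValueError are reported to the user.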