The energy threshold is a critical parameter that determines when the library considers audio to be speech versus background noise. Proper calibration ensures reliable speech detection in different environments.

Prerequisites

This example requires PyAudio to access your microphone. Install it with:
pip install pyaudio

Understanding Energy Threshold

The energy_threshold property controls how sensitive the recognizer is to sound:
  • Too Low: Background noise triggers false detections
  • Too High: Quiet speech is missed
  • Just Right: Only actual speech is detected
The library uses this threshold to determine:
  1. When speech starts (audio rises above threshold)
  2. When speech ends (audio falls below threshold)
  3. What audio to send to the recognition engine
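
The start and end rules above can be illustrated with a simplified sketch. This is not the library's implementation: the function name, the list of per-chunk energies, and the pause_chunks parameter are assumptions made for illustration only.

```python
from typing import List, Optional, Tuple


def find_speech_segment(
    energies: List[float],
    threshold: float,
    pause_chunks: int = 2,  # hypothetical: quiet chunks that end a phrase
) -> Optional[Tuple[int, int]]:
    """Return (start, end) chunk indices of the first speech segment, or None.

    start is the first chunk whose energy exceeds the threshold.
    end is the index just past the last loud chunk, decided once
    pause_chunks consecutive quiet chunks follow it or the input runs out.
    """
    start = None
    quiet_run = 0
    for i, energy in enumerate(energies):
        if start is None:
            if energy > threshold:
                start = i  # speech starts: energy rose above the threshold
        elif energy > threshold:
            quiet_run = 0  # still speaking
        else:
            quiet_run += 1
            if quiet_run >= pause_chunks:
                return (start, i - quiet_run + 1)  # speech ended
    if start is None:
        return None
    return (start, len(energies) - quiet_run)


energies = [50, 80, 900, 1200, 700, 90, 60, 40]
print(find_speech_segment(energies, threshold=300))   # just right: (2, 5)
print(find_speech_segment(energies, threshold=5000))  # too high: None
print(find_speech_segment(energies, threshold=60))    # too low: (1, 6), noise included
```

With a threshold of 60, the segment begins at a background-noise chunk, which is the "too low" failure mode described above.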

Automatic Calibration

The easiest way to set the energy threshold is to use automatic calibration:
1. Create Recognizer and Microphone

import speech_recognition as sr

r = sr.Recognizer()
2. Calibrate for Ambient Noise

with sr.Microphone() as source:
    r.adjust_for_ambient_noise(source)
This method:
  • Listens to ambient noise for 1 second (default)
  • Calculates an appropriate energy threshold
  • Sets r.energy_threshold automatically
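
One plausible way to picture the calculation is a damped average that moves the threshold toward a multiple of the measured ambient energy, one audio chunk at a time. The sketch below is a simulation under stated assumptions, not the library's code: the function name, the chunk length, the damping constant of 0.15, and the ratio of 1.5 are all illustrative values.

```python
def simulate_calibration(
    chunk_energies,
    start_threshold=300.0,
    seconds_per_chunk=0.064,  # assumed: 1024 samples at 16 kHz
    damping_per_second=0.15,  # assumed damping constant
    ratio=1.5,                # assumed headroom above ambient energy
):
    """Move the threshold toward ratio * energy for each ambient chunk."""
    # Scale the damping so the result does not depend on the chunk length.
    damping = damping_per_second ** seconds_per_chunk
    threshold = start_threshold
    for energy in chunk_energies:
        target = energy * ratio
        threshold = threshold * damping + target * (1 - damping)
    return threshold


ambient = 400.0  # pretend every chunk of room noise has this energy
one_second = simulate_calibration([ambient] * 15)     # about 1 s of audio
three_seconds = simulate_calibration([ambient] * 46)  # about 3 s of audio
print(round(one_second), round(three_seconds))
```

Under these assumptions, a longer calibration lets the threshold settle closer to its target (here 400 * 1.5 = 600), which is why longer durations help when the ambient level is far from the default.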
3. Listen for Speech

After calibration, use the recognizer normally:
with sr.Microphone() as source:
    print("Say something!")
    audio = r.listen(source)

Complete Working Example

Here’s the complete calibration example from the source code:
calibrate_energy_threshold.py
import speech_recognition as sr

# Obtain audio from the microphone
r = sr.Recognizer()
with sr.Microphone() as source:
    # Listen for 1 second to calibrate the energy threshold for ambient noise levels
    r.adjust_for_ambient_noise(source)
    print("Say something!")
    audio = r.listen(source)

# Recognize speech using Google Speech Recognition
try:
    text = r.recognize_google(audio)
    print("Google Speech Recognition thinks you said " + text)
except sr.UnknownValueError:
    print("Google Speech Recognition could not understand audio")
except sr.RequestError as e:
    print("Could not request results from Google Speech Recognition service; {0}".format(e))

Manual Calibration

For more control, you can manually set the energy threshold:

Check the Default Value

import speech_recognition as sr

r = sr.Recognizer()
print(f"Default energy threshold: {r.energy_threshold}")
# Output: Default energy threshold: 300

Set a Custom Value

r = sr.Recognizer()
r.energy_threshold = 4000  # Higher = less sensitive

with sr.Microphone() as source:
    audio = r.listen(source)

Find the Right Value

Experiment to find the optimal threshold for your environment:
import speech_recognition as sr

r = sr.Recognizer()
# Keep each test value fixed; dynamic adjustment (on by default) would change it while listening
r.dynamic_energy_threshold = False

# Try different values
for threshold in [1000, 2000, 3000, 4000, 5000]:
    r.energy_threshold = threshold
    print(f"\nTesting with threshold: {threshold}")

    with sr.Microphone() as source:
        print("Say something...")
        try:
            audio = r.listen(source, timeout=3)
            text = r.recognize_google(audio)
            print(f"Success! Recognized: {text}")
        except sr.WaitTimeoutError:
            print("No speech detected (threshold too high?)")
        except sr.UnknownValueError:
            print("Speech detected but not understood")
        except sr.RequestError as e:
            print(f"Could not reach the recognition service: {e}")

Calibration Duration

You can specify how long to listen when calibrating:
with sr.Microphone() as source:
    # Default: 1 second
    r.adjust_for_ambient_noise(source)
    
    # Custom duration: 2 seconds
    r.adjust_for_ambient_noise(source, duration=2)
    
    # Quick calibration: 0.5 seconds
    r.adjust_for_ambient_noise(source, duration=0.5)
Use longer durations (2-3 seconds) in noisy environments for better calibration.

When to Calibrate

One-Time Calibration

For stable environments, calibrate once at startup:
import speech_recognition as sr

r = sr.Recognizer()
m = sr.Microphone()

# Calibrate once
with m as source:
    print("Calibrating... Please be quiet.")
    r.adjust_for_ambient_noise(source, duration=2)
    print(f"Energy threshold set to: {r.energy_threshold}")

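# Now use the recognizer multiple times
for i in range(5):
    with m as source:
        print(f"\nListening (attempt {i+1})...")
        audio = r.listen(source)
    try:
        text = r.recognize_google(audio)
        print(f"You said: {text}")
    except sr.UnknownValueError:
        # Unintelligible audio should not abort the remaining attempts
        print("Could not understand audio")
    except sr.RequestError as e:
        print(f"Could not reach the recognition service: {e}")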

Per-Session Calibration

Recalibrate before each listening session if the environment changes:
with sr.Microphone() as source:
    # Calibrate before each listen
    r.adjust_for_ambient_noise(source)
    audio = r.listen(source)

Background Listening Calibration

For background listening, calibrate before starting:
import speech_recognition as sr
import time

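def callback(recognizer, audio):
    # Runs on the background thread for each detected phrase
    try:
        text = recognizer.recognize_google(audio)
        print(f"Heard: {text}")
    except sr.UnknownValueError:
        pass  # ignore phrases that could not be understood
    except sr.RequestError as e:
        print(f"Could not reach the recognition service: {e}")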

r = sr.Recognizer()
m = sr.Microphone()

# Calibrate once before background listening
with m as source:
    r.adjust_for_ambient_noise(source)

# Start background listening
stop_listening = r.listen_in_background(m, callback)

time.sleep(60)  # Listen for 60 seconds
stop_listening(wait_for_stop=False)

Environment-Specific Settings

Quiet Room (Office, Home)

# Lower threshold for quiet environments
r.energy_threshold = 300  # or use automatic calibration

Moderate Noise (Coffee Shop)

# Medium threshold
r.energy_threshold = 1000

# Or calibrate with longer duration
with sr.Microphone() as source:
    r.adjust_for_ambient_noise(source, duration=2)

Noisy Environment (Street, Restaurant)

# Higher threshold
r.energy_threshold = 4000

# Or calibrate with even longer duration
with sr.Microphone() as source:
    r.adjust_for_ambient_noise(source, duration=3)

Very Noisy (Construction, Party)

# Very high threshold
r.energy_threshold = 6000

# Consider using a directional microphone
# or external noise cancellation

Dynamic Adjustment

The dynamic_energy_threshold property allows the threshold to adapt over time:
r = sr.Recognizer()

# Enable dynamic adjustment (default: True)
r.dynamic_energy_threshold = True

# The threshold will automatically adjust based on ambient noise
# during listening
This is useful when:
  • The environment noise level changes during use
  • You don’t know the environment in advance
  • You want “set it and forget it” behavior
Disable it for consistent behavior:
r.dynamic_energy_threshold = False
r.energy_threshold = 3000  # Fixed threshold

Monitoring Energy Levels

To see which threshold calibration settles on in your environment, and to check whether speech is detected with it:
import speech_recognition as sr

r = sr.Recognizer()

with sr.Microphone() as source:
    print("Monitoring energy levels...")
    print("Current threshold:", r.energy_threshold)
    
    # Sample ambient noise
    r.adjust_for_ambient_noise(source, duration=2)
    
    print("Adjusted threshold:", r.energy_threshold)
    print("\nSpeak now to see if speech is detected...")
    
    try:
        audio = r.listen(source, timeout=5)
        print("Speech detected!")
    except sr.WaitTimeoutError:
        print("No speech detected within timeout")
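
To put a number on a captured clip, you can compute its root-mean-square (RMS) level yourself and compare it with the threshold. The helper below is a sketch, assuming the recognizer's energy measure is an RMS value and that the audio is 16-bit little-endian PCM; rms_energy is a hypothetical name, not a library function.

```python
import math
import struct


def rms_energy(raw: bytes, sample_width: int = 2) -> float:
    """Return the RMS level of signed 16-bit little-endian PCM samples."""
    if sample_width != 2:
        raise ValueError("this sketch only handles 16-bit (2-byte) samples")
    count = len(raw) // 2
    if count == 0:
        return 0.0
    samples = struct.unpack("<%dh" % count, raw[: count * 2])
    return math.sqrt(sum(s * s for s in samples) / count)


# Synthetic clip: four samples that alternate between +1000 and -1000
clip = struct.pack("<4h", 1000, -1000, 1000, -1000)
print(rms_energy(clip))  # 1000.0

# Usage with captured audio (assumes `audio` came from r.listen(source)):
#   level = rms_energy(audio.get_raw_data(convert_width=2))
#   print("Clip RMS:", level, "Threshold:", r.energy_threshold)
```

The value covers the whole clip, including any quiet padding, so it will usually read lower than the loudest moments of speech.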

Troubleshooting

Problem: Recognizer Never Detects Speech

Symptoms: The listen() method never returns, or always times out.
Solution: The energy threshold is too high.
# Lower the threshold
r.energy_threshold = 300

# Or recalibrate in a quiet moment
with sr.Microphone() as source:
    print("Please be quiet during calibration...")
    r.adjust_for_ambient_noise(source, duration=2)

Problem: Background Noise Triggers Detection

Symptoms: The callback fires constantly, even when no one is speaking.
Solution: The energy threshold is too low.
# Raise the threshold
r.energy_threshold = 4000

# Or recalibrate while the background noise is present
with sr.Microphone() as source:
    print("Calibrating to current noise level...")
    r.adjust_for_ambient_noise(source, duration=3)

Problem: Inconsistent Detection

Symptoms: Sometimes speech is detected, sometimes it isn’t.
Solution: Use dynamic thresholding or recalibrate more frequently.
# Enable dynamic thresholding
r.dynamic_energy_threshold = True

# Or recalibrate before each session
with sr.Microphone() as source:
    r.adjust_for_ambient_noise(source)
    audio = r.listen(source)

Problem: Quiet Speech Not Detected

Symptoms: Loud speech works, but quiet speech is missed.
Solutions:
  1. Lower the energy threshold:
r.energy_threshold = 200
  2. Ask users to speak louder
  3. Use a better microphone closer to the speaker
  4. Reduce background noise in the environment

Best Practices

Always calibrate in conditions similar to those in which you’ll be listening. Don’t calibrate in silence if the application will be used in a noisy environment.
# Good: Calibrate while background noise is present
with sr.Microphone() as source:
    print("Calibrating... (normal noise is OK)")
    r.adjust_for_ambient_noise(source, duration=2)
Let users know when calibration is happening:
with sr.Microphone() as source:
    print("Adjusting for ambient noise... Please wait.")
    r.adjust_for_ambient_noise(source, duration=1)
    print("Calibration complete. You can speak now.")
Unless you have a specific reason to disable it, keep dynamic thresholding enabled:
r = sr.Recognizer()
r.dynamic_energy_threshold = True  # This is the default
Always test your application in the actual environment where it will be used:
# Print the calibrated values so you can compare them across environments
with sr.Microphone() as source:
    r.adjust_for_ambient_noise(source)
    print(f"Threshold set to: {r.energy_threshold}")
    print(f"Dynamic adjustment: {r.dynamic_energy_threshold}")
For production applications, consider letting users manually adjust the threshold:
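import speech_recognition as sr

# Load from config or user preferences.
# load_user_preference is a placeholder for your own settings loader;
# it is not part of speech_recognition.
user_threshold = load_user_preference("energy_threshold", default=300)

r = sr.Recognizer()
r.energy_threshold = user_threshold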
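
One possible way to back load_user_preference is a small JSON file. Everything here is an assumption for illustration: the file name speech_prefs.json, the save_user_preference companion, and the choice of JSON are hypothetical, and none of it is provided by speech_recognition.

```python
import json
from pathlib import Path

PREFS_PATH = Path("speech_prefs.json")  # hypothetical location for user settings


def _read_prefs(path) -> dict:
    """Return stored preferences, or an empty dict if the file is missing or invalid."""
    try:
        prefs = json.loads(Path(path).read_text(encoding="utf-8"))
    except (FileNotFoundError, json.JSONDecodeError):
        return {}
    return prefs if isinstance(prefs, dict) else {}


def load_user_preference(key, default, path=PREFS_PATH):
    """Return the stored value for key, falling back to default."""
    return _read_prefs(path).get(key, default)


def save_user_preference(key, value, path=PREFS_PATH):
    """Store value under key, keeping any other saved preferences."""
    prefs = _read_prefs(path)
    prefs[key] = value
    Path(path).write_text(json.dumps(prefs, indent=2), encoding="utf-8")


user_threshold = load_user_preference("energy_threshold", default=300)
print(user_threshold)
```

Saving the value after a successful calibration, for example save_user_preference("energy_threshold", r.energy_threshold), lets the next session start from a threshold that already worked.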
The energy threshold works in conjunction with other parameters:

Pause Threshold

Seconds of non-speaking audio before a phrase is considered complete:
r.pause_threshold = 0.8  # seconds (default)
r.pause_threshold = 1.2  # longer pauses = more patient listening

Phrase Time Limit

Maximum duration of a phrase:
with sr.Microphone() as source:
    # Limit phrases to 5 seconds
    audio = r.listen(source, phrase_time_limit=5)

Non-Speaking Duration

Seconds of non-speaking audio to keep on both sides of the recording:
r.non_speaking_duration = 0.5  # seconds (default)
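
These durations are given in seconds, but audio arrives in fixed-size chunks, so in practice they behave like whole numbers of chunks. The sketch below shows the conversion under stated assumptions: a chunk size of 1024 samples, a sample rate of 16000 Hz, and rounding up to whole chunks. The buffer_counts helper is hypothetical and not part of the library.

```python
import math


def buffer_counts(
    pause_threshold=0.8,
    non_speaking_duration=0.5,
    chunk_size=1024,    # assumed samples per chunk
    sample_rate=16000,  # assumed samples per second
):
    """Convert second-based settings into whole numbers of audio chunks."""
    seconds_per_buffer = chunk_size / sample_rate
    return {
        "seconds_per_buffer": seconds_per_buffer,
        "pause_buffers": math.ceil(pause_threshold / seconds_per_buffer),
        "non_speaking_buffers": math.ceil(non_speaking_duration / seconds_per_buffer),
    }


counts = buffer_counts()
print(counts["pause_buffers"], counts["non_speaking_buffers"])  # 13 8
```

Because of the rounding, small changes to a setting (for example 0.80 to 0.82 seconds) may have no effect until they cross a chunk boundary.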

Next Steps

Microphone Recognition

Apply calibration to basic recognition

Background Listening

Use calibration with continuous listening

Recognizer API

Explore all recognizer properties and methods