The Recognizer class represents a collection of speech recognition settings and functionality. It provides methods for capturing audio, adjusting for ambient noise, and performing speech recognition using various engines.
Constructor
Creates a new Recognizer instance with default settings.
Properties
energy_threshold
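The energy level threshold for sounds: values below it are treated as silence, and values above it are treated as speech. It is adjusted automatically when dynamic thresholds are enabled (see dynamic_energy_threshold). The threshold you actually need depends on your microphone sensitivity or audio data. Typical values for a silent room are 0 to 100, and typical values for speaking are between 150 and 3500.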
Example:
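To make the threshold concrete, here is a minimal standard-library sketch, not the library's own code. It assumes that energy is measured as the root-mean-square (RMS) of signed 16-bit samples and that the threshold is 300; rms_energy, sine_chunk, and is_speech are hypothetical helper names introduced for illustration.

```python
import math

def rms_energy(samples):
    """Root-mean-square of a sequence of signed 16-bit sample values."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def sine_chunk(amplitude, n=1024, freq=440.0, rate=16000):
    """Generate n samples of a sine tone with the given peak amplitude."""
    return [int(amplitude * math.sin(2 * math.pi * freq * i / rate))
            for i in range(n)]

def is_speech(samples, threshold):
    """Treat a chunk as speech when its RMS energy exceeds the threshold."""
    return rms_energy(samples) > threshold

energy_threshold = 300  # assumed value for this sketch

# The RMS of a sine tone is roughly amplitude / sqrt(2).
quiet = sine_chunk(amplitude=60)    # lands in the "silent room" range
loud = sine_chunk(amplitude=2000)   # lands in the "speaking" range

print(is_speech(quiet, energy_threshold))
print(is_speech(loud, energy_threshold))
```

With these inputs the quiet chunk falls below the threshold and the loud one exceeds it, which is the distinction the property controls.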
dynamic_energy_threshold
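Whether the energy threshold is adjusted automatically based on the current ambient noise level while listening. To turn automatic adjustment off, set this to False.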
dynamic_energy_adjustment_damping
dynamic_energy_ratio
pause_threshold
phrase_threshold
non_speaking_duration
operation_timeout
Methods
record()
Records up to duration seconds of audio from source (an AudioSource instance), starting at offset (or at the beginning if not specified), into an AudioData instance, which it returns.
If duration is not specified, then it will record until there is no more audio input.
source: An audio source instance (e.g., Microphone or AudioFile). Must be entered before recording (used within a with statement).
duration: Maximum number of seconds to record. If None, records until the stream ends.
offset: Number of seconds into the audio to start recording from.
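The way duration and offset select a slice of the audio can be sketched with the standard-library wave module. This is an illustration of the semantics described above, not the library's implementation; make_silent_wav and read_segment are hypothetical helpers, and the 8000 Hz mono 16-bit format is an assumption.

```python
import io
import wave

def make_silent_wav(seconds, rate=8000):
    """Build an in-memory mono 16-bit WAV file containing silence."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(rate)
        w.writeframes(b"\x00\x00" * int(seconds * rate))
    buf.seek(0)
    return buf

def read_segment(wav_file, duration=None, offset=None):
    """Return (raw_bytes, frame_rate) for the requested slice of the audio."""
    with wave.open(wav_file, "rb") as w:
        rate = w.getframerate()
        total = w.getnframes()
        # Skip `offset` seconds, clamped to the end of the stream.
        start = min(int((offset or 0) * rate), total)
        w.setpos(start)
        remaining = total - start
        # With no duration, read until there is no more audio.
        count = remaining if duration is None else min(int(duration * rate), remaining)
        return w.readframes(count), rate

# One second of audio starting half a second into a two-second file.
data, rate = read_segment(make_silent_wav(2.0), duration=1.0, offset=0.5)
seconds_read = len(data) / (2 * rate)  # 2 bytes per mono 16-bit frame
print(seconds_read)
```

Asking for a duration longer than the remaining audio simply returns what is left, matching the "up to duration seconds" wording.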
adjust_for_ambient_noise()
Adjusts the energy threshold dynamically using audio from source (an AudioSource instance) to account for ambient noise.
Intended to calibrate the energy threshold with the ambient energy level. Should be used on periods of audio without speech - will stop early if any speech is detected.
The duration parameter is the maximum number of seconds that it will dynamically adjust the threshold for before returning. This value should be at least 0.5 in order to get a representative sample of the ambient noise.
source: An audio source instance. Must be entered before adjusting.
duration: Maximum number of seconds to adjust for. Should be at least 0.5.
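One way to picture the calibration is as a damped update that nudges the threshold toward a multiple of the measured ambient energy, one chunk at a time. The sketch below is an assumption-laden illustration rather than the library's code: calibrate_threshold is a hypothetical function, its damping and ratio arguments stand in for dynamic_energy_adjustment_damping and dynamic_energy_ratio, the default values 0.15 and 1.5 are assumed, and the early stop on detected speech is left out.

```python
def calibrate_threshold(chunk_energies, seconds_per_chunk, duration=1.0,
                        energy_threshold=300.0, damping=0.15, ratio=1.5):
    """Move energy_threshold toward ratio * ambient energy, chunk by chunk."""
    elapsed = 0.0
    for energy in chunk_energies:
        elapsed += seconds_per_chunk
        if elapsed > duration:
            break
        # Scale the damping to the chunk length so the outcome depends on
        # elapsed time rather than on how the audio happens to be chunked.
        chunk_damping = damping ** seconds_per_chunk
        target = energy * ratio
        energy_threshold = (energy_threshold * chunk_damping
                            + target * (1 - chunk_damping))
    return energy_threshold

# Steady ambient energy of 40, delivered in 0.1-second chunks (assumed values).
ambient = [40.0] * 50
short = calibrate_threshold(ambient, seconds_per_chunk=0.1, duration=0.5)
longer = calibrate_threshold(ambient, seconds_per_chunk=0.1, duration=1.0)
print(short, longer)
```

A longer duration leaves the threshold closer to the target (here 40 * 1.5 = 60), which is why very short calibration windows give a less representative result.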
listen()
Records a single phrase from source (an AudioSource instance) into an AudioData instance, which it returns.
This is done by waiting until the audio has an energy above energy_threshold (the user has started speaking), and then recording until it encounters pause_threshold seconds of non-speaking or there is no more audio input. The ending silence is not included.
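The start and stop rule described above can be sketched over a list of per-chunk energy values. This is a simplified illustration, not the library's implementation: capture_phrase is a hypothetical function, the defaults of 300 and 0.8 are assumed, and timeouts, phrase_threshold, and non_speaking_duration are ignored.

```python
def capture_phrase(chunk_energies, seconds_per_chunk,
                   energy_threshold=300, pause_threshold=0.8):
    """Return (start, end) chunk indices of the first phrase, end exclusive.

    Returns None if no chunk ever exceeds the threshold.
    """
    pause_chunks = max(1, round(pause_threshold / seconds_per_chunk))
    start = None
    silent_run = 0
    for i, energy in enumerate(chunk_energies):
        if start is None:
            if energy > energy_threshold:
                start = i  # the speaker has started
            continue
        if energy > energy_threshold:
            silent_run = 0  # a short pause ended; the phrase continues
        else:
            silent_run += 1
            if silent_run >= pause_chunks:
                # Long enough pause: stop and drop the trailing silence.
                return start, i - silent_run + 1
    if start is None:
        return None
    # Audio ran out mid-phrase; still drop any trailing silence.
    return start, len(chunk_energies) - silent_run

# 0.1-second chunks: silence, speech, a short pause, speech, a long pause.
energies = [10] * 5 + [900] * 10 + [10] * 3 + [900] * 4 + [10] * 12 + [900] * 5
phrase = capture_phrase(energies, seconds_per_chunk=0.1)
print(phrase)
```

The 0.3-second pause in the middle is shorter than the pause threshold, so it stays inside the phrase; only the later, longer silence ends it.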
source: An audio source instance. Must be entered before listening.
timeout: Maximum number of seconds to wait for a phrase to start before giving up and raising a WaitTimeoutError exception. If None, there is no wait timeout.
phrase_time_limit: Maximum number of seconds that a phrase can continue before stopping and returning the part of the phrase processed before the time limit was reached. If None, there is no phrase time limit.
snowboy_configuration: Allows integration with Snowboy, an offline hotword recognition engine. Should be a tuple of (SNOWBOY_LOCATION, LIST_OF_HOT_WORD_FILES), or None to turn off Snowboy support.
stream: If True, yields AudioData instances representing chunks of audio data as they are detected. If False, returns a single AudioData instance representing the entire phrase.
listen_in_background()
Spawns a thread to repeatedly record phrases from source (an AudioSource instance) into an AudioData instance and call callback with that AudioData instance as soon as each phrase is detected.
Returns a function object that, when called, requests that the background listener thread stop. The background thread is a daemon and will not stop the program from exiting if there are no other non-daemon threads.
source: An audio source instance.
callback: A function that accepts two parameters: the Recognizer instance and an AudioData instance representing the captured audio. Note that callback will be called from a non-main thread.
phrase_time_limit: Maximum number of seconds that a phrase can continue. Works the same as in listen().
wait_for_stop: Parameter of the returned stop function. If truthy, that function waits for the background listener to stop before returning.
Example:
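The pattern described above (a daemon thread that invokes a callback, plus a returned function that stops it) can be sketched with the standard library alone. This is not the library's implementation and it captures no audio: listen_in_background_sketch is a hypothetical function, a queue of strings stands in for detected phrases, and a placeholder string stands in for the Recognizer instance.

```python
import queue
import threading

def listen_in_background_sketch(owner, phrases, callback):
    """Deliver items from `phrases` (a queue.Queue) to callback on a daemon thread."""
    stop_requested = threading.Event()

    def worker():
        while not stop_requested.is_set():
            try:
                # A short timeout keeps the loop responsive to stop requests.
                phrase = phrases.get(timeout=0.05)
            except queue.Empty:
                continue
            try:
                callback(owner, phrase)  # runs on the worker thread
            finally:
                phrases.task_done()

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()

    def stopper(wait_for_stop=True):
        stop_requested.set()
        if wait_for_stop:
            thread.join()

    return stopper

received = []
phrases = queue.Queue()
stop = listen_in_background_sketch(
    "recognizer placeholder", phrases,
    lambda owner, audio: received.append(audio))

phrases.put("first phrase")
phrases.put("second phrase")
phrases.join()            # wait until both callbacks have run
stop(wait_for_stop=True)  # request shutdown and wait for the thread to exit
print(received)
```

Because the callback runs on the worker thread, any state it shares with the main thread needs the usual thread-safety care; here a single worker appending to a list is enough.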