Overview
Performs speech recognition using the Wit.ai API. Wit.ai is a natural language processing platform that provides free speech recognition with support for custom intents and entities.
Method Signature
Parameters
- Audio data: The audio to recognize. Must be an AudioData instance.
- API key: Wit.ai API key (32-character uppercase alphanumeric string). See the setup instructions below for how to obtain an API key.
- show_all: If True, returns the raw API response as a JSON dictionary. If False, returns only the transcription text.
Returns
The recognized text when show_all=False.
When show_all=True, returns the raw API response containing:
- _text: The transcribed text
- intents: Detected intents with confidence scores
- entities: Extracted entities from the text
- traits: Detected traits
Exceptions
UnknownValueError: Raised when the speech is unintelligible or the API returns no transcription.
RequestError: Raised when:
- The API request fails
- The API key is invalid
- There is no internet connection
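The two failure modes above can be handled separately. The following is a minimal sketch, assuming this page documents recognize_wit from the Python SpeechRecognition package (imported as speech_recognition) and that both exception classes live in that module. The helper name transcribe_or_none is hypothetical.

```python
def transcribe_or_none(recognizer, audio, key):
    """Return the transcription, or None if recognition fails."""
    # Assumption: the SpeechRecognition package is installed. It is imported
    # here so the helper can be defined even where the package is missing.
    import speech_recognition as sr

    try:
        return recognizer.recognize_wit(audio, key=key)
    except sr.UnknownValueError:
        # The speech was unintelligible or no transcription came back.
        print("Wit.ai could not understand the audio")
    except sr.RequestError as error:
        # Request failure, invalid key, or no internet connection.
        print(f"Wit.ai request failed: {error}")
    return None
```

Returning None keeps the calling code simple. Callers that need to tell the two failures apart could re-raise or return a status value instead.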
Example Usage
Basic Recognition
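The original snippet for this section did not survive extraction. What follows is a minimal sketch, assuming the Python SpeechRecognition package (imported as speech_recognition) with PyAudio installed for microphone access. The helper name listen_and_transcribe and the placeholder key are hypothetical.

```python
def listen_and_transcribe(key):
    """Record one phrase from the default microphone and return its text."""
    # Assumption: SpeechRecognition and PyAudio are installed. The import is
    # local so the helper can be defined without them.
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        audio = recognizer.listen(source)  # blocks until a phrase is captured

    # With the default show_all=False only the transcription text is returned.
    return recognizer.recognize_wit(audio, key=key)
```

A call such as print(listen_and_transcribe("YOUR_WIT_AI_KEY")) would then print the recognized text.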
Getting Full Response with Intents
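The original snippet is missing here. As a sketch, the raw dictionary returned with show_all=True can be reduced to the transcription plus a ranked list of intents. It assumes the response fields listed under Returns and that each intent is a dictionary with name and confidence keys. The helper name describe_response is hypothetical.

```python
def describe_response(response):
    """Return the transcription and (intent name, confidence) pairs, best first."""
    text = response.get("_text", "")
    intents = sorted(
        response.get("intents") or [],
        key=lambda intent: intent.get("confidence", 0.0),
        reverse=True,
    )
    return text, [(i.get("name"), i.get("confidence", 0.0)) for i in intents]


# Typical call, assuming a Recognizer and an AudioData object already exist:
# response = recognizer.recognize_wit(audio, key="YOUR_WIT_AI_KEY", show_all=True)
# text, intents = describe_response(response)
```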
From Audio File
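The original snippet is missing here. A minimal sketch for transcribing a recording from disk follows, assuming the Python SpeechRecognition package and a file in a format its AudioFile class can read (such as WAV). The helper name transcribe_file is hypothetical.

```python
def transcribe_file(path, key):
    """Transcribe an audio file with Wit.ai and return the recognized text."""
    # Assumption: the SpeechRecognition package is installed. The import is
    # local so the helper can be defined without it.
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    with sr.AudioFile(path) as source:
        audio = recognizer.record(source)  # read the whole file into AudioData

    return recognizer.recognize_wit(audio, key=key)
```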
Using Environment Variables
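The original snippet is missing here. One way to keep the key out of source code is to read it from the environment, as sketched below. The variable name WIT_AI_KEY and the helper name load_wit_key are assumptions chosen for illustration.

```python
import os


def load_wit_key(env_var="WIT_AI_KEY"):
    """Read the Wit.ai key from an environment variable, failing loudly if unset."""
    key = os.environ.get(env_var)
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your Wit.ai Server Access Token"
        )
    return key
```

The returned value can then be passed as the key argument of the recognition call.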
Command Recognition
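The original snippet is missing here. A sketch of one possible approach maps the highest-confidence intent in a show_all=True response to a handler function. It assumes each intent is a dictionary with name and confidence keys. The handlers and the helper name dispatch_command are hypothetical.

```python
def dispatch_command(response, handlers):
    """Run the handler registered for the highest-confidence intent, if any."""
    intents = response.get("intents") or []
    if not intents:
        return None
    best = max(intents, key=lambda intent: intent.get("confidence", 0.0))
    handler = handlers.get(best.get("name"))
    if handler is None:
        return None  # no handler registered for this intent
    return handler(response.get("_text", ""))


# Hypothetical handlers keyed by intent name; each receives the transcription.
HANDLERS = {
    "turn_on_light": lambda text: "Turning on the light",
    "play_music": lambda text: "Playing music",
}
```

Keeping the dispatch table separate from the recognition call makes it easy to add commands without touching the audio code.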
Setup Instructions
1. Create Wit.ai Account
- Go to Wit.ai
- Click Sign Up or log in with GitHub/Facebook
- Complete the registration
2. Create an App
- After logging in, click New App
- Enter an app name
- Set the language (e.g., English)
- Choose visibility (private recommended)
- Click Create
3. Add at Least One Intent
Important: You must add at least one intent before you can access the API key.
- In your app, go to Utterances
- Type a sample phrase (e.g., “hello”)
- Create a new intent (e.g., “greeting”)
- Click Train and Validate
4. Get API Key
- Go to Settings (gear icon)
- Under API Details, find the Server Access Token
- Copy the token (32-character uppercase alphanumeric string)
5. Use in Code
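The original snippet for this step is missing. Before passing the copied token as the key argument, it can help to check that it has the format described in step 4 (32 uppercase alphanumeric characters), since a truncated paste is a common cause of request errors. This is a sketch only, and the helper name looks_like_wit_key is hypothetical.

```python
import re

# Format described in step 4: 32 uppercase letters or digits.
_KEY_PATTERN = re.compile(r"[A-Z0-9]{32}")


def looks_like_wit_key(token):
    """Return True if the token matches the documented key format."""
    return bool(_KEY_PATTERN.fullmatch(token))
```

This only checks the shape of the token. Whether the key is actually valid is decided by the API when the request is made.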
Language Support
The recognition language is configured in your Wit.ai app settings, not in the API call. Supported languages include:
- English
- Spanish
- French
- German
- Italian
- Portuguese
- Russian
- Turkish
- Dutch
- Polish
- And more…
To change the recognition language:
- Go to your app’s Settings
- Under App Details, change the Language
- Save changes
Understanding Intents and Entities
Intents
Intents represent what the user wants to do:
- turn_on_light
- set_timer
- play_music
- check_weather
Entities
Entities are data extracted from the text:
- Built-in entities: wit$datetime, wit$duration, wit$location, etc.
- Custom entities: You define these for your specific use case
Example Response
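The original sample response is missing here. The sketch below uses an illustrative response, written by hand for this example rather than captured from the API, that follows the fields listed under Returns. It assumes entities is a dictionary mapping each entity name to a list of matches, each carrying a value key. Real responses may be shaped differently, and the helper name entity_values is hypothetical.

```python
# Illustrative response only; real responses may contain other fields.
SAMPLE_RESPONSE = {
    "_text": "set a timer for 5 minutes",
    "intents": [{"name": "set_timer", "confidence": 0.98}],
    "entities": {
        "wit$duration": [{"value": 5, "unit": "minute", "confidence": 0.95}],
    },
    "traits": {},
}


def entity_values(response):
    """Map each entity name to the list of values extracted for it."""
    return {
        name: [match.get("value") for match in matches]
        for name, matches in response.get("entities", {}).items()
    }
```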
Best Practices
- Train Your App: Add diverse training examples for better accuracy
- Use Intents: Design intents for your specific use case
- Check Confidence: Filter results by confidence threshold (e.g., > 0.8)
- Handle Errors: Always catch UnknownValueError and RequestError
- Rate Limits: Free tier has rate limits; monitor your usage
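The confidence check mentioned above can be sketched as a small filter over a show_all=True response. It assumes each intent is a dictionary with a confidence key. The default threshold of 0.8 mirrors the example in the list, and the helper name confident_intents is hypothetical.

```python
def confident_intents(response, threshold=0.8):
    """Return the intents whose confidence is strictly above the threshold."""
    return [
        intent
        for intent in response.get("intents") or []
        if intent.get("confidence", 0.0) > threshold
    ]
```

An empty result is a reasonable signal to ask the user to repeat the command rather than act on a weak guess.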
Notes
- Free to use with reasonable rate limits
- Recognition language is set in app settings, not per-request
- Audio must have a sample rate of at least 8 kHz
- Audio is automatically converted to 16-bit samples
- Supports custom natural language understanding
- Best for voice commands and conversational interfaces