Enabling voice mode
To run vimGPT with voice input, run python main.py --voice. The --voice flag enables voice input mode (main.py:44-46).
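As a rough illustration of how a boolean flag like --voice is typically wired up, here is a minimal argparse sketch. The function names build_parser and parse_voice_flag are hypothetical and are not taken from main.py; only the --voice flag itself comes from this page.

```python
import argparse


def build_parser():
    """Build a parser with a single optional --voice switch (hypothetical helper)."""
    parser = argparse.ArgumentParser(description="Run vimGPT")
    # store_true makes the flag default to False and become True when present.
    parser.add_argument(
        "--voice",
        action="store_true",
        help="Take the objective from the microphone instead of the terminal",
    )
    return parser


def parse_voice_flag(argv=None):
    """Return True when --voice is present in argv (defaults to sys.argv[1:])."""
    return build_parser().parse_args(argv).voice
```

Calling parse_voice_flag(["--voice"]) yields True, and parse_voice_flag([]) yields False, which corresponds to the default text mode.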
How it works
When voice mode is enabled:
- Initialization: vimGPT starts the browser and navigates to Google
- Voice capture: The system listens for your voice command (main.py:17-25)
- Transcription: Whisper converts your speech to text
- Execution: vimGPT uses the transcribed text as the objective
- Autonomous browsing: The agent performs actions until the task is complete
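The steps above can be sketched as a small driver function. This is an illustration under stated assumptions, not vimGPT's actual code: run_session and its parameters (open_browser, listen, read_text, step, max_steps) are hypothetical names, and the "done" sentinel is an assumed way of signalling that the task is complete.

```python
def run_session(voice_mode, open_browser, listen, read_text, step, max_steps=50):
    """Run one session: start the browser, obtain an objective, act until done.

    All callables are injected so the flow can be exercised without a
    browser or microphone (hypothetical design, not taken from main.py).
    """
    open_browser()  # Initialization: start the browser and load the start page.

    # Voice capture and transcription in voice mode, typed input otherwise.
    objective = listen() if voice_mode else read_text()

    # Autonomous browsing: keep asking for the next action until "done".
    history = []
    for _ in range(max_steps):
        action = step(objective)
        history.append(action)
        if action == "done":  # assumed completion signal
            break
    return objective, history
```

The max_steps cap is a defensive choice in this sketch so that a step function that never reports completion cannot loop forever.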
Voice input workflow
Here’s what you’ll see when running in voice mode:
Implementation details
Voice mode uses the whisper-mic library for audio capture and transcription.
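A minimal sketch of capturing an objective follows. It assumes that the package is importable as whisper_mic and exposes a WhisperMic class with a listen() method that returns the transcribed text, as shown in the library's README. The helper name listen_for_objective, the optional mic parameter, and the wording of the error message are hypothetical.

```python
def listen_for_objective(mic=None):
    """Return the transcribed objective, or None if voice capture fails.

    `mic` is any object with a listen() method; when omitted, a WhisperMic
    instance is created (assumption: whisper-mic is installed).
    """
    if mic is None:
        # Imported lazily so the function can be tested without whisper-mic.
        from whisper_mic import WhisperMic  # assumed import path

        mic = WhisperMic()
    try:
        return mic.listen()
    except Exception as exc:
        # Report the failure and let the caller decide to exit.
        print(f"Error in capturing voice input: {exc}")
        return None
```

A caller would typically stop the run when the function returns None, which matches the behavior described under error handling.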
Requirements
Voice mode requires the whisper-mic package, which is included in the requirements (requirements.txt:20).
The whisper-mic library handles microphone access and Whisper model integration automatically. You don’t need to configure the Whisper model separately.
Error handling
If voice capture fails, vimGPT will display an error message and exit.
Tips for voice mode
Comparison with text mode
| Feature | Text mode | Voice mode |
|---|---|---|
| Objective input | Type in terminal | Speak into microphone |
| Command | python main.py | python main.py --voice |
| Best for | Quick testing, scripting | Hands-free operation, accessibility |
| Requirements | None (default) | whisper-mic package |
Example voice commands
Here are some example voice commands to try:
- “Search YouTube for GPT-4 tutorials”
- “Find news articles about climate change”
- “Navigate to GitHub and search for AI projects”
- “Look up Italian restaurants near me”
- “Find the Python documentation”
Accessibility benefits
Voice mode makes vimGPT more accessible by:
- Enabling hands-free browsing: Control the browser without typing
- Supporting voice-first workflows: Integrate with voice-based systems
- Reducing physical interaction: Helpful for users with mobility limitations
Switching between modes
You can switch between text and voice modes on each run by adding or omitting the --voice flag.
Troubleshooting
Microphone not detected
If your microphone isn’t detected:
- Check system audio settings
- Verify microphone permissions for Python
- Test the microphone with other applications
- Ensure whisper-mic is properly installed
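One quick way to confirm the last item is to ask Python whether the module can be found in the current environment. This sketch assumes the package's import name is whisper_mic; the helper is_installed is a hypothetical name used only for illustration.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


def whisper_mic_status():
    """Describe whether whisper_mic (assumed import name) is available."""
    if is_installed("whisper_mic"):
        return "whisper-mic is installed"
    return "whisper-mic is missing; reinstall the project requirements"
```

Run the check with the same interpreter and virtual environment you use to start vimGPT, since packages installed elsewhere will not be visible.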
Poor transcription accuracy
- Speak more slowly and clearly
- Reduce background noise
- Use a better quality microphone
- Check microphone positioning and distance
Voice capture timeout
If the system doesn’t capture your voice:
- Check if you need to start speaking immediately
- Verify the microphone is active and unmuted
- Look at whisper-mic documentation for timeout settings
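If the library's own timeout settings do not fit your case, a generic guard around any blocking listen call is one possible fallback. The sketch below is not part of vimGPT or whisper-mic; listen_with_timeout is a hypothetical helper that runs the call in a background thread and gives up after a fixed number of seconds.

```python
import threading


def listen_with_timeout(listen, seconds):
    """Call listen() in a daemon thread; return its result or None on timeout.

    Exceptions raised by listen() are re-raised in the calling thread.
    """
    result = {}

    def worker():
        try:
            result["text"] = listen()
        except Exception as exc:  # captured here, re-raised by the caller below
            result["error"] = exc

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    thread.join(seconds)  # wait at most `seconds` for the capture to finish

    if thread.is_alive():
        return None  # still listening: treat as a timeout
    if "error" in result:
        raise result["error"]
    return result["text"]
```

Note that the background thread is not cancelled on timeout; it is only abandoned, so this approach suits a short-lived command-line run rather than a long-running service.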