Overview
The voice assistant lets you:
Talk to Your Agent
Ask questions, give instructions, and request code changes hands-free
Approve by Voice
Say “yes” or “no” to approve or deny permission requests
Monitor Progress
Receive spoken updates when tasks complete or errors occur
Prerequisites
You need an ElevenLabs account with API access.
ElevenLabs offers a free tier with limited usage. Paid plans provide more minutes and better voice quality.
Setup
1. Get an API Key
Sign Up
Sign up or log in at elevenlabs.io
API Keys
Go to API Keys in your account settings
2. Configure the Hub
Set the ELEVENLABS_API_KEY environment variable before starting the hub.
3. (Optional) Use Custom Agent
If you want to use your own ElevenLabs agent instead of the auto-created one, set ELEVENLABS_AGENT_ID to that agent's ID.
Configuration
| Variable | Required | Description |
|---|---|---|
| ELEVENLABS_API_KEY | Yes | ElevenLabs API key for voice assistant |
| ELEVENLABS_AGENT_ID | No | Custom agent ID (auto-created if not set) |
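As a rough illustration of how the two variables relate, the sketch below validates an environment map the way the table describes: the API key is required, the agent ID is optional. The `VoiceConfig` type and `readVoiceConfig` function are hypothetical names used only for this example and are not part of the hub's actual code.

```typescript
// Hypothetical config shape; only the two variable names come from the table.
interface VoiceConfig {
  apiKey: string;
  agentId?: string; // left undefined so an agent can be auto-created
}

type Env = Record<string, string | undefined>;

// Reads the voice settings from an environment-like map.
// Throws when the required key is missing or blank.
function readVoiceConfig(env: Env): VoiceConfig {
  const apiKey = env["ELEVENLABS_API_KEY"]?.trim();
  if (!apiKey) {
    throw new Error("ElevenLabs API key not configured");
  }
  const agentId = env["ELEVENLABS_AGENT_ID"]?.trim();
  return agentId ? { apiKey, agentId } : { apiKey };
}
```

In a Node-based process this could be called as `readVoiceConfig(process.env)`; passing the map in explicitly keeps the function easy to test.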
Usage
Starting a Voice Session
The voice assistant is only available in the web app, not in the Telegram Mini App or the terminal.
Voice Commands
The assistant understands natural language. Here are common patterns:
| Say This | What Happens |
|---|---|
| “Ask Claude to…” / “Have it…” | Sends your request to the coding agent |
| “Refactor the auth module” | Coding requests are forwarded automatically |
| “Yes” / “Allow” / “Go ahead” | Approves pending permission requests |
| “No” / “Deny” / “Cancel” | Denies pending permission requests |
| Direct questions | The voice assistant answers itself if it can |
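The table can be read as a small decision procedure. The sketch below is a deliberately simplified, keyword-based illustration of it; the real assistant interprets natural language, so this is not how it is implemented. All names here (`Intent`, `classify`, the phrase lists) are assumptions made for the example, and anything the keywords do not cover is returned as `"undecided"`.

```typescript
type Intent = "approve" | "deny" | "forward" | "undecided";

// Phrases taken from the table; a real assistant accepts far more variation.
const APPROVE_PHRASES = ["yes", "allow", "go ahead"];
const DENY_PHRASES = ["no", "deny", "cancel"];
const FORWARD_PREFIXES = ["ask claude to", "have it", "tell it to"];

// Lowercases and strips punctuation so "Yes, go ahead" becomes "yes go ahead".
function normalize(utterance: string): string {
  return utterance.toLowerCase().replace(/[^a-z\s]/g, " ").replace(/\s+/g, " ").trim();
}

// True when text is the phrase itself or begins with it as whole words.
function startsWithPhrase(text: string, phrase: string): boolean {
  return text === phrase || text.startsWith(phrase + " ");
}

function classify(utterance: string, hasPendingPermission: boolean): Intent {
  const text = normalize(utterance);
  // Approvals and denials only make sense while a request is pending.
  if (hasPendingPermission) {
    if (APPROVE_PHRASES.some((p) => startsWithPhrase(text, p))) return "approve";
    if (DENY_PHRASES.some((p) => startsWithPhrase(text, p))) return "deny";
  }
  if (FORWARD_PREFIXES.some((p) => startsWithPhrase(text, p))) return "forward";
  return "undecided"; // e.g. direct questions or implicit coding requests
}
```

Checking for a pending permission first keeps a stray “yes” from being treated as an approval when nothing is waiting for one.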
How It Works
Context Synchronization
The voice assistant automatically receives updates when:
- You focus on a session (full history is loaded)
- The agent sends messages or uses tools
- Permission requests arrive
- Tasks complete
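One way to picture these updates is as a stream of typed events that are turned into text for the assistant's context. The sketch below assumes hypothetical event shapes (`ContextEvent`) and a hypothetical formatter (`toContextUpdate`); only the four triggers themselves come from the list above.

```typescript
// Hypothetical event shapes, one per trigger in the list.
type ContextEvent =
  | { kind: "session-focused"; sessionId: string; history: string[] }
  | { kind: "agent-activity"; text: string }
  | { kind: "permission-request"; description: string }
  | { kind: "task-complete"; summary: string };

// Converts an event into a plain-text update for the assistant's context.
function toContextUpdate(event: ContextEvent): string {
  switch (event.kind) {
    case "session-focused":
      // Focusing a session sends its full history, not just the latest message.
      return [`Now focused on session ${event.sessionId}.`, ...event.history].join("\n");
    case "agent-activity":
      return `Agent: ${event.text}`;
    case "permission-request":
      return `Permission requested: ${event.description}`;
    case "task-complete":
      return `Task complete: ${event.summary}`;
  }
}
```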
Tools
The voice assistant has two tools to interact with your coding agent:
messageCodingAgent
Forwards your requests to the active agent
processPermissionRequest
Handles permission approvals and denials
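To make the division of labor concrete, here is a minimal sketch of how calls to the two tools might be routed. The tool names come from this page; the parameter names (`message`, `decision`), the `CodingAgent` interface, and `routeToolCall` are assumptions for illustration and may not match the real payloads.

```typescript
// Tool names are from the docs; the payload fields are assumed.
type ToolCall =
  | { tool: "messageCodingAgent"; message: string }
  | { tool: "processPermissionRequest"; decision: "allow" | "deny" };

// Hypothetical view of the active coding agent.
interface CodingAgent {
  sendMessage(message: string): void;
  resolvePermission(allowed: boolean): void;
}

// Routes a tool call from the voice assistant to the active agent.
function routeToolCall(call: ToolCall, agent: CodingAgent): void {
  switch (call.tool) {
    case "messageCodingAgent":
      agent.sendMessage(call.message);
      break;
    case "processPermissionRequest":
      agent.resolvePermission(call.decision === "allow");
      break;
  }
}
```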
Architecture
- Browser captures audio via WebRTC
- ElevenLabs handles speech-to-text and text-to-speech
- Voice Assistant interprets intent and calls tools
- HAPI Hub routes tool calls to the coding agent
- Coding Agent executes the request
The voice connection uses WebRTC for low-latency audio streaming. The HAPI hub provides conversation tokens and handles authentication.
Use Cases
Hands-Free Coding
Code while your hands are busy.
Quick Permission Approvals
Approve permissions while away from the keyboard.
Progress Monitoring
Get spoken updates without looking at the screen.
Tips for Best Results
Be Specific
Clear, complete requests get better results:
- ✅ “Refactor the user authentication module to use JWT tokens”
- ❌ “Fix that thing”
Wait for Completion
The assistant stays silent while the agent works, then summarizes results. Don’t interrupt while processing.
Use Natural Language
No special command syntax needed:
- ✅ “Can you have Claude add error handling to the API?”
- ✅ “Tell it to fix the bug in utils.ts”
- ✅ “Yes, go ahead”
Keep Sessions Focused
Use one active session at a time for clearest context. The assistant tracks the currently focused session.
Audio Quality
For the best audio experience:
- Use a headset to avoid echo
- Reduce background noise for better recognition
- Use a stable internet connection for low-latency streaming
- Use Chrome or Edge (best WebRTC support)
Troubleshooting
"ElevenLabs API key not configured"
Solution: Set ELEVENLABS_API_KEY in your environment and restart the hub.
"Failed to get microphone permission"
Try the following:
- Check browser permissions for microphone access
- Ensure no other app is using the microphone
- Try refreshing the page
- Serve the page over HTTPS (some browsers block microphone access on HTTP)
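When debugging this in the browser console, the error name thrown by `navigator.mediaDevices.getUserMedia` usually points at which of the checks above applies. The helper below is a sketch, not part of HAPI: `explainMicError` is a hypothetical function, while the error names it matches are standard `DOMException` names for microphone requests.

```typescript
// Maps a getUserMedia failure to the most likely fix from the checklist.
// errorName is the DOMException name; isSecureContext mirrors window.isSecureContext.
function explainMicError(errorName: string, isSecureContext: boolean): string {
  if (!isSecureContext) {
    // On insecure pages the media API may be unavailable altogether.
    return "Serve the page over HTTPS";
  }
  switch (errorName) {
    case "NotAllowedError":
      return "Allow microphone access in the browser's site permissions";
    case "NotFoundError":
      return "No microphone was found; connect one and retry";
    case "NotReadableError":
      return "The microphone may be in use by another app";
    default:
      return "Refresh the page and try again";
  }
}

// Browser usage (commented out so the sketch also compiles outside a browser):
// try {
//   await navigator.mediaDevices.getUserMedia({ audio: true });
// } catch (err) {
//   console.warn(explainMicError((err as DOMException).name, window.isSecureContext));
// }
```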
Voice Not Responding
Check these:
- Verify the session is connected (green dot in status bar)
- Check that the voice status shows “connecting” or a connected state
- Ensure you have a stable internet connection
- Look for errors in the browser console (F12)
"Failed to create ElevenLabs agent automatically"
Solutions:
- Verify your API key is valid
- Check that your ElevenLabs account has available quota
- Try setting a custom ELEVENLABS_AGENT_ID from your dashboard
Poor Audio Quality
Improvements:
- Use a headset to avoid echo
- Reduce background noise
- Check your internet connection stability
- Upgrade to a paid ElevenLabs plan for better voice quality
Assistant Misunderstands Commands
Tips:
- Speak clearly and at a moderate pace
- Use complete sentences
- Be explicit: say “Ask Claude to…” rather than just describing the task
- Check that the session has an active agent
Limitations
Browser Support
| Browser | Platform | Support |
|---|---|---|
| Chrome | Desktop/Android | ✅ Full support |
| Edge | Desktop/Android | ✅ Full support |
| Safari | macOS/iOS | ⚠️ Limited WebRTC support |
| Firefox | All | ⚠️ Partial support |
Privacy & Security
Audio Processing
- Audio is streamed to ElevenLabs via WebRTC
- ElevenLabs processes speech-to-text
- Transcripts are sent to the voice assistant
- Voice responses are generated by ElevenLabs
Data Handling
- Audio is not stored by HAPI
- Transcripts are logged for debugging (can be disabled)
- ElevenLabs has its own data retention policies
- Check the ElevenLabs Privacy Policy for details
Related Features
Remote Control
Control sessions from anywhere
Permissions
Approve agent actions
PWA
Install HAPI on your phone