What It Does
The Add Voice Transcription skill:
- Detects WhatsApp voice messages
- Downloads the audio files automatically
- Transcribes them using the OpenAI Whisper API
- Delivers the transcript to the agent as [Voice: <transcript>]
- Falls back gracefully if the API key is missing or transcription fails
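The delivery and fallback behavior listed above can be sketched as a small pure function. This is a minimal illustration, not the skill's actual code: the TranscriptionOutcome type and the renderVoiceContent name are hypothetical, and only the three output strings are taken from this page.

```typescript
// Hypothetical outcome type; the real src/transcription.ts may model this differently.
type TranscriptionOutcome =
  | { kind: "ok"; transcript: string }
  | { kind: "unavailable" } // e.g. OPENAI_API_KEY is not configured
  | { kind: "failed"; reason: string }; // e.g. a network or API error

// Turn a transcription outcome into the text the agent receives.
function renderVoiceContent(outcome: TranscriptionOutcome): string {
  switch (outcome.kind) {
    case "ok":
      return `[Voice: ${outcome.transcript.trim()}]`;
    case "unavailable":
      return "[Voice Message - transcription unavailable]";
    case "failed":
      return "[Voice Message - transcription failed]";
  }
}
```

Keeping the formatting separate from the network call means a failed or skipped transcription still produces a message, so the agent is told a voice note arrived even when its text is unknown.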
Prerequisites
- NanoClaw with WhatsApp channel installed
- OpenAI API key with Whisper access
- Funded OpenAI account (Whisper requires credits)
How to Apply
Get OpenAI API key
If you don’t have one:
- Go to https://platform.openai.com/api-keys
- Click “Create new secret key”
- Name it (e.g., “NanoClaw Transcription”)
- Copy the key (starts with sk-)
Apply code changes
The skill runs npx tsx scripts/apply-skill.ts .claude/skills/add-voice-transcription, which:
- Adds the src/transcription.ts module
- Merges voice handling into the WhatsApp channel
- Adds transcription tests
- Installs the openai dependency
What Changes
Files Created
- src/transcription.ts: Voice transcription module using OpenAI Whisper
Files Modified
- src/channels/whatsapp.ts: Adds voice message detection and transcription
- src/channels/whatsapp.test.ts: Adds 3 transcription test cases
- package.json: Adds the openai dependency
- .env: Adds OPENAI_API_KEY
- .env.example: Documents OPENAI_API_KEY
- data/env/env: Synced environment for the container
- .nanoclaw/state.yaml: Records skill application
Dependencies Added
- openai: OpenAI API client for Whisper transcription
Usage
Send Voice Note
Simply send a voice message in any registered WhatsApp chat.
Transcription Format
The agent receives the transcript in the form [Voice: <transcript>].
Whisper supports 50+ languages and automatically detects the language. No configuration is needed for multilingual transcription.
Troubleshooting
Voice Notes Show “[Voice Message - transcription unavailable]”
- Check that OPENAI_API_KEY is set in .env AND synced to data/env/env
- Verify that the key works
- Check OpenAI billing: Whisper requires a funded account
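The first check above (key set in .env and synced to data/env/env) can be automated with a short script. The sketch below is an assumption-laden illustration: readEnvValue and checkApiKeySync are hypothetical helpers, and the parser handles only simple KEY=value lines, which may be narrower than what NanoClaw's own loader accepts.

```typescript
// Extract a variable's value from dotenv-style text (KEY=value lines).
// Simplified parser: skips comments and blank lines, strips one pair of quotes.
function readEnvValue(contents: string, name: string): string | undefined {
  for (const rawLine of contents.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "" || line.startsWith("#")) continue;
    const eq = line.indexOf("=");
    if (eq === -1) continue;
    const key = line.slice(0, eq).trim().replace(/^export\s+/, "");
    if (key !== name) continue;
    const value = line.slice(eq + 1).trim().replace(/^(['"])(.*)\1$/, "$2");
    return value === "" ? undefined : value;
  }
  return undefined;
}

type KeyStatus = "ok" | "missing-in-env" | "missing-in-container-env" | "out-of-sync";

// Compare OPENAI_API_KEY in the two files named on this page.
// Arguments are the file contents, so the check itself does no I/O.
function checkApiKeySync(envFile: string, containerEnvFile: string): KeyStatus {
  const host = readEnvValue(envFile, "OPENAI_API_KEY");
  const container = readEnvValue(containerEnvFile, "OPENAI_API_KEY");
  if (host === undefined) return "missing-in-env";
  if (container === undefined) return "missing-in-container-env";
  return host === container ? "ok" : "out-of-sync";
}
```

To run it against a real checkout, read both files with readFileSync from node:fs (for example readFileSync(".env", "utf8") and readFileSync("data/env/env", "utf8")) and pass the contents in. A result of "ok" only shows the two files agree; it does not prove that OpenAI accepts the key.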
Voice Notes Show “[Voice Message - transcription failed]”
Check the logs for the specific error:
- Network timeout (transient; the next message should work)
- Invalid API key (regenerate at platform.openai.com/api-keys)
- Rate limiting (wait and retry)
- Insufficient credits (add funds to OpenAI account)
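The four causes above can be told apart from the failure details in the logs. The following sketch maps a failure to the matching remedy; the ApiFailure shape and the suggestFix name are hypothetical, and the mapping assumes the OpenAI API's usual conventions (HTTP 401 for a bad key, HTTP 429 for both rate limits and exhausted quota, with the error code insufficient_quota for the latter). What the skill actually logs may look different.

```typescript
// Hypothetical failure shape: HTTP status plus the optional error code
// from the API response body. status is undefined if no response arrived.
interface ApiFailure {
  status?: number;
  code?: string;
}

// Map a failure to the remedy suggested in the troubleshooting list.
function suggestFix(failure: ApiFailure): string {
  if (failure.status === undefined) {
    return "Network timeout: transient, retry on the next message";
  }
  if (failure.status === 401) {
    return "Invalid API key: regenerate at platform.openai.com/api-keys";
  }
  if (failure.status === 429 && failure.code === "insufficient_quota") {
    return "Insufficient credits: add funds to the OpenAI account";
  }
  if (failure.status === 429) {
    return "Rate limiting: wait and retry";
  }
  return `Unexpected error (HTTP ${failure.status}): check the logs`;
}
```

The quota check comes before the generic 429 branch because both conditions share the same status code and only the error code distinguishes them.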
Agent Doesn’t Respond to Voice Notes
Verify:
- Chat is registered in the database
- Agent is running
- WhatsApp channel is connected
- Transcription succeeded (check logs for “Transcribed voice message”)
High Costs
Whisper pricing is $0.006 per minute. To reduce costs:
- Use only for registered chats (automatically limited)
- Consider local Whisper via the /use-local-whisper skill (free but slower)
- Monitor usage at platform.openai.com/usage
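To judge whether costs are likely to matter, the quoted rate can be turned into a rough estimate. This sketch assumes billing is simply proportional to audio duration at $0.006 per minute; the function name and the example volumes are illustrative, not measurements.

```typescript
const WHISPER_USD_PER_MINUTE = 0.006; // rate quoted on this page

// Estimate the transcription cost in USD for a total audio duration in seconds.
function estimateWhisperCostUsd(totalSeconds: number): number {
  if (!Number.isFinite(totalSeconds) || totalSeconds < 0) {
    throw new RangeError("totalSeconds must be a non-negative finite number");
  }
  return (totalSeconds / 60) * WHISPER_USD_PER_MINUTE;
}

// Example: 40 voice notes per day, 30 seconds each, over 30 days.
const monthlySeconds = 40 * 30 * 30; // 36,000 seconds = 600 minutes
console.log(estimateWhisperCostUsd(monthlySeconds).toFixed(2)); // prints "3.60"
```

At that volume the monthly bill stays under four dollars, so a noticeably higher figure on the usage page suggests long recordings or more traffic than expected.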