Overview
The Unmute API provides endpoints for voice cloning and voice donation. Voice cloning allows you to create custom voices from audio samples, while voice donation enables users to contribute their voices to improve the service.
Get Available Voices
Retrieve a list of pre-configured voices available for use.
Response
Returns an array of voice objects. Each voice object contains:
- The unique identifier for the voice
- Information about the voice source, including path and description
- Optional instructions for using this voice
Response Example
Clone a Voice
Upload an audio file to create a custom voice clone.
Request
Audio file containing the voice sample to clone. Supported formats include WAV, MP3, and other common audio formats.
File Size Limits
- Maximum file size is configurable (default: defined in MAX_VOICE_FILE_SIZE_MB)
- Files exceeding the limit will return a 413 Request Entity Too Large error
- Minimum recommended duration: 10-30 seconds of clear speech
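A client can avoid a wasted upload by checking the file size locally before sending it. The following is a minimal sketch, not the service's own check: the helper name `exceeds_size_limit` is hypothetical, the limit must be supplied by the caller because the value of MAX_VOICE_FILE_SIZE_MB is not stated here, and treating 1 MB as 1024 * 1024 bytes is an assumption.

```python
# Assumption: 1 MB = 1024 * 1024 bytes. The service may count 1,000,000 bytes instead.
BYTES_PER_MB = 1024 * 1024


def exceeds_size_limit(size_bytes: int, max_mb: float) -> bool:
    """Return True if a file of size_bytes is larger than max_mb megabytes.

    max_mb should mirror the server's MAX_VOICE_FILE_SIZE_MB setting;
    its actual value is deployment-specific and not assumed here.
    """
    if size_bytes < 0:
        raise ValueError("size_bytes must be non-negative")
    return size_bytes > max_mb * BYTES_PER_MB
```

In practice the size would come from something like `os.path.getsize(path)`. A local check only saves bandwidth; the server's 413 response remains the authoritative result.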
Response
The unique identifier for the cloned voice, in the format custom:<uuid>. Use this name when making TTS requests.
Response Example
Status Codes
Voice successfully cloned
Content-Length header is required
Request entity too large - file exceeds maximum size limit
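The custom:<uuid> name returned for a cloned voice can be sanity-checked on the client before it is passed to a TTS request. This is a sketch only: the helper name `parse_custom_voice_name` is hypothetical, and it assumes the part after the prefix is a standard UUID string that Python's `uuid` module can parse.

```python
import uuid

CUSTOM_PREFIX = "custom:"


def parse_custom_voice_name(name: str) -> uuid.UUID:
    """Return the UUID embedded in a 'custom:<uuid>' voice name.

    Raises ValueError if the prefix is missing or the remainder is not a UUID.
    Note that uuid.UUID is lenient and also accepts some non-canonical
    spellings, such as a UUID written without hyphens.
    """
    if not name.startswith(CUSTOM_PREFIX):
        raise ValueError(f"expected a name starting with {CUSTOM_PREFIX!r}: {name!r}")
    return uuid.UUID(name[len(CUSTOM_PREFIX):])
```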
Voice Donation
Contribute your voice to help improve the Unmute service.
Step 1: Request Verification
Initiate a voice donation by requesting a verification text to read.
Response
Unique identifier for this verification request. Must be included when submitting the voice donation.
The verification text you must read aloud. Always begins with: “I consent to my voice being used for voice cloning.” followed by two randomly selected sentences.
Unix timestamp (seconds since epoch) when the verification was created. Verifications expire after 5 minutes.
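Because a verification is only usable for 5 minutes, a client may want to check how old it is before the user starts recording. The sketch below uses hypothetical helper names and treats a verification as expired once it is strictly more than 300 seconds old; how the service handles the exact boundary is an assumption.

```python
import time
from typing import Optional

# Consent sentence quoted from the description of the verification text.
CONSENT_PREFIX = "I consent to my voice being used for voice cloning."
VERIFICATION_TTL_SECONDS = 5 * 60


def is_verification_expired(created_at: float, now: Optional[float] = None) -> bool:
    """Return True if the verification is more than 5 minutes old.

    created_at is the Unix timestamp (in seconds) from the verification response.
    """
    if now is None:
        now = time.time()
    return now - created_at > VERIFICATION_TTL_SECONDS


def starts_with_consent(text: str) -> bool:
    """Return True if the verification text begins with the consent sentence."""
    return text.startswith(CONSENT_PREFIX)
```

The client clock may differ from the server clock, so this check is advisory. An expired verification is still rejected by the server at submission time.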
Response Example
Step 2: Submit Voice Donation
Submit your voice recording along with the required metadata.
Request
Audio file containing your recording of the verification text. Must be at least 0.1 MB and no more than the configured maximum size.
JSON string containing the submission metadata with the following fields:
Metadata Fields
Your email address. Kept private and only used if you want to withdraw your donation. Not published.
Public nickname to associate with your voice donation. Maximum 30 characters.
The verification ID received from the GET /v1/voice-donation endpoint.
License for the voice donation. Currently only “CC0” (Creative Commons Zero) is accepted.
Metadata format version. Defaults to “1.1”.
Optional transcription of what you recorded. Note: this is sent by the client and could be manipulated.
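The metadata is sent as a JSON string, so a client might assemble it with a small helper such as the one sketched below. The JSON key names (`email`, `nickname`, `verification_id`, `license`, `format_version`, `transcription`) and the function name are hypothetical placeholders, since the actual field names are not listed in this section. Only the constraints (30-character nickname, "CC0" license, "1.1" default version) come from the descriptions above.

```python
import json
from typing import Optional

MAX_NICKNAME_LENGTH = 30
ACCEPTED_LICENSE = "CC0"
DEFAULT_FORMAT_VERSION = "1.1"


def build_metadata(
    email: str,
    nickname: str,
    verification_id: str,
    transcription: Optional[str] = None,
    license_name: str = ACCEPTED_LICENSE,
    format_version: str = DEFAULT_FORMAT_VERSION,
) -> str:
    """Return the submission metadata serialized as a JSON string."""
    if len(nickname) > MAX_NICKNAME_LENGTH:
        raise ValueError("nickname must be at most 30 characters")
    if license_name != ACCEPTED_LICENSE:
        raise ValueError('only the "CC0" license is currently accepted')
    # Hypothetical key names: replace them with the real field names of the API.
    payload = {
        "email": email,
        "nickname": nickname,
        "verification_id": verification_id,
        "license": license_name,
        "format_version": format_version,
    }
    if transcription is not None:
        payload["transcription"] = transcription
    return json.dumps(payload)
```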
Response
Status Codes
Voice donation successfully submitted
Invalid request. Possible reasons:
- Invalid JSON in metadata field
- Invalid submission data (validation error)
- Audio file too small (< 0.1 MB)
- Audio file too large (> maximum size)
- Nickname too long (> 30 characters)
- Verification ID not found or expired (> 5 minutes old)
Content-Length header is required
Request entity too large - file exceeds maximum size limit
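Several of the rejection reasons listed above can be detected on the client before anything is uploaded. The sketch below collects them into a list of messages. The function name and message strings are hypothetical, the maximum size must be supplied by the caller, and treating 1 MB as 1024 * 1024 bytes is an assumption.

```python
from typing import List

BYTES_PER_MB = 1024 * 1024  # assumption: the service may use 1,000,000 bytes per MB
MIN_AUDIO_MB = 0.1
MAX_NICKNAME_LENGTH = 30
VERIFICATION_TTL_SECONDS = 5 * 60


def preflight_problems(
    audio_size_bytes: int,
    max_mb: float,
    nickname: str,
    verification_created_at: float,
    now: float,
) -> List[str]:
    """Return the documented rejection reasons that apply; empty if none do."""
    problems = []
    if audio_size_bytes < MIN_AUDIO_MB * BYTES_PER_MB:
        problems.append("audio file too small")
    if audio_size_bytes > max_mb * BYTES_PER_MB:
        problems.append("audio file too large")
    if len(nickname) > MAX_NICKNAME_LENGTH:
        problems.append("nickname too long")
    if now - verification_created_at > VERIFICATION_TTL_SECONDS:
        problems.append("verification expired")
    return problems
```

An empty list does not guarantee acceptance: the server also validates the metadata JSON and checks that the verification ID exists.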
Error Response Example
CORS
All endpoints support CORS for the following origins:
- http://localhost
- http://localhost:3000
Notes
- Voice cloning embeddings are cached for 1 hour
- Verification requests stay in the cache for 1 hour, but a submission is only accepted within 5 minutes of the verification being created
- All audio processing is handled asynchronously to prevent blocking
- Voice donations are saved with both audio and metadata files for future processing