Overview
The Generations API allows you to create TTS audio from text and retrieve generation history. All operations require an active subscription.
Create Generation
Generate speech audio from text using a voice clone.
Input Schema
Text to convert to speech. Must be between 1 and 5,000 characters.
ID of the voice to use (either a SYSTEM or a CUSTOM voice)
Controls randomness in generation. Range: 0.0 to 2.0
- Lower values (0.0-0.5): More consistent, predictable output
- Default (0.8): Balanced naturalness
- Higher values (1.0-2.0): More varied, expressive output
Nucleus sampling threshold. Range: 0.0 to 1.0
Controls diversity by sampling only from the smallest set of most likely tokens whose cumulative probability reaches the threshold.
Number of top tokens to consider. Range: 1 to 10,000
Limits the sampling pool to the K most likely tokens.
Penalty for repeating tokens. Range: 1.0 to 2.0
- 1.0: No penalty (may lead to repetition)
- 1.2: Default (balanced)
- 2.0: Strong penalty (avoids repetition)
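The documented ranges can be checked on the client before a request is sent. The following is a minimal sketch, not the API's own validation: the field names (`text`, `voiceId`, `temperature`, `topP`, `topK`, `repetitionPenalty`) are assumptions made for illustration, and only the ranges and the two defaults (0.8 and 1.2) come from the schema above.

```typescript
// Hypothetical field names; ranges and defaults are taken from the schema above.
interface GenerationInput {
  text: string;
  voiceId: string;
  temperature?: number;       // 0.0 to 2.0, default 0.8
  topP?: number;              // 0.0 to 1.0
  topK?: number;              // 1 to 10,000
  repetitionPenalty?: number; // 1.0 to 2.0, default 1.2
}

function inRange(value: number, min: number, max: number): boolean {
  return Number.isFinite(value) && value >= min && value <= max;
}

// Returns a list of problems; an empty list means the input passed every check.
function validateGenerationInput(input: GenerationInput): string[] {
  const errors: string[] = [];
  // Note: .length counts UTF-16 code units; the server may count characters differently.
  if (input.text.length < 1 || input.text.length > 5000) {
    errors.push("text must be between 1 and 5,000 characters");
  }
  if (input.voiceId.length === 0) {
    errors.push("voiceId is required");
  }
  if (!inRange(input.temperature ?? 0.8, 0, 2)) {
    errors.push("temperature must be between 0.0 and 2.0");
  }
  if (input.topP !== undefined && !inRange(input.topP, 0, 1)) {
    errors.push("topP must be between 0.0 and 1.0");
  }
  if (
    input.topK !== undefined &&
    (!Number.isInteger(input.topK) || !inRange(input.topK, 1, 10000))
  ) {
    errors.push("topK must be an integer between 1 and 10,000");
  }
  if (!inRange(input.repetitionPenalty ?? 1.2, 1, 2)) {
    errors.push("repetitionPenalty must be between 1.0 and 2.0");
  }
  return errors;
}
```

Returning a list rather than throwing on the first problem lets a form show every out-of-range field at once.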
Response
Unique identifier for the created generation. Use this to retrieve the audio via /api/audio/{id}.
Implementation
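As a small illustration of how the returned identifier maps onto the audio endpoint, the sketch below builds the `/api/audio/{id}` path from a generation ID. The helper name `audioPath` and the decision to URL-encode the ID are assumptions; only the path pattern comes from this page.

```typescript
// Hypothetical helper: builds the relative audio path for a generation ID.
function audioPath(generationId: string): string {
  if (generationId.length === 0) {
    throw new Error("generationId must not be empty");
  }
  // Encode the ID so unexpected characters cannot change the path structure.
  return `/api/audio/${encodeURIComponent(generationId)}`;
}
```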
Get All Generations
Retrieve all generations for the current organization.
Response
Array of generation objects ordered by creation date (newest first)
Implementation
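The endpoint already returns generations newest first. A client that merges a freshly created generation into a cached list may want to preserve that ordering locally; the sketch below shows one way, assuming each generation object carries a `createdAt` ISO 8601 timestamp (a hypothetical field name, since the object's fields are not listed here).

```typescript
// Assumed minimal shape of a generation object; createdAt is a hypothetical field.
interface GenerationSummary {
  id: string;
  createdAt: string; // assumed ISO 8601 timestamp
}

// Returns a new array sorted by creation date, newest first; the input is not mutated.
function sortNewestFirst<T extends GenerationSummary>(items: readonly T[]): T[] {
  return [...items].sort(
    (a, b) => Date.parse(b.createdAt) - Date.parse(a.createdAt),
  );
}
```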
Get Generation by ID
Retrieve a specific generation with its audio URL.
Input Schema
Generation ID to retrieve
Response
Generation object with all fields plus audioUrl
Implementation
Error Codes
| Code | Description | When It Occurs |
|---|---|---|
| UNAUTHORIZED | User not authenticated | Missing or invalid session |
| FORBIDDEN | Subscription required | No active subscription (message: SUBSCRIPTION_REQUIRED) |
| NOT_FOUND | Resource not found | Voice or generation does not exist or is not accessible |
| PRECONDITION_FAILED | Voice audio unavailable | Voice exists but has no audio file |
| INTERNAL_SERVER_ERROR | Generation failed | Chatterbox API error or storage failure |
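One way a client might react to these codes is sketched below. The error codes and the `SUBSCRIPTION_REQUIRED` message come from the table; the action names and the function `actionForError` are hypothetical, and how the code and message are exposed on the error object depends on the client library in use.

```typescript
type ErrorCode =
  | "UNAUTHORIZED"
  | "FORBIDDEN"
  | "NOT_FOUND"
  | "PRECONDITION_FAILED"
  | "INTERNAL_SERVER_ERROR";

// Hypothetical client-side reactions; rename to match your UI.
type ClientAction =
  | "sign_in"
  | "subscribe"
  | "show_forbidden"
  | "show_not_found"
  | "choose_other_voice"
  | "retry_later";

function actionForError(code: ErrorCode, message?: string): ClientAction {
  switch (code) {
    case "UNAUTHORIZED":
      return "sign_in"; // missing or invalid session
    case "FORBIDDEN":
      // The table documents SUBSCRIPTION_REQUIRED as the message for this code.
      return message === "SUBSCRIPTION_REQUIRED" ? "subscribe" : "show_forbidden";
    case "NOT_FOUND":
      return "show_not_found"; // voice or generation missing or inaccessible
    case "PRECONDITION_FAILED":
      return "choose_other_voice"; // the voice has no audio file
    case "INTERNAL_SERVER_ERROR":
      return "retry_later"; // upstream or storage failure
  }
}
```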
Usage Tracking
Each successful generation triggers a usage event sent to Polar for billing:
- Event name: tts_generation
- Metered by character count
- Fire-and-forget (doesn’t block response)
- Silent failure (doesn’t affect user experience)
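The fire-and-forget, silent-failure behaviour described above can be sketched as follows. This is an illustration of the pattern only, not the service's code: the `UsageEvent` shape, the `characters` field, and the injected `send` function are assumptions, and no Polar SDK call is shown.

```typescript
// Hypothetical event shape; only the event name comes from this page.
interface UsageEvent {
  name: "tts_generation";
  characters: number;
}

// Injected sender, e.g. a wrapper around whatever billing client is in use.
type SendEvent = (event: UsageEvent) => Promise<void>;

function trackGeneration(send: SendEvent, text: string): void {
  const event: UsageEvent = { name: "tts_generation", characters: text.length };
  try {
    // Not awaited: the caller continues immediately (fire-and-forget).
    send(event).catch(() => {
      // Silent failure: a rejected send is swallowed.
    });
  } catch {
    // A sender that throws synchronously is swallowed as well.
  }
}
```

Because the function returns `void` and never rethrows, a billing outage cannot delay or fail the generation response; the trade-off is that dropped events go unnoticed unless the sender logs them itself.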
See Generation Parameters for detailed guidance on tuning temperature, top-p, top-k, and repetition penalty.
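For intuition about how top-k and top-p narrow the sampling pool, the sketch below applies both filters to a small probability distribution. It is a generic illustration of the two techniques, not the Chatterbox implementation; the function name and the order in which the filters are combined are assumptions.

```typescript
// Returns the indices of the tokens that remain eligible for sampling.
function filterTopKTopP(probs: number[], topK: number, topP: number): number[] {
  // Pair each probability with its token index and sort by descending probability.
  const ranked = probs
    .map((p, index) => ({ p, index }))
    .sort((a, b) => b.p - a.p);

  const kept: number[] = [];
  let cumulative = 0;
  // Top-k: look at no more than the K most likely tokens.
  for (const { p, index } of ranked.slice(0, topK)) {
    kept.push(index);
    cumulative += p;
    // Top-p: stop once the kept tokens cover the requested probability mass.
    if (cumulative >= topP) break;
  }
  return kept;
}

// With probabilities [0.5, 0.3, 0.15, 0.05], topK = 3 and topP = 0.7,
// the two most likely tokens already cover 0.8 of the mass, so only they remain.
const eligible = filterTopKTopP([0.5, 0.3, 0.15, 0.05], 3, 0.7);
```

Lower values of either parameter shrink the pool and make output more predictable; higher values admit less likely tokens.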