Text-to-Speech

Endpoint

POST https://api.cencori.com/api/ai/audio/speech

Convert text to natural-sounding speech using OpenAI TTS models. Returns audio as a binary stream.

Authentication

Requires an API key in the Authorization header or the CENCORI_API_KEY header.
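
As a minimal sketch, the two header options could be built as shown here. The Bearer form matches the curl examples on this page. Sending the raw key as the value of CENCORI_API_KEY is an assumption, and buildAuthHeaders is a hypothetical helper, not part of any SDK.

```typescript
// Hypothetical helper: builds request headers for either authentication style.
type AuthStyle = "authorization" | "cencori-header";

function buildAuthHeaders(
  apiKey: string,
  style: AuthStyle = "authorization"
): Record<string, string> {
  if (!apiKey) {
    throw new Error("API key is required");
  }
  const headers: Record<string, string> = {
    "Content-Type": "application/json",
  };
  if (style === "authorization") {
    // Matches the curl examples: Authorization: Bearer <key>
    headers["Authorization"] = `Bearer ${apiKey}`;
  } else {
    // Assumption: the alternative header carries the raw key as its value
    headers["CENCORI_API_KEY"] = apiKey;
  }
  return headers;
}
```

Keeping the key in an environment variable, as the examples on this page do, avoids hard-coding it in source files.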

Request Body

input

string

required

Text to convert to speech. Maximum length: 4096 characters.

model

string

TTS model to use. Options:

tts-1 - Fast, lower quality (default)
tts-1-hd - Higher quality, slower

voice

string

Voice to use for speech generation. Options:

alloy - Neutral and balanced (default)
echo - Warm and conversational
fable - British accent, expressive
onyx - Deep and authoritative
nova - Friendly and upbeat
shimmer - Clear and professional

response_format

string

Audio format. Options:

mp3 (default) - MPEG audio
opus - Opus format
aac - AAC format
flac - FLAC format
wav - WAV format
pcm - Raw PCM

speed

number

Playback speed (0.25 to 4.0). Default: 1.0

0.5 = Half speed
1.0 = Normal speed
2.0 = Double speed
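
The constraints above can be checked on the client before a request is sent. The following is a sketch only: SpeechRequest, validateSpeechRequest, and withDefaults are hypothetical names, and it assumes the 4096-character limit can be approximated with JavaScript string length (how the API counts characters is not stated here).

```typescript
// Documented options for each request field.
const MODELS: readonly string[] = ["tts-1", "tts-1-hd"];
const VOICES: readonly string[] = ["alloy", "echo", "fable", "onyx", "nova", "shimmer"];
const FORMATS: readonly string[] = ["mp3", "opus", "aac", "flac", "wav", "pcm"];

interface SpeechRequest {
  input: string;
  model?: string;
  voice?: string;
  response_format?: string;
  speed?: number;
}

// Hypothetical helper: returns a list of problems, empty when the request looks valid.
function validateSpeechRequest(req: SpeechRequest): string[] {
  const problems: string[] = [];
  if (!req.input) {
    problems.push("input is required");
  } else if (req.input.length > 4096) {
    // Assumption: string length approximates the API's character count
    problems.push("input exceeds 4096 characters");
  }
  if (req.model !== undefined && !MODELS.includes(req.model)) {
    problems.push(`unknown model: ${req.model}`);
  }
  if (req.voice !== undefined && !VOICES.includes(req.voice)) {
    problems.push(`unknown voice: ${req.voice}`);
  }
  if (req.response_format !== undefined && !FORMATS.includes(req.response_format)) {
    problems.push(`unknown response_format: ${req.response_format}`);
  }
  if (req.speed !== undefined && !(req.speed >= 0.25 && req.speed <= 4.0)) {
    problems.push("speed must be between 0.25 and 4.0");
  }
  return problems;
}

// Fills in the documented defaults for omitted fields.
function withDefaults(req: SpeechRequest): Required<SpeechRequest> {
  return {
    input: req.input,
    model: req.model ?? "tts-1",
    voice: req.voice ?? "alloy",
    response_format: req.response_format ?? "mp3",
    speed: req.speed ?? 1.0,
  };
}
```

Validating locally avoids spending a request on input the API would reject with a 400 response.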

Response

Returns the audio file as binary data. The Content-Type depends on response_format:

audio/mpeg for mp3
audio/opus for opus
audio/aac for aac
audio/flac for flac
audio/wav for wav
audio/pcm for pcm

Headers:

Content-Type: Audio MIME type
Content-Length: File size in bytes
X-Request-Id: Request tracking ID
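
A client can use the mapping above to confirm that a response really carries the requested audio format before saving it. This is a sketch under the assumption that the Content-Type header may include parameters after a semicolon; matchesRequestedFormat is a hypothetical helper.

```typescript
// Content-Type values as listed in the Response section.
const CONTENT_TYPES: Record<string, string> = {
  mp3: "audio/mpeg",
  opus: "audio/opus",
  aac: "audio/aac",
  flac: "audio/flac",
  wav: "audio/wav",
  pcm: "audio/pcm",
};

// Hypothetical helper: checks a Content-Type header against the requested format.
function matchesRequestedFormat(
  contentTypeHeader: string | null,
  format: string = "mp3"
): boolean {
  const expected = CONTENT_TYPES[format];
  if (!expected || !contentTypeHeader) return false;
  // Ignore any parameters that follow the MIME type
  const [mime = ""] = contentTypeHeader.split(";");
  return mime.trim().toLowerCase() === expected;
}
```

A JSON error body, for example, would fail this check, which is a cue to parse the error instead of writing the bytes to an audio file.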

Examples

Basic Text-to-Speech

curl https://api.cencori.com/api/ai/audio/speech \
  -H "Authorization: Bearer $CENCORI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "tts-1",
    "input": "Hello! Welcome to Cencori.",
    "voice": "alloy"
  }' \
  --output speech.mp3

High Quality with Custom Voice

curl https://api.cencori.com/api/ai/audio/speech \
  -H "Authorization: Bearer $CENCORI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "tts-1-hd",
    "input": "This is a professional announcement.",
    "voice": "onyx",
    "response_format": "opus"
  }' \
  --output speech.opus

Faster Playback Speed

curl https://api.cencori.com/api/ai/audio/speech \
  -H "Authorization: Bearer $CENCORI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "tts-1",
    "input": "Speeding up this narration.",
    "voice": "nova",
    "speed": 1.5
  }' \
  --output speech.mp3

Using OpenAI SDK

import OpenAI from 'openai';
import fs from 'fs';

const client = new OpenAI({
  apiKey: process.env.CENCORI_API_KEY,
  baseURL: 'https://api.cencori.com/v1'
});

const mp3 = await client.audio.speech.create({
  model: 'tts-1',
  voice: 'alloy',
  input: 'Today is a wonderful day!'
});

const buffer = Buffer.from(await mp3.arrayBuffer());
fs.writeFileSync('speech.mp3', buffer);

Stream Audio to Browser

import { NextResponse } from 'next/server';

export async function POST(req: Request) {
  const { text } = await req.json();
  
  const response = await fetch('https://api.cencori.com/api/ai/audio/speech', {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${process.env.CENCORI_API_KEY}`,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({
      model: 'tts-1',
      voice: 'alloy',
      input: text
    })
  });
  
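  // Return upstream errors as JSON instead of passing them off as audio
  if (!response.ok || !response.body) {
    return NextResponse.json(
      { error: 'Text-to-speech request failed' },
      { status: response.ok ? 502 : response.status }
    );
  }

  // Pass the upstream body through so the browser receives audio as it arrives
  return new NextResponse(response.body, {
    headers: {
      'Content-Type': 'audio/mpeg',
      'Content-Disposition': 'inline; filename="speech.mp3"'
    }
  });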
}

Long Text with Chunks

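// Split long text into chunks at sentence boundaries, staying under the 4096-character limit
function chunkText(text: string, maxLength: number = 4000): string[] {
  // Match sentences, including a trailing fragment with no closing punctuation
  const sentences = text.match(/[^.!?]+[.!?]*/g) || [text];
  const chunks: string[] = [];
  let currentChunk = '';

  for (const sentence of sentences) {
    // Start a new chunk when adding this sentence would exceed the limit
    if (currentChunk && (currentChunk + sentence).length > maxLength) {
      chunks.push(currentChunk.trim());
      currentChunk = sentence;
    } else {
      currentChunk += sentence;
    }
  }

  // Skip a trailing chunk that contains only whitespace
  if (currentChunk.trim()) chunks.push(currentChunk.trim());
  // Note: a single sentence longer than maxLength is kept whole and must be split separately
  return chunks;
}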

// Generate speech for each chunk
const longText = "Very long text...";
const chunks = chunkText(longText);

for (let i = 0; i < chunks.length; i++) {
  const mp3 = await client.audio.speech.create({
    model: 'tts-1',
    voice: 'alloy',
    input: chunks[i]
  });
  
  const buffer = Buffer.from(await mp3.arrayBuffer());
  fs.writeFileSync(`speech_part${i + 1}.mp3`, buffer);
}

Compare All Voices

const voices = ['alloy', 'echo', 'fable', 'onyx', 'nova', 'shimmer'];
const text = 'Hello, this is a voice comparison test.';

for (const voice of voices) {
  const mp3 = await client.audio.speech.create({
    model: 'tts-1',
    voice,
    input: text
  });
  
  const buffer = Buffer.from(await mp3.arrayBuffer());
  fs.writeFileSync(`voice_${voice}.mp3`, buffer);
}

Voice Characteristics

Voice	Gender	Accent	Tone	Best For
alloy	Neutral	American	Balanced	General purpose
echo	Male	American	Warm	Storytelling
fable	Female	British	Expressive	Narration
onyx	Male	American	Deep	Announcements
nova	Female	American	Upbeat	Marketing
shimmer	Female	American	Professional	Business
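
The table can also be kept as data so that a voice is picked by attribute rather than by name. The rows below are copied from the table; VoiceInfo and findVoices are hypothetical names used only for this sketch.

```typescript
interface VoiceInfo {
  voice: string;
  gender: string;
  accent: string;
  tone: string;
  bestFor: string;
}

// Rows copied from the Voice Characteristics table.
const VOICE_TABLE: VoiceInfo[] = [
  { voice: "alloy", gender: "Neutral", accent: "American", tone: "Balanced", bestFor: "General purpose" },
  { voice: "echo", gender: "Male", accent: "American", tone: "Warm", bestFor: "Storytelling" },
  { voice: "fable", gender: "Female", accent: "British", tone: "Expressive", bestFor: "Narration" },
  { voice: "onyx", gender: "Male", accent: "American", tone: "Deep", bestFor: "Announcements" },
  { voice: "nova", gender: "Female", accent: "American", tone: "Upbeat", bestFor: "Marketing" },
  { voice: "shimmer", gender: "Female", accent: "American", tone: "Professional", bestFor: "Business" },
];

// Hypothetical helper: returns the voice ids matching every criterion that is provided.
function findVoices(criteria: { gender?: string; accent?: string; tone?: string }): string[] {
  return VOICE_TABLE.filter(
    (v) =>
      (criteria.gender === undefined || v.gender === criteria.gender) &&
      (criteria.accent === undefined || v.accent === criteria.accent) &&
      (criteria.tone === undefined || v.tone === criteria.tone)
  ).map((v) => v.voice);
}
```

For example, findVoices({ accent: "British" }) narrows the list to fable, the only British voice in the table.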

Use Cases

Audiobooks

const chapters = ['Chapter 1 text...', 'Chapter 2 text...'];

for (let i = 0; i < chapters.length; i++) {
  const mp3 = await client.audio.speech.create({
    model: 'tts-1-hd',
    voice: 'fable',
    input: chapters[i],
    speed: 0.9  // Slightly slower for audiobooks
  });
  
  const buffer = Buffer.from(await mp3.arrayBuffer());
  fs.writeFileSync(`chapter_${i + 1}.mp3`, buffer);
}

Podcast Intros

const intro = await client.audio.speech.create({
  model: 'tts-1-hd',
  voice: 'echo',
  input: 'Welcome to the Tech Insights Podcast. Your weekly dose of innovation and discovery.'
});

Accessibility Features

// Read webpage content aloud
async function textToSpeech(element: HTMLElement) {
  const text = element.innerText;
  
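  const response = await fetch('/api/tts', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ text })
  });
  if (!response.ok) throw new Error(`TTS request failed: ${response.status}`);

  const audioBlob = await response.blob();
  const audioUrl = URL.createObjectURL(audioBlob);

  const audio = new Audio(audioUrl);
  // Release the object URL once playback finishes
  audio.addEventListener('ended', () => URL.revokeObjectURL(audioUrl));
  await audio.play();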
}

Voice Notifications

const notification = await client.audio.speech.create({
  model: 'tts-1',
  voice: 'nova',
  input: 'You have 3 new messages waiting.'
});

Learning Apps

// Language learning pronunciation
const phrases = [
  { text: 'Hello', lang: 'en' },
  { text: 'Bonjour', lang: 'fr' },
  { text: 'Hola', lang: 'es' }
];

for (const phrase of phrases) {
  const mp3 = await client.audio.speech.create({
    model: 'tts-1-hd',
    voice: 'alloy',
    input: phrase.text
  });
  
  const buffer = Buffer.from(await mp3.arrayBuffer());
  fs.writeFileSync(`${phrase.lang}_${phrase.text}.mp3`, buffer);
}

Best Practices

Choose the right model
- Use tts-1 for most use cases (faster, cheaper)
- Use tts-1-hd for high-quality production audio

Select appropriate voice
- Test different voices for your use case
- Match voice gender/tone to content
- Consider audience preferences

Optimize text input
- Use proper punctuation for natural pauses
- Break long texts into sentences
- Add commas for breathing room

Handle long texts
- Split texts over 4096 characters
- Split at sentence boundaries
- Concatenate audio files if needed

Cache generated audio
- Store frequently used phrases
- Reduce API calls and costs
- Improve response times

Format selection
- Use mp3 for web and mobile (best compatibility)
- Use opus for streaming (better compression)
- Use wav for editing (uncompressed)
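
The caching advice above could be sketched as an in-memory wrapper keyed by the full parameter set, so that identical requests reuse one result. SpeechParams, Synthesize, and createCachedSynthesizer are hypothetical names; the wrapped function stands in for whatever code actually calls the API.

```typescript
interface SpeechParams {
  input: string;
  model: string;
  voice: string;
  response_format: string;
  speed: number;
}

// Any function that turns parameters into audio bytes, such as a wrapper around the API call.
type Synthesize = (params: SpeechParams) => Promise<Uint8Array>;

// Hypothetical helper: wraps a synthesizer with an in-memory cache.
function createCachedSynthesizer(synthesize: Synthesize): Synthesize {
  const cache = new Map<string, Promise<Uint8Array>>();
  return (params) => {
    // Every field that changes the audio is part of the key
    const key = JSON.stringify([
      params.model,
      params.voice,
      params.response_format,
      params.speed,
      params.input,
    ]);
    let pending = cache.get(key);
    if (!pending) {
      pending = synthesize(params);
      cache.set(key, pending);
      // Drop failed requests from the cache so they can be retried
      pending.catch(() => cache.delete(key));
    }
    return pending;
  };
}
```

Caching the promise rather than the resolved bytes means concurrent requests for the same phrase share a single API call. A production cache would likely also need a size limit or persistent storage, which this sketch leaves out.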

Limitations

Maximum input: 4096 characters per request
English language only (best quality)
No custom voice training
No emotion/tone control beyond voice selection
Generated audio cannot be used for prohibited content

Error Responses

Input Too Long

{
  "error": "bad_request",
  "message": "Input text exceeds maximum length of 4096 characters"
}

HTTP Status: 400
Solution: Split the text into smaller chunks.

Missing Input

{
  "error": "bad_request",
  "message": "Input text is required"
}

HTTP Status: 400
Solution: Provide text in the input field.

Provider Not Configured

{
  "error": "provider_not_configured",
  "message": "No OpenAI API key configured"
}

HTTP Status: 400
Solution: Add an OpenAI API key in project settings.
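
Client code might recognize these error bodies and surface the matching solution. The sketch below assumes every error has the error and message fields shown above. ApiError, isApiError, and suggestFix are hypothetical names, and matching on message text is a fragile heuristic used only because both input errors share the bad_request code.

```typescript
// Shape of the error bodies shown in this section.
interface ApiError {
  error: string;
  message: string;
}

// Type guard for a parsed JSON body.
function isApiError(value: unknown): value is ApiError {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return typeof v["error"] === "string" && typeof v["message"] === "string";
}

// Hypothetical helper: maps a documented error to its suggested solution.
function suggestFix(err: ApiError): string {
  if (err.error === "provider_not_configured") {
    return "Add an OpenAI API key in project settings.";
  }
  if (err.error === "bad_request" && /exceeds maximum length/i.test(err.message)) {
    return "Split the text into smaller chunks.";
  }
  if (err.error === "bad_request" && /is required/i.test(err.message)) {
    return "Provide text in the input field.";
  }
  // Unknown errors fall back to the server's own message
  return err.message;
}
```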

Rate Limits

TTS requests count toward monthly quota:

Free: 1,000 requests/month
Pro: 50,000 requests/month
Enterprise: Custom limits

Pricing

Pricing is based on character count:

Model	Provider Cost	Cencori Charge
tts-1	$15/1M characters	$18/1M characters
tts-1-hd	$30/1M characters	$36/1M characters

Examples:

100 characters: $0.0018 (tts-1)
1000 characters: $0.018 (tts-1)
100 characters: $0.0036 (tts-1-hd)
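
The arithmetic behind these examples is the character count multiplied by the per-million rate. A small sketch, using the Cencori Charge column from the table above; estimateCost is a hypothetical helper.

```typescript
// Cencori charge in USD per 1M characters, from the pricing table.
const PRICE_PER_MILLION_CHARS: Record<string, number> = {
  "tts-1": 18,
  "tts-1-hd": 36,
};

// Hypothetical helper: estimated charge in USD for a given character count.
function estimateCost(characters: number, model: string = "tts-1"): number {
  const rate = PRICE_PER_MILLION_CHARS[model];
  if (rate === undefined) {
    throw new Error(`Unknown model: ${model}`);
  }
  return (characters * rate) / 1_000_000;
}
```

By the same formula, a request at the 4096-character limit would cost about $0.074 on tts-1 and about $0.147 on tts-1-hd.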

List Available Options

A GET request to the same endpoint returns the supported models, voices, and formats:

curl https://api.cencori.com/api/ai/audio/speech \
  -H "Authorization: Bearer $CENCORI_API_KEY"

{
  "models": ["tts-1", "tts-1-hd"],
  "voices": [
    { "id": "alloy", "description": "Neutral and balanced" },
    { "id": "echo", "description": "Warm and conversational" },
    { "id": "fable", "description": "British accent, expressive" },
    { "id": "onyx", "description": "Deep and authoritative" },
    { "id": "nova", "description": "Friendly and upbeat" },
    { "id": "shimmer", "description": "Clear and professional" }
  ],
  "formats": ["mp3", "opus", "aac", "flac", "wav", "pcm"]
}
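
The options response can drive validation at runtime instead of hard-coding the lists. The sketch below types the response exactly as shown above and checks a planned request against it; SpeechOptions and isSupported are hypothetical names.

```typescript
// Shape of the options response shown above.
interface SpeechOptions {
  models: string[];
  voices: { id: string; description: string }[];
  formats: string[];
}

// Hypothetical helper: true when the model, voice, and format all appear in the options.
function isSupported(
  options: SpeechOptions,
  model: string,
  voice: string,
  format: string
): boolean {
  return (
    options.models.includes(model) &&
    options.voices.some((v) => v.id === voice) &&
    options.formats.includes(format)
  );
}
```

Fetching the options once at startup and reusing them keeps the client in step with the server if the available voices or formats change.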
