Generations API

Overview

The Generations API lets you create text-to-speech (TTS) audio from text and retrieve generation history. All operations require an active subscription.

Create Generation

Generate speech audio from text using a voice clone.

trpc.generations.create.useMutation();

Requires an active subscription. Returns FORBIDDEN with message SUBSCRIPTION_REQUIRED if no active subscription exists.
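On the client, the subscription failure can be told apart from other errors by its code and message. The following is a minimal sketch, assuming the client error exposes `message` and `data.code` the way tRPC client errors usually do; `TrpcLikeError` and `isSubscriptionRequired` are hypothetical names used only for illustration.

```typescript
// Hypothetical structural type: only the fields this check reads.
// Assumption: the client error carries the server's `message` and `data.code`.
interface TrpcLikeError {
  message: string;
  data?: { code?: string } | null;
}

// True when the error is the FORBIDDEN / SUBSCRIPTION_REQUIRED case described above.
function isSubscriptionRequired(err: unknown): boolean {
  if (typeof err !== "object" || err === null) return false;
  const e = err as Partial<TrpcLikeError>;
  return e.data?.code === "FORBIDDEN" && e.message === "SUBSCRIPTION_REQUIRED";
}
```

A check like this could be called from the mutation's `onError` callback to route the user to a billing page instead of showing a generic error.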

Input Schema

text

string

required

Text to convert to speech. Must be between 1 and 5,000 characters.

voiceId

string

required

ID of the voice to use. Can be a SYSTEM voice or a CUSTOM voice that belongs to the current organization.

temperature

number

default: 0.8

Controls randomness in generation. Range: 0.0 to 2.0

Lower values (0.0-0.5): More consistent, predictable output
Default (0.8): Balanced naturalness
Higher values (1.0-2.0): More varied, expressive output

topP

number

default: 0.95

Nucleus sampling threshold. Range: 0.0 to 1.0
Controls diversity by sampling only from the smallest set of tokens whose cumulative probability reaches this threshold.

topK

number

default: 1000

Number of top tokens to consider. Range: 1 to 10,000
Limits the sampling pool to the K most likely tokens.

repetitionPenalty

number

default: 1.2

Penalty for repeating tokens. Range: 1.0 to 2.0

1.0: No penalty (may lead to repetition)
1.2: Default (balanced)
2.0: Strong penalty (avoids repetition)
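The ranges and defaults above can also be checked on the client before the mutation is called, so obviously invalid input never leaves the browser. This is a minimal sketch, not the server's validation: `normalizeGenerationInput`, `resolveInRange`, and `RANGES` are hypothetical names, and `TEXT_MAX_LENGTH` is assumed to be 5000 based on the documented character limit.

```typescript
// Assumption: matches the documented 5,000-character limit.
const TEXT_MAX_LENGTH = 5000;

interface GenerationInput {
  text: string;
  voiceId: string;
  temperature?: number;
  topP?: number;
  topK?: number;
  repetitionPenalty?: number;
}

// Ranges and defaults copied from the input schema above.
const RANGES = {
  temperature: { min: 0, max: 2, fallback: 0.8 },
  topP: { min: 0, max: 1, fallback: 0.95 },
  topK: { min: 1, max: 10000, fallback: 1000 },
  repetitionPenalty: { min: 1, max: 2, fallback: 1.2 },
} as const;

type SamplingKey = keyof typeof RANGES;

// Returns the value (or the documented default) and throws if it is out of range.
function resolveInRange(key: SamplingKey, value: number | undefined): number {
  const { min, max, fallback } = RANGES[key];
  const resolved: number = value ?? fallback;
  if (!isFinite(resolved) || resolved < min || resolved > max) {
    throw new RangeError(`${key} must be between ${min} and ${max}`);
  }
  return resolved;
}

// Hypothetical helper: fills in defaults and rejects invalid input.
// The server-side schema remains the authority.
function normalizeGenerationInput(input: GenerationInput): Required<GenerationInput> {
  if (input.text.length < 1 || input.text.length > TEXT_MAX_LENGTH) {
    throw new RangeError(`text must be 1 to ${TEXT_MAX_LENGTH} characters`);
  }
  if (input.voiceId.length < 1) {
    throw new RangeError("voiceId must not be empty");
  }
  return {
    text: input.text,
    voiceId: input.voiceId,
    temperature: resolveInRange("temperature", input.temperature),
    topP: resolveInRange("topP", input.topP),
    topK: resolveInRange("topK", input.topK),
    repetitionPenalty: resolveInRange("repetitionPenalty", input.repetitionPenalty),
  };
}
```

Keeping the limits in one table makes it easier to update the client check if the server schema changes.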

Response

id

string

required

Unique identifier for the created generation. Use this to retrieve the audio via /api/audio/{id}

Implementation

create: orgProcedure
  .input(
    z.object({
      text: z.string().min(1).max(TEXT_MAX_LENGTH),
      voiceId: z.string().min(1),
      temperature: z.number().min(0).max(2).default(0.8),
      topP: z.number().min(0).max(1).default(0.95),
      topK: z.number().min(1).max(10000).default(1000),
      repetitionPenalty: z.number().min(1).max(2).default(1.2),
    })
  )
  .mutation(async ({ input, ctx }) => {
    // Check for active subscription
    const customerState = await polar.customers.getStateExternal({
      externalId: ctx.orgId,
    });
    const hasActiveSubscription =
      (customerState.activeSubscriptions ?? []).length > 0;
    if (!hasActiveSubscription) {
      throw new TRPCError({
        code: "FORBIDDEN",
        message: "SUBSCRIPTION_REQUIRED",
      });
    }

    // Validate voice access
    const voice = await prisma.voice.findUnique({
      where: {
        id: input.voiceId,
        OR: [
          { variant: "SYSTEM" },
          { variant: "CUSTOM", orgId: ctx.orgId }
        ],
      },
    });

    if (!voice?.r2ObjectKey) {
      throw new TRPCError({
        code: "NOT_FOUND",
        message: "Voice not found",
      });
    }

    // Generate audio via Chatterbox API
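    const { data, error } = await chatterbox.POST("/generate", {
      body: {
        prompt: input.text,
        voice_key: voice.r2ObjectKey,
        temperature: input.temperature,
        top_p: input.topP,
        top_k: input.topK,
        repetition_penalty: input.repetitionPenalty,
        norm_loudness: true,
      },
      parseAs: "arrayBuffer",
    });

    // Surface upstream failures (see Error Codes) instead of storing empty audio
    if (error || !data) {
      throw new TRPCError({
        code: "INTERNAL_SERVER_ERROR",
        message: "Generation failed",
      });
    }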

    // Store in database and R2
    const generation = await prisma.generation.create({
      data: {
        orgId: ctx.orgId,
        text: input.text,
        voiceName: voice.name,
        voiceId: voice.id,
        temperature: input.temperature,
        topP: input.topP,
        topK: input.topK,
        repetitionPenalty: input.repetitionPenalty,
      },
    });

    await uploadAudio({ 
      buffer: Buffer.from(data), 
      key: `generations/orgs/${ctx.orgId}/${generation.id}` 
    });

    // Track usage in Polar (fire-and-forget)
    polar.events.ingest({
      events: [{
        name: "tts_generation",
        externalCustomerId: ctx.orgId,
        metadata: { characters: input.text.length },
        timestamp: new Date(),
      }],
    }).catch(() => {});

    return { id: generation.id };
  }),

Get All Generations

Retrieve all generations for the current organization.

trpc.generations.getAll.useQuery();

Response

generations

Generation[]

Array of generation objects ordered by creation date (newest first)

Generation object

id

string

required

Unique generation identifier

text

string

required

Original text that was converted to speech

voiceId

string

required

ID of the voice used

voiceName

string

required

Display name of the voice

temperature

number

required

Temperature value used (0.0-2.0)

topP

number

required

Top-p value used (0.0-1.0)

topK

number

required

Top-k value used (1-10000)

repetitionPenalty

number

required

Repetition penalty used (1.0-2.0)

createdAt

Date

required

Timestamp when the generation was created
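The field list above translates directly into a TypeScript type, which is handy when working with the history on the client. The sketch below is illustrative: `charactersByVoice` is a hypothetical helper, and `createdAt` is typed as `Date` as documented, which assumes the tRPC client is configured with a transformer that preserves dates.

```typescript
// Shape of one item in the getAll response, following the field list above.
interface Generation {
  id: string;
  text: string;
  voiceId: string;
  voiceName: string;
  temperature: number;
  topP: number;
  topK: number;
  repetitionPenalty: number;
  createdAt: Date; // assumption: dates survive serialization
}

// Hypothetical helper: total characters synthesized per voice name.
function charactersByVoice(generations: Generation[]): Record<string, number> {
  const totals: Record<string, number> = {};
  for (const g of generations) {
    totals[g.voiceName] = (totals[g.voiceName] ?? 0) + g.text.length;
  }
  return totals;
}
```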

Implementation

getAll: orgProcedure.query(async ({ ctx }) => {
  const generations = await prisma.generation.findMany({
    where: { orgId: ctx.orgId },
    orderBy: { createdAt: "desc" },
    omit: {
      orgId: true,
      r2ObjectKey: true,
    },
  });

  return generations;
}),

Get Generation by ID

Retrieve a specific generation with audio URL.

trpc.generations.getById.useQuery({ id: "gen-123" });

Input Schema

id

string

required

Generation ID to retrieve

Response

generation

Generation

Generation object with all fields plus audioUrl

Extended Generation object

Contains all standard Generation fields plus:

audioUrl

string

required

URL path to access the audio file: /api/audio/{id}
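Because the create mutation returns only the generation ID, a client that wants to play audio immediately can build the same path without a second query. A minimal sketch, assuming the path format stays `/api/audio/{id}`; `audioPathFor` is a hypothetical helper, and the URL-encoding step is a defensive addition that the server code does not perform.

```typescript
// Builds the audio path documented above from a generation ID.
// Assumption: encoding the ID is a client-side precaution only.
function audioPathFor(generationId: string): string {
  if (generationId.length === 0) {
    throw new Error("generationId must not be empty");
  }
  return `/api/audio/${encodeURIComponent(generationId)}`;
}
```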

Implementation

getById: orgProcedure
  .input(z.object({ id: z.string() }))
  .query(async ({ input, ctx }) => {
    const generation = await prisma.generation.findUnique({
      where: { id: input.id, orgId: ctx.orgId },
      omit: {
        orgId: true,
        r2ObjectKey: true,
      },
    });

    if (!generation) {
      throw new TRPCError({ code: "NOT_FOUND" });
    }

    return {
      ...generation,
      audioUrl: `/api/audio/${generation.id}`,
    };
  }),

Error Codes

Code	Description	When It Occurs
`UNAUTHORIZED`	User not authenticated	Missing or invalid session
`FORBIDDEN`	Subscription required	No active subscription (message: `SUBSCRIPTION_REQUIRED`)
`NOT_FOUND`	Resource not found	Voice or generation does not exist or is not accessible
`PRECONDITION_FAILED`	Voice audio unavailable	Voice exists but has no audio file
`INTERNAL_SERVER_ERROR`	Generation failed	Chatterbox API error or storage failure
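A client can turn the codes in the table above into user-facing text with a small lookup. This is a sketch only: `userMessageFor` is a hypothetical helper and the message wording is invented for illustration, not taken from the application.

```typescript
// Codes listed in the Error Codes table above.
type GenerationErrorCode =
  | "UNAUTHORIZED"
  | "FORBIDDEN"
  | "NOT_FOUND"
  | "PRECONDITION_FAILED"
  | "INTERNAL_SERVER_ERROR";

// Hypothetical wording; adjust to the product's own copy.
const USER_MESSAGES: Record<GenerationErrorCode, string> = {
  UNAUTHORIZED: "Please sign in to continue.",
  FORBIDDEN: "An active subscription is required.",
  NOT_FOUND: "That voice or generation could not be found.",
  PRECONDITION_FAILED: "This voice has no audio sample yet.",
  INTERNAL_SERVER_ERROR: "Audio generation failed. Please try again.",
};

// Returns the mapped message, or a generic fallback for unknown codes.
function userMessageFor(code: string): string {
  const known = Object.prototype.hasOwnProperty.call(USER_MESSAGES, code);
  return known
    ? USER_MESSAGES[code as GenerationErrorCode]
    : "Something went wrong.";
}
```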

Usage Tracking

Each successful generation triggers a usage event sent to Polar for billing:

polar.events.ingest({
  events: [{
    name: "tts_generation",
    externalCustomerId: ctx.orgId,
    metadata: { characters: input.text.length },
    timestamp: new Date(),
  }],
});

Event name: tts_generation
Metered by character count
Fire-and-forget (doesn’t block response)
Silent failure (doesn’t affect user experience)
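The fire-and-forget behaviour listed above can be isolated in a small wrapper. The sketch below uses a hypothetical `Ingest` function type as a stand-in for `polar.events.ingest` so that it runs without the SDK; `buildUsageEvent` and `trackUsage` are illustrative names, not part of the codebase.

```typescript
// Event payload shape, following the snippet above.
interface UsageEvent {
  name: string;
  externalCustomerId: string;
  metadata: { characters: number };
  timestamp: Date;
}

// Hypothetical stand-in for polar.events.ingest.
type Ingest = (payload: { events: UsageEvent[] }) => Promise<unknown>;

// Builds the metered event: one event per generation, sized by character count.
function buildUsageEvent(orgId: string, text: string, now: Date = new Date()): UsageEvent {
  return {
    name: "tts_generation",
    externalCustomerId: orgId,
    metadata: { characters: text.length },
    timestamp: now,
  };
}

// Starts the ingest call without awaiting it and swallows any rejection,
// so a billing outage never fails or delays the generation response.
function trackUsage(ingest: Ingest, orgId: string, text: string): void {
  void ingest({ events: [buildUsageEvent(orgId, text)] }).catch(() => {});
}
```

Note that `text.length` counts UTF-16 code units, so some characters, such as most emoji, count as two. The trade-off of swallowing errors is that a failed ingest leaves no trace unless logging is added inside the `catch` handler.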

See Generation Parameters for detailed guidance on tuning temperature, top-p, top-k, and repetition penalty.
