Overview

The Generations API lets you create text-to-speech (TTS) audio from text and retrieve generation history. Creating a generation requires an active subscription; all procedures run in the context of the caller's organization.

Create Generation

Generate speech audio from text using a system or custom voice.
trpc.generations.create.useMutation();
Requires an active subscription. Returns FORBIDDEN with message SUBSCRIPTION_REQUIRED if no active subscription exists.
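
On the client, the subscription failure can be told apart from other errors by checking both the code and the message. The following is a minimal sketch: `TrpcLikeError` and `isSubscriptionRequired` are hypothetical names, and the interface only models the two fields used here (the error message and `data.code`), not the full tRPC client error type.

```typescript
// Minimal shape of the error fields this sketch relies on (an assumption,
// not the full tRPC client error type).
interface TrpcLikeError {
  message: string;
  data?: { code?: string } | null;
}

// Returns true when the error matches the documented subscription failure:
// code FORBIDDEN with message SUBSCRIPTION_REQUIRED.
function isSubscriptionRequired(err: TrpcLikeError): boolean {
  return (
    err.data?.code === "FORBIDDEN" && err.message === "SUBSCRIPTION_REQUIRED"
  );
}
```

A check like this could be called from the mutation's error callback to send the user to a billing page instead of showing a generic failure.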

Input Schema

text
string
required
Text to convert to speech. Must be between 1 and 5,000 characters.
voiceId
string
required
ID of the voice to use (can be SYSTEM or CUSTOM variant)
temperature
number
default:"0.8"
Controls randomness in generation. Range: 0.0 to 2.0
  • Lower values (0.0-0.5): More consistent, predictable output
  • Default (0.8): Balanced naturalness
  • Higher values (1.0-2.0): More varied, expressive output
topP
number
default:"0.95"
Nucleus sampling threshold. Range: 0.0 to 1.0. Controls diversity by considering only the top probability mass.
topK
number
default:"1000"
Number of top tokens to consider. Range: 1 to 10,000. Limits the sampling pool to the K most likely tokens.
repetitionPenalty
number
default:"1.2"
Penalty for repeating tokens. Range: 1.0 to 2.0
  • 1.0: No penalty (may lead to repetition)
  • 1.2: Default (balanced)
  • 2.0: Strong penalty (avoids repetition)
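
The ranges and defaults above can be checked before a request is sent, which avoids a round trip for input the server would reject. The sketch below is an illustration only: `checkCreateInput` and `CreateGenerationInput` are hypothetical names, and `TEXT_MAX_LENGTH` is assumed to be 5000 based on the documented character limit.

```typescript
// Assumed from the documented 5,000-character limit.
const TEXT_MAX_LENGTH = 5000;

interface CreateGenerationInput {
  text: string;
  voiceId: string;
  temperature?: number;
  topP?: number;
  topK?: number;
  repetitionPenalty?: number;
}

// Hypothetical helper: fills in the documented defaults and collects a
// message for every field that falls outside its documented range.
function checkCreateInput(input: CreateGenerationInput): {
  value: Required<CreateGenerationInput>;
  errors: string[];
} {
  const value: Required<CreateGenerationInput> = {
    text: input.text,
    voiceId: input.voiceId,
    temperature: input.temperature ?? 0.8,
    topP: input.topP ?? 0.95,
    topK: input.topK ?? 1000,
    repetitionPenalty: input.repetitionPenalty ?? 1.2,
  };

  const errors: string[] = [];
  // Written as a negated conjunction so that NaN is also reported.
  const inRange = (name: string, v: number, min: number, max: number): void => {
    if (!(v >= min && v <= max)) {
      errors.push(`${name} must be between ${min} and ${max}`);
    }
  };

  inRange("text length", value.text.length, 1, TEXT_MAX_LENGTH);
  if (value.voiceId.length < 1) errors.push("voiceId is required");
  inRange("temperature", value.temperature, 0, 2);
  inRange("topP", value.topP, 0, 1);
  inRange("topK", value.topK, 1, 10000);
  inRange("repetitionPenalty", value.repetitionPenalty, 1, 2);

  return { value, errors };
}
```

The server-side schema remains the source of truth; a client-side check like this only gives earlier feedback.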

Response

id
string
required
Unique identifier for the created generation. Use this to retrieve the audio via /api/audio/{id}.

Implementation

create: orgProcedure
  .input(
    z.object({
      text: z.string().min(1).max(TEXT_MAX_LENGTH),
      voiceId: z.string().min(1),
      temperature: z.number().min(0).max(2).default(0.8),
      topP: z.number().min(0).max(1).default(0.95),
      topK: z.number().min(1).max(10000).default(1000),
      repetitionPenalty: z.number().min(1).max(2).default(1.2),
    })
  )
  .mutation(async ({ input, ctx }) => {
    // Check for active subscription
    const customerState = await polar.customers.getStateExternal({
      externalId: ctx.orgId,
    });
    const hasActiveSubscription =
      (customerState.activeSubscriptions ?? []).length > 0;
    if (!hasActiveSubscription) {
      throw new TRPCError({
        code: "FORBIDDEN",
        message: "SUBSCRIPTION_REQUIRED",
      });
    }

    // Validate voice access
    const voice = await prisma.voice.findUnique({
      where: {
        id: input.voiceId,
        OR: [
          { variant: "SYSTEM" },
          { variant: "CUSTOM", orgId: ctx.orgId }
        ],
      },
    });

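    if (!voice) {
      throw new TRPCError({
        code: "NOT_FOUND",
        message: "Voice not found",
      });
    }

    // The voice exists but has no stored audio file to generate from
    if (!voice.r2ObjectKey) {
      throw new TRPCError({
        code: "PRECONDITION_FAILED",
        message: "Voice audio unavailable",
      });
    }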

    // Generate audio via Chatterbox API
    const { data, error } = await chatterbox.POST("/generate", {
      body: {
        prompt: input.text,
        voice_key: voice.r2ObjectKey,
        temperature: input.temperature,
        top_p: input.topP,
        top_k: input.topK,
        repetition_penalty: input.repetitionPenalty,
        norm_loudness: true,
      },
      parseAs: "arrayBuffer",
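    });

    // Surface upstream failures instead of storing an empty result
    if (error || !data) {
      throw new TRPCError({
        code: "INTERNAL_SERVER_ERROR",
        message: "Generation failed",
      });
    }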

    // Store in database and R2
    const generation = await prisma.generation.create({
      data: {
        orgId: ctx.orgId,
        text: input.text,
        voiceName: voice.name,
        voiceId: voice.id,
        temperature: input.temperature,
        topP: input.topP,
        topK: input.topK,
        repetitionPenalty: input.repetitionPenalty,
      },
    });

    await uploadAudio({ 
      buffer: Buffer.from(data), 
      key: `generations/orgs/${ctx.orgId}/${generation.id}` 
    });

    // Track usage in Polar (fire-and-forget)
    polar.events.ingest({
      events: [{
        name: "tts_generation",
        externalCustomerId: ctx.orgId,
        metadata: { characters: input.text.length },
        timestamp: new Date(),
      }],
    }).catch(() => {});

    return { id: generation.id };
  }),

Get All Generations

Retrieve all generations for the current organization.
trpc.generations.getAll.useQuery();

Response

generations
Generation[]
Array of generation objects ordered by creation date (newest first)

Implementation

getAll: orgProcedure.query(async ({ ctx }) => {
  const generations = await prisma.generation.findMany({
    where: { orgId: ctx.orgId },
    orderBy: { createdAt: "desc" },
    omit: {
      orgId: true,
      r2ObjectKey: true,
    },
  });

  return generations;
}),

Get Generation by ID

Retrieve a specific generation, including its audio URL.
trpc.generations.getById.useQuery({ id: "gen-123" });

Input Schema

id
string
required
Generation ID to retrieve

Response

generation
Generation
Generation object with all fields plus audioUrl
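
Because audioUrl is a relative path, a client running on a different origin has to resolve it before fetching the audio. A minimal sketch, assuming a hypothetical helper `resolveAudioUrl` and a placeholder origin:

```typescript
// Hypothetical helper: joins an origin with the relative audioUrl returned by
// getById, avoiding a doubled or missing slash at the boundary.
function resolveAudioUrl(origin: string, audioUrl: string): string {
  const base = origin.replace(/\/+$/, ""); // strip trailing slashes
  const path = audioUrl.startsWith("/") ? audioUrl : `/${audioUrl}`;
  return `${base}${path}`;
}

// "https://app.example.com" is a placeholder, not a real deployment.
const resolved = resolveAudioUrl("https://app.example.com/", "/api/audio/gen-123");
// resolved === "https://app.example.com/api/audio/gen-123"
```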

Implementation

getById: orgProcedure
  .input(z.object({ id: z.string() }))
  .query(async ({ input, ctx }) => {
    const generation = await prisma.generation.findUnique({
      where: { id: input.id, orgId: ctx.orgId },
      omit: {
        orgId: true,
        r2ObjectKey: true,
      },
    });

    if (!generation) {
      throw new TRPCError({ code: "NOT_FOUND" });
    }

    return {
      ...generation,
      audioUrl: `/api/audio/${generation.id}`,
    };
  }),

Error Codes

Code | Description | When It Occurs
UNAUTHORIZED | User not authenticated | Missing or invalid session
FORBIDDEN | Subscription required | No active subscription (message: SUBSCRIPTION_REQUIRED)
NOT_FOUND | Resource not found | Voice or generation does not exist or is not accessible
PRECONDITION_FAILED | Voice audio unavailable | Voice exists but has no audio file
INTERNAL_SERVER_ERROR | Generation failed | Chatterbox API error or storage failure
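
One way to use these codes on the client is a lookup from code to user-facing text. The sketch below is illustrative: `describeGenerationError` is a hypothetical name and the message wording is an assumption, not text returned by the API.

```typescript
type GenerationErrorCode =
  | "UNAUTHORIZED"
  | "FORBIDDEN"
  | "NOT_FOUND"
  | "PRECONDITION_FAILED"
  | "INTERNAL_SERVER_ERROR";

// Illustrative wording only; adjust to the product's own copy.
const ERROR_MESSAGES: Record<GenerationErrorCode, string> = {
  UNAUTHORIZED: "Please sign in to continue.",
  FORBIDDEN: "An active subscription is required to generate audio.",
  NOT_FOUND: "The requested voice or generation could not be found.",
  PRECONDITION_FAILED: "This voice has no audio file yet.",
  INTERNAL_SERVER_ERROR: "Generation failed. Please try again.",
};

// Hypothetical helper: returns the message for a documented code and a
// generic fallback for anything else.
function describeGenerationError(code: string): string {
  // hasOwnProperty avoids matching inherited keys such as "toString".
  return Object.prototype.hasOwnProperty.call(ERROR_MESSAGES, code)
    ? ERROR_MESSAGES[code as GenerationErrorCode]
    : "Something went wrong.";
}
```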

Usage Tracking

Each successful generation triggers a usage event sent to Polar for billing:
polar.events.ingest({
  events: [{
    name: "tts_generation",
    externalCustomerId: ctx.orgId,
    metadata: { characters: input.text.length },
    timestamp: new Date(),
  }],
}).catch(() => {}); // not awaited; failures are swallowed
  • Event name: tts_generation
  • Metered by character count
  • Fire-and-forget (doesn’t block response)
  • Silent failure (doesn’t affect user experience)
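
The fire-and-forget behavior described above can be sketched in isolation. In this example `buildUsageEvent` and `fireAndForget` are hypothetical helpers, and the `send` callback stands in for the real ingestion call; only the event fields come from the snippet above.

```typescript
interface UsageEvent {
  name: string;
  externalCustomerId: string;
  metadata: { characters: number };
  timestamp: Date;
}

// Builds the event payload shown above; `now` is injectable for testing.
function buildUsageEvent(
  orgId: string,
  text: string,
  now: Date = new Date()
): UsageEvent {
  return {
    name: "tts_generation",
    externalCustomerId: orgId,
    metadata: { characters: text.length },
    timestamp: now,
  };
}

// Schedules `send` without awaiting it and swallows both synchronous throws
// and rejected promises, so the caller is never blocked or interrupted.
function fireAndForget(send: () => Promise<unknown>): void {
  void Promise.resolve()
    .then(send)
    .catch(() => {});
}
```

Note that `text.length` counts UTF-16 code units, so characters outside the Basic Multilingual Plane (many emoji, for example) count as two toward the metered total.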
See Generation Parameters for detailed guidance on tuning temperature, top-p, top-k, and repetition penalty.
