Fish Audio Setup

speak-mintlify uses Fish Audio to generate high-quality, natural-sounding TTS audio for your documentation.

Why Fish Audio?

Fish Audio provides:

Affordable pricing - Cost-effective compared to other TTS providers
Natural voices - High-quality, human-like speech
Multiple voices - Wide selection of voices in different languages
Developer-friendly - Simple API with good documentation
Fast generation - Quick audio generation with built-in retry handling

Getting Your API Key

Create an account

Visit fish.audio and sign up for an account.

Navigate to API settings

Once logged in, go to your account settings or dashboard to find the API section.

Generate API key

Create a new API key for your documentation project. Give it a descriptive name like “docs-tts” to identify it later.

Save your key securely

Copy the API key and store it securely. You’ll need it for your .env file or CI/CD secrets.

.env

FISH_API_KEY=your_api_key_here

Never commit your API key to version control. Always use environment variables or secrets management.
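
One way to act on the warning above is to make sure .env is listed in .gitignore before the key is ever written to disk. The following is a minimal sketch, assuming a Git repository and a .env file at the project root; the function name ensure_env_ignored is illustrative and not part of speak-mintlify.

```shell
# Sketch: add ".env" to .gitignore unless it is already listed.
# Assumption: run from the repository root, or pass the .gitignore path.
ensure_env_ignored() {
  gitignore=${1:-.gitignore}
  # -x matches the whole line, -F treats ".env" literally, -q is silent
  if [ -f "$gitignore" ] && grep -qxF '.env' "$gitignore"; then
    echo ".env is already ignored"
  else
    # Add a newline first if the file does not end with one
    if [ -s "$gitignore" ] && [ -n "$(tail -c 1 "$gitignore")" ]; then
      printf '\n' >> "$gitignore"
    fi
    printf '%s\n' '.env' >> "$gitignore"
    echo "added .env to $gitignore"
  fi
}
```

Calling ensure_env_ignored with no arguments targets .gitignore in the current directory.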

Choosing Voices

Fish Audio offers a variety of voices. Browse available voices on the Fish Audio platform.

Finding Voice IDs

Browse voice library

Explore the Fish Audio voice library to find voices that match your brand and audience.

Preview voices

Listen to voice samples to find the right tone and style for your documentation.

Copy voice IDs

Each voice has a unique ID (a long hexadecimal string). Copy the IDs of the voices you want to use. Example voice IDs:

8ef4a238714b45718ce04243307c57a7
bf322df2096a46f18c579d0baa36f41d
933563129e564b19a115bedd57b7406a
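
The example IDs above are all 32 lowercase hexadecimal characters. Assuming every Fish Audio voice ID has that shape (an inference from the examples, not a documented guarantee), a quick check can catch copy-paste mistakes before they reach the config file. The helper name is_valid_voice_id is hypothetical.

```shell
# Sketch: succeed only if the argument looks like the example voice IDs.
# Assumption: IDs are exactly 32 lowercase hex characters.
is_valid_voice_id() {
  # -E enables extended regex, -x requires the whole line to match
  printf '%s\n' "$1" | grep -Eqx '[0-9a-f]{32}'
}
```

For example, `is_valid_voice_id 8ef4a238714b45718ce04243307c57a7 && echo ok` prints ok, while an ID with a trailing space or a non-hex letter is rejected.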

Configuring Voices

Add voices to your speaker-config.yaml:

speaker-config.yaml

voices:
  8ef4a238714b45718ce04243307c57a7: Professional Female
  bf322df2096a46f18c579d0baa36f41d: Casual Male
  933563129e564b19a115bedd57b7406a: Energetic Narrator
  b347db033a6549378b48d00acb0d06cd: Technical Guide

Voice names (right side) are what users see in the audio player. Choose descriptive names that help users select the right voice for them.

Voice Configuration Examples

Single Voice

For a consistent experience, use a single voice:

speaker-config.yaml

voices:
  8ef4a238714b45718ce04243307c57a7: Documentation Voice

Users won’t see a voice selector; the audio player uses this voice automatically.

Multiple Voices

Offer variety with multiple voices:

speaker-config.yaml

voices:
  8ef4a238714b45718ce04243307c57a7: Sarah
  bf322df2096a46f18c579d0baa36f41d: Adrian
  933563129e564b19a115bedd57b7406a: Alex

The audio player will show a dropdown for voice selection.

Multilingual

Support multiple languages:

speaker-config.yaml

voices:
  8ef4a238714b45718ce04243307c57a7: English (US)
  a1b2c3d4e5f6g7h8i9j0k1l2m3n4o5p6: Spanish (ES)
  b2c3d4e5f6g7h8i9j0k1l2m3n4o5p6q7: French (FR)
  c3d4e5f6g7h8i9j0k1l2m3n4o5p6q7r8: German (DE)

The Spanish, French, and German IDs above are placeholders, not valid hexadecimal voice IDs; replace them with real IDs from the voice library. Make sure your MDX content is in the appropriate language for each voice.

Branded

Use custom names that match your brand:

speaker-config.yaml

voices:
  8ef4a238714b45718ce04243307c57a7: Tech Guide
  bf322df2096a46f18c579d0baa36f41d: Developer Buddy
  933563129e564b19a115bedd57b7406a: Quick Start Voice

Testing Voice Configuration

Before committing to production, test your voice configuration:

# Preview what will be generated without making changes
npx speak-mintlify generate . --dry-run --verbose

API Usage and Rate Limits

Monitor your Fish Audio API usage to avoid unexpected costs or rate limiting.

Estimating Costs

Fish Audio charges based on characters processed. Estimate your costs:

Count documentation characters: Run with --verbose to see extracted text
Multiply by voices: Each voice generates separate audio
Check pricing: Visit the Fish Audio pricing page for current rates

# See how much text will be processed
npx speak-mintlify generate . --dry-run --verbose
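
The three steps above amount to characters × voices × rate. Here is a small sketch of that arithmetic; the function name estimate_cost and the "price per million characters" unit are assumptions for illustration, so take the real rate and billing unit from the Fish Audio pricing page.

```shell
# Sketch: estimated cost = characters * voices * rate / 1,000,000.
# Assumption: the rate is quoted per one million characters (hypothetical unit).
estimate_cost() {
  chars=$1
  voices=$2
  rate_per_million=$3
  # awk handles the floating-point arithmetic that plain sh cannot
  awk -v c="$chars" -v v="$voices" -v r="$rate_per_million" \
    'BEGIN { printf "%.2f\n", c * v * r / 1000000 }'
}
```

With 200,000 characters, 3 voices, and a made-up rate of 15 per million characters, `estimate_cost 200000 3 15` prints 9.00.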

Optimizing Usage

Smart Regeneration

speak-mintlify only regenerates audio when content changes, saving API calls and costs.

Limit Voices

Start with 1-2 voices. Add more only if users request variety.

Exclude Pages

Use .speakignore to skip API reference, snippets, or other pages that don’t benefit from audio.

File Patterns

Use --pattern to generate audio only for specific sections.

Using .speakignore

Exclude files and directories from TTS generation:

.speakignore

# Exclude API reference (better suited for interactive docs)
api-reference/**

# Exclude snippets (reusable components)
snippets/**

# Exclude changelog
CHANGELOG.md

# Exclude drafts
drafts/**
todo/**

# Exclude specific pages
terms-of-service.mdx
privacy-policy.mdx
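
To reason about which files a pattern list would exclude, the sketch below approximates the matching with plain shell globbing. It assumes gitignore-like behavior in which comment and blank lines are skipped and `*` may span directory separators. speak-mintlify's actual matching rules may differ, and is_ignored is a hypothetical helper, so confirm the real result with --dry-run.

```shell
# Sketch: succeed if PATH matches any pattern in the ignore file.
# Assumption: shell "case" globbing is close enough to the tool's matcher.
is_ignored() {
  path=$1
  ignore_file=$2
  # "|| [ -n ... ]" also handles a last line without a trailing newline
  while IFS= read -r pattern || [ -n "$pattern" ]; do
    case $pattern in
      ''|'#'*) continue ;;   # skip blank lines and comments
    esac
    case $path in
      $pattern) return 0 ;;  # unquoted, so it is treated as a glob
    esac
  done < "$ignore_file"
  return 1
}
```

For example, `is_ignored api-reference/users/list.mdx .speakignore && echo skipped` prints skipped for the file above.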

Troubleshooting

API key not working

Common issues:

Key not set in environment: Check .env file or CI/CD secrets
Typo in key: Copy-paste carefully, avoid trailing spaces
Key expired or revoked: Generate a new key
Account not activated: Complete Fish Audio account setup

Test your key:

# Should show error if key is invalid
npx speak-mintlify generate . --pattern "test.mdx"
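
Two of the issues listed above (key not set, stray whitespace) can be ruled out locally before any API call is made. This is a minimal sketch, assuming the key is exported as FISH_API_KEY in the current shell; check_fish_api_key is an illustrative name, and the check cannot tell whether the key is expired or revoked.

```shell
# Sketch: local sanity check of FISH_API_KEY (does not contact Fish Audio).
check_fish_api_key() {
  key=${FISH_API_KEY-}
  if [ -z "$key" ]; then
    echo "FISH_API_KEY is not set" >&2
    return 1
  fi
  case $key in
    *[[:space:]]*)
      echo "FISH_API_KEY contains whitespace (check for trailing spaces)" >&2
      return 1
      ;;
  esac
  return 0
}
```

It can be chained in front of the test command: `check_fish_api_key && npx speak-mintlify generate . --pattern "test.mdx"`.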

Voice IDs not found

If you get “voice not found” errors:

Verify voice IDs are copied correctly (case-sensitive)
Check that voices haven’t been removed from Fish Audio
Ensure voice IDs match your account/plan
Try generating with a single known-good voice first

TTS generation fails intermittently

speak-mintlify has built-in retry logic (3 attempts), but if issues persist:

Check the Fish Audio status page for outages
Verify your network connection
Check API rate limits
Try again during off-peak hours

The tool will show retry attempts:

TTS attempt 1 failed. 2 retries left.
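
The three-attempt pattern described above can be pictured as a simple loop. The sketch below is not the tool's implementation. It is a hypothetical wrapper, retry_tts, that runs any command up to three times and reports failures in the same shape as the log line above.

```shell
# Sketch: run a command up to 3 times, reporting each failed attempt.
# Assumption: a non-zero exit status means the attempt failed.
retry_tts() {
  max_attempts=3
  attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    if "$@"; then
      return 0
    fi
    remaining=$((max_attempts - attempt))
    echo "TTS attempt $attempt failed. $remaining retries left." >&2
    attempt=$((attempt + 1))
  done
  return 1
}
```

A wrapper like this could surround a whole run, for example `retry_tts npx speak-mintlify generate .`, if intermittent failures persist beyond the built-in retries.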

Audio quality issues

If generated audio sounds unnatural:

Try different voices - quality varies
Check source text for markdown artifacts
Ensure text is properly extracted (use --verbose)
Report persistent issues to Fish Audio support

Missing voices in audio player

If the audio player doesn’t show all voices:

Check speaker-config.yaml syntax (proper YAML format)
Verify voice IDs are valid hexadecimal strings
Ensure voice names don’t contain characters that break YAML parsing, such as an unquoted colon followed by a space
Check browser console for errors
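
The first two checks can be partly automated. The sketch below scans a config file for entries under `voices:` whose key is not a 32-character lowercase hex string. It assumes the flat layout shown in this guide (a top-level `voices:` key with indented `id: name` pairs) and is not a general YAML parser; find_bad_voice_ids is a hypothetical helper.

```shell
# Sketch: print voice IDs under "voices:" that are not 32 lowercase hex chars.
# Assumption: the config uses the simple layout shown in this guide.
find_bad_voice_ids() {
  awk '
    /^voices:[ \t]*$/ { in_voices = 1; next }  # start of the voices map
    /^[^ \t#]/        { in_voices = 0 }        # another top-level key ends it
    in_voices && /^[ \t]+[^# \t]/ {
      id = $1
      sub(/:$/, "", id)                        # drop the trailing colon
      if (id !~ /^[0-9a-f]+$/ || length(id) != 32) print id
    }
  ' "$1"
}
```

Running `find_bad_voice_ids speaker-config.yaml` prints nothing when every ID passes the check.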

Advanced Configuration

CLI Override

Override speaker-config.yaml from the command line:

npx speak-mintlify generate . \
  --voices "8ef4a238714b45718ce04243307c57a7,bf322df2096a46f18c579d0baa36f41d" \
  --voice-names "Production Voice,Testing Voice" \
  --api-key "your_api_key"

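CLI flags take priority over config files, which is useful for testing or one-off generation. On shared machines, prefer the FISH_API_KEY environment variable over --api-key, since command-line arguments can end up in shell history.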

Environment Variables

Set the Fish Audio API key via an environment variable:

# Local development
export FISH_API_KEY=your_key
npx speak-mintlify generate .

# One-time use
FISH_API_KEY=your_key npx speak-mintlify generate .

Best Practices

Start simple

Begin with 1-2 voices. Add more based on user feedback.

Use descriptive names

Help users choose voices with clear, descriptive names (“Professional Female” vs “Voice 1”).

Test before deploying

Always use --dry-run to preview changes before generating audio.

Monitor usage

Keep track of API usage and costs, especially with multiple voices.

Exclude unnecessary pages

Use .speakignore to skip pages that don’t need audio.

Next Steps

S3 Setup

Configure storage for your generated audio files

Customization

Customize audio player appearance and behavior
