speak-mintlify uses Fish Audio to generate high-quality, natural-sounding TTS audio for your documentation.

Why Fish Audio?

Fish Audio provides:
  • Affordable pricing - Cost-effective compared to other TTS providers
  • Natural voices - High-quality, human-like speech
  • Multiple voices - Wide selection of voices in different languages
  • Developer-friendly - Simple API with good documentation
  • Fast generation - Quick audio generation with built-in retry handling

Getting Your API Key

1

Create an account

Visit fish.audio and sign up for an account.
2

Navigate to API settings

Once logged in, go to your account settings or dashboard to find the API section.
3

Generate API key

Create a new API key for your documentation project. Give it a descriptive name like “docs-tts” to identify it later.
4

Save your key securely

Copy the API key and store it securely. You’ll need it for your .env file or CI/CD secrets.
.env
FISH_API_KEY=your_api_key_here
Never commit your API key to version control. Always use environment variables or secrets management.
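One way to back up this rule is a quick check before committing. The sketch below is a hypothetical helper (`env_file_ignored` is not part of speak-mintlify) and assumes the key lives in a `.env` file at the project root. It only looks for a literal `.env` line in `.gitignore`, so broader ignore patterns are not recognized.

```shell
# Hypothetical helper, not part of speak-mintlify.
# env_file_ignored DIR
# Succeeds if DIR/.gitignore exists and contains a line that is exactly ".env".
env_file_ignored() {
  [ -f "$1/.gitignore" ] && grep -qxF '.env' "$1/.gitignore"
}

# Example use from the project root:
#   env_file_ignored . || echo "warning: .env is not listed in .gitignore" >&2
```

Inside a Git repository, `git check-ignore -q .env` is a more thorough test because it evaluates every ignore rule rather than one literal line.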

Choosing Voices

Fish Audio offers a variety of voices. Browse available voices on the Fish Audio platform.

Finding Voice IDs

1

Browse voice library

Explore the Fish Audio voice library to find voices that match your brand and audience.
2

Preview voices

Listen to voice samples to find the right tone and style for your documentation.
3

Copy voice IDs

Each voice has a unique ID (a long hexadecimal string). Copy the IDs of voices you want to use. Example voice IDs:
  • 8ef4a238714b45718ce04243307c57a7
  • bf322df2096a46f18c579d0baa36f41d
  • 933563129e564b19a115bedd57b7406a
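Copy-paste mistakes in voice IDs are easy to catch before they reach the config file. The sketch below assumes every ID has the same shape as the examples above (exactly 32 lowercase hexadecimal characters). That shape is inferred from the examples, not from a Fish Audio specification, and `is_voice_id` is a hypothetical helper.

```shell
# Hypothetical helper. Assumption: a voice ID is exactly 32 lowercase hex
# characters, as in the example IDs; Fish Audio may allow other formats.
# is_voice_id ID
is_voice_id() {
  case "$1" in
    "" | *[!0-9a-f]*) return 1 ;;   # empty, or contains a non-hex character
  esac
  [ "${#1}" -eq 32 ]                # length must be exactly 32
}

# Example use:
#   is_voice_id "8ef4a238714b45718ce04243307c57a7" && echo "looks valid"
```

A stray trailing space or an uppercase character makes the check fail, which covers the most common copy-paste slips.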

Configuring Voices

Add voices to your speaker-config.yaml:
speaker-config.yaml
voices:
  8ef4a238714b45718ce04243307c57a7: Professional Female
  bf322df2096a46f18c579d0baa36f41d: Casual Male
  933563129e564b19a115bedd57b7406a: Energetic Narrator
  b347db033a6549378b48d00acb0d06cd: Technical Guide
Voice names (right side) are what users see in the audio player. Choose descriptive names that help users select the right voice for them.

Voice Configuration Examples

For a consistent experience, use a single voice:
speaker-config.yaml
voices:
  8ef4a238714b45718ce04243307c57a7: Documentation Voice
Users won’t see a voice selector - the audio player will use this voice automatically.

Testing Voice Configuration

Before committing to production, test your voice configuration:
# Preview what will be generated without making changes
npx speak-mintlify generate . --dry-run --verbose

API Usage and Rate Limits

Monitor your Fish Audio API usage to avoid unexpected costs or rate limiting.

Estimating Costs

Fish Audio charges based on characters processed. Estimate your costs:
  1. Count documentation characters: Run with --verbose to see extracted text
  2. Multiply by voices: Each voice generates separate audio
  3. Check pricing: Visit the Fish Audio pricing page for current rates
# See how much text will be processed
npx speak-mintlify generate . --dry-run --verbose
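The three steps above reduce to simple arithmetic: characters times voices times the rate. A rough sketch follows, where `billed_chars` and `estimate_cost` are hypothetical helpers and the rate is a placeholder that you must replace with the figure from the Fish Audio pricing page.

```shell
# Hypothetical helpers for a back-of-the-envelope estimate.

# billed_chars CHARS VOICES
# Each voice generates separate audio, so the text is processed once per voice.
billed_chars() {
  echo $(( $1 * $2 ))
}

# estimate_cost CHARS VOICES RATE_PER_MILLION
# RATE_PER_MILLION is the price per one million characters in your currency.
estimate_cost() {
  awk -v c="$1" -v v="$2" -v r="$3" \
    'BEGIN { printf "%.2f\n", c * v * r / 1000000 }'
}

# Example (the rate of 15 per million characters is a made-up placeholder):
#   billed_chars 120000 2       prints 240000
#   estimate_cost 120000 2 15   prints 3.60
```

The estimate covers a full first run only; later runs should cost less, since audio is regenerated only when content changes.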

Optimizing Usage

Smart Regeneration

speak-mintlify only regenerates audio when content changes, saving API calls and costs.

Limit Voices

Start with 1-2 voices. Add more only if users request variety.

Exclude Pages

Use .speakignore to skip API reference, snippets, or other pages that don’t benefit from audio.

File Patterns

Use --pattern to generate audio only for specific sections.

Using .speakignore

Exclude files and directories from TTS generation:
.speakignore
# Exclude API reference (better suited for interactive docs)
api-reference/**

# Exclude snippets (reusable components)
snippets/**

# Exclude changelog
CHANGELOG.md

# Exclude drafts
drafts/**
todo/**

# Exclude specific pages
terms-of-service.mdx
privacy-policy.mdx

Troubleshooting

Common causes of API key (authentication) errors:
  • Key not set in environment: Check .env file or CI/CD secrets
  • Typo in key: Copy-paste carefully, avoid trailing spaces
  • Key expired or revoked: Generate a new key
  • Account not activated: Complete Fish Audio account setup
Test your key:
# Prints an error if the key is invalid
npx speak-mintlify generate . --pattern "test.mdx"
If you get "voice not found" errors:
  • Verify voice IDs are copied correctly (case-sensitive)
  • Check that voices haven’t been removed from Fish Audio
  • Ensure voice IDs match your account/plan
  • Try generating with a single known-good voice first
speak-mintlify retries failed TTS requests automatically (3 attempts in total). If failures persist:
  • Check the Fish Audio status page for outages
  • Verify your network connection
  • Check API rate limits
  • Try again during off-peak hours
The tool will show retry attempts:
TTS attempt 1 failed. 2 retries left.
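When the built-in retries are exhausted, for example during a short rate-limit window, rerunning the whole command after a longer pause can help. The sketch below is a generic wrapper with a doubling delay; `retry` is a hypothetical helper and is not how speak-mintlify implements its own retries.

```shell
# Hypothetical wrapper, independent of speak-mintlify's internal retry logic.
# retry MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or MAX_ATTEMPTS is reached,
# doubling the delay after every failed attempt.
retry() {
  retry_max=$1
  retry_delay=$2
  shift 2
  retry_attempt=1
  while :; do
    "$@" && return 0
    if [ "$retry_attempt" -ge "$retry_max" ]; then
      echo "giving up after $retry_attempt attempts" >&2
      return 1
    fi
    echo "attempt $retry_attempt failed, retrying in ${retry_delay}s" >&2
    sleep "$retry_delay"
    retry_attempt=$((retry_attempt + 1))
    retry_delay=$((retry_delay * 2))
  done
}

# Example use: up to 3 runs, waiting 30s and then 60s between them.
#   retry 3 30 npx speak-mintlify generate .
```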
If generated audio sounds unnatural:
  • Try different voices - quality varies
  • Check source text for markdown artifacts
  • Ensure text is properly extracted (use --verbose)
  • Report persistent issues to Fish Audio support
If the audio player doesn’t show all voices:
  • Check speaker-config.yaml syntax (proper YAML format)
  • Verify voice IDs are valid hexadecimal strings
  • Ensure voice names don’t contain special characters
  • Check browser console for errors

Advanced Configuration

CLI Override

Override speaker-config.yaml from the command line:
npx speak-mintlify generate . \
  --voices "8ef4a238714b45718ce04243307c57a7,bf322df2096a46f18c579d0baa36f41d" \
  --voice-names "Production Voice,Testing Voice" \
  --api-key "your_api_key"
CLI flags take priority over config files, which is useful for testing or one-off generation.

Environment Variables

Set the Fish Audio API key via an environment variable:
# Local development
export FISH_API_KEY=your_key
npx speak-mintlify generate .

# One-time use
FISH_API_KEY=your_key npx speak-mintlify generate .
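In scripts and CI jobs it can help to fail early when the variable is missing or malformed, before any generation starts. A minimal sketch, assuming only that the key is read from `FISH_API_KEY` as shown above; `require_api_key` is a hypothetical helper and it does not verify the key against Fish Audio.

```shell
# Hypothetical guard, not part of speak-mintlify.
# require_api_key
# Fails if FISH_API_KEY is unset, empty, or contains whitespace
# (for example a trailing space picked up while copying the key).
require_api_key() {
  key="${FISH_API_KEY:-}"
  if [ -z "$key" ]; then
    echo "FISH_API_KEY is not set" >&2
    return 1
  fi
  case "$key" in
    *[[:space:]]*)
      echo "FISH_API_KEY contains whitespace" >&2
      return 1
      ;;
  esac
  return 0
}

# Example use:
#   require_api_key && npx speak-mintlify generate .
```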

Best Practices

1

Start simple

Begin with 1-2 voices. Add more based on user feedback.
2

Use descriptive names

Help users choose voices with clear, descriptive names (“Professional Female” vs “Voice 1”).
3

Test before deploying

Always use --dry-run to preview changes before generating audio.
4

Monitor usage

Keep track of API usage and costs, especially with multiple voices.
5

Exclude unnecessary pages

Use .speakignore to skip pages that don’t need audio.

Next Steps

S3 Setup

Configure storage for your generated audio files

Customization

Customize audio player appearance and behavior
