Characters in Unmute are defined in the voices.yaml configuration file, which combines voice samples with conversation instructions.
Understanding Character Configuration
Each character in Unmute consists of two main components:
- Voice Sample: An audio file that defines how the character sounds
- Instructions: System prompts that define the character’s personality and behavior
The voices.yaml File
The voices.yaml file is located in the root of the Unmute repository. Each character is defined with the following structure:
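As a rough sketch, one entry might look like the following. The key names (name, good, comment, instructions, type, source, source_type) are assumptions inferred from the field descriptions in this guide rather than copied from the repository, and the character name and file path are hypothetical. Compare against an existing entry in voices.yaml before relying on them.

```yaml
# Hypothetical entry; key names are assumed, so check them against voices.yaml.
- name: Skeptic                 # display name shown in the UI
  good: null                    # quality flag: true, false, or null (undecided)
  comment: man, UK, skeptical   # internal notes about the character
  instructions:
    type: smalltalk             # one of the built-in instruction types
  source:
    source_type: file           # either "file" or "freesound"
    path_on_server: path/to/voice.wav   # placeholder path
```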
Configuration Fields
- Display name for the character that appears in the UI
- Quality flag: set to true to mark as production-ready, false for testing, or null for undecided
- Internal notes about the character (e.g., “man, UK, skeptical”)
- Defines the character’s conversation style and personality. See Instruction Types below.
- Specifies the voice audio source. Can be either file or freesound type.
Instruction Types
Unmute provides several built-in instruction types that define character behavior:
Smalltalk
For general conversation with context-aware greetings.
Constant
For custom character personalities with specific instructions.
Quiz Show
For a game show host personality.
News
For tech news discussions using live headlines from The Verge.
The News instruction type requires the NEWSAPI_API_KEY environment variable to be set.
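The instruction types described so far are selected inside a character's instructions block. The following is a minimal sketch, assuming a type key with snake_case values and a text key that carries the custom prompt for the Constant type. These key names and values are assumptions and should be verified against unmute/llm/system_prompt.py; the character names and prompt text are hypothetical.

```yaml
# Assumed keys and values; only the instructions part of each character is shown.
- name: Small talk example
  instructions:
    type: smalltalk
- name: Custom personality example
  instructions:
    type: constant
    text: You are a grumpy lighthouse keeper who answers in short sentences.
- name: Quiz host example
  instructions:
    type: quiz_show
- name: News example
  instructions:
    type: news   # needs NEWSAPI_API_KEY set in the server environment
```

The remaining built-in types would presumably be selected the same way, with their own type values.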
Guess Animal
For playing a guessing game.
Unmute Explanation
For answering questions about Unmute itself.
Voice Sources
Unmute supports multiple ways to source voice samples:
File-based Voices
Reference voice files stored on the server.
Freesound Voices
Automatically download voices from Freesound.org.
Available Voices
Browse the Kyutai TTS voice repository for hundreds of available voices, including:
- Voices from open datasets (VCTK, Expresso, etc.)
- Community-donated voices from the Voice Donation Project
- Freesound.org voice samples
To use one of these voices, set path_on_server to the voice's relative path in the repository:
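For instance, assuming a voice stored at the hypothetical repository path some-dataset/speaker_01.wav, the source block might look like this. Only path_on_server is named in this guide; the source_type key name is an assumption.

```yaml
source:
  source_type: file                             # assumed key name for the file/freesound choice
  path_on_server: some-dataset/speaker_01.wav   # hypothetical path, relative to the voice repository root
```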
Adding a New Character
Choose or Create a Voice Sample
Select a voice from the voice repository or prepare your own 10-second audio sample in WAV or MP3 format.
Edit voices.yaml
Open voices.yaml in the root of the Unmute repository and add your character definition.
Language Support
The TTS model supports English and French. Configure language support using the language field:
- en: Primarily English (default)
- fr: Primarily French
- en/fr: Fluent in both, English preference
- fr/en: Fluent in both, French preference
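A sketch of a bilingual character follows, assuming the language field sits inside the instructions block next to the instruction type. Its exact placement is not stated here, so treat that (along with the type and text key names) as an assumption to check against existing entries.

```yaml
instructions:
  type: constant     # assumed key and value
  text: You are a friendly tour guide in Paris.   # hypothetical prompt
  language: fr/en    # fluent in both, French preference
```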
Advanced Customization
For more complex character behaviors, you can create custom instruction types by modifying unmute/llm/system_prompt.py. See the existing instruction classes for examples:
- SmalltalkInstructions: Adds dynamic date/time context
- QuizShowInstructions: Randomly selects questions
- NewsInstructions: Fetches live news data
Each instruction class implements a make_system_prompt() method that returns the final system prompt string.