
Voice Options

VozCraft provides extensive voice options with support for 22+ languages across multiple regional dialects. This guide covers all available languages, voice types, and how to select the perfect voice for your content.

Overview

VozCraft leverages the Web Speech API to provide access to your system’s text-to-speech voices. The quality and characteristics of each voice depend on your operating system and browser, but VozCraft intelligently selects the best available voice based on your chosen language and voice type.
Voice Availability: The voices available to you depend on your operating system. Windows, macOS, iOS, and Android each provide different voice sets. VozCraft will automatically use the best available voice for your selected language.
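To see which voices a browser actually exposes, a page can query the Web Speech API directly. The following is a minimal sketch, not VozCraft's own code: `loadVoices` and `countByLanguage` are hypothetical helper names. It assumes that `speechSynthesis.getVoices()` may return an empty list until the `voiceschanged` event fires, which is the case in some browsers.

```javascript
// Hypothetical helper: resolve with the voice list once the browser has loaded it.
// `synth` is expected to behave like window.speechSynthesis.
function loadVoices(synth) {
  return new Promise((resolve) => {
    const voices = synth.getVoices();
    if (voices.length > 0) {
      resolve(voices);
      return;
    }
    // Some browsers fill the list asynchronously. If the event never
    // fires, this promise stays pending.
    synth.addEventListener(
      'voiceschanged',
      () => resolve(synth.getVoices()),
      { once: true }
    );
  });
}

// Hypothetical helper: count available voices per language tag (e.g. "es-MX").
function countByLanguage(voices) {
  const counts = {};
  for (const voice of voices) {
    counts[voice.lang] = (counts[voice.lang] || 0) + 1;
  }
  return counts;
}

// Browser usage; skipped in environments without the Web Speech API.
if (typeof speechSynthesis !== 'undefined') {
  loadVoices(speechSynthesis).then((voices) => {
    console.log(countByLanguage(voices));
  });
}
```

A language tag that is missing from the resulting counts suggests that no system voice is installed for it on the current device.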

Supported Languages

VozCraft supports 22 languages across 28 regional variants:

Spanish (🇪🇸 Español) - 6 Variants

Spanish is VozCraft’s most comprehensive language offering with six regional dialects:

🇲🇽 Español (México)

Mexican Spanish - es-MX
The default Spanish option, featuring:
  • Neutral Latin American pronunciation
  • Clear articulation ideal for learning
  • Widely understood across Latin America
Best for: Educational content, business applications, general audiences

🇪🇸 Español (España)

Castilian Spanish - es-ES
European Spanish featuring:
  • Distinctive “th” sound for C and Z
  • Formal European pronunciation
  • Traditional Spanish accent
Best for: European audiences, classical content, formal presentations

🇦🇷 Español (Argentina)

Argentine Spanish - es-AR
Distinctive River Plate accent with:
  • “Sh” sound for LL and Y
  • Unique intonation patterns
  • Recognizable porteño characteristics
Best for: Argentine content, tango culture, regional marketing

🇨🇴 Español (Colombia)

Colombian Spanish - es-CO
Clear, neutral accent known for:
  • Excellent clarity and enunciation
  • Widely considered “neutral” Spanish
  • Easy to understand for all speakers
Best for: International content, training materials, dubbing

🇨🇱 Español (Chile)

Chilean Spanish - es-CL
Distinctive Chilean variant with:
  • Unique pronunciation patterns
  • Regional expressions and rhythms
  • Characteristic vowel reduction
Best for: Chilean audiences, regional content, local marketing

🇻🇪 Español (Venezuela)

Venezuelan Spanish - es-VE
Caribbean Spanish featuring:
  • Warm, expressive intonation
  • Caribbean pronunciation traits
  • Distinctive Venezuelan characteristics
Best for: Venezuelan content, Caribbean region, regional media

English - 4 Variants

Comprehensive English coverage across major English-speaking regions:

🇺🇸 English (US)

American English - en-US
Standard American pronunciation:
  • General American accent
  • Rhotic (pronounced R’s)
  • Most widely recognized globally
Best for: International content, business, technology, media

🇬🇧 English (UK)

British English - en-GB
Received Pronunciation (RP):
  • Classic British accent
  • Non-rhotic pronunciation
  • Formal, authoritative tone
Best for: UK audiences, formal content, classic literature

🇦🇺 English (AU)

Australian English - en-AU
Australian accent featuring:
  • Distinctive vowel sounds
  • Unique intonation patterns
  • Friendly, approachable tone
Best for: Australian content, regional marketing, casual content

🇮🇳 English (IN)

Indian English - en-IN
Indian English accent:
  • Clear articulation
  • Distinctive pronunciation patterns
  • Growing in global recognition
Best for: Indian audiences, technical content, customer service

Portuguese (🇵🇹 Português) - 2 Variants

Português (Brasil) - pt-BR
Brazilian Portuguese features:
  • Open vowel sounds
  • Distinctive nasal pronunciation
  • More widely spoken variant (215+ million speakers)
  • Warmer, more melodic intonation
Key Differences from European Portuguese:
  • Different vowel pronunciation (especially unstressed vowels)
  • Distinct consonant sounds
  • Different vocabulary and expressions
  • More informal, friendly tone
Best for: Brazilian content, South American audiences, popular media

French (🇫🇷 Français) - 2 Variants

🇫🇷 Français (France)

French - fr-FR
Standard French (Parisian):
  • Precise articulation
  • Classic French pronunciation
  • International standard
  • Elegant, formal tone
Best for: European audiences, formal content, education, literature

🇨🇦 Français (Canada)

French (Canada) - fr-CA
Québécois French:
  • Distinctive pronunciation
  • Unique vocabulary choices
  • North American context
  • Different intonation patterns
Best for: Canadian content, North American French markets, regional media

German (🇩🇪 Deutsch)

Deutsch - de-DE
Standard High German:
  • Clear, precise pronunciation
  • Formal register
  • International German standard
  • Strong consonant articulation
Best for: German-speaking audiences (Germany, Austria, Switzerland), technical documentation, business content
Note: While this voice is labeled de-DE (Germany), it’s generally understood across all German-speaking regions.

Italian (🇮🇹 Italiano)

Italiano - it-IT
Standard Italian:
  • Melodic, musical intonation
  • Clear vowel pronunciation
  • Smooth consonant delivery
  • Expressive rhythm
Best for: Italian content, culinary content, art and culture, Mediterranean audiences
Characteristics: Italian TTS voices typically have excellent prosody due to the language’s inherent musicality.

Japanese (🇯🇵 日本語)

日本語 - ja-JP
Japanese synthesis:
  • Pitch accent rendering
  • Mora-timed rhythm
  • Formal/informal register (system-dependent)
  • Clear consonant-vowel pairs
Best for: Japanese content, anime/manga, business, technology
Note: Japanese TTS quality varies significantly by platform. iOS/macOS typically provide high-quality Japanese voices.

Chinese (🇨🇳 中文)

中文 (普通话) - Mandarin Chinese - zh-CN
Mandarin synthesis:
  • Tone reproduction (4 tones + neutral)
  • Pinyin-based pronunciation
  • Simplified character support
  • Standard Mandarin accent
Best for: Chinese content, education, business, technology
Important: Tone accuracy varies by TTS engine. Modern systems (iOS 14+, Windows 10+) provide good tone rendering.

Russian (🇷🇺 Русский)

Русский - ru-RU
Russian synthesis:
  • Cyrillic alphabet support
  • Stress-based pronunciation
  • Hard/soft consonant distinction
  • Characteristic Russian rhythm
Best for: Russian content, Eastern European audiences, technical content
Characteristics: Russian TTS handles stress patterns and consonant palatalization with varying accuracy depending on the system.

Arabic (🇸🇦 العربية)

العربية - ar-SA
Arabic synthesis (Saudi/Modern Standard Arabic):
  • Right-to-left text rendering
  • Emphatic consonant distinction
  • Classical Arabic pronunciation
  • Formal register
Best for: Arabic content, Middle Eastern audiences, formal documents, news
Note: This voice uses Modern Standard Arabic, understood across all Arabic-speaking regions.

Hindi (🇮🇳 हिन्दी)

हिन्दी - hi-IN
Hindi synthesis:
  • Devanagari script support
  • Aspirated consonant distinction
  • Retroflex sound rendering
  • Standard Hindi pronunciation
Best for: Indian content, South Asian audiences, Bollywood-related content
Characteristics: Hindi TTS quality has improved significantly with modern systems supporting proper consonant and vowel distinctions.

Turkish (🇹🇷 Türkçe)

Türkçe - tr-TR
Turkish synthesis:
  • Vowel harmony rendering
  • Clear agglutination handling
  • Distinctive Turkish phonemes
  • Modern Turkish pronunciation
Best for: Turkish content, regional marketing, education
Characteristics: Turkish has regular pronunciation rules, making TTS synthesis generally accurate.

Voice Types

In addition to language selection, VozCraft offers two voice type options that modify the pitch and rate characteristics:

🔉 Normal Voice (Voz Normal)

Technical Parameters:
{
  pitch: 0.75,
  rateAdd: -0.05,
  description: "Normal Voice"
}
Characteristics:
  • Lower pitch: 0.75 (25% below baseline)
  • Slightly slower rate: Base rate - 0.05
  • Balanced tone: Professional, neutral sound
  • Best for: Most content types, professional use, learning
Sound Profile:
  • Deeper, more authoritative voice
  • Clear articulation
  • Natural, comfortable listening experience
  • Suitable for extended listening sessions
Use Cases:
  • Business presentations
  • Educational content
  • Audiobooks and long-form content
  • News and informational content
  • Professional communications
  • Documentation and technical content

🔊 High-pitched Voice (Voz Aguda)

Technical Parameters:
{
  pitch: 1.30,
  rateAdd: 0.05,
  description: "High-pitched Voice"
}
Characteristics:
  • Higher pitch: 1.30 (30% above baseline)
  • Slightly faster rate: Base rate + 0.05
  • Energetic tone: Bright, lively sound
  • Best for: Upbeat content, children’s content, marketing
Sound Profile:
  • Lighter, more expressive voice
  • Energetic delivery
  • Attention-grabbing quality
  • Youthful, dynamic character
Use Cases:
  • Children’s content and stories
  • Marketing and promotional material
  • Upbeat announcements
  • Entertainment content
  • Character voices
  • Energetic presentations
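The two parameter sets above can be combined with a user-selected base rate along the following lines. This is a sketch under stated assumptions rather than VozCraft's implementation: the `VOICE_TYPES` keys and the `applyVoiceType` and `speak` names are hypothetical, and the clamping reflects the Web Speech API's documented ranges (rate 0.1 to 10, pitch 0 to 2).

```javascript
// Pitch and rateAdd values come from the two voice types described above;
// the object keys and function names are illustrative.
const VOICE_TYPES = {
  normal: { pitch: 0.75, rateAdd: -0.05 },
  high: { pitch: 1.3, rateAdd: 0.05 },
};

function clamp(value, min, max) {
  return Math.min(max, Math.max(min, value));
}

// Combine a base rate (1.0 = normal speed) with a voice type.
function applyVoiceType(baseRate, typeKey) {
  const type = VOICE_TYPES[typeKey];
  if (!type) {
    throw new Error(`Unknown voice type: ${typeKey}`);
  }
  return {
    pitch: clamp(type.pitch, 0, 2),
    rate: clamp(baseRate + type.rateAdd, 0.1, 10),
  };
}

// Browser-only: speak `text` in `lang` with the chosen voice type.
function speak(text, lang, typeKey, baseRate = 1.0) {
  const utterance = new SpeechSynthesisUtterance(text);
  const { pitch, rate } = applyVoiceType(baseRate, typeKey);
  utterance.lang = lang;
  utterance.pitch = pitch;
  utterance.rate = rate;
  speechSynthesis.speak(utterance);
}
```

For example, `speak('Hola, mundo', 'es-MX', 'high')` would request a pitch of 1.3 and a rate of about 1.05 from whichever voice the browser selects for `es-MX`.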

Voice Type and Gender Selection

VozCraft intelligently attempts to match system voices to your selected voice type:

Gender Matching Algorithm

// VozCraft looks for gender hints in each system voice's name.
// The array names below are illustrative.
const wantFemale = generoLabel === 'Voz Aguda';

// High-pitched Voice: keywords and specific voice names
const femaleHints = [
  'female', 'woman', 'girl', 'femenin',
  'paulina', 'mónica', 'lucia', 'valentina', 'rosa',
  'samantha', 'karen', 'alice', 'milena',
];

// Normal Voice: keywords and specific voice names
const maleHints = [
  'male', 'man', 'guy', 'masculin',
  'jorge', 'carlos', 'diego', 'miguel', 'alex',
  'daniel', 'thomas', 'james', 'mark',
];
Automatic Selection: VozCraft automatically selects the best matching voice from your system’s available voices. If no gender-matched voice is found, it falls back to the default voice for the selected language.
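The matching and fallback behavior described above might look like the following sketch. It assumes case-insensitive substring matching on the voice name, which the source does not confirm; `guessGender` and `pickVoice` are hypothetical names, and the fallback order (the voice flagged `default`, then the first voice for the language) is one possible reading of "default voice".

```javascript
// Keyword lists copied from the description above; function names are illustrative.
const FEMALE_HINTS = [
  'female', 'woman', 'girl', 'femenin', 'paulina', 'mónica', 'lucia',
  'valentina', 'rosa', 'samantha', 'karen', 'alice', 'milena',
];
const MALE_HINTS = [
  'male', 'man', 'guy', 'masculin', 'jorge', 'carlos', 'diego',
  'miguel', 'alex', 'daniel', 'thomas', 'james', 'mark',
];

function matchesAny(name, hints) {
  return hints.some((hint) => name.includes(hint));
}

// Classify a voice name as 'female', 'male', or null when no hint matches.
// Female hints are tested first because "female" contains "male"
// and "woman" contains "man".
function guessGender(voiceName) {
  const name = voiceName.toLowerCase();
  if (matchesAny(name, FEMALE_HINTS)) return 'female';
  if (matchesAny(name, MALE_HINTS)) return 'male';
  return null;
}

// Pick a voice for `lang`, preferring the gender implied by the voice type.
// Falls back to the language's default voice, then to its first voice.
function pickVoice(voices, lang, generoLabel) {
  const wantFemale = generoLabel === 'Voz Aguda';
  const wanted = wantFemale ? 'female' : 'male';
  const sameLang = voices.filter((v) => v.lang === lang);
  const match = sameLang.find((v) => guessGender(v.name) === wanted);
  return match || sameLang.find((v) => v.default) || sameLang[0] || null;
}
```

Plain substring matching can misfire: a voice name containing "Germany" also contains "man", and "Samantha" contains "man" as well, which is why the order of the two checks matters in this sketch.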

Language Groups

VozCraft internally organizes voices into language groups for better UI organization:
| Group | Languages | Purpose |
| --- | --- | --- |
| es | All Spanish variants | Spanish language family |
| en | All English variants | English language family |
| pt | Portuguese variants | Portuguese language family |
| fr | French variants | French language family |
| de | German | Germanic languages |
| it | Italian | Romance languages |
| default | All other languages | Standalone languages |
This grouping affects the UI display but doesn’t impact audio generation.
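As an illustration of the grouping above, a language tag can be mapped to its group by looking at the primary subtag. This is a minimal sketch; the `languageGroup` name and the lookup approach are assumptions, not VozCraft's actual code.

```javascript
// Group keys taken from the grouping above; the function name is illustrative.
const KNOWN_GROUPS = new Set(['es', 'en', 'pt', 'fr', 'de', 'it']);

// Map a tag such as "es-AR" to its group key, or "default" for any other language.
function languageGroup(langTag) {
  const primary = langTag.split('-')[0].toLowerCase();
  return KNOWN_GROUPS.has(primary) ? primary : 'default';
}

// Bucket a list of tags by group, e.g. for rendering a grouped menu.
function groupTags(tags) {
  const groups = {};
  for (const tag of tags) {
    const key = languageGroup(tag);
    if (!groups[key]) groups[key] = [];
    groups[key].push(tag);
  }
  return groups;
}
```

With this sketch, `languageGroup('es-AR')` yields `'es'`, while `languageGroup('ja-JP')` falls into `'default'`.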

Voice Selection Best Practices

Choosing the Right Language

1. Match Your Audience

Select the language and regional variant that matches your target audience:
  • Use es-MX for general Latin American content
  • Use es-ES for Spain and European Spanish markets
  • Use en-US for international English content
  • Use regional variants when targeting specific countries
2. Consider Voice Availability

Some languages have better voice support on certain platforms:
  • Best support: English (US/UK), Spanish (Mexico/Spain)
  • Good support: French, German, Italian, Portuguese, Japanese
  • Variable support: Arabic, Hindi, Turkish, Russian
Test your selected language to ensure quality meets your needs.
3. Test Regional Differences

Regional accents can significantly impact comprehension:
  • Argentine Spanish sounds very different from Mexican Spanish
  • British English may be harder for non-native speakers than US English
  • Brazilian and European Portuguese are quite distinct

Choosing Voice Type

Use Normal Voice for:
  • Professional content: Business reports, formal presentations
  • Long-form content: Audiobooks, articles, documentation
  • Educational material: Lectures, tutorials, courses
  • News and information: Journalism, updates, bulletins
  • Technical content: Instructions, specifications, guides
  • Serious topics: Legal, medical, academic content
Why: The lower pitch (0.75) provides an authoritative, trustworthy tone that’s comfortable for extended listening.

Platform-Specific Voice Information

Windows

Windows 10/11 provides comprehensive voice support:
Pre-installed:
  • English (US): David, Zira
  • English (UK): Hazel, George
  • Spanish (Spain): Helena, Pablo
  • French: Hortense, Paul
  • German: Hedda, Stefan
Downloadable (Settings > Time & Language > Speech):
  • Additional language packs available
  • Higher quality voices for major languages
  • Neural voices on Windows 11
Quality: Good to excellent, especially on Windows 11 with neural voices

macOS / iOS

Apple platforms offer high-quality voices:
Pre-installed:
  • English: Alex (macOS), Samantha (iOS)
  • Spanish: Paulina (Latin America), Monica (Spain)
  • French: Thomas
  • German: Anna
  • Italian: Alice
  • Japanese: Kyoko
  • Many others
Enhanced Voices (downloadable):
  • Higher quality neural voices
  • Better prosody and naturalness
  • Larger file sizes (200-800MB)
Quality: Excellent, among the best TTS voices available
Access: System Preferences > Accessibility > Spoken Content > System Voice

Android

Android uses Google TTS:
Google TTS Engine:
  • Supports 40+ languages
  • Multiple voice options per language
  • Regular quality improvements
  • Cloud-connected for best quality
Voice Installation:
  • Settings > System > Languages & input > Text-to-speech
  • Download language data as needed
  • Some voices require internet connection
Quality: Very good, especially for major languages

Linux

Linux voice support varies:
Common Engines:
  • eSpeak: Lightweight, all platforms (lower quality)
  • Festival: Open source TTS (moderate quality)
  • Pico TTS: SVOX engine (good quality)
  • Google TTS: Via Chrome browser (excellent quality)
Browser Support:
  • Chrome/Chromium: Full support with Google voices
  • Firefox: Limited support, uses system voices
Quality: Variable, Chrome provides best experience

Voice Quality Optimization

Getting the Best Results

Use Major Languages

Languages with larger speaker populations (English, Spanish, Mandarin) typically have:
  • More investment in TTS development
  • Better voice quality and naturalness
  • More voice options to choose from
  • Better accent and prosody handling

Update Your System

Newer operating system versions include:
  • Improved TTS engines
  • Neural network-based voices
  • Better prosody and intonation
  • More natural breathing and pauses
Consider updating to Windows 11, macOS 13+, or iOS 15+ for best quality.

Download Enhanced Voices

Most systems offer downloadable high-quality voices:
  • Larger file size (100-800MB)
  • Significantly better naturalness
  • Improved emotional expression
  • Worth the download for frequent use

Use Chrome or Edge

Chromium-based browsers typically provide:
  • Best Web Speech API support
  • Access to cloud-enhanced voices
  • More consistent voice quality
  • Better cross-platform compatibility

Troubleshooting Voice Issues

Common Issue: “The voice sounds wrong for my selected language”
Cause: Your system doesn’t have a voice installed for that language.
Solution: Install the language pack for your operating system or try a different language that’s better supported on your device.

Voice Not Available

1. Check System Voices

Verify what voices are installed on your system:
  • Windows: Settings > Time & Language > Speech
  • macOS: System Preferences > Accessibility > Spoken Content
  • Android: Settings > Languages & input > Text-to-speech
2. Install Missing Languages

Download the language pack for your desired language:
  • Follow your system’s language installation procedure
  • Some languages require large downloads (500MB+)
  • Restart your browser after installation
3. Try Alternative Variant

If one regional variant doesn’t work, try another:
  • Try en-GB if en-US fails
  • Try es-ES if es-MX fails
  • Try pt-PT if pt-BR fails
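The variant fallback in the last step can also be done programmatically once the list of installed voices is known. The sketch below works on any array of objects with a `lang` property, such as the result of `speechSynthesis.getVoices()`. `firstAvailable` and `sameLanguageFallback` are hypothetical helper names, not part of VozCraft.

```javascript
// Return the first voice whose tag matches one of `preferredTags`, in order.
function firstAvailable(voices, preferredTags) {
  for (const tag of preferredTags) {
    const wanted = tag.toLowerCase();
    const voice = voices.find((v) => v.lang.toLowerCase() === wanted);
    if (voice) return voice;
  }
  return null;
}

// Try the exact tag, then any voice sharing the primary language subtag
// (for example, fall back from "pt-BR" to "pt-PT").
function sameLanguageFallback(voices, tag) {
  const exact = firstAvailable(voices, [tag]);
  if (exact) return exact;
  const primary = tag.split('-')[0].toLowerCase();
  return (
    voices.find((v) => v.lang.toLowerCase().split('-')[0] === primary) || null
  );
}
```

For instance, `firstAvailable(voices, ['es-MX', 'es-ES'])` returns a Spain Spanish voice when no Mexican Spanish voice is installed, and `null` when neither is present.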

Voice Quality Issues

Unnatural or robotic sound:
  • Try using Normal Voice type instead of High-pitched
  • Select Neutral mood for most natural results
  • Use Normal speed (1.00x)
  • Download enhanced/neural voices for your system
  • Try a different language that has better voice support on your platform
Wrong voice or accent:
  • Verify you selected the correct language variant
  • Check if your system has multiple voices for that language (it may be using a different one)
  • Try generating audio again later (the browser sometimes switches voices)
  • Consider using a different browser (Chrome usually has best support)
Low volume:
  • Volume is primarily controlled by the Mood setting
  • Melancholic has lowest volume (88%)
  • Neutral/Happy/Enthusiastic/Energetic have full volume (100%)
  • Adjust your system volume if needed
  • Check browser audio permissions
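The mood volumes listed above correspond to values on the Web Speech API's 0 to 1 volume scale. A minimal lookup could be sketched as follows; the `MOOD_VOLUME` table keys and the `volumeForMood` name are assumptions for illustration, and unknown moods default to full volume here.

```javascript
// Volume per mood, using the percentages listed above (88% -> 0.88).
// Key spelling and the function name are illustrative.
const MOOD_VOLUME = {
  Neutral: 1.0,
  Happy: 1.0,
  Enthusiastic: 1.0,
  Energetic: 1.0,
  Melancholic: 0.88,
};

// Return a value suitable for SpeechSynthesisUtterance.volume (0 to 1).
function volumeForMood(mood) {
  const volume = MOOD_VOLUME[mood];
  return volume === undefined ? 1.0 : volume;
}
```

In a browser, the result would be assigned to `utterance.volume` before calling `speechSynthesis.speak(utterance)`.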

Next Steps

Now that you understand VozCraft’s voice options, explore how to customize your audio further:

Customization

Learn about speed, pitch, and mood controls

Audio Export

Export your audio in MP3 or WAV format

Voice Settings Guide

Detailed guide for optimal voice configuration
