NVDA provides comprehensive speech synthesis support through multiple speech drivers, allowing users to choose from various synthesizers based on their preferences and system configuration.

Overview

Speech synthesis in NVDA converts text and interface information into spoken audio output. NVDA supports several speech synthesizer technologies, from built-in Windows voices to open-source solutions.

Supported Speech Synthesizers

eSpeak NG

Open-source multilingual synthesizer with support for over 100 languages. Default synthesizer for NVDA.

Microsoft Speech API (SAPI 5)

Native Windows speech synthesizer supporting third-party SAPI 5 voices and system-installed voices.

Windows OneCore

Modern Windows 10+ synthesizer using OneCore voices with natural-sounding speech.

Microsoft Speech API 4 (SAPI 4)

Legacy 32-bit synthesizer for compatibility with older SAPI 4 voices.

eSpeak NG

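eSpeak NG is bundled with NVDA and provides multilingual support out of the box. It is the default wherever Windows OneCore voices are unavailable; on Windows 10 and later, NVDA prefers OneCore voices on first start.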

Features

  • Open Source: Free and actively maintained
  • Multilingual: Supports 100+ languages including:
    • English (en-gb, en-us, en-029)
    • Spanish (es), French (fr-fr), German (de)
    • Russian (ru), Chinese, Japanese, Korean
    • And many more regional variants
  • Voice Variants: Multiple voice personalities
  • Fast Response: Low latency for quick feedback
  • Customizable: Extensive control over speech parameters

Configurable Settings

  • Voice: Select from available language voices
  • Variant: Choose voice personality (male, female, whisper, etc.)
  • Rate: Speech speed (adjustable from very slow to very fast)
  • Rate Boost: Enable even faster speech rates
  • Pitch: Voice pitch level
  • Inflection: Amount of pitch variation
  • Volume: Output volume level

Language Support

eSpeak NG can switch languages automatically when the text carries language information (see Automatic Language Switching). Supported languages include:
af (Afrikaans), an (Aragonese), cy (Welsh), de (German),
en-gb (English UK), en-us (English US), es (Spanish),
eu (Basque), fi (Finnish), fr-fr (French), grc (Ancient Greek),
gu (Gujarati), hu (Hungarian), ia (Interlingua),
it (Italian), ja (Japanese), kk (Kazakh), ko (Korean),
ms (Malay), nci (Classical Nahuatl), nog (Nogai), om (Oromo),
pt (Portuguese), ru (Russian), uk (Ukrainian),
vi-vn (Vietnamese), zh (Chinese), and many more...

eSpeak NG is included with NVDA and requires no additional installation. It’s the recommended synthesizer for new users.

SAPI 5 (Speech API Version 5)

SAPI 5 is Microsoft’s speech API that allows NVDA to use third-party and system-installed voices.

Features

  • Third-Party Voice Support: Compatible with commercial voices (Eloquence, Vocalizer, etc.)
  • System Integration: Uses voices installed on Windows
  • High Quality: Often more natural-sounding than eSpeak
  • Commercial Options: Supports premium voice packages

Supported Commands

The SAPI 5 driver supports advanced speech commands:
  • Index commands for synchronization
  • Character mode for spelling
  • Language switching
  • Break/pause commands
  • Real-time pitch, rate, and volume adjustments
  • Phoneme commands for pronunciation
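
To make the command list above more concrete, here is a minimal sketch of how a mixed sequence of text and commands could be turned into SAPI 5 XML markup. The `Index`, `Break`, and `CharacterMode` classes and the `to_sapi5_xml` function are hypothetical stand-ins written for this illustration; they are not NVDA's own classes, and the real driver may work quite differently. Only the `<bookmark>`, `<silence>`, and `<spell>` tags come from the SAPI 5 XML format.

```python
from dataclasses import dataclass
from xml.sax.saxutils import escape


# Hypothetical stand-ins for speech commands; these are NOT NVDA's classes.
@dataclass
class Index:
    number: int


@dataclass
class Break:
    milliseconds: int


@dataclass
class CharacterMode:
    enabled: bool


def to_sapi5_xml(sequence):
    """Convert a mixed list of strings and commands into SAPI 5 XML markup."""
    parts = []
    spelling = False
    for item in sequence:
        if isinstance(item, str):
            text = escape(item)  # protect &, < and > inside the markup
            parts.append(f"<spell>{text}</spell>" if spelling else text)
        elif isinstance(item, Index):
            # Bookmarks let the caller learn how far speech has progressed.
            parts.append(f'<bookmark mark="{item.number}"/>')
        elif isinstance(item, Break):
            parts.append(f'<silence msec="{item.milliseconds}"/>')
        elif isinstance(item, CharacterMode):
            spelling = item.enabled
        else:
            raise TypeError(f"unsupported item: {item!r}")
    return "".join(parts)


xml = to_sapi5_xml(
    ["Hello", Index(1), Break(250), CharacterMode(True), "NVDA", CharacterMode(False), Index(2)]
)
print(xml)
```

Keeping the sequence as plain Python objects until the last moment means the same input could, in principle, be rendered for a different synthesizer by swapping only the conversion function.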

Configuration

The SAPI 5 driver provides full control over voice parameters through NVDA’s speech settings dialog:
  1. Open NVDA menu → Preferences → Settings → Speech
  2. Select “Microsoft Speech API version 5” as synthesizer
  3. Choose your installed SAPI 5 voice
  4. Adjust rate, pitch, volume, and inflection

SAPI 5 requires compatible voices to be installed on your system. Windows comes with some voices, but additional voices must be purchased or downloaded separately.

Windows OneCore Voices

Windows OneCore voices are modern, natural-sounding synthesizers available in Windows 10 and later.

Features

  • Natural Speech: Voices designed to sound more human-like than older formant-based synthesizers
  • Native Integration: Built into Windows 10/11
  • Multiple Languages: Various language and regional voice options
  • No Installation: Available immediately on compatible systems
  • Prosody Support: Advanced intonation and rhythm control

Technical Details

The OneCore driver uses Windows Speech Platform (oneCore API version 5+) for:
  • SSML (Speech Synthesis Markup Language) conversion
  • Real-time prosody adjustments (rate, pitch, volume)
  • Language detection and automatic switching
  • Speech event synchronization

Available Voices

OneCore voices are installed with Windows language packs. Examples include:
  • English: Microsoft David, Microsoft Zira, Microsoft Mark
  • Spanish: Microsoft Helena, Microsoft Laura
  • French: Microsoft Hortense, Microsoft Paul
  • German: Microsoft Hedda, Microsoft Katja
  • And many more regional variants

To add more OneCore voices:
  1. Open Windows Settings → Time & Language → Language
  2. Add a language or click on an installed language
  3. Click Options → Speech
  4. Download available speech voices
  5. Restart NVDA to detect new voices

SAPI 4 (Speech API Version 4)

A legacy 32-bit speech synthesizer driver for compatibility with older SAPI 4 voices.

Features

  • Legacy Support: Maintains compatibility with older voice software
  • 32-bit Bridge: Runs through a 32-bit proxy process
  • Historical Compatibility: Supports vintage assistive technology voices

SAPI 4 is a legacy technology. It requires 32-bit SAPI 4 voices to be installed and registered in the Windows registry. Only use this if you have specific older voices that require SAPI 4.

Silence (No Speech)

NVDA includes a “silence” driver that disables all speech output. This is useful for:
  • Braille-only users
  • Testing and debugging
  • Demonstrations without audio
  • Silent operation environments

Changing Speech Synthesizers

To change your speech synthesizer:
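  1. Press NVDA+Control+S to open the Select Synthesizer dialog (or open NVDA menu → Preferences → Settings, select the “Speech” category, and activate the Change... button)
  2. Choose your synthesizer from the list and click OK
  3. In the Speech settings (NVDA+Control+V), select a voice and adjust settings
  4. Click OK to apply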
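
The Synth Settings Ring (NVDA+Control+Left/Right Arrow to pick a setting, NVDA+Control+Up/Down Arrow to change it) adjusts settings of the current synthesizer, such as voice, rate, and pitch, without opening a dialog. It does not switch between synthesizers.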

Speech Parameters

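Most NVDA synthesizers support these core parameters, though the exact set depends on the driver (inflection, for example, is not offered by every synthesizer):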

Rate

Control how fast speech is delivered (set on a 0 to 100 scale that each driver maps to its own speed range)

Pitch

Adjust voice pitch frequency (higher or lower)

Volume

Set speech output volume level

Inflection

Control pitch variation and intonation
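
NVDA's settings sliders present these parameters on a 0 to 100 scale, while each synthesizer engine has its own native range. The sketch below shows one way such a mapping could work, using simple linear scaling. The function names and the example range of 80 to 450 are assumptions chosen for illustration, not values taken from any particular driver.

```python
def percent_to_param(percent, minimum, maximum):
    """Map a 0-100 slider value onto an engine's native range (linear)."""
    percent = max(0, min(100, percent))  # clamp out-of-range input
    return round(minimum + (maximum - minimum) * percent / 100)


def param_to_percent(value, minimum, maximum):
    """Map a native engine value back to the 0-100 slider scale."""
    return round((value - minimum) * 100 / (maximum - minimum))


# Hypothetical engine whose rate runs from 80 to 450 native units.
MIN_RATE, MAX_RATE = 80, 450

native = percent_to_param(50, MIN_RATE, MAX_RATE)
print(native)                                        # 265
print(param_to_percent(native, MIN_RATE, MAX_RATE))  # 50
```

Clamping the input keeps a stale or corrupted configuration value from pushing the engine outside the range it accepts.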

Advanced Features

Automatic Language Switching

When enabled, NVDA automatically switches synthesizer voice when encountering text in different languages. This requires:
  • Multi-lingual synthesizer (eSpeak NG or OneCore)
  • Language information in document markup
  • Compatible voice for target language
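
The third requirement, finding a compatible voice for the target language, can be sketched as a lookup that falls back from a regional tag to its base language. The `pick_voice` function and the voice table below are hypothetical and only illustrate the idea; this is not NVDA's actual matching logic.

```python
def pick_voice(lang_tag, voices, default):
    """Return the best voice for lang_tag, trying 'en-gb' and then 'en'.

    voices maps lowercase, hyphenated language tags to voice names.
    """
    tag = lang_tag.replace("_", "-").lower()
    while tag:
        if tag in voices:
            return voices[tag]
        # Drop the last subtag: 'en-gb' becomes 'en', and 'en' becomes ''.
        tag = tag.rpartition("-")[0]
    return default


# Hypothetical table of installed voices.
voices = {"en": "English voice", "fr-fr": "French voice"}

print(pick_voice("en_GB", voices, "Default voice"))  # English voice
print(pick_voice("fr-FR", voices, "Default voice"))  # French voice
print(pick_voice("ja", voices, "Default voice"))     # Default voice
```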

Character Mode

Synthesizers can switch to character mode for spelling words letter-by-letter with appropriate pronunciation.

Audio Ducking

NVDA can reduce the volume of other applications (audio ducking) when speaking, ensuring speech is always audible.

Developer Information

NVDA’s speech synthesizer drivers are located in:
source/synthDrivers/
├── espeak.py          # eSpeak NG driver
├── sapi5.py           # SAPI 5 driver  
├── oneCore.py         # Windows OneCore driver
├── sapi4_32.py        # SAPI 4 32-bit driver
└── silence.py         # No speech driver
Each driver implements the SynthDriver interface defined in synthDriverHandler.py.
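
To give a rough idea of what a driver looks like, the sketch below defines a toy driver that records text instead of speaking it. `SynthDriverSketch` is a hypothetical stand-in base class written so the example runs on its own; it is not NVDA's `SynthDriver`, and a real driver would subclass the class from synthDriverHandler.py and typically has to provide considerably more (supported settings, voices, index notifications).

```python
from abc import ABC, abstractmethod


class SynthDriverSketch(ABC):
    """Hypothetical stand-in for a driver base class; not NVDA's SynthDriver."""

    name = ""
    description = ""

    @classmethod
    def check(cls):
        """Report whether this driver can run on the current system."""
        return True

    @abstractmethod
    def speak(self, speech_sequence):
        """Speak a list that mixes text strings and command objects."""

    @abstractmethod
    def cancel(self):
        """Stop speaking and discard anything queued."""


class RecordingDriver(SynthDriverSketch):
    name = "recording"
    description = "Records text instead of speaking (illustration only)"

    def __init__(self):
        self.spoken = []

    def speak(self, speech_sequence):
        # Keep only the text; a real driver would also act on the commands.
        self.spoken.extend(item for item in speech_sequence if isinstance(item, str))

    def cancel(self):
        self.spoken.clear()


driver = RecordingDriver()
driver.speak(["Hello", object(), "world"])
print(driver.spoken)  # ['Hello', 'world']
```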
