Speech Synthesis

NVDA provides comprehensive speech synthesis support through multiple speech drivers, allowing users to choose from various synthesizers based on their preferences and system configuration.

Overview

Speech synthesis in NVDA converts text and interface information into spoken audio output. NVDA supports several speech synthesizer technologies, from built-in Windows voices to open-source solutions.

Supported Speech Synthesizers

eSpeak NG

Open-source multilingual synthesizer with support for over 100 languages. Bundled with NVDA and used by default where Windows OneCore voices are not available.

Microsoft Speech API (SAPI 5)

Native Windows speech synthesizer supporting third-party SAPI 5 voices and system-installed voices.

Windows OneCore

Modern Windows 10+ synthesizer using OneCore voices with natural-sounding speech.

Microsoft Speech API 4 (SAPI 4)

Legacy 32-bit synthesizer for compatibility with older SAPI 4 voices.

eSpeak NG

eSpeak NG is the speech synthesizer bundled with NVDA, providing multilingual support out of the box. NVDA uses it by default on systems without Windows OneCore voices.

Features

Open Source: Free and actively maintained
Multilingual: Supports 100+ languages including:
- English (en-gb, en-us, en-029)
- Spanish (es), French (fr-fr), German (de)
- Russian (ru), Chinese, Japanese, Korean
- And many more regional variants
Voice Variants: Multiple voice personalities
Fast Response: Low latency for quick feedback
Customizable: Extensive control over speech parameters

Configurable Settings

Voice Settings

Voice: Select from available language voices
Variant: Choose voice personality (male, female, whisper, etc.)
Rate: Speech speed (adjustable from very slow to very fast)
Rate Boost: Enable even faster speech rates
Pitch: Voice pitch level
Inflection: Amount of pitch variation
Volume: Output volume level

Language Support

eSpeak NG can switch languages automatically when the text being read carries language information. Supported languages include:

af (Afrikaans), an (Aragonese), cy (Welsh), de (German),
en-gb (English UK), en-us (English US), es (Spanish),
eu (Basque), fi (Finnish), fr-fr (French), grc (Ancient Greek),
gu (Gujarati), hu (Hungarian), ia (Interlingua),
it (Italian), ja (Japanese), kk (Kazakh), ko (Korean),
ms (Malay), nci (Nahuatl), nog (Nogai), om (Oromo),
pt (Portuguese), ru (Russian), uk (Ukrainian),
vi-vn (Vietnamese), zh (Chinese), and many more...

eSpeak NG is included with NVDA and requires no additional installation. It’s the recommended synthesizer for new users.

SAPI 5 (Speech API Version 5)

SAPI 5 is Microsoft’s speech API that allows NVDA to use third-party and system-installed voices.

Features

Third-Party Voice Support: Compatible with commercial voices (Eloquence, Vocalizer, etc.)
System Integration: Uses voices installed on Windows
High Quality: Often more natural-sounding than eSpeak NG
Commercial Options: Supports premium voice packages

Supported Commands

The SAPI 5 driver supports advanced speech commands:

Index commands for synchronization
Character mode for spelling
Language switching
Break/pause commands
Real-time pitch, rate, and volume adjustments
Phoneme commands for pronunciation
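As a rough illustration of how some of these commands can be expressed, the sketch below renders a mixed list of text and commands as SAPI 5 XML markup. The `bookmark`, `silence`, and `spell` tags are standard SAPI 5 XML tags; the `Index`, `Break`, and `CharacterMode` classes and the `to_sapi5_xml` function are hypothetical names used only for this example and are not NVDA's actual driver code.

```python
from dataclasses import dataclass
from xml.sax.saxutils import escape


# Hypothetical command types, for illustration only.
@dataclass
class Index:
    number: int


@dataclass
class Break:
    milliseconds: int


@dataclass
class CharacterMode:
    enabled: bool


def to_sapi5_xml(sequence):
    """Render a list of strings and commands as SAPI 5 XML markup."""
    parts = []
    spelling = False
    for item in sequence:
        if isinstance(item, str):
            text = escape(item)  # protect <, > and & in spoken text
            parts.append(f"<spell>{text}</spell>" if spelling else text)
        elif isinstance(item, Index):
            # Bookmarks let the caller learn when this point is reached.
            parts.append(f'<bookmark mark="{item.number}"/>')
        elif isinstance(item, Break):
            parts.append(f'<silence msec="{item.milliseconds}"/>')
        elif isinstance(item, CharacterMode):
            spelling = item.enabled
        else:
            raise TypeError(f"unsupported item: {item!r}")
    return "".join(parts)


markup = to_sapi5_xml(
    ["Hello", Index(1), Break(250), CharacterMode(True), "NVDA", CharacterMode(False), Index(2)]
)
print(markup)
```

Escaping the text before wrapping it keeps characters such as `&` from being read as markup.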

Configuration

SAPI 5 Settings

The SAPI 5 driver provides full control over voice parameters through NVDA’s speech settings dialog:

Open NVDA menu → Preferences → Settings → Speech
Select “Microsoft Speech API version 5” as synthesizer
Choose your installed SAPI 5 voice
Adjust rate, pitch, and volume

SAPI 5 requires compatible voices to be installed on your system. Windows comes with some voices, but additional voices must be purchased or downloaded separately.

Windows OneCore Voices

Windows OneCore voices are modern, natural-sounding synthesizers available in Windows 10 and later.

Features

Natural Speech: Smoother, more human-like voices than formant synthesizers such as eSpeak NG
Native Integration: Built into Windows 10/11
Multiple Languages: Various language and regional voice options
No Installation: Available immediately on compatible systems
Prosody Support: Advanced intonation and rhythm control

Technical Details

The OneCore driver uses Windows Speech Platform (oneCore API version 5+) for:

SSML (Speech Synthesis Markup Language) conversion
Real-time prosody adjustments (rate, pitch, volume)
Language detection and automatic switching
Speech event synchronization
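To make the SSML step more concrete, here is a minimal sketch of wrapping text in a `speak` document with a `prosody` element and an optional `mark` for synchronization. The function name `build_ssml` and its parameters are assumptions made for this example; the actual conversion inside the OneCore driver may differ.

```python
from xml.sax.saxutils import escape, quoteattr

SSML_NS = "http://www.w3.org/2001/10/synthesis"


def build_ssml(text, lang="en-US", rate_change=0, pitch_change=0, volume=100, mark=None):
    """Wrap text in SSML 1.0 with relative rate/pitch changes and an absolute volume.

    rate_change and pitch_change are signed percentages (for example +20 or -10);
    volume is a number from 0 to 100.
    """
    body = escape(text)
    if mark is not None:
        # A mark lets the caller be notified when speech reaches this point.
        body += f'<mark name="{int(mark)}"/>'
    return (
        f'<speak version="1.0" xmlns="{SSML_NS}" xml:lang={quoteattr(lang)}>'
        f'<prosody rate="{int(rate_change):+d}%" pitch="{int(pitch_change):+d}%" '
        f'volume="{int(volume)}">'
        f"{body}</prosody></speak>"
    )


ssml = build_ssml("Rock & roll", lang="en-GB", rate_change=20, pitch_change=-10, volume=80, mark=3)
print(ssml)
```

Signed relative changes are used here because they are valid in SSML 1.0; how a particular engine interprets the values is outside the scope of this sketch.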

Available Voices

OneCore voices are installed with Windows language packs. Examples include:

English: Microsoft David, Microsoft Zira, Microsoft Mark
Spanish: Microsoft Helena, Microsoft Laura
French: Microsoft Hortense, Microsoft Paul
German: Microsoft Hedda, Microsoft Katja
And many more regional variants

Adding OneCore Voices

To add more OneCore voices:

Open Windows Settings → Time & Language → Language
Add a language or click on an installed language
Click Options → Speech
Download available speech voices
Restart NVDA to detect new voices

SAPI 4 (Speech API Version 4)

A legacy 32-bit speech synthesizer driver for compatibility with older SAPI 4 voices.

Features

Legacy Support: Maintains compatibility with older voice software
32-bit Bridge: Runs through a 32-bit proxy process
Historical Compatibility: Supports vintage assistive technology voices

SAPI 4 is a legacy technology. It requires 32-bit SAPI 4 voices to be installed and registered in the Windows registry. Only use this if you have specific older voices that require SAPI 4.

Silence (No Speech)

NVDA includes a “silence” driver that disables all speech output. This is useful for:

Braille-only users
Testing and debugging
Demonstrations without audio
Silent operation environments

Changing Speech Synthesizers

To change your speech synthesizer:

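Press NVDA+Control+S to open the Select Synthesizer dialog (or open NVDA menu → Preferences → Settings, select the “Speech” category, and activate the Change... button)
Choose your synthesizer from the dropdown
Click OK to apply
Select a voice and adjust its settings in the “Speech” category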

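You can also adjust the current synthesizer's voice, rate, pitch, and other settings without opening a dialog by using the Synth Settings Ring: in the desktop keyboard layout, NVDA+Control+Left/Right Arrow moves between settings and NVDA+Control+Up/Down Arrow changes the selected setting.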
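The settings ring can be pictured as a circular list of adjustable values: one pair of keys moves between settings and another pair changes the selected one. The sketch below models that idea; the `Setting` and `SettingsRing` classes, the step size, and the 0 to 100 range are assumptions for illustration and do not mirror NVDA's internal classes.

```python
from dataclasses import dataclass


@dataclass
class Setting:
    name: str
    value: int
    minimum: int = 0
    maximum: int = 100
    step: int = 5

    def adjust(self, direction):
        """Move the value by one step (direction is +1 or -1), clamped to the range."""
        self.value = max(self.minimum, min(self.maximum, self.value + direction * self.step))
        return self.value


class SettingsRing:
    """Circular list of settings with one 'current' entry."""

    def __init__(self, settings):
        self._settings = list(settings)
        if not self._settings:
            raise ValueError("a settings ring needs at least one setting")
        self._index = 0

    @property
    def current(self):
        return self._settings[self._index]

    def next_setting(self):
        self._index = (self._index + 1) % len(self._settings)
        return self.current

    def previous_setting(self):
        self._index = (self._index - 1) % len(self._settings)
        return self.current

    def increase(self):
        return self.current.adjust(+1)

    def decrease(self):
        return self.current.adjust(-1)


ring = SettingsRing([Setting("rate", 50), Setting("pitch", 40), Setting("volume", 100)])
print(ring.next_setting().name, ring.increase())
```

Wrapping with the modulo operator is what makes the list a ring: moving back from the first setting lands on the last one.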

Speech Parameters

Most NVDA synthesizers support these core parameters, although the exact set depends on the driver:

Rate

Control how fast speech is delivered, on a 0 to 100 scale that each driver maps to its engine's own rate range

Pitch

Adjust voice pitch frequency (higher or lower)

Volume

Set speech output volume level

Inflection

Control pitch variation and intonation
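Because each engine has its own native range for these parameters, a driver typically has to translate between a user-facing scale and engine values. The sketch below shows one way to do this with a linear mapping, assuming a 0 to 100 user-facing scale. The function names are hypothetical, and the -10 to 10 range in the example matches the rate range of SAPI 5 voices.

```python
def percent_to_param(percent, min_val, max_val):
    """Map a 0-100 setting onto an engine's native range (linear, clamped)."""
    percent = max(0, min(100, percent))
    return int(round(min_val + (max_val - min_val) * percent / 100))


def param_to_percent(value, min_val, max_val):
    """Map a native engine value back to a 0-100 setting."""
    if max_val == min_val:
        raise ValueError("native range must not be empty")
    return int(round((value - min_val) * 100 / (max_val - min_val)))


# Example: a 75% rate on an engine whose native rate runs from -10 to 10.
native_rate = percent_to_param(75, -10, 10)
print(native_rate, param_to_percent(native_rate, -10, 10))
```

Keeping the user-facing value on one scale means switching synthesizers does not change what a given slider position means.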

Advanced Features

Automatic Language Switching

When enabled, NVDA automatically switches the synthesizer voice when it encounters text in a different language. This requires:

Multi-lingual synthesizer (eSpeak NG or OneCore)
Language information in document markup
Compatible voice for target language
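The requirements above can be sketched as a small voice-selection routine: each text segment carries an optional language tag from the document markup, and the synthesizer picks the closest installed voice or falls back to the default. The function names, the voice table, and the matching order are assumptions for this example rather than NVDA's actual logic.

```python
def pick_voice(lang, voices, default_voice):
    """Pick a voice for a language tag: exact match, then base language, then default."""
    lang = lang.lower().replace("_", "-")
    if lang in voices:
        return voices[lang]
    base = lang.split("-")[0]
    if base in voices:
        return voices[base]
    # Accept any regional variant of the same base language.
    for tag, voice in voices.items():
        if tag.split("-")[0] == base:
            return voice
    return default_voice


def plan_speech(segments, voices, default_voice):
    """Turn (language or None, text) pairs into (voice, text) pairs,
    merging neighbours that end up with the same voice."""
    plan = []
    for lang, text in segments:
        voice = pick_voice(lang, voices, default_voice) if lang else default_voice
        if plan and plan[-1][0] == voice:
            plan[-1] = (voice, plan[-1][1] + " " + text)
        else:
            plan.append((voice, text))
    return plan


# Hypothetical table of installed voices, keyed by language tag.
voices = {"en-gb": "English (Great Britain)", "fr-fr": "French (France)", "de": "German"}
segments = [(None, "The French say"), ("fr", "bonjour"), ("de-AT", "Servus"), ("ja", "konnichiwa")]
plan = plan_speech(segments, voices, voices["en-gb"])
for voice, text in plan:
    print(f"{voice}: {text}")
```

The last segment shows the third requirement in action: with no Japanese voice available, the text is read by the default voice instead.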

Character Mode

Synthesizers can switch to character mode for spelling words letter-by-letter with appropriate pronunciation.
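A minimal sketch of what spelling involves is shown below: the text is split into characters, and characters that would otherwise be silent or ambiguous are replaced with spoken names. The `CHARACTER_NAMES` table and the "cap" prefix for capitals are hypothetical choices for this example; real synthesizers and NVDA's own symbol tables are far more complete.

```python
# Hypothetical pronunciation table, for illustration only.
CHARACTER_NAMES = {" ": "space", ".": "dot", "-": "dash", "@": "at"}


def spell(text):
    """Return the list of utterances used to spell text one character at a time."""
    spoken = []
    for ch in text:
        if ch in CHARACTER_NAMES:
            spoken.append(CHARACTER_NAMES[ch])
        elif ch.isupper():
            spoken.append(f"cap {ch}")  # announce capitals explicitly
        else:
            spoken.append(ch)
    return spoken


print(spell("Nv-d a"))
```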

Audio Ducking

NVDA can reduce the volume of other applications (audio ducking) when speaking, ensuring speech is always audible.

Developer Information

NVDA’s speech synthesizer drivers are located in:

source/synthDrivers/
├── espeak.py          # eSpeak NG driver
├── sapi5.py           # SAPI 5 driver  
├── oneCore.py         # Windows OneCore driver
├── sapi4_32.py        # SAPI 4 32-bit driver
└── silence.py         # No speech driver

Each driver implements the SynthDriver interface defined in synthDriverHandler.py.
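The overall shape of a driver can be sketched as follows. So that the example runs outside NVDA, it uses a stand-in base class, `BaseSynthDriver`, instead of importing the real one; the stand-in and the `LoggingSynthDriver` class are hypothetical. The members shown (`name`, `description`, `check`, `speak`, `cancel`) follow the general pattern of NVDA drivers, but the real base class has more members and its own naming conventions.

```python
class BaseSynthDriver:
    """Stand-in for NVDA's SynthDriver base class (assumption: simplified)."""

    name = ""
    description = ""

    @classmethod
    def check(cls):
        """Report whether this synthesizer can be used on the current system."""
        return False

    def speak(self, speech_sequence):
        raise NotImplementedError

    def cancel(self):
        raise NotImplementedError


class LoggingSynthDriver(BaseSynthDriver):
    """Toy driver that records text instead of producing audio."""

    name = "loggingDemo"
    description = "Logging demo (hypothetical)"

    def __init__(self):
        self.spoken = []

    @classmethod
    def check(cls):
        return True  # no external engine is needed

    def speak(self, speech_sequence):
        # A speech sequence mixes text with command objects; keep only the text here.
        for item in speech_sequence:
            if isinstance(item, str):
                self.spoken.append(item)

    def cancel(self):
        self.spoken.clear()


driver = LoggingSynthDriver()
driver.speak(["Hello", object(), "world"])
print(driver.spoken)
```

A real driver would forward the text and commands to its speech engine rather than store them in a list.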

Related Topics

Browse Mode - Navigation modes for documents
Braille Displays - Tactile output devices
Vision Enhancements - Visual highlighting features
