What is Whisper?
Whisper is a general-purpose speech recognition model developed by OpenAI. It is trained on a large dataset of diverse audio and is a multitask model that can perform multilingual speech recognition, speech translation, and language identification. Under the hood, a Transformer sequence-to-sequence model is trained on several speech processing tasks: multilingual speech recognition, speech translation, spoken language identification, and voice activity detection. These tasks are jointly represented as a sequence of tokens to be predicted by the decoder, allowing a single model to replace many stages of a traditional speech-processing pipeline.
Key Features
Multilingual Recognition
Transcribe speech in 99+ languages with high accuracy across diverse accents and dialects
Speech Translation
Translate speech from any supported language directly into English text
Multiple Model Sizes
Choose from 6 model sizes (tiny to turbo) to balance speed and accuracy for your use case
Simple API
Easy-to-use Python API and command-line interface for quick integration
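As a rough illustration of the two entry points mentioned above, the sketch below builds a `whisper` command line and wraps the Python API in a small function. The helper names `build_whisper_command` and `transcribe_file` are hypothetical and exist only for this example. The Python wrapper assumes the `openai-whisper` package and `ffmpeg` are installed, and the audio file names are placeholders.

```python
from typing import List, Optional


def build_whisper_command(
    audio_path: str,
    model: str = "turbo",
    task: str = "transcribe",
    language: Optional[str] = None,
) -> List[str]:
    """Build an argument list for the `whisper` CLI (hypothetical helper).

    Uses the CLI flags --model, --task and --language.
    """
    if task not in ("transcribe", "translate"):
        raise ValueError(f"unsupported task: {task!r}")
    cmd = ["whisper", audio_path, "--model", model, "--task", task]
    if language is not None:
        # Naming the spoken language skips automatic language detection.
        cmd += ["--language", language]
    return cmd


def transcribe_file(audio_path: str, model: str = "turbo") -> str:
    """Transcribe one file with the Python API (hypothetical wrapper)."""
    # Imported lazily so the module loads even without openai-whisper installed.
    import whisper

    loaded = whisper.load_model(model)
    result = loaded.transcribe(audio_path)
    return result["text"]


if __name__ == "__main__":
    # Translate Japanese speech into English text with a multilingual model.
    print(" ".join(build_whisper_command(
        "japanese.wav", model="medium", task="translate", language="Japanese")))
```

The command builder returns a list rather than a string so that it can be passed to `subprocess.run` without shell quoting issues. The translation example uses `medium` rather than `turbo` because `turbo` is not trained for translation.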
Available Models
Whisper offers six model sizes with varying speed and accuracy tradeoffs:

| Size | Parameters | English-only model | Multilingual model | Required VRAM | Relative speed |
|---|---|---|---|---|---|
| tiny | 39 M | tiny.en | tiny | ~1 GB | ~10x |
| base | 74 M | base.en | base | ~1 GB | ~7x |
| small | 244 M | small.en | small | ~2 GB | ~4x |
| medium | 769 M | medium.en | medium | ~5 GB | ~2x |
| large | 1550 M | N/A | large | ~10 GB | 1x |
| turbo | 809 M | N/A | turbo | ~6 GB | ~8x |
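
One way to use the table is as a lookup for the largest model that fits a given GPU. The sketch below encodes the table's approximate figures and picks a model for a VRAM budget. The `ModelSpec` type and `pick_model` function are hypothetical names for this example. Treating a higher parameter count as a proxy for accuracy is an assumption made here, not a claim from the table.

```python
from typing import List, NamedTuple, Optional


class ModelSpec(NamedTuple):
    name: str
    params_m: int          # parameter count in millions
    vram_gb: int           # approximate required VRAM, from the table
    relative_speed: int    # approximate speed relative to `large`
    has_english_only: bool


# Values copied from the table; VRAM and speed figures are approximate.
MODELS: List[ModelSpec] = [
    ModelSpec("tiny", 39, 1, 10, True),
    ModelSpec("base", 74, 1, 7, True),
    ModelSpec("small", 244, 2, 4, True),
    ModelSpec("medium", 769, 5, 2, True),
    ModelSpec("large", 1550, 10, 1, False),
    ModelSpec("turbo", 809, 6, 8, False),
]


def pick_model(vram_gb: float, english_only: bool = False) -> Optional[str]:
    """Return the largest model that fits the VRAM budget, or None.

    Assumption: more parameters means better accuracy. When english_only
    is set, the `.en` variant is returned if the chosen size has one.
    """
    fitting = [m for m in MODELS if m.vram_gb <= vram_gb]
    if not fitting:
        return None
    best = max(fitting, key=lambda m: m.params_m)
    if english_only and best.has_english_only:
        return best.name + ".en"
    return best.name


if __name__ == "__main__":
    for budget in (1, 5, 6, 10):
        print(budget, "GB ->", pick_model(budget))
```

A real deployment would also weigh the relative speed column, for example preferring `turbo` over `medium` whenever both fit.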
The .en models for English-only applications tend to perform better, especially for the tiny.en and base.en models. The turbo model is an optimized version of large-v3 that offers faster transcription speed with minimal degradation in accuracy.
Getting Started
Installation
Install Whisper and its dependencies on your system
Quickstart
Start transcribing audio in minutes with CLI and Python examples
Resources
Research Paper
Read the technical paper on arXiv
Blog Post
Learn more from the official OpenAI blog
Model Card
View detailed model specifications