v20250625
Security & Bug Fixes
- Update torch.load to use weights_only=True to prevent security vulnerabilities (#2451)
- Ensure DTW cost tensor is on the same device as input tensor (#2561)
- Fixed triton kernel update to support latest triton versions (#2588)
- Fix GitHub display errors for Jupyter notebooks (#2589)
- Bugfix: Illogical “Avoid computing higher temperatures on no_speech” (#1903)
- Fix typo in data/README.md (#2433)
Documentation
- Updated README to specify translation model limitation (#2547)
- Updating README and doc strings to reflect that n_mels can now be 128 (#2049)
- Update README.md (#2379)
Features
- Add option to carry initial_prompt with the sliding window (#2343)
Development & CI
- Bump the github-actions group with 3 updates (#2592)
- Keep GitHub Actions up to date with GitHub’s Dependabot (#2486)
- pre-commit: Upgrade black v25.1.0 and isort v6.0.0 (#2514)
- GitHub Actions: Add Python 3.13 to the testing (#2487)
- PEP 621: Migrate from setup.py to pyproject.toml (#2435)
- pre-commit autoupdate && pre-commit run --all-files (#2484)
- Upgrade GitHub Actions (#2430)
- More pytorch versions in tests (#2408)
v20240930
Features
- large-v3-turbo model - New optimized model for faster inference (#2361)
- Using SDPA (Scaled Dot Product Attention) if available (#2359)
Development
v20240927
Bug Fixes
Development
- Pinning numpy<2 in tests (#2332)
- Relax triton requirements for compatibility with PyTorch 2.4 and newer (#2307)
v20231117
Compatibility
- Relax triton requirements for compatibility with PyTorch 2.1 and newer (#1802)
v20231106
Features
- large-v3 - New improved large model (#1761)
v20231105
Features
- Add new option to generate subtitles by a specific number of words (#1729)
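The idea behind splitting subtitles by word count can be sketched as follows. This is only an illustration under stated assumptions: the helper name `chunk_words` and its parameter are hypothetical and are not taken from Whisper's code.

```python
def chunk_words(words, max_words_per_line):
    """Group words into subtitle lines of at most max_words_per_line words.

    Hypothetical helper for illustration; not Whisper's implementation.
    """
    if max_words_per_line < 1:
        raise ValueError("max_words_per_line must be at least 1")
    return [
        " ".join(words[i : i + max_words_per_line])
        for i in range(0, len(words), max_words_per_line)
    ]


words = "the quick brown fox jumps over the lazy dog".split()
lines = chunk_words(words, 4)
```

The last line may hold fewer words than the limit when the word count is not an exact multiple.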
Bug Fixes
- Fix exception when an audio file with no speech is provided (#1396)
- Handling transcribe exceptions (#1682)
Security
- allow_pickle=False while loading mel matrix in audio.py (#1511)
Development
- Remove tiktoken pin (#1759)
Documentation
- Disambiguation of the term “relative speed” in the README (#1751)
v20230918
Features
- Word timing tweaks (#1559)
- Implement max line width and max line count, and make word highlighting optional (#1184)
Bug Fixes
- Fix condition_on_previous_text (#1224)
- Fix numba deprecation notice (#1233)
- Drop ffmpeg-python dependency and call ffmpeg directly (#1242)
Performance
- Avoid rearranging all caches (#1483)
- Avoid computing higher temperatures on no_speech segments (#1279)
Improvements
Development
- Add .pre-commit-config.yaml (#1528)
- Python 3.11 support (#1171)
- Dropped unused execute bit from mel_filters.npz (#1254)
Documentation
- Fix doc of TextDecoder (#1526)
- Update model-card.md (#1643)
- Updated README.md to provide more insight on BLEU and specific appendices (#1236)
- Update README.md to reference tiktoken (#1105)
- Update decoding.py (#1219, #1155)
- Update tokenizer.py (#1163)
v20230314
Features
- Use tiktoken for fast tokenization (#1044)
Bug Fixes
- Abort find_alignment on empty input (#1090)
- Fix truncated words list when the replacement character is decoded (#1089)
- Fix alignment between the segments and the list of words (#1087)
- Fix github language stats getting dominated by jupyter notebook (#1076)
v20230308
Features
- kwargs in decode() for convenience (#1061)
Bug Fixes
- Fix all_tokens handling that caused more repetitions and discrepancy in JSON (#1060)
- Fix typo in CHANGELOG.md
v20230307
Bug Fixes
- Fix the repetition/hallucination issue identified in #1046 (#1052)
Development
- Use triton==2.0.0 (#1053)
- Install triton in x86_64 linux only (#1051)
- Update setup.py to specify python >= 3.8 requirement
v20230306
Features
Bug Fixes
- Fix infinite loop caused by incorrect timestamp tokens prediction (#914)
Development
- Remove auxiliary audio extension (#1021)
- Apply formatting with black, isort, and flake8 (#1038)
- Drop Python 3.7 support (#889)
Documentation
- Update README.md (#894)
v20230124
Features
- Add TSV formatted output in transcript, using integer start/end time in milliseconds (#228)
- Added --output_format option (#333)
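A rough sketch of TSV output with integer millisecond timestamps, for readers who want to see the shape of the format. The function name `write_tsv`, the header row, and the segment keys `start`, `end`, and `text` are assumptions made for this example, not a copy of Whisper's writer.

```python
import io


def write_tsv(segments, file):
    """Write segments as tab-separated rows with start/end in integer ms.

    Hypothetical helper; segment keys are assumed for illustration.
    """
    print("start", "end", "text", sep="\t", file=file)
    for seg in segments:
        start_ms = round(1000 * seg["start"])  # seconds -> integer milliseconds
        end_ms = round(1000 * seg["end"])
        # Tabs inside the text would break the column layout, so replace them.
        text = seg["text"].strip().replace("\t", " ")
        print(start_ms, end_ms, text, sep="\t", file=file)


buf = io.StringIO()
write_tsv([{"start": 0.0, "end": 2.5, "text": " Hello world."}], buf)
tsv_text = buf.getvalue()
```

Integer milliseconds avoid locale-dependent decimal separators, which makes the file easier to parse in spreadsheets and scripts.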
Bug Fixes
- Fix bug where mm is mistakenly replaced with hmm in e.g. 20mm (#659)
- Handle XDG_CACHE_HOME properly for download_root (#864)
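One way a default download directory could honor XDG_CACHE_HOME is sketched below. The function name `default_download_root`, the `env` parameter, and the `whisper` subdirectory are assumptions for this example; the actual fix in #864 may differ in detail.

```python
import os


def default_download_root(env=None):
    """Return <cache home>/whisper, preferring XDG_CACHE_HOME when it is set.

    Hypothetical helper for illustration only.
    """
    env = os.environ if env is None else env
    fallback = os.path.join(os.path.expanduser("~"), ".cache")
    # An unset or empty XDG_CACHE_HOME falls back to ~/.cache.
    cache_home = env.get("XDG_CACHE_HOME") or fallback
    return os.path.join(cache_home, "whisper")


custom = default_download_root({"XDG_CACHE_HOME": "/tmp/xdg"})
default = default_download_root({})
```

Passing the environment as a mapping keeps the helper easy to test without touching the real process environment.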
Improvements
- Handle printing even if sys.stdout.buffer is not available (#887)
- Use stdout for printing transcription progress (#867)
- Print ’?’ if a letter can’t be encoded using the system default encoding (#859)
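The '?' replacement behavior can be illustrated with Python's built-in `errors="replace"` handler. The helper name `make_safe` and the `ascii` default are assumptions chosen so the example is deterministic; this is a sketch, not the code from #859.

```python
def make_safe(text, encoding="ascii"):
    """Replace characters that `encoding` cannot represent with '?'.

    Hypothetical helper; a real caller would likely pass the system
    default encoding instead of the fixed "ascii" used here.
    """
    return text.encode(encoding, errors="replace").decode(encoding)


safe = make_safe("café ☕")
```

Each character that cannot be encoded becomes a single '?', so the output keeps the same number of characters as the input.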