Create translation
Translates audio into English.
from openai import OpenAI
client = OpenAI()
audio_file = open("german_audio.mp3", "rb")
translation = client.audio.translations.create(
    model="whisper-1",
    file=audio_file
)
print(translation.text)
Parameters
file
Required. The audio file object (not file name) to translate, in one of these formats: flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, or webm.
model
Required. ID of the model to use. Only whisper-1 (which is powered by OpenAI’s open source Whisper V2 model) is currently available.
prompt
Optional. Text to guide the model’s style or continue a previous audio segment. The prompt should be in English.
response_format
Optional. Defaults to json. The format of the output, in one of these options: json, text, srt, verbose_json, or vtt.
temperature
Optional. Defaults to 0. The sampling temperature, between 0 and 1. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. If set to 0, the model will use log probability to automatically increase the temperature until certain thresholds are hit.
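The optional parameters above can be validated locally before a request is made. A minimal sketch, assuming hypothetical helper names (build_translation_params and SUPPORTED_RESPONSE_FORMATS are illustrative, not part of the OpenAI SDK):

```python
# Hypothetical helper: builds and validates kwargs for the translations
# endpoint before any network call is made.
SUPPORTED_RESPONSE_FORMATS = {"json", "text", "srt", "verbose_json", "vtt"}

def build_translation_params(prompt=None, response_format="json", temperature=0.0):
    """Return a kwargs dict for client.audio.translations.create, validating inputs."""
    if response_format not in SUPPORTED_RESPONSE_FORMATS:
        raise ValueError(f"Unsupported response_format: {response_format!r}")
    if not 0 <= temperature <= 1:
        raise ValueError("temperature must be between 0 and 1")
    params = {
        "model": "whisper-1",
        "response_format": response_format,
        "temperature": temperature,
    }
    if prompt is not None:
        params["prompt"] = prompt  # omit the key entirely when no prompt is given
    return params
```

The resulting dict can then be splatted into the call, e.g. client.audio.translations.create(file=audio_file, **build_translation_params(response_format="srt")).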
Response
The translated text in English.
Examples
Basic translation
from openai import OpenAI
client = OpenAI()
# Translate Spanish audio to English
audio_file = open("spanish_audio.mp3", "rb")
translation = client.audio.translations.create(
    model="whisper-1",
    file=audio_file
)
print(translation.text)
Get translation as SRT subtitles
from openai import OpenAI
client = OpenAI()
audio_file = open("french_video.mp3", "rb")
translation = client.audio.translations.create(
    model="whisper-1",
    file=audio_file,
    response_format="srt"
)
# Save English SRT file
with open("english_subtitles.srt", "w") as f:
    f.write(translation)
Get translation as VTT
from openai import OpenAI
client = OpenAI()
audio_file = open("mandarin_audio.mp3", "rb")
translation = client.audio.translations.create(
    model="whisper-1",
    file=audio_file,
    response_format="vtt"
)
# Save as WebVTT file
with open("subtitles.vtt", "w") as f:
    f.write(translation)
Translation with verbose JSON
from openai import OpenAI
client = OpenAI()
audio_file = open("japanese_audio.mp3", "rb")
translation = client.audio.translations.create(
    model="whisper-1",
    file=audio_file,
    response_format="verbose_json"
)
# Access detailed information
print(f"Language: {translation.language}")
print(f"Duration: {translation.duration}")
print(f"Text: {translation.text}")
Translation with prompt
from openai import OpenAI
client = OpenAI()
audio_file = open("german_presentation.mp3", "rb")
translation = client.audio.translations.create(
    model="whisper-1",
    file=audio_file,
    prompt="This is a technical presentation about machine learning algorithms."
)
print(translation.text)
Batch translate multiple files
from openai import OpenAI
from pathlib import Path
client = OpenAI()
audio_files = Path("audio_files").glob("*.mp3")
for audio_path in audio_files:
    with audio_path.open("rb") as audio_file:
        translation = client.audio.translations.create(
            model="whisper-1",
            file=audio_file
        )
    # Save translation
    output_path = audio_path.with_suffix(".txt")
    output_path.write_text(translation.text)
    print(f"Translated {audio_path.name}")
Async usage
import asyncio
from openai import AsyncOpenAI
async def translate_audio():
    client = AsyncOpenAI()
    audio_file = open("italian_audio.mp3", "rb")
    translation = await client.audio.translations.create(
        model="whisper-1",
        file=audio_file
    )
    print(translation.text)

asyncio.run(translate_audio())
The translation endpoint supports the following audio formats:
flac - Free Lossless Audio Codec
mp3 - MPEG audio format
mp4 - MPEG-4 Part 14
mpeg - MPEG audio
mpga - MPEG audio
m4a - MPEG-4 audio
ogg - Ogg Vorbis
wav - Waveform audio
webm - WebM audio
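A file's extension can be checked against this list locally before uploading, which avoids a round trip for an unsupported format. A small sketch (the helper name is illustrative, not part of the SDK):

```python
from pathlib import Path

# Extensions accepted by the translations endpoint, per the list above
SUPPORTED_EXTENSIONS = {
    ".flac", ".mp3", ".mp4", ".mpeg", ".mpga", ".m4a", ".ogg", ".wav", ".webm",
}

def is_supported_audio(path):
    """Return True if the file extension is one the endpoint accepts."""
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS
```

Note this checks only the extension, not the actual encoding of the file.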
File uploads
Files are uploaded using multipart/form-data. The file object should be opened in binary mode:
# Correct way to open file
audio_file = open("path/to/file.mp3", "rb")
# Or using pathlib
from pathlib import Path
audio_file = Path("path/to/file.mp3").open("rb")
Translation vs Transcription
The key difference between the /audio/translations and /audio/transcriptions endpoints:
- Translations - Always outputs English text, regardless of input language
- Transcriptions - Outputs text in the same language as the audio input
When to use translations
Use the translations endpoint when you need to:
- Convert non-English audio into English text
- Create English subtitles for foreign language videos
- Translate podcasts or audio content into English
- Build multilingual applications that standardize on English output
When to use transcriptions
Use the transcriptions endpoint when you need to:
- Convert audio to text in the same language
- Create subtitles in the original language
- Preserve the original language of the content
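The two checklists above reduce to one question: should the output be English when the audio is not? A tiny illustrative decision helper (not a real API):

```python
def choose_audio_endpoint(audio_language, want_english):
    """Illustrative: pick the endpoint for a given audio language and output goal.

    audio_language is an ISO 639-1 code like "es"; want_english is a bool.
    """
    if want_english and audio_language != "en":
        return "/audio/translations"   # always produces English text
    return "/audio/transcriptions"     # keeps the audio's original language
```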
Example: Multi-language processing
from openai import OpenAI
client = OpenAI()
# First transcribe in original language
audio_file = open("spanish_audio.mp3", "rb")
transcription = client.audio.transcriptions.create(
    model="whisper-1",
    file=audio_file,
    language="es"
)
print(f"Spanish: {transcription.text}")
# Then translate to English
audio_file = open("spanish_audio.mp3", "rb")
translation = client.audio.translations.create(
    model="whisper-1",
    file=audio_file
)
print(f"English: {translation.text}")