Overview
Timeouts prevent requests from hanging indefinitely. The OpenAI Python SDK provides flexible timeout configuration at both the client and request level.
Default timeout: 600 seconds (10 minutes)
Default Behavior
By default, all requests time out after 10 minutes:

```python
from openai import OpenAI

# Default: 600 second timeout
client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Hello!"}],
)
# Times out after 10 minutes if no response
```
Simple Timeout Configuration
Client-Level Timeout
Set a default timeout for all requests:
```python
from openai import OpenAI

# 20 second timeout
client = OpenAI(timeout=20.0)

# 5 minute timeout
client = OpenAI(timeout=300.0)

# No timeout (wait indefinitely)
client = OpenAI(timeout=None)
```
Setting timeout=None disables timeouts completely. This can cause requests to hang indefinitely if the server doesn’t respond.
Per-Request Timeout
Override the timeout for specific requests:
```python
from openai import OpenAI

client = OpenAI(timeout=600.0)  # Default 10 minutes

# Short timeout for a quick request
response = client.with_options(timeout=5.0).chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Quick question"}],
)

# Longer timeout for a complex request
response = client.with_options(timeout=120.0).chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Detailed analysis..."}],
)
```
Granular Timeout Configuration
For fine-grained control, use httpx.Timeout to configure different timeout phases:
```python
import httpx
from openai import OpenAI

client = OpenAI(
    timeout=httpx.Timeout(
        timeout=60.0,  # Default for any phase not set below
        connect=5.0,   # Connection timeout
        read=30.0,     # Read timeout
        write=10.0,    # Write timeout
        pool=5.0,      # Connection pool timeout
    ),
)
```
Timeout Phases

- timeout: The default applied to any phase that is not configured explicitly. Note that httpx does not enforce a single cap on the whole request/response cycle; each phase is timed separately.
- connect: Maximum time to establish a connection to the server.
- read: Maximum time to wait for data to be received from the server. This applies per read operation, not to the entire response.
- write: Maximum time to wait for data to be sent to the server. This applies per write operation.
- pool: Maximum time to wait for a connection from the connection pool.
Example Configurations
Fast Requests
```python
import httpx
from openai import OpenAI

# Short timeouts for quick requests
client = OpenAI(
    timeout=httpx.Timeout(
        timeout=10.0,
        connect=2.0,
        read=5.0,
        write=5.0,
    ),
)
```
Long-Running Requests
```python
import httpx
from openai import OpenAI

# Longer timeouts for complex operations
client = OpenAI(
    timeout=httpx.Timeout(
        timeout=300.0,  # 5 minute default for unset phases
        connect=10.0,
        read=120.0,     # 2 minutes per read
        write=30.0,
    ),
)
```
Streaming Requests
```python
import httpx
from openai import OpenAI

# Optimized for streaming
client = OpenAI(
    timeout=httpx.Timeout(
        timeout=None,  # No default timeout for unset phases
        connect=5.0,
        read=60.0,     # Per-chunk timeout
        write=10.0,
    ),
)

stream = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Write a story"}],
    stream=True,
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
# Each chunk must arrive within 60 seconds
```
For streaming requests, set the default timeout to None and rely on the read timeout for per-chunk timing.
Timeout Errors
When a timeout occurs, the SDK raises APITimeoutError:
```python
import openai
from openai import OpenAI

client = OpenAI(timeout=5.0)

try:
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": "Hello!"}],
    )
except openai.APITimeoutError as e:
    print(f"Request timed out: {e.message}")
    print(f"Request: {e.request.url}")
```
Timeout errors are automatically retried by default. See Retries for configuration.
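Because a timed-out request is retried (twice by default), the worst-case wall time is several multiples of the configured timeout. A rough back-of-envelope sketch; the helper name and backoff values here are illustrative, not the SDK's exact retry schedule:

```python
def worst_case_wait(timeout_s: float, max_retries: int,
                    backoffs: tuple = (0.5, 1.0)) -> float:
    """Worst case: every attempt runs to the full timeout, plus backoff sleeps."""
    return timeout_s * (max_retries + 1) + sum(backoffs[:max_retries])

print(worst_case_wait(5.0, 2))  # 16.5: three 5s attempts plus 1.5s of backoff
```

Passing max_retries=0 when constructing the client disables retries, so a timeout surfaces immediately.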
Async Timeouts
Timeouts work identically with AsyncOpenAI:
```python
import asyncio
import httpx
from openai import AsyncOpenAI

client = AsyncOpenAI(
    timeout=httpx.Timeout(
        timeout=30.0,
        connect=5.0,
        read=15.0,
        write=10.0,
    ),
)

async def main():
    response = await client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": "Hello!"}],
    )
    print(response.choices[0].message.content)

asyncio.run(main())
```
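On top of the SDK's own timeouts, asyncio can impose an application-level deadline on any awaited call. A minimal sketch; the helper name is hypothetical, and coro would be e.g. a client.chat.completions.create(...) coroutine:

```python
import asyncio

async def with_deadline(coro, deadline_s: float):
    # Cancels the underlying operation if the deadline elapses first.
    return await asyncio.wait_for(coro, timeout=deadline_s)
```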
Custom HTTP Client Timeout
When providing a custom HTTP client, the SDK uses the client’s timeout if not explicitly set:
```python
from openai import OpenAI, DefaultHttpxClient

# Timeout from custom client
custom_client = DefaultHttpxClient(timeout=30.0)
client = OpenAI(http_client=custom_client)
# Uses 30 second timeout from custom_client

# Override with explicit timeout
client = OpenAI(
    http_client=custom_client,
    timeout=60.0,  # Overrides custom_client timeout
)
```
When using a custom http_client, prefer DefaultHttpxClient to preserve SDK defaults.
Timeout Headers

The SDK includes the read timeout in request headers:

```python
from openai import OpenAI

client = OpenAI(timeout=30.0)

# Automatically includes the header:
# x-stainless-read-timeout: 30.0
response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Hello!"}],
)
```
This helps with server-side debugging and timeout analysis.
Best Practices
- Set reasonable timeouts - Balance between too short (frequent failures) and too long (poor UX)
- Use per-request timeouts for requests with different expected durations
- Configure streaming timeouts with timeout=None and an appropriate read timeout
- Monitor timeout errors - Frequent timeouts may indicate performance issues
- Consider retry behavior - Timeouts are retried by default, increasing total wait time
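One way to act on the per-request advice above is a tiny helper that scales an expected duration into a timeout. The function name, multiplier, and floor are arbitrary illustrations, not SDK features:

```python
def pick_timeout(expected_s: float, headroom: float = 3.0, floor: float = 5.0) -> float:
    """Give a request several times its expected duration, never less than a floor."""
    return max(floor, expected_s * headroom)

# Usage: client.with_options(timeout=pick_timeout(2.0)).chat.completions.create(...)
print(pick_timeout(2.0))  # 6.0
print(pick_timeout(0.5))  # 5.0 (floor applies)
```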
Common Scenarios
Short Requests
```python
import httpx
from openai import OpenAI

client = OpenAI(
    timeout=httpx.Timeout(
        timeout=15.0,
        connect=2.0,
        read=10.0,
    ),
)

response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "What is 2+2?"}],
)
```
Long Completions
```python
import httpx
from openai import OpenAI

client = OpenAI(
    timeout=httpx.Timeout(
        timeout=180.0,  # 3 minutes
        connect=5.0,
        read=120.0,
    ),
)

response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Write a detailed essay..."}],
)
```
File Uploads
```python
import httpx
from openai import OpenAI
from pathlib import Path

client = OpenAI(
    timeout=httpx.Timeout(
        timeout=300.0,  # 5 minutes
        connect=5.0,
        write=120.0,    # 2 minutes for upload
        read=60.0,
    ),
)

file = client.files.create(
    file=Path("large_file.jsonl"),
    purpose="fine-tune",
)
```
Streaming
```python
import httpx
from openai import OpenAI

client = OpenAI(
    timeout=httpx.Timeout(
        timeout=None,  # No default timeout for unset phases
        connect=5.0,
        read=30.0,     # 30s between chunks
    ),
)

stream = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Tell me a story"}],
    stream=True,
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
```
Debugging Timeouts
Enable debug logging to see timeout information:
```python
from openai import OpenAI

# Set OPENAI_LOG=debug in the environment to enable the SDK's verbose logging.
client = OpenAI(timeout=10.0)

# Logs will show:
# - Request headers including x-stainless-read-timeout
# - Timeout exceptions with full details
# - Retry attempts due to timeouts
try:
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": "Hello!"}],
    )
except Exception as e:
    print(f"Error: {e}")
```
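One concrete way to turn that logging on with the standard library; the logger names below ("httpx", "openai") are assumptions based on each library's default logger naming:

```python
import logging

# Route DEBUG-level records from the HTTP layer and the SDK to stderr.
logging.basicConfig(level=logging.DEBUG)
logging.getLogger("httpx").setLevel(logging.DEBUG)
logging.getLogger("openai").setLevel(logging.DEBUG)
```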