Overview
Timeouts prevent requests from hanging indefinitely. The OpenAI Python SDK provides flexible timeout configuration at both the client and request level.
Default timeout: 600 seconds (10 minutes)
Default Behavior
By default, all requests time out after 10 minutes:

```python
from openai import OpenAI

# Default: 600 second timeout
client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Hello!"}],
)
# Times out after 10 minutes if no response
```
Simple Timeout Configuration
Client-Level Timeout
Set a default timeout for all requests:
```python
from openai import OpenAI

# 20 second timeout
client = OpenAI(timeout=20.0)

# 5 minute timeout
client = OpenAI(timeout=300.0)

# No timeout (wait indefinitely)
client = OpenAI(timeout=None)
```
Setting timeout=None disables timeouts completely. This can cause requests to hang indefinitely if the server doesn’t respond.
Per-Request Timeout
Override the timeout for specific requests:
```python
from openai import OpenAI

client = OpenAI(timeout=600.0)  # Default 10 minutes

# Short timeout for a quick request
response = client.with_options(timeout=5.0).chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Quick question"}],
)

# Longer timeout for a complex request
response = client.with_options(timeout=120.0).chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Detailed analysis..."}],
)
```
Granular Timeout Configuration
For fine-grained control, use httpx.Timeout to configure different timeout phases:
```python
import httpx
from openai import OpenAI

client = OpenAI(
    timeout=httpx.Timeout(
        timeout=60.0,  # Default for any phase not set below
        connect=5.0,   # Connection timeout
        read=30.0,     # Read timeout
        write=10.0,    # Write timeout
        pool=5.0,      # Connection pool timeout
    ),
)
```
Timeout Phases

- timeout: The default applied to any phase that is not configured explicitly. Note that httpx does not enforce a single cap on the whole request/response cycle; each phase is timed separately.
- connect: Maximum time to establish a connection to the server.
- read: Maximum time to wait for data to be received from the server. This applies per read operation, not to the entire response.
- write: Maximum time to wait for data to be sent to the server. This applies per write operation.
- pool: Maximum time to wait for a connection from the connection pool.
Example Configurations
Fast Requests
```python
import httpx
from openai import OpenAI

# Short timeouts for quick requests
client = OpenAI(
    timeout=httpx.Timeout(
        timeout=10.0,
        connect=2.0,
        read=5.0,
        write=5.0,
    ),
)
```
Long-Running Requests
```python
import httpx
from openai import OpenAI

# Longer timeouts for complex operations
client = OpenAI(
    timeout=httpx.Timeout(
        timeout=300.0,  # 5 minute default for unset phases
        connect=10.0,
        read=120.0,     # 2 minutes per read
        write=30.0,
    ),
)
```
Streaming Requests
```python
import httpx
from openai import OpenAI

# Optimized for streaming
client = OpenAI(
    timeout=httpx.Timeout(
        timeout=None,  # No default timeout for unset phases
        connect=5.0,
        read=60.0,     # Per-chunk timeout
        write=10.0,
    ),
)

stream = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Write a story"}],
    stream=True,
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
# Each chunk must arrive within 60 seconds
```
For streaming requests, set the default timeout to None and rely on the read timeout for per-chunk timing.
Timeout Errors
When a timeout occurs, the SDK raises APITimeoutError:
```python
import openai
from openai import OpenAI

client = OpenAI(timeout=5.0)

try:
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": "Hello!"}],
    )
except openai.APITimeoutError as e:
    print(f"Request timed out: {e.message}")
    print(f"Request: {e.request.url}")
```
Timeout errors are automatically retried by default. See Retries for configuration.
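Because a timed-out request is retried (twice by default), the worst-case wall time is several multiples of the configured timeout. A rough back-of-envelope sketch; the helper name and backoff values here are illustrative, not the SDK's exact retry schedule:

```python
def worst_case_wait(timeout_s: float, max_retries: int,
                    backoffs: tuple = (0.5, 1.0)) -> float:
    """Worst case: every attempt runs to the full timeout, plus backoff sleeps."""
    return timeout_s * (max_retries + 1) + sum(backoffs[:max_retries])

print(worst_case_wait(5.0, 2))  # 16.5: three 5s attempts plus 1.5s of backoff
```

Passing max_retries=0 when constructing the client disables retries, so a timeout surfaces immediately.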
Async Timeouts
Timeouts work identically with AsyncOpenAI:
```python
import asyncio
import httpx
from openai import AsyncOpenAI

client = AsyncOpenAI(
    timeout=httpx.Timeout(
        timeout=30.0,
        connect=5.0,
        read=15.0,
        write=10.0,
    ),
)

async def main():
    response = await client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": "Hello!"}],
    )
    print(response.choices[0].message.content)

asyncio.run(main())
```
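On top of the SDK's own timeouts, asyncio can impose an application-level deadline on any awaited call. A minimal sketch; the helper name is hypothetical, and coro would be e.g. a client.chat.completions.create(...) coroutine:

```python
import asyncio

async def with_deadline(coro, deadline_s: float):
    # Cancels the underlying operation if the deadline elapses first.
    return await asyncio.wait_for(coro, timeout=deadline_s)
```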
Custom HTTP Client Timeout
When providing a custom HTTP client, the SDK uses the client’s timeout if not explicitly set:
```python
from openai import OpenAI, DefaultHttpxClient

# Timeout from custom client
custom_client = DefaultHttpxClient(timeout=30.0)
client = OpenAI(http_client=custom_client)
# Uses 30 second timeout from custom_client

# Override with explicit timeout
client = OpenAI(
    http_client=custom_client,
    timeout=60.0,  # Overrides custom_client timeout
)
```
When using a custom http_client, prefer DefaultHttpxClient to preserve SDK defaults.
Timeout Headers

The SDK includes the read timeout in request headers:

```python
from openai import OpenAI

client = OpenAI(timeout=30.0)

# Automatically includes the header:
# x-stainless-read-timeout: 30.0
response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Hello!"}],
)
```
This helps with server-side debugging and timeout analysis.
Best Practices
- Set reasonable timeouts - Balance between too short (frequent failures) and too long (poor UX)
- Use per-request timeouts for requests with different expected durations
- Configure streaming timeouts with timeout=None and an appropriate read timeout
- Monitor timeout errors - Frequent timeouts may indicate performance issues
- Consider retry behavior - Timeouts are retried by default, increasing total wait time
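One way to act on the per-request advice above is a tiny helper that scales an expected duration into a timeout. The function name, multiplier, and floor are arbitrary illustrations, not SDK features:

```python
def pick_timeout(expected_s: float, headroom: float = 3.0, floor: float = 5.0) -> float:
    """Give a request several times its expected duration, never less than a floor."""
    return max(floor, expected_s * headroom)

# Usage: client.with_options(timeout=pick_timeout(2.0)).chat.completions.create(...)
print(pick_timeout(2.0))  # 6.0
print(pick_timeout(0.5))  # 5.0 (floor applies)
```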
Common Scenarios
Short Requests
```python
import httpx
from openai import OpenAI

client = OpenAI(
    timeout=httpx.Timeout(
        timeout=15.0,
        connect=2.0,
        read=10.0,
    ),
)

response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "What is 2+2?"}],
)
```
Long Completions
```python
import httpx
from openai import OpenAI

client = OpenAI(
    timeout=httpx.Timeout(
        timeout=180.0,  # 3 minutes
        connect=5.0,
        read=120.0,
    ),
)

response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Write a detailed essay..."}],
)
```
File Uploads
```python
import httpx
from openai import OpenAI
from pathlib import Path

client = OpenAI(
    timeout=httpx.Timeout(
        timeout=300.0,  # 5 minutes
        connect=5.0,
        write=120.0,    # 2 minutes for upload
        read=60.0,
    ),
)

file = client.files.create(
    file=Path("large_file.jsonl"),
    purpose="fine-tune",
)
```
Streaming
```python
import httpx
from openai import OpenAI

client = OpenAI(
    timeout=httpx.Timeout(
        timeout=None,  # No default timeout for unset phases
        connect=5.0,
        read=30.0,     # 30s between chunks
    ),
)

stream = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Tell me a story"}],
    stream=True,
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
```
Debugging Timeouts
Enable debug logging to see timeout information:
```python
from openai import OpenAI

# Set OPENAI_LOG=debug in the environment to enable the SDK's verbose logging.
client = OpenAI(timeout=10.0)

# Logs will show:
# - Request headers including x-stainless-read-timeout
# - Timeout exceptions with full details
# - Retry attempts due to timeouts
try:
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": "Hello!"}],
    )
except Exception as e:
    print(f"Error: {e}")
```
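One concrete way to turn that logging on with the standard library; the logger names below ("httpx", "openai") are assumptions based on each library's default logger naming:

```python
import logging

# Route DEBUG-level records from the HTTP layer and the SDK to stderr.
logging.basicConfig(level=logging.DEBUG)
logging.getLogger("httpx").setLevel(logging.DEBUG)
logging.getLogger("openai").setLevel(logging.DEBUG)
```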