Overview

OpenAI is one of the leading AI providers, offering powerful language models (GPT-4, GPT-3.5, o1), image generation (DALL-E), speech (Whisper, TTS), and more. Portkey provides full support for all OpenAI capabilities.

Base URL: https://api.openai.com/v1

Supported Features

  • ✅ Chat Completions (including streaming)
  • ✅ Completions (legacy)
  • ✅ Embeddings
  • ✅ Image Generation (DALL-E)
  • ✅ Image Editing
  • ✅ Text-to-Speech (TTS)
  • ✅ Speech-to-Text (Whisper transcription)
  • ✅ Audio Translation
  • ✅ Realtime API (WebSocket)
  • ✅ Function Calling & Tools
  • ✅ Vision (GPT-4 Vision)
  • ✅ Batch API
  • ✅ Fine-tuning
  • ✅ File Operations

Quick Start

Chat Completions

from portkey_ai import Portkey

client = Portkey(
    provider="openai",
    Authorization="sk-***"  # Your OpenAI API key
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is the capital of France?"}
    ]
)

print(response.choices[0].message.content)

Streaming Responses

stream = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Count from 1 to 5"}],
    stream=True
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
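When streaming, each chunk carries only a fragment of the reply, so the deltas are usually accumulated into the full text. A minimal sketch — the `accumulate_stream` helper and the simulated chunks below are illustrative, not part of the SDK:

```python
from types import SimpleNamespace

def accumulate_stream(stream):
    """Join the text deltas from a streaming response into the full reply."""
    parts = []
    for chunk in stream:
        # Role-only and finish chunks have delta.content == None and are skipped
        if chunk.choices and chunk.choices[0].delta.content:
            parts.append(chunk.choices[0].delta.content)
    return "".join(parts)

# Simulated chunks with the same shape as the SDK's stream objects
fake_chunks = [
    SimpleNamespace(choices=[SimpleNamespace(delta=SimpleNamespace(content=c))])
    for c in ["Hel", "lo", None, "!"]
]
print(accumulate_stream(fake_chunks))  # Hello!
```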
Supported Models

| Model | Context Window | Description | Best For |
|---|---|---|---|
| gpt-4o | 128K tokens | Latest GPT-4 Omni model | General purpose, multimodal |
| gpt-4o-mini | 128K tokens | Faster, cost-effective GPT-4 | High-volume tasks |
| gpt-4-turbo | 128K tokens | Enhanced GPT-4 | Complex reasoning |
| gpt-3.5-turbo | 16K tokens | Fast and efficient | Simple tasks, high throughput |
| o1-preview | 128K tokens | Advanced reasoning | Math, science, coding |
| o3-mini | 128K tokens | Efficient reasoning | Balanced performance |
| text-embedding-3-large | 8K tokens | Latest embeddings | Semantic search, RAG |
| dall-e-3 | N/A | Image generation | High-quality images |
| whisper-1 | N/A | Speech-to-text | Transcription |
| tts-1 | N/A | Text-to-speech | Voice generation |

Configuration Options

Headers

client = Portkey(
    provider="openai",
    Authorization="sk-***",
    openai_organization="org-***",      # Optional: Organization ID
    openai_project="proj_***",          # Optional: Project ID
    openai_beta="assistants=v2"         # Optional: Beta features
)
| Header | Description | Required |
|---|---|---|
| Authorization | OpenAI API key (Bearer token) | Yes |
| openai_organization | Organization ID | No |
| openai_project | Project ID | No |
| openai_beta | Beta feature flags | No |

Advanced Features

Function Calling

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather in a location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "City name"
                    },
                    "unit": {
                        "type": "string",
                        "enum": ["celsius", "fahrenheit"]
                    }
                },
                "required": ["location"]
            }
        }
    }
]

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    tools=tools,
    tool_choice="auto"
)
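When the model decides to call a tool, the reply contains `tool_calls` instead of text, and each call's `arguments` field is a JSON string that must be parsed before invoking your own function. A sketch of that dispatch step, using a simulated response object and a hypothetical local `get_weather` implementation (neither is part of the SDK):

```python
import json
from types import SimpleNamespace

def get_weather(location, unit="celsius"):
    """Hypothetical local implementation backing the get_weather tool."""
    return {"location": location, "temperature": 18, "unit": unit}

def run_tool_calls(response, registry):
    """Dispatch each tool call in the model's reply to a local function."""
    results = []
    message = response.choices[0].message
    for call in message.tool_calls or []:
        fn = registry[call.function.name]
        args = json.loads(call.function.arguments)  # arguments arrive as a JSON string
        results.append({
            "role": "tool",
            "tool_call_id": call.id,
            "content": json.dumps(fn(**args)),
        })
    return results

# Simulated response shaped like the SDK objects above
fake_call = SimpleNamespace(
    id="call_1",
    function=SimpleNamespace(name="get_weather",
                             arguments='{"location": "Paris"}'),
)
fake_response = SimpleNamespace(choices=[SimpleNamespace(
    message=SimpleNamespace(tool_calls=[fake_call]))])

tool_messages = run_tool_calls(fake_response, {"get_weather": get_weather})
print(tool_messages[0]["content"])
```

In a real loop you would append these `tool` messages to the conversation and call `chat.completions.create` again so the model can compose its final answer from the tool results.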

Vision (GPT-4 Vision)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "What's in this image?"},
            {
                "type": "image_url",
                "image_url": {"url": "https://example.com/image.jpg"}
            }
        ]
    }]
)

Embeddings

response = client.embeddings.create(
    model="text-embedding-3-large",
    input="The quick brown fox jumps over the lazy dog"
)

embedding = response.data[0].embedding
print(f"Embedding dimension: {len(embedding)}")
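Embeddings are typically compared with cosine similarity, e.g. to rank documents for semantic search or RAG. A self-contained sketch, with toy vectors standing in for real embedding output:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors (1.0 = identical direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy vectors standing in for real embeddings
v1 = [1.0, 0.0, 1.0]
v2 = [1.0, 0.0, 0.0]
print(round(cosine_similarity(v1, v2), 4))  # 0.7071
```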

Image Generation (DALL-E)

response = client.images.generate(
    model="dall-e-3",
    prompt="A futuristic city with flying cars at sunset",
    size="1024x1024",
    quality="hd",
    n=1
)

image_url = response.data[0].url
print(f"Generated image: {image_url}")

Text-to-Speech

response = client.audio.speech.create(
    model="tts-1",
    voice="alloy",
    input="Hello! This is a text-to-speech example."
)

# Save the audio file
with open("output.mp3", "wb") as f:
    f.write(response.content)

Speech-to-Text (Whisper)

with open("audio.mp3", "rb") as audio_file:
    response = client.audio.transcriptions.create(
        model="whisper-1",
        file=audio_file,
        language="en"
    )
    
print(response.text)

Fallback Configuration

Use Anthropic as a fallback for OpenAI:
config = {
    "strategy": {"mode": "fallback"},
    "targets": [
        {
            "provider": "openai",
            "api_key": "sk-***",
            "override_params": {"model": "gpt-4o"}
        },
        {
            "provider": "anthropic",
            "api_key": "sk-ant-***",
            "override_params": {"model": "claude-3-5-sonnet-20241022"}
        }
    ]
}

client = Portkey().with_options(config=config)

response = client.chat.completions.create(
    messages=[{"role": "user", "content": "Hello!"}]
)

Load Balancing

Distribute requests between OpenAI and Azure OpenAI:
config = {
    "strategy": {"mode": "loadbalance"},
    "targets": [
        {
            "provider": "openai",
            "api_key": "sk-***",
            "weight": 0.5
        },
        {
            "provider": "azure-openai",
            "api_key": "***",
            "resource_name": "my-resource",
            "deployment_id": "gpt-4",
            "api_version": "2024-02-15-preview",
            "weight": 0.5
        }
    ]
}

client = Portkey().with_options(config=config)

Batch API

# Create a batch job
response = client.batches.create(
    input_file_id="file-abc123",
    endpoint="/v1/chat/completions",
    completion_window="24h"
)

batch_id = response.id

# Retrieve batch status
batch = client.batches.retrieve(batch_id)
print(f"Status: {batch.status}")
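The `input_file_id` refers to a JSONL file uploaded beforehand, where each line is one request in the Batch API's input format. A sketch of building that file — the `build_batch_lines` helper and file name are illustrative:

```python
import json

def build_batch_lines(prompts, model="gpt-4o-mini"):
    """Build the JSONL lines for a /v1/chat/completions batch input file."""
    lines = []
    for i, prompt in enumerate(prompts):
        lines.append(json.dumps({
            "custom_id": f"request-{i}",   # used to match results back to requests
            "method": "POST",
            "url": "/v1/chat/completions",
            "body": {
                "model": model,
                "messages": [{"role": "user", "content": prompt}],
            },
        }))
    return lines

lines = build_batch_lines(["Hello!", "What is 2 + 2?"])
with open("batch_input.jsonl", "w") as f:
    f.write("\n".join(lines))
```

The resulting file is then uploaded with purpose "batch" via the files endpoint to obtain the `input_file_id` used above.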

Error Handling

from portkey_ai.exceptions import (
    RateLimitError,
    APIError,
    AuthenticationError
)

try:
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Hello"}]
    )
except RateLimitError as e:
    print(f"Rate limit exceeded: {e}")
except AuthenticationError as e:
    print(f"Authentication failed: {e}")
except APIError as e:
    print(f"API error: {e}")

Request & Response Format

Request

{
  "model": "gpt-4o",
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"}
  ],
  "temperature": 0.7,
  "max_tokens": 150,
  "top_p": 1,
  "frequency_penalty": 0,
  "presence_penalty": 0
}

Response

{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "created": 1677652288,
  "model": "gpt-4o",
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "Hello! How can I assist you today?"
    },
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 20,
    "completion_tokens": 10,
    "total_tokens": 30
  }
}

Best Practices

  1. Use streaming for long responses to improve user experience
  2. Implement retry logic with exponential backoff for rate limits
  3. Cache embeddings to reduce costs and latency
  4. Use gpt-4o-mini for high-volume, simpler tasks
  5. Set max_tokens to control costs and response length
  6. Use system messages to guide model behavior consistently
  7. Implement fallbacks to other providers for reliability
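Practice 2 above can be sketched as a small wrapper. The `with_backoff` helper and the flaky example function are illustrative, not part of the SDK:

```python
import random
import time

def with_backoff(fn, max_retries=5, base_delay=1.0, retry_on=(Exception,)):
    """Retry fn with exponential backoff plus jitter on retryable errors."""
    for attempt in range(max_retries):
        try:
            return fn()
        except retry_on:
            if attempt == max_retries - 1:
                raise
            # Sleep base_delay * 2^attempt, plus jitter to avoid thundering herds
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, base_delay))

# Example: a flaky call that fails twice before succeeding
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("rate limited")
    return "ok"

print(with_backoff(flaky, base_delay=0.01))  # ok
```

In practice you would pass `retry_on=(RateLimitError,)` so that only retryable errors trigger the backoff loop.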

Pricing

For up-to-date pricing across all OpenAI models, see the OpenAI Pricing page.

Related Guides

  • Azure OpenAI: Use OpenAI models through Azure
  • Fallback Routing: Set up fallbacks from OpenAI
  • Caching: Cache OpenAI responses
  • Function Calling: Advanced function calling guide
