OpenAI Provider

The OpenAI provider gives you direct access to OpenAI’s models, including GPT-4o, GPT-4, the o1 and o3 reasoning models, and GPT-3.5-Turbo. Use this provider when you want to work directly with OpenAI’s API rather than through Azure OpenAI.

Provider Options

OpenAI integration offers multiple approaches:

Chat Client

Direct chat completions with streaming

Responses Client

Structured responses with tools

Assistants API

Persistent agents with code interpreter and file search

Installation

pip install agent-framework --pre
# OpenAI support is included in the core package

Authentication

OpenAI uses API keys for authentication:
from agent_framework.openai import OpenAIChatClient

# Set the OPENAI_API_KEY environment variable;
# the client automatically reads it from the environment
client = OpenAIChatClient()

.env
OPENAI_API_KEY=sk-proj-...
OPENAI_CHAT_MODEL_ID=gpt-4o

Explicit API Key

from agent_framework.openai import OpenAIChatClient

client = OpenAIChatClient(api_key="sk-proj-...")
Never commit API keys to source control. Always use environment variables or secure secret management.
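When relying on the environment variable, it helps to fail fast with a clear message if the key is missing. A minimal sketch (the helper name is illustrative, not part of the framework):

```python
import os

def require_api_key() -> str:
    """Fail fast with a clear error when OPENAI_API_KEY is unset."""
    key = os.environ.get("OPENAI_API_KEY")
    if not key:
        raise RuntimeError(
            "OPENAI_API_KEY is not set; export it or load it from a .env file."
        )
    return key
```

Calling this once at startup surfaces a misconfigured environment immediately, rather than as an authentication error on the first request.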

OpenAI Chat Client

The Chat Client provides direct access to OpenAI chat completions.

Basic Usage

import asyncio
from agent_framework.openai import OpenAIChatClient

async def main():
    # Create agent with OpenAI Chat Client
    agent = OpenAIChatClient().as_agent(
        name="Assistant",
        instructions="You are a helpful assistant.",
    )
    
    # Non-streaming response
    result = await agent.run("What is the capital of France?")
    print(result)

asyncio.run(main())

Configuration

OPENAI_API_KEY=sk-proj-...
OPENAI_CHAT_MODEL_ID=gpt-4o
OPENAI_ORGANIZATION=org-...
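If you prefer to validate configuration at startup rather than letting the client read the environment implicitly, the variables above can be collected explicitly. A sketch (the helper name and default are illustrative, not framework API):

```python
import os

def load_openai_config() -> dict:
    """Collect the OpenAI settings above from the environment."""
    return {
        "api_key": os.environ.get("OPENAI_API_KEY"),
        "model_id": os.environ.get("OPENAI_CHAT_MODEL_ID", "gpt-4o"),
        "organization": os.environ.get("OPENAI_ORGANIZATION"),
    }
```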

Streaming Responses

import asyncio
from agent_framework.openai import OpenAIChatClient

async def main():
    agent = OpenAIChatClient().as_agent(
        instructions="You are a helpful assistant.",
    )
    
    # Streaming response
    query = "Write a short story about AI agents."
    print("Agent: ", end="", flush=True)
    async for chunk in agent.run(query, stream=True):
        if chunk.text:
            print(chunk.text, end="", flush=True)
    print()

asyncio.run(main())

Function Calling

import asyncio
from typing import Annotated
from agent_framework import tool
from agent_framework.openai import OpenAIChatClient

@tool(approval_mode="never_require")  # Use "always_require" in production
def get_weather(location: Annotated[str, "City name"]) -> str:
    """Get the weather for a location."""
    return f"Weather in {location}: Sunny, 72°F"

async def main():
    agent = OpenAIChatClient().as_agent(
        name="WeatherAgent",
        instructions="You are a weather assistant.",
        tools=[get_weather],
    )
    
    result = await agent.run("What's the weather in Seattle?")
    print(result)

asyncio.run(main())

OpenAI Responses Client

The Responses Client provides structured response generation:
import asyncio
from agent_framework.openai import OpenAIResponsesClient

async def main():
    agent = OpenAIResponsesClient().as_agent(
        instructions="You are a helpful assistant.",
    )
    
    result = await agent.run("Explain quantum computing in simple terms.")
    print(result)

asyncio.run(main())

OpenAI Assistants API

The Assistants API provides persistent agents with managed state, code interpreter, and file search capabilities.

Creating an Assistant

import asyncio
import os
from agent_framework.openai import OpenAIAssistantProvider
from openai import AsyncOpenAI

async def main():
    client = AsyncOpenAI()
    provider = OpenAIAssistantProvider(client)
    
    # Create a new assistant
    agent = await provider.create_agent(
        name="MathTutor",
        model=os.environ.get("OPENAI_CHAT_MODEL_ID", "gpt-4o"),
        instructions="You are a helpful math tutor.",
        tools=[...],
    )
    
    try:
        result = await agent.run("What is 25 * 17?")
        print(result)
    finally:
        # Clean up the assistant
        await client.beta.assistants.delete(agent.id)

asyncio.run(main())

Using an Existing Assistant

import asyncio
from agent_framework.openai import OpenAIAssistantProvider
from openai import AsyncOpenAI

async def main():
    client = AsyncOpenAI()
    provider = OpenAIAssistantProvider(client)
    
    # Get existing assistant by ID
    agent = await provider.get_agent(assistant_id="asst_...")
    
    result = await agent.run("Hello!")
    print(result)

asyncio.run(main())

Code Interpreter

Enable code interpreter for Python code execution:
from agent_framework.openai import code_interpreter_tool

agent = await provider.create_agent(
    name="DataAnalyst",
    model="gpt-4o",
    instructions="You are a data analyst.",
    tools=[code_interpreter_tool()],
)
File Search

Enable file search for RAG capabilities:
from agent_framework.openai import file_search_tool

agent = await provider.create_agent(
    name="DocumentAssistant",
    model="gpt-4o",
    instructions="You help with documents.",
    tools=[file_search_tool()],
)

Available Models

OpenAI offers several model families:
Model           Context Window   Best For                              Function Calling
gpt-4o          128k             Latest flagship, multimodal, vision   Yes
gpt-4o-mini     128k             Fast, cost-effective, intelligent     Yes
gpt-4-turbo     128k             High performance, vision              Yes
gpt-4           8k               Complex reasoning                     Yes
gpt-3.5-turbo   16k              Fast, economical                      Yes
o1              200k             Advanced reasoning                    No
o3-mini         200k             Cost-effective reasoning              No
Reasoning models (o1, o3) do not support function calling or streaming. They are optimized for complex reasoning tasks.

Reasoning Models

OpenAI’s o1 and o3 models provide advanced reasoning capabilities:
import asyncio
from agent_framework.openai import OpenAIResponsesClient

async def main():
    agent = OpenAIResponsesClient(
        model_id="o3-mini"  # or "o1-preview", "o1-mini"
    ).as_agent(
        instructions="You are a reasoning assistant.",
    )
    
    result = await agent.run(
        "Solve this logic puzzle: Three people are in a room. "
        "Alice says Bob is lying. Bob says Charlie is lying. "
        "Charlie says both Alice and Bob are lying. Who is telling the truth?"
    )
    print(result)

asyncio.run(main())
Reasoning models do not support streaming or function calling. They are optimized for complex reasoning tasks that require extended thinking.

Vision Capabilities

GPT-4o and GPT-4-turbo support image inputs:
import asyncio
from agent_framework import Message
from agent_framework.openai import OpenAIChatClient

async def main():
    agent = OpenAIChatClient(model_id="gpt-4o").as_agent(
        instructions="You analyze images.",
    )
    
    message = Message(
        role="user",
        text="What's in this image?",
        images=["https://example.com/image.jpg"],
    )
    
    result = await agent.run(message)
    print(result)

asyncio.run(main())

Structured Outputs

Force the model to return structured JSON outputs:
import asyncio
from pydantic import BaseModel
from agent_framework.openai import OpenAIChatClient

class WeatherResponse(BaseModel):
    location: str
    temperature: float
    condition: str
    forecast: str

async def main():
    agent = OpenAIChatClient().as_agent(
        instructions="You are a weather assistant.",
        response_format=WeatherResponse,
    )
    
    result = await agent.run("What's the weather in Paris?")
    # result is parsed as WeatherResponse
    print(f"Location: {result.location}")
    print(f"Temperature: {result.temperature}°F")
    print(f"Condition: {result.condition}")

asyncio.run(main())

Best Practices

Never hardcode API keys in your source code:
  1. Use environment variables for API keys
  2. Use .env files locally (add to .gitignore)
  3. Use Azure Key Vault or similar for production
  4. Rotate API keys regularly
  5. Use project-scoped keys when possible
Choose models based on workload:
  • Use gpt-4o-mini for development and testing
  • Use gpt-4o for production workloads requiring maximum quality
  • Use gpt-3.5-turbo for simple tasks where cost is critical
  • Use o1/o3 for complex reasoning tasks that don’t need function calling
OpenAI has rate limits based on your tier:
  1. Implement exponential backoff retry logic
  2. Monitor usage in the OpenAI dashboard
  3. Consider upgrading your tier for higher limits
  4. Use batch processing for high-volume workloads
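Step 1 above can be sketched as a small retry wrapper with exponential backoff and jitter (illustrative only; production code may prefer a library such as tenacity, and should catch rate-limit errors specifically rather than all exceptions):

```python
import asyncio
import random

async def with_backoff(call, max_retries=5, base_delay=1.0):
    """Retry an async callable, doubling the delay after each failure."""
    for attempt in range(max_retries):
        try:
            return await call()
        except Exception:
            # Re-raise once retries are exhausted
            if attempt == max_retries - 1:
                raise
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.25)
            await asyncio.sleep(delay)

# Usage: result = await with_backoff(lambda: agent.run(query))
```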
Tokens directly impact cost:
  1. Keep system prompts concise but clear
  2. Use smaller models when appropriate
  3. Implement conversation pruning for long sessions
  4. Monitor token usage in responses
  5. Use structured outputs to reduce verbose responses
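Conversation pruning (step 3) can be as simple as keeping the system prompt plus the most recent turns. A sketch, assuming messages are role/content dicts:

```python
def prune_history(messages, keep_last=10):
    """Keep the system prompt plus the most recent messages to bound token usage."""
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    return system + rest[-keep_last:]
```

Real deployments may prune by token count rather than message count, or summarize older turns instead of dropping them.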
When using the Assistants API:
  1. Delete assistants when no longer needed
  2. Clean up uploaded files
  3. Delete old threads to manage costs
  4. Use try-finally blocks to ensure cleanup

Troubleshooting

If you see API key errors:
  1. Verify the API key is correct and active
  2. Check if the key has expired
  3. Ensure the key has proper permissions
  4. Verify organization ID if using organization keys
If you’re hitting rate limits:
  1. Check your usage tier at platform.openai.com
  2. Implement exponential backoff
  3. Reduce request frequency
  4. Consider upgrading to a higher tier
  5. Use batch API for non-urgent requests
If the model is not available:
  1. Check if you have access to the model
  2. Verify the model name is correct
  3. Some models require waitlist access
  4. Check if your account tier supports the model
If function calling isn’t working:
  1. Verify the model supports function calling (o1/o3 don’t)
  2. Check function schema is valid JSON Schema
  3. Ensure function descriptions are clear
  4. Verify parameter types match the schema

Cost Optimization

OpenAI pricing varies by model. Here are tips to optimize costs:
  1. Use appropriate models: Don’t use GPT-4o when GPT-4o-mini or GPT-3.5-turbo will suffice
  2. Monitor token usage: Track input and output tokens to identify optimization opportunities
  3. Optimize prompts: Shorter, clearer prompts use fewer tokens
  4. Cache common responses: Cache responses for frequently asked questions
  5. Use batch API: For non-time-sensitive workloads, batch API is cheaper
Check current pricing at openai.com/pricing
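Tip 4 (caching) can be sketched with an in-memory dict keyed on the normalized prompt; a real deployment would likely use a shared cache with a TTL:

```python
import hashlib

_cache: dict[str, str] = {}

def cached_answer(prompt, generate):
    """Return a cached answer for repeated prompts; call `generate` on a miss."""
    key = hashlib.sha256(prompt.strip().lower().encode()).hexdigest()
    if key not in _cache:
        _cache[key] = generate(prompt)
    return _cache[key]
```

Note that normalizing the prompt (here, stripping whitespace and lowercasing) determines what counts as a repeat; stricter or looser keys trade hit rate against answer freshness.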

Next Steps

Function Tools

Add function calling capabilities to your agents

Sessions & Memory

Manage multi-turn conversations with memory

Workflows

Build multi-agent workflows