OpenAI Provider

The OpenAI provider gives you direct access to OpenAI’s models, including GPT-4o, GPT-4, the o1 and o3 reasoning models, and GPT-3.5-Turbo. Use this provider when you want to work directly with OpenAI’s API rather than through Azure OpenAI.

Provider Options

OpenAI integration offers multiple approaches:

Chat Client

Direct chat completions with streaming

Responses Client

Structured responses with tools

Assistants API

Persistent agents with code interpreter and file search

Installation

pip install agent-framework --pre
# OpenAI support is included in the core package

Authentication

OpenAI uses API keys for authentication:
from agent_framework.openai import OpenAIChatClient

# Set the OPENAI_API_KEY environment variable;
# the client automatically reads it from the environment
client = OpenAIChatClient()

.env
OPENAI_API_KEY=sk-proj-...
OPENAI_CHAT_MODEL_ID=gpt-4o

Explicit API Key

from agent_framework.openai import OpenAIChatClient

client = OpenAIChatClient(api_key="sk-proj-...")
Never commit API keys to source control. Always use environment variables or secure secret management.
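When relying on the environment variable, it helps to fail fast with a clear message if the key is missing. A minimal sketch (the helper name is illustrative, not part of the framework):

```python
import os

def require_api_key() -> str:
    """Fail fast with a clear error when OPENAI_API_KEY is unset."""
    key = os.environ.get("OPENAI_API_KEY")
    if not key:
        raise RuntimeError(
            "OPENAI_API_KEY is not set; export it or load it from a .env file."
        )
    return key
```

Calling this once at startup surfaces a misconfigured environment immediately, rather than as an authentication error on the first request.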

OpenAI Chat Client

The Chat Client provides direct access to OpenAI chat completions.

Basic Usage

import asyncio
from agent_framework.openai import OpenAIChatClient

async def main():
    # Create agent with OpenAI Chat Client
    agent = OpenAIChatClient().as_agent(
        name="Assistant",
        instructions="You are a helpful assistant.",
    )
    
    # Non-streaming response
    result = await agent.run("What is the capital of France?")
    print(result)

asyncio.run(main())

Configuration

OPENAI_API_KEY=sk-proj-...
OPENAI_CHAT_MODEL_ID=gpt-4o
OPENAI_ORGANIZATION=org-...
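If you prefer to validate configuration at startup rather than letting the client read the environment implicitly, the variables above can be collected explicitly. A sketch (the helper name and default are illustrative, not framework API):

```python
import os

def load_openai_config() -> dict:
    """Collect the OpenAI settings above from the environment."""
    return {
        "api_key": os.environ.get("OPENAI_API_KEY"),
        "model_id": os.environ.get("OPENAI_CHAT_MODEL_ID", "gpt-4o"),
        "organization": os.environ.get("OPENAI_ORGANIZATION"),
    }
```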

Streaming Responses

import asyncio
from agent_framework.openai import OpenAIChatClient

async def main():
    agent = OpenAIChatClient().as_agent(
        instructions="You are a helpful assistant.",
    )
    
    # Streaming response
    query = "Write a short story about AI agents."
    print("Agent: ", end="", flush=True)
    async for chunk in agent.run(query, stream=True):
        if chunk.text:
            print(chunk.text, end="", flush=True)
    print()

asyncio.run(main())

Function Calling

import asyncio
from typing import Annotated
from agent_framework import tool
from agent_framework.openai import OpenAIChatClient

@tool(approval_mode="never_require")  # Use "always_require" in production
def get_weather(location: Annotated[str, "City name"]) -> str:
    """Get the weather for a location."""
    return f"Weather in {location}: Sunny, 72°F"

async def main():
    agent = OpenAIChatClient().as_agent(
        name="WeatherAgent",
        instructions="You are a weather assistant.",
        tools=[get_weather],
    )
    
    result = await agent.run("What's the weather in Seattle?")
    print(result)

asyncio.run(main())

OpenAI Responses Client

The Responses Client provides structured response generation:
import asyncio
from agent_framework.openai import OpenAIResponsesClient

async def main():
    agent = OpenAIResponsesClient().as_agent(
        instructions="You are a helpful assistant.",
    )
    
    result = await agent.run("Explain quantum computing in simple terms.")
    print(result)

asyncio.run(main())

OpenAI Assistants API

The Assistants API provides persistent agents with managed state, code interpreter, and file search capabilities.

Creating an Assistant

import asyncio
import os
from agent_framework.openai import OpenAIAssistantProvider
from openai import AsyncOpenAI

async def main():
    client = AsyncOpenAI()
    provider = OpenAIAssistantProvider(client)
    
    # Create a new assistant
    agent = await provider.create_agent(
        name="MathTutor",
        model=os.environ.get("OPENAI_CHAT_MODEL_ID", "gpt-4o"),
        instructions="You are a helpful math tutor.",
        tools=[...],
    )
    
    try:
        result = await agent.run("What is 25 * 17?")
        print(result)
    finally:
        # Clean up the assistant
        await client.beta.assistants.delete(agent.id)

asyncio.run(main())

Using an Existing Assistant

import asyncio
from agent_framework.openai import OpenAIAssistantProvider
from openai import AsyncOpenAI

async def main():
    client = AsyncOpenAI()
    provider = OpenAIAssistantProvider(client)
    
    # Get existing assistant by ID
    agent = await provider.get_agent(assistant_id="asst_...")
    
    result = await agent.run("Hello!")
    print(result)

asyncio.run(main())

Code Interpreter

Enable code interpreter for Python code execution:
from agent_framework.openai import code_interpreter_tool

agent = await provider.create_agent(
    name="DataAnalyst",
    model="gpt-4o",
    instructions="You are a data analyst.",
    tools=[code_interpreter_tool()],
)
File Search

Enable file search for RAG capabilities:
from agent_framework.openai import file_search_tool

agent = await provider.create_agent(
    name="DocumentAssistant",
    model="gpt-4o",
    instructions="You help with documents.",
    tools=[file_search_tool()],
)

Available Models

OpenAI offers several model families:
Model           Context Window   Best For                              Function Calling
gpt-4o          128k             Latest flagship, multimodal, vision   Yes
gpt-4o-mini     128k             Fast, cost-effective, intelligent     Yes
gpt-4-turbo     128k             High performance, vision              Yes
gpt-4           8k               Complex reasoning                     Yes
gpt-3.5-turbo   16k              Fast, economical                      Yes
o1              200k             Advanced reasoning                    No
o3-mini         200k             Cost-effective reasoning              No
Reasoning models (o1, o3) do not support function calling or streaming. They are optimized for complex reasoning tasks.

Reasoning Models

OpenAI’s o1 and o3 models provide advanced reasoning capabilities:
import asyncio
from agent_framework.openai import OpenAIResponsesClient

async def main():
    agent = OpenAIResponsesClient(
        model_id="o3-mini"  # or "o1-preview", "o1-mini"
    ).as_agent(
        instructions="You are a reasoning assistant.",
    )
    
    result = await agent.run(
        "Solve this logic puzzle: Three people are in a room. "
        "Alice says Bob is lying. Bob says Charlie is lying. "
        "Charlie says both Alice and Bob are lying. Who is telling the truth?"
    )
    print(result)

asyncio.run(main())
Reasoning models do not support streaming or function calling. They are optimized for complex reasoning tasks that require extended thinking.

Vision Capabilities

GPT-4o and GPT-4-turbo support image inputs:
import asyncio
from agent_framework import Message
from agent_framework.openai import OpenAIChatClient

async def main():
    agent = OpenAIChatClient(model_id="gpt-4o").as_agent(
        instructions="You analyze images.",
    )
    
    message = Message(
        role="user",
        text="What's in this image?",
        images=["https://example.com/image.jpg"],
    )
    
    result = await agent.run(message)
    print(result)

asyncio.run(main())

Structured Outputs

Force the model to return structured JSON outputs:
import asyncio
from pydantic import BaseModel
from agent_framework.openai import OpenAIChatClient

class WeatherResponse(BaseModel):
    location: str
    temperature: float
    condition: str
    forecast: str

async def main():
    agent = OpenAIChatClient().as_agent(
        instructions="You are a weather assistant.",
        response_format=WeatherResponse,
    )
    
    result = await agent.run("What's the weather in Paris?")
    # result is parsed as WeatherResponse
    print(f"Location: {result.location}")
    print(f"Temperature: {result.temperature}°F")
    print(f"Condition: {result.condition}")

asyncio.run(main())

Best Practices

Never hardcode API keys in your source code:
  1. Use environment variables for API keys
  2. Use .env files locally (add to .gitignore)
  3. Use Azure Key Vault or similar for production
  4. Rotate API keys regularly
  5. Use project-scoped keys when possible
Choose models based on workload:
  • Use gpt-4o-mini for development and testing
  • Use gpt-4o for production workloads requiring maximum quality
  • Use gpt-3.5-turbo for simple tasks where cost is critical
  • Use o1/o3 for complex reasoning tasks that don’t need function calling
OpenAI has rate limits based on your tier:
  1. Implement exponential backoff retry logic
  2. Monitor usage in the OpenAI dashboard
  3. Consider upgrading your tier for higher limits
  4. Use batch processing for high-volume workloads
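Step 1 above can be sketched as a small retry wrapper with exponential backoff and jitter (illustrative only; production code may prefer a library such as tenacity, and should catch rate-limit errors specifically rather than all exceptions):

```python
import asyncio
import random

async def with_backoff(call, max_retries=5, base_delay=1.0):
    """Retry an async callable, doubling the delay after each failure."""
    for attempt in range(max_retries):
        try:
            return await call()
        except Exception:
            # Re-raise once retries are exhausted
            if attempt == max_retries - 1:
                raise
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.25)
            await asyncio.sleep(delay)

# Usage: result = await with_backoff(lambda: agent.run(query))
```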
Tokens directly impact cost:
  1. Keep system prompts concise but clear
  2. Use smaller models when appropriate
  3. Implement conversation pruning for long sessions
  4. Monitor token usage in responses
  5. Use structured outputs to reduce verbose responses
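Conversation pruning (step 3) can be as simple as keeping the system prompt plus the most recent turns. A sketch, assuming messages are role/content dicts:

```python
def prune_history(messages, keep_last=10):
    """Keep the system prompt plus the most recent messages to bound token usage."""
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    return system + rest[-keep_last:]
```

Real deployments may prune by token count rather than message count, or summarize older turns instead of dropping them.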
When using the Assistants API:
  1. Delete assistants when no longer needed
  2. Clean up uploaded files
  3. Delete old threads to manage costs
  4. Use try-finally blocks to ensure cleanup

Troubleshooting

If you see API key errors:
  1. Verify the API key is correct and active
  2. Check if the key has expired
  3. Ensure the key has proper permissions
  4. Verify organization ID if using organization keys
If you’re hitting rate limits:
  1. Check your usage tier at platform.openai.com
  2. Implement exponential backoff
  3. Reduce request frequency
  4. Consider upgrading to a higher tier
  5. Use batch API for non-urgent requests
If the model is not available:
  1. Check if you have access to the model
  2. Verify the model name is correct
  3. Some models require waitlist access
  4. Check if your account tier supports the model
If function calling isn’t working:
  1. Verify the model supports function calling (o1/o3 don’t)
  2. Check function schema is valid JSON Schema
  3. Ensure function descriptions are clear
  4. Verify parameter types match the schema

Cost Optimization

OpenAI pricing varies by model. Here are tips to optimize costs:
  1. Use appropriate models: Don’t use GPT-4o when GPT-4o-mini or GPT-3.5-turbo will suffice
  2. Monitor token usage: Track input and output tokens to identify optimization opportunities
  3. Optimize prompts: Shorter, clearer prompts use fewer tokens
  4. Cache common responses: Cache responses for frequently asked questions
  5. Use batch API: For non-time-sensitive workloads, batch API is cheaper
Check current pricing at openai.com/pricing
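Tip 4 (caching) can be sketched with an in-memory dict keyed on the normalized prompt; a real deployment would likely use a shared cache with a TTL:

```python
import hashlib

_cache: dict[str, str] = {}

def cached_answer(prompt, generate):
    """Return a cached answer for repeated prompts; call `generate` on a miss."""
    key = hashlib.sha256(prompt.strip().lower().encode()).hexdigest()
    if key not in _cache:
        _cache[key] = generate(prompt)
    return _cache[key]
```

Note that normalizing the prompt (here, stripping whitespace and lowercasing) determines what counts as a repeat; stricter or looser keys trade hit rate against answer freshness.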

Next Steps

Function Tools

Add function calling capabilities to your agents

Sessions & Memory

Manage multi-turn conversations with memory

Workflows

Build multi-agent workflows