# OpenAIClient

The primary client for OpenAI's GPT models, with support for structured outputs via the `responses.parse` API.

## Installation
## Basic Usage
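The original usage snippet was not preserved, so here is a minimal sketch. The import path, the `LLMConfig` name, and the `generate_response` method are assumptions; substitute the names your installation actually exports.

```python
# Minimal usage sketch. The import path, LLMConfig, and generate_response
# are assumed names -- substitute the ones your package actually exports.
import asyncio

async def main() -> None:
    from my_llm_package import LLMConfig, OpenAIClient  # assumed import path

    config = LLMConfig(api_key="sk-...", model="gpt-4.1-mini")
    client = OpenAIClient(
        config=config,
        max_tokens=16384,     # the documented default
        reasoning="minimal",  # honored only by reasoning models (GPT-5, o1, o3)
        verbosity="low",      # likewise reasoning-model only
    )
    response = await client.generate_response(...)  # assumed method name

# To run: asyncio.run(main())
```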
## Constructor
- Configuration object. If `None`, a default config is created.
- Enable response caching. Not currently implemented; raises `NotImplementedError` if `True`.
- Optional pre-configured `AsyncOpenAI` client instance. If not provided, one is created from the config.
- Maximum output tokens. Defaults to 16384 for compatibility.
- Reasoning effort level for reasoning models (GPT-5, o1, o3). Options: `'minimal'`, `'low'`, `'medium'`, `'high'`.
- Verbosity level for reasoning models. Options: `'low'`, `'medium'`, `'high'`.

## Supported Models
Reasoning models (via the `responses.parse` API):

- `gpt-5-*` series
- `o1-*` series
- `o3-*` series

Standard models:

- `gpt-4.1-mini` (recommended)
- `gpt-4.1-nano`
- `gpt-4o`
- `gpt-4-turbo`
- All other GPT models
Reasoning models (GPT-5, o1, o3) do not support temperature settings; the client automatically omits `temperature` for these models.
## Reasoning Model Configuration

For GPT-5 and o-series models, configure reasoning depth with the `reasoning` and `verbosity` constructor arguments (for example, `reasoning='minimal'` for the fastest responses).

## Custom Base URL
Use OpenAI-compatible endpoints by giving the client a custom base URL in its configuration.

## Response Format
The client uses different APIs depending on model capabilities; reasoning models are served through the `responses.parse` API.

# OpenAIGenericClient
A simplified OpenAI client designed for local and third-party OpenAI-compatible models. It does not support caching or the `responses.parse` API.

## When to Use
- Local models (e.g., Ollama, LM Studio)
- Third-party OpenAI-compatible APIs
- Models with higher token limits
- Simpler integration requirements
## Basic Usage
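The original snippet was not preserved; here is a sketch of building the client for a third-party OpenAI-compatible API. The package import path, `LLMConfig` fields, and endpoint URL are all assumptions.

```python
# Sketch: building an OpenAIGenericClient for a third-party OpenAI-compatible
# API. The import path and LLMConfig fields are assumed names.
def build_client():
    from my_llm_package import LLMConfig, OpenAIGenericClient  # assumed names

    config = LLMConfig(
        api_key="YOUR_KEY",
        model="provider-model-name",            # whatever the provider serves
        base_url="https://api.example.com/v1",  # any OpenAI-compatible endpoint
    )
    return OpenAIGenericClient(config=config, max_tokens=16384)
```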
## Constructor
- Configuration object. If `None`, a default config is created.
- Caching is not supported; raises `NotImplementedError` if `True`.
- Optional pre-configured `AsyncOpenAI` client instance.
- Maximum output tokens. Default increased to 16384 for better local model compatibility.
## Key Differences from OpenAIClient
| Feature | OpenAIClient | OpenAIGenericClient |
|---|---|---|
| Caching | Not implemented (raises `NotImplementedError`) | Not supported |
| responses.parse API | Yes (reasoning models) | No |
| Structured outputs | Via responses.parse | Via json_schema |
| Max retries | 2 (configurable) | 2 (fixed) |
| Default max_tokens | 16384 | 16384 |
| Reasoning/verbosity | Yes | No |
## Structured Output Handling
Uses `json_schema` in the response format:
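To make the mechanism concrete, here is a hand-built `json_schema` payload of the kind sent to OpenAI-compatible chat completions endpoints. The "person" schema is an arbitrary example, not part of the client:

```python
# A response_format payload using json_schema, as accepted by OpenAI-compatible
# chat completions endpoints. The "person" schema is an arbitrary example.
schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"},
    },
    "required": ["name", "age"],
    "additionalProperties": False,
}

response_format = {
    "type": "json_schema",
    "json_schema": {
        "name": "person",
        "strict": True,   # ask the server to enforce the schema exactly
        "schema": schema,
    },
}
# Passed as response_format=... on the chat completions request.
```

Servers that honor `"strict": True` guarantee the reply parses against the schema, which is what makes structured outputs possible without `responses.parse`.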
## Error Handling
Implements custom retry logic:

- Max 2 retries on validation/parsing errors
- No retry for rate limits or refusals
- Automatic retry for OpenAI client errors (timeout, connection, server errors)
- Appends error context to messages for model self-correction
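The retry rules above can be sketched as a loop. The exception classes and the `call` function below are stand-ins, not the client's real types, and the automatic retry of timeout/connection/server errors is omitted for brevity:

```python
# Simplified sketch of the retry policy: up to 2 retries on validation/parsing
# errors, never on rate limits, appending the error to the conversation so the
# model can self-correct. Exception classes are stand-ins for the real ones.
import asyncio

MAX_RETRIES = 2

class RateLimitError(Exception): ...
class ValidationError(Exception): ...

async def generate_with_retries(call, messages):
    retries = 0
    while True:
        try:
            return await call(messages)
        except RateLimitError:
            raise  # rate limits and refusals are never retried
        except ValidationError as err:
            retries += 1
            if retries > MAX_RETRIES:
                raise
            # Append the error so the model can self-correct on the next attempt.
            messages = messages + [
                {"role": "user", "content": f"Previous response error: {err}"}
            ]
```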
## Example: Local Model
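The original example code was not preserved; below is a sketch targeting a local Ollama server. The import path, `LLMConfig` fields, and model tag are assumptions (Ollama's OpenAI-compatible endpoint is served at `/v1` by default).

```python
# Sketch of using OpenAIGenericClient against a local Ollama server.
# Import path and config fields are assumed names.
import asyncio

async def main() -> None:
    from my_llm_package import LLMConfig, OpenAIGenericClient  # assumed names

    config = LLMConfig(
        api_key="ollama",                       # ignored by Ollama, but required
        model="llama3.1:8b",                    # any model pulled locally
        base_url="http://localhost:11434/v1",   # Ollama's OpenAI-compatible path
    )
    client = OpenAIGenericClient(config=config)
    response = await client.generate_response(...)  # assumed method name

# To run: asyncio.run(main())
```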
## Compatibility Notes
- Works with any OpenAI-compatible API
- Does not use provider-specific features
- JSON schema support required for structured outputs
- `temperature` and `max_tokens` are always included in requests