Overview
OpenAI provides vision-capable models that can process documents and images. Zerox supports all OpenAI GPT-4 vision models for OCR and data extraction tasks.
Credentials
To use OpenAI models, you need to provide your API key:
- `apiKey`: Your OpenAI API key. You can obtain it from the OpenAI API dashboard.
Environment Variable
You can store your API key as an environment variable.
Supported Models
The following OpenAI models are available through Zerox:

| Model | Model ID | Description |
|---|---|---|
| GPT-4.1 | gpt-4.1 | Latest GPT-4.1 model with vision capabilities |
| GPT-4.1 Mini | gpt-4.1-mini | Smaller, faster GPT-4.1 model |
| GPT-4o | gpt-4o | Optimized GPT-4 model with vision |
| GPT-4o Mini | gpt-4o-mini | Smaller, cost-effective GPT-4o model |
Configuration
Basic Example
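A minimal sketch of a basic setup, assuming the Node package exports a `zerox` function that accepts `filePath`, `credentials`, `model`, and `modelProvider` options. The option names, the `"OPENAI"` provider value, the file path, and the `buildBasicOptions` helper are assumptions for illustration, so check them against the package's type definitions. The Zerox call itself is left as a comment so the block runs without the package installed.

```typescript
// Local stand-in types; the real option types come from the zerox package.
interface OpenAICredentials {
  apiKey: string;
}

interface BasicOptions {
  filePath: string;
  credentials: OpenAICredentials;
  model: string;
  modelProvider: string;
}

// Hypothetical helper: build the options for a basic OpenAI-backed run.
function buildBasicOptions(filePath: string, apiKey: string): BasicOptions {
  return {
    filePath,
    credentials: { apiKey },
    model: "gpt-4o", // one of the Model IDs listed under Supported Models
    modelProvider: "OPENAI", // assumed provider identifier
  };
}

const basicOptions = buildBasicOptions("./invoice.pdf", "sk-your-key");

// With the package installed, the call would look roughly like:
//   import { zerox } from "zerox";
//   const result = await zerox(basicOptions);
```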
With Environment Variable
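A sketch of the same setup with the key read from the environment. The variable name `OPENAI_API_KEY` and the helper `optionsFromEnv` are assumptions for illustration, not part of Zerox. The environment is passed in as an argument so the helper is easy to test.

```typescript
type Env = Record<string, string | undefined>;

interface EnvOptions {
  filePath: string;
  credentials: { apiKey: string };
  model: string;
}

// Hypothetical helper: read the key from the environment and fail fast if it is missing.
function optionsFromEnv(filePath: string, env: Env): EnvOptions {
  const apiKey = env["OPENAI_API_KEY"]; // assumed variable name
  if (!apiKey) {
    throw new Error("OPENAI_API_KEY is not set");
  }
  // modelProvider is omitted, relying on OpenAI being the default provider.
  return { filePath, credentials: { apiKey }, model: "gpt-4o-mini" };
}

// In Node, this would typically be called as:
//   const options = optionsFromEnv("./invoice.pdf", process.env);
```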
When using environment variables, you can omit the `modelProvider` parameter. OpenAI is the default provider.
LLM Parameters
OpenAI models support the following optional parameters:
- `temperature`: Controls randomness in the output. Values range from 0 to 2. Lower values make output more focused and deterministic.
- `maxTokens`: Maximum number of tokens to generate in the completion.
- `topP`: Nucleus sampling parameter, an alternative to temperature sampling. Values range from 0 to 1.
- `frequencyPenalty`: Number between -2.0 and 2.0. Positive values penalize tokens based on their frequency in the text so far.
- `presencePenalty`: Number between -2.0 and 2.0. Positive values penalize tokens based on whether they appear in the text so far.
- `logprobs`: Whether to return log probabilities of the output tokens. Useful for confidence scoring.
Example with Parameters
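A sketch of a parameters object with range checks taken from the value ranges documented under LLM Parameters. The camelCase field names, the `llmParams` name, and the `validateLLMParams` helper are assumptions for illustration; Zerox may not perform this validation itself.

```typescript
interface LLMParams {
  temperature?: number; // 0 to 2
  maxTokens?: number; // positive integer
  topP?: number; // 0 to 1
  frequencyPenalty?: number; // -2.0 to 2.0
  presencePenalty?: number; // -2.0 to 2.0
  logprobs?: boolean;
}

// Hypothetical helper: return a list of range violations (empty when valid).
function validateLLMParams(p: LLMParams): string[] {
  const errors: string[] = [];
  const check = (name: string, v: number | undefined, min: number, max: number) => {
    if (v !== undefined && (v < min || v > max)) {
      errors.push(`${name} must be between ${min} and ${max}`);
    }
  };
  check("temperature", p.temperature, 0, 2);
  check("topP", p.topP, 0, 1);
  check("frequencyPenalty", p.frequencyPenalty, -2, 2);
  check("presencePenalty", p.presencePenalty, -2, 2);
  if (p.maxTokens !== undefined && (!Number.isInteger(p.maxTokens) || p.maxTokens < 1)) {
    errors.push("maxTokens must be a positive integer");
  }
  return errors;
}

// Deterministic settings with log probabilities enabled.
const llmParams: LLMParams = {
  temperature: 0,
  maxTokens: 4096,
  topP: 1,
  logprobs: true,
};
```

Validating before the request keeps an out-of-range value from costing a round trip to the API.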
Data Extraction
OpenAI models support structured data extraction using JSON schemas.
Error Handling
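A sketch of wrapping a call in a retry loop with try/catch. Zerox's specific error types are not described on this page, so the example treats every failure as a generic error. `withRetry` and its parameters are hypothetical names, and the commented usage assumes a `zerox` function and an `options` object.

```typescript
// Hypothetical helper: retry an async operation a fixed number of times.
async function withRetry<T>(
  operation: () => Promise<T>,
  maxAttempts = 3,
  delayMs = 500,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      if (attempt < maxAttempts) {
        // Linear backoff before the next attempt.
        await new Promise((resolve) => setTimeout(resolve, delayMs * attempt));
      }
    }
  }
  const message = lastError instanceof Error ? lastError.message : String(lastError);
  throw new Error(`Failed after ${maxAttempts} attempts: ${message}`);
}

// Usage, assuming zerox is imported and options are already built:
//   try {
//     const result = await withRetry(() => zerox(options));
//   } catch (error) {
//     console.error("OCR failed:", error);
//   }
```

In practice you would likely retry only transient failures such as rate limits or timeouts, and surface authentication errors immediately.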
Best Practices
- Use `gpt-4o-mini` for cost-effective processing of simple documents
- Use `gpt-4o` or `gpt-4.1` for complex layouts, tables, and detailed extraction
- Set `temperature: 0` for deterministic output when consistency is critical
- Enable `logprobs: true` to access confidence scores for extraction quality assessment
- Monitor token usage through the returned `inputTokens` and `outputTokens` fields
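The model-selection and token-monitoring guidance above can be sketched as two small helpers. `pickModel`, `totalTokens`, and the `"simple"`/`"complex"` labels are hypothetical; only the model IDs and the `inputTokens`/`outputTokens` field names come from this page.

```typescript
type Complexity = "simple" | "complex";

// Hypothetical helper: choose a model ID following the guidance above.
function pickModel(complexity: Complexity): string {
  return complexity === "simple" ? "gpt-4o-mini" : "gpt-4o";
}

interface Usage {
  inputTokens: number;
  outputTokens: number;
}

// Sum token usage across several results to track overall consumption.
function totalTokens(results: Usage[]): Usage {
  return results.reduce(
    (acc, r) => ({
      inputTokens: acc.inputTokens + r.inputTokens,
      outputTokens: acc.outputTokens + r.outputTokens,
    }),
    { inputTokens: 0, outputTokens: 0 },
  );
}
```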

