ZeroxArgs Interface
All parameters for the zerox() function.
Required Parameters
Path to the file to process. Can be a local file path or a URL to a remote file.
Supported formats:
- PDF (.pdf)
- Images (.png, .jpg, .jpeg, .heic)
- Structured data files (.xlsx, .xls, .csv)
- Other document formats that can be converted to PDF
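As a rough illustration of the format list above, the sketch below groups a path or URL by its extension. The helper classifyFormat and the "other" category are hypothetical names introduced here; they are not part of Zerox, and "other" only means the extension is not one of those listed explicitly, not that the file can or cannot be converted to PDF.

```typescript
type FormatCategory = "pdf" | "image" | "structured" | "other";

// Extension groups copied from the supported-formats list above.
const FORMAT_GROUPS: Array<[FormatCategory, string[]]> = [
  ["pdf", [".pdf"]],
  ["image", [".png", ".jpg", ".jpeg", ".heic"]],
  ["structured", [".xlsx", ".xls", ".csv"]],
];

// Hypothetical helper (not a Zerox API): classify a local path or URL by extension.
function classifyFormat(pathOrUrl: string): FormatCategory {
  // Drop any query string or fragment so "a.pdf?sig=1" still matches ".pdf".
  const clean = (pathOrUrl.split(/[?#]/)[0] ?? "").toLowerCase();
  const dot = clean.lastIndexOf(".");
  const ext = dot === -1 ? "" : clean.slice(dot);
  for (const [category, extensions] of FORMAT_GROUPS) {
    if (extensions.includes(ext)) {
      return category;
    }
  }
  // Not explicitly listed; it may still be a document convertible to PDF.
  return "other";
}
```

A check like this can be useful for rejecting obviously unsupported inputs early, before any model call is made.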
Authentication
Authentication credentials for the model provider. Required unless using openaiAPIKey.
Default: { apiKey: "" }
Deprecated: Use credentials instead. OpenAI API key for authentication. When provided, automatically sets modelProvider to ModelProvider.OPENAI and credentials to { apiKey: openaiAPIKey }.
Default: ""
Model Configuration
The model to use for OCR processing.
Available options:
- ModelOptions.OPENAI_GPT_4_1 - "gpt-4.1"
- ModelOptions.OPENAI_GPT_4_1_MINI - "gpt-4.1-mini"
- ModelOptions.OPENAI_GPT_4O - "gpt-4o"
- ModelOptions.OPENAI_GPT_4O_MINI - "gpt-4o-mini"
- ModelOptions.BEDROCK_CLAUDE_3_HAIKU_2024_10
- ModelOptions.BEDROCK_CLAUDE_3_SONNET_2024_10
- ModelOptions.GOOGLE_GEMINI_2_5_PRO
- ModelOptions.GOOGLE_GEMINI_2_FLASH
- And more (see Types)
Default: ModelOptions.OPENAI_GPT_4O
The model provider to use.
Options:
- ModelProvider.OPENAI
- ModelProvider.AZURE
- ModelProvider.BEDROCK
- ModelProvider.GOOGLE
Default: ModelProvider.OPENAI
Additional parameters for the language model.
Default: {}
Custom function to handle model completion. Useful for implementing custom model providers or adding preprocessing logic.
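The interaction between credentials, the deprecated openaiAPIKey, and modelProvider described under Authentication can be sketched as below. This is an illustration of the documented behavior, not Zerox's actual code: the Provider type, the simplified credentials shape, and the resolveAuth helper are stand-ins defined locally, and the OpenAI fallback for modelProvider is assumed from the defaults listed in this section.

```typescript
// Local stand-ins for illustration; the real ModelProvider enum lives in Zerox.
type Provider = "OPENAI" | "AZURE" | "BEDROCK" | "GOOGLE";

// Simplified: only the apiKey shape shown in the documented default.
interface ApiKeyCredentials {
  apiKey: string;
}

interface AuthInput {
  credentials?: ApiKeyCredentials;
  openaiAPIKey?: string; // deprecated in favor of credentials
  modelProvider?: Provider;
}

interface ResolvedAuth {
  credentials: ApiKeyCredentials;
  modelProvider: Provider;
}

// Hypothetical helper mirroring the documented precedence rules.
function resolveAuth(input: AuthInput): ResolvedAuth {
  if (input.openaiAPIKey) {
    // A non-empty openaiAPIKey forces the OpenAI provider and overrides credentials.
    return {
      credentials: { apiKey: input.openaiAPIKey },
      modelProvider: "OPENAI",
    };
  }
  return {
    credentials: input.credentials ?? { apiKey: "" },
    modelProvider: input.modelProvider ?? "OPENAI", // assumed default
  };
}
```

Because the deprecated key silently switches the provider to OpenAI, new code is better off passing credentials and modelProvider explicitly.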
Extraction Configuration
JSON Schema for structured data extraction. When provided, Zerox will extract structured data matching this schema.
When true, skips OCR and only performs structured data extraction. Requires schema to be provided. This mode automatically enables directImageExtraction.
Default: false
List of schema property names to extract on a per-page basis rather than from the full document.
When true, uses both OCR text and images for extraction, improving accuracy.
Requirements:
- Requires schema to be provided
- Cannot be used with directImageExtraction or extractOnly
Default: false
When true, performs extraction directly from images without OCR text.
Default: false
Model to use for extraction. If not specified, uses the same model as OCR.
Model provider for extraction. If not specified, uses the same provider as OCR.
Credentials for the extraction model. If not specified, uses the same credentials as OCR.
LLM parameters for extraction. If not specified, uses the same parameters as OCR.
Custom prompt for the extraction process.
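The constraints between the extraction flags above are easy to get wrong, so here is a minimal sketch that checks them before a call. It is not part of Zerox: the helper names are hypothetical, and the hybrid flag is called enableHybridExtraction here as an assumed name, since this section does not state it.

```typescript
interface ExtractionFlags {
  schema?: Record<string, unknown>;
  extractOnly?: boolean;
  directImageExtraction?: boolean;
  enableHybridExtraction?: boolean; // assumed name for the hybrid OCR + image mode
}

// Hypothetical helper: returns a list of rule violations (empty when valid).
function checkExtractionFlags(flags: ExtractionFlags): string[] {
  const problems: string[] = [];
  if (flags.extractOnly && !flags.schema) {
    problems.push("extractOnly requires schema");
  }
  if (flags.enableHybridExtraction) {
    if (!flags.schema) {
      problems.push("hybrid extraction requires schema");
    }
    if (flags.directImageExtraction || flags.extractOnly) {
      problems.push(
        "hybrid extraction cannot be combined with directImageExtraction or extractOnly"
      );
    }
  }
  return problems;
}

// extractOnly automatically enables directImageExtraction, per the text above.
function effectiveDirectImageExtraction(flags: ExtractionFlags): boolean {
  return Boolean(flags.extractOnly) || Boolean(flags.directImageExtraction);
}
```

Returning a list of problems rather than throwing on the first one makes it easier to report every misconfiguration at once.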
OCR Configuration
Custom prompt to guide the OCR process. Use this to provide specific instructions about how to extract or format the text.
When true, processes pages sequentially to maintain formatting consistency across pages.
Requirements:
- Cannot be used with extractOnly mode
- Disables concurrent processing
Default: false
When true, automatically detects and corrects page orientation using Tesseract OCR.
Default: true
When true, automatically trims white edges from images before processing.
Default: true
Image Processing
DPI (dots per inch) for PDF to image conversion. Higher values produce better quality but larger files.
Typical values: 150-300
Target height in pixels for converted images. Width is adjusted proportionally.
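To make the DPI and height settings concrete, the arithmetic can be sketched as follows. Both functions are illustrative helpers defined here, not Zerox APIs; the US Letter page size (8.5 x 11 inches) and the 2048-pixel target height in the usage note are example values only.

```typescript
interface PixelSize {
  width: number;
  height: number;
}

// Pixel size of a page rendered at a given DPI: pixels = inches * dpi.
function pixelsAtDpi(widthInches: number, heightInches: number, dpi: number): PixelSize {
  return {
    width: Math.round(widthInches * dpi),
    height: Math.round(heightInches * dpi),
  };
}

// Width that keeps the aspect ratio when an image is scaled to a target height.
function widthForHeight(srcWidth: number, srcHeight: number, targetHeight: number): number {
  return Math.round((srcWidth * targetHeight) / srcHeight);
}
```

For a US Letter page, pixelsAtDpi(8.5, 11, 300) gives 2550 x 3300 pixels, while 150 DPI gives 1275 x 1650, a quarter of the pixel count. Scaling the 300 DPI render to a height of 2048 pixels yields a width of 1583 pixels.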
Format for converted images.
Default: "png"
Maximum image size in megabytes. Images larger than this will be compressed.
Default: 15
Specify which pages to process:
- -1: Process all pages (default)
- number: Process a single page
- number[]: Process specific pages (e.g., [1, 3, 5])
Default: -1
Performance & Reliability
Maximum number of pages to process concurrently. Higher values increase speed but use more resources.
Default: 10
Maximum number of retry attempts for failed operations.
Default: 1
How to handle errors during processing:
- ErrorMode.IGNORE: Continue processing remaining pages (default)
- ErrorMode.THROW: Throw an error immediately on failure
Default: ErrorMode.IGNORE
Maximum number of Tesseract OCR workers for orientation correction.
- -1: Unlimited (default)
- number: Limit to specific count
Default: -1
File Management
Directory to save the aggregated markdown output. If not specified, output is only returned in the response.
Directory for temporary files during processing.
Default: os.tmpdir()
When true, automatically removes temporary files after processing.
Default: true
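The temporary-directory and cleanup behavior described above follows a common Node.js pattern, sketched below under stated assumptions. This is not Zerox's implementation: withTempDir, its option names, and the "zerox-sketch-" prefix are all hypothetical, chosen only to mirror the documented defaults (os.tmpdir() as the base directory, cleanup enabled).

```typescript
import * as fs from "fs";
import * as os from "os";
import * as path from "path";

interface TempOptions {
  tempDir?: string; // base directory; falls back to os.tmpdir()
  cleanup?: boolean; // remove the working directory afterwards; defaults to true
}

// Hypothetical helper: run work() inside a fresh temporary directory.
function withTempDir<T>(work: (dir: string) => T, options: TempOptions = {}): T {
  const base = options.tempDir ?? os.tmpdir();
  const cleanup = options.cleanup ?? true;
  // mkdtempSync appends random characters to the prefix, giving a unique directory.
  const dir = fs.mkdtempSync(path.join(base, "zerox-sketch-"));
  try {
    return work(dir);
  } finally {
    if (cleanup) {
      // Remove the directory and everything in it, even if work() threw.
      fs.rmSync(dir, { recursive: true, force: true });
    }
  }
}
```

Disabling cleanup in a pattern like this is mainly useful for debugging, since the intermediate files stay on disk for inspection.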
