Overview
Zerox provides advanced options to control image processing, formatting, and OCR behavior. This guide covers all advanced configuration parameters.
Format Preservation
Maintain Format
Preserve document formatting across pages for consistent output.
When true, Zerox processes pages sequentially and provides context from the previous page to maintain consistent formatting (headers, tables, lists, etc.). This increases accuracy but reduces concurrency.
Important: Cannot be used with extractOnly mode.
With maintainFormat: true, pages are processed sequentially rather than in parallel. This ensures formatting consistency but increases processing time.
How It Works
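As a rough illustration of the sequential flow, the sketch below threads each page's output into the next call. The names PageModel, processSequentially, and echoModel are hypothetical stand-ins; Zerox's internal logic may differ.

```typescript
// Hypothetical stand-in for a model call: receives the page and the
// text produced for the previous page (empty for the first page).
type PageModel = (page: string, priorPage: string) => Promise<string>;

// Process pages one at a time, passing the previous output as context.
async function processSequentially(
  pages: string[],
  model: PageModel
): Promise<string[]> {
  const results: string[] = [];
  let priorPage = ""; // the first page has no prior context
  for (const page of pages) {
    const content = await model(page, priorPage);
    results.push(content);
    priorPage = content; // becomes context for the next page
  }
  return results;
}

// Fake model that records the context it was given.
const echoModel: PageModel = async (page, priorPage) =>
  priorPage ? `${page} (after: ${priorPage})` : page;
```

Because every call waits for the one before it, total time grows linearly with page count, which is the cost the text describes.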
When enabled, each page receives context from the previous page.
Image Processing
Orientation Correction
Automatically detect and correct image orientation.
Zerox uses Tesseract OCR to detect each image's orientation and rotates the image as needed. This improves accuracy for scanned or rotated documents.
Edge Trimming
Remove whitespace and borders from images.
Zerox automatically trims whitespace around the content, which reduces token usage and keeps the focus on the actual content.
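A minimal configuration sketch for the two image processing options in this section. The interface is a hypothetical local mirror, and the field name trimEdges is an assumption that is not stated in the text above; correctOrientation is the name used later in this guide.

```typescript
// Hypothetical subset of the Zerox options covered in this section.
interface ImageProcessingOptions {
  correctOrientation?: boolean; // detect and fix rotation via Tesseract
  trimEdges?: boolean; // assumed name: strip surrounding whitespace and borders
}

// Scanned documents often benefit from both steps.
const scanOptions: ImageProcessingOptions = {
  correctOrientation: true,
  trimEdges: true,
};
```

These fields would be passed to Zerox alongside the file path and credentials.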
Image Conversion Options
Image Format
Choose the format for converted images:
- png: Higher quality, larger file size
- jpeg: Smaller file size, slight quality loss
Image Density
Control the resolution of PDF-to-image conversion, expressed in DPI. Higher values produce clearer images but larger file sizes.
- 150 DPI: Low quality, faster processing
- 300 DPI: Standard quality (recommended)
- 600 DPI: High quality, slower processing
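To make the trade-off concrete, the pixel size of a rendered page is its physical size in inches multiplied by the DPI. The helper renderedSize below is hypothetical and only illustrates the arithmetic.

```typescript
// Pixel size of a rendered page: inches multiplied by dots per inch.
function renderedSize(widthIn: number, heightIn: number, dpi: number) {
  return {
    width: Math.round(widthIn * dpi),
    height: Math.round(heightIn * dpi),
  };
}

// US Letter (8.5 x 11 in) at the three densities listed above.
const letterAt150 = renderedSize(8.5, 11, 150); // 1275 x 1650
const letterAt300 = renderedSize(8.5, 11, 300); // 2550 x 3300
const letterAt600 = renderedSize(8.5, 11, 600); // 5100 x 6600
```

Doubling the DPI quadruples the pixel count, which is why 600 DPI is noticeably slower than 300 DPI.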
Image Height
Specify a fixed height, in pixels, for converted images. The width is adjusted to maintain the aspect ratio. Useful for standardizing image sizes.
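The width adjustment can be sketched as a single scale factor applied to both dimensions. The helper scaleToHeight is hypothetical, and the rounding behavior is an assumption.

```typescript
// Scale an image to a fixed height, keeping the aspect ratio.
function scaleToHeight(width: number, height: number, targetHeight: number) {
  const scale = targetHeight / height;
  return { width: Math.round(width * scale), height: targetHeight };
}

// A 2550 x 3300 page resized to a height of 2048 px.
const resized = scaleToHeight(2550, 3300, 2048);
```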
Image Compression
Automatically compress images to reduce size.
Maximum image size in MB. Images larger than this are automatically compressed. Set to 0 to disable compression.
Page Selection
Process specific pages instead of the entire document. Pages to process:
- -1: Process all pages (default)
- number: Process single page (1-indexed)
- number[]: Process specific pages
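The three accepted shapes can be normalized into one list of page numbers. The sketch below is hypothetical: resolvePages is not a Zerox function, and dropping duplicates and out-of-range pages is a design choice made here, not documented library behavior.

```typescript
// Accepted shapes for the page selection value described above.
type PageSelection = number | number[];

// Expand a selection into a sorted list of 1-indexed page numbers.
function resolvePages(selection: PageSelection, totalPages: number): number[] {
  if (selection === -1) {
    // -1 means every page in the document
    return Array.from({ length: totalPages }, (_, i) => i + 1);
  }
  const requested = Array.isArray(selection) ? selection : [selection];
  // Drop duplicates and anything outside 1..totalPages
  const valid = requested.filter(
    (p, i) =>
      Number.isInteger(p) &&
      p >= 1 &&
      p <= totalPages &&
      requested.indexOf(p) === i
  );
  return valid.sort((a, b) => a - b);
}
```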
Concurrency Control
Control parallel processing of pages.
Maximum number of pages to process concurrently. Higher values increase speed but also the risk of hitting API rate limits.
Note: Ignored when maintainFormat: true (sequential processing).
Tesseract Workers
Configure the Tesseract OCR worker pool.
Maximum number of Tesseract worker threads for orientation detection:
- -1: Unlimited (default)
- number: Limit worker count
Only applies when correctOrientation: true.
Custom Prompts
Provide custom instructions for OCR.
Custom instructions are appended to the system prompt. Use them to provide domain-specific guidance.
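A small sketch of domain-specific guidance for invoices. The interface is a hypothetical local mirror, and the field name prompt is an assumption that is not stated in the text above.

```typescript
// Hypothetical options subset; the field name `prompt` is an assumption.
interface PromptOptions {
  prompt?: string;
}

// Domain-specific guidance for supplier invoices.
const invoiceOptions: PromptOptions = {
  prompt: [
    "Documents are supplier invoices.",
    "Render line items as a markdown table.",
    "Keep currency symbols and decimal separators exactly as printed.",
  ].join(" "),
};
```

Short, concrete instructions like these tend to be easier to verify against the output than long general ones.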
The default system prompt is defined in constants.ts.
LLM Parameters
Fine-tune model behavior. Available options depend on the provider.
OpenAI/Azure:
- temperature (0-2): Randomness (default: 0.7)
- maxTokens: Maximum output tokens
- topP (0-1): Nucleus sampling
- frequencyPenalty (-2 to 2): Penalize frequent tokens
- presencePenalty (-2 to 2): Penalize repeated content
- logprobs (boolean): Return log probabilities
Other providers differ:
- Some are similar to OpenAI, but without logprobs
- Some use maxOutputTokens instead of maxTokens
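The provider differences can be expressed as two parameter shapes. This is a sketch under assumptions: the type names and the helper toMaxOutputStyle are hypothetical and not part of Zerox, and the text does not say which provider uses which shape.

```typescript
// Parameter shape listed for OpenAI/Azure in the text above.
interface OpenAIStyleParams {
  temperature?: number;
  maxTokens?: number;
  topP?: number;
  frequencyPenalty?: number;
  presencePenalty?: number;
  logprobs?: boolean;
}

// Shape for a provider that uses maxOutputTokens and has no logprobs.
type MaxOutputStyleParams = Omit<OpenAIStyleParams, "maxTokens" | "logprobs"> & {
  maxOutputTokens?: number;
};

// Convert between the two shapes (hypothetical helper, not part of Zerox).
function toMaxOutputStyle(params: OpenAIStyleParams): MaxOutputStyleParams {
  // logprobs is dropped; maxTokens is renamed when present.
  const { maxTokens, logprobs, ...shared } = params;
  return maxTokens === undefined
    ? shared
    : { ...shared, maxOutputTokens: maxTokens };
}

const converted = toMaxOutputStyle({
  temperature: 0,
  maxTokens: 4096,
  logprobs: true,
});
```

Keeping the shapes as separate types lets the compiler flag a logprobs or maxTokens field passed to a provider that does not accept it.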
Custom Model Function
Provide a custom function for OCR processing. The function receives:
- buffers: Array of image buffers (may be split for tall images)
- image: Path to the image file
- maintainFormat: Whether to maintain formatting
- pageNumber: Current page number
- priorPage: Content from previous page (if maintainFormat: true)
The function returns a CompletionResponse object.
Complete Advanced Example
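The following sketch gathers the options from this guide into one typed object. It does not call the library. The interfaces are hypothetical local mirrors: field names other than maintainFormat, extractOnly, correctOrientation, and the custom function arguments are assumptions, as is the shape of CompletionResponse. Check them against the installed Zerox version.

```typescript
// Assumed response shape; only the type name appears in this guide.
interface CompletionResponse {
  content: string;
  inputTokens: number;
  outputTokens: number;
}

// Arguments the custom model function receives, as listed above.
interface CustomModelArgs {
  buffers: Uint8Array[];
  image: string;
  maintainFormat: boolean;
  pageNumber: number;
  priorPage: string;
}

// Hypothetical mirror of the options covered in this guide.
interface AdvancedOptions {
  maintainFormat?: boolean;
  extractOnly?: boolean;
  correctOrientation?: boolean;
  trimEdges?: boolean;
  imageFormat?: "png" | "jpeg";
  imageDensity?: number; // DPI
  imageHeight?: number; // pixels
  maxImageSize?: number; // MB, 0 disables compression
  pagesToConvertAsImages?: number | number[];
  concurrency?: number;
  maxTesseractWorkers?: number;
  prompt?: string;
  llmParams?: { temperature?: number; maxTokens?: number };
  customModelFunction?: (args: CustomModelArgs) => Promise<CompletionResponse>;
}

// Check the one hard constraint stated in this guide.
function assertValid(options: AdvancedOptions): void {
  if (options.maintainFormat && options.extractOnly) {
    throw new Error("maintainFormat cannot be used with extractOnly mode");
  }
}

const advancedOptions: AdvancedOptions = {
  maintainFormat: true, // sequential, so `concurrency` below is ignored
  correctOrientation: true,
  trimEdges: true,
  imageFormat: "png",
  imageDensity: 300,
  imageHeight: 2048,
  maxImageSize: 15,
  pagesToConvertAsImages: [1, 2, 3],
  concurrency: 10,
  maxTesseractWorkers: 4,
  prompt: "Preserve table structure and footnote markers.",
  llmParams: { temperature: 0, maxTokens: 4096 },
};

assertValid(advancedOptions);

// Stub custom model: returns placeholder text instead of calling an LLM.
const stubModel = async (args: CustomModelArgs): Promise<CompletionResponse> => ({
  content: `Page ${args.pageNumber}` + (args.priorPage ? " (continued)" : ""),
  inputTokens: 0,
  outputTokens: 0,
});
```

With the library installed, an object like this would typically be passed to the Zerox entry point together with the file path and credentials; that call shape is assumed here rather than shown. The stub could be supplied as customModelFunction to exercise the pipeline without calling a model.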
Next Steps
- Error Handling - Configure retry and error strategies
- Performance Tuning - Optimize processing speed
- Data Extraction - Extract structured data

