Zerox
Vision-Powered OCR for AI Ingestion
Convert PDFs, documents, and images to clean markdown using GPT-4o, Claude, Gemini and other vision models. Documents are visual after all — let vision models make sense of complex layouts, tables, and charts.
How It Works
Zerox makes document OCR dead simple by leveraging vision models.
Key Features
Multi-Provider Support
Works with OpenAI, Azure OpenAI, AWS Bedrock, and Google Gemini
20+ File Formats
Supports PDF, DOCX, XLSX, images, and more out of the box
Structured Data Extraction
Extracts specific fields using JSON schemas for forms, invoices, and tables
Dual SDKs
Available for both Node.js and Python with async APIs
Smart Processing
Auto-corrects orientation, trims edges, and processes pages concurrently
Format Preservation
Maintains consistent formatting across pages that contain tabular data
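To make the structured extraction feature concrete, here is a minimal sketch of a JSON schema for an invoice, plus a check that an extracted record contains the required fields. The field names, the sample record, and the `missing_fields` helper are hypothetical illustrations, not part of the Zerox API.

```python
from typing import Any, Dict, List

# Hypothetical invoice schema written in standard JSON Schema form.
invoice_schema: Dict[str, Any] = {
    "type": "object",
    "properties": {
        "invoice_number": {"type": "string"},
        "total": {"type": "number"},
    },
    "required": ["invoice_number", "total"],
}


def missing_fields(record: Dict[str, Any], schema: Dict[str, Any]) -> List[str]:
    """Return the required schema fields that are absent from an extracted record."""
    return [name for name in schema.get("required", []) if name not in record]


# Made-up extraction result, used only to exercise the helper.
extracted = {"invoice_number": "INV-001"}
print(missing_fields(extracted, invoice_schema))
```

A check like this is useful after extraction because a model can silently omit a field that is hard to read on the page.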
Quick Example
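A minimal sketch of the underlying idea, not the Zerox SDK itself: each page image is sent to a vision model, and the per-page markdown is joined in page order. `fake_vision_model` and `pages_to_markdown` are hypothetical names, and the stub stands in for a real model call; see the Node.js and Python setup guides for the actual API.

```python
import asyncio
from typing import Awaitable, Callable, List


async def fake_vision_model(page_image: bytes) -> str:
    """Hypothetical stand-in for a vision model call; makes no network request."""
    await asyncio.sleep(0)  # yield control, as a real I/O call would
    return f"# Page with {len(page_image)} bytes"


async def pages_to_markdown(
    page_images: List[bytes],
    describe_page: Callable[[bytes], Awaitable[str]],
    concurrency: int = 4,
) -> str:
    """Convert page images to markdown concurrently and join them in page order."""
    limit = asyncio.Semaphore(concurrency)  # cap the number of in-flight requests

    async def worker(image: bytes) -> str:
        async with limit:
            return await describe_page(image)

    # asyncio.gather returns results in the same order as its inputs.
    parts = await asyncio.gather(*(worker(image) for image in page_images))
    return "\n\n".join(parts)


pages = [b"abc", b"defgh"]  # placeholder bytes instead of real page images
markdown = asyncio.run(pages_to_markdown(pages, fake_vision_model))
print(markdown)
```

Passing the model call in as `describe_page` keeps the sketch provider-agnostic: swapping the stub for a real client does not change the page handling.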
Trusted by Developers
12,000+ GitHub Stars
Join thousands of developers using Zerox for document processing in production
Get Started
Quickstart
Get up and running in 5 minutes
Node.js Setup
Install the Node.js SDK
Python Setup
Install the Python SDK
Popular Guides
Data Extraction
Extract structured data from documents using schemas
Model Providers
Configure OpenAI, Azure, Bedrock, or Gemini
Batch Processing
Process multiple documents efficiently
Invoice Extraction
Extract data from invoices and forms

