Zerox

Vision-Powered OCR for AI Ingestion

Convert PDFs, documents, and images to clean markdown using GPT-4o, Claude, Gemini, and other vision models. Documents are visual, after all, so let vision models make sense of complex layouts, tables, and charts.


How It Works

Zerox makes document OCR dead simple by leveraging vision models:

Pass in a file

Upload a PDF, DOCX, image, or any of 20+ supported formats

Convert to images

Files are converted into a series of high-quality images

Vision model processing

Each image is sent to GPT-4o, Claude, or Gemini for markdown conversion

Get markdown output

Receive clean, structured markdown perfect for AI ingestion and RAG systems
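
The four steps above can be sketched as a small pipeline. This is a minimal illustration, not Zerox's actual implementation: `convertToMarkdown`, `Rasterizer`, `VisionModel`, and the batch-based concurrency are all hypothetical names and choices made for this example, and the rendering and model calls are passed in as functions so the sketch stays self-contained.

```typescript
// Hypothetical types for illustration only; they are not part of the Zerox API.
type PageImage = { pageNumber: number; data: Uint8Array };
type PageResult = { page: number; content: string };

// Stand-ins for the real work: rendering a file to images (step 2)
// and asking a vision model for markdown (step 3).
type Rasterizer = (filePath: string) => Promise<PageImage[]>;
type VisionModel = (image: PageImage) => Promise<string>;

async function convertToMarkdown(
  filePath: string,
  rasterize: Rasterizer,
  transcribe: VisionModel,
  concurrency = 4,
): Promise<PageResult[]> {
  // Step 2: turn the input file into one image per page.
  const images = await rasterize(filePath);
  const batchSize = Math.max(1, concurrency);
  const results: PageResult[] = [];

  // Step 3: send pages to the model in batches of `batchSize`.
  // Promise.all keeps each batch in page order.
  for (let start = 0; start < images.length; start += batchSize) {
    const batch = images.slice(start, start + batchSize);
    const converted = await Promise.all(
      batch.map(async (image) => ({
        page: image.pageNumber,
        content: await transcribe(image),
      })),
    );
    results.push(...converted);
  }

  // Step 4: one markdown string per page, in order.
  return results;
}
```

Injecting the rasterizer and the model call as parameters makes it possible to swap providers or to test the flow with stubs, without touching the loop itself.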

Key Features

Multi-Provider Support

Works with OpenAI, Azure OpenAI, AWS Bedrock, and Google Gemini

20+ File Formats

Supports PDF, DOCX, XLSX, images, and more out of the box

Structured Data Extraction

Extract specific fields using JSON schemas for forms, invoices, and tables

Dual SDKs

Available for both Node.js and Python with async APIs

Smart Processing

Auto-corrects orientation, trims edges, and processes pages concurrently

Format Preservation

Keeps formatting consistent from page to page, which is most useful for tables that span multiple pages
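
To make the Structured Data Extraction feature more concrete, here is a sketch of what a JSON schema for an invoice might look like, along with a small check on an extracted record. The field names, `invoiceSchema`, and `missingRequiredFields` are assumptions invented for this example. How a schema is passed to Zerox is not shown on this page, so the sketch deliberately makes no Zerox calls; see the Data Extraction guide for the actual option names.

```typescript
// Hypothetical invoice schema, written as a plain JSON Schema object.
// The field names are illustrative and not prescribed by Zerox.
const invoiceSchema = {
  type: "object",
  properties: {
    invoiceNumber: { type: "string" },
    issueDate: { type: "string", description: "Date in ISO 8601 format" },
    total: { type: "number" },
    lineItems: {
      type: "array",
      items: {
        type: "object",
        properties: {
          description: { type: "string" },
          amount: { type: "number" },
        },
        required: ["description", "amount"],
      },
    },
  },
  required: ["invoiceNumber", "total"],
} as const;

// Returns the names of required fields that are absent (undefined or null)
// in an extracted record. This checks presence only, not types.
function missingRequiredFields(
  extracted: Record<string, unknown>,
  required: readonly string[],
): string[] {
  return required.filter(
    (field) => extracted[field] === undefined || extracted[field] === null,
  );
}
```

A presence check like this is a cheap sanity test to run on model output before it reaches downstream systems. A full JSON Schema validator would also verify types and nested items.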

Quick Example

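import { zerox } from "zerox";

// Convert a remote PDF to markdown. The API key is read from the environment.
const result = await zerox({
  filePath: "https://example.com/document.pdf",
  credentials: {
    apiKey: process.env.OPENAI_API_KEY,
  },
});

// The result holds one entry per page; print the first page.
console.log(result.pages[0].content); // Markdown output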
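
The example above prints only the first page. For ingestion, it is often more convenient to have a single markdown string for the whole document. The helper below is a sketch that assumes only what the example shows, namely that each page has a `content` string; `joinPages` and the `Page` type are hypothetical names, not part of the SDK.

```typescript
// Minimal page shape, based only on the `content` field used in the example above.
type Page = { content?: string };

// Joins per-page markdown into one document, skipping pages with no text.
function joinPages(pages: Page[], separator = "\n\n"): string {
  return pages
    .map((page) => page.content ?? "")
    .filter((text) => text.trim().length > 0)
    .join(separator);
}
```

With the result from the example above, `joinPages(result.pages)` would give the full document as one string, ready for chunking or indexing.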

Trusted by Developers

12,000+ GitHub Stars

Join thousands of developers using Zerox for document processing in production

Get Started

Quickstart

Get up and running in 5 minutes

Node.js Setup

Install the Node.js SDK

Python Setup

Install the Python SDK

Popular Guides

Data Extraction

Extract structured data from documents using schemas

Model Providers

Configure OpenAI, Azure, Bedrock, or Gemini

Batch Processing

Process multiple documents efficiently

Invoice Extraction

Extract data from invoices and forms
