Zerox requires GraphicsMagick and Ghostscript for PDF processing. These are usually installed automatically, but if they are not, you may need to install them manually with your system's package manager.
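Before running Zerox, it can help to confirm that both tools are actually on the PATH. The following is a minimal Python sketch, not part of Zerox: `missing_tools` is a hypothetical helper, and it assumes the executables are named `gm` (GraphicsMagick) and `gs` (Ghostscript), which is typical on Linux and macOS but may differ on Windows.

```python
import shutil
from typing import Iterable, List


def missing_tools(tools: Iterable[str] = ("gm", "gs")) -> List[str]:
    """Return the names in `tools` that cannot be found on PATH.

    The default names are an assumption: `gm` for GraphicsMagick and
    `gs` for Ghostscript. Pass different names if your platform uses them.
    """
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("GraphicsMagick and Ghostscript were both found.")
```

If the helper reports a missing tool, install it manually and run the check again.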
The Python package uses LiteLLM under the hood, which supports 100+ LLMs. Model names follow the LiteLLM format: provider/model-name. Refer to the LiteLLM documentation for the complete list of supported models.
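To make the provider/model-name convention concrete, here is a small Python sketch that splits a model string into its two parts. `split_model_name` is a hypothetical helper written for illustration, not a LiteLLM or Zerox function, and the model strings are only examples. A name without a prefix, such as the gpt-4o-mini used elsewhere in this guide, is returned with no provider.

```python
from typing import Optional, Tuple


def split_model_name(model: str) -> Tuple[Optional[str], str]:
    """Split "provider/model-name" at the first slash.

    Hypothetical helper for illustration only. Only the first slash is
    treated as the separator, so any later slashes stay in the model part.
    """
    provider, separator, name = model.partition("/")
    if not separator:
        # No prefix present: the whole string is the model name.
        return None, model
    return provider, name


if __name__ == "__main__":
    for model in ("gpt-4o-mini", "azure/gpt-4o-mini", "vertex_ai/gemini-1.5-flash-001"):
        print(model, "->", split_model_name(model))
```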
Here are some frequently used options to customize Zerox’s behavior:
Node.js

    import { zerox } from "zerox";

    const result = await zerox({
      filePath: "path/to/file.pdf",
      credentials: {
        apiKey: process.env.OPENAI_API_KEY,
      },
      // Process only specific pages (e.g., pages 1, 3, and 5)
      pagesToConvertAsImages: [1, 3, 5],
      // Control parallel processing
      concurrency: 5,
      // Maintain consistent formatting across pages (slower but better for tables)
      maintainFormat: true,
      // Save combined markdown to a file
      outputDir: "./output",
      // Keep temporary images after processing
      cleanup: false,
      // Add custom instructions for the vision model
      prompt: "Extract all tables and preserve their structure exactly.",
      // Adjust image quality
      imageDensity: 300, // DPI for image conversion
      imageHeight: 2048, // Maximum height in pixels
    });

Python

    import asyncio

    from pyzerox import zerox


    async def main():
        result = await zerox(
            file_path="path/to/file.pdf",
            model="gpt-4o-mini",
            # Process only specific pages (e.g., pages 1, 3, and 5)
            select_pages=[1, 3, 5],
            # Control parallel processing
            concurrency=5,
            # Maintain consistent formatting across pages (slower but better for tables)
            maintain_format=True,
            # Save combined markdown to a file
            output_dir="./output",
            # Keep temporary images after processing
            cleanup=False,
            # Use a custom system prompt
            custom_system_prompt="Extract all tables and preserve their structure exactly.",
            # Adjust image quality
            image_density=300,  # DPI for image conversion
            image_height=(None, 2048),  # Maximum height in pixels
        )
        return result


    # zerox is a coroutine, so it has to be awaited inside an event loop
    result = asyncio.run(main())
The maintainFormat option (maintain_format in Python) processes pages sequentially rather than in parallel, because it passes the previous page’s output as context for the next one. This is slower but produces more consistent formatting, especially for tables that span multiple pages.
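The trade-off can be illustrated with a short Python sketch. This is not Zerox's implementation: `convert_page`, `convert_sequential`, and `convert_parallel` are hypothetical names, and the model call is replaced by a stub. The point is only that a page which needs the previous page's output cannot start until that output exists, whereas independent pages can run concurrently up to a limit.

```python
import asyncio
from typing import List, Optional


async def convert_page(page: int, prior_page: Optional[str] = None) -> str:
    """Hypothetical stand-in for one vision-model call on one page."""
    await asyncio.sleep(0)  # placeholder for the real network round trip
    note = "with prior-page context" if prior_page else "no context"
    return f"page {page}: {note}"


async def convert_sequential(pages: List[int]) -> List[str]:
    """Each call waits for the previous page so it can pass that output along."""
    results: List[str] = []
    prior: Optional[str] = None
    for page in pages:
        prior = await convert_page(page, prior)
        results.append(prior)
    return results


async def convert_parallel(pages: List[int], concurrency: int) -> List[str]:
    """Pages are independent, so up to `concurrency` calls run at once."""
    semaphore = asyncio.Semaphore(concurrency)

    async def bounded(page: int) -> str:
        async with semaphore:
            return await convert_page(page)

    # gather returns results in the same order as the input pages
    return list(await asyncio.gather(*(bounded(p) for p in pages)))


if __name__ == "__main__":
    print(asyncio.run(convert_sequential([1, 2, 3])))
    print(asyncio.run(convert_parallel([1, 2, 3], concurrency=2)))
```

In the sequential version the total time grows with the number of pages, which is the cost paid for carrying formatting context from one page to the next.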