
Response Type

The zerox() function returns a ZeroxOutput object containing the processed markdown content and metadata.
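import asyncio
from pyzerox import zerox

async def main():
    # Snippets on this page that use a bare `await` must run inside an async function like this one.
    return await zerox(file_path="document.pdf")  # returns a ZeroxOutput object

result = asyncio.run(main())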

ZeroxOutput Structure

@dataclass
class ZeroxOutput:
    completion_time: float
    file_name: str
    input_tokens: int
    output_tokens: int
    pages: List[Page]

Response Fields

completion_time
float
Total processing time in milliseconds, from start to finish. Includes time for:
  • File download (if URL)
  • PDF to image conversion
  • API calls to vision model
  • Markdown aggregation
result = await zerox(file_path="document.pdf")
print(f"Completed in {result.completion_time:.2f}ms")
# Output: Completed in 9432.98ms
file_name
str
Sanitized name of the processed file (without extension). Sanitization rules:
  • Special characters replaced with underscores
  • Converted to lowercase
  • Alphanumeric characters preserved
  • Truncated to 255 characters
Example:
  • Input: "My Document (Final).pdf"
  • Output: "my_document__final_"
result = await zerox(file_path="/path/to/Invoice #12345.pdf")
print(result.file_name)
# Output: invoice__12345
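The sanitization rules above can be approximated in a few lines. This is a minimal sketch, not pyzerox's actual implementation: the helper name `sanitize_file_name` is hypothetical, and it assumes that every character other than an ASCII letter or digit counts as a "special character".

```python
import os
import re


def sanitize_file_name(path: str, max_length: int = 255) -> str:
    """Approximate the documented rules (hypothetical helper, not library code)."""
    # Drop the directory part and the file extension.
    stem = os.path.splitext(os.path.basename(path))[0]
    # Replace anything that is not an ASCII letter or digit with an underscore.
    cleaned = re.sub(r"[^A-Za-z0-9]", "_", stem)
    # Lowercase, then truncate to the maximum length.
    return cleaned.lower()[:max_length]


print(sanitize_file_name("My Document (Final).pdf"))      # my_document__final_
print(sanitize_file_name("/path/to/Invoice #12345.pdf"))  # invoice__12345
```

Both results match the examples documented for file_name, but the library may treat other inputs (for example non-ASCII letters) differently.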
input_tokens
int
Total number of input tokens consumed across all API calls. Input tokens include:
  • System prompt
  • User message (image + instructions)
  • Previous page context (if maintain_format=True)
result = await zerox(file_path="document.pdf")
print(f"Input tokens: {result.input_tokens}")
# Output: Input tokens: 36877
output_tokens
int
Total number of output tokens generated across all API calls. Output tokens represent the markdown content generated by the model.
result = await zerox(file_path="document.pdf")
print(f"Output tokens: {result.output_tokens}")
total_tokens = result.input_tokens + result.output_tokens
print(f"Total tokens: {total_tokens}")
pages
List[Page]
List of Page objects containing the markdown content for each processed page. Pages are ordered sequentially. When using select_pages, only the selected pages are included, with their original page numbers preserved. See Page Structure below for details.
result = await zerox(file_path="document.pdf")

# Iterate through pages
for page in result.pages:
    print(f"Page {page.page}: {page.content_length} characters")

# Access specific page
first_page = result.pages[0]
print(first_page.content)

Page Structure

Each page in the pages list is a Page object with the following structure:
@dataclass
class Page:
    content: str
    content_length: int
    page: int

Page Fields

content
str
The markdown content extracted from the page. Content includes:
  • Headers and body text
  • Tables (formatted as HTML)
  • Lists and formatting
  • Special markers for logos, watermarks, page numbers
result = await zerox(file_path="document.pdf")
page = result.pages[0]

print(page.content)
# Output:
# # INVOICE # 36258
# **Date:** Mar 06 2012
# ...
content_length
int
The length of the markdown content in characters. Equivalent to len(content).
result = await zerox(file_path="document.pdf")
page = result.pages[0]

print(f"Content length: {page.content_length} characters")
# Output: Content length: 2333 characters
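Because content_length is described as equivalent to len(content), it counts characters (Unicode code points), not bytes. The sketch below uses a made-up string to show how the two can differ once the content is encoded for storage.

```python
# Made-up page content containing one non-ASCII character.
content = "Total: 25 €"

content_length = len(content)               # characters (Unicode code points)
byte_length = len(content.encode("utf-8"))  # bytes when saved as UTF-8

print(content_length, byte_length)  # 11 13
```

Use the byte count, not content_length, when estimating file or payload sizes.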
page
int
The page number (1-indexed) from the original document. When using select_pages, this reflects the original page number, not the sequential index.
# Process only pages 2, 5, and 10
result = await zerox(
    file_path="document.pdf",
    select_pages=[2, 5, 10]
)

# Pages preserve original numbering
print(result.pages[0].page)  # Output: 2
print(result.pages[1].page)  # Output: 5
print(result.pages[2].page)  # Output: 10
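Since list position and page number can differ, it may help to index pages by their original number. The following is a minimal sketch: the `Page` class here is a local stand-in that mirrors the documented fields, `index_by_page_number` is a hypothetical helper, and the sample contents are made up.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Page:
    # Local stand-in mirroring the documented Page fields.
    content: str
    content_length: int
    page: int


def index_by_page_number(pages: List[Page]) -> Dict[int, Page]:
    """Map each original page number to its Page object."""
    return {p.page: p for p in pages}


# Sample data shaped like a select_pages=[2, 5, 10] result.
pages = [
    Page(content=f"Page {n} text", content_length=len(f"Page {n} text"), page=n)
    for n in (2, 5, 10)
]
by_number = index_by_page_number(pages)

print(by_number[5].content)      # Page 5 text
print(by_number.get(3) is None)  # True: page 3 was not selected
```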

Example Output

This example shows the output from processing a single-page PDF (here, a textbook page on Java primitive types) using Azure OpenAI's gpt-4o-mini model.
ZeroxOutput(
    completion_time=9432.975,
    file_name='cs101',
    input_tokens=36877,
    output_tokens=515,
    pages=[
        Page(
            content='| Type    | Description                          | Wrapper Class |\n' +
                    '|---------|--------------------------------------|---------------|\n' +
                    '| byte    | 8-bit signed 2s complement integer   | Byte          |\n' +
                    '| short   | 16-bit signed 2s complement integer  | Short         |\n' +
                    '| int     | 32-bit signed 2s complement integer  | Integer       |\n' +
                    '| long    | 64-bit signed 2s complement integer  | Long          |\n' +
                    '| float   | 32-bit IEEE 754 floating point number| Float         |\n' +
                    '| double  | 64-bit floating point number         | Double        |\n' +
                    '| boolean | may be set to true or false          | Boolean       |\n' +
                    '| char    | 16-bit Unicode (UTF-16) character    | Character     |\n\n' +
                    'Table 26.2.: Primitive types in Java\n\n' +
                    '### 26.3.1. Declaration & Assignment\n\n' +
                    'Java is a statically typed language meaning that all variables must be declared before you can use ' +
                    'them or refer to them. In addition, when declaring a variable, you must specify both its type and ' +
                    'its identifier. For example:\n\n' +
                    '```java\n' +
                    'int numUnits;\n' +
                    'double costPerUnit;\n' +
                    'char firstInitial;\n' +
                    'boolean isStudent;\n' +
                    '```\n\n' +
                    'Each declaration specifies the variable\'s type followed by the identifier and ending with a ' +
                    'semicolon. The identifier rules are fairly standard: a name can consist of lowercase and ' +
                    'uppercase alphabetic characters, numbers, and underscores but may not begin with a numeric ' +
                    'character. We adopt the modern camelCasing naming convention for variables in our code. In ' +
                    'general, variables must be assigned a value before you can use them in an expression. You do not ' +
                    'have to immediately assign a value when you declare them (though it is good practice), but some ' +
                    'value must be assigned before they can be used or the compiler will issue an error.\n\n' +
                    'The assignment operator is a single equal sign, `=` and is a right-to-left assignment. That is, ' +
                    'the variable that we wish to assign the value to appears on the left-hand-side while the value ' +
                    '(literal, variable or expression) is on the right-hand-side. Using our variables from before, ' +
                    'we can assign them values:\n\n' +
                    '> 2 Instance variables, that is variables declared as part of an object do have default values. ' +
                    'For objects, the default is `null`, for all numeric types, zero is the default value. For the ' +
                    'boolean type, `false` is the default, and the default char value is `\\0`, the null-terminating ' +
                    'character (zero in the ASCII table).',
            content_length=2333,
            page=1
        )
    ]
)

Working with the Response

Accessing Page Content

result = await zerox(file_path="document.pdf")

# Get all page content as a single string
full_markdown = "\n\n".join(page.content for page in result.pages)

# Process each page individually
for page in result.pages:
    print(f"\n--- Page {page.page} ---")
    print(page.content[:200])  # Print first 200 characters
    print(f"Total length: {page.content_length} characters")

Calculating Token Costs

result = await zerox(file_path="document.pdf")

# Example pricing for gpt-4o-mini (as of 2024)
INPUT_COST_PER_1K = 0.00015   # $0.15 per 1M tokens
OUTPUT_COST_PER_1K = 0.0006    # $0.60 per 1M tokens

input_cost = (result.input_tokens / 1000) * INPUT_COST_PER_1K
output_cost = (result.output_tokens / 1000) * OUTPUT_COST_PER_1K
total_cost = input_cost + output_cost

print(f"Input tokens: {result.input_tokens:,} (${input_cost:.4f})")
print(f"Output tokens: {result.output_tokens:,} (${output_cost:.4f})")
print(f"Total cost: ${total_cost:.4f}")
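The same arithmetic can be wrapped in a small function that takes per-million-token prices, which is how providers usually quote them. This is a sketch: `estimate_cost` is a hypothetical helper, the prices are the 2024 example figures from the snippet above and may be out of date, and the token counts are taken from the Example Output section.

```python
def estimate_cost(
    input_tokens: int,
    output_tokens: int,
    input_price_per_million: float,
    output_price_per_million: float,
) -> float:
    """Return the estimated cost in dollars for one zerox() call."""
    input_cost = input_tokens / 1_000_000 * input_price_per_million
    output_cost = output_tokens / 1_000_000 * output_price_per_million
    return input_cost + output_cost


# 36,877 input tokens and 515 output tokens at $0.15 / $0.60 per 1M tokens.
cost = estimate_cost(36877, 515, 0.15, 0.60)
print(f"${cost:.6f}")  # $0.005841
```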

Processing Time Analysis

result = await zerox(
    file_path="large-document.pdf",
    concurrency=10
)

time_in_seconds = result.completion_time / 1000
time_per_page = result.completion_time / len(result.pages)

print(f"Total time: {time_in_seconds:.2f}s")
print(f"Pages processed: {len(result.pages)}")
print(f"Average time per page: {time_per_page:.0f}ms")
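When pages are processed in parallel, as the concurrency parameter suggests, the per-page average is wall-clock time spread across pages, not the latency of a single API call. The sketch below, with a hypothetical helper `summarize_timing` and illustrative numbers, also guards against an empty page list, which would otherwise raise a ZeroDivisionError.

```python
from typing import Dict


def summarize_timing(completion_time_ms: float, page_count: int) -> Dict[str, float]:
    """Convert a millisecond total into seconds, per-page time, and throughput."""
    seconds = completion_time_ms / 1000
    # Guard against an empty result so the divisions cannot fail.
    ms_per_page = completion_time_ms / page_count if page_count else 0.0
    pages_per_second = page_count / seconds if seconds > 0 else 0.0
    return {
        "seconds": seconds,
        "ms_per_page": ms_per_page,
        "pages_per_second": pages_per_second,
    }


# Illustrative numbers only: 12 pages finishing in 18 seconds of wall-clock time.
stats = summarize_timing(18_000.0, 12)
print(
    f"{stats['seconds']:.2f}s total, {stats['ms_per_page']:.0f}ms/page, "
    f"{stats['pages_per_second']:.2f} pages/s"
)
# 18.00s total, 1500ms/page, 0.67 pages/s
```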

Filtering Pages by Content

result = await zerox(file_path="document.pdf")

# Find pages containing specific keywords
search_term = "invoice"
matching_pages = [
    page for page in result.pages 
    if search_term.lower() in page.content.lower()
]

print(f"Found '{search_term}' on {len(matching_pages)} pages:")
for page in matching_pages:
    print(f"  - Page {page.page}")

Saving Individual Pages

import aiofiles
import os

result = await zerox(file_path="document.pdf")

# Save each page as a separate markdown file
output_dir = "./output/pages"
os.makedirs(output_dir, exist_ok=True)

for page in result.pages:
    file_path = os.path.join(output_dir, f"page_{page.page}.md")
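    # Write as UTF-8 explicitly so non-ASCII content is saved consistently across platforms
    async with aiofiles.open(file_path, "w", encoding="utf-8") as f: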
        await f.write(page.content)

print(f"Saved {len(result.pages)} pages to {output_dir}")

Response Comparison: Python vs Node.js

Key differences between Python and Node.js response structures:
| Field | Python | Node.js | Notes |
|-----------------|-----------------|----------------|--------------------------|
| Completion time | completion_time | completionTime | Snake case vs camel case |
| File name | file_name | fileName | Snake case vs camel case |
| Input tokens | input_tokens | inputTokens | Snake case vs camel case |
| Output tokens | output_tokens | outputTokens | Snake case vs camel case |
| Pages | pages | pages | Same |
| Page content | content | content | Same |
| Content length | content_length | contentLength | Snake case vs camel case |
| Extracted data | Not available | extracted | Node.js only |
| Summary | Not available | summary | Node.js only |
Node.js exclusive fields:
  • extracted: Structured data extraction results (when using schema parameter)
  • summary: Processing summary with success/failure counts
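If a Python result has to be consumed by code written against the Node.js field names, the keys can be renamed after converting the dataclasses to a dictionary. This is a minimal sketch: the two dataclasses are local stand-ins that mirror the documented structures, `to_camel` and `camelize` are hypothetical helpers, and the Node.js-only fields (extracted, summary) are not produced.

```python
from dataclasses import asdict, dataclass
from typing import Any, List


@dataclass
class Page:
    # Local stand-in mirroring the documented Page fields.
    content: str
    content_length: int
    page: int


@dataclass
class ZeroxOutput:
    # Local stand-in mirroring the documented ZeroxOutput fields.
    completion_time: float
    file_name: str
    input_tokens: int
    output_tokens: int
    pages: List[Page]


def to_camel(name: str) -> str:
    """Convert snake_case to camelCase, e.g. 'file_name' -> 'fileName'."""
    head, *rest = name.split("_")
    return head + "".join(part.capitalize() for part in rest)


def camelize(value: Any) -> Any:
    """Recursively rename dictionary keys to camelCase."""
    if isinstance(value, dict):
        return {to_camel(k): camelize(v) for k, v in value.items()}
    if isinstance(value, list):
        return [camelize(v) for v in value]
    return value


output = ZeroxOutput(
    completion_time=9432.975,
    file_name="cs101",
    input_tokens=36877,
    output_tokens=515,
    pages=[Page(content="# Title", content_length=7, page=1)],
)
camel = camelize(asdict(output))

print(sorted(camel))
# ['completionTime', 'fileName', 'inputTokens', 'outputTokens', 'pages']
print(sorted(camel["pages"][0]))
# ['content', 'contentLength', 'page']
```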

Type Annotations

For type hints in your Python code:
from typing import List
from pyzerox import zerox
from pyzerox.core.types import ZeroxOutput, Page

async def process_document(file_path: str) -> ZeroxOutput:
    result: ZeroxOutput = await zerox(file_path=file_path)
    return result

def analyze_pages(pages: List[Page]) -> None:
    for page in pages:
        print(f"Page {page.page}: {page.content_length} chars")

# Usage (run inside an async function, or wrap the call in asyncio.run)
result = await process_document("document.pdf")
analyze_pages(result.pages)
