Enums

ModelOptions

Available model identifiers for OCR and extraction.
enum ModelOptions {
  // OpenAI GPT-4 Models
  OPENAI_GPT_4_1 = "gpt-4.1",
  OPENAI_GPT_4_1_MINI = "gpt-4.1-mini",
  OPENAI_GPT_4O = "gpt-4o",
  OPENAI_GPT_4O_MINI = "gpt-4o-mini",

  // Bedrock Claude 3 / 3.5 Models
  BEDROCK_CLAUDE_3_HAIKU_2024_10 = "anthropic.claude-3-5-haiku-20241022-v1:0",
  BEDROCK_CLAUDE_3_SONNET_2024_06 = "anthropic.claude-3-5-sonnet-20240620-v1:0",
  BEDROCK_CLAUDE_3_SONNET_2024_10 = "anthropic.claude-3-5-sonnet-20241022-v2:0",
  BEDROCK_CLAUDE_3_HAIKU_2024_03 = "anthropic.claude-3-haiku-20240307-v1:0",
  BEDROCK_CLAUDE_3_OPUS_2024_02 = "anthropic.claude-3-opus-20240229-v1:0",
  BEDROCK_CLAUDE_3_SONNET_2024_02 = "anthropic.claude-3-sonnet-20240229-v1:0",

  // Google Gemini Models
  GOOGLE_GEMINI_1_5_FLASH = "gemini-1.5-flash",
  GOOGLE_GEMINI_1_5_FLASH_8B = "gemini-1.5-flash-8b",
  GOOGLE_GEMINI_1_5_PRO = "gemini-1.5-pro",
  GOOGLE_GEMINI_2_5_PRO = "gemini-2.5-pro-preview-03-25",
  GOOGLE_GEMINI_2_FLASH = "gemini-2.0-flash-001",
  GOOGLE_GEMINI_2_FLASH_LITE = "gemini-2.0-flash-lite-preview-02-05",
}
Usage:
import { zerox, ModelOptions } from 'zerox';

const result = await zerox({
  filePath: './document.pdf',
  model: ModelOptions.OPENAI_GPT_4O,
  credentials: { apiKey: process.env.OPENAI_API_KEY }
});

ModelProvider

Supported model providers.
enum ModelProvider {
  AZURE = "AZURE",
  BEDROCK = "BEDROCK",
  GOOGLE = "GOOGLE",
  OPENAI = "OPENAI",
}
Usage:
import { zerox, ModelProvider } from 'zerox';

const result = await zerox({
  filePath: './document.pdf',
  modelProvider: ModelProvider.BEDROCK,
  credentials: {
    region: 'us-east-1',
    accessKeyId: process.env.AWS_ACCESS_KEY_ID,
    secretAccessKey: process.env.AWS_SECRET_ACCESS_KEY
  }
});

ErrorMode

Error handling strategies.
enum ErrorMode {
  THROW = "THROW",
  IGNORE = "IGNORE",
}
Usage:
import { zerox, ErrorMode } from 'zerox';

const result = await zerox({
  filePath: './document.pdf',
  errorMode: ErrorMode.THROW, // Throw on first error
  credentials: { apiKey: process.env.OPENAI_API_KEY }
});

PageStatus

Page processing status.
enum PageStatus {
  SUCCESS = "SUCCESS",
  ERROR = "ERROR",
}
Usage:
result.pages.forEach(page => {
  if (page.status === PageStatus.ERROR) {
    console.error(`Page ${page.page} failed:`, page.error);
  }
});

OperationMode

Internal operation modes (used by custom model functions).
enum OperationMode {
  EXTRACTION = "EXTRACTION",
  OCR = "OCR",
}

Credential Types

ModelCredentials

Union type of all supported credential types.
type ModelCredentials =
  | AzureCredentials
  | BedrockCredentials
  | GoogleCredentials
  | OpenAICredentials;

OpenAICredentials

interface OpenAICredentials {
  apiKey: string;
}
Example:
const credentials: OpenAICredentials = {
  apiKey: process.env.OPENAI_API_KEY!
};

AzureCredentials

interface AzureCredentials {
  apiKey: string;
  endpoint: string;
}
Example:
const credentials: AzureCredentials = {
  apiKey: process.env.AZURE_API_KEY!,
  endpoint: 'https://your-resource.openai.azure.com/'
};

BedrockCredentials

interface BedrockCredentials {
  region: string;
  accessKeyId?: string;
  secretAccessKey?: string;
  sessionToken?: string;
}
Example:
const credentials: BedrockCredentials = {
  region: 'us-east-1',
  accessKeyId: process.env.AWS_ACCESS_KEY_ID,
  secretAccessKey: process.env.AWS_SECRET_ACCESS_KEY
};

GoogleCredentials

interface GoogleCredentials {
  apiKey: string;
}
Example:
const credentials: GoogleCredentials = {
  apiKey: process.env.GOOGLE_API_KEY!
};

LLM Parameter Types

LLMParams

Union type of all provider-specific LLM parameters.
type LLMParams =
  | AzureLLMParams
  | BedrockLLMParams
  | GoogleLLMParams
  | OpenAILLMParams;

Common LLM Parameters

All LLM parameter types share these base properties:
interface BaseLLMParams {
  frequencyPenalty?: number;
  presencePenalty?: number;
  temperature?: number;
  topP?: number;
}

OpenAILLMParams

interface OpenAILLMParams extends BaseLLMParams {
  logprobs: boolean;
  maxTokens: number;
}
Example:
const llmParams: Partial<OpenAILLMParams> = {
  temperature: 0.1,
  maxTokens: 4000,
  logprobs: true
};

AzureLLMParams

interface AzureLLMParams extends BaseLLMParams {
  logprobs: boolean;
  maxTokens: number;
}
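Example (a sketch mirroring the OpenAI parameters above; the interface is repeated inline here so the snippet stands alone):

```typescript
// Azure shares the OpenAI parameter shape: logprobs and maxTokens
// plus the common sampling parameters.
interface AzureLLMParams {
  temperature?: number;
  topP?: number;
  logprobs: boolean;
  maxTokens: number;
}

const llmParams: Partial<AzureLLMParams> = {
  temperature: 0.1,
  maxTokens: 4000,
  logprobs: true,
};
```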

BedrockLLMParams

interface BedrockLLMParams extends BaseLLMParams {
  maxTokens: number;
}
Example:
const llmParams: Partial<BedrockLLMParams> = {
  temperature: 0.2,
  maxTokens: 3000,
  topP: 0.9
};

GoogleLLMParams

interface GoogleLLMParams extends BaseLLMParams {
  maxOutputTokens: number;
}
Example:
const llmParams: Partial<GoogleLLMParams> = {
  temperature: 0.3,
  maxOutputTokens: 2048
};

Response Types

Page

Represents a single processed page.
interface Page {
  content?: string;
  contentLength?: number;
  error?: string;
  extracted?: Record<string, unknown>;
  inputTokens?: number;
  outputTokens?: number;
  page: number;
  status: PageStatus;
}
Properties:
  • content - Extracted text from the page (undefined on error)
  • contentLength - Character count of content
  • error - Error message if processing failed
  • extracted - Per-page extracted data (when using extractPerPage)
  • inputTokens - Input tokens consumed for this page
  • outputTokens - Output tokens generated for this page
  • page - Page number (1-indexed)
  • status - PageStatus.SUCCESS or PageStatus.ERROR
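The token fields make per-document cost accounting straightforward. A minimal sketch that sums usage across pages, with Page and PageStatus repeated inline so the snippet is self-contained (in real code you would use the pages from a zerox result):

```typescript
enum PageStatus {
  SUCCESS = "SUCCESS",
  ERROR = "ERROR",
}

interface Page {
  content?: string;
  inputTokens?: number;
  outputTokens?: number;
  page: number;
  status: PageStatus;
}

// Sum token usage across all pages; failed pages may have no token counts,
// so missing fields default to zero.
function totalTokens(pages: Page[]): { input: number; output: number } {
  return pages.reduce(
    (acc, p) => ({
      input: acc.input + (p.inputTokens ?? 0),
      output: acc.output + (p.outputTokens ?? 0),
    }),
    { input: 0, output: 0 }
  );
}

const pages: Page[] = [
  { page: 1, status: PageStatus.SUCCESS, content: "...", inputTokens: 1200, outputTokens: 400 },
  { page: 2, status: PageStatus.ERROR },
];
console.log(totalTokens(pages)); // { input: 1200, output: 400 }
```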

Summary

Processing summary statistics.
interface Summary {
  totalPages: number;
  ocr: {
    successful: number;
    failed: number;
  } | null;
  extracted: {
    successful: number;
    failed: number;
  } | null;
}
Properties:
  • totalPages - Total number of pages in the document
  • ocr - OCR statistics (null in extractOnly mode)
    • successful - Pages successfully processed
    • failed - Pages that failed
  • extracted - Extraction statistics (null if no schema)
    • successful - Successful extraction operations
    • failed - Failed extraction operations
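Since both ocr and extracted can be null, reporting code should handle each independently. A sketch that turns a Summary into a one-line report (the interface is repeated inline so the snippet stands alone):

```typescript
interface Summary {
  totalPages: number;
  ocr: { successful: number; failed: number } | null;
  extracted: { successful: number; failed: number } | null;
}

// Build a one-line status report, treating null sections as skipped.
function describeSummary(s: Summary): string {
  const ocr = s.ocr ? `OCR ${s.ocr.successful}/${s.totalPages} ok` : "OCR skipped";
  const ext = s.extracted
    ? `extraction ${s.extracted.successful} ok, ${s.extracted.failed} failed`
    : "no extraction";
  return `${ocr}; ${ext}`;
}

const summary: Summary = {
  totalPages: 10,
  ocr: { successful: 9, failed: 1 },
  extracted: null,
};
console.log(describeSummary(summary)); // "OCR 9/10 ok; no extraction"
```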

LogprobPage

Log probabilities for a single page or extraction.
interface LogprobPage {
  page: number | null;
  value: ChatCompletionTokenLogprob[];
}
Properties:
  • page - Page number (null for full-document extractions)
  • value - Array of token-level log probabilities from OpenAI API
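One common use is a rough per-page confidence score: the geometric mean of token probabilities equals exp of the mean log probability. A sketch, using a simplified token shape in place of OpenAI's ChatCompletionTokenLogprob:

```typescript
// Simplified token shape; real entries are OpenAI ChatCompletionTokenLogprob
// objects, which carry additional fields (bytes, top_logprobs).
interface TokenLogprob {
  token: string;
  logprob: number;
}

interface LogprobPage {
  page: number | null;
  value: TokenLogprob[];
}

// Geometric-mean token probability: exp(mean of logprobs).
// Returns 0 for pages with no tokens.
function pageConfidence(lp: LogprobPage): number {
  if (lp.value.length === 0) return 0;
  const mean = lp.value.reduce((sum, t) => sum + t.logprob, 0) / lp.value.length;
  return Math.exp(mean);
}

const lp: LogprobPage = {
  page: 1,
  value: [
    { token: "Hello", logprob: -0.1 },
    { token: "!", logprob: -0.3 },
  ],
};
console.log(pageConfidence(lp).toFixed(2)); // "0.82"
```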

Advanced Types

HybridInput

Input type for hybrid extraction mode.
interface HybridInput {
  imagePaths: string[];
  text: string;
}
Used internally when enableHybridExtraction: true.

CompletionResponse

Response from model completion (for custom model functions).
interface CompletionResponse {
  content: string;
  inputTokens: number;
  logprobs?: ChatCompletionTokenLogprob[] | null;
  outputTokens: number;
}
Usage:
const customModelFunction = async (params) => {
  // Your custom logic
  return {
    content: 'Extracted text...',
    inputTokens: 1000,
    outputTokens: 500,
    logprobs: null
  };
};

ExtractionResponse

Response from extraction operation (for custom model functions).
interface ExtractionResponse {
  extracted: Record<string, unknown>;
  inputTokens: number;
  logprobs?: ChatCompletionTokenLogprob[] | null;
  outputTokens: number;
}

ModelInterface

Interface for custom model implementations.
interface ModelInterface {
  getCompletion(
    mode: OperationMode,
    params: CompletionArgs | ExtractionArgs
  ): Promise<CompletionResponse | ExtractionResponse>;
}
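A minimal stub implementation might look like the following. The relevant types are repeated inline so the snippet stands alone, and CompletionArgs/ExtractionArgs are simplified to unknown for brevity; a real implementation would call your model here:

```typescript
enum OperationMode {
  EXTRACTION = "EXTRACTION",
  OCR = "OCR",
}

interface CompletionResponse {
  content: string;
  inputTokens: number;
  outputTokens: number;
}

interface ExtractionResponse {
  extracted: Record<string, unknown>;
  inputTokens: number;
  outputTokens: number;
}

// Pure helper holding the mode branching, so it can be exercised
// without awaiting a Promise.
function completeFor(mode: OperationMode): CompletionResponse | ExtractionResponse {
  if (mode === OperationMode.OCR) {
    // OCR mode returns the page content as text.
    return { content: "# Page text", inputTokens: 10, outputTokens: 5 };
  }
  // Extraction mode returns structured data.
  return { extracted: { title: "Stub" }, inputTokens: 10, outputTokens: 5 };
}

class StubModel {
  async getCompletion(
    mode: OperationMode,
    _params: unknown
  ): Promise<CompletionResponse | ExtractionResponse> {
    return completeFor(mode);
  }
}
```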

Type Guards

You can use TypeScript’s type narrowing to work with the response:
import { PageStatus } from 'zerox';

result.pages.forEach(page => {
  if (page.status === PageStatus.SUCCESS && page.content) {
    // TypeScript knows page.content is defined here
    console.log(page.content.substring(0, 100));
  } else if (page.status === PageStatus.ERROR && page.error) {
    // TypeScript knows page.error is defined here
    console.error(page.error);
  }
});