Required Parameters
The path or URL to the document file to process. Supports:
- Local file paths (e.g.,
"./document.pdf") - HTTP/HTTPS URLs (e.g.,
"https://example.com/doc.pdf") - Supported formats: PDF, DOCX, images, and more
Model Configuration
The vision model to use for OCR processing. The model name format depends on the provider.Refer to LiteLLM Providers for correct model names.Examples:
- OpenAI:
"gpt-4o-mini","gpt-4o" - Azure:
"azure/gpt-4o-mini"(format:azure/<deployment_name>) - Gemini:
"gemini/gemini-1.5-flash"(format:gemini/<model_name>) - Anthropic:
"claude-3-opus-20240229" - Bedrock:
"bedrock/anthropic.claude-3-sonnet-20240229-v1:0" - Vertex AI:
"vertex_ai/gemini-1.5-flash-001"
Override the default system prompt used for OCR processing.When set, a warning will be raised to inform you that the default prompt has been overridden.Default prompt: Instructs the model to convert documents to markdown, include all information, format tables as HTML, and use specific markers for logos, watermarks, and page numbers.
This parameter is unique to the Python SDK. The Node.js SDK does not support custom system prompts.
Processing Options
The number of pages to process concurrently. Higher values speed up processing but increase memory and API usage.Set to
1 for sequential processing (useful when combined with maintain_format).Whether to maintain consistent formatting across pages by passing the previous page’s output as context to the next page.When enabled:
- Pages are processed sequentially (slower)
- Each page receives the previous page’s markdown as context
- Useful for documents with tables spanning multiple pages
- Helps maintain consistent table formatting
Process only specific pages from the document. Can be:
- A single page number (int):
3 - A list of page numbers:
[1, 3, 5] - Any iterable of page numbers
None, all pages are processed.Image Conversion Options
DPI (dots per inch) for converting PDF pages to images. Higher values produce better quality but larger file sizes.Recommended values:
150: Fast, lower quality300: Good balance (default)600: High quality, slower
Maximum height for converted images as a tuple
(width, height). The first element (width) is typically None to maintain aspect ratio.Images larger than this height will be scaled down while preserving aspect ratio.File Management
Directory to save the aggregated markdown output as a
.md file.When set:- The directory will be created if it doesn’t exist
- Output file will be named
{file_name}.md - File name is sanitized (special characters replaced with underscores)
None, no file is written (markdown is only returned in the response).Directory for storing temporary files during processing (downloaded PDFs, converted images, etc.).Behavior:
- When
None: Uses system temp directory (automatically cleaned up) - When set: Uses specified directory
- If directory exists, contents are deleted before use
- If
cleanup=True, directory is deleted after processing - If
cleanup=False, files remain for inspection
Whether to delete temporary files after processing completes.Set to
False to keep temporary files for debugging or inspection.Additional LiteLLM Parameters
Additional keyword arguments passed directly to the Vertex AI Example:
litellm.completion() method.Commonly used parameters:temperature: Controls randomness (0.0 to 1.0)max_tokens: Maximum tokens in responsetop_p: Nucleus sampling parameterfrequency_penalty: Penalize frequent tokenspresence_penalty: Penalize present tokens- Provider-specific credentials (e.g.,
vertex_credentials)
- LiteLLM Providers for provider setup
- LiteLLM Completion Input for all parameters
Parameter Comparison: Python vs Node.js
Key differences between Python and Node.js SDKs:
| Feature | Python Parameter | Node.js Parameter | Notes |
|---|---|---|---|
| Custom prompts | custom_system_prompt | Not available | Python only |
| Page selection | select_pages | pagesToConvertAsImages | Different naming |
| Format maintenance | maintain_format | maintainFormat | Snake case vs camel case |
| Temp directory | temp_dir | tempDir | Snake case vs camel case |
| Data extraction | Not available | schema, extractPerPage | Node.js only |
| Error handling | Not available | errorMode | Node.js only |
| Orientation fix | Not available | correctOrientation | Node.js only |
| Edge trimming | Not available | trimEdges | Node.js only |

