Overview
PdfBackendOptions configures how PDF documents are parsed at the backend level, before the pipeline's processing stages run.
PdfBackendOptions
Parameters
- Backend type identifier. Always set to "pdf" for PDF backends.
- Password for encrypted PDF documents. Use Pydantic’s SecretStr type to securely handle sensitive password data.
- Enable fetching of remote resources referenced in the PDF document.
- Enable fetching of local resources referenced in the PDF document.
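The four settings described in this section can be pictured with a small standard-library stand-in. This is only a sketch and does not import Docling or Pydantic: the class names `PdfBackendOptionsSketch` and `Secret` are hypothetical, and the field names (`kind`, `password`, `enable_remote_fetch`, `enable_local_fetch`) and the `False` defaults are assumptions, since this page does not show them.

```python
from dataclasses import dataclass, field
from typing import Optional


class Secret:
    """Hypothetical stand-in for Pydantic's SecretStr: masks the value when printed."""

    def __init__(self, value: str) -> None:
        self._value = value

    def get_secret_value(self) -> str:
        # The raw value is only available through an explicit call.
        return self._value

    def __repr__(self) -> str:
        return "Secret('**********')"


@dataclass(frozen=True)
class PdfBackendOptionsSketch:
    """Illustrative mirror of the documented settings, not the real class."""

    # Assumed field names and defaults; check the real class before relying on them.
    password: Optional[Secret] = None
    enable_remote_fetch: bool = False
    enable_local_fetch: bool = False
    # The backend type identifier is fixed to "pdf" and cannot be passed in.
    kind: str = field(default="pdf", init=False)


options = PdfBackendOptionsSketch(password=Secret("s3cret"))
print(options)  # the password appears masked in the printed form
```

Wrapping the password in a secret type keeps it out of logs and tracebacks, which is the reason the documentation points to SecretStr.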
Usage
Basic Usage
With Pipeline Options
Backend Selection
Docling uses different PDF parsing backends depending on configuration:
Available PDF Backends
PYPDFIUM2
Standard PDF parser built on the PyPDFium2 library. Fast and reliable for basic text extraction.
DOCLING_PARSE
Docling’s advanced parsing backend with enhanced layout analysis and structure preservation. It provides better table detection and complex layout handling.
This is the currently recommended backend; it replaces the deprecated DLPARSE_V1, DLPARSE_V2, and DLPARSE_V4.
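When migrating older configuration that still names a deprecated backend, a small lookup can map it to the recommended one. The sketch below is not Docling's own selection logic: `PdfBackendName` and `resolve_backend` are hypothetical helpers, and only the backend names themselves come from this page.

```python
from enum import Enum


class PdfBackendName(str, Enum):
    """Hypothetical enum holding the two backends listed on this page."""

    PYPDFIUM2 = "PYPDFIUM2"
    DOCLING_PARSE = "DOCLING_PARSE"


# Names this page lists as deprecated in favor of DOCLING_PARSE.
DEPRECATED_NAMES = {"DLPARSE_V1", "DLPARSE_V2", "DLPARSE_V4"}


def resolve_backend(name: str) -> PdfBackendName:
    """Map a configured backend name to a current one.

    Deprecated names resolve to DOCLING_PARSE; unknown names raise KeyError.
    """
    key = name.strip().upper()
    if key in DEPRECATED_NAMES:
        return PdfBackendName.DOCLING_PARSE
    return PdfBackendName[key]


print(resolve_backend("dlparse_v2").name)
print(resolve_backend("PYPDFIUM2").name)
```

Raising on unknown names, instead of silently falling back, makes a typo in a configuration file visible at startup.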
See Also
- PDF Backend - PDF backend architecture
- Pipeline Options - Pipeline configuration
- DocumentConverter - Main conversion API