Backends Overview

Overview

Backends are the foundation of Docling’s document processing architecture. Each backend is responsible for parsing a specific document format and extracting its raw content before pipeline processing stages (OCR, layout analysis, etc.) are applied.

Backend Architecture

Docling uses a modular backend system in which each document format has a dedicated backend implementation.

Backend Types

Docling provides two main categories of backends:

Declarative Backends

Declarative backends can transform documents directly to DoclingDocument without requiring a recognition pipeline. These backends handle well-structured formats with explicit content markup. Examples:

DOCX (Microsoft Word)
PPTX (Microsoft PowerPoint)
XLSX (Microsoft Excel)
HTML
Markdown
AsciiDoc

Characteristics:

Direct conversion to DoclingDocument
No ML models required
Fast processing
Preserves the document structure explicitly defined in the format

Paginated Backends

Paginated backends extract page-level content and require additional pipeline processing for layout analysis, OCR, and structure recognition. Examples:

PDF
Images (JPEG, PNG, TIFF, etc.)

Characteristics:

Extracts raw page content (text, images, metadata)
Requires pipeline stages for structure recognition
Supports ML-based enhancements (OCR, layout analysis)
Page-by-page processing
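The difference between the two categories can be pictured with a small dispatch routine. The sketch below is illustrative only: `ToyDeclarativeBackend`, `ToyPaginatedBackend`, and `run_backend` are hypothetical names, not Docling classes, and plain strings stand in for `DoclingDocument` and for page objects. It shows only that a declarative backend yields a document in one step, while a paginated backend hands pages to later pipeline stages.

```python
from typing import List


class ToyDeclarativeBackend:
    """Stand-in for a declarative backend (hypothetical, not a Docling class)."""

    def __init__(self, text: str) -> None:
        self._text = text

    @classmethod
    def supports_pagination(cls) -> bool:
        return False

    def convert(self) -> str:
        # A real backend would return a DoclingDocument; a string stands in here.
        return f"document({self._text})"


class ToyPaginatedBackend:
    """Stand-in for a paginated backend (hypothetical, not a Docling class)."""

    def __init__(self, pages: List[str]) -> None:
        self._pages = pages

    @classmethod
    def supports_pagination(cls) -> bool:
        return True

    def page_count(self) -> int:
        return len(self._pages)

    def load_page(self, page_no: int) -> str:
        return self._pages[page_no]


def run_backend(backend) -> str:
    """Route a backend through the direct path or the page-by-page path."""
    if not backend.supports_pagination():
        return backend.convert()
    # Placeholder for pipeline stages such as OCR and layout analysis.
    processed = [
        f"recognized({backend.load_page(i)})" for i in range(backend.page_count())
    ]
    return "document(" + ", ".join(processed) + ")"


print(run_backend(ToyDeclarativeBackend("hello")))
print(run_backend(ToyPaginatedBackend(["p0", "p1"])))
```

The caller never needs to know the concrete format; the `supports_pagination()` flag alone decides which path is taken.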

Available Backends

PDF Backend

Process PDF documents with advanced parsing capabilities

DOCX Backend

Parse Microsoft Word documents with full formatting support

PPTX Backend

Extract content from PowerPoint presentations

XLSX Backend

Process Excel spreadsheets and tables

HTML Backend

Parse HTML documents and web pages

Image Backend

Process images (JPEG, PNG, TIFF, etc.)

Audio Backend

Transcribe audio files using ASR

Backend Interface

All backends implement the AbstractDocumentBackend interface:

Core Methods

is_valid() -> bool

Check if the backend successfully loaded and can process the document.

supports_pagination() -> bool

Indicates whether this backend processes documents page-by-page.

supported_formats() -> set[InputFormat]

Returns the set of input formats this backend can handle.

unload() -> None

Free resources and close file handles.

Declarative Backend Methods

convert() -> DoclingDocument

Convert the document directly to a DoclingDocument. Only available on declarative backends.

Paginated Backend Methods

page_count() -> int

Get the total number of pages in the document. Only available on paginated backends.

load_page(page_no) -> PageBackend

Load a specific page for processing. Only available on some paginated backends (PDF, Image).
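The method groups above map naturally onto a small class hierarchy. The following is a simplified model written with the standard `abc` module, not the Docling source: constructor arguments are omitted, formats are plain strings rather than `InputFormat` members, and `SketchBackend`, `SketchDeclarativeBackend`, `SketchPaginatedBackend`, and `InMemoryTextBackend` are hypothetical names chosen for illustration.

```python
from abc import ABC, abstractmethod
from typing import Set


class SketchBackend(ABC):
    """Simplified model of the core methods (not the Docling base class)."""

    @abstractmethod
    def is_valid(self) -> bool: ...

    @classmethod
    @abstractmethod
    def supports_pagination(cls) -> bool: ...

    @classmethod
    @abstractmethod
    def supported_formats(cls) -> Set[str]: ...

    def unload(self) -> None:
        # Default implementation: nothing to release.
        return None


class SketchDeclarativeBackend(SketchBackend):
    """Adds convert(); pagination is fixed to False."""

    @classmethod
    def supports_pagination(cls) -> bool:
        return False

    @abstractmethod
    def convert(self) -> object: ...


class SketchPaginatedBackend(SketchBackend):
    """Adds page_count(); pagination is fixed to True."""

    @classmethod
    def supports_pagination(cls) -> bool:
        return True

    @abstractmethod
    def page_count(self) -> int: ...


class InMemoryTextBackend(SketchDeclarativeBackend):
    """Hypothetical concrete backend that splits text into paragraphs."""

    def __init__(self, text: str) -> None:
        self._text = text

    def is_valid(self) -> bool:
        return bool(self._text)

    @classmethod
    def supported_formats(cls) -> Set[str]:
        return {"txt"}

    def convert(self) -> object:
        # A dict stands in for DoclingDocument in this sketch.
        return {"paragraphs": self._text.split("\n\n")}
```

In this model a subclass that forgets a required method cannot be instantiated, so an incomplete backend fails at construction time rather than in the middle of a conversion.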

Backend Options

Each backend can be configured with format-specific options:

from docling.datamodel.backend_options import PdfBackendOptions
from docling.datamodel.base_models import InputFormat
from docling.document_converter import DocumentConverter, PdfFormatOption
from pydantic import SecretStr

# Configure the PDF backend (here: the password for an encrypted PDF)
pdf_backend_options = PdfBackendOptions(
    password=SecretStr("secret123")
)

# format_options is keyed by InputFormat, not by the option class
converter = DocumentConverter(
    format_options={
        InputFormat.PDF: PdfFormatOption(
            backend_options=pdf_backend_options
        )
    }
)

See individual backend pages for format-specific options.

Choosing the Right Backend

Docling automatically selects the appropriate backend based on file extension and MIME type. However, understanding backend characteristics helps optimize performance:

For Document Conversion

Use DOCX, HTML, or Markdown backends when:

Source format explicitly defines structure
No OCR or layout analysis needed
Fast processing is a priority
Preserving exact formatting is important

For Scanned Documents

Use PDF or Image backends when:

Documents are scanned or image-based
OCR is required
Layout analysis needed for structure detection
Processing historical or archival documents

For Data Extraction

Use XLSX backend when:

Extracting tabular data from spreadsheets
Processing financial reports or data exports
Working with structured data in Excel format

For Presentations

Use PPTX backend when:

Converting slide decks to structured format
Extracting presentation content
Processing training materials or reports

For Web Content

Use HTML backend when:

Processing web pages or HTML exports
Converting documentation sites
Handling markdown-rendered HTML

For Audio/Video

Use Audio backend when:

Transcribing recorded meetings or interviews
Processing podcast or lecture audio
Converting speech to text
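The automatic selection mentioned at the start of this section can be approximated with the standard library. This sketch is not Docling's detection logic: `BACKEND_BY_EXTENSION`, `BACKEND_BY_MIME_PREFIX`, and `guess_backend` are hypothetical names, the backend labels are plain strings, and the mapping covers only formats discussed on this page.

```python
import mimetypes
from pathlib import Path
from typing import Optional

# Hypothetical lookup table: file extension -> backend label.
BACKEND_BY_EXTENSION = {
    ".pdf": "PDF",
    ".docx": "DOCX",
    ".pptx": "PPTX",
    ".xlsx": "XLSX",
    ".html": "HTML",
    ".htm": "HTML",
    ".md": "Markdown",
    ".adoc": "AsciiDoc",
}

# Hypothetical fallback: MIME type prefix -> backend label.
BACKEND_BY_MIME_PREFIX = {"image/": "Image", "audio/": "Audio"}


def guess_backend(filename: str) -> Optional[str]:
    """Pick a backend label from the extension first, then from the MIME type."""
    suffix = Path(filename).suffix.lower()
    if suffix in BACKEND_BY_EXTENSION:
        return BACKEND_BY_EXTENSION[suffix]
    mime, _ = mimetypes.guess_type(filename)
    if mime:
        for prefix, backend in BACKEND_BY_MIME_PREFIX.items():
            if mime.startswith(prefix):
                return backend
    return None


for name in ["report.PDF", "slides.pptx", "scan.png", "talk.mp3"]:
    print(name, "->", guess_backend(name))
```

Checking the extension before the MIME type keeps the common office formats on an exact-match path, while the MIME prefix covers the many image and audio extensions without listing each one.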

Backend Lifecycle

Typical backend lifecycle during document processing:

Initialization

The backend is created with the input document and options:

backend = PdfDocumentBackend(
    in_doc=input_document,
    path_or_stream=file_path,
    options=backend_options
)

Validation

Check if backend successfully loaded:

if not backend.is_valid():
    raise RuntimeError("Failed to load document")

Processing

Declarative backends:

doc = backend.convert()

Paginated backends:

for page_no in range(backend.page_count()):
    page = backend.load_page(page_no)
    # Process page through pipeline

Cleanup

Free resources:

backend.unload()
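Because cleanup should run even when validation or processing fails, the four steps can be wrapped in a context manager. This is a sketch under assumptions: `open_backend` and `FakeBackend` are hypothetical helpers, and the only backend methods relied on are `is_valid()` and `unload()` as described above.

```python
from contextlib import contextmanager


@contextmanager
def open_backend(factory):
    """Create a backend, validate it, and always unload it afterwards."""
    backend = factory()
    try:
        if not backend.is_valid():
            raise RuntimeError("Failed to load document")
        yield backend
    finally:
        # Runs on success, on validation failure, and on processing errors.
        backend.unload()


class FakeBackend:
    """Hypothetical stand-in used only to exercise the helper."""

    def __init__(self, valid: bool = True) -> None:
        self.valid = valid
        self.unloaded = False

    def is_valid(self) -> bool:
        return self.valid

    def unload(self) -> None:
        self.unloaded = True

    def convert(self) -> str:
        return "converted"


with open_backend(FakeBackend) as backend:
    doc = backend.convert()
```

Passing a factory rather than a ready-made backend keeps construction inside the helper, so the caller cannot forget the validation step.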

Thread Safety

Backend thread-safety considerations:

Initialization: Backends should be created per-thread or protected by locks
Page Loading (PDF/Image): Page backends are designed for concurrent access
Resource Management: Call unload() when done to free resources

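import concurrent.futures

def process_page(backend, page_no):
    # Load one page and release its resources even if processing fails
    page = backend.load_page(page_no)
    try:
        ...  # Process page
    finally:
        page.unload()

# Safe: Load pages concurrently
backend = PdfDocumentBackend(...)
with concurrent.futures.ThreadPoolExecutor() as executor:
    futures = [
        executor.submit(process_page, backend, i)
        for i in range(backend.page_count())
    ]
    # Wait for every page; result() re-raises any exception from a worker
    results = [f.result() for f in futures]

# Release the document-level backend once all pages are done
backend.unload()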
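The example above shares one backend across threads for page loading. The first guideline, creating backends per thread or under a lock, can be sketched as follows. Everything here is an assumption made for illustration: `UppercaseBackend` is a hypothetical stand-in rather than a Docling class, and `convert_one`, `convert_many`, and `_init_lock` are names invented for this sketch.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List

# One lock shared by all workers; it guards backend construction only.
_init_lock = threading.Lock()


class UppercaseBackend:
    """Hypothetical stand-in backend; not a Docling class."""

    def __init__(self, source: str) -> None:
        self._source = source

    def convert(self) -> str:
        return self._source.upper()

    def unload(self) -> None:
        self._source = ""


def convert_one(make_backend: Callable, source: str) -> str:
    """Build a backend under the lock, then convert and unload outside it."""
    with _init_lock:
        backend = make_backend(source)
    try:
        return backend.convert()
    finally:
        backend.unload()


def convert_many(make_backend: Callable, sources: List[str], max_workers: int = 4) -> List[str]:
    """Convert each source with its own backend; results keep input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        return list(executor.map(lambda s: convert_one(make_backend, s), sources))


print(convert_many(UppercaseBackend, ["a", "b", "c"]))
```

Each task owns its backend from creation to `unload()`, so no backend instance is shared between threads; only the construction step is serialized.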

Custom Backends

Docling’s backend system is extensible. To implement a custom backend:

Inherit from AbstractDocumentBackend, DeclarativeDocumentBackend, or PaginatedDocumentBackend
Implement required abstract methods
Register backend with DocumentConverter

from docling.backend.abstract_backend import DeclarativeDocumentBackend
from docling.datamodel.base_models import InputFormat
from docling_core.types.doc import DoclingDocument

class MyCustomBackend(DeclarativeDocumentBackend):
    @classmethod
    def supported_formats(cls) -> set[InputFormat]:
        # InputFormat.CUSTOM is treated here as a placeholder;
        # substitute the InputFormat member your backend handles
        return {InputFormat.CUSTOM}

    def is_valid(self) -> bool:
        return True

    @classmethod
    def supports_pagination(cls) -> bool:
        return False

    def convert(self) -> DoclingDocument:
        # Implement conversion logic
        raise NotImplementedError
