Plugin System

Overview

Docling’s plugin system enables third-party developers to extend Docling’s capabilities by registering custom:

OCR Engines: Add support for new optical character recognition engines
Layout Models: Integrate custom document layout detection models
Table Structure Models: Provide alternative table extraction engines
Picture Description Models: Add custom vision-language models

Plugins are loaded via the pluggy framework using setuptools entry points. Source: ~/workspace/source/docs/concepts/plugins.md:1

Plugin Architecture

Entry Point Registration

Plugins register themselves through setuptools entry points under the "docling" group.

[project.entry-points."docling"]
your_plugin_name = "your_package.module"

Source: ~/workspace/source/docs/concepts/plugins.md:5

your_plugin_name: Unique name for your plugin in the Docling ecosystem
your_package.module: Python module that registers the plugin factories

Plugin Discovery

Docling automatically discovers installed plugins at runtime through the pluggy system. Third-party plugins must be explicitly enabled:

from docling.datamodel.pipeline_options import PdfPipelineOptions

pipeline_options = PdfPipelineOptions()
pipeline_options.allow_external_plugins = True  # Enable third-party plugins

Plugin Factories

OCR Factory

The OCR factory registers custom OCR engines. Source: ~/workspace/source/docs/concepts/plugins.md:47

Implementation

# your_package/module.py

from docling.models.base_ocr_model import BaseOcrModel
from docling.datamodel.pipeline_options import OcrOptions
from pydantic import Field
from typing import ClassVar, Iterable
import logging

# Define your OCR options
class YourOcrOptions(OcrOptions):
    kind: ClassVar[str] = "your_ocr_engine"
    
    # Add custom configuration fields
    confidence_threshold: float = Field(
        default=0.8,
        description="Minimum confidence score for OCR results"
    )
    language_model: str = Field(
        default="en",
        description="Language model to use"
    )

# Implement your OCR model
class YourOcrModel(BaseOcrModel):
    def __init__(self, options: YourOcrOptions):
        self.options = options
        self.engine = self._initialize_engine()
        self._log = logging.getLogger(__name__)
    
    def _initialize_engine(self):
        # Initialize your OCR engine here.
        # YourOCREngine is a placeholder for your own engine class.
        return YourOCREngine()
    
    @classmethod
    def get_options_type(cls):
        return YourOcrOptions
    
    def __call__(self, conv_res, page_batch: Iterable):
        """Process a batch of pages and return OCR results."""
        for page in page_batch:
            # Your OCR logic here
            ocr_result = self.engine.recognize(page.image)
            
            # Add OCR results to page
            page.ocr_cells.extend(ocr_result.cells)
            
            yield page

# Factory registration
def ocr_engines():
    return {
        "ocr_engines": [
            YourOcrModel,
        ]
    }

Source: ~/workspace/source/docs/concepts/plugins.md:54

Base Class Requirements

Your OCR model must:

Inherit from BaseOcrModel
Provide an options class derived from OcrOptions
Implement get_options_type() class method
Implement __call__(conv_res, page_batch) for batch processing
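
The four requirements above can be checked mechanically before a plugin is published. The following is a sketch only: `missing_requirements` is a hypothetical helper, not a Docling API, and it takes the base classes as arguments so that it runs without Docling installed.

```python
import inspect


def missing_requirements(model_cls, base_cls, options_base_cls) -> list[str]:
    """Return human-readable problems; an empty list means all checks pass."""
    if not (inspect.isclass(model_cls) and issubclass(model_cls, base_cls)):
        return [f"model must be a subclass of {base_cls.__name__}"]

    problems = []
    get_options_type = getattr(model_cls, "get_options_type", None)
    if not callable(get_options_type):
        problems.append("get_options_type() class method is missing")
    else:
        try:
            options_cls = get_options_type()
        except TypeError:
            # Raised when get_options_type is a plain method, not a classmethod.
            options_cls = None
        if not (inspect.isclass(options_cls) and issubclass(options_cls, options_base_cls)):
            problems.append(f"options class must derive from {options_base_cls.__name__}")

    # Look for __call__ in the class hierarchy itself; hasattr() would always
    # succeed because every class is callable through its metaclass.
    if not any("__call__" in vars(klass) for klass in model_cls.__mro__):
        problems.append("__call__(conv_res, page_batch) is missing")
    return problems
```

With Docling installed, a call such as `missing_requirements(YourOcrModel, BaseOcrModel, OcrOptions)` would apply the same checks to a real plugin. If the base class already defines `__call__`, the last check passes regardless of the subclass.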

Reference Implementation

See the default Docling plugins for examples. Source: ~/workspace/source/docling/models/plugins/defaults.py:1

# Docling's built-in OCR engines
def ocr_engines():
    from docling.models.stages.ocr.auto_ocr_model import OcrAutoModel
    from docling.models.stages.ocr.easyocr_model import EasyOcrModel
    from docling.models.stages.ocr.ocr_mac_model import OcrMacModel
    from docling.models.stages.ocr.rapid_ocr_model import RapidOcrModel
    from docling.models.stages.ocr.tesseract_ocr_cli_model import TesseractOcrCliModel
    from docling.models.stages.ocr.tesseract_ocr_model import TesseractOcrModel

    return {
        "ocr_engines": [
            OcrAutoModel,
            EasyOcrModel,
            OcrMacModel,
            RapidOcrModel,
            TesseractOcrModel,
            TesseractOcrCliModel,
        ]
    }
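
One way to picture how factory results relate to the `kind` value of each options class is a lookup table keyed by `kind`. The sketch below is an illustration under that assumption, not Docling's actual factory code; `build_kind_registry` and the stand-in classes are hypothetical.

```python
def build_kind_registry(factory_results, key="ocr_engines"):
    """Map each options `kind` to the model class that declares it."""
    registry = {}
    for result in factory_results:
        for model_cls in result.get(key, []):
            kind = model_cls.get_options_type().kind
            if kind in registry:
                raise ValueError(
                    f"duplicate kind {kind!r}: "
                    f"{registry[kind].__name__} and {model_cls.__name__}"
                )
            registry[kind] = model_cls
    return registry


# Stand-in classes so the sketch runs without Docling installed.
class DemoOptions:
    kind = "demo_ocr"


class DemoModel:
    @classmethod
    def get_options_type(cls):
        return DemoOptions


registry = build_kind_registry([{"ocr_engines": [DemoModel]}])
print(sorted(registry))
```

The duplicate check shows why a plugin's `kind` should be unique across installed engines.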

Layout Engine Factory

def layout_engines():
    from docling.models.stages.layout.layout_model import LayoutModel
    from docling.models.stages.layout.layout_object_detection_model import (
        LayoutObjectDetectionModel,
    )

    return {
        "layout_engines": [
            LayoutObjectDetectionModel,
            LayoutModel,
        ]
    }

Table Structure Factory

def table_structure_engines():
    from docling.models.stages.table_structure.table_structure_model import (
        TableStructureModel,
    )

    return {
        "table_structure_engines": [
            TableStructureModel,
        ]
    }

Picture Description Factory

Register custom vision-language models for image captioning. Source: ~/workspace/source/docling/models/plugins/defaults.py:21

def picture_description():
    from docling.models.stages.picture_description.picture_description_api_model import (
        PictureDescriptionApiModel,
    )
    from docling.models.stages.picture_description.picture_description_vlm_engine_model import (
        PictureDescriptionVlmEngineModel,
    )
    from docling.models.stages.picture_description.picture_description_vlm_model import (
        PictureDescriptionVlmModel,
    )

    return {
        "picture_description": [
            PictureDescriptionVlmEngineModel,  # New engine-based (preferred)
            PictureDescriptionVlmModel,  # Legacy direct transformers
            PictureDescriptionApiModel,  # API-based
        ]
    }

Using Plugins

Python API

Enable and configure external plugins:

from docling.datamodel.base_models import InputFormat
from docling.datamodel.pipeline_options import PdfPipelineOptions
from docling.document_converter import DocumentConverter, PdfFormatOption

# Import your plugin's options class from the module that defines it
from your_package.module import YourOcrOptions

pipeline_options = PdfPipelineOptions()

# Enable external plugins
pipeline_options.allow_external_plugins = True

# Configure your plugin
pipeline_options.ocr_options = YourOcrOptions(
    lang=["eng", "spa"],
    confidence_threshold=0.9
)

converter = DocumentConverter(
    format_options={
        InputFormat.PDF: PdfFormatOption(
            pipeline_options=pipeline_options
        )
    }
)

result = converter.convert("document.pdf")

Source: ~/workspace/source/docs/concepts/plugins.md:72

Command Line Interface

When using the Docling CLI, enable external plugins before selecting them:

# List available external plugins
docling --show-external-plugins

# Run with external OCR plugin
docling --allow-external-plugins --ocr-engine=your_ocr_engine document.pdf

Source: ~/workspace/source/docs/concepts/plugins.md:92
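
For batch scripts it can help to assemble the CLI call in Python. The sketch below only builds the argument list from the flags shown above; `build_docling_command` is a hypothetical helper name.

```python
import shlex


def build_docling_command(document: str, ocr_engine: str) -> list[str]:
    """Build the argument list for running Docling with an external OCR plugin."""
    return [
        "docling",
        "--allow-external-plugins",
        f"--ocr-engine={ocr_engine}",
        document,
    ]


cmd = build_docling_command("document.pdf", "your_ocr_engine")
print(shlex.join(cmd))
```

The list can be passed to `subprocess.run(cmd, check=True)` when the `docling` executable is on the PATH.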

Plugin Development Guide

Step 1: Create Plugin Package

mkdir docling-plugin-example
cd docling-plugin-example

Create package structure:

docling-plugin-example/
├── pyproject.toml
├── README.md
└── docling_plugin_example/
    ├── __init__.py
    ├── my_ocr_model.py
    └── my_ocr_options.py
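
The layout above can also be created with a short script. This is a convenience sketch, not a required step; `scaffold_plugin` is a hypothetical helper that creates empty files for you to fill in.

```python
from pathlib import Path

# Files from the package structure shown above, relative to the project root.
PACKAGE_FILES = [
    "pyproject.toml",
    "README.md",
    "docling_plugin_example/__init__.py",
    "docling_plugin_example/my_ocr_model.py",
    "docling_plugin_example/my_ocr_options.py",
]


def scaffold_plugin(root: Path) -> list[Path]:
    """Create the directory tree and empty files; existing files are left untouched."""
    created = []
    for relative in PACKAGE_FILES:
        path = root / relative
        path.parent.mkdir(parents=True, exist_ok=True)
        path.touch(exist_ok=True)
        created.append(path)
    return created


if __name__ == "__main__":
    for path in scaffold_plugin(Path("docling-plugin-example")):
        print(path)
```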

Step 2: Implement Plugin

# docling_plugin_example/my_ocr_options.py
from docling.datamodel.pipeline_options import OcrOptions
from pydantic import Field
from typing import ClassVar

class MyOcrOptions(OcrOptions):
    kind: ClassVar[str] = "my_ocr"
    
    api_key: str = Field(
        description="API key for OCR service"
    )
    region: str = Field(
        default="us-east-1",
        description="Service region"
    )

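# docling_plugin_example/my_ocr_model.py
from docling.models.base_ocr_model import BaseOcrModel
from .my_ocr_options import MyOcrOptions
from typing import Iterable
import logging

# ExternalOCRService and OcrCell are placeholders: import your own OCR client
# and the cell type that your installed Docling version expects.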

class MyOcrModel(BaseOcrModel):
    def __init__(self, options: MyOcrOptions):
        self.options = options
        self._log = logging.getLogger(__name__)
        self._initialize_client()
    
    def _initialize_client(self):
        # Initialize your OCR client/engine
        self.client = ExternalOCRService(
            api_key=self.options.api_key,
            region=self.options.region
        )
    
    @classmethod
    def get_options_type(cls):
        return MyOcrOptions
    
    def __call__(self, conv_res, page_batch: Iterable):
        for page in page_batch:
            # Convert page to image
            image = page.get_image()
            
            # Call your OCR service
            result = self.client.recognize(
                image,
                languages=self.options.lang
            )
            
            # Convert result to Docling format
            for word in result.words:
                ocr_cell = OcrCell(
                    text=word.text,
                    confidence=word.confidence,
                    bbox=word.bounding_box
                )
                page.ocr_cells.append(ocr_cell)
            
            yield page

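# docling_plugin_example/__init__.py
from .my_ocr_model import MyOcrModel
from .my_ocr_options import MyOcrOptions  # re-exported so users can import it from the package

def ocr_engines():
    """Plugin factory for OCR engines."""
    return {
        "ocr_engines": [
            MyOcrModel,
        ]
    }

__all__ = ["MyOcrModel", "MyOcrOptions", "ocr_engines"]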

Step 3: Configure Entry Point

# pyproject.toml
[project]
name = "docling-plugin-example"
version = "0.1.0"
description = "Example OCR plugin for Docling"
requires-python = ">=3.10"
dependencies = [
    "docling>=2.0.0",
    # Your dependencies here
]

[project.entry-points."docling"]
my_ocr_plugin = "docling_plugin_example"

[build-system]
requires = ["setuptools>=61.0"]
build-backend = "setuptools.build_meta"

Step 4: Install and Test

# Install in development mode
pip install -e .

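# Test plugin discovery (entry points must be loaded before they can be listed)
python -c "import pluggy; pm = pluggy.PluginManager('docling'); pm.load_setuptools_entrypoints('docling'); print(pm.list_name_plugin())"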
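
Beyond discovery, it is useful to confirm that the registered module actually exposes factory functions. The sketch below imports a module by name and reports the classes each factory returns. `collect_factories` is a hypothetical helper, and the factory names are the four shown earlier on this page.

```python
import importlib

# Factory function names used by the built-in plugins shown earlier.
FACTORY_NAMES = (
    "ocr_engines",
    "layout_engines",
    "table_structure_engines",
    "picture_description",
)


def collect_factories(module_name: str) -> dict[str, list[str]]:
    """Import a plugin module and list the class names each factory returns."""
    module = importlib.import_module(module_name)
    found = {}
    for name in FACTORY_NAMES:
        factory = getattr(module, name, None)
        if callable(factory):
            # Each factory returns a dict keyed by its own name.
            found[name] = [cls.__name__ for cls in factory().get(name, [])]
    return found
```

For the package built in Step 2, `collect_factories("docling_plugin_example")` should report `MyOcrModel` under `ocr_engines`, assuming the package is installed and imports cleanly.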

Step 5: Use Your Plugin

from docling.document_converter import DocumentConverter, PdfFormatOption
from docling.datamodel.pipeline_options import PdfPipelineOptions
from docling.datamodel.base_models import InputFormat
from docling_plugin_example import MyOcrOptions

pipeline_options = PdfPipelineOptions()
pipeline_options.allow_external_plugins = True
pipeline_options.ocr_options = MyOcrOptions(
    lang=["eng"],
    api_key="your-api-key",
    region="us-west-2"
)

converter = DocumentConverter(
    format_options={
        InputFormat.PDF: PdfFormatOption(
            pipeline_options=pipeline_options
        )
    }
)

result = converter.convert("document.pdf")

Best Practices

Naming Conventions

Use descriptive, unique plugin names (e.g., docling-plugin-aws-textract)
Prefix your package with docling-plugin- for discoverability
Use lowercase with hyphens for package names
Use snake_case for Python modules
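
These conventions are simple enough to check in a pre-release script. The patterns below are one possible reading of the rules above, not an official Docling check; `follows_conventions` is a hypothetical helper.

```python
import re

# Package: "docling-plugin-" prefix, then lowercase words separated by hyphens.
_PACKAGE_NAME = re.compile(r"docling-plugin-[a-z0-9]+(-[a-z0-9]+)*")
# Module: snake_case, i.e. lowercase letters, digits, and underscores.
_MODULE_NAME = re.compile(r"[a-z_][a-z0-9_]*")


def follows_conventions(package_name: str, module_name: str) -> bool:
    """Return True when both names match the naming conventions."""
    return bool(_PACKAGE_NAME.fullmatch(package_name)) and bool(
        _MODULE_NAME.fullmatch(module_name)
    )


print(follows_conventions("docling-plugin-aws-textract", "docling_plugin_aws_textract"))
```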

Error Handling

class MyOcrModel(BaseOcrModel):
    def __call__(self, conv_res, page_batch):
        for page in page_batch:
            try:
                # Your OCR logic
                result = self.client.recognize(page.image)
                # Process result
            except OCRServiceError as e:
                self._log.error(f"OCR failed for page {page.page_no}: {e}")
                # Handle gracefully - don't break pipeline
                page.ocr_cells = []  # Empty results
            
            yield page

Configuration Validation

Use Pydantic’s validation features:

from docling.datamodel.pipeline_options import OcrOptions
from pydantic import Field, field_validator

class MyOcrOptions(OcrOptions):
    api_key: str = Field(min_length=10)
    
    @field_validator('api_key')
    @classmethod
    def validate_api_key(cls, v):
        # Reject keys that do not carry the expected prefix
        if not v.startswith('ak_'):
            raise ValueError('API key must start with ak_')
        return v

Resource Management

Clean up resources properly:

class MyOcrModel(BaseOcrModel):
    def __init__(self, options):
        self.options = options
        self.client = None
    
    def _initialize_client(self):
        if self.client is None:
            self.client = OCRClient()
    
    def __del__(self):
        if self.client:
            self.client.close()
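
Python does not guarantee when, or whether, `__del__` runs, so an explicit `close()` combined with a context manager is a common complement to the pattern above. The sketch below uses a stand-in client and runs without Docling; `FakeClient` and `ManagedOcrResources` are hypothetical names.

```python
class FakeClient:
    """Stand-in for a real OCR client; it only records whether close() ran."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


class ManagedOcrResources:
    """Owns a client and releases it deterministically."""

    def __init__(self, client_factory=FakeClient):
        self._client_factory = client_factory
        self.client = None

    def __enter__(self):
        self.client = self._client_factory()
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()
        return False  # do not swallow exceptions raised inside the block

    def close(self):
        # Safe to call more than once.
        if self.client is not None:
            self.client.close()
            self.client = None


with ManagedOcrResources() as resources:
    client = resources.client  # use the client here

print(client.closed)
```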

Testing

Provide comprehensive tests:

# tests/test_my_ocr.py
import pytest
from docling_plugin_example import MyOcrModel, MyOcrOptions

def test_ocr_initialization():
    options = MyOcrOptions(
        lang=["eng"],
        api_key="test_key"
    )
    model = MyOcrModel(options)
    assert model.options.api_key == "test_key"

def test_ocr_processing():
    # Test with sample document
    pass
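
The processing test can be filled in without a live OCR service by substituting fakes. The sketch below tests the page loop in isolation. `FakePage`, `FakeClient`, and `run_ocr` are stand-ins written for this example and simplify the real Docling page and model types.

```python
from dataclasses import dataclass, field


@dataclass
class FakePage:
    page_no: int
    image: str = "pixels"
    ocr_cells: list = field(default_factory=list)


class FakeClient:
    def recognize(self, image):
        # A real client would return recognized words for the image.
        return ["hello", "world"]


def run_ocr(client, page_batch):
    """Same shape as the plugin's __call__: enrich each page, then yield it."""
    for page in page_batch:
        page.ocr_cells.extend(client.recognize(page.image))
        yield page


def test_every_page_is_yielded_with_cells():
    pages = [FakePage(page_no=1), FakePage(page_no=2)]
    processed = list(run_ocr(FakeClient(), pages))
    assert [p.page_no for p in processed] == [1, 2]
    assert all(p.ocr_cells == ["hello", "world"] for p in processed)
```

The same fakes can be injected into a real model class if its client is set through the constructor or replaced with `unittest.mock`.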

Documentation

Document your plugin thoroughly:

README with installation instructions
Configuration options reference
Usage examples
Supported features and limitations
Performance characteristics

Security Considerations

External plugins run with full application privileges. Only install plugins from trusted sources.

Validate all user inputs in your plugin
Don’t store API keys or secrets in code - use configuration
Be transparent about data transmission to external services
Follow secure coding practices
Keep dependencies up to date
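
One common way to keep secrets out of code is to read them from the environment at startup. This is a minimal sketch; the variable name `MY_OCR_API_KEY` and the helper `load_api_key` are assumptions made for illustration.

```python
import os


def load_api_key(env_var: str = "MY_OCR_API_KEY") -> str:
    """Read the API key from the environment and fail early if it is missing."""
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        raise RuntimeError(f"Set the {env_var} environment variable before running OCR")
    return api_key
```

The returned value can then be passed to the options class, for example `MyOcrOptions(api_key=load_api_key())`.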

Publishing Your Plugin

PyPI Distribution

# Build distribution
pip install build
python -m build

# Upload to PyPI
pip install twine
twine upload dist/*

Documentation

Create a comprehensive README:

# Docling Plugin: Example OCR

Custom OCR engine plugin for Docling.

## Installation

```bash
pip install docling-plugin-example
```

## Usage

from docling_plugin_example import MyOcrOptions
from docling.datamodel.pipeline_options import PdfPipelineOptions

pipeline_options = PdfPipelineOptions()
pipeline_options.allow_external_plugins = True
pipeline_options.ocr_options = MyOcrOptions(
    api_key="your-key"
)

## Configuration

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| api_key | str | Required | API key |
| region | str | us-east-1 | Service region |

## License

MIT

Plugin Examples

Official Docling plugins:
- [docling-core](https://github.com/docling-project/docling-core) - Core data structures
- [docling-ibm-models](https://github.com/docling-project/docling-ibm-models) - IBM's layout models

Community plugins:
- Check [GitHub topics](https://github.com/topics/docling-plugin) for community-developed plugins

Related Resources

Model Catalog (/advanced/model-catalog): Available models and engines
Enrichments (/advanced/enrichments): Document enrichment features
Pipeline Options (/api/options/pipeline-options): Configure processing pipelines
Base Models (https://github.com/docling-project/docling/blob/main/docling/models/): Base model implementations
