
Overview

Docling’s plugin system enables third-party developers to extend Docling’s capabilities by registering custom:
  • OCR Engines: Add support for new optical character recognition engines
  • Layout Models: Integrate custom document layout detection models
  • Table Structure Models: Provide alternative table extraction engines
  • Picture Description Models: Add custom vision-language models
Plugins are loaded via the pluggy framework using setuptools entry points. Source: ~/workspace/source/docs/concepts/plugins.md:1

Plugin Architecture

Entry Point Registration

Plugins register themselves through setuptools entry points under the "docling" group.
[project.entry-points."docling"]
your_plugin_name = "your_package.module"
Source: ~/workspace/source/docs/concepts/plugins.md:5
  • your_plugin_name: Unique name for your plugin in the Docling ecosystem
  • your_package.module: Python module that registers the plugin factories
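To check which entry points are registered under the "docling" group in the current environment, the standard library's importlib.metadata can be queried directly. This is a minimal sketch: the helper name list_docling_entry_points is illustrative and not part of Docling.

```python
from importlib.metadata import entry_points


def list_docling_entry_points(group: str = "docling") -> list[tuple[str, str]]:
    """Return sorted (name, target) pairs for every entry point in `group`."""
    eps = entry_points()
    # Python 3.10+ exposes .select(); older versions return a plain dict.
    selected = eps.select(group=group) if hasattr(eps, "select") else eps.get(group, [])
    return sorted((ep.name, ep.value) for ep in selected)


if __name__ == "__main__":
    for name, target in list_docling_entry_points():
        print(f"{name} -> {target}")
```

An installed plugin should show up with the name and module path from its pyproject.toml. An empty result means no package in the environment declares that group.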

Plugin Discovery

Docling automatically discovers installed plugins at runtime through the pluggy system. Third-party plugins must be explicitly enabled:
from docling.datamodel.pipeline_options import PdfPipelineOptions

pipeline_options = PdfPipelineOptions()
pipeline_options.allow_external_plugins = True  # Enable third-party plugins

Plugin Factories

OCR Factory

The OCR factory registers custom OCR engines. Source: ~/workspace/source/docs/concepts/plugins.md:47

Implementation

# your_package/module.py

from docling.models.base_ocr_model import BaseOcrModel
from docling.datamodel.pipeline_options import OcrOptions
from pydantic import Field
from typing import ClassVar, Iterable
import logging

# Define your OCR options
class YourOcrOptions(OcrOptions):
    kind: ClassVar[str] = "your_ocr_engine"
    
    # Add custom configuration fields
    confidence_threshold: float = Field(
        default=0.8,
        description="Minimum confidence score for OCR results"
    )
    language_model: str = Field(
        default="en",
        description="Language model to use"
    )

# Implement your OCR model
class YourOcrModel(BaseOcrModel):
    def __init__(self, options: YourOcrOptions):
        self.options = options
        self.engine = self._initialize_engine()
        self._log = logging.getLogger(__name__)
    
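    def _initialize_engine(self):
        # Initialize your OCR engine here. YourOCREngine is a placeholder
        # for your engine's own client class; it is not provided by Docling.
        return YourOCREngine()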
    
    @classmethod
    def get_options_type(cls):
        return YourOcrOptions
    
    def __call__(self, conv_res, page_batch: Iterable):
        """Process a batch of pages and return OCR results."""
        for page in page_batch:
            # Your OCR logic here
            ocr_result = self.engine.recognize(page.image)
            
            # Add OCR results to page
            page.ocr_cells.extend(ocr_result.cells)
            
            yield page

# Factory registration
def ocr_engines():
    return {
        "ocr_engines": [
            YourOcrModel,
        ]
    }
Source: ~/workspace/source/docs/concepts/plugins.md:54

Base Class Requirements

Your OCR model must:
  • Inherit from BaseOcrModel
  • Provide an options class derived from OcrOptions
  • Implement get_options_type() class method
  • Implement __call__(conv_res, page_batch) for batch processing
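Part of the checklist above can be verified mechanically before publishing. The sketch below inspects what a factory function returns and reports classes that lack get_options_type() or __call__(). The helper check_factory and the stand-in classes are hypothetical, and the check does not verify inheritance from BaseOcrModel, so it runs without Docling installed.

```python
from typing import Callable


def check_factory(factory: Callable[[], dict], expected_key: str) -> list[str]:
    """Return human-readable problems found in a factory's return value."""
    result = factory()
    if not isinstance(result, dict) or expected_key not in result:
        return [f"factory must return a dict containing {expected_key!r}"]

    problems = []
    for cls in result[expected_key]:
        if not isinstance(cls, type):
            problems.append(f"{cls!r} is not a class")
            continue
        if not callable(getattr(cls, "get_options_type", None)):
            problems.append(f"{cls.__name__} lacks get_options_type()")
        # Look for __call__ on the class hierarchy, not on the metaclass.
        if not any("__call__" in vars(base) for base in cls.__mro__):
            problems.append(f"{cls.__name__} lacks __call__()")
    return problems


# Stand-in classes (assumptions for illustration only).
class GoodModel:
    @classmethod
    def get_options_type(cls):
        return dict

    def __call__(self, conv_res, page_batch):
        yield from page_batch


class IncompleteModel:
    pass


def ocr_engines():
    return {"ocr_engines": [GoodModel, IncompleteModel]}


if __name__ == "__main__":
    for problem in check_factory(ocr_engines, "ocr_engines"):
        print(problem)
```

A check like this could run in a plugin's test suite so that a missing method is caught before Docling tries to load the factory.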

Reference Implementation

See the default Docling plugins for examples. Source: ~/workspace/source/docling/models/plugins/defaults.py:1
# Docling's built-in OCR engines
def ocr_engines():
    from docling.models.stages.ocr.auto_ocr_model import OcrAutoModel
    from docling.models.stages.ocr.easyocr_model import EasyOcrModel
    from docling.models.stages.ocr.ocr_mac_model import OcrMacModel
    from docling.models.stages.ocr.rapid_ocr_model import RapidOcrModel
    from docling.models.stages.ocr.tesseract_ocr_cli_model import TesseractOcrCliModel
    from docling.models.stages.ocr.tesseract_ocr_model import TesseractOcrModel

    return {
        "ocr_engines": [
            OcrAutoModel,
            EasyOcrModel,
            OcrMacModel,
            RapidOcrModel,
            TesseractOcrModel,
            TesseractOcrCliModel,
        ]
    }

Layout Engine Factory

Register custom layout detection models. Source: ~/workspace/source/docling/models/plugins/defaults.py:41
def layout_engines():
    from docling.models.stages.layout.layout_model import LayoutModel
    from docling.models.stages.layout.layout_object_detection_model import (
        LayoutObjectDetectionModel,
    )

    return {
        "layout_engines": [
            LayoutObjectDetectionModel,
            LayoutModel,
        ]
    }

Table Structure Factory

Register custom table extraction engines. Source: ~/workspace/source/docling/models/plugins/defaults.py:59
def table_structure_engines():
    from docling.models.stages.table_structure.table_structure_model import (
        TableStructureModel,
    )

    return {
        "table_structure_engines": [
            TableStructureModel,
        ]
    }

Picture Description Factory

Register custom vision-language models for image captioning. Source: ~/workspace/source/docling/models/plugins/defaults.py:21
def picture_description():
    from docling.models.stages.picture_description.picture_description_api_model import (
        PictureDescriptionApiModel,
    )
    from docling.models.stages.picture_description.picture_description_vlm_engine_model import (
        PictureDescriptionVlmEngineModel,
    )
    from docling.models.stages.picture_description.picture_description_vlm_model import (
        PictureDescriptionVlmModel,
    )

    return {
        "picture_description": [
            PictureDescriptionVlmEngineModel,  # New engine-based (preferred)
            PictureDescriptionVlmModel,  # Legacy direct transformers
            PictureDescriptionApiModel,  # API-based
        ]
    }

Using Plugins

Python API

Enable and configure external plugins:
from docling.datamodel.base_models import InputFormat
from docling.datamodel.pipeline_options import PdfPipelineOptions
from docling.document_converter import DocumentConverter, PdfFormatOption

# Import your plugin's options class
from your_package.module import YourOcrOptions

pipeline_options = PdfPipelineOptions()

# Enable external plugins
pipeline_options.allow_external_plugins = True

# Configure your plugin
pipeline_options.ocr_options = YourOcrOptions(
    lang=["eng", "spa"],
    confidence_threshold=0.9
)

converter = DocumentConverter(
    format_options={
        InputFormat.PDF: PdfFormatOption(
            pipeline_options=pipeline_options
        )
    }
)

result = converter.convert("document.pdf")
Source: ~/workspace/source/docs/concepts/plugins.md:72

Command Line Interface

When using the Docling CLI, enable external plugins before selecting them:
# List available external plugins
docling --show-external-plugins

# Run with external OCR plugin
docling --allow-external-plugins --ocr-engine=your_ocr_engine document.pdf
Source: ~/workspace/source/docs/concepts/plugins.md:92
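When the CLI is driven from a script, building the argument list in Python avoids shell-quoting mistakes. The sketch below only assembles the command shown above. The helper build_docling_command is hypothetical, and running the result requires the docling executable on PATH.

```python
import shlex


def build_docling_command(source: str, ocr_engine: str, allow_external: bool = True) -> list[str]:
    """Assemble the argument list for a docling run with a chosen OCR engine."""
    args = ["docling"]
    if allow_external:
        # External plugins are only selectable once this flag is present.
        args.append("--allow-external-plugins")
    args.append(f"--ocr-engine={ocr_engine}")
    args.append(source)
    return args


if __name__ == "__main__":
    cmd = build_docling_command("document.pdf", "your_ocr_engine")
    print(shlex.join(cmd))
    # To execute it: subprocess.run(cmd, check=True)
```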

Plugin Development Guide

Step 1: Create Plugin Package

mkdir docling-plugin-example
cd docling-plugin-example
Create package structure:
docling-plugin-example/
├── pyproject.toml
├── README.md
└── docling_plugin_example/
    ├── __init__.py
    ├── my_ocr_model.py
    └── my_ocr_options.py
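The layout in the tree above can also be created programmatically. This is a small sketch using only pathlib. The function name scaffold_plugin is illustrative, and the files it creates are empty placeholders to be filled in during the following steps.

```python
import tempfile
from pathlib import Path

# File layout from the tree above, relative to the project root.
PLUGIN_FILES = [
    "pyproject.toml",
    "README.md",
    "docling_plugin_example/__init__.py",
    "docling_plugin_example/my_ocr_model.py",
    "docling_plugin_example/my_ocr_options.py",
]


def scaffold_plugin(root: Path) -> list[Path]:
    """Create empty placeholder files for the layout; existing files are kept."""
    created = []
    for relative in PLUGIN_FILES:
        path = root / relative
        path.parent.mkdir(parents=True, exist_ok=True)
        if not path.exists():
            path.touch()
            created.append(path)
    return created


if __name__ == "__main__":
    # Demo in a temporary directory so nothing is left behind.
    with tempfile.TemporaryDirectory() as tmp:
        for path in scaffold_plugin(Path(tmp) / "docling-plugin-example"):
            print(path.relative_to(tmp))
```

Calling scaffold_plugin(Path("docling-plugin-example")) from the parent directory would produce the real project skeleton. Running it a second time creates nothing and overwrites nothing.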

Step 2: Implement Plugin

# docling_plugin_example/my_ocr_options.py
from docling.datamodel.pipeline_options import OcrOptions
from pydantic import Field
from typing import ClassVar

class MyOcrOptions(OcrOptions):
    kind: ClassVar[str] = "my_ocr"
    
    api_key: str = Field(
        description="API key for OCR service"
    )
    region: str = Field(
        default="us-east-1",
        description="Service region"
    )
# docling_plugin_example/my_ocr_model.py
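from docling.models.base_ocr_model import BaseOcrModel
from .my_ocr_options import MyOcrOptions
from typing import Iterable
import logging

# ExternalOCRService and OcrCell are placeholders in this example: import the
# client class from your OCR service's SDK and the OCR cell type from the
# Docling version you target.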

class MyOcrModel(BaseOcrModel):
    def __init__(self, options: MyOcrOptions):
        self.options = options
        self._log = logging.getLogger(__name__)
        self._initialize_client()
    
    def _initialize_client(self):
        # Initialize your OCR client/engine
        self.client = ExternalOCRService(
            api_key=self.options.api_key,
            region=self.options.region
        )
    
    @classmethod
    def get_options_type(cls):
        return MyOcrOptions
    
    def __call__(self, conv_res, page_batch: Iterable):
        for page in page_batch:
            # Convert page to image
            image = page.get_image()
            
            # Call your OCR service
            result = self.client.recognize(
                image,
                languages=self.options.lang
            )
            
            # Convert result to Docling format
            for word in result.words:
                ocr_cell = OcrCell(
                    text=word.text,
                    confidence=word.confidence,
                    bbox=word.bounding_box
                )
                page.ocr_cells.append(ocr_cell)
            
            yield page
# docling_plugin_example/__init__.py
from .my_ocr_model import MyOcrModel
from .my_ocr_options import MyOcrOptions

def ocr_engines():
    """Plugin factory for OCR engines."""
    return {
        "ocr_engines": [
            MyOcrModel,
        ]
    }

__all__ = ["MyOcrModel", "MyOcrOptions", "ocr_engines"]

Step 3: Configure Entry Point

# pyproject.toml
[project]
name = "docling-plugin-example"
version = "0.1.0"
description = "Example OCR plugin for Docling"
requires-python = ">=3.10"
dependencies = [
    "docling>=2.0.0",
    # Your dependencies here
]

[project.entry-points."docling"]
my_ocr_plugin = "docling_plugin_example"

[build-system]
requires = ["setuptools>=61.0"]
build-backend = "setuptools.build_meta"

Step 4: Install and Test

# Install in development mode
pip install -e .

# Test plugin discovery
python -c "import pluggy; pm = pluggy.PluginManager('docling'); pm.load_setuptools_entrypoints('docling'); print(pm.list_name_plugin())"

Step 5: Use Your Plugin

from docling.document_converter import DocumentConverter, PdfFormatOption
from docling.datamodel.pipeline_options import PdfPipelineOptions
from docling.datamodel.base_models import InputFormat
from docling_plugin_example import MyOcrOptions

pipeline_options = PdfPipelineOptions()
pipeline_options.allow_external_plugins = True
pipeline_options.ocr_options = MyOcrOptions(
    lang=["eng"],
    api_key="your-api-key",
    region="us-west-2"
)

converter = DocumentConverter(
    format_options={
        InputFormat.PDF: PdfFormatOption(
            pipeline_options=pipeline_options
        )
    }
)

result = converter.convert("document.pdf")

Best Practices

  • Use descriptive, unique plugin names (e.g., docling-plugin-aws-textract)
  • Prefix your package with docling-plugin- for discoverability
  • Use lowercase with hyphens for package names
  • Use snake_case for Python modules
Handle errors gracefully so that one failing page does not break the pipeline:
class MyOcrModel(BaseOcrModel):
    def __call__(self, conv_res, page_batch):
        for page in page_batch:
            try:
                # Your OCR logic
                result = self.client.recognize(page.image)
                # Process result
            except OCRServiceError as e:
                self._log.error(f"OCR failed for page {page.page_no}: {e}")
                # Handle gracefully - don't break pipeline
                page.ocr_cells = []  # Empty results
            
            yield page
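The same pattern can be exercised in isolation. The sketch below is a self-contained version with stand-ins: FakePage, FlakyClient, and OCRServiceError are hypothetical and only mimic the attributes used in the snippet above, so it runs without Docling or a real OCR service.

```python
import logging
from dataclasses import dataclass, field
from typing import Iterable, Iterator

log = logging.getLogger(__name__)


class OCRServiceError(Exception):
    """Hypothetical error raised by the OCR client."""


@dataclass
class FakePage:
    """Stand-in for a page object; only the fields used here."""
    page_no: int
    image: str
    ocr_cells: list = field(default_factory=list)


class FlakyClient:
    """Hypothetical client that fails on images named 'corrupt'."""

    def recognize(self, image: str) -> list[str]:
        if image == "corrupt":
            raise OCRServiceError("unreadable image")
        return [f"text from {image}"]


def ocr_pages(client: FlakyClient, page_batch: Iterable[FakePage]) -> Iterator[FakePage]:
    """Yield every page, leaving ocr_cells empty when recognition fails."""
    for page in page_batch:
        try:
            page.ocr_cells = client.recognize(page.image)
        except OCRServiceError as exc:
            log.error("OCR failed for page %s: %s", page.page_no, exc)
            page.ocr_cells = []
        # The page is yielded in both cases so later stages still see it.
        yield page


if __name__ == "__main__":
    pages = [FakePage(1, "scan-1"), FakePage(2, "corrupt"), FakePage(3, "scan-3")]
    for page in ocr_pages(FlakyClient(), pages):
        print(page.page_no, page.ocr_cells)
```

Keeping the yield outside the try block is the important detail: a failed page still flows downstream, just without OCR cells.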
Use Pydantic’s validation features:
from docling.datamodel.pipeline_options import OcrOptions
from pydantic import Field, field_validator

class MyOcrOptions(OcrOptions):
    api_key: str = Field(min_length=10)
    
    @field_validator('api_key')
    @classmethod
    def validate_api_key(cls, v):
        if not v.startswith('ak_'):
            raise ValueError('API key must start with ak_')
        return v
Clean up resources properly:
class MyOcrModel(BaseOcrModel):
    def __init__(self, options):
        self.options = options
        self.client = None
    
    def _initialize_client(self):
        if self.client is None:
            self.client = OCRClient()
    
    def __del__(self):
        if self.client:
            self.client.close()
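Python does not guarantee when __del__ runs, or that it runs at all for objects still alive at interpreter exit, so an explicit close() combined with a context manager can complement the pattern above. This is a sketch under assumptions: OCRClient is a hypothetical client and ManagedOcrResource is an illustrative wrapper, not a Docling class.

```python
from typing import Optional


class OCRClient:
    """Hypothetical client holding a connection that must be released."""

    def __init__(self) -> None:
        self.closed = False

    def close(self) -> None:
        self.closed = True


class ManagedOcrResource:
    """Creates the client lazily and releases it deterministically."""

    def __init__(self) -> None:
        self.client: Optional[OCRClient] = None

    def get_client(self) -> OCRClient:
        # Lazy initialization: the client is built on first use only.
        if self.client is None:
            self.client = OCRClient()
        return self.client

    def close(self) -> None:
        # Safe to call more than once.
        if self.client is not None:
            self.client.close()
            self.client = None

    def __enter__(self) -> "ManagedOcrResource":
        return self

    def __exit__(self, exc_type, exc, tb) -> None:
        self.close()


if __name__ == "__main__":
    with ManagedOcrResource() as resource:
        client = resource.get_client()
        print("closed inside block:", client.closed)
    print("closed after block:", client.closed)
```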
Provide comprehensive tests:
# tests/test_my_ocr.py
import pytest
from docling_plugin_example import MyOcrModel, MyOcrOptions

def test_ocr_initialization():
    options = MyOcrOptions(
        lang=["eng"],
        api_key="test_key"
    )
    model = MyOcrModel(options)
    assert model.options.api_key == "test_key"

def test_ocr_processing():
    # Test with sample document
    pass
Document your plugin thoroughly:
  • README with installation instructions
  • Configuration options reference
  • Usage examples
  • Supported features and limitations
  • Performance characteristics

Security Considerations

External plugins run with full application privileges. Only install plugins from trusted sources.
  • Validate all user inputs in your plugin
  • Don’t store API keys or secrets in code - use configuration
  • Be transparent about data transmission to external services
  • Follow secure coding practices
  • Keep dependencies up to date
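One common way to keep secrets out of code is to read them from the environment at startup. The sketch below assumes an environment variable named MY_OCR_API_KEY. Both that name and the helper load_api_key are hypothetical.

```python
import os


def load_api_key(env_var: str = "MY_OCR_API_KEY") -> str:
    """Read the API key from the environment, failing loudly when it is absent."""
    value = os.environ.get(env_var, "").strip()
    if not value:
        raise RuntimeError(
            f"Set the {env_var} environment variable before running the OCR plugin"
        )
    return value
```

The returned value can then be passed to the options class, for example MyOcrOptions(lang=["eng"], api_key=load_api_key()), so the key never appears in source control.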

Publishing Your Plugin

PyPI Distribution

# Build distribution
pip install build
python -m build

# Upload to PyPI
pip install twine
twine upload dist/*

Documentation

Create a comprehensive README:
# Docling Plugin: Example OCR

Custom OCR engine plugin for Docling.

## Installation

```bash
pip install docling-plugin-example
```

## Usage

```python
from docling_plugin_example import MyOcrOptions
from docling.datamodel.pipeline_options import PdfPipelineOptions

pipeline_options = PdfPipelineOptions()
pipeline_options.allow_external_plugins = True
pipeline_options.ocr_options = MyOcrOptions(
    lang=["eng"],
    api_key="your-key"
)
```

## Configuration

| Option | Type | Default | Description |
|--------|------|---------|-------------|
| api_key | str | Required | API key |
| region | str | us-east-1 | Service region |

## License

MIT

Plugin Examples

Official Docling plugins:
  • docling-core (https://github.com/docling-project/docling-core): Core data structures
  • docling-ibm-models (https://github.com/docling-project/docling-ibm-models): IBM's layout models

Community plugins:
  • Check GitHub topics (https://github.com/topics/docling-plugin) for community-developed plugins

Related Resources

  • Model Catalog (/advanced/model-catalog): Available models and engines
  • Enrichments (/advanced/enrichments): Document enrichment features
  • Pipeline Options (/api/options/pipeline-options): Configure processing pipelines
  • Base Models (https://github.com/docling-project/docling/blob/main/docling/models/): Base model implementations
