MarkItDown’s plugin system allows you to extend its capabilities with custom document converters for file formats not supported by default.

Using Plugins

Enabling Plugins

Plugins are disabled by default. Enable them explicitly with --use-plugins or its short form, -p:
markitdown --use-plugins file.rtf
markitdown -p file.rtf
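The same switch is available from Python. The following is a minimal sketch that assumes markitdown is installed; convert_with_plugins is a hypothetical helper name, while enable_plugins=True is the constructor flag used later on this page.

```python
def convert_with_plugins(path: str) -> str:
    """Convert a file to Markdown with third-party plugins enabled."""
    # Imported lazily so that defining this helper does not require markitdown.
    from markitdown import MarkItDown

    md = MarkItDown(enable_plugins=True)  # Python counterpart of --use-plugins
    return md.convert(path).markdown
```

Calling convert_with_plugins("file.rtf") should then behave like the command-line invocations above.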

Listing Installed Plugins

Check which plugins are installed:
markitdown --list-plugins
Output:
Installed MarkItDown 3rd-party Plugins:

  * sample_plugin    	(package: markitdown_sample_plugin)

Use the -p (or --use-plugins) option to enable 3rd-party plugins.
If no plugins are installed:
Installed MarkItDown 3rd-party Plugins:

  * No 3rd-party plugins installed.

Find plugins by searching for the hashtag #markitdown-plugin on GitHub.

Finding Plugins

Discover available plugins:
1. Search GitHub

Look for repositories tagged with the #markitdown-plugin topic.

2. Check PyPI

Search pypi.org for packages whose names start with markitdown-. Use the website rather than the pip search command, which no longer works against PyPI because the index has disabled its search API.

3. Community Resources

Check the MarkItDown repository for plugin recommendations.

Installing Plugins

Plugins are installed as Python packages:
# From PyPI
pip install markitdown-sample-plugin

# From GitHub
pip install git+https://github.com/user/markitdown-plugin-name.git

# From local directory
pip install -e /path/to/plugin
Verify installation:
markitdown --list-plugins

Creating Plugins

Plugin Structure

A MarkItDown plugin is a Python package that implements a specific interface:
1. Create a DocumentConverter

converter.py
from typing import BinaryIO, Any
from markitdown import DocumentConverter, DocumentConverterResult, StreamInfo

class RtfConverter(DocumentConverter):
    def accepts(
        self,
        file_stream: BinaryIO,
        stream_info: StreamInfo,
        **kwargs: Any,
    ) -> bool:
        """Check if this converter can handle the file."""
        extension = (stream_info.extension or "").lower()
        mimetype = (stream_info.mimetype or "").lower()
        
        if extension == ".rtf":
            return True
        # RTF is served as either text/rtf or application/rtf
        if mimetype.startswith(("text/rtf", "application/rtf")):
            return True
        
        return False

    def convert(
        self,
        file_stream: BinaryIO,
        stream_info: StreamInfo,
        **kwargs: Any,
    ) -> DocumentConverterResult:
        """Convert the file to Markdown."""
        # Read the RTF content
        content = file_stream.read()
        
        # Convert to Markdown (simplified example)
        markdown = self._rtf_to_markdown(content)
        
        return DocumentConverterResult(
            markdown=markdown,
            title="RTF Document"
        )
    
    def _rtf_to_markdown(self, content: bytes) -> str:
        # Implement RTF parsing logic
        from striprtf.striprtf import rtf_to_text
        text = rtf_to_text(content.decode('utf-8'))
        return text

2. Create Plugin Interface

__init__.py
from .converter import RtfConverter
from markitdown import MarkItDown

# Plugin interface version
__plugin_interface_version__ = 1

def register_converters(markitdown: MarkItDown, **kwargs):
    """Register converters with MarkItDown instance."""
    markitdown.register_converter(RtfConverter())

3. Configure Entry Point

pyproject.toml
[project]
name = "markitdown-rtf-plugin"
version = "0.1.0"
dependencies = [
    "markitdown>=0.1.0",
    "striprtf",
]

[project.entry-points."markitdown.plugin"]
rtf_plugin = "markitdown_rtf_plugin"

Entry Point Configuration

MarkItDown discovers plugins through this entry point. Without it, an installed package is never listed or loaded:
[project.entry-points."markitdown.plugin"]
plugin_name = "package_name"
  • Entry point group: Must be "markitdown.plugin"
  • Plugin name: Any unique identifier (e.g., rtf_plugin)
  • Package name: The fully qualified package name (e.g., markitdown_rtf_plugin)
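To see how Python reads such a declaration, the standard library's importlib.metadata.EntryPoint can be constructed by hand. This sketch reuses the rtf_plugin example values from above; nothing in it is MarkItDown-specific.

```python
from importlib.metadata import EntryPoint

# Mirrors: rtf_plugin = "markitdown_rtf_plugin"
# under [project.entry-points."markitdown.plugin"]
ep = EntryPoint(
    name="rtf_plugin",
    value="markitdown_rtf_plugin",
    group="markitdown.plugin",
)

# The value names a module only (no ":attribute" suffix), so loading the
# entry point imports the package itself.
module_name = ep.module
attribute = ep.attr
print(module_name, attribute)
```

Because the value has no ":function" part, the entry point resolves to the module object, which is why register_converters and __plugin_interface_version__ are defined at package level in __init__.py.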

Plugin Interface Version

Your plugin must export the interface version:
__plugin_interface_version__ = 1
Currently, only version 1 is supported.

Registration Function

Implement the register_converters function:
def register_converters(markitdown: MarkItDown, **kwargs):
    """
    Called when MarkItDown instances are created with plugins enabled.
    
    Args:
        markitdown: The MarkItDown instance to register converters with
        **kwargs: Additional arguments passed to MarkItDown constructor
    """
    # Register one or more converters
    markitdown.register_converter(MyConverter())
    markitdown.register_converter(AnotherConverter())
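
A quick way to confirm that a package exposes both items this page requires is to inspect the module object. The sketch below is an assumption about how such a check could look, not MarkItDown's own loader code; validate_plugin_module and SUPPORTED_VERSION are hypothetical names.

```python
from types import ModuleType

SUPPORTED_VERSION = 1  # the only interface version this page lists as supported


def validate_plugin_module(module: ModuleType) -> list[str]:
    """Return a list of problems; an empty list means the module looks valid."""
    problems = []
    version = getattr(module, "__plugin_interface_version__", None)
    if version is None:
        problems.append("missing __plugin_interface_version__")
    elif version != SUPPORTED_VERSION:
        problems.append(f"unsupported interface version: {version!r}")
    if not callable(getattr(module, "register_converters", None)):
        problems.append("missing register_converters function")
    return problems
```

In a plugin's own test suite this could be called as validate_plugin_module(markitdown_rtf_plugin) after importing the package.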

Advanced Plugin Development

Converter Priority

Control when your converter is tried:
from markitdown import MarkItDown, PRIORITY_SPECIFIC_FILE_FORMAT, PRIORITY_GENERIC_FILE_FORMAT

def register_converters(markitdown: MarkItDown, **kwargs):
    # High priority (tried first) - for specific file types
    markitdown.register_converter(
        RtfConverter(),
        priority=PRIORITY_SPECIFIC_FILE_FORMAT  # 0.0
    )
    
    # Lower priority (tried later) - for generic file types
    markitdown.register_converter(
        GenericTextConverter(),
        priority=PRIORITY_GENERIC_FILE_FORMAT  # 10.0
    )
Lower priority values are tried first. Built-in converters use 0.0 for specific formats and 10.0 for generic formats.
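
The ordering rule can be illustrated with a plain sorted list. This is a simplified model, not MarkItDown's implementation: Registration and resolution_order are hypothetical names, and the two constants are redefined locally with the values quoted above so that the sketch runs without markitdown installed.

```python
from dataclasses import dataclass

PRIORITY_SPECIFIC_FILE_FORMAT = 0.0   # value quoted in the text above
PRIORITY_GENERIC_FILE_FORMAT = 10.0   # value quoted in the text above


@dataclass
class Registration:
    name: str
    priority: float


def resolution_order(registrations: list[Registration]) -> list[str]:
    """Return converter names in the order they would be tried."""
    # sorted() is stable: entries with equal priority keep their input order.
    return [r.name for r in sorted(registrations, key=lambda r: r.priority)]


order = resolution_order([
    Registration("generic_text", PRIORITY_GENERIC_FILE_FORMAT),
    Registration("rtf", PRIORITY_SPECIFIC_FILE_FORMAT),
    Registration("early_override", -1.0),
])
print(order)  # ['early_override', 'rtf', 'generic_text']
```

In this model, a priority below 0.0 places a converter ahead of everything registered at the specific-format level.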

Accessing File Content

The file_stream is seekable, so accepts() can inspect the first bytes of the file:
def accepts(self, file_stream: BinaryIO, stream_info: StreamInfo, **kwargs) -> bool:
    # Save position
    cur_pos = file_stream.tell()
    
    # Read header to check file type
    header = file_stream.read(100)
    
    # IMPORTANT: Reset position
    file_stream.seek(cur_pos)
    
    return header.startswith(b'{\\rtf')
Always reset the file stream position after reading in accepts(). The convert() method expects the stream to be at the original position.
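
One way to make the reset hard to forget is to wrap it in a context manager, so the position is restored even if the sniffing code raises. This is a sketch using only the standard library; preserved_position and looks_like_rtf are hypothetical helper names, not part of MarkItDown.

```python
import io
from contextlib import contextmanager
from typing import BinaryIO, Iterator


@contextmanager
def preserved_position(stream: BinaryIO) -> Iterator[BinaryIO]:
    """Restore the stream position on exit, even if the body raises."""
    start = stream.tell()
    try:
        yield stream
    finally:
        stream.seek(start)


def looks_like_rtf(stream: BinaryIO) -> bool:
    """Sniff the RTF magic bytes without consuming the stream."""
    with preserved_position(stream) as s:
        return s.read(5) == b"{\\rtf"


stream = io.BytesIO(b"{\\rtf1 Hello World}")
print(looks_like_rtf(stream), stream.tell())  # True 0
```

An accepts() implementation could then return looks_like_rtf(file_stream) directly.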

Using Configuration Options

Access configuration passed to MarkItDown:
def register_converters(markitdown: MarkItDown, **kwargs):
    # Access custom configuration
    custom_setting = kwargs.get('custom_setting', 'default')
    
    markitdown.register_converter(
        MyConverter(setting=custom_setting)
    )
Pass configuration when creating MarkItDown:
md = MarkItDown(
    enable_plugins=True,
    custom_setting='value'
)
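
To check how keyword arguments reach a converter without constructing a real MarkItDown instance, register_converters can be called against a small stand-in. Everything below is a test double written for illustration: RecordingRegistry and this MyConverter are hypothetical, and only the register_converter method name and the custom_setting key come from the example above.

```python
class MyConverter:
    """Placeholder for a real DocumentConverter subclass."""

    def __init__(self, setting: str = "default"):
        self.setting = setting


class RecordingRegistry:
    """Stand-in that records calls to register_converter()."""

    def __init__(self):
        self.registered = []

    def register_converter(self, converter, **kwargs):
        self.registered.append((converter, kwargs))


def register_converters(markitdown, **kwargs):
    # Same logic as the snippet above: read an optional setting, then register.
    custom_setting = kwargs.get("custom_setting", "default")
    markitdown.register_converter(MyConverter(setting=custom_setting))


registry = RecordingRegistry()
register_converters(registry, custom_setting="value")
print(registry.registered[0][0].setting)  # value
```

Calling register_converters(registry) with no extra arguments leaves the converter on its "default" setting.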

Error Handling

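Handle missing dependencies gracefully by recording the ImportError at import time and re-raising it as a MissingDependencyException with an install hint:
from markitdown import DocumentConverter, MissingDependencyException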
import sys

_dependency_exc_info = None
try:
    import striprtf
except ImportError:
    _dependency_exc_info = sys.exc_info()

class RtfConverter(DocumentConverter):
    def __init__(self):
        if _dependency_exc_info is not None:
            raise MissingDependencyException(
                "RtfConverter requires 'striprtf' to be installed. "
                "Install with: pip install striprtf"
            ) from _dependency_exc_info[1].with_traceback(_dependency_exc_info[2])

Example: Sample Plugin

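The following converter is modeled on the official sample plugin (markitdown-sample-plugin) and combines the techniques above:
# Modeled on markitdown-sample-plugin
from typing import BinaryIO, Any
from markitdown import (
    DocumentConverter,
    DocumentConverterResult,
    MissingDependencyException,
    StreamInfo,
)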
import sys

# Check for dependencies
_dependency_exc_info = None
try:
    from striprtf.striprtf import rtf_to_text
except ImportError:
    _dependency_exc_info = sys.exc_info()

class RtfConverter(DocumentConverter):
    def accepts(self, file_stream: BinaryIO, stream_info: StreamInfo, **kwargs: Any) -> bool:
        extension = (stream_info.extension or "").lower()
        
        if extension == ".rtf":
            return True
        
        # Check file magic
        cur_pos = file_stream.tell()
        header = file_stream.read(100)
        file_stream.seek(cur_pos)
        
        # RTF files begin with the bytes {\rtf (a single backslash)
        return header.startswith(b'{\\rtf')
    
    def convert(self, file_stream: BinaryIO, stream_info: StreamInfo, **kwargs: Any) -> DocumentConverterResult:
        if _dependency_exc_info is not None:
            raise MissingDependencyException(
                "RTF conversion requires 'striprtf'. Install with: pip install striprtf"
            )
        
        content = file_stream.read().decode('utf-8', errors='ignore')
        text = rtf_to_text(content)
        
        return DocumentConverterResult(markdown=text)

# Plugin interface
__plugin_interface_version__ = 1

def register_converters(markitdown, **kwargs):
    markitdown.register_converter(RtfConverter())
Install and use:
pip install markitdown-sample-plugin
markitdown --use-plugins document.rtf

Testing Plugins

Test your plugin:
from markitdown import MarkItDown, StreamInfo
import io

def test_rtf_conversion():
    md = MarkItDown(enable_plugins=True)
    
    # Create test RTF content
    rtf_content = b"{\\rtf1 Hello World}"
    stream = io.BytesIO(rtf_content)
    
    result = md.convert_stream(stream, stream_info=StreamInfo(extension=".rtf"))
    
    assert "Hello World" in result.markdown
    print("✓ Plugin test passed")

if __name__ == "__main__":
    test_rtf_conversion()

Publishing Plugins

1. Package Your Plugin

python -m build

2. Test Locally

pip install dist/markitdown_rtf_plugin-0.1.0-py3-none-any.whl
markitdown --list-plugins

3. Publish to PyPI

python -m twine upload dist/*

4. Tag Repository

Add the #markitdown-plugin topic to your GitHub repository for discoverability.
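
Before uploading, it can be worth confirming that the built wheel actually declares the entry point, since a wheel without it installs cleanly but is never discovered. A wheel is a zip archive whose .dist-info directory holds an INI-style entry_points.txt, so the standard library is enough. This is a sketch; plugin_entry_points is a hypothetical helper name.

```python
import configparser
import io
import zipfile


def plugin_entry_points(wheel) -> dict[str, str]:
    """Return the markitdown.plugin entry points declared in a wheel.

    `wheel` may be a filesystem path or a binary file object.
    """
    with zipfile.ZipFile(wheel) as zf:
        # Wheels store entry points in <name>-<version>.dist-info/entry_points.txt
        names = [n for n in zf.namelist() if n.endswith(".dist-info/entry_points.txt")]
        if not names:
            return {}
        text = zf.read(names[0]).decode("utf-8")
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.optionxform = str  # keep entry point names case-sensitive
    parser.read_string(text)
    if not parser.has_section("markitdown.plugin"):
        return {}
    return dict(parser.items("markitdown.plugin"))
```

For the package built above, plugin_entry_points("dist/markitdown_rtf_plugin-0.1.0-py3-none-any.whl") should include the rtf_plugin entry from pyproject.toml.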

Security Considerations

Plugins are ordinary Python packages: their code runs with your privileges as soon as they are loaded and during every conversion. Only install plugins from trusted sources.
Best practices:
  • Review plugin source code before installation
  • Use virtual environments for testing new plugins
  • Keep plugins updated
  • Report security issues to plugin authors

Troubleshooting

Plugin Not Found

If --list-plugins doesn’t show your plugin:
# Check if package is installed
pip list | grep markitdown

# Verify entry points (the group= keyword requires Python 3.10+)
python -c "from importlib.metadata import entry_points; print(list(entry_points(group='markitdown.plugin')))"

# Reinstall the plugin
pip uninstall markitdown-rtf-plugin
pip install markitdown-rtf-plugin

Plugin Fails to Load

Load each entry point by hand to surface import errors:
import traceback
from importlib.metadata import entry_points

for ep in entry_points(group='markitdown.plugin'):
    try:
        plugin = ep.load()
        print(f"✓ Loaded: {ep.name}")
    except Exception as e:
        print(f"✗ Failed: {ep.name}")
        traceback.print_exc()

Converter Not Called

Ensure accepts() returns True for your file. A temporary print statement shows what MarkItDown passes in:
def accepts(self, file_stream: BinaryIO, stream_info: StreamInfo, **kwargs) -> bool:
    print(f"Checking: {stream_info.extension} / {stream_info.mimetype}")
    return stream_info.extension == ".rtf"
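
A strict equality check like the one above can silently miss files, because the extension may be absent or upper-case. The sketch below applies the same normalization as the converter.py example earlier on this page. matches_rtf is a hypothetical helper, SimpleNamespace stands in for StreamInfo so the snippet runs without markitdown, and accepting application/rtf alongside text/rtf is an assumption about which MIME types to allow.

```python
from types import SimpleNamespace


def matches_rtf(stream_info) -> bool:
    """Case-insensitive, None-safe check on extension and MIME type."""
    extension = (getattr(stream_info, "extension", None) or "").lower()
    mimetype = (getattr(stream_info, "mimetype", None) or "").lower()
    return extension == ".rtf" or mimetype.startswith(("text/rtf", "application/rtf"))


results = [
    matches_rtf(info)
    for info in (
        SimpleNamespace(extension=".RTF", mimetype=None),
        SimpleNamespace(extension=None, mimetype="text/rtf"),
        SimpleNamespace(extension=None, mimetype=None),
    )
]
print(results)  # [True, True, False]
```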
