Plugin Architecture
MarkItDown uses the markitdown.plugin entry point group to discover plugins. When enable_plugins=True, MarkItDown calls each plugin’s register_converters() function during initialization.
Plugins are disabled by default. Users must explicitly enable them with enable_plugins=True or the --use-plugins CLI flag.
Creating a Plugin
markitdown-sample-plugin/
├── src/
│   └── markitdown_sample_plugin/
│       ├── __init__.py
│       ├── __about__.py
│       └── _plugin.py
├── tests/
│   ├── __init__.py
│   ├── test_sample_plugin.py
│   └── test_files/
│       └── test.rtf
├── pyproject.toml
└── README.md
# src/markitdown_sample_plugin/_plugin.py
from typing import Any, BinaryIO

from markitdown import (
    MarkItDown,
    DocumentConverter,
    DocumentConverterResult,
    StreamInfo,
)

# REQUIRED: Plugin interface version
__plugin_interface_version__ = 1


# REQUIRED: Registration function
def register_converters(markitdown: MarkItDown, **kwargs):
    """
    Called during MarkItDown construction to register converters.

    Parameters:
    - markitdown: The MarkItDown instance to register converters with
    - **kwargs: Additional configuration passed from MarkItDown constructor
    """
    markitdown.register_converter(RtfConverter())


class RtfConverter(DocumentConverter):
    """Converts RTF files to Markdown."""

    def accepts(
        self,
        file_stream: BinaryIO,
        stream_info: StreamInfo,
        **kwargs: Any,
    ) -> bool:
        mimetype = (stream_info.mimetype or "").lower()
        extension = (stream_info.extension or "").lower()

        # Accept on a matching file extension or a matching MIME type prefix
        if extension in [".rtf"]:
            return True
        if mimetype.startswith("text/rtf") or mimetype.startswith("application/rtf"):
            return True
        return False

    def convert(
        self,
        file_stream: BinaryIO,
        stream_info: StreamInfo,
        **kwargs: Any,
    ) -> DocumentConverterResult:
        # Imported lazily so the dependency is only needed at conversion time
        from striprtf.striprtf import rtf_to_text
        import locale

        # Decode the file
        encoding = stream_info.charset or locale.getpreferredencoding()
        stream_data = file_stream.read().decode(encoding)

        # Convert to plain text
        markdown = rtf_to_text(stream_data)

        return DocumentConverterResult(
            markdown=markdown,
            title=None,
        )
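The matching rule in accepts() can be tried in isolation. The sketch below is only an illustration: FakeStreamInfo and accepts_rtf are hypothetical stand-ins (not part of MarkItDown) that carry just the fields the converter reads, so the logic runs without the library installed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeStreamInfo:
    """Hypothetical stand-in for StreamInfo with only the fields used here."""

    mimetype: Optional[str] = None
    extension: Optional[str] = None
    charset: Optional[str] = None


def accepts_rtf(info: FakeStreamInfo) -> bool:
    """Same decision as RtfConverter.accepts(): extension OR MIME type prefix."""
    mimetype = (info.mimetype or "").lower()
    extension = (info.extension or "").lower()
    return extension == ".rtf" or mimetype.startswith(("text/rtf", "application/rtf"))


print(accepts_rtf(FakeStreamInfo(extension=".RTF")))            # True
print(accepts_rtf(FakeStreamInfo(mimetype="application/rtf")))  # True
print(accepts_rtf(FakeStreamInfo(extension=".docx")))           # False
```

Lower-casing both values first means ".RTF" and "Text/RTF" are treated the same as their lowercase forms, and a missing value (None) simply fails to match.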
# src/markitdown_sample_plugin/__init__.py
from ._plugin import (
    __plugin_interface_version__,
    register_converters,
    RtfConverter,
)

__all__ = [
    "__plugin_interface_version__",
    "register_converters",
    "RtfConverter",
]
# pyproject.toml
[project]
name = "markitdown-sample-plugin"
dynamic = ["version"]
description = "A sample plugin for the markitdown library."
requires-python = ">=3.10"
dependencies = [
    "markitdown>=0.1.0a1",
    "striprtf",
]

# CRITICAL: This entry point enables plugin discovery
[project.entry-points."markitdown.plugin"]
sample_plugin = "markitdown_sample_plugin"
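To see what this entry point resolves to, the standard library's importlib.metadata.EntryPoint can be constructed by hand. The sketch below is an illustration under stated assumptions: the module placed in sys.modules is a fake stand-in for the installed package, so the example runs without installing anything.

```python
import sys
import types
from importlib.metadata import EntryPoint

# Hypothetical stand-in for the installed plugin package; a real install
# would provide this module on disk instead.
fake_plugin = types.ModuleType("markitdown_sample_plugin")
fake_plugin.__plugin_interface_version__ = 1
fake_plugin.register_converters = lambda markitdown, **kwargs: None
sys.modules["markitdown_sample_plugin"] = fake_plugin

# Mirrors the pyproject.toml line: sample_plugin = "markitdown_sample_plugin"
ep = EntryPoint(
    name="sample_plugin",
    value="markitdown_sample_plugin",
    group="markitdown.plugin",
)

# The value has no ":attribute" suffix, so load() returns the module itself.
module = ep.load()
print(module.__name__)
print(callable(module.register_converters))
```

Because the entry point names the package rather than a function, register_converters() and __plugin_interface_version__ need to be importable from the package's top level, which is what the re-exports in __init__.py provide.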
Complete Example: RTF Plugin
Here’s the complete RTF converter plugin from packages/markitdown-sample-plugin/:
Installing and Testing Your Plugin
Writing Plugin Tests
Create tests to verify both direct converter usage and plugin loading.
Plugin Loading Process
MarkItDown loads plugins using this process (_markitdown.py:65):
- When enable_plugins=True, MarkItDown calls the enable_plugins() method
- _load_plugins() discovers all markitdown.plugin entry points
- Each entry point is loaded and its register_converters() function is called
- If any plugin fails to load, a warning is issued and the plugin is skipped
Best Practices
Publishing Your Plugin
To share your plugin with others:
- Naming Convention: Use markitdown-<name>-plugin for the package name
- PyPI Publishing: Follow the standard Python package publishing process
- Documentation: Include clear installation and usage instructions
- Testing: Ensure comprehensive test coverage
Next Steps
Custom Converters
Learn more about implementing converters
Configuration
Understand MarkItDown configuration options