Overview

This guide shows how to install MarkItDown and convert your first document to Markdown with both the command-line interface and the Python API.
Prerequisites: Python 3.10 or higher. We recommend using a virtual environment to avoid dependency conflicts.

Installation

Install MarkItDown with all optional dependencies for full format support:
pip install 'markitdown[all]'
To install only the dependencies for the formats you need, see the detailed installation guide.

CLI quickstart

The command-line interface is the fastest way to convert documents.
1. Convert a file to Markdown

Use the markitdown command with any supported file:
markitdown document.pdf > output.md
Or specify the output file with the -o flag:
markitdown document.pdf -o output.md
2. Convert from stdin

You can pipe content directly to MarkItDown:
cat presentation.pptx | markitdown > slides.md
Or use input redirection:
markitdown < spreadsheet.xlsx > data.md
3. View the output

Open the output file to see your converted Markdown:
cat output.md

CLI examples

markitdown report.pdf -o report.md
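
The same CLI call can also be driven from a Python script, for example as a build step. This is a minimal sketch, assuming the markitdown executable is on your PATH. `build_command` and `run_markitdown` are hypothetical helper names, not part of MarkItDown, and only the -o flag shown above is used.

```python
import subprocess
from pathlib import Path


def build_command(source: str, output: str) -> list[str]:
    """Return the argument list for `markitdown SOURCE -o OUTPUT`."""
    return ["markitdown", source, "-o", output]


def run_markitdown(source: str, output: str) -> Path:
    """Run the CLI and return the output path.

    check=True makes subprocess raise CalledProcessError when the
    command exits with a non-zero status.
    """
    subprocess.run(build_command(source, output), check=True)
    return Path(output)
```

Calling `run_markitdown("report.pdf", "report.md")` would then mirror the command above. Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.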

Python API quickstart

Integrate MarkItDown into your Python applications for programmatic document conversion.
1. Import and initialize

from markitdown import MarkItDown

# Initialize with default settings
md = MarkItDown()
2. Convert a file

# Convert a local file
result = md.convert("document.pdf")

# Access the Markdown content
print(result.text_content)

# Access the document title (if available)
if result.title:
    print(f"Title: {result.title}")
3. Save the output

# Write to a file
with open("output.md", "w", encoding="utf-8") as f:
    f.write(result.markdown)

Python API examples

from markitdown import MarkItDown

md = MarkItDown()
result = md.convert("report.pdf")
print(result.text_content)

Advanced usage

Using an LLM for image descriptions

Enhance image and PowerPoint conversions with AI-generated descriptions:
from markitdown import MarkItDown
from openai import OpenAI

client = OpenAI()
md = MarkItDown(
    llm_client=client,
    llm_model="gpt-4o",
    llm_prompt="Describe this image in detail for a technical audience."
)

result = md.convert("presentation.pptx")
print(result.text_content)
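
If the LLM step should be optional, one approach is to build the keyword arguments only when an API key is present. This is a sketch under assumptions: `llm_options` is a hypothetical helper, and it relies on OPENAI_API_KEY, the environment variable the OpenAI client reads by default. The `llm_client` and `llm_model` names are the ones used above.

```python
import os


def llm_options(env=None, model="gpt-4o"):
    """Return MarkItDown keyword arguments for LLM image descriptions.

    Returns an empty dict when OPENAI_API_KEY is missing, so the caller
    falls back to conversion without image descriptions.
    """
    env = os.environ if env is None else env
    api_key = env.get("OPENAI_API_KEY")
    if not api_key:
        return {}
    # Imported lazily so the openai package is only needed when a key is set.
    from openai import OpenAI

    return {"llm_client": OpenAI(api_key=api_key), "llm_model": model}
```

The result can be unpacked into the constructor, as in `MarkItDown(**llm_options())`, which behaves like `MarkItDown()` when no key is configured.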

Azure Document Intelligence

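Route conversion through Microsoft’s Document Intelligence service instead of the built-in offline converters. The -d flag enables the service and -e sets its endpoint: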
markitdown document.pdf \
  -d \
  -e "https://YOUR_ENDPOINT.cognitiveservices.azure.com/" \
  -o output.md
Learn how to set up an Azure Document Intelligence Resource in the Azure documentation.

Using plugins

MarkItDown supports third-party plugins for extended functionality. Plugins are disabled by default:
# List installed plugins
markitdown --list-plugins

# Use plugins when converting
markitdown --use-plugins document.xyz -o output.md
To find available plugins, search GitHub for the hashtag #markitdown-plugin.

Next steps

Installation guide

Learn about virtual environments and selective dependency installation

Python API reference

Explore the complete API documentation

CLI reference

See all command-line options and flags

Converters

Deep dive into format-specific converters

Common patterns

Batch conversion

from markitdown import MarkItDown
from pathlib import Path

md = MarkItDown()
input_dir = Path("documents")
output_dir = Path("markdown")

# Create the output directory if it does not exist yet
output_dir.mkdir(parents=True, exist_ok=True)

for file_path in input_dir.glob("*.pdf"):
    result = md.convert(str(file_path))
    output_path = output_dir / f"{file_path.stem}.md"
    output_path.write_text(result.markdown, encoding="utf-8")
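
For larger batches it can help to keep going when a single file fails and to report the failures at the end. This is a sketch, not a MarkItDown feature: `convert_tree` is a hypothetical helper, the suffix list is an assumption about which formats you have installed support for, and `md` is any object with a `convert(path)` method that returns a result with `text_content`, such as a `MarkItDown` instance.

```python
from pathlib import Path


def convert_tree(md, input_dir, output_dir, suffixes=(".pdf", ".docx", ".pptx")):
    """Convert matching files in input_dir; return (written, failed).

    written is a list of output paths; failed is a list of
    (source_path, exception) pairs.
    """
    input_dir, output_dir = Path(input_dir), Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    written, failed = [], []
    for path in sorted(input_dir.iterdir()):
        if not path.is_file() or path.suffix.lower() not in suffixes:
            continue
        try:
            result = md.convert(str(path))
        except Exception as exc:  # broad on purpose: one bad file should not stop the batch
            failed.append((path, exc))
            continue
        # Keep the original suffix in the name so a.pdf and a.docx do not collide.
        target = output_dir / f"{path.name}.md"
        target.write_text(result.text_content, encoding="utf-8")
        written.append(target)
    return written, failed
```

A call such as `written, failed = convert_tree(MarkItDown(), "documents", "markdown")` would then let you log each entry in `failed` after the run instead of stopping at the first error.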

Error handling

from markitdown import MarkItDown, FileConversionException, UnsupportedFormatException

md = MarkItDown()

try:
    result = md.convert("document.xyz")
    print(result.text_content)
except UnsupportedFormatException:
    print("This file format is not supported")
except FileConversionException as e:
    print(f"Conversion failed: {e}")

Processing HTTP responses

from markitdown import MarkItDown
import requests

md = MarkItDown()
response = requests.get("https://example.com/document.pdf")
result = md.convert_response(response)
print(result.text_content)
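
When the download itself can fail, it is worth checking the HTTP status before converting, so that an error page is not converted by mistake. This is a sketch with a hypothetical helper, `fetch_markdown`. It assumes `session` is a `requests.Session` (or anything with a compatible `get`) and `md` is a `MarkItDown` instance, and the 30-second timeout is an arbitrary choice.

```python
def fetch_markdown(md, url, session, timeout=30):
    """Download url and return the converted Markdown text."""
    response = session.get(url, timeout=timeout)
    # Raise on 4xx/5xx responses instead of converting an error page.
    response.raise_for_status()
    return md.convert_response(response).text_content
```

With `requests` installed, usage would look like `fetch_markdown(MarkItDown(), "https://example.com/document.pdf", requests.Session())`.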
Need help? Check out the API reference for detailed documentation on all methods and parameters.
