Image Files

MarkItDown extracts metadata from image files and can optionally generate detailed descriptions using multimodal LLMs.

Supported Formats

JPEG: .jpg, .jpeg
PNG: .png

Dependencies

Core (No Dependencies)

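Image conversion itself needs no extra dependencies. Metadata extraction, however, requires the external exiftool binary, and captions require an LLM client; with neither configured, the converted result may be empty.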

Optional: EXIF Metadata

# macOS
brew install exiftool

# Ubuntu/Debian  
sudo apt-get install libimage-exiftool-perl

# Windows: download from https://exiftool.org/

Security: MarkItDown requires ExifTool version 12.24 or later to avoid CVE-2021-22204. The converter will verify the version before use.
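
To see which version is installed before handing its path to MarkItDown, the same comparison can be done by hand. The following is a minimal sketch, not MarkItDown's own code: `parse_exiftool_version` and `installed_exiftool_is_patched` are hypothetical helper names, and the sketch assumes `exiftool -ver` prints a dotted version such as `12.24`.

```python
import subprocess


def parse_exiftool_version(output: str) -> tuple[int, ...]:
    """Convert text such as '12.24\n' into a comparable tuple like (12, 24)."""
    return tuple(int(part) for part in output.strip().split("."))


def installed_exiftool_is_patched(exiftool_path: str = "exiftool") -> bool:
    """Run `exiftool -ver` and compare against 12.24 (requires exiftool)."""
    completed = subprocess.run(
        [exiftool_path, "-ver"], capture_output=True, text=True, check=True
    )
    return parse_exiftool_version(completed.stdout) >= (12, 24)


# The comparison itself can be tried without ExifTool installed:
print(parse_exiftool_version("12.23\n") >= (12, 24))  # False
print(parse_exiftool_version("12.76\n") >= (12, 24))  # True
```

Comparing integer tuples rather than raw strings matters here: as strings, "9.0" would sort above "12.24".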

Optional: LLM Captioning

pip install openai  # Or other LLM client

Basic Usage

from markitdown import MarkItDown

md = MarkItDown(exiftool_path="/usr/local/bin/exiftool")
result = md.convert("photo.jpg")
print(result.markdown)

Features

EXIF Metadata

Extract camera settings, dates, GPS coordinates

LLM Descriptions

Generate detailed image captions with multimodal LLMs

Embedded Metadata

Extract title, caption, description, keywords, artist

GPS Data

Extract geolocation information

Output Examples

Metadata Only

ImageSize: 4032x3024
DateTimeOriginal: 2024:02:15 14:30:25
Artist: John Doe
Description: Sunset at the beach
GPSPosition: 34.0522 N, 118.2437 W

With LLM Description

ImageSize: 1920x1080
DateTimeOriginal: 2024:02:15 10:15:00
Keywords: landscape, mountain, nature

# Description:
A breathtaking mountain landscape at golden hour. Snow-capped peaks rise majestically against a vibrant orange and pink sky. In the foreground, a winding river reflects the colorful sunset, with evergreen trees lining both banks. The composition captures the serenity and grandeur of alpine wilderness.
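
Downstream code may want the metadata fields and the caption separately. Here is a minimal sketch, assuming the layout shown in the examples above: one `Key: value` pair per line, optionally followed by a `# Description:` line and the caption. `split_image_markdown` is a hypothetical helper, not part of MarkItDown.

```python
def split_image_markdown(markdown: str) -> tuple[dict[str, str], str]:
    """Split converter output into (metadata fields, description text)."""
    # Everything before the marker is metadata; everything after is the caption.
    head, marker, tail = markdown.partition("# Description:")
    metadata = {}
    for line in head.splitlines():
        # Split on the first colon only, since values may contain colons.
        key, sep, value = line.partition(":")
        if sep and key.strip():
            metadata[key.strip()] = value.strip()
    description = tail.strip() if marker else ""
    return metadata, description


sample = (
    "ImageSize: 1920x1080\n"
    "DateTimeOriginal: 2024:02:15 10:15:00\n"
    "Keywords: landscape, mountain, nature\n"
    "\n"
    "# Description:\n"
    "A breathtaking mountain landscape at golden hour.\n"
)
fields, caption = split_image_markdown(sample)
print(fields["DateTimeOriginal"])  # 2024:02:15 10:15:00
print(caption)  # A breathtaking mountain landscape at golden hour.
```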

EXIF Metadata Fields

The converter extracts the following EXIF fields (when available):

| Field | Description | Example |
|---|---|---|
| `ImageSize` | Dimensions in pixels | `4032x3024` |
| `Title` | Image title | `Vacation Photo` |
| `Caption` | Short caption | `Family at beach` |
| `Description` | Long description | `Summer vacation...` |
| `Keywords` | Comma-separated tags | `beach, sunset, ocean` |
| `Artist` | Photographer name | `Jane Smith` |
| `Author` | Author/creator | `John Doe` |
| `DateTimeOriginal` | When photo was taken | `2024:02:15 14:30:25` |
| `CreateDate` | File creation date | `2024:02:15 14:30:25` |
| `GPSPosition` | GPS coordinates | `34.0522 N, 118.2437 W` |
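
The date and GPS values arrive as strings, so they usually need converting before sorting or mapping. The sketch below assumes the exact formats in the Example column; ExifTool can also emit GPS positions in degrees/minutes/seconds, which this does not handle. Both function names are hypothetical.

```python
from datetime import datetime


def parse_exif_datetime(value: str) -> datetime:
    """Parse the 'YYYY:MM:DD HH:MM:SS' form used in the examples."""
    return datetime.strptime(value, "%Y:%m:%d %H:%M:%S")


def parse_decimal_gps(value: str) -> tuple[float, float]:
    """Turn '34.0522 N, 118.2437 W' into signed (latitude, longitude)."""
    coords = []
    for part in value.split(","):
        number, hemisphere = part.split()
        # South and west are negative in signed decimal degrees.
        sign = -1.0 if hemisphere.upper() in ("S", "W") else 1.0
        coords.append(sign * float(number))
    latitude, longitude = coords
    return latitude, longitude


taken = parse_exif_datetime("2024:02:15 14:30:25")
lat, lon = parse_decimal_gps("34.0522 N, 118.2437 W")
print(taken.isoformat())  # 2024-02-15T14:30:25
print(lat, lon)           # 34.0522 -118.2437
```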

LLM Integration

Using OpenAI

from markitdown import MarkItDown
from openai import OpenAI

client = OpenAI(api_key="your-api-key")
md = MarkItDown(
    llm_client=client,
    llm_model="gpt-4o",
    llm_prompt="Write a detailed caption for this image."
)

result = md.convert("photo.jpg")

Custom Prompts

md = MarkItDown(
    llm_client=client,
    llm_model="gpt-4o"
)

result = md.convert(
    "photo.jpg",
    llm_prompt="Describe this image focusing on colors, composition, and mood. Be detailed."
)

Default prompt: "Write a detailed caption for this image."

Using Other LLM Providers

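from openai import OpenAI
from markitdown import MarkItDown

# Any client exposing an OpenAI-compatible chat.completions API works.
# The native Anthropic SDK client exposes messages.create rather than
# chat.completions, so it cannot be passed directly as llm_client.
client = OpenAI(
    api_key="your-api-key",
    base_url="https://your-provider.example/v1",  # placeholder endpoint
)
md = MarkItDown(llm_client=client, llm_model="your-model-name")

result = md.convert("photo.jpg")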

Implementation Details

Source Location

packages/markitdown/src/markitdown/converters/
├── _image_converter.py  # Main image converter
└── _exiftool.py         # ExifTool metadata extraction

Converter Class

Class Name: ImageConverter
Accepted Extensions: .jpg, .jpeg, .png
MIME Types: image/jpeg, image/png
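
When scanning a folder, it can help to filter files against these lists before calling `convert`. The sketch below only mirrors the extensions and MIME types listed above; it is not how `ImageConverter` itself decides, and `looks_like_supported_image` is a hypothetical name.

```python
import mimetypes
from pathlib import Path

SUPPORTED_EXTENSIONS = {".jpg", ".jpeg", ".png"}
SUPPORTED_MIME_TYPES = {"image/jpeg", "image/png"}


def looks_like_supported_image(path: str) -> bool:
    """True if the extension or the guessed MIME type is in the lists above."""
    if Path(path).suffix.lower() in SUPPORTED_EXTENSIONS:
        return True
    guessed_type, _ = mimetypes.guess_type(path)
    return guessed_type in SUPPORTED_MIME_TYPES


print(looks_like_supported_image("photos/IMG_0001.JPG"))  # True
print(looks_like_supported_image("scan.tiff"))            # False
```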

ExifTool Integration

The exiftool_metadata() function in _exiftool.py:

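from typing import BinaryIO, Union

def exiftool_metadata(
    file_stream: BinaryIO,
    *,
    exiftool_path: Union[str, None],
) -> dict:
    # Outline only; the body is elided here.
    # 1. Verify ExifTool version >= 12.24 (CVE-2021-22204)
    # 2. Run: exiftool -json -  (the trailing "-" reads the image from stdin)
    # 3. Return a dict of metadata fields
    ...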

Security Check:

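# Simplified: parse the text printed by `exiftool -ver` before comparing
output = subprocess.run([exiftool_path, "-ver"], capture_output=True, text=True, check=True).stdout
version = tuple(int(part) for part in output.strip().split("."))
if version < (12, 24):
    raise RuntimeError("ExifTool version is vulnerable to CVE-2021-22204")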

LLM Description Process

1. Convert image to base64
2. Create data URI: data:image/jpeg;base64,...
3. Send to LLM with prompt
4. Return generated description

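import base64

def _get_llm_description(self, file_stream, stream_info, *, client, model, prompt):
    # Fall back to a generic type when the stream's MIME type is unknown
    content_type = stream_info.mimetype or "application/octet-stream"

    # Encode the image bytes and wrap them in a data URI
    base64_image = base64.b64encode(file_stream.read()).decode("utf-8")
    data_uri = f"data:{content_type};base64,{base64_image}"

    # One user message carrying both the prompt and the image
    messages = [{
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url", "image_url": {"url": data_uri}}
        ]
    }]

    response = client.chat.completions.create(model=model, messages=messages)
    return response.choices[0].message.content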
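
To inspect what would be sent without calling any API, the encoding and message-building steps can be run on their own. This is a sketch under stated assumptions: `build_data_uri` and `build_messages` are hypothetical names, and the MIME type is guessed from the filename rather than taken from MarkItDown's stream information.

```python
import base64
import io
import mimetypes
from typing import BinaryIO


def build_data_uri(stream: BinaryIO, filename: str) -> str:
    """Base64-encode the stream and wrap it in a data URI."""
    content_type, _ = mimetypes.guess_type(filename)
    content_type = content_type or "application/octet-stream"
    encoded = base64.b64encode(stream.read()).decode("utf-8")
    return f"data:{content_type};base64,{encoded}"


def build_messages(prompt: str, data_uri: str) -> list[dict]:
    """Build one user message holding both the prompt and the image."""
    return [{
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url", "image_url": {"url": data_uri}},
        ],
    }]


# The 8-byte PNG signature stands in for a real image file.
fake_png = io.BytesIO(b"\x89PNG\r\n\x1a\n")
uri = build_data_uri(fake_png, "photo.png")
messages = build_messages("Write a detailed caption for this image.", uri)
print(uri)  # data:image/png;base64,iVBORw0KGgo=
```

Because base64 expands data by roughly a third, large images produce correspondingly large request payloads.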

Advanced Examples

Batch Processing Images

from markitdown import MarkItDown
from openai import OpenAI
import os

client = OpenAI()
md = MarkItDown(
    llm_client=client,
    llm_model="gpt-4o",
    exiftool_path="/usr/local/bin/exiftool"
)

image_dir = "photos"
for filename in os.listdir(image_dir):
    if filename.lower().endswith(('.jpg', '.jpeg', '.png')):
        filepath = os.path.join(image_dir, filename)
        result = md.convert(filepath)
        
        # Save markdown next to the image, swapping only the final extension
        output_path = os.path.splitext(filepath)[0] + '.md'
        with open(output_path, 'w', encoding='utf-8') as f:
            f.write(result.markdown)

Extract Only Metadata

from markitdown import MarkItDown

# No LLM client = metadata only
md = MarkItDown(exiftool_path="/usr/local/bin/exiftool")
result = md.convert("photo.jpg")

# Parse metadata from markdown
for line in result.markdown.split('\n'):
    if ':' in line:
        key, value = line.split(':', 1)
        print(f"{key.strip()}: {value.strip()}")

Custom Metadata Processing

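# exiftool_metadata lives in the private _exiftool module (see Source Location),
# so this import path may change between releases
from markitdown.converters._exiftool import exiftool_metadata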

with open('photo.jpg', 'rb') as f:
    metadata = exiftool_metadata(
        f,
        exiftool_path="/usr/local/bin/exiftool"
    )
    
    # Direct access to metadata dict
    print(f"Camera: {metadata.get('Make')} {metadata.get('Model')}")
    print(f"ISO: {metadata.get('ISO')}")
    print(f"Aperture: {metadata.get('Aperture')}")
    print(f"Shutter Speed: {metadata.get('ShutterSpeed')}")

Error Handling

from markitdown import MarkItDown

md = MarkItDown(exiftool_path="/usr/local/bin/exiftool")

try:
    result = md.convert("photo.jpg")
    if not result.markdown.strip():
        print("No metadata or description generated")
except FileNotFoundError:
    print("Image file or exiftool binary not found at the specified path")
except RuntimeError as e:
    if "CVE-2021-22204" in str(e):
        print("Please upgrade exiftool to version 12.24 or later")
    else:
        raise
except Exception as e:
    print(f"Error processing image: {e}")

Use Cases

Photo Library Documentation

Generate markdown catalogs of photo collections with metadata and AI-generated descriptions for searchability.

Digital Asset Management

Extract and index metadata from image libraries for better organization and retrieval.

Accessibility

Generate alt text and detailed descriptions for web images using LLM captioning.

Image Analysis Pipelines

Integrate into data processing workflows to extract technical and descriptive information from images.

Evidence Documentation

Extract EXIF data including GPS coordinates and timestamps for forensic or legal purposes.

Limitations

No OCR: Text within images is not extracted (consider using Document Intelligence for OCR)
LLM Accuracy: AI-generated descriptions may contain hallucinations or inaccuracies
Format Support: Only JPEG and PNG; no support for TIFF, GIF, WebP, etc.
EXIF Dependency: Metadata extraction requires external exiftool binary

Next Steps

Document Intelligence

PowerPoint Images
