CLI Options Reference

This page provides a comprehensive reference for all CLI options available in Docling.

Global Options

These options are available for the main docling convert command.

Input/Output Options

source

argument

required

Documents to convert, such as PDF files. Each source can be a local file path, a directory path, or a URL. Multiple sources can be specified.

Examples:

docling convert document.pdf
docling convert /path/to/docs/
docling convert https://example.com/file.pdf

--from

option

Specify input formats to convert from.

Type: Multiple values allowed
Default: All formats
Available: pdf, image, docx, pptx, xlsx, html, md, latex, audio, mets_gbs
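As a sketch of how --from might be combined with a directory source, the wrapper below limits conversion to PDF and DOCX inputs. It assumes --from is repeated once per format, the same way the --to example repeats its flag. The function name convert_pdf_and_docx and the directory path are hypothetical.

```shell
# Hypothetical helper: convert only the PDF and DOCX files found in a directory.
# Assumption: --from is passed once per format, mirroring the --to example.
convert_pdf_and_docx() {
  src_dir="$1"  # directory containing mixed input formats
  docling convert "$src_dir" --from pdf --from docx
}
```

Calling convert_pdf_and_docx /path/to/docs/ should then leave other formats in that directory untouched, if the assumption above holds.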

--to

option

Specify output formats.

Type: Multiple values allowed
Default: markdown
Available: json, yaml, html, html_split_page, markdown, text, doctags, vtt

Example:

docling convert doc.pdf --to json --to markdown

--output

option

Output directory where results are saved.

Type: Path
Default: . (current directory)

Example:

docling convert doc.pdf --output ./results

--headers

option

Specify HTTP request headers used when fetching URL input sources.

Type: JSON string

Example:

docling convert https://api.example.com/doc.pdf \
  --headers '{"Authorization": "Bearer token"}'

Image Export Options

--image-export-mode

option

Image export mode for documents (applies to JSON, Markdown, and HTML outputs).

Type: Enum
Default: embedded
Options:

placeholder - Only mark image positions in output
embedded - Embed images as base64 encoded strings
referenced - Export images as PNG files and reference them
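The referenced mode is the one most likely to need an explicit output location, since it writes image files alongside the converted document. A minimal sketch, assuming the images land under the directory given to --output; the function name and its arguments are hypothetical:

```shell
# Hypothetical helper: export Markdown with images written as separate PNG files.
# Assumption: the referenced PNG files are saved under the --output directory.
convert_with_referenced_images() {
  src="$1"      # input document
  out_dir="$2"  # directory that receives the Markdown and image files
  docling convert "$src" --to markdown --image-export-mode referenced --output "$out_dir"
}
```

For example, convert_with_referenced_images doc.pdf ./results keeps the Markdown small by avoiding base64 payloads.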

--show-layout

flag

If enabled, page images will show bounding boxes of detected items.

Type: Boolean
Default: false

Pipeline Configuration

--pipeline

option

Choose the processing pipeline for PDF or image files.

Type: Enum
Default: standard
Options:

standard - Traditional document processing pipeline
vlm - Vision-Language Model based pipeline

Example:

docling convert doc.pdf --pipeline vlm --vlm-model granite_docling

--vlm-model

option

Choose the VLM (Vision-Language Model) preset to use with PDF or image files.

Type: String
Default: granite_docling
Available presets: granite_docling, smol_docling, and others

Only applicable when --pipeline vlm is set.

--asr-model

option

Choose the ASR (Automatic Speech Recognition) model for audio/video files.

Type: Enum
Default: whisper_tiny
Available models:

Auto-select: whisper_tiny, whisper_base, whisper_small, whisper_medium, whisper_large, whisper_turbo
MLX variants: whisper_tiny_mlx, whisper_base_mlx, whisper_small_mlx, whisper_medium_mlx, whisper_large_mlx, whisper_turbo_mlx
Native variants: whisper_tiny_native, whisper_base_native, whisper_small_native, whisper_medium_native, whisper_large_native, whisper_turbo_native

OCR Options

--ocr / --no-ocr

flag

Enable or disable OCR for bitmap content.

Type: Boolean
Default: true

Example:

docling convert scan.pdf --ocr

--force-ocr

flag

Replace any existing text with OCR-generated text over the full content.

Type: Boolean
Default: false

Use this when existing text extraction is poor quality.

--ocr-engine

option

The OCR engine to use.

Type: String
Default: auto
Available engines: auto, tesseract_cli, tesseract, easyocr, rapidocr

Additional engines may be available with --allow-external-plugins.

Example:

docling convert scan.pdf --ocr-engine easyocr

--ocr-lang

option

Comma-separated list of languages for the OCR engine.

Type: String
Note: Each OCR engine has different language code formats.

Example:

docling convert doc.pdf --ocr-lang "eng,fra,deu"

--psm

option

Page Segmentation Mode for the OCR engine.

Type: Integer (0-13)
Applies to: Tesseract engines only

See Tesseract documentation for PSM mode details.
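Because --psm only applies to Tesseract, it is usually paired with an explicit --ocr-engine. The sketch below shows one plausible combination. The helper name and file name are hypothetical, PSM 6 is Tesseract's "single uniform block of text" mode, and eng is a Tesseract-style language code that may not carry over to other engines.

```shell
# Hypothetical helper: OCR a scanned document with Tesseract and an explicit PSM.
# "eng" follows Tesseract's language-code format; other engines may differ.
ocr_with_tesseract_psm() {
  src="$1"       # scanned input document
  psm="${2:-6}"  # page segmentation mode (0-13); defaults to 6 in this sketch
  docling convert "$src" --ocr-engine tesseract --ocr-lang eng --psm "$psm"
}
```

Calling ocr_with_tesseract_psm scan.pdf 11 would request sparse-text segmentation instead of the sketch's default.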

Table Processing Options

--tables / --no-tables

flag

Enable or disable table structure extraction.

Type: Boolean
Default: true

--table-mode

option

The mode to use in the table structure model.

Type: Enum
Default: accurate
Options:

fast - Faster processing, less accurate
accurate - Slower processing, more accurate
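To illustrate the trade-off between the two modes, here is a small sketch that keeps table extraction on but switches to the faster mode, for example for a quick first pass over a large document. The function name is hypothetical; the flags are the ones listed above.

```shell
# Hypothetical helper: keep table extraction enabled but favor speed over accuracy.
convert_fast_tables() {
  src="$1"  # input document
  docling convert "$src" --tables --table-mode fast
}
```

A second pass with the default accurate mode could then be reserved for documents where the fast result looks wrong.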

PDF Backend Options

--pdf-backend

option

The PDF backend to use for processing.

Type: Enum
Default: docling_parse
Options:

docling_parse - Recommended backend (default)
pypdfium2 - Alternative backend

--pdf-password

option

Password for protected PDF documents.

Type: String

Example:

docling convert protected.pdf --pdf-password "secret123"

Enrichment Options

--enrich-code

flag

Enable code enrichment model in the pipeline.

Type: Boolean
Default: false

Improves detection and formatting of code blocks.

--enrich-formula

flag

Enable formula enrichment model in the pipeline.

Type: Boolean
Default: false

Improves detection and rendering of mathematical formulas.

--enrich-picture-classes

flag

Enable picture classification enrichment model.

Type: Boolean
Default: false

Classifies images into categories (charts, diagrams, photos, etc.).

--enrich-picture-description

flag

Enable picture description model.

Type: Boolean
Default: false

Generates textual descriptions for images.

--enrich-chart-extraction

flag

Enable chart extraction to convert bar, pie, and line charts to tabular format.

Type: Boolean
Default: false

Example:

docling convert report.pdf --enrich-chart-extraction
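The enrichment flags are independent switches, so several can presumably be given in one invocation. A minimal sketch for a technical document containing both code listings and formulas follows. The helper name is hypothetical, and combining the two flags this way is an assumption rather than something this reference states.

```shell
# Hypothetical helper: enable code and formula enrichment together.
# Assumption: enrichment flags can be combined in a single run.
convert_technical_doc() {
  src="$1"  # input document with code blocks and formulas
  docling convert "$src" --enrich-code --enrich-formula --to markdown
}
```

Each enabled enrichment adds a model to the pipeline, so expect longer processing times than a plain conversion.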

Performance Options

--num-threads

option

Number of threads to use for processing.

Type: Integer
Default: 4

Example:

docling convert doc.pdf --num-threads 8

--device

option

Accelerator device to use for model inference.

Type: Enum
Default: auto
Options:

auto - Automatically select best available device
cpu - Use CPU only
cuda - Use NVIDIA GPU
mps - Use Apple Metal Performance Shaders (Mac)
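On machines without a GPU, pinning the device to cpu and tuning --num-threads is one way to get predictable behavior. The sketch below is an illustration only. The helper name is hypothetical, and its fallback of 4 threads simply mirrors the documented --num-threads default.

```shell
# Hypothetical helper: force CPU inference with a configurable thread count.
convert_on_cpu() {
  src="$1"           # input document
  threads="${2:-4}"  # 4 mirrors the documented --num-threads default
  docling convert "$src" --device cpu --num-threads "$threads"
}
```

For example, convert_on_cpu doc.pdf 8 runs on the CPU with eight threads.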

--page-batch-size

option

Number of pages processed in one batch.

Type: Integer
Default: System default (configurable)

Larger values may improve throughput but use more memory.

--document-timeout

option

Timeout for processing each document, in seconds.

Type: Float

Example:

docling convert doc.pdf --document-timeout 300

Model and Plugin Options

--artifacts-path

option

Location of model artifacts for offline use.

Type: Path

Example:

# First download models
docling tools models download --output-dir ./models

# Then use offline
docling convert doc.pdf --artifacts-path ./models

--enable-remote-services

flag

Must be enabled when using models that connect to remote services.

Type: Boolean
Default: false

--allow-external-plugins

flag

Must be enabled for loading modules from third-party plugins.

Type: Boolean
Default: false

Example:

docling convert doc.pdf --allow-external-plugins --ocr-engine custom_plugin

--show-external-plugins

flag

List third-party plugins available when --allow-external-plugins is set.

Type: Boolean

Displays available OCR, layout, and table extraction plugins.

Example:

docling convert --show-external-plugins

Error Handling Options

--abort-on-error / --no-abort-on-error

flag

Control whether processing should abort on first error.

Type: Boolean
Default: false (continue on errors)

When disabled, failed documents are logged but processing continues.
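For batch jobs where a single bad input should stop the run, for instance in a CI check, the flag can be turned on explicitly. A minimal sketch with a hypothetical helper name and paths:

```shell
# Hypothetical helper: convert a directory and stop at the first failing document.
convert_strict() {
  src_dir="$1"  # directory of input documents
  out_dir="$2"  # directory for converted results
  docling convert "$src_dir" --abort-on-error --output "$out_dir"
}
```

Leaving the flag off (the default) is the better fit for large, messy collections where partial results are still useful.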

Verbosity and Logging Options

-v, --verbose

flag

Set verbosity level for logging.

Type: Count (repeatable)
Levels:

No flag: WARNING level
-v: INFO level
-vv: DEBUG level

Example:

docling convert doc.pdf -vv  # Debug logging

Debug Visualization Options

--debug-visualize-cells

flag

Enable debug output which visualizes PDF cells.

Type: Boolean
Default: false

--debug-visualize-ocr

flag

Enable debug output which visualizes OCR cells.

Type: Boolean
Default: false

--debug-visualize-layout

flag

Enable debug output which visualizes layout clusters.

Type: Boolean
Default: false

--debug-visualize-tables

flag

Enable debug output which visualizes table cells.

Type: Boolean
Default: false

Example:

docling convert doc.pdf --debug-visualize-tables --output ./debug

Profiling Options

--profiling

flag

Enable profiling to summarize timing details for all conversion stages.

Type: Boolean
Default: false

Displays a detailed timing table after conversion.

--save-profiling

flag

Save profiling summaries to JSON files.

Type: Boolean
Default: false

Example:

docling convert doc.pdf --profiling --save-profiling

Information Options

--version

flag

Show version information and exit.

Output includes:

Docling version
Docling Core version
Docling IBM Models version
Docling Parse version
Python version and implementation
Platform information

Example:

docling convert --version

--logo

flag

Display Docling ASCII art logo and exit.

Example:

docling convert --logo

Model Download Options

These options apply to the docling tools models download command.

models

argument

Specific models to download.

Type: Multiple values allowed
Available models:

layout - Layout analysis model
tableformer - Table structure extraction model
code_formula - Code and formula detection model
picture_classifier - Picture classification model
smolvlm - Small VLM model
granitedocling - Granite Docling VLM
granitedocling_mlx - Granite Docling for MLX
smoldocling - Small Docling VLM
smoldocling_mlx - Small Docling for MLX
granite_vision - Granite Vision model
granite_chart_extraction - Chart extraction model
rapidocr - RapidOCR model
easyocr - EasyOCR model

Default models (downloaded when no specific models specified):

layout, tableformer, code_formula, picture_classifier, rapidocr
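To fetch a subset rather than the defaults, model names can presumably be listed as positional arguments. The sketch below assumes they are space-separated, which this reference implies ("Multiple values allowed") but does not show. The helper name and target directory are hypothetical.

```shell
# Hypothetical helper: download only the layout and table models.
# Assumption: model names are given as space-separated positional arguments.
download_core_models() {
  target_dir="$1"  # directory that receives the model artifacts
  docling tools models download layout tableformer --output-dir "$target_dir"
}
```

The resulting directory can then be passed to --artifacts-path for offline conversion.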

-o, --output-dir

option

Directory where models will be downloaded.

Type: Path
Default: System cache directory

Example:

docling tools models download --output-dir /opt/docling-models

--force

flag

Force download even if models already exist.

Type: Boolean
Default: false

--all

flag

Download all available models.

Type: Boolean
Default: false
Note: Mutually exclusive with specifying individual models.

Example:

docling tools models download --all

-q, --quiet

flag

Minimal output mode.

Type: Boolean
Default: false

When enabled, only prints the output directory path. Useful for scripts.

Example:

MODEL_DIR=$(docling tools models download --quiet)

HuggingFace Download Options

These options apply to the docling tools models download-hf-repo command.

models

argument

required

HuggingFace repository IDs to download.

Type: Multiple values allowed
Format: org-name/repo-name

Example:

docling tools models download-hf-repo docling-project/docling-models

-o, --output-dir

option

Directory where models will be downloaded.

Type: Path
Default: System cache directory

--force

flag

Force download even if model already exists.

Type: Boolean
Default: false

-q, --quiet

flag

Minimal output mode.

Type: Boolean
Default: false

When enabled, only prints the output directory path.
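As with the models download command, --quiet makes the output easy to capture in a script. The following sketch wraps the download in a hypothetical helper so the printed directory can be stored in a variable. It assumes the quiet output is the directory path alone, as described above.

```shell
# Hypothetical helper: download one HuggingFace repository in quiet mode.
# With --quiet, only the output directory path is expected on stdout.
fetch_hf_repo() {
  repo_id="$1"  # repository ID in org-name/repo-name form
  docling tools models download-hf-repo "$repo_id" --quiet
}
```

A script could then run HF_DIR=$(fetch_hf_repo docling-project/docling-models) and reuse that path in later steps.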
