This page provides a comprehensive reference for all CLI options available in Docling.

Global Options

These options are available for the main docling convert command.

Input/Output Options

source
argument
required
Input files to convert (PDF or any other format listed under --from). Can be local file paths, directory paths, or URLs. Multiple sources can be specified.
Examples:
  • docling convert document.pdf
  • docling convert /path/to/docs/
  • docling convert https://example.com/file.pdf
--from
option
Specify input formats to convert from.
Type: Multiple values allowed
Default: All formats
Available: pdf, image, docx, pptx, xlsx, html, md, latex, audio, mets_gbs
--to
option
Specify output formats.
Type: Multiple values allowed
Default: markdown
Available: json, yaml, html, html_split_page, markdown, text, doctags, vtt
Example:
docling convert doc.pdf --to json --to markdown
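When the list of output formats comes from a variable or a config file, the repeated --to options can be assembled in a loop. The following is a minimal POSIX shell sketch; the format list and paths are illustrative, and the command is only printed (a dry run) rather than executed.

```shell
# Start with the base command; /path/to/docs/ is a placeholder source.
set -- docling convert /path/to/docs/

# Append one --to option per requested output format.
for fmt in json markdown html; do
  set -- "$@" --to "$fmt"
done

set -- "$@" --output ./results

# Dry run: print the assembled command. Replace `echo "$@"` with `"$@"` to run it.
echo "$@"
```

Building the argument list with `set --` keeps each value a separate word, so paths containing spaces survive intact when the command is eventually executed.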
--output
option
Output directory where results are saved.
Type: Path
Default: . (current directory)
Example:
docling convert doc.pdf --output ./results
--headers
option
Specify HTTP request headers used when fetching URL input sources.
Type: JSON string
Example:
docling convert https://api.example.com/doc.pdf \
  --headers '{"Authorization": "Bearer token"}'
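To avoid typing a token inline, the JSON string can be built from an environment variable. This is a sketch under assumptions: `API_TOKEN` is a hypothetical variable name, and the token is assumed to contain no double quotes or backslashes (which would need JSON escaping). The command is printed, not executed.

```shell
# Hypothetical variable; in practice, export it from a secret store.
API_TOKEN="${API_TOKEN:-example-token}"

# printf fills the token into a fixed JSON template.
headers=$(printf '{"Authorization": "Bearer %s"}' "$API_TOKEN")

# Dry run: drop the leading "echo" to execute the command.
echo docling convert https://api.example.com/doc.pdf --headers "$headers"
```

Quoting `"$headers"` matters when the command is actually run: it passes the whole JSON object as a single argument.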

Image Export Options

--image-export-mode
option
Image export mode for documents (applies to JSON, Markdown, and HTML outputs).
Type: Enum
Default: embedded
Options:
  • placeholder - Only mark image positions in output
  • embedded - Embed images as base64 encoded strings
  • referenced - Export images as PNG files and reference them
--show-layout
flag
If enabled, page images will show bounding boxes of detected items.
Type: Boolean
Default: false

Pipeline Configuration

--pipeline
option
Choose the processing pipeline for PDF or image files.
Type: Enum
Default: standard
Options:
  • standard - Traditional document processing pipeline
  • vlm - Vision-Language Model based pipeline
Example:
docling convert doc.pdf --pipeline vlm --vlm-model granite_docling
--vlm-model
option
Choose the VLM (Vision-Language Model) preset to use with PDF or image files.
Type: String
Default: granite_docling
Available presets: granite_docling, smol_docling, and others
Only applicable when --pipeline vlm is set.
--asr-model
option
Choose the ASR (Automatic Speech Recognition) model for audio/video files.
Type: Enum
Default: whisper_tiny
Available models:
  • Auto-select: whisper_tiny, whisper_base, whisper_small, whisper_medium, whisper_large, whisper_turbo
  • MLX variants: whisper_tiny_mlx, whisper_base_mlx, whisper_small_mlx, whisper_medium_mlx, whisper_large_mlx, whisper_turbo_mlx
  • Native variants: whisper_tiny_native, whisper_base_native, whisper_small_native, whisper_medium_native, whisper_large_native, whisper_turbo_native

OCR Options

--ocr / --no-ocr
flag
Enable or disable OCR for bitmap content.
Type: Boolean
Default: true
Example:
docling convert scan.pdf --ocr
--force-ocr
flag
Replace any existing text with OCR-generated text over the full content.
Type: Boolean
Default: false
Use this when the document's existing text layer is of poor quality.
--ocr-engine
option
The OCR engine to use.
Type: String
Default: auto
Available engines: auto, tesseract_cli, tesseract, easyocr, rapidocr
Additional engines may be available with --allow-external-plugins.
Example:
docling convert scan.pdf --ocr-engine easyocr
--ocr-lang
option
Comma-separated list of languages for the OCR engine.
Type: String
Note: Each OCR engine has different language code formats.
Example:
docling convert doc.pdf --ocr-lang "eng,fra,deu"
--psm
option
Page Segmentation Mode for the OCR engine.
Type: Integer (0-13)
Applies to: Tesseract engines only
See Tesseract documentation for PSM mode details.
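The OCR options above are typically combined. The sketch below assembles one such combination for a scanned document using the Tesseract engine; the file name is illustrative, and the command is printed rather than run. In Tesseract, language codes are three-letter identifiers such as eng and deu, and PSM 6 treats the page as a single uniform block of text.

```shell
# Assemble the OCR-related options one per line for readability.
set -- docling convert scan.pdf \
  --force-ocr \
  --ocr-engine tesseract \
  --ocr-lang "eng,deu" \
  --psm 6

# Dry run: replace `echo "$@"` with `"$@"` to execute.
echo "$@"
```

When switching to another engine, check that engine's language code format and drop --psm, which applies to Tesseract engines only.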

Table Processing Options

--tables / --no-tables
flag
Enable or disable table structure extraction.
Type: Boolean
Default: true
--table-mode
option
The mode to use in the table structure model.
Type: Enum
Default: accurate
Options:
  • fast - Faster processing, less accurate
  • accurate - Slower processing, more accurate

PDF Backend Options

--pdf-backend
option
The PDF backend to use for processing.
Type: Enum
Default: docling_parse
Options:
  • docling_parse - Recommended backend (default)
  • pypdfium2 - Alternative backend
--pdf-password
option
Password for protected PDF documents.
Type: String
Example:
docling convert protected.pdf --pdf-password "secret123"

Enrichment Options

--enrich-code
flag
Enable code enrichment model in the pipeline.
Type: Boolean
Default: false
Improves detection and formatting of code blocks.
--enrich-formula
flag
Enable formula enrichment model in the pipeline.
Type: Boolean
Default: false
Improves detection and rendering of mathematical formulas.
--enrich-picture-classes
flag
Enable picture classification enrichment model.
Type: Boolean
Default: false
Classifies images into categories (charts, diagrams, photos, etc.).
--enrich-picture-description
flag
Enable picture description model.
Type: Boolean
Default: false
Generates textual descriptions for images.
--enrich-chart-extraction
flag
Enable chart extraction to convert bar, pie, and line charts to tabular format.
Type: Boolean
Default: false
Example:
docling convert report.pdf --enrich-chart-extraction

Performance Options

--num-threads
option
Number of threads to use for processing.
Type: Integer
Default: 4
Example:
docling convert doc.pdf --num-threads 8
--device
option
Accelerator device to use for model inference.
Type: Enum
Default: auto
Options:
  • auto - Automatically select best available device
  • cpu - Use CPU only
  • cuda - Use NVIDIA GPU
  • mps - Use Apple Metal Performance Shaders (Mac)
--page-batch-size
option
Number of pages processed in one batch.
Type: Integer
Default: System default (configurable)
Larger values may improve throughput but use more memory.
--document-timeout
option
Timeout for processing each document, in seconds.
Type: Float
Example:
docling convert doc.pdf --document-timeout 300
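For batch runs, the performance options can be derived from the machine instead of hard-coded. The sketch below is one possible approach, not a recommendation from the Docling authors: it reads the online CPU count with `getconf`, falls back to the documented default of 4, and pairs it with the faster table mode. The source directory is a placeholder and the command is only printed.

```shell
# Detect the number of online CPUs; fall back to 4 if detection fails.
threads=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 4)
case "$threads" in
  ''|*[!0-9]*) threads=4 ;;   # non-numeric result: use the default
esac

set -- docling convert /path/to/docs/ \
  --num-threads "$threads" \
  --device auto \
  --document-timeout 300 \
  --table-mode fast

# Dry run: replace `echo "$@"` with `"$@"` to execute.
echo "$@"
```

Using every core is not always fastest; on shared machines a smaller thread count may be the better choice.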

Model and Plugin Options

--artifacts-path
option
Location of model artifacts for offline use.
Type: Path
Example:
# First download models
docling tools models download --output-dir ./models

# Then use offline
docling convert doc.pdf --artifacts-path ./models
--enable-remote-services
flag
Must be enabled when using models that connect to remote services.
Type: Boolean
Default: false
--allow-external-plugins
flag
Must be enabled for loading modules from third-party plugins.
Type: Boolean
Default: false
Example:
docling convert doc.pdf --allow-external-plugins --ocr-engine custom_plugin
--show-external-plugins
flag
List third-party plugins available when --allow-external-plugins is set.
Type: Boolean
Displays available OCR, layout, and table extraction plugins.
Example:
docling convert --show-external-plugins

Error Handling Options

--abort-on-error / --no-abort-on-error
flag
Control whether processing should abort on first error.
Type: Boolean
Default: false (continue on errors)
When disabled, failed documents are logged but processing continues.
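One way to use this flag pair is to fail fast in automation while staying lenient during local runs. The sketch below assumes the common convention that CI systems set `CI=true`; that variable is not part of Docling. The command is printed, not executed.

```shell
# Assumption: CI=true marks an automated run.
if [ "${CI:-false}" = "true" ]; then
  error_flag=--abort-on-error      # stop at the first failed document
else
  error_flag=--no-abort-on-error   # keep going; failures are logged
fi

# Dry run: drop the leading "echo" to execute the command.
echo docling convert /path/to/docs/ "$error_flag" --output ./results
```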

Verbosity and Logging Options

-v, --verbose
flag
Set verbosity level for logging.
Type: Count (repeatable)
Levels:
  • No flag: WARNING level
  • -v: INFO level
  • -vv: DEBUG level
Example:
docling convert doc.pdf -vv  # Debug logging
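In wrapper scripts, the levels above can be mapped from a numeric setting. This is a minimal sketch; `verbosity_flag` and `LOG_LEVEL` are hypothetical names introduced for illustration, and the command is only printed.

```shell
# Print the flag for a numeric level: 0 -> nothing, 1 -> -v, 2 or more -> -vv.
verbosity_flag() {
  case "$1" in
    0) ;;
    1) printf '%s' '-v' ;;
    *) printf '%s' '-vv' ;;
  esac
}

LOG_LEVEL=2   # hypothetical setting
flag=$(verbosity_flag "$LOG_LEVEL")

set -- docling convert doc.pdf
# Only append the flag when there is one, to avoid passing an empty argument.
if [ -n "$flag" ]; then
  set -- "$@" "$flag"
fi

# Dry run: replace `echo "$@"` with `"$@"` to execute.
echo "$@"
```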

Debug Visualization Options

--debug-visualize-cells
flag
Enable debug output which visualizes PDF cells.
Type: Boolean
Default: false
--debug-visualize-ocr
flag
Enable debug output which visualizes OCR cells.
Type: Boolean
Default: false
--debug-visualize-layout
flag
Enable debug output which visualizes layout clusters.
Type: Boolean
Default: false
--debug-visualize-tables
flag
Enable debug output which visualizes table cells.
Type: Boolean
Default: false
Example:
docling convert doc.pdf --debug-visualize-tables --output ./debug

Profiling Options

--profiling
flag
Enable profiling to summarize timing details for all conversion stages.
Type: Boolean
Default: false
Displays a detailed timing table after conversion.
--save-profiling
flag
Save profiling summaries to JSON files.
Type: Boolean
Default: false
Example:
docling convert doc.pdf --profiling --save-profiling

Information Options

--version
flag
Show version information and exit.
Output includes:
  • Docling version
  • Docling Core version
  • Docling IBM Models version
  • Docling Parse version
  • Python version and implementation
  • Platform information
Example:
docling convert --version
--logo
flag
Display the Docling ASCII art logo and exit.
Example:
docling convert --logo

Model Download Options

These options apply to the docling tools models download command.
models
argument
Specific models to download.
Type: Multiple values allowed
Available models:
  • layout - Layout analysis model
  • tableformer - Table structure extraction model
  • code_formula - Code and formula detection model
  • picture_classifier - Picture classification model
  • smolvlm - Small VLM model
  • granitedocling - Granite Docling VLM
  • granitedocling_mlx - Granite Docling for MLX
  • smoldocling - Small Docling VLM
  • smoldocling_mlx - Small Docling for MLX
  • granite_vision - Granite Vision model
  • granite_chart_extraction - Chart extraction model
  • rapidocr - RapidOCR model
  • easyocr - EasyOCR model
Default models (downloaded when no specific models specified):
  • layout, tableformer, code_formula, picture_classifier, rapidocr
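Model names are passed as positional arguments. The sketch below downloads a subset and then points a conversion at the same directory; `MODEL_DIR` is a hypothetical variable, and passing --no-ocr reflects an assumption that a run without a downloaded OCR model should not attempt OCR. Whether a given subset is sufficient depends on which pipeline features are enabled. Both commands are printed, not executed.

```shell
MODEL_DIR=./models   # hypothetical target directory

# Download only the layout and table structure models.
set -- docling tools models download layout tableformer --output-dir "$MODEL_DIR"
echo "$@"   # dry run: replace with "$@" to execute

# Convert offline using the downloaded artifacts; OCR is disabled here
# because no OCR model was downloaded above (an assumption of this sketch).
set -- docling convert doc.pdf --no-ocr --artifacts-path "$MODEL_DIR"
echo "$@"   # dry run: replace with "$@" to execute
```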
-o, --output-dir
option
Directory where models will be downloaded.
Type: Path
Default: System cache directory
Example:
docling tools models download --output-dir /opt/docling-models
--force
flag
Force download even if models already exist.
Type: Boolean
Default: false
--all
flag
Download all available models.
Type: Boolean
Default: false
Note: Mutually exclusive with specifying individual models
Example:
docling tools models download --all
-q, --quiet
flag
Minimal output mode.
Type: Boolean
Default: false
When enabled, only prints the output directory path. Useful for scripts.
Example:
MODEL_DIR=$(docling tools models download --quiet)

HuggingFace Download Options

These options apply to the docling tools models download-hf-repo command.
models
argument
required
HuggingFace repository IDs to download.
Type: Multiple values allowed
Format: org-name/repo-name
Example:
docling tools models download-hf-repo docling-project/docling-models
-o, --output-dir
option
Directory where models will be downloaded.
Type: Path
Default: System cache directory
--force
flag
Force download even if model already exists.
Type: Boolean
Default: false
-q, --quiet
flag
Minimal output mode.
Type: Boolean
Default: false
When enabled, only prints the output directory path.
