System Requirements

Before installing olmOCR, ensure your system meets these requirements:

Hardware Requirements

GPU

Recent NVIDIA GPU with CUDA support. Tested GPUs:
  • RTX 4090
  • L40S
  • A100
  • H100

Storage

Minimum 30GB of free disk space. Required for:
  • Model weights
  • Temporary processing files
  • Output storage
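You can confirm you have enough free space before installing. A minimal sketch using only Python's standard library (the 30 GB threshold comes from the requirement above):

```python
import shutil

def free_gb(path: str = ".") -> float:
    """Return free disk space at `path` in gigabytes."""
    return shutil.disk_usage(path).free / 1e9

# olmOCR needs roughly 30 GB free for model weights and scratch files.
if free_gb() < 30:
    print("Warning: less than 30 GB free; installation may fail.")
```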

Software Requirements

  • Python: 3.11 or higher (pyproject.toml:20)
  • CUDA: Compatible version for your GPU
  • Operating System: Ubuntu/Debian (other Linux distributions may work)
GPU is required for running inference. CPU-only mode is not supported for the inference pipeline.

Installation Steps

1. Install System Dependencies

Install poppler-utils and additional fonts for rendering PDF images:
sudo apt-get update
sudo apt-get install poppler-utils ttf-mscorefonts-installer msttcorefonts fonts-crosextra-caladea fonts-crosextra-carlito gsfonts lcdf-typetools
These fonts are required for properly rendering PDF pages to images.
2. Create Conda Environment

Set up a dedicated conda environment with Python 3.11:
conda create -n olmocr python=3.11
conda activate olmocr
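To confirm the environment satisfies the Python 3.11 requirement (pyproject.toml:20), a small check along these lines can help; the helper name is illustrative:

```python
import sys

def meets_python_requirement(version=sys.version_info) -> bool:
    """True if the interpreter is Python 3.11 or newer."""
    return tuple(version[:2]) >= (3, 11)

print(meets_python_requirement())
```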
3. Clone and Install olmOCR

Clone the repository and install the package:
git clone https://github.com/allenai/olmocr.git
cd olmocr
pip install -e .
This installs all core dependencies including:
  • pypdf (>=5.2.0) - PDF parsing
  • pypdfium2 - PDF rendering
  • torch (>=2.5.1) - Deep learning framework
  • transformers (>=4.46.2) - Model loading
  • Pillow - Image processing
  • And more (see pyproject.toml:21-42 for full list)
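A quick way to confirm the core dependencies are importable without actually loading the heavy packages is `importlib.util.find_spec`. Note that the import names differ from the package names in places (Pillow imports as `PIL`); the module list here is an assumption based on the dependencies above:

```python
from importlib.util import find_spec

# Import names for the core dependencies listed above.
CORE_MODULES = ["pypdf", "pypdfium2", "torch", "transformers", "PIL"]

def missing_modules(names):
    """Return the subset of `names` that cannot be imported."""
    return [n for n in names if find_spec(n) is None]

print(missing_modules(CORE_MODULES))  # [] means everything is installed
```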
4. Install SGLang for GPU Inference (Recommended)

For GPU-accelerated inference, install sglang with flashinfer support:
pip install sgl-kernel==0.0.3.post1 --force-reinstall --no-deps
pip install "sglang[all]==0.4.2" --find-links https://flashinfer.ai/whl/cu124/torch2.4/flashinfer/
Although listed as optional, this step is effectively required for production use: without sglang, you cannot run GPU inference, and the inference pipeline does not support CPU-only mode.
5. Verify Installation

Verify your installation by checking the package version:
python -c "import olmocr; print(olmocr.VERSION)"

Optional Dependencies

Development Tools

If you plan to contribute to olmOCR, install the development dependencies:
pip install -e ".[dev]"
This includes:
  • Testing tools (pytest, pytest-cov)
  • Code formatters (black, isort, ruff)
  • Type checking (mypy)
  • Documentation tools (Sphinx)

Training Dependencies

For fine-tuning models, install the training extras:
pip install -e ".[train]"
This adds:
  • accelerate - Distributed training
  • peft - Parameter-efficient fine-tuning
  • wandb - Experiment tracking
  • datasets - Data loading

GPU Configuration

Memory Requirements

The pipeline automatically adjusts GPU memory usage based on available VRAM:
  • < 60GB VRAM: Uses 80% memory fraction for KV cache
  • >= 60GB VRAM: Uses default memory allocation
The memory fraction is automatically configured in pipeline.py:508 based on your GPU.
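The VRAM rule above can be sketched as a small helper. This mirrors the described behavior (an 80% KV-cache fraction below 60 GB), not the exact code in pipeline.py; the 0.9 default is an assumption for illustration:

```python
def kv_cache_mem_fraction(vram_gb: float, default: float = 0.9) -> float:
    """Pick a memory fraction from total VRAM, per the rule above.

    GPUs with < 60 GB VRAM get a reduced 0.80 fraction to leave
    headroom; larger GPUs keep the default allocation.
    """
    return 0.80 if vram_gb < 60 else default

# e.g. an RTX 4090 (24 GB) gets 0.80, an 80 GB A100 keeps the default.
```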

Checking GPU Availability

Verify that PyTorch can detect your GPU:
python -c "import torch; print(f'CUDA available: {torch.cuda.is_available()}'); print(f'GPU count: {torch.cuda.device_count()}')"
Expected output:
CUDA available: True
GPU count: 1

Troubleshooting

Poppler Not Found

If you get an error about poppler not being found:
# Verify poppler is installed
pdftoppm -v
If not found, reinstall:
sudo apt-get install --reinstall poppler-utils

CUDA Version Mismatch

If you encounter CUDA version errors, ensure your PyTorch installation matches your CUDA version:
# Check CUDA version
nvidia-smi

# Install PyTorch with matching CUDA version
# Visit https://pytorch.org for the correct command
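As a rule of thumb, the driver's supported CUDA version (the "CUDA Version" shown by `nvidia-smi`) must be at least the toolkit version PyTorch was built against (`torch.version.cuda`). A hedged, string-only sketch of that comparison (the helper name is hypothetical):

```python
def cuda_compatible(torch_cuda: str, driver_cuda: str) -> bool:
    """Rough compatibility check: the driver's CUDA version must be
    at least the version PyTorch was built against. Assumes
    minor-version compatibility within a CUDA major release."""
    build = tuple(int(x) for x in torch_cuda.split("."))
    driver = tuple(int(x) for x in driver_cuda.split("."))
    return driver >= build

# e.g. cuda_compatible(torch.version.cuda, "12.6")
```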

Font Rendering Issues

If PDFs render with missing characters, ensure all fonts are installed:
fc-cache -f -v

Next Steps

Now that you have olmOCR installed, you’re ready to process your first PDF!

Quickstart Guide

Learn how to convert PDFs in minutes