Introduction to Retto

Retto is a high-performance OCR (Optical Character Recognition) SDK built in Rust that provides PaddleOCR inference capabilities with WebAssembly support. It enables fast, accurate text detection and recognition across multiple platforms including native Rust applications, command-line tools, and web browsers.

What is Retto?

Retto is a complete OCR solution that implements the PaddleOCR v4 pipeline, consisting of three stages:

Text Detection - Locates text regions in images
Text Classification - Determines text orientation (0°, 90°, 180°, 270°)
Text Recognition - Recognizes the actual text content
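
As a rough illustration of how the three stages chain together, the following is a minimal sketch that uses stand-in closures for each stage. The names `Region`, `Orientation`, `Line`, and `run_pipeline` are hypothetical and are not part of Retto's API; the sketch only shows the data flow from detection through classification to recognition.

```rust
/// The four orientation classes listed above (hypothetical type, not Retto's).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Orientation {
    Deg0,
    Deg90,
    Deg180,
    Deg270,
}

impl Orientation {
    /// Orientation expressed in degrees.
    pub fn degrees(self) -> u32 {
        match self {
            Orientation::Deg0 => 0,
            Orientation::Deg90 => 90,
            Orientation::Deg180 => 180,
            Orientation::Deg270 => 270,
        }
    }
}

/// Axis-aligned box around a detected text region (hypothetical type).
#[derive(Debug, Clone, PartialEq)]
pub struct Region {
    pub x: u32,
    pub y: u32,
    pub w: u32,
    pub h: u32,
}

/// Final result for one region after all three stages have run.
#[derive(Debug, Clone, PartialEq)]
pub struct Line {
    pub region: Region,
    pub orientation: Orientation,
    pub text: String,
}

/// Runs detection once, then classification and recognition per region.
pub fn run_pipeline<D, C, R>(image: &[u8], detect: D, classify: C, recognize: R) -> Vec<Line>
where
    D: Fn(&[u8]) -> Vec<Region>,
    C: Fn(&[u8], &Region) -> Orientation,
    R: Fn(&[u8], &Region, Orientation) -> String,
{
    detect(image)
        .into_iter()
        .map(|region| {
            let orientation = classify(image, &region);
            let text = recognize(image, &region, orientation);
            Line { region, orientation, text }
        })
        .collect()
}
```

In a real pipeline each closure would wrap a model invocation; here they can be any function with a matching signature, which keeps the sketch easy to test with stubs.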

The project provides three main components:

retto-core - Core Rust library with processor implementations
retto-cli - Command-line tool for batch OCR processing
retto-wasm - WebAssembly package for browser-based OCR

Why Use Retto?

High Performance

Built in Rust for speed and memory safety. Inference runs on ONNX Runtime, with CPU, CUDA, and DirectML execution providers.

Cross-Platform

Run OCR natively on desktop, in command-line tools, or directly in web browsers via WebAssembly - all from the same codebase.

Multiple Backends

Choose the execution provider that matches your hardware: CPU, CUDA (NVIDIA GPUs), or DirectML (Windows).

Flexible Model Loading

Load models from local files or memory buffers, or download them automatically from Hugging Face Hub.
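
One way to picture the three loading options is as a single enum that resolves to model bytes. The sketch below is an assumption made for illustration: `ModelSource` and `load` are hypothetical names, not Retto's types, and the Hub variant deliberately returns an error instead of performing a download.

```rust
use std::fs;
use std::io;
use std::path::PathBuf;

/// Where a model's bytes come from (hypothetical type, not Retto's API).
#[derive(Debug, Clone)]
pub enum ModelSource {
    /// A model file on the local filesystem.
    File(PathBuf),
    /// A model already held in memory.
    Bytes(Vec<u8>),
    /// A model identified by repository and file name on a model hub.
    Hub { repo: String, file: String },
}

impl ModelSource {
    /// Resolves the source to raw model bytes.
    pub fn load(&self) -> io::Result<Vec<u8>> {
        match self {
            ModelSource::File(path) => fs::read(path),
            ModelSource::Bytes(bytes) => Ok(bytes.clone()),
            // Downloading is out of scope for this sketch, so report it as unsupported.
            ModelSource::Hub { repo, file } => Err(io::Error::new(
                io::ErrorKind::Unsupported,
                format!("download of {repo}/{file} is not implemented in this sketch"),
            )),
        }
    }
}
```

Funnelling every source into one `load` call means the code that builds an inference session does not need to know where the bytes came from.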

Key Features

PaddleOCR v4 Models - Implements the PaddleOCR v4 pipeline with detection, classification, and recognition
ONNX Runtime Integration - Leverages ONNX Runtime for efficient model inference
Streaming Results - Process OCR stages asynchronously with callback support
Batch Processing - Process multiple images in parallel
Type Safety - Full Rust type safety with comprehensive error handling
Serialization Support - Built-in JSON serialization for easy integration
Hardware Acceleration - Optional CUDA and DirectML support for GPU acceleration
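
To make the "Streaming Results" item more concrete, here is a minimal sketch of callback-driven stage reporting. It is not Retto's streaming API: `StageEvent` and `run_streaming` are hypothetical, and the "regions" are stubbed as `(needs_rotation, text)` pairs so that the example runs without any model.

```rust
/// One event per completed pipeline stage (hypothetical type).
#[derive(Debug, Clone, PartialEq)]
pub enum StageEvent {
    Detected { regions: usize },
    Classified { rotated: usize },
    Recognized { lines: Vec<String> },
}

/// Walks the three stages in order and reports each one through `on_event`
/// as soon as it finishes, instead of returning everything at the end.
pub fn run_streaming<F>(regions: &[(bool, &str)], mut on_event: F)
where
    F: FnMut(StageEvent),
{
    // Stage 1: detection reports how many regions were found.
    on_event(StageEvent::Detected { regions: regions.len() });

    // Stage 2: classification reports how many regions need rotating.
    let rotated = regions.iter().filter(|r| r.0).count();
    on_event(StageEvent::Classified { rotated });

    // Stage 3: recognition reports the text of every region.
    let lines: Vec<String> = regions.iter().map(|r| r.1.to_string()).collect();
    on_event(StageEvent::Recognized { lines });
}
```

A caller could, for example, draw bounding boxes when the detection event arrives and fill in the text later, once the recognition event is delivered.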

Architecture Overview

Retto processes images through a pipeline architecture:

Image Helper - Handles image loading, resizing, and preprocessing
Detection Processor - Uses a CNN model to detect text bounding boxes
Classification Processor - Determines text orientation for each detected region
Recognition Processor - Converts text regions into actual text using CTC decoding
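
The CTC decoding mentioned for the recognition step can be sketched as a greedy decoder: take the best class at each time step, collapse consecutive repeats, and drop blanks. This is a generic illustration rather than Retto's implementation. The function name `ctc_greedy_decode` is hypothetical, and the sketch assumes the blank class sits at index 0, so class `i` maps to `alphabet[i - 1]`.

```rust
/// Index of the CTC blank class (an assumption of this sketch).
const BLANK: usize = 0;

/// Greedy CTC decoding over per-time-step class scores.
///
/// `scores[t][c]` is the score of class `c` at time step `t`.
pub fn ctc_greedy_decode(scores: &[Vec<f32>], alphabet: &[char]) -> String {
    let mut out = String::new();
    let mut prev: Option<usize> = None;

    for step in scores {
        // Pick the highest-scoring class at this time step; skip empty steps.
        let best = step
            .iter()
            .enumerate()
            .max_by(|a, b| a.1.total_cmp(b.1))
            .map(|(index, _)| index);

        if let Some(best) = best {
            // Emit only when the class changes and is not the blank.
            if Some(best) != prev && best != BLANK {
                if let Some(ch) = alphabet.get(best - 1) {
                    out.push(*ch);
                }
            }
            prev = Some(best);
        }
    }
    out
}
```

Because a blank resets the "previous class", a doubled letter survives decoding as long as a blank separates the two occurrences, which is how CTC represents repeated characters.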

Session-Based API

Retto uses a session-based design where you create a RettoSession with your desired configuration:

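let cfg = RettoSessionConfig {
    // Run inference on the CPU, using the default v4 models from Hugging Face Hub.
    worker_config: RettoOrtWorkerConfig {
        device: RettoOrtWorkerDevice::CPU,
        models: RettoOrtWorkerModelProvider::from_hf_hub_v4_default(),
    },
    // Bounds on the image side length.
    max_side_len: 2000,
    min_side_len: 30,
    // Keep the defaults for all remaining fields.
    ..Default::default()
};

// Creating the session can fail, so the error is propagated with `?`.
let mut session = RettoSession::new(cfg)?;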

The session manages model loading, resource allocation, and provides methods for synchronous and streaming inference.
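
The `max_side_len` and `min_side_len` fields in the configuration above are not explained on this page. One plausible reading, stated here purely as an assumption, is that they bound the image's side lengths during preprocessing. The hypothetical helper `fit_side_lengths` below sketches that arithmetic: shrink the image when its longest side exceeds the maximum, otherwise enlarge it when its shortest side is below the minimum, keeping the aspect ratio in both cases.

```rust
/// Computes a target size that respects the side-length bounds while
/// preserving the aspect ratio. This is an assumed interpretation of the
/// two config fields, not Retto's actual resizing logic.
pub fn fit_side_lengths(
    width: u32,
    height: u32,
    min_side_len: u32,
    max_side_len: u32,
) -> (u32, u32) {
    // Degenerate images cannot be scaled meaningfully; return them unchanged.
    if width == 0 || height == 0 {
        return (width, height);
    }

    let (w, h) = (f64::from(width), f64::from(height));
    let longest = w.max(h);
    let shortest = w.min(h);

    let scale = if longest > f64::from(max_side_len) {
        // Too large: shrink so the longest side equals the maximum.
        f64::from(max_side_len) / longest
    } else if shortest < f64::from(min_side_len) {
        // Too small: enlarge so the shortest side equals the minimum.
        f64::from(min_side_len) / shortest
    } else {
        1.0
    };

    // Round to whole pixels and never return a zero-length side.
    let scaled = |side: f64| (side * scale).round().max(1.0) as u32;
    (scaled(w), scaled(h))
}
```

With the values from the configuration above (`min_side_len: 30`, `max_side_len: 2000`), a 4000 x 1000 image would map to 2000 x 500 under this reading, and an 800 x 600 image would be left unchanged.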

Use Cases

Document Digitization

Extract text from scanned documents, receipts, and forms for archival and searchability.

Web Applications

Build browser-based OCR tools without server-side processing using the WebAssembly package.

Batch Processing

Process large volumes of images efficiently using the CLI tool with parallel processing.
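
How the CLI parallelizes its work is not described here. As a generic sketch of the idea, the hypothetical helper `process_batch` below splits a batch into chunks and processes each chunk on its own scoped thread from the standard library, returning the results in input order. It assumes Rust 1.63 or later for `std::thread::scope`.

```rust
use std::thread;

/// Applies `f` to every item, using up to `workers` threads, and returns the
/// results in the same order as the input. Hypothetical helper, not Retto's API.
pub fn process_batch<T, R, F>(items: &[T], workers: usize, f: F) -> Vec<R>
where
    T: Sync,
    R: Send,
    F: Fn(&T) -> R + Sync,
{
    if items.is_empty() {
        return Vec::new();
    }

    // Ceiling division so that at most `workers` chunks are created.
    let workers = workers.max(1);
    let chunk_len = (items.len() + workers - 1) / workers;
    let f = &f;

    thread::scope(|scope| {
        // Spawn one thread per chunk; collect the handles before joining
        // so that all threads are running concurrently.
        let handles: Vec<_> = items
            .chunks(chunk_len)
            .map(|chunk| scope.spawn(move || chunk.iter().map(f).collect::<Vec<R>>()))
            .collect();

        // Joining in spawn order keeps results aligned with the input.
        handles
            .into_iter()
            .flat_map(|handle| handle.join().expect("worker thread panicked"))
            .collect()
    })
}
```

In an OCR setting, `T` could be an image path and `f` a function that runs recognition on one image; chunking keeps the thread count bounded regardless of batch size.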

Real-time OCR

Integrate OCR capabilities into desktop applications with low-latency inference.

Getting Started

Choose Your Platform

Decide whether you need the Rust library, CLI tool, or WebAssembly package based on your use case.

Install Retto

Follow the installation guide to add Retto to your project.

Run Your First OCR

Check out the quickstart guide to run your first OCR in minutes.

Retto is currently in active development (v0.1.5). The API may change in future versions.

License

Retto is licensed under the Apache License 2.0, making it suitable for both open source and commercial projects.

Next Steps

Installation

Quickstart
