RettoSession

Overview

RettoSession<W: RettoWorker> is the primary interface for executing OCR operations in Retto. It manages the complete OCR pipeline including text detection, classification, and recognition.

Type Definition

pub struct RettoSession<W: RettoWorker> {
    worker: W,
    rec_character: RecCharacter,
    config: RettoSessionConfig<W>,
}

The session is generic over a RettoWorker type, which handles the actual model inference. Common implementations include RettoOrtWorker for ONNX Runtime.

Constructor

new()

Creates a new OCR session with the specified configuration.

pub fn new(cfg: RettoSessionConfig<W>) -> RettoResult<Self>

cfg: RettoSessionConfig<W> (required) - Configuration for the session, including the worker configuration and processor settings.

Returns

RettoResult<RettoSession<W>> - A new session instance, or an error if initialization fails (e.g., model loading errors).

Example

use retto_core::prelude::*;

let cfg = RettoSessionConfig {
    worker_config: RettoOrtWorkerConfig::default(),
    max_side_len: 2000,
    min_side_len: 30,
    det_processor_config: DetProcessorConfig::default(),
    cls_processor_config: ClsProcessorConfig::default(),
    rec_processor_config: RecProcessorConfig::default(),
};

// run() and run_stream() take &mut self, so the binding must be mutable.
let mut session = RettoSession::new(cfg)?;

Methods

run()

Executes the complete OCR pipeline on an input image and returns all results.

pub fn run(&mut self, input: impl AsRef<[u8]>) -> RettoResult<RettoWorkerResult>

input: impl AsRef<[u8]> (required) - Encoded image bytes in any format supported by the image crate (PNG, JPEG, etc.).

Returns

RettoResult<RettoWorkerResult> - Combined results from all three stages: detection, classification, and recognition.

Processing Pipeline

The run() method executes three sequential stages:

1. Detection: Locates text regions in the image
2. Classification: Determines text orientation (0° or 180°)
3. Recognition: Extracts text content from each region

Example

let image_data = std::fs::read("input.png")?;
let result = session.run(image_data)?;

// Access detection results
for det in &result.det_result.0 {
    println!("Text box: {:?}, score: {}", det.boxes, det.score);
}

// Access classification results
for cls in &result.cls_result.0 {
    println!("Orientation: {}°, score: {}", cls.label.label, cls.label.score);
}

// Access recognition results
for rec in &result.rec_result.0 {
    println!("Text: {}, confidence: {}", rec.text, rec.score);
}

run_stream()

Executes the OCR pipeline and sends each stage's result through a channel as that stage completes.

pub fn run_stream(
    &mut self,
    input: impl AsRef<[u8]>,
    sender: mpsc::Sender<RettoWorkerStageResult>,
) -> RettoResult<()>

input: impl AsRef<[u8]> (required) - Encoded image bytes in any format supported by the image crate.

sender: mpsc::Sender<RettoWorkerStageResult> (required) - Channel sender through which stage results are delivered as each stage completes.

Returns

RettoResult<()> - Ok(()) if the pipeline completes successfully. Results are delivered through the channel rather than the return value.

Use Case

The streaming API is useful when you need to:

Process results as soon as each stage completes
Display progressive results in a UI
Implement early termination based on intermediate results

Example

use std::sync::mpsc;

let (tx, rx) = mpsc::channel();
let image_data = std::fs::read("input.png")?;

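// Run in a separate thread so the receive loop below is not blocked.
// Moving `session` into the thread compiles only if the session type is Send
// for the chosen worker; see Thread Safety.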
std::thread::spawn(move || {
    session.run_stream(image_data, tx).unwrap();
});

// Receive results as they arrive
for stage_result in rx {
    match stage_result {
        RettoWorkerStageResult::Det(det) => {
            println!("Detection complete: {} regions found", det.0.len());
        }
        RettoWorkerStageResult::Cls(cls) => {
            println!("Classification complete: {} results", cls.0.len());
        }
        RettoWorkerStageResult::Rec(rec) => {
            println!("Recognition complete: {} texts extracted", rec.0.len());
        }
    }
}
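
The early-termination use case can be sketched with the standard library alone. The block below is a minimal illustration, not Retto code: StageResult, fake_pipeline, and collect_texts are hypothetical stand-ins, and the payloads are reduced to counts and strings. It assumes a producer that gives up when send fails; whether run_stream() behaves that way when the receiver is dropped is not stated on this page.

```rust
use std::sync::mpsc;
use std::thread;

// Hypothetical stand-in for RettoWorkerStageResult with simplified payloads.
#[derive(Debug)]
enum StageResult {
    Det(usize),       // number of detected regions
    Cls(usize),       // number of classified regions
    Rec(Vec<String>), // recognized texts
}

// Stand-in producer: sends one message per stage and returns early with an
// error if the receiver has been dropped (send fails in that case).
fn fake_pipeline(
    regions: usize,
    tx: mpsc::Sender<StageResult>,
) -> Result<(), mpsc::SendError<StageResult>> {
    tx.send(StageResult::Det(regions))?;
    tx.send(StageResult::Cls(regions))?;
    tx.send(StageResult::Rec(vec!["text".to_string(); regions]))?;
    Ok(())
}

// Consume stage results, but stop listening as soon as detection reports
// zero regions. Returns the recognized texts, or None on early exit.
fn collect_texts(regions: usize) -> Option<Vec<String>> {
    let (tx, rx) = mpsc::channel();
    let producer = thread::spawn(move || fake_pipeline(regions, tx));

    let mut texts = None;
    for stage in rx.iter() {
        match stage {
            StageResult::Det(0) => break, // nothing to read: stop early
            StageResult::Det(n) => println!("{n} regions detected"),
            StageResult::Cls(n) => println!("{n} regions classified"),
            StageResult::Rec(t) => texts = Some(t),
        }
    }
    // Dropping the receiver makes any later send on the producer side fail.
    drop(rx);
    let _ = producer.join();
    texts
}

fn main() {
    println!("{:?}", collect_texts(2));
    println!("{:?}", collect_texts(0));
}
```

Breaking out of the loop only stops the consumer. The producer stops early only if it checks the result of send, which is the assumption made by fake_pipeline here.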

Image Processing

Before OCR processing begins, RettoSession automatically:

Decodes the input image from raw bytes
Resizes the image based on max_side_len and min_side_len constraints
Maintains aspect ratio during resizing
Scales coordinates back to original image dimensions in results

These preprocessing steps are configured via RettoSessionConfig.
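
To make the resize and coordinate-scaling steps concrete, here is a rough sketch of one way such constraints could be applied. The policy is an assumption rather than Retto's actual implementation: fit_sides and to_original are hypothetical helpers, and the rule used (shrink when the longest side exceeds max_side_len, otherwise enlarge when the shortest side is below min_side_len) is only a plausible reading of the two settings.

```rust
// Hypothetical helper (assumed policy, not Retto's code): returns
// (new_width, new_height, ratio) such that the aspect ratio is preserved.
// Assumes width and height are both non-zero.
fn fit_sides(width: u32, height: u32, max_side_len: u32, min_side_len: u32) -> (u32, u32, f32) {
    let longest = width.max(height) as f32;
    let shortest = width.min(height) as f32;

    let ratio = if longest > max_side_len as f32 {
        // Too large: scale down so the longest side equals max_side_len.
        max_side_len as f32 / longest
    } else if shortest < min_side_len as f32 {
        // Too small: scale up so the shortest side equals min_side_len.
        min_side_len as f32 / shortest
    } else {
        1.0
    };

    // The same ratio is applied to both sides, which keeps the aspect ratio.
    let new_w = ((width as f32 * ratio).round() as u32).max(1);
    let new_h = ((height as f32 * ratio).round() as u32).max(1);
    (new_w, new_h, ratio)
}

// Map a point found in the resized image back to original-image coordinates.
fn to_original(point: (f32, f32), ratio: f32) -> (f32, f32) {
    (point.0 / ratio, point.1 / ratio)
}

fn main() {
    let (w, h, ratio) = fit_sides(4000, 2000, 2000, 30);
    println!("resized to {w}x{h} (ratio {ratio})");
    println!("box corner in original image: {:?}", to_original((1000.0, 500.0), ratio));
}
```

Because a single ratio is used for both axes, dividing a detected coordinate by that ratio is enough to express it in the original image's pixel space.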

Thread Safety

RettoSession is not Send or Sync by default. For multi-threaded usage:

Create one session per thread
Use a thread pool pattern with session-per-worker
Share configuration but not session instances
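
The session-per-thread pattern can be sketched as follows. Config and Session are hypothetical stand-ins (not Retto types) so that the example runs with the standard library only; the idea is that the configuration is cloned and shared, while each worker thread builds and owns its own session.

```rust
use std::thread;

// Hypothetical stand-in for a cloneable session configuration.
#[derive(Clone)]
struct Config {
    max_side_len: u32,
}

// Hypothetical stand-in for a session. It is created inside the thread that
// uses it and never crosses a thread boundary.
struct Session {
    cfg: Config,
    calls: usize,
}

impl Session {
    fn new(cfg: Config) -> Self {
        Session { cfg, calls: 0 }
    }

    // Takes &mut self, mirroring run(), so one session serves one caller at a time.
    fn run(&mut self, input: &[u8]) -> String {
        self.calls += 1;
        format!(
            "call {}: {} bytes, max side {}",
            self.calls,
            input.len(),
            self.cfg.max_side_len
        )
    }
}

// Split the inputs into contiguous chunks and give each worker thread its own session.
fn process_all(cfg: &Config, inputs: &[Vec<u8>], workers: usize) -> Vec<String> {
    let workers = workers.max(1);
    let chunk_size = ((inputs.len() + workers - 1) / workers).max(1);

    thread::scope(|scope| {
        let handles: Vec<_> = inputs
            .chunks(chunk_size)
            .map(|chunk| {
                let cfg = cfg.clone(); // share the configuration, not the session
                scope.spawn(move || {
                    let mut session = Session::new(cfg);
                    chunk.iter().map(|img| session.run(img)).collect::<Vec<_>>()
                })
            })
            .collect();

        // Join in spawn order so results keep the order of the inputs.
        handles.into_iter().flat_map(|h| h.join().unwrap()).collect()
    })
}

fn main() {
    let cfg = Config { max_side_len: 2000 };
    let inputs = vec![vec![0u8; 3], vec![0u8; 5], vec![0u8; 7]];
    for line in process_all(&cfg, &inputs, 2) {
        println!("{line}");
    }
}
```

With a real session, construction is likely the expensive step (model loading), so a long-lived worker per thread would usually be preferable to creating a session per batch as this sketch does.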

Related Types

RettoSessionConfig - Session configuration
RettoWorkerResult - Complete OCR results
RettoWorkerStageResult - Individual stage results
