Overview
This example demonstrates the basic usage of Retto for performing OCR on images. You’ll learn how to:
- Initialize a RettoSession with default configuration
- Load and process an image
- Extract text detection boxes, rotation angles, and recognized text
- Parse the results
Complete Example
Step-by-Step Breakdown
1. Create Configuration
Start by creating a RettoSessionConfig with default settings:
The default configuration downloads models from the Hugging Face Hub (if the hf-hub feature is enabled) or uses local models.
2. Initialize Session
Create a new session with your configuration:
3. Load and Process Image
Read your image file as bytes:
4. Parse Results
The RettoWorkerResult contains three components:
Detection Results (det_result)
Contains bounding boxes and confidence scores for detected text regions:
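As an illustration of working with boxes and confidence scores, here is a minimal sketch that drops low-confidence regions and computes an axis-aligned bound for each remaining box. The `DetectedRegion` struct, its `corners` and `score` fields, and the helpers `filter_by_score` and `bounds` are hypothetical stand-ins written for this example; they are not Retto's actual types, and the real shape of det_result may differ.

```rust
// Hypothetical stand-in for one detected text region; NOT Retto's actual type.
#[derive(Debug, Clone, PartialEq)]
struct DetectedRegion {
    // Four corner points of the box in pixel coordinates (assumed layout).
    corners: [(f32, f32); 4],
    // Detection confidence, assumed to lie in 0.0..=1.0.
    score: f32,
}

// Keep only the regions whose confidence is at least `min_score`.
fn filter_by_score(regions: &[DetectedRegion], min_score: f32) -> Vec<DetectedRegion> {
    regions
        .iter()
        .filter(|r| r.score >= min_score)
        .cloned()
        .collect()
}

// Axis-aligned bounds (min_x, min_y, max_x, max_y) enclosing the four corners.
fn bounds(region: &DetectedRegion) -> (f32, f32, f32, f32) {
    let xs = region.corners.iter().map(|p| p.0);
    let ys = region.corners.iter().map(|p| p.1);
    (
        xs.clone().fold(f32::INFINITY, f32::min),
        ys.clone().fold(f32::INFINITY, f32::min),
        xs.fold(f32::NEG_INFINITY, f32::max),
        ys.fold(f32::NEG_INFINITY, f32::max),
    )
}

fn main() {
    let regions = vec![
        DetectedRegion {
            corners: [(10.0, 10.0), (110.0, 12.0), (110.0, 40.0), (10.0, 38.0)],
            score: 0.93,
        },
        DetectedRegion {
            corners: [(5.0, 60.0), (50.0, 60.0), (50.0, 80.0), (5.0, 80.0)],
            score: 0.35,
        },
    ];
    // Print only the regions that pass the (arbitrary) 0.5 threshold.
    for region in filter_by_score(&regions, 0.5) {
        println!("score {:.2}, bounds {:?}", region.score, bounds(&region));
    }
}
```

The 0.5 threshold is arbitrary; a suitable value depends on the images being processed.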
Classification Results (cls_result)
Contains rotation angles (0° or 180°) for each detected region:
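To show what a 0° or 180° label can be used for, the sketch below turns an upside-down crop upright. It assumes a single-channel (grayscale) crop stored row by row, in which case a 180° rotation is the same as reversing the pixel buffer. The `Rotation` enum and the `upright` function are hypothetical names for this example, not part of Retto's API.

```rust
// Hypothetical stand-in for the classifier's output; NOT Retto's actual type.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Rotation {
    Deg0,
    Deg180,
}

// Return an upright copy of a row-major, single-channel pixel buffer.
// Rotating by 180 degrees maps index i to (len - 1 - i), i.e. a full reversal.
fn upright(pixels: &[u8], rotation: Rotation) -> Vec<u8> {
    match rotation {
        Rotation::Deg0 => pixels.to_vec(),
        Rotation::Deg180 => pixels.iter().rev().copied().collect(),
    }
}

fn main() {
    // A 3x2 grayscale crop, stored row by row.
    let crop = [1u8, 2, 3, 4, 5, 6];
    println!("{:?}", upright(&crop, Rotation::Deg180));
}
```

For multi-channel images the reversal would have to move whole pixels rather than single bytes, so this shortcut applies only to the grayscale case assumed here.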
Recognition Results (rec_result)
Contains the actual recognized text:
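Once all three components are available, they can be combined into one summary per region. The sketch below assumes the detection scores, rotation angles, and recognized texts are index-aligned (entry i of each refers to the same region); that alignment is an assumption, and `RegionSummary`, `summarize`, and `format_summary` are hypothetical names for this example rather than Retto's API.

```rust
// Hypothetical per-region summary; NOT a Retto type.
#[derive(Debug, Clone, PartialEq)]
struct RegionSummary {
    score: f32,
    angle: u16,
    text: String,
}

// Combine three index-aligned slices into one summary per region.
// If the slices differ in length, zip stops at the shortest one.
fn summarize(det_scores: &[f32], angles: &[u16], texts: &[&str]) -> Vec<RegionSummary> {
    det_scores
        .iter()
        .zip(angles)
        .zip(texts)
        .map(|((&score, &angle), &text)| RegionSummary {
            score,
            angle,
            text: text.to_string(),
        })
        .collect()
}

// Render one summary as a single line of text.
fn format_summary(summary: &RegionSummary) -> String {
    format!("[{:.2}] {}° {}", summary.score, summary.angle, summary.text)
}

fn main() {
    let summaries = summarize(&[0.93, 0.81], &[0, 180], &["Hello", "World"]);
    for summary in &summaries {
        println!("{}", format_summary(summary));
    }
}
```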
Complete Working Example
Here’s a more practical example that processes an image and outputs formatted results:
Model Loading
By default, Retto uses the Hugging Face Hub to download models automatically:
Next Steps
Streaming OCR
Process OCR stages with real-time callbacks
Custom Configuration
Fine-tune detection, classification, and recognition parameters
