Overview
Retto provides extensive configuration options to customize OCR behavior. You can adjust:
- Image size limits for processing
- Detection threshold and sensitivity
- Classification threshold for rotation detection
- Recognition batch size and image dimensions
- Hardware acceleration (CPU, CUDA, DirectML)
Configuration Structure
The RettoSessionConfig contains all configurable parameters:
Basic Custom Configuration
Here’s a simple example with custom settings:
Image Size Configuration
max_side_len and min_side_len
These parameters control how Retto resizes input images before processing:
- max_side_len: Higher values preserve detail but increase processing time
  - Use 2000-3000 for standard documents
  - Use 4000-6000 for high-resolution scans
  - Use 1000-1500 for fast processing of low-quality images
- min_side_len: Filters out very small text regions
  - Use 30-50 for standard text
  - Use 100+ to ignore small noise
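To make the effect of these two limits concrete, here is a minimal sketch of how a size cap and a minimum-side filter could behave. Both helper functions are hypothetical and are not part of Retto's API. The sketch assumes an aspect-ratio-preserving downscale, which is a common approach but may differ from what Retto does internally.

```rust
/// Hypothetical helper (not Retto's API): shrink (width, height) so that the
/// longer side is at most `max_side_len`, keeping the aspect ratio.
fn fit_within_max_side(width: u32, height: u32, max_side_len: u32) -> (u32, u32) {
    let longest = width.max(height);
    if longest <= max_side_len {
        // Already within the limit, so the size is left unchanged.
        return (width, height);
    }
    let scale = max_side_len as f64 / longest as f64;
    // Round to the nearest pixel and never let a side collapse to zero.
    let new_w = ((width as f64 * scale).round() as u32).max(1);
    let new_h = ((height as f64 * scale).round() as u32).max(1);
    (new_w, new_h)
}

/// Hypothetical helper: keep a region only if its shorter side reaches
/// `min_side_len` pixels.
fn passes_min_side(width: u32, height: u32, min_side_len: u32) -> bool {
    width.min(height) >= min_side_len
}
```

Under these assumptions, a 4000x3000 scan with a limit of 2000 would be processed at 2000x1500, while an 800x600 image would pass through untouched.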
Detection Configuration
Customize text detection parameters:
Key Detection Parameters
Pixel-level threshold for text detection. Lower values detect more text but may include noise.
- 0.2-0.25: Aggressive detection, more false positives
- 0.3-0.35: Balanced (recommended)
- 0.4-0.5: Conservative, fewer false positives
Minimum confidence score for a detected text box to be accepted.
- 0.3-0.4: Accept more uncertain detections
- 0.5: Balanced confidence
- 0.6-0.7: High confidence only
Expansion ratio for detected text boxes. Higher values include more context around text.
- 1.2-1.4: Tight boxes
- 1.6: Standard (recommended)
- 1.8-2.0: Loose boxes with more padding
Minimum side length for detected boxes (in pixels). Filters out very small detections.
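The four detection parameters described above can be pictured as a small settings struct. The following is only a sketch: the struct and its field names (`thresh`, `box_thresh`, `unclip_ratio`, `min_box_side`) are placeholders chosen for illustration, not Retto's real identifiers. The balanced values come from the recommendations listed above. The minimum box side is passed in by the caller because no recommended value is given here.

```rust
/// Illustrative only: the struct and field names are assumptions, not Retto's API.
#[derive(Debug, Clone, PartialEq)]
struct DetectionSettings {
    thresh: f32,       // pixel-level threshold for text detection
    box_thresh: f32,   // minimum confidence for a detected box
    unclip_ratio: f32, // expansion ratio applied to detected boxes
    min_box_side: f32, // minimum box side length in pixels
}

impl DetectionSettings {
    /// The "balanced" values from the ranges above: 0.3, 0.5 and 1.6.
    fn balanced(min_box_side: f32) -> Self {
        Self { thresh: 0.3, box_thresh: 0.5, unclip_ratio: 1.6, min_box_side }
    }

    /// A pixel counts as text when its probability exceeds `thresh`.
    fn is_text_pixel(&self, probability: f32) -> bool {
        probability > self.thresh
    }

    /// A box is kept when it is confident enough and not too small.
    fn accepts_box(&self, score: f32, width: f32, height: f32) -> bool {
        score >= self.box_thresh && width.min(height) >= self.min_box_side
    }
}
```

Lowering `thresh` or `box_thresh` in such a struct lets more candidates through, which matches the trade-off described above: more text is found, but more noise comes with it.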
Classification Configuration
Customize rotation detection:
Key Classification Parameters
Confidence threshold for applying 180° rotation. Only rotate if confidence exceeds this value.
- 0.7-0.8: Aggressive rotation correction
- 0.9: Balanced (recommended)
- 0.95+: Conservative, only rotate if very confident
Number of images to classify in parallel. Higher values use more memory but may be faster.
- 2-4: Low memory usage
- 6-8: Balanced
- 10-16: High throughput (requires more RAM)
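As a rough illustration of the two classification settings, the sketch below shows a rotation decision gated by a confidence threshold, and how a batch size splits a list of text crops into groups. The function names and the use of `180` as the "upside down" label are assumptions made for this example. Retto's internal logic may differ.

```rust
/// Hypothetical decision rule: rotate only when the classifier predicts 180°
/// and its confidence exceeds the threshold.
fn should_rotate(predicted_angle: u16, confidence: f32, cls_thresh: f32) -> bool {
    predicted_angle == 180 && confidence > cls_thresh
}

/// Hypothetical batching helper: sizes of the batches needed to classify
/// `total` crops with at most `batch_num` crops per batch.
fn batch_sizes(total: usize, batch_num: usize) -> Vec<usize> {
    let step = batch_num.max(1); // guard against a batch size of zero
    (0..total)
        .step_by(step)
        .map(|start| step.min(total - start))
        .collect()
}
```

With a threshold of 0.9, a crop predicted as rotated with confidence 0.85 would be left alone. With a batch size of 6, 14 crops would be classified in three batches.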
Recognition Configuration
Customize text recognition:
Custom Character Dictionary
Use a custom character dictionary for recognition:
Hardware Acceleration
CPU (Default)
CUDA (NVIDIA GPU)
DirectML (Windows)
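One way to choose between the three backends above at runtime is a simple fallback chain. The sketch below is an assumption-laden illustration: the `Backend` enum and `pick_backend` function are hypothetical names, and the order (CUDA first, then DirectML on Windows, then CPU) is one reasonable policy rather than something Retto prescribes.

```rust
/// Hypothetical enum mirroring the three options above; not Retto's API.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Backend {
    Cpu,
    Cuda,
    DirectMl,
}

/// Assumed fallback policy: prefer CUDA when available, fall back to
/// DirectML on Windows, and use the CPU otherwise.
fn pick_backend(cuda_available: bool, on_windows: bool) -> Backend {
    if cuda_available {
        Backend::Cuda
    } else if on_windows {
        Backend::DirectMl
    } else {
        Backend::Cpu
    }
}
```

A caller could pass `cfg!(target_os = "windows")` as the second argument. How to detect CUDA availability depends on your setup and is left out of this sketch.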
Complete Custom Configuration Example
Here’s a complete example with all customizations:
Configuration Presets
Here are recommended presets for common use cases:
High Accuracy (Slow)
Fast Processing (Lower Accuracy)
Balanced (Recommended)
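The three presets can be sketched as an enum that maps to concrete values. Everything here is illustrative: `Preset`, `PresetValues`, and the field names are hypothetical, and the numbers are picks from the ranges suggested earlier on this page, not Retto's official preset values.

```rust
/// Hypothetical preset selector; not part of Retto's API.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Preset {
    HighAccuracy,
    Fast,
    Balanced,
}

/// Hypothetical bundle of two settings discussed on this page.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct PresetValues {
    max_side_len: usize,  // image size cap
    cls_batch_num: usize, // classification batch size
}

impl Preset {
    /// Values are assumptions picked from the ranges given earlier.
    fn values(self) -> PresetValues {
        match self {
            // 4000-6000 is suggested for high-resolution scans.
            Preset::HighAccuracy => PresetValues { max_side_len: 4000, cls_batch_num: 6 },
            // 1000-1500 is suggested for fast processing, 10-16 for throughput.
            Preset::Fast => PresetValues { max_side_len: 1000, cls_batch_num: 16 },
            // 2000-3000 suits standard documents, 6-8 is a balanced batch size.
            Preset::Balanced => PresetValues { max_side_len: 2000, cls_batch_num: 6 },
        }
    }
}
```

Keeping presets in one place like this makes it easy to start from `Preset::Balanced.values()` and override a single field when a particular document type needs it.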
Next Steps
Basic OCR
Learn the fundamentals of OCR with Retto
Streaming OCR
Process stages with real-time callbacks
