Overview
The Retto WASM API provides OCR capabilities in JavaScript/TypeScript environments through WebAssembly bindings.
Installation
Retto Class
The main entry point for using Retto in JavaScript.
Loading the Module
Retto.load()
Loads the WebAssembly module and initializes the Retto instance.
static async load(onProgress?: (ratio: number) => void): Promise<Retto>
Parameters:
onProgress (optional): Callback function receiving download progress (0 to 1)
Returns: Promise resolving to a Retto instance
Example:
import { Retto } from 'retto-wasm';
const retto = await Retto.load((progress) => {
  console.log(`Loading: ${(progress * 100).toFixed(1)}%`);
});
Source: retto-wasm/fe/index.ts:154
Initialization
init()
Initializes the OCR engine with model files.
async init(models?: RettoModel): Promise<void>
Parameters:
models (optional): Model files as ArrayBuffers. Required unless using an embedded build.
Type Definition:
interface RettoModel {
  det_model: ArrayBuffer;
  cls_model: ArrayBuffer;
  rec_model: ArrayBuffer;
  rec_dict: ArrayBuffer;
}
Examples:
// If built with embed-models feature
await retto.init();
Source: retto-wasm/fe/index.ts:209
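For a build without embedded models, the four buffers have to be loaded before init() is called. The following is a minimal sketch of one way to assemble them, not part of retto-wasm: the loadModels helper, the FetchBuffer type, and the file names are hypothetical, and the RettoModel interface is copied locally so the snippet stands alone.

```typescript
// Local copy of the RettoModel shape documented above, so the sketch is self-contained.
interface RettoModel {
  det_model: ArrayBuffer;
  cls_model: ArrayBuffer;
  rec_model: ArrayBuffer;
  rec_dict: ArrayBuffer;
}

// Hypothetical: any function that turns a file name into its bytes.
type FetchBuffer = (name: string) => Promise<ArrayBuffer>;

// Loads all four files in parallel and assembles a RettoModel.
// The file names are placeholders; substitute the names of your own model files.
async function loadModels(fetchBuffer: FetchBuffer): Promise<RettoModel> {
  const [det_model, cls_model, rec_model, rec_dict] = await Promise.all([
    fetchBuffer("det.onnx"),
    fetchBuffer("cls.onnx"),
    fetchBuffer("rec.onnx"),
    fetchBuffer("keys.txt"),
  ]);
  return { det_model, cls_model, rec_model, rec_dict };
}
```

In a browser, fetchBuffer could be something like (name) => fetch(`/models/${name}`).then((r) => r.arrayBuffer()), where the /models/ path is an assumption, and the result would then be passed to retto.init().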
is_embed_build
Checks if the build includes embedded models.
get is_embed_build(): boolean
Returns: true if models are embedded, false otherwise
Example:
if (retto.is_embed_build) {
  await retto.init();
} else {
  await retto.init(customModels);
}
Source: retto-wasm/fe/index.ts:169
Recognition
recognize()
Performs OCR on an image, yielding results as each stage completes.
async *recognize(
  data: Uint8Array | ArrayBuffer
): AsyncGenerator<RettoWorkerStage, void, unknown>
Parameters:
data: Encoded image bytes as a Uint8Array or ArrayBuffer, in any format the Rust image crate can decode
Yields: Progress updates for each OCR stage (detection, classification, recognition)
Type Definitions:
type RettoWorkerStage =
  | { stage: "det"; result: DetProcessorResult }
  | { stage: "cls"; result: ClsProcessorResult }
  | { stage: "rec"; result: RecProcessorResult };
type DetProcessorResult = DetProcessorInnerResult[];
interface DetProcessorInnerResult {
  boxes: PointBox;
  score: number;
}
interface PointBox {
  inner: [Point, Point, Point, Point];
}
interface Point {
  x: number;
  y: number;
}
type ClsProcessorResult = ClsProcessorSingleResult[];
interface ClsProcessorSingleResult {
  label: ClsPostProcessLabel;
}
interface ClsPostProcessLabel {
  label: number; // 0 or 180 degrees
  score: number;
}
type RecProcessorResult = RecProcessorSingleResult[];
interface RecProcessorSingleResult {
  text: string;
  score: number;
}
Source: retto-wasm/fe/index.ts:5-42, 237
Example:
const imageData = await fetch('/image.jpg').then(r => r.arrayBuffer());
for await (const stage of retto.recognize(imageData)) {
  if (stage.stage === 'det') {
    console.log('Detection results:', stage.result);
    // stage.result is a DetProcessorResult (an array of DetProcessorInnerResult)
  } else if (stage.stage === 'cls') {
    console.log('Classification results:', stage.result);
    // stage.result is a ClsProcessorResult (an array of ClsProcessorSingleResult)
  } else if (stage.stage === 'rec') {
    console.log('Recognition results:', stage.result);
    // stage.result is a RecProcessorResult (an array of RecProcessorSingleResult)
    stage.result.forEach((item, i) => {
      console.log(`Text ${i}: ${item.text} (confidence: ${item.score})`);
    });
  }
}
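When only the final text matters, the stage stream can be drained and reduced to the recognition items. The sketch below illustrates that pattern against local stand-in types rather than the real retto-wasm exports; finalRecognition, joinText, and the 0.5 default threshold are hypothetical choices made for illustration.

```typescript
// Minimal local types mirroring the definitions above.
interface RecItem {
  text: string;
  score: number;
}
type Stage =
  | { stage: "det"; result: unknown[] }
  | { stage: "cls"; result: unknown[] }
  | { stage: "rec"; result: RecItem[] };

// Drains the stage stream and returns the recognition items
// (an empty array if no "rec" stage was yielded).
async function finalRecognition(stages: AsyncIterable<Stage>): Promise<RecItem[]> {
  let rec: RecItem[] = [];
  for await (const s of stages) {
    if (s.stage === "rec") rec = s.result;
  }
  return rec;
}

// Joins the text of all items whose score is at least minScore, one per line.
// The default threshold is arbitrary; tune it for your images.
function joinText(items: RecItem[], minScore = 0.5): string {
  return items
    .filter((item) => item.score >= minScore)
    .map((item) => item.text)
    .join("\n");
}
```

With a real instance this would be called as joinText(await finalRecognition(retto.recognize(imageData))), assuming the yielded values have the shape documented above.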
Low-Level C Functions
These functions are exported by the WASM module, but they are typically accessed through the Retto class rather than called directly.
Memory Management
alloc()
Allocates memory in the WASM heap.
Source: retto-wasm/src/wasm_lib.rs:23
dealloc()
Frees previously allocated memory.
void dealloc(void* ptr, size_t size)
Source: retto-wasm/src/wasm_lib.rs:32
These functions are managed internally by the Retto class. If you call them directly, you are responsible for releasing every allocation by passing its pointer and size to dealloc().
Initialization Functions
retto_init()
Initializes the OCR engine with model data.
void retto_init(
  const uint8_t* det_ptr,
  size_t det_len,
  const uint8_t* cls_ptr,
  size_t cls_len,
  const uint8_t* rec_ptr,
  size_t rec_len,
  const uint8_t* rec_dict_ptr,
  size_t rec_dict_len
)
Parameters:
det_ptr, det_len: Detection model data and length
cls_ptr, cls_len: Classification model data and length
rec_ptr, rec_len: Recognition model data and length
rec_dict_ptr, rec_dict_len: Character dictionary data and length
Source: retto-wasm/src/wasm_lib.rs:71
retto_embed_init()
Initializes the OCR engine with the embedded models. Only available in builds compiled with the embed-models feature.
Source: retto-wasm/src/wasm_lib.rs:113
Recognition Function
retto_rec()
Performs OCR on image data.
const char* retto_rec(
  const uint8_t* image_data_ptr,
  uint32_t image_data_len
)
Parameters:
image_data_ptr: Pointer to image data
image_data_len: Length of image data
Returns: Session ID string (UUID)
Source: retto-wasm/src/wasm_lib.rs:132
Callback Functions
These JavaScript callbacks can be set on the Module object to receive results:
onRettoNotifyDetDone
Called when text detection completes.
Module.onRettoNotifyDetDone = (sessionId: string, msg: string) => {
  const result = JSON.parse(msg) as DetProcessorResult;
  // Handle detection result
};
Source: retto-wasm/src/wasm_lib.rs:40
onRettoNotifyClsDone
Called when text classification completes.
Module.onRettoNotifyClsDone = (sessionId: string, msg: string) => {
  const result = JSON.parse(msg) as ClsProcessorResult;
  // Handle classification result
};
Source: retto-wasm/src/wasm_lib.rs:46
onRettoNotifyRecDone
Called when text recognition completes.
Module.onRettoNotifyRecDone = (sessionId: string, msg: string) => {
  const result = JSON.parse(msg) as RecProcessorResult;
  // Handle recognition result
};
Source: retto-wasm/src/wasm_lib.rs:52
The Retto class automatically sets up these callbacks. You typically don’t need to handle them directly.
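Because each callback receives the session ID that retto_rec() returned, results from several recognition calls can be told apart. The sketch below shows one possible way to route messages per session. It is not how the Retto class is implemented: SessionRouter and its methods are hypothetical, and it assumes that a handler is registered before any callback for that session fires and that "rec" is the last stage of a session.

```typescript
type StageName = "det" | "cls" | "rec";
type StageHandler = (stage: StageName, result: unknown) => void;

// Hypothetical helper: routes callback messages to a per-session handler.
class SessionRouter {
  private handlers = new Map<string, StageHandler>();

  // Registers a handler for one session ID (the string returned by retto_rec()).
  register(sessionId: string, handler: StageHandler): void {
    this.handlers.set(sessionId, handler);
  }

  // Parses the JSON payload and forwards it. Returns false for unknown sessions.
  dispatch(stage: StageName, sessionId: string, msg: string): boolean {
    const handler = this.handlers.get(sessionId);
    if (!handler) return false;
    handler(stage, JSON.parse(msg));
    // Assumption: "rec" is the final stage, so the session is dropped afterwards.
    if (stage === "rec") this.handlers.delete(sessionId);
    return true;
  }
}
```

Each Module callback would then forward to the router, for example by having onRettoNotifyDetDone call router.dispatch("det", sessionId, msg).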
Complete Examples
React Component
import { Retto, RecProcessorResult } from 'retto-wasm';
import { useState, useEffect } from 'react';
function OCRComponent() {
  const [retto, setRetto] = useState<Retto | null>(null);
  const [results, setResults] = useState<RecProcessorResult | null>(null);
  const [loading, setLoading] = useState(false);
  useEffect(() => {
    let cancelled = false; // avoid setting state after unmount
    async function init() {
      const instance = await Retto.load((progress) => {
        console.log(`Loading: ${(progress * 100).toFixed(0)}%`);
      });
      // No arguments: this example assumes an embedded-models build
      await instance.init();
      if (!cancelled) setRetto(instance);
    }
    init().catch(console.error);
    return () => {
      cancelled = true;
    };
  }, []);
  async function processImage(file: File) {
    if (!retto) return;
    setLoading(true);
    try {
      const buffer = await file.arrayBuffer();
      for await (const stage of retto.recognize(buffer)) {
        if (stage.stage === 'rec') {
          setResults(stage.result);
        }
      }
    } finally {
      // Re-enable the input even if recognition throws
      setLoading(false);
    }
  }
  return (
    <div>
      <input
        type="file"
        onChange={(e) => e.target.files?.[0] && processImage(e.target.files[0])}
        disabled={!retto || loading}
      />
      {results && (
        <div>
          {results.map((item, i) => (
            <div key={i}>
              {item.text} ({(item.score * 100).toFixed(1)}%)
            </div>
          ))}
        </div>
      )}
    </div>
  );
}
Node.js Script
import { Retto } from 'retto-wasm';
import { readFile } from 'fs/promises';
// Reads a file into a standalone ArrayBuffer. A Node Buffer can be a view into a
// larger ArrayBuffer, so `b.buffer` alone may contain bytes outside the file.
async function readArrayBuffer(path: string): Promise<ArrayBuffer> {
  const b = await readFile(path);
  return b.buffer.slice(b.byteOffset, b.byteOffset + b.byteLength) as ArrayBuffer;
}
async function main() {
  // Load the WASM module
  const retto = await Retto.load();
  // Load models from disk
  const models = {
    det_model: await readArrayBuffer('./models/det.onnx'),
    cls_model: await readArrayBuffer('./models/cls.onnx'),
    rec_model: await readArrayBuffer('./models/rec.onnx'),
    rec_dict: await readArrayBuffer('./models/keys.txt'),
  };
  await retto.init(models);
  // Process image
  const imageData = await readArrayBuffer('./test.jpg');
  for await (const stage of retto.recognize(imageData)) {
    console.log(`Stage: ${stage.stage}`);
    if (stage.stage === 'rec') {
      stage.result.forEach((item) => {
        console.log(`  ${item.text} (${(item.score * 100).toFixed(1)}%)`);
      });
    }
  }
}
// Report failures instead of leaving an unhandled rejection
main().catch((err) => {
  console.error(err);
  process.exit(1);
});
Progressive Result Display
import { Retto } from 'retto-wasm';
async function processWithProgress(imageFile: File) {
  const retto = await Retto.load();
  // No arguments: this example assumes an embedded-models build
  await retto.init();
  const buffer = await imageFile.arrayBuffer();
  for await (const stage of retto.recognize(buffer)) {
    switch (stage.stage) {
      case 'det':
        console.log(`Found ${stage.result.length} text regions`);
        stage.result.forEach((box, i) => {
          console.log(`  Region ${i}: confidence ${box.score.toFixed(2)}`);
        });
        break;
      case 'cls':
        console.log(`Classified ${stage.result.length} orientations`);
        stage.result.forEach((cls, i) => {
          console.log(`  Region ${i}: ${cls.label.label}° (${cls.label.score.toFixed(2)})`);
        });
        break;
      case 'rec':
        console.log(`Recognized ${stage.result.length} text segments`);
        stage.result.forEach((rec, i) => {
          console.log(`  Region ${i}: "${rec.text}" (${rec.score.toFixed(2)})`);
        });
        break;
    }
  }
}
Build Configuration
The WASM module can be built with different features:
Standard Build
wasm-pack build --target web
Requires models to be provided at runtime via init(models).
Embedded Models Build
wasm-pack build --target web --features embed-models
Bundles the models into the WASM binary, so init() can be called without arguments (or retto_embed_init() at the low level).
Embedded builds are larger (roughly 50 MB or more) but do not require separate model downloads.
Memory Management
The Retto class includes a WasmBufferManager that handles allocation and deallocation:
class WasmBufferManager {
  // Automatically allocates buffer in WASM memory
  async ctx<T>(
    data: BufferData,
    fn: (ptr: number, len: number) => Promise<T> | T
  ): Promise<T>
  // Handles multiple buffers
  async ctxRegions<T>(
    datas: BufferData[],
    fn: (...regions: BufferRegion[]) => Promise<T> | T
  ): Promise<T>
}
Source: retto-wasm/fe/index.ts:73
Memory is automatically cleaned up after operations complete. You don’t need to manually call alloc/dealloc.
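The scoped-buffer pattern behind a method like ctx() can be illustrated with a small stand-alone sketch. This is not the WasmBufferManager source: HeapLike and withBuffer are hypothetical names, and the alloc(size) signature returning a pointer is an assumption, since only dealloc(ptr, size) is documented above.

```typescript
// Hypothetical stand-in for the exports a WASM module would provide.
interface HeapLike {
  memory: Uint8Array; // view over the module's linear memory
  alloc(size: number): number; // assumed signature: returns a pointer (byte offset)
  dealloc(ptr: number, size: number): void; // matches the documented dealloc(ptr, size)
}

// Copies data into the heap, runs fn with the pointer and length,
// and frees the buffer afterwards even if fn throws or rejects.
async function withBuffer<T>(
  heap: HeapLike,
  data: Uint8Array,
  fn: (ptr: number, len: number) => Promise<T> | T
): Promise<T> {
  const ptr = heap.alloc(data.length);
  try {
    heap.memory.set(data, ptr);
    return await fn(ptr, data.length);
  } finally {
    heap.dealloc(ptr, data.length);
  }
}
```

Putting the dealloc call in a finally block is what makes the cleanup automatic: the caller only ever sees the callback's result or its error, never the pointer's lifetime.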
Browser Compatibility
- WebAssembly: Required (all modern browsers)
- SharedArrayBuffer: Not required
- Worker Support: Single-threaded execution
- SIMD: Uses WASM SIMD if available
Tested Browsers
- Chrome/Edge 90+
- Firefox 89+
- Safari 15+
If the WASM file is served from a different origin than the page, the server must send appropriate CORS headers for the download to succeed.