
Overview

The Retto WASM API provides OCR capabilities in JavaScript/TypeScript environments through WebAssembly bindings.

Installation

npm install retto-wasm

Retto Class

The main entry point for using Retto in JavaScript.

Loading the Module

Retto.load()

Loads the WebAssembly module and initializes the Retto instance.
static async load(onProgress?: (ratio: number) => void): Promise<Retto>
Parameters:
  • onProgress (optional): Callback function receiving download progress (0 to 1)
Returns: Promise resolving to a Retto instance
Example:
import { Retto } from 'retto-wasm';

const retto = await Retto.load((progress) => {
  console.log(`Loading: ${(progress * 100).toFixed(1)}%`);
});
Source: retto-wasm/fe/index.ts:154
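Because onProgress receives a ratio between 0 and 1, it may be called many times during a download. The following is a minimal sketch of a wrapper that only reports when the whole-number percentage changes. The name makeProgressReporter is hypothetical and not part of retto-wasm.

```typescript
// Hypothetical helper (not part of retto-wasm): wraps a sink so it is only
// called when the whole-number percentage changes.
function makeProgressReporter(
  report: (percent: number) => void
): (ratio: number) => void {
  let last = -1;
  return (ratio: number): void => {
    // Clamp defensively in case a value falls slightly outside 0..1.
    const clamped = Math.min(1, Math.max(0, ratio));
    const percent = Math.floor(clamped * 100);
    if (percent !== last) {
      last = percent;
      report(percent);
    }
  };
}
```

The returned function has the same shape as the onProgress parameter, so it could be passed straight to Retto.load().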

Initialization

init()

Initializes the OCR engine with model files.
async init(models?: RettoModel): Promise<void>
Parameters:
  • models (optional): Model files as ArrayBuffers. Required unless using an embedded build.
Type Definition:
interface RettoModel {
  det_model: ArrayBuffer;
  cls_model: ArrayBuffer;
  rec_model: ArrayBuffer;
  rec_dict: ArrayBuffer;
}
Examples:
// If built with embed-models feature
await retto.init();
Source: retto-wasm/fe/index.ts:209
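For builds without embedded models, the four model files have to be loaded into ArrayBuffers before calling init(models). The sketch below shows one way this might be done in a browser. fetchModels and FetchLike are hypothetical names, the URLs are placeholders, and the RettoModel interface is copied from the type definition above so the block stands alone.

```typescript
// Mirrors the RettoModel interface documented above.
interface RettoModel {
  det_model: ArrayBuffer;
  cls_model: ArrayBuffer;
  rec_model: ArrayBuffer;
  rec_dict: ArrayBuffer;
}

// Minimal shape of a fetch-like function, so the helper works with the
// browser's fetch and can be stubbed in tests.
type FetchLike = (url: string) => Promise<{
  ok: boolean;
  status: number;
  arrayBuffer(): Promise<ArrayBuffer>;
}>;

// Hypothetical helper: downloads the four model files in parallel and
// fails early if any response is not successful.
async function fetchModels(
  urls: Record<keyof RettoModel, string>,
  fetchFn: FetchLike
): Promise<RettoModel> {
  const load = async (url: string): Promise<ArrayBuffer> => {
    const response = await fetchFn(url);
    if (!response.ok) {
      throw new Error(`Failed to fetch ${url}: HTTP ${response.status}`);
    }
    return response.arrayBuffer();
  };
  const [det_model, cls_model, rec_model, rec_dict] = await Promise.all([
    load(urls.det_model),
    load(urls.cls_model),
    load(urls.rec_model),
    load(urls.rec_dict),
  ]);
  return { det_model, cls_model, rec_model, rec_dict };
}
```

In a browser, the result could be passed on as retto.init(await fetchModels(urls, (url) => fetch(url))), where urls points at wherever the model files are hosted.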

is_embed_build

Checks if the build includes embedded models.
get is_embed_build(): boolean
Returns: true if models are embedded, false otherwise
Example:
if (retto.is_embed_build) {
  await retto.init();
} else {
  await retto.init(customModels);
}
Source: retto-wasm/fe/index.ts:169

Recognition

recognize()

Performs OCR on an image, yielding results as each stage completes.
async *recognize(
  data: Uint8Array | ArrayBuffer
): AsyncGenerator<RettoWorkerStage, void, unknown>
Parameters:
  • data: Image data as Uint8Array or ArrayBuffer (any format supported by the image crate)
Yields: Progress updates for each OCR stage (detection, classification, recognition)
Type Definitions:
type RettoWorkerStage =
  | { stage: "det"; result: DetProcessorResult }
  | { stage: "cls"; result: ClsProcessorResult }
  | { stage: "rec"; result: RecProcessorResult };

type DetProcessorResult = DetProcessorInnerResult[];

interface DetProcessorInnerResult {
  boxes: PointBox;
  score: number;
}

interface PointBox {
  inner: [Point, Point, Point, Point];
}

interface Point {
  x: number;
  y: number;
}

type ClsProcessorResult = ClsProcessorSingleResult[];

interface ClsProcessorSingleResult {
  label: ClsPostProcessLabel;
}

interface ClsPostProcessLabel {
  label: number;  // 0 or 180 degrees
  score: number;
}

type RecProcessorResult = RecProcessorSingleResult[];

interface RecProcessorSingleResult {
  text: string;
  score: number;
}
Source: retto-wasm/fe/index.ts:5-42, 237
Example:
const imageData = await fetch('/image.jpg').then(r => r.arrayBuffer());

for await (const stage of retto.recognize(imageData)) {
  if (stage.stage === 'det') {
    console.log('Detection results:', stage.result);
    // stage.result is a DetProcessorResult (an array of DetProcessorInnerResult)
  } else if (stage.stage === 'cls') {
    console.log('Classification results:', stage.result);
    // stage.result is a ClsProcessorResult (an array of ClsProcessorSingleResult)
  } else if (stage.stage === 'rec') {
    console.log('Recognition results:', stage.result);
    // stage.result is a RecProcessorResult (an array of RecProcessorSingleResult)
    stage.result.forEach((item, i) => {
      console.log(`Text ${i}: ${item.text} (confidence: ${item.score})`);
    });
  }
}
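When only the final output matters, the generator can be drained into a single object. The following is a sketch under the assumption that each stage is yielded at most once per call. collectStages, joinText, and the 0.5 score threshold are illustrative and not part of retto-wasm. The local types repeat only the fields from the definitions above that the sketch touches.

```typescript
// Local copies of the documented result shapes, reduced to the fields used here.
interface Point { x: number; y: number }
interface DetItem { boxes: { inner: [Point, Point, Point, Point] }; score: number }
interface ClsItem { label: { label: number; score: number } }
interface RecItem { text: string; score: number }

type Stage =
  | { stage: "det"; result: DetItem[] }
  | { stage: "cls"; result: ClsItem[] }
  | { stage: "rec"; result: RecItem[] };

interface OcrSummary {
  det: DetItem[];
  cls: ClsItem[];
  rec: RecItem[];
}

// Hypothetical helper: drains the generator and keeps the latest result
// seen for each stage.
async function collectStages(stages: AsyncIterable<Stage>): Promise<OcrSummary> {
  const summary: OcrSummary = { det: [], cls: [], rec: [] };
  for await (const s of stages) {
    switch (s.stage) {
      case "det": summary.det = s.result; break;
      case "cls": summary.cls = s.result; break;
      case "rec": summary.rec = s.result; break;
    }
  }
  return summary;
}

// Joins recognized lines into one string, dropping entries whose score is
// below minScore (an arbitrary illustrative default).
function joinText(rec: RecItem[], minScore = 0.5): string {
  return rec
    .filter((r) => r.score >= minScore)
    .map((r) => r.text)
    .join("\n");
}
```

With a loaded instance, this could be used as const summary = await collectStages(retto.recognize(imageData)) followed by joinText(summary.rec).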

Low-Level C Functions

These functions are exported by the WASM module and are shown here as C-style signatures. They are normally called through the Retto class rather than directly.

Memory Management

alloc()

Allocates memory in the WASM heap.
void* alloc(size_t size)
Source: retto-wasm/src/wasm_lib.rs:23

dealloc()

Frees previously allocated memory.
void dealloc(void* ptr, size_t size)
Source: retto-wasm/src/wasm_lib.rs:32
These functions are managed internally by the Retto class. Direct usage requires careful memory management.

Initialization Functions

retto_init()

Initializes the OCR engine with model data.
void retto_init(
  const uint8_t* det_ptr,
  size_t det_len,
  const uint8_t* cls_ptr,
  size_t cls_len,
  const uint8_t* rec_ptr,
  size_t rec_len,
  const uint8_t* rec_dict_ptr,
  size_t rec_dict_len
)
Parameters:
  • det_ptr, det_len: Detection model data and length
  • cls_ptr, cls_len: Classification model data and length
  • rec_ptr, rec_len: Recognition model data and length
  • rec_dict_ptr, rec_dict_len: Character dictionary data and length
Source: retto-wasm/src/wasm_lib.rs:71

retto_embed_init()

Initializes with embedded models (only available in builds with embed-models feature).
void retto_embed_init()
Source: retto-wasm/src/wasm_lib.rs:113

Recognition Function

retto_rec()

Performs OCR on image data.
const char* retto_rec(
  const uint8_t* image_data_ptr,
  uint32_t image_data_len
)
Parameters:
  • image_data_ptr: Pointer to image data
  • image_data_len: Length of image data
Returns: Session ID string (UUID)
Source: retto-wasm/src/wasm_lib.rs:132
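On the JavaScript side, a const char* return value arrives as a numeric offset into the module's linear memory. The sketch below shows how such a NUL-terminated UTF-8 string could be decoded. readCString and the heap parameter are assumptions: how the memory view is obtained depends on the toolchain, and the source does not state who owns the returned string, so the sketch only reads it.

```typescript
// Reads a NUL-terminated UTF-8 string starting at `ptr` in a byte view of
// WASM linear memory. `heap` is an assumption about how memory is exposed.
function readCString(heap: Uint8Array, ptr: number): string {
  let end = ptr;
  // Scan forward to the terminating NUL byte (or the end of the view).
  while (end < heap.length && heap[end] !== 0) {
    end++;
  }
  return new TextDecoder("utf-8").decode(heap.subarray(ptr, end));
}
```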

Callback Functions

These JavaScript callbacks can be set on the Module object to receive results:

onRettoNotifyDetDone

Called when text detection completes.
Module.onRettoNotifyDetDone = (sessionId: string, msg: string) => {
  const result = JSON.parse(msg) as DetProcessorResult;
  // Handle detection result
};
Source: retto-wasm/src/wasm_lib.rs:40

onRettoNotifyClsDone

Called when text classification completes.
Module.onRettoNotifyClsDone = (sessionId: string, msg: string) => {
  const result = JSON.parse(msg) as ClsProcessorResult;
  // Handle classification result
};
Source: retto-wasm/src/wasm_lib.rs:46

onRettoNotifyRecDone

Called when text recognition completes.
Module.onRettoNotifyRecDone = (sessionId: string, msg: string) => {
  const result = JSON.parse(msg) as RecProcessorResult;
  // Handle recognition result
};
Source: retto-wasm/src/wasm_lib.rs:52
The Retto class automatically sets up these callbacks. You typically don’t need to handle them directly.
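Since every callback carries a session ID, results from different retto_rec() calls can be told apart. The following sketch illustrates one way to route callback payloads to a per-session listener. SessionRouter is a hypothetical class and is not how the Retto class is known to be implemented. It also does not buffer events that arrive before a listener is registered.

```typescript
type StageName = "det" | "cls" | "rec";
type StageEvent = { stage: StageName; result: unknown };
type Listener = (event: StageEvent) => void;

// Hypothetical router: keeps one listener per session ID and forwards
// parsed callback payloads to it.
class SessionRouter {
  private listeners = new Map<string, Listener>();

  // Registers a listener; returns a function that removes it again.
  subscribe(sessionId: string, listener: Listener): () => void {
    this.listeners.set(sessionId, listener);
    return () => {
      this.listeners.delete(sessionId);
    };
  }

  // Builds a callback with the (sessionId, msg) shape shown above.
  handler(stage: StageName): (sessionId: string, msg: string) => void {
    return (sessionId, msg) => {
      const listener = this.listeners.get(sessionId);
      if (!listener) return; // No one is waiting on this session.
      listener({ stage, result: JSON.parse(msg) });
    };
  }
}
```

When working with the module directly, the handlers could be assigned as Module.onRettoNotifyDetDone = router.handler("det"), and likewise for "cls" and "rec".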

Complete Examples

React Component

import { Retto, type RecProcessorResult } from 'retto-wasm';
import { useState, useEffect } from 'react';

function OCRComponent() {
  const [retto, setRetto] = useState<Retto | null>(null);
  const [results, setResults] = useState<RecProcessorResult | null>(null);
  const [loading, setLoading] = useState(false);

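  useEffect(() => {
    let cancelled = false;
    async function init() {
      const instance = await Retto.load((progress) => {
        console.log(`Loading: ${(progress * 100).toFixed(0)}%`);
      });
      // init() without arguments assumes an embedded-models build;
      // otherwise pass a RettoModel object.
      await instance.init();
      // Skip the state update if the component unmounted while loading.
      if (!cancelled) setRetto(instance);
    }
    init().catch(console.error);
    return () => {
      cancelled = true;
    };
  }, []);

  async function processImage(file: File) {
    if (!retto) return;

    setLoading(true);
    try {
      const buffer = await file.arrayBuffer();

      for await (const stage of retto.recognize(buffer)) {
        if (stage.stage === 'rec') {
          setResults(stage.result);
        }
      }
    } finally {
      // Re-enable the input even if recognition throws.
      setLoading(false);
    }
  }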

  return (
    <div>
      <input 
        type="file" 
        onChange={(e) => e.target.files?.[0] && processImage(e.target.files[0])}
        disabled={!retto || loading}
      />
      {results && (
        <div>
          {results.map((item, i) => (
            <div key={i}>
              {item.text} ({(item.score * 100).toFixed(1)}%)
            </div>
          ))}
        </div>
      )}
    </div>
  );
}

Node.js Script

import { Retto } from 'retto-wasm';
import { readFile } from 'fs/promises';

// Reads a file into a standalone ArrayBuffer. A Node.js Buffer can be a view
// into a larger ArrayBuffer, so `.buffer` alone may include unrelated bytes.
async function readArrayBuffer(path: string): Promise<ArrayBuffer> {
  const b = await readFile(path);
  return b.buffer.slice(b.byteOffset, b.byteOffset + b.byteLength) as ArrayBuffer;
}

async function main() {
  // Load the WASM module
  const retto = await Retto.load();

  // Load models from disk
  const models = {
    det_model: await readArrayBuffer('./models/det.onnx'),
    cls_model: await readArrayBuffer('./models/cls.onnx'),
    rec_model: await readArrayBuffer('./models/rec.onnx'),
    rec_dict: await readArrayBuffer('./models/keys.txt'),
  };

  await retto.init(models);

  // Process image
  const imageData = await readArrayBuffer('./test.jpg');

  for await (const stage of retto.recognize(imageData)) {
    console.log(`Stage: ${stage.stage}`);

    if (stage.stage === 'rec') {
      stage.result.forEach((item) => {
        console.log(`  ${item.text} (${(item.score * 100).toFixed(1)}%)`);
      });
    }
  }
}

// Report failures instead of leaving an unhandled rejection.
main().catch((err) => {
  console.error(err);
  process.exit(1);
});

Progressive Result Display

import { Retto } from 'retto-wasm';

async function processWithProgress(imageFile: File) {
  const retto = await Retto.load();
  // Assumes an embedded-models build; otherwise pass models to init().
  await retto.init();
  
  const buffer = await imageFile.arrayBuffer();
  
  for await (const stage of retto.recognize(buffer)) {
    switch (stage.stage) {
      case 'det':
        console.log(`Found ${stage.result.length} text regions`);
        stage.result.forEach((box, i) => {
          console.log(`  Region ${i}: confidence ${box.score.toFixed(2)}`);
        });
        break;
        
      case 'cls':
        console.log(`Classified ${stage.result.length} orientations`);
        stage.result.forEach((cls, i) => {
          console.log(`  Region ${i}: ${cls.label.label}° (${cls.label.score.toFixed(2)})`);
        });
        break;
        
      case 'rec':
        console.log(`Recognized ${stage.result.length} text segments`);
        stage.result.forEach((rec, i) => {
          console.log(`  Region ${i}: "${rec.text}" (${rec.score.toFixed(2)})`);
        });
        break;
    }
  }
}
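Detection results describe each region as four corner points, which is awkward for drawing simple overlays. The sketch below reduces a PointBox to an axis-aligned rectangle. boundingRect and Rect are hypothetical names. It assumes the coordinates are in the pixel space of the input image, which the source does not state explicitly.

```typescript
// Shapes copied from the type definitions above.
interface Point { x: number; y: number }
interface PointBox { inner: [Point, Point, Point, Point] }

// Hypothetical output shape for drawing.
interface Rect { x: number; y: number; width: number; height: number }

// Smallest axis-aligned rectangle containing all four corners.
function boundingRect(box: PointBox): Rect {
  const xs = box.inner.map((p) => p.x);
  const ys = box.inner.map((p) => p.y);
  const x = Math.min(...xs);
  const y = Math.min(...ys);
  return { x, y, width: Math.max(...xs) - x, height: Math.max(...ys) - y };
}
```

In the 'det' case above, each rectangle could then be drawn with a 2D canvas context's strokeRect(r.x, r.y, r.width, r.height). Rotated text loses its angle in this reduction, so the four original points remain the better choice when orientation matters.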

Build Configuration

The WASM module can be built with different features:

Standard Build

wasm-pack build --target web
Requires models to be provided at runtime via init(models).

Embedded Models Build

wasm-pack build --target web --features embed-models
Includes the models in the WASM binary, so the engine can be initialized by calling init() without arguments (or retto_embed_init() at the low level).
Embedded builds are larger (~50MB+) but don’t require separate model downloads.

Memory Management

The Retto class includes a WasmBufferManager that handles allocation and deallocation:
class WasmBufferManager {
  // Automatically allocates buffer in WASM memory
  async ctx<T>(
    data: BufferData,
    fn: (ptr: number, len: number) => Promise<T> | T
  ): Promise<T>
  
  // Handles multiple buffers
  async ctxRegions<T>(
    datas: BufferData[],
    fn: (...regions: BufferRegion[]) => Promise<T> | T
  ): Promise<T>
}
Source: retto-wasm/fe/index.ts:73
Memory is automatically cleaned up after operations complete. You don’t need to manually call alloc/dealloc.
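The scoped pattern behind a ctx-style helper can be sketched as allocate, copy, run, then free in a finally block. The code below is an illustration under stated assumptions and is not the WasmBufferManager implementation. WasmHeap and withWasmBuffer are hypothetical, and bytes() stands in for however the module exposes a view of its linear memory.

```typescript
// Hypothetical view of the module: the alloc/dealloc exports documented
// earlier plus a function returning a byte view of linear memory.
interface WasmHeap {
  alloc(size: number): number;
  dealloc(ptr: number, size: number): void;
  bytes(): Uint8Array;
}

// Copies `data` into WASM memory, runs `fn`, and always frees the buffer,
// whether `fn` returns, resolves, throws, or rejects.
async function withWasmBuffer<T>(
  heap: WasmHeap,
  data: Uint8Array,
  fn: (ptr: number, len: number) => Promise<T> | T
): Promise<T> {
  const ptr = heap.alloc(data.length);
  try {
    // Fetch the view after alloc: growing memory can invalidate older views.
    heap.bytes().set(data, ptr);
    return await fn(ptr, data.length);
  } finally {
    heap.dealloc(ptr, data.length);
  }
}
```

Tying the buffer's lifetime to a callback means the pointer cannot leak past the call, which is the main hazard of calling alloc and dealloc by hand.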

Browser Compatibility

  • WebAssembly: Required (all modern browsers)
  • SharedArrayBuffer: Not required
  • Worker Support: Single-threaded execution
  • SIMD: Uses WASM SIMD if available

Tested Browsers

  • Chrome/Edge 90+
  • Firefox 89+
  • Safari 15+
If the WASM file or model files are served from a different origin than the page, the server must send appropriate CORS headers.
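A page can check for WebAssembly support before attempting to load the module. The following is a minimal sketch using the standard WebAssembly.validate() API. supportsWebAssembly is a hypothetical helper, and it does not test for SIMD.

```typescript
// Returns true when the runtime exposes the WebAssembly API and accepts the
// smallest valid module (magic number plus version 1, no sections).
function supportsWebAssembly(): boolean {
  try {
    if (typeof WebAssembly !== "object" || typeof WebAssembly.validate !== "function") {
      return false;
    }
    const emptyModule = new Uint8Array([0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00]);
    return WebAssembly.validate(emptyModule);
  } catch {
    return false;
  }
}
```

If the check fails, an application could show a fallback message instead of calling Retto.load().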
