WebAssembly Usage

Installation

Install the npm package:

npm

npm install @nekoimageland/retto-wasm

pnpm

pnpm add @nekoimageland/retto-wasm

yarn

yarn add @nekoimageland/retto-wasm

The package is published as @nekoimageland/retto-wasm and includes TypeScript definitions.

Loading the Module

The WASM module must be loaded asynchronously before use:

import { Retto } from '@nekoimageland/retto-wasm';

// Load the WASM module with optional progress tracking
const retto = await Retto.load((progress) => {
  console.log(`Loading: ${(progress * 100).toFixed(1)}%`);
});
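Because loading is asynchronous and downloads the WASM binary, it can be useful to make sure several callers share one load rather than each starting their own. The following is a minimal sketch under that assumption; `loadOnce` is a hypothetical helper, not part of the package, and works with any async loader.

```typescript
// Hypothetical helper (not part of @nekoimageland/retto-wasm):
// wraps an async loader so that repeated calls share one promise.
function loadOnce<T>(loader: () => Promise<T>): () => Promise<T> {
  let pending: Promise<T> | null = null;
  return () => {
    if (pending === null) {
      pending = loader().catch((err: unknown) => {
        // Clear the cached promise on failure so a later call can retry.
        pending = null;
        throw err;
      });
    }
    return pending;
  };
}

// Usage sketch: const getRetto = loadOnce(() => Retto.load());
```

Caching the promise rather than the resolved value means that calls made while the download is still in flight also reuse it.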

Initialization

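Embedded Models

If the package was built with embedded models, call init() with no arguments:

await retto.init();

Check whether embedded models are available before relying on them:

if (retto.is_embed_build) {
  await retto.init();
}

External Models

Otherwise, fetch the model files and pass them to init():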

const models = {
  det_model: await fetch('/models/det.onnx').then(r => r.arrayBuffer()),
  cls_model: await fetch('/models/cls.onnx').then(r => r.arrayBuffer()),
  rec_model: await fetch('/models/rec.onnx').then(r => r.arrayBuffer()),
  rec_dict: await fetch('/models/dict.txt').then(r => r.arrayBuffer()),
};

await retto.init(models);
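The example above awaits each download in turn and does not check the HTTP status. A hedged sketch of a parallel variant follows. `fetchModelFiles`, `ModelKey`, and `FetchLike` are hypothetical names introduced here; only the four key names come from the example, and the fetch function is passed in so the helper can be exercised with a stub.

```typescript
// Hypothetical helper, not part of the package. The key names mirror the
// object passed to retto.init() in the example above.
type ModelKey = "det_model" | "cls_model" | "rec_model" | "rec_dict";

// Minimal structural type so the helper also accepts a stubbed fetch.
type FetchLike = (url: string) => Promise<{
  ok: boolean;
  status: number;
  arrayBuffer(): Promise<ArrayBuffer>;
}>;

async function fetchModelFiles(
  urls: Record<ModelKey, string>,
  fetchFn: FetchLike,
): Promise<Record<ModelKey, ArrayBuffer>> {
  const keys = Object.keys(urls) as ModelKey[];
  // Start all downloads at once instead of awaiting them one by one.
  const buffers = await Promise.all(
    keys.map(async (key) => {
      const response = await fetchFn(urls[key]);
      if (!response.ok) {
        throw new Error(`Failed to fetch ${key}: HTTP ${response.status}`);
      }
      return response.arrayBuffer();
    }),
  );
  const models = {} as Record<ModelKey, ArrayBuffer>;
  keys.forEach((key, i) => {
    models[key] = buffers[i];
  });
  return models;
}

// Usage sketch (browser):
// const models = await fetchModelFiles(
//   { det_model: '/models/det.onnx', cls_model: '/models/cls.onnx',
//     rec_model: '/models/rec.onnx', rec_dict: '/models/dict.txt' },
//   (url) => fetch(url),
// );
```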

Running OCR

Get image data

Load an image as bytes:

const input = document.querySelector<HTMLInputElement>('input[type="file"]');
const file = input?.files?.[0];
if (!file) throw new Error('No file selected');
const imageData = await file.arrayBuffer();

Process with streaming

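recognize() is an async generator that yields one result per stage, in order: detection (det), classification (cls), and recognition (rec):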

for await (const stage of retto.recognize(imageData)) {
  if (stage.stage === 'det') {
    console.log('Text detection:', stage.result);
  } else if (stage.stage === 'cls') {
    console.log('Text classification:', stage.result);
  } else if (stage.stage === 'rec') {
    console.log('Text recognition:', stage.result);
  }
}
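When per-stage feedback is not needed, the loop can be wrapped so that all stages come back in one object. This is a sketch only: `collectStages` and `StageEvent` are hypothetical names, the `{ stage, result }` shape is taken from the loop above, and result payloads are left untyped here.

```typescript
// Hypothetical helper. The { stage, result } shape is taken from the loop
// above; payloads stay `unknown` because their exact types come from the
// package's own definitions.
type StageName = "det" | "cls" | "rec";

interface StageEvent {
  stage: StageName;
  result: unknown;
}

async function collectStages(
  stages: AsyncIterable<StageEvent>,
): Promise<Partial<Record<StageName, unknown>>> {
  const collected: Partial<Record<StageName, unknown>> = {};
  for await (const event of stages) {
    // A later event for the same stage overwrites the earlier one.
    collected[event.stage] = event.result;
  }
  return collected;
}

// Usage sketch: const { rec } = await collectStages(retto.recognize(imageData));
```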

Use the results

Extract recognized text:

let recognizedText = [];

for await (const stage of retto.recognize(imageData)) {
  if (stage.stage === 'rec') {
    recognizedText = stage.result.map(r => r.text);
  }
}

console.log('Text:', recognizedText.join('\n'));
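Each recognition result also carries a confidence score, so low-confidence lines can be dropped before display. A small sketch, assuming a higher score means higher confidence; `confidentText` and `RecLine` are hypothetical names, and the 0.5 default is an arbitrary illustration rather than a recommended threshold.

```typescript
// Same fields as RecProcessorSingleResult in the package's type definitions,
// redeclared here so the snippet is self-contained.
interface RecLine {
  text: string;
  score: number;
}

// Hypothetical helper: keep lines whose score is at least `minScore`,
// trim whitespace, and drop lines that end up empty.
function confidentText(lines: RecLine[], minScore = 0.5): string[] {
  return lines
    .filter((line) => line.score >= minScore)
    .map((line) => line.text.trim())
    .filter((text) => text.length > 0);
}
```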

Complete Example

app.ts

import { Retto } from '@nekoimageland/retto-wasm';

class OCRApp {
  private retto: Retto | null = null;
  
  async initialize() {
    // Load WASM module
    console.log('Loading OCR engine...');
    this.retto = await Retto.load((progress) => {
      console.log(`Progress: ${(progress * 100).toFixed(0)}%`);
    });
    
    // Initialize with embedded models
    await this.retto.init();
    console.log('OCR engine ready!');
  }
  
  async processImage(file: File): Promise<string[]> {
    if (!this.retto) {
      throw new Error('OCR engine not initialized');
    }
    
    const imageData = await file.arrayBuffer();
    const results: string[] = [];
    
    for await (const stage of this.retto.recognize(imageData)) {
      if (stage.stage === 'rec') {
        results.push(...stage.result.map(r => r.text));
      }
    }
    
    return results;
  }
}

// Usage
const app = new OCRApp();
await app.initialize();

const fileInput = document.querySelector<HTMLInputElement>('input[type="file"]');
fileInput?.addEventListener('change', async (e) => {
  const file = (e.target as HTMLInputElement).files?.[0];
  if (file) {
    const text = await app.processImage(file);
    console.log('Recognized text:', text);
  }
});

TypeScript Types

The package includes comprehensive TypeScript definitions:

export interface Point {
  x: number;
  y: number;
}

export interface PointBox {
  inner: [Point, Point, Point, Point];
}

export interface DetProcessorInnerResult {
  boxes: PointBox;
  score: number;
}

export type DetProcessorResult = DetProcessorInnerResult[];

export interface ClsPostProcessLabel {
  label: number;  // Rotation angle (0, 90, 180, 270)
  score: number;  // Confidence score
}

export interface RecProcessorSingleResult {
  text: string;   // Recognized text
  score: number;  // Confidence score
}

export type RecProcessorResult = RecProcessorSingleResult[];
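A PointBox holds four corner points, which is not directly usable with rectangle-based APIs such as a canvas `strokeRect` call. The sketch below reduces a box to its axis-aligned bounding rectangle. `Rect` and `boundingRect` are hypothetical names; `Point` and `PointBox` are redeclared only to keep the snippet self-contained.

```typescript
interface Point {
  x: number;
  y: number;
}

interface PointBox {
  inner: [Point, Point, Point, Point];
}

// Hypothetical output shape for overlay drawing.
interface Rect {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Smallest axis-aligned rectangle that contains all four corners.
function boundingRect(box: PointBox): Rect {
  const xs = box.inner.map((p) => p.x);
  const ys = box.inner.map((p) => p.y);
  const minX = Math.min(...xs);
  const minY = Math.min(...ys);
  return {
    x: minX,
    y: minY,
    width: Math.max(...xs) - minX,
    height: Math.max(...ys) - minY,
  };
}
```

For rotated text the bounding rectangle is larger than the detected quadrilateral, so drawing the four points as a polygon is the more faithful option when precision matters.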

React Example

OCRComponent.tsx

import React, { useState, useEffect } from 'react';
import { Retto } from '@nekoimageland/retto-wasm';

export function OCRComponent() {
  const [retto, setRetto] = useState<Retto | null>(null);
  const [loading, setLoading] = useState(true);
  const [results, setResults] = useState<string[]>([]);
  
  useEffect(() => {
    async function init() {
      const instance = await Retto.load();
      await instance.init();
      setRetto(instance);
      setLoading(false);
    }
    init();
  }, []);
  
  const handleFileUpload = async (e: React.ChangeEvent<HTMLInputElement>) => {
    const file = e.target.files?.[0];
    if (!file || !retto) return;
    
    setLoading(true);
    const imageData = await file.arrayBuffer();
    const texts: string[] = [];
    
    for await (const stage of retto.recognize(imageData)) {
      if (stage.stage === 'rec') {
        texts.push(...stage.result.map(r => r.text));
      }
    }
    
    setResults(texts);
    setLoading(false);
  };
  
  return (
    <div>
      <input type="file" onChange={handleFileUpload} disabled={loading} />
      {loading && <p>Processing...</p>}
      <ul>
        {results.map((text, i) => <li key={i}>{text}</li>)}
      </ul>
    </div>
  );
}

Implementation Details

From index.ts:154, the load function downloads the WASM binary:

static async load(onProgress?: (ratio: number) => void): Promise<Retto> {
  const wasmUrl = new URL("public/retto_wasm.wasm", import.meta.url).href;
  const { data } = await axios.get<ArrayBuffer>(wasmUrl, {
    responseType: "arraybuffer",
    onDownloadProgress: ({ loaded, total }) => {
      if (total && onProgress) onProgress(loaded / total);
    },
  });
  const module = await initWASI({
    wasmBinary: data,
    locateFile: () => "",
  }) as typeof RettoInner;
  return new Retto(module);
}

The streaming API uses an async generator (index.ts:237). The excerpt is abridged: ptr, len, and once are not defined in the lines shown.

async *recognize(
  data: Uint8Array | ArrayBuffer,
): AsyncGenerator<RettoWorkerStage, void, unknown> {
  const sessionPtr = this.module._retto_rec(ptr, len);
  const sessionId = this.module.UTF8ToString(sessionPtr);
  
  const det = await once<DetProcessorResult>("det");
  yield { stage: "det", result: det };
  
  const cls = await once<ClsProcessorResult>("cls");
  yield { stage: "cls", result: cls };
  
  const rec = await once<RecProcessorResult>("rec");
  yield { stage: "rec", result: rec };
}

The WASM module runs in a web worker thread. Do not block the main thread during OCR processing.

Browser Compatibility

Retto WASM requires:

WebAssembly support
SharedArrayBuffer support (for threading)
Modern ES2020+ JavaScript features

For SharedArrayBuffer to work, your server must send these headers:

Cross-Origin-Opener-Policy: same-origin
Cross-Origin-Embedder-Policy: require-corp
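Whether these requirements are met can be checked at runtime before loading the module. The following sketch relies on the standard `crossOriginIsolated` global, which is true only when the page is cross-origin isolated; `missingRequirements` and `EnvLike` are hypothetical names, and the environment is passed in so the function can be tested outside a browser.

```typescript
// Hypothetical helper: reports which of the requirements listed above are
// not met in a given global scope (pass `globalThis` in a page or worker).
interface EnvLike {
  WebAssembly?: unknown;
  SharedArrayBuffer?: unknown;
  crossOriginIsolated?: boolean;
}

function missingRequirements(env: EnvLike): string[] {
  const missing: string[] = [];
  if (typeof env.WebAssembly !== "object") {
    missing.push("WebAssembly");
  }
  if (typeof env.SharedArrayBuffer !== "function") {
    missing.push("SharedArrayBuffer");
  }
  // crossOriginIsolated is true only when the COOP/COEP headers are in effect.
  if (env.crossOriginIsolated !== true) {
    missing.push("cross-origin isolation (COOP/COEP headers)");
  }
  return missing;
}

// Usage sketch:
// const problems = missingRequirements(globalThis);
// if (problems.length > 0) console.warn('OCR unavailable:', problems.join(', '));
```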

Performance Considerations

Load once, reuse: Initialize the Retto instance once and reuse it for multiple images
Model size: Embedded models increase bundle size; consider loading externally for production
Image size: Larger images take longer to process; consider resizing before OCR
Streaming: Use the streaming API to provide real-time feedback
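
For the image-size tip, one way to pick target dimensions before redrawing an image onto a canvas is sketched below. `fitWithin` and `Size` are hypothetical names, and the maximum side length is an application choice; shrinking too aggressively can make small text harder to recognize.

```typescript
interface Size {
  width: number;
  height: number;
}

// Hypothetical helper: scales `size` down so its longer side is at most
// `maxSide`, preserving aspect ratio. Smaller images are returned unchanged.
function fitWithin(size: Size, maxSide: number): Size {
  const longest = Math.max(size.width, size.height);
  if (longest <= maxSide) {
    return { ...size };
  }
  const scale = maxSide / longest;
  return {
    // Never round a dimension down to zero.
    width: Math.max(1, Math.round(size.width * scale)),
    height: Math.max(1, Math.round(size.height * scale)),
  };
}
```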

Next Steps

Model Loading

Rust Usage
