
Overview

Execution providers enable hardware acceleration for faster inference and lower power consumption. react-native-sherpa-onnx supports:

| Provider | Platform     | Hardware            | Status                  |
| -------- | ------------ | ------------------- | ----------------------- |
| CPU      | iOS, Android | CPU                 | ✅ Always available      |
| QNN      | Android      | Qualcomm NPU (HTP)  | ✅ Requires runtime libs |
| NNAPI    | Android      | GPU/DSP/NPU         | ✅ Built-in              |
| XNNPACK  | Android, iOS | CPU-optimized       | ✅ Built-in              |
| Core ML  | iOS          | Apple Neural Engine | ✅ Built-in              |
QNN Runtime Libraries Not Included

For licensing reasons, this SDK does not include the Qualcomm QNN runtime libraries. To use QNN acceleration, you must:
  1. Download the Qualcomm AI Runtime (accept license)
  2. Copy runtime .so files to your app’s jniLibs
  3. Include Qualcomm notices in your app’s legal/credits
See Adding QNN Runtime Libs below.

Quick Start: Check and Use Acceleration

Check QNN Support (Qualcomm NPU)

import { getQnnSupport } from 'react-native-sherpa-onnx';
import { createSTT } from 'react-native-sherpa-onnx/stt';

const support = await getQnnSupport();

if (support.canInit) {
  // QNN is available - use it
  const stt = await createSTT({
    modelPath: { type: 'asset', path: 'models/transducer-en' },
    provider: 'qnn',
  });
} else if (support.providerCompiled) {
  console.log('QNN is compiled but not available on this device');
  console.log('Add QNN runtime libs or use CPU/NNAPI');
} else {
  console.log('QNN not in build, use CPU or other providers');
}

Check NNAPI Support (Android)

import { getNnapiSupport } from 'react-native-sherpa-onnx';

const support = await getNnapiSupport();

if (support.canInit) {
  // NNAPI works (may use CPU, GPU, DSP, or NPU)
  const stt = await createSTT({
    modelPath: { type: 'asset', path: 'models/paraformer-zh' },
    provider: 'nnapi',
  });
  
  if (support.hasAccelerator) {
    console.log('Device has dedicated accelerator (GPU/DSP/NPU)');
  } else {
    console.log('NNAPI works but no dedicated accelerator reported');
  }
}

Check Core ML Support (iOS)

import { getCoreMlSupport } from 'react-native-sherpa-onnx';
import { createTTS } from 'react-native-sherpa-onnx/tts';

const support = await getCoreMlSupport();

if (support.hasAccelerator) {
  console.log('Apple Neural Engine available');
  const tts = await createTTS({
    modelPath: { type: 'asset', path: 'models/vits-piper-en' },
    provider: 'coreml',
  });
} else {
  console.log('No ANE, Core ML will use CPU/GPU');
}

Check Available Providers

import { getAvailableProviders } from 'react-native-sherpa-onnx';

const providers = await getAvailableProviders();
console.log('Available providers:', providers);
// Android: ['CPU', 'QNN', 'NNAPI', 'XNNPACK']
// iOS: ['CPU', 'COREML', 'XNNPACK']

const hasQnn = providers.some((p) => p.toUpperCase() === 'QNN');
if (hasQnn) {
  // Offer "Use NPU" option in settings
}

Adding QNN Runtime Libs

Step 1: Download Qualcomm AI Runtime

  1. Go to Qualcomm AI Runtime Community
  2. Accept the license agreement
  3. Download the SDK for your development platform

Step 2: Copy Runtime Libraries

Extract the SDK and copy the following .so files into your app’s jniLibs directory for each ABI.
Required libraries:
  • libQnnHtp.so
  • libQnnHtpV*Stub.so (multiple versions: V68, V69, V73, V75, V79, V81)
  • libQnnHtpV*Skel.so (multiple versions: V68, V69, V73, V75, V79, V81)
  • libQnnHtpPrepare.so
  • libQnnSystem.so
  • libQnnCpu.so (optional, for CPU fallback)
Destination:
android/app/src/main/jniLibs/
  arm64-v8a/
    libQnnHtp.so
    libQnnHtpV68Stub.so
    libQnnHtpV68Skel.so
    libQnnHtpV69Stub.so
    libQnnHtpV69Skel.so
    libQnnHtpV73Stub.so
    libQnnHtpV73Skel.so
    libQnnHtpV75Stub.so
    libQnnHtpV75Skel.so
    libQnnHtpV79Stub.so
    libQnnHtpV79Skel.so
    libQnnHtpV81Stub.so
    libQnnHtpV81Skel.so
    libQnnHtpPrepare.so
    libQnnSystem.so
    libQnnCpu.so
  armeabi-v7a/
    (same files for 32-bit ARM)
See the sherpa-onnx QNN guide for the exact list of required files.
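The copy step can be scripted. A minimal sketch — the QNN_SDK location and the lib/aarch64-android layout inside it are assumptions; adjust them to wherever you extracted the Qualcomm AI Runtime:

```shell
#!/bin/sh
# Sketch: copy QNN runtime libs into the app's jniLibs (arm64 shown).
# QNN_SDK and the lib/aarch64-android layout are assumptions — check your SDK.
QNN_SDK="${QNN_SDK:-$HOME/qairt}"
DEST="android/app/src/main/jniLibs/arm64-v8a"
mkdir -p "$DEST"
for lib in libQnnHtp.so libQnnHtpPrepare.so libQnnSystem.so libQnnCpu.so \
           libQnnHtpV*Stub.so libQnnHtpV*Skel.so; do
  # Copy verbosely; tolerate missing files so the loop keeps going
  cp -v "$QNN_SDK/lib/aarch64-android/"$lib "$DEST/" 2>/dev/null || true
done
ls "$DEST"
```

Repeat with the 32-bit library directory for armeabi-v7a if you ship that ABI.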

Step 3: Rebuild and Test

Rebuild your app:
cd android
./gradlew clean
cd ..
npx react-native run-android
Test QNN support:
import { getQnnSupport } from 'react-native-sherpa-onnx';

const support = await getQnnSupport();
console.log('QNN canInit:', support.canInit);
// Should be true on devices with Qualcomm chipsets

Step 4: Include License Notices

Add Qualcomm’s copyright and license notice to your app’s legal/credits section. The notice is in the QNN SDK LICENSE file.
Redistribution Requirements: The Qualcomm AI Stack License permits you to distribute the QNN runtime libraries only:
  1. In object code (no source)
  2. As part of your application (not standalone)
  3. With proper attribution and notices
Do not remove Qualcomm’s copyright or proprietary notices from the libraries.

AccelerationSupport Format

All support checks return the same structure:
interface AccelerationSupport {
  providerCompiled: boolean;  // EP built into ONNX Runtime
  hasAccelerator: boolean;    // Hardware accelerator present
  canInit: boolean;           // Session test succeeded
}

Understanding the Fields

| Field            | Meaning                                          | Example                                                              |
| ---------------- | ------------------------------------------------ | -------------------------------------------------------------------- |
| providerCompiled | Execution provider is compiled into ONNX Runtime | QNN appears in getAvailableProviders()                                |
| hasAccelerator   | Hardware accelerator detected                    | Qualcomm HTP init succeeds, NNAPI reports GPU/NPU, Apple ANE present  |
| canInit          | Session with the EP can be created               | Test model loads successfully with the provider                       |
Use canInit to decide if you can use the provider. It’s the most reliable indicator that the provider will work for your models.
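As a sketch, the three fields collapse into three actionable states (the helper and state names below are illustrative, not part of the API):

```typescript
// Collapse a support check into one actionable state (names are illustrative).
type ProviderState = 'ready' | 'compiled-only' | 'not-in-build';

function describeSupport(s: {
  providerCompiled: boolean;
  hasAccelerator: boolean;
  canInit: boolean;
}): ProviderState {
  if (s.canInit) return 'ready';                  // safe to pass as `provider`
  if (s.providerCompiled) return 'compiled-only'; // e.g. QNN runtime libs missing
  return 'not-in-build';                          // fall back to CPU
}
```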

API Reference

getQnnSupport(modelBase64?)

Check QNN (Qualcomm NPU) support.
import { getQnnSupport } from 'react-native-sherpa-onnx';

// Test with the embedded model
const support = await getQnnSupport();

// Test with a specific model
const supportForMyModel = await getQnnSupport(myModelBase64);
Parameters:
  • modelBase64?: Base64-encoded ONNX model to test (optional; uses embedded test model if omitted)
Returns: Promise<AccelerationSupport>
| Situation                       | providerCompiled | hasAccelerator | canInit |
| ------------------------------- | ---------------- | -------------- | ------- |
| QNN libs added, Qualcomm device | ✅               | ✅             | ✅      |
| QNN libs not added              | ✅               | ❌             | ❌      |
| Non-Qualcomm device             | ✅               | ❌             | ❌      |
| QNN not in build                | ❌               | ❌             | ❌      |
| iOS                             | ❌               | ❌             | ❌      |

getNnapiSupport(modelBase64?)

Check NNAPI (Android Neural Networks API) support.
import { getNnapiSupport } from 'react-native-sherpa-onnx';

const support = await getNnapiSupport();
Parameters:
  • modelBase64?: Base64-encoded ONNX model to test (optional)
Returns: Promise<AccelerationSupport>
Why hasAccelerator: false but canInit: true?
  • hasAccelerator checks if the NDK reports a dedicated accelerator device (GPU/DSP/NPU)
  • canInit checks if ONNX Runtime can create a session with NNAPI
NNAPI can run on CPU even when no accelerator is reported. Use canInit to decide if you can use provider: 'nnapi'.

getXnnpackSupport(modelBase64?)

Check XNNPACK (CPU-optimized) support.
import { getXnnpackSupport } from 'react-native-sherpa-onnx';

const support = await getXnnpackSupport();
Parameters:
  • modelBase64?: Base64-encoded ONNX model to test (optional, but needed for a meaningful canInit)
Returns: Promise<AccelerationSupport>
hasAccelerator is true when XNNPACK is compiled (CPU-optimized, not hardware acceleration).

getCoreMlSupport(modelBase64?)

Check Core ML (iOS) support.
import { getCoreMlSupport } from 'react-native-sherpa-onnx';

const support = await getCoreMlSupport();
Parameters:
  • modelBase64?: Base64-encoded ONNX model to test (not used; reserved for future)
Returns: Promise<AccelerationSupport>
| Field            | iOS 15+ with ANE     | iOS without ANE      | Android |
| ---------------- | -------------------- | -------------------- | ------- |
| providerCompiled | ✅                   | ✅                   | ❌      |
| hasAccelerator   | ✅ (ANE)             | ❌                   | ❌      |
| canInit          | ❌ (not implemented) | ❌ (not implemented) | ❌      |

getAvailableProviders()

List ONNX Runtime execution providers in the current build.
import { getAvailableProviders } from 'react-native-sherpa-onnx';

const providers = await getAvailableProviders();
// Android: ['CPU', 'QNN', 'NNAPI', 'XNNPACK']
// iOS: ['CPU', 'COREML', 'XNNPACK']
Returns: Promise<string[]>

Using Providers with STT/TTS

Pass the provider option when creating engines:

STT with QNN

import { getQnnSupport } from 'react-native-sherpa-onnx';
import { createSTT } from 'react-native-sherpa-onnx/stt';

const support = await getQnnSupport();
if (support.canInit) {
  const stt = await createSTT({
    modelPath: { type: 'asset', path: 'models/transducer-en' },
    provider: 'qnn',  // Use Qualcomm NPU
    numThreads: 1,
  });
  
  const result = await stt.transcribeFile('/path/to/audio.wav');
  console.log(result.text);
  
  await stt.destroy();
}

STT with NNAPI

import { getNnapiSupport } from 'react-native-sherpa-onnx';
import { createSTT } from 'react-native-sherpa-onnx/stt';

const support = await getNnapiSupport();
if (support.canInit) {
  const stt = await createSTT({
    modelPath: { type: 'asset', path: 'models/paraformer-zh' },
    provider: 'nnapi',  // Use NNAPI (GPU/DSP/NPU)
  });
}

TTS with Core ML

import { getCoreMlSupport } from 'react-native-sherpa-onnx';
import { createTTS } from 'react-native-sherpa-onnx/tts';

const support = await getCoreMlSupport();
if (support.hasAccelerator) {
  const tts = await createTTS({
    modelPath: { type: 'asset', path: 'models/vits-piper-en' },
    provider: 'coreml',  // Use Apple Neural Engine
  });
}

Streaming STT with QNN

import { getQnnSupport } from 'react-native-sherpa-onnx';
import { createStreamingSTT } from 'react-native-sherpa-onnx/stt';

const support = await getQnnSupport();
if (support.canInit) {
  const engine = await createStreamingSTT({
    modelPath: { type: 'asset', path: 'models/streaming-zipformer-en' },
    modelType: 'transducer',
    provider: 'qnn',
  });
}

Provider Selection Strategy

Recommended provider selection order:
import {
  getQnnSupport,
  getNnapiSupport,
  getXnnpackSupport,
} from 'react-native-sherpa-onnx';
import { createSTT } from 'react-native-sherpa-onnx/stt';

async function createOptimizedSTT(modelPath: ModelPathConfig) {
  // 1. Try QNN (fastest on Qualcomm)
  const qnn = await getQnnSupport();
  if (qnn.canInit) {
    return createSTT({ modelPath, provider: 'qnn' });
  }
  
  // 2. Try NNAPI (GPU/DSP/NPU on Android)
  const nnapi = await getNnapiSupport();
  if (nnapi.canInit) {
    return createSTT({ modelPath, provider: 'nnapi' });
  }
  
  // 3. Try XNNPACK (CPU-optimized)
  const xnnpack = await getXnnpackSupport();
  if (xnnpack.canInit) {
    return createSTT({ modelPath, provider: 'xnnpack' });
  }
  
  // 4. Fallback to CPU
  return createSTT({ modelPath, provider: 'cpu' });
}

const stt = await createOptimizedSTT({
  type: 'asset',
  path: 'models/transducer-en',
});

Performance Comparison

Typical speedup over CPU (device-dependent):
| Provider | Speedup | Power Efficiency | Availability      |
| -------- | ------- | ---------------- | ----------------- |
| QNN      | 3-5x    | Excellent        | Qualcomm only     |
| NNAPI    | 2-4x    | Good             | Android 8.1+      |
| Core ML  | 2-3x    | Excellent        | iOS (ANE on A12+) |
| XNNPACK  | 1.5-2x  | Good             | Android/iOS       |
| CPU      | 1x      | Baseline         | Always available  |
Actual performance depends on:
  • Model architecture and size
  • Device chipset and generation
  • Thermal conditions
  • OS version
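A useful number when comparing providers is the real-time factor: processing time divided by audio duration, where values below 1 mean faster than real time. A tiny helper (the function name is illustrative):

```typescript
// Real-time factor: processing time / audio duration. RTF < 1 ⇒ faster than real time.
function realTimeFactor(processingMs: number, audioDurationMs: number): number {
  return processingMs / audioDurationMs;
}

// e.g. 900 ms to transcribe 3000 ms of audio → RTF 0.3
```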

Troubleshooting

QNN canInit is false

Possible causes:
  • QNN runtime libs not added to jniLibs
  • Device doesn’t have a Qualcomm chipset
  • QNN backend initialization failed (unsupported SoC or driver)
Solution: add the QNN runtime libraries (see Adding QNN Runtime Libs above), or fall back to CPU/NNAPI.

NNAPI reports hasAccelerator: false but canInit: true

This is normal. NNAPI can work without a dedicated accelerator (it runs on CPU through NNAPI). Use canInit to decide if you can use NNAPI; hasAccelerator only indicates whether the device reports a GPU/DSP/NPU.

Session creation fails with a hardware provider

Some operations may not be supported by hardware EPs:
  • Try a different model
  • Check if the model is compatible with the provider
  • Fall back to CPU for unsupported models

Core ML reports no Apple Neural Engine

The Apple Neural Engine requires:
  • An A12 chip or later (iPhone XS/XR and newer)
  • iOS 15+ for reliable detection
The simulator always returns false for ANE.

A provider is slower than CPU

This can happen when:
  • The model is very small (overhead outweighs the benefit)
  • It’s the first run (initialization overhead)
  • The device is thermally throttling
Benchmark over multiple runs to get accurate results.

Testing Provider Performance

Benchmark different providers:
import { createSTT } from 'react-native-sherpa-onnx/stt';

async function benchmarkProvider(provider: string) {
  console.log(`Testing provider: ${provider}`);
  
  const start = Date.now();
  const stt = await createSTT({
    modelPath: { type: 'asset', path: 'models/transducer-en' },
    provider,
  });
  const initTime = Date.now() - start;
  
  const transcribeStart = Date.now();
  const result = await stt.transcribeFile('/path/to/test-audio.wav');
  const transcribeTime = Date.now() - transcribeStart;
  
  await stt.destroy();
  
  console.log(`${provider}: init=${initTime}ms, transcribe=${transcribeTime}ms`);
  return { provider, initTime, transcribeTime, text: result.text };
}

// Benchmark all available providers
const providers = ['cpu', 'qnn', 'nnapi', 'xnnpack'];
const results = [];

for (const provider of providers) {
  try {
    const result = await benchmarkProvider(provider);
    results.push(result);
  } catch (error) {
    console.error(`${provider} failed:`, error);
  }
}

console.table(results);
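From those results you might persist the winner in app settings. A sketch of picking the lowest transcribe time (the BenchResult shape mirrors what benchmarkProvider returns, minus the text field):

```typescript
interface BenchResult {
  provider: string;
  initTime: number;
  transcribeTime: number;
}

// Pick the provider with the lowest transcribe time (undefined if none succeeded)
function fastestProvider(results: BenchResult[]): BenchResult | undefined {
  return results.reduce<BenchResult | undefined>(
    (best, r) => (best === undefined || r.transcribeTime < best.transcribeTime ? r : best),
    undefined,
  );
}
```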

Next Steps

Speech-to-Text

Use hardware acceleration with STT

Text-to-Speech

Use hardware acceleration with TTS

Model Setup

Learn how to bundle and load models

Streaming STT

Real-time recognition with acceleration
