Overview
Execution providers enable hardware acceleration for faster inference and lower power consumption. React-native-sherpa-onnx supports:

| Provider | Platform | Hardware | Status |
|---|---|---|---|
| CPU | iOS, Android | CPU | ✅ Always available |
| QNN | Android | Qualcomm NPU (HTP) | ✅ Requires runtime libs |
| NNAPI | Android | GPU/DSP/NPU | ✅ Built-in |
| XNNPACK | Android, iOS | CPU-optimized | ✅ Built-in |
| Core ML | iOS | Apple Neural Engine | ✅ Built-in |
Quick Start: Check and Use Acceleration
Check QNN Support (Qualcomm NPU)
Check NNAPI Support (Android)
Check Core ML Support (iOS)
Check Available Providers
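The checks above all follow the same pattern. Below is a self-contained sketch of the overall flow; the support functions are simulated with canned results here, whereas real code would import getQnnSupport/getNnapiSupport from the package:

```typescript
// Shape returned by every support check (see "AccelerationSupport Format" below).
interface AccelerationSupport {
  providerCompiled: boolean;
  hasAccelerator: boolean;
  canInit: boolean;
}

// Simulated stand-ins for the package's check functions, with canned results
// (here: QNN libs missing, NNAPI usable).
const getQnnSupport = async (): Promise<AccelerationSupport> =>
  ({ providerCompiled: true, hasAccelerator: false, canInit: false });
const getNnapiSupport = async (): Promise<AccelerationSupport> =>
  ({ providerCompiled: true, hasAccelerator: false, canInit: true });

// Pick the first provider whose canInit check passes, falling back to CPU.
async function pickProvider(): Promise<'qnn' | 'nnapi' | 'cpu'> {
  if ((await getQnnSupport()).canInit) return 'qnn';
  if ((await getNnapiSupport()).canInit) return 'nnapi';
  return 'cpu';
}

pickProvider().then((p) => console.log(p)); // 'nnapi' with the simulated results
```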
Adding QNN Runtime Libs
Step 1: Download Qualcomm AI Runtime
- Go to Qualcomm AI Runtime Community
- Accept the license agreement
- Download the SDK for your development platform
Step 2: Copy Runtime Libraries
Extract the archive and copy the following .so files into your app’s jniLibs directory, per ABI:
Required libraries:
- libQnnHtp.so
- libQnnHtpV*Stub.so (multiple versions: V68, V69, V73, V75, V79, V81)
- libQnnHtpV*Skel.so (multiple versions: V68, V69, V73, V75, V79, V81)
- libQnnHtpPrepare.so
- libQnnSystem.so
- libQnnCpu.so (optional, for CPU fallback)
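Assuming the standard React Native Android project layout, the result might look like this (the path and the V73 variant are illustrative; copy the versions matching your target devices):

```
android/app/src/main/jniLibs/
└── arm64-v8a/
    ├── libQnnHtp.so
    ├── libQnnHtpV73Stub.so
    ├── libQnnHtpV73Skel.so
    ├── libQnnHtpPrepare.so
    ├── libQnnSystem.so
    └── libQnnCpu.so
```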
Step 3: Rebuild and Test
Rebuild your app so the new native libraries are packaged.

Step 4: Include License Notices
Add Qualcomm’s copyright and license notice to your app’s legal/credits section. The notice is in the LICENSE file of the QNN SDK.
AccelerationSupport Format
All support checks return the same AccelerationSupport structure.

Understanding the Fields
| Field | Meaning | Example |
|---|---|---|
| providerCompiled | Execution provider is compiled into ONNX Runtime | QNN in getAvailableProviders() |
| hasAccelerator | Hardware accelerator detected | Qualcomm HTP init succeeds, NNAPI reports GPU/NPU, Apple ANE present |
| canInit | Session with EP can be created | Test model loads successfully with provider |
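The fields above can be modeled as a TypeScript interface; this is a sketch reconstructed from the table, not copied from the package’s type declarations:

```typescript
// Assumed shape of the result returned by all support checks.
interface AccelerationSupport {
  providerCompiled: boolean; // EP is compiled into ONNX Runtime
  hasAccelerator: boolean;   // a hardware accelerator was detected
  canInit: boolean;          // a session with the EP could be created
}

// Example: QNN on a device without the runtime libs.
const qnnResult: AccelerationSupport = {
  providerCompiled: true,
  hasAccelerator: false,
  canInit: false,
};
console.log(qnnResult.canInit); // false
```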
Use canInit to decide if you can use the provider. It’s the most reliable indicator that the provider will work for your models.

API Reference
getQnnSupport(modelBase64?)
Check QNN (Qualcomm NPU) support.

modelBase64?: Base64-encoded ONNX model to test (optional; uses the embedded test model if omitted)
Promise<AccelerationSupport>
| Situation | providerCompiled | hasAccelerator | canInit |
|---|---|---|---|
| QNN libs added, Qualcomm device | ✅ | ✅ | ✅ |
| QNN libs not added | ✅ | ❌ | ❌ |
| Non-Qualcomm device | ✅ | ❌ | ❌ |
| QNN not in build | ❌ | ❌ | ❌ |
| iOS | ❌ | ❌ | ❌ |
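A sketch of interpreting a getQnnSupport() result against the situations in the table. The real function comes from the package; here it is simulated with the “QNN libs not added” outcome so the snippet is self-contained:

```typescript
interface AccelerationSupport {
  providerCompiled: boolean;
  hasAccelerator: boolean;
  canInit: boolean;
}

// Simulated stand-in for the package's getQnnSupport(); returns the
// "QNN libs not added" row from the table above.
const getQnnSupport = async (modelBase64?: string): Promise<AccelerationSupport> =>
  ({ providerCompiled: true, hasAccelerator: false, canInit: false });

async function describeQnn(): Promise<string> {
  const s = await getQnnSupport();
  if (s.canInit) return 'QNN ready';
  if (s.providerCompiled && !s.hasAccelerator)
    return 'QNN compiled, but no usable HTP (runtime libs missing or non-Qualcomm device)';
  return 'QNN unavailable';
}

describeQnn().then((msg) => console.log(msg));
```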
getNnapiSupport(modelBase64?)
Check NNAPI (Android Neural Networks API) support.

modelBase64?: Base64-encoded ONNX model to test (optional)
Promise<AccelerationSupport>
Why hasAccelerator: false but canInit: true?

- hasAccelerator checks whether the NDK reports a dedicated accelerator device (GPU/DSP/NPU)
- canInit checks whether ONNX Runtime can create a session with NNAPI

Use canInit to decide if you can use provider: 'nnapi'.

getXnnpackSupport(modelBase64?)
Check XNNPACK (CPU-optimized) support.

modelBase64?: Base64-encoded ONNX model to test (required for a meaningful canInit)
Promise<AccelerationSupport>
hasAccelerator is true when XNNPACK is compiled (CPU-optimized, not hardware acceleration).

getCoreMlSupport(modelBase64?)
Check Core ML (iOS) support.

modelBase64?: Base64-encoded ONNX model to test (not used; reserved for future use)
Promise<AccelerationSupport>
| Field | iOS 15+ with ANE | iOS without ANE | Android |
|---|---|---|---|
| providerCompiled | ✅ | ✅ | ❌ |
| hasAccelerator | ✅ (ANE) | ❌ | ❌ |
| canInit | ❌ (not implemented) | ❌ | ❌ |
getAvailableProviders()
List ONNX Runtime execution providers in the current build.

Promise<string[]>
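A sketch of checking which providers are compiled in. getAvailableProviders() is simulated here with a plausible Android build; the exact provider name strings depend on the ONNX Runtime build, so treat them as assumptions:

```typescript
// Simulated stand-in for the package's getAvailableProviders().
const getAvailableProviders = async (): Promise<string[]> =>
  ['CPUExecutionProvider', 'XnnpackExecutionProvider', 'QNNExecutionProvider'];

getAvailableProviders().then((providers) => {
  // Case-insensitive match to tolerate naming differences across builds.
  const hasQnn = providers.some((p) => p.toLowerCase().includes('qnn'));
  console.log(hasQnn); // true with the simulated list
});
```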
Using Providers with STT/TTS
Pass the provider option when creating engines:
STT with QNN
STT with NNAPI
TTS with Core ML
Streaming STT with QNN
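Each of the variants above follows the same pattern: the engine config carries a provider field. The sketch below is hypothetical; createSttEngine and its config shape are assumptions, not the package’s confirmed API, so check the STT/TTS guides for the real factory names:

```typescript
type Provider = 'cpu' | 'qnn' | 'nnapi' | 'xnnpack' | 'coreml';

// Hypothetical config shape; the real one is defined by the package.
interface SttConfig {
  modelPath: string;
  provider?: Provider;
}

// Simulated factory: real code would call into the native module.
function createSttEngine(config: SttConfig): { provider: Provider } {
  return { provider: config.provider ?? 'cpu' };
}

const engine = createSttEngine({ modelPath: 'models/stt.onnx', provider: 'qnn' });
console.log(engine.provider); // 'qnn'
```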
Provider Selection Strategy
Recommended provider selection order: on Android, try QNN, then NNAPI, then XNNPACK, then CPU; on iOS, try Core ML, then XNNPACK, then CPU. Use the canInit checks above to walk down the list.

Performance Comparison
Typical speedup over CPU (device-dependent):

| Provider | Speedup | Power Efficiency | Availability |
|---|---|---|---|
| QNN | 3-5x | Excellent | Qualcomm only |
| NNAPI | 2-4x | Good | Android 8.1+ |
| Core ML | 2-3x | Excellent | iOS (ANE on A12+) |
| XNNPACK | 1.5-2x | Good | Android/iOS |
| CPU | 1x | Baseline | Always available |
Actual performance depends on:
- Model architecture and size
- Device chipset and generation
- Thermal conditions
- OS version
Troubleshooting
QNN: providerCompiled=true but canInit=false
Possible causes:
- QNN runtime libs not added to jniLibs
- Device doesn’t have a Qualcomm chipset
- QNN backend initialization failed (unsupported SoC or driver)

Solutions:

- Add the QNN .so files (see Adding QNN Runtime Libs)
- Use NNAPI or CPU on non-Qualcomm devices
NNAPI: hasAccelerator=false but canInit=true
This is normal. NNAPI can work without a dedicated accelerator (it runs on CPU through NNAPI).

Use canInit to decide if you can use NNAPI. hasAccelerator only indicates whether the device reports a GPU/DSP/NPU.

Model fails with hardware provider but works on CPU
Some operations may not be supported by hardware EPs:
- Try a different model
- Check if the model is compatible with the provider
- Fall back to CPU for unsupported models
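The CPU-fallback pattern can be sketched as follows. createEngine is a hypothetical stand-in for the package’s real engine factory; here it simulates a QNN failure so the fallback path is exercised:

```typescript
type Provider = 'qnn' | 'nnapi' | 'coreml' | 'cpu';

// Simulated factory: throws for QNN to mimic an unsupported operation.
function createEngine(provider: Provider): { provider: Provider } {
  if (provider === 'qnn') throw new Error('unsupported op on QNN'); // simulated failure
  return { provider };
}

// Try the preferred hardware provider first, fall back to CPU on failure.
function createWithFallback(preferred: Provider): { provider: Provider } {
  try {
    return createEngine(preferred);
  } catch {
    return createEngine('cpu');
  }
}

console.log(createWithFallback('qnn').provider); // 'cpu'
```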
Core ML: hasAccelerator=false on newer iPhone
Apple Neural Engine requires:
- A12 chip or later (iPhone XS/XR and newer)
- iOS 15+ for reliable detection
Devices that don’t meet these requirements report hasAccelerator: false for the ANE.

Slower with hardware provider than CPU
This can happen when:
- Model is very small (overhead outweighs benefit)
- First run (initialization overhead)
- Thermal throttling
Testing Provider Performance
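A hedged benchmark sketch follows; runInference is a hypothetical stand-in for a real engine call (e.g. one STT recognition pass), so swap in actual package calls when measuring on a device:

```typescript
type Provider = 'cpu' | 'xnnpack' | 'nnapi' | 'qnn';

// Simulated inference so the harness is self-contained.
async function runInference(provider: Provider): Promise<void> {
  await new Promise((resolve) => setTimeout(resolve, 1));
}

// Average wall-clock time per run for each provider, with one warm-up run
// so first-run initialization overhead is excluded.
async function benchmark(
  providers: Provider[],
  runs = 5
): Promise<Record<string, number>> {
  const results: Record<string, number> = {};
  for (const p of providers) {
    await runInference(p); // warm-up
    const start = Date.now();
    for (let i = 0; i < runs; i++) await runInference(p);
    results[p] = (Date.now() - start) / runs; // avg ms per run
  }
  return results;
}

benchmark(['cpu', 'xnnpack']).then((r) => console.log(r));
```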
Benchmark each provider with a representative model on real devices before committing to one.

Next Steps
Speech-to-Text
Use hardware acceleration with STT
Text-to-Speech
Use hardware acceleration with TTS
Model Setup
Learn how to bundle and load models
Streaming STT
Real-time recognition with acceleration