Overview
The PCM Live Stream API provides native microphone capture with automatic resampling, delivering PCM audio at your requested sample rate (e.g., 16 kHz for STT). Both iOS and Android capture audio at a supported hardware rate (16000, 44100, or 48000 Hz), resample to your target rate, and emit Int16 mono PCM.
Import from:
react-native-sherpa-onnx/audio
Key Features
- Native capture with automatic resampling
- iOS: Audio Queue API (AudioQueueNewInput) with custom linear-interpolation resampler
- Android: SherpaOnnxPcmCapture with native resampling
- Float32 PCM output in [-1, 1] range for direct STT processing
- Base64 decoding with preallocated buffers to reduce GC pressure
- Built-in buffer package dependency
This API is typically used together with the Streaming STT API for live transcription.
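To make two of the processing steps named in the feature list concrete, here is a minimal sketch of linear-interpolation resampling and of converting Int16 PCM bytes to Float32 samples in the [-1, 1] range using a caller-supplied, preallocated buffer. Both helpers (resampleLinear, int16BytesToFloat32) are hypothetical and written only for illustration. The library performs this work in native code, and its actual implementation, byte order, and scaling factor are not shown in this document; little-endian input and division by 32768 are assumptions.

```typescript
// Hypothetical helpers for illustration; not part of react-native-sherpa-onnx.

// Resamples a whole mono buffer by linear interpolation between neighbours.
// No low-pass filter is applied, so downsampling can alias; a streaming
// resampler would also need to carry state between chunks.
function resampleLinear(input: Float32Array, fromRate: number, toRate: number): Float32Array {
  if (fromRate === toRate || input.length === 0) return input.slice();
  const outLength = Math.max(1, Math.round((input.length * toRate) / fromRate));
  const out = new Float32Array(outLength);
  const step = fromRate / toRate; // input samples advanced per output sample
  for (let i = 0; i < outLength; i++) {
    const pos = i * step;
    const left = Math.min(Math.floor(pos), input.length - 1);
    const right = Math.min(left + 1, input.length - 1);
    const frac = pos - left;
    out[i] = input[left] + (input[right] - input[left]) * frac;
  }
  return out;
}

// Converts little-endian Int16 PCM bytes (assumed layout) into Float32 samples,
// writing into a preallocated buffer so no new array is created per chunk.
// Returns a view over `out`; it is overwritten by the next call.
function int16BytesToFloat32(bytes: Uint8Array, out: Float32Array): Float32Array {
  const sampleCount = bytes.byteLength >> 1; // 2 bytes per sample; a trailing odd byte is ignored
  if (sampleCount > out.length) {
    throw new RangeError(`output buffer holds ${out.length} samples, need ${sampleCount}`);
  }
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  for (let i = 0; i < sampleCount; i++) {
    out[i] = view.getInt16(i * 2, true) / 32768; // maps -32768..32767 to [-1, 1)
  }
  return out.subarray(0, sampleCount);
}
```

Reusing one output buffer across chunks is what keeps allocation (and therefore garbage collection) low in a hot audio path; copy the returned view with slice() if the samples must outlive the next call.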
API Reference
createPcmLiveStream
Creates a PCM live stream from the device microphone with native capture and resampling.
Parameters
Configuration options for the PCM live stream
Returns
Handle for controlling the PCM live stream
Quick Start: Live Transcription
Minimal example showing how to start the microphone, feed PCM into a streaming STT stream, and display results.
Integration with Streaming STT
Typical workflow for live transcription:
- Create streaming STT engine: Initialize a streaming-capable STT model using createStreamingSTT. See the Streaming STT documentation for details.
- Create PCM live stream: Create a PCM live stream with the same sampleRate as your STT model (usually 16000 Hz).
- Process audio chunks: In the PCM handle’s onData callback, pass each chunk to stream.processAudioChunk. Use result.text for partial/final transcripts and optionally check isEndpoint for end-of-utterance detection.
Permissions
Microphone access requires proper permissions on both platforms.
Android Permissions
Add the RECORD_AUDIO permission to AndroidManifest.xml, then request the permission at runtime before starting capture.
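The manifest entry for this permission is the standard `<uses-permission android:name="android.permission.RECORD_AUDIO" />` element. For the runtime request, the following is a minimal sketch built on React Native's PermissionsAndroid module. The helper name ensureMicPermission and the rationale strings are hypothetical. The permissions object is passed in as a parameter, typed by a small local interface (PermissionsLike, also hypothetical), so the snippet stays self-contained; it is intended to be structurally compatible with the real module, but a cast may be needed depending on your React Native type definitions.

```typescript
// Structural subset of React Native's PermissionsAndroid, declared locally so
// the sketch type-checks on its own. In an app, pass the real module:
//   import { PermissionsAndroid } from 'react-native';
interface PermissionsLike {
  PERMISSIONS: { RECORD_AUDIO: string };
  RESULTS: { GRANTED: string };
  request(
    permission: string,
    rationale?: { title: string; message: string; buttonPositive: string },
  ): Promise<string>;
}

// Hypothetical helper: resolves to true only when the user grants microphone
// access. Denials (including "never ask again") resolve to false so the caller
// can show its own explanation instead of starting capture.
async function ensureMicPermission(permissions: PermissionsLike): Promise<boolean> {
  const status = await permissions.request(permissions.PERMISSIONS.RECORD_AUDIO, {
    title: 'Microphone access',
    message: 'Live transcription needs access to the microphone.',
    buttonPositive: 'OK',
  });
  return status === permissions.RESULTS.GRANTED;
}
```

This check applies to Android only, so a caller would typically guard it with Platform.OS === 'android' and call start() on the PCM handle only after it resolves to true.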
iOS Permissions
Set NSMicrophoneUsageDescription in Info.plist to a short explanation of why the app needs the microphone. iOS will automatically prompt the user for permission when start() is called.
iOS Simulator Note
On the iOS Simulator, this module uses the Audio Queue API, so the default input device is used. If the simulator produces silence:
- Choose a valid input device in your host Mac’s sound settings
- Test on a physical device for real microphone input
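When diagnosing the silence case above, it can help to measure the incoming chunks rather than guess. The following is a rough sketch, assuming chunks arrive as Float32 samples in [-1, 1]; the helpers rms and isSilent and the threshold value are hypothetical and are not part of the library.

```typescript
// Root-mean-square level of a chunk of Float32 samples in [-1, 1].
function rms(samples: Float32Array): number {
  if (samples.length === 0) return 0;
  let sumSquares = 0;
  for (let i = 0; i < samples.length; i++) {
    sumSquares += samples[i] * samples[i];
  }
  return Math.sqrt(sumSquares / samples.length);
}

// True when the chunk is effectively silent. The default threshold is an
// arbitrary illustrative value, not a library constant; tune it for your input.
function isSilent(samples: Float32Array, threshold = 1e-4): boolean {
  return rms(samples) < threshold;
}
```

If every chunk reports as silent for several seconds while you are speaking, the input device selection on the host Mac is the likely cause.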
Other Audio Utilities
The react-native-sherpa-onnx/audio module also provides audio file conversion utilities:
convertAudioToFormat
Converts an audio file to a supported format (MP3, FLAC, WAV).
On Android, this requires FFmpeg prebuilts. See the README for configuration options.
convertAudioToWav16k
Converts audio to WAV 16 kHz mono 16-bit PCM (ideal for offline STT).
Best Practices
Serial Processing
Process audio chunks serially to avoid overlapping STT calls. Use promise chains or queues rather than parallel processing.
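One way to get this behavior is a small promise-chain queue. The sketch below is a generic pattern, not something the library provides; SerialQueue is a hypothetical name, and the commented usage assumes that processAudioChunk returns a promise and takes the samples and sample rate, which this document does not confirm.

```typescript
// Runs async tasks strictly one after another, in submission order.
class SerialQueue {
  private tail: Promise<void> = Promise.resolve();

  enqueue<T>(task: () => Promise<T>): Promise<T> {
    // Start the task only after everything queued before it has settled.
    const result = this.tail.then(task);
    // Keep the chain usable even if this task rejects; the caller still
    // observes the rejection through the returned promise.
    this.tail = result.then(
      () => undefined,
      () => undefined,
    );
    return result;
  }
}

// Usage sketch (names and signatures are assumptions):
//   const queue = new SerialQueue();
//   pcm.onData((samples, sampleRate) => {
//     queue
//       .enqueue(() => stream.processAudioChunk(samples, sampleRate))
//       .then((result) => showTranscript(result.text))
//       .catch(reportError);
//   });
```

Because each chunk waits for the previous one, a slow STT call delays later chunks instead of racing them, so transcripts stay in order.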
Error Handling
Always register an onError handler to catch capture or resampling errors. Handle permission denials gracefully.
Resource Cleanup
Always unsubscribe from onData/onError and call stop() when done. Clean up STT streams and engines to free native resources.
Sample Rate Matching
Use the same sample rate for both PCM capture and STT (typically 16000 Hz) to avoid unnecessary resampling.
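Taken together, the practices in this section can be sketched as a single start/stop helper. Everything below is illustrative: the interfaces PcmHandleLike and SttStreamLike, the release() method, the callback argument shapes, and the startTranscription helper are assumptions made so the example is self-contained, and the real types exported by the library may differ. The sketch assumes the PCM handle was created with the same sampleRate as the STT model and treats a chunk at any other rate as an error.

```typescript
// Hypothetical shapes, assumed for illustration only. Check the library's
// type definitions for the real signatures.
type Unsubscribe = () => void;

interface PcmHandleLike {
  onData(cb: (samples: Float32Array, sampleRate: number) => void): Unsubscribe;
  onError(cb: (reason: unknown) => void): Unsubscribe;
  start(): Promise<void>;
  stop(): Promise<void>;
}

interface SttStreamLike {
  processAudioChunk(
    samples: Float32Array,
    sampleRate: number,
  ): Promise<{ text: string; isEndpoint?: boolean }>;
  release(): Promise<void>; // assumed cleanup method
}

// Starts capture and returns a dispose function that undoes everything.
async function startTranscription(
  pcm: PcmHandleLike,
  stream: SttStreamLike,
  expectedSampleRate: number,
  onText: (text: string) => void,
  onFailure: (reason: unknown) => void,
): Promise<() => Promise<void>> {
  let chain: Promise<void> = Promise.resolve();

  const offData = pcm.onData((samples, sampleRate) => {
    // Sample rate matching: refuse chunks the STT model was not set up for.
    if (sampleRate !== expectedSampleRate) {
      onFailure(new Error(`expected ${expectedSampleRate} Hz, got ${sampleRate} Hz`));
      return;
    }
    // Serial processing: each chunk waits for the previous STT call.
    chain = chain
      .then(() => stream.processAudioChunk(samples, sampleRate))
      .then((result) => onText(result.text))
      .catch(onFailure);
  });
  // Error handling: always listen for capture or resampling errors.
  const offError = pcm.onError(onFailure);

  try {
    await pcm.start();
  } catch (err) {
    // For example a permission denial: undo the subscriptions before failing.
    offData();
    offError();
    throw err;
  }

  // Resource cleanup: unsubscribe, stop capture, drain pending work, release.
  return async () => {
    offData();
    offError();
    await pcm.stop();
    await chain;
    await stream.release();
  };
}
```

Returning the dispose function from the same helper that subscribes makes it harder to forget a step, for example when it is called from a component's unmount cleanup.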