
Overview

The PCM Live Stream API provides native microphone capture with automatic resampling that delivers PCM audio at your requested sample rate (e.g., 16 kHz for STT). Both iOS and Android capture audio at a supported hardware rate (16000, 44100, or 48000 Hz), resample to your target rate, and emit mono Int16 PCM, which the JavaScript layer decodes to Float32 samples in the [-1, 1] range.
Import from: react-native-sherpa-onnx/audio

Key Features

  • Native capture with automatic resampling
  • iOS: Audio Queue API (AudioQueueNewInput) with a custom linear-interpolation resampler
  • Android: SherpaOnnxPcmCapture with native resampling
  • Float32 PCM output in [-1, 1] range for direct STT processing
  • Base64 decoding with preallocated buffers to reduce GC pressure
  • Built-in buffer package dependency
This API is typically used together with the Streaming STT API for live transcription.
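
To make the pipeline above concrete: the native layer delivers each chunk as base64-encoded Int16 PCM, and the JS side decodes it to Float32 in [-1, 1]. The following is a simplified sketch of that decode step (illustrative only, not the library's internal implementation, which uses preallocated buffers; little-endian sample order is assumed):

```typescript
// Decode a base64 chunk of little-endian Int16 PCM into Float32 samples
// in [-1, 1]. In React Native, Buffer comes from the bundled 'buffer' package.
function decodePcmChunk(base64: string): Float32Array {
  const bytes = Buffer.from(base64, 'base64');
  const out = new Float32Array(Math.floor(bytes.length / 2));
  for (let i = 0; i < out.length; i++) {
    out[i] = bytes.readInt16LE(i * 2) / 32768; // Int16 range → [-1, 1)
  }
  return out;
}
```

Because the library performs this decode for you, the samples you receive in onData are already normalized and ready to hand to an STT stream.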

API Reference

createPcmLiveStream

Creates a PCM live stream from the device microphone with native capture and resampling.
import { createPcmLiveStream } from 'react-native-sherpa-onnx/audio';

const pcm = createPcmLiveStream({
  sampleRate: 16000,
  channelCount: 1,
  bufferSizeFrames: 0,
});

Parameters

options (PcmLiveStreamOptions): Configuration options for the PCM live stream.

Returns

PcmLiveStreamHandle (object): Handle for controlling the PCM live stream.

Quick Start: Live Transcription

Minimal example showing how to start the microphone, feed PCM into a streaming STT stream, and display results.
import { createPcmLiveStream } from 'react-native-sherpa-onnx/audio';
import { createStreamingSTT, getOnlineTypeOrNull } from 'react-native-sherpa-onnx/stt';

const SAMPLE_RATE = 16000;

// 1) Create streaming STT engine and stream
const onlineType = getOnlineTypeOrNull('transducer');
if (!onlineType) throw new Error('Model does not support streaming');

const engine = await createStreamingSTT({
  modelPath: { type: 'asset', path: 'models/streaming-zipformer-en' },
  modelType: onlineType,
});
const stream = await engine.createStream();

// 2) Create PCM live stream with same sample rate as STT
const pcm = createPcmLiveStream({ sampleRate: SAMPLE_RATE });

pcm.onError((msg) => console.error('PCM error:', msg));

const unsubData = pcm.onData(async (samples, sampleRate) => {
  const { result } = await stream.processAudioChunk(samples, sampleRate);
  if (result.text) console.log('Partial:', result.text);
});

await pcm.start();
// ... recording in progress ...

// 3) Stop and cleanup
await pcm.stop();
unsubData();
await stream.release();
await engine.destroy();

Integration with Streaming STT

Typical workflow for live transcription:
1. Create streaming STT engine

Initialize a streaming-capable STT model using createStreamingSTT. See the Streaming STT documentation for details.
const engine = await createStreamingSTT({
  modelPath: { type: 'asset', path: 'models/streaming-zipformer-en' },
  modelType: 'transducer',
});
const stream = await engine.createStream();
2. Create PCM live stream

Create a PCM live stream with the same sampleRate as your STT model (usually 16000 Hz).
const pcm = createPcmLiveStream({ sampleRate: 16000 });
3. Process audio chunks

In the PCM handle’s onData callback, pass each chunk to stream.processAudioChunk. Use result.text for partial/final transcripts and optionally check isEndpoint for end-of-utterance detection.
pcm.onData(async (samples, sampleRate) => {
  const { result, isEndpoint } = await stream.processAudioChunk(samples, sampleRate);
  if (result.text) console.log('Transcript:', result.text);
  if (isEndpoint) console.log('End of utterance detected');
});
4. Start capture

Start the microphone capture.
await pcm.start();
5. Stop and cleanup

When done, stop capture, unsubscribe from events, and clean up resources.
await pcm.stop();
unsubData();
await stream.inputFinished(); // Optional: signal end of audio
await stream.release();
await engine.destroy();
Process chunks serially (e.g., with a promise chain or queue) to avoid overlapping calls. The example app at example/src/screens/stt/STTScreen.tsx shows a complete implementation with start/stop and cleanup.

Permissions

Microphone access requires proper permissions on both platforms.
Android: add the RECORD_AUDIO permission to AndroidManifest.xml:
<uses-permission android:name="android.permission.RECORD_AUDIO" />
Request permission at runtime:
import { PermissionsAndroid, Platform } from 'react-native';

if (Platform.OS === 'android') {
  const granted = await PermissionsAndroid.request(
    PermissionsAndroid.PERMISSIONS.RECORD_AUDIO
  );
  if (granted !== PermissionsAndroid.RESULTS.GRANTED) {
    console.error('Microphone permission denied');
  }
}
iOS: set NSMicrophoneUsageDescription in Info.plist:
<key>NSMicrophoneUsageDescription</key>
<string>This app needs microphone access for speech recognition</string>
iOS will automatically prompt the user for permission when start() is called.
Without proper permissions, start() may fail or onError may fire with an error message.

iOS Simulator Note

On the iOS Simulator, this module uses the same Audio Queue API as on device, so capture goes through the host Mac's default input device. If the simulator produces silence:
  • Choose a valid input device in your host Mac’s sound settings
  • Test on a physical device for real microphone input

Other Audio Utilities

The react-native-sherpa-onnx/audio module also provides audio file conversion utilities:

convertAudioToFormat

Converts an audio file to a supported format (MP3, FLAC, WAV).
import { convertAudioToFormat } from 'react-native-sherpa-onnx/audio';

await convertAudioToFormat(
  '/path/to/input.mp3',
  '/path/to/output.wav',
  'wav',
  16000 // Optional: output sample rate
);
On Android, this requires FFmpeg prebuilts. See the README for configuration options.

convertAudioToWav16k

Converts audio to WAV 16 kHz mono 16-bit PCM (ideal for offline STT).
import { convertAudioToWav16k } from 'react-native-sherpa-onnx/audio';

await convertAudioToWav16k(
  '/path/to/input.mp3',
  '/path/to/output.wav'
);

Best Practices

Serial Processing

Process audio chunks serially to avoid overlapping STT calls. Use promise chains or queues rather than parallel processing.
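
One way to enforce this is a small promise-chain serializer. The sketch below is not part of the library API; it simply guarantees that each enqueued task starts only after the previous one has settled:

```typescript
// Minimal promise-chain serializer: tasks enqueued here run one at a time,
// in order, even if an earlier task rejects.
function createSerialQueue() {
  let tail: Promise<unknown> = Promise.resolve();
  return (task: () => Promise<void>): Promise<void> => {
    const run = tail.then(task, task); // start after the previous task settles
    tail = run.catch(() => undefined); // keep the chain alive on errors
    return run;
  };
}
```

In an onData handler, you would wrap each stream.processAudioChunk call in enqueue(...) so STT calls never overlap, even when chunks arrive faster than they are processed.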

Error Handling

Always register an onError handler to catch capture or resampling errors. Handle permission denials gracefully.

Resource Cleanup

Always unsubscribe from onData/onError and call stop() when done. Clean up STT streams and engines to free native resources.

Sample Rate Matching

Use the same sample rate for both PCM capture and STT (typically 16000 Hz) to avoid unnecessary resampling.
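
For intuition about what resampling costs, the linear-interpolation technique the iOS capture path uses (per the feature list above) can be sketched in plain TypeScript. This is illustrative only, not the library's actual implementation:

```typescript
// Resample a Float32 PCM buffer via linear interpolation: each output sample
// is a weighted blend of the two nearest input samples.
function resampleLinear(
  input: Float32Array,
  fromRate: number,
  toRate: number
): Float32Array {
  if (fromRate === toRate) return input.slice();
  const ratio = fromRate / toRate;
  const outLength = Math.floor(input.length / ratio);
  const out = new Float32Array(outLength);
  for (let i = 0; i < outLength; i++) {
    const pos = i * ratio;
    const i0 = Math.floor(pos);
    const i1 = Math.min(i0 + 1, input.length - 1);
    const frac = pos - i0;
    out[i] = input[i0] * (1 - frac) + input[i1] * frac;
  }
  return out;
}
```

Matching the capture rate to the STT rate makes this a no-op, which is why 16000 Hz end to end is the recommended configuration.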

Type Definitions

export type PcmLiveStreamOptions = {
  sampleRate?: number;      // default 16000
  channelCount?: number;    // default 1
  bufferSizeFrames?: number; // default 0
};

export type PcmLiveStreamHandle = {
  start: () => Promise<void>;
  stop: () => Promise<void>;
  onData: (callback: (samples: number[], sampleRate: number) => void) => () => void;
  onError: (callback: (message: string) => void) => () => void;
};
Import these types:
import {
  createPcmLiveStream,
  type PcmLiveStreamOptions,
  type PcmLiveStreamHandle,
} from 'react-native-sherpa-onnx/audio';
