createStreamingTTS()

Creates a streaming TTS engine instance for real-time speech synthesis. Use this when you need:
  • Incremental audio generation with chunk callbacks
  • Real-time playback while generating
  • Lower latency for interactive applications
For batch generation of complete audio files, use createTTS() instead.
function createStreamingTTS(
  options: TTSInitializeOptions | ModelPathConfig
): Promise<StreamingTtsEngine>

Parameters

options
TTSInitializeOptions | ModelPathConfig
required
TTS initialization options. Can be either:
  • Full TTSInitializeOptions object with model path and configuration
  • Simple ModelPathConfig for quick initialization with defaults
See Configuration for details.

Returns

Returns a Promise that resolves to a StreamingTtsEngine instance.

Example

import { createStreamingTTS } from 'react-native-sherpa-onnx';

const tts = await createStreamingTTS({
  modelPath: { type: 'asset', path: 'models/vits-piper-en' },
  modelType: 'vits',
});

// Start built-in PCM player
const sampleRate = await tts.getSampleRate();
await tts.startPcmPlayer(sampleRate, 1); // 1 = mono

// Generate and play in real-time
const controller = await tts.generateSpeechStream(
  'Hello, this is streaming speech synthesis!',
  undefined,
  {
    onChunk: async (chunk) => {
      // Play audio chunk as it's generated
      await tts.writePcmChunk(chunk.samples);
      console.log(`Progress: ${(chunk.progress * 100).toFixed(1)}%`);
    },
    onEnd: async () => {
      await tts.stopPcmPlayer();
      console.log('Generation complete');
    },
    onError: (error) => {
      console.error('Error:', error.message);
    },
  }
);

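// Clean up once the stream has finished (after onEnd or onError has fired),
// not immediately after generateSpeechStream() returns its controller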
await tts.destroy();

StreamingTtsEngine

The streaming TTS engine interface returned by createStreamingTTS().

Properties

instanceId
string
Unique identifier for this streaming TTS engine instance.

Methods

generateSpeechStream()

Generates speech in streaming mode with chunk callbacks.
generateSpeechStream(
  text: string,
  options: TtsGenerationOptions | undefined,
  handlers: TtsStreamHandlers
): Promise<TtsStreamController>
text
string
required
The text to synthesize into speech.
options
TtsGenerationOptions
Optional generation parameters (speaker ID, speed, voice cloning, etc.). Pass undefined to omit them. See Configuration for details.
handlers
TtsStreamHandlers
required
Event handlers for stream chunks, completion, and errors. See TtsStreamHandlers below.
Returns: Promise<TtsStreamController> - Controller for managing the stream.
Example:
const allSamples: number[] = [];

const controller = await tts.generateSpeechStream(
  'This is a test',
  { speed: 1.2, sid: 0 },
  {
    onChunk: (chunk) => {
      allSamples.push(...chunk.samples);
      if (chunk.isFinal) {
        console.log('Final chunk received');
      }
    },
    onEnd: (event) => {
      if (event.cancelled) {
        console.log('Generation was cancelled');
      } else {
        console.log('Complete! Total samples:', allSamples.length);
      }
    },
    onError: (error) => {
      console.error('Error:', error.message);
    },
  }
);

// Cancel if needed
// await controller.cancel();
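
Spreading every chunk into push(), as the example above does, is fine for short utterances. For long texts the number of spread arguments can exceed what the JavaScript engine allows, and a growing number[] is a heavy container for audio. A minimal sketch of an alternative, assuming you store each chunk.samples array as it arrives and merge once at the end; concatSamples is a hypothetical helper, not part of the library:

```typescript
// Hypothetical helper (not part of react-native-sherpa-onnx): merges the
// per-chunk sample arrays into one contiguous Float32Array.
function concatSamples(chunks: readonly number[][]): Float32Array {
  // First pass: total length, so the buffer is allocated exactly once.
  let total = 0;
  for (const chunk of chunks) total += chunk.length;

  const merged = new Float32Array(total);
  let offset = 0;
  for (const chunk of chunks) {
    merged.set(chunk, offset); // copies without spreading arguments
    offset += chunk.length;
  }
  return merged;
}
```

In onChunk you would push chunk.samples onto an array of chunks, then call concatSamples() on that array from onEnd.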

cancelSpeechStream()

Cancels the current streaming generation.
cancelSpeechStream(): Promise<void>
Example:
// Start generation
const controller = await tts.generateSpeechStream('Long text...', undefined, handlers);

// Cancel after 1 second
setTimeout(async () => {
  await tts.cancelSpeechStream();
}, 1000);

startPcmPlayer()

Starts the built-in PCM audio player for real-time playback.
startPcmPlayer(sampleRate: number, channels: number): Promise<void>
sampleRate
number
required
Sample rate in Hz (e.g., 22050, 44100). Should match the model’s sample rate, as returned by getSampleRate().
channels
number
required
Number of audio channels:
  • 1 = mono (recommended for TTS)
  • 2 = stereo
Example:
const sampleRate = await tts.getSampleRate();
await tts.startPcmPlayer(sampleRate, 1);

writePcmChunk()

Writes audio samples to the PCM player. Call this from the onChunk handler.
writePcmChunk(samples: number[]): Promise<void>
samples
number[]
required
Float PCM samples in range [-1.0, 1.0].
Example:
await tts.generateSpeechStream('Hello', undefined, {
  onChunk: async (chunk) => {
    await tts.writePcmChunk(chunk.samples);
  },
});
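
If you post-process samples before playback (for example, to change the volume), the result can leave the documented [-1.0, 1.0] range. This page does not say how out-of-range values are handled, so the sketch below clamps them explicitly. applyGain is a hypothetical helper, not a library function:

```typescript
// Hypothetical helper: scales samples by `gain` and clamps the result to the
// [-1.0, 1.0] range that writePcmChunk() documents.
function applyGain(samples: readonly number[], gain: number): number[] {
  return samples.map((s) => Math.max(-1, Math.min(1, s * gain)));
}
```

Inside onChunk this would be used as await tts.writePcmChunk(applyGain(chunk.samples, 0.8)).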

stopPcmPlayer()

Stops and releases the PCM player.
stopPcmPlayer(): Promise<void>
Example:
try {
  await tts.startPcmPlayer(await tts.getSampleRate(), 1);
  // ... generate and play audio ...
} finally {
  await tts.stopPcmPlayer();
}

getModelInfo()

Returns model capabilities (sample rate and number of speakers).
getModelInfo(): Promise<TTSModelInfo>
Returns: Promise resolving to model information.
Example:
const info = await tts.getModelInfo();
console.log(`Sample rate: ${info.sampleRate}Hz`);
console.log(`Speakers: ${info.numSpeakers}`);

getSampleRate()

Returns the sample rate at which the model generates audio.
getSampleRate(): Promise<number>
Returns: Promise resolving to sample rate in Hz.

getNumSpeakers()

Returns the number of speakers/voices available in the model.
getNumSpeakers(): Promise<number>
Returns: Promise resolving to number of speakers.
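
The speaker count is useful for validating a speaker ID before passing it as the sid option. The sketch below assumes that sid is a zero-based index that must be smaller than the value returned by getNumSpeakers(); that assumption is not stated on this page, and resolveSpeakerId is a hypothetical helper:

```typescript
// Hypothetical helper. Assumes `sid` is a zero-based index that must be
// below the count returned by getNumSpeakers().
function resolveSpeakerId(requested: number, numSpeakers: number): number {
  const isValid =
    Number.isInteger(requested) && requested >= 0 && requested < numSpeakers;
  return isValid ? requested : 0; // fall back to the first speaker
}
```

A call site might look like { sid: resolveSpeakerId(userChoice, await tts.getNumSpeakers()) }, where userChoice stands for whatever value your UI supplies.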

destroy()

Releases native resources. Always call this when done to prevent memory leaks.
destroy(): Promise<void>
Example:
try {
  const controller = await tts.generateSpeechStream('Hello', undefined, handlers);
  // ... handle stream; wait for onEnd or onError before leaving this block ...
} finally {
  await tts.destroy();
}
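
Because generateSpeechStream() hands back a controller rather than the audio itself, its promise presumably settles when the stream starts, not when it finishes. Under that assumption, destroy() should wait until onEnd or onError has fired. The sketch below wraps the handlers in a promise to do that. The interfaces are local stand-ins that list only the members used here, and speakThenDestroy is a hypothetical helper:

```typescript
// Local structural types: only the members this sketch relies on.
interface StreamEnd { cancelled: boolean; }
interface StreamError { message: string; }
interface StreamHandlers {
  onEnd?: (event: StreamEnd) => void;
  onError?: (event: StreamError) => void;
}
interface StreamingEngine {
  generateSpeechStream(
    text: string,
    options: undefined,
    handlers: StreamHandlers
  ): Promise<unknown>;
  destroy(): Promise<void>;
}

// Hypothetical helper: resolves when onEnd fires, rejects when onError fires
// or the stream fails to start, and destroys the engine in either case.
async function speakThenDestroy(
  engine: StreamingEngine,
  text: string
): Promise<StreamEnd> {
  try {
    return await new Promise<StreamEnd>((resolve, reject) => {
      engine
        .generateSpeechStream(text, undefined, {
          onEnd: resolve,
          onError: (event) => reject(new Error(event.message)),
        })
        .catch(reject);
    });
  } finally {
    await engine.destroy();
  }
}
```

A real call would also pass an onChunk handler to consume the audio; it is left out here to keep the lifetime handling visible.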

TtsStreamHandlers

Event handlers for streaming TTS generation.
interface TtsStreamHandlers {
  onChunk?: (chunk: TtsStreamChunk) => void;
  onEnd?: (event: TtsStreamEnd) => void;
  onError?: (event: TtsStreamError) => void;
}

onChunk

Called for each audio chunk as it’s generated.
chunk
TtsStreamChunk
Audio chunk data. See TtsStreamChunk below.

onEnd

Called when generation completes or is cancelled.
event
TtsStreamEnd
Completion event. See TtsStreamEnd below.

onError

Called when an error occurs during generation.
event
TtsStreamError
Error event. See TtsStreamError below.

TtsStreamChunk

Audio chunk received during streaming generation.
interface TtsStreamChunk {
  instanceId?: string;
  requestId?: string;
  samples: number[];
  sampleRate: number;
  progress: number;
  isFinal: boolean;
}
instanceId
string
Instance ID of the TTS engine that generated this chunk.
requestId
string
Request ID for this specific generation.
samples
number[]
Float PCM audio samples in range [-1.0, 1.0].
sampleRate
number
Sample rate of the audio in Hz.
progress
number
Generation progress as a float between 0.0 and 1.0.
isFinal
boolean
true if this is the last chunk in the stream.
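
The samples and sampleRate fields together give the playback length of a chunk, which helps when tracking how much audio has been queued. A minimal sketch, assuming mono output (one sample per frame); ChunkAudio is a local stand-in for the two fields used, and chunkDurationSeconds is a hypothetical helper:

```typescript
// Only the TtsStreamChunk fields this sketch reads.
interface ChunkAudio {
  samples: number[];
  sampleRate: number;
}

// Hypothetical helper: playback length of one chunk in seconds, assuming
// mono audio (one sample per frame).
function chunkDurationSeconds(chunk: ChunkAudio): number {
  return chunk.sampleRate > 0 ? chunk.samples.length / chunk.sampleRate : 0;
}
```

Summing this value across onChunk calls gives the total duration of audio generated so far.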

TtsStreamEnd

Event emitted when streaming generation ends.
interface TtsStreamEnd {
  instanceId?: string;
  requestId?: string;
  cancelled: boolean;
}
instanceId
string
Instance ID of the TTS engine.
requestId
string
Request ID for this generation.
cancelled
boolean
true if generation was cancelled, false if it completed normally.

TtsStreamError

Event emitted when an error occurs during streaming.
interface TtsStreamError {
  instanceId?: string;
  requestId?: string;
  message: string;
}
instanceId
string
Instance ID of the TTS engine.
requestId
string
Request ID for this generation.
message
string
Error message describing what went wrong.

TtsStreamController

Controller returned by generateSpeechStream() for managing the stream.
interface TtsStreamController {
  cancel(): Promise<void>;
  unsubscribe(): void;
}

cancel()

Cancels the ongoing TTS generation.
const controller = await tts.generateSpeechStream('text', undefined, handlers);
await controller.cancel();

unsubscribe()

Removes the stream’s event listeners. This is called automatically when the stream ends or errors, but you can also call it manually.
controller.unsubscribe();

Complete Example

import { createStreamingTTS } from 'react-native-sherpa-onnx';

async function streamingTtsExample() {
  // Initialize streaming TTS
  const tts = await createStreamingTTS({
    modelPath: { type: 'asset', path: 'models/vits-piper-en' },
    modelType: 'vits',
  });

  try {
    // Get model info and start player
    const info = await tts.getModelInfo();
    console.log(`Model: ${info.sampleRate}Hz, ${info.numSpeakers} speakers`);
    
    await tts.startPcmPlayer(info.sampleRate, 1);

    // Generate and play streaming audio. This assumes generateSpeechStream()
    // resolves once the stream has started rather than when it ends, so wait
    // for onEnd/onError before the finally block destroys the engine.
    // To stop early, call tts.cancelSpeechStream(); onEnd then reports
    // cancelled = true.
    await new Promise<void>((resolve, reject) => {
      tts
        .generateSpeechStream(
          'This is real-time speech synthesis!',
          { speed: 1.0, sid: 0 },
          {
            onChunk: async (chunk) => {
              await tts.writePcmChunk(chunk.samples);
              console.log(
                `Chunk: ${chunk.samples.length} samples, ` +
                `${(chunk.progress * 100).toFixed(1)}% complete`
              );
            },
            onEnd: (event) => {
              console.log(event.cancelled ? 'Cancelled' : 'Complete');
              tts.stopPcmPlayer().then(resolve, reject);
            },
            onError: (error) => {
              console.error('Error:', error.message);
              reject(new Error(error.message));
            },
          }
        )
        .catch(reject);
    });
  } finally {
    await tts.destroy();
  }
}
