
Installation

After installing the package, you need to install the iOS dependencies:
1. Install pods

Install the CocoaPods dependencies (run from your project root):
npx pod-install
2. Choose build method (optional)

By default, whisper.rn uses a pre-built rnwhisper.xcframework for faster builds. To build from source instead, add this to your Podfile:
ENV['RNWHISPER_BUILD_FROM_SOURCE'] = '1'

Microphone Permissions

If you want to use realtime transcription features, you need to add microphone permissions.

Add to Info.plist

Add the microphone usage description to ios/[YOUR_APP_NAME]/Info.plist:
<key>NSMicrophoneUsageDescription</key>
<string>This app requires microphone access in order to transcribe speech</string>
Note that the microphone is not supported on tvOS.

Request Permission at Runtime

React Native itself ships no iOS permissions API (PermissionsAndroid is Android-only). On iOS, the system shows the microphone prompt automatically the first time your app records, as long as NSMicrophoneUsageDescription is present. To check or request the permission explicitly, you can use a library such as react-native-permissions:
import { Platform } from 'react-native'
import { request, PERMISSIONS, RESULTS } from 'react-native-permissions'

if (Platform.OS === 'ios') {
  const status = await request(PERMISSIONS.IOS.MICROPHONE)

  if (status !== RESULTS.GRANTED) {
    console.log('Microphone permission denied')
  }
}

Audio Session Management

whisper.rn provides utilities to manage iOS Audio Session settings for optimal recording quality and compatibility with other audio playback.

Available Categories

  • Ambient - Mix with other audio, no recording
  • SoloAmbient - Interrupt other audio, no recording
  • Playback - For playback only
  • Record - For recording only
  • PlayAndRecord - Recommended for realtime transcription
  • MultiRoute - Multiple audio routes

Available Options

  • MixWithOthers - Mix with other apps’ audio
  • DuckOthers - Lower other apps’ audio volume
  • InterruptSpokenAudioAndMixWithOthers
  • AllowBluetoothA2DP - Enable Bluetooth A2DP
  • AllowAirPlay - Enable AirPlay
  • AllowBluetooth - Enable Bluetooth (iOS only, not tvOS)
  • DefaultToSpeaker - Route to speaker by default (iOS only, not tvOS)

Available Modes

  • Default - Default mode
  • VoiceChat - Optimized for voice chat
  • VideoChat - Optimized for video chat
  • GameChat - Optimized for game chat
  • VideoRecording - Optimized for video recording
  • Measurement - For audio measurement
  • MoviePlayback - For movie playback
  • SpokenAudio - For podcasts/audiobooks

Manual Audio Session Management

You can manually configure the audio session before starting transcription:
import { AudioSessionIos } from 'whisper.rn'

// Configure category with options
await AudioSessionIos.setCategory(
  AudioSessionIos.Category.PlayAndRecord,
  [AudioSessionIos.CategoryOption.MixWithOthers]
)

// Set mode
await AudioSessionIos.setMode(AudioSessionIos.Mode.Default)

// Activate session
await AudioSessionIos.setActive(true)

// Now start recording/transcription

Automatic Session Management (Deprecated API)

If you use the deprecated transcribeRealtime API, the audio session can be configured automatically:
const { stop, subscribe } = await whisperContext.transcribeRealtime({
  audioSessionOnStartIos: {
    category: AudioSessionIos.Category.PlayAndRecord,
    options: [AudioSessionIos.CategoryOption.MixWithOthers],
    mode: AudioSessionIos.Mode.Default,
  },
  audioSessionOnStopIos: 'restore', // Or provide custom settings
})
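The `subscribe` function delivers events as transcription progresses. A minimal sketch of consuming them, assuming the event shape from the whisper.rn README (`isCapturing`, `data.result`); `pickLatestText` is a hypothetical helper, not part of the library:

```javascript
// Extract the current transcript text from a realtime event; returns ''
// when no result is available yet. Pure function, easy to unit-test.
function pickLatestText(evt) {
  return (evt.data && evt.data.result) || ''
}

// Usage with the subscription returned by transcribeRealtime:
// subscribe((evt) => {
//   if (evt.isCapturing) console.log('partial:', pickLatestText(evt))
//   else console.log('final:', pickLatestText(evt))
// })
```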

Core ML Support

Core ML significantly accelerates the encoder on iOS 15.0+ and tvOS 15.0+.

Model Requirements

Core ML models are loaded based on your GGML model path:
  • GGML model: ggml-tiny.en.bin
  • Core ML model: ggml-tiny.en-encoder.mlmodelc/
The GGML model is still required alongside the Core ML model: it provides the decoder and serves as a fallback when Core ML is unavailable.
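The naming convention above can be expressed as a small helper — a sketch, and `coreMLPathFor` is a hypothetical name, not a whisper.rn API:

```javascript
// Derive the expected Core ML encoder path from a GGML model path,
// following the convention: ggml-X.bin -> ggml-X-encoder.mlmodelc
function coreMLPathFor(ggmlPath) {
  return ggmlPath.replace(/\.bin$/, '-encoder.mlmodelc')
}

console.log(coreMLPathFor('models/ggml-tiny.en.bin'))
// -> 'models/ggml-tiny.en-encoder.mlmodelc'
```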

Core ML Model Structure

The .mlmodelc directory contains these files:
[
  'model.mil',              // Required
  'coremldata.bin',         // Required
  'weights/weight.bin',     // Required
  'metadata.json',          // Optional
  'analytics/coremldata.bin' // Optional
]
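If you unpack models at runtime, it can be worth validating the directory before loading. A sketch under the file list above; `missingCoreMLFiles` is a hypothetical helper:

```javascript
// Files that must exist inside a .mlmodelc directory for the encoder to load.
const REQUIRED_COREML_FILES = ['model.mil', 'coremldata.bin', 'weights/weight.bin']

// Given a list of relative paths found in the directory, return the
// required files that are missing (empty array means the model is complete).
function missingCoreMLFiles(files) {
  return REQUIRED_COREML_FILES.filter((f) => !files.includes(f))
}
```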

Downloading Core ML Models

Core ML models are available from Hugging Face: https://huggingface.co/ggerganov/whisper.cpp/tree/main
Models are distributed as zip archives. You’ll need to unzip them at runtime using a library like react-native-zip-archive.
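A sketch of the download-and-unpack flow. The URL pattern assumes the file layout of the ggerganov/whisper.cpp Hugging Face repo, `coreMLZipUrl` is a hypothetical helper, and the commented unzip call uses react-native-zip-archive's `unzip(source, target)`:

```javascript
// Build the download URL for a Core ML encoder archive on Hugging Face,
// e.g. coreMLZipUrl('tiny.en') for ggml-tiny.en-encoder.mlmodelc.zip.
function coreMLZipUrl(model) {
  return `https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-${model}-encoder.mlmodelc.zip`
}

// After downloading the zip (e.g. with a fetch/file-system library),
// unpack it at runtime:
// import { unzip } from 'react-native-zip-archive'
// await unzip(zipPath, destinationDir)
```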

Bundling Core ML Models

You can bundle Core ML models in your app (increases app size significantly):
import { Platform } from 'react-native'

const whisperContext = await initWhisper({
  filePath: require('../assets/ggml-tiny.en.bin'),
  coreMLModelAsset:
    Platform.OS === 'ios'
      ? {
          filename: 'ggml-tiny.en-encoder.mlmodelc',
          assets: [
            require('../assets/ggml-tiny.en-encoder.mlmodelc/weights/weight.bin'),
            require('../assets/ggml-tiny.en-encoder.mlmodelc/model.mil'),
            require('../assets/ggml-tiny.en-encoder.mlmodelc/coremldata.bin'),
          ],
        }
      : undefined,
})
Bundling models significantly increases app size. Consider using platform-specific files (context-opts.ios.js) to avoid bundling iOS assets for Android.
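One way to structure the platform split — a sketch relying on React Native's platform-specific extension resolution (`.ios.js` is picked on iOS, the plain `.js` elsewhere); the file and option names mirror the example above:

```javascript
// context-opts.ios.js — resolved on iOS; bundles the Core ML encoder assets.
module.exports = {
  coreMLModelAsset: {
    filename: 'ggml-tiny.en-encoder.mlmodelc',
    assets: [
      require('../assets/ggml-tiny.en-encoder.mlmodelc/weights/weight.bin'),
      require('../assets/ggml-tiny.en-encoder.mlmodelc/model.mil'),
      require('../assets/ggml-tiny.en-encoder.mlmodelc/coremldata.bin'),
    ],
  },
}

// context-opts.js — resolved on Android and other platforms; no iOS assets.
// module.exports = {}
```

Then spread the options into initWhisper, e.g. `initWhisper({ filePath, ...require('./context-opts') })`, so Android bundles never include the Core ML files.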

Disabling Core ML

To disable Core ML even when model files exist:
const whisperContext = await initWhisper({
  filePath: 'path/to/model.bin',
  useCoreMLIos: false,
})
Or disable during build by setting environment variable in your Podfile:
ENV['RNWHISPER_DISABLE_COREML'] = '1'

Metal GPU Acceleration

Metal acceleration is enabled by default on iOS and tvOS for significant performance improvements.

Enabling/Disabling Metal

Control Metal acceleration at runtime:
const whisperContext = await initWhisper({
  filePath: 'path/to/model.bin',
  useGpu: true, // Enabled by default
})
Disable Metal during build in your Podfile:
ENV['RNWHISPER_DISABLE_METAL'] = '1'

Extended Virtual Addressing

For medium or large models, enable the Extended Virtual Addressing capability in your Xcode project:
1. Open the Xcode workspace: ios/[YOUR_APP_NAME].xcworkspace
2. Select your app target in the project navigator
3. Under Signing & Capabilities, click + Capability, search for “Extended Virtual Addressing”, and add it
This allows your app to use more memory for larger models.

Build Configuration

Pre-built Framework (Default)

By default, whisper.rn uses ios/rnwhisper.xcframework which includes:
  • Pre-compiled whisper.cpp bindings
  • Metal shaders (.metallib)
  • Optimized builds for all iOS/tvOS architectures
This provides:
  • ✅ Faster build times
  • ✅ Smaller build artifacts
  • ✅ Consistent performance

Building from Source

To build from source, add to your Podfile before pod install:
ENV['RNWHISPER_BUILD_FROM_SOURCE'] = '1'
This is useful when:
  • Debugging native code
  • Contributing to whisper.rn
  • Customizing whisper.cpp configuration
Building from source increases build time significantly but gives you full control over compilation flags.

Minimum Versions

  • iOS: 11.0+
  • tvOS: 11.0+
  • Core ML: iOS 15.0+ / tvOS 15.0+

Platform Limitations

Microphone is not supported on tvOS. Realtime transcription features will not work on tvOS devices.
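In cross-platform code you can gate microphone features on this limitation. A minimal sketch using values shaped like React Native's `Platform.OS` and `Platform.isTV`; `iosMicrophoneAvailable` is a hypothetical helper:

```javascript
// On Apple platforms, the microphone is available on iOS but not tvOS.
// Pass Platform.OS and Platform.isTV from react-native.
function iosMicrophoneAvailable(os, isTV) {
  return os === 'ios' && !isTV
}

// Usage:
// if (iosMicrophoneAvailable(Platform.OS, Platform.isTV)) {
//   // safe to start realtime transcription
// }
```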

Performance Tips

  1. Use Core ML - 2-3x faster encoder on supported devices
  2. Enable Metal - Significant GPU acceleration
  3. Test in Release mode - Debug builds are much slower
  4. Use quantized models - Smaller size, faster inference
  5. Choose appropriate model size - tiny/base for most mobile use cases

Troubleshooting

Build errors

Clean derived data and rebuild:
rm -rf ~/Library/Developer/Xcode/DerivedData
cd ios && pod install && cd ..

Core ML not working

Check that:
  • iOS version is 15.0+
  • .mlmodelc directory is properly located
  • All required files exist in .mlmodelc/
  • useCoreMLIos is not set to false

Audio session conflicts

Manually manage audio session before starting other audio operations:
await AudioSessionIos.setActive(false)
// Other audio operations
await AudioSessionIos.setActive(true)
