Core ML support on iOS (15.0+) and tvOS (15.0+) accelerates the Whisper encoder using Apple’s Neural Engine, providing significant performance improvements.

Overview

Benefits:
  • 2-4x faster encoder processing on compatible devices
  • Lower power consumption
  • Utilizes Apple’s Neural Engine hardware
  • Automatic fallback to CPU if Core ML fails
Requirements:
  • iOS 15.0+ or tvOS 15.0+
  • Core ML model files (.mlmodelc directory)
  • GGML model file (still required for decoder)
Core ML only accelerates the encoder. The decoder still runs using the GGML model.

Setup

1. Download Core ML models

Core ML models are available on Hugging Face: https://huggingface.co/ggerganov/whisper.cpp/tree/main
Files are archived (.zip), so you need to:
  1. Download the .zip file (e.g., ggml-base-encoder.mlmodelc.zip)
  2. Extract to get the .mlmodelc directory
You can use react-native-zip-archive to extract at runtime, or host individual files yourself.
2. Understand model structure

A .mlmodelc directory contains (3 required files):
ggml-base-encoder.mlmodelc/
├── model.mil              # Required: Model definition
├── coremldata.bin         # Required: Model metadata
└── weights/
    └── weight.bin         # Required: Model weights
Optional files:
  • metadata.json
  • analytics/coremldata.bin
3. Place models with GGML model

Core ML models must be co-located with the GGML model:
/path/to/models/
├── ggml-base.bin                    # GGML model
└── ggml-base-encoder.mlmodelc/      # Core ML model (same prefix)
    ├── model.mil
    ├── coremldata.bin
    └── weights/weight.bin
Naming convention:
  • GGML: ggml-{model}.bin
  • Core ML: ggml-{model}-encoder.mlmodelc/
For example:
  • ggml-tiny.en.bin → ggml-tiny.en-encoder.mlmodelc/
  • ggml-base.bin → ggml-base-encoder.mlmodelc/
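The naming convention above can be captured in a small helper (hypothetical, not part of whisper.rn):

```typescript
// Hypothetical helper, not part of whisper.rn: derive the expected
// Core ML encoder directory from a GGML model path, following the
// ggml-{model}.bin -> ggml-{model}-encoder.mlmodelc/ convention.
function coreMLPathFor(ggmlPath: string): string {
  if (!ggmlPath.endsWith('.bin')) {
    throw new Error(`Expected a .bin GGML model path, got: ${ggmlPath}`)
  }
  return ggmlPath.replace(/\.bin$/, '-encoder.mlmodelc')
}
```

Because both files must share a prefix and a directory, deriving one path from the other avoids typos when downloading or checking files.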

Usage Patterns

Download and extract Core ML models at runtime to avoid increasing app size:
import { initWhisper } from 'whisper.rn'
import RNFS from 'react-native-fs'
import { unzip } from 'react-native-zip-archive'

// Download GGML model
const modelPath = `${RNFS.DocumentDirectoryPath}/ggml-base.bin`
await RNFS.downloadFile({
  fromUrl: 'https://example.com/ggml-base.bin',
  toFile: modelPath
}).promise

// Download Core ML model archive
const coreMLZip = `${RNFS.DocumentDirectoryPath}/ggml-base-encoder.mlmodelc.zip`
await RNFS.downloadFile({
  fromUrl: 'https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-base-encoder.mlmodelc.zip',
  toFile: coreMLZip
}).promise

// Extract Core ML model
const modelsDir = RNFS.DocumentDirectoryPath
await unzip(coreMLZip, modelsDir)
// Now you have: modelsDir/ggml-base-encoder.mlmodelc/

// Initialize with Core ML enabled
const whisperContext = await initWhisper({
  filePath: modelPath,
  useCoreMLIos: true  // Enable Core ML (default: true)
})

console.log('Core ML enabled:', whisperContext.gpu)
if (!whisperContext.gpu) {
  console.log('Reason:', whisperContext.reasonNoGPU)
}

Bundled Assets (Dev/Testing)

Bundle Core ML models in your app (increases app size significantly):
import { initWhisper } from 'whisper.rn'
import { Platform } from 'react-native'

const whisperContext = await initWhisper({
  filePath: require('../assets/ggml-base.bin'),
  coreMLModelAsset: Platform.OS === 'ios' ? {
    filename: 'ggml-base-encoder.mlmodelc',
    assets: [
      require('../assets/ggml-base-encoder.mlmodelc/weights/weight.bin'),
      require('../assets/ggml-base-encoder.mlmodelc/model.mil'),
      require('../assets/ggml-base-encoder.mlmodelc/coremldata.bin'),
    ]
  } : undefined
})
This significantly increases app size. Only use for development or if Core ML models are essential.

Metro Config for Assets

If bundling Core ML models, add .mil to Metro’s asset extensions:
// metro.config.js
const defaultAssetExts = require('metro-config/src/defaults/defaults').assetExts

module.exports = {
  resolver: {
    assetExts: [
      ...defaultAssetExts,
      'bin',  // GGML models
      'mil',  // Core ML models
    ]
  }
}
The React Native packager has a 2 GB file size limit, so very large models (e.g., the large f16 model at 2.9 GB) cannot be bundled.

Initialization Options

type ContextOptions = {
  filePath: string | number
  
  // Core ML options
  useCoreMLIos?: boolean              // Enable Core ML (default: true)
  coreMLModelAsset?: {                // For bundled assets
    filename: string                  // e.g., 'ggml-base-encoder.mlmodelc'
    assets: Array<string | number>   // Required files (paths or require())
  }
  
  // GPU options (alternative to Core ML)
  useGpu?: boolean                    // Use Metal GPU (default: true)
  useFlashAttn?: boolean              // Flash Attention (for GPU only)
  
  // Other options
  isBundleAsset?: boolean             // Is filePath a bundle asset
}
Priority:
  1. If useGpu: true → Tries Metal GPU (Core ML ignored)
  2. If useCoreMLIos: true and Core ML model exists → Uses Core ML
  3. Otherwise → Falls back to CPU
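The priority rules can be sketched as a pure function. This is an illustration of the documented behavior, not the actual native implementation, and it ignores the runtime fallback that occurs if Metal initialization fails:

```typescript
type Backend = 'metal' | 'coreml' | 'cpu'

// Illustrative only: models the documented option priority.
// Both useGpu and useCoreMLIos default to true.
function resolveBackend(opts: {
  useGpu?: boolean
  useCoreMLIos?: boolean
  coreMLModelExists: boolean // whether the .mlmodelc directory is present
}): Backend {
  if (opts.useGpu ?? true) return 'metal' // 1. Metal GPU wins; Core ML ignored
  if ((opts.useCoreMLIos ?? true) && opts.coreMLModelExists) return 'coreml' // 2.
  return 'cpu' // 3. fallback
}
```

Note that because `useGpu` defaults to true, you must pass `useGpu: false` explicitly if you want Core ML to be used.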

Checking Core ML Status

const whisperContext = await initWhisper({
  filePath: modelPath,
  useCoreMLIos: true
})

if (whisperContext.gpu) {
  console.log('Core ML is active!')
} else {
  console.log('Core ML not active')
  console.log('Reason:', whisperContext.reasonNoGPU)
}
Common reasonNoGPU values:
  • Core ML model not found
  • Core ML disabled in build
  • Failed to load Core ML model
  • GPU (Metal) takes priority
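One way to act on these values is a small dispatcher (a sketch; the message strings are the ones listed above, but real messages may vary, so match loosely):

```typescript
// Hypothetical helper: map a reasonNoGPU message to a suggested next step.
// Matches substrings rather than exact strings, since messages may vary.
function suggestFix(reason: string): string {
  const r = reason.toLowerCase()
  if (r.includes('not found')) return 'Check the Core ML model name and location'
  if (r.includes('disabled')) return 'Remove RNWHISPER_DISABLE_COREML from the Podfile'
  if (r.includes('failed to load')) return 'Re-download and re-extract the .mlmodelc directory'
  if (r.includes('metal')) return 'Expected: Metal GPU takes priority over Core ML'
  return 'Inspect device logs for details'
}
```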

Build Configuration

Enable/Disable Core ML

Control Core ML compilation in your Podfile:
# Podfile
ENV['RNWHISPER_DISABLE_COREML'] = '1'  # Disable Core ML compilation
Useful for:
  • Reducing build time during development
  • Building for devices without Core ML support

Check Core ML Availability

import { isUseCoreML, isCoreMLAllowFallback } from 'whisper.rn'

console.log('Core ML available:', isUseCoreML)
console.log('Fallback allowed:', isCoreMLAllowFallback)

Performance Comparison

Typical speedup (iPhone 13 Pro, tiny.en model):
| Mode    | Encode Time | Total Time |
|---------|-------------|------------|
| CPU     | ~800ms      | ~1200ms    |
| Core ML | ~200ms      | ~600ms     |
Speedup: ~2-3x for encoder, ~2x total
Larger models (base, small, medium) see greater speedups with Core ML.

GPU vs Core ML

whisper.rn supports two iOS acceleration methods:
| Feature         | Core ML             | Metal GPU                  |
|-----------------|---------------------|----------------------------|
| Accelerates     | Encoder only        | Full model                 |
| iOS Version     | 15.0+               | 11.0+                      |
| Priority        | Lower               | Higher (if `useGpu: true`) |
| Model Files     | .mlmodelc directory | GGML only                  |
| Typical Speedup | 2-3x encoder        | Varies                     |
Recommendation:
  • Use Core ML for best balance of speed and compatibility
  • Use Metal GPU if you need full-model acceleration (experimental)
// Prefer Metal GPU over Core ML
const whisperContext = await initWhisper({
  filePath: modelPath,
  useGpu: true,         // Metal GPU (higher priority)
  useCoreMLIos: true    // Fallback to Core ML if Metal fails
})

Troubleshooting

Core ML not loading

1. Check model naming

Ensure Core ML model directory name matches GGML model:
  • ggml-base.bin → ggml-base-encoder.mlmodelc/
  • Both must be in the same directory
2. Verify required files

Core ML directory must contain:
  • model.mil
  • coremldata.bin
  • weights/weight.bin
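A quick way to verify all three at once (a sketch; the `exists` check is injected so you can back it with react-native-fs results or any other file check):

```typescript
// The three files required inside a .mlmodelc directory.
const REQUIRED_COREML_FILES = ['model.mil', 'coremldata.bin', 'weights/weight.bin']

// Returns the required files missing from a .mlmodelc directory.
// `exists` is a caller-supplied check (e.g. built from RNFS.exists results).
function findMissingFiles(
  mlmodelcDir: string,
  exists: (path: string) => boolean,
): string[] {
  return REQUIRED_COREML_FILES.filter((f) => !exists(`${mlmodelcDir}/${f}`))
}
```

With react-native-fs you would resolve `RNFS.exists` for each candidate path first (it is async) and pass a lookup over the results.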
3. Check file paths

import RNFS from 'react-native-fs'

const modelDir = RNFS.DocumentDirectoryPath
const coreMLPath = `${modelDir}/ggml-base-encoder.mlmodelc`

const exists = await RNFS.exists(`${coreMLPath}/model.mil`)
console.log('Core ML model.mil exists:', exists)
4. Check reasonNoGPU

console.log('Reason:', whisperContext.reasonNoGPU)

App size too large

Don’t bundle Core ML models; download them at runtime instead.

Build errors

# Clean build
cd ios
rm -rf Pods/ Podfile.lock
pod install
If Core ML compilation fails, disable it:
# Podfile
ENV['RNWHISPER_DISABLE_COREML'] = '1'

Extended Virtual Addressing

For medium and large models on iOS, enable Extended Virtual Addressing:
  1. Open Xcode
  2. Select your target → Signing & Capabilities
  3. Add Increased Memory Limit capability
This entitlement allows apps to use more than the default memory limit.
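Under the hood, adding these capabilities writes entitlement keys into your app’s .entitlements file, roughly like the following. Xcode manages this file for you; the key names are Apple’s standard entitlement identifiers:

```xml
<!-- YourApp.entitlements, inside the top-level <dict> -->
<!-- Added by the "Increased Memory Limit" capability -->
<key>com.apple.developer.kernel.increased-memory-limit</key>
<true/>
<!-- Added by the "Extended Virtual Addressing" capability -->
<key>com.apple.developer.kernel.extended-virtual-addressing</key>
<true/>
```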
