# OpenVINO Execution Provider
The OpenVINO Execution Provider enables accelerated inference on Intel CPUs, integrated GPUs, and VPUs (Vision Processing Units) using the Intel OpenVINO toolkit.

## When to Use OpenVINO EP

Use the OpenVINO Execution Provider when:

- You’re running on Intel CPUs (especially Xeon or Core processors)
- You have Intel integrated GPUs (Iris Xe, UHD Graphics)
- You’re using Intel discrete GPUs (Arc, Flex, Max series)
- You have Intel VPUs or Movidius devices
- You need optimized inference on Intel hardware
- You want to deploy on edge devices with Intel processors
## Key Features
- Intel Hardware Optimization: Leverages Intel CPU extensions (AVX2, AVX-512, VNNI)
- Multi-Device Support: CPU, GPU, VPU in a single framework
- Graph Optimizations: Advanced model optimizations for Intel hardware
- Dynamic Shapes: Efficient handling of variable input sizes
- Precision Modes: FP32, FP16, INT8 quantization support
- Heterogeneous Execution: Can split workload across different devices
## Prerequisites

### Hardware Support
**CPUs:**

- Intel Core processors (6th gen and newer recommended)
- Intel Xeon processors (Skylake and newer)
- Support for SSE4.2, AVX2, AVX-512, and VNNI instructions

**GPUs:**

- Intel Integrated Graphics (HD Graphics 6xx and newer)
- Intel Iris Xe Graphics
- Intel Arc Graphics (A-series)
- Intel Data Center GPU Flex/Max series

**VPUs:**

- Intel Movidius Myriad X
- Intel Vision Processing Units
### Software Requirements
- OpenVINO Runtime: 2024.0 or newer recommended
- ONNX Runtime with OpenVINO support
- Intel GPU drivers (for GPU execution)
## Installation
### Python
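The OpenVINO-enabled build is published on PyPI as `onnxruntime-openvino`; it replaces the standard `onnxruntime` package rather than installing alongside it:

```shell
# Install the OpenVINO-enabled build of ONNX Runtime
# (replaces the standard onnxruntime package).
pip install onnxruntime-openvino
```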
#### Using Intel Distribution

### C++
Download pre-built binaries or build from source with OpenVINO support.

## Basic Usage
### Python
### C++

### C#
## Configuration Options

### Device Types
OpenVINO supports multiple device types with different precision modes:

#### Available Device Types
| Device Type | Description | Typical Use Case |
|---|---|---|
| `CPU_FP32` | CPU with 32-bit floating point | General purpose, development |
| `CPU_FP16` | CPU with 16-bit floating point | Memory-constrained systems |
| `GPU_FP32` | Intel GPU with 32-bit float | GPU acceleration, balanced |
| `GPU_FP16` | Intel GPU with 16-bit float | Maximum GPU performance |
| `MYRIAD_FP16` | Intel VPU/Movidius | Edge devices, low power |
| `HETERO:GPU,CPU` | Heterogeneous execution | Fallback support |
| `MULTI:GPU,CPU` | Multi-device execution | Load balancing |
## Advanced Configuration

### Device Selection

### Querying Available Devices
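One way to see which devices the OpenVINO Runtime itself can reach is its `Core` API (import path per OpenVINO 2024.x; the guard keeps the snippet usable on machines where OpenVINO is absent):

```python
try:
    from openvino import Core  # OpenVINO Runtime 2024.x import path
    devices = Core().available_devices  # e.g. ['CPU', 'GPU']
except ImportError:
    devices = []  # OpenVINO Runtime is not installed
print(devices)
```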
### CPU Optimization
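The EP exposes CPU threading controls through provider options; `num_of_threads` and `num_streams` below follow the option names in the OpenVINO EP documentation, but verify them against your ONNX Runtime version:

```python
# CPU-tuning options for the OpenVINO EP (a sketch).
openvino_options = {
    "device_type": "CPU_FP32",
    "num_of_threads": 8,  # inference threads used by the CPU plugin
    "num_streams": 2,     # parallel streams; trades latency for throughput
}
providers = [("OpenVINOExecutionProvider", openvino_options)]
print(providers)
```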
### GPU Optimization
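A GPU-oriented sketch; `cache_dir` and `enable_opencl_throttling` are option names taken from the OpenVINO EP documentation, so confirm them against your build:

```python
# GPU-tuning options for the OpenVINO EP (a sketch).
openvino_options = {
    "device_type": "GPU_FP16",           # FP16 gives the best GPU throughput
    "cache_dir": "./ov_cache",           # reuse compiled kernels across runs
    "enable_opencl_throttling": "true",  # yield GPU time to display workloads
}
providers = [("OpenVINOExecutionProvider", openvino_options)]
```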
### Heterogeneous Execution
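With `HETERO`, devices are listed in priority order and operators unsupported by the first device fall back to the next; `MULTI` instead balances whole inference requests across devices. A sketch of both configurations:

```python
# HETERO: each operator runs on the first listed device that supports it.
hetero_providers = [("OpenVINOExecutionProvider",
                     {"device_type": "HETERO:GPU,CPU"})]

# MULTI: whole inference requests are load-balanced across the devices.
multi_providers = [("OpenVINOExecutionProvider",
                    {"device_type": "MULTI:GPU,CPU"})]
```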
Splitting the workload across multiple devices lets unsupported operators fall back in priority order instead of failing.

## Performance Optimization
### Model Caching
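A sketch using the EP's `cache_dir` option: OpenVINO writes compiled blobs there during the first session creation and reloads them on later runs instead of recompiling:

```python
import os

cache_dir = "./ov_model_cache"
os.makedirs(cache_dir, exist_ok=True)

# First session creation compiles the model and fills the cache;
# subsequent loads read the compiled blob from cache_dir.
providers = [("OpenVINOExecutionProvider",
              {"device_type": "CPU_FP32", "cache_dir": cache_dir})]
```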
OpenVINO compiles models on first run; enabling the cache makes subsequent session loads much faster.

### Dynamic Shapes
OpenVINO handles dynamic shapes efficiently.

### Quantization (INT8)
For INT8 models, OpenVINO provides automatic optimization.

## Platform Support
| Platform | Architecture | Support |
|---|---|---|
| Linux | x64 | ✅ Full |
| Linux | ARM64 | ✅ Limited |
| Windows | x64 | ✅ Full |
| Windows | ARM64 | ⚠️ Experimental |
| macOS | x64 | ✅ Full |
| macOS | ARM64 | ⚠️ Limited |
## Use Cases

### Edge Deployment

### Cloud Inference (Intel Xeon)

### Intel Arc GPU
## Performance Comparison

Typical performance improvements over standard CPU execution:

| Hardware | Precision | Speedup | Notes |
|---|---|---|---|
| Intel Xeon (AVX-512) | FP32 | 2-4x | vs standard CPU EP |
| Intel Core i7/i9 | FP32 | 1.5-3x | vs standard CPU EP |
| Intel Iris Xe GPU | FP16 | 3-6x | vs CPU |
| Intel Arc GPU | FP16 | 5-10x | vs CPU |
| Movidius VPU | FP16 | 2-5x | Low power |
## Troubleshooting

### Provider Not Available
### GPU Not Detected

### Performance Issues

### Compilation Errors
## Comparison with Other Providers
| Feature | OpenVINO | oneDNN | CUDA |
|---|---|---|---|
| Intel CPU | Excellent | Good | N/A |
| Intel GPU | Excellent | N/A | N/A |
| NVIDIA GPU | N/A | N/A | Excellent |
| Edge Devices | Excellent | Limited | Limited |
| Setup Complexity | Moderate | Easy | Moderate |
## Next Steps
- Learn about model optimization for OpenVINO
- Compare with DirectML for cross-vendor support
- Explore INT8 quantization for better performance
- See OpenVINO documentation for advanced features