Overview
ONNX Runtime provides a mechanism to register custom operators at runtime. This is useful when:
- You need domain-specific operations not in the ONNX spec
- You want to optimize certain operations for your hardware
- You need to integrate proprietary algorithms
Creating a Custom Operator
C++ Implementation
Custom operators are implemented in C++ using the ONNX Runtime C API.
Operator Schema
Define the operator’s input/output schema.
Registering Custom Operators
Using SessionOptions
Register your custom operator when creating an inference session:
Python Example
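The registration step can be sketched as follows, assuming the custom operator has already been compiled into a shared library. `create_session` is a hypothetical helper and both paths are supplied by the caller; `SessionOptions.register_custom_ops_library` is the ONNX Runtime Python call that loads the library.

```python
from pathlib import Path


def create_session(model_path, custom_op_library, providers=("CPUExecutionProvider",)):
    """Create an InferenceSession with a custom-op shared library registered.

    Hypothetical helper: both paths are placeholders chosen by the caller.
    """
    library = Path(custom_op_library)
    if not library.is_file():
        # Fail early with a clear message instead of a loader error later.
        raise FileNotFoundError(f"custom op library not found: {library}")

    # Imported lazily so the path check above works without onnxruntime installed.
    import onnxruntime as ort

    options = ort.SessionOptions()
    # The library must be registered before the session is created.
    options.register_custom_ops_library(str(library.resolve()))
    return ort.InferenceSession(
        str(model_path), sess_options=options, providers=list(providers)
    )
```

A call might look like `create_session("model.onnx", "libcustom_op.so")`, where both file names are placeholders.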
Microsoft Contrib Operators
ONNX Runtime includes many contrib operators in the com.microsoft domain for specialized use cases:
Attention Operators
- Attention: Multi-head attention for transformers
- MultiHeadAttention: Optimized multi-head attention
- GroupQueryAttention: Grouped query attention for efficient inference
Quantization Operators
- MatMulNBits: N-bit quantized matrix multiplication
- QLinearConv: Quantized convolution
- DynamicQuantizeMatMul: Dynamic quantization for MatMul
Activation Functions
- Gelu: Gaussian Error Linear Unit
- FastGelu: Fast approximation of GELU
- QuickGelu: Quick GELU variant
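To make the difference between the three variants concrete, here is a pure-Python sketch of the formulas usually associated with these names. It is reference math only, not the ONNX Runtime kernels, which may differ in details such as optional bias inputs or data types; the `alpha=1.702` default is the value commonly used for the sigmoid variant.

```python
import math


def gelu(x):
    """Exact GELU: x * Phi(x), where Phi is the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def fast_gelu(x):
    """Tanh-based approximation of GELU."""
    inner = math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)
    return 0.5 * x * (1.0 + math.tanh(inner))


def quick_gelu(x, alpha=1.702):
    """Sigmoid-based approximation: x * sigmoid(alpha * x)."""
    return x / (1.0 + math.exp(-alpha * x))


# Compare the approximations with the exact form on a few sample points.
for value in (-2.0, -0.5, 0.0, 0.5, 2.0):
    print(value, gelu(value), fast_gelu(value), quick_gelu(value))
```

The tanh form tracks the exact function closely, while the sigmoid form trades a little accuracy for a cheaper expression.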
Usage Example
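A minimal sketch of using a contrib operator, assuming the `onnx`, `numpy`, and `onnxruntime` packages are installed. It builds a one-node model that calls `Gelu` from the com.microsoft domain and runs it on the CPU provider. The function names, tensor names, shape, and opset versions are illustrative choices.

```python
def build_gelu_model():
    """Build a one-node ONNX model that calls Gelu from the com.microsoft domain."""
    # Imported lazily; requires the `onnx` package.
    from onnx import TensorProto, helper

    x = helper.make_tensor_value_info("x", TensorProto.FLOAT, ["batch", 4])
    y = helper.make_tensor_value_info("y", TensorProto.FLOAT, ["batch", 4])
    # The domain attribute is what selects the contrib operator set.
    node = helper.make_node("Gelu", inputs=["x"], outputs=["y"], domain="com.microsoft")
    graph = helper.make_graph([node], "gelu_example", [x], [y])
    return helper.make_model(
        graph,
        opset_imports=[
            helper.make_opsetid("", 17),              # standard ONNX operators
            helper.make_opsetid("com.microsoft", 1),  # contrib operators
        ],
    )


def run_gelu_model(values):
    """Run the model; `values` must contain a multiple of 4 numbers."""
    import numpy as np
    import onnxruntime as ort

    model = build_gelu_model()
    session = ort.InferenceSession(
        model.SerializeToString(), providers=["CPUExecutionProvider"]
    )
    x = np.asarray(values, dtype=np.float32).reshape(-1, 4)
    (y,) = session.run(None, {"x": x})
    return y
```

Because contrib operators ship with ONNX Runtime, this sketch does not load a custom-op library; only the domain on the node and the matching opset import are needed.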
Best Practices
Performance
- Vectorize operations: Use SIMD instructions when possible
- Minimize memory allocations: Reuse buffers where feasible
- Thread safety: Ensure your operator is thread-safe for parallel execution
Compatibility
- Version your operators: Use operator versioning for backward compatibility
- Document schemas: Clearly document input/output types and shapes
- Handle edge cases: Validate inputs and handle boundary conditions
Testing
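One common way to test a custom operator is to compare its output with a simple reference implementation over a set of cases that includes edge cases. The sketch below uses only the standard library; `reference_relu` stands in for a hypothetical operator's reference, and `run_op` would be whatever callable invokes the real operator.

```python
import math


def assert_all_close(actual, expected, rel_tol=1e-5, abs_tol=1e-6):
    """Raise AssertionError at the first element that differs beyond tolerance."""
    if len(actual) != len(expected):
        raise AssertionError(f"length mismatch: {len(actual)} != {len(expected)}")
    for i, (a, e) in enumerate(zip(actual, expected)):
        if not math.isclose(a, e, rel_tol=rel_tol, abs_tol=abs_tol):
            raise AssertionError(f"index {i}: got {a!r}, expected {e!r}")


def reference_relu(values):
    """Pure-Python reference for a hypothetical custom op under test."""
    return [max(0.0, v) for v in values]


def check_operator(run_op, reference, cases, **tolerances):
    """Run every case through the operator and compare with the reference."""
    for case in cases:
        assert_all_close(run_op(case), reference(case), **tolerances)


# Edge cases: empty input, zero, mixed signs, very large magnitudes.
CASES = [[], [0.0], [-1.5, 2.0], [1e30, -1e30]]

# Here the reference is checked against itself as a placeholder for the real operator.
check_operator(reference_relu, reference_relu, CASES)
```

Tolerances matter because optimized kernels rarely match a reference bit for bit; the defaults here are arbitrary and should be chosen per operator and data type.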
Operator Execution Providers
Custom operators can be optimized for specific execution providers:
- CPU: Standard implementation
- CUDA: GPU-accelerated version
- TensorRT: TensorRT kernel implementation
- DirectML: DirectX ML implementation
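When a session is created, the providers listed above are requested by name. The sketch below picks the providers that are actually available, in a preferred order. `select_providers` is a hypothetical helper and the preference order is an assumption; the strings are the provider names ONNX Runtime uses.

```python
DEFAULT_PREFERENCE = (
    "TensorrtExecutionProvider",
    "CUDAExecutionProvider",
    "DmlExecutionProvider",
    "CPUExecutionProvider",
)


def select_providers(available, preferred=DEFAULT_PREFERENCE):
    """Return the preferred providers that are actually available, in order."""
    chosen = [name for name in preferred if name in available]
    if not chosen:
        raise RuntimeError(
            f"none of {list(preferred)} is available: {list(available)}"
        )
    return chosen


# Example with a hard-coded list; no ONNX Runtime installation is needed here.
print(select_providers(["CUDAExecutionProvider", "CPUExecutionProvider"]))
```

With ONNX Runtime installed, the list returned by `onnxruntime.get_available_providers()` can be passed as `available`, and the result as the `providers` argument of `InferenceSession`. A custom operator only runs on a provider for which it supplies a kernel, so keeping the CPU provider last gives a fallback.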
Common Issues
Operator Not Found
If you see “operator not found” errors:
- Verify the operator domain is registered
- Check the operator name matches exactly
- Ensure the custom op library is loaded before session creation
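The first two checks can be supported by listing which operators a model uses outside the default ONNX domain, so the names can be compared with what the custom-op library registers. This is a sketch: `non_default_ops` is a hypothetical helper, it inspects only top-level graph nodes (not nested subgraphs), and it works on an `onnx.ModelProto` or any object with the same attributes.

```python
from types import SimpleNamespace

# The default ONNX domain may be written as "" or "ai.onnx".
DEFAULT_DOMAINS = {"", "ai.onnx"}


def non_default_ops(model):
    """Return sorted (domain, op_type) pairs for nodes outside the default domain."""
    return sorted(
        {
            (node.domain, node.op_type)
            for node in model.graph.node
            if node.domain not in DEFAULT_DOMAINS
        }
    )


# Stand-in model for illustration; "my.domain" and "MyOp" are made-up names.
fake_model = SimpleNamespace(
    graph=SimpleNamespace(
        node=[
            SimpleNamespace(domain="", op_type="Relu"),
            SimpleNamespace(domain="my.domain", op_type="MyOp"),
        ]
    )
)
print(non_default_ops(fake_model))
```

For a real model, pass the result of `onnx.load(path)`. Each pair must match the domain and name of a registered operator exactly, including case.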