SessionOptions

The SessionOptions class allows you to configure various aspects of an InferenceSession, including graph optimization, thread pool sizes, execution providers, and profiling.

Constructor

SessionOptions()
Creates a new SessionOptions object with default settings.

Properties

graph_optimization_level
GraphOptimizationLevel
Controls the level of graph optimizations applied.
  • ORT_DISABLE_ALL - No optimizations
  • ORT_ENABLE_BASIC - Basic optimizations
  • ORT_ENABLE_EXTENDED - Extended optimizations
  • ORT_ENABLE_ALL - All optimizations including layout transformations (default)
intra_op_num_threads
int
Number of threads used to parallelize execution within individual nodes (operators). Default is 0, which lets ONNX Runtime choose.
inter_op_num_threads
int
Number of threads used to parallelize execution across nodes. Only takes effect when execution_mode is ORT_PARALLEL. Default is 0, which lets ONNX Runtime choose.
execution_mode
ExecutionMode
Controls whether operators are executed sequentially or in parallel.
  • ORT_SEQUENTIAL - Execute operators sequentially (default)
  • ORT_PARALLEL - Execute operators in parallel when possible
execution_order
ExecutionOrder
Controls the order in which graph nodes are executed.
  • DEFAULT - Use default topological order
  • PRIORITY_BASED - Use priority-based scheduling
enable_profiling
bool
Enable profiling to collect performance data. Default is False.
optimized_model_filepath
str
Path to save the optimized model. If set, the optimized graph will be saved to this location.
log_severity_level
int
Logging verbosity level (0=Verbose, 1=Info, 2=Warning, 3=Error, 4=Fatal). Default is 2.
log_verbosity_level
int
VLOG level for verbose logging. Only applies in debug builds when log_severity_level is 0. Default is 0.
enable_mem_pattern
bool
Enable memory pattern optimization. Default is True.
enable_mem_reuse
bool
Enable memory reuse optimization. Default is True.
enable_cpu_mem_arena
bool
Enable CPU memory arena allocator. Default is True.
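
The numeric log_severity_level values listed above are easy to mix up. As a minimal sketch, the helper below sets the level by name on any SessionOptions-like object. The names LOG_SEVERITY, set_log_severity, and make_quiet_options are hypothetical and not part of ONNX Runtime; only the log_severity_level property and its 0 to 4 scale come from this page.

```python
# Mapping taken from the log_severity_level description above.
LOG_SEVERITY = {"verbose": 0, "info": 1, "warning": 2, "error": 3, "fatal": 4}


def set_log_severity(sess_options, level: str) -> int:
    """Set log_severity_level by name (hypothetical helper); return the int used."""
    try:
        value = LOG_SEVERITY[level.lower()]
    except KeyError:
        raise ValueError(f"unknown log level: {level!r}") from None
    sess_options.log_severity_level = value
    return value


def make_quiet_options():
    """Build SessionOptions that log errors and above (requires onnxruntime)."""
    import onnxruntime as ort

    opts = ort.SessionOptions()
    set_log_severity(opts, "error")
    return opts
```

Calling make_quiet_options() requires onnxruntime to be installed; set_log_severity itself only assigns an attribute, so it can be exercised without it.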

Methods

add_session_config_entry()

Adds a custom session configuration entry. Both the key and the value are strings.
add_session_config_entry(
    key: str,
    value: str
)
key
str
required
Configuration key.
value
str
required
Configuration value.
Common Configuration Keys:
  • session.load_model_format - Set to "ONNX" or "ORT"
  • session.use_env_allocators - Use environment allocators ("1" to enable)
  • session.record_ep_graph_assignment_info - Record EP graph assignment ("1" to enable)
  • session.disable_prepacking - Disable weight prepacking ("1" to disable)
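
Because add_session_config_entry() takes string values, boolean flags need to be written as "1" or "0". The following is a minimal sketch of a wrapper that applies several entries at once and performs that conversion. The function names apply_config_entries and make_ort_format_options are hypothetical; the sketch assumes "0" is treated as "not enabled" for the flag-style keys listed above.

```python
def apply_config_entries(sess_options, entries: dict) -> dict:
    """Apply each entry through add_session_config_entry (hypothetical helper).

    Booleans become "1"/"0"; every other value is passed through str().
    Returns the string mapping that was applied.
    """
    applied = {}
    for key, value in entries.items():
        if isinstance(value, bool):
            text = "1" if value else "0"
        else:
            text = str(value)
        sess_options.add_session_config_entry(key, text)
        applied[key] = text
    return applied


def make_ort_format_options():
    """Example use with real SessionOptions (requires onnxruntime)."""
    import onnxruntime as ort

    opts = ort.SessionOptions()
    apply_config_entries(
        opts,
        {
            "session.load_model_format": "ORT",
            "session.disable_prepacking": True,
        },
    )
    return opts
```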

register_custom_ops_library()

Register a shared library containing custom operators.
register_custom_ops_library(library_path: str)
library_path
str
required
Path to the shared library (.so, .dll, or .dylib).
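
Since the library extension differs by platform, cross-platform code usually has to pick the file name at runtime. The sketch below shows one way to do that with the standard library. The helpers custom_ops_library_path and register_custom_ops, and the extension table, are illustrative assumptions rather than ONNX Runtime APIs.

```python
import platform
from pathlib import Path

# Extension per platform.system() value, matching the list above.
_EXTENSIONS = {"Linux": ".so", "Windows": ".dll", "Darwin": ".dylib"}


def custom_ops_library_path(directory, stem: str, system=None) -> Path:
    """Build the platform-specific library path (hypothetical helper)."""
    system = system or platform.system()
    try:
        extension = _EXTENSIONS[system]
    except KeyError:
        raise RuntimeError(f"unsupported platform: {system}") from None
    return Path(directory) / f"{stem}{extension}"


def register_custom_ops(sess_options, directory, stem: str) -> Path:
    """Check that the library exists, then register it on sess_options."""
    path = custom_ops_library_path(directory, stem)
    if not path.is_file():
        raise FileNotFoundError(path)
    sess_options.register_custom_ops_library(str(path))
    return path
```

Checking for the file first gives a clearer error than a failed library load, at the cost of one extra filesystem call.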

add_external_initializers()

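Supplies in-memory data for initializers that the model stores as external data. The names and values lists must have the same length, with each name matched to the value at the same index.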
add_external_initializers(
    names: list[str],
    values: list[OrtValue]
)
names
list[str]
required
Names of the initializers.
values
list[OrtValue]
required
OrtValue objects containing initializer data.
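
The two parallel lists are often built from a single name-to-array mapping. This is a minimal sketch of that conversion, assuming the weights are available as NumPy arrays and using OrtValue.ortvalue_from_numpy to wrap them. The helper names split_initializers and options_with_external_weights are hypothetical. Returning the OrtValue list so the caller can keep it referenced until the session is created is a precaution, not a documented requirement of this page.

```python
def split_initializers(weights: dict, to_ortvalue):
    """Turn {name: array} into parallel (names, values) lists (hypothetical helper).

    to_ortvalue is the callable that wraps each array, for example
    onnxruntime.OrtValue.ortvalue_from_numpy.
    """
    names = list(weights)
    values = [to_ortvalue(weights[name]) for name in names]
    return names, values


def options_with_external_weights(weights: dict):
    """Requires onnxruntime; weights maps initializer names to NumPy arrays."""
    import onnxruntime as ort

    names, values = split_initializers(weights, ort.OrtValue.ortvalue_from_numpy)
    opts = ort.SessionOptions()
    opts.add_external_initializers(names, values)
    # Hand the values back so the caller can keep them alive until
    # InferenceSession has been constructed.
    return opts, values
```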

add_free_dimension_override_by_name()

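Overrides a free (symbolic) dimension, identified by name, with a fixed value. Fixing dimensions that are known at session creation time can enable optimizations that depend on static shapes.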
add_free_dimension_override_by_name(
    dim_name: str,
    dim_value: int
)
dim_name
str
required
Name of the dimension to override.
dim_value
int
required
Value to use for the dimension.
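
When a model has several symbolic dimensions, the overrides can be kept in one mapping and applied in a loop. The sketch below assumes hypothetical dimension names such as "batch_size" and "sequence_length"; the real names depend on the model. The helper apply_dimension_overrides is illustrative and validates all values before applying any of them.

```python
def apply_dimension_overrides(sess_options, overrides: dict) -> None:
    """Apply {dim_name: dim_value} overrides (hypothetical helper)."""
    # Validate everything first so a bad entry leaves sess_options untouched.
    for dim_name, dim_value in overrides.items():
        is_int = isinstance(dim_value, int) and not isinstance(dim_value, bool)
        if not is_int or dim_value < 0:
            raise ValueError(
                f"{dim_name!r} needs a non-negative int, got {dim_value!r}"
            )
    for dim_name, dim_value in overrides.items():
        sess_options.add_free_dimension_override_by_name(dim_name, dim_value)


def make_fixed_shape_options():
    """Example use with real SessionOptions (requires onnxruntime)."""
    import onnxruntime as ort

    opts = ort.SessionOptions()
    # Dimension names here are placeholders; use the names from your model.
    apply_dimension_overrides(opts, {"batch_size": 1, "sequence_length": 128})
    return opts
```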

Example Usage

Basic Configuration

import onnxruntime as ort

sess_options = ort.SessionOptions()

# Enable all optimizations
sess_options.graph_optimization_level = ort.GraphOptimizationLevel.ORT_ENABLE_ALL

# Set thread counts
sess_options.intra_op_num_threads = 4
sess_options.inter_op_num_threads = 2

# Enable parallel execution
sess_options.execution_mode = ort.ExecutionMode.ORT_PARALLEL

# Create session with options
sess = ort.InferenceSession("model.onnx", sess_options=sess_options)

Enable Profiling

import onnxruntime as ort

sess_options = ort.SessionOptions()
sess_options.enable_profiling = True

sess = ort.InferenceSession("model.onnx", sess_options=sess_options)

# Run inference; `inputs` is a dict mapping input names to NumPy arrays
outputs = sess.run(None, inputs)

# Get profiling results
profile_file = sess.end_profiling()
print(f"Profiling data saved to: {profile_file}")

Save Optimized Model

import onnxruntime as ort

sess_options = ort.SessionOptions()
sess_options.optimized_model_filepath = "model_optimized.onnx"
sess_options.graph_optimization_level = ort.GraphOptimizationLevel.ORT_ENABLE_EXTENDED

sess = ort.InferenceSession("model.onnx", sess_options=sess_options)
# The optimized model is written to model_optimized.onnx when the session is created

Custom Configuration

import onnxruntime as ort

sess_options = ort.SessionOptions()

# Load ORT format model
sess_options.add_session_config_entry("session.load_model_format", "ORT")

# Record EP graph assignment for debugging
sess_options.add_session_config_entry("session.record_ep_graph_assignment_info", "1")

sess = ort.InferenceSession("model.ort", sess_options=sess_options)

# Get graph assignment info
assignment = sess.get_provider_graph_assignment_info()

Register Custom Operators

import onnxruntime as ort

sess_options = ort.SessionOptions()
sess_options.register_custom_ops_library("custom_ops.so")

sess = ort.InferenceSession("model_with_custom_ops.onnx", sess_options=sess_options)

Performance Tuning

import onnxruntime as ort

# For CPU inference
sess_options = ort.SessionOptions()
sess_options.intra_op_num_threads = 8  # Use 8 threads per op
sess_options.inter_op_num_threads = 1   # Sequential op execution
sess_options.execution_mode = ort.ExecutionMode.ORT_SEQUENTIAL
sess_options.graph_optimization_level = ort.GraphOptimizationLevel.ORT_ENABLE_ALL

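# For GPU inference (for example, with the CUDA execution provider)
sess_options = ort.SessionOptions()
# Parallel execution mode is not supported with the CUDA execution provider, so stay sequential
sess_options.execution_mode = ort.ExecutionMode.ORT_SEQUENTIAL
sess_options.graph_optimization_level = ort.GraphOptimizationLevel.ORT_ENABLE_ALL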