Overview
The reproducibility module provides utilities for deterministic execution across runs by controlling random seeds, threading behavior, and environment variables.
Functions
set_global_seed
Sets random seeds globally for Python and NumPy, and configures threading environment variables for deterministic behavior.
seed: Random seed value to use across all random number generators.
- Calls random.seed(seed) for Python's built-in random module
- Calls np.random.seed(seed) for NumPy
- Sets the PYTHONHASHSEED environment variable so that hash randomization is deterministic. Python reads this variable at interpreter startup, so the setting applies to Python processes launched afterwards, not to hashing in the interpreter that is already running.
- Configures threading environment variables (see below)
The following environment variables are set to "1" if not already set:
- OMP_NUM_THREADS: OpenMP thread count
- MKL_NUM_THREADS: Intel MKL thread count
- OPENBLAS_NUM_THREADS: OpenBLAS thread count
reproducibility_context
Captures the current reproducibility context, including platform information, seed, and environment variables.
Takes a configuration object (typically a dataclass or dict-like) containing a random_seed field.
Returns a dictionary containing reproducibility metadata:
- python_version (str): Python version string (e.g., "3.10.5")
- platform (str): Platform identifier (e.g., "Linux-5.15.0-x86_64")
- seed (int): Random seed extracted from config, or 0 if not found
- thread_env (dict): Current threading environment variable values:
  - OMP_NUM_THREADS: OpenMP thread count setting
  - MKL_NUM_THREADS: Intel MKL thread count setting
  - OPENBLAS_NUM_THREADS: OpenBLAS thread count setting
  - PYTHONHASHSEED: Python hash seed value
Usage Patterns
Standard Setup
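A minimal sketch of the standard setup. The module's import path is not given on this page, so the snippet defines a stand-in set_global_seed that mirrors the behavior described for that function; in real code you would import the module's own function instead. Storing str(seed) in PYTHONHASHSEED is an assumption, and NumPy is treated as optional only so the snippet runs without it.

```python
import os
import random

try:
    import numpy as np  # optional here; the documented function seeds NumPy too
except ImportError:
    np = None

THREAD_VARS = ("OMP_NUM_THREADS", "MKL_NUM_THREADS", "OPENBLAS_NUM_THREADS")


def set_global_seed(seed: int) -> None:
    """Stand-in mirroring the documented behavior (not the module's source)."""
    random.seed(seed)
    if np is not None:
        np.random.seed(seed)
    # Assumption: the seed is stored as a string. Only Python processes
    # started after this point pick the value up.
    os.environ["PYTHONHASHSEED"] = str(seed)
    for name in THREAD_VARS:
        # setdefault leaves any value the user exported beforehand untouched.
        os.environ.setdefault(name, "1")


# Seed first, before any code draws random numbers.
set_global_seed(42)
first_run = [random.random() for _ in range(3)]

# Re-seeding with the same value reproduces the same sequence.
set_global_seed(42)
second_run = [random.random() for _ in range(3)]

print(first_run == second_run)  # True
```

Calling the function a second time is only there to show the effect; a real script would seed once at the top.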
With Custom Configuration
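The sketch below shows one way a configuration object could feed reproducibility_context. TrainConfig is a hypothetical dataclass, and the function is a stand-in that follows the documented return structure; how the real function looks up random_seed, and what it reports for unset environment variables (None here), are assumptions.

```python
import os
import platform
from dataclasses import dataclass
from typing import Any

ENV_KEYS = (
    "OMP_NUM_THREADS",
    "MKL_NUM_THREADS",
    "OPENBLAS_NUM_THREADS",
    "PYTHONHASHSEED",
)


@dataclass
class TrainConfig:
    """Hypothetical configuration; only random_seed matters here."""
    random_seed: int = 123
    learning_rate: float = 1e-3


def reproducibility_context(config: Any) -> dict:
    """Stand-in following the documented return structure."""
    if isinstance(config, dict):
        seed = config.get("random_seed", 0)
    else:
        seed = getattr(config, "random_seed", 0)
    return {
        "python_version": platform.python_version(),
        "platform": platform.platform(),
        "seed": int(seed),
        # Assumption: variables that are not set are reported as None.
        "thread_env": {key: os.environ.get(key) for key in ENV_KEYS},
    }


from_dataclass = reproducibility_context(TrainConfig(random_seed=7))
from_dict = reproducibility_context({"random_seed": 7})
without_seed = reproducibility_context({})  # falls back to 0

print(from_dataclass["seed"], from_dict["seed"], without_seed["seed"])  # 7 7 0
```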
Saving Reproducibility Information
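One possible way to persist the metadata is to write it as JSON next to the run's results. In this sketch the context dictionary is built by hand as a stand-in for the value returned by reproducibility_context(config), and the file name reproducibility.json and the temporary directory are hypothetical choices.

```python
import json
import os
import platform
import tempfile
from pathlib import Path

ENV_KEYS = (
    "OMP_NUM_THREADS",
    "MKL_NUM_THREADS",
    "OPENBLAS_NUM_THREADS",
    "PYTHONHASHSEED",
)

# Stand-in for the dictionary returned by reproducibility_context(config).
context = {
    "python_version": platform.python_version(),
    "platform": platform.platform(),
    "seed": 42,
    "thread_env": {key: os.environ.get(key) for key in ENV_KEYS},
}

# Hypothetical output location; a real run would use its own results directory.
out_dir = Path(tempfile.mkdtemp())
out_path = out_dir / "reproducibility.json"

# sort_keys keeps the file stable between runs, which makes diffs easier to read.
out_path.write_text(json.dumps(context, indent=2, sort_keys=True), encoding="utf-8")

# Reading the file back yields the same metadata.
restored = json.loads(out_path.read_text(encoding="utf-8"))
print(restored == context)  # True
```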
Notes
- Always call set_global_seed() at the beginning of your script, before any random operations
- The threading configuration limits parallelism to ensure determinism but may impact performance
- reproducibility_context() works with both dataclasses and dictionary-like configuration objects
- Environment variables are only set if not already configured, allowing manual overrides
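The last note can be illustrated with plain os.environ calls. This sketch assumes the "set only if not already configured" rule behaves like os.environ.setdefault; it shows the effect of that rule and is not the module's actual code.

```python
import os

THREAD_VARS = ("OMP_NUM_THREADS", "MKL_NUM_THREADS", "OPENBLAS_NUM_THREADS")

# Start from a known state so the demonstration is repeatable.
for name in THREAD_VARS:
    os.environ.pop(name, None)

# Manual override made before seeding, for example exported in the shell.
os.environ["OMP_NUM_THREADS"] = "4"

# The "set only if not already set" rule: existing values win.
for name in THREAD_VARS:
    os.environ.setdefault(name, "1")

result = {name: os.environ[name] for name in THREAD_VARS}
print(result)
# {'OMP_NUM_THREADS': '4', 'MKL_NUM_THREADS': '1', 'OPENBLAS_NUM_THREADS': '1'}
```

To allow more threads at the cost of possible nondeterminism, export the variable before the script calls set_global_seed().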