DatasetSpec
Dataclass defining the static properties of a dataset.
Human-readable dataset name
Dataset version identifier
Relative or absolute path to training CSV file
Relative or absolute path to test CSV file
Number of feature columns (excluding label)
Minimum rows required for validation
Base URL for downloading dataset files
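As a rough sketch, a dataclass with these properties might look like the following. Only `name`, `version`, `train_path`, and `test_path` are confirmed as field names (they appear in FASHION_MNIST_SPEC); `n_features`, `min_rows`, `base_url`, the `frozen=True` flag, and every value in `example_spec` are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)  # assumption: "static properties" is read as immutable
class DatasetSpec:
    name: str         # human-readable dataset name
    version: str      # dataset version identifier
    train_path: str   # relative or absolute path to the training CSV
    test_path: str    # relative or absolute path to the test CSV
    n_features: int   # hypothetical field name: feature columns, excluding the label
    min_rows: int     # hypothetical field name: minimum rows required for validation
    base_url: str     # hypothetical field name: base URL for downloading files


# Placeholder values only; this is not the real Fashion-MNIST configuration.
example_spec = DatasetSpec(
    name="example",
    version="v1",
    train_path="data/train.csv",
    test_path="data/test.csv",
    n_features=4,
    min_rows=10,
    base_url="https://example.invalid/datasets",
)
```

Freezing the dataclass keeps a spec from being modified after construction, which suits a value that is meant to describe fixed dataset properties.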
FASHION_MNIST_SPEC
Pre-configured DatasetSpec for the Fashion-MNIST dataset.
- name: "fashion-mnist"
- version: "v1"
- train_path: "Neural Network from Scratch/task/Data/fashion-mnist_train.csv"
- test_path: "Neural Network from Scratch/task/Data/fashion-mnist_test.csv"
file_digest()
Compute the SHA256 hash digest of a dataset file.
Path to the file
Raises: FileNotFoundError if the file does not exist
Source: dataset_config.py:43
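A minimal sketch of such a digest helper, using only the standard library, is shown below. The parameter name `path`, the chunk size, and the choice to return a hexadecimal string are assumptions; the actual code in dataset_config.py may differ.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """Return the SHA256 hex digest of the file at `path` (sketch)."""
    p = Path(path)
    if not p.is_file():
        raise FileNotFoundError(f"dataset file not found: {p}")
    sha = hashlib.sha256()
    with p.open("rb") as fh:
        # Read in 1 MiB chunks so a large CSV is never held fully in memory.
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            sha.update(chunk)
    return sha.hexdigest()
```

Hashing in chunks matters for the training CSV, which can be large enough that reading it in one call would be wasteful.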
validate_dataset_file()
Validate dataset integrity at both file and tensor levels.
Path to dataset CSV file
Expected number of feature columns (excluding label column)
Minimum number of rows required
Optional SHA256 hash for file integrity verification
Returns: (n_rows, n_cols) for the validated dataset
Raises:
- FileNotFoundError if the file does not exist
- ValueError if the file is empty, the hash does not match, the shape does not match, there are too few rows, the data contains NaN, or a label falls outside the range [0, 9]
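The order of checks can be illustrated with a standard-library sketch. The function name `validate_csv_sketch` and its parameter names are hypothetical, and the code assumes one header row, the label in the first column, and an `n_cols` that counts the label column; none of these details are confirmed by the reference.

```python
import csv
import hashlib
import math
from pathlib import Path


def validate_csv_sketch(path, n_features, min_rows, expected_sha256=None):
    p = Path(path)
    if not p.is_file():
        raise FileNotFoundError(f"dataset file not found: {p}")
    raw = p.read_bytes()
    if not raw.strip():
        raise ValueError("file is empty")
    # File-level check: compare digests only when an expected hash is given.
    if expected_sha256 is not None:
        if hashlib.sha256(raw).hexdigest() != expected_sha256:
            raise ValueError("hash mismatch")
    # Data-level checks. Assumes one header row and the label in column 0.
    rows = list(csv.reader(raw.decode("utf-8").splitlines()))[1:]
    n_cols = n_features + 1
    for row in rows:
        if len(row) != n_cols:
            raise ValueError("shape mismatch")
        values = [float(v) for v in row]
        if any(math.isnan(v) for v in values):
            raise ValueError("contains NaN")
        if not 0 <= values[0] <= 9:
            raise ValueError("label out of range [0, 9]")
    if len(rows) < min_rows:
        raise ValueError("too few rows")
    return len(rows), n_cols
```

Running the cheap file-level checks first means a corrupted download is rejected before any row is parsed.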
load_dataset()
Load and preprocess a CSV dataset into normalized features and labels.
Path to dataset CSV file
Returns: (X, y) where:
- X: Normalized feature matrix (float32) with values scaled by the max pixel value
- y: Integer label vector (int32)
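One way the loading step could look with NumPy is sketched below. The name `load_dataset_sketch` is hypothetical, and the code assumes one header row with the label in the first column. It also reads "scaled by max pixel value" as division by the largest value found in the file; dividing by a fixed 255 would be an equally plausible reading.

```python
import numpy as np


def load_dataset_sketch(path):
    # Assumes one header row and the label in column 0.
    data = np.loadtxt(path, delimiter=",", skiprows=1, ndmin=2)
    y = data[:, 0].astype(np.int32)
    X = data[:, 1:].astype(np.float32)
    # Scale features into [0, 1]; skip the division for an all-zero matrix.
    max_val = X.max()
    if max_val > 0:
        X = X / max_val
    return X, y
```

Keeping X in float32 halves memory use relative to NumPy's default float64 while retaining ample precision for pixel intensities.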
download_fashion_mnist()
Download the Fashion-MNIST train and test CSV files from a remote server.
Dataset specification with download URLs and target paths
Returns: "train_sha256" and "test_sha256" entries containing the computed hashes
Raises: requests.HTTPError if download fails
Source: dataset_config.py:115
ensure_dataset_ready()
Validate dataset availability and optionally auto-download if missing or invalid.
Dataset specification
Expected feature count
Minimum row count
If True, automatically download dataset when validation fails
Optional SHA256 for integrity check
Returns: (n_rows, n_cols) after successful validation
Raises:
- Validation errors if auto_download=False and the dataset is invalid
- RuntimeError if auto-download fails
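The control flow described above can be sketched as a small wrapper. To stay self-contained, this version takes the validation and download steps as callables; the name `ensure_ready_sketch`, the callable parameters, and the second validation pass after the download are all assumptions, not the documented signature.

```python
def ensure_ready_sketch(validate, download, auto_download=False):
    """validate() returns (n_rows, n_cols) or raises; download() fetches files."""
    try:
        return validate()
    except (FileNotFoundError, ValueError):
        if not auto_download:
            raise  # surface the original validation error unchanged
    try:
        download()
    except Exception as exc:
        raise RuntimeError("auto-download failed") from exc
    # Validate again so the caller only ever receives a verified shape.
    return validate()
```

Chaining the original exception with `from exc` keeps the underlying download failure visible in the traceback.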