Overview
Theesios.processing.i90 module provides tools for parsing I90DIA files — daily XLS workbooks published by ESIOS containing generation, demand, and exchange data. These files have a complex structure with multiple sheets, variable time frequencies (hourly vs. quarter-hourly), and inconsistent header formats.
The module automatically:
- Detects hourly vs. quarter-hourly data
- Normalizes three different column header formats
- Parses dates and builds timezone-aware DatetimeIndex
- Extracts metadata (publication date, data date, table of contents)
I90 File Format
I90DIA files are daily XLS workbooks with the naming patternI90DIA_YYYYMMDD.xls. Each file contains:
- First sheet: Table of contents listing all data sheets and their descriptions
- Data sheets: Named with numeric identifiers (e.g., “3.1”, “3.2”) containing time-series data
- Metadata: Data date and publication date in the first sheet
Time Frequencies
Sheets can contain:- Hourly data: 24 periods (1–24)
- Quarter-hourly data: 96 periods (1–96), representing 15-minute intervals
Column Header Formats
I90 files use three different formats for time period columns:- Sequential integers:
1, 2, 3, ..., 24(hourly) or1, 2, 3, ..., 96(quarterly) - H-Q dash notation:
"1-1", "1-2", "1-3", "1-4", "2-1", ...(hour-quarter pairs) - NaN-filler format:
[1, NaN, NaN, NaN, 2, NaN, NaN, NaN, ...](one label per hour, three trailing NaNs for quarters 2–4)
I90Book
Represents an I90DIA workbook with lazy sheet loading.Constructor
Path to the I90DIA XLS file.
Attributes
Path to the XLS file.
Dictionary containing:
date_data: The date of the data (extracted from first sheet, row 4)date_publication: The publication date (extracted from first sheet, row 4)
Mapping of sheet names to their descriptions (e.g.,
{"3.1": "Generación programada PBF", ...}).Dictionary of already-loaded sheets. Use
get_sheet() or [] to load sheets on demand.Methods
get_sheet
Load and preprocess a specific sheet by name (lazy — only reads on demand).Name of the sheet to load (e.g.,
"3.1", "3.2").I90Sheet — The loaded sheet with preprocessed DataFrame.
Raises: KeyError if the sheet name doesn’t exist in the workbook.
__getitem__
Convenience syntax forget_sheet():
from_archive (classmethod)
Download I90 files from ESIOS archives and parse them into I90Book objects.An
ArchiveHandle from client.archives.get(34) (I90DIA archive ID is 34).Start date in
"YYYY-MM-DD" format.End date in
"YYYY-MM-DD" format.list[I90Book] — List of successfully parsed I90Book objects, sorted by date. Files that fail to parse are logged and skipped.
Examples
Basic Usage
Batch Download and Parse
Combine Data Across Multiple Files
I90Sheet
Represents a single sheet within an I90 workbook. Preprocessing automatically detects frequency and produces a DataFrame with a UTC DatetimeIndex.Constructor
Note: You should not instantiateI90Sheet directly. Use I90Book.get_sheet() or I90Book[sheet_name] instead.
Attributes
Name of the sheet (e.g.,
"3.1").The preprocessed DataFrame with a timezone-aware DatetimeIndex.
None until the sheet is loaded.Detected frequency:
"hourly" (24 periods) or "hourly-quarterly" (96 periods). None until preprocessing.Reference to the parent book’s metadata dictionary.
Path to the parent XLS file.
DataFrame Structure
After preprocessing,sheet.df contains:
- Index:
DatetimeIndexwith timezone"Europe/Madrid", covering the data date with the detected frequency - Columns: Variable names or values from the sheet
- Values: Numeric data (converted to
float)
- Period 1 → 00:00
- Period 2 → 00:15
- Period 96 → 23:45
- Period 1 → 01:00 (Spanish convention: hour 1 = 01:00)
- Period 24 → 24:00 (midnight of the next day)
Examples
Inspect Sheet Properties
Access Raw Data
Frequency Detection
The parser automatically detects whether a sheet contains hourly or quarter-hourly data by checking if any normalized column index exceeds 30:- Values ≤ 24: Hourly data (24 periods)
- Values > 30: Quarter-hourly data (96 periods)
- Hourly:
period * 60minutes (e.g., period 1 → 60 min = 01:00) - Quarter-hourly:
(period - 1) * 15minutes (e.g., period 1 → 0 min = 00:00)
Column Normalization
The_normalize_datetime_columns method handles three formats:
1. Sequential Integers
Input:[1, 2, 3, ..., 24] or [1, 2, 3, ..., 96]
Output: Unchanged (already sequential)
2. H-Q Dash Notation
Input:["1-1", "1-2", "1-3", "1-4", "2-1", "2-2", ...]
Processing:
- Split on
"-"to extract hour and quarter - Convert to sequential:
(hour - 1) * 4 + quarter
[1, 2, 3, 4, 5, 6, ..., 96]
3. NaN-Filler Format
Input:[1, NaN, NaN, NaN, 2, NaN, NaN, NaN, 3, ...]
Processing:
- Forward-fill NaN values →
[1, 1, 1, 1, 2, 2, 2, 2, 3, ...] - Detect duplicates (same hour repeated 4 times)
- Assign sequential indices:
[1, 2, 3, 4, 5, 6, 7, 8, 9, ...]
[1, 2, 3, 4, 5, 6, ..., 96]
Error Handling
- Missing sheets:
KeyErroris raised when accessing a non-existent sheet - Parsing failures: Logged as warnings and skipped in
from_archive() - Malformed sheets: Return empty DataFrame and log error
- Timezone ambiguity: DST transitions are handled with
ambiguous="infer"
Integration with Archives
I90 files are available through the ESIOS archives API (archive ID 34):Notes
- Uses
python-calaminefor fast XLS parsing - All timestamps are in
"Europe/Madrid"timezone - Sheet names are case-sensitive (e.g.,
"3.1"not"3.1 "with trailing space) - The first sheet is always loaded during
I90Bookinitialization (to extract metadata) - Other sheets are loaded lazily only when accessed
- Empty sheets return an empty DataFrame without raising an error
