I90 File Processing

Overview

The esios.processing.i90 module provides tools for parsing I90DIA files — daily XLS workbooks published by ESIOS containing generation, demand, and exchange data. These files have a complex structure with multiple sheets, variable time frequencies (hourly vs. quarter-hourly), and inconsistent header formats. The module automatically:

Detects hourly vs. quarter-hourly data
Normalizes three different column header formats
Parses dates and builds timezone-aware DatetimeIndex
Extracts metadata (publication date, data date, table of contents)

I90 File Format

I90DIA files are daily XLS workbooks with the naming pattern I90DIA_YYYYMMDD.xls. Each file contains:

First sheet: Table of contents listing all data sheets and their descriptions
Data sheets: Named with numeric identifiers (e.g., “3.1”, “3.2”) containing time-series data
Metadata: Data date and publication date in the first sheet

Time Frequencies

Sheets can contain:

Hourly data: 24 periods (1–24)
Quarter-hourly data: 96 periods (1–96), representing 15-minute intervals

Column Header Formats

I90 files use three different formats for time period columns:

Sequential integers: 1, 2, 3, ..., 24 (hourly) or 1, 2, 3, ..., 96 (quarterly)
H-Q dash notation: "1-1", "1-2", "1-3", "1-4", "2-1", ... (hour-quarter pairs)
NaN-filler format: [1, NaN, NaN, NaN, 2, NaN, NaN, NaN, ...] (one label per hour, three trailing NaNs for quarters 2–4)

The parser automatically detects and normalizes all three formats to sequential period indices (1–96).

I90Book

Represents an I90DIA workbook with lazy sheet loading.

from esios.processing.i90 import I90Book

book = I90Book("I90DIA_20250101.xls")
sheet = book["3.1"]
df = sheet.df

Constructor

I90Book(path)

path

Path | str

required

Path to the I90DIA XLS file.

Attributes

path

Path

Path to the XLS file.

metadata

dict

Dictionary containing:

date_data: The date of the data (extracted from first sheet, row 4)
date_publication: The publication date (extracted from first sheet, row 4)

table_of_contents

dict[str, str]

Mapping of sheet names to their descriptions (e.g., {"3.1": "Generación programada PBF", ...}).

sheets

dict[str, I90Sheet]

Dictionary of already-loaded sheets. Use get_sheet() or [] to load sheets on demand.

Methods

get_sheet

Load and preprocess a specific sheet by name (lazy — only reads on demand).

sheet = book.get_sheet("3.1")

sheet_name

str

required

Name of the sheet to load (e.g., "3.1", "3.2").

Returns: I90Sheet — The loaded sheet with preprocessed DataFrame. Raises: KeyError if the sheet name doesn’t exist in the workbook.

getitem

Convenience syntax for get_sheet():

sheet = book["3.1"]  # equivalent to book.get_sheet("3.1")

from_archive (classmethod)

Download I90 files from ESIOS archives and parse them into I90Book objects.

from esios import ESIOSClient

client = ESIOSClient(token="your-token")
archive = client.archives.get(34)  # I90DIA archive

books = I90Book.from_archive(
    archive,
    start="2025-01-01",
    end="2025-01-31"
)

Examples

Basic Usage

from esios.processing.i90 import I90Book

# Open an I90 file
book = I90Book("I90DIA_20250101.xls")

# View metadata
print(book.metadata["date_data"])         # 2025-01-01
print(book.metadata["date_publication"])  # 2025-01-02

# View table of contents
print(book.table_of_contents)
# {'3.1': 'Generación programada PBF', '3.2': 'Demanda real', ...}

# Load a specific sheet
sheet = book["3.1"]
df = sheet.df
print(df.head())

Batch Download and Parse

from esios import ESIOSClient
from esios.processing.i90 import I90Book

client = ESIOSClient(token="your-token")
archive = client.archives.get(34)  # I90DIA archive

# Download and parse January 2025 files
books = I90Book.from_archive(
    archive,
    start="2025-01-01",
    end="2025-01-31"
)

print(f"Loaded {len(books)} I90 files")

# Access sheets across multiple files
for book in books:
    sheet = book["3.1"]
    print(f"{book.metadata['date_data']}: {sheet.df.shape}")

Combine Data Across Multiple Files

import pandas as pd

# Collect sheet 3.1 from all files
dfs = []
for book in books:
    sheet = book["3.1"]
    dfs.append(sheet.df)

# Concatenate into a single DataFrame
full_df = pd.concat(dfs, axis=0).sort_index()
print(full_df)

I90Sheet

Represents a single sheet within an I90 workbook. Preprocessing automatically detects frequency and produces a DataFrame with a UTC DatetimeIndex.

sheet = book["3.1"]
print(sheet.frequency)  # "hourly" or "hourly-quarterly"
df = sheet.df

Constructor

Note: You should not instantiate I90Sheet directly. Use I90Book.get_sheet() or I90Book[sheet_name] instead.

Attributes

sheet_name

str

Name of the sheet (e.g., "3.1").

pd.DataFrame | None

The preprocessed DataFrame with a timezone-aware DatetimeIndex. None until the sheet is loaded.

frequency

str | None

Detected frequency: "hourly" (24 periods) or "hourly-quarterly" (96 periods). None until preprocessing.

metadata

dict

Reference to the parent book’s metadata dictionary.

path

Path

Path to the parent XLS file.

DataFrame Structure

After preprocessing, sheet.df contains:

Index: DatetimeIndex with timezone "Europe/Madrid", covering the data date with the detected frequency
Columns: Variable names or values from the sheet
Values: Numeric data (converted to float)

For quarter-hourly data:

Period 1 → 00:00
Period 2 → 00:15
Period 96 → 23:45

For hourly data:

Period 1 → 01:00 (Spanish convention: hour 1 = 01:00)
Period 24 → 24:00 (midnight of the next day)

Examples

Inspect Sheet Properties

sheet = book["3.1"]

print(sheet.sheet_name)   # "3.1"
print(sheet.frequency)    # "hourly" or "hourly-quarterly"
print(sheet.df.shape)     # (24, N) or (96, N) depending on frequency
print(sheet.df.index.tz)  # Europe/Madrid

Access Raw Data

sheet = book["3.1"]
df = sheet.df

# Filter by time
morning = df.between_time("06:00", "12:00")

# Resample quarter-hourly to hourly
if sheet.frequency == "hourly-quarterly":
    hourly = df.resample("H").mean()

Frequency Detection

The parser automatically detects whether a sheet contains hourly or quarter-hourly data by checking if any normalized column index exceeds 30:

Values ≤ 24: Hourly data (24 periods)
Values > 30: Quarter-hourly data (96 periods)

Time deltas are calculated as:

Hourly: period * 60 minutes (e.g., period 1 → 60 min = 01:00)
Quarter-hourly: (period - 1) * 15 minutes (e.g., period 1 → 0 min = 00:00)

Column Normalization

The _normalize_datetime_columns method handles three formats:

1. Sequential Integers

Input: [1, 2, 3, ..., 24] or [1, 2, 3, ..., 96] Output: Unchanged (already sequential)

2. H-Q Dash Notation

Input: ["1-1", "1-2", "1-3", "1-4", "2-1", "2-2", ...] Processing:

Split on "-" to extract hour and quarter
Convert to sequential: (hour - 1) * 4 + quarter

Output: [1, 2, 3, 4, 5, 6, ..., 96]

3. NaN-Filler Format

Input: [1, NaN, NaN, NaN, 2, NaN, NaN, NaN, 3, ...] Processing:

Forward-fill NaN values → [1, 1, 1, 1, 2, 2, 2, 2, 3, ...]
Detect duplicates (same hour repeated 4 times)
Assign sequential indices: [1, 2, 3, 4, 5, 6, 7, 8, 9, ...]

Output: [1, 2, 3, 4, 5, 6, ..., 96]

Error Handling

Missing sheets: KeyError is raised when accessing a non-existent sheet
Parsing failures: Logged as warnings and skipped in from_archive()
Malformed sheets: Return empty DataFrame and log error
Timezone ambiguity: DST transitions are handled with ambiguous="infer"

Integration with Archives

I90 files are available through the ESIOS archives API (archive ID 34):

from esios import ESIOSClient
from esios.processing.i90 import I90Book

client = ESIOSClient(token="your-token")
archive = client.archives.get(34)

# Download and parse with one call
books = I90Book.from_archive(
    archive,
    start="2025-01-01",
    end="2025-01-31"
)

# Files are cached locally — re-running won't re-download
books = I90Book.from_archive(archive, start="2025-01-01", end="2025-01-31")

Notes

Uses python-calamine for fast XLS parsing
All timestamps are in "Europe/Madrid" timezone
Sheet names are case-sensitive (e.g., "3.1" not "3.1 " with trailing space)
The first sheet is always loaded during I90Book initialization (to extract metadata)
Other sheets are loaded lazily only when accessed
Empty sheets return an empty DataFrame without raising an error

Client

Managers

Models

Processing

CLI

Overview

I90 File Format

Time Frequencies

Column Header Formats

I90Book

Constructor

Attributes

Methods

get_sheet

getitem

from_archive (classmethod)

Examples

Basic Usage

Batch Download and Parse

Combine Data Across Multiple Files

I90Sheet

Constructor

Attributes

DataFrame Structure

Examples

Inspect Sheet Properties

Access Raw Data

Frequency Detection

Column Normalization

1. Sequential Integers

2. H-Q Dash Notation

3. NaN-Filler Format

Error Handling

Integration with Archives

Notes

Build docs developers (and LLMs) love

Client

Managers

Models

Processing

CLI

​Overview

​I90 File Format

​Time Frequencies

​Column Header Formats

​I90Book

​Constructor

​Attributes

​Methods

​get_sheet

​__getitem__

​from_archive (classmethod)

​Examples

​Basic Usage

​Batch Download and Parse

​Combine Data Across Multiple Files

​I90Sheet

​Constructor

​Attributes

​DataFrame Structure

​Examples

​Inspect Sheet Properties

​Access Raw Data

​Frequency Detection

​Column Normalization

​1. Sequential Integers

​2. H-Q Dash Notation

​3. NaN-Filler Format

​Error Handling

​Integration with Archives

​Notes

Build docs developers (and LLMs) love

Overview

I90 File Format

Time Frequencies

Column Header Formats

I90Book

Constructor

Attributes

Methods

get_sheet

getitem

from_archive (classmethod)

Examples

Basic Usage

Batch Download and Parse

Combine Data Across Multiple Files

I90Sheet

Constructor

Attributes

DataFrame Structure

Examples

Inspect Sheet Properties

Access Raw Data

Frequency Detection

Column Normalization

1. Sequential Integers

2. H-Q Dash Notation

3. NaN-Filler Format

Error Handling

Integration with Archives

Notes