Access the dataset directly through the HuggingFace Datasets library without manual downloads. The dataset is automatically synchronized with the latest race data.
Browse the dataset on HuggingFace Hub: tracinginsights/RaceData
Easily convert HuggingFace datasets to pandas DataFrames for analysis:
    from datasets import load_dataset
    import pandas as pd

    # Load a specific table
    dataset = load_dataset(
        "tracinginsights/RaceData",
        data_files="lap_times.csv",
        split="train",
    )

    # Convert to pandas DataFrame
    lap_times_df = dataset.to_pandas()

    # Now use standard pandas operations
    print(lap_times_df.head())
    print(lap_times_df.describe())
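Once the table is a DataFrame, a typical next step is a per-group summary. The sketch below uses a small hand-built stand-in for `lap_times_df` so that it runs offline; the column names `driver`, `lap`, and `milliseconds` are assumptions for illustration and may not match the real `lap_times.csv` schema.

```python
import pandas as pd

# Stand-in for lap_times_df; these column names are hypothetical.
lap_times_df = pd.DataFrame(
    {
        "driver": ["A", "A", "B", "B"],
        "lap": [1, 2, 1, 2],
        "milliseconds": [91000, 90500, 92000, 90000],
    }
)

# Fastest and mean lap time per driver, converted to seconds.
summary = (
    lap_times_df.assign(seconds=lap_times_df["milliseconds"] / 1000)
    .groupby("driver")["seconds"]
    .agg(fastest="min", mean="mean")
)
print(summary)
```

Check `lap_times_df.columns` on the real table first and substitute the actual column names.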
HuggingFace Datasets uses Apache Arrow under the hood for efficient data handling:
    from datasets import load_dataset

    dataset = load_dataset(
        "tracinginsights/RaceData",
        data_files="results.csv",
        split="train",
    )

    # dataset.data is a wrapper from datasets.table;
    # the underlying Arrow table is its .table attribute
    arrow_table = dataset.data.table
    print(type(arrow_table))  # pyarrow.Table
Convert to Polars DataFrame for high-performance data manipulation:
    from datasets import load_dataset
    import polars as pl

    dataset = load_dataset(
        "tracinginsights/RaceData",
        data_files="lap_times.csv",
        split="train",
    )

    # Convert to Polars from the underlying Arrow table
    # (pl.from_arrow expects Arrow data, not a pandas DataFrame)
    df = pl.from_arrow(dataset.data.table)
    print(df.head())
Query the data directly with DuckDB:
    from datasets import load_dataset
    import duckdb

    dataset = load_dataset(
        "tracinginsights/RaceData",
        data_files="races.csv",
        split="train",
    )

    # Convert to pandas then query with DuckDB
    df = dataset.to_pandas()
    result = duckdb.query("SELECT * FROM df WHERE year >= 2020").df()
    print(result)