Changelog & Updates
The RaceData Formula 1 dataset is automatically updated to ensure you always have access to the latest race information.
Automated Update System
Update Frequency
The dataset is automatically updated within 3 hours of the completion of each Formula 1 race. Each update run:
- Downloads the latest data from Kaggle sources
- Checks for any changes compared to the existing data
- If changes are detected:
  - Commits the updated data to the repository
  - Creates a new GitHub release with a downloadable data.zip
  - Uploads to HuggingFace Datasets
How It Works
The automated update process uses GitHub Actions to:
- Download - Fetches the latest datasets from Kaggle
- Detect Changes - Compares new data with existing data
- Commit & Push - If changes exist, commits the updated data and pushes it to the repository
- Create Release - Generates a new GitHub release:
  - Tag format: data-YYYY-MM-DD
  - Includes data.zip with all tables
  - Automated release notes
- Sync HuggingFace - Updates the HuggingFace Datasets mirror
If no changes are detected (data already current), no release is created.
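The change-detection step above can be sketched in Python. This is a minimal illustration, not the workflow's actual code: the function names (`file_digests`, `changed_files`) and the use of SHA-256 content hashes are assumptions made for the example.

```python
import hashlib
from pathlib import Path


def file_digests(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to the SHA-256 of its contents."""
    return {
        path.relative_to(root).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def changed_files(existing: Path, fresh: Path) -> list[str]:
    """Return files that were added, removed, or modified between two directories."""
    old, new = file_digests(existing), file_digests(fresh)
    # A file counts as changed when its digest differs or it exists on one side only.
    return sorted(name for name in old.keys() | new.keys() if old.get(name) != new.get(name))
```

An empty list from `changed_files` corresponds to the case where the data is already current and no release is created.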
Accessing Latest Data
You can get the most recent data from multiple sources:
GitHub Releases
Download the latest data.zip directly:
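One way to script this download in Python is sketched below. It assumes the release asset is named data.zip, as this page describes, and relies on GitHub's `releases/latest/download/<asset>` redirect. The repository path is taken from the releases feed URL on this page; the function names are illustrative.

```python
import urllib.request
import zipfile
from pathlib import Path
from typing import Optional

REPO = "TracingInsights/RaceData"  # taken from the releases feed URL on this page
ASSET = "data.zip"                 # asset name as described on this page


def latest_asset_url(repo: str = REPO, asset: str = ASSET) -> str:
    """URL that GitHub redirects to the named asset of the latest release."""
    return f"https://github.com/{repo}/releases/latest/download/{asset}"


def download_and_extract(dest: Path, url: Optional[str] = None) -> list[str]:
    """Download the archive into dest, unpack it there, and return the member names."""
    dest.mkdir(parents=True, exist_ok=True)
    archive = dest / ASSET
    urllib.request.urlretrieve(url or latest_asset_url(), archive)
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(dest)
        return zf.namelist()
```

Calling `download_and_extract(Path("racedata"))` would place the archive and its extracted tables in a local `racedata` directory (a hypothetical name).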
HuggingFace Datasets
Load directly into Python:
Direct Git Clone
Checking for Updates
Via GitHub API
Check the latest release programmatically:
Via Git
If you’ve cloned the repository:
Release Notifications
Stay informed about updates:
- Watch the repository on GitHub (select “Releases only”)
- Subscribe to releases via RSS: https://github.com/TracingInsights/RaceData/releases.atom
- Follow TracingInsights on social media
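If you would rather poll the release feed from a script than use a feed reader, a minimal sketch with the Python standard library could look like this. It assumes the feed is standard Atom (entries with `title`, `updated`, and `link` elements); the function names are illustrative.

```python
import urllib.request
import xml.etree.ElementTree as ET

FEED_URL = "https://github.com/TracingInsights/RaceData/releases.atom"
ATOM = {"atom": "http://www.w3.org/2005/Atom"}  # Atom XML namespace


def parse_release_feed(xml_data: bytes) -> list[dict[str, str]]:
    """Extract title, update time, and link from each entry of an Atom feed."""
    root = ET.fromstring(xml_data)
    releases = []
    for entry in root.findall("atom:entry", ATOM):
        link = entry.find("atom:link", ATOM)
        releases.append({
            "title": entry.findtext("atom:title", default="", namespaces=ATOM),
            "updated": entry.findtext("atom:updated", default="", namespaces=ATOM),
            "url": link.get("href", "") if link is not None else "",
        })
    return releases


def fetch_release_feed(url: str = FEED_URL) -> list[dict[str, str]]:
    """Download the feed and parse it (requires network access)."""
    with urllib.request.urlopen(url) as response:
        return parse_release_feed(response.read())
```

Comparing the first entry's `updated` value with the one seen on the previous poll is enough to tell whether a new release has appeared.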
Update History
Each release includes:
- Release date and time (UTC)
- Dataset version tag (e.g., data-2026-03-12)
- Source information - Which Kaggle datasets were updated
- Download link - Direct link to data.zip
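Most of these fields can also be read programmatically from GitHub's REST API, which doubles as a way to check for the latest release. The sketch below assumes the standard "latest release" endpoint for the repository named in the feed URL and an asset called data.zip. The function names are illustrative, and the source information is not extracted here.

```python
import json
import urllib.request

API_URL = "https://api.github.com/repos/TracingInsights/RaceData/releases/latest"


def summarize_release(release: dict, asset_name: str = "data.zip") -> dict:
    """Pick the tag, publish time, and asset download link out of a release payload."""
    download_url = next(
        (
            asset["browser_download_url"]
            for asset in release.get("assets", [])
            if asset.get("name") == asset_name
        ),
        None,  # no asset with that name attached to the release
    )
    return {
        "tag": release.get("tag_name"),
        "published_at": release.get("published_at"),  # ISO 8601 timestamp in UTC
        "download_url": download_url,
    }


def fetch_latest_release(url: str = API_URL) -> dict:
    """Query the GitHub API for the latest release (requires network access)."""
    request = urllib.request.Request(url, headers={"Accept": "application/vnd.github+json"})
    with urllib.request.urlopen(request) as response:
        return summarize_release(json.load(response))
```

Unauthenticated API requests are rate-limited by GitHub, so a script that polls frequently may need to send a token.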
Data Version Control
All changes to the dataset are tracked in Git, allowing you to:
- View change history - See what data changed and when
- Access historical versions - Check out previous releases
- Track data evolution - Understand how the dataset has grown
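Checking out a previous release can be sketched as follows, assuming releases are tagged data-YYYY-MM-DD (the tag format given earlier) and that you already have a local clone. The helper names are illustrative, and a release only exists for dates on which the data actually changed.

```python
import subprocess
from datetime import date
from pathlib import Path


def tag_for_date(day: date) -> str:
    """Build a release tag in the data-YYYY-MM-DD format."""
    return f"data-{day.isoformat()}"


def checkout_commands(day: date) -> list[list[str]]:
    """git commands that fetch tags and check out the release for the given day."""
    return [
        ["git", "fetch", "--tags"],
        ["git", "checkout", tag_for_date(day)],
    ]


def checkout_release(repo_dir: Path, day: date) -> None:
    """Run the commands inside an existing clone; raises if git reports an error."""
    for command in checkout_commands(day):
        subprocess.run(command, cwd=repo_dir, check=True)
```

Checking out a tag leaves the clone in a detached HEAD state; switching back to the default branch returns to the latest data.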
Accessing Historical Data
To get data from a specific date:
Manual Updates
The GitHub Actions workflow can also be triggered manually if needed.
- GitHub UI: Actions → Update F1 Datasets → Run workflow
- GitHub CLI: gh workflow run update-datasets.yml
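For completeness, the same trigger can be sent through GitHub's REST API (the workflow dispatch endpoint). The sketch below only builds the request. It assumes the workflow file is update-datasets.yml as in the CLI command above and that the default branch is `main`, which is an assumption. It also needs a token with write access to the repository, so it applies to maintainers or to your own fork.

```python
import json
import urllib.request

DISPATCH_URL = (
    "https://api.github.com/repos/TracingInsights/RaceData"
    "/actions/workflows/update-datasets.yml/dispatches"
)


def build_dispatch_request(token: str, ref: str = "main") -> urllib.request.Request:
    """Build the POST request that asks GitHub to run the workflow on the given ref."""
    return urllib.request.Request(
        DISPATCH_URL,
        data=json.dumps({"ref": ref}).encode("utf-8"),  # "main" is an assumed branch name
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` sends the request; an HTTP error is raised if the token lacks permission or the ref does not exist.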
Data Source Updates
Our dataset is only as current as our upstream sources:
- Updates depend on Kaggle dataset maintainers updating their datasets
- Typically updated within hours of race completion
- Historical data corrections propagate through automated updates
Questions About Updates?
If you notice:
- Missing recent race data (>3 hours after race completion)
- Update failures or errors
- Stale data that should have been updated
