The qualifying.csv table contains qualifying session results for each Grand Prix, including times from all three qualifying segments (Q1, Q2, Q3).

Schema

| Field | Type | Description |
|---|---|---|
| qualifyId | integer | Unique identifier for each qualifying result |
| raceId | integer | Foreign key to races.csv |
| driverId | integer | Foreign key to drivers.csv |
| constructorId | integer | Foreign key to constructors.csv |
| number | integer | Driver’s car number |
| position | integer | Final qualifying position (determines grid position before any penalties) |
| q1 | string | Q1 lap time (M:SS.mmm format, e.g. 1:26.572) |
| q2 | string | Q2 lap time (M:SS.mmm format) |
| q3 | string | Q3 lap time (M:SS.mmm format) |

Qualifying times use \N for null values. With a 20-car grid:
  • q2 and q3 are \N for drivers eliminated in Q1 (positions 16-20)
  • q3 is \N for drivers eliminated in Q2 (positions 11-15)
  • All qualifying data may be missing for some historical races
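
As a minimal sketch of one way to handle the \N marker, pandas can be told to read it as a missing value at load time. The first row below is taken from the sample data on this page; the second row (including its IDs and times) is hypothetical and only illustrates a driver eliminated in Q1.

```python
import io
import pandas as pd

# Row 1 comes from this page's sample data. Row 2 is hypothetical and
# stands in for a driver eliminated in Q1.
csv_text = (
    "qualifyId,raceId,driverId,constructorId,number,position,q1,q2,q3\n"
    "1,18,1,1,22,1,1:26.572,1:25.187,1:26.714\n"
    "9001,18,9001,9001,99,16,1:27.900,\\N,\\N\n"
)

# na_values makes pandas parse the literal \N marker as NaN.
qualifying = pd.read_csv(io.StringIO(csv_text), na_values=["\\N"])

print(qualifying[["position", "q1", "q2", "q3"]])
print(qualifying["q2"].isna().tolist())
```

If the file is loaded this way, filters should use `.isna()` / `.notna()` rather than comparing against the string '\\N'.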

Sample Data

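| qualifyId | raceId | driverId | constructorId | number | position | q1 | q2 | q3 |
|---|---|---|---|---|---|---|---|---|
| 1 | 18 | 1 | 1 | 22 | 1 | 1:26.572 | 1:25.187 | 1:26.714 |
| 2 | 18 | 9 | 2 | 4 | 2 | 1:26.103 | 1:25.315 | 1:26.869 |
| 3 | 18 | 5 | 1 | 23 | 3 | 1:25.664 | 1:25.452 | 1:27.079 |
| 4 | 18 | 13 | 6 | 2 | 4 | 1:25.994 | 1:25.691 | 1:27.178 |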

Relationships

References:
  • qualifying.raceId → races.raceId
  • qualifying.driverId → drivers.driverId
  • qualifying.constructorId → constructors.constructorId
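
One way to sanity-check these references is to look for qualifying rows whose key has no match in the parent table. The sketch below uses tiny stand-in DataFrames with made-up IDs (raceId 999 is deliberately unmatched); with the real files, the frames would come from pd.read_csv instead.

```python
import pandas as pd

# Hypothetical stand-ins for qualifying.csv and races.csv.
qualifying = pd.DataFrame({"qualifyId": [1, 2, 3], "raceId": [18, 18, 999]})
races = pd.DataFrame({"raceId": [18, 19]})

# Rows whose raceId does not appear in races are orphaned references.
orphans = qualifying[~qualifying["raceId"].isin(races["raceId"])]
print(orphans)
```

The same pattern applies to driverId against drivers.csv and constructorId against constructors.csv.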

Dataset Statistics

  • Total Records: 10,992 qualifying results
  • Date Range: Varies by race (most complete from 2003 onwards)
  • Drivers per Session: ~20

Qualifying Format

The current three-segment knockout qualifying format:
  1. Q1 (18 minutes): All drivers participate. Slowest 5 are eliminated (positions 16-20).
  2. Q2 (15 minutes): Remaining 15 drivers compete. Slowest 5 are eliminated (positions 11-15).
  3. Q3 (12 minutes): Top 10 drivers compete for pole position.
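
The knockout structure can be recovered from the data by checking which segment times are present. The following is a sketch only: `last_segment` is a hypothetical helper, the first row reuses this page's sample data, and the other two rows are made up to represent a Q2 and a Q1 elimination.

```python
import pandas as pd

def last_segment(row):
    """Return the last segment (Q3, Q2, or Q1) in which a time was set."""
    for segment in ("q3", "q2", "q1"):
        value = row[segment]
        # Treat both NaN and the literal \N marker as "no time set".
        if pd.notna(value) and value != "\\N":
            return segment.upper()
    return None

# Row 1 is from the sample data; rows 2 and 3 are hypothetical.
sample = pd.DataFrame({
    "position": [1, 12, 18],
    "q1": ["1:26.572", "1:26.900", "1:27.900"],
    "q2": ["1:25.187", "1:26.100", "\\N"],
    "q3": ["1:26.714", "\\N", "\\N"],
})

sample["reached"] = sample.apply(last_segment, axis=1)
print(sample[["position", "reached"]])
```

Inferring the segment from the times rather than from position avoids hard-coding the 16-20 and 11-15 cut-offs, which assume a 20-car grid.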

Example Queries

Get pole positions for a driver

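import pandas as pd

qualifying = pd.read_csv('qualifying.csv')

# Pole positions for one driver (driverId 1 is just an example value)
driver_id = 1
pole_positions = qualifying[
    (qualifying['position'] == 1) & (qualifying['driverId'] == driver_id)
]
print(pole_positions[['raceId', 'driverId', 'q3']])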

Find Q1 eliminations

q1_eliminated = qualifying[
    (qualifying['position'] >= 16) & 
    (qualifying['q2'] == '\\N')
]
print(q1_eliminated[['raceId', 'driverId', 'position', 'q1']])

Analyze qualifying progression

# Top 10 who made it to Q3
q3_participants = qualifying[qualifying['q3'] != '\\N']
print(q3_participants[['raceId', 'driverId', 'q1', 'q2', 'q3', 'position']])

Compare qualifying times

# For a specific race
race_quali = qualifying[qualifying['raceId'] == 1000]
race_quali_sorted = race_quali.sort_values('position')
print(race_quali_sorted[['position', 'driverId', 'q1', 'q2', 'q3']])

Calculate qualifying gaps

import re

def time_to_seconds(time_str):
    if pd.isna(time_str) or time_str == '\\N':
        return None
    parts = re.split(r'[:.]', time_str)  # raw string avoids an invalid-escape warning
    return int(parts[0]) * 60 + int(parts[1]) + int(parts[2]) / 1000

# Get pole time vs P2 gap
race_quali = qualifying[qualifying['raceId'] == 1000].copy()
race_quali['q3_seconds'] = race_quali['q3'].apply(time_to_seconds)
pole_time = race_quali[race_quali['position'] == 1]['q3_seconds'].iloc[0]
p2_time = race_quali[race_quali['position'] == 2]['q3_seconds'].iloc[0]
gap = p2_time - pole_time
print(f"Gap to pole: {gap:.3f} seconds")
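
The row-by-row conversion above could also be written as a vectorized operation on a whole column. The sketch below is one possible approach, not part of the dataset's tooling; the two times are Q1 values from this page's sample data, and the \N entry shows how a missing time is carried through as NaN.

```python
import pandas as pd

# Two Q1 times from the sample data plus a \N marker.
times = pd.Series(["1:26.572", "1:26.103", "\\N"])

# Capture minutes, seconds, and milliseconds; non-matching values become NaN.
parts = times.str.extract(r"^(\d+):(\d{2})\.(\d{3})$")
seconds = (
    pd.to_numeric(parts[0]) * 60
    + pd.to_numeric(parts[1])
    + pd.to_numeric(parts[2]) / 1000
)

# Gap of each time to the fastest one (NaN entries are skipped by min()).
gaps = seconds - seconds.min()
print(gaps.round(3).tolist())
```

The regular expression assumes the M:SS.mmm layout described on this page; any value in a different layout would be treated as missing rather than raising an error.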

Join with race and driver data

races = pd.read_csv('races.csv')
drivers = pd.read_csv('drivers.csv')

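# 'year' and 'name' (the Grand Prix name) come from races.csv;
# 'forename' and 'surname' come from drivers.csv
full_quali = qualifying.merge(races, on='raceId') \
                       .merge(drivers, on='driverId')

print(full_quali[['year', 'name', 'forename', 'surname', 'position', 'q3']])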

Notes

  • Times are in M:SS.mmm format (minutes:seconds.milliseconds), e.g. 1:26.572
  • \N indicates the driver didn’t participate in that qualifying segment
  • The three-segment knockout format was introduced in 2006
  • Earlier qualifying formats may not have Q1/Q2/Q3 data
  • Grid penalties applied after qualifying are not reflected in these results
  • Weather conditions can significantly affect qualifying times
  • Some races may have used different qualifying formats (e.g., sprint qualifying)
