Data Model Architecture
The NBA Statistics Data Platform organizes data across multiple dimensions:
- Player-level statistics: Individual performance metrics aggregated by season
- Team-level statistics: Team performance and shooting data
- Advanced metrics: Calculated fields including play types, tracking data, and RAPM
- Game-level data: Play-by-play and game logs
Core Identifiers
All datasets are connected through consistent identifier fields:
Unique player identifier used across all player datasets. Also referred to as PLAYER_ID in some files.
Unique team identifier. Also appears as TEAM_ID or TeamId depending on the dataset.
Season year (e.g., 2014, 2015). Regular season and playoff data are stored separately.
Player name as displayed. Used for human-readable references.
Team abbreviation (e.g., “LAL”, “BOS”, “GSW”). May appear as Tm, TEAM, or TEAM_ABBREVIATION.
File Naming Conventions
Regular Season vs Playoffs
Datasets follow a consistent naming pattern to distinguish between regular season and playoff data:
- Regular season: filename.csv (e.g., scoring.csv, hustle.csv)
- Playoffs: filename_ps.csv (e.g., scoring_ps.csv, hustle_ps.csv)
The _ps suffix indicates playoff-specific data. Playoff files contain the same schema as their regular season counterparts but only include postseason games.
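The naming pattern can be applied programmatically. The following is a minimal Python sketch, not part of the platform itself; the helper names playoff_path and is_playoff_file are hypothetical and only illustrate the _ps convention described here.

```python
from pathlib import Path

PLAYOFF_SUFFIX = "_ps"  # suffix the naming convention uses for playoff files


def playoff_path(regular_season_file: str) -> str:
    """Return the playoff counterpart of a regular season file name."""
    path = Path(regular_season_file)
    return str(path.with_name(f"{path.stem}{PLAYOFF_SUFFIX}{path.suffix}"))


def is_playoff_file(filename: str) -> bool:
    """Return True when the file name (without extension) ends in _ps."""
    return Path(filename).stem.endswith(PLAYOFF_SUFFIX)


print(playoff_path("scoring.csv"))       # scoring_ps.csv
print(is_playoff_file("hustle_ps.csv"))  # True
```

Because both files share a schema, the same loading code can be pointed at either name.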
Common Playoff Files
- defense_master_ps.csv
- hustle_ps.csv
- passing_ps.csv
- totals_ps.csv
- tracking_ps.csv
- team_shooting_ps.csv
- shotzone_ps.csv
Data Relationships
Player → Team → Season
Data follows a hierarchical structure running from player to team to season.
Joining Datasets
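As a rough illustration of a player-level join, the sketch below uses only the Python standard library. It rests on several assumptions that are not confirmed by this page: the season column is called SEASON, the stat columns PTS and DEFLECTIONS exist, and TEAM_ID and TEAM_ABBREVIATION are picked as the canonical spellings of the aliases listed under Core Identifiers. The sample rows are made up purely to exercise the code.

```python
import csv
import io

# Assumed canonical names for the alias spellings listed under Core Identifiers.
COLUMN_ALIASES = {
    "TeamId": "TEAM_ID",
    "Tm": "TEAM_ABBREVIATION",
    "TEAM": "TEAM_ABBREVIATION",
}


def read_rows(csv_text):
    """Parse CSV text into row dicts, renaming alias columns."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        {COLUMN_ALIASES.get(col, col): value for col, value in row.items()}
        for row in reader
    ]


def inner_join(left, right, keys):
    """Inner-join two lists of row dicts on the given key columns."""
    index = {}
    for row in right:
        index.setdefault(tuple(row[k] for k in keys), []).append(row)
    joined = []
    for row in left:
        for match in index.get(tuple(row[k] for k in keys), []):
            joined.append({**match, **row})
    return joined


# Made-up rows; SEASON, PTS and DEFLECTIONS are hypothetical column names.
scoring = "PLAYER_ID,SEASON,Tm,PTS\n1,2016,LAL,100\n2,2016,BOS,80\n"
hustle = "PLAYER_ID,SEASON,TEAM,DEFLECTIONS\n1,2016,LAL,12\n"

rows = inner_join(read_rows(scoring), read_rows(hustle), keys=("PLAYER_ID", "SEASON"))
print(rows)
```

Joining on the player identifier together with the season keeps a player's rows from different years from being matched against each other.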
To combine data from multiple sources, use the following join keys:
Player statistics across datasets:
Common Field Patterns
Game Counts
Games played or games started (context-dependent)
Games played (total appearances)
Playing Time
Minutes played (total for season)
Minutes (may be per game or total depending on dataset)
Shooting Percentages
Field goal percentage (0-100 scale)
Effective field goal percentage, adjusted for 3-point value
True shooting percentage, accounts for free throws
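The three percentages can be computed from box-score totals using their standard basketball definitions. The Python sketch below is illustrative only: the function and parameter names are hypothetical, the 0-100 scale is stated on this page for field goal percentage and merely assumed for the other two, and the 0.44 free-throw weight is the conventional approximation rather than something this page specifies.

```python
from typing import Optional


def pct(numerator: float, denominator: float) -> Optional[float]:
    """Return a 0-100 percentage, or None when the denominator is 0."""
    if denominator == 0:
        return None
    return 100.0 * numerator / denominator


def field_goal_pct(fgm: float, fga: float) -> Optional[float]:
    """FG% = FGM / FGA."""
    return pct(fgm, fga)


def effective_fg_pct(fgm: float, fg3m: float, fga: float) -> Optional[float]:
    """eFG% = (FGM + 0.5 * 3PM) / FGA, crediting the extra point of a three."""
    return pct(fgm + 0.5 * fg3m, fga)


def true_shooting_pct(pts: float, fga: float, fta: float) -> Optional[float]:
    """TS% = PTS / (2 * (FGA + 0.44 * FTA)), folding in free throws."""
    return pct(pts, 2 * (fga + 0.44 * fta))


# Illustrative totals: 8 of 16 from the field, 2 threes, 5 free throw attempts, 22 points.
print(field_goal_pct(8, 16))                   # 50.0
print(effective_fg_pct(8, 2, 16))              # 56.25
print(round(true_shooting_pct(22, 16, 5), 1))  # 60.4
print(field_goal_pct(0, 0))                    # None
```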
Data Quality Notes
Missing Values
- Players with insufficient minutes may have null or 0 values for advanced metrics
- Percentage fields may be null when the denominator is 0 (e.g., no attempts)
- Some tracking data is only available from the 2013-14 season onward
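When reading these files, missing values are best kept distinct from real zeros. The Python sketch below shows one way to do that; how nulls are actually encoded in the CSV files is not stated here, so the token set and the helper names parse_metric and mean_ignoring_missing are assumptions.

```python
from typing import Iterable, Optional

# Assumed spellings of a missing value; adjust to match the actual files.
NULL_TOKENS = {"", "null", "nan", "none"}


def parse_metric(raw: Optional[str]) -> Optional[float]:
    """Convert a raw CSV cell to float, mapping missing markers to None."""
    if raw is None or raw.strip().lower() in NULL_TOKENS:
        return None
    return float(raw)


def mean_ignoring_missing(values: Iterable[Optional[float]]) -> Optional[float]:
    """Average only the values that are present; None if there are none."""
    present = [v for v in values if v is not None]
    if not present:
        return None
    return sum(present) / len(present)


cells = ["50.0", "null", "", "40.0"]
parsed = [parse_metric(cell) for cell in cells]
print(parsed)                         # [50.0, None, None, 40.0]
print(mean_ignoring_missing(parsed))  # 45.0
```

A 0 in an advanced metric cannot be told apart from a genuine zero by parsing alone, so filtering on minutes played may still be needed.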
Historical Coverage
Data availability varies by metric and year:
- Basic stats: Available from 1974+ (scoring, totals)
- Shooting zones: Available from 2014+
- Tracking data: Available from 2014+
- Hustle stats: Available from 2016+
- RAPM: Calculated for seasons with play-by-play data
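The coverage list can be encoded as a small lookup so that queries for unavailable seasons fail early. This Python sketch copies the starting years from the list; the category keys and the is_available helper are hypothetical, and RAPM is left out because its coverage depends on play-by-play availability rather than a fixed year.

```python
# First season with data for each category, taken from the coverage list.
FIRST_SEASON = {
    "basic": 1974,
    "shooting_zones": 2014,
    "tracking": 2014,
    "hustle": 2016,
}


def is_available(category: str, season: int) -> bool:
    """Return True when the category has data for the given season year."""
    if category not in FIRST_SEASON:
        raise KeyError(f"unknown category: {category}")
    return season >= FIRST_SEASON[category]


print(is_available("hustle", 2015))    # False
print(is_available("tracking", 2014))  # True
```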
Next Steps
- Player Stats: Explore player-level data schemas
- Team Stats: Review team-level datasets
- Advanced Metrics: Dive into calculated metrics and play types