Introduction

Welcome to the NBA Statistics Data Platform

This data platform aggregates NBA statistics from three sources (NBA.com, Basketball Reference, and pbpstats), covering the 2014-15 through 2024-25 NBA seasons. Built with Python, it combines web scraping, API integration, and automated data pipelines to deliver consistent, structured datasets for basketball analytics.

Quick Start Guide

Get up and running in minutes with installation and your first data collection

Data Sources

Explore the three major data sources: NBA.com, Basketball Reference, and pbpstats

API Reference

Browse the complete collection of data extraction scripts and utilities

Data Schema

Understand the structure of output datasets and available metrics

Key Features

Multi-Source Data Aggregation

The platform combines data from three complementary sources:

NBA.com Stats API: Official league statistics including tracking data, shooting metrics, and advanced analytics
Basketball Reference: Historical totals, per-possession stats, and player identification
pbpstats.com API: Play-by-play derived metrics, on/off data, and advanced possession statistics

Each source provides unique data points that are merged using player and team identifiers to create comprehensive datasets.
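A minimal sketch of an identifier-based merge with pandas follows. The column names (`player_id`, `season`, `drives`, `on_court_net_rating`) and the sample rows are assumptions made for illustration and are not taken from the platform's schema.

```python
import pandas as pd

# Hypothetical frames standing in for two sources; all column names are assumptions.
nba_tracking = pd.DataFrame({
    "player_id": [101, 102, 103],
    "season": [2024, 2024, 2024],
    "drives": [12.4, 8.1, 3.3],
})
pbp_on_off = pd.DataFrame({
    "player_id": [101, 103, 104],
    "season": [2024, 2024, 2024],
    "on_court_net_rating": [5.2, -1.7, 0.4],
})

# An outer join keeps players that appear in only one source;
# indicator=True adds a "_merge" column recording where each row came from.
merged = nba_tracking.merge(
    pbp_on_off,
    on=["player_id", "season"],
    how="outer",
    indicator=True,
)
```

Inspecting the `_merge` column is a cheap way to spot players whose identifiers fail to line up across sources before the gaps reach a master file.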

Historical Coverage (2014-2025)

Access 11+ seasons of NBA data organized by:

Regular Season: Complete data for all regular season games
Playoffs: Separate datasets for postseason performance
Year-over-year structure: Data organized in year-based directories (2014/, 2015/, …, 2025/)
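NBA seasons span two calendar years, so scripts usually need a helper that turns a starting year into a label such as 2014-15. The sketch below is one way to do that; `season_label` is a hypothetical helper, and how the platform maps these labels onto its year directories is not specified here.

```python
def season_label(start_year):
    """Format a season by its starting year, e.g. 2014 -> '2014-15'."""
    return f"{start_year}-{(start_year + 1) % 100:02d}"


# Seasons 2014-15 through 2024-25, keyed by starting year.
seasons = [season_label(year) for year in range(2014, 2025)]
```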

Comprehensive Statistical Categories

Player Tracking & Movement

Speed and distance metrics
Touches and time of possession
Drives and paint touches
Hustle statistics (deflections, loose balls, screen assists)

Shooting & Scoring

Shot quality by defender distance (Very Tight, Tight, Open, Wide Open)
Dribble-based shooting splits (0, 1, 2, 3-6, 7+ dribbles)
Catch & shoot vs. pull-up shooting
Shot zone frequency and efficiency
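As an illustration of how frequency and efficiency can be derived from shooting splits, the sketch below computes shot frequency and effective field goal percentage per defender-distance bucket. The column names and the sample counts are made up for the example and do not come from the platform's data.

```python
import pandas as pd

# Hypothetical shot counts per defender-distance bucket (illustrative numbers only).
shots = pd.DataFrame({
    "defender_distance": ["Very Tight", "Tight", "Open", "Wide Open"],
    "fgm": [10, 40, 60, 50],    # field goals made
    "fg3m": [1, 5, 20, 30],     # three-pointers made
    "fga": [30, 100, 130, 100], # field goal attempts
})

# Frequency: share of all attempts taken in each bucket.
shots["frequency"] = shots["fga"] / shots["fga"].sum()

# Effective FG%: (FGM + 0.5 * 3PM) / FGA, which credits the extra point of a three.
shots["efg_pct"] = (shots["fgm"] + 0.5 * shots["fg3m"]) / shots["fga"]
```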

Passing & Playmaking

Assists by type (3PT, at-rim, mid-range)
Potential assists and assist conversion rates
Secondary assists
Bad pass turnovers and steals
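A conversion rate of this kind can be sketched as assists divided by potential assists. That definition, the column names, and the sample values are assumptions; the platform's own calculation may differ.

```python
import pandas as pd

# Hypothetical passing rows; column names are assumptions.
passing = pd.DataFrame({
    "player": ["A", "B", "C"],
    "assists": [8, 3, 0],
    "potential_assists": [16, 5, 0],
})

# Mask zero denominators so the rate is reported as missing instead of dividing by zero.
denominator = passing["potential_assists"].where(passing["potential_assists"] > 0)
passing["assist_conversion"] = passing["assists"] / denominator
```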

Defense

Opponent field goal percentage by distance
Rim protection metrics (frequency, accuracy)
Defensive matchup data
Team defensive tracking

Contract & Salary Data

Player salaries by season
Team cap holds and dead money
Contract options (player/team)
Cap space analysis

Automated Data Pipeline

The platform includes Python scripts that:

Make authenticated API requests with proper headers
Respect rate limits with built-in delays
Handle pagination and multi-year data collection
Automatically merge new data with historical datasets
Generate unified master CSV files for easy analysis
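The steps above could be sketched roughly as follows. This is not the platform's actual code: the header values, the one-second delay, the `fetch_rows` callable, and the key columns are all assumptions, and pagination is left out for brevity.

```python
import time

import pandas as pd
import requests

# Assumed, browser-like headers; the platform's scripts may send a different set.
DEFAULT_HEADERS = {"User-Agent": "Mozilla/5.0", "Accept": "application/json"}


def fetch_json(session, url, params, timeout=30):
    """GET a JSON payload with the headers above, raising on HTTP errors."""
    response = session.get(url, params=params, headers=DEFAULT_HEADERS, timeout=timeout)
    response.raise_for_status()
    return response.json()


def collect_seasons(seasons, fetch_rows, delay_seconds=1.0, sleep=time.sleep):
    """Call fetch_rows(season) once per season, pausing between calls."""
    frames = []
    for i, season in enumerate(seasons):
        if i > 0:
            sleep(delay_seconds)  # fixed delay between consecutive requests
        frame = pd.DataFrame(fetch_rows(season))
        frame["season"] = season
        frames.append(frame)
    return pd.concat(frames, ignore_index=True) if frames else pd.DataFrame()


def update_master(master, new_rows, keys):
    """Append new_rows to master, keeping the most recent row per key."""
    combined = pd.concat([master, new_rows], ignore_index=True)
    return combined.drop_duplicates(subset=keys, keep="last").reset_index(drop=True)
```

In practice `fetch_rows` would wrap `fetch_json` with a `requests.Session()` and an endpoint URL. Passing it in as a callable keeps the delay and merge logic testable without network access.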

Data Output Format

All data is stored as CSV files with consistent schemas:

workspace/source/
├── 2014/                    # Season directories
│   ├── defense/
│   ├── player_shooting/
│   ├── tracking/
│   └── playoffs/
├── 2015/
├── ...
├── 2025/
├── hustle.csv              # Master files (all seasons)
├── passing.csv
├── defense_master.csv
├── player_shooting.csv
└── tracking.csv

Master CSV files contain consolidated data from all seasons, while year-specific directories contain individual season data.
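To make the relationship between the two layouts concrete, here is a hedged sketch that rebuilds a master table from per-season files. `build_master`, the file name `tracking/speed.csv`, the `season_dir` column, and the sample columns are hypothetical; only the year-directory layout comes from the tree above.

```python
import tempfile
from pathlib import Path

import pandas as pd


def build_master(root, relative_csv):
    """Concatenate <root>/<year>/<relative_csv> across all year-named directories."""
    frames = []
    for year_dir in sorted(Path(root).iterdir()):
        csv_path = year_dir / relative_csv
        if year_dir.is_dir() and year_dir.name.isdigit() and csv_path.is_file():
            frame = pd.read_csv(csv_path)
            frame["season_dir"] = int(year_dir.name)  # remember which directory the row came from
            frames.append(frame)
    return pd.concat(frames, ignore_index=True) if frames else pd.DataFrame()


# Demo on a throwaway directory so nothing under workspace/source/ is touched.
with tempfile.TemporaryDirectory() as tmp:
    for year, value in [(2014, 1.0), (2015, 2.0)]:
        target = Path(tmp) / str(year) / "tracking"
        target.mkdir(parents=True)
        pd.DataFrame({"player_id": [7], "dist_miles": [value]}).to_csv(
            target / "speed.csv", index=False
        )
    master = build_master(tmp, "tracking/speed.csv")
```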

Technology Stack

The platform is built with:

Python 3.x: Core scripting language
pandas 1.5.3: Data manipulation and CSV operations
requests 2.32.3: HTTP requests to APIs
BeautifulSoup4 4.12.3: HTML parsing for web scraping
nba_api 1.6.1: Community-maintained (unofficial) Python client for the NBA.com stats endpoints
plotly 5.23.0: Data visualization (optional)
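One way to confirm that an environment matches these versions is to compare installed distributions against the pins. The sketch below uses the standard library's `importlib.metadata`; the `check_versions` helper is hypothetical, and the distribution names are assumed to match the package names listed above.

```python
from importlib import metadata

# Versions as listed above; distribution names are assumed to match PyPI.
PINNED = {
    "pandas": "1.5.3",
    "requests": "2.32.3",
    "beautifulsoup4": "4.12.3",
    "nba_api": "1.6.1",
    "plotly": "5.23.0",
}


def check_versions(pinned, get_version=metadata.version):
    """Map each package to (installed_version_or_None, matches_pin)."""
    report = {}
    for name, wanted in pinned.items():
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[name] = (installed, installed == wanted)
    return report
```

Calling `check_versions(PINNED)` returns a dictionary that can be printed or asserted on at the start of a collection run.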

Use Cases

This data platform supports:

Player evaluation: Compare players across multiple statistical dimensions
Team analytics: Analyze team offensive and defensive tendencies
Historical trends: Track how the game has evolved from 2014 to 2025
Predictive modeling: Build models using comprehensive feature sets
Scouting reports: Generate detailed performance profiles
Contract analysis: Evaluate player value relative to salary

Getting Started

Ready to dive in? Head over to the Quick Start Guide to:

Set up your Python environment
Run your first data collection script
Query and analyze the resulting CSV data

View on GitHub

Explore the source code and contribute to the project
