Welcome to the NBA Statistics Data Platform
This data platform aggregates comprehensive NBA statistics from multiple authoritative sources, providing historical and current season data covering the 2014-15 through 2024-25 NBA seasons. Built with Python, the platform combines web scraping, API integration, and automated data pipelines to deliver consistent, structured datasets for advanced basketball analytics.
Quick Start Guide
Get up and running in minutes with installation and your first data collection
Data Sources
Explore the three major data sources: NBA.com, Basketball Reference, and pbpstats
API Reference
Browse the complete collection of data extraction scripts and utilities
Data Schema
Understand the structure of output datasets and available metrics
Key Features
Multi-Source Data Aggregation
The platform intelligently combines data from three complementary sources:
- NBA.com Stats API: Official league statistics including tracking data, shooting metrics, and advanced analytics
- Basketball Reference: Historical totals, per-possession stats, and player identification
- pbpstats.com API: Play-by-play derived metrics, on/off data, and advanced possession statistics
Historical Coverage (2014-2025)
Access 11+ seasons of NBA data organized by:
- Regular Season: Complete data for all regular season games
- Playoffs: Separate datasets for postseason performance
- Year-over-year structure: Data organized in year-based directories (2014/, 2015/, …, 2025/)
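The year-based layout above can be walked with a few lines of Python. This is a minimal sketch, not the platform's own code: the root directory name `data` and the helper names `year_directories` and `existing_csvs` are assumptions made for illustration, and the page does not state how a directory year maps to a season label, so the sketch leaves that mapping out.

```python
from pathlib import Path

# Directory range taken from the layout described above (2014/ ... 2025/).
FIRST_YEAR = 2014
LAST_YEAR = 2025


def year_directories(root, first=FIRST_YEAR, last=LAST_YEAR):
    """Return the expected year-based directories under root (hypothetical helper)."""
    root = Path(root)
    return [root / str(year) for year in range(first, last + 1)]


def existing_csvs(root):
    """Map each year directory that exists on disk to its sorted CSV files."""
    found = {}
    for directory in year_directories(root):
        if directory.is_dir():
            found[directory.name] = sorted(directory.glob("*.csv"))
    return found


if __name__ == "__main__":
    # "data" is an assumed root; point this at wherever the year folders live.
    for year, files in existing_csvs("data").items():
        print(year, len(files), "CSV files")
```

Directories that are missing are skipped rather than treated as errors, which keeps the sketch usable on a partial download.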
Comprehensive Statistical Categories
Player Tracking & Movement
- Speed and distance metrics
- Touches and time of possession
- Drives and paint touches
- Hustle statistics (deflections, loose balls, screen assists)
Shooting & Scoring
- Shot quality by defender distance (Very Tight, Tight, Open, Wide Open)
- Dribble-based shooting splits (0, 1, 2, 3-6, 7+ dribbles)
- Catch & shoot vs. pull-up shooting
- Shot zone frequency and efficiency
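To make the dribble-based splits concrete, here is a small sketch that assigns a dribble count to one of the five ranges listed above (0, 1, 2, 3-6, 7+). The function name `dribble_bucket` and the exact label strings are assumptions for illustration; the labels used in the actual datasets may differ.

```python
def dribble_bucket(dribbles: int) -> str:
    """Return the split label for a number of dribbles taken before a shot.

    Labels follow the ranges listed above; the strings themselves are illustrative.
    """
    if dribbles < 0:
        raise ValueError("dribble count cannot be negative")
    if dribbles <= 2:
        # 0, 1, and 2 dribbles each have their own split.
        return str(dribbles)
    if dribbles <= 6:
        return "3-6"
    return "7+"


if __name__ == "__main__":
    for count in (0, 1, 2, 3, 6, 7, 12):
        print(count, "->", dribble_bucket(count))
```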
Passing & Playmaking
- Assists by type (3PT, at-rim, mid-range)
- Potential assists and assist conversion rates
- Secondary assists
- Bad pass turnovers and steals
Defense
- Opponent field goal percentage by distance
- Rim protection metrics (frequency, accuracy)
- Defensive matchup data
- Team defensive tracking
Contract & Salary Data
- Player salaries by season
- Team cap holds and dead money
- Contract options (player/team)
- Cap space analysis
Automated Data Pipeline
The platform includes Python scripts that:
- Make authenticated API requests with proper headers
- Respect rate limits with built-in delays
- Handle pagination and multi-year data collection
- Automatically merge new data with historical datasets
- Generate unified master CSV files for easy analysis
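The rate-limiting and multi-year steps in the list above can be sketched as a loop that pauses between requests. This is an illustrative sketch under stated assumptions, not the platform's implementation: `collect_seasons` is a hypothetical helper, `fetch` stands in for whatever function performs the real HTTP request (with its headers), and the one-second default delay is an arbitrary placeholder.

```python
import time


def collect_seasons(fetch, seasons, delay_seconds=1.0, sleep=time.sleep):
    """Call fetch(season) for each season, pausing between consecutive calls.

    fetch:  caller-supplied function that performs the actual request (assumed).
    sleep:  injectable so the delay can be replaced in tests.
    """
    results = {}
    for index, season in enumerate(seasons):
        if index > 0:
            # Wait before every request except the first to respect rate limits.
            sleep(delay_seconds)
        results[season] = fetch(season)
    return results


if __name__ == "__main__":
    # Stand-in fetcher; a real one would make an HTTP request here.
    def fake_fetch(season):
        return [{"season": season, "rows": 0}]

    data = collect_seasons(fake_fetch, ["2014-15", "2015-16"], delay_seconds=0.1)
    print(sorted(data))
```

Passing the fetcher in as an argument keeps the pacing logic separate from any particular API, so the same loop can serve each of the three sources.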
Data Output Format
All data is stored as CSV files with consistent schemas. Master CSV files contain consolidated data from all seasons, while year-specific directories contain individual season data.
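One way to picture how season data is folded into a master file is a keyed merge in which newer rows replace older rows with the same key. The sketch below uses only the standard library to stay dependency-free. The column names `player_id`, `season`, and `pts` and the helper names are hypothetical, since this page does not list the actual schema or the platform's merge rules.

```python
import csv
import io


def merge_rows(historical, new, key_fields):
    """Combine two lists of row dicts; a new row replaces an old row with the same key."""
    merged = {}
    for row in list(historical) + list(new):
        key = tuple(row[field] for field in key_fields)
        merged[key] = row  # later rows overwrite earlier ones, keeping first-seen order
    return list(merged.values())


def to_csv_text(rows):
    """Serialize row dicts to CSV text, using the first row's keys as the header."""
    if not rows:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    # Hypothetical rows; real files would be read with csv.DictReader.
    old = [{"player_id": "1", "season": "2014", "pts": "10"}]
    fresh = [{"player_id": "1", "season": "2015", "pts": "20"}]
    print(to_csv_text(merge_rows(old, fresh, ("player_id", "season"))))
```

Keying on an identifier plus the season means re-running a collection for one season updates that season's rows without duplicating them in the master file.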
Technology Stack
The platform is built with:
- Python 3.x: Core scripting language
- pandas 1.5.3: Data manipulation and CSV operations
- requests 2.32.3: HTTP requests to APIs
- BeautifulSoup4 4.12.3: HTML parsing for web scraping
- nba_api 1.6.1: Community-maintained (unofficial) Python client for the NBA.com stats API
- plotly 5.23.0: Data visualization (optional)
Use Cases
This data platform supports:
- Player evaluation: Compare players across multiple statistical dimensions
- Team analytics: Analyze team offensive and defensive tendencies
- Historical trends: Track how the game has evolved from the 2014-15 season through 2024-25
- Predictive modeling: Build models using comprehensive feature sets
- Scouting reports: Generate detailed performance profiles
- Contract analysis: Evaluate player value relative to salary
Getting Started
Ready to dive in? Head over to the Quick Start Guide to:
- Set up your Python environment
- Run your first data collection script
- Query and analyze the resulting CSV data
View on GitHub
Explore the source code and contribute to the project