Overview
Shooting analytics datasets provide granular shot-level detail including shot types (catch-and-shoot, pull-up, dribble moves), shot zones (at-rim, mid-range, corner 3s), and contextual shooting metrics. These datasets enable deep-dive analysis of shooting efficiency and shot selection.
Data Files
Catch & Shoot: Catch-and-shoot performance metrics
catchshoot.csv
Pull-Up Shots: Pull-up jump shot statistics
pullup.csv
Shot Zones: Court zone analytics (rim, paint, mid-range, 3PT)
shotzone.csv / shotzone_ps.csv
Dribble Shooting: Shots after dribble moves
dribbleshot.csv / dribbleshot_ps.csv
dribbleshots.csv (by dribble count)
Schema: Catch and Shoot
File: catchshoot.csv
Records: 27,000+ player-season records
Source: NBA.com tracking data
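Fields
Player Info
PLAYER_ID: NBA.com unique player identifier
Game Stats
GP, W, L, MIN: Games played, wins, losses, and minutes played
Shooting Metrics
CATCH_SHOOT_FGM: Catch-and-shoot field goals made
CATCH_SHOOT_FGA: Catch-and-shoot field goals attempted
CATCH_SHOOT_FG_PCT: Catch-and-shoot FG% (0-1 scale)
CATCH_SHOOT_PTS: Points from catch-and-shoot attempts
CATCH_SHOOT_FG3M: Catch-and-shoot 3-pointers made
CATCH_SHOOT_FG3A: Catch-and-shoot 3-pointers attempted
CATCH_SHOOT_FG3_PCT: Catch-and-shoot 3PT% (0-1 scale)
CATCH_SHOOT_EFG_PCT: Catch-and-shoot effective FG% (0-1 scale)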
Sample Data
PLAYER_ID, PLAYER_NAME, TEAM_ID, TEAM_ABBREVIATION, GP, W, L, MIN, CATCH_SHOOT_FGM, CATCH_SHOOT_FGA, CATCH_SHOOT_FG_PCT, CATCH_SHOOT_PTS, CATCH_SHOOT_FG3M, CATCH_SHOOT_FG3A, CATCH_SHOOT_FG3_PCT, CATCH_SHOOT_EFG_PCT, Season
201985, AJ Price, 1610612739, CLE, 26, 11, 15, 323.0, 10, 31, 0.323, 30, 10.0, 28.0, 0.357, 0.484, 2014-15
201166, Aaron Brooks, 1610612741, CHI, 82, 50, 32, 1885.0, 67, 157, 0.427, 198, 64.0, 152.0, 0.421, 0.631, 2014-15
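The derived columns in the sample rows appear to follow the standard box-score formulas, which makes for a quick sanity check after loading the file. The sketch below recomputes FG%, eFG%, and points for the two sample rows above using only the standard library. The helper names (`fg_pct`, `efg_pct`, `shot_points`, `check_rows`) are illustrative and not part of any collection script, and the 0.001 tolerance is an assumption based on the three-decimal rounding seen in the sample.

```python
import csv
import io

# The two sample rows shown above, trimmed to the columns needed here.
SAMPLE = """PLAYER_NAME,CATCH_SHOOT_FGM,CATCH_SHOOT_FGA,CATCH_SHOOT_FG_PCT,CATCH_SHOOT_PTS,CATCH_SHOOT_FG3M,CATCH_SHOOT_EFG_PCT
AJ Price,10,31,0.323,30,10.0,0.484
Aaron Brooks,67,157,0.427,198,64.0,0.631
"""


def fg_pct(fgm, fga):
    """Field goal percentage on a 0-1 scale."""
    return fgm / fga if fga else 0.0


def efg_pct(fgm, fg3m, fga):
    """Effective FG%: a made three counts as 1.5 made field goals."""
    return (fgm + 0.5 * fg3m) / fga if fga else 0.0


def shot_points(fgm, fg3m):
    """Points from field goals: 2 per make plus 1 extra per made three."""
    return 2 * fgm + fg3m


def check_rows(text, tol=0.001):
    """Return the names of rows whose stored values disagree with the formulas."""
    mismatches = []
    for row in csv.DictReader(io.StringIO(text)):
        fgm = float(row["CATCH_SHOOT_FGM"])
        fga = float(row["CATCH_SHOOT_FGA"])
        fg3m = float(row["CATCH_SHOOT_FG3M"])
        ok = (
            abs(fg_pct(fgm, fga) - float(row["CATCH_SHOOT_FG_PCT"])) <= tol
            and abs(efg_pct(fgm, fg3m, fga) - float(row["CATCH_SHOOT_EFG_PCT"])) <= tol
            and shot_points(fgm, fg3m) == float(row["CATCH_SHOOT_PTS"])
        )
        if not ok:
            mismatches.append(row["PLAYER_NAME"])
    return mismatches


print(check_rows(SAMPLE))
```

An empty list means every row is internally consistent; any names returned point at rows worth inspecting by hand.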
Schema: Pull-Up Shots
File: pullup.csv
Structure: Similar to catch-and-shoot data
Fields follow the same pattern with a PULL_UP_ prefix:
PULL_UP_FGM, PULL_UP_FGA, PULL_UP_FG_PCT
PULL_UP_FG3M, PULL_UP_FG3A, PULL_UP_FG3_PCT
PULL_UP_PTS, PULL_UP_EFG_PCT
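Because the pull-up fields differ from the catch-and-shoot fields only by prefix, one way to compare or stack the two files is to strip the prefix and work with common column names. The sketch below builds such a rename mapping; `strip_prefix` is a hypothetical helper, and the column list is taken from the fields named above.

```python
def strip_prefix(columns, prefix):
    """Map each column that starts with `prefix` to its unprefixed name."""
    return {c: c[len(prefix):] for c in columns if c.startswith(prefix)}


pull_up_cols = [
    "PLAYER_ID",
    "PULL_UP_FGM", "PULL_UP_FGA", "PULL_UP_FG_PCT",
    "PULL_UP_FG3M", "PULL_UP_FG3A", "PULL_UP_FG3_PCT",
    "PULL_UP_PTS", "PULL_UP_EFG_PCT",
]

# Only the prefixed shooting columns are renamed; PLAYER_ID is left alone.
rename_map = strip_prefix(pull_up_cols, "PULL_UP_")
print(rename_map)
```

With pandas, the mapping can be passed to `DataFrame.rename(columns=rename_map)`, and the same helper works for the `CATCH_SHOOT_` prefix.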
Schema: Shot Zones (Advanced)
File: shotzone.csv / shotzone_ps.csv
Source: pbpstats.com API
Records: 50,000+ player-season records
At-Rim Metrics (< 6 feet)
AtRimFGA: Field goal attempts at rim
AtRimAccuracy: Shooting percentage at rim (0-1 scale)
Mid-Range Metrics
Frequency of short mid-range shots
Three-Point Metrics
Three-point attempts (excluding heaves)
Three-point makes (excluding heaves)
Corner three-point attempts
Half-court heave attempts
Advanced Efficiency
True shooting percentage (accounts for FTs)
Effective field goal percentage
ShotQualityAvg: Average expected points per shot (xPTS model)
SecondChanceShotQualityAvg: Shot quality on second-chance opportunities
Shot quality in penalty situations
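For reference, true shooting percentage is conventionally computed from points, field goal attempts, and free throw attempts. The sketch below uses the widely published formulation with a 0.44 free-throw coefficient; whether pbpstats.com computes its column exactly this way is an assumption, and `true_shooting_pct` is an illustrative name.

```python
def true_shooting_pct(points, fga, fta):
    """True shooting % on a 0-1 scale: points per two 'true' shot attempts.

    Uses the conventional 0.44 weight on free throw attempts (an assumption
    about the source's method, not something the dataset documents).
    """
    denom = 2 * (fga + 0.44 * fta)
    return points / denom if denom else 0.0


# 20 points on 10 field goal attempts and no free throws.
print(true_shooting_pct(20, 10, 0))
```

Unlike effective FG%, this measure rewards players who score at the line, which is why the two can diverge noticeably for high free-throw-rate players.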
Schema: Dribble Shooting
File: dribbleshot.csv / dribbleshot_ps.csv
Purpose: Shot performance after dribble moves (crossovers, hesitations, etc.)
Usage Examples
Elite Catch-and-Shoot Three-Point Shooters
import pandas as pd

df = pd.read_csv('catchshoot.csv')

# Season labels look like '2014-15'; use the ending year (2015) as 'year'
df['year'] = df['Season'].str.split('-').str[1].astype(int) + 2000

# Find elite catch-and-shoot 3PT shooters (min 100 attempts)
elite_cs = df[
    (df['year'] == 2024) &
    (df['CATCH_SHOOT_FG3A'] >= 100)
].sort_values('CATCH_SHOOT_FG3_PCT', ascending=False)

print(elite_cs[['PLAYER_NAME', 'TEAM_ABBREVIATION', 'CATCH_SHOOT_FG3A', 'CATCH_SHOOT_FG3_PCT']].head(20))
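The `+ 2000` conversion above works for season labels in the 2000s but would misread a label such as 1998-99. A slightly more defensive alternative, sketched here with a hypothetical helper `season_end_year`, derives the ending year from the four-digit start year instead. It assumes every label has the form `YYYY-YY`, as in the sample data.

```python
def season_end_year(season: str) -> int:
    """Convert a label like '2014-15' to the calendar year the season ends in."""
    start, _, _ = season.partition("-")
    return int(start) + 1


for label in ("2014-15", "1999-00", "2023-24"):
    print(label, season_end_year(label))
```

With pandas this could be applied as `df['year'] = df['Season'].map(season_end_year)`.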
Shot Zone Distribution Analysis
import pandas as pd

df = pd.read_csv('shotzone.csv')

# Calculate shot distribution for a player
player_name = 'Stephen Curry'
# Note: .iloc[0] raises IndexError if the player has no 2024 row
player = df[(df['Name'] == player_name) & (df['year'] == 2024)].iloc[0]

# Total attempts across zones. This assumes NonHeaveArc3FGA covers all
# non-heave threes; if it excludes corner threes, add Corner3FGA here.
total_fga = (
    player['AtRimFGA'] + player['ShortMidRangeFGA']
    + player['LongMidRangeFGA'] + player['NonHeaveArc3FGA']
)

print(f"Shot Distribution for {player_name} (2024):")
print(f"At Rim: {player['AtRimFGA'] / total_fga * 100:.1f}% ({player['AtRimAccuracy'] * 100:.1f}% FG)")
print(f"Short Mid: {player['ShortMidRangeFGA'] / total_fga * 100:.1f}%")
print(f"Long Mid: {player['LongMidRangeFGA'] / total_fga * 100:.1f}%")
print(f"Three-Point: {player['NonHeaveArc3FGA'] / total_fga * 100:.1f}% ({player['NonHeaveFg3Pct'] * 100:.1f}% 3PT)")
print(f"Corner 3s: {player['Corner3FGA']} attempts ({player['Corner3FGM']} makes)")
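The share calculation above can be factored into a small helper that guards against a zero denominator and makes the choice of zones explicit. The sketch below uses made-up attempt counts, not real player data, and `zone_shares` is a hypothetical name. Which columns belong in the total (for example, whether Corner3FGA should sit alongside NonHeaveArc3FGA) depends on how the source defines each zone, which is not confirmed here.

```python
def zone_shares(attempts):
    """Return each zone's share of total attempts as a fraction (0-1)."""
    total = sum(attempts.values())
    if total == 0:
        # No attempts at all: report zero for every zone instead of dividing by zero.
        return {zone: 0.0 for zone in attempts}
    return {zone: count / total for zone, count in attempts.items()}


# Illustrative counts only, not real data.
example = {
    "AtRimFGA": 200,
    "ShortMidRangeFGA": 100,
    "LongMidRangeFGA": 100,
    "NonHeaveArc3FGA": 600,
}
shares = zone_shares(example)
for zone, share in shares.items():
    print(f"{zone}: {share * 100:.1f}%")
```

Passing the zones as a dictionary keeps the denominator and the printed shares in sync, so adding or dropping a zone is a one-line change.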
Compare Catch-and-Shoot vs Pull-Up Efficiency
import pandas as pd

cs = pd.read_csv('catchshoot.csv')
pu = pd.read_csv('pullup.csv')

# Extract year from Season column ('2023-24' -> 2024)
cs['year'] = cs['Season'].str.split('-').str[1].astype(int) + 2000
pu['year'] = pu['Season'].str.split('-').str[1].astype(int) + 2000

# Filter 2024, min volume
cs_2024 = cs[(cs['year'] == 2024) & (cs['CATCH_SHOOT_FG3A'] >= 100)]
pu_2024 = pu[(pu['year'] == 2024) & (pu['PULL_UP_FG3A'] >= 100)]

# Merge on player (inner join: keeps only players who meet both thresholds)
merged = cs_2024.merge(
    pu_2024[['PLAYER_ID', 'PULL_UP_FG3_PCT', 'PULL_UP_FG3A']],
    on='PLAYER_ID'
)

# Positive values mean the player shoots better off the catch
merged['CS_vs_PU_Diff'] = merged['CATCH_SHOOT_FG3_PCT'] - merged['PULL_UP_FG3_PCT']

print("Players with Biggest Catch-and-Shoot Advantage:")
print(merged.sort_values('CS_vs_PU_Diff', ascending=False)[['PLAYER_NAME', 'CATCH_SHOOT_FG3_PCT', 'PULL_UP_FG3_PCT', 'CS_vs_PU_Diff']].head(10))
Shot Quality vs. Actual Efficiency
import pandas as pd

df = pd.read_csv('shotzone.csv')

# Compare xPTS (ShotQualityAvg) to actual efficiency
df_2024 = df[df['year'] == 2024].copy()

# Drop rows with no field goal attempts, which would otherwise yield inf/NaN
df_2024 = df_2024[(df_2024['FG2A'] + df_2024['FG3A']) > 0].copy()

# Calculate points per FGA. If Points includes free throws, this overstates
# scoring from field goals relative to the shot quality estimate.
df_2024['ActualPPFGA'] = df_2024['Points'] / (df_2024['FG2A'] + df_2024['FG3A'])
df_2024['OverPerformance'] = df_2024['ActualPPFGA'] - df_2024['ShotQualityAvg']

# Find players beating expected performance
print("Players Exceeding Shot Quality Expectations:")
print(df_2024.sort_values('OverPerformance', ascending=False)[['Name', 'ShotQualityAvg', 'ActualPPFGA', 'OverPerformance']].head(20))
Data Collection Scripts
new_tracking.py: Collects catch-and-shoot, pull-up, and tracking data from NBA.com
scrape_shooting.py: Scrapes shot zone and shot quality data
Notes
Catch-and-shoot: Shots taken within 2 seconds of receiving a pass
Pull-up: Shots taken off the dribble
Shot quality metrics from pbpstats.com use expected points models
Percentages in catchshoot.csv, pullup.csv, and shotzone.csv are on a 0-1 scale
The shotzone.csv file contains data from pbpstats.com, which may use different player IDs than NBA.com. Cross-reference using player names when merging with other datasets.
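Name-based cross-referencing tends to break on accents, punctuation, and stray whitespace, so it can help to build a normalized key on both sides before merging. The sketch below is one possible normalization, not the method used by the collection scripts; `normalize_name` is a hypothetical helper, and it does not handle suffixes such as "Jr." or two players who share a name.

```python
import unicodedata


def normalize_name(name: str) -> str:
    """Lowercase, strip accents and punctuation, and collapse whitespace."""
    # Decompose accented characters so the accent becomes a separate combining mark.
    decomposed = unicodedata.normalize("NFKD", name)
    unaccented = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Keep only letters, digits, and whitespace (drops periods, apostrophes, hyphens).
    kept = "".join(ch for ch in unaccented if ch.isalnum() or ch.isspace())
    return " ".join(kept.lower().split())


for raw in ("Luka Dončić", "P.J. Tucker", "  Stephen  Curry "):
    print(repr(normalize_name(raw)))
```

With pandas, the key could be built as `shotzone['name_key'] = shotzone['Name'].map(normalize_name)` and likewise from `PLAYER_NAME` in the tracking files, then merged on the key together with the season year. Unmatched rows are worth reviewing manually.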