
Overview

Advanced metrics provide deeper insight into player and team performance through calculated statistics, play-type analysis, and adjusted metrics that account for context and competition.

Play-Type Analysis

playtype.csv

Player-level play-type statistics showing how players score in different offensive situations.
File: playtype.csv / playtype_p.csv
PLAYER_ID
integer
required
Unique player identifier
Player
string
required
Player name
Team
string
Team abbreviation
TEAM_ID
integer
Team identifier
TEAM_NAME
string
Full team name
year
integer
required
Season year
playtype
string
required
Type of offensive play:
  • Isolation: 1-on-1 isolation plays
  • Transition: Fast break opportunities
  • PRBallHandler: Pick and roll ball handler
  • PRRollMan: Pick and roll roll man
  • Postup: Post-up plays
  • Spotup: Spot-up shooting
  • Handoff: Handoff plays
  • Cut: Cutting to the basket
  • OffScreen: Coming off screens
  • Putbacks: Offensive rebound putbacks
  • Misc: Miscellaneous plays
Position
string
Player position (G, F, C, or combinations)
GP
integer
Games played
% Time
float
Percentage of offensive possessions used in this play type
Poss
integer
Total possessions of this play type
Points
integer
Total points scored on this play type
PPP
float
Points per possession, the primary efficiency metric. Interpretation:
  • Elite: > 1.10 PPP
  • Good: 0.95-1.10 PPP
  • Average: 0.85-0.95 PPP
  • Below Average: < 0.85 PPP
FGm
integer
Field goals missed (lowercase m, distinct from FGM; the spelling may vary across datasets)
FGM
integer
Field goals made
FGA
integer
Field goals attempted
FG%
float
Field goal percentage
aFG%
float
Adjusted field goal percentage (accounts for 3-point value)
%TO
float
Turnover percentage - portion of possessions ending in turnover
%FT
float
Free throw frequency - portion of possessions resulting in free throws (not free throw shooting accuracy)
%SF
float
Shooting foul percentage - portion of possessions drawing shooting fouls
%Score
float
Scoring frequency - portion of possessions resulting in points
FTFreq%
float
Free throw frequency (appears to duplicate %FT)
TOVFreq%
float
Turnover frequency (appears to duplicate %TO)
SFFreq%
float
Shooting foul frequency (appears to duplicate %SF)
And OneFreq%
float
And-one frequency - made basket with foul drawn
ScoreFreq%
float
Scoring frequency (appears to duplicate %Score)
Percentile
integer
Percentile rank among all players for this play type (0-100), based on PPP efficiency relative to league average
Example:
Position,Player,Team,GP,% Time,Poss,Points,PPP,FGm,FGM,FGA,FG%,aFG%,%TO,%FT,%SF,%Score,year,playtype,...
G,Stephen Curry,GSW,79,12.4,412,478,1.16,142,178,320,55.6,58.9,11.2,8.5,7.3,62.1,2015,Transition,...
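The sample row doubles as a quick sanity check on how the columns relate. The sketch below hard-codes the values shown in that row and recomputes three of them. That the same relationships hold for every row in the file is an assumption worth verifying against the real data.

```python
# Values copied from the sample row above (Stephen Curry, Transition, 2015).
row = {"Poss": 412, "Points": 478, "PPP": 1.16,
       "FGm": 142, "FGM": 178, "FGA": 320, "FG%": 55.6}

# Points per possession, rounded to two decimals as in the sample.
ppp = round(row["Points"] / row["Poss"], 2)

# Misses plus makes should account for every attempt.
attempts = row["FGm"] + row["FGM"]

# Field goal percentage on a 0-100 scale, one decimal as in the sample.
fg_pct = round(100 * row["FGM"] / row["FGA"], 1)

print(ppp == row["PPP"], attempts == row["FGA"], fg_pct == row["FG%"])
```

All three comparisons agree for this row, which supports reading FGm as misses and FG% as a 0-100 value.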

Regularized Adjusted Plus-Minus (RAPM)

rapm.csv

Advanced plus-minus metrics that estimate player impact while controlling for teammates, opponents, and game context.
File: rapm.csv
playerId
integer
required
Unique player identifier
playerName
string
required
Player name
season
string
required
Season in format “YYYY-YY” (e.g., “2014-15”)
primaryKey
string
required
Unique key combining player ID and season (e.g., “101106_2009-10”)
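The season and primaryKey formats described above can be built and parsed with a few string operations. This is a minimal sketch: the helper names `season_label`, `make_primary_key`, and `split_primary_key` are hypothetical and not part of the dataset or any library.

```python
def season_label(start_year):
    """Format a season's starting year as 'YYYY-YY', e.g. 2014 -> '2014-15'."""
    return f"{start_year}-{(start_year + 1) % 100:02d}"


def make_primary_key(player_id, season):
    """Join a player ID and a season label the way the primaryKey example does."""
    return f"{player_id}_{season}"


def split_primary_key(key):
    """Split a primaryKey back into (player_id, season)."""
    player_id, season = key.split("_", 1)
    return int(player_id), season


print(make_primary_key(101106, season_label(2009)))
print(split_primary_key("101106_2009-10"))
```

Using `% 100` with zero padding keeps the label correct across a century boundary (1999 becomes "1999-00").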

RAPM Metrics

RAPM
float
Overall Regularized Adjusted Plus-Minus: net points per 100 possessions contributed by the player relative to average, adjusted for teammates and opponents. Interpretation:
  • Elite: +3.0 or higher
  • Good: +1.5 to +3.0
  • Average: -1.0 to +1.5
  • Below Average: < -1.0
RAPM_Rank
integer
League rank by RAPM (lower is better)
RAPM__Off
float
Offensive RAPM - points added per 100 possessions on offense
RAPM__Off_Rank
integer
League rank by offensive RAPM
RAPM__Def
float
Defensive RAPM - points prevented per 100 possessions on defense. Note: Positive values indicate better defense (fewer points allowed).
RAPM__Def_Rank
integer
League rank by defensive RAPM
RAPM__intercept
float
Regression intercept for RAPM model
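The interpretation bands listed for RAPM can be turned into a small labelling function. In this sketch the function name `rapm_tier` is hypothetical, and the handling of the shared boundaries (exactly +3.0 counts as Elite, exactly +1.5 as Good) is an assumption, since the bands as written overlap at those values.

```python
def rapm_tier(rapm):
    """Map an overall RAPM value to the interpretation bands listed above."""
    if rapm >= 3.0:
        return "Elite"
    if rapm >= 1.5:   # assumption: +1.5 itself is treated as Good
        return "Good"
    if rapm >= -1.0:
        return "Average"
    return "Below Average"


for value in (3.06, 2.0, 0.0, -1.5):
    print(value, rapm_tier(value))
```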

Luck-Adjusted RAPM (LA_RAPM)

LA_RAPM
float
Luck-adjusted RAPM - adjusts for shooting variance and randomness
LA_RAPM_Rank
integer
League rank by luck-adjusted RAPM
LA_RAPM__Off
float
Offensive luck-adjusted RAPM
LA_RAPM__Off_Rank
integer
Offensive LA_RAPM rank
LA_RAPM__Def
float
Defensive luck-adjusted RAPM
LA_RAPM__Def_Rank
integer
Defensive LA_RAPM rank
LA_RAPM__intercept
float
Regression intercept for LA_RAPM model

Component Metrics

RAPM broken down by four factors (eFG%, turnover rate, offensive rebound rate, free throw rate):
RA_EFG
float
RAPM component for effective field goal percentage
RA_EFG__Off
float
Offensive eFG% RAPM component
RA_EFG__Def
float
Defensive eFG% RAPM component
RA_FTR
float
RAPM component for free throw rate
RA_FTR__Off
float
Offensive FTR RAPM component
RA_FTR__Def
float
Defensive FTR RAPM component
RA_ORBD
float
RAPM component for offensive rebound differential
RA_ORBD__Off
float
Offensive rebound RAPM component
RA_ORBD__Def
float
Defensive rebound RAPM component
RA_TOV
float
RAPM component for turnover rate
RA_TOV__Off
float
Offensive turnover RAPM component
RA_TOV__Def
float
Defensive turnover RAPM component (turnovers forced)
All component metrics include _Rank and _intercept fields following the same pattern as overall RAPM.
Example:
playerId,playerName,RAPM,RAPM_Rank,RAPM__Off,RAPM__Def,...,season,primaryKey
101106,"Andrew Bogut",3.06,18,0.49,2.57,...,2009-10,101106_2009-10
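In the sample row, the offensive and defensive values add up to the overall RAPM, which fits the note that positive defensive values are good. The check below uses only the numbers from that row. Treating this as a rule for the whole file is an assumption, so the comparison uses a small tolerance for rounding.

```python
# Values copied from the sample row above (Andrew Bogut, 2009-10).
sample = {"RAPM": 3.06, "RAPM__Off": 0.49, "RAPM__Def": 2.57}

# Offensive plus defensive impact, compared with the overall figure.
combined = sample["RAPM__Off"] + sample["RAPM__Def"]

# Allow for two-decimal rounding in the published values.
matches = abs(combined - sample["RAPM"]) < 0.005

print(round(combined, 2), matches)
```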

Calculated Shooting Metrics

True Shooting Percentage (TS%)

Found in: scoring.csv, totals.csv, passing.csv, team_shotzone.csv
Formula: PTS / (2 * (FGA + 0.44 * FTA))
Interpretation:
  • Elite: > 60%
  • Good: 55-60%
  • Average: 52-55%
  • Below Average: < 52%
Accounts for:
  • Value of 3-point shots (worth 1.5x a 2-pointer)
  • Value of free throws (efficient scoring)
  • Free throws that do not use up a possession, such as and-ones (the 0.44 coefficient)
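A worked example of the TS% formula, as a minimal sketch. The function name `true_shooting_pct` and the stat line are hypothetical. The result is scaled to 0-100 because the usage examples on this page filter on `TS%` >= 60, which suggests the files store percentages; that scaling is an assumption.

```python
def true_shooting_pct(pts, fga, fta):
    """Return TS% on a 0-100 scale using PTS / (2 * (FGA + 0.44 * FTA))."""
    attempts = fga + 0.44 * fta
    if attempts == 0:
        return float("nan")  # no shooting possessions, so TS% is undefined
    return 100 * pts / (2 * attempts)


# Hypothetical season line, not taken from any file.
ts = true_shooting_pct(pts=1000, fga=700, fta=200)
print(f"{ts:.1f}")
```

With these inputs the denominator is 2 * (700 + 88) = 1576, so TS% comes out a little under 63.5, which falls in the Elite band.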

Effective Field Goal Percentage (eFG%)

Found in: player_shooting.csv, tracking.csv, team_shooting.csv, team_shotzone.csv
Formula: (FGM + 0.5 * 3PM) / FGA
Interpretation:
  • Elite: > 55%
  • Good: 52-55%
  • Average: 48-52%
  • Below Average: < 48%
Adjusts field goal percentage to account for the added value of 3-point shots.
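The adjustment is easiest to see next to plain FG%. In this sketch the function name `effective_fg_pct` and the stat line are hypothetical, and the 0-100 scaling is an assumption chosen to match the interpretation bands.

```python
def effective_fg_pct(fgm, fg3m, fga):
    """Return eFG% on a 0-100 scale using (FGM + 0.5 * 3PM) / FGA."""
    if fga == 0:
        return float("nan")  # no attempts, so eFG% is undefined
    return 100 * (fgm + 0.5 * fg3m) / fga


# Hypothetical line: 400 makes on 800 attempts, 100 of the makes are threes.
efg = effective_fg_pct(fgm=400, fg3m=100, fga=800)
fg = 100 * 400 / 800  # plain FG% for the same line

print(fg, efg)
```

The same 400 makes count as 50.0% by FG% but 56.25% by eFG%, because each made three is credited as one and a half makes.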

Points Per Possession (PPP)

Found in: playtype.csv, teamplay.csv
Formula: Points / Possessions
Interpretation:
  • Elite: > 1.10
  • Good: 0.95-1.10
  • Average: 0.85-0.95
  • Below Average: < 0.85
Primary metric for evaluating efficiency of specific play types.
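A short sketch that computes PPP and labels it with the bands above. The function name `ppp_tier` is hypothetical, and the boundary handling (exactly 1.10 counts as Good, exactly 0.95 as Good, exactly 0.85 as Average) is an assumption, since the listed ranges share endpoints.

```python
def ppp_tier(ppp):
    """Map a points-per-possession value to the interpretation bands above."""
    if ppp > 1.10:
        return "Elite"
    if ppp >= 0.95:   # assumption: shared endpoints go to the higher band
        return "Good"
    if ppp >= 0.85:
        return "Average"
    return "Below Average"


# Points and possessions from the playtype.csv sample row on this page.
ppp = 478 / 412
print(round(ppp, 2), ppp_tier(ppp))
```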

Usage Examples

Finding Elite Pick and Roll Ball Handlers

import pandas as pd

# Load play-type data
playtype = pd.read_csv('playtype.csv')

# Filter for PnR ball handlers with significant volume
pr_handlers = playtype[
    (playtype['playtype'] == 'PRBallHandler') &
    (playtype['year'] == 2023) &
    (playtype['Poss'] >= 200)  # Minimum possessions
].sort_values('PPP', ascending=False)

print(pr_handlers[['Player', 'Team', 'Poss', 'PPP', 'Percentile', '%Score', '%TO']].head(15))

Analyzing RAPM Two-Way Players

# Load RAPM data
rapm = pd.read_csv('rapm.csv')

# Filter for 2022-23 season
rapm_2023 = rapm[rapm['season'] == '2022-23']

# Find elite two-way players (top 50 in both offense and defense)
two_way = rapm_2023[
    (rapm_2023['RAPM__Off_Rank'] <= 50) &
    (rapm_2023['RAPM__Def_Rank'] <= 50)
].sort_values('RAPM', ascending=False)

print(two_way[['playerName', 'RAPM', 'RAPM__Off', 'RAPM__Def', 'RAPM_Rank']])

Comparing Play-Type Efficiency by Position

# Load play-type data
playtype = pd.read_csv('playtype.csv')

# Focus on isolation plays
iso = playtype[
    (playtype['playtype'] == 'Isolation') &
    (playtype['year'] == 2023) &
    (playtype['Poss'] >= 100)
]

# Group by position
iso_by_pos = iso.groupby('Position').agg({
    'PPP': 'mean',
    'FG%': 'mean',
    '%TO': 'mean',
    'Poss': 'sum'
}).round(3)

print(iso_by_pos.sort_values('PPP', ascending=False))
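Averaging PPP across players gives a 100-possession player the same weight as a 500-possession one. A possession-weighted figure, total points divided by total possessions, is one alternative. The sketch below uses plain Python and hypothetical rows so it runs without the CSV; the numbers are made up for illustration.

```python
from collections import defaultdict

# Hypothetical rows for illustration only: (Position, Poss, Points).
rows = [("G", 400, 400), ("G", 100, 80), ("F", 200, 190)]

# Accumulate [possessions, points] per position.
totals = defaultdict(lambda: [0, 0])
for position, poss, points in rows:
    totals[position][0] += poss
    totals[position][1] += points

# Possession-weighted PPP: total points over total possessions.
weighted_ppp = {
    position: round(points / poss, 3)
    for position, (poss, points) in totals.items()
}

print(weighted_ppp)
```

For the two guards here, the unweighted mean of 1.00 and 0.80 is 0.90, while the weighted value is 0.96 because the higher-volume player dominates.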

True Shooting vs. Effective Field Goal Analysis

# Load scoring and totals data
scoring = pd.read_csv('scoring.csv')
totals = pd.read_csv('totals.csv')

# Merge datasets
merged = pd.merge(
    scoring[['nba_id', 'Player', 'year', 'TS%', 'PTS']],
    totals[['nba_id', 'year', 'FGA', 'FTA']],
    on=['nba_id', 'year']
)

# Calculate FTA rate
merged['FTA_rate'] = merged['FTA'] / merged['FGA']

# Restrict to the 2023 season and a minimum scoring volume
filtered_2023 = merged[
    (merged['year'] == 2023) &
    (merged['PTS'] >= 500)  # Min scoring threshold
]

# Find efficient scorers who don't rely on free throws
efficient_shooters = filtered_2023[
    (filtered_2023['TS%'] >= 60) &
    (filtered_2023['FTA_rate'] < 0.3)
].sort_values('TS%', ascending=False)

print(efficient_shooters[['Player', 'TS%', 'FTA_rate', 'PTS']])

RAPM Component Analysis

# Load RAPM data
rapm = pd.read_csv('rapm.csv')

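# Focus on recent season; copy so the new columns below are added to an
# independent frame rather than a slice of rapm (avoids SettingWithCopyWarning)
rapm_recent = rapm[rapm['season'] == '2022-23'].copy()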

# Analyze component contributions
rapm_recent['shooting_impact'] = rapm_recent['RA_EFG__Off'] - rapm_recent['RA_EFG__Def']
rapm_recent['turnover_impact'] = rapm_recent['RA_TOV__Off'] - rapm_recent['RA_TOV__Def']
rapm_recent['rebound_impact'] = rapm_recent['RA_ORBD__Off'] - rapm_recent['RA_ORBD__Def']
rapm_recent['ft_impact'] = rapm_recent['RA_FTR__Off'] - rapm_recent['RA_FTR__Def']

# Find players who excel at specific skills
top_shooters = rapm_recent.nlargest(10, 'shooting_impact')[['playerName', 'shooting_impact', 'RAPM']]
top_playmakers = rapm_recent.nsmallest(10, 'turnover_impact')[['playerName', 'turnover_impact', 'RAPM']]

print("Top Shooting Impact:")
print(top_shooters)
print("\nTop Playmaking Impact (low turnovers):")
print(top_playmakers)

Play-Type Specialization Index

# Load play-type data
playtype = pd.read_csv('playtype.csv')

# Calculate specialization for 2023
pt_2023 = playtype[playtype['year'] == 2023]

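# For each player, keep the row of their most efficient play type (min 50 poss).
# drop_duplicates keeps the first (highest-PPP) row intact for each player,
# whereas groupby().first() can mix values from different rows when NaNs are present.
player_best = pt_2023[
    pt_2023['Poss'] >= 50
].sort_values('PPP', ascending=False).drop_duplicates('PLAYER_ID')

# Players who are elite specialists (90th percentile or higher)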
specialists = player_best[
    player_best['Percentile'] >= 90
].sort_values('PPP', ascending=False)

print(specialists[['Player', 'playtype', 'PPP', 'Percentile', 'Poss']])

Integration with Core Schemas

Joining Advanced Metrics with Player Stats

# Combine RAPM with basic stats
scoring = pd.read_csv('scoring.csv')
rapm = pd.read_csv('rapm.csv')

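# Extract the starting year from the season string ('2022-23' -> 2022).
# If `year` in scoring.csv denotes the season's ending year, add 1 here.
rapm['year'] = rapm['season'].str[:4].astype(int)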

# Merge
advanced = pd.merge(
    scoring,
    rapm[['playerId', 'year', 'RAPM', 'RAPM__Off', 'RAPM__Def']],
    left_on=['nba_id', 'year'],
    right_on=['playerId', 'year'],
    how='inner'
)

print(advanced[['Player', 'year', 'PTS', 'TS%', 'RAPM']].head())

Comprehensive Player Profile

# Create complete player profile combining multiple sources
def get_player_profile(player_id, season):
    scoring = pd.read_csv('scoring.csv')
    playtype = pd.read_csv('playtype.csv')
    rapm = pd.read_csv('rapm.csv')
    hustle = pd.read_csv('hustle.csv')
    
    # Basic stats
    basic = scoring[
        (scoring['nba_id'] == player_id) &
        (scoring['year'] == season)
    ]
    
    # Play types
    plays = playtype[
        (playtype['PLAYER_ID'] == player_id) &
        (playtype['year'] == season)
    ].sort_values('Poss', ascending=False)
    
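    # RAPM for this player only (season is treated as the starting year,
    # e.g. 2023 -> '2023-24')
    season_str = f"{season}-{str(season+1)[-2:]}"
    impact = rapm[
        (rapm['playerId'] == player_id) &
        (rapm['season'] == season_str)
    ]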
    
    # Hustle
    hustle_stats = hustle[
        (hustle['PLAYER_ID'] == player_id) &
        (hustle['year'] == season)
    ]
    
    return {
        'basic': basic,
        'play_types': plays,
        'impact': impact,
        'hustle': hustle_stats
    }

# Example usage
profile = get_player_profile(201939, 2023)  # Stephen Curry
print("Play Type Breakdown:")
print(profile['play_types'][['playtype', 'Poss', 'PPP', 'Percentile']])
