14 Year Average RAPM Dataset

J.E.
Posts: 818
Joined: Fri Apr 15, 2011 8:28 am

14 Year Average RAPM Dataset

Post by J.E. » Tue Apr 08, 2014 3:23 pm

** Split off from another thread - DSMok1 **

Not entirely sure where to put this - figured this was as good a place as any.

As requested by DSMok1 here's RAPM done with all the data I have, adjusted for age when getting the estimates (to not penalize players that played with extremely old/young teammates) but aging again added in afterwards (otherwise the estimates would reflect the case of everyone being the same age).

That should be a reasonable dataset for anyone trying to build a SPM. Data from the 2001 season until now

bbstats
Posts: 224
Joined: Thu Apr 21, 2011 8:25 pm
Location: Boone, NC

Re: Statistical +/-

Post by bbstats » Tue Apr 08, 2014 5:41 pm

Looks useful for SPM development...but.........why is KD +1?

knarsu3
Posts: 103
Joined: Thu Apr 14, 2011 11:25 pm

Re: Statistical +/-

Post by knarsu3 » Tue Apr 08, 2014 6:06 pm

bbstats wrote:Looks useful for SPM development...but.........why is KD +1?
This is why I think it'd be great to explore some relationships between the Vantage stats and RPM/RAPM. We'd then be able to get to the "why" for each player.

J.E.
Posts: 818
Joined: Fri Apr 15, 2011 8:28 am

Re: Statistical +/-

Post by J.E. » Tue Apr 08, 2014 6:47 pm

bbstats wrote:Looks useful for SPM development...but.........why is KD +1?
RAPM is not a fan of his first two years in the league, I suppose.
Unfortunately, he gets unfairly punished for having played a lot of minutes in those first two years despite being extremely awful in raw +/-. Other rookies and second-year players might not play as much when performing that way. Because he did, there's 'more statistical evidence' that he's not great (given the circumstances of the analysis, e.g. 'everybody follows the same aging curve'), and the regression has a hard time giving him a more positive estimate.

nbo2
Posts: 42
Joined: Thu Apr 25, 2013 4:28 am

Re: Statistical +/-

Post by nbo2 » Tue Apr 08, 2014 7:01 pm

bbstats wrote:Looks useful for SPM development...but.........why is KD +1?
Because of his rookie and sophomore years

AcrossTheCourt
Posts: 237
Joined: Sat Feb 16, 2013 11:56 am

Re: Statistical +/-

Post by AcrossTheCourt » Tue Apr 08, 2014 9:39 pm

J.E. wrote:Not entirely sure where to put this - figured this was as good a place as any.

As requested by DSMok1 here's RAPM done with all the data I have, adjusted for age when getting the estimates (to not penalize players that played with extremely old/young teammates) but aging again added in afterwards (otherwise the estimates would reflect the case of everyone being the same age).

That should be a reasonable dataset for anyone trying to build a SPM. Data from the 2001 season until now
Will you redo this once the 2014 season is over so the complete season is used? And the entire 2001 season is in there, right? I know an earlier version only had an incomplete 2001, but that was a while ago.

DSMok1
Posts: 905
Joined: Thu Apr 14, 2011 11:18 pm
Location: Maine

Re: Statistical +/-

Post by DSMok1 » Wed Apr 09, 2014 1:35 am

J.E. wrote:Not entirely sure where to put this - figured this was as good a place as any.

As requested by DSMok1 here's RAPM done with all the data I have, adjusted for age when getting the estimates (to not penalize players that played with extremely old/young teammates) but aging again added in afterwards (otherwise the estimates would reflect the case of everyone being the same age).

That should be a reasonable dataset for anyone trying to build a SPM. Data from the 2001 season until now
Excellent, excellent work, Jeremias. Great work on the RPM also. This is a great set for me to revise ASPM upon.

KD really (truly) had bad numbers his first couple of years. That probably drags down his overall value because the aging curve is the same for everyone, whereas he has gone from awful (mostly because of situation: he played the 2 as a rookie) to MVP level.
Developer of Box Plus/Minus
APBRmetrics Forum Administrator
GodismyJudgeOK.com/DStats/
Twitter.com/DSMok1

Mike G
Posts: 4429
Joined: Fri Apr 15, 2011 12:02 am
Location: Asheville, NC

Re: Statistical +/-

Post by Mike G » Wed Apr 09, 2014 3:18 pm

J.E. wrote:Not entirely sure where to put this - figured this was as good a place as any.

As requested by DSMok1 here's RAPM done with all the data I have, adjusted for age when getting the estimates (to not penalize players that played with extremely old/young teammates) but aging again added in afterwards (otherwise the estimates would reflect the case of everyone being the same age).

That should be a reasonable dataset for anyone trying to build a SPM. Data from the 2001 season until now
Quickly looking over that link, I find:
- Of 1383 players in the interval, just 248, or 18%, are positive. That's 8 per team over 14 seasons.
- Almost 47% are below the dreaded -2.35 that denotes 'replacement level'.
- EDIT: The 18% of players who are above average account for 44% of all possessions.
- EDIT: The 47% below replacement level account for 25% of all possessions.
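
The first bullet's arithmetic can be re-derived with a short sketch. The player counts are the ones quoted above; `n_teams = 30` is an assumption (the league had 29 teams for part of this span), so the per-team figure is only approximate.

```python
# Counts taken from the post above.
n_players = 1383
n_positive = 248

# Assumption: 30 teams (there were 29 before the 2004-05 season).
n_teams = 30

share_positive = n_positive / n_players   # fraction of players rated above zero
per_team = n_positive / n_teams           # positive-rated players per franchise

print(f"{share_positive:.1%} positive, about {per_team:.1f} per team")
```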

Is there a formula to create WAR (or just Wins) from RAPM X possessions?

DSMok1
Posts: 905
Joined: Thu Apr 14, 2011 11:18 pm
Location: Maine

Re: Statistical +/-

Post by DSMok1 » Wed Apr 09, 2014 4:54 pm

J.E. wrote:Not entirely sure where to put this - figured this was as good a place as any.

As requested by DSMok1 here's RAPM done with all the data I have, adjusted for age when getting the estimates (to not penalize players that played with extremely old/young teammates) but aging again added in afterwards (otherwise the estimates would reflect the case of everyone being the same age).

That should be a reasonable dataset for anyone trying to build a SPM. Data from the 2001 season until now
J.E., there are duplicate names in that table. Could you please post with your IDs also? I hope you're using BBRef IDs; that's what I'm using.

There are 5 duplicates: Chris Johnson, Glen Rice, Marcus Williams, Patrick Ewing, and Tim Hardaway.

DSMok1
Posts: 905
Joined: Thu Apr 14, 2011 11:18 pm
Location: Maine

Re: Statistical +/-

Post by DSMok1 » Wed Apr 09, 2014 5:44 pm

Here's a visualization of the 14 year RAPM dataset that J.E. posted:

http://public.tableausoftware.com/profi ... 14YearRAPM

J.E.
Posts: 818
Joined: Fri Apr 15, 2011 8:28 am

Re: Statistical +/-

Post by J.E. » Wed Apr 09, 2014 10:40 pm

Mike G wrote:Quickly looking over that link, I find:
- Of 1383 players in the interval, just 248, or 18%, are positive. That's 8 per team over 14 seasons.
- Almost 47% are below the dreaded -2.35 that denotes 'replacement level'.
- EDIT: The 18% of players who are above average account for 44% of all possessions.
- EDIT: The 47% below replacement level account for 25% of all possessions.
I'm sorry, sometimes I don't shift the ratings correctly out of laziness, which is the case here. I figured it didn't matter too much with these since whoever is going to use these as a dependent variable in SPM is going to center them anyway.

So, some results of your analysis might look different once the ratings are shifted correctly (although it might not make a huge difference). You can wait for me to do it (since I'm going to re-post them anyway because of the duplicate names), or you can shift them yourself, so that

Sum (for_each_player: off_rating*poss/2) = 0, and
Sum (for_each_player: def_rating*poss/2) = 0
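
A minimal sketch of that shift, assuming each row carries an offensive rating, a defensive rating and a possession count. The field names `off`, `def` and `poss` and the sample rows are hypothetical, not taken from the posted file. Subtracting the possession-weighted mean from every rating makes both sums above equal zero.

```python
def center_ratings(players):
    """Shift ratings so the possession-weighted sums of off and def are zero."""
    total_poss = sum(p["poss"] for p in players)
    # Possession-weighted league averages; the /2 in the formulas cancels out.
    off_mean = sum(p["off"] * p["poss"] for p in players) / total_poss
    def_mean = sum(p["def"] * p["poss"] for p in players) / total_poss
    return [
        {**p, "off": p["off"] - off_mean, "def": p["def"] - def_mean}
        for p in players
    ]


# Hypothetical rows for illustration only.
players = [
    {"name": "A", "off": 2.0, "def": 1.0, "poss": 10000},
    {"name": "B", "off": -1.0, "def": 0.5, "poss": 20000},
    {"name": "C", "off": -3.0, "def": -2.0, "poss": 500},
]
centered = center_ratings(players)

# Both sums should now be zero up to floating-point error.
off_sum = sum(p["off"] * p["poss"] / 2 for p in centered)
def_sum = sum(p["def"] * p["poss"] / 2 for p in centered)
```

Every player gets the same constant added, so differences between players are unchanged; only the zero point moves.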

Crow
Posts: 6250
Joined: Thu Apr 14, 2011 11:10 pm

Re: 14 Year Average RAPM Dataset

Post by Crow » Thu Apr 10, 2014 3:24 am

With regard to the visualization: about ten of the named players near the frontier in the quadrant that is positive on both offense and defense have won titles, whereas in the other three quadrants only one named frontier player has done so. Of course, there are many unlabelled non-frontier players in each quadrant who may have won titles, but this is perhaps a modest indication of the importance of strong two-way leaders for winning titles.

Mike G
Posts: 4429
Joined: Fri Apr 15, 2011 12:02 am
Location: Asheville, NC

Re: 14 Year Average RAPM Dataset

Post by Mike G » Thu Apr 10, 2014 10:42 am

J.E. wrote:
Sum (for_each_player: off_rating*poss/2) = 0, and
Sum (for_each_player: def_rating*poss/2) = 0
If I take ORtg*poss/200, I should have total points above avg, yes?
In that case, the whole 1383 player-careers total 72,600 points below avg on offense, +29,300 on defense.
In 30 million possessions, it's -.24/100 on offense and +.10 on defense.
Mike G wrote:Is there a formula to create WAR (or just Wins) from RAPM X possessions?
Good question. The 2014 file from the ESPN article (other thread) shows minutes but not possessions.
Estimating possessions by Min*TmPace/48, I get the formula:
WAR = Poss*(RPM+2.35)/3199

The divisor 3199 returns an avg difference of .0354 between posted WAR and (faux) WAR using estimated possessions.
The ESPN table I copied indicates 1141 games played this year. Player WAR sum to 804.8, or .705 of that.
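
The fitted formula above can be written out as a small sketch. The function names and the sample inputs are hypothetical; the constants -2.35 and 3199 and the Min*TmPace/48 estimate are the ones stated in this post, and the result is only the "faux" WAR described here, not ESPN's published calculation.

```python
REPLACEMENT_LEVEL = -2.35   # replacement-level rating quoted in the thread
DIVISOR = 3199              # empirically fitted divisor from the post above


def estimate_possessions(minutes, team_pace):
    """Approximate a player's possessions as Min * TmPace / 48."""
    return minutes * team_pace / 48


def faux_war(poss, rpm):
    """WAR = Poss * (RPM + 2.35) / 3199, using estimated possessions."""
    return poss * (rpm - REPLACEMENT_LEVEL) / DIVISOR


# Hypothetical player: 2400 minutes on a team with a pace of 96, RPM of +2.0.
poss = estimate_possessions(2400, 96.0)
war = faux_war(poss, 2.0)
print(f"{poss:.0f} possessions, faux WAR {war:.2f}")
```

A player rated exactly at -2.35 comes out at zero wins regardless of playing time, which is what "replacement level" means in this formula.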

J.E.
Posts: 818
Joined: Fri Apr 15, 2011 8:28 am

Re: 14 Year Average RAPM Dataset

Post by J.E. » Thu Apr 10, 2014 12:15 pm

Mike G wrote:
Sum (for_each_player: off_rating*poss/2) = 0, and
Sum (for_each_player: def_rating*poss/2) = 0
If I take ORtg*poss/200, I should have total points above avg, yes?
In that case, the whole 1383 player-careers total 72,600 points below avg on offense, +29,300 on defense.
In 30 million possessions, it's -.24/100 on offense and +.10 on defense.
Looks good, except that it's 15 million possessions each for offense and defense. You'd then want to shift everyone's offensive rating up by .48 and everyone's defensive rating down by .2

Updated the file with basketball-reference player IDs (and shifted the ratings accordingly): http://stats-for-the-nba.appspot.com/ratings/14y.html
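
The size of that shift can be checked from the totals Mike G reported. This is just the arithmetic from the two posts, with hypothetical variable names: total points above average divided by 15 million possessions per side, expressed per 100 possessions, then negated.

```python
# Totals reported by Mike G (points relative to average over all player-careers).
points_off = -72_600
points_def = 29_300

# Per J.E.: 15 million possessions each for offense and defense.
poss_each = 15_000_000

avg_off = points_off / poss_each * 100   # league-wide offensive average per 100
avg_def = points_def / poss_each * 100   # league-wide defensive average per 100

# The correction is the negative of the current average.
shift_off = -avg_off
shift_def = -avg_def

print(round(shift_off, 2), round(shift_def, 2))  # 0.48 -0.2
```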

DSMok1
Posts: 905
Joined: Thu Apr 14, 2011 11:18 pm
Location: Maine

Re: 14 Year Average RAPM Dataset

Post by DSMok1 » Thu Apr 10, 2014 12:21 pm

Thanks a lot, J.E.!

Could you explain how players with ~0 possessions can have a variety of ratings in a pure RAPM format? Players below 30 possessions vary from -1 to -3. I would have expected them to converge to a single value (the prior) at that small a sample size.
