14 Year Average RAPM Dataset
** Split off from another thread - DSMok1 **
Not entirely sure where to put this - figured this was as good a place as any.
As requested by DSMok1 here's RAPM done with all the data I have, adjusted for age when getting the estimates (to not penalize players that played with extremely old/young teammates) but aging again added in afterwards (otherwise the estimates would reflect the case of everyone being the same age).
That should be a reasonable dataset for anyone trying to build a SPM. Data from the 2001 season until now
Re: Statistical +/-
Looks useful for SPM development...but.........why is KD +1?
Re: Statistical +/-
bbstats wrote: Looks useful for SPM development...but.........why is KD +1?
This is why I think it'd be great to explore some relationships between the Vantage stats and RPM/RAPM. We'd then be able to get to the "why" for each player.
Re: Statistical +/-
bbstats wrote: Looks useful for SPM development...but.........why is KD +1?
RAPM is not a fan of his first two years in the league, I suppose.
Unfortunately he gets unfairly punished by having played a lot of minutes in those first two years, despite being extremely awful in raw +/-. Other rookies/2nd-year-players might not play as much when performing that way. Because he did, there's 'more statistical evidence' that he's not great (given the circumstances of the analysis, 'everybody follows the same aging curve', etc.) and the regression has a hard time giving him a more positive estimate
Re: Statistical +/-
Quote: Looks useful for SPM development...but.........why is KD +1?
Because of his rookie and sophomore years.
Re: Statistical +/-
J.E. wrote: Not entirely sure where to put this - figured this was as good a place as any.
As requested by DSMok1 here's RAPM done with all the data I have, adjusted for age when getting the estimates (to not penalize players that played with extremely old/young teammates) but aging again added in afterwards (otherwise the estimates would reflect the case of everyone being the same age).
That should be a reasonable dataset for anyone trying to build a SPM. Data from the 2001 season until now
Will you redo this once the 2014 season is over so the complete season is used? And the entire 2001 season is in there, right? I know an earlier version only had an incomplete 2001, but that was a while ago.
Re: Statistical +/-
J.E. wrote: Not entirely sure where to put this - figured this was as good a place as any.
As requested by DSMok1 here's RAPM done with all the data I have, adjusted for age when getting the estimates (to not penalize players that played with extremely old/young teammates) but aging again added in afterwards (otherwise the estimates would reflect the case of everyone being the same age).
That should be a reasonable dataset for anyone trying to build a SPM. Data from the 2001 season until now
Excellent, excellent work, Jeremias. Great work on the RPM also. This is a great set for me to revise ASPM upon.
KD really (truly) had bad numbers his first couple of years. Probably drags down his overall value because the aging curve is the same for everyone... where he's gone from awful (because of situation, mostly--played the 2 as a rookie) to MVP level.
Re: Statistical +/-
J.E. wrote: Not entirely sure where to put this - figured this was as good a place as any.
As requested by DSMok1 here's RAPM done with all the data I have, adjusted for age when getting the estimates (to not penalize players that played with extremely old/young teammates) but aging again added in afterwards (otherwise the estimates would reflect the case of everyone being the same age).
That should be a reasonable dataset for anyone trying to build a SPM. Data from the 2001 season until now
Quickly looking over that link, I find:
- Of 1383 players in the interval, just 248, or 18%, are positive. That's 8 per team over 14 seasons.
- Almost 47% are below the dreaded -2.35 that denotes 'replacement level'.
- EDIT: The 18% of players who are above average account for 44% of all possessions.
- EDIT: The 47% who are below replacement level account for 25% of all possessions.
Is there a formula to create WAR (or just Wins) from RAPM X possessions?
Re: Statistical +/-
J.E. wrote: Not entirely sure where to put this - figured this was as good a place as any.
As requested by DSMok1 here's RAPM done with all the data I have, adjusted for age when getting the estimates (to not penalize players that played with extremely old/young teammates) but aging again added in afterwards (otherwise the estimates would reflect the case of everyone being the same age).
That should be a reasonable dataset for anyone trying to build a SPM. Data from the 2001 season until now
J.E., there are duplicate names in that table. Could you please post with your IDs also? I hope you're using BBRef IDs; that's what I'm using.
There are 5 duplicates: Chris Johnson, Glen Rice, Marcus Williams, Patrick Ewing, and Tim Hardaway.
Re: Statistical +/-
Here's a visualization of the 14 year RAPM dataset that J.E. posted:
http://public.tableausoftware.com/profi ... 14YearRAPM
Re: Statistical +/-
Mike G wrote: Quickly looking over that link, I find:
- Of 1383 players in the interval, just 248, or 18%, are positive. That's 8 per team over 14 seasons.
- Almost 47% are below the dreaded -2.35 that denotes 'replacement level'.
- EDIT: The 18% of players who are above average account for 44% of all possessions.
- EDIT: The 47% who are below replacement level account for 25% of all possessions.
I'm sorry, sometimes I don't shift the ratings correctly out of laziness, which is the case here. I figured it didn't matter too much with these since whoever is going to use these as a dependent variable in SPM is going to center them anyway.
So, some results of your analysis might look different once the ratings are shifted correctly (although it might not make a huge difference). You can wait for me to do it (since I'm going to re-post them anyway because of the duplicate names), or you can shift them yourself, so that
Sum (for_each_player: off_rating*poss/2) = 0, and
Sum (for_each_player: def_rating*poss/2) = 0
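The centering condition above can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name `recenter` and the sample numbers are made up, and it is not necessarily how J.E. performs the shift. Since every player's weight is poss/2, the factor of 1/2 cancels, so the required shift is simply the possession-weighted mean rating.

```python
def recenter(ratings, poss):
    """Shift ratings so that sum(rating * poss / 2) is zero.

    ratings: per-player offensive (or defensive) ratings, per 100 possessions
    poss:    per-player possession counts (hypothetical inputs)
    """
    total_poss = sum(poss)
    # Possession-weighted mean; the /2 in the condition cancels out here.
    weighted_mean = sum(r * p for r, p in zip(ratings, poss)) / total_poss
    return [r - weighted_mean for r in ratings]


# Made-up example: three players with uncentered ratings.
ratings = [2.0, -1.0, -3.0]
poss = [10000, 6000, 4000]
shifted = recenter(ratings, poss)

# After the shift, the weighted sum from the condition is (numerically) zero.
check = sum(r * p / 2 for r, p in zip(shifted, poss))
```

The same function would be applied once to the offensive ratings and once to the defensive ratings.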
Re: 14 Year Average RAPM Dataset
With regard to the visualization, about ten of the named guys near the frontier in the quadrant that contains players positive on offense and defense have won titles whereas in the other three quadrants only one named frontier player has done so. Of course there are many unlabelled non-frontier players who may have won titles in each quad but this is a modest indication of the importance of 2 way strong leaders for winning titles, perhaps.
Re: 14 Year Average RAPM Dataset
Quote:
Sum (for_each_player: off_rating*poss/2) = 0, and
Sum (for_each_player: def_rating*poss/2) = 0
If I take ORtg*poss/200, I should have total points above avg, yes?
In that case, the whole 1383 player-careers total 72,600 points below avg on offense, +29,300 on defense.
In 30 million possessions, it's -.24/100 on offense and +.10 on defense.
Quote: Is there a formula to create WAR (or just Wins) from RAPM X possessions?
Good question. The 2014 file from the ESPN article (other thread) shows minutes but not possessions.
Estimating possessions by Min*TmPace/48, I get the formula:
WAR = Poss*(RPM+2.35)/3199
The divisor 3199 returns an avg difference of .0354 between posted WAR and (faux) WAR using estimated possessions.
The ESPN table I copied indicates 1141 games played this year. Player WAR sum to 804.8, or .705 of that.
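A rough Python sketch of the estimate described above, for anyone who wants to try it. The function names and the sample player are hypothetical; the -2.35 replacement level and the 3199 divisor are the values quoted in this post, and the divisor was fitted to estimated possessions, so treat the output as approximate.

```python
REPLACEMENT_LEVEL = -2.35  # replacement level per 100 possessions, as quoted above
WAR_DIVISOR = 3199.0       # fitted divisor from the post above


def estimated_possessions(minutes, team_pace):
    """Estimate a player's possessions as Min * TmPace / 48."""
    return minutes * team_pace / 48.0


def faux_war(rpm, poss):
    """WAR = Poss * (RPM + 2.35) / 3199."""
    return poss * (rpm - REPLACEMENT_LEVEL) / WAR_DIVISOR


# Made-up player: 2400 minutes on a team with a pace of 96, RPM of +4.05.
poss = estimated_possessions(2400, 96.0)
war = faux_war(4.05, poss)

# A player exactly at replacement level comes out at zero WAR by construction.
replacement_war = faux_war(-2.35, poss)
```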
Re: 14 Year Average RAPM Dataset
Mike G wrote:
Quote:
Sum (for_each_player: off_rating*poss/2) = 0, and
Sum (for_each_player: def_rating*poss/2) = 0
If I take ORtg*poss/200, I should have total points above avg, yes?
In that case, the whole 1383 player-careers total 72,600 points below avg on offense, +29,300 on defense.
In 30 million possessions, it's -.24/100 on offense and +.10 on defense.
Looks good, except that it's 15 million possessions each for offense and defense. You'd then want to shift everyone's offensive rating up by .48, and down by .2 for defense.
Updated the file with basketball-reference player IDs (and shifted the ratings accordingly) http://stats-for-the-nba.appspot.com/ratings/14y.html
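For reference, the .48 and .2 shifts follow directly from the totals quoted above. A small arithmetic check in Python, where the helper name `shift_per_100` is made up and the 15 million figure is the rounded count mentioned in this post:

```python
def shift_per_100(total_points_vs_avg, possessions):
    """Rating shift (per 100 possessions) that cancels a total surplus or deficit."""
    return -total_points_vs_avg / possessions * 100.0


# Totals from the quoted post: 72,600 points below average on offense,
# 29,300 above average on defense, over roughly 15 million possessions each.
off_shift = shift_per_100(-72_600, 15_000_000)  # positive: shift offense up
def_shift = shift_per_100(29_300, 15_000_000)   # negative: shift defense down
```

Rounded, `off_shift` is about +.48 and `def_shift` is about -.2, matching the shifts stated above.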
Re: 14 Year Average RAPM Dataset
Thanks a lot, J.E.!
Could you explain how players with ~0 possessions can have a variety of ratings in a pure RAPM format? Players below 30 possessions vary from -1 to -3. I would have expected them to converge to a single value (the prior) at that small a sample size.