vanilla RAPM
Back by popular demand. You can find it here
http://stats-for-the-nba.appspot.com/va ... -2007.html
http://stats-for-the-nba.appspot.com/va ... -2008.html
http://stats-for-the-nba.appspot.com/va ... -2009.html
http://stats-for-the-nba.appspot.com/va ... -2010.html
http://stats-for-the-nba.appspot.com/va ... -2011.html
http://stats-for-the-nba.appspot.com/va ... -2012.html
http://stats-for-the-nba.appspot.com/va ... -2013.html
"2004-2007" is multiyear RAPM, where 2004 gets weighted with 1/8, 2005 with 1/4, 2006 with 1/2 and 2007 with 1. All other files accordingly. Doesn't use priors anywhere, so it's probably a little kinder to rookies, and free of any BoxScore data
Re: vanilla RAPM
Great. Just when I found out how to do it properly 
Could you share your decision on minutes cutoff and cross validation method?

Re: vanilla RAPM
Great, thanks J.E.
Re: vanilla RAPM
permaximum wrote:Great. Just when I found out how to do it properly
Could you share your decision on minutes cutoff and cross validation method?
You don't need any minute cutoff with ridge (provided you have a reasonable lambda).
As for cross-validation, I'd randomly remove observations from the data and then "forecast" the left-out observations later with the computed ß's. You could do 10-fold CV, but 4-fold already seems enough if you want to save a bit of computing time.
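A hedged sketch of that procedure: hold out random stints, fit the ridge on the rest, and score the held-out rows for each candidate lambda. It reuses the hypothetical (X, y, w) layout from the sketch above; the lambda grid and fold count are placeholders.
Code:
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import KFold

def pick_lambda(X, y, w, lambdas=(500, 1000, 2000, 4000, 8000), n_folds=4, seed=0):
    """Return the lambda with the lowest possession-weighted out-of-sample error."""
    kf = KFold(n_splits=n_folds, shuffle=True, random_state=seed)
    cv_error = {}
    for lam in lambdas:
        fold_errors = []
        for train_idx, test_idx in kf.split(y):
            model = Ridge(alpha=lam, fit_intercept=True)
            model.fit(X[train_idx], y[train_idx], sample_weight=w[train_idx])
            pred = model.predict(X[test_idx])
            # weighted squared error when "forecasting" the held-out stints
            fold_errors.append(np.average((y[test_idx] - pred) ** 2, weights=w[test_idx]))
        cv_error[lam] = float(np.mean(fold_errors))
    return min(cv_error, key=cv_error.get), cv_error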
Re: vanilla RAPM
J.E. wrote:You don't need any minute cutoff with ridge (provided you have a reasonable lambda).
Thanks for the information. I found that lambda pretty much stabilizes at 100-fold CV. It takes more time ofc, but it's not that bad. I don't like getting different (close, but different anyway) lambda values each time I use 10-fold CV.
As for the cutoff, I decided to go with it simply because of this paragraph here (http://www.nbastuffer.com/component/opt ... /catid,42/):
RAPM is about twice as accurate as an APM using standard regression and using 3 years of data, where the weighting of past years of data and the reference player minutes cutoff has also been carefully optimized.
I think I understood the sentence wrong because of my English.

Re: vanilla RAPM
permaximum wrote:do you think for 1-year RAPM cutoff isn't needed too?
Kosta Koufos is #1 right now in 1y RAPM, I think.
For the actual computation of the ß's a minute cutoff isn't necessary. When displaying results you might want to use a cutoff, so people aren't tempted to call _unknown_player_X the best player in the league ("because RAPM said so"). Or just include each player's minutes or # of possessions with the RAPM results and add a disclaimer that there's higher uncertainty with lower minutes.
I don't exactly know what you're using those for, but I'd just use more years of data if you want to avoid "weird names at the top" and whatnot.
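For illustration, a small sketch of that display step: keep everyone in the regression, attach minutes to the output, and only apply a cutoff (a hypothetical 500 minutes here) when sorting the table for presentation.
Code:
import pandas as pd

def rapm_table(player_names, coefs, minutes, min_minutes=500):
    """coefs and minutes are aligned with player_names; the cutoff is display-only."""
    table = pd.DataFrame({"player": player_names, "RAPM": coefs, "MIN": minutes})
    qualified = table[table["MIN"] >= min_minutes]
    return qualified.sort_values("RAPM", ascending=False).reset_index(drop=True)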
Re: vanilla RAPM
Thanks for all the info again. Then I won't use a minute cutoff for the ridge, but I'll qualify player results by minutes in the final output.
I have a simple player box-score metric which I believe is near the best you can get with raw box scores. I want to check which players' negative or positive effects don't translate into the box score, to get a rough idea of the things box scores miss. It will also be useful to see what type of players they are. Then I will decide what to do with the results.
Re: vanilla RAPM
permaximum wrote:Thanks for all the info again. Then, I won't use minute cutoff for the ridge but qualify players results by minutes for the final results in the end.
You might be interested in a couple of articles I wrote: http://godismyjudgeok.com/DStats/2012/n ... -via-rapm/ and http://godismyjudgeok.com/DStats/2012/n ... pm-part-2/
Re: vanilla RAPM
DSMok1 wrote:You might be interested in a couple of articles I wrote: http://godismyjudgeok.com/DStats/2012/n ... -via-rapm/ and http://godismyjudgeok.com/DStats/2012/n ... pm-part-2/
I read those articles before and I think they confirm the general consensus. They are also on par with my findings.
I recently came up with RAPM ratings for the last full regular season (2010-11). Kevin Love is extremely overrated in my PTR and in PER. As you'd guess, pure point guards are underrated. There are also very good non-stat defenders that PTR, PER, or other box-score ratings miss. Here are the top 10 in RAPM, DRAPM and ORAPM for the 2010-11 regular season (minimum 1238 minutes):
Code:
Player RAPM
1. Nowitzki, Dirk 4.06 (Finals MVP)
2. Garnett, Kevin 3.87
3. Ginobili, Manu 3.73
4. Collison, Nick 3.33
5. Pierce, Paul 3.04
6. Bosh, Chris 2.93
7. Duncan, Tim 2.69
8. James, LeBron 2.63
9. Howard, Dwight 2.57
10.Chandler, Tyson 2.55
...
21.Rose, Derrick 1.91 (MVP)
Code:
Player DRAPM
1. Garnett, Kevin 2.78
2. Brewer, Ronnie 2.52
3. Arthur, Darrell 2.16
4. Duncan, Tim 1.98
5. Allen, Tony 1.96
6. Howard, Dwight 1.77 (DPOY)
7. Bass, Brandon 1.75
8. Pierce, Paul 1.75
9. Livingston, Shaun 1.73
10.Dooling, Keyon 1.69
Code:
Player ORAPM
1. Nowitzki, Dirk 2.49
2. Nash, Steve 2.40
3. Ginobili, Manu 2.40
4. Wade, Dwyane 2.33
5. Smith, J.R. 1.95
6. Collison, Nick 1.90
7. Bonner, Matt 1.85
8. Lawson, Ty 1.77
9. Davis, Baron 1.68
10.Bosh, Chris 1.65
Also, I have come to the conclusion that no box score or PBP model (advanced or not) can give better results than RAPM. In midseason I would use prior-informed RAPM (xRAPM). For multiple years, RAPM and age-weighted RAPM (are there any?). Towards the end of the season and after it, normal RAPM. Box-score metrics should only be used at the beginning of the season imo.
Re: vanilla RAPM
First question: What dataset did you use?
permaximum wrote: Also, I have come to the conclusion that no box score or PBP model (advanced or not) can give better results than RAPM.
How did you come to that conclusion?
In my experience, a blended version of a boxscore-based model and a RAPM model gives the best results in terms of prediction and explanation. When I blended my SPM with Jerry's previously published prior-informed RAPM results, I got the best result.
The blend of xRAPM and my SPM turned out to be worse and ended up with a clear bias towards bigger players, which was not seen in such a fashion for a 10-yr dataset of blended SPM + prior-informed RAPM.
permaximum wrote:Box-score metrics should only be used at the beginning of the season imo.
Boxscore-based models can be as good a predictor as prior-informed RAPM in terms of point differential. The bias towards offense in the boxscore and the lack of information about team, help and weakside defense make the differentiation between offense and defense rather difficult for individual players. There isn't much of a chance of crediting the correct player with defensive impact via the boxscore in more than 50% of the cases; mostly the rebounder, the blocker and the stealer will get the defensive credit. That makes a prediction of team defensive strength based on individual boxscore-based models basically a coin flip.
But overall you need the boxscore information, because that is the only way to determine production and efficiency for individual players. And you really need that information, because otherwise you will have a systematic error in the model regarding low- and high-usage players. You can't just increase the offensive load for an individual player and expect the efficiency to stay the same. But without the boxscore information you don't know who is doing what on the court. RAPM gives you an impact number for players in the situations they were actually used in. Even if you have some sort of position information added, picking the highest-RAPM players for each position may very well not result in the best overall team performance, because you don't know anything about balance and fit; RAPM does not help you differentiate players here and does not enable you to find a balanced and fitting lineup.
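As a rough illustration of what "prior informed" means in this context, the sketch below shrinks each player's coefficient toward a box-score estimate (an SPM-style number) instead of toward zero. It is a generic version of the idea, assuming the same hypothetical (X, y, w) layout as the earlier sketches, and is not the specific xRAPM or SPM-blend implementation discussed here.
Code:
import numpy as np
from sklearn.linear_model import Ridge

def prior_informed_rapm(X, y, w, prior, lam=2000.0):
    """prior: per-player box-score estimate (e.g. an SPM value) in design-matrix order.
    Fitting ridge on the residual y - X @ prior and adding the prior back is the same
    as penalizing ||beta - prior||^2, i.e. shrinking toward the prior instead of zero."""
    residual = y - X @ prior
    model = Ridge(alpha=lam, fit_intercept=False)
    model.fit(X, residual, sample_weight=w)
    return prior + model.coef_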
Re: vanilla RAPM
permaximum wrote: I can say I'm very satisfied with these results. At the end of each season, I will never look at any player metric but RAPM to evaluate player performance, vote for MVP, DPOY etc. However, in midseason, RAPM won't give very accurate results because of less data. There comes prediction, prior-informed RAPM, thus J.E.'s multiyear-RAPM-informed RAPM. He says xRAPM is better at that, so I take his word. Towards the end of the season, one-year uninformed RAPM should give more accurate results than xRAPM as far as seasonal player evaluation goes.
Also, I have come to the conclusion that no box score or PBP model (advanced or not) can give better results than RAPM. In midseason I would use prior-informed RAPM (xRAPM). For multiple years, RAPM and age-weighted RAPM (are there any?). Towards the end of the season and after it, normal RAPM. Box-score metrics should only be used at the beginning of the season imo.
If you run a retrodiction contest with any flavor of RAPM, ASPM, SPM, or any version of plus/minus stats, it would be straightforward to calculate the RMSE at the lineup level. That would help identify how good each of those is.
I think you might be surprised how long it takes RAPM to stabilize. Check out Alex's retrodiction contest for one way of looking at the issue: http://sportskeptic.wordpress.com/tag/aspm/
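A sketch of that lineup-level check, using the same hypothetical stint layout as the earlier sketches: predict each stint's margin as the sum of the five offensive ratings minus the five defensive ratings, then compute a possession-weighted RMSE against what actually happened.
Code:
import numpy as np

def lineup_rmse(stints, rating):
    """rating: dict mapping player id -> metric value (RAPM, ASPM, SPM, ...)."""
    preds, actual, weights = [], [], []
    for s in stints:
        pred = sum(rating.get(p, 0.0) for p in s["off_players"]) - \
               sum(rating.get(p, 0.0) for p in s["def_players"])
        preds.append(pred)
        actual.append(s["pts_per_100"])
        weights.append(s["possessions"])
    preds, actual, weights = map(np.asarray, (preds, actual, weights))
    return float(np.sqrt(np.average((actual - preds) ** 2, weights=weights)))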
Re: vanilla RAPM
mystic wrote:First question: What dataset did you use?
Dataset for what? If you mean the RAPM vs. box-score comparison, I compared seasonal xRAPM to PER and PTR for 1996/97 through 2012/13, and uninformed 1-year RAPM to PTR and PER for 2007/08 through 2011/12. I accept that I didn't compare RAPM to any type of SPM blend. In theory, though, an SPM-RAPM blend should have the potential to be better at prediction (supported by xRAPM, which incorporates box-score data and which I already pointed out is better at prediction). When it comes to explanation, I can't see any way it can surpass RAPM over the long term. I don't have any proof for that; it's just theory. Still, if you can come up with better ratings (according to whom, anyway?) than RAPM for explaining previous seasons' player performances, I'll gladly accept it.
I also agree with you that RAPM says nothing about efficiency and that only box scores can give us a rough idea about that. (In fact, I generally agree with your points except on explaining player performance.) With RAPM we assume every player has been used by their coaches at their maximum efficiency and performance/usage. This advantage of the box score won't translate into better explanation of player performance, because we judge a player's value in a given year regardless of how he was used by his coaches.
In the end, except for the explanation of player performances within a season, I agree with everything you say.
@DSMok1
Nice suggestion, but I think J.E. has probably done it already, and I assume nothing beats xRAPM at prediction at the moment, even though xRAPM's box-score weights are off because of the inclusion of very similar stats, unnecessary stats, and especially the separation of box-score weights into defense and offense (I talked a bit about why those weights are wrong in my thread, and mystic mentioned the defense-offense issue in his previous post too). Which means I agree with you that box-score involvement makes things better for prediction. BTW, I really like your site. I'm a webmaster myself (though my sites aren't related to any sports), and I think you should get involved with SEO to draw higher traffic and reach casual audiences. Your site would be very helpful for informing people.
Stabilization of RAPM! Just what I needed. I think I'm gonna thank you for the 10th time or more.

Edit: The effort the author of that article has put into this subject is great. Unfortunately, although his technical knowledge is beyond mine (I only recently got into these things), I see a lot of problems in the article. In short, you can't test the explanatory and predictive power of those metrics that way.
Re: vanilla RAPM
J.E.,
I noticed you've had these down for some time now. Do you have plans to put them back up at some point?
James