Sharing code for RAPM-style player rankings
To begin with, apologies if I am stepping on toes here, not sure what the etiquette is on stuff like this.
I have noticed a lot of one-off blogs or threads here and there talking about methodologies for things like regularized adjusted plus minus. What I haven't seen much of (or any of) is sharing raw data or more importantly sharing CODE. I am coming from an academic background so this is what I am more used to, and I was assuming that the community would have this more open-source type of flavor to it.
So one question...am I wrong/have I not looked hard enough? I ask earnestly, as player evaluation is very interesting to me.
And one contribution: I have put some rankings together. They are not original. They are current-season regularized adjusted plus-minus rankings with three different penalty types (lasso, ridge, elastic net with alpha = 0.5) and including 1-, 2-, or 3-man combinations.
http://wclark3.github.io/basketball/
The innovation as far as I see it is that I am freely sharing my data and my code (in the data/methodology links at the top). If others are interested feel free to look at what I've written and change it/critique it/point out some silly errors I'm making/use it for whatever you like. As far as I can tell this hasn't been done and I think it would be fun to open the community up in this way.
Best,
Will
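For readers comparing the three penalty types, here is a minimal sketch of how a single mixing parameter covers all of them. It assumes the glmnet-style parametrization, where alpha = 1 is the lasso and alpha = 0 is ridge. The post does not say which convention the linked code uses, and the function name and example coefficients are made up for illustration.

```python
def elastic_net_penalty(coefs, lam, alpha):
    """Elastic-net penalty in the (assumed) glmnet-style form:

        lam * (alpha * sum|b| + (1 - alpha) / 2 * sum b^2)

    alpha = 1 is the lasso, alpha = 0 is ridge, alpha = 0.5 is an even mix.
    """
    l1 = sum(abs(b) for b in coefs)
    l2_squared = sum(b * b for b in coefs)
    return lam * (alpha * l1 + (1.0 - alpha) / 2.0 * l2_squared)


# Hypothetical coefficient vector, used only to compare the penalties.
example_coefs = [2.0, -1.0, 0.0]

lasso_penalty = elastic_net_penalty(example_coefs, lam=1.0, alpha=1.0)
ridge_penalty = elastic_net_penalty(example_coefs, lam=1.0, alpha=0.0)
enet_penalty = elastic_net_penalty(example_coefs, lam=1.0, alpha=0.5)

print(lasso_penalty, ridge_penalty, enet_penalty)
```

The penalty is added to the model's loss, and the strength lam is what gets tuned against out-of-sample error.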
Re: Sharing code for RAPM-style player rankings
Spectacular! Open source is very much welcome here; everything I have developed is public.
This is a wonderful contribution, and I'll have more comments once I look at everything further. I've always wondered what elastic net would look like!
Re: Sharing code for RAPM-style player rankings
That is what I call some top notch contributing. Great job.
BTW, unless I'm confused, the 3rd name is cut off (not showing) on many 3 player lineups.
Re: Sharing code for RAPM-style player rankings
I think you probably could say Eli Witus shared "code" with his step by step guide to how to compute APM. Dan Rosenbaum shared his overall method. Tidbits have been shared over time about elements of RAPM and RPM methodology.
What are the correlations between the three methods at each level? Would you have interest in offering an average result or a blend of equal or unequal weights? Do you have any philosophical or other comments on the wisdom of blending? Are you planning on preparing previous-season RAPM? Which method or blend does better at explaining team results in a season? Which serves better for predictions of future seasons?
Do you plan to RAPM at factor level or any of the many dozens of different kinds of splits? Multi-season runs?
Be the first to do RAPM for games at college level between top 40 of 70 teams?
Thanks for this start and for whatever you decide to take on and share.
Re: Sharing code for RAPM-style player rankings
Also, Evan Zamir showed how to calculate RAPM with Python and Spark here: http://nyloncalculus.com/2015/07/29/gue ... -a-how-to/
Re: Sharing code for RAPM-style player rankings
Quote: Also, Evan Zamir showed how to calculate RAPM with Python and Spark here: http://nyloncalculus.com/2015/07/29/gue ... -a-how-to/
Thanks for the link, I think that's very cool stuff. Hadn't seen that before but it's exactly what I was looking for and hoping to provide.
Quote: BTW, unless I'm confused, the 3rd name is cut off (not showing) on many 3 player lineups.
What I did there is include all one, two, and three player lineups, and the results are that individual and two-player combos are a lot more important. Kind of an interesting result -- the two-player synergies tend to dominate the performance effects. From a team-building point of view I think this is a useful result, that getting a powerful one-two punch may be more important than getting a powerful big three.
Quote: I think you probably could say Eli Witus shared "code" with his step by step guide to how to compute APM. Dan Rosenbaum shared his overall method. Tidbits have been shared over time about elements of RAPM and RPM methodology.
Fair point, and I had seen some of those references you mentioned which were extremely helpful for getting me started. I didn't mean to say that nobody had ever shared before, just that it hadn't been shared in the way that I was (naively) expecting to see.
Quote: What are the correlations between the three methods at each level? Would you have interest in offering an average result or a blend of equal or unequal weights?
I will check on the correlations, that's a good question. One thing I did notice quickly was that the mean out-of-sample errors were extremely similar across all models, which surprised me initially. I think it makes sense given that for all models we are searching for minimum OOS error over a range of penalty values, and that the range is close enough in all cases that the optimal models end up being very similar in accuracy. So the real model selection issue is more about a "philosophy" of basketball -- i.e. if you believe in a lasso penalty then you believe that team performance is extremely driven by super-duper stars (or guys who are really terrible); if you believe in a ridge penalty you believe that role players are actually very important. In this regard I am partial to the ridge model, which includes more players, but I can see it both ways.
Quote: Do you have any philosophical or other comments on the wisdom of blending?
I'm torn on this. If it turns out to give much better predictive accuracy then it is useful in some sense (but I am a little skeptical although I will look into this). But I think the farther we move into black box territory the less useful the model becomes as a descriptive tool.
Quote: Do you plan to RAPM at factor level or any of the many dozens of different kinds of splits? Multi-season runs?
I think the multi-season runs are super interesting and I would like to see how rankings evolve over time. Particularly how they are influenced by model assumptions (including Bayesian priors which I haven't touched yet).
Quote: Be the first to do RAPM for games at college level between top 40 of 70 teams?
This is an interesting idea although I worry about sparsity of the input matrix. I would assume that many of the top teams never play each other. The "best" case scenario is that two top teams share a conference but they still play each other only once or twice at the most. Would have to write another scraper for this anyway which I have a limited appetite for.
Anyway my other "big" idea was to use the two- and three-player results to do some team-building analysis. So to take an example Dirk and Wes Matthews are the #7 ranked unit in the 2-player lasso. So if you're the Mavs, you might ask what this can tell you about how to build a team around Dirk. There are two ways you can do this. The first is to find players really similar to Wes Matthews and assume they would also pair well with Dirk. Or you could find players similar to Dirk and see who they pair well with. It ends up being kind of like what happens if you log into Netflix and see some recommended films. "Oh, you liked playing with Wes Matthews, how about you play with..." etc.
The obvious nut to crack is judging similarity, which I suspect there has been a lot of work done on already. One idea I had was using the "topography" of shot charts, so two players with nearly identical shot charts would be similar because they tend to occupy the same parts of the floor. Then you could do a search for similar players and find other potential productive pairs (or triples) that way. More of a long-term idea but would be fun to do.
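One rough way to picture the shot-chart similarity idea: reduce each player's chart to the share of attempts taken from a handful of floor zones, then compare those vectors with cosine similarity. This is only a sketch under assumptions. The zone breakdown, the player labels, and every number below are invented for illustration, and cosine similarity is just one possible choice of metric.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same shape)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical share of attempts by zone:
# [rim, paint, midrange, corner 3, above-the-break 3]
shot_profiles = {
    "Player A": [0.10, 0.10, 0.45, 0.05, 0.30],
    "Player B": [0.12, 0.08, 0.40, 0.05, 0.35],
    "Player C": [0.60, 0.25, 0.10, 0.00, 0.05],
}


def most_similar(target, profiles):
    """Rank every other player by similarity to the target, best match first."""
    scored = [
        (cosine_similarity(profiles[target], vec), name)
        for name, vec in profiles.items()
        if name != target
    ]
    return sorted(scored, reverse=True)


ranked = most_similar("Player A", shot_profiles)
for score, name in ranked:
    print(f"{name}: {score:.3f}")
```

The "recommendation" step would then be to look up which partners the closest matches pair well with in the two-player results.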
Sorry for being long-winded and thanks to all for the comments, fun to be part of a community that enjoys this stuff.
Will
Re: Sharing code for RAPM-style player rankings
By the way, if you want matchup data files for previous seasons, PM me. Jerry Engelmann sent me everything back to 2001 (I think it was).
Re: Sharing code for RAPM-style player rankings
Awesome! Question: Why only 22 players in the 1-player-Lasso, for example?
Re: Sharing code for RAPM-style player rankings
bbstats wrote: Awesome! Question: Why only 22 players in the 1-player-Lasso, for example?
That's what LASSO does: it maximizes out of sample performance with a minimum number of variables. It's a variable selection algorithm besides regularization: http://www.mathworks.com/help/stats/las ... hworks.com
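To see why the lasso drops players entirely while ridge keeps everyone, it helps to look at the textbook special case of orthonormal predictors. There the lasso estimate is the unpenalized estimate passed through a soft threshold, and the ridge estimate is the unpenalized estimate scaled down by a constant. A real RAPM design matrix is nowhere near orthonormal, so this sketch only illustrates the mechanism, and the input values are hypothetical.

```python
def soft_threshold(z, lam):
    """Lasso solution for one coefficient under an orthonormal design."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0  # small effects are set exactly to zero


def ridge_shrink(z, lam):
    """Ridge solution for one coefficient under an orthonormal design."""
    return z / (1.0 + lam)


# Hypothetical unpenalized estimates for four players.
unpenalized = [3.0, 0.4, -0.2, -1.5]
lam = 0.5

lasso_estimates = [soft_threshold(z, lam) for z in unpenalized]
ridge_estimates = [ridge_shrink(z, lam) for z in unpenalized]

print("lasso:", lasso_estimates)
print("ridge:", ridge_estimates)
```

With these inputs the two middle players fall inside the threshold and vanish from the lasso list, while ridge shrinks all four but leaves none at exactly zero.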
Re: Sharing code for RAPM-style player rankings
Just did a bit of a deeper dive into some of Crow's questions about model correlation: http://wclark3.github.io/2016/02/20/bba ... -comp.html
Also added a note about model accuracy which I found really interesting (and unexpected).
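For anyone who wants to run the same kind of comparison on their own numbers, here is a minimal sketch of pairwise Pearson correlations between models. The ratings are hypothetical placeholders, not values from the linked analysis, and players missing from a model (as happens with the lasso) would need to be handled before correlating.

```python
import math
from itertools import combinations


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical ratings for the same five players under each penalty.
ratings = {
    "lasso": [5.0, 3.0, 0.0, 0.0, -2.0],
    "enet": [5.5, 3.5, 0.5, 0.0, -2.5],
    "ridge": [4.0, 3.0, 1.5, -0.5, -1.0],
}

pairwise = {
    (a, b): pearson(ratings[a], ratings[b])
    for a, b in combinations(ratings, 2)
}

for (a, b), r in pairwise.items():
    print(f"{a} vs {b}: {r:.3f}")
```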
The models are all about as accurate as each other when it comes to out-of-sample predictive ability. This tells me that the models aren’t really that good at predicting. If you take the raw estimated coefficients (which are on the scale of log odds) and convert them to probabilities, the models are all saying that in most cases, most players are pretty darn close to average (this is why it’s helpful to convert to the scale of point differential per 100 possessions, so we can see how small differences accumulate over time in a meaningful way).
Anyway, the “predictions” in this context are really saying that the plus-minus that we would expect to see when a given player is playing is basically zero. So all the models are about equally bad at “predicting” performance. I think they are still very useful as a source of ordinal rankings, particularly at seeing what pairs/groups of three are particularly valuable. Plus, as I mentioned above, basketball is a sport where small differences in performance can add up in meaningful ways even over the course of one game.
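The log-odds point above is easy to check by hand. The sketch below applies the standard logistic function to a few hypothetical coefficient sizes. It shows only the log-odds-to-probability step. The conversion to point differential per 100 possessions is not described in enough detail here to reproduce, so it is left out.

```python
import math


def log_odds_to_probability(log_odds):
    """Standard logistic (inverse logit) transform."""
    return 1.0 / (1.0 + math.exp(-log_odds))


# Hypothetical coefficient sizes on the log-odds scale.
for coef in (0.0, 0.05, -0.05, 0.5):
    prob = log_odds_to_probability(coef)
    print(f"log odds {coef:+.2f} -> probability {prob:.4f}")
```

A coefficient of zero maps to exactly 0.5, and small coefficients move the probability only slightly away from it, which is the "most players are close to average" reading.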
Re: Sharing code for RAPM-style player rankings
Thanks for pursuing the correlation question rigorously thru two stages.
permaximum
Re: Sharing code for RAPM-style player rankings
This is how RAPM stuff should be done. Thanks for that. Sharing the code and the data must be a requirement for sensitive calculations like RAPM.
I quickly checked your data but I can't see possession info for lineups.
As for prediction accuracy, yes it's bad. Ridge should be a bit better than lasso and elastic net. I believe that because public box-score metrics are even worse at predicting next year's team wins, people rely on RAPM-based stuff too much. Metrics show promise but they still don't come close to eye valuation. However, they have their uses, such as giving a rough idea about players you haven't been able to follow.
Re: Sharing code for RAPM-style player rankings
For 6 top players (Curry, Draymond, Paul, LBJ, KD, and Westbrook) the average deviation across RPM and enet, lasso & ridge here is almost 6 pts per 100 possessions for Curry, 1.5 to 3.25 for the rest and an average of 2.8. Enet and lasso for these 6 players have a correlation of almost .99, but they are pretty different from the other two methods especially on Curry and Green. RPM and ridge just .66, lower than I expected and a concern to me.
RPM and Ridge disagree the most on Russell Westbrook. (The deviation is greater on Kawhi but because he is not in the lasso set I left him out of this comparison. Same with Duncan. Lasso don't think as highly of Spurs.) Of the 24 estimates made, lasso gives LBJ, KD and Russ the lowest 3, all below 4.5. Russ gets the lowest single mark at 3.1 on lasso, KD the lowest average by a hair over Westbrook. In % terms, Westbrook's 4 estimates have the widest spread. He is the only one of the original 6 with 2 estimates below 5. (Kawhi does too.) KD the only other in original 6 with an estimate below 4. Enet estimates D Jordan below 3, Redick at 4.4 though. Klay Thompson is barely above 1 on RPM, below 2 on lasso but enet and ridge think he is up there with the original 6. His % spread is twice as wide as Westbrook's.
These adjusted plus minus estimates run counter to the box score and eye test judgments that Russ and KD are the best duo. Curry and Green on simple average of these 4 methods are estimated to be 2.5 times as impactful. KD & Russ basically similar to other top team duos.
Between RPM, ridge and some blend of the two, which more closely correlates in minutes weighted rollups for team vs. actual MOV (or SRS)?
It appears for RPM team rollups about half are within 0-1pts of SRS and the other half are generally 2-3 pts off. Is that a lot? It is more than I hoped. I wonder if it could be reduced while maintaining integrity.
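A minutes-weighted rollup of the kind asked about above can be sketched as follows. It assumes player ratings are per 100 possessions and that five players share the floor, so the team estimate is the sum of each rating weighted by that player's share of one lineup slot. The roster, the ratings, and the SRS value are all hypothetical, and the sketch ignores pace (per-100 ratings versus per-game margin).

```python
def team_rollup(players):
    """Minutes-weighted team estimate from (rating_per_100, minutes) pairs.

    Five players are on the floor at once, so one lineup slot accounts for
    one fifth of the total player-minutes.
    """
    total_minutes = sum(minutes for _, minutes in players)
    if total_minutes == 0:
        return 0.0
    slot_minutes = total_minutes / 5.0
    return sum(rating * minutes for rating, minutes in players) / slot_minutes


# Hypothetical roster: (rating per 100 possessions, minutes played).
roster = [
    (6.0, 2000), (3.0, 1800), (1.0, 1500), (0.0, 1500),
    (-1.0, 1200), (-2.0, 1000), (-3.0, 1000),
]
hypothetical_srs = 4.0

estimate = team_rollup(roster)
gap = estimate - hypothetical_srs
print(f"rollup {estimate:.2f}, SRS {hypothetical_srs:.2f}, gap {gap:+.2f}")
```

Repeating this for every team and each metric (RPM, ridge, or a blend) would give a gap per team that could be compared directly.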
Re: Sharing code for RAPM-style player rankings
A couple quick replies.
Quote: Lasso don't think as highly of Spurs
All the models think not-so-well of individual Spurs but tend to think highly of the Spurs overall. One thing I have been doing (silently here) is including team effects as controls in all regressions. I am letting them be unpenalized, i.e. forced into the model. So we can pull out the team coefficients and convert onto the point diff per 100 possessions metric and here the Spurs come out way on top. Every model gives the Spurs over 20 points/100 possessions of credit when the next highest team coefficient is 11. I could be convinced that this doesn't pass the eye test, but then again people talk about Spurs basketball like it is some separate plane of existence so maybe we are just picking up on that.
(A picture of the heatmap is here: https://raw.githubusercontent.com/wclar ... f_heat.png)
I would also say that the two player models are not picking up the best duos but rather the most synergistic duos. So to take the KD/Russ example, the 1-player model captures their individual effects and whatever benefit they get from playing with each other is thrown into the residual (or somewhere else in the model unintentionally). The two player model is supposed to measure the interaction between them, so that there is a KD term, a Russ term, and a KD*Russ term that captures something like the extra performance that they can coax out of each other (the whole is greater than the sum of its parts argument here).
Although a quick and dirty estimate of player pairs only (ignoring individual player coefficients) is not kind to these guys either. They only make the top 10 in the lasso/elastic net models but not the ridge. So maybe this is a point against the models. But then again maybe there is more going on than we can observe in the box score.
Anyway thanks for your comments. I'll have to unpack the rest later.
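The KD term, Russ term, and KD*Russ term described above can be pictured as indicator columns generated from each lineup. The sketch below builds the set of active columns for one five-man unit. The naming scheme and the placeholder teammates are assumptions for illustration, and it leaves out details the thread does not specify, such as how opposing lineups are signed.

```python
from itertools import combinations


def lineup_features(lineup, max_size=2):
    """Names of the indicator columns switched on by one lineup.

    max_size=1 gives individual player terms only, 2 adds every pair
    (the interaction terms), and 3 adds every trio.
    """
    features = set()
    for size in range(1, max_size + 1):
        for combo in combinations(sorted(lineup), size):
            features.add("*".join(combo))
    return features


# Hypothetical five-man unit; the three teammates are placeholders.
lineup = ["KD", "Russ", "Teammate C", "Teammate D", "Teammate E"]

pair_model = lineup_features(lineup, max_size=2)
trio_model = lineup_features(lineup, max_size=3)

print(len(pair_model), "columns in the two-player model")
print(len(trio_model), "columns in the three-player model")
print("KD*Russ" in pair_model)
```

Each extra level adds many sparse columns, which is part of why a penalty is needed to fit these models at all.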
Re: Sharing code for RAPM-style player rankings
If you make an equal blend of RPM, Player Tracking Plus Minus and ridge, Curry is #1, Green #2, LBJ #3 just barely ahead of Kawhi at #4, then Russ, Paul and KD in a tight pack. If you add lasso and enet, Paul rises to 4th and Kawhi falls out of it, but otherwise it is the same.