Joe Sill's "Improved NBA Adjusted +/- "

l_davies93
Posts: 7
Joined: Sat Aug 02, 2014 7:28 am

Joe Sill's "Improved NBA Adjusted +/- "

Post by l_davies93 »

Hey guys, I was wondering if anybody had Sill's "Improved NBA Adjusted +/- Using Regularization and Out-of-Sample Testing" as it's unavailable online. Any help is much appreciated.

Also, there's a cool new paper out here (http://arxiv.org/pdf/1408.0777v1.pdf) which gives some great insight into how SportVU can be used. I only started reading it, but I'm 15 pages in and it's nice to see some stochastic modelling as opposed to non-stop regression analysis. I could perhaps make a one-page summary, essentially an extended abstract, and post it here once I've finished if people wanted (I only just joined this forum so I'm not 100% sure what kind of things I should post).
Crow
Posts: 10536
Joined: Thu Apr 14, 2011 11:10 pm

Re: Joe Sill's "Improved NBA Adjusted +/- "

Post by Crow »

Hey. A summary would be fine. Wasn't this presented at Sloan Conference? I believe there was some earlier discussion about it here; but I haven't looked it up.

In general, post what interests you. More is usually better to me. Worst case it will just sit there quietly.

I have / had Joe's published data but it is sitting on a hard drive of a damaged or dead computer and not currently available. There is a site on the internet that collected some past APM data. It is probably mentioned in the links section. Maybe it is there. Or maybe someone else can help.
l_davies93
Posts: 7
Joined: Sat Aug 02, 2014 7:28 am

Re: Joe Sill's "Improved NBA Adjusted +/- "

Post by l_davies93 »

Crow wrote:Hey. A summary would be cool. Wasn't this presented at Sloan Conference? I believe there was some earlier discussion about it here; but I haven't looked it up.

Oh I didn't realise. The paper itself was released less than a week ago, but they may have had some presentation/report on their initial findings which I didn't look up.

Crow wrote:There is a site on the internet that collected some past apm data. It is probably mentioned in link section. Maybe it is there. Or maybe someone else can help.
Thanks, but I was actually looking for the paper itself (sorry I should have made that clear) so I can see the exact methodology he used. I know he used ridge regression and I'm guessing by the "Out-of-Sample Testing" that some form of cross-validation was used to select lambda (I'm guessing k-fold), but that's all I know.
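
For anyone following along, here is a rough sketch of what "ridge regression with lambda chosen by k-fold cross-validation" could look like. It is not Sill's actual procedure (I don't have the paper), and everything in it is an assumption for illustration: the stint data are synthetic, and the names `ridge_fit`, `kfold_cv_error` and `make_synthetic_stints` are made up.

```python
import numpy as np


def ridge_fit(X, y, lam):
    """Closed-form ridge solution: solve (X'X + lam*I) beta = X'y."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)


def kfold_cv_error(X, y, lam, k=5, seed=0):
    """Mean squared out-of-sample error of ridge at one lambda, averaged over k folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    errors = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        beta = ridge_fit(X[train], y[train], lam)
        errors.append(np.mean((y[test] - X[test] @ beta) ** 2))
    return float(np.mean(errors))


def make_synthetic_stints(n_stints=2000, n_players=50, seed=1):
    """Fake plus/minus data: +1 for five 'home' players, -1 for five 'away' players."""
    rng = np.random.default_rng(seed)
    true_rating = rng.normal(0.0, 2.0, n_players)
    X = np.zeros((n_stints, n_players))
    for row in X:  # each row is a view, so assignments write into X
        on_court = rng.choice(n_players, size=10, replace=False)
        row[on_court[:5]] = 1.0
        row[on_court[5:]] = -1.0
    # Noisy stint margin; the noise level is an arbitrary choice
    y = X @ true_rating + rng.normal(0.0, 12.0, n_stints)
    return X, y, true_rating


X, y, true_rating = make_synthetic_stints()
lambdas = [0.1, 1.0, 10.0, 100.0, 1000.0]  # hypothetical candidate grid
cv_errors = {lam: kfold_cv_error(X, y, lam) for lam in lambdas}
best_lam = min(cv_errors, key=cv_errors.get)
ratings = ridge_fit(X, y, best_lam)  # refit on all data at the chosen lambda
print("chosen lambda:", best_lam)
```

The point of the out-of-sample step is that lambda is picked by how well the ratings predict held-out stints, not by how well they fit the stints they were estimated on.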

Thanks for the reply, buddy.
Crow
Posts: 10536
Joined: Thu Apr 14, 2011 11:10 pm

Re: Joe Sill's "Improved NBA Adjusted +/- "

Post by Crow »

I had the paper too but it is a moot point until I try to get the computer fixed (probably not soon).

Here is the thread I mentioned:

viewtopic.php?f=2&t=8473
knarsu3
Posts: 116
Joined: Thu Apr 14, 2011 11:25 pm

Re: Joe Sill's "Improved NBA Adjusted +/- "

Post by knarsu3 »

I have the paper. Send me a PM with your email and I can send it.
l_davies93
Posts: 7
Joined: Sat Aug 02, 2014 7:28 am

Re: Joe Sill's "Improved NBA Adjusted +/- "

Post by l_davies93 »

Crow wrote:I had the paper too but it is a moot point until I try to get the computer fixed (probably not soon).

Here is the thread I mentioned:

viewtopic.php?f=2&t=8473
Oh cool, I never saw this. Thanks for the link!
mtamada
Posts: 163
Joined: Thu Apr 14, 2011 11:35 pm

Re: Joe Sill's "Improved NBA Adjusted +/- "

Post by mtamada »

Thanks for the link to the paper. Lots of good stuff in there. What continues to be not so good is their attempt to evaluate individual players.

They're doing exciting work, literally tracking and modeling all ten players and their locations and velocity, measured with respect to the location of the basket and the ball, and calculating their EPV from all that. There's some really cool and complex stuff there.

But when it comes to calculating individual players' contributions, i.e. their EPV-Added (EPVA) statistic, if I'm understanding Appendix B they're still doing what they did at the Sloan conference: attributing all of the change in EPV to the player with the ball. The contributions, good and bad, of the other players such as setting picks, being in position for the kick-out for the 3-pointer, etc. are ignored in the EPVA calculation.
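
To make the objection concrete, here is a toy version of the "give everything to the ballhandler" rule as I described it above. It is not the formula from Appendix B; the function name `credit_ballhandler_only`, the generic player labels, and the EPV numbers are all invented for illustration.

```python
from collections import defaultdict


def credit_ballhandler_only(epv, ballhandler):
    """Credit each step's change in EPV to whoever holds the ball at the start of that step.

    epv:         EPV values at successive moments of one possession
    ballhandler: ballhandler[t] is the player holding the ball at moment t
    """
    if len(epv) != len(ballhandler):
        raise ValueError("epv and ballhandler must have the same length")
    credit = defaultdict(float)
    for t in range(len(epv) - 1):
        credit[ballhandler[t]] += epv[t + 1] - epv[t]
    return dict(credit)


# Made-up possession: the guard passes to the forward, who scores two points.
# A third player (call him the screener) sets the pick that frees the forward,
# but he never touches the ball, so he never appears in `ballhandler`.
epv = [1.00, 1.05, 1.20, 1.10, 2.00]
ballhandler = ["guard", "guard", "forward", "forward", "forward"]

credit = credit_ballhandler_only(epv, ballhandler)
for player, value in credit.items():
    print(player, round(value, 2))
```

All of the possession's change in EPV gets split between the two players who held the ball; the screener gets no entry at all, which is exactly the gap in the allocation.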

And EPVA still hates Ricky Rubio, rating him the worst player in EPVA; and it still loves Nowitzki, still rating him as the top or second from the top.

But in this latest article, Chris Paul is no longer #1; indeed, he's no longer in the top 10. And Kevin Love, who was rated as 2nd worst in EPVA at the Sloan presentation, is now rated as having the 5th highest EPVA! LeBron James now cracks the EPVA top ten, but just barely: he's #10.

So the EPVA ratings look more plausible than before but still look not ready for prime time. EPVA lacks a plausible mechanism for allocating credit and blame among the teammates; instead it simplistically attributes everything to the ballhandler.

But the rest of the EPV stuff looks like a solid start at analyzing this stuff. There are still many more steps to do, e.g. using the video data to analyze rebounding, but it looks like good stuff.


From the article, here's their new list of the top ten and bottom ten in EPVA:

Top ten:

Player             EPVA
Dirk Nowitzki      6.08
Kevin Durant       6.08
Jose Calderon      5.33
Damian Lillard     5.28
Kevin Love         5.13
Stephen Curry      4.63
Channing Frye      4.58
Kyle Lowry         4.50
Paul George        4.40
LeBron James       4.38

Bottom ten:

Player             EPVA
Ricky Rubio       -0.07
Luke Ridnour       0.18
Tayshaun Prince    0.26
Shaun Livingston   0.38
Beno Udrih         0.47
P.J. Tucker        0.55
Al-Farouq Aminu    0.59
Andre Miller       0.68
Gerald Henderson   0.71
Cody Zeller        0.71
l_davies93
Posts: 7
Joined: Sat Aug 02, 2014 7:28 am

Re: Joe Sill's "Improved NBA Adjusted +/- "

Post by l_davies93 »

mtamada wrote: But when it comes to calculating individual players' contributions, i.e. their EPV-Added (EPVA) statistic, if I'm understanding Appendix B they're still doing what they did at the Sloan conference: attributing all of the change in EPV to the player with the ball. The contributions, good and bad, of the other players such as setting picks, being in position for the kick-out for the 3-pointer, etc. are ignored in the EPVA calculation.

And EPVA still hates Ricky Rubio, rating him the worst player in EPVA; and it still loves Nowitzki, still rating him as the top or second from the top.

But in this latest article, Chris Paul is no longer #1; indeed, he's no longer in the top 10. And Kevin Love, who was rated as 2nd worst in EPVA at the Sloan presentation, is now rated as having the 5th highest EPVA! LeBron James now cracks the EPVA top ten, but just barely: he's #10.

So the EPVA ratings look more plausible than before but still look not ready for prime time. EPVA lacks a plausible mechanism for allocating credit and blame among the teammates; instead it simplistically attributes everything to the ballhandler.




I think the EPVA isn't too important personally. I was more concerned with some of the definitions in the stochastic modelling. There are inconsistencies in their definition of the stopping time, which they don't acknowledge, and I'm not sure modelling EPV as a martingale is suitable (don't players and coaches pick up on the tendencies of their opponents as the game goes on?). Still, I think they did a really good job of modelling all of this data in a simple but effective manner. I just found the idea of combining the coarsened data with the full-resolution data via Monte Carlo really interesting and innovative. I guess I just really enjoyed reading this paper after having read a lot of others which were simply some form of basic regression analysis with some suspicious assumptions haha.
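
On the martingale point, it may help to write down what the property actually requires. This is a sketch under an assumption: that EPV at time t is defined as the conditional expectation of the possession's final point value given the history so far. The symbols X (points scored on the possession), F_t (history up to time t) and nu_t (EPV at time t) are my notation for this post, not necessarily the paper's.

```latex
\nu_t = \mathbb{E}\left[ X \mid \mathcal{F}_t \right]

% For any earlier time s < t, the tower property of conditional expectation gives
\mathbb{E}\left[ \nu_t \mid \mathcal{F}_s \right]
  = \mathbb{E}\left[ \mathbb{E}\left[ X \mid \mathcal{F}_t \right] \mid \mathcal{F}_s \right]
  = \mathbb{E}\left[ X \mid \mathcal{F}_s \right]
  = \nu_s
```

If EPV is defined that way, the martingale property holds by construction under whatever probability model is used. So the worry about teams adapting during a game is probably better read as a worry about whether the fitted model for those conditional expectations is right (for instance, whether tendencies are assumed fixed over the game), not about the martingale property as such.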

They do acknowledge the limits of EPVA and I really don't get the impression they take it too seriously. They do say that they could evolve this to take into account a player's off ball activities (such as ball screening), but I really don't think that was their concern here. I'm sure they'll develop something in the future though now that they have developed a model which they are happy with.

Looking FAR into the future, player evaluations don't particularly interest me as I think there can be far more impressive applications to this, but that's just me.
mtamada
Posts: 163
Joined: Thu Apr 14, 2011 11:35 pm

Re: Joe Sill's "Improved NBA Adjusted +/- "

Post by mtamada »

I largely agree that EPVA is a sideshow -- they even relegated it to an Appendix. The main contribution of this paper is to publicly demonstrate a way of harnessing the enormously complex data covering location, velocity, and time and figuring out how to make it useable and how to analyze it.

But in the long run, individual players' performances and abilities are still a critical research area. We've had good descriptive statistics for decades which describe how teams perform and how that affects the outcome of a game. The exceedingly difficult task, the Holy Grail, has been to figure out how individual players, with their various skills and talents, combine their individual characteristics to create the team outcome (and vice versa, how can we take the observed outcomes such as shots made and rebounds grabbed, and infer what each individual player's contributions and likely talent level are).

E.g., looking at the diagrams in the paper, with the players in specific locations and the ball in one player's hand, they can calculate the EPV for that situation. But what happens next is largely dependent on the skills and decision-making abilities of the players: does the ballhandler have a Gary Payton-ish ability to knife past two defenders and still maintain his dribble? Does he have a Nash-ian ability to threaten to drive and then make the EPV-boosting pass that many other players wouldn't even see much less be able to complete?

The record of the game will show that something happened in the next few seconds, and the EPV changed. But what did the players do to cause that change, and more importantly, what could they have done? Moses Malone won't raise EPV by making that Nash-ian pass, but he could raise his team's EPV by bulling into position in the low post and, once he got the ball, converting the possession into either a field goal or FT attempts (or a missed FG, but being Moses Malone he was likely to simply grab the offensive rebound and do it again). Or, if nothing else, he could draw double-team attention, leaving someone like Andrew Toney open.

What's really exciting about these locational data is that they give researchers the ability to analyze those types of questions. A team of five Moses Malones would not be able to do much to raise EPV. But put Moses out there with Cheeks to distribute the ball, Toney to threaten the outside shot, and Erving to threaten to drive and dunk, and Moses is likely helping to raise EPV by MVP-level quantities.

The research by Cervone, Goldsberry, et al. may eventually allow us to do that sort of analysis: which players could raise EPV, and in what sorts of ways? The resulting measures of player quality are unlikely to be EPVA, hence you are correct that EPVA is a sideshow. But I believe that one of the most valuable outcomes of this research will be the ability to analyze players' abilities and productivity in raising EPV, and how to harness their talents to further raise the team's EPV.

So even if the value of this research is not literally EPVA, it's something with a similar flavor: looking at individual players -- but in a sophisticated way in the context of the team -- and what those players can do to raise EPV.