Logistic Regression with new SportVu data

Home for all your discussion of basketball statistical analysis.
italia13calcio
Posts: 100
Joined: Sun Dec 08, 2013 2:54 am

Logistic Regression with new SportVu data

Post by italia13calcio »

Hey everyone,

So I'm sure most of you have seen or played around with the new SportVu data on nba.com, more specifically the new shot logs for each player, which contain specific details about their shots. Partially inspired by J.E.'s logistic RAPM blend, I was fiddling around with the data and came up with several things that I thought it could be used for. For example, by taking certain subsets of the data (like restricting it to shots within 5 ft of the basket in which there was also an opponent within 5 ft of the basket), I could get something fairly similar to the rim protection stats that are also on nba.com. I also restricted the data set to shots taken by players who had attempted more than 82 of these shots and defended by players who had defended more than 82 of these shots (1 per game), just to get rid of players with small sample sizes. From there, I ran a logistic regression (on whether the shot went in) with two factors: one being the player who shot and the other being the opposing player who was closest to the shooter. I have included the results of this regression below.
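For anyone who wants to try the same kind of setup, here is a minimal, dependency-free sketch of the steps described above: filter to short, closely defended shots, drop low-volume players, then fit a logistic regression with one effect per shooter and one per closest defender. The field names (shot_dist, def_dist, shooter, defender, made) and the function names are hypothetical and are not taken from the actual shot logs or from the original analysis, and the plain gradient-ascent fit is just a stand-in for whatever stats package was really used.

```python
import math
from collections import Counter

# Hypothetical field names; the real shot-log columns may differ.
# shot_dist: shot distance in feet, def_dist: closest-defender distance in feet,
# shooter / defender: player names, made: 1 if the shot went in, else 0.

def filter_shots(shots, max_dist=5.0, min_count=82):
    """Keep short, closely defended shots involving high-volume players."""
    close = [s for s in shots
             if s["shot_dist"] <= max_dist and s["def_dist"] <= max_dist]
    taken = Counter(s["shooter"] for s in close)
    defended = Counter(s["defender"] for s in close)
    return [s for s in close
            if taken[s["shooter"]] > min_count
            and defended[s["defender"]] > min_count]

def fit_logit(shots, lr=0.5, epochs=500):
    """Fit P(make) = sigmoid(b0 + off[shooter] + dfn[defender]).

    Plain gradient ascent on the mean log-likelihood, no regularization.
    """
    if not shots:
        raise ValueError("no shots left after filtering")
    b0 = 0.0
    off = {s["shooter"]: 0.0 for s in shots}
    dfn = {s["defender"]: 0.0 for s in shots}
    n = len(shots)
    for _ in range(epochs):
        g0 = 0.0
        g_off = dict.fromkeys(off, 0.0)
        g_dfn = dict.fromkeys(dfn, 0.0)
        for s in shots:
            z = b0 + off[s["shooter"]] + dfn[s["defender"]]
            p = 1.0 / (1.0 + math.exp(-z))
            err = s["made"] - p  # gradient of the log-likelihood w.r.t. z
            g0 += err
            g_off[s["shooter"]] += err
            g_dfn[s["defender"]] += err
        b0 += lr * g0 / n
        for k in off:
            off[k] += lr * g_off[k] / n
        for k in dfn:
            dfn[k] += lr * g_dfn[k] / n
    return b0, off, dfn
```

Because no reference player is dropped, the individual coefficients are only pinned down up to a constant shift within each factor, so this sketch is best read by comparing shooters with shooters and defenders with defenders. A lower defender coefficient means a lower estimated make probability against that defender.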

There are obviously some flaws in this analysis. Doesn't account for field goals, doesn't differentiate between drives and post ups (although this could be done by controlling for dribbles/touch time), and doesn't account for other teammates who may have helped to alter the shots. Still, the results pass the eye test, so I was wondering what you guys thought about this. Any feedback is welcome!

EDIT: I couldn't figure out a good way to display all the info, so I just posted it on my blog. The link is here. If anyone wanted to share how they tend to display large info like this, that could be very helpful for the future!

Defense: http://aabstats.weebly.com/blog/logisti ... im-defense

Offense: http://aabstats.weebly.com/blog/logisti ... at-the-rim
https://hwchase17.github.io/sports/

Follow me @aabsstats - I follow back ;)
Crow
Posts: 10624
Joined: Thu Apr 14, 2011 11:10 pm

Re: Logistic Regression with new SportVu data

Post by Crow »

Are the estimates per at-rim shot, per possession, or what? Is this last season's data?
italia13calcio
Posts: 100
Joined: Sun Dec 08, 2013 2:54 am

Re: Logistic Regression with new SportVu data

Post by italia13calcio »

Yes, per shot in the subset of the data, which in this case is shots within 5 feet of the rim, given that the opponent is within 5 feet.
J.E.
Posts: 852
Joined: Fri Apr 15, 2011 8:28 am

Re: Logistic Regression with new SportVu data

Post by J.E. »

Pretty cool and definitely passes the smell-test

And since you only have 2 dummy variables active per shot (instead of 10) you won't have the multicollinearity problems that my datasets usually have, and thus don't need regularization. (Though if you were to use regularization you wouldn't have to remove players with <82 shots from your sample)
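To illustrate the regularization point, here is a rough sketch of an L2-penalized (ridge) logistic regression on the same kind of shooter/defender dummies. The penalty shrinks players with very few shots toward zero, which is why they could stay in the sample. Everything here (the function name, the "off:"/"def:" feature naming, the l2 value) is an assumption for illustration, not the setup used for the posted results or for J.E.'s models.

```python
import math

def fit_ridge_logit(rows, made, l2=1.0, lr=0.5, epochs=1000):
    """L2-penalized logistic regression on sparse dummy features.

    rows: one tuple of active feature names per shot,
          e.g. ("off:Shooter", "def:Defender") (hypothetical naming).
    made: list of 0/1 outcomes. The intercept is not penalized.
    """
    if not rows:
        raise ValueError("no shots to fit")
    w = {name: 0.0 for row in rows for name in row}
    b0, n = 0.0, len(rows)
    for _ in range(epochs):
        g0 = 0.0
        grad = dict.fromkeys(w, 0.0)
        for row, y in zip(rows, made):
            z = b0 + sum(w[name] for name in row)
            err = y - 1.0 / (1.0 + math.exp(-z))
            g0 += err
            for name in row:
                grad[name] += err
        b0 += lr * g0 / n
        for name in w:
            # The penalty term pulls every player effect toward zero.
            w[name] += lr * (grad[name] - l2 * w[name]) / n
    return b0, w
```

With l2=0.0 this reduces to the unpenalized fit, where a player who made his only two attempts would keep drifting toward an ever larger coefficient. With a positive l2 his estimate stays small until he has enough shots to justify moving it.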
"Doesn't account for field goals"
This confuses me. What do you mean by that? You account for whether the shot goes in or not(?). Are you accounting for fouls and FTs?

I think on top of "how much are players changing opp. FG% at the rim", we need a stat that measures how often a player is contesting these shots. A center who has average influence on opp. FG% but contests an above-average number of shots - because of his foot speed or whatever - is still very valuable.
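One very simple version of such a stat would be a contest rate: the share of opponent rim attempts a defender contests while he is on the floor. The sketch below assumes two hypothetical inputs, a count of contested rim shots per defender and a count of opponent rim attempts while he was on court. The second one is not in the shot logs and would have to come from play-by-play or lineup data.

```python
def contest_rates(contested, opportunities):
    """Share of opponent rim attempts each defender contests.

    contested: defender -> rim shots where he was the closest defender.
    opportunities: defender -> opponent rim attempts while he was on the
        floor (a hypothetical input; the shot logs alone do not provide it).
    Defenders with zero opportunities are skipped.
    """
    return {d: contested.get(d, 0) / opportunities[d]
            for d in opportunities if opportunities[d] > 0}
```

This number would sit next to the regression coefficient rather than replace it: one says how often a player gets to the shot, the other how much he changes it once he is there.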
"If anyone wanted to share how they tend to display large info like this, that could be very helpful for the future!"
For displaying tables you can use http://www.sensefulsolutions.com/2010/1 ... table.html with Style: Unicode art, then post the result in [ code ] brackets. It might be a good idea to display the defensive numbers for each position separately (it seems like it's pretty much all C's at the top, then PF, then SF, SG, PG).
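If you would rather generate that kind of table from a script, a small helper along these lines produces fixed-width box-drawing output that can be pasted between [ code ] brackets. This is only a sketch and is not meant to reproduce the linked tool's exact output. The function name is hypothetical and the values in the usage note are made-up placeholders.

```python
def render_table(headers, rows):
    """Render rows as a fixed-width table using box-drawing characters."""
    cells = [[str(c) for c in r] for r in [headers] + rows]
    widths = [max(len(r[i]) for r in cells) for i in range(len(headers))]

    def rule(left, mid, right):
        return left + mid.join("─" * (w + 2) for w in widths) + right

    def fmt(row):
        return "│" + "│".join(f" {c.ljust(w)} " for c, w in zip(row, widths)) + "│"

    out = [rule("┌", "┬", "┐"), fmt(cells[0]), rule("├", "┼", "┤")]
    out += [fmt(r) for r in cells[1:]]
    out.append(rule("└", "┴", "┘"))
    return "\n".join(out)
```

For example, print(render_table(["Player", "Pos", "Coef"], [["Player A", "C", "-0.31"], ["Player B", "PF", "-0.12"]])) gives a three-column table. Calling it once per position would give the separate per-position tables suggested above.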
italia13calcio
Posts: 100
Joined: Sun Dec 08, 2013 2:54 am

Re: Logistic Regression with new SportVu data

Post by italia13calcio »

Thanks for the comments, J.E., and good catch - I definitely meant free throws. There is no indication (given the available data) of whether the player was fouled when he was shooting :/

And I absolutely agree with your second point, but I haven't quite found a way to deal with that. I know Seth Partnow has done some work on it, and it seems like a good start. There are definitely issues, one of the big ones being that teams often use players in different roles. For example, I think Seth has Anthony Davis rated fairly low (or at least he used to) because Davis simply doesn't contest that large a percentage of shots. There was a camp that seemed to think this was due more to the way that New Orleans/Monty Williams was using him than to any shortcomings of his as a player.

Definitely a complicated issue, and, like most defensive stats, there is a lot that can hopefully be improved upon.

Also, thanks for the link for styles!!!