Logistic Regression with new SportVu data
Posted: Sun Dec 07, 2014 9:12 pm
Hey everyone,
So I'm sure most of you have seen or played around with the new SportVu data on nba.com, specifically the new shot logs for each player, which contain detailed information about each shot. Partially inspired by J.E.'s logistic RAPM blend, I was fiddling around with the data and came up with several things I thought it could be used for. For example, by taking certain subsets of the data (like restricting it to shots within 5 ft of the basket where there was also an opponent within 5 ft of the basket), I could get something fairly similar to the rim protection stats that are also on nba.com. I also restricted the data set to shooters who had taken more than 82 of these shots and defenders who had defended more than 82 of them (1 per game), just to get rid of players with small sample sizes. From there, I ran a logistic regression on whether the shot went in, with two factors: the player who shot and the opposing player who was closest to the shooter. I have included the results of this regression below.
There are obviously some flaws in this analysis. It doesn't account for field goals, doesn't differentiate between drives and post-ups (although this could be done by controlling for dribbles/touch time), and doesn't account for other teammates who may have helped alter the shots. Still, the results pass the eye test, so I was wondering what you guys thought about this. Any feedback is welcome!
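For anyone who wants to try something similar, here is a rough sketch of the setup described above: filter to close-range contested shots, drop shooters and defenders with 82 or fewer of them, then fit a logistic regression with one coefficient per shooter and one per defender. This is not the code behind the posted results. The field names (`shooter`, `defender`, `shot_dist`, `def_dist`, `made`), the function names, the plain gradient-ascent fit, and the small ridge penalty (added only to keep the coefficients finite) are all assumptions made for illustration.

```python
import math
from collections import Counter


def filter_shots(shots, min_shots=82, max_dist=5.0):
    """Keep close-range contested shots between qualifying players.

    `shots` is a list of dicts with hypothetical keys:
    shooter, defender, shot_dist (ft), def_dist (ft), made (0 or 1).
    """
    close = [s for s in shots
             if s["shot_dist"] <= max_dist and s["def_dist"] <= max_dist]
    n_shot = Counter(s["shooter"] for s in close)
    n_def = Counter(s["defender"] for s in close)
    # Single pass: counts are taken before any player is dropped.
    return [s for s in close
            if n_shot[s["shooter"]] > min_shots
            and n_def[s["defender"]] > min_shots]


def fit_logit(shots, l2=1e-3, lr=0.5, epochs=500):
    """Fit P(made) = sigmoid(b0 + off[shooter] + dfn[defender]).

    Batch gradient ascent on the mean log-likelihood. The l2 term is an
    assumption of this sketch, not part of the original analysis.
    """
    off = {s["shooter"]: 0.0 for s in shots}
    dfn = {s["defender"]: 0.0 for s in shots}
    b0 = 0.0
    n = len(shots)
    for _ in range(epochs):
        g0 = 0.0
        g_off = dict.fromkeys(off, 0.0)
        g_def = dict.fromkeys(dfn, 0.0)
        for s in shots:
            z = b0 + off[s["shooter"]] + dfn[s["defender"]]
            p = 1.0 / (1.0 + math.exp(-z))
            err = s["made"] - p  # gradient of the log-likelihood w.r.t. z
            g0 += err
            g_off[s["shooter"]] += err
            g_def[s["defender"]] += err
        b0 += lr * g0 / n
        for k in off:
            off[k] += lr * (g_off[k] / n - l2 * off[k])
        for k in dfn:
            dfn[k] += lr * (g_def[k] / n - l2 * dfn[k])
    return b0, off, dfn
```

A higher shooter coefficient means a better finisher at the rim, and a lower (more negative) defender coefficient means a better rim protector, both on the log-odds scale. With real data you would probably swap the hand-rolled fit for a stats package, and extra controls like dribbles or touch time would go in as additional terms in `z`.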
EDIT: I couldn't figure out a good way to display all the info, so I just posted it on my blog; the links are below. If anyone wants to share how they tend to display large tables like this, that would be very helpful for the future!
Defense: http://aabstats.weebly.com/blog/logisti ... im-defense
Offense: http://aabstats.weebly.com/blog/logisti ... at-the-rim