
Re: putting some math to the problem of shot selection

Posted: Mon Jan 30, 2012 2:59 pm
by Guy
gravityandlevity wrote:After thinking about the model some more since the time the paper was accepted for publication, I think I'm ready to say that I have very little confidence in the conclusion that NBA players "undershoot".
Kudos to Brian for being willing to rethink his conclusion. This is not always easy to do, especially once your work is public.* And I take him at his word that he did not hype his prior conclusion to journalists. But this paper is a great example of a serious problem with sports analysis by academics today: you will get a lot more attention if you find an inefficiency in a sport. That's interesting, and allows writers to sound sophisticated and reference "Moneyball" (maybe even run a picture of Brad Pitt). If Brian's paper had concluded "NBA players shoot exactly when they should," I think media interest would have been much less, probably close to zero. And so researchers face a strong incentive that can bias their research, and certainly biases publication -- go find inefficiencies. (Similarly, when looking for signs of racial bias in sports, there is a strong incentive to find bias -- a study finding that "NBA Referees Not Racially Biased" doesn't get attention in the NYT.)

I'm not suggesting we should go back to the days when papers aren't made public until published in a refereed journal. We shouldn't have to wait 3 years to see research like this, and in any case journal referees likely wouldn't have spotted the problems mentioned here. (Lesson for young academics: use the Internet to get good feedback from subject-matter experts for free!) But I do think that journalists need to be more conscious of the bias in favor of this kind of finding. Get subject-matter experts to review this kind of paper BEFORE you write your article about it. The fact is, sports is in general a highly-efficient arena. Most findings of inefficiency will be wrong. In this case, it is highly unlikely (though possible) that teams systematically shoot too late. As long as there were some variance in how aggressive teams were, those who shot earlier would tend to win. That would encourage them to shoot still earlier, and other teams to emulate them. It's hard to imagine a league of professionals not finding the right equilibrium rather quickly.

* See: Berri, David.

Re: putting some math to the problem of shot selection

Posted: Mon Jan 30, 2012 3:10 pm
by mystic
gravityandlevity wrote:mystic, great synopsis of the problem. That was exactly the issue I had. How do I infer "shots that were available to take" when the record shows only "shots that were taken"?
That makes the interpretation of the results difficult. And unless someone is making a study based on video analysis, that issue will be there.

Anyway, I like how you approached it. Given the fact that I had to listen to my math professor explaining the marriage problem for about 4 hours, it was easy to follow your way of thought. I have to admit that I would have never thought I would thank that particular professor for taking so much time (and thanking myself for keeping my handwritten script of that). :)

Guy, I wouldn't necessarily say that his overall conclusion is wrong. I think there are a lot of examples out there of good enough shooters passing up open opportunities because it is early in the shot clock. Especially late in games you see that kind of habit. Teams with so-called "go-to scorers" often end up playing iso instead of set plays, which reduces their scoring efficiency overall. So, there is indeed the tendency to "undershoot", at least in certain situations.

Re: putting some math to the problem of shot selection

Posted: Mon Jan 30, 2012 3:54 pm
by gravityandlevity
Anyway, I am learning an important lesson: I should have discussed the paper here before submitting it for publication! I think I'll leave a comment on the PLoS website directing readers here for relevant discussion.

mystic, I'm glad you liked the approach. I hope that in the future you (or someone else) can think of a good way to improve on it.

Re: putting some math to the problem of shot selection

Posted: Mon Jan 30, 2012 3:57 pm
by EvanZ
gravityandlevity wrote:Anyway, I am learning an important lesson: I should have discussed the paper here before submitting it for publication! I think I'll leave a comment on the PLoS website directing readers here for relevant discussion.
Well, that's not really a problem per se, but it would have helped. The problem is what I stated earlier, that PLoS One did not find appropriate reviewers with basketball expertise who would have raised these questions much earlier in the process. At least, that's my assumption. I assume that they didn't find John Hollinger or someone from this board to review it.

Re: putting some math to the problem of shot selection

Posted: Mon Jan 30, 2012 4:11 pm
by EvanZ
EvanZ wrote:
What is the empirical shot distribution? I assumed it would be normally distributed around 12 seconds, if you just look at the shots after a made field goal (oh, and of course, ignoring the handful of shots coming off a missed And1 rebounded by the offensive team!).
[Image: shots taken vs. shot clock time]

Does this look right to you guys? Shots as a function of shot clock time from the second table Brian gave. Not clear to me whether 1 or 24 is the start of the shot clock here. Otherwise, it does look normally distributed as I suggested (around 10 or 14 s depending on how you view time on the shot clock).

Re: putting some math to the problem of shot selection

Posted: Mon Jan 30, 2012 4:15 pm
by EvanZ
Here is points per shot (PPS) vs. time:

[Image: points per shot (PPS) vs. shot clock time]

Looks like 1 second must be at the end of the shot clock.

Re: putting some math to the problem of shot selection

Posted: Mon Jan 30, 2012 4:18 pm
by mystic
EvanZ wrote: Looks like 1 second must be at the end of the shot clock.
Yes. The time is meant as seconds left on the shot clock. So, the normal distribution around 14 sec sounds about right to me. The two earlier peaks might be first break and secondary break. At least that is the way I would interpret it.

Re: putting some math to the problem of shot selection

Posted: Mon Jan 30, 2012 4:58 pm
by Guy
gravityandlevity wrote:Much of the paper is still useful, I think, but the assumption that shot opportunities arise at some uniform rate in time is not a very good one.
I think there is a second, perhaps more important issue, which is that the decision to shoot is not a random outcome from the distribution of opportunities. Players have some ability to assess the quality of shot available. Early in the clock, they can set the bar higher and only shoot if PPS is above a certain level. Basically, players should only take the shot at 16 remaining if PPS > expected value of the shot that will be available at 6 seconds remaining (or whenever PPS begins to sharply decline). As time elapses, the bar gets lowered. That is why you can't assume that additional shots taken early in the clock would have the same value as those taken now -- these "new" shots will, on average, be inferior opportunities.

I may very well be misunderstanding your analysis, but it doesn't seem to me it fully incorporates this element.

Re: putting some math to the problem of shot selection

Posted: Mon Jan 30, 2012 5:03 pm
by EvanZ
Given that the shot distribution follows a normal curve and the PPS distribution is roughly linear, does Brian's model explain the behavior correctly? Any model would have to account for those two distributions. That wouldn't necessarily mean that the model is "proven", but what we can say (or Brian could say) is that the data are *consistent* with such a model.

Also, I would argue the first 5 or 6 seconds off the shot clock should be ignored.

Re: putting some math to the problem of shot selection

Posted: Mon Jan 30, 2012 5:17 pm
by gravityandlevity
Yes, you guys got it right. The "time" column of my table means "reading on the shot clock", so 24 means the beginning and 1 means the end.

Evan, thanks for your plots. The distribution of shots taken really does look like two Gaussian distributions centered at about 9 and 20 seconds, which I hadn't noticed.

Guy, the point you are raising is in fact at the heart of what I was trying to do. I didn't assume that the decision to shoot was random. I tried to derive, theoretically, "how high the bar should be" for taking a shot at a given shot clock time t. (In the paper I called this a function f(t) ). I then tried to look at data to see whether NBA players actually seemed to have a similar internal rule for how good a shot should be.

My answer is that the theoretically optimum shooting rate and the observed shooting don't look very similar. This is either an indication that NBA players are too hesitant to shoot early in the shot clock or that my assumption of shot opportunities arising uniformly in time is not a good one. My opinion at the moment is that the comparison fails primarily for the latter reason, although I have a suspicion that the former is somewhat true also.
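
For anyone who wants to play with the "how high should the bar be" idea, here is a minimal sketch in Python. It is not the paper's derivation or its equation for f(t). It is a discrete-time toy version with made-up parameters: each second a shot opportunity shows up with probability p, its quality (expected points) is uniform between lo and hi, and turnovers are ignored. The function name and all default values are hypothetical.

```python
def shot_thresholds(clock=24, p=0.3, lo=0.6, hi=1.4):
    """Return the bar for each reading of the shot clock.

    bars[k] is the minimum quality worth shooting with k + 1 seconds left.

    Assumptions (hypothetical, not taken from the paper):
    - each second, a shot opportunity appears with probability p;
    - its quality (expected points) is uniform on [lo, hi];
    - turnovers and fouls are ignored.
    """
    def expected_max(v):
        # E[max(q, v)] for q ~ Uniform(lo, hi)
        if v <= lo:
            return (lo + hi) / 2
        if v >= hi:
            return v
        return (v * (v - lo) + (hi * hi - v * v) / 2) / (hi - lo)

    value = 0.0   # expected points once the clock has expired
    bars = []
    for _ in range(clock):
        # Shoot only if the shot beats the expected value of waiting.
        bars.append(value)
        # Value of the possession with one more second on the clock.
        value = (1 - p) * value + p * expected_max(value)
    return bars


if __name__ == "__main__":
    for seconds_left, bar in enumerate(shot_thresholds(), start=1):
        print(f"{seconds_left:2d} s left: shoot if quality > {bar:.3f}")
```

Under these assumptions the bar is zero with one second left (take anything) and rises toward hi as more time remains, which is the "bar gets lowered as time elapses" behavior Guy describes. Changing p changes how quickly the bar rises, so the uniform-arrival assumption matters a lot for the result.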

Re: putting some math to the problem of shot selection

Posted: Mon Jan 30, 2012 5:49 pm
by Guy
OK, that model makes sense to me. But then what I don't get is your assumption that if teams took more early shots they would continue to enjoy the same efficiency at time t as they currently get. If they are now taking mostly good shots early, doesn't it follow that the additional marginal shots they would take -- which must come from an inferior distribution of the remaining possible shots -- would have a lower efficiency?

Re: putting some math to the problem of shot selection

Posted: Mon Jan 30, 2012 5:57 pm
by gravityandlevity
Guy wrote:what I don't get is your assumption that if teams took more early shots they would continue to enjoy the same efficiency at time t as they currently get
I'm not making that assumption. If teams are willing to take lower-quality shots early in the clock, then their shooting rate will rise and their overall efficiency will decline. When I say that "the comparison suggests that NBA teams might be overly reluctant to shoot early in the clock", I mean that (maybe) teams should have a lower shot quality cutoff early in the shot clock. If the cutoff is indeed lowered, then the average quality of shots taken will decline, as you say, but the tradeoff will still (maybe) be worthwhile.

All those "(maybe)"s are there as reminders about the caveats about the theory that are discussed above.
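
The cutoff tradeoff can be put in numbers with a small sketch. It assumes, purely for illustration, that the quality of available shots is uniform between lo and hi (hypothetical values) and that a team shoots whenever quality exceeds the cutoff. The function name is made up for this example.

```python
def cutoff_tradeoff(cutoff, lo=0.6, hi=1.4):
    """Share of opportunities taken and their mean quality.

    Hypothetical assumption: quality ~ Uniform(lo, hi).
    """
    # A cutoff outside [lo, hi] behaves like the nearest endpoint.
    c = min(max(cutoff, lo), hi)
    share_taken = (hi - c) / (hi - lo)
    mean_quality = (c + hi) / 2   # mean of a uniform on [c, hi]
    return share_taken, mean_quality


if __name__ == "__main__":
    for cutoff in (1.2, 1.0, 0.8):
        share, quality = cutoff_tradeoff(cutoff)
        print(f"cutoff {cutoff:.1f}: take {share:.0%} of chances, "
              f"mean quality {quality:.2f}")
```

Lowering the cutoff raises the share of opportunities taken and lowers the average quality of the shots taken. Whether that is worthwhile depends on what the rest of the possession is worth, which this sketch does not model.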

Re: putting some math to the problem of shot selection

Posted: Mon Jan 30, 2012 6:06 pm
by EvanZ
Isn't another possibility that players can't find a good shot as quickly against good defenses, and the distribution we see is simply a function of that? Or would that balance out with offenses also varying in quality?

In other words, a "good" offense may find "good" shots very early against a "bad" defense, while a bad offense finds good shots much later against a good defense. Would the result of this generate a uniform or Gaussian shot distribution vs. time and linear PPS vs. time, as the data show?

Re: putting some math to the problem of shot selection

Posted: Mon Jan 30, 2012 6:29 pm
by Guy
gravityandlevity wrote:I'm not making that assumption. If teams are willing to take lower-quality shots early in the clock, then their shooting rate will rise and their overall efficiency will decline.
OK, thanks for clarifying. I guess I misunderstood your earlier comment (back on page 1 of the thread) when I asked how you arrived at the estimate that a team could improve by 4.5 points by shooting optimally, and you answered: "Yes, I applied the optimal shooting rate from the theory, given by equation (18), using the observed shot quality that I saw for NBA players, Fig. 2b." But it sounds like your model does incorporate a reduced efficiency at time t when teams increase their rate of shooting. Then perhaps another issue to think about is the assumed variance in quality of opportunities at a given time t. If it is larger than you hypothesize, then the cost of increased shooting earlier will be larger (as the decline in quality of shots will be sharper).
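
A rough sketch of the variance point, under a hypothetical assumption that shot quality at a given time is uniform around some mean with a given half-width (none of these numbers come from the paper). If a team takes the best fraction of its opportunities, the average quality of the shots taken falls as that fraction grows, and it falls faster when the spread is wider.

```python
def mean_quality_of_taken(share_taken, mean=1.0, half_width=0.4):
    """Mean quality when the best `share_taken` of opportunities are shot.

    Hypothetical assumption:
    quality ~ Uniform(mean - half_width, mean + half_width).
    """
    hi = mean + half_width
    # Cutoff that lets through exactly the top `share_taken` fraction.
    cutoff = hi - share_taken * 2 * half_width
    # Mean of a uniform on [cutoff, hi];
    # equals mean + half_width * (1 - share_taken).
    return (cutoff + hi) / 2


if __name__ == "__main__":
    for w in (0.2, 0.4):
        drop = (mean_quality_of_taken(0.25, half_width=w)
                - mean_quality_of_taken(0.50, half_width=w))
        print(f"half-width {w}: doubling the shooting rate "
              f"costs {drop:.3f} points per shot")
```

In this toy setup the quality lost per unit of extra shooting rate is equal to the half-width, so doubling the spread doubles the cost of shooting more often.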

Re: putting some math to the problem of shot selection

Posted: Mon Jan 30, 2012 8:20 pm
by gravityandlevity
That's a good point: that the distribution of shot quality can depend on the shot clock time. I'd love to do better than assuming a static distribution (same mean and variance of potential shot opportunities at all shot clock times) if I could only come up with a reasonable way to figure it out.