... the nitty-gritty of the math ...

A little bit less nitty-gritty and more conceptual.

There are a lot of techniques you can use to get a number. A simple example is to take the average total for Celtics games with Thomas and the average total for Celtics games without Thomas, and take the difference. An even simpler (but stupid) option is to just say the number is 0. Instinctively we want to say that the difference of averages is better, but why is it better, and is it the best option?

(If you just want a way to calculate a number, you can use the difference of averages and stop here. Heck, you could just use 0, but we all know that's silly.)
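For anyone who does want to stop here, the difference of averages can be sketched in a few lines. The game totals below are made up purely for illustration, and the function name is just a label I picked, not anything standard.

```python
from statistics import mean

# Hypothetical combined point totals, invented for illustration only.
totals_with_thomas = [210, 205, 215, 208]
totals_without_thomas = [200, 198, 202]


def difference_of_averages(with_player, without_player):
    """Average total with the player minus average total without him."""
    return mean(with_player) - mean(without_player)


estimate = difference_of_averages(totals_with_thomas, totals_without_thomas)
print(f"Estimated impact on the total: {estimate:+.1f} points")
```

With only a handful of games in either bucket, the number will bounce around a lot, which is part of why the rest of this post is worth reading.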

So the question is really one of two things: what's a way to predict the impact of Thomas sitting that's "good enough," or what's the "best" way to predict the impact of Thomas sitting? Without more insight, "good enough" and "best" are pretty vague notions here.

So let's take a step back. When you ask a question like "what's the numerical impact of X", you probably already have some kind of formula in mind that you want to use to predict point totals going forward. That formula is going to include things that you can measure directly or see in advance (for example, whether Thomas sits) and things that you can only get at indirectly (like how much it matters whether Thomas sits).

So what we can do - at least in principle - is go back to every game in the past and write out the formula for it, filling in the things we know directly, and leaving the things we can only get at indirectly as variables. This gives us a "prediction" (in terms of those variables) for the point total in each game.

Then we start working out values for the variables so that the predictions for those past outcomes are, taken together, as good as possible. (That means that we need to make up a 'goodness formula' too, to score how close the predictions are.)

It turns out that this is particularly easy, or works particularly well, for certain kinds of 'prediction formulas' and 'goodness formulas'. For one of the simplest examples, you can find YouTube videos about 'least squares regression'. Part of the math knowledge is knowing which kinds of prediction formulas work well.
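To make that concrete, here is a minimal sketch of least squares with about the simplest prediction formula I can think of: total = base + impact * plays, where plays is 1 if Thomas plays and 0 if he sits. The goodness formula is the sum of squared prediction errors (smaller is better). The data and the names `base`, `impact`, and `plays` are my own assumptions for illustration.

```python
def fit_least_squares(xs, ys):
    """Return (intercept, slope) minimizing the sum of squared errors
    for the prediction formula y = intercept + slope * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def sum_squared_error(xs, ys, intercept, slope):
    """The 'goodness formula': total squared miss across all games."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical games: plays is 1 when Thomas plays, 0 when he sits.
plays = [1, 1, 1, 1, 0, 0, 0]
totals = [210, 205, 215, 208, 200, 198, 202]

base, impact = fit_least_squares(plays, totals)
fitted_error = sum_squared_error(plays, totals, base, impact)
zero_impact_error = sum_squared_error(plays, totals, base, 0.0)
print(f"base={base:.1f}, impact={impact:.1f}")
print(f"squared error: fitted={fitted_error:.1f}, impact=0 -> {zero_impact_error:.1f}")
```

With this particular prediction formula, the fitted impact comes out equal to the difference of averages, so that simple method is the least squares answer under this goodness formula. Add more terms (home/road, opponent quality, pace) and the two stop being the same thing.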

A convenient thing is that you can use the 'goodness formula' to estimate how well your prediction formula works. However, that's dangerous for two reasons. First, we made up the prediction formula and the goodness formula ourselves, so we can't really be sure that they'll work well for new data points. Second, there might be something important that's different between the data points we have and the situation we're trying to predict.
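One common way to guard against the first danger is to hold some games back: fit on the earlier games, then apply the goodness formula to later games the fit never saw. The sketch below does that with a "predict the group average" formula. The split, the data, and all the names are assumptions made up for illustration.

```python
from statistics import mean

# Hypothetical (plays, total) pairs; plays is 1 if Thomas plays, 0 if he sits.
earlier_games = [(1, 210), (1, 205), (0, 200), (0, 198)]
later_games = [(1, 215), (1, 208), (0, 202)]


def fit_group_averages(games):
    """Learn one average total for each value of the plays flag."""
    return {
        flag: mean(total for plays, total in games if plays == flag)
        for flag in (0, 1)
    }


def mean_squared_error(games, averages):
    """Goodness formula: average squared miss (smaller is better)."""
    return mean((total - averages[plays]) ** 2 for plays, total in games)


averages = fit_group_averages(earlier_games)
error_on_fitted_games = mean_squared_error(earlier_games, averages)
error_on_new_games = mean_squared_error(later_games, averages)
print(f"error on games used for fitting: {error_on_fitted_games:.3f}")
print(f"error on held-out games:         {error_on_new_games:.3f}")
```

In this made-up data the held-out error is much larger than the error on the games used for fitting, which is the overconfidence trap in miniature. Holding games back does nothing for the second danger, though: if the new situation differs from every game you have, no split of old data will show it.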

So, in addition to working out how to deal with particular kinds of prediction formulas, people also work on identifying overly naive predictions. This part of the math is more subtle, but can help you avoid overconfidence.

Statistics: Posted by Nate — Fri Aug 26, 2016 4:16 pm


Statistics: Posted by rlee — Fri Aug 26, 2016 4:01 pm


Statistics: Posted by Mike G — Fri Aug 26, 2016 12:42 pm


https://www.dropbox.com/sh/l1yy0tzmudcg5tx/AAAjmCB-0H8ZDxVXSaBHtdzYa?dl=0

Play-by-play was scraped off of https://www.stats.ncaa.org. Unfortunately, some games just have completely wrong substitution patterns, so about 30% of games weren't accounted for. Also, I error-checked for misspelled player names, but there still might be some out there (apparently there are 4 ways to spell Michael Gbinije).

Let me know if you have any questions or concerns.

Statistics: Posted by tacoman206 — Thu Aug 25, 2016 9:51 pm


I assume most of the gamblers who play totals have answers, with various approaches and layers, but they may not want to share.

Obviously you have the substitution effect (someone replaces Thomas), the substitute for the substitute, the impact of each on teammates and opponents, possible change in pace, change in what plays get attempted / executed, change in who leads, especially in clutch time (likelihood of garbage time), etc. Home / road and quality of opponent obviously matter on their own and also affect how much this change in PG matters. I don't have a step-by-step model, but I'd think about all these things, and maybe there is more.

Statistics: Posted by Crow — Thu Aug 25, 2016 7:27 pm


http://www.basketball-reference.com/pla ... ais02.html

Statistics: Posted by Mike G — Thu Aug 25, 2016 3:48 pm


Statistics: Posted by rlee — Thu Aug 25, 2016 3:07 pm


"All things being equal, how would a total be affected by whether, say, Isaiah Thomas plays or sits out?"

I don't want to lead anyone as I don't want to influence any answers (perhaps wrong ones anyway), but how would you go about solving this hypothetical? Thank you.

Statistics: Posted by OnKPDuty — Thu Aug 25, 2016 11:54 am


Statistics: Posted by Kevin Pelton — Wed Aug 24, 2016 4:57 pm


Statistics: Posted by rlee — Wed Aug 24, 2016 4:04 pm


Yes, 538 did project future titles. I don't recall how much they gave away about the model. Did they run free agent markets?

A prominent ex-team analyst said fairly recently, and not for the first time, that with a rare exception he never ran or used player rankings. You can't run a grand strategy model without them, imo. So either this was an exceptional use (and absolutely a non-trivial, actually huge one) or he didn't run an analytic grand strategy model.

Hinkie understood that the answer to the grand strategy question for the Sixers was "more than 3 years out" initially and to date. At some point it shifts to sooner. The peak should be when a few GOOD draft picks are on tier 1 maxes and many of the later draft picks are still on rookie scale but have grown into positive impact players. And maybe a very good or great vet has joined the team. That is probably year 5 of the project at the earliest. If the first wave draft picks are inadequate then it will be later... or never.

Statistics: Posted by Crow — Wed Aug 24, 2016 3:16 pm
