Box Score v Play-by-Play Player Performance Metric
Posted: Thu Jan 19, 2012 4:07 am
by LA Blue Devil
Hi everyone,
I am currently a senior econ major at Duke and am just starting the year-long process of putting together a senior thesis. I have been lucky enough to find a couple of econometrics professors who love basketball and are willing to sponsor me on a basketball-related topic, and I would love to be able to create a new performance metric. As I mentioned, I am just starting this process and am therefore working on the theory for the metric, which of course means I need to decide whether I want to create a "box score" metric or an alternative. I know that many of you on here have your own metrics, and I am hoping to hear why you chose the type that you did. I would love to hear everyone else's opinion too.
First and foremost, I want the metric to be based on a sound theory of the game of basketball. I have no interest in going into academia or selling books (cough cough Berri). Like everyone else on this board, I am a fan who loves the game and wants to understand it a little better. However, my constraint is that I would prefer the metric not be a black box: I would like as many fans as possible to be able to understand it and how it is calculated. My favorite thing about the growing popularity of "advanced sports stats" is the number of nonacademic people across the internet who are learning, teaching, and talking about topics in math, programming, and statistics that you used to have to go to a college campus to find. I know that I have learned so much more about these things this way than I ever have in a classroom, and I know that I am not the only one. So, really, my hope is to create something that can contribute to this growing community, and any advice would be greatly appreciated.
Re: Box Score v Play-by-Play Player Performance Metric
Posted: Thu Jan 19, 2012 5:18 am
by Ryan
Re: Box Score v Play-by-Play Player Performance Metric
Posted: Thu Jan 19, 2012 11:53 am
by J.E.
I think if you decide to go with a BoxScore-based metric there's very little chance that it brings something new/worthwhile to the table, because a) there are already many BoxScore-based metrics out there and b) the BoxScore (obviously) can't tell you as much about a basketball game as the PBP can.
Not only does the PBP tell you who is on the court, which is useful to know when there are things like team turnovers/rebounds etc., but it can also tell you things like goaltends, who took the charge when there was an offensive foul, etc.
Re: Box Score v Play-by-Play Player Performance Metric
Posted: Thu Jan 19, 2012 1:16 pm
by EvanZ
You've come to the right place (or one of them, anyway).
I've got my own PBP metric (ezPM), which is discussed along with a lot of other "advanced stats" here:
http://thecity2.com/advanced-stats-primer/
Re: Box Score v Play-by-Play Player Performance Metric
Posted: Thu Jan 19, 2012 3:22 pm
by schtevie
My $0.02:
Another box-score metric would have the same relative value on the margin as a mid-range jump shot.
If the goal is a better understanding of how to account for game outcomes, a profitable line of inquiry would be to take a (R)APM approach to the so-called "four factors", more than four being necessary to tell the story properly.
As an alternative, here are a couple of research ideas for someone who is "a fan who loves the game and wants to understand it a little better".
(1) SAPM for coaches? Just saw a recent TrueHoop clip of Henry talking to Haralabos about the latter's claim that there are but a few good coaches and a lot that suck. Furthermore, the claim was that Tom Thibodeau was worth about 13 games, which would translate to about 5 points per 100 possessions, in the ballpark of Jeremias' RAPM estimate of 2.4. I am not sure of the best way to approach this, but my guess is that some right-hand-side variables could be created that would have some explanatory power. Might be a lot of work for an uncertain payoff.
(2) A topic having a historical bent? There have been dramatic changes over the history of the modern NBA in terms of the way the game has been played: the rise of the three-point shot and the corresponding decline of the mid-range shot, changes in the pace of the game, and changes in the efficiency of offenses/defenses. Explaining why these phenomena occurred and the speed of their adoption could add a lot of value. (That is, it would be something that I would very much like to read.)
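The wins-to-points conversion mentioned in (1) can be roughly checked with a few lines of Python. This is only a back-of-the-envelope sketch: the figure of about 2.7 wins per point of per-game margin and the pace of 92 possessions per game are assumed rules of thumb, not numbers taken from this thread, and the function name is made up for illustration.

```python
# Assumed rule-of-thumb constants (not from the thread).
WINS_PER_POINT_OF_MARGIN = 2.7   # season wins per 1 point of per-game margin
POSSESSIONS_PER_GAME = 92.0      # rough league-average pace


def wins_to_points_per_100(wins_added,
                           wins_per_point=WINS_PER_POINT_OF_MARGIN,
                           pace=POSSESSIONS_PER_GAME):
    """Convert season wins added into points per 100 possessions."""
    # Step 1: wins -> points of scoring margin per game.
    margin_per_game = wins_added / wins_per_point
    # Step 2: per-game margin -> margin per 100 possessions.
    return margin_per_game * 100.0 / pace


thibodeau_estimate = wins_to_points_per_100(13)
print(round(thibodeau_estimate, 1))
```

Under those assumptions, 13 wins works out to a little over 5 points per 100 possessions, which matches the "about 5" quoted above. Changing either constant moves the answer, so treat it as an order-of-magnitude check.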
Best of luck.
Re: Box Score v Play-by-Play Player Performance Metric
Posted: Thu Jan 19, 2012 10:37 pm
by mtamada
The top of my list of research ideas: additional tweaks to Adjusted Plus-Minus. Aside from ridge regression (which has quickly become one of the more frequently cited advanced metrics), I think the other most promising regression models are ones which take into account correlated residuals: fixed effects models and random effects models. Ryan Parker did the first work that I know of a couple of years ago, using a fixed effects model which showed promising results. He started fiddling with a random effects model but, apart from some initial results, hasn't shared anything further; I don't know if that's because there haven't been any results worth reporting or because his work is proprietary now. Or maybe he's buried in grad school work.
It's also a decent test of the quality of your undergraduate education at Duke. Most economics majors don't know about these models (so don't feel bad if you don't) because they're more of a graduate level topic. But if Duke's econ program has given you the background to tackle these kinds of regression models, they're hugely useful in econometrics and my intuition is that they'll be useful in hoopstats too.
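For readers who have not seen ridge-regressed Adjusted Plus-Minus, here is a minimal sketch in Python with NumPy. It is not anyone's actual RAPM code: the stint data are made up, the function name `ridge_apm` is hypothetical, the toy stints are one-on-one rather than five-on-five, and possession weighting is left out. It only shows the closed-form ridge solve on a stint-by-player matrix.

```python
import numpy as np


def ridge_apm(X, y, lam):
    """Closed-form ridge estimate: solve (X'X + lam*I) beta = X'y."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    A = X.T @ X + lam * np.eye(X.shape[1])
    return np.linalg.solve(A, X.T @ y)


# Made-up stints for 4 players (columns).
# +1 = on the court for team A, -1 = on the court for team B, 0 = on the bench.
X = [
    [1, 0, -1, 0],
    [1, 0, 0, -1],
    [0, 1, -1, 0],
    [0, 1, 0, -1],
    [1, 0, -1, 0],
    [0, 1, 0, -1],
]
# Made-up response: team A's scoring margin per 100 possessions in each stint.
y = [12.0, -4.0, 6.0, -10.0, 8.0, -6.0]

light = ridge_apm(X, y, lam=1.0)    # mild shrinkage
heavy = ridge_apm(X, y, lam=10.0)   # strong shrinkage toward zero
print(light)
print(heavy)
```

In this toy matrix every row sums to zero, so X'X is singular and plain least squares has no unique solution. Adding lam*I makes the system solvable, and a larger lam pulls every player's rating closer to zero.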
Re: Box Score v Play-by-Play Player Performance Metric
Posted: Fri Jan 20, 2012 10:21 pm
by Guy
Like others, I would discourage you from trying to build a new meta-metric. Or, if you are committed to building a metric, focus more narrowly on doing a superior job of measuring some specific dimension of value that isn't well-measured yet.
I think an interesting topic to tackle is the rationality of sports decision-makers. A popular view in sports economics in recent times is that sports decision makers are basically idiots, and make a lot of sub-optimal decisions. These often unpersuasive studies are usually based on comparing actual outcomes to a flawed model of efficient behavior. The failure of the two to match up is then deemed to be evidence of poor decision-making in sports. (This creates truly perverse incentives for sports economists: the worse your model, the more likely that you will find interesting evidence of sub-optimal behavior!). I would look at some of these studies and see if you can challenge their conclusions by using superior metrics. One example: some economist (Berri?) has found that NBA teams do a poor job of drafting players, greatly overvaluing the #1 pick. However, IIRC Neil Paine and one of the other writers at the old BB-Ref blog did some interesting work that showed #1 picks did return a great deal of value, about equal to what teams pay. (There was also a study showing NFL teams overvalued #1 picks, which has been challenged by non-academic analysts.)
Basically, the non-academic analytic community is way ahead of almost all the academic researchers in measuring performance. Subject knowledge trumps statistical sophistication most of the time (and frankly, the academics don't always have the statistical edge either). So that creates an opportunity for a young researcher like you to do a good takedown of some influential study(ies), by marshalling the best analytic thinking from outside the ivory tower.
Re: Box Score v Play-by-Play Player Performance Metric
Posted: Sat Jan 21, 2012 2:00 am
by Crow
"... I would prefer for the metric to not just be a black box in the sense that I would like as many fans as possible to be able to understand it and how it is calculated."
Ok, that would eliminate some proposals I might make.
If you want and can work with play by play data you might consider these ideas:
Analyze average player and team performance minute by minute, as a function of how long a player and/or lineup has been on the floor, both by stint and by cumulative time in a game (and maybe for the season, and maybe for their career service to a team too). And break it down by position, experience, player quality, player type, etc.
By doing this you could try to find answers to questions like: do players on average start out stints slow (or who does / doesn't)? When are PGs shooting best, passing best? Does defense (1 on 1 and help / team) improve with more time on the court, degrade, or improve and then degrade once some minute level is reached? What happens to rebounders, scorers, block and steal artists, etc. over the span of their time on the court? How many minutes do you have to leave a lineup out there on average to get their best minute of offense, defense, or net performance, and how does that vary by the different criteria categorizing the lineups (including experience)? Etc.
In terms of year-to-year data: how long until players and lineups peak? And when they decay, can you pinpoint the dominant Factor that breaks down for that team, trace it to players, and then trace future management decisions to that breakdown and those players?
I haven't seen anything like this yet. You could make it as simple or complicated as you want to attempt / handle.
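The simplest version of the minute-by-minute idea above could be sketched like this in Python. The record layout, the player labels, and every number are hypothetical; real play-by-play data would first need to be split into stints and minutes, which is the hard part and is not shown here.

```python
from collections import defaultdict

# Hypothetical records, one per player-stint-minute:
# (player, minute_into_stint, points_for, points_against, possessions)
records = [
    ("PG_A", 1, 2, 3, 2.0),
    ("PG_B", 1, 0, 2, 2.0),
    ("PG_A", 2, 3, 2, 2.0),
    ("PG_B", 2, 2, 2, 2.0),
    ("PG_A", 3, 4, 2, 2.0),
    ("PG_B", 3, 2, 0, 2.0),
]


def net_rating_by_stint_minute(rows):
    """Pool all rows by minute-into-stint and return net points per 100 possessions."""
    totals = defaultdict(lambda: [0.0, 0.0])  # minute -> [net points, possessions]
    for _player, minute, pts_for, pts_against, poss in rows:
        totals[minute][0] += pts_for - pts_against
        totals[minute][1] += poss
    return {
        minute: 100.0 * net / poss
        for minute, (net, poss) in sorted(totals.items())
        if poss > 0
    }


by_minute = net_rating_by_stint_minute(records)
for minute, rating in by_minute.items():
    print(minute, rating)
```

Splitting further by position, experience, or lineup would just mean grouping on more keys than the minute. With real data the per-minute samples get small fast, so some smoothing or pooling would probably be needed.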