APBRmetrics

Posted: **Thu Jun 02, 2011 4:29 pm**

Hello folks,

I've written a manuscript describing a new technique for estimating player ratings for the 2010-2011 season. It is basically a variant of the APM model, but it incorporates box score information directly into the regression. You can sort of view it as an SPM algorithm that iteratively uses APM ratings to update box score weights, then uses those box score weights to update player ratings, ad infinitum. I was supposed to present part of this work at the Sloan Sports Conference in Boston, but unfortunately was not able to be there.
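
To make the "update weights, then update ratings, repeat" idea concrete, here is a minimal sketch. It is not the algorithm from the manuscript: it assumes each player's rating is a box score part (`B` times weights `w`) plus a leftover part (`delta`), and alternates two ridge regressions on a stint-level +/- matrix `X`. The function names, the penalties `lam_w` and `lam_d`, and the data layout are all assumptions made for illustration.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(row) + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x


def ridge(X, y, lam):
    """Minimize ||y - X w||^2 + lam * ||w||^2 (lam > 0) via normal equations."""
    d = len(X[0])
    A = [[sum(row[i] * row[j] for row in X) + (lam if i == j else 0.0)
          for j in range(d)] for i in range(d)]
    b = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(d)]
    return solve(A, b)


def matvec(M, v):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(a * b for a, b in zip(row, v)) for row in M]


def alternating_fit(X, B, y, lam_w=1e-3, lam_d=5.0, iters=50):
    """X: stints x players (+1 home, -1 away, 0 off the floor).
    B: players x box score stats.  y: point margin for each stint."""
    Bt = list(zip(*B))
    XB = [matvec(Bt, row) for row in X]  # stints x stats
    delta = [0.0] * len(B)
    w = [0.0] * len(Bt)
    for _ in range(iters):
        # Step 1: hold the leftover ratings fixed, refit box score weights.
        resid = [yi - f for yi, f in zip(y, matvec(X, delta))]
        w = ridge(XB, resid, lam_w)
        # Step 2: hold the weights fixed, refit each player's leftover rating.
        resid = [yi - f for yi, f in zip(y, matvec(XB, w))]
        delta = ridge(X, resid, lam_d)
    rating = [bp + dp for bp, dp in zip(matvec(B, w), delta)]
    return w, delta, rating
```

Each pass can only lower the combined penalized squared error, so the loop settles down; how close this is to what LambdaPM actually does is for the manuscript to say.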

Some highlights of the manuscript:

1) APM by itself doesn't seem to be convincingly better than a simple dummy home court advantage (HCA) estimator that says the home team wins each possession by 3.5 points (Section 1.6).
2) LambdaPM gives an enormous improvement in statistical performance relative to both the HCA and APM estimators by incorporating box score information (see Section 3.1).
3) There appears to be some benefit in taking box score information, computing an expanded box score matrix that records products of simple terms (e.g., Rebounds x Assists), and feeding this expanded box score matrix into the LambdaPM algorithm (Section 3.3).
4) LambdaPM seems to think that:
  A) three point shooting is even more valuable than intuition suggests, and offensive rebounding isn't as important as defensive rebounding (see Section 4.1).
  B) Chris Paul, Dwight Howard, LeBron James, and Dirk Nowitzki were the best players in the NBA (in that order) this season (Table 4 on page 16).
  C) Kobe Bryant was not a top 25 player this past season, yet Pau, Bynum and Lamar Odom all were.
  D) Dirk, Nick Collison, Omer Asik and LaMarcus Aldridge were amongst the most underrated players in the 2010-2011 season relative to their box score production. I offer some guesses for why these particular players might be underrated relative to their box score production, but I'd love to hear your thoughts. See Table 5 on page 17 for the most underrated players in the league this past season.
  E) Andris Biedrins, Raja Bell, Ed Davis and Goran Dragic were the most overrated players in the league relative to their box score production (see Table 6 on page 18). Again, I offer guesses about why this might be true. But I'd love to hear your thoughts.

Of course, that LambdaPM seems to believe in an NBA in which (A) through (E) are true doesn't mean that the algorithm is correct. All of this is just food for thought.
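
For anyone wondering what the "expanded box score matrix" in item 3 might look like in code, here is a small sketch. It assumes the expansion is just every product of two different stats appended to each player's row; the manuscript may include other terms (squares, for instance), and the function name and stat labels are made up for illustration.

```python
from itertools import combinations


def expand_box_score(names, rows):
    """Append pairwise products of distinct stats to each player's row.

    names: list of stat names, e.g. ["REB", "AST", "STL"]
    rows:  list of per-player stat lists, in the same order as names
    """
    pairs = list(combinations(range(len(names)), 2))
    new_names = names + [f"{names[i]} x {names[j]}" for i, j in pairs]
    new_rows = [list(row) + [row[i] * row[j] for i, j in pairs]
                for row in rows]
    return new_names, new_rows
```

With k original stats this adds k(k-1)/2 columns, so the expanded matrix grows quickly and presumably leans on the regularization even more.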

Here is a link to the manuscript: https://docs.google.com/viewer?a=v&pid= ... y=CJ6UzpUB

Anyway, I'd love to get detailed feedback on the paper. Both from a technical perspective ("Did you try considering this?", "Why do you do...?") and a prose perspective ("Your description of such-and-such is confusing, here is a better way to say the same thing.")

Thanks in advance.

EDIT: Here is a full list of the LambdaPM player ratings for the 2010-2011 season: https://spreadsheets.google.com/spreads ... y=CNK1nN4B

It is sorted by position, then by rating. The #s are on a "points per 100 possessions" scale.

Posted: **Thu Jun 02, 2011 5:09 pm**

Cool, I very much look forward to reading and trying to understand this. You should definitely take part in the retrodiction challenge, so we can see how it lines up with the other rating systems.

Posted: **Thu Jun 02, 2011 5:22 pm**

I'm loving that Udoh and Asik are among the top underrated by box score stats.

Posted: **Thu Jun 02, 2011 5:39 pm**

Cool, thanks for taking a look. Yeah, some of the results are a bit surprising and counter-intuitive to me. The offensive rebounding thing especially. Hollinger's PER (and I think statistical +/- also?) for example values ORebs more than DRebs. Yet you'll see certain pretty well-run teams like the Heat and Celtics value getting back on defense more than offensive rebounding. So they are probably using very different quantitative analysis than what Hollinger is using.

Posted: **Thu Jun 02, 2011 5:51 pm**

How did you get NSF support for this? Do you have an NSF Graduate Fellowship?

Posted: **Thu Jun 02, 2011 6:00 pm**

Yeah, I'm on an NSF fellowship.

The NSF probably funds a lot of applied statistics work though. At least a few papers I've come across in JASA (Journal of the American Statistical Association) and AAS (Annals of Applied Statistics) received some sort of support from them.

Posted: **Thu Jun 02, 2011 6:28 pm**

Thanks for sharing the article and the results.

I've just given the article a quick first skim but here are a few comments on various points:

I am very glad to see a model that incorporates boxscore-based measurement with outside the boxscore impact measurement. I am glad to see several people pursuing new types of statistical / Adjusted hybrids as I occasionally over time have suggested and supported. I am especially encouraged to see them integrated into the same model.

Your boxscore rating and player ratings remind me of my long ago discussion of Adjusted +/- split out to "local" player production and "global" player impact.

The desirable full display would be a matrix:

| | overall rtg | boxscore rtg | player rtg |
|---|---|---|---|
| offensive | x | x | x |
| each of 4 Factor level rtgs | x | x | x |
| defensive | x | x | x |
| 4 Factor rtgs | x | x | x |

Do you have interest in presenting the additional details suggested by this matrix?

It would also be possible to add multi-season ratings in addition to 1 year and maybe various rating splits. And perhaps similar matrices for pairs, other subunits and lineups.

I wonder how much impact there would be on the ratings of main-rotation, larger-minute players if the model only tried to estimate the value of, say, the 150-250 players with the most minutes and, instead of also trying to estimate the impact of lower-minute players to any degree (even with regularization), set them all to a certain average impact value or assigned their value by a separate model.

I also wondered if the ability of the model to explain game level outcomes could be enhanced by finding, reporting and using boxscore rating and player rating volatility measures, overall or tied to certain factors (could potentially be almost any game factor but perhaps one could look at a few first). Would that help improve the results over what you see right now? Do players with ratings in one tail or the other of these 2 metrics tend to have higher volatility than those near the center of the distribution? Any notable volatility variation by position or role (especially role as indicated by offensive usage)?

I noticed that the average positional value is by far best for PFs. Are they uniquely positioned to show versatility and get more credit by this method?

The degree of "underrating" by position is +.002 (for SGs, SFs, PFs) to -.002 (for PGs) and -.003 (for Cs). How to react to that? Accept it as real or adjust the metric?

If the goal is to explain games, might it be worth it to go beyond one metric for all players to revised metrics for certain subsets of players, if one can find subsets deemed "similar" and "stable" over time? Might these subsets be a hybrid of position and role? Might the earlier work of Ed Kupfer and David Sparks and others about player "types" be worth using as starting points for this extension (perhaps with an enhanced focus on usage to define subgroups of these types, or separate types mindful of usage where no such requirement really existed before)?

Thanks for checking and reporting the break-even point for shooting as near 46% and noting the much lower break-even for 3 point contributions / threat and its global spacing implications. I was wondering if you could simplify and summarize the break-even levels for other discrete stats.

Would it be fair to say that defensive rebounding is considered a pretty important contribution of value by this metric? Any further thoughts about this particular much discussed issue? Is it balanced or does it lean toward accepting the team level actual value of a rebound over individual rating value based on understanding of how individual rebounding contributions fit into team production? How do these embedded values compare to Jerry's Adjusted Rebounding impacts?

I get the impression that you feel adding consideration of pair level performance is worthwhile but could you provide a simpler, less technical summary of what you did toward the end of the article and what you might do beyond that in the future?

One small text comment. On page 5 you compare LeBron's impact to Home Court Advantage. I think it would probably be appropriate to take James' impact down to what it would be in 36-40 minutes instead of all 48.

Posted: **Thu Jun 02, 2011 6:54 pm**

^-- Thanks for the comment. Let me respond:

Crow wrote: Thanks for sharing the article and the results.

I've just given the article a quick first skim but here are a few comments on various points:

I am very glad to see a model that incorporates boxscore-based measurement with outside the boxscore impact measurement. I am glad to see several people pursuing new types of statistical / Adjusted hybrids as I occasionally over time have suggested and supported. I am especially encouraged to see them integrated into the same model.

Your boxscore rating and player ratings remind me of my long ago discussion of Adjusted +/- split out to "local" player production and "global" player impact.
The desirable full display would be a matrix:

| | overall | boxscore rating | player ratings |
|---|---|---|---|
| offensive | x | x | x |
| each of 4 Factor level rtgs | x | x | x |
| defensive | x | x | x |
| 4 Factor rtgs | x | x | x |

And then you could add 1 yr vs multi-season. And perhaps similar matrices for pairs, other subunits and lineups.

Yeah, I'm sure that I could do something like this. The same methodology should in principle be able to get you Offensive +/-, Defensive +/-, etc. estimates. I didn't bother doing this because I wanted to focus on the key new idea of coupling two regression problems, using one regression to improve the other.

I wonder how much impact there would be on the ratings of main-rotation, larger-minute players if the model only tried to estimate the value of, say, the 150-250 players with the most minutes and, instead of also trying to estimate the impact of lower-minute players to any degree (even with regularization), set them all to a certain average impact value or assigned their value by a separate model.

In other words, keep the 250 players with the most minutes and ignore the rest in the regression? I definitely could have done that. I'm just hesitant to throw away data. Whatever we set this minute threshold to be, how do we know we've chosen a good value (one that keeps good data and throws away bad data, so to speak)? Is 50 minutes, 100 minutes, 500 minutes, or 1000 minutes the right cutoff? So you either have to know what a good cutoff is or somehow estimate it. You could estimate it from data I guess, but this just adds additional complexity.

It seems like it might be a useful thing to do in practice and maybe if you did it cleverly enough it would boost prediction results even more. But I didn't try it out.

I also wondered if the ability of the model to explain game level outcomes could be enhanced by finding, reporting and using boxscore rating and player rating volatility measures, overall or tied to certain factors (could potentially be almost any game factor but perhaps one could look at a few first). Do players with ratings in one tail or the other of these 2 metrics tend to have higher volatility than those near the center of the distribution? Any notable variation by position or role (especially role as indicated by offensive usage)?

What do you mean by volatility in this context? I'm not sure what you mean.

I noticed that the average positional value is by far best for PFs. Are they uniquely positioned to show versatility and get more credit by this method?

I didn't do anything fancy when I included position as a box score variable. I think this is mostly just a function of the PF position probably being the most stacked of any in the league right now.

If the goal is to explain games, might it be worth it to go beyond one metric for all players to revised metrics for certain subsets of players, if one can find subsets deemed "similar" and "stable" over time? Might these subsets be a hybrid of position and role? Might the earlier work of Ed Kupfer and David Sparks and others be worth using as starting points for this (perhaps with an enhanced focus on usage to define the subgroups)?

I'm not sure what you mean by this. But in Section 6 I take a quick look at pairwise player interactions. The idea is to augment the +/- matrix with a matrix of player pairs (imagine having a term in your regression for whether Kobe+Bynum are on the floor or not.) It looks like this type of stuff helps (at least relative to APM.)

I'm not really sure of the best way to bring it into the LambdaPM framework; the fit might not be natural. It is something I guess that one could explore later... how do you both take advantage of box score information in improving your ratings and also use pairwise/triplet/etc interactions to get better prediction performance?
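
Here is a rough sketch of what "augment the +/- matrix with a matrix of player pairs" could look like. The sign convention (+1 when both players are on the floor for the home team, -1 when both are on for the away team, 0 otherwise) is an assumption, as are the function name and the choice of which pairs to track; Section 6 of the manuscript may set this up differently.

```python
def add_pair_columns(X, pairs):
    """Append one indicator column per player pair to a +/- design matrix.

    X:     list of stint rows; each entry is +1 (on floor, home),
           -1 (on floor, away), or 0 (off the floor)
    pairs: list of (i, j) column indices for the player pairs to track
    """
    out = []
    for row in X:
        extra = []
        for i, j in pairs:
            # Nonzero only when both players share the floor on the same team.
            together = row[i] != 0 and row[i] == row[j]
            extra.append(row[i] if together else 0)
        out.append(list(row) + extra)
    return out
```

The augmented matrix can then go into the same regularized regression as plain APM. Tracking every possible pair adds a lot of columns, so restricting `pairs` to teammates who actually logged time together seems like a sensible starting point.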

One small text comment. On page 5 you compare LeBron's impact to Home Court Advantage. I think it would probably be appropriate to take James' impact down to what it would be in 36-40 minutes instead of all 48.

Thanks, I'll make a note of that.
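
The adjustment being suggested is just a proration. A minimal sketch, assuming the page 5 comparison treats the rating as if the player were on the floor for all 48 minutes (the function and parameter names are made up):

```python
def scale_to_minutes(full_game_impact, minutes_played, game_minutes=48):
    """Prorate a full-game impact estimate to the minutes actually played."""
    return full_game_impact * minutes_played / game_minutes
```

At 36 minutes the factor is 0.75, and at 40 minutes it is about 0.83.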

EDIT:

The degree of "underrating" by position is +.002 (for SGs, SFs, PFs) to -.002 (for PGs) and -.003 (for Cs). How to react to that? Accept it as real or adjust the metric?

Did you calculate that from the spreadsheet? Anyway, it is so close to zero that it probably doesn't mean anything significant.

Posted: **Thu Jun 02, 2011 7:26 pm**

Good to hear you are open to drilling down beyond the overall ratings to deeper levels of detail.

While one would be throwing away data with a minutes cutoff when looking for a player value rating, it would be in just one run. And you'd still have the run with everyone included without prejudice and you could compare. You could make several alternate runs and decide which is better after reviewing them, and one might feel better about it after seeing them than before. If one doesn't like the additional products they can be abandoned, but if time and interest permit, it might be worth the look. I might set the initial and main minute threshold somewhere between 250 and 800 minutes, but I'd also probably set one much higher that might be fairly consistent with the level of player who gets 10+ minutes per game in the playoffs.

"What do you mean by volatility in this context?" I was mainly thinking of the volatility in the net value of the production of boxscore captured contributions of a player game to game.

But perhaps there would also be some way to compare the derived Player average Rating (the global or "Adjusted" or outside the boxscore rating) back to the game level data and his raw +/- for his time in individual games and that of others and try to estimate the roughly inferred volatility of these for each player? Maybe not, but I throw it out there for those with more training and ability.

To try to "go beyond one metric for all players to revised metrics for certain subsets of players if one can find subsets deemed "similar" and "stable" over time" that are "a hybrid of position and role" and "an enhanced focus on usage", maybe one could classify all players, then run your model for these "types" or "subtypes" instead of for "players" and see what the data shows and compare it to the specific player findings. At least it would provide a new basis for assessing who fits a type or subtype closely or varies from the description more. And then one could perhaps somewhat modify player level ratings based on the more robust (much bigger minute) performance ratings of these types or subtypes and see if that enhances the ability to explain game level results? There might be something gained from messing around with types in addition to everything else even if the effort to improve game predictions (hopefully over a multi-season dataset) doesn't pan out strongly. I guess I am saying try innovative stuff (what I proposed or something else) and see whether it helps. If it doesn't help, set it aside and try something else and keep using the more conservative (less adventurous) model as well.

"The degree of "under-rating" by positions is +.002 (for SGs. SFs. PFs) to -.002 (for PGs) and -.003 (for Cs). How to react to that? Accept it as real or adjust the metric?"

"Did you calculate that from the spreadsheet?"

Yes I did. It is a small thing, maybe too small to worry about or do anything about. But I noticed it, so I thought I'd briefly mention it. There is a simple pattern here with the most undervalued positions at the extremes (PG & C) and the most over-rated in the middle... on height and skill mix and production. There has been some previous discussion of position level "bias" or "error" or just difference with other metrics. PGs and Cs are at the position extremes for size and assist and rebound production and other things including usage. They are probably at the extremes for shot types. Not sure about true shot attempts.
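
For anyone who wants to redo this check from the spreadsheet, a sketch of the calculation: it assumes "underrating" means the overall rating minus the box score rating, averaged within each position. That definition, the function name, and the tuple layout are all assumptions, not something taken from the spreadsheet itself.

```python
from collections import defaultdict


def mean_gap_by_position(players):
    """Average (overall rating - box score rating) within each position.

    players: list of (position, overall_rating, box_score_rating) tuples
    """
    gaps = defaultdict(list)
    for position, overall, box in players:
        gaps[position].append(overall - box)
    return {pos: sum(g) / len(g) for pos, g in gaps.items()}
```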

Posted: **Thu Jun 02, 2011 7:54 pm**

^- OK, I'll add the minute cutoff stuff as a pre-processing step to my code. At that point, others can try this sort of experiment themselves and see what the impact of the different minute thresholds is.
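
A sketch of what such a pre-processing step might look like. It is not the actual code: it assumes the variant where every player under the cutoff is merged into one shared "low-minute player" column, so they all get the same coefficient rather than being dropped. The function name and data layout are made up for illustration.

```python
def pool_low_minute_players(X, minutes, cutoff):
    """Keep a column per player at or above `cutoff` minutes; merge the rest.

    X:       list of stint rows (+1 home, -1 away, 0 off the floor)
    minutes: season minutes per player, same order as the columns of X
    Returns (kept_indices, new_X). The last column of new_X is the pooled
    low-minute column, so all such players share one coefficient.
    """
    kept = [p for p, m in enumerate(minutes) if m >= cutoff]
    pooled = [p for p, m in enumerate(minutes) if m < cutoff]
    new_X = [[row[p] for p in kept] + [sum(row[p] for p in pooled)]
             for row in X]
    return kept, new_X
```

Running the regression once per candidate cutoff (say 250, 500, 800) and comparing held-out prediction error would be one way to let the data pick the threshold.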

"What do you mean by volatility in this context?" I was mainly thinking of the volatility in the net value of the production of boxscore captured contributions of a player game to game.

So something along the lines of, "how much did the box score stats capture how well player X played today?" E.g., did Carlos Boozer have a fake 20/10 game today?

If that is what you are saying, that isn't really the goal of this tool.

But if by volatility you just mean "variance", well you could do that. Like, we could look at PER for the 82 games the guy played in the season, and then calculate the variance. Is that the type of thing you are going after with this "volatility" idea?

Posted: **Thu Jun 02, 2011 7:57 pm**

Can you break this into offense/defense?

Posted: **Thu Jun 02, 2011 8:07 pm**

By volatility, I just meant "variance" in the game to game measurements.

(I am sorry if using them as equivalents was not typical, acceptable or clear.)
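
In that sense the calculation is straightforward. A minimal sketch, assuming you already have some per-game rating for each player (a game-level box score rating, PER, or whatever you prefer); the function name and input layout are hypothetical:

```python
from statistics import mean, variance


def game_to_game_volatility(game_ratings):
    """Return {player: (mean, sample variance)} of per-game ratings.

    game_ratings: {player: [rating in game 1, rating in game 2, ...]}
    Players with fewer than two games are skipped (variance is undefined).
    """
    return {player: (mean(r), variance(r))
            for player, r in game_ratings.items() if len(r) >= 2}
```

Sorting the output by the mean and looking at the variances in each tail would speak to the question of whether players at the extremes are more volatile than those near the center.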

Posted: **Thu Jun 02, 2011 10:53 pm**

Did the Suns decide to move Dragic primarily because of what they saw as coaches and analysts or what his boxscore stats did year to year or did his bad and declining Adjusted +/- play a significant role? It appears to me that his Adjusted +/- was worse than his boxscore estimate. I didn't see him enough to really judge how his play looked.

Did the Rockets evaluate his tape or his basic stats differently? Did they refer to Adjusted +/- or other internal advanced metrics? Did his past look better to them by whatever means than traditional APM, RAPM and LambdaPM viewed him, or were they mainly betting on skills / size, development, bounce back from short term measurements or fit in their system? How much did they want him and how much did they just want something else (with value in the future) over Brooks and his expiring contract (formerly favored despite his weak defense and Adjusted +/- but no longer wanted after his even weaker play this season)?

How much weight did advanced metrics get in the Boston / OKC deal, the Denver / NY deal, the Orlando deals with Washington and Phoenix, etc.?

Won't get any direct answers from those potentially involved, but these are interesting questions to me.

Dragic did bounce back on basic stats in Houston. I wonder about his Adjusted +/-. Does any APM author compute separate APMs for traded players with each team? Any interest in doing so in the future, in general or for him in particular (if it is not time intensive)?

Posted: **Thu Jun 02, 2011 11:14 pm**

Dapo, you may not know, but I compile a whole bunch of "beyond the box score" stats from play-by-play data that you might could put in your model.

Assisted vs. unassisted 2pt, 3pt FG
assists to 2pt, 3pt FG
counterpart data (including missed field goals and rebounds)

Let me know if you're interested, and I'll give you a data file.

Posted: **Thu Jun 02, 2011 11:21 pm**

Crow wrote: How much weight did advanced metrics get in the Boston / OKC deal, the Denver / NY deal, the Orlando deals with Washington and Phoenix, etc.?

Seems like Boston and NY didn't give much weight to (r)APM. Green had a bad rating pretty much every year, and Anthony sure wasn't ranked as a superstar. Not sure about other advanced metrics.

LambdaPM: A new way of looking at adjusted +/-
