
The value of continuity

Posted: Thu Dec 12, 2013 9:28 am
by nikkoewan
Hello guys. This is my first time posting here, but it isn't my first time reading. I've been trying to acclimate myself to the level of discussion that happens here and, to be entirely honest, I'm overwhelmed. I've forgotten a lot of the mathematical theory that gets used here, which is why I find myself constantly reaching for my books when I read the threads. That's also why it took me so long to post - I've been "shy" (to a degree) about commenting on anything because, compared to you guys, I'm undereducated. I've been trying to play catch-up (reading books when I can and studying some programming in between), but I realized there's no better way to get acclimated than to dive right into the pool, swimming lessons or not. So here goes:

I'm a big fan of plus/minus-based metrics, and I think they're a huge key to evaluating NBA players in terms of current production, equivalent contract, future potential, and so on. Two things:

1. If we've constructed a plus/minus model for every relevant metric for a player, then we can in essence use it to break down the player's APM as it relates to points per possession. Example:

Say Player X's improved APM (whatever method that ends up being, statistical or regularized) on a points-per-100 basis is bad, but his APM in the "effective field goal department" is good - i.e., adjusting for teammates, quality of opponents, and home court, his team is X1 points better at making shots when Player X is on the court - while his APM in the turnover department is bad. Then you at least have a basis for saying where his "APM-PPP" comes from, since it's long been understood that efficiency differential comes from the four factors (eight counting offense and defense separately).

I haven't thought through whether it's doable, whether there are things that would hinder the study, or wrinkles I have yet to consider. Just an idea.
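Just to make this a little more concrete, here's a rough sketch in Python of the "one regression per factor" version (the input format and the function name are made up for illustration; it assumes you've already built the usual RAPM-style stint matrix and have each stint's per-100 margin for every factor):

    from sklearn.linear_model import Ridge

    def factor_apm(X, factor_margins, possessions, lam=2000.0):
        """Fit one ridge (RAPM-style) regression per factor.

        X              - stints x players matrix (+1 home player on court, -1 away, 0 off),
                         with a home-court indicator column tacked on
        factor_margins - dict: factor name -> array of each stint's margin per 100 possessions
                         (eFG%, TOV%, ORB%, FT rate, for offense and defense)
        possessions    - possessions in each stint, used as regression weights
        lam            - ridge penalty (made-up default)
        """
        coefs = {}
        for name, y in factor_margins.items():
            model = Ridge(alpha=lam, fit_intercept=True)
            model.fit(X, y, sample_weight=possessions)
            coefs[name] = model.coef_  # one number per player for this factor
        return coefs

The hope would be that a player's overall points-per-100 APM roughly decomposes into these factor ratings (weighted by how much each factor is worth in points), which is exactly the breakdown I'm after.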

2. Another thing plus/minus can be used for is evaluating continuity. I've always held the belief that the more players play together, the better they play, and the whole truly becomes greater than the sum of its parts. Just looking at certain recent "cores" from teams (numbers are taken from bball ref unless otherwise stated):

KD/Russ/Ibaka:
2010-11 > +3 points per 100
2011-12 > +6 points per 100
2012-13 > +9 points per 100

George/Hibbert:
2010-11 > +0.7 points per 100
2011-12 > +6.7 points per 100
2012-13 > +8.5 points per 100
2013-14 > +17 points per 100

Griffin/Jordan:
2008-09 > -5.6 points per 100
2009-10 > +4.4 points per 100
2010-11 > +4.2 points per 100
2011-12 > +6.4 points per 100
2012-13 > +11 points per 100

Kidd/Terry/Dirk:
2008-09 > +5.5
2009-10 > +7.1
2010-11 > +15.5

Bad Boys 2.0 (Billups/Hamilton/Big Ben):
2002-03 > +4.8
2003-04 > +7.2
2004-05 > +9.4

Horford/Smith/Johnson:
2007-08 > +0.5
2008-09 > +1.9
2009-10 > +8.0

Kobe/Lamar:
2004-05 > -1.2
2005-06 > +5.2
2006-07 > +2.4
2007-08 > +9.5

Even considering that each iteration had a significant addition in the last year of the span I used (except for OKC) - West and Hill for Indiana, Paul for the Clippers, TC for Dallas, Sheed for Detroit, Crawford for Atlanta, Pau for the Lakers - there seems to be an upward trend.

My idea is that instead of using each player as a single element in the regression (example: EffDiff = A1 + A2 + A3 + A4 + A5 - B1 - B2 - B3 - B4 - B5 + HC + e), we clump them together: EffDiff = A12 + A3 + A4 + A5 - B1 - B2 - B3 - B4 - B5 + HC + e. The sample size is definitely smaller, and instead of tens of thousands of combinations we may only have a couple of thousand.
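Here's a rough sketch of how the "clumped" design matrix could be built (everything here - the stint format, the helper name - is invented for illustration; how to treat minutes where the pair is split up isn't pinned down above, so this version just falls back to individual columns for those):

    import numpy as np

    def build_design(stints, players, pairs):
        """stints  - list of dicts like {'home_on': set_of_ids, 'away_on': set_of_ids}
        players - player ids that keep an individual column
        pairs   - 2-tuples of ids that are clumped into one shared column
        A pair column is +1/-1 only when BOTH members are on the floor for a side."""
        col = {p: i for i, p in enumerate(players)}
        pair_col = {pair: len(players) + j for j, pair in enumerate(pairs)}
        X = np.zeros((len(stints), len(players) + len(pairs)))

        for row, stint in enumerate(stints):
            for side, sign in (('home_on', 1.0), ('away_on', -1.0)):
                on_court = stint[side]
                clumped = set()
                for a, b in pairs:
                    if a in on_court and b in on_court:
                        X[row, pair_col[(a, b)]] = sign  # the unit, e.g. A12
                        clumped.update((a, b))
                for p in on_court:
                    if p not in clumped and p in col:
                        X[row, col[p]] = sign  # everyone else: A3, A4, ...
        return X

The response is still each stint's efficiency differential (with a home-court column and the usual ridge penalty on top), and the coefficient on the pair column becomes the estimate for the clumped unit.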

Will this work, especially for players who play a lot together over multiple seasons?

Please do tell if both these ideas are bad. I just figured that, with the emergence of NBAWowy, give it a couple of years and you could run a study on both. Just an idea.

Re: The value of continuity

Posted: Thu Dec 12, 2013 4:02 pm
by xkonk
I think it's an interesting idea, but survivor bias is going to be an issue. Players that stay together for a few years are going to be players that a team wanted to keep, which means that either at least one of the players is doing well or those two players are doing well together. In either case, that means you'll probably see increased ratings for the players, either individually or together. But maybe you could look for increases in the duo's rating compared to their individual ratings over time. For example, just making up numbers, if KD/Westbrook/Ibaka were each +1 in 2010 but the three on court together were +3.5, then in 2011 they were each +1.5 but were +7 together, the 'synergy' bonus has increased and maybe that's a sign of good continuity. That method is probably still vulnerable to survivor bias, but takes care of a few kinks.
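With those made-up numbers the bonus is just the trio's on-court rating minus the sum of the individual ratings - a quick sketch:

    # made-up numbers from the example above
    individual = {2010: [1.0, 1.0, 1.0], 2011: [1.5, 1.5, 1.5]}  # each player's own rating
    together = {2010: 3.5, 2011: 7.0}                            # the trio on court together

    synergy = {yr: together[yr] - sum(individual[yr]) for yr in together}
    print(synergy)  # {2010: 0.5, 2011: 2.5} -> the bonus grew, maybe a continuity signal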

Re: The value of continuity

Posted: Thu Dec 12, 2013 4:36 pm
by Mike G
Does a head coaching change generally result in a worse record? Or is this year unusual?

Re: The value of continuity

Posted: Thu Dec 12, 2013 4:58 pm
by nikkoewan
xkonk wrote:I think it's an interesting idea, but survivor bias is going to be an issue. Players that stay together for a few years are going to be players that a team wanted to keep, which means that either at least one of the players is doing well or those two players are doing well together. In either case, that means you'll probably see increased ratings for the players, either individually or together. But maybe you could look for increases in the duo's rating compared to their individual ratings over time. For example, just making up numbers, if KD/Westbrook/Ibaka were each +1 in 2010 but the three on court together were +3.5, then in 2011 they were each +1.5 but were +7 together, the 'synergy' bonus has increased and maybe that's a sign of good continuity. That method is probably still vulnerable to survivor bias, but takes care of a few kinks.
Which is exactly the point -- you don't get to survive as a "group" if you play crappy together. Rarely do you see a group of players stick together through losing seasons. I mean, DMC and Reke - the two supposed cornerstones of the Kings - played horribly even though they played a lot over three years (~4,000 minutes together from 2010-13), and they never improved as a duo. Part of that is because they had a crappy group around them (which highlights the importance of "isolating" the value), but part of it is because the two never really "clicked".

Another duo - T-Mac/Yao (2004-08):

2004-05 > +2.6
2005-06 > +4.4
2006-07 > +13.1
2007-08 > +4.2

I think what I wanted to "prove" with the second idea is that there is value in keeping a core together (since a lot of people think it's all about talent -- just keep trading for talent and the rest will take care of itself).

Re: The value of continuity

Posted: Thu Dec 12, 2013 5:00 pm
by J.E.
If you have multiple years of matchup data you could quickly run through all possible pairs/triplets in the league and see how their PM changed over time. That said, I think an aging adjustment is critical here, as a lot of the pairs will start playing together when both are relatively young and will simply become a better pair because they're both entering their prime. Adjusting for teammates (and possibly coaches) is probably also useful.

Unfortunately, as xkonk said, survivor bias is definitely going to be a big issue.
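As a sketch of what that scan could look like (the names and data structures below are all invented; it assumes you already have each pair's on-court net rating per season, player ages, and some league-wide age curve to subtract out):

    def pair_trajectories(seasons, pair_rating, age, age_curve):
        """seasons     - ordered list of season labels
        pair_rating - dict {(season, frozenset({a, b})): net rating per 100 with both on court}
        age         - dict {(season, player): age that season}
        age_curve   - dict {age: expected rating change attributable to aging alone}
        Returns the age-adjusted rating path for every pair seen in 2+ seasons."""
        paths = {}
        all_pairs = {pair for (_, pair) in pair_rating}
        for pair in all_pairs:
            path = []
            for s in seasons:
                if (s, pair) in pair_rating:
                    aging = sum(age_curve.get(age[(s, p)], 0.0) for p in pair)
                    path.append((s, pair_rating[(s, pair)] - aging))
            if len(path) >= 2:
                paths[pair] = path
        return paths

(The survivor-bias caveat stands either way; this only handles the aging part.)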

Re: The value of continuity

Posted: Thu Dec 12, 2013 6:12 pm
by nikkoewan
J.E. wrote:If you have multiple years of matchup data you could quickly run through all possible pairs/triplets in the league and see how their PM changed over time. That said, I think an aging adjustment is critical here, as a lot of the pairs will start playing together when both are relatively young and will simply become a better pair because they're both entering their prime. Adjusting for teammates (and possibly coaches) is probably also useful.

Unfortunately, as xkonk said, survivor bias is definitely going to be a big issue.
Can someone explain why "survivor" bias is a key deterrent?

Also, awesome work on RAPM, sir.

Re: The value of continuity

Posted: Thu Dec 12, 2013 7:15 pm
by steveshea
Here is my take on survivor bias. It depends on the goal of the study.

If you want to see if teams can improve through continuity of their core, then the survivor bias is a problem. It is far more likely that the groups that stay together are the ones that work and would improve. (The groups that would not improve are more likely to be disbanded before it plays out.)

If you are trying to predict the future success of a team, then there may be value in treating groups of players as one unit and identifying the kinds of trends you mention. As long as you don't overinterpret the results to suggest that any team that keeps its current core together will improve according to the observed trend, I don't see the survivor bias as a problem.

Re: The value of continuity

Posted: Thu Dec 12, 2013 8:44 pm
by Crow
Continuity of lineups year to year faces challenges as players move, but it does not seem like many teams make a great push to keep strong-performing five-man lineups intact from year to year, maybe believing that 3-4 main players and the same system suffice. Only 7 of the 20 best lineups that were used 250+ minutes last season are still possible.

http://bkref.com/tiny/mfIkJ

Re: The value of continuity

Posted: Fri Dec 13, 2013 4:53 pm
by xkonk
To expand a bit/be more concrete on survivor bias: I think the example from the book Black Swan is an easy enough one to grasp. A bunch of people enter the stock market. None of them have any actual ability, but due to dumb luck some make money. The people who don't make money are fired/leave the market. New people enter the market for whatever reason. The next year, some people again make money due to dumb luck, including some of the people who made money the year before; again, losers drop out. This goes on for as long as you care to keep track. As long as the pool of people is large enough, there's going to be a group of people who look like they know how to make money in the stock market just due to dumb luck. You can work the same example with a coin flipping competition; start with enough people and you'll find someone who "knows" how to call coin flips even after 30 or 50 rounds.

The same thing applies to the continuity idea. If coaches scrap groups that don't appear to be playing well together (and we assume some of them might have come around given more time), then pretty much by definition the groups that do play together for a while will appear to get better with time. Even if it's true that continuity boosts performance, you'll overestimate how important it is because of survivor bias; groups that would have had negative, flat, or only small boosts with time have presumably dropped out of the data you end up analyzing, because their coaches never gave them the time to demonstrate that smaller effect.
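The coin-flip version is easy enough to simulate - a throwaway sketch, nothing more: start with enough entrants and somebody survives every round on pure luck.

    import random

    def survivors(n_people=1_000_000, rounds=20):
        """Everyone 'calls' fair coin flips; anyone who misses is dropped.
        With a million entrants and 20 rounds you expect about
        1,000,000 / 2**20, i.e. roughly one person who never missed,
        even though no skill is involved; make the pool big enough and
        the same holds for 30 or 50 rounds."""
        alive = n_people
        for _ in range(rounds):
            # each survivor has a 50/50 shot at calling this flip
            alive = sum(1 for _ in range(alive) if random.random() < 0.5)
        return alive

    print(survivors())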

Re: The value of continuity

Posted: Sat Dec 14, 2013 12:29 am
by nileriver
Have you taken a look at continuity among bad teams? I would be curious to see if they do improve over time. Or, perhaps they remain at the bottom of the league due to a lack of continuity.

Re: The value of continuity

Posted: Sat Dec 14, 2013 6:55 pm
by Crow
Of the 14 lineups used over 250 minutes last season that were negative on +/-, only 2 are still possible. One has been used 5 minutes or less (A. Gee | K. Irving | T. Thompson | D. Waiters | T. Zeller for CLE) and the other (B. Biyombo | G. Henderson | M. Kidd-Gilchrist | J. McRoberts | K. Walker for CHA) hasn't changed performance level (still near -4 per 100 possessions). There were no bigger-minute lineups for bad teams that were positive on +/- last season. There is one bigger-minute lineup for a somewhat bad team (Boston) that has a positive +/- this season, but it played trivial minutes last season, if any.

Coaches do so much situational management in games that very few lineups get enough minutes to judge even half-decently, and it really does not appear that many teams have season-long strategies to test many lineups to even a modest level. Within the roughly 4,000 team minutes available in a season, one can test 5-10+ lineups into the hundreds of minutes, with a few of the best prospects getting much more than the others. But on average teams only had 3 lineups used over 100 minutes last season and only 1.3 used over 250 minutes, or just over 3 minutes per game for the full season. Injuries and trades hamper this, but teams could do far better at systematically testing lineups (for the playoffs and long-term use) if they made it a priority with, or on top of and before, a lot of the in-game coaching maneuvers.

Popovich had 4 lineups used over 100 minutes and 1 over 250 minutes. He probably believes his system should work with almost any lineup, but they went into the playoffs with hardly any half-reliable knowledge of how any of their lineups actually perform on average. By contrast, the Griz tested 2 lineups over 600 minutes last season and the Warriors tested 3 over 400 minutes each; several other teams had one over 700 minutes. Still fairly modest sample sizes, but better. Miami tested one over 700 minutes and a total of 7 over 100 minutes. OKC ran on and on with its starting lineup for 1,300 minutes last season. They tested 4 others over 100 minutes and thus did a bit more testing than the average team (such a low bar of testing), but they could have done more, especially since they had 2 straight seasons of that starting unit sucking in the playoffs, it sucked again last season, and yet they stuck with it with little change. They did not know as much about the other options as they could/should have.