APBRmetrics

Posted: **Sat Sep 03, 2016 4:40 pm**

1. Does BPM still have the same coefficients which were found by using 14-year RAPM data?

2. Are xRAPM and RPM the same or really close? If they are really close where can I find xRAPM results for 2001-2016?

Posted: **Thu Sep 08, 2016 2:15 pm**

permaximum wrote:1. Does BPM still have the same coefficients which were found by using 14-year RAPM data? 2. Are xRAPM and RPM the same or really close? If they are really close where can I find xRAPM results for 2001-2016?

1. Yes. I haven't updated it from that yet, though I'd like to at some point. The writeup is still valid at http://www.basketball-reference.com/about/bpm.html

2. Yes, they are pretty much the same. But I don't think xRAPM historical numbers exist currently on the web other than what might have been posted here.

Posted: **Thu Sep 08, 2016 3:57 pm**

DSMok1 wrote:
permaximum wrote:1. Does BPM still have the same coefficients which were found by using 14-year RAPM data? 2. Are xRAPM and RPM the same or really close? If they are really close where can I find xRAPM results for 2001-2016?
1. Yes. I haven't updated it from that yet, though I'd like to at some point. The writeup is still valid at http://www.basketball-reference.com/about/bpm.html

2. Yes, they are pretty much the same. But I don't think xRAPM historical numbers exist currently on the web other than what might have been posted here.

I think a better approach for BPM 2.0 is to use a 20-year RAPM covering 1997-2016, since that would incorporate data from before the 3-point explosion and from the illegal defense era. It would increase accuracy for historical players who didn't play in the 3-heavy era.

Another improvement to BPM could be a positional BPM. Defensive rebounds are less important for point guards than for centers, and a new BPM could reflect that. Another change would be to incorporate the team's Four Factors into BPM. If a team has a good defense because of a high opponent turnover rate, we can credit the guards more on defense. If it has a good defense because of rebounding or opponent eFG%, the big men get more defensive credit.

Posted: **Thu Sep 08, 2016 11:24 pm**

DSMok1 wrote:
permaximum wrote:1. Does BPM still have the same coefficients which were found by using 14-year RAPM data? 2. Are xRAPM and RPM the same or really close? If they are really close where can I find xRAPM results for 2001-2016?
1. Yes. I haven't updated it from that yet, though I'd like to at some point. The writeup is still valid at http://www.basketball-reference.com/about/bpm.html

2. Yes, they are pretty much the same. But I don't think xRAPM historical numbers exist currently on the web other than what might have been posted here.

Thank you for the answer and the BPM.

BPM was the best public metric at predicting 2015-16 team wins (including playoffs), surpassing RPM. RPM still has the lead when I include 2014-15 too, but that comparison is a bit unfair since RPM has information about previous years' data and ratings. Yet I should note that at assigning individual player value, non-empirical simple linear box score metrics, MPG and PER are better than BPM, WS, RAPM and RPM.

Posted: **Fri Sep 09, 2016 12:25 pm**

colts18 wrote: 1. Yes. I haven't updated it from that yet, though I'd like to at some point. The writeup is still valid at http://www.basketball-reference.com/about/bpm.html

2. Yes, they are pretty much the same. But I don't think xRAPM historical numbers exist currently on the web other than what might have been posted here.

I think a better approach for BPM 2.0 is to use a 97-16 20 year RAPM since that would incorporate data from the pre 3 point explosion and illegal defense era. It would increase the accuracy of historical players who didn't play in the 3 heavy era.

Another improvement to BPM could be a positional BPM. Defensive rebounds are less important for PG than Centers. A new BPM can accurately reflect that. Another change would be to incorporate team 4 factors into BPM. If a team has a good defense because of a high turnover rate, we can credit the guards more on defense. If they have a good defense because of rebounding or eFG%, the big men get more defensive credit.

The longer the better. I don't currently have access to a 20 year RAPM. Who has that data? I know James Brocato does. Not sure who else does.

Posted: **Fri Sep 09, 2016 12:27 pm**

permaximum wrote:BPM was the best public metric to predict 2015-16 team wins (including playoffs) surpassing RPM (still RPM has the lead when I include 2014-15 too. But it's a bit unfair since RPM has info about the previous years' data and ratings). Yet I should note, at assigning individual player value; non-empirical simple linear box score metrics, MPG and PER are better than BPM, WS, RAPM and RPM.

I don't want to get into a debate, but could you point me to where that was tested? The whole point of BPM's construction is to assign individual player value optimally.

Posted: **Fri Sep 09, 2016 3:49 pm**

DSMok1 wrote:
permaximum wrote:BPM was the best public metric to predict 2015-16 team wins (including playoffs) surpassing RPM (still RPM has the lead when I include 2014-15 too. But it's a bit unfair since RPM has info about the previous years' data and ratings). Yet I should note, at assigning individual player value; non-empirical simple linear box score metrics, MPG and PER are better than BPM, WS, RAPM and RPM.
I don't want to get into a debate, but could you point me to where that was tested? The whole point of BPM's construction is to assign individual player value optimally.

I ran an advanced retrodiction test at the game level for all popular public metrics, plus MPG, USG and 2 non-empirical simple linear box score metrics, covering 1983/84-2015/16. For each of the 38658 games, every player's per-possession or per-minute metric score (based on the metric's regular-season value from the previous year) and the roster turnover rate were calculated separately. Players below 250 MP in the previous year were assigned average values. I will publish the test and the results just after the new season starts.
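
For readers trying to picture the setup, here is a minimal sketch of how a team's score for one game could be built from previous-season ratings. This is an illustration under assumptions, not the test's actual code: the minutes-weighted average, the `team_game_score` and `predict_home_win` names, and the `league_avg` default of 0.0 are all hypothetical. Only the 250 MP cutoff comes from the post.

```python
# Hypothetical sketch: combine previous-season ratings into a single-game team score.
LOW_MINUTES_CUTOFF = 250  # from the post: players below 250 MP get average values


def team_game_score(box, prev_season, league_avg=0.0):
    """Minutes-weighted average of previous-season ratings for one team in one game.

    box: list of (player, minutes_in_this_game)
    prev_season: dict player -> (rating, minutes_played_previous_season)
    league_avg: rating assigned to low-minute or unknown players (assumed value)
    """
    weighted_sum = 0.0
    total_minutes = 0.0
    for player, minutes in box:
        rating, prev_mp = prev_season.get(player, (league_avg, 0))
        if prev_mp < LOW_MINUTES_CUTOFF:
            rating = league_avg  # too small a sample last year, so use the average
        weighted_sum += rating * minutes
        total_minutes += minutes
    return weighted_sum / total_minutes if total_minutes else league_avg


def predict_home_win(home_box, away_box, prev_season, league_avg=0.0):
    """Pick the team with the higher score (no home-court adjustment in this sketch)."""
    home = team_game_score(home_box, prev_season, league_avg)
    away = team_game_score(away_box, prev_season, league_avg)
    return home > away
```

A metric that works per possession would weight by possessions instead of minutes; the structure stays the same.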

For now, all I can say is that BPM, WS, RAPM and RPM are all worse than MPG at assigning individual player value. PER is a bit better. The two box score metrics I picked to see how they compare with advanced metrics are even better. Here is the formula for one of them. Note that it's not empirical, and it's so simple that it doesn't even differentiate between offensive and defensive rebounds. Supposedly Tom Thibodeau came up with it around 2008.

(2*fg-(fga-fg)+ft-0.5*(fta-ft)+3*3pm-1.5 * (3pa-3pm)+reb+ast-pf+stl-tov+blk)/mp
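
As a minimal sketch, the formula above can be written as a Python function. The function and argument names are my own (`tpm` and `tpa` stand in for 3pm and 3pa, since Python identifiers can't start with a digit). The formula as posted doesn't say whether `fg` and `fga` include three-pointers, so the code just takes the terms as written.

```python
def simple_box_score_rate(fg, fga, ft, fta, tpm, tpa, reb, ast, pf, stl, tov, blk, mp):
    """Per-minute value of the simple linear box score formula quoted above.

    tpm / tpa are made and attempted three-pointers (3pm / 3pa in the formula).
    """
    if mp <= 0:
        raise ValueError("minutes played must be positive")
    total = (
        2 * fg - (fga - fg)            # made field goals, minus misses
        + ft - 0.5 * (fta - ft)        # made free throws, minus half of each miss
        + 3 * tpm - 1.5 * (tpa - tpm)  # made threes, minus 1.5 per miss
        + reb + ast - pf + stl - tov + blk
    )
    return total / mp


# Example line: 8-16 FG, 4-6 FT, 2-5 3P, 7 reb, 5 ast, 3 pf, 1 stl, 2 tov, 1 blk in 36 min
example = simple_box_score_rate(8, 16, 4, 6, 2, 5, 7, 5, 3, 1, 2, 1, 36)
```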

Edit: BPM and RPM do best at what they're built to do: predicting next year's team wins. I would encourage everyone to use BPM/RPM blends for that. Plus-minus based metrics will always suffer from multicollinearity issues, and it's clear BPM suffers from that too. That's why I say they're the best at what they're built to do: they capture lineup synergies better than other metrics, which helps them a lot at predicting next year's team wins.

Posted: **Tue Sep 13, 2016 12:22 pm**

permaximum wrote:I made an advanced retrodiction test for all popular public metrics + MPG + USG + 2 non-empirical simple linear boxscore metrics at game level between 1983/84-2015/16 where each player's possessional or per-minute metric score (depending on the metrics' regular season average in the previous year) and roster turnover rate for 38658 games were calculated different for each game. Players below 250 MP in the previous year were assigned average values. I will publish the test and the results just after the new season starts.

For now all I can say, BPM, WS, RAPM and RPM are all worse than MPG at assigning individual player value. PER is a bit better. Those two boxscore metrics which I picked to see how they do compared to advanced metrics are even better. Here's one's formula. Note that it's not empirical, it's too simple that it doesn't even make oreb/dreb differentiation. Supposedly Tom Thibodeau came with this around 2008.

(2*fg-(fga-fg)+ft-0.5*(fta-ft)+3*3pm-1.5 * (3pa-3pm)+reb+ast-pf+stl-tov+blk)/mp

Edit: BPM and RPM do best at what they're built to do. Predicting next year's team wins. I would encourage everyone to use BPM/RPM blends for that. P-M based metrics will always suffer multi-collinearity issues and it's clear BPM suffers that too. That's the reason I say they're the best at what they're built to do. Capturing lineup synergies better than other metrics which helps them a lot at predicting next year team wins.

Interesting, I look forward to seeing the results. I'm not sure I follow exactly what you tested. So you ran each game with the actual minutes played and the metrics from the previous season, and looked at how well those predictions correlated with that game's results?

Posted: **Tue Sep 13, 2016 1:49 pm**

DSMok1 wrote:
permaximum wrote:I made an advanced retrodiction test for all popular public metrics + MPG + USG + 2 non-empirical simple linear boxscore metrics at game level between 1983/84-2015/16 where each player's possessional or per-minute metric score (depending on the metrics' regular season average in the previous year) and roster turnover rate for 38658 games were calculated different for each game. Players below 250 MP in the previous year were assigned average values. I will publish the test and the results just after the new season starts.

For now all I can say, BPM, WS, RAPM and RPM are all worse than MPG at assigning individual player value. PER is a bit better. Those two boxscore metrics which I picked to see how they do compared to advanced metrics are even better. Here's one's formula. Note that it's not empirical, it's too simple that it doesn't even make oreb/dreb differentiation. Supposedly Tom Thibodeau came with this around 2008.

(2*fg-(fga-fg)+ft-0.5*(fta-ft)+3*3pm-1.5 * (3pa-3pm)+reb+ast-pf+stl-tov+blk)/mp

Edit: BPM and RPM do best at what they're built to do. Predicting next year's team wins. I would encourage everyone to use BPM/RPM blends for that. P-M based metrics will always suffer multi-collinearity issues and it's clear BPM suffers that too. That's the reason I say they're the best at what they're built to do. Capturing lineup synergies better than other metrics which helps them a lot at predicting next year team wins.
Interesting, I look forward to seeing the results. I'm not sure I follow exactly what you tested. So you ran each game with the actual minutes played, and the metrics from the previous season, and looked for how well those predictions correlated for that game's results?

Yes. But I used the actual minutes or actual possessions, depending on the metric (RPM, RAPM and BPM use possessions), to calculate each player's unique score for each metric in that game, and I also calculated the roster turnover rate for that game. To be clearer, 240 minutes from new players on team 1 and 225 minutes from new players on team 2 in a 48-minute game (480 player-minutes across both teams) translate to a 96.875% roster turnover rate for the game, since (240 + 225) / 480 = 0.96875. The average roster turnover rate is 31.5% for a single NBA game, and only 6 games out of 38658 have a 100% roster turnover rate between 1984/85-2015/16.
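
The arithmetic in that example can be sketched as below. The function name and the `game_minutes` / `players_on_court` parameters are illustrative assumptions; the post doesn't spell out exactly how a "new" player is defined, so the inputs are simply the minutes already attributed to new players on each team.

```python
def roster_turnover_rate(new_minutes_team1, new_minutes_team2,
                         game_minutes=48, players_on_court=5):
    """Share of all player-minutes in a game that were played by new players."""
    # Two teams, five players each, on the floor for the whole game.
    total_player_minutes = 2 * players_on_court * game_minutes
    return (new_minutes_team1 + new_minutes_team2) / total_player_minutes


# The example from the post: 240 and 225 new-player minutes in a 48-minute game.
rate = roster_turnover_rate(240, 225)  # 465 / 480 = 0.96875
```

An overtime game would need a larger `game_minutes` (53 for a single overtime).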

Posted: **Tue Sep 13, 2016 1:59 pm**

permaximum wrote:Yes. But I used the actual minutes or actual possessions depending on the metric (RPM, RAPM, BPM uses possessions) to calculate players' unique score for each metric in that game and I also calculated roster turnover rate for that game. To be more clear, 240 minutes from new players in Team 1 and 225 minutes from new players in team 2 in a 48-minute game translates to 96.875% roster turnover rate for the game. Average roster turnover rate is 31.5% for a single NBA game and only 6 games out of 38658 have 100% roster turnover rate between 1984/85-2015/16.

You're looking at the game as a whole, correct? Using the single-game approach will be problematic because of blowouts. On a good team, most blowouts will be blowout wins, so starters will play fewer minutes in blowouts and more minutes in losses. In other words: "There's a very high correlation--whenever bench player #15 plays, we win in blowout fashion!" It's like the "running the ball leads to wins" fallacy in football--the causation is going in the opposite direction.

As such, using games as a whole will yield incorrect results. Using lineup stints will yield correct results, since the players actually on the court when the lead is built will get credit.

I may not understand the approach you're using exactly, so apologies if you were already accounting for this effect.

Posted: **Tue Sep 13, 2016 3:30 pm**

DSMok1 wrote:
permaximum wrote:Yes. But I used the actual minutes or actual possessions depending on the metric (RPM, RAPM, BPM uses possessions) to calculate players' unique score for each metric in that game and I also calculated roster turnover rate for that game. To be more clear, 240 minutes from new players in Team 1 and 225 minutes from new players in team 2 in a 48-minute game translates to 96.875% roster turnover rate for the game. Average roster turnover rate is 31.5% for a single NBA game and only 6 games out of 38658 have 100% roster turnover rate between 1984/85-2015/16.
You're looking at the game as a whole, correct? Using the single game approach will be problematic, because of blowouts. On a good team, most blowouts will be blowout wins, so starters will play fewer minutes in blowouts and more minutes in losses. I.E. "There's a very high correlation--whenever bench player #15 plays, we win in blowout fashion!" It's like the "running the ball leads to wins" fallacy in football--the causation is going the opposite direction.

As such, using games as a whole will yield incorrect results. Using lineup stints will yield correct results, since the players actually on the court when the lead is built will get credit.

I may not understand the approach you're using exactly, so apologies if you were already accounting for this effect.

What you said doesn't change things, because:

1. Starters still play more than bench players in blowout wins, and they end up as the deciding factor.
2. Bench players usually have near-average values (I also give average values to players below 250 minutes), so they don't change the outcome of the game.
3. If starters are so good that the game ends up as a blowout, but the ratings of bench players with less than 20 minutes of playing time somehow change the metric's predicted outcome, that means the metric is not good anyway.
4. Still, I used both actual wins AND point differential for the retrodiction to see if there were any signs of what you describe. Not even the slightest. Results are 99% similar (percentage of wins, MAE, RMSE).
5. The sample is simply too big for this to become the deciding factor between the metrics' predictive power. It also averages out over a sample that big.
6. Results are completely consistent with public retrodiction tests AND with my own previous retrodiction tests, which were done at the season level.

However, I'm one of those people who can't relax without some real proof in practice, so I used the matchup data I have from basketballvalue.com to test whether things are different at the lineup level. Again, 99% similarity. But I have to admit this was a rough test: I was 100% sure I would get the same results anyway, so I quickly stopped doing any more tests for it.

To settle your worries about this issue, I'm publishing the results for 2001/02-2015/16 and 2014/15-2015/16 without taking roster turnover into account. Actual wins are better for defining prediction accuracy, but the point differential results (MAE, RMSE) tell the same story: same order of metrics, same magnitude of difference between them. A big sample really helps.
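
For anyone who wants to run a similar comparison, here is a minimal sketch of the three summary numbers mentioned in this thread: percentage of wins picked correctly, MAE and RMSE on point differential. The function name and the input layout are assumptions on my part, and this is not necessarily how the numbers in this thread were produced.

```python
import math


def retrodiction_summary(predicted_margins, actual_margins):
    """Compare predicted and actual point differentials for the same list of games.

    Margins are from the same team's point of view (e.g. home minus away).
    A predicted margin of exactly 0 is counted as picking a loss in this sketch.
    """
    if not predicted_margins or len(predicted_margins) != len(actual_margins):
        raise ValueError("need two non-empty lists of equal length")
    n = len(predicted_margins)
    pairs = list(zip(predicted_margins, actual_margins))
    correct = sum(1 for p, a in pairs if (p > 0) == (a > 0))
    errors = [p - a for p, a in pairs]
    return {
        "win_pct": correct / n,                           # share of winners picked
        "mae": sum(abs(e) for e in errors) / n,           # mean absolute error
        "rmse": math.sqrt(sum(e * e for e in errors) / n),  # root mean squared error
    }


summary = retrodiction_summary([5, -3, 2, -1], [10, 4, 1, -6])
```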

Don't worry about whether BPM is in-sample or out-of-sample for 2001-2014. I can confirm it's out-of-sample for 2001-2014 too, because BPM's prediction accuracy doesn't change a bit in those years. It was built to predict RAPM, not the outcome of games, and RAPM itself has trouble predicting the outcome of games.

  2001/02-2015/16                
 ----------------- ------------- 
  BPM               0.639156066  
  WS                0.628089771  
  AWS               0.626745268  
  RAPM              0.624676802  
  Thibodeau         0.623332299  
  PER               0.612783121  
  MPG               0.588737201  
  USG               0.540645361

  2014/15-2015/16                
 ----------------- ------------- 
  RPM               0.673391702  
  BPM               0.659307195  
  WS                0.65055196   
  RAPM              0.647887324  
  Thibodeau         0.643319376  
  AWS               0.642177389  
  PER               0.622763609  
  MPG               0.594975257  
  USG               0.570232204

Edit: Also, I went with games instead of seasons because that's the correct way to do retrodiction, not the other way around. Still, the list above is not an indicator of how well a metric assigns individual player value, and people really do forget that. That was the whole point of this retrodiction test. Like I said before, I will publish the results later.

Posted: **Tue Sep 13, 2016 6:40 pm**

People did retrodiction tests at the season level because, at the game level, scraping the data and getting it ready for any retrodiction analysis simply takes too much time. When you also add duplicate player name problems, possession calculation and especially roster turnover calculation for each game, I can assure you it took me a lot of time. I didn't have any tools or anything when I started doing this. I literally started from scratch, so I also had to check everything 4-5 times to make sure I didn't make any errors at any point, because one simple mistake could have screwed up lots of things.
