Backtesting Advanced Metrics

ca1294
Posts: 7
Joined: Wed Jan 07, 2015 4:57 am

Backtesting Advanced Metrics

Post by ca1294 »

Hi everyone, a friend suggested I look into this forum to discuss basketball statistics.

I recently started a basketball blog, and my last post was an attempt to backtest PER, WS, BPM, and VORP. Some other backtests I've seen compute each team's minute-weighted PER or WS or whatever metric, and then they find the correlation between the team metrics and season win percentage. But this doesn't control for strength of schedule.

I tried accounting for strength of schedule by backtesting on a game-by-game basis. I found the correlation between the minute-weighted metrics and point differential for each game. The full post is here if anyone is interested: http://blog.numbercrunch.in/backtesting ... d-metrics/
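A minimal sketch of the game-by-game idea, in case it helps the discussion. It is not the code from the blog post. It assumes each team's rating for a game is the minute-weighted average of its players' season metric, and that the quantity correlated with point differential is the home rating minus the away rating (one possible reading of the method). The games, minutes, and metric values are made up, and the minute totals are shortened to keep the example small.

```python
from math import sqrt

def minute_weighted(players):
    """Minute-weighted average metric for one team in one game.

    players: list of (minutes_played, season_metric) pairs.
    """
    total_minutes = sum(minutes for minutes, _ in players)
    return sum(minutes * value for minutes, value in players) / total_minutes

def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)

# Hypothetical games: (home players, away players, home point differential).
games = [
    ([(36, 4.0), (12, -1.0)], [(30, 1.0), (18, 0.0)], 8),
    ([(24, 0.0), (24, -2.0)], [(40, 3.0), (8, 1.0)], -10),
    ([(30, 2.0), (18, 2.0)], [(48, 1.0)], 3),
]

# Rating edge per game: home rating minus away rating (an assumption).
edges = [minute_weighted(home) - minute_weighted(away) for home, away, _ in games]
margins = [margin for _, _, margin in games]

r = pearson(edges, margins)
print(f"game-level correlation: {r:.3f}")
```

Using the difference between the two teams' ratings is what brings the opponent into each observation, which is the strength-of-schedule point made above.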

What do you guys think of this method? Any suggestions?
Mike G
Posts: 6145
Joined: Fri Apr 15, 2011 12:02 am
Location: Asheville, NC

Re: Backtesting Advanced Metrics

Post by Mike G »

Welcome to APBRMetrics!
You should know that BPM has a "raw" form (ASPM) which is then adjusted -- sometimes radically -- for players on a team, so that the team then has aggregate BPM equivalent to their point differential. Win Shares may have a similar adjustment; and PER definitely does not -- it gives better PER to offensively efficient teams, regardless of their defense.

And still, I suspect an error or several in your method. No aggregate stat should have a negative correlation with performance; and yet you find a -.178 relationship between PER and game results for 1991?

See the BPM thread here for further explanations; and if you got your numbers from basketball-reference.com, you should probably cite it in your article.
Carry on!
Crow
Posts: 10536
Joined: Thu Apr 14, 2011 11:10 pm

Re: Backtesting Advanced Metrics

Post by Crow »

If you do a followup, it would be interesting to see some metric blends. Say 70% BPM, 30% WS or 60% BPM, 20% WS, 20% PER. Do they beat BPM alone?
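One way such blends could be tested, as a rough sketch. The metrics are on different scales, so this version z-scores each one before applying the weights; that standardization step is an assumption, not something specified above. All the numbers are invented per-game team values, and the function names are only for illustration.

```python
from math import sqrt

def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)

def zscores(xs):
    """Standardize a list to mean 0 and (population) standard deviation 1."""
    n = len(xs)
    mean = sum(xs) / n
    sd = sqrt(sum((x - mean) ** 2 for x in xs) / n)
    return [(x - mean) / sd for x in xs]

def blend(metrics, weights):
    """Weighted sum of z-scored metrics, one value per game."""
    z = {name: zscores(metrics[name]) for name in weights}
    n_games = len(next(iter(z.values())))
    return [sum(w * z[name][i] for name, w in weights.items()) for i in range(n_games)]

# Hypothetical per-game team ratings and point differentials.
metrics = {
    "BPM": [2.0, -1.5, 0.5, 3.0, -2.5, 1.0],
    "WS": [0.02, -0.01, 0.01, 0.015, -0.03, 0.0],
    "PER": [1.0, 0.5, -0.5, 2.0, -1.0, -0.5],
}
margin = [7, -4, 2, 9, -11, 1]

candidates = {
    "BPM alone": {"BPM": 1.0},
    "70/30 BPM/WS": {"BPM": 0.7, "WS": 0.3},
    "60/20/20 BPM/WS/PER": {"BPM": 0.6, "WS": 0.2, "PER": 0.2},
}
results = {name: pearson(blend(metrics, w), margin) for name, w in candidates.items()}
for name, r in results.items():
    print(f"{name}: r = {r:.3f}")
```

Comparing each blend's correlation against the "BPM alone" row answers the question directly for a given data set.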
DSMok1
Posts: 1119
Joined: Thu Apr 14, 2011 11:18 pm
Location: Maine

Re: Backtesting Advanced Metrics

Post by DSMok1 »

Mike G wrote: Welcome to APBRMetrics!
You should know that BPM has a "raw" form (ASPM) which is then adjusted -- sometimes radically -- for players on a team, so that the team then has aggregate BPM equivalent to their point differential. Win Shares may have a similar adjustment; and PER definitely does not -- it gives better PER to offensively efficient teams, regardless of their defense.

And still, I suspect an error or several in your method. No aggregate stat should have a negative correlation with performance; and yet you find a -.178 relationship between PER and game results for 1991?

See the BPM thread here for further explanations; and if you got your numbers from basketball-reference.com, you should probably cite it in your article.
Carry on!

BPM should not be thought of as a raw form that is then adjusted--the adjustment was an inherent part of the derivation regression itself.

---

Nice work, ca1294! VORP shouldn't be tested the same way, since all it is is (BPM+2)*%Minutes played--it's not a rate stat, it's a counting stat. BPM should be used, VORP doesn't make sense to use in this application.
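To make the rate-versus-counting distinction concrete, here is a tiny sketch using the formula exactly as written above, (BPM+2)*%Minutes played. The function name and the two example players are hypothetical, and the published VORP definition may apply further scaling that is not shown here.

```python
def vorp_as_described(bpm, pct_minutes):
    """VORP per the formula in the post: (BPM + 2) * share of minutes played.

    pct_minutes is a fraction between 0 and 1 (an assumption about units).
    """
    return (bpm + 2.0) * pct_minutes

# Two hypothetical players with the same BPM (same rate of production)...
starter = vorp_as_described(3.0, 0.70)
reserve = vorp_as_described(3.0, 0.10)

# ...end up with very different VORP, because VORP accumulates with minutes.
print(starter, reserve)
```

Minute-weighting a value that already scales with minutes would count playing time twice, which is why BPM is the one to use here.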

You've got some nice posts on your site.
Developer of Box Plus/Minus
APBRmetrics Forum Administrator
Twitter.com/DSMok1
sndesai1
Posts: 141
Joined: Fri Mar 08, 2013 10:00 pm

Re: Backtesting Advanced Metrics

Post by sndesai1 »

when you calculate a team's minute-weighted metric, are you using the value for each individual player from the previous season? or is it the value from the same season that you are testing the correlation for?
ca1294
Posts: 7
Joined: Wed Jan 07, 2015 4:57 am

Re: Backtesting Advanced Metrics

Post by ca1294 »

Mike G wrote: And still, I suspect an error or several in your method. No aggregate stat should have a negative correlation with performance; and yet you find a -.178 relationship between PER and game results for 1991?

See the BPM thread here for further explanations; and if you got your numbers from basketball-reference.com, you should probably cite it in your article.
Carry on!

I was surprised by the negative correlation as well, so I checked my code over and over again. But I didn't find any errors. I did some calculations for individual games manually, and they matched the results from my code. I'll keep checking, and if I find anything then I'll be sure to write a follow-up.

And thanks for the advice. I've now included a citation to Basketball-Reference in the post, and I'll be sure to credit them in the future!

Crow wrote: If you do a followup, it would be interesting to see some metric blends. Say 70% BPM, 30% WS or 60% BPM, 20% WS, 20% PER. Do they beat BPM alone?

I did a linear regression using all the metrics, and the correlation was only about 0.5, compared to 0.488 for BPM on its own, so the gain was marginal. It also gave PER a negative coefficient, so I don't think the blend was too useful.

DSMok1 wrote: Nice work, ca1294! VORP shouldn't be tested the same way, since all it is is (BPM+2)*%Minutes played--it's not a rate stat, it's a counting stat. BPM should be used, VORP doesn't make sense to use in this application.

You've got some nice posts on your site.

Ahh, I appreciate the explanation. I've removed VORP from the post. And thanks for the compliment!

sndesai1 wrote: when you calculate a team's minute-weighted metric, are you using the value for each individual player from the previous season? or is it the value from the same season that you are testing the correlation for?

I used the players' values from the same season.
DSMok1
Posts: 1119
Joined: Thu Apr 14, 2011 11:18 pm
Location: Maine

Re: Backtesting Advanced Metrics

Post by DSMok1 »

I'm cross-posting my comments from your blog post here to spur further discussion:


Excellent work, Chirag! I have long wanted to do this analysis, and just have never gotten around to doing it.

1. I do encourage adding the home court advantage and rest effects. With this data set, it should be possible to adjust for rest, and even develop new and better numbers for rest effects, similar to how I did it at http://godismyjudgeok.com/DStats/APBRme ... =2661.html (see the last post on the page), only with the superior granularity of knowing the player minute distributions. To tell the truth, better rest effect quantification would be a notable return from this analysis and worth a post and publicity by itself.

With that accounted for, the correlations should go up further. The overall results probably won't change much, though, since the effect will likely be similar across all metrics.

2. Do you have any more years of data? I'd love to see this as far back as you can do it. I'd also love to see any more metrics that can be included.

Actually, as a baseline, I'd love to see regressed MPG (MPG with 4 or 5 games of 0 minutes added in) and see how that correlates. That SHOULD be a significantly positive correlation.

3. What are you doing with low minutes players? BPM, WS/48 and PER for all of the players will be very unstable if the player has few minutes, and mess with the results. I would recommend that all players with fewer than some threshold value of minutes (say 200) be given a "replacement level" value for each of their metrics. For BPM, I'd call that -2.0. Not sure for other metrics.

4. Finally, there is one confounding situation that could complicate things. If a team has a very specific rotation all season (not realistic, I know), then the bad/bench players will only play in blowout wins. That could lead to some odd correlations--if the good players play less, the team wins by more! I'm not sure what can adjust for that effect, to tell the truth. Food for thought.
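Two of the suggestions above (the regressed-MPG baseline in item 2 and the replacement-level substitution in item 3) are simple enough to sketch. The 200-minute threshold and the -2.0 BPM replacement value come from the post; the function names, the default of 5 added games, and the example players are assumptions for illustration.

```python
REPLACEMENT_BPM = -2.0   # replacement-level BPM suggested in the post
MIN_MINUTES = 200        # example threshold from the post ("say 200")

def stabilized_bpm(bpm, season_minutes,
                   threshold=MIN_MINUTES, replacement=REPLACEMENT_BPM):
    """Return the player's BPM, or replacement level if minutes are too few."""
    return replacement if season_minutes < threshold else bpm

def regressed_mpg(total_minutes, games_played, phantom_games=5):
    """Minutes per game after adding a few extra games of 0 minutes.

    The post suggests 4 or 5 added games; 5 is used as the default here.
    """
    return total_minutes / (games_played + phantom_games)

# A hypothetical garbage-time player with an inflated BPM in 40 minutes.
print(stabilized_bpm(8.5, 40))       # treated as replacement level
# A hypothetical regular with 1500 minutes keeps his own BPM.
print(stabilized_bpm(1.2, 1500))

# Regressed MPG pulls small samples toward zero much harder than large ones.
print(regressed_mpg(100, 5))         # raw MPG would be 20.0
print(regressed_mpg(2800, 75))       # raw MPG would be about 37.3
```

Replacement values for WS/48 and PER are left out because the post does not give any.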
sndesai1
Posts: 141
Joined: Fri Mar 08, 2013 10:00 pm

Re: Backtesting Advanced Metrics

Post by sndesai1 »

if it's testing correlation within the same season, wouldn't just using a player's ORtg - DRtg perform very well?
DSMok1
Posts: 1119
Joined: Thu Apr 14, 2011 11:18 pm
Location: Maine

Re: Backtesting Advanced Metrics

Post by DSMok1 »

sndesai1 wrote: if it's testing correlation within the same season, wouldn't just using a player's ORtg - DRtg perform very well?

Shouldn't that be correlated 100% to WS/48?